LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

MARCH 29, 26 LECTURE 2 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT (Davidson (2), Chapter 4; Phillips Lectures on Unit Roots, Cointegration and Nonstationarity; White (999), Chapter 7) Unit root processes De nition (Random walk) The process fx t g is a random walk if it satis es (a) X t = X t + " t for all t = ; 2; : : : ; (b) X = ; (c) f" t g is iid such that E" t = ; < E" 2 t < : Random walk is not a stationary process: X t = X t + " t = (X t 2 + " t ) + " t tx = X + = tx " t : j= j= " t The process has mean zero, however, A random walk with drift is de ned as V ar (X t ) = t 2 : X t = + X t + " t tx = t + " t : Random walk is a member of a more general class of nonstationary processes. De nition 2 (Unit root process) The process fx t g is a unit root process if it satis es ( L) X t = U t ; where fu t g is a mean zero covariance stationary process with short memory. For example, consider the process (L) ( L) X t = (L) " t ; where (L) and (L) are lag polynomials of nite order and have all the roots outside the unit circle. The rst di erence of X t ; j= X t = X t X t = ( L) X t is an ARMA(p; q) process. The process fx t g is referred as containing a unit root since its autoregressive polynomial (L) ( L) has a unit root. We say that fx t g is an autoregressive integrated moving average process ARIMA (p; d; q) ; (L) ( L) d X t = (L) " t ; where d indicates the number of time one has to di erence fx t g in order to obtain a covariance stationary process with short memory (in the above example, d = ).

De nition 3 (Integrated or di erence stationary processes) The process fx t g is integrated of order d, denoted as X t = I (d), if it satis es ( L) d X t = U t ; where fu t g is a mean zero covariance stationary process with short memory. An I () or unit root process with drift is de ned as ( L) X t = + U t : Integrated processes are referred as di erence stationary, since, for example, in the case of I () ; the rst di erence is a stationary process. This is compared to another class of nonstationary processes: De nition 4 (Trend stationary process) The process fx t g is trend stationary if it satis es X t = +t+u t ; where fu t g is a mean zero covariance stationary process with short memory. A trend stationary process evolves along the line of deterministic trend. Deviations from the trend are only of a short-run nature. The shocks have only temporary e ect, and its mean, + t describes the behavior of the process in the long-run. On the other hand, an integrated process accumulates its shocks and, therefore, is not ergodic. The shocks have permanent e ect, and the process deviates from its expected value for very long periods of time. Suppose that fx t g is a random walk with the increments fu t g such that EUt 2 = : We have n =2 X n = n =2 X n U t t=! d N (; ) : Let [a] be the integer part of a: For r de ne the following function (partial sum) For xed r; W n (r) = n =2 X [nr] [nr] X = n =2 U t : () t= [nr] X W n (r) = n =2 t= U t = n =2 [nr] =2 @[nr]! d r =2 N (; ) = N (; r) : [nr] X =2 U t t= A Further, for xed r < r 2 ; W n (r 2 ) W n (r )! d N (; r 2 r ) ; and, for xed r < r 2 < : : : < r k ; W n (r ) W n (r 2 ) B C @. A! d N B @ ; B @ W n (r k ) r r : : : r r r 2 : : : r 2 : : : : : : : : : : : : r r 2 : : : r k CC AA : (2) 2

De nition 5 (Brownian motion) The continuous process fw (t) : t g is a standard Brownian motion if (a) W () = with probability one, (b) For any r < r 2 < : : : < r k and k; W (r ) ; W (r 2 ) W (r ) ; : : : ; W (r k ) W (r k ) are independent, (c) For s < t; W (t) W (s) N (; t s) : A Brownian motion is viewed as a random function, W (t;!) ; where! 2 is an outcome of the random experiment (a single outcome determines the whole sample path W (;!)). For a xed!; W (;!) : [; )! R is a continuous function. The result in (2) can be alternatively stated as B @ W n (r ) W n (r 2 ). W n (r k ) C A! d B @ W (r ) W (r 2 ). W (r k ) for xed r < r 2 < : : : < r k : However, a stronger result is available. We can treat W n (r) as a random function on the zero-one interval, and discuss convergence in distributions of such a function to W (r) as n! : Weak convergence The concept of weak convergence is an extension of convergence in distribution of random variables or vectors (elements of R k ) to metric spaces. De nition 6 (Metric, metric space) Let S be a set. A metric is a mapping d : S S! R such that (a) d (x; y) for all x; y 2 S, (b) d (x; y) = if and only if x = y; (c) d (x; y) = d (y; x) for all x; y 2 S, (d) d (x; y) d (x; z) + d (y; z) for all x; y; z 2 S. The pair (S; d) is called a metric space. An example of a metric space is R k with d (x; y) = kx yk ; where kk denotes the Euclidean norm. Another example is the space of all continuous functions on the zero-one interval f : [; ]! R: This space is denoted as C [; ] : For any f; g 2 C [; ] de ne the uniform metric C A d u (f; g) = sup jf (r) g (r)j : r (C [; ] ; d u ) is a metric space. Sample paths of a Brownian motion are in C [; ] : A metric gives the distance between the elements of S, which allows us to introduce the notion of an open set. The set fx 2 S : d (x; y) < rg is called the open sphere with the center at y and radius r: For the metric space (S; d) the Borel - eld, denoted B S ; is the smallest - eld containing all open subsets of S. The following requirements are imposed on (S; d): completeness and separability. Completeness is required since we want the limit of a convergent sequence of the elements of S to be an element of S as well. The second requirement is separability. The metric space (S; d) is separable if it contain a countable dense subset; a subset is dense if every point of S is arbitrary close to one of its elements. Separability ensures that the probabilities can be assigned to the elements of B S. For example, the real line with the Euclidean metric is a separable space, since the set of rational numbers is countable and dense. The metric space (C [; ] ; d u ) is complete and separable. For A 2 B S ; its boundary, denoted @A; is the set of all points not interior to A: Let be a probability measure on ((S; d) ; B S ) : The set A is a continuity set of if (@A) = : (3) 3

De nition 7 (Weak convergence) Let n ; be probability measures on the complete and separable measurable metric space ((S; d) ; B S ) : The sequence of probability measures n is said to converge weakly to ; denoted n ) ; if for all continuity sets A 2 B S of ; n (A)! (A) : Let X n ; X be the random elements on the complete and separable measurable metric space ((S; d) ; B S ) ; i.e. and measurable. For A 2 B S de ne X : (; F)! ((S; d) ; B S ) ; X n : (; F)! ((S; d) ; B S ) ; n (A) = P f! 2 : X n (!) 2 Ag ; (A) = P f! 2 : X (!) 2 Ag ; where P is a probability measure on (; F) : We say that X n converges weakly to X; denoted as X n ) X; if n =) : The notion of a continuity set is similar to that of a continuity point of a CDF. Let F n ; F be CDFs. We say that F n! F if F n (x)! F (x) for all continuity points x of F: We require convergence of measures only for the continuity points in order to avoid disappearance of the probability mass. For example, suppose X n = =n with probability one. This random variable has the CDF ; F n (x) = x < =n; x =n: The limit of X n is X = with probability one; and its distribution function is ; x < ; F (x) = x : However, F n (x) does not converge to F (x) for all x 2 R: For x >, lim n! F n (x) = ; and for x < ; lim n! F n (x) = : However, since for all n; < =n; lim n! F n () = : Thus, the limit of F n (x) is ef (x) = ; x ; x > ; which is not a CDF. As a result, the probability mass disappears. The reason is that x = is not a continuity point of F: Hence, we have to rede ne the limit of F n as F: The continuous mapping theorem can be extended to the case of weak convergence. In this case, we have to deal with the mappings that are functions of functions. Such a mapping is called a functional. Theorem (CMT) Suppose that h : ((S; d) ; B S )! (R; B) is a mapping continuous with probability one. Let X n ; X be the random elements on ((S; d) ; B S ) : If X n =) X; then h (X n ) =) h (X) : For example, suppose that f 2 C [; ] : Let h (f) = R f (r) dr: The integral is a continuous functional. Suppose that f n! f in the uniform metric, i.e. d u (f n ; f)! : Then, Z Z jh (f n ) h (f)j = f n (r) dr f (r) dr Z jf n (r) sup jf n (r) r = d u (f n ; f r )! : f (r)j dr f (r)j 4

Functional CLT (FCLT) Consider partial sums W n () : It is a step function on zero-one interval. The function is W n (r) constant for t=n r < (t + ) =n; since for such r s, [rn] = t: When r hits (t + ) =n; the function jumps from n =2 X t to n =2 (X t + U t+ ) : Thus, this function is continuous on the right: lim W n (r) = W n (r ) ; r#r and has left limit, i.e. lim r"r W n (r) exists. Such functions are referred as cadlag. Let D [; ] be the space of cadlag functions on the zero-one interval. The space of continuous functions, C [; ] is a subspace of D [; ] ; and therefore, W () 2 D [; ] : It turns out that (D [; ] ; d u ) is not separable. A space is not separable if it contains noncountable discrete set; a set is discrete if its points are separated. Consider, for example, the following set of functions on D [; ] : Let ; x < ; f (x) = ; x : De ne ff : 2 [; ]g : This set is noncountable. However, d u (f ; f 2 ) = sup x jf (x) f 2 (x)j = for 6= 2 : The distance between any two elements is always regardless of how close and 2 are (as long as they di erent). Therefore, the set is discrete. The appropriate metric to study weak convergence on D [; ] is the Billingsley s metric, d B ; which is a modi cation of d u (see, for example Davidson (994, Chapter 28)). The space ((D [; ] ; d B ) ; B D ) is complete and separable. The following result establishes convergence of re-scaled partial sums of iid random variables to a Brownian motion. Theorem 2 (Donsker s FCLT, White (999), Theorem 7.3) Let fu t g be a sequence of iid random variables such that EU t = and EUt 2 = : Let W n be de ned as in (), and W be a standard Brownian motion. Then, W n =) W: then, The FCLT says that if, for B 2 B D ; n = P f! 2 : W n (;!) 2 Bg ; and (4) = P f! 2 : W (;!) 2 Bg ; n =) : Thus, for n large enough, we can approximate the distribution of the sample paths of a random walk by the distribution of the sample paths of a Brownian motion. The weak convergence of the FCLT is a stronger result than in (3). The result in (3) gives joint convergence in distribution of a nite number of random variables, or convergence of nite dimensional distributions. On the other hand, the FCLT provides convergence of an in nite dimensional object. In fact, W n =) W implies that (W n (r ) ; W n (r 2 ) ; : : : ; W n (r k ))! d (W (r ) ; W (r 2 ) ; : : : ; W (r k )) : Convergence of nite dimensional distributions does not imply weak convergence unless the sequence of measures f n g is uniformly tight; a measure P on ((S; d) ; B S ) is tight if for all " > there exists a compact set K 2 B S ; such that P (K) > ": One can show that the sequence of probability measures in (4) is uniformly tight. The result can be extended to the time-series case by the means of the Phillips-Solo device (Lecture ). Suppose that U t = C(L)" t ; f" t g is iid such that E" t = and E" 2 t = 2 < ; P j= j=2 jc j j < ; 5

C () 6= : Let W n be de ned as in (). Then, W n ) B; where B (r) = C()W (r) ; and W is a standard Brownian motion. The Brownian motion B satis es (a) and (b) of De nition 5, and (c) is replaced with B (t) B (s) N ; 2 C() 2 (t s) for t > s: The result can be extended to the vector case as well. De nition 8 (Vector Brownian motion) The k-vector process W (t) = (W (t) ; : : : ; W k (t)) : t is a vector standard Brownian motion if W ; : : : ; W k are independent scalar standard Brownian motion processes. Suppose that the k-vector process fu t : t = ; 2; : : :g satis es U t = C(L)" t ; f" t g is iid such that E" t = and E" t " t = nite, P j= j=2 kc j k < ; C() 6= : Again, let W n (r) = n =2 P [nr] t= U t: Then, W n ) B; where B (r) = C() =2 W (r) ; and W (r) is a k-vector Brownian motion. Notice that in this case, B (t) B (s) N (; (t s) C()C() ) for t > s: 6