Stationary Gaussian Markov Processes as Limits of Stationary Autoregressive Time Series

Lawrence D. Brown, Philip A. Ernst, Larry Shepp, and Robert Wolpert

August 27, 2015

Abstract

We consider the class C_p of all zero-mean stationary Gaussian processes Y_t, t ∈ (−∞, ∞), with p derivatives, for which the vector-valued process (Y_t^{(0)}, Y_t^{(1)}, ..., Y_t^{(p)}), t ≥ 0, is a (p+1)-vector Markov process, where Y_t^{(0)} = Y(t). We provide a rigorous description and treatment of these stationary Gaussian processes as limits of stationary AR(p) time series.

MSC 2010: Primary 60G10, Secondary 60G15

Keywords: Continuous autoregressive processes, stationary Gaussian Markovian processes, stochastic differential equations

1 Introduction

In many data-driven applications in both the natural sciences and in finance, time series data are often discretized prior to analysis and are then formulated using autoregressive models. The theoretical and applied properties
of the convergence of discrete autoregressive (AR) processes to their continuous analogs (continuous autoregressive, or CAR, processes) has been well studied by many mathematicians, statisticians, and economists. See, for example, the works of [3], [4], [2], and [5]. A special class of autoregressive processes are the discrete-time zero-mean stationary Gaussian Markovian processes on the line (−∞, ∞). The continuous-time analogs of these processes are documented in [9] (ch. 10) and [10] (pp. 207–212). For processes in this class, the sample paths possess p − 1 derivatives at each value of t, and the evolution of the process following t depends only on the values of these derivatives at t. Notationally, we term such a process a member of the class C_p. For convenience, we will use the notation CAR(p) = C_p.

The standard Ornstein–Uhlenbeck process is of course a member of C_1, and hence CAR(p) processes can be described as a generalization of the Ornstein–Uhlenbeck process. It is well understood that the Ornstein–Uhlenbeck process is related to the usual Gaussian AR(1) process on a discrete-time index, and that an Ornstein–Uhlenbeck process can be described as a limit of appropriately chosen AR(1) processes (see [7]). In an analogous fashion we show that processes in C_p are related to AR(p) processes and can be described as limits of an appropriately chosen sequence of AR(p) processes.

Section 2 begins by reviewing the literature on CAR(p) processes, recalling three equivalent definitions of the processes in C_p. Section 3 discusses how to correctly approximate C_p by discrete AR(p) processes. This construction is, to the best of our knowledge, novel.

2 Equivalent Descriptions of the Class C_p

We give here three distinct descriptions of processes comprising the class C_p, which are documented in [10] (p. 212), but in different notation. [10] prove (pp. 211–212) that these descriptions are equivalent ways of describing the same class of processes. The first description matches the heuristic description given in the introduction.
The remaining descriptions are more explicit and can be useful in the construction and interpretation of these processes. In all the descriptions, Y = {Y(t)} symbolizes a zero-mean Gaussian process on t ∈ [0, ∞).
2.1 Three Definitions

(I) Y is stationary. The sample paths are continuous and are p − 1 times differentiable, a.e., at each t ∈ [0, ∞). (The derivatives at t = 0 are defined only from the right. At all other values of t, the derivatives can be computed from either the left or the right, and both right and left derivatives are equal.) We denote the derivatives at t by Y^{(i)}(t), i = 1, ..., p − 1. At any t_0 ∈ (0, ∞), the conditional evolution of the process given {Y(t), t ∈ [0, t_0]} depends only on the values of Y^{(i)}(t_0), i = 0, ..., p − 1. The above can be formalized as follows: let (Y_t^{(0)}, Y_t^{(1)}, ..., Y_t^{(p−1)}), t ≥ 0, denote the values of a mean-zero Itô vector diffusion process defined by the system of equations

  dY_t^{(i−1)} = Y_t^{(i)} dt,  t > 0,  i = 1, 2, ..., p − 1,
  dY_t^{(p−1)} = Σ_{i=0}^{p−1} a_{i+1} Y_t^{(i)} dt + σ dW_t,   (2.1)

for all t > 0, where σ > 0 and the coefficients {a_j} satisfy conditions (2.3) and (2.4) below. Then let Y(t) = Y_t^{(0)}.

(II) Each Y ∈ C_p is given uniquely by a certain polynomial P(z) via the covariance of any such Y as follows:

  E[Y_s Y_t] = R(s, t) = r(t − s) = ∫_{−∞}^{∞} e^{i(t−s)z} |P(z)|^{−2} dz.   (2.2)

P(z) is a complex polynomial of degree p + 1. It has positive leading coefficient and all complex roots ζ_j = ρ_j + iσ_j, j = 0, ..., p, with σ_j > 0 and ρ_j real. We impose the following constraint on the roots of P(z): whenever ρ_j ≠ 0, then there is another root ζ_{j′} = −ζ̄_j, the negative conjugate of ζ_j. This definition of P(z) ensures that |P(z)|^{−2} is an even function. Finally, it can easily be shown that, for all t, r(t) automatically has 2p derivatives.

The conditions in equations (2.3) and (2.4) below link equations (2.2) and (2.1) and characterize which processes satisfying (2.1) are stationary. The coefficients a_i are the unique solution of the equations:
  r^{(p+i+1)}(0^+) = Σ_{j=0}^{p−1} a_{j+1} r^{(i+j)}(0),  i = 0, 1, ..., p − 1.   (2.3)

Note that the left and right derivatives of r^{(j)} are equal except for j = 2p. The diffusion coefficient σ is given by

  σ² = Σ_{j=0}^{p−1} a_{j+1} r^{(j+p)}(0) (−1)^{j+1} + (−1)^p r^{(2p+1)}(0^−).   (2.4)

The stationarity of the process in (2.1) can also be determined via the characterization in Section 2.2.

(III) Equivalently, it is necessary and sufficient that Y ∈ C_p have the representation via Wiener integrals with respect to a standard Brownian motion W,

  Y_t = ∫_{−∞}^{t} g(t − u) dW(u),   (2.5)

where g has the L² Fourier transform

  ĝ(z) = 1/P(z).   (2.6)

By well-known results from both Fourier analysis and stochastic integration, a full treatment of which is given in [6], (2.5) and (2.6) jointly are equivalent to construction (II). Further, by [6], an equivalent construction to that given jointly by equations (2.5) and (2.6) is the following: given a pair of independent standard Brownian motions W_1 and W_2, Y has the spectral representation

  Y_t = ∫ cos(tz) ĝ(z) dW_1(z) + ∫ sin(tz) ĝ(z) dW_2(z).   (2.7)

With regard to initial conditions for (2.1), we note that the process in (2.2) has a stationary distribution. If we use that distribution as the initial distribution for (2.1), and check that equations (2.3) and (2.4) hold, we arrive at a stationary process.
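For p = 2, the system (2.1) can be simulated directly with an Euler–Maruyama scheme. The following sketch (the coefficient values a_1 = −2, a_2 = −3, σ = 1 are arbitrary choices satisfying the stationarity condition of Section 2.2, and the step size and horizon are illustrative assumptions) simulates the pair (Y_t, Y_t^{(1)}) and compares the long-run sample variance of Y with the stationary value σ²/(2 a_1 a_2) derived in Section 3:

```python
import numpy as np

# Euler-Maruyama discretization of the C_2 system (2.1):
#   dY_t = Ydot_t dt,   dYdot_t = (a1 Y_t + a2 Ydot_t) dt + sigma dW_t.
rng = np.random.default_rng(0)
a1, a2, sigma = -2.0, -3.0, 1.0        # arbitrary values with a1, a2 < 0
dt, n_steps = 0.01, 500_000

y = np.empty(n_steps)
y_cur, ydot = 0.0, 0.0
noise = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
for n in range(n_steps):
    # Both updates use the state from the previous step (tuple RHS first).
    y_cur, ydot = (y_cur + ydot * dt,
                   ydot + (a1 * y_cur + a2 * ydot) * dt + noise[n])
    y[n] = y_cur

# Discard a burn-in, then compare the sample variance of Y with the
# stationary variance sigma^2 / (2 a1 a2) obtained in Section 3.
sample_var = y[n_steps // 10:].var()
target_var = sigma**2 / (2 * a1 * a2)
```

The agreement is only statistical (Monte Carlo plus O(dt) discretization error), so the comparison should be read with a generous tolerance.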
2.2 Characterization of Stationarity via (2.1)

The system in (2.1) is linear. Stationarity of vector-valued processes described in such a way has been studied elsewhere; see in particular [7] (p. 357, Theorem 5.6.7). The coefficients in (2.1) that yield stationarity can be characterized via the characteristic polynomial of the matrix Λ, where |Λ − λI| = 0 is:

  λ^p − a_p λ^{p−1} − ⋯ − a_2 λ − a_1 = 0.   (2.8)

The process is stationary if and only if all the roots of equation (2.8) have strictly negative real parts. In order to discover whether the coefficients in (2.1) yield a stationary process it is thus necessary and sufficient to check whether all the roots of equation (2.8) have strictly negative real parts.

In the case of C_2 the condition for stationarity is quite simple, namely that (a_1, a_2) should lie in the quadrant a_i < 0, i = 1, 2. The covariance functions for C_2 can be found in [9] (p. 326). For higher-order processes the conditions for stationarity are not so easily described. Indeed, for C_3 it is necessary that a_i < 0, i = 1, 2, 3, but the set of values for which stationarity holds is not the entire octant. For larger p one needs to study the solutions of the higher-order polynomial in equation (2.8).

3 Weak Convergence of the Δ-AR(2) Process to the CAR(2) Process

3.1 Discrete-Time Analogs of the CAR Processes

We now turn our focus to describing the discrete-time analogs of the CAR processes and the expression of the CAR processes as limits of these discrete-time processes. In this section, we discuss the situation for p = 2. Define the Δ-AR(2) processes on the discrete time domain {0, Δ, 2Δ, ...} via

  X_t^Δ = b_1^Δ X_{t−Δ}^Δ + b_2^Δ X_{t−2Δ}^Δ + ς^Δ Z_t^Δ,  Z_t^Δ iid N(0, 1),  t = 2Δ, 3Δ, ...   (3.1)

The goal is to establish conditions on the coefficients b_1^Δ, b_2^Δ, and ς^Δ so that these Δ-AR(2) processes converge to the continuous-time CAR(2) process as in
the system of equations given in (2.1). We then discuss some further features of these processes.

To see the similarity of the Δ-AR(2) process in (3.1) with the CAR(2) process of (2.1), we introduce the corresponding Δ-VAR(2) process {X_{0;t}^Δ, X_{1;t}^Δ} defined via

  X_{0;t}^Δ − X_{0;t−Δ}^Δ = X_{1;t}^Δ Δ,
  X_{1;t}^Δ − X_{1;t−Δ}^Δ = [c_1^Δ X_{0;t−Δ}^Δ + c_2^Δ X_{1;t−Δ}^Δ] Δ + ξ^Δ Z_t^Δ,  Z_t^Δ iid N(0, 1),  t = Δ, 2Δ, ...   (3.2)

From (3.2) we see that

  X_{0;t}^Δ = X_{0;t−Δ}^Δ + X_{1;t}^Δ Δ
    = X_{0;t−Δ}^Δ + X_{1;t−Δ}^Δ Δ + [c_1^Δ X_{0;t−Δ}^Δ + c_2^Δ X_{1;t−Δ}^Δ] Δ² + Δ ξ^Δ Z_t^Δ
    = [2 + c_1^Δ Δ² + c_2^Δ Δ] X_{0;t−Δ}^Δ − [1 + c_2^Δ Δ] X_{0;t−2Δ}^Δ + Δ ξ^Δ Z_t^Δ,

using Δ X_{1;t−Δ}^Δ = X_{0;t−Δ}^Δ − X_{0;t−2Δ}^Δ in the last step. This shows that the Δ-AR(2) process of (3.1) is equivalent to the Δ-VAR(2) process in (3.2) with

  b_1^Δ ≡ c_1^Δ Δ² + c_2^Δ Δ + 2,  b_2^Δ ≡ −c_2^Δ Δ − 1,  and  ς^Δ ≡ Δ ξ^Δ,   (3.3)

or, equivalently,

  c_1^Δ ≡ Δ^{−2} [b_1^Δ + b_2^Δ − 1],  c_2^Δ ≡ −Δ^{−1} [1 + b_2^Δ],  and  ξ^Δ ≡ Δ^{−1} ς^Δ.   (3.4)

From the above, the Δ-AR(2) process of (3.1) with coefficients given by (3.3) is equivalent to the Δ-VAR(2) process of (3.2).

Theorem 3.1. Consider a sequence of Δ-AR(2) processes of (3.1) with coefficients given by (3.3), where c_j^Δ → a_j, j = 1, 2, and ξ^Δ/√Δ → σ as Δ → 0. This sequence converges in distribution to the CAR(2) process of (2.1).

Proof. It suffices to show that the Δ-VAR(2) process of (3.2) converges to the SDE system

  dY_t = Ẏ_t dt,  t > 0,
  dẎ_t = [a_1 Y_t + a_2 Ẏ_t] dt + σ dW_t,  t > 0.   (3.5)
We employ the framework of Theorems 2.1 and 2.2 of [8]. Let M_t^Δ be the σ-algebra generated by

  X_{0;0}^Δ, X_{0;Δ}^Δ, X_{0;2Δ}^Δ, ..., X_{0;t}^Δ  and  X_{1;0}^Δ, X_{1;Δ}^Δ, X_{1;2Δ}^Δ, ..., X_{1;t}^Δ

for t = Δ, 2Δ, .... The Δ-VAR(2) process of (3.2) is clearly Markovian of order 1, since to construct {X_{0;t}^Δ, X_{1;t}^Δ} from {X_{0;t−Δ}^Δ, X_{1;t−Δ}^Δ} one needs only to use the second equation of (3.2) to construct X_{1;t}^Δ and then use the first equation to construct X_{0;t}^Δ as well. This establishes that X_{0;t}^Δ is M_t^Δ-adapted. Thus, the corresponding drifts per unit of time conditioned on information at time t are given by:

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)/Δ | M_t^Δ] = E[X_{1;t+Δ}^Δ | M_t^Δ] = X_{1;t}^Δ + o(1)   (3.6)

and

  E[(X_{1;t+Δ}^Δ − X_{1;t}^Δ)/Δ | M_t^Δ] = E[([c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ] Δ + ξ^Δ Z_{t+Δ}^Δ)/Δ | M_t^Δ] = c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ.   (3.7)

Furthermore, the variances and covariances per unit of time are given by

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)²/Δ | M_t^Δ] = Δ E[(X_{1;t+Δ}^Δ)² | M_t^Δ]   (3.8)

and

  E[(X_{1;t+Δ}^Δ − X_{1;t}^Δ)²/Δ | M_t^Δ]
    = E[([c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ] Δ + ξ^Δ Z_{t+Δ}^Δ)²/Δ | M_t^Δ]
    = Δ [c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ]² + 2 [c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ] ξ^Δ E[Z_{t+Δ}^Δ] + ((ξ^Δ)²/Δ) E[(Z_{t+Δ}^Δ)²]
    = Δ [c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ]² + (ξ^Δ/√Δ)²,   (3.9)
where the last equality uses that the Z_{t+Δ}^Δ are iid N(0, 1). By the same logic,

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)(X_{1;t+Δ}^Δ − X_{1;t}^Δ)/Δ | M_t^Δ]
    = E[X_{1;t+Δ}^Δ ([c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ] Δ + ξ^Δ Z_{t+Δ}^Δ) | M_t^Δ]   (3.10)
    = Δ X_{1;t}^Δ [c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ] + o(1).

The relationships in (3.8)–(3.10) become

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)²/Δ | M_t^Δ] = o(1),   (3.11)

  E[(X_{1;t+Δ}^Δ − X_{1;t}^Δ)²/Δ | M_t^Δ] = (ξ^Δ/√Δ)² + o(1),   (3.12)

and

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)(X_{1;t+Δ}^Δ − X_{1;t}^Δ)/Δ | M_t^Δ] = o(1).   (3.13)

The o(1) terms vanish uniformly on compact sets. We may additionally show by brute force that the limits of

  E[(X_{0;t+Δ}^Δ − X_{0;t}^Δ)⁴/Δ | M_t^Δ]  and  E[(X_{1;t+Δ}^Δ − X_{1;t}^Δ)⁴/Δ | M_t^Δ]

exist and converge to zero as Δ → 0. We proceed to define the continuous-time version of the Δ-VAR(2) process of (3.2) by
X_{0;t}^Δ ≡ X_{0;kΔ}^Δ and X_{1;t}^Δ ≡ X_{1;kΔ}^Δ for kΔ ≤ t < (k+1)Δ. Then, according to Theorem 2.2 in [8], the relationships (3.6), (3.7), and (3.11)–(3.13) provide the weak (in distribution) limit diffusion

  d(Y_t, Ẏ_t)′ = [ 0  1 ; a_1  a_2 ] (Y_t, Ẏ_t)′ dt + [ 0  0 ; 0  σ ] d(0, W_t)′,

where W_t, t ≥ 0, is a Brownian motion. This is the linear SDE system of (3.5), and it has a unique solution.

Remark 3.1. Theorems 2.1 and 2.2 of [8] are explicitly stated for real-valued processes but apply to vector-valued processes as well. One only needs to explicitly allow the processes to be vector valued, and to write the regularity conditions to allow for the full cross-covariance of the vector-valued observations, rather than just ordinary covariance functions. Our processes are much better behaved than the most general type of process [8] considers, since our error variance is constant (depending only on Δ) and our distributions are Gaussian, and hence very light tailed. Thus both of Theorems 2.1 and 2.2 in [8] apply.

3.2 Stationarity of the AR(2) and VAR(2) Processes

In this section we present necessary and sufficient conditions for stationarity of the AR(2) process and its equivalent VAR(2) process, and connect this to the stationarity condition for CAR(2). First, we invoke the following proposition, which is easily proved using well-known results from the time series literature (see [1]).

Proposition 3.1. The AR(2) process

  X_t = b_1 X_{t−1} + b_2 X_{t−2} + ς Z_t,  Z_t iid N(0, 1),   (3.14)

is stationary if and only if

  −2 < b_1 < 2,  b_2 − b_1 < 1,  and  b_1 + b_2 < 1,   (3.15)

which requires (b_1, b_2) to lie in the interior of the triangle with vertices (−2, −1), (0, 1), and (2, −1). Its stationary variance is given by

  γ_0 = ς² (1 − b_2) / { (1 + b_2) [ (1 − b_2)² − b_1² ] }.   (3.16)
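The closed form (3.16) can be cross-checked numerically against the stationary covariance of the companion (state-space) form of (3.14), which solves a discrete Lyapunov equation. A small sketch (the particular coefficient values are arbitrary points inside the triangle (3.15)):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Arbitrary coefficients inside the stationarity triangle (3.15).
b1, b2, sigma_z = 0.5, -0.3, 1.0
assert -2 < b1 < 2 and b2 - b1 < 1 and b1 + b2 < 1

# Closed form (3.16) for the stationary variance of the AR(2) process.
gamma0 = sigma_z**2 * (1 - b2) / ((1 + b2) * ((1 - b2) ** 2 - b1**2))

# Companion form: state (X_t, X_{t-1}); its stationary covariance V solves
# A V A' - V + Q = 0, and V[0, 0] is the stationary variance of X_t.
A = np.array([[b1, b2], [1.0, 0.0]])
Q = np.array([[sigma_z**2, 0.0], [0.0, 0.0]])
V = solve_discrete_lyapunov(A, Q)
```

The two computations agree to floating-point accuracy, which also gives a quick independent check of (3.16).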
Because of the relations in (3.3) and (3.4), we have the following corollary to Proposition 3.1.

Corollary 3.1. The VAR(2) process

  X_{0;t} − X_{0;t−1} = X_{1;t},
  X_{1;t} − X_{1;t−1} = c_1 X_{0;t−1} + c_2 X_{1;t−1} + ξ Z_t,  Z_t iid N(0, 1),  t = 1, 2, ...,

is equivalent to the AR(2) process of (3.14) for

  b_1 = c_1 + c_2 + 2,  b_2 = −c_2 − 1,  and  ς = ξ.

Furthermore, the VAR(2) process is stationary if and only if

  −4 < c_1 < 0  and  −2 − c_1/2 < c_2 < 0.   (3.17)

(3.17) is equivalent to the condition that (c_1, c_2) lies in the interior of the triangle with vertices (−4, 0), (0, 0), and (0, −2).

3.3 Stationary Variance of the Δ-VAR(2) Process

In this section we investigate the stationarity of the special version of the Δ-VAR(2) process in Theorem 3.1, which converges to the CAR(2) process of (3.5) as Δ → 0, through the stationarity of its equivalent Δ-AR(2) process.

Proposition 3.2. The Δ-VAR(2) process of (3.2) for c_1^Δ = a_1, c_2^Δ = a_2, and ξ^Δ = σ√Δ is stationary as Δ → 0 if and only if a_j < 0, j = 1, 2. Then its stationary variance satisfies

  lim_{Δ→0} γ_0^Δ = σ² / (2 a_1 a_2).   (3.18)

Proof. From (3.3), the Δ-VAR(2) process of the hypothesis is equivalent to the Δ-AR(2) process of equation (3.1) with coefficients given by

  b_1^Δ = a_1 Δ² + a_2 Δ + 2,  b_2^Δ = −a_2 Δ − 1,  and  ς^Δ = σ (√Δ)³.   (3.19)

From condition (3.15) of Proposition 3.1, this Δ-AR(2) process is stationary if and only if the following conditions hold:

  b_1^Δ + b_2^Δ < 1 ⟺ a_1 Δ² + a_2 Δ + 2 − a_2 Δ − 1 < 1 ⟺ a_1 < 0,
  b_1^Δ < 2 ⟺ a_1 Δ² + a_2 Δ + 2 < 2 ⟺ a_2 < −a_1 Δ, which as Δ → 0 becomes a_2 < 0,

  −2 < b_1^Δ ⟺ −2 < a_1 Δ² + a_2 Δ + 2 ⟺ a_1 Δ² + a_2 Δ + 4 > 0 ⟺ 0 < Δ < (−a_2 − √(a_2² − 16 a_1)) / (2 a_1),

  b_2^Δ − b_1^Δ < 1 ⟺ −a_2 Δ − 1 − a_1 Δ² − a_2 Δ − 2 < 1 ⟺ a_1 Δ² + 2 a_2 Δ + 4 > 0 ⟺ 0 < Δ < (−a_2 − √(a_2² − 4 a_1)) / a_1,

where the last two conditions hold as Δ → 0. Finally, from equations (3.16) and (3.19) we can compute the stationary variance as follows:

  γ_0^Δ = (ς^Δ)² (1 − b_2^Δ) / { (1 + b_2^Δ) [ (1 − b_2^Δ)² − (b_1^Δ)² ] }
    = σ² (2 + a_2 Δ) / [ a_1 a_2 (4 + a_1 Δ² + 2 a_2 Δ) ]
    ⟶ σ² / (2 a_1 a_2)  as Δ → 0.

This concludes the proof.

3.4 Stationarity of the CAR(2) Process

This section provides a new derivation of the necessary and sufficient conditions for the stationarity of a CAR(2) process. It also gives the stationary covariance function of the vector (Y_t^{(0)}, Y_t^{(1)}).

Theorem 3.2. The CAR(2) process given by the SDE system

  d(Y_t, Ẏ_t)′ = Λ (Y_t, Ẏ_t)′ dt + Σ d(0, W_t)′,  t > 0,   (3.20)

where

  Λ = [ 0  1 ; a_1  a_2 ]  and  Σ = [ 0  0 ; 0  σ ],
is stationary if and only if a_j < 0, j = 1, 2. Then its solution (Y_t, Ẏ_t), t ≥ 0, is a zero-mean 2-dimensional Gaussian process with covariance

  V ≡ ∫_0^∞ e^{tΛ} Σ Σ′ e^{tΛ′} dt   (3.21)

and covariance function

  ρ(s, t) ≡ E[(Y_s, Ẏ_s)′ (Y_t, Ẏ_t)] = { e^{(s−t)Λ} V,  0 ≤ t ≤ s < ∞;  V e^{(t−s)Λ′},  0 ≤ s ≤ t < ∞. }

Proof. According to Theorem 6.7 in Chapter 5 of [7], the assertion of the theorem holds if all the eigenvalues of the matrix Λ have negative real parts. Hence, we compute the eigenvalues of Λ. We calculate the characteristic polynomial as

  φ(λ) = |Λ − λI| = | −λ  1 ; a_1  a_2 − λ | = λ² − a_2 λ − a_1,

which is of quadratic order with discriminant D = a_2² + 4 a_1. We then consider the following cases:

(i) If D = 0 ⟺ a_2²/4 = −a_1, the characteristic polynomial has the double root λ_{1,2} = a_2/2.

(ii) If D > 0 ⟺ a_2²/4 > −a_1, the characteristic polynomial has the two real roots

  λ_{1,2} = (a_2 ± √(a_2² + 4 a_1)) / 2.

(iii) If D < 0 ⟺ a_2²/4 < −a_1, the characteristic polynomial has the two complex roots

  λ_{1,2} = (a_2 ± i √(−4 a_1 − a_2²)) / 2.

In every case we need to impose conditions on the coefficients of the characteristic polynomial so that the real part of every eigenvalue is negative.
Indeed, in case (i) the double real root of the characteristic polynomial is negative if and only if a_2 < 0, which through the discriminant condition a_1 = −a_2²/4 also implies that a_1 < 0. In case (ii) we need to impose that both real eigenvalues be negative; i.e.,

  λ_1 = (a_2 − √(a_2² + 4 a_1)) / 2 < 0,
  λ_2 = (a_2 + √(a_2² + 4 a_1)) / 2 < 0 ⟺ √(a_2² + 4 a_1) < −a_2 ⟺ (using a_2 < 0) a_2² + 4 a_1 < a_2² ⟺ a_1 < 0.

Note that the latter condition requires both a_j < 0, j = 1, 2; the former condition then holds as well. In case (iii) the common real part a_2/2 of the two complex eigenvalues is negative if and only if a_2 < 0, which through the discriminant condition also implies that a_1 < 0. Consequently, in all cases the real part of both eigenvalues of the matrix Λ is negative if and only if a_j < 0, j = 1, 2.

We now compute the stationary variance V of the CAR(2) process of (3.20), as given in (3.21), beginning with the computation of the matrix e^{tΛ}, t ≥ 0. In particular, we are looking for f(Λ), where f(λ) = e^{λt}. From standard matrix theory, this can be computed via a polynomial expression of order 1, and thus f(Λ) = δ_0 I + δ_1 Λ. Hence, it suffices to set g(λ) = δ_0 + δ_1 λ and to demand that f(λ) and g(λ) be equal on the spectrum of Λ; then f(Λ) = g(Λ). The roots (one double, or two real/complex) λ_1, λ_2 of the characteristic polynomial φ(λ) of Λ satisfy the relationships

  λ_1 + λ_2 = a_2  and  λ_1 λ_2 = −a_1.   (3.22)

Since the polynomials f(λ) = e^{λt} and g(λ) = δ_0 + δ_1 λ must be equal on the spectrum of Λ, we have that

  f(λ_1) = g(λ_1) ⟺ e^{λ_1 t} = δ_0 + δ_1 λ_1,
  f(λ_2) = g(λ_2) ⟺ e^{λ_2 t} = δ_0 + δ_1 λ_2.
Ten, e Λt = f(λ) = g(λ) = δ 0 I + δ 1 Λ = δ 0 δ 1. a 1 δ 1 δ 0 + a 2 δ 1 From (3.21), we compute te stationary variance V = 0 e tλ ΣΣ e tλ dt = σ 2 0 e tλ [ 0 0 0 1 (e tλ ) dt. For any a < 0 we ave tat e (a+bi)t dt = e(a+bi)t (a + bi) 0 0 = 1 a + bi. Using te equalities in (3.22), we can rewrite te covariance function as: = σ 2 λ 2e λ 1 (s t) λ 1 e λ 2 (s t) 2a 1 a 2 (λ 2 λ 1 ) σ 2 eλ 2 (s t) e λ 1 (s t) 2a 2 (λ 2 λ 1 ) σ 2 eλ 2 (s t) e λ 1 (s t) 2a 2 (λ 2 λ 1 ) σ 2 λ 2e λ 2 (s t) λ 1 e λ 1 (s t) 2a 2 (λ 2 λ 1 ), 0 t s <. 3.5 Weak Convergence of -AR(p) process to CAR(p) process We now consider te AR(p) process on te discrete time domain {0,, 2,...}, given as X t = b 1X t + b 2X t 2 + + b i X t i + + b px t p + ς Z t, (3.23) Z t IID N(0, 1), t = p, (p + 1),..., and sow tat subject to suitable conditions on te coefficients b 1, b 2,... b p and ς, tis converges as 0 to its continuous time CAR(p) process of te form p 1 Y (p) t = a i+1 Y (i) t + σw t, t > 0, (3.24) i=0 14
for a_j ≤ 0, j = 1, 2, ..., p, and σ² > 0. Define the coefficients {c_k^Δ : k = 1, ..., p} and ξ^Δ through the equations

  b_i^Δ ≡ (−1)^{i−1} { \binom{p}{i} + Σ_{k=i}^{p} \binom{k−1}{i−1} Δ^{p−k+1} c_k^Δ },  i = 1, ..., p,  and  ς^Δ ≡ Δ^{p−1} ξ^Δ.   (3.25)

The following theorem, whose proof is relegated to the Appendix, can be proven:

Theorem 3.3. The Δ-AR(p) process of (3.23) with coefficients given by (3.25), where c_j^Δ → a_j, j = 1, 2, ..., p, and ξ^Δ/√Δ → σ as Δ → 0, converges in distribution to the CAR(p) process of (3.24).

It is of interest to note the scaling for the Gaussian variable Z_t^Δ in (3.23). In order to have the desired convergence, one must have ξ^Δ ≈ σ√Δ, and via (3.25) this entails ς^Δ ≈ σ Δ^{p−1/2}.

APPENDIX

The Appendix proves Theorem 3.3. To do so, we first study the similarity of the Δ-AR(p) process in (3.23) with the CAR(p) process (3.24). We begin by introducing the corresponding Δ-VAR(p) process

  X_{0;t}^Δ − X_{0;t−Δ}^Δ = X_{1;t}^Δ Δ,
  X_{1;t}^Δ − X_{1;t−Δ}^Δ = X_{2;t}^Δ Δ,
  ...
  X_{i−1;t}^Δ − X_{i−1;t−Δ}^Δ = X_{i;t}^Δ Δ,   (A.1)
  ...
  X_{p−2;t}^Δ − X_{p−2;t−Δ}^Δ = X_{p−1;t}^Δ Δ,
  X_{p−1;t}^Δ − X_{p−1;t−Δ}^Δ = Σ_{i=0}^{p−1} c_{i+1}^Δ X_{i;t−Δ}^Δ Δ + ξ^Δ Z_t^Δ,  Z_t^Δ iid N(0, 1),  t = Δ, 2Δ, ...

The process of (A.1) immediately yields, for i = 1,

  Δ X_{1;t}^Δ = X_{0;t}^Δ − X_{0;t−Δ}^Δ
    = \binom{1}{0} (−1)^0 X_{0;t}^Δ + \binom{1}{1} (−1)^1 X_{0;t−Δ}^Δ,

and, for i = 2,

  Δ² X_{2;t}^Δ = Δ [X_{1;t}^Δ − X_{1;t−Δ}^Δ] = [X_{0;t}^Δ − X_{0;t−Δ}^Δ] − [X_{0;t−Δ}^Δ − X_{0;t−2Δ}^Δ]
    = \binom{2}{0} (−1)^0 X_{0;t}^Δ + \binom{2}{1} (−1)^1 X_{0;t−Δ}^Δ + \binom{2}{2} (−1)^2 X_{0;t−2Δ}^Δ.

The pattern of (A.1) generalizes as follows:

  Δ^i X_{i;t}^Δ = Σ_{k=1}^{i+1} (−1)^{k−1} \binom{i}{k−1} X_{0;t−(k−1)Δ}^Δ,   (A.2)

for i = 1, 2, ..., p − 1. We prove (A.2) via mathematical induction.

(i) For i = 1 the relationship (A.2) holds trivially.

(ii) Let (A.2) hold for i = m.

(iii) We shall show that (A.2) also holds for i = m + 1:

  Δ^{m+1} X_{m+1;t}^Δ = Δ^m X_{m;t}^Δ − Δ^m X_{m;t−Δ}^Δ   [by (A.1) for i = m + 1]
    = Σ_{k=1}^{m+1} (−1)^{k−1} \binom{m}{k−1} X_{0;t−(k−1)Δ}^Δ − Σ_{k=1}^{m+1} (−1)^{k−1} \binom{m}{k−1} X_{0;t−kΔ}^Δ   [by (ii)]
    = Σ_{k=1}^{m+1} (−1)^{k−1} \binom{m}{k−1} X_{0;t−(k−1)Δ}^Δ + Σ_{l=2}^{m+2} (−1)^{l−1} \binom{m}{l−2} X_{0;t−(l−1)Δ}^Δ   [second sum: l = k + 1]
    = Σ_{k=1}^{m+2} (−1)^{k−1} \binom{m+1}{k−1} X_{0;t−(k−1)Δ}^Δ.
Furthermore, we have that

  X_{0;t}^Δ = X_{0;t−Δ}^Δ + Δ X_{1;t}^Δ   [by (A.1) for i = 1]
    = X_{0;t−Δ}^Δ + Δ X_{1;t−Δ}^Δ + Δ² X_{2;t}^Δ   [by (A.1) for i = 2]
    = ⋯ = Σ_{i=0}^{p−2} Δ^i X_{i;t−Δ}^Δ + Δ^{p−1} X_{p−1;t}^Δ
    = Σ_{i=0}^{p−2} Δ^i X_{i;t−Δ}^Δ + [ Δ^{p−1} X_{p−1;t−Δ}^Δ + Δ^p Σ_{i=0}^{p−1} c_{i+1}^Δ X_{i;t−Δ}^Δ + Δ^{p−1} ξ^Δ Z_t^Δ ]   [by the last equation of (A.1)]
    = Σ_{i=0}^{p−1} [ 1 + Δ^{p−i} c_{i+1}^Δ ] Δ^i X_{i;t−Δ}^Δ + Δ^{p−1} ξ^Δ Z_t^Δ
    = Σ_{i=0}^{p−1} [ 1 + Δ^{p−i} c_{i+1}^Δ ] Σ_{k=1}^{i+1} (−1)^{k−1} \binom{i}{k−1} X_{0;t−kΔ}^Δ + Δ^{p−1} ξ^Δ Z_t^Δ   [by (A.2)]
    = Σ_{k=1}^{p} (−1)^{k−1} { Σ_{i=k−1}^{p−1} \binom{i}{k−1} [ 1 + Δ^{p−i} c_{i+1}^Δ ] } X_{0;t−kΔ}^Δ + Δ^{p−1} ξ^Δ Z_t^Δ   [by interchanging the sums]
    = Σ_{k=1}^{p} (−1)^{k−1} { \binom{p}{k} + Σ_{i=k}^{p} \binom{i−1}{k−1} Δ^{p−i+1} c_i^Δ } X_{0;t−kΔ}^Δ + Δ^{p−1} ξ^Δ Z_t^Δ   [by the telescoping (hockey-stick) sum Σ_{i=k−1}^{p−1} \binom{i}{k−1} = \binom{p}{k}],

which, when compared with the Δ-AR(p) process of (3.23), yields the relationships

  b_i^Δ ≡ (−1)^{i−1} { \binom{p}{i} + Σ_{k=i}^{p} \binom{k−1}{i−1} Δ^{p−k+1} c_k^Δ }  and  ς^Δ ≡ Δ^{p−1} ξ^Δ   (A.3)
for all i = 1, 2, ..., p. From the above, the Δ-AR(p) process of (3.23) with coefficients given by (3.25) is equivalent to the Δ-VAR(p) process of (A.1). The Appendix concludes with a restatement and proof of Theorem 3.3.

We shall next find the coefficients c_i^Δ, i = 1, 2, ..., p, in terms of the coefficients b_i^Δ, i = 1, 2, ..., p. In particular, we have from (3.25) that, for i = p,

  b_p^Δ = (−1)^{p−1} { \binom{p}{p} + Δ \binom{p−1}{p−1} c_p^Δ }
  ⟹ c_p^Δ = Δ^{−1} [ (−1)^{p−1} b_p^Δ − \binom{p−1}{p−1} ],

and, for i = p − 1,

  b_{p−1}^Δ = (−1)^{p−2} { \binom{p}{p−1} + Δ² \binom{p−2}{p−2} c_{p−1}^Δ + Δ \binom{p−1}{p−2} c_p^Δ }
  ⟹ c_{p−1}^Δ = Δ^{−2} [ (−1)^{p−2} ( b_{p−1}^Δ + \binom{p−1}{p−2} b_p^Δ ) − 1 ].

Proposition A.1. This gives the general formula

  c_i^Δ = Δ^{−(p−i+1)} [ (−1)^{i−1} Σ_{k=i}^{p} \binom{k−1}{i−1} b_k^Δ − 1 ],  i = 1, 2, ..., p.   (A.4)

Proof. We prove Proposition A.1 by backward induction.

(i) For i = p the relationship (A.4) holds trivially.

(ii) Let (A.4) hold for every i = p − 1, p − 2, ..., m + 1.

(iii) We shall show that (A.4) also holds for i = m. For i = m, (3.25) gives

  b_m^Δ = (−1)^{m−1} { \binom{p}{m} + Δ^{p−m+1} c_m^Δ + Σ_{i=m+1}^{p} \binom{i−1}{m−1} Δ^{p−i+1} c_i^Δ },

and, substituting the inductive hypothesis (ii) for Δ^{p−i+1} c_i^Δ, i = m + 1, ..., p,

  (−1)^{m−1} b_m^Δ = \binom{p}{m} + Δ^{p−m+1} c_m^Δ + Σ_{i=m+1}^{p} \binom{i−1}{m−1} [ (−1)^{i−1} Σ_{k=i}^{p} \binom{k−1}{i−1} b_k^Δ − 1 ].

Hence, interchanging the order of summation in the double sum, the former
index bounds i ≤ k ≤ p and m + 1 ≤ i ≤ p have now become m + 1 ≤ i ≤ k and m + 1 ≤ k ≤ p. We further have

  Σ_{i=m+1}^{p} (−1)^{i−1} \binom{i−1}{m−1} Σ_{k=i}^{p} \binom{k−1}{i−1} b_k^Δ
    = Σ_{k=m+1}^{p} b_k^Δ Σ_{i=m+1}^{k} (−1)^{i−1} \binom{i−1}{m−1} \binom{k−1}{i−1}
    = Σ_{k=m+1}^{p} b_k^Δ \binom{k−1}{m−1} Σ_{j=1}^{k−m} (−1)^{j+m−1} \binom{k−m}{j}   [using \binom{k−1}{i−1} \binom{i−1}{m−1} = \binom{k−1}{m−1} \binom{k−m}{i−m} and setting j = i − m]
    = (−1)^{m−1} Σ_{k=m+1}^{p} b_k^Δ \binom{k−1}{m−1} [ (−1 + 1)^{k−m} − 1 ]   [by Newton's binomial theorem]
    = (−1)^m Σ_{k=m+1}^{p} \binom{k−1}{m−1} b_k^Δ,

as well as the telescoping (hockey-stick) sum

  Σ_{i=m+1}^{p} \binom{i−1}{m−1} = \binom{p}{m} − 1.

The last relationships yield

  (−1)^{m−1} b_m^Δ = Δ^{p−m+1} c_m^Δ + (−1)^m Σ_{k=m+1}^{p} \binom{k−1}{m−1} b_k^Δ + 1
  ⟹ Δ^{p−m+1} c_m^Δ = (−1)^{m−1} Σ_{k=m}^{p} \binom{k−1}{m−1} b_k^Δ − 1,

concluding part (iii) and thus the proof.
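Proposition A.1 can be sanity-checked numerically: mapping arbitrary coefficients c_i through (3.25) and back through (A.4) should reproduce them exactly. A small sketch (the particular values of p, Δ, and the c_i below are arbitrary):

```python
from math import comb

def b_from_c(c, delta):
    """Map c_1..c_p to b_1..b_p via (3.25)."""
    p = len(c)
    return [
        (-1) ** (i - 1)
        * (comb(p, i)
           + sum(comb(k - 1, i - 1) * delta ** (p - k + 1) * c[k - 1]
                 for k in range(i, p + 1)))
        for i in range(1, p + 1)
    ]

def c_from_b(b, delta):
    """Recover c_1..c_p from b_1..b_p via (A.4)."""
    p = len(b)
    return [
        ((-1) ** (i - 1)
         * sum(comb(k - 1, i - 1) * b[k - 1] for k in range(i, p + 1)) - 1)
        / delta ** (p - i + 1)
        for i in range(1, p + 1)
    ]

c = [-1.0, -0.5, -2.0, -0.25]   # arbitrary c_1..c_4 (so p = 4)
delta = 0.1
c_back = c_from_b(b_from_c(c, delta), delta)
```

For p = 2 the map reduces to (3.3)–(3.4), which gives a second independent check of the general formulas.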
Finally, for t > 0 we know that

  dY_t = Y_t^{(1)} dt,
  dY_t^{(1)} = Y_t^{(2)} dt,
  ...
  dY_t^{(i−1)} = Y_t^{(i)} dt,   (A.5)
  ...
  dY_t^{(p−2)} = Y_t^{(p−1)} dt,

and from (3.24) we also have that

  dY_t^{(p−1)} = [ a_1 Y_t + a_2 Y_t^{(1)} + ⋯ + a_i Y_t^{(i−1)} + ⋯ + a_{p−1} Y_t^{(p−2)} + a_p Y_t^{(p−1)} ] dt + σ dW_t.   (A.6)

Thus the CAR(p) process of (3.24) is, by (A.5) and (A.6), equivalent to the system of stochastic differential equations in (2.1).

Theorem 3.3. The Δ-AR(p) process of (3.23) with coefficients given by (3.25), where c_j^Δ → a_j, j = 1, 2, ..., p, and ξ^Δ/√Δ → σ as Δ → 0, converges in distribution to the CAR(p) process of (3.24).

Proof. Generalizing the proof of Theorem 3.1, it suffices to show that the Δ-VAR(p) process of (A.1) converges to the SDE system of (2.1). As in Theorem 3.1, we employ the framework of Theorems 2.1 and 2.2 of [8]. Let M_t^Δ be the σ-algebra generated by

  X_{i;0}^Δ, X_{i;Δ}^Δ, X_{i;2Δ}^Δ, ..., X_{i;t}^Δ,  i = 0, 1, ..., p − 1,

for t = Δ, 2Δ, .... The Δ-VAR(p) process of (A.1) is clearly Markovian of order 1, since we may construct X_{0;t}^Δ, X_{1;t}^Δ, ..., X_{p−1;t}^Δ from X_{0;t−Δ}^Δ, X_{1;t−Δ}^Δ, ..., X_{p−1;t−Δ}^Δ by constructing first X_{p−1;t}^Δ from the last equation of (A.1), then X_{p−2;t}^Δ from (A.1) for i = p − 1, and so forth, and finally X_{0;t}^Δ from (A.1) for i = 1. This also establishes that X_{i;t}^Δ, i = 0, 1, ..., p − 1, is M_t^Δ-adapted. Thus the corresponding drifts per unit of time conditioned on information at time t are given by:
  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)/Δ | M_t^Δ] = E[X_{i;t+Δ}^Δ | M_t^Δ] = X_{i;t}^Δ + o(1),  i = 1, 2, ..., p − 1,   (A.7)

and, by the last equation of (A.1),

  E[(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)/Δ | M_t^Δ] = E[( Σ_{i=0}^{p−1} c_{i+1}^Δ X_{i;t}^Δ Δ + ξ^Δ Z_{t+Δ}^Δ )/Δ | M_t^Δ]
    = c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ + ⋯ + c_p^Δ X_{p−1;t}^Δ.   (A.8)

Furthermore, the variances and covariances per unit of time are given by

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)²/Δ | M_t^Δ] = Δ E[(X_{i;t+Δ}^Δ)² | M_t^Δ],  i = 1, 2, ..., p − 1,   (A.9)

and

  E[(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)²/Δ | M_t^Δ] = E[( Σ_{i=0}^{p−1} c_{i+1}^Δ X_{i;t}^Δ Δ + ξ^Δ Z_{t+Δ}^Δ )²/Δ | M_t^Δ]
    = Δ [ c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ + ⋯ + c_p^Δ X_{p−1;t}^Δ ]² + (ξ^Δ/√Δ)²,   (A.10)

where the last equality uses that the Z_{t+Δ}^Δ are iid N(0, 1). By the same logic:
  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)(X_{j−1;t+Δ}^Δ − X_{j−1;t}^Δ)/Δ | M_t^Δ] = Δ E[X_{i;t+Δ}^Δ X_{j;t+Δ}^Δ | M_t^Δ],  i, j = 1, 2, ..., p − 1,  i ≠ j,   (A.11)

and

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)/Δ | M_t^Δ]
    = E[X_{i;t+Δ}^Δ ( Σ_{k=0}^{p−1} c_{k+1}^Δ X_{k;t}^Δ Δ + ξ^Δ Z_{t+Δ}^Δ ) | M_t^Δ]
    = Δ X_{i;t}^Δ [ c_1^Δ X_{0;t}^Δ + c_2^Δ X_{1;t}^Δ + ⋯ + c_p^Δ X_{p−1;t}^Δ ] + o(1),  i = 1, 2, ..., p − 1.   (A.12)

Therefore, the relationships (A.9)–(A.12) become

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)²/Δ | M_t^Δ] = o(1),  i = 1, 2, ..., p − 1,   (A.13)

  E[(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)²/Δ | M_t^Δ] = (ξ^Δ/√Δ)² + o(1),   (A.14)

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)(X_{j−1;t+Δ}^Δ − X_{j−1;t}^Δ)/Δ | M_t^Δ] = o(1),  i, j = 1, 2, ..., p − 1,  i ≠ j,   (A.15)

and

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)/Δ | M_t^Δ] = o(1),   (A.16)
for i = 1, 2, ..., p − 1, where the o(1) terms vanish uniformly on compact sets. We may also show by brute force that the limits of

  E[(X_{i−1;t+Δ}^Δ − X_{i−1;t}^Δ)⁴/Δ | M_t^Δ],  i = 1, 2, ..., p − 1,  and  E[(X_{p−1;t+Δ}^Δ − X_{p−1;t}^Δ)⁴/Δ | M_t^Δ]

exist and converge to zero as Δ → 0. We can then define the continuous-time version of the Δ-VAR(p) process of (A.1) by X_{i;t}^Δ ≡ X_{i;kΔ}^Δ for kΔ ≤ t < (k+1)Δ and i = 0, 1, ..., p − 1. Thus, according to Theorem 2.2 in [8], the relationships (A.7), (A.8), and (A.13)–(A.16) provide the weak (in distribution) limit diffusion. This is precisely the linear SDE system of (2.1), and it has a unique solution.

References

[1] Anderson, T. The Statistical Analysis of Time Series. Wiley-Interscience, 1994.

[2] Brockwell, P. J., Ferrazzano, V., and Klüppelberg, C. High-frequency sampling and kernel estimation for continuous-time moving average processes. J. Time Series Anal. 34, 3 (2013), 385–404.

[3] Brockwell, P. J., and Lindner, A. Existence and uniqueness of stationary Lévy-driven CARMA processes. Stochastic Process. Appl. 119, 8 (2009), 2660–2681.

[4] Brockwell, P. J., and Lindner, A. Strictly stationary solutions of autoregressive moving average equations. Biometrika 97, 3 (2010), 765–772.
[5] Brockwell, P. J., Davis, R. A., and Yang, Y. Continuous-time Gaussian autoregression. Statistica Sinica 17, 1 (2007), 63.

[6] Dym, H., and McKean, H. Gaussian Processes, Function Theory, and the Inverse Spectral Problem. Academic Press, 1976.

[7] Karatzas, I., and Shreve, S. Brownian Motion and Stochastic Calculus. Springer, 1991.

[8] Nelson, D. ARCH models as diffusion approximations. Journal of Econometrics 45, 1 (1990), 7–38.

[9] Papoulis, A. Probability, Random Variables, and Stochastic Processes. McGraw-Hill, 1991.

[10] Rasmussen, C., and Williams, C. Gaussian Processes for Machine Learning. MIT Press, 2006.