Grouped Network Vector Autoregression
Statistica Sinica: Supplement

Xuening Zhu (1) and Rui Pan (2)

(1) Fudan University, (2) Central University of Finance and Economics

Supplementary Material

We present here the detailed technical proofs of Lemma 1–Lemma 4 in Appendix A. Next, the proofs of Theorem 1 to Theorem 3 are given in Appendices B, C, and D, respectively.

Appendix A. Four Useful Lemmas

Lemma 1. Let $X = (X_1,\cdots,X_n)^\top\in\mathbb{R}^n$, where the $X_i$s are independent and identically distributed random variables with mean zero, variance $\sigma_X^2$, and finite fourth order moment. Let $\widetilde Y_t = \sum_{j=0}^\infty G^jUE_{t-j}$, where $G\in\mathbb{R}^{n\times n}$, $U\in\mathbb{R}^{n\times N}$, and $\{E_t\}$ satisfy Condition (C1) and are independent of $\{X_i\}$. Then for a matrix $A = (a_{ij})\in\mathbb{R}^{n\times n}$ and a vector $B = (b_1,\cdots,b_n)^\top\in\mathbb{R}^n$, it holds that:

(a) $n^{-1}B^\top X\to_p 0$ if $n^{-2}B^\top B\to 0$ as $n\to\infty$.

(b) $n^{-1}X^\top AX\to_p\sigma_X^2\lim_{n\to\infty}n^{-1}\mathrm{tr}(A)$ if the limit exists and $n^{-2}\mathrm{tr}(AA^\top)\to 0$ as $n\to\infty$.

(c) $(nT)^{-1}\sum_{t=1}^TB^\top\widetilde Y_t\to_p 0$ if $n^{-1}\sum_{j=0}^\infty\{B^\top G^jUU^\top(G^\top)^jB\}^{1/2}\to 0$ as $n\to\infty$.

(d) $(nT)^{-1}\sum_{t=1}^T\widetilde Y_t^\top A\widetilde Y_t\to_p\lim_{n\to\infty}n^{-1}\mathrm{tr}\{A\Gamma(0)\}$ if the limit exists and $n^{-1}\sum_{i=0}^\infty\sum_{j=0}^\infty[\mathrm{tr}\{U^\top(G^\top)^iAG^jUU^\top(G^\top)^jA^\top G^iU\}]^{1/2}\to 0$ as $n\to\infty$.

(e) $(nT)^{-1}\sum_{t=1}^TX^\top A\widetilde Y_t\to_p 0$ if $n^{-1}\sum_{j=0}^\infty[\mathrm{tr}\{AG^jUU^\top(G^\top)^jA^\top\}]^{1/2}\to 0$ as $n\to\infty$.

Proof: The detailed proof can be found in Lemma 1 of Zhu et al. (2017).
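Part (b) is straightforward to check numerically. The sketch below is our illustration, not part of the original supplement; the banded choice of $A$ is arbitrary but satisfies the required condition $n^{-2}\mathrm{tr}(AA^\top)\to 0$.

```python
# A minimal Monte Carlo sketch of Lemma 1(b): for i.i.d. mean-zero X with
# variance sigma^2 and a matrix A with n^{-2} tr(A A') -> 0, the quadratic
# form n^{-1} X' A X concentrates around sigma^2 * n^{-1} tr(A).
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 2.0

for n in (100, 1000, 10000):
    # Banded A: tr(A A') = O(n), so n^{-2} tr(A A') -> 0 as required.
    A = np.eye(n) + 0.5 * np.eye(n, k=1) + 0.5 * np.eye(n, k=-1)
    X = rng.normal(scale=np.sqrt(sigma2), size=n)
    lhs = X @ A @ X / n
    rhs = sigma2 * np.trace(A) / n
    print(f"n={n:6d}  X'AX/n={lhs:.4f}  sigma^2 tr(A)/n={rhs:.4f}")
```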
2 (e) (nt ) 1 T X AỸ t p 0 if n 1 j=0 [tr{agj UU (G ) j A }] 1/2 0 as n. Proof: The detailed proof can be found in Lemma 1 of Zhu et al. (2017). Lemma 2. Assume min N = O(N δ ) and the stationary condition c β < 1, where c β = ma ( β 1 + β 2 ). Further assume Conditions (C1)-(C3) hold. For matrices M 1 = (m (1) ij ) R n p and M 2 = (m (2) ij ) Rn p, define M 1 M 2 as m (1) ij m (2) ij for 1 i n and 1 j p. In addition, define M e = ( m ij ) R n p for any arbitrary matri M = (m ij) R n p. Then there eists J > 0, such that (a) for any integer n > 0, we have G n (G ) n e n J c 2n β MM, (A.1) G n Σ Y e αn J c n βmm, (A.2) where M = C1π + J j=0 W j, C > 1 is a constant, π is defined in (C3.1), and α is a finite constant. (b) For positive integers 1 1, 2 1, and j 0, define g j,1, 2 (G, W () ) = (W () ) 1 {G j (G ) j } 2 (W () ) 1 e R N N. In addition, define (W () ) 0 = I = (I N, 0) R N N. For integers 0 1, 2, m 1, m 2 1, as N we have N 1 N 1 j=0 i,j=0 { µ g j,1, 2 (G, W () )µ} 1/2 0, (A.3) [ { 1/2 tr g i,1, 2 (G, W () )g j,m1,m 2 (G, W )}] () 0, (A.4) where µ e c µ1 and c µ is a finite constant. (c) For integers 0 1, 2 1, define f 1, 2 (W (), Q) = (W () ) 1 Q 2 (W () ) 1 e R N N,
Proof: The proof is similar in spirit to Zhu et al. (2017). Therefore, we give the outline of the proof and skip some similar details. Without loss of generality, we let $c_\beta = |\beta_{11}| + |\beta_{21}|$ (i.e., $k = 1$). Consequently, we have $|G|_e\preceq|\beta_{11}|W + |\beta_{21}|I$. Let $\bar G = |\beta_{11}|W + |\beta_{21}|I$. Following a similar technique to part (a) of Lemma 2 in Zhu et al. (2017), it can be verified that
\[ |G^n|_e \preceq n^Jc_\beta^nM, \quad (A.8) \]
where $M = C\mathbf{1}\pi^\top + \sum_{j=0}^{J}W^j$ is defined in (a) of Lemma 2. Subsequently, the result (A.1) can be readily obtained. Next, recall that $\Sigma_Y = (I-G)^{-1}\Sigma_Z(I-G^\top)^{-1} + \sum_{j=0}^\infty G^j\Sigma_e(G^\top)^j = (\sum_{j=0}^\infty G^j)\Sigma_Z(\sum_{j=0}^\infty(G^\top)^j) + \sum_{j=0}^\infty G^j\Sigma_e(G^\top)^j$. Let $\sigma_z^2 = \max_k\{\gamma_k^\top\Sigma_z\gamma_k\}$ and $\sigma_e^2 = \max_k\{\sigma_k^2\}$. Then we have $|G^n\Sigma_Y|_e\preceq\sigma_z^2(\sum_{j=0}^\infty|G^{n+j}|_e)(\sum_{j=0}^\infty|(G^\top)^j|_e) + \sigma_e^2\sum_{j=0}^\infty|G^{n+j}|_e\,|(G^\top)^j|_e$. Subsequently, (A.2) can be obtained by applying (A.8).

Next, we give the proof of (b) in the following. The conclusion (c) can be proved by similar techniques, which is omitted here to save space. Let $k_1 = k_2 = 1$. Then we have $g_{j,1,1}(G, W^{(k)}) = |W^{(k)}G^j(G^\top)^jW^{(k)\top}|_e$. Recall that $W^{(k)} = (w_{ij}: i\in\mathcal{M}_k, 1\le j\le N)\in\mathbb{R}^{N_k\times N}$. Since we have $|\mu|_e\preceq c_\mu\mathbf{1}$, it suffices to show $\sum_{j=0}^\infty N_k^{-1}\{\mathbf{1}^\top g_{j,1,1}(G, W^{(k)})\mathbf{1}\}^{1/2}\to 0$. We first prove (A.3).
By (A.8) we have $|W^{(k)}G^j|_e\preceq j^K(|\beta_1| + |\beta_2|)^j\,W^{(k)}M$. As a result, we have
\[ |W^{(k)}G^j(G^\top)^jW^{(k)\top}|_e \preceq j^{2K}(|\beta_1| + |\beta_2|)^{2j}\,\widetilde M, \quad (A.9) \]
where $\widetilde M$ is defined as $\widetilde M = W^{(k)}MM^\top W^{(k)\top}$. As a result, we have $\sum_{j=0}^\infty N_k^{-1}\{\mathbf{1}^\top W^{(k)}G^j(G^\top)^jW^{(k)\top}\mathbf{1}\}^{1/2}\le N_k^{-1}\alpha_1(\mathbf{1}^\top\widetilde M\mathbf{1})^{1/2}$, where $\alpha_1 = \sum_{j=0}^\infty j^Kc_\beta^j < \infty$. Then it suffices to show $N_k^{-2}\mathbf{1}^\top\widetilde M\mathbf{1}\to 0$. It can be shown that
\[ \mathbf{1}^\top\widetilde M\mathbf{1} = N_k^2C^2\sum_j\pi_j^2 + \sum_{j=1}^{K}\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1} + 2N_kC\sum_j\pi^\top(W^\top)^jW^{(k)\top}\mathbf{1} + \sum_{i\ne j}\mathbf{1}^\top W^{(k)}W^i(W^\top)^jW^{(k)\top}\mathbf{1}. \]
For the last two terms of $\mathbf{1}^\top\widetilde M\mathbf{1}$, by the Cauchy inequality, we have
\[ N_k\sum_j\pi^\top(W^\top)^jW^{(k)\top}\mathbf{1} \le N_k\sum_j\Big(\sum_j\pi_j^2\Big)^{1/2}\big\{\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1}\big\}^{1/2}, \]
and $\sum_{i\ne j}\mathbf{1}^\top W^{(k)}W^i(W^\top)^jW^{(k)\top}\mathbf{1}\le\sum_{i\ne j}\{\mathbf{1}^\top W^{(k)}W^i(W^\top)^iW^{(k)\top}\mathbf{1}\}^{1/2}\{\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1}\}^{1/2}$. As a result, it suffices to show
\[ N\sum_{j=1}^{\infty}\pi_j^2 \to 0 \quad\text{and}\quad N_k^{-2}\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1} \to 0 \quad (A.10) \]
for $1\le j\le K+1$. As the first convergence in (A.10) is implied by (C2.1), we next prove $N_k^{-2}\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1}\to 0$ ($1\le j\le K$). Recall that $W^* = W + W^\top$. Therefore, we have $N_k^{-2}\mathbf{1}^\top W^{(k)}W^j(W^\top)^jW^{(k)\top}\mathbf{1}\le N_k^{-2}\mathbf{1}^\top W^{(k)}(W^*)^{2j}W^{(k)\top}\mathbf{1}$. Then it suffices to show $N_k^{-2}\mathbf{1}^\top W^{(k)}(W^*)^{2j}W^{(k)\top}\mathbf{1}\to 0$. By the eigenvalue–eigenvector decomposition of $W^*$ we have $W^* = \sum_l\lambda_l(W^*)u_lu_l^\top$, where $\lambda_l(W^*)$ and $u_l\in\mathbb{R}^N$ are the $l$th eigenvalue and eigenvector of $W^*$ respectively. As a result, we have $N_k^{-2}\mathbf{1}^\top W^{(k)}(W^*)^{2j}W^{(k)\top}\mathbf{1}\le N_k^{-2}\lambda_{\max}(W^*)^{2j}(\mathbf{1}^\top W^{(k)}W^{(k)\top}\mathbf{1})$ ($1\le j\le K$). Further we have $\mathbf{1}^\top W^{(k)}W^{(k)\top}\mathbf{1}\le N_k\lambda_{\max}(W^{(k)}W^{(k)\top})$. Note that $W^{(k)}W^{(k)\top}$
is a sub-matrix of $WW^\top$ with row and column indexes in $\mathcal{M}_k$. Therefore, by Cauchy's interlacing theorem, we have $\lambda_{\max}(W^{(k)}W^{(k)\top})\le\lambda_{\max}(WW^\top) = O(N^{\delta'})$ for $\delta' < \delta$. Since we have $\min_k N_k = N^{\delta}$ for $\delta > 0$, we then have $N_k^{-1}\lambda_{\max}(W^{(k)}W^{(k)\top})\to 0$ as $N\to\infty$. As a consequence, the second convergence in (A.10) holds. Similarly, it can be proved that (A.10) holds for all $0\le k_1, k_2\le 1$. As a result, (A.3) holds.

We next prove (A.4) with $k_1 = k_2 = m_1 = m_2 = 1$ and $g_{i,1,1}(G, W^{(k)})\,g_{j,1,1}(G, W^{(k)}) = |W^{(k)}G^i(G^\top)^iW^{(k)\top}W^{(k)}G^j(G^\top)^jW^{(k)\top}|_e$. The other cases (i.e., $0\le k_1, k_2, m_1, m_2\le 1$) can then be proved similarly. Note that by (A.9), we have
\[ \big[\mathrm{tr}\big\{W^{(k)}G^i(G^\top)^iW^{(k)\top}W^{(k)}G^j(G^\top)^jW^{(k)\top}\big\}\big]^{1/2} \le i^Kj^K(|\beta_1| + |\beta_2|)^{i+j}\,\mathrm{tr}\{\widetilde M^2\}^{1/2}. \]
It can then be derived that $N_k^{-1}\sum_{i,j=0}^\infty[\mathrm{tr}\{W^{(k)}G^i(G^\top)^iW^{(k)\top}W^{(k)}G^j(G^\top)^jW^{(k)\top}\}]^{1/2}\le\alpha_2N_k^{-1}\mathrm{tr}\{\widetilde M^2\}^{1/2}$. In order to obtain (A.4), it suffices to show that
\[ N_k^{-2}\,\mathrm{tr}\{\widetilde M^2\} \to 0. \quad (A.11) \]
Equivalently, by the Cauchy inequality, it suffices to prove that $(N\sum_j\pi_j^2)^2\to 0$ and that $N_k^{-2}\,\mathrm{tr}\{W^{(k)}W^j(W^\top)^jW^{(k)\top}W^{(k)}W^j(W^\top)^jW^{(k)\top}\}\to 0$ holds for $1\le j\le K$. It can be easily verified that the first convergence holds by (C2.1). For the second one, we have $N_k^{-2}\,\mathrm{tr}\{W^{(k)}W^j(W^\top)^jW^{(k)\top}W^{(k)}W^j(W^\top)^jW^{(k)\top}\}\le N_k^{-2}\,\mathrm{tr}\{W^{(k)}(W^*)^{4j}W^{(k)\top}\}\le N_k^{-2}\lambda_{\max}(W^*)^{4j}\,\mathrm{tr}(W^{(k)}W^{(k)\top})\le N_k^{-2}N_k\lambda_{\max}(W^*)^{4K}\lambda_{\max}(WW^\top)$. Similarly, since $\lambda_{\max}(W^*) = O(\log N)$ and $\lambda_{\max}(WW^\top) = O(N^{\delta'})$ by (C2.2), we have $N_k^{-1}\lambda_{\max}(W^*)^{4K}\lambda_{\max}(WW^\top)\to 0$ as $N\to\infty$. Consequently, we have (A.11) and then (A.4) holds. This completes the proof of (b).
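The decisive spectral step in the proof of (b) is Cauchy's interlacing theorem: $W^{(k)}W^{(k)\top}$ is a principal submatrix of $WW^\top$, so its largest eigenvalue is bounded by $\lambda_{\max}(WW^\top)$. A quick numerical sanity check (our illustration; the sparse row-normalized $W$ and the chosen index set standing in for $\mathcal{M}_k$ are arbitrary):

```python
# Numerical sanity check of the interlacing step: for a PSD matrix S = W W',
# the top eigenvalue of any principal submatrix is at most lambda_max(S).
import numpy as np

rng = np.random.default_rng(1)
N = 200
# Row-normalized adjacency matrix of a random directed graph (illustrative).
A = (rng.random((N, N)) < 0.05).astype(float)
np.fill_diagonal(A, 0.0)
W = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)

S = W @ W.T
idx = rng.choice(N, size=50, replace=False)      # rows/columns in group M_k
S_sub = S[np.ix_(idx, idx)]                      # principal submatrix

lam_full = np.linalg.eigvalsh(S)[-1]
lam_sub = np.linalg.eigvalsh(S_sub)[-1]
print(lam_sub <= lam_full + 1e-12, lam_sub, lam_full)  # prints True, ...
```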
Lemma 3. Let $\{X_{it}: 1\le t\le T\}$ and $\{Y_{it}: 1\le t\le T\}$ be sub-Gaussian random time series with mean 0, $\mathrm{var}(X_{it}) = \sigma_{i,xx}$, $\mathrm{var}(Y_{it}) = \sigma_{i,yy}$, and $\mathrm{cov}(X_{it}, Y_{it}) = \sigma_{i,xy}$. Let $\sigma_{xi,t_1t_2} = \mathrm{cov}(X_{it_1}, X_{it_2})$ and $\Sigma_{xi} = (\sigma_{xi,t_1t_2}: 1\le t_1, t_2\le T)\in\mathbb{R}^{T\times T}$. Similarly, define $\sigma_{yi,t_1t_2}$ and $\Sigma_{yi}\in\mathbb{R}^{T\times T}$. Then we have
\[ P\Big(\Big|T^{-1}\sum_{t=1}^TX_{it}Y_{it} - \sigma_{i,xy}\Big| > \nu\Big) \le c_1\big\{\exp(-c_2\sigma_{xi}^{-2}T^2\nu^2) + \exp(-c_2\sigma_{yi}^{-2}T^2\nu^2)\big\} \quad (A.12) \]
for $\nu\le\delta$, where $\sigma_{xi}^2 = \mathrm{tr}(\Sigma_{xi}^2)$, $\sigma_{yi}^2 = \mathrm{tr}(\Sigma_{yi}^2)$, and $c_1$, $c_2$, and $\delta$ are finite constants.

Proof: Let $X_i = (X_{i1},\cdots,X_{iT})^\top\in\mathbb{R}^T$ and $Y_i = (Y_{i1},\cdots,Y_{iT})^\top\in\mathbb{R}^T$. In addition, let $Z_i = X_i + Y_i$. Therefore, we have $X_i^\top Y_i = 2^{-1}(Z_i^\top Z_i - X_i^\top X_i - Y_i^\top Y_i)$. It can be derived that
\[ P\{|T^{-1}(X_i^\top Y_i) - \sigma_{i,xy}| \ge \nu\} \le P\{|T^{-1}(Z_i^\top Z_i) - (\sigma_{i,xx} + \sigma_{i,yy} + 2\sigma_{i,xy})| \ge \nu_1\} + P\{|T^{-1}(X_i^\top X_i) - \sigma_{i,xx}| \ge \nu_1\} + P\{|T^{-1}(Y_i^\top Y_i) - \sigma_{i,yy}| \ge \nu_1\}, \quad (A.13) \]
where $\nu_1 = 2\nu/3$. Next, we derive the upper bound for the right side of (A.13). Note that $X_i^\top X_i$, $Y_i^\top Y_i$, and $Z_i^\top Z_i$ all take a quadratic form, so the proofs are similar. For the sake of simplicity, we take $Y_i^\top Y_i$ as an example and derive the upper bound for $P\{|T^{-1}(Y_i^\top Y_i) - \sigma_{i,yy}| \ge \nu_1\}$. Similar results can be obtained for the other two terms. First we have $Y_i^\top Y_i = Y_i^\top\Sigma_{yi}^{-1/2}\Sigma_{yi}\Sigma_{yi}^{-1/2}Y_i = \widetilde Y_i^\top\Sigma_{yi}\widetilde Y_i$, where $\widetilde Y_i = \Sigma_{yi}^{-1/2}Y_i$ follows a sub-Gaussian distribution. Let $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_T$ be the eigenvalues of $\Sigma_{yi}$. Since $\Sigma_{yi}$ is a nonnegative definite matrix, the eigenvalue decomposition can be applied to obtain $\Sigma_{yi} = U^\top\Lambda U$, where $U = (U_1,\cdots,U_T)^\top\in\mathbb{R}^{T\times T}$ is an orthogonal matrix and $\Lambda = \mathrm{diag}\{\lambda_1,\cdots,\lambda_T\}$. As a consequence, we have $Y_i^\top Y_i = \sum_t\lambda_t\zeta_t^2$, where $\zeta_t = U_t^\top\widetilde Y_i$ and the $\zeta_t$s are independent and identically distributed as standard sub-Gaussian. It can be verified that $\zeta_t^2 - 1$ follows a sub-exponential distribution and $T^{-1}(\sum_t\lambda_t) = \sigma_{i,yy}$. In addition, the sub-exponential distribution satisfies condition (P) on page 45 of Saulis and Statulevičius (2012). Hence there exist constants $c_1$, $c_2$, and $\delta$ such that $P\{|T^{-1}(Y_i^\top Y_i) - \sigma_{i,yy}| \ge \nu_1\} = P\{|\sum_t\lambda_t(\zeta_t^2 - 1)| \ge T\nu_1\} \le c_1\exp\{-c_2(\sum_t\lambda_t^2)^{-1}T^2\nu^2\} = c_1\exp\{-c_2\sigma_{yi}^{-2}T^2\nu^2\}$ for $\nu < \delta$ by Theorem 3.3 of Saulis and Statulevičius (2012). Consequently, (A.12) can be obtained with appropriately chosen $c_1$, $c_2$, and $\delta$.
Lemma 4. Assume $Y_{it}$ follows the GNAR model (2.4) and $c_\beta < 1$. Then there exist finite constants $c_1$, $c_2$, and $\delta$ such that for $\nu < \delta$ we have
\[ P\Big\{\Big|T^{-1}\sum_{t=1}^TY_{it}^2 - \mu_i^2 - e_i^\top\Sigma_Ye_i\Big| > \nu\Big\} \le \delta_T, \quad (A.14) \]
\[ P\Big\{\Big|T^{-1}\sum_{t=1}^TY_{it}(w_i^\top Y_t) - \mu_i(w_i^\top\mu_Y) - w_i^\top\Sigma_Ye_i\Big| > \nu\Big\} \le \delta_T, \quad (A.15) \]
\[ P\Big\{\Big|T^{-1}\sum_{t=1}^TY_{i(t-1)}\varepsilon_{it}\Big| > \nu\Big\} \le \delta_T, \qquad P\Big\{\Big|T^{-1}\sum_{t=1}^T(w_i^\top Y_{t-1})\varepsilon_{it}\Big| > \nu\Big\} \le \delta_T, \quad (A.16) \]
\[ P\Big\{\Big|T^{-1}\sum_{t=1}^TY_{i(t-1)} - \mu_i\Big| > \nu\Big\} \le \delta_T, \qquad P\Big\{\Big|T^{-1}\sum_{t=1}^Tw_i^\top Y_t - w_i^\top\mu_Y\Big| > \nu\Big\} \le \delta_T, \quad (A.17) \]
where $\delta_T = c_1\exp(-c_2T\nu^2)$, $e_i\in\mathbb{R}^N$ is an $N$-dimensional vector with all elements being 0 but the $i$th element being 1, and $\mu_i = e_i^\top\mu_Y$.

Proof: Due to the similarity of the proof procedures, we only prove (A.14) in the following. Without loss of generality, let $\mu_Y = 0$. Recall that the group information is denoted as $Z = \{z_{ik}: 1\le i\le N, 1\le k\le K\}$. Define $P^*(\cdot) = P(\cdot\mid Z)$, $E^*(\cdot) = E(\cdot\mid Z)$, and $\mathrm{cov}^*(\cdot) = \mathrm{cov}(\cdot\mid Z)$. Write $Y_i = (Y_{i1},\cdots,Y_{iT})^\top\in\mathbb{R}^T$. Given $Z$, $Y_i$ is a sub-Gaussian random vector with $\mathrm{cov}^*(Y_i) = \Sigma_i = (\sigma_{i,t_1t_2})\in\mathbb{R}^{T\times T}$, where $\sigma_{i,t_1t_2} = e_i^\top G^{t_1-t_2}\Sigma_Ye_i$ for $t_1\ge t_2$ and $\sigma_{i,t_1t_2} = e_i^\top\Sigma_Y(G^\top)^{t_2-t_1}e_i$ otherwise, and $G$ is pre-defined in (2.6) as $G = B_1W + B_2$. It can be derived that $\mathrm{var}^*(Y_i^\top Y_i)\le c\,\mathrm{tr}(\Sigma_i^2)$, where $c$ is a positive constant and $\mathrm{tr}(\Sigma_i^2) = T(e_i^\top\Sigma_Ye_i)^2 + 2\sum_{t=1}^{T-1}(T-t)(e_i^\top G^t\Sigma_Ye_i)^2$. It can be derived that $|\Sigma_Y|_e\preceq\alpha MM^\top$ and $|G^t\Sigma_Y|_e\preceq\alpha_1t^Jc_\beta^tMM^\top$ by (A.2) of Lemma 2, where $c_\beta$, $J$ and $M$ are defined in Lemma 2, and $\alpha$ and $\alpha_1$ are finite constants. In addition, it can be verified that $\sum_{t=1}^{T-1}(T-t)t^{2J}c_\beta^{2t}\le\alpha_2T$, where $\alpha_2$ is a finite constant. Therefore we have $\mathrm{tr}(\Sigma_i^2)\le T(\alpha + 2\alpha_1\alpha_2)(e_i^\top MM^\top e_i)^2$. Since we have $e_i^\top MM^\top e_i\le(J+1)e_i^\top M\mathbf{1}\le(J+1)^2 = O(1)$, it can be concluded that $\mathrm{tr}(\Sigma_i^2)\le T\alpha_3$, where $\alpha_3 = (\alpha + 2\alpha_1\alpha_2)(J+1)^2$. By Lemma 3, (A.14) can then be obtained.
Appendix B. Proof of Theorem 1

Let $\lambda_i(M)$ denote the $i$th eigenvalue of $M\in\mathbb{R}^{N\times N}$. We first verify that the solution (2.7) is strictly stationary. By Banerjee et al. (2014), we have $\max_i|\lambda_i(W)|\le 1$. Hence we have
\[ \max_{1\le i\le N}|\lambda_i(G)| \le \Big(\max_{1\le k\le K}|\beta_{1k}|\Big)\Big(\max_{1\le i\le N}|\lambda_i(W)|\Big) + \max_{1\le k\le K}|\beta_{2k}| < 1. \quad (A.18) \]
Consequently, $\lim_{m\to\infty}\sum_{j=0}^mG^jE_{t-j}$ exists and $\{Y_t\}$ given by (2.7) is a strictly stationary process. In addition, one could directly verify that $\{Y_t\}$ satisfies the GNAR model (2.4). Next, we verify that the strictly stationary solution (2.7) is unique. Assume $\{\widetilde Y_t\}$ is another strictly stationary solution to the GNAR model (2.4) with $E|\widetilde Y_t| < \infty$. Then we have $\widetilde Y_t = \sum_{j=0}^{m-1}G^j(B_0 + E_{t-j}) + G^m\widetilde Y_{t-m}$ for any positive integer $m$. Let $\rho = \max_k(|\beta_{1k}| + |\beta_{2k}|)$. Then one could verify that $E|Y_t - \widetilde Y_t| = E|\sum_{j=m}^\infty G^j(B_0 + E_{t-j}) - G^m\widetilde Y_{t-m}|\le C\rho^m$, where $C$ is a finite constant unrelated to $t$ and $m$. Note that $m$ can be chosen arbitrarily large. As a result, we have $E|Y_t - \widetilde Y_t| = 0$, i.e., $Y_t = \widetilde Y_t$ with probability one. This completes the proof.
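The stationarity condition (A.18) is easy to verify for a concrete network. A minimal simulation sketch (ours; the random row-normalized $W$, the group labels, and the coefficient values are illustrative choices satisfying $\max_k(|\beta_{1k}| + |\beta_{2k}|) = 0.7 < 1$):

```python
# Sketch of the stationarity condition in Appendix B: with a row-normalized
# W and max_k(|beta_1k| + |beta_2k|) < 1, the matrix G = B1 W + B2 has
# spectral radius < 1 and Y_t = B0 + G Y_{t-1} + E_t does not explode.
import numpy as np

rng = np.random.default_rng(3)
N, K, T = 300, 2, 500
A = (rng.random((N, N)) < 0.03).astype(float)
np.fill_diagonal(A, 0.0)
W = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)

beta1, beta2 = np.array([0.3, -0.2]), np.array([0.4, 0.5])  # c_beta = 0.7
z = rng.integers(0, K, size=N)                              # group labels
G = np.diag(beta1[z]) @ W + np.diag(beta2[z])

print("spectral radius of G:", np.abs(np.linalg.eigvals(G)).max())  # < 1

Y = np.zeros(N)
for _ in range(T):                       # the simulated process stays stable
    Y = G @ Y + rng.normal(size=N)
print("max |Y_T|:", np.abs(Y).max())
```

Because $W$ is row-normalized, every eigenvalue of $W$ lies in the unit disk, which is exactly the fact from Banerjee et al. (2014) used in (A.18).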
Appendix C. Proof of Theorem 2

According to (3.9), $\widehat\theta_k$ can be explicitly written as $\widehat\theta_k = \theta_k + \widehat\Sigma_k^{-1}\widehat\zeta_k$, where $\widehat\Sigma_k = (N_kT)^{-1}\sum_{t=1}^TX_{t-1}^{(k)\top}X_{t-1}^{(k)}$ and $\widehat\zeta_k = (N_kT)^{-1}\sum_{t=1}^TX_{t-1}^{(k)\top}E_t^{(k)}$. Without loss of generality, we assume $\sigma_k^2 = 1$ for $k = 1,\cdots,K$. Let $\Sigma_k = \lim_{N\to\infty}E(\widehat\Sigma_k)$. As a result, it suffices to show that
\[ \widehat\Sigma_k \to_p \Sigma_k, \quad (A.19) \]
\[ \sqrt{N_kT}\,\widehat\zeta_k = O_p(1), \quad (A.20) \]
as $\min\{N, T\}\to\infty$. Subsequently, we prove (A.19) in Step 1 and (A.20) in Step 2.

Step 1. Proof of (A.19). Define $Q = (I-G)^{-1}\Sigma_V(I-G^\top)^{-1}$. In this step, we intend to show that
\[ \widehat\Sigma_k = \frac{1}{N_kT}\sum_{t=1}^TX_{t-1}^{(k)\top}X_{t-1}^{(k)} = \begin{pmatrix}1 & S_{12} & S_{13} & S_{14}\\ & S_{22} & S_{23} & S_{24}\\ & & S_{33} & S_{34}\\ & & & S_{44}\end{pmatrix} \to_p \begin{pmatrix}1 & c_{1\beta} & c_{2\beta} & 0\\ & \Sigma_1 & \Sigma_2 & \kappa_8\gamma^\top\Sigma_z\\ & & \Sigma_3 & \kappa_3\gamma^\top\Sigma_z\\ & & & \Sigma_z\end{pmatrix} = \Sigma_k, \]
where both matrices are symmetric and the lower-triangular entries are omitted, and
\[ S_{12} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^Tw_i^\top Y_{t-1}, \quad S_{13} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^TY_{i(t-1)}, \quad S_{14} = \frac{1}{N_k}\sum_{i\in\mathcal{M}_k}V_i^\top, \]
\[ S_{22} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^T(w_i^\top Y_{t-1})^2, \quad S_{23} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^Tw_i^\top Y_{t-1}Y_{i(t-1)}, \]
\[ S_{24} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^Tw_i^\top Y_{t-1}V_i^\top, \quad S_{33} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^TY_{i(t-1)}^2, \]
\[ S_{34} = (N_kT)^{-1}\sum_{t=1}^T\sum_{i\in\mathcal{M}_k}Y_{i(t-1)}V_i^\top, \quad S_{44} = N_k^{-1}\sum_{i\in\mathcal{M}_k}V_iV_i^\top. \]
By (2.7), we have
\[ Y_t = (I-G)^{-1}b_0 + (I-G)^{-1}b_v + \widetilde Y_t, \quad (A.21) \]
where $b_0 = DB_0$, $b_v = DV\gamma$, and $\widetilde Y_t = \sum_{j=0}^\infty G^jE_{t-j}$. By the law of large numbers, one could directly obtain that $S_{44}\to_p\Sigma_v$ and $S_{14}\to_p0$. Subsequently, we only show the convergence of $S_{12}$ and $S_{23}$ in $\widehat\Sigma_k$ as follows.

Convergence of $S_{12}$. It can be derived that
\[ S_{12} = \frac{1}{N_kT}\sum_{t=1}^T\mathbf{1}^\top W^{(k)}Y_{t-1} = \frac{\mathbf{1}^\top W^{(k)}\mu_Y}{N_k} + S_{12a} + S_{12b}, \]
where $S_{12a} = N_k^{-1}\mathbf{1}^\top W^{(k)}(I-G)^{-1}b_v$ and $S_{12b} = (N_kT)^{-1}\sum_{t=1}^T\mathbf{1}^\top W^{(k)}\widetilde Y_{t-1}$. Then by (A.5) and (A.3) in Lemma 2, we have $N_k^{-2}\mathbf{1}^\top W^{(k)}QW^{(k)\top}\mathbf{1}\to 0$ and $N_k^{-1}\sum_{j=0}^\infty\{\mathbf{1}^\top W^{(k)}G^j(G^\top)^jW^{(k)\top}\mathbf{1}\}^{1/2}\to 0$ as $N\to\infty$. As a result, it is implied by Lemma 1 (a) and (c) that $S_{12a}\to_p0$ and $S_{12b}\to_p0$.

Convergence of $S_{23}$. Note that
\[ S_{23} = \frac{1}{N_kT}\sum_{i\in\mathcal{M}_k}\sum_{t=1}^Tw_i^\top Y_{t-1}Y_{i(t-1)} = \frac{1}{N_kT}\sum_{t=1}^TY_{t-1}^{(k)\top}W^{(k)}Y_{t-1} = \mu_Y^{(k)\top}W^{(k)}\mu_Y + S_{23a} + S_{23b} + S_{23c} + S_{23d} + S_{23e}, \]
where $S_{23a} = N_k^{-1}\widetilde b_v^\top I^{(k)\top}W^{(k)}\widetilde b_v$, $S_{23b} = N_k^{-1}T^{-1}\sum_{t=1}^T\widetilde Y_{t-1}^{(k)\top}W^{(k)}\widetilde Y_{t-1}$, $S_{23c} = N_k^{-1}T^{-1}\sum_{t=1}^T(\widetilde b_v^\top I^{(k)\top}W^{(k)}\widetilde Y_{t-1} + \widetilde Y_{t-1}^\top I^{(k)\top}W^{(k)}\widetilde b_v)$, $S_{23d} = N_k^{-1}T^{-1}\sum_{t=1}^T(\widetilde b_v^\top I^{(k)\top}W^{(k)}\mu_Y + \mu_Y^\top I^{(k)\top}W^{(k)}\widetilde b_v)$, and $S_{23e} = N_k^{-1}T^{-1}\sum_{t=1}^T(\widetilde Y_{t-1}^{(k)\top}\mu_Y^{(k)} + \mu_Y^\top I^{(k)\top}W^{(k)}\widetilde Y_{t-1})$, where $\mu_Y^{(k)} = W^{(k)}\mu_Y$ and $\widetilde b_v = (I-G)^{-1}b_v$. We next look at the terms one by one. First, we have $N_k^{-2}\,\mathrm{tr}(I^{(k)}QI^{(k)\top}W^{(k)}QW^{(k)\top})\to 0$ by (A.6) in Lemma 2 (c). Therefore, by (b) in Lemma 1, we have $S_{23a}\to_ps_{23a}$, where $s_{23a} = \lim_{N\to\infty}E(S_{23a})$. Next, for $S_{23b}$ we have $N_k^{-1}\sum_{i,j=0}^\infty[\mathrm{tr}\{I^{(k)}G^i(G^\top)^iI^{(k)\top}W^{(k)}G^j(G^\top)^jW^{(k)\top}\}]^{1/2}\to 0$ by (A.4) in Lemma 2 (b). Therefore, by (d) in Lemma 1, we have $S_{23b}\to_ps_{23b}$, where $s_{23b} = \lim_{N\to\infty}E(S_{23b})$.
Next, let $S_{23c} = S_{23c}^{(1)} + S_{23c}^{(2)}$, where $S_{23c}^{(1)} = N_k^{-1}T^{-1}\sum_{t=1}^T\widetilde b_v^\top I^{(k)\top}W^{(k)}\widetilde Y_{t-1}$ and $S_{23c}^{(2)} = N_k^{-1}T^{-1}\sum_{t=1}^T\widetilde Y_{t-1}^\top I^{(k)\top}W^{(k)}\widetilde b_v$. Note that we have $N_k^{-1}\sum_{j=0}^\infty[\mathrm{tr}\{W^{(k)}G^j(G^\top)^jW^{(k)\top}I^{(k)}QI^{(k)\top}\}]^{1/2}\to 0$ and $N_k^{-1}\sum_{j=0}^\infty[\mathrm{tr}\{I^{(k)}G^j(G^\top)^jI^{(k)\top}W^{(k)}QW^{(k)\top}\}]^{1/2}\to 0$ by (A.7) in Lemma 2 (c). Therefore, $S_{23c}\to_ps_{23c}$ by (e) in Lemma 1, where $s_{23c} = \lim_{N\to\infty}E(S_{23c})$. Next, by a proof similar to that for the convergence of $S_{13}$, we have $S_{23d}\to_p0$ and $S_{23e}\to_p0$. As a consequence, we have $S_{23}\to_p\Sigma_2$.

Step 2. Proof of (A.20). It can be verified that $\sqrt{N_kT}\,E(\widehat\zeta_k) = 0$. In addition, we have $\mathrm{var}\{\sqrt{N_kT}\,\widehat\zeta_k\} = E(\widehat\Sigma_k)\to\Sigma_k$ as $N\to\infty$. Consequently, we have $\sqrt{N_kT}\,\widehat\zeta_k = O_p(1)$.
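To make the objects in Steps 1–2 concrete: $\widehat\theta_k$ is a group-wise least squares fit. The sketch below is ours; it re-simulates the illustrative network and coefficients from the sketch in Appendix B and omits the nodal covariates $V_i$, so it is a simplified stand-in for the paper's (3.9) rather than a full implementation.

```python
# Group-wise least squares for a GNAR-type simulation: stack, for each node
# i in group k and each t, the regressor x_it = (1, w_i' Y_{t-1}, Y_{i,t-1})
# and solve min ||y - X theta_k||^2. Covariates V_i are omitted here.
import numpy as np

rng = np.random.default_rng(4)
N, K, T = 300, 2, 400
A = (rng.random((N, N)) < 0.03).astype(float)
np.fill_diagonal(A, 0.0)
W = A / np.maximum(A.sum(axis=1, keepdims=True), 1.0)
beta0 = np.array([0.5, -0.5])
beta1, beta2 = np.array([0.3, -0.2]), np.array([0.4, 0.5])
z = rng.integers(0, K, size=N)
G = np.diag(beta1[z]) @ W + np.diag(beta2[z])

Y = np.zeros((T + 1, N))
for t in range(T):
    Y[t + 1] = beta0[z] + G @ Y[t] + rng.normal(size=N)

for k in range(K):
    idx = np.where(z == k)[0]
    WY = Y[:-1] @ W.T            # (w_i' Y_{t-1}) for all i and t
    X = np.column_stack([
        np.ones(len(idx) * T),
        WY[:, idx].ravel(),      # network regressor
        Y[:-1][:, idx].ravel(),  # momentum regressor
    ])
    y = Y[1:][:, idx].ravel()
    theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    print(f"group {k}: estimate {np.round(theta_hat, 3)}"
          f"  truth {[beta0[k], beta1[k], beta2[k]]}")
```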
Appendix D. Proof of Theorem 3

Let $\widehat\Sigma^{(i)} = T^{-1}\sum_{t=1}^TX_{i(t-1)}X_{i(t-1)}^\top = (\widehat\sigma_{\cdot,l_1l_2}: 1\le l_1, l_2\le 3)\in\mathbb{R}^{3\times3}$ and $\widehat\Sigma_e^{(i)} = T^{-1}\sum_{t=1}^TX_{i(t-1)}\varepsilon_{it}$, where the index $i$ of $\widehat\sigma_{\cdot,l_1l_2}$ is omitted. We then have $\widehat b_i - b_i = (\widehat\Sigma^{(i)})^{-1}\widehat\Sigma_e^{(i)}$. Specifically, $\widehat\sigma_{\cdot,11} = 1$, $\widehat\sigma_{\cdot,12} = T^{-1}\sum_tw_i^\top Y_{t-1}$, $\widehat\sigma_{\cdot,13} = T^{-1}\sum_te_i^\top Y_{t-1}$, $\widehat\sigma_{\cdot,22} = T^{-1}\sum_tY_{i(t-1)}^2$, $\widehat\sigma_{\cdot,23} = T^{-1}\sum_tY_{i(t-1)}(w_i^\top Y_{t-1})$, and $\widehat\sigma_{\cdot,33} = T^{-1}\sum_t(w_i^\top Y_{t-1})^2$. Mathematically, it can be computed that $(\widehat\Sigma^{(i)})^{-1} = |\widehat\Sigma^{(i)}|^{-1}\widehat\Sigma^{(i)*}$, where $|\widehat\Sigma^{(i)}|$ is the determinant of $\widehat\Sigma^{(i)}$ and $\widehat\Sigma^{(i)*} = (\widehat\sigma^*_{\cdot,l_1l_2})$ is the adjugate matrix of $\widehat\Sigma^{(i)}$, with $\widehat\sigma^*_{\cdot,11} = \widehat\sigma_{\cdot,22}\widehat\sigma_{\cdot,33} - \widehat\sigma_{\cdot,23}^2$, $\widehat\sigma^*_{\cdot,12} = \widehat\sigma_{\cdot,13}\widehat\sigma_{\cdot,32} - \widehat\sigma_{\cdot,12}\widehat\sigma_{\cdot,33}$, $\widehat\sigma^*_{\cdot,13} = \widehat\sigma_{\cdot,21}\widehat\sigma_{\cdot,32} - \widehat\sigma_{\cdot,22}\widehat\sigma_{\cdot,31}$, $\widehat\sigma^*_{\cdot,22} = \widehat\sigma_{\cdot,11}\widehat\sigma_{\cdot,33} - \widehat\sigma_{\cdot,13}^2$, $\widehat\sigma^*_{\cdot,23} = \widehat\sigma_{\cdot,12}\widehat\sigma_{\cdot,13} - \widehat\sigma_{\cdot,11}\widehat\sigma_{\cdot,23}$, and $\widehat\sigma^*_{\cdot,33} = \widehat\sigma_{\cdot,11}\widehat\sigma_{\cdot,22} - \widehat\sigma_{\cdot,12}^2$. It can be derived that $|\widehat\Sigma^{(i)}| = \widehat\sigma_{\cdot,11}(\widehat\sigma_{\cdot,22}\widehat\sigma_{\cdot,33} - \widehat\sigma_{\cdot,23}^2) - \widehat\sigma_{\cdot,12}(\widehat\sigma_{\cdot,12}\widehat\sigma_{\cdot,33} - \widehat\sigma_{\cdot,13}\widehat\sigma_{\cdot,23}) + \widehat\sigma_{\cdot,13}(\widehat\sigma_{\cdot,12}\widehat\sigma_{\cdot,23} - \widehat\sigma_{\cdot,22}\widehat\sigma_{\cdot,13})$. By the maximum inequality, we have
\[ P\Big(\sup_i|\widehat b_i - b_i| > \nu\Big) \le \sum_{i=1}^NP(|\widehat b_i - b_i| > \nu). \quad (A.22) \]
In addition, we have
\[ P(|\widehat b_i - b_i| > \nu) \le P\big(\big||\widehat\Sigma^{(i)}| - \sigma^{(i)}\big| \ge \delta_i\big) + P\big(\big|\widehat\Sigma^{(i)*}\widehat\Sigma_e^{(i)}\big| \ge \delta_i\nu\big), \quad (A.23) \]
where $\sigma^{(i)} = \sigma_{\cdot,11}(\sigma_{\cdot,22}\sigma_{\cdot,33} - \sigma_{\cdot,23}^2) - \sigma_{\cdot,12}(\sigma_{\cdot,12}\sigma_{\cdot,33} - \sigma_{\cdot,13}\sigma_{\cdot,23}) + \sigma_{\cdot,13}(\sigma_{\cdot,12}\sigma_{\cdot,23} - \sigma_{\cdot,22}\sigma_{\cdot,13}) = (e_i^\top\Sigma_Ye_i)(w_i^\top\Sigma_Yw_i) - (e_i^\top\Sigma_Yw_i)^2$ and $\delta_i = \sigma^{(i)}/2$. By Lemma 4, for each component of $\widehat\Sigma^{(i)}$ we have $P(|\widehat\sigma_{\cdot,l_1l_2} - \sigma_{\cdot,l_1l_2}| > \nu_0)\le c_1\exp(-c_2T\nu_0^2)$, where $\sigma_{\cdot,l_1l_2} = E(\widehat\sigma_{\cdot,l_1l_2})$ and $\nu_0$ is a finite positive constant. Moreover, by the conditions of Theorem 3, we have $\sigma^{(i)}\ge\tau$ with probability tending to 1. Consequently, it is not difficult to obtain the result $P(||\widehat\Sigma^{(i)}| - \sigma^{(i)}|\ge\delta_i)\le c_1\exp(-c_2T\tau^2)$, where $c_1$, $c_2$ are finite constants. Subsequently, we have $P(|\widehat\Sigma^{(i)*}\widehat\Sigma_e^{(i)}|\ge\delta_i\nu)\le P(|\widehat\Sigma^{(i)*}\widehat\Sigma_e^{(i)}|\ge\tau\nu/2)$. By a similar technique, one could verify that each element of $\widehat\Sigma^{(i)*}$ and $\widehat\Sigma_e^{(i)}$ converges in probability and that the tail probabilities can be controlled, where the basic results are given in Lemma 4. Consequently, there exist constants $c_3$ and $c_4$ such that $P(|\widehat\Sigma^{(i)*}\widehat\Sigma_e^{(i)}|\ge\tau\nu/2)\le c_3\exp(-c_4T\tau^2\nu^2)$. Consequently, we have $P(|\widehat b_i - b_i| > \nu)\le c_1\exp(-c_2T\tau^2) + c_3\exp(-c_4T\tau^2\nu^2)$ by (A.23). By the condition $N = o(\exp(T))$, the right side of (A.22) goes to 0 as $N\to\infty$. This completes the proof.
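The node-wise estimates $\widehat b_i$ analyzed above are ordinary least squares fits built from the $3\times3$ design moments $\widehat\sigma_{\cdot,l_1l_2}$. A short sketch (ours; it continues from the simulation objects `N`, `T`, `W`, `Y`, `z`, `beta0`, `beta1`, `beta2` defined in the previous sketch) computes $\sup_i|\widehat b_i - b_i|$ directly:

```python
# Node-wise least squares as analyzed in Appendix D: for each node i,
# regress Y_it on x_{i,t-1} = (1, w_i' Y_{t-1}, Y_{i,t-1}) over t and track
# the worst-case estimation error across nodes.
import numpy as np

sup_err = 0.0
for i in range(N):
    # Per-node design matrix over t = 1, ..., T.
    Xi = np.column_stack([np.ones(T), Y[:-1] @ W[i], Y[:-1][:, i]])
    yi = Y[1:][:, i]
    b_hat = np.linalg.lstsq(Xi, yi, rcond=None)[0]
    b_true = np.array([beta0[z[i]], beta1[z[i]], beta2[z[i]]])
    sup_err = max(sup_err, np.abs(b_hat - b_true).max())
print("sup_i |b_hat_i - b_i| =", sup_err)   # shrinks as T grows
```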
References

Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2014). Hierarchical Modeling and Analysis for Spatial Data. CRC Press.

Saulis, L. and Statulevičius, V. (2012). Limit Theorems for Large Deviations, vol. 73. Springer Science & Business Media.

Zhu, X., Pan, R., Li, G., Liu, Y., and Wang, H. (2017). Network vector autoregression. Annals of Statistics, 45.