
Statistica Sinica: Supplement

IMPUTED FACTOR REGRESSION FOR HIGH-DIMENSIONAL BLOCK-WISE MISSING DATA

Yanqing Zhang, Niansheng Tang and Annie Qu

Yunnan University and University of Illinois at Urbana-Champaign

Supplementary Material

S1 Assumptions for the theoretical properties of the proposed method

Let $\|A\| = \{\mathrm{tr}(A'A)\}^{1/2}$ denote the norm of a matrix $A$, and let $C$ denote a generic positive constant. The following assumptions are needed to establish the theoretical properties in Section 3.3.

(C1) Factors: $E\|w_t\|^4 \leq C < \infty$, $n_1^{-1}\sum_{t=1}^{n_1} W_{(1)t}W_{(1)t}' \to_p \Sigma_w$ and $n^{-1}\sum_{t=1}^{n} w_tw_t' \to_p \Sigma_w$ for some $r \times r$ positive definite matrix $\Sigma_w$, where $W_{(1)t}$ is the $r \times 1$ vector given by the $t$th row of $W_{(1)} \in R^{n_1 \times r}$.

(C2) Factor loadings: $\|\lambda_i\| \leq \bar{\lambda} < \infty$ and $\|\Lambda'\Lambda/q - \Sigma_{\Lambda}\| \to 0$ for some positive finite value $\bar{\lambda}$ and some $r \times r$ positive definite matrix $\Sigma_{\Lambda}$.

(C3) Weak dependence: (i) $E(e_{ti}) = 0$, $E|e_{ti}|^8 \leq C$; (ii) $E(e_{ti}e_{lj}) = \tau_{ij}$

if $t = l$ and $0$ otherwise, $q^{-1}\sum_{i=1}^q \tau_{ii} \leq C$, $q^{-1}\sum_{i=1}^q\sum_{j=1}^q |\tau_{ij}| \leq C$, $\sum_{j=1}^q |\tau_{ij}| \leq C$, $q^{-1}\sum_{i=1}^q\sum_{j=1}^q \tau_{ij}^2 \leq C$; (iii) $q^{-1}\sum_{i=1}^q\sum_{j=1}^q |E(e_{ti}^2e_{tj}^2) - \tau_{ij}^2| \leq C$; (iv) $E(\|q^{-1/2}\sum_{i=1}^q \{e_{ti}e_{li} - E(e_{ti}e_{li})\}\|^4) \leq C$ for every $(t, l)$.

(C4) $E\{q^{-1}\sum_{i=1}^q \|n_1^{-1/2}\sum_{t=1}^{n_1} W_{(1)t}e_{(1)ti}\|^2\} \leq C$.

(C5) For any $t = 1, \ldots, n_1$, $E(\|(n_1q)^{-1/2}\sum_{i=1}^q\sum_{t=1}^{n_1} W_{(1)t}\lambda_i'e_{(1)ti}\|^2) \leq C$ and $E(\|(n_1q)^{-1/2}\sum_{i=1}^q\sum_{l=1}^{n_1} W_{(1)l}[e_{(1)li}e_{(1)ti} - E\{e_{(1)li}e_{(1)ti}\}]\|^2) \leq C$.

(C6) Predicting model: $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) \leq C < \infty$, $n^{-1}\sum_t w_t\varepsilon_t \to_p 0$ and $\|\alpha\| < \infty$.

Assumption (C1) is standard for factor models and allows the components of the factor variables to be correlated. Assumption (C2) ensures that each factor makes a nontrivial contribution to the variance of $z_t$. Here we consider only non-random factor loadings, for simplicity. Assumption (C3) is the weak-correlation assumption. Given Assumption (C3)(i), the remaining parts of (C3) are satisfied if the $e_{ti}$'s are independent over $i$. Assumptions (C3)(iii) and (iv) imply that the fourth and the eighth moments are bounded, respectively; thus, the proposed method is applicable to sub-Gaussian cases. The condition $\sum_{j=1}^q |\tau_{ij}| \leq C$ for all $i$ in Assumption (C3)(ii) implies that the eigenvalues of the covariance matrix of the random error $e_i$ are bounded, since the largest eigenvalue of the covariance matrix is bounded by

$\max_i \sum_{j=1}^q |\tau_{ij}|$. Assumption (C4) allows weak dependence between the factors and the random errors. When the factors and errors are independent, which is a standard assumption for conventional factor models, Assumption (C4) is implied by Assumptions (C1) and (C3), although independence is not required for (C4) to hold. Assumption (C5) is not restrictive, since the sums involved in (C5) are of zero-mean random variables. Assumption (C6) is a standard set of conditions implying consistency of the ordinary least squares estimator in the predicting model.

S2 The accuracy of screening for high-dimensional block-wise missing data

Let $M_o$ be the collection of indices of the nonzero parameters in the true sparse model $y = \sum_{l=1}^p X_l\beta_l + \varepsilon$, and let $p_o$ be the number of elements in $M_o$. Define $\Sigma = \mathrm{cov}(X)$ and $\tilde{X}_o^{(i)} = X_o^{(i)}\Sigma_{(i)}^{-1/2}$ for $i = 1, \ldots, K$, where $X_o^{(i)}$ is the $n_{o(i)} \times s_i$ matrix of observed values from the $i$th data source and $\Sigma_{(i)}$ is the corresponding submatrix of $\Sigma$. Thus, the covariance matrix of the transformed matrix $\tilde{X}_o^{(i)}$ is $I_{s_i}$. The following assumptions are needed for the accuracy of screening.

(A1) $\mathrm{Var}(y) = O(1)$, and for some $\kappa \geq 0$ and $c_1, c_2 > 0$,
$$\min_{l \in M_o} |\beta_l| \geq c_1 \min_{1\leq i\leq K} n_{o(i)}^{-\kappa} \quad\text{and}\quad \min_{l \in M_o} |\mathrm{cov}(\beta_l^{-1}y, X_l)| \geq c_2.$$

(A2) For $i = 1, \ldots, K$, $\tilde{X}_o^{(i)}$ has a continuous and spherically symmetric distribution, and there exist some $c_3 > 1$ and $C_1 > 0$ such that the deviation inequality
$$\Pr\{\lambda_{\max}(s_i^{-1}\tilde{X}_o^{(i)}\tilde{X}_o^{(i)\prime}) > c_3 \ \text{or}\ \lambda_{\min}(s_i^{-1}\tilde{X}_o^{(i)}\tilde{X}_o^{(i)\prime}) < 1/c_3\} \leq \exp(-C_1 n_{o(i)})$$
holds, where $\lambda_{\max}(\cdot)$ and $\lambda_{\min}(\cdot)$ denote the largest and smallest eigenvalues of a matrix, respectively. Also, $\varepsilon \sim N(0, \sigma^2)$ for some $\sigma > 0$.

(A3) There are some $\tau \geq 0$ and $c_4 > 0$ such that $\lambda_{\max}(\Sigma_{(i)}) \leq c_4 n_{o(i)}^{\tau}$ for $i = 1, \ldots, K$.

(A4) $p > n$ and $\log(p) = O(n^{\varrho})$ for some $\varrho \in (0, 1 - 2\kappa)$, where $\kappa$ is given by Condition (A1).

The following lemma establishes the accuracy of screening for high-dimensional block-wise missing data.

Lemma 1 (accuracy of screening). Under Conditions (A1)-(A4), if $2\kappa + \tau < 1$, then there is some $\gamma \in (0, 1)$ such that, when $\gamma \to 0$ in such a way that $\gamma n^{1-2\kappa-\tau} \to \infty$ as $n \to \infty$, we have, for some $C > 0$,
$$\Pr(M_o \subset M_{\gamma}) = 1 - O\big(\exp[-C\min\{\min_{1\leq i\leq K} s_i,\ \min_{1\leq i\leq K} n_{o(i)}^{1-2\kappa}/\log(n_{o(i)})\}]\big),$$

where $M_{\gamma} = \{1 \leq i \leq p : \omega_i \text{ is among the first } [\gamma p] \text{ largest of all}\}$.

Proof. The lemma can be proved by an argument similar to that of Fan and Lv (2008). The key difference is that Lemma 4 and Lemma 5 of Fan and Lv (2008) take different forms in our setting. In our proof, Lemma 4 becomes: for $i = 1, \ldots, K$ and any $C > 0$, there are constants $c_1$ and $c_2$ with $0 < c_1 < 1 < c_2$ such that
$$\Pr\big(\langle S^{(i)}e_1, e_1\rangle < c_1 n_{o(i)}/s_i \ \text{or}\ > c_2 n_{o(i)}/s_i\big) \leq 4\exp\{-C\min(n_{o(i)}, s_i)\},$$
where $S^{(i)} = (\tilde{X}_{o(i)}'\tilde{X}_{o(i)})^{+}\tilde{X}_{o(i)}'\tilde{X}_{o(i)}$ and $e_1 \in R^{s_i}$ is a unit vector with $1$ in the first entry and $0$ elsewhere. Lemma 5 becomes: let $S^{(i)}e_1 = (V_{(i)1}, \ldots, V_{(i)s_i})'$ for $i = 1, \ldots, K$; given that the first coordinate $V_{(i)1} = v$, the random vector $(V_{(i)2}, \ldots, V_{(i)s_i})'$ is uniformly distributed on the sphere $S^{s_i-2}\{(v - v^2)^{1/2}\}$; moreover, for any $C > 0$, there is some $c > 1$ such that
$$\Pr\big(|V_{(i)2}| > c\, n_{o(i)}^{1/2}s_i^{-1}|A|\big) \leq 3\exp\{-C\min(n_{o(i)}, s_i)\},$$
where $A$ is an independent $N(0, 1)$-distributed random variable. With these versions of Lemma 4 and Lemma 5, we are able to prove the lemma along the lines of Fan and Lv (2008).
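The screening rule of Lemma 1 can be illustrated numerically. The sketch below is ours, not the authors' implementation: it assumes the marginal utility $\omega_i$ is the squared marginal sample correlation between $X_i$ and $y$, computed on a single fully observed block, and it retains $M_\gamma$, the indices of the $[\gamma p]$ largest $\omega_i$. All variable names are illustrative.

```python
import numpy as np

def sis_screen(X, y, gamma):
    """Rank columns of X by squared marginal sample correlation with y
    (one concrete choice of marginal utility omega_i) and return the
    indices of the [gamma * p] largest -- the set M_gamma of Lemma 1."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc * yc[:, None]).sum(axis=0) ** 2
    den = (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    omega = num / den                      # squared marginal correlations
    keep = int(gamma * p)                  # [gamma p]
    return np.argsort(omega)[::-1][:keep]

# Toy check: 5 active predictors out of p = 1000, n = 200 observations.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                             # M_o = {0, 1, 2, 3, 4}
y = X @ beta + rng.standard_normal(n)
M_gamma = sis_screen(X, y, gamma=0.05)     # keep the top 50 indices
```

With high probability the retained set contains all of $M_o$, which is the content of Lemma 1; the exact $\omega_i$ used in the paper may differ from this choice.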

S3 Proof of Theorems

For the proofs of the theorems, we introduce some notation. Let $\|A\| = \{\mathrm{tr}(A'A)\}^{1/2}$ denote the norm of a matrix $A$, let $T_j$ be the collection of row indices in the $j$th data group $Z_{(j)}$, and let $M_{oj}$ and $M_{mj}$ be the collections of column indices of the observed data and the missing data in the $j$th data group $Z_{(j)}$, respectively. Denote $\delta = \min(\sqrt{n_1}, \sqrt{q})$ and $H = (\Lambda'\Lambda/q)(W_{(1)}'\tilde{W}_{(1)}/n_1)\tilde{V}_{(1)}^{-1}$, where $\tilde{V}_{(1)}$ is the $r \times r$ diagonal matrix of the first $r$ largest eigenvalues of $Z_{(1)}Z_{(1)}'/(n_1q)$ in decreasing order. Let $\tilde{D}_t = \{(d_{ti} - e_{ti})\delta_{ti};\ i = 1, \ldots, q\}$ for $t = 1, \ldots, n$, where $\delta_{ti} = 1$ if $Z_{ti}$ is missing and $\delta_{ti} = 0$ otherwise, and $d_{ti} = \tilde{\lambda}_i'\tilde{w}_t - \lambda_i'w_t$. We now provide the proofs of several lemmas and of the theorems. Lemmas 2, 4 and 5 are lemmas from Bai (2003), and Lemma 3 is Theorem 1 in Bai and Ng (2002); they are needed subsequently in the proofs of the theorems.

Lemma 2. Under Assumptions (C1)-(C4), as $n_1, q \to \infty$:
(i) $n_1^{-1}\tilde{W}_{(1)}'\{Z_{(1)}Z_{(1)}'/(n_1q)\}\tilde{W}_{(1)} = \tilde{V}_{(1)} \to_p V$,
(ii) $(\tilde{W}_{(1)}'W_{(1)}/n_1)(\Lambda'\Lambda/q)(W_{(1)}'\tilde{W}_{(1)}/n_1) \to_p V$,
where $V$ is the diagonal matrix consisting of the eigenvalues of $\Sigma_{\Lambda}\Sigma_w$.
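The notation above can be made concrete with a small numerical sketch (ours, not code from the paper): on a fully observed block, $\tilde{W}_{(1)}$ is $\sqrt{n_1}$ times the matrix of the first $r$ eigenvectors of $Z_{(1)}Z_{(1)}'/(n_1q)$, so that $\tilde{W}_{(1)}'\tilde{W}_{(1)}/n_1 = I_r$, $\tilde{V}_{(1)}$ collects the corresponding leading eigenvalues, and the loadings are $\tilde{\Lambda} = Z_{(1)}'\tilde{W}_{(1)}/n_1$. The variable names below are ours.

```python
import numpy as np

# Simulate one fully observed block Z_(1) = W Lambda' + e.
rng = np.random.default_rng(1)
n1, q, r = 150, 80, 2
W = rng.standard_normal((n1, r))             # true factors
Lam = rng.standard_normal((q, r))            # true loadings
Z = W @ Lam.T + 0.1 * rng.standard_normal((n1, q))

# Principal-component estimates.
G = Z @ Z.T / (n1 * q)                       # Z_(1) Z_(1)' / (n1 q)
eigval, eigvec = np.linalg.eigh(G)           # eigenvalues in ascending order
idx = np.argsort(eigval)[::-1][:r]           # first r largest
V_tilde = np.diag(eigval[idx])               # r x r diagonal V_(1)~
W_tilde = np.sqrt(n1) * eigvec[:, idx]       # normalization W'W/n1 = I_r
Lam_tilde = Z.T @ W_tilde / n1               # Lambda_tilde = Z' W_tilde / n1

# The identities used repeatedly in Section S3:
assert np.allclose(W_tilde.T @ W_tilde / n1, np.eye(r), atol=1e-8)
assert np.allclose(W_tilde.T @ G @ W_tilde / n1, V_tilde, atol=1e-8)
```

Note that $\tilde{W}_{(1)}$ and $\tilde{\Lambda}$ recover $W\Lambda'$ only up to the rotation $H$, which is exactly why the lemmas below bound $\tilde{W}_{(1)t} - H'W_{(1)t}$ and $\tilde{\lambda}_i - H^{-1}\lambda_i$ rather than raw differences.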

Lemma 3. Under Assumptions (C1)-(C4), we have $n_1^{-1}\sum_{u=1}^{n_1} \|\tilde{W}_{(1)u} - H'W_{(1)u}\|^2 = O_p(\delta^{-2})$.

Lemma 4. Under Assumptions (C1)-(C5), we have $n_1^{-1}(\tilde{W}_{(1)} - W_{(1)}H)'W_{(1)} = O_p(\delta^{-2})$.

Lemma 5. Under Assumptions (C1)-(C5), we have $n_1^{-1}\sum_{u=1}^{n_1} (\tilde{W}_{(1)u} - H'W_{(1)u})e_{(1)ui} = O_p(\delta^{-2})$ for $i = 1, \ldots, q$.

Lemma 6. Under Assumptions (C1)-(C5), we have
$$\sum_{i \in M_{oj}} (\tilde{\lambda}_i - H^{-1}\lambda_i)e_{li} = O_p\Big\{\frac{\sqrt{q_j}}{\min(\sqrt{n_1}, \sqrt{q})}\Big\}$$
for $l \in T_j$, $j = 2, \ldots, k$.

Proof. From $\tilde{\Lambda} = Z_{(1)}'\tilde{W}_{(1)}/n_1$ and $Z_{(1)} = W_{(1)}\Lambda' + e_{(1)}$, we have $\tilde{\Lambda} = \Lambda W_{(1)}'\tilde{W}_{(1)}/n_1 + e_{(1)}'\tilde{W}_{(1)}/n_1$. Writing $\tilde{W}_{(1)} = \tilde{W}_{(1)} - W_{(1)}H + W_{(1)}H$ and using $\tilde{W}_{(1)}'\tilde{W}_{(1)}/n_1 = I_r$, we obtain
$$\tilde{\lambda}_i = \tilde{W}_{(1)}'W_{(1)}\lambda_i/n_1 + \sum_{t=1}^{n_1}\tilde{W}_{(1)t}e_{(1)ti}/n_1 = \tilde{W}_{(1)}'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_i/n_1 + H^{-1}\lambda_i + \sum_{t=1}^{n_1}(\tilde{W}_{(1)t} - H'W_{(1)t})e_{(1)ti}/n_1 + H'\sum_{t=1}^{n_1}W_{(1)t}e_{(1)ti}/n_1$$
$$= (\tilde{W}_{(1)} - W_{(1)}H)'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_i/n_1 + H'W_{(1)}'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_i/n_1 + H^{-1}\lambda_i + \sum_{t=1}^{n_1}(\tilde{W}_{(1)t} - H'W_{(1)t})e_{(1)ti}/n_1 + H'\sum_{t=1}^{n_1}W_{(1)t}e_{(1)ti}/n_1.$$

Thus, we have
$$\tilde{\lambda}_i - H^{-1}\lambda_i = H'\sum_{u=1}^{n_1}W_{(1)u}e_{(1)ui}/n_1 + (\tilde{W}_{(1)} - W_{(1)}H)'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_i/n_1 + H'W_{(1)}'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_i/n_1 + \sum_{u=1}^{n_1}(\tilde{W}_{(1)u} - H'W_{(1)u})e_{(1)ui}/n_1. \tag{S3.1}$$
Thus, $\sum_{i \in M_{oj}}(\tilde{\lambda}_i - H^{-1}\lambda_i)e_{li} = I_1 + I_2 + I_3 + I_4$, where
$$I_1 = \sum_{i \in M_{oj}} H'\sum_{u=1}^{n_1}W_{(1)u}e_{(1)ui}e_{li}/n_1, \qquad I_2 = \sum_{i \in M_{oj}} (\tilde{W}_{(1)} - W_{(1)}H)'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_ie_{li}/n_1,$$
$$I_3 = \sum_{i \in M_{oj}} H'W_{(1)}'(W_{(1)} - \tilde{W}_{(1)}H^{-1})\lambda_ie_{li}/n_1, \qquad I_4 = \sum_{i \in M_{oj}}\sum_{u=1}^{n_1}(\tilde{W}_{(1)u} - H'W_{(1)u})e_{(1)ui}e_{li}/n_1.$$
Note that $\|H\| = O_p(1)$, because $\|H\| \leq \|\Lambda'\Lambda/q\|\,\|W_{(1)}'W_{(1)}/n_1\|^{1/2}\,\|\tilde{W}_{(1)}'\tilde{W}_{(1)}/n_1\|^{1/2}\,\|\tilde{V}_{(1)}^{-1}\|$ and each of these matrix norms is stochastically bounded by Assumptions (C1) and (C2), together with $\tilde{W}_{(1)}'\tilde{W}_{(1)}/n_1 = I_r$ and Lemma 2. By Assumptions (C3) and (C4), we have
$$\|I_1\| \leq \|H\|\sqrt{\frac{q_j}{n_1}}\Big\{q_j^{-1}\sum_{i \in M_{oj}}\Big\|n_1^{-1/2}\sum_{u=1}^{n_1}W_{(1)u}e_{(1)ui}\Big\|^2\Big\}^{1/2}\Big\{q_j^{-1}\sum_{i \in M_{oj}}e_{li}^2\Big\}^{1/2} = O_p\Big(\sqrt{\frac{q_j}{n_1}}\Big).$$
By Assumptions (C2) and (C3), we have $\|\sum_{i \in M_{oj}}\lambda_ie_{li}/\sqrt{q_j}\| = O_p(1)$, since $E(\sum_{i \in M_{oj}}\lambda_ie_{li}/\sqrt{q_j}) = 0$ and $E(\|\sum_{i \in M_{oj}}\lambda_ie_{li}/\sqrt{q_j}\|^2) \leq \bar{\lambda}^2\sum_{i \in M_{oj}}\tau_{ii}/q_j$

$\leq C$. By Lemma 3, we obtain
$$\|I_2\| \leq \Big(n_1^{-1}\sum_{u=1}^{n_1}\|\tilde{W}_{(1)u} - H'W_{(1)u}\|^2\Big)\|H^{-1}\|\,\sqrt{q_j}\,\Big\|q_j^{-1/2}\sum_{i \in M_{oj}}\lambda_ie_{li}\Big\| = O_p(\sqrt{q_j}\,\delta^{-2}).$$
By Lemma 4, we have
$$\|I_3\| \leq \|H\|\,\Big\|n_1^{-1}\sum_{u=1}^{n_1}W_{(1)u}(\tilde{W}_{(1)u} - H'W_{(1)u})'\Big\|\,\|H^{-1}\|\,\sqrt{q_j}\,\Big\|q_j^{-1/2}\sum_{i \in M_{oj}}\lambda_ie_{li}\Big\| = O_p(\sqrt{q_j}\,\delta^{-2}).$$
By Lemma 5 and Assumption (C3), we have
$$\|I_4\| \leq \sqrt{q_j}\Big\{q_j^{-1}\sum_{i \in M_{oj}}e_{li}^2\Big\}^{1/2}\Big\{q_j^{-1}\sum_{i \in M_{oj}}\Big\|n_1^{-1}\sum_{u=1}^{n_1}(\tilde{W}_{(1)u} - H'W_{(1)u})e_{(1)ui}\Big\|^2\Big\}^{1/2} = O_p(\sqrt{q_j}\,\delta^{-2}).$$
Thus, we obtain
$$\sum_{i \in M_{oj}}(\tilde{\lambda}_i - H^{-1}\lambda_i)e_{li} = O_p\Big\{\frac{\sqrt{q_j}}{\min(\sqrt{n_1}, \sqrt{q})}\Big\}.$$

Lemma 7. Under Assumptions (C1)-(C5), we have
$$\sum_{i \in M_{oj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2 = O_p\Big\{\frac{q_j}{\min(n_1, q^2)}\Big\}$$
for $j = 2, \ldots, k$.

Proof. By equation (S3.1) and the inequality $\|x + y + z + u\|^2 \leq 4(\|x\|^2 + \|y\|^2 + \|z\|^2 + \|u\|^2)$,

we have $\sum_{i \in M_{oj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2 \leq 4(J_1 + J_2 + J_3 + J_4)$, where
$$J_1 = \sum_{i \in M_{oj}}\Big\|H'\sum_{u=1}^{n_1}W_{(1)u}e_{(1)ui}/n_1\Big\|^2, \qquad J_2 = \sum_{i \in M_{oj}}\big\|(\tilde{W}_{(1)} - W_{(1)}H)'(\tilde{W}_{(1)} - W_{(1)}H)H^{-1}\lambda_i/n_1\big\|^2,$$
$$J_3 = \sum_{i \in M_{oj}}\big\|H'W_{(1)}'(\tilde{W}_{(1)} - W_{(1)}H)H^{-1}\lambda_i/n_1\big\|^2, \qquad J_4 = \sum_{i \in M_{oj}}\Big\|\sum_{u=1}^{n_1}(\tilde{W}_{(1)u} - H'W_{(1)u})e_{(1)ui}/n_1\Big\|^2.$$
By Assumption (C4), we have
$$J_1 \leq \frac{\|H\|^2q_j}{n_1}\Big(q_j^{-1}\sum_{i \in M_{oj}}\Big\|n_1^{-1/2}\sum_{u=1}^{n_1}W_{(1)u}e_{(1)ui}\Big\|^2\Big) = O_p\Big(\frac{q_j}{n_1}\Big).$$
By Assumption (C2) and Lemma 3, we have
$$J_2 \leq \Big\|n_1^{-1}\sum_{u=1}^{n_1}(\tilde{W}_{(1)u} - H'W_{(1)u})(\tilde{W}_{(1)u} - H'W_{(1)u})'\Big\|^2\|H^{-1}\|^2\Big(\sum_{i \in M_{oj}}\|\lambda_i\|^2\Big) \leq \Big(n_1^{-1}\sum_{u=1}^{n_1}\|\tilde{W}_{(1)u} - H'W_{(1)u}\|^2\Big)^2 O_p(1)\,q_j\bar{\lambda}^2 = O_p(q_j\delta^{-4}).$$
By Assumption (C2) and Lemma 4, we have
$$J_3 \leq \|H\|^2\Big\|n_1^{-1}\sum_{u=1}^{n_1}W_{(1)u}(\tilde{W}_{(1)u} - H'W_{(1)u})'\Big\|^2\|H^{-1}\|^2\Big(\sum_{i \in M_{oj}}\|\lambda_i\|^2\Big) = O_p(q_j\delta^{-4}).$$
By Lemma 5, we have $J_4 = O_p(q_j\delta^{-4})$. Then we obtain
$$\sum_{i \in M_{oj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2 = O_p\Big\{\frac{q_j}{\min(n_1, q^2)}\Big\}.$$

Lemma 8. Under Assumptions (C1)-(C5), as $q, n_1 \to \infty$, we have

(i) $\|\tilde{W}_{(1)t} - H'W_{(1)t}\| = O_p\{1/\min(n_1, \sqrt{q})\}$ for $t = 1, \ldots, n_1$,
(ii) $\|\tilde{\lambda}_i - H^{-1}\lambda_i\| = O_p\{1/\min(\sqrt{n_1}, q)\}$ for $i = 1, \ldots, q$,
(iii) $\|\tilde{W}_{(j)t} - H'W_{(j)t}\| = O_p\{q_j/(q\min(\sqrt{n_1}, \sqrt{q_j}))\}$ for $j = 2, \ldots, k$ and $t = 1, \ldots, n_j$.

Proof. Since $\tilde{W}_{(1)}$ and $\tilde{\lambda}_i$ are obtained from the completely observed data block $Z_{(1)}$, the proofs of parts (i) and (ii) are similar to those in Bai (2003); the details are omitted. Consider part (iii). Since $\tilde{W}_{(j)} = Z_{o(j)}\tilde{\Lambda}_{o(j)}(\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}$ and $Z_{o(j)} = W_{(j)}\Lambda_{o(j)}' + e_{o(j)}$ for $j = 2, \ldots, k$, we have, for $t = 1, \ldots, n_j$,
$$\tilde{W}_{(j)t} = (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}\tilde{\Lambda}_{o(j)}'\Lambda_{o(j)}W_{(j)t} + (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}\tilde{\Lambda}_{o(j)}'e_{o(j)t}$$
$$= H'W_{(j)t} + (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}\tilde{\Lambda}_{o(j)}'(\Lambda_{o(j)} - \tilde{\Lambda}_{o(j)}H')W_{(j)t} + (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}(\tilde{\Lambda}_{o(j)} - \Lambda_{o(j)}(H')^{-1})'e_{o(j)t} + (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}H^{-1}\Lambda_{o(j)}'e_{o(j)t}.$$
That is,
$$\tilde{W}_{(j)t} - H'W_{(j)t} = (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}\big\{(\tilde{\Lambda}_{o(j)} - \Lambda_{o(j)}(H')^{-1})'(\Lambda_{o(j)} - \tilde{\Lambda}_{o(j)}H')W_{(j)t} + H^{-1}\Lambda_{o(j)}'(\Lambda_{o(j)} - \tilde{\Lambda}_{o(j)}H')W_{(j)t} + (\tilde{\Lambda}_{o(j)} - \Lambda_{o(j)}(H')^{-1})'e_{o(j)t} + H^{-1}\Lambda_{o(j)}'e_{o(j)t}\big\} = (\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)})^{-1}(I_1 + I_2 + I_3 + I_4).$$

By Lemma 2, we have $\tilde{\Lambda}_{o(j)}'\tilde{\Lambda}_{o(j)} = \sum_{i=1}^{q_j}\tilde{\lambda}_{o(j)i}\tilde{\lambda}_{o(j)i}' \leq \sum_{i=1}^{q}\tilde{\lambda}_i\tilde{\lambda}_i' = \tilde{\Lambda}'\tilde{\Lambda} = q\{n_1^{-1}\tilde{W}_{(1)}'(Z_{(1)}Z_{(1)}'/(n_1q))\tilde{W}_{(1)}\} = O_p(q)$. Thus, by Assumption (C1) and Lemma 7, we have
$$\|I_1\| = \Big\|\sum_{i \in M_{oj}}(\tilde{\lambda}_i - H^{-1}\lambda_i)(\tilde{\lambda}_i - H^{-1}\lambda_i)'H'W_{(j)t}\Big\| \leq \Big(\sum_{i \in M_{oj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\Big)O_p(1) = O_p\Big\{\frac{q_j}{\min(n_1, q^2)}\Big\}.$$
By Assumptions (C1)-(C2) and Lemma 7, we have
$$\|I_2\| = \Big\|H^{-1}\sum_{i \in M_{oj}}\lambda_i(\tilde{\lambda}_i - H^{-1}\lambda_i)'H'W_{(j)t}\Big\| \leq O_p(1)\Big(\sum_{i \in M_{oj}}\|\lambda_i\|^2\Big)^{1/2}\Big(\sum_{i \in M_{oj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\Big)^{1/2} \leq O_p(1)\,\sqrt{q_j}\,\bar{\lambda}\,O_p\Big\{\frac{\sqrt{q_j}}{\min(\sqrt{n_1}, q)}\Big\} = O_p\Big\{\frac{q_j}{\min(\sqrt{n_1}, q)}\Big\}.$$
By Lemma 6, we have
$$I_3 = \sum_{i \in M_{oj}}(\tilde{\lambda}_i - H^{-1}\lambda_i)e_{ti} = O_p\Big\{\frac{\sqrt{q_j}}{\min(\sqrt{n_1}, \sqrt{q})}\Big\} \quad\text{for } t \in T_j.$$
By Assumptions (C2) and (C3), we have
$$\|I_4\| \leq \|H^{-1}\|\,\sqrt{q_j}\,\Big\|q_j^{-1/2}\sum_{i \in M_{oj}}\lambda_ie_{ti}\Big\| = O_p(\sqrt{q_j}).$$
Thus, we have
$$\|\tilde{W}_{(j)t} - H'W_{(j)t}\| = O_p\Big\{\frac{q_j}{q\min(\sqrt{n_1}, \sqrt{q_j})}\Big\}.$$

The proof of part (iii) is completed.

Proof of Theorem 1. Based on Lemma 8, we can obtain
$$\tilde{\lambda}_i'\tilde{w}_t - \lambda_i'w_t = (\tilde{\lambda}_i - H^{-1}\lambda_i)'(\tilde{w}_t - H'w_t) + (\tilde{\lambda}_i - H^{-1}\lambda_i)'H'w_t + \lambda_i'(H^{-1})'(\tilde{w}_t - H'w_t) = O_p\Big\{\frac{\sqrt{q_j}}{\min(\sqrt{n_1q_j}, q)}\Big\}.$$
The proof of the theorem is completed.

Lemma 9. Under Assumptions (C1), (C2) and (C3), we have $\|\Lambda'\tilde{D}_t/q\| = O_p\{1/\min(\sqrt{n_1}, \sqrt{q})\}$ for $t \in T_j$, $j = 2, \ldots, k$.

Proof. Based on the definition of $\tilde{D}_t$, we have, for $t \in T_j$ ($j = 2, \ldots, k$),
$$\Lambda'\tilde{D}_t/q = \sum_{i=1}^{q}\lambda_i(d_{ti} - e_{ti})\delta_{ti}/q = \sum_{i \in M_{mj}}\lambda_i(d_{ti} - e_{ti})/q \leq \Big\|\sum_{i \in M_{mj}}\lambda_id_{ti}/q\Big\| + \Big\|\sum_{i \in M_{mj}}\lambda_ie_{ti}/q\Big\|.$$
Based on Assumptions (C1) and (C2), we have
$$q^{-1}\sum_{i \in M_{mj}}\lambda_id_{ti} = q^{-1}\sum_{i \in M_{mj}}\lambda_i(\tilde{\lambda}_i'\tilde{w}_t - \lambda_i'w_t) = q^{-1}\sum_{i \in M_{mj}}\lambda_i\{(\tilde{\lambda}_i - H^{-1}\lambda_i)'(\tilde{w}_t - H'w_t) + (\tilde{\lambda}_i - H^{-1}\lambda_i)'H'w_t + \lambda_i'(H^{-1})'(\tilde{w}_t - H'w_t)\},$$
so that
$$\Big\|q^{-1}\sum_{i \in M_{mj}}\lambda_id_{ti}\Big\| \leq \frac{\bar{\lambda}(q - q_j)}{q}\Big\{\frac{1}{q - q_j}\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|\Big\}\|\tilde{w}_t - H'w_t\| + \frac{\bar{\lambda}(q - q_j)}{q}\Big\{\frac{1}{q - q_j}\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|\Big\}\|H'w_t\| + \Big\|q^{-1}\sum_{i=1}^{q}\lambda_i\lambda_i'\Big\|\,\|H^{-1}\|\,\|\tilde{w}_t - H'w_t\| = O_p\Big(\frac{1}{\sqrt{n_1}}\Big) + O_p\Big(\frac{\sqrt{q_j}}{q}\Big).$$
By Assumptions (C2) and (C3), we have $\|q^{-1}\sum_{i \in M_{mj}}\lambda_ie_{ti}\| = O_p(1/\sqrt{q})$. Thus,

we obtain $\|\Lambda'\tilde{D}_t/q\| = O_p\{1/\min(\sqrt{n_1}, \sqrt{q})\}$. The proof of the lemma is completed.

Lemma 10. Under Assumptions (C1)-(C5), we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}|e_l'\tilde{D}_t|^2 = O_p\Big\{\frac{1}{\min(n_1, q)}\Big\} \quad\text{for } t \in T_j,\ j = 2, \ldots, k.$$

Proof. Based on the definition of $\tilde{D}_t$, we have, for $t \in T_j$, $j = 2, \ldots, k$,
$$\frac{1}{nq^2}\sum_{l=1}^{n}|e_l'\tilde{D}_t|^2 = \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}e_{li}(d_{ti} - e_{ti})\delta_{ti}\Big|^2 \leq \frac{2}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i \in M_{mj}}e_{li}d_{ti}\Big|^2 + \frac{2}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i \in M_{mj}}e_{li}e_{ti}\Big|^2. \tag{S3.2}$$
By Assumption (C3), we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i \in M_{mj}}e_{li}d_{ti}\Big|^2 \leq \Big(\frac{1}{nq}\sum_{l=1}^{n}\sum_{i \in M_{mj}}e_{li}^2\Big)\Big(\frac{1}{q}\sum_{i \in M_{mj}}d_{ti}^2\Big).$$
Based on Lemmas 7 and 8, we can obtain
$$\frac{1}{q}\sum_{i \in M_{mj}}d_{ti}^2 = \frac{1}{q}\sum_{i \in M_{mj}}(\tilde{\lambda}_i'\tilde{w}_t - \lambda_i'w_t)^2 = \frac{1}{q}\sum_{i \in M_{mj}}\{(\tilde{\lambda}_i - H^{-1}\lambda_i)'(\tilde{w}_t - H'w_t) + (\tilde{\lambda}_i - H^{-1}\lambda_i)'H'w_t + \lambda_i'(H^{-1})'(\tilde{w}_t - H'w_t)\}^2$$
$$\leq \frac{C(q - q_j)}{q}\Big\{\Big(\frac{1}{q - q_j}\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\Big)\|\tilde{w}_t - H'w_t\|^2 + \Big(\frac{1}{q - q_j}\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\Big)\|H\|^2\|w_t\|^2 + \bar{\lambda}^2\|H^{-1}\|^2\|\tilde{w}_t - H'w_t\|^2\Big\} = O_p\Big(\frac{1}{n_1}\Big) + O_p\Big(\frac{q_j}{q^2}\Big). \tag{S3.3}$$

Thus, we have $(nq^2)^{-1}\sum_{l=1}^{n}|\sum_{i \in M_{mj}}e_{li}d_{ti}|^2 \leq O_p(1/n_1) + O_p(q_j/q^2)$. By Assumption (C3), we have
$$E\Big\{\frac{1}{nq}\sum_{l=1}^{n}\Big|\sum_{i \in M_{mj}}e_{li}e_{ti}\Big|^2\Big\} = \frac{1}{nq}\sum_{l=1}^{n}\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}E(e_{li}e_{ti}e_{lu}e_{tu}) \leq \frac{1}{q}\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}\tau_{iu}^2 + \frac{1}{nq}\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}|E(e_{ti}^2e_{tu}^2) - \tau_{iu}^2| \leq C.$$
Thus, $(nq^2)^{-1}\sum_{l=1}^{n}|\sum_{i \in M_{mj}}e_{li}e_{ti}|^2 = O_p(1/q)$. From equation (S3.2), we have $(nq^2)^{-1}\sum_{l=1}^{n}|e_l'\tilde{D}_t|^2 = O_p\{1/\min(n_1, q)\}$. The proof of the lemma is completed.

Lemma 11. Under Assumptions (C1)-(C5), we have
$$\frac{1}{nq}\sum_{l=1}^{n}\|\Lambda'\tilde{D}_l\|^2 = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n}\Big\}.$$

Proof. Based on the definition of $\tilde{D}_l$, we have
$$\frac{1}{nq}\sum_{l=1}^{n}\|\Lambda'\tilde{D}_l\|^2 \leq \frac{2}{nq}\sum_{l=1}^{n}\Big\|\sum_{i=1}^{q}\lambda_id_{li}\delta_{li}\Big\|^2 + \frac{2}{nq}\sum_{l=1}^{n}\Big\|\sum_{i=1}^{q}\lambda_ie_{li}\delta_{li}\Big\|^2. \tag{S3.4}$$
Then, we have
$$\frac{1}{nq}\sum_{l=1}^{n}\Big\|\sum_{i=1}^{q}\lambda_id_{li}\delta_{li}\Big\|^2 \leq \frac{C}{n}\sum_{l=1}^{n}\sum_{i=1}^{q}\|\lambda_i\|^2d_{li}^2\delta_{li} \leq \frac{C}{n}\sum_{l=1}^{n}\sum_{i=1}^{q}\|\lambda_i\|^2\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\|\tilde{w}_l - H'w_l\|^2\delta_{li} + \frac{C}{n}\sum_{l=1}^{n}\sum_{i=1}^{q}\|\lambda_i\|^2\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\|H\|^2\|w_l\|^2\delta_{li} + \frac{C}{n}\sum_{l=1}^{n}\sum_{i=1}^{q}\|\lambda_i\lambda_i'\|^2\|H^{-1}\|^2\|\tilde{w}_l - H'w_l\|^2\delta_{li} = I_1 + I_2 + I_3.$$

By Lemmas 7 and 8, we have
$$I_1 \leq \frac{C\bar{\lambda}^2}{n}\sum_{l=1}^{n}\Big(\sum_{i=1}^{q}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\delta_{li}\Big)\|\tilde{w}_l - H'w_l\|^2 = \frac{C\bar{\lambda}^2}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big(\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\Big)\|\tilde{w}_l - H'w_l\|^2I(l \in T_j) = \sum_{j=2}^{k}O_p\Big\{\frac{n_jq_j^2}{nq^2\min(n_1, q_j)\min(n_1, q^2)}\Big\},$$
where $I(\cdot)$ is the indicator function. Similarly, we have
$$I_2 \leq \frac{C\bar{\lambda}^2}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big\{\sum_{i \in M_{mj}}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2I(l \in T_j)\Big\}\|H\|^2\|w_l\|^2 = \sum_{j=2}^{k}O_p\Big\{\frac{n_j}{n\min(n_1, q^2)}\Big\},$$
and
$$I_3 \leq \frac{C}{n}\sum_{i=1}^{q}\|\lambda_i\lambda_i'\|^2\sum_{l=1}^{n}\sum_{j=2}^{k}\|\tilde{w}_l - H'w_l\|^2I(l \in T_j) = \sum_{j=2}^{k}O_p\Big\{\frac{n_jq_j^2}{nq^2\min(n_1, q_j)}\Big\}.$$
Thus, $\frac{1}{nq}\sum_{l=1}^{n}\|\sum_{i=1}^{q}\lambda_id_{li}\delta_{li}\|^2 \leq O_p\{\max_{2\leq j\leq k}(n_j)/(nn_1)\} + O_p\{\max_{2\leq j\leq k}(n_jq_j)/(nq^2)\}$. By Assumptions (C2) and (C3), we have
$$E\Big\{\frac{1}{nq}\sum_{l=1}^{n}\Big\|\sum_{i=1}^{q}\lambda_ie_{li}\delta_{li}\Big\|^2\Big\} = \frac{1}{nq}\sum_{l=1}^{n}\sum_{i=1}^{q}\sum_{u=1}^{q}E(\lambda_i'\lambda_ue_{li}e_{lu})\delta_{li}\delta_{lu} \leq \frac{\bar{\lambda}^2}{nq}\sum_{l=1}^{n}\sum_{i=1}^{q}\sum_{u=1}^{q}|E(e_{li}e_{lu})|\delta_{li}\delta_{lu} = \frac{\bar{\lambda}^2}{nq}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big\{\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}|\tau_{iu}|\Big\}I(l \in T_j) = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n}\Big\}.$$

From inequality (S3.4), we have $(nq)^{-1}\sum_{l=1}^{n}\|\Lambda'\tilde{D}_l\|^2 = O_p\{\max_{2\leq j\leq k}(n_j)/n\}$. The proof of the lemma is completed.

Lemma 12. Under Assumptions (C1)-(C5), we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'e_t\|^2 = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n\min(n_1, q)}\Big\} \quad\text{for } t = 1, \ldots, n.$$

Proof. Based on the definition of $\tilde{D}_l$, we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'e_t\|^2 \leq \frac{2}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}d_{li}e_{ti}\delta_{li}\Big|^2 + \frac{2}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}e_{li}e_{ti}\delta_{li}\Big|^2, \tag{S3.5}$$
where
$$\frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}d_{li}e_{ti}\delta_{li}\Big|^2 \leq \frac{1}{n}\sum_{l=1}^{n}\Big(\frac{1}{q}\sum_{i=1}^{q}d_{li}^2\delta_{li}\Big)\Big(\frac{1}{q}\sum_{i=1}^{q}e_{ti}^2\Big).$$
By Assumption (C3), we know $E\{q^{-1}\sum_{i=1}^{q}e_{ti}^2\} \leq C$. From (S3.3), we have
$$\frac{1}{n}\sum_{l=1}^{n}\frac{1}{q}\sum_{i=1}^{q}d_{li}^2\delta_{li} = \frac{1}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big(\frac{1}{q}\sum_{i \in M_{mj}}d_{li}^2\Big)I(l \in T_j) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nn_1}\Big\} + O_p\Big\{\frac{\max_{2\leq j\leq k}(n_jq_j)}{nq^2}\Big\}.$$
Thus, $(nq^2)^{-1}\sum_{l=1}^{n}|\sum_{i=1}^{q}d_{li}e_{ti}\delta_{li}|^2 \leq O_p\{\max_{2\leq j\leq k}(n_j)/(nn_1)\} + O_p\{\max_{2\leq j\leq k}(n_jq_j)/(nq^2)\}$. By Assumption (C3), we have
$$E\Big\{\frac{1}{nq}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}e_{li}e_{ti}\delta_{li}\Big|^2\Big\} = \frac{1}{nq}\sum_{l=1}^{n}\sum_{i=1}^{q}\sum_{u=1}^{q}E(e_{li}e_{ti}e_{lu}e_{tu})\delta_{li}\delta_{lu} = \frac{1}{nq}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big\{\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}E(e_{li}e_{ti}e_{lu}e_{tu})\Big\}I(l \in T_j) \leq \sum_{j=2}^{k}\Big\{\frac{n_j}{nq}\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}\tau_{iu}^2 + \frac{1}{nq}\sum_{i \in M_{mj}}\sum_{u \in M_{mj}}|E(e_{ti}^2e_{tu}^2) - \tau_{iu}^2|\Big\} = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n}\Big\}.$$
Thus, $(nq^2)^{-1}\sum_{l=1}^{n}|\sum_{i=1}^{q}e_{li}e_{ti}\delta_{li}|^2 = O_p\{\max_{2\leq j\leq k}(n_j)/(nq)\}$. From inequality (S3.5),

we have $(nq^2)^{-1}\sum_{l=1}^{n}\|\tilde{D}_l'e_t\|^2 = O_p\{\max_{2\leq j\leq k}(n_j)/(n\min(n_1, q))\}$. The proof of the lemma is completed.

Lemma 13. Under Assumptions (C1)-(C5), we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'\tilde{D}_t\|^2 = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n\min(n_1, q)}\Big\} \quad\text{for } t \in T_u,\ u = 2, \ldots, k.$$

Proof. Based on the definition of $\tilde{D}_l$, we have
$$\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'\tilde{D}_t\|^2 = \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}\tilde{D}_{li}\tilde{D}_{ti}\Big|^2 = \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}(d_{li} - e_{li})(d_{ti} - e_{ti})\delta_{li}\delta_{ti}\Big|^2 \leq C\Big\{\frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}d_{li}d_{ti}\delta_{li}\delta_{ti}\Big|^2 + \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}d_{li}e_{ti}\delta_{li}\delta_{ti}\Big|^2 + \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}e_{li}d_{ti}\delta_{li}\delta_{ti}\Big|^2 + \frac{1}{nq^2}\sum_{l=1}^{n}\Big|\sum_{i=1}^{q}e_{li}e_{ti}\delta_{li}\delta_{ti}\Big|^2\Big\} = C(I_1 + I_2 + I_3 + I_4).$$
From (S3.3), we have
$$I_1 \leq \frac{1}{n}\sum_{l=1}^{n}\Big(\frac{1}{q}\sum_{i=1}^{q}d_{li}^2\delta_{li}\Big)\Big(\frac{1}{q}\sum_{i=1}^{q}d_{ti}^2\delta_{ti}\Big) = \frac{1}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big(\frac{1}{q}\sum_{i \in M_{mj}}d_{li}^2\Big)I(l \in T_j)\Big(\frac{1}{q}\sum_{i \in M_{mu}}d_{ti}^2\Big) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nn_1^2}\Big\} + O_p\Big\{\frac{\max_{2\leq j\leq k}(n_jq_j)}{nn_1q^2}\Big\} + O_p\Big\{\frac{\max_{2\leq j\leq k}(n_jq_j)q_u}{nq^4}\Big\}.$$
Based on Assumption (C3) and (S3.3), we have
$$I_2 \leq \frac{1}{n}\sum_{l=1}^{n}\Big(\frac{1}{q}\sum_{i=1}^{q}d_{li}^2\delta_{li}\Big)\Big(\frac{1}{q}\sum_{i=1}^{q}e_{ti}^2\delta_{ti}\Big) = \frac{1}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big(\frac{1}{q}\sum_{i \in M_{mj}}d_{li}^2\Big)I(l \in T_j)\Big(\frac{1}{q}\sum_{i \in M_{mu}}e_{ti}^2\Big) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nn_1}\Big\} + O_p\Big\{\frac{\max_{2\leq j\leq k}(n_jq_j)}{nq^2}\Big\}.$$

Similarly, we have
$$I_3 \leq \frac{1}{n}\sum_{l=1}^{n}\Big(\frac{1}{q}\sum_{i=1}^{q}e_{li}^2\delta_{li}\Big)\Big(\frac{1}{q}\sum_{i=1}^{q}d_{ti}^2\delta_{ti}\Big) = \frac{1}{n}\sum_{l=1}^{n}\sum_{j=2}^{k}\Big(\frac{1}{q}\sum_{i \in M_{mj}}e_{li}^2\Big)I(l \in T_j)\Big(\frac{1}{q}\sum_{i \in M_{mu}}d_{ti}^2\Big) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nn_1}\Big\} + O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)q_u}{nq^2}\Big\}.$$
Based on Assumption (C3), we have
$$I_4 = \frac{1}{nq^2}\sum_{l=1}^{n}\sum_{i=1}^{q}\sum_{a=1}^{q}e_{li}e_{la}e_{ti}e_{ta}\delta_{li}\delta_{la}\delta_{ti}\delta_{ta} = \frac{1}{nq^2}\sum_{j=2}^{k}\sum_{l=1}^{n}\Big\{\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}e_{li}e_{la}e_{ti}e_{ta}\Big\}I(l \in T_j) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nq}\Big\},$$
since
$$\frac{1}{q}E\Big\{\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}e_{li}e_{la}e_{ti}e_{ta}\Big\} = \frac{1}{q}\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}\tau_{ia}^2 \leq C \quad\text{if } l \neq t,$$
and
$$\frac{1}{q}E\Big\{\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}e_{li}e_{la}e_{ti}e_{ta}\Big\} \leq \frac{1}{q}\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}|E(e_{ti}^2e_{ta}^2) - \tau_{ia}^2| + \frac{1}{q}\sum_{i \in M_{mj}\cap M_{mu}}\sum_{a \in M_{mj}\cap M_{mu}}\tau_{ia}^2 \leq C \quad\text{if } l = t.$$
Thus, we can obtain $(nq^2)^{-1}\sum_{l=1}^{n}\|\tilde{D}_l'\tilde{D}_t\|^2 = O_p\{\max_{2\leq j\leq k}(n_j)/(n\min(n_1, q))\}$. The proof of the lemma is completed.

Lemma 14. Under Assumptions (C1)-(C5), we have
$$\frac{1}{n}\sum_{t=1}^{n}d_{ti}^2\delta_{ti} = O_p\Big\{\frac{\max_{2\leq j\leq k}(n_jq_j)}{nq^2}\Big\} + O_p\Big\{\frac{1}{\min(q^2, n_1)}\Big\} \quad\text{for } i = 1, \ldots, q.$$

Proof. Based on the definition of $d_{ti}$, we have
$$\frac{1}{n}\sum_{t=1}^{n}d_{ti}^2\delta_{ti} = \frac{1}{n}\sum_{t=1}^{n}(\tilde{\lambda}_i'\tilde{w}_t - \lambda_i'w_t)^2\delta_{ti} \leq C\Big\{\frac{1}{n}\sum_{t=1}^{n}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\|\tilde{w}_t - H'w_t\|^2\delta_{ti} + \frac{1}{n}\sum_{t=1}^{n}\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\|H\|^2\|w_t\|^2\delta_{ti} + \frac{1}{n}\sum_{t=1}^{n}\|\lambda_i\|^2\|H^{-1}\|^2\|\tilde{w}_t - H'w_t\|^2\delta_{ti}\Big\} = C(I_1 + I_2 + I_3).$$
Based on Lemma 8, we have
$$I_1 = \frac{1}{n}\sum_{j=2}^{k}\Big(\sum_{t \in T_j}\|\tilde{w}_t - H'w_t\|^2\Big)\|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2I(i \in M_{mj}) \leq \sum_{j=2}^{k}O_p\Big\{\frac{n_jq_j^2}{nq^2\min(n_1, q_j)\min(n_1, q^2)}\Big\}I(i \in M_{mj}).$$
By Assumption (C1), we have $I_2 \leq \|\tilde{\lambda}_i - H^{-1}\lambda_i\|^2\|H\|^2(n^{-1}\sum_{t=1}^{n}\|w_t\|^2) = O_p\{1/\min(n_1, q^2)\}$. By Assumption (C2), we have
$$I_3 = \frac{1}{n}\sum_{j=2}^{k}\sum_{t \in T_j}\|\lambda_i\|^2\|H^{-1}\|^2\|\tilde{w}_t - H'w_t\|^2I(i \in M_{mj}) \leq \sum_{j=2}^{k}O_p\Big\{\frac{n_jq_j^2}{nq^2\min(n_1, q_j)}\Big\}I(i \in M_{mj}).$$
Thus, we can obtain $n^{-1}\sum_{t=1}^{n}d_{ti}^2\delta_{ti} \leq O_p\{\max_{2\leq j\leq k}(n_jq_j)/(nq^2)\} + O_p\{1/\min(q^2, n_1)\}$. The proof of the lemma is completed.

Proof of Theorem 2. Using the definition of $\hat{W}$, we have $(nq)^{-1}\hat{Z}\hat{Z}'\hat{W} = \hat{W}\hat{V}$, where $\hat{V}$ is a diagonal matrix whose diagonal elements are the first $r$

eigenvalues of $\hat{Z}\hat{Z}'/(nq)$ in decreasing order. We have
$$\hat{w}_t = \frac{1}{nq}\hat{V}^{-1}\hat{W}'\hat{Z}\hat{Z}_t = \frac{1}{nq}\hat{V}^{-1}\hat{W}'(Z + \tilde{D})(Z_t + \tilde{D}_t) = \frac{1}{nq}\hat{V}^{-1}\hat{W}'(W\Lambda' + e + \tilde{D})(\Lambda w_t + e_t + \tilde{D}_t)$$
$$= \frac{1}{nq}\hat{V}^{-1}\{\hat{W}'W\Lambda'\Lambda w_t + \hat{W}'W\Lambda'e_t + \hat{W}'W\Lambda'\tilde{D}_t + \hat{W}'e\Lambda w_t + \hat{W}'ee_t + \hat{W}'e\tilde{D}_t + \hat{W}'\tilde{D}\Lambda w_t + \hat{W}'\tilde{D}e_t + \hat{W}'\tilde{D}\tilde{D}_t\}.$$
Thus, we obtain
$$\hat{w}_t - H'w_t = \hat{V}^{-1}\Big\{\frac{1}{nq}\hat{W}'W\Lambda'e_t + \frac{1}{nq}\hat{W}'W\Lambda'\tilde{D}_t + \frac{1}{nq}\hat{W}'e\Lambda w_t + \frac{1}{nq}\hat{W}'ee_t + \frac{1}{nq}\hat{W}'e\tilde{D}_t + \frac{1}{nq}\hat{W}'\tilde{D}\Lambda w_t + \frac{1}{nq}\hat{W}'\tilde{D}e_t + \frac{1}{nq}\hat{W}'\tilde{D}\tilde{D}_t\Big\} = \hat{V}^{-1}\sum_{i=1}^{8}I_i.$$
Using the definition of $\tilde{D}_t$, we have
$$\hat{w}_t - H'w_t = \begin{cases}\hat{V}^{-1}(I_1 + I_3 + I_4 + I_6 + I_7), & t \in T_1,\\ \hat{V}^{-1}\sum_{i=1}^{8}I_i, & t \in T_u,\ u = 2, \ldots, k.\end{cases}$$
By Assumptions (C1), (C2) and (C3), together with $\hat{W}'\hat{W}/n = I_r$, we have
$$\|I_1\| = \Big\|\frac{1}{nq}\hat{W}'W\Lambda'e_t\Big\| \leq \frac{1}{\sqrt{q}}\Big\|\frac{\hat{W}'\hat{W}}{n}\Big\|^{1/2}\Big\|\frac{W'W}{n}\Big\|^{1/2}\Big\|\frac{1}{\sqrt{q}}\sum_{i=1}^{q}\lambda_ie_{ti}\Big\| = O_p\Big(\frac{1}{\sqrt{q}}\Big),$$
where $\|q^{-1/2}\sum_{i=1}^{q}\lambda_ie_{ti}\| = O_p(1)$, since $E\{q^{-1/2}\sum_{i=1}^{q}\lambda_ie_{ti}\} = 0$ and $E(\|q^{-1/2}\sum_{i=1}^{q}\lambda_ie_{ti}\|^2) \leq \bar{\lambda}^2q^{-1}\sum_{i=1}^{q}\tau_{ii} \leq C$. Based on Assumption (C1)

and (C3) with $\hat{W}'\hat{W}/n = I_r$, we have
$$\|I_3\| = \Big\|\frac{1}{nq}\hat{W}'e\Lambda w_t\Big\| = \Big\|\frac{1}{nq}\sum_{l=1}^{n}\hat{w}_le_l'\Lambda w_t\Big\| \leq \frac{1}{\sqrt{q}}\Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\|w_t\|\Big(\frac{1}{n}\sum_{l=1}^{n}\Big\|\frac{1}{\sqrt{q}}\sum_{i=1}^{q}\lambda_ie_{li}\Big\|^2\Big)^{1/2} = O_p\Big(\frac{1}{\sqrt{q}}\Big).$$
Based on Assumption (C3), we can obtain
$$\|I_4\| = \Big\|\frac{1}{nq}\hat{W}'ee_t\Big\| = \Big\|\frac{1}{nq}\sum_{l=1}^{n}\hat{w}_le_l'e_t\Big\| \leq \frac{1}{\sqrt{q}}\Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\Big(\frac{1}{nq}\sum_{l=1}^{n}\Big|\frac{e_l'e_t}{\sqrt{q}}\Big|^2\Big)^{1/2} = O_p\Big(\frac{1}{\sqrt{q}}\Big),$$
where $(nq)^{-1}\sum_{l=1}^{n}|e_l'e_t/\sqrt{q}|^2 = O_p(1)$, since
$$E\Big\{\frac{1}{nq}\sum_{l=1}^{n}\Big|\frac{e_l'e_t}{\sqrt{q}}\Big|^2\Big\} = \frac{1}{nq^2}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{l=1}^{n}E(e_{li}e_{lj}e_{ti}e_{tj}) = \frac{1}{nq^2}\sum_{i=1}^{q}\sum_{j=1}^{q}\{n\tau_{ij}^2 + E(e_{ti}^2e_{tj}^2) - \tau_{ij}^2\} = \frac{1}{q^2}\sum_{i=1}^{q}\sum_{j=1}^{q}\tau_{ij}^2 + \frac{1}{nq^2}\sum_{i=1}^{q}\sum_{j=1}^{q}\{E(e_{ti}^2e_{tj}^2) - \tau_{ij}^2\} \leq C.$$
Based on Assumption (C1), $\hat{W}'\hat{W}/n = I_r$ and Lemma 9, we have, for $t \in T_u$, $u = 2, \ldots, k$,
$$\|I_2\| = \Big\|\frac{1}{nq}\hat{W}'W\Lambda'\tilde{D}_t\Big\| \leq \Big\|\frac{\hat{W}'\hat{W}}{n}\Big\|^{1/2}\Big\|\frac{W'W}{n}\Big\|^{1/2}\Big\|\frac{\Lambda'\tilde{D}_t}{q}\Big\| = O_p(1)\Big\|\frac{\Lambda'\tilde{D}_t}{q}\Big\| = O_p\Big\{\frac{1}{\min(\sqrt{n_1}, \sqrt{q})}\Big\}.$$
For $\hat{W}'\hat{W}/n = I_r$ and Lemma 10, we have
$$\|I_5\| \leq \Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\Big(\frac{1}{nq^2}\sum_{l=1}^{n}|e_l'\tilde{D}_t|^2\Big)^{1/2} = O_p(1)\Big(\frac{1}{nq^2}\sum_{l=1}^{n}|e_l'\tilde{D}_t|^2\Big)^{1/2} = O_p\Big\{\frac{1}{\min(\sqrt{n_1}, \sqrt{q})}\Big\}$$
for $t \in T_u$, $u = 2, \ldots, k$. By Assumption (C1), $\hat{W}'\hat{W}/n = I_r$ and Lemma 11,

we have
$$\|I_6\| = \Big\|\frac{1}{nq}\sum_{l=1}^{n}\hat{w}_l\tilde{D}_l'\Lambda w_t\Big\| \leq \frac{1}{\sqrt{q}}\Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\|w_t\|\Big(\frac{1}{nq}\sum_{l=1}^{n}\|\Lambda'\tilde{D}_l\|^2\Big)^{1/2} = O_p\Big[\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nq}\Big\}^{1/2}\Big].$$
Similarly, by Lemma 12, we have
$$\|I_7\| \leq \Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\Big(\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'e_t\|^2\Big)^{1/2} = O_p\Big[\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n\min(n_1, q)}\Big\}^{1/2}\Big].$$
Based on Lemma 13, we can obtain
$$\|I_8\| \leq \Big(\frac{1}{n}\sum_{l=1}^{n}\|\hat{w}_l\|^2\Big)^{1/2}\Big(\frac{1}{nq^2}\sum_{l=1}^{n}\|\tilde{D}_l'\tilde{D}_t\|^2\Big)^{1/2} = O_p\Big[\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n\min(n_1, q)}\Big\}^{1/2}\Big]$$
for $t \in T_u$, $u = 2, \ldots, k$. By an argument similar to that of Lemma A.1 in Bai (2003), we can obtain that $\hat{V}^{-1}$ is bounded. Thus, we have
$$\hat{w}_t - H'w_t = O_p\Big(\frac{1}{\sqrt{q}}\Big) + O_p\Big[\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{nn_1}\Big\}^{1/2}\Big] \quad\text{for } t \in T_1,$$
and $\hat{w}_t - H'w_t = O_p\{1/\min(\sqrt{q}, \sqrt{n_1})\}$ for $t \in T_u$, $u = 2, \ldots, k$. The proof of part (i) of Theorem 2 is completed.

We now prove part (ii). From $\hat{\Lambda} = \hat{Z}'\hat{W}/n = \sum_{t=1}^{n}\hat{Z}_t\hat{w}_t'/n$ and $\hat{Z}_t = \Lambda w_t + e_t + \tilde{D}_t$, we have $\hat{\lambda}_i = n^{-1}\sum_{t=1}^{n}\hat{w}_tw_t'\lambda_i + n^{-1}\sum_{t=1}^{n}\hat{w}_te_{ti} + n^{-1}\sum_{t=1}^{n}\hat{w}_t\tilde{D}_{ti}$. Writing $w_t = w_t - (H^{-1})'\hat{w}_t + (H^{-1})'\hat{w}_t$ and using $\hat{W}'\hat{W}/n = I_r$,

we have
$$\hat{\lambda}_i - H^{-1}\lambda_i = -\frac{1}{n}\sum_{t=1}^{n}(\hat{w}_t - H'w_t)(\hat{w}_t - H'w_t)'H^{-1}\lambda_i - \frac{1}{n}\sum_{t=1}^{n}H'w_t(\hat{w}_t - H'w_t)'H^{-1}\lambda_i + \frac{1}{n}\sum_{t=1}^{n}(\hat{w}_t - H'w_t)e_{ti} + \frac{1}{n}\sum_{t=1}^{n}H'w_te_{ti} + \frac{1}{n}\sum_{t=1}^{n}(\hat{w}_t - H'w_t)\tilde{D}_{ti} + \frac{1}{n}\sum_{t=1}^{n}H'w_t\tilde{D}_{ti} = J_1 + J_2 + J_3 + J_4 + J_5 + J_6.$$
Assumptions (C1) and (C2), together with $\hat{W}'\hat{W}/n = I_r$, imply that $\|H\| \leq \|\Lambda'\Lambda/q\|\,\|W'W/n\|^{1/2}\,\|\hat{W}'\hat{W}/n\|^{1/2}\,\|\hat{V}^{-1}\| = O_p(1)$. Following part (i) of Theorem 2, we have
$$\|J_1\| \leq \frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\,\|H^{-1}\lambda_i\| = O_p\Big\{\frac{1}{\min(q, n_1)}\Big\}.$$
Based on Assumptions (C1)-(C2) and $\hat{W}'\hat{W}/n = I_r$, we have
$$\|J_2\| \leq \|H\|\Big(\frac{1}{n}\sum_{t=1}^{n}\|w_t\|^2\Big)^{1/2}\Big(\frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\Big)^{1/2}\|H^{-1}\lambda_i\| = O_p\Big\{\frac{1}{\min(\sqrt{q}, \sqrt{n_1})}\Big\}.$$
By Assumption (C3), we have
$$\|J_3\| \leq \Big(\frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\Big)^{1/2}\Big(\frac{1}{n}\sum_{t=1}^{n}e_{ti}^2\Big)^{1/2} = O_p\Big\{\frac{1}{\min(\sqrt{q}, \sqrt{n_1})}\Big\}.$$
By Assumption (C4), we have $J_4 = H'n^{-1}\sum_{t=1}^{n}w_te_{ti} = O_p(1/\sqrt{n_1})$. Based on the definition of $\tilde{D}_t$, we have
$$\|J_5\| = \Big\|\frac{1}{n}\sum_{t=1}^{n}(\hat{w}_t - H'w_t)(d_{ti} - e_{ti})\delta_{ti}\Big\| \leq \Big(\frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\delta_{ti}\Big)^{1/2}\Big(\frac{1}{n}\sum_{t=1}^{n}d_{ti}^2\delta_{ti}\Big)^{1/2} + \Big(\frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\delta_{ti}\Big)^{1/2}\Big(\frac{1}{n}\sum_{t=1}^{n}e_{ti}^2\delta_{ti}\Big)^{1/2},$$

where
$$\frac{1}{n}\sum_{t=1}^{n}\|\hat{w}_t - H'w_t\|^2\delta_{ti} = \sum_{j=2}^{k}\frac{1}{n}\sum_{t \in T_j}\|\hat{w}_t - H'w_t\|^2I(i \in M_{mj}) \leq O_p\Big\{\frac{\max_{2\leq j\leq k}(n_j)}{n\min(q, n_1)}\Big\} = O_p\Big\{\frac{1}{\min(q, n_1)}\Big\}.$$
By Assumption (C3), we have $n^{-1}\sum_{t=1}^{n}e_{ti}^2\delta_{ti} = O_p(1)$. Based on Lemma 14, we have $J_5 = O_p\{1/\min(\sqrt{q}, \sqrt{n_1})\}$. Similarly, we can prove that $J_6 = O_p\{1/\min(\sqrt{q}, \sqrt{n_1})\}$. Thus, we can obtain $\hat{\lambda}_i - H^{-1}\lambda_i = O_p\{1/\min(\sqrt{q}, \sqrt{n_1})\}$. The proof of part (ii) is completed.

Proof of Theorem 3. Based on the definition of $\hat{\alpha}$, we have
$$\hat{\alpha} = (\hat{W}'\hat{W})^{-1}\hat{W}'Y = \Big(\sum_{t=1}^{n}\hat{w}_t\hat{w}_t'\Big)^{-1}\sum_{t=1}^{n}\hat{w}_ty_t$$
$$= \Big\{\sum_{t=1}^{n}(\hat{w}_t - H'w_t)(\hat{w}_t - H'w_t)' + \sum_{t=1}^{n}(\hat{w}_t - H'w_t)w_t'H + H'\sum_{t=1}^{n}w_t(\hat{w}_t - H'w_t)' + H'\sum_{t=1}^{n}w_tw_t'H\Big\}^{-1}\Big\{\sum_{t=1}^{n}(\hat{w}_t - H'w_t)y_t + H'\sum_{t=1}^{n}w_ty_t\Big\}$$
$$= \Big\{H'\sum_{t=1}^{n}w_tw_t'H + o_p(n)\Big\}^{-1}\Big\{H'\sum_{t=1}^{n}w_tw_t'\alpha + H'\sum_{t=1}^{n}w_t\varepsilon_t + o_p(n)\Big\} = H^{-1}\alpha + H^{-1}\Big(\sum_{t=1}^{n}w_tw_t'\Big)^{-1}\sum_{t=1}^{n}w_t\varepsilon_t + o_p(1),$$
since $\|H\| = O_p(1)$ and Assumption (C1) together with part (i) of Theorem 2 hold. Thus, from Assumption (C6), we can obtain
$$\hat{\alpha} - H^{-1}\alpha = H^{-1}\Big(\sum_{t=1}^{n}w_tw_t'\Big)^{-1}\sum_{t=1}^{n}w_t\varepsilon_t + o_p(1) \to_p 0.$$

The proof of part (i) is completed. Similarly, we have
$$\hat{\alpha}'\hat{w}_t - \alpha'w_t = \hat{\alpha}'\hat{w}_t - (H^{-1}\alpha)'(H'w_t) = (\hat{\alpha} - H^{-1}\alpha)'(\hat{w}_t - H'w_t) + (\hat{\alpha} - H^{-1}\alpha)'H'w_t + (H^{-1}\alpha)'(\hat{w}_t - H'w_t) \to_p 0.$$
The proof of part (ii) is completed.

Bibliography

Bai, J. and Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica 70, 191-221.

Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica 71, 135-171.

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society B 70, 849-911.


Statistica Sinica Preprint No: SS-2018-0008
Title: Imputed Factor Regression for High-dimensional Block-wise Missing Data
Manuscript ID: SS-2018-0008
URL: http://www.stat.sinica.edu.tw/statistica/
DOI: 10.5705/ss.202018.0008


More information

1 Regression with High Dimensional Data

1 Regression with High Dimensional Data 6.883 Learning with Combinatorial Structure ote for Lecture 11 Instructor: Prof. Stefanie Jegelka Scribe: Xuhong Zhang 1 Regression with High Dimensional Data Consider the following regression problem:

More information

Linear Ordinary Differential Equations

Linear Ordinary Differential Equations MTH.B402; Sect. 1 20180703) 2 Linear Ordinary Differential Equations Preliminaries: Matrix Norms. Denote by M n R) the set of n n matrix with real components, which can be identified the vector space R

More information

Factor-Adjusted Robust Multiple Test. Jianqing Fan (Princeton University)

Factor-Adjusted Robust Multiple Test. Jianqing Fan (Princeton University) Factor-Adjusted Robust Multiple Test Jianqing Fan Princeton University with Koushiki Bose, Qiang Sun, Wenxin Zhou August 11, 2017 Outline 1 Introduction 2 A principle of robustification 3 Adaptive Huber

More information

Chapter 17: Undirected Graphical Models

Chapter 17: Undirected Graphical Models Chapter 17: Undirected Graphical Models The Elements of Statistical Learning Biaobin Jiang Department of Biological Sciences Purdue University bjiang@purdue.edu October 30, 2014 Biaobin Jiang (Purdue)

More information

Ma 227 Review for Systems of DEs

Ma 227 Review for Systems of DEs Ma 7 Review for Systems of DEs Matrices Basic Properties Addition and subtraction: Let A a ij mn and B b ij mn.then A B a ij b ij mn 3 A 6 B 6 4 7 6 A B 6 4 3 7 6 6 7 3 Scaler Multiplication: Let k be

More information

Linear Systems and Matrices

Linear Systems and Matrices Department of Mathematics The Chinese University of Hong Kong 1 System of m linear equations in n unknowns (linear system) a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.......

More information

2 Two-Point Boundary Value Problems

2 Two-Point Boundary Value Problems 2 Two-Point Boundary Value Problems Another fundamental equation, in addition to the heat eq. and the wave eq., is Poisson s equation: n j=1 2 u x 2 j The unknown is the function u = u(x 1, x 2,..., x

More information

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma

Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Multivariate Statistics Random Projections and Johnson-Lindenstrauss Lemma Suppose again we have n sample points x,..., x n R p. The data-point x i R p can be thought of as the i-th row X i of an n p-dimensional

More information

high-dimensional inference robust to the lack of model sparsity

high-dimensional inference robust to the lack of model sparsity high-dimensional inference robust to the lack of model sparsity Jelena Bradic (joint with a PhD student Yinchu Zhu) www.jelenabradic.net Assistant Professor Department of Mathematics University of California,

More information

High-dimensional Ordinary Least-squares Projection for Screening Variables

High-dimensional Ordinary Least-squares Projection for Screening Variables 1 / 38 High-dimensional Ordinary Least-squares Projection for Screening Variables Chenlei Leng Joint with Xiangyu Wang (Duke) Conference on Nonparametric Statistics for Big Data and Celebration to Honor

More information

Kernel Method: Data Analysis with Positive Definite Kernels

Kernel Method: Data Analysis with Positive Definite Kernels Kernel Method: Data Analysis with Positive Definite Kernels 2. Positive Definite Kernel and Reproducing Kernel Hilbert Space Kenji Fukumizu The Institute of Statistical Mathematics. Graduate University

More information

Kernel Methods. Machine Learning A W VO

Kernel Methods. Machine Learning A W VO Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance

More information

The inverse of a tridiagonal matrix

The inverse of a tridiagonal matrix Linear Algebra and its Applications 325 (2001) 109 139 www.elsevier.com/locate/laa The inverse of a tridiagonal matrix Ranjan K. Mallik Department of Electrical Engineering, Indian Institute of Technology,

More information

Improving linear quantile regression for

Improving linear quantile regression for Improving linear quantile regression for replicated data arxiv:1901.0369v1 [stat.ap] 16 Jan 2019 Kaushik Jana 1 and Debasis Sengupta 2 1 Imperial College London, UK 2 Indian Statistical Institute, Kolkata,

More information

Inference After Variable Selection

Inference After Variable Selection Department of Mathematics, SIU Carbondale Inference After Variable Selection Lasanthi Pelawa Watagoda lasanthi@siu.edu June 12, 2017 Outline 1 Introduction 2 Inference For Ridge and Lasso 3 Variable Selection

More information

STAT 992 Paper Review: Sure Independence Screening in Generalized Linear Models with NP-Dimensionality J.Fan and R.Song

STAT 992 Paper Review: Sure Independence Screening in Generalized Linear Models with NP-Dimensionality J.Fan and R.Song STAT 992 Paper Review: Sure Independence Screening in Generalized Linear Models with NP-Dimensionality J.Fan and R.Song Presenter: Jiwei Zhao Department of Statistics University of Wisconsin Madison April

More information

Variable Selection for Highly Correlated Predictors

Variable Selection for Highly Correlated Predictors Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu Department of Statistics, University of Illinois at Urbana-Champaign WHOA-PSI, Aug, 2017 St. Louis, Missouri 1 / 30 Background Variable

More information

CHAPTER 10 Shape Preserving Properties of B-splines

CHAPTER 10 Shape Preserving Properties of B-splines CHAPTER 10 Shape Preserving Properties of B-splines In earlier chapters we have seen a number of examples of the close relationship between a spline function and its B-spline coefficients This is especially

More information

Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang

Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang The supplementary material contains some additional simulation

More information

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis

More information

ASYMPTOTIC PROPERTIES OF BRIDGE ESTIMATORS IN SPARSE HIGH-DIMENSIONAL REGRESSION MODELS

ASYMPTOTIC PROPERTIES OF BRIDGE ESTIMATORS IN SPARSE HIGH-DIMENSIONAL REGRESSION MODELS ASYMPTOTIC PROPERTIES OF BRIDGE ESTIMATORS IN SPARSE HIGH-DIMENSIONAL REGRESSION MODELS Jian Huang 1, Joel L. Horowitz 2, and Shuangge Ma 3 1 Department of Statistics and Actuarial Science, University

More information

Statistics & Data Sciences: First Year Prelim Exam May 2018

Statistics & Data Sciences: First Year Prelim Exam May 2018 Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book

More information

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2 Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution Defn: Z R 1 N(0,1) iff f Z (z) = 1 2π e z2 /2 Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) (a column

More information

ANOVA: Analysis of Variance - Part I

ANOVA: Analysis of Variance - Part I ANOVA: Analysis of Variance - Part I The purpose of these notes is to discuss the theory behind the analysis of variance. It is a summary of the definitions and results presented in class with a few exercises.

More information

Covariance Stationary Time Series. Example: Independent White Noise (IWN(0,σ 2 )) Y t = ε t, ε t iid N(0,σ 2 )

Covariance Stationary Time Series. Example: Independent White Noise (IWN(0,σ 2 )) Y t = ε t, ε t iid N(0,σ 2 ) Covariance Stationary Time Series Stochastic Process: sequence of rv s ordered by time {Y t } {...,Y 1,Y 0,Y 1,...} Defn: {Y t } is covariance stationary if E[Y t ]μ for all t cov(y t,y t j )E[(Y t μ)(y

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition

Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2018 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing

More information

5: MULTIVARATE STATIONARY PROCESSES

5: MULTIVARATE STATIONARY PROCESSES 5: MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalarvalued random variables on the same probability

More information

Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as

Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as MAHALANOBIS DISTANCE Def. The euclidian distance between two points x = (x 1,...,x p ) t and y = (y 1,...,y p ) t in the p-dimensional space R p is defined as d E (x, y) = (x 1 y 1 ) 2 + +(x p y p ) 2

More information

Ch.10 Autocorrelated Disturbances (June 15, 2016)

Ch.10 Autocorrelated Disturbances (June 15, 2016) Ch10 Autocorrelated Disturbances (June 15, 2016) In a time-series linear regression model setting, Y t = x tβ + u t, t = 1, 2,, T, (10-1) a common problem is autocorrelation, or serial correlation of the

More information

On the Skeel condition number, growth factor and pivoting strategies for Gaussian elimination

On the Skeel condition number, growth factor and pivoting strategies for Gaussian elimination On the Skeel condition number, growth factor and pivoting strategies for Gaussian elimination J.M. Peña 1 Introduction Gaussian elimination (GE) with a given pivoting strategy, for nonsingular matrices

More information

II. Determinant Functions

II. Determinant Functions Supplemental Materials for EE203001 Students II Determinant Functions Chung-Chin Lu Department of Electrical Engineering National Tsing Hua University May 22, 2003 1 Three Axioms for a Determinant Function

More information

LECTURE 10 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA. In this lecture, we continue to discuss covariance stationary processes.

LECTURE 10 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA. In this lecture, we continue to discuss covariance stationary processes. MAY, 0 LECTURE 0 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA In this lecture, we continue to discuss covariance stationary processes. Spectral density Gourieroux and Monfort 990), Ch. 5;

More information

Random matrices: Distribution of the least singular value (via Property Testing)

Random matrices: Distribution of the least singular value (via Property Testing) Random matrices: Distribution of the least singular value (via Property Testing) Van H. Vu Department of Mathematics Rutgers vanvu@math.rutgers.edu (joint work with T. Tao, UCLA) 1 Let ξ be a real or complex-valued

More information

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i )

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i ) Direct Methods for Linear Systems Chapter Direct Methods for Solving Linear Systems Per-Olof Persson persson@berkeleyedu Department of Mathematics University of California, Berkeley Math 18A Numerical

More information

Gaussian Elimination and Back Substitution

Gaussian Elimination and Back Substitution Jim Lambers MAT 610 Summer Session 2009-10 Lecture 4 Notes These notes correspond to Sections 31 and 32 in the text Gaussian Elimination and Back Substitution The basic idea behind methods for solving

More information

MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

More information

Chapter 3. Linear and Nonlinear Systems

Chapter 3. Linear and Nonlinear Systems 59 An expert is someone who knows some of the worst mistakes that can be made in his subject, and how to avoid them Werner Heisenberg (1901-1976) Chapter 3 Linear and Nonlinear Systems In this chapter

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional

More information

EQUIVARIANT COHOMOLOGY IN ALGEBRAIC GEOMETRY LECTURE TEN: MORE ON FLAG VARIETIES

EQUIVARIANT COHOMOLOGY IN ALGEBRAIC GEOMETRY LECTURE TEN: MORE ON FLAG VARIETIES EQUIVARIANT COHOMOLOGY IN ALGEBRAIC GEOMETRY LECTURE TEN: MORE ON FLAG VARIETIES WILLIAM FULTON NOTES BY DAVE ANDERSON 1 A. Molev has just given a simple, efficient, and positive formula for the structure

More information

RANK AND PERIMETER PRESERVER OF RANK-1 MATRICES OVER MAX ALGEBRA

RANK AND PERIMETER PRESERVER OF RANK-1 MATRICES OVER MAX ALGEBRA Discussiones Mathematicae General Algebra and Applications 23 (2003 ) 125 137 RANK AND PERIMETER PRESERVER OF RANK-1 MATRICES OVER MAX ALGEBRA Seok-Zun Song and Kyung-Tae Kang Department of Mathematics,

More information

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)

SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Chapter 14 SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Today we continue the topic of low-dimensional approximation to datasets and matrices. Last time we saw the singular

More information

Basic Concepts in Linear Algebra

Basic Concepts in Linear Algebra Basic Concepts in Linear Algebra Grady B Wright Department of Mathematics Boise State University February 2, 2015 Grady B Wright Linear Algebra Basics February 2, 2015 1 / 39 Numerical Linear Algebra Linear

More information

A note on a Marčenko-Pastur type theorem for time series. Jianfeng. Yao

A note on a Marčenko-Pastur type theorem for time series. Jianfeng. Yao A note on a Marčenko-Pastur type theorem for time series Jianfeng Yao Workshop on High-dimensional statistics The University of Hong Kong, October 2011 Overview 1 High-dimensional data and the sample covariance

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

Linear Algebra Highlights

Linear Algebra Highlights Linear Algebra Highlights Chapter 1 A linear equation in n variables is of the form a 1 x 1 + a 2 x 2 + + a n x n. We can have m equations in n variables, a system of linear equations, which we want to

More information

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Course 495: Advanced Statistical Machine Learning/Pattern Recognition Course 495: Advanced Statistical Machine Learning/Pattern Recognition Deterministic Component Analysis Goal (Lecture): To present standard and modern Component Analysis (CA) techniques such as Principal

More information

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57

More information

Online Exercises for Linear Algebra XM511

Online Exercises for Linear Algebra XM511 This document lists the online exercises for XM511. The section ( ) numbers refer to the textbook. TYPE I are True/False. Lecture 02 ( 1.1) Online Exercises for Linear Algebra XM511 1) The matrix [3 2

More information

High Dimensional Inverse Covariate Matrix Estimation via Linear Programming

High Dimensional Inverse Covariate Matrix Estimation via Linear Programming High Dimensional Inverse Covariate Matrix Estimation via Linear Programming Ming Yuan October 24, 2011 Gaussian Graphical Model X = (X 1,..., X p ) indep. N(µ, Σ) Inverse covariance matrix Σ 1 = Ω = (ω

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Lecture 14: SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Lecturer: Sanjeev Arora

Lecture 14: SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Lecturer: Sanjeev Arora princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 14: SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Lecturer: Sanjeev Arora Scribe: Today we continue the

More information

Ma/CS 6b Class 20: Spectral Graph Theory

Ma/CS 6b Class 20: Spectral Graph Theory Ma/CS 6b Class 20: Spectral Graph Theory By Adam Sheffer Eigenvalues and Eigenvectors A an n n matrix of real numbers. The eigenvalues of A are the numbers λ such that Ax = λx for some nonzero vector x

More information

Review of Basic Concepts in Linear Algebra

Review of Basic Concepts in Linear Algebra Review of Basic Concepts in Linear Algebra Grady B Wright Department of Mathematics Boise State University September 7, 2017 Math 565 Linear Algebra Review September 7, 2017 1 / 40 Numerical Linear Algebra

More information

Hamiltonian simulation with nearly optimal dependence on all parameters

Hamiltonian simulation with nearly optimal dependence on all parameters Hamiltonian simulation with nearly optimal dependence on all parameters Dominic Berry + Andrew Childs obin Kothari ichard Cleve olando Somma Quantum simulation by quantum walks Dominic Berry + Andrew Childs

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 23, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 47 High dimensional

More information

Lecture 9: Submatrices and Some Special Matrices

Lecture 9: Submatrices and Some Special Matrices Lecture 9: Submatrices and Some Special Matrices Winfried Just Department of Mathematics, Ohio University February 2, 2018 Submatrices A submatrix B of a given matrix A is any matrix that can be obtained

More information

Differential equations

Differential equations Differential equations Math 7 Spring Practice problems for April Exam Problem Use the method of elimination to find the x-component of the general solution of x y = 6x 9x + y = x 6y 9y Soln: The system

More information

Randomized sparsification in NP-hard norms

Randomized sparsification in NP-hard norms 58 Chapter 3 Randomized sparsification in NP-hard norms Massive matrices are ubiquitous in modern data processing. Classical dense matrix algorithms are poorly suited to such problems because their running

More information

Interaction balance in symmetrical factorial designs with generalized minimum aberration

Interaction balance in symmetrical factorial designs with generalized minimum aberration Interaction balance in symmetrical factorial designs with generalized minimum aberration Mingyao Ai and Shuyuan He LMAM, School of Mathematical Sciences, Peing University, Beijing 100871, P. R. China Abstract:

More information

I forgot to mention last time: in the Ito formula for two standard processes, putting

I forgot to mention last time: in the Ito formula for two standard processes, putting I forgot to mention last time: in the Ito formula for two standard processes, putting dx t = a t dt + b t db t dy t = α t dt + β t db t, and taking f(x, y = xy, one has f x = y, f y = x, and f xx = f yy

More information

Lecture 20: Linear model, the LSE, and UMVUE

Lecture 20: Linear model, the LSE, and UMVUE Lecture 20: Linear model, the LSE, and UMVUE Linear Models One of the most useful statistical models is X i = β τ Z i + ε i, i = 1,...,n, where X i is the ith observation and is often called the ith response;

More information

Section 9.2: Matrices. Definition: A matrix A consists of a rectangular array of numbers, or elements, arranged in m rows and n columns.

Section 9.2: Matrices. Definition: A matrix A consists of a rectangular array of numbers, or elements, arranged in m rows and n columns. Section 9.2: Matrices Definition: A matrix A consists of a rectangular array of numbers, or elements, arranged in m rows and n columns. That is, a 11 a 12 a 1n a 21 a 22 a 2n A =...... a m1 a m2 a mn A

More information

Eigenvectors and Reconstruction

Eigenvectors and Reconstruction Eigenvectors and Reconstruction Hongyu He Department of Mathematics Louisiana State University, Baton Rouge, USA hongyu@mathlsuedu Submitted: Jul 6, 2006; Accepted: Jun 14, 2007; Published: Jul 5, 2007

More information

Bootstrapping factor models with cross sectional dependence

Bootstrapping factor models with cross sectional dependence Bootstrapping factor models with cross sectional dependence Sílvia Gonçalves and Benoit Perron University of Western Ontario and Université de Montréal, CIREQ, CIRAO ovember, 06 Abstract We consider bootstrap

More information

Density estimators for the convolution of discrete and continuous random variables

Density estimators for the convolution of discrete and continuous random variables Density estimators for the convolution of discrete and continuous random variables Ursula U Müller Texas A&M University Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract

More information

SUPPLEMENTARY MATERIAL FOR FAST COMMUNITY DETECTION BY SCORE. By Jiashun Jin Carnegie Mellon University

SUPPLEMENTARY MATERIAL FOR FAST COMMUNITY DETECTION BY SCORE. By Jiashun Jin Carnegie Mellon University SUPPLEMENTARY MATERIAL FOR FAST COMMUNITY DETECTION BY SCORE By Jiashun Jin Carnegie Mellon University In this supplement we propose some variants of SCORE and present the technical proofs for the main

More information

Matrix Operations: Determinant

Matrix Operations: Determinant Matrix Operations: Determinant Determinants Determinants are only applicable for square matrices. Determinant of the square matrix A is denoted as: det(a) or A Recall that the absolute value of the determinant

More information

ON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES. 1. Introduction

ON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES. 1. Introduction Acta Math. Univ. Comenianae Vol. LXV, 1(1996), pp. 129 139 129 ON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES V. WITKOVSKÝ Abstract. Estimation of the autoregressive

More information

Stochastic Design Criteria in Linear Models

Stochastic Design Criteria in Linear Models AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework

More information

Solution via Laplace transform and matrix exponential

Solution via Laplace transform and matrix exponential EE263 Autumn 2015 S. Boyd and S. Lall Solution via Laplace transform and matrix exponential Laplace transform solving ẋ = Ax via Laplace transform state transition matrix matrix exponential qualitative

More information