arxiv: v2 [math.st] 22 Nov 2013


The Annals of Statistics
2013, Vol. 41, No. 5
DOI: 10.1214/13-AOS1154
© Institute of Mathematical Statistics, 2013
arXiv:1311.5000v2 [math.ST] 22 Nov 2013

CONVERGENCE RATES OF EIGENVECTOR EMPIRICAL SPECTRAL DISTRIBUTION OF LARGE DIMENSIONAL SAMPLE COVARIANCE MATRIX

By Ningning Xia (1), Yingli Qin (2) and Zhidong Bai (3)

Northeast Normal University and National University of Singapore, University of Waterloo, and Northeast Normal University and National University of Singapore

The eigenvector Empirical Spectral Distribution (VESD) is adopted to investigate the limiting behavior of eigenvectors and eigenvalues of covariance matrices. In this paper, we shall show that the Kolmogorov distance between the expected VESD of sample covariance matrix and the Marčenko–Pastur distribution function is of order $O(N^{-1/2})$. Given that data dimension $n$ to sample size $N$ ratio is bounded between 0 and 1, this convergence rate is established under finite 10th moment condition of the underlying distribution. It is also shown that, for any fixed $\eta>0$, the convergence rates of VESD are $O(N^{-1/4})$ in probability and $O(N^{-1/4+\eta})$ almost surely, requiring finite 8th moment of the underlying distribution.

Received February 2013; revised June 2013.
(1) Supported by the Fundamental Research Funds for the Central Universities SSXT3.
(2) Supported by University of Waterloo Start-up Grant.
(3) Supported by NSF China Grant 7057, PCSIRT Fundamental Research Funds for the Central Universities, and NUS Grant R.
AMS 2000 subject classifications. Primary 15A52, 60F15, 62E20; secondary 60F17, 62H99.
Key words and phrases. Eigenvector empirical spectral distribution, empirical spectral distribution, Marčenko–Pastur distribution, sample covariance matrix, Stieltjes transform.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2013, Vol. 41, No. 5. This reprint differs from the original in pagination and typographic detail.

1. Introduction and main results. Let $X_i=(X_{1i},X_{2i},\ldots,X_{ni})^T$ and $X=(X_1,\ldots,X_N)$ be an $n\times N$ matrix of i.i.d. (independent and identically distributed) complex random variables with mean 0 and variance 1. We consider a class of sample covariance matrices

\[ S_n=\frac{1}{N}\sum_{k=1}^{N}X_kX_k^*=\frac{1}{N}XX^*, \]

where $X^*$ denotes the conjugate transpose of the data matrix $X$. The Empirical Spectral Distribution (ESD) $F^{S_n}(x)$ of $S_n$ is then defined as

\[ F^{S_n}(x)=\frac{1}{n}\sum_{i=1}^{n}I(\lambda_i\le x), \tag{1.1} \]

where $\lambda_1\le\cdots\le\lambda_n$ are the eigenvalues of $S_n$ in ascending order and $I(\cdot)$ is the conventional indicator function. Marčenko and Pastur in [16] proved that with probability 1, $F^{S_n}(x)$ converges weakly to the standard Marčenko–Pastur distribution $F_y(x)$ with density function

\[ p_y(x)=\frac{dF_y(x)}{dx}=\frac{1}{2\pi xy}\sqrt{(x-a)(b-x)}\,I(a\le x\le b), \tag{1.2} \]

where $a=(1-\sqrt{y})^2$ and $b=(1+\sqrt{y})^2$. Here the positive constant $y$ is the limit of dimension to sample size ratio when both $n$ and $N$ tend to infinity.

In applications of asymptotic theorems of spectral analysis of large dimensional random matrices, one of the important problems is the convergence rate of the ESD. The Kolmogorov distance between the expected ESD of $S_n$ and the Marčenko–Pastur distribution $F_y(x)$ is defined as

\[ \Delta=\|EF^{S_n}-F_y\|=\sup_x|EF^{S_n}(x)-F_y(x)| \]

as well as the distance between two distributions $F^{S_n}(x)$ and $F_y(x)$,

\[ \Delta_p=\|F^{S_n}-F_y\|=\sup_x|F^{S_n}(x)-F_y(x)|. \]

Notice that, for any constant $C>0$,

\[ P(\Delta_p\ge C)=P\Bigl\{\sup_x|F^{S_n}(x)-F_y(x)|\ge C\Bigr\}\le\frac{1}{C}E\Delta_p. \]

Thus, $\Delta_p$ measures the rate of convergence in probability.

Bai in [2, 3] firstly tackled the problem of convergence rate and established three Berry–Esseen type inequalities for the difference of two distributions in terms of their Stieltjes transforms. Götze and Tikhomirov in [11] further improved the Berry–Esseen type inequality and showed the convergence rate of $F^{S_n}(x)$ is $O(N^{-1/2})$ in probability under finite 8th moment condition. More recently, a sharper bound is obtained by Pillai and Yin in [18], under a stronger condition, that is, the sub-exponential decay assumption. It is shown that the difference between eigenvalues of $S_n$ and the Marčenko–Pastur distribution is of order $O(N^{-1}(\log N)^{O(\log\log N)})$ in probability.

In the literature, research on limiting properties of eigenvectors of large dimensional sample covariance matrices is much less developed than that of eigenvalues, due to the cumbersome formulation of the eigenvectors. Some great achievements have been made in proving the properties of eigenvectors for large dimensional sample covariance matrices, such as [4, 19–22], and that for Wigner matrices, such as [10, 13, 25].

However, the eigenvectors of large sample covariance matrices play an important role in high-dimensional statistical analysis. In particular, due to the increasing availability of high-dimensional data, principal component analysis (PCA) has been favorably recognized as a powerful technique to reduce dimensionality. The eigenvectors corresponding to the leading eigenvalues are the directions of the principal components. Johnstone [12] proposed the spiked eigenvalue model to test the existence of principal component. Paul [17] discussed the length of the eigenvector corresponding to the spiked eigenvalue.

In PCA, the eigenvectors $(\nu_1^0,\ldots,\nu_n^0)$ of population covariance matrix $\Sigma$ determine the directions in which we project the observed data and the corresponding eigenvalues $(\lambda_1^0,\ldots,\lambda_n^0)$ determine the proportion of total variability loaded on each direction of projections. In practice, the (sample) eigenvalues $(\lambda_1,\ldots,\lambda_n)$ and eigenvectors $(\nu_1,\ldots,\nu_n)$ of the sample covariance matrix $S_n$ are used in PCA. In [1], Anderson has shown the following asymptotic distribution for the sample eigenvectors $\nu_1,\ldots,\nu_n$ when the observations are from a multivariate normal distribution of covariance matrix $\Sigma$ with distinct eigenvalues:

\[ \sqrt{N}(\nu_i-\nu_i^0)\stackrel{d}{\longrightarrow}N_n(0,D_i), \qquad\text{where } D_i=\lambda_i^0\sum_{k=1,\,k\ne i}^{n}\frac{\lambda_k^0}{(\lambda_k^0-\lambda_i^0)^2}\,\nu_k^0\nu_k^{0T}. \]

However, this is a large sample result when the dimension $n$ is fixed and low. In particular, if $\Sigma=\sigma^2I_n$, then the eigenmatrix (matrix of eigenvectors) should be asymptotically isotropic when the sample size is large. That is, the eigenmatrix should be asymptotically Haar, under some minor moment conditions. However, when the dimension is large (increasing), the Haar property is not easy to formulate.

Motivated by the orthogonal iteration method, [15] proposed an iterative thresholding method to estimate sparse principal subspaces (spanned by the leading eigenvectors of $\Sigma$) in high dimensional and spiked covariance matrix setting. The convergence rates of the proposed estimators are provided. By reducing the sparse PCA problem to a high-dimensional regression problem, [9] established the optimal rates of convergence for estimating the principal subspace with respect to a large collection of spiked covariance matrices. See the reference therein for more literature on sparse PCA and spiked covariance matrices.

To perform the test of existence of spiked eigenvalues, one has to investigate the null properties of the eigenmatrices, that is, when $\Sigma=\sigma^2I_n$ (i.e., nonspiked). Then the eigenmatrix should be asymptotically isotropic, when the sample size is large. That is, the eigenmatrix should be asymptotically Haar. However, when the dimension is large, the Haar property is not easy to formulate.

The recent development in random matrix theory can help us investigate the large dimension and large sample properties of eigenvectors. We will adopt the VESD, defined later in the paper, to characterize the asymptotical Haar property so that if the eigenmatrix is Haar, then the process defined by the VESD tends to a Brownian bridge. Conversely, if the process defined by the VESD tends to a Brownian bridge, then it indicates a similarity between the Haar distribution and that of the eigenmatrix. Therefore, studying the large sample and large dimensional results of the VESD can assist us in better examining spiked covariance matrix as assumed by [15] and [9] among many others.

Let $U_n\Lambda_nU_n^*$ denote the spectral decomposition of $S_n$, where $\Lambda_n=\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)$ and $U_n=(u_{ij})_{n\times n}$ is a unitary matrix consisting of the corresponding orthonormal eigenvectors of $S_n$. For each $n$, let $x_n\in\mathbb{C}^n$, $\|x_n\|=1$ be nonrandom and let $d_n=U_n^*x_n=(d_1,\ldots,d_n)^*$, where $\|x_n\|$ denotes the Euclidean norm of $x_n$. Define a stochastic process $X_n(t)$ by

\[ X_n(t)=\sqrt{n/2}\sum_{j=1}^{[nt]}\Bigl(|d_j|^2-\frac{1}{n}\Bigr), \qquad [a]\text{ denotes the greatest integer}\le a. \]

If $U_n$ is Haar distributed over the orthogonal matrices, then $d_n$ would be uniformly distributed over the unit sphere in $\mathbb{R}^n$, and the limiting distribution of $X_n(t)$ is a unique Brownian bridge $B(t)$ when $n$ tends to infinity. In this paper, we use the behavior of $X_n(t)$ for all $x_n$ to reflect the uniformity of $U_n$. The process $X_n(t)$ is considerably important for us to understand the behavior of the eigenvectors of $S_n$.

Motivated by Silverstein's ideas in [19–22], we want to examine the limiting properties of $U_n$ through the stochastic process $X_n(t)$. We claim that $U_n$ is asymptotically Haar distributed, which means $X_n(t)$ converges to a Brownian bridge $B(t)$. In [21], it was shown that the weak convergence of $X_n(t)$ to a Brownian bridge $B(t)$ is equivalent to $X_n(F^{S_n}(x))$ converging to $B(F_y(x))$. We therefore consider transforming $X_n(t)$ to $X_n(F^{S_n}(x))$, where $F^{S_n}(x)$ is the ESD of $S_n$.

We define the eigenvector Empirical Spectral Distribution (VESD) $H^{S_n}(x)$ of $S_n$ as follows:

\[ H^{S_n}(x)=\sum_{i=1}^{n}|d_i|^2I(\lambda_i\le x). \tag{1.3} \]

Between $H^{S_n}(x)$ in (1.3) and $F^{S_n}(x)$ in (1.1), we notice that there is no difference except the coefficient associated with each indicator function such that

\[ X_n(F^{S_n}(x))=\sqrt{n/2}\,\bigl(H^{S_n}(x)-F^{S_n}(x)\bigr). \tag{1.4} \]

Henceforth, the investigation of $X_n(t)$ is converted to that of the difference between two empirical distributions $H^{S_n}(x)$ and $F^{S_n}(x)$. The authors in [4] proved that $H^{S_n}(x)$ and $F^{S_n}(x)$ have the same limiting distribution, the Marčenko–Pastur distribution $F_y(x)$, where $y_n=n/N$ and $y=\lim_{n,N\to\infty}y_n\in(0,1)$.

Before we present the main theorems, let us introduce the following notation:

\[ \Delta^H=\|EH^{S_n}-F_{y_n}\|=\sup_x|EH^{S_n}(x)-F_{y_n}(x)| \]

and

\[ \Delta_p^H=\|H^{S_n}-F_{y_n}\|=\sup_x|H^{S_n}(x)-F_{y_n}(x)|. \]

We denote $\xi_n=O_p(a_n)$ and $\eta_n=O_{a.s.}(b_n)$ if, for any $\epsilon>0$, there exist a large positive constant $c_1$ and a positive random variable $c_2$, such that $P(|\xi_n/a_n|\ge c_1)\le\epsilon$ and $P(|\eta_n/b_n|\le c_2)=1$, respectively. In this paper, we follow the work in [4] and establish three types of convergence rates of $H^{S_n}(x)$ to $F_{y_n}(x)$ in the following theorems.

Theorem 1.1. Suppose that $X_{ij}$, $i=1,\ldots,n$, $j=1,\ldots,N$ are i.i.d. complex random variables with $EX_{11}=0$, $E|X_{11}|^2=1$ and $E|X_{11}|^{10}<\infty$. For any fixed unit vector $x_n\in\mathbb{C}_1^n=\{x\in\mathbb{C}^n:\|x\|=1\}$, and $y_n=n/N\le 1$, it then follows that

\[ \Delta^H=\|EH^{S_n}-F_{y_n}\|=\begin{cases}O(N^{-1/2}a^{-3/4}), & \text{if } N^{-1/2}\le a<1,\\ O(N^{-1/8}), & \text{if } a<N^{-1/2},\end{cases} \]

where $a=(1-\sqrt{y_n})^2$ as it is defined in (1.2) and $F_{y_n}$ denotes the Marčenko–Pastur distribution function with an index $y_n$.

Remark 1.2. From the proof of Theorem 1.1, it is clear that the condition $E|X_{11}|^{10}<\infty$ is required only in the truncation step in the next section. We therefore believe that the condition $E|X_{11}|^{10}<\infty$ can be replaced by $E|X_{11}|^{8}<\infty$ in Theorems 1.6 and 1.8.

Remark 1.3. Because the convergence rate of $\|EH^{S_n}-F_y\|$ depends on the convergence rate of $|y_n-y|$, we only consider the convergence rate of $\|EH^{S_n}-F_{y_n}\|$.

Remark 1.4. As $a=(1-\sqrt{y_n})^2$, we can characterize the closeness between $y_n$ and 1 through $a$. In particular, when $y_n$ is away from 1 (or $a\ge N^{-1/2}$), the convergence rate of $\|EH^{S_n}-F_{y_n}\|$ is $O(N^{-1/2})$, which we believe is the optimal convergence rate. This is because we observe in [4] that for an analytic function $f$,

\[ Y_n(f)=\sqrt{n}\int f(x)\,d\bigl(H^{S_n}(x)-F_{y_n}(x)\bigr) \tag{1.5} \]

converges to a Gaussian distribution. While in [6], Bai and Silverstein proved that the limiting distribution of $n\int f(x)\,d(F^{S_n}(x)-F_{y_n}(x))$ is also a Gaussian distribution. We therefore conjecture that the optimal rate of $H^{S_n}(x)$ should be $O(N^{-1/2})$ and $O(N^{-1})$ for $F^{S_n}(x)$. Although $F^{S_n}(x)$ and $H^{S_n}(x)$ converge to the same limiting distribution, there exists a substantial difference between $F^{S_n}(x)$ and $H^{S_n}(x)$.

Remark 1.5. Notice that two matrices $XX^*$ and $X^*X$ share the same set of nonzero eigenvalues. However, these two matrices do not always share the same set of eigenvectors. Especially when $y_n>1$, the eigenvectors of $S_n$ corresponding to 0 eigenvalues can be arbitrary. As a result, the limit of $H^{S_n}$ may not exist or heavily depends on the choice of unit vector $x_n$. Therefore, we only consider the case of $y_n\le 1$ in this paper and leave the case of $y_n>1$ as a future research problem.

The rates of convergence in probability and almost sure convergence of the VESD are provided in the next two theorems.

Theorem 1.6. Under the assumptions in Theorem 1.1 except that we now only require $E|X_{11}|^{8}<\infty$, we have

\[ \Delta_p^H=\|H^{S_n}-F_{y_n}\|=\begin{cases}O_p(N^{-1/4}a^{-1/2}), & \text{if } N^{-1/4}\le a<1,\\ O_p(N^{-1/8}), & \text{if } a<N^{-1/4}.\end{cases} \]

Remark 1.7. As an application of Theorem 1.6, in [8] we extended the CLT of the linear spectral statistics $Y_n(f)$ established in [4] to the case where the kernel function $f$ is continuously twice differentiable provided that the sample covariance matrix $S_n$ satisfies the assumptions of Theorem 1.6. This result is useful in testing Johnstone's hypothesis when normality is not assumed.

Theorem 1.8. Under the assumptions in Theorem 1.6, for any $\eta>0$, we have

\[ \Delta_p^H=\|H^{S_n}-F_{y_n}\|=\begin{cases}O_{a.s.}(N^{-1/4+\eta}a^{-1/2}), & \text{if } N^{-1/4}\le a<1,\\ O_{a.s.}(N^{-1/8+\eta}), & \text{if } a<N^{-1/4}.\end{cases} \]
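The definitions (1.1) and (1.3) and the distances $\Delta_p$ and $\Delta_p^H$ can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes complex Gaussian entries, takes $x_n=e_1$, and evaluates the Marčenko–Pastur distribution function by numerically integrating the density (1.2). The function names `mp_cdf`, `kolmogorov_to_mp` and `esd_vesd_distances` are hypothetical.

```python
import numpy as np

def mp_cdf(x, y, grid=20001):
    """Marchenko-Pastur CDF with index 0 < y < 1, by trapezoidal integration of (1.2)."""
    a, b = (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2
    t = np.linspace(a, b, grid)
    dens = np.sqrt(np.maximum((t - a) * (b - t), 0.0)) / (2 * np.pi * y * t)
    steps = (dens[1:] + dens[:-1]) / 2 * np.diff(t)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]  # remove the small quadrature error in the total mass
    return np.interp(x, t, cdf, left=0.0, right=1.0)

def kolmogorov_to_mp(lam, weights, y):
    """sup_x |G(x) - F_y(x)| for the step function G with jumps `weights` at ascending `lam`."""
    F = mp_cdf(lam, y)
    upper = np.cumsum(weights)   # value of G at each jump point
    lower = upper - weights      # left limit of G at each jump point
    return max(np.max(np.abs(upper - F)), np.max(np.abs(lower - F)))

def esd_vesd_distances(n, N, seed=0):
    """Return (||F^{S_n} - F_{y_n}||, ||H^{S_n} - F_{y_n}||) for one simulated S_n."""
    rng = np.random.default_rng(seed)
    # Assumption: i.i.d. complex standard Gaussian entries (mean 0, variance 1).
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    S = X @ X.conj().T / N
    lam, U = np.linalg.eigh(S)          # ascending eigenvalues, orthonormal eigenvectors
    x = np.zeros(n)
    x[0] = 1.0                          # assumption: x_n = e_1
    d2 = np.abs(U.conj().T @ x) ** 2    # |d_i|^2, the VESD weights in (1.3)
    y = n / N
    return kolmogorov_to_mp(lam, np.full(n, 1.0 / n), y), kolmogorov_to_mp(lam, d2, y)

if __name__ == "__main__":
    for n, N in [(50, 100), (200, 400), (400, 800)]:
        dF, dH = esd_vesd_distances(n, N)
        print(n, N, round(dF, 4), round(dH, 4))
```

In such runs the VESD distance is typically noticeably larger than the ESD distance, in line with the different conjectured optimal rates in Remark 1.4; a single simulated matrix is of course only indicative.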

Remark 1.9. In this paper, we will use the following notation: $X^*$ denotes the conjugate transpose of a matrix (or vector) $X$; $X^T$ denotes the (ordinary) transpose of a matrix (or vector) $X$; $\|x\|$ denotes the Euclidean norm for any vector $x$; $\|A\|=\sqrt{\lambda_{\max}(AA^*)}$, the spectral norm; $\|F\|=\sup_x|F(x)|$ for any function $F$; $\bar z$ denotes the conjugate of a complex number $z$.

The rest of the paper is organized as follows. In Section 2, we introduce the main tools used to prove Theorems 1.1, 1.6 and 1.8, including the Stieltjes transform and a Berry–Esseen type inequality. The proofs of these three theorems are presented in Sections 3–6. Several important results which are repeatedly employed throughout Sections 3–6 are proved in Appendix A. Appendix B contains some existing results in the literature. Finally, preliminaries on truncation, centralization and rescaling are postponed to the last section.

2. Main tools.

2.1. Stieltjes transform. The Stieltjes transform is an essential tool in random matrix theory and our paper. Let us now briefly review the Stieltjes transform and some important and relevant results. For a cumulative distribution function $G(x)$, its Stieltjes transform $m_G(z)$ is defined as

\[ m_G(z)=\int\frac{1}{\lambda-z}\,dG(\lambda),\qquad z\in\mathbb{C}^+=\{z\in\mathbb{C},\ \Im(z)>0\}, \]

where $\Im(\cdot)$ denotes the imaginary part of a complex number. The Stieltjes transforms of the ESD $F^{S_n}(x)$ and the VESD $H^{S_n}(x)$ are

\[ m_{F^{S_n}}(z)=\frac{1}{n}\mathrm{tr}(S_n-zI_n)^{-1} \]

and

\[ m_{H^{S_n}}(z)=x_n^*(S_n-zI_n)^{-1}x_n, \]

respectively. Here $I_n$ denotes the $n\times n$ identity matrix. For simplicity of notation, we use $m_n(z)$ and $m_n^H(z)$ to denote $m_{F^{S_n}}(z)$ and $m_{H^{S_n}}(z)$, respectively.

Remark 2.1. Notice that although the eigenmatrix $U_n$ may not be unique, the Stieltjes transform $m_n^H(z)$ of $H^{S_n}$ depends on $S_n$ for any $x_n$ rather than $U_n$.
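The two resolvent formulas for $m_n(z)$ and $m_n^H(z)$ agree with the integral definition applied to the step functions $F^{S_n}$ and $H^{S_n}$. A small numerical sketch follows, under the assumption of Gaussian data and a randomly drawn unit vector; the helper names are hypothetical.

```python
import numpy as np

def stieltjes_from_resolvent(S, x, z):
    """m_n(z) = tr(S - zI)^{-1} / n and m_n^H(z) = x^* (S - zI)^{-1} x."""
    n = S.shape[0]
    R = np.linalg.inv(S - z * np.eye(n))
    return np.trace(R) / n, x.conj() @ R @ x

def stieltjes_from_spectrum(S, x, z):
    """The same two transforms written as integrals against the ESD and the VESD."""
    lam, U = np.linalg.eigh(S)
    d2 = np.abs(U.conj().T @ x) ** 2      # VESD weights |d_i|^2
    return np.mean(1.0 / (lam - z)), np.sum(d2 / (lam - z))

def demo(n=30, N=60, z=1.0 + 0.3j, seed=1):
    rng = np.random.default_rng(seed)
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    S = X @ X.conj().T / N
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)                # assumption: an arbitrary fixed unit vector
    return stieltjes_from_resolvent(S, x, z), stieltjes_from_spectrum(S, x, z)

if __name__ == "__main__":
    print(demo())
```

Because the second function only uses the weights $|d_i|^2$ and the eigenvalues, the check also illustrates Remark 2.1: the value of $m_n^H(z)$ does not depend on which orthonormal eigenbasis `eigh` happens to return.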

Let $\underline{S}_n=X^*X/N$ denote the companion matrix of $S_n$. As $S_n$ and $\underline{S}_n$ share the same set of nonzero eigenvalues, it can be shown that the Stieltjes transforms of $F^{S_n}(x)$ and $F^{\underline{S}_n}(x)$ satisfy the following equality:

\[ \underline{m}_n(z)=-\frac{1-y_n}{z}+y_nm_n(z), \tag{2.1} \]

where $\underline{m}_n(z)$ denotes the Stieltjes transform of $F^{\underline{S}_n}(x)$. Moreover, [5] and [24] claimed that $F^{\underline{S}_n}$ converges, almost surely, to a nonrandom distribution function $\underline{F}_y(x)$ with Stieltjes transform $\underline{m}(z)$ such that

\[ \underline{m}(z)=-\frac{1-y}{z}+ym_y(z), \tag{2.2} \]

where $m_y(z)$ denotes the Stieltjes transform of the Marčenko–Pastur distribution with index $y$. Using (6.1.4) in [7], we also obtain the relationship between two limits $m_y(z)$ and $\underline{m}(z)$ as follows:

\[ m_y(z)=-\frac{1}{z(1+\underline{m}(z))}. \tag{2.3} \]

2.2. A Berry–Esseen type inequality.

Lemma 2.2. Let $H^{S_n}(x)$ and $F_{y_n}(x)$ be the VESD of $S_n$ and the Marčenko–Pastur distribution with index $y_n$, respectively. Denote their corresponding Stieltjes transforms by $m_n^H(z)$ and $m_{y_n}(z)$, respectively. Then there exist large positive constants $A$, $B$, $K_1$, $K_2$ and $K_3$, such that for $A>B>5$,

\[ \Delta^H=\|EH^{S_n}(x)-F_{y_n}(x)\|\le K_1\int_{-A}^{A}|Em_n^H(z)-m_{y_n}(z)|\,du+K_2v^{-1}\int_{|x|>B}|EH^{S_n}(x)-F_{y_n}(x)|\,dx+K_3v^{-1}\sup_x\int_{|t|<v}|F_{y_n}(x+t)-F_{y_n}(x)|\,dt, \]

where $z=u+iv$ is a complex number with positive imaginary part (i.e., $v>0$).

Remark 2.3. Lemma 2.2 can be proved using Lemma B.1. To prove Theorem 1.1, we apply Lemma 2.2. In addition, we prove Theorems 1.6 and 1.8 by replacing $EH^{S_n}(x)$, $Em_n^H(z)$ with $H^{S_n}(x)$ and $m_n^H(z)$, respectively.

3. Proof of Theorem 1.1. Under the condition of $E|X_{11}|^{10}<\infty$, we can choose a sequence of $\eta_N$ with $\eta_N\downarrow0$ and $\eta_NN^{1/4}\to\infty$ as $N\to\infty$, such that

\[ \lim_{N\to\infty}\eta_N^{-10}E\bigl(|X_{11}|^{10}I(|X_{11}|>\eta_NN^{1/4})\bigr)=0. \tag{3.1} \]
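Relations (2.1)–(2.3) are easy to check numerically. The sketch below is illustrative only: it assumes Gaussian data, and it evaluates $m_y(z)$ through the closed form $m_y(z)=\bigl(1-y-z+\sqrt{(1+y-z)^2-4y}\bigr)/(2yz)$ with the square root taken to have positive imaginary part; the function names are hypothetical.

```python
import numpy as np

def mp_stieltjes(z, y):
    """Stieltjes transform of the Marchenko-Pastur law with index y at z in C^+."""
    s = np.sqrt(complex((1 + y - z) ** 2 - 4 * y))
    if s.imag < 0:            # choose the square root with positive imaginary part
        s = -s
    return (1 - y - z + s) / (2 * y * z)

def empirical_transforms(n, N, z, seed=0):
    """Return (m_n(z), underline-m_n(z)) for S_n = XX^*/N and its companion X^*X/N."""
    rng = np.random.default_rng(seed)
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    m_n = np.trace(np.linalg.inv(X @ X.conj().T / N - z * np.eye(n))) / n
    m_under = np.trace(np.linalg.inv(X.conj().T @ X / N - z * np.eye(N))) / N
    return m_n, m_under

if __name__ == "__main__":
    n, N, z = 200, 400, 1.0 + 0.5j
    y = n / N
    m_n, m_under = empirical_transforms(n, N, z)
    print("(2.1) residual:", abs(m_under - (-(1 - y) / z + y * m_n)))
    m_y = mp_stieltjes(z, y)
    under_limit = -(1 - y) / z + y * m_y            # (2.2)
    print("(2.3) residual:", abs(m_y + 1 / (z * (1 + under_limit))))
    print("|m_n - m_y|   :", abs(m_n - m_y))
```

Identity (2.1) holds exactly for every realization, whereas the last printed quantity is random and only small for large $n$ and $N$.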

Furthermore, without loss of generality, we can assume that every $|X_{ij}|$ is bounded by $\eta_NN^{1/4}$ and has mean 0 and variance 1. See Appendix C for details on truncation, centralization and rescaling.

We introduce some notation before we start proving Theorem 1.1. Throughout the paper, we use $C$ and $C_i$ for $i=0,1,2,\ldots$ to denote positive constants which are independent of $N$ and may take different values at different appearances. Let $X_j$ denote the $j$th column of the data matrix $X$. Let $r_j=X_j/\sqrt{N}$ so that $S_n=\sum_{j=1}^{N}r_jr_j^*$ and let

\[ v_y=\sqrt{a}+\sqrt{v}=1-\sqrt{y_n}+\sqrt{v},\qquad B_j=S_n-r_jr_j^*, \]
\[ A(z)=S_n-zI_n,\qquad A_j(z)=B_j-zI_n, \]
\[ \alpha_j(z)=r_j^*A_j^{-1}(z)x_nx_n^*r_j-\frac{1}{N}x_n^*A_j^{-1}(z)x_n, \]
\[ \xi_j(z)=r_j^*A_j^{-1}(z)r_j-\frac{1}{N}E\,\mathrm{tr}A_j^{-1}(z),\qquad \hat\xi_j(z)=r_j^*A_j^{-1}(z)r_j-\frac{1}{N}\mathrm{tr}A_j^{-1}(z), \]
\[ b(z)=\frac{1}{1+(1/N)E\,\mathrm{tr}A^{-1}(z)},\qquad b_1(z)=\frac{1}{1+(1/N)E\,\mathrm{tr}A_1^{-1}(z)},\qquad \beta_j(z)=\frac{1}{1+r_j^*A_j^{-1}(z)r_j}. \]

It is easy to show that

\[ \beta_j(z)-b_1(z)=-b_1(z)\beta_j(z)\xi_j(z). \tag{3.2} \]

For any $j=1,2,\ldots,N$, we can also show that

\[ r_j^*A^{-1}(z)=\beta_j(z)r_j^*A_j^{-1}(z), \tag{3.3} \]

due to the fact that

\[ (B_j-zI_n)^{-1}-(B_j+r_jr_j^*-zI_n)^{-1}=(B_j-zI_n)^{-1}r_jr_j^*(B_j+r_jr_j^*-zI_n)^{-1}. \]

From (2.2) in [23], we can write $\underline{m}_n(z)$ in terms of $\beta_j(z)$ as follows:

\[ \underline{m}_n(z)=-\frac{1}{zN}\sum_{j=1}^{N}\beta_j(z). \tag{3.4} \]
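The rank-one identities (3.3) and (3.4) hold exactly for every realization of the data, so they can be verified directly. The following is a minimal sketch with a small Gaussian data matrix; `rank_one_identities` is a hypothetical helper, and the dimensions are kept small because a separate inverse is computed for every column.

```python
import numpy as np

def rank_one_identities(n=20, N=40, z=0.8 + 0.4j, j=0, seed=0):
    """Return both sides of (3.3) for column j and both sides of (3.4)."""
    rng = np.random.default_rng(seed)
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    r = X / np.sqrt(N)                        # columns are r_1, ..., r_N
    S = r @ r.conj().T                        # S_n = sum_j r_j r_j^*
    A_inv = np.linalg.inv(S - z * np.eye(n))  # A^{-1}(z)
    betas = np.empty(N, dtype=complex)
    lhs33 = rhs33 = None
    for k in range(N):
        rk = r[:, k]
        # A_k^{-1}(z) with B_k = S_n - r_k r_k^*
        Ak_inv = np.linalg.inv(S - np.outer(rk, rk.conj()) - z * np.eye(n))
        betas[k] = 1.0 / (1.0 + rk.conj() @ Ak_inv @ rk)
        if k == j:
            lhs33 = rk.conj() @ A_inv                 # r_j^* A^{-1}(z)
            rhs33 = betas[k] * (rk.conj() @ Ak_inv)   # beta_j(z) r_j^* A_j^{-1}(z)
    # Stieltjes transform of the ESD of the companion matrix X^*X/N
    m_under = np.trace(np.linalg.inv(r.conj().T @ r - z * np.eye(N))) / N
    rhs34 = -betas.sum() / (z * N)
    return lhs33, rhs33, m_under, rhs34

if __name__ == "__main__":
    lhs33, rhs33, m_under, rhs34 = rank_one_identities()
    print(np.max(np.abs(lhs33 - rhs33)), abs(m_under - rhs34))
```

Both printed residuals should be at the level of floating-point rounding.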

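We proceed with the proof of Theorem 1.1:

\begin{align*}
\delta&=:Em_n^H(z)-m_y(z)\\
&=x_n^*\bigl[EA^{-1}(z)-(-z\underline{m}(z)I_n-zI_n)^{-1}\bigr]x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*E\bigl[(z\underline{m}(z)+z)A^{-1}(z)+I_n\bigr]x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*E\bigl[(zI_n+A(z))A^{-1}(z)+z\underline{m}(z)A^{-1}(z)\bigr]x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*E\Bigl[\sum_{j=1}^{N}r_jr_j^*A^{-1}(z)+z(E\underline{m}_n(z))A^{-1}(z)-z(E\underline{m}_n(z))A^{-1}(z)+z\underline{m}(z)A^{-1}(z)\Bigr]x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*E\Bigl[\sum_{j=1}^{N}\beta_j(z)r_jr_j^*A_j^{-1}(z)-(-zE\underline{m}_n(z))A^{-1}(z)-(zE\underline{m}_n(z)-z\underline{m}(z))A^{-1}(z)\Bigr]x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*E\Bigl[\sum_{j=1}^{N}\beta_j(z)r_jr_j^*A_j^{-1}(z)-\Bigl(\frac{1}{N}\sum_{j=1}^{N}E\beta_j(z)\Bigr)A^{-1}(z)\Bigr]x_n\\
&\qquad-(z\underline{m}(z)+z)^{-1}x_n^*(zE\underline{m}_n(z)-z\underline{m}(z))(EA^{-1}(z))x_n\\
&=(z\underline{m}(z)+z)^{-1}x_n^*\Bigl[\sum_{j=1}^{N}E\beta_j(z)\Bigl(r_jr_j^*A_j^{-1}(z)-\frac{1}{N}EA^{-1}(z)\Bigr)\Bigr]x_n+m_y(z)(zE\underline{m}_n(z)-z\underline{m}(z))Em_n^H(z)\\
&=:\delta_1+\delta_2,
\end{align*}

where

\[ \delta_1=(z\underline{m}(z)+z)^{-1}x_n^*\Bigl[\sum_{j=1}^{N}E\beta_j(z)\Bigl(r_jr_j^*A_j^{-1}(z)-\frac{1}{N}EA^{-1}(z)\Bigr)\Bigr]x_n, \]
\[ \delta_2=m_y(z)(zE\underline{m}_n(z)-z\underline{m}(z))Em_n^H(z). \]

Lemma 3.1. If

\[ |\delta_1|\le\frac{C_1}{Nvv_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr) \]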

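holds for some constants $C_0$ and $C_1$, when $v^2v_y\ge C_0N^{-1}$, under the conditions of Theorem 1.1, there exists a constant $C$ such that $\Delta^H\le Cv/v_y$.

Proof. According to Lemma 2.2,

\[ \Delta^H\le K_1\int_{-A}^{A}|Em_n^H(z)-m_y(z)|\,du+K_2v^{-1}\int_{|x|>B}|EH^{S_n}(x)-F_{y_n}(x)|\,dx+K_3v^{-1}\sup_x\int_{|t|<v}|F_{y_n}(x+t)-F_{y_n}(x)|\,dt. \]

From Lemmas B.2 and A.8, we know that there exists a positive constant $C$, such that

\[ K_2v^{-1}\int_{|x|>B}|EH^{S_n}(x)-F_{y_n}(x)|\,dx+K_3v^{-1}\sup_x\int_{|t|<v}|F_{y_n}(x+t)-F_{y_n}(x)|\,dt\le Cv/v_y. \tag{3.5} \]

Notice that

\begin{align*}
\int|Em_n^H(z)-m_y(z)|\,du&\le\int|\delta_1|\,du+\int|\delta_2|\,du\\
&\le\int|\delta_1|\,du+\int|zm_y(z)|\,|Em_n^H(z)|\,|E\underline{m}_n(z)-\underline{m}(z)|\,du\\
&\le\int|\delta_1|\,du+\int|zm_y(z)|\,|Em_n^H(z)-m_y(z)|\,|E\underline{m}_n(z)-\underline{m}(z)|\,du\\
&\qquad+\int|zm_y(z)|\,|m_y(z)|\,|E\underline{m}_n(z)-\underline{m}(z)|\,du.
\end{align*}

Lemma A.1, (2.1) and (A.3) imply that

\[ |E\underline{m}_n(z)-\underline{m}(z)|=y_n|Em_n(z)-m_y(z)|\le\frac{C}{Nv^{3/2}v_y^2} \]

and

\[ \int|E\underline{m}_n(z)-\underline{m}(z)|\,du=y_n\int|Em_n(z)-m_y(z)|\,du\le Cv. \]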

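From (2.3) in [3], we have

\[ zm_y(z)=\frac{1-y-z+\sqrt{(1+y-z)^2-4y}}{2y}=-1+\frac{1}{\sqrt{y}}\,m_{\mathrm{semi}}\Bigl(\frac{z-1-y}{\sqrt{y}}\Bigr), \]

where $m_{\mathrm{semi}}(\cdot)$ denotes the Stieltjes transform of the semicircle law, see (3.2) in [2]. Therefore $|zm_y(z)|$ is bounded by a constant, for $|m_{\mathrm{semi}}(\cdot)|\le1$, see (3.3) in [2]. Combined with Lemma B.7, there exist constants $C_2$, $C_3$, such that

\[ \int|Em_n^H(z)-m_y(z)|\,du\le\int|\delta_1|\,du+\frac{C_2}{Nv^{3/2}v_y^2}\int|Em_n^H(z)-m_y(z)|\,du+\frac{C_3v}{v_y}. \]

Given $v^2v_y\ge C_0N^{-1}$, for $v_y\ge\sqrt{v}$, we have $v^{3/2}v_y^2\ge v^2v_y\ge C_0N^{-1}$. For a large enough $C_0$ such that $C_2/C_0\le1/2$, we have

\[ \int|Em_n^H(z)-m_y(z)|\,du\le\int|\delta_1|\,du+\frac{Cv}{v_y}. \]

As $|\delta_1|\le\frac{C_1}{Nvv_y}\bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\bigr)$ and $v^2v_y\ge C_0N^{-1}$, if $C_0\ge4AC_1K_1$, we then have

\[ \int|Em_n^H(z)-m_y(z)|\,du\le\frac{2AC_1}{Nv^2v_y}\cdot\frac{v}{v_y}+\frac{2AC_1}{Nv^2v_y}\Delta^H+\frac{Cv}{v_y}\le\frac{\Delta^H}{2K_1}+\frac{Cv}{v_y}. \tag{3.6} \]

Thus, from Lemma 2.2, equations (3.6) and (3.5), we conclude that there exists a constant $C$, such that

\[ \Delta^H\le\frac{Cv}{v_y}. \]

The proof is complete.

To finish the proof of Theorem 1.1, we choose

\[ v=\sqrt{\frac{C_0}{N(\sqrt{a}+N^{-1/4})}}\qquad\text{such that}\qquad v^2v_y=C_0N^{-1}\frac{v_y}{\sqrt{a}+N^{-1/4}}\ge C_0N^{-1}. \]

According to Lemma 3.1, we know that

\[ \Delta^H\le\frac{Cv}{v_y}\le CN^{-1/2}(\sqrt{a}+N^{-1/4})^{-3/2}. \]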

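If $\sqrt{a}<N^{-1/4}$,

\[ \Delta^H\le CN^{-1/2}(N^{-1/4})^{-3/2}=O(N^{-1/8}). \]

If $\sqrt{a}\ge N^{-1/4}$,

\[ \Delta^H\le CN^{-1/2}(\sqrt{a})^{-3/2}=O(N^{-1/2}a^{-3/4}). \]

Thus, the proof of Theorem 1.1 is complete.

4. The bound for $\delta_1$. In this section, we are going to show that when $v^2v_y\ge C_0N^{-1}$, $|\delta_1|$ is indeed bounded by $\frac{C_1}{Nvv_y}\bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\bigr)$, as required by Lemma 3.1. From $\delta_1$, we can further write $\delta_1=\delta_{11}+\delta_{12}+\delta_{13}$, where

\begin{align*}
\delta_{11}&=N(z\underline{m}(z)+z)^{-1}E\Bigl[\beta_1(z)\Bigl(r_1^*A_1^{-1}(z)x_nx_n^*r_1-\frac{1}{N}x_n^*A_1^{-1}(z)x_n\Bigr)\Bigr]=N(z\underline{m}(z)+z)^{-1}E(\beta_1(z)\alpha_1(z)),\\
\delta_{12}&=(z\underline{m}(z)+z)^{-1}E\bigl[\beta_1(z)x_n^*(A_1^{-1}(z)-A^{-1}(z))x_n\bigr],\\
\delta_{13}&=(z\underline{m}(z)+z)^{-1}E\bigl[\beta_1(z)x_n^*(A^{-1}(z)-EA^{-1}(z))x_n\bigr].
\end{align*}

According to (2.3) and Lemma B.7,

\[ |(z\underline{m}(z)+z)^{-1}|=|m_y(z)|\le\frac{C}{v_y} \tag{4.1} \]

for some constant $C$. Using identity (3.2) three times, we have

\[ \beta_1(z)=b_1(z)-b_1^2(z)\xi_1(z)+b_1^3(z)\xi_1^2(z)-b_1^3(z)\beta_1(z)\xi_1^3(z). \]

Notice that $E\alpha_1(z)=0$ and $b_1(z)$ is bounded by a constant (due to Lemma A.3), we then have

\[ |\delta_{11}|\le\frac{CN}{v_y}|E\beta_1(z)\alpha_1(z)|\le\frac{CN}{v_y}\bigl(|E\xi_1(z)\alpha_1(z)|+|E\xi_1^2(z)\alpha_1(z)|+|E\beta_1(z)\xi_1^3(z)\alpha_1(z)|\bigr). \tag{4.2} \]

Let us start with the first term in the above upper bound of $\delta_{11}$ as in (4.2). Note that $r_1$ and $A_1^{-1}(z)$ are independent. Therefore, for any integer $p>0$ we have

\[ E(\mathrm{tr}A_1^{-1}(z)-E\,\mathrm{tr}A_1^{-1}(z))\alpha_1(z)=0 \]

and

\[ E(\mathrm{tr}A_1^{-1}(z))^p\alpha_1(z)=0. \]

Denote $A_1^{-1}(z)=(a_{ij})_{n\times n}$, $A_1^{-1}(z)x_nx_n^*=(b_{ij})_{n\times n}$, and let $e_i$ be the $i$th canonical basis vector, that is, the $n$-vector whose coordinates are all 0 except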

14 4 N XIA, Y QIN AND Z D BAI that the ith coordinate is Then Lemmas B3, A5, A6 and the inequality (z) /v imply that A Eξ (z)α (z) = Eˆξ (z)α (z) C N 2E ( tr(a (z)a ( z)x nx n)+ i= ) n a ii b ii i= { ( C N N 2 v v y v +v E x n A ( z)e i 2 C ( N 2 v v y v ) ) /2 } N E a ii 2 x n e i 2 In the above, we use the following two results, which can be proved by applying Lemmas A5 and A6: ( ) E a =E e A (z)e C, v y v i= E a 2 E e A (z)a ( z)e C ( v v y v Hence, we have shown that Eξ (z)α (z) C ( ) (43) N 2 v v y v Let us denote X the conugate transpose of X, that is, X =( X, X 2,, X n ) Then we can rewrite the second term in the upper bound of δ as Eξ(z)α 2 (z)= [( N 3E a i Xi X + i i ( i, ) ) 2 (a ii Xi Ea 2 ii ) )] b i ( X i X δ i ) = [{( ) 2 ( ) N 3E a i Xi X +2 (a ii Xi 2 Ea ii) i i ( ( ) 2 } a i Xi X )+ a ii Xi 2 Ea ii i i

15 ASYMPTOTICS OF EIGENVECTORS AND EIGENVALUES 5 { b i Xi X + }] b ii ( Xi ) 2 i i C N 3E ( i a 2 iib ii + i Ea ii 2 b ii + i a 2 ib ii + a 2 ib i + a i a ii b + i i i + C N 3 Ea ι i aτ k b ik, ι,τ i k ) a ii a i b i where a ι i and aτ i denote a i or ā i By following the similar proofs in establishing (43), we are able to show that E i a 2 ii b ii E a ii b ii v i E i v (E a2 Ex na (z)a ( z)x n) /2 C v 2 ( v y v ), Ea ii 2 b ii Ea 2 (Ex na (z)a ( z)x n) /2 [ ( )] C 3/2 v v y v The Cauchy Schwarz inequality implies that E ( ( )) a 2 i b ii E b ii a 2 i i i ( ) /2 ( E x na (z)e i 2 i E i i =(Ex n (A C v 3/2 [ a 2 i b i i ( x ne i 2 E a 2 i ) 2 ) /2 (z)a ( z))x n) /2 (E (A (z)a ( z) ) 2 ) /2 ( ) 3/2, v y v E a i x n A (z)e i 2 i E a i x n e 2 ] /2

16 6 N XIA, Y QIN AND Z D BAI [ E (A (z)a ( z)) ii (x na (z)e i) 2 i ] /2 E (A ( z)a (z)) (x n e ) 2 E i E i [ a ii a i b C ( ) 3/2 v 2, v y v i [ i N C v 2 [ a ii a i b i C ι,τ i N v 2 i k E a ii x n A (z)e 2 i E a 2 ii x n A (z)a ( z)x n E a i x n e 2 ] /2 ] /2 E(A (z)a ( z)) x n e 2 ( ), v y v E a ii x na e 2 i ( ) v y v Finally, we establish Ea ι i aτ k b N ik C E a i x ne i 2 ] /2 v 2 ( ) v y v By inclusive exclusive principle and what we have ust proved, it remains to show that ( ) Ea ι i aτ k b N (44) ik C v 2 v y v ι,τ i,,k Notice that γ ik = a ia k is the (i,k)-element of A 2 (z) We obtain Ea i a k b ik = γ ik b ik i,,k i,k ( E γ ik 2 i,k i,k E b ik 2 ) /2

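\[ =\bigl(E\,\mathrm{tr}(A_1^{-2}(z)A_1^{-2}(\bar z))\,E\,x_n^*A_1^{-1}(z)A_1^{-1}(\bar z)x_n\bigr)^{1/2}\le C\frac{\sqrt{N}}{v^2}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \]

Similarly, one can prove the other terms of (4.4) share this common bound. In summary, when $v^2v_y\ge C_0N^{-1}$ it holds that

\[ |E\xi_1^2(z)\alpha_1(z)|\le\frac{C}{N^2v}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \tag{4.5} \]

For the last term in the upper bound of $\delta_{11}$, we apply Lemma A.4 and the Cauchy–Schwarz inequality again. In particular, for any fixed $t>0$, we have that

\begin{align*}
|E\beta_1(z)\xi_1^3(z)\alpha_1(z)|&\le CE|\xi_1^3(z)\alpha_1(z)|+o(N^{-t})\\
&\le C(E|\xi_1(z)|^6)^{1/2}(E|\alpha_1(z)|^2)^{1/2}+o(N^{-t})\tag{4.6}\\
&\le\frac{C}{N^{5/2}v^2v_y^{3/2}}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr)^{1/2}.
\end{align*}

The last inequality in (4.6) is due to Lemmas A.2 and A.7. Therefore, for any $v^2v_y\ge C_0N^{-1}$, (4.3), (4.5) and (4.6) lead us to

\[ |\delta_{11}|\le\frac{C}{Nvv_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \tag{4.7} \]

To establish the upper bound for $\delta_{12}$, we will make use of the following equality:

\[ A_1^{-1}(z)-A^{-1}(z)=\beta_1(z)A_1^{-1}(z)r_1r_1^*A_1^{-1}(z). \tag{4.8} \]

Note that (4.1) implies that

\begin{align*}
|\delta_{12}|&\le\frac{C}{v_y}|E\beta_1(z)x_n^*(A_1^{-1}(z)-A^{-1}(z))x_n|\\
&=\frac{C}{v_y}|E\beta_1^2(z)x_n^*A_1^{-1}(z)r_1r_1^*A_1^{-1}(z)x_n|\\
&\le\frac{C}{Nv_y}E|X_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)X_1|+o(N^{-t})\qquad\text{(see Lemma A.4)}\tag{4.9}\\
&\le\frac{C}{Nv_y}E|\mathrm{tr}(A_1^{-1}(z)x_nx_n^*A_1^{-1}(z))|\qquad\text{(see Lemma B.5)}\\
&=\frac{C}{Nv_y}E|x_n^*A_1^{-2}(z)x_n|\le\frac{C}{Nvv_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr).
\end{align*}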

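At last, we establish the upper bound for $\delta_{13}$. By (4.1), Lemma A.3, and the fact

\[ \beta_1(z)=b_1(z)-b_1(z)\beta_1(z)\xi_1(z), \tag{4.10} \]

we obtain

\begin{align*}
|\delta_{13}|&\le\frac{C}{v_y}|E\{\beta_1(z)x_n^*(A^{-1}(z)-EA^{-1}(z))x_n\}|\\
&=\frac{C}{v_y}|E\{b_1(z)\beta_1(z)\xi_1(z)x_n^*(A^{-1}(z)-EA^{-1}(z))x_n\}|\\
&\le\frac{C}{v_y}|E\{\xi_1(z)x_n^*(A^{-1}(z)-EA^{-1}(z))x_n\}|+\frac{C}{v_y}|E\{\beta_1(z)\xi_1^2(z)x_n^*(A^{-1}(z)-EA^{-1}(z))x_n\}|.
\end{align*}

From the $C_r$ inequality (see Loève [14]), we have

\[ |\delta_{13}|\le\frac{C}{v_y}(|II_1|+|II_2|+|II_3|), \]

where

\begin{align*}
II_1&=E\{\xi_1(z)x_n^*(A_1^{-1}(z)-EA_1^{-1}(z))x_n\},\\
II_2&=E\{\xi_1(z)x_n^*(A_1^{-1}(z)-A^{-1}(z))x_n\},\\
II_3&=E\{\beta_1(z)\xi_1(z)x_n^*E(A_1^{-1}(z)-A^{-1}(z))x_n\}.
\end{align*}

It should be noted that $E(\xi_1(z)\,|\,A_1^{-1}(z))=0$, and $r_1$ and $A_1^{-1}(z)$ are independent. Then we have

\[ II_1=E[E\{\xi_1(z)x_n^*(A_1^{-1}(z)-EA_1^{-1}(z))x_n\,|\,A_1^{-1}(z)\}]=0. \]

By the results in (4.8) and (4.10), we have

\begin{align*}
|II_2|&=|E\xi_1(z)\beta_1(z)x_n^*A_1^{-1}(z)r_1r_1^*A_1^{-1}(z)x_n|\\
&\le|b_1(z)|\,|E\xi_1(z)r_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)r_1|+|b_1(z)|\,|E\beta_1(z)\xi_1^2(z)r_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)r_1|\\
&=:III_1+III_2,
\end{align*}

where

\begin{align*}
III_1&=|b_1(z)|\,\Bigl|E\xi_1(z)\Bigl(r_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)r_1-\frac{1}{N}x_n^*A_1^{-2}(z)x_n\Bigr)\Bigr|\\
&\le\frac{C}{N^2}\Bigl(E|\mathrm{tr}(x_n^*A_1^{-3}(z)x_n)|+E\Bigl|\sum_{i=1}^{n}(A_1^{-1}(z)x_nx_n^*A_1^{-1}(z))_{ii}a_{ii}\Bigr|\Bigr)\\
&\le\frac{C}{N^2v^2}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr).
\end{align*}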

The above inequality follows from Lemmas B.3 and A.3. By Lemma A.4 and the Cauchy–Schwarz inequality, it holds that

\begin{align*}
III_2&\le CE|\xi_1^2(z)r_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)r_1|\\
&\le C(E|\xi_1(z)|^4)^{1/2}(E|r_1^*A_1^{-1}(z)x_nx_n^*A_1^{-1}(z)r_1|^2)^{1/2}\\
&\le C\Bigl(\frac{1}{N^2v^2v_y^2}\cdot\frac{1}{N^2}E|x_n^*A_1^{-2}(z)x_n|^2\Bigr)^{1/2}\qquad\text{(see Lemmas A.2 and B.5)}\\
&\le\frac{C}{N^2v^2v_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr).
\end{align*}

Hence, we have shown that

\[ |II_2|\le\frac{C}{N^2v^2v_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \]

Moreover, Lemmas A.2 and A.5, and the Cauchy–Schwarz inequality lead us to the following:

\begin{align*}
|II_3|&\le C(E|\xi_1(z)|^2E|x_n^*(A_1^{-1}(z)-A^{-1}(z))x_n|^2)^{1/2}\\
&\le\frac{C}{N^{1/2}v^{1/2}v_y^{1/2}}\cdot\frac{C}{Nv}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr)=\frac{C}{N^{3/2}v^{3/2}v_y^{1/2}}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr).
\end{align*}

Therefore, it follows that

\[ |\delta_{13}|\le\frac{C}{v_y}(|II_1|+|II_2|+|II_3|)\le\frac{C}{N^{3/2}v^{3/2}v_y^{3/2}}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \tag{4.11} \]

As it has been shown in (4.7), (4.9) and (4.11), we conclude that

\[ |\delta_1|\le\frac{C}{Nvv_y}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \tag{4.12} \]

5. Proof of Theorem 1.6. From Lemma 2.2, and by replacing $EH^{S_n}(x)$ and $Em_n^H(z)$ by $H^{S_n}(x)$ and $m_n^H(z)$, respectively, we have

\begin{align*}
E\Delta_p^H=:E\|H^{S_n}(x)-F_{y_n}(x)\|&\le K_1\int E|m_n^H(z)-Em_n^H(z)|\,du+K_1\int|Em_n^H(z)-m_y(z)|\,du\\
&\qquad+K_2v^{-1}\int_{|x|>B}|EH^{S_n}(x)-F_{y_n}(x)|\,dx+K_3v^{-1}\sup_x\int_{|t|<v}|F_{y_n}(x+t)-F_{y_n}(x)|\,dt\\
&\le K_1\int E|m_n^H(z)-Em_n^H(z)|\,du+\Delta^H.
\end{align*}

As the convergence rate of $\Delta^H$ has already been established in Theorem 1.1, we only focus on the convergence rate of $E|m_n^H(z)-Em_n^H(z)|$. By Lemma A.6 and the Cauchy–Schwarz inequality, it follows that

\[ E|m_n^H(z)-Em_n^H(z)|\le(E|m_n^H(z)-Em_n^H(z)|^2)^{1/2}\le\frac{C}{\sqrt{N}v}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr). \]

Together with Lemma 3.1 that $\Delta^H\le Cv/v_y$, when $v^2v_y\ge O(N^{-1})$, we have

\[ E\|H^{S_n}(x)-F_{y_n}(x)\|\le\frac{C}{v_y}\Bigl(\frac{1}{\sqrt{N}v}+v\Bigr). \]

By choosing $v=O(N^{-1/4})$, we obtain

\[ E\|H^{S_n}(x)-F_{y_n}(x)\|=\begin{cases}O(N^{-1/4}a^{-1/2}), & \text{when } a\ge N^{-1/4},\\ O(N^{-1/8}), & \text{otherwise}.\end{cases} \]

The proof of Theorem 1.6 is complete.

6. Proof of Theorem 1.8. Notice that the proof of Theorem 1.8 is almost the same as that of Theorem 1.6. By Lemma 2.2, choosing $v=O(N^{-1/4})$,

\[ \|H^{S_n}-F_{y_n}\|\le K_1\int|m_n^H(z)-Em_n^H(z)|\,du+Cv/v_y. \]

By Lemma A.6, we have

\[ E|m_n^H(z)-Em_n^H(z)|^{2l}\le CN^{-l}v^{-2l}\Bigl(\frac{1}{v_y}+\frac{\Delta^H}{v}\Bigr)^{2l}. \]

When $a<N^{-1/4}$, with $v=O(N^{-1/4})$,

\[ N^{2l(1/8-\eta)}E|m_n^H(z)-Em_n^H(z)|^{2l}\le CN^{-2l\eta}, \]

which implies that if we choose an $l$ such that $2l\eta>1$,

\[ \int|m_n^H(z)-Em_n^H(z)|\,du=o_{a.s.}(N^{-1/8+\eta}). \]

When $a\ge N^{-1/4}$, in this case, by choosing $v=O(N^{-1/4})$, we have

\[ a^lN^{2l(1/4-\eta)}E|m_n^H(z)-Em_n^H(z)|^{2l}\le CN^{-2l\eta}. \]

Theorem 1.8 then follows by setting $l>\frac{1}{2\eta}$. This completes the proof of Theorem 1.8.
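The passage from the $2l$-th moment bound to an almost sure rate is the usual Markov plus Borel–Cantelli step. A sketch for a fixed $z$ in the case $a<N^{-1/4}$, assuming only the moment bound displayed in the proof (the uniformity in $u$ needed for the integral is not spelled out here):

```latex
\[
\sum_{N\ge 1} P\Bigl(N^{1/8-\eta}\,\bigl|m_n^H(z)-Em_n^H(z)\bigr|\ge\varepsilon\Bigr)
\;\le\;\sum_{N\ge 1}\varepsilon^{-2l}\,N^{2l(1/8-\eta)}\,
E\bigl|m_n^H(z)-Em_n^H(z)\bigr|^{2l}
\;\le\; C\,\varepsilon^{-2l}\sum_{N\ge 1}N^{-2l\eta}\;<\;\infty
\qquad\text{whenever } 2l\eta>1 .
\]
```

Since the series converges for every $\varepsilon>0$, the Borel–Cantelli lemma gives $N^{1/8-\eta}|m_n^H(z)-Em_n^H(z)|\to0$ almost surely; the case $a\ge N^{-1/4}$ is analogous with the normalisation $a^{1/2}N^{1/4-\eta}$.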

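APPENDIX A

In this section, we establish some lemmas which are used in the proofs of the main theorems.

Lemma A.1. Under the conditions of Theorem 1.6, for all $|z|<A$ and $v^2v_y\ge C_0N^{-1}$, we have

\[ |Em_n(z)-m_y(z)|\le\frac{C}{Nv^{3/2}v_y^2}, \]

where $C_0$ is a constant and $v_y=1-\sqrt{y_n}+\sqrt{v}$.

Proof. Since

\begin{align*}
Em_n(z)&=\frac{1}{n}E\,\mathrm{tr}(S_n-zI_n)^{-1}\\
&=\frac{1}{n}\sum_{k=1}^{n}E\frac{1}{s_{kk}-z-N^{-2}\alpha_k^*(S_{nk}-zI_{n-1})^{-1}\alpha_k}\\
&=\frac{1}{n}\sum_{k=1}^{n}E\frac{1}{\epsilon_k+1-y_n-z-y_nzEm_n(z)}\\
&=-\frac{1}{z+y_n-1+y_nzEm_n(z)}+\delta_n,\tag{A.1}
\end{align*}

where

\begin{align*}
s_{kk}&=\frac{1}{N}\sum_{j=1}^{N}|X_{kj}|^2,\qquad S_{nk}=\frac{1}{N}X_{(k)}X_{(k)}^*,\qquad\alpha_k=X_{(k)}X_k^*,\\
\epsilon_k&=(s_{kk}-1)+y_n+y_nzEm_n(z)-\frac{1}{N^2}\alpha_k^*(S_{nk}-zI_{n-1})^{-1}\alpha_k,\\
\delta_n&=-\frac{1}{n}\sum_{k=1}^{n}b_nE\beta_k\epsilon_k,\\
b_n&=b_n(z)=\frac{1}{z+y_n-1+y_nzEm_n(z)},\\
\beta_k&=\beta_k(z)=\frac{1}{z+y_n-1+y_nzEm_n(z)-\epsilon_k},
\end{align*}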

where $X_{(k)}$ is the $(n-1)\times N$ matrix obtained from $X$ with its $k$th row removed and $X_k$ is the $k$th row of $X$. It has been proved that one of the roots of equation (A.1) is (see (3.7) in [3])

\[ Em_n(z)=-\frac{1}{2y_nz}\Bigl(z+y_n-1-y_nz\delta_n-\sqrt{(z+y_n-1+y_nz\delta_n)^2-4y_nz}\Bigr). \]

The Stieltjes transform of the Marčenko–Pastur distribution with index $y_n$ is given by (see (2.3) in [3])

\[ m_y(z)=-\frac{y_n+z-1-\sqrt{(1+y_n-z)^2-4y_n}}{2y_nz}. \]

Thus,

\[ |Em_n(z)-m_y(z)|\le\frac{|\delta_n|}{2}\Bigl[1+\frac{|2(z+y_n-1)+y_nz\delta_n|}{\bigl|\sqrt{(z+y_n-1)^2-4y_nz}+\sqrt{(z+y_n-1+y_nz\delta_n)^2-4y_nz}\bigr|}\Bigr]. \]

Let us define by convention

\[ \Re(\sqrt{z})=\frac{\Im(z)}{\sqrt{2(|z|-\Re(z))}},\qquad\Im(\sqrt{z})=\frac{|\Im(z)|}{\sqrt{2(|z|+\Re(z))}}. \]

If $|u-y_n-1|\ge\frac{1}{5(A+1)}$, then the real parts of $\sqrt{(z+y_n-1)^2-4y_nz}$ and $\sqrt{(z+y_n-1+y_nz\delta_n)^2-4y_nz}$ have the same sign. Since they both have positive imaginary parts, it follows that

\[ \Bigl|\sqrt{(z+y_n-1)^2-4y_nz}+\sqrt{(z+y_n-1+y_nz\delta_n)^2-4y_nz}\Bigr|\ge\bigl|\Im((z+y_n-1)^2-4y_nz)\bigr|^{1/2}=(2v|u-y_n-1|)^{1/2}\ge\Bigl(\frac{2v}{5(A+1)}\Bigr)^{1/2}. \]

Thus,

\[ |Em_n(z)-m_y(z)|\le\frac{|\delta_n|}{2}\Bigl(1+\frac{C}{\sqrt{v}}\Bigr)\le\frac{C|\delta_n|}{\sqrt{v}}. \tag{A.2} \]

If $|u-y_n-1|<\frac{1}{5(A+1)}$, we have $|Em_n(z)-m_y(z)|\le C|\delta_n|$. In [7] [see the inequality above (8.3.16)], we have

\[ |\delta_n|\le\frac{C}{Nv^3}\Bigl(\Delta+\frac{v}{v_y}\Bigr)^2\le\frac{C}{Nvv_y^2}. \]

Combined with (A.2), we get

\[ |Em_n(z)-m_y(z)|\le\frac{C}{Nv^{3/2}v_y^2}. \]

The proof of the lemma is complete.

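In addition, the following relevant result which is involved in the proof of Lemma 3.1 is presented here:

\[ \int|Em_n(z)-m_y(z)|\,du=\int_{|u-y_n-1|\ge1/(5(A+1)),\,|u|\le A}|Em_n(z)-m_y(z)|\,du+\int_{|u-y_n-1|<1/(5(A+1)),\,|u|\le A}|Em_n(z)-m_y(z)|\,du\le Cv. \tag{A.3} \]

Lemma A.2. Under the conditions of Theorem 1.6, for $v^2v_y\ge C_0N^{-1}$ and $l\le3$, there exists a constant $C$, such that

\[ E|\xi_1(z)|^{2l}\le\frac{C}{N^lv^lv_y^l}. \]

Proof. By the $C_r$-inequality (see Loève [14]), it follows that

\[ E|\xi_1(z)|^{2l}\le C\Bigl(E\Bigl|\frac{1}{N}\mathrm{tr}A_1^{-1}(z)-\frac{1}{N}E\,\mathrm{tr}A_1^{-1}(z)\Bigr|^{2l}+E|\hat\xi_1(z)|^{2l}\Bigr)=:I_1+I_2. \]

From Lemmas B.6, B.8, B.10 and the $C_r$-inequality with $v^2v_y\ge C_0N^{-1}$, we have

\begin{align*}
I_1&\le E\Bigl|\frac{1}{N}\mathrm{tr}A_1^{-1}(z)-\frac{1}{N}\mathrm{tr}A^{-1}(z)\Bigr|^{2l}+E\Bigl|\frac{1}{N}\mathrm{tr}A^{-1}(z)-\frac{1}{N}E\,\mathrm{tr}A^{-1}(z)\Bigr|^{2l}+\Bigl|\frac{1}{N}E\,\mathrm{tr}A^{-1}(z)-\frac{1}{N}E\,\mathrm{tr}A_1^{-1}(z)\Bigr|^{2l}\\
&\le C\Bigl\{\Bigl(\frac{1}{Nv}\Bigr)^{2l}+\frac{1}{N^{2l}v^{4l}}\Bigl(\Delta+\frac{v}{v_y}\Bigr)^{l}+\Bigl(\frac{1}{Nv}\Bigr)^{2l}\Bigr\}\\
&\le\frac{C}{N^{2l}v^{3l}v_y^l}.
\end{align*}

Under the finite 8th moment assumption, for $l\ge2$ we have

\[ E|X_{11}|^{4l}=E\{|X_{11}|^8|X_{11}|^{4l-8}I(|X_{11}|\le\eta_NN^{1/4})\}\le CN^{l-2}. \]

It can be shown that $B_1-zI_n=A_1$, $\|(B_1-zI_n)^{-1}\|\le1/v$, and

\[ \mathrm{tr}((B_1-zI_n)^{-1}(B_1-\bar zI_n)^{-1})=v^{-1}\Im(\mathrm{tr}(B_1-zI_n)^{-1}). \]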

Hence, by Lemma B.5,

\begin{align*}
I_2&=\frac{1}{N^{2l}}E|X_1^*A_1^{-1}(z)X_1-\mathrm{tr}A_1^{-1}(z)|^{2l}\\
&\le\frac{C}{N^{2l}}E\{CN^{l-2}\mathrm{tr}(A_1^{-1}(z)A_1^{-1}(\bar z))^l+(C\,\mathrm{tr}(A_1^{-1}(z)A_1^{-1}(\bar z)))^l\}\\
&\le\frac{C}{N^{2l}}E\{CN^{l-2}v^{-2l+1}\Im(\mathrm{tr}(B_1-zI_n)^{-1})+C^lv^{-l}(\Im(\mathrm{tr}(B_1-zI_n)^{-1}))^l\}\\
&=\frac{C}{N^{2l}}\{CN^{l-1}v^{-2l+1}E(\Im(m_{F^{B_1}}(z)))+C^lN^lv^{-l}E(\Im(m_{F^{B_1}}(z)))^l\}\\
&\le\frac{C}{N^{l+1}v^{2l-1}v_y}+\frac{C}{N^lv^lv_y^l}.
\end{align*}

The last inequality is due to Lemmas B.6, B.7, B.8 and

\[ E(\Im(m_{F^{B_1}}(z)))^l\le E|m_{F^{B_1}}(z)-m_n(z)|^l+E|m_n(z)-Em_n(z)|^l+|Em_n(z)-m_y(z)|^l+|m_y(z)|^l\le Cv_y^{-l}. \]

Here let $\Delta=\|EF^{S_n}-F_{y_n}\|$; by integration by parts and Lemma B.10, we have

\[ |Em_n(z)-m_y(z)|\le\frac{C\Delta}{v}\le\frac{C}{v_y}. \tag{A.4} \]

Therefore, for $l\le3$ and $v^2v_y\ge C_0N^{-1}$, it follows that

\[ E|\xi_1(z)|^{2l}\le I_1+I_2\le\frac{C}{N^lv^lv_y^l}. \]

This completes the proof.

Lemma A.3. Under the conditions of Theorem 1.6, for all $|z|<A$ and $v^2v_y\ge C_0N^{-1}$, we have

\[ |b_1(z)|\le C. \]

Proof. From (2.3) in [3], we have

\[ m_y(z)=\frac{1-y_n-z+\sqrt{(1-y_n-z)^2-4y_nz}}{2y_nz}, \]

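where the square root of a complex number is defined as the one with positive imaginary part. It then can be verified that

\[ b_0(z)=:\frac{1}{1+y_nm_y(z)}=1+\frac{1}{2}\Bigl(z-1-y_n-\sqrt{(z-1-y_n)^2-4y_n}\Bigr)=1-\sqrt{y_n}\,m_{\mathrm{semi}}\Bigl(\frac{z-1-y_n}{\sqrt{y_n}}\Bigr), \]

where $m_{\mathrm{semi}}$ denotes the Stieltjes transform of the semicircular law. As $|m_{\mathrm{semi}}|\le1$, we conclude that

\[ |b_0(z)|\le1+\sqrt{y_n}. \tag{A.5} \]

By the relationship between $b_0$ and $b_1$, we have

\[ b_1(z)=\frac{b_0(z)}{1+y_nb_0(z)((1/n)E\,\mathrm{tr}A_1^{-1}(z)-m_y(z))}. \]

When $C_0$ is chosen large enough, by Lemma A.1, for all large $N$ we have

\[ \Bigl|\frac{1}{n}E\,\mathrm{tr}A_1^{-1}(z)-m_y(z)\Bigr|\le\Bigl|\frac{1}{n}E\,\mathrm{tr}(A_1^{-1}(z)-A^{-1}(z))\Bigr|+|Em_n(z)-m_y(z)|\le\frac{1}{nv}+\frac{1}{3(1+\sqrt{y_n})}\le\frac{2}{3(1+\sqrt{y_n})} \]

and consequently we obtain

\[ |b_1(z)|\le3(1+\sqrt{y_n})\le C. \]

Thus, the proof is complete.

Lemma A.4. If $|b_1(z)|\le C$, then for any fixed $t>0$,

\[ P(|\beta_1(z)|>2C)=o(N^{-t}). \]

Proof. Note that if $|b_1(z)\xi_1(z)|\le1/2$, by Lemma A.3, we get

\[ |\beta_1(z)|=\frac{|b_1(z)|}{|1+b_1(z)\xi_1(z)|}\le\frac{|b_1(z)|}{1-|b_1(z)\xi_1(z)|}\le2|b_1(z)|\le2C. \]

As a result,

\[ P(|\beta_1(z)|>2C)\le P\Bigl(|b_1(z)\xi_1(z)|>\frac{1}{2}\Bigr)\le P\Bigl(|\xi_1(z)|>\frac{1}{2C}\Bigr)\le(2C)^pE|\xi_1(z)|^p\qquad\text{(see Lemma A.3)}. \]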

26 26 N XIA, Y QIN AND Z D BAI By the C r -inequality, Lemmas B8 and B9, for some η = η N N /4 and p logn, we have E ξ (z) p =E ˆξ (z) p +E N tra (z) p N EtrA (z) C(Nη 4 NN ) (v η 2 NN /2 ) p + Cη 2p 4 N Cη p N C N p v 3p/2 v p/2 y For any fixed t>0, when N is large enough so that logη N >t+, it can be shown that We finish the proof E ξ (z) p Ce plogη N Ce p(t+) Ce (t+)logn =CN t =o(n t ) Lemma A5 If v 2 v y C 0 N, C 0 is a large constant For l, it holds that Proof Recall that E m H B (z) 2l CE m H n (z) 2l A (z) (z)=β (z)a (z)r r A (z) By Lemmas A4 and B5, it holds that E m H B (z) m H n (z) 2l =E x n(a (z) (z))x n 2l =E x n β (z)a (z)r r A (z)x n 2l =E x nβ (z)a (z)r r A (z)x n 2l I( β (z) C) +E x n β (z)a (z)r r A (z)x n 2l I( β (z) >C) C N 2lE X A (z)x nx na (z)x 2l +o(n t ) C N 2lE X A (z)x nx na (z)x x na 2 (z)x n 2l + C N 2lE x n A 2 (z)x n 2l

27 ASYMPTOTICS OF EIGENVECTORS AND EIGENVALUES 27 C N 2l[E(trA (z)x nx n A (z)a ( z)x nx n A ( z))l +N l 2 Etr(A (z)x nx n A (z)a ( z)x nx n A ( z))l ] + C N 2lE x na 2 (z)x n 2l C N l+2 v 2lE m H B (z) 2l The last step follows the fact that x na (z)a ( z)x n =v I(x na (z)x n)=v I(m H B (z)) For v 2 v y C 0 N, which implies that v C 0 N /2 Choose C 0 and N C large enough, such that N l+2 v 2l 2 Further, by C r-inequality, we obtain E m H B (z) 2l CE m H B (z) m H n (z) 2l +CE m H n (z) 2l 2 E m H B (z) 2l +CE m H n (z) 2l That is, E m H B (z) 2l CE m H n (z) 2l, for some constant C This finishes the proof Lemma A6 Under the conditions of Theorem 6, for v 2 v y C 0 N, we have E m H n (z) EmH n (z) 2l C ( ) 2l N l v 2l v y v Proof Write E ( ) as the conditional expectation given {r,,r } It can then be shown that m H n (z) EmH n (z)= N = γ, where γ =:E (x na (z)x n ) E (x na (z)x n ) = (E E ){x n(a (z) (z))x n } = (E E ){β (z)x na (z)r r A (z)x n } Therefore, by Lemmas B4(b), we have ( N ) l N E m H n (z) Em H n (z) 2l CE E γ 2 +C E γ 2l Using Lemmas A4 and B5, we have = = E =E (E E )β (z)x n A (z)r r A (z)x n 2 C N 2E X A (z)x n x n A (z)x x n A 2 (z)x n 2

28 28 N XIA, Y QIN AND Z D BAI + C N 2E x n A 2 (z)x n 2 C N 2E tr(a (z)x n x n A (z)a ( z)x n x n A ( z)) + C N 2E x na 2 (z)x n 2 By the fact that x na (z)a ( z)x n =v I(x na (z)x n ) and A (z) v, we have On the other side, E γ 2 C N 2 v 2E m H B (z) 2 E γ 2l = N 2lE X A (z)x nx na (z)x 2l Thus, we obtain C N 2lE X A (z)x nx na (z)x x na 2 (z)x n 2l + C N 2lE x na 2 (z)x n 2l C Nl 2 N2l v 2l E(I(x na (z)x n)) 2l + C N 2lE x na (z)x n 2l C N l+2 v 2lE m H B (z) 2l E m H n (z) Em H n (z) 2l C N l v 2lE m H B (z) 2l C + N l+ v 2lE m H B (z) 2l Further C N l v 2lE mh n (z) 2l (by Lemma A5) E m H n (z) 2l E m H n (z) Em H n (z) 2l + Em H n (z) m y (z) 2l + m y (z) 2l For v C 0 N /2 C, choose C 0 large enough, such that integration by parts, it is easy to find that (A6) Em H n (z) m y (z) C H, v where H = EH Sn F yn Besides, from Lemma B7, we know that m y (z) C v y N l v 2l 2 And using

29 ASYMPTOTICS OF EIGENVECTORS AND EIGENVALUES 29 Therefore, we obtain E m H n (z) Em H n (z) 2l C ( ) 2l N l v 2l v y v The proof is then complete Lemma A7 E α (z) 2 C ( ) N 2 v v y v Proof Lemma B5 implies that E α (z) 2 = N 2E X A (z)x nx n X x n A (z)x n 2 C N 2E(x na ( z)a (z)x n) C N 2 v Em H B (z) Using Lemmas A5 and A6 and integration by parts, we have E m H B (z) 2 E m H n (z) 2 This finishes the proof E m H n (z) EmH n (z) 2 + Em H n (z) m y(z) 2 + m y (z) 2 ( ) 2 C v y v Lemma A8 Under the conditions in Theorem, for any fixed t>0, B EH Sn (x) F yn (x) dx=o(n t ) Proof For any fixed t>0, by Lemma B2, it follows that P(λ max (S n ) B+x) CN t (B+x ε) 2 and EH Sn (x) F yn (x) dx B = B B ( EH Sn (x))dx ( ) N y i 2 P(λ i x) dx i=

\[
\le \int_B^\infty \Biggl(\sum_{i=1}^N |y_i|^2 P(\lambda_{\max}\ge x)\Biggr)\,dx
\le C\int_B^\infty N^{-t}(B+x-\varepsilon)^{-2}\,dx = o(N^{-t}).
\]
The proof is complete.

APPENDIX B

In what follows, we present some existing results that are of substantial importance in proving the main theorems.

Lemma B.1 (Theorem 2.2 in [2]). Let $F$ be a distribution function and let $G$ be a function of bounded variation satisfying $\int |F(x)-G(x)|\,dx<\infty$. Denote their Stieltjes transforms by $f(z)$ and $g(z)$, respectively. Then
\[
\|F-G\| = \sup_x |F(x)-G(x)|
\le \frac{1}{\pi(1-\kappa)(2\gamma-1)}
\Biggl( \int_{-A}^{A} |f(z)-g(z)|\,du
+ \frac{2\pi}{v}\int_{|x|>B} |F(x)-G(x)|\,dx
+ \frac{1}{v}\sup_x \int_{|y|\le 2v\tau} |G(x+y)-G(x)|\,dy \Biggr),
\]
where $z=u+iv$ is a complex variable, and $\gamma$, $\kappa$, $\tau$, $A$ and $B$ are positive constants such that $A>B$,
\[
\kappa = \frac{4B}{\pi(A-B)(2\gamma-1)} < 1
\quad\text{and}\quad
\gamma = \frac{1}{\pi}\int_{|u|<\tau} \frac{1}{u^2+1}\,du > \frac{1}{2}.
\]

Lemma B.2 (Lemma 8.15 in [7]). For any $v>0$, we have
\[
\sup_x \int_{|u|<v} |F_{y_n}(x+u)-F_{y_n}(x)|\,du < \frac{11\sqrt{2(1+y_n)}}{3\pi y_n}\, v^2/v_y,
\]
where $F_{y_n}$ is the cdf of the Marčenko-Pastur distribution with index $y_n$, and $v_y = 1-\sqrt{y_n}+\sqrt{v}$.
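As a rough numerical sanity check of Lemma B.2, the following Python sketch tabulates the Marčenko-Pastur cdf by quadrature and compares the left-hand side with the stated bound. It assumes the bound reads $\frac{11\sqrt{2(1+y_n)}}{3\pi y_n}\,v^2/v_y$ with $v_y = 1-\sqrt{y_n}+\sqrt{v}$, and it uses the standard MP density $\sqrt{(b-x)(x-a)}/(2\pi y x)$ on $[a,b]$ with $a=(1-\sqrt{y})^2$, $b=(1+\sqrt{y})^2$ and $0<y<1$. All function names and grid sizes are illustrative choices, not part of the paper.

```python
import math

def mp_support(y):
    """Endpoints a, b of the Marchenko-Pastur support for index 0 < y < 1."""
    return (1 - math.sqrt(y)) ** 2, (1 + math.sqrt(y)) ** 2

def mp_cdf_table(y, m=20000):
    """Tabulate the MP cdf on [a, b] with the midpoint rule on m cells."""
    a, b = mp_support(y)
    h = (b - a) / m
    cdf = [0.0]
    for k in range(m):
        x = a + (k + 0.5) * h
        dens = math.sqrt((b - x) * (x - a)) / (2 * math.pi * y * x)
        cdf.append(cdf[-1] + dens * h)
    total = cdf[-1]  # close to 1; dividing by it removes the quadrature error
    return a, b, h, [c / total for c in cdf]

def mp_cdf(x, table):
    """Linear interpolation of the tabulated cdf; 0 left of a, 1 right of b."""
    a, b, h, cdf = table
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    t = (x - a) / h
    k = min(int(t), len(cdf) - 2)
    return cdf[k] + (t - k) * (cdf[k + 1] - cdf[k])

def lemma_b2_lhs(x, v, table, m=400):
    """Midpoint approximation of the integral of |F(x+u) - F(x)| over |u| < v."""
    h = 2 * v / m
    fx = mp_cdf(x, table)
    return h * sum(abs(mp_cdf(x - v + (k + 0.5) * h, table) - fx) for k in range(m))

def lemma_b2_rhs(y, v):
    """Right-hand side of Lemma B.2 under the reading stated in the lead-in."""
    v_y = 1 - math.sqrt(y) + math.sqrt(v)
    return 11 * math.sqrt(2 * (1 + y)) / (3 * math.pi * y) * v ** 2 / v_y
```

Scanning `x` over the support for a few values of `y` and `v` and comparing `lemma_b2_lhs` with `lemma_b2_rhs` gives a quick plausibility check; it is of course no substitute for the proof in [7].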

Lemma B.3 ((1.15) in [6]). Let $A=(a_{ij})_{n\times n}$ and $B=(b_{ij})_{n\times n}$ be two nonrandom matrices. Let $X=(X_1,\ldots,X_n)$ be a random vector of independent complex entries. Assume that $EX_i=0$ and $E|X_i|^2=1$. Then we have
\[
E(X^*AX-\operatorname{tr}A)(X^*BX-\operatorname{tr}B)
= \sum_{i=1}^n \bigl(E|X_i|^4-|EX_i^2|^2-2\bigr)a_{ii}b_{ii} + |EX_i^2|^2\operatorname{tr}AB^T + \operatorname{tr}AB.
\]

Lemma B.4 (Burkholder inequalities (Lemmas 2.1 and 2.2 in [5])). Let $\{X_k\}$ be a complex martingale difference sequence with respect to the increasing $\sigma$-field $\{\mathcal{F}_k\}$, and let $E_k$ denote the conditional expectation with respect to $\mathcal{F}_k$. Then we have:

(a) for $p>1$,
\[
E\Biggl|\sum_{k=1}^n X_k\Biggr|^p \le K_p\, E\Biggl(\sum_{k=1}^n |X_k|^2\Biggr)^{p/2};
\]

(b) for $p\ge 2$,
\[
E\Biggl|\sum_{k=1}^n X_k\Biggr|^p \le K_p\Biggl( E\Biggl(\sum_{k=1}^n E_{k-1}|X_k|^2\Biggr)^{p/2} + E\sum_{k=1}^n |X_k|^p \Biggr),
\]

where $K_p$ is a constant which depends on $p$ only.

Lemma B.5 (Lemma 2.7 in [5]). Let $A=(a_{ij})$ be an $n\times n$ nonrandom matrix and $X=(X_1,\ldots,X_n)$ be a random vector of independent complex entries. Assume that $EX_i=0$, $E|X_i|^2=1$ and $E|X_i|^l\le \nu_l$. Then for any $p\ge 2$,
\[
E|X^*AX-\operatorname{tr}A|^p \le K_p\bigl( (\nu_4\operatorname{tr}(AA^*))^{p/2} + \nu_{2p}\operatorname{tr}(AA^*)^{p/2} \bigr),
\]
where $K_p$ is a constant depending on $p$ only.

Lemma B.6 (Lemma 2.6 in [24]). Let $z\in\mathbb{C}^+$ with $v=\Im(z)$, $A$ and $B$ $n\times n$ with $B$ Hermitian, $\tau\in\mathbb{R}$, and $q\in\mathbb{C}^n$. Then
\[
\bigl|\operatorname{tr}\bigl((B-zI_n)^{-1}-(B+\tau qq^*-zI_n)^{-1}\bigr)A\bigr| \le \frac{\|A\|}{v},
\]
where $\|A\|$ denotes the spectral norm on matrices.

Lemma B.7 ((8.4.9) in [7]). For the Stieltjes transform of the Marčenko-Pastur distribution, we have
\[
|m_y(z)| \le \frac{\sqrt{2}}{\sqrt{y_n}\,v_y},
\]
where $v_y=\sqrt{a}+\sqrt{v}=1-\sqrt{y_n}+\sqrt{v}$.
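The bound in Lemma B.7 can be explored numerically. The sketch below is an illustration only: it assumes the closed form recalled in the proof of Lemma A.3, $m_y(z) = \bigl(1-y-z+\sqrt{(1-y-z)^2-4yz}\bigr)/(2yz)$ with the square root taken to have positive imaginary part, and it reads the bound as $\sqrt{2}/(\sqrt{y}\,v_y)$ with $v_y = 1-\sqrt{y}+\sqrt{v}$. The function names `mp_stieltjes` and `b7_bound` are hypothetical helpers introduced here.

```python
import cmath
import math

def mp_stieltjes(z, y):
    """Closed-form Stieltjes transform m_y(z) of the MP law (Im z > 0, 0 < y < 1)."""
    disc = (1 - y - z) ** 2 - 4 * y * z
    root = cmath.sqrt(disc)
    if root.imag < 0:
        # Use the square root with positive imaginary part, as in the paper.
        root = -root
    return (1 - y - z + root) / (2 * y * z)

def b7_bound(v, y):
    """Bound of Lemma B.7 as read above: sqrt(2) / (sqrt(y) * v_y)."""
    v_y = 1 - math.sqrt(y) + math.sqrt(v)
    return math.sqrt(2) / (math.sqrt(y) * v_y)

if __name__ == "__main__":
    y = 0.5
    for v in (1e-3, 1e-2, 1e-1, 1.0):
        z = complex((1 - math.sqrt(y)) ** 2, v)  # point above the left edge a
        print(v, abs(mp_stieltjes(z, y)), b7_bound(v, y))
```

A useful internal check is that the value returned by `mp_stieltjes` solves the quadratic $yz\,m^2+(z+y-1)\,m+1=0$ and has positive imaginary part, which is what the branch choice is meant to guarantee.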

Lemma B.8 (Lemma 8.20 in [7]). If $|z|<A$, $v^2v_y\ge C_0N^{-1}$ and $l\ge 1$, then
\[
E|m_n(z)-Em_n(z)|^{2l} \le \frac{C}{N^{2l}v^{4l}y_n^{2l}}\Bigl(\Delta+\frac{v}{v_y}\Bigr)^l,
\]
where $A$ is a positive constant, $v_y=1-\sqrt{y_n}+\sqrt{v}$ and $\Delta := \|EF^{S_n}-F_{y_n}\|$.

Lemma B.9 (Lemma 9.1 in [7]). Suppose that $X_i$, $i=1,\ldots,n$, are independent, with $EX_i=0$, $E|X_i|^2=1$, $\sup E|X_i|^4=\nu<\infty$ and $|X_i|\le \eta\sqrt{n}$ with $\eta>0$. Assume that $A$ is a complex matrix. Then for any given $p$ such that $2\le p\le b\log(n\nu^{-1}\eta^4)$ and $b>1$, we have
\[
E|\alpha^*A\alpha-\operatorname{tr}(A)|^p \le \nu n^p (n\eta^4)^{-1}(40b^2\|A\|\eta^2)^p,
\]
where $\alpha=(X_1,\ldots,X_n)^T$.

Lemma B.10 (Theorem 8.10 in [7]). Let $S_n=XX^*/N$, where $X=(X_{ij}(n))_{n\times N}$. Assume that the following conditions hold:

(1) For each $n$, $X_{ij}(n)$ are independent,
(2) $EX_{ij}(n)=0$, $E|X_{ij}(n)|^2=1$, for all $i,j$,
(3) $\sup_n\sup_{i,j}E|X_{ij}(n)|^6<\infty$.

Then we have
\[
\Delta := \|EF^{S_n}-F_{y_n}\| =
\begin{cases}
O(N^{-1/2}a^{-1}), & \text{if } a>N^{-1/3},\\
O(N^{-1/6}), & \text{otherwise},
\end{cases}
\]
where $y_n=n/N$ and $a$ is defined in the Marčenko-Pastur distribution.

Lemma B.11 (Theorem 5.11 in [7]). Assume that $\{X_{ij}\}$ is a double array of i.i.d. complex random variables with mean zero, variance $\sigma^2$ and finite 4th moment. Let $X=(X_{ij})_{n\times N}$ be the $n\times N$ matrix of the upper-left corner of the double array. If $n/N\to y\in(0,1)$, then, with probability one, we have
\[
\lim_{n\to\infty}\lambda_{\min}(S_n)=\sigma^2(1-\sqrt{y})^2
\quad\text{and}\quad
\lim_{n\to\infty}\lambda_{\max}(S_n)=\sigma^2(1+\sqrt{y})^2.
\]

Lemma B.12 (Theorem 5.9 in [7]). Suppose that the entries of the matrix $X=(X_{ij})_{n\times N}$ are independent (not necessarily identically distributed) and satisfy:

(1) $EX_{ij}=0$,
(2) $|X_{ij}|\le \sqrt{N}\delta_N$,
(3) $\max_{ij}|E|X_{ij}|^2-\sigma^2|\to 0$ as $N\to\infty$ and
(4) $E|X_{ij}|^l\le b(\sqrt{N}\delta_N)^{l-3}$ for all $l\ge 3$,

where $\delta_N\to 0$ and $b>0$. Let $S_n=XX^*/N$. Then, for any $x>\epsilon>0$, $n/N\to y$, and fixed integer $l\ge 2$, we have
\[
P\bigl(\lambda_{\max}(S_n)\ge \sigma^2(1+\sqrt{y})^2+x\bigr) \le CN^{-l}\bigl(\sigma^2(1+\sqrt{y})^2+x-\epsilon\bigr)^{-l}
\]
for some constant $C>0$.

APPENDIX C

Note that the data matrix $X=(X_{ij})_{n\times N}$ consists of i.i.d. complex random variables with mean 0 and variance 1. In what follows, we will further assume that every $|X_{ij}|$ is bounded by $\eta_NN^{1/4}$ for some carefully selected $\eta_N$. The proofs presented in the following steps jointly justify such a convenient assumption.

C.1 Truncation for Theorem 1.1. Choose $\eta_N\to 0$ and $\eta_NN^{1/4}\to\infty$ as $N\to\infty$ such that
\[
\lim_{N\to\infty}\eta_N^{-10}E|X_{11}|^{10}I(|X_{11}|>\eta_NN^{1/4})=0.
\]
Let $\hat X_n$ denote the truncated data matrix whose entry on the $i$th row and $j$th column is $X_{ij}I(|X_{ij}|\le\eta_NN^{1/4})$, $i=1,\ldots,n$, $j=1,\ldots,N$. Define $\hat S_n=\hat X_n\hat X_n^*/N$. Then
\[
P(S_n\ne\hat S_n) \le nN\,P(|X_{ij}|>\eta_NN^{1/4}) \le nN^{-3/2}\eta_N^{-10}E|X_{11}|^{10}I(|X_{11}|>\eta_NN^{1/4}) = o(N^{-1/2}).
\]

C.2 Truncation for Theorems 1.6 and 1.8. Choose $\eta_N\to 0$ and $\eta_NN^{1/4}\to\infty$ as $N\to\infty$ such that
\[
\lim_{N\to\infty}\eta_N^{-8}E|X_{11}|^{8}I(|X_{11}|>\eta_NN^{1/4})=0. \tag{C.1}
\]
Let $\hat X_n$ denote the truncated data matrix whose entry on the $i$th row and $j$th column is $X_{ij}I(|X_{ij}|\le\eta_NN^{1/4})$, $i=1,\ldots,n$, $j=1,\ldots,N$. Define $\hat S_n=\hat X_n\hat X_n^*/N$. Then
\[
P(S_n\ne\hat S_n,\ \text{i.o.}) = \lim_{k\to\infty}P\Biggl(\bigcup_{N=k}^{\infty}\bigcup_{i=1}^{n}\bigcup_{j=1}^{N}\{|X_{ij}|>\eta_NN^{1/4}\}\Biggr)
\]

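\[
\begin{aligned}
&= \lim_{k\to\infty}P\Biggl(\bigcup_{t=k}^{\infty}\ \bigcup_{N\in[2^t,2^{t+1})}\ \bigcup_{i=1}^{n}\bigcup_{j=1}^{N}\{|X_{ij}|>\eta_NN^{1/4}\}\Biggr)\\
&\le \lim_{k\to\infty}P\Biggl(\bigcup_{t=k}^{\infty}\ \bigcup_{i=1}^{(y_n+1)2^{t+1}}\ \bigcup_{j=1}^{2^{t+1}}\{|X_{ij}|>\eta_{2^t}2^{t/4}\}\Biggr)\\
&\le C\lim_{k\to\infty}\sum_{t=k}^{\infty}(2^{t+1})^2P(|X_{11}|>\eta_{2^t}2^{t/4})\\
&\le C\lim_{k\to\infty}\sum_{t=k}^{\infty}4^t\sum_{l=t}^{\infty}P\bigl(\eta_{2^l}2^{l/4}<|X_{11}|\le\eta_{2^{l+1}}2^{(l+1)/4}\bigr)\\
&= C\lim_{k\to\infty}\sum_{l=k}^{\infty}\sum_{t=k}^{l}4^tP\bigl(\eta_{2^l}2^{l/4}<|X_{11}|\le\eta_{2^{l+1}}2^{(l+1)/4}\bigr)\\
&\le \lim_{k\to\infty}C\sum_{l=k}^{\infty}\eta_{2^l}^{-8}E|X_{11}|^8I\bigl(\eta_{2^l}2^{l/4}<|X_{11}|\le\eta_{2^{l+1}}2^{(l+1)/4}\bigr)\\
&= 0.
\end{aligned}
\]
The last equality is due to (C.1).

C.3 Centralization. The centralization procedures for the three theorems are identical and only the 8th moment is required, so we treat them uniformly. Let $\tilde X_n$ denote the centralized version of $\hat X_n$. More explicitly, on the $i$th row and $j$th column of $\tilde X_n$, the entry is
\[
X_{ij}I(|X_{ij}|\le\eta_NN^{1/4})-E\bigl(X_{ij}I(|X_{ij}|\le\eta_NN^{1/4})\bigr).
\]
Notice that according to Theorem 3.1 of [26], $\|(S_n-zI_n)^{-1}\|$ is bounded by $1/v$, where $\|\cdot\|$ denotes the spectral norm for a matrix. Define $\tilde S_n=\tilde X_n\tilde X_n^*/N$. Supposing that $v\ge C_0N^{-1/2}$, we obtain
\[
\begin{aligned}
|m^H_{\hat S_n}(z)-m^H_{\tilde S_n}(z)|
&= |x_n^*(\hat S_n-zI_n)^{-1}x_n-x_n^*(\tilde S_n-zI_n)^{-1}x_n|\\
&\le \|(\hat S_n-zI_n)^{-1}\|\,\|\hat S_n-\tilde S_n\|\,\|(\tilde S_n-zI_n)^{-1}\|\\
&\le \frac{1}{v^2}\|\hat S_n-\tilde S_n\|\\
&\le \frac{1}{Nv^2}\bigl(\|\hat X_n\|\,\|\hat X_n^*-\tilde X_n^*\|+\|\hat X_n-\tilde X_n\|\,\|\tilde X_n^*\|\bigr)\\
&\le \frac{C}{\sqrt{N}v^2}\|\hat X_n-\tilde X_n\|\quad\text{a.s. (by Lemma B.12)}
\end{aligned}
\]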

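\[
\begin{aligned}
&= \frac{C}{\sqrt{N}v^2}\bigl|E\{X_{11}I(|X_{11}|\le\eta_NN^{1/4})\}\bigr|\sqrt{nN}\\
&\le \frac{C\sqrt{N}}{v^2}\eta_N^{-7}N^{-7/4}E\bigl(|X_{11}|^8I(|X_{11}|>\eta_NN^{1/4})\bigr)\\
&= o(N^{-1/4}).
\end{aligned}
\]
To establish both the weak and the strong convergence rates of the VESD to the Marčenko-Pastur distribution, this $o(N^{-1/4})$ suffices. For the convergence rate presented in Theorem 1.1, a sharper estimate is needed, which we prove as follows. Let $m_y(z)$ denote the Stieltjes transform of the Marčenko-Pastur distribution; by Lemma B.7, $|m_y(z)|$ is bounded by
\[
\frac{\sqrt{2}}{\sqrt{y_n}(1-\sqrt{y_n}+\sqrt{v})}\le\frac{C}{\sqrt{v}}.
\]
Then
\[
|Em^H_n(z)| \le |Em^H_n(z)-m_y(z)|+|m_y(z)| \le C|m_y(z)| \le \frac{C}{\sqrt{v}}.
\]
Besides, $x_n^*(\hat S_n-zI_n)^{-1}x_n$ can be considered as the Stieltjes transform of some VESD function. So, we have
\[
E\|x_n^*(\hat S_n-zI_n)^{-1}\|^2 = v^{-1}E\,\Im\bigl(x_n^*(\hat S_n-zI_n)^{-1}x_n\bigr) \le Cv^{-3/2}.
\]
Thus,
\[
\begin{aligned}
E|m^H_{\hat S_n}(z)-m^H_{\tilde S_n}(z)|
&\le E\,\|\hat S_n-\tilde S_n\|\,\|x_n^*(\hat S_n-zI_n)^{-1}\|\,\|(\tilde S_n-zI_n)^{-1}x_n\|\\
&\le C\sqrt{N}\eta_N^{-7}N^{-7/4}E\bigl(|X_{11}|^8I(|X_{11}|>\eta_NN^{1/4})\bigr)
\bigl(E\|x_n^*(\hat S_n-zI_n)^{-1}\|^2\bigr)^{1/2}\bigl(E\|(\tilde S_n-zI_n)^{-1}x_n\|^2\bigr)^{1/2}\\
&\le C\sqrt{N}v^{-3/2}\eta_N^{-7}N^{-7/4}E\bigl(|X_{11}|^8I(|X_{11}|>\eta_NN^{1/4})\bigr)\\
&= o(N^{-1/2}).
\end{aligned}
\]

C.4 Rescaling. The rescaling procedures for the three theorems are exactly the same, and only the 8th moment is required. Thus, we treat them uniformly. Write $Y_n=\tilde X_n/\sigma$, where
\[
\sigma^2 = E\bigl|X_{11}I(|X_{11}|\le\eta_NN^{1/4})-E\bigl(X_{11}I(|X_{11}|\le\eta_NN^{1/4})\bigr)\bigr|^2.
\]
Notice that $\sigma$ tends to 1 as $N$ goes to $\infty$. Define $G_n=Y_nY_n^*/N$, which is the sample covariance matrix of $Y_n$. We shall show that $G_n$ and $\tilde S_n$ are asymptotically equivalent, that is, the VESDs of $G_n$ and $\tilde S_n$ have the same limit if either limit exists. For $v\ge C_0N^{-1/2}$,
\[
|m^H_{G_n}(z)-m^H_{\tilde S_n}(z)| = |x_n^*(\tilde S_n-zI_n)^{-1}(\tilde S_n-G_n)(G_n-zI_n)^{-1}x_n|
\]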

\[
\begin{aligned}
&\le \frac{1}{v^2}\,|1-\sigma^{-2}|\,\|\tilde S_n\|\quad\text{(see Lemma B.12)}\\
&\le \frac{C}{v^2}(1-\sigma^2)\quad\text{a.s.}\\
&\le Cv^{-2}\eta_N^{-6}N^{-3/2}E\bigl(|X_{11}|^8I(|X_{11}|>\eta_NN^{1/4})\bigr)\\
&= o(N^{-1/2}).
\end{aligned}
\]
Hence, we shall without loss of generality assume that every $|X_{ij}|$ is bounded by $\eta_NN^{1/4}$, and every $X_{ij}$ has mean 0 and variance 1.

REFERENCES

[1] Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Ann. Inst. Statist. Math.
[2] Bai, Z. D. (1993). Convergence rate of expected spectral distributions of large random matrices. I. Wigner matrices. Ann. Probab. MR1217559
[3] Bai, Z. D. (1993). Convergence rate of expected spectral distributions of large random matrices. II. Sample covariance matrices. Ann. Probab. MR1217560
[4] Bai, Z. D., Miao, B. Q. and Pan, G. M. (2007). On asymptotics of eigenvectors of large sample covariance matrix. Ann. Probab.
[5] Bai, Z. D. and Silverstein, J. W. (1998). No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices. Ann. Probab. MR1617051
[6] Bai, Z. D. and Silverstein, J. W. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab.
[7] Bai, Z. D. and Silverstein, J. W. (2010). Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, New York.
[8] Bai, Z. D. and Xia, N. (2013). Functional CLT of eigenvectors for large sample covariance matrices. Statist. Papers. To appear.
[9] Cai, T., Ma, Z. M. and Wu, Y. H. (2013). Sparse PCA: Optimal rates and adaptive estimation. Available at arXiv:1211.1309.
[10] Erdős, L., Schlein, B. and Yau, H.-T. (2009). Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann. Probab.
[11] Götze, F. and Tikhomirov, A. (2004). Rate of convergence in probability to the Marchenko-Pastur law. Bernoulli.
[12] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. MR1863961
[13] Knowles, A. and Yin, J. (2013). Eigenvector distribution of Wigner matrices. Probab. Theory Related Fields.
[14] Loève, M. (1977). Probability Theory. I, 4th ed. Graduate Texts in Mathematics 45. Springer, New York. MR0651017
[15] Ma, Z. M. (2013). Sparse principal component analysis and iterative thresholding. Ann. Statist.
[16] Marčenko, V. A. and Pastur, L. A. (1967). Distribution of eigenvalues for some sets of random matrices. Math. USSR-Sb.
[17] Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statist. Sinica.
[18] Pillai, N. S. and Yin, J. (2013). Universality of covariance matrices. Available at arXiv:1110.2501v6.
[19] Silverstein, J. W. (1981). Describing the behavior of eigenvectors of random matrices using sequences of measures on orthogonal groups. SIAM J. Math. Anal.
[20] Silverstein, J. W. (1984). Some limit theorems on the eigenvectors of large-dimensional sample covariance matrices. J. Multivariate Anal.
[21] Silverstein, J. W. (1989). On the eigenvectors of large-dimensional sample covariance matrices. J. Multivariate Anal. 30 1-16.
[22] Silverstein, J. W. (1990). Weak convergence of random functions defined by the eigenvectors of sample covariance matrices. Ann. Probab.
[23] Silverstein, J. W. (1995). Strong convergence of the empirical distribution of eigenvalues of large-dimensional random matrices. J. Multivariate Anal.
[24] Silverstein, J. W. and Bai, Z. D. (1995). On the empirical distribution of eigenvalues of a class of large-dimensional random matrices. J. Multivariate Anal.
[25] Tao, T. and Vu, V. (2011). Universal properties of eigenvectors. Available at arXiv:1103.2801v2.
[26] Yin, Y. Q., Bai, Z. D. and Krishnaiah, P. R. (1988). On the limit of the largest eigenvalue of the large-dimensional sample covariance matrix. Probab. Theory Related Fields.

N. Xia
Z. D. Bai
KLAS and School of Mathematics and Statistics
Northeast Normal University
Changchun
China
and
Department of Statistics and Applied Probability
National University of Singapore
Singapore 117546
Singapore
xiann664@gmail.com
baizd@nenu.edu.cn

Y. Qin
Department of Statistics and Actuarial Science
University of Waterloo
Waterloo, ON N2L 3G1
Canada
yingliqin@uwaterloo.ca
