arxiv: v1 [math.pr] 13 Aug 2007


The Annals of Probability 2007, Vol. 35, No. 4, 1532-1572. DOI: 10.1214/009117906000001079. © Institute of Mathematical Statistics, 2007. arXiv:0708.1720v1 [math.PR] 13 Aug 2007

ON ASYMPTOTICS OF EIGENVECTORS OF LARGE SAMPLE COVARIANCE MATRIX

By Z. D. Bai,$^{1,2}$ B. Q. Miao$^{3}$ and G. M. Pan$^{2,3}$

Northeast Normal University, National University of Singapore and University of Science and Technology of China

Let $\{X_{ij}\}$, $i,j=1,2,\ldots$, be a double array of i.i.d. complex random variables with $EX_{11}=0$, $E|X_{11}|^2=1$ and $E|X_{11}|^4<\infty$, and let $A_n=\frac{1}{N}T_n^{1/2}X_nX_n^*T_n^{1/2}$, where $T_n^{1/2}$ is the square root of a nonnegative definite matrix $T_n$ and $X_n$ is the $n\times N$ matrix of the upper-left corner of the double array. The matrix $A_n$ can be considered as a sample covariance matrix of an i.i.d. sample from a population with mean zero and covariance matrix $T_n$, or as a multivariate $F$ matrix if $T_n$ is the inverse of another sample covariance matrix. To investigate the limiting behavior of the eigenvectors of $A_n$, a new form of empirical spectral distribution is defined with weights defined by eigenvectors and it is then shown that this has the same limiting spectral distribution as the empirical spectral distribution defined by equal weights. Moreover, if $\{X_{ij}\}$ and $T_n$ are either real or complex and some additional moment assumptions are made, then linear spectral statistics defined by the eigenvectors of $A_n$ are proved to have Gaussian limits, which suggests that the eigenvector matrix of $A_n$ is nearly Haar distributed when $T_n$ is a multiple of the identity matrix, an easy consequence for a Wishart matrix.

Received June 2005; revised March 2006.
$^1$ Supported by NSFC Grant 10571020.
$^2$ Supported in part by NUS Grant R-155-000-056-112.
$^3$ Supported by Grants 10471135 and 10571001 from the National Natural Science Foundation of China.
AMS 2000 subject classifications. Primary 15A52, 60F15, 62E20; secondary 60F17, 62H99.
Key words and phrases. Asymptotic distribution, central limit theorems, CDMA, eigenvectors and eigenvalues, empirical spectral distribution function, Haar distribution, MIMO, random matrix theory, sample covariance matrix, SIR, Stieltjes transform, strong convergence.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Probability, 2007, Vol. 35, No. 4, 1532-1572. This reprint differs from the original in pagination and typographic detail.

1. Introduction. Let $X_n=(X_{ij})$ be an $n\times N$ matrix of i.i.d. complex random variables and let $T_n$ be an $n\times n$ nonnegative definite Hermitian matrix with a square root $T_n^{1/2}$. In this paper, we shall consider the matrix
$$A_n=\frac{1}{N}T_n^{1/2}X_nX_n^*T_n^{1/2}.$$
If $T_n$ is nonrandom, then $A_n$ can be considered as a sample covariance matrix of a sample drawn from a population with the same distribution as $T_n^{1/2}X_{\cdot 1}$, where $X_{\cdot 1}=(X_{11},\ldots,X_{n1})'$. If $T_n$ is an inverse of another sample covariance matrix, then the multivariate $F$ matrix can be considered as a special case of the matrix $A_n$.

In this paper, we consider the case where both dimension $n$ and sample size $N$ are large. Bai and Silverstein [7] gave an example demonstrating the considerable difference between the case where $n$ is fixed and that where $n$ increases with $N$ proportionally. When $T_n=I$, $A_n$ reduces to the usual sample covariance matrix of $N$ $n$-dimensional random vectors with mean 0 and covariance matrix $I$. An important statistic in multivariate analysis is
$$W_n=\ln(\det A_n)=\sum_{j=1}^{n}\ln(\lambda_j),$$
where $\lambda_j$, $j=1,\ldots,n$, are the eigenvalues of $A_n$. When $n$ is fixed, by taking a Taylor expansion of $\ln(1+x)$, one can easily prove that
$$\sqrt{\frac{N}{n}}\,W_n\xrightarrow{D}N(0,EX_{11}^4-1).$$
It appears that when $n$ is fixed, this distribution can be used to test the hypothesis of variance homogeneity. However, this is not the case when $n$ increases as $[cN]$ (the integer part of $cN$) with $c\in(0,1)$. Using results on the limiting spectral distribution of $A_n$ (see [2] or []), one can show that, with probability one,
$$\frac{1}{n}W_n\to\frac{c-1}{c}\ln(1-c)-1\equiv d(c)<0,$$
which implies that
$$\sqrt{\frac{N}{n}}\,W_n\sim d(c)\sqrt{Nn}\to-\infty.$$
More precisely, the distribution of $W_n$ shifts to the left quickly when $n$ increases as $n\sim cN$. Figure 1 gives the kernel density estimates using 1000 realizations of $W_n$ for $N=20$, $100$, $200$ and $500$ with $n=0.2N$. Figures 2 and 3 give the kernel density estimates of $\sqrt{N/n}\,W_n$ for the cases $n=5$ and $n=10$ with $N=50$. These figures clearly show that the distribution of $W_n$ cannot be approximated by a centered normal distribution even if the ratio $c$ is as small as 0.1.
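The leftward drift of $W_n$ described above can be reproduced with a short simulation. The following is only a sketch under assumptions that are ours, not the paper's: real standard Gaussian entries, $T_n=I$, 200 replications per sample size, and the helper names `d` and `sample_W`, which are hypothetical.

```python
import numpy as np

def d(c):
    # Almost-sure limit of W_n / n stated in the text (T_n = I, 0 < c < 1).
    return (c - 1.0) / c * np.log(1.0 - c) - 1.0

def sample_W(n, N, reps, rng):
    # Draw `reps` realizations of W_n = ln det A_n, A_n = X X' / N,
    # with X an n x N matrix of real standard Gaussian entries (assumption).
    out = np.empty(reps)
    for r in range(reps):
        X = rng.standard_normal((n, N))
        _, logdet = np.linalg.slogdet(X @ X.T / N)
        out[r] = logdet
    return out

rng = np.random.default_rng(0)
c = 0.2
for N in (20, 100, 200, 500):
    n = round(c * N)
    W = sample_W(n, N, 200, rng)
    # The sample mean of W_n / n should approach d(c) as N grows,
    # so W_n itself drifts to the left at rate n * d(c).
    print(N, n, W.mean() / n, d(c))
```

The printed columns compare the Monte Carlo mean of $W_n/n$ with $d(c)$; a kernel density estimate of `W` for each $N$ would give a picture analogous to Figure 1.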
This phenomenon motivates the development of the theory of spectral analysis of large-dimensional random matrices, which is simply called random matrix theory (RMT). In this theory, for a square matrix $A$ of real eigenvalues, its empirical spectral distribution (ESD) $F^A$ is defined as the empirical distribution generated by its eigenvalues. The limiting properties of the ESD of sample covariance matrices have been intensively investigated in the literature and the reader is referred to [1, 5, 6, 7, 11, 12, 13, 17, 20, 21, 23].

Fig. 1. Density of $W_n$ under different sample sizes, $c=0.2$.

An important mathematical tool in RMT is the Stieltjes transform, which is defined by
$$m_G(z)=\int\frac{1}{\lambda-z}\,dG(\lambda),\qquad z\in\mathbb{C}^+\equiv\{z\in\mathbb{C},\ \Im z>0\},$$
for any distribution function $G(x)$. It is well known that $G_n\xrightarrow{v}G$ if and only if $m_{G_n}(z)\to m_G(z)$ for all $z\in\mathbb{C}^+$.

In [3], it is assumed that:
(i) for all $n,i,j$, $X_{ij}$ are independently and identically distributed with $EX_{ij}=0$ and $E|X_{ij}|^2=1$;
(ii) $F^{T_n}\xrightarrow{D}H$, a proper distribution function;
(iii) $\frac{n}{N}\to c>0$ as $n\to\infty$.
It is then proved that, with probability 1, $F^{A_n}$ converges to a nonrandom distribution function $F^{c,H}$ whose Stieltjes transform $m(z)=m_{F^{c,H}}(z)$, for each $z\in\mathbb{C}^+$, is the unique solution in $\mathbb{C}^+$ of the equation
$$m(z)=\int\frac{1}{t(1-c-czm)-z}\,dH(t).\tag{1.1}$$

Fig. 2. Probability density of $\sqrt{N/n}\,W_n$ ($n=5$, $N=50$).

Let $\underline{A}_n=\frac{1}{N}X_n^*T_nX_n$. The spectrum of $\underline{A}_n$ differs from that of $A_n$ only by $|n-N|$ zero eigenvalues. Hence, we have
$$F^{\underline{A}_n}=\left(1-\frac{n}{N}\right)I_{[0,\infty)}+\frac{n}{N}F^{A_n}.$$
It then follows that
$$\underline{m}_n(z)=m_{F^{\underline{A}_n}}(z)=-\frac{1-n/N}{z}+\frac{n}{N}\,m_{F^{A_n}}(z)\tag{1.2}$$
and, correspondingly for their limits,
$$\underline{m}(z)=m_{\underline{F}^{c,H}}(z)=-\frac{1-c}{z}+c\,m(z),\tag{1.3}$$
where $\underline{F}^{c,H}$ is the limiting empirical distribution function of $\underline{A}_n$.
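Equations (1.1) and (1.2) can be checked numerically in the simplest case. The sketch below assumes $T_n=I$ (so $H$ is a point mass at 1, and (1.1) reduces to the quadratic $czm^2+(z+c-1)m+1=0$) and real Gaussian entries; the function name `mp_stieltjes` and all parameter values are illustrative choices, not part of the paper.

```python
import numpy as np

def mp_stieltjes(z, c):
    # For H = point mass at 1, (1.1) reads m = 1 / (1 - c - c*z*m - z),
    # i.e. c*z*m^2 + (z + c - 1)*m + 1 = 0. Pick the root in the upper half-plane.
    roots = np.roots([c * z, z + c - 1.0, 1.0])
    return roots[np.argmax(roots.imag)]

rng = np.random.default_rng(0)
n, N = 300, 600
c = n / N
z = 1.0 + 0.5j

X = rng.standard_normal((n, N))
lam = np.linalg.eigvalsh(X @ X.T / N)        # eigenvalues of A_n (n of them)
lam_under = np.linalg.eigvalsh(X.T @ X / N)  # eigenvalues of underline A_n (N of them)

m_emp = np.mean(1.0 / (lam - z))              # Stieltjes transform of F^{A_n}
m_under_emp = np.mean(1.0 / (lam_under - z))  # Stieltjes transform of F^{underline A_n}
m_lim = mp_stieltjes(z, c)                    # solution of (1.1)

print(abs(m_emp - m_lim))                                   # small for large n
print(abs(m_under_emp - (-(1 - c) / z + c * m_emp)))        # relation (1.2), exact up to rounding
```

The second printed quantity tests (1.2), which is an exact finite-sample identity, while the first compares the empirical transform with the limit and only vanishes asymptotically.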

Fig. 3. Probability density of $\sqrt{N/n}\,W_n$ ($n=10$, $N=50$).

Using this notation, equation (1.1) can be converted to equation (1.4) for $\underline{m}(z)$. That is, $\underline{m}(z)$ is the unique solution in $\mathbb{C}^+$ of the equation
$$\underline{m}=-\left(z-c\int\frac{t\,dH(t)}{1+t\underline{m}}\right)^{-1}.\tag{1.4}$$
From this equation, the inverse function has the explicit form
$$z=-\frac{1}{\underline{m}}+c\int\frac{t\,dH(t)}{1+t\underline{m}}.\tag{1.5}$$

The limiting properties of the eigenvalues of $A_n$ have been intensively investigated in the literature. Among others, we shall now briefly mention some remarkable ones. Yin, Bai and Krishnaiah [22] established the limiting value of the largest eigenvalue, while Bai and Yin [2] employed a unified approach to obtain the limiting value for the smallest and largest eigenvalues of $A_n$ when $T_n=I$. A breakthrough on the convergence rate of the ESD of a sample covariance matrix was made in [3] and [4]. In [5], it is shown that, with probability 1, no eigenvalues of $A_n$ appear in any interval $[a,b]$ which is contained in an open interval outside the supports of $F^{c_n,H_n}$ for all large $N$ under the condition of finite 4th moment (here, $c_n=n/N$ and $H_n$ is the ESD of $T_n$).

However, relatively less work has been done on the limiting behavior of eigenvectors of $A_n$. Some results on this aspect can be found in [4, 5, 6]. That more attention has been paid to the ESD of the sample covariance matrix may be due to the origins of RMT, which lie with quantum mechanics (QM), where the eigenvalues of large-dimensional random matrices are used to describe energy levels of particles. With the application of RMT to many other areas, such as statistics, wireless communications (for example, CDMA (code division multiple access) systems and MIMO (multiple input multiple output) systems), finance and economics, the importance of the limiting behavior of eigenvectors has been gradually recognized. For example, in signal processing, for signals received by linearly spaced sensors, the estimates of the directions of arrivals (DOA) are based on the noise eigenspace. In principal component analysis or factor analysis, the directions of the principal components are the eigenvectors corresponding to the largest eigenvalues.

Now, let us consider another example of the application of eigenvectors of a large covariance matrix $A_n$ which is important in wireless communications. Consider transmission methods in wireless systems. In a direct sequence CDMA system, the discrete-time model for a synchronous system is formulated as
$$r=\sum_{k=1}^{K}b_ks_k+w,$$
where $b_k$ $(\in\mathbb{C})$ and $s_k$ $(\in\mathbb{C}^N)$ are the transmitted data symbol and signature sequence of user $k$, respectively, and $w$ is an $N$-dimensional background Gaussian noise of i.i.d. variables with mean zero and variance $\sigma^2$. The goal is to demodulate the transmitted $b_k$ for each user. In this case, the performance measure is defined as the signal-to-interference ratio (SIR) of the estimates.
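To make the synchronous CDMA model concrete, here is a small simulation of $r=\sum_k b_ks_k+w$ followed by a decorrelator-type demodulation. Everything specific in it is an assumption made for illustration: real binary symbols, random $\pm1/\sqrt{N}$ signatures, the values of $K$, $N$ and $\sigma$, and the textbook decorrelator $\hat b=(S^*S)^{-1}S^*r$ with its standard per-user SIR $1/(\sigma^2[(S^*S)^{-1}]_{kk})$. It is not the expression derived in the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 8, 32          # number of users and signature length (illustrative)
sigma = 0.1           # noise standard deviation (illustrative)

# Columns of S are the signature sequences s_k, normalized to unit length.
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)
b = rng.choice([-1.0, 1.0], size=K)       # transmitted symbols b_k
w = sigma * rng.standard_normal(N)        # background Gaussian noise
r = S @ b + w                             # received vector r = sum_k b_k s_k + w

# Decorrelator: remove multi-user interference by inverting the Gram matrix.
gram = S.T @ S
b_hat = np.linalg.solve(gram, S.T @ r)

# Residual noise variance per user and the resulting SIR for unit-power symbols.
noise_var = sigma**2 * np.diag(np.linalg.inv(gram))
sir = 1.0 / noise_var
print(np.sign(b_hat), b, sir)
```

Since $(S^*S)^{-1}$ is a function of the eigenvalues and eigenvectors of the Gram matrix of random signatures, the SIR above is exactly the kind of quantity whose large-system behavior depends on both spectra and eigenvectors.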
In a large network, since the number of users is very large, it is reasonable to assume that $K$ is proportional to $N$. That is, one can assume that their ratio remains constant when both $K$ and $N$ tend to infinity. Thus, it is feasible to apply the theory of large-dimensional random matrices to wireless communications and, indeed, a fruitful literature has already accumulated in this direction (see, e.g., [8] and [9], among others). Eldar and Chen [10] derived an expression of SIR for the decorrelator receiver in terms of eigenvectors and eigenvalues of random matrices and then analyzed the asymptotics of the SIR (see [10] for details).

Our research is motivated by the fact that the matrix of eigenvectors (eigenmatrix for short) of the Wishart matrix has the Haar distribution, that is, the uniform distribution over the group of unitary matrices (or orthogonal matrices in the real case). It is conceivable that the eigenmatrix of a large sample covariance matrix should be asymptotically Haar distributed. However, we are facing a problem on how to formulate the terminology "asymptotically Haar distributed" because the dimensions of the eigenmatrices are increasing. In this paper, we shall adopt the method of Silverstein [4, 5]. If $U$ has a Haar measure over the orthogonal matrices, then for any unit vector $x\in\mathbb{R}^n$, $y=Ux=(y_1,\ldots,y_n)'$ has a uniform distribution over the unit sphere $S_n=\{x\in\mathbb{R}^n;\ \|x\|=1\}$. If $z=(z_1,\ldots,z_n)'\sim N(0,I_n)$, then $y$ has the same distribution as $z/\|z\|$. Now, define a stochastic process $Y_n(t)$ in the space $D(0,1)$ by
$$Y_n(t)=\sqrt{\frac{n}{2}}\sum_{i=1}^{[nt]}\left(|y_i|^2-\frac{1}{n}\right)\stackrel{d}{=}\sqrt{\frac{n}{2}}\,\frac{1}{\|z\|^2}\sum_{i=1}^{[nt]}\left(|z_i|^2-\frac{\|z\|^2}{n}\right),\tag{1.6}$$
where $[a]$ denotes the greatest integer $\le a$. From the second equality, it is easy to see that $Y_n(t)$ converges to a Brownian bridge (BB) $B(t)$ when $n$ converges to infinity. Thus we are interested in whether the same is true for general sample covariance matrices.

Let $U_n\Lambda_nU_n^*$ denote the spectral decomposition of $A_n$, where $\Lambda_n=\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)$ and $U_n=(u_{ij})$ is a unitary matrix consisting of the orthonormal eigenvectors of $A_n$. Assume that $x_n\in\mathbb{C}^n$, $\|x_n\|=1$, is an arbitrary nonrandom unit vector and that $y=(y_1,y_2,\ldots,y_n)'=U_n^*x_n$. We define a stochastic process by way of (1.6). If $U_n$ is asymptotically Haar distributed, then $y$ should be asymptotically uniformly distributed over the unit sphere and $Y_n(t)$ should tend to a BB. Our main goal is to examine the limiting properties of the vector $y$ through the stochastic process $Y_n(t)$.

For ease of application of RMT, we make a time transformation in $Y_n(t)$,
$$X_n(x)=Y_n(F^{A_n}(x)),$$
where $F^{A_n}$ is the ESD of the matrix $A_n$. If the distribution of $U_n$ is close to Haar, then $X_n(x)$ should approximate $B(F^{c,H}(x))$, where $F^{c,H}$ is the limiting spectral distribution of $A_n$.

We define a new empirical distribution function based on eigenvectors and eigenvalues:
$$F_1^{A_n}(x)=\sum_{i=1}^{n}|y_i|^2I(\lambda_i\le x).\tag{1.7}$$
Recall that the ESD of $A_n$ is
$$F^{A_n}(x)=\frac{1}{n}\sum_{i=1}^{n}I(\lambda_i\le x).$$
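The two empirical distributions just defined differ only in the weights attached to the eigenvalues, which is easy to see in a simulation. The sketch below assumes real Gaussian entries, $T_n=I$ and the coordinate vector $x_n=e_1$; variable names such as `F1`, `F` and `Y_path` are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 400, 800
X = rng.standard_normal((n, N))
A = X @ X.T / N                      # A_n with T_n = I (assumption)

lam, U = np.linalg.eigh(A)           # eigenvalues ascending, eigenvectors in columns
x = np.zeros(n)
x[0] = 1.0                           # a fixed nonrandom unit vector x_n
y = U.T @ x                          # y = U_n^* x_n (real case)

w = y**2                             # eigenvector weights |y_i|^2, summing to 1
F1 = np.cumsum(w)                    # F_1^{A_n} evaluated at lambda_1 <= ... <= lambda_n
F = np.arange(1, n + 1) / n          # F^{A_n} evaluated at the same points

sup_gap = np.max(np.abs(F1 - F))     # sup-distance between the two distributions
Y_path = np.sqrt(n / 2.0) * (F1 - F) # Y_n(i/n) for i = 1, ..., n
print(w.sum(), sup_gap, Y_path[-1])
```

The path `Y_path` starts and ends at zero, as a Brownian bridge does, and `sup_gap` is of order $n^{-1/2}$, in line with the statement that the two distributions share the same limit.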

It then follows that
$$X_n(x)=\sqrt{\frac{n}{2}}\left(F_1^{A_n}(x)-F^{A_n}(x)\right).$$
The investigation of $Y_n(t)$ is then converted to one concerning the difference $F_1^{A_n}(x)-F^{A_n}(x)$ of the two empirical distributions.

It is obvious that $F_1^{A_n}(x)$ is a random probability distribution function and that its Stieltjes transform is given by
$$m_{F_1^{A_n}}(z)=x_n^*(A_n-zI)^{-1}x_n.\tag{1.8}$$
As we have seen, the difference between $F_1^{A_n}(x)$ and $F^{A_n}(x)$ is only in their different weights on the eigenvalues of $A_n$. However, it will be proven that although these two empirical distributions have different weights, they have the same limit; this is included in Theorem 1 below.

To investigate the convergence of $X_n(x)$, we consider its linear functional, which is defined as
$$\hat X_n(g)=\int g(x)\,dX_n(x),$$
where $g$ is a bounded continuous function. It turns out that
$$\hat X_n(g)=\sqrt{\frac{n}{2}}\left[\sum_{j=1}^{n}|y_j|^2g(\lambda_j)-\frac{1}{n}\sum_{j=1}^{n}g(\lambda_j)\right]=\sqrt{\frac{n}{2}}\left[\int g(x)\,dF_1^{A_n}(x)-\int g(x)\,dF^{A_n}(x)\right].$$
Proving the convergence of $\hat X_n(g)$ under general conditions is difficult. Following an idea of [7], we shall prove the central limit theorem (CLT) for those $g$ which are analytic over the support of the limiting spectral distribution of $A_n$. To this end, let
$$G_n(x)=\sqrt{N}\left(F_1^{A_n}(x)-F^{c_n,H_n}(x)\right),$$
where $c_n=\frac{n}{N}$ and where $F^{c_n,H_n}(x)$ denotes the limiting distribution obtained by substituting $c_n$ for $c$ and $H_n$ for $H$ in $F^{c,H}$.

The main results of this paper are formulated in the following three theorems.

Theorem 1. Suppose that:
(1) for each $n$, $X_{ij}=X_{ij}^n$, $i,j=1,2,\ldots$, are i.i.d. complex random variables with $EX_{11}=0$, $E|X_{11}|^2=1$ and $E|X_{11}|^4<\infty$;
(2) $x_n\in\mathbb{C}_1^n=\{x\in\mathbb{C}^n,\ \|x\|=1\}$ and $\lim_{n\to\infty}\frac{n}{N}=c\in(0,\infty)$;
(3) $T_n$ is nonrandom Hermitian nonnegative definite with its spectral norm bounded in $n$, with $H_n=F^{T_n}\xrightarrow{D}H$ a proper distribution function and with $x_n^*(T_n-zI)^{-1}x_n\to m_{F^H}(z)$, where $m_{F^H}(z)$ denotes the Stieltjes transform of $H(t)$.
It then follows that
$$F_1^{A_n}(x)\to F^{c,H}(x)\quad\text{a.s.}$$

Remark 1. The condition $x_n^*(T_n-zI)^{-1}x_n\to m_{F^H}(z)$ is critical for our Theorem 1 as well as for the main theorems which we give later. At first, we indicate that if $T_n=bI$ for some positive constant $b$ or, more generally, $\lambda_{\max}(T_n)-\lambda_{\min}(T_n)\to0$, then the condition $x_n^*(T_n-zI)^{-1}x_n\to m_{F^H}(z)$ holds uniformly for all $x_n\in\mathbb{C}_1^n$. We also note that this condition does not require $T_n$ to be a multiple of an identity. As an application of this remark, one sees that the eigenmatrix of a sample covariance matrix transforms $x_n$ to a unit vector whose entries' absolute values are close to $1/\sqrt{N}$. Consequently, the condition $x_n^*(T_n-zI)^{-1}x_n\to m_{F^H}(z)$ holds when $T_n$ is the inverse of another sample covariance matrix which is independent of $X_n$. Therefore, the multivariate $F$ matrix satisfies Theorem 1. In general, the condition may not hold for all $x_n\in\mathbb{C}_1^n$. However, there always exist some $x_n\in\mathbb{C}_1^n$ such that this condition holds, say $x_n=(u_1+\cdots+u_n)/\sqrt{n}$, where $u_1,\ldots,u_n$ are the orthonormal eigenvectors of the spectral decomposition of $T_n$.

Applying Theorem 1, we get the following interesting results.

Corollary 1. Let $(A_n^m)_{ii}$, $m=1,2,\ldots$, denote the $i$th diagonal elements of the matrices $A_n^m$. Under the conditions of Theorem 1 for $x_n=e_{ni}$, it follows that for any fixed $m$,
$$\lim_{n\to\infty}\left|(A_n^m)_{ii}-\int x^m\,dF^{c,H}(x)\right|=0\quad\text{a.s.},$$
where $e_{ni}$ is the $n$-vector with $i$th element 1 and all others 0.

Remark 2. If $T_n=bI$ for some positive constant $b$ or, more generally, $\lambda_{\max}(T_n)-\lambda_{\min}(T_n)\to0$, then there is a better result, that is,
$$\lim_{n\to\infty}\max_i\left|(A_n^m)_{ii}-\int x^m\,dF^{c,H}(x)\right|=0\quad\text{a.s.}\tag{1.9}$$
[The corollary follows easily from Theorem 1. The uniform convergence of (1.9) follows from the uniform convergence of condition (3) of Theorem 1 and by careful checking of the proof of Theorem 1.]

More generally, we have the following:

Corollary 2. If $f(x)$ is a bounded function and the assumptions of Theorem 1 are satisfied, then
$$\sum_{j=1}^{n}|y_j|^2f(\lambda_j)-\frac{1}{n}\sum_{j=1}^{n}f(\lambda_j)\to0\quad\text{a.s.}$$

Remark 3. The proofs of the above corollaries are immediate. Applying Corollary 2, Theorem 1 of [10] can be extended to a more general case without difficulty.

Theorem 2. In addition to the conditions of Theorem 1, we further assume that:
(4) $g_1,\ldots,g_k$ are defined and analytic on an open region $D$ of the complex plane which contains the real interval
$$\left[\liminf_n\lambda_{\min}^{T_n}I_{(0,1)}(c)(1-\sqrt{c})^2,\ \limsup_n\lambda_{\max}^{T_n}(1+\sqrt{c})^2\right]\tag{1.10}$$
and
(5)
$$\sup_z\sqrt{N}\left|x_n^*\left(\underline{m}_{F^{c_n,H_n}}(z)T_n+I\right)^{-1}x_n-\int\frac{1}{\underline{m}_{F^{c_n,H_n}}(z)t+1}\,dH_n(t)\right|\to0\quad\text{as }n\to\infty.$$
Then the following conclusions hold:
(a) The random vectors
$$\left(\int g_1(x)\,dG_n(x),\ldots,\int g_k(x)\,dG_n(x)\right)\tag{1.11}$$
form a tight sequence.
(b) If $X_{11}$ and $T_n$ are real and $EX_{11}^4=3$, then the above random vector converges weakly to a Gaussian vector $X_{g_1},\ldots,X_{g_k}$ with mean zero and covariance function
$$\mathrm{Cov}(X_{g_1},X_{g_2})=-\frac{1}{2\pi^2}\int_{C_1}\int_{C_2}g_1(z_1)g_2(z_2)\,\frac{(z_2\underline{m}(z_2)-z_1\underline{m}(z_1))^2}{c^2z_1z_2(z_2-z_1)(\underline{m}(z_2)-\underline{m}(z_1))}\,dz_1\,dz_2.\tag{1.12}$$
The contours $C_1$ and $C_2$ in the above equation are disjoint, are both contained in the analytic region for the functions $(g_1,\ldots,g_k)$ and both enclose the support of $F^{c_n,H_n}$ for all large $n$.
(c) If $X_{11}$ is complex with $EX_{11}^2=0$ and $E|X_{11}|^4=2$, then the conclusions (a) and (b) still hold, but the covariance function reduces to half of the quantity given in (1.12).

Remark 4. If $T_n=bI$ for some positive constant $b$ or, more generally, $\sqrt{n}(\lambda_{\max}(T_n)-\lambda_{\min}(T_n))\to0$, then condition (5) holds uniformly for all $x_n\in\mathbb{C}_1^n$.

Remark 5. Indeed, we can also establish the central limit theorem for $\hat X_n(g)$ according to Theorem 1.1 of [7] and Theorem 2.

Besides Theorem 2, which holds for more general functions $g(x)$, the following theorem reveals more similarities between the process $Y_n(t)$ and the BB.

Theorem 3. Besides the assumptions of Theorem 2, if $H(x)$ satisfies
$$\int\frac{dH(t)}{(1+t\underline{m}(z_1))(1+t\underline{m}(z_2))}-\int\frac{dH(t)}{1+t\underline{m}(z_1)}\int\frac{dH(t)}{1+t\underline{m}(z_2)}=0,\tag{1.13}$$
then all results of Theorem 2 remain true. Moreover, formula (1.12) can be simplified to
$$\mathrm{Cov}(X_{g_1},X_{g_2})=\frac{2}{c}\left(\int g_1(x)g_2(x)\,dF^{c,H}(x)-\int g_1(x_1)\,dF^{c,H}(x_1)\int g_2(x_2)\,dF^{c,H}(x_2)\right).\tag{1.14}$$

Remark 6. Obviously, (1.13) holds when $T_n=bI$. Actually, (1.13) holds if and only if $H(x)$ is a degenerate distribution. To see this, one need only choose $z_2$ to be the complex conjugate of $z_1$.

Remark 7. Theorem 3 extends the theorem of Silverstein [5]. First, one sees that the $r$th moment of $F_1^{A_n}(x)$ is $x_n^*A_n^rx_n$, which is a special case with $g(x)=x^r$. Then applying Theorem 3 with $T_n=bI$ and combining with Theorem 1.1 of [7], one can obtain the sufficient part of (a) in the theorem of Silverstein [5]. Actually, for $T_n=I$, formula (1.12) can be simplified to
$$\mathrm{Cov}(X_{g_1},X_{g_2})=\frac{2}{c}\left(\int g_1(x)g_2(x)\,dF_c(x)-\int g_1(x_1)\,dF_c(x_1)\int g_2(x_2)\,dF_c(x_2)\right),\tag{1.15}$$
where $F_c(x)$ is a special case of $F^{c,H}(x)$, as $T_n=I$.
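Two of the statements above lend themselves to a quick numerical sanity check in the case $T_n=I$ with real Gaussian entries (both choices are assumptions made here). The first part compares the diagonal entries of $A_n$ and $A_n^2$ with the first two moments of $F_c$, which are $1$ and $1+c$ for the Marchenko-Pastur law. The second part takes $g_1(x)=g_2(x)=x$ in (1.15), for which the right-hand side equals $\frac{2}{c}\big((1+c)-1\big)=2$, and compares it with the Monte Carlo variance of $\int x\,dG_n(x)=\sqrt{N}(x_n^*A_nx_n-1)$. Sample sizes and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: Corollary 1 with T_n = I, for m = 1 and m = 2.
n, N = 500, 1000
c = n / N
X = rng.standard_normal((n, N))
A = X @ X.T / N
dev1 = np.max(np.abs(np.diag(A) - 1.0))             # max_i |(A_n)_ii - 1|
dev2 = np.max(np.abs(np.diag(A @ A) - (1.0 + c)))   # max_i |(A_n^2)_ii - (1 + c)|

# Part 2: covariance formula (1.15) with g_1(x) = g_2(x) = x.
n2, N2, reps = 50, 100, 4000
c2 = n2 / N2
cov_limit = 2.0 / c2 * ((1.0 + c2) - 1.0 * 1.0)     # equals 2 for every c
x = rng.standard_normal(n2)
x /= np.linalg.norm(x)                              # fixed unit vector x_n
stats = np.empty(reps)
for r in range(reps):
    Z = rng.standard_normal((n2, N2))
    quad = np.sum((x @ Z) ** 2) / N2                # x' A_n x with A_n = Z Z' / N
    stats[r] = np.sqrt(N2) * (quad - 1.0)           # int x dG_n(x)
emp_var = stats.var()

print(dev1, dev2, emp_var, cov_limit)
```

The deviations `dev1` and `dev2` shrink only at rate $N^{-1/2}$, so they are visibly nonzero at this size, while `emp_var` should be close to the limiting value 2.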

The organization of the rest of the paper is as follows. In the next section, we complete the proof of Theorem 1. The proof of Theorem 2 is converted to an intermediate Lemma 2, given in Section 3. Sections 4 and 5 contain the proof of Lemma 2. Theorem 3 and some comparisons with the results of [5] are given in Section 6. A truncation lemma (Lemma 4) is postponed to Section 7.

2. Proof of Theorem 1. Without loss of generality, we assume that $\|T_n\|\le1$, where $\|\cdot\|$ denotes the spectral norm on the matrices, that is, their largest singular values. Throughout this paper, $K$ denotes a universal constant which may take different values at different appearances.

Lemma 1 (Lemma 2.7 in [5]). Let $X=(X_1,\ldots,X_n)'$, where the $X_j$'s are i.i.d. complex random variables with zero mean and unit variance. Let $B$ be a deterministic $n\times n$ complex matrix. Then for any $p\ge2$, we have
$$E|X^*BX-\mathrm{tr}\,B|^p\le K_p\left((E|X_1|^4\,\mathrm{tr}\,BB^*)^{p/2}+E|X_1|^{2p}\,\mathrm{tr}(BB^*)^{p/2}\right).$$

For $K>0$, let $\tilde X_{ij}=X_{ij}I(|X_{ij}|\le K)-EX_{ij}I(|X_{ij}|\le K)$ and $\tilde A_n=\frac{1}{N}T_n^{1/2}\tilde X_n\tilde X_n^*T_n^{1/2}$, where $\tilde X_n=(\tilde X_{ij})$. Let $v=\Im z>0$. Since $X_{ij}-\tilde X_{ij}=X_{ij}I(|X_{ij}|>K)-EX_{ij}I(|X_{ij}|>K)$ and $\|(A_n-zI)^{-1}\|$ is bounded by $1/v$, by Theorem 3.1 in [22], we have
$$\begin{aligned}
&|x_n^*(A_n-zI)^{-1}x_n-x_n^*(\tilde A_n-zI)^{-1}x_n|\\
&\quad\le\|(A_n-zI)^{-1}-(\tilde A_n-zI)^{-1}\|\\
&\quad\le\|(A_n-zI)^{-1}\|\,\|A_n-\tilde A_n\|\,\|(\tilde A_n-zI)^{-1}\|\\
&\quad\le\frac{1}{Nv^2}\left(\|X_n-\tilde X_n\|\,\|X_n^*\|+\|\tilde X_n\|\,\|X_n^*-\tilde X_n^*\|\right)\\
&\quad\to\frac{(1+\sqrt{c})^2}{v^2}\left[E^{1/2}|X_{11}-\tilde X_{11}|^2\left(E^{1/2}|X_{11}|^2+E^{1/2}|\tilde X_{11}|^2\right)\right]\quad\text{a.s.}\\
&\quad\le\frac{2(1+\sqrt{c})^2}{v^2}E^{1/2}|X_{11}|^2I(|X_{11}|>K).
\end{aligned}$$
The above bound can be made arbitrarily small by choosing $K$ sufficiently large. Since $\lim_{K\to\infty}E|\tilde X_{11}|^2=1$, the rescaling of $\tilde X_{ij}$ can be dealt with similarly. Hence, in the sequel, it is enough to assume that $|X_{ij}|\le K$, $EX_{11}=0$ and $E|X_{11}|^2=1$ (for simplicity, suppressing all super- and subscripts on the variables $X_{ij}$).

Next, we will show that
$$x_n^*(A_n-zI)^{-1}x_n-x_n^*E(A_n-zI)^{-1}x_n\to0\quad\text{a.s.}\tag{2.1}$$

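Let $s_j$ denote the $j$th column of $\frac{1}{\sqrt{N}}T_n^{1/2}X_n$, and let
$$A(z)=A_n-zI,\qquad A_j(z)=A(z)-s_js_j^*,$$
$$\alpha_j(z)=s_j^*A_j^{-1}(z)x_nx_n^*(E\underline{m}_n(z)T_n+I)^{-1}s_j-\frac{1}{N}x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_nA_j^{-1}(z)x_n,$$
$$\xi_j(z)=s_j^*A_j^{-1}(z)s_j-\frac{1}{N}\mathrm{tr}\,T_nA_j^{-1}(z),$$
$$\gamma_j=s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j-\frac{1}{N}x_n^*A_j^{-1}(z)T_nA_j^{-1}(z)x_n,$$
$$\beta_j(z)=\frac{1}{1+s_j^*A_j^{-1}(z)s_j},\qquad b_j(z)=\frac{1}{1+\frac{1}{N}\mathrm{tr}\,T_nA_j^{-1}(z)}.$$
Noting that $|\beta_j(z)|\le|z|/v$ and $\|A_j^{-1}(z)\|\le1/v$, by Lemma 1, we have
$$E|s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j|^r=O\left(\frac{1}{N^r}\right),\qquad E|\xi_j(z)|^r=O\left(\frac{1}{N^{r/2}}\right).\tag{2.2}$$
Define the $\sigma$-field $\mathcal{F}_j=\sigma(s_1,\ldots,s_j)$, let $E_j(\cdot)$ denote conditional expectation given the $\sigma$-field $\mathcal{F}_j$ and let $E_0(\cdot)$ denote the unconditional expectation. Note that
$$\begin{aligned}
&x_n^*(A_n-zI)^{-1}x_n-x_n^*E(A_n-zI)^{-1}x_n\\
&\quad=\sum_{j=1}^{N}\left(x_n^*E_jA^{-1}(z)x_n-x_n^*E_{j-1}A^{-1}(z)x_n\right)\\
&\quad=\sum_{j=1}^{N}\left(x_n^*E_j(A^{-1}(z)-A_j^{-1}(z))x_n-x_n^*E_{j-1}(A^{-1}(z)-A_j^{-1}(z))x_n\right)\\
&\quad=-\sum_{j=1}^{N}(E_j-E_{j-1})\beta_j(z)s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\\
&\quad=-\sum_{j=1}^{N}\left[E_jb_j(z)\gamma_j(z)-(E_j-E_{j-1})s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\,\beta_j(z)b_j(z)\xi_j(z)\right].
\end{aligned}\tag{2.3}$$
By the fact that $\left|\frac{1}{1+s_j^*A_j^{-1}(z)s_j}\right|\le\frac{|z|}{v}$ and use of the Burkholder inequality, (2.2) and the martingale expression (2.3), we have
$$\begin{aligned}
&E|x_n^*(A_n-zI)^{-1}x_n-x_n^*E(A_n-zI)^{-1}x_n|^r\\
&\quad\le K\Bigg\{E\Bigg[\sum_{j=1}^{N}E_{j-1}\big|(E_j-E_{j-1})\beta_j(z)s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\big|^2\Bigg]^{r/2}\\
&\qquad\qquad+\sum_{j=1}^{N}E\big|(E_j-E_{j-1})\beta_j(z)s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\big|^r\Bigg\}\\
&\quad\le K\Bigg\{E\Bigg[\sum_{j=1}^{N}\frac{K|z|^2}{v^2}E_{j-1}|\gamma_j(z)|^2+E_{j-1}\big|s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\,\xi_j(z)\big|^2\Bigg]^{r/2}\\
&\qquad\qquad+\frac{K|z|^r}{v^r}\sum_{j=1}^{N}E\big|s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\big|^r\Bigg\}\\
&\quad\le K\left[N^{-r/2}+N^{-r+1}\right].
\end{aligned}$$
Thus (2.1) follows from the Borel-Cantelli lemma, by taking $r>2$.

Write
$$A(z)-(-zE\underline{m}_n(z)T_n-zI)=\sum_{j=1}^{N}s_js_j^*-(-zE\underline{m}_n(z))T_n.$$
Using the identities
$$s_j^*A^{-1}(z)=\beta_j(z)s_j^*A_j^{-1}(z)$$
and
$$\underline{m}_n(z)=-\frac{1}{zN}\sum_{j=1}^{N}\beta_j(z)\tag{2.4}$$
(see (2.2) of [3]), we obtain
$$\begin{aligned}
&EA^{-1}(z)-(-zE\underline{m}_n(z)T_n-zI)^{-1}\\
&\quad=(zE\underline{m}_n(z)T_n+zI)^{-1}E\Bigg[\Bigg(\sum_{j=1}^{N}s_js_j^*-(-zE\underline{m}_n(z))T_n\Bigg)A^{-1}(z)\Bigg]\\
&\quad=\frac{1}{z}\sum_{j=1}^{N}E\beta_j(z)\Bigg[(E\underline{m}_n(z)T_n+I)^{-1}s_js_j^*A_j^{-1}(z)-\frac{1}{N}(E\underline{m}_n(z)T_n+I)^{-1}T_nEA^{-1}(z)\Bigg].
\end{aligned}$$
Multiplying by $x_n^*$ on the left and $x_n$ on the right, we have
$$\begin{aligned}
&x_n^*EA^{-1}(z)x_n-x_n^*(-zE\underline{m}_n(z)T_n-zI)^{-1}x_n\\
&\quad=\frac{N}{z}E\beta_1(z)\Bigg[s_1^*A_1^{-1}(z)x_nx_n^*(E\underline{m}_n(z)T_n+I)^{-1}s_1-\frac{1}{N}x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_nEA^{-1}(z)x_n\Bigg]\\
&\quad=\delta_1+\delta_2+\delta_3,
\end{aligned}\tag{2.5}$$
where
$$\delta_1=\frac{N}{z}E\beta_1(z)\alpha_1(z),$$
$$\delta_2=\frac{1}{z}E\beta_1(z)x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_n(A_1^{-1}(z)-A^{-1}(z))x_n,$$
$$\delta_3=\frac{1}{z}E\beta_1(z)x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_n(A^{-1}(z)-EA^{-1}(z))x_n.$$
Similar to (2.2), by Lemma 1, for $r\ge2$, we have
$$E|\alpha_j(z)|^r=O\left(\frac{1}{N^r}\right).$$
Therefore,
$$\delta_1=-\frac{N}{z}Eb_1(z)\beta_1(z)\xi_1(z)\alpha_1(z)=O(N^{-3/2}).\tag{2.6}$$
It follows that
$$\begin{aligned}
|\delta_2|&=\left|\frac{1}{z}E\beta_1^2(z)x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_nA_1^{-1}(z)s_1s_1^*A_1^{-1}(z)x_n\right|\\
&\le K\left(E|x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_nA_1^{-1}(z)s_1|^2\,E|s_1^*A_1^{-1}(z)x_n|^2\right)^{1/2}=O(N^{-1})
\end{aligned}\tag{2.7}$$
and
$$\begin{aligned}
|\delta_3|&=\left|\frac{1}{z}E\beta_1(z)b_1(z)\xi_1(z)x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_n(A^{-1}(z)-EA^{-1}(z))x_n\right|\\
&\le K\left(E|\xi_1(z)|^2\,E|x_n^*(E\underline{m}_n(z)T_n+I)^{-1}T_n(A^{-1}(z)-EA^{-1}(z))x_n|^2\right)^{1/2}=o(N^{-1/2}),
\end{aligned}\tag{2.8}$$
where to estimate the second factor, we need to use the martingale decomposition of $A^{-1}(z)-EA^{-1}(z)$.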
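The two identities used in this derivation, $s_j^*A^{-1}(z)=\beta_j(z)s_j^*A_j^{-1}(z)$ and (2.4), are exact finite-dimensional facts and can be verified numerically. The following sketch uses small illustrative dimensions, a diagonal $T_n$ and complex Gaussian entries, all of which are assumptions made for the check.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 6, 9
z = 0.7 + 0.4j

t = rng.uniform(0.5, 1.0, n)                     # eigenvalues of a diagonal T_n (assumption)
T_half = np.diag(np.sqrt(t))
X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
S = T_half @ X / np.sqrt(N)                      # columns are s_j
A = S @ S.conj().T - z * np.eye(n)               # A(z) = A_n - zI
A_inv = np.linalg.inv(A)

beta = np.empty(N, dtype=complex)
max_err = 0.0
for j in range(N):
    s = S[:, j]
    Aj_inv = np.linalg.inv(A - np.outer(s, s.conj()))   # A_j(z) = A(z) - s_j s_j^*
    beta[j] = 1.0 / (1.0 + s.conj() @ Aj_inv @ s)       # beta_j(z)
    lhs = s.conj() @ A_inv                              # s_j^* A^{-1}(z)
    rhs = beta[j] * (s.conj() @ Aj_inv)                 # beta_j(z) s_j^* A_j^{-1}(z)
    max_err = max(max_err, np.max(np.abs(lhs - rhs)))

# (2.4): underline m_n(z) is the Stieltjes transform of the ESD of the N x N matrix S^* S.
lam_under = np.linalg.eigvalsh(S.conj().T @ S)
m_under = np.mean(1.0 / (lam_under - z))
m_under_from_beta = -beta.sum() / (z * N)
print(max_err, abs(m_under - m_under_from_beta))
```

Both printed numbers are at rounding-error level, which reflects that these are rank-one update (Sherman-Morrison type) identities rather than asymptotic statements.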

Combining the above three results and (2.5), we conclude that
$$x_n^*EA^{-1}(z)x_n-x_n^*(-zE\underline{m}_n(z)T_n-zI)^{-1}x_n\to0.\tag{2.9}$$
In [3], it is proved that, under the conditions of Theorem 1, $E\underline{m}_n(z)\to\underline{m}(z)$, which is the solution of equation (1.2), and we then conclude that
$$x_n^*EA^{-1}(z)x_n-x_n^*(-z\underline{m}(z)T_n-zI)^{-1}x_n\to0.$$
By condition (3) of Theorem 1, we finally obtain that
$$x_n^*EA^{-1}(z)x_n\to\int\frac{dH(t)}{-z\underline{m}(z)t-z}.$$
Since $1-c-czm(z)=-z\underline{m}(z)$ by (1.3), the right-hand side is exactly $m(z)$ in (1.1); together with (1.8) and (2.1), this completes the proof of Theorem 1.

3. An intermediate lemma. In the sequel, we will follow the work of Bai and Silverstein [7]. To complete the proof of Theorem 2, we need an intermediate lemma.

Write
$$M_n(z)=\sqrt{N}\left(m_{F_1^{A_n}}(z)-m_{F^{c_n,H_n}}(z)\right),$$
which is defined on a contour $C$ in the complex plane, described as follows. Let $u_r$ be a number which is greater than the right endpoint of interval (1.10) and let $u_l$ be a negative number if the left endpoint of interval (1.10) is zero, otherwise let $u_l\in\left(0,\liminf_n\lambda_{\min}^{T_n}I_{(0,1)}(c)(1-\sqrt{c})^2\right)$. Let $v_0>0$ be arbitrary. Define
$$C_u=\{u+iv_0: u\in[u_l,u_r]\}.$$
Then the contour is
$$C=C_u\cup\{u_l+iv: v\in[0,v_0]\}\cup\{u_r+iv: v\in[0,v_0]\}\cup\{\text{their symmetric parts below the real axis}\}.$$
Under the conditions of Theorem 2, for later use, we may select the contour $C$ in the region on which the functions $g$ are analytic.

As in [7], due to technical difficulties, we will consider $M_n^*(z)$, a truncated version of $M_n(z)$. Choose a sequence of positive numbers $\{\delta_n\}$ such that for $0<\rho<1$,
$$\delta_n\downarrow0,\qquad\delta_n\ge n^{-\rho}.\tag{3.1}$$
Write
$$C_l=\begin{cases}\{u_l+iv: v\in[n^{-1}\delta_n,v_0]\},&\text{if }u_l>0,\\ \{u_l+iv: v\in[0,v_0]\},&\text{if }u_l<0,\end{cases}$$
and
$$C_r=\{u_r+iv: v\in[n^{-1}\delta_n,v_0]\}.$$
Let $C_0=C_u\cup C_l\cup C_r$. Now, for $z=u+iv$, we can define the process
$$M_n^*(z)=\begin{cases}
M_n(z),&\text{if }z\in C_0\cup\bar C_0,\\[4pt]
\dfrac{nv+\delta_n}{2\delta_n}M_n(u_r+in^{-1}\delta_n)+\dfrac{\delta_n-nv}{2\delta_n}M_n(u_r-in^{-1}\delta_n),&\text{if }u=u_r,\ v\in[-n^{-1}\delta_n,n^{-1}\delta_n],\\[4pt]
\dfrac{nv+\delta_n}{2\delta_n}M_n(u_l+in^{-1}\delta_n)+\dfrac{\delta_n-nv}{2\delta_n}M_n(u_l-in^{-1}\delta_n),&\text{if }u=u_l>0,\ v\in[-n^{-1}\delta_n,n^{-1}\delta_n].
\end{cases}$$
$M_n^*(z)$ can be viewed as a random element in the metric space $C(C,\mathbb{R}^2)$ of continuous functions from $C$ to $\mathbb{R}^2$. We shall prove the following lemma.

Lemma 2. Under the assumptions of Theorem 1 and assumptions (4) and (5) of Theorem 2, $M_n^*(z)$ forms a tight sequence on $C$. Furthermore, when the conditions in (b) and (c) of Theorem 2 on $X_{11}$ hold, for $z\in C$, $M_n(z)$ converges to a Gaussian process $M(\cdot)$ with zero mean and for $z_1,z_2\in C$, under the assumptions in (b),
$$\mathrm{Cov}(M(z_1),M(z_2))=\frac{2(z_2\underline{m}(z_2)-z_1\underline{m}(z_1))^2}{c^2z_1z_2(z_2-z_1)(\underline{m}(z_2)-\underline{m}(z_1))},\tag{3.2}$$
while under the assumptions in (c), the covariance function is half of the value of (3.2).

Similar to the approach of [7], to prove Theorem 2, it suffices to prove Lemma 2. Before proceeding with the detailed proof of the lemma, we need to truncate, recentralize and renormalize the variables $X_{ij}$. However, those procedures are purely technical (and tedious), thus we shall postpone them to the last section of the paper. Now, according to Lemma 4, we further assume that the underlying variables satisfy the following additional conditions:
$$|X_{ij}|\le\varepsilon_nn^{1/4},\qquad EX_{11}=0,\qquad E|X_{11}|^2=1,\qquad E|X_{11}|^4<\infty$$
and
$$\begin{cases}E|X_{11}|^4=3+o(1),&\text{if assumption (b) of Theorem 2 is satisfied},\\ EX_{11}^2=o(n^{-1/2}),\ E|X_{11}|^4=2+o(1),&\text{if assumption (c) of Theorem 2 is satisfied}.\end{cases}$$
Here, $\varepsilon_n$ is a sequence of positive numbers which converges to zero.

The proof of Lemma 2 will be given in the next two sections.

4. Convergence in finite dimensions. For $z \in \mathcal{C}_0$, let

\[ M_n^1(z) = \sqrt{N}\,(m_{F_1^{A_n}}(z) - Em_{F_1^{A_n}}(z)) \]

and

\[ M_n^2(z) = \sqrt{N}\,(Em_{F_1^{A_n}}(z) - m_{F^{c_n,H_n}}(z)). \]

Then

\[ M_n(z) = M_n^1(z) + M_n^2(z). \]

In this section, for any positive integer $r$ and complex numbers $a_1, \ldots, a_r$, we will show that

\[ \sum_{i=1}^r a_i M_n^1(z_i) \qquad (\Im z_i \ne 0) \]

converges in distribution to a Gaussian random variable and will derive the covariance function (3.2). To this end, we employ the notation introduced in Section 2. Before proceeding with the proofs, we first recall some known facts and results.

1. (See [7].) Let $Y = (Y_1, \ldots, Y_n)$, where the $Y_i$'s are i.i.d. complex random variables with mean zero and variance 1. Let $A = (a_{ij})_{n\times n}$ and $B = (b_{ij})_{n\times n}$ be complex matrices. Then the following identity holds:

\[ E(Y^*AY - \operatorname{tr}A)(Y^*BY - \operatorname{tr}B) = (E|Y_1|^4 - |EY_1^2|^2 - 2)\sum_{i=1}^n a_{ii}b_{ii} + |EY_1^2|^2 \operatorname{tr}AB^T + \operatorname{tr}AB. \tag{4.1} \]

2. (See Theorem 35.12 of [9].)

Lemma 3. Suppose that for each $n$, $Y_{n1}, Y_{n2}, \ldots, Y_{nr_n}$ is a real martingale difference sequence with respect to the increasing $\sigma$-field $\{\mathcal{F}_{nj}\}$ having second moments. If, as $n \to \infty$, we have

(i) $\sum_{j=1}^{r_n} E(Y_{nj}^2 \mid \mathcal{F}_{n,j-1}) \xrightarrow{i.p.} \sigma^2$, where $\sigma^2$ is a positive constant, and, for each $\varepsilon > 0$,

(ii) $\sum_{j=1}^{r_n} E(Y_{nj}^2 I(|Y_{nj}| \ge \varepsilon)) \to 0$,

then

\[ \sum_{j=1}^{r_n} Y_{nj} \xrightarrow{D} N(0, \sigma^2). \]
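Identity (4.1) can be checked exactly in a small case. The sketch below assumes the $Y_i$ are Rademacher signs (so $E|Y_1|^4 = 1$ and $|EY_1^2|^2 = 1$), which is a choice made here for illustration and not the setting of the paper; the expectation is computed by enumerating all sign vectors, and the function names are hypothetical.

```python
from itertools import product


def quadratic_form(y, a):
    """Return y^T a y for a real sign vector y and a (possibly complex) matrix a."""
    n = len(y)
    return sum(y[i] * a[i][j] * y[j] for i in range(n) for j in range(n))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def lhs_exact_rademacher(a, b):
    """E(Y*AY - tr A)(Y*BY - tr B), averaged over all 2^n sign vectors."""
    n = len(a)
    total = 0
    for y in product((-1, 1), repeat=n):
        total += (quadratic_form(y, a) - trace(a)) * (quadratic_form(y, b) - trace(b))
    return total / 2 ** n


def rhs_identity(a, b, fourth_moment, second_moment_sq):
    """Right-hand side of (4.1): moments enter only through two scalars."""
    n = len(a)
    diag_sum = sum(a[i][i] * b[i][i] for i in range(n))
    tr_a_bt = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))  # tr(A B^T)
    tr_a_b = sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))   # tr(A B)
    return ((fourth_moment - second_moment_sq - 2) * diag_sum
            + second_moment_sq * tr_a_bt + tr_a_b)
```

For Rademacher signs the diagonal coefficient is $1 - 1 - 2 = -2$, which exactly cancels the diagonal contributions of the two trace terms, as expected since $Y^*AY - \operatorname{tr}A$ then involves only off-diagonal entries.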

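3. Some simple results follow by using the truncation and centralization steps described in Lemma 4 given in the Appendix:

\[ E|s_1^*Cs_1 - N^{-1}\operatorname{tr}T_nC|^p \le K_p\|C\|^p(\varepsilon_n^{2p-4}N^{-p/2} + N^{-p/2}) \le K_p\|C\|^pN^{-p/2}, \tag{4.2} \]

\[ E|s_1^*Cx_nx_n^*Ds_1 - N^{-1}x_n^*DT_nCx_n|^p \le K_p\|C\|^p\|D\|^p\varepsilon_n^{2p-4}N^{-p/2-1}, \tag{4.3} \]

\[ E|s_1^*Cx_nx_n^*Ds_1|^p \le K_p\|C\|^p\|D\|^p\varepsilon_n^{2p-4}N^{-p/2-1}. \tag{4.4} \]

Let $v = \Im z$. To facilitate the analysis, we will assume that $v > 0$. By (2.3), we have

\[ \sqrt{N}\,(m_{F_1^{A_n}}(z) - Em_{F_1^{A_n}}(z)) = -\sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,\beta_j(z)\,s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j. \]

Since

\[ \beta_j(z) = b_j(z) - \beta_j(z)b_j(z)\xi_j(z) = b_j(z) - b_j^2(z)\xi_j(z) + b_j^2(z)\beta_j(z)\xi_j^2(z), \]

we get

\[ \begin{aligned}
(E_j - E_{j-1})\,\beta_j(z)\,s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j
&= E_jb_j(z)\gamma_j(z) - E_j\Bigl(b_j^2(z)\xi_j(z)\frac{1}{N}x_n^*A_j^{-1}(z)T_nA_j^{-1}(z)x_n\Bigr) \\
&\quad + (E_j - E_{j-1})\bigl(b_j^2(z)\beta_j(z)\xi_j^2(z)\,s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j - b_j^2(z)\xi_j(z)\gamma_j(z)\bigr).
\end{aligned} \]

Applying (4.2), we obtain

\[ \begin{aligned}
E\Bigl|\sqrt{N}\sum_{j=1}^N E_j\Bigl(b_j^2(z)\xi_j(z)\frac{1}{N}x_n^*A_j^{-1}(z)T_nA_j^{-1}(z)x_n\Bigr)\Bigr|^2
&= \frac{1}{N}\sum_{j=1}^N E\bigl|E_j\bigl(b_j^2(z)\xi_j(z)\,x_n^*A_j^{-1}(z)T_nA_j^{-1}(z)x_n\bigr)\bigr|^2 \\
&\le K\frac{|z|^4}{v^8}E|\xi_1(z)|^2 = O(N^{-1}),
\end{aligned} \]

which implies that

\[ \sqrt{N}\sum_{j=1}^N E_j\Bigl(b_j^2(z)\xi_j(z)\frac{1}{N}x_n^*A_j^{-1}(z)T_nA_j^{-1}(z)x_n\Bigr) \xrightarrow{i.p.} 0. \]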
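The two-step expansion of $\beta_j(z)$ used above is pure algebra once $1/\beta_j = 1/b_j + \xi_j$. The sketch below checks it numerically under that assumption, writing `a` for the trace term so that $b_j = 1/(1+a)$ and $\beta_j = 1/(1+a+\xi_j)$; these scalar stand-ins and the function name are hypothetical.

```python
def beta_expansion_gaps(a, xi):
    """Return the absolute errors of the first- and second-order expansions of beta.

    Assumes b = 1/(1 + a) and beta = 1/(1 + a + xi), i.e. 1/beta = 1/b + xi.
    """
    b = 1 / (1 + a)
    beta = 1 / (1 + a + xi)
    first_order = b - beta * b * xi
    second_order = b - b * b * xi + b * b * beta * xi * xi
    return abs(beta - first_order), abs(beta - second_order)
```

Both expansions are exact identities, not approximations: the remainder is carried by the factor $\beta_j$ in the last term, which is why the proof only needs moment bounds on $\xi_j$ and a deterministic bound on $\beta_j$.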

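By (4.2), (4.4) and Hölder's inequality, we have

\[ \begin{aligned}
&E\Bigl|\sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,b_j^2(z)\beta_j(z)\xi_j^2(z)\,s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j\Bigr|^2 \\
&\qquad \le K\Bigl(\frac{|z|}{v}\Bigr)^6 N^2\Bigl(E|\xi_1^4(z)\gamma_1^2(z)| + \frac{1}{N^2}E|\xi_1^4(z)|\,|x_n^*A_1^{-1}(z)T_nA_1^{-1}(z)x_n|^2\Bigr) = O(N^{-3/2}),
\end{aligned} \]

which implies that

\[ \sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,b_j^2(z)\beta_j(z)\xi_j^2(z)\,s_j^*A_j^{-1}(z)x_nx_n^*A_j^{-1}(z)s_j \xrightarrow{i.p.} 0. \]

Using a similar argument, we have

\[ \sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,b_j^2(z)\xi_j(z)\gamma_j(z) \xrightarrow{i.p.} 0. \]

The estimate above (4.3) of [5], (2.7) of [7] and (4.3) collectively yield that

\[ \begin{aligned}
E|(b_j(z) + z\underline{m}(z))\gamma_j(z)|^2
&= E[E(|(b_j(z) + z\underline{m}(z))\gamma_j(z)|^2 \mid \sigma(s_i, i \ne j))] \\
&= E[|b_j(z) + z\underline{m}(z)|^2\,E(|\gamma_j(z)|^2 \mid \sigma(s_i, i \ne j))] = o(N^{-2}),
\end{aligned} \]

which gives

\[ \sqrt{N}\sum_{j=1}^N E_j[(b_j(z) + z\underline{m}(z))\gamma_j(z)] \xrightarrow{i.p.} 0. \]

Note that the above results also hold when $\Im z \le -v_0$, by symmetry. Hence, for finite-dimensional convergence, we need only consider the sum

\[ \sum_{i=1}^r a_i\sum_{j=1}^N Y_j(z_i) = \sum_{j=1}^N\sum_{i=1}^r a_iY_j(z_i), \]

where $Y_j(z_i) = \sqrt{N}\,z_i\underline{m}(z_i)E_j\gamma_j(z_i)$. Since $E|Y_j(z_i)|^4 = O(\varepsilon_n^4N^{-1})$, we have

\[ \sum_{j=1}^N E\Bigl(\Bigl|\sum_{i=1}^r a_iY_j(z_i)\Bigr|^2 I\Bigl(\Bigl|\sum_{i=1}^r a_iY_j(z_i)\Bigr| \ge \varepsilon\Bigr)\Bigr) \le \frac{1}{\varepsilon^2}\sum_{j=1}^N E\Bigl|\sum_{i=1}^r a_iY_j(z_i)\Bigr|^4 = O(\varepsilon_n^4). \]

Thus condition (ii) of Lemma 3 is satisfied.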
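The verification of condition (ii) rests on the elementary bound $|W|^2I(|W| \ge \varepsilon) \le |W|^4/\varepsilon^2$. A short worked version follows, where $W_j = \sum_{i=1}^r a_iY_j(z_i)$ is a shorthand introduced only for this sketch and the second step uses $|\sum_{i=1}^r x_i|^4 \le r^3\sum_{i=1}^r |x_i|^4$:

```latex
\begin{align*}
\sum_{j=1}^N E\bigl(|W_j|^2 I(|W_j| \ge \varepsilon)\bigr)
  &\le \frac{1}{\varepsilon^2}\sum_{j=1}^N E|W_j|^4 \\
  &\le \frac{r^3}{\varepsilon^2}\sum_{j=1}^N\sum_{i=1}^r |a_i|^4\,E|Y_j(z_i)|^4 \\
  &\le \frac{r^4}{\varepsilon^2}\,\max_i|a_i|^4 \cdot N \cdot O(\varepsilon_n^4N^{-1})
   = O(\varepsilon_n^4).
\end{align*}
```

Since $r$, the $a_i$ and $\varepsilon$ are fixed while $\varepsilon_n \to 0$, the right-hand side tends to zero.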

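Now, we only need to show that for $z_1, z_2 \in \mathbb{C}\setminus\mathbb{R}$,

\[ \sum_{j=1}^N E_{j-1}(Y_j(z_1)Y_j(z_2)) \tag{4.5} \]

converges in probability to a constant under the assumptions in (b) or (c). It is easy to verify that

\[ \bigl|\operatorname{tr}E_j(A_j^{-1}(z_1)x_nx_n^*A_j^{-1}(z_1))T_n\,E_j(A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_2)T_n)\bigr| \le (v_1v_2)^{-2}, \tag{4.6} \]

where $v_1 = \Im(z_1)$ and $v_2 = \Im(z_2)$. It follows that, for the complex case, applying (4.1), (4.5) now becomes

\[ \begin{aligned}
&z_1z_2\underline{m}(z_1)\underline{m}(z_2)\frac{1}{N}\sum_{j=1}^N E_{j-1}\operatorname{tr}E_j(A_j^{-1}(z_1)x_nx_n^*A_j^{-1}(z_1))T_n\,E_j(A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_2)T_n) + o_p(1) \\
&\quad = z_1z_2\underline{m}(z_1)\underline{m}(z_2)\frac{1}{N}\sum_{j=1}^N E_{j-1}(x_n^*A_j^{-1}(z_1)T_n\breve{A}_j^{-1}(z_2)x_n)(x_n^*\breve{A}_j^{-1}(z_2)T_nA_j^{-1}(z_1)x_n) + o_p(1),
\end{aligned} \tag{4.7} \]

where $\breve{A}_j^{-1}(z_2)$ is defined similarly to $A_j^{-1}(z_2)$ by $(s_1, \ldots, s_{j-1}, \breve{s}_{j+1}, \ldots, \breve{s}_N)$ and where $\breve{s}_{j+1}, \ldots, \breve{s}_N$ are i.i.d. copies of $s_{j+1}, \ldots, s_N$. For the real case, (4.5) will be twice the magnitude of (4.7).

Write

\[ x_n^*(A_j^{-1}(z_1) - E_jA_j^{-1}(z_1))T_n\breve{A}_j^{-1}(z_2)x_n = \sum_{t=j}^N x_n^*(E_tA_j^{-1}(z_1) - E_{t-1}A_j^{-1}(z_1))T_n\breve{A}_j^{-1}(z_2)x_n. \tag{4.8} \]

From (4.8), we note that

\[ \begin{aligned}
&E_{j-1}(x_n^*A_j^{-1}(z_1)T_n\breve{A}_j^{-1}(z_2)x_n)(x_n^*\breve{A}_j^{-1}(z_2)T_nA_j^{-1}(z_1)x_n) \\
&\quad = \sum_{t=j}^N E_{j-1}\,x_n^*(E_tA_j^{-1}(z_1) - E_{t-1}A_j^{-1}(z_1))T_n\breve{A}_j^{-1}(z_2)x_n\;x_n^*\breve{A}_j^{-1}(z_2)T_n(E_tA_j^{-1}(z_1) - E_{t-1}A_j^{-1}(z_1))x_n \\
&\qquad + E_{j-1}(x_n^*(E_jA_j^{-1}(z_1)T_n)\breve{A}_j^{-1}(z_2)x_n)(x_n^*\breve{A}_j^{-1}(z_2)T_n(E_jA_j^{-1}(z_1))x_n)
\end{aligned} \tag{4.9} \]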

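\[ = E_{j-1}(x_n^*(E_jA_j^{-1}(z_1)T_n)\breve{A}_j^{-1}(z_2)x_n)(x_n^*\breve{A}_j^{-1}(z_2)T_n(E_jA_j^{-1}(z_1))x_n) + O(N^{-1}), \]

where we have used the fact that

\[ \begin{aligned}
&\bigl|E_{j-1}\,x_n^*(E_tA_j^{-1}(z_1) - E_{t-1}A_j^{-1}(z_1))T_n\breve{A}_j^{-1}(z_2)x_n\;x_n^*\breve{A}_j^{-1}(z_2)T_n(E_tA_j^{-1}(z_1) - E_{t-1}A_j^{-1}(z_1))x_n\bigr| \\
&\quad \le 4\bigl(E_{j-1}|\beta_{tj}(z_1)\,x_n^*A_{tj}^{-1}(z_1)s_ts_t^*A_{tj}^{-1}(z_1)T_n\breve{A}_j^{-1}(z_2)x_n|^2 \\
&\qquad\quad \times E_{j-1}|\beta_{tj}(z_1)\,x_n^*\breve{A}_j^{-1}(z_2)T_nA_{tj}^{-1}(z_1)s_ts_t^*A_{tj}^{-1}(z_1)x_n|^2\bigr)^{1/2} = O(N^{-2}).
\end{aligned} \]

Similarly, one can prove that

\[ \begin{aligned}
&E_{j-1}(x_n^*(E_jA_j^{-1}(z_1)T_n)\breve{A}_j^{-1}(z_2)x_n)(x_n^*\breve{A}_j^{-1}(z_2)T_n(E_jA_j^{-1}(z_1))x_n) \\
&\quad = E_{j-1}(x_n^*A_j^{-1}(z_1)T_n\breve{A}_j^{-1}(z_2)x_n)\,E_{j-1}(x_n^*\breve{A}_j^{-1}(z_2)T_nA_j^{-1}(z_1)x_n) + O(N^{-1}).
\end{aligned} \]

Define

\[ A_{ij}(z) = A(z) - s_is_i^* - s_js_j^*, \qquad T(z_1) = z_1I - \frac{N-1}{N}b_n(z_1)T_n, \]

\[ \beta_{ij}(z) = \frac{1}{1 + s_i^*A_{ij}^{-1}(z)s_i} \qquad\text{and}\qquad b_n(z) = \frac{1}{1 + N^{-1}E\operatorname{tr}T_nA_{12}^{-1}(z)}. \]

Then (see (2.9) in [7])

\[ A_j^{-1}(z_1) = -T^{-1}(z_1) + b_n(z_1)B_j(z_1) + C_j(z_1) + D_j(z_1), \tag{4.10} \]

where $B_j(z_1) = B_{j1}(z_1) + B_{j2}(z_1)$,

\[ B_{j1}(z_1) = \sum_{i>j} T^{-1}(z_1)(s_is_i^* - N^{-1}T_n)A_{ij}^{-1}(z_1), \]

\[ B_{j2}(z_1) = \sum_{i<j} T^{-1}(z_1)(s_is_i^* - N^{-1}T_n)A_{ij}^{-1}(z_1), \]

\[ C_j(z_1) = \sum_{i \ne j} (\beta_{ij}(z_1) - b_n(z_1))\,T^{-1}(z_1)s_is_i^*A_{ij}^{-1}(z_1) \]

and

\[ D_j(z_1) = N^{-1}b_n(z_1)T^{-1}(z_1)T_n\sum_{i \ne j}(A_{ij}^{-1}(z_1) - A_j^{-1}(z_1)). \]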

It follows that (4.) where and E j (x n A j ASYMPTOTICS OF EIGENVECTORS 23 (z )T n Ă j (z 2 )x n )E j (x nă j (z 2 )T n j (z )x n ) = E j (x n A j (z )T n Ă j (z 2 )x n ) E j (x nă j (z 2 )T n T (z )T n x n ) + B(z,z 2 ) + C(z,z 2 ) + D(z,z 2 ), B(z,z 2 ) = b n (z )E j (x n A j (z )T n Ă j (z 2 )x n ) E j (x nă j (z 2 )T n B j2 (z )x n ), C(z,z 2 ) = E j (x n A j (z )T n Ă j (z 2 )x n ) D(z,z 2 ) = E j (x n A j We then prove that (4.2) E C(z,z 2 ) = o() E j (x nă j (z 2 )T n C j (z )x n ) (z )T n Ă j (z 2 )x n )(x nă j (z 2 )T n D j (z )x n ). and E D(z,z 2 ) = o(). Note that although C and D depend on j implicitly, E C(z,z 2 ) and E D(z,z 2 ) are independent of j since the entries of X n are i.i.d. We then have E C(z,z 2 ) E x nă j (z 2 )T n C j (z )x n v v 2 v v 2 (E β ij (z ) b n (z ) 2 i j E s i ij (z )x n x n(ă j (z 2 ))T n T (z )s i 2 ) /2. When i > j, s i is independent of Ă j (z 2 ). As in the proof of (2.2), we have (4.3) E s i A ij (z )x n x n (Ă j (z 2 ))T n T (z )s i 2 = O(N 2 ). When i < j, by substituting Ă j (z 2 ) for Ă ij (z 2) β ij (z 2 )Ă ij (z 2)s i s i Ă ij (z 2), we can also obtain the above inequality. Noting that (4.4) E β ij (z ) b n (z ) 2 = E β ij (z )b n (z )ξ ij 2 = O(n ), where ξ ij (z) = s i A ij (z)s i N tra ij (z) and β ij (z 2 ) is defined similarly to β ij (z 2 ), and combining (4.3) (4.4), we conclude that E C(z,z 2 ) = o().

24 Z. D. BAI, B. Q. MIAO AND G. M. PAN The argument for D(z,z 2 ) is similar to that of C(z,z 2 ), only simpler, and is therefore omitted. Hence, (4.2) holds. Next, write (4.5) where and B (z,z 2 ) = i<j B 2 (z,z 2 ) = i<j B(z,z 2 ) = B (z,z 2 ) + B 2 (z,z 2 ) + B 3 (z,z 2 ), b n (z )E j x nβ ij (z ) ij (z )s i s i ij (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n T (z )(s i s i N T n ) ij (z )x n, b n (z )E j x n ij (z )T n Ă ij (z 2)s i s iă ij (z 2) β ij (z 2 )x n E j x nă j (z 2 )T n T (z )(s i s i N T n ) ij (z )x n B 3 (z,z 2 ) = b n (z )E j x n ij (z )T n Ă ij (z 2)x n i<j E j x nă j (z 2 )T n T (z )(s i s i N T n ) ij (z )x n. Splitting Ă j (z 2 ) into the sum of Ă ij (z 2) and β ij (z 2 )Ă ij (z 2)s i s i Ă ij (z 2), as in the proof of (4.4), one can show that E B (z,z 2 ) b n (z ) (E x nβ ij (z ) ij (z )s i s i ij (z )T n Ă j (z 2 )x n 2 i<j = O(N /2 ). By the same argument, we have E x nă j (z 2 )T n T (z )(s i s i N T n ) ij (z )x n 2 ) /2 E B 2 (z,z 2 ) = O(N ). To deal with B 3 (z,z 2 ), we again split Ă j (z 2 ) into the sum of Ă ij (z 2) and β ij (z 2 )Ă ij (z 2)s i s i Ă ij (z 2). We first show that B 3 (z,z 2 ) = b n (z )E j x n A ij (z )T n Ă ij (z 2)x n i<j (4.6) = o p (). E j x nă ij (z 2)T n T (z )(s i s i N T n ) ij (z )x n

We have E B 3 (z,z 2 ) 2 = ASYMPTOTICS OF EIGENVECTORS 25 i,i 2 <j b n (z ) 2 EE j x n i j (z )T n Ă i j (z 2)x n E j x n i 2 j ( z )T n Ă i 2 j ( z 2)x n E j x nă i j (z 2)T n T (z ) (s i s i N T n ) i j (z )x n x nă i 2 j ( z 2) T n T ( z )(s i2 s i 2 N T n ) i 2 j ( z )x n. When i = i 2, the term in the above expression is bounded by KE x nă i j (z 2)T n T (z )(s i s i N T n ) ij (z )x n 2 = O(N 2 ). For i i 2 < j, define β i i 2 j(z ) = +s i 2 i i 2 j (z )s i2, A i i 2 j(z ) = A(z ) s i s i s i2 s i 2 s j s j and similarly define β i,i 2,j(z 2 ) and Ăi i 2 j(z 2 ). We have EE j x n i,i 2,j (z )s i2 s i 2 i,i 2,j (z )β i,i 2,j(z )T n Ă i j (z 2) x n E j x n i 2 j ( z )T n Ă i 2 j ( z 2)x n E j x nă i j (z 2)T n T (z )(s i s i N T n ) i j (z )x n x nă i 2 j ( z 2)T n T ( z )(s i2 s i 2 N T n ) i 2 j ( z )x n K(E x n i,i 2,j (z )s i2 s i 2 i,i 2,j (z )β i,i 2,j(z )T n Ă i j (z 2)x n 2 ) /2 (E x nă i j (z 2)T n T (z )(s i s i N T n ) i j (z )x n 4 ) /4 (E x nă i 2 j (z 2)T n T (z )(s i2 s i 2 N T n ) i 2 j (z )x n 4 ) /4 = O(N 5/2 ), EE j x n i,i 2,j (z )T n Ă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2) β i,i 2,j(z 2 )x n E j x n i 2 j ( z )T n Ă i 2 j ( z 2)x n E j x nă i j (z 2)T n T (z )(s i s i N T n ) i j (z )x n x nă i 2 j ( z 2)T n T ( z )(s i2 s i 2 N T n ) i 2 j ( z )x n

26 Z. D. BAI, B. Q. MIAO AND G. M. PAN K(E x n A i,i 2,j (z )T n Ă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2) β i,i 2,j(z 2 )x n 2 ) /2 (E x nă i j (z 2)T n T (z )(s i s i N T n ) i j (z )x n 4 ) /4 (E x nă i 2 j (z 2)T n T (z )(s i2 s i 2 N T n ) i 2 j (z )x n 4 ) /4 = O(N 5/2 ) and, by (4.), EE j x n A i,i 2,j (z )T n Ă i,i 2,j (z 2)x n E j x n A i 2 j ( z )T n Ă i 2 j ( z 2)x n E j x nă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2) β i,i 2,j(z 2 )T n T (z ) (s i s i N T n ) i j (z )x n x nă i 2 j ( z 2)T n T ( z )(s i2 s i 2 N T n ) i 2 j ( z )x n K(E x nă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2) β i,i 2,j(z 2 )T n T (z ) (s i s i N T n ) i j (z )x n 2 ) /2 (E x nă i 2 j (z 2)T n T (z )(s i2 s i 2 N T n ) i 2 j (z )x n 2 ) /2 K(E x nă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2)T n T (z ) (s i s i N T n ) i j (z )x n 2 ) /2 O(N ) K(Ex nă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j (z 2)T n T (z )T n T ( z )T n Ă i,i 2,j ( z 2)s i2 s i 2 Ă i,i 2,j ( z 2)x n x n A i,j ( z )T n i j (z )x n ) /2 O(N 2 ) K(E x nă i,i 2,j (z 2)s i2 s i 2 Ă i,i 2,j ( z 2)x n 2 ) /4 (E s i 2 Ă i,i 2,j (z 2)T n T (z )T n T ( z )T n Ă i,i 2,j ( z 2)s 2 i 2 ) /4 O(N 2 ) = O(N 9/4 ). The conclusion (4.6) then follows from the above three estimates. Therefore, B 3 (z,z 2 ) = B 32 (z,z 2 ) + o p (), B 32 (z,z 2 ) = b n (z )E j x n ij (z )T n Ă ij (z 2)x n i<j E j x nă ij (z 2)s i s i Ă ij (z 2) β ij (z 2 )T n T (z )

ASYMPTOTICS OF EIGENVECTORS 27 (s i s i N T n ) ij (z )x n = b n (z )E j x n ij (z )T n Ă ij (z 2)x n i<j E j x nă ij (z 2)s i s i Ă ij (z 2) β ij (z 2 )T n T (z ) s i s i ij (z )x n + o p (). By (4.3) of [5], (4.2) and (4.4), for i < j, we have (4.7) E x nă ij (z 2)s i s i A ij (z )x n (s i Ă ij (z 2) β ij (z 2 )T n T (z )s i (E x nă ij (z 2)s i s i A ij (z )x 2 n )/2 [(E β ij (z 2 ) 2 s i Ă ij (z 2)T n T (z )s i N b n (z 2 )trt n Ă ij (z 2)T n T (z )) N trt n Ă ij (z 2)T n T (z ) 2 ) /2 + (E β ij (z 2 ) b n (z 2 ) 2 N trt n Ă ij (z 2)T n T (z ) 2 ) /2 ] = O(N 3/2 ). Collecting the proofs from (4.0) to (4.7), we have proved that B(z,z 2 ) = b n (z )b n (z 2 ) E j x n A ij (z )T n Ă ij (z 2)x n E j i<j + o p (). (x nă ij (z 2)s i s i A ij (z )x n N trt n trt n Ă ij (z 2)T n T (z )) Similarly to the proof of (4.6), we may further replace s i s i in the above expression by N T n, that is, B(z,z 2 ) = b n (z )b n (z 2 )N 2 E j x n ij (z )T n Ă ij (z 2)x n E j i<j (x nă ij (z 2)T n ij (z )x n trt n Ă ij (z 2)T n T (z )) + o p (). Reversing the above procedure, one finds that we may also replace ij (z ) and Ă ij (z 2) in B(z,z 2 ) by j (z ) and j (z 2 ), respectively. That is, B(z,z 2 ) = b n(z )b n (z 2 )(j ) N 2 E j x n A j (z )T n Ă j (z 2 )x n E j (x nă j (z 2 )T n j (z )x n trt n Ă j (z 2 )T n T (z )) + o p ().

28 Z. D. BAI, B. Q. MIAO AND G. M. PAN Using the martingale decomposition (4.8), one can further show that B(z,z 2 ) = b n(z )b n (z 2 )(j ) N 2 E j x n j (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n j (z )x n E j trt n Ă j (z 2 )T n T (z ) + o p (). It is easy to verify that N tr(t n M(z 2 )T n T (z )) = o p () when M(z 2 ) takes the value B j (z 2 ), C j (z 2 ) or D j (z 2 ). Thus, substituting the decomposition (4.0) for Ă j (z 2 ) in the above approximation for B(z,z 2 ), one finds that B(z,z 2 ) = b n(z )b n (z 2 )(j ) N 2 E j x n j (z )T n Ă j (z 2 )x n (4.8) j E j x nă j (z 2 )T n j (z )x n E j trt n T (z 2 )T n T (z ) + o p (). Finally, let us consider the first term of (4.). Using the expression for (z ) in (4.0), we obtain (4.9) where and E j x n A j (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n T (z )x n = W (z,z 2 ) + W 2 (z,z 2 ) + W 3 (z,z 2 ) + W 4 (z,z 2 ), W (z,z 2 ) = E j x n T (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n T (z )x n, W 2 (z,z 2 ) = b n (z )E j x n B j2 (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n T (z )x n, W 3 (z,z 2 ) = E j x ncj (z )T n Ă j (z 2 )x n E j x nă j (z 2 )T n T (z )x n W 4 (z,z 2 ) = E j x nd j (z )T n Ă By the same argument as (4.2), one can obtain (4.20) E W 3 (z,z 2 ) = o() j (z 2 )x n E j x nă j (z 2 )T n T (z )x n. and E W 4 (z,z 2 ) = o(). Furthermore, as in dealing with B(z,z 2 ), the first Ă j (z 2 ) in W 2 (z,z 2 ) can be replaced by b n (z 2 )Ă ij (z 2)s i s i Ă ij (z 2), that is, W 2 (z,z 2 )

= b n (z )b n (z 2 ) i<j ASYMPTOTICS OF EIGENVECTORS 29 E j x nt (z )(s i s i N T n ) ij (z )T n Ă ij (z 2)s i s iă ij (z 2)x n E j x nă j (z 2 )T n T (z )x n + o p () = b n (z )b n (z 2 ) E j x n T (z )s i s i A ij (z )T n i<j Ă ij (z 2)s i s i Ă ij (z 2)x n E j x nă j (z 2 )T n T (z )x n + o p () = b n(z )b n (z 2 )(j ) N 2 E j (x nt (z )T n Ă j (z 2 )x n trt n j (z )T n Ă j (z 2 )) E j x nă j (z 2 )T n T (z )x n + o p (). It can also be verified that x nm(z 2 )T n T (z )x n = o p (), when M(z 2 ) takes the value B j (z 2 ), C j (z 2 ) or D j (z 2 ). Therefore, W 2 (z,z 2 ) can be further approximated by (4.2) W 2 (z,z 2 ) = b n(z )b n (z 2 )(j ) N 2 (x n T (z )T n T (z 2 )x n ) 2 In (2.8) of [7], it is proved that E j tr(t n j (z )T n Ă (z 2 )) + o p (). E j tr(t n j (z )T n Ă j (z 2 )) = tr(t n T (z )T n T (z 2 )) + o p () (j )/N 2 z z 2 m(z )m(z 2 )tr(t n T (z )T n T (z 2 ). By the same method, W (z,z 2 ) can be approximated by W (z,z 2 ) = x n T (z )T n T (z 2 )x n x n T (z 2 )T n T (z )x n + o p (). (4.22) Consequently, from (4.) (4.22), we obtain (4.23) E j x n A j (z )T n Ĕ j ( j (z 2 )x n E j x nă j (z 2 ))T n j (z )x n [ j N b n(z )b n (z 2 ) ] N trt (z 2 )T n T (z )T n = x nt (z )T n T (z 2 )x n x nt (z 2 )T n T (z )x n

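\[ \times\Bigl(1 + \frac{j-1}{N}b_n(z_1)b_n(z_2)\frac{1}{N}E_{j-1}\operatorname{tr}(A_j^{-1}(z_1)T_n\breve{A}_j^{-1}(z_2)T_n)\Bigr) + o_p(1). \]

Recall that $b_n(z) \to -z\underline{m}(z)$ and $F^{T_n} \to H$. Hence,

\[ \begin{aligned}
d(z_1, z_2) &:= \lim b_n(z_1)b_n(z_2)\frac{1}{N}\operatorname{tr}(T^{-1}(z_1)T_nT^{-1}(z_2)T_n) \\
&= \int \frac{ct^2\underline{m}(z_1)\underline{m}(z_2)}{(1 + t\underline{m}(z_1))(1 + t\underline{m}(z_2))}\,dH(t)
 = 1 + \frac{\underline{m}(z_1)\underline{m}(z_2)(z_1 - z_2)}{\underline{m}(z_2) - \underline{m}(z_1)}.
\end{aligned} \tag{4.24} \]

By the conditions of Theorem 2,

\[ \begin{aligned}
h(z_1, z_2) &= \lim z_1z_2\underline{m}(z_1)\underline{m}(z_2)\,x_n^*T^{-1}(z_1)T_nT^{-1}(z_2)x_n\;x_n^*T^{-1}(z_2)T_nT^{-1}(z_1)x_n \\
&= \frac{\underline{m}(z_1)\underline{m}(z_2)}{z_1z_2}\Bigl(\int \frac{t\,dH(t)}{(1 + t\underline{m}(z_1))(1 + t\underline{m}(z_2))}\Bigr)^2 \\
&= \frac{\underline{m}(z_1)\underline{m}(z_2)}{z_1z_2}\Bigl(\frac{z_1\underline{m}(z_1) - z_2\underline{m}(z_2)}{c\,(\underline{m}(z_2) - \underline{m}(z_1))}\Bigr)^2.
\end{aligned} \tag{4.25} \]

From (4.10), (4.24) and (4.25), we get

\[ \begin{aligned}
(4.7) &\xrightarrow{i.p.} h(z_1, z_2)\Bigl(\int_0^1 \frac{1}{1 - td(z_1, z_2)}\,dt + \int_0^1 \frac{td(z_1, z_2)}{(1 - td(z_1, z_2))^2}\,dt\Bigr) \\
&= \frac{h(z_1, z_2)}{1 - d(z_1, z_2)} = \frac{(z_2\underline{m}(z_2) - z_1\underline{m}(z_1))^2}{c^2z_1z_2(z_2 - z_1)(\underline{m}(z_2) - \underline{m}(z_1))}.
\end{aligned} \]

5. Tightness of $M_n^1(z)$ and convergence of $M_n^2(z)$. First, we proceed to the proof of the tightness of $M_n^1(z)$. By (4.3), we obtain

\[ E\Bigl|\sum_{i=1}^r a_i\sum_{j=1}^N Y_j(z_i)\Bigr|^2 = \sum_{j=1}^N E\Bigl|\sum_{i=1}^r a_iY_j(z_i)\Bigr|^2 \le K. \]

Thus, as pointed out in [7], condition (i) of Theorem 12.3 of [8] is satisfied. Therefore, to complete the proof of tightness, we need only verify that

\[ \frac{E|M_n^1(z_1) - M_n^1(z_2)|^2}{|z_1 - z_2|^2} \le K \qquad \text{if } z_1, z_2 \in \mathcal{C}. \tag{5.1} \]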
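The passage from the two integrals over $[0,1]$ to $h/(1-d)$ uses that the integrands sum to $(1 - td)^{-2}$, whose integral over $[0,1]$ is $1/(1-d)$. The sketch below checks this numerically with composite Simpson quadrature; the function names and the sample values of $d$ are hypothetical choices made for illustration.

```python
def simpson(f, a, b, panels=2000):
    """Composite Simpson rule on [a, b]; panels must be even. Works for complex f."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def bracket_integral(d):
    """Integral over [0, 1] of 1/(1 - t d) + t d/(1 - t d)^2 for a fixed complex d."""
    return simpson(lambda t: 1 / (1 - t * d) + t * d / (1 - t * d) ** 2, 0.0, 1.0)
```

The variable $t$ plays the role of the limiting ratio $(j-1)/N$, so the integral is the limit of the average over $j$ in (4.7).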

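Write $Q(z_1, z_2) = \sqrt{N}\,x_n^*(A_n - z_1I)^{-1}(A_n - z_2I)^{-1}x_n$. Recalling the definition of $M_n^1$, we have

\[ \Bigl|\frac{M_n^1(z_1) - M_n^1(z_2)}{z_1 - z_2}\Bigr| \le \begin{cases}
|Q(z_1, z_2) - EQ(z_1, z_2)|, & \text{if } z_1, z_2 \in \mathcal{C}_0 \cup \bar{\mathcal{C}}_0, \\
|Q(z_1, z_{2+}) - EQ(z_1, z_{2+})| + |Q(z_1, z_{2-}) - EQ(z_1, z_{2-})|, & \text{if } z_1 \in \mathcal{C}_0 \cup \bar{\mathcal{C}}_0 \text{ only}, \\
|Q(z_+, z_-) - EQ(z_+, z_-)|, & \text{otherwise},
\end{cases} \]

where $\Re(z_{2\pm}) = \Re z_2$, $\Im(z_{2\pm}) = \pm\delta_nn^{-1}$, $\Im(z_\pm) = \pm\delta_nn^{-1}$ and $\Re(z_\pm) = u_l$ or $u_r$. Thus we need only show (5.1) when $z_1, z_2 \in \mathcal{C}_0 \cup \bar{\mathcal{C}}_0$. From the identity above (3.7) in [7], we obtain

\[ \frac{M_n^1(z_1) - M_n^1(z_2)}{z_1 - z_2} = \sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,x_n^*A^{-1}(z_1)A^{-1}(z_2)x_n = V_1(z_1, z_2) + V_2(z_1, z_2) + V_3(z_1, z_2), \tag{5.2} \]

where

\[ V_1(z_1, z_2) = \sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,\beta_j(z_1)\beta_j(z_2)\,s_j^*A_j^{-1}(z_1)A_j^{-1}(z_2)s_j\;s_j^*A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_1)s_j, \]

\[ V_2(z_1, z_2) = -\sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,\beta_j(z_1)\,s_j^*A_j^{-1}(z_1)A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_1)s_j \]

and

\[ V_3(z_1, z_2) = -\sqrt{N}\sum_{j=1}^N (E_j - E_{j-1})\,\beta_j(z_2)\,s_j^*A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_1)A_j^{-1}(z_2)s_j. \]

Applying (3.1) and the bounds for $\beta_j(z)$ and $s_j^*A_j^{-1}(z_1)A_j^{-1}(z_2)s_j$ given in the remark concerning (3.2) in [7], we obtain

\[ E|V_1(z_1, z_2)|^2 = N\sum_{j=1}^N E\bigl|(E_j - E_{j-1})\,\beta_j(z_1)\beta_j(z_2)\,s_j^*A_j^{-1}(z_1)A_j^{-1}(z_2)s_j\;s_j^*A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_1)s_j\bigr|^2 \]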

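\[ \le KN^2\bigl(E|s_j^*A_j^{-1}(z_2)x_nx_n^*A_j^{-1}(z_1)s_j|^2 + v^{-12}n^2P(\|A\| > u_r \text{ or } \lambda_{\min}^{A_{(1)}} < u_l)\bigr) \le K, \tag{5.3} \]

where (1.9a) and (1.9b) in [7] are also employed. It is easy to see that (1.9a) and (1.9b) in [7] also hold under our truncation case. Similarly, the above argument can also be used to treat $V_2(z_1, z_2)$ and $V_3(z_1, z_2)$. Therefore, we have completed the proof of (5.1).

Next, we will consider the convergence of $M_n^2(z)$. Note that

\[ m_{F^{c_n,H_n}}(z) = -\frac{1}{z}\int \frac{1}{1 + t\underline{m}_{F^{c_n,H_n}}(z)}\,dH_n(t). \tag{5.4} \]

Substituting (2.5) into (5.4), we obtain that

\[ \begin{aligned}
z\sqrt{N}\,(x_n^*E(A^{-1}(z))x_n - m_{F^{c_n,H_n}}(z))
&= \sqrt{N}\Bigl(x_n^*(-E\underline{m}_n(z)T_n - I)^{-1}x_n + \int \frac{1}{1 + t\underline{m}_{F^{c_n,H_n}}(z)}\,dH_n(t)\Bigr) \\
&\quad + \sqrt{N}\,z(\delta_1 + \delta_2 + \delta_3).
\end{aligned} \tag{5.5} \]

Applying (2.6)–(2.8), we have $\sqrt{N}\,z(\delta_1 + \delta_2 + \delta_3) = o(1)$. On the other hand, in Section 4 of [7], it is proved that

\[ \sup_z \sqrt{N}\,\bigl|\underline{m}_{F^{c_n,H_n}}(z) - E\underline{m}_n(z)\bigr| \to 0. \tag{5.6} \]

Following a similar line to (4.3) of [7], along with (4.2) of [7], we can obtain

\[ \sup_{n,\,z\in\mathcal{C}_0} \|(\underline{m}_{F^{c_n,H_n}}(z)T_n + I)^{-1}\| < \infty. \]

It follows, via (4.3) of [7] and the assumption of Theorem 2, that

\[ \sup_{n,\,z\in\mathcal{C}_0} \Bigl|\int \frac{t}{(1 + t\underline{m}_{F^{c_n,H_n}}(z))(tE\underline{m}_n(z) + 1)}\,dH_n(t)\Bigr| < \infty. \tag{5.7} \]

Appealing to (4.1), (4.3) in [7], (5.6) and (5.7), we conclude that

\[ \begin{aligned}
&\sqrt{N}\Bigl(x_n^*(E\underline{m}_n(z)T_n + I)^{-1}x_n - \int \frac{1}{E\underline{m}_n(z)t + 1}\,dH_n(t)\Bigr) \\
&\quad = \sqrt{N}\Bigl(x_n^*(\underline{m}_{F^{c_n,H_n}}(z)T_n + I)^{-1}x_n - \int \frac{1}{\underline{m}_{F^{c_n,H_n}}(z)t + 1}\,dH_n(t)\Bigr) + o(1).
\end{aligned} \]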
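Relation (5.4) can be illustrated in the simplest case. The sketch below assumes $H_n = \delta_1$ (that is, $T_n = I$), so that the companion transform solves the quadratic $z\underline{m}^2 + (z + 1 - c)\underline{m} + 1 = 0$; it then compares the right-hand side of (5.4) with the standard relation $\underline{m}(z) = -(1-c)/z + c\,m(z)$. The function names, the choice $H_n = \delta_1$ and the sample values are assumptions made for this sketch.

```python
import cmath


def companion_stieltjes(z, c):
    """Root of z*m^2 + (z + 1 - c)*m + 1 = 0 with positive imaginary part (Im z > 0).

    This is the equation z = -1/m + c/(1 + m), i.e. the case H = delta_1.
    """
    disc = cmath.sqrt((z + 1 - c) ** 2 - 4 * z)
    roots = [(-(z + 1 - c) + sign * disc) / (2 * z) for sign in (1, -1)]
    return next(m for m in roots if m.imag > 0)


def stieltjes_via_54(z, c):
    """Right-hand side of (5.4) with H = delta_1: -(1/z) * 1/(1 + m_underline)."""
    return -1 / (z * (1 + companion_stieltjes(z, c)))


def stieltjes_via_companion(z, c):
    """Invert m_underline = -(1 - c)/z + c*m to recover m."""
    return (companion_stieltjes(z, c) + (1 - c) / z) / c
```

Both routes give the same value, which in this special case is the Stieltjes transform of the Marchenko–Pastur law with ratio $c$.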

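Using (5.6) and (5.7), we also have

\[ \begin{aligned}
&\sqrt{N}\Bigl(\int \frac{1}{E\underline{m}_n(z)t + 1}\,dH_n(t) - \int \frac{1}{1 + t\underline{m}_{F^{c_n,H_n}}(z)}\,dH_n(t)\Bigr) \\
&\quad = \sqrt{N}\,(\underline{m}_{F^{c_n,H_n}}(z) - E\underline{m}_n(z))\int \frac{t}{(1 + t\underline{m}_{F^{c_n,H_n}}(z))(tE\underline{m}_n(z) + 1)}\,dH_n(t) = o(1).
\end{aligned} \]

Combining the above arguments, we can conclude that (5.5) $\to 0$.

6. Proof of Theorem 3 and supplement to Remark 7. In this section, when $T_n = I$, we will show that formula (1.12) includes (.2) of [5] as a special case and we will also present a proof of (1.14). First, one can easily verify that (1.15) reduces to (.2) of [5] when $g(x) = x^r$. Next, our goal is to prove that (1.12) implies (1.14) under the condition of Theorem 3. Write

\[ \begin{aligned}
(z_2\underline{m}(z_2) - z_1\underline{m}(z_1))^2
&= z_1z_2(\underline{m}(z_1) - \underline{m}(z_2))^2 + \underline{m}(z_1)\underline{m}(z_2)(z_2 - z_1)^2 \\
&\quad + z_2\underline{m}(z_2)(z_2 - z_1)(\underline{m}(z_2) - \underline{m}(z_1)) \\
&\quad + z_1\underline{m}(z_1)(z_2 - z_1)(\underline{m}(z_2) - \underline{m}(z_1)).
\end{aligned} \tag{6.1} \]

Recall that

\[ z = -\frac{1}{\underline{m}(z)} + c\int \frac{t}{1 + t\underline{m}(z)}\,dH(t), \]

from which [together with assumption (1.13)] we obtain

\[ \begin{aligned}
\frac{\underline{m}(z_1)\underline{m}(z_2)(z_2 - z_1)}{\underline{m}(z_2) - \underline{m}(z_1)}
&= 1 - c\int \frac{t^2\underline{m}(z_1)\underline{m}(z_2)\,dH(t)}{(1 + t\underline{m}(z_1))(1 + t\underline{m}(z_2))} \\
&= 1 - c\int \frac{t\underline{m}(z_1)}{1 + t\underline{m}(z_1)}\,dH(t)\int \frac{t\underline{m}(z_2)}{1 + t\underline{m}(z_2)}\,dH(t) \\
&= 1 - c^{-1}(1 + z_1\underline{m}(z_1))(1 + z_2\underline{m}(z_2)).
\end{aligned} \tag{6.2} \]

Replacing one copy of $z_2 - z_1$ by this in the second term on the right-hand side of (6.1), we obtain

\[ \begin{aligned}
(z_2\underline{m}(z_2) - z_1\underline{m}(z_1))^2
&= z_1z_2(\underline{m}(z_1) - \underline{m}(z_2))^2 \\
&\quad + (z_2 - z_1)(\underline{m}(z_2) - \underline{m}(z_1))\bigl[(1 - c^{-1})(1 + z_1\underline{m}(z_1) + z_2\underline{m}(z_2)) - c^{-1}z_1z_2\underline{m}(z_1)\underline{m}(z_2)\bigr].
\end{aligned} \tag{6.3} \]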
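Identity (6.1) is purely algebraic, while (6.3) also uses the equation satisfied by the companion transform. The sketch below checks (6.1) for arbitrary complex inputs and (6.3) in the case $H = \delta_1$ (so $T_n = I$, as in this section), where the transform solves $z\underline{m}^2 + (z + 1 - c)\underline{m} + 1 = 0$. The function names and sample values are hypothetical.

```python
import cmath


def companion_stieltjes(z, c):
    """Root of z*m^2 + (z + 1 - c)*m + 1 = 0 with positive imaginary part (Im z > 0)."""
    disc = cmath.sqrt((z + 1 - c) ** 2 - 4 * z)
    roots = [(-(z + 1 - c) + sign * disc) / (2 * z) for sign in (1, -1)]
    return next(m for m in roots if m.imag > 0)


def identity_61_gap(z1, z2, m1, m2):
    """|LHS - RHS| of (6.1); holds for any complex z1, z2, m1, m2."""
    lhs = (z2 * m2 - z1 * m1) ** 2
    rhs = (z1 * z2 * (m1 - m2) ** 2
           + m1 * m2 * (z2 - z1) ** 2
           + z2 * m2 * (z2 - z1) * (m2 - m1)
           + z1 * m1 * (z2 - z1) * (m2 - m1))
    return abs(lhs - rhs)


def identity_63_gap(z1, z2, c):
    """|LHS - RHS| of (6.3), with m1, m2 computed for H = delta_1."""
    m1, m2 = companion_stieltjes(z1, c), companion_stieltjes(z2, c)
    lhs = (z2 * m2 - z1 * m1) ** 2
    bracket = (1 - 1 / c) * (1 + z1 * m1 + z2 * m2) - z1 * z2 * m1 * m2 / c
    rhs = z1 * z2 * (m1 - m2) ** 2 + (z2 - z1) * (m2 - m1) * bracket
    return abs(lhs - rhs)
```

For $H = \delta_1$ the factorization in the second equality of (6.2) is automatic, which is why no extra condition is needed in this special case.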

34 Z. D. BAI, B. Q. MIAO AND G. M. PAN Using this and the facts that (), (2), (3), we obtain RHS { z g(0), if C encloses the origin, g(z)dz = 2πi C 0, otherwise, m(z)g(z)dz = g(x)df c,h (x), 2πi C m(z 2 ) m(z ) 4π 2 g (z )g 2 (z 2 )dz dz 2 = C C 2 z 2 z when C j enclose the origin, we obtain RHS of (.2) = 2 c 2 g (x)g 2 (x)df c,h (x) = 2 c + ( 2(c ) c 3 g (0)g 2 (0) g (0) g 2 (0) 2 c 3 ( g (x)df c,h (x) g (x)g 2 (x)df c,h (x) g 2 (x)df c,h (x) g 2 (x)df c,h (x) ) g (x)df c,h (x) g (x)df c,h (x) g (x)g 2 (x)df c,h (x), ) g 2 (x)df c,h (x). The same we can obtain when C j does not enclose the origin, we can obtain the same result even more easily. Thus the result is proved. 7. Truncation, recentralization and renormalization of X ij. In this section, to facilitate the analysis in the other sections, the underlying random variables need to be truncated, recentralized and renormalized. Since the argument for (.8) in [7] can be carried directly over to the present case, we can then select ε n such that (7.) ε n 0 and ε 4 n X 4 0. { X ε nn /4 } Let ˆXij = X ij I( X ij ε n n /4 ) EX ij I( X ij ε n n /4 ), Xn = X n ˆX n, and Ân = N T n /2 ˆXn ˆX n Tn /2, where ˆX n = ( ˆX ij ). Let σn 2 = E ˆX 2 and Ǎn = Nσ T /2 n 2 n ˆXn ˆX n Tn /2. Write (z) = (A n zi) and Ǎ (z) = (Ǎn zi).

ASYMPTOTICS OF EIGENVECTORS 35 Lemma 4. Suppose that X ij C,i =,...,n,j =,...,N, are i.i.d. with EX = 0,E X 2 = and E X 4 <. For z C 0, we then have Nx n (Ǎ (z)x n x n i.p. (7.2) (z))x n 0. Proof. Corresponding to the truncated and renormalized variables, we similarly define ŝ j, s j, (z), j (z), ˆβ j (z), ˆβ j j 2 (z) and ˆβ j2 j (z). We then have N(x n (z)x n x n A (z)x n ) = (x N nâ (z)tn /2 (X n ˆX n )Xn T n /2 A (z)x n (7.3) +x nâ (z)tn /2 ˆX n (Xn ˆX n /2 )Tn A (z)x n ) = N N x nâ (z) s j s j (z)x n + N N x nâ (z)ŝ j s j (z)x n = N N + N = ω + ω 2. β j (z)x nâ (z) s j s j j (z)x n N Consider first the term ω. (7.4) E ω 2 N ˆβ j (z)x nâ j (z)ŝ j s j (z)x n N E β j (z)x nâ (z) s j s j A j (z)x n 2 + N Eβ j (z)x nâ (z) s j s j j (z)x n j j 2 = ω + ω 2, β j2 ( z)x n A j 2 ( z)s j2 s j 2 ( z)x n where β j2 ( z) is the complex conjugate of β j2 (z). Our next goal is to show that the above two terms converge to zero for all z C u and z C l when u l < 0. In this case, β i (z) is bounded. It is straightforward to verify that E β j (z)x nâ j (z) s j s j j (z)x n 2 KE s j j (z)x n x nâ j (z) s j 2 = o(n 2 ),

36 Z. D. BAI, B. Q. MIAO AND G. M. PAN (7.5) E ŝ jâ j (z) s j E( ˆX X )N trâ j (z)t n 4 = o(n 2 ), Therefore, E s j A j (z)x n x nâ j (z)ŝ j 4 = O(N 3 ). E β j (z)ˆβ j (z)x nâ j (z)ŝ j ŝ jâ j (z) s j s j A j (z)x n 2 K(E ŝ jâ j (z) s j E( ˆX X )N trâ j (z)t n 4 ) /2 (E s j A j (z)x n x nâ j (z)ŝ j 4 ) /2 + E X 2 I( X ε n n /4 )E s j A j (z)x n x nâ j (z)ŝ j 2 = o(n 2 ). It then follows that N ω KN (E β j (z)ˆβ j (z)x nâ j (z)ŝ j ŝ jâ j (z) s j s j j (z)x n 2 (7.6) = o(). + E β j (z)x nâ j (z) s j s j j (z)x n 2 ) Now consider the term ω 2. We have where Eβ j (z)β j2 ( z)s j j (z)x n x n A j 2 ( z)s j2 s j 2 ( z)x n x nâ (z) s j = + 2 + 3 + 4, = Eβ j (z)β j2 ( z)ˆβ j (z)ˆβ j2 ( z)s j j (z)x n x n A j 2 ( z)s j2 s j 2 j 2 ( z)ŝ j2 ŝ j 2 j 2 ( z)x n x nâ j (z)ŝ j ŝ j j (z) s j, 2 = Eβ j (z)β j2 ( z)ˆβ j2 (z)s j j (z)x n x n A j 2 ( z)s j2 s j 2 j 2 ( z)ŝ j2 ŝ j 2 j 2 ( z)x n x nâ j (z) s j, 3 = Eβ j (z)β j2 ( z)ˆβ j (z)s j j (z)x n x n A j 2 ( z)s j2 s j 2 j 2 ( z)x n x nâ j (z)ŝ j ŝ j j (z) s j, 4 = Eβ j (z)β j2 ( z)s j j (z)x n x n A j 2 ( z)s j2 s j 2 j 2 ( z)x n x nâ j (z) s j. In the sequel, will be further decomposed. However, since the expansions of are rather complicated, as an illustration, we only present estimates for some typical terms; other terms can be estimated similarly. The main technique is to extract the j th and j 2 th column, respectively, from