Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices


by

Jack W. Silverstein*
Department of Mathematics, Box 8205
North Carolina State University
Raleigh, North Carolina 27695-8205

Summary

Let X be n × N containing i.i.d. complex entries with E|X_{11} - EX_{11}|^2 = 1, and let T be an n × n random Hermitian non-negative definite matrix, independent of X. Assume that, almost surely, as n → ∞, the empirical distribution function (e.d.f.) of the eigenvalues of T converges in distribution, and that the ratio n/N tends to a positive number. Then it is shown that, almost surely, the e.d.f. of the eigenvalues of (1/N)XX^*T converges in distribution. The limit is nonrandom and is characterized in terms of its Stieltjes transform, which satisfies a certain equation.

* Supported by the National Science Foundation under grant DMS-9404047.

AMS 1991 subject classifications. Primary 60F15; Secondary 62H99.

Key Words and Phrases. Random matrix, empirical distribution function of eigenvalues, Stieltjes transform.

1. Introduction.

For any square matrix A with only real eigenvalues, let F^A denote the empirical distribution function (e.d.f.) of the eigenvalues of A (that is, F^A(x) is the proportion of eigenvalues of A that are ≤ x). This paper continues the work on the e.d.f. of the eigenvalues of matrices of the form (1/N)XX^*T, where X is n × N containing i.i.d. complex entries with E|X_{11} - EX_{11}|^2 = 1, T is n × n random Hermitian non-negative definite, independent of X, and n, N are large but of the same order of magnitude. Assuming the entries of X and T to be real, it is shown in Yin [4] that, if n and N both converge to infinity while their ratio n/N converges to a positive quantity c, and the moments of F^T converge almost surely to those of a nonrandom probability distribution function (p.d.f.) H satisfying the Carleman sufficiency condition, then, with probability one, F^{(1/N)XX^*T} converges in distribution to a nonrandom p.d.f. F.

The aim of this paper is to extend the limit theorem to the complex case, with arbitrary H having mass on [0, ∞), assuming convergence in distribution of F^T to H almost surely. Obviously the method of moments, used in the proof of the limit theorem in Yin [4], cannot be further relied on. As will be seen, the key tool in understanding both the limiting behavior of F^{(1/N)XX^*T} and the analytic properties of F is the Stieltjes transform, defined for any p.d.f. G as the analytic function

    m_G(z) \equiv \int \frac{1}{\lambda - z}\, dG(\lambda),    z \in C^+ \equiv \{ z \in C : \mathrm{Im}\, z > 0 \}.

Due to the inversion formula

    G\{[a, b]\} = \frac{1}{\pi} \lim_{\eta \to 0^+} \int_a^b \mathrm{Im}\, m_G(\xi + i\eta)\, d\xi

(a, b continuity points of G), convergence in distribution of a tight sequence of p.d.f.'s is guaranteed once convergence of the corresponding Stieltjes transforms on a countable subset of C^+ possessing at least one accumulation point is verified. One reason for using the Stieltjes transform to characterize spectral e.d.f.'s is the simple way it can be expressed in terms of the resolvent of the matrix. Indeed, for p × p A having real eigenvalues λ_1, ..., λ_p,

    m_{F^A}(z) = \frac{1}{p} \sum_{i=1}^p \frac{1}{\lambda_i - z} = \frac{1}{p}\, \mathrm{tr}\,(A - zI)^{-1}

(tr denoting trace, and I the identity matrix).
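To make the resolvent expression above concrete, here is a brief numerical sketch (an added illustration, not part of the original paper; it assumes NumPy) that computes the Stieltjes transform of the spectral e.d.f. of a Hermitian matrix both from its eigenvalues and from the trace of its resolvent, and confirms that the two agree.

```python
import numpy as np

def stieltjes_from_eigs(eigs, z):
    """m_{F^A}(z) = (1/p) * sum_i 1/(lambda_i - z)."""
    return np.mean(1.0 / (eigs - z))

def stieltjes_from_resolvent(A, z):
    """m_{F^A}(z) = (1/p) * tr (A - zI)^{-1}."""
    p = A.shape[0]
    return np.trace(np.linalg.inv(A - z * np.eye(p))) / p

rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
A = (G + G.T) / 2                      # a small real symmetric test matrix
z = 0.3 + 1.0j                         # any point in the upper half plane C^+

print(stieltjes_from_eigs(np.linalg.eigvalsh(A), z))
print(stieltjes_from_resolvent(A, z))  # agrees with the previous line to machine precision
```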

Stieltjes transform methods are used in Marčenko and Pastur [1] on matrices of the form A + (1/N)X^*TX, where the entries of X have finite absolute fourth moments (independence is replaced by a mild dependency condition reflected in their mixed second and fourth moments), T = diag(τ_1, ..., τ_n) with the τ_i i.i.d. having p.d.f. H, where H is an arbitrary p.d.f. on R, and A is N × N nonrandom Hermitian with F^A converging vaguely to a (possibly degenerate) distribution function. For ease of exposition, we confine our discussion to the case A = 0. It is proven that F^{(1/N)X^*TX}(x) converges in probability to \underline{F}(x) for all x ≠ 0 (it can be shown that \underline{F} is absolutely continuous away from 0; see Silverstein and Choi [3] for results on the analytic behavior of \underline{F} and F), where m_{\underline{F}} is the solution to

(1.1)    m = -\left( z - c \int \frac{\tau\, dH(\tau)}{1 + \tau m} \right)^{-1}

in the sense that, for every z ∈ C^+, m = m_{\underline{F}}(z) is the unique solution in C^+ to (1.1).

Under the assumptions on X originally given in this paper, strong convergence for T diagonal is proven in Silverstein and Bai [2] with no restriction on {τ_1, ..., τ_n} other than that its e.d.f. converges in distribution to H almost surely. The proof takes an approach more direct than in Marčenko and Pastur [1] (which involves the construction of a certain partial differential equation), providing a clear understanding of why m_{\underline{F}} satisfies (1.1), at the same time displaying where random behavior primarily comes into play (basically from Lemma 2.1 given below).

The difference between the spectra of (1/N)XX^*T and (1/N)X^*TX is |n - N| zero eigenvalues, expressed via their e.d.f.'s by the relation

    F^{(1/N)X^*TX} = \Big(1 - \frac{n}{N}\Big) I_{[0,\infty)} + \frac{n}{N}\, F^{(1/N)XX^*T}

(I_B denoting the indicator function of the set B). It follows that their Stieltjes transforms satisfy

(1.2)    m_{F^{(1/N)X^*TX}}(z) = -\frac{1 - \frac{n}{N}}{z} + \frac{n}{N}\, m_{F^{(1/N)XX^*T}}(z),    z \in C^+.

Therefore, in the limit (when \underline{F} and F are known to exist), \underline{F} = (1 - c) I_{[0,\infty)} + cF, and

(1.3)    m_{\underline{F}}(z) = -\frac{1 - c}{z} + c\, m_F(z),    z \in C^+.

From (1.1) and (1.3) it is straightforward to conclude that, for each z ∈ C^+, m = m_F(z) is a solution to

(1.4)    m = \int \frac{dH(\tau)}{\tau(1 - c - czm) - z}.

It is unique in the set {m ∈ C : -(1 - c)/z + cm ∈ C^+}.
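Equation (1.4) rarely has a closed form, but for a given z ∈ C^+ it can be solved numerically. The sketch below (an added illustration, not part of the original paper; it assumes NumPy and takes H to be a discrete distribution) uses a plain fixed-point iteration, which in practice converges for z whose imaginary part is not too small; a damped iteration can be substituted if it does not.

```python
import numpy as np

def solve_m_F(z, c, taus, weights, tol=1e-12, max_iter=10_000):
    """Solve (1.4), m = int dH(tau) / (tau*(1 - c - c*z*m) - z), by fixed-point iteration.

    H is the discrete p.d.f. putting mass weights[k] at taus[k]; c is the limit of n/N.
    """
    m = -1.0 / z                      # starting guess: Stieltjes transform of a point mass at 0
    for _ in range(max_iter):
        m_new = np.sum(weights / (taus * (1 - c - c * z * m) - z))
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

# Example: H puts mass 1/2 at 1 and mass 1/2 at 3, with c = 0.5.
taus = np.array([1.0, 3.0])
weights = np.array([0.5, 0.5])
m = solve_m_F(z=2.0 + 0.5j, c=0.5, taus=taus, weights=weights)
print(m, m.imag > 0)                  # the solution lies in C^+, as a Stieltjes transform must
```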

It is remarked here that (1.1) reveals much of the analytic behavior of \underline{F}, and consequently of F (Silverstein and Choi [3]), and should be viewed as another indication of the importance of Stieltjes transforms to these types of matrices. Even when H has all moments, it seems unlikely that much information about F can be extracted from the explicit expressions for the moments of F given in Yin [4].

Using again the Stieltjes transform as the essential tool in analyzing convergence, this paper will establish strong convergence of F^{(1/N)XX^*T} to F under the weakest assumptions on non-negative definite T. In order to keep notation to a minimum, the statement of the result and its proof will be expressed in terms of the Hermitian matrix (1/N)T^{1/2}XX^*T^{1/2}, T^{1/2} denoting a Hermitian square root of T. The following theorem will be proven.

Theorem 1.1. Assume on a common probability space:

a) For n = 1, 2, ..., X_n = (X^n_{ij}), n × N, with X^n_{ij} ∈ C, i.d. for all n, i, j, independent across i, j for each n, and E|X^1_{11} - EX^1_{11}|^2 = 1.

b) N = N(n) with n/N → c > 0 as n → ∞.

c) T_n is n × n random Hermitian non-negative definite, with F^{T_n} converging almost surely in distribution to a p.d.f. H on [0, ∞) as n → ∞.

d) X_n and T_n are independent.

Let T_n^{1/2} be the Hermitian non-negative square root of T_n, and let B_n = (1/N)T_n^{1/2}X_nX_n^*T_n^{1/2} (obviously F^{B_n} = F^{(1/N)X_nX_n^*T_n}). Then, almost surely, F^{B_n} converges in distribution, as n → ∞, to a (nonrandom) p.d.f. F, whose Stieltjes transform m(z) (z ∈ C^+) satisfies (1.4), in the sense that, for each z ∈ C^+, m = m(z) is the unique solution to (1.4) in

    D_c \equiv \{ m \in C : -\tfrac{1 - c}{z} + cm \in C^+ \}.

The proof will be given in the next section. Much of the groundwork has already been laid out in Silverstein and Bai [2], in particular the first step, which is to truncate and centralize the entries of X, and Lemma 2.1. Therefore we will, on occasion, refer the reader to the latter paper for further details.
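As an added numerical illustration of Theorem 1.1 (not part of the original paper; it assumes NumPy and the hypothetical solve_m_F sketch given after (1.4)), one can compare the empirical Stieltjes transform (1/n) tr (B_n - zI)^{-1} of a simulated B_n with the solution of (1.4). The two should be close for moderately large n, and closer as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 400, 800                       # c = n/N = 0.5
z = 2.0 + 0.5j

# T_n: diagonal, eigenvalues drawn from H = (1/2) delta_1 + (1/2) delta_3
taus_n = rng.choice([1.0, 3.0], size=n)

# X: i.i.d. standardized complex entries (Gaussian, for simplicity)
X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)

# B_n = (1/N) T^{1/2} X X^* T^{1/2}
Y = np.sqrt(taus_n)[:, None] * X      # T^{1/2} X
B = Y @ Y.conj().T / N

m_emp = np.mean(1.0 / (np.linalg.eigvalsh(B) - z))   # empirical Stieltjes transform of F^{B_n}
m_lim = solve_m_F(z, c=n / N, taus=np.array([1.0, 3.0]), weights=np.array([0.5, 0.5]))
print(abs(m_emp - m_lim))             # small, and shrinking as n, N grow with n/N fixed
```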

Proof of Theorem.. As in Silverstein and Bai [2], the dependency of the variables on n will occasionally be dropped. Through two successive stages of truncations and centraliations (truncate at ± N, centralie, truncate at ± ln N, centralie) and a final scaling, the main part of section 3 in Silverstein and Bai [2] argues that the assumptions on the entries of X can be replaced by standardied variables bounded in absolute value by a fixed multiple of ln N. WriteT in its spectral decompositon: T = U(diag(τ,...,τ n ))U. By replacing the diagonal matrix T α in that paper with U(diag(τ I (τ α),...,τ n I (τn α)))u, exactly the same argument applies in the present case. However, the following proof requires truncation of the eigenvalues of T. It is shown in Silverstein and Bai [2] (and used in section 3) that for any N n matrix Q, ifα = α n then F QT Q F QT αq a.s. 0 as n, where here denotes the sup norm on functions. Therefore, we may assume along with the conditions in Theorem. ) X log n, where log n denotes the logarithm of n with a certain base (defined in Silverstein and Bai [2]), 2) EX =0,E X 2 = 3) T log n, where here and throughout the following denotes the spectral norm on matrices. The following two results are derived in Silverstein and Bai [2]. The first accounts for much of the truth of Theorem. due to random behavior. The second relies on the following fact (which contributes much to the form of equation (.4)): For n nband q C n for which B and B + qq are invertible, (2.) q (B + qq ) = +q B q q B, (follows from q B (B + qq )=(+q B q)q ). Lemma 2. (Lemma 3. of Silverstein and Bai [2]). Let C be an n n matrix with C, and Y =(X,...,X n ) T, where the X i s are i.i.d. satisfying conditions ) and 2) above. Then E Y CY tr C 6 Kn 3 log 2 n where the constant K does not depend on n, C, nor on the distribution of X. Lemma 2.2 (Lemma 2.6 of Silverstein and Bai [2]). Let C + with v = Im, A and B n n with B Hermitian, and q C n. Then tr ( (B I) (B + τqq I) ) A A v. 4

The next lemma contains some additional inequalities.

Lemma 2.3. For z = u + iv ∈ C^+, let m_1(z), m_2(z) be Stieltjes transforms of any two p.d.f.'s, let A and B be n × n with A Hermitian non-negative definite, and let r ∈ C^n. Then

a) ||(m_1(z)A + I)^{-1}|| ≤ max(4||A||/v, 2),

b) |tr B((m_1(z)A + I)^{-1} - (m_2(z)A + I)^{-1})| ≤ |m_2(z) - m_1(z)| n ||B|| ||A|| (max(4||A||/v, 2))^2,

c) |r^*B(m_1(z)A + I)^{-1}r - r^*B(m_2(z)A + I)^{-1}r| ≤ |m_2(z) - m_1(z)| ||r||^2 ||B|| ||A|| (max(4||A||/v, 2))^2

(||r|| denoting the Euclidean norm of r).

Proof: Notice that b) and c) follow easily from a), using basic matrix properties. We have, for any positive x,

    |m_1(z)x + 1|^2 = (\mathrm{Re}\, m_1(z)\, x + 1)^2 + (\mathrm{Im}\, m_1(z))^2 x^2.

Using the Cauchy-Schwarz inequality it is easy to show |Re m_1(z)| ≤ (Im m_1(z)/v)^{1/2}. This leads us to consider minimizing over real m the expression (vx)^2 m^4 + (mx + 1)^2, or, with y = mx, the function f(y) = ay^4 + (y + 1)^2, where a = (v/x)^2. Upon considering y on either side of -1/2 we find f(y) ≥ min(a/16, 1/4), from which a) follows.

We proceed with the proof of Theorem 1.1. Fix z = u + iv ∈ C^+. Let \underline{B}_n = (1/N)X^*TX, m_n = m_{F^{B_n}}, and \underline{m}_n = m_{F^{\underline{B}_n}}. Let c_n = n/N.

In Silverstein and Bai [2] it is argued that, almost surely, the sequence {F^{\underline{B}_n}}, for diagonal T, is tight. The argument carries directly over to the present case. Thus the quantity

    \delta \equiv \inf_n \mathrm{Im}\, \underline{m}_n(z) \geq \inf_n \int \frac{v\, dF^{\underline{B}_n}(\lambda)}{2(\lambda^2 + u^2) + v^2}

is positive almost surely.

For j = 1, 2, ..., N, let q_j = (1/\sqrt{n})X_j (X_j denoting the j-th column of X), r_j = (1/\sqrt{N})T^{1/2}X_j, and B_{(j)} = B_{n(j)} = B_n - r_jr_j^*. Write

    B_n - zI + zI = \sum_{j=1}^N r_jr_j^*.

Taking the inverse of B_n - zI on the right on both sides and using (2.1), we find

    I + z(B_n - zI)^{-1} = \sum_{j=1}^N \frac{r_jr_j^*(B_{(j)} - zI)^{-1}}{1 + r_j^*(B_{(j)} - zI)^{-1}r_j}.

Taking the trace on both sides and dividing by N, we have

    c_n + z c_n m_n(z) = \frac{1}{N} \sum_{j=1}^N \frac{r_j^*(B_{(j)} - zI)^{-1}r_j}{1 + r_j^*(B_{(j)} - zI)^{-1}r_j} = 1 - \frac{1}{N} \sum_{j=1}^N \frac{1}{1 + r_j^*(B_{(j)} - zI)^{-1}r_j}.

From (1.2) we see that

(2.2)    \underline{m}_n(z) = -\frac{1}{N} \sum_{j=1}^N \frac{1}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1}r_j\big)}.

For each j we have

    \mathrm{Im}\, r_j^*\big((1/z)B_{(j)} - I\big)^{-1}r_j = \frac{1}{2i}\, r_j^*\Big( \big((1/z)B_{(j)} - I\big)^{-1} - \big((1/\bar{z})B_{(j)} - I\big)^{-1} \Big)r_j = \frac{v}{|z|^2}\, r_j^*\big((1/z)B_{(j)} - I\big)^{-1} B_{(j)} \big((1/\bar{z})B_{(j)} - I\big)^{-1}r_j \geq 0.

Since z(1 + r_j^*(B_{(j)} - zI)^{-1}r_j) = z + r_j^*((1/z)B_{(j)} - I)^{-1}r_j, its imaginary part is at least v. Therefore

(2.3)    \Big| \frac{1}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1}r_j\big)} \Big| \leq \frac{1}{v}.

Write

    B_n - zI - \big( -z\underline{m}_n(z)T_n - zI \big) = \sum_{j=1}^N r_jr_j^* - \big( -z\underline{m}_n(z) \big)T_n.

Taking inverses and using (2.1), we have

    \big( -z\underline{m}_n(z)T_n - zI \big)^{-1} - (B_n - zI)^{-1} = \big( -z\underline{m}_n(z)T_n - zI \big)^{-1} \Big[ \sum_{j=1}^N r_jr_j^* - \big( -z\underline{m}_n(z) \big)T_n \Big] (B_n - zI)^{-1}

        = -\frac{1}{z} \sum_{j=1}^N \frac{\big( \underline{m}_n(z)T_n + I \big)^{-1} r_jr_j^*(B_{(j)} - zI)^{-1}}{1 + r_j^*(B_{(j)} - zI)^{-1}r_j} - \underline{m}_n(z)\big( \underline{m}_n(z)T_n + I \big)^{-1} T_n (B_n - zI)^{-1}.

Taking the trace, dividing by n, and using (2.2), we find

(2.4)    \frac{1}{n}\, \mathrm{tr}\, \big( -z\underline{m}_n(z)T_n - zI \big)^{-1} - m_n(z) = -\frac{1}{N} \sum_{j=1}^N \frac{d_j}{z\big(1 + r_j^*(B_{(j)} - zI)^{-1}r_j\big)},

where

    d_j = q_j^* T^{1/2} (B_{(j)} - zI)^{-1} \big( \underline{m}_n(z)T_n + I \big)^{-1} T^{1/2} q_j - \frac{1}{n}\, \mathrm{tr}\, \big( \underline{m}_n(z)T_n + I \big)^{-1} T_n (B_n - zI)^{-1}.

From Lemma 2.2 we see that

    \max_{j \leq N} \big| m_n(z) - m_{F^{B_{(j)}}}(z) \big| \leq \frac{1}{nv}.
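Identity (2.2) holds exactly for every finite n, which makes it a convenient sanity check on the notation. The sketch below (an added illustration, not part of the original paper; it assumes NumPy) builds a small B_n, forms the r_j and B_{(j)}, and compares the two sides of (2.2).

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 30, 50
z = 1.5 + 0.7j

taus = rng.uniform(0.5, 2.0, size=n)              # eigenvalues of a diagonal T_n
X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)

R = np.sqrt(taus)[:, None] * X / np.sqrt(N)       # column j is r_j = (1/sqrt(N)) T^{1/2} X_j
B = R @ R.conj().T                                # B_n = sum_j r_j r_j^*

# Left side: underline m_n(z), the Stieltjes transform of F^{(1/N) X^* T X}
B_under = (X.conj().T * taus) @ X / N
lhs = np.mean(1.0 / (np.linalg.eigvalsh(B_under) - z))

# Right side of (2.2)
terms = []
for j in range(N):
    r = R[:, j]
    Bj = B - np.outer(r, r.conj())                # B_{(j)} = B_n - r_j r_j^*
    quad = r.conj() @ np.linalg.solve(Bj - z * np.eye(n), r)
    terms.append(1.0 / (z * (1.0 + quad)))
rhs = -np.mean(terms)

print(abs(lhs - rhs))                             # ~ machine precision
```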

Let, for each j,

    \underline{m}_{(j)}(z) = -\frac{1 - c_n}{z} + c_n\, m_{F^{B_{(j)}}}(z).

From (1.2) we have, for any positive p,

(2.5)    \max_{j \leq N} \log^p n\, \big| \underline{m}_n(z) - \underline{m}_{(j)}(z) \big| \to 0 \quad \text{as } n \to \infty.

Also, by writing

    \underline{m}_{(j)}(z) = -\frac{1}{Nz} + \Big( \frac{N-1}{N} \Big) \Big( -\frac{1 - \frac{n}{N-1}}{z} + \frac{n}{N-1}\, m_{F^{B_{(j)}}}(z) \Big),

from (1.2) we see that \underline{m}_{(j)} is the Stieltjes transform of a p.d.f.

Using condition 3), Lemma 2.1, Lemma 2.3 a), the fact that q_j is independent of both B_{(j)} and \underline{m}_{(j)}(z), and the bound ||(A - zI)^{-1}|| ≤ 1/v for any Hermitian matrix A, we find

    E\, \big|\, \|q_j\|^2 - 1 \,\big|^6 \leq \frac{K \log^{12} n}{n^3}

and, for n sufficiently large,

    E\, \Big| q_j^* T^{1/2} (B_{(j)} - zI)^{-1} \big( \underline{m}_{(j)}(z)T_n + I \big)^{-1} T^{1/2} q_j - \frac{1}{n}\, \mathrm{tr}\, T^{1/2} (B_{(j)} - zI)^{-1} \big( \underline{m}_{(j)}(z)T_n + I \big)^{-1} T^{1/2} \Big|^6 \leq \frac{K\, 4^6 \log^{24} n}{v^{12} n^3}.

Therefore we have, almost surely, as n → ∞,

(2.6)    \max_{j \leq N} \max\Big[ \big|\, \|q_j\|^2 - 1 \,\big|,\ \Big| q_j^* T^{1/2} (B_{(j)} - zI)^{-1} \big( \underline{m}_{(j)}(z)T_n + I \big)^{-1} T^{1/2} q_j - \frac{1}{n}\, \mathrm{tr}\, T^{1/2} (B_{(j)} - zI)^{-1} \big( \underline{m}_{(j)}(z)T_n + I \big)^{-1} T^{1/2} \Big| \Big] \to 0.

We concentrate on a realization for which (2.6) holds, {F^{\underline{B}_n}} is tight (implying δ > 0), and F^{T_n} converges in distribution to H. From condition 3), Lemma 2.2, Lemma 2.3 b), c), (2.5), and (2.6), we find that max_{j ≤ N} |d_j| → 0 as n → ∞. Therefore, from (2.3) and (2.4),

    \frac{1}{n}\, \mathrm{tr}\, \big( -z\underline{m}_n(z)T_n - zI \big)^{-1} - m_n(z) \to 0 \quad \text{as } n \to \infty.

Consider a subsequence {n_i} on which {m_{n_i}(z)} (bounded in absolute value by 1/v) converges to a number m. Let \underline{m} = -(1 - c)/z + cm be the corresponding limit of \underline{m}_{n_i}(z). We have Im \underline{m} ≥ δ, so that m ∈ D_c.

We use the fact that, for m ∈ C^+ and τ ∈ R, |1/(mτ + 1)| ≤ |m|/Im m and |τ/(mτ + 1)| ≤ 1/Im m, to conclude that the function f(τ) = 1/(\underline{m}τ + 1) is bounded and satisfies

    \Big| \frac{1}{\underline{m}_{n_i}(z)\tau + 1} - f(\tau) \Big| \leq \frac{|\underline{m}|}{\delta^2}\, \big| \underline{m}_{n_i}(z) - \underline{m} \big|.

Therefore

    \frac{1}{n_i}\, \mathrm{tr}\, \big( -z\underline{m}_{n_i}(z)T_{n_i} - zI \big)^{-1} = \int \frac{1}{-z\big(\underline{m}_{n_i}(z)\tau + 1\big)}\, dF^{T_{n_i}}(\tau) \to \int \frac{1}{-z(\underline{m}\tau + 1)}\, dH(\tau)

as n → ∞. Thus m satisfies (1.4). Since m is unique, we have m_n(z) → m. Thus, with probability one, F^{B_n} converges in distribution to F, having Stieltjes transform defined through (1.4). This completes the proof of Theorem 1.1.
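As a closing numerical illustration (added here, not part of the original paper; it assumes NumPy and uses Gaussian entries for simplicity rather than the truncated variables of the proof), the concentration supplied by Lemma 2.1, which drives (2.6) and with it the whole argument, can be observed directly: for a fixed Hermitian C with ||C|| ≤ 1, the quadratic forms q_j^* C q_j concentrate around (1/n) tr C uniformly over many independent columns as n grows.

```python
import numpy as np

rng = np.random.default_rng(4)

def max_deviation(n, N):
    """max_j |q_j^* C q_j - (1/n) tr C| over N independent columns, with q_j = X_j / sqrt(n)."""
    # A fixed deterministic Hermitian C with spectral norm 1 (an orthogonal projection).
    C = np.zeros((n, n))
    C[: n // 2, : n // 2] = np.eye(n // 2)
    X = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
    quad = np.einsum('ij,ij->j', X.conj(), C @ X).real / n   # q_j^* C q_j for every column j
    return np.max(np.abs(quad - np.trace(C) / n))

for n in [100, 400, 1600]:
    print(n, max_deviation(n, 2 * n))   # the maximum deviation over j shrinks as n grows
```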

REFERENCES

[1] V. A. Marčenko and L. A. Pastur, Distribution of eigenvalues for some sets of random matrices, Math. USSR-Sb. 1 (1967), pp. 457-483.

[2] J. W. Silverstein and Z. D. Bai, On the empirical distribution of eigenvalues of a class of large dimensional random matrices, J. Multivariate Anal. 54 (1995), pp. 175-192.

[3] J. W. Silverstein and S. I. Choi, Analysis of the limiting spectral distribution of large dimensional random matrices, J. Multivariate Anal. 54 (1995), pp. 295-309.

[4] Y. Q. Yin, Limiting spectral distribution for a class of random matrices, J. Multivariate Anal. 20 (1986), pp. 50-68.