arxiv: v2 [math.pr] 16 Aug 2014

Size: px
Start display at page:

Download "arxiv: v2 [math.pr] 16 Aug 2014"

Transcription

1 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS VAN VU DEPARTMENT OF MATHEMATICS, YALE UNIVERSITY arxiv: v2 [math.pr] 6 Aug 204 KE WANG INSTITUTE FOR MATHEMATICS AND ITS APPLICATIONS, UNIVERSITY OF MINNESOTA Abstract. We present a concentration result concerning random weighted projections in high dimensional spaces. As applications, we prove New concentration inequalities for random quadratic forms. The infinity norm of most unit eigenvectors of a random ± matrix is of order O( log n/n). An estimate on the threshold for the local semi-circle law which is tight up to a log n factor.. Introduction.. Projection of a random vector. Consider C n with a subspace H of dimension d. Let X = (ξ,..., ξ n ) be a random vector. In all considerations in this paper, we assume that X is in isotropic position, namely EX X = Id. The length of the orthogonal projection of X onto H is an important parameter which plays an essential role in the studies of random matrices and related areas. In [26], Tao and the first author showed that (under certain conditions) this length is strongly concentrated. In other words, the projection of X onto H lies essentially on a circle centered at the origin. This fact played a crucial role in the studies of the determinant of a random matrix with independent entries (see [26, 20]). We say that ξ is K-bounded if ξ K with probability. Lemma. (Projection lemma, [26] ). Let X = (ξ,..., ξ n ) be a random vector in C n whose coordinates ξ i are independent K-bounded random variables with mean 0 and variance, where K 0(E ξ i 4 + ) for all i. Let H be a subspace of dimension d and Π H X be the length of the projection of X onto H. Then P( Π H X d t) 0 exp( 20K 2 ). The projection lemma follows from the Talagrand s inequality ([7, Chapter 4]). The constants 0 and 20 are rather arbitrary. We make no attempt to optimize the constants in this paper. Notation. We use standard asymptotic notations such as O, o, Θ, ω, etc., under the assumption that n. For a vector X, X is its Euclidean norm and X its infinity norm. For a matrix t2 Key words and phrases. random weighted projections; random quadratic forms; infinity norm of eigenvectors; local semi-circle law; random covariance matrix. V. Vu is supported by research grants DMS-09026, DMS and AFOSAR-FA

2 2 V. VU AND K. WANG A C n n, A F and A 2 denote its Frobenius and spectral norm, respectively. All eigenvectors have unit length..2. Weighted projection lemmas. Let us fix an orthonormal basis {u,..., u d } of H. We can express Π H X as () Π H X = ( d u i X 2 ) /2. i= In recent studies, we came up with situations when the roles of the axes are not compatible. Formally speaking, one is required to consider a weighted version of () where ( d i= u i X 2 ) /2 is replaced by ( d i= c i u i X 2 ) /2 with c i being non-negative numbers (weights). We are able to prove a variant of Lemma. for this more general problem, which turns out to be useful in a number of applications, some of which will be discussed in the paper. Furthermore, we can also weaken the assumption on random vector X in various ways. We say a random vector X = (ξ,..., ξ n ) is K-concentrated (where K may depend on n) if there are constants C, C > 0 such that for any convex, -Lipschitz function F : C n R and any t > 0 (2) P( F (X) M(F (X)) t) C exp( C t 2 K 2 ), where M(Y ) denotes the median of a random variable Y (choose an arbitrary one as there are many). Notice that the notion of K-concentrated is somewhat similar to the notion of threshold in random graph theory in the sense that if X is K-concentrated then it is ck-concentrated for any constant c > 0. (Similarly, if p(n) is a threshold for a property P (say, containing a triangle) then cp(n) is also a threshold.) One can also replace the median by the expectation (see Lemma 2.). The dependence on K on the RHS of (2) is flexible (one can replace K 2 by any function f(k)); however, the quality of the concentration bound will depend on f(k) and we leave it as an exercise for the reader to work out this dependence. Examples of K-concentrated random variables If the coordinates of X are iid standard gaussian (real or complex), then X is -concentrated (see [7]). If ξ i are independent and ξ i are K-bounded for all i, then X is K-concentrated (this is a corollary of Talagrand s inequality; see [7, Chapter 4] or [27, Theorem F.5]). If X satisfies the log-sobolev inequality with parameter K 2, then it is K-concentrated (see [7, Theorem 5.3]). The coordinates ξ i of X come from a random walk satisfying certain mixing properties (see [25, Corollary 4]; in this corollary Γ plays the role of K). In this and the previous example, the coordinates of X are not necessarily indepedent. Lemma.2. Let X = (ξ,..., ξ n ) be a K-concentrated random vector in C n. Then there are constants C, C > 0 (which depend on, but could be different from the constants in (2)) such that the following holds. Let H be a subspace of dimension d with an orthonormal basis {u,..., u d }.

3 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 3 Then for any c,..., c d 0 P d c j u j X 2 d c j t C exp( C t 2 K 2 ). In particular, by squaring, it follows that d d (3) P c j u jx 2 c j 2t d c j + t 2 C exp( C t 2 K 2 ). Another way to weaken the K-bounded assumption is to consider truncation. If ξ is not bounded, but has light tail, then by setting K appropriately, we can show that P( ξ K) is negligible with respect to the probability bound we want to prove. Assume that the ξ i are independent with mean zero and variance one. Choose a number K > and let ε := max i n P( ξ i > K). Set ξ i := ξ ii ξi K and let µ i and σi 2 denote its mean and variance. Set ε 2 := max i n µ i and ε 3 := max i n σi 2. Assume all ε j /2 (in practice this assumption is satisfied easily). Lemma.3. There are constants C, C > 0 such that the following holds. Let X = (ξ,..., ξ n ) be a random vector in C n whose coordinates ξ i are independent random variables with mean 0 and variance. Under the above notations, we have, for any c,..., c n 0 and any t > 0 (4) P d c j u j X 2 d c j t + 4n 2 K 2 (ε 2 + ε 3 ) C exp( C t 2 K 2 ) + nε..3. Concentration of random quadratic forms. Consider a quadratic form Y := X AX where X = (ξ,..., ξ n ) is, as usual, a random vector and A = (a ij ) i,j n a deterministic matrix. As application of the new projection lemmas, we prove a large deviation result for Y, which can be seen as the quadratic version of the standard Chernoff bound. Theorem.4 (Concentration of quadratic forms I). Let X be a K-concentrated random vector in C n. Then there are constants C, C > 0 such that for any matrix A (5) P( X AX trace(a) t) C log n exp( C K 2 t min{ A 2 }). F log n, A 2 Theorem.5 (Concentration of quadratic forms II). Let X and ε, ε 2, ε 3 be as in Lemma.3. There are constants C, C > 0 such that the following holds. Assume n 2 K 2 (ε 2 + ε 3 ) = o(), then P( X AX trace(a) t) C log n exp( C K 2 t min{ A 2 }) + nε. F log n, A 2 As an illustration, let us consider the case when ξ i are sub-exponential. We say that ξ is subexponential with exponent α > 0 (with accompanying positive constants a and b) if for any t > 0 P( ξ Eξ t α ) a exp( bt). Corollary.6 (Concentration of quadratic forms with sub-exponential variables). Assume that ξ i are independent and sub-exponential (with exponent α > 0) random variables with mean 0 and variance. Then there are constants C, C > 0 such that for any t satisfying (6) t = ω(( A F + log α n A 2 ) log α+ n), t 2 t 2

4 4 V. VU AND K. WANG we have (7) P( X AX trace(a) t) C exp( C t t min{( ) α+/2, ( ) A F log n A 2 2α+ }). Quadratic forms of random variables appear frequently in probability and the large deviation problem has been considered by several researchers, with first and perhaps most famous result being by Hanson-Wright inequality [4]. In many cases, our results improve earlier results significantly; see Section 3 for more details..4. Norm of random eigenvectors. Let M n be a symmetric ± matrix (the upper diagonal and diagonal entries are iid Bernoulli random variables taking values ± with probability /2). This is an important object in both probabilistic combinatorics and the theory of random matrices. Let u be an arbitrary unit eigenvector of M n. We investigate the following natural question, How big is u? A good bound on the infinity norm of the eigenvectors is important in spectral analysis of graphs and many other applications, such as the studies of nodal domains (see for instance [7] and the references therein). Recently, it plays a crucial role in breakthrough works concerning local statistics of random matrices (see Section.5 and also [8, 3] for surverys). Set W n = n M n. Thanks to the classical Wigner s semi-circle law (see Section.5), we know that most of the eigenvalues of W n belong to the interval ( 2 + ɛ, 2 ɛ). Using our new concentration results, we are able to obtain (what we believe to be) the optimal bound for the eigenvectors corresponding to these eigenvalues. Theorem.7 (Infinity norm of eigenvectors). Let M n be a n n symmetric matrix whose upper diagonal entries are iid random variables that takes values ± with the same probability. Let W n = n M n. For any constant C > 0, there is a constant C 2 > 0 such that the following holds. (Bulk case) With probability at least n C, for any fixed ɛ > 0 and any i n with λ i (W n ) [ 2 + ɛ, 2 ɛ], there is a unit eigenvector u i (W n ) of λ i (W n ) satisfying u i (W n ) C 2 log /2 n n. (Edge case) With probability at least n C, for any ɛ > 0 and any i n with λ i (W n ) [ 2 ɛ, 2 + ɛ] [2 ɛ, 2 + ɛ], there is a unit eigenvector u i (W n ) of λ i (W n ) satisfying u i (W n ) C 2 log n n. The best previous bound was of the form logc n for some large (usually not explicit) constant n /2 C [9, 0,, 27]. We conjecture that the bound O( log n/n) is sharp (it is easy to see that this is the case if the entries are standard gaussian) and also that it holds for all eigenvectors. Very recently, Rudelson and Vershynin [24] also studied the norm of random eigenvectors using a geometric method, which is different from our approach discussed in Section.5.

5 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 5.5. The local semi-circle law. Denote by ρ sc the semi-circle density function with support on [ 2, 2], { (8) ρ sc (x) := 2π 4 x 2, x 2 0, x > 2. Let us recall the classical Wigner s semi-circle law: Theorem.8 (Semi-circular law). Let M n be a random Hermitian matrix whose entries on and above the diagonal are iid bounded random variables with zero mean and unit variance and W n = n M n. Then for any real number x, lim n n { i n : λ i(w n ) x} = x 2 ρ sc (y) dy in the sense of probability, where we use I to denote the cardinality of a finite set I. The key tool for bounding the infinity norm of an eigenvector is a statement of the following type: any interval of length at least T (which tends to zero with n) in the spectrum [ 2, 2] contains an eigenvalue, with high probability. The quality of the bound will depend on how small T is. This approach was developed by Erdős, Schlein and Yau in [9, 0, ], leading to eigenvector norm bounds of order n 2/3, n 3/4 and finally n +o(). A simpler argument, following the same approach, was developed by Tao and the first author in [27] (see [27, Section 4] for a problem concerning random non-hermitian matrices). One way to attack the above problem is to show that the semi-circle law holds for small intervals (or at small scale). Intuitively, we would like to have with high probability that N I n ρ sc (x) dx δn I, I for any interval I and fixed δ > 0, where N I denotes the number of eigenvalues of W n := n M n in the interval I. Of course, the reader can easily see that I cannot be arbitrarily short (since N I is an integer). Following [], we call a statement of this kind a local semi-circle law (LSCL). A natural question arises: how short can I be? Formally, we say that the LSCL holds at a scale f(n) if with probability o() N I n ρ sc (x) dx δn I, I for any interval I in the bulk of length ω(f(n)) and any fixed δ > 0. Furthermore, we say that f(n) is a threshold scale if the LSCL holds at scale f(n) but does not holds at scale g(n) for any function g(n) = o(f(n)). (The reader may notice a similarity between this definition and the definition of threshold functions for random graphs.) We would like to raise the following problem. Problem.9. Determine the threshold scale (if exists). We do not know a sharp estimate for the threshold for any matrix ensembles, even in the basic GUE (random matrix with complex gaussian entries) and GOE (random matrix with real gaussian entries) cases. A recent result by Ben Arous and Bourgade [] shows that the maximum gap between two consecutive (bulk) eigenvalues of GUE is of order Θ( log n/n), with high probability. Thus, if we partition the bulk into intervals of length α log n/n for a sufficiently small α, one of these intervals contains at most one eigenvalue. Therefore, we expect that in natural ensembles,

6 6 V. VU AND K. WANG the LSCL does not hold below the log n/n scale. In [, 29], upper bound of the form log C n/n was proved for some large value of C. Here we are going to show Theorem.0 (Threshold for local semi-circle law). Let M n be a Hermitian matrix whose upper diagonal entries are independent random variables with mean 0 and variance. Assume furthermore that for i n, the vectors X i, obtained by deleting the i-th entry of the i-th row vector of M, are K-concentrated. Then the threshold scale for LSCL is bounded from above by K 2 log n/n. In the GUE case, the gap between the upper and lower bound is only O( log n) and it is an intriguing problem to remove this factor. We also conjecture that Ben Arous and Bourgade s result on the largest gap holds for ± random matrices. The results of Section.4 and Section.5 also hold for random sample covariance matrices. We sketch the results and proofs in the appendices. Structure of the paper. In the next section, we prove the new projection lemmas. In Section 3, we prove the new concentration inequalities for quadratic forms and make a comparison with prior results. The next section, Section 4 can be seen as a preparation step in which we recall facts about random matrices. We prove the new threshold for the local semi-circle law in Section 5, and the bound on the infinity norm of eigenvectors in Section 6. The appendices contain proofs concerning random sample covariance matrices. Acknowledgement. The authors would like to thank the anonymous referees for their careful reading and constructive suggestions. Proof of Lemma.2. Set f(x) := 2. Proofs of Lemmas.2 and.3 d c j u j X 2. Thus, f is a function from C n to R. We first observe that f(x) is convex. Indeed, for 0 λ, µ with λ + µ = and any X, Y C n, by Cauchy-Schwardz inequality, f(λx + µy ) d c j (λ u j X + µ u j Y )2 λ d c j u j X 2 + µ d c j u j Y 2 = λf(x) + µf(y ). Next, we show that f(x) is -Lipschitz. Notice that f(x) is convex, one has d u j X 2 X. Since f(x) 2 f(x) = f( 2 X) = f( 2 (X Y ) + 2 Y ) 2 f(x Y ) + f(y ). 2 Thus f(x) f(y ) f(x Y ) and f(y ) f(x) f(y X) = f(x Y ), which imply f(x) f(y ) f(x Y ) X Y.

7 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 7 Thus, by the definition of K-concentrated property, (9) P( f(x) M(f(X)) t) C exp( C t 2 K 2 ) for some constants C, C > 0. d To conclude the proof, it suffices to show M(f(X)) c j = O(K). We use the following lemma. The proof of the lemma is classical and thus omitted. Lemma 2.. Let Y be a real random variable. Assume P( Y µ t) f(t), where 0 f(x)dx = O(), then EY µ = O(). Assume furthermore that Y is non-negative, µ 0, σ = EY 2, and 0 xf(x)dx = O(), then EY σ = O(). To apply this lemma, set c i := c i max i d c i, Y := d K i= c i u i X 2 and µ := M(Y ). We have, by the K-concentration property P( Y µ t) = P( d c i u i X 2 M( d c i u i X 2 ) tk) C exp( C t 2 ). i= i= Set f(x) = C exp( C x 2 ). The assumptions on f(x) in Lemma 2. are trivially satisfied. As X is isotropic, σ 2 = EY 2 = d K 2 i= c i. It follows from Lemma 2. that M(Y ) = d c i K + O(). i= Renormalizing, we obtain M( d c i u i X 2 ) = d c i + O(K max c i), i d i= which concludes the proof of Lemma.2. Proof of Lemma.3. Recall that ξ i = ξ ii ξi K has mean µ i and variance σi 2. The parameters ε = max i n P( ξ i > K), ε 2 = max i n µ i and ε 3 = max i n σi 2 satisfy all ε j /2. Define ξ i := ξ i µ i σ i. The ξ i are independent with mean zero and variance and are 2K-bounded. Let X := (ξ,..., ξ n) and X := ( ξ,..., ξ n ). It is obvious that P d c j u j X 2 d c j t P d c j u j X 2 d c j t + nε. i= The next observation is that if ε 2, ε 3 are small, then i d c i u i X 2 and i d c i u i X 2 are more or less the same. By definition, we have with probability one ξ i ξ i = ξ i (σ i ) + µ i σ i 2(Kε 3 + ε 2 ).

8 8 V. VU AND K. WANG It follows that D := X X has norm at most 2n /2 (Kε 3 + ε 2 ) with probability one. On the other hand, c i u i X 2 c i u X i 2 2 c i u i X i u i D + c i u i D 2. i d i d i d i d As u i are unit vectors, u i X i X i nk and u i D i D i 2 n(kɛ 2 + ɛ 3 ) (these bounds are generous and can be improved by a polynomial factor in certain cases, but in applications such improvement rarely matters). It follows, again rather generously, i d c i u i X 2 i d c i u i X 2 4n d c i K 2 (ε 2 + ε 3 ) 4n 2 K 2 (ε 2 + ε 3 ). i= Applying Lemma.2 for X, we obtain Lemma.3. In practice, ε j are typically super-polynomially small, i.e. n ω(), which yields 4n 2 K 2 (ε 2 + ε 3 ) = o(). This term can be ignored (by slightly changing the values of C, C if necessary) and we end up with a more friendly inequality (0) P( d c j u j X 2 d c j t) C exp( C t 2 K 2 ) + nε. In the sub-exponential case, for a sufficiently large K (compared to a and b), ε j exp( b 2 K/α ) for j =, 2, 3. For K = ω(log α n), n 2 K 2 exp( b 2 K/α ) = o() and (0) yield () P d c j u j X 2 d c j t C exp( C t 2 K 2 ) + n exp( b 2 K/α ). 3. Random Quadratic Forms 3.. Proofs of new results. Let us first prove Theorem.4. Notice that if Y = X AX, then Y + Ȳ = X (A + A )X and Y Ȳ = X (A A )X. Since we have Y trace A = 2 [(Y + Ȳ ) trace(a + A )] + 2 [(Y Ȳ ) trace(a A )], P( Y trace A t) P( (Y +Ȳ ) trace(a+a ) t)+p( (Y Ȳ ) trace( (A A )) t). Moreover, as A + A F, A A F = O( A F ) and A + A 2, A A 2 = O( A 2 ), it suffices to prove the theorem in the case A is Hermitian. Next, we observe that any Hermitian matrix A can be written as A := A A 2 where A i are positive semi-definite and max i=,2 A i 2 A 2, max i=,2 A i F A F. (In fact, the positive eigenvalues of A are the positive eigenvalues of A and the positive eigenvalues of A 2 are the absolute values of the negative eigenvalues of A.) This enables us to further reduce the problem to the case when A is positive semi-definite.

9 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 9 Finally, as the content of the theorem is invariant under scaling, we can assume that A 2 =. Let = c c 2,..., c n 0 be the eigenvalues of A together with corresponding orthonormal eigenvectors {u,..., u n }. We have n n (2) X AX trace(a) = c j u jx 2 c j. This is precisely the setting of the projection lemmas. By Lemma.2, we know that for any numbers 0 d j, j J, (3) P( d j u jx 2 d j 2t d i + t 2 ) C exp( C K 2 t 2 ). j J j J i However, it is somewhat wasteful to apply this directly to (2). We will perform an extra partition step. Set J k := { j n : 4 k+ c j 4 k }, 0 k k 0 := 0 log n, and let J k0 + be the collection of the remaining indices. For each 0 k k 0 +, apply Lemma.2 to d i := 4 k c i, c i J k, we have, for any s 0 P( 4 k c i ( u i X 2 ) 2s 4 k c i + s 2 ) C exp( C K 2 s 2 ). i J k i J k t A F Set s := and simplify by 4 k, the above inequality becomes P( c i ( u i X 2 2t t 2 ) 2 k c i + A F 4 k A 2 ) C exp( C K 2 t 2 i J k i J k F A 2 ). F Apparently, k 0 + k=0 t 2 4 k A 2 F 2 t2. Moreover, A 2 i J c F k0 i n n 5 = n 4 and + c i k /2 k 0 0 ( 4 k c i ) /2 i J k i J k 0 k k 0 2 k by Cauchy-Schwartz inequality. k=0 k 0 8 log /2 n( c 2 i ) /2 i J k k=0 8 log /2 n A F, Putting the above estimates together and using the union bound, we obtain n P( c i ( u i X 2 ) 6 log /2 nt + 2 t2 A 2 + n 2 ) C log n exp( C K 2 t 2 F A 2 ). F i= We can ignore the small term n 2 (by slightly adjusting the constant 6), the desired bound follows.

10 0 V. VU AND K. WANG Remark 3.. If we have more information about A, the log n term can be improved. For instance, if all eigenvalues of A are comparable, then we do not need this term. The proof of Theorem.5 uses Lemma.3 and is left as an exercise. To prove Corollary.6, notice that we can obtain an analogue of () (4) P( X AX trace(a) t) C exp( C K 2 t min{ A 2 }) + n exp( b F log n, A 2 2 K/α ), under the assumption that K = ω(log α n). To optimize the bound, we choose K such that K 2 min{ t 2 setting K := min{( t A F log n ) 2 2+/α, ( t A 2 ) 2+/α }. Assume t 2 t A 2 F log n, (5) t = ω(( A F + log α n A 2 ) log α+ n). A 2 } = K /α. This leads to This assumption guarantees K = ω(log α n). It also implies n exp( b 2 K/α ) exp( b 3 K/α ), proving Corollary Comparison to earlier results. In 97, Hanson and Wright [4] obtained the first important inequality for sub-gaussian random variables. Theorem 3.2 (Hanson-Wright inequality). Let X = (ξ,..., ξ n ) R n be a random vector with ξ i being iid symmetric and sub-gaussian random variables with mean 0 and variance. There exist constants C, C > 0 such that the following holds. Let A be a real matrix of size n with entries a ij and B := ( a ij ). Then (6) P( X T AX trace(a) t) C exp( C min{ t2 t A 2, }) F B 2 for any t > 0. Later, Wright [34] extended Theorem 3.2 to non-symmetric random variables. Recently, Hsu, Kakade and Zhang [6] showed that one can obtain a better upper tail (notice that B 2 is replaced by A 2 ) (7) P(X T AX trace(a) t) C exp( C min{ t2 t A 2, }) F A 2 under a considerably weaker assumption (which, in particular, does not require the ξ i to be independent). On the other hand, their method does not cover the lower tail. Let us pause here to point out a strong distinction from the linear case and the quadratic case: In the linear case (Chernoff type bounds), the lower tail follows from the upper tail by simply switching ξ i to ξ i, but this trick is useless in the quadratic case. Recently, Rudelson and Vershynin [23] proved the Hanson-Wright inequality (8) P( X T AX trace(a) t) C exp( C min{ t2 t A 2, }), F A 2 assuming ξ i are sub-gaussian. In the previous papers, the random variables ξ i are required to be real. Few years ago, motivated by the delocalization problem for random matrices, Erdős, Schlein and Yau [] considered the

11 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS complex case. By assuming either both the real and imaginary parts of ξ i are iid sub-gaussian or the distribution of ξ i is rotationally symmetric (real and imaginary parts still sub-gaussian), they proved (9) P( X AX trace(a) t) C exp( C t A F ). Later, Erdős, Yau and Yin [3] showed that if ξ i are independent sub-exponential random variables with exponent α > 0, having mean 0 and variance, then (20) P( X AX trace(a) t) C exp( C t ( ) A F 2+2α ). To simplify the comparison, let us ignore the log n terms in our theorems (which play little role in practice). If K = O(), then the main difference between Theorem 3.2 of Hanson and Wright and Theorem.4 is that the term B 2 in Theorem 3.2 is now replaced by A 2. It is easy to see that B 2 A 2 for any real matrix A. In fact, in many cases, B 2 is significantly larger than A 2. For instance, a random matrix A with entries of order typically has spectral norm of order n, but in this case it is clear that B 2 has spectral norm of order n (as all row sums are of this order). The same holds for several classical explicit matrices, such as the Hadamard matrix. In these cases, our bound improves Hanson-Wright s significantly. Furthermore, our result applies in the complex case while the approach used by Hanson and Wright is restricted to the real case. Comparing to (9), we do not need the fairly restricted assumption that either both the real and imaginary parts of ξ i are iid sub-gaussian or the distribution of ξ i is rotationally symmetric. In the t case K = O(), both terms 2 t t and A 2 F log n A 2 in our bound can be considerably larger than A F. For instance, differ by a factor n in both the random and Hadamard cases. t A 2 and t A F In order to make a Hanson-Wright type bound non-trivial, we need to assume t A F + A 2. In many applications, we want the probability bound to be polynomially or even super-polynomially small, i.e. n O() or n ω(). This requires a lower bound log Ω() n( A F + A 2 ) on t, which is consistent with the assumption (6) in Corollary.6. Notice that (7) compares favorably to (20). For the term t A F, the exponent α+/2 is superior to (notice that we are talking about a double exponent, so an improvement here could improve 2α+2 the quality of the bound quite a lot). For the term t A 2, the exponent 2α+ 2α+2. Furthermore, A 2 can be significantly smaller than A F, as discussed earlier. is still better than 4. Random matrices and the Stieltjes transform This section serves as a preparation, in which we recall several facts about random matrices. The empirical spectral distribution (ESD) function of the n n Hermitian matrix W n := n M n = n (ζ ij ) i,j n is a one-dimensional function F Wn (x) = n { j n : λ j(w ) x}, where I denotes the cardinality of a set I. We are going to focus on the case when the entries of M n are K-bounded; it is easy to extend this assumption to K-concentrated.

12 2 V. VU AND K. WANG The Stieltjes transform of a real measure µ(x) is defined for any complex number z not in the support of µ as s(z) = x z dµ(x). Thus, the Stieltjes transform s n (z) of W n is s n (z) = x z df Wn (x) = n n λ i (W n ) z. R Furthermore, the Stieltjes transform s sc (z) of the semi-circle distribution is ρ sc (x) z + z s sc (z) := dx = 2 4, x z 2 R R where z 2 4 is the branch of square root with a branch cut in [ 2, 2] and asymptotically equals z at infinity [4]. The beauty (and power) of the Stieltjes transform lies in the fact that it has a clear linear algebra content; s n (z) of W n is exactly the trace of the matrix (W n zi). This allows us to compute the Stieltjes transform by looking at the diagonal entries of (W n zi). In matrix theory, Stieltjes transform plays the role Fourier transform in analysis. If the Stieltjes transforms of two spectral measures are close to each other (for all z), then the two measures are more or less the same. In particular, if s n (z) is close to s sc (z), then the spectral distribution of W n is close to the semi-circle distribution (see for instance [4, Chapter ], [0]). We are going to use the following lemma. Lemma 4.. Let M n be a random Hermitian matrix with independent K-bounded entries with mean 0 and variance. Let /n < η < /0 and L, ε, δ > 0. For any constant C > 0, there exists a constant C > 0 such that if one has the bound s n (z) s sc (z) δ with probability at least n C uniformly for all z with Re(z) L and Im(z) η, then for any interval I in [ L + ε, L ε] with I max(2η, η δ log δ ), one has N I n ρ sc (x) dx δn I with probability at least n C. I i= This is [29, Lemma 64], which, in turn, is a variant of [0, Corollary 4.3]. An appropriate application of Lemma 4. will imply Theorem.0. (As a matter of fact, we a going to prove a little bit more.) In order to use this lemma, we set L = 4, ε =, and critically η := K2 C 2 log n nδ 6, where C = C We are going to show that (2) s n (z) s sc (z) = o(δ) holds with probability at least n C for any fixed z in the region {z C : Re(z) 4, Im(z) η}. Notice that in this statement we fix z. However, it is simple to strengthen the statement to hold for all z, using an ɛ-net argument, exploiting the fact that s n (z) is Lipschitz continuous with Lipschitz constant O(n 2 ) (for details, we refer to [9, Theorem.] or [29, Section 5.2]).

13 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 3 In order to show that s n (z) is close to s sc (z), the key observation is that s sc (z) can also be defined by the equation (22) s sc (z) = z + s sc (z). This equation is stable, so if we can show s n (z) z+s n(z) then it follows that s n(z) s sc (z). This observation was due to Bai et al. [2], who used it to prove the n /2 rate of convergence of s n (z) to s sc (z). In [9, 0, ], Erdős et al. refined Bai s approach to prove local semi-circle law at scales finer than n /2, ultimately to n log C n [0]. Our main contribution here is to push the scale further down to n log n, which we believe is (at most) a factor log n from the truth. Recall that s n (z) is the trace of (W n zi). By computing the diagonal entires, one can show (see [4, Chapter ], [0] or [29, Lemma 39]) (23) s n (z) = n where n ζ kk k=, n z Y k Y k = a k (W n,k zi) a k and W n,k is the matrix W n with the k-th row and column removed, and a k is the k-th row of W n with the k-th element removed. The entries of a k are independent of each other and of W n,k, and have mean zero and variance /n. By linearity of expectation we have where E(Y k W n,k ) = n trace(w n,k zi) = ( n )s n,k(z) s n,k (z) := n n i= λ i (W n,k ) z is the Stieltjes transform of W n,k. From the Cauchy interlacing law, we can get s n (z) ( n )s n,k(z) = O( dx) = O( n x z 2 nη ) = o(δ2 ) and thus E(Y k W n,k ) = s n (z) + o(δ 2 ). R The heart of the matter now is the following concentration result. Lemma 4.2. Let M n be as in Lemma 4.. For k n, Y k = E(Y k W n,k ) + o(δ 2 ) holds with probability at least O(n C ) for any z with Re(z) 4 and Im(z) η. To prove this lemma, we are going to make an essential use of the weighted projection lemma, as showed in the next section.

14 4 V. VU AND K. WANG 5. Proof of Lemma 4.2 and Threshold of the Local Law We are going to prove Lemma 4.2 and the following more quantitative version of Theorem.0. Theorem 5.. For any constants ɛ, δ, C > 0, there is a constant C 2 > 0 such that the following holds. Let M n be a Hermitian matrix whose upper diagonal entries are independent random variables with mean 0 and variance. Assume furthermore that for i n, the vectors X i, obtained by deleting the i-th entry of the i-th row vector of M n, are K-concentrated. Then with probability at least n C, we have N I n ρ sc (x) dx δn ρ sc (x) dx, I I for all interval I ( 2 + ɛ, 2 ɛ) of length at least C 2 K 2 log n/n. First, we record a lemma that provides a crude upper bound on the number of eigenvalues in short intervals. Lemma 5.2. Let M n be a random Hermitian matrix with independent K-bounded entries with mean 0 and variance. For any constant C > 0, there exists a constant C 2 > 0 such that for any interval I R with I C 2K 2 log n n, N I n I with probability at least n C. This lemma is Proposition 66 in [29], which is a variant of [, Theorem 5.]. Notice that (24) Y k = a k (W n,k zi) a k = n u j (W n,k ) a k 2 λ j (W n,k ) z = n n u j (W n,k ) X k 2 λ j (W n,k ) z, where X k = na k is the k-th row of M n with the k-th element removed. Note that the entries of X k are independent with mean 0 and variance. Therefore, (25) Y k E(Y k W n,k ) = n n u j (W n,k ) X k 2 λ j (W n,k ) z = n n R j λ j (W n,k ) x η, where R j := u j (W n,k ) X k 2. By symmetry, we can restrict the sum to those indices j where λ j (W n,k ) x 0. Let J be the set of indices j such that 0 λ j (W n,k ) x η. Since x = Rez, η = Imz, we have n j J nη j J n j J R j λ j (W n,k ) x η λ j (W n,k ) x (λ j (W n,k ) x) 2 +η 2 R j + n j J (λ j (W n,k ) x)η (λ j (W n,k ) x) 2 +η 2 R j + nη j J η (λ j (W n,k ) x) 2 +η 2 R j η 2 (λ j (W n,k ) x) 2 +η 2 R j. Consider the sum S := (λ j (W n,k ) x)η j J (λ j (W n,k R ) x) 2 +η 2 j. As 0 (λ j(w n,k ) x)η (λ j (W n,k, we are in position ) x) 2 +η 2 to apply Lemma.2. Taking t = C 4 K log n with a sufficiently large constant C 4, by (3) we have S C 4 nη (2K J log n + C 4 K 2 log n)

15 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 5 with probability at least C exp( C C4 2 log n) n C4/2. By Lemma 5.2, J Bnη with probability at least n C 4, for some sufficiently large constant B > 0. Recall η := K2 C3 2 log n ; it nδ 6 follows that with probability at least 2n C4/2 we have S C 4 C 3 δ3 (2 B + C 4 C 3 δ3 ). Thus, for C 3 sufficiently large compared to C 4 and B, we have S δ 3. Similarly, we can prove the same bound for S 2 := nη η 2 j J (λ j (W n,k R ) x) 2 +η 2 j. For the other eigenvalues, we divide the real line into small intervals. For integer l 0, let J l be the set of eigenvalues λ j (W n,k ) such that 0 l η < λ j (W n,k ) x 0 l+ η. The number of such J l is at most 20 log n. By Lemma 5.2 one has, J l 9B0 l nη with probability at least n C 4, for some sufficiently large constant B > 0. Again by Lemma.2 (taking t = KC 4 log n), n R j λ j J j (W n,k ) x η l n λ j x (λ j x) 2 + η 2 R j + n η (λ j x) 2 + η 2 R j j J l j J l 0 l nη j J l 0 l η(λ j x) (λ j x) 2 + η 2 R j + 0 2l nη j J l (0 l η) 2 (λ j x) 2 + η 2 R j 2C 4K 0 l nη (2 J l log n + KC 4 log n) δ 3 BC 4 C 3 0 l/2+2 with probability at least 2C exp( C C 2 4 log n) n C 4 n C 4/2. Summing over l, we have n l j J l R j λ j (W n,k ) x η 200C 4C 3 Bδ3 δ 3 with probability at least n C 4/2+, for C 3 sufficiently large. This completes the proof of Lemma 4.2. Inserting the bounds into (23), one has s n (z) + n n k= s n (z) + z + o(δ 2 ) = 0 with probability at least O(n C ). The term ζ kk / n = o(δ 2 ) as ζ kk K by assumption. Comparing this equation with (22), one can use a continuity argument (see [28] for details) to obtain s n (z) s(z) δ with probability at least O(n C+00 ). By Lemma 4., it follows that for random matrices M n with K-bounded entries, for any constant C > 0, there exists a constant C 2 > 0 such that for 0 δ /2 and any interval I ( 3, 3) of length at least C 2 K 2 log n/nδ 8, (26) N I n ρ sc (x) dx δn I holds with probability at least n C. In particular, Theorem 5. follows. I

16 6 V. VU AND K. WANG 6. The infinity norm of eigenvectors We prove Theorem.7 in the following more general form. Theorem 6. (Optimal infinity norm of eigenvectors). Let M n be a Hermitian matrix whose upper diagonal entries are independent random variables with mean 0 and variance. Further assume that for any index i n, the vector X i, obtained by deleting the i-th entry of the i-th row vector of M n, is K-concentrated. Let W n = n M n. Then for any constant C > 0, there is a constant C 2 > 0 such that the following holds. (Bulk case) With probability at least n C, for any ɛ > 0 and any i n with λ i (W n ) [ 2 + ɛ, 2 ɛ] there is a unit eigenvector u i (W n ) of λ i (W n ) satisfying u i (W n ) C 2K log /2 n n. (Edge case) With probability at least n C, for any ɛ > 0 and any i n with λ i (W n ) [ 2 ɛ, 2 + ɛ] [2 ɛ, 2 + ɛ], there is a unit eigenvector u i (W n ) of λ i (W n ) satisfying u i (W n ) C 2K 2 log n n. We give here the proof of the first part of Theorem 6.. The proof of the second part is somewhat different and deferred to the appendix. With the threshold for local semi-circle law, we are able to derive the eigenvector delocalization results thanks to the next lemma. Lemma 6.2 (Eq (4.3), [9] or Lemma 4, [29]). Let ( ) a Y W n = Y W n be an n n Hermitian matrix for some a C and Y C n, and let ( x v ) be an eigenvector of W n with eigenvalue λ i (W n ), where x C and v C n. Assume none of the eigenvalues of W n equals λ i (W n ). Then x 2 = + n (λ j(w n ) λ i (W n )) 2 u j (W n ) Y, 2 where u j (W n ) is a unit eigenvector corresponding to the eigenvalue λ j (W n ). The assumption that the eigenvalues of W n and W n do not collide was taken care of in [32, Section 3.], so we can assume that the above formula makes sense in applications. First, for the bulk case, for any λ i (W n ) ( 2 + ε, 2 ε), by Theorem 5., one can find an interval I ( 2 + ε, 2 ε), centered at λ i (W n ) and with length I = K 2 C log n/n, such that N I δ n I (δ > 0 small enough) with probability at least n C 0. By Cauchy interlacing law, we can find a set J {,..., n } with J N I /2 such that λ j (W n ) λ i (W n ) I for all j J. Let X be the first column of M n with the first entry removed. Then X = ny.

17 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 7 By Lemma 6.2, we have (27) x 2 = + n (λ j(w n ) λ i (W n )) 2 u j (W n ) n X 2 + j J (λ j(w n ) λ i (W n )) 2 u j (W n ) n X 2 + n I 2 j J u j(w n ) X n I 2 J 200 I /δ K2 C2 2 log n n for some constant C 2 with probability at least n C 0. The third inequality follows from (3) by taking t = δ K C log n (say). Thus, by union bound and symmetry, u i (W n ) C 2K log /2 n n n C. holds with probability at least Appendix A. Proof for the Edge case of Theorem 6. For the edge case in Theorem 6., we use a different approach based on the next lemma. Lemma A. (Interlacing identity, Lemma 37, [28]). Let W n be the matrix W n with the n-th row and n-th column removed and Y is the n-th column of W n with the n-th element ζ nn / n removed. If none of the eigenvalues of W n equals λ i (W n ), then (28) n u j (W n ) Y 2 λ j (W n ) λ i (W n ) = n ζ nn λ i (W n ). By symmetry, it suffices to consider the case λ i (W n ) [2 ɛ, 2 + ɛ] for ɛ > 0 very small. Denote X the n-th column of M n with the n-th element removed. Thus Y = nx. By Lemma 6.2, in order to show x 2 C 4 K 4 log 2 n/n (for constant C > C + 00) with probability at least n C 0, it is enough to show n u j (W n ) X 2 (λ j (M n ) λ i (M n )) 2 n C 4 K 4 log 2 n. By the projection lemma, u j (W n ) X 0K C log n with probability at least 0n C. It suffices to show that with probability at least n C 0, n u j (W n ) X 4 (λ j (M n ) λ i (M n )) 2 00n C 3 K 2 log n. By Cauchy-Schwardz inequality, it is enough to show for some integers T < T + n that u j (W n ) Y 2 λ j (W n ) λ i (W n ) 0 T+ T C.5 K log n. T j T +

18 8 V. VU AND K. WANG By Lemma A., we are going to show for some integers T +, T satisfying T + T = O(log n) (the choice of T +, T will be given later) that u j (W (29) n ) Y 2 λ j (W n ) λ i (W n ) 2 ɛ 0 T+ T C.5 K log n + o(), j T + orj T with probability at least n C 0. Let η = K2 C log n nδ 8 with constant δ = ɛ/000. Divide the real line into disjoint intervals I k for k 0 where I 0 = (λ i (W n ) η, λ i (W n ) + η). For k k 0 = log 0.9 n (say), I k has length 2ηδ 8k = o() and I k = (λ i (W n ) β k η, λ i (W n ) β k η] [λ i (W n ) + β k η, λ i (W n ) + β k η), where we denote by β k = k s=0 δ 8s. The distance from λ i (W n ) to the interval I k satisfies dist(λ i (W n ), I k ) β k η. For each such interval, by (26), for sufficiently large constant C > 0, the number of eigenvalues J k = N Ik nα Ik I k + δ k+ n I k with probability at least n C 00, where α Ik = I k ρ sc (x)dx/ I k. For the k-th interval, by (3) taking t = K C log n, we have that, with probability at least C exp( C C log n) n C 00 for sufficiently large C, u j (W n ) X 2 n λ j (W n ) λ i (W n ) u j (W n ) X 2 n dist(λ i (W n ), I k ) j J k j J k n dist(λ i (W n ), I k ) ( J k + K J k C log n + CK 2 log n) α Ik I k dist(λ i (W n ), I k ) + (nδ k+ I k + 2K C log n n I k + CK 2 log n) nηβ k α Ik I k dist(λ i (W n ), I k ) + 0δk 7. For k k 0 +, let the interval I k s have the same length of I k0 = 2δ 8k 0 η. Note that the number of such intervals is bounded crudely by o(n). By (26), the number of eigenvalues J k nα Ik I k + δ k0+ n I k with probability at least n C 00. And the distance from λ i (W n ) to the interval I k satisfies dist(λ i (W n ), I k ) β k0 η + (k k 0 ) I k0. The contribution of such intervals can be computed similarly u j (W n ) X 2 n λ j (W n ) λ i (W n ) u j (W n ) X 2 n dist(λ i (W n ), I k ) j J k j J k n dist(λ i (W n ), I k ) ( J k + K J k C log n + CK 2 log n) α Ik I k dist(λ i (W n ), I k ) + with probability at least n C 00. δk0 k k 0

19 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 9 Summing over all intervals for k 0 (say), we obtain (30) u j (W n ) Y 2 λ j (W n ) λ i (W n ) α Ik I k dist(λ i (W n ), I k ) + δ. j T + orj T I k On the other hand, it follows from Riemann integration of the principal value integral that α I 2 Ik k dist(λ i (W n ), I k ) = p.v. ρ sc (x) dx + o(), I 2 λ i (W n ) x k where 2 ρ sc (x) p.v. 2 λ i (W n ) x dx := lim ε 0 2 x 2, x λ i (W n) ε ρ sc (x) λ i (W n ) x dx. From the explicit formula for the Stieltjes transform and from residue calculus, one obtains p.v. 2 2 ρ sc (x) x λ i (W n ) dx = λ i(w n )/2 for λ i (W n ) 2, and with the right-hand side replaced by λ i (W n )/2 + λ i (W n ) 2 4/2 for λ i (W n ) > 2. Finally, we always have (3) I k α Ik I k dist(λ i (W n ), I k ) + 2ɛ. Now for the rest of eigenvalues that satisfy λ i (W n ) λ j (W n ) I 0 + I I 0 4η/δ 80, by Theorem 5. and Cauchy interlacing law, the number of eigenvalues is at most T + T 8nη/δ 80 = 8CK 2 log n/δ 88 with probability at least n C 00 for sufficiently large constant C > 0. Thus T+ T (32) C.5 K log n 0 δ 44 C ɛ/000, by choosing C sufficiently large compared to δ 44. Thus, from (29), (30), (3) and (32), we have proved that there exits a constant C > 0 such that with probability at least n C 0, x C2 K 2 log n n. The conclusion of the second part of Theorem.7 follows from symmetry and union bounds. Appendix B. Local Marchenko-Pastur law for random covariance matrix and delocalization of singular vectors In this appendix, we extend the results obtained for random Hermitian matrices discussed in the previous sections to random covariance matrices, focusing on the changes needed for the proofs. Interested reader can refer to closely related papers [30] and [33] (see also [2, 22]). Let M = M p,n = (ζ ij ) i p, j n be a p n matrix, where p = p(n) is an integer such that p n and lim n p/n = y (0, ]. Assume the entries of M n,p are independent random variables with mean zero and variance one. For such a p n random matrix M, we form the n n (sample)

20 20 V. VU AND K. WANG covariance matrix W = W p,n = n M M. This (non-negative definite) matrix has at most p non-zero eigenvalues which are ordered as 0 λ (W ) λ 2 (W )... λ p (W ). Denote by σ (M),..., σ p (M) the singular values of M. It is easy to see that σ i (M) = nλ i (W ) /2. From the singular value decomposition, there exist orthonormal bases {u,..., u p } for C n and {v,..., v p } for C p such that Mu i = σ i v i and M v i = σ i u i. A fundamental result concerning the asymptotic limiting behavior of ESD for large covariance matrices is the Marchenko Pastur Law (see [3] and [8]). Theorem B.. (Marchenko Pastur Law) Assume the entries of matrix M C p n are independent random variables with mean zero and variance one and lim n p/n = y (0, ]. Then the empirical spectral distribution of the matrix W = n M M converges with probability to the Marchenko- Pastur Law with a density function where ρ MP,y (x) := (b x)(x a)[a,b] (x), 2πxy a := ( y) 2, b := ( + y) 2. The hard edge of the limiting support of spectrum refers to the left edge a when y = where it gives rise to a singularity of x /2. The cases of left edge a when y < and the right edge b regardless of the value of y are usually called the soft edge. Recent progress on studying the local convergence to Marchenko Pastur Law includes [2, 22, 30, 33] for the soft edge and [6, 27] for the hard edge. We focus on improving the previous results for the soft edge in this appendix. Our main results for the random covariance matrices are the following local Marchenko Pastur law (LMPL) and the delocalization property of singular vectors. Theorem B.2. For any constants ɛ, δ, C > 0, there exists a constant C 2 > 0 such that the following holds. Assume lim n p/n = y for some 0 < y. Let M = M p,n = (ζ ij ) i p, j n be a random matrix whose entries are independent K-bounded random variables with mean 0 and variance. Consider the covariance matrix W = n M M. Then with probability at least n C, one has N I (W n,p ) p ρ MP,y (x) dx δp ρ MP,y (x) dx. for any interval I (a + ɛ, b ɛ) of length at least C 2 K 2 log n/n. I I Theorem B.3 (Delocalization of singular vectors). Let M p,n be as in Theorem B.2. constant C > 0, there is a constant C 2 > 0 such that the following holds. For any (Bulk case) With probability at least n C, for any ɛ > 0 and any i p such that σ i (M p,n ) 2 /n [a + ɛ, b ɛ], there is a left singular vector u i corresponding to σ i (M p,n ) such that u i C 2K log /2 n n. The same holds for right singular vectors.

21 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 2 (Edge case) With probability at least n C, for any ɛ > 0 and any i p such that σ i (M p,n ) 2 /n [a ɛ, a + ɛ] [b ɛ, b + ɛ] if a 0 and σ i (M p,n ) 2 /n [4 ɛ, 4] if a = 0, there is a left singular vector u i corresponding to σ i (M n,p ) such that u i C 2K 2 log n n. The same holds for right singular vectors. Remark B.4. Theorem B.2 and Theorem B.3 actually hold for a larger class of matrices, using the K-concentration introduced in the previous sections. For instance, Theorem B.2 holds for random matrices M p,n = (ζ ij ) whose entries are independent random variables with mean 0 and variance, and the row vectors are K-concentrated. And Theorem B.3 holds if we further assume the column vectors of M p,n are also K-concentrated. Indeed, the K-bounded assumption is only used to guarantee K-concentration. B.. Proof of Theorem B.2. Similarly to the Hermitian case, we compare the Stieltjes transform of W s(z) := p p λ i (W ) z with that of the Marchenko Pastur Law s MP,y (z) := x z ρ MP,y(x) dx = R i= b The explicit expression of s MP,y (z) is given by (see [4]) a (b x)(x a) dx. 2πxy(x z) s MP,y (z) = y + z (y + z ) 2 4yz, 2yz where we take the branch of (y + z ) 2 4yz with cut at [a, b] that is asymptotically y + z as z tends to infinity. Note that it is uniquely defined by the equation s MP,y (z) + y + z + yzs MP,y (z) = 0. We will show that s(z) satisfies a similar equation. The analogue of Lemma 4. is the following lemma. Lemma B.5. (Lemma 29, [30]) Let M n,p be a random matrix with independent K-bounded entries with mean 0 and variance. Assume lim n + p/n = y (0, ]. Let /n < η < /0, and L, L 2, ε, δ > 0. For any constant C > 0, there exists a constant C > 0 such that if one has the bound s(z) s MP,y (z) δ with (uniformly) probability at least n C for all z with L Re(z) L 2 and Im(z) η. Then for any interval I in [L ε, L 2 + ε] with I max(2η, η δ log δ ), one has N I p ρ MP,y (x) dx δp I with probability at least n C. I

22 22 V. VU AND K. WANG The objective is to show (33) s(z) s MP,y (z) = o(δ) with probability at least n C for any z in the region R y, where R y = {z C : z 0, a ɛ Re(z) b + ɛ, Im(z) η} if y, and R y = {z C : z 0, ɛ Re(z) 4 + ɛ, Im(z) η} if y =. We use the parameter η := K2 C 2 log n nδ 6, where C = C Note that in the defined region R y, s MP,y (z) = O(). First, by Schur s complement, one can rewrite (34) s(z) = p trace(w zi) = p p k= ξ kk z Y k where Y k = a k (W k zi) a k, and W k is the matrix W = n MM = (ξ ij ) i,j p with the k-th row and k-th column removed, and a k is the k-th row of W with the k-th element removed. Let M k be the (p ) n minor of M with the k-th row removed and X i Cn ( i p) be the rows of M. Thus ξ kk = X k X k /n = X k 2 /n, a k = n M kx k, W k = n M km k. Thus Y k = p a k v j(m k ) 2 p λ j (W k ) z = λ j (W k ) Xk u j(m k ) 2 n λ j (W k ) z where u (M k ),..., u p (M k ) C n and v (M k ),..., v n (M k ) C p are orthonormal right and left singular vectors of M k. Here we use the fact that a k v j(m k ) = n X k M k v j(m k ) = n σ j(m k )X k u j(m k ) and σ j (M k ) 2 = nλ j (W k ). The entries of X k are independent of each other and of W k, and have mean 0 and variance. Since u j (M k ) are unit vectors, by linearity of expectation we have where E(Y k W k ) = p λ j (W k ) n λ j (W k ) z = p n + z n s k (z) = p p i= p λ j (W k ) z = p n ( + zs k(z)), λ i (W k ) z is the Stieltjes transform of W k. By Cauchy interlacing law, we have s(z) ( p )s k(z) = O( dx) = O( p x z 2 pη ). R Thus E(Y k W k ) = p n + z p s(z) + O( n nη ) = p n + z p n s(z) + o(δ2 ). On the other hand, Y k is concentrated about E(Y k W k ) with high probability: Lemma B.6. Let M n,p be as in Lemma B.5. For k p, Y k = E(Y k W k ) + o(δ 2 ) holds with probability at least O(n C ) for any z in the region R y.

23 RANDOM WEIGHTED PROJECTIONS, RANDOM QUADRATIC FORMS AND RANDOM EIGENVECTORS 23 To prove Lemma B.6, we estimate (35) Y k E(Y k W k ) = p λ j (W k ) n ( X k u j (M k ) 2 ) = λ j (W k ) z n p λ j (W k ) λ j (W k ) x η R j where R j = Xk u j(m k ) 2. Note that λ j (W k ) = O(). The estimation of (35) is a repetition of the calculation in (25). Interested reader are encouraged to work out the details. Inserting the bounds to (34), we have s(z) + y + z + yzs(z) + o(δ 2 ) = 0 with probability at least O(n C ). By a continuity argument (see for instance [33]), one has s(z) s MP,y (z) = o(δ) with probability at least n C+00 (say). By Lemma B.5, we have showed that for any constants ɛ, C > 0, there exists a constant C 2 > 0 such that for 0 < δ < /2 and any interval I (a ɛ, b+ɛ) if a 0 or I (ɛ, 4+ɛ) if a = 0 of length at least C 2 K 2 log n/nδ 8, with probability at least n C, (36) N I p ρ MP,y (x) dx δp I. In particular, Theorem B.2 follows. I B.2. Proof of Theorem B.3. To prove the delocalization of singular vectors, we need the following formula to express the entries of a singular vector in terms of the singular values and singular vectors of a minor. It is enough to prove the delocalization for the right (unit) singular vectors. Lemma B.7 (Corollary 25, [30]). Let p, n, and let M p,n = ( M p,n X ) ( ) u be a p n matrix for some X C p, and let be a right unit singular vector of M x p,n with singular value σ i (M p,n ), where x C and u C n. Suppose that none of the singular values of M p,n are equal to σ i (M p,n ). Then x 2 = + min(p,n ) σ j (M p,n ) 2 (σ j (M p,n ) 2 σ i (M p,n) v 2 ) 2 j (M p,n ) X, 2 where v (M p,n ),..., v min(p,n ) (M p,n ) C p is an orthonormal system of left singular vectors corresponding to the non-trivial singular values of M p,n. In a similar vein, if ( ) Mp,n M p,n = Y ( ) v for some Y C n, and is a left unit singular vector of M y p,n with singular value σ i (M p,n ), where y C and v C p, and none of the singular values of M p,n are equal to σ i (M p,n ). Then y 2 = + min(p,n) σ j (M p,n ) 2 (σ j (M p,n ) 2 σ i (M p,n) u 2 ) 2 j (M p,n ) Y, 2 where u (M p,n ),..., u min(p,n) (M p,n ) C n is an orthonormal system of right singular vectors corresponding to the non-trivial singular values of M p,n.

OPTIMAL UPPER BOUND FOR THE INFINITY NORM OF EIGENVECTORS OF RANDOM MATRICES

OPTIMAL UPPER BOUND FOR THE INFINITY NORM OF EIGENVECTORS OF RANDOM MATRICES OPTIMAL UPPER BOUND FOR THE INFINITY NORM OF EIGENVECTORS OF RANDOM MATRICES BY KE WANG A dissertation submitted to the Graduate School New Brunswick Rutgers, The State University of New Jersey in partial

More information

Random regular digraphs: singularity and spectrum

Random regular digraphs: singularity and spectrum Random regular digraphs: singularity and spectrum Nick Cook, UCLA Probability Seminar, Stanford University November 2, 2015 Universality Circular law Singularity probability Talk outline 1 Universality

More information

arxiv: v5 [math.na] 16 Nov 2017

arxiv: v5 [math.na] 16 Nov 2017 RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem

More information

Random matrices: Distribution of the least singular value (via Property Testing)

Random matrices: Distribution of the least singular value (via Property Testing) Random matrices: Distribution of the least singular value (via Property Testing) Van H. Vu Department of Mathematics Rutgers vanvu@math.rutgers.edu (joint work with T. Tao, UCLA) 1 Let ξ be a real or complex-valued

More information

arxiv: v3 [math-ph] 21 Jun 2012

arxiv: v3 [math-ph] 21 Jun 2012 LOCAL MARCHKO-PASTUR LAW AT TH HARD DG OF SAMPL COVARIAC MATRICS CLAUDIO CACCIAPUOTI, AA MALTSV, AD BJAMI SCHLI arxiv:206.730v3 [math-ph] 2 Jun 202 Abstract. Let X be a matrix whose entries are i.i.d.

More information

Exponential tail inequalities for eigenvalues of random matrices

Exponential tail inequalities for eigenvalues of random matrices Exponential tail inequalities for eigenvalues of random matrices M. Ledoux Institut de Mathématiques de Toulouse, France exponential tail inequalities classical theme in probability and statistics quantify

More information

Random Matrices: Invertibility, Structure, and Applications

Random Matrices: Invertibility, Structure, and Applications Random Matrices: Invertibility, Structure, and Applications Roman Vershynin University of Michigan Colloquium, October 11, 2011 Roman Vershynin (University of Michigan) Random Matrices Colloquium 1 / 37

More information

Semicircle law on short scales and delocalization for Wigner random matrices

Semicircle law on short scales and delocalization for Wigner random matrices Semicircle law on short scales and delocalization for Wigner random matrices László Erdős University of Munich Weizmann Institute, December 2007 Joint work with H.T. Yau (Harvard), B. Schlein (Munich)

More information

Small Ball Probability, Arithmetic Structure and Random Matrices

Small Ball Probability, Arithmetic Structure and Random Matrices Small Ball Probability, Arithmetic Structure and Random Matrices Roman Vershynin University of California, Davis April 23, 2008 Distance Problems How far is a random vector X from a given subspace H in

More information

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University PCA with random noise Van Ha Vu Department of Mathematics Yale University An important problem that appears in various areas of applied mathematics (in particular statistics, computer science and numerical

More information

Local semicircle law, Wegner estimate and level repulsion for Wigner random matrices

Local semicircle law, Wegner estimate and level repulsion for Wigner random matrices Local semicircle law, Wegner estimate and level repulsion for Wigner random matrices László Erdős University of Munich Oberwolfach, 2008 Dec Joint work with H.T. Yau (Harvard), B. Schlein (Cambrigde) Goal:

More information

Invertibility of random matrices

Invertibility of random matrices University of Michigan February 2011, Princeton University Origins of Random Matrix Theory Statistics (Wishart matrices) PCA of a multivariate Gaussian distribution. [Gaël Varoquaux s blog gael-varoquaux.info]

More information

III. Quantum ergodicity on graphs, perspectives

III. Quantum ergodicity on graphs, perspectives III. Quantum ergodicity on graphs, perspectives Nalini Anantharaman Université de Strasbourg 24 août 2016 Yesterday we focussed on the case of large regular (discrete) graphs. Let G = (V, E) be a (q +

More information

Concentration Inequalities for Random Matrices

Concentration Inequalities for Random Matrices Concentration Inequalities for Random Matrices M. Ledoux Institut de Mathématiques de Toulouse, France exponential tail inequalities classical theme in probability and statistics quantify the asymptotic

More information

Isotropic local laws for random matrices

Isotropic local laws for random matrices Isotropic local laws for random matrices Antti Knowles University of Geneva With Y. He and R. Rosenthal Random matrices Let H C N N be a large Hermitian random matrix, normalized so that H. Some motivations:

More information

Local Kesten McKay law for random regular graphs

Local Kesten McKay law for random regular graphs Local Kesten McKay law for random regular graphs Roland Bauerschmidt (with Jiaoyang Huang and Horng-Tzer Yau) University of Cambridge Weizmann Institute, January 2017 Random regular graph G N,d is the

More information

RANDOM MATRICES: TAIL BOUNDS FOR GAPS BETWEEN EIGENVALUES. 1. Introduction

RANDOM MATRICES: TAIL BOUNDS FOR GAPS BETWEEN EIGENVALUES. 1. Introduction RANDOM MATRICES: TAIL BOUNDS FOR GAPS BETWEEN EIGENVALUES HOI NGUYEN, TERENCE TAO, AND VAN VU Abstract. Gaps (or spacings) between consecutive eigenvalues are a central topic in random matrix theory. The

More information

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA) The circular law Lewis Memorial Lecture / DIMACS minicourse March 19, 2008 Terence Tao (UCLA) 1 Eigenvalue distributions Let M = (a ij ) 1 i n;1 j n be a square matrix. Then one has n (generalised) eigenvalues

More information

Invertibility of symmetric random matrices

Invertibility of symmetric random matrices Invertibility of symmetric random matrices Roman Vershynin University of Michigan romanv@umich.edu February 1, 2011; last revised March 16, 2012 Abstract We study n n symmetric random matrices H, possibly

More information

Random Matrix: From Wigner to Quantum Chaos

Random Matrix: From Wigner to Quantum Chaos Random Matrix: From Wigner to Quantum Chaos Horng-Tzer Yau Harvard University Joint work with P. Bourgade, L. Erdős, B. Schlein and J. Yin 1 Perhaps I am now too courageous when I try to guess the distribution

More information

Upper Bound for Intermediate Singular Values of Random Sub-Gaussian Matrices 1

Upper Bound for Intermediate Singular Values of Random Sub-Gaussian Matrices 1 Upper Bound for Intermediate Singular Values of Random Sub-Gaussian Matrices 1 Feng Wei 2 University of Michigan July 29, 2016 1 This presentation is based a project under the supervision of M. Rudelson.

More information

Random matrices: A Survey. Van H. Vu. Department of Mathematics Rutgers University

Random matrices: A Survey. Van H. Vu. Department of Mathematics Rutgers University Random matrices: A Survey Van H. Vu Department of Mathematics Rutgers University Basic models of random matrices Let ξ be a real or complex-valued random variable with mean 0 and variance 1. Examples.

More information

Least singular value of random matrices. Lewis Memorial Lecture / DIMACS minicourse March 18, Terence Tao (UCLA)

Least singular value of random matrices. Lewis Memorial Lecture / DIMACS minicourse March 18, Terence Tao (UCLA) Least singular value of random matrices Lewis Memorial Lecture / DIMACS minicourse March 18, 2008 Terence Tao (UCLA) 1 Extreme singular values Let M = (a ij ) 1 i n;1 j m be a square or rectangular matrix

More information

Lecture 2: Linear Algebra Review

Lecture 2: Linear Algebra Review EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1

More information

Comparison Method in Random Matrix Theory

Comparison Method in Random Matrix Theory Comparison Method in Random Matrix Theory Jun Yin UW-Madison Valparaíso, Chile, July - 2015 Joint work with A. Knowles. 1 Some random matrices Wigner Matrix: H is N N square matrix, H : H ij = H ji, EH

More information

Lecture 3. Random Fourier measurements

Lecture 3. Random Fourier measurements Lecture 3. Random Fourier measurements 1 Sampling from Fourier matrices 2 Law of Large Numbers and its operator-valued versions 3 Frames. Rudelson s Selection Theorem Sampling from Fourier matrices Our

More information

Universality of local spectral statistics of random matrices

Universality of local spectral statistics of random matrices Universality of local spectral statistics of random matrices László Erdős Ludwig-Maximilians-Universität, Munich, Germany CRM, Montreal, Mar 19, 2012 Joint with P. Bourgade, B. Schlein, H.T. Yau, and J.

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Anti-concentration Inequalities

Anti-concentration Inequalities Anti-concentration Inequalities Roman Vershynin Mark Rudelson University of California, Davis University of Missouri-Columbia Phenomena in High Dimensions Third Annual Conference Samos, Greece June 2007

More information

RANDOM MATRICES: LAW OF THE DETERMINANT

RANDOM MATRICES: LAW OF THE DETERMINANT RANDOM MATRICES: LAW OF THE DETERMINANT HOI H. NGUYEN AND VAN H. VU Abstract. Let A n be an n by n random matrix whose entries are independent real random variables with mean zero and variance one. We

More information

DISTRIBUTION OF EIGENVALUES OF REAL SYMMETRIC PALINDROMIC TOEPLITZ MATRICES AND CIRCULANT MATRICES

DISTRIBUTION OF EIGENVALUES OF REAL SYMMETRIC PALINDROMIC TOEPLITZ MATRICES AND CIRCULANT MATRICES DISTRIBUTION OF EIGENVALUES OF REAL SYMMETRIC PALINDROMIC TOEPLITZ MATRICES AND CIRCULANT MATRICES ADAM MASSEY, STEVEN J. MILLER, AND JOHN SINSHEIMER Abstract. Consider the ensemble of real symmetric Toeplitz

More information

A sequence of triangle-free pseudorandom graphs

A sequence of triangle-free pseudorandom graphs A sequence of triangle-free pseudorandom graphs David Conlon Abstract A construction of Alon yields a sequence of highly pseudorandom triangle-free graphs with edge density significantly higher than one

More information

EIGENVECTORS OF RANDOM MATRICES OF SYMMETRIC ENTRY DISTRIBUTIONS. 1. introduction

EIGENVECTORS OF RANDOM MATRICES OF SYMMETRIC ENTRY DISTRIBUTIONS. 1. introduction EIGENVECTORS OF RANDOM MATRICES OF SYMMETRIC ENTRY DISTRIBUTIONS SEAN MEEHAN AND HOI NGUYEN Abstract. In this short note we study a non-degeneration property of eigenvectors of symmetric random matrices

More information

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with

In particular, if A is a square matrix and λ is one of its eigenvalues, then we can find a non-zero column vector X with Appendix: Matrix Estimates and the Perron-Frobenius Theorem. This Appendix will first present some well known estimates. For any m n matrix A = [a ij ] over the real or complex numbers, it will be convenient

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

The Matrix Dyson Equation in random matrix theory

The Matrix Dyson Equation in random matrix theory The Matrix Dyson Equation in random matrix theory László Erdős IST, Austria Mathematical Physics seminar University of Bristol, Feb 3, 207 Joint work with O. Ajanki, T. Krüger Partially supported by ERC

More information

Wigner s semicircle law

Wigner s semicircle law CHAPTER 2 Wigner s semicircle law 1. Wigner matrices Definition 12. A Wigner matrix is a random matrix X =(X i, j ) i, j n where (1) X i, j, i < j are i.i.d (real or complex valued). (2) X i,i, i n are

More information

Estimation of large dimensional sparse covariance matrices

Estimation of large dimensional sparse covariance matrices Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)

More information

1 Math 241A-B Homework Problem List for F2015 and W2016

1 Math 241A-B Homework Problem List for F2015 and W2016 1 Math 241A-B Homework Problem List for F2015 W2016 1.1 Homework 1. Due Wednesday, October 7, 2015 Notation 1.1 Let U be any set, g be a positive function on U, Y be a normed space. For any f : U Y let

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Kernel Method: Data Analysis with Positive Definite Kernels

Kernel Method: Data Analysis with Positive Definite Kernels Kernel Method: Data Analysis with Positive Definite Kernels 2. Positive Definite Kernel and Reproducing Kernel Hilbert Space Kenji Fukumizu The Institute of Statistical Mathematics. Graduate University

More information

RANDOM MATRICES: OVERCROWDING ESTIMATES FOR THE SPECTRUM. 1. introduction

RANDOM MATRICES: OVERCROWDING ESTIMATES FOR THE SPECTRUM. 1. introduction RANDOM MATRICES: OVERCROWDING ESTIMATES FOR THE SPECTRUM HOI H. NGUYEN Abstract. We address overcrowding estimates for the singular values of random iid matrices, as well as for the eigenvalues of random

More information

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability... Functional Analysis Franck Sueur 2018-2019 Contents 1 Metric spaces 1 1.1 Definitions........................................ 1 1.2 Completeness...................................... 3 1.3 Compactness......................................

More information

On the concentration of eigenvalues of random symmetric matrices

On the concentration of eigenvalues of random symmetric matrices On the concentration of eigenvalues of random symmetric matrices Noga Alon Michael Krivelevich Van H. Vu April 23, 2012 Abstract It is shown that for every 1 s n, the probability that the s-th largest

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

Decoupling course outline Decoupling theory is a recent development in Fourier analysis with applications in partial differential equations and

Decoupling course outline Decoupling theory is a recent development in Fourier analysis with applications in partial differential equations and Decoupling course outline Decoupling theory is a recent development in Fourier analysis with applications in partial differential equations and analytic number theory. It studies the interference patterns

More information

Eigenvalue variance bounds for Wigner and covariance random matrices

Eigenvalue variance bounds for Wigner and covariance random matrices Eigenvalue variance bounds for Wigner and covariance random matrices S. Dallaporta University of Toulouse, France Abstract. This work is concerned with finite range bounds on the variance of individual

More information

From the mesoscopic to microscopic scale in random matrix theory

From the mesoscopic to microscopic scale in random matrix theory From the mesoscopic to microscopic scale in random matrix theory (fixed energy universality for random spectra) With L. Erdős, H.-T. Yau, J. Yin Introduction A spacially confined quantum mechanical system

More information

Inhomogeneous circular laws for random matrices with non-identically distributed entries

Inhomogeneous circular laws for random matrices with non-identically distributed entries Inhomogeneous circular laws for random matrices with non-identically distributed entries Nick Cook with Walid Hachem (Telecom ParisTech), Jamal Najim (Paris-Est) and David Renfrew (SUNY Binghamton/IST

More information

arxiv: v1 [math.pr] 22 May 2008

arxiv: v1 [math.pr] 22 May 2008 THE LEAST SINGULAR VALUE OF A RANDOM SQUARE MATRIX IS O(n 1/2 ) arxiv:0805.3407v1 [math.pr] 22 May 2008 MARK RUDELSON AND ROMAN VERSHYNIN Abstract. Let A be a matrix whose entries are real i.i.d. centered

More information

Notes on Linear Algebra and Matrix Theory

Notes on Linear Algebra and Matrix Theory Massimo Franceschet featuring Enrico Bozzo Scalar product The scalar product (a.k.a. dot product or inner product) of two real vectors x = (x 1,..., x n ) and y = (y 1,..., y n ) is not a vector but a

More information

Universality for random matrices and log-gases

Universality for random matrices and log-gases Universality for random matrices and log-gases László Erdős IST, Austria Ludwig-Maximilians-Universität, Munich, Germany Encounters Between Discrete and Continuous Mathematics Eötvös Loránd University,

More information

Concentration and Anti-concentration. Van H. Vu. Department of Mathematics Yale University

Concentration and Anti-concentration. Van H. Vu. Department of Mathematics Yale University Concentration and Anti-concentration Van H. Vu Department of Mathematics Yale University Concentration and Anti-concentration X : a random variable. Concentration and Anti-concentration X : a random variable.

More information

arxiv: v1 [math.pr] 22 Dec 2018

arxiv: v1 [math.pr] 22 Dec 2018 arxiv:1812.09618v1 [math.pr] 22 Dec 2018 Operator norm upper bound for sub-gaussian tailed random matrices Eric Benhamou Jamal Atif Rida Laraki December 27, 2018 Abstract This paper investigates an upper

More information

Metric Spaces and Topology

Metric Spaces and Topology Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies

More information

Random matrix pencils and level crossings

Random matrix pencils and level crossings Albeverio Fest October 1, 2018 Topics to discuss Basic level crossing problem 1 Basic level crossing problem 2 3 Main references Basic level crossing problem (i) B. Shapiro, M. Tater, On spectral asymptotics

More information

The Hilbert Space of Random Variables

The Hilbert Space of Random Variables The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2

More information

1 Tridiagonal matrices

1 Tridiagonal matrices Lecture Notes: β-ensembles Bálint Virág Notes with Diane Holcomb 1 Tridiagonal matrices Definition 1. Suppose you have a symmetric matrix A, we can define its spectral measure (at the first coordinate

More information

1. General Vector Spaces

1. General Vector Spaces 1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule

More information

Rectangular Young tableaux and the Jacobi ensemble

Rectangular Young tableaux and the Jacobi ensemble Rectangular Young tableaux and the Jacobi ensemble Philippe Marchal October 20, 2015 Abstract It has been shown by Pittel and Romik that the random surface associated with a large rectangular Young tableau

More information

The Kadison-Singer Conjecture

The Kadison-Singer Conjecture The Kadison-Singer Conjecture John E. McCarthy April 8, 2006 Set-up We are given a large integer N, and a fixed basis {e i : i N} for the space H = C N. We are also given an N-by-N matrix H that has all

More information

Notes on Complex Analysis

Notes on Complex Analysis Michael Papadimitrakis Notes on Complex Analysis Department of Mathematics University of Crete Contents The complex plane.. The complex plane...................................2 Argument and polar representation.........................

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Non white sample covariance matrices.

Non white sample covariance matrices. Non white sample covariance matrices. S. Péché, Université Grenoble 1, joint work with O. Ledoit, Uni. Zurich 17-21/05/2010, Université Marne la Vallée Workshop Probability and Geometry in High Dimensions

More information

Analysis-3 lecture schemes

Analysis-3 lecture schemes Analysis-3 lecture schemes (with Homeworks) 1 Csörgő István November, 2015 1 A jegyzet az ELTE Informatikai Kar 2015. évi Jegyzetpályázatának támogatásával készült Contents 1. Lesson 1 4 1.1. The Space

More information

BALANCING GAUSSIAN VECTORS. 1. Introduction

BALANCING GAUSSIAN VECTORS. 1. Introduction BALANCING GAUSSIAN VECTORS KEVIN P. COSTELLO Abstract. Let x 1,... x n be independent normally distributed vectors on R d. We determine the distribution function of the minimum norm of the 2 n vectors

More information

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012.

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012. Math 5620 - Introduction to Numerical Analysis - Class Notes Fernando Guevara Vasquez Version 1990. Date: January 17, 2012. 3 Contents 1. Disclaimer 4 Chapter 1. Iterative methods for solving linear systems

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C. Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Lecture I: Asymptotics for large GUE random matrices

Lecture I: Asymptotics for large GUE random matrices Lecture I: Asymptotics for large GUE random matrices Steen Thorbjørnsen, University of Aarhus andom Matrices Definition. Let (Ω, F, P) be a probability space and let n be a positive integer. Then a random

More information

CONCENTRATION OF THE MIXED DISCRIMINANT OF WELL-CONDITIONED MATRICES. Alexander Barvinok

CONCENTRATION OF THE MIXED DISCRIMINANT OF WELL-CONDITIONED MATRICES. Alexander Barvinok CONCENTRATION OF THE MIXED DISCRIMINANT OF WELL-CONDITIONED MATRICES Alexander Barvinok Abstract. We call an n-tuple Q 1,...,Q n of positive definite n n real matrices α-conditioned for some α 1 if for

More information

Inverse Theorems in Probability. Van H. Vu. Department of Mathematics Yale University

Inverse Theorems in Probability. Van H. Vu. Department of Mathematics Yale University Inverse Theorems in Probability Van H. Vu Department of Mathematics Yale University Concentration and Anti-concentration X : a random variable. Concentration and Anti-concentration X : a random variable.

More information

< k 2n. 2 1 (n 2). + (1 p) s) N (n < 1

< k 2n. 2 1 (n 2). + (1 p) s) N (n < 1 List of Problems jacques@ucsd.edu Those question with a star next to them are considered slightly more challenging. Problems 9, 11, and 19 from the book The probabilistic method, by Alon and Spencer. Question

More information

A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices

A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices S. Dallaporta University of Toulouse, France Abstract. This note presents some central limit theorems

More information

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint

More information

arxiv: v2 [math.ag] 24 Jun 2015

arxiv: v2 [math.ag] 24 Jun 2015 TRIANGULATIONS OF MONOTONE FAMILIES I: TWO-DIMENSIONAL FAMILIES arxiv:1402.0460v2 [math.ag] 24 Jun 2015 SAUGATA BASU, ANDREI GABRIELOV, AND NICOLAI VOROBJOV Abstract. Let K R n be a compact definable set

More information

Supremum of simple stochastic processes

Supremum of simple stochastic processes Subspace embeddings Daniel Hsu COMS 4772 1 Supremum of simple stochastic processes 2 Recap: JL lemma JL lemma. For any ε (0, 1/2), point set S R d of cardinality 16 ln n S = n, and k N such that k, there

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

On the singular values of random matrices

On the singular values of random matrices On the singular values of random matrices Shahar Mendelson Grigoris Paouris Abstract We present an approach that allows one to bound the largest and smallest singular values of an N n random matrix with

More information

, then the ESD of the matrix converges to µ cir almost surely as n tends to.

, then the ESD of the matrix converges to µ cir almost surely as n tends to. CIRCULAR LAW FOR RANDOM DISCRETE MATRICES OF GIVEN ROW SUM HOI H. NGUYEN AND VAN H. VU Abstract. Let M n be a random matrix of size n n and let λ 1,..., λ n be the eigenvalues of M n. The empirical spectral

More information

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is

More information

Optimization Theory. A Concise Introduction. Jiongmin Yong

Optimization Theory. A Concise Introduction. Jiongmin Yong October 11, 017 16:5 ws-book9x6 Book Title Optimization Theory 017-08-Lecture Notes page 1 1 Optimization Theory A Concise Introduction Jiongmin Yong Optimization Theory 017-08-Lecture Notes page Optimization

More information

Random Matrices: Beyond Wigner and Marchenko-Pastur Laws

Random Matrices: Beyond Wigner and Marchenko-Pastur Laws Random Matrices: Beyond Wigner and Marchenko-Pastur Laws Nathan Noiry Modal X, Université Paris Nanterre May 3, 2018 Wigner s Matrices Wishart s Matrices Generalizations Wigner s Matrices ij, (n, i, j

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional

More information

Topological properties

Topological properties CHAPTER 4 Topological properties 1. Connectedness Definitions and examples Basic properties Connected components Connected versus path connected, again 2. Compactness Definition and first examples Topological

More information

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011)

Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Chapter Four Gelfond s Solution of Hilbert s Seventh Problem (Revised January 2, 2011) Before we consider Gelfond s, and then Schneider s, complete solutions to Hilbert s seventh problem let s look back

More information

4 Sums of Independent Random Variables

4 Sums of Independent Random Variables 4 Sums of Independent Random Variables Standing Assumptions: Assume throughout this section that (,F,P) is a fixed probability space and that X 1, X 2, X 3,... are independent real-valued random variables

More information

Random Fermionic Systems

Random Fermionic Systems Random Fermionic Systems Fabio Cunden Anna Maltsev Francesco Mezzadri University of Bristol December 9, 2016 Maltsev (University of Bristol) Random Fermionic Systems December 9, 2016 1 / 27 Background

More information

COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY

COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY OLIVIER GUÉDON AND ROMAN VERSHYNIN Abstract. We present a simple and flexible method to prove consistency of semidefinite optimization

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

Lecture 1 and 2: Random Spanning Trees

Lecture 1 and 2: Random Spanning Trees Recent Advances in Approximation Algorithms Spring 2015 Lecture 1 and 2: Random Spanning Trees Lecturer: Shayan Oveis Gharan March 31st Disclaimer: These notes have not been subjected to the usual scrutiny

More information

MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

More information

Eigenvalues, random walks and Ramanujan graphs

Eigenvalues, random walks and Ramanujan graphs Eigenvalues, random walks and Ramanujan graphs David Ellis 1 The Expander Mixing lemma We have seen that a bounded-degree graph is a good edge-expander if and only if if has large spectral gap If G = (V,

More information

Diagonalizing Matrices

Diagonalizing Matrices Diagonalizing Matrices Massoud Malek A A Let A = A k be an n n non-singular matrix and let B = A = [B, B,, B k,, B n ] Then A n A B = A A 0 0 A k [B, B,, B k,, B n ] = 0 0 = I n 0 A n Notice that A i B

More information

Linear Algebra and Eigenproblems

Linear Algebra and Eigenproblems Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

More information

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices by Jack W. Silverstein* Department of Mathematics Box 8205 North Carolina State University Raleigh,

More information

Concentration inequalities: basics and some new challenges

Concentration inequalities: basics and some new challenges Concentration inequalities: basics and some new challenges M. Ledoux University of Toulouse, France & Institut Universitaire de France Measure concentration geometric functional analysis, probability theory,

More information

Lecture 7: Positive Semidefinite Matrices

Lecture 7: Positive Semidefinite Matrices Lecture 7: Positive Semidefinite Matrices Rajat Mittal IIT Kanpur The main aim of this lecture note is to prepare your background for semidefinite programming. We have already seen some linear algebra.

More information

14 Singular Value Decomposition

14 Singular Value Decomposition 14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information