arxiv: v1 [math.pr] 22 May 2008

Similar documents
THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX

Small Ball Probability, Arithmetic Structure and Random Matrices

Upper Bound for Intermediate Singular Values of Random Sub-Gaussian Matrices 1

THE LITTLEWOOD-OFFORD PROBLEM AND INVERTIBILITY OF RANDOM MATRICES

Invertibility of symmetric random matrices

The Littlewood Offord problem and invertibility of random matrices

Invertibility of random matrices

Anti-concentration Inequalities

arxiv: v3 [math.pr] 8 Oct 2017

Random matrices: Distribution of the least singular value (via Property Testing)

NO-GAPS DELOCALIZATION FOR GENERAL RANDOM MATRICES MARK RUDELSON AND ROMAN VERSHYNIN

arxiv: v2 [math.fa] 7 Apr 2010

INVESTIGATE INVERTIBILITY OF SPARSE SYMMETRIC MATRIX arxiv: v2 [math.pr] 25 Apr 2018

Estimates for the concentration functions in the Littlewood Offord problem

RANDOM MATRICES: OVERCROWDING ESTIMATES FOR THE SPECTRUM. 1. introduction

RANDOM MATRICES: TAIL BOUNDS FOR GAPS BETWEEN EIGENVALUES. 1. Introduction

Random regular digraphs: singularity and spectrum

Least singular value of random matrices. Lewis Memorial Lecture / DIMACS minicourse March 18, Terence Tao (UCLA)

arxiv: v2 [math.pr] 10 Oct 2014

Lecture 3. Random Fourier measurements

Subsequences of frames

arxiv: v1 [math.pr] 22 Dec 2018

Chapter 4 Euclid Space

Isoperimetry of waists and local versus global asymptotic convex geometries

DELOCALIZATION OF EIGENVECTORS OF RANDOM MATRICES

EIGENVECTORS OF RANDOM MATRICES OF SYMMETRIC ENTRY DISTRIBUTIONS. 1. introduction

Random matrices: A Survey. Van H. Vu. Department of Mathematics Rutgers University

Inhomogeneous circular laws for random matrices with non-identically distributed entries

Concentration Inequalities for Random Matrices

SHARP TRANSITION OF THE INVERTIBILITY OF THE ADJACENCY MATRICES OF SPARSE RANDOM GRAPHS

Delocalization of eigenvectors of random matrices Lecture notes

Reconstruction from Anisotropic Random Measurements

Random sets of isomorphism of linear operators on Hilbert space

arxiv: v2 [math.pr] 15 Dec 2010

Lecture 3: Review of Linear Algebra

Smallest singular value of sparse random matrices

Lecture notes: Applied linear algebra Part 1. Version 2

BALANCING GAUSSIAN VECTORS. 1. Introduction

ROW PRODUCTS OF RANDOM MATRICES

Random Matrices: Invertibility, Structure, and Applications

RANDOM MATRICES: LAW OF THE DETERMINANT

On Riesz-Fischer sequences and lower frame bounds

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

Characterization of half-radial matrices

arxiv: v3 [math.pr] 24 Mar 2016

A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE

SMALL BALL PROBABILITIES FOR LINEAR IMAGES OF HIGH DIMENSIONAL DISTRIBUTIONS

Linear Algebra. Preliminary Lecture Notes

LOWER BOUNDS FOR THE SMALLEST SINGULAR VALUE OF STRUCTURED RANDOM MATRICES. By Nicholas Cook University of California, Los Angeles

MATH 304 Linear Algebra Lecture 20: The Gram-Schmidt process (continued). Eigenvalues and eigenvectors.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

LINEAR ALGEBRA W W L CHEN

arxiv: v5 [math.na] 16 Nov 2017

I teach myself... Hilbert spaces

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Stable Process. 2. Multivariate Stable Distributions. July, 2006

PRODUCT OF OPERATORS AND NUMERICAL RANGE

Math 443 Differential Geometry Spring Handout 3: Bilinear and Quadratic Forms This handout should be read just before Chapter 4 of the textbook.

SAMPLING FROM LARGE MATRICES: AN APPROACH THROUGH GEOMETRIC FUNCTIONAL ANALYSIS

Inner product spaces. Layers of structure:

Linear Algebra Massoud Malek

Lecture 3: Review of Linear Algebra

Non-Asymptotic Theory of Random Matrices Lecture 4: Dimension Reduction Date: January 16, 2007

Projection Theorem 1

A Banach space with a symmetric basis which is of weak cotype 2 but not of cotype 2

MATH 304 Linear Algebra Lecture 10: Linear independence. Wronskian.

arxiv:math/ v1 [math.fa] 30 Sep 2004

Supremum of simple stochastic processes

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)

Decompositions of frames and a new frame identity

Applied Linear Algebra in Geoscience Using MATLAB

COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY

A Characterization of Distance-Regular Graphs with Diameter Three

Linear Algebra. Preliminary Lecture Notes

Ir O D = D = ( ) Section 2.6 Example 1. (Bottom of page 119) dim(v ) = dim(l(v, W )) = dim(v ) dim(f ) = dim(v )

Uniform Uncertainty Principle and signal recovery via Regularized Orthogonal Matching Pursuit

2. Review of Linear Algebra

STAT 200C: High-dimensional Statistics

Kernel Method: Data Analysis with Positive Definite Kernels

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Means of unitaries, conjugations, and the Friedrichs operator

Quantum NP - Cont. Classical and Quantum Computation A.Yu Kitaev, A. Shen, M. N. Vyalyi 2002

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Concentration and Anti-concentration. Van H. Vu. Department of Mathematics Yale University

MAT Linear Algebra Collection of sample exams

Methods for sparse analysis of high-dimensional data, II

Using Friendly Tail Bounds for Sums of Random Matrices

Exercise Sheet 1.

Spectral inequalities and equalities involving products of matrices

Diagonalizing Matrices

Inner Product and Orthogonality

Sections of Convex Bodies via the Combinatorial Dimension

Radial Basis Functions I

Lecture 1 Operator spaces and their duality. David Blecher, University of Houston

Non commutative Khintchine inequalities and Grothendieck s theo

On the singular values of random matrices

Elementary linear algebra

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

Transcription:

THE LEAST SINGULAR VALUE OF A RANDOM SQUARE MATRIX IS O(n 1/2 ) arxiv:0805.3407v1 [math.pr] 22 May 2008 MARK RUDELSON AND ROMAN VERSHYNIN Abstract. Let A be a matrix whose entries are real i.i.d. centered random variables with unit variance and suitable moment assumptions. Then the smallest singular value s n (A) is of order n 1/2 with high probability. The lower estimate of this type was proved recently by the authors; in this note we establish the matching upper estimate. 1. Introduction Let A be an n n matrix whose entries are real i.i.d. centered random variables with suitable moment assumptions. Random matrix theory studies the distribution of the singular values s k (A), which are the eigenvalues of A = A A arranged in the non-increasing order. In this paper we study the magnitude of the smallest singular value s n (A), which can also be viewed as the reciprocal of the spectral norm: (1) s n (A) = inf x: x 2 =1 Ax 2 = 1/ A 1. Motivated by numerical inversion of large matrices, von Neumann and his associates speculated that (2) s n (A) n 1/2 with high probability. (See [4], pp. 14, 477, 555). A more precise form of this estimate was conjectured by Smale and proved by Edelman [1] for Gaussian matrices A. For general matrices, conjecture (2) had remained open until we proved in [2] the lower bound s n (A) = Ω(n 1/2 ). In the present paper, we shall prove the corresponding upper bound s n (A) = O(n 1/2 ), thereby completing the proof of (2). Theorem 1.1 (Fourth moment). Let A be an n n matrix whose entries are i.i.d. centered random variables with unit variance and fourth moment bounded by B. Then, for every δ > 0 there exist K > 0 and n 0 which depend (polynomially) only on δ and B, and such that P ( s n (A) > Kn 1/2) δ for all n n 0. M.R. was supported by NSF DMS grant 0652684. R.V. was supported by the Alfred P. Sloan Foundation and by NSF DMS grant 0652617. 1

2 MARK RUDELSON AND ROMAN VERSHYNIN Remark. The same result but with the reverse estimate, P ( s n (A) < Kn 1/2) δ, was proved in [2]. Together, these two estimates amount to (2). Under more restrictive (but still quite general) moment assumptions, Theorem 1.1 takes the following sharper form. Recall that a random variable ξ is called subgaussian if its tail is dominated by that of the standard normal random variable: there exists B > 0 such that P( ξ > t) 2 exp( t 2 /B 2 ) for all t > 0. The minimal B is called the subgaussian moment of ξ. The class of subgaussian random variables includes, among others, normal, symmetric ±1, and in general all bounded random variables. Theorem 1.2 (Subgaussian). Let A be an n n matrix whose entries are i.i.d. centered random variables with unit variance and subgaussian moment bounded by B. Then for every K 2 one has (3) P ( s n (A) > Kn 1/2) (C/K) log K + c n, where C > 0 and c (0, 1) depend (polynomially) only on B. Remark. A reverse result was proved in [2]: for every ε 0, one has P ( s n (A) εn 1/2) Cε + c n. Our argument is an application of the small ball probability bounds and the structure theory developed in [2] and [3]. We shall give a complete proof of Theorem 1.2 only; we leave to the interested reader to modify the argument as in [2] to obtain Theorem 1.1. 2. Proof of Theorem 1.2 By (e k ) n we denote the canonical basis of the Euclidean space Rn equipped with the canonical inner product, and Euclidean norm 2. By C, C 1, c, c 1,... we shall denote positive constants that may possibly depend only on the subgaussian moment B. Consider vectors (X k ) n and (X k )n an n-dimensional Hilbert space H. Recall that the system (X k, Xk )n is called a biorthogonal system in H if Xj, X k = δ j,k for all j, k = 1,...,n. The system is called complete if span(x k ) = H. The following notation will be used throughout the paper: (4) H k := span(x i ) i k, H j,k := span(x i ) i {j,k}, j, k = 1,...,n. The next proposition summarizes some elementary and known properties of biorthogonal systems. Proposition 2.1 (Biorthogonal systems). 1. Let A be an n n invertible matrix with columns X k = Ae k, k = 1,..., n. Define X k = (A 1 ) e k. Then (X k, X k )n is a complete biorthogonal system in Rn.

THE LEAST SINGULAR VALUE OF A RANDOM MATRIX 3 2. Let (X k ) n be a linearly independent system in an n-dimensional Hilbert space H. Then there exist unique vectors (Xk )n such that (X k, Xk )n is a biorthogonal system in H. This system is complete. 3. Let (X k, Xk )n be a complete biorthogonal system in a Hilbert space H. Then Xk 2 = 1/dist(X k, H k ) for k = 1,...,n. Without loss of generality, we can assume that n 2 and that A is a.s. invertible (by adding independent normal random variables with small variance to all entries of A). Let u, v > 0. By (1), the following implication holds: (5) x R n : x 2 u, A 1 x 2 vn 1/2 implies s n (A) (u/v)n 1/2. We will now describe how to find such x. Consider the columns X k = Ae k of A and the subspaces H k, H j,k defined in (4). Let P 1 denote the orthogonal projection in R n onto H 1. We define the vector x := X 1 P 1 X 1. Define Xk = (A 1 ) e k. By Proposition 2.1 (X k, Xk )n is a complete biorthogonal system in R n, so (6) ker(p 1 ) = span(x 1 ). Clearly, x 2 = dist(x 1, H 1 ). Conditioning on H 1 and using a standard concentration bound, we obtain (7) P( x 2 > u) Ce cu2, u > 0. This settles the first bound in (5) with high probability. To address the second bound in (5), we write A 1 x = A 1 X 1 A 1 P 1 X 1 = e 1 A 1 P 1 X 1. Since P 1 X 1 H 1, the vector A 1 P 1 X 1 is supported in {2,..., n} and hence is orthogonal to e 1. Therefore A 1 x 2 2 > A 1 P 1 X 1 2 2 = A 1 P 1 X 1, e k 2 = P 1 (A 1 ) e k, X 1 2 = P 1 Xk, X 1 2. The first term of the last sum is zero since P 1 X1 = 0 by (6). We have proved that (8) A 1 x 2 2 Yk, X 1 2, where Yk := P 1 Xk H 1, k = 2,...,n. k=2 Lemma 2.2. (Y k, X k) n k=2 is a complete biorthogonal system in H 1.

4 MARK RUDELSON AND ROMAN VERSHYNIN Proof. By (8) and (6), Yk X k ker(p 1) = span(x1 ), so Yk = X k λ kx1 for some λ k R and all k = 2,...,n. By the orthogonality of X1 to all of X k, k = 2,..., n, we have Yj, X k = Xj, X k = δ j,k for all j, k = 2,...,n. The biorthogonality is proved. The completeness follows since dim(h 1 ) = n 1. In view of the uniqueness in Part 2 of Proposition 2.1, Lemma 2.2 has the following crucial consequence. Corollary 2.3. The system of vectors (Yk )n k=2 is uniquely determined by the system (X k ) n k=2. In particular, the system (Yk )n k=2 and the vector X 1 are statistically independent. By Part 3 of Proposition 2.1, Y k 2 = 1/dist(X k, H 1,k ). We have therefore proved that (9) A 1 x 2 2 (a k /b k ) 2, where a k = Yk Yk, X 1, bk = dist(x k, H 1,k ). 2 k=2 We will now need to bound a k above and b k below. Without loss of generality, we will do this for k = 2. We are going to use a result of [3] that states that random subspaces have no additive structure. The amount of structure is formalized by the concept of the least common denominator. Given parameters α > 0 and γ (0, 1), the least common denominator of a vector a R n is defined as LCD α,γ (a) := inf { θ > 0 : dist(θa, Z N ) < min(γ θa 2, α) }. The least common denominator of a subspace H in R n is then defined as LCD α,γ (H) = inf{lcd α,γ (a) : a H, a 2 = 1}. Since H 1,2 is the span of n 2 random vectors with i.i.d. coordinates, Theorem 4.3 of [3] yields that P { LCD α,c ((H 1,2 ) ) e cn} 1 e cn where α = c n, and c > 0 is some constant that may only depend on the subgaussian moment B. On the other hand, note that the random vector X 2 is statistically independent of the subspace H 1,2. So, conditioning on H 1,2 and using the standard concentration inequality, we obtain Therefore, the event P ( b 2 = dist(x 2, H 1,2 ) t ) Ce ct2, t > 0. (10) E := { LCD α,c ((H 1,2 ) ) e cn, b 2 < t } satisfies P(E) 1 e cn Ce ct2. Note that the event E depends only on (X j ) n j=2. So let us fix a realization of (X j ) n j=2 for which E holds. By Corollary 2.3, the vector Y2 is now fixed. By

THE LEAST SINGULAR VALUE OF A RANDOM MATRIX 5 Lemma 2.2, Y 2 is orthogonal to (X j) n j=3. Therefore Y := Y 2 / Y 2 2 (H 1,2 ), and because event E holds, we have LCD α,c (Y ) e cn. Let us write in coordinates a 2 = Y, X 1 = n i=1 Y (i)x 1 (i) and recall that Y (i) are fixed coefficients with n i=1 Y (i) 2 = 1, and X 1 (i) are i.i.d. random variables. We can now apply Small Ball Probability Theorem 3.3 of [3] (in dimension m = 1) for this random sum. It yields (11) P X1 (a 2 ε) C(ε + 1/ LCD α,c (Y ) + e c 1n ) C(ε + e c 2n ). Here the subscript in P X1 means that we the probability is with respect to the random variable X 1 while the other random variables (X j ) n j=2 are fixed; we will use similar notations later. Now we unfix all random vectors, i.e. work with P = P X1,...,X n. We have P(a 2 ε or b 2 t) = E X2,...,X n P X1 (a 2 ε or b 2 t) E X2,...,X n 1 E P X1 (a 2 ε) + P X2,...,X n (E c ) because b 2 < t on E. By (11) and (10), we continue as P(a 2 ε or b 2 t) C(ε + e c 2n ) + (e cn + Ce ct2 ) = C 1 (ε + e c 3t 2 + e cn ) := p(ε, t, n). Repeating the above argument for any k {2,..., n} instead of k = 2, we conclude that (12) P ( a k /b k ε/t ) p(ε, t, n) for ε > 0, t > 0, k = 2,...,n. From this we can easily deduce the lower bound on the sum of (a k /b k ) 2, which we need for (9). This can be done using the following elementary observation proved by applying Markov s inequality twice. Proposition 2.4. Let Z k 0, k = 1,...,n, be random variables. Then, for every ε > 0, we have ( 1 ) P Z k ε 2 P(Z k 2ε). n n We use Proposition 2.4 for Z k = (a k /b k ) 2, along with the bounds (12). In view of (9), we obtain (13) P ( A 1 x 2 (ε/t)n 1/2) 2p(4ε, t, n). Estimates (7) and (13) settle the desired bounds in (5), and therefore we conclude that P ( s n (A) (ut/ε)n 1/2) P ( x 2 u, A 1 x 2 (ε/t)n 1/2) 1 Ce cu2 2p(4ε, t, n).

6 MARK RUDELSON AND ROMAN VERSHYNIN This estimate is valid for all ε, u, t > 0. Choosing ε = 1/K, u = t = log K, the proof of Theorem 1.2 is complete. References [1] A. Edelman, Eigenvalues and condition numbers of random matrices, SIAM J. Matrix Anal. Appl. 9 (1988) 543 560 [2] M. Rudelson, R. Vershynin, The Littlewood-Offord Problem and invertibility of random matrices, Advances in Mathematics 218 (2008) 600 633 [3] M. Rudelson, R. Vershynin, The smallest singular value of a random rectangular matrix, submitted [4] J. von Neumann, Collected works. Vol. V: Design of computers, theory of automata and numerical analysis. General editor: A. H. Taub. A Pergamon Press Book The Macmillan Co., New York, 1963 (M.R.) Department of Mathematics, University of Missouri, Columbia, MO 65211, USA E-mail address: rudelson@math.missouri.edu (R.V.) Department of Mathematics, University of California, Davis, CA 95616, USA E-mail address: vershynin@math.ucdavis.edu