Sharp lower bounds on the least singular value of a random matrix without the fourth moment condition *

Electron. Commun. Probab. 20 (2015), no. 44, 1-9. DOI: 10.1214/ECP.v20-4089 ISSN: 1083-589X

ELECTRONIC COMMUNICATIONS in PROBABILITY

Pavel Yaskov (Steklov Mathematical Institute, Russia. E-mail: yaskov@mi.ras.ru)

*Supported by RNF grant 14-21-00162 from the Russian Scientific Fund.

Abstract

We obtain non-asymptotic lower bounds on the least singular value of $\mathbf X_{pn}^\top/\sqrt{n}$, where $\mathbf X_{pn}$ is a $p\times n$ random matrix whose columns are independent copies of an isotropic random vector $X_p$ in $\mathbb R^p$. We assume that there exist $M>0$ and $\alpha\in(0,2]$ such that $\mathbb P(|(X_p,v)|>t)\le M/t^{2+\alpha}$ for all $t>0$ and any unit vector $v\in\mathbb R^p$. These bounds depend on $y=p/n$, $\alpha$, $M$ and are asymptotically optimal up to a constant factor.

Keywords: Random matrices; Singular values; Heavy-tailed distributions.
AMS MSC 2010: 60B20.
Submitted to ECP on February 3, 2015, final version accepted on June 2, 2015.

1 Introduction

In this paper we obtain sharp lower bounds on the least singular value of a random matrix with independent heavy-tailed rows. For precise statements, we need to introduce some notation. Let $X_p$ be an isotropic random vector in $\mathbb R^p$, i.e. $\mathbb E X_pX_p^\top=I_p$ for the $p\times p$ identity matrix $I_p$. Let also $\mathbf X_{pn}$ be a $p\times n$ random matrix whose columns $\{X_{pk}\}_{k=1}^n$ are independent copies of $X_p$. Denote by $s_p(n^{-1/2}\mathbf X_{pn}^\top)$ the least singular value of the matrix $n^{-1/2}\mathbf X_{pn}^\top$.

The celebrated Bai-Yin theorem states that, with probability one,

$$s_p(n^{-1/2}\mathbf X_{pn}^\top)=1-\sqrt{y}+o(1)$$

when $n\to\infty$, $p=p(n)$ satisfies $p/n\to y\in(0,1)$, and the entries of $X_p$ are independent copies of a random variable $\xi$ with $\mathbb E\xi=0$, $\mathbb E\xi^2=1$, and $\mathbb E\xi^4<\infty$. In [5], Tikhomirov extended this result to the case $\mathbb E\xi^4=\infty$.

Several authors have studied non-asymptotic versions of this theorem, relaxing the independence assumption, and obtained bounds of the form

$$s_p(n^{-1/2}\mathbf X_{pn}^\top)\ge 1-Cy^{a}|\log y|^{b}$$

that hold with large probability for some $C,a,b>0$ and all small enough $y=p/n$. See papers [2], [3], [4], and [6]. For general isotropic random vectors $X_p$ with dependent entries not having finite fourth moments, the optimal values of $a$ and $b$ are unknown. Assuming that there exist $M>0$ and $\alpha\in(0,2]$ such that

$$\mathbb P(|(X_p,v)|>t)\le\frac{M}{t^{2+\alpha}}\quad\text{for all } t>0 \text{ and any unit (in the } l_2\text{-norm) vector } v\in\mathbb R^p, \tag{1.1}$$

we derive the optimal values of $a$ and $b$ in this paper.

The paper is organized as follows. Section 2 contains the main results of the paper. Section 3 deals with the proofs. An Appendix with proofs of auxiliary results is given in Section 4.

2 Main results

Our main lower bound is a corollary of Theorem 2.1 in [6]. It is given below.

Theorem 2.1. Let $C\ge 1$ and $n>p\ge 1$. If (1.1) holds for $M=C^{\alpha/2}$ and some $\alpha\in(0,2]$, then, with probability at least $1-e^{-p}$,

$$s_p(n^{-1/2}\mathbf X_{pn}^\top)\ge 1-14\cdot\begin{cases}K_\alpha(Cy)^{\alpha/(2+\alpha)}, & \alpha\in(0,2),\\ \sqrt{Cy\log(C/y)}, & \alpha=2 \text{ and } C/y>e,\\ \sqrt{Cy}, & \alpha=2 \text{ and } C/y\le e,\end{cases}$$

where $y=p/n$ and $K_\alpha=\big(1/(\alpha(1-\alpha/2))\big)^{2/(2+\alpha)}$.

The next theorem contains our main upper bound for a class of random vectors

$$X_p=\eta Z_p\quad\text{for } Z_p=(z_1,\ldots,z_p) \text{ with i.i.d. entries } \{z_i\}_{i=1}^p \text{ independent of } \eta. \tag{2.1}$$

Theorem 2.2. Let (2.1) hold for each $p\ge 1$, where $\{z_i\}_{i=1}^\infty$ are independent copies of a random variable $z$ with $\mathbb Ez=0$, $\mathbb Ez^2=1$, and $\eta$ is a random variable with $\mathbb E\eta^2=1$. If there exist $\alpha\in(0,2]$ and $C>0$ such that

$$\mathbb P(|\eta|>t)\ge\frac{C^{\alpha/2}}{t^{2+\alpha}}\quad\text{for all large enough } t>0, \tag{2.2}$$

then, for each small enough $y>0$,

$$s_p(n^{-1/2}\mathbf X_{pn}^\top)\le 1+o(1)-\frac12\begin{cases}K_\alpha(Cy)^{\alpha/(2+\alpha)}, & \alpha\in(0,2),\\ \sqrt{Cy\log(C/y)}, & \alpha=2,\end{cases}$$

almost surely as $n\to\infty$, where $p=p(n)=yn+o(n)$ and $K_\alpha$ is given in Theorem 2.1.

Theorem 2.2 and the next proposition show that, when $y$ is small enough, the lower bounds in Theorem 2.1 are asymptotically optimal up to a constant factor (equal to 14).

Proposition 2.3. For any given $C>1/4$ and $\alpha\in(0,2]$, there exists a random variable $\eta$ such that $\mathbb E\eta^2=1$, (2.2) holds, and

$$\mathbb P(|(X_p,v)|>t)\le\frac{(\kappa C)^{\alpha/2}}{t^{2+\alpha}}\quad\text{for all } t>0 \text{ and any unit vector } v\in\mathbb R^p,$$

where $X_p=\eta Z_p$, $Z_p$ is a standard normal vector in $\mathbb R^p$ that is independent of $\eta$, and $\kappa>0$ is a universal constant.

The proof of Proposition 2.3 is given at the end of the paper, before the Appendix.

3 Proofs

We will use below the following fact. By definition, $s_p(n^{-1/2}\mathbf X_{pn}^\top)$ is the square root of $\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)$, where $\lambda_p(A)$ is the least eigenvalue of a $p\times p$ matrix $A$. In addition, if $a\ge 1-b$ for some $a,b\ge 0$, then $\sqrt a\ge 1-b$. Moreover, if $a\le 1-b$ for some $a,b\ge 0$, then $\sqrt a\le 1-b/2$. Thus, to prove Theorems 2.1 and 2.2 we need to derive appropriate lower and upper bounds only for $\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)$.
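To get a feel for the size of the bounds in Theorems 2.1 and 2.2, the following minimal sketch evaluates them for given $C$, $y$ and $\alpha$. The function names (`k_alpha`, `deficit`, `lower_bound`, `asymptotic_upper_bound`) are illustrative choices, not notation from the paper, and the code only evaluates the formulas; it does not check the hypotheses of either theorem (for example, $C\ge 1$ in Theorem 2.1, or "small enough $y$" in Theorem 2.2).

```python
import math


def k_alpha(alpha):
    """K_alpha = (1 / (alpha * (1 - alpha/2))) ** (2 / (2 + alpha)), for 0 < alpha < 2."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("K_alpha is only defined for alpha in (0, 2)")
    return (1.0 / (alpha * (1.0 - alpha / 2.0))) ** (2.0 / (2.0 + alpha))


def deficit(C, y, alpha):
    """The term multiplied by 14 in Theorem 2.1 and by 1/2 in Theorem 2.2."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    if alpha < 2.0:
        return k_alpha(alpha) * (C * y) ** (alpha / (2.0 + alpha))
    if C / y > math.e:
        return math.sqrt(C * y * math.log(C / y))
    # The case alpha = 2, C/y <= e appears only in Theorem 2.1.
    return math.sqrt(C * y)


def lower_bound(C, y, alpha):
    """Right-hand side of the bound in Theorem 2.1."""
    return 1.0 - 14.0 * deficit(C, y, alpha)


def asymptotic_upper_bound(C, y, alpha):
    """Right-hand side of the bound in Theorem 2.2 without the o(1) term."""
    return 1.0 - 0.5 * deficit(C, y, alpha)
```

For instance, `lower_bound(1.0, 1e-6, 1.0)` and `asymptotic_upper_bound(1.0, 1e-6, 1.0)` share the same `deficit` term and differ only in the constants 14 and 1/2.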

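Proof of Theorem 2.1. By Theorem 2.1 in [6], for all $a>0$ and $y=p/n\in(0,1)$,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\ge c_p(a)-\frac{C_p(a)}{a}-5ya+\frac{\sqrt{C_p(2a)}\,Z}{\sqrt n},$$

where $Z=Z(p,n,a)$ is a random variable with $\mathbb EZ=0$ and $\mathbb P(Z<-t)\le e^{-t^2/2}$, $t>0$,

$$c_p(a)=\inf\mathbb E\min\{(X_p,v)^2,a\}\quad\text{and}\quad C_p(a)=\sup\mathbb E(X_p,v)^2\min\{(X_p,v)^2,a\}$$

with $\inf$ and $\sup$ taken over all unit vectors $v\in\mathbb R^p$. Since $\mathbb P(Z<-\sqrt{2p})\le e^{-p}$ and $y=p/n$, we have, with probability at least $1-e^{-p}$,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\ge c_p(a)-\frac{C_p(a)}{a}-5ya-\sqrt{2yC_p(2a)}. \tag{3.1}$$

To estimate $c_p(a)$ and $C_p(a)$, we will use the following lemma that is proved in the Appendix.

Lemma 3.1. Let $a>0$, $X_p$ be an isotropic random vector in $\mathbb R^p$, and (1.1) hold for some $M>0$ and $\alpha\in(0,2]$. If $\alpha\in(0,2)$, then

$$c_p(a)\ge 1-\frac{2M}{\alpha a^{\alpha/2}}\quad\text{and}\quad C_p(a)\le\big(2/\alpha+4/(2-\alpha)\big)Ma^{1-\alpha/2}.$$

In addition, if $\alpha=2$, then

$$c_p(a)\ge 1-\frac{M}{a}\quad\text{and}\quad C_p(a)\le 2M+M\log(a^2/M)\,I(a^2>M).$$

First, assume that $\alpha\in(0,2)$. Using (1.1) and Lemma 3.1, we get

$$c_p(a)-\frac{C_p(a)}{a}\ge 1-\Big[\frac4\alpha+\frac{4}{2-\alpha}\Big]\frac{M}{a^{\alpha/2}}=1-\frac{8Ma^{-\alpha/2}}{\alpha(2-\alpha)}.$$

Taking

$$a=\Big[\frac{2My^{-1}}{\alpha(2-\alpha)}\Big]^{2/(2+\alpha)}=K_\alpha(M/y)^{2/(2+\alpha)},$$

we have

$$ya=\frac{2Ma^{-\alpha/2}}{\alpha(2-\alpha)}\quad\text{and}\quad c_p(a)-\frac{C_p(a)}{a}\ge 1-4ya.$$

In addition,

$$\frac{C_p(2a)}{2a}\le\Big[\frac2\alpha+\frac{4}{2-\alpha}\Big]M(2a)^{-\alpha/2}\le\Big[\frac4\alpha+\frac{4}{2-\alpha}\Big]Ma^{-\alpha/2}=\frac{8Ma^{-\alpha/2}}{\alpha(2-\alpha)}=4ya$$

and

$$\sqrt{2yC_p(2a)}\le\sqrt{2y(8a^2y)}=4ya=4K_\alpha(M^{2/\alpha}y)^{\alpha/(2+\alpha)}.$$

Since $C=M^{2/\alpha}$, we infer from (3.1) that, with probability at least $1-e^{-p}$,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\ge 1-13ya=1-13K_\alpha(Cy)^{\alpha/(2+\alpha)}.$$

Thus we get the desired lower bounds for $\alpha\in(0,2)$.

Suppose now $\alpha=2$. Then $M=C^{\alpha/2}=C\ge 1$ and $\log(a^2/C)\le\log(a^2)$ for any $a>0$. Lemma 3.1 implies that

$$c_p(a)-\frac{C_p(a)}{a}\ge 1-\frac{3C+C\log(a^2)I(a^2>C)}{a}.$$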

Consider two possibilities: $\log(C/y)>1$ and $\log(C/y)\le 1$. Assuming that $\log(C/y)\le 1$ and taking $a=\sqrt{C/y}$, we have $a^2>C$, $\log(a^2)\le 1$, and

$$\frac{3C+C\log(a^2)}{a}\le\frac{4C}{a}=4\sqrt{Cy}.$$

Additionally, we get $5ya=5\sqrt{Cy}$,

$$C_p(2a)\le 2C+C\log(4a^2)\le(3+\log 4)C\le 9C/2\quad\text{and}\quad\sqrt{2yC_p(2a)}\le 3\sqrt{Cy}.$$

As a result, we conclude from (3.1) that, with probability at least $1-e^{-p}$,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\ge 1-12\sqrt{Cy}.$$

Suppose $\log(C/y)>1$. Set $a=\sqrt{(C/y)\log(C/y)}$. Then $a^2>C$, $\sqrt{C/y}\le a\le C/y$, and

$$\frac{3C+C\log(a^2)}{a}\le\frac{3C+C\log(C/y)^2}{\sqrt{(C/y)\log(C/y)}}\le 5\sqrt{Cy\log(C/y)}.$$

Similarly,

$$C_p(2a)\le 2C+C\log(4a^2)\le 7C/2+C\log(a^2)\le(7/2+2)\,C\log(C/y)\quad\text{and}\quad\sqrt{2yC_p(2a)}\le 4\sqrt{Cy\log(C/y)}.$$

Noting that $5ya=5\sqrt{Cy\log(C/y)}$, we infer that, with probability at least $1-e^{-p}$,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\ge 1-14\sqrt{Cy\log(C/y)}.$$

Thus we have proved the theorem.

Proof of Theorem 2.2. We will use the following lemma (for the proof, see the Appendix).

Lemma 3.2. Under the conditions of Theorem 2.2,

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\le\max\{0,\sup_{s>0}\lambda(s)\}+o(1)\quad\text{a.s.},\quad n\to\infty, \tag{3.2}$$

where $p=p(n)$, $p/n\to y\in(0,1)$, and $\lambda(s)=-y/s+\mathbb E\,\eta^2/(1+s\eta^2)$.

We estimate $\lambda=\lambda(s)$ given in Lemma 3.2 as follows. Set $\zeta=\eta^2$. Since $\mathbb E\zeta=1$,

$$\lambda(s)+\frac ys=\mathbb E\frac{\zeta}{1+s\zeta}=1+\mathbb E\Big(\frac{\zeta}{1+s\zeta}-\zeta\Big)=1-\mathbb E\frac{s\zeta^2}{1+s\zeta}.$$

It follows from the inequality $x/(1+x)\ge\min\{x,1\}/2$, $x\ge 0$, and (4.1) that

$$\mathbb E\frac{s\zeta^2}{1+s\zeta}\ge\frac12\mathbb E\zeta\min\{s\zeta,1\}=\frac{1}{2s}\big[\mathbb E(s\zeta-1)I(s\zeta>1)+\mathbb E\min\{(s\zeta)^2,1\}\big].$$

As a result, for all $s>0$, we get the following upper bound

$$\lambda(s)\le 1-\frac ys-\frac{1}{2s}\big[\mathbb E(s\zeta-1)I(s\zeta>1)+\mathbb E\min\{(s\zeta)^2,1\}\big]. \tag{3.3}$$

Recall also that, by (2.2) and the definition of $\zeta$ $(=\eta^2)$, there exists $t_0\ge 1$ such that

$$\mathbb P(\zeta>t)\ge\frac{C^{\alpha/2}}{t^{1+\alpha/2}}\quad\text{for all } t\ge t_0. \tag{3.4}$$
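As a numerical illustration of the right-hand side of (3.2), the sketch below maximizes $\lambda(s)=-y/s+\mathbb E\,\zeta/(1+s\zeta)$ for a discrete $\zeta$ by ternary search on a logarithmic grid. The helper names `lam` and `sup_lam` and the search interval are assumptions made for this illustration. The search relies on $\lambda$ first increasing and then decreasing in $s$, which holds when $y<\mathbb P(\zeta>0)$. The usage example takes the degenerate law $\zeta\equiv 1$, which does not satisfy (2.2) and is used only as a sanity check of the formula: in that case $\sup_{s>0}\lambda(s)=(1-\sqrt y)^2$, the square of the Bai-Yin limit.

```python
import math


def lam(s, y, zeta_values, zeta_probs):
    """lambda(s) = -y/s + E[zeta / (1 + s*zeta)] for a discrete zeta >= 0."""
    mean_term = sum(p * z / (1.0 + s * z) for z, p in zip(zeta_values, zeta_probs))
    return -y / s + mean_term


def sup_lam(y, zeta_values, zeta_probs, lo=1e-8, hi=1e8, iters=200):
    """Ternary search for the maximizer of lambda over s in [lo, hi], on a log scale.

    Assumes lambda is unimodal in s (true when y < P(zeta > 0)) and that the
    maximizer lies inside [lo, hi]; both are assumptions of this sketch.
    """
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if lam(math.exp(m1), y, zeta_values, zeta_probs) < lam(math.exp(m2), y, zeta_values, zeta_probs):
            a = m1  # the maximum lies to the right of m1
        else:
            b = m2  # the maximum lies to the left of m2
    s_star = math.exp(0.5 * (a + b))
    return s_star, lam(s_star, y, zeta_values, zeta_probs)


# Degenerate sanity check: zeta = 1 almost surely, y = 0.25.
s_star, lam_star = sup_lam(0.25, [1.0], [1.0])
# lam_star should be close to (1 - sqrt(0.25))**2 = 0.25, attained near s = 1.
```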

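As in the proof of Lemma 3.2 (see the Appendix), we get that $\lambda'(s)=(y-h(s))/s^2$, $s>0$, where $h(s)=\mathbb E(s\zeta)^2/(1+s\zeta)^2$ is a continuous strictly increasing function on $\mathbb R_+$ with $h(0)=0$ and $h(\infty)=\mathbb P(\zeta>0)>0$. Hence, if $y<\mathbb P(\zeta>0)$, $\lambda(s)$ achieves its maximum in $s=b$ with $b=h^{-1}(y)$.

Let $\alpha\in(0,2)$ and take $y$ small enough to make $b=h^{-1}(y)\le 1/(2^{1/(1-\alpha/2)}t_0)$. Then $1/b>t_0$ and, by (3.4),

$$\mathbb E(b\zeta-1)I(b\zeta>1)=\int_1^\infty\mathbb P(b\zeta>t)\,dt\ge\int_1^\infty\frac{C^{\alpha/2}}{(t/b)^{1+\alpha/2}}\,dt=\frac2\alpha(Cb)^{\alpha/2}b.$$

Moreover, $(1/b)^{1-\alpha/2}/2\ge t_0^{1-\alpha/2}$ and, by (3.4),

$$\mathbb E\min\{(b\zeta)^2,1\}=\int_0^1\mathbb P((b\zeta)^2>t)\,dt=2b^2\int_0^{1/b}z\,\mathbb P(\zeta>z)\,dz\ge 2b^2\int_{t_0}^{1/b}\frac{C^{\alpha/2}}{z^{\alpha/2}}\,dz$$

$$=2C^{\alpha/2}b^2\,\frac{(1/b)^{1-\alpha/2}-t_0^{1-\alpha/2}}{1-\alpha/2}\ge 2C^{\alpha/2}b^2\,\frac{(1/b)^{1-\alpha/2}/2}{1-\alpha/2}=\frac{(Cb)^{\alpha/2}b}{1-\alpha/2}.$$

By (3.3), $\lambda(b)\le g(b)$, where $g(b)=1-y/b-Kb^{\alpha/2}$ and

$$K=\frac{C^{\alpha/2}}{2}\Big(\frac{1}{\alpha/2}+\frac{1}{1-\alpha/2}\Big)=\frac{C^{\alpha/2}}{\alpha(1-\alpha/2)}.$$

By Young's inequality,

$$(K^{2/\alpha}y)^{\frac{\alpha}{2+\alpha}}=\Big(\frac yb\Big)^{\frac{\alpha}{2+\alpha}}(Kb^{\alpha/2})^{\frac{2}{2+\alpha}}\le\frac{y/b}{(2+\alpha)/\alpha}+\frac{Kb^{\alpha/2}}{(2+\alpha)/2}\le\frac yb+Kb^{\alpha/2}$$

and

$$\lambda(b)\le g(b)\le 1-(K^{2/\alpha}y)^{\alpha/(2+\alpha)}.$$

The right-hand side of the last inequality can be made positive for small enough $y$. Hence, combining the above bounds with Lemma 3.2, we get the desired upper bound for $\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)$ when $\alpha\in(0,2)$ (see also the beginning of Section 3).

Let now $\alpha=2$ and take $y$ small enough to make $b=h^{-1}(y)\le 1/t_0^2$. Since $t_0\ge 1$, we have $1/b\ge t_0^2\ge t_0$ and, hence, the same arguments as above yield

$$\mathbb E(b\zeta-1)I(b\zeta>1)=\int_1^\infty\mathbb P(b\zeta>t)\,dt\ge\int_1^\infty\frac{C}{(t/b)^2}\,dt=Cb^2,$$

$$\mathbb E\min\{(b\zeta)^2,1\}\ge 2b^2\int_{t_0}^{1/b}\frac Cz\,dz=2Cb^2\log\frac{1}{bt_0}\ge 2Cb^2\log\frac{1}{\sqrt b}=Cb^2\log(1/b).$$

Therefore, it follows from (3.3) that $\lambda(b)\le g(b)$, where

$$g(s)=1-\frac ys-\frac{Cs}{2}\big(\log(1/s)+1\big),\quad s>0.$$

Differentiating $g$ yields

$$g'(s)=\frac{y}{s^2}-\frac C2\big(\log(1/s)+1\big)+\frac{Cs}{2}\cdot\frac1s=\frac{2y-Cs^2\log(1/s)}{2s^2}.$$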
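The derivative formula for the case $\alpha=2$ can be sanity-checked numerically. The sketch below compares the closed form of $g'$ with a central finite difference and locates, by bisection, the smaller positive solution of $2y=Cs^2\log(1/s)$, where $g'$ vanishes. The function names and the sample values of $y$ and $C$ in the usage lines are illustrative assumptions; the bisection uses the fact that $s^2\log(1/s)$ is increasing on $(0,e^{-1/2})$.

```python
import math


def g(s, y, C):
    """g(s) = 1 - y/s - (C*s/2) * (log(1/s) + 1), s > 0."""
    return 1.0 - y / s - 0.5 * C * s * (math.log(1.0 / s) + 1.0)


def g_prime(s, y, C):
    """Closed form g'(s) = (2y - C s^2 log(1/s)) / (2 s^2)."""
    return (2.0 * y - C * s * s * math.log(1.0 / s)) / (2.0 * s * s)


def smaller_root(y, C, iters=200):
    """Bisection for the root of s^2 log(1/s) = 2y/C on (0, e^{-1/2})."""
    target = 2.0 * y / C
    lo, hi = 0.0, math.exp(-0.5)
    if hi * hi * math.log(1.0 / hi) <= target:
        raise ValueError("2y/C is too large: no root on (0, e^{-1/2})")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid * mid * math.log(1.0 / mid) < target:
            lo = mid  # left side still below the target: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values (assumptions of this sketch).
y, C = 1e-4, 1.0
s, step = 0.01, 1e-6
finite_difference = (g(s + step, y, C) - g(s - step, y, C)) / (2.0 * step)
root = smaller_root(y, C)
# finite_difference should agree with g_prime(s, y, C); g_prime(root, y, C) should vanish.
```

At the root, $Cs^2\log(1/s)=2y$, so $g$ evaluates to $1-2y/s-Cs/2$, which is at most $1-2y/s$.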

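If $2y/C$ is small enough, then $g=g(s)$ has a unique local maximum in $s_1$ and a unique local minimum in $s_2$, where $s_1<s_2$, and $s_1,s_2$ are solutions to the equation $f(s)=2y/C$ with $f(s)=s^2\log(1/s)$. The function $f=f(s)$ is increasing on $[0,1/\sqrt e]$, decreasing on $[1/\sqrt e,\infty)$ and has $f(0)=f(1)=0$. Hence, $s_2>1/2$ and $b=h^{-1}(y)<1/2$ when $y$ is small enough. Thus,

$$\lambda(b)\le g(b)\le 1-\frac{y}{s_1}-\frac{Cs_1}{2}\big(\log(1/s_1)+1\big)\le 1-\frac{y}{s_1}-\frac{Cs_1^2\log(1/s_1)}{2s_1}=1-\frac{2y}{s_1}.$$

Let us bound $s_1$ from above. Take $s_0=\sqrt{(4y/C)/\log(C/y)}$. If $y$ is small enough, then $s_0<1/\sqrt e$ as well as

$$s_0^2\log(1/s_0)=\frac{4y/C}{\log(C/y)}\Big[\frac12\log(C/y)+\frac12\log\Big(\frac14\log(C/y)\Big)\Big]=\frac{2y}{C}+\frac{2y}{C}\cdot\frac{\log\big(\log(C/y)/4\big)}{\log(C/y)}>\frac{2y}{C}.$$

Therefore, $s_1<s_0$ and

$$\lambda(b)\le 1-\frac{2y}{s_1}\le 1-\frac{2y}{s_0}=1-\sqrt{Cy\log(C/y)}.$$

The right-hand side of the last inequality can be made positive for small enough $y$. Hence, combining the above bounds with Lemma 3.2, we get the desired upper bound for $\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)$ in the case with $\alpha=2$ (see also the beginning of Section 3).

Proof of Proposition 2.3. Let $t_0=(1+2/\alpha)^{-1}$ and $q=C/t_0^{1+2/\alpha}$. If $\alpha\in(0,2]$, then

$$q\ge C\inf_{\alpha\in(0,2]}(1+2/\alpha)^{1+2/\alpha}=4C>1.$$

Let $\eta=\sqrt{\xi\zeta}$, where $\xi$ and $\zeta$ are independent random variables,

$$\mathbb P(\xi=q)=q^{-1}\quad\text{and}\quad\mathbb P(\xi=0)=1-q^{-1},$$

and $\zeta$ has the Pareto distribution

$$\mathbb P(\zeta>t)=\begin{cases}(t_0/t)^{1+\alpha/2}, & t\ge t_0,\\ 1, & t<t_0.\end{cases}$$

It is easy to see that $\mathbb E\xi=1$. Moreover, $\mathbb P(\zeta>t)\le(t_0/t)^{1+\alpha/2}$ for all $t>0$ and

$$\mathbb E\zeta=\int_0^\infty\mathbb P(\zeta>t)\,dt=t_0+\int_{t_0}^\infty(t_0/t)^{1+\alpha/2}\,dt=t_0+\frac{2t_0}{\alpha}=1.$$

Hence, $\mathbb E\eta^2=\mathbb E\xi\,\mathbb E\zeta=1$. In addition, (2.2) holds since, for all large enough $t>0$,

$$\mathbb P(|\eta|>t)=q^{-1}\mathbb P(\zeta>t^2/q)=q^{-1}(qt_0/t^2)^{1+\alpha/2}=\frac{q^{\alpha/2}t_0^{1+\alpha/2}}{t^{2+\alpha}}=\frac{C^{\alpha/2}}{t^{2+\alpha}}.$$

We also have $(X_p,v)=\sqrt{\xi\zeta}\,(Z_p,v)\overset{d}{=}\sqrt{\xi\zeta}\,Z$ for all unit vectors $v\in\mathbb R^p$, where $Z\sim\mathcal N(0,1)$ is independent of $(\xi,\zeta)$ and $\overset{d}{=}$ means equality in law. Hence, if $t>0$,

$$\mathbb P(\sqrt{\xi\zeta}\,|Z|>t)=\mathbb E\,\mathbb P(s\zeta>t^2)\big|_{s=\xi Z^2}\le\mathbb E\big(st_0/t^2\big)^{1+\alpha/2}I(s>0)\big|_{s=\xi Z^2}\le\frac{\mathbb E(t_0\xi Z^2)^{1+\alpha/2}}{t^{2+\alpha}}$$

$$=\frac{t_0^{1+\alpha/2}q^{\alpha/2}\,\mathbb E|Z|^{2+\alpha}}{t^{2+\alpha}}=\frac{C^{\alpha/2}\,\mathbb E|Z|^{2+\alpha}}{t^{2+\alpha}}\le\frac{(\kappa C)^{\alpha/2}}{t^{2+\alpha}},$$

where

$$\kappa=\sup_{\alpha\in(0,2]}\big(\mathbb E|Z|^{2+\alpha}\big)^{2/\alpha}.$$

Let us show that $\kappa<\infty$. If $Z\sim\mathcal N(0,1)$, then

$$f(\alpha)=\mathbb E|Z|^{2+\alpha}=\frac{2^{\frac{2+\alpha}{2}}}{\sqrt\pi}\,\Gamma\Big(\frac{3+\alpha}{2}\Big)$$

is a smooth function on $[0,2]$ with $f(0)=1$ and, in particular, $f'(0)$ exists and is finite. The function $g(\alpha)=f(\alpha)^{2/\alpha}$ is continuous on $(0,2]$ and

$$g(\alpha)=(1+f'(0)\alpha+o(\alpha))^{2/\alpha}\to\exp\{2f'(0)\},\quad\alpha\to 0+.$$

As a result, $\kappa=\sup\{g(\alpha):\alpha\in(0,2]\}$ is finite. This finishes the proof of the proposition.

4 Appendix

Proof of Lemma 3.1. If $U$ is a non-negative random variable with $\mathbb EU=1$, then

$$\mathbb E\min\{U,a\}=\int_0^a\mathbb P(U>t)\,dt=\mathbb EU-\int_a^\infty\mathbb P(U>t)\,dt\ge 1-\int_a^\infty\frac{M\,dt}{t^{1+\alpha/2}}=1-\frac{2M}{\alpha a^{\alpha/2}},$$

where $M=\sup\{t^{1+\alpha/2}\,\mathbb P(U>t):t>0\}$. Putting $U=(X_p,v)^2$ for a given unit vector $v\in\mathbb R^p$ and taking the infimum over such $v$, we obtain the desired lower bound for $c_p(a)$. Similarly, we have

$$\mathbb EU\min\{U,a\}=a\,\mathbb E(U-a)I(U>a)+a^2\,\mathbb P(U>a)+\mathbb EU^2I(U\le a)=a\,\mathbb E(U-a)I(U>a)+\mathbb E\min\{U^2,a^2\}=I_1+I_2, \tag{4.1}$$

where

$$I_1=a\int_a^\infty\mathbb P(U>t)\,dt\le a\int_a^\infty\frac{M}{t^{1+\alpha/2}}\,dt=\frac{2M}{\alpha}a^{1-\alpha/2},\qquad I_2=\int_0^{a^2}\mathbb P(U^2>t)\,dt.$$

If $\alpha\in(0,2)$, then $I_2$ can be bounded as follows:

$$I_2\le\int_0^{a^2}\frac{M\,dt}{t^{1/2+\alpha/4}}=\frac{Ma^{1-\alpha/2}}{1/2-\alpha/4}.$$

Similarly, if $\alpha=2$, then

$$I_2\le M+I(a^2>M)\int_M^{a^2}\frac{M\,dt}{t}=M+M\log(a^2/M)\,I(a^2>M).$$

Thus, we have proved that

$$\mathbb EU\min\{U,a\}\le M\cdot\begin{cases}\big(2/\alpha+4/(2-\alpha)\big)a^{1-\alpha/2}, & \alpha\in(0,2),\\ 2+\log(a^2/M)\,I(a^2>M), & \alpha=2.\end{cases}$$

Putting $U=(X_p,v)^2$ for a given unit vector $v\in\mathbb R^p$ and taking the supremum over such $v$, we get the desired upper bound for $C_p(a)$.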
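The two bounds in the proof of Lemma 3.1 can be checked numerically on a concrete law. The sketch below is an illustration under stated assumptions, not part of the proof: it takes for $U$ the Pareto law used for $\zeta$ in the proof of Proposition 2.3, with $\alpha\in(0,2)$ and $a\ge t_0$, and evaluates the integrals with a plain midpoint rule. The helper names are hypothetical. For this law $M=t_0^{1+\alpha/2}$, and the lower bound for $\mathbb E\min\{U,a\}$ is attained with equality when $a\ge t_0$.

```python
import math


def survival(t, alpha):
    """P(U > t) = (t0/t)^(1+alpha/2) for t >= t0 and 1 otherwise, t0 = 1/(1+2/alpha)."""
    t0 = 1.0 / (1.0 + 2.0 / alpha)
    return 1.0 if t < t0 else (t0 / t) ** (1.0 + alpha / 2.0)


def midpoint_integral(func, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def check_lemma_3_1(a, alpha):
    """Return (E min{U,a}, its lower bound, E U min{U,a}, its upper bound).

    Assumes 0 < alpha < 2 and a >= t0 (needed for the closed form of I_1).
    """
    t0 = 1.0 / (1.0 + 2.0 / alpha)
    M = t0 ** (1.0 + alpha / 2.0)  # sup over t of t^(1+alpha/2) * P(U > t)
    e_min = midpoint_integral(lambda t: survival(t, alpha), 0.0, a)
    lower = 1.0 - 2.0 * M / (alpha * a ** (alpha / 2.0))
    # I_1 = a * int_a^inf P(U > t) dt in closed form; I_2 = int_0^{a^2} P(U^2 > t) dt.
    i1 = a * (2.0 * M / alpha) * a ** (-alpha / 2.0)
    i2 = midpoint_integral(lambda t: survival(math.sqrt(t), alpha), 0.0, a * a)
    upper = M * (2.0 / alpha + 4.0 / (2.0 - alpha)) * a ** (1.0 - alpha / 2.0)
    return e_min, lower, i1 + i2, upper


e_min, lower, e_u_min, upper = check_lemma_3_1(a=5.0, alpha=1.0)
# e_min should agree with lower up to quadrature error, and e_u_min should not exceed upper.
```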

Proof of Lemma 3.2. We have $n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top=n^{-1}\mathbf Z_{pn}T_n\mathbf Z_{pn}^\top$, where $\mathbf Z_{pn}$ is a $p\times n$ matrix with i.i.d. entries, $T_n$ is an $n\times n$ diagonal matrix whose diagonal entries are independent copies of $\zeta=\eta^2$, and $\mathbf Z_{pn}$ is independent of $T_n$. By the Glivenko-Cantelli theorem, the empirical spectral distribution of $T_n$ converges a.s. to the distribution of $\zeta$. By Theorem 4.3 in [1], there is a non-decreasing càdlàg function $F=F(\lambda)$, $\lambda\in\mathbb R$, such that $F(\lambda)=0$ for $\lambda<0$, $F(\infty)\le 1$, and

$$\mathbb P\Big(\lim_{n\to\infty}\frac1p\sum_{k=1}^pI(\lambda_{kn}\le\lambda)=F(\lambda)\Big)=1\quad\text{for all continuity points } \lambda \text{ of } F, \tag{4.2}$$

where $p=p(n)=yn+o(n)$ and $\{\lambda_{kn}\}_{k=1}^p$ is the set of eigenvalues of $p^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top$. The Stieltjes transform

$$f(z)=\int_{\mathbb R}\frac{F(d\lambda)}{\lambda-z},\quad z\in\mathbb C^+=\{w\in\mathbb C:\Im w>0\}, \tag{4.3}$$

of $F$ can be defined explicitly as a unique solution in $\mathbb C^+$ to the equation

$$f(z)=-\Big(z-\frac1y\,\mathbb E\frac{\zeta}{1+f(z)\zeta}\Big)^{-1}\quad\text{or, equivalently,}\quad z=-\frac{1}{f(z)}+\frac1y\,\mathbb E\frac{\zeta}{1+f(z)\zeta}. \tag{4.4}$$

Define

$$S_G=\{\lambda:G(\lambda+\varepsilon)>G(\lambda-\varepsilon)\text{ for any small enough }\varepsilon>0\}$$

for a non-decreasing càdlàg function $G=G(\lambda)$, $\lambda\in\mathbb R$. In other words, $S_G$ is the set of points of increase of $G$. Obviously, $S_G$ is a closed set. Using (4.2) and setting $G=F$ as well as $a=\inf\{\lambda:\lambda\in S_F\}$, we conclude that $a\in S_F$ and

$$\lambda_p(n^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)=\frac pn\,\lambda_p(p^{-1}\mathbf X_{pn}\mathbf X_{pn}^\top)\le ya+o(1)\quad\text{a.s.} \tag{4.5}$$

when $n\to\infty$. Consider the function

$$z(s)=-\frac1s+\frac1y\,\mathbb E\frac{\zeta}{1+s\zeta}$$

defined for $s\in D$, where $D$ consists of all $s\in\mathbb R\setminus\{0\}$ with $-s^{-1}\notin S_G$ for $G(\lambda)=\mathbb P(\zeta\le\lambda)$, $\lambda\in\mathbb R$. This function differs from $\lambda=\lambda(s)$ given in Lemma 3.2 by the factor $y$, i.e. $\lambda(s)=yz(s)$ for all $s>0$. Therefore, to finish the proof, we only need to show that

$$a=\max\{0,\sup_{s>0}z(s)\}.$$

Let us show that $a=0$ when $z(s)\le 0$ for all $s>0$. The latter can be reformulated as follows: if $a>0$, then there is $s>0$ satisfying $z(s)>0$. Suppose $a>0$. Then $a/2\in\mathbb R\setminus S_F$ and $F(a/2)=0$. Hence,

$$f(a/2)=\int_{\mathbb R}\frac{F(d\lambda)}{\lambda-a/2}>0\quad\text{and}\quad\lim_{\varepsilon\to 0+}f(a/2+i\varepsilon)=f(a/2)>0.$$

Taking $z=a/2+i\varepsilon$ in (4.4) and tending $\varepsilon$ to zero, we get $a/2=z(s)>0$ for $s=f(a/2)$.

Assume further that there is $s>0$ satisfying $z(s)>0$ or, equivalently,

$$g(s)=\mathbb E\frac{s\zeta}{1+s\zeta}>y.$$

The function $g=g(s)$ is continuous and strictly increasing on $\mathbb R_+$. It changes from zero to $\mathbb P(\zeta>0)$ when $s$ changes from zero to infinity. The same can be said about

$$h(s)=\mathbb E\frac{(s\zeta)^2}{(1+s\zeta)^2}.$$

Hence, $y<\mathbb P(\zeta>0)$ and there is $b=b(y)>0$ that solves $h(b)=y$. By the Lebesgue dominated convergence theorem,

$$z'(s)=\frac{1}{s^2}-\frac1y\,\mathbb E\frac{\zeta^2}{(1+s\zeta)^2}=\frac{y-h(s)}{ys^2}\quad\text{for any } s>0.$$

Therefore, $b$ is a strict global maximum point of $z=z(s)$ on $\{s:s>0\}$.

The rest of the proof is based on Lemma 6.1 in [1] which states that $z'(s)>0$ and $s\in D$ if $s=f(\lambda)$ for some $\lambda\in\mathbb R\setminus S_F$. Moreover,

$$\{z(s):s\in D,\ z'(s)>0\}\subseteq\mathbb R\setminus S_F.$$

We will now prove that $a\le z(b)$. Suppose the contrary, i.e. $a>z(b)$. By definition, $F(\lambda)=0$ for all $\lambda<a$. Set $z_0=z(b)$. Then $z_0\in\mathbb R\setminus S_F$,

$$s_0=f(z_0)=\int_{\mathbb R}\frac{F(d\lambda)}{\lambda-z_0}>0,$$

and, by the above lemma, $z'(s_0)>0$. Taking $z=z_0+i\varepsilon$ in (4.4) and tending $\varepsilon$ to zero, we arrive at $z(b)=z_0=z(f(z_0))=z(s_0)$. Since $z'(s_0)>0$ and $s_0>0$, we get the contradiction to the fact that $b$ is a strict global maximum point of $z=z(s)$ on $\{s:s>0\}$.

Let us finally prove that $a\ge z(b)$. The function $z=z(s)$ is continuous and strictly increasing on the set $(0,b)$ with $z(0+)=-\infty$ and $z(b-)=z(b)$. By the above lemma,

$$z((0,b))=(-\infty,z(b))\subseteq\mathbb R\setminus S_F.$$

Thus, $a\ge z(b)$. This finishes the proof.

References

[1] Bai, Zh. and Silverstein, J.: Spectral analysis of large dimensional random matrices. Second Edition. New York: Springer, (2010). MR-2567175
[2] Koltchinskii, V. and Mendelson, S.: Bounding the smallest singular value of a random matrix without concentration. arXiv:1312.3580.
[3] Oliveira, R.I.: The lower tail of random quadratic forms, with applications to ordinary least squares and restricted eigenvalue properties. arXiv:1312.2903.
[4] Srivastava, N. and Vershynin, R.: Covariance estimation for distributions with 2 + ε moments. Ann. Probab., 41, (2013), 3081-3111. MR-3127875
[5] Tikhomirov, K.: The limit of the smallest singular value of random matrices with i.i.d. entries. arXiv:1410.6263.
[6] Yaskov, P.: Lower bounds on the smallest eigenvalue of a sample covariance matrix. Electron. Commun. Probab., 19 (2014), paper 83, 1-10. MR-3291620
