Inhomogeneous circular laws for random matrices with non-identically distributed entries

Similar documents
Random regular digraphs: singularity and spectrum

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)

Small Ball Probability, Arithmetic Structure and Random Matrices

Random Matrices: Invertibility, Structure, and Applications

LOWER BOUNDS FOR THE SMALLEST SINGULAR VALUE OF STRUCTURED RANDOM MATRICES. By Nicholas Cook University of California, Los Angeles

Operator-Valued Free Probability Theory and Block Random Matrices. Roland Speicher Queen s University Kingston

Invertibility of random matrices

arxiv: v1 [math.pr] 22 May 2008

1 Intro to RMT (Gene)

The Matrix Dyson Equation in random matrix theory

Concentration Inequalities for Random Matrices

, then the ESD of the matrix converges to µ cir almost surely as n tends to.

Comparison Method in Random Matrix Theory

Random matrices: Distribution of the least singular value (via Property Testing)

Semicircle law on short scales and delocalization for Wigner random matrices

III. Quantum ergodicity on graphs, perspectives

Anti-concentration Inequalities

Eigenvalue variance bounds for Wigner and covariance random matrices

Local law of addition of random matrices

RANDOM MATRICES: OVERCROWDING ESTIMATES FOR THE SPECTRUM. 1. introduction

RESEARCH STATEMENT ANIRBAN BASAK

The rank of random regular digraphs of constant degree

Eigenvalues and Singular Values of Random Matrices: A Tutorial Introduction

Lecture 3. Random Fourier measurements

arxiv: v3 [math.pr] 8 Oct 2017

Invertibility of symmetric random matrices

Exponential tail inequalities for eigenvalues of random matrices

Local semicircle law, Wegner estimate and level repulsion for Wigner random matrices

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Isotropic local laws for random matrices

Random Matrix: From Wigner to Quantum Chaos

arxiv: v2 [math.fa] 7 Apr 2010

Concentration and Anti-concentration. Van H. Vu. Department of Mathematics Yale University

arxiv: v2 [math.pr] 16 Aug 2014

RANDOM MATRICES: TAIL BOUNDS FOR GAPS BETWEEN EIGENVALUES. 1. Introduction

Fluctuations from the Semicircle Law Lecture 4

Least singular value of random matrices. Lewis Memorial Lecture / DIMACS minicourse March 18, Terence Tao (UCLA)

RANDOM MATRICES: LAW OF THE DETERMINANT

Upper Bound for Intermediate Singular Values of Random Sub-Gaussian Matrices 1

Extreme eigenvalues of Erdős-Rényi random graphs

EIGENVECTORS OF RANDOM MATRICES OF SYMMETRIC ENTRY DISTRIBUTIONS. 1. introduction

Lectures 2 3 : Wigner s semicircle law

Inverse Theorems in Probability. Van H. Vu. Department of Mathematics Yale University

Determinantal point processes and random matrix theory in a nutshell

THE SPARSE CIRCULAR LAW UNDER MINIMAL ASSUMPTIONS

Random matrices: A Survey. Van H. Vu. Department of Mathematics Rutgers University

arxiv: v4 [math.pr] 13 Mar 2012

arxiv: v5 [math.na] 16 Nov 2017

THE EIGENVALUES AND EIGENVECTORS OF FINITE, LOW RANK PERTURBATIONS OF LARGE RANDOM MATRICES

Kernel Method: Data Analysis with Positive Definite Kernels

REPORT DOCUMENTATION PAGE

Non white sample covariance matrices.

Local Kesten McKay law for random regular graphs

Independence and chromatic number (and random k-sat): Sparse Case. Dimitris Achlioptas Microsoft

The uniform uncertainty principle and compressed sensing Harmonic analysis and related topics, Seville December 5, 2008

Random Matrices for Big Data Signal Processing and Machine Learning

A note on a Marčenko-Pastur type theorem for time series. Jianfeng. Yao

Reconstruction from Anisotropic Random Measurements

Convergence of spectral measures and eigenvalue rigidity

arxiv: v3 [math.pr] 9 Nov 2015

The Gaussian free field, Gibbs measures and NLS on planar domains

LUCK S THEOREM ALEX WRIGHT

Random Fermionic Systems

Fluctuations of random tilings and discrete Beta-ensembles

Fluctuations of random tilings and discrete Beta-ensembles

Spectral Statistics of Erdős-Rényi Graphs II: Eigenvalue Spacing and the Extreme Eigenvalues

285K Homework #1. Sangchul Lee. April 28, 2017

Free Probability Theory and Random Matrices. Roland Speicher Queen s University Kingston, Canada

arxiv: v4 [math.pr] 10 Sep 2018

Stein s Method for Matrix Concentration

A new method to bound rate of convergence

THE SMALLEST SINGULAR VALUE OF A RANDOM RECTANGULAR MATRIX

arxiv: v4 [math.pr] 10 Mar 2015

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices

THE LITTLEWOOD-OFFORD PROBLEM AND INVERTIBILITY OF RANDOM MATRICES

arxiv: v3 [math-ph] 21 Jun 2012

Preface to the Second Edition...vii Preface to the First Edition... ix

Compound matrices and some classical inequalities

The Hadamard product and the free convolutions

Free Probability, Sample Covariance Matrices and Stochastic Eigen-Inference

How long does it take to compute the eigenvalues of a random symmetric matrix?

Random Toeplitz Matrices

Spectra of Large Random Stochastic Matrices & Relaxation in Complex Systems

A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices

Wigner s semicircle law

1 Math 241A-B Homework Problem List for F2015 and W2016

The norm of polynomials in large random matrices

PCA with random noise. Van Ha Vu. Department of Mathematics Yale University

Free probabilities and the large N limit, IV. Loop equations and all-order asymptotic expansions... Gaëtan Borot

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

A Brief Introduction to Functional Analysis

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2

Universality for random matrices and log-gases

Large sample covariance matrices and the T 2 statistic

arxiv: v1 [cs.na] 6 Jan 2017

SHARP TRANSITION OF THE INVERTIBILITY OF THE ADJACENCY MATRICES OF SPARSE RANDOM GRAPHS

Matrix estimation by Universal Singular Value Thresholding

From the mesoscopic to microscopic scale in random matrix theory

Spectral Universality of Random Matrices

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 54, NO. 9, SEPTEMBER

Transcription:

Inhomogeneous circular laws for random matrices with non-identically distributed entries Nick Cook with Walid Hachem (Telecom ParisTech), Jamal Najim (Paris-Est) and David Renfrew (SUNY Binghamton/IST Austria) UC Davis, 2017/04/19

The Circular Law Empirical spectral distribution (ESD) for M M n (C) with eigenvalues λ 1,..., λ n C: µ M = 1 n δ λi. iid matrix model: Atom variable: ξ C with E ξ = 0, E ξ 2 = 1. For n 1 let X n be n n with iid entries ξ (n) d = ξ. Theorem (Mehta 67, Girko 84, Edelman 97, Bai 97, Götze Tikhomirov 05, 07, Pan Zhou 07, Tao Vu 07, Tao Vu 08) Almost surely, µ 1 n X n converges weakly to 1 π 1 B(0,1)dxdy as n. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 2 / 24

Applications: stability for mean field models of dynamical systems Model a neural network with a nonlinear system of ODEs: ẋ i (t) = x i (t) + Y tanh(x j (t)) where x i is the membrane potential for the ith neuron and tanh(x i ) is its firing rate. Sompolinsky, Crisanti, Sommers 88: modeled the synaptic matrix Y with a random matrix. (Similar to approach by [May 72] in ecology.) More recent work [Rajan, Abbott 06], [Tao 09], [Aljadeff, Renfrew, Stern 14] has tried to incorporate other structural features of neural networks into the distribution of Y. Desirable to consider Y with a general, possibly sparse variance profile, i.e. Y n = 1 ( 1 A n X n = n σ ξ ). n j=1 N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 3 / 24

Simulated ESDs for Y n = ( 1 n σ ξ ). 0.4 0.3 0.2 0.1 0 0.1 0.2 n = 2000 ξ {± 1 2 ± i 1 2 } uniform. σ = σ( i n, j n ), with σ(x, y) = 1( x y 0.05) 0.3 0.4 0.4 0.3 0.2 0.1 0 0.1 0.2 0.3 0.4. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 4 / 24

( ) Simulated ESDs for Y n = n 1 σ ξ 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 n = 2000 ξ {± 1 2 ± i 1 2 } uniform. σ = σ( i n, j n ), with σ(x, y) = (x + y) 2 1( x y 0.1) 0.8 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 5 / 24

( ) Simulated ESDs for Y n = n 1 σ ξ 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 n = 2001 ξ {± 1 2 ± i 1 2 } uniform. 0 1 n/3 1 n/3 A n = (σ ) = 1 n/3 0 0. 1 n/3 0 0 0.8 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 6 / 24

The model and assumptions Inputs: Y n = 1 ( 1 A n X n = n n σ (n) ) ξ (n). Atom variable ξ C with E ξ = 0, E ξ 2 = 1, E ξ 4+ε <. X n = (ξ (n) ) sequence of iid matrices with atom variable ξ. Standard deviation profiles A n = (σ (n) ) with uniformly bounded entries, i.e. σ (n) [0, 1]. Additionally assume A n is robustly irreducible for all n. Assumptions allow for vanishing variances (σ (n) = 0 for (say) 99% of entries). N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 7 / 24

Results Y n = 1 n A n X n = ( 1 n σ (n) ξ (n) E ξ 4+ε <, σ (n) [0, 1], A n robustly irreducible. ), with Theorem (C., Hachem, Najim, Renfrew 16) (Abridged) In the above setup, for each n, A n determines a deterministic, compactly supported, rotationally invariant probability measure µ n over C such that µ Yn µ n in probability, i.e. f dµ Yn f dµ n 0 in probability f C b (C). C C Refer to measures µ n as deterministic equivalents for µ Yn. Recent work of Alt, Erdős and Krüger gives a local version under stronger hypotheses (ξ having bounded density and all moments, σ (n) σ min > 0). N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 8 / 24

Results Y n = 1 n A n X n = ( 1 E ξ 4+ε <, n σ (n) ξ (n) ), with σ (n) uniformly bounded, A n robustly irreducible. V n = ( 1 n σ2 ) doubly stochastic. Theorem (Circular law for doubly stochastic variance profile) With Y n as above, µ Yn 1 π 1 B(0,1)dxdy in probability. Analogue of a result of Anderson and Zeitouni 06 for Hermitian random matrices. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 9 / 24

How does A n determine µ n? Through the Master Equations For s > 0 consider the following system in unknowns q, q R n : ME(s) : q i = q i = (V T n q) i s 2 + (V n q) i (V T n q) i, (V n q) i s 2 + (V n q) i (V T n q) i, q i = Can show that if V n is irreducible, s (0, ρ(v n )), unique non-trivial solution (q(s), q(s)) R 2n 0. µ n is the radially symmetric probability measure on C with µ n (B(0, s)) = 1 1 n q(s)t V n q(s) s (0, ) where we set q(s) = q(s) = 0 for s ρ(v n ). q i. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 10 / 24

Plot of F n (s) = 1 1 n q(s)t V n q(s) (curve) vs empirical realization (+) 1 0.8 0.6 0.4 0.2 n = 2001 ξ {± 1 2 ± i 1 2 } uniform. 0 1 n/3 1 n/3 A n = (σ ) = 1 n/3 0 0. 1 n/3 0 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 11 / 24

Useful transforms for proving convergence of measures For sums of independent scalar r.v. s use the Fourier transform (characteristic function). For µ supported on R (such as ESDs of Hermitian matrices) we have the Stieltjes transform s µ : C + C +, s µ (w) = R dµ(x) x w, µ = lim 1 t 0 π Im s µ( + it) = lim µ Cauchy(t). t 0 For µ supported on C (such as ESDs of non-hermitian matrices) we have the log transform (or Coulomb potential) U µ (z) = log λ z dµ(λ), µ = 1 2π U µ. C N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 12 / 24

Girko s Hermitization approach log transform for probability measure µ on C: U µ (z) = log λ z µ(dλ), µ = 1 2π U µ. For the ESD of Y n = 1 n A n X n, C U µyn (z) = 1 log λ i (Y n ) z = 1 n n log det(y n z) = 1 2n log det(y n z ) = log x dµ Y z n (x) R ( ) where Yn z 0 Y n z := is a 2n 2n Hermitian matrix with Yn z 0 eigenvalues {±s i (Y n z)} n R.. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 13 / 24

Girko s Hermitization approach Y z n = 0 Y n z Y n z 0, U µyn (z) = log x dµ Y z n (x). R Two steps. For a.e. z C: 1 Find deterministic equivalents ν z n µ Y z n. 2 Prove these measures uniformly (in n) integrate log. Now we can use the Stieltjes transform sn(w) z := R dµ Y z n (x) x w = 1 2n 2 where R z n (w) := (Y z n w) 1 is the resolvent of Y z n. 1 λ i (Yn z ) w = 1 2n Tr Rz n (w) N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 14 / 24

Deterministic equivalents for resolvents R z n (w): Complex Gaussian case I 2n = Rn z (w) 1 Rn z (w) = w Yn z Y n z S w T T = ws z T + Y T S 1 = w E S ii z E T ii + (Gaussian IBP) = w E S ii z E T ii + (Resolvent derivative formula) (Cauchy Schwarz & Poincaré) = w E S ii z E T ii 1 n E Y Tji j=1 j=1 1 n σ2 E Y T ji σe 2 S ii Sjj j=1 w E S ii z E T ii 1 n (E S ii) σ(e 2 S jj ). j=1 N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 15 / 24

Deterministic equivalents for resolvents Rn z (w) = (Yn z w) 1 = ( S T) T S Similarly obtain equations for E S ii, E S ii, E T ii, E T ii, 1 i n from diagonal entries of the other three blocks. Eventually reduce to a perturbed cubic system of 2n equations: E S ii = E S ii = (Vn T s) i + w z 2 [(Vn T s) i + w][(v n s) i + w] + E n, (V n s) i + w z 2 [(Vn T s) i + w][(v n s) i + w] + E n, where s := (E S ii ) n, s := (E S ii ) n, and E n, E n are small errors. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 16 / 24

Deterministic equivalents for resolvents Rn z (w) = (Yn z w) 1 = ( S T) T S Stability analysis s, s are close to solutions p( z, w), p( z, w) of the unperturbed system (the Schwinger Dyson loop equations): (Vn T p) i + w p i = z 2 [(Vn T p) i + w][(v n p) i + w], LE( z, w) : (V n p) i + w p i = z 2 [(Vn T p) i + w][(v n p) i + w]. In particular: E sn(w) z = E 1 2n Tr Rz n (w) = E 1 2n 1 2n ( S ii + ( p i + ) S ii ) p i = 1 n p i. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 17 / 24

Deterministic equivalents for resolvents Rn z (w) = (Yn z w) 1 = ( S T) T S E s z n(w) 1 n p i. Extend to the non-gaussian case by a Lindeberg swapping argument, and remove the expectation with concentration of measure. Get s z n(w) 1 n p i a.s. in the general case (and we can quantify the error). RHS is the Stieltjes transform of a probability measure ν z n on R, so µ Y z n ν z n a.s. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 18 / 24

Integrability of log We obtained deterministic equivalents ν z n for the Hermitized ESDs µ Y z n. Remains to show for a.e. z C, µ Y z n and νn z uniformly integrate log, i.e. ε > 0 T > 0 : log x dµ Y z n (x) ε log x T with probability 1 ε, and similarly for ν z n. Singularity of log at is easy to handle. Two-step approach to singularity at 0: 1 (Wegner estimate) Use bounds on the Stieltjes transform to show µ Y z n ([ t, t]) = O(t) for t n c. 2 (Invertibility) Prove λ min (Y z n ) = s min (Y n z) n C w.h.p. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 19 / 24

Integrability of log: Wegner estimates Stieltjes transform controls the density of eigenvalues in short intervals: 1 t µ Yn z ([ t, t]) µ Yn z Cauchy(t) = Im sz n(it). From the loop equations: Im sn(it) z 1 Im p i ( z, it). n Key Proposition Let z 0, t > 0, and let p( z, it), p( z, it) be solutions to LE( z, it). If A n is robustly irreducible, then 1 n Im p i ( z, it) K for some constant K < independent of n, t. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 20 / 24

Integrability of log Invertibility of structured random matrices Want to show s min (Y n z) n C with high probability (w.h.p.). Recall s min (M) = 0 iff M is singular, and otherwise s min (M) = 1/ M 1. Say a random matrix M is well-invertible w.h.p. if { P M 1 n α} = O(n β ) for some constants α, β > 0. Question: For what choices of n n matrices A = (a ), B = (b ) is Y = 1 ( 1 ) A X + B = n a ξ + b n well-invertible w.h.p.? (For convergence of ESDs we are interested in the shifts B = zi n.) N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 21 / 24

Invertibility of 1 n A X + B = ( 1 n a ξ + b ) Condition number X X 1 for iid matrices is well-studied (von Neumann et al. 40s, Edelman 88, Sankar Spielman Teng 06, Tao Vu 05, 07, Rudelson 05, Rudelson Vershynin 07). Norm: X = O( n) w.h.p. (folklore / Bai Yin) Norm of the inverse: { P X 1 n α(β)} n β Tao Vu 05, 07 { P X 1 } n/ε ε + e cn Rudelson Vershynin 07 (ξ sub-gaussian). So X X 1 = n O(1) w.h.p. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 22 / 24

Invertibility of 1 n A X + B = ( 1 n a ξ + b ) Structured matrices: 1 n A X + B is well-invertible w.h.p. when a σ 0 > 0, B n O(1) Bordenave Chafaï 11 A is broadly connected, B = O(1) Rudelson Zeitouni 12 for ξ N R (0, 1), C. 16 general case. The above give bounds that are uniform in the shift B, i.e. { ( } 1 1 sup P n A X + B) B: B n nα = O(n β ). O(1) Can we further relax hypotheses on A if B is well-invertible (such as B = zi n for fixed z 0)? N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 23 / 24

Well-invertibility under diagonal perturbation Theorem (C. 16) Let Y = 1 n A X + Z, where 1 X = (ξ ) iid matrix with E ξ = 0, E ξ 2 = 1, E ξ 4+ε = γ < ; 2 A [0, 1] n n arbitrary; 3 Z = diag(z i ) n, z i [r, R] (0, + ) for all i [n]. There exist α, β > 0, C < depending only on ε, γ, r, R such that P ( Y 1 n α) Cn β. Note: Proof gives α = twr[o ε (1) exp((γ/r) O(1) )]. The tower exponential is due to use of Szemerédi s regularity lemma. Conjecture Above holds if Z is any matrix with s i (Z) [r, R] for all i [n]. N. Cook, W. Hachem, J. Najim, D. Renfrew (Stanford) UC Davis, 2017/04/19 24 / 24