Extreme eigenvalues of Erdős-Rényi random graphs

Extreme eigenvalues of Erdős-Rényi random graphs
Florent Benaych-Georges, joint work with Charles Bordenave and Antti Knowles
MAP5, Université Paris Descartes
May 18, 2018, IPAM, UCLA

Inhomogeneous Erdős-Rényi random graph

Random graph G: vertices {1, ..., n}; edges 1_{{i,j} ∈ G} ~ Bernoulli(p_ij), independent.

Examples:
- Homogeneous Erdős-Rényi graph: p_ij = p.
- Stochastic Block Model: partition the vertices into communities, {1, ..., n} = ∪_α N_α, and let p_ij depend only on the communities N_α ∋ i and N_β ∋ j.

In this talk: A := adjacency matrix of G, with eigenvalues λ_1(A) ≥ ... ≥ λ_n(A); n ≫ 1, p_ij = p_ij(n).
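As a concrete illustration (not part of the talk), here is a minimal NumPy sketch of sampling such a graph; the sizes n and d are arbitrary illustrative choices:

```python
import numpy as np

def sample_adjacency(P, seed=None):
    """Sample the adjacency matrix A of an inhomogeneous Erdos-Renyi
    graph: edge {i, j} present independently with probability P[i, j]."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    # draw the upper triangle (i < j) and symmetrize; no self-loops
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper + upper.T).astype(float)

# homogeneous special case p_ij = d/n (illustrative sizes)
n, d = 400, 10
P = np.full((n, n), d / n)
np.fill_diagonal(P, 0.0)
A = sample_adjacency(P, seed=0)
```

The SBM is obtained by letting P[i, j] depend only on the community labels of i and j.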

Example 1: homogeneous graph (p_ij = d/n for all i, j)

Theorem [Krivelevich, Sudakov; 2003 + Vu; 2007]. Up to log log n factors,

  λ_1(A) ≈ (log n)^{1/2} if d ≪ (log n)^{1/2},   λ_1(A) ≈ d if d ≫ (log n)^{1/2};
  λ_2(A) ≈ (log n)^{1/2} if d ≪ (log n)^{1/2},   λ_2(A) ≈ 2√d if d ≫ (log n)^4.

What about λ_2(A) if (log n)^{1/2} ≪ d ≪ (log n)^4? (A question related to the spectral gap.)

Conjecture: the transition occurs at d ≈ log n (the graph connectivity threshold).

Since A = (A − EA) + EA with EA of rank one, Weyl's interlacing inequality gives:

  λ_3(A) ≤ λ_2(A − EA) ≤ λ_2(A) ≤ λ_1(A − EA) ≤ λ_1(A).
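A quick numerical sanity check of the two regimes (a sketch with illustrative sizes; here d ≫ (log n)^{1/2}, so λ_1(A) should be close to d, and in fact d ≫ log n, so λ_2(A) should be close to 2√d):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1500
d = np.log(n) ** 2                       # d >> (log n)^(1/2), illustrative
A = np.triu(rng.random((n, n)) < d / n, k=1)
A = (A + A.T).astype(float)

lam = np.sort(np.linalg.eigvalsh(A))[::-1]
r1 = lam[0] / d                          # compare lambda_1(A) with d
r2 = lam[1] / (2 * np.sqrt(d))           # compare lambda_2(A) with 2 sqrt(d)
```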

Example 2: Stochastic Block Model (SBM)

Partition the vertices into communities, {1, ..., n} = ∪_α N_α, with p_ij depending only on the communities N_α ∋ i and N_β ∋ j.

Question: how to recover the communities from the observation of the graph?

Spectral clustering algorithm:
- Spectral decomposition of A: eigenvalues λ_1 ≥ λ_2 ≥ ..., eigenvectors V_1, V_2, ...
- λ_1, ..., λ_r: the isolated eigenvalues.
- Apply k-means or EM to the rows of the n × r matrix [V_1 ... V_r].
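A minimal end-to-end sketch of this algorithm on a two-community SBM (illustrative parameters; with two communities the sign of the second eigenvector already separates the classes, so it stands in for k-means here):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 600, 40
labels = np.repeat([0, 1], n // 2)              # planted communities
P = np.where(labels[:, None] == labels[None, :], 1.5 * d / n, 0.5 * d / n)

A = np.triu(rng.random((n, n)) < P, k=1)
A = (A + A.T).astype(float)

# spectral decomposition: keep eigenvectors of the r = 2 largest eigenvalues
vals, vecs = np.linalg.eigh(A)
V = vecs[:, np.argsort(vals)[::-1][:2]]         # n x 2 matrix [V1 V2]

# cluster the rows: here, simply by the sign of the second eigenvector
guess = (V[:, 1] > 0).astype(int)
accuracy = max(np.mean(guess == labels), np.mean(guess != labels))
```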

[Slide figure] SBM, n = 1000, 3 classes (3.5% of misclustered vertices).

When and why does spectral clustering work well?

(Recall the algorithm: spectral decomposition of A, with isolated eigenvalues λ_1, ..., λ_r and eigenvectors V_1, V_2, ...; apply k-means or EM to the rows of the n × r matrix [V_1 ... V_r].)

Usually, spectral clustering works well when ‖A − EA‖ ≪ ‖EA‖. Explanation:
a) the eigenvectors of EA are where the information about the communities lies;
b) A = EA + (A − EA);
c) by perturbation theory (the Davis-Kahan theorem), if ‖A − EA‖ ≪ ‖EA‖ and λ_i(A) is isolated, then V_i(A) ≈ V_i(EA) (or use BBP if ‖A − EA‖ ≤ c‖EA‖ for a constant c of order one, large enough).
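The mechanism in (a)-(c) can be observed numerically; a sketch (illustrative parameters of my choosing) comparing ‖A − EA‖ to ‖EA‖ and the overlap of the top eigenvectors of A and EA:

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 800, 60
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], 1.5 * d / n, 0.5 * d / n)
np.fill_diagonal(P, 0.0)                          # P plays the role of EA
A = np.triu(rng.random((n, n)) < P, k=1)
A = (A + A.T).astype(float)

noise = np.abs(np.linalg.eigvalsh(A - P)).max()   # ||A - EA||
signal = np.abs(np.linalg.eigvalsh(P)).max()      # ||EA||
v_A = np.linalg.eigh(A)[1][:, -1]                 # top eigenvector of A
v_EA = np.linalg.eigh(P)[1][:, -1]                # top eigenvector of EA
overlap = abs(v_A @ v_EA)                         # near 1 when noise << signal
```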

Main results

The conjecture in Example 1 (homogeneous graph) and the assessment of spectral clustering efficiency in Example 2 (SBM) both lead to the study of the largest eigenvalues of A − EA.

Hypothesis: G is an inhomogeneous Erdős-Rényi random graph with:
- constant mean degree d (i.e. Σ_j p_ij = d for all i),
- max_{i,j} p_ij ≤ n^{−1+κ}, for some small κ > 0.

Theorem 1 [BBK; 2017]. W.h.p.,

  ‖A − EA‖/√d ≤ 2 + O(η^{2/3}),   with η := (log n)/d.

The bound ‖A − EA‖ ≲ √d also appears in a recent preprint by Latała, van Handel and Youssef.

Consequence: if d ≳ log n, then the empirical spectral distributions (1/n) Σ_{i=1}^n δ_{λ_i} of A − EA and of A look like:
(a) centered matrix A − EA: semicircle law with no outlier;
(b) adjacency matrix A: semicircle law with outliers at the positions of the eigenvalues of (usual) order d of EA.
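A numerical sketch of Theorem 1's prediction ‖A − EA‖ ≈ 2√d in the regime d ≳ log n (illustrative sizes, homogeneous case):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
d = 2 * np.log(n)                        # regime d >~ log n
P = np.full((n, n), d / n)
np.fill_diagonal(P, 0.0)
A = np.triu(rng.random((n, n)) < P, k=1)
A = (A + A.T).astype(float)

norm = np.abs(np.linalg.eigvalsh(A - P)).max()    # ||A - EA||
ratio = norm / np.sqrt(d)                         # Theorem 1: about 2
```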

Theorem 2 [BBK; 2017]. Define the ordered degrees D_1 ≥ D_2 ≥ ... ≥ D_n. For d ≪ log n and any k ≤ n^{0.99}, we have

  λ_k(A − EA) ≈ √D_k ≈ √( log(n/k) / log((log n)/d) ).

Consequence: for the centered matrix A − EA, the spectrum is distributed according to the semicircle law, plus ≈ n^{0.99} eigenvalues in a cloud reaching up to order √(log n).
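The relation λ_k(A − EA) ≈ √D_k can be glimpsed numerically in the sparse regime (a rough sketch with illustrative sizes; at such small n the ≈ only holds within a constant factor):

```python
import numpy as np

rng = np.random.default_rng(5)
n, d = 1500, 2.0                       # d constant, so d << log n
p = d / n
A = np.triu(rng.random((n, n)) < p, k=1)
A = (A + A.T).astype(float)
H = A - p * (1 - np.eye(n))            # A - EA

D1 = A.sum(axis=1).max()               # largest degree D_1
lam1 = np.abs(np.linalg.eigvalsh(H)).max()
# lam1 should be comparable to sqrt(D1)
```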

Consequence for the adjacency matrix A:
- d ≪ log n: no outlier;
- d ≫ log n: outliers at the positions of the eigenvalues of (usual) order d of EA.

Corollary 1. For homogeneous Erdős-Rényi graphs, the transition mentioned above indeed occurs at d ≈ log n.

Corollary 2. In the SBM, when the non-zero eigenvalues of EA have order d and gaps of order d, spectral clustering works well if and only if d ≳ log n.

Interpretation of the case d ≪ log n

For k ≤ n^{0.99}, λ_k(A − EA) ≈ √(log(n/k)). Consequence: the asymptotic density of the eigenvalues of A/√(log n) at x ∈ (0, 1) is 2x (log n) n^{1−x²}.

Previously, in random matrix theory, two types of behaviour had been observed for extreme eigenvalues (outside of finite sets of outliers at deterministic positions):
(a) convergence to the edge of the limit support, usually with Tracy-Widom fluctuations (e.g. Wigner matrices);
(b) convergence (after rescaling) to a Poisson point process (e.g. heavy-tailed random matrices).

Our Theorem 2 implies that neither holds here when d ≪ log n. In fact, there is no deterministic sequence α_n such that the point process {α_n λ_k(A) : 2 ≤ k ≤ n^{0.99}} converges to a nondegenerate process.
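The step from λ_k ≈ √(log(n/k)) to this density can be made explicit; a short reconstruction of the implicit computation:

```latex
\lambda_k \approx \sqrt{\log(n/k)}.
\quad \text{Set } \lambda = x\sqrt{\log n}: \qquad
x^2 \log n = \log(n/k) \iff k(x) = n^{1-x^2}.
```

Since k(x) counts the eigenvalues above x√(log n), the density at x is −k′(x) = 2x (log n) n^{1−x²}.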

Proof of the d ≪ log n case

We want to prove that for k ≤ n^{0.99}, λ_k(A − EA) ≈ √D_k, with D_k the k-th largest degree.

Remark. The spectrum of a star graph with degree D is √D, −√D, 0, ..., 0.

Lemma. D_k ≈ log(n/k) / log((log n)/d).

Lemma. Let G' be the graph of the n^{0.99} largest stars, where all edges joining centers of stars in 1 or 2 steps have been removed. Then w.h.p. the stars present in G' are disjoint and have degrees D_k − O(1), and the matrix A − EA rewrites as

  A − EA = Adj(G') + (matrix with norm ≪ √(log n / log log n))

(Le, Levina, Vershynin). Then use perturbation inequalities to prove λ_k(A − EA) ≈ √D_k.
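The remark about star spectra is easy to verify; a quick check that the star K_{1,D} has extreme eigenvalues ±√D:

```python
import numpy as np

D = 9                          # star with D = 9 leaves
A = np.zeros((D + 1, D + 1))
A[0, 1:] = 1.0                 # vertex 0 is the center
A[1:, 0] = 1.0
eig = np.sort(np.linalg.eigvalsh(A))
# spectrum: -sqrt(D), then 0 repeated D-1 times, then +sqrt(D)
```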

Proof of the d ≳ log n case

Goal: estimate ‖H‖, where H := d^{−1/2}(A − EA). Let λ = (λ_1, ..., λ_n) be the eigenvalues of H; the goal is ‖λ‖_∞.

It is much easier to estimate ‖λ‖_p for large (even) p. Elementary fact: ‖λ‖_p is close to ‖λ‖_∞ as soon as p ≫ log n. Thus we have to estimate ‖λ‖_p for p ≈ log n. More precisely,

  E ‖λ‖_p^p = E Tr H^p = Σ_{1 ≤ i_1, ..., i_p ≤ n} E[ H_{i_1 i_2} H_{i_2 i_3} ... H_{i_p i_1} ].

The right-hand side is analyzed using the fact that the entries of H are independent and have mean zero (Füredi-Komlós approach).
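The elementary fact that ‖λ‖_p approximates ‖λ‖_∞ for p ≫ log n, and the trace identity behind it, can be checked directly (a sketch with illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 300, 20
p_edge = d / n
A = np.triu(rng.random((n, n)) < p_edge, k=1)
A = (A + A.T).astype(float)
H = (A - p_edge * (1 - np.eye(n))) / np.sqrt(d)   # H = d^(-1/2) (A - EA)

lam = np.linalg.eigvalsh(H)
p = 2 * int(np.ceil(np.log(n)))           # even p of order log n
norm_p = np.sum(lam ** p) ** (1.0 / p)    # ||lambda||_p = (Tr H^p)^(1/p)
norm_inf = np.abs(lam).max()
# sandwich: ||lambda||_inf <= ||lambda||_p <= n^(1/p) ||lambda||_inf
```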

Use graphs to encode the terms E(H_{i_1 i_2} H_{i_2 i_3} ... H_{i_p i_1}) arising from the sum over i_1, ..., i_p: vertices = {i_1, i_2, ..., i_p}, edges = {{i_k, i_{k+1}} : k = 1, ..., p}. Each non-zero term corresponds to a walk i_1, i_2, ..., i_p of length p on a graph such that each edge is visited at least twice.

This works well when d ≳ (log n)^4, but a fundamental problem arises otherwise: the proliferation of subtrees, which leads to very complicated combinatorics.

Nonbacktracking matrix

To make the combinatorics manageable, we try to kill all subtrees. Key observation: each leaf of the graph gives rise to a backtracking piece of the walk, i_{k−1} = i_{k+1}.

To H we associate its nonbacktracking matrix B = (B_{ef}), indexed by directed edges e, f ∈ {1, ..., n}²:

  B_{(i,j)(a,b)} := H_{ab} 1_{j=a} 1_{i≠b}.

Note that B is n² × n² and non-Hermitian. E Tr B^p (B*)^p can be written in terms of walks on graphs with no subtrees: nonbacktracking walks.
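A direct dense construction of B from this definition, for a small n (my own sketch; a real computation would exploit the sparse graph structure rather than build the full n² × n² matrix):

```python
import numpy as np

def nonbacktracking(H):
    """B[(i,j), (a,b)] = H[a, b] * 1{j == a} * 1{i != b},
    rows/columns indexed by ordered pairs (i, j) -> i*n + j."""
    n = H.shape[0]
    B = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            for b in range(n):
                if b != i:                # the nonbacktracking constraint
                    B[i * n + j, j * n + b] = H[j, b]
    return B

rng = np.random.default_rng(3)
n, d = 15, 6
A = np.triu(rng.random((n, n)) < d / n, k=1)
A = (A + A.T).astype(float)
H = (A - (d / n) * (1 - np.eye(n))) / np.sqrt(d)
B = nonbacktracking(H)
rho = np.abs(np.linalg.eigvals(B)).max()  # spectral radius of B
```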

Estimates of E Tr B^p (B*)^p (made possible by the nonbacktracking structure) give:

Proposition 1 [BBK]. W.h.p., ρ(B) := (spectral radius of B) ≤ 1 + C/√d.

Proposition 2 [BBK; Ihara-Bass type formula]. We have

  ‖H‖ ≤ ‖H‖_2 f(ρ(B)/‖H‖_2) + 7 ‖H‖_1,

for f(x) := 2·1_{x ≤ 1} + (x + 1/x)·1_{x > 1}, with ‖H‖_2 := max_i √(Σ_j H_ij²) and ‖H‖_1 := max_{i,j} |H_ij|.

Lemma. If d ≳ log n, then ‖H‖_2 ≈ 1 (Bennett's inequality).

Consequence: since ‖H‖_1 ≪ 1 and ‖H‖_2 ≈ 1, Theorem 1 follows.