How Robust are Thresholds for Community Detection?


How Robust are Thresholds for Community Detection?
Ankur Moitra (MIT), joint work with Amelia Perry (MIT) and Alex Wein (MIT)

Let me tell you a story about the success of belief propagation and statistical physics

THE STOCHASTIC BLOCK MODEL
Introduced by Holland, Laskey and Leinhardt (1983): $k$ communities with a matrix of connection probabilities (shown here for $k = 3$),

$$Q = \begin{pmatrix} Q_{11} & Q_{12} & Q_{13} \\ Q_{12} & Q_{22} & Q_{32} \\ Q_{13} & Q_{32} & Q_{33} \end{pmatrix}$$

[Figure: two nodes inside community 1 are joined with probability $Q_{11}$; a node in community 1 and a node in community 3 are joined with probability $Q_{13}$.]
Edges are independent.
Ubiquitous model studied in statistics, computer science, information theory, statistical physics
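The sampling process described on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `sample_sbm`, the `labels` list, and the nested-list encoding of $Q$ are assumptions made for the example, and the quadratic loop is only suitable for small graphs.

```python
import random

def sample_sbm(labels, Q, seed=None):
    """Sample an undirected graph from a stochastic block model.

    labels[i] is the community index of node i (hypothetical encoding).
    Q[r][s] is the probability of an edge between a node in community r
    and a node in community s; Q is assumed symmetric.
    Returns a set of edges (i, j) with i < j.
    """
    rng = random.Random(seed)
    n = len(labels)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Every pair is decided independently of all other pairs.
            if rng.random() < Q[labels[i]][labels[j]]:
                edges.add((i, j))
    return edges

# Three communities of two nodes each, dense inside and sparse across.
labels = [0, 0, 1, 1, 2, 2]
Q = [[0.9, 0.1, 0.1],
     [0.1, 0.9, 0.1],
     [0.1, 0.1, 0.9]]
graph = sample_sbm(labels, Q, seed=0)
```

Passing a fixed `seed` makes the sample reproducible, which is convenient when comparing algorithms on the same graph.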

Testbed for a diverse range of algorithms:
(1) Combinatorial methods, e.g. degree counting [Bui, Chaudhuri, Leighton, Sipser 87]
(2) Spectral methods, e.g. [McSherry 01]
(3) Markov chain Monte Carlo (MCMC), e.g. [Jerrum, Sorkin 98]
(4) Semidefinite programs, e.g. [Boppana 87]

BELIEF PROPAGATION Introduced by Judea Pearl (1982): For fundamental contributions to probabilistic and causal reasoning

Adapted to community detection:
Message v→u: the probability I think I am community #1, community #2, ...
u updates its beliefs.
Message u→v: the new probability I think I am community #1, community #2, ...
Do the same for other edges.
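One round of this message passing can be sketched as follows for two communities. This is a simplified illustration under stated assumptions, not the exact update rule from the talk: it keeps only the product over neighbouring edges (no correction term for non-edges), weights same-community edges by `a` and cross-community edges by `b` with both positive, and stores each message as a single number, the probability of community 0. The names `bp_round`, `adj`, and `msgs` are hypothetical.

```python
def bp_round(adj, msgs, a, b):
    """One synchronous round of simplified belief propagation.

    adj maps each node to a list of its neighbours.
    msgs[(v, u)] is the probability, in v's message to u, that v is in
    community 0. Requires a > 0 and b > 0.
    """
    new_msgs = {}
    for v in adj:
        for u in adj[v]:
            p0, p1 = 1.0, 1.0
            for w in adj[v]:
                if w == u:
                    continue  # leave out what u itself told v
                m = msgs[(w, v)]
                # Likelihood of w's message if v is in community 0 or 1.
                p0 *= a * m + b * (1.0 - m)
                p1 *= b * m + a * (1.0 - m)
            new_msgs[(v, u)] = p0 / (p0 + p1)
    return new_msgs

# Path graph 0 - 1 - 2, where node 0 leans towards community 0.
adj = {0: [1], 1: [0, 2], 2: [1]}
msgs = {(0, 1): 0.9, (1, 0): 0.5, (1, 2): 0.5, (2, 1): 0.5}
msgs = bp_round(adj, msgs, a=5.0, b=1.0)
```

If every message starts at 0.5, every update returns 0.5 again: this is the trivial fixed point of the iteration.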

SHARP THRESHOLDS?
Renewed interest following Decelle, Krzakala, Moore and Zdeborová (2011), in the sparse regime: two communities, with edge probability $a/n$ within each community and $b/n$ across communities.
Remark: If $a, b = O(1)$ there are many isolated nodes
Conjecture: It is possible to find a partition that is correlated with the true communities (agreement better than ½) iff $(a-b)^2 > 2(a+b)$
Motivated by when the trivial solution for belief propagation is unstable
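To make the condition concrete, the inequality can be checked numerically for a few parameter choices. The helper name `above_threshold` is an assumption for this example; the values of `a` and `b` are arbitrary illustrations.

```python
def above_threshold(a, b):
    """Return True when (a - b)^2 > 2 (a + b)."""
    return (a - b) ** 2 > 2 * (a + b)

print(above_threshold(5, 1))  # (5-1)^2 = 16 versus 2*(5+1) = 12
print(above_threshold(3, 1))  # (3-1)^2 = 4 versus 2*(3+1) = 8
print(above_threshold(6, 2))  # (6-2)^2 = 16 versus 2*(6+2) = 16
```

The inequality is strict, so parameters sitting exactly on the boundary, such as the last pair, do not satisfy it.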

CONJECTURE IS PROVED!
Mossel, Neeman and Sly (2013) and Massoulié (2013):
Theorem: It is possible to find a partition that is correlated with the true communities iff $(a-b)^2 > 2(a+b)$
Later attempts to get SDPs to work, but only got to $(a-b)^2 > C(a+b)$, for some $C > 2$
Are nonconvex methods better than convex programs?
How do predictions of statistical physics and SDPs compare?

SEMI-RANDOM MODELS
Introduced by Blum and Spencer (1995), Feige and Kilian (2001):
(1) Sample a graph from the SBM
(2) An adversary can add edges within a community and delete edges crossing between communities
Algorithms can no longer overtune to the distribution
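Step (2) restricts the adversary to changes that look helpful. A minimal sketch of that restriction is below, assuming edges are stored as pairs `(i, j)` with `i < j` and `labels[i]` gives the true community of node `i`; the function name `apply_helpful_changes` and its parameters are hypothetical.

```python
def apply_helpful_changes(edges, labels, to_add=(), to_delete=()):
    """Apply adversarial changes allowed in the semi-random model.

    Only two kinds of change are permitted: adding an edge inside a
    community, and deleting an edge that crosses between communities.
    Anything else raises ValueError.
    """
    result = set(edges)
    for i, j in to_add:
        if labels[i] != labels[j]:
            raise ValueError(f"cannot add crossing edge ({i}, {j})")
        result.add((min(i, j), max(i, j)))
    for i, j in to_delete:
        if labels[i] == labels[j]:
            raise ValueError(f"cannot delete internal edge ({i}, {j})")
        result.discard((min(i, j), max(i, j)))
    return result

labels = [0, 0, 1, 1]
edges = {(0, 2), (2, 3)}
# Add an edge inside community 0 and delete the crossing edge (0, 2).
changed = apply_helpful_changes(edges, labels, to_add=[(0, 1)], to_delete=[(0, 2)])
```

The input set is copied rather than modified, so the original sample stays available for comparison.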

OUR RESULTS
Helpful changes can hurt:
Theorem: Community detection in the semirandom model is impossible for $(a-b)^2 \le C_{a,b}(a+b)$, for some $C_{a,b} > 2$
Any algorithm meeting the threshold $(a-b)^2 > 2(a+b)$ is inherently brittle and breaks with helpful changes
But SDPs continue to work in the semirandom model
This is the first separation between what is possible in random vs. semirandom models

RECONSTRUCTION ON TREES
Older thresholds, Kesten and Stigum (1966), Evans et al. (2000):
Best way to reconstruct the root from the leaves is majority vote
But why is recursive majority used in practice?
Our answer: Because recursive majority works in semirandom models but majority does not (thresholds change again)
Is trying to reach the information-theoretic threshold the right goal to begin with?
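The two estimators compared on this slide can be sketched as follows. The setup is an assumption made for illustration: labels are +1 or -1, the tree is a complete `d`-ary tree with `d` odd (so votes never tie), each child copies its parent's label and flips it with probability `eps`, and the tree is stored as nested lists whose innermost entries are the leaf labels. All function names are hypothetical.

```python
import random

def broadcast(label, depth, d, eps, rng):
    """Propagate a +1/-1 label down a d-ary tree; return nested leaf lists."""
    if depth == 0:
        return label
    subtrees = []
    for _ in range(d):
        child = -label if rng.random() < eps else label
        subtrees.append(broadcast(child, depth - 1, d, eps, rng))
    return subtrees

def leaves(tree):
    """Flatten the nested lists into a single list of leaf labels."""
    if isinstance(tree, int):
        return [tree]
    out = []
    for sub in tree:
        out.extend(leaves(sub))
    return out

def sign(x):
    return 1 if x > 0 else -1

def majority(tree):
    """Estimate the root by a single vote over all leaves."""
    return sign(sum(leaves(tree)))

def recursive_majority(tree):
    """Estimate the root by voting level by level, from the leaves up."""
    if isinstance(tree, int):
        return tree
    return sign(sum(recursive_majority(sub) for sub in tree))

tree = broadcast(1, depth=3, d=3, eps=0.1, rng=random.Random(0))
print(majority(tree), recursive_majority(tree))
```

The two estimators can disagree on the same leaves. For the leaf pattern `[[1, 1, -1], [1, 1, -1], [-1, -1, -1]]`, the flat vote is -1 (four +1 against five -1) while the recursive vote is +1 (two of the three subtrees vote +1).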

BETWEEN WORST-CASE AND AVERAGE-CASE
Spielman and Teng (2001): Explain why algorithms work well in practice, despite bad worst-case behavior
Usually called Beyond Worst-Case Analysis
Semirandom models as Above Average-Case Analysis?

THANKS! ANY QUESTIONS?