How Robust are Thresholds for Community Detection? Ankur Moitra (MIT) joint work with Amelia Perry (MIT) and Alex Wein (MIT)
Let me tell you a story about the success of belief propagation and statistical physics
THE STOCHASTIC BLOCK MODEL Introduced by Holland, Laskey and Leinhardt (1983): probability Q 13 k communities connection probabilities probability Q 11 Q = Q 11 Q 12 Q 13 Q 12 Q 22 Q 32 Q 13 Q 32 Q 33 edges independent
THE STOCHASTIC BLOCK MODEL Introduced by Holland, Laskey and Leinhardt (1983): probability Q 13 k communities connection probabilities probability Q 11 Q = Q 11 Q 12 Q 13 Q 12 Q 22 Q 32 Q 13 Q 32 Q 33 edges independent Ubiquitous model studied in statistics, computer science, information theory, statistical physics
Testbed for diverse range of algorithms (1) Combinatorial Methods e.g. degree counting [Bui, Chaudhuri, Leighton, Sipser 87] (2) Spectral Methods e.g. [McSherry 01] (3) Markov chain Monte Carlo (MCMC) e.g. [Jerrum, Sorkin 98] (4) Semidefinite Programs e.g. [Boppana 87]
BELIEF PROPAGATION Introduced by Judea Pearl (1982): For fundamental contributions to probabilistic and causal reasoning
Adapted to community detection: v u Message vèu Probability I think I am community #1, community #2, Do same for other edges
Adapted to community detection: v u Message vèu Probability I think I am community #1, community #2, update beliefs Do same for other edges
Adapted to community detection: v v u update beliefs u Message vèu Probability I think I am community #1, community #2, Message uèv New probability I think I am community #1, community #2, Do same for other edges Do same for other edges
SHARP THRESHOLDS? Renewed interest following Decelle, Krzakala, Moore and Zdeborová (2011), in sparseregime: b a a n n n
SHARP THRESHOLDS? Renewed interest following Decelle, Krzakala, Moore and Zdeborová (2011), in sparseregime: b a a n n n Remark: If a, b = O(1) there are many isolated nodes
SHARP THRESHOLDS? Renewed interest following Decelle, Krzakala, Moore and Zdeborová (2011), in sparseregime: b a a n n n Remark: If a, b = O(1) there are many isolated nodes agreement better than ½ Conjecture: It is possible to find a partition that is correlated with true communities iff (a-b) 2 > 2(a+b) Motivated by when trivial soln for belief propagation is unstable
CONJECTURE IS PROVED! Mossel, Neeman and Sly (2013) and Massoulie (2013): Theorem: It is possible to find a partition that is correlated with true communities iff (a-b) 2 > 2(a+b)
CONJECTURE IS PROVED! Mossel, Neeman and Sly (2013) and Massoulie (2013): Theorem: It is possible to find a partition that is correlated with true communities iff (a-b) 2 > 2(a+b) Later attempts to get SDPs to work, but only got to (a-b) 2 > C(a+b), for some C > 2
CONJECTURE IS PROVED! Mossel, Neeman and Sly (2013) and Massoulie (2013): Theorem: It is possible to find a partition that is correlated with true communities iff (a-b) 2 > 2(a+b) Later attempts to get SDPs to work, but only got to (a-b) 2 > C(a+b), for some C > 2 Are nonconvexmethods better than convex programs?
CONJECTURE IS PROVED! Mossel, Neeman and Sly (2013) and Massoulie (2013): Theorem: It is possible to find a partition that is correlated with true communities iff (a-b) 2 > 2(a+b) Later attempts to get SDPs to work, but only got to (a-b) 2 > C(a+b), for some C > 2 Are nonconvexmethods better than convex programs? How do predictions of statistical physics and SDPs compare?
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001):
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001): (1) Sample graph from SBM
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001): (1) Sample graph from SBM (2) Adversary can add edges within community and delete edges crossing
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001): (1) Sample graph from SBM (2) Adversary can add edges within community and delete edges crossing
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001): (1) Sample graph from SBM (2) Adversary can add edges within community and delete edges crossing
SEMI-RANDOM MODELS Introduced by Blum and Spencer (1995), Feige and Kilian (2001): (1) Sample graph from SBM (2) Adversary can add edges within community and delete edges crossing Algorithms can no longer over tune to distribution
OUR RESULTS Helpful changes can hurt: Theorem: Community detection in semirandom model is impossible for (a-b) 2 C a,b (a+b) for some C a,b > 2
OUR RESULTS Helpful changes can hurt: Theorem: Community detection in semirandom model is impossible for (a-b) 2 C a,b (a+b) for some C a,b > 2 Any algorithm meeting threshold (a-b) 2 > 2(a+b) is inherently brittle and breaks with helpful changes
OUR RESULTS Helpful changes can hurt: Theorem: Community detection in semirandom model is impossible for (a-b) 2 C a,b (a+b) for some C a,b > 2 Any algorithm meeting threshold (a-b) 2 > 2(a+b) is inherently brittle and breaks with helpful changes But SDPs continue to work in semirandom model
OUR RESULTS Helpful changes can hurt: Theorem: Community detection in semirandom model is possible iff (a-b) 2 C a,b (a+b) for some C a,b > 2 Any algorithm meeting threshold (a-b) 2 > 2(a+b) is inherently brittle and breaks with helpful changes But SDPs continue to work in semirandom model This is first separation between what is possible in random vs. semirandom models
RECONSTRUCTION ON TREES Older thresholds, Kesten and Stigum (1966), Evans et al. (2000): Best way to reconstruct root from leaves is majority vote
RECONSTRUCTION ON TREES Older thresholds, Kesten and Stigum (1966), Evans et al. (2000): Best way to reconstruct root from leaves is majority vote But why is recursive majority used in practice?
RECONSTRUCTION ON TREES Older thresholds, Kesten and Stigum (1966), Evans et al. (2000): Best way to reconstruct root from leaves is majority vote But why is recursive majority used in practice? Our answer: Because recursive majority works in semirandom models but majority does not (thresholds change again)
RECONSTRUCTION ON TREES Older thresholds, Kesten and Stigum (1966), Evans et al. (2000): Best way to reconstruct root from leaves is majority vote But why is recursive majority used in practice? Our answer: Because recursive majority works in semirandom models but majority does not (thresholds change again) Is trying to reach the information theoretic threshold the right goal to begin with?
BETWEEN WORST-CASE AND AVERAGE-CASE Spielman and Teng (2001): Explain why algorithms work well in practice, despite bad worst-case behavior Usually called Beyond Worst-Case Analysis
BETWEEN WORST-CASE AND AVERAGE-CASE Spielman and Teng (2001): Explain why algorithms work well in practice, despite bad worst-case behavior Usually called Beyond Worst-Case Analysis Semirandom models as Above Average-Case Analysis?
BETWEEN WORST-CASE AND AVERAGE-CASE Spielman and Teng (2001): Explain why algorithms work well in practice, despite bad worst-case behavior Usually called Beyond Worst-Case Analysis Semirandom models as Above Average-Case Analysis? THANKS! ANY QUESTIONS?