Statistical and Computational Phase Transitions in Planted Models
Statistical and Computational Phase Transitions in Planted Models. Jiaming Xu, joint work with Yudong Chen (UC Berkeley). Acknowledgement: Prof. Bruce Hajek. November 4, 2013.
Cluster/community structure in networks
Example: network of political weblogs [Adamic-Glance '05]. Social networks have social communities; metabolic networks have functional communities; recommendation systems have user and item communities...
Q: How do we recover the hidden cluster structure? This is the community detection problem. Applications: link prediction in social networks, rating prediction in recommendation systems...
Information theory of community detection
Simple model: Erdős-Rényi-type model with planted clusters.
Information-theoretic view: converse and achievability for cluster recovery.
Computational view: performance limits of polynomial-time algorithms for cluster recovery.
Stochastic blockmodel (planted partition model)
A random graph model that generates graphs with cluster structure: n nodes split into r clusters of size K, with edge probability p within clusters and q across clusters (figure example: n = 5000, r = 10, K = 500, p = 0.999, q = ...).
Goal: exactly recover the hidden clusters given the graph.
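As a concrete illustration of the model, here is a minimal sampler sketch (not from the talk; parameter names n, r, K, p, q follow the slides, and it assumes n = rK so every node belongs to a cluster):

```python
import numpy as np

def sample_sbm(r, K, p, q, seed=0):
    """Planted partition model: n = r*K nodes, r clusters of size K,
    edge probability p within a cluster and q across clusters."""
    rng = np.random.default_rng(seed)
    n = r * K
    labels = np.repeat(np.arange(r), K)                # node -> cluster id
    same = labels[:, None] == labels[None, :]          # True iff same cluster
    edges = rng.random((n, n)) < np.where(same, p, q)  # Bernoulli draws
    A = np.triu(edges, k=1)                            # keep i < j only
    A = (A | A.T).astype(int)                          # symmetric, zero diagonal
    return A, labels

A, labels = sample_sbm(r=5, K=20, p=0.8, q=0.1)
same = labels[:, None] == labels[None, :]
within_density = A[same].mean()    # close to p (slightly less: zero diagonal)
across_density = A[~same].mean()   # close to q
```

With any reasonable separation p > q, the within-cluster edge density visibly exceeds the across-cluster density, which is the structure the recovery algorithms below exploit.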
Cluster recovery as matrix recovery
Cluster matrix: Y_ij = 1 if i and j are in the same cluster; Y_ij = 0 otherwise. The true cluster matrix Y and the observed adjacency matrix A (entries Ber(p) on the clusters, Ber(q) elsewhere) are shown side by side.
Cluster recovery is then a specific matrix recovery problem: observe A generated from Y, recover Y.
Cluster recovery under the stochastic blockmodel
Vast literature on the stochastic blockmodel [Holland et al. '83] and the planted partition model [Condon-Karp '01]: [Bickel-Chen '09] [Rohe et al. '11] [Mossel et al. '12]... [Karrer-Newman '11] [Decelle et al. '11] [Nadakuditi-Newman '12] [Krzakala et al. '13]... [McSherry '01] [Coja-Oghlan '10] [Tomozei-Massoulié '11] [Chaudhuri et al. '12] [Chen-Sanghavi-Xu '12] [Heimlicher et al. '12] [Anandkumar et al. '13] [Lelarge et al. '13]...
Two fundamental questions remain unclear:
Information limit: in which regime of n, K, p, q is exact cluster recovery possible (impossible)?
Computational limit: in which regime of n, K, p, q is exact cluster recovery easy (hard)?
Cluster recovery under the stochastic blockmodel
Our (non-asymptotic) results apply to a general setting allowing any n, K, p, q. The regimes are plotted in an (α, β) phase diagram with cluster size K = Θ(n^β) on the vertical axis (large clusters at the top, small at the bottom) and separation p = 2q = Θ(n^−α) on the horizontal axis (dense graphs / large separation at the left, sparse graphs / small separation at the right); recovery is easy toward the upper left and hard toward the lower right.
Converse for cluster recovery
In the (α, β) phase diagram (K = Θ(n^β), p = 2q = Θ(n^−α)), exact recovery is information-theoretically impossible in a low-β / high-α region; the status of a region near the boundary is left open (marked "?").
Proof: Y → A → Ŷ forms a Markov chain. Apply Fano's inequality to lower bound P(Ŷ ≠ Y) by upper bounding I(Y; A).
Intuition: the observation A does not carry enough information to distinguish between the different possible Y.
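The Fano step can be sketched as follows (a standard form of the argument, with Y uniform over the set 𝒴 of cluster matrices; the exact bounds are not spelled out on the slide):

```latex
% Y -> A -> \hat{Y} is a Markov chain; take Y uniform over the set
% \mathcal{Y} of cluster matrices. Fano's inequality:
\[
\mathbb{P}\bigl(\hat{Y} \neq Y\bigr)
  \;\ge\; 1 \;-\; \frac{I(Y;A) + 1}{\log |\mathcal{Y}|}.
\]
% Bound I(Y;A) via the divergence to a reference graph Q with all
% edges Ber(q); only the r*binom(K,2) within-cluster pairs differ:
\[
I(Y;A) \;\le\; \max_{y \in \mathcal{Y}}
  D\bigl(P_{A \mid Y = y} \,\big\|\, Q\bigr)
  \;=\; r \binom{K}{2}\, D\bigl(\mathrm{Ber}(p) \,\big\|\, \mathrm{Ber}(q)\bigr),
\]
% so exact recovery is impossible whenever this is o(log |\mathcal{Y}|).
```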
Achievability by maximum likelihood estimation
Maximum likelihood estimator: Ŷ = arg max_Y P(A | Y).
If p > q, maximum likelihood estimation is equivalent to finding the r most densely connected subgraphs of size K in the graph:
max_Y Σ_{i,j} A_ij Y_ij subject to Y being a cluster matrix.
Q: when does the maximum likelihood estimator equal Y?
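For intuition, the combinatorial program above can be solved by brute force on a toy instance (a sketch, not the talk's implementation; it assumes r = 2 equal clusters and enumerates all balanced partitions):

```python
import itertools
import numpy as np

def mle_two_clusters(A, K):
    """Brute-force MLE for r = 2 clusters of size K (assumes p > q):
    return the balanced partition maximizing the within-cluster edge count."""
    n = A.shape[0]
    best, best_score = None, -1
    for S in itertools.combinations(range(n), K):
        parts = (frozenset(S), frozenset(range(n)) - frozenset(S))
        score = sum(A[i, j] for part in parts
                    for i, j in itertools.combinations(sorted(part), 2))
        if score > best_score:
            best, best_score = parts, score
    return best

# Toy instance: planted clusters {0,1,2} and {3,4,5} with p = 1, q = 0
A = np.zeros((6, 6), dtype=int)
for block in ([0, 1, 2], [3, 4, 5]):
    for i, j in itertools.combinations(block, 2):
        A[i, j] = A[j, i] = 1
clusters = mle_two_clusters(A, 3)  # recovers the planted partition
```

The enumeration over all size-K subsets is exactly why the MLE is exponential-time in general, which motivates the relaxations that follow.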
Achievability by maximum likelihood estimation
In the (α, β) phase diagram, the MLE succeeds all the way down to the information limit.
Proof: concentration inequality + union bound (needs a clever counting argument and a peeling technique).
Q: the MLE takes exponential time to solve. Can we achieve the information limit via polynomial-time algorithms?
Polynomial-time recovery: convex relaxation of the MLE
The cluster matrix Y has low rank: rank(Y) = r (one rank-one block of 1s per cluster).
The nuclear norm ‖Y‖_* (sum of singular values) is a convex surrogate for the rank function.
A convex relaxation of the MLE [Chen-Sanghavi-Xu '12]:
max_Y Σ_ij A_ij Y_ij subject to ‖Y‖_* ≤ n, Σ_ij Y_ij = rK², Y_ij ∈ [0, 1].
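The two facts used here, rank(Y) = r and a nuclear norm of rK for the true cluster matrix, are easy to check numerically; a small sketch with illustrative values r = 4, K = 50:

```python
import numpy as np

r, K = 4, 50
labels = np.repeat(np.arange(r), K)
Y = (labels[:, None] == labels[None, :]).astype(float)  # true cluster matrix

rank = np.linalg.matrix_rank(Y)                     # one rank-1 block per cluster
singular_values = np.linalg.svd(Y, compute_uv=False)
nuclear_norm = singular_values.sum()                # = r * K for the true Y
```

Each cluster contributes one singular value equal to K, so ‖Y‖_* = rK ≤ n, consistent with the nuclear norm constraint in the relaxation above.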
Polynomial-time recovery: convex relaxation of the MLE
In the (α, β) phase diagram, the convex relaxation succeeds above a boundary passing through β = 1/2, which falls short of the information limit; whether anything works in the region between this boundary and the information limit is left open (marked "?").
Proof: the nuclear norm constraint suppresses the random noise and boosts the SNR.
Surprise: the convex relaxation might not be order-optimal when there is a growing number of clusters.
Polynomial-time recovery: counting common neighbors
Similarity between two nodes: the number of common neighbors [Dyer-Frieze '98].
Algorithm: each node finds the K most similar nodes.
In the (α, β) phase diagram, counting common neighbors succeeds above a boundary through β = 1/2 and α = 1/2.
Proof: the similarity concentrates around its mean.
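A minimal sketch of the common-neighbor heuristic (not the talk's code; the parameters are illustrative, chosen well inside the easy regime):

```python
import numpy as np

rng = np.random.default_rng(0)
r, K, p, q = 5, 20, 0.9, 0.05
n = r * K
labels = np.repeat(np.arange(r), K)
same = labels[:, None] == labels[None, :]
edges = rng.random((n, n)) < np.where(same, p, q)
A = np.triu(edges, k=1)
A = (A | A.T).astype(int)

S = A @ A                       # S[i, j] = number of common neighbors of i and j
np.fill_diagonal(S, -1)         # a node is not its own candidate neighbor
top = np.argsort(S[0])[::-1][:K - 1]   # the K-1 nodes most similar to node 0
overlap = int(np.sum(labels[top] == labels[0]))
```

Two nodes in the same cluster share roughly Kp² common neighbors versus roughly 2Kpq across clusters, so with this separation node 0's most similar nodes are almost entirely its true cluster.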
Polynomial-time recovery: spectral algorithms
Spectral algorithms are based on the principal singular vectors (PCA).
Example: n = 64, r = 6, K = n^0.75, p = n^−0.25, q = p/... The adjacency matrix and its singular value histogram are shown side by side: the r principal singular vectors contain the cluster information, while the bulk of the spectrum is caused by the random noise.
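The separation between the signal singular values and the noise bulk can be seen numerically (a sketch on an easy synthetic instance, not the slide's exact parameters):

```python
import numpy as np

rng = np.random.default_rng(1)
r, K, p, q = 5, 20, 0.9, 0.05
n = r * K
labels = np.repeat(np.arange(r), K)
same = labels[:, None] == labels[None, :]
edges = rng.random((n, n)) < np.where(same, p, q)
A = np.triu(edges, k=1)
A = (A | A.T).astype(float)

s = np.linalg.svd(A, compute_uv=False)  # singular values, descending order
gap = s[r - 1] / s[r]                   # top-r signal vs. top of the noise bulk
```

Here the r-th singular value sits near K(p − q) while the bulk is on the order of √(np), so the ratio `gap` is comfortably above 1; as the separation shrinks the two merge, which is exactly the spectral barrier discussed next.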
Polynomial-time recovery: spectral algorithms
In the (α, β) phase diagram, spectral algorithms succeed above the spectral-barrier line through β = 1/2.
Signal strength (the r-th largest singular value) is K(p − q); the noise magnitude is O(√(np)).
The signal strength needs to be larger than the noise magnitude: K(p − q) ≳ √(np) (the spectral barrier).
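In the parametrization of the slides, the barrier condition translates into the boundary line of the diagram (a routine calculation, not spelled out on the slide):

```latex
% Spectral barrier: signal must exceed noise,
\[
K(p - q) \;\gtrsim\; \sqrt{np}.
\]
% With K = \Theta(n^{\beta}) and p = 2q = \Theta(n^{-\alpha}),
% the left side is \Theta(n^{\beta - \alpha}) and the right side
% is \Theta(n^{(1-\alpha)/2}), so the condition becomes
\[
\beta - \alpha \;\ge\; \frac{1-\alpha}{2}
\quad\Longleftrightarrow\quad
\beta \;\ge\; \frac{1}{2} + \frac{\alpha}{2},
\]
% the line through \beta = 1/2 with slope 1/2 in the (\alpha, \beta) diagram.
```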
Conjecture on computational limit
In the (α, β) phase diagram (K = Θ(n^β), p = 2q = Θ(n^−α)), the spectral barrier is conjectured to be the computational limit.
Conjecture: no polynomial-time algorithm succeeds beyond the spectral barrier. A similar conjecture appears in the planted clique model.
Review: conjecture in the planted clique model
A = (clique planted on K nodes) + Ber(0.5) noise on the remaining pairs.
Recovery is feasible if and only if K > 2 log₂ n.
A simple algorithm that picks the K nodes with the highest degrees works if K = Ω(√(n log n)).
A spectral algorithm works if K = Ω(√n) [Alon et al. '98].
Belief: no polynomial-time algorithm works if K = o(√n).
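The simple degree-based algorithm can be sketched as follows (illustrative parameters with K well above √(n log n); not the talk's code):

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 200, 80                       # K well above sqrt(n log n) ~ 33
clique = np.arange(K)                # plant the clique on the first K nodes
edges = rng.random((n, n)) < 0.5     # Ber(0.5) background
A = np.triu(edges, k=1)
A = (A | A.T).astype(int)
A[np.ix_(clique, clique)] = 1        # force every clique edge to be present
np.fill_diagonal(A, 0)

deg = A.sum(axis=1)
guess = np.argsort(deg)[::-1][:K]    # the K highest-degree nodes
overlap = np.intersect1d(guess, clique).size
```

A clique node has expected degree about (K − 1) + (n − K)/2 versus about (n − 1)/2 for the rest; when K ≫ √(n log n) this gap dominates the O(√n) degree fluctuations, so the top-K degrees recover the clique.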
Review: conjecture in the planted clique model
The same picture, with the clique replaced by a Ber(p) block on the K planted nodes and the Ber(0.5) background replaced by Ber(q), for general p, q ∈ [0, 1], defines the planted dense subgraph model.
Planted dense subgraph model
In the (α, β) phase diagram (K = Θ(n^β), p = 2q = Θ(n^−α)), four regimes appear from top to bottom: a "simple" regime for large clusters, a regime recoverable in polynomial time down to the spectral barrier, a regime between the spectral barrier and the information limit where only the MLE is known to succeed, and an impossible regime. Conjecture: no polynomial-time algorithm succeeds beyond the spectral barrier, i.e., the spectral barrier is the computational limit.
Concluding remarks
Simple model: stochastic blockmodel (planted partition model).
If K = Θ(n), the cluster structure can be recovered up to the information limit via polynomial-time algorithms.
If K = o(n), the cluster structure can be recovered up to the information limit via exponential-time algorithms, but possibly not via polynomial-time algorithms, due to the spectral barrier.
The conjectured gap between the information and computational limits also appears in the planted dense subgraph model.
Future work: prove the conjecture assuming that no polynomial-time algorithm detects a hidden clique of size o(√n) in the planted clique model.
Gap between information and computational limits
Search version: planted clique, planted dense subgraph, planted submatrix, sparse PCA, planted partition.
Hypothesis testing version: planted clique, planted dense subgraph, planted submatrix [Ma-Wu '13], sparse PCA [Berthet-Rigollet '13].
More informationClustering Algorithms for Random and Pseudo-random Structures
Abstract Clustering Algorithms for Random and Pseudo-random Structures Pradipta Mitra 2008 Partitioning of a set objects into a number of clusters according to a suitable distance metric is one of the
More informationU.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017
U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017 Scribed by Luowen Qian Lecture 8 In which we use spectral techniques to find certificates of unsatisfiability
More informationSpectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model
Journal of Machine Learning Research vol (2012) 1 23 Submitted 24 May 2012; ublished 2012 Spectral Clustering of Graphs with General Degrees in the Extended lanted artition Model Kamalika Chaudhuri Fan
More informationInformation-theoretic Limits for Community Detection in Network Models
Information-theoretic Limits for Community Detection in Network Models Chuyang Ke Department of Computer Science Purdue University West Lafayette, IN 47907 cke@purdue.edu Jean Honorio Department of Computer
More informationNotes 6 : First and second moment methods
Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative
More informationParameter estimators of sparse random intersection graphs with thinned communities
Parameter estimators of sparse random intersection graphs with thinned communities Lasse Leskelä Aalto University Johan van Leeuwaarden Eindhoven University of Technology Joona Karjalainen Aalto University
More information1-Bit Matrix Completion
1-Bit Matrix Completion Mark A. Davenport School of Electrical and Computer Engineering Georgia Institute of Technology Yaniv Plan Mary Wootters Ewout van den Berg Matrix Completion d When is it possible
More informationU.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018
U.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018 Lecture 3 In which we show how to find a planted clique in a random graph. 1 Finding a Planted Clique We will analyze
More informationCS264: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique
CS64: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique Tim Roughgarden February 7 and 9, 017 1 Preamble Lectures #6 8 studied deterministic conditions
More informationPart 1: Les rendez-vous symétriques
Part 1: Les rendez-vous symétriques Cris Moore, Santa Fe Institute Joint work with Alex Russell, University of Connecticut A walk in the park L AND NL-COMPLETENESS 313 there are n locations in a park you
More information1-Bit Matrix Completion
1-Bit Matrix Completion Mark A. Davenport School of Electrical and Computer Engineering Georgia Institute of Technology Yaniv Plan Mary Wootters Ewout van den Berg Matrix Completion d When is it possible
More informationThe condensation phase transition in random graph coloring
The condensation phase transition in random graph coloring Victor Bapst Goethe University, Frankfurt Joint work with Amin Coja-Oghlan, Samuel Hetterich, Felicia Rassmann and Dan Vilenchik arxiv:1404.5513
More informationDissertation Defense
Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation
More informationELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018
ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space
More informationLimitations in Approximating RIP
Alok Puranik Mentor: Adrian Vladu Fifth Annual PRIMES Conference, 2015 Outline 1 Background The Problem Motivation Construction Certification 2 Planted model Planting eigenvalues Analysis Distinguishing
More informationAnalysis of Robust PCA via Local Incoherence
Analysis of Robust PCA via Local Incoherence Huishuai Zhang Department of EECS Syracuse University Syracuse, NY 3244 hzhan23@syr.edu Yi Zhou Department of EECS Syracuse University Syracuse, NY 3244 yzhou35@syr.edu
More informationAlgorithmic approaches to fitting ERG models
Ruth Hummel, Penn State University Mark Handcock, University of Washington David Hunter, Penn State University Research funded by Office of Naval Research Award No. N00014-08-1-1015 MURI meeting, April
More informationRank minimization via the γ 2 norm
Rank minimization via the γ 2 norm Troy Lee Columbia University Adi Shraibman Weizmann Institute Rank Minimization Problem Consider the following problem min X rank(x) A i, X b i for i = 1,..., k Arises
More information4 : Exact Inference: Variable Elimination
10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference
More informationMAP Examples. Sargur Srihari
MAP Examples Sargur srihari@cedar.buffalo.edu 1 Potts Model CRF for OCR Topics Image segmentation based on energy minimization 2 Examples of MAP Many interesting examples of MAP inference are instances
More informationCommunity Detection. Data Analytics - Community Detection Module
Community Detection Data Analytics - Community Detection Module Zachary s karate club Members of a karate club (observed for 3 years). Edges represent interactions outside the activities of the club. Community
More information1 Matrix notation and preliminaries from spectral graph theory
Graph clustering (or community detection or graph partitioning) is one of the most studied problems in network analysis. One reason for this is that there are a variety of ways to define a cluster or community.
More informationClustering from Sparse Pairwise Measurements
Clustering from Sparse Pairwise Measurements arxiv:6006683v2 [cssi] 9 May 206 Alaa Saade Laboratoire de Physique Statistique École Normale Supérieure, 24 Rue Lhomond Paris 75005 Marc Lelarge INRIA and
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationarxiv: v2 [math.st] 21 Oct 2015
Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix T. Tony Cai, Tengyuan Liang and Alexander Rakhlin arxiv:1502.01988v2 [math.st] 21 Oct 2015 Department of Statistics
More informationMatrix completion: Fundamental limits and efficient algorithms. Sewoong Oh Stanford University
Matrix completion: Fundamental limits and efficient algorithms Sewoong Oh Stanford University 1 / 35 Low-rank matrix completion Low-rank Data Matrix Sparse Sampled Matrix Complete the matrix from small
More informationQuilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs
Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs Hyokun Yun (work with S.V.N. Vishwanathan) Department of Statistics Purdue Machine Learning Seminar November 9, 2011 Overview
More informationNetwork Representation Using Graph Root Distributions
Network Representation Using Graph Root Distributions Jing Lei Department of Statistics and Data Science Carnegie Mellon University 2018.04 Network Data Network data record interactions (edges) between
More informationLecture 5: Probabilistic tools and Applications II
T-79.7003: Graphs and Networks Fall 2013 Lecture 5: Probabilistic tools and Applications II Lecturer: Charalampos E. Tsourakakis Oct. 11, 2013 5.1 Overview In the first part of today s lecture we will
More informationc 2018 Society for Industrial and Applied Mathematics
SIAM J. OPTIM. Vol. 28, No. 1, pp. 735 759 c 2018 Society for Industrial and Applied Mathematics FINDING PLANTED SUBGRAPHS WITH FEW EIGENVALUES USING THE SCHUR HORN RELAXATION UTKAN ONUR CANDOGAN AND VENKAT
More informationConditions for Robust Principal Component Analysis
Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and
More informationClique is hard on average for regular resolution
Clique is hard on average for regular resolution Ilario Bonacina, UPC Barcelona Tech July 27, 2018 Oxford Complexity Day Talk based on a joint work with: A. Atserias S. de Rezende J. Nordstro m M. Lauria
More informationCOMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS
Submitted to the Annals of Statistics COMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS By Chao Gao, Zongming Ma, Anderson Y. Zhang, and Harrison H. Zhou Yale University, University of Pennsylvania,
More informationImpact of regularization on Spectral Clustering
Impact of regularization on Spectral Clustering Antony Joseph and Bin Yu December 5, 2013 Abstract The performance of spectral clustering is considerably improved via regularization, as demonstrated empirically
More informationPolyhedral Approaches to Online Bipartite Matching
Polyhedral Approaches to Online Bipartite Matching Alejandro Toriello joint with Alfredo Torrico, Shabbir Ahmed Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Industrial
More informationSparse PCA in High Dimensions
Sparse PCA in High Dimensions Jing Lei, Department of Statistics, Carnegie Mellon Workshop on Big Data and Differential Privacy Simons Institute, Dec, 2013 (Based on joint work with V. Q. Vu, J. Cho, and
More informationSpectral k-support Norm Regularization
Spectral k-support Norm Regularization Andrew McDonald Department of Computer Science, UCL (Joint work with Massimiliano Pontil and Dimitris Stamos) 25 March, 2015 1 / 19 Problem: Matrix Completion Goal:
More information