Statistical and Computational Phase Transitions in Planted Models

1 Statistical and Computational Phase Transitions in Planted Models Jiaming Xu Joint work with Yudong Chen (UC Berkeley) Acknowledgement: Prof. Bruce Hajek November 4, 2013

5 Cluster/Community structure in networks Network of political weblogs [Adamic-Glance 05] Social networks: social communities; Metabolic networks: functional communities; Recommendation systems: user and item communities... Q: How to recover the hidden cluster structure? Community Detection Applications: link prediction in social networks, rating prediction in recommendation systems...

9 Information theory of community detection Simple model: Erdős-Rényi type model with planted clusters Information-theoretic view: Converse and achievability for cluster recovery Computational view: Performance limit of polynomial-time algorithms for cluster recovery

11 Stochastic blockmodel (planted partition model) A random graph model generating graphs with cluster structure. Example: n = 5000 nodes, r = 10 clusters of size K = 500, within-cluster edge probability p = 0.999, across-cluster edge probability q. Goal: Exactly recover the hidden clusters given the graph.

14 Cluster recovery as matrix recovery Cluster matrix: Yij = 1 if i and j are in the same cluster; Yij = 0 otherwise. [Figure: true cluster matrix Y; within-cluster entries of the adjacency matrix are Ber(p), across-cluster entries are Ber(q); observed adjacency matrix A.] Cluster recovery as a specific matrix recovery problem: Y → A → Ŷ.

21 Cluster recovery under stochastic blockmodel Vast literature on stochastic blockmodel [Holland et al. 83] and planted partition model [Condon-Karp 01]: [Bickel-Chen 09] [Rohe et al. 11] [Mossel et al. 12]... [Karrer-Newman 11] [Decelle et al. 11] [Nadakuditi-Newman 12] [Krzakala et al. 13]... [McSherry 01] [Coja-Oghlan 10] [Tomozei-Massoulié 11] [Chaudhuri et al. 12] [Chen-Sanghavi-Xu 12] [Heimlicher et al. 12] [Anandkumar et al. 13] [Lelarge et al. 13]... Two fundamental questions still unclear: Information limit: In which regime of n, K, p, q is exact cluster recovery possible (impossible)? Computational limit: In which regime of n, K, p, q is exact cluster recovery easy (hard)?

23 Cluster recovery under stochastic blockmodel Our (non-asymptotic) results apply to a general setting allowing any n, K, p, q. [Phase diagram: vertical axis β, cluster size K = Θ(n^β) from small to large clusters; horizontal axis α, edge density p = 2q = Θ(n^−α) from dense to sparse graphs with small separation. Large clusters in dense graphs are easy; small clusters in sparse graphs are hard.]

28 Converse for cluster recovery [Phase diagram in (α, β), K = Θ(n^β), p = 2q = Θ(n^−α): a region marked "impossible" and a remaining region marked "?".] Proof: Y → A → Ŷ forms a Markov chain. Apply Fano's inequality to lower bound P(Ŷ ≠ Y) by upper bounding I(Y; A). Intuition: The observation A does not carry enough information to distinguish between different possible Y.
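The Fano step can be made explicit. Writing M for the number of candidate cluster matrices under a uniform prior (a quantity only implicit on the slide), the standard form of Fano's inequality gives:

```latex
P(\hat{Y} \neq Y) \;\ge\; 1 - \frac{I(Y;A) + 1}{\log M}.
```

So whenever I(Y; A) = o(log M), the error probability of every estimator stays bounded away from zero, which is exactly the "not enough information" intuition stated on the slide.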

31 Achievability by maximum likelihood estimation Maximum likelihood estimator: Ŷ = arg max_Y P(A | Y). If p > q, maximum likelihood estimation is equivalent to finding the r most densely connected subgraphs of size K in the graph: max_Y Σ_{i,j} A_ij Y_ij s.t. Y is a cluster matrix. Q: When does the maximum likelihood estimator equal Y?
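Since the MLE reduces to a combinatorial search over cluster matrices, it can be prototyped directly for tiny graphs. A minimal sketch (the function names are illustrative, and the exhaustive enumeration is exactly why the MLE takes exponential time):

```python
import itertools
import numpy as np

def equal_partitions(nodes, K):
    """Yield every way to split `nodes` into disjoint blocks of size K."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for others in itertools.combinations(rest, K - 1):
        remaining = [v for v in rest if v not in others]
        for more in equal_partitions(remaining, K):
            yield [(first,) + others] + more

def mle_clusters(A, K):
    """Exhaustive-search MLE: maximize the within-cluster edge count
    sum_{i,j} A_ij Y_ij over all cluster matrices Y."""
    n = A.shape[0]
    score = lambda blocks: sum(A[np.ix_(b, b)].sum() for b in blocks)
    return max(equal_partitions(list(range(n)), K), key=score)

# Two planted triangles joined by a single noisy edge.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
print(mle_clusters(A, 3))  # recovers [(0, 1, 2), (3, 4, 5)]
```

Enumerating the partitions makes the later question concrete: a polynomial-time algorithm has to reach the same maximizer without ever touching this exponentially large search space.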

35 Achievability by maximum likelihood estimation [Phase diagram: the MLE succeeds everywhere above the information limit, matching the converse.] Proof: Concentration inequality + union bound (needs a clever counting argument and peeling technique). Q: The MLE takes exponential time to solve. Can we achieve the information limit via polynomial-time algorithms?

38 Polynomial-time recovery: convex relaxation of MLE Cluster matrix Y has low rank: rank(Y) = r. The nuclear norm ‖Y‖* (sum of singular values) is a convex surrogate for the rank function. A convex relaxation of MLE [Chen-Sanghavi-Xu 12]: max_Y Σ_{i,j} A_ij Y_ij s.t. ‖Y‖* ≤ n, Σ_{i,j} Y_ij = rK², Y_ij ∈ [0, 1].
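The low-rank claim is easy to verify numerically: a cluster matrix with r equal blocks of size K has rank exactly r, and each nonzero singular value equals K. A quick sanity check (not part of the original slides):

```python
import numpy as np

r, K = 4, 50                       # 4 planted clusters of size 50
# Block-diagonal cluster matrix: Y_ij = 1 iff i and j share a cluster.
Y = np.kron(np.eye(r), np.ones((K, K)))

s = np.linalg.svd(Y, compute_uv=False)
print(np.linalg.matrix_rank(Y))    # -> 4
print(np.round(s[:5], 6))          # four singular values equal to K, then zeros
```

This is why the nuclear norm is a natural surrogate here: ‖Y‖* = rK for the true cluster matrix, so the constraint is tight on the planted solution.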

43 Polynomial-time recovery: convex relaxation of MLE [Phase diagram: the convex relaxation succeeds in a region bounded by β = 1/2; below it, a region marked "?".] Proof: The nuclear norm constraint suppresses the random noise and boosts the SNR. Surprise: Convex relaxation might not be order-optimal when there is a growing number of clusters.

48 Polynomial-time recovery: counting common neighbors Similarity between two nodes: the number of common neighbors [Dyer-Frieze 98]. Algorithm: Each node finds the K most similar nodes. [Phase diagram: the algorithm succeeds in a region with corner at (α, β) = (1/2, 1/2).] Proof: The similarity concentrates around its mean.
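The counting step is a single matrix product: entry (i, j) of A² is the number of common neighbors of i and j. A minimal sketch of the algorithm, in which node i keeps itself plus its K − 1 most similar nodes (tie-breaking and the final consistency step are glossed over):

```python
import numpy as np

def common_neighbor_cluster(A, K, i):
    """Estimate node i's cluster: i plus the K-1 nodes that share the
    most common neighbors with i."""
    S = A @ A                      # S[i, j] = number of common neighbors
    np.fill_diagonal(S, -1)        # exclude self-similarity
    top = np.argsort(S[i])[::-1][: K - 1]
    return sorted({i, *top})

# Two disjoint 4-cliques: nodes 0-3 and nodes 4-7.
A = np.kron(np.eye(2, dtype=int), np.ones((4, 4), dtype=int))
np.fill_diagonal(A, 0)
print(common_neighbor_cluster(A, 4, 0))  # -> [0, 1, 2, 3]
```

The concentration argument on the slide says that, in the planted model, the within-cluster similarity of a pair exceeds the across-cluster similarity with high probability, so this top-K selection recovers the true cluster.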

51 Polynomial-time recovery: spectral algorithms Spectral algorithms: based on principal singular vectors (PCA). Example: n = 64, r = 6, K = n^0.75, p = n^−0.25, q = p/2. [Figure: adjacency matrix and its singular value histogram, showing a bulk of noise singular values and r outliers carrying the cluster information.] The r principal singular vectors contain the cluster information. The bulk of the spectrum is caused by the random noise.

58 Polynomial-time recovery: spectral algorithms [Phase diagram: spectral algorithms succeed above the spectral barrier line; axis marks at α = 1/2 and β = 1/2.] Signal strength (the r-th largest singular value) is K(p − q); noise magnitude is O(√(np)). Signal strength needs to be larger than noise magnitude: K(p − q) ≳ √(np) (spectral barrier).
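The signal-versus-noise comparison can be observed directly on a simulated instance. A sketch (parameters chosen for speed rather than to match the slide's example): with r = 2 planted clusters, the second singular value of A sits near the signal strength K(p − q), while the third falls into the noise bulk of order √(np).

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, K, p, q = 400, 2, 200, 0.5, 0.1

# Planted partition: nodes 0..K-1 and K..n-1 form the two clusters.
labels = np.repeat(np.arange(r), K)
P = np.where(labels[:, None] == labels[None, :], p, q)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                        # symmetric adjacency, no self-loops

s = np.linalg.svd(A, compute_uv=False)
print(s[:3])
# s[0], s[1]: signal outliers (s[1] near K(p - q) = 80);
# s[2]: edge of the noise bulk, an order of magnitude smaller.
```

Shrinking K(p − q) toward the bulk edge makes the outliers disappear into the noise, which is the spectral barrier in action.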

60 Conjecture on computational limit [Phase diagram: the spectral barrier coincides with the conjectured computational limit.] Conjecture: no polynomial-time algorithm succeeds beyond the spectral barrier. A similar conjecture appears in the planted clique model.

62 Review: Conjecture in planted clique model [Figure: A = (planted K × K clique) + Ber(0.5) background noise.] Feasible if and only if K > 2 log₂ n. A simple algorithm picking the K nodes with highest degree works if K = Ω(√(n log n)). A spectral algorithm works if K = Ω(√n) [Alon et al. 98]. Belief: No polynomial-time algorithm works if K = o(√n).

63 Review: Conjecture in planted clique model [Figure: replacing the clique with a denser random block, A = (planted K × K Ber(p) block) + Ber(q) background.] Feasible if and only if K > 2 log₂ n. A simple algorithm picking the K nodes with highest degree works if K = Ω(√(n log n)). A spectral algorithm works if K = Ω(√n) [Alon et al. 98]. Belief: No polynomial-time algorithm works if K = o(√n). Planted dense subgraph model: p, q ∈ [0, 1].
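The highest-degree heuristic from the slide takes a few lines to simulate. A sketch on a planted clique instance, with K chosen well above the √(n log n) scale so the degree gap is visible:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 1000, 200

# Ber(0.5) background graph, then plant a clique on the first K nodes.
A = np.triu((rng.random((n, n)) < 0.5).astype(int), 1)
A = A + A.T
A[:K, :K] = 1 - np.eye(K, dtype=int)   # clique on nodes 0..K-1

# Degree heuristic: output the K nodes of highest degree.
recovered = set(np.argsort(A.sum(axis=1))[-K:])
overlap = len(recovered & set(range(K)))
print(overlap / K)   # close to 1: clique members dominate the degree order
```

Clique members gain roughly K/2 extra degree over the background fluctuation of order √n, which is exactly why the heuristic needs K = Ω(√(n log n)) and fails below that scale.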

66 Planted dense subgraph model [Phase diagram in (α, β), K = Θ(n^β), p = 2q = Θ(n^−α): regions labeled, from easiest to hardest, simple algorithms, spectral barrier / conjectured computational limit, MLE, impossible.] Conjecture: no polynomial-time algorithm succeeds beyond the spectral barrier.

71 Concluding remarks Simple model: Stochastic blockmodel (planted partition model). If K = Θ(n), the cluster structure can be recovered up to the information limit via polynomial-time algorithms. If K = o(n), the cluster structure can be recovered up to the information limit via exponential-time algorithms, but perhaps not via polynomial-time algorithms due to the spectral barrier. The conjectured gap between the information and computational limits also appears in the planted dense subgraph model. Future work: prove the conjecture by assuming no polynomial-time algorithm detects a hidden clique of size o(√n) in the planted clique model.

72 Gap between information and computational limits Search version: Planted Clique, Planted Dense Subgraph, Planted Submatrix, Sparse PCA, Planted Partition. Hypothesis testing version: Planted Clique, Planted Dense Subgraph [Ma & Wu '13] [Berthet & Rigollet '13], Planted Submatrix, Sparse PCA.


Spectral Clustering for Dynamic Block Models Spectral Clustering for Dynamic Block Models Sharmodeep Bhattacharyya Department of Statistics Oregon State University January 23, 2017 Research Computing Seminar, OSU, Corvallis (Joint work with Shirshendu

More information

Graph Partitioning Algorithms and Laplacian Eigenvalues

Graph Partitioning Algorithms and Laplacian Eigenvalues Graph Partitioning Algorithms and Laplacian Eigenvalues Luca Trevisan Stanford Based on work with Tsz Chiu Kwok, Lap Chi Lau, James Lee, Yin Tat Lee, and Shayan Oveis Gharan spectral graph theory Use linear

More information

SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN

SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN CAN M. LE, ELIZAVETA LEVINA, AND ROMAN VERSHYNIN Abstract. We study random graphs with possibly different edge probabilities in the

More information

arxiv: v1 [stat.ml] 6 Nov 2014

arxiv: v1 [stat.ml] 6 Nov 2014 A Generic Sample Splitting Approach for Refined Community Recovery in Stochastic Bloc Models arxiv:1411.1469v1 [stat.ml] 6 Nov 2014 Jing Lei 1 and Lingxue Zhu 2 1,2 Department of Statistics, Carnegie Mellon

More information

Project in Computational Game Theory: Communities in Social Networks

Project in Computational Game Theory: Communities in Social Networks Project in Computational Game Theory: Communities in Social Networks Eldad Rubinstein November 11, 2012 1 Presentation of the Original Paper 1.1 Introduction In this section I present the article [1].

More information

Introduction to graphical models: Lecture III

Introduction to graphical models: Lecture III Introduction to graphical models: Lecture III Martin Wainwright UC Berkeley Departments of Statistics, and EECS Martin Wainwright (UC Berkeley) Some introductory lectures January 2013 1 / 25 Introduction

More information

High-dimensional graphical model selection: Practical and information-theoretic limits

High-dimensional graphical model selection: Practical and information-theoretic limits 1 High-dimensional graphical model selection: Practical and information-theoretic limits Martin Wainwright Departments of Statistics, and EECS UC Berkeley, California, USA Based on joint work with: John

More information

Stein s Method for concentration inequalities

Stein s Method for concentration inequalities Number of Triangles in Erdős-Rényi random graph, UC Berkeley joint work with Sourav Chatterjee, UC Berkeley Cornell Probability Summer School July 7, 2009 Number of triangles in Erdős-Rényi random graph

More information

Linear inverse problems on Erdős-Rényi graphs: Information-theoretic limits and efficient recovery

Linear inverse problems on Erdős-Rényi graphs: Information-theoretic limits and efficient recovery Linear inverse problems on Erdős-Rényi graphs: Information-theoretic limits and efficient recovery Emmanuel Abbe, Afonso S. Bandeira, Annina Bracher, Amit Singer Princeton University Abstract This paper

More information

Spectral Properties of Random Matrices for Stochastic Block Model

Spectral Properties of Random Matrices for Stochastic Block Model Spectral Properties of Random Matrices for Stochastic Block Model Konstantin Avrachenkov INRIA Sophia Antipolis France Email:konstantin.avratchenkov@inria.fr Laura Cottatellucci Eurecom Sophia Antipolis

More information

STA141C: Big Data & High Performance Statistical Computing

STA141C: Big Data & High Performance Statistical Computing STA141C: Big Data & High Performance Statistical Computing Lecture 12: Graph Clustering Cho-Jui Hsieh UC Davis May 29, 2018 Graph Clustering Given a graph G = (V, E, W ) V : nodes {v 1,, v n } E: edges

More information

Consistency Under Sampling of Exponential Random Graph Models

Consistency Under Sampling of Exponential Random Graph Models Consistency Under Sampling of Exponential Random Graph Models Cosma Shalizi and Alessandro Rinaldo Summary by: Elly Kaizar Remember ERGMs (Exponential Random Graph Models) Exponential family models Sufficient

More information

Finding Planted Subgraphs with Few Eigenvalues using the Schur-Horn Relaxation

Finding Planted Subgraphs with Few Eigenvalues using the Schur-Horn Relaxation Finding Planted Subgraphs with Few Eigenvalues using the Schur-Horn Relaxation Utkan Onur Candogan 1 and Venkat Chandrasekaran 2 1 Department of Electrical Engineering, California Institute of Technology,

More information

Predicting Phase Transitions in Hypergraph q-coloring with the Cavity Method

Predicting Phase Transitions in Hypergraph q-coloring with the Cavity Method Predicting Phase Transitions in Hypergraph q-coloring with the Cavity Method Marylou Gabrié (LPS, ENS) Varsha Dani (University of New Mexico), Guilhem Semerjian (LPT, ENS), Lenka Zdeborová (CEA, Saclay)

More information

Clustering Algorithms for Random and Pseudo-random Structures

Clustering Algorithms for Random and Pseudo-random Structures Abstract Clustering Algorithms for Random and Pseudo-random Structures Pradipta Mitra 2008 Partitioning of a set objects into a number of clusters according to a suitable distance metric is one of the

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017 U.C. Berkeley CS294: Beyond Worst-Case Analysis Handout 8 Luca Trevisan September 19, 2017 Scribed by Luowen Qian Lecture 8 In which we use spectral techniques to find certificates of unsatisfiability

More information

Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model

Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model Journal of Machine Learning Research vol (2012) 1 23 Submitted 24 May 2012; ublished 2012 Spectral Clustering of Graphs with General Degrees in the Extended lanted artition Model Kamalika Chaudhuri Fan

More information

Information-theoretic Limits for Community Detection in Network Models

Information-theoretic Limits for Community Detection in Network Models Information-theoretic Limits for Community Detection in Network Models Chuyang Ke Department of Computer Science Purdue University West Lafayette, IN 47907 cke@purdue.edu Jean Honorio Department of Computer

More information

Notes 6 : First and second moment methods

Notes 6 : First and second moment methods Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative

More information

Parameter estimators of sparse random intersection graphs with thinned communities

Parameter estimators of sparse random intersection graphs with thinned communities Parameter estimators of sparse random intersection graphs with thinned communities Lasse Leskelä Aalto University Johan van Leeuwaarden Eindhoven University of Technology Joona Karjalainen Aalto University

More information

1-Bit Matrix Completion

1-Bit Matrix Completion 1-Bit Matrix Completion Mark A. Davenport School of Electrical and Computer Engineering Georgia Institute of Technology Yaniv Plan Mary Wootters Ewout van den Berg Matrix Completion d When is it possible

More information

U.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018

U.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018 U.C. Berkeley Better-than-Worst-Case Analysis Handout 3 Luca Trevisan May 24, 2018 Lecture 3 In which we show how to find a planted clique in a random graph. 1 Finding a Planted Clique We will analyze

More information

CS264: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique

CS264: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique CS64: Beyond Worst-Case Analysis Lectures #9 and 10: Spectral Algorithms for Planted Bisection and Planted Clique Tim Roughgarden February 7 and 9, 017 1 Preamble Lectures #6 8 studied deterministic conditions

More information

Part 1: Les rendez-vous symétriques

Part 1: Les rendez-vous symétriques Part 1: Les rendez-vous symétriques Cris Moore, Santa Fe Institute Joint work with Alex Russell, University of Connecticut A walk in the park L AND NL-COMPLETENESS 313 there are n locations in a park you

More information

1-Bit Matrix Completion

1-Bit Matrix Completion 1-Bit Matrix Completion Mark A. Davenport School of Electrical and Computer Engineering Georgia Institute of Technology Yaniv Plan Mary Wootters Ewout van den Berg Matrix Completion d When is it possible

More information

The condensation phase transition in random graph coloring

The condensation phase transition in random graph coloring The condensation phase transition in random graph coloring Victor Bapst Goethe University, Frankfurt Joint work with Amin Coja-Oghlan, Samuel Hetterich, Felicia Rassmann and Dan Vilenchik arxiv:1404.5513

More information

Dissertation Defense

Dissertation Defense Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation

More information

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018 ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space

More information

Limitations in Approximating RIP

Limitations in Approximating RIP Alok Puranik Mentor: Adrian Vladu Fifth Annual PRIMES Conference, 2015 Outline 1 Background The Problem Motivation Construction Certification 2 Planted model Planting eigenvalues Analysis Distinguishing

More information

Analysis of Robust PCA via Local Incoherence

Analysis of Robust PCA via Local Incoherence Analysis of Robust PCA via Local Incoherence Huishuai Zhang Department of EECS Syracuse University Syracuse, NY 3244 hzhan23@syr.edu Yi Zhou Department of EECS Syracuse University Syracuse, NY 3244 yzhou35@syr.edu

More information

Algorithmic approaches to fitting ERG models

Algorithmic approaches to fitting ERG models Ruth Hummel, Penn State University Mark Handcock, University of Washington David Hunter, Penn State University Research funded by Office of Naval Research Award No. N00014-08-1-1015 MURI meeting, April

More information

Rank minimization via the γ 2 norm

Rank minimization via the γ 2 norm Rank minimization via the γ 2 norm Troy Lee Columbia University Adi Shraibman Weizmann Institute Rank Minimization Problem Consider the following problem min X rank(x) A i, X b i for i = 1,..., k Arises

More information

4 : Exact Inference: Variable Elimination

4 : Exact Inference: Variable Elimination 10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference

More information

MAP Examples. Sargur Srihari

MAP Examples. Sargur Srihari MAP Examples Sargur srihari@cedar.buffalo.edu 1 Potts Model CRF for OCR Topics Image segmentation based on energy minimization 2 Examples of MAP Many interesting examples of MAP inference are instances

More information

Community Detection. Data Analytics - Community Detection Module

Community Detection. Data Analytics - Community Detection Module Community Detection Data Analytics - Community Detection Module Zachary s karate club Members of a karate club (observed for 3 years). Edges represent interactions outside the activities of the club. Community

More information

1 Matrix notation and preliminaries from spectral graph theory

1 Matrix notation and preliminaries from spectral graph theory Graph clustering (or community detection or graph partitioning) is one of the most studied problems in network analysis. One reason for this is that there are a variety of ways to define a cluster or community.

More information

Clustering from Sparse Pairwise Measurements

Clustering from Sparse Pairwise Measurements Clustering from Sparse Pairwise Measurements arxiv:6006683v2 [cssi] 9 May 206 Alaa Saade Laboratoire de Physique Statistique École Normale Supérieure, 24 Rue Lhomond Paris 75005 Marc Lelarge INRIA and

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

arxiv: v2 [math.st] 21 Oct 2015

arxiv: v2 [math.st] 21 Oct 2015 Computational and Statistical Boundaries for Submatrix Localization in a Large Noisy Matrix T. Tony Cai, Tengyuan Liang and Alexander Rakhlin arxiv:1502.01988v2 [math.st] 21 Oct 2015 Department of Statistics

More information

Matrix completion: Fundamental limits and efficient algorithms. Sewoong Oh Stanford University

Matrix completion: Fundamental limits and efficient algorithms. Sewoong Oh Stanford University Matrix completion: Fundamental limits and efficient algorithms Sewoong Oh Stanford University 1 / 35 Low-rank matrix completion Low-rank Data Matrix Sparse Sampled Matrix Complete the matrix from small

More information

Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs

Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs Hyokun Yun (work with S.V.N. Vishwanathan) Department of Statistics Purdue Machine Learning Seminar November 9, 2011 Overview

More information

Network Representation Using Graph Root Distributions

Network Representation Using Graph Root Distributions Network Representation Using Graph Root Distributions Jing Lei Department of Statistics and Data Science Carnegie Mellon University 2018.04 Network Data Network data record interactions (edges) between

More information

Lecture 5: Probabilistic tools and Applications II

Lecture 5: Probabilistic tools and Applications II T-79.7003: Graphs and Networks Fall 2013 Lecture 5: Probabilistic tools and Applications II Lecturer: Charalampos E. Tsourakakis Oct. 11, 2013 5.1 Overview In the first part of today s lecture we will

More information

c 2018 Society for Industrial and Applied Mathematics

c 2018 Society for Industrial and Applied Mathematics SIAM J. OPTIM. Vol. 28, No. 1, pp. 735 759 c 2018 Society for Industrial and Applied Mathematics FINDING PLANTED SUBGRAPHS WITH FEW EIGENVALUES USING THE SCHUR HORN RELAXATION UTKAN ONUR CANDOGAN AND VENKAT

More information

Conditions for Robust Principal Component Analysis

Conditions for Robust Principal Component Analysis Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and

More information

Clique is hard on average for regular resolution

Clique is hard on average for regular resolution Clique is hard on average for regular resolution Ilario Bonacina, UPC Barcelona Tech July 27, 2018 Oxford Complexity Day Talk based on a joint work with: A. Atserias S. de Rezende J. Nordstro m M. Lauria

More information

COMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS

COMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS Submitted to the Annals of Statistics COMMUNITY DETECTION IN DEGREE-CORRECTED BLOCK MODELS By Chao Gao, Zongming Ma, Anderson Y. Zhang, and Harrison H. Zhou Yale University, University of Pennsylvania,

More information

Impact of regularization on Spectral Clustering

Impact of regularization on Spectral Clustering Impact of regularization on Spectral Clustering Antony Joseph and Bin Yu December 5, 2013 Abstract The performance of spectral clustering is considerably improved via regularization, as demonstrated empirically

More information

Polyhedral Approaches to Online Bipartite Matching

Polyhedral Approaches to Online Bipartite Matching Polyhedral Approaches to Online Bipartite Matching Alejandro Toriello joint with Alfredo Torrico, Shabbir Ahmed Stewart School of Industrial and Systems Engineering Georgia Institute of Technology Industrial

More information

Sparse PCA in High Dimensions

Sparse PCA in High Dimensions Sparse PCA in High Dimensions Jing Lei, Department of Statistics, Carnegie Mellon Workshop on Big Data and Differential Privacy Simons Institute, Dec, 2013 (Based on joint work with V. Q. Vu, J. Cho, and

More information

Spectral k-support Norm Regularization

Spectral k-support Norm Regularization Spectral k-support Norm Regularization Andrew McDonald Department of Computer Science, UCL (Joint work with Massimiliano Pontil and Dimitris Stamos) 25 March, 2015 1 / 19 Problem: Matrix Completion Goal:

More information