Spectral Clustering for Dynamic Block Models


1 Spectral Clustering for Dynamic Block Models
Sharmodeep Bhattacharyya, Department of Statistics, Oregon State University
January 23, 2017, Research Computing Seminar, OSU, Corvallis
(Joint work with Shirshendu Chatterjee, City College, CUNY)
Sharmodeep Bhattacharyya (Oregon State), Dynamic Spectral Clustering, January 23, 2017

2 Outline
1 Introduction and Motivation
2 Community Detection in Networks
  Community Detection Algorithms
  Community Detection Algorithms: Spectral Methods
3 Feature and Models of Networks
  Dynamic Network Models
  Spectral Clustering Methods
  Theoretical Results
4 Results
  Simulation Results
  Real Networks: Neuroscience Example
5 Summary

3 Introduction and Motivation
Networks
Network | Nodes | Edges
Social network | People | Friendship/kinship
Biological network | Gene/Protein | Interaction
Citation network | Papers | Citation

4 Introduction and Motivation
Network Data
$G = (V, E)$: an undirected graph, where $V = \{v_1, \ldots, v_n\}$ is a set of arbitrarily labeled vertices.
The (symmetric) adjacency matrix $[A_{ij}]_{i,j=1}^{n}$ represents the network data numerically:
$A_{ij} = 1$ if node $i$ links to node $j$, and $A_{ij} = 0$ otherwise.
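
As a small illustration of this representation, the following Python sketch builds a symmetric 0/1 adjacency matrix from an edge list. The function name `adjacency_from_edges`, the 0-based node labels and the example edges are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def adjacency_from_edges(n, edges):
    """Build the symmetric 0/1 adjacency matrix of an undirected graph on n nodes."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = 1  # node i links to node j
        A[j, i] = 1  # undirected graph: mirror the entry
    return A

# Hypothetical example: a path 0 - 1 - 2 - 3 on four nodes.
A = adjacency_from_edges(4, [(0, 1), (1, 2), (2, 3)])
```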

5 Introduction and Motivation
Drosophila protein interactions
Guruharsha et al., A protein complex network of Drosophila melanogaster, Cell, 147, 2011.
Experimentally measured and scored protein interactions.
1612 nodes; 10,421 edges (edge density $\hat{\rho}$: value missing).

6 Introduction and Motivation
Political blogs
Understanding political patterns.
Adamic, Lada A., and Natalie Glance. "The political blogosphere and the 2004 US election: divided they blog." Proceedings of the 3rd International Workshop on Link Discovery. ACM.

7 Introduction and Motivation
Online Social Network

8 Introduction and Motivation
Dynamic/Time-varying Networks
Figure: Dynamic Network Examples

9 Introduction and Motivation
A Motivating Example: Electro-Corticograph Array Data for Speech
Figure: a. MRI reconstruction of a single subject's brain with vSMC electrodes (dots), colored according to distance from the Sylvian fissure (black and red are the most dorsal and ventral positions, respectively). b. Expanded view of vSMC anatomy: cs, central sulcus; PoCG, post-central gyrus; PrCG, pre-central gyrus; Sf, Sylvian fissure. c-e. Top: vocal tract schematics for three consonants (/b/, /d/, /g/), produced by occlusion at the lips, tongue tip and tongue body, respectively (red arrow). Middle: spectrograms of spoken consonant-vowel syllables (Bouchard et al., Nature, 2013).

10 Introduction and Motivation
Other Examples of Network Data
Biological networks:
  Biochemical pathway networks
  Gene transcription networks
  Epidemiological networks
Social networks:
  Academic networks such as collaboration and citation networks
  Networks arising from text mining
Technological networks:
  Internet
  Cell-phone tower and telephone exchange networks
  Airport and transport networks

11 Introduction and Motivation
Two Main Classes of Problems for Networks
(I) Formation of networks, given information on vertices as data.
(II) Inference on networks, given a complete network with node and edge structure as data.

13 Introduction and Motivation
Commonly Asked Questions
  Community detection.
  Link prediction.
  Covariate or latent variable estimation.
  Sampling of nodes and subgraphs.
  Dynamic network inference and information exchange in networks.
Most of these questions can be answered by performing inference on network models.

15 Introduction and Motivation
Community in Networks
Type | Definition | How to find
Topological | Nodes within a community have, on average, more edges among themselves than with nodes outside the community | Community detection algorithms proposed by statisticians, computer scientists and mathematicians
Physical | Nodes or edges within a community have some shared property | Verified by scientists

17 Community Detection in Networks: Community Detection Algorithms
Community Detection Algorithms
Popular algorithms for community detection are:
1 Modularity maximizing methods (Newman and Girvan (2006)).
2 Spectral clustering based methods (McSherry (2001)).
3 Maximization of the likelihood and its approximations:
  (a) Profile likelihood maximization (Bickel and Chen (2009)).
  (b) Variational likelihood maximization (Celisse et al. (2011)).
  (c) Pseudo-likelihood maximization (Chen et al. (2012)).

19 Community Detection in Networks: Community Detection Algorithms: Spectral Methods
General Spectral Clustering Algorithm

20 Community Detection in Networks: Community Detection Algorithms: Spectral Methods
Well-known Examples of $M_n$
For community identification in networks, there are some well-known operators $M_n$:
  Adjacency matrix, $M_n = A$ (Sussman et al. (2012)).
  Normalized Laplacian matrices, $M_n = L_n^{rw} = D^{-1} L_n$ and $M_n = L_n^{sym} = D^{-1/2} L_n D^{-1/2}$, with $L_n = D - A_n$ (Rohe et al. (2011)).
Although these operators perform well in regime (a), they fail to perform well in regimes (b) and (c) described previously.
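
The operators on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `laplacian_operators` is hypothetical, $D$ is taken to be the diagonal matrix of node degrees, and isolated nodes (degree 0) are handled by setting their inverse degree to 0, a convention the slides do not specify.

```python
import numpy as np

def laplacian_operators(A):
    """Return (L, L_rw, L_sym) with L = D - A, L_rw = D^{-1} L, L_sym = D^{-1/2} L D^{-1/2}."""
    d = A.sum(axis=1).astype(float)      # node degrees, the diagonal of D
    L = np.diag(d) - A                   # unnormalized Laplacian
    inv_d = np.zeros_like(d)
    pos = d > 0
    inv_d[pos] = 1.0 / d[pos]            # assumption: 1/0 is replaced by 0
    inv_sqrt_d = np.sqrt(inv_d)
    L_rw = inv_d[:, None] * L            # row scaling by D^{-1}
    L_sym = inv_sqrt_d[:, None] * L * inv_sqrt_d[None, :]
    return L, L_rw, L_sym

# Hypothetical example: the path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L, L_rw, L_sym = laplacian_operators(A)
```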

21 Community Detection in Networks: Community Detection Algorithms: Spectral Methods
$M_n$ for Sparse Networks
For community identification in sparse networks, there are some regularized variations of $L_n$ or $A_n$:
  Adjacency matrix $A_\tau = A + \tau \mathbf{1}\mathbf{1}^T$, where $\mathbf{1}$ is a vector of 1s of length $n$ (Amini et al. (2012)).
  Laplacian matrix $L_n^{\tau} = (D + \tau I)^{-1/2} A (D + \tau I)^{-1/2}$ (Chaudhuri et al. (2012)).
  Trimmed adjacency matrix $A_\tau$, where high-degree nodes are trimmed (Coja-Oghlan (2010)).
Theoretical performance of the first two regularized operators for sparse networks is under investigation.
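
A minimal NumPy sketch of the three regularized operators follows. The function names are hypothetical, and the trimming rule in particular (zero out the rows and columns of nodes whose degree exceeds a user-chosen cutoff `max_degree`) is an assumption: the slide only says that high-degree nodes are trimmed.

```python
import numpy as np

def regularized_adjacency(A, tau):
    """A_tau = A + tau * 1 1^T, i.e. tau is added to every entry."""
    n = A.shape[0]
    return A + tau * np.ones((n, n))

def regularized_laplacian(A, tau):
    """L_tau = (D + tau I)^{-1/2} A (D + tau I)^{-1/2}, for tau > 0."""
    d = A.sum(axis=1).astype(float)
    s = 1.0 / np.sqrt(d + tau)
    return s[:, None] * A * s[None, :]

def trimmed_adjacency(A, max_degree):
    """Zero out rows/columns of nodes with degree above max_degree (assumed rule)."""
    keep = A.sum(axis=1) <= max_degree
    return A * np.outer(keep, keep)

# Hypothetical example: the path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
A_tau = regularized_adjacency(A, tau=0.5)
L_tau = regularized_laplacian(A, tau=0.5)
A_trim = trimmed_adjacency(A, max_degree=1)
```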

24 Feature and Models of Networks: Dynamic Models
Dynamic Network Models: A Myopic Review
Dynamic time-evolving formation of networks: Barabasi and Albert (1999) and a large literature.
Extension of static models of networks:
  Latent space models: Sarkar and Moore (2005), Sewell and Chen (2014).
  Mixed membership block models: Xing et al. (2010), Ho et al. (2011).
  Random dot-product models: Tang et al. (2013).
  Stochastic block models: Xu et al. (2013), Ghasemian et al. (2015).
  Graphon models: Crane (2015).
Bayesian models: Ho et al. (2011), Durante et al. (2014).

26 Feature and Models of Networks: Dynamic Models
Nonparametric Latent Variable Models
Derived from the representation of exchangeable random infinite arrays by Aldous and Hoover (1983).
NP Model: define $P(\{A_{ij}\}_{i,j=1}^{n})$ conditionally given latent variables $\{\xi_i\}_{i=1}^{n}$ associated with the vertices $\{v_i\}_{i=1}^{n}$, respectively (Bickel & Chen (2009), Bollobás et al. (2007), Hoff et al. (2002)):
$\xi_1, \ldots, \xi_n \overset{iid}{\sim} U(0, 1)$
$\Pr(A_{ij} = 1 \mid \xi_i = u, \xi_j = v) = h_n(u, v) = \rho_n w(u, v)$
$w(u, v)$ is the conditional latent variable density given $A_{ij} = 1$.
Define $\lambda_n \equiv n\rho_n$ as the expected degree parameter and $P = [P_{ij}]_{i,j=1}^{n} = [\rho_n w(\xi_i, \xi_j)]_{i,j=1}^{n}$.
$h_n$ is not uniquely defined: $h_n(\varphi(u), \varphi(v))$, with measure-preserving $\varphi$, gives the same model.
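
The generative mechanism of the NP model can be sketched as follows. The function name `sample_latent_variable_graph`, the choice $w(u, v) = u + v$, the value of $\rho_n$, the clipping of probabilities to $[0, 1]$ and the exclusion of self-loops are all assumptions made for illustration, not part of the slides.

```python
import numpy as np

def sample_latent_variable_graph(n, rho, w, rng):
    """Draw xi_i iid U(0,1) and A_ij ~ Bernoulli(rho * w(xi_i, xi_j)) for i < j."""
    xi = rng.random(n)                                        # latent variables
    P = np.clip(rho * w(xi[:, None], xi[None, :]), 0.0, 1.0)  # P_ij = rho_n w(xi_i, xi_j)
    upper = np.triu(rng.random((n, n)) < P, k=1)              # independent edges above the diagonal
    A = (upper | upper.T).astype(int)                         # symmetrize, no self-loops
    return xi, P, A

rng = np.random.default_rng(0)
# Hypothetical kernel w(u, v) = u + v, which integrates to 1 over the unit square.
xi, P, A = sample_latent_variable_graph(50, 0.2, lambda u, v: u + v, rng)
```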

27 Feature and Models of Networks: Dynamic Models
Stochastic Block Model (Holland, Laskey and Leinhardt 1983)
A $K$-block stochastic block model with parameters $(\pi, P)$ is defined as follows. Consider the latent variables corresponding to the vertices, $z = (z_1, z_2, \ldots, z_n)$, with
$z_1, \ldots, z_n \overset{iid}{\sim} \mathrm{Multinomial}(1; (\pi_1, \ldots, \pi_K))$
$\Pr(A_{ij} = 1 \mid z_i, z_j) = P_{z_i z_j}$
where $P = [P_{ab}]$ is a $K \times K$ symmetric matrix for undirected networks.
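
A short simulation sketch of this definition is given below. The function name `sample_sbm`, the integer coding of the labels $z_i \in \{0, \ldots, K-1\}$, the exclusion of self-loops and the example parameter values are assumptions for illustration.

```python
import numpy as np

def sample_sbm(n, pi, P, rng):
    """Sample labels z and a symmetric adjacency matrix A from a K-block SBM (pi, P)."""
    z = rng.choice(len(pi), size=n, p=pi)          # z_i iid Multinomial(1; pi), coded 0..K-1
    probs = P[z][:, z]                             # probs[i, j] = P[z_i, z_j]
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(int)              # undirected, no self-loops
    return z, A

rng = np.random.default_rng(0)
P = np.array([[0.5, 0.1],
              [0.1, 0.4]])                         # hypothetical 2-block connectivity matrix
z, A = sample_sbm(40, [0.5, 0.5], P, rng)
```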

28 Feature and Models of Networks: Dynamic Models
Dynamic Nonparametric Latent Variable Models
We now introduce a time component into the exchangeable model. The most general version of the model becomes
$\xi_i^0 \overset{iid}{\sim} U(0, 1)$ (1)
$\xi_i^t \mid (\xi_i^{t-1} = u) \overset{iid}{\sim} F(u)$ (2)
$P\left(A_{ij}^{(t)} = 1 \mid \xi_i^t = u, \xi_j^t = v, A_{ij}^{(t-1)} = z\right) = h_n(u, v, z, t) = \rho_n w(u, v, z, t)$ (3)
where $F$ is a univariate distribution, $0 \le h_n \le 1$, and $0 \le t \le T$ is the time variable.
Random re-wiring mechanism: $h_n$ depends on both $t$ and $z$ (Harry Crane, 2015).
Evolving communities: $h_n$ depends on $(u, v)$ only, with $F$ non-trivial (Ghasemian et al., 2015).

29 Feature and Models of Networks: Dynamic Models
Dynamic Stochastic Block Model (DSBM)
Specialize to the Dynamic Stochastic Block Model with parameters $(\pi, B)$ and latent variables $z$:
$z_1, \ldots, z_n \overset{iid}{\sim} \mathrm{Mult}(1; (\pi_1, \ldots, \pi_K))$, (4)
$P\left(A_{ij}^{(t)} = 1 \mid z_i, z_j\right) = B^{(t)}_{z_i z_j}$, (5)
where $B^{(t)} = [B^{(t)}_{ab}]$ is a $K \times K$ symmetric matrix for undirected networks at each time step $t$, and $0 \le t \le T$ is the time variable.

30 Feature and Models of Networks: Dynamic Models
Dynamic Degree Corrected Block Model (DDCBM)
Specialize to the Dynamic Degree Corrected Block Model with parameters $(\pi, B, \psi)$ and latent variables $z$:
$z_1, \ldots, z_n \overset{iid}{\sim} \mathrm{Mult}(1; (\pi_1, \ldots, \pi_K))$, (6)
$P\left(A_{ij}^{(t)} = 1 \mid z_i, z_j, \psi\right) = \psi_i \psi_j B^{(t)}_{z_i z_j}$, (7)
where $B^{(t)} = [B^{(t)}_{ab}]$ is a $K \times K$ symmetric matrix for undirected networks at each time step $t$, and $0 \le t \le T$ is the time variable.
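
The two dynamic block models can be simulated with one sketch, since the DSBM is the DDCBM with all $\psi_i = 1$. The function name `sample_dynamic_block_model`, the integer label coding, the clipping of $\psi_i \psi_j B^{(t)}_{z_i z_j}$ to $[0, 1]$, the exclusion of self-loops and the example matrices are assumptions for illustration. As on the slides, the labels $z$ are drawn once and shared by all time steps.

```python
import numpy as np

def sample_dynamic_block_model(n, pi, B_list, rng, psi=None):
    """Sample z once, then one adjacency matrix per B^(t); psi=None gives the DSBM."""
    z = rng.choice(len(pi), size=n, p=pi)                     # memberships, fixed over time
    psi = np.ones(n) if psi is None else np.asarray(psi, dtype=float)
    snapshots = []
    for B in B_list:
        probs = np.clip(np.outer(psi, psi) * B[z][:, z], 0.0, 1.0)
        upper = np.triu(rng.random((n, n)) < probs, k=1)
        snapshots.append((upper | upper.T).astype(int))       # undirected, no self-loops
    return z, np.stack(snapshots)                             # array of shape (T, n, n)

rng = np.random.default_rng(0)
# Hypothetical connectivity matrices for two time steps.
B_list = [np.array([[0.6, 0.1], [0.1, 0.6]]),
          np.array([[0.1, 0.5], [0.5, 0.1]])]
z, A_dsbm = sample_dynamic_block_model(30, [0.5, 0.5], B_list, rng)
z_dc, A_ddcbm = sample_dynamic_block_model(30, [0.5, 0.5], B_list, rng,
                                           psi=rng.uniform(0.5, 1.0, size=30))
```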

32 Feature and Models of Networks: Spectral Methods
Dynamic Spectral Clustering Algorithms

34 Feature and Models of Networks: Theory
First Method: In Detail
In the first method, we sum the adjacency matrices to obtain
$A = \sum_{t=1}^{T} A^{(t)}$.
We obtain the leading $K$ eigenvectors of $A$, corresponding to its largest eigenvalues. Suppose $\hat{U} \in \mathbb{R}^{n \times K}$ contains those eigenvectors as columns.
Then we use a $(1 + \epsilon)$-approximate $K$-means clustering algorithm to obtain $\hat{Z} \in \mathcal{M}_{n,K}$ and $\hat{\Theta} \in \mathbb{R}^{K \times K}$ such that
$\|\hat{Z}\hat{\Theta} - \hat{U}\|_F^2 \le (1 + \epsilon) \min_{Z \in \mathcal{M}_{n,K},\, \Theta \in \mathbb{R}^{K \times K}} \|Z\Theta - \hat{U}\|_F^2$.
$\hat{Z}$ is the estimate of $Z = (z_1, \ldots, z_n)$ from this method.
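
The steps of the first method can be sketched in NumPy as below. This is a sketch under stated assumptions, not the authors' implementation: the $(1 + \epsilon)$-approximate $K$-means step is replaced by plain Lloyd iterations with a deterministic farthest-point initialization, and the function names `kmeans_labels` and `spectral_clustering_sum` as well as the toy input (two disjoint cliques observed at three time steps) are hypothetical.

```python
import numpy as np

def kmeans_labels(X, K, n_iter=50):
    """Lloyd's k-means with farthest-point initialization (stand-in for (1+eps)-approximate k-means)."""
    centers = [X[0]]
    for _ in range(1, K):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d2)])               # farthest row from the current centers
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                     # assign rows to the nearest center
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def spectral_clustering_sum(A_list, K):
    """Cluster the rows of the top-K eigenvectors of A = sum_t A^(t)."""
    A_sum = sum(A_list)
    vals, vecs = np.linalg.eigh(A_sum)                 # eigenvalues in ascending order
    U_hat = vecs[:, -K:]                               # eigenvectors of the K largest eigenvalues
    return kmeans_labels(U_hat, K)

# Toy input: two disjoint cliques of size 4, repeated over T = 3 time steps.
block = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
A_t = np.block([[block, np.zeros((4, 4), dtype=int)],
                [np.zeros((4, 4), dtype=int), block]])
labels = spectral_clustering_sum([A_t, A_t, A_t], K=2)
```

On this toy input the rows of $\hat{U}$ are constant within each clique, so the two cliques are recovered as the two clusters.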

35 Feature and Models of Networks: Theory
First Method: Consistency of $\hat{Z}$
Let the adjacency matrices $A$ be generated from the DSBM with $n$ nodes and $K$ communities, with parameters $(\pi, \{B^{(t)}\}_{t=1}^{T})$.
Let $\gamma_n$ be the smallest non-zero singular value of $P$.
Let $d := n \max_{k,l \in [K],\, t \in [T]} B^{(t)}_{k,l}$ be the maximum expected degree of a node at any time.

36 Feature and Models of Networks: Theory
First Method: Consistency of $\hat{Z}$
Theorem. Let $A$ be generated from the DSBM. Suppose $\gamma_n$ is large enough so that
$\frac{K}{\gamma_n^2} \max\{Td, \log^2 n/(Td)\} = o(1)$.
For any $\epsilon, c > 0$, there is a constant $C = C(\epsilon, c) > 0$ such that if $\hat{Z}$ is the estimate of $Z$ as described in Algorithm 1, and if $f_i$, $i \in [K]$, is the fraction of nodes belonging to community $C_i$ which are misclassified in $\hat{Z}$, then
$\sum_{i=1}^{K} f_i \le C \frac{K}{\gamma_n^2} \max\{Td, \log^2 n/(Td)\}$
with probability at least $1 - n^{-c}$.

37 Feature and Models of Networks: Theory
First Method: Consistency of $\hat{Z}$
Corollary. In the special case of the Theorem when
(i) the minimum eigenvalue of $\frac{n}{d} B^{(t)}$ is positive and uniformly bounded away from zero for all $t \in [T]$, and
(ii) the community sizes are balanced, i.e. $n_{\max}/n_{\min} = O(1)$,
consistency holds for $\hat{Z}$ if either
  $Td \ge \log(n)$ and $K = o(Td)$, or
  $(\log(n))^{2/3} \ll Td < \log(n)$ and $K = o((Td)^3/(\log(n))^2)$.

39 Feature and Models of Networks: Theory
Algorithm 2: In Detail
In the second method, we sum the squares of the adjacency matrices to obtain $A_0^{[2]}$, and then subtract its diagonal to obtain $A^{[2]}$:
$A_0^{[2]} := \sum_{t=1}^{T} \left(A^{(t)}\right)^2$, $A^{[2]} := A_0^{[2]} - \mathrm{diag}\left(A_0^{[2]}\right)$.
We obtain the leading $K$ eigenvectors of $A^{[2]}$, corresponding to its largest eigenvalues. Suppose $\breve{U} \in \mathbb{R}^{n \times K}$ contains those eigenvectors as columns.
Then we use a $(1 + \epsilon)$-approximate $K$-means clustering algorithm to obtain $\breve{Z} \in \mathcal{M}_{n,K}$ and $\breve{\Theta} \in \mathbb{R}^{K \times K}$ such that
$\|\breve{Z}\breve{\Theta} - \breve{U}\|_F^2 \le (1 + \epsilon) \min_{Z \in \mathcal{M}_{n,K},\, \Theta \in \mathbb{R}^{K \times K}} \|Z\Theta - \breve{U}\|_F^2$.
$\breve{Z}$ is the estimate of $Z = (z_1, \ldots, z_n)$ from this method.
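
The construction of the operator used by the second method and of its leading eigenvectors can be sketched as follows. The $K$-means step on the rows of the resulting matrix is omitted here, and the function names `squared_sum_operator` and `leading_eigenvectors` and the toy input are hypothetical.

```python
import numpy as np

def squared_sum_operator(A_list):
    """Sum of squared adjacency matrices with the diagonal removed."""
    S = sum(A @ A for A in A_list)          # sum over t of (A^(t))^2
    return S - np.diag(np.diag(S))          # subtract the diagonal

def leading_eigenvectors(S, K):
    """Columns are eigenvectors of the K largest eigenvalues of the symmetric matrix S."""
    vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    return vecs[:, -K:]

# Toy input: two disjoint cliques of size 4, repeated over T = 3 time steps.
block = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
A_t = np.block([[block, np.zeros((4, 4), dtype=int)],
                [np.zeros((4, 4), dtype=int), block]])
S = squared_sum_operator([A_t, A_t, A_t])
U_breve = leading_eigenvectors(S, K=2)
```

In this toy input, two nodes of the same clique have two common neighbors at each time step, so the within-clique off-diagonal entries of `S` equal 6 and the between-clique entries equal 0.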

40 Feature and Models of Networks: Theory
Consistency of $\breve{Z}$
In order to prove consistency of $\breve{Z}$, we need some notation and observations. Let
$B^{[2]} := \sum_{t=1}^{T} \left(B^{(t)}\right)^2$,
$P^{[2]} := \sum_{t=1}^{T} \left(P^{(t)}\right)^2 = Z \left[\sum_{t=1}^{T} B^{(t)} (Z^T Z) B^{(t)}\right] Z^T$, (8)
where the second equality uses $P^{(t)} = Z B^{(t)} Z^T$, and $Z^T Z$ is the diagonal matrix of community sizes.
The main assumption about the connection probabilities that we need is:
At least one $B^{(t)}$, $t \in [T]$, must be nonsingular. (9)

41 Feature and Models of Networks: Theory
More Notation and Conditions for Consistency of $\breve{Z}$
$A$ is generated from the DSBM with $n$ nodes and $K$ communities and the parameters $(\pi, \{B^{(t)}\}_{t=1}^{T})$.
Let $\gamma_n$ be the smallest non-zero singular value of $P^{[2]}$.
Let $d := n \max_{k,l \in [K],\, t \in [T]} B^{(t)}_{k,l}$ be the maximum expected degree of a node at any time.

42 Feature and Models of Networks: Theory
Second Method: Consistency of $\breve{Z}$
Theorem. Let $A$ be generated from the DSBM satisfying assumption (9). Suppose $\gamma_n$ is large enough so that
$\frac{K \left(Td^3 (1 \circ T^{-1} d^{-1} \log n) + \log^{10} n\right)}{\gamma_n^2} = o(1)$.
For any $\epsilon, c > 0$, there is a constant $C = C(\epsilon, c) > 0$ such that if $\breve{Z}$ is the estimate of $Z$ as described in Algorithm 2, and if $f_i$, $i \in [K]$, is the fraction of nodes belonging to community $C_i$ which are misclassified in $\breve{Z}$, then
$\sum_{i=1}^{K} f_i \le CK \, \frac{Td^3 (1 \circ T^{-1} d^{-1} \log n) + (Td^2 \log^2(n) \circ \log^{10}(n)) \circ (Td^2 \circ \log^{12}(n))}{\gamma_n^2}$
with probability at least $1 - 4n^{-c}$.
(Each $\circ$ stands for a binary operator that is missing from the source.)

43 Feature and Models of Networks: Theory
Second Method: Consistency of $\breve{Z}$
Corollary. In the special case of the Theorem when
(i) the number of nonsingular matrices among $\{\frac{n}{d} B^{(t)} : t \in [T]\}$ (whose singular values are bounded away from 0 uniformly) grows faster than $\max\{d^{-2} \log^5 n, \sqrt{T/d}\}$, and
(ii) the community sizes are balanced, i.e. $n_{\max}/n_{\min} = O(1)$,
consistency holds for $\breve{Z}$.

44 Feature and Models of Networks: Theory
Algorithm 3: Spherical Spectral Clustering
In the third method:
1 Obtain $A^{[2]}$, the sum of the squared adjacency matrices $\sum_{t=1}^{T} \left(A^{(t)}\right)^2$ with its diagonal removed.
2 Obtain $\breve{U} \in \mathbb{R}^{n \times K}$, consisting of the leading $K$ eigenvectors of $A^{[2]}$ corresponding to its largest absolute eigenvalues.
3 Let $n_+$ be the number of nonzero rows of $\breve{U}$. Obtain $\breve{U}^+ \in \mathbb{R}^{n_+ \times K}$, consisting of the normalized nonzero rows of $\breve{U}$, i.e. $\breve{U}^+_{i,*} = \breve{U}_{i,*} / \|\breve{U}_{i,*}\|_2$ for $i$ such that $\|\breve{U}_{i,*}\|_2 > 0$.
4 Use a $(1 + \epsilon)$-approximate $K$-median clustering algorithm on the row vectors of $\breve{U}^+$ to obtain $\breve{Z}^+ \in \mathcal{M}_{n_+,K}$ and $\breve{X} \in \mathbb{R}^{K \times K}$.
5 Extend $\breve{Z}^+$ to obtain $\breve{Z}$ by (arbitrarily) adding $n - n_+$ canonical unit row vectors at the end, such as $\breve{Z}_i = (1, 0, \ldots, 0)$ for $i$ such that $\|\breve{U}_{i,*}\|_2 = 0$.
$\breve{Z}$ is the estimate of $Z$.
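
The steps of the third method can be sketched as below, under several assumptions that are not in the slides: the $(1 + \epsilon)$-approximate $K$-median step is replaced by a simple alternating heuristic (nearest-center assignment, coordinate-wise median update, farthest-point initialization), rows with norm below a tolerance `tol` are treated as zero, and zero rows keep their original position and receive the first cluster label instead of being appended at the end. The function names and the toy input are hypothetical.

```python
import numpy as np

def k_medians_labels(X, K, n_iter=50):
    """Heuristic k-medians (stand-in for the (1+eps)-approximate K-median step)."""
    centers = [X[0]]
    for _ in range(1, K):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])                 # farthest-point initialization
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = np.median(X[labels == k], axis=0)
    return labels

def spherical_spectral_clustering(A_list, K, tol=1e-8):
    S = sum(A @ A for A in A_list)
    S = S - np.diag(np.diag(S))                         # squared sum without its diagonal
    vals, vecs = np.linalg.eigh(S)
    top = np.argsort(-np.abs(vals))[:K]                 # K largest absolute eigenvalues
    U = vecs[:, top]
    norms = np.linalg.norm(U, axis=1)
    nonzero = norms > tol                               # assumed numerical notion of "nonzero row"
    U_plus = U[nonzero] / norms[nonzero, None]          # project nonzero rows onto the unit sphere
    labels = np.zeros(S.shape[0], dtype=int)            # zero rows get the first label, (1, 0, ..., 0)
    labels[nonzero] = k_medians_labels(U_plus, K)
    return labels

# Toy input: two disjoint cliques of size 4 plus one isolated node, over T = 3 time steps.
block = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)
A_t = np.zeros((9, 9), dtype=int)
A_t[:4, :4] = block
A_t[4:8, 4:8] = block
labels = spherical_spectral_clustering([A_t, A_t, A_t], K=2)
```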

46 Results: Simulation
Simulation Results: DSBM
Figure: (a) Sparse network, $\lambda_n = 3$. (b) Dense network, $\lambda_n = 8$.

47 Results: Simulation
Simulation Results: DSBM
Figure: (a) Dense network, $\lambda_n = 10$, with $B$ nearly singular.

48 Results: Simulation
Simulation Results: DDCBM
Figure: Dense networks. (a) $B$ nearly singular. (b) $B$ non-singular.

50 Results: Real Networks
Neuroscience ECoG Example
Figure: Clustering of the network correctly identifies the lip region (upper right-hand part of the vSMC) involved in the production of /b/, which engages the lips. (a) Location of electrode clusters based on BolBO-based graph estimation. (b) Organization of articulator representations in the vSMC (black: larynx; red: lips; blue: tongue; green: jaw). (c) Estimated graph of electrodes.

52 Conclusion
Summary and Future Work
Summary:
  We consider two methods of spectral clustering for the dynamic SBM.
  We give theoretical justification for each method.
Work in progress:
  Extension to more general dynamic SBMs.
  Extension of dynamic models.

53 Conclusion
Future Problems in Networks
Methodological:
  Detection of dynamic communities.
  Detection of communities in the presence of covariates.
  Comparison of networks and communities for multiple networks.
Theoretical:
  Conditions for community recovery for general $K$ and connectivity matrix.
  Conditions for community recovery for dynamic networks.
  Conditions for community recovery for networks with covariate information.


The social sciences have investigated the structure of small A nonparametric view of network models and Newman Girvan and other modularities Peter J. Bickel a,1 and Aiyou Chen b a University of California, Berkeley, CA 9472; and b Alcatel-Lucent Bell Labs, Murray

More information

Benchmarking recovery theorems for the DC-SBM

Benchmarking recovery theorems for the DC-SBM Benchmarking recovery theorems for the DC-SBM Yali Wan Department of Statistics University of Washington Seattle, WA 98195-4322, USA yaliwan@washington.edu Marina Meilă Department of Statistics University

More information

Benchmarking recovery theorems for the DC-SBM

Benchmarking recovery theorems for the DC-SBM Benchmarking recovery theorems for the DC-SBM Yali Wan Department of Statistics University of Washington Seattle, WA 98195-4322, USA Marina Meila Department of Statistics University of Washington Seattle,

More information

Reconstruction in the Sparse Labeled Stochastic Block Model

Reconstruction in the Sparse Labeled Stochastic Block Model Reconstruction in the Sparse Labeled Stochastic Block Model Marc Lelarge 1 Laurent Massoulié 2 Jiaming Xu 3 1 INRIA-ENS 2 INRIA-Microsoft Research Joint Centre 3 University of Illinois, Urbana-Champaign

More information

Impact of regularization on Spectral Clustering

Impact of regularization on Spectral Clustering Impact of regularization on Spectral Clustering Antony Joseph and Bin Yu December 5, 2013 Abstract The performance of spectral clustering is considerably improved via regularization, as demonstrated empirically

More information

Estimating network edge probabilities by neighbourhood smoothing

Estimating network edge probabilities by neighbourhood smoothing Biometrika (27), 4, 4,pp. 77 783 doi:.93/biomet/asx42 Printed in Great Britain Advance Access publication 5 September 27 Estimating network edge probabilities by neighbourhood smoothing BY YUAN ZHANG Department

More information

Foundations of Adjacency Spectral Embedding. Daniel L. Sussman

Foundations of Adjacency Spectral Embedding. Daniel L. Sussman Foundations of Adjacency Spectral Embedding by Daniel L. Sussman A dissertation submitted to The Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy. Baltimore,

More information

Bayesian nonparametric models of sparse and exchangeable random graphs

Bayesian nonparametric models of sparse and exchangeable random graphs Bayesian nonparametric models of sparse and exchangeable random graphs F. Caron & E. Fox Technical Report Discussion led by Esther Salazar Duke University May 16, 2014 (Reading group) May 16, 2014 1 /

More information

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay A Statistical Look at Spectral Graph Analysis Deep Mukhopadhyay Department of Statistics, Temple University Office: Speakman 335 deep@temple.edu http://sites.temple.edu/deepstat/ Graph Signal Processing

More information

Modeling of Growing Networks with Directional Attachment and Communities

Modeling of Growing Networks with Directional Attachment and Communities Modeling of Growing Networks with Directional Attachment and Communities Masahiro KIMURA, Kazumi SAITO, Naonori UEDA NTT Communication Science Laboratories 2-4 Hikaridai, Seika-cho, Kyoto 619-0237, Japan

More information

Spectral thresholds in the bipartite stochastic block model

Spectral thresholds in the bipartite stochastic block model Spectral thresholds in the bipartite stochastic block model Laura Florescu and Will Perkins NYU and U of Birmingham September 27, 2016 Laura Florescu and Will Perkins Spectral thresholds in the bipartite

More information

SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN

SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN SPARSE RANDOM GRAPHS: REGULARIZATION AND CONCENTRATION OF THE LAPLACIAN CAN M. LE, ELIZAVETA LEVINA, AND ROMAN VERSHYNIN Abstract. We study random graphs with possibly different edge probabilities in the

More information

A limit theorem for scaled eigenvectors of random dot product graphs

A limit theorem for scaled eigenvectors of random dot product graphs Sankhya A manuscript No. (will be inserted by the editor A limit theorem for scaled eigenvectors of random dot product graphs A. Athreya V. Lyzinski C. E. Priebe D. L. Sussman M. Tang D.J. Marchette the

More information

arxiv: v3 [math.pr] 18 Aug 2017

arxiv: v3 [math.pr] 18 Aug 2017 Sparse Exchangeable Graphs and Their Limits via Graphon Processes arxiv:1601.07134v3 [math.pr] 18 Aug 2017 Christian Borgs Microsoft Research One Memorial Drive Cambridge, MA 02142, USA Jennifer T. Chayes

More information

Lecture 2: Exchangeable networks and the Aldous-Hoover representation theorem

Lecture 2: Exchangeable networks and the Aldous-Hoover representation theorem Lecture 2: Exchangeable networks and the Aldous-Hoover representation theorem Contents 36-781: Advanced Statistical Network Models Mini-semester II, Fall 2016 Instructor: Cosma Shalizi Scribe: Momin M.

More information

1 Matrix notation and preliminaries from spectral graph theory

1 Matrix notation and preliminaries from spectral graph theory Graph clustering (or community detection or graph partitioning) is one of the most studied problems in network analysis. One reason for this is that there are a variety of ways to define a cluster or community.

More information

CLUSTERING over graphs is a classical problem with

CLUSTERING over graphs is a classical problem with Maximum Likelihood Latent Space Embedding of Logistic Random Dot Product Graphs Luke O Connor, Muriel Médard and Soheil Feizi ariv:5.85v3 [stat.ml] 3 Aug 27 Abstract A latent space model for a family of

More information

A Dimensionality Reduction Framework for Detection of Multiscale Structure in Heterogeneous Networks

A Dimensionality Reduction Framework for Detection of Multiscale Structure in Heterogeneous Networks Shen HW, Cheng XQ, Wang YZ et al. A dimensionality reduction framework for detection of multiscale structure in heterogeneous networks. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 27(2): 341 357 Mar. 2012.

More information

COMPSCI 514: Algorithms for Data Science

COMPSCI 514: Algorithms for Data Science COMPSCI 514: Algorithms for Data Science Arya Mazumdar University of Massachusetts at Amherst Fall 2018 Lecture 8 Spectral Clustering Spectral clustering Curse of dimensionality Dimensionality Reduction

More information

Clustering using Mixture Models

Clustering using Mixture Models Clustering using Mixture Models The full posterior of the Gaussian Mixture Model is p(x, Z, µ,, ) =p(x Z, µ, )p(z )p( )p(µ, ) data likelihood (Gaussian) correspondence prob. (Multinomial) mixture prior

More information

A central limit theorem for an omnibus embedding of random dot product graphs

A central limit theorem for an omnibus embedding of random dot product graphs A central limit theorem for an omnibus embedding of random dot product graphs Keith Levin 1 with Avanti Athreya 2, Minh Tang 2, Vince Lyzinski 3 and Carey E. Priebe 2 1 University of Michigan, 2 Johns

More information

Modeling heterogeneity in random graphs

Modeling heterogeneity in random graphs Modeling heterogeneity in random graphs Catherine MATIAS CNRS, Laboratoire Statistique & Génome, Évry (Soon: Laboratoire de Probabilités et Modèles Aléatoires, Paris) http://stat.genopole.cnrs.fr/ cmatias

More information

1 Matrix notation and preliminaries from spectral graph theory

1 Matrix notation and preliminaries from spectral graph theory Graph clustering (or community detection or graph partitioning) is one of the most studied problems in network analysis. One reason for this is that there are a variety of ways to define a cluster or community.

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/30/17 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2

More information

Faloutsos, Tong ICDE, 2009

Faloutsos, Tong ICDE, 2009 Large Graph Mining: Patterns, Tools and Case Studies Christos Faloutsos Hanghang Tong CMU Copyright: Faloutsos, Tong (29) 2-1 Outline Part 1: Patterns Part 2: Matrix and Tensor Tools Part 3: Proximity

More information

Problem Set 4. General Instructions

Problem Set 4. General Instructions CS224W: Analysis of Networks Fall 2017 Problem Set 4 General Instructions Due 11:59pm PDT November 30, 2017 These questions require thought, but do not require long answers. Please be as concise as possible.

More information

Mixed Membership Stochastic Blockmodels

Mixed Membership Stochastic Blockmodels Mixed Membership Stochastic Blockmodels (2008) Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg and Eric P. Xing Herrissa Lamothe Princeton University Herrissa Lamothe (Princeton University) Mixed

More information

Machine Learning for Data Science (CS4786) Lecture 11

Machine Learning for Data Science (CS4786) Lecture 11 Machine Learning for Data Science (CS4786) Lecture 11 Spectral clustering Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016sp/ ANNOUNCEMENT 1 Assignment P1 the Diagnostic assignment 1 will

More information

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find

More information

Discovering molecular pathways from protein interaction and ge

Discovering molecular pathways from protein interaction and ge Discovering molecular pathways from protein interaction and gene expression data 9-4-2008 Aim To have a mechanism for inferring pathways from gene expression and protein interaction data. Motivation Why

More information

Mini course on Complex Networks

Mini course on Complex Networks Mini course on Complex Networks Massimo Ostilli 1 1 UFSC, Florianopolis, Brazil September 2017 Dep. de Fisica Organization of The Mini Course Day 1: Basic Topology of Equilibrium Networks Day 2: Percolation

More information

Data science with multilayer networks: Mathematical foundations and applications

Data science with multilayer networks: Mathematical foundations and applications Data science with multilayer networks: Mathematical foundations and applications CDSE Days University at Buffalo, State University of New York Monday April 9, 2018 Dane Taylor Assistant Professor of Mathematics

More information

Community detection in stochastic block models via spectral methods

Community detection in stochastic block models via spectral methods Community detection in stochastic block models via spectral methods Laurent Massoulié (MSR-Inria Joint Centre, Inria) based on joint works with: Dan Tomozei (EPFL), Marc Lelarge (Inria), Jiaming Xu (UIUC),

More information

6.207/14.15: Networks Lecture 12: Generalized Random Graphs

6.207/14.15: Networks Lecture 12: Generalized Random Graphs 6.207/14.15: Networks Lecture 12: Generalized Random Graphs 1 Outline Small-world model Growing random networks Power-law degree distributions: Rich-Get-Richer effects Models: Uniform attachment model

More information

Statistical Model for Soical Network

Statistical Model for Soical Network Statistical Model for Soical Network Tom A.B. Snijders University of Washington May 29, 2014 Outline 1 Cross-sectional network 2 Dynamic s Outline Cross-sectional network 1 Cross-sectional network 2 Dynamic

More information

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016

A Random Dot Product Model for Weighted Networks arxiv: v1 [stat.ap] 8 Nov 2016 A Random Dot Product Model for Weighted Networks arxiv:1611.02530v1 [stat.ap] 8 Nov 2016 Daryl R. DeFord 1 Daniel N. Rockmore 1,2,3 1 Department of Mathematics, Dartmouth College, Hanover, NH, USA 03755

More information

Markov Chains and Spectral Clustering

Markov Chains and Spectral Clustering Markov Chains and Spectral Clustering Ning Liu 1,2 and William J. Stewart 1,3 1 Department of Computer Science North Carolina State University, Raleigh, NC 27695-8206, USA. 2 nliu@ncsu.edu, 3 billy@ncsu.edu

More information

Adventures in random graphs: Models, structures and algorithms

Adventures in random graphs: Models, structures and algorithms BCAM January 2011 1 Adventures in random graphs: Models, structures and algorithms Armand M. Makowski ECE & ISR/HyNet University of Maryland at College Park armand@isr.umd.edu BCAM January 2011 2 Complex

More information

Communities, Spectral Clustering, and Random Walks

Communities, Spectral Clustering, and Random Walks Communities, Spectral Clustering, and Random Walks David Bindel Department of Computer Science Cornell University 26 Sep 2011 20 21 19 16 22 28 17 18 29 26 27 30 23 1 25 5 8 24 2 4 14 3 9 13 15 11 10 12

More information

Algebraic Representation of Networks

Algebraic Representation of Networks Algebraic Representation of Networks 0 1 2 1 1 0 0 1 2 0 0 1 1 1 1 1 Hiroki Sayama sayama@binghamton.edu Describing networks with matrices (1) Adjacency matrix A matrix with rows and columns labeled by

More information

arxiv: v1 [stat.me] 12 May 2017

arxiv: v1 [stat.me] 12 May 2017 Consistency of adjacency spectral embedding for the mixed membership stochastic blockmodel Patrick Rubin-Delanchy *, Carey E. Priebe **, and Minh Tang ** * University of Oxford and Heilbronn Institute

More information

STA 414/2104: Machine Learning

STA 414/2104: Machine Learning STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far

More information

Stat 315c: Introduction

Stat 315c: Introduction Stat 315c: Introduction Art B. Owen Stanford Statistics Art B. Owen (Stanford Statistics) Stat 315c: Introduction 1 / 14 Stat 315c Analysis of Transposable Data Usual Statistics Setup there s Y (we ll

More information

ORIE 4741: Learning with Big Messy Data. Spectral Graph Theory

ORIE 4741: Learning with Big Messy Data. Spectral Graph Theory ORIE 4741: Learning with Big Messy Data Spectral Graph Theory Mika Sumida Operations Research and Information Engineering Cornell September 15, 2017 1 / 32 Outline Graph Theory Spectral Graph Theory Laplacian

More information

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH

HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi

More information

Empirical Bayes estimation for the stochastic blockmodel

Empirical Bayes estimation for the stochastic blockmodel Electronic Journal of Statistics Vol. 10 (2016) 761 782 ISSN: 1935-7524 DOI: 10.1214/16-EJS1115 Empirical Bayes estimation for the stochastic blockmodel Shakira Suwan,DominicS.Lee Department of Mathematics

More information

arxiv: v1 [math.st] 14 Nov 2018

arxiv: v1 [math.st] 14 Nov 2018 Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing arxiv:1811.06055v1 [math.st] 14 Nov 018 Chao Gao 1 and Zongming Ma 1 University of Chicago University of

More information

Hypothesis testing for automated community detection in networks

Hypothesis testing for automated community detection in networks J. R. Statist. Soc. B (216) 78, Part 1, pp. 253 273 Hypothesis testing for automated community detection in networks Peter J. Bickel University of California at Berkeley, USA and Purnamrita Sarkar University

More information

6-1. Canonical Correlation Analysis

6-1. Canonical Correlation Analysis 6-1. Canonical Correlation Analysis Canonical Correlatin analysis focuses on the correlation between a linear combination of the variable in one set and a linear combination of the variables in another

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

Modularity in several random graph models

Modularity in several random graph models Modularity in several random graph models Liudmila Ostroumova Prokhorenkova 1,3 Advanced Combinatorics and Network Applications Lab Moscow Institute of Physics and Technology Moscow, Russia Pawe l Pra

More information

Spectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics

Spectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics Spectral Graph Theory and You: and Centrality Metrics Jonathan Gootenberg March 11, 2013 1 / 19 Outline of Topics 1 Motivation Basics of Spectral Graph Theory Understanding the characteristic polynomial

More information

10708 Graphical Models: Homework 2

10708 Graphical Models: Homework 2 10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves

More information

Stochastic blockmodels with a growing number of classes

Stochastic blockmodels with a growing number of classes Biometrika (2012), 99,2,pp. 273 284 doi: 10.1093/biomet/asr053 C 2012 Biometrika Trust Advance Access publication 17 April 2012 Printed in Great Britain Stochastic blockmodels with a growing number of

More information

IV. Analyse de réseaux biologiques

IV. Analyse de réseaux biologiques IV. Analyse de réseaux biologiques Catherine Matias CNRS - Laboratoire de Probabilités et Modèles Aléatoires, Paris catherine.matias@math.cnrs.fr http://cmatias.perso.math.cnrs.fr/ ENSAE - 2014/2015 Sommaire

More information

CS168: The Modern Algorithmic Toolbox Lectures #11 and #12: Spectral Graph Theory

CS168: The Modern Algorithmic Toolbox Lectures #11 and #12: Spectral Graph Theory CS168: The Modern Algorithmic Toolbox Lectures #11 and #12: Spectral Graph Theory Tim Roughgarden & Gregory Valiant May 2, 2016 Spectral graph theory is the powerful and beautiful theory that arises from

More information

ECS 289 F / MAE 298, Lecture 15 May 20, Diffusion, Cascades and Influence

ECS 289 F / MAE 298, Lecture 15 May 20, Diffusion, Cascades and Influence ECS 289 F / MAE 298, Lecture 15 May 20, 2014 Diffusion, Cascades and Influence Diffusion and cascades in networks (Nodes in one of two states) Viruses (human and computer) contact processes epidemic thresholds

More information

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018 ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space

More information

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Johannes Heiny University of Aarhus Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia),

More information

CS 664 Segmentation (2) Daniel Huttenlocher

CS 664 Segmentation (2) Daniel Huttenlocher CS 664 Segmentation (2) Daniel Huttenlocher Recap Last time covered perceptual organization more broadly, focused in on pixel-wise segmentation Covered local graph-based methods such as MST and Felzenszwalb-Huttenlocher

More information

Spectral Methods for Subgraph Detection

Spectral Methods for Subgraph Detection Spectral Methods for Subgraph Detection Nadya T. Bliss & Benjamin A. Miller Embedded and High Performance Computing Patrick J. Wolfe Statistics and Information Laboratory Harvard University 12 July 2010

More information

Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs

Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs Lawrence Livermore National Laboratory Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs Keith Henderson and Tina Eliassi-Rad keith@llnl.gov and eliassi@llnl.gov This work was performed

More information

Dissertation Defense

Dissertation Defense Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation

More information

MODELING HETEROGENEITY IN RANDOM GRAPHS THROUGH LATENT SPACE MODELS: A SELECTIVE REVIEW

MODELING HETEROGENEITY IN RANDOM GRAPHS THROUGH LATENT SPACE MODELS: A SELECTIVE REVIEW ESAIM: PROCEEDINGS AND SURVEYS, December 2014, Vol. 47, p. 55-74 F. Abergel, M. Aiguier, D. Challet, P.-H. Cournède, G. Faÿ, P. Lafitte, Editors MODELING HETEROGENEITY IN RANDOM GRAPHS THROUGH LATENT SPACE

More information

Network Analysis and Modeling

Network Analysis and Modeling lecture 0: what are networks and how do we talk about them? 2017 Aaron Clauset 003 052 002 001 Aaron Clauset @aaronclauset Assistant Professor of Computer Science University of Colorado Boulder External

More information

COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY

COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY COMMUNITY DETECTION IN SPARSE NETWORKS VIA GROTHENDIECK S INEQUALITY OLIVIER GUÉDON AND ROMAN VERSHYNIN Abstract. We present a simple and flexible method to prove consistency of semidefinite optimization

More information

PROBABILISTIC LATENT SEMANTIC ANALYSIS

PROBABILISTIC LATENT SEMANTIC ANALYSIS PROBABILISTIC LATENT SEMANTIC ANALYSIS Lingjia Deng Revised from slides of Shuguang Wang Outline Review of previous notes PCA/SVD HITS Latent Semantic Analysis Probabilistic Latent Semantic Analysis Applications

More information

High-dimensional covariance estimation based on Gaussian graphical models

High-dimensional covariance estimation based on Gaussian graphical models High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,

More information

Massive-scale estimation of exponential-family random graph models with local dependence

Massive-scale estimation of exponential-family random graph models with local dependence Massive-scale estimation of exponential-family random graph models with local dependence Sergii Babkin Michael Schweinberger arxiv:1703.09301v1 [stat.co] 27 Mar 2017 Abstract A flexible approach to modeling

More information

Deciphering and modeling heterogeneity in interaction networks

Deciphering and modeling heterogeneity in interaction networks Deciphering and modeling heterogeneity in interaction networks (using variational approximations) S. Robin INRA / AgroParisTech Mathematical Modeling of Complex Systems December 2013, Ecole Centrale de

More information

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University) The authors explain how the NCut algorithm for graph bisection

More information

Networks and Their Spectra

Networks and Their Spectra Networks and Their Spectra Victor Amelkin University of California, Santa Barbara Department of Computer Science victor@cs.ucsb.edu December 4, 2017 1 / 18 Introduction Networks (= graphs) are everywhere.

More information

Network Topology Inference from Non-stationary Graph Signals

Network Topology Inference from Non-stationary Graph Signals Network Topology Inference from Non-stationary Graph Signals Rasoul Shafipour Dept. of Electrical and Computer Engineering University of Rochester rshafipo@ece.rochester.edu http://www.ece.rochester.edu/~rshafipo/

More information