Topics in Network Models. Peter Bickel
|
|
- Felicia Goodwin
- 5 years ago
- Views:
Transcription
1 Topics in Network Models MSU, Sepember, 2012 Peter Bickel Statistics Dept. UC Berkeley (Joint work with S. Bhattacharyya UC Berkeley, A. Chen Google, D. Choi UC Berkeley, E. Levina U. Mich and P. Sarkar UC Berkeley)
2 Outline 1 Networks: Examples, References 2 Some statistical questions 3 A nonparametric model for infinite networks. 4 Asymptotic theory for λ n log n, λ n = Average degree. 5 Results of B. and Chen (2009) on block models. 6 Consequences for maximum likelihood and variational likelihood. 7 Count statistics, bootstrap and λ n = O(1).
3 I.Identifying Networks and II.Working With Given Ones I Given vectors of measurements X i, for example, given gene expression sequence nearby binding site information, physical (epigenetic information), protein assays etc. Determine dependency/causal relation between genes. Tools: Gaussian graphical models, clustering etc. II Given a network of relations (edges) identify higher level structures, clusters, like pathways in genomics. In practice, both can be done simultaneously. Focus on the models for II, in the simplest case of unlabelled graphs.
4 Example: Social Network Figure: Facebook Network for Caltech with 769 nodes and average degree 43.
5 Example: Bio Network Figure: Transcription network of E. Coli with 423 nodes and average degree 2.45.
6 Example: Collaboration Network Figure: Collaboration Network in Arxiv for High Energy Physics with 8638 nodes and average degree
7 References on Networks Books 1. (Phys/CS) M.E.J. Newman (2010) Networks: An introduction. Oxford 2. (Stat) Eric D. Kolaczyk (2009) Statistical Analysis of Network Data 3. (Soc. Science/CS) David Easley and Jon Kleinberg (2010) Networks, crowds and markets: Reasoning about a highly connected world. Cambridge University Press 4. (Soc. Science) K. Faust and S. Wasserman (1994) Social Network Analysis: Methods and Applications. Cambridge University Press, New York. 5. (Prob/CS) Fan Chung, Linyuan Lu (2004) Complex graphs and networks. CBMS # 107 AMS Papers 1. (Prob/Math) Bela Bollobas, Svante Janson, Oliver Riordan (2007) The Phase Transition in Random Graphs. Random Structures and Algorithms, 31 (1) (Prob/Math) Oliviera RI (2010): Concentration of Adjacency Matrix and Graph Laplacian for Graphs with Independent Edges. Arxiv (Stat) Celisse, Daudin and Pierre (2011): Consistency of ML in Block Models. Arxiv 4. (Stat) Daudin, Picard and Robin (2008): A Mixture Model for Random Graphs. J. Stat. Comp. 5. (Stat) Chaudhuri, K., Chung F., Tsiatis A. (2012). Spectral Clustering of Graphs with General Degrees in the Extended Planted Partition Model. JMLR. 6. (Stat) Liben-Nowell, D., and Kleinberg, J., (2007). Link Prediction Problem for Social Networks. Journal of the American Society for Information Science and Technology. Our works 1. B. and A. Chen (2009) A nonparametric view of network models and Newman-Girvan and other modularities, PNAS 2. B., E. Levina and A. Chen (2011) Method of Moments Estimation For Networks, Ann. Stat. 3. B. and D. Choi (2012), Arxiv.
8 Questions Focus (i) Community identification for block models (simple and degree-corrected). Other (ii) Link Prediction: Predicting edges between nodes based on partially observed graph. (Liben-Nowell and Kleinberg (2007)), Zhao and Levina (2012). (iii) Model selection: Number of blocks. (iv) Testing stochastic identity of graphs (networks) using count statistics (motifs) (Wasserman and Faust (1994)). (v) Error bars on descriptive statistics, eg. homophily. (vi) Model incorporating edge identification errors. (vii) Directed graphs: graphs with additional edge and vertex information.
9 Erdős-Rényi and Block Models (Holland, Laskey and Leinhardt 1983) Graph on n vertices is equivalent to A n n adjacency matrix. Probability on S n = {all n n symmetric 0-1 matrices}. (For undirected graphs) E-R: A ij i.i.d. Ber(p). H-L-L: K communities c a 1[vertex i a] c M (n, (π 1,..., π K )) Given c, A ij are independent and P(A ij = 1 c i = a, c j = b) = w ab π a π b. where, a,b w ab = 1 and w ab = w ba.
10 Nonparametric Asymptotic Model for Unlabeled Graphs Given: P on graphs Aldous/Hoover (1983) L(A ij : i, j 1} = L(A πi,π j : i, j 1), for all permutations π g : [0, 1] 4 {0, 1} such that A ij = g(α, ξ i, ξ j, η ij ), where α, ξ i, η ij, all i, j i, i.i.d. U(0, 1), g(α, u, v, w) = g(α, v, u, w), η ij = η ji.
11 Ergodic Models L is an ergodic probability iff for g with g(u, v, w) = g(v, u, w) (u, v, w), A ij = g(ξ i, ξ j, η ij ). L is determined by h(u, v) P(A ij = 1 ξ i = u, ξ j = v), h(u, v) = h(v, u). Notes: Equivalent: Hoff, Raftery, Handcock (2002). JASA More general: Bollobas, Riordan, Janson (2007). Random Structures and Algorithms
12 Parametrization of NP Model h is not uniquely defined. h Ä ϕ(u), ϕ(v) ä, where ϕ is measure-preserving, gives same model. Property CAN holds If there exists h CAN such that is τ(z) 1 0 h CAN (z, u)du = P[A ij = 1 ξ i = z] (a) monotonically non-decreasing and (b) If F ( ) is CDF of τ(ξ), w (F τ(ξ 1 ), F τ(ξ 2 )) w(ξ 1, ξ 2 ). If τ is strictly increasing, such an h CAN always exists. ξ i could be replaced by any continuous variables or vectors - but there is no natural unique representation.
13 Asymptotic Approximation h n (u, v) = ρ n w n (u, v) ρ n = P[Edge] w(u, v)dudv = P [ξ 1 [u, u + du], ξ 2 [v, v + dv] Edge] w n (u, v) = min { w(u, v), ρ 1 } n Average Degree = E(D+) n λ n ρ n (n 1). λ n ranges from O(1) to O(n).
14 Examples of models (I) Block models: h CAN (u, v) F ab w ab π aπ b on I a I b, I a = π a, a = 1,..., K. Degree-Corrected: (Karrer and Newman (2010)) P(A ij = 1 c i = a, c j = b, θ i, θ j) = θ iθ jf ab. θ i is independent of c, P[θ i = λ j] = ρ J j, j=1 ρ j1. Parametric sub model of M = KJ block model. Not to be considered (II) Preferential Attachment: (De Solla-Price (1965)) (Asymptotic version, w not bounded) w(u, v) = τ(u) 1 1(u v) + τ(v) u τ(s)ds 1 1(v u) v τ(s)ds 1 τ(u) = h(u, v)dv 0 Dynamically defined, τ(u) (1 u) 1/2 is equivalent to power law of degree distribution F τ 1.
15 Two Questions (I) (a) Estimate π, λ, ρ, F. (b) Classify vertex i by its community. (II) Given i, j construct ŵ(ξ i, ξ j ) (ŵ is an estimate of w) to predict whether A ij = 1.
16 3 Regimes (a) If λ n log n, equivalent to, P[there exists an isolated point] 0. (b) If λ n, full identifiability of w. (c) If λ n = O(1), phase boundaries and partial identifiability of w.
17 Block Models: Community Identification and Maximize Modularities Newman-Girvan modularity (Phys. Rev. E, 2004) e = (e 1,, e n ): e i {1,, K} (community labels) The modularity function: Q N (e) = Å ( ) K O kk (e,a) k=1 D + Dk (e) 2 D + ã, where O ab (e, A) = i,j A ij 1(e i = a, e j = b) = (# of edges between a and b) a b = 2 (# of edges between members of a), a = b D k (e) = K l=1 O kl (e, A) = sum of degrees of nodes in k D + = K k=1 D k (e) = 2 (# of edges between all nodes )
18 Profile Likelihood Given e estimate parameters and plug into block model likelihood. Always consistent for classification and efficient estimation.
19 Global Consistency Theorem 1 If population version of F is consistent and λn log n, then lim sup n λ 1 n log P [ĉ c] s Q, with s Q > 0. Extension to F n F requires simple condition. See also Snijders and Nowicki (1997) J. of Classification.
20 Basic Approach to Proof General Modularity: Data and population Given Q n : K K positive matrices K simplex R +. ( ) Q n (e, A) = F O(e,A) n µ n, D+ µ n, f (e). O(e, A) o ab (e), f(e) (f 1 (e),..., f K (e)) T, f j (e) n j n. ĉ arg max Q n (e, A). µ n = E(D + ) = (n 1)λ n. NG: F n F. Population: Replace random vectors by expectations under block model
21 Corollary Under the given conditions if ˆπ a = 1 n ni=1 1(ĉ i = a) ˆna n, Ŵ = O(ĉ,A) D +, then if S = 1 W 1, where, = Diag(π), n(ˆπ π) N (0, π D ππ T ), nλn (Ŝ S) N (0, Σ(π, W )). These are efficient.
22 Maximum Likelihood, Variational Likelihood for Block Models (B. and Choi (2012) Arxiv, Celisse et. al. (2011) Arxiv) Complete data likelihood for block data f (z, A, θ) = n Ä ä Aij Ä ä 1 Aij π zi ρszi z j 1 ρszi z j i=1 i j P[z i = a] = π a and z 1,..., z n i.i.d. Graph likelihood ratio Å ã g f (A, θ) = E 0 (z, A, θ) A g 0 f 0 where, f 0 f (z, A, θ 0 ), g 0 g(a, θ 0 ) and θ (ρ, π, S) T.
23 Variational Likelihood (Daudin et. al. (2009)) Define, n q(z, τ) τ i (z i ) i=1 f VAR (z, A, θ) = q(z, τ(a, θ), θ)g(a, θ) where, D(f 1 f 2 ) = Ä log f 1 f2 ä f1 dµ. Notation: τ(a, θ) = arg min D (q(z, τ) f (z A, θ)) τ ˆθ: Complete Data likelihood estimator, ˆθ: Graph Likelihood estimator, ˆθ VAR : Variational Likelihood estimator.
24 Theorem Theorem (B. and Choi) If λ n log n, there exists E such that sup θ P θ (Ē A) P 0 0 such that (a) g g 0 (A, θ)1 E (z, A) = f f 0 (z, A)(1 + o P (1))1 E (z, A) (b) The same holds for f VAR f 0,VAR (c) f VAR can be used for consistent classification.
25 Consequence of Theorem Hence, n(ˆπ ˆπ) = o P (1). n(ˆπ ˆπ VAR ) = o P (1). nλ n (Ŝ Ŝ) = o P (1). nλ n (Ŝ Ŝ VAR ) = o P (1). nλ n ( ˆρ ρ 1) = O P(1). ˆρ 1 ni=1 n D i, D i = j A ij.
26 Concentration Results and Consequences in Regimes (a) and (b) Theorem (B., Levina and Chen (2012)), (Channarond, Daudin, Robin (2012)) (i) If λ n (ii) If λ n log n D i D τ(ξ i) = O P(λ 1/2 ) max i D i D τ(ξ i) = o P(1)
27 Statistical Consequences of (ii) (Channarond, Daudin, Robin (2011, Arxiv)) Let us suppose that the block model has the Property CAN with parameters, (π, W, ρ), where, W ab = P[i a, j b ij is an Edge]. Algorithm: Apply k-means to D i D 1 i n. If λ n and Ĉ 1,..., Ĉ k are the resulting clusters and ˆπ j Ĉ j n. (i) ˆπ j P πj, for j = 1,..., k. (ii) D λ n P 1 (iii) Ŵ ab 2{# of edges between Ĉ a and Ĉ b } nλ n P Wab The algorithm does not perform well compared to spectral methods.
28 Other methods which work well for block models under regime (a) 1. Spectral Clustering: Rohe et. al.(2011), McSherry (1993), Dasgupta, Hopcroft and McSherry (2004), Chaudhuri, Chung and Tsiatis (2012) develop methods which classify perfectly for λ n log n (CCT (2012). 2. Pseudo-likelihood (with a good starting point). (Chen, Arash Amini, Levina and B (2012)) 3. Methods based on empirical distribution of geodesic distances. (Bhattacharyya and B (2012))
29 Count statistics Another approach works broadly even for λ n = O(1) regime. Count statistics are normalized subgraph counts and smooth functions of them. The subgraph count,t (R), for subgraph corresponding to edge set R is - T (R) = S subgraph of G 1(S (V R, R))/C(R, n) where, C(R, n) # of all possible subgraphs of type R of G n. Example: (a) Total number of triangles in a graph is an example of count statistic. # (b) Homophily := of s is an example of smooth function of # of s + # of V s count statistics.
30 O(1) λ n O(n) Note that, But, ρ n 0, T (R) P 0. If R = p, E(T (R)) ˆρ p stabilizes.
31 Theorem on Moment Estimate Theorem 1 I. In dense case, if p, R are fixed n (T (R) E(T (R))) N(0, σ 2 (R, P)) II. In general non-dense case conclusion of I. applies to T (R) ˆρ p where R = p and ˆρ D n, if R is acyclic. Else result depends on λ n.
32 The λ n = O(1) Regime Major difficulty: For ρ = λn n, E[# of isolated points] ne λn. We expect isolated points to correspond to special ranges of ξ. Example: For block models, π a small and W ab small for all b. Theorem (Decelle et. al. (2011)), (Mossel, Neeman and Sly (2012)) For particular ranges of (π, W, λ n ) block models can not be distinguished from Erdos-Renyi models. This includes but is not limited to λ n 1.
33 Consequence (Bickel, Chen and Levina (2011)) For all λ n regimes, block models with fixed K if 1, Q1,..., Q k 1 1 are linearly independent with Q ij W ij π i, π is estimable at rate n 1/2. S is estimable at rate (nλ n ) 1/2 D = λ n (1 + O P ((nλ n ) 1/2 ) (all λ n > 0).
34 Identifiability The independence condition implies 1. 1 is not an eigenvector of W. 2. Block models with π 1 = = π K are not identifiable by acyclic count statistics.
35 A Computational Issue Computing T (R) for R complex and σ 2 (T, R) is non-trivial. Possible methods: Estimates counts using (a) Naive Monte Carlo (Sets of m out of n vertices). (b) Weighted sampling of edges (Kashtan et. al. (2004)). (c) Combination of above two methods (Bhattacharyya and B (2012)).
36 Discussion Count statistics computation needs bootstrap. (S. Bhattacharyya (2012)) Simulations and real data applications available but much needed. Block models: In theory λn λn log n and τ satisfying Property CAN are well-understood. same as if the class memberships are known. Fast algorithms being developed in this regime. λ n but possible λ n = O(1) very subtle. λn log n consistent but not n-consistent estimation Many statistical extensions needed: covariates, dynamics etc. log n is If λ n should be able to formulate minimax result for estimation of w(ξ 1, ξ 2 ) or more generally w(z 1, z 2 ) for sparse situations under smoothness conditions on w(, ).
Statistical Inference for Networks. Peter Bickel
Statistical Inference for Networks 4th Lehmann Symposium, Rice University, May 2011 Peter Bickel Statistics Dept. UC Berkeley (Joint work with Aiyou Chen, Google, E. Levina, U. Mich, S. Bhattacharyya,
More informationSpectral Clustering for Dynamic Block Models
Spectral Clustering for Dynamic Block Models Sharmodeep Bhattacharyya Department of Statistics Oregon State University January 23, 2017 Research Computing Seminar, OSU, Corvallis (Joint work with Shirshendu
More informationThe social sciences have investigated the structure of small
A nonparametric view of network models and Newman Girvan and other modularities Peter J. Bickel a,1 and Aiyou Chen b a University of California, Berkeley, CA 9472; and b Alcatel-Lucent Bell Labs, Murray
More informationStatistical and Computational Phase Transitions in Planted Models
Statistical and Computational Phase Transitions in Planted Models Jiaming Xu Joint work with Yudong Chen (UC Berkeley) Acknowledgement: Prof. Bruce Hajek November 4, 203 Cluster/Community structure in
More informationA Modified Method Using the Bethe Hessian Matrix to Estimate the Number of Communities
Journal of Advanced Statistics, Vol. 3, No. 2, June 2018 https://dx.doi.org/10.22606/jas.2018.32001 15 A Modified Method Using the Bethe Hessian Matrix to Estimate the Number of Communities Laala Zeyneb
More informationModeling heterogeneity in random graphs
Modeling heterogeneity in random graphs Catherine MATIAS CNRS, Laboratoire Statistique & Génome, Évry (Soon: Laboratoire de Probabilités et Modèles Aléatoires, Paris) http://stat.genopole.cnrs.fr/ cmatias
More informationThe non-backtracking operator
The non-backtracking operator Florent Krzakala LPS, Ecole Normale Supérieure in collaboration with Paris: L. Zdeborova, A. Saade Rome: A. Decelle Würzburg: J. Reichardt Santa Fe: C. Moore, P. Zhang Berkeley:
More informationarxiv: v1 [stat.ml] 29 Jul 2012
arxiv:1207.6745v1 [stat.ml] 29 Jul 2012 Universally Consistent Latent Position Estimation and Vertex Classification for Random Dot Product Graphs Daniel L. Sussman, Minh Tang, Carey E. Priebe Johns Hopkins
More informationarxiv: v1 [math.st] 26 Jan 2018
CONCENTRATION OF RANDOM GRAPHS AND APPLICATION TO COMMUNITY DETECTION arxiv:1801.08724v1 [math.st] 26 Jan 2018 CAN M. LE, ELIZAVETA LEVINA AND ROMAN VERSHYNIN Abstract. Random matrix theory has played
More informationGraph Detection and Estimation Theory
Introduction Detection Estimation Graph Detection and Estimation Theory (and algorithms, and applications) Patrick J. Wolfe Statistics and Information Sciences Laboratory (SISL) School of Engineering and
More informationDeciphering and modeling heterogeneity in interaction networks
Deciphering and modeling heterogeneity in interaction networks (using variational approximations) S. Robin INRA / AgroParisTech Mathematical Modeling of Complex Systems December 2013, Ecole Centrale de
More informationSubsampling Bootstrap of Count Features of Networks
Subsampling Bootstrap of Count Features of Networks Bhattacharyya, S., & Bickel, P. J. (2015). Subsampling Bootstrap of Count Features of Networks. Annals of Statistics, 43(6), 2384-2411. doi:10.1214/15-aos1338
More informationHigh-dimensional covariance estimation based on Gaussian graphical models
High-dimensional covariance estimation based on Gaussian graphical models Shuheng Zhou Department of Statistics, The University of Michigan, Ann Arbor IMA workshop on High Dimensional Phenomena Sept. 26,
More informationarxiv: v1 [stat.me] 6 Nov 2014
Network Cross-Validation for Determining the Number of Communities in Network Data Kehui Chen 1 and Jing Lei arxiv:1411.1715v1 [stat.me] 6 Nov 014 1 Department of Statistics, University of Pittsburgh Department
More informationarxiv: v3 [math.pr] 18 Aug 2017
Sparse Exchangeable Graphs and Their Limits via Graphon Processes arxiv:1601.07134v3 [math.pr] 18 Aug 2017 Christian Borgs Microsoft Research One Memorial Drive Cambridge, MA 02142, USA Jennifer T. Chayes
More informationAdventures in random graphs: Models, structures and algorithms
BCAM January 2011 1 Adventures in random graphs: Models, structures and algorithms Armand M. Makowski ECE & ISR/HyNet University of Maryland at College Park armand@isr.umd.edu BCAM January 2011 2 Complex
More informationIV. Analyse de réseaux biologiques
IV. Analyse de réseaux biologiques Catherine Matias CNRS - Laboratoire de Probabilités et Modèles Aléatoires, Paris catherine.matias@math.cnrs.fr http://cmatias.perso.math.cnrs.fr/ ENSAE - 2014/2015 Sommaire
More informationBenchmarking recovery theorems for the DC-SBM
Benchmarking recovery theorems for the DC-SBM Yali Wan Department of Statistics University of Washington Seattle, WA 98195-4322, USA yaliwan@washington.edu Marina Meilă Department of Statistics University
More informationBenchmarking recovery theorems for the DC-SBM
Benchmarking recovery theorems for the DC-SBM Yali Wan Department of Statistics University of Washington Seattle, WA 98195-4322, USA Marina Meila Department of Statistics University of Washington Seattle,
More informationConsistency Under Sampling of Exponential Random Graph Models
Consistency Under Sampling of Exponential Random Graph Models Cosma Shalizi and Alessandro Rinaldo Summary by: Elly Kaizar Remember ERGMs (Exponential Random Graph Models) Exponential family models Sufficient
More informationASYMPTOTIC NORMALITY OF MAXIMUM LIKELIHOOD AND ITS VARIATIONAL APPROXIMATION FOR STOCHASTIC BLOCKMODELS 1
The Annals of Statistics 2013, Vol. 41, No. 4, 1922 1943 DOI: 10.1214/13-AOS1124 Institute of Mathematical Statistics, 2013 ASYMPTOTIC NORMALITY OF MAXIMUM LIKELIHOOD AND ITS VARIATIONAL APPROXIMATION
More informationA HYPOTHESIS TESTING FRAMEWORK FOR MODULARITY BASED NETWORK COMMUNITY DETECTION
Statistica Sinica 27 (2017), 000-000 437-456 doi:http://dx.doi.org/10.5705/ss.202015.0040 A HYPOTHESIS TESTING FRAMEWORK FOR MODULARITY BASED NETWORK COMMUNITY DETECTION Jingfei Zhang and Yuguo Chen University
More informationReconstruction in the Generalized Stochastic Block Model
Reconstruction in the Generalized Stochastic Block Model Marc Lelarge 1 Laurent Massoulié 2 Jiaming Xu 3 1 INRIA-ENS 2 INRIA-Microsoft Research Joint Centre 3 University of Illinois, Urbana-Champaign GDR
More informationarxiv: v3 [stat.me] 17 Nov 2015
The Annals of Statistics 2015, Vol. 43, No. 6, 2384 2411 DOI: 10.1214/15-AOS1338 c Institute of Mathematical Statistics, 2015 arxiv:1312.2645v3 [stat.me] 17 Nov 2015 SUBSAMPLING BOOTSTRAP OF COUNT FEATURES
More informationSemiparametric Gaussian Copula Models: Progress and Problems
Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle 2015 IMS China, Kunming July 1-4, 2015 2015 IMS China Meeting, Kunming Based on joint work
More informationNetwork Cross-Validation for Determining the Number of Communities in Network Data
Network Cross-Validation for Determining the Number of Communities in Network Data Kehui Chen and Jing Lei University of Pittsburgh and Carnegie Mellon University August 1, 2016 Abstract The stochastic
More informationPermutation-invariant regularization of large covariance matrices. Liza Levina
Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work
More informationSemiparametric Gaussian Copula Models: Progress and Problems
Semiparametric Gaussian Copula Models: Progress and Problems Jon A. Wellner University of Washington, Seattle European Meeting of Statisticians, Amsterdam July 6-10, 2015 EMS Meeting, Amsterdam Based on
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationStatistical Model for Soical Network
Statistical Model for Soical Network Tom A.B. Snijders University of Washington May 29, 2014 Outline 1 Cross-sectional network 2 Dynamic s Outline Cross-sectional network 1 Cross-sectional network 2 Dynamic
More informationBelief Propagation, Robust Reconstruction and Optimal Recovery of Block Models
JMLR: Workshop and Conference Proceedings vol 35:1 15, 2014 Belief Propagation, Robust Reconstruction and Optimal Recovery of Block Models Elchanan Mossel mossel@stat.berkeley.edu Department of Statistics
More informationLecture 1: Graphs, Adjacency Matrices, Graph Laplacian
Lecture 1: Graphs, Adjacency Matrices, Graph Laplacian Radu Balan January 31, 2017 G = (V, E) An undirected graph G is given by two pieces of information: a set of vertices V and a set of edges E, G =
More informationEstimation of large dimensional sparse covariance matrices
Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)
More informationTheory and Methods for the Analysis of Social Networks
Theory and Methods for the Analysis of Social Networks Alexander Volfovsky Department of Statistical Science, Duke University Lecture 1: January 16, 2018 1 / 35 Outline Jan 11 : Brief intro and Guest lecture
More informationHypothesis testing for automated community detection in networks
J. R. Statist. Soc. B (216) 78, Part 1, pp. 253 273 Hypothesis testing for automated community detection in networks Peter J. Bickel University of California at Berkeley, USA and Purnamrita Sarkar University
More informationConsistency of Modularity Clustering on Random Geometric Graphs
Consistency of Modularity Clustering on Random Geometric Graphs Erik Davis The University of Arizona May 10, 2016 Outline Introduction to Modularity Clustering Pointwise Convergence Convergence of Optimal
More informationConfidence Sets for Network Structure
Confidence Sets for Network Structure Edoardo M. Airoldi 1, David S. Choi 2 and Patrick J. Wolfe 1 1 Department of Statistics, Harvard University 2 School of Engineering and Applied Sciences, Harvard University
More informationCommunity detection in stochastic block models via spectral methods
Community detection in stochastic block models via spectral methods Laurent Massoulié (MSR-Inria Joint Centre, Inria) based on joint works with: Dan Tomozei (EPFL), Marc Lelarge (Inria), Jiaming Xu (UIUC),
More informationLikelihood Inference for Large Scale Stochastic Blockmodels with Covariates based on a Divide-and-Conquer Parallelizable Algorithm with Communication
arxiv:1610.09724v2 [stat.co] 12 Feb 2018 Likelihood Inference for Large Scale Stochastic Blockmodels with Covariates based on a Divide-and-Conquer Parallelizable Algorithm with Communication Sandipan Roy,
More informationInformation Recovery from Pairwise Measurements
Information Recovery from Pairwise Measurements A Shannon-Theoretic Approach Yuxin Chen, Changho Suh, Andrea Goldsmith Stanford University KAIST Page 1 Recovering data from correlation measurements A large
More informationEconometric Analysis of Network Data. Georgetown Center for Economic Practice
Econometric Analysis of Network Data Georgetown Center for Economic Practice May 8th and 9th, 2017 Course Description This course will provide an overview of econometric methods appropriate for the analysis
More informationStatistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
: The High-Dimensional Setting Yudong Chen YUDONGCHEN@EECSBERELEYEDU Department of EECS, University of California, Berkeley, Berkeley, CA 94704, USA Jiaming Xu JXU18@ILLINOISEDU Department of ECE, University
More informationLearning latent structure in complex networks
Learning latent structure in complex networks Lars Kai Hansen www.imm.dtu.dk/~lkh Current network research issues: Social Media Neuroinformatics Machine learning Joint work with Morten Mørup, Sune Lehmann
More informationCommunity Detection. fundamental limits & efficient algorithms. Laurent Massoulié, Inria
Community Detection fundamental limits & efficient algorithms Laurent Massoulié, Inria Community Detection From graph of node-to-node interactions, identify groups of similar nodes Example: Graph of US
More informationDRAGAN STEVANOVIĆ. Key words. Modularity matrix; Community structure; Largest eigenvalue; Complete multipartite
A NOTE ON GRAPHS WHOSE LARGEST EIGENVALUE OF THE MODULARITY MATRIX EQUALS ZERO SNJEŽANA MAJSTOROVIĆ AND DRAGAN STEVANOVIĆ Abstract. Informally, a community within a graph is a subgraph whose vertices are
More informationMODELING HETEROGENEITY IN RANDOM GRAPHS THROUGH LATENT SPACE MODELS: A SELECTIVE REVIEW
ESAIM: PROCEEDINGS AND SURVEYS, December 2014, Vol. 47, p. 55-74 F. Abergel, M. Aiguier, D. Challet, P.-H. Cournède, G. Faÿ, P. Lafitte, Editors MODELING HETEROGENEITY IN RANDOM GRAPHS THROUGH LATENT SPACE
More informationMassive-scale estimation of exponential-family random graph models with local dependence
Massive-scale estimation of exponential-family random graph models with local dependence Sergii Babkin Michael Schweinberger arxiv:1703.09301v1 [stat.co] 27 Mar 2017 Abstract A flexible approach to modeling
More informationModularity in several random graph models
Modularity in several random graph models Liudmila Ostroumova Prokhorenkova 1,3 Advanced Combinatorics and Network Applications Lab Moscow Institute of Physics and Technology Moscow, Russia Pawe l Pra
More informationSpectra of adjacency matrices of random geometric graphs
Spectra of adjacency matrices of random geometric graphs Paul Blackwell, Mark Edmondson-Jones and Jonathan Jordan University of Sheffield 22nd December 2006 Abstract We investigate the spectral properties
More informationDissertation Defense
Clustering Algorithms for Random and Pseudo-random Structures Dissertation Defense Pradipta Mitra 1 1 Department of Computer Science Yale University April 23, 2008 Mitra (Yale University) Dissertation
More informationStochastic blockmodels with a growing number of classes
Biometrika (2012), 99,2,pp. 273 284 doi: 10.1093/biomet/asr053 C 2012 Biometrika Trust Advance Access publication 17 April 2012 Printed in Great Britain Stochastic blockmodels with a growing number of
More informationA limit theorem for scaled eigenvectors of random dot product graphs
Sankhya A manuscript No. (will be inserted by the editor A limit theorem for scaled eigenvectors of random dot product graphs A. Athreya V. Lyzinski C. E. Priebe D. L. Sussman M. Tang D.J. Marchette the
More informationUncovering structure in biological networks: A model-based approach
Uncovering structure in biological networks: A model-based approach J-J Daudin, F. Picard, S. Robin, M. Mariadassou UMR INA-PG / ENGREF / INRA, Paris Mathématique et Informatique Appliquées Statistics
More informationLecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian
Lecture 1: From Data to Graphs, Weighted Graphs and Graph Laplacian Radu Balan February 5, 2018 Datasets diversity: Social Networks: Set of individuals ( agents, actors ) interacting with each other (e.g.,
More informationRandom Geometric Graphs
Random Geometric Graphs Mathew D. Penrose University of Bath, UK Networks: stochastic models for populations and epidemics ICMS, Edinburgh September 2011 1 MODELS of RANDOM GRAPHS Erdos-Renyi G(n, p):
More informationGoodness of fit of logistic models for random graphs: a variational Bayes approach
Goodness of fit of logistic models for random graphs: a variational Bayes approach S. Robin Joint work with P. Latouche and S. Ouadah INRA / AgroParisTech MIAT Seminar, Toulouse, November 2016 S. Robin
More informationStatistical analysis of biological networks.
Statistical analysis of biological networks. Assessing the exceptionality of network motifs S. Schbath Jouy-en-Josas/Evry/Paris, France http://genome.jouy.inra.fr/ssb/ Colloquium interactions math/info,
More informationComplex Networks CSYS/MATH 303, Spring, Prof. Peter Dodds
Complex Networks CSYS/MATH 303, Spring, 2011 Prof. Peter Dodds Department of Mathematics & Statistics Center for Complex Systems Vermont Advanced Computing Center University of Vermont Licensed under the
More informationThe Trouble with Community Detection
The Trouble with Community Detection Aaron Clauset Santa Fe Institute 7 April 2010 Nonlinear Dynamics of Networks Workshop U. Maryland, College Park Thanks to National Science Foundation REU Program James
More informationVariational inference for the Stochastic Block-Model
Variational inference for the Stochastic Block-Model S. Robin AgroParisTech / INRA Workshop on Random Graphs, April 2011, Lille S. Robin (AgroParisTech / INRA) Variational inference for SBM Random Graphs,
More informationTwo-sample hypothesis testing for random dot product graphs
Two-sample hypothesis testing for random dot product graphs Minh Tang Department of Applied Mathematics and Statistics Johns Hopkins University JSM 2014 Joint work with Avanti Athreya, Vince Lyzinski,
More informationFinding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October
Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find
More informationCommunity Detection by L 0 -penalized Graph Laplacian
Community Detection by L 0 -penalized Graph Laplacian arxiv:1706.10273v1 [stat.me] 30 Jun 2017 Chong Chen School of Mathematical Science, Peking University and Ruibin Xi School of Mathematical Science,
More informationShortest Paths in Random Intersection Graphs
Shortest Paths in Random Intersection Graphs Gesine Reinert Department of Statistics University of Oxford reinert@stats.ox.ac.uk September 14 th, 2011 1 / 29 Outline Bipartite graphs and random intersection
More informationRandom Effects Models for Network Data
Random Effects Models for Network Data Peter D. Hoff 1 Working Paper no. 28 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 January 14, 2003 1 Department of
More informationEstimating network edge probabilities by neighbourhood smoothing
Biometrika (27), 4, 4,pp. 77 783 doi:.93/biomet/asx42 Printed in Great Britain Advance Access publication 5 September 27 Estimating network edge probabilities by neighbourhood smoothing BY YUAN ZHANG Department
More informationParameter estimators of sparse random intersection graphs with thinned communities
Parameter estimators of sparse random intersection graphs with thinned communities Lasse Leskelä Aalto University Johan van Leeuwaarden Eindhoven University of Technology Joona Karjalainen Aalto University
More informationBayesian methods for graph clustering
Author manuscript, published in "Advances in data handling and business intelligence (2009) 229-239" DOI : 10.1007/978-3-642-01044-6 Bayesian methods for graph clustering P. Latouche, E. Birmelé, and C.
More informationNonparametric Bayesian Matrix Factorization for Assortative Networks
Nonparametric Bayesian Matrix Factorization for Assortative Networks Mingyuan Zhou IROM Department, McCombs School of Business Department of Statistics and Data Sciences The University of Texas at Austin
More informationRegularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008
Regularized Estimation of High Dimensional Covariance Matrices Peter Bickel Cambridge January, 2008 With Thanks to E. Levina (Joint collaboration, slides) I. M. Johnstone (Slides) Choongsoon Bae (Slides)
More informationA Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay
A Statistical Look at Spectral Graph Analysis Deep Mukhopadhyay Department of Statistics, Temple University Office: Speakman 335 deep@temple.edu http://sites.temple.edu/deepstat/ Graph Signal Processing
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationMatrix estimation by Universal Singular Value Thresholding
Matrix estimation by Universal Singular Value Thresholding Courant Institute, NYU Let us begin with an example: Suppose that we have an undirected random graph G on n vertices. Model: There is a real symmetric
More informationELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018
ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space
More informationChaos, Complexity, and Inference (36-462)
Chaos, Complexity, and Inference (36-462) Lecture 21: More Networks: Models and Origin Myths Cosma Shalizi 31 March 2009 New Assignment: Implement Butterfly Mode in R Real Agenda: Models of Networks, with
More informationInt. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS014) p.4149
Int. Statistical Inst.: Proc. 58th orld Statistical Congress, 011, Dublin (Session CPS014) p.4149 Invariant heory for Hypothesis esting on Graphs Priebe, Carey Johns Hopkins University, Applied Mathematics
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationSusceptible-Infective-Removed Epidemics and Erdős-Rényi random
Susceptible-Infective-Removed Epidemics and Erdős-Rényi random graphs MSR-Inria Joint Centre October 13, 2015 SIR epidemics: the Reed-Frost model Individuals i [n] when infected, attempt to infect all
More informationEstimating network degree distributions from sampled networks: An inverse problem
Estimating network degree distributions from sampled networks: An inverse problem Eric D. Kolaczyk Dept of Mathematics and Statistics, Boston University kolaczyk@bu.edu Introduction: Networks and Degree
More informationQuilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs
Quilting Stochastic Kronecker Graphs to Generate Multiplicative Attribute Graphs Hyokun Yun (work with S.V.N. Vishwanathan) Department of Statistics Purdue Machine Learning Seminar November 9, 2011 Overview
More informationErdős-Rényi random graph
Erdős-Rényi random graph introduction to network analysis (ina) Lovro Šubelj University of Ljubljana spring 2016/17 graph models graph model is ensemble of random graphs algorithm for random graphs of
More informationHodgeRank on Random Graphs
Outline Motivation Application Discussions School of Mathematical Sciences Peking University July 18, 2011 Outline Motivation Application Discussions 1 Motivation Crowdsourcing Ranking on Internet 2 HodgeRank
More information6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search
6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search Daron Acemoglu and Asu Ozdaglar MIT September 30, 2009 1 Networks: Lecture 7 Outline Navigation (or decentralized search)
More informationCommunity Detection on Euclidean Random Graphs
Community Detection on Euclidean Random Graphs Abishek Sankararaman, François Baccelli July 3, 207 Abstract Motivated by applications in online social networks, we introduce and study the problem of Community
More informationComputational Lower Bounds for Community Detection on Random Graphs
Computational Lower Bounds for Community Detection on Random Graphs Bruce Hajek, Yihong Wu, Jiaming Xu Department of Electrical and Computer Engineering Coordinated Science Laboratory University of Illinois
More informationGroups of vertices and Core-periphery structure. By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA
Groups of vertices and Core-periphery structure By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA Mostly observed real networks have: Why? Heavy tail (powerlaw most
More informationCSC 412 (Lecture 4): Undirected Graphical Models
CSC 412 (Lecture 4): Undirected Graphical Models Raquel Urtasun University of Toronto Feb 2, 2016 R Urtasun (UofT) CSC 412 Feb 2, 2016 1 / 37 Today Undirected Graphical Models: Semantics of the graph:
More informationMultislice community detection
Multislice community detection P. J. Mucha, T. Richardson, K. Macon, M. A. Porter, J.-P. Onnela Jukka-Pekka JP Onnela Harvard University NetSci2010, MIT; May 13, 2010 Outline (1) Background (2) Multislice
More informationConsistency Thresholds for the Planted Bisection Model
Consistency Thresholds for the Planted Bisection Model [Extended Abstract] Elchanan Mossel Joe Neeman University of Pennsylvania and University of Texas, Austin University of California, Mathematics Dept,
More informationChaos, Complexity, and Inference (36-462)
Chaos, Complexity, and Inference (36-462) Lecture 21 Cosma Shalizi 3 April 2008 Models of Networks, with Origin Myths Erdős-Rényi Encore Erdős-Rényi with Node Types Watts-Strogatz Small World Graphs Exponential-Family
More informationBayesian nonparametric models of sparse and exchangeable random graphs
Bayesian nonparametric models of sparse and exchangeable random graphs F. Caron & E. Fox Technical Report Discussion led by Esther Salazar Duke University May 16, 2014 (Reading group) May 16, 2014 1 /
More informationNetwork Representation Using Graph Root Distributions
Network Representation Using Graph Root Distributions Jing Lei Department of Statistics and Data Science Carnegie Mellon University 2018.04 Network Data Network data record interactions (edges) between
More informationELA A NOTE ON GRAPHS WHOSE LARGEST EIGENVALUE OF THE MODULARITY MATRIX EQUALS ZERO
A NOTE ON GRAPHS WHOSE LARGEST EIGENVALUE OF THE MODULARITY MATRIX EQUALS ZERO SNJEŽANA MAJSTOROVIĆ AND DRAGAN STEVANOVIĆ Abstract. Informally, a community within a graph is a subgraph whose vertices are
More informationLink Prediction. Eman Badr Mohammed Saquib Akmal Khan
Link Prediction Eman Badr Mohammed Saquib Akmal Khan 11-06-2013 Link Prediction Which pair of nodes should be connected? Applications Facebook friend suggestion Recommendation systems Monitoring and controlling
More informationSampling and Estimation in Network Graphs
Sampling and Estimation in Network Graphs Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ March
More informationSpectral Methods for Subgraph Detection
Spectral Methods for Subgraph Detection Nadya T. Bliss & Benjamin A. Miller Embedded and High Performance Computing Patrick J. Wolfe Statistics and Information Laboratory Harvard University 12 July 2010
More informationBayesian non parametric inference of discrete valued networks
Bayesian non parametric inference of discrete valued networks Laetitia Nouedoui, Pierre Latouche To cite this version: Laetitia Nouedoui, Pierre Latouche. Bayesian non parametric inference of discrete
More informationSpectra of Adjacency and Laplacian Matrices
Spectra of Adjacency and Laplacian Matrices Definition: University of Alicante (Spain) Matrix Computing (subject 3168 Degree in Maths) 30 hours (theory)) + 15 hours (practical assignment) Contents 1. Spectra
More informationMixed Membership Stochastic Blockmodels
Mixed Membership Stochastic Blockmodels (2008) Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg and Eric P. Xing Herrissa Lamothe Princeton University Herrissa Lamothe (Princeton University) Mixed
More informationNetworks, Random Graphs and Statistics. 4 6 May 2016
Networks, Random Graphs and Statistics 4 6 May 2016 Union Theological Seminary (Thursday & Friday) Northwest Corner Building (Wednesday, Room 501) Department of Statistics School of Social Work Building
More information