A central limit theorem for an omnibus embedding of random dot product graphs


1 A central limit theorem for an omnibus embedding of random dot product graphs
Keith Levin (1), with Avanti Athreya (2), Minh Tang (2), Vince Lyzinski (3) and Carey E. Priebe (2)
(1) University of Michigan, (2) Johns Hopkins University, (3) University of Massachusetts Amherst
November 18, 2017

2 Classical two-sample hypothesis testing
Well-studied in statistics (indeed, the only thing we teach undergrads?)
K. Levin (U.Michigan,JHU,UMass) A CLT for omnibus embeddings November 18, 2017

3 Graph Hypothesis Testing
Q: how to tell if two (or more) graphs are from the same distribution?

4 Random Dot Product Graph (RDPG; Young and Scheinerman, 2007)
Extends the stochastic block model (SBM).
Vertices are assigned latent positions drawn i.i.d. from a $d$-dimensional distribution $F$.
$F$ is constrained so that $0 \le x^T y \le 1$ whenever $x, y \in \operatorname{supp} F$.
Denote the $i$-th latent position by $X_i \in \mathbb{R}^d$.
Edges $\{i, j\}$ are present or absent independently, with probability $X_i^T X_j$.
Collect the latent positions in the rows of $X \in \mathbb{R}^{n \times d}$.
Warning (non-identifiability): the model is specified only up to orthogonal rotation of the latent positions.
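A minimal sketch of sampling from this model, assuming NumPy and a latent-position matrix already drawn from some $F$. The function name `sample_rdpg`, the uniform choice of $F$, and the exclusion of self-loops are illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Sample an undirected graph with P(edge {i, j}) = X_i^T X_j."""
    P = X @ X.T
    if P.min() < 0 or P.max() > 1:
        raise ValueError("latent positions must satisfy 0 <= x^T y <= 1")
    n = X.shape[0]
    # Draw each edge {i, j} with i < j independently, then symmetrize.
    # Assumption: no self-loops, so the diagonal stays zero.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
# Hypothetical F: uniform on [0.2, 0.7]^2, so every inner product lies in [0, 1].
X = rng.uniform(0.2, 0.7, size=(200, 2))
A = sample_rdpg(X, rng)
```

Replacing `X` by `X @ Q` for any orthogonal `Q` leaves `X @ X.T`, and hence the distribution of `A`, unchanged, which is the non-identifiability noted on the slide.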


6 Estimating latent positions: adjacency spectral embedding (Sussman et al, 2012)
Definition (Adjacency Spectral Embedding (ASE)): Given an adjacency matrix $A$ with eigendecomposition $A = U S U^T$, embed the vertices of $A$ into $\mathbb{R}^d$ as the rows of
\[ \hat{X} = U_d S_d^{1/2} \in \mathbb{R}^{n \times d}, \]
where $U_d \in \mathbb{R}^{n \times d}$ denotes the first $d$ columns of $U$ and $S_d \in \mathbb{R}^{d \times d}$ denotes the truncation of $S$ to the top $d$ eigenvalues.
Under the RDPG, there exists $W$ such that
\[ \max_{1 \le i \le n} \| \hat{X}_i - W X_i \| = O_P(n^{-1/2} \log n). \]
Lyzinski, et al (2014): ASE yields a.a.s. perfect recovery of block memberships in the SBM.
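The definition above can be sketched in a few lines of NumPy. The function name `ase`, the choice of the algebraically largest eigenvalues as the "top $d$", and the clipping of negative eigenvalues before the square root are assumptions made for this illustration.

```python
import numpy as np

def ase(A, d):
    """Adjacency spectral embedding: rows of U_d S_d^{1/2}."""
    evals, evecs = np.linalg.eigh(A)        # symmetric input, ascending eigenvalues
    top = np.argsort(evals)[::-1][:d]       # indices of the d largest eigenvalues
    # Assumption: clip at zero so the square root is defined on small or sparse graphs.
    S_d = np.clip(evals[top], 0.0, None)
    return evecs[:, top] * np.sqrt(S_d)     # scale column k by sqrt(eigenvalue k)

# Sanity check on a noiseless input: embedding P = X X^T itself.
rng = np.random.default_rng(1)
X = rng.uniform(0.2, 0.7, size=(50, 2))
P = X @ X.T
X_hat = ase(P, d=2)
```

On the noiseless matrix `P`, `X_hat @ X_hat.T` reproduces `P`, so `X_hat` agrees with `X` up to an orthogonal transformation; on a sampled adjacency matrix the agreement is only approximate.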

7 RDPG: what do we mean by same distribution?

8 RDPG: what do we mean by same distribution?
Option 1: Test if the latent positions are drawn from the same distribution.
$G_1$ positions are drawn i.i.d. from $F_1$, $G_2$ positions are drawn i.i.d. from $F_2$.
Test if $F_1 = F_2$.
Nonparametric testing (Tang, Athreya, Sussman, Lyzinski and Priebe, 2017): estimate the latent positions of $G_1$ and $G_2$ via ASE, then apply maximum mean discrepancy (Gretton et al, 2012) to the ASE estimates.
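As a rough illustration of the last step, here is a sketch of the biased squared-MMD estimate between two point clouds. The Gaussian kernel, the bandwidth value, and the function names are assumptions for this example; the sketch also ignores the rotation ambiguity between two separately computed embeddings, which a real test has to handle.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth**2))

def mmd2_biased(X, Y, bandwidth=0.5):
    """Biased estimate of squared MMD: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    return (gaussian_kernel(X, X, bandwidth).mean()
            + gaussian_kernel(Y, Y, bandwidth).mean()
            - 2 * gaussian_kernel(X, Y, bandwidth).mean())

rng = np.random.default_rng(2)
X_hat = rng.uniform(0.0, 1.0, size=(100, 2))   # stand-in for ASE estimates of G_1
same = mmd2_biased(X_hat, X_hat)               # identical samples
shifted = mmd2_biased(X_hat, X_hat + 1.0)      # clearly different distribution
```

The statistic is zero for identical samples and grows as the two point clouds separate.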

9 RDPG: what do we mean by same distribution?
Option 2: Test if the latent positions are the same.
$G_1$ has latent positions $X \in \mathbb{R}^{n \times d}$, $G_2$ has latent positions $Y \in \mathbb{R}^{n \times d}$.
Test if $X = YW$ for some unitary $W$.
Semiparametric testing (Tang, Athreya, Sussman, Lyzinski and Priebe, 2015): embed both graphs via ASE, then align the estimated positions via Procrustes analysis (Gower, 1975).
Reject $H_0$ if the alignment is poor, i.e., if
\[ T_{\mathrm{Proc}} = \min_{W \in \mathcal{U}_d} \| \hat{X} - \hat{Y} W \|_F \]
is large, where the minimum is over $d \times d$ unitary matrices.
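The minimization in $T_{\mathrm{Proc}}$ has a closed-form solution through a singular value decomposition. The sketch below assumes real-valued embeddings (so unitary means orthogonal); the function name `procrustes_statistic` is hypothetical.

```python
import numpy as np

def procrustes_statistic(X_hat, Y_hat):
    """Return min over orthogonal W of ||X_hat - Y_hat W||_F, and the minimizer W."""
    # Minimizing the Frobenius norm is equivalent to maximizing trace(W^T Y^T X).
    # With Y^T X = U S V^T, the maximizer is W = U V^T.
    U, _, Vt = np.linalg.svd(Y_hat.T @ X_hat)
    W = U @ Vt
    return np.linalg.norm(X_hat - Y_hat @ W, ord="fro"), W

# Check: if Y_hat is an exact rotation of X_hat, the statistic is zero.
rng = np.random.default_rng(3)
X_hat = rng.uniform(0.2, 0.7, size=(50, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y_hat = X_hat @ R
stat, W = procrustes_statistic(X_hat, Y_hat)
```

Here the recovered `W` is `R.T`, which undoes the rotation. With embeddings of two independently sampled graphs the statistic is positive even under $H_0$, so a critical value is still needed.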

10 Challenges in semiparametric graph testing
Problem 1: Procrustes alignment introduces variance. More variance $\Rightarrow$ less power.
Problem 2: How to generalize to multiple-graph hypothesis testing? Ultimately, we want something like ANOVA for graphs.
Goal: develop a technique that (1) avoids Procrustes alignment and (2) generalizes naturally to 3 or more graphs.

11 Omnibus matrix: motivation
Definition (Omnibus matrix): Let graphs $G_1$ and $G_2$ be $d$-dimensional RDPGs with adjacency matrices $A^{(1)}$ and $A^{(2)}$. We construct an omnibus matrix for the graphs as
\[
M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}.
\]
Note: generalizes naturally to $m$ graphs, with $(i, j)$-block $(A^{(i)} + A^{(j)})/2$.
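A small sketch of the $m$-graph construction described in the note, assuming all adjacency matrices are NumPy arrays on the same $n$ vertices. The function name `omnibus_matrix` is hypothetical.

```python
import numpy as np

def omnibus_matrix(adjs):
    """Build the mn-by-mn omnibus matrix whose (i, j) block is (A_i + A_j) / 2."""
    m, n = len(adjs), adjs[0].shape[0]
    M = np.empty((m * n, m * n))
    for i, A_i in enumerate(adjs):
        for j, A_j in enumerate(adjs):
            # Diagonal blocks reduce to A_i, since (A_i + A_i) / 2 = A_i.
            M[i * n:(i + 1) * n, j * n:(j + 1) * n] = (A_i + A_j) / 2
    return M

A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.zeros((2, 2))
M = omnibus_matrix([A1, A2])
```

For two graphs this reproduces the $2n \times 2n$ matrix in the definition, and `M` is symmetric whenever every input is.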

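12 Omnibus embedding
Reminder:
\[
M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}.
\]
Under $H_0$, we have
\[ \mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = X X^T = P = U_P S_P U_P^T, \]
with $S_P \in \mathbb{R}^{d \times d}$ diagonal and $U_P \in \mathbb{R}^{n \times d}$ having orthonormal columns. Hence
\[
\mathbb{E} M = \tilde{P} = \begin{bmatrix} P & P \\ P & P \end{bmatrix}
= \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P \begin{bmatrix} U_P^T & U_P^T \end{bmatrix}
= \begin{bmatrix} X \\ X \end{bmatrix} \begin{bmatrix} X^T & X^T \end{bmatrix}.
\]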

13 Omnibus embedding
Under $H_0$, $\mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = X X^T = P$, so
\[
\mathbb{E} M = \tilde{P} = \begin{bmatrix} P & P \\ P & P \end{bmatrix}
= \begin{bmatrix} X \\ X \end{bmatrix} \begin{bmatrix} X^T & X^T \end{bmatrix}.
\]
Key point: Applying ASE to $M$, we get a $2n$-by-$d$ matrix
\[ \hat{Z} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix}. \]
$\hat{X}, \hat{Y} \in \mathbb{R}^{n \times d}$ provide estimates of the latent positions of $G_1$ and $G_2$ in the same $d$-dimensional space, without an additional alignment step.
A natural test statistic is given by $T_{\mathrm{Omni}} = \| \hat{X} - \hat{Y} \|_F$.
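The following sketch puts these steps together for two graphs: build $M$, embed it, split the rows, and compute $T_{\mathrm{Omni}}$. All function names, the two-block latent positions, and the alternative used for comparison are illustrative assumptions, as are the exclusion of self-loops and the clipping of negative eigenvalues.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Undirected graph with P(edge {i, j}) = X_i^T X_j (assumption: no self-loops)."""
    n = X.shape[0]
    upper = np.triu(rng.random((n, n)) < X @ X.T, k=1)
    return (upper | upper.T).astype(float)

def omni_embed(A1, A2, d):
    """ASE of the 2n-by-2n omnibus matrix, split into estimates for G_1 and G_2."""
    n = A1.shape[0]
    avg = (A1 + A2) / 2
    M = np.block([[A1, avg], [avg, A2]])
    evals, evecs = np.linalg.eigh(M)
    top = np.argsort(evals)[::-1][:d]
    Z_hat = evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))
    return Z_hat[:n], Z_hat[n:]

def t_omni(A1, A2, d):
    """Omnibus test statistic ||X_hat - Y_hat||_F, with no alignment step."""
    X_hat, Y_hat = omni_embed(A1, A2, d)
    return np.linalg.norm(X_hat - Y_hat, ord="fro")

rng = np.random.default_rng(0)
n, d = 200, 2
# Hypothetical latent positions: a balanced two-block model for X, a single point for Y.
X = np.tile([[0.8, 0.2], [0.2, 0.8]], (n // 2, 1))
Y = np.full((n, d), 0.5)

t_null = t_omni(sample_rdpg(X, rng), sample_rdpg(X, rng), d)  # same latent positions
t_alt = t_omni(sample_rdpg(X, rng), sample_rdpg(Y, rng), d)   # different latent positions
```

In this toy setup the statistic is noticeably larger when the two graphs have different latent positions. Turning that gap into a test still requires a critical value, which the sketch does not attempt to choose.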

14 Main results: Notational preliminaries
In what follows, we assume the null hypothesis, so $G_1$ and $G_2$ have shared latent positions $X \in \mathbb{R}^{n \times d}$ and
\[ \mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = P = U_P S_P U_P^T = X X^T \in \mathbb{R}^{n \times n}. \]
We denote the true latent positions of $M$ by
\[ Z = \begin{bmatrix} X \\ X \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P^{1/2} \in \mathbb{R}^{2n \times d} \]
and their estimates by
\[ \hat{Z} = U_M S_M^{1/2} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix} \in \mathbb{R}^{2n \times d}, \]
where $S_M \in \mathbb{R}^{d \times d}$ is the diagonal matrix of the top $d$ eigenvalues of $M$, with the corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{2n \times d}$.

15 Main results: Concentration inequality
Lemma (Uniform concentration of estimates): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, and let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Write $\tilde{P} = \mathbb{E} M$, so that $U_{\tilde{P}} S_{\tilde{P}}^{1/2} \in \mathbb{R}^{mn \times d}$ stacks $m$ copies of $X$. There exists a constant $C > 0$ such that, with high probability, there exists an orthogonal matrix $W \in \mathbb{R}^{d \times d}$ such that
\[
\max_{1 \le h \le mn} \left\| \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W \right)_{h,\cdot} \right\| \le \frac{C m^{1/2} \log mn}{\sqrt{n}}.
\]

16 Main results: CLT
Theorem (CLT: informally): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, whose rows are drawn i.i.d. from a $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Write $\tilde{P} = \mathbb{E} M$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$, so that row $h$ corresponds to vertex $i$ of graph $s$. Then the error between the $h$-th position estimate and the (properly rotated) true $h$-th position is asymptotically a continuous mixture of normals, with mixing determined by $F$: in distribution,
\[
n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \longrightarrow \int \mathcal{N}(0, \Sigma(y)) \, dF(y).
\]

17 Main results: CLT
Theorem (CLT: More formally): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, whose rows are drawn i.i.d. from a $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Write $\tilde{P} = \mathbb{E} M$. Let $\Phi(x, \Sigma)$ denote the cdf of a multivariate Gaussian with mean 0 and covariance matrix $\Sigma$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$. There exists a sequence of $d$-by-$d$ orthogonal matrices $(W_n)_{n=1}^{\infty}$ such that for all $x \in \mathbb{R}^d$,
\[
\lim_{n \to \infty} \Pr\left[ n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \le x \right] = \int \Phi(x, \Sigma(y)) \, dF(y),
\]
where
\[
\Sigma(y) = \frac{(m + 3) \, \Delta^{-1} \tilde{\Sigma}(y) \Delta^{-1}}{4m}, \qquad
\Delta = \mathbb{E}_F \left[ X_1 X_1^T \right], \qquad
\tilde{\Sigma}(y) = \mathbb{E}_F \left[ \left( y^T X_1 - (y^T X_1)^2 \right) X_1 X_1^T \right].
\]
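To make the covariance formula concrete, here is a sketch that evaluates $\Sigma(y)$ with the expectations over $F$ replaced by averages over a sample. The function name `omni_limit_cov` and the point-mass example are assumptions for illustration only.

```python
import numpy as np

def omni_limit_cov(y, X_samples, m):
    """Plug-in value of Sigma(y) = (m + 3) Delta^{-1} Sigma_tilde(y) Delta^{-1} / (4m)."""
    k = len(X_samples)
    Delta = X_samples.T @ X_samples / k               # sample version of E[X_1 X_1^T]
    p = X_samples @ y                                 # y^T X_1 for each sample
    w = p - p**2                                      # Bernoulli variance p(1 - p)
    Sigma_tilde = (X_samples * w[:, None]).T @ X_samples / k
    Delta_inv = np.linalg.inv(Delta)
    return (m + 3) * Delta_inv @ Sigma_tilde @ Delta_inv / (4 * m)

# One-dimensional point mass at 0.5, where every quantity can be checked by hand.
X_samples = np.full((10, 1), 0.5)
y = np.array([0.5])
cov_m1 = omni_limit_cov(y, X_samples, m=1)
cov_m2 = omni_limit_cov(y, X_samples, m=2)
```

The prefactor $(m + 3)/(4m)$ equals 1 at $m = 1$ and decreases toward $1/4$ as $m$ grows, so in this formula the limiting variance of each row shrinks as more graphs are embedded jointly.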

18 Experiments: hypothesis testing
[Figure, panels (a)-(c): empirical power against number of vertices (log scale) for the Omnibus and Procrustes methods.]
Figure: Power of the Procrustes-based (blue) and omnibus-based (green) tests to detect when the two graphs being tested differ in (a) one, (b) five, and (c) ten of their latent positions. Each point is the proportion of 1000 trials for which the given technique correctly rejected the null hypothesis. Error bars denote two standard errors of this empirical mean.

19 Experiments: estimating latent positions
[Figure: mean squared error (log scale) against number of vertices (log scale) for the methods Abar, ASE1, OMNI, OMNIbar and PROCbar.]
Figure: Mean squared error (MSE) in recovery of latent positions (up to rotation) in a 2-graph RDPG model as a function of the number of vertices for different estimation procedures.

20 Future Work
Develop graph analogues of ANOVA and other multiple hypothesis testing procedures.
Improve techniques for choosing the critical value in the omnibus test.
Improve understanding of power under $H_A$.

21 Thanks!
Full paper:
