A central limit theorem for an omnibus embedding of random dot product graphs

1 A central limit theorem for an omnibus embedding of random dot product graphs
Keith Levin (1), with Avanti Athreya (2), Minh Tang (2), Vince Lyzinski (3) and Carey E. Priebe (2)
(1) University of Michigan, (2) Johns Hopkins University, (3) University of Massachusetts Amherst
November 18, 2017

2 Classical two-sample hypothesis testing Well-studied in statistics (indeed, the only thing we teach undergrads?) K. Levin (U.Michigan,JHU,UMass) A CLT for omnibus embeddings November 18, / 20

3 Graph Hypothesis Testing
Q: How can we tell whether two (or more) graphs are from the same distribution?

4 Random Dot Product Graph (RDPG; Young and Scheinerman, 2007)
Extends the stochastic block model (SBM).
Vertices are assigned latent positions drawn i.i.d. from a $d$-dimensional distribution $F$.
$F$ is constrained so that $0 \le x^T y \le 1$ whenever $x, y \in \operatorname{supp} F$.
Denote the $i$-th latent position by $X_i \in \mathbb{R}^d$.
Edges $\{i, j\}$ are present or absent independently with probability $X_i^T X_j$.
Collect the latent positions in the rows of $X \in \mathbb{R}^{n \times d}$.
Warning (non-identifiability): the model is specified only up to orthogonal rotation of the latent positions.
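The model on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the two latent positions, the seed, and the helper name `sample_rdpg` are assumptions chosen so that every inner product lies in [0, 1]. The last lines check the non-identifiability warning numerically: rotating all latent positions by the same orthogonal matrix leaves the edge probabilities unchanged.

```python
import numpy as np

def sample_rdpg(X, rng):
    """Sample a symmetric, hollow adjacency matrix with P[A_ij = 1] = X_i^T X_j."""
    P = X @ X.T
    n = P.shape[0]
    # Draw each edge {i, j}, i < j, independently, then symmetrize.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper + upper.T).astype(float)

rng = np.random.default_rng(0)
# Hypothetical two-block SBM written as an RDPG with d = 2.
positions = np.array([[0.8, 0.1], [0.3, 0.6]])
X = positions[rng.integers(0, 2, size=200)]
A = sample_rdpg(X, rng)

# Non-identifiability: X and XW give the same edge probabilities.
theta = 0.7
W = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
same_probabilities = np.allclose((X @ W) @ (X @ W).T, X @ X.T)
```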

6 Estimating latent positions: adjacency spectral embedding (Sussman et al., 2012)
Definition (Adjacency Spectral Embedding (ASE)): Given an adjacency matrix $A$ with eigendecomposition $A = U S U^T$, embed the vertices of $A$ into $\mathbb{R}^d$ as the rows of $\hat{X} = U_d S_d^{1/2} \in \mathbb{R}^{n \times d}$, where $U_d \in \mathbb{R}^{n \times d}$ denotes the first $d$ columns of $U$ and $S_d$ denotes the truncation of $S$ to the top $d$ eigenvalues.
Under the RDPG, there exists $W$ such that $\max_{1 \le i \le n} \| \hat{X}_i - W X_i \| = O_P(n^{-1/2} \log n)$.
Lyzinski et al. (2014): ASE yields a.a.s. perfect recovery of block memberships in the SBM.
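A minimal NumPy sketch of the ASE definition above, under the assumption that the top $d$ eigenvalues are the algebraically largest ones and are nonnegative (negative values are clipped to zero as a guard). The function name `ase`, the example latent positions, and the seed are illustrative choices, not part of the talk.

```python
import numpy as np

def ase(A, d):
    """Adjacency spectral embedding: rows of U_d S_d^{1/2}."""
    vals, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]      # indices of the d largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

rng = np.random.default_rng(1)
positions = np.array([[0.8, 0.1], [0.3, 0.6]])   # hypothetical latent positions
X = positions[rng.integers(0, 2, size=400)]
P = X @ X.T

# Sample one RDPG adjacency matrix and embed it.
upper = np.triu(rng.random(P.shape) < P, k=1)
A = (upper + upper.T).astype(float)
Xhat = ase(A, d=2)

# Xhat estimates X only up to an orthogonal transform, so compare the
# rotation-invariant quantity Xhat Xhat^T against P.
rel_err = np.linalg.norm(Xhat @ Xhat.T - P) / np.linalg.norm(P)
```

Comparing $\hat{X}\hat{X}^T$ with $P$ sidesteps the unknown $W$ in the slide's bound; comparing rows of $\hat{X}$ and $X$ directly would first require an alignment step.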

8 RDPG: what do we mean by same distribution?
Option 1: Test whether the latent positions are drawn from the same distribution.
$G_1$ positions are drawn i.i.d. from $F_1$, $G_2$ positions are drawn i.i.d. from $F_2$. Test whether $F_1 = F_2$.
Nonparametric testing: Tang, Athreya, Sussman, Lyzinski and Priebe (2017).
Estimate the latent positions of $G_1$ and $G_2$ via ASE, then apply maximum mean discrepancy (Gretton et al., 2012) to the ASE estimates.
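For orientation, here is a sketch of a maximum mean discrepancy estimate between two point clouds. It uses the biased (V-statistic) form with a Gaussian kernel; the kernel, the bandwidth of 0.5, and the synthetic inputs are assumptions made for illustration. In the procedure on this slide the inputs would be ASE estimates, and the handling of their orthogonal non-identifiability is not covered here.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth):
    """Matrix of k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth**2))

def mmd_squared(X, Y, bandwidth=0.5):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (gaussian_kernel(X, X, bandwidth).mean()
            + gaussian_kernel(Y, Y, bandwidth).mean()
            - 2 * gaussian_kernel(X, Y, bandwidth).mean())

rng = np.random.default_rng(3)
sample_a = rng.standard_normal((100, 2))
sample_b = rng.standard_normal((100, 2)) + 3.0   # clearly different distribution

mmd_same = mmd_squared(sample_a, sample_a)
mmd_diff = mmd_squared(sample_a, sample_b)
```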

9 RDPG: what do we mean by same distribution?
Option 2: Test whether the latent positions are the same.
$G_1$ has latent positions $X \in \mathbb{R}^{n \times d}$, $G_2$ has latent positions $Y \in \mathbb{R}^{n \times d}$. Test whether $X = YW$ for some unitary $W$.
Semiparametric testing: Tang, Athreya, Sussman, Lyzinski and Priebe (2015).
Embed both graphs via ASE, then align the estimated positions via Procrustes analysis (Gower, 1975).
Reject $H_0$ if the alignment is poor, i.e., if $T_{\mathrm{Proc}} = \min_{W \in \mathcal{U}_d} \| \hat{X} - \hat{Y} W \|_F$ is large, where the minimum is over $d \times d$ unitary matrices.
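The Procrustes statistic has a closed-form minimizer via the singular value decomposition, which the sketch below uses. The function name and the synthetic inputs are assumptions; in the test on this slide, the inputs would be the two ASE estimates.

```python
import numpy as np

def procrustes_statistic(Xhat, Yhat):
    """Return min over orthogonal W of ||Xhat - Yhat W||_F, and the minimizing W."""
    # If Yhat^T Xhat = U S V^T, the minimizer is W = U V^T.
    U, _, Vt = np.linalg.svd(Yhat.T @ Xhat)
    W = U @ Vt
    return np.linalg.norm(Xhat - Yhat @ W, "fro"), W

rng = np.random.default_rng(2)
X = rng.random((50, 2))
theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = X @ R                                   # same positions, rotated

T_rotated, W = procrustes_statistic(X, Y)   # alignment should be exact
T_perturbed, _ = procrustes_statistic(X, Y + 0.1 * rng.standard_normal(Y.shape))
```

Since $Y = XR$ here, the recovered $W$ should equal $R^T$ and the statistic should vanish up to rounding, while any perturbation of $Y$ leaves a strictly positive residual.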

10 Challenges in semiparametric graph testing
Problem 1: Procrustes alignment introduces variance. More variance means less power.
Problem 2: How to generalize to multiple-graph hypothesis testing? Ultimately, we want something like ANOVA for graphs.
Goal: develop a technique that...
1. Avoids Procrustes alignment
2. Generalizes naturally to 3 or more graphs

11 Omnibus matrix: motivation
Definition (Omnibus matrix): Let graphs $G_1$ and $G_2$ be $d$-dimensional RDPGs with adjacency matrices $A^{(1)}$ and $A^{(2)}$. We construct an omnibus matrix for the graphs as
$$M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}.$$
Note: this generalizes naturally to $m$ graphs, with $(i, j)$-block $(A^{(i)} + A^{(j)})/2$.
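The $m$-graph construction in the note can be sketched directly from the block formula. The function name `omnibus_matrix` and the random symmetric test matrices are illustrative assumptions.

```python
import numpy as np

def omnibus_matrix(graphs):
    """Build the mn-by-mn omnibus matrix with (i, j)-block (A_i + A_j) / 2."""
    m = len(graphs)
    n = graphs[0].shape[0]
    M = np.zeros((m * n, m * n))
    for i, Ai in enumerate(graphs):
        for j, Aj in enumerate(graphs):
            # Diagonal blocks reduce to (A_i + A_i) / 2 = A_i.
            M[i * n:(i + 1) * n, j * n:(j + 1) * n] = (Ai + Aj) / 2
    return M

rng = np.random.default_rng(4)

def random_symmetric(n):
    """Hypothetical helper: a random symmetric 0/1 matrix with zero diagonal."""
    upper = np.triu(rng.random((n, n)) < 0.3, k=1)
    return (upper + upper.T).astype(float)

A1, A2, A3 = (random_symmetric(5) for _ in range(3))
M2 = omnibus_matrix([A1, A2])        # the 2n-by-2n case from the definition
M3 = omnibus_matrix([A1, A2, A3])    # the m = 3 generalization
```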

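12 Omnibus embedding
Reminder:
$$M = \begin{bmatrix} A^{(1)} & \frac{A^{(1)} + A^{(2)}}{2} \\ \frac{A^{(1)} + A^{(2)}}{2} & A^{(2)} \end{bmatrix} \in \mathbb{R}^{2n \times 2n}.$$
Under $H_0$, we have $\mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = X X^T = P = U_P S_P U_P^T$, with $S_P \in \mathbb{R}^{d \times d}$ diagonal and $U_P \in \mathbb{R}^{n \times d}$ having orthonormal columns, so
$$\mathbb{E} M = \tilde{P} = \begin{bmatrix} P & P \\ P & P \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P \begin{bmatrix} U_P^T & U_P^T \end{bmatrix} = \begin{bmatrix} X \\ X \end{bmatrix} \begin{bmatrix} X^T & X^T \end{bmatrix}.$$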

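13 Omnibus embedding
Under $H_0$, we have $\mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = X X^T = P = U_P S_P U_P^T$, with $S_P \in \mathbb{R}^{d \times d}$ diagonal and $U_P \in \mathbb{R}^{n \times d}$ having orthonormal columns, so
$$\mathbb{E} M = \tilde{P} = \begin{bmatrix} P & P \\ P & P \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P \begin{bmatrix} U_P^T & U_P^T \end{bmatrix} = \begin{bmatrix} X \\ X \end{bmatrix} \begin{bmatrix} X^T & X^T \end{bmatrix}.$$
Key point: Applying ASE to $M$, we get a $2n$-by-$d$ matrix
$$\hat{Z} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix},$$
where $\hat{X}, \hat{Y} \in \mathbb{R}^{n \times d}$ provide estimates of the latent positions of $G_1$ and $G_2$ in the same $d$-dimensional space, without an additional alignment step.
A natural test statistic is given by $T_{\mathrm{Omni}} = \| \hat{X} - \hat{Y} \|_F$.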
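The following sketch puts the pieces of this slide together for two graphs: build $M$, apply ASE, split the rows into $\hat{X}$ and $\hat{Y}$, and compute $T_{\mathrm{Omni}}$. All names, latent positions, and sample sizes are illustrative assumptions, and the sketch does not choose a critical value for the test.

```python
import numpy as np

def ase(A, d):
    """Adjacency spectral embedding: rows of U_d S_d^{1/2}."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:d]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def omnibus_statistic(A1, A2, d):
    """Embed the 2n-by-2n omnibus matrix and return (T_Omni, Xhat, Yhat)."""
    n = A1.shape[0]
    avg = (A1 + A2) / 2
    M = np.block([[A1, avg], [avg, A2]])
    Z = ase(M, d)
    Xhat, Yhat = Z[:n], Z[n:]            # no Procrustes alignment needed
    return np.linalg.norm(Xhat - Yhat, "fro"), Xhat, Yhat

def sample_rdpg(X, rng):
    """Sample a symmetric, hollow adjacency matrix with P[A_ij = 1] = X_i^T X_j."""
    P = X @ X.T
    upper = np.triu(rng.random(P.shape) < P, k=1)
    return (upper + upper.T).astype(float)

rng = np.random.default_rng(5)
positions = np.array([[0.8, 0.1], [0.3, 0.6]])   # hypothetical latent positions
X = positions[rng.integers(0, 2, size=150)]
Y = X.copy()
Y[:30] = [0.1, 0.8]                              # alternative: 30 positions differ

T_null, _, _ = omnibus_statistic(sample_rdpg(X, rng), sample_rdpg(X, rng), d=2)
T_alt, _, _ = omnibus_statistic(sample_rdpg(X, rng), sample_rdpg(Y, rng), d=2)
```

As a sanity check on the expectation identity, feeding the noise-free matrix $P$ in for both graphs should give $\hat{X} = \hat{Y}$ up to rounding, so the statistic is numerically zero.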

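14 Main results: Notational preliminaries
In what follows, we assume the null hypothesis, so $G_1$ and $G_2$ have shared latent positions $X \in \mathbb{R}^{n \times d}$:
$$\mathbb{E} A^{(1)} = \mathbb{E} A^{(2)} = P = U_P S_P U_P^T = X X^T \in \mathbb{R}^{n \times n}.$$
We denote the true latent positions of $M$ by
$$Z = \begin{bmatrix} X \\ X \end{bmatrix} = \begin{bmatrix} U_P \\ U_P \end{bmatrix} S_P^{1/2} = U_{\tilde{P}} S_{\tilde{P}}^{1/2} \in \mathbb{R}^{2n \times d},$$
where $\tilde{P} = \mathbb{E} M$, and their estimates by
$$\hat{Z} = U_M S_M^{1/2} = \begin{bmatrix} \hat{X} \\ \hat{Y} \end{bmatrix} \in \mathbb{R}^{2n \times d},$$
where $S_M \in \mathbb{R}^{d \times d}$ is the diagonal matrix of the top $d$ eigenvalues of $M$, with the corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{2n \times d}$.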

15 Main results: Concentration inequality
Lemma (Uniform concentration of estimates): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$, and let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$. Write $\tilde{P} = \mathbb{E} M$. There exists a constant $C > 0$ such that, with high probability, there exists an orthogonal matrix $W \in \mathbb{R}^{d \times d}$ such that
$$\max_{1 \le h \le mn} \left\| \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W \right)_{h,\cdot} \right\| \le \frac{C m^{1/2} \log mn}{\sqrt{n}}.$$

16 Main results: CLT
Theorem (CLT: informally): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$ drawn i.i.d. from a $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$, and write $\tilde{P} = \mathbb{E} M$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$, so that $h$ indexes vertex $i$ in graph $s$. Then the error between the $h$-th position estimate and the (properly rotated) true $h$-th position is asymptotically a continuous mixture of normals, with mixing determined by $F$:
$$n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \xrightarrow{\mathcal{L}} \int \mathcal{N}(0, \Sigma(y)) \, dF(y).$$

17 Main results: CLT
Theorem (CLT: more formally): Let $\{A^{(i)}\}_{i=1}^m$ be adjacency matrices of $m$ independent RDPGs with shared latent positions $X = U_P S_P^{1/2} \in \mathbb{R}^{n \times d}$ drawn i.i.d. from a $d$-dimensional distribution $F$. Let $M \in \mathbb{R}^{mn \times mn}$ be their omnibus matrix, with top eigenvalues collected in the diagonal matrix $S_M \in \mathbb{R}^{d \times d}$ and corresponding eigenvectors in the columns of $U_M \in \mathbb{R}^{mn \times d}$, and write $\tilde{P} = \mathbb{E} M$. Let $\Phi(x, \Sigma)$ denote the cdf of a multivariate Gaussian with mean 0 and covariance matrix $\Sigma$. Fix $h = n(s - 1) + i$ for $i \in [n]$ and $s \in [m]$. There exists a sequence of $d$-by-$d$ orthogonal matrices $(W_n)_{n=1}^{\infty}$ such that for all $x \in \mathbb{R}^d$,
$$\lim_{n \to \infty} \Pr\left[ n^{1/2} \left( U_M S_M^{1/2} - U_{\tilde{P}} S_{\tilde{P}}^{1/2} W_n \right)_{h,\cdot} \le x \right] = \int \Phi(x, \Sigma(y)) \, dF(y),$$
where $\Sigma(y) = (m + 3) \Delta^{-1} \tilde{\Sigma}(y) \Delta^{-1} / (4m)$ and
$$\Delta = \mathbb{E}_F\, X_1 X_1^T, \qquad \tilde{\Sigma}(y) = \mathbb{E}_F \left( y^T X_1 - (y^T X_1)^2 \right) X_1 X_1^T.$$
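To make the covariance formula concrete, the sketch below evaluates $\Sigma(y) = (m + 3) \Delta^{-1} \tilde{\Sigma}(y) \Delta^{-1} / (4m)$ when $F$ is a discrete distribution on finitely many points, so the expectations become finite sums. The support points, weights, and function name are assumptions for illustration; the tilde on the inner matrix follows the notation used in this transcript.

```python
import numpy as np

def omni_limit_covariance(y, support, probs, m):
    """Sigma(y) = (m + 3) / (4 m) * Delta^{-1} Sigma_tilde(y) Delta^{-1}, discrete F."""
    support = np.asarray(support, dtype=float)   # K-by-d matrix of support points x_k
    probs = np.asarray(probs, dtype=float)       # K weights pi_k summing to one
    # Delta = E[X_1 X_1^T] = sum_k pi_k x_k x_k^T
    Delta = (support.T * probs) @ support
    # Sigma_tilde(y) = sum_k pi_k (y^T x_k - (y^T x_k)^2) x_k x_k^T
    p = support @ np.asarray(y, dtype=float)
    Sigma_tilde = (support.T * (probs * (p - p**2))) @ support
    Delta_inv = np.linalg.inv(Delta)
    return (m + 3) / (4 * m) * Delta_inv @ Sigma_tilde @ Delta_inv

support = np.array([[0.8, 0.1], [0.3, 0.6]])     # hypothetical two-point F
probs = np.array([0.5, 0.5])
y = support[0]

Sigma_m1 = omni_limit_covariance(y, support, probs, m=1)
Sigma_m2 = omni_limit_covariance(y, support, probs, m=2)
Sigma_m4 = omni_limit_covariance(y, support, probs, m=4)
```

The factor $(m + 3)/(4m)$ equals 1 at $m = 1$, equals $5/8$ at $m = 2$, and decreases toward $1/4$ as $m$ grows, so in this formula the limiting covariance shrinks as more graphs are embedded jointly.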

18 Experiments: hypothesis testing
[Figure, panels (a), (b), (c): empirical power against number of vertices (log scale) for the Omnibus and Procrustes methods.]
Figure: Power of the Procrustes-based (blue) and omnibus-based (green) tests to detect when the two graphs being tested differ in (a) one, (b) five, and (c) ten of their latent positions. Each point is the proportion of 1000 trials for which the given technique correctly rejected the null hypothesis. Error bars denote two standard errors of this empirical mean.

19 Experiments: estimating latent positions
[Figure: mean squared error (log scale) against number of vertices (log scale) for the methods Abar, ASE1, OMNI, OMNIbar and PROCbar.]
Figure: Mean squared error (MSE) in recovery of latent positions (up to rotation) in a 2-graph RDPG model as a function of the number of vertices for different estimation procedures.

20 Future Work
Develop graph analogues of ANOVA and other multiple hypothesis testing procedures.
Improve techniques for choosing the critical value in the omnibus test.
Improve understanding of power under $H_A$.

21 Thanks! Full paper:


More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

1 Tridiagonal matrices

1 Tridiagonal matrices Lecture Notes: β-ensembles Bálint Virág Notes with Diane Holcomb 1 Tridiagonal matrices Definition 1. Suppose you have a symmetric matrix A, we can define its spectral measure (at the first coordinate

More information

Linear Algebra Review

Linear Algebra Review Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and

More information

Graph Metrics and Dimension Reduction

Graph Metrics and Dimension Reduction Graph Metrics and Dimension Reduction Minh Tang 1 Michael Trosset 2 1 Applied Mathematics and Statistics The Johns Hopkins University 2 Department of Statistics Indiana University, Bloomington November

More information

Elementary Linear Algebra Review for Exam 2 Exam is Monday, November 16th.

Elementary Linear Algebra Review for Exam 2 Exam is Monday, November 16th. Elementary Linear Algebra Review for Exam Exam is Monday, November 6th. The exam will cover sections:.4,..4, 5. 5., 7., the class notes on Markov Models. You must be able to do each of the following. Section.4

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical

More information

Data Mining Lecture 4: Covariance, EVD, PCA & SVD

Data Mining Lecture 4: Covariance, EVD, PCA & SVD Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Johannes Heiny University of Aarhus Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia),

More information

Stat 159/259: Linear Algebra Notes

Stat 159/259: Linear Algebra Notes Stat 159/259: Linear Algebra Notes Jarrod Millman November 16, 2015 Abstract These notes assume you ve taken a semester of undergraduate linear algebra. In particular, I assume you are familiar with the

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional

More information

Singular Value Decomposition

Singular Value Decomposition Singular Value Decomposition CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Singular Value Decomposition 1 / 35 Understanding

More information

Convergence of Eigenspaces in Kernel Principal Component Analysis

Convergence of Eigenspaces in Kernel Principal Component Analysis Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x = Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.

More information

9.520: Class 20. Bayesian Interpretations. Tomaso Poggio and Sayan Mukherjee

9.520: Class 20. Bayesian Interpretations. Tomaso Poggio and Sayan Mukherjee 9.520: Class 20 Bayesian Interpretations Tomaso Poggio and Sayan Mukherjee Plan Bayesian interpretation of Regularization Bayesian interpretation of the regularizer Bayesian interpretation of quadratic

More information

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018 ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space

More information

Lecture 6: Lies, Inner Product Spaces, and Symmetric Matrices

Lecture 6: Lies, Inner Product Spaces, and Symmetric Matrices Math 108B Professor: Padraic Bartlett Lecture 6: Lies, Inner Product Spaces, and Symmetric Matrices Week 6 UCSB 2014 1 Lies Fun fact: I have deceived 1 you somewhat with these last few lectures! Let me

More information

c 1 v 1 + c 2 v 2 = 0 c 1 λ 1 v 1 + c 2 λ 1 v 2 = 0

c 1 v 1 + c 2 v 2 = 0 c 1 λ 1 v 1 + c 2 λ 1 v 2 = 0 LECTURE LECTURE 2 0. Distinct eigenvalues I haven t gotten around to stating the following important theorem: Theorem: A matrix with n distinct eigenvalues is diagonalizable. Proof (Sketch) Suppose n =

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

Extreme inference in stationary time series

Extreme inference in stationary time series Extreme inference in stationary time series Moritz Jirak FOR 1735 February 8, 2013 1 / 30 Outline 1 Outline 2 Motivation The multivariate CLT Measuring discrepancies 3 Some theory and problems The problem

More information

MATH 20F: LINEAR ALGEBRA LECTURE B00 (T. KEMP)

MATH 20F: LINEAR ALGEBRA LECTURE B00 (T. KEMP) MATH 20F: LINEAR ALGEBRA LECTURE B00 (T KEMP) Definition 01 If T (x) = Ax is a linear transformation from R n to R m then Nul (T ) = {x R n : T (x) = 0} = Nul (A) Ran (T ) = {Ax R m : x R n } = {b R m

More information

Lecture: Face Recognition and Feature Reduction

Lecture: Face Recognition and Feature Reduction Lecture: Face Recognition and Feature Reduction Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab Lecture 11-1 Recap - Curse of dimensionality Assume 5000 points uniformly distributed

More information

Second-Order Inference for Gaussian Random Curves

Second-Order Inference for Gaussian Random Curves Second-Order Inference for Gaussian Random Curves With Application to DNA Minicircles Victor Panaretos David Kraus John Maddocks Ecole Polytechnique Fédérale de Lausanne Panaretos, Kraus, Maddocks (EPFL)

More information

1. Let A be a 2 2 nonzero real matrix. Which of the following is true?

1. Let A be a 2 2 nonzero real matrix. Which of the following is true? 1. Let A be a 2 2 nonzero real matrix. Which of the following is true? (A) A has a nonzero eigenvalue. (B) A 2 has at least one positive entry. (C) trace (A 2 ) is positive. (D) All entries of A 2 cannot

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Solutions to Review Problems for Chapter 6 ( ), 7.1

Solutions to Review Problems for Chapter 6 ( ), 7.1 Solutions to Review Problems for Chapter (-, 7 The Final Exam is on Thursday, June,, : AM : AM at NESBITT Final Exam Breakdown Sections % -,7-9,- - % -9,-,7,-,-7 - % -, 7 - % Let u u and v Let x x x x,

More information

Transmit Directions and Optimality of Beamforming in MIMO-MAC with Partial CSI at the Transmitters 1

Transmit Directions and Optimality of Beamforming in MIMO-MAC with Partial CSI at the Transmitters 1 2005 Conference on Information Sciences and Systems, The Johns Hopkins University, March 6 8, 2005 Transmit Directions and Optimality of Beamforming in MIMO-MAC with Partial CSI at the Transmitters Alkan

More information

Stat 710: Mathematical Statistics Lecture 31

Stat 710: Mathematical Statistics Lecture 31 Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:

More information