Applications of random matrix theory to principal component analysis (PCA)


1. Applications of random matrix theory to principal component analysis (PCA). Jun Yin (IAS / UW-Madison). IAS, April 2014. Joint work with A. Knowles and H.-T. Yau.

2. Basic picture. Let $H$ be a Wigner (symmetric) random matrix: $H = (H_{ij})_{1 \le i,j \le N}$, $H = H^*$, where $H_{ij} = N^{-1/2} h_{ij}$ and the upper-triangular entries $h_{ij}$ are independent random variables with mean 0 and variance 1, i.e.
$$\mathbb{E} H_{ij} = 0, \qquad \mathbb{E}|H_{ij}|^2 = \frac{1}{N}, \qquad 1 \le i, j \le N.$$
Let $A = A_{N \times N}$ be a (full-rank) deterministic symmetric matrix; most of the eigenvalues of $A$ are $O(1)$. What can we say about $H + A$? [Knowles and Y.]: the $\mathrm{rank}(A) = O(1)$ case. Example: $A = \sum_{k=1}^{N} d_k v_k v_k^*$, where $d_1 = 10$, $d_k = 1$ for $2 \le k \le N/2$, and $d_k = 2$ otherwise.
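To make the picture concrete, here is a minimal numerical sketch (my illustration, not from the talk; $N$ and $d$ are arbitrary choices). It builds a Wigner matrix plus a rank-one deformation $d\,vv^*$ and checks that the largest eigenvalue sits near $d + 1/d$, outside the bulk $[-2, 2]$:

```python
import numpy as np

# Minimal sketch (not from the talk; N and d are arbitrary choices):
# a Wigner matrix plus a rank-one deformation d * v v^T.
N = 1000
rng = np.random.default_rng(0)

G = rng.standard_normal((N, N))
H = (G + G.T) / np.sqrt(2 * N)      # symmetric, off-diagonal variance 1/N

d = 10.0
v = rng.standard_normal(N)
v /= np.linalg.norm(v)
A = d * np.outer(v, v)              # rank-one deterministic part

eigvals = np.linalg.eigvalsh(H + A)
# The bulk stays in [-2, 2]; for d > 1 the outlier sits near d + 1/d.
print(eigvals[-1], d + 1 / d)
```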

3. Some basic questions:
- Limiting spectral distribution $\rho_{HA}$: Voiculescu 1986, Speicher 1994 (free probability).
- Local density, or rigidity:
$$|\lambda_k - \gamma_k| \le N^{\varepsilon}\, |\gamma_k - \gamma_{k\pm 1}|, \qquad \int_{-\infty}^{\gamma_k} \rho_{HA}(x)\,dx = k/N,$$
holds with probability $1 - N^{-D}$ for any small $\varepsilon > 0$ and (large) $D > 0$.
- Delocalization of eigenvectors (in any direction): let $u_k$ ($1 \le k \le N$) be the $\ell^2$-normalized eigenvectors of $H + A$, and let $w$ be a deterministic unit vector. Then
$$\max_k |\langle u_k, w \rangle|^2 \le N^{-1+\varepsilon}$$
holds with probability $1 - N^{-D}$ for any small $\varepsilon > 0$ and (large) $D > 0$.
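A quick numerical sketch of the delocalization statement (illustration only; the diagonal $A$ below is a hypothetical choice):

```python
import numpy as np

# Sketch (illustration only): eigenvector delocalization for H + A.
# For a fixed unit vector w, max_k |<u_k, w>|^2 should be ~ N^{-1+eps}.
N = 1000
rng = np.random.default_rng(1)

G = rng.standard_normal((N, N))
H = (G + G.T) / np.sqrt(2 * N)
A = np.diag(np.linspace(-1, 1, N))   # a hypothetical deterministic A

_, U = np.linalg.eigh(H + A)         # columns of U are the eigenvectors u_k
w = rng.standard_normal(N)
w /= np.linalg.norm(w)

overlaps = (U.T @ w) ** 2            # |<u_k, w>|^2 for all k; they sum to 1
print(overlaps.max(), 1 / N)         # max overlap vs the N^{-1} benchmark
```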

4. Some basic questions (continued):
- Behavior of the outliers.
- Behavior of the eigenvectors of the outliers.
- Joint distribution of the largest $k$ non-outlier eigenvalues.
- $k$-point correlation functions of $H + A$ in the bulk.

5. A Wigner matrix has two important properties:
1. Independent entries.
2. Isotropic rows/columns (excluding the diagonal entry): let $h_k = \{H_{kl}\}_{l \ne k}$ and let $w \in \mathbb{R}^{N-1}$ be a deterministic vector; then
$$\langle h_k, w \rangle \approx \mathcal{N}\big(0,\, N^{-1} \|w\|_2^2\big).$$
Why is property 2 important? Think about a growing matrix. Clearly $H + A$ does not have the second property. Two exceptional cases: $A$ is diagonal, or $H$ is GOE.
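A small empirical check of the isotropy property (my illustration; the sizes are arbitrary):

```python
import numpy as np

# Sketch (illustration only): the isotropy property of a Wigner row,
# <h_k, w> ~ N(0, ||w||_2^2 / N) for a deterministic w.
N, trials = 500, 5000
rng = np.random.default_rng(2)

w = rng.standard_normal(N - 1)       # fixed deterministic direction
samples = np.empty(trials)
for t in range(trials):
    h_k = rng.standard_normal(N - 1) / np.sqrt(N)   # row of H, no diagonal
    samples[t] = h_k @ w

print(samples.var(), w @ w / N)      # empirical vs predicted variance
```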

6. Sample covariance matrix. Let $XX^*$ be a sample covariance matrix: $X = (X_{ij})_{1 \le i \le M,\, 1 \le j \le N}$, $X_{ij} = (MN)^{-1/4} x_{ij}$, where the $x_{ij}$ are independent random variables with mean 0 and variance 1. Let $T = T_{M \times M}$ be a deterministic matrix such that $TT^*$ has full rank and most of the eigenvalues of $TT^*$ are $O(1)$. What can we say about $TXX^*T^*$? Furthermore, let $e = N^{-1/2}(1, 1, \ldots, 1)^*$. What about $TX(1 - ee^*)X^*T^*$?

7. Real-life question: let $y = (y(1), y(2), \ldots, y(M))$ be a random vector with unknown distribution, for example the price changes of $M$ stocks. How can one obtain the covariance matrix $\{\mathrm{Cov}(y(i), y(j))\}_{i,j=1}^{M}$?
Stat 101: measure $y$ $N$ times independently, obtaining $y_1, y_2, \ldots, y_N$. Then
$$\mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \sum_{\alpha=1}^{N} \big(y_\alpha(i) - \bar{y}(i)\big)\big(y_\alpha(j) - \bar{y}(j)\big), \qquad \bar{y}(i) = \frac{1}{N} \sum_{\alpha=1}^{N} y_\alpha(i).$$
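The "Stat 101" estimator in code (a sketch on synthetic data; the sizes and the scaling of the samples are hypothetical):

```python
import numpy as np

# Sketch (illustration only; the data below is synthetic): the "Stat 101"
# covariance estimator from N independent measurements of y.
M, N = 5, 10000
rng = np.random.default_rng(3)
Y = rng.standard_normal((N, M)) * np.arange(1, M + 1)  # hypothetical samples

y_bar = Y.mean(axis=0)                                  # componentwise mean
cov = (Y - y_bar).T @ (Y - y_bar) / (N - 1)
print(np.allclose(cov, np.cov(Y, rowvar=False)))        # matches np.cov
```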

8. Model: we assume that $y$ is a linear mixture of some fundamental independent random variables $x = (x(1), x(2), \ldots, x(M'))$:
$$y = Tx, \qquad T = T_{M \times M'}.$$
If you measure $y$ $N$ times, then $Y = (y_1, y_2, \ldots, y_N) = TX = T(x_1, x_2, \ldots, x_N)$ and
$$\mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \left[ TX(1 - ee^*)X^*T^* \right]_{ij}.$$
For example: let $v$ be a fixed vector, $x$ a random vector, and $\xi$ a random variable, all independent of one another. Then $y = x + \xi v$ can be written as
$$y = (I, v)\binom{x}{\xi} = T\tilde{x}.$$

9. Model: $y = Tx$, $T = T_{M \times M'}$, $x = (x(1), x(2), \ldots, x(M'))$, so that
$$\mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \left[ TX(1 - ee^*)X^*T^* \right]_{ij}.$$
Without loss of generality, we assume $\mathbb{E}x(i) = 0$ (by defining $x'(i) = x(i) - \mathbb{E}x(i)$) and $\mathbb{E}|x(i)|^2 = 1$ (by rescaling $T$). With this setting,
$$\mathrm{Cov}(y(i), y(j)) = (TT^*)_{ij}.$$
Recall the previous example, now with $\|v\|_2 \gtrsim 1$:
$$y = x + \xi v, \qquad \mathbb{E}|x(i)|^2 = 1, \quad \mathbb{E}|\xi|^2 = 1, \qquad y = (I, v)\binom{x}{\xi} = T\tilde{x}, \qquad TT^* = I + vv^*.$$
As we can see, $TT^*$ has one large eigenvalue $1 + \|v\|_2^2$ with eigenvector $v$ (indeed $(I + vv^*)v = (1 + \|v\|_2^2)v$), and these are related to the "signal".

10. Although
$$\mathrm{Cov}(y(i), y(j)) = (TT^*)_{ij} = \lim_{N \to \infty} \frac{1}{N-1} \left[ TX(1 - ee^*)X^*T^* \right]_{ij},$$
in most cases this does not work very well, since $N$ needs to be very large (like $N \gg M$). The basic idea of PCA is to estimate the principal components (the large eigenvalues and their eigenvectors) of $TT^*$ by those of the matrix $TX(1 - ee^*)X^*T^*$, or just $TXX^*T^*$.
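A sketch of this estimation step (my illustration; the spike strength and sizes are hypothetical, and the sample covariance is normalized by $1/N$ here for simplicity):

```python
import numpy as np

# Sketch (hypothetical parameters): estimate the top principal component
# of T T^* from the sample covariance (1/N) * T X X^* T^*.
M, N = 200, 1000
rng = np.random.default_rng(4)

v = rng.standard_normal(M)
v /= np.linalg.norm(v)
TTstar = np.eye(M) + 7.0 * np.outer(v, v)   # one spike: eigenvalue 8 at v
T = np.linalg.cholesky(TTstar)

X = rng.standard_normal((M, N))
Q = (T @ X) @ (T @ X).T / N                 # sample covariance

vals, vecs = np.linalg.eigh(Q)
print(vals[-1], abs(vecs[:, -1] @ v))       # outlier and its overlap with v
```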

11. PCA model:
$$TT^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*, \qquad |\text{signal}| = O(1),$$
with $T = T_{M \times M'}$, $M' = M + O(1)$. The noise eigenvalues are comparable: $c \le d_\beta \le C$ for $\beta \in \text{noise}$, and $\log N \asymp \log M$. Both $d_\beta$ and $v_\beta$ may depend on $N$ or $M$.

12. Basic picture ($\lambda_{\text{noise}} = 1$ and only one $\lambda_{\text{signal}}$). Let $d := (N/M)^{1/2} (\lambda_{\text{signal}} - \lambda_{\text{noise}})$. An outlier appears when $d > 1$, and the outlier $\mu$ satisfies
$$\mu = \gamma + d + d^{-1} + \text{error}.$$
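A numerical sketch of this transition (illustration only; I assume here that $\gamma = \sqrt{M/N} + \sqrt{N/M}$, so that the bulk edge is $\gamma + 2$ under the entry normalization $X_{ij} = (MN)^{-1/4} x_{ij}$; this identification of $\gamma$ is my assumption, not a statement from the talk):

```python
import numpy as np

# Numerical sketch of the outlier location. Assumption (mine, for this
# illustration): gamma = sqrt(M/N) + sqrt(N/M), so that the bulk edge is
# gamma + 2 under the entry normalization X_ij = (MN)^{-1/4} x_ij.
M, N = 400, 1600
rng = np.random.default_rng(5)

lam_signal, lam_noise = 3.0, 1.0
d = np.sqrt(N / M) * (lam_signal - lam_noise)   # d > 1: outlier regime

v = rng.standard_normal(M)
v /= np.linalg.norm(v)
TTstar = lam_noise * np.eye(M) + (lam_signal - lam_noise) * np.outer(v, v)
T = np.linalg.cholesky(TTstar)

X = rng.standard_normal((M, N)) / (M * N) ** 0.25
mu = np.linalg.eigvalsh(T @ X @ X.T @ T.T)[-1]

gamma = np.sqrt(M / N) + np.sqrt(N / M)
print(mu, gamma + d + 1 / d)                    # measured vs predicted
```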

13. Detection of $v_{\text{signal}}$. Let $u$ be the eigenvector associated with the outlier $\mu$; then for any fixed normalized $w$ we have
$$|\langle w, u \rangle|^2 = f_\mu\, |\langle w, v_{\text{signal}} \rangle|^2 + \text{error}.$$
Distribution of $u$?
1. $\theta(v_{\text{signal}}, u) = \arccos f_\mu + \text{error}$.
2. Delocalization in any direction orthogonal to $v_{\text{signal}}$: if $\langle w, v_{\text{signal}} \rangle = 0$, then $|\langle w, u \rangle| \le M^{-1/2+\varepsilon}$.
Briefly speaking, $u - \langle u, v_{\text{signal}} \rangle v_{\text{signal}}$ is random and isotropic. Two-outlier case: see graph.

14. Application of delocalization. Assume we know that $v_{\text{signal}} \in \mathbb{R}^M$ has only $\widetilde{M}$ non-zero components, $\widetilde{M} \ll M$, and that
$$|v_{\text{signal}}(i)| \sim \widetilde{M}^{-1/2} \quad \text{if } v_{\text{signal}}(i) \ne 0.$$
Then:
1. if $v_{\text{signal}}(i) = 0$, the delocalization property shows $|u(i)| \le M^{-1/2+\varepsilon}$;
2. if $v_{\text{signal}}(i) \ne 0$, the parallel property shows $|u(i)| \ge \widetilde{M}^{-1/2-\varepsilon}$.
Using this method, we can determine which components of $v_{\text{signal}}$ are non-zero.
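A sketch of this support-recovery idea (my illustration; the sizes, spike strength, and threshold choice are hypothetical): threshold the entries of the sample top eigenvector between the two scales $M^{-1/2}$ and $\widetilde{M}^{-1/2}$.

```python
import numpy as np

# Sketch of the support-recovery idea (hypothetical sizes and spike
# strength): entries of the top eigenvector are ~ M^{-1/2} off-support
# and ~ Mtilde^{-1/2} on-support, so threshold in between.
M, Mtilde, N = 1000, 20, 4000
rng = np.random.default_rng(6)

v = np.zeros(M)
support = rng.choice(M, size=Mtilde, replace=False)
v[support] = 1 / np.sqrt(Mtilde)                 # sparse signal direction

TTstar = np.eye(M) + 10.0 * np.outer(v, v)       # strong spike
T = np.linalg.cholesky(TTstar)
X = rng.standard_normal((M, N)) / np.sqrt(N)

u = np.linalg.eigh(T @ X @ X.T @ T.T)[1][:, -1]  # top sample eigenvector
threshold = (M * Mtilde) ** -0.25                # between the two scales
recovered = np.flatnonzero(np.abs(u) > threshold)
print(set(recovered) == set(support))
```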

15. Some previous results (eigenvalues), for $TXX^*T^*$, $T = T_{M \times M'}$, $TT^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*$:
- Baik and Silverstein (2006): $p = q$, $\lambda_{\text{noise}} = 1$; for fixed $d$, they obtain the limit of $\mu$.
- Bai and Yao (2008): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ of the block form $T = \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix}$ (where $A$ is an $O(1) \times O(1)$ matrix); for fixed $d$, they obtain the CLT of $\mu$.
- Bai and Yao (2008): $p = q$, $T$ a symmetric matrix of the block form $T = \begin{pmatrix} A & 0 \\ 0 & T \end{pmatrix}$ (where $A$ is an $O(1) \times O(1)$ matrix); for fixed $d$, they obtain the limit of $\mu$.
- Nadler (2008): the spiked covariance model, $T = (I, \lambda e_1)$.

16. Eigenvector: for any fixed $w$, we have
$$|\langle w, u \rangle|^2 = f_d\, |\langle w, v_{\text{signal}} \rangle|^2 + \text{error}.$$
- Paul (2007): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ diagonal, $X_{ij}$ Gaussian.
- Shi (2013): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ diagonal.
- Benaych-Georges, Nadakuditi (2010): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ a random symmetric matrix independent of $X$; either $X_{ij}$ is Gaussian or $T$ is isotropic.
- Benaych-Georges, Nadakuditi (2012): $T = (I, \lambda v)$ with random isotropic $v$.
Results: the limit of $|\langle u, v_{\text{signal}} \rangle|^2$, except for the first one.

17. Main results:
1. Rigidity of eigenvalues (including outliers), up to an $N^\varepsilon$ factor: $\lambda_i - \gamma_i = \text{error}$.
2. Delocalization of the eigenvectors of non-outliers.
3. Direction of the eigenvectors of outliers: $u - \langle u, v_{\text{signal}} \rangle v_{\text{signal}}$ is random and isotropic.
4. Some eigenvector information can be detected even if $d = 1 + o(1)$.
5. Tracy-Widom (TW) distribution of the largest $k$ non-outliers. (El Karoui (2007): Gaussian case.)
6. Isotropic law for $H + A$ and $TXX^*T^*$.
7. Bulk universality with four-moment matching (for $H + A$ and $TXX^*T^*$).

18. Strategy. With $V_{M \times M}$, $U_{M' \times M'}$ and diagonal $D_{M \times M}$, write
$$T = V D (I_M, 0) U, \qquad TT^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*.$$
Define $S$ with $S = S^*$ and
$$SS^* = \sum_{\beta \in \text{signal}} 1 \cdot v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*.$$
Note: $S$ has no signal. Represent
$$G := \big( TX(1 - ee^*)X^*T^* - z \big)^{-1}$$
in terms of $G_S$, $G_S X$, $X^* G_S X$, etc., where
$$G_S := \big( SXX^*S^* - z \big)^{-1}.$$

19. Question: let $A$ be a matrix with only one non-zero entry, and let $\widetilde{X} = X + A$. How close is $(XX^* - z)^{-1}$ to $(\widetilde{X}\widetilde{X}^* - z)^{-1}$?

20. Isotropic law. Wigner case: let $H$ be a Wigner matrix, $G = (H - z)^{-1}$. Then for any fixed $w$ and $v$, we have
$$\langle w, (H - z)^{-1} v \rangle = m(z)\langle w, v \rangle + O\big((N\eta)^{-1}(\log N)^{C}\big), \qquad \eta = \operatorname{Im} z,$$
where $m(z) = \int \rho_{sc}(x)(x - z)^{-1}\,dx$. [Knowles and Y (2011).]
PCA case: for fixed $w$ and $v$, what are the behaviors of $\langle w, G_S v \rangle$, $\langle w, G_S X v \rangle$, $\langle w, X^* G_S X v \rangle$, where $G_S = (SXX^*S^* - z)^{-1}$? Bloemendal, Erdős, Knowles, Yau and Y (2013): the case $S = I_{M \times M'}$.
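A numerical sketch of the Wigner statement (illustration only; $N$ and $z$ are arbitrary choices, and $m(z)$ is evaluated via the closed form of the semicircle Stieltjes transform):

```python
import numpy as np

# Sketch (illustration only): the isotropic law for a Wigner matrix,
# <w, (H - z)^{-1} v> ≈ m(z) <w, v>, where m(z) is the Stieltjes
# transform of the semicircle law, m(z) = (-z + sqrt(z^2 - 4)) / 2
# (taking the branch with Im m(z) > 0).
N = 2000
rng = np.random.default_rng(7)
G = rng.standard_normal((N, N))
H = (G + G.T) / np.sqrt(2 * N)

z = 1.0 + 0.1j
R = np.linalg.inv(H - z * np.eye(N))

w = rng.standard_normal(N)
w /= np.linalg.norm(w)
v = rng.standard_normal(N)
v /= np.linalg.norm(v)

m = (-z + np.sqrt(z * z - 4 + 0j)) / 2
print(w @ R @ v, m * (w @ v))        # should agree up to the (N eta) error
```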

21. Isotropic law for general $S$ or general $A$: $(SXX^*S^* - z)^{-1}$, $(H + A - z)^{-1}$.
Knowles and Y (2014): let $A = UDU^*$ with $D = \mathrm{diag}(d_1, d_2, \ldots, d_N)$, where $|d_i| \le C$. Define
$$m_i := (d_i - z - m)^{-1}, \qquad m := \frac{1}{N} \sum_j m_j.$$
Then for fixed $w, v \in \mathbb{R}^N$,
$$\langle w, (H + A - z)^{-1} v \rangle = \langle w, (A - z - m)^{-1} v \rangle + \text{error}.$$
Based on this result: rigidity, delocalization, TW law. (Capitaine, Péché 2014: GOE $+\, A$.)
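A numerical sketch (my illustration, with a hypothetical diagonal $A$): solve the self-consistent equation for $m$ by a damped fixed-point iteration, then compare the two resolvent bilinear forms.

```python
import numpy as np

# Sketch (illustration only, diagonal A): solve the self-consistent
# equation m = (1/N) * sum_i 1/(d_i - z - m) by damped fixed-point
# iteration, then compare <w,(H+A-z)^{-1} v> with <w,(A-z-m)^{-1} v>.
N = 2000
rng = np.random.default_rng(8)

d = rng.uniform(-1, 1, N)                 # hypothetical spectrum of A
A = np.diag(d)
G = rng.standard_normal((N, N))
H = (G + G.T) / np.sqrt(2 * N)

z = 0.5 + 0.1j
m = 1j                                    # start in the upper half-plane
for _ in range(500):
    m = 0.5 * m + 0.5 * np.mean(1 / (d - z - m))

w = rng.standard_normal(N)
w /= np.linalg.norm(w)
v = rng.standard_normal(N)
v /= np.linalg.norm(v)

lhs = w @ np.linalg.inv(H + A - z * np.eye(N)) @ v
rhs = w @ ((1 / (d - z - m)) * v)         # (A - z - m)^{-1} is diagonal here
print(lhs, rhs)
```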

22. Basic idea of the proof of the isotropic law for $H + A$:
1. The isotropic law of $\mathrm{GOE} + A$ (polynomialization method). Bloemendal, Erdős, Knowles, Yau and Y (2013).
2. Compare $(H + A - z)^{-1}$ with $(\mathrm{GOE} + A - z)^{-1}$ via a Newton method. Let $\nu$ be the distribution density of $H_{ij}$ and $\nu_G$ the distribution density of the entries of GOE. Let $H_t$ be the Wigner matrix whose entries have distribution
$$\nu_t = t\nu + (1 - t)\nu_G,$$
a continuous bridge between $H$ ($t = 1$) and GOE ($t = 0$).

23. Recall
$$\mathbb{E}F(H) = \mathbb{E}F(H_0) + \int_0^1 \partial_t\, \mathbb{E}F(H_t)\, dt, \qquad H_0 = \mathrm{GOE}, \quad H_1 = H.$$
Let $H_{t,k,l,0}$ be the Wigner matrix whose entries have the same distribution as those of $H_t$, except that the $(k, l)$ entry of $H_{t,k,l,0}$ has distribution $\nu$. Let $H_{t,k,l,1}$ be the Wigner matrix whose entries have the same distribution as those of $H_t$, except that the $(k, l)$ entry of $H_{t,k,l,1}$ has distribution $\nu_G$. Then
$$\partial_t\, \mathbb{E}F(H_t) = \sum_{kl} \Big( \mathbb{E}F(H_{t,k,l,0}) - \mathbb{E}F(H_{t,k,l,1}) \Big).$$
Note: $H_{t,k,l,0}$ and $H_{t,k,l,1}$ are both very close to $H_t$.
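The short computation behind this identity (a sketch under the stated setup, writing the expectation as an integral over the independent entries):

```latex
\begin{align*}
\partial_t\, \mathbb{E}F(H_t)
  &= \sum_{kl} \int F \,\big(\partial_t \nu_t\big)(h_{kl})\,\mathrm{d}h_{kl}
     \prod_{(i,j)\neq(k,l)} \nu_t(h_{ij})\,\mathrm{d}h_{ij}
   && \text{(product rule over entries)} \\
  &= \sum_{kl} \Big( \mathbb{E}F(H_{t,k,l,0}) - \mathbb{E}F(H_{t,k,l,1}) \Big),
   && \text{since } \partial_t \nu_t = \nu - \nu_G,
\end{align*}
```

i.e., resampling the $(k,l)$ entry from $\nu$ gives the "0" matrix and resampling it from $\nu_G$ gives the "1" matrix.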

24. For example, take
$$F_{i,j,p}(H) = \left| \left[ (H - z)^{-1} \right]_{ij} \right|^{2p}.$$
Goal: create a self-consistent differential equation for $\big( \mathbb{E}F_{i,j,p}(H_t) \big)_{i,j=1}^{N}$ which is stable.

25. Thank you.
