Applications of random matrix theory to principal component analysis (PCA)
1 Applications of random matrix theory to principal component analysis (PCA). Jun Yin, IAS and UW-Madison. IAS, April 2014. Joint work with A. Knowles and H.-T. Yau.
2 Basic picture: Let $H$ be a Wigner (symmetric) random matrix:
\[ H = (H_{ij})_{1 \le i,j \le N}, \qquad H = H^*, \qquad H_{ij} = N^{-1/2} h_{ij}, \]
where the upper-triangular entries $h_{ij}$ are independent random variables with mean 0 and variance 1, so that
\[ \mathbb{E} H_{ij} = 0, \qquad \mathbb{E} |H_{ij}|^2 = \frac{1}{N}, \qquad 1 \le i,j \le N. \]
Let $A = A_{N \times N}$ be a (full rank) deterministic symmetric matrix. Most of the e.values of $A$ are $O(1)$. What can we say about $H + A$?
[Knowles and Y, ]: on the rank $A = O(1)$ case.
Example: $A = \sum_{k=1}^{N} d_k v_k v_k^*$, where $d_1 = 10$, $d_k = 1$ for $2 \le k \le N/2$, and $d_k = 2$ otherwise.
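The basic picture can be explored numerically. The following is a minimal sketch, not the construction used in the talk: it assumes Gaussian entries $h_{ij}$ and takes the $v_k$ to be the standard basis vectors, so that $A$ is diagonal. The function names are hypothetical.

```python
import numpy as np


def wigner_matrix(n, rng):
    """Symmetric matrix whose off-diagonal entries have mean 0 and variance 1/n.

    Gaussian entries are an assumption; the slide allows any mean-0, variance-1 law.
    The diagonal entries get variance 2/n, as for GOE.
    """
    g = rng.standard_normal((n, n))
    return (g + g.T) / np.sqrt(2.0 * n)


def example_deformation(n):
    """Diagonal A with d_1 = 10, d_k = 1 for 2 <= k <= n/2, and d_k = 2 otherwise.

    Using the standard basis for the v_k is an assumption made for simplicity.
    """
    d = np.full(n, 2.0)
    d[: n // 2] = 1.0
    d[0] = 10.0
    return np.diag(d)


def spectrum_of_sum(n=400, seed=0):
    """Eigenvalues of H + A in ascending order."""
    rng = np.random.default_rng(seed)
    return np.linalg.eigvalsh(wigner_matrix(n, rng) + example_deformation(n))


if __name__ == "__main__":
    ev = spectrum_of_sum()
    print("three largest eigenvalues of H + A:", ev[-3:])
```

In this sketch the single large value $d_1 = 10$ produces one eigenvalue of $H + A$ that is well separated from the rest of the spectrum.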
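3 Some basic questions:
Limiting spectral distribution $\rho_{HA}$ of $H + A$: Voiculescu 1986, Speicher 1994 (free probability).
Local density or rigidity:
\[ |\lambda_k - \gamma_k| \le N^{\varepsilon}\, |\gamma_k - \gamma_{k \pm 1}|, \qquad \int_{-\infty}^{\gamma_k} \rho_{HA} = k/N, \]
holds with probability $1 - N^{-D}$ for any small $\varepsilon > 0$ and $D > 0$.
Delocalization of e.vectors (in any direction): let $u_k$, $1 \le k \le N$, be the $\ell^2$-normalized eigenvectors of $H + A$, and let $w$ be a deterministic (normalized) vector. Then
\[ \max_k |\langle u_k, w \rangle|^2 \le N^{-1+\varepsilon} \]
holds with probability $1 - N^{-D}$ for any small $\varepsilon > 0$ and $D > 0$.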
4 Some basic questions:
Behavior of outliers.
Behavior of the e.vectors of outliers.
Joint distribution of the largest $k$ non-outlier eigenvalues.
$k$-point correlation functions of $H + A$ in the bulk.
5 A Wigner matrix has two important properties:
1. Independent entries.
2. Isotropic rows/columns (without diagonal entries): let $h_k = \{H_{kl}\}_{l \ne k}$ and let $w \in \mathbb{R}^{N-1}$ be a deterministic vector, then
\[ \langle h_k, w \rangle \sim \mathcal{N}\big(0,\, N^{-1} \|w\|_2^2\big). \]
Why is property 2 important? Think about a growing matrix.
Clearly $H + A$ does not have the second property. Two exceptional cases: $A$ is diagonal; $H$ is GOE.
6 Sample covariance matrix: Let $XX^*$ be a sample covariance matrix:
\[ X = (X_{ij})_{1 \le i \le M',\, 1 \le j \le N}, \qquad X_{ij} = (MN)^{-1/4} x_{ij}, \]
where the $x_{ij}$ are independent random variables with mean 0 and variance 1.
Let $T = T_{M \times M'}$ ($M' \ge M$) be a deterministic matrix such that $TT^* - I$ has full rank and most of the e.values of $TT^*$ are $O(1)$. What can we say about $T XX^* T^*$?
Furthermore, let $e = N^{-1/2}(1, 1, \ldots, 1, 1)^* \in \mathbb{R}^N$, so that $1 - ee^*$ subtracts the mean over the $N$ columns. How about $T X (1 - ee^*) X^* T^*$?
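7 Real life question: Let $y = (y(1), y(2), \ldots, y(M))$ be some random vector with unknown distribution. For example: price changes of $M$ stocks. How can one get the covariance matrix
\[ \big\{ \mathrm{Cov}(y(i), y(j)) \big\}_{i,j=1}^{M} \, ? \]
Stat 101: Measure $y$ $N$ times independently: $y_1, y_2, \ldots, y_N$. Then
\[ \mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \sum_{\alpha=1}^{N} \big( y_\alpha(i) - \bar{y}(i) \big) \big( y_\alpha(j) - \bar{y}(j) \big), \]
where
\[ \bar{y}(i) = \frac{1}{N} \sum_{\alpha=1}^{N} y_\alpha(i). \]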
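The finite-$N$ version of this estimator is easy to write down directly. The sketch below stores the $N$ measurements as the columns of an $M \times N$ array (a layout chosen here for illustration) and evaluates the sum above; the function name is hypothetical.

```python
import numpy as np


def sample_covariance(samples):
    """Sample covariance of N measurements stored as the columns of an (M, N) array.

    Entry (i, j) is 1/(N-1) * sum_alpha (y_alpha(i) - ybar(i)) * (y_alpha(j) - ybar(j)).
    """
    _, n = samples.shape
    y_bar = samples.mean(axis=1, keepdims=True)  # ybar(i), one mean per coordinate
    centered = samples - y_bar
    return centered @ centered.T / (n - 1)


if __name__ == "__main__":
    # Three measurements of a 2-dimensional vector; the second coordinate is twice the first.
    y = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0]])
    print(sample_covariance(y))
```

The result agrees with `numpy.cov`, which uses the same $1/(N-1)$ normalization by default.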
8 Model: we assume that $y$ is a linear mix of some fundamental independent random variables $x = (x(1), x(2), \ldots, x(M'))$:
\[ y = Tx, \qquad T = T_{M \times M'}. \]
If you measure $y$ $N$ times, then
\[ Y = (y_1, y_2, \ldots, y_N) = TX = T (x_1, x_2, \ldots, x_N), \]
and we know
\[ \mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \big( T X (1 - ee^*) X^* T^* \big)_{ij}. \]
For example: let $v$ be a fixed vector, $x$ a random vector and $\xi$ a random variable, all independent. Then
\[ y = x + \xi v = (I, v) \binom{x}{\xi} = T \tilde{x}, \qquad \tilde{x} := \binom{x}{\xi}. \]
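The matrix form of the estimator can be checked against the usual sample covariance of $Y = TX$. This is a small sketch under the assumption that $e = N^{-1/2}(1, \ldots, 1)^*$ is the normalized all-ones vector in $\mathbb{R}^N$, so that $1 - ee^*$ removes the column mean; the function name and the sizes are hypothetical.

```python
import numpy as np


def covariance_via_projection(t, x):
    """Compute T X (1 - e e^*) X^* T^* / (N - 1) for real T (M x M') and X (M' x N)."""
    n = x.shape[1]
    e = np.ones((n, 1)) / np.sqrt(n)   # normalized all-ones vector in R^N
    centering = np.eye(n) - e @ e.T    # projection that subtracts the mean over the N samples
    return t @ x @ centering @ x.T @ t.T / (n - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = rng.standard_normal((3, 4))    # hypothetical mixing matrix, M = 3, M' = 4
    x = rng.standard_normal((4, 30))   # N = 30 measurements of x
    y = t @ x
    print(np.allclose(covariance_via_projection(t, x), np.cov(y)))
```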
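9 Model:
\[ y = Tx, \qquad T = T_{M \times M'}, \qquad x = (x(1), x(2), \ldots, x(M')), \]
\[ \mathrm{Cov}(y(i), y(j)) = \lim_{N \to \infty} \frac{1}{N-1} \big( T X (1 - ee^*) X^* T^* \big)_{ij}. \]
Without loss of generality, we assume that $\mathbb{E} x(i) = 0$ (by defining $x'(i) = x(i) - \mathbb{E} x(i)$), and we assume $\mathbb{E} |x(i)|^2 = 1$ (by rescaling $T$). With this setting,
\[ \mathrm{Cov}(y(i), y(j)) = (TT^*)_{ij}. \]
Recall the previous example:
\[ y = x + \xi v, \qquad \mathbb{E} |x(i)|^2 = 1, \qquad \mathbb{E} |\xi|^2 = 1, \]
\[ y = (I, v) \binom{x}{\xi} = T \tilde{x}, \qquad TT^* = I + vv^*. \]
Here $\|v\|_2 \sim 1$. As we can see, $TT^*$ has a large e.value $1 + \|v\|_2^2$ with e.vector $v$, and they are related to "signals".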
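The claim about $TT^* = I + vv^*$ can be verified on a small example. The vector `v` below is a hypothetical choice with $\|v\|_2^2 = 9$, so the large eigenvalue should be $10$ and all others should equal $1$.

```python
import numpy as np


def signal_plus_noise_mixing(v):
    """Mixing matrix T = (I, v) of size M x (M + 1), so that T (x, xi) = x + xi * v."""
    m = v.shape[0]
    return np.hstack([np.eye(m), v.reshape(m, 1)])


v = np.array([1.0, 2.0, 2.0])          # hypothetical signal direction, ||v||^2 = 9
t = signal_plus_noise_mixing(v)
cov = t @ t.T                          # equals I + v v^*
evals, evecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
top_value = evals[-1]                  # 1 + ||v||^2
top_vector = evecs[:, -1]              # parallel to v, up to sign

if __name__ == "__main__":
    print("eigenvalues:", evals)
    print("overlap with v:", abs(top_vector @ v) / np.linalg.norm(v))
```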
10 Though
\[ \mathrm{Cov}(y(i), y(j)) = (TT^*)_{ij} = \lim_{N \to \infty} \frac{1}{N-1} \big( T X (1 - ee^*) X^* T^* \big)_{ij}, \]
in most cases it does not work very well, since $N$ needs to be very large (like $N \gg M$).
The basic idea is to estimate the principal components (large e.values and their e.vectors) of the matrix $TT^*$ with those of the matrix $T X (1 - ee^*) X^* T^*$, or just $T XX^* T^*$.
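A short experiment illustrates why $N \gg M$ is needed. The sketch below assumes Gaussian data with identity population covariance (all true eigenvalues equal to 1) and reports the smallest and largest eigenvalues of the sample covariance matrix; the sizes are hypothetical.

```python
import numpy as np


def sample_covariance_extremes(m, n, seed=0):
    """Smallest and largest eigenvalue of the sample covariance of N i.i.d. N(0, I_M) vectors."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((m, n))   # population covariance is the identity
    ev = np.linalg.eigvalsh(np.cov(x))
    return ev[0], ev[-1]


if __name__ == "__main__":
    print("M = 200, N = 400:  ", sample_covariance_extremes(200, 400))
    print("M = 20,  N = 20000:", sample_covariance_extremes(20, 20000))
```

When $N$ is only twice $M$, the sample eigenvalues spread far from 1, while for $N$ a thousand times $M$ they concentrate near the true value.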
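11 PCA model:
\[ TT^* = \sum_{\beta} d_\beta v_\beta v_\beta^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*, \qquad |\text{signal}| = O(1), \]
\[ T = T_{M \times M'}, \qquad M' = M + O(1). \]
The noise e.values are comparable: $c \le d_\beta \le C$ for $\beta \in \text{noise}$, and $\log N \sim \log M$.
$d_\beta$ and $v_\beta$ could depend on $N$ or $M$.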
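12 Basic picture ($\lambda_{\text{noise}} = 1$ and only one $\lambda_{\text{signal}}$ case): Let
\[ d := (N/M)^{1/2} \big( \lambda_{\text{signal}} - \lambda_{\text{noise}} \big). \]
An outlier appears when $d > 1$, and the outlier $\mu$ satisfies
\[ \mu = \gamma + d + d^{-1} + \text{error}. \]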
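The outlier location can be compared with a simulation. This sketch rests on several assumptions that are not stated on the slide: $\gamma$ is read as $(M/N)^{1/2} + (N/M)^{1/2}$ (the center of the noise spectrum when the entries of $X$ have variance $(MN)^{-1/2}$), the entries are Gaussian, and $T$ is a square diagonal matrix with a single signal eigenvalue. Function names and sizes are hypothetical.

```python
import numpy as np


def top_two_eigenvalues(m, n, d, seed=0):
    """Two largest eigenvalues of T X X^* T^* with one signal of strength d."""
    rng = np.random.default_rng(seed)
    phi = m / n
    x = rng.standard_normal((m, n)) * (m * n) ** -0.25   # entry variance (MN)^{-1/2}
    lam_signal = 1.0 + np.sqrt(phi) * d                  # inverts d = (N/M)^{1/2} (lam_signal - 1)
    t_diag = np.ones(m)
    t_diag[0] = np.sqrt(lam_signal)                      # T T^* = diag(lam_signal, 1, ..., 1)
    tx = t_diag[:, None] * x
    ev = np.linalg.eigvalsh(tx @ tx.T)
    return ev[-1], ev[-2]


def predicted_outlier(m, n, d):
    """gamma + d + 1/d, with gamma assumed to be (M/N)^{1/2} + (N/M)^{1/2}."""
    phi = m / n
    gamma = np.sqrt(phi) + 1.0 / np.sqrt(phi)
    return gamma + d + 1.0 / d


if __name__ == "__main__":
    mu, second = top_two_eigenvalues(300, 600, d=3.0)
    print("observed outlier:", mu, "predicted:", predicted_outlier(300, 600, 3.0))
    print("second eigenvalue:", second)
```

Under these assumptions the largest eigenvalue detaches from the rest, while the second one stays near the upper edge $\gamma + 2$ of the noise spectrum.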
13 Detection of $v_{\text{signal}}$: Let $u$ be the eigenvector of the outlier $\mu$. Then for any fixed normalized $w$ we have
\[ (w, u)^2 = f_\mu\, (w, v_{\text{signal}})^2 + \text{error}. \]
Distribution of $u$?
1. $\theta(v_{\text{signal}}, u) = \arccos \sqrt{f_\mu} + \text{error}$.
2. Delocalization in any direction orthogonal to $v_{\text{signal}}$, i.e., if $(w, v_{\text{signal}}) = 0$, then $|(w, u)| \le M^{-1/2+\varepsilon}$.
Briefly speaking, $u - (u, v_{\text{signal}})\, v_{\text{signal}}$ is random and isotropic.
Two-outlier case: see graph.
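14 Application of delocalization: Assume we know that $v_{\text{signal}} \in \mathbb{R}^M$ has only $\widetilde{M}$ non-zero components, $\widetilde{M} \ll M$, and
\[ |v_{\text{signal}}(i)| \sim \widetilde{M}^{-1/2} \quad \text{if } v_{\text{signal}}(i) \ne 0. \]
Then:
1. if $v_{\text{signal}}(i) = 0$, the delocalization property shows $|u(i)| \le M^{-1/2+\varepsilon}$;
2. if $v_{\text{signal}}(i) \ne 0$, the parallel property shows $|u(i)| \ge \widetilde{M}^{-1/2-\varepsilon}$.
Using this method, we can find out which components of $v_{\text{signal}}$ are non-zero.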
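The separation between the two cases suggests reading off the support of $v_{\text{signal}}$ by thresholding the entries of the outlier eigenvector. The sketch below is one possible illustration, not the procedure of the talk: it assumes Gaussian noise, a symmetric $T = I + (\sqrt{\lambda_{\text{signal}}} - 1)\, vv^*$, a known sparsity level, and a threshold of half the non-zero magnitude. All names and sizes are hypothetical.

```python
import numpy as np


def detect_support(m=400, n=800, support_size=10, d=5.0, seed=0):
    """Estimate the non-zero coordinates of a sparse signal from the top eigenvector."""
    rng = np.random.default_rng(seed)
    phi = m / n
    support = np.arange(support_size)              # hypothetical true support
    v = np.zeros(m)
    v[support] = support_size ** -0.5              # unit vector with equal non-zero entries
    lam_signal = 1.0 + np.sqrt(phi) * d            # signal eigenvalue of T T^*
    x = rng.standard_normal((m, n)) * (m * n) ** -0.25
    # T = I + (sqrt(lam_signal) - 1) v v^*, hence T T^* = I + (lam_signal - 1) v v^*.
    tx = x + (np.sqrt(lam_signal) - 1.0) * np.outer(v, v @ x)
    _, evecs = np.linalg.eigh(tx @ tx.T)
    u = evecs[:, -1]                               # eigenvector of the outlier
    threshold = 0.5 * support_size ** -0.5         # assumed: half the non-zero magnitude
    detected = np.flatnonzero(np.abs(u) > threshold)
    return detected, support


if __name__ == "__main__":
    detected, support = detect_support()
    print("detected:", detected)
    print("true:    ", support)
```

The threshold uses the sparsity level $\widetilde{M}$, which is taken as known here; in practice it would have to be chosen from the data.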
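15 Some previous results: Eigenvalues.
\[ T XX^* T^*, \qquad T = T_{M \times M'}, \qquad TT^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*. \]
Baik and Silverstein (2006): $p = q$, $\lambda_{\text{noise}} = 1$; for fixed $d$, they obtain the limit of $\mu$.
Bai and Yao (2008): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ is of the form
\[ T = \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} \]
(where $A$ is an $O(1) \times O(1)$ matrix); for fixed $d$, they obtain the CLT of $\mu$.
Bai and Yao (2008): $p = q$, $T$ is a symmetric matrix of the form
\[ T = \begin{pmatrix} A & 0 \\ 0 & \widetilde{T} \end{pmatrix} \]
(where $A$ is an $O(1) \times O(1)$ matrix); for fixed $d$, they obtain the limit of $\mu$.
Nadler (2008): the spiked covariance model, $T = (I, \lambda e_1)$.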
16 Some previous results: Eigenvector. For any fixed $w$, we have
\[ (w, u)^2 = f_d\, (w, v_{\text{signal}})^2 + \text{error}. \]
Paul (2007): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ is diagonal and $X_{ij}$ is Gaussian.
Shi (2013): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ is diagonal.
Benaych-Georges, Nadakuditi (2010): $p = q$, $\lambda_{\text{noise}} = 1$, $T$ is a random symmetric matrix independent of $X$, and either $X_{ij}$ is Gaussian or $T$ is isotropic.
Benaych-Georges, Nadakuditi (2012): $T = (I, \lambda v)$ with random isotropic $v$.
Results: limit of $(u, v_{\text{signal}})^2$, except the first one.
17 Main results:
1. Rigidity of e.values (including outliers), up to an $N^{\varepsilon}$ factor: $\lambda_i - \gamma_i = \text{error}$.
2. Delocalization of e.vectors of non-outliers.
3. Direction of the e.vectors of outliers: $u - (u, v_{\text{signal}})\, v_{\text{signal}}$ is random and isotropic.
4. Some eigenvector information can be detected even if $d = 1 - o(1)$.
5. TW distribution of the largest $k$ non-outliers. El Karoui (2007): Gaussian case.
6. Isotropic law of $H + A$ or $T XX^* T^*$.
7. Bulk universality with 4 moment match (for $H + A$ or $T XX^* T^*$).
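18 Strategy: With $V_{M \times M}$, $U_{M' \times M'}$ and diagonal $D_{M \times M}$,
\[ T = V D (I_M, 0) U, \qquad TT^* = \sum_{\beta \in \text{signal}} d_\beta v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*. \]
Define
\[ S = S^*, \qquad SS^* = \sum_{\beta \in \text{signal}} 1 \cdot v_\beta v_\beta^* + \sum_{\beta \in \text{noise}} d_\beta v_\beta v_\beta^*. \]
Note: $S$ has no signal.
Represent
\[ G := \big( T X (1 - ee^*) X^* T^* - z \big)^{-1} \]
with $G_S$, $G_S X$, $X^* G_S X$, etc., where
\[ G_S = \big( S XX^* S^* - z \big)^{-1}. \]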
19 Question: Let $A$ be a matrix with only one non-zero entry and $\widetilde{X} = X + A$. How does $(XX^* - z)^{-1}$ compare with $(\widetilde{X} \widetilde{X}^* - z)^{-1}$?
20 Isotropic law:
Wigner: Let $H$ be a Wigner matrix and $G = (H - z)^{-1}$. Then for any fixed $w$ and $v$, we have
\[ \big( w, (H - z)^{-1} v \big) = m(z)\, (w, v) + O\big( (N\eta)^{-1} (\log N)^{C} \big), \]
where $\eta = \operatorname{Im} z$ and $m(z) = \int \rho_{sc}(x) (x - z)^{-1} \, dx$. Knowles and Y (2011).
PCA: For fixed $w$ and $v$, what are the behaviors of
\[ (w, G_S v), \qquad (w, G_S X v), \qquad (w, X^* G_S X v), \qquad G_S = \big( S XX^* S^* - z \big)^{-1} \, ? \]
Bloemendal, Erdős, Knowles, Yau and Y (2013): the $S = I_{M \times M}$ case.
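The isotropic law for Wigner matrices can be checked numerically at a fixed spectral parameter. The sketch below assumes a Gaussian (GOE-type) matrix and uses the closed form of $m(z)$ as the root of $m^2 + zm + 1 = 0$ with positive imaginary part; the test vectors and function names are hypothetical.

```python
import numpy as np


def semicircle_stieltjes(z):
    """Stieltjes transform of the semicircle law: root of m^2 + z m + 1 = 0 with Im m > 0."""
    s = np.sqrt(z * z - 4 + 0j)
    m = (-z + s) / 2
    return m if m.imag > 0 else (-z - s) / 2


def resolvent_form(h, z, w, v):
    """Bilinear form (w, (H - z)^{-1} v) for real vectors w and v."""
    n = h.shape[0]
    return w @ np.linalg.solve(h - z * np.eye(n), v)


if __name__ == "__main__":
    n, z = 400, 0.3 + 0.5j
    rng = np.random.default_rng(0)
    g = rng.standard_normal((n, n))
    h = (g + g.T) / np.sqrt(2.0 * n)        # Gaussian Wigner matrix (an assumption)
    v = np.ones(n) / np.sqrt(n)             # a fixed, fully delocalized unit vector
    w = np.zeros(n)
    w[0] = 1.0                              # a fixed coordinate vector
    m = semicircle_stieltjes(z)
    print("(v, G v) =", resolvent_form(h, z, v, v), " m(z) =", m)
    print("(w, G v) =", resolvent_form(h, z, w, v), " m(z)(w, v) =", m * (w @ v))
```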
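21 Isotropic law for general $S$ or general $A$:
\[ \big( S XX^* S^* - z \big)^{-1}, \qquad (H + A - z)^{-1}. \]
Knowles and Y (2014): Let $A = U D U^*$ with $D = \operatorname{diag}(d_1, d_2, \ldots, d_N)$, where $|d_i| \le C$. Define
\[ m_i = (d_i - z - m)^{-1}, \qquad m := \frac{1}{N} \sum_{j} m_j. \]
Then for fixed $w$ and $v \in \mathbb{R}^N$,
\[ \big( w, (H + A - z)^{-1} v \big) = \big( w, (A - z - m)^{-1} v \big) + \text{error}. \]
Based on this result: rigidity, delocalization, TW law. (Capitaine, Péché 2014: GOE $+ A$.)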
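The self-consistent equation for $m$ can be solved by a simple fixed-point iteration and the resulting deterministic matrix compared with the resolvent of $H + A$. The sketch below assumes a Gaussian $H$, the eigenvalues $d_k$ from the example on slide 2, a random orthogonal $U$, and plain iteration started at $m = i$; these choices and the function names are hypothetical.

```python
import numpy as np


def solve_deformed_m(d, z, iterations=500):
    """Iterate m <- (1/N) sum_j (d_j - z - m)^{-1}, starting from m = i."""
    m = 1j
    for _ in range(iterations):
        m = np.mean(1.0 / (d - z - m))
    return m


def compare_resolvents(n=400, z=0.3 + 0.5j, seed=0):
    """Return (w, (H + A - z)^{-1} v) and (w, (A - z - m)^{-1} v) for w = v = e_1."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    h = (g + g.T) / np.sqrt(2.0 * n)                 # Gaussian Wigner matrix (an assumption)
    d = np.full(n, 2.0)
    d[: n // 2] = 1.0
    d[0] = 10.0                                      # eigenvalues of A as in the earlier example
    u, _ = np.linalg.qr(rng.standard_normal((n, n))) # random orthogonal U (an assumption)
    a = (u * d) @ u.T                                # A = U diag(d) U^*
    m = solve_deformed_m(d, z)
    e1 = np.zeros(n)
    e1[0] = 1.0
    lhs = e1 @ np.linalg.solve(h + a - z * np.eye(n), e1)
    rhs = e1 @ (u @ ((u.T @ e1) / (d - z - m)))      # (A - z - m)^{-1} = U diag(1/(d - z - m)) U^*
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = compare_resolvents()
    print("random side:       ", lhs)
    print("deterministic side:", rhs)
```

With $\operatorname{Im} z$ of order one the iteration settles quickly; closer to the real axis a damped update may be needed.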
22 Basic idea of proving the isotropic law of $H + A$:
1. The isotropic law of GOE $+ A$ (polynomialization method). Bloemendal, Erdős, Knowles, Yau and Y (2013).
2. Compare $(H + A - z)^{-1}$ with $(\mathrm{GOE} + A - z)^{-1}$ using the Newton method.
Let $\nu$ be the distribution density of $H_{ij}$ and $\nu_G$ the distribution density of the entries of GOE. Let $H_t$ be the Wigner matrix whose entries have distribution
\[ \nu_t = t \nu + (1 - t) \nu_G. \]
This is a continuous bridge between $H$ ($t = 1$) and GOE ($t = 0$).
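Since $\nu_t$ is a mixture, a matrix $H_t$ can be sampled by drawing each independent entry from $\nu$ with probability $t$ and from the Gaussian law with probability $1 - t$. The sketch below takes $\nu$ to be the Rademacher ($\pm 1$) law purely as an illustration and, for simplicity, gives the diagonal the same variance as the off-diagonal entries (unlike GOE); the function name is hypothetical.

```python
import numpy as np


def interpolated_wigner(n, t, rng):
    """Symmetric matrix whose independent entries follow nu_t = t*nu + (1-t)*nu_G.

    nu is taken to be the Rademacher law (an assumption for illustration);
    nu_G is the standard Gaussian. Entries are scaled by n^{-1/2}.
    """
    rademacher = rng.choice([-1.0, 1.0], size=(n, n))
    gaussian = rng.standard_normal((n, n))
    use_nu = rng.random((n, n)) < t                # entry comes from nu with probability t
    entries = np.where(use_nu, rademacher, gaussian)
    upper = np.triu(entries)                       # independent entries on and above the diagonal
    sym = upper + np.triu(entries, 1).T            # mirror the strict upper triangle
    return sym / np.sqrt(n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for t in (0.0, 0.5, 1.0):
        h_t = interpolated_wigner(200, t, rng)
        print(t, np.linalg.eigvalsh(h_t)[[0, -1]])  # extreme eigenvalues stay near -2 and 2
```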
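23 Recall, with $H_1 = H$ and $H_0 = \mathrm{GOE}$,
\[ \mathbb{E} F(H) - \mathbb{E} F(H_0) = \int_0^1 \partial_t\, \mathbb{E} F(H_t) \, dt. \]
Let $H_{t,k,l,0}$ be the Wigner matrix whose entries have the same distribution as those of $H_t$, except that the $(k,l)$ entry of $H_{t,k,l,0}$ has distribution $\nu$.
Let $H_{t,k,l,1}$ be the Wigner matrix whose entries have the same distribution as those of $H_t$, except that the $(k,l)$ entry of $H_{t,k,l,1}$ has distribution $\nu_G$.
Then, since $\partial_t \nu_t = \nu - \nu_G$ for each entry,
\[ \partial_t\, \mathbb{E} F(H_t) = \sum_{k,l} \Big( \mathbb{E} F(H_{t,k,l,0}) - \mathbb{E} F(H_{t,k,l,1}) \Big). \]
Note: $H_{t,k,l,0}$ and $H_{t,k,l,1}$ are very close to $H_t$.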
24 For example:
\[ F_{i,j,p}(H) = \Big| \big[ (H - z)^{-1} \big]_{ij} \Big|^{2p}. \]
Goal: create a self-consistent differential equation for
\[ \big( \mathbb{E} F_{i,j,p}(H_t) \big)_{i,j=1}^{N} \]
which is stable.
25 Thank you
More informationThe Eigenvector Moment Flow and local Quantum Unique Ergodicity
The Eigenvector Moment Flow and local Quantum Unique Ergodicity P. Bourgade Cambridge University and Institute for Advanced Study E-mail: bourgade@math.ias.edu H.-T. Yau Harvard University and Institute
More informationOn the concentration of eigenvalues of random symmetric matrices
On the concentration of eigenvalues of random symmetric matrices Noga Alon Michael Krivelevich Van H. Vu April 23, 2012 Abstract It is shown that for every 1 s n, the probability that the s-th largest
More informationDIAGONALIZATION. In order to see the implications of this definition, let us consider the following example Example 1. Consider the matrix
DIAGONALIZATION Definition We say that a matrix A of size n n is diagonalizable if there is a basis of R n consisting of eigenvectors of A ie if there are n linearly independent vectors v v n such that
More informationLarge sample covariance matrices and the T 2 statistic
Large sample covariance matrices and the T 2 statistic EURANDOM, the Netherlands Joint work with W. Zhou Outline 1 2 Basic setting Let {X ij }, i, j =, be i.i.d. r.v. Write n s j = (X 1j,, X pj ) T and
More information«Random Vectors» Lecture #2: Introduction Andreas Polydoros
«Random Vectors» Lecture #2: Introduction Andreas Polydoros Introduction Contents: Definitions: Correlation and Covariance matrix Linear transformations: Spectral shaping and factorization he whitening
More informationANOVA: Analysis of Variance - Part I
ANOVA: Analysis of Variance - Part I The purpose of these notes is to discuss the theory behind the analysis of variance. It is a summary of the definitions and results presented in class with a few exercises.
More informationThe circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)
The circular law Lewis Memorial Lecture / DIMACS minicourse March 19, 2008 Terence Tao (UCLA) 1 Eigenvalue distributions Let M = (a ij ) 1 i n;1 j n be a square matrix. Then one has n (generalised) eigenvalues
More informationGaussian random variables inr n
Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b
More informationSpectral Regularization
Spectral Regularization Lorenzo Rosasco 9.520 Class 07 February 27, 2008 About this class Goal To discuss how a class of regularization methods originally designed for solving ill-posed inverse problems,
More informationNext is material on matrix rank. Please see the handout
B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0
More information