Principal component analysis and the asymptotic distribution of high-dimensional sample eigenvectors
1 Principal component analysis and the asymptotic distribution of high-dimensional sample eigenvectors. Kristoffer Hellton, Department of Mathematics, University of Oslo. May 12, 2015.
2–3 ...and related topics. How are confidence distributions relevant for principal component analysis? Principal component analysis (PCA) deals with the distributions of sample eigenvalues, sample eigenvectors, and projections onto the sample eigenvector space.
4–5 Motivation. Principal component analysis (PCA) has become a widely used dimension-reduction technique in areas with high-dimensional data: genetics, signal processing, chemometrics, climate, finance, imaging... The method constructs a low-dimensional representation of each observation, which can be used to visualize high-dimensional data or as input in regression, clustering, or classification.
6 Example: Genetic markers, Novembre et al. (2008). [Figure.]
7 Population principal components. For a $p$-dimensional random variable $X = (X_1, \ldots, X_p)^T$ with expectation $E[X] = 0$ and covariance matrix $\mathrm{Var}(X) = \Sigma$, the principal components of $X$ are defined as the linear projections given by the eigenvectors $v_1, \ldots, v_p$ of $\Sigma$:
$$s_i^T = v_i^T X.$$
The variance of each projection will be the eigenvalue, $\mathrm{Var}(s_i) = \lambda_i$.
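As a minimal sketch (not from the slides), this definition can be checked numerically for a hand-picked $2 \times 2$ covariance matrix; the matrix `Sigma`, the sample size, and all variable names below are illustrative assumptions.

```python
import numpy as np

# Sketch: population principal components of a hand-picked covariance
# matrix Sigma, with Var(s_i) = lambda_i checked by simulation.
rng = np.random.default_rng(0)

Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])            # example covariance, p = 2

lam, V = np.linalg.eigh(Sigma)            # eigh: ascending eigenvalues
lam, V = lam[::-1], V[:, ::-1]            # reorder so lambda_1 >= lambda_2

# Draw X ~ N(0, Sigma) and project onto the eigenvectors: s_i = v_i^T X.
X = rng.multivariate_normal(np.zeros(2), Sigma, size=100_000).T   # p x n
S = V.T @ X                               # row i is the i-th principal component

print(lam)                                # population eigenvalues lambda_i
print(S.var(axis=1))                      # empirical Var(s_i), close to lambda_i
```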
8 [Figure: distribution ellipse in the $(X_1, X_2)$-plane with the eigenvector $v_1$ along its major axis.] For the normal distribution $X \sim N(0, \Sigma)$, the first eigenvector $v_1$ of $\Sigma$ corresponds to the major axis of the distribution ellipse.
9 Sample principal components. For a $p \times n$ data matrix $X = [x_1, \ldots, x_n]$, the principal components are given by the eigenvectors and eigenvalues of the sample covariance matrix
$$\hat{\Sigma} = \frac{1}{n-1} \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})^T,$$
denoted by $\hat{v}_1, \ldots, \hat{v}_p$ and $\hat{\lambda}_1, \ldots, \hat{\lambda}_p$, such that the $i$th sample principal component is given by $\hat{s}_i^T = \hat{v}_i^T X$, for $i = 1, \ldots, p$.
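A short sketch of this computation, following the slide's $p \times n$ convention; the toy data and names are my own.

```python
import numpy as np

# Sketch: sample principal components from the sample covariance matrix.
# Columns of X are observations, as on the slide.
rng = np.random.default_rng(1)
p, n = 5, 200
X = rng.standard_normal((p, n))           # toy data; any p x n matrix works

xbar = X.mean(axis=1, keepdims=True)
Sigma_hat = (X - xbar) @ (X - xbar).T / (n - 1)    # sample covariance

lam_hat, V_hat = np.linalg.eigh(Sigma_hat)
lam_hat, V_hat = lam_hat[::-1], V_hat[:, ::-1]     # descending eigenvalues

S_hat = V_hat.T @ X        # row i is the i-th sample principal component
```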
10–12 [Figure: data points in the $(X_1, X_2)$-plane with the fitted line $\hat{v}_1$.] The first sample eigenvector $\hat{v}_1$ will then be the line minimizing the sum of squares of the distances from the line to each data point.
13–14 Distribution of eigenvalues and eigenvectors. The exact and asymptotic distributions of the sample eigenvalues and eigenvectors were explored and established by James (1960, 1964) and Anderson (1963, 1965). If $X \sim N(0, \Sigma)$ and $\lambda_1 > \cdots > \lambda_p$, both the sample eigenvalues and eigenvectors are asymptotically normally distributed for fixed $p$ as $n \to \infty$:
$$\sqrt{n}\,(\hat{\lambda}_i - \lambda_i) \xrightarrow{d} N\big(0, 2\lambda_i^2\big), \qquad \sqrt{n}\,(\hat{v}_i - v_i) \xrightarrow{d} N\Big(0, \sum_{k=1, k \neq i}^{p} \frac{\lambda_i \lambda_k}{(\lambda_i - \lambda_k)^2}\, v_k v_k^T\Big).$$
Problems in the high-dimensional setting ($p \gg n$): the sample covariance matrix is rank deficient, and there is no obvious asymptotic framework.
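Before turning to these high-dimensional problems, the classical fixed-$p$ limit for the largest eigenvalue can be checked by simulation; this sketch, with parameters of my own choosing, compares the Monte Carlo variance of $\sqrt{n}(\hat{\lambda}_1 - \lambda_1)$ with the asymptotic value $2\lambda_1^2$.

```python
import numpy as np

# Sketch: Monte Carlo check of the fixed-p asymptotics on the slide,
# sqrt(n)(lambda1_hat - lambda_1) -> N(0, 2*lambda_1^2); names are mine.
rng = np.random.default_rng(2)
lam = np.array([4.0, 2.0, 1.0])           # distinct population eigenvalues
n, reps = 2000, 500

stats = []
for _ in range(reps):
    X = rng.standard_normal((n, 3)) * np.sqrt(lam)   # Sigma = diag(lam)
    Sigma_hat = X.T @ X / (n - 1)          # mean is known to be zero here
    lam1_hat = np.linalg.eigvalsh(Sigma_hat)[-1]     # largest eigenvalue
    stats.append(np.sqrt(n) * (lam1_hat - lam[0]))

print(np.var(stats), 2 * lam[0] ** 2)     # empirical vs asymptotic variance
```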
15–18 High-dimensional setting. When $p > n$, one assumes the $m$ first eigenvalues to be substantially larger than the remaining eigenvalues (the spiked covariance model):
$$\underbrace{\lambda_1 > \cdots > \lambda_m}_{\text{Signal}} \gg \underbrace{\lambda_{m+1} > \cdots > \lambda_p}_{\text{Noise}}.$$
Two possible asymptotic frameworks to work within:
1. Random matrix theory (Bai, Silverstein): for fixed population eigenvalues $\lambda_i$, let $n, p \to \infty$ such that $p/n \to \gamma > 0$.
2. High dimension low sample size (Marron, Jung): for fixed sample size $n$, let $p \to \infty$ as the $m$ eigenvalues scale as $\lambda_i = \sigma_i^2 p^\alpha$, $\alpha > 0$ (see the sketch after this list).
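A sketch of generating data from the spiked model with a single spike ($m = 1$) under the HDLSS scaling; the function name, defaults, and the choice $\Sigma = \mathrm{diag}(\lambda)$ are my assumptions.

```python
import numpy as np

# Sketch of HDLSS spiked-model data with m = 1:
# lambda_1 = sigma^2 * p^alpha, lambda_2 = ... = lambda_p = tau^2.
def spiked_sample(n, p, sigma2=1.0, tau2=1.0, alpha=1.0, rng=None):
    rng = rng or np.random.default_rng()
    lam = np.full(p, tau2)
    lam[0] = sigma2 * p**alpha            # the spike; v_1 is the first axis
    return rng.standard_normal((p, n)) * np.sqrt(lam)[:, None]

X = spiked_sample(n=20, p=1000)           # p >> n, as in the HDLSS framework
```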
19–20 HDLSS framework. The value $\alpha = 1$ in the high dimension low sample size framework is a special case (Jung, Sen & Marron, 2012). Simplest setting: for normally distributed data and $m = 1$,
$$\lambda_1 = \sigma^2 p, \qquad \lambda_2 = \cdots = \lambda_p = \tau^2,$$
the distribution of the first sample eigenvalue converges as $p \to \infty$:
$$\hat{\sigma}^2 = p^{-1} \hat{\lambda}_1 \xrightarrow{d} \sigma^2 \frac{\chi^2_n}{n} + \frac{\tau^2}{n}.$$
Here $(n\hat{\sigma}^2 - \tau^2)/\sigma^2$ will be a pivot, giving a confidence distribution for $\sigma^2$:
$$C(\sigma^2) = 1 - \Gamma_n\Big(\frac{n\hat{\sigma}^2 - \tau^2}{\sigma^2}\Big),$$
where $\Gamma_n(\cdot)$ is the cdf of $\chi^2_n$.
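A sketch of this confidence distribution in code, assuming $\tau^2$ is known; the function names, the grid, and the use of $|1 - 2C(\sigma^2)|$ as the plotted confidence curve are my assumptions, not taken from the slides.

```python
import numpy as np
from scipy.stats import chi2

# Sketch of the confidence distribution above:
# C(sigma^2) = 1 - Gamma_n((n*sigma2_hat - tau^2) / sigma^2), tau^2 known.
def confidence_distribution(sigma2_grid, lam1_hat, n, p, tau2):
    sigma2_hat = lam1_hat / p             # sigma2_hat = p^{-1} * lambda1_hat
    return 1.0 - chi2.cdf((n * sigma2_hat - tau2) / sigma2_grid, df=n)

# A confidence curve cc(sigma^2) = |1 - 2*C(sigma^2)| over a grid,
# as plotted on the next slide.
sigma2_grid = np.linspace(0.1, 5.0, 200)
C = confidence_distribution(sigma2_grid, lam1_hat=550.0, n=10, p=500, tau2=1.0)
cc = np.abs(1.0 - 2.0 * C)
```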
21–22 Confidence curve for the first eigenvalue. [Figures: confidence curves for $\sigma^2$ with $n = 100$, $p = 500$, and with $n = 10$, $p = 500$.]
23 Distribution of eigenvectors (Jung, Sen & Marron, 2012). For normally distributed data, $X \sim N(0, \Sigma)$, and the population eigenvalues $\lambda_1 = \sigma^2 p^\alpha$, $\lambda_2 = \cdots = \lambda_p = \tau^2$, the asymptotic distribution of the inner product between the first population eigenvector and the first sample eigenvector depends on $\alpha$ as $p \to \infty$:
$$\hat{v}_1^T v_1 \xrightarrow{d} \begin{cases} 1, & \alpha > 1, \\[2pt] \Big(1 + \dfrac{\tau^2}{\sigma^2 \chi^2_n}\Big)^{-1/2}, & \alpha = 1, \\[2pt] 0, & \alpha < 1. \end{cases}$$
Possible confidence distribution for the boundary case $\alpha = 1$?
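The boundary case $\alpha = 1$ can be illustrated by simulation; the Gram-matrix shortcut for $p \gg n$, the uncentered covariance (the mean is zero by construction), and all parameter choices below are mine.

```python
import numpy as np

# Sketch: simulate v1_hat^T v_1 at alpha = 1. One run gives one draw from
# the random limit (1 + tau^2/(sigma^2 * chi2_n))^(-1/2), strictly below 1.
rng = np.random.default_rng(3)
n, p, sigma2, tau2 = 10, 20_000, 1.0, 1.0

lam = np.full(p, tau2)
lam[0] = sigma2 * p                        # alpha = 1 spike, so v_1 = e_1
X = rng.standard_normal((p, n)) * np.sqrt(lam)[:, None]

# For p >> n, get v1_hat from the n x n Gram matrix instead of the
# p x p covariance matrix.
G = X.T @ X
_, W = np.linalg.eigh(G)
v1_hat = X @ W[:, -1]                      # eigenvector of the largest eigenvalue
v1_hat /= np.linalg.norm(v1_hat)

print(abs(v1_hat[0]))                      # |v1_hat^T v_1|, a draw from the limit
```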
24 Asymptotic inconsistency. [Figure: $\hat{v}_1$ and $v_1$ pointing in different directions in the $(X_1, X_2)$-plane.] In the high-dimensional setting, the sample eigenvectors do not, in general, converge to the population eigenvectors (Johnstone and Lu, 2009; Paul, 2007), one motivation behind sparse PCA. But if the sample eigenvectors are not consistent, how can classical PCA be a successful method in high-dimensional applications?
25 Distribution of sample projections. A possible answer can be given by the distribution of projections onto the sample eigenvector space. For the observations $X = [x_1, \ldots, x_n]$, the population and sample normalized principal component projections are defined as
$$z_j^T = \frac{v_j^T X}{\sqrt{\lambda_j}}, \qquad \hat{z}_j^T = \frac{\hat{v}_j^T X}{\sqrt{\hat{\lambda}_j}}.$$
These sample projections are also not consistent, just as the eigenvectors, BUT...
26 Theorem (Hellton and Thoresen, 2015). For $X \sim N(0, \Sigma)$ and the $m$ first eigenvalues $\lambda_1 = \sigma_1^2 p, \ldots, \lambda_m = \sigma_m^2 p$, the joint distribution of the $m$ first sample projections of the $j$th observation converges, as $p \to \infty$, to
$$\begin{pmatrix} \hat{z}_{j1} \\ \vdots \\ \hat{z}_{jm} \end{pmatrix} \xrightarrow{d} \underbrace{\begin{pmatrix} \sqrt{n/d_1} & & 0 \\ & \ddots & \\ 0 & & \sqrt{n/d_m} \end{pmatrix}}_{\text{Scaling}} \underbrace{\begin{pmatrix} u_1 & \cdots & u_m \end{pmatrix}^T}_{\text{Rotation}} \begin{pmatrix} \sigma_1 z_{j1} \\ \vdots \\ \sigma_m z_{jm} \end{pmatrix}$$
for $j = 1, \ldots, n$, where $d_i$ and $u_i$ are the $i$th eigenvalue and eigenvector of an $m \times m$ Wishart distributed matrix, $W \sim \text{Wishart}\big(n, \text{diag}(\sigma_1^2, \ldots, \sigma_m^2)\big)$.
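A sketch of drawing from this limit; reading the rotation as the transposed matrix of Wishart eigenvectors is my interpretation of the reconstructed display, and the parameter values are mine.

```python
import numpy as np
from scipy.stats import wishart

# Sketch of the scaling-rotation limit in the theorem: draw
# W ~ Wishart(n, diag(sigma_1^2, ..., sigma_m^2)), take its eigenpairs
# (d_i, u_i), and map population projections through scaling and rotation.
rng = np.random.default_rng(4)
n, m = 30, 2
sigma2 = np.array([3.0, 1.0])              # sigma_i^2 for the m spikes

W = wishart.rvs(df=n, scale=np.diag(sigma2), random_state=rng)
d, U = np.linalg.eigh(W)
d, U = d[::-1], U[:, ::-1]                 # descending: d_1 >= ... >= d_m

z = rng.standard_normal(m)                 # population projections, N(0, I_m)
scaling = np.diag(np.sqrt(n / d))          # diag(sqrt(n/d_i))
z_hat = scaling @ U.T @ (np.sqrt(sigma2) * z)   # limiting sample projections
```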
27–29 The vector of the first $m$ sample projections will be a rotated and scaled version, in $m$ dimensions, of the corresponding population projections. However, for the purpose of visualizing data one would plot pairs of sample projections in two dimensions. Simulations show that for moderate sample sizes, a two-dimensional representation of the sample projections will also be a scaled and rotated version of the population projections.
Implications for visualization: as the difference between the sample and population projections is only a scaling and rotation, the relative positions and the visual content will remain the same.
30 [Figure: different behavior seen in two samples (black dots) compared to their populations (circles); panel captions: "Scaling: 0.91 in x, 0.92 in y, Rotation: 3.7 degrees" and "Scaling: 0.99 in x, 0.93 in y, Rotation: 27.1 degrees".] Even though eigenvectors and projections are not consistent, the visual information of the population projections is conserved in the estimated projections.
31 Summing up. In connection with principal component analysis in high dimensions, there is a growing number of results on the distributions of sample eigenvalues, sample eigenvectors, and sample projections. Here, confidence distributions can play a role.
32 Thank you!