Principal Component Analysis


Principal Component Analysis
Yuanzhen Shao
MA 26500

Data as points in R^n
Assume that we have a collection of data in \( \mathbb{R}^n \):
\[ S = \{\, X_1 = (x_{11}, x_{12}, \dots, x_{1n})^T,\; X_2 = (x_{21}, x_{22}, \dots, x_{2n})^T,\; \dots,\; X_m = (x_{m1}, x_{m2}, \dots, x_{mn})^T \,\}. \]
Figure: A collection of data in \( \mathbb{R}^n \)

Data as points in R^n
Without loss of generality, we may assume that the mean of this collection of data is zero:
\[ \bar{X} = \frac{1}{m} \sum_{i=1}^m X_i = \mathbf{0}. \]
Otherwise, we just translate the data to have mean \( \mathbf{0} \) by looking at \( S = \{ X_1 - \bar{X},\, X_2 - \bar{X},\, \dots,\, X_m - \bar{X} \} \).
Figure: A collection of data in \( \mathbb{R}^n \)
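
To make the mean-centering step concrete, here is a minimal sketch in Python/NumPy. The data matrix and variable names are illustrative assumptions, not from the slides:

```python
import numpy as np

# Hypothetical data: m = 5 points in R^3, one point X_i per row.
X = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 2.0],
              [3.0, 2.0, 1.0],
              [1.0, 1.0, 1.0]])

X_bar = X.mean(axis=0)       # the mean (1/m) * sum_i X_i
Xc = X - X_bar               # translated data X_i - X_bar, now with mean 0

print(Xc.mean(axis=0))       # ~[0, 0, 0] up to floating-point rounding
```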

Variance of Data
Question 1: How can we determine a subspace that S is close to? Some examples: http://setosa.io/ev/principal-component-analysis/
Question 2: How can we determine the direction representing the largest variance of S?
Recall that if \( v \in \mathbb{R}^n \) is a unit vector, then the orthogonal projection of \( X_i \) onto the direction given by \( v \), i.e. onto \( \operatorname{span}\{v\} \), is
\[ \operatorname{Proj}_v X_i = (X_i, v)\, v = c_i v, \]
that is, \( c_i \) is the coordinate of \( X_i \) in the direction of \( v \).
Figure: Orthogonal projection
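
A brief sketch of this projection step, reusing the hypothetical centered data from the previous snippet; the direction v chosen here is an arbitrary illustration:

```python
import numpy as np

# Hypothetical centered data (mean ~ 0), one point X_i per row.
Xc = np.array([[ 0.6, -1.4,  0.0],
               [-0.4,  1.6, -1.0],
               [-1.4, -0.4,  1.0],
               [ 1.6,  0.6,  0.0],
               [-0.4, -0.4,  0.0]])

v = np.array([1.0, 1.0, 0.0])
v = v / np.linalg.norm(v)       # normalize so that v is a unit vector

c = Xc @ v                      # c[i] = (X_i, v), the coordinate of X_i along v
proj = np.outer(c, v)           # Proj_v X_i = c_i * v, one projected point per row
```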

Variance of Data
Answer: the direction representing the largest variance of S is given by the unit vector \( v \) that maximizes
\[ \frac{1}{m} \sum_{i=1}^m c_i^2 = \frac{1}{m} \sum_{i=1}^m (X_i, v)^2 \quad \text{among all unit vectors } v, \]
or, equivalently, that maximizes
\[ \sum_{i=1}^m c_i^2 = \sum_{i=1}^m (X_i, v)^2 \quad \text{among all unit vectors } v. \]
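
As a quick numerical illustration of this objective (a sketch; the candidate directions below are arbitrary choices, not from the slides), one can compare \( \sum_i (X_i, v)^2 \) for a few unit vectors:

```python
import numpy as np

# Hypothetical centered data, one point X_i per row (same as earlier sketches).
Xc = np.array([[ 0.6, -1.4,  0.0],
               [-0.4,  1.6, -1.0],
               [-1.4, -0.4,  1.0],
               [ 1.6,  0.6,  0.0],
               [-0.4, -0.4,  0.0]])

def captured_variance(v):
    """Return sum_i (X_i, v)^2 for the unit vector in the direction of v."""
    v = v / np.linalg.norm(v)
    return np.sum((Xc @ v) ** 2)

for v in (np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([1.0, -1.0, 0.0])):
    print(v, captured_variance(v))
```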

Optimization Problem
Question 3: How do we find the \( v \) that solves this optimization problem? Let
\[ A_{m \times n} = \begin{pmatrix} X_1^T \\ X_2^T \\ \vdots \\ X_m^T \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}. \]
Recall that \( (X_i, v) = X_i^T v \). So
\[ Av = \begin{pmatrix} X_1^T v \\ X_2^T v \\ \vdots \\ X_m^T v \end{pmatrix} = \begin{pmatrix} (X_1, v) \\ (X_2, v) \\ \vdots \\ (X_m, v) \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix}. \]
Therefore,
\[ \sum_{i=1}^m c_i^2 = (Av, Av) = v^T A^T A v = v^T \underbrace{C}_{=A^T A}\, v. \]
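
In code, forming \( C = A^T A \) and checking the identity \( \sum_i c_i^2 = v^T C v \) might look like the following sketch (same hypothetical data as above):

```python
import numpy as np

# Hypothetical centered data: A has X_i^T as its i-th row.
A = np.array([[ 0.6, -1.4,  0.0],
              [-0.4,  1.6, -1.0],
              [-1.4, -0.4,  1.0],
              [ 1.6,  0.6,  0.0],
              [-0.4, -0.4,  0.0]])

C = A.T @ A                       # C = A^T A, an n x n symmetric matrix

v = np.array([1.0, 1.0, 0.0])
v = v / np.linalg.norm(v)

c = A @ v                         # c_i = (X_i, v)
print(np.sum(c ** 2))             # sum_i c_i^2
print(v @ C @ v)                  # v^T C v -- the same number
```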

Optimization Problem
C is an \( n \times n \) symmetric matrix (consider why), and thus is diagonalizable. Assume that C has eigenvalues \( \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n \). Moreover, C has an orthonormal basis of eigenvectors \( v_1, v_2, \dots, v_n \) such that \( C v_i = \lambda_i v_i \). In particular,
\[ v_i^T C v_i = \lambda_i v_i^T v_i = \lambda_i \lVert v_i \rVert^2 = \lambda_i. \]
Claim: \( v_1 \) represents the direction of the largest variance of S.
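
A sketch of this eigendecomposition step using NumPy: np.linalg.eigh is appropriate here because C is symmetric, and it returns eigenvalues in ascending order, so they are reordered to get \( \lambda_1 \geq \cdots \geq \lambda_n \). The matrix C below is the hypothetical one from the previous sketch:

```python
import numpy as np

A = np.array([[ 0.6, -1.4,  0.0],
              [-0.4,  1.6, -1.0],
              [-1.4, -0.4,  1.0],
              [ 1.6,  0.6,  0.0],
              [-0.4, -0.4,  0.0]])
C = A.T @ A

eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues, orthonormal columns
order = np.argsort(eigvals)[::-1]      # reorder so lambda_1 >= lambda_2 >= ... >= lambda_n
lambdas = eigvals[order]
V = eigvecs[:, order]                  # V[:, i] is the eigenvector v_{i+1}

print(np.allclose(C @ V[:, 0], lambdas[0] * V[:, 0]))  # C v_1 = lambda_1 v_1
```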

Optimization Problem
Proof. If \( u = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \) is a unit vector in \( \mathbb{R}^n \), i.e.
\[ 1 = \lVert u \rVert^2 = a_1^2 + a_2^2 + \cdots + a_n^2 \quad \text{(consider why?)}, \]
then
\[ u^T C u = (a_1 v_1 + a_2 v_2 + \cdots + a_n v_n)^T (\lambda_1 a_1 v_1 + \lambda_2 a_2 v_2 + \cdots + \lambda_n a_n v_n) = \lambda_1 a_1^2 + \lambda_2 a_2^2 + \cdots + \lambda_n a_n^2 \]
\[ \leq \lambda_1 a_1^2 + \lambda_1 a_2^2 + \cdots + \lambda_1 a_n^2 = \lambda_1 \underbrace{(a_1^2 + a_2^2 + \cdots + a_n^2)}_{=1} = \lambda_1. \]
Since \( v_1^T C v_1 = \lambda_1 \), this upper bound is attained at \( u = v_1 \).
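
A quick numerical sanity check of this bound (a sketch; the random unit vectors are only for illustration and are not part of the slides):

```python
import numpy as np

A = np.array([[ 0.6, -1.4,  0.0],
              [-0.4,  1.6, -1.0],
              [-1.4, -0.4,  1.0],
              [ 1.6,  0.6,  0.0],
              [-0.4, -0.4,  0.0]])
C = A.T @ A
lambda_1 = np.linalg.eigvalsh(C).max()   # the largest eigenvalue of C

rng = np.random.default_rng(0)
for _ in range(5):
    u = rng.standard_normal(3)
    u = u / np.linalg.norm(u)            # random unit vector
    print(u @ C @ u <= lambda_1 + 1e-12) # u^T C u <= lambda_1 always holds
```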

Optimization Problem
Thus, \( v_1 \) = the direction of the largest variance of S. Similarly, we can show that
\( v_2 \) = the direction of the largest variance of S in \( \operatorname{span}\{v_1\}^\perp \),
\( v_3 \) = the direction of the largest variance of S in \( \operatorname{span}\{v_1, v_2\}^\perp \),
and so on. In the end, we can simply drop the directions corresponding to very small eigenvalues of C.
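
Putting the pieces together, a minimal PCA sketch that keeps only the top k directions might look like this (k = 2 and the data matrix are illustrative assumptions):

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the k directions of largest variance."""
    Xc = X - X.mean(axis=0)              # center the data
    C = Xc.T @ Xc                        # C = A^T A
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]    # sort eigenvalues in decreasing order
    V_k = eigvecs[:, order[:k]]          # keep v_1, ..., v_k; drop the rest
    return Xc @ V_k                      # coordinates of each point in the new basis

X = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 2.0],
              [3.0, 2.0, 1.0],
              [1.0, 1.0, 1.0]])
print(pca_reduce(X, k=2))                # each row now lives in R^2
```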

PCA in digital images: an example by Václav Hlaváč
Let us consider a 321 × 261 image. Such an image can be considered as a vector in \( \mathbb{R}^n \) with \( n = 321 \times 261 = 83781 \).
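
For instance, flattening a grayscale image into such a vector is a one-liner in NumPy (a sketch with a made-up random image standing in for real pixel data; the dimensions follow the slide):

```python
import numpy as np

# Hypothetical 321 x 261 grayscale image (random values stand in for real pixels).
img = np.random.default_rng(0).random((321, 261))

x = img.reshape(-1)          # flatten to a vector in R^n
print(x.shape)               # (83781,)
```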

What if we have 32 instances of images?

PCA in digital images: an example by Václav Hlaváč
Using the PCA method, we can determine a four-dimensional subspace W in \( \mathbb{R}^n \) such that all 32 images are close to W. We can find four basis vectors for W, which can be displayed as images. We can then reconstruct all 32 images by using linear combinations of these four basis images, e.g. with coefficients \( q_1 = 0.078 \), \( q_2 = 0.062 \), \( q_3 = 0.182 \), \( q_4 = 0.179 \).
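
A sketch of this reconstruction step. The basis images here are random placeholders; only the four q coefficients are taken from the slide:

```python
import numpy as np

n = 321 * 261                                 # each image is a vector in R^n

# Hypothetical basis images b_1, ..., b_4 (random placeholders, one per row).
rng = np.random.default_rng(0)
B = rng.random((4, n))

# Coefficients for one particular image, as quoted on the slide.
q = np.array([0.078, 0.062, 0.182, 0.179])

reconstruction = q @ B                        # q_1*b_1 + q_2*b_2 + q_3*b_3 + q_4*b_4
image = reconstruction.reshape(321, 261)      # back to a 321 x 261 image
```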

Reconstruction fidelity, 4 components

References
Václav Hlaváč, Principal Component Analysis: Application to images.
http://setosa.io/ev/principal-component-analysis/
http://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/