Principal Components Analysis (PCA)


Principal Components Analysis (PCA): a technique for finding patterns in data of high dimension

Outline:
1. Eigenvectors and eigenvalues
2. PCA:
   a) Getting the data
   b) Centering the data
   c) Obtaining the covariance matrix
   d) Performing an eigenvalue decomposition of the covariance matrix
   e) Choosing components and forming a feature vector
   f) Deriving the new data set

Eigenvectors and eigenvalues

When a (square) transformation matrix multiplies an arbitrary vector, the result is in general not a multiple of the original vector. For certain special vectors, however, the product is exactly a multiple of the original vector, e.g. 4 times the original vector. Such a vector, together with all multiples of it, is called an eigenvector of the transformation matrix, and the scale factor (here 4) is the corresponding eigenvalue.

Some properties of eigenvectors:
- eigenvectors can only be found for square matrices
- not every square matrix has eigenvectors
- given that an n x n matrix has eigenvectors, there are n of them
- the eigenvectors of a symmetric matrix (such as a covariance matrix) are orthogonal, i.e. at right angles to each other
- if an eigenvector is scaled by some amount before multiplying it by the matrix, the result is still the same multiple of the scaled vector, because scaling a vector changes only its length, not its direction
- thus, eigenvalues and eigenvectors come in pairs
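As a concrete illustration (the matrix and vectors below are assumed for the example and are not taken from the original slides, though they likewise give an eigenvalue of 4):

\[
\begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 3 \end{pmatrix}
=
\begin{pmatrix} 11 \\ 5 \end{pmatrix}
\quad\text{(not a multiple of the original vector)}
\]
\[
\begin{pmatrix} 2 & 3 \\ 2 & 1 \end{pmatrix}
\begin{pmatrix} 3 \\ 2 \end{pmatrix}
=
\begin{pmatrix} 12 \\ 8 \end{pmatrix}
= 4 \begin{pmatrix} 3 \\ 2 \end{pmatrix}
\quad\text{(4 times the original vector)}
\]

Here (3, 2), and every multiple of it, is an eigenvector of the transformation matrix, with eigenvalue 4.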


PCA
- a technique for identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences
- once these patterns are found, the data can be compressed (i.e. the number of dimensions can be reduced) without much loss of information
- example: a dataset of two dimensions (x, y). The first step is to center the data by subtracting the mean of each dimension from every observation, so that both dimensions have mean zero.
[Table of the original (x, y) data; table of the centered data; scatter plots of y against x before and after centering.]
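A minimal NumPy sketch of the centering step, using illustrative made-up numbers rather than the values from the original example:

    import numpy as np

    # Illustrative two-dimensional dataset: one row per observation,
    # columns are the x and y dimensions.
    data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                     [1.9, 2.2], [3.1, 3.0], [1.1, 0.9]])

    # Centering: subtract the mean of each dimension so both columns have mean 0.
    centered = data - data.mean(axis=0)
    print(centered.mean(axis=0))   # ~[0, 0] up to floating-point error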

Covariance matrix, eigenvalues, and eigenvectors of the centered data:
- the 2 x 2 covariance matrix of the centered data is computed; its off-diagonal entries (the covariance between x and y) are 0.696667
- the two eigenvalues of this covariance matrix and the two corresponding unit-length eigenvectors are then obtained
[Numerical 2 x 2 covariance matrix, its eigenvalues, and its unit eigenvectors, shown next to the original data, the centered data, and the scatter plot of the centered data.]
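A NumPy sketch of the covariance and eigendecomposition steps, again with illustrative data (np.linalg.eigh is used because a covariance matrix is symmetric, and it returns unit-length eigenvectors):

    import numpy as np

    # Same illustrative data as in the centering sketch above.
    data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                     [1.9, 2.2], [3.1, 3.0], [1.1, 0.9]])
    centered = data - data.mean(axis=0)

    # 2 x 2 covariance matrix of the two dimensions.
    cov = np.cov(centered, rowvar=False)

    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Order the components from largest to smallest eigenvalue, so the first
    # column of eigvecs is the principal component.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]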

Compression and reduced dimensionality:
- the eigenvector associated with the highest eigenvalue is the principal component of the dataset; it captures the most significant relationship between the data dimensions
- ordering the eigenvalues from highest to lowest gives the components in order of significance
- if we want, we can ignore the components of lesser significance; this results in a loss of information, but if the discarded eigenvalues are small, the loss will not be great
- thus, in a dataset with n dimensions/variables, one may obtain the n eigenvectors and eigenvalues and decide to retain only p of them; this results in a final dataset with only p dimensions
- feature vector: a matrix containing, as columns, only the eigenvectors representing the dimensions we want to keep; in our example the choice is between keeping both unit eigenvectors and keeping only the first one
- the final dataset is obtained by multiplying the transpose of the feature vector (on the left) with the transposed centered dataset
- this gives the original data expressed solely in terms of the eigenvectors we chose

Our example:
a) retain both eigenvectors: multiplying the transposed 2 x 2 feature vector by the transposed centered data gives the original data rotated so that the eigenvectors are the axes; no information is lost.
b) retain only the first eigenvector / principal component: only one dimension is left; we threw away the other axis.
[Plots of the transformed data: PC2 against PC1 for case (a), and the one-dimensional PC1 scores for case (b).]

Thus: we have transformed the data so that they are expressed in terms of the patterns, where the patterns are the lines that most closely describe the relationships between the dimensions. The values of the transformed data points now tell us exactly where each point sits relative to those trend lines (i.e., above or below them).
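A NumPy sketch of the whole recipe with the same illustrative data, showing both choices from the example: keeping both eigenvectors (a rotation of the centered data) and keeping only the principal component (a one-dimensional representation), plus the approximate reconstruction from that single component:

    import numpy as np

    data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                     [1.9, 2.2], [3.1, 3.0], [1.1, 0.9]])
    centered = data - data.mean(axis=0)

    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # a) Feature vector with both eigenvectors: a pure rotation, no information lost.
    feature_both = eigvecs                        # 2 x 2
    new_both = feature_both.T @ centered.T        # rows = PC1 and PC2 scores

    # b) Feature vector with only the principal component: 1-D compression.
    feature_pc1 = eigvecs[:, [0]]                 # 2 x 1
    new_pc1 = feature_pc1.T @ centered.T          # single row of PC1 scores

    # Reconstructing from PC1 alone gives an approximation of the original data
    # (exact reconstruction would require both components).
    approx = (feature_pc1 @ new_pc1).T + data.mean(axis=0)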

Relation to the Cholesky decomposition:
- similar to the Cholesky decomposition, in that both can be used to fully decompose the original covariance matrix
- in addition, both produce uncorrelated factors

Eigenvalue decomposition: C = E V E^(-1), where E holds the eigenvectors as columns and V is the diagonal matrix of eigenvalues (for a symmetric covariance matrix with orthonormal eigenvectors, E^(-1) = E^t).

Cholesky decomposition: C = Λ Ψ Λ^t, where Λ is a lower triangular matrix and Ψ is a diagonal matrix of variances.
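A NumPy sketch contrasting the two factorizations on an illustrative symmetric covariance matrix (the numbers are assumed); NumPy's Cholesky factor L, with C = L L^t, is rescaled into the unit-lower-triangular Λ and diagonal Ψ form written above:

    import numpy as np

    # Illustrative symmetric, positive-definite covariance matrix (assumed numbers).
    C = np.array([[1.06, 0.70],
                  [0.70, 1.08]])

    # Eigenvalue decomposition: C = E V E^(-1)  (= E V E^t, since C is symmetric).
    V, E = np.linalg.eigh(C)
    C_from_eig = E @ np.diag(V) @ E.T

    # Cholesky decomposition C = L L^t, rewritten as C = Lam Psi Lam^t with
    # Lam unit lower triangular and Psi a diagonal matrix of variances.
    L = np.linalg.cholesky(C)
    d = np.diag(L)            # diagonal of the Cholesky factor
    Lam = L / d               # divide each column by its diagonal entry -> unit diagonal
    Psi = np.diag(d ** 2)     # variances on the diagonal
    C_from_chol = Lam @ Psi @ Lam.T

    # Both factorizations reproduce C (up to floating-point error).
    assert np.allclose(C_from_eig, C) and np.allclose(C_from_chol, C)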

Thank you for your attention.