Principal Component Analysis - Lecture 11


Slide 1: Principal Component Analysis (Lecture 11)

Slide 2: Eigenvectors and Eigenvalues
- Consider the problem of spreading butter on a bread slice.

Slide 3: Eigenvectors and Eigenvalues
- Consider the problem of stretching cheese on a bread slice.

Slide 4: Eigenvectors and Eigenvalues
- Consider the problem of spreading butter on a bread slice as a linear transformation:
  $A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$

Slide 5: Eigenvectors and Eigenvalues
- Find the eigenvalues of $A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$:
  $|A - \lambda I| = \begin{vmatrix} 2-\lambda & 0 \\ 0 & 1-\lambda \end{vmatrix} = 0 \;\Rightarrow\; (2-\lambda)(1-\lambda) = 0 \;\Rightarrow\; \lambda = 2,\ 1$

Slide 6: Eigenvectors and Eigenvalues
- Find the eigenvector for $\lambda = 2$: from $A\mathbf{v} = \lambda\mathbf{v}$, i.e. $(A - \lambda I)\mathbf{v} = 0$,
  $\begin{pmatrix} 2-2 & 0 \\ 0 & 1-2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = 0 \;\Rightarrow\; 0\,v_1 - 1\,v_2 = 0 \;\Rightarrow\; v_2 = 0 \;\Rightarrow\; \mathbf{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (the x-axis).

Slide 7: Eigenvectors and Eigenvalues
- Eigenvector for $\lambda = 2$: $(A - 2I)\mathbf{v} = 0 \;\Rightarrow\; v_2 = 0 \;\Rightarrow\; \mathbf{v} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (the x-axis).
- Eigenvector for $\lambda = 1$: $(A - I)\mathbf{v} = 0 \;\Rightarrow\; v_1 = 0 \;\Rightarrow\; \mathbf{v} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (the y-axis).
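As a quick numerical check of slides 4-7, here is a minimal numpy sketch (numpy assumed; this code is an illustration added to the transcript, not part of the original slides):

```python
import numpy as np

# The butter-spreading transformation from slides 4-7
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

# np.linalg.eig returns the eigenvalues and unit-length eigenvectors (as columns)
eigenvalues, eigenvectors = np.linalg.eig(A)

print(eigenvalues)         # [2. 1.]
print(eigenvectors[:, 0])  # [1. 0.]  -> x-axis, stretched by a factor of 2
print(eigenvectors[:, 1])  # [0. 1.]  -> y-axis, unchanged (factor 1)
```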

Slide 8: Eigenvectors and Eigenvalues
- What do the eigenvalues and eigenvectors indicate? That the linear transformation scales the data by $\lambda$ along the eigenvector corresponding to $\lambda$.
- In the example presented, $\lambda = 2$ along the x-axis and $\lambda = 1$ along the y-axis (intuitive).
- Stated differently, each eigenvalue indicates the share of the transformation acting along its direction: 66.66% along the x-axis and 33.33% along the y-axis.

Slide 9: Eigenvectors and Eigenvalues
- Consider another example, this time a transformation matrix A with non-zero off-diagonal elements.
- [Figures: the data before the linear transformation, and before and after the linear transformation.]

Slide 10: Eigenvectors and Eigenvalues
- Compute the eigenvalues and eigenvectors of A.
- [Figures: the data before the linear transformation, and before and after the linear transformation.]

Slide 11: Eigenvectors and Eigenvalues
- Compute the eigenvalues and eigenvectors of A: $\lambda = 1$ and $\lambda = 2$, each with a corresponding unit eigenvector $\mathbf{v}$ (shown on the next slide).
- [Figure: the data before and after the linear transformation.]

Slide 12: Eigenvectors and Eigenvalues
- What do the eigenvalues and eigenvectors indicate? That the linear transformation scales the data by $\lambda$ along the eigenvector corresponding to $\lambda$: $\lambda = 2$ along eigenvector 1, $[0.7071,\ \ldots]^T$, and $\lambda = 1$ along eigenvector 2.
- In the eigenvector basis the transformation becomes $A' = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$.
- Even though we started with a non-diagonal transformation matrix A, computing the eigenvectors and projecting the data onto them diagonalizes the transformation.
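The entries of this A did not survive the transcription, so the sketch below uses an assumed stand-in: any symmetric 2x2 matrix with eigenvalues 2 and 1 and eigenvectors along the diagonal directions behaves as the slide describes (numpy assumed; the specific matrix is illustrative, not necessarily the one used in the lecture):

```python
import numpy as np

# Assumed stand-in: a symmetric matrix with eigenvalues 2 and 1 whose
# eigenvectors lie along the +/-45 degree directions (entries illustrative)
A = np.array([[1.5, 0.5],
              [0.5, 1.5]])

eigenvalues, V = np.linalg.eigh(A)   # eigh: eigen-decomposition for symmetric matrices
print(eigenvalues)                   # [1. 2.]
print(V)                             # columns ~ [0.7071, -0.7071]^T and [0.7071, 0.7071]^T (signs may differ)

# Changing basis to the eigenvectors diagonalizes the transformation
print(V.T @ A @ V)                   # ~ diag(1, 2)
```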

Slide 13: Recap - Linear Algebra: Projection

Slide 14: Vector Space

Slide 15: Basis Vectors

Slide 16: Projection using Basis Vectors

Slide 17: Projection using Basis Vectors

Slide 18: Projection using Basis Vectors

Slide 19: Statistics Preliminaries
- Mean: let $X_1, X_2, \ldots, X_n$ be $n$ observations of a random variable $X$. The mean is a measure of central tendency (others are the mode and the median):
  $\mu = \bar{X} = E[X] = \frac{1}{n}\sum_{i=1}^{n} X_i$
- Standard deviation: a measure of variability (the square root of the variance):
  $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2}$

Slide 20: Statistics Preliminaries
- Covariance: a measure of how two variables change together:
  $\sigma_{XY} = \frac{1}{n}\sum_{i=1}^{n} (X_i - \mu_X)(Y_i - \mu_Y)$
- Covariance matrix:
  $\Sigma = E\!\left[(\mathbf{X} - E[\mathbf{X}])(\mathbf{X} - E[\mathbf{X}])^T\right] =
  \begin{pmatrix}
  \sigma_{X_1}^2 & \sigma_{X_1 X_2} & \cdots & \sigma_{X_1 X_d} \\
  \sigma_{X_1 X_2} & \sigma_{X_2}^2 & \cdots & \sigma_{X_2 X_d} \\
  \vdots & \vdots & \ddots & \vdots \\
  \sigma_{X_1 X_d} & \sigma_{X_2 X_d} & \cdots & \sigma_{X_d}^2
  \end{pmatrix}$

Slide 21: Statistics Preliminaries
- Correlation: a normalized measure of how two variables change together:
  $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
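A minimal numpy sketch of the covariance and correlation definitions from slides 20-21 (numpy assumed; the data are synthetic and only illustrate the computation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))          # 500 observations of 3 variables (illustrative)

mean = X.mean(axis=0)
centered = X - mean

# Covariance matrix: (1/n) times the sum of outer products of centered samples
Sigma = centered.T @ centered / X.shape[0]

# np.cov defaults to the unbiased 1/(n-1) normalization; bias=True matches 1/n
print(np.allclose(Sigma, np.cov(X, rowvar=False, bias=True)))   # True

# Correlation: covariance normalized by the standard deviations
std = X.std(axis=0)
print(Sigma / np.outer(std, std))      # correlation matrix
```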

Slide 22: Statistics Preliminaries - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$
- [Figure: scatter of the data in the (x1, x2) plane.]

Slide 23: Statistics Preliminaries - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$, with a different covariance matrix $\Sigma$.
- [Figure: scatter of the data in the (x1, x2) plane.]

Slide 24: Statistics Preliminaries - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$
- [Figure: scatter of the data in the (x1, x2) plane.]

Slide 25: Statistics Preliminaries - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- [Figure: scatter of the data in the (x1, x2) plane.]

Slide 26: Statistics Preliminaries - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & -2 \\ -2 & 3 \end{pmatrix}$
- [Figure: scatter of the data in the (x1, x2) plane.]
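The scatter plots on slides 22-26 did not survive as images; the sketch below regenerates comparable data so the effect of the off-diagonal (covariance) terms can be seen numerically (numpy assumed; the samples are illustrative, not the lecture's original data):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([2.0, 3.0])

# Covariance matrices from slides 22-26: uncorrelated, weakly and strongly
# positively correlated, and negatively correlated variables
covariances = [
    np.array([[2.0, 0.0], [0.0, 3.0]]),
    np.array([[2.0, 1.0], [1.0, 3.0]]),
    np.array([[2.0, 2.0], [2.0, 3.0]]),
    np.array([[2.0, -2.0], [-2.0, 3.0]]),
]

for Sigma in covariances:
    samples = rng.multivariate_normal(mu, Sigma, size=1000)
    # The sample covariance should be close to the generating Sigma
    print(np.cov(samples, rowvar=False).round(2))
```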

Slide 27: PCA - An Example
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- Can you find a single vector that would approximate this 2-D space?
- [Figure: scatter of the data in the (x1, x2) plane.]

Slide 28: PCA - Let's build some intuition
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- Can you find a vector that would approximate this 2-D space? The green vector, right?
- [Figure: scatter with the candidate (green) vector overlaid.]

Slide 29: PCA - Let's build some intuition
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- It would be nice to diagonalize the covariance matrix; then you only have to think about variances.
- [Figure: scatter of the data.]

Slide 30: PCA - Let's build some intuition
- $\mu = \begin{pmatrix} 2 \\ 3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- It would be nice to diagonalize the covariance matrix; then you only have to think about variances. Think: eigenvectors of the covariance matrix.
- [Figure: scatter of the data.]

Slide 31: PCA - Let's build some intuition
- Let's subtract the mean first (it makes things simpler): $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- It would be nice to diagonalize the covariance matrix; then you only have to think about variances. Think: eigenvectors of the covariance matrix.
- [Figure: scatter of the mean-centered data.]

Slide 32: PCA - Let's build some intuition
- $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- Eigenvalues and eigenvectors of $\Sigma$: $\lambda_1 \approx 4.56$ with $\mathbf{v}_1 \approx [0.62,\ 0.79]^T$, and $\lambda_2 \approx 0.44$ with $\mathbf{v}_2 \approx [0.79,\ -0.62]^T$.
- [Figure: scatter of the mean-centered data with the eigenvectors overlaid.]

Slide 33: PCA - Let's build some intuition
- $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2 & 2 \\ 2 & 3 \end{pmatrix}$
- The first eigenvector accounts for 91.23% of the variance in the data.
- [Figure: scatter of the mean-centered data with the eigenvectors overlaid.]
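A quick numpy check of the 91.23% figure (numpy assumed; added for the transcript, not part of the slides):

```python
import numpy as np

Sigma = np.array([[2.0, 2.0],
                  [2.0, 3.0]])

eigenvalues = np.linalg.eigvalsh(Sigma)[::-1]   # eigenvalues, largest first
print(eigenvalues.round(2))                     # [4.56 0.44]

# Fraction of the total variance captured by the first principal direction
print(eigenvalues[0] / eigenvalues.sum())       # ~0.9123
```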

Slide 34: PCA - Let's build some intuition
- If I have to pick one vector that approximates this 2-D space, it is the green vector (the first eigenvector).
- The first eigenvector accounts for 91.23% of the variance in the data.
- [Figure: scatter of the mean-centered data with the first eigenvector highlighted.]

Slide 35: PCA - An Example
- Compute the principal components of the following two-dimensional dataset:
  X = (x1, x2) = {(1,2), (3,3), (3,5), (5,4), (5,6), (6,5), (8,7), (9,8)}
- [Figure: scatter of the dataset.]

Slide 36: PCA - An Example
- Dataset: X = (x1, x2) = {(1,2), (3,3), (3,5), (5,4), (5,6), (6,5), (8,7), (9,8)}
- Step 1: Determine the sample covariance matrix.
- [Figure: scatter of the dataset.]

Slide 37: PCA - An Example
- Dataset: X = (x1, x2) = {(1,2), (3,3), (3,5), (5,4), (5,6), (6,5), (8,7), (9,8)}
- Step 1: Determine the sample covariance matrix:
  $\Sigma = \begin{pmatrix} 7.14 & 4.86 \\ 4.86 & 4.00 \end{pmatrix}$
- Step 2: Find the eigenvalues and eigenvectors of the covariance matrix.

Slide 38: PCA - An Example
- Dataset: X = (x1, x2) = {(1,2), (3,3), (3,5), (5,4), (5,6), (6,5), (8,7), (9,8)}
- Step 1: The sample covariance matrix is
  $\Sigma = \begin{pmatrix} 7.14 & 4.86 \\ 4.86 & 4.00 \end{pmatrix}$
- Step 2: The eigenvalues and eigenvectors of the covariance matrix are
  $\lambda_1 \approx 10.68$ with $\mathbf{v}_1 \approx [0.8086,\ 0.5882]^T$, and $\lambda_2 \approx 0.47$ with $\mathbf{v}_2 \approx [0.5882,\ -0.8086]^T$.
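The numbers on slides 36-38 can be reproduced with a few lines of numpy (numpy assumed; a sketch added to the transcript, and note that the sign of each eigenvector is arbitrary):

```python
import numpy as np

# The two-dimensional dataset from slides 35-38
X = np.array([[1, 2], [3, 3], [3, 5], [5, 4],
              [5, 6], [6, 5], [8, 7], [9, 8]], dtype=float)

# Step 1: sample covariance matrix (1/(n-1) normalization)
Sigma = np.cov(X, rowvar=False)
print(Sigma.round(2))                 # [[7.14 4.86]
                                      #  [4.86 4.  ]]

# Step 2: eigenvalues and eigenvectors of the covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(Sigma)
order = np.argsort(eigenvalues)[::-1]                 # largest eigenvalue first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
print(eigenvalues.round(2))           # [10.68  0.47]
print(eigenvectors[:, 0].round(4))    # ~[0.8086 0.5882] (up to sign)

# Project the mean-centered data onto the first principal component
scores = (X - X.mean(axis=0)) @ eigenvectors[:, 0]
print(scores.round(2))
```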

Slide 39: Now the Math Part
- How did we decide that the eigenvectors of the covariance matrix retain maximum information?
- What are the assumptions behind PCA?
  - The data distribution is unimodal Gaussian, and so is fully described by the first two moments of the distribution, i.e. the mean and covariance.
  - The information is in the variance.
- PCA dimensionality reduction: the optimal* approximation of a random vector $\mathbf{x} \in \mathbb{R}^N$ by a linear combination of $M$ ($M < N$) independent vectors is obtained by projecting $\mathbf{x}$ onto the eigenvectors $\varphi_i$ corresponding to the largest eigenvalues $\lambda_i$ of the covariance matrix $\Sigma_x$.
  (*Optimality is defined as the minimum of the sum-squared magnitude of the approximation error.)

Slide 40: Let's do the derivation
- The objective of PCA is to perform dimensionality reduction while preserving as much of the randomness (variance) in the high-dimensional space as possible.
- Let $\mathbf{x}$ be an N-dimensional random vector, represented as a linear combination of orthonormal basis vectors $[\varphi_1\ \varphi_2\ \ldots\ \varphi_N]$:
  $\mathbf{x} = \sum_{i=1}^{N} y_i \varphi_i, \qquad \varphi_i \perp \varphi_j \ \text{for } i \neq j$
- Here $y_i$ is the weighting coefficient on basis vector $\varphi_i$, obtained from the inner product of $\mathbf{x}$ with $\varphi_i$:
  $y_i = \mathbf{x}^T \varphi_i$

Slide 41: Let's do the derivation
- Same setup as the previous slide: $\mathbf{x} = \sum_{i=1}^{N} y_i \varphi_i$ with $y_i = \mathbf{x}^T \varphi_i$.
- Example: with $\varphi_1 = [1\ 0]^T$, $\varphi_2 = [0\ 1]^T$ and $\mathbf{x} = [1\ 1]^T$, the coefficients are $y_1 = \mathbf{x}^T\varphi_1 = 1$ and $y_2 = \mathbf{x}^T\varphi_2 = 1$, so
  $\mathbf{x} = y_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix}$

Slide 42: Let's do the derivation
- Suppose we choose to represent $\mathbf{x}$ with only $M$ ($M < N$) of the basis vectors. We can do this by replacing the components $[y_{M+1}, \ldots, y_N]^T$ with pre-selected constants $b_i$:
  $\hat{\mathbf{x}}(M) = \sum_{i=1}^{M} y_i \varphi_i + \sum_{i=M+1}^{N} b_i \varphi_i$
- The representation error is derived on the next slide.

Slide 43: Let's do the derivation
- With $\hat{\mathbf{x}}(M) = \sum_{i=1}^{M} y_i \varphi_i + \sum_{i=M+1}^{N} b_i \varphi_i$, the representation error is
  $\Delta\mathbf{x}(M) = \mathbf{x} - \hat{\mathbf{x}}(M)
  = \sum_{i=1}^{N} y_i \varphi_i - \left(\sum_{i=1}^{M} y_i \varphi_i + \sum_{i=M+1}^{N} b_i \varphi_i\right)
  = \sum_{i=M+1}^{N} y_i \varphi_i - \sum_{i=M+1}^{N} b_i \varphi_i
  = \sum_{i=M+1}^{N} (y_i - b_i)\,\varphi_i$

Slide 44: Let's do the derivation
- We measure the representation error by the mean-squared magnitude of $\Delta\mathbf{x}$ (in practice, the average of $\|\Delta\mathbf{x}\|^2$ over the $S$ samples). Our goal is to find the basis vectors $\varphi_i$ and constants $b_i$ that minimize this mean-squared error:
  $E\big[\|\Delta\mathbf{x}(M)\|^2\big]
  = E\left[\Big(\sum_{i=M+1}^{N} (y_i - b_i)\varphi_i\Big)^{\!T}\Big(\sum_{j=M+1}^{N} (y_j - b_j)\varphi_j\Big)\right]
  = E\left[\sum_{i=M+1}^{N}\sum_{j=M+1}^{N} (y_i - b_i)(y_j - b_j)\,\varphi_i^T\varphi_j\right]
  = \sum_{i=M+1}^{N} E\big[(y_i - b_i)^2\big]$
  where the last step uses orthonormality: $\varphi_i^T\varphi_j = 1$ if $i = j$ and $0$ otherwise.

Slide 45: Let's do the derivation
- Find $b_i$: the optimal values of $b_i$ are found by setting the partial derivative of the objective function to zero,
  $\frac{\partial}{\partial b_i} E\left[\sum_{i=M+1}^{N}(y_i - b_i)^2\right] = -2\big(E[y_i] - b_i\big) = 0 \;\Rightarrow\; b_i = E[y_i]$
  using the linearity of expectation, $E[A - B] = E[A] - E[B]$.
- Intuition: replace each discarded dimension $y_i$ by its expected value (its mean).

Slide 46: Let's do the derivation
- As we have done earlier in the course, the optimal $b_i$ is found by setting the partial derivative of the objective to zero, giving $b_i = E[y_i]$.
- Intuition: replace each discarded dimension $y_i$ by its expected value.
- Example (reducing 2-D to 1-D along $\varphi_1 = [1\ 0]^T$): for the points $\mathbf{x} = [1\ 1]^T$, $[2\ 0.5]^T$ and $[2.5\ 1]^T$, the kept coefficients are $y_1 = 1,\ 2,\ 2.5$, and the discarded coordinate is replaced by its mean, $b_2 = (1 + 0.5 + 1)/3 \approx 0.833$.
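A small numpy sketch of the slide-46 example, keeping $y_1$ and replacing the discarded coordinate by its mean (numpy assumed; added for the transcript):

```python
import numpy as np

# The three points from the slide-46 example (2-D reduced to 1-D along phi_1)
X = np.array([[1.0, 1.0],
              [2.0, 0.5],
              [2.5, 1.0]])

phi1 = np.array([1.0, 0.0])    # kept basis vector
phi2 = np.array([0.0, 1.0])    # discarded basis vector

y1 = X @ phi1                  # kept coefficients: [1.  2.  2.5]
b2 = (X @ phi2).mean()         # optimal constant for the discarded coordinate
print(b2)                      # ~0.833

# Approximate each point using its y1 and the shared constant b2
X_hat = np.outer(y1, phi1) + b2 * phi2
print(X_hat)
```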

Slide 47: Let's do the derivation
- The mean-squared error can now be written as
  $\bar{\varepsilon}^2(M) = \sum_{i=M+1}^{N} E\big[(y_i - b_i)^2\big] = \sum_{i=M+1}^{N} E\Big[\big(y_i - E[y_i]\big)^2\Big]$
  where $y_i = \mathbf{x}^T \varphi_i$.

Slide 48: Let's do the derivation
- Substituting $y_i = \mathbf{x}^T \varphi_i$ into the MSE expression gives
  $\bar{\varepsilon}^2(M)
  = \sum_{i=M+1}^{N} E\Big[\big(\mathbf{x}^T\varphi_i - E[\mathbf{x}^T\varphi_i]\big)^2\Big]
  = \sum_{i=M+1}^{N} \varphi_i^T\, E\big[(\mathbf{x} - E[\mathbf{x}])(\mathbf{x} - E[\mathbf{x}])^T\big]\, \varphi_i
  = \sum_{i=M+1}^{N} \varphi_i^T \Sigma_x \varphi_i$
  where $\Sigma_x$ is the covariance matrix of $\mathbf{x}$.

Slide 49: Let's do the derivation
- We seek the solution that minimizes the MSE subject to the normality constraint $\varphi_i^T\varphi_i = 1$, which we incorporate using a set of Lagrange multipliers $\lambda_i$:
  $\bar{\varepsilon}^2(M) = \sum_{i=M+1}^{N} \varphi_i^T \Sigma_x \varphi_i + \sum_{i=M+1}^{N} \lambda_i\big(1 - \varphi_i^T\varphi_i\big)$
- Computing the partial derivative with respect to $\varphi_i$, and using $\frac{d}{d\mathbf{x}}\big(\mathbf{x}^T A \mathbf{x}\big) = (A + A^T)\mathbf{x} = 2A\mathbf{x}$ for symmetric $A$:
  $\frac{\partial}{\partial \varphi_i}\left[\sum_{i=M+1}^{N} \varphi_i^T \Sigma_x \varphi_i + \sum_{i=M+1}^{N} \lambda_i\big(1 - \varphi_i^T\varphi_i\big)\right]
  = 2\big(\Sigma_x \varphi_i - \lambda_i \varphi_i\big) = 0
  \;\Rightarrow\; \Sigma_x \varphi_i = \lambda_i \varphi_i$
  which is an eigenvalue problem.
- So the $\varphi_i$ and $\lambda_i$ are the eigenvectors and eigenvalues of the covariance matrix $\Sigma_x$.

Slide 50: Let's do the derivation
- We can now express the sum-squared error as
  $\bar{\varepsilon}^2(M) = \sum_{i=M+1}^{N} \varphi_i^T \Sigma_x \varphi_i = \sum_{i=M+1}^{N} \varphi_i^T \lambda_i \varphi_i = \sum_{i=M+1}^{N} \lambda_i$
- To minimize this error, the discarded $\lambda_i$ must be the smallest eigenvalues. Therefore, to represent $\mathbf{x}$ with minimum sum-squared error, we choose the eigenvectors $\varphi_i$ corresponding to the largest eigenvalues $\lambda_i$.
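The result on slide 50 can be checked numerically: the mean-squared reconstruction error after keeping the top-M eigenvectors equals the sum of the discarded eigenvalues (numpy assumed; synthetic data, added for the transcript):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 5, 2                                    # ambient dimension, number of kept components

# Illustrative data with an arbitrary (random) covariance structure
A = rng.normal(size=(N, N))
X = rng.multivariate_normal(np.zeros(N), A @ A.T, size=2000)
Xc = X - X.mean(axis=0)                        # center the data

Sigma = Xc.T @ Xc / len(Xc)                    # covariance matrix (1/n normalization)
eigenvalues, eigenvectors = np.linalg.eigh(Sigma)
order = np.argsort(eigenvalues)[::-1]          # largest eigenvalues first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Keep the top-M eigenvectors, project, and reconstruct
V = eigenvectors[:, :M]
Xc_hat = (Xc @ V) @ V.T

# Mean-squared reconstruction error equals the sum of the discarded eigenvalues
mse = np.mean(np.sum((Xc - Xc_hat) ** 2, axis=1))
print(np.isclose(mse, eigenvalues[M:].sum()))  # True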

Slide 51: Scree Plot
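The scree-plot figure itself did not survive the transcription; a scree plot is simply the sorted eigenvalue spectrum plotted against component index. A possible sketch (numpy and matplotlib assumed; the data are synthetic):

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative: eigenvalue spectrum of the covariance of some random 10-D data
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))
eigenvalues = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

plt.plot(np.arange(1, len(eigenvalues) + 1), eigenvalues, 'o-')
plt.xlabel('Principal component index')
plt.ylabel('Eigenvalue (variance along component)')
plt.title('Scree plot')
plt.show()
```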

Slide 52: Applications of PCA: Spike Sorting
- Extracellular recording (single electrode).

Slide 53: Applications of PCA: Spike Sorting
- Multi-unit recording. Notice that different colors (representing different neurons) show different patterns of extracellular activity. [Figure from Pouzat et al., 2002.]

Slide 54: Applications of PCA: Spike Sorting
- Multi-unit recording projected onto PC1 and PC2; different colors (representing different neurons) fall in different regions of the PC space.
- Clustering after PCA allows identification of events from a single source (neuron).
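A hypothetical sketch of this spike-sorting pipeline, using synthetic waveforms and scikit-learn's KMeans for the clustering step (numpy and scikit-learn assumed; the waveform shapes, sizes and cluster count are illustrative, not from the lecture):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic spike waveforms: each row is one detected spike snippet
# (in practice these are windows cut around threshold crossings)
rng = np.random.default_rng(4)
t = np.linspace(0, 1, 32)
templates = np.stack([np.sin(2 * np.pi * t),              # neuron A's shape
                      -np.exp(-10 * (t - 0.3) ** 2)])     # neuron B's shape
true_units = rng.integers(0, 2, size=200)
waveforms = templates[true_units] + 0.1 * rng.normal(size=(200, 32))

# Project the waveforms onto the top-2 principal components
centered = waveforms - waveforms.mean(axis=0)
_, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
scores = centered @ eigenvectors[:, -2:]     # eigh returns eigenvectors in ascending order

# Cluster in PC space to assign each spike to a putative neuron
labels = KMeans(n_clusters=2, n_init=10).fit_predict(scores)
print(np.bincount(labels))                   # spikes assigned to each cluster
```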

Slide 55: Applications of PCA: Neural Coding
- Information is represented by the spiking patterns of neural ensembles.
- PCA enables visualization of high-dimensional neural activity.

Slide 56: Identity vs. Intensity Coding
- PN ensemble activity traces concentration-specific trajectories on odor-specific manifolds (Stopfer et al., Neuron, 2003).
- [Figures: odor trajectories for 3 odors at multiple concentrations; hexanol trajectories at five concentrations; the octanol, geraniol and hexanol manifolds.]
- LLE: Locally Linear Embedding (S. Roweis and L. Saul, Science, 2000).

Slide 57: Another Neural Coding Example
- Kemere et al., 2008. [Figure from Byron Yu, CMU, teaching notes.]

Slide 58: Another Neural Coding Example
- Neural representation after dimensionality reduction, for Direction 1, Direction 2 and Direction 3. Santhanam et al., 2009. [Figure from Byron Yu, CMU, teaching notes.]
