Principal Component Analysis

Anders Øland
David Christiansen

1 Introduction

Principal Component Analysis, or PCA, is a commonly used multi-purpose technique in data analysis. It can be used for feature extraction, compression, classification, dimension reduction, et cetera. There are various ways of approaching and implementing PCA. The two most standard ways of viewing it are:

1. variance maximization
2. minimum mean-square error compression

In the following we will discuss PCA from the view of variance maximization. Although other interesting variants exist, such as probabilistic PCA (PPCA), we shall focus only on classic PCA, since we find that discussing PPCA would be out of scope for this report.

PCA can be described as finding a new basis for some matrix A such that each vector in the basis maximizes the variance of A with respect to itself. In other words, the first vector in the new basis is the dimension along which the data vary the most, the next is that along which they vary next-most, and so forth. The intuition is that the principal component along which there is the most variance is the one that is most important in the data. Hopefully, the majority of the variance will be accounted for by fewer principal components than the dimensionality of the original data.

PCA is deeply connected to the Singular Value Decomposition (SVD), which decomposes any matrix A with rank r into A = UΣV^T, with orthogonal matrices U and V and a diagonal Σ. The non-zero values along the diagonal of Σ, called σ_1, ..., σ_r, are positive, and σ_n ≥ σ_{n+1}.

The first step in PCA is to center the data around its mean: simply subtract the mean of each dimension. If this were not done, then data in a dimension clustered around some point far from 0 would appear to be much more important relative to other dimensions whose data were clustered around 0.
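As a minimal sketch of this centering step (the data matrix below is our own toy example; the report's full implementation is listed in Appendix A), each dimension of a data matrix with one observation per row has its mean subtracted:

% Illustrative sketch: center each dimension of the data around 0.
X  = [2 4; 3 6; 4 8];                     % toy data, one observation per row
Xc = X - ones(size(X, 1), 1) * mean(X);   % subtract the mean of each column (dimension)
mean(Xc)                                  % now 0 in every dimension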

2 Variance and Covariance

The variance of a data set is a measure of how spread out the data are: data have high variance if they differ further from their mean. In this section, assume that all data have a mean of 0, that is, the mean has been subtracted. For some data vector a = (a_1, a_2, ..., a_n), the variance σ_a^2 is defined as σ_a^2 = (1/n) Σ_i a_i^2.

The covariance of two data vectors a and b with an equal number of elements n is defined by using the products of the corresponding elements in place of the squares of individual values, that is, σ_ab = (1/n) Σ_i a_i b_i. To the extent that the values in b are correlated with the corresponding values in a, σ_ab will be large. If they are negatively correlated, σ_ab is less than zero; if they are completely uncorrelated, σ_ab is equal to zero.

Covariance is easily generalized to matrices that consist of a number of data vectors. For an m × n matrix A whose rows are the data vectors, the covariance matrix is C_A = (1/n) A A^T. C_A is an m × m symmetric matrix: the variance of each data vector is found on the diagonal, while the covariance of two vectors from A is found at the corresponding off-diagonal location in C_A.
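The following small Matlab sketch (our own toy example, not taken from the report) computes such a covariance matrix directly from two zero-mean data vectors, using the row-per-data-vector convention of this section:

% Illustrative sketch: covariance matrix of mean-centered data (rows are data vectors).
a = [1 -1 2 -2];        % first data vector, mean 0
b = [2 -2 3 -3];        % second data vector, mean 0, positively correlated with a
A = [a; b];             % m x n data matrix (m = 2 vectors, n = 4 samples)
n = size(A, 2);
C_A = (A * A') / n      % 2 x 2 symmetric matrix: variances on the diagonal,
                        % covariance of a and b off the diagonal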

3 PCA and Covariance Matrices

As we are attempting to find a new orthonormal basis within which some matrix A will have maximal variance along the first vector, next-maximal variance along the second, and so forth, the covariance matrix C_A is a good starting point. If Y is A expressed in this new basis, then C_Y will be diagonal, and for all 1 ≤ i ≤ m − 1, the i-th element of the diagonal is greater than or equal to the (i + 1)-th element. The covariance matrix C_Y will be diagonal because each component of the new basis must be orthogonal to the others, and we order the diagonal values in decreasing order so that the vectors that contribute the most come first. We can diagonalize the covariance matrix by finding its eigenvectors and eigenvalues. Therefore, the principal components of A are the eigenvectors of A's covariance matrix, and the corresponding eigenvalue is the variance of A along the associated vector in the basis.

4 Recovering Principal Components from SVD

Keep in mind that, for orthogonal matrices Q, Q^{-1} = Q^T. Therefore, we can derive AV = UΣ from the SVD A = UΣV^T. From this, we get A v_i = σ_i u_i for each 1 ≤ i ≤ r. Because each vector in U is a unit vector and the σ_i are in decreasing order of size, each u_i contributes more to the final result than the next. UΣ represents the data in the new basis, while V is the matrix that transforms A to that basis.

The PCA, then, can be implemented by doing the following to some matrix A that is centered around the means:

1. Find the SVD of A/√n = UΣV^T. The division by √n is necessary because (A/√n)^T (A/√n) = (1/n) A^T A, which is the covariance matrix of A (taking the data vectors to be the rows of A, as in the implementation in Appendix A).

2. We know then that (A/√n) V = UΣ. Because U and V are orthogonal matrices and Σ is diagonal with decreasing values on the diagonal, this satisfies the requirements for the PCA.
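As a small numerical check of this procedure (our own toy example, not part of the report's Appendix A code), the squared singular values of A/√n coincide with the eigenvalues of the covariance matrix, and the columns of V are the principal components:

% Illustrative sketch: PCA via the SVD versus eigendecomposition of the covariance matrix.
A = [2 0; 3 1; -1 -2; -4 1];            % toy data, one (already centered) row per observation
n = size(A, 1);
[U, S, V] = svd(A / sqrt(n), 'econ');   % columns of V are the principal components
C = (A' * A) / n;                       % covariance matrix (row-vector convention)
[W, D] = eig(C);                        % W spans the same directions as V (up to sign/order)
diag(S).^2                              % squared singular values ...
sort(diag(D), 'descend')                % ... equal the eigenvalues, i.e. the variances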

5 Choosing the Number of Principal Components

Each principal component has a corresponding eigenvalue (or σ from the SVD) that indicates the extent to which it contributes to the final reconstruction of the data. The PCA is useful to the extent that these coefficients are not equal; if they are all equal, then no component is more important than the others. When using the PCA for data compression, the eigenvalues give a measure of how much information is lost by eliminating each vector. It is expected that the first few principal components will account for the majority of the original data, and that at some point there will be a sharp fall in the eigenvalues, indicating that the threshold has been reached.
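Where a numerical cut-off is preferred to reading the drop in the eigenvalue spectrum by eye, a minimal Matlab sketch (the eigenvalues and the 95% threshold below are illustrative assumptions of ours) picks the smallest number of components that accounts for a chosen fraction of the variance:

% Illustrative sketch: fraction of the total variance explained by the first k components.
eigvals   = [9.1 4.3 0.6 0.2 0.1];            % example eigenvalues, already sorted descending
explained = cumsum(eigvals) / sum(eigvals);   % cumulative fraction of the variance
k         = find(explained >= 0.95, 1)        % smallest k accounting for 95% of the variance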

6 Implementation and Results

6.1 Multivariate Gaussian Data - Two Dimensions

Perhaps the most immediately understandable illustration of the function of PCA is to apply it to a bivariate Gaussian distribution that is already centered around the origin, plotted in a two-dimensional plane. The basis vectors found by the PCA, each scaled by the standard deviation of the data along that vector (as done in the pcaplot function of Appendix A), can then be plotted as vectors superimposed on the plot of the points.

The bivariate Gaussian data will form a roughly ellipse-shaped blob on the graph. This ellipse appears rotated, so that the two perpendicular axes of the ellipse do not necessarily coincide with the x and y axes of the plane. PCA will recover vectors that match the axes of the elliptical area in which the data are found; the first principal component matches the longer of the two axes. The components can be seen in Figure 1.

Figure 1: Principal components of bivariate Gaussian data

6.2 Image Compression

Here, we demonstrate the use of PCA to determine the most important components of an image. The test images used can be seen in Figure 2. All images are in grayscale; while the smiley face has only black and white pixels, the letters have fuzzy edges.

Figure 2: Test images for PCA image compression: (a) Dan Witzner Hansen, (b) smiley face, (c) the letter P, (d) the letter C, (e) the letter A

Each image was loaded into a matrix in Matlab whose dimensions correspond to the pixel dimensions of the image and where each pixel's grayscale value is represented by an integer from 0 to 255. Next, the principal components of each image were determined, and new images were generated by keeping only the most important of the components. These images are presented in Figures 3 and 4.
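The reconstructions shown in the figures below can be reproduced with the functions listed in Appendix A; in the sketch that follows, the filename and the choice of 10 components are our own illustrative values:

% Illustrative sketch: reconstruct an image from its first 10 principal components,
% using the pca, pceigsort and pcreduct functions from Appendix A.
data   = double(imread('witzner.png'));   % hypothetical greyscale test image
[V, D] = pca(data);
[V, D] = pceigsort(V, D);                 % components in decreasing order of variance
R      = uint8(pcreduct(data, V, 10));    % keep only the 10 most important components
imshow(R)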

Figure 3: Principal components of Dan Witzner Hansen: (a) original, and reconstructions from (b) 1, (c) 2, (d) 3, (e) 5, (f) 7, (g) 10, (h) 20, and (i) 30 components

Figure 4: Principal components of simpler images (originals and reconstructions)

Figure 5: The letter P, reconstructed from (a) 1, (b) 5, (c) 10, (d) 14, (e) 15, (f) 16, (g) 17, (h) 18, (i) 19, and (j) 20 components

In the case of Dr. Witzner Hansen, the image begins to be recognizable at around seven components, and the difference in visual quality between 1 and 10 components is much more noticeable than the difference between 10 and 20. Likewise, the 10 components added to get from 20 to 30 components lead to only a very slight improvement in visual quality. Approximately 10 percent of the original data yields a quite recognizable image.

The relatively simple line drawings in Figure 4 are essentially identical to their initial uncompressed forms with only 15 components, with the exception of the letter P, for which a more detailed picture can be seen in Figure 5. Interestingly, it shows very little change between one and fourteen components, while drastic changes are evident with fifteen through nineteen components. From twenty onwards, the picture is basically identical to its original uncompressed form. This indicates that the data in the original picture may be much more uncorrelated than the data in the other pictures.

7 Uses and Limitations of PCA

PCA is useful for recovering from measurement error, where the more important principal component or components are considered to be the signal and the remaining components are the noise. Additionally, it can be used for lossy compression, by throwing out the least important components of the data.

PCA is useful for finding the axes along which Gaussian data are distributed. It is not particularly useful for data described by non-linear relationships between the variables, or for non-Gaussian data. For example, two multivariate Gaussian distributions in one set of data would not be recovered. Generally speaking, PCA assumes that the relationships between the variables in the data are linear. If those relationships are instead non-linear, the principal components (or axes) would not constitute a proper representation of the data. Bishop [1] has a good example of when PCA would fail: data consisting of measurements of the coordinates of the position of a person riding on a Ferris wheel. In that case, and in general, it would be a good idea to look for higher-order dependencies in the data before applying PCA. If such dependencies exist, they may be removed by representing the data in a different way; in the current example, by using polar coordinates instead of Cartesian coordinates.

References

[1] Bishop, Christopher. Pattern Recognition and Machine Learning, Chapter 12. Springer.
[2] Hyvärinen, Aapo, et al. Independent Component Analysis, Chapter 6. John Wiley & Sons, USA.
[3] Nabney, Ian. Netlab: Algorithms for Pattern Recognition, Section 7.1. Springer.
[4] Shlens, Jonathon. A Tutorial on Principal Component Analysis. Online; accessed 1 August.
[5] Strang, Gilbert. Introduction to Linear Algebra, Fourth Edition. Wellesley-Cambridge Press, USA.

A Matlab Source

function [ V, D ] = pca( data )
% PCA implementation
%
% INPUT:
%   data   Data to be analyzed (row vectors)
%
% OUTPUT:
%   V      Eigenvectors of the covariance matrix
%   D      Diagonal matrix with the eigenvalues of the covariance matrix
    covariance = cov(double(data), 1);
    [V, D] = eig(covariance);
end

function pcaplot( count, covar )
    clf;
    % Generate data points
    data = mvnrnd(zeros(count, 2), covar);

    % Plot data points
    minx = min(data(:,1)) - 1;
    miny = min(data(:,2)) - 1;
    maxx = max(data(:,1)) + 1;
    maxy = max(data(:,2)) + 1;
    sides = max([abs(minx) abs(miny) abs(maxx) abs(maxy)]);
    axis([-sides sides -sides sides]);
    scatter(data(:,1), data(:,2), 4);

    % Find PCs
    [V, D] = pca(data);
    [V, D] = pceigsort(V, D);
    summ = V * sqrt(D);

    hold on;
    plot([0 summ(1,1)], [0 summ(2,1)], 'color', 'black', 'markersize', 8, 'linewidth', 3)
    plot([0 summ(1,2)], [0 summ(2,2)], 'color', 'black', 'markersize', 8, 'linewidth', 3)
    axis([-sides sides -sides sides]);
    hold off;
end

function [] = pcatest( filename )
    % Load image, convert to greyscale
    I = imread(filename);
    data = double(I);

    % Get principal components
    [V, D] = pca(data);

    % Sort eigenvalues in descending order
    % and permute V & D accordingly
    [V, D] = pceigsort(V, D);

    [~, x] = size(data);
    for pccount = 1:x
        R = uint8(pcreduct(data, V, pccount));
        basename = strsplit(filename, '.');
        basename = strjoin(basename(1:length(basename)-1), '_');
        outputname = strjoin(strsplit(basename, '/'), '_');
        imwrite(R, strcat('output/', outputname, int2str(pccount), '.png'))
        % figure, imshow(R);
    end
end

function [ V, D ] = pceigsort( V, D )
% Primary Components Eigen Sort
% Useful when using the maximum variance method for PCA
    % Sort eigenvalues in descending order
    [~, permutation] = sort(-diag(D));
    % Permute the eigenvectors in V & eigenvalues in D accordingly
    V = V(:, permutation);
    D = D(permutation, permutation);
end

function [ R ] = pcreduct( data, V, numOfPC )
% Principal Components Reduction
% Reduce the dimensionality of the data
% using the principal components
    % Project the data onto the first numOfPC components and back
    dmean = ones(size(data, 1), 1) * mean(data);
    proj = (data - dmean) * V(:, 1:numOfPC);
    R = proj * V(:, 1:numOfPC)' + dmean;
end
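Finally, a possible way to invoke the two drivers above (the sample size, covariance matrix, and filename are illustrative choices of ours, and pcatest assumes an existing output/ directory to write into):

% Illustrative usage of the functions above.
pcaplot(500, [4 2; 2 3]);    % 500 bivariate Gaussian points with correlated dimensions
pcatest('letter_p.png');     % hypothetical test image; writes reconstructions to output/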
