MATH 829: Introduction to Data Mining and Analysis Principal component analysis


MATH 829: Introduction to Data Mining and Analysis
Principal component analysis
Dominique Guillot, Department of Mathematical Sciences, University of Delaware
April 4, 2016

Motivation

High-dimensional data often has a low-rank structure: most of the action may occur in a subspace of $\mathbb{R}^p$.

Problem: How can we discover low-dimensional structures in data?

Principal component analysis: construct projections of the data that capture most of the variability in the data. It provides a low-rank approximation to the data and can lead to a significant dimensionality reduction.

Principal component analysis (PCA)

Let $X \in \mathbb{R}^{n \times p}$ with rows $x_1, \dots, x_n \in \mathbb{R}^p$. We think of $X$ as $n$ observations of a random vector $(X_1, \dots, X_p) \in \mathbb{R}^p$.

Suppose each column has mean 0, i.e., $\sum_{i=1}^n x_i = 0_{1 \times p}$.

We want to find a linear combination $w_1 X_1 + \dots + w_p X_p$ with maximum variance. (Intuition: we look for a direction in $\mathbb{R}^p$ where the data varies the most.)

We solve:
$$w = \operatorname{argmax}_{\|w\|_2 = 1} \sum_{i=1}^n (x_i^T w)^2.$$
(Note: $\sum_{i=1}^n (x_i^T w)^2$ is proportional to the sample variance of the data since we assume each column of $X$ has mean 0.)

Equivalently, we solve:
$$w = \operatorname{argmax}_{\|w\|_2 = 1} (Xw)^T (Xw) = \operatorname{argmax}_{\|w\|_2 = 1} w^T X^T X w.$$

Claim: $w$ is an eigenvector associated to the largest eigenvalue of $X^T X$.
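A quick numerical check of the claim, as a minimal NumPy sketch on randomly generated, purely illustrative data: no random unit direction yields a larger value of $w^T X^T X w$ than the top eigenvector of $X^T X$.

import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)            # center each column, as assumed above

S = X.T @ X

# Top eigenpair of X^T X (np.linalg.eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(S)
top_value, top_vector = eigvals[-1], eigvecs[:, -1]

# Compare with many random unit directions: none beats the top eigenvalue.
W = rng.normal(size=(10000, p))
W = W / np.linalg.norm(W, axis=1, keepdims=True)
values = np.einsum('ij,jk,ik->i', W, S, W)
print(values.max() <= top_value + 1e-9)

# The quadratic form at the top eigenvector attains it exactly.
print(np.isclose(top_vector @ S @ top_vector, top_value))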

Proof of claim: Rayleigh quotients

Let $A \in \mathbb{R}^{p \times p}$ be a symmetric (or Hermitian) matrix. The Rayleigh quotient is defined by
$$R(A, x) = \frac{x^T A x}{x^T x} = \frac{\langle Ax, x \rangle}{\langle x, x \rangle}, \qquad (x \in \mathbb{R}^p,\ x \neq 0_{p \times 1}).$$

Observations:

1. If $Ax = \lambda x$ with $\|x\|_2 = 1$, then $R(A, x) = \lambda$. Thus, $\sup_{x \neq 0} R(A, x) \geq \lambda_{\max}(A)$.

2. Let $\{\lambda_1, \dots, \lambda_p\}$ denote the eigenvalues of $A$, and let $\{v_1, \dots, v_p\} \subset \mathbb{R}^p$ be an orthonormal basis of eigenvectors of $A$. If $x = \sum_{i=1}^p \theta_i v_i$, then
$$R(A, x) = \frac{\sum_{i=1}^p \lambda_i \theta_i^2}{\sum_{i=1}^p \theta_i^2}.$$

Since each term satisfies $\lambda_i \theta_i^2 \leq \lambda_{\max}(A)\,\theta_i^2$, it follows that $\sup_{x \neq 0} R(A, x) \leq \lambda_{\max}(A)$. Thus,
$$\sup_{x \neq 0} R(A, x) = \sup_{\|x\|_2 = 1} x^T A x = \lambda_{\max}(A).$$
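Both observations are easy to verify numerically. A small illustrative sketch with a random symmetric matrix:

import numpy as np

rng = np.random.default_rng(1)
p = 4
B = rng.normal(size=(p, p))
A = (B + B.T) / 2                       # random symmetric matrix

eigvals, V = np.linalg.eigh(A)          # columns of V: orthonormal eigenvectors of A

x = rng.normal(size=p)
R = (x @ A @ x) / (x @ x)               # Rayleigh quotient R(A, x)

theta = V.T @ x                         # coordinates of x in the eigenbasis
R_formula = (eigvals * theta**2).sum() / (theta**2).sum()

print(np.isclose(R, R_formula))         # observation 2
print(R <= eigvals[-1] + 1e-12)         # R(A, x) <= lambda_max(A)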

Back to PCA

The previous argument shows that
$$w^{(1)} = \operatorname{argmax}_{\|w\|_2 = 1} \sum_{i=1}^n (x_i^T w)^2 = \operatorname{argmax}_{\|w\|_2 = 1} w^T X^T X w$$
is an eigenvector associated to the largest eigenvalue of $X^T X$.

First principal component: The linear combination $\sum_{i=1}^p w_i^{(1)} X_i$ is the first principal component of $(X_1, \dots, X_p)$. Alternatively, we say that $Xw^{(1)}$ is the first (sample) principal component of $X$. It is the linear combination of the columns of $X$ having the most variance.

Second principal component: We look for a new linear combination of the $X_i$'s that

1. is orthogonal to the first principal component, and
2. maximizes the variance.

Back to PCA (cont.)

In other words:
$$w^{(2)} := \operatorname{argmax}_{\|w\|_2 = 1,\ w \perp w^{(1)}} \sum_{i=1}^n (x_i^T w)^2 = \operatorname{argmax}_{\|w\|_2 = 1,\ w \perp w^{(1)}} w^T X^T X w.$$

Using a similar argument as before with Rayleigh quotients, we conclude that $w^{(2)}$ is an eigenvector associated to the second largest eigenvalue of $X^T X$.

Similarly, given $w^{(1)}, \dots, w^{(k)}$, we define
$$w^{(k+1)} := \operatorname{argmax}_{\|w\|_2 = 1,\ w \perp w^{(1)}, \dots, w^{(k)}} \sum_{i=1}^n (x_i^T w)^2 = \operatorname{argmax}_{\|w\|_2 = 1,\ w \perp w^{(1)}, \dots, w^{(k)}} w^T X^T X w.$$

As before, the vector $w^{(k+1)}$ is an eigenvector associated to the $(k+1)$-th largest eigenvalue of $X^T X$.
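In practice one does not solve these constrained problems one at a time: a single eigendecomposition of $X^T X$ yields all of the directions $w^{(1)}, w^{(2)}, \dots$ at once. A minimal sketch on illustrative random data:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
X = X - X.mean(axis=0)                      # centered data

eigvals, U = np.linalg.eigh(X.T @ X)        # ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # re-sort to decreasing eigenvalues
eigvals, U = eigvals[order], U[:, order]

W = U[:, :3]                                # w^(1), w^(2), w^(3) as columns
print(np.allclose(W.T @ W, np.eye(3)))      # successive directions are orthonormal

# The captured (unnormalized) variance w^T X^T X w matches the corresponding
# eigenvalue and decreases with k.
print([float(w @ (X.T @ X) @ w) for w in W.T], eigvals[:3])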

PCA: summary

In summary, suppose $X^T X = U \Lambda U^T$, where $U \in \mathbb{R}^{p \times p}$ is an orthogonal matrix and $\Lambda \in \mathbb{R}^{p \times p}$ is diagonal (eigendecomposition of $X^T X$).

Recall that the columns of $U$ are the eigenvectors of $X^T X$ and the diagonal of $\Lambda$ contains the eigenvalues of $X^T X$ (i.e., the squares of the singular values of $X$).

Then the principal components of $X$ are the columns of $XU$.

Write $U = (u_1, \dots, u_p)$. Then the variance of the $i$-th principal component is
$$(Xu_i)^T (Xu_i) = u_i^T X^T X u_i = (U^T X^T X U)_{ii} = \Lambda_{ii}.$$

Conclusion: The variance of the $i$-th principal component is the $i$-th eigenvalue of $X^T X$.

We say that the first $k$ PCs explain $\left(\sum_{i=1}^k \Lambda_{ii}\right) / \left(\sum_{i=1}^p \Lambda_{ii}\right) \times 100$ percent of the variance.
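A minimal sketch, on illustrative random data, tracing this summary: eigendecompose $X^T X$, form the principal components $XU$, and check that their (unnormalized) variances are the eigenvalues.

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 4))
X = X - X.mean(axis=0)                       # centered data

eigvals, U = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]
eigvals, U = eigvals[order], U[:, order]     # decreasing eigenvalues

Z = X @ U                                    # principal components (columns of XU)

# The (unnormalized) variance of the i-th PC equals the i-th eigenvalue.
print(np.allclose((Z**2).sum(axis=0), eigvals))

# Fraction of the variance explained by the first k PCs.
k = 2
print(eigvals[:k].sum() / eigvals.sum())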

Example: zip dataset

Recall the zip dataset: images of handwritten digits. Each image is in black/white with $16 \times 16 = 256$ pixels. We use PCA to project the data onto a 2-dimensional subspace of $\mathbb{R}^{256}$.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Fit PCA with 10 components on the training images.
pc = PCA(n_components=10)
pc.fit(x_train)

# Per-component and cumulative explained variance ratios.
print(pc.explained_variance_ratio_)
plt.plot(range(1, 11), np.cumsum(pc.explained_variance_ratio_))

Example: zip dataset (cont.)

Projecting the data onto the first two principal components:

Xt = pc.fit_transform(x_train)

Note: 27% of the variance is explained by the first two PCs; 90% of the variance is explained by the first 55 components.
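A possible way to produce such a projection plot (a sketch only: it assumes x_train holds the images as in the snippet above, and uses a hypothetical label vector y_train for coloring):

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

pc = PCA(n_components=2)
Xt = pc.fit_transform(x_train)            # n x 2 matrix of PC scores

# Scatter plot in the plane of the first two PCs, colored by digit label
# (y_train is an assumed label vector, not defined in the slides).
plt.scatter(Xt[:, 0], Xt[:, 1], c=y_train, cmap='tab10', s=5)
plt.xlabel('First principal component')
plt.ylabel('Second principal component')
plt.colorbar(label='digit')
plt.show()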

Principal component regression

Principal components can be used directly in a regression context.

Principal component regression: $y \in \mathbb{R}^{n \times 1}$, $X \in \mathbb{R}^{n \times p}$.

1. Center $y$ and each column of $X$ (i.e., subtract the mean from each column).
2. Compute the eigendecomposition of $X^T X$: $X^T X = U \Lambda U^T$.
3. Compute $k \geq 1$ principal components: $W_k := (Xu_1, \dots, Xu_k) = X U_k$, where $U = (u_1, \dots, u_p)$ and $U_k = (u_1, \dots, u_k) \in \mathbb{R}^{p \times k}$.
4. Regress $y$ on the principal components: $\hat{\gamma}_k := (W_k^T W_k)^{-1} W_k^T y$.
5. The PCR estimator is $\hat{\beta}_k := U_k \hat{\gamma}_k$, with fitted values $\hat{y}^{(k)} := X \hat{\beta}_k = X U_k \hat{\gamma}_k = W_k \hat{\gamma}_k$.

Note: $k$ is a parameter that needs to be chosen (using CV or another method). Typically, one picks $k$ to be significantly smaller than $p$. A sketch of these steps is given below.
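A minimal NumPy sketch of the five steps above, on synthetic illustrative data (in practice $k$ would be chosen by cross-validation):

import numpy as np

rng = np.random.default_rng(4)
n, p, k = 100, 10, 3

# Illustrative synthetic data: the response depends on the first two columns.
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=n)

# 1. Center y and each column of X.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# 2. Eigendecomposition of X^T X, sorted by decreasing eigenvalue.
eigvals, U = np.linalg.eigh(Xc.T @ Xc)
U = U[:, np.argsort(eigvals)[::-1]]

# 3. First k principal components.
Uk = U[:, :k]
Wk = Xc @ Uk

# 4. Regress y on the principal components.
gamma_k = np.linalg.solve(Wk.T @ Wk, Wk.T @ yc)

# 5. PCR estimator and fitted values.
beta_k = Uk @ gamma_k
y_hat = Xc @ beta_k
print(beta_k)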

Projection pursuit

PCA looks for subspaces with the most variance. One can also optimize other criteria.

Projection pursuit (PP):

1. Set up a projection index to judge the merit of a particular one- or two-dimensional projection of a given set of multivariate data.
2. Use an optimization algorithm to find the global and local extrema of that projection index over all one- or two-dimensional projections of the data.

Example (Izenman, 2013): The absolute value of the kurtosis, $\kappa_4(Y)$, of the one-dimensional projection $Y = w^T X$ has been widely used as a measure of non-Gaussianity of $Y$.

Recall: the marginals of the multivariate Gaussian distribution are Gaussian.

One can maximize/minimize the kurtosis index to find subspaces where the data looks non-Gaussian/Gaussian (e.g., to detect outliers), as in the sketch below.
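A crude sketch of this idea using the kurtosis index, with a random search over unit directions standing in for a proper optimizer (the data below is synthetic and purely illustrative):

import numpy as np

rng = np.random.default_rng(5)

# Mostly Gaussian data with a few outliers hidden along the first coordinate.
X = rng.normal(size=(500, 5))
X[:10, 0] += 8.0
X = X - X.mean(axis=0)

def abs_kurtosis(z):
    # Projection index: absolute excess kurtosis of a 1-d sample.
    z = (z - z.mean()) / z.std()
    return abs((z**4).mean() - 3.0)

# Random search over unit directions: the direction with the largest index is
# the most non-Gaussian looking one (here roughly the first coordinate axis).
W = rng.normal(size=(5000, X.shape[1]))
W = W / np.linalg.norm(W, axis=1, keepdims=True)
scores = np.array([abs_kurtosis(X @ w) for w in W])
print(W[scores.argmax()])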
