
Data Mining and Analysis: Fundamental Concepts and Algorithms
dataminingbook.info

Mohammed J. Zaki (Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA)
Wagner Meira Jr. (Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil)

Chapter 7: Dimensionality Reduction

Dimensionality Reduction

The goal of dimensionality reduction is to find a lower-dimensional representation of the data matrix $D$ to avoid the curse of dimensionality.

Given an $n \times d$ data matrix, each point $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})^T$ is a vector in the ambient $d$-dimensional vector space spanned by the $d$ standard basis vectors $e_1, e_2, \ldots, e_d$.

Given any other set of $d$ orthonormal vectors $u_1, u_2, \ldots, u_d$, we can re-express each point $x$ as
$$x = a_1 u_1 + a_2 u_2 + \cdots + a_d u_d$$
where $a = (a_1, a_2, \ldots, a_d)^T$ represents the coordinates of $x$ in the new basis. More compactly, $x = U a$, where $U$ is the $d \times d$ orthogonal matrix whose $i$th column comprises the $i$th basis vector $u_i$. Thus $U^{-1} = U^T$, and we have $a = U^T x$.
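As a quick illustration of the change of basis, the following NumPy sketch (not from the book; the orthonormal basis here is an arbitrary one obtained from a QR factorization) verifies that $a = U^T x$ and $x = Ua$ are inverses of each other.

```python
import numpy as np

# Minimal sketch: express a point in a new orthonormal basis and map it back.
rng = np.random.default_rng(0)
d = 3
x = rng.normal(size=d)                          # a point in the standard basis
U, _ = np.linalg.qr(rng.normal(size=(d, d)))    # columns u_1,...,u_d are orthonormal

a = U.T @ x        # coordinates of x in the new basis: a = U^T x
x_back = U @ a     # change of basis back:              x = U a
assert np.allclose(x, x_back)
```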

Optimal Basis: Projection in Lower-Dimensional Space

There are potentially infinite choices for the orthonormal basis vectors. Our goal is to choose an optimal basis that preserves essential information about $D$. We are interested in finding the optimal $r$-dimensional representation of $D$, with $r \ll d$.

The projection of $x$ onto the first $r$ basis vectors is given as
$$x' = a_1 u_1 + a_2 u_2 + \cdots + a_r u_r = \sum_{i=1}^{r} a_i u_i = U_r a_r$$
where $U_r$ and $a_r$ comprise the $r$ basis vectors and coordinates, respectively. Also, restricting $a = U^T x$ to the first $r$ terms, we have $a_r = U_r^T x$.

The $r$-dimensional projection of $x$ is thus given as
$$x' = U_r U_r^T x = P_r x$$
where $P_r = U_r U_r^T = \sum_{i=1}^{r} u_i u_i^T$ is the orthogonal projection matrix for the subspace spanned by the first $r$ basis vectors.
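The projection matrix can be checked numerically. The NumPy sketch below (not from the book) builds $P_r = U_r U_r^T$ from an arbitrary orthonormal basis and confirms that it is idempotent and consistent with the reduced coordinates.

```python
import numpy as np

# Minimal sketch: orthogonal projection onto the span of the first r basis vectors.
rng = np.random.default_rng(1)
d, r = 4, 2
U, _ = np.linalg.qr(rng.normal(size=(d, d)))   # orthonormal basis u_1,...,u_d
U_r = U[:, :r]                                 # first r basis vectors

x = rng.normal(size=d)
P_r = U_r @ U_r.T          # orthogonal projection matrix P_r = U_r U_r^T
x_proj = P_r @ x           # r-dimensional projection of x (still expressed in R^d)
a_r = U_r.T @ x            # coordinates of the projection in the reduced basis

assert np.allclose(x_proj, U_r @ a_r)
assert np.allclose(P_r @ P_r, P_r)   # a projection matrix is idempotent
```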

Optimal Basis: Error Vector

Given the projected vector $x' = P_r x$, the corresponding error vector is the projection onto the remaining $d - r$ basis vectors:
$$\epsilon = \sum_{i=r+1}^{d} a_i u_i = x - x'$$
The error vector $\epsilon$ is orthogonal to $x'$.

The goal of dimensionality reduction is to seek an $r$-dimensional basis that gives the best possible approximation $x_i'$ over all the points $x_i \in D$. Alternatively, we seek to minimize the error $\|\epsilon_i\| = \|x_i - x_i'\|$ over all the points.

Iris Data: Optimal One-Dimensional Basis

[Figure: the 3D Iris data $(X_1, X_2, X_3)$ with the optimal one-dimensional basis vector $u_1$.]

Iris Data: Optimal 2D Basis

[Figure: the 3D Iris data $(X_1, X_2, X_3)$ with the optimal 2D basis vectors $u_1$ and $u_2$.]

Principal Component Analysis

Principal Component Analysis (PCA) is a technique that seeks an $r$-dimensional basis that best captures the variance in the data.

The direction with the largest projected variance is called the first principal component. The orthogonal direction that captures the second largest projected variance is called the second principal component, and so on.

The direction that maximizes the variance is also the one that minimizes the mean squared error.

Principal Component: Direction of Most Variance

We seek the unit vector $u$ that maximizes the projected variance of the points. Let $D$ be centered, and let $\Sigma$ be its covariance matrix.

The projection of $x_i$ on $u$ is given as
$$x_i' = \left(\frac{u^T x_i}{u^T u}\right) u = (u^T x_i)\, u = a_i u$$

Across all the points, the projected variance along $u$ is
$$\sigma_u^2 = \frac{1}{n}\sum_{i=1}^{n} (a_i - \mu_u)^2 = \frac{1}{n}\sum_{i=1}^{n} u^T x_i x_i^T u = u^T \left(\frac{1}{n}\sum_{i=1}^{n} x_i x_i^T\right) u = u^T \Sigma u$$

We have to find the optimal basis vector $u$ that maximizes the projected variance $\sigma_u^2 = u^T \Sigma u$, subject to the constraint that $u^T u = 1$. The maximization objective is given as
$$\max_{u}\; J(u) = u^T \Sigma u - \alpha\,(u^T u - 1)$$

Principal Component: Direction of Most Variance

Given the objective $\max_u J(u) = u^T \Sigma u - \alpha(u^T u - 1)$, we solve it by setting the derivative of $J(u)$ with respect to $u$ to the zero vector, to obtain
$$\frac{\partial}{\partial u}\left(u^T \Sigma u - \alpha(u^T u - 1)\right) = 0$$
that is, $2\Sigma u - 2\alpha u = 0$, which implies $\Sigma u = \alpha u$.

Thus $\alpha$ is an eigenvalue of the covariance matrix $\Sigma$, with the associated eigenvector $u$. Taking the dot product with $u$ on both sides, we have
$$\sigma_u^2 = u^T \Sigma u = u^T \alpha u = \alpha\, u^T u = \alpha$$

To maximize the projected variance $\sigma_u^2$, we thus choose the largest eigenvalue $\lambda_1$ of $\Sigma$; the dominant eigenvector $u_1$ specifies the direction of most variance, also called the first principal component.
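This result is easy to verify on synthetic data. The NumPy sketch below (not from the book; the data matrix is made up) computes the dominant eigenvector of the covariance matrix and checks that the variance of the projected coordinates equals the largest eigenvalue.

```python
import numpy as np

# Minimal sketch: the direction of most variance is the dominant eigenvector of
# the covariance matrix, and the projected variance along it equals lambda_1.
rng = np.random.default_rng(2)
D = rng.normal(size=(200, 3)) @ np.diag([3.0, 1.0, 0.3])  # synthetic data
Z = D - D.mean(axis=0)                 # center the data
Sigma = (Z.T @ Z) / len(Z)             # covariance matrix

eigvals, eigvecs = np.linalg.eigh(Sigma)      # ascending order for symmetric Sigma
lambda1, u1 = eigvals[-1], eigvecs[:, -1]     # largest eigenvalue / dominant eigenvector

proj_var = np.var(Z @ u1)              # variance of the coordinates a_i = u1^T x_i
assert np.isclose(proj_var, u1 @ Sigma @ u1)
assert np.isclose(proj_var, lambda1)
```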

Iris Data: First Principal Component

[Figure: the 3D Iris data $(X_1, X_2, X_3)$ with the first principal component $u_1$.]

Minimum Squared Error Approach

The direction that maximizes the projected variance is also the one that minimizes the average squared error. The mean squared error (MSE) optimization condition is
$$\mathrm{MSE}(u) = \frac{1}{n}\sum_{i=1}^{n} \|\epsilon_i\|^2 = \frac{1}{n}\sum_{i=1}^{n} \|x_i - x_i'\|^2 = \frac{1}{n}\sum_{i=1}^{n} \|x_i\|^2 - u^T \Sigma u$$
Since the first term is fixed for a dataset $D$, we see that the direction $u_1$ that maximizes the variance is also the one that minimizes the MSE.

Further, we have
$$\frac{1}{n}\sum_{i=1}^{n} \|x_i\|^2 = \mathrm{var}(D) = \mathrm{tr}(\Sigma) = \sum_{i=1}^{d} \sigma_i^2$$
Thus, for the direction $u_1$ that minimizes the MSE, we have
$$\mathrm{MSE}(u_1) = \mathrm{var}(D) - u_1^T \Sigma u_1 = \mathrm{var}(D) - \lambda_1$$

Best 2-Dimensional Approximation

The best 2D subspace that captures the most variance in $D$ comprises the eigenvectors $u_1$ and $u_2$ corresponding to the largest and second largest eigenvalues $\lambda_1$ and $\lambda_2$, respectively.

Let $U_2 = (u_1 \;\; u_2)$ be the matrix whose columns correspond to the two principal components. Given a point $x_i \in D$, its projected coordinates are computed as $a_i = U_2^T x_i$.

Let $A$ denote the projected 2D dataset. The total projected variance for $A$ is given as
$$\mathrm{var}(A) = u_1^T \Sigma u_1 + u_2^T \Sigma u_2 = u_1^T \lambda_1 u_1 + u_2^T \lambda_2 u_2 = \lambda_1 + \lambda_2$$

The first two principal components also minimize the mean squared error objective, since
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \|x_i - x_i'\|^2 = \mathrm{var}(D) - \frac{1}{n}\sum_{i=1}^{n} x_i^T P_2 x_i = \mathrm{var}(D) - \mathrm{var}(A)$$

Optimal and Non-Optimal 2D Approximations

The optimal subspace maximizes the variance and minimizes the squared error, whereas a non-optimal subspace captures less variance and has a higher mean squared error, as seen from the lengths of the error vectors (line segments).

[Figure: the 3D Iris data $(X_1, X_2, X_3)$ projected onto the optimal 2D subspace spanned by $u_1$ and $u_2$, and onto a non-optimal subspace, with the error vectors drawn as line segments.]

Best r-Dimensional Approximation

To find the best $r$-dimensional approximation to $D$, we compute the eigenvalues of $\Sigma$. Because $\Sigma$ is positive semidefinite, its eigenvalues are non-negative:
$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r \ge \lambda_{r+1} \ge \cdots \ge \lambda_d \ge 0$$
We select the $r$ largest eigenvalues, and their corresponding eigenvectors, to form the best $r$-dimensional approximation.

Total projected variance: Let $U_r = (u_1 \;\cdots\; u_r)$ be the $r$-dimensional basis vector matrix, with the projection matrix given as $P_r = U_r U_r^T = \sum_{i=1}^{r} u_i u_i^T$. Let $A$ denote the dataset formed by the coordinates of the projected points in the $r$-dimensional subspace. The projected variance is given as
$$\mathrm{var}(A) = \frac{1}{n}\sum_{i=1}^{n} x_i^T P_r x_i = \sum_{i=1}^{r} u_i^T \Sigma u_i = \sum_{i=1}^{r} \lambda_i$$

Mean squared error: The mean squared error in $r$ dimensions is
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \|x_i - x_i'\|^2 = \mathrm{var}(D) - \sum_{i=1}^{r} \lambda_i = \sum_{i=1}^{d} \lambda_i - \sum_{i=1}^{r} \lambda_i$$
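Both identities can be checked numerically. The NumPy sketch below (not from the book; synthetic data) confirms that the variance captured by the top-$r$ subspace is the sum of the top $r$ eigenvalues, and that the MSE is the sum of the discarded ones.

```python
import numpy as np

# Minimal sketch: var(A) = lambda_1 + ... + lambda_r and MSE = lambda_{r+1} + ... + lambda_d.
rng = np.random.default_rng(4)
Z = rng.normal(size=(300, 4)) @ np.diag([3.0, 2.0, 1.0, 0.2])
Z -= Z.mean(axis=0)                              # centered data
Sigma = (Z.T @ Z) / len(Z)

eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]                # sort eigenvalues in decreasing order
lambdas, U = eigvals[order], eigvecs[:, order]

r = 2
U_r = U[:, :r]
A = Z @ U_r                                      # projected coordinates a_i = U_r^T x_i
var_A = np.sum(np.var(A, axis=0))
mse = np.mean(np.sum((Z - A @ U_r.T) ** 2, axis=1))

assert np.isclose(var_A, lambdas[:r].sum())      # var(A) = sum of top-r eigenvalues
assert np.isclose(mse, lambdas[r:].sum())        # MSE    = sum of remaining eigenvalues
```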

Choosing the Dimensionality

One criterion for choosing $r$ is to compute the fraction of the total variance captured by the first $r$ principal components:
$$f(r) = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_r}{\lambda_1 + \lambda_2 + \cdots + \lambda_d} = \frac{\sum_{i=1}^{r} \lambda_i}{\sum_{i=1}^{d} \lambda_i} = \frac{\sum_{i=1}^{r} \lambda_i}{\mathrm{var}(D)}$$

Given a desired variance threshold, say $\alpha$, starting from the first principal component we keep adding components and stop at the smallest value of $r$ for which $f(r) \ge \alpha$. In other words, we select the fewest dimensions such that the subspace spanned by those $r$ dimensions captures at least an $\alpha$ fraction (say 0.9) of the total variance.
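A minimal sketch of this selection rule (not from the book; the helper name `choose_dimensionality` and the example eigenvalues are illustrative) is shown below.

```python
import numpy as np

# Choose the smallest r whose cumulative fraction of total variance f(r) reaches alpha.
def choose_dimensionality(lambdas, alpha=0.9):
    """lambdas: eigenvalues of the covariance matrix, sorted in decreasing order."""
    f = np.cumsum(lambdas) / np.sum(lambdas)     # f(r) for r = 1,...,d
    return int(np.searchsorted(f, alpha) + 1)    # smallest r with f(r) >= alpha

# Example: the first two components capture (5+3)/10 = 0.8 of the variance,
# so alpha = 0.9 requires r = 3.
print(choose_dimensionality(np.array([5.0, 3.0, 1.0, 0.5, 0.5]), alpha=0.9))  # -> 3
```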

Principal Component Analysis: Algorithm

PCA (D, α):
1. μ = (1/n) (x_1 + ··· + x_n)  // compute mean
2. Z = D − 1·μ^T  // center the data
3. Σ = (1/n) Z^T Z  // compute covariance matrix
4. (λ_1, λ_2, ..., λ_d) = eigenvalues(Σ)  // compute eigenvalues
5. U = (u_1 u_2 ··· u_d) = eigenvectors(Σ)  // compute eigenvectors
6. f(r) = (λ_1 + ··· + λ_r) / (λ_1 + ··· + λ_d), for all r = 1, 2, ..., d  // fraction of total variance
7. Choose smallest r so that f(r) ≥ α  // choose dimensionality
8. U_r = (u_1 u_2 ··· u_r)  // reduced basis
9. A = {a_i | a_i = U_r^T x_i, for i = 1, ..., n}  // reduced-dimensionality data
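A compact NumPy rendering of the pseudocode above might look as follows; this is a sketch rather than the authors' reference implementation, and the synthetic data in the usage example is made up.

```python
import numpy as np

def pca(D, alpha):
    n, d = D.shape
    mu = D.mean(axis=0)                          # 1: mean
    Z = D - mu                                   # 2: center the data
    Sigma = (Z.T @ Z) / n                        # 3: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)     # 4-5: eigen-decomposition
    order = np.argsort(eigvals)[::-1]            # sort in decreasing order
    lambdas, U = eigvals[order], eigvecs[:, order]
    f = np.cumsum(lambdas) / lambdas.sum()       # 6: fraction of total variance
    r = int(np.searchsorted(f, alpha) + 1)       # 7: smallest r with f(r) >= alpha
    U_r = U[:, :r]                               # 8: reduced basis
    A = Z @ U_r                                  # 9: reduced-dimensionality data
    return A, U_r, lambdas[:r]

# Example usage on synthetic data:
rng = np.random.default_rng(5)
D = rng.normal(size=(150, 4)) @ np.diag([3.0, 1.0, 0.3, 0.1])
A, U_r, lambdas = pca(D, alpha=0.95)
print(A.shape, lambdas)
```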

Iris Principal Components

Example (3D Iris data): from the covariance matrix $\Sigma$ we compute the eigenvalues $\lambda_1 \ge \lambda_2 \ge \lambda_3$ and the corresponding eigenvectors $u_1$, $u_2$, $u_3$. The total variance is $\lambda_1 + \lambda_2 + \lambda_3$, and the fraction of total variance $f(r)$ is evaluated for $r = 1, 2, 3$. For this data, $r = 2$ principal components are needed to capture an $\alpha = 0.95$ fraction of the total variance.

Iris Data: Optimal 3D PC Basis

[Figure: the 3D Iris data $(X_1, X_2, X_3)$ with the full principal component basis $u_1$, $u_2$, $u_3$.]

Iris Principal Components: Projected Data (2D)

[Figure: the Iris data projected onto the first two principal components $u_1$ and $u_2$.]

Geometry of PCA

Geometrically, when $r = d$, PCA corresponds to an orthogonal change of basis, so that the total variance is captured by the sum of the variances along each of the principal directions $u_1, u_2, \ldots, u_d$, and further, all covariances are zero.

Let $U$ be the $d \times d$ orthogonal matrix $U = (u_1 \;\; u_2 \;\cdots\; u_d)$, with $U^{-1} = U^T$. Let $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$ be the diagonal matrix of eigenvalues. Each principal component $u_i$ corresponds to an eigenvector of the covariance matrix $\Sigma$:
$$\Sigma u_i = \lambda_i u_i \quad \text{for all } 1 \le i \le d$$
which can be written compactly in matrix notation as
$$\Sigma U = U \Lambda, \quad \text{which implies} \quad \Sigma = U \Lambda U^T$$
Thus, $\Lambda$ represents the covariance matrix in the new PC basis.

In the new PC basis, the equation $x^T \Sigma^{-1} x = 1$ defines a $d$-dimensional ellipsoid (or hyper-ellipse). The eigenvectors $u_i$ of $\Sigma$, that is, the principal components, are the directions of the principal axes of the ellipsoid. The square roots of the eigenvalues, that is, $\sqrt{\lambda_i}$, give the lengths of the semi-axes.
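The diagonalization $\Sigma = U \Lambda U^T$ and the vanishing covariances in the PC basis can be checked directly; the NumPy sketch below (not from the book; synthetic data) does so.

```python
import numpy as np

# Minimal sketch: in the PC basis the covariance matrix becomes the diagonal
# matrix of eigenvalues, i.e. all covariances vanish.
rng = np.random.default_rng(6)
Z = rng.normal(size=(400, 3)) @ rng.normal(size=(3, 3))
Z -= Z.mean(axis=0)                              # centered data
Sigma = (Z.T @ Z) / len(Z)

eigvals, U = np.linalg.eigh(Sigma)
Lambda = np.diag(eigvals)
assert np.allclose(Sigma, U @ Lambda @ U.T)      # Sigma = U Lambda U^T

A = Z @ U                                        # data expressed in the PC basis
Sigma_A = (A.T @ A) / len(A)
assert np.allclose(Sigma_A, Lambda, atol=1e-10)  # diagonal covariance in the new basis
```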

Iris: Elliptic Contours in Standard Basis

[Figure: elliptic contours of the Iris data drawn in the standard basis, with the principal directions $u_1$, $u_2$, $u_3$ overlaid.]

Iris: Axis-Parallel Ellipsoid in PC Basis

[Figure: the same ellipsoid becomes axis-parallel when drawn in the PC basis $(u_1, u_2, u_3)$.]

Kernel Principal Component Analysis

Principal component analysis can be extended to find nonlinear directions in the data using kernel methods. Kernel PCA finds the directions of most variance in the feature space instead of the input space. Using the kernel trick, all PCA operations can be carried out in terms of the kernel function in input space, without having to transform the data into feature space.

Let $\phi$ be a function that maps a point $x_i$ in input space to its image $\phi(x_i)$ in feature space. Let the points in feature space be centered, and let $\Sigma_\phi$ be the covariance matrix. The first PC in feature space corresponds to the dominant eigenvector
$$\Sigma_\phi u_1 = \lambda_1 u_1, \quad \text{where} \quad \Sigma_\phi = \frac{1}{n}\sum_{i=1}^{n} \phi(x_i)\,\phi(x_i)^T$$

Kernel Principal Component Analysis

It can be shown that $u_1 = \sum_{i=1}^{n} c_i\, \phi(x_i)$. That is, the PC direction in feature space is a linear combination of the transformed points. The coefficients are captured in the weight vector $c = (c_1, c_2, \ldots, c_n)^T$.

Substituting into the eigen-decomposition of $\Sigma_\phi$ and simplifying, we get
$$K c = n\lambda_1 c = \eta_1 c$$
Thus, the weight vector $c$ is the eigenvector corresponding to the largest eigenvalue $\eta_1$ of the kernel matrix $K$.

Kernel Principal Component Analysis

The weight vector $c$ can then be used to find $u_1$ via $u_1 = \sum_{i=1}^{n} c_i\, \phi(x_i)$. The only constraint we impose is that $u_1$ should be normalized to be a unit vector, which implies $\|c\|^2 = \frac{1}{\eta_1}$.

We cannot compute the principal direction directly, but we can project any point $\phi(x)$ onto the principal direction $u_1$ as follows:
$$u_1^T \phi(x) = \sum_{i=1}^{n} c_i\, \phi(x_i)^T \phi(x) = \sum_{i=1}^{n} c_i\, K(x_i, x)$$
which requires only kernel operations.

We can obtain the additional principal components by solving for the other eigenvalues and eigenvectors of $K$:
$$K c_j = n\lambda_j c_j = \eta_j c_j$$
If we sort the eigenvalues of $K$ in decreasing order, $\eta_1 \ge \eta_2 \ge \cdots \ge \eta_n \ge 0$, we obtain the $j$th principal component from the corresponding eigenvector $c_j$. The variance along the $j$th principal component is given as $\lambda_j = \frac{\eta_j}{n}$.
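To see that projections really need only kernel evaluations, the sketch below (not from the book) uses the homogeneous quadratic kernel, whose feature map $\phi(x) = (x_1^2, x_2^2, \sqrt{2}\,x_1 x_2)$ is known explicitly, and compares the kernel-based projection with the explicit one. Feature-space centering is omitted to keep the check short, so this is illustrative only.

```python
import numpy as np

# Minimal sketch: project onto a kernel principal direction using only kernel values.
rng = np.random.default_rng(7)
X = rng.normal(size=(50, 2))

K = (X @ X.T) ** 2                                 # homogeneous quadratic kernel matrix
eta, C = np.linalg.eigh(K)
eta1, c = eta[-1], C[:, -1]                        # dominant eigenvalue / eigenvector
c = c / np.sqrt(eta1)                              # rescale so that ||u_1|| = 1

phi = lambda P: np.column_stack([P[:, 0]**2, P[:, 1]**2, np.sqrt(2) * P[:, 0] * P[:, 1]])
u1 = phi(X).T @ c                                  # u_1 = sum_i c_i phi(x_i)
assert np.isclose(np.linalg.norm(u1), 1.0)

x_new = rng.normal(size=(1, 2))
proj_kernel = ((X @ x_new.T) ** 2).ravel() @ c     # sum_i c_i K(x_i, x_new)
proj_feature = phi(x_new) @ u1                     # u_1^T phi(x_new)
assert np.isclose(proj_kernel, proj_feature[0])
```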

Kernel PCA Algorithm

KERNELPCA (D, K, α):
1. K = { K(x_i, x_j) }, for i, j = 1, ..., n  // compute n × n kernel matrix
2. K = (I − (1/n) 1_{n×n}) K (I − (1/n) 1_{n×n})  // center the kernel matrix
3. (η_1, η_2, ..., η_n) = eigenvalues(K)  // compute eigenvalues
4. (c_1 c_2 ··· c_n) = eigenvectors(K)  // compute eigenvectors
5. λ_i = η_i / n, for all i = 1, ..., n  // compute variance for each component
6. c_i = sqrt(1/η_i) · c_i, for all i = 1, ..., n  // ensure that u_i^T u_i = 1
7. f(r) = (λ_1 + ··· + λ_r) / (λ_1 + ··· + λ_n), for all r = 1, 2, ..., n  // fraction of total variance
8. Choose smallest r so that f(r) ≥ α  // choose dimensionality
9. C_r = (c_1 c_2 ··· c_r)  // reduced basis
10. A = {a_i | a_i = C_r^T K_i, for i = 1, ..., n}  // reduced-dimensionality data
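A compact NumPy rendering of the Kernel PCA pseudocode might look as follows; this is a sketch, not the authors' reference implementation, and the `kernel` argument and the quadratic-kernel usage example are illustrative.

```python
import numpy as np

def kernel_pca(X, kernel, alpha):
    n = len(X)
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])   # 1: kernel matrix
    J = np.eye(n) - np.ones((n, n)) / n
    K = J @ K @ J                                              # 2: center the kernel matrix
    eta, C = np.linalg.eigh(K)                                 # 3-4: eigen-decomposition
    order = np.argsort(eta)[::-1]
    eta, C = np.clip(eta[order], 0.0, None), C[:, order]       # drop tiny negative eigenvalues
    lambdas = eta / n                                          # 5: variance per component
    pos = eta > 1e-12
    C[:, pos] = C[:, pos] / np.sqrt(eta[pos])                  # 6: ensure u_i^T u_i = 1
    f = np.cumsum(lambdas) / lambdas.sum()                     # 7: fraction of total variance
    r = int(np.searchsorted(f, alpha) + 1)                     # 8: smallest r with f(r) >= alpha
    C_r = C[:, :r]                                             # 9: reduced basis
    A = K @ C_r                                                # 10: a_i = C_r^T K_i for each i
    return A, lambdas[:r]

# Example with the homogeneous quadratic kernel K(x, y) = (x^T y)^2:
rng = np.random.default_rng(8)
X = rng.normal(size=(60, 2))
A, lambdas = kernel_pca(X, lambda x, y: (x @ y) ** 2, alpha=0.95)
print(A.shape, lambdas)
```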

Nonlinear Iris Data: PCA in Input Space

[Figure: nonlinear 2D Iris data $(X_1, X_2)$ with the linear principal directions $u_1$ and $u_2$ found in input space.]

Nonlinear Iris Data: Projection onto PCs

[Figure: the nonlinear Iris data projected onto the first two principal components $u_1$ and $u_2$.]

Kernel PCA: 3 PCs (Contours of Constant Projection)

Homogeneous quadratic kernel: $K(x_i, x_j) = (x_i^T x_j)^2$

[Figure: contours of constant projection onto the first three kernel principal components, drawn in the $(X_1, X_2)$ input space.]

Kernel PCA: Projected Points onto 2 PCs

Homogeneous quadratic kernel: $K(x_i, x_j) = (x_i^T x_j)^2$

[Figure: the points projected onto the first two kernel principal components $u_1$ and $u_2$.]

Singular Value Decomposition

Principal component analysis is a special case of a more general matrix decomposition method called Singular Value Decomposition (SVD). PCA yields the following decomposition of the covariance matrix:
$$\Sigma = U \Lambda U^T$$
where the covariance matrix has been factorized into the orthogonal matrix $U$ containing its eigenvectors, and a diagonal matrix $\Lambda$ containing its eigenvalues (sorted in decreasing order).

SVD generalizes the above factorization to any matrix. In particular, for an $n \times d$ data matrix $D$ with $n$ points and $d$ columns, SVD factorizes $D$ as
$$D = L \Delta R^T$$
where $L$ is an orthogonal $n \times n$ matrix, $R$ is an orthogonal $d \times d$ matrix, and $\Delta$ is an $n \times d$ diagonal matrix defined as $\Delta(i,i) = \delta_i$, and 0 otherwise. The columns of $L$ are called the left singular vectors, and the columns of $R$ (or rows of $R^T$) are called the right singular vectors. The entries $\delta_i$ are called the singular values of $D$, and they are all non-negative.

Reduced SVD

If the rank of $D$ is $r \le \min(n, d)$, then there are only $r$ nonzero singular values, ordered as $\delta_1 \ge \delta_2 \ge \cdots \ge \delta_r > 0$. We discard the left and right singular vectors that correspond to zero singular values, to obtain the reduced SVD:
$$D = L_r \Delta_r R_r^T$$
where $L_r$ is the $n \times r$ matrix of the left singular vectors, $R_r$ is the $d \times r$ matrix of the right singular vectors, and $\Delta_r$ is the $r \times r$ diagonal matrix containing the positive singular values.

The reduced SVD leads directly to the spectral decomposition of $D$:
$$D = \sum_{i=1}^{r} \delta_i\, l_i\, r_i^T$$
The best rank-$q$ approximation to the original data $D$ is the matrix $D_q = \sum_{i=1}^{q} \delta_i\, l_i\, r_i^T$ that minimizes the expression $\|D - D_q\|_F$, where $\|A\|_F = \sqrt{\sum_{i=1}^{n}\sum_{j=1}^{d} A(i,j)^2}$ is the Frobenius norm of $A$.
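The best rank-$q$ approximation is easy to build with a standard SVD routine. The NumPy sketch below (not from the book; random data) truncates the SVD and checks that the squared Frobenius error equals the sum of the squared discarded singular values.

```python
import numpy as np

# Minimal sketch: the truncated SVD gives the best rank-q approximation
# of a matrix in the Frobenius norm.
rng = np.random.default_rng(9)
D = rng.normal(size=(20, 5))

L, delta, Rt = np.linalg.svd(D, full_matrices=False)   # D = L diag(delta) R^T
q = 2
D_q = L[:, :q] @ np.diag(delta[:q]) @ Rt[:q, :]        # D_q = sum_{i<=q} delta_i l_i r_i^T

# Squared Frobenius error equals the sum of the squared discarded singular values.
err = np.linalg.norm(D - D_q, ord='fro') ** 2
assert np.isclose(err, np.sum(delta[q:] ** 2))
```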

Connection Between SVD and PCA

Assume $D$ has been centered, and let $D = L \Delta R^T$ via SVD. Consider the scatter matrix for $D$, given as $D^T D$. We have
$$D^T D = \left(L \Delta R^T\right)^T \left(L \Delta R^T\right) = R\, \Delta^T L^T L\, \Delta\, R^T = R\, (\Delta^T \Delta)\, R^T = R\, \Delta_d^2\, R^T$$
where $\Delta_d^2$ is the $d \times d$ diagonal matrix defined as $\Delta_d^2(i,i) = \delta_i^2$, for $i = 1, \ldots, d$.

The covariance matrix of the centered $D$ is given as $\Sigma = \frac{1}{n} D^T D$, so we get
$$D^T D = n\Sigma = n\, U \Lambda U^T = U (n\Lambda) U^T$$
The right singular vectors $R$ are the same as the eigenvectors of $\Sigma$. The singular values of $D$ are related to the eigenvalues of $\Sigma$ as
$$n\lambda_i = \delta_i^2, \quad \text{which implies} \quad \lambda_i = \frac{\delta_i^2}{n}, \quad \text{for } i = 1, \ldots, d$$
Likewise, the left singular vectors in $L$ are the eigenvectors of the $n \times n$ matrix $D D^T$, and the corresponding eigenvalues are given as $\delta_i^2$.
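This connection can be verified directly. The NumPy sketch below (not from the book; synthetic centered data) checks that the right singular vectors of $D$ are eigenvectors of $\Sigma$ and that $\lambda_i = \delta_i^2 / n$.

```python
import numpy as np

# Minimal sketch: SVD of centered data vs. eigen-decomposition of its covariance.
rng = np.random.default_rng(10)
D = rng.normal(size=(100, 3)) @ np.diag([2.0, 1.0, 0.5])
D -= D.mean(axis=0)                                   # center the data
n = len(D)

Sigma = (D.T @ D) / n
lambdas = np.sort(np.linalg.eigvalsh(Sigma))[::-1]    # eigenvalues, decreasing

L, delta, Rt = np.linalg.svd(D, full_matrices=False)
assert np.allclose(lambdas, delta**2 / n)             # lambda_i = delta_i^2 / n
# Each right singular vector is an eigenvector of Sigma (up to sign):
for i in range(3):
    v = Rt[i]
    assert np.allclose(Sigma @ v, lambdas[i] * v)
```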
