Principal Component Analysis
1 Principal Component Analysis

CS5240 Theoretical Foundations in Multimedia
Leow Wee Kheng
Department of Computer Science, School of Computing
National University of Singapore

Leow Wee Kheng (NUS) Principal Component Analysis 1 / 47
2 Motivation

How wide is the widest part of NGC 1300?
3 Motivation

How thick is the thickest part of NGC 4594? Use principal component analysis.
4 Maximum Variance Estimate

Consider a set of points x_i, i = 1, ..., n, in an m-dimensional space such that their mean µ = 0, i.e., the centroid is at the origin.

[Figure: points x_1, ..., x_4 in the plane; x_i has projection y_i onto a line l through the origin O along unit vector q, and perpendicular distance d_i from l.]

We want to find a line l through the origin that maximizes the variance of the projections y_i of the points x_i on l.
5 Maximum Variance Estimate

Let q denote the unit vector along line l. Then, the projection y_i of x_i on l is

    y_i = x_i^T q.                                                    (1)

The mean squared projection, which is the variance V over all points, is

    V = (1/n) Σ_{i=1}^n y_i^2 = (1/n) Σ_{i=1}^n (x_i^T q)^2 ≥ 0.      (2)

Expanding Eq. 2 gives

    V = (1/n) Σ_{i=1}^n (q^T x_i)(x_i^T q)
      = q^T [ (1/n) Σ_{i=1}^n x_i x_i^T ] q.                          (3)

The middle factor is the covariance matrix C of the data points:

    C = (1/n) Σ_{i=1}^n x_i x_i^T.                                    (4)
6 Maximum Variance Estimate

We want to find a unit vector q that maximizes the variance V, i.e.,

    maximize  V = q^T C q   subject to  ||q|| = 1.                    (5)

This is a constrained optimization problem. Use the Lagrange multiplier method: combine V and the constraint using a Lagrange multiplier λ:

    maximize  V = q^T C q − λ(q^T q − 1).                             (6)
7 Maximum Variance Estimate: Lagrange Multiplier Method

Lagrange multiplier is a method for solving constrained optimization. Consider this problem:

    maximize  f(x)   subject to  g(x) = c.                            (7)

The Lagrange multiplier method introduces a Lagrange multiplier λ to combine f(x) and g(x):

    L(x, λ) = f(x) + λ(g(x) − c).                                     (8)

The sign of λ can be positive or negative. Then, solve for the stationary point of L(x, λ):

    ∂L(x, λ)/∂x = 0.                                                  (9)
8 Maximum Variance Estimate: Lagrange Multiplier Method

If x_0 is a solution of the original problem, then there is a λ_0 such that (x_0, λ_0) is a stationary point of L. Not all stationary points yield solutions of the original problem.

The same method applies to minimization problems. Multiple constraints can be combined by adding multiple terms:

    minimize  f(x)   subject to  g_1(x) = c_1,  g_2(x) = c_2          (10)

is solved with

    L(x) = f(x) + λ_1(g_1(x) − c_1) + λ_2(g_2(x) − c_2).              (11)
9 Maximum Variance Estimate

Now, we differentiate V with respect to q and set it to 0:

    ∂V/∂q = 2q^T C − 2λq^T = 0.                                       (12)

Rearranging the terms gives

    q^T C = λq^T.                                                     (13)

Since the covariance matrix C is symmetric (homework), C^T = C. So, transposing both sides of Eq. 13 gives

    Cq = λq.                                                          (14)

Eq. 14 is called an eigenvector equation: q is the eigenvector and λ is the eigenvalue. Thus, the eigenvector q of C gives the line that maximizes the variance V.
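The claim that the eigenvector with the largest eigenvalue maximizes q^T C q can be checked numerically. Here is a minimal NumPy sketch (my illustration, not part of the slides; the synthetic data and tolerances are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data: n points in m dimensions, stretched along axes.
n, m = 500, 3
X = rng.normal(size=(n, m)) @ np.diag([3.0, 1.0, 0.5])
X = X - X.mean(axis=0)            # centre so the centroid is at the origin

C = X.T @ X / n                   # covariance matrix, Eq. 4

# np.linalg.eigh returns eigenvalues in ascending order for symmetric C.
eigvals, eigvecs = np.linalg.eigh(C)
q = eigvecs[:, -1]                # eigenvector with the largest eigenvalue

v_best = q @ C @ q                # variance along q equals the eigenvalue (λ = V)
for _ in range(1000):
    r = rng.normal(size=m)
    r /= np.linalg.norm(r)        # random unit direction
    assert r @ C @ r <= v_best + 1e-12   # no direction beats the top eigenvector
```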
10 Maximum Variance Estimate

The perpendicular distance d_i of x_i from the line l is

    d_i = ||x_i − y_i q|| = ||x_i − (x_i^T q) q||.                    (15)

[Figure: the same configuration as before, showing d_i as the perpendicular distance from x_i to l.]
11 Maximum Variance Estimate

The squared distance is

    d_i^2 = ||x_i − (x_i^T q) q||^2
          = (x_i − (x_i^T q) q)^T (x_i − (x_i^T q) q)
          = x_i^T x_i − x_i^T (x_i^T q) q − (x_i^T q) q^T x_i + (x_i^T q) q^T (x_i^T q) q
          = x_i^T x_i − (x_i^T q)^2,

since q^T q = 1. Averaging over all i gives

    D = (1/n) Σ_{i=1}^n d_i^2 = (1/n) Σ_{i=1}^n x_i^T x_i − V.        (16)

So, maximizing the variance V means minimizing the mean squared distance D to the data points x_i. The eigenvalue λ = V, the variance (homework); so λ ≥ 0.
12 Maximum Variance Estimate

Name q as the first eigenvector q_1. The component of x_i orthogonal to q_1 is

    x_i' = x_i − (x_i^T q_1) q_1.

Repeat the previous method on x_i' in the (m−1)-D subspace orthogonal to q_1 to get q_2. Then, repeat to get q_3, ..., q_m.

[Figure: the subspace through the origin O orthogonal to q_1.]
13 Eigendecomposition

In general, an m × m covariance matrix C has m eigenvectors:

    C q_j = λ_j q_j,   j = 1, ..., m.                                 (17)

Transpose both sides of the equations to get

    q_j^T C = λ_j q_j^T,   j = 1, ..., m,                             (18)

which are row matrices. Stack the row matrices into a column to get

    [ q_1^T ]       [ λ_1 ...  0  ] [ q_1^T ]
    [  ...  ] C  =  [ ...      ... ] [  ...  ]                        (19)
    [ q_m^T ]       [  0  ... λ_m ] [ q_m^T ]

Denote the matrices Q and Λ as

    Q = [q_1 ... q_m],   Λ = diag(λ_1, ..., λ_m).                     (20)
14 Eigendecomposition

Then, Eq. 19 becomes

    Q^T C = Λ Q^T                                                     (21)

or

    C = Q Λ Q^T.                                                      (22)

This is the matrix equation for eigendecomposition. The eigenvectors are orthonormal:

    q_j^T q_j = 1,   q_j^T q_k = 0 for k ≠ j.                         (23)

So, the eigenmatrix Q is orthogonal:

    Q^T Q = Q Q^T = I.                                                (24)

The eigenvalues are arranged to be sorted: λ_j ≥ λ_{j+1}.
15 Eigendecomposition

In general, the mean µ of the data points x_i is not at the origin. In this case, we subtract µ from each x_i. Now, we can transform x_i into a new vector y_i:

    y_i = Q^T (x_i − µ) = [ q_1^T (x_i − µ) ]
                          [       ...       ]                         (25)
                          [ q_m^T (x_i − µ) ]

where

    y_ij = (x_i − µ)^T q_j                                            (26)

is the projection of x_i − µ on q_j. The original x_i can be recovered from y_i:

    x_i = Q y_i + µ.                                                  (27)

Caution! x_i ≠ y_i + µ: x_i is in the original input space but y_i is in the eigenspace.
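Eqs. 25 and 27 amount to a rotation into the eigenspace and back. A short NumPy sketch (my illustration, with made-up data) of the round trip:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 4
X = rng.normal(size=(n, m)) + 5.0          # data with non-zero mean

mu = X.mean(axis=0)
C = (X - mu).T @ (X - mu) / n
_, Q = np.linalg.eigh(C)                   # columns of Q are eigenvectors q_j

x = X[0]
y = Q.T @ (x - mu)                         # Eq. 25: map into the eigenspace
x_rec = Q @ y + mu                         # Eq. 27: map back to the input space

# With all m eigenvectors the recovery is exact; note x != y + mu in general,
# since y lives in eigenspace coordinates.
assert np.allclose(x_rec, x)
```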
16 Eigendecomposition

[Figure slide.]
17 PCA Algorithm 1

Let x_i denote m-dimensional vectors (data points), i = 1, ..., n.

1. Compute the mean vector of the data points:

       µ = (1/n) Σ_{i=1}^n x_i.                                       (28)

2. Compute the covariance matrix:

       C = (1/n) Σ_{i=1}^n (x_i − µ)(x_i − µ)^T.                      (29)

3. Perform eigendecomposition of C:

       C = Q Λ Q^T.                                                   (30)
18 PCA Algorithm 1

Some books and papers use the sample covariance

    C = (1/(n−1)) Σ_{i=1}^n (x_i − µ)(x_i − µ)^T.                     (31)

Eq. 31 differs from Eq. 29 by only a constant factor.
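The three steps of Algorithm 1 can be sketched in NumPy as follows (a minimal illustration; np.linalg.eigh is one standard routine for eigendecomposing a symmetric matrix, and it returns eigenvalues in ascending order, so we re-sort them descending):

```python
import numpy as np

def pca(X):
    """PCA Algorithm 1. X: n x m array, one data point per row.
    Returns the mean mu, eigenvalues (descending), eigenvectors as columns of Q."""
    n = X.shape[0]
    mu = X.mean(axis=0)                    # Step 1: mean vector, Eq. 28
    A = X - mu
    C = A.T @ A / n                        # Step 2: covariance matrix, Eq. 29
    lam, Q = np.linalg.eigh(C)             # Step 3: eigendecomposition, Eq. 30
    order = np.argsort(lam)[::-1]          # sort lambda_1 >= ... >= lambda_m
    return mu, lam[order], Q[:, order]

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
mu, lam, Q = pca(X)

# Sanity check: C = Q Lambda Q^T, Eq. 22.
C = (X - mu).T @ (X - mu) / X.shape[0]
assert np.allclose(Q @ np.diag(lam) @ Q.T, C)
```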
19 PCA Algorithm 1: Properties of PCA

- The mean µ_y over all y_i is 0 (homework).
- The variance σ_j^2 along q_j is λ_j (homework). Since λ_1 ≥ ... ≥ λ_m ≥ 0, we have σ_1 ≥ ... ≥ σ_m ≥ 0.
- q_1 gives the orientation of the largest variance.
- q_2 gives the orientation of the largest variance orthogonal to q_1 (2nd largest variance).
- q_j gives the orientation of the largest variance orthogonal to q_1, ..., q_{j−1} (j-th largest variance).
- q_m is orthogonal to all other eigenvectors (least variance).
20–25 PCA Algorithm 1: Properties of PCA

[Figure slides: Data Points; Centroid at Origin; Principal Components; Another Example; Centroid at Origin; Principal Components.]
26 PCA Algorithm 1: Properties of PCA

Notes:
- The covariance matrix C is an m × m matrix. Eigendecomposition of a very large matrix is inefficient.
- Example: a colour image has a very large number of values, so the number of dimensions m is huge, and C has size m × m!
- The number of images n is usually ≪ m.
27 PCA Algorithm 2

1. Compute the mean µ of the data points x_i:

       µ = (1/n) Σ_{i=1}^n x_i.

2. Form an m × n matrix A, with n ≪ m:

       A = [(x_1 − µ) ... (x_n − µ)].                                 (32)

3. Compute A^T A, which is just an n × n matrix.

4. Apply eigendecomposition on A^T A:

       (A^T A) q_j = λ_j q_j.                                         (33)

5. Pre-multiply Eq. 33 by A, giving

       A A^T A q_j = λ_j A q_j.                                       (34)
28 PCA Algorithm 2

6. Recover the eigenvectors and eigenvalues. Since A A^T = nC (homework), Eq. 34 is

       nC (A q_j) = λ_j (A q_j),                                      (35)

   i.e.,

       C (A q_j) = (λ_j / n) (A q_j).                                 (36)

   Therefore, the eigenvectors of C are A q_j, and the eigenvalues of C are λ_j / n.
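The steps above can be sketched in NumPy (my illustration; the synthetic data and the tolerance for discarding near-zero eigenvalues are my own choices, and the recovered eigenvectors A q_j are normalised to unit length):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 20, 1000                    # few points in many dimensions: n << m
X = rng.normal(size=(n, m))

mu = X.mean(axis=0)                # Step 1: mean
A = (X - mu).T                     # Step 2: m x n matrix A, Eq. 32

G = A.T @ A                        # Step 3: the small n x n matrix A^T A
lam, V = np.linalg.eigh(G)         # Step 4: (A^T A) q_j = lambda_j q_j

keep = lam > 1e-6 * lam.max()      # drop near-zero eigenvalues (mean removal costs one)
Q = A @ V[:, keep]                 # Step 6: eigenvectors of C are A q_j ...
Q /= np.linalg.norm(Q, axis=0)     # ... normalised to unit length
lam_C = lam[keep] / n              # eigenvalues of C are lambda_j / n

# Check Eq. 36 against the full covariance matrix: C (A q_j) = (lambda_j / n)(A q_j).
C = A @ A.T / n
assert np.allclose(C @ Q, Q * lam_C)
```

The point of the trick: eigendecomposing the 20 × 20 matrix A^T A is far cheaper than eigendecomposing the 1000 × 1000 matrix C directly.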
29 PCA Algorithm 3

1. Compute the mean µ of the data points x_i:

       µ = (1/n) Σ_{i=1}^n x_i.

2. Form an m × n matrix A:

       A = [(x_1 − µ) ... (x_n − µ)].

3. Apply singular value decomposition (SVD) on A:

       A = U Σ V^T.

4. Recover the eigenvectors and eigenvalues (page 31).
30 PCA Algorithm 3: Singular Value Decomposition

Singular value decomposition (SVD) decomposes a matrix A into

    A = U Σ V^T.                                                      (37)

- The column vectors of U are the left singular vectors u_j, and the u_j are orthonormal.
- The column vectors of V are the right singular vectors v_j, and the v_j are orthonormal.
- Σ is diagonal and contains the singular values s_j.
- The rank of A = the number of non-zero singular values.
31 PCA Algorithm 3: Singular Value Decomposition

Notice that

    A A^T = U Σ V^T (U Σ V^T)^T = U Σ V^T V Σ U^T = U Σ^2 U^T,

and the eigendecomposition of A A^T is A A^T = Q Λ Q^T. So, the eigenvectors of A A^T are the u_j, and the eigenvalues of A A^T are s_j^2.

On the other hand,

    A^T A = (U Σ V^T)^T U Σ V^T = V Σ U^T U Σ V^T = V Σ^2 V^T.

Compare with the eigendecomposition of A^T A: A^T A = Q Λ Q^T. So, the eigenvectors of A^T A are the v_j, and the eigenvalues of A^T A are s_j^2.
32 PCA Algorithm 3

4. Recover the eigenvectors and eigenvalues. Compare A A^T = nC with the eigendecomposition of C:

       A A^T = U Σ^2 U^T,   C = Q Λ Q^T.

   Therefore, the eigenvectors q_j of C are the u_j in U, and the eigenvalues λ_j of C are s_j^2 / n, for s_j in Σ.
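Algorithm 3 can be sketched in NumPy as follows (my illustration on made-up data; note that np.linalg.svd returns V^T directly, and full_matrices=False gives the economy-size factors):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 50, 8
X = rng.normal(size=(n, m))

mu = X.mean(axis=0)                                # Step 1: mean
A = (X - mu).T                                     # Step 2: m x n matrix of centred points

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # Step 3: A = U Sigma V^T

Q = U                                              # Step 4: eigenvectors of C are the u_j
lam = s**2 / n                                     # eigenvalues are s_j^2 / n

# Verify against the direct eigendecomposition C = Q Lambda Q^T.
C = A @ A.T / n
assert np.allclose(C, Q @ np.diag(lam) @ Q.T)
```

Working with A directly via SVD avoids ever forming C, which is also numerically better conditioned than squaring the data.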
33 Application Examples

Identify the 1st and 2nd major axes of these objects.
34 Application Examples

Apply PCA for plane fitting in 3-D. Compute the PCA of the points. Then:
- mean of the points = known point on the plane
- 3rd eigenvector = normal of the plane
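A sketch of PCA-based plane fitting (my own illustration with synthetic noisy points on a made-up plane; with np.linalg.eigh the eigenvalues come out ascending, so the "3rd" eigenvector, the one with the smallest variance, is the first column):

```python
import numpy as np

rng = np.random.default_rng(5)

# Noisy points near the plane z = 0.2 x - 0.5 y + 3.
xy = rng.uniform(-5, 5, size=(200, 2))
z = 0.2 * xy[:, 0] - 0.5 * xy[:, 1] + 3 + rng.normal(scale=0.01, size=200)
P = np.column_stack([xy, z])

mu = P.mean(axis=0)                     # known point on the plane
C = (P - mu).T @ (P - mu) / len(P)
lam, Q = np.linalg.eigh(C)              # ascending eigenvalues
normal = Q[:, 0]                        # smallest-variance eigenvector = plane normal

# The fitted normal should be parallel (up to sign) to the true normal (0.2, -0.5, -1).
true_n = np.array([0.2, -0.5, -1.0])
true_n /= np.linalg.norm(true_n)
assert abs(abs(normal @ true_n) - 1) < 1e-3
```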
35 Dimensionality Reduction

If a point represents a colour image, then the dimensionality is huge! But not all eigenvalues are non-zero.
36 Dimensionality Reduction

Case 1: Number of data vectors n ≪ m, the number of dimensions.
- If the data vectors are all independent, then the number of non-zero eigenvalues (eigenvectors) = n − 1, i.e., the rank of the covariance matrix = n − 1. Why not = n?
- If the data vectors are not independent, then the rank of the covariance matrix < n − 1.

Case 2: Number of data vectors n > m.
- With m or more independent data vectors, the rank of the covariance matrix = m.
- If fewer than m data vectors are independent, what is the rank of the covariance matrix?

In practice, it is often possible to reduce the dimensionality.
37 Dimensionality Reduction

The eigenmatrix Q is

    Q = [q_1 ... q_m].                                                (38)

PCA maps a data point x to a vector y in the eigenspace as

    y = Q^T (x − µ) = Σ_{j=1}^m ((x − µ)^T q_j) q_j,                  (39)

which has m dimensions. Pick the l eigenvectors with the largest eigenvalues to form the truncated Q̂:

    Q̂ = [q_1 ... q_l],                                               (40)

which spans a subspace of the eigenspace. Then, Q̂ maps x to ŷ, an estimate of y:

    ŷ = Q̂^T (x − µ) = Σ_{j=1}^l ((x − µ)^T q_j) q_j,                 (41)

which has l < m dimensions.
38 Dimensionality Reduction

[Diagram: x is mapped to y by Q^T; dimensionality reduction truncates y to ŷ by dropping y_{l+1}, ..., y_m; Q̂ maps ŷ back to x̂.]

The difference between y and ŷ is

    y − ŷ = Σ_{j=l+1}^m ((x − µ)^T q_j) q_j.                          (42)
39 Dimensionality Reduction

Beware:

    y = Q^T (x − µ)   and, therefore,   x = Q y + µ.

But,

    ŷ = Q̂^T (x − µ),   whereas   x̂ = Q̂ ŷ + µ ≠ x.                  (43)

Why? With n data points x_i, the sum-squared error E between x_i and its estimate x̂_i is

    E = Σ_{i=1}^n ||x_i − x̂_i||^2.                                   (44)
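A sketch (my own illustration, with synthetic data whose variance is concentrated in two directions) of truncation and reconstruction, Eqs. 40–44; it also checks the standard identity that E equals n times the sum of the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, l = 400, 6, 2
# Data with most variance in the first two coordinate directions.
X = rng.normal(size=(n, m)) * np.array([10.0, 5.0, 0.3, 0.2, 0.1, 0.1])

mu = X.mean(axis=0)
C = (X - mu).T @ (X - mu) / n
lam, Q = np.linalg.eigh(C)
Q, lam = Q[:, ::-1], lam[::-1]          # descending eigenvalue order

Qt = Q[:, :l]                           # truncated eigenmatrix, Eq. 40
Y_hat = (X - mu) @ Qt                   # Eq. 41: l-dimensional codes
X_hat = Y_hat @ Qt.T + mu               # Eq. 43: back to input space, != X

E = np.sum((X - X_hat) ** 2)            # Eq. 44: sum-squared error
# E equals n times the discarded variance sum_{j>l} lambda_j.
assert np.isclose(E, n * lam[l:].sum())
```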
40 Dimensionality Reduction

When all m dimensions are kept, the total variance is

    Σ_{j=1}^m σ_j^2 = Σ_{j=1}^m λ_j.                                  (45)

When l dimensions are used, the ratio R̄ of unaccounted variance is

    R̄(l) = Σ_{j=l+1}^m σ_j^2 / Σ_{j=1}^m σ_j^2
         = Σ_{j=l+1}^m λ_j / Σ_{j=1}^m λ_j.                           (46)
41 Dimensionality Reduction

[Plot: R̄ vs. the number of dimensions used, falling from 1 at small l ("not enough") towards 0 near m ("enough").]

How to choose an appropriate l? Choose l such that using more than l dimensions doesn't reduce R̄ significantly:

    R̄(l) − R̄(l+1) < ε.                                              (47)
42 Dimensionality Reduction

Alternatively, compute the ratio R of accounted variance:

    R(l) = 1 − R̄(l) = Σ_{j=1}^l σ_j^2 / Σ_{j=1}^m σ_j^2
         = Σ_{j=1}^l λ_j / Σ_{j=1}^m λ_j.                             (48)

l is large enough if

    R(l+1) − R(l) < ε.

[Plot: R vs. the number of dimensions used, rising from 0 ("not enough") towards 1 near m ("enough").]
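The accounted-variance criterion is a one-liner with a cumulative sum. A minimal sketch (my illustration; the eigenvalues and the threshold ε = 0.01 are made up):

```python
import numpy as np

def choose_l(lam, eps=0.01):
    """Smallest l at which the accounted-variance ratio R(l), Eq. 48,
    stops improving by more than eps. lam: eigenvalues, descending."""
    R = np.cumsum(lam) / np.sum(lam)    # R(1), ..., R(m)
    for l in range(1, len(lam)):
        if R[l] - R[l - 1] < eps:       # R(l+1) - R(l) < eps
            return l
    return len(lam)

lam = np.array([5.0, 3.0, 0.05, 0.02, 0.01])
print(choose_l(lam))                    # prints 2: two components suffice here
```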
43 Summary

- Eigendecomposition of the covariance matrix = PCA.
- In practice, PCA is computed using SVD.
- PCA maximizes the variances along the eigenvectors.
- Eigenvalues = variances along the eigenvectors.
- The best-fitting plane computed by PCA minimizes the distance to the points.
- PCA can be used for dimensionality reduction.
44 Probing Questions

- When applying PCA to plane fitting, how do you check whether the points really lie close to the plane?
- If you apply PCA to a set of points on a curved surface, where do you expect the eigenvectors to point?
- In dimensionality reduction, the m-D vector y is reduced to an l-D vector ŷ. Since l < m, vector subtraction is undefined. But, subtraction of Eqs. 39 and 41 gives

      y − ŷ = Σ_{j=l+1}^m ((x − µ)^T q_j) q_j.

  How is this possible? Is there a contradiction?
- Show that when n ≪ m, there are at most n − 1 eigenvectors.
45 Homework I

1. Describe the essence of PCA in one sentence.
2. Show that the covariance matrix C of a set of data points x_i, i = 1, ..., n, is symmetric about its diagonal.
3. Show that the eigenvalue λ of the covariance matrix C equals the variance V as defined in Eq. 2.
4. Show that the mean µ_y over all y_i in the eigenspace is 0.
5. Show that the variance σ_j^2 along eigenvector q_j is λ_j.
6. Show that A A^T = nC.
7. Derive the difference Q y − Q̂ ŷ in dimensionality reduction.
46 Homework II

8. Suppose n ≪ m, and the truncated eigenmatrix Q̂ contains all the eigenvectors with non-zero eigenvalues, Q̂ = [q_1 ... q_{n−1}]. Now, map an input vector x to y and ŷ, respectively, by the complete Q (Eq. 39) and the truncated Q̂ (Eq. 41). What is the error y − ŷ? What is the result of mapping ŷ back to the input space by Q̂ (Eq. 43)?
9. Q3 of AY2015/16 Final Evaluation.
47 References

1. G. Strang, Introduction to Linear Algebra, 4th ed., Wellesley-Cambridge. www-math.mit.edu/~gs
2. C. Shalizi, Advanced Data Analysis from an Elementary Point of View.
More informationCS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)
CS4495/6495 Introduction to Computer Vision 8B-L2 Principle Component Analysis (and its use in Computer Vision) Wavelength 2 Wavelength 2 Principal Components Principal components are all about the directions
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationNotes on Linear Algebra
1 Notes on Linear Algebra Jean Walrand August 2005 I INTRODUCTION Linear Algebra is the theory of linear transformations Applications abound in estimation control and Markov chains You should be familiar
More informationDATA MINING LECTURE 8. Dimensionality Reduction PCA -- SVD
DATA MINING LECTURE 8 Dimensionality Reduction PCA -- SVD The curse of dimensionality Real data usually have thousands, or millions of dimensions E.g., web documents, where the dimensionality is the vocabulary
More informationBindel, Fall 2009 Matrix Computations (CS 6210) Week 8: Friday, Oct 17
Logistics Week 8: Friday, Oct 17 1. HW 3 errata: in Problem 1, I meant to say p i < i, not that p i is strictly ascending my apologies. You would want p i > i if you were simply forming the matrices and
More informationAPPLICATIONS The eigenvalues are λ = 5, 5. An orthonormal basis of eigenvectors consists of
CHAPTER III APPLICATIONS The eigenvalues are λ =, An orthonormal basis of eigenvectors consists of, The eigenvalues are λ =, A basis of eigenvectors consists of, 4 which are not perpendicular However,
More informationNotes on Eigenvalues, Singular Values and QR
Notes on Eigenvalues, Singular Values and QR Michael Overton, Numerical Computing, Spring 2017 March 30, 2017 1 Eigenvalues Everyone who has studied linear algebra knows the definition: given a square
More informationLinear Algebra - Part II
Linear Algebra - Part II Projection, Eigendecomposition, SVD (Adapted from Sargur Srihari s slides) Brief Review from Part 1 Symmetric Matrix: A = A T Orthogonal Matrix: A T A = AA T = I and A 1 = A T
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationFSAN/ELEG815: Statistical Learning
: Statistical Learning Gonzalo R. Arce Department of Electrical and Computer Engineering University of Delaware 3. Eigen Analysis, SVD and PCA Outline of the Course 1. Review of Probability 2. Stationary
More informationI. Multiple Choice Questions (Answer any eight)
Name of the student : Roll No : CS65: Linear Algebra and Random Processes Exam - Course Instructor : Prashanth L.A. Date : Sep-24, 27 Duration : 5 minutes INSTRUCTIONS: The test will be evaluated ONLY
More informationIV. Matrix Approximation using Least-Squares
IV. Matrix Approximation using Least-Squares The SVD and Matrix Approximation We begin with the following fundamental question. Let A be an M N matrix with rank R. What is the closest matrix to A that
More informationLinear Algebra Review
Linear Algebra Review CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Linear Algebra Review 1 / 16 Midterm Exam Tuesday Feb
More informationLinear Algebra & Geometry why is linear algebra useful in computer vision?
Linear Algebra & Geometry why is linear algebra useful in computer vision? References: -Any book on linear algebra! -[HZ] chapters 2, 4 Some of the slides in this lecture are courtesy to Prof. Octavia
More informationNORMS ON SPACE OF MATRICES
NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system
More informationLinear Models Review
Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationBasic Calculus Review
Basic Calculus Review Lorenzo Rosasco ISML Mod. 2 - Machine Learning Vector Spaces Functionals and Operators (Matrices) Vector Space A vector space is a set V with binary operations +: V V V and : R V
More informationEE731 Lecture Notes: Matrix Computations for Signal Processing
EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten
More informationChapter 3 Transformations
Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases
More informationMath 520 Exam 2 Topic Outline Sections 1 3 (Xiao/Dumas/Liaw) Spring 2008
Math 520 Exam 2 Topic Outline Sections 1 3 (Xiao/Dumas/Liaw) Spring 2008 Exam 2 will be held on Tuesday, April 8, 7-8pm in 117 MacMillan What will be covered The exam will cover material from the lectures
More informationTutorial on Principal Component Analysis
Tutorial on Principal Component Analysis Copyright c 1997, 2003 Javier R. Movellan. This is an open source document. Permission is granted to copy, distribute and/or modify this document under the terms
More informationStructure in Data. A major objective in data analysis is to identify interesting features or structure in the data.
Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two
More informationLinear Algebra Methods for Data Mining
Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 Linear Discriminant Analysis Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki Principal
More informationAnnouncements (repeat) Principal Components Analysis
4/7/7 Announcements repeat Principal Components Analysis CS 5 Lecture #9 April 4 th, 7 PA4 is due Monday, April 7 th Test # will be Wednesday, April 9 th Test #3 is Monday, May 8 th at 8AM Just hour long
More informationMATH 829: Introduction to Data Mining and Analysis Principal component analysis
1/11 MATH 829: Introduction to Data Mining and Analysis Principal component analysis Dominique Guillot Departments of Mathematical Sciences University of Delaware April 4, 2016 Motivation 2/11 High-dimensional
More informationFall TMA4145 Linear Methods. Exercise set Given the matrix 1 2
Norwegian University of Science and Technology Department of Mathematical Sciences TMA445 Linear Methods Fall 07 Exercise set Please justify your answers! The most important part is how you arrive at an
More information5 Linear Algebra and Inverse Problem
5 Linear Algebra and Inverse Problem 5.1 Introduction Direct problem ( Forward problem) is to find field quantities satisfying Governing equations, Boundary conditions, Initial conditions. The direct problem
More information1. Background: The SVD and the best basis (questions selected from Ch. 6- Can you fill in the exercises?)
Math 35 Exam Review SOLUTIONS Overview In this third of the course we focused on linear learning algorithms to model data. summarize: To. Background: The SVD and the best basis (questions selected from
More informationSingular Value Decomposition and Principal Component Analysis (PCA) I
Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression
More informationAppendix A: Matrices
Appendix A: Matrices A matrix is a rectangular array of numbers Such arrays have rows and columns The numbers of rows and columns are referred to as the dimensions of a matrix A matrix with, say, 5 rows
More informationBasic Concepts in Matrix Algebra
Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1
More information