Notes on Polar Decomposition and SVD
by Tamara Fusselman, 20 Nov 2012

Assume for these notes that $V, W$ are finite-dimensional vector spaces over $\mathbb{C}$ with inner products.

Observation: Given any $T \in \mathcal{L}(V)$, $(T^*T)^* = T^*T$, so $T^*T$ is self-adjoint, and $\langle T^*Tv, v \rangle = \langle Tv, Tv \rangle \ge 0$ for all $v \in V$, so $T^*T$ satisfies the definition of a positive operator. Because $T^*T$ is a positive operator, by prop 7.28 (p. 146) there exists a unique positive operator $R \in \mathcal{L}(V)$, written $R = \sqrt{T^*T}$, such that $R^2 = T^*T$. (Notice $R^* = R$, because positive operators are also self-adjoint.)

Claim ($\ast$): For every $v \in V$, $\|Tv\| = \|Rv\|$.

Proof: For every $v \in V$,
$$\|Tv\|^2 = \langle Tv, Tv \rangle = \langle T^*Tv, v \rangle = \langle R^2 v, v \rangle = \langle Rv, R^*v \rangle = \langle Rv, Rv \rangle \;(\text{because } R^* = R) = \|Rv\|^2.$$
Now take the square root of both sides.

Since $\|Tv\| = \|Rv\|$ for all $v \in V$, an explicit isometry between $\operatorname{range}(T)$ and $\operatorname{range}(R)$ would be nice. Define $S_1 : \operatorname{range}(R) \to \operatorname{range}(T)$ by $S_1(Rv) = Tv$ for all $v \in V$. As $S_1$ takes each $Rv \in \operatorname{range}(R)$ to $Tv$, and $\|Tv\| = \|Rv\|$ (by $\ast$), $S_1$ satisfies the norm-preserving property of an isometry. (If you are not yet convinced that $S_1$ satisfies the norm-preserving property, check it: given any $w \in \operatorname{range}(R)$, there is $v \in V$ such that $Rv = w$, and then $\|S_1 w\| = \|S_1(Rv)\| = \|Tv\|$ by the definition of $S_1$, which equals $\|Rv\|$ by the claim above, which equals $\|w\|$.)

But is $S_1$ even a map? (What if there were $v_1, v_2 \in V$ such that $Rv_1 = Rv_2$ but $Tv_1 \ne Tv_2$?) Suppose that $v_1, v_2 \in V$ satisfy $Rv_1 = Rv_2$. Then
$$Rv_1 - Rv_2 = 0 \implies R(v_1 - v_2) = 0 \;(\text{since } R \text{ is linear}) \implies \|R(v_1 - v_2)\| = 0 \implies \|T(v_1 - v_2)\| = 0 \;(\text{by the claim, taking } v = v_1 - v_2) \implies T(v_1 - v_2) = 0 \implies Tv_1 - Tv_2 = 0 \implies Tv_1 = Tv_2 \;(\text{because } T \text{ is linear}).$$
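The claim is easy to check numerically. Below is a small sketch of my own (not part of the original notes), in Python with numpy: for a random complex matrix A standing in for T, it builds the positive square root R of A*A from an eigendecomposition and confirms that the norms of Av and Rv agree for a random vector v. All names here are purely illustrative.

```python
# Numerical sketch (illustration only): R = sqrt(A* A) satisfies ||Av|| = ||Rv||.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# A* A is self-adjoint and positive, so eigh returns real, nonnegative eigenvalues
# and an orthonormal eigenbasis; the positive square root rescales that basis.
evals, evecs = np.linalg.eigh(A.conj().T @ A)
evals = np.clip(evals, 0.0, None)                 # guard against tiny negative round-off
R = evecs @ np.diag(np.sqrt(evals)) @ evecs.conj().T

v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.linalg.norm(A @ v), np.linalg.norm(R @ v))   # the two norms agree
print(np.allclose(R @ R, A.conj().T @ A))             # and R^2 = A* A
```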
Thus, $S_1$ is a map.

Note that $S_1$ is bijective: supposing $v_1, v_2 \in V$ satisfy $Tv_1 = Tv_2$ and running the above calculation backwards shows that $Rv_1 = Rv_2$, so $S_1$ is injective. Every $Tv \in \operatorname{range}(T)$ has $Rv \in \operatorname{range}(R)$ with $S_1(Rv) = Tv$, so $S_1$ is surjective.

Is $S_1$ linear? Take $v_1, v_2 \in V$ and $a \in \mathbb{C}$. Then
$$S_1(aRv_1 + Rv_2) = S_1\big(R(av_1 + v_2)\big) \;(\text{since } R \text{ is linear}) = T(av_1 + v_2) \;(\text{by the definition of } S_1) = aTv_1 + Tv_2 \;(\text{because } T \text{ is linear}) = aS_1(Rv_1) + S_1(Rv_2) \;(\text{by the definition of } S_1).$$
Thus, $S_1$ is linear. As $S_1$ is a linear map constructed to have the norm-preserving property, $S_1$ is an isometry between $\operatorname{range}(R)$ and $\operatorname{range}(T)$.

Can $S_1$ be extended to an isometry on all of $V$? Because $S_1$ is bijective (or, as Axler puts it, an injective linear map),
$$\dim\big(\operatorname{range}(R)\big) = \dim\big(\operatorname{range}(T)\big) \implies \dim\big((\operatorname{range}(R))^{\perp}\big) = \dim\big((\operatorname{range}(T))^{\perp}\big).$$
Letting $m = \dim\big((\operatorname{range}(R))^{\perp}\big)$, pick an orthonormal basis $B = (u_1, \ldots, u_m)$ for $(\operatorname{range}(R))^{\perp}$ and an orthonormal basis $B' = (u_1', \ldots, u_m')$ for $(\operatorname{range}(T))^{\perp}$. Writing each $v \in (\operatorname{range}(R))^{\perp}$ as $v = \sum_{i=1}^{m} a_i u_i$ with $a_i \in \mathbb{C}$, define $S_2 : (\operatorname{range}(R))^{\perp} \to (\operatorname{range}(T))^{\perp}$ by
$$S_2\Big(\sum_{i=1}^{m} a_i u_i\Big) = \sum_{i=1}^{m} a_i u_i'.$$
It is easy to check that $S_2$ is linear. Now
$$\|S_2 v\| = \Big\|\sum_{i=1}^{m} a_i u_i'\Big\| = \Big(\sum_{i=1}^{m} |a_i|^2\Big)^{1/2} \;(\text{since } B' \text{ is an orthonormal basis}) = \Big\|\sum_{i=1}^{m} a_i u_i\Big\| \;(\text{since } B \text{ is an orthonormal basis}) = \|v\|,$$
so $S_2$ is an isometry.

$V = \operatorname{range}(R) \oplus (\operatorname{range}(R))^{\perp}$, so for each $v \in V$ there are unique $u \in \operatorname{range}(R)$ and $w \in (\operatorname{range}(R))^{\perp}$ such that $v = u + w$. Now define $S \in \mathcal{L}(V)$ by $Sv = S_1 u + S_2 w$. $S$ is linear, since it is built from the linear maps $S_1$ and $S_2$ composed with the (linear) orthogonal projections onto $\operatorname{range}(R)$ and $(\operatorname{range}(R))^{\perp}$.
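Here is a tiny sketch of my own (Python with numpy, not from the notes) of the $S_2$ idea: once you have orthonormal bases of two subspaces of the same dimension, carrying coefficients from one basis to the other preserves norms. The subspaces, the bases B and Bp, and the name S2 are all made up for illustration.

```python
# Sketch of the S_2 construction: map sum(a_i u_i) in span(B) to sum(a_i u_i') in span(Bp).
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2

# Orthonormal bases (as matrix columns) of two random m-dimensional subspaces of C^n,
# obtained by orthonormalizing random vectors with QR.
B,  _ = np.linalg.qr(rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m)))
Bp, _ = np.linalg.qr(rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m)))

def S2(v):
    """Send v = sum_i a_i u_i to sum_i a_i u_i', reading off a_i = <v, u_i>."""
    a = B.conj().T @ v      # coefficients with respect to the orthonormal basis B
    return Bp @ a           # same coefficients over the basis B'

v = B @ (rng.standard_normal(m) + 1j * rng.standard_normal(m))   # a vector in span(B)
print(np.linalg.norm(v), np.linalg.norm(S2(v)))                  # norms agree
```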
$$\|Sv\|^2 = \|S_1 u + S_2 w\|^2 \quad \text{by the definition of } S.$$
Now, since $S_1 u \in \operatorname{range}(T)$ and $S_2 w \in (\operatorname{range}(T))^{\perp}$, this
$$= \|S_1 u\|^2 + \|S_2 w\|^2 \quad \text{by the Pythagorean theorem (p. 102)}$$
$$= \|u\|^2 + \|w\|^2 \quad \text{since } S_1 \text{ and } S_2 \text{ are isometries}$$
$$= \|u + w\|^2 \quad \text{since } u \in \operatorname{range}(R) \text{ and } w \in (\operatorname{range}(R))^{\perp}$$
$$= \|v\|^2.$$
Taking the square root of both sides shows that $S$ is an isometry.

Now, for every $v \in V$, $Rv \in \operatorname{range}(R)$, and so $S(Rv) = S_1(Rv) = Tv$. Recalling that $R = \sqrt{T^*T}$, we have $Tv = S\sqrt{T^*T}\,v$ for all $v \in V$, and so $T = S\sqrt{T^*T}$.

This construction of $S$ (which works for every $T \in \mathcal{L}(V)$) gives the Polar Decomposition Theorem (Theorem 7.41, p. 153): for every $T \in \mathcal{L}(V)$ there exists an isometry $S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$.

Singular Value Decomposition Theorem (Theorem 7.46, p. 156): For every $T \in \mathcal{L}(V)$ there exist nonnegative scalars $\lambda_1, \ldots, \lambda_n$ and orthonormal bases $B = (u_1, \ldots, u_n)$, $B' = (u_1', \ldots, u_n')$ such that
$$Tv = \lambda_1 \langle v, u_1 \rangle u_1' + \cdots + \lambda_n \langle v, u_n \rangle u_n' \quad \text{for all } v \in V.$$

Proof: We already know that $T$ has polar decomposition $T = SR$, where $R = \sqrt{T^*T}$ and $S \in \mathcal{L}(V)$ is an isometry. Since $R$ is self-adjoint, by the Spectral Theorem (theorem 7.9, p. 133), $R$ has an orthonormal eigenbasis $B_1 = (u_1, \ldots, u_n)$. (An eigenbasis for a linear operator on $V$ is a basis for $V$ whose elements are all eigenvectors of the linear operator.) For each $k$, let $\lambda_k$ be the eigenvalue associated with $u_k$ (so that $(\lambda_k, u_k)$ is an eigenpair). Since $B_1$ is an orthonormal basis for $V$, for every $v \in V$,
$$v = \langle v, u_1 \rangle u_1 + \cdots + \langle v, u_n \rangle u_n,$$
$$Rv = \lambda_1 \langle v, u_1 \rangle u_1 + \cdots + \lambda_n \langle v, u_n \rangle u_n \quad (\text{applying } R \text{ to both sides}),$$
$$Tv = \lambda_1 \langle v, u_1 \rangle Su_1 + \cdots + \lambda_n \langle v, u_n \rangle Su_n \quad (\text{applying } S \text{ to both sides}).$$
Because $S$ is an isometry, $B_2 = (u_1' = Su_1, \ldots, u_n' = Su_n)$ is another orthonormal basis for $V$ (see 7.36, p. 148). Hence
$$Tv = \lambda_1 \langle v, u_1 \rangle u_1' + \cdots + \lambda_n \langle v, u_n \rangle u_n'.$$

Definition (Singular Values of $T$): The singular values of $T$ are the scalars $\lambda_1, \ldots, \lambda_n$ found above. They are the eigenvalues of $\sqrt{T^*T}$, counted with repetition.

Matrix of $T$: Recall some notation. If $B = (v_1, \ldots, v_n)$ and $B' = (v_1', \ldots, v_n')$ are bases for $V$ and $L \in \mathcal{L}(V)$, then $_{B'}[L]_{B} = \mathcal{M}(L, B, B')$ is the matrix whose $k$th column is $[a_{1k}, \ldots, a_{nk}]^{T}$, where $Lv_k = a_{1k} v_1' + \cdots + a_{nk} v_n'$. Now,
$$Tu_k = \lambda_1 \langle u_k, u_1 \rangle u_1' + \cdots + \lambda_n \langle u_k, u_n \rangle u_n' = \lambda_k u_k' \quad \text{for each } k,$$
so the $k$th column of $_{B_2}[T]_{B_1}$ is $[0, \ldots, \lambda_k, \ldots, 0]^{T}$, nonzero only in the $k$th place, making $_{B_2}[T]_{B_1} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.
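As a computational aside (a sketch of my own, not part of the notes): in floating-point practice the polar factors are usually obtained from a numerically computed SVD rather than from the construction above. With numpy's svd returning A = W diag(s) Vh (the names W, s, Vh are numpy's outputs, not the notes' notation), S = W Vh is the isometry and Vh* diag(s) Vh is the positive factor.

```python
# Sketch: polar decomposition A = S R recovered from an SVD of A.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

W, s, Vh = np.linalg.svd(A)                 # A = W @ diag(s) @ Vh
S = W @ Vh                                  # unitary (isometry) factor
R = Vh.conj().T @ np.diag(s) @ Vh           # positive factor, equal to sqrt(A* A)

print(np.allclose(A, S @ R))                        # A = S R
print(np.allclose(S.conj().T @ S, np.eye(4)))       # S is an isometry
print(np.allclose(R, R.conj().T))                   # R is self-adjoint
```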
The convention is to call this matrix $\Sigma$ (for "singular values") or $D$ (for "diagonal"). We will let $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.

Assuming the standard basis $E = (e_1, \ldots, e_n)$ is orthonormal with respect to our given inner product for $V$, we write $_{E}[T]_{E}$ as follows:
$$_{E}[T]_{E} = {}_{E}[I]_{B_2} \; {}_{B_2}[T]_{B_1} \; {}_{B_1}[I]_{E},$$
where $_{B_1}[I]_{E} = \big({}_{E}[I]_{B_1}\big)^{-1}$ and both $_{E}[I]_{B_1}$ and $_{E}[I]_{B_2}$ are change-of-basis matrices from $B_1$ and $B_2$, respectively, to the standard basis $E$. Let $U_1 = {}_{E}[I]_{B_1}$ and $U_2 = {}_{E}[I]_{B_2}$. As both $U_1$ and $U_2$ take orthonormal bases to orthonormal bases, both are isometries, hence unitary operators (see theorem 7.36, p. 148). Recalling that $_{B_2}[T]_{B_1} = D$,
$$_{E}[T]_{E} = U_2 D U_1^{-1} = U_2 D U_1^{*}.$$

Summarizing the SVD geometrically

Isometries: You can picture an isometry as rotating and reflecting (reversing the sign of) various basis elements in an orthonormal basis of a space. An isometry might change a shape to its mirror image, but aside from that, it doesn't deform the shape in any way; it just changes its orientation. Usually, when we see something upside-down, or reflected, or tilted, we immediately recognize what it is. In that sense, an isometry doesn't change what we're looking at.

Positive Operators: Since positive operators are self-adjoint, they are diagonal with respect to some orthonormal basis. Additionally, their eigenvalues (the diagonal values) are all nonnegative. So we can think of a positive operator as rescaling the elements of an orthonormal basis without reversing any of their directions.

Polar Decomposition: The Polar Decomposition Theorem shows us that every $T \in \mathcal{L}(V)$ is, up to an isometry, a positive operator: $T = SR$ with $R = \sqrt{T^*T}$.

Spectral Theorem: The spectral theorem tells us that $R$ has an orthonormal eigenbasis, that is, an orthonormal basis $B = (u_1, \ldots, u_n)$ such that $_{B}[R]_{B} = D$ is diagonal. Moreover, the nonzero entries of $D$ are positive, since $R$ is a positive operator.

Altogether: Any linear operator $T$ on a complex inner-product space is, up to an isometry, an operator that simply rescales some orthonormal basis of the space. I find that thinking of the SVD geometrically makes it easier to remember how to construct the matrix form of the SVD.
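To make the $_{E}[T]_{E} = U_2 D U_1^{*}$ recipe concrete, here is a short numerical sketch of my own (not from the notes): it builds $U_1$ and $D$ from an eigendecomposition of $R$, sets $U_2 = S U_1$, and checks the factorization. The polar factors are computed from numpy's SVD as in the previous sketch.

```python
# Sketch: assemble the matrix form U2 @ D @ U1* from the polar factors of A.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Polar decomposition A = S R (computed via numpy's SVD, as in the earlier sketch).
W, s, Vh = np.linalg.svd(A)
S = W @ Vh
R = Vh.conj().T @ np.diag(s) @ Vh

# Spectral theorem applied to R: orthonormal eigenbasis U1 and diagonal D.
d, U1 = np.linalg.eigh(R)
D = np.diag(d)
U2 = S @ U1                                  # push the eigenbasis through the isometry

print(np.allclose(A, U2 @ D @ U1.conj().T))            # A = U2 D U1*
print(np.allclose(U1.conj().T @ U1, np.eye(4)))        # U1 is unitary, so U1^{-1} = U1*
```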
Matrix Form of the SVD, Revisited: (Assuming that our standard basis $E$ is orthonormal with respect to the inner product on $V$.) The polar decomposition gives us $T = SR$, where $R = \sqrt{T^*T}$. The Spectral Theorem tells us that $_{B}[R]_{B} = D$, a diagonal matrix, where $B = (u_1, \ldots, u_n)$ is an orthonormal basis of $V$ consisting of eigenvectors of $R$. The matrix $U_1 = [u_1, \ldots, u_n]$ is $_{E}[I]_{B}$, the change-of-basis matrix from $B$ to $E$. As $B$ and $E$ are orthonormal bases, $U_1$ is an isometry and so $U_1^{-1} = U_1^{*}$. Then
$$_{E}[R]_{E} = {}_{E}[I]_{B} \; {}_{B}[R]_{B} \; {}_{B}[I]_{E} = U_1 D U_1^{*}.$$
By the polar decomposition,
$$_{E}[T]_{E} = {}_{E}[S]_{E} \; {}_{E}[R]_{E} = {}_{E}[S]_{E} \, U_1 D U_1^{*}.$$
Let $U_2 = {}_{E}[S]_{E} \, U_1$. Since the composition of isometries is another isometry, $U_2$ is an isometry. So $_{E}[T]_{E} = U_2 D U_1^{*}$.

Generalizing Axler's proof of the SVD to all linear maps

Let $T \in \mathcal{L}(V, W)$, with $\dim(V) = n$ and $\dim(W) = m$. Since $T^* : W \to V$, the composition $T^*T : V \to V$ is a positive operator on $V$, so there exists $R \in \mathcal{L}(V)$ with $R = \sqrt{T^*T}$. As in the case where $T$ is a linear operator, it is possible to show that $\|Tv\| = \|Rv\|$ for all $v \in V$ (even though $Tv \in W$ and $Rv \in V$). Likewise, it is possible to show that $S_1 : \operatorname{range}(R) \to \operatorname{range}(T)$ given by $S_1(Rv) = Tv$ is an isometry between $\operatorname{range}(R)$ and $\operatorname{range}(T)$.

Again, the Spectral Theorem shows that there exists an orthonormal eigenbasis $B$ of $V$ such that $\Sigma = {}_{B}[R]_{B}$ is diagonal. Let $B = (u_1, \ldots, u_r, u_{r+1}, \ldots, u_n)$, where $(\lambda_1, u_1), \ldots, (\lambda_r, u_r)$ are the eigenpairs for the nonzero diagonal entries of $\Sigma$, arranged so that $\lambda_1 \ge \cdots \ge \lambda_r$. Then $u_{r+1}, \ldots, u_n$ are all eigenvectors of $R$ with eigenvalue 0. Notice that $\operatorname{range}(R) = \operatorname{span}(u_1, \ldots, u_r)$ and $\operatorname{null}(R) = \operatorname{span}(u_{r+1}, \ldots, u_n)$.

Since $S_1$ is an isometry from $\operatorname{range}(R)$ to $\operatorname{range}(T)$, $(u_1', \ldots, u_r') = (S_1 u_1, \ldots, S_1 u_r)$ is an orthonormal basis for $\operatorname{range}(T)$, which can be extended to an orthonormal basis $B' = (u_1', \ldots, u_r', u_{r+1}', \ldots, u_m')$ for all of $W$. Now express $T$ with respect to $B$ and $B'$:
$$Tu_k = S_1(Ru_k) = S_1(\lambda_k u_k) = \lambda_k S_1 u_k = \lambda_k u_k' \quad \text{for } 1 \le k \le r.$$
For $r < k \le n$ we have $Ru_k = 0$, so $\|Tu_k\| = \|Ru_k\| = 0$ and hence $Tu_k = 0$. (In particular, $\dim(\operatorname{range}(T)) = \dim(\operatorname{range}(R)) = r$, because $S_1$ is an isometry between the two ranges.) This makes $_{B'}[T]_{B}$ the $m \times n$ matrix whose top-left $r \times r$ block is $\operatorname{diag}(\lambda_1, \ldots, \lambda_r)$ and whose remaining entries are all 0:
$$_{B'}[T]_{B} = \begin{pmatrix} \operatorname{diag}(\lambda_1, \ldots, \lambda_r) & 0 \\ 0 & 0 \end{pmatrix}.$$
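A quick numerical sketch of my own for this rectangular case (sizes and names chosen only for illustration): numpy's svd of an m x n matrix returns orthonormal bases of W and V, and the r = rank(A) nonzero singular values sit in the top-left block of the m x n matrix D.

```python
# Sketch: SVD of a rectangular matrix, with the singular values in an m x n block-diagonal D.
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 3
A = rng.standard_normal((m, n))

U2, s, U1h = np.linalg.svd(A)          # U2 is m x m, U1h is n x n, s has min(m, n) entries
D = np.zeros((m, n))
D[:len(s), :len(s)] = np.diag(s)       # top-left block holds the singular values

print(np.allclose(A, U2 @ D @ U1h))                   # A = U2 D U1*
print(np.linalg.matrix_rank(A), np.sum(s > 1e-12))    # r nonzero singular values
```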
As long as the standard bases $E_V$ and $E_W$ are orthonormal with respect to the inner products given for $V$ and $W$, the change-of-basis matrices $U = {}_{E_V}[I]_{B}$ and $U' = {}_{E_W}[I]_{B'}$ are unitary, and in particular $U^{-1} = {}_{B}[I]_{E_V} = U^{*}$. Letting $D = {}_{B'}[T]_{B}$,
$$_{E_W}[T]_{E_V} = {}_{E_W}[I]_{B'} \; {}_{B'}[T]_{B} \; {}_{B}[I]_{E_V} = U' D U^{*}.$$

Statement of the SVD Theorem for $T \in \mathcal{L}(V, W)$, Axler style: Let $T \in \mathcal{L}(V, W)$, where $\dim(V) = n$ and $\dim(W) = m$. Then $r = \dim(\operatorname{range}(T)) \le \min(n, m)$, and there exist orthonormal bases $B_1 = (u_1, \ldots, u_n)$ for $V$ and $B_2 = (u_1', \ldots, u_m')$ for $W$ such that for all $v \in V$,
$$Tv = \lambda_1 \langle v, u_1 \rangle u_1' + \cdots + \lambda_r \langle v, u_r \rangle u_r',$$
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r$ are the nonzero singular values of $T$.

Is the SVD useful? Yes, very, and not only to mathematicians. As you might imagine, the SVD has some important geometric applications. Chemists use it to compare molecules by rotating them around until as many points match up as possible. It is also used to reconstruct 3-D distances between objects in a photograph. The SVD is used to build separable models in biology and to model entanglement in quantum mechanics. Perhaps the most widespread use of the SVD, though, is in principal component approximation, especially principal component analysis.

Principal Component Approximation

Consider the components $\lambda_i \langle v, u_i \rangle u_i'$ in the expression $Tv = \lambda_1 \langle v, u_1 \rangle u_1' + \cdots + \lambda_r \langle v, u_r \rangle u_r'$. Because $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r$, the earlier components contribute more towards $Tv$ than the later components. It seems we could get a pretty good approximation of $T$ by throwing away the components whose singular values (the $\lambda_i$) we decide are too small. We can call approximating $T$ by the first $k$ components of the SVD that we consider significant Principal Component Approximation. More formally, define $T_k$ by
$$T_k v = \lambda_1 \langle v, u_1 \rangle u_1' + \cdots + \lambda_k \langle v, u_k \rangle u_k' \quad \text{for all } v \in V,$$
where $k < r$. (You could choose $k \ge r$, but there is not much point; you would just get $T$ back.) Notice that $T_k v$ is the projection of $Tv$ onto $W_k = \operatorname{span}(u_1', \ldots, u_k')$, so $T_k v$ is the closest approximation to $Tv$ in $W_k$. (It can be shown that $T_k$ is the closest rank-$k$ approximation to $T$ according to the Frobenius norm, $\|T\| = \sqrt{\operatorname{trace}(T^* T)}$, but I won't do that here.) The vectors $u_1', \ldots, u_r'$ are called the normalized principal component vectors of $T$. When $T$ is represented by the $m \times n$ matrix $M$ with SVD $M = U_2 D U_1^{*}$, then $M_k = U_2 D_k U_1^{*}$, where $D_k$ is $D$ with all but the first $k$ diagonal entries $\lambda_1, \ldots, \lambda_k$ replaced by 0, is the matrix form of $T_k$.

Uses of Principal Component Approximation

Principal component approximation can be used for image compression, to build pattern-recognizing software (look up "eigenfaces"), or to improve matrix computations in the face of a computer's limited precision (it is often better to approximate a matrix using only the singular values that are well away from a machine's precision than to use the original matrix).
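Below is a small sketch of my own (not part of the notes) of this rank-k principal component approximation: it truncates the SVD of a random matrix and shows the Frobenius-norm error shrinking as k grows.

```python
# Sketch: rank-k principal component approximation M_k via a truncated SVD.
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((8, 6))

U2, s, U1h = np.linalg.svd(M, full_matrices=False)

def rank_k_approx(k):
    """Keep only the k largest singular values (and their singular vectors)."""
    return U2[:, :k] @ np.diag(s[:k]) @ U1h[:k, :]

for k in (1, 2, 4, 6):
    err = np.linalg.norm(M - rank_k_approx(k), 'fro')
    print(k, round(err, 4))        # the error shrinks as k grows and is 0 at full rank
```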
Principal component approximation is most famously used in principal component analysis (PCA for short), a widely used statistical tool of interest to many people besides statisticians (some of the examples above use PCA).

Principal Component Analysis

Suppose an experiment measuring $n$ random variables is repeated $m$ times and the results are recorded in a table with $n$ columns, each column containing the $m$ data points gathered for a given random variable. The data is mean-centered by subtracting the column average of a given column from every entry in that column, so that what is left is how much each measurement of a random variable deviates from that random variable's sample mean. Let $M$ be a mean-centered $m \times n$ matrix. The sample correlation between any two random variables (column vectors) recorded in $M$ is measured by the cosine of the angle between them, according to the usual dot-product definition of the cosine between two vectors. Random variables are considered uncorrelated if they (or rather, the column vectors in $M$ representing them) are mutually orthogonal. The variance among all random variables recorded in $M$ is given by the covariance matrix
$$C = \frac{1}{m-1} M^{*} M.$$
(Depending on statistical considerations, the covariance matrix may take slightly different forms.) The goal of principal component analysis is to find uncorrelated factors (mutually orthogonal random variables composed of linear combinations of the random variables recorded in the experiment) which explain most of the variance in the experiment. If the SVD of the mean-centered data matrix $M$ is $M = U_2 D U_1^{*}$, then
$$C = \frac{1}{m-1} \big(U_2 D U_1^{*}\big)^{*} \big(U_2 D U_1^{*}\big) = \frac{1}{m-1} U_1 D^{*} U_2^{*} U_2 D U_1^{*} = \frac{1}{m-1} U_1 D^{*} D U_1^{*} = \frac{1}{m-1} U_1 D^2 U_1^{*},$$
where $D^2 = D^* D$ is the $n \times n$ diagonal matrix of squared singular values. It seems reasonable to approximate $C$ by
$$C_k = \frac{1}{m-1} U_1 D_k^2 U_1^{*}, \quad k < \operatorname{rank}(M),$$
where $D_k$ is $D$ with all but the $k$ largest singular values $\lambda_1, \ldots, \lambda_k$ set to 0, and this is indeed what is done. $C_k$ ought to look familiar, for $C_k = \frac{1}{m-1} M_k^{*} M_k$, where $M_k$ is the principal component approximation to $M$ of rank $k$, defined earlier. PCA is often used with $k = 2$ or $3$ in order to graph statistical data in the most revealing way possible. There's even a song about this on YouTube (called "It had to be U").
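Here is a toy PCA sketch of my own (Python with numpy; the data, sizes, and variable names are all made up): it mean-centers a data matrix, checks the covariance identity above, and projects each observation onto the top two principal directions.

```python
# Sketch: PCA of a mean-centered data matrix via the SVD.
import numpy as np

rng = np.random.default_rng(6)
m, n, k = 100, 5, 2                      # m observations of n variables; keep k components
X = rng.standard_normal((m, n)) @ rng.standard_normal((n, n))    # columns made correlated

M = X - X.mean(axis=0)                   # mean-center each column
U2, s, U1h = np.linalg.svd(M, full_matrices=False)

C = (M.T @ M) / (m - 1)                  # sample covariance matrix
print(np.allclose(C, U1h.T @ np.diag(s**2 / (m - 1)) @ U1h))   # C = U1 D^2 U1* / (m-1)

scores = M @ U1h[:k].T                   # coordinates of each observation in the top-k PCs
print(scores.shape)                      # (100, 2): ready to plot
```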
Principal Component Analysis and Thorny Devils

Several years ago, two biologists decided to study what makes the thorny devil, a slow-moving Australian lizard rejoicing in the Latin name Moloch horridus, so good at eating ants. They compared the thorny devil to three other lizards: one related to the thorny devil but not an ant-eating specialist (the bearded dragon, Pogona vitticeps); another that specializes in eating ants but is not closely related to the thorny devil (the horned lizard or horny toad, Phrynosoma platyrhinos); and a third that is neither closely related to the thorny devil nor an ant-eating specialist (the fringe-toed lizard, Uma notata). The biologists used high-speed cameras to record profile views of all four species eating ants, then computed 27 kinematic variables from the footage by measuring the movement of seven anatomical landmarks in the x-y plane. By performing Principal Component Analysis on the variables, they were able to find three principal components that explained 71% of the variation in the data. The first component, which accounted for 45% of the variation, corresponded to the habits that set the ant-eating specialists apart from the generalists. The other two components seemed to correspond to which lizards were most closely related to one another.

What's interesting about this experiment is that it offered at least two opportunities to use the SVD. Obviously, the SVD could have been used to do the Principal Component Analysis. But the SVD could also have been used to recover more, and more accurate, initial data: it's hard to get a profile view of animals eating, so most of the footage of the lizards had to be thrown out. The biologists then took their measurements from footage where the lizards were approximately in profile view, apparently without correcting for the fact that the views weren't perfect profiles. They could have used the SVD to un-project foreshortened footage of the lizards so that more of the footage would have been usable, and the measurements from the footage they did use would have been more accurate.
Practice Problems for SVD and Polar Decomposition

1) Find the SVD of the following real $2 \times 2$ matrix:
$$\begin{pmatrix} -2 & -1/2 \\ -2 & 1/2 \end{pmatrix}.$$
The rotation matrix
$$\begin{pmatrix} \cos x & \sin x \\ -\sin x & \cos x \end{pmatrix}$$
may help.

2) Professor Eigenwichser attempts to demonstrate an SVD for a linear operator $T$ on $\mathbb{C}^4$ to his class. Unfortunately, he has found bases $B = (u_1, u_2, u_3, u_4)$ and $B' = (u_1', u_2', u_3', u_4')$ for which his matrix of singular values $_{B'}[T]_{B}$ has diagonal entries -1, 3, -5, and 0. But singular values are supposed to be nonnegative (since they are the eigenvalues of the square root of $T^*T$, a positive operator). What could he do to his bases to fix this?

3) Suppose $T : V \to W$ is a linear map but not an operator. Letting $R$ be the square root of $T^*T$, show that $\|Tv\| = \|Rv\|$ for all $v \in V$, even though $Tv \in W$ and $Rv \in V$.

4) Again, suppose $T$ is a linear map but not an operator. Show that the map $S_1 : \operatorname{range}(R) \to \operatorname{range}(T)$ defined by $S_1(Rv) = Tv$ is an isometry, even though $\operatorname{range}(R) \subseteq V$ and $\operatorname{range}(T) \subseteq W$.
References

- Axler, Linear Algebra Done Right
- Steven J. Leon, Linear Algebra with Applications (this is the only source I found that made statistics notation and PCA easy to follow for someone with only a linear algebra background)
- "Image Filtering via SVD"
- Wikipedia's articles on SVD and PCA (though the math is presented differently from how we're doing it)
- "It had to be U" (the SVD song)
- Scholarpedia's article on Eigenfaces
- "Prey capture kinematics of ant-eating lizards" by Meyers and Herrel
- Eric Pianka's Thorny Devil page

Special thanks to my husband, Jerry Fusselman, for being my TeX hero until TeX conked out on us.
More information