Chapter 5: Singular Value Decomposition

We now reach an important chapter in this course, concerned with the Singular Value Decomposition of a matrix A. The SVD, as it is commonly referred to, is one of the most powerful non-trivial tools of Numerical Linear Algebra, used for a vast range of applications, from image compression to least-squares fitting and, more generally, many forms of big-data analysis, as we shall see below. We will first learn more about the SVD, discuss briefly how one may compute it, then see some examples. By contrast with other chapters, here we will not focus so much on algorithmic development (which is a very complicated topic) but more on understanding the theoretical and practical aspects of the SVD.

1. What is the SVD?

(See Chapter 4 of the textbook.)

1.1. General ideas

As we saw in Chapter 1, it is possible, for any matrix A of size m × n, to define singular values σ_i and left and right singular vectors u_i and v_i so that

$$ A v_i = \sigma_i u_i \qquad (5.1) $$

where the u_i are the normalized m × 1 vectors parallel to the principal axes of the hyperellipse formed by the image of the unit ball under the application of A, the σ_i > 0 are the lengths of these principal axes, and the v_i, of size n × 1, are the pre-images of the u_i. If r = rank(A), then there are r such principal axes, and therefore r non-zero singular values. Let us now re-write this as the matrix operation

$$ AV = U\Sigma , \quad \text{i.e.} \quad A \begin{pmatrix} v_1 & v_2 & \cdots & v_r \end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_r \end{pmatrix} \begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{pmatrix} \qquad (5.2) $$

where A is m × n, V is n × r, U is m × r and Σ is r × r.
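A minimal numerical sketch (assuming NumPy; numpy.linalg.svd returns U, the singular values, and V*) makes Eqs. (5.1)-(5.2) concrete: it checks that A v_i = σ_i u_i for each singular triplet of a small random matrix, and that the number of non-zero singular values equals rank(A).

```python
# Minimal numerical check of Eqs. (5.1)-(5.2); the matrix sizes are an
# arbitrary illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # an m x n matrix, here m = 5, n = 3

U, s, Vh = np.linalg.svd(A, full_matrices=False)
V = Vh.conj().T

# A v_i = sigma_i u_i for each singular triplet
for i in range(len(s)):
    assert np.allclose(A @ V[:, i], s[i] * U[:, i])

# The number of non-zero singular values equals rank(A)
print(np.linalg.matrix_rank(A), np.count_nonzero(s > 1e-12))
```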

A very non-trivial result (that we prove below) is that both U and V are unitary matrices (orthogonal, if A is real). While it may seem obvious that U should be unitary since it is formed by the principal axes of the hyperellipse, note that we have not yet proved that the image of the unit ball is always a hyperellipse. Furthermore, there is no obvious reason why the pre-images v_i of the u_i vectors should themselves be orthogonal, even if the u_i are. Nevertheless, assuming this is indeed the case, we have

$$ U^* A V = \Sigma \quad \text{or equivalently} \quad A = U \Sigma V^* \qquad (5.3) $$

The first of these two expressions reveals something that looks a lot like a diagonalization of A in its own subspace, and is clearly an extension of the more commonly known expression

$$ V^{-1} A V = D_\lambda \qquad (5.4) $$

when A is diagonalizable (i.e. square and non-defective) and V is the basis of its eigenvectors. Crucially, however, we will show that writing U* A V = Σ can be done for any matrix A, even defective ones, and even singular ones. The second of these expressions writes A as a factorization into interesting matrices: here, two unitary ones and one diagonal, positive definite one. It is directly related to the so-called reduced and full singular value decompositions of A. The distinction between the reduced SVD and the full SVD is very similar to the distinction we made between the reduced and full QR decompositions in Chapter 3.

1.2. The reduced SVD

To construct the reduced singular value decomposition of A (assuming m ≥ n), we add the normalized vectors v_{r+1}, ..., v_n to V, choosing them carefully in such a way as to be orthogonal to one another and to all the already-existing ones. This transforms V into a square orthogonal matrix, whose column vectors form a complete basis for the space of vectors of length n. Having done that, we then need to expand Σ to size n × n, where the added rows and columns are identically zero, and add the vectors u_{r+1}, ..., u_n to U. These vectors must be normalized, orthogonal to one another, and orthogonal to the existing ones. Note that these new vectors lie outside of the range of A, by definition. The new factorization of A then becomes A = Û Σ̂ V̂*, and more precisely

$$ A = \begin{pmatrix} u_1 & \cdots & u_r & u_{r+1} & \cdots & u_n \end{pmatrix} \begin{pmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \\ & & & & & 0 \end{pmatrix} \begin{pmatrix} v_1^* \\ \vdots \\ v_r^* \\ v_{r+1}^* \\ \vdots \\ v_n^* \end{pmatrix} \qquad (5.5) $$

Recalling that this can also be written A V̂ = Û Σ̂, we find that the added vectors v_{r+1}, ..., v_n satisfy A v_i = 0 · u_i = 0, revealing that they span the nullspace of A. With this construction, V̂ and Σ̂ have size n × n, and Û has size m × n.
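The nullspace interpretation is easy to see numerically. The following sketch (again assuming NumPy; the matrix sizes and rank are arbitrary illustrative choices) builds a deliberately rank-deficient matrix, computes its reduced SVD, and checks that the trailing singular values vanish and that the corresponding columns of V are annihilated by A.

```python
# Reduced SVD of a rank-deficient matrix: trailing singular values are zero
# and the matching columns of V lie in the nullspace of A.
import numpy as np

rng = np.random.default_rng(4)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank r < n

U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)   # U_hat is m x n here
V = Vh.T

print(np.round(s, 12))                  # sigma_{r+1}, ..., sigma_n are (numerically) zero
for i in range(r, n):
    print(np.linalg.norm(A @ V[:, i]))  # A v_i ~ 0: these v_i span the nullspace
```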

1.3. The full SVD

Just as in the QR case, we can also define a full SVD, by completing the basis formed by the orthogonal vectors u_i into a full basis for the space of m × 1 vectors. We therefore continue to add vectors to U, normalized, orthogonal to one another and orthogonal to the previous ones, as usual. We also continue to pad Σ with more rows containing only zeros. The matrix V, on the other hand, remains unchanged. This yields

$$ A = U \Sigma V^* \qquad (5.6) $$

where, this time, U is an m × m matrix, V is an n × n matrix, and Σ is an m × n matrix whose leading diagonal contains the singular values, with zeros everywhere else. In this sense, U and V are square unitary matrices, but Σ has the same shape as the original matrix A. As in the case of the reduced vs. full QR decompositions, the full SVD naturally contains the reduced SVD in its columns and rows.
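The shape bookkeeping of the reduced vs. full SVD is easy to get wrong, so here is a short sketch (assuming NumPy, where the full_matrices flag toggles between the two forms) that prints the shapes of both factorizations and rebuilds A from the full form by embedding the singular values into an m × n Σ.

```python
# Reduced vs. full SVD shapes, and reconstruction of A from the full form.
import numpy as np

rng = np.random.default_rng(5)
m, n = 6, 4
A = rng.standard_normal((m, n))

U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)       # reduced: (m,n), (n,), (n,n)
U, s_full, Vh_full = np.linalg.svd(A, full_matrices=True)  # full: (m,m), (n,), (n,n)
print(U_hat.shape, U.shape, Vh.shape, Vh_full.shape)

# Rebuild A from the full form by embedding the singular values in an m x n Sigma
Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, s_full)
print(np.allclose(A, U @ Sigma @ Vh_full))
```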

1.4. The existence of the SVD

We now come to the tricky part, namely demonstrating that an SVD exists for all matrices. Note that the theorem applies to the full SVD, but since the reduced SVD is contained in the latter, it de facto also proves the existence of the reduced SVD.

Theorem (Singular Value Decomposition): Let A be an m × n matrix. Then there exist two unitary matrices U (of size m × m) and V (of size n × n) such that

$$ U^* A V = \Sigma , \qquad (5.7) $$

where Σ is an m × n diagonal matrix whose p = min(m, n) diagonal entries satisfy σ_1 ≥ σ_2 ≥ ... ≥ σ_p ≥ 0. If rank(A) = r, then there are exactly r non-zero σ_i, and all the other ones are zero.

Idea behind the proof: The proof is inductive but unfortunately not very constructive. The idea is to work singular value by singular value. Let's begin with the first singular value of A, which we proved a long time ago satisfies

$$ \sigma_1 = \|A\|_2 . \qquad (5.8) $$

Because of the way the 2-norm is defined, we know that there exists a unit vector x that satisfies ||Ax||_2 = σ_1 (this is where the proof lets us down a little: although we know x must exist, it does not tell us how to find it). Let that vector be v_1, and let u_1 be the unit vector defined such that A v_1 = σ_1 u_1. Let's then construct a unitary matrix U_1 whose first column is u_1 (randomly pick u_2, ..., u_m normalized and orthogonal to one another and to u_1), and another unitary matrix V_1 whose first column is v_1 (randomly pick v_2, ..., v_n normalized and orthogonal to one another and to v_1). Then let

$$ A_1 = U_1^* A V_1 . \qquad (5.9) $$

Since A v_1 = σ_1 u_1 and since u_i^* u_1 = δ_{i1} (because U_1 is unitary), it is easy to show that A_1 takes the form

$$ A_1 = \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix} \qquad (5.10) $$

where B is an (m−1) × (n−1) matrix, w is a vector of length n−1, and 0 is a vector of length m−1 full of zeros. The key to the proof is to show that w is also zero. To do so, first note that because U_1 and V_1 are unitary matrices, ||A_1||_2 = ||A||_2 (see Lecture 2). Next, let's consider the effect of A_1 on the column vector s = (σ_1, w^T)^T:

$$ A_1 s = \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix} \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} = \begin{pmatrix} \sigma_1^2 + w^* w \\ B w \end{pmatrix} . \qquad (5.11) $$

On the one hand, because σ_1 is the norm of A_1, we know that

$$ \|A_1 s\|_2 \le \sigma_1 \|s\|_2 = \sigma_1 \sqrt{\sigma_1^2 + w^* w} \qquad (5.12) $$

by definition of the norm. On the other hand, we also know that

$$ \|A_1 s\|_2 = \sqrt{(\sigma_1^2 + w^* w)^2 + \|Bw\|_2^2} \ge \sigma_1^2 + w^* w . \qquad (5.13) $$

Putting these two bounds together, we have

$$ \sigma_1^2 + w^* w \le \sigma_1 \sqrt{\sigma_1^2 + w^* w} , \quad \text{i.e.} \quad \sqrt{\sigma_1^2 + w^* w} \le \sigma_1 , \qquad (5.14) $$

which can only be true if w = 0. In other words, any unitary matrices U_1 and V_1 constructed with u_1 and v_1 as their first column vectors result in

$$ A_1 = U_1^* A V_1 = \begin{pmatrix} \sigma_1 & 0 \\ 0 & B \end{pmatrix} . \qquad (5.15) $$

If we continue the process for B, pick its first singular value (which we call σ_2), identify the vectors u_2 and v_2, and construct unitary matrices Û_2 and V̂_2 based on them, we can then write

$$ B = \hat U_2 \begin{pmatrix} \sigma_2 & 0 \\ 0 & C \end{pmatrix} \hat V_2^* \qquad (5.16) $$

which implies that

$$ U_2^* U_1^* A V_1 V_2 = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & C \end{pmatrix} \qquad (5.17) $$

where

$$ U_2 = \begin{pmatrix} 1 & 0 \\ 0 & \hat U_2 \end{pmatrix} \quad \text{and} \quad V_2 = \begin{pmatrix} 1 & 0 \\ 0 & \hat V_2 \end{pmatrix} . \qquad (5.18) $$

The procedure can be continued over and over again, to reveal, one by one, all the singular values of A.

Two things are left to discuss. The first is the fact that U and V constructed as above are indeed unitary. This is the case because, by construction, everything related to B is orthogonal to v_1 and u_1 (respectively). And since the product of unitary matrices is also unitary, the construction gradually builds the singular value decomposition of A with

$$ U = U_1 U_2 \cdots U_r \quad \text{and} \quad V = V_1 V_2 \cdots V_r . \qquad (5.19) $$

It is worth noting that, because of the way we constructed U and V, the first column of U contains u_1, the second column contains u_2, etc., and similarly for V: the first column contains v_1, the second v_2, etc., as we expect from the original description of the SVD.

The second is what to do when the next singular value happens to be zero. Well, the simple answer is: nothing, because the only matrix whose largest singular value is identically 0 is the zero matrix, so the remaining block must be entirely zero. And since the matrices U_r and V_r were constructed with completed bases already, all the work has already been done!

It is worth noting that a singular value decomposition is not unique. For instance, if two singular values are the same, then one can easily switch the order in which the vectors corresponding to that value appear in U and V to get two different SVDs. Also, note how one can easily switch the signs of both u_i and v_i at the same time, and therefore get another pair of matrices U and V that also satisfies A = U Σ V*.
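Although the proof does not tell us how to find x, the first step is easy to reproduce numerically once v_1 is known. The sketch below (assuming NumPy; the helper complete_to_unitary is a name introduced here purely for illustration) completes u_1 and v_1 to orthonormal bases with a QR factorization and checks that U_1* A V_1 has the block form (5.15), with w = 0, no matter how the remaining columns are chosen.

```python
# Numerical illustration of the first step of the existence proof: any
# unitary U1, V1 whose first columns are u1, v1 give U1* A V1 with first
# row and first column equal to (sigma_1, 0, ..., 0).
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
A = rng.standard_normal((m, n))

# sigma_1 and v1 from the SVD; u1 defined through A v1 = sigma_1 u1
U, s, Vh = np.linalg.svd(A)
sigma1, v1 = s[0], Vh[0, :]
u1 = A @ v1 / sigma1

def complete_to_unitary(x, size, rng):
    """Return an orthogonal matrix whose first column is the unit vector x,
    completed with randomly chosen orthonormal columns (via QR)."""
    M = np.column_stack([x, rng.standard_normal((size, size - 1))])
    Q, _ = np.linalg.qr(M)
    Q[:, 0] *= np.sign(Q[:, 0] @ x)   # fix the sign ambiguity of QR
    return Q

U1 = complete_to_unitary(u1, m, rng)
V1 = complete_to_unitary(v1, n, rng)

A1 = U1.T @ A @ V1          # Eq. (5.9) for a real matrix
print(np.round(A1, 10))     # first row and column are (sigma_1, 0, ..., 0)
print(sigma1)
```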

1.5. Simple / trivial examples of SVD

Here are a few matrices where the SVD can easily be found by inspection.

Example 1: Here A is a diagonal matrix that is nearly in the right form, but recall that the diagonal matrix in the SVD needs to have only positive entries. A negative diagonal entry is easily fixed by absorbing its sign into U (or into V): writing A = diag(σ_1, −σ_2) with σ_1, σ_2 > 0,

$$ \begin{pmatrix} \sigma_1 & 0 \\ 0 & -\sigma_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \qquad (5.20) $$

Other decompositions can be created by changing the signs of u_1 and v_1, and/or of u_2 and v_2. (A concrete numerical instance is sketched at the end of this subsection.)

Example 2: Here A is again nearly in the right form, but the order of the singular values is wrong. This is easily fixed by a permutation of the rows and columns: with σ_1 > σ_2 > 0,

$$ \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \qquad (5.21) $$

Again, other decompositions can be created by changing the signs of u_1 and v_1, and/or of u_2 and v_2.

Example 3: Here A is again nearly in the right form, and would be if its columns were switched. This can be done by multiplying A from the right by a permutation matrix, which ends up being V. There is nothing left to do from the left, so we simply take U to be the identity matrix. For instance, for an m × 2 matrix whose only non-zero entry σ_1 > 0 sits in the wrong column,

$$ \begin{pmatrix} 0 & \sigma_1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \qquad (5.22) $$

And so forth. More examples can be found in the textbook, and will be set as project questions.
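A concrete numerical instance of Example 1 (the specific entries 3 and -2 are an illustrative choice, not taken from the original example) can be checked directly:

```python
# Example 1 with concrete (illustrative) numbers: a diagonal matrix with a
# negative entry, with the sign absorbed into U.
import numpy as np

A = np.diag([3.0, -2.0])

U = np.diag([1.0, -1.0])
Sigma = np.diag([3.0, 2.0])
V = np.eye(2)
assert np.allclose(A, U @ Sigma @ V.T)   # a valid SVD found by inspection

# NumPy returns an equivalent decomposition (possibly with different signs)
print(np.linalg.svd(A))
```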

2. Properties of the SVD

(See Chapter 5 of the textbook.)

The SVD has a number of important properties that will turn out to be very useful. The first few we will just state; many of them we have already seen (a short numerical check follows the list):

- The SVD of a real matrix has real (and therefore orthogonal) U and V.
- The number of non-zero singular values is equal to rank(A).
- The vectors u_1, ..., u_r span the range of A, while the vectors v_{r+1}, ..., v_n span the nullspace of A.
- ||A||_2 = σ_1 and ||A||_F = √(σ_1² + ⋯ + σ_r²). (To prove the second, simply recall that the Frobenius norm of the product of A with a unitary matrix is equal to the Frobenius norm of A; see Lecture 2.)
- If A is a square m × m matrix, then |det A| = σ_1 σ_2 ⋯ σ_m. (To prove this, first remember, or prove, that the determinant of a unitary matrix has modulus 1.)
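A short numerical check of these properties (assuming NumPy; the matrix sizes and the rank are arbitrary illustrative choices):

```python
# Rank, spectral norm, Frobenius norm and determinant properties of the SVD.
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 6, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # a rank-2 matrix

U, s, Vh = np.linalg.svd(A)
print(np.linalg.matrix_rank(A), np.count_nonzero(s > 1e-12))    # both equal r
print(np.linalg.norm(A, 2), s[0])                               # ||A||_2 = sigma_1
print(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))          # Frobenius norm

B = rng.standard_normal((4, 4))                                 # square case
sB = np.linalg.svd(B, compute_uv=False)
print(abs(np.linalg.det(B)), np.prod(sB))                       # |det B| = product of sigma_i
```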

Another very interesting property of the SVD comes from realizing that, if A is square and non-singular, then the inverse of A satisfies

$$ A^{-1} = (U \Sigma V^*)^{-1} = V \Sigma^{-1} U^* \qquad (5.23) $$

which is nearly in SVD form. Indeed,

$$ \Sigma^{-1} = \begin{pmatrix} \sigma_1^{-1} & & & \\ & \sigma_2^{-1} & & \\ & & \ddots & \\ & & & \sigma_m^{-1} \end{pmatrix} \qquad (5.24) $$

so the elements are in increasing rather than decreasing order. This is easy to fix, however, simply by permuting the rows and columns on either side:

$$ A^{-1} = (VP)(P \Sigma^{-1} P)(P U^*) \qquad (5.25) $$

where P is the same reverse identity matrix we introduced in Lecture 11, which satisfies PP = I. Since P is merely a permutation, both VP and PU* are unitary matrices, because V and U are. Hence, the ordered singular values of A^{-1} are (σ_m^{-1}, ..., σ_1^{-1}). This also proves one of the results we learned a while back, that ||A^{-1}||_2 = σ_m^{-1}, a result we then used to show that the condition number of A is

$$ \mathrm{cond}(A) = \|A^{-1}\|_2 \, \|A\|_2 = \frac{\sigma_1}{\sigma_m} . \qquad (5.26) $$

Next, let us construct the matrix A*A in terms of the SVD of A. We have

$$ A^* A = (U \Sigma V^*)^* (U \Sigma V^*) = V \Sigma^* U^* U \Sigma V^* \qquad (5.27) $$

$$ \phantom{A^* A} = V \Sigma^* \Sigma V^* = V \begin{pmatrix} \sigma_1^2 & & & \\ & \sigma_2^2 & & \\ & & \ddots & \\ & & & \sigma_n^2 \end{pmatrix} V^* . \qquad (5.28) $$

This is effectively a diagonalization of the matrix A*A, which shows that the non-zero eigenvalues of A*A are the squares of the non-zero singular values of A!

Finally, if A is Hermitian (A = A*), then we know that its eigenvalue/eigenvector decomposition is

$$ A = Q D_\lambda Q^* \qquad (5.29) $$

where Q is the matrix whose column vectors are the (orthonormal) eigenvectors of A. Vector by vector, this implies A q_i = λ_i q_i. This actually looks very much like a singular value decomposition, except for the fact that some of the λ_i can be negative while the σ_i must be positive. However, by noting that we can write λ_i = sign(λ_i)|λ_i| = sign(λ_i) σ_i, then

$$ A q_i = \lambda_i q_i = \mathrm{sign}(\lambda_i) \, \sigma_i q_i = \sigma_i u_i \qquad (5.30) $$

where u_i = sign(λ_i) q_i. We can therefore use the q_i to find the left and right singular vectors, u_i = sign(λ_i) q_i and v_i = q_i, and then create the matrices U and V associated with the SVD of A. All of this shows that, for a Hermitian matrix, we have σ_i = |λ_i|, and the corresponding eigenvectors q_i are also the right singular vectors. Note that if the singular values obtained in this way are not ordered, we can simply use the trick of multiplying Σ by a permutation matrix (and permuting the columns of U and V accordingly).
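Finally, a short sketch (assuming NumPy) illustrating the condition number formula (5.26), the eigenvalues of A*A from (5.28), and the relation σ_i = |λ_i| for a Hermitian (here real symmetric) matrix:

```python
# cond(B) = sigma_1/sigma_m, eigenvalues of B^T B are sigma_i^2, and for a
# symmetric matrix the singular values are the |eigenvalues|.
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
s = np.linalg.svd(B, compute_uv=False)
print(np.linalg.cond(B), s[0] / s[-1])      # condition number in the 2-norm

eigs = np.linalg.eigvalsh(B.T @ B)          # eigenvalues of B^T B (ascending)
print(np.sort(eigs)[::-1])                  # ... match ...
print(s**2)                                 # ... the squared singular values

S = B + B.T                                 # a real symmetric (Hermitian) matrix
lam = np.linalg.eigvalsh(S)
print(np.sort(np.abs(lam))[::-1])           # |lambda_i|, sorted, equal ...
print(np.linalg.svd(S, compute_uv=False))   # ... the singular values of S
```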