Singular Value Decomposition and Polar Form


Chapter 14
Singular Value Decomposition and Polar Form

14.1 Singular Value Decomposition for Square Matrices

Let $f \colon E \to E$ be any linear map, where $E$ is a Euclidean space. In general, it may not be possible to diagonalize $f$. We show that every linear map can be diagonalized if we are willing to use two orthonormal bases. This is the celebrated singular value decomposition (SVD).

A close cousin of the SVD is the polar form of a linear map, which shows how a linear map can be decomposed into its purely rotational component (perhaps with a flip) and its purely stretching part.

The key observation is that $f^* \circ f$ is self-adjoint, since
$$\langle (f^* \circ f)(u), v \rangle = \langle f(u), f(v) \rangle = \langle u, (f^* \circ f)(v) \rangle.$$
Similarly, $f \circ f^*$ is self-adjoint.

The fact that $f^* \circ f$ and $f \circ f^*$ are self-adjoint is very important, because it implies that $f^* \circ f$ and $f \circ f^*$ can be diagonalized and that they have real eigenvalues. In fact, these eigenvalues are all nonnegative.
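In matrix terms, for a real matrix $A$ the adjoint is the transpose, so the observation says that $A^\top A$ is symmetric with nonnegative eigenvalues. A minimal numpy sketch checking this on an arbitrary random matrix:

```python
import numpy as np

# Any real matrix A; A.T @ A plays the role of f* o f.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

AtA = A.T @ A
print(np.allclose(AtA, AtA.T))      # True: A^T A is symmetric (self-adjoint)

eigvals = np.linalg.eigvalsh(AtA)   # eigvalsh: real eigenvalues of a symmetric matrix
print(np.all(eigvals >= -1e-12))    # True: all eigenvalues are (numerically) nonnegative
```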

Thus, the eigenvalues of $f^* \circ f$ are of the form $\sigma_1^2, \ldots, \sigma_r^2$ or $0$, where $\sigma_i > 0$, and similarly for $f \circ f^*$.

The above considerations also apply to any linear map $f \colon E \to F$ between two Euclidean spaces $(E, \langle\cdot,\cdot\rangle_1)$ and $(F, \langle\cdot,\cdot\rangle_2)$. Recall that the adjoint $f^* \colon F \to E$ of $f$ is the unique linear map such that
$$\langle f(u), v \rangle_2 = \langle u, f^*(v) \rangle_1,$$
for all $u \in E$ and all $v \in F$.

Then $f^* \circ f$ and $f \circ f^*$ are self-adjoint, and the eigenvalues of $f^* \circ f$ and $f \circ f^*$ are nonnegative (the proof is the same as in the previous case).

The situation is even better, since we will show shortly that $f^* \circ f$ and $f \circ f^*$ have the same eigenvalues.

Remark: Given any two linear maps $f \colon E \to F$ and $g \colon F \to E$, where $\dim(E) = n$ and $\dim(F) = m$, it can be shown that
$$\lambda^m \det(\lambda I_n - g \circ f) = \lambda^n \det(\lambda I_m - f \circ g),$$
and thus $g \circ f$ and $f \circ g$ always have the same nonzero eigenvalues!

Definition 14.1. Given any linear map $f \colon E \to F$, the square roots $\sigma_i > 0$ of the positive eigenvalues of $f^* \circ f$ (and $f \circ f^*$) are called the singular values of $f$.
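This remark is easy to test numerically. A small sketch (random matrices $F$ and $G$ standing in for $f$ and $g$) comparing the nonzero eigenvalues of $g \circ f$ and $f \circ g$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3
F = rng.standard_normal((m, n))   # f : E -> F as an m x n matrix
G = rng.standard_normal((n, m))   # g : F -> E as an n x m matrix

ev_gf = np.linalg.eigvals(G @ F)  # n eigenvalues (n - m of them are ~0)
ev_fg = np.linalg.eigvals(F @ G)  # m eigenvalues

def nonzero(ev):
    # Keep the nonzero eigenvalues, sorted for comparison.
    return np.sort_complex(ev[np.abs(ev) > 1e-10])

print(np.allclose(nonzero(ev_gf), nonzero(ev_fg)))  # True: same nonzero spectrum
```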

Definition 14.2. A self-adjoint linear map $f \colon E \to E$ whose eigenvalues are nonnegative is called positive semidefinite (or positive), and if $f$ is also invertible, $f$ is said to be positive definite. In the latter case, every eigenvalue of $f$ is strictly positive.

If $f \colon E \to F$ is any linear map, we just showed that $f^* \circ f$ and $f \circ f^*$ are positive semidefinite self-adjoint linear maps. This fact has the remarkable consequence that every linear map has two important decompositions:

1. The polar form.
2. The singular value decomposition (SVD).

The wonderful thing about the singular value decomposition is that there exist two orthonormal bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$ such that, with respect to these bases, $f$ is a diagonal matrix consisting of the singular values of $f$, or $0$.

Thus, in some sense, $f$ can always be diagonalized with respect to two orthonormal bases.

The SVD is also a useful tool for solving overdetermined linear systems in the least squares sense and for data analysis, as we show later on.

Recall that if $f \colon E \to F$ is a linear map, the image $\operatorname{Im} f$ of $f$ is the subspace $f(E)$ of $F$, and the rank of $f$ is the dimension $\dim(\operatorname{Im} f)$ of its image. Also recall that
$$\dim(\operatorname{Ker} f) + \dim(\operatorname{Im} f) = \dim(E),$$
and that for every subspace $W$ of $E$,
$$\dim(W) + \dim(W^\perp) = \dim(E).$$

Proposition 14.1. Given any two Euclidean spaces $E$ and $F$, where $E$ has dimension $n$ and $F$ has dimension $m$, for any linear map $f \colon E \to F$, we have
$$\operatorname{Ker} f = \operatorname{Ker}(f^* \circ f), \qquad \operatorname{Ker} f^* = \operatorname{Ker}(f \circ f^*),$$
$$\operatorname{Ker} f = (\operatorname{Im} f^*)^\perp, \qquad \operatorname{Ker} f^* = (\operatorname{Im} f)^\perp,$$
$$\dim(\operatorname{Im} f) = \dim(\operatorname{Im} f^*),$$
and $f$, $f^*$, $f^* \circ f$, and $f \circ f^*$ have the same rank.

We will now prove that every square matrix has an SVD. Stronger results can be obtained if we first consider the polar form and then derive the SVD from it (there are uniqueness properties of the polar decomposition). For our purposes, uniqueness results are not as important, so we content ourselves with existence results, whose proofs are simpler.
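Proposition 14.1 can be illustrated numerically. A small sketch (with an arbitrarily chosen rank-2 matrix) checking that $f$, $f^*$, $f^* \circ f$, and $f \circ f^*$ have the same rank:

```python
import numpy as np

rng = np.random.default_rng(2)
# Build a 5 x 4 matrix of rank 2 as a product of thin factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

ranks = [np.linalg.matrix_rank(M) for M in (A, A.T, A.T @ A, A @ A.T)]
print(ranks)  # [2, 2, 2, 2]: f, f*, f* o f, and f o f* have the same rank
```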

The early history of the singular value decomposition is described in a fascinating paper by Stewart [30]. The SVD is due to Beltrami and Camille Jordan independently (1873, 1874). Gauss is the grandfather of all this, for his work on least squares (1809, 1823) (but Legendre also published a paper on least squares!). Then come Sylvester, Schmidt, and Hermann Weyl. Sylvester's work was apparently opaque. He gave a computational method to find an SVD. Schmidt's work really has to do with integral equations and symmetric and asymmetric kernels (1907). Weyl's work has to do with perturbation theory (1912).

Autonne came up with the polar decomposition (1902, 1915). Eckart and Young extended SVD to rectangular matrices (1936, 1939).

Theorem 14.2. For every real $n \times n$ matrix $A$ there are two orthogonal matrices $U$ and $V$ and a diagonal matrix $D$ such that $A = V D U^\top$, where $D$ is of the form
$$D = \begin{pmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{pmatrix},$$
where $\sigma_1, \ldots, \sigma_r$ are the singular values of $f$, i.e., the (positive) square roots of the nonzero eigenvalues of $A^\top A$ and $A A^\top$, and $\sigma_{r+1} = \cdots = \sigma_n = 0$. The columns of $U$ are eigenvectors of $A^\top A$, and the columns of $V$ are eigenvectors of $A A^\top$.
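Numerically, numpy's `np.linalg.svd` returns factors in the convention $A = W \Sigma Z^\top$; in the notation of Theorem 14.2 this makes $W$ play the role of $V$ and $Z$ the role of $U$. A minimal sketch verifying the statement of the theorem on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

# numpy convention: A = W @ diag(s) @ Zt; in the notes' notation A = V D U^T.
W, s, Zt = np.linalg.svd(A)
V, D, U = W, np.diag(s), Zt.T

print(np.allclose(A, V @ D @ U.T))   # A = V D U^T
# Singular values are the square roots of the eigenvalues of A^T A:
print(np.allclose(s**2, np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]))
# Columns of U are eigenvectors of A^T A; columns of V are eigenvectors of A A^T:
print(np.allclose(A.T @ A @ U, U @ D**2))
print(np.allclose(A @ A.T @ V, V @ D**2))
```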

Theorem 14.2 suggests the following definition.

Definition 14.3. A triple $(U, D, V)$ such that $A = V D U^\top$, where $U$ and $V$ are orthogonal and $D$ is a diagonal matrix whose entries are nonnegative (it is positive semidefinite), is called a singular value decomposition (SVD) of $A$.

The proof of Theorem 14.2 shows that there are two orthonormal bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_n)$, where $(u_1, \ldots, u_n)$ are eigenvectors of $A^\top A$ and $(v_1, \ldots, v_n)$ are eigenvectors of $A A^\top$.

Furthermore, $(u_1, \ldots, u_r)$ is an orthonormal basis of $\operatorname{Im} A^\top$, $(u_{r+1}, \ldots, u_n)$ is an orthonormal basis of $\operatorname{Ker} A$, $(v_1, \ldots, v_r)$ is an orthonormal basis of $\operatorname{Im} A$, and $(v_{r+1}, \ldots, v_n)$ is an orthonormal basis of $\operatorname{Ker} A^\top$.

Using a remark made in Chapter 2, if we denote the columns of $U$ by $u_1, \ldots, u_n$ and the columns of $V$ by $v_1, \ldots, v_n$, then we can write
$$A = V D U^\top = \sigma_1 v_1 u_1^\top + \cdots + \sigma_r v_r u_r^\top.$$

As a consequence, if $r$ is a lot smaller than $n$ (we write $r \ll n$), we see that $A$ can be reconstructed from $U$ and $V$ using a much smaller number of elements. This idea will be used to provide low-rank approximations of a matrix. The idea is to keep only the top $k$ singular values for some suitable $k \ll r$ for which $\sigma_{k+1}, \ldots, \sigma_r$ are very small.
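A sketch of this truncation in numpy (the test matrix and the choice $k = 2$ are arbitrary). By the Eckart–Young theorem, the truncated sum is in fact the best rank-$k$ approximation in the spectral norm, and the approximation error equals the first discarded singular value:

```python
import numpy as np

def rank_k_approx(A, k):
    """Rank-k approximation of A built from its top k singular triples:
    sigma_1 v_1 u_1^T + ... + sigma_k v_k u_k^T."""
    V, s, Ut = np.linalg.svd(A, full_matrices=False)   # A = V @ diag(s) @ Ut
    return V[:, :k] @ np.diag(s[:k]) @ Ut[:k, :]

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 5))
A2 = rank_k_approx(A, 2)
print(np.linalg.matrix_rank(A2))              # 2
print(np.linalg.norm(A - A2, 2))              # spectral-norm error ...
print(np.linalg.svd(A, compute_uv=False)[2])  # ... equals sigma_3, the first dropped value
```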

Remarks:

(1) In Strang [32] the matrices $U$, $V$, $D$ are denoted by $U = Q_2$, $V = Q_1$, and $D = \Sigma$, and an SVD is written as $A = Q_1 \Sigma Q_2^\top$. This has the advantage that $Q_1$ comes before $Q_2$ in $A = Q_1 \Sigma Q_2^\top$. This has the disadvantage that $A$ maps the columns of $Q_2$ (eigenvectors of $A^\top A$) to multiples of the columns of $Q_1$ (eigenvectors of $A A^\top$).

(2) Algorithms for actually computing the SVD of a matrix are presented in Golub and Van Loan [17], Demmel [11], and Trefethen and Bau [34], where the SVD and its applications are also discussed quite extensively.

(3) The SVD also applies to complex matrices. In this case, for every complex $n \times n$ matrix $A$, there are two unitary matrices $U$ and $V$ and a diagonal matrix $D$ such that $A = V D U^*$, where $D$ is a diagonal matrix consisting of real entries $\sigma_1, \ldots, \sigma_n$, where $\sigma_1, \ldots, \sigma_r$ are the singular values of $A$, i.e., the positive square roots of the nonzero eigenvalues of $A^* A$ and $A A^*$, and $\sigma_{r+1} = \cdots = \sigma_n = 0$.

A notion closely related to the SVD is the polar form of a matrix.

Definition 14.4. A pair $(R, S)$ such that $A = RS$ with $R$ orthogonal and $S$ symmetric positive semidefinite is called a polar decomposition of $A$.

Theorem 14.2 implies that for every real $n \times n$ matrix $A$, there is some orthogonal matrix $R$ and some positive semidefinite symmetric matrix $S$ such that $A = RS$. Furthermore, $R$ and $S$ are unique if $A$ is invertible, but this is harder to prove.

For example, the matrix
$$A = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$$
is both orthogonal and symmetric, and $A = RS$ with $R = A$ and $S = I$, which implies that some of the eigenvalues of $A$ are negative.
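A quick numerical check of this example:

```python
import numpy as np

A = 0.5 * np.array([[1,  1,  1,  1],
                    [1,  1, -1, -1],
                    [1, -1,  1, -1],
                    [1, -1, -1,  1]])

print(np.allclose(A, A.T))              # True: A is symmetric
print(np.allclose(A @ A.T, np.eye(4)))  # True: A is orthogonal
print(np.linalg.eigvalsh(A))            # [-1, 1, 1, 1]: one eigenvalue is negative
```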

Remark: In the complex case, the polar decomposition states that for every complex $n \times n$ matrix $A$, there is some unitary matrix $U$ and some positive semidefinite Hermitian matrix $H$ such that $A = UH$.

It is easy to go from the polar form to the SVD, and conversely. Given an SVD decomposition $A = V D U^\top$, let $R = V U^\top$ and $S = U D U^\top$. It is clear that $R$ is orthogonal and that $S$ is positive semidefinite symmetric, and
$$RS = V U^\top U D U^\top = V D U^\top = A.$$
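This recipe translates directly into code. A minimal sketch (serving the same purpose as `scipy.linalg.polar`, but built only from numpy's SVD):

```python
import numpy as np

def polar(A):
    """Polar decomposition A = R @ S with R orthogonal and S symmetric PSD,
    built from an SVD A = V D U^T via R = V U^T and S = U D U^T."""
    V, d, Ut = np.linalg.svd(A)
    U = Ut.T
    R = V @ Ut
    S = U @ np.diag(d) @ Ut
    return R, S

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
R, S = polar(A)
print(np.allclose(R @ R.T, np.eye(4)))          # R is orthogonal
print(np.allclose(S, S.T))                      # S is symmetric
print(np.all(np.linalg.eigvalsh(S) >= -1e-12))  # S is positive semidefinite
print(np.allclose(R @ S, A))                    # A = R S
```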

Going the other way, given a polar decomposition $A = R_1 S$, where $R_1$ is orthogonal and $S$ is positive semidefinite symmetric, there is an orthogonal matrix $R_2$ and a positive semidefinite diagonal matrix $D$ such that $S = R_2 D R_2^\top$, and thus
$$A = R_1 R_2 D R_2^\top = V D U^\top,$$
where $V = R_1 R_2$ and $U = R_2$ are orthogonal.

Theorem 14.2 can be easily extended to rectangular $m \times n$ matrices (see Strang [32] or Golub and Van Loan [17], Demmel [11], Trefethen and Bau [34]).

14.2 Singular Value Decomposition for Rectangular Matrices

Theorem 14.3. For every real $m \times n$ matrix $A$, there are two orthogonal matrices $U$ ($n \times n$) and $V$ ($m \times m$) and a diagonal $m \times n$ matrix $D$ such that $A = V D U^\top$, where $D$ is of the form
$$D = \begin{pmatrix}
\sigma_1 & & & \\
& \sigma_2 & & \\
& & \ddots & \\
& & & \sigma_n \\
0 & \cdots & \cdots & 0 \\
\vdots & & & \vdots \\
0 & \cdots & \cdots & 0
\end{pmatrix}
\quad \text{or} \quad
D = \begin{pmatrix}
\sigma_1 & & & & 0 & \cdots & 0 \\
& \sigma_2 & & & 0 & \cdots & 0 \\
& & \ddots & & \vdots & & \vdots \\
& & & \sigma_m & 0 & \cdots & 0
\end{pmatrix},$$
where $\sigma_1, \ldots, \sigma_r$ are the singular values of $f$, i.e., the (positive) square roots of the nonzero eigenvalues of $A^\top A$ and $A A^\top$, and $\sigma_{r+1} = \cdots = \sigma_p = 0$, where $p = \min(m, n)$. The columns of $U$ are eigenvectors of $A^\top A$, and the columns of $V$ are eigenvectors of $A A^\top$.
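In numpy, this rectangular case corresponds to `np.linalg.svd` with `full_matrices=True` (the default): the orthogonal factors are square, and $D$ must be assembled as an $m \times n$ matrix padded with zeros. A sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 5, 3
A = rng.standard_normal((m, n))

V, s, Ut = np.linalg.svd(A, full_matrices=True)  # V: m x m, Ut: n x n
D = np.zeros((m, n))
D[:min(m, n), :min(m, n)] = np.diag(s)           # sigma_1..sigma_p on the diagonal

print(V.shape, D.shape, Ut.shape)                # (5, 5) (5, 3) (3, 3)
print(np.allclose(A, V @ D @ Ut))                # A = V D U^T
```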

A triple $(U, D, V)$ such that $A = V D U^\top$ is called a singular value decomposition (SVD) of $A$. Even though the matrix $D$ is an $m \times n$ rectangular matrix, since its only nonzero entries are on the descending diagonal, we still say that $D$ is a diagonal matrix.

If we view $A$ as the representation of a linear map $f \colon E \to F$, where $\dim(E) = n$ and $\dim(F) = m$, the proof of Theorem 14.3 shows that there are two orthonormal bases $(u_1, \ldots, u_n)$ and $(v_1, \ldots, v_m)$ for $E$ and $F$, respectively, where $(u_1, \ldots, u_n)$ are eigenvectors of $f^* \circ f$ and $(v_1, \ldots, v_m)$ are eigenvectors of $f \circ f^*$.

Furthermore, $(u_1, \ldots, u_r)$ is an orthonormal basis of $\operatorname{Im} f^*$, $(u_{r+1}, \ldots, u_n)$ is an orthonormal basis of $\operatorname{Ker} f$, $(v_1, \ldots, v_r)$ is an orthonormal basis of $\operatorname{Im} f$, and $(v_{r+1}, \ldots, v_m)$ is an orthonormal basis of $\operatorname{Ker} f^*$.

The eigenvalues and the singular values of a matrix are typically not related in any obvious way.

For example, the $n \times n$ matrix
$$A = \begin{pmatrix}
1 & 2 & 0 & \cdots & 0 & 0 \\
0 & 1 & 2 & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 2 & 0 \\
0 & 0 & \cdots & 0 & 1 & 2 \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}$$
has the eigenvalue $1$ with multiplicity $n$, but its singular values $\sigma_1 \geq \cdots \geq \sigma_n$, which are the positive square roots of the eigenvalues of the matrix $B = A^\top A$ with
$$B = \begin{pmatrix}
1 & 2 & 0 & \cdots & 0 & 0 \\
2 & 5 & 2 & \cdots & 0 & 0 \\
\vdots & \ddots & \ddots & \ddots & & \vdots \\
0 & 0 & \cdots & 5 & 2 & 0 \\
0 & 0 & \cdots & 2 & 5 & 2 \\
0 & 0 & \cdots & 0 & 2 & 5
\end{pmatrix},$$
have a wide spread, since
$$\frac{\sigma_1}{\sigma_n} = \operatorname{cond}_2(A) \geq 2^{n-1}.$$
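This example is easy to reproduce for a modest $n$ (a sketch; `n = 10` is an arbitrary choice):

```python
import numpy as np

n = 10
A = np.eye(n) + 2 * np.eye(n, k=1)   # 1 on the diagonal, 2 on the superdiagonal

# A is upper triangular with unit diagonal, so its only eigenvalue is 1,
# yet its singular values spread over many orders of magnitude:
s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])              # cond_2(A), far larger than the eigenvalue spread
print(s[0] / s[-1] >= 2**(n-1))  # True: matches the bound cond_2(A) >= 2^(n-1)
```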

If $A$ is a complex $n \times n$ matrix, the eigenvalues $\lambda_1, \ldots, \lambda_n$ and the singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n$ of $A$ are not unrelated, since
$$|\lambda_1| \cdots |\lambda_n| = \sigma_1 \cdots \sigma_n.$$

More generally, Hermann Weyl proved the following remarkable theorem:

Theorem 14.4. (Weyl's inequalities, 1949) For any complex $n \times n$ matrix $A$, if $\lambda_1, \ldots, \lambda_n \in \mathbb{C}$ are the eigenvalues of $A$ and $\sigma_1, \ldots, \sigma_n \in \mathbb{R}^+$ are the singular values of $A$, listed so that $|\lambda_1| \geq \cdots \geq |\lambda_n|$ and $\sigma_1 \geq \cdots \geq \sigma_n \geq 0$, then
$$|\lambda_1| \cdots |\lambda_n| = \sigma_1 \cdots \sigma_n \quad \text{and} \quad |\lambda_1| \cdots |\lambda_k| \leq \sigma_1 \cdots \sigma_k, \quad \text{for } k = 1, \ldots, n-1.$$
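A sketch checking Weyl's inequalities on an arbitrary random complex matrix:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

lam = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]  # |lambda_1| >= ... >= |lambda_n|
sig = np.linalg.svd(A, compute_uv=False)           # sigma_1 >= ... >= sigma_n

prods_lam = np.cumprod(lam)   # partial products |lambda_1|...|lambda_k|
prods_sig = np.cumprod(sig)   # partial products sigma_1...sigma_k
print(np.all(prods_lam[:-1] <= prods_sig[:-1] + 1e-9))  # inequalities for k < n
print(np.isclose(prods_lam[-1], prods_sig[-1]))          # equality for k = n
```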

A proof of Theorem 14.4 can be found in Horn and Johnson [20], Chapter 3, Section 3.3, where more inequalities relating the eigenvalues and the singular values of a matrix are given.

The SVD of matrices can be used to define the pseudo-inverse of a rectangular matrix.

Computing the SVD of a matrix $A$ is quite involved. Most methods begin by finding orthogonal matrices $U$ and $V$ and a bidiagonal matrix $B$ such that $A = V B U^\top$.
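As a preview, here is a sketch of the standard SVD-based construction of the pseudo-inverse (the helper name `pinv_via_svd` is ours); it agrees with numpy's built-in `np.linalg.pinv`:

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Moore-Penrose pseudo-inverse built from the SVD A = V D U^T:
    A+ = U D+ V^T, where D+ inverts only the nonzero singular values."""
    V, s, Ut = np.linalg.svd(A, full_matrices=False)
    s_inv = np.where(s > tol, 1.0 / s, 0.0)
    return Ut.T @ np.diag(s_inv) @ V.T

rng = np.random.default_rng(8)
A = rng.standard_normal((5, 3))
print(np.allclose(pinv_via_svd(A), np.linalg.pinv(A)))  # True
```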

14.3 Ky Fan Norms and Schatten Norms

The singular values of a matrix can be used to define various norms on matrices which have found recent applications in quantum information theory and in spectral graph theory. Following Horn and Johnson [20] (Section 3.4), we can make the following definitions:

Definition 14.5. For any matrix $A \in M_{m,n}(\mathbb{C})$, let $q = \min\{m, n\}$, and if $\sigma_1 \geq \cdots \geq \sigma_q$ are the singular values of $A$, for any $k$ with $1 \leq k \leq q$, let
$$N_k(A) = \sigma_1 + \cdots + \sigma_k,$$
called the Ky Fan $k$-norm of $A$.

More generally, for any $p \geq 1$ and any $k$ with $1 \leq k \leq q$, let
$$N_{k;p}(A) = (\sigma_1^p + \cdots + \sigma_k^p)^{1/p},$$
called the Ky Fan $p$-$k$-norm of $A$. When $k = q$, $N_{q;p}$ is also called the Schatten $p$-norm.
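These definitions translate into a few lines of numpy (a sketch; the function name `ky_fan` is ours, not a standard API):

```python
import numpy as np

def ky_fan(A, k, p=1):
    """Ky Fan p-k-norm N_{k;p}(A) = (sigma_1^p + ... + sigma_k^p)^(1/p).
    With p = 1 this is the Ky Fan k-norm; with k = min(m, n) it is the
    Schatten p-norm."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted descending
    return np.sum(s[:k] ** p) ** (1.0 / p)

rng = np.random.default_rng(9)
A = rng.standard_normal((4, 3))
q = min(A.shape)
print(ky_fan(A, 1))       # N_1: the largest singular value
print(ky_fan(A, q))       # N_q: the sum of all singular values
print(ky_fan(A, q, p=2))  # Schatten 2-norm
```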

Observe that when $k = 1$, $N_1(A) = \sigma_1$, and the Ky Fan norm $N_1$ is simply the spectral norm from Chapter 6, which is the subordinate matrix norm associated with the Euclidean norm.

When $k = q$, the Ky Fan norm $N_q$ is given by
$$N_q(A) = \sigma_1 + \cdots + \sigma_q = \operatorname{tr}((A^* A)^{1/2})$$
and is called the trace norm or nuclear norm.

When $p = 2$ and $k = q$, the Ky Fan norm $N_{q;2}$ is given by
$$N_{q;2}(A) = (\sigma_1^2 + \cdots + \sigma_q^2)^{1/2} = \sqrt{\operatorname{tr}(A^* A)} = \|A\|_F,$$
which is the Frobenius norm of $A$.

It can be shown that $N_k$ and $N_{k;p}$ are unitarily invariant norms, and that when $m = n$, they are matrix norms; see Horn and Johnson [20] (Section 3.4, Corollary 3.4.4 and Problem 3).
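The three special cases line up with numpy's built-in matrix norms, which gives a cheap consistency check (a sketch):

```python
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((4, 3))
s = np.linalg.svd(A, compute_uv=False)

print(np.isclose(s[0], np.linalg.norm(A, 2)))                    # spectral norm  N_1
print(np.isclose(s.sum(), np.linalg.norm(A, 'nuc')))             # nuclear norm   N_q
print(np.isclose(np.sqrt((s**2).sum()), np.linalg.norm(A, 'fro')))  # Frobenius  N_{q;2}
```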
