Singular Value Decomposition (SVD)

School of Computing, National University of Singapore
CS524 Theoretical Foundations of Multimedia
More Linear Algebra

Singular Value Decomposition (SVD)

"The high point of linear algebra" (Gilbert Strang)

Any m×n matrix A can be decomposed into

    A = U Σ V^T

where
    U : m×m, its columns u_1, ..., u_m are the left singular vectors,
    Σ : m×n, diagonal, containing the singular values,
    V : n×n, its columns v_1, ..., v_n are the right singular vectors.

E.g. for m > n,

    A = [ u_1 ... u_r  u_{r+1} ... u_m ] [ diag(σ_1, ..., σ_r)  0 ;  0  0 ] [ v_1 ... v_r  v_{r+1} ... v_n ]^T

with σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0 and r = rank(A).

Economy version:

    A = U_r Σ_r V_r^T,    U_r : m×r,  Σ_r : r×r,  V_r^T : r×n.

U and V are orthogonal: U^T U = I_{m×m}, V^T V = I_{n×n}.

Column space: look at Ax. We have Ax = U Σ V^T x; let y = V^T x. Then

    Ax = U Σ y = σ_1 y_1 u_1 + ... + σ_r y_r u_r,

so Col(A) = Col(U_r). In fact, u_1, ..., u_r form an orthonormal basis for Col(A).
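
These facts are easy to verify numerically. Below is a minimal Matlab/Octave sketch; the matrix, its size, and the rank tolerance are arbitrary illustrative choices.

    % Random 5x3 matrix of rank 2 (product of 5x2 and 2x3 factors).
    A = randn(5,2) * randn(2,3);

    % Full SVD: A = U*S*V', with U 5x5, S 5x3, V 3x3.
    [U, S, V] = svd(A);
    disp(norm(A - U*S*V'))           % reconstruction error, ~ round-off

    % Economy SVD: here U is 5x3 and S is 3x3.
    [Ue, Se, Ve] = svd(A, 'econ');

    % Numerical rank r = number of singular values above a tolerance.
    tol = 1e-10;
    r   = sum(diag(S) > tol);        % expect r = 2

    % The first r left singular vectors span Col(A); compare the projector
    % they define with the one built from orth(A), Matlab's orthonormal
    % basis for the column space.
    Ur = U(:, 1:r);
    disp(norm(Ur*Ur' - orth(A)*orth(A)'))   % ~ 0: same subspace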

Nullspace: look at Ax = 0, i.e.

    U_r Σ_r V_r^T x = 0.

Pre-multiply by U_r^T:  Σ_r V_r^T x = 0.
Pre-multiply by Σ_r^{-1}:  V_r^T x = 0,

i.e. we want x to be orthogonal to v_1, ..., v_r. That is precisely span{v_{r+1}, ..., v_n}, since V is orthogonal! Thus v_{r+1}, ..., v_n form an orthonormal basis for Null(A).

Consider A^T A:

    A^T A = (U Σ V^T)^T (U Σ V^T) = V Σ^T U^T U Σ V^T = V Σ^2 V^T.

But this is the eigen-decomposition of A^T A! So

    V is the eigenvector matrix of A^T A,
    Σ^2 is the eigenvalue matrix of A^T A,

i.e. the singular values are the positive square roots of the eigenvalues. Similarly, consider AA^T = U Σ V^T V Σ^T U^T = U Σ^2 U^T. So U is the eigenvector matrix of AA^T, with the same nonzero eigenvalues.

In general, for an m×n matrix A,

    Ax = U Σ V^T x = (rotate in R^m)(scale)(rotate in R^n) x.

Low-rank approximation

The SVD provides the best lower-rank approximation to A, i.e. the rank-k approximation

    A_k = U_k Σ_k V_k^T.

The idea is to use only the first k singular values/vectors, so that A_k ≈ A.

Use the SVD for compression: instead of storing A (mn numbers), store

    u_1, ..., u_k : mk numbers
  + σ_1, ..., σ_k : k numbers
  + v_1, ..., v_k : nk numbers
  = (m + n + 1)k numbers.

Use the SVD to filter noise: typically, small singular values are caused by noise, so using the rank-k approximation (k < r) removes the noise.
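
The eigen-relationship and the rank-k truncation can be checked in a few lines. A minimal Matlab/Octave sketch (random matrix, and k = 2 chosen arbitrarily):

    A = randn(6, 4);
    [U, S, V] = svd(A);

    % Eigenvalues of A'*A are the squared singular values of A.
    ev = sort(eig(A'*A), 'descend');
    disp([ev, diag(S).^2])          % the two columns should agree

    % Rank-k approximation: keep only the first k singular triplets.
    k  = 2;
    Ak = U(:,1:k) * S(1:k,1:k) * V(:,1:k)';

    % The spectral-norm error of the rank-k truncation is sigma_{k+1}.
    disp(norm(A - Ak))              % matrix 2-norm of the error
    disp(S(k+1, k+1))               % should match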

Linear Equations Revisited: Ax = b

Key: a solution exists only when b ∈ Col(A).

Case 1: A is n×n and invertible. Then there is a unique solution x = A^{-1} b; rank(A) = n and Col(A) = R^n.

Case 2: A is n×n and singular, rank(A) = r < n, nullity = n - r. Two possibilities:

    (a) b ∈ Col(A): many solutions.
    (b) b ∉ Col(A): no exact solution; find the closest solution.

(a) b ∈ Col(A): the SVD gives a particular solution x_p such that A x_p = b. But we can add any vector x_n from the nullspace, since

    A(x_p + x_n) = A x_p + A x_n = b + 0 = b.

Infinitely many solutions! What is the SVD solution? Invert only in the rank-r subspace:

    A = U Σ V^T   (all n×n),   Σ = diag(σ_1, ..., σ_r, 0, ..., 0).

Let A^+ = V Σ^+ U^T, where Σ^+ = diag(1/σ_1, ..., 1/σ_r, 0, ..., 0). Then

    x_p = A^+ b,

and A^+ is called the pseudoinverse. See Figure 1.

(b) b ∉ Col(A): no exact solution, but we can find the vector b̂ ∈ Col(A) that is closest to b. Solution: x = A^+ b = V Σ^+ U^T b.

Case 3: A is m×n with m < n: underconstrained, fewer equations than unknowns. r = rank(A) ≤ min(m, n), i.e. r < n, so the nullspace is not trivial, and Col(A) ⊆ R^m. The situation is similar to the previous case: either b ∈ Col(A) or b ∉ Col(A). In practice, usually r = m, so that b ∈ Col(A), i.e. many solutions.

Case 4: A is m×n with m > n: overconstrained, more equations than unknowns. The rank r is at most n, so Col(A) is a proper subspace of R^m. Again it depends on whether b ∈ Col(A); in general we can only find the closest, or least squares, solution x = A^+ b. The pseudoinverse A^+ solves Ax = b in the least squares sense, i.e. ||Ax - b||^2 is minimized.
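
Case 2(a) can be seen directly in code: the pseudoinverse returns one particular solution, and adding any nullspace vector gives another. A minimal Matlab/Octave sketch (the singular matrix and the scaling factor are arbitrary examples):

    % A 3x3 singular matrix of rank 2, and a right-hand side inside Col(A).
    A = [1 2 3; 4 5 6; 7 8 9];        % rank(A) = 2
    b = A * [1; 1; 1];                % guarantees b is in Col(A)

    xp = pinv(A) * b;                 % particular solution x_p = A^+ b
    disp(norm(A*xp - b))              % ~ 0: xp solves the system

    % Adding any nullspace vector still solves the system.
    N  = null(A);                     % orthonormal basis of Null(A)
    x2 = xp + 3 * N(:,1);             % another solution
    disp(norm(A*x2 - b))              % ~ 0 as well

    % Among all solutions, the pseudoinverse one has the smallest norm.
    disp([norm(xp), norm(x2)])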

Figure 1: A singular matrix A has Col(A) as a proper subspace of R^n, represented by a plane in the diagram. If b lies outside of Col(A), then the best one can do is to obtain b̂, the vector in Col(A) that is closest to b. This is what the pseudoinverse computes: b̂ = Ax, where x = A^+ b.

    A^+ = V Σ^+ U^T            (using the SVD)
        = (A^T A)^{-1} A^T     (but this requires rank(A) = n)

Note: A^+ A = (A^T A)^{-1} A^T A = I, but A A^+ = A (A^T A)^{-1} A^T ≠ I in general. Thus the pseudoinverse is only a left inverse, not a right inverse.

If A is invertible, then the pseudoinverse is the true inverse:

    A^+ = (A^T A)^{-1} A^T = A^{-1} A^{-T} A^T = A^{-1}.

In Matlab, always use A \ b to solve Ax = b; backslash computes the exact solution or a least squares solution, as appropriate.
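
For an overconstrained system with full column rank, the pseudoinverse, the normal equations, and backslash all give the same least squares solution. A minimal Matlab/Octave sketch (random data, purely illustrative):

    % Overdetermined system: more equations than unknowns.
    A = randn(8, 3);
    b = randn(8, 1);

    x1 = pinv(A) * b;             % x = A^+ b via the SVD
    x2 = (A'*A) \ (A'*b);         % normal equations: (A'A) x = A'b
    x3 = A \ b;                   % the recommended way (QR based)

    disp([x1, x2, x3])            % all three columns should agree

    % The residual A*x - b is orthogonal to Col(A).
    disp(norm(A' * (A*x1 - b)))   % ~ 0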

Matrix Inversion Formulas

Excerpt from: Statistical Signal Processing, by Louis L. Scharf, Addison-Wesley, 1991.

Lemma 1 (Inverse of a Partitioned Matrix). Let R denote the partitioned matrix

    R = [ A  B
          C  D ].

The inverse of R is

    R^{-1} = [  E^{-1}     -F H^{-1}
               -H^{-1} G    H^{-1}   ]

where

    E = A - B D^{-1} C,
    A F = B,
    G A = C,
    H = D - C A^{-1} B.

All indicated inverses are assumed to exist. The matrix E is called the Schur complement of A, and the matrix H is called the Schur complement of D.

Lemma 2 (Matrix Inversion Lemma). Let E denote the Schur complement of A:

    E = A - B D^{-1} C.

Then the inverse of E is

    E^{-1} = A^{-1} + F H^{-1} G,

with A F = B, G A = C, and H = D - C A^{-1} B.

Lemmas 1 and 2 combine to form the following representation for the inverse of a partitioned matrix.

Theorem 1 (Partitioned Matrix Inverse). The inverse of the partitioned matrix

    R = [ A  B
          C  D ]

is the matrix

    R^{-1} = [ A^{-1}  0 ]  +  [ -F ] H^{-1} [ -G  I ],
             [   0    0 ]      [  I ]

with A F = B, G A = C, and H = D - C A^{-1} B.
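
Both the block form of Lemma 1 and the update form of Theorem 1 can be checked against a direct inverse. A minimal Matlab/Octave sketch (block sizes and the diagonal shifts that keep the blocks invertible are arbitrary choices):

    % Random blocks: A is 3x3, D is 2x2; shifts keep A, D, R invertible.
    A = randn(3) + 3*eye(3);   B = randn(3,2);
    C = randn(2,3);            D = randn(2) + 3*eye(2);
    R = [A B; C D];

    % Quantities used in Lemma 1 / Theorem 1.
    F = A \ B;                 % solves A*F = B
    G = C / A;                 % solves G*A = C
    H = D - C*(A\B);           % Schur complement of D
    E = A - B*(D\C);           % Schur complement of A

    % Lemma 1: block form of the inverse.
    Rinv1 = [inv(E), -F/H; -H\G, inv(H)];

    % Theorem 1: low-rank update form of the inverse.
    Rinv2 = [inv(A), zeros(3,2); zeros(2,3), zeros(2)] ...
            + [-F; eye(2)] * (H \ [-G, eye(2)]);

    disp(norm(Rinv1 - inv(R)))   % ~ 0
    disp(norm(Rinv2 - inv(R)))   % ~ 0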

Corollary (Woodbury's Identity). The inverse of the matrix

    R + γ^2 u u^T

is the matrix

    R^{-1} - (γ^2 / (1 + γ^2 u^T R^{-1} u)) R^{-1} u u^T R^{-1}.

Projections

Often we want to project x onto some subspace, i.e. find the y in the subspace that is closest to x. Geometrically, this occurs when x - y is orthogonal to the subspace. Often the subspace of interest is Col(A). Recall that in the SVD of A, the columns of U_r form an orthonormal basis for Col(A). The projection matrix P_A that projects any vector onto Col(A) is

    P_A = U_r U_r^T            (using the SVD)
        = A (A^T A)^{-1} A^T.

E.g. to project onto a line along a vector u,

    P_u = u u^T / (u^T u).

In general, a projection matrix P is one that satisfies:

    1. P^T = P   (symmetric)
    2. P^2 = P   (idempotent)

What are the eigenvalues of P?
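
A minimal Matlab/Octave sketch that checks Woodbury's identity, the two expressions for P_A, and the projector properties; all data are random, and the answer to the eigenvalue question can be read off the last line:

    % Woodbury's identity for a rank-one update (random R, u, gamma).
    R = randn(4) + 4*eye(4);           % shift keeps R invertible
    u = randn(4,1);
    g = 0.7;                           % gamma, arbitrary
    Rt    = R + g^2 * (u*u');
    Rtinv = inv(R) - (g^2 / (1 + g^2 * u'*(R\u))) * (R\u) * (u'/R);
    disp(norm(Rtinv - inv(Rt)))        % ~ 0

    % Projection onto Col(A) via the two equivalent formulas.
    A = randn(5,3);                    % full column rank (generically)
    [U, S, V] = svd(A, 'econ');
    P1 = U * U';                       % U_r U_r^T
    P2 = A * ((A'*A) \ A');            % A (A^T A)^{-1} A^T
    disp(norm(P1 - P2))                % ~ 0

    % Projector properties: symmetric, idempotent, eigenvalues 0 or 1.
    disp(norm(P1 - P1'))               % ~ 0
    disp(norm(P1*P1 - P1))             % ~ 0
    disp(sort(real(eig(P1)))')         % three 1's and two 0's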

Derivatives

What type of object a derivative is depends on what we differentiate and with respect to what:

    differentiate \ w.r.t.    scalar    vector    matrix
    scalar                    scalar    vector    matrix
    vector                    vector    matrix
    matrix                    matrix

scalar w.r.t. scalar: e.g. d(x^2)/dx = 2x.

vector w.r.t. scalar: e.g. y = [cos θ; sin^2 θ], dy/dθ = [-sin θ; 2 sin θ cos θ].

matrix w.r.t. scalar: differentiate each entry, e.g. an entry x^2 of A(x) becomes the entry 2x of dA/dx.

scalar w.r.t. vector: f(x), a scalar function of the vector x = [x_1, ..., x_n]^T. Then

    df/dx = [ ∂f/∂x_1
                ...
              ∂f/∂x_n ].

vector w.r.t. vector: y(x), an m-vector function of the vector x ∈ R^n. Then

    dy/dx = [ ∂y_1/∂x_1  ...  ∂y_m/∂x_1
                 ...     ...     ...
              ∂y_1/∂x_n  ...  ∂y_m/∂x_n ]

with y : m, x : n, dy/dx : n×m.

scalar w.r.t. matrix: f(A), a scalar function of an m×n matrix A. Then

    df/dA = [ ∂f/∂a_11  ...  ∂f/∂a_1n
              ∂f/∂a_21  ...  ∂f/∂a_2n
                 ...    ...     ...
              ∂f/∂a_m1  ...  ∂f/∂a_mn ],

an m×n matrix.

Commonly used derivatives:

    1. d(Ax)/dx = A^T
    2. dx/dx = I
    3. d(y^T x)/dx = d(x^T y)/dx = y
    4. d(x^T A x)/dx = (A + A^T) x if A is square, = 2Ax if A is symmetric
    5. d(u^T(x) v(x))/dx = [du/dx] v + [dv/dx] u   (product rule)
    6. d tr(A)/dA = I
    7. d det(A)/dA = det(A) A^{-T}
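
Identities such as (4) are easy to spot-check with finite differences. A minimal Matlab/Octave sketch (random A and x, central differences with step h; all values arbitrary):

    % Check d(x'*A*x)/dx = (A + A')*x numerically.
    n = 4;
    A = randn(n);                       % square, not necessarily symmetric
    x = randn(n,1);
    h = 1e-6;

    f = @(x) x' * A * x;                % scalar function of the vector x

    g = zeros(n,1);                     % numerical gradient
    for i = 1:n
        e    = zeros(n,1);
        e(i) = 1;                       % i-th coordinate direction
        g(i) = (f(x + h*e) - f(x - h*e)) / (2*h);
    end

    disp([g, (A + A')*x])               % the two columns should agree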

7 d det(a) = det(a)a da Example: to find pseudoinverse Let e = Ax b We want x such that e 2 smallest, ie e 2 2 smallest Hessian: 2 nd derivative Let y = e 2 2 = e e = (Ax b) (Ax b) = x A Ax 2b Ax + b b dy = 2A Ax 2A b = A Ax = A b ( ) x = A A A b } {{ } A Let f(x) be scalar function of x R n Then Hessian: Hessian is symmetric H = d2 f 2 = x 2 x 2 x x x 2 x 2 2 x n x x x n x 2 x n x 2 n Positive semi-definite (psd) A square matrix A is positive semi-definite if x Ax for all x Positive definite x Ax > Note: A is a psd means all eigenvalues If a Hessian matrix is psd, then f has minimum point eg in the pseudoinverse calculation, dy = 2A Ax 2A b So Hessian, H = d ( ) dy = 2A A Now, for any x, x Hx = 2x A Ax = 2 Ax 2 since Ax 2 is the squared norm So H is psd y has minimum point This justifies taking derivatives to find best x 8