Singular value decomposition. Mathématiques appliquées (MATH0504-1) B. Dewals, Ch. Geuzaine

Lecture 11: Singular value decomposition. Mathématiques appliquées (MATH0504-1). B. Dewals, Ch. Geuzaine. V1.2, 07/12/2018

Singular value decomposition (SVD) at a glance
Motivation: the image of the unit sphere S under any m × n matrix transformation is a hyperellipse.
[Figure: the unit sphere S, mapped by A (x ↦ Ax) to the hyperellipse AS.]
Through the SVD, we will infer important properties of matrix A from the shape of AS!

Singular value decomposition (SVD) at a glance
The singular value decomposition (SVD) is a particular matrix factorization.
[Figure: the same sketch of S and its image AS.]
Through the SVD, we will infer important properties of matrix A from the shape of AS!

Why is the singular value decomposition of particular importance?
The reasons for looking at the SVD are twofold:
1. The computation of the SVD is used as an intermediate step in many algorithms of practical interest.
2. From a conceptual point of view, the SVD also enables a deeper understanding of many problems in linear algebra.

Learning objectives & outline
Become familiar with the SVD and its geometric interpretation, and get aware of its significance.
1. Reminder of some fundamentals in linear algebra
2. Geometric interpretation
3. From reduced SVD to full SVD, and formal definition
4. Existence and uniqueness

1 - Reminder: fundamentals in linear algebra
In this section, we briefly review the concepts of adjoint matrix, matrix rank, unitary matrix, as well as matrix norms (Chapters 2 and 3 in Trefethen & Bau, 1997).

Adjoint of a matrix
The adjoint (or Hermitian conjugate) of an m × n matrix A, written A*, is the n × m matrix whose i, j entry is the complex conjugate of the j, i entry of A. If A = A*, A is Hermitian (or self-adjoint).
For a real matrix A, the adjoint is the transpose: A* = A^T; if the matrix is Hermitian, that is A = A^T, then it is symmetric.
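
As a quick illustration (not part of the original slides; the complex matrix below is an arbitrary example), the adjoint and the Hermitian test are one-liners in numpy:

```python
import numpy as np

A = np.array([[1 + 2j, 3j],
              [0, 4 - 1j]])

A_star = A.conj().T  # adjoint A*: complex conjugate of the transpose

H = A + A_star       # A + A* is always Hermitian
print(np.allclose(H, H.conj().T))  # True: H equals its own adjoint
```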

Matrix rank
The rank of a matrix is the number of linearly independent columns (or rows) of the matrix. The numbers of linearly independent columns and rows of a matrix are equal.
An m × n matrix of full rank is one that has the maximal possible rank (the lesser of m and n). If m ≥ n, such a matrix is characterized by the property that it maps no two distinct vectors to the same vector.

Unitary matrix
A square matrix Q ∈ C^{m×m} is unitary (or orthogonal, in the real case) if Q* = Q^{-1}, i.e. Q*Q = I.
The columns q_i of a unitary matrix form an orthonormal basis of C^m: (q_i)* q_j = δ_ij, with δ_ij the Kronecker delta.

A rotation matrix is a typical example of a unitary matrix
A rotation matrix R may be written:
$$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
The image of a vector is the same vector, rotated counterclockwise by an angle θ. Matrix R is orthogonal and R*R = R^T R = I.
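
A minimal numpy check of these two properties (the angle and test vector are arbitrary choices, added for illustration):

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle, in radians
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R is orthogonal: R^T R = I, so R^T is the inverse rotation
print(np.allclose(R.T @ R, np.eye(2)))  # True

# Rotating a vector preserves its length
x = np.array([2.0, 1.0])
print(np.linalg.norm(R @ x), np.linalg.norm(x))  # equal
```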

(Induced) matrix norms are defined from the action of the matrix on vectors
For a matrix A ∈ C^{m×n}, and given vector norms ‖·‖_{(n)} on the domain of A and ‖·‖_{(m)} on the range of A, the induced matrix norm ‖A‖_{(m,n)} is the smallest number C for which the following inequality holds for all x ∈ C^n:
$$\|Ax\|_{(m)} \leq C \, \|x\|_{(n)}$$
It is the maximum factor by which A can stretch a vector x.

(Induced) matrix norms are defined from the action of the matrix on vectors
The matrix norm can be defined equivalently in terms of the images of the unit vectors under A:
$$\|A\|_{(m,n)} = \sup_{x \neq 0} \frac{\|Ax\|_{(m)}}{\|x\|_{(n)}} = \sup_{\|x\|_{(n)} = 1} \|Ax\|_{(m)}$$
This form is convenient for visualizing induced matrix norms, as in this example:
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix}, \qquad \|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 \approx 2.9208$$
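
A small numpy sketch of this example (added here, not in the slides): approximate ‖A‖₂ by sampling unit vectors on the unit circle, and compare with numpy's exact induced 2-norm.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

# Sample many unit vectors x on the unit circle and measure ||Ax||_2
angles = np.linspace(0, 2 * np.pi, 100_000)
X = np.vstack([np.cos(angles), np.sin(angles)])  # each column is a unit vector
approx = np.max(np.linalg.norm(A @ X, axis=0))

exact = np.linalg.norm(A, 2)  # induced 2-norm = largest singular value
print(approx, exact)          # both ~2.9208
```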

2 - Geometric interpretation
In this section, we introduce the SVD conceptually, by means of a simple geometric interpretation (Chapter 4 in Trefethen & Bau, 1997).

Geometric interpretation
Let S be the unit sphere in R^n:
$$S = \{ x \in \mathbb{R}^n : \|x\|_2 = 1 \}, \qquad \|x\|_2 = \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2}$$
Consider any matrix A ∈ R^{m×n}, with m ≥ n. Assume for the moment that A has full rank n.

Geometric interpretation
The image AS is a hyperellipse in R^m. This fact is not obvious, but let us assume for now that it is true; it will be proved later.
[Figure: the unit sphere S and its image, the hyperellipse AS.]

A hyperellipse is the m-dimensional generalization of an ellipse in 2D
In R^m, a hyperellipse is the surface obtained by stretching the unit sphere in R^m by some factors σ_1, …, σ_m (possibly zero) in some orthogonal directions u_1, …, u_m ∈ R^m.
For convenience, let us take the u_i to be unit vectors, i.e. ‖u_i‖_2 = 1. The vectors {σ_i u_i} are the principal semiaxes of the hyperellipse.

A hyperellipse is the m-dimensional generalization of an ellipse in 2D
If A has rank r, exactly r of the lengths σ_i will be nonzero. In particular, if m ≥ n, at most n of them will be nonzero.

Singular values
We stated at the beginning that the SVD enables characterizing properties of matrix A from the shape of AS. Here we go for three definitions.
We define the singular values of matrix A as the lengths of the principal semiaxes of AS, noted σ_1, …, σ_n. It is conventional to number the singular values in descending order: σ_1 ≥ σ_2 ≥ … ≥ σ_n.

Left singular vectors
We also define the left singular vectors of matrix A as the unit vectors {u_1, …, u_n} oriented in the directions of the principal semiaxes of AS, numbered to correspond with the singular values. Thus, the vector σ_i u_i is the i-th largest principal semiaxis.

Right singular vectors
We also define the right singular vectors of matrix A as the unit vectors {v_1, …, v_n} ⊆ S that are the preimages of the principal semiaxes of AS, numbered so that A v_j = σ_j u_j.
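
These relations are easy to verify numerically; a minimal numpy sketch (the 2×2 matrix is an arbitrary full-rank example, not from the slides):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# numpy returns U, the singular values, and V* (called Vh here)
U, s, Vh = np.linalg.svd(A)

for j in range(len(s)):
    v_j = Vh[j, :]  # j-th right singular vector (j-th row of V*)
    u_j = U[:, j]   # j-th left singular vector (j-th column of U)
    print(np.allclose(A @ v_j, s[j] * u_j))  # True: A v_j = sigma_j u_j
```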

Important remarks
The terms left and right singular vectors will be understood later as we move forward with a more formal description of the SVD.
In the geometric interpretation presented so far, we assumed that matrix A is real and m = n = 2. Actually, the SVD applies to both real and complex matrices, whatever the number of dimensions.

3 - From reduced to full SVD, and formal definition
In this section, we distinguish between the so-called reduced SVD, often used in practice, and the full SVD. We also introduce the formal definition of the SVD (Chapter 4 in Trefethen & Bau, 1997).

The equations relating right and left singular vectors can be expressed in matrix form
We just mentioned that the equations relating right singular vectors {v_j} and left singular vectors {u_j} can be written
$$A v_j = \sigma_j u_j, \qquad 1 \leq j \leq n$$
This collection of vector equations can be expressed as a matrix equation:
$$A \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix} \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix}$$

The equations relating right and left singular vectors can be expressed in matrix form
This matrix equation can be written in a more compact form:
$$A V = \hat{U} \hat{\Sigma}$$
with Σ̂ an n × n diagonal matrix with positive real entries (as A was assumed to have full rank n), Û an m × n matrix with orthonormal columns, and V an n × n matrix with orthonormal columns.
Thus, V is unitary (i.e. V* = V^{-1}), and we obtain:
$$A = \hat{U} \hat{\Sigma} V^*$$
The hats on Û and Σ̂ distinguish them from U and Σ in the full SVD.

Reduced SVD
The factorization of matrix A in the form A = Û Σ̂ V* is called a reduced singular value decomposition, or reduced SVD, of matrix A.
[Figure: schematically, for m ≥ n, A (m × n) = Û (m × n) · Σ̂ (n × n) · V* (n × n).]
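
In numpy, the reduced SVD corresponds to full_matrices=False; a short sketch on an arbitrary tall random matrix (an illustrative example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # m = 5, n = 3; full rank with probability 1

U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
print(U_hat.shape, s.shape, Vh.shape)  # (5, 3) (3,) (3, 3)

# Reconstruct A = U_hat @ diag(s) @ V*
print(np.allclose(A, U_hat @ np.diag(s) @ Vh))  # True
```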

From reduced SVD to full SVD
The columns of Û are n orthonormal vectors in the m-dimensional space C^m. Unless m = n, they do not form a basis of C^m, nor is Û a unitary matrix. However, we may upgrade Û to a unitary matrix!

From reduced SVD to full SVD
Let us adjoin an additional m − n orthonormal columns to matrix Û, so that it becomes unitary. The m − n additional orthonormal columns are chosen arbitrarily, and the result is noted U. However, Σ̂ must change too.

From reduced SVD to full SVD
For the product to remain unaltered, the last m − n columns of U ("silent columns") should be multiplied by zero. Accordingly, let Σ be the m × n matrix consisting of Σ̂ in the upper n × n block, together with m − n rows of zeros below.

From reduced SVD to full SVD
We get a new factorization of A, called the full SVD:
$$A = U \Sigma V^*$$
where U is an m × m unitary matrix, V is an n × n unitary matrix, and Σ is an m × n diagonal matrix with real entries.
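
The corresponding full SVD in numpy (full_matrices=True, the default); note that Σ must be assembled as an m × n matrix with Σ̂ on top and m − n zero rows below, exactly as described above. A sketch on the same kind of random example:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(U.shape, Vh.shape)  # (5, 5) (3, 3): both square and unitary

# Build the m x n Sigma: diag(s) in the upper block, m - n zero rows below
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vh))  # True: A = U Sigma V*
```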

Generalization to the case of a matrix A which does not have full rank
If matrix A is rank-deficient (i.e. of rank r < n), only r (instead of n) of the left singular vectors are deduced from the geometry of the hyperellipse. BUT the full SVD still applies: introduce m − r (instead of m − n) additional arbitrary orthonormal columns to construct the unitary matrix U; the matrix V also needs n − r arbitrary orthonormal columns to extend the r columns determined from the hyperellipse geometry. Matrix Σ then has only r non-zero diagonal entries.
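
A rank-deficient sketch (again illustrative, not from the slides): a 4 × 3 matrix of rank 2 has exactly two non-zero singular values, and the SVD goes through unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
# Build a 4 x 3 matrix of rank 2 as a product of 4x2 and 2x3 factors
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

U, s, Vh = np.linalg.svd(A)
print(s)  # two clearly non-zero values, one at round-off level (~1e-16)
print(np.linalg.matrix_rank(A))  # 2
```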

Formal definition of the SVD
Let m and n be arbitrary (we do not require m ≥ n). Given A ∈ C^{m×n}, not necessarily of full rank, a singular value decomposition of A is a factorization
$$A = U \Sigma V^*$$
where U ∈ C^{m×m} is square, unitary; Σ ∈ R^{m×n} is real, diagonal, with entries σ_1 ≥ σ_2 ≥ … ≥ σ_p ≥ 0 (p = min(m, n)) nonnegative and in nonincreasing order; and V* ∈ C^{n×n} is square, unitary.

Consequently, the image of the unit sphere in R^n under a map A = U Σ V* is a hyperellipse in R^m
Thus:
1. The unitary map V* preserves the sphere.
2. The diagonal matrix Σ stretches the sphere into a hyperellipse.
3. The final unitary map U rotates, or reflects, the hyperellipse without changing its shape.
Hence, if we can prove that every matrix A ∈ C^{m×n} has an SVD, we will have proved that the image of the unit sphere under any linear map is indeed a hyperellipse.
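
This three-step reading of A = U Σ V* can be checked directly on points of the unit circle; a minimal numpy sketch (the matrix is an arbitrary 2 × 2 example):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(A)

# Points on the unit circle (each column is one point)
t = np.linspace(0, 2 * np.pi, 7)
X = np.vstack([np.cos(t), np.sin(t)])

Y1 = Vh @ X  # step 1: unitary map V*, points stay on the unit circle
print(np.allclose(np.linalg.norm(Y1, axis=0), 1.0))  # True

Y2 = np.diag(s) @ Y1  # step 2: stretch the axes by sigma_1, sigma_2
Y3 = U @ Y2           # step 3: rotate/reflect, shape unchanged
print(np.allclose(Y3, A @ X))  # True: same as applying A directly
```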

4 - Existence and uniqueness
In this section, we demonstrate the existence of the SVD, the uniqueness of the singular values, as well as, under some specific conditions, the uniqueness of the singular vectors (Chapter 4 in Trefethen & Bau, 1997).

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
To prove the existence of the SVD, we first isolate the direction of the largest action of A, then we proceed by induction on the dimension of A. The proof takes 5 steps.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Set σ_1 = ‖A‖_2. From the definition of the matrix norm, ‖A‖_2 = max_{‖x‖_2 = 1} ‖Ax‖_2, there must be a vector v_1 ∈ C^n with ‖v_1‖_2 = 1 and ‖A v_1‖_2 = σ_1. We note:
$$u_1 = \frac{A v_1}{\sigma_1}$$
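
Numerically, the identity σ_1 = ‖A‖_2 that starts the proof is easy to confirm (a one-off check on a random matrix, added for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

sigma_1 = np.linalg.svd(A, compute_uv=False)[0]   # largest singular value
print(np.isclose(sigma_1, np.linalg.norm(A, 2)))  # True: sigma_1 = ||A||_2
```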

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Consider any extensions of v_1 to an orthonormal basis {v_j} of C^n, and of u_1 to an orthonormal basis {u_j} of C^m. Let U_1 and V_1 denote the unitary matrices with columns u_j and v_j, respectively.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Then we have
$$U_1^* A V_1 = S = \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix}$$
where 0 is a column vector of dimension m − 1, w* is a row vector of dimension n − 1, and B has dimensions (m − 1) × (n − 1).

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Furthermore,
$$\left\| \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix} \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} \right\|_2 \geq \sigma_1^2 + w^* w = (\sigma_1^2 + w^* w)^{1/2} \left\| \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} \right\|_2$$
implying (from the definition of matrix norms) that ‖S‖_2 ≥ (σ_1² + w*w)^{1/2}. BUT, since U_1 and V_1 are unitary, we know that ‖S‖_2 = ‖U_1* A V_1‖_2 = ‖A‖_2 = σ_1. This implies w = 0.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
If n = 1 or m = 1, we are done! Otherwise, the submatrix B describes the action of A on the subspace orthogonal to v_1. By the induction hypothesis, B has an SVD: B = U_2 Σ_2 V_2*. Now it is easily verified that
$$A = U_1 \begin{pmatrix} 1 & 0 \\ 0 & U_2 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & V_2 \end{pmatrix}^* V_1^*$$
is an SVD of A, completing the proof of existence.

Uniqueness
The singular values {σ_j} are uniquely determined. If A is square and the σ_j are distinct, the left and right singular vectors {u_j} and {v_j} are uniquely determined up to complex signs.

Uniqueness
Geometrically, the proof is straightforward: if the semiaxis lengths of a hyperellipse are distinct, then the semiaxes themselves are determined by the geometry, up to signs.
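
The sign ambiguity can also be seen numerically: flipping the signs of a matching pair u_j, v_j leaves the product U Σ V* (and hence A) unchanged. A sketch, reusing an arbitrary 2 × 2 example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(A)

# Flip the sign of the first left AND right singular vectors together
U2 = U.copy()
U2[:, 0] *= -1
Vh2 = Vh.copy()
Vh2[0, :] *= -1

print(np.allclose(U2 @ np.diag(s) @ Vh2, A))  # True: same factorization of A
```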

Take-home messages
The SVD is an important factorization method, which applies to all rectangular, real or complex matrices.
It decomposes the matrix into three factors:
- a unitary matrix
- a real diagonal matrix, with nonnegative entries
- another unitary matrix
It has a broad range of implications and applications!

What's next?
Every matrix is diagonal, if only one uses the proper bases for the domain and range spaces.
- SVD vs. eigenvalue decomposition: existence; rectangular vs. square matrices; orthonormal bases in the SVD, not eigenvectors
- Link with matrix rank, range, null space, norm
- Low-rank approximations