Singular value decomposition. Mathématiques appliquées (MATH0504-1) B. Dewals, Ch. Geuzaine

Lecture 11: Singular value decomposition. Mathématiques appliquées (MATH0504-1). B. Dewals, Ch. Geuzaine. V1.2, 07/12/2018

Singular value decomposition (SVD) at a glance
Motivation: the image of the unit sphere S under any m × n matrix transformation is a hyperellipse.
[Figure: the unit sphere S, mapped by A (x ↦ Ax) to the hyperellipse AS.]
Through the SVD, we will infer important properties of matrix A from the shape of AS!

Singular value decomposition (SVD) at a glance
The singular value decomposition (SVD) is a particular matrix factorization.
[Figure: the same sketch of S and its image AS.]
Through the SVD, we will infer important properties of matrix A from the shape of AS!

Why is the singular value decomposition of particular importance?
The reasons for looking at the SVD are twofold:
1. The computation of the SVD is used as an intermediate step in many algorithms of practical interest.
2. From a conceptual point of view, the SVD also enables a deeper understanding of many problems in linear algebra.

Learning objectives & outline
Become familiar with the SVD and its geometric interpretation, and get aware of its significance.
1. Reminder of some fundamentals in linear algebra
2. Geometric interpretation
3. From reduced SVD to full SVD, and formal definition
4. Existence and uniqueness

1 - Reminder: fundamentals in linear algebra
In this section, we briefly review the concepts of adjoint matrix, matrix rank, unitary matrix, as well as matrix norms (Chapters 2 and 3 in Trefethen & Bau, 1997).

Adjoint of a matrix
The adjoint (or Hermitian conjugate) of an m × n matrix A, written A*, is the n × m matrix whose i, j entry is the complex conjugate of the j, i entry of A. If A = A*, A is Hermitian (or self-adjoint).
For a real matrix A, the adjoint is the transpose: A* = A^T; if the matrix is Hermitian, that is A = A^T, then it is symmetric.
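
As a quick illustration (not part of the original slides; the complex matrix below is an arbitrary example), the adjoint and the Hermitian test are one-liners in numpy:

```python
import numpy as np

A = np.array([[1 + 2j, 3j],
              [0, 4 - 1j]])

A_star = A.conj().T  # adjoint A*: complex conjugate of the transpose

H = A + A_star       # A + A* is always Hermitian
print(np.allclose(H, H.conj().T))  # True: H equals its own adjoint
```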

Matrix rank
The rank of a matrix is the number of linearly independent columns (or rows) of the matrix. The numbers of linearly independent columns and rows of a matrix are equal.
An m × n matrix of full rank is one that has the maximal possible rank (the lesser of m and n). If m ≥ n, such a matrix is characterized by the property that it maps no two distinct vectors to the same vector.

Unitary matrix
A square matrix Q ∈ C^{m×m} is unitary (or orthogonal, in the real case) if Q* = Q^{-1}, i.e. Q*Q = I.
The columns q_i of a unitary matrix form an orthonormal basis of C^m: (q_i)* q_j = δ_ij, with δ_ij the Kronecker delta.

A rotation matrix is a typical example of a unitary matrix
A rotation matrix R may be written:
$$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
The image of a vector is the same vector, rotated counterclockwise by an angle θ. Matrix R is orthogonal and R*R = R^T R = I.
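
A minimal numpy check of these two properties (the angle and test vector are arbitrary choices, added for illustration):

```python
import numpy as np

theta = 0.7  # arbitrary rotation angle, in radians
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R is orthogonal: R^T R = I, so R^T is the inverse rotation
print(np.allclose(R.T @ R, np.eye(2)))  # True

# Rotating a vector preserves its length
x = np.array([2.0, 1.0])
print(np.linalg.norm(R @ x), np.linalg.norm(x))  # equal
```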

(Induced) matrix norms are defined from the action of the matrix on vectors
For a matrix A ∈ C^{m×n}, and given vector norms ‖·‖_{(n)} on the domain of A and ‖·‖_{(m)} on the range of A, the induced matrix norm ‖A‖_{(m,n)} is the smallest number C for which the following inequality holds for all x ∈ C^n:
$$\|Ax\|_{(m)} \leq C \, \|x\|_{(n)}$$
It is the maximum factor by which A can stretch a vector x.

(Induced) matrix norms are defined from the action of the matrix on vectors
The matrix norm can be defined equivalently in terms of the images of the unit vectors under A:
$$\|A\|_{(m,n)} = \sup_{x \neq 0} \frac{\|Ax\|_{(m)}}{\|x\|_{(n)}} = \sup_{\|x\|_{(n)} = 1} \|Ax\|_{(m)}$$
This form is convenient for visualizing induced matrix norms, as in this example:
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix}, \qquad \|A\|_2 = \max_{\|x\|_2 = 1} \|Ax\|_2 \approx 2.9208$$
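
A small numpy sketch of this example (added here, not in the slides): approximate ‖A‖₂ by sampling unit vectors on the unit circle, and compare with numpy's exact induced 2-norm.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 2.0]])

# Sample many unit vectors x on the unit circle and measure ||Ax||_2
angles = np.linspace(0, 2 * np.pi, 100_000)
X = np.vstack([np.cos(angles), np.sin(angles)])  # each column is a unit vector
approx = np.max(np.linalg.norm(A @ X, axis=0))

exact = np.linalg.norm(A, 2)  # induced 2-norm = largest singular value
print(approx, exact)          # both ~2.9208
```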

2 - Geometric interpretation
In this section, we introduce the SVD conceptually, by means of a simple geometric interpretation (Chapter 4 in Trefethen & Bau, 1997).

Geometric interpretation
Let S be the unit sphere in R^n:
$$S = \{ x \in \mathbb{R}^n : \|x\|_2 = 1 \}, \qquad \|x\|_2 = \Big( \sum_{i=1}^n x_i^2 \Big)^{1/2}$$
Consider any matrix A ∈ R^{m×n}, with m ≥ n. Assume for the moment that A has full rank n.

Geometric interpretation
The image AS is a hyperellipse in R^m. This fact is not obvious, but let us assume for now that it is true; it will be proved later.
[Figure: the unit sphere S and its image, the hyperellipse AS.]

A hyperellipse is the m-dimensional generalization of an ellipse in 2D
In R^m, a hyperellipse is the surface obtained by stretching the unit sphere in R^m by some factors σ_1, …, σ_m (possibly zero) in some orthogonal directions u_1, …, u_m ∈ R^m.
For convenience, let us take the u_i to be unit vectors, i.e. ‖u_i‖_2 = 1. The vectors {σ_i u_i} are the principal semiaxes of the hyperellipse.

A hyperellipse is the m-dimensional generalization of an ellipse in 2D
If A has rank r, exactly r of the lengths σ_i will be nonzero. In particular, if m ≥ n, at most n of them will be nonzero.

Singular values
We stated at the beginning that the SVD enables characterizing properties of matrix A from the shape of AS. Here we go for three definitions.
We define the singular values of matrix A as the lengths of the principal semiaxes of AS, noted σ_1, …, σ_n. It is conventional to number the singular values in descending order: σ_1 ≥ σ_2 ≥ … ≥ σ_n.

Left singular vectors
We also define the left singular vectors of matrix A as the unit vectors {u_1, …, u_n} oriented in the directions of the principal semiaxes of AS, numbered to correspond with the singular values. Thus, the vector σ_i u_i is the i-th largest principal semiaxis.

Right singular vectors
We also define the right singular vectors of matrix A as the unit vectors {v_1, …, v_n} ⊆ S that are the preimages of the principal semiaxes of AS, numbered so that A v_j = σ_j u_j.
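
These relations are easy to verify numerically; a minimal numpy sketch (the 2×2 matrix is an arbitrary full-rank example, not from the slides):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# numpy returns U, the singular values, and V* (called Vh here)
U, s, Vh = np.linalg.svd(A)

for j in range(len(s)):
    v_j = Vh[j, :]  # j-th right singular vector (j-th row of V*)
    u_j = U[:, j]   # j-th left singular vector (j-th column of U)
    print(np.allclose(A @ v_j, s[j] * u_j))  # True: A v_j = sigma_j u_j
```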

Important remarks
The terms left and right singular vectors will be understood later as we move forward with a more formal description of the SVD.
In the geometric interpretation presented so far, we assumed that matrix A is real and m = n = 2. Actually, the SVD applies to both real and complex matrices, whatever the number of dimensions.

3 - From reduced to full SVD, and formal definition
In this section, we distinguish between the so-called reduced SVD, often used in practice, and the full SVD. We also introduce the formal definition of the SVD (Chapter 4 in Trefethen & Bau, 1997).

The equations relating right and left singular vectors can be expressed in matrix form
We just mentioned that the equations relating right singular vectors {v_j} and left singular vectors {u_j} can be written
$$A v_j = \sigma_j u_j, \qquad 1 \leq j \leq n$$
This collection of vector equations can be expressed as a matrix equation:
$$A \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} = \begin{pmatrix} u_1 & u_2 & \cdots & u_n \end{pmatrix} \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix}$$

The equations relating right and left singular vectors can be expressed in matrix form
This matrix equation can be written in a more compact form:
$$A V = \hat{U} \hat{\Sigma}$$
with Σ̂ an n × n diagonal matrix with positive real entries (as A was assumed to have full rank n), Û an m × n matrix with orthonormal columns, and V an n × n matrix with orthonormal columns.
Thus, V is unitary (i.e. V* = V^{-1}), and we obtain:
$$A = \hat{U} \hat{\Sigma} V^*$$
The hats on Û and Σ̂ distinguish them from U and Σ in the full SVD.

Reduced SVD
The factorization of matrix A in the form A = Û Σ̂ V* is called a reduced singular value decomposition, or reduced SVD, of matrix A.
[Figure: schematically, for m ≥ n, A (m × n) = Û (m × n) · Σ̂ (n × n) · V* (n × n).]
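
In numpy, the reduced SVD corresponds to full_matrices=False; a short sketch on an arbitrary tall random matrix (an illustrative example, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # m = 5, n = 3; full rank with probability 1

U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)
print(U_hat.shape, s.shape, Vh.shape)  # (5, 3) (3,) (3, 3)

# Reconstruct A = U_hat @ diag(s) @ V*
print(np.allclose(A, U_hat @ np.diag(s) @ Vh))  # True
```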

From reduced SVD to full SVD
The columns of Û are n orthonormal vectors in the m-dimensional space C^m. Unless m = n, they do not form a basis of C^m, nor is Û a unitary matrix. However, we may upgrade Û to a unitary matrix!

From reduced SVD to full SVD
Let us adjoin an additional m − n orthonormal columns to matrix Û, so that it becomes unitary. The m − n additional orthonormal columns are chosen arbitrarily, and the result is noted U. However, Σ̂ must change too.

From reduced SVD to full SVD
For the product to remain unaltered, the last m − n columns of U ("silent columns") should be multiplied by zero. Accordingly, let Σ be the m × n matrix consisting of Σ̂ in the upper n × n block, together with m − n rows of zeros below.

From reduced SVD to full SVD
We get a new factorization of A, called the full SVD:
$$A = U \Sigma V^*$$
where U is an m × m unitary matrix, V is an n × n unitary matrix, and Σ is an m × n diagonal matrix with real entries.
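
The corresponding full SVD in numpy (full_matrices=True, the default); note that Σ must be assembled as an m × n matrix with Σ̂ on top and m − n zero rows below, exactly as described above. A sketch on the same kind of random example:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

U, s, Vh = np.linalg.svd(A, full_matrices=True)
print(U.shape, Vh.shape)  # (5, 5) (3, 3): both square and unitary

# Build the m x n Sigma: diag(s) in the upper block, m - n zero rows below
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vh))  # True: A = U Sigma V*
```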

Generalization to the case of a matrix A which does not have full rank
If matrix A is rank-deficient (i.e. of rank r < n), only r (instead of n) of the left singular vectors are deduced from the geometry of the hyperellipse. BUT the full SVD still applies: introduce m − r (instead of m − n) additional arbitrary orthonormal columns to construct the unitary matrix U; the matrix V also needs n − r arbitrary orthonormal columns to extend the r columns determined from the hyperellipse geometry. Matrix Σ then has only r non-zero diagonal entries.
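
A rank-deficient sketch (again illustrative, not from the slides): a 4 × 3 matrix of rank 2 has exactly two non-zero singular values, and the SVD goes through unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
# Build a 4 x 3 matrix of rank 2 as a product of 4x2 and 2x3 factors
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

U, s, Vh = np.linalg.svd(A)
print(s)  # two clearly non-zero values, one at round-off level (~1e-16)
print(np.linalg.matrix_rank(A))  # 2
```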

Formal definition of the SVD
Let m and n be arbitrary (we do not require m ≥ n). Given A ∈ C^{m×n}, not necessarily of full rank, a singular value decomposition of A is a factorization
$$A = U \Sigma V^*$$
where U ∈ C^{m×m} is square, unitary; Σ ∈ R^{m×n} is real, diagonal, with entries σ_1 ≥ σ_2 ≥ … ≥ σ_p ≥ 0 (p = min(m, n)) nonnegative and in nonincreasing order; and V* ∈ C^{n×n} is square, unitary.

Consequently, the image of the unit sphere in R^n under a map A = U Σ V* is a hyperellipse in R^m
Thus:
1. The unitary map V* preserves the sphere.
2. The diagonal matrix Σ stretches the sphere into a hyperellipse.
3. The final unitary map U rotates, or reflects, the hyperellipse without changing its shape.
Hence, if we can prove that every matrix A ∈ C^{m×n} has an SVD, we will have proved that the image of the unit sphere under any linear map is indeed a hyperellipse.
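
This three-step reading of A = U Σ V* can be checked directly on points of the unit circle; a minimal numpy sketch (the matrix is an arbitrary 2 × 2 example):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(A)

# Points on the unit circle (each column is one point)
t = np.linspace(0, 2 * np.pi, 7)
X = np.vstack([np.cos(t), np.sin(t)])

Y1 = Vh @ X  # step 1: unitary map V*, points stay on the unit circle
print(np.allclose(np.linalg.norm(Y1, axis=0), 1.0))  # True

Y2 = np.diag(s) @ Y1  # step 2: stretch the axes by sigma_1, sigma_2
Y3 = U @ Y2           # step 3: rotate/reflect, shape unchanged
print(np.allclose(Y3, A @ X))  # True: same as applying A directly
```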

4 - Existence and uniqueness
In this section, we demonstrate the existence of the SVD, the uniqueness of the singular values, as well as, under some specific conditions, the uniqueness of the singular vectors (Chapter 4 in Trefethen & Bau, 1997).

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
To prove the existence of the SVD, we first isolate the direction of the largest action of A, then we proceed by induction on the dimension of A. The proof takes 5 steps.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Set σ_1 = ‖A‖_2. From the definition of the matrix norm, ‖A‖_2 = max_{‖x‖_2 = 1} ‖Ax‖_2, there must be a vector v_1 ∈ C^n with ‖v_1‖_2 = 1 and ‖A v_1‖_2 = σ_1. We note:
$$u_1 = \frac{A v_1}{\sigma_1}$$
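
Numerically, the identity σ_1 = ‖A‖_2 that starts the proof is easy to confirm (a one-off check on a random matrix, added for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

sigma_1 = np.linalg.svd(A, compute_uv=False)[0]   # largest singular value
print(np.isclose(sigma_1, np.linalg.norm(A, 2)))  # True: sigma_1 = ||A||_2
```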

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Consider any extensions of v_1 to an orthonormal basis {v_j} of C^n, and of u_1 to an orthonormal basis {u_j} of C^m. Let U_1 and V_1 denote the unitary matrices with columns u_j and v_j, respectively.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Then we have
$$U_1^* A V_1 = S = \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix}$$
where 0 is a column vector of dimension m − 1, w* is a row vector of dimension n − 1, and B has dimensions (m − 1) × (n − 1).

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
Furthermore,
$$\left\| \begin{pmatrix} \sigma_1 & w^* \\ 0 & B \end{pmatrix} \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} \right\|_2 \geq \sigma_1^2 + w^* w = (\sigma_1^2 + w^* w)^{1/2} \left\| \begin{pmatrix} \sigma_1 \\ w \end{pmatrix} \right\|_2$$
implying (from the definition of matrix norms) that ‖S‖_2 ≥ (σ_1² + w*w)^{1/2}. BUT, since U_1 and V_1 are unitary, we know that ‖S‖_2 = ‖U_1* A V_1‖_2 = ‖A‖_2 = σ_1. This implies w = 0.

Every matrix A ∈ C^{m×n} has a singular value decomposition A = U Σ V*
If n = 1 or m = 1, we are done! Otherwise, the submatrix B describes the action of A on the subspace orthogonal to v_1. By the induction hypothesis, B has an SVD: B = U_2 Σ_2 V_2*. Now it is easily verified that
$$A = U_1 \begin{pmatrix} 1 & 0 \\ 0 & U_2 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & V_2 \end{pmatrix}^* V_1^*$$
is an SVD of A, completing the proof of existence.

Uniqueness
The singular values {σ_j} are uniquely determined. If A is square and the σ_j are distinct, the left and right singular vectors {u_j} and {v_j} are uniquely determined up to complex signs.

Uniqueness
Geometrically, the proof is straightforward: if the semiaxis lengths of a hyperellipse are distinct, then the semiaxes themselves are determined by the geometry, up to signs.
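
The sign ambiguity can also be seen numerically: flipping the signs of a matching pair u_j, v_j leaves the product U Σ V* (and hence A) unchanged. A sketch, reusing an arbitrary 2 × 2 example:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
U, s, Vh = np.linalg.svd(A)

# Flip the sign of the first left AND right singular vectors together
U2 = U.copy()
U2[:, 0] *= -1
Vh2 = Vh.copy()
Vh2[0, :] *= -1

print(np.allclose(U2 @ np.diag(s) @ Vh2, A))  # True: same factorization of A
```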

Take-home messages
The SVD is an important factorization method, which applies to all rectangular, real or complex matrices.
It decomposes the matrix into three factors:
- a unitary matrix
- a real diagonal matrix, with nonnegative entries
- another unitary matrix
It has a broad range of implications and applications!

What's next?
Every matrix is diagonal, if only one uses the proper bases for the domain and range spaces.
- SVD vs. eigenvalue decomposition: existence; rectangular vs. square matrices; orthonormal bases in the SVD, not eigenvectors
- Link with matrix rank, range, null space, norm
- Low-rank approximations