Summary of Week 9

Finding the square root of a positive operator

Last time we saw that positive operators have a unique positive square root. We now briefly look at how one would go about calculating this square root. Notice that the square root of a diagonal matrix is easy to write down: if $a_1, \dots, a_n$ are non-negative real numbers and
$$B = \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}, \qquad \text{then} \qquad \sqrt{B} = \begin{pmatrix} \sqrt{a_1} & & \\ & \ddots & \\ & & \sqrt{a_n} \end{pmatrix},$$
where we take only positive square roots.

How do we compute the square root when the matrix of the positive operator in question is not in diagonal form? Since $B$ is a positive operator, we first diagonalize it. This is possible because $B$ is self-adjoint, and the spectral theorem tells us that $B$ is diagonalizable. We do this by finding a basis of eigenvectors of $B$ (the existence of such a basis is equivalent to diagonalizability) and forming the matrix $P$ whose columns are these eigenvectors. Then $P^{-1}BP = D$, a diagonal matrix whose diagonal entries are exactly the eigenvalues of $B$. This can be rewritten as $B = PDP^{-1}$. We know the square root of $D$, since it is a diagonal matrix, and we have
$$(P\sqrt{D}P^{-1})(P\sqrt{D}P^{-1}) = P\sqrt{D}\sqrt{D}P^{-1} = PDP^{-1} = B.$$
Thus $P\sqrt{D}P^{-1}$ is the unique positive square root of $B$.

Note: We will soon be interested in square roots of $A^*A$ for any matrix/operator $A$. Although $A$ need not be self-adjoint, $A^*A$ is always self-adjoint!
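The recipe above (diagonalize, take positive square roots of the eigenvalues, change basis back) is easy to check numerically. The following is a minimal sketch using NumPy, with a small positive matrix chosen for illustration; `numpy.linalg.eigh` is used because it is designed for self-adjoint matrices and returns an orthonormal eigenbasis, so $P^{-1} = P^{T}$.

```python
import numpy as np

# A small positive (symmetric, positive semidefinite) matrix for illustration.
B = np.array([[5.0, 3.0],
              [3.0, 5.0]])

# Diagonalize: eigh is for self-adjoint matrices; it returns real eigenvalues
# and a matrix P with orthonormal eigenvector columns, so P^{-1} = P^T.
eigvals, P = np.linalg.eigh(B)

# Take only the positive square roots of the (non-negative) eigenvalues.
sqrt_D = np.diag(np.sqrt(eigvals))

# sqrt(B) = P sqrt(D) P^{-1}.
sqrt_B = P @ sqrt_D @ P.T

print(np.allclose(sqrt_B @ sqrt_B, B))  # True: sqrt(B)^2 recovers B
print(np.allclose(sqrt_B, sqrt_B.T))    # True: the square root is self-adjoint
```

For a positive matrix the eigenvalues returned are non-negative (up to round-off), so taking `np.sqrt` is safe; this is exactly the uniqueness statement from last time in numerical form.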
Polar Decomposition

We have the following analogy between $\mathbb{C}$ and $\mathcal{L}(V)$:

    $\mathbb{C}$                                  $\mathcal{L}(V)$
    $z$                                           $T$
    $\bar{z}$                                     $T^*$
    real numbers: $z = \bar{z}$                   self-adjoint: $T = T^*$
    nonnegative reals                             positive operators
    unit circle: $z\bar{z} = 1$                   isometries: $T^*T = I$
    polar form: $z = re^{i\theta}$                polar decomposition

Polar Decomposition. If $T \in \mathcal{L}(V)$, then there exists an isometry $S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$.

Proof. If $T \in \mathcal{L}(V)$ and $v \in V$, then
$$\|Tv\|^2 = \langle Tv, Tv \rangle = \langle T^*Tv, v \rangle = \langle \sqrt{T^*T}\sqrt{T^*T}\,v, v \rangle = \langle \sqrt{T^*T}\,v, \sqrt{T^*T}\,v \rangle = \|\sqrt{T^*T}\,v\|^2,$$
and so $\|Tv\| = \|\sqrt{T^*T}\,v\|$. Define $S_1 : \operatorname{Im}\sqrt{T^*T} \to \operatorname{Im} T$ by $S_1(\sqrt{T^*T}\,v) = Tv$. The idea is to extend $S_1$ to an isometry $S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$.

We first check that $S_1$ is well-defined. If $v_1, v_2 \in V$ satisfy $\sqrt{T^*T}\,v_1 = \sqrt{T^*T}\,v_2$, then
$$\|Tv_1 - Tv_2\| = \|T(v_1 - v_2)\| = \|\sqrt{T^*T}(v_1 - v_2)\| = \|\sqrt{T^*T}\,v_1 - \sqrt{T^*T}\,v_2\| = 0,$$
which shows that $Tv_1 = Tv_2$. By definition, $S_1$ maps $\operatorname{Im}\sqrt{T^*T}$ onto $\operatorname{Im} T$. Moreover, $\|S_1 u\| = \|u\|$ for all $u \in \operatorname{Im}\sqrt{T^*T}$. In particular, $S_1$ is injective, and the rank-nullity theorem implies that $\dim \operatorname{Im}\sqrt{T^*T} = \dim \operatorname{Im} T$. Using orthogonal decomposition, since $\operatorname{Im}\sqrt{T^*T}$ and $\operatorname{Im} T$ are finite-dimensional subspaces of $V$, we have
$$V = \operatorname{Im}\sqrt{T^*T} \oplus (\operatorname{Im}\sqrt{T^*T})^\perp$$
and
$$V = \operatorname{Im} T \oplus (\operatorname{Im} T)^\perp.$$
When we have a direct sum, the dimensions add up; i.e., if $V = U_1 \oplus U_2$, then $\dim V = \dim U_1 + \dim U_2$. We saw earlier that $\dim \operatorname{Im}\sqrt{T^*T} = \dim \operatorname{Im} T$; this immediately gives $\dim(\operatorname{Im}\sqrt{T^*T})^\perp = \dim(\operatorname{Im} T)^\perp$. Hence orthonormal bases $(e_1, \dots, e_m)$ and $(e_1', \dots, e_m')$ can be chosen for $(\operatorname{Im}\sqrt{T^*T})^\perp$ and $(\operatorname{Im} T)^\perp$ respectively. Define $S_2 : (\operatorname{Im}\sqrt{T^*T})^\perp \to (\operatorname{Im} T)^\perp$ by
$$S_2(a_1 e_1 + \dots + a_m e_m) = a_1 e_1' + \dots + a_m e_m'.$$
Obviously $\|S_2 w\| = \|w\|$ for all $w \in (\operatorname{Im}\sqrt{T^*T})^\perp$. Now let
$$S = S_1 \oplus S_2 : \operatorname{Im}\sqrt{T^*T} \oplus (\operatorname{Im}\sqrt{T^*T})^\perp \to \operatorname{Im} T \oplus (\operatorname{Im} T)^\perp.$$
Then for any $v \in V$, $S(\sqrt{T^*T}\,v) = S_1(\sqrt{T^*T}\,v) = Tv$, showing that $T = S\sqrt{T^*T}$. Now it remains to check that $S$ is an isometry. To see this, notice that the Pythagorean theorem implies that if $v = u + w \in V$ where $u \in \operatorname{Im}\sqrt{T^*T}$ and $w \in (\operatorname{Im}\sqrt{T^*T})^\perp$, then
$$\|Sv\|^2 = \|S_1 u + S_2 w\|^2 = \|S_1 u\|^2 + \|S_2 w\|^2 = \|u\|^2 + \|w\|^2 = \|v\|^2.$$
Hence $S$ is an isometry.

Singular Value Decomposition

Definition. Suppose $T \in \mathcal{L}(V, W)$. The singular values of $T$ are defined as the eigenvalues of $\sqrt{T^*T}$.

Singular Value Decomposition (SVD), matrix version. If $A \in M_{m \times n}(\mathbb{C})$, then the following decomposition exists:
$$A = U\Sigma V^*,$$
where $\Sigma$ is a diagonal matrix with non-negative entries and $U$ and $V$ are unitary matrices, i.e., $UU^* = U^*U = I = VV^* = V^*V$. The decomposition is obtained by finding the matrices $U, \Sigma, V$ as follows.

Remarks: $\Sigma$ is an $m \times n$ diagonal matrix whose non-negative diagonal entries are the singular values of $A$: $\sqrt{\lambda_1} \ge \sqrt{\lambda_2} \ge \dots \ge \sqrt{\lambda_n}$, where the $\lambda_i$ are the eigenvalues of $A^*A$. $V$ is an $n \times n$ matrix whose columns form an orthonormal list of eigenvectors $w_1, \dots, w_n$ of $A^*A$; that is, $A^*A w_i = \lambda_i w_i$. If $A$ has rank $r$, then the first $r$ columns $u_1, \dots, u_r$ of $U$ are given by $u_j = \frac{1}{\sqrt{\lambda_j}} A w_j$ for $j = 1, \dots, r$, and $u_{r+1}, \dots, u_m$ are obtained by extending $\{u_1, \dots, u_r\}$ to an orthonormal basis of $\mathbb{C}^m$.

(1) Reason for the choice of $V$, $\Sigma$: Observe that
$$A^*A = (V\Sigma^* U^*)(U\Sigma V^*) = V\Sigma^*\Sigma V^*,$$
since $\Sigma$ is a diagonal matrix with real entries (hence self-adjoint) and $U^*U = I$. Since $VV^* = I$, this gives $V^{-1} = V^*$, so we have exhibited the diagonal matrix $\Sigma^*\Sigma$ as a diagonalization of $A^*A$. If you haven't already seen this before, we have the following Fact: if $A, B, P$ are matrices such that $A = P^{-1}BP$, then the eigenvalues of $A$ coincide with the eigenvalues
of $B$. From the above equation, since $V^* = V^{-1}$, the matrices $\Sigma^*\Sigma$ and $A^*A$ have the same eigenvalues. Since $\Sigma^*\Sigma$ is a diagonal matrix with diagonal entries $\lambda_i$, $i = 1, \dots, n$, this immediately gives that the $\sqrt{\lambda_i}$ are the singular values of $A$, using the definition of singular values given above. This also explains why $V$ is a matrix whose columns form an orthonormal basis of eigenvectors of $A^*A$: since $A^*A$ is self-adjoint, the spectral theorem ensures the existence of such a basis and,
equivalently, the diagonalizability of $A^*A$. The diagonalization is, after all, done using a matrix whose columns are eigenvectors of $A^*A$. The eigenvectors are calculated as usual. Since $A^*A$ is self-adjoint and therefore normal, eigenvectors of $A^*A$ corresponding to distinct eigenvalues are guaranteed to be orthogonal. We then normalize them, dividing each eigenvector by its norm, to get an orthonormal basis consisting of eigenvectors.

(2) Reason for the choice of $U$: We have the following fact. For any $A \in M_{m \times n}(\mathbb{C})$,
$$\operatorname{rank}(A) = \operatorname{rank}(A^*) = \operatorname{rank}(AA^*) = \operatorname{rank}(A^*A).$$
Thus, if $A$ has rank $r$, then $A^*A$ also has rank $r$, which means $A^*A$ has exactly $r$ non-zero eigenvalues (counted with multiplicity). Notice that if $A = U\Sigma V^*$, then we also have $AV = U\Sigma$, since $V^*V = I$. If $e_1, \dots, e_n$ denote the column vectors of $V$ and $u_1, \dots, u_m$ the column vectors of $U$, then
$$A[e_1, \dots, e_n] = [u_1, \dots, u_m]\begin{pmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \\ & & \end{pmatrix},$$
i.e., $[Ae_1, \dots, Ae_n] = [\sqrt{\lambda_1}\,u_1, \dots, \sqrt{\lambda_n}\,u_n]$. Now, if $i \le r$, then $\lambda_i \ne 0$ and we have $u_i = \frac{1}{\sqrt{\lambda_i}} Ae_i$. For $r < j$, $\lambda_j = 0$, so $Ae_j = 0$ and the corresponding $u_j$ could be any vector; but we want the matrix $U$ to be unitary, so we choose the $u_j$ corresponding to $\lambda_j = 0$ in such a way that they extend $u_1, \dots, u_r$ to an orthonormal basis of $\mathbb{C}^m$.

Solved examples:

A. Type: Square matrix $A$ with $A^*A$ having positive eigenvalues.

Example 1: Find a singular value decomposition of the matrix
$$A = \begin{pmatrix} 2 & 2 \\ 1 & -1 \end{pmatrix}.$$
Solution:
Step 1: Find $A^*A$:
$$A^*A = \begin{pmatrix} 2 & 1 \\ 2 & -1 \end{pmatrix}\begin{pmatrix} 2 & 2 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 5 & 3 \\ 3 & 5 \end{pmatrix}.$$
Step 2: Find eigenvalues and eigenvectors of $A^*A$: To find the eigenvalues, we solve the equation $\det(A^*A - \lambda I) = 0$ to get
$$\begin{vmatrix} 5 - \lambda & 3 \\ 3 & 5 - \lambda \end{vmatrix} = 0.$$
This gives us the equation $\lambda^2 - 10\lambda + 16 = 0$, solving which we get $\lambda_1 = 8$, $\lambda_2 = 2$; therefore the singular values of $A$ are $\sqrt{\lambda_1} = 2\sqrt{2}$, $\sqrt{\lambda_2} = \sqrt{2}$. Next, we find eigenvectors by solving the equations $(A^*A - \lambda_i I)v_i = 0$ for $i = 1, 2$.
For $\lambda_1 = 8$, we have
$$\begin{pmatrix} -3 & 3 \\ 3 & -3 \end{pmatrix} v_1 = 0 \implies v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
We then normalize it to get $e_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. For $\lambda_2 = 2$, we have
$$\begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix} v_2 = 0 \implies v_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
We then normalize it to get $e_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

Step 3: Write down $\Sigma$, $V$: $\Sigma$ is the diagonal matrix of the same size as $A$, with the singular values along the diagonal. This gives us
$$(1) \qquad \Sigma = \begin{pmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{pmatrix}.$$
$V$ is the matrix formed by writing the orthonormal eigenvectors of $A^*A$ as its columns. This gives us
$$(2) \qquad V = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}.$$

Step 4: Compute $U$: In this case, since there are no eigenvalues of $A^*A$ that are equal to zero, the matrix $U$ is given by $[u_1 \; u_2]$, where the column vectors $u_1$, $u_2$ are found using the formula
$$(3) \qquad u_i = \frac{1}{\sqrt{\lambda_i}} A e_i.$$
This gives
$$u_1 = \frac{1}{2\sqrt{2}}\begin{pmatrix} 2 & 2 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad \text{and} \qquad u_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 2 & 2 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
This gives us
$$(4) \qquad U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Step 5: Write down the SVD decomposition using equations (1), (2) and (4):
$$A = U\Sigma V^* = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}.$$

B. Type: Square matrix $A$ with $A^*A$ having a zero eigenvalue: We proceed as before to compute $\Sigma$ and $V$. The only difference in this case is that we can't find all the column vectors of $U$ using the formula given in equation (3). What we do in this case is find the column vectors using the non-zero eigenvalues/singular values and then extend this list to an orthonormal basis of $\mathbb{R}^m$. In the example below, $m = 2$, so at most one eigenvalue of $A^*A$ is zero. If this happens, we get the
first column vector $u_1$ using formula (3) and extend the list to form an orthonormal basis of $\mathbb{R}^2$: we just pick a vector whose inner product with $u_1$ is zero and then normalize the vector. This is straightforward in the case of length-two vectors: if $\begin{pmatrix} a \\ b \end{pmatrix}$ is a vector of norm $1$, then $\begin{pmatrix} -b \\ a \end{pmatrix}$ is a vector of norm $1$ orthogonal to it.

Note: If $r$ is the rank of the matrix $A$, we obtain exactly $r$ non-zero eigenvalues in Step 2. If $r < m$, we find the vectors $u_1, \dots, u_r$ using equation (3). There is no unique choice of vectors that will extend the list $u_1, \dots, u_r$ to an orthonormal basis of $\mathbb{R}^m$. This won't affect the decomposition, because the zeros on the diagonal of $\Sigma$ ensure that the choice of the vectors $u_{r+1}, \dots, u_m$ is irrelevant. Let's see this explicitly in the following example:

Example 2: Find the SVD of the matrix
$$A = \begin{pmatrix} 4 & 3 \\ 8 & 6 \end{pmatrix}.$$
Solution: Following the same steps as in the previous example for Steps 1 to 4:
Step 1: $A^*A = \begin{pmatrix} 80 & 60 \\ 60 & 45 \end{pmatrix}$.
Step 2: The eigenvalues of $A^*A$ are $\lambda_1 = 125$, $\lambda_2 = 0$. The corresponding orthonormal eigenvectors are $e_1 = \frac{1}{5}\begin{pmatrix} 4 \\ 3 \end{pmatrix}$, $e_2 = \frac{1}{5}\begin{pmatrix} 3 \\ -4 \end{pmatrix}$.
Step 3: We can now write down $\Sigma$, $V$:
$$\Sigma = \begin{pmatrix} 5\sqrt{5} & 0 \\ 0 & 0 \end{pmatrix}, \qquad V = \frac{1}{5}\begin{pmatrix} 4 & 3 \\ 3 & -4 \end{pmatrix}.$$
Step 4: Finding $U$: Since $\lambda_1 \ne 0$, we use equation (3) to find $u_1$:
$$u_1 = \frac{1}{5\sqrt{5}}\begin{pmatrix} 4 & 3 \\ 8 & 6 \end{pmatrix}\begin{pmatrix} 4/5 \\ 3/5 \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 \\ 2 \end{pmatrix}.$$
We then pick $u_2 = \frac{1}{\sqrt{5}}\begin{pmatrix} -2 \\ 1 \end{pmatrix}$ in order to make it orthogonal to $u_1$ and of norm $1$. Thus,
$$U = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}.$$
Step 5: Finally, we write down the SVD:
$$\begin{pmatrix} 4 & 3 \\ 8 & 6 \end{pmatrix} = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 5\sqrt{5} & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 4/5 & 3/5 \\ 3/5 & -4/5 \end{pmatrix}.$$
We could simplify even further to get
$$\begin{pmatrix} 4 & 3 \\ 8 & 6 \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 4 & 3 \\ 3 & -4 \end{pmatrix}.$$
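As a sanity check on Example 2, a library SVD produces the same singular values. The sketch below uses NumPy; note that a library may pick a different (equally valid) second column $u_2$, and possibly different signs, precisely because the zero in $\Sigma$ makes that choice irrelevant.

```python
import numpy as np

# The rank-1 matrix from Example 2.
A = np.array([[4.0, 3.0],
              [8.0, 6.0]])

# numpy returns singular values in decreasing order.
U, s, Vt = np.linalg.svd(A)

print(np.allclose(s, [5 * np.sqrt(5), 0.0]))  # True: sigma_1 = sqrt(125), sigma_2 = 0
print(np.allclose(U @ np.diag(s) @ Vt, A))    # True: A = U Sigma V*
```

Comparing `U` and `Vt` against the hand computation column by column (up to sign) is a good exercise; the product $U\Sigma V^*$ must agree exactly either way.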
C. Type: $A$ being an $m \times n$ matrix with $m < n$: We proceed as before, keeping in mind the different dimensions of $U$ and $V$ and of the diagonal matrix $\Sigma$. Observe that when $m < n$, the rank of $A$ is at most $m$. So, although $A^*A$ is an $n \times n$ matrix, there will be at least $n - m$ eigenvalues that are equal to zero. If $r$ is the rank of $A$, then we are assured of $r$ positive eigenvalues of $A^*A$, and while writing the matrix $\Sigma$ we ignore the (zero) eigenvalues $\lambda_{m+1}, \dots, \lambda_n$. If $r < m$, the eigenvalues $\lambda_{r+1}, \dots, \lambda_m$ will also be zero, but they do appear among the diagonal entries of $\Sigma$. In the example below, $m = 2$, $n = 3$.

Example 3: Find the SVD of the matrix
$$A = \frac{1}{\sqrt{2}}\begin{pmatrix} 3 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix}.$$
Solution:
Step 1: $A^*A = \begin{pmatrix} 5 & 3 & 4 \\ 3 & 2 & 3 \\ 4 & 3 & 5 \end{pmatrix}$.
Step 2: The eigenvalues of $A^*A$ are $\lambda_1 = 11$, $\lambda_2 = 1$, $\lambda_3 = 0$. The corresponding orthonormal eigenvectors are
$$e_1 = \frac{1}{\sqrt{22}}\begin{pmatrix} 3 \\ 2 \\ 3 \end{pmatrix}, \quad e_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \quad e_3 = \frac{1}{\sqrt{11}}\begin{pmatrix} 1 \\ -3 \\ 1 \end{pmatrix}.$$
Step 3: We can now write down $\Sigma$, $V$:
$$\Sigma = \begin{pmatrix} \sqrt{11} & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad V = \begin{pmatrix} 3/\sqrt{22} & 1/\sqrt{2} & 1/\sqrt{11} \\ 2/\sqrt{22} & 0 & -3/\sqrt{11} \\ 3/\sqrt{22} & -1/\sqrt{2} & 1/\sqrt{11} \end{pmatrix}.$$
Step 4: Finding $U$: Since $\lambda_1, \lambda_2 \ne 0$, we use equation (3) to find the vectors $u_1$, $u_2$:
$$u_1 = \frac{1}{\sqrt{11}} \cdot \frac{1}{\sqrt{2}}\begin{pmatrix} 3 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix}\frac{1}{\sqrt{22}}\begin{pmatrix} 3 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad u_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 3 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix}\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Thus
$$U = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
Step 5: Finally, we write down the SVD:
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \sqrt{11} & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 3/\sqrt{22} & 2/\sqrt{22} & 3/\sqrt{22} \\ 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 1/\sqrt{11} & -3/\sqrt{11} & 1/\sqrt{11} \end{pmatrix}.$$
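The wide-matrix case can also be checked numerically. The sketch below assumes the $2 \times 3$ matrix $A = \frac{1}{\sqrt{2}}\begin{pmatrix} 3 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix}$ with $A^*A = \begin{pmatrix} 5 & 3 & 4 \\ 3 & 2 & 3 \\ 4 & 3 & 5 \end{pmatrix}$; with full (square) $U$ and $V$, the rectangular $\Sigma$ must be rebuilt by hand before multiplying the factors back together.

```python
import numpy as np

# A 2x3 matrix (m = 2 < n = 3) whose A*A has eigenvalues 11, 1, 0.
A = np.array([[3.0, 2.0,  3.0],
              [1.0, 0.0, -1.0]]) / np.sqrt(2)

# With full_matrices=True (the default), U is 2x2 and Vt is 3x3,
# while s holds only the min(m, n) = 2 singular values.
U, s, Vt = np.linalg.svd(A)
print(np.allclose(s, [np.sqrt(11), 1.0]))  # True: singular values sqrt(11) and 1

# Rebuild the rectangular 2x3 Sigma and check the factorization.
Sigma = np.zeros((2, 3))
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))      # True
```

The third column of $V$ (an eigenvector for the zero eigenvalue) never touches the product, mirroring how the zero column of $\Sigma$ suppresses it in the hand computation.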
D. Type: $A$ being an $m \times n$ matrix with $m > n$:

Example 4: Find the SVD of the matrix
$$A = \begin{pmatrix} 1 & 1 \\ -2 & -2 \\ 2 & 2 \end{pmatrix}.$$
Solution:
Step 1: $A^*A = \begin{pmatrix} 9 & 9 \\ 9 & 9 \end{pmatrix}$.
Step 2: The eigenvalues of $A^*A$ are $\lambda_1 = 18$, $\lambda_2 = 0$. The corresponding orthonormal eigenvectors are $e_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $e_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
Step 3: We can now write down $\Sigma$, $V$:
$$\Sigma = \begin{pmatrix} \sqrt{18} & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad V = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
Step 4: Finding $U$: Since $\lambda_1 \ne 0$, we use equation (3) to find the vector $u_1$:
$$u_1 = \frac{1}{\sqrt{18}}\begin{pmatrix} 1 & 1 \\ -2 & -2 \\ 2 & 2 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1/3 \\ -2/3 \\ 2/3 \end{pmatrix}.$$
We pick $u_2$ so that it is orthogonal to $u_1$ and then normalize it: we choose
$$u_2 = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}.$$
Now we need to find a vector $u_3$ that is orthogonal to both $u_1$ and $u_2$. We do this by finding the cross product of $u_1$ and $u_2$. However, since we only need orthogonality, we can ignore the normalization of the vectors and find the cross product of the vectors $\begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ to simplify calculations:
$$\begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & -2 & 2 \\ 2 & 1 & 0 \end{vmatrix} = -2\hat{i} + 4\hat{j} + 5\hat{k}.$$
Normalizing, we have
$$u_3 = \frac{1}{\sqrt{45}}\begin{pmatrix} -2 \\ 4 \\ 5 \end{pmatrix}.$$
Step 5: Finally, we write down the SVD:
$$\begin{pmatrix} 1 & 1 \\ -2 & -2 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 1/3 & 2/\sqrt{5} & -2/\sqrt{45} \\ -2/3 & 1/\sqrt{5} & 4/\sqrt{45} \\ 2/3 & 0 & 5/\sqrt{45} \end{pmatrix}\begin{pmatrix} \sqrt{18} & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}.$$
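For the tall case, a library SVD performs the basis extension for us. The sketch below assumes the $3 \times 2$ matrix $A$ with columns $(1, -2, 2)^T$ repeated, i.e. $A^*A = \begin{pmatrix} 9 & 9 \\ 9 & 9 \end{pmatrix}$; with `full_matrices=True`, NumPy returns a full orthonormal $3 \times 3$ matrix $U$, playing the role of the hand-built $u_1, u_2, u_3$ (its particular columns beyond $u_1$ may differ from a hand computation, which is harmless because of the zeros in $\Sigma$).

```python
import numpy as np

# A 3x2 matrix (m = 3 > n = 2) of rank 1: A*A = [[9, 9], [9, 9]].
A = np.array([[ 1.0,  1.0],
              [-2.0, -2.0],
              [ 2.0,  2.0]])

# full_matrices=True extends the columns of U to a full
# orthonormal basis of R^3, as done by hand via the cross product.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(np.allclose(s, [np.sqrt(18), 0.0]))  # True: singular values 3*sqrt(2) and 0

# Rebuild the rectangular 3x2 Sigma and verify A = U Sigma V*.
Sigma = np.zeros((3, 2))
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))      # True
print(np.allclose(U.T @ U, np.eye(3)))     # True: the columns of U are orthonormal
```

Passing `full_matrices=False` instead would return only the first $n$ columns of $U$, i.e. the "thin" SVD, which already suffices to reconstruct $A$.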