Lecture 19: Polar and singular value decompositions; generalized eigenspaces; the decomposition theorem (1)

Travis Schedler

Thurs, Nov 17, 2011 (version: Thurs, Nov 17, 1:00 PM)

Goals (2)

Polar decomposition and singular value decomposition. Generalized eigenspaces and the decomposition theorem. Read Chapter 7, begin Chapter 8, and do PS 9.

Warm-up exercise (3)

(a) Let $T$ be an invertible operator on a f.d. i.p.s. and set $P := \sqrt{T^* T}$ and $S := T P^{-1}$. Show that $S$ is an isometry. Recall that $P$ is positive, so $T = SP$ is a polar decomposition (i.e., $S$ is an isometry and $P$ is positive).

(b) Now suppose $T = 0$. Show that the polar decompositions $T = SP$ are exactly $T = S \cdot 0$ for every isometry $S$, i.e., we always have $P = 0$, but $S$ can be anything.

One-dimensional analogue: either $z \in \mathbb{C}$ is invertible, in which case $z = (z/|z|) \cdot |z| = sp$, or else $z$ is zero, in which case $z = s \cdot 0$ for any $s$ of absolute value one.

Solution to warm-up exercise (4)

(a) $S^* S = (T P^{-1})^* T P^{-1} = (P^{-1})^* T^* T P^{-1} = P^{-1} P^2 P^{-1} = I$. Here we used that $P^* = P$, and hence $(P^{-1})^* = P^{-1}$ as well.

(b) Since isometries are invertible, $0 = SP$ for $S$ an isometry implies $P = S^{-1} 0 = 0$. On the other hand, clearly $S \cdot 0 = 0$ for all $S$.
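The one-dimensional analogue is easy to check numerically. A minimal sketch in Python (the sample value of $z$ is my own choice, not from the lecture):

```python
import cmath

# Polar decomposition of a nonzero complex number: z = s * r,
# with |s| = 1 and r = |z| >= 0 -- the 1-d analogue of T = S P.
z = 3 - 4j
r = abs(z)      # the "positive part": r = |z| = sqrt(conj(z) * z)
s = z / r       # the "isometry part": |s| = 1
assert abs(abs(s) - 1) < 1e-12
assert abs(s * r - z) < 1e-12

# For z = 0 the decomposition is 0 = s * 0 for ANY s of absolute
# value one, mirroring part (b) of the warm-up.
for theta in (0.0, 1.0, 2.5):
    s = cmath.exp(1j * theta)
    assert s * 0 == 0
```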
Polar decomposition and SVD (5)

Proposition: every complex number $z$ is expressible as $z = r e^{i\theta}$, where $r \geq 0$ and $\theta \in [0, 2\pi)$ (uniquely if $z$ is nonzero). Equivalently: $z = sr$, for $|s| = 1$ and $r = |z| = \sqrt{\bar{z} z} \geq 0$.

Theorem 1. Let $V$ be a f.d. i.p.s. and $T \in \mathcal{L}(V)$. Then there is an expression $T = SP$, for $S$ an isometry and $P$ positive. $P$ is unique, namely $P = \sqrt{T^* T}$. Moreover, $S$ is unique iff $T$ is invertible.

Corollary 2 (Singular Value Decomposition (SVD)). There exist orthonormal bases $(e_1, \ldots, e_n)$ and $(f_1, \ldots, f_n)$ of $V$ such that $T e_i = s_i f_i$, for $s_i \geq 0$ the singular values. Moreover, $(e_1, \ldots, e_n)$ is an orthonormal eigenbasis of $T^* T$ with eigenvalues $s_i^2$.

Proof: Let $(e_1, \ldots, e_n)$ be an orthonormal eigenbasis of $T^* T$ and $s_1, \ldots, s_n$ the nonnegative square roots of the eigenvalues. When $s_i \neq 0$, set $f_i := s_i^{-1} T e_i$. Then extend the resulting $f_i$ to an orthonormal basis.

Uniqueness of polar decomposition (6)

If $T = SP$, then $T^* T = P^* S^* S P = P^* P = P^2$, so $P = \sqrt{T^* T}$. Thus $P$ is unique (positive operators have unique positive square roots; see the slides for Lecture 18 or Axler).

If $T$ is invertible, then $S = T P^{-1}$, so $S$ is unique.

Conversely, if $T$ is not invertible, then neither is $P$, and we can replace $S$ by $S S'$, where $S' \neq I$ is an isometry such that $S' v = v$ for all eigenvectors $v$ of $P$ of nonzero eigenvalue. So then $S$ is not unique.

Existence of polar decomposition (7)

Set $P := \sqrt{T^* T}$. Then $\operatorname{range}(P)$ is $P$-invariant and $P$ is an isomorphism there (it has an eigenbasis with nonzero eigenvalues). Define thus $P^{-1}|_{\operatorname{range}(P)} : \operatorname{range}(P) \to \operatorname{range}(P)$.

Consider $S_1 := T P^{-1}|_{\operatorname{range}(P)} : \operatorname{range}(P) \to \operatorname{range}(T)$. Then $S_1^* S_1 = I$, so $\langle u, v \rangle = \langle S_1 u, S_1 v \rangle$ for all $u, v \in \operatorname{range}(P)$.

Recall: $\operatorname{null}(P) = \operatorname{null}(T)$. So $\dim \operatorname{range}(P) = \dim \operatorname{range}(T)$.

Thus $S_1$ takes an orthonormal basis $(e_1, \ldots, e_m)$ of $\operatorname{range}(P)$ to an orthonormal basis $(f_1, \ldots, f_m)$ of $\operatorname{range}(T)$. Extend $(e_i)$ and $(f_i)$ to orthonormal bases of $V$, and extend $S_1$ to $S \in \mathcal{L}(V)$ by $S(e_i) = f_i$ when $i > m$.

So $S$ takes an orthonormal basis to another orthonormal basis, i.e., it is an isometry.
Finally, $T = SP$, since this holds on $\operatorname{range}(P)$ and on $\operatorname{null}(T) = \operatorname{null}(P) = \operatorname{range}(P)^\perp = \operatorname{Span}(e_{m+1}, \ldots, e_n)$.
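The existence proof is effectively an algorithm: diagonalize $T^* T$, take nonnegative square roots to get $P$, and let $S$ carry the eigenbasis to the normalized images. A minimal NumPy sketch of that recipe (written for these notes; it runs on the $2 \times 2$ example matrix used later in the lecture, and assumes $A$ invertible so all $s_i \neq 0$):

```python
import numpy as np

# Polar decomposition A = S P via an orthonormal eigenbasis of A* A,
# following the existence proof.  Assumes A is invertible.
A = np.array([[0.0, 2.0],
              [1.0, 0.0]])

# Columns of E form an orthonormal eigenbasis of A* A; lam >= 0.
lam, E = np.linalg.eigh(A.conj().T @ A)
s = np.sqrt(lam)                        # singular values s_i

P = E @ np.diag(s) @ E.conj().T         # P = sqrt(A* A)
F = A @ E / s                           # column i is f_i = s_i^{-1} A e_i
S = F @ E.conj().T                      # the isometry with S e_i = f_i

assert np.allclose(S @ P, A)                    # A = S P
assert np.allclose(S.conj().T @ S, np.eye(2))   # S* S = I, an isometry
assert np.all(np.linalg.eigvalsh(P) >= -1e-12)  # P is positive
```

Note that $P = \sqrt{A^* A}$ and $S = A P^{-1}$ do not depend on which orthonormal eigenbasis `eigh` returns, matching the uniqueness statement for invertible $A$.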
Nonuniqueness of SVD (8)

Note: the SVD is not unique, even if $T$ is invertible: the orthonormal eigenbasis $(e_i)$ of $T^* T$ is not unique. (E.g., one can reorder the $e_i$ and multiply them by $\pm 1$, at the least.) On the other hand, the polar decomposition is unique iff $T$ is invertible.

Example: $T = T_A$, $A = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}$.

We can guess that $A = SP = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$. So this is the answer (unique since $A$, equivalently $P$, is invertible).

For SVD we could have $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $s_1 = 1$, $s_2 = 2$, $f_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $f_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. But we could also swap everything: $s_1$ with $s_2$, $e_1$ with $e_2$, and $f_1$ with $f_2$. Or we could take $e_1$ to $-e_1$ (hence $f_1$ to $-f_1$) and/or $e_2$ to $-e_2$ (hence $f_2$ to $-f_2$).

Computing SVD and polar decomposition (9)

The best way to compute these is to do SVD first; then let $P$ be the operator with eigenvectors $(e_i)$ and eigenvalues $(s_i)$, and let $S$ be the isometry $S e_i = f_i$ for all $i$.

To compute the SVD, given $T$, first compute $T^* T$. Then find the eigenvalues of $T^* T$ ($2 \times 2$ case: characteristic polynomial: for $A = M(T)$ in an orthonormal basis, these are the roots of $x^2 - \operatorname{tr}(\bar{A}^t A)\, x + \det(\bar{A}^t A)$). Find the eigenspaces and an orthonormal eigenbasis $(e_i)$ of $T^* T$.

Next, set $P := \sqrt{T^* T}$, by taking the nonnegative square roots of the eigenvalues. These square roots are the $s_i$.

Finally, let $f_i := s_i^{-1} T e_i$ for the nonzero $s_i$; for the remaining $f_i$, just extend the ones we get to an orthonormal basis.

Example (10)

Example from before: $A = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}$.
First, $\bar{A}^t A = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}$.

Next, an eigenbasis is $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, with eigenvalues 1 and 4.

So $P = \sqrt{\bar{A}^t A}$ has the same eigenbasis, with eigenvalues $s_1 = \sqrt{1} = 1$ and $s_2 = \sqrt{4} = 2$.

Then $f_1 = s_1^{-1} A e_1 = 1^{-1} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Also $f_2 = s_2^{-1} A e_2 = 2^{-1} \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.

Now $P = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and $S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, as desired.

In general: $P = (e_1\ e_2) \begin{pmatrix} s_1 & 0 \\ 0 & s_2 \end{pmatrix} (e_1\ e_2)^{-1}$ and $S = (f_1\ f_2)(e_1\ e_2)^{-1}$.

Generalized eigenvectors (11)

Goal: Although not every operator on a f.d. vector space admits an eigenbasis, over $F = \mathbb{C}$ every operator admits a basis of generalized eigenvectors.

Definition 3. A generalized eigenvector $v$ of $T$ of eigenvalue $\lambda$ is one such that, for some $m \geq 1$, $(T - \lambda I)^m v = 0$.

Examples: $m = 1$ above if and only if $v$ is an (ordinary) eigenvector. If $T$ is nilpotent, then all vectors are generalized eigenvectors of eigenvalue zero. So, even though a nonzero nilpotent $T$ does not have an eigenbasis, every basis is a basis of generalized eigenvectors!

Definition 4. Let $V(\lambda)$ be the generalized eigenspace of eigenvalue $\lambda$: the span of all generalized eigenvectors of eigenvalue $\lambda$.

Note that $V(\lambda)$ is $T$-invariant, since $(T - \lambda I)^m v = 0$ implies $(T - \lambda I)^m T v = T (T - \lambda I)^m v = 0$.

The decomposition theorem (12)

Theorem 5. Let $V$ be f.d., $F = \mathbb{C}$, and $T \in \mathcal{L}(V)$. Then $V$ is the direct sum of its generalized eigenspaces: $V = \bigoplus_\lambda V(\lambda)$.

First step:

Lemma 6. Suppose that $\lambda \neq \mu$. Then $V(\lambda) \cap V(\mu) = \{0\}$.
Proof. Suppose $v \in V(\lambda) \cap V(\mu)$ is nonzero. Let $m$ be such that $(T - \lambda I)^m v = 0$ but $(T - \lambda I)^{m-1} v \neq 0$. So $u := (T - \lambda I)^{m-1} v$ is a nonzero eigenvector of eigenvalue $\lambda$.

Now let $m' \geq 1$ be such that $(T - \mu I)^{m'} v = 0$. Then also $(T - \mu I)^{m'} u = (T - \mu I)^{m'} (T - \lambda I)^{m-1} v = (T - \lambda I)^{m-1} (T - \mu I)^{m'} v = 0$.

But $(T - \mu I)^{m'} u = (\lambda - \mu)^{m'} u \neq 0$, since $\lambda \neq \mu$. This is a contradiction.

Lemma on generalized eigenspaces (13)

Lemma 7. $V(\lambda) = \operatorname{null}(T - \lambda I)^{\dim V}$. I.e., if $v$ is a generalized eigenvector of eigenvalue $\lambda$, we can take $m = \dim V$ above: $(T - \lambda I)^{\dim V} v = 0$.

Proof. Let $U_i := (T - \lambda I)^i V(\lambda)$. Since $V(\lambda)$ is $T$-invariant (hence $(T - \lambda I)$-invariant), $U_0 \supseteq U_1 \supseteq \cdots$. However, if $U_i = U_{i+1}$, then $(T - \lambda I)$ is injective on $U_i$. Since $(T - \lambda I)$ is nilpotent on $V(\lambda)$, this implies $U_i = \{0\}$.

So $U_0 \supsetneq U_1 \supsetneq \cdots \supsetneq U_m = \{0\}$, and $\dim U_i \leq \dim V(\lambda) - i$. Hence $m \leq \dim V(\lambda) \leq \dim V$, and $(T - \lambda I)^{\dim V} v = 0$ for all $v \in V(\lambda)$.

One more lemma (14)

Lemma 8. $V = \operatorname{null}(T - \lambda I)^{\dim V} \oplus \operatorname{range}(T - \lambda I)^{\dim V} = V(\lambda) \oplus \operatorname{range}(T - \lambda I)^{\dim V}$.

Proof. Since the dimensions add up to $\dim V$ (rank-nullity), we need only show that $\operatorname{null}(T - \lambda I)^{\dim V} \cap \operatorname{range}(T - \lambda I)^{\dim V} = \{0\}$.

Let $v \in \operatorname{null}(T - \lambda I)^{\dim V} \cap \operatorname{range}(T - \lambda I)^{\dim V}$. Write $v = (T - \lambda I)^{\dim V} u$. Since $(T - \lambda I)^{2 \dim V} u = (T - \lambda I)^{\dim V} v = 0$, $u$ is also a generalized eigenvector of eigenvalue $\lambda$. But then, by the last lemma, $(T - \lambda I)^{\dim V} u = 0$, so $v = 0$.

Proof of the decomposition theorem (15)

Theorem 9. Let $V$ be f.d., $F = \mathbb{C}$, and $T \in \mathcal{L}(V)$. Then $V$ is the direct sum of its generalized eigenspaces: $V = \bigoplus_\lambda V(\lambda)$.

Proof: By induction on $\dim V$. Let $\lambda$ be an eigenvalue of $T$, so $V(\lambda) \neq \{0\}$. Write $V = V(\lambda) \oplus \operatorname{range}(T - \lambda I)^{\dim V}$.

Since $\dim \operatorname{range}(T - \lambda I)^{\dim V} < \dim V$, the inductive hypothesis shows that $\operatorname{range}(T - \lambda I)^{\dim V}$ is the direct sum of the generalized eigenspaces of $T|_{\operatorname{range}(T - \lambda I)^{\dim V}}$.
To conclude, we claim that for $\mu \neq \lambda$, $V(\mu) \subseteq \operatorname{range}(T - \lambda I)^{\dim V}$. Thus $V(\mu)$ is a generalized eigenspace of $T|_{\operatorname{range}(T - \lambda I)^{\dim V}}$.

For this, we show that $(T - \lambda I)^{\dim V} V(\mu) = V(\mu)$. First, $V(\mu)$ is $T$-invariant, so $(T - \lambda I)^{\dim V} V(\mu) \subseteq V(\mu)$. We only need to show that $(T - \lambda I)^{\dim V}$ is injective on $V(\mu)$. This means that $V(\mu) \cap \operatorname{null}(T - \lambda I)^{\dim V} = \{0\}$. But this is $V(\mu) \cap V(\lambda) = \{0\}$, by the lemmas.
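The decomposition theorem can be illustrated numerically, using Lemma 7 to compute each $V(\lambda)$ as $\operatorname{null}(T - \lambda I)^{\dim V}$. A short NumPy sketch (the $3 \times 3$ matrix is my own example, not from the lecture):

```python
import numpy as np

# T has characteristic polynomial (x - 2)^2 (x - 5), but the eigenspace
# at lambda = 2 is only 1-dimensional, so T has no eigenbasis.  The
# generalized eigenspaces V(2) and V(5) nevertheless span C^3.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
n = T.shape[0]

def generalized_nullity(T, lam):
    """dim null (T - lam I)^n, computed as n - rank (Lemma 7)."""
    M = np.linalg.matrix_power(T - lam * np.eye(n), n)
    return n - np.linalg.matrix_rank(M)

dims = [generalized_nullity(T, lam) for lam in (2.0, 5.0)]
assert dims == [2, 1]
assert sum(dims) == n   # V is the direct sum of the V(lambda)

# The ordinary eigenspace at lambda = 2 is strictly smaller than V(2):
assert n - np.linalg.matrix_rank(T - 2.0 * np.eye(n)) == 1
```

Here $\dim V(2) + \dim V(5) = 2 + 1 = 3 = \dim V$, as the theorem predicts, while the ordinary eigenspaces would only span a 2-dimensional subspace.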