Linear Algebra 2: Spectral Notes

In what follows, $V$ is an inner product vector space over $F$, where $F = \mathbb{R}$ or $\mathbb{C}$. We will use results seen so far; in particular, that every linear operator $T \in \mathcal{L}(V)$ has a complex eigenvalue. This may need a bit of clarification, given our textbook's attempt at making things slightly more mysterious than need be. I repeat here the proof that if $T \in \mathcal{L}(V)$ and $V$ is a complex, finite dimensional vector space, then $T$ has an eigenvalue $\lambda \in \mathbb{C}$. You should probably know this proof.

Let $n = \dim V$. Let $v \in V$, $v \ne 0$. The $n+1$ vectors $v, Tv, \dots, T^n v$ must be linearly dependent, thus there exist $c_0, \dots, c_n \in \mathbb{C}$, not all zero, such that $\sum_{k=0}^{n} c_k T^k v = 0$ (where $T^0$ is, as usual, interpreted as being the identity operator). Let $p(x) = \sum_{k=0}^{n} c_k x^k$, so $p \in \mathcal{P}_n(\mathbb{C})$. By the fundamental theorem of algebra and its consequences, we can write $p(x) = (x - \lambda_1)\cdots(x - \lambda_n)$, where $\lambda_1, \dots, \lambda_n \in \mathbb{C}$ are the (possibly repeated) roots of $p(x) = 0$. Then
$$0 = p(T)v = (T - \lambda_1 I)\cdots(T - \lambda_n I)v.$$
This does not necessarily imply $(T - \lambda_n I)v = 0$, but it does imply that there exists $k$, $1 \le k \le n$, such that
$$(T - \lambda_{k+1} I)\cdots(T - \lambda_n I)v \ne 0, \qquad (T - \lambda_k I)\cdots(T - \lambda_n I)v = 0.$$
(If $k = n$ we interpret $(T - \lambda_{k+1} I)\cdots(T - \lambda_n I)v$ as being just $v$; i.e., $(T - \lambda_{n+1} I)\cdots(T - \lambda_n I)v = v$.) In brief, setting $w = (T - \lambda_{k+1} I)\cdots(T - \lambda_n I)v$, we have $w \ne 0$ and $(T - \lambda_k I)w = 0$. Thus $\lambda_k$ is an eigenvalue of $T$.

Recalling the notes preceding Homework 3, if $V$ is a real vector space, we will say that $\lambda \in \mathbb{C}$ is an eigenvalue of $T$ if and only if $\lambda$ is an eigenvalue of the complexified operator $T_{\mathbb{C}}$. We recall that if such a $\lambda$ is real, then it is an eigenvalue of $T$ in the textbook sense (as you had to prove in those notes). But with this new definition, every linear operator has at least one eigenvalue, albeit possibly a complex one. We will also need to use that if $T$ is self-adjoint, then all of its eigenvalues are real. The proof in the text (7.A, 7.13, p. 210) is preceded by the comment that the result is only interesting when $F = \mathbb{C}$.
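The proof above is effectively constructive, and it may help to see it run numerically. The following NumPy sketch (my own illustration, not part of the notes; the function name, the SVD-based way of finding the null combination, and the tolerance are all choices made for this example) builds the dependent vectors $v, Tv, \dots, T^n v$, extracts a polynomial with $p(T)v = 0$, factors it, and peels off factors $(T - \lambda I)$ from the right until one annihilates a nonzero vector, exactly as in the proof:

```python
import numpy as np

def eigenvalue_from_proof(T, v, tol=1e-8):
    """Find an eigenvalue/eigenvector pair of T following the textbook proof.

    T: (n, n) complex matrix; v: nonzero vector in C^n.
    """
    n = T.shape[0]
    # Columns v, Tv, ..., T^n v: n+1 vectors in C^n, necessarily dependent.
    K = np.empty((n, n + 1), dtype=complex)
    K[:, 0] = v
    for k in range(1, n + 1):
        K[:, k] = T @ K[:, k - 1]
    # A nonzero c with K c = 0 gives p(x) = sum_k c_k x^k with p(T)v = 0;
    # the last right singular vector of K spans (numerically) its null space.
    _, _, Vh = np.linalg.svd(K)
    c = Vh[-1].conj()
    roots = np.roots(c[::-1])          # np.roots wants highest degree first
    # Apply the factors (T - lambda I) to v starting from the right end of
    # the product; the last vector before hitting 0 is an eigenvector.
    w = v.astype(complex).copy()
    for lam in roots[::-1]:
        w_next = (T - lam * np.eye(n)) @ w
        if np.linalg.norm(w_next) < tol * np.linalg.norm(w):
            return lam, w              # (T - lam I) w = 0 with w != 0
        w = w_next
    raise RuntimeError("no factor annihilated the vector (numerical failure)")
```

The returned vector plays the role of $w$ in the proof, and the returned root is the eigenvalue $\lambda_k$.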
Thanks to the notes preceding Homework 3, and our extended definition of an eigenvalue, this comment is to be ignored; the result applies both in the case $F = \mathbb{R}$ and $F = \mathbb{C}$. Incidentally, 7.27 in 7.B is reduced to a triviality that need not even be mentioned. Let us get started! Well, the first result was also an exercise in the notes preceding Homework 3.

Lemma 1. Assume $V$ is a real vector space and $T \in \mathcal{L}(V)$. Recall that we defined $T_{\mathbb{C}} \in \mathcal{L}(V_{\mathbb{C}})$ by $T_{\mathbb{C}}(v + iw) = Tv + iTw$ for $v, w \in V$. Then $(T_{\mathbb{C}})^* = (T^*)_{\mathbb{C}}$.
Proof. Let $u, v, u', v' \in V$. Then
$$\langle T_{\mathbb{C}}(u+iv),\, u'+iv' \rangle = \langle Tu + iTv,\, u'+iv' \rangle = \langle Tu, u' \rangle + \langle Tv, v' \rangle + i\big(\langle Tv, u' \rangle - \langle Tu, v' \rangle\big)$$
$$= \langle u, T^*u' \rangle + \langle v, T^*v' \rangle + i\big(\langle v, T^*u' \rangle - \langle u, T^*v' \rangle\big) = \langle u+iv,\, T^*u' + iT^*v' \rangle = \langle u+iv,\, (T^*)_{\mathbb{C}}(u'+iv') \rangle.$$
Since $u+iv$, $u'+iv'$ are arbitrary elements of $V_{\mathbb{C}}$, we are done.

Definition 1. An operator $T \in \mathcal{L}(V)$ is normal if and only if it commutes with its adjoint: $T^*T = TT^*$.

Definition 2. Let $T \in \mathcal{L}(V)$. We say $T$ is skew-adjoint if and only if $T^* = -T$.

Lemma 2. Let $T \in \mathcal{L}(V)$.
1. If $V$ is a complex vector space, then $T$ is skew-adjoint if and only if $iT$ is self-adjoint.
2. If $V$ is a real vector space, then $T$ is skew-adjoint if and only if $T_{\mathbb{C}}$ is skew-adjoint; thus, by part 1, if and only if $iT_{\mathbb{C}}$ is self-adjoint.

Proof. Assume $V$ is a complex vector space. Since $(iT)^* = -iT^*$, it is clear that $T$ is skew-adjoint if and only if $iT$ is self-adjoint. Assume next that $V$ is a real vector space. One thing perhaps not mentioned in the notes preceding Homework 3, but that is immediately verified, is that if $R, S \in \mathcal{L}(V)$ and $a, b \in \mathbb{R}$, then $(aR + bS)_{\mathbb{C}} = aR_{\mathbb{C}} + bS_{\mathbb{C}}$. From now on I'll use this result without further explanation. So if $T^* = -T$, then $(T_{\mathbb{C}})^* = (T^*)_{\mathbb{C}} = (-T)_{\mathbb{C}} = -T_{\mathbb{C}}$. Conversely, if $(T_{\mathbb{C}})^* = -T_{\mathbb{C}}$, then $(T^*)_{\mathbb{C}} = (T_{\mathbb{C}})^* = -T_{\mathbb{C}} = (-T)_{\mathbb{C}}$. It is also immediate to verify that if $R, S \in \mathcal{L}(V)$ and $R_{\mathbb{C}} = S_{\mathbb{C}}$, then $R = S$; thus $T^* = -T$.

Proposition 3. Let $T \in \mathcal{L}(V)$. Then $T$ is normal if and only if $T = R + S$, where $R$ is self-adjoint, $S$ is skew-adjoint, and $RS = SR$.

Proof. Assume first that $T = R + S$, where $R$ is self-adjoint, $S$ is skew-adjoint, and $RS = SR$. Then $T^* = R - S$ and
$$T^*T = (R - S)(R + S) = R^2 + RS - SR - S^2 = R^2 - RS + SR - S^2 = (R + S)(R - S) = TT^*,$$
where the middle equality uses $RS = SR$.

Conversely, assume $T$ is normal, so $T^*T = TT^*$. Let $R = \frac{1}{2}(T + T^*)$, $S = \frac{1}{2}(T - T^*)$. Then clearly $R^* = R$, $S^* = -S$, $R + S = T$, and
$$RS = \tfrac{1}{4}(T + T^*)(T - T^*) = \tfrac{1}{4}\big(T^2 - TT^* + T^*T - (T^*)^2\big) = \tfrac{1}{4}\big(T^2 + TT^* - T^*T - (T^*)^2\big) = \tfrac{1}{4}(T - T^*)(T + T^*) = SR.$$
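Proposition 3 is easy to check numerically for a concrete matrix. The following NumPy snippet (an illustration of mine, not part of the notes; building a normal matrix as a unitary conjugation of a complex diagonal matrix is just one convenient choice) splits a normal $T$ into its self-adjoint and skew-adjoint parts and verifies that they commute:

```python
import numpy as np

rng = np.random.default_rng(0)
# A normal matrix: T = Q D Q* with Q unitary and D diagonal (complex spectrum).
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
D = np.diag(rng.normal(size=4) + 1j * rng.normal(size=4))
T = Q @ D @ Q.conj().T

R = (T + T.conj().T) / 2            # self-adjoint part
S = (T - T.conj().T) / 2            # skew-adjoint part
assert np.allclose(R, R.conj().T)   # R* = R
assert np.allclose(S, -S.conj().T)  # S* = -S
assert np.allclose(T, R + S)        # the (unique) decomposition T = R + S
assert np.allclose(R @ S, S @ R)    # the parts commute, since T is normal
```

Replacing $T$ by a non-normal matrix makes the last assertion fail while the first three still hold, matching the "only if" direction of the proposition.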
The decomposition $T = \frac{1}{2}(T + T^*) + \frac{1}{2}(T - T^*)$ is the only way in which one can write a linear operator $T$ as a sum of a self-adjoint and a skew-adjoint operator. In fact, if $T = R + S$ where $R$ is self-adjoint and $S$ is skew-adjoint, then $T^* = R - S$, so that $T + T^* = 2R$ and $T - T^* = 2S$.

Theorem 4. Let $T \in \mathcal{L}(V)$, $\lambda \in F$, and let $E(\lambda, T) = N(T - \lambda I) = \{v \in V : (T - \lambda I)v = 0\}$. (So $E(\lambda, T) \ne \{0\}$ if and only if $\lambda$ is an eigenvalue of $T$.) Assume $T$ is normal. Then both $E(\lambda, T)$ and $E(\lambda, T)^{\perp}$ are invariant subspaces with respect to $T$ and $T^*$. Moreover, $E(\lambda, T) = E(\bar\lambda, T^*)$.

Proof. Assume first the non-trivial situation (and the only interesting one) in which $E(\lambda, T) \ne \{0\}$, so $\lambda$ is an eigenvalue of $T$. That $E(\lambda, T)$ is a $T$-invariant subspace of $V$ is immediate (even if $T$ is not normal); it remains to prove that it is also invariant with respect to $T^*$. For this, let $u \in E(\lambda, T)$. Then $Tu = \lambda u$; applying $T^*$ we get $T^*Tu = \lambda T^*u$; since $T^*T = TT^*$, this is equivalent to $T(T^*u) = \lambda T^*u$, thus $T^*u \in E(\lambda, T)$. This proves $E(\lambda, T)$ is $T^*$-invariant.

Assume next that $u \in E(\lambda, T)^{\perp}$. Then if $v \in E(\lambda, T)$ we have $\langle v, Tu \rangle = \langle T^*v, u \rangle = 0$, since $T^*v \in E(\lambda, T)$. This proves that $Tu \in E(\lambda, T)^{\perp}$ if $u \in E(\lambda, T)^{\perp}$; similarly (interchanging the roles of $T$ and $T^*$) one sees that $E(\lambda, T)^{\perp}$ is $T^*$-invariant.

Since $E(\lambda, T)$ is $T^*$-invariant, we can consider $T^*$ as a linear operator in $\mathcal{L}(E(\lambda, T))$. As such, $T^*$ has a (possibly complex) eigenvalue $\mu$. That is, assuming for a moment that $F = \mathbb{C}$, there is $u \in E(\lambda, T)$ such that $u \ne 0$ and $T^*u = \mu u$. Then, since $u \in E(\lambda, T)$,
$$\mu \|u\|^2 = \langle \mu u, u \rangle = \langle T^*u, u \rangle = \langle u, Tu \rangle = \langle u, \lambda u \rangle = \bar\lambda \|u\|^2.$$
Since $u \ne 0$, it follows that $\mu = \bar\lambda$. We can now consider
$$U := E\big(\bar\lambda,\, T^*|_{E(\lambda, T)}\big) = \{u \in E(\lambda, T) : T^*u = \bar\lambda u\}.$$
$T$ restricted to $E(\lambda, T)$ is still normal, and so is $T^*$. This means that the orthogonal complement of $U$ in $E(\lambda, T)$ is $T^*$-invariant by what we proved so far; if it were different from the null space, $T^*$ would have an eigenvector corresponding to some eigenvalue in that space.
But the previous argument applies to show that this eigenvalue must be $\bar\lambda$, so the eigenvector would already be in $U$, a contradiction. That is, $U = E(\lambda, T)$, proving that $E(\lambda, T) \subseteq E(\bar\lambda, T^*)$. Interchanging the roles of $T$, $T^*$ one sees that the converse inclusion also holds, thus $E(\lambda, T) = E(\bar\lambda, T^*)$. We assumed here that $F = \mathbb{C}$. If $F = \mathbb{R}$ and $\lambda \in \mathbb{R}$, we can simply complexify and work in $V_{\mathbb{C}}$: the complexification of $E(\lambda, T)$ is $E(\lambda, T_{\mathbb{C}})$, which by the complex case equals $E(\bar\lambda, (T_{\mathbb{C}})^*) = E(\lambda, (T^*)_{\mathbb{C}})$, since $\lambda$ is real and $(T_{\mathbb{C}})^* = (T^*)_{\mathbb{C}}$. The result follows.
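Theorem 4 can be seen concretely on a rotation matrix, which is normal (indeed $T^*T = I = TT^*$) with non-real eigenvalues $e^{\pm i\theta}$. The check below (my own illustration, not from the notes) verifies that each eigenvector of $T$ for $\lambda$ is an eigenvector of $T^*$ for $\bar\lambda$:

```python
import numpy as np

# Rotation by theta: a real orthogonal, hence normal, operator.
theta = 1.0
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)

lams, V = np.linalg.eig(T)           # eigenvalues e^{+i theta}, e^{-i theta}
for lam, u in zip(lams, V.T):
    assert np.allclose(T @ u, lam * u)                    # u in E(lam, T)
    assert np.allclose(T.conj().T @ u, np.conj(lam) * u)  # u in E(conj(lam), T*)
```

Note that over $\mathbb{R}$ this operator has no eigenvalues in the textbook sense; it is only after passing to $\mathbb{C}$ (here, by taking a complex dtype) that the eigenspaces $E(\lambda, T)$ and the identity $E(\lambda, T) = E(\bar\lambda, T^*)$ become visible.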
Theorem 5 (Spectral theorem for a normal operator on a complex space). Assume $V$ is a complex inner product space and let $T \in \mathcal{L}(V)$ be normal. Then $V$ has an orthonormal basis consisting of eigenvectors of $T$; hence $T$ is diagonalizable.

Proof. I will provide two proofs: a straightforward proof based on the material developed here, and a sort of cute proof from our textbook.

Proof 1. We proceed by induction on the dimension of $V$. If $\dim V = 1$, the result is obvious. Assume thus $\dim V = n > 1$, and that the result has been proved for all complex inner product vector spaces of dimension $< n$. Let $\lambda$ be an eigenvalue of $T$; as seen, all linear operators have at least one complex eigenvalue. Since $F = \mathbb{C}$ there is also an eigenvector; that is, $E(\lambda, T)$ is a non-null subspace of $V$. By Theorem 4, $E(\lambda, T)^{\perp}$ is $T$-invariant and, since $\dim E(\lambda, T) \ge 1$, we have $\dim E(\lambda, T)^{\perp} = n - \dim E(\lambda, T) < n$. Moreover, $T$ restricted to $E(\lambda, T)^{\perp}$ is still normal, since $E(\lambda, T)^{\perp}$ is invariant under both $T$ and $T^*$; by the induction hypothesis, $T$ restricted to $E(\lambda, T)^{\perp}$ has an orthonormal basis consisting of eigenvectors, say $u_1, \dots, u_m$. If we let $u_{m+1}, \dots, u_n$ be an orthonormal basis of $E(\lambda, T)$ (so it automatically consists of eigenvectors), then $u_1, \dots, u_n$ is an orthonormal basis of $V$ consisting of eigenvectors of $T$.

Proof 2. From the book. The author first uses the result that if the vector space is complex, every linear operator $T$ has a basis in which the matrix of $T$ is upper triangular. The Gram-Schmidt orthonormalization procedure preserves spanned subspaces; that is, if we orthonormalize $v_1, \dots, v_n$ to get $e_1, \dots, e_n$, then for every $k$, $1 \le k \le n$, we have $\mathrm{span}(v_1, \dots, v_k) = \mathrm{span}(e_1, \dots, e_k)$. An immediate consequence of this fact is that if there is a basis in which the matrix is upper triangular, then there is an orthonormal basis with this property. What is then proved in the textbook is that if the operator is normal, the only upper triangular matrix it can have with respect to an orthonormal basis is a diagonal one.
The key to the proof is the following result: $T$ is normal if and only if $\|Tu\| = \|T^*u\|$ for all $u$ in the complex inner product space $V$. I'll refer to the text for the simple proof. Assume $u_1, \dots, u_n$ is an orthonormal basis of $V$ with respect to which $T$ has an upper triangular matrix. This means that
$$Tu_j = \sum_{i=1}^{j} \lambda_{ij} u_i \qquad \text{for } j = 1, \dots, n.$$
In particular, $Tu_1 = \lambda_{11} u_1$, so that $u_1$ is an eigenvector of $T$. It is a simple exercise to see that the matrix of $T^*$ with respect to an orthonormal basis $v_1, \dots, v_n$ is the complex conjugate of the transpose of the matrix of $T$ with respect to the same basis; thus for $1 \le k \le n$,
$$T^*u_k = \sum_{j=k}^{n} \bar\lambda_{kj} u_j.$$
Assume it has been proved, for some $k$, $1 \le k \le n$, that $Tu_l = \lambda_{ll} u_l$ for $l = 1, \dots, k$. This has been done for $k = 1$. If it is also done for $k = n$, we have proved that we have a basis of eigenvectors, so assume $k < n$. Assuming $1 \le l \le k$, we have on the one hand
$$\|T^*u_l\|^2 = \sum_{j=l}^{n} |\lambda_{lj}|^2 = |\lambda_{ll}|^2 + \cdots + |\lambda_{ln}|^2$$
by orthogonality; on the other hand, $\|Tu_l\|^2 = |\lambda_{ll}|^2$. By normality, $\|Tu_l\|^2 = \|T^*u_l\|^2$, which implies $\lambda_{lj} = 0$ if $j > l$. Then, since $\lambda_{i,k+1} = 0$ for $i \le k$,
$$Tu_{k+1} = \sum_{i=1}^{k+1} \lambda_{i,k+1} u_i = \lambda_{k+1,k+1} u_{k+1}.$$
This concludes Proof 2.

This theorem is valid for complex vector spaces. But as an immediate corollary we have:

Theorem 6. Let $T$ be self-adjoint in the (not necessarily complex) finite dimensional inner product space $V$. Then $T$ is diagonalizable; there is an orthonormal basis of eigenvectors with respect to which $M(T) = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, where $\lambda_1, \dots, \lambda_n \in \mathbb{R}$.

Proof. If $V$ is a complex space this result is a particular case of Theorem 5. Assume $V$ is real and apply Theorem 5 to $T_{\mathbb{C}}$. Recall that all eigenvalues of a self-adjoint operator are real and, in case the operator is $T_{\mathbb{C}}$, they are eigenvalues of $T$. It remains to be seen that $V$ has an orthonormal basis of eigenvectors of $T$. This is slightly tricky. Let $v_1 + iw_1, \dots, v_n + iw_n$ be an orthonormal basis of eigenvectors of $T_{\mathbb{C}}$ in $V_{\mathbb{C}}$ and let $\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues. Then $T_{\mathbb{C}}(v_j + iw_j) = \lambda_j(v_j + iw_j)$ implies $Tv_j = \lambda_j v_j$, $Tw_j = \lambda_j w_j$ for $j = 1, \dots, n$ (because all the $\lambda_j$'s are real). We claim that the $2n$ vectors $v_1, \dots, v_n; w_1, \dots, w_n$ span $V$. In fact, if $u \in V$, there exist complex coefficients $a_j + ib_j$, $a_j, b_j \in \mathbb{R}$, so that
$$u = u + i0 = \sum_{j=1}^{n}(a_j + ib_j)(v_j + iw_j) = \sum_{j=1}^{n}(a_j v_j - b_j w_j) + i\sum_{j=1}^{n}(a_j w_j + b_j v_j).$$
It follows that $u = \sum_{j=1}^{n}(a_j v_j - b_j w_j)$, establishing the claim. Every spanning set of a vector space contains a basis; thus there is a subset of $n$ vectors from $\{v_1, \dots, v_n; w_1, \dots, w_n\}$ that is a basis. This proves that $V$ has a basis of eigenvectors of $T$. Let us assume that $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$.
For each $\lambda_k$, $1 \le k \le m$, replace all the basis elements corresponding to this eigenvalue by an orthonormal basis of their span (for instance, via Gram-Schmidt). This is still a set of eigenvectors corresponding to $\lambda_k$, spanning the same eigenspace. Because eigenvectors of a self-adjoint operator corresponding to distinct eigenvalues are orthogonal, this procedure finally produces an orthonormal basis of eigenvectors of $T$.
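Theorem 6 is exactly what numerical libraries exploit for real symmetric matrices. As a closing illustration (mine, not from the notes), the NumPy routine `eigh` returns precisely the data the theorem promises: real eigenvalues and an orthonormal basis of eigenvectors, assembled into a matrix $U$ with $TU = U\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5))
T = (A + A.T) / 2                       # a real self-adjoint (symmetric) matrix

lams, U = np.linalg.eigh(T)             # real eigenvalues, orthonormal columns
assert np.all(np.isreal(lams))                  # the spectrum is real
assert np.allclose(U.T @ U, np.eye(5))          # columns form an orthonormal basis
assert np.allclose(T @ U, U @ np.diag(lams))    # T u_j = lambda_j u_j for each j
```

Equivalently, $U^{T} T U = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, which is the matrix form of "$M(T)$ is diagonal with respect to an orthonormal basis of eigenvectors."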