Symmetric and self-adjoint matrices

A matrix $A$ in $M_n(\mathbb{F})$ is called symmetric if $A^T = A$, i.e. $A_{ij} = A_{ji}$ for each $i, j$; and self-adjoint if $A^* = A$, i.e. $A_{ij} = \overline{A_{ji}}$ for each $i, j$. Note for $A$ in $M_n(\mathbb{R})$ that $A^* = A^T$.

Notice that if $\mathbb{F} = \mathbb{R}$, then $A$ is symmetric if and only if $(Ax, y) = (x, Ay)$ for each $x, y$ in $\mathbb{R}^n$. Observe that the set
\[
\mathrm{Sym}_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) : A^T = A\}
\]
is a linear subspace of $M_n(\mathbb{R})$.

A matrix $A$ in $M_n(\mathbb{C})$ is called self-adjoint or hermitian if $A^* = A$. Notice that $A$ is hermitian if and only if $(Ax, y) = (x, Ay)$ for each $x, y$ in $\mathbb{C}^n$. Observe that the set
\[
\mathrm{Herm}(n) = \{A \in M_n(\mathbb{C}) : A^* = A\}
\]
is an $\mathbb{R}$-linear subspace of $M_n(\mathbb{C})$. Note that $\mathrm{Sym}_n(\mathbb{R}) \subseteq \mathrm{Herm}(n)$.

Consider the sets of (real) orthogonal and unitary matrices:
\[
O(n) = \{u \in M_n(\mathbb{R}) : u^T u = I\} \quad\text{and}\quad U(n) = \{u \in M_n(\mathbb{C}) : u^* u = I\}.
\]
Clearly, $O(n) \subseteq U(n)$.

Remark. It is easy to see for $u$ in $M_n(\mathbb{R})$ (respectively, in $M_n(\mathbb{C})$) that $u \in O(n)$ (respectively, $u \in U(n)$) if and only if the columns of $u$:
\[
u^{(1)} = \begin{bmatrix} u_{11} \\ \vdots \\ u_{n1} \end{bmatrix}, \; \dots, \; u^{(n)} = \begin{bmatrix} u_{1n} \\ \vdots \\ u_{nn} \end{bmatrix}
\]
form an orthonormal basis for $\mathbb{R}^n$ (respectively, $\mathbb{C}^n$).

Lemma. If $A \in M_n(\mathbb{R})$ admits a real eigenvalue, then there is a corresponding real eigenvector.

Proof. Let $\lambda$ be a real eigenvalue of $A$ and $z \neq 0$ in $\mathbb{C}^n$ be an eigenvector. Write $x_j = \mathrm{Re}\, z_j$ and $y_j = \mathrm{Im}\, z_j$, so $z = x + iy$ where $x, y \in \mathbb{R}^n$. Then
\[
\lambda x + i \lambda y = \lambda z = A z = A x + i A y.
\]
Collecting the real and imaginary parts of each component of the above equality shows that at least one of $x$ or $y$ is a non-zero real eigenvector. $\square$

Diagonalization Theorem.
(i) If $A \in \mathrm{Herm}(n)$, then the eigenvalues of $A$ are real. Furthermore, for any two distinct eigenvalues $\lambda, \mu$ of $A$ with corresponding eigenvectors $x$ and $y$ in $\mathbb{C}^n$, we have $(x, y) = 0$.
(ii) If $A \in M_n(\mathbb{R})$ (respectively, in $M_n(\mathbb{C})$), then $A \in \mathrm{Sym}_n(\mathbb{R})$ (respectively, $A \in \mathrm{Herm}(n)$) if and only if there is $u$ in $O(n)$ (respectively, $u$ in $U(n)$) for which
\[
u^* A u = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{bmatrix}
\]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ (with multiplicity).

Proof. (i) Let $\lambda$ be an eigenvalue of $A$ with corresponding eigenvector $x \neq 0$ in $\mathbb{C}^n$. Then
\[
\lambda(x, x) = (\lambda x, x) = (Ax, x) = (x, Ax) = (x, \lambda x) = \overline{\lambda}(x, x).
\]
Dividing by $(x, x) \neq 0$ we see that $\lambda = \overline{\lambda}$. If eigenvalues $\lambda \neq \mu$ of $A$ correspond to eigenvectors $x$ and $y$, then
\[
\lambda(x, y) = (Ax, y) = (x, Ay) = \mu(x, y),
\]
so $(\lambda - \mu)(x, y) = 0$.

(ii) Notice that sufficiency in both the real symmetric and hermitian cases is trivial. Let us consider necessity. Let $\mu_1, \dots, \mu_k$ denote the full set of distinct eigenvalues of $A$, with respective eigenspaces $E_j = \ker_{\mathbb{F}^n}(\mu_j I - A)$, $j = 1, \dots, k$. Since $E_i \perp E_j$ for $i \neq j$, as observed in (i), we may find an orthonormal basis
\[
u^{(1)} = \begin{bmatrix} u_{11} \\ \vdots \\ u_{n1} \end{bmatrix}, \; \dots, \; u^{(m)} = \begin{bmatrix} u_{1m} \\ \vdots \\ u_{nm} \end{bmatrix}
\]
for $E = E_1 + \dots + E_k$ such that $u^{(1)}, \dots, u^{(n_1)}$ is a basis for $E_1$, $u^{(n_1+1)}, \dots, u^{(n_1+n_2)}$ is a basis for $E_2$, \dots, and $u^{(n_1+\dots+n_{k-1}+1)}, \dots, u^{(m)}$ is a basis for $E_k$. (Here each $n_j = \dim_{\mathbb{F}} E_j$.) For $j = 1, \dots, k$, let $P_j$ in $M_n(\mathbb{F})$ be the matrix corresponding to the orthogonal projection onto $E_j$, i.e. with entries
\[
(P_j)_{ii'} = \Bigl( \sum_{l = n_1 + \dots + n_{j-1} + 1}^{n_1 + \dots + n_j} (e_{i'}, u^{(l)})\, u^{(l)}, \; e_i \Bigr) = \sum_{l = n_1 + \dots + n_{j-1} + 1}^{n_1 + \dots + n_j} (e_{i'}, u^{(l)})(u^{(l)}, e_i),
\]
which is easily seen to satisfy $P_j^* = P_j$. Let $B = A - \sum_{j=1}^k \mu_j P_j$, which is in $\mathrm{Herm}(n)$ (respectively, in $\mathrm{Sym}_n(\mathbb{R})$ if $\mathbb{F} = \mathbb{R}$) and satisfies $E \subseteq \ker_{\mathbb{F}^n} B$. If $B \neq 0$, then it admits an eigenvalue $\mu \neq 0$, so (i) provides that its corresponding eigenvector $x \neq 0$ lies in $E^\perp$. But then $\mu x = Bx = Ax$, which means that $\mu$ is one of $\mu_1, \dots, \mu_k$ above, contradicting that $x \perp E$. Hence $B = 0$. Further, if $x \perp E$, then $Ax = Bx = 0$, so $x \in E \cap E^\perp$, so $x = 0$. Thus $E = \mathbb{F}^n$, i.e. $m = n$.

Let $u$ denote the matrix with columns $u^{(1)}, \dots, u^{(n)}$, which, by the remark above, is unitary; and let $\lambda_1, \dots, \lambda_n$ be the respective eigenvalues $\mu_1$ ($n_1$ times), \dots, $\mu_k$ ($n_k$ times). Let $e_1, \dots, e_n$ denote the standard basis for $\mathbb{F}^n$. Then $u e_j = u^{(j)}$ and we have that
\[
(u^* A u\, e_j, e_i) = (A u^{(j)}, u e_i) = \lambda_j (u^{(j)}, u^{(i)}) = \lambda_j (e_j, e_i)
\]
for each $i, j = 1, \dots, n$, so $u^* A u$ admits the claimed diagonal form. $\square$

A representation of a symmetric/hermitian matrix

The proof above tells us that if $\mu_1, \dots, \mu_k$ are the distinct eigenvalues of a symmetric (respectively, hermitian) matrix $A$, and $P_1, \dots, P_k$ are the matrices representing the orthogonal projections onto the respective eigenspaces $E_1, \dots, E_k$ (which span all of $\mathbb{F}^n$), then
\[
A = \sum_{j=1}^k \mu_j P_j \quad\text{where}\quad I = \sum_{j=1}^k P_j. \tag{$*$}
\]
Since $E_i \perp E_j$ if $i \neq j$, we have $P_i P_j = 0 = P_j P_i$. Hence if $p(x) = \sum_{l=0}^d a_l x^l$ is any polynomial, we have
\[
p(A) = \sum_{l=0}^d a_l A^l = \sum_{j=1}^k p(\mu_j) P_j
\]
where $A^0 = I$, by convention.

Lemma. Given $A$ and $P_1, \dots, P_k$ as above, another matrix $B$ commutes with $A$, i.e. $[A, B] = AB - BA = 0$, if and only if $[P_j, B] = 0$ for each $j$.

Proof. Sufficiency is evident from ($*$). To see necessity, let, for each $j$,
\[
p_j(A) = \frac{(A - \mu_1 I) \cdots (A - \mu_{j-1} I)(A - \mu_{j+1} I) \cdots (A - \mu_k I)}{(\mu_j - \mu_1) \cdots (\mu_j - \mu_{j-1})(\mu_j - \mu_{j+1}) \cdots (\mu_j - \mu_k)}.
\]
Clearly $[p_j(A), B] = 0$. Let $x \in E_i = \ker_{\mathbb{F}^n}(A - \mu_i I)$. Then
\[
p_j(A)\, x = \begin{cases} 0 & \text{if } i \neq j \\ x & \text{if } i = j. \end{cases}
\]
Hence, since $\mathbb{F}^n = E_1 + \dots + E_k$, $p_j(A)^2 = p_j(A)$. Also, it is obvious that $p_j(A)^* = p_j(A)$. Thus $p_j(A) = P_j$. $\square$

Simultaneous Diagonalization Theorem. If $A, B \in \mathrm{Sym}_n(\mathbb{R})$ (respectively, in $\mathrm{Herm}(n)$) and $[A, B] = 0$, then there is $v$ in $O(n)$ (respectively, in $U(n)$) for which
\[
v^* A v = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{bmatrix}
\quad\text{and}\quad
v^* B v = \begin{bmatrix} \nu_1 & 0 & \cdots & 0 \\ 0 & \nu_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \nu_n \end{bmatrix}
\]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ and $\nu_1, \dots, \nu_n$ are the eigenvalues of $B$ (each with multiplicity).

Proof. Consider the representations of $A$ and $B$ as in ($*$):
\[
A = \sum_{j=1}^k \mu_j P_j \quad\text{and}\quad B = \sum_{i=1}^{k'} \nu_i' Q_i,
\]
where $\nu_1', \dots, \nu_{k'}'$ are the distinct eigenvalues of $B$ with respective orthogonal projections $Q_1, \dots, Q_{k'}$. Since $[A, B] = 0$, the lemma above provides that $[A, Q_i] = 0 = [P_j, B]$ for any $i, j$, and the lemma again provides that $[P_j, Q_i] = 0$ for any $i, j$. Hence each $P_j Q_i$ is self-adjoint and squares to itself, and is hence the orthogonal
projection onto $E_{ij} = \ker_{\mathbb{F}^n}(A - \mu_j I) \cap \ker_{\mathbb{F}^n}(B - \nu_i' I)$. Further, ($*$) provides that
\[
\sum_{j=1}^{k} \sum_{i=1}^{k'} P_j Q_i = \Bigl( \sum_{j=1}^{k} P_j \Bigr) \Bigl( \sum_{i=1}^{k'} Q_i \Bigr) = I, \tag{$**$}
\]
so
\[
A = AI = \sum_{j=1}^{k} \sum_{i=1}^{k'} \mu_j P_j Q_i \quad\text{and}\quad B = IB = \sum_{j=1}^{k} \sum_{i=1}^{k'} \nu_i' P_j Q_i.
\]
Take orthonormal bases for each of the non-zero spaces $E_{ij}$ and combine them into an orthonormal basis
\[
v^{(1)} = \begin{bmatrix} v_{11} \\ \vdots \\ v_{n1} \end{bmatrix}, \; \dots, \; v^{(n)} = \begin{bmatrix} v_{1n} \\ \vdots \\ v_{nn} \end{bmatrix}
\]
for $\mathbb{F}^n$ (this is possible by ($**$)). Let $v$ be the matrix with columns $v^{(1)}, \dots, v^{(n)}$, and we obtain the desired diagonal forms. $\square$

Corollary. $A$ in $M_n(\mathbb{C})$ is normal, i.e. $[A, A^*] = 0$, if and only if there is $v$ in $U(n)$ for which
\[
v^* A v = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{bmatrix}
\]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$ (with multiplicity).

Proof. Sufficiency being evident, we show only necessity. Let
\[
\mathrm{Re}\,A = \tfrac{1}{2}(A + A^*) \quad\text{and}\quad \mathrm{Im}\,A = \tfrac{1}{2i}(A - A^*),
\]
so $\mathrm{Re}\,A, \mathrm{Im}\,A \in \mathrm{Herm}(n)$ and $A = \mathrm{Re}\,A + i\,\mathrm{Im}\,A$. It is easy to verify that $A$ is normal if and only if $[\mathrm{Re}\,A, \mathrm{Im}\,A] = 0$. Hence simultaneous diagonalization, above, provides the necessary unitary diagonalizing matrix $v$. $\square$

No real analogue. The real matrix
\[
J_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\]
is normal, but admits only purely imaginary eigenvalues, and hence cannot be diagonalized by orthogonal matrices (i.e. unitary matrices with real entries).
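The theorems above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the particular matrix is an arbitrary illustrative choice, and `eigh` is NumPy's eigensolver for hermitian matrices.

```python
import numpy as np

# Build an arbitrary hermitian matrix A = (M + M*)/2, so A* = A.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2

# Diagonalization Theorem: eigh returns real eigenvalues and a unitary u.
lam, u = np.linalg.eigh(A)

assert np.allclose(A, A.conj().T)                     # A is hermitian
assert np.allclose(lam.imag, 0)                       # eigenvalues are real
assert np.allclose(u.conj().T @ u, np.eye(4))         # u is in U(4)
assert np.allclose(u.conj().T @ A @ u, np.diag(lam))  # u*Au is diagonal

# "No real analogue": J2 is normal, yet its eigenvalues are +-i,
# so no real orthogonal matrix can diagonalize it.
J2 = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.allclose(J2 @ J2.T, J2.T @ J2)              # J2 is normal
assert np.allclose(sorted(np.linalg.eigvals(J2).imag), [-1.0, 1.0])
```

The same computation with `np.linalg.eigh` applied to a real symmetric matrix produces a real orthogonal $u$, matching the $\mathbb{F} = \mathbb{R}$ case of the theorem.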
Real skew-symmetric matrices

A matrix $B$ in $M_n(\mathbb{R})$ is skew-symmetric if $B^T = -B$.

Real Skew-symmetric Block Diagonalization Theorem. If $B^T = -B$ in $M_n(\mathbb{R})$, then there are $u$ in $O(n)$ and $\lambda_1, \dots, \lambda_m > 0$ in $\mathbb{R}$ such that
\[
u^T B u = \begin{bmatrix} \lambda_1 J_2 & & & \\ & \ddots & & \\ & & \lambda_m J_2 & \\ & & & 0 \end{bmatrix}
\quad\text{where}\quad
J_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix};
\]
in particular, $2m \leq n$.

Proof. First, notice that $iB \in \mathrm{Herm}(n)$ and hence has real eigenvalues, so $B$ has purely imaginary eigenvalues (including, possibly, $0$). In particular, the only real eigenvectors may be in $\ker B$.

Consider $B^T B$, which is in $\mathrm{Sym}_n(\mathbb{R})$. Any eigenvalue $\mu$ of $B^T B$ with eigenvector $x$ in $\mathbb{R}^n \setminus \{0\}$ satisfies
\[
\mu(x, x) = (B^T B x, x) = (Bx, Bx) \geq 0,
\]
so $\mu \geq 0$. Let $\mu_1, \dots, \mu_l$ denote the distinct non-zero eigenvalues of $B^T B$ and $E_j$ the eigenspace of $\mu_j$, so $E_i \perp E_j$ for $i \neq j$ and each $E_j \perp \ker B^T B$. Let $E = E_1 + \dots + E_l$.

Let $x \in E_j \setminus \{0\}$ and $V_x = \mathbb{R}x + \mathbb{R}Bx$. Then
\[
B(Bx) = B^2 x = -B^T B x = -\mu_j x \in V_x,
\]
and it follows that $V_x$ is $B$-invariant, i.e. $B V_x \subseteq V_x$. Furthermore, $\dim_{\mathbb{R}} V_x = 2$, as $B$ admits no non-zero real eigenvalues and $\ker B = \ker B^T B$. Now if $y \in E_j \setminus \{0\}$ with $y \perp V_x$, then
\[
(By, x) = (y, B^T x) = -(y, Bx) = 0
\]
and
\[
(By, Bx) = (y, B^T B x) = \mu_j (y, x) = 0,
\]
so $V_y \perp V_x$. Hence we may build an orthonormal basis $u^{(j)}_1, \dots, u^{(j)}_{l_j}$ for $E_j$ such that $u^{(j)}_{2i} \in V_{u^{(j)}_{2i-1}}$ for $i = 1, \dots, l_j/2$, and $V_{u^{(j)}_{2i+1}} \perp V_{u^{(j)}_{2i-1}}$; in particular, $l_j$ is even.

Putting everything together, we have that $\dim E = \sum_{j=1}^l \dim E_j$ is even, and we can find an orthonormal basis $u_1, \dots, u_{2m}, u_{2m+1}, \dots, u_n$ for $\mathbb{R}^n$ for which the spaces $V_j = V_{u_{2j-1}}$ are pairwise orthogonal and span $E$, and $u_{2m+1}, \dots, u_n \in \ker B$. Letting $u$ be the matrix whose columns are $u_1, \dots, u_n$, we find that
\[
u^T B u = \begin{bmatrix} B_1 & & & \\ & \ddots & & \\ & & B_m & \\ & & & 0 \end{bmatrix}
\]
where each $B_j \in M_2(\mathbb{R})$. Since $(u^T B u)^T = u^T B^T u = -u^T B u$, we find that each block must have the form $B_j = \lambda_j J_2$ with $\lambda_j$ in $\mathbb{R} \setminus \{0\}$, and by applying block permutations (and, if necessary, swapping the two basis vectors within a block, which changes the sign of $\lambda_j$) we may assume $\lambda_j > 0$. (One may further check that the values $\lambda_1^2, \dots, \lambda_m^2$ are the values $\mu_1, \dots, \mu_l$ with multiplicities.) $\square$

Remark. The complex analogue of this result is much easier. If $B \in M_n(\mathbb{C})$ with $B^* = -B$, we call $B$ skew-hermitian. Notice that $iB \in \mathrm{Herm}(n)$ and is hence unitarily diagonalizable with real eigenvalues, so $B$ too is unitarily diagonalizable, but with purely imaginary eigenvalues.

Written by Nico Spronk, for use by students of PMath 763 at University of Waterloo
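Appendix: a numerical check of the skew-symmetric facts above. This is a minimal sketch, assuming NumPy is available; the $5 \times 5$ size is an arbitrary illustrative choice.

```python
import numpy as np

# Build an arbitrary real skew-symmetric matrix B = (M - M^T)/2, so B^T = -B.
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
B = (M - M.T) / 2

assert np.allclose(B.T, -B)

# iB is hermitian, so B has purely imaginary eigenvalues.
iB = 1j * B
assert np.allclose(iB, iB.conj().T)
assert np.allclose(np.linalg.eigvals(B).real, 0)

# B^T B is symmetric with non-negative eigenvalues (mu >= 0 in the proof).
assert np.all(np.linalg.eigvalsh(B.T @ B) >= -1e-9)

# Since n = 5 is odd and 2m <= n, ker B is non-trivial: rank(B) < 5.
assert np.linalg.matrix_rank(B) < 5
```

The block-diagonal form itself can be computed in practice via the real Schur decomposition of a skew-symmetric matrix (e.g. `scipy.linalg.schur` with real output), which for this class of matrices yields exactly the $\lambda_j J_2$ blocks of the theorem.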