Chapter 0 Preliminaries

Objective of the course
Introduce basic matrix results and techniques that are useful in theory and applications. It is important to know many results, but there is no way I can tell you all of them. I would like to introduce techniques so that you can obtain the results you want. This follows the standard teaching philosophy: it is better to teach students how to fish than to just give them fish.

Assumption
You are familiar with:
- Linear equations, solution sets, elementary row operations.
- Matrices, column space, row space, null space, rank.
- Determinant, eigenvalues, eigenvectors, diagonal form.
- Vector spaces, basis, change of bases.
- Linear transformations, range space, kernel.
- Inner product, Gram-Schmidt process for real matrices.
- Basic operations of complex numbers.
- Block matrices.

Notation
- M_n(F) and M_{m,n}(F) are the sets of n x n and m x n matrices over F = R or C (or a more general field). F^n is the set of column vectors of length n with entries in F. Sometimes we write M_n, M_{m,n} if F = C.
- If A = (A_1 0; 0 A_2) ∈ M_{m+n}(F) with A_1 ∈ M_m(F) and A_2 ∈ M_n(F), we write A = A_1 ⊕ A_2.
- x^t, A^t denote the transpose of a vector x and a matrix A. For a complex matrix A, Ā denotes the matrix obtained from A by replacing each entry by its complex conjugate. Furthermore, A* = (Ā)^t.

More results and examples on block matrix multiplications.
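The block (direct sum) notation above is easy to experiment with numerically. The following sketch is not part of the notes; it assumes NumPy and SciPy are available and simply checks the blockwise multiplication rule (A_1 ⊕ A_2)(B_1 ⊕ B_2) = (A_1 B_1) ⊕ (A_2 B_2) on small hand-picked matrices.

```python
import numpy as np
from scipy.linalg import block_diag

A1 = np.array([[1.0, 2.0],
               [3.0, 4.0]])
A2 = np.array([[5.0]])
B1 = np.array([[1.0, 0.0],
               [1.0, 1.0]])
B2 = np.array([[2.0]])

A = block_diag(A1, A2)   # A = A_1 (+) A_2
B = block_diag(B1, B2)   # B = B_1 (+) B_2

# blockwise multiplication: (A_1 (+) A_2)(B_1 (+) B_2) = (A_1 B_1) (+) (A_2 B_2)
assert np.allclose(A @ B, block_diag(A1 @ B1, A2 @ B2))
```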

1 Similarity, Jordan form, and applications 1.1 Similarity Definition 1.1.1 Two matrices A, B M n are similar if there is an invertible S M n such that S 1 AS = B, equivalently, A = T 1 BT with T = S 1. A matrix A is diagonalizable if A is similar to a diagonal matrix. Remark 1.1.2 If A and B are similar, then det(a) = det(b) and det(zi A) = det(zi B). Furthermore, if S 1 AS = B, then Bx = µx if and only if A(Sx) = µ(sx). Theorem 1.1.3 Let A M n (F). Then A is diagonalizable over F if and only if it has n linearly independent eigenvectors in F n. Proof. An invertible matrix S M n satisfies S 1 AS = D = diag (λ 1,..., λ n ) if and only if AS = SD, i.e., the columns of S equal to x 1,..., x n. Remark 1.1.4 Suppose A = SDS 1 with D = diag (λ 1,..., λ n ). Let S have columns x 1,..., x n and S 1 have rows y t 1,..., yt j. Then Ax j = λ j x j and y t ja = λ j y t j, j = 1,..., n, and n A = λ j x j yj. t Thus, x 1,..., x n are the (right) eigenvectors of A, and y t 1,..., yt n are the left eigenvectors of A. Remark 1.1.5 For a real matrix A M n (R), we may not be able to find real eigenvalues, and we may not be able to find n linearly independent eigenvectors even if det(zi A) = 0 has only real roots. Example 1.1.6 Consider A = ( ) 0 1 and B = 1 0 B does not have two linearly independent eigenvectors. ( ) 0 1. Then A has no real eigenvalues, and 0 0 Remark 1.1.7 By the fundamental theorem of algebra, every complex polynomial can be written as a product of linear factors. Consequently, for every A M n (C), the characteristic polynomial has linear factors: det(zi A) = (z λ 1 ) (z λ n ). So, we needs only to find n linearly independent eigenvectors in order to show that A is diagonalizable. 2
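A quick numerical illustration of Theorem 1.1.3 and Remark 1.1.4 (an informal sketch, assuming NumPy; the 2 x 2 matrix is an arbitrary example): the columns of S are right eigenvectors, the rows of S^{-1} are left eigenvectors, and A expands as a sum of rank-one terms.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # eigenvalues 5 and 2

lam, S = np.linalg.eig(A)            # columns of S: right eigenvectors x_j
S_inv = np.linalg.inv(S)             # rows of S^{-1}: left eigenvectors y_j^t

assert np.allclose(S_inv @ A @ S, np.diag(lam))      # S^{-1} A S = diag(lambda_j)

# A = sum_j lambda_j x_j y_j^t
A_rebuilt = sum(lam[j] * np.outer(S[:, j], S_inv[j, :]) for j in range(2))
assert np.allclose(A, A_rebuilt)

for j in range(2):                   # y_j^t A = lambda_j y_j^t
    assert np.allclose(S_inv[j, :] @ A, lam[j] * S_inv[j, :])
```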

1.2 Triangular form and Jordan form Question What simple form can we get by similarity if A M n is not diagonalizable? Theorem 1.2.1 Let A M n be such that det(zi A) = (z λ 1 ) (z λ n ). For any permutation (j 1,..., j n ) of (1,..., n), A is similar to a matrix in (upper) triangular form with diagonal entries λ j1,..., λ jn. Proof. By induction. The result clearly holds if n = 1. Suppose Ax 1 = λ i1 x 1. Let S 2 be ( ) ( ) λi1 invertible with first column equal to x 1. Then AS 1 = S 1. Thus, S 1 0 A 1 AS λi1 1 =. 1 0 A 1 Note that the eigenvalues of A 1 are λ i2,..., λ in. By induction assumption, S 1 2 A 1S 2 is in upper triangular form with eigenvalues λ i2,..., λ in. Let S = S 1 ([1] S 2 ). Then S 1 AS has the required form. Note that we can apply the argument to A t to get an invertible T such that T 1 A t T is in upper triangular form. So, S 1 AS is in lower triangular form for S = T 1. Lemma 1.2.2 Suppose A M m, B M n have no common eigenvalues. Then for any C M m,n there is X M m,n such that AX XB = C. Proof. Suppose S 1 AS = à is in upper triangular form, and T 1 BT = B is in lower triangular form. Multiplying the equation AX XB = C on left and right by S and T 1, we get ÃY Y B = C with Y = SXT 1 and C = SY T 1. Once solve ÃY Y B = C, we get the solution X = S 1 Y T for the original problem. Now, for each column col j ( C) for j = 1,..., n, we have col j (Y ) n i=1 B ij col i (Y ) = col j ( C). We get a system of linear equation with coefficient matrix A I n à B I m = B 11 I m... b n1 I m......... A. B n1 I.. m Bnn I m Because B is in lower triangular form, and the upper triangular matrix à = S 1 AS and the lower triangular matrix B = T 1 BT have no common eigenvalues, the coefficient matrix is in upper triangular form with nonzero diagonal entries. So, the equation ÃY Y B = C always has a unique solution. ( ) A11 A Theorem 1.2.3 Suppose A = 12 M 0 A n such that A 11 M k, A 22 M n k have no 22 common eigenvalue. Then A is similar to A 11 A 22. 3

Proof. By the previous lemma, there is X such that A_11 X + A_12 = X A_22. Let S = (I_k X; 0 I_{n-k}), so that AS = S(A_11 ⊕ A_22). The result follows.

Corollary 1.2.4 Suppose A ∈ M_n has distinct eigenvalues λ_1, ..., λ_k. Then A is similar to A_11 ⊕ ⋯ ⊕ A_kk such that A_jj has (only one distinct) eigenvalue λ_j for j = 1, ..., k.

Definition 1.2.5 Let J_k(λ) ∈ M_k be the matrix with all diagonal entries equal to λ, all entries on the first superdiagonal equal to 1, and all other entries 0. Then J_k(λ) is called a (an upper triangular) Jordan block of λ of size k.

Theorem 1.2.6 Every A ∈ M_n is similar to a direct sum of Jordan blocks.

Proof. We may assume that A = A_11 ⊕ ⋯ ⊕ A_kk as in Corollary 1.2.4. If we can find invertible matrices S_1, ..., S_k such that S_i^{-1} A_ii S_i is in Jordan form, then S^{-1}AS is in Jordan form for S = S_1 ⊕ ⋯ ⊕ S_k. Focus on T = A_ii − λ_i I_{n_i}, which is nilpotent; if S_i^{-1} T S_i is in Jordan form, then so is S_i^{-1} A_ii S_i. Now, we use a proof of Mark Wildon. https://www.math.vt.edu/people/renardym/class home/jordan.pdf

Note: The vectors u_1, Tu_1, ..., T^{n-1}u_1 in that proof form what is called a Jordan chain.

Example 1.2.7 Let
T =
[0 0 1 2]
[0 0 3 4]
[0 0 0 0]
[0 0 0 0].
Then Te_1 = 0, Te_2 = 0, Te_3 = e_1 + 3e_2, Te_4 = 2e_1 + 4e_2. So, T(V) = span{e_1, e_2}. Now, Te_1 = Te_2 = 0, so e_1, e_2 form a Jordan basis for T(V). Solving for u_1, u_2 such that T(u_1) = e_1 and T(u_2) = e_2, we get u_1 = −2e_3 + 3e_4/2 and u_2 = e_3 − e_4/2. Thus, TS = S(J_2(0) ⊕ J_2(0)) with
S =
[1 0 0 0]
[0 0 1 0]
[0 −2 0 1]
[0 3/2 0 −1/2].

Example 1.2.8 Let
T =
[0 1 2]
[0 0 3]
[0 0 0].
Then Te_1 = 0, Te_2 = e_1, Te_3 = 2e_1 + 3e_2. So, T(V) = span{e_1, e_2}, and e_2, Te_2 = e_1 form a Jordan basis for T(V). Solving for u_1 such that T(u_1) = e_2, we

have u_1 = (−2e_2 + e_3)/3. Thus, TS = S J_3(0) with
S =
[1 0 0]
[0 1 −2/3]
[0 0 1/3].

Remark 1.2.9 The Jordan form of A can be determined by the rank/nullity of (A − λ_j I)^k for k = 1, 2, .... So, the Jordan blocks are uniquely determined. For an eigenvalue λ, let l_i be the dimension of ker((A − λI)^i). Then there are l_1 Jordan blocks of λ, and there are l_i − l_{i−1} Jordan blocks of λ of size at least i.

Example 1.2.10 Suppose A ∈ M_9 has distinct eigenvalues λ_1, λ_2, λ_3 such that A − λ_1 I has rank 8, A − λ_2 I has rank 7, (A − λ_2 I)^2 and (A − λ_2 I)^3 have rank 5, A − λ_3 I has rank 6, and (A − λ_3 I)^2 and (A − λ_3 I)^3 have rank 5. Then the Jordan form of A is J_1(λ_1) ⊕ J_2(λ_2) ⊕ J_2(λ_2) ⊕ J_1(λ_3) ⊕ J_1(λ_3) ⊕ J_2(λ_3).
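Remark 1.2.9 can be checked in exact arithmetic. The sketch below is not from the notes; it assumes SymPy is available (floating-point rank computations are unreliable for this purpose) and recovers the block structure of the nilpotent matrix from Example 1.2.7.

```python
import sympy as sp

T = sp.Matrix([[0, 0, 1, 2],
               [0, 0, 3, 4],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])        # Example 1.2.7; the only eigenvalue is 0

n = T.shape[0]
l = [n - (T**i).rank() for i in range(1, n + 1)]   # l_i = dim ker(T^i)
print(l)                              # [2, 4, 4, 4]

# l_1 = 2 blocks in total; l_2 - l_1 = 2 blocks of size >= 2; l_3 - l_2 = 0
# blocks of size >= 3, so the Jordan form is J_2(0) (+) J_2(0).
P, J = T.jordan_form()                # SymPy: T = P * J * P**(-1)
print(J)
```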

1.3 Implications of the Jordan form

Theorem 1.3.1 Two matrices are similar if and only if they have the same Jordan form.

Proof. If A and B have Jordan form J, then S^{-1}AS = J = T^{-1}BT for some invertible S, T, so that R^{-1}AR = B with R = ST^{-1}. Conversely, if S^{-1}AS = B, then rank(A − µI)^l = rank(B − µI)^l for every eigenvalue µ of A or B and every positive integer l. So, A and B have the same Jordan form.

Remark 1.3.2 If A = S(J_1 ⊕ ⋯ ⊕ J_k)S^{-1}, then A^m = S(J_1^m ⊕ ⋯ ⊕ J_k^m)S^{-1}.

Theorem 1.3.3 Let J_k(λ) = λI_k + N_k, where N_k = Σ_{j=1}^{k-1} E_{j,j+1}. Then J_k(λ)^m = Σ_{j=0}^{m} (m choose j) λ^{m-j} N_k^j, where N_k^0 = I_k, N_k^j = 0 for j ≥ k, and N_k^j has ones on the jth superdiagonal (entries with indices (l, l+j)) and zeros elsewhere.

For every polynomial function f(z) = a_m z^m + ⋯ + a_0, let f(A) = a_m A^m + ⋯ + a_0 I_n for A ∈ M_n.

Theorem 1.3.4 (Cayley-Hamilton) Let A ∈ M_n and f(z) = det(zI − A). Then f(A) = 0.

Proof. Let f(z) = det(zI − A) = (z − λ_1)^{n_1} ⋯ (z − λ_k)^{n_k} with n_1 + ⋯ + n_k = n. Then there is an invertible S such that S^{-1}AS = A_11 ⊕ ⋯ ⊕ A_kk, where A_jj ∈ M_{n_j} is a direct sum of Jordan blocks of λ_j. Suppose the maximum size of the Jordan blocks in A_jj is r_j; then r_j ≤ n_j for j = 1, ..., k. As a result,

f(A_jj) = (A_jj − λ_1 I)^{n_1} ⋯ (A_jj − λ_k I)^{n_k} = 0,

since (A_jj − λ_j I)^{n_j} = 0. Thus,

f(A) = f(S(A_11 ⊕ ⋯ ⊕ A_kk)S^{-1}) = S[f(A_11) ⊕ ⋯ ⊕ f(A_kk)]S^{-1} = 0_n.

Definition 1.3.5 Let A ∈ M_n. Then there is a unique monic polynomial m_A(z) = z^m + a_1 z^{m-1} + ⋯ + a_m of minimum degree such that m_A(A) = 0. It is the minimal polynomial of A.

Theorem 1.3.6 A polynomial g(z) satisfies g(A) = 0 if and only if it is a multiple of the minimal polynomial of A.

Proof. If g(z) = m_A(z)q(z), then g(A) = m_A(A)q(A) = 0. To prove the converse, note that by the Euclidean algorithm, g(z) = m_A(z)q(z) + r(z) with deg r(z) < deg m_A(z). If 0 = g(A) = m_A(A)q(A) + r(A), then r(A) = 0. If r(z) is not zero, then there is a nonzero µ ∈ C such that µr(z) is a monic polynomial with µr(A) = 0 and degree less than that of m_A(z), which is impossible. So, r(z) = 0, i.e., g(z) is a multiple of m_A(z).

We can actually determine the minimal polynomial of A ∈ M_n using its Jordan form.

Theorem 1.3.7 Suppose A has distinct eigenvalues λ_1, ..., λ_k such that r_j is the maximum size of the Jordan blocks of λ_j for j = 1, ..., k. Then m_A(z) = (z − λ_1)^{r_1} ⋯ (z − λ_k)^{r_k}.

Proof. Let p(z) = (z − λ_1)^{r_1} ⋯ (z − λ_k)^{r_k}. Following the proof of the Cayley-Hamilton theorem, we see that p(A) = 0_n, so p(z) is a multiple of m_A(z) by the last theorem. On the other hand, m_A(J_{r_j}(λ_j)) = 0 forces (z − λ_j)^{r_j} to divide m_A(z) for each j. Hence m_A(z) = p(z).

Remark 1.3.8 For any polynomial g(z), the Jordan form of g(A) can be determined in terms of the Jordan form of A. In particular, for every Jordan block J_k(λ), we can write g(z) = (z − λ)^k q(z) + r(z) with r(z) = a_0 + ⋯ + a_{k−1} z^{k−1}, so that g(J_k(λ)) = r(J_k(λ)).

1.4 Final remarks

If A ∈ M_n(R) has only real eigenvalues, then one can find a real invertible matrix S such that S^{-1}AS is in Jordan form.

If A ∈ M_n(R), then there is a real invertible matrix S such that S^{-1}AS is a direct sum of real Jordan blocks and 2k × 2k generalized Jordan blocks of the form (C_ij)_{1≤i,j≤k} with C_11 = ⋯ = C_kk = (µ_1 µ_2; −µ_2 µ_1), C_12 = ⋯ = C_{k−1,k} = I_2, and all other blocks equal to 0_2. The proof can be done in the following two steps. First, find the Jordan form of A; then group J_k(λ) and J_k(λ̄) together for each pair of non-real conjugate eigenvalues, and find a complex S such that S^{-1}AS is a direct sum of the above form. Second, if S = S_1 + iS_2 for some real matrices S_1, S_2, show that there is Ŝ = S_1 + rS_2 for some real number r such that Ŝ is invertible, so that Ŝ^{-1}AŜ has the desired form.

Another proof of the Jordan canonical form theorem is to use λ-matrices. Let A ∈ M_n. Then one can apply elementary row and column reductions using elementary matrices in {(f_ij(λ)) : f_ij(λ) is a polynomial in λ} to get the Smith normal form diag(d_1(λ), ..., d_k(λ), 0, ..., 0)

for some monic polynomials d_j(λ) with d_j(λ) dividing d_{j+1}(λ) for j = 1, ..., k−1. Furthermore, the following are true. Two matrices A, B ∈ M_n are similar if and only if there are λ-matrices P(λ), Q(λ) such that P(λ)(λI − A) = (λI − B)Q(λ). Setting λ = 0, one has P_0 A = B Q_0 and P_0 Q_0 = I. For every A ∈ M_n, there is B in Jordan form so that P_0 A = B Q_0.
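The Cayley-Hamilton theorem (Theorem 1.3.4) and the description of the minimal polynomial (Theorem 1.3.7) can be spot-checked numerically. The following is an illustrative sketch only (it assumes NumPy and SciPy; the example matrix J is chosen by hand to have a known Jordan structure).

```python
import numpy as np
from scipy.linalg import block_diag

# J = J_2(3) (+) J_1(3) (+) J_1(5): the largest Jordan block of 3 has size 2 and
# of 5 has size 1, so m_J(z) = (z-3)^2 (z-5), while det(zI - J) = (z-3)^3 (z-5).
J = block_diag(np.array([[3.0, 1.0], [0.0, 3.0]]),
               np.array([[3.0]]),
               np.array([[5.0]]))
I = np.eye(4)

# Cayley-Hamilton: evaluate the characteristic polynomial at J (Horner's rule).
coeffs = np.poly(J)                  # coefficients of det(zI - J), leading term first
fJ = np.zeros_like(J)
for c in coeffs:
    fJ = fJ @ J + c * I
print(np.allclose(fJ, 0))            # True (up to rounding)

# Minimal polynomial: (J-3I)^2 (J-5I) vanishes; the degree-2 candidate does not.
print(np.allclose((J - 3*I) @ (J - 3*I) @ (J - 5*I), 0))   # True
print(np.allclose((J - 3*I) @ (J - 5*I), 0))               # False
```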

2 Unitary similarity, unitary equivalence, and consequences

2.1 Unitary space and unitary matrices

Definition 2.1.1 Let x = (x_1, ..., x_n)^t, y = (y_1, ..., y_n)^t ∈ C^n. Define the inner product of x and y by ⟨x, y⟩ = Σ_{j=1}^n x_j ȳ_j. The vectors are orthogonal if ⟨x, y⟩ = 0. A set {v_1, ..., v_k} is orthonormal if ⟨v_i, v_j⟩ = δ_ij. A matrix U ∈ M_n with orthonormal columns is a unitary matrix, i.e., U*U = I_n. A basis for C^n is an orthonormal basis if it consists of orthonormal vectors.

Some basic properties. For any a, b ∈ C and x, y, z ∈ C^n:
- ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩;
- ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩ and ⟨x, ay + bz⟩ = ā⟨x, y⟩ + b̄⟨x, z⟩;
- ⟨x, x⟩ = Σ_j |x_j|^2 if x = (x_1, ..., x_n)^t;
- ⟨x, x⟩ ≥ 0, where equality holds if and only if x = 0.

Theorem 2.1.2 (a) Orthonormal sets in C^n are linearly independent. If {u_1, ..., u_n} is an orthonormal basis for C^n, then for any v ∈ C^n, v = c_1 u_1 + ⋯ + c_n u_n with c_j = ⟨v, u_j⟩ for j = 1, ..., n.
(b) [Gram-Schmidt process] Suppose A ∈ M_{n,m} has linearly independent columns. Then A = VR such that V ∈ M_{n,m} has orthonormal columns, and R ∈ M_m is upper triangular.
(c) Every linearly independent (orthonormal) subset of C^n can be extended to a (an orthonormal) basis.

Proof. (a) Suppose {u_1, ..., u_k} is an orthonormal set and c_1 u_1 + ⋯ + c_k u_k = 0. Then 0 = ⟨Σ_j c_j u_j, u_i⟩ = c_i for all i = 1, ..., k. For the second statement, let k = n and v = c_1 u_1 + ⋯ + c_n u_n; then c_i = ⟨v, u_i⟩ for all i = 1, ..., n.
(b) Suppose A has columns a_1, ..., a_m. Let
u_1 = a_1, u_2 = a_2 − (⟨a_2, u_1⟩/⟨u_1, u_1⟩) u_1, u_3 = a_3 − (⟨a_3, u_1⟩/⟨u_1, u_1⟩) u_1 − (⟨a_3, u_2⟩/⟨u_2, u_2⟩) u_2, ...
Then set v_j = u_j/‖u_j‖ for j = 1, ..., m, and let V be the n × m matrix with columns v_1, ..., v_m. We have A = VR, where R = V*A is upper triangular.
(c) Let {v_1, ..., v_k} be linearly independent. Take the pivot columns of [v_1 ⋯ v_k e_1 ⋯ e_n] to form an invertible matrix, and then apply the Gram-Schmidt process.
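Theorem 2.1.2(b) translates directly into code. Below is a minimal classical Gram-Schmidt sketch (not from the notes; it assumes NumPy, and in floating-point practice one would prefer np.linalg.qr or modified Gram-Schmidt for stability).

```python
import numpy as np

def gram_schmidt(A):
    """Return V, R with A = V R, V having orthonormal columns, R upper triangular."""
    n, m = A.shape
    V = np.zeros((n, m), dtype=complex)
    for j in range(m):
        u = A[:, j].astype(complex)
        for i in range(j):
            # subtract the component of a_j along v_i; <a_j, v_i> = sum_k a_j[k] * conj(v_i[k])
            u = u - np.vdot(V[:, i], A[:, j]) * V[:, i]
        V[:, j] = u / np.linalg.norm(u)
    R = V.conj().T @ A                # upper triangular, since <a_j, v_i> = 0 for i > j
    return V, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
V, R = gram_schmidt(A)
assert np.allclose(V.conj().T @ V, np.eye(2))    # orthonormal columns
assert np.allclose(V @ R, A)                     # A = V R
assert np.allclose(np.tril(R, -1), 0)            # R is upper triangular
```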

2.2 Schur Triangularization Theorem 2.2.1 For every matrix A M n with eigenvalues µ 1,..., µ n (listed with multipicities),there is a unitary U such that U AU is in upper (or lower) triangular form with diagonal entries µ 1,..., µ n. Proof. Similar to that of Theorem 1.2.1. Definition 2.2.2 Let A M n. The matrix A is normal if AA = A A, the matrix A is Hermitian if A = A, the matrix A is positive semidefinite if x Ax 0 for all x C n, the matrix A the matrix A is positive definite if x Ax > 0 for all nonzero x C n. Theorem 2.2.3 Let A M n. (a) The matrix A is normal if and only if it is unitarily diagonalizable. (b) The matrix A is unitary if and only if it is unitarily similar to a diagonal matrix with all eigenvalues having modulus 1. Proof. (a) If U AU = D, i.e., A = UDU for some unitary U M n. Then AA = UDU UD U = UDD U = UD DU = UD U UDU = A A. Conversely, suppose U AU = (a ij ) = à is in upper triangular form. If AA = A A, then Ãà = à à so that the (1, 1) entries of the matrices on both sides are the same. Thus, a 11 2 + + a 1n 2 = a 11 2 implying that à = [a 11] A 1, where A 1 M n 1 is in upper triangular form. Now, [ a 11 2 ] A 1 A 1 = Ãà = à à = [ a 11 2 ] A 1A 1. Consider the (1, 1) entries of A 1 A 1 and A 1 A 1, we see that all the off-diagonal entries in the second row of A 1 are zero. Repeating this process, we see that à = diag (a 11,..., a nn ). (b) If U AU = D = diag (λ 1,..., λ n ) with λ 1 = = λ n = 1, then A is unitary because AA = UDU UD U = U(DD )U = UU = I n. Conversely, if AA = A A = I n, then U AU = D = diag (λ 1,..., λ n ) for some unitary U M n. Thus, I = U IU = U AUU A U = DD. Thus, λ 1 = = λ n = 1. Theorem 2.2.4 Let A M n. The following are equivalent. (a) A is Hermitian. (b) A is unitarily similar to a real diagonal matrix. 10

(c) x*Ax ∈ R for all x ∈ C^n.

Proof. Suppose (a) holds. Then AA* = A^2 = A*A, so that U*AU = D = diag(λ_1, ..., λ_n) for some unitary U ∈ M_n. Now, D* = (U*AU)* = U*A*U = U*AU = D. So, λ_1, ..., λ_n ∈ R. Thus (b) holds.

Suppose (b) holds and A = U*DU such that U is unitary and D = diag(d_1, ..., d_n) with real entries. Then for any x ∈ C^n, we can set Ux = (y_1, ..., y_n)^t so that x*Ax = x*U*DUx = Σ_{j=1}^n d_j |y_j|^2 ∈ R. Thus (c) holds.

Suppose (c) holds. Let A = H + iG with H = (A + A*)/2 and G = (A − A*)/(2i). Then H* = H and G* = G. For any x ∈ C^n, x*Hx = µ_1 ∈ R and x*Gx = µ_2 ∈ R, so that x*Ax = µ_1 + iµ_2. If G is nonzero, then V*GV = diag(λ_1, ..., λ_n) for some unitary V with λ_1 ≠ 0. Suppose x is the first column of V. Then x*Ax = x*Hx + ix*Gx = µ_1 + iλ_1 ∉ R, which is a contradiction. So, we have G = 0 and A = H is Hermitian.

Theorem 2.2.5 Let A ∈ M_n. The following are equivalent.
(a) A is positive semidefinite.
(b) A is unitarily similar to a real diagonal matrix with nonnegative diagonal entries.
(c) A = B*B for some B ∈ M_n. (We can choose B so that B = B*.)

Proof. Suppose (a) holds. Then x*Ax ≥ 0 for all x ∈ C^n. Thus, there is a unitary U ∈ M_n such that U*AU = diag(λ_1, ..., λ_n) with λ_1, ..., λ_n ∈ R. If some λ_j < 0, we can let x be the jth column of U, so that x*Ax = λ_j < 0, which is a contradiction. So, λ_1, ..., λ_n ≥ 0.

Suppose (b) holds. Then U*AU = D such that D has nonnegative entries. We have A = B*B with B = UD^{1/2}U* = B*. Hence condition (c) holds.

Suppose (c) holds. Then for any x ∈ C^n, x*Ax = (Bx)*(Bx) ≥ 0. Thus, (a) holds.

For any A ∈ M_n we can write A = H + iG with H = (A + A*)/2 and G = (A − A*)/(2i). This is known as the Hermitian or Cartesian decomposition.

2.3 Specht's theorem and Commuting families

There is no easy canonical form under unitary similarity.[1] How can we determine whether two matrices are unitarily similar?

Definition 2.3.1 Let {X, Y} ⊆ M_n. A word W(X, Y) in X and Y of length m is a product of m matrices chosen from {X, Y} (with repetition).

[1] Helene Shapiro, A survey of canonical forms and invariants for unitary similarity, Linear Algebra Appl. 147 (1991), 101-167.
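Before turning to Specht's theorem, here is a small numerical illustration of the Section 2.2 results above (a sketch only, assuming NumPy and SciPy; the matrices are random examples): the complex Schur form of Theorem 2.2.1, the characterization of normal matrices in Theorem 2.2.3(a), and the B*B factorization of Theorem 2.2.5.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Schur triangularization: A = U T U* with U unitary and T upper triangular.
T, U = schur(A, output='complex')
assert np.allclose(U @ T @ U.conj().T, A)
assert np.allclose(np.tril(T, -1), 0)

# A normal matrix has a diagonal triangular factor (Theorem 2.2.3(a)).
N = A @ A.conj().T                       # AA* is Hermitian, hence normal
T_N, U_N = schur(N, output='complex')
assert np.allclose(T_N, np.diag(np.diag(T_N)))

# N is also positive semidefinite: a Hermitian square root B gives N = B*B = B^2.
w, V = np.linalg.eigh(N)
assert np.all(w >= -1e-10)
B = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.conj().T
assert np.allclose(B.conj().T @ B, N)
```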

Theorem 2.3.2 Let A, B M n. (a) If A and B are unitarily similar, then tr (W (A, A )) = tr (W (B, B )) for all words W (X, Y ). (b) tr (W (A, A )) = tr (W (B, B )) for all words W (X, Y ) of length 2n 2, then A and B are unitarily similar. Definition 2.3.3 A family F M n is a commuting family if every pair of matrices X, Y F commute, i.e., XY = Y X. Lemma 2.3.4 Let F M n be a commuting family. Then there a unit vector v C n such that v is an eigenvector for every A F. Proof. Let V C n with minimum dimension be such that A(V ) V. We will show that dim V = 1 and the result will follow. Clearly, dim V n. Suppose dim V > 1. We claim that every nonzero vector in V is an eigenvector of A for every A F. If it is not true, let A F so that not every nonzero vector in V is an eigenvector of A. Since A(V ) V, there is v V such that Av = λv. Let V 0 = {u V : Au = λu} V. Note that for any B F and u V 0, we have Bu satisfying A(Bu) = BAu = Bλu = λbu, i.e., Bu V 0. So, V 0 satisfies B(V 0 ) V 0 and dim V 0 < dim V, which is impossible. The desired result follows. Theorem 2.3.5 Let F M n. Then there is an invertible matrix S M n such that S 1 AS is in upper triangular form. The matrix S can be chosen to be unitary. Proof. Similar to the proof of Theorem 1.2.1. Corollary 2.3.6 Suppose F M n is a commuting family of normal matrices. Then there is a unitary matrix U M n such that U AU is in diagonal form. 2.4 Singular decomposition and polar decomposition Lemma 2.4.1 Let A be a nonzero m n matrix, and u C m, v C n be unit vectors such that u Av attains the maximum value. Suppose U M m and V M n are unitary matrices with u and ( u v as the first columns, respectively. Then U AV = ) Av 0. 0 A 1 Proof. Note that the existence of the maximum u Av follows from basis analysis result. Suppose U AV = (a ij ). If the first column x = (a 11,..., a m1 ) t has nonzero entries other than a 11, then ũ = Ux/ x C m is a unit vector such that ũ Av = x UAv/ x = x x/ x > a 11 2, 12

which contradicts the choice of u and v. Similarly, if the first row y = (a 11,..., a 1n ) has nonzero entries other than a 11, then ṽ = V y/ y C n is a unit vector satisfying u Aṽ = u AV y/ y = y y/ y > a 11 2, which is a contradiction. The result follows. Theorem 2.4.2 Let A be an m n matrix. Then there are unitary matrices U M m, V M n such that U AV = D = s j E jj. Proof. We prove the result by induction on max{m, n}. By the previous lemma, there are ( u unitary matrices U M m, V M n such that U AV = ) Av 0. We may replace U by 0 A 1 e ir U for a suitable r [0, 2π) and assume that u Av = u Av = s 1. By induction assumption there are unitary matrices U 1 M m 1, V 1 M n 1 such that U 1 A 1V 1 = s 2 s 3. Then... ([1] U 1 )U AV ([1] V 1 ) has the asserted form. Remark 2.4.3 Let U 1 be formed by the first k columns u 1,..., u k of U and V 1 be formed by the first k columns v 1,..., v k of V. Then A = U 1 diag (s 1,..., s k )V 1 = s j u j vj. The values s 1 s k > 0 are the nonzero singular values of A, which are s 2 1,..., s2 k are the nonzero eigenvalues of AA and A A. The vectors v 1,..., v k are the right singular vectors of A, and u 1,..., u k are the left singular vectors of A. So, they are uniquely determined. Here is another way to determine the singular value decomposition. Let {v 1,..., v k } C n be an orthonormal set of eigenvectors corresponding to the nonzero eigenvalues s 2 1,..., s2 k of A A. Let u j = Av j /s j. Then {u 1,..., u k } C m is an orthonormal family such that A = k s ju j v j. Similarly, let {u 1,..., u k } C m be an orthonormal set of eigenvectors corresponding to the nonzero eigenvalues s 2 1,..., s2 k of AA. Let v j = A u j /s j. Then {v 1,..., v k } C n is an orthonormal family such that A = k s ju j v j. Corollary 2.4.4 Let A M n. Then A = U P = QV such that U, V M n are unitary, and P, Q are positive semidefinite matrices with eigenvalues equal to the singular values of A. 13
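A numerical companion to Theorem 2.4.2, Remark 2.4.3, and Corollary 2.4.4 (an illustrative sketch assuming NumPy and SciPy; the matrices are random examples): the singular value decomposition, its relation to the eigenvalues of A*A, and the polar decomposition of a square matrix.

```python
import numpy as np
from scipy.linalg import polar

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

# Singular value decomposition A = U diag(s) V*.
U, s, Vh = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vh, A)

# s_j^2 are the eigenvalues of A*A (and the nonzero eigenvalues of AA*).
w = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(np.sqrt(np.clip(w, 0, None)), s)

# Polar decomposition of a square matrix: B = W P with W unitary (orthogonal here)
# and P positive semidefinite whose eigenvalues are the singular values of B.
B = rng.standard_normal((3, 3))
W, P = polar(B)                        # side='right' by default
assert np.allclose(W @ P, B)
assert np.allclose(W.T @ W, np.eye(3))
assert np.allclose(np.sort(np.linalg.eigvalsh(P))[::-1],
                   np.linalg.svd(B, compute_uv=False))
```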

2.5 Other canonical forms

We have considered the canonical forms under similarity and unitary similarity. Here, we consider other canonical forms for different classes of matrices.

Unitary equivalence. Two matrices A, B ∈ M_{m,n} are unitarily equivalent if there are unitary matrices U ∈ M_m, V ∈ M_n such that U*AV = B. Every matrix A ∈ M_{m,n} is unitarily equivalent to Σ_{j=1}^k s_j E_jj, where s_1 ≥ ⋯ ≥ s_k > 0 are the nonzero singular values of A. Two matrices are unitarily equivalent if and only if they have the same singular values.

Equivalence. Two matrices A, B ∈ M_{m,n} are equivalent if there are invertible matrices R ∈ M_m, S ∈ M_n such that A = RBS. Every matrix A ∈ M_{m,n} is equivalent to Σ_{j=1}^k E_jj, where k is the rank of A. Two matrices are equivalent if and only if they have the same rank.
Proof. Elementary row operations and elementary column operations.

*-congruence. A matrix A ∈ M_n is *-congruent to B ∈ M_n if there is an invertible matrix S such that A = S*BS. There is no easy canonical form under *-congruence for a general matrix.[2] Every Hermitian matrix A ∈ M_n is *-congruent to I_p ⊕ (−I_q) ⊕ 0_{n−p−q}. The triple ν(A) = (p, q, n−p−q) is known as the inertia of A. Two Hermitian matrices are *-congruent if and only if they have the same inertia.
Proof. Use the unitary congruence/similarity results.

Congruence or t-congruence. A matrix A ∈ M_n is t-congruent to B ∈ M_n if there is an invertible matrix S such that A = S^t BS.

[2] Roger A. Horn and Vladimir V. Sergeichuk, Canonical forms for complex matrix congruence and *congruence, Linear Algebra Appl. (2006), 1010-1032.

There is no easy canonical form under t-congruence for general matrices; see footnote 2. Every complex symmetric matrix A M n is t-congruent to I k 0 n k, where k = rank (A). Every skew-symmetric A M n is t-congruent to 0 n 2k and k copies of The rank of a skew-symmetric matrix A M n is even. ( ) 0 1. 1 0 Two symmetric (skew-symmetric) matrices are t-congruent if and only if they have the same rank. Proof. Use the unitary congruence results. Unitary congruence A matrix A M n is unitarily congruent to B M n if there is a unitary matrix U such that A = U t BU. There is no easy canonical form under unitary congruence for general matrices. Every complex symmetric matrix A M n is unitarily congruent to k s je jj, where s 1 s k > 0 are the nonzero singular values of A. Every skew-symmetric A M n is unitarily congruent to 0 n 2k and ( ) 0 sj, j = 1,..., k, s j 0 where s 1 s k > 0 are nonzero singular values of A. The singular values of a skew-symmetric matrix A M n occur in pairs. Two symmetric (skew-symmetric) matrices are unitarily congruent if and only if they have the same singular values. Proof. Suppose A M n is symmetric. Let x C n be a unit vector so that x t Ax is real and maximum, and let U M n be unitary with x as the first column. Show that U t AU = [s 1 ] A 1. Then use induction. Suppose A M n is skew-symmetric. Let x, y C n be orthonormal pairs such that x t Ay is real and maximum, and U M n be unitary with x, y as the first two columns. ( ) 0 U t s1 AU = A s 1 0 1. Then use induction. Show that 15
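The inertia statement for *-congruence can be made concrete (a sketch assuming NumPy; the matrix is an arbitrary singular Hermitian example): read off (p, q, n − p − q) from the signs of the eigenvalues, and build an explicit congruence to I_p ⊕ (−I_q) ⊕ 0.

```python
import numpy as np

A = np.array([[2.0,  1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 0.0]])       # real symmetric, singular

w, U = np.linalg.eigh(A)                # A = U diag(w) U*
tol = 1e-10
p = int(np.sum(w > tol))
q = int(np.sum(w < -tol))
print("inertia (p, q, z):", (p, q, A.shape[0] - p - q))   # (1, 1, 1)

# Scale each eigenvector by 1/sqrt(|eigenvalue|) (leave null vectors alone);
# then S* A S is diagonal with entries in {1, -1, 0}.
scale = np.array([1.0 / np.sqrt(abs(x)) if abs(x) > tol else 1.0 for x in w])
S = U @ np.diag(scale)
print(np.round(S.conj().T @ A @ S, 10))
```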

2.6 Remarks on real matrices

Theorem 2.6.1 If A ∈ M_n(R), then A is orthogonally similar to a block triangular matrix (A_ij) such that each diagonal block A_jj is either 1 × 1 or a 2 × 2 block of the form (a b; −b a).
- The matrix A is normal, i.e., AA^t = A^tA, if and only if all the off-diagonal blocks are zero.
- The matrix A is orthogonal, i.e., AA^t = I_n, if and only if the 1 × 1 diagonal blocks have the form [1] or [−1], and the entries of the 2 × 2 diagonal blocks satisfy a^2 + b^2 = 1.
- The matrix A is symmetric, i.e., A = A^t, if and only if it is orthogonally similar to a real diagonal matrix.
- The matrix A is symmetric and satisfies x^tAx ≥ 0 for all x ∈ R^n if and only if it is orthogonally similar to a nonnegative diagonal matrix.

Remark 2.6.2 Let A ∈ M_n(R). Then A = S + K, where S = (A + A^t)/2 is symmetric and K = (A − A^t)/2 is skew-symmetric, i.e., K^t = −K. Note that x^tKx = 0 for all x ∈ R^n.

Clearly, x^tAx ∈ R for all real vectors x ∈ R^n, and this condition does not imply that A is symmetric as in the complex Hermitian case. The matrix A satisfies x^tAx ≥ 0 for all x ∈ R^n if and only if (A + A^t)/2 has only nonnegative eigenvalues; again, this condition does not automatically imply that A is symmetric as in the complex Hermitian case.

Every skew-symmetric matrix K ∈ M_n(R) is orthogonally similar to the direct sum of 0_{n−2k} and the blocks (0 s_j; −s_j 0), j = 1, ..., k, where s_1 ≥ ⋯ ≥ s_k > 0 are the nonzero singular values of K.
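A real quasi-triangular form in this spirit is computed by scipy.linalg.schur with output='real' (a sketch assuming SciPy; the matrices are random examples): the diagonal carries 1 × 1 blocks for real eigenvalues and 2 × 2 blocks for complex conjugate pairs, and for a normal matrix (here, a skew-symmetric one) the result is block diagonal with blocks of the form (0 s; −s 0).

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)

# General real matrix: real Schur form is quasi upper triangular
# (nonzero entries at most one place below the diagonal, inside 2x2 blocks).
A = rng.standard_normal((5, 5))
T, Q = schur(A, output='real')
assert np.allclose(Q @ T @ Q.T, A)
assert np.allclose(np.tril(T, -2), 0)

# Skew-symmetric (hence normal) matrix: block diagonal form with (0 s; -s 0) blocks.
K = rng.standard_normal((4, 4))
K = (K - K.T) / 2
T_K, Q_K = schur(K, output='real')
print(np.round(T_K, 6))
```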

3 Eigenvalues and singular values inequalities We study inequalities relating the eigenvalues, diagonal elements, singular values of matrices in this chapter. For a Hermitian matrix A, let λ(a) = (λ 1 (A),..., λ n (A)) be the vector of eigenvalues of A with entries arranged in descending order. Also, we will denote by s(a) = (s 1 (A),..., s n (A)) the singular values of a matrix A M m,n. For two Hermitian matrices, we write A B if A B is positive semidefinite. 3.1 Diagonal entries and eigenvalues of a Hermitian matrix Theorem Let A = (a ij ) M n be Hermitian with eigenvalues λ 1 λ n. Then for any 1 k < n, a 11 + + a kk λ 1 + + λ k. The equality holds if and only if A = A 11 A 22 so that A 11 has eigenvalues λ 1,..., λ k. Remark The above result will give us what we needed, and we can put the majorization result as a related result for real vectors. Lemma 3.1.1 (Rayleigh principle) Let A M n be Hermitian. Then for any unit vector x C n, λ 1 (A) x Ax λ n (A). The equalities hold at unit eigenvectors corresponding to the largest and smallest eigenvalues of A, respectively. Proof. Done in homework problem. If we take x = e j, we see that every diagonal entry of a Hermitian matrix A lies between λ 1 (A) and λ n (A). We can say more in the following. To do that we need the notion of majorization and doubly stochastic matrices. A matrix D = (d ij ) M n is doubly stochastic if d ij 0 and all the row sums and column sums of D equal 1. Let x, y R n. We say that x is weakly majorized by y, denoted by x w y if the sum of the k largest entries of x is not larger than that of y for k = 1,..., n; in addition, if the sum of the entries of x and y, we say that x is majorized by y, denoted by x y. We say that x is obtained from y be a pinching if x is obtained from y by changing (y i, y j ) to (y i δ, y j + δ) for two of the entries y i > j j of y and some δ (0, y i y j ). Theorem 3.1.2 Let x, y R n with n 2. The following conditions are equivalent. (a) x y. 17

(b) There are vectors x 1, x 2,..., x k with k < n, x 1 = y, x k = x, such that each x j is obtained from x j 1 by pinching two of its entries. (c) x = Dy for some doubly stochastic matrix. Proof. Note that the conditions do not change if we replace (x, y) by (P x, Qy) for any permutation matrices P, Q. We may make these changes in our proof. (c) (a). We may assume that x = (x 1,..., x n ) t and y = (y 1,..., y n ) t with entries in descending order. Suppose x = Dy for a doubly stochastic matrix D = (d ij ). Let v k = (e 1 + +e k ) and v t k D = (c 1,..., c n ). Then 0 c j 1 and n c j = k. So, x j = vk t Dy = c 1y 1 + + c n y n Clearly, the equality holds if k = n. c 1 y 1 + c k y k + [(1 c 1 ) + + [1 c k )]y k y 1 + + y k. (a) (b). We prove the result by induction on n. If n = 2, the result is clear. Suppose the result holds for vectors of length less than n. Assume x = (x 1,..., x n ) t and y = (y 1,..., y n ) t has entries arranged in descending order, and x y. Let k be the maximum integer such that y k x 1. If k = n, then for S = n x j = n y j, n 1 n 1 y n x 1 x n = S x j S y j = y n so that x = = x n = y 1 = = y n. So, x = x 1 = y. Suppose k < n and y k x 1 > y k+1. Then we can replace (y k, y k+1 ) by (ỹ k, ỹ k+1 ) = (x 1, y k + y k+1 x 1 ). Then removing x 1 from x and removing ỹ k in x 1 will yield the vectors x = (x 2,..., x n ) t and ỹ = (y 1,..., y k 1, ỹ k+1,..., y n ) t in R n 1 with entries arranged in descending order. We will show that x ỹ. The result will then follows by induction. Now, if l k, then if l > k, then x 2 + + x l x 1 + + x l 1 y 1 + + y l 1 ; x 2 + + x l (y 1 + + y l ) x 1 = y 1 + + y k 1 + ỹ k+1 + y k+1 + + y l with equality when l = n. The results follows. (b) (c). If x j is obtained from x j 1 by pinching the pth and qth entries. Then there is a doubly stochastic matrix P j obtained from I by changing the submatrix in rows and columns p, q by ( ) tj 1 t j 1 t j t j 18

for some t j (0, 1). Then x = Dy for D = P k P 1, which is doubly stochastic. Theorem 3.1.3 Let d, a R n. The following are equivalent. (a) There is a complex Hermitian (real symmetric) A M n with entries of a as eigenvalues and entries of d as diagonal entries. (b) The vectors satisfy d a. Proof. Let A = UDU such that D = diag (λ 1,..., λ n )). Suppose A = (a ij ) and U = (u ij ). Then a jj = n i=1 λ i u ji 2. Because ( u ji 2 ) is doubly stochastic. So, (a 11,..., a nn ) (λ 1,..., λ n ). We prove the converse by induction on n. Suppose (d 1,..., d n ) (λ 1,..., λ n ). If n = 2, let d 1 = λ 1 cos 2 θ + λ 2 sin 2 θ so that has diagonal entries d 1, d 2. ( ) ( cos θ sin θ λ1 (a ij ) = sin θ cos θ λ 2 ) ( cos θ ) sin θ sin θ cos θ Suppose n > 2. Choose the maximum k such that λ k d 1. If λ n = d 1, then for S = n d j = n λ j we have n 1 n 1 λ n d 1 d n = S d j S λ j = λ n. Thus, λ n = d 1 = = d n = S/n = n λ j/n implies that λ 1 = = λ n. Hence, A = λ n I is the required matrix. Suppose k < n. Then there is A 1 = A t 1 M 2(R) with diagonal entries d 1, λ k + λ k+1 d 1 and eigenvalues λ j, λ j+1. Consider A = A 1 D with D = diag (λ 1,..., λ k 1, λ k+2,..., λ n ). As shown in the proof of Theorem 3.1.3, if λ k+1 = λ k + λ k+1 d 1, then (d 2,..., d n ) ( λ k+1, λ 1,..., λ k 1, λ k+2,..., λ n ). By induction assumption, there is a unitary U M n 1 such that U([ λ k ] D)U M n 1 has diagonal entries d 2,..., d n. Thus, A = ([1] U)(A 1 D)([1] U ) has the desired eigenvalues and diagonal entries. 3.2 Max-Min and Min-Max characterization of eigenvalues In this subsection, we give a Max-Min and Min-Max characterization of eigenvalues of a Hermitian matrix. 19

Lemma 3.2.1 Let V 1 and V 2 be subspaces of C n such that dim(v 1 ) + dim(v 2 ) > n, then V 1 V 2 {0}. Proof. Let {u 1,..., u p } and {v 1,..., v q } be bases for V 1 and V 2. Then p + q > n and the linear system [u 1 u p v 1 v q ]x = 0 C n has a non-trivial solution x = (x 1,..., x p, y 1,..., y q ) t. Note that not all x 1,..., x p are zero, else y 1 v 1 + + y q v 1 = 0 implies y j = 0 for all j. Thus, v = x 1 u 1 + + x p u p = (y 1 v 1 + + y q v q ) is a nonzero vector in V 1 V 2. Theorem 3.2.2 Let A M n be Hermitian. Then for 1 k n, λ k (A) = max{λ k (X AX) : X M n,k, X X = I k } = min{λ 1 (Y AY ) : Y M n,n k+1, Y Y = I n k+1 }. Proof. Suppose {u 1,..., u n } is a family of orthonormal eigenvectors of A corresponding to the eigenvalues λ 1 (A),..., λ n (A). Let X = [u 1 u k ]. Then X AX = diag (λ 1 (A),..., λ k (A)) so that λ k (A) max{λ k (X AX) : X M n,k, X X = I k }. Conversely, suppose X has orthonomals column x 1,..., x k spanning a subspace V 1. Let u k,..., u n span a subspace V 2 of dimension n k+1. Then there is a unit vector v = k x jx j = n j=k y ju j. Let x = (x 1,..., x k ) t, y = (y k,..., y n k ) t, Y = [u k... u k+1 ]. Then v = Xx = Y y so that Y AY = diag (λ k (A),..., λ n (A)). By Rayleigh principle, λ k (X AX) x t X AXx = y t Y AY y λ k (A). 3.3 Change of eigenvalues under perturbation Theorem 3.3.1 Suppose A, B M n are Hermitian such that A B. Then λ k (A) λ k (B) for all k = 1,..., n. Proof. Let A = B + P, where P is positive semidefinite. Suppose k {1,..., n}. There is Y M n,k with Y Y = I k such that λ k (B) = λ k (Y BY ) = max{λ k (X BX) : X M m,n, X X = I k }. Let y C k be a unit eigenvector of Y AY corresponding to λ k (X AX). Then λ k (A) = max{λ k (X AX) : X M m,n, X X = I k } λ k (Y AY ) = y Y (B + P )Y y = y Y BY y + y Y P Y y y Y BY y λ k (Y BY ) = λ k (B). 20
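Theorem 3.3.1 is easy to test numerically (a sketch assuming NumPy; the Hermitian matrices are random examples): adding a positive semidefinite matrix can only push every ordered eigenvalue up. Note that np.linalg.eigvalsh returns eigenvalues in ascending order, while the notes order them descending; the entrywise comparison holds either way as long as both spectra are sorted consistently.

```python
import numpy as np

rng = np.random.default_rng(3)

X = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
B = (X + X.conj().T) / 2                   # Hermitian
Y = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
P = Y @ Y.conj().T                         # positive semidefinite
A = B + P

eigB = np.linalg.eigvalsh(B)               # ascending order
eigA = np.linalg.eigvalsh(A)
assert np.all(eigA >= eigB - 1e-10)        # lambda_k(B + P) >= lambda_k(B) for all k
```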

Theorem 3.3.2 (Lidskii) Let A, B, C = A + B M n be Hermitian matrices with eigenvalues a 1 a n, b 1 b n, c 1 c n, respectively. Then n a j + n b j = n c j and for any 1 r 1 < < r k n, b n j+1 (c rj a rj ) b j. Proof. Suppose 1 r 1 < < r k n. We want to show k (c r j a rj ) k b j. Replace B by B b k I. Then each eieganvelue of B and each eigenvalue of C = A + B will be changed by b k. So, it will not affect the inequalities. Suppose B = n b jx j x j. Let B + = k b jx j x j. Then (c rj a rj ) (λ rj (A + B + ) λ rj (A)) because λ j (A + B) λ j (A + B + ) for all j n (λ j (A + B + ) λ j (A)) because λ j (A) λ j (A + B + ) for all j = tr (A + B + ) tr (A) = λ j (B + ) = b j. Replacing (A, B, C) by ( A, B, C), we get the other inequalities. Lemma 3.3.3 Suppose A M m,n has nonzero singular values s 1 s k. Then has nonzero eigenvalues ±s 1,..., ±s k. ( ) 0m A A 0 n Theorem 3.3.4 Let A, B, C M m,n with singular values a 1 a n, b 1 b n and c 1,..., c n, respectively. Then for any 1 j 1 < < j k n, we have (c rj a rj ) b j. 3.4 Eigenvalues of principal submatrices ( ) A Theorem 3.4.1 There is a positive matrix C = with A M B k so that A, B, C have eigenvalues a 1 a k, b 1 b n k and c 1 c n, respectively, if and only if there are positive semi-definite matrices Ã, B, C = Ã + B with eigenvalues a 1 a k 0 = a k+1 = = a n, b 1 b n k 0 = b n k+1 = = b n, and c 1 c n. Consequently, for any 1 j 1 < < j k n, we have k (c r j a rj ) k b j. 21

Proof. To prove the necessity, let C = Ĉ Ĉ with Ĉ = [C 1 C 2 ] M n with C 1 M n,k. Then A = C 1 C 1 has eigenvalues a 1,..., a k, and B = C 2 C 2 has eigenvalues b 1,..., b n k. Now, C = ĈĈ = C 1 C 1 + C 2C 2 also eigenvalues c 1,..., c n, and à = C 1C 1, B = C 2 C 2 have the desired eigenvalues. Conversely, suppose the Ã, B, C have the said eigenvalues. Let à = C 1C 1, B = C2 C 2 for some C 1 M n,k, C 2 M n,n k. Then C = [C 1 C 2 ] = [C 1 C 2 ] have the desired principal submatrices. By the above theorem, one can apply the inequalities governing the eigenvalues of Ã, B, C = à + B to deduce inequalities relating the eigenvalues of a positive semidefinite matrix C and its complementary principal submatrices. One can also consider general Hermitian matrix by studying C λ n (C)I. Theorem 3.4.2 There is a Hermitian (real symmetric) matrix C M n with principal submatrix A M m such that C and A have eigenvalues c 1 c n and a 1 a m, respectively, if and only if c j a j and a m j+1 c n j+1, j = 1,..., m. Proof. To prove the necessity, we may replace C by C λ n (C)I and assume that C is positive semidefinite. Then by the previous theorem, c j a j b n m 0, j = 1,..., m. Applying the argument to C, we get the conclusion. To prove the sufficiency, we will construct C c n I with principal submatrix A I m. Thus, we may assume that all the eigenvalues involved are nonnegative. We prove the converse by induction on n m {1,..., n 1}. Suppose n m = 1. We need only address the case µ j (λ j+1, λ j ) for j = 1,..., n 1, since the general case µ j [λ j+1, λ j ] follows by a continuity argument. Alternatively, we can take away the pairs of c j = a j or a j = c j+1 to get a smaller set of numbers that still satisfy the interlacing inequalities and apply the following arguments. We will show how to choose a real orthogonal matrix Q such that C = Q t diag(λ 1,..., λ n )Q has the leading principal submatrix A M n 1 with eigenvalues a 1 a n 1. To this end, let Q have last column u = (u 1,..., u n ) t. By the adjoint formula for the inverse [(zi C) 1 ] nn = det(zi n 1 A)) det(i C) = n 1 (z a j) n (z c j), but we also have the expression (zi A) 1 nn = u t (zi diag(λ 1,..., λ n )) 1 u = 22 n i=1 u 2 i (z λ i ).

Equating these two, we see that A(n) has characteristic polynomial n 1 i=1 (z µ i) if and only if n u 2 i i=1 j i n 1 (z a i ) = (z c i ). Both sides of this expression are polynomials of degree n 1 so they are identical if and only if they agree at the n distinct points a 1,..., c n, or equivalently, i=1 u 2 k = n 1 (c k a j ) j k (c k c j ) w k, k = 1,..., n. Since (c k a j )/(c k c j ) > 0 for all k j, we see that w k > 0. Thus if we take u k = w k then A has eigenvalues a 1,..., a n 1. Now, suppose m < n 1. Let c j = { max{c j+1, a j } 1 j m, min{c j, a m n+j+1 } m < j < n. Then and c 1 c 1 c 2 c n 1 c n 1 c n, c j a j c n m 1+j, j = 1,..., m. By the induction assumption, we can construct a Hermitian C M n 1 with eigenvalues c 1 c n 1, whose m m leading principal submatrix has eigenvalues a 1 a m, and C is the leading principal submatrix of the real symmetric matrix C M n such that C has eigenvalues c 1 c n. 3.5 Eigenvalues and Singular values ( ) A11 A Theorem 3.5.1 Let A = 12 M A 21 A n with A 11 M k. Then det(a 11 ) k s j(a). The 22 equality holds if and only if A = A 11 A 22 such that A 11 has singular values s 1 (A),..., s k (A). Proof. Let S(s 1,..., s n ) be the set of matrices in M n with singular values s 1 s n. Suppose ( ) A11 A A = 12 S(s A 21 A 1,..., s n ) with A 11 M k such that det(a 11 ) attains the maximum value. 22 We show that A = A 11 A 22 and A 11 has singular values s 1 s k. Suppose U, V M k are such that U A 11 V = diag (ξ 1,..., ξ k ) with ξ 1 ξ k 0. We may replace A by (U I n k )A(V I n 1 ) and assume that A 11 = diag (ξ 1,..., ξ k ). 23

Let A = (a ij ). We show that A 21 = 0 as follows. Suppose there is a nonzero entry a s1 with ( ) a11 a k < s n. Then there is a unitary X M 2 such that X s1 has (1, 1) entry equal to a 1s a ss ˆξ 1 = { a 11 2 + a s1 2 } 1/2 = {ξ 2 1 + a s1 2 } 1/2 > ξ 1. Let ˆX M n be obtained from I n by replacing the submatrix in rows and columns 1, j by X. Then the leading k k submatrix of ˆXA is obtained from that of A by changing its first row from (ξ 1, 0,..., 0) to (ˆξ 1,,, ), and has determinant ˆξ 1 ξ 2 ξ k > ξ 1 ξ k = det(a 11 ), contradicting the fact that det(a 11 ) attains the maximum value. Thus, the first column of A 21 is zero. Next, suppose that there is a s2 0 for some k < s n. Then there is a unitary X M 2 such ( ) a22 a that X s2 has (1, 1) entry equal to a 2s a ss ˆξ 2 = { a 22 2 + a s2 2 } 1/2 = {ξ 2 2 + a s2 2 } 1/2 > ξ 2. Then the leading k k submatrix of ˆXA is obtained from that of A by changing its first row from (0, ξ 2, 0,..., 0) to (0, ˆξ 2,,, ), and has determinant ξ 1 ˆξ2 ξ k > ξ 1 ξ k = det(a 11 ), which is a contradiction. So, the second column of A 21 is zero. Repeating this argument, we see that A 21 = 0. Now, the leading k k submatrix of A t S(s 1,..., s n ) also attains the maximum. Applying the above argument, we see that A t 12 = 0. So, A = A 11 A 22. Let Û, ˆV M n k be unitary such that Û A 22 ˆV = diag (ξk+1,..., ξ n ). We may replace A by (I k Û )A(I k ˆV ) so that A = diag (ξ 1,..., ξ n ). Clearly, ξ k ξ k+1. Otherwise, we may interchange kth and (k + 1)st rows and also the columns so that the leading k k submatrix of the resulting matrix becomes diag (ξ 1,..., ξ k 1, ξ k+1 ) with determinant larger than det(a 11 ). So, ξ 1,..., ξ k are the k largest singular values of A. Theorem 3.5.2 Let a 1,..., a n be complex numbers be such that a 1 a n and s 1 s n 0. Then there is A M n with eigenvalues a 1,..., a n and singular values s 1,, s n if and only if n a j = n s j, and k a j k s j for j = 1,..., n 1. Proof. Suppose A has eigenvalues a 1,..., a n and singular values s 1 s n 0. We may apply a unitary similarity to A and assume that A is in upper triangular form with diagonal entries a 1,..., a n. By the previous theorem, if A k is the leading k k submatrix of A, then a 1 a k = det(a k ) k s k for k = 1,..., n 1, and det(a) = a 1 a n = s 1 s n. To prove the converse, suppose the asserted inequalities and equality on a 1,..., a n and s 1,..., s n hold. We show by induction that there is an upper triangular matrix A = (a ij ) with singular values 24

s 1 s n and diagonal values a 1,..., a n (in a certain order). Then there will be a diagonal unitary matrix D such that DA has the desired eigenvalues and singular values. simplicity, we assume a j = a j in the following. Suppose n = 2. Then a 1 s 1, and a 1 a 2 = s 1 s 2 so that a 1 a 2 s 2. Consider ( ) ( cos θ sin θ s1 A(θ, φ) = sin θ cos θ s 2 ) ( ) cos φ sin φ. sin φ cos φ For notation There is φ [0, π/2] such that the (s 1 cos φ, s 2 sin φ) t has norm a 1 [s 2, s 1 ]. Then we can find θ [0, π/2] such that (cos θ, sin θ)(s 1 cos φ, s 2 sin φ) = a 1. Thus, the first column of A(θ, φ) equals (a 1, 0) t, and A(θ, φ) has the desired eigenvalues and singular values. Suppose the result holds for matrices of size at most n 1 2. (s 1,..., s n ) satisfying the product equality and inequalities. values. Consider (a 1,..., a n ) and If a 1 = 0, then s n = 0 and A = s 1 E 12 + +s n 1 E n 1,n has the desired eigenvalues and singular Suppose a 1 > 0. Let k be the maximum integer such that s k a 1. Then there is A 1 = ( ) a1 with s 0 s k+1 = s k s k+1 /a k [s k 1, s k+1 ]. Let k+1 ( s 1,..., s n 1 ) = (s 1,..., s k 1, s k+1, s k+2,..., s n ). We claim that (a 2,..., a n ) and ( s 1,..., s n 1 ) satisfy the product equality and inequalities. First, n j=2 a j = n s j/a 1 = n 1 s j. For l < k, l l 1 l 1 a j a j s j = j=2 l 1 s j. For l k + 1, l a j j=2 l s j /a 1 = l 1 s j. So, there is A 2 U DV in triangular form with diagonal entries a 2,..., a n, where U, V M n 1 are unitary, and D = diag ( s 1,..., s n ). Let ( ( ) ( 1 A0 1 A = U) D is in upper triangular form with diagonal entries a k, a 1,..., a k 1, a k+1,..., a n and singular values s 1,..., s n as desired. V ) 25
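The necessity part of Theorem 3.5.2 (Weyl's product inequalities) can be spot-checked (a sketch assuming NumPy; the matrix is a random example): with |a_1| ≥ ⋯ ≥ |a_n| and s_1 ≥ ⋯ ≥ s_n, each partial product of the |a_j| is at most the corresponding partial product of the s_j, with equality for the full product.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

a = np.sort(np.abs(np.linalg.eigvals(A)))[::-1]     # |a_1| >= ... >= |a_n|
s = np.linalg.svd(A, compute_uv=False)              # s_1 >= ... >= s_n

for k in range(1, 6):
    assert np.prod(a[:k]) <= np.prod(s[:k]) * (1 + 1e-10)

# equality of the full products: |det A| = s_1 ... s_n
assert np.isclose(np.prod(a), np.prod(s))
```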

3.6 Diagonal entries and singular values Theorem 3.6.1 Let A M n have diagonal entries d 1,..., d n such that d 1 d n and singular values s 1 s n. (a) For any 1 k n, we have k d j k s j. The equality holds if and only if there is a diagonal unitary matrix D such that DA = A 11 A 22 such that A 11 is positive semidefinite with eigenvalues s 1 s k. (b) We have n 1 d j d n n 1 s j s n. The equality holds if and only if there is a diagonal unitary matrix D such that DA = (a ij ) is Hermitian with eigenvalues s 1,..., s n 1, s n and a nn 0. Proof. (a) Let S(s 1,..., s n ) be the set of matrices in M n with singular values s 1 s n. ( ) A11 A Suppose A = 12 S(s A 21 A 1,..., s n ) with A 11 M k such that a 11 + + a kk attains the 22 maximum value. We may replace A by DA by a suitable diagonal unitary D M n and assume that a jj = a jj for all j = 1,..., n. If a ij 0 for any j > k i, then there is a unitary X M 2 ( ) aii a such that X ij has (1, 1) entry equal to a ji a jj ã ii = { a ii 2 + a ji 2 } 1/2 > a ii. Let ˆX M n be obtained from I n by replacing the submatrix in rows and columns i, j by X. Then diagonal entries of the leading k k submatrix Â11 of ˆXA is obtained from that of A by changing its (i, i) entry a ii to â ii so that tr Â11 > tr A 11, which is a contradiction. So, A 12 = 0. Applying the same argument to A t, we see that A 12 = 0. Now, A 11 has singular values ξ 1 ξ k. Then A 11 = P V for some positive semidefinite matrix P with eigenvalues ξ 1,..., ξ k and a unitary matrix V M k. Suppose V = U ˆDU for some diagonal unitary ˆD M k and unitary U M k. Then tr A 11 = tr (P U ˆDU ) = tr U P U ˆD tr U P U = tr P, where the equality holds if and only if ˆD = I k, i.e., A 11 = P is positive semidefinite. In particular, we can choose B = diag (s 1,..., s n ) so that the sum of the k diagonal entries is k s j k ξ j = tr A 11. Thus, the eigenvalues of A 11 must be s 1,..., s k as asserted. (b) Let A = (a ij ) S(s 1,..., s n ) attains the maximum values n 1 a jj a nn. We may replace A by a diagonal unitary matrix and assume that a ii 0 for j = 1,..., n 1, and a nn 0. Let A 11 M n 1 be the leading (n 1) (n 1) principal submatrix of A. By part (a), we may assume that A 11 is positive semidefinite so that its trace equals to the sum of its singular values. Otherwise, there are U, V M n 1 such that U A 11 V = diag (ξ 1,..., ξ n 1 ) with ξ 1 + + ξ n 1 > n 1 a jj. 26

As a result, (U [1])A(V [1]) S(s 1,..., s n ) has diagonal entries ˆd 1,..., ˆd n 1, a nn such that n 1 n 1 ˆd j a nn > a jj a nn, which is a contradiction. ( ) ajj a Next, for j = 1,..., n 1, let B j = jn. We show that a jj a nn = s 1 (B j ) s 2 (B j ) a nj a nn and B j is Hermitian in the following. Note that s 1 (B 1 ) 2 + s 2 (B j ) 2 = a jj 2 + a jn 2 + a nj 2 + a nn 2 and s 1 (B j )s 2 (B j ) = a jj a nn a jn a nj so that a jj a nn = a jj a nn s 1 (B j )s 2 (B j ) a jn a nj. Hence, ( a jj a nn ) 2 = (a jj + a nn ) 2 = a 2 jj + a 2 nn + 2a jj a nn s 1 (B j ) 2 + s 2 (B j ) 2 ( a jk 2 + a kj 2 ) 2(s 1 (B j )s 2 (B j ) a jn a nj ) = (s 1 (B j ) s 2 (B j )) 2 ( a jk a kj ) 2 (s 1 (B j ) s 2 (B j )) 2. Here the two inequalities become equalities if and only if a jk = a kj and a jn a nj = a jn a nj, i.e., a jn = ā nj and B j is Hermitian. By the above analysis, a jj a nn s 1 (B j ) s 2 (B j ). If the inequality is strict, there are unitary X, Y M 2 such that X B j Y = diag (s 1 (B j ), s 2 (B j )). Let ˆX be obtained from I n by replacing the 2 2 submatrix in rows and columns j, n by X. Similarly, we can construct Ŷ. Then ˆX, Ŷ M n are unitary and ˆX AŶ has diagonal entries ˆd 1,..., ˆd n obtained from that of A by changing (a jj, a nn ) to (s 1 (B j ), s 2 (B j )). As a result, n 1 ˆd j ˆd n 1 n > a jj a nn, which is a contradiction. So, B j is Hermitian for j = 1,..., n 1. Hence, A is Hermitian, and tr A = a 11 + + a nn = a 11 + + a n 1,n 1 a nn. Suppose A has eigenvalues λ 1,..., λ n with λ j = s j (A) for j = 1,..., n. Because 0 a nn λ n, we see that tr A = λ j n 1 s j s n. Clearly, the equality holds. Else, we have B = diag (s 1,..., s n ) S(s 1,..., s n ) attaining n 1 s j s n. The result follows. Recall that for two real vectors x = (x 1,..., x n ), y = (y 1,..., y n ), we say that x w y is the sum of the k largest entries of x is not larger than that of y for k = 1,..., n. Theorem 3.6.2 Let d 1,..., d n be complex numbers such that d 1 d n. A M n with diagonal entries d 1,..., d n and singular values s 1 s n if and only if Then there is ( d 1,..., d n ) w (s 1,..., s n ) and 27 n 1 n 1 d j d n s j s n.

Proof. The necessity follows from the previous theorem. We prove the converse by induction on n 2. We will focus on the construction of A with singular values s 1,..., s n, and diagonal entries d 1,..., d n 1, d n with d 1,..., d n 0. ( ) d1 a Suppose n = 2. We have d 1 + d 2 s 1 + s 2, d 1 d 2 s 1 s 2. Let A = such that b d 2 a, b 0 satisfies ab = s 1 s 2 d 1 d 2 and a 2 + b 2 = s 2 1 + s2 2 d2 1 d2 2. Such a, b exist because 2(s 1 s 2 d 1 d 2 ) = 2ab a 2 + b 2 = s 2 1 + s 2 2 d 2 1 d 2 2. Suppose the result holds for matrices of sizes up to n 1 2. Consider (d 1,..., d n ) and (s 1,..., s n ) that satisfy the inequalities. Let k be the largest integer k such that s k d 1. ( ) d1 If k n 2, there is B = with singular values s ŝ k, s k+1, where ŝ = s k + s k+1 d 1. One can check that (d 2,..., d n ) and (s 1,..., s k 1, ŝ, s k+2,..., s n ) satisfy the inequalities for the n 1 case so that there are unitary U, V M n 1 such that UDV has diagonal entries d 2,..., d n, where D = diag (ŝ, s 1,..., s k 1, s k+2,..., s n ). Thus, A = ([1] U)(B diag (s 1,..., s k 1, s k+2,..., s n )([1] V ) has diagonal entries d 1,..., d n and singular values s 1,..., s n. Now suppose k n 1, let n 1 ŝ = max 0, d n 2 n + s n s n 1, d j s j n 2 min s n 1, s n 1 + s n d n, (s j d j ) + d n 1. It follows that (d n, ŝ) w (s n 1, s n ), d n ŝ s n 1 s n, (d 1,..., d n 1 ) w (s 1,..., s n 2, ŝ) and n 2 n 2 d j d n 1 s j ŝ. So, there is C M 2 with singular values s n 1, s n and diagonal elements ŝ, d n. Moreover, there are unitary matrix X, Y M n 1 such that Xdiag (s 1,..., s n 2, ŝ)y has diagonal entries d 1,..., d n 1. Thus, A = (X [1])(diag (s 1,..., s n 2 ) C)(Y [1]) will have the desired diagonal entries and singular values. 3.7 Final remarks The study of matrix inequalities has a long history and is still under active research. One of the most interesting question raised in 1960 s and was finally solved in 2000 s is the following. 28

Problem Determine the necessary and sufficient conditions for three set of real numbers a 1 a n, b 1 b n, c 1 c n for the existence of three (real symmetric) Hermitian matrices A, B and C = A + B with these numbers as their eigenvalues, respectively. It was proved that the conditions can be described in terms of the equality n (a j + b j ) = n c j and a family of inequalities of the form (a uj + b vj ) for certain subsequences (u 1,..., u k ), (v 1,..., v k ), (w 1,..., w k ) of (1,..., n). There are different ways to specify the subsequences. A. Horn has the following recursive way to define the sequences. c wj 1. If k = 1, then w 1 = u 1 + v 1 1. That is, we have a u + b v c u+v 1. 2. Suppose k < n and all the subsequences of length up to k 1 are specified. Consider subsequences (u 1,..., u k ), (v 1,..., v k ), (v 1,..., v k ) satisfying k (u j +v j ) = k w j +k(k+1)/2, and for any lenth l specified subsequences (α 1,..., α l ), (β 1,..., β l ), (γ 1,..., γ l ) of (1,..., n) with l < k, l (u αj + v βj ) Consequently, the subsequences (u 1,..., u k ), (v 1,..., v k ), (w 1,..., w k ) of (1,..., n) is a Horn s sequence triples of length k if and only if there are Hermitian matrices U, V, W = U + V with eigenvalues u 1 1 u 2 2 u k k, v 1 1 v 2 2 v k k, w 1 1 w 2 2 w k k, w γj. respectively. This is known as the saturation conjecture/theorem. Special cases of the above inequalities includes the following inequalities of Thompson, which reduces to the Weyl s inequalities when k = 1. Theorem 3.7.1 Suppose A, B, C = A + B M n are Hermitian matrices with eigenvalues a 1 a n, b 1 b n and c 1 c n, respectively. For any subequences (u 1,..., u k ), (v 1,..., v k ), of (1,..., n), if (w 1,..., w k ) is such that w j = u j + v j j n for all j = 1,..., k, then (a uj + b vj ) c wj. 29
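As a closing illustration, the inequalities of Sections 3.6 and 3.7 can also be spot-checked numerically (a sketch assuming NumPy; random examples, with the descending ordering used throughout this chapter).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6

# Theorem 3.6.1(a): the |diagonal entries|, sorted in descending order, are
# weakly majorized by the singular values.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
d = np.sort(np.abs(np.diag(M)))[::-1]
s = np.linalg.svd(M, compute_uv=False)
assert np.all(np.cumsum(d) <= np.cumsum(s) + 1e-10)

# Theorem 3.7.1 with k = 1 (Weyl): c_{u+v-1} <= a_u + b_v for Hermitian A, B, C = A + B.
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
A = (X + X.T) / 2
B = (Y + Y.T) / 2
a = np.sort(np.linalg.eigvalsh(A))[::-1]      # descending, as in the notes
b = np.sort(np.linalg.eigvalsh(B))[::-1]
c = np.sort(np.linalg.eigvalsh(A + B))[::-1]
for u in range(n):
    for v in range(n):
        if u + v < n:                          # 1-based indices u+1, v+1, (u+1)+(v+1)-1
            assert c[u + v] <= a[u] + b[v] + 1e-10
```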