Lecture notes: Applied Linear Algebra, Part 2. Version 1
Michael Karow
Berlin University of Technology, karow@math.tu-berlin.de
October 2, 2008

First, some exercises:

Exercise 0.1 (2 points) Another least squares problem: Let c ∈ C^n and let A ∈ C^{m×n}, m ≥ n, be a matrix with full column rank. Give a formula for

    min{ ‖x‖ : x ∈ C^m, A^* x = c }

in terms of c and A^* A.

Exercise 0.2 (4+4 points) This is an exercise on the SVD.
(a) Let U and V be subspaces of C^n. Then there exist an orthonormal basis u_1, …, u_q of U and an orthonormal basis v_1, …, v_p of V such that u_j^* v_k = 0 for j ≠ k and 0 ≤ u_k^* v_k ≤ 1 for k ≤ min{p, q}. The numbers φ_k := arccos(u_k^* v_k) are called the canonical angles between the subspaces. Hint: Take any orthonormal bases X of U and Y of V and compute a singular value decomposition of X^* Y.
(b) We consider a direct decomposition C^n = U ⊕ W. Let P be the projector onto U along W. Suppose the columns of the matrix X form an orthonormal basis of U and the columns of Y form an orthonormal basis of W^⊥. Then

    ‖P‖ = 1 / σ_min(X^* Y),

where σ_min(·) denotes the smallest singular value. Hint: you might use the fact that the matrix products AB and BA have the same nonzero eigenvalues. This holds for any A ∈ C^{m×n}, B ∈ C^{n×m}.

The goal of the following notes is to give an introduction to the perturbation theory of eigenvalues and invariant subspaces.
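The hint of Exercise 0.2(a) can be tried out numerically. The following sketch (my addition, using numpy and scipy, not part of the original notes) computes the canonical angles between two random subspaces from the singular values of X^* Y and compares them with scipy's built-in routine:

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)
# Orthonormal bases of two subspaces of R^6 (real for simplicity)
X, _ = np.linalg.qr(rng.standard_normal((6, 3)))
Y, _ = np.linalg.qr(rng.standard_normal((6, 2)))

# Canonical angles from the singular values of X^* Y (the hint of Exercise 0.2(a))
s = np.linalg.svd(X.T @ Y, compute_uv=False)
phi = np.arccos(np.clip(s, -1.0, 1.0))

# scipy computes the same angles (it returns them in descending order)
assert np.allclose(np.sort(phi), np.sort(subspace_angles(X, Y)))
```

The `clip` guards against singular values that exceed 1 by rounding error before `arccos` is applied.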
1 Some preliminaries

1.1 The dual basis

Suppose the columns of V = [v_1, …, v_n] ∈ F^{n×n} form a basis of F^n. Then V^{-1} exists. The columns w_1, …, w_n of W := (V^{-1})^* are linearly independent and form a basis of F^n. This basis is called the dual of the basis v_1, …, v_n. The identity W^* V = V^{-1} V = I then states that

    ⟨w_j, v_k⟩ = 1 if j = k,  and 0 otherwise.

The identity I = V V^{-1} = V W^* yields, for any x ∈ F^n,

    x = V W^* x = Σ_{k=1}^n v_k ⟨w_k, x⟩.

Hence the scalar products ⟨w_k, x⟩ are the coordinates of x with respect to the basis v_1, …, v_n.

1.2 Left eigenvectors

A vector w ∈ C^n \ {0} is said to be a left eigenvector of A ∈ C^{n×n} to the eigenvalue λ ∈ C if

    w^* A = λ w^*.

By transposing this equation we obtain A^T w̄ = λ w̄. Hence the left eigenvectors are the conjugates of the (right) eigenvectors of A^T. Recall that the eigenvalues of A and A^T are the same. It follows that to each eigenvalue λ of A there exists a left eigenvector. Suppose A is diagonalizable, i.e.

    A = V Λ V^{-1},   Λ = diag(λ_1, …, λ_n).   (*)

Then the columns of V form a basis of (right) eigenvectors. However, (*) implies that W^* A = Λ W^*, where W = (V^{-1})^*. Equivalently,

    w_j^* A = λ_j w_j^*,   j = 1, …, n.

Thus the columns of W (i.e. the conjugates of the rows of V^{-1}) form a basis of left eigenvectors.

1.3 The Drazin inverse

It is a basic fact in linear algebra that for any A ∈ F^{n×n},

    F^n = R(A^n) ⊕ N(A^n).

The restriction of the linear map x ↦ Ax to the A-invariant subspace R(A^n) is invertible (one-to-one and onto). Hence, there is a unique matrix A^D ∈ F^{n×n} such that A^D A x = x for x ∈ R(A^n) and A^D x = 0 for x ∈ N(A^n). This matrix is called the Drazin inverse of A. Suppose we have a factorization of the form

    A = V [ N 0 ; 0 M ] V^{-1}
with square matrices N, M such that σ(N) = {0} and 0 ∉ σ(M). Write V and V^{-1} in the block form

    V = [V_1, V_2],   V^{-1} = [ W_1^* ; W_2^* ],

where V_1, W_1 have the same number of columns as N. Then R(V_1) = N(A^n), R(V_2) = R(A^n) and

    A^D = V [ 0 0 ; 0 M^{-1} ] V^{-1} = V_2 M^{-1} W_2^*.

Exercise 1.1 (2 points) Show that A^D = A^+ (the Moore–Penrose inverse) if A is normal.

1.4 The Sylvester equation

Proposition 1.2 Let A ∈ C^{m×m}, B ∈ C^{q×q}, C ∈ C^{m×q}. If σ(A) ∩ σ(B) = ∅ then the Sylvester equation

    A X − X B = C   (1)

has a unique solution X ∈ C^{m×q}.

Proof: Suppose first that B = [b_jk] is upper triangular, i.e. b_jk = 0 for j > k. Then the diagonal elements b_kk are the eigenvalues of B. Let x_k, b_k, c_k denote the kth column of X, B, C respectively. Then equation (1) is equivalent to

    c_k = A x_k − X b_k = A x_k − Σ_{j=1}^k x_j b_jk = (A − b_kk I) x_k − Σ_{j=1}^{k−1} x_j b_jk   (2)

for k = 1, …, q. Since b_kk is not an eigenvalue of A, the matrix A − b_kk I is invertible. Thus, (2) is equivalent to

    x_k = (A − b_kk I)^{-1} ( c_k + Σ_{j=1}^{k−1} x_j b_jk ).

This is a recursion formula for the computation of the columns x_k. Suppose now that B is not upper triangular. Let B = V B_0 V^* be a Schur decomposition with unitary V and upper triangular B_0. By multiplying (1) with V from the right we obtain the equivalent equation

    A X_0 − X_0 B_0 = C_0,   where X_0 := X V, C_0 := C V.

Now we can apply the method above to compute the columns of X_0. □
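The proof above is constructive. A minimal implementation of the resulting algorithm (my addition, using numpy/scipy; not part of the original notes) reduces B to Schur form and then applies the column recursion:

```python
import numpy as np
from scipy.linalg import schur

def solve_sylvester(A, B, C):
    """Solve AX - XB = C, assuming sigma(A) and sigma(B) are disjoint."""
    # Schur decomposition B = V B0 V^* with unitary V, upper triangular B0.
    B0, V = schur(B, output="complex")
    C0 = C @ V                    # right-multiply (1) by V: A(XV) - (XV)B0 = CV
    m, q = C.shape
    X0 = np.zeros((m, q), dtype=complex)
    I = np.eye(m)
    for k in range(q):
        # (A - b_kk I) x_k = c_k + sum_{j<k} x_j b_jk  -- the recursion of the proof
        rhs = C0[:, k] + X0[:, :k] @ B0[:k, k]
        X0[:, k] = np.linalg.solve(A - B0[k, k] * I, rhs)
    return X0 @ V.conj().T        # undo the transformation: X = X0 V^*

rng = np.random.default_rng(1)
A = np.diag([1.0, 2.0, 3.0]) + np.triu(rng.standard_normal((3, 3)), 1)
B = np.diag([-1.0, -2.0])     # sigma(A) = {1,2,3}, sigma(B) = {-1,-2}: disjoint
C = rng.standard_normal((3, 2))
X = solve_sylvester(A, B, C)
assert np.allclose(A @ X - X @ B, C)
```

Note that scipy's own `scipy.linalg.solve_sylvester` uses the sign convention AX + XB = Q, so our equation corresponds to its call with B replaced by −B.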
1.5 Continuity of eigenvalues

Proposition 1.3 Let λ_0 ∈ C be an eigenvalue of A_0 ∈ C^{n×n} of algebraic multiplicity m. Let D ⊂ C be a closed disk about λ_0 that contains no other eigenvalue of A_0. Then there exists an ε > 0 such that D contains precisely m eigenvalues (counting algebraic multiplicities) of A ∈ C^{n×n} if ‖A − A_0‖ ≤ ε.

Proof: Let f_A(z) := det(z I − A). By Rouché's theorem the number of zeros of the holomorphic function f_A in the interior of the disk D is given by

    m(A) = (1/(2πi)) ∮_{∂D} f_A′(z)/f_A(z) dz.

This integral is well defined if f_A has no zeros on ∂D, the boundary of D. The function A ↦ m(A) is continuous and takes discrete values. Hence it is constant on each connected component of its domain of definition. □

2 Invariant subspaces

2.1 Definition and matrix representation

Let A ∈ F^{n×n}, F = R or C, be a square matrix. A subspace V ⊆ F^n is said to be A-invariant if A V ⊆ V, i.e. v ∈ V implies A v ∈ V.

Exercise 2.1 (3 points) Let V and U be A-invariant subspaces. Show that the subspaces

    U + V = {u + v : u ∈ U, v ∈ V},   U ∩ V = {w : w ∈ U and w ∈ V}

are also A-invariant. Show that the orthogonal complement of V,

    V^⊥ = {w ∈ C^n : ⟨w, v⟩ = 0 for all v ∈ V},

is an invariant subspace of A^*.

Let v_k and l_k denote the kth columns of V ∈ F^{n×p} and L = [l_jk] ∈ F^{p×p}. Then the matrix equation

    A V = V L   (3)

is equivalent to the equations

    A v_k = V l_k = Σ_{j=1}^p v_j l_jk,   k = 1, …, p.   (4)

These equations state that A v_k is a linear combination of the vectors v_j. Hence, if (3) holds then R(V) is an A-invariant subspace. On the other hand, if the subspace V is
A-invariant and v_1, …, v_p is any basis of V, then (3) holds for some L ∈ F^{p×p}. The matrix L is said to be the representation of A on V with respect to the basis v_1, …, v_p. Of course L depends on the basis. Precisely, let S ∈ F^{p×p} be nonsingular. Then the columns of V̂ = V S and the columns of V span the same subspace V. Let L̂ = S^{-1} L S. Then the equivalence

    A V = V L  ⇔  A V̂ = V̂ L̂

holds. Finally, note that if V ∈ F^{n×n} is a square matrix whose columns are linearly independent then

    A V = V L  ⇔  V^{-1} A V = L  ⇔  A = V L V^{-1}.

2.2 Examples of invariant subspaces

Example 1: Let v_1, …, v_p ∈ C^n be eigenvectors of A such that A v_k = λ_k v_k, λ_k ∈ C. Then

    A [v_1, …, v_p] = [v_1 λ_1, …, v_p λ_p] = [v_1, …, v_p] diag(λ_1, …, λ_p) =: V Λ.

Thus, V = R(V) is A-invariant. Suppose additionally that p = n and that v_1, …, v_n are linearly independent. Then the vectors v_k form a basis of C^n, the matrix V is invertible, and the relation A V = V Λ is equivalent to

    A = V Λ V^{-1}.   (5)

The latter factorization is called a diagonalization of A. Thus, A is diagonalizable if and only if there exists a basis of eigenvectors. The eigenvectors are then the columns of the matrix V in the factorization (5).

Example 2: A finite sequence of vectors v_1, …, v_p ∈ C^n is said to be a Jordan chain of A ∈ C^{n×n} to the eigenvalue λ ∈ C if

    A v_1 = λ v_1  and  A v_k = λ v_k + v_{k−1}  for 1 < k ≤ p.

The latter relations are equivalent to the matrix equation A [v_1, …, v_p] = V J, where V := [v_1, …, v_p] and J is the p×p matrix with λ on the diagonal and 1 on the superdiagonal,

    J = [ λ 1
            λ ⋱
              ⋱ 1
                λ ].

Thus, range V is an invariant subspace. The matrix J is called a Jordan block. Note that if one omits the last vectors of the chain then one obtains a shorter Jordan chain v_1, …, v_q, q < p, which also spans an invariant subspace (with a shorter Jordan block). The Jordan canonical form theorem states that for each A ∈ C^{n×n} there exists a basis of C^n consisting of Jordan chains. Let v_{i1}, …, v_{ip_i}, i = 1, …, r (Σ_i p_i = n), be such a basis, i.e.

    A [v_{i1}, …, v_{ip_i}] = V_i J_i,   V_i := [v_{i1}, …, v_{ip_i}],   i = 1, …, r,
with Jordan blocks J_i. We then have

    A [V_1, V_2, …, V_r] = [V_1, V_2, …, V_r] diag(J_1, J_2, …, J_r) =: V J.

Since V is invertible, this is equivalent to

    A = V J V^{-1}.

This is the Jordan factorization of A. Note that if all Jordan chains have length 1 (i.e. p_i = 1 for all i) then J is diagonal and the columns of V form a basis of eigenvectors.

Example 3: Let A ∈ F^{n×n}, b ∈ F^n \ {0}. Let K(A, b) denote the smallest A-invariant subspace of F^n that contains b. We determine a basis and the associated matrix representation of A for K(A, b). Since K(A, b) is A-invariant and contains b, it also contains the vectors A^k b for all nonnegative integers k. However, not all of these vectors can be linearly independent. There is a positive integer m ≤ n such that b, Ab, …, A^{m−1} b are linearly independent and A^m b is a linear combination of these vectors, i.e.

    A^m b = Σ_{k=0}^{m−1} α_k A^k b,   α_k ∈ F.

This yields

    A [b, Ab, …, A^{m−1} b] = [Ab, A²b, …, A^m b] = [b, Ab, …, A^{m−1} b] L,

where L is the m×m matrix with ones on the subdiagonal, last column (α_0, …, α_{m−1})^T, and zeros elsewhere:

    L := [ 0 ⋯ 0 α_0
           1 ⋱ ⋮ α_1
             ⋱ 0 ⋮
               1 α_{m−1} ].

Thus, b, Ab, …, A^{m−1} b is a basis of K(A, b) and L is the associated matrix representation of A on this subspace.

Example 4: Let A ∈ F^{n×n}, B ∈ F^{n×p}. Let K(A, B) denote the smallest A-invariant subspace of F^n that contains all columns of B. We have K(A, B) = R(K(A, B)), where

    K(A, B) := [B, AB, A²B, …, A^{n−1} B]

denotes the controllability matrix of (A, B). Reason: K(A, B) is generated by the columns of the matrices A^k B, k = 0, 1, …. However, from the Cayley–Hamilton theorem it follows that each A^k with k ≥ n is a linear combination of the matrices I, A, …, A^{n−1}.[1]
Hence, the columns of A^k B for k ≥ n are linear combinations of the columns of B, AB, …, A^{n−1} B.

Exercise 2.2 (2 points) Let v ∈ C^n be an eigenvector of the real matrix A ∈ R^{n×n} to the nonreal eigenvalue λ ∈ C \ R, i.e. A v = λ v. We then also have A v̄ = λ̄ v̄. Let Rv, Iv, Rλ, Iλ denote the real and imaginary parts of v and λ. Show that R([v, v̄]) = R([Iv, Rv]) and

    A [Iv, Rv] = [Iv, Rv] [ Rλ −Iλ ; Iλ Rλ ].

Exercise 2.3 (4 points) Suppose the matrix equation A V = V L holds. Prove:
(a) Each eigenvalue of L is also an eigenvalue of A.
(b) If x_1, …, x_q is a Jordan chain for L then V x_1, …, V x_q is a Jordan chain for A.
(c) Let f(z) = Σ c_k z^k be a power series that converges for all z ∈ C. Then A V = V L implies f(A) V = V f(L).
(d) Let V be A-invariant. Let x : R → C^n be the solution of the linear differential equation ẋ(t) = A x(t) with initial value x(0) ∈ V. Then x(t) ∈ V for all t ∈ R. (Hint: for the proof you might use (c).)

Invariant subspaces and block triangular matrices. Let V_1 = [v_1, …, v_p] ∈ C^{n×p} be a matrix whose columns are linearly independent and span an A-invariant subspace, i.e. A V_1 = V_1 L. Let V_2 = [v_{p+1}, …, v_n] ∈ C^{n×(n−p)} be such that the vectors v_1, …, v_n form a basis of C^n. Since each A v_k is a linear combination of the basis vectors we have A V_2 = V_1 R + V_2 M for some matrices R, M. Hence

    A V = [A V_1, A V_2] = [V_1, V_2] [ L R ; 0 M ],   V := [V_1, V_2].

Equivalently:

    A = V [ L R ; 0 M ] V^{-1}.   (6)

On the other hand, if (6) holds for an invertible matrix V ∈ C^{n×n}, then the first p columns of V span an A-invariant subspace. If R = 0 then the columns v_{p+1}, …, v_n also span an invariant subspace. This is called a complementary invariant subspace to V = R(V_1). The following proposition gives a sufficient condition for the existence of a complementary A-invariant subspace.

[1] Let χ(λ) = det(λ I − A) denote the characteristic polynomial of A. The Cayley–Hamilton theorem states that χ(A) = 0. The polynomial λ^k can be written in the form λ^k = q(λ) χ(λ) + r(λ) with polynomials q(λ) and r(λ) = Σ_{j=0}^{n−1} r_j λ^j (divide λ^k by χ(λ) to obtain this).
On replacing the variable λ with the matrix A we obtain

    A^k = q(A) χ(A) + r(A) = r(A) = Σ_{j=0}^{n−1} r_j A^j.

Thus each nonnegative power of A is a linear combination of I, A, …, A^{n−1}.
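Example 3 above can be reproduced numerically. The following sketch (my addition, using numpy; not part of the original notes) builds the Krylov basis b, Ab, …, A^{m−1}b for a generic pair (A, b) (where m = n) and checks the companion-form representation L:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Krylov basis b, Ab, ..., A^{n-1} b (linearly independent for generic A, b)
K = np.column_stack([np.linalg.matrix_power(A, k) @ b for k in range(n)])
# Coefficients of A^n b in that basis: A^n b = sum_k alpha_k A^k b
alpha = np.linalg.solve(K, np.linalg.matrix_power(A, n) @ b)

# Companion-form representation L of A on K(A, b)
L = np.zeros((n, n))
L[1:, :-1] = np.eye(n - 1)   # ones on the subdiagonal (shift: A^k b -> A^{k+1} b)
L[:, -1] = alpha             # last column carries the coefficients alpha_k
assert np.allclose(A @ K, K @ L)
```

The assertion verifies relation (3), A K = K L, so R(K) is A-invariant with representation L.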
Proposition 2.4 Suppose (6) holds and L and M have no common eigenvalue (i.e. σ(L) ∩ σ(M) = ∅). Then there exists Ṽ = [v_1, …, v_p, ṽ_{p+1}, …, ṽ_n] ∈ F^{n×n} such that

    A = Ṽ [ L 0 ; 0 M ] Ṽ^{-1}.   (7)

Proof: The Ansatz Ṽ = V [ I X ; 0 I ] with X ∈ F^{p×(n−p)} yields

    Ṽ^{-1} = [ I −X ; 0 I ] V^{-1}

and

    Ṽ^{-1} A Ṽ = [ I −X ; 0 I ] [ L R ; 0 M ] [ I X ; 0 I ] = [ L  LX − XM + R ; 0  M ].

Since L and M have no common eigenvalue, the Sylvester equation LX − XM + R = 0 has a unique solution X. □

3 Taylor expansion of block diagonalizations

Theorem 3.1 Suppose A ∈ F^{n×n} (F ∈ {R, C}) has a factorization

    A = V_0 [ L_0 0 ; 0 M_0 ] V_0^{-1},   V_0 ∈ F^{n×n}, L_0 ∈ F^{p×p}, M_0 ∈ F^{(n−p)×(n−p)},

and σ(L_0) ∩ σ(M_0) = ∅. Then there exist an open neighborhood U of 0 ∈ F^{n×n} and analytic functions Δ ↦ V^Δ, L^Δ, M^Δ on U satisfying V^0 = V_0, L^0 = L_0, M^0 = M_0 and

    A + Δ = V^Δ [ L^Δ 0 ; 0 M^Δ ] (V^Δ)^{-1},   Δ ∈ U.   (8)

Furthermore, V^Δ can be chosen such that

    V^Δ = V_0 [ I X^Δ ; Y^Δ I ]

with analytic functions Δ ↦ X^Δ ∈ F^{p×(n−p)}, Δ ↦ Y^Δ ∈ F^{(n−p)×p}. We then have

    [ L^Δ X^Δ ; Y^Δ M^Δ ] = [ L_0 0 ; 0 M_0 ] + Σ_{k≥1} [ L_k^Δ X_k^Δ ; Y_k^Δ M_k^Δ ]   ( =: Π_k^Δ ),

where each entry of Π_k^Δ is a homogeneous polynomial of degree k whose variables are the entries of Δ. The matrices Π_k^Δ satisfy the following recursion formulas. Partition

    V_0 = [V_1, V_2],   V_0^{-1} = [ W_1^* ; W_2^* ]   (9)
with V_1, W_1 ∈ F^{n×p}, V_2, W_2 ∈ F^{n×(n−p)}. Let the Sylvester operators S_1, S_2 be defined as

    S_1(X) = X M_0 − L_0 X,   S_2(Y) = Y L_0 − M_0 Y.

Then

    [ L_1^Δ X_1^Δ ; Y_1^Δ M_1^Δ ] = [ W_1^* Δ V_1   S_1^{-1}(W_1^* Δ V_2) ; S_2^{-1}(W_2^* Δ V_1)   W_2^* Δ V_2 ]

and

    [ L_k^Δ X_k^Δ ; Y_k^Δ M_k^Δ ] = [ (W_1^* Δ V_2) Y_{k−1}^Δ   S_1^{-1}(F_k^Δ) ; S_2^{-1}(G_k^Δ)   (W_2^* Δ V_1) X_{k−1}^Δ ]  for k ≥ 2,   (10)

where

    F_k^Δ := (W_1^* Δ V_1) X_{k−1}^Δ − Σ_{j=1}^{k−1} X_j^Δ M_{k−j}^Δ,
    G_k^Δ := (W_2^* Δ V_2) Y_{k−1}^Δ − Σ_{j=1}^{k−1} Y_j^Δ L_{k−j}^Δ.

Corollary 3.2 Partition V^Δ = [V_1^Δ, V_2^Δ] with V_1^Δ ∈ F^{n×p} and V_2^Δ ∈ F^{n×(n−p)}. Then

    L^Δ = L_0 + W_1^* Δ V_1 + W_1^* Δ V_2 S_2^{-1}(W_2^* Δ V_1) + O(‖Δ‖³),   (11)
    V_1^Δ = V_1 + V_2 S_2^{-1}(W_2^* Δ V_1) + O(‖Δ‖²).   (12)

In the special case that L_0 = λ I we have

    L^Δ = λ I + W_1^* Δ V_1 + W_1^* Δ (λ I − A)^D Δ V_1 + O(‖Δ‖³),   (13)
    V_1^Δ = V_1 + (λ I − A)^D Δ V_1 + O(‖Δ‖²),   (14)

where (λ I − A)^D denotes the Drazin inverse.

Proof: Theorem 3.1 implies

    L^Δ = L_0 + L_1^Δ + L_2^Δ + O(‖Δ‖³),   V_1^Δ = V_1 + V_2 Y_1^Δ + O(‖Δ‖²),

where L_1^Δ = W_1^* Δ V_1, Y_1^Δ = S_2^{-1}(W_2^* Δ V_1), L_2^Δ = (W_1^* Δ V_2) Y_1^Δ. Hence (11) and (12) hold. If L_0 = λ I then S_2(Y) = (λ I − M_0) Y. Thus, S_2^{-1}(Y) = (λ I − M_0)^{-1} Y. As is easily verified, we have (λ I − A)^D = V_2 (λ I − M_0)^{-1} W_2^*. This yields (13) and (14). □

Remark: For the case that L_0 is a 1×1 matrix, L_0 = [λ], formula (13) gives the Taylor expansion of a simple eigenvalue up to second order. In this case V_1 and W_1 are right and left eigenvectors to the eigenvalue λ with the property that W_1^* V_1 = 1. Formula (14) gives the Taylor expansion for an associated right eigenvector V_1^Δ of A + Δ. The eigenvector is not unique. One gets another Taylor expansion if one multiplies the eigenvector V_1^Δ by a scalar factor which depends smoothly on the perturbation.
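For a simple eigenvalue the leading term of (13) is the classical first-order formula λ^Δ ≈ λ + w^* Δ v with w^* v = 1. The following sketch (my addition, using numpy; not part of the original notes) checks this numerically on a small example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])
Delta = np.array([[0.3, -0.2],
                  [0.4,  0.1]])

# Right and left eigenvectors of the simple eigenvalue lambda = 2,
# normalized so that w^* v = 1 (here everything is real)
lam = 2.0
v = np.array([1.0, 0.0])
w = np.array([1.0, -1.0 / 3.0])
assert np.allclose(w @ A, lam * w)      # w^* A = lambda w^*
assert np.isclose(w @ v, 1.0)

t = 1e-5
first_order = lam + t * (w @ Delta @ v)  # formula (13) truncated after the linear term
eigs = np.linalg.eigvals(A + t * Delta)
lam_t = eigs[np.argmin(np.abs(eigs - lam))]
assert abs(lam_t - first_order) < 1e-8   # the remainder is O(t^2)
```

Shrinking t by a factor of 10 shrinks the error by roughly a factor of 100, consistent with the quadratic remainder in (13).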
Exercise 3.3 (5 points) Let W_1^Δ ∈ F^{n×p}, W_2^Δ ∈ F^{n×(n−p)} be such that

    (V^Δ)^{-1} = [ (W_1^Δ)^* ; (W_2^Δ)^* ].

Then P^Δ := V_1^Δ (W_1^Δ)^* is the projector onto the (A + Δ)-invariant subspace R(V_1^Δ) along the (A + Δ)-invariant subspace R(V_2^Δ). We set P := P^0 = V_1 W_1^*. Show the following.
(a) (W_1^Δ)^* = W_1^* − S_1^{-1}(W_1^* Δ V_2) W_2^* + O(‖Δ‖²). Hint: You might use the fact that (I + Z)^{-1} = I − Z + O(‖Z‖²).
(b) If L_0 = λ I then

    (W_1^Δ)^* = W_1^* + W_1^* Δ (λ I − A)^D + O(‖Δ‖²),
    P^Δ = P + (λ I − A)^D Δ P + P Δ (λ I − A)^D + O(‖Δ‖²).

Exercise 3.4 (4 points) Compute the Taylor expansion up to second order for the eigenvalue 2 of the matrix

    A = [ 2 5 ; 0 5 ].

Proof of Theorem 3.1: We consider the analytic function f : F^{n×n} × F^{n×n} → F^{n×n} defined by

    f( [ H_1 H_2 ; H_3 H_4 ], [ δL X ; Y δM ] )
      := [ L_0 + H_1  H_2 ; H_3  M_0 + H_4 ] [ I X ; Y I ] − [ I X ; Y I ] [ L_0 + δL  0 ; 0  M_0 + δM ]
      = [ −δL + H_1 + H_2 Y      −(S_1(X) + X δM) + H_2 + H_1 X
          −(S_2(Y) + Y δL) + H_3 + H_4 Y      −δM + H_4 + H_3 X ],

where S_1(X) = X M_0 − L_0 X and S_2(Y) = Y L_0 − M_0 Y. We have

    f( 0, [ δL X ; Y δM ] ) = − [ δL  S_1(X) ; S_2(Y)  δM ] + O( ‖ [ δL X ; Y δM ] ‖² ).

Hence the derivative of f at (0, 0) with respect to the second variable is the linear map

    [ δL X ; Y δM ] ↦ − [ δL  S_1(X) ; S_2(Y)  δM ].

This map is bijective since the Sylvester operators S_1, S_2 are invertible. Hence, by the implicit function theorem there exists an analytic function

    H ↦ [ δL(H)  X(H) ; Y(H)  δM(H) ]   (15)
defined in a neighborhood of 0 ∈ F^{n×n} such that

    [ δL(0)  X(0) ; Y(0)  δM(0) ] = 0   (16)

and

    f( H, [ δL(H)  X(H) ; Y(H)  δM(H) ] ) = 0.   (17)

The latter identity is equivalent to

    [ L_0 + H_1  H_2 ; H_3  M_0 + H_4 ] = [ I  X(H) ; Y(H)  I ] [ L_0 + δL(H)  0 ; 0  M_0 + δM(H) ] [ I  X(H) ; Y(H)  I ]^{-1}

if H is small enough (otherwise the inverse might not exist). For Δ in a small neighborhood of 0 ∈ C^{n×n} let

    H^Δ := V_0^{-1} Δ V_0,   [ L^Δ  X^Δ ; Y^Δ  M^Δ ] := [ L_0 + δL(H^Δ)  X(H^Δ) ; Y(H^Δ)  M_0 + δM(H^Δ) ].

Then (8) holds: indeed, A + Δ = V_0 ( [ L_0 0 ; 0 M_0 ] + H^Δ ) V_0^{-1}, and applying the identity above with H = H^Δ gives (8) with V^Δ = V_0 [ I  X(H^Δ) ; Y(H^Δ)  I ]. Observe that with the partition (9),

    H^Δ = [ H_1  H_2 ; H_3  H_4 ] = [ W_1^* Δ V_1   W_1^* Δ V_2 ; W_2^* Δ V_1   W_2^* Δ V_2 ].   (18)

Next we verify the recursion formulas (10). First note that (17) is equivalent to the four matrix equations

    0 = −δL(H) + H_1 + H_2 Y,
    0 = −(S_1(X) + X δM(H)) + H_2 + H_1 X,
    0 = −(S_2(Y) + Y δL(H)) + H_3 + H_4 Y,
    0 = −δM(H) + H_4 + H_3 X.   (19)

Let P_k denote the set of matrix functions of H whose entries are homogeneous polynomials of degree k in the entries of H. Since the function (15) is analytic and satisfies (16) we can write

    [ δL(H)  X(H) ; Y(H)  δM(H) ] = Σ_{k≥1} [ L_k(H)  X_k(H) ; Y_k(H)  M_k(H) ],  with entries in P_k.

The first equation of (19) then gives

    0 = −δL(H) + H_1 + H_2 Y(H) = ( −L_1(H) + H_1 ) + Σ_{k≥2} ( −L_k(H) + H_2 Y_{k−1}(H) ),

where the first summand lies in P_1 and the kth summand of the series lies in P_k.
This implies L_1(H) = H_1 and L_k(H) = H_2 Y_{k−1}(H) for k ≥ 2. From the third equation of (19) we obtain (omitting the argument H)

    0 = −S_2(Y) − Y δL + H_3 + H_4 Y
      = −S_2( Σ_{k≥1} Y_k ) − ( Σ_{k≥1} Y_k )( Σ_{k≥1} L_k ) + H_3 + H_4 Σ_{k≥1} Y_k
      = ( −S_2(Y_1) + H_3 ) + Σ_{k≥2} ( −S_2(Y_k) − Σ_{j=1}^{k−1} Y_j L_{k−j} + H_4 Y_{k−1} ),

where again the kth summand lies in P_k. Thus,

    Y_1(H) = S_2^{-1}(H_3)  and  Y_k(H) = S_2^{-1}( H_4 Y_{k−1}(H) − Σ_{j=1}^{k−1} Y_j(H) L_{k−j}(H) )  for k ≥ 2.

Analogously, the second and the fourth equation of (19) yield

    M_1(H) = H_4  and  M_k(H) = H_3 X_{k−1}(H)  for k ≥ 2,
    X_1(H) = S_1^{-1}(H_2)  and  X_k(H) = S_1^{-1}( H_1 X_{k−1}(H) − Σ_{j=1}^{k−1} X_j(H) M_{k−j}(H) )  for k ≥ 2.

In order to obtain the recursion formulas (10) from this, replace H by H^Δ, i.e. replace the blocks H_1, H_2, H_3, H_4 by W_1^* Δ V_1, W_1^* Δ V_2, W_2^* Δ V_1, W_2^* Δ V_2 according to (18), and write L_k^Δ for L_k(H^Δ), etc. □

4 The eigenvalue inclusion theorems of Gershgorin and Brauer

In the following, R_j(A) denotes the sum of the absolute values of the off-diagonal elements in the jth row of A = [a_jk] ∈ C^{n×n},

    R_j(A) := Σ_{k≠j} |a_jk|,   j = 1, …, n.

By D(a, r) we denote the closed disk of radius r about a ∈ C,

    D(a, r) = {z ∈ C : |z − a| ≤ r} ⊂ C.
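As a quick numerical sanity check of Gershgorin's theorem (stated next), the row sums R_j(A) are cheap to compute and every eigenvalue can be tested for membership in one of the disks. A small sketch (my addition, using numpy; not part of the original notes):

```python
import numpy as np

A = np.array([[ 4.0,  1.0,  0.5],
              [ 0.2, -3.0,  0.1],
              [ 1.0,  0.0,  9.0]])

# Off-diagonal row sums R_j(A)
R = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))

# Every eigenvalue lies in at least one disk D(a_jj, R_j(A))
for lam in np.linalg.eigvals(A):
    assert any(abs(lam - A[j, j]) <= R[j] for j in range(3))
```

The disks here are D(4, 1.5), D(−3, 0.3) and D(9, 1), so the three eigenvalues are already well separated by this simple test.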
Theorem 4.1 (Gershgorin) Each eigenvalue of A ∈ C^{n×n} is contained in one of the disks D(a_jj, R_j(A)), j = 1, …, n. Alternative formulation:

    σ(A) ⊆ ∪_{j=1}^n D(a_jj, R_j(A)).

Proof: Let A x = λ x, x = [x_1, x_2, …, x_n]^T ≠ 0, λ, x_j ∈ C. We have (A − λ I) x = 0. The latter equation can be written more explicitly as

    (a_jj − λ) x_j + Σ_{k≠j} a_jk x_k = 0,   j = 1, …, n.

This implies

    |a_jj − λ| |x_j| = | Σ_{k≠j} a_jk x_k | ≤ Σ_{k≠j} |a_jk| |x_k|,   j = 1, …, n.   (20)

Now choose j such that |x_j| ≥ |x_k| for all k. Then (20) implies

    |a_jj − λ| |x_j| ≤ Σ_{k≠j} |a_jk| |x_j|.

On dividing this inequality by |x_j| > 0 we obtain |a_jj − λ| ≤ R_j(A). □

Exercise 4.2 (2 points) A matrix A ∈ C^{n×n} is said to be strictly diagonally dominant if |a_jj| > R_j(A) for all j = 1, …, n. Use the Gershgorin theorem to show that strictly diagonally dominant matrices are invertible. Hint: We have det(A) = 0 if and only if 0 is an eigenvalue of A.

In order to state the inclusion theorem of Brauer we introduce the sets

    C(a_1, a_2, r) := {z ∈ C : |z − a_1| |z − a_2| ≤ r},   a_1, a_2 ∈ C, r ≥ 0.

Though these sets are not necessarily oval-shaped, they are called the ovals of Cassini.

Theorem 4.3 (Brauer) Each eigenvalue of A ∈ C^{n×n} is contained in one of the sets C(a_jj, a_ll, R_j(A) R_l(A)), j, l = 1, …, n, j ≠ l.

Proof: We use the notation of the proof of Gershgorin's theorem. We choose indices j, l such that |x_j| ≥ |x_l| ≥ |x_k| for all k ∉ {j, l}. By multiplying the associated inequalities (20),

    |a_jj − λ| |x_j| ≤ Σ_{k≠j} |a_jk| |x_k|,
    |a_ll − λ| |x_l| ≤ Σ_{k≠l} |a_lk| |x_k|,
Figure 1: The Cassini sets C(−1, 1, r) for several values of r.

we obtain

    |a_jj − λ| |a_ll − λ| |x_j| |x_l| ≤ ( Σ_{k≠j} |a_jk| |x_k| ) ( Σ_{k≠l} |a_lk| |x_k| )
       = Σ_{k_1≠j} Σ_{k_2≠l} |a_{jk_1}| |a_{lk_2}| |x_{k_1}| |x_{k_2}|
       ≤ Σ_{k_1≠j} Σ_{k_2≠l} |a_{jk_1}| |a_{lk_2}| |x_j| |x_l|
       = ( Σ_{k_1≠j} |a_{jk_1}| ) ( Σ_{k_2≠l} |a_{lk_2}| ) |x_j| |x_l|
       = R_j(A) R_l(A) |x_j| |x_l|.

If x_l ≠ 0 then we can divide the inequality by |x_j| |x_l| and obtain

    |a_jj − λ| |a_ll − λ| ≤ R_j(A) R_l(A).

Hence, λ ∈ C(a_jj, a_ll, R_j(A) R_l(A)). The case x_l = 0 is left as an exercise. □

Exercise 4.4 (2 points) Complete the proof for the case x_l = 0.

5 Pseudospectra

A pseudospectrum is the set of eigenvalues of all matrices which are obtained from a nominal matrix A by adding a perturbation of bounded size. Precisely, we define the
F-pseudospectrum of A ∈ F^{n×n} to the perturbation level ρ > 0 as

    σ_F(A, ρ) := {z ∈ C : z ∈ σ(A + Δ) for some Δ ∈ F^{n×n} with ‖Δ‖ ≤ ρ}.

Let d_F(A) denote the distance of A to the set of singular matrices:

    d_F(A) := min{ ‖Δ‖ : Δ ∈ F^{n×n}, det(A + Δ) = 0 }.

Then

    σ_F(A, ρ) = {z ∈ C : z ∈ σ(A + Δ) for some Δ ∈ F^{n×n} with ‖Δ‖ ≤ ρ}
              = {z ∈ C : A + Δ − z I is singular for some Δ ∈ F^{n×n} with ‖Δ‖ ≤ ρ}
              = {z ∈ C : d_F(A − z I) ≤ ρ}.

Hence pseudospectra are the sublevel sets of the function z ↦ d_F(A − z I). The next proposition gives d_F for the case F = C.

Proposition 5.1 For any A ∈ C^{n×n}, d_C(A) = σ_n, the smallest singular value of A.

Proof: If A + Δ is singular, then (A + Δ) v = 0 for some v ∈ C^n with ‖v‖ = 1, hence

    ‖Δ‖ ≥ ‖Δ v‖ = ‖A v‖ ≥ min_{‖x‖=1} ‖A x‖ = σ_n.

Conversely, let A = Σ_{k=1}^n σ_k u_k v_k^* be a singular value decomposition with pairwise orthogonal unit vectors v_k and pairwise orthogonal unit vectors u_k. Let Δ = −σ_n u_n v_n^*. Then ‖Δ‖ = σ_n and (A + Δ) v_n = 0. □

It follows that

    σ_C(A, ρ) = {z ∈ C : σ_n(A − z I) ≤ ρ}.

In the case F = R we have

    d_R(A) = inf_{γ ∈ (0,1]} σ_{2n−1}( [ RA  −γ IA ; γ^{-1} IA  RA ] )  for all A ∈ C^{n×n},

where RA and IA denote the real and the imaginary part of A and σ_{2n−1} is the second smallest singular value. The proof is too complicated to give here.

Exercise 5.2 (4 points) Let D(λ, ρ) ⊂ C denote the closed disk with center λ ∈ C and radius ρ. Show that for any A ∈ C^{n×n},

    ∪_{λ ∈ σ(A)} D(λ, ρ) ⊆ σ_C(A, ρ).

Show that equality holds if A is normal. Hence, the normal matrices have the smallest possible complex pseudospectra. Remark: the real pseudospectra of real normal matrices are generically not unions of disks. It is still an open question whether the real normal matrices have the smallest possible real pseudospectra in their similarity class. If you solve this problem you get 100 points.
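The rank-one construction in the proof of Proposition 5.1 is directly computable: for any point z, the perturbation Δ = −σ_n u_n v_n^* built from the SVD of A − zI has norm σ_n(A − zI) and moves an eigenvalue exactly onto z. A sketch (my addition, using numpy; not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
z = 0.5 + 0.3j

# d_C(A - zI) is the smallest singular value of A - zI
B = A - z * np.eye(4)
U, s, Vh = np.linalg.svd(B)       # B = U diag(s) Vh, rows of Vh are v_k^*
rho = s[-1]

# Rank-one perturbation Delta = -sigma_n u_n v_n^* of norm rho
Delta = -rho * np.outer(U[:, -1], Vh[-1])
assert np.isclose(np.linalg.norm(Delta, 2), rho)

# z is now an eigenvalue of A + Delta, so z lies in sigma_C(A, rho)
assert np.min(np.abs(np.linalg.eigvals(A + Delta) - z)) < 1e-8
```

Plotting z ↦ σ_n(A − zI) on a grid and drawing its sublevel sets is the standard way to visualize σ_C(A, ρ).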
From the statements of the exercise we can conclude the following result of Bauer and Fike.

Proposition 5.3 Suppose A has a basis V = [v_1, …, v_n] of eigenvectors. Then

    σ(A + Δ) ⊆ ∪_{λ ∈ σ(A)} D(λ, ρ),   where ρ = ‖V‖ ‖V^{-1}‖ ‖Δ‖

(the factor ‖V‖ ‖V^{-1}‖ is the condition number of V).

Proof: We have A = V Λ V^{-1}, where Λ is a diagonal matrix of eigenvalues. It follows that

    σ(A + Δ) = σ( V^{-1} (A + Δ) V ) = σ( Λ + V^{-1} Δ V )
             ⊆ σ_C( Λ, ‖V^{-1} Δ V‖ )
             ⊆ σ_C( Λ, ‖V^{-1}‖ ‖Δ‖ ‖V‖ )
             = ∪_{λ ∈ σ(Λ) = σ(A)} D(λ, ‖V^{-1}‖ ‖Δ‖ ‖V‖).

The latter equality holds since Λ is normal. □

6 References

The following books helped me to prepare these notes.

G.W. Stewart, J. Sun. Matrix Perturbation Theory. Academic Press, 1990.
G. Golub, C. Van Loan. Matrix Computations. Johns Hopkins University Press, 1983.
P. Lancaster, M. Tismenetsky. The Theory of Matrices. Academic Press, 1985.