
Lecture notes: Applied linear algebra Part 2. Version 1

Michael Karow
Berlin University of Technology
karow@math.tu-berlin.de
October 2, 2008

First, some exercises:

Exercise 0.1 (2 Points) Another least squares problem: Let $c \in \mathbb{C}^n$ and let $A \in \mathbb{C}^{m\times n}$, $m \ge n$, be a matrix with full column rank. Give a formula for $\min\{\,\|x\|\;;\; x \in \mathbb{C}^m,\ A^* x = c\,\}$ in terms of $c$ and $A^*A$.

Exercise 0.2 (4+4 Points) This is an exercise on the SVD.

(a) Let $\mathcal{U}$ and $\mathcal{V}$ be subspaces of $\mathbb{C}^n$ of dimensions $q$ and $p$. Then there exist an orthonormal basis $u_1,\dots,u_q$ of $\mathcal{U}$ and an orthonormal basis $v_1,\dots,v_p$ of $\mathcal{V}$ such that $u_j^* v_k = 0$ for $j \ne k$ and $0 \le u_k^* v_k \le 1$ for $k \le \min\{p,q\}$. The numbers $\varphi_k := \arccos(u_k^* v_k)$ are called the canonical angles between the subspaces. Hint: Take any orthonormal bases $X$ of $\mathcal{U}$ and $Y$ of $\mathcal{V}$ and compute a singular value decomposition of $X^* Y$.

(b) We consider a direct decomposition $\mathbb{C}^n = \mathcal{U} \oplus \mathcal{W}$. Let $P$ be the projector onto $\mathcal{U}$ along $\mathcal{W}$. Suppose the columns of the matrix $X$ form an orthonormal basis of $\mathcal{U}$ and the columns of $Y$ form an orthonormal basis of $\mathcal{W}^\perp$. Then $\|P\| = 1/\sigma_{\min}(X^* Y)$, where $\sigma_{\min}(\cdot)$ denotes the smallest singular value. Hint: you might use the fact that the matrix products $AB$ and $BA$ have the same nonzero eigenvalues. This holds for any $A \in \mathbb{C}^{m\times n}$, $B \in \mathbb{C}^{n\times m}$.

The goal of the following notes is to give an introduction to perturbation theory of eigenvalues and invariant subspaces.
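Not part of the original notes: a minimal NumPy sketch of the hint in Exercise 0.2(a). The function name `principal_angles` and the random test subspaces are illustrative choices; the computation is just the SVD of $X^*Y$ for orthonormal bases $X$, $Y$.

```python
import numpy as np

def principal_angles(X, Y):
    """Canonical (principal) angles between ran(X) and ran(Y).

    X and Y are assumed to have orthonormal columns (e.g. from a QR
    factorization); the angles are arccos of the singular values of X^* Y.
    """
    s = np.linalg.svd(X.conj().T @ Y, compute_uv=False)
    s = np.clip(s, -1.0, 1.0)          # guard against rounding slightly above 1
    return np.arccos(s)

# illustrative example: two random subspaces of C^6
rng = np.random.default_rng(0)
X, _ = np.linalg.qr(rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2)))
Y, _ = np.linalg.qr(rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3)))
print(principal_angles(X, Y))          # min(p, q) = 2 angles in [0, pi/2]
```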

1 Some preliminaries

1.1 The dual basis

Suppose the columns of $V = [v_1,\dots,v_n] \in \mathbb{F}^{n\times n}$ form a basis of $\mathbb{F}^n$. Then $V^{-1}$ exists. The columns $w_1,\dots,w_n$ of $W := (V^{-1})^*$ are linearly independent and form a basis of $\mathbb{F}^n$. This basis is called the dual to the basis $v_1,\dots,v_n$. The identity $W^* V = V^{-1} V = I$ then states that

$\langle w_j, v_k \rangle = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{otherwise.} \end{cases}$

The identity $I = V V^{-1} = V W^*$ yields, for any $x \in \mathbb{F}^n$,

$x = V W^* x = \sum_{k=1}^{n} v_k \langle w_k, x \rangle.$

Hence the scalar products $\langle w_k, x \rangle$ are the coordinates of $x$ with respect to the basis $v_1,\dots,v_n$.

1.2 Left eigenvectors

A vector $w \in \mathbb{C}^n \setminus \{0\}$ is said to be a left eigenvector of $A \in \mathbb{C}^{n\times n}$ to the eigenvalue $\lambda \in \mathbb{C}$ if $w^* A = \lambda\, w^*$. By transposing this equation we obtain $A^T \bar{w} = \lambda\, \bar{w}$. Hence the left eigenvectors are the conjugates of the right eigenvectors of $A^T$. Recall that the eigenvalues of $A$ and $A^T$ are the same. It follows that to each eigenvalue $\lambda$ of $A$ there exists a left eigenvector.

Suppose $A$ is diagonalizable, i.e.

$A = V \Lambda V^{-1}, \qquad \Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_n). \qquad (*)$

Then the columns of $V$ form a basis of (right) eigenvectors. Moreover, $(*)$ implies that $W^* A = \Lambda W^*$, where $W = (V^{-1})^*$. Equivalently,

$w_j^* A = \lambda_j\, w_j^*, \qquad j = 1,\dots,n.$

Thus the columns of $W$ (i.e. the conjugates of the rows of $V^{-1}$) form a basis of left eigenvectors.

1.3 The Drazin inverse

It is a basic fact in linear algebra that for any $A \in \mathbb{F}^{n\times n}$, $\mathbb{F}^n = R(A^n) \oplus N(A^n)$. The restriction of the linear map $x \mapsto Ax$ to the $A$-invariant subspace $R(A^n)$ is invertible (one-to-one and onto). Hence, there is a unique matrix $A^D \in \mathbb{F}^{n\times n}$ such that $A^D A x = x$ for $x \in R(A^n)$ and $A^D x = 0$ for $x \in N(A^n)$. This matrix is called the Drazin inverse of $A$. Suppose we have a factorization of the form

$A = V \begin{bmatrix} N & 0 \\ 0 & M \end{bmatrix} V^{-1}$

with square matrices $N$, $M$ such that $\sigma(N) = \{0\}$ and $0 \notin \sigma(M)$. Write $V$ and $V^{-1}$ in the block form

$V = [V_1, V_2], \qquad V^{-1} = \begin{bmatrix} W_1^* \\ W_2^* \end{bmatrix},$

where $V_1$, $W_1$ have the same number of columns as $N$. Then $R(V_1) = N(A^n)$, $R(V_2) = R(A^n)$ and

$A^D = V \begin{bmatrix} 0 & 0 \\ 0 & M^{-1} \end{bmatrix} V^{-1} = V_2 M^{-1} W_2^*.$

Exercise 1.1 (2 points) Show that $A^D = A^+$ ($A^+$ the Moore-Penrose inverse) if $A$ is normal.

1.4 The Sylvester equation

Proposition 1.2 Let $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{q\times q}$, $C \in \mathbb{C}^{m\times q}$. If $\sigma(A) \cap \sigma(B) = \emptyset$ then the Sylvester equation

$A X - X B = C \qquad (1)$

has a unique solution $X \in \mathbb{C}^{m\times q}$.

Proof: Suppose first that $B = [b_{jk}]$ is upper triangular, i.e. $b_{jk} = 0$ for $j > k$. Then the diagonal elements $b_{kk}$ are the eigenvalues of $B$. Let $x_k$, $b_k$, $c_k$ denote the $k$th column of $X$, $B$, $C$ respectively. Then equation (1) is equivalent to

$c_k = A x_k - X b_k = A x_k - \sum_{j=1}^{k} x_j b_{jk} = (A - b_{kk} I) x_k - \sum_{j=1}^{k-1} x_j b_{jk} \qquad (2)$

for $k = 1,\dots,q$. Since $b_{kk}$ is not an eigenvalue of $A$, the matrix $A - b_{kk} I$ is invertible. Thus, (2) is equivalent to

$x_k = (A - b_{kk} I)^{-1} \Big( c_k + \sum_{j=1}^{k-1} x_j b_{jk} \Big).$

This is a recursion formula for the computation of the columns $x_k$.

Suppose now that $B$ is not upper triangular. Let $B = V B_0 V^*$ be a Schur decomposition with unitary $V$ and upper triangular $B_0$. Multiplying (1) by $V$ from the right and using $BV = V B_0$ we obtain the equivalent equation

$A X_0 - X_0 B_0 = C_0, \qquad \text{where } X_0 := XV,\ C_0 := CV.$

Now we can apply the method above to compute the columns of $X_0$; then $X = X_0 V^*$. $\square$
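Not part of the original notes: a minimal NumPy/SciPy sketch of the proof's construction, i.e. the column recursion (2) after a complex Schur decomposition of $B$. The function name, the example matrices and the shift used to separate the spectra are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import schur

def solve_sylvester(A, B, C):
    """Solve AX - XB = C by the column recursion (2).

    Assumes sigma(A) and sigma(B) are disjoint.  B is first reduced to
    upper triangular form by a complex Schur decomposition B = V B0 V^*.
    """
    m, q = C.shape
    B0, V = schur(B, output='complex')       # B = V @ B0 @ V.conj().T
    C0 = C @ V                               # transformed right-hand side
    X0 = np.zeros((m, q), dtype=complex)
    I = np.eye(m)
    for k in range(q):
        rhs = C0[:, k] + X0[:, :k] @ B0[:k, k]
        X0[:, k] = np.linalg.solve(A - B0[k, k] * I, rhs)
    return X0 @ V.conj().T

# illustrative check: shift B so that the spectra are disjoint
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3)) + 10 * np.eye(3)
C = rng.standard_normal((4, 3))
X = solve_sylvester(A, B, C)
print(np.linalg.norm(A @ X - X @ B - C))     # residual should be ~1e-14
```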

1.5 Continuity of eigenvalues

Proposition 1.3 Let $\lambda_0 \in \mathbb{C}$ be an eigenvalue of $A_0 \in \mathbb{C}^{n\times n}$ of algebraic multiplicity $m$. Let $D \subset \mathbb{C}$ be a closed disk about $\lambda_0$ that contains no other eigenvalue of $A_0$. Then there exists an $\epsilon > 0$ such that $D$ contains precisely $m$ eigenvalues (counting algebraic multiplicities) of $A \in \mathbb{C}^{n\times n}$ whenever $\|A - A_0\| \le \epsilon$.

Proof: Let $f_A(z) = \det(zI - A)$. By the argument principle the number of zeros of the holomorphic function $f_A$ in the interior of the disk $D$ is given by

$m(A) = \frac{1}{2\pi i} \oint_{\partial D} \frac{f_A'(z)}{f_A(z)}\, dz.$

This integral is well defined if $f_A$ has no zeros on $\partial D$, the boundary of $D$. The function $A \mapsto m(A)$ is continuous and takes only integer values. Hence it is constant on each connected component of its domain of definition. $\square$

2 Invariant subspaces

2.1 Definition and matrix representation

Let $A \in \mathbb{F}^{n\times n}$, $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, be a square matrix. A subspace $\mathcal{V} \subseteq \mathbb{F}^n$ is said to be $A$-invariant if $A\mathcal{V} \subseteq \mathcal{V}$, i.e. $v \in \mathcal{V}$ implies $Av \in \mathcal{V}$.

Exercise 2.1 (3 points) Let $\mathcal{V}$ and $\mathcal{U}$ be $A$-invariant subspaces. Show that the subspaces

$\mathcal{U} + \mathcal{V} = \{u + v : u \in \mathcal{U},\ v \in \mathcal{V}\}, \qquad \mathcal{U} \cap \mathcal{V} = \{w : w \in \mathcal{U} \text{ and } w \in \mathcal{V}\}$

are also $A$-invariant. Show that the orthogonal complement of $\mathcal{V}$,

$\mathcal{V}^\perp = \{w \in \mathbb{C}^n : \langle w, v \rangle = 0 \text{ for all } v \in \mathcal{V}\},$

is an invariant subspace of $A^*$.

Let $v_k$ and $l_k$ denote the $k$th columns of $V \in \mathbb{F}^{n\times p}$ and $L = [l_{jk}] \in \mathbb{F}^{p\times p}$. Then the matrix equation

$A V = V L \qquad (3)$

is equivalent to the equations

$A v_k = V l_k = \sum_{j=1}^{p} v_j l_{jk}, \qquad k = 1,\dots,p. \qquad (4)$

These equations state that $A v_k$ is a linear combination of the vectors $v_j$. Hence, if (3) holds then $R(V)$ is an $A$-invariant subspace. On the other hand, if the subspace $\mathcal{V}$ is

$A$-invariant and $v_1,\dots,v_p$ is any basis of $\mathcal{V}$, then (3) holds for some $L \in \mathbb{F}^{p\times p}$. The matrix $L$ is said to be the representation of $A$ on $\mathcal{V}$ with respect to the basis $v_1,\dots,v_p$. Of course $L$ depends on the basis. Precisely, let $S \in \mathbb{F}^{p\times p}$ be nonsingular. Then the columns of $\hat{V} = V S$ and the columns of $V$ span the same subspace $\mathcal{V}$. Let $\hat{L} = S^{-1} L S$. Then the equivalence

$A V = V L \iff A \hat{V} = \hat{V} \hat{L}$

holds. Finally note that if $V \in \mathbb{F}^{n\times n}$ is a square matrix whose columns are linearly independent then

$A V = V L \iff V^{-1} A V = L \iff A = V L V^{-1}.$

2.2 Examples of invariant subspaces

Example 1: Let $v_1,\dots,v_p \in \mathbb{C}^n$ be eigenvectors of $A$ such that $A v_k = \lambda_k v_k$, $\lambda_k \in \mathbb{C}$. Then

$A \underbrace{[v_1,\dots,v_p]}_{V} = [v_1 \lambda_1,\dots,v_p \lambda_p] = [v_1,\dots,v_p]\, \underbrace{\operatorname{diag}(\lambda_1,\dots,\lambda_p)}_{\Lambda}.$

Thus, $\mathcal{V} = R(V)$ is $A$-invariant. Suppose additionally that $p = n$ and the $v_1,\dots,v_n$ are linearly independent. Then the vectors $v_k$ form a basis of $\mathbb{C}^n$, the matrix $V$ is invertible and the relation $AV = V\Lambda$ is equivalent to

$A = V \Lambda V^{-1}. \qquad (5)$

The latter factorization is called a diagonalization of $A$. Thus, $A$ is diagonalizable if and only if there exists a basis of eigenvectors. The eigenvectors are then the columns of the matrix $V$ in the factorization (5).

Example 2: A finite sequence of vectors $v_1,\dots,v_p \in \mathbb{C}^n$ is said to be a Jordan chain of $A \in \mathbb{C}^{n\times n}$ to the eigenvalue $\lambda \in \mathbb{C}$ if $A v_1 = \lambda v_1$ and $A v_k = \lambda v_k + v_{k-1}$ for $1 < k \le p$. The latter relations are equivalent to the matrix equation

$A \underbrace{[v_1,\dots,v_p]}_{V} = V J, \qquad \text{where } J = \begin{bmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{bmatrix}.$

Thus, $\operatorname{range} V$ is an invariant subspace. The matrix $J$ is called a Jordan block. Note that if one omits the last vectors of the chain then one obtains a shorter Jordan chain $v_1,\dots,v_q$, $q < p$, which also spans an invariant subspace (with a shorter Jordan block). The Jordan canonical form theorem states that to each $A \in \mathbb{C}^{n\times n}$ there exists a basis of $\mathbb{C}^n$ consisting of Jordan chains. Let $v_{i1},\dots,v_{ip_i}$, $i = 1,\dots,r$ ($\sum_i p_i = n$), be such a basis, i.e.

$A \underbrace{[v_{i1},\dots,v_{ip_i}]}_{=:V_i} = V_i J_i, \qquad i = 1,\dots,r$

with Jordan blocks $J_i$. We then have

$A \underbrace{[V_1, V_2,\dots,V_r]}_{=:V} = [V_1, V_2,\dots,V_r]\, \underbrace{\begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_r \end{bmatrix}}_{J}.$

Since $V$ is invertible, this is equivalent to $A = V J V^{-1}$. This is the Jordan factorization of $A$. Note that if all Jordan chains have length 1 (i.e. $p_i = 1$ for all $i$) then $J$ is diagonal and the columns of $V$ form a basis of eigenvectors.

Example 3: Let $A \in \mathbb{F}^{n\times n}$, $b \in \mathbb{F}^n \setminus \{0\}$. Let $\mathcal{K}(A, b)$ denote the smallest $A$-invariant subspace of $\mathbb{F}^n$ that contains $b$. We determine a basis and the associated matrix representation of $A$ for $\mathcal{K}(A, b)$: Since $\mathcal{K}(A, b)$ is $A$-invariant and contains $b$ it also contains the vectors $A^k b$ for all nonnegative integers $k$. However, not all of these vectors can be linearly independent. There is a positive integer $m \le n$ such that $b, Ab,\dots,A^{m-1}b$ are linearly independent and $A^m b$ is a linear combination of these vectors, i.e.

$A^m b = \sum_{k=0}^{m-1} \alpha_k A^k b, \qquad \alpha_k \in \mathbb{F}.$

This yields

$A [b, Ab,\dots,A^{m-1}b] = [Ab, A^2 b,\dots,A^m b] = [b, Ab,\dots,A^{m-1}b]\, \underbrace{\begin{bmatrix} 0 & & & \alpha_0 \\ 1 & \ddots & & \vdots \\ & \ddots & 0 & \alpha_{m-2} \\ & & 1 & \alpha_{m-1} \end{bmatrix}}_{=:L}.$

Thus, $b, Ab,\dots,A^{m-1}b$ is a basis of $\mathcal{K}(A, b)$ and $L$ is the associated matrix representation of $A$ on this subspace.

Example 4: Let $A \in \mathbb{F}^{n\times n}$, $B \in \mathbb{F}^{n\times p}$. Let $\mathcal{K}(A, B)$ denote the smallest $A$-invariant subspace of $\mathbb{F}^n$ that contains all columns of $B$. We have $\mathcal{K}(A, B) = R(K(A, B))$, where

$K(A, B) := [B, AB, A^2 B,\dots,A^{n-1} B]$

denotes the controllability matrix of $(A, B)$. Reason: $\mathcal{K}(A, B)$ is generated by the columns of the matrices $A^k B$, $k = 0, 1,\dots$. However, from the Cayley-Hamilton theorem it follows that each $A^k$ with $k \ge n$ is a linear combination of the matrices $I, A,\dots,A^{n-1}$.$^1$

Hence, the columns of $A^k B$ for $k \ge n$ are linear combinations of the columns of $B, AB,\dots,A^{n-1}B$.

Exercise 2.2 (2 points) Let $v \in \mathbb{C}^n$ be an eigenvector of the real matrix $A \in \mathbb{R}^{n\times n}$ to the nonreal eigenvalue $\lambda \in \mathbb{C} \setminus \mathbb{R}$, i.e. $Av = \lambda v$. We then also have $A \bar{v} = \bar{\lambda} \bar{v}$. Let $\Re v$, $\Im v$, $\Re\lambda$, $\Im\lambda$ denote the real and imaginary parts of $v$ and $\lambda$. Show that $R([v, \bar{v}]) = R([\Im v, \Re v])$ and

$A [\Im v, \Re v] = [\Im v, \Re v] \begin{bmatrix} \Re\lambda & -\Im\lambda \\ \Im\lambda & \Re\lambda \end{bmatrix}.$

Exercise 2.3 (4 points) Suppose the matrix equation $AV = VL$ holds. Prove:

(a) Each eigenvalue of $L$ is also an eigenvalue of $A$.
(b) If $x_1,\dots,x_q$ is a Jordan chain for $L$ then $V x_1,\dots,V x_q$ is a Jordan chain for $A$.
(c) Let $f(z) = \sum_k c_k z^k$ be a power series that converges for all $z \in \mathbb{C}$. Then $AV = VL$ implies $f(A)\, V = V f(L)$.
(d) Let $\mathcal{V}$ be $A$-invariant. Let $x : \mathbb{R} \to \mathbb{C}^n$ be the solution of the linear differential equation $\dot{x}(t) = A x(t)$ with initial value $x(0) \in \mathcal{V}$. Then $x(t) \in \mathcal{V}$ for all $t \in \mathbb{R}$. (Hint: for the proof you might use (c).)

Invariant subspaces and block triangular matrices. Let $V_1 = [v_1,\dots,v_p] \in \mathbb{C}^{n\times p}$ be a matrix whose columns are linearly independent and span an $A$-invariant subspace, i.e. $A V_1 = V_1 L$. Let $V_2 = [v_{p+1},\dots,v_n] \in \mathbb{C}^{n\times(n-p)}$ be such that the vectors $v_1,\dots,v_n$ form a basis of $\mathbb{C}^n$. Since each $A v_k$ is a linear combination of the basis vectors we have $A V_2 = V_1 R + V_2 M$ for some matrices $R$, $M$. Hence

$A \underbrace{[V_1, V_2]}_{=:V} = [A V_1, A V_2] = [V_1, V_2] \begin{bmatrix} L & R \\ 0 & M \end{bmatrix}.$

Equivalently:

$A = V \begin{bmatrix} L & R \\ 0 & M \end{bmatrix} V^{-1}. \qquad (6)$

On the other hand, if (6) holds for an invertible matrix $V \in \mathbb{C}^{n\times n}$, then the first $p$ columns of $V$ span an $A$-invariant subspace. If $R = 0$ then also the columns $v_{p+1},\dots,v_n$ span an invariant subspace. This is called a complementary invariant subspace to $\mathcal{V} = R(V_1)$. The following proposition gives a sufficient condition for the existence of a complementary $A$-invariant subspace.

$^1$ Let $\chi(\lambda) = \det(\lambda I - A)$ denote the characteristic polynomial of $A$. The Cayley-Hamilton theorem states that $\chi(A) = 0$. The polynomial $\lambda^k$ can be written in the form $\lambda^k = q(\lambda)\chi(\lambda) + r(\lambda)$ with polynomials $q(\lambda)$ and $r(\lambda) = \sum_{j=0}^{n-1} r_j \lambda^j$ (divide $\lambda^k$ by $\chi(\lambda)$ to obtain this). On replacing the variable $\lambda$ with the matrix $A$ we obtain $A^k = q(A)\chi(A) + r(A) = r(A) = \sum_{j=0}^{n-1} r_j A^j$. Thus each nonnegative power of $A$ is a linear combination of $I, A,\dots,A^{n-1}$.

Proposition 2.4 Suppose (6) holds and $L$ and $M$ have no common eigenvalue (i.e. $\sigma(L) \cap \sigma(M) = \emptyset$). Then there exists $\tilde{V} = [v_1,\dots,v_p, \tilde{v}_{p+1},\dots,\tilde{v}_n] \in \mathbb{F}^{n\times n}$ such that

$A = \tilde{V} \begin{bmatrix} L & 0 \\ 0 & M \end{bmatrix} \tilde{V}^{-1}. \qquad (7)$

Proof: The Ansatz $\tilde{V} = V \begin{bmatrix} I & X \\ 0 & I \end{bmatrix}$ with $X \in \mathbb{F}^{p\times(n-p)}$ yields

$\tilde{V}^{-1} = \begin{bmatrix} I & -X \\ 0 & I \end{bmatrix} V^{-1}$

and

$\tilde{V}^{-1} A \tilde{V} = \begin{bmatrix} I & -X \\ 0 & I \end{bmatrix} \begin{bmatrix} L & R \\ 0 & M \end{bmatrix} \begin{bmatrix} I & X \\ 0 & I \end{bmatrix} = \begin{bmatrix} L & LX - XM + R \\ 0 & M \end{bmatrix}.$

Since $L$ and $M$ have no common eigenvalue the Sylvester equation $LX - XM + R = 0$ has a unique solution $X$. $\square$

3 Taylor expansion of block diagonalization

Theorem 3.1 Suppose $A \in \mathbb{F}^{n\times n}$ ($\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$) has a factorization

$A = V_0 \begin{bmatrix} L_0 & 0 \\ 0 & M_0 \end{bmatrix} V_0^{-1}, \qquad V_0 \in \mathbb{F}^{n\times n},\ L_0 \in \mathbb{F}^{p\times p},\ M_0 \in \mathbb{F}^{(n-p)\times(n-p)}$

and $\sigma(L_0) \cap \sigma(M_0) = \emptyset$. Then there exist an open neighborhood $\mathcal{U}$ of $0 \in \mathbb{F}^{n\times n}$ and analytic functions $\Delta \mapsto V^\Delta$, $\Delta \mapsto L^\Delta$, $\Delta \mapsto M^\Delta$ on $\mathcal{U}$ satisfying $V^0 = V_0$, $L^0 = L_0$, $M^0 = M_0$ and

$A + \Delta = V^\Delta \begin{bmatrix} L^\Delta & 0 \\ 0 & M^\Delta \end{bmatrix} (V^\Delta)^{-1}, \qquad \Delta \in \mathcal{U}. \qquad (8)$

Furthermore $V^\Delta$ can be chosen such that

$V^\Delta = V_0 \begin{bmatrix} I & X^\Delta \\ Y^\Delta & I \end{bmatrix}$

with analytic functions $\Delta \mapsto X^\Delta \in \mathbb{F}^{p\times(n-p)}$, $\Delta \mapsto Y^\Delta \in \mathbb{F}^{(n-p)\times p}$. We then have

$\begin{bmatrix} L^\Delta & X^\Delta \\ Y^\Delta & M^\Delta \end{bmatrix} = \begin{bmatrix} L_0 & 0 \\ 0 & M_0 \end{bmatrix} + \sum_{k=1}^{\infty} \underbrace{\begin{bmatrix} L_k^\Delta & X_k^\Delta \\ Y_k^\Delta & M_k^\Delta \end{bmatrix}}_{\Pi_k^\Delta},$

where each entry of $\Pi_k^\Delta$ is a homogeneous polynomial of degree $k$ whose variables are the entries of $\Delta$. The matrices $\Pi_k^\Delta$ satisfy the following recursion formula. Partition

$V_0 = [V_1, V_2], \qquad V_0^{-1} = \begin{bmatrix} W_1^* \\ W_2^* \end{bmatrix} \qquad (9)$

with $V_1, W_1 \in \mathbb{F}^{n\times p}$, $V_2, W_2 \in \mathbb{F}^{n\times(n-p)}$. Let the Sylvester operators $S_1$, $S_2$ be defined as

$S_1(X) = X M_0 - L_0 X, \qquad S_2(Y) = Y L_0 - M_0 Y.$

Then

$\begin{bmatrix} L_1^\Delta & X_1^\Delta \\ Y_1^\Delta & M_1^\Delta \end{bmatrix} = \begin{bmatrix} W_1^* \Delta V_1 & S_1^{-1}(W_1^* \Delta V_2) \\ S_2^{-1}(W_2^* \Delta V_1) & W_2^* \Delta V_2 \end{bmatrix}, \qquad \begin{bmatrix} L_k^\Delta & X_k^\Delta \\ Y_k^\Delta & M_k^\Delta \end{bmatrix} = \begin{bmatrix} W_1^* \Delta V_2\, Y_{k-1}^\Delta & S_1^{-1}(F_k^\Delta) \\ S_2^{-1}(G_k^\Delta) & W_2^* \Delta V_1\, X_{k-1}^\Delta \end{bmatrix} \quad \text{for } k \ge 2, \qquad (10)$

where

$F_k^\Delta := W_1^* \Delta V_1\, X_{k-1}^\Delta - \sum_{j=1}^{k-1} X_j^\Delta M_{k-j}^\Delta, \qquad G_k^\Delta := W_2^* \Delta V_2\, Y_{k-1}^\Delta - \sum_{j=1}^{k-1} Y_j^\Delta L_{k-j}^\Delta.$

Corollary 3.2 Partition $V^\Delta = [V_1^\Delta, V_2^\Delta]$ with $V_1^\Delta \in \mathbb{F}^{n\times p}$ and $V_2^\Delta \in \mathbb{F}^{n\times(n-p)}$. Then

$L^\Delta = L_0 + W_1^* \Delta V_1 + W_1^* \Delta V_2\, S_2^{-1}(W_2^* \Delta V_1) + O(\|\Delta\|^3), \qquad (11)$
$V_1^\Delta = V_1 + V_2\, S_2^{-1}(W_2^* \Delta V_1) + O(\|\Delta\|^2). \qquad (12)$

In the special case that $L_0 = \lambda I$ we have

$L^\Delta = \lambda I + W_1^* \Delta V_1 + W_1^* \Delta (\lambda I - A)^D \Delta V_1 + O(\|\Delta\|^3), \qquad (13)$
$V_1^\Delta = V_1 + (\lambda I - A)^D \Delta V_1 + O(\|\Delta\|^2), \qquad (14)$

where $(\lambda I - A)^D$ denotes the Drazin inverse.

Proof: Theorem 3.1 implies

$L^\Delta = L_0 + L_1^\Delta + L_2^\Delta + O(\|\Delta\|^3), \qquad V_1^\Delta = V_1 + V_2 Y_1^\Delta + O(\|\Delta\|^2),$

where $L_1^\Delta = W_1^* \Delta V_1$, $Y_1^\Delta = S_2^{-1}(W_2^* \Delta V_1)$, $L_2^\Delta = W_1^* \Delta V_2\, Y_1^\Delta$. Hence (11) and (12) hold. If $L_0 = \lambda I$ then $S_2(Y) = (\lambda I - M_0) Y$. Thus, $S_2^{-1}(Y) = (\lambda I - M_0)^{-1} Y$. As is easily verified we have $(\lambda I - A)^D = V_2 (\lambda I - M_0)^{-1} W_2^*$. This yields (13) and (14). $\square$

Remark: For the case that $L_0$ is a $1\times 1$ matrix, $L_0 = [\lambda]$, formula (13) gives the Taylor expansion of a simple eigenvalue up to second order. In this case $V_1$ and $W_1$ are right and left eigenvectors to the eigenvalue $\lambda$ with the property that $W_1^* V_1 = 1$. Formula (14) gives the Taylor expansion for an associated right eigenvector $V_1^\Delta$ of $A + \Delta$. The eigenvector is not unique. One gets another Taylor expansion if one multiplies the eigenvector $V_1^\Delta$ by a scalar factor which depends smoothly on the perturbation.
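Not part of the original notes: a small NumPy check of the first-order part of formula (13) in the simple-eigenvalue case of the Remark, $\lambda^\Delta \approx \lambda + w^* \Delta v$ with $w^* v = 1$. The random matrices, the perturbation size and the tolerances are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))

# right eigenvectors: columns of Vr; left eigenvectors: columns of (Vr^{-1})^*
lam, Vr = np.linalg.eig(A)
Wl = np.linalg.inv(Vr).conj().T        # w_j^* A = lam_j w_j^*  and  w_j^* v_j = 1

j = 0                                   # pick one (simple) eigenvalue
v, w = Vr[:, j], Wl[:, j]

Delta = 1e-5 * rng.standard_normal((n, n))
lam_pert = np.linalg.eigvals(A + Delta)
lam_exact = lam_pert[np.argmin(np.abs(lam_pert - lam[j]))]    # eigenvalue closest to lam_j

lam_first_order = lam[j] + w.conj() @ Delta @ v               # first-order term of (13)
print(abs(lam_exact - lam[j]))            # O(||Delta||)
print(abs(lam_exact - lam_first_order))   # O(||Delta||^2), much smaller
```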

Exercise 3.3 (5 points) Let $W_1^\Delta \in \mathbb{F}^{n\times p}$, $W_2^\Delta \in \mathbb{F}^{n\times(n-p)}$ be such that

$(V^\Delta)^{-1} = \begin{bmatrix} (W_1^\Delta)^* \\ (W_2^\Delta)^* \end{bmatrix}.$

Then $P^\Delta := V_1^\Delta (W_1^\Delta)^*$ is the projector onto the $(A+\Delta)$-invariant subspace $R(V_1^\Delta)$ along the $(A+\Delta)$-invariant subspace $R(V_2^\Delta)$. We set $P := P^0 = V_1 W_1^*$. Show the following.

(a) $(W_1^\Delta)^* = W_1^* - S_1^{-1}(W_1^* \Delta V_2)\, W_2^* + O(\|\Delta\|^2)$. Hint: You might use the fact that $(I + Z)^{-1} = I - Z + O(\|Z\|^2)$.

(b) If $L_0 = \lambda I$ then

$(W_1^\Delta)^* = W_1^* + W_1^* \Delta (\lambda I - A)^D + O(\|\Delta\|^2),$
$P^\Delta = P + (\lambda I - A)^D \Delta P + P \Delta (\lambda I - A)^D + O(\|\Delta\|^2).$

Exercise 3.4 (4 points) Compute the Taylor expansion up to second order for the eigenvalue $2$ of the matrix $A + \Delta$, where $A = \begin{bmatrix} 2 & 5 \\ 0 & 5 \end{bmatrix}$.

Proof of Theorem 3.1: We consider the analytic function $f : \mathbb{F}^{n\times n} \times \mathbb{F}^{n\times n} \to \mathbb{F}^{n\times n}$ defined by

$f\!\left( \underbrace{\begin{bmatrix} H_1 & H_2 \\ H_3 & H_4 \end{bmatrix}}_{=:H}, \begin{bmatrix} \delta L & X \\ Y & \delta M \end{bmatrix} \right) = \begin{bmatrix} I & X \\ Y & I \end{bmatrix} \begin{bmatrix} L_0 + \delta L & 0 \\ 0 & M_0 + \delta M \end{bmatrix} - \begin{bmatrix} L_0 + H_1 & H_2 \\ H_3 & M_0 + H_4 \end{bmatrix} \begin{bmatrix} I & X \\ Y & I \end{bmatrix}$

$= \begin{bmatrix} \delta L - H_1 - H_2 Y & S_1(X) + X\,\delta M - H_2 - H_1 X \\ S_2(Y) + Y\,\delta L - H_3 - H_4 Y & \delta M - H_4 - H_3 X \end{bmatrix},$

where $S_1(X) = X M_0 - L_0 X$ and $S_2(Y) = Y L_0 - M_0 Y$. We have

$f\!\left( 0, \begin{bmatrix} \delta L & X \\ Y & \delta M \end{bmatrix} \right) = \begin{bmatrix} \delta L & S_1(X) \\ S_2(Y) & \delta M \end{bmatrix} + O\!\left( \left\| \begin{bmatrix} \delta L & X \\ Y & \delta M \end{bmatrix} \right\|^2 \right).$

Hence the derivative of $f$ at $(0, 0)$ with respect to the second variable is the linear map

$\begin{bmatrix} \delta L & X \\ Y & \delta M \end{bmatrix} \mapsto \begin{bmatrix} \delta L & S_1(X) \\ S_2(Y) & \delta M \end{bmatrix}.$

This map is bijective since the Sylvester operators $S_1$, $S_2$ are invertible. Hence, by the implicit function theorem there exists an analytic function

$H \mapsto \begin{bmatrix} \delta L(H) & X(H) \\ Y(H) & \delta M(H) \end{bmatrix} \qquad (15)$

defined in a neighborhood of $0 \in \mathbb{F}^{n\times n}$ such that

$\begin{bmatrix} \delta L(0) & X(0) \\ Y(0) & \delta M(0) \end{bmatrix} = 0 \qquad (16)$

and

$f\!\left( H, \begin{bmatrix} \delta L(H) & X(H) \\ Y(H) & \delta M(H) \end{bmatrix} \right) = 0. \qquad (17)$

The latter identity is equivalent to

$\begin{bmatrix} L_0 + H_1 & H_2 \\ H_3 & M_0 + H_4 \end{bmatrix} = \begin{bmatrix} I & X(H) \\ Y(H) & I \end{bmatrix} \begin{bmatrix} L_0 + \delta L(H) & 0 \\ 0 & M_0 + \delta M(H) \end{bmatrix} \begin{bmatrix} I & X(H) \\ Y(H) & I \end{bmatrix}^{-1}$

if $H$ is small enough (otherwise the inverse might not exist). For $\Delta$ in a small neighborhood of $0 \in \mathbb{C}^{n\times n}$ let

$H^\Delta := V_0^{-1} \Delta V_0, \qquad X^\Delta := X(H^\Delta), \quad Y^\Delta := Y(H^\Delta), \qquad L^\Delta := L_0 + \delta L(H^\Delta), \quad M^\Delta := M_0 + \delta M(H^\Delta).$

Then (8) holds. Observe that with the partition (9),

$H^\Delta = \begin{bmatrix} H_1^\Delta & H_2^\Delta \\ H_3^\Delta & H_4^\Delta \end{bmatrix} = \begin{bmatrix} W_1^* \Delta V_1 & W_1^* \Delta V_2 \\ W_2^* \Delta V_1 & W_2^* \Delta V_2 \end{bmatrix}. \qquad (18)$

Next we verify the recursion formulas (10). First note that (17) is equivalent to the matrix equations

$0 = \delta L(H) - H_1 - H_2\, Y(H),$
$0 = S_2(Y(H)) + Y(H)\,\delta L(H) - H_3 - H_4\, Y(H),$
$0 = \delta M(H) - H_4 - H_3\, X(H),$
$0 = S_1(X(H)) + X(H)\,\delta M(H) - H_2 - H_1\, X(H). \qquad (19)$

Let $\mathcal{P}_k$ denote the set of matrix functions of $H$ whose entries are homogeneous polynomials of degree $k$ in the entries of $H$. Since the function (15) is analytic and satisfies (16) we can write

$\begin{bmatrix} \delta L(H) & X(H) \\ Y(H) & \delta M(H) \end{bmatrix} = \sum_{k=1}^{\infty} \underbrace{\begin{bmatrix} L_k(H) & X_k(H) \\ Y_k(H) & M_k(H) \end{bmatrix}}_{\in\,\mathcal{P}_k}.$

The first equation of (19) then gives

$0 = \delta L(H) - H_1 - H_2\, Y(H) = \sum_{k\ge 1} L_k(H) - H_1 - H_2 \sum_{k\ge 1} Y_k(H) = \underbrace{\big(L_1(H) - H_1\big)}_{\in\,\mathcal{P}_1} + \sum_{k\ge 2} \underbrace{\big(L_k(H) - H_2\, Y_{k-1}(H)\big)}_{\in\,\mathcal{P}_k}.$

This implies $L_1(H) = H_1$ and $L_k(H) = H_2\, Y_{k-1}(H)$ for $k \ge 2$. From the second equation of (19) we obtain (omitting the argument $H$)

$0 = S_2(Y) + Y\,\delta L - H_3 - H_4 Y = S_2\Big(\sum_{k\ge 1} Y_k\Big) + \Big(\sum_{k\ge 1} Y_k\Big)\Big(\sum_{k\ge 1} L_k\Big) - H_3 - H_4 \sum_{k\ge 1} Y_k$
$\qquad = \underbrace{\big(S_2(Y_1) - H_3\big)}_{\in\,\mathcal{P}_1} + \sum_{k\ge 2} \underbrace{\Big( S_2(Y_k) + \sum_{j=1}^{k-1} Y_j L_{k-j} - H_4\, Y_{k-1} \Big)}_{\in\,\mathcal{P}_k}.$

Thus,

$Y_1(H) = S_2^{-1}(H_3) \quad\text{and}\quad Y_k(H) = S_2^{-1}\Big( H_4\, Y_{k-1}(H) - \sum_{j=1}^{k-1} Y_j(H)\, L_{k-j}(H) \Big) \quad\text{for } k \ge 2.$

Analogously the third and the fourth equation of (19) yield

$M_1(H) = H_4 \quad\text{and}\quad M_k(H) = H_3\, X_{k-1}(H) \quad\text{for } k \ge 2,$
$X_1(H) = S_1^{-1}(H_2) \quad\text{and}\quad X_k(H) = S_1^{-1}\Big( H_1\, X_{k-1}(H) - \sum_{j=1}^{k-1} X_j(H)\, M_{k-j}(H) \Big) \quad\text{for } k \ge 2.$

In order to obtain the recursion formulas (10) from this, replace $H$ by $H^\Delta$ and $H_k$ by $H_k^\Delta$. $\square$

4 The eigenvalue inclusion theorems of Gershgorin and Brauer

In the following $R_j(A)$ denotes the sum of the absolute values of the off-diagonal elements in the $j$th row of $A = [a_{jk}] \in \mathbb{C}^{n\times n}$,

$R_j(A) := \sum_{\substack{k=1,\dots,n \\ k \ne j}} |a_{jk}|, \qquad j = 1,\dots,n.$

By $D(a, r)$ we denote the closed disk of radius $r$ about $a \in \mathbb{C}$,

$D(a, r) = \{ z \in \mathbb{C} : |z - a| \le r \} \subset \mathbb{C}.$
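Not part of the original notes: a minimal NumPy sketch that forms the row sums $R_j(A)$ and the disks $D(a_{jj}, R_j(A))$ just defined, and checks numerically that every eigenvalue lies in at least one disk, as Theorem 4.1 below asserts. The example matrix and the tolerance are arbitrary illustrative choices.

```python
import numpy as np

def gershgorin_disks(A):
    """Return (centers, radii): centers a_jj and radii R_j(A) of the Gershgorin disks."""
    A = np.asarray(A)
    centers = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centers)   # off-diagonal absolute row sums
    return centers, radii

A = np.array([[ 4.0,  1.0, 0.2],
              [ 0.5, -3.0, 0.1],
              [ 0.3,  0.4, 1.0]])
centers, radii = gershgorin_disks(A)
for lam in np.linalg.eigvals(A):
    in_some_disk = np.any(np.abs(lam - centers) <= radii + 1e-12)
    print(lam, in_some_disk)   # every eigenvalue lies in at least one disk
```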

Theorem 4.1 (Gershgorin) Each eigenvalue of $A \in \mathbb{C}^{n\times n}$ is contained in one of the disks $D(a_{jj}, R_j(A))$, $j = 1,\dots,n$. Alternative formulation:

$\sigma(A) \subseteq \bigcup_{j=1}^{n} D(a_{jj}, R_j(A)).$

Proof: Let $Ax = \lambda x$, $x = [x_1, x_2,\dots,x_n]^T \ne 0$, $\lambda, x_j \in \mathbb{C}$. We have $(A - \lambda I)x = 0$. The latter equation can be written more explicitly as

$(a_{jj} - \lambda) x_j + \sum_{k \ne j} a_{jk} x_k = 0, \qquad j = 1,\dots,n.$

This implies

$|(a_{jj} - \lambda) x_j| = \Big| \sum_{k \ne j} a_{jk} x_k \Big| \le \sum_{k \ne j} |a_{jk}|\, |x_k|, \qquad j = 1,\dots,n. \qquad (20)$

Now choose $j$ such that $|x_j| \ge |x_k|$ for all $k$. Then (20) implies

$|a_{jj} - \lambda|\, |x_j| \le \sum_{k \ne j} |a_{jk}|\, |x_j|.$

On dividing this inequality by $|x_j| \ne 0$ we obtain $|a_{jj} - \lambda| \le R_j(A)$. $\square$

Exercise 4.2 (2 points) A matrix $A \in \mathbb{C}^{n\times n}$ is said to be strictly diagonally dominant if $|a_{jj}| > R_j(A)$ for all $j = 1,\dots,n$. Use the Gershgorin theorem to show that strictly diagonally dominant matrices are invertible. Hint: We have $\det(A) = 0$ if and only if $0$ is an eigenvalue of $A$.

In order to state the inclusion theorem of Brauer we introduce the sets

$C(a_1, a_2, r) := \{ z \in \mathbb{C} : |z - a_1|\,|z - a_2| \le r \}, \qquad a_1, a_2 \in \mathbb{C},\ r \ge 0.$

Though these sets are not necessarily oval they are called the ovals of Cassini.

Theorem 4.3 (Brauer) Each eigenvalue of $A \in \mathbb{C}^{n\times n}$ is contained in one of the sets $C(a_{jj}, a_{ll}, R_j(A) R_l(A))$, $j, l = 1,\dots,n$, $j \ne l$.

Proof: We use the notation of the proof of Gershgorin's theorem. We choose indices $j \ne l$ such that $|x_j| \ge |x_l| \ge |x_k|$ for all $k \notin \{j, l\}$. By multiplying the associated inequalities (20):

$|(a_{jj} - \lambda) x_j| \le \sum_{k \ne j} |a_{jk}|\,|x_k|, \qquad |(a_{ll} - \lambda) x_l| \le \sum_{k \ne l} |a_{lk}|\,|x_k|$

we obtain

$|a_{jj} - \lambda|\,|a_{ll} - \lambda|\,|x_j|\,|x_l| \le \Big( \sum_{k \ne j} |a_{jk}|\,|x_k| \Big)\Big( \sum_{k \ne l} |a_{lk}|\,|x_k| \Big) = \sum_{k_1 \ne j} \sum_{k_2 \ne l} |a_{jk_1}|\,|a_{lk_2}|\,|x_{k_1}|\,|x_{k_2}|$
$\qquad \le \sum_{k_1 \ne j} \sum_{k_2 \ne l} |a_{jk_1}|\,|a_{lk_2}|\,|x_j|\,|x_l| = \Big( \sum_{k_1 \ne j} |a_{jk_1}| \Big)\Big( \sum_{k_2 \ne l} |a_{lk_2}| \Big) |x_j|\,|x_l| = R_j(A) R_l(A)\,|x_j|\,|x_l|.$

If $x_l \ne 0$ then we can divide the inequality by $|x_j|\,|x_l|$ and obtain

$|a_{jj} - \lambda|\,|a_{ll} - \lambda| \le R_j(A) R_l(A).$

Hence, $\lambda \in C(a_{jj}, a_{ll}, R_j(A) R_l(A))$. The case $x_l = 0$ is left as an exercise. $\square$

Exercise 4.4 (2 points) Complete the proof for the case $x_l = 0$.

[Figure 1: The Cassini sets $C(-1, 1, r)$ for several values of $r$.]

5 Pseudospectra

A pseudospectrum is the set of eigenvalues of all matrices which are obtained from a nominal matrix $A$ by adding a perturbation of bounded size. Precisely, we define the

$\mathbb{F}$-pseudospectrum of $A \in \mathbb{F}^{n\times n}$ at perturbation level $\rho > 0$ as

$\sigma_{\mathbb{F}}(A, \rho) := \{ z \in \mathbb{C};\ z \in \sigma(A + \Delta) \text{ for some } \Delta \in \mathbb{F}^{n\times n} \text{ with } \|\Delta\| \le \rho \}.$

Let $d_{\mathbb{F}}(A)$ denote the distance of $A$ to the set of singular matrices:

$d_{\mathbb{F}}(A) := \min\{ \|\Delta\|;\ \Delta \in \mathbb{F}^{n\times n},\ \det(A + \Delta) = 0 \}.$

Then

$\sigma_{\mathbb{F}}(A, \rho) = \{ z \in \mathbb{C};\ z \in \sigma(A + \Delta) \text{ for some } \Delta \in \mathbb{F}^{n\times n} \text{ with } \|\Delta\| \le \rho \}$
$\qquad = \{ z \in \mathbb{C};\ A + \Delta - zI \text{ is singular for some } \Delta \in \mathbb{F}^{n\times n} \text{ with } \|\Delta\| \le \rho \}$
$\qquad = \{ z \in \mathbb{C};\ d_{\mathbb{F}}(A - zI) \le \rho \}.$

Hence pseudospectra are the sublevel sets of the function $z \mapsto d_{\mathbb{F}}(A - zI)$. The next proposition gives $d_{\mathbb{F}}$ for the case $\mathbb{F} = \mathbb{C}$.

Proposition 5.1 For any $A \in \mathbb{C}^{n\times n}$, $d_{\mathbb{C}}(A) = \sigma_n$, the smallest singular value of $A$.

Proof: If $A + \Delta$ is singular, then $(A + \Delta)v = 0$ for some $v \in \mathbb{F}^n$ with $\|v\| = 1$. Hence

$\|\Delta\| \ge \|\Delta v\| = \|Av\| \ge \min_{\|x\|=1} \|Ax\| = \sigma_n.$

Let $A = \sum_{k=1}^{n} \sigma_k u_k v_k^*$ be a singular value decomposition with pairwise orthogonal unit vectors $v_k$ and pairwise orthogonal unit vectors $u_k$. Let $\Delta = -\sigma_n u_n v_n^*$. Then $\|\Delta\| = \sigma_n$ and $(A + \Delta)v_n = 0$. $\square$

It follows that

$\sigma_{\mathbb{C}}(A, \rho) = \{ z \in \mathbb{C};\ \sigma_n(A - zI) \le \rho \}.$

In the case $\mathbb{F} = \mathbb{R}$ we have

$d_{\mathbb{R}}(A) = \inf_{\gamma \in (0,1]} \sigma_{2n-1}\!\left( \begin{bmatrix} \Re A & -\gamma\, \Im A \\ \gamma^{-1}\, \Im A & \Re A \end{bmatrix} \right)$

for all $A \in \mathbb{C}^{n\times n}$, where $\Re A$ and $\Im A$ denote the real and the imaginary part of $A$ and $\sigma_{2n-1}$ is the second smallest singular value. The proof is too complicated to give here.

Exercise 5.2 (4 points) Let $D(\lambda, \rho) \subset \mathbb{C}$ denote the closed disk with center $\lambda \in \mathbb{C}$ and radius $\rho$. Show that for any $A \in \mathbb{C}^{n\times n}$,

$\bigcup_{\lambda \in \sigma(A)} D(\lambda, \rho) \subseteq \sigma_{\mathbb{C}}(A, \rho).$

Show that equality holds if $A$ is normal. Hence, the normal matrices have the smallest possible complex pseudospectra. Remark: the real pseudospectra of real normal matrices are generically not unions of disks. It is still an open question whether the real normal matrices have the smallest possible real pseudospectra in their similarity class. If you solve this problem you get 100 points.
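Not part of the original notes: a minimal NumPy sketch that evaluates $z \mapsto \sigma_n(A - zI)$ on a grid; by the identity above, the complex pseudospectrum $\sigma_{\mathbb{C}}(A, \rho)$ is the corresponding sublevel set. The grid ranges, the resolution and the example matrix are arbitrary illustrative choices.

```python
import numpy as np

def sigma_min_grid(A, re, im):
    """Evaluate sigma_n(A - z I) on the grid z = re[j] + 1j*im[k]."""
    n = A.shape[0]
    vals = np.empty((len(im), len(re)))
    for k, y in enumerate(im):
        for j, x in enumerate(re):
            z = x + 1j * y
            vals[k, j] = np.linalg.svd(A - z * np.eye(n), compute_uv=False)[-1]
    return vals

# example: a Jordan-type (non-normal) matrix has large pseudospectra
A = np.diag([1.0, 1.0, 1.0]) + np.diag([5.0, 5.0], k=1)
re = np.linspace(-2, 4, 121)
im = np.linspace(-3, 3, 121)
smin = sigma_min_grid(A, re, im)

rho = 0.1
inside = smin <= rho    # grid points belonging to sigma_C(A, rho)
print(inside.sum(), "of", inside.size, "grid points lie in the rho =", rho, "pseudospectrum")
```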

From the statements of the exercise we can conclude the following result of Bauer and Fike.

Proposition 5.3 Suppose $A$ has a basis $V = [v_1,\dots,v_n]$ of eigenvectors. Then

$\sigma(A + \Delta) \subseteq \bigcup_{\lambda \in \sigma(A)} D(\lambda, \rho), \qquad \text{where } \rho = \underbrace{\|V\|\,\|V^{-1}\|}_{\text{cond. number}}\, \|\Delta\|.$

Proof: We have $A = V \Lambda V^{-1}$, where $\Lambda$ is a diagonal matrix of eigenvalues. It follows that

$\sigma(A + \Delta) = \sigma\big(V^{-1}(A + \Delta)V\big) = \sigma\big(\Lambda + V^{-1}\Delta V\big) \subseteq \sigma_{\mathbb{C}}\big(\Lambda, \|V^{-1}\Delta V\|\big) \subseteq \sigma_{\mathbb{C}}\big(\Lambda, \|V^{-1}\|\,\|\Delta\|\,\|V\|\big) = \bigcup_{\lambda \in \sigma(\Lambda) = \sigma(A)} D\big(\lambda, \|V^{-1}\|\,\|\Delta\|\,\|V\|\big).$

The latter equality holds since $\Lambda$ is normal. $\square$

6 References

The following books helped me to prepare these notes.

G.W. Stewart, J. Sun. Matrix Perturbation Theory. Academic Press, Inc., 1990.
G. Golub, C. Van Loan. Matrix Computations. Johns Hopkins University Press, 1983.
P. Lancaster, M. Tismenetsky. The Theory of Matrices. Academic Press, Inc., 1985.