MATH 5640: Functions of Diagonalizable Matrices
Hung Phan, UMass Lowell
November 27, 208

1 Spectral theorem for diagonalizable matrices

Definition 1. Let $V = X \oplus Y$. Every $v \in V$ is uniquely decomposed as $v = x + y$ for $x \in X$ and $y \in Y$. The mapping $E : v \mapsto x$ is called the projector onto $X$ along $Y$. In this case, we also have $R(E) = X$ and $N(E) = Y$.

Lemma 2. A linear operator $E$ is a projector if and only if $E^2 = E$.

Lemma 3. Let $E$ be the projector onto $X_1$ along $Y_1$ and $F$ the projector onto $X_2$ along $Y_2$. Then $E + F$ is a projector if and only if $EF = FE = 0$. In this case, $R(E + F) = X_1 \oplus X_2$ and $N(E + F) = Y_1 \cap Y_2$.

Proof. We have $E^2 = E$ and $F^2 = F$.

($\Rightarrow$): Suppose $E + F$ is a projector; then $(E + F)^2 = E + F$, which implies $EF + FE = 0$. We have $0 = E(EF + FE) = EF + EFE$ and $0 = (EF + FE)E = EFE + FE$. Subtracting gives $EF = FE$, and combining with $EF + FE = 0$ yields $EF = FE = 0$.

($\Leftarrow$): Suppose $EF = FE = 0$; then clearly $(E + F)^2 = E + F$, i.e., $E + F$ is a projector. In this case, one can check that $R(E + F) = X_1 + X_2$ and that $X_1 \cap X_2 = \{0\}$, so $R(E + F) = X_1 \oplus X_2$. Finally, take $x \in N(E + F)$; then $Ex + Fx = 0$. So $0 = E(Ex + Fx) = E^2x + EFx = Ex$, hence $x \in N(E) = Y_1$. Similarly, $x \in Y_2$. Thus $x \in Y_1 \cap Y_2$. On the other hand, obviously $Y_1 \cap Y_2 \subseteq N(E + F)$. $\square$

Theorem 4 (Meyer, p. 517). A matrix $A_{n \times n}$ with spectrum $\sigma(A) = \{\lambda_1, \dots, \lambda_k\}$ is diagonalizable if and only if there exist matrices $G_1, \dots, G_k$ such that
$$A = \lambda_1 G_1 + \lambda_2 G_2 + \cdots + \lambda_k G_k,$$
where the $G_i$'s satisfy

(i) $G_i$ is the projector onto $N(A - \lambda_i I)$ along $R(A - \lambda_i I)$;

(ii) $G_i G_j = 0$ whenever $i \neq j$;

(iii) $G_1 + G_2 + \cdots + G_k = I$.

This expansion is known as the spectral decomposition of $A$, and the $G_i$'s are called the spectral projectors associated with $A$.

Proof. ($\Rightarrow$): Since $A$ is diagonalizable, all eigenvalues of $A$ are semisimple, i.e., $\operatorname{geo\,mult}_A(\lambda) = \operatorname{alg\,mult}_A(\lambda)$ for all $\lambda \in \sigma(A)$. In other words, if $X_i$ is a matrix whose columns form a basis for $N(A - \lambda_i I)$, then the number of columns of $X_i$ equals the multiplicity of $\lambda_i$ in the characteristic polynomial $\det(A - \lambda I)$. Thus, $P = \begin{pmatrix} X_1 & X_2 & \cdots & X_k \end{pmatrix}$
is nonsingular. If $P^{-1}$ is partitioned in a conformable manner as $P^{-1} = \begin{pmatrix} Y_1^T \\ \vdots \\ Y_k^T \end{pmatrix}$, then we have
$$A = PDP^{-1} = \begin{pmatrix} X_1 & X_2 & \cdots & X_k \end{pmatrix} \begin{pmatrix} \lambda_1 I & 0 & \cdots & 0 \\ 0 & \lambda_2 I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_k I \end{pmatrix} \begin{pmatrix} Y_1^T \\ Y_2^T \\ \vdots \\ Y_k^T \end{pmatrix} = \lambda_1 X_1 Y_1^T + \lambda_2 X_2 Y_2^T + \cdots + \lambda_k X_k Y_k^T =: \lambda_1 G_1 + \lambda_2 G_2 + \cdots + \lambda_k G_k.$$
Now, $PP^{-1} = I$ is precisely $G_1 + \cdots + G_k = I$. Next, $P^{-1}P = I$ implies
$$Y_i^T X_j = \begin{cases} I, & i = j, \\ 0, & i \neq j, \end{cases} \qquad\text{so}\qquad G_i^2 = G_i, \qquad G_i G_j = 0 \text{ for } i \neq j.$$
To establish $R(G_i) = N(A - \lambda_i I) = R(X_i)$, we note that
$$R(G_i) = R(X_i Y_i^T) \subseteq R(X_i) = R(X_i Y_i^T X_i) = R(G_i X_i) \subseteq R(G_i).$$
To show $N(G_i) = R(A - \lambda_i I)$, we note that on the one hand,
$$G_i(A - \lambda_i I) = G_i\Big(\sum_{j=1}^{k} \lambda_j G_j\Big) - \lambda_i G_i = \lambda_i G_i - \lambda_i G_i = 0 \implies R(A - \lambda_i I) \subseteq N(G_i).$$
On the other hand,
$$\dim R(A - \lambda_i I) = n - \dim N(A - \lambda_i I) = n - \dim R(G_i) = \dim N(G_i).$$
Thus, $R(A - \lambda_i I) = N(G_i)$.

($\Leftarrow$): First, since $G_i$ is the projector onto $N(A - \lambda_i I)$ along $R(A - \lambda_i I)$, we have $\dim R(G_i) = \dim N(A - \lambda_i I) = \operatorname{geo\,mult}_A(\lambda_i)$. Next, since $G_1 G_2 = G_2 G_1 = 0$, Lemma 3 implies that $G_1 + G_2$ is a projector with $R(G_1 + G_2) = R(G_1) \oplus R(G_2)$. Inductively, we conclude that $R(G_1 + \cdots + G_k) = R(G_1) \oplus \cdots \oplus R(G_k)$. It also follows that
$$n = \dim R(G_1 + G_2 + \cdots + G_k) = \dim R(G_1) + \cdots + \dim R(G_k) = \operatorname{geo\,mult}_A(\lambda_1) + \cdots + \operatorname{geo\,mult}_A(\lambda_k) \le \operatorname{alg\,mult}_A(\lambda_1) + \cdots + \operatorname{alg\,mult}_A(\lambda_k) = n.$$
So $\operatorname{geo\,mult}_A(\lambda_i) = \operatorname{alg\,mult}_A(\lambda_i)$ for all $i$, i.e., all eigenvalues of $A$ are semisimple. Thus, $A$ is diagonalizable. $\square$

2 Functions of diagonalizable matrices

Consider the exponential function
$$e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{z^k}{k!}.$$
Replacing $z$ by a square matrix $A$ (with $A^0 = I$) results in the infinite series
$$e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots,$$
called the matrix exponential. To make this formula mathematically correct, one must ensure convergence. Suppose now that $A$ is diagonalizable; then $A = PDP^{-1} = P\operatorname{diag}(\lambda_1, \dots, \lambda_n)P^{-1}$ and $A^k = PD^kP^{-1}$. So
$$e^A = \sum_{k=0}^{\infty} \frac{PD^kP^{-1}}{k!} = P\Big(\sum_{k=0}^{\infty} \frac{D^k}{k!}\Big)P^{-1} = P\operatorname{diag}(e^{\lambda_1}, \dots, e^{\lambda_n})P^{-1}.$$
In other words, we do not need to define $e^A$ by the infinite series; instead, we define $e^D = \operatorname{diag}(e^{\lambda_1}, \dots, e^{\lambda_n})$ and set $e^A = Pe^DP^{-1}$. We then apply this generalization to any function $f(z)$ on a diagonalizable matrix $A = PDP^{-1} = P\operatorname{diag}(\lambda_1, \dots, \lambda_n)P^{-1}$ by defining $f(D) = \operatorname{diag}(f(\lambda_1), \dots, f(\lambda_n))$ and setting
$$f(A) = Pf(D)P^{-1} = P\operatorname{diag}(f(\lambda_1), \dots, f(\lambda_n))P^{-1}.$$
So this formula eliminates the problem of convergence. However, we need to ensure uniqueness, as the matrix $P$ might not be unique. To see this, we use the spectral decomposition theorem: supposing there are $k$ distinct eigenvalues that are grouped by repetition, we can write
$$f(A) = Pf(D)P^{-1} = \begin{pmatrix} X_1 & \cdots & X_k \end{pmatrix} \begin{pmatrix} f(\lambda_1)I & & \\ & \ddots & \\ & & f(\lambda_k)I \end{pmatrix} \begin{pmatrix} Y_1^T \\ \vdots \\ Y_k^T \end{pmatrix} = \sum_{i=1}^{k} f(\lambda_i) X_i Y_i^T = \sum_{i=1}^{k} f(\lambda_i) G_i,$$
where $G_i$ is the projector onto $N(A - \lambda_i I)$ along $R(A - \lambda_i I)$, which is uniquely determined by $A$. Therefore, $f(A)$ is uniquely defined regardless of the choice of $P$.

Definition 2.1 (function of a diagonalizable matrix). Let $A = PDP^{-1} = P\operatorname{diag}(\lambda_1 I, \dots, \lambda_k I)P^{-1}$ be a diagonalizable matrix. For a function $f(z)$ that is defined at each $\lambda_i \in \sigma(A)$, define
$$f(A) = Pf(D)P^{-1} = P\operatorname{diag}(f(\lambda_1)I, \dots, f(\lambda_k)I)P^{-1} = f(\lambda_1)G_1 + \cdots + f(\lambda_k)G_k,$$
where $G_i$ is the $i$-th spectral projector defined in Theorem 4.

Definition 2.2 (infinite series). If $f(z) = \sum_{n=0}^{\infty} c_n(z - z_0)^n$ converges when $|z - z_0| < r$ and if $|\lambda_i - z_0| < r$ for all eigenvalues $\lambda_i$ of a diagonalizable matrix $A$, then
$$f(A) = \sum_{n=0}^{\infty} c_n(A - z_0 I)^n.$$
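Definition 2.1 can be checked numerically. Below is a minimal NumPy sketch (the helper name `matrix_function` and the $2\times 2$ test matrix are my own illustrative choices, not from the notes) that evaluates $f(A) = Pf(D)P^{-1}$ through an eigendecomposition and compares $e^A$ against a truncated Taylor series:

```python
import numpy as np

def matrix_function(A, f):
    # f(A) = P diag(f(lambda_1), ..., f(lambda_n)) P^{-1} for diagonalizable A
    lam, P = np.linalg.eig(A)          # columns of P are eigenvectors
    return P @ np.diag(f(lam)) @ np.linalg.inv(P)

# A diagonalizable test matrix with eigenvalues 5 and 2
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

expA = matrix_function(A, np.exp)      # e^A via the eigendecomposition

# Compare with the truncated series I + A + A^2/2! + ... + A^30/30!
S, term = np.eye(2), np.eye(2)
for k in range(1, 31):
    term = term @ A / k
    S = S + term

print(np.allclose(expA, S))            # True
```

Since $f(A)$ is independent of the choice of $P$, any eigendecomposition returned by `np.linalg.eig` gives the same result.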
Example 2.3 (Neumann series). The function $f(z) = (1 - z)^{-1}$ has the geometric series expansion $(1 - z)^{-1} = \sum_{k=0}^{\infty} z^k$, which converges if and only if $|z| < 1$. This means the associated matrix function $f(A) = (I - A)^{-1} = \sum_{k=0}^{\infty} A^k$ is defined if and only if $|\lambda| < 1$ for all $\lambda \in \sigma(A)$. In fact, $\lim_{n\to\infty} A^n = 0$ is sufficient for the convergence of $\sum_{k=0}^{\infty} A^k$. The two conditions turn out to be equivalent, i.e.,
$$\lim_{n\to\infty} A^n = 0 \iff |\lambda| < 1 \text{ for all } \lambda \in \sigma(A).$$
We also have $\max_i |\lambda_i| \le \|A\|$ for all matrix norms, so $(I - A)^{-1} = \sum_{k=0}^{\infty} A^k$ exists when $\|A\| < 1$ for any matrix norm.

3 Spectral projectors via Lagrange interpolation

Suppose $A$ is diagonalizable. Then $f(A)$ exists if and only if $f(\lambda_i)$ exists for all $\lambda_i \in \sigma(A) = \{\lambda_1, \dots, \lambda_k\}$, and
$$f(A) = f(\lambda_1)G_1 + \cdots + f(\lambda_k)G_k,$$
where $G_i$ is the $i$-th spectral projector. Suppose $p(z)$ is a polynomial that agrees with $f(z)$ on $\sigma(A)$; then
$$p(A) = \sum_{i=1}^{k} p(\lambda_i)G_i = \sum_{i=1}^{k} f(\lambda_i)G_i = f(A).$$
That means every function of $A$ can be expressed as a polynomial of $A$. In particular, we can choose $p(z)$ to be the Lagrange interpolation polynomial
$$p(z) = \sum_{i=1}^{k} f(\lambda_i) \frac{\prod_{j\neq i}(z - \lambda_j)}{\prod_{j\neq i}(\lambda_i - \lambda_j)}.$$
So
$$f(A) = \sum_{i=1}^{k} f(\lambda_i) \frac{\prod_{j\neq i}(A - \lambda_j I)}{\prod_{j\neq i}(\lambda_i - \lambda_j)}.$$
Now set, for example,
$$f(z) = \begin{cases} 1, & z = \lambda_1, \\ 0, & z \neq \lambda_1. \end{cases}$$
Then
$$f(A) = G_1 = \frac{\prod_{j\neq 1}(A - \lambda_j I)}{\prod_{j\neq 1}(\lambda_1 - \lambda_j)},$$
which is an explicit formula for the first spectral projector. Similarly, we obtain all spectral projectors of $A$.
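The product formula above is easy to implement. A minimal NumPy sketch (the function name `spectral_projectors` and the $2\times 2$ test matrix are my own choices for illustration) computes each $G_i = \prod_{j\neq i}(A - \lambda_j I) / \prod_{j\neq i}(\lambda_i - \lambda_j)$ and verifies properties (ii), (iii), and the spectral decomposition from Theorem 4:

```python
import numpy as np

def spectral_projectors(A, eigvals):
    # G_i = prod_{j != i} (A - lambda_j I) / prod_{j != i} (lambda_i - lambda_j)
    n = A.shape[0]
    Gs = []
    for i, li in enumerate(eigvals):
        G = np.eye(n)
        for j, lj in enumerate(eigvals):
            if j != i:
                G = G @ (A - lj * np.eye(n)) / (li - lj)
        Gs.append(G)
    return Gs

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                     # distinct eigenvalues 1 and 3
G1, G2 = spectral_projectors(A, [1.0, 3.0])

print(np.allclose(G1 + G2, np.eye(2)))         # (iii) G1 + G2 = I      -> True
print(np.allclose(G1 @ G2, np.zeros((2, 2))))  # (ii)  G1 G2 = 0        -> True
print(np.allclose(1.0 * G1 + 3.0 * G2, A))     # A = lam1 G1 + lam2 G2  -> True
```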
( A λ n = f(a = f(λ G + f(λ 2 G 2 + + f(λ k G k 4 Power method Power method is an iterative technique for computing a dominant eigenpair (λ, x of a diagonalizable A R n n with eigenvalues λ > λ 2 λ 3 λ k Note that λ must be real, otherwise λ is another eigenvalue with the same magnitude Consider f(z = ( z λ n and use the spectral representation ( λ2 ng2 ( λk ngk = G + + + G as n λ λ A n x 0 Consequently, λ n G x 0 N(A λ I for all x 0 So if G x 0 0, or equivalently x 0 R(A λ I, then An x 0 converges to an eigenvector associated with λ This also means A n x 0 λ n tends toward the direction of an eigenvector because λ n is just a scaling factor to keep the length of the vector under control Instead of using λ n, we can simply use the component of maximal magnitude Denote m(x the first maximal component of x, we can iterate with x 0 R(A λ I y n = Ax n, ν n = m(y n, x n+ = y n ν n, for n = 0,, 2, So, x n x an eigenvector Moreover, note that So if ν n ν, then by taking the limit, we have ( Axn Ax n+ = A = A2 x n ν n ν n λ x = Ax = A2 x ν = λ2 x ν ν = λ 5 Systems of differential equations Consider the system of first-order differential equations with constant coefficients u (t = a u + a 2 u 2 + + a n u n, u 2(t = a 2 u + a 22 u 2 + + a 2n u n, u n(t = a n u + a n2 u 2 + + a nn u n, with u (0 = c, u 2 (0 = c 2, u n (0 = c n We write as a matrix form u = Au u(0 = c where u = u u n, A = ( a ij n n, c = 5 c c n
If $A$ is diagonalizable with $\sigma(A) = \{\lambda_1, \dots, \lambda_k\}$, then
$$e^{At} = e^{\lambda_1 t}G_1 + e^{\lambda_2 t}G_2 + \cdots + e^{\lambda_k t}G_k.$$
So we can derive
$$\frac{d\,e^{At}}{dt} = \sum_{i=1}^{k} \lambda_i e^{\lambda_i t}G_i = \Big(\sum_{i=1}^{k} \lambda_i G_i\Big)\Big(\sum_{i=1}^{k} e^{\lambda_i t}G_i\Big) = Ae^{At},$$
$$Ae^{At} = e^{At}A, \qquad e^{At}e^{-At} = e^{-At}e^{At} = I = e^{0}.$$
The first identity ensures that $u = e^{At}c$ is one solution to $u' = Au$, $u(0) = c$. To see that this is the only solution, suppose $v(t)$ is another solution, so that $v' = Av$, $v(0) = c$. Differentiating $e^{-At}v$ yields
$$\frac{d[e^{-At}v]}{dt} = e^{-At}v' - e^{-At}Av = 0,$$
so $e^{-At}v$ is constant for all $t$. At $t = 0$, we have $e^{-At}v\big|_{t=0} = e^{0}v(0) = Ic = c$, so $e^{-At}v = c$ for all $t$. It follows that $v = e^{At}c = u$. Finally, note that $v_i = G_i c \in N(A - \lambda_i I)$ is an eigenvector associated with $\lambda_i$ (whenever it is nonzero). Thus we can write the solution as
$$u = e^{\lambda_1 t}v_1 + \cdots + e^{\lambda_k t}v_k,$$
and this solution is completely determined by the eigenpairs $(\lambda_i, v_i)$.
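As a closing numerical check, the solution formula $u(t) = e^{At}c = \sum_i e^{\lambda_i t} G_i c$ can be evaluated directly through an eigendecomposition (a NumPy sketch; the helper name `solve_linear_ode`, the matrix, and the initial data are my own illustrative choices):

```python
import numpy as np

def solve_linear_ode(A, c, t):
    # u(t) = e^{At} c = P diag(e^{lambda_i t}) P^{-1} c for diagonalizable A
    lam, P = np.linalg.eig(A)
    a = np.linalg.solve(P, c)              # coordinates of c in the eigenbasis
    return (P * np.exp(lam * t)) @ a       # scale columns of P by e^{lambda_i t}

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # eigenvalues 3 and 1
c = np.array([1.0, 0.0])
t = 0.5

u = solve_linear_ode(A, c, t)
# Closed form via spectral projectors: with G for lambda=3 and for lambda=1,
# G_3 c = (1/2, 1/2) and G_1 c = (1/2, -1/2), so
exact = np.exp(3 * t) * np.array([0.5, 0.5]) + np.exp(t) * np.array([0.5, -0.5])
print(np.allclose(u, exact))                       # True
print(np.allclose(solve_linear_ode(A, c, 0.0), c)) # u(0) = c: True
```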