THE EIGENVALUE PROBLEM

Let A be an n x n square matrix. If there is a number λ and a column vector v ≠ 0 for which

    A v = λ v

then we say λ is an eigenvalue of A and v is an associated eigenvector. Note that if v is an eigenvector, then any nonzero multiple of v is also an eigenvector for the same eigenvalue λ.

Example. Let

    A = [ 1.25  0.75 ]
        [ 0.75  1.25 ]

The eigenvalue-eigenvector pairs for A are

    λ_1 = 2,  v^(1) = [ 1 ] ;    λ_2 = 0.5,  v^(2) = [  1 ]                (1)
                      [ 1 ]                          [ -1 ]
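The defining relation A v = λ v is easy to check numerically. A minimal sketch using NumPy (not part of the original notes) verifies both eigenpairs of the 2 x 2 example, and that nonzero multiples of an eigenvector are again eigenvectors:

```python
import numpy as np

# The 2x2 example above and its two eigenpairs.
A = np.array([[1.25, 0.75],
              [0.75, 1.25]])
pairs = [(2.0, np.array([1.0, 1.0])),    # lambda_1 = 2,   v^(1) = [1, 1]^T
         (0.5, np.array([1.0, -1.0]))]   # lambda_2 = 0.5, v^(2) = [1, -1]^T

for lam, v in pairs:
    assert np.allclose(A @ v, lam * v)              # A v = lambda v
    assert np.allclose(A @ (3 * v), lam * (3 * v))  # any nonzero multiple of v works too
```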
Eigenvalues and eigenvectors are often used to give additional intuition for the function

    F(x) = A x,    x ∈ R^n or C^n

Example. The eigenvectors in (1) form a basis for R^2. For x = [x_1, x_2]^T,

    x = c_1 v^(1) + c_2 v^(2),   c_1 = (x_1 + x_2)/2,   c_2 = (x_1 - x_2)/2        (2)

Using (2), we can write the function F(x) = A x as

    F(x) = c_1 A v^(1) + c_2 A v^(2) = 2 c_1 v^(1) + 0.5 c_2 v^(2),   x ∈ R^2

The eigenvectors provide a better way to understand the meaning of F(x) = A x. See the following figures.
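The decomposition above can be checked directly. In this sketch (NumPy, with an arbitrary sample point chosen only for illustration), the coefficients c_1, c_2 reconstruct x, and applying A just rescales each eigencomponent:

```python
import numpy as np

A = np.array([[1.25, 0.75],
              [0.75, 1.25]])
v1 = np.array([1.0, 1.0])    # v^(1)
v2 = np.array([1.0, -1.0])   # v^(2)

x = np.array([0.3, -1.1])    # arbitrary sample point (an illustrative choice)
c1 = (x[0] + x[1]) / 2       # c1 = (x1 + x2)/2
c2 = (x[0] - x[1]) / 2       # c2 = (x1 - x2)/2

assert np.allclose(x, c1 * v1 + c2 * v2)                # x = c1 v^(1) + c2 v^(2)
assert np.allclose(A @ x, 2 * c1 * v1 + 0.5 * c2 * v2)  # F(x) = 2 c1 v^(1) + 0.5 c2 v^(2)
```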
    [Figure: decomposition of x into c_1 v^(1) + c_2 v^(2)]

    [Figure: decomposition of Ax into 2 c_1 v^(1) + 0.5 c_2 v^(2)]
How to calculate λ and v? Rewrite A v = λ v as

    (λI - A) v = 0,   v ≠ 0                (3)

a homogeneous system of linear equations with coefficient matrix λI - A and a nonzero solution v. This can be true iff

    f(λ) ≡ det(λI - A) = 0

The function f(λ) is called the characteristic polynomial of A, and its roots are the eigenvalues of A. Assuming A has order n,

    f(λ) = λ^n + α_{n-1} λ^{n-1} + ... + α_1 λ + α_0                (4)

For the case n = 2,

    f(λ) = det [ λ - a_11    -a_12  ]  =  (λ - a_11)(λ - a_22) - a_12 a_21
               [  -a_21    λ - a_22 ]
         = λ^2 - (a_11 + a_22) λ + a_11 a_22 - a_12 a_21

The formula (4) shows that a matrix A of order n can have at most n distinct eigenvalues.
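For the n = 2 case, the eigenvalues can be recovered as the roots of this quadratic. A quick sketch (NumPy, not part of the original notes) applied to the earlier 2 x 2 example:

```python
import numpy as np

# Coefficients of f(lambda) = lambda^2 - (a11 + a22) lambda + (a11 a22 - a12 a21)
A = np.array([[1.25, 0.75],
              [0.75, 1.25]])
coeffs = [1.0,
          -(A[0, 0] + A[1, 1]),
          A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]]

eigs = np.sort(np.roots(coeffs))   # roots of the characteristic polynomial
assert np.allclose(eigs, [0.5, 2.0])
```

(Root-finding on the characteristic polynomial is shown here only to illustrate the definition; in practice eigenvalues are computed by other means.)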
Example. Let

    A = [  -7   13  -16 ]
        [  13  -10   13 ]
        [ -16   13   -7 ]

Then

    f(λ) = det [ λ + 7    -13      16   ]
               [  -13    λ + 10   -13   ]
               [   16     -13    λ + 7  ]
         = λ^3 + 24 λ^2 - 405 λ + 972                (5)

is the characteristic polynomial of A. Its roots are

    λ_1 = -36,   λ_2 = 9,   λ_3 = 3                (6)
Finding an eigenvector

For λ_1 = -36, find an associated eigenvector v by solving (λI - A)v = 0, which becomes (-36I - A)v = 0:

    [ -29  -13   16 ] [ v_1 ]   [ 0 ]
    [ -13  -26  -13 ] [ v_2 ] = [ 0 ]
    [  16  -13  -29 ] [ v_3 ]   [ 0 ]

If v_1 = 0, then the only solution is v = 0. Thus v_1 ≠ 0, and we arbitrarily choose v_1 = 1. This leads to the system

    -13 v_2 + 16 v_3 =  29
    -26 v_2 - 13 v_3 =  13
    -13 v_2 - 29 v_3 = -16

The solution is v_2 = -1, v_3 = 1. Thus the eigenvector v for λ_1 = -36 is

    v^(1) = [1, -1, 1]^T                (7)

or any nonzero multiple of this vector.
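The same computation can be sketched in NumPy (an illustrative reconstruction, not part of the original notes): fix v_1 = 1, move that column to the right-hand side, and solve two of the remaining equations for v_2 and v_3:

```python
import numpy as np

A = np.array([[ -7.0, 13.0, -16.0],
              [ 13.0, -10.0, 13.0],
              [-16.0, 13.0,  -7.0]])
lam1 = -36.0
B = lam1 * np.eye(3) - A                       # the singular matrix (-36 I - A)

# Set v1 = 1; solve the first two equations of B v = 0 for v2 and v3.
v23 = np.linalg.solve(B[:2, 1:], -B[:2, 0])
v = np.array([1.0, v23[0], v23[1]])

assert np.allclose(v, [1.0, -1.0, 1.0])        # v^(1) = [1, -1, 1]^T
assert np.allclose(A @ v, lam1 * v)            # check: A v = -36 v
```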
SYMMETRIC MATRICES

Recall that A symmetric means A^T = A, assuming A contains only real number entries. Such matrices are very common in applications.

Example. A general 3 x 3 symmetric matrix looks like

    A = [ a  b  c ]
        [ b  d  e ]
        [ c  e  f ]

A general n x n matrix A = [a_ij] is symmetric if and only if

    a_ij = a_ji,   1 ≤ i, j ≤ n

Following is a very important set of results for symmetric matrices, explaining much of their special character.
THEOREM. Let A be a real, symmetric, n x n matrix. Then there is a set of n eigenvalue-eigenvector pairs {λ_i, v^(i)}, 1 ≤ i ≤ n, that satisfy the following properties.

(i) The numbers λ_1, λ_2, ..., λ_n are all of the roots of the characteristic polynomial f(λ) of A, repeated according to their multiplicity. Moreover, all the λ_i are real numbers.

(ii) The eigenvectors v^(i), 1 ≤ i ≤ n, as vectors in n-dimensional space, are mutually perpendicular and of length 1:

    v^(i)T v^(j) = 0,   1 ≤ i, j ≤ n,  i ≠ j
    v^(i)T v^(i) = 1,   1 ≤ i ≤ n
(iii) For each column vector x = [x_1, x_2, ..., x_n]^T, there is a unique choice of constants c_1, ..., c_n for which

    x = c_1 v^(1) + ... + c_n v^(n)

The constants are given by

    c_i = Σ_{j=1}^{n} x_j v_j^(i) = x^T v^(i),   1 ≤ i ≤ n

where v^(i) = [v_1^(i), ..., v_n^(i)]^T.
(iv) Define the matrix U of order n by

    U = [v^(1), v^(2), ..., v^(n)]                (8)

Then

    U^T A U = D = [ λ_1   0   ...   0  ]
                  [  0   λ_2  ...   0  ]          and   U U^T = U^T U = I        (9)
                  [  :    :    ..   :  ]
                  [  0    0   ...  λ_n ]

    A = U D U^T

is a useful decomposition of A.
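Properties (ii) and (iv) can be checked numerically. In this sketch (NumPy, not part of the original notes), `numpy.linalg.eigh`, which is designed for symmetric matrices, returns exactly such an orthonormal eigenvector matrix U:

```python
import numpy as np

A = np.array([[1.25, 0.75],
              [0.75, 1.25]])

# For a real symmetric matrix, eigh returns orthonormal eigenvectors.
D, U = np.linalg.eigh(A)      # D holds the eigenvalues, U's columns the v^(i)

assert np.allclose(U.T @ U, np.eye(2))        # U^T U = I
assert np.allclose(U.T @ A @ U, np.diag(D))   # U^T A U = D
assert np.allclose(U @ np.diag(D) @ U.T, A)   # A = U D U^T
assert np.allclose(D, [0.5, 2.0])             # eigh returns eigenvalues in ascending order
```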
Example. Recall

    A = [ 1.25  0.75 ]
        [ 0.75  1.25 ]

and its eigenvalue-eigenvector pairs

    λ_1 = 2, v^(1) = [1, 1]^T ;   λ_2 = 0.5, v^(2) = [1, -1]^T

The eigenvectors are perpendicular. To have them be of length 1, replace the above by

    λ_1 = 2, v^(1) = (1/√2)[1, 1]^T ;   λ_2 = 0.5, v^(2) = (1/√2)[1, -1]^T

which are multiples of the original v^(i).
The matrix U of (8) is given by

    U = (1/√2) [ 1   1 ]
               [ 1  -1 ]

Easily, U^T U = I. Also,

    U^T A U = (1/2) [ 1   1 ] [ 1.25  0.75 ] [ 1   1 ]  =  [ 2   0  ]
                    [ 1  -1 ] [ 0.75  1.25 ] [ 1  -1 ]     [ 0  0.5 ]

as specified in (9).
NONSYMMETRIC MATRICES

Nonsymmetric matrices have a wide variety of possible behaviour. We illustrate some of the possibilities with two simple examples.

Example. We illustrate the existence of complex eigenvalues. Let

    A = [  0  1 ]
        [ -1  0 ]

The characteristic polynomial is

    f(λ) = det [ λ  -1 ]  =  λ^2 + 1
               [ 1   λ ]

The roots of f(λ) are complex, λ = ±i, and corresponding eigenvectors are

    v^(1) = [ 1 ]  for λ_1 = i ;    v^(2) = [  1 ]  for λ_2 = -i
            [ i ]                           [ -i ]
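A real nonsymmetric matrix can thus have complex eigenvalues and eigenvectors, which NumPy reports automatically. A minimal check (not part of the original notes):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

eigs = np.sort_complex(np.linalg.eigvals(A))
assert np.allclose(eigs, [-1j, 1j])        # complex eigenvalues of a real matrix

v1 = np.array([1.0, 1j])                   # eigenvector for lambda_1 = i
v2 = np.array([1.0, -1j])                  # eigenvector for lambda_2 = -i
assert np.allclose(A @ v1, 1j * v1)
assert np.allclose(A @ v2, -1j * v2)
```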
Example. For A an n x n nonsymmetric matrix, there may not be n independent eigenvectors. Let

    A = [ a  1  0 ]
        [ 0  a  1 ]
        [ 0  0  a ]

where a is a constant. Then λ = a is the eigenvalue of A with multiplicity 3, and any associated eigenvector must be of the form

    v = c [1, 0, 0]^T

for some c ≠ 0. Thus, up to a nonzero multiplicative constant, we have only one eigenvector,

    v = [1, 0, 0]^T

for the three equal eigenvalues λ_1 = λ_2 = λ_3 = a.
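The shortage of eigenvectors shows up numerically as a rank statement: A - aI has rank 2, so its null space (the eigenspace) is only one-dimensional. A sketch (NumPy; the value a = 4 is an arbitrary illustrative choice):

```python
import numpy as np

a = 4.0                       # any constant works; a = 4 is an arbitrary choice
A = np.array([[a, 1.0, 0.0],
              [0.0, a, 1.0],
              [0.0, 0.0, a]])

eigs = np.linalg.eigvals(A)
assert np.allclose(eigs, [a, a, a])        # lambda = a with multiplicity 3

# A - aI has rank 2, so the eigenspace is only one-dimensional.
assert np.linalg.matrix_rank(A - a * np.eye(3)) == 2
v = np.array([1.0, 0.0, 0.0])
assert np.allclose(A @ v, a * v)           # the single eigenvector direction
```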
THE POWER METHOD

This numerical method is used primarily to find the eigenvalue of largest magnitude, if such exists. We assume the eigenvalues {λ_1, ..., λ_n} of an n x n matrix A satisfy

    |λ_1| > |λ_2| ≥ ... ≥ |λ_n|                (10)

Denote the eigenvector for λ_1 by v^(1). We define an iteration method for computing improved estimates of λ_1 and v^(1).

Choose z^(0) ≈ v^(1), usually chosen randomly. Define

    w^(1) = A z^(0)

Let α_1 be the maximum component of w^(1), in size. If there is more than one such component, choose the first such component as α_1. Then define

    z^(1) = (1/α_1) w^(1)
Repeat the process iteratively. Define

    w^(m) = A z^(m-1)                (11)

Let α_m be the maximum component of w^(m), in size. Define

    z^(m) = (1/α_m) w^(m)                (12)

for m = 1, 2, .... Then, roughly speaking, the vectors z^(m) will converge to some multiple of v^(1).

To find λ_1 by this process, also pick some nonzero component of the vectors z^(m) and w^(m), say component k, and fix k. Often this is picked as the maximal component of z^(l), for some large l. Define

    λ^(m) = w_k^(m) / z_k^(m-1),   m = 1, 2, ...                (13)

where z_k^(m-1) denotes component k of z^(m-1). It can be shown that λ^(m) converges to λ_1 as m → ∞.
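The iteration (11)-(13) translates directly into code. A minimal sketch (NumPy; the function name and the fixed choice k = 0 are illustrative assumptions), applied to the earlier 2 x 2 example:

```python
import numpy as np

def power_method(A, z0, nsteps, k=0):
    """Iteration (11)-(12) with the ratio estimate (13), using component k."""
    z = np.asarray(z0, dtype=float)
    lams = []
    for _ in range(nsteps):
        w = A @ z                          # w^(m) = A z^(m-1)
        alpha = w[np.argmax(np.abs(w))]    # max-size component (first one on ties)
        lams.append(w[k] / z[k])           # lambda^(m) = w_k^(m) / z_k^(m-1)
        z = w / alpha                      # z^(m) = (1/alpha_m) w^(m)
    return lams, z

A = np.array([[1.25, 0.75],
              [0.75, 1.25]])
lams, z = power_method(A, [1.0, 0.5], 30)

assert abs(lams[-1] - 2.0) < 1e-12        # lambda^(m) -> lambda_1 = 2
assert np.allclose(z, [1.0, 1.0])         # z^(m) -> v^(1) = [1, 1]^T
```

Note that α_m is the signed maximal component, so each z^(m) has its largest entry equal to 1.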
Example. Recall the earlier example

    A = [ 1.25  0.75 ]
        [ 0.75  1.25 ]

Double precision was used in the computation, with rounded values shown below.

    m   z_1^(m)   z_2^(m)    λ^(m)      λ^(m) - λ^(m-1)   Ratio
    0   1.0       0.5
    1   1.0       0.84615    1.62500
    2   1.0       0.95918    1.88462    2.60E-1
    3   1.0       0.98964    1.96939    8.48E-2           0.33
    4   1.0       0.99740    1.99223    2.28E-2           0.27
    5   1.0       0.99935    1.99805    5.82E-3           0.26
    6   1.0       0.99984    1.99951    1.46E-3           0.25
    7   1.0       0.99996    1.99988    3.66E-4           0.25

Note that λ_1 = 2 and v^(1) = [1, 1]^T, and the numerical results are converging to these values. The column of successive differences λ^(m) - λ^(m-1) and their successive ratios are included to show that there is a regular pattern to the convergence.
CONVERGENCE

Assume A is a real n x n matrix. It can be shown, by induction, that

    z^(m) = σ_m A^m z^(0) / ||A^m z^(0)||_∞,   m ≥ 1                (14)

where σ_m = ±1.

Further assume that the eigenvalues {λ_1, ..., λ_n} satisfy (10), and also that there are n corresponding eigenvectors {v^(1), ..., v^(n)} that form a basis for R^n. Thus

    z^(0) = c_1 v^(1) + ... + c_n v^(n)                (15)

for some choice of constants {c_1, ..., c_n}. Assume c_1 ≠ 0, something that a truly random choice of z^(0) will usually guarantee.
Apply A to z^(0) in (15), to get

    A z^(0) = c_1 A v^(1) + ... + c_n A v^(n)
            = c_1 λ_1 v^(1) + ... + c_n λ_n v^(n)

Apply A repeatedly to get

    A^m z^(0) = c_1 λ_1^m v^(1) + ... + c_n λ_n^m v^(n)
              = λ_1^m [ c_1 v^(1) + (λ_2/λ_1)^m c_2 v^(2) + ... + (λ_n/λ_1)^m c_n v^(n) ]

From (14),

              σ_m (λ_1/|λ_1|)^m [ c_1 v^(1) + (λ_2/λ_1)^m c_2 v^(2) + ... + (λ_n/λ_1)^m c_n v^(n) ]
    z^(m) =   -----------------------------------------------------------------------------------        (16)
              || c_1 v^(1) + (λ_2/λ_1)^m c_2 v^(2) + ... + (λ_n/λ_1)^m c_n v^(n) ||_∞

As m → ∞, the terms (λ_j/λ_1)^m → 0 for 2 ≤ j ≤ n, with (λ_2/λ_1)^m the largest. Also,

    σ_m (λ_1/|λ_1|)^m = ±1
Thus as m → ∞, most terms in (16) are converging to zero. Cancelling c_1 from numerator and denominator, we obtain

    z^(m) ≈ σ̂_m v^(1) / ||v^(1)||_∞

where σ̂_m = ±1. If the normalization of z^(m) is modified, to always have some particular component be positive, then

    z^(m) → ± v^(1) / ||v^(1)||_∞ ≡ v̂^(1)                (17)

with a fixed sign independent of m. Our earlier normalization of sign, dividing by α_m, will usually accomplish this, but not always. The error in z^(m) will satisfy

    || z^(m) - v̂^(1) || ≤ c |λ_2/λ_1|^m,   m ≥ 0                (18)

for some constant c > 0.
CONVERGENCE OF λ^(m)

A similar convergence analysis can be given for {λ^(m)}, with the same kind of error bound. Moreover, if we also assume

    |λ_1| > |λ_2| > |λ_3| ≥ |λ_4| ≥ ... ≥ |λ_n| ≥ 0                (19)

then

    |λ_1 - λ^(m)| ≤ c |λ_2/λ_1|^m                (20)

for some constant c, as m → ∞. The error decreases geometrically with a ratio of λ_2/λ_1. In the earlier example with

    A = [ 1.25  0.75 ]
        [ 0.75  1.25 ]

we have λ_2/λ_1 = 0.5/2 = 0.25, which is the ratio observed in the table.
Example. Consider the symmetric matrix

    A = [  -7   13  -16 ]
        [  13  -10   13 ]
        [ -16   13   -7 ]

From earlier, the eigenvalues are

    λ_1 = -36,   λ_2 = 9,   λ_3 = 3

The eigenvector v^(1) associated with λ_1 is

    v^(1) = [1, -1, 1]^T

Results of using the power method are shown in the following tables. Note that the ratios of the successive differences of λ^(m) approach λ_2/λ_1 = -0.25. Also note that the location of the maximal component of z^(m) changes from one iteration to the next. The initial guess z^(0) was chosen closer to v^(1) than would be usual in actual practice; it was so chosen for purposes of illustration.
    m   z_1^(m)      z_2^(m)      z_3^(m)
    0    1.000000    -0.800000     0.900000
    1   -0.972477     1.000000    -1.000000
    2    1.000000    -0.995388     0.993082
    3    0.998265    -0.999229     1.000000
    4    1.000000    -0.999775     0.999566
    5    0.999892    -0.999946     1.000000
    6    1.000000    -0.999986     0.999973
    7    0.999993    -0.999997     1.000000

    m   λ^(m)        λ^(m) - λ^(m-1)   Ratio
    1   -31.80000
    2   -36.82075    -5.02E+0
    3   -35.82936     9.91E-1          -0.197
    4   -36.04035    -2.11E-1          -0.213
    5   -35.99013     5.02E-2          -0.238
    6   -36.00245    -1.23E-2          -0.245
    7   -35.99939     3.06E-3          -0.249
AITKEN EXTRAPOLATION

From (20),

    λ_1 - λ^(m+1) ≈ r (λ_1 - λ^(m)),   r = λ_2/λ_1                (21)

for large m. Choose r using

    r ≈ (λ^(m+1) - λ^(m)) / (λ^(m) - λ^(m-1))                (22)

as with Aitken extrapolation in Section 3.4 on linear iteration methods. Using this r, solve for λ_1 in (21):

    λ_1 ≈ (1/(1-r)) [ λ^(m+1) - r λ^(m) ]
        = λ^(m+1) + (r/(1-r)) [ λ^(m+1) - λ^(m) ]                (23)

This is Aitken's extrapolation formula. It also gives us the Aitken error estimate

    λ_1 - λ^(m+1) ≈ (r/(1-r)) [ λ^(m+1) - λ^(m) ]                (24)
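Formulas (22)-(23) can be sketched as follows (NumPy; the helper name `aitken` is an illustrative assumption). Running the power method on the earlier 2 x 2 example and extrapolating from the last three estimates gives a value far more accurate than the last estimate itself:

```python
import numpy as np

def aitken(lm_prev, lm, lm_next):
    """Formula (23): accelerated estimate of lambda_1 from three successive lambda^(m)."""
    r = (lm_next - lm) / (lm - lm_prev)           # the ratio estimate (22)
    return lm_next + r / (1 - r) * (lm_next - lm)

# Power-method estimates for the 2x2 example (lambda_1 = 2).
A = np.array([[1.25, 0.75],
              [0.75, 1.25]])
z = np.array([1.0, 0.5])
lams = []
for _ in range(6):
    w = A @ z
    lams.append(w[0] / z[0])                      # lambda^(m), using component k = 1
    z = w / w[np.argmax(np.abs(w))]

accel = aitken(*lams[-3:])
# The extrapolated value is far more accurate than lambda^(6) itself.
assert abs(accel - 2.0) < abs(lams[-1] - 2.0) / 50
```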
Example. In the previous table, take m + 1 = 7. Then (24) yields

    λ_1 - λ^(7) ≈ (r/(1-r)) [ λ^(7) - λ^(6) ]
               = (-0.249/(1 + 0.249)) (0.00306) ≐ -0.00061

which is the actual error. Also, the Aitken formula (23) will give the exact answer for λ_1, to seven significant digits.