EIGENVALUES IN LINEAR ALGEBRA*

P. F. Leung
National University of Singapore

1. General discussion

In this talk, we shall present an elementary study of eigenvalues in linear algebra. Very often in various branches of mathematics we come across the problem of finding certain powers, say $A^k$, of a square matrix $A$. This comes up, for example, in the theory of Markov chains, in the solution of a system of ordinary differential equations with constant coefficients, etc.

In general, for small $k$ we can calculate $A^k$ directly by multiplication, but this becomes complicated for large $k$ or for a general value of $k$. To find another approach, we first observe that $A^k$ can be calculated easily when $A$ is a diagonal matrix. For example, if
$$A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad \text{then} \quad A^k = \begin{pmatrix} a^k & 0 \\ 0 & b^k \end{pmatrix}$$
for any positive integer $k$.

*Text of a talk given at the Workshop on Linear Algebra and its Teaching, September 1985, at the National University of Singapore.
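The observation about powers of a diagonal matrix is easy to check numerically. The following short sketch uses Python with NumPy; the diagonal entries 2 and 5 are arbitrary choices for illustration, not from the talk itself.

```python
import numpy as np

# A diagonal matrix with made-up entries d1 = 2, d2 = 5.
D = np.diag([2.0, 5.0])

# D^k is again diagonal, with the k-th powers of the diagonal entries.
k = 7
Dk = np.linalg.matrix_power(D, k)

expected = np.diag([2.0**7, 5.0**7])
assert np.allclose(Dk, expected)
```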
However, very often the matrix we come across need not be diagonal, so this observation doesn't help us directly. Still, if we think about it more carefully, we come to realise that even when $A$ itself is not diagonal, if it is similar to a diagonal matrix $D$ in the sense that $A = MDM^{-1}$ for some non-singular matrix $M$, then we have
$$A^k = (MDM^{-1})(MDM^{-1})\cdots(MDM^{-1}) = MD^kM^{-1} \quad (k \text{ terms}),$$
and hence $A^k$ can again be calculated easily.

Now let us spend a little time and find out something about the equation $A = MDM^{-1}$. We will represent vectors in column form and write $M = (m_1, \ldots, m_n)$, where $m_i$ is the $i$-th column of the matrix $M$. Then from $A = MDM^{-1}$ we have $AM = MD$, and so, writing $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$,
$$(Am_1, \ldots, Am_n) = (\lambda_1 m_1, \ldots, \lambda_n m_n), \qquad (1)$$
and therefore $Am_i = \lambda_i m_i$ for each $i$. Since $M$ is non-singular, $m_i \neq 0$, and so we are led to
study the equation $Ax = \lambda x$ for $x \neq 0$.

Definition. Let $A$ be an $n \times n$ matrix over the field of real numbers $\mathbb{R}$. A real number $\lambda$ is called an eigenvalue of $A$ if there is a vector $x$ (in column form) in $\mathbb{R}^n$, $x \neq 0$, such that $Ax = \lambda x$. In this case $x$ is called an eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

From $Ax = \lambda x$, we have $A(cx) = cAx = c\lambda x = \lambda(cx)$ for any real number $c$, and hence we see that any non-zero multiple of an eigenvector is again an eigenvector corresponding to the same eigenvalue, and the linear subspace generated by $x$ is mapped into itself under $A$. In general, a subspace $X$ is called an invariant subspace of $A$ if $AX \subseteq X$.

Let us now return to the question of finding a diagonal matrix $D$ and a non-singular matrix $M$ such that $A = MDM^{-1}$. Suppose we can find $n$ eigenvalues $\lambda_1, \ldots, \lambda_n$ such that the corresponding $n$ eigenvectors $m_1, \ldots, m_n$ are linearly independent. Then $M = (m_1, \ldots, m_n)$ is non-singular, and letting $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, we see from (1) that $AM = MD$ and so $A = MDM^{-1}$.

In §2 we shall see how to find the eigenvalues and eigenvectors. In §3 we shall study one very important particular case, when $A$ is symmetric, and we shall find out that in this case $D$ and $M$ can always be found (at least in theory) and furthermore
$M$ can be assumed to be orthogonal (i.e. $M^{-1} = M^t$, the transpose of $M$) as well.

2. Finding eigenvalues and eigenvectors

$A$ is our $n \times n$ matrix and we use $I$ to denote the $n \times n$ identity matrix. We have
$$Ax = \lambda x \iff Ax - \lambda Ix = 0 \iff (A - \lambda I)x = 0. \qquad (2)$$
Now (2) represents a system of linear equations with coefficient matrix $A - \lambda I$, and the requirement $x \neq 0$ means that (2) has a non-trivial solution; hence we must have $\det(A - \lambda I) = 0$, where $\det$ denotes the determinant function. Therefore $\lambda$ is a solution of the characteristic equation
$$\det(A - zI) = 0. \qquad (*)$$
Conversely, if $\lambda$ satisfies (*), then (2) will have a non-trivial solution $x$, and hence $Ax = \lambda x$, $x \neq 0$. Hence finding the eigenvalues of $A$ is equivalent to solving its characteristic equation (*). We see that (*) is a polynomial equation (in $z$) of degree $n$, and so $A$ has at most $n$ eigenvalues (counting multiplicity). Once an eigenvalue is found, an eigenvector corresponding to it can be found by solving for $x$ in (2) using standard methods from linear algebra. Let us illustrate this procedure in some detail in an example.
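The procedure just described (form the characteristic polynomial, find its roots) can also be sketched numerically. In NumPy, `np.poly` applied to a square matrix returns the coefficients of its characteristic polynomial (up to the sign convention $\det(zI - A)$, which has the same roots as (*)); the $2 \times 2$ matrix below is made up for illustration.

```python
import numpy as np

# A small made-up matrix for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Coefficients of det(zI - A), highest degree first: here z^2 - 7z + 10.
coeffs = np.poly(A)

# The roots of the characteristic polynomial are exactly the eigenvalues.
roots = np.sort(np.roots(coeffs))
eigs = np.sort(np.linalg.eigvals(A))
assert np.allclose(roots, eigs)
```

In practice one would call an eigenvalue routine directly rather than root-finding on the characteristic polynomial, which is numerically fragile for large $n$; the sketch only mirrors the textbook procedure.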
Example. Our matrix is
$$A = \begin{pmatrix} 2 & -3 \\ -2 & 1 \end{pmatrix}.$$
From (*) the characteristic equation is
$$\begin{vmatrix} 2-z & -3 \\ -2 & 1-z \end{vmatrix} = 0.$$
This simplifies to $z^2 - 3z - 4 = 0$, i.e. $(z-4)(z+1) = 0$, and hence we see that there are two eigenvalues, $\lambda_1 = -1$ and $\lambda_2 = 4$.

To find an eigenvector corresponding to $\lambda_1 = -1$, we look at equation (2) for this case, which is
$$\begin{pmatrix} 3 & -3 \\ -2 & 2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
and solve for the vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. This equation reduces to $x_1 = x_2$, and so
$$m_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
is an eigenvector corresponding to $\lambda_1 = -1$. Next we do the same for $\lambda_2 = 4$ and solve
$$\begin{pmatrix} -2 & -3 \\ -2 & -3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
which reduces to $2x_1 = -3x_2$, and hence
$$m_2 = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$$
is an eigenvector corresponding to $\lambda_2 = 4$. The matrix $M$ will be given by
$$M = \begin{pmatrix} 1 & 3 \\ 1 & -2 \end{pmatrix},$$
and using the formula for the inverse of a $2 \times 2$ matrix, we see that
$$M^{-1} = \frac{1}{5}\begin{pmatrix} 2 & 3 \\ 1 & -1 \end{pmatrix},$$
and we finally conclude from
$$A = M\begin{pmatrix} -1 & 0 \\ 0 & 4 \end{pmatrix}M^{-1}$$
that
$$A^n = M\begin{pmatrix} (-1)^n & 0 \\ 0 & 4^n \end{pmatrix}M^{-1},$$
and hence
$$\begin{pmatrix} 2 & -3 \\ -2 & 1 \end{pmatrix}^n = \frac{1}{5}\begin{pmatrix} 2(-1)^n + 3(4^n) & 3(-1)^n - 3(4^n) \\ 2(-1)^n - 2(4^n) & 3(-1)^n + 2(4^n) \end{pmatrix}.$$
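The worked example can be verified numerically. The sketch below (Python with NumPy) rebuilds $M$, $D$ and $M^{-1}$ from the example and checks both the diagonalisation and the closed form for $A^n$ at $n = 5$.

```python
import numpy as np

# The matrix from the worked example.
A = np.array([[2.0, -3.0],
              [-2.0, 1.0]])

# Eigenvectors (1,1) and (3,-2) as columns; eigenvalues -1 and 4.
M = np.array([[1.0, 3.0],
              [1.0, -2.0]])
D = np.diag([-1.0, 4.0])
Minv = np.linalg.inv(M)

# A = M D M^{-1}
assert np.allclose(A, M @ D @ Minv)

# Closed form for A^n agrees with direct matrix powering, e.g. n = 5.
n = 5
An = M @ np.diag([(-1.0)**n, 4.0**n]) @ Minv
assert np.allclose(An, np.linalg.matrix_power(A, n))
```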
3. The spectral theorem for symmetric matrices

We shall consider the $n$-dimensional Euclidean space $\mathbb{R}^n$ with its usual inner product (i.e. dot product), denoted by $\langle\,,\rangle$, so that for any two vectors
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$
in $\mathbb{R}^n$, we have $\langle x, y\rangle = x_1y_1 + \cdots + x_ny_n$.

An $n \times n$ matrix $A = (a_{ij})$ is called a symmetric matrix if we have $\langle Ax, y\rangle = \langle x, Ay\rangle$ for any $x, y \in \mathbb{R}^n$. Since in general we always have $\langle Ax, y\rangle = \langle x, A^ty\rangle$, a matrix is symmetric if and only if $A = A^t$, where $A^t$ is the transpose of $A$, i.e. $a_{ij} = a_{ji}$ for all $i, j = 1, \ldots, n$. For the rest of this section $A$ will denote a symmetric matrix.

A matrix $M$ is called an orthogonal matrix if $M^{-1} = M^t$. For an orthogonal matrix $M$, we have $M^tM = I$, and since the $(i,j)$ entry in $M^tM$ is $\langle m_i, m_j\rangle$, where as before $m_i$, $m_j$ denote the $i$-th and $j$-th columns of $M$ respectively, we see that the columns $m_1, \ldots, m_n$ form an orthonormal basis of $\mathbb{R}^n$. Conversely, if $m_1, \ldots, m_n$ form an orthonormal basis of $\mathbb{R}^n$, then $M = (m_1, \ldots, m_n)$ is an orthogonal matrix.

Our main result in this section is the following.

Spectral Theorem. Let $A$ be a symmetric matrix. Then there exist an orthogonal matrix $M$ and a diagonal matrix $D$ such that $A = MDM^t$.

As a consequence, we see that $A$ has $n$ real eigenvalues and that there is an orthonormal basis of $\mathbb{R}^n$ relative to which the transformation $A$ is represented in diagonal form $D$.
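The conclusion of the spectral theorem is exactly what a symmetric eigenvalue routine computes. As a numerical illustration (the random matrix is made up; `np.linalg.eigh` is NumPy's symmetric eigensolver), one can check that the returned eigenvector matrix is orthogonal and diagonalises $A$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up symmetric matrix: B + B^t is always symmetric.
B = rng.standard_normal((4, 4))
A = B + B.T

# eigh returns real eigenvalues and an orthogonal matrix of eigenvectors.
eigvals, M = np.linalg.eigh(A)
D = np.diag(eigvals)

assert np.allclose(M.T @ M, np.eye(4))  # M is orthogonal: M^t M = I
assert np.allclose(A, M @ D @ M.T)      # A = M D M^t
```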
We shall present two different proofs of this very important theorem, based on two different ideas, both of which are of independent interest. In Part I of the following, we shall take a geometric approach to obtain a characterisation of the eigenvalues of $A$ in terms of a maximum-minimum property; this, coupled with a little analysis and linear algebra, will give us an inductive proof of the spectral theorem. In Part II an idea due to Jacobi in the numerical computation of eigenvalues will be studied, and as a consequence we shall obtain a very simple and elegant proof of the spectral theorem.

We need two important facts from analysis:

Fact 1. A subset of Euclidean space is compact if and only if it is closed and bounded.

Fact 2. A continuous real-valued function on a compact set must attain a minimum at some point in this set.

Part I

Let $S^{n-1}$ denote the unit sphere in $\mathbb{R}^n$, i.e. $S^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\}$, where $\|x\|^2 = \langle x, x\rangle$. Clearly $S^{n-1}$ is closed and bounded and hence compact. Now consider the real-valued function $f$ on $S^{n-1}$ defined by $f(x) = \langle Ax, x\rangle$. This attains a minimum at a point, $m_1$ say, in $S^{n-1}$, and this minimum value is $\lambda_1 = \langle Am_1, m_1\rangle$.

Claim. $m_1$ is an eigenvector corresponding to the eigenvalue $\lambda_1$, i.e. $Am_1 = \lambda_1 m_1$.
Proof. By using orthogonal decomposition, we can write
$$Am_1 = \lambda_1 m_1 + \mu y,$$
where $y$ is a unit vector in the orthogonal complement $m_1^\perp$ (the figure accompanying the original text is omitted here). Consider now the unit circle $\gamma(\theta) = m_1\cos\theta + y\sin\theta$, $-\pi \leq \theta \leq \pi$. The restriction of $f$ to this circle, $f(\gamma(\theta))$, as a function of the one variable $\theta$, attains a minimum when $\theta = 0$, and so we have
$$\frac{d}{d\theta}f(\gamma(\theta))\Big|_{\theta=0} = 0. \qquad (3)$$
Now, since $A$ is symmetric and $\langle Am_1, y\rangle = \langle \lambda_1 m_1 + \mu y, y\rangle = \mu$,
$$f(\gamma(\theta)) = \langle Am_1\cos\theta + Ay\sin\theta,\; m_1\cos\theta + y\sin\theta\rangle = \lambda_1\cos^2\theta + \mu\sin 2\theta + \langle Ay, y\rangle\sin^2\theta.$$
Therefore
$$\frac{d}{d\theta}f(\gamma(\theta)) = (\langle Ay, y\rangle - \lambda_1)\sin 2\theta + 2\mu\cos 2\theta,$$
which at $\theta = 0$ equals $2\mu$. Hence from (3) we have $\mu = 0$, and therefore $Am_1 = \lambda_1 m_1$.

Now we apply a similar reasoning to the unit sphere $S^{n-2}$ in the $(n-1)$-dimensional subspace $m_1^\perp$ and obtain a second eigenvalue $\lambda_2$ and a corresponding unit eigenvector $m_2$. Applying this again to the unit sphere in $\{m_1, m_2\}^\perp$, we obtain $\lambda_3$ and $m_3$, and so on; by induction we then obtain all $n$ eigenvalues $\lambda_1, \ldots, \lambda_n$ and corresponding unit eigenvectors $m_1, \ldots, m_n$ which are mutually orthogonal. So $M = (m_1, \ldots, m_n)$ is an orthogonal matrix, and we have $A = MDM^t$, where
$$D = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix},$$
which proves the spectral theorem.

The above reasoning gives us the following characterisation of the eigenvalues $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$.

Minimum Principle.
$$\lambda_1 = \min_{\|x\|=1} \langle Ax, x\rangle,$$
and
$$\lambda_k = \min\{\langle Ax, x\rangle : \|x\| = 1,\ \langle x, m_1\rangle = \cdots = \langle x, m_{k-1}\rangle = 0\}, \qquad 2 \leq k \leq n.$$
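The first part of the minimum principle can be probed numerically: the Rayleigh quotient $\langle Ax, x\rangle$ of a unit vector never drops below $\lambda_1$, and equals $\lambda_1$ at the corresponding unit eigenvector. A sketch in NumPy (the symmetric matrix is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# A made-up symmetric matrix.
B = rng.standard_normal((5, 5))
A = B + B.T

lam, vecs = np.linalg.eigh(A)  # eigenvalues sorted ascending

# <Ax, x> over random unit vectors never drops below lambda_1 ...
for _ in range(1000):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    assert x @ A @ x >= lam[0] - 1e-12

# ... and the minimum is attained at the corresponding unit eigenvector m1.
m1 = vecs[:, 0]
assert np.isclose(m1 @ A @ m1, lam[0])
```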
As a simple application of the minimum principle, we can derive a very easy upper bound for the smallest eigenvalue $\lambda_1$. We just look for the smallest element in the diagonal of $A$, say $a_{ii}$; then we have $\lambda_1 \leq a_{ii}$. Indeed, taking $x = e_i$, the $i$-th standard basis vector, gives $\langle Ae_i, e_i\rangle = a_{ii}$, so the minimum over the unit sphere is at most $a_{ii}$. This may sometimes be useful when the size of $A$ is so large that a direct computation of its eigenvalues is not practical, and we only need some numerical estimate on its smallest eigenvalue. For example, if $A$ is a $100 \times 100$ symmetric matrix with a $-1$ in a diagonal entry, then we know without any work that $A$ cannot have all eigenvalues positive.

Supplement to Part I

For practical purposes, the minimum principle is not very useful because of the dependence of the characterisation of later eigenvalues on the eigenvectors of the preceding ones. We shall now use the minimum principle to derive a different characterisation.

The Mini-Max Principle.
(1) $\lambda_1 = \min_{\|x\|=1} \langle Ax, x\rangle$.
(2) For $2 \leq k \leq n$ and any collection of $k-1$ vectors $v_1, \ldots, v_{k-1}$, we define a number $Q$ depending on $v_1, \ldots, v_{k-1}$ by
$$Q(v_1, \ldots, v_{k-1}) = \min\{\langle Ax, x\rangle : \|x\| = 1,\ \langle x, v_1\rangle = \cdots = \langle x, v_{k-1}\rangle = 0\}.$$
Then
$$\lambda_k = \max_{v_1, \ldots, v_{k-1}} Q(v_1, \ldots, v_{k-1}),$$
where the maximum is taken over all choices of $v_1, \ldots, v_{k-1}$.

Proof. Let $\mu_k = \max_{v_1, \ldots, v_{k-1}} Q(v_1, \ldots, v_{k-1})$. By the minimum principle, we have $Q(m_1, \ldots, m_{k-1}) = \lambda_k$, and so $\mu_k \geq \lambda_k$.

Let us now take a fixed arbitrary choice of $v_1, \ldots, v_{k-1}$. We consider the vector
$$x = a_1m_1 + \cdots + a_km_k,$$
where $a_1, \ldots, a_k$ are chosen such that $\langle x, v_1\rangle = \cdots = \langle x, v_{k-1}\rangle = 0$ and $a_1^2 + \cdots + a_k^2 = 1$. This is possible because $\langle x, v_1\rangle = 0, \ldots, \langle x, v_{k-1}\rangle = 0$ is a system of $(k-1)$ linear equations in the $k$ unknowns $a_1, \ldots, a_k$, and so a non-trivial solution can be found, which can then be normalised to satisfy $a_1^2 + \cdots + a_k^2 = 1$; since $m_1, \ldots, m_k$ are orthonormal this gives $\|x\| = 1$. Now that the conditions $x \perp \{v_1, \ldots, v_{k-1}\}$ are satisfied, we check that
$$\langle Ax, x\rangle = \langle a_1\lambda_1m_1 + \cdots + a_k\lambda_km_k,\; a_1m_1 + \cdots + a_km_k\rangle = \lambda_1a_1^2 + \lambda_2a_2^2 + \cdots + \lambda_ka_k^2 \leq \lambda_ka_1^2 + \lambda_ka_2^2 + \cdots + \lambda_ka_k^2 = \lambda_k.$$
Hence $Q(v_1, \ldots, v_{k-1}) \leq \lambda_k$.
Finally, since $v_1, \ldots, v_{k-1}$ was an arbitrary choice, we therefore have $\mu_k \leq \lambda_k$, and so $\mu_k = \lambda_k$.

Part II

We shall give another proof of the spectral theorem based on a numerical method due to Jacobi. We begin with a simple fact.

Lemma. The set $O(n)$ of $n \times n$ orthogonal matrices is compact.

Proof. We shall realise $O(n)$ as a closed and bounded set in $\mathbb{R}^{n^2}$, and hence its compactness follows. First we observe that the space of all $n \times n$ matrices can be identified with $\mathbb{R}^{n^2}$, because there are $n^2$ entries for such a matrix. Therefore we can use the norm in $\mathbb{R}^{n^2}$ to induce a norm on any $n \times n$ matrix $B = (b_{ij})$ by defining $\|B\|^2 = \sum_{i,j} b_{ij}^2$. A simple calculation shows that this is equivalent to $\|B\|^2 = \mathrm{trace}(BB^t)$. Therefore for any $M \in O(n)$, we have
$$\|M\|^2 = \mathrm{trace}(MM^t) = \mathrm{trace}(I) = n,$$
and so $O(n)$ is a subset of the hypersphere of radius $\sqrt{n}$ in $\mathbb{R}^{n^2}$; hence $O(n)$ is bounded. To see that $O(n)$ is closed, we take a sequence $M_i \in O(n)$ such that $M_i \to M$; then we observe that $M_i^t \to M^t$ and hence $M_iM_i^t \to MM^t$. But $M_iM_i^t = I$, and so $MM^t = I$, i.e. $M \in O(n)$.

We shall now introduce the method, which reduces a non-zero entry in the off-diagonal of a symmetric matrix $A$ to zero; of course, after all the off-diagonal entries are reduced to zero, we get a diagonal matrix. So let us first introduce a function to measure the deviation of any matrix from a diagonal one.

Definition. For any matrix $B = (b_{ij})$, define
$$\Phi(B) = \sum_{i \neq j} b_{ij}^2.$$
Clearly $B$ is a diagonal matrix if and only if $\Phi(B) = 0$.

Theorem (Jacobi). Let $A$ be a symmetric matrix. If $\Phi(A) > 0$, then there exists an orthogonal matrix $M$ such that $\Phi(MAM^t) < \Phi(A)$.

Proof. Since $\Phi(A) > 0$, there is an entry, say $a_{ij} \neq 0$, with $i < j$. Let $M$ be the $n \times n$ matrix given by
i j.... cos 0.... sin 0.. sin 0... cos 0 Then it is easily checked that MMt - I and so M is orthogonal. Now we just have to multiply out MAMt. (b ), then the result is pq If we let MAMt b - b -a -a p ~ i,j' q ~ i,j pq qp pq qp bpi - a.coso + apjsino - bip, p ~ i,j p bpj a.sino + apjcoso - b p ~ i,j p jp Now we see that 0 can be chosen so as to make bij - this equation for 0) and with this choice of 0, a direct 0 (just solve 88
computation shows that $b_{pi}^2 + b_{pj}^2 = a_{pi}^2 + a_{pj}^2$ for $p \neq i,j$, and hence
$$\Phi(MAM^t) = \Phi(A) - 2a_{ij}^2 < \Phi(A).$$

Proof of the spectral theorem. $A$ is our fixed symmetric matrix. We now consider the function $f : O(n) \to \mathbb{R}$ defined by $f(J) = \Phi(J^tAJ)$. Then $f$ is a continuous function, and since $O(n)$ is compact, $f$ attains a minimum, say at $M \in O(n)$.

Claim. $f(M) = \Phi(M^tAM) = 0$.

Suppose $f(M) > 0$, and let $D = M^tAM$. Then $D$ is symmetric and $\Phi(D) > 0$. Therefore, by the theorem of Jacobi, there exists $N \in O(n)$ such that
$$f(M) = \Phi(D) > \Phi(NDN^t) = \Phi(NM^tAMN^t) = \Phi\big((MN^t)^tA(MN^t)\big) = f(MN^t).$$
But $M, N \in O(n)$, so $MN^t \in O(n)$ also, and this is a contradiction to the hypothesis that $f(M)$ is a minimum.

Therefore $\Phi(D) = 0$, so that $D$ is diagonal. Since $D = M^tAM$, therefore $A = MDM^t$, and the spectral theorem is proved.
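Jacobi's idea is also a practical algorithm: sweep over the off-diagonal entries, zeroing each in turn with the plane rotation above. The following NumPy sketch (the function name `jacobi_diagonalize` and the fixed sweep count are my own choices, not from the talk) implements the rotations exactly as in Part II.

```python
import numpy as np

def jacobi_diagonalize(A, sweeps=10):
    """Jacobi's method as in Part II: repeatedly zero an off-diagonal
    entry of a symmetric matrix with a plane rotation.  Returns
    (diagonal entries, M) with A ~ M diag(d) M^t.  A sketch, not
    production code."""
    A = A.astype(float).copy()
    n = A.shape[0]
    M = np.eye(n)
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                if abs(A[i, j]) < 1e-15:
                    continue
                # Choose theta so the (i,j) entry of the rotated matrix is 0:
                # tan(2 theta) = 2 a_ij / (a_ii - a_jj).
                theta = 0.5 * np.arctan2(2 * A[i, j], A[i, i] - A[j, j])
                c, s = np.cos(theta), np.sin(theta)
                G = np.eye(n)
                G[i, i] = c; G[i, j] = s
                G[j, i] = -s; G[j, j] = c
                A = G @ A @ G.T   # Phi decreases by 2 a_ij^2 at each step
                M = M @ G.T       # accumulate the orthogonal factor
    return np.diag(A), M

# Usage on a small made-up symmetric matrix.
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
d, M = jacobi_diagonalize(B)
assert np.allclose(np.sort(d), np.linalg.eigvalsh(B))
assert np.allclose(M @ np.diag(d) @ M.T, B)
```

Note the proof above only guarantees that each rotation decreases $\Phi$; that the sweeps drive $\Phi$ all the way to zero is a separate (classical) convergence fact about Jacobi's method, which the minimisation argument over compact $O(n)$ elegantly sidesteps.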