Matrix Theory, Math 6304
Lecture Notes from October 25, 2012
taken by John Haas

Last Time (10/23/12)

- Example of a low rank perturbation
- Relationship between eigenvalues and principal submatrices: we started with the bordering case, and then began to generalize

Before we commence today's warm-up, let us begin by restating and proving the last theorem from the previous class (because we will then use it in the warm-up). This theorem generalizes eigenvalue interlacing to principal submatrices of smaller sizes.

4.6 Eigenvalue interlacing for principal submatrices, continued

4.6.1 Theorem (Eigenvalue Interlacing for Principal Submatrices). Let A ∈ M_n be Hermitian, and let A_r ∈ M_r be a principal submatrix of A, where r ∈ N, r ≤ n. Then for 1 ≤ k ≤ r, one has:

    λ_k(A) ≤ λ_k(A_r) ≤ λ_{k+n-r}(A).

Proof. Suppose A_r is a principal submatrix of A, as above. Then one can choose from A a sequence of nested principal submatrices A_{r+1}, A_{r+2}, ..., A_{n-1}, A_n = A, where A_k ∈ M_k, which satisfies:

- A_r is a principal submatrix of A_{r+1},
- A_{r+1} is a principal submatrix of A_{r+2},
- ...
- A_{n-1} is a principal submatrix of A_n = A.

It is not hard to see that each A_k above satisfies the conditions for the corollary of our bordering theorem with respect to A_{k+1}, for k ∈ {r, r+1, ..., n-1}. Thus, by applying the corollary in an iterative fashion, we obtain:

    λ_k(A) ≤ λ_k(A_{n-1}) ≤ ... ≤ λ_k(A_r).

Similarly, we can apply the corollary to get:

    λ_k(A_r) ≤ λ_{k+1}(A_{r+1}) ≤ ... ≤ λ_{k+n-r}(A_{r+n-r}) = λ_{k+n-r}(A).
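The interlacing bounds are easy to sanity-check numerically. The following is a minimal sketch with NumPy, our own illustration rather than part of the original notes; indices in the code are 0-based, so `lam[k]` plays the role of λ_{k+1}:

```python
import numpy as np

# Check lambda_k(A) <= lambda_k(A_r) <= lambda_{k+n-r}(A) for a
# random Hermitian matrix A and a principal submatrix A_r.
rng = np.random.default_rng(0)
n, r = 6, 3
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (X + X.conj().T) / 2                  # Hermitian A in M_n

keep = [0, 2, 5]                          # rows/columns defining A_r (|keep| = r)
A_r = A[np.ix_(keep, keep)]               # principal submatrix in M_r

lam = np.linalg.eigvalsh(A)               # eigenvalues in nondecreasing order
lam_r = np.linalg.eigvalsh(A_r)

# 0-based shift: lam[k] <= lam_r[k] <= lam[k + n - r] for k = 0..r-1
ok = all(lam[k] <= lam_r[k] <= lam[k + n - r] for k in range(r))
print(ok)
```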
This completes the proof.

Warm-up

We will define and then develop a test for positive definiteness.

4.6.2 Definition. A Hermitian matrix A ∈ M_n is positive definite if all of its eigenvalues are strictly positive.

4.6.3 Theorem. A Hermitian matrix A_n ∈ M_n is positive definite if and only if each of its leading principal submatrices {A_r}_{r=1}^n satisfies det(A_r) > 0.

Note: We say a principal submatrix A_r of A_n is leading if it shares its top-left corner with A_n.

Proof. (⇒): Suppose A_n is positive definite. Then, using the usual convention in which λ_1(A_n) is the smallest eigenvalue of A_n, we have:

    λ_1(A_n) > 0.

Hence, using the preceding theorem, we have for each leading principal submatrix A_r, r ∈ {1, 2, ..., n}, that:

    0 < λ_1(A_n) ≤ λ_1(A_r).

Thus 0 < λ_k(A_r) for all k ∈ {1, 2, ..., r}. The implication is now shown, because:

    det(A_r) = ∏_{k=1}^r λ_k(A_r) > 0.

(⇐): Now suppose det(A_r) > 0 for r ∈ {1, 2, ..., n}. In order to show that all of A_n's eigenvalues are strictly positive, we will proceed inductively on its dimension, n.

If n = 1, there is nothing to show, because then A_n = A_1 has only one eigenvalue, given by λ_1(A_1) = det(A_1) > 0.

Now suppose the implication holds for all matrices up to size n. (That is to say, for l ∈ {1, 2, ..., n}, if all of the leading principal submatrices of A_l have positive determinant, then A_l is positive definite.) Now let A_{n+1} ∈ M_{n+1} and suppose it satisfies det(A_r) > 0 for
r ∈ {1, 2, ..., n, n+1}. If we can show that A_{n+1} is positive definite (i.e., all its eigenvalues are positive), then we will be done.

Our strategy will be to show that every leading principal submatrix A_r of A_{n+1} must also be positive definite, which will then imply that A_{n+1} (as a leading principal submatrix of itself) must be positive definite as well. To this end, we will induct over r. (Note: This is a sort of nested induction, since we are now inducting over r to show that n implies n+1.)

If r = 1, then once again there is nothing to show, since the only eigenvalue of A_1 is given by λ_1(A_1) = det(A_1) > 0.

Now suppose, for 1 ≤ r ≤ k, that all leading principal submatrices A_r of A_{n+1} are positive definite. That is to say, let us assume that A_r has strictly positive eigenvalues, for 1 ≤ r ≤ k. Now let us examine A_{k+1}. First, observe that we have a bordering situation: A_{k+1} contains A_k as its top-left k×k block, bordered by one additional row and column. Now, combine the bordering theorem from last class with our inductive assumption that A_k is positive definite to get:

    0 < λ_{j-1}(A_k) ≤ λ_j(A_{k+1}), for j ∈ {2, ..., k+1}.    (1)

Furthermore, it is a standard fact that:

    det(A_{k+1}) = ∏_{j=1}^{k+1} λ_j(A_{k+1}) = λ_1(A_{k+1}) ∏_{j=2}^{k+1} λ_j(A_{k+1}).    (2)

Finally, we combine (1) and (2) with our assumption that det(A_{k+1}) > 0 to get:

    0 < det(A_{k+1}) / ∏_{j=2}^{k+1} λ_j(A_{k+1}) = λ_1(A_{k+1}).

Thus A_{k+1} is positive definite, which shows inductively that all leading principal submatrices of A_{n+1} must be positive definite. In turn, this shows that A_{n+1}, as a principal submatrix of itself, must be positive definite. Finally, this completes our first induction step, allowing us to conclude the implication.

This completes the warm-up. Now let us resume the discussion of eigenvalue interlacing for principal submatrices. We can also relate the eigenvalues of a Hermitian matrix A with the eigenvalues of principal submatrices of matrices which are unitarily equivalent to A. This relationship is made clear in the following corollary.
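As an aside before moving on, the warm-up's determinant test is easy to sketch in code. This is our own illustration (the function name `is_positive_definite` is ours), checking every leading principal minor and comparing against the eigenvalue definition of positive definiteness:

```python
import numpy as np

def is_positive_definite(A):
    """Determinant test from the warm-up: a Hermitian matrix is positive
    definite iff every leading principal minor is strictly positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:r, :r]).real > 0 for r in range(1, n + 1))

# A classic positive definite example: the tridiagonal second-difference matrix.
A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
pd = is_positive_definite(A)
agrees = pd == bool(np.all(np.linalg.eigvalsh(A) > 0))  # matches the eigenvalue definition

B = A.copy()
B[2, 2] = -1.0                     # spoil the last leading minor: det(B) < 0
not_pd = not is_positive_definite(B)
print(pd, agrees, not_pd)
```

Note that in practice one would test positive definiteness with a Cholesky factorization rather than determinants; the code above mirrors the theorem, not numerical best practice.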
4.6.4 Corollary. Suppose A ∈ M_n is Hermitian and {u_1, u_2, ..., u_r} ⊂ C^n is an orthonormal system, with r ≤ n. Now set B_r = [⟨Au_j, u_i⟩]_{i,j=1}^r, and order the eigenvalues of A and B_r as usual. Then:

    λ_k(A) ≤ λ_k(B_r) ≤ λ_{k+n-r}(A).

Proof. Let us begin by using the extension theorem and then the Gram-Schmidt algorithm to obtain an orthonormal basis {u_1, u_2, ..., u_r, u_{r+1}, ..., u_n} for C^n. This gives us the following unitarily equivalent matrix:

    Ã = [⟨Au_j, u_i⟩]_{i,j=1}^n = U*AU,

where U = [u_1 u_2 ... u_n] is unitary. Since A and Ã are unitarily equivalent, they have the same eigenvalues. Finally, since B_r is clearly a principal submatrix of Ã, the desired inequality follows by the preceding interlacing theorem for principal submatrices.

Now we can formulate a generalized, more geometric version of Rayleigh-Ritz.

4.6.5 Corollary. For a subspace S_r ⊆ C^n with dim(S_r) = r, let P_{S_r} denote the orthogonal projection onto S_r, and let A ∈ M_n be Hermitian with eigenvalues {λ_j}_{j=1}^n in nondecreasing order. Then:

    λ_1(A) + λ_2(A) + ... + λ_r(A) = min_{dim(S_r)=r} tr[P_{S_r} A]

and

    λ_{n-r+1}(A) + λ_{n-r+2}(A) + ... + λ_n(A) = max_{dim(S_r)=r} tr[P_{S_r} A].

Note: As with the Courant-Fischer theorem, we are minimizing/maximizing over all possible subspaces S_r of the given dimension r.

Proof. Given a subspace S_r as above, let β = {u_1, u_2, ..., u_r} be a corresponding orthonormal basis. Then the action determined by orthogonally projecting a vector x ∈ C^n onto S_r is given by:

    P_{S_r} x = ∑_{j=1}^r ⟨x, u_j⟩ u_j.

It can then be shown that P_{S_r} = UU*, where:

    U = [u_1 u_2 ... u_r] ∈ M_{n,r}.

Now, extend β to an orthonormal basis β̃ = {u_1, ..., u_r, u_{r+1}, ..., u_n} and define Ũ to be the unitary matrix given as:

    Ũ = [u_1 u_2 ... u_r u_{r+1} ... u_n] ∈ M_n.
Now let A ∈ M_n be Hermitian. Then we have:

    tr[P_{S_r} A] = tr[Ũ* P_{S_r} A Ũ]             (because trace is unitarily invariant)
                  = ∑_{j=1}^n ⟨P_{S_r} A u_j, u_j⟩  (this is how the sum of the diagonal entries is given)
                  = ∑_{j=1}^n ⟨A u_j, P_{S_r} u_j⟩  (because P_{S_r} is self-adjoint)
                  = ∑_{j=1}^r ⟨A u_j, u_j⟩          (because P_{S_r} u_j = 0 unless j ≤ r, and P_{S_r} u_j = u_j for j ≤ r)
                  = tr[B_r].

Here, B_r = [⟨Au_j, u_i⟩]_{i,j=1}^r, and it is related to A as in the previous corollary, which results in the following r inequalities:

    λ_k(A) ≤ λ_k(B_r), for 1 ≤ k ≤ r.

Summing these r inequalities, we obtain a new inequality:

    λ_1(A) + ... + λ_r(A) ≤ λ_1(B_r) + ... + λ_r(B_r)
                          = tr[B_r]        (because the trace of a square matrix is the sum of its eigenvalues)
                          = tr[P_{S_r} A]  (by the equality above).

We have shown that this inequality must always hold. It remains to show that the inequality actually becomes an equality when we minimize over all possible r-dimensional subspaces S_r. To this end, suppose our subspace S_r was chosen specially so that its orthonormal basis β = {u_1, ..., u_r} satisfies that each u_i is an eigenvector corresponding to λ_i(A). In this special case, B_r looks like:

    B_r = [⟨Au_j, u_i⟩]_{i,j=1}^r = [⟨λ_j(A)u_j, u_i⟩]_{i,j=1}^r = [λ_j(A)⟨u_j, u_i⟩]_{i,j=1}^r = [λ_j(A) δ_{i,j}]_{i,j=1}^r = diag(λ_1(A), λ_2(A), ..., λ_r(A)).

Hence, for this specially chosen subspace, we obtain:

    λ_1(A) + ... + λ_r(A) = tr[B_r] = tr[P_{S_r} A].
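The minimization claim just established can be checked numerically. In this sketch (our own; the helper name `trace_PA` is ours), random r-dimensional subspaces never beat the sum of the r smallest eigenvalues, and the span of the corresponding eigenvectors attains the bound:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 2
X = rng.standard_normal((n, n))
A = (X + X.T) / 2                          # real symmetric (hence Hermitian) A
lam, V = np.linalg.eigh(A)                 # lam nondecreasing, columns of V eigenvectors

def trace_PA(U):
    """tr[P_S A] where P_S = U U* projects onto the span of U's orthonormal columns."""
    return float(np.trace(U.T @ A @ U))    # tr[U U* A] = tr[U* A U]

bound = lam[:r].sum()                      # lambda_1(A) + ... + lambda_r(A)
trials_ok = all(
    trace_PA(np.linalg.qr(rng.standard_normal((n, r)))[0]) >= bound - 1e-10
    for _ in range(500)                    # Q from reduced QR has orthonormal columns
)
attained = abs(trace_PA(V[:, :r]) - bound) < 1e-10
print(trials_ok, attained)
```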
It is in this sense that we mean:

    λ_1(A) + λ_2(A) + ... + λ_r(A) = min_{dim(S_r)=r} tr[P_{S_r} A].

Now we will show the other claim, which is:

    λ_{n-r+1}(A) + λ_{n-r+2}(A) + ... + λ_n(A) = max_{dim(S_r)=r} tr[P_{S_r} A].

First, let us make two observations:

    λ_{n+1-j}(A) = -λ_j(-A)

and:

    max_{S_r} tr[P_{S_r} A] = -min_{S_r} tr[P_{S_r}(-A)].

Combining these two observations with the first result, we see that:

    λ_{n-r+1}(A) + λ_{n-r+2}(A) + ... + λ_n(A)
        = λ_{(n+1)-r}(A) + λ_{(n+1)-(r-1)}(A) + ... + λ_{(n+1)-1}(A)
        = -λ_r(-A) - λ_{r-1}(-A) - ... - λ_1(-A)
        = -(λ_1(-A) + ... + λ_{r-1}(-A) + λ_r(-A))
        = -(min_{S_r} tr[P_{S_r}(-A)])
        = max_{S_r} tr[P_{S_r} A].

This concludes the proof.

4.7 Majorization

Because our results in spectral analysis up to this point have been heavily dependent on minimizing/maximizing various mathematical objects (e.g., Rayleigh-Ritz, Courant-Fischer, the preceding corollary), the following question emerges naturally: what kind of information can we extract about eigenvalues without solving optimization problems? It turns out that, in some cases, one can determine more information about a matrix by studying sums of subsets of its eigenvalues than by studying the individual eigenvalues themselves. This idea motivates the following definition.

4.7.6 Definition. Let {α_j}_{j=1}^n and {β_j}_{j=1}^n be real sequences. Then we say β majorizes α, or α ⪯ β, if

    ∑_{i=1}^k α_{j_i} ≤ ∑_{i=1}^k β_{m_i}  for all 1 ≤ k ≤ n,

where the m_i and j_i are given by permutations such that α_{j_1} ≥ α_{j_2} ≥ ... ≥ α_{j_n} and β_{m_1} ≥ β_{m_2} ≥ ... ≥ β_{m_n}.
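The partial-sum test in the definition above can be sketched directly (the function name `majorizes` is ours). As an illustration we use a standard fact not proved in these notes, Schur's theorem that the eigenvalues of a Hermitian matrix majorize its diagonal entries:

```python
import numpy as np

def majorizes(beta, alpha, tol=1e-12):
    """Return True iff beta majorizes alpha: after sorting both sequences in
    nonincreasing order, every partial sum of alpha is <= that of beta."""
    a = np.sort(np.asarray(alpha, dtype=float))[::-1]   # alpha, nonincreasing
    b = np.sort(np.asarray(beta, dtype=float))[::-1]    # beta, nonincreasing
    return bool(np.all(np.cumsum(a) <= np.cumsum(b) + tol))

# Schur's theorem (illustration only): eigenvalues majorize the diagonal.
X = np.array([[2.0, 1.0],
              [1.0, 3.0]])
result = majorizes(np.linalg.eigvalsh(X), np.diag(X))
print(result)
```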