DIAGONALIZATION BY SIMILARITY TRANSFORMATIONS

The correct choice of a coordinate system (or basis) often can simplify the form of an equation or the analysis of a particular problem. For example, consider the obliquely oriented ellipse in Figure 7.2.1 whose equation in the xy-coordinate system is

13x^2 + 10xy + 13y^2 = 72.

By rotating the xy-coordinate system counterclockwise through an angle of 45° into a uv-coordinate system by means of (5.6.13) on p. 326, the cross-product term is eliminated, and the equation of the ellipse simplifies to become

u^2/4 + v^2/9 = 1.

Similarity
Two n × n matrices A and B are said to be similar whenever there exists a nonsingular matrix P such that P^{-1}AP = B. The product P^{-1}AP is called a similarity transformation on A.
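The effect of the rotation can be checked numerically. Below is a minimal sketch in numpy, assuming the quadratic-form matrix read off from the ellipse equation above; it applies the orthogonal change of coordinates as a similarity transformation and shows that the cross-product term vanishes.

```python
import numpy as np

# Quadratic-form matrix of 13x^2 + 10xy + 13y^2 (the xy-ellipse above).
A = np.array([[13.0,  5.0],
              [ 5.0, 13.0]])

# Counterclockwise rotation of the axes through 45 degrees.
t = np.pi / 4
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

# R is orthogonal, so R^T A R is a similarity transformation on A.
# The off-diagonal (cross-product) entries vanish: D = diag(18, 8),
# which is the uv-equation 18u^2 + 8v^2 = 72.
D = R.T @ A @ R
print(np.round(D, 10))
```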
A Fundamental Problem. Given a square matrix A, reduce it to the simplest possible form by means of a similarity transformation. Diagonal matrices have the simplest form, so we first ask, is every square matrix similar to a diagonal matrix? Linear algebra and matrix theory would be simpler subjects if this were true, but it's not. For example, consider the nilpotent matrix

A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},

for which A^2 = 0. If there were a nonsingular P with P^{-1}AP = D diagonal, then D^2 = P^{-1}A^2P = 0 would force D = 0, and hence A = PDP^{-1} = 0, which is false. Thus A, as well as any other nonzero nilpotent matrix, is not similar to a diagonal matrix. Nonzero nilpotent matrices are not the only ones that can't be diagonalized, but, as we will see, nilpotent matrices play a particularly important role in nondiagonalizability. So, if not all square matrices can be diagonalized by a similarity transformation, what are the characteristics of those that can? An answer is easily derived by examining the equation

P^{-1}AP = D = diag(λ_1, λ_2, ..., λ_n),

which is equivalent to AP = PD. Comparing the columns on each side of AP = PD shows that Ax_j = λ_j x_j for each column x_j of P, so P must be a matrix whose columns constitute n linearly independent eigenvectors, and D is a diagonal matrix whose diagonal entries are the corresponding eigenvalues. It's straightforward to reverse the above argument to prove the converse, i.e., if there exists a linearly independent set of n eigenvectors that are used as columns to build a nonsingular matrix P, and if D is the diagonal matrix whose diagonal entries are the corresponding eigenvalues, then P^{-1}AP = D. Below is a summary.

Diagonalizability
A_{n×n} is diagonalizable if and only if A possesses a complete set of n linearly independent eigenvectors, in which case P^{-1}AP = diag(λ_1, λ_2, ..., λ_n), where P = (x_1 | x_2 | ... | x_n) and (λ_j, x_j) is an eigenpair for A.
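Both directions of this summary are easy to test numerically. The following sketch uses numpy on a hypothetical 2 × 2 example: eig returns the eigenvector matrix P, and P^{-1}AP is diagonal exactly when the eigenvectors form a complete (nonsingular) set, which fails for the nilpotent matrix above.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])           # hypothetical example; eigenvalues 4 and -1

evals, P = np.linalg.eig(A)          # columns of P are eigenvectors of A
D = np.linalg.inv(P) @ A @ P         # similarity transformation by P
print(np.round(D, 10))               # diagonal, with the eigenvalues on the diagonal

# The nilpotent matrix N has only one independent eigenvector, so the
# columns returned by eig do not form a nonsingular P: N is not diagonalizable.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
_, Q = np.linalg.eig(N)
print(np.linalg.matrix_rank(Q))      # 1 < 2: no complete set of eigenvectors
```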
Since not all square matrices are diagonalizable, it's natural to inquire about the next best thing, i.e., can every square matrix be triangularized by similarity? This time the answer is yes, but before explaining why, we need to make the following observation.

Similarity Preserves Eigenvalues
Row reductions don't preserve eigenvalues (try a simple example). However, similar matrices have the same characteristic polynomial, so they have the same eigenvalues with the same multiplicities: if B = P^{-1}AP, then det(B − λI) = det(P^{-1}(A − λI)P) = det(A − λI). Caution! Similar matrices need not have the same eigenvectors.

In the context of linear operators, this means that the eigenvalues of a matrix representation of an operator L are invariant under a change of basis. In other words, the eigenvalues are intrinsic to L in the sense that they are independent of any coordinate representation. Now we can establish the fact that every square matrix can be triangularized by a similarity transformation. In fact, as Issai Schur realized in 1909, the similarity transformation always can be made to be unitary: for each A_{n×n} there exist a unitary U and an upper-triangular T such that U^*AU = T, and the diagonal entries of T are the eigenvalues of A.

The Cayley–Hamilton theorem asserts that every square matrix satisfies its own characteristic equation p(λ) = 0. That is, p(A) = 0.

Problem: Show how the Cayley–Hamilton theorem follows from Schur's triangularization theorem.

Solution: Schur's theorem insures the existence of a unitary U such that U^*AU = T is triangular, and the development allows for the eigenvalues of A to appear in any given order on the diagonal of T. So, if σ(A) = {λ_1, λ_2, ..., λ_k} with λ_i repeated a_i times, then there is a unitary U such that

U^*AU = T = \begin{pmatrix} T_1 & \star & \cdots & \star \\ 0 & T_2 & \cdots & \star \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & T_k \end{pmatrix},

where T_i is an a_i × a_i upper-triangular matrix whose diagonal entries are all equal to λ_i. Up to a sign, p(λ) = ∏_{i=1}^{k}(λ − λ_i)^{a_i}, so p(T) = ±∏_{i=1}^{k}(T − λ_iI)^{a_i}. The i-th diagonal block of T − λ_iI is the nilpotent triangular matrix T_i − λ_iI, so (T − λ_iI)^{a_i} has a zero block in the i-th diagonal position; multiplying the k factors together in order annihilates one block column after another, so p(T) = 0,
and thus p(A) = p(UTU^*) = U p(T) U^* = 0. Schur's theorem is not the complete story on triangularizing by similarity. By allowing nonunitary similarity transformations, the structure of the upper-triangular matrix T can be simplified to contain zeros everywhere except on the diagonal and the superdiagonal (the diagonal immediately above the main diagonal). This is the Jordan form developed on p. 590, but some of the seeds are sown here.
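Both Schur's theorem and the Cayley–Hamilton consequence are easy to illustrate in a few lines. The sketch below uses scipy's schur factorization on a hypothetical 3 × 3 matrix and then evaluates the characteristic polynomial at A by a matrix Horner scheme; the result is the zero matrix up to roundoff.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 2.0]])      # hypothetical test matrix

# Schur: U* A U = T with U unitary and T upper triangular, and the
# eigenvalues of A on the diagonal of T (complex output is requested
# because some eigenvalues of this real matrix are complex).
T, U = schur(A, output='complex')
print(np.round(np.diag(T), 6))

# Cayley-Hamilton: p(A) = 0, with p evaluated by a matrix Horner scheme.
c = np.poly(A)                       # characteristic polynomial coefficients
p_of_A = np.zeros_like(A)
for coeff in c:
    p_of_A = p_of_A @ A + coeff * np.eye(3)
print(np.round(p_of_A, 8))           # the zero matrix
```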
Determining whether or not A_{n×n} is diagonalizable is equivalent to determining whether or not A has a complete linearly independent set of eigenvectors, and this can be done if you are willing and able to compute all of the eigenvalues and eigenvectors for A. But this brute-force approach can be a monumental task. Fortunately, there are some theoretical tools to help determine how many linearly independent eigenvectors a given matrix possesses.

Independent Eigenvectors
Let {λ_1, λ_2, ..., λ_k} be a set of distinct eigenvalues for A.

If {(λ_1, x_1), (λ_2, x_2), ..., (λ_k, x_k)} is a set of eigenpairs for A, then S = {x_1, x_2, ..., x_k} is a linearly independent set. (7.2.3)

If B_i is a basis for N(A − λ_iI), then B = B_1 ∪ B_2 ∪ ⋯ ∪ B_k is a linearly independent set. (7.2.4)
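Statement (7.2.4) can be checked directly: compute a basis for each eigenspace N(A − λ_iI) and verify that the concatenated basis vectors have full rank. A small sketch, assuming a hypothetical matrix with eigenvalues 2 (repeated) and 5:

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical matrix: eigenvalue 2 with a two-dimensional eigenspace,
# plus the simple eigenvalue 5.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [1.0, 1.0, 5.0]])

bases = []
for lam in [2.0, 5.0]:
    B = null_space(A - lam * np.eye(3))   # basis for the eigenspace N(A - lam*I)
    print(f"lambda = {lam}: geometric multiplicity {B.shape[1]}")
    bases.append(B)

# Per (7.2.4), the union of the eigenspace bases is linearly independent:
# 2 + 1 = 3 vectors of full rank, so this A is diagonalizable.
S = np.hstack(bases)
print(np.linalg.matrix_rank(S))           # 3
```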
These results lead to the following characterization of diagonalizability.

Diagonalizability and Multiplicities
A_{n×n} is diagonalizable if and only if geo mult_A(λ) = alg mult_A(λ) for each λ ∈ σ(A). (7.2.5)
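The criterion (7.2.5) is cheap to test once the eigenvalues are known, because geo mult_A(λ) = n − rank(A − λI). A small sketch, with hypothetical test matrices:

```python
import numpy as np

def geo_mult(A, lam):
    """Geometric multiplicity: dim N(A - lam*I) = n - rank(A - lam*I)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

# Jordan block: lambda = 2 has alg mult 2 but geo mult 1, so by (7.2.5)
# J is not diagonalizable.
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
print(geo_mult(J, 2.0))            # 1 < 2

# The identity has eigenvalue 1 repeated n times, yet geo mult = alg mult = n,
# so it is (trivially) diagonalizable.
print(geo_mult(np.eye(3), 1.0))    # 3
```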
If A_{n×n} happens to have n distinct eigenvalues, then each eigenvalue is simple. This means that geo mult_A(λ) = alg mult_A(λ) = 1 for each λ, so (7.2.5) produces the following corollary guaranteeing diagonalizability.

Distinct Eigenvalues
If no eigenvalue of A is repeated, then A is diagonalizable. (7.2.6)

Caution! The converse is not true; as the identity matrix shows, a diagonalizable matrix can have repeated eigenvalues. An elegant and more geometrical way of expressing diagonalizability is now presented to help simplify subsequent analyses and pave the way for extensions.