Chapter 7: Symmetric Matrices and Quadratic Forms (Last Updated: December 06)

These notes are derived primarily from Linear Algebra and Its Applications by David Lay (4th ed.). A few theorems have been moved around.

1. Diagonalization of Symmetric Matrices

We have already seen that it is quite time-intensive to determine whether a matrix is diagonalizable. We'll see that there are certain cases in which a matrix is always diagonalizable.

Definition 1. A matrix A is symmetric if A^T = A.

Example 1. Let

        [ 3  -2   4 ]
    A = [-2   6   2 ]
        [ 4   2   3 ]

Note that A^T = A, so A is symmetric. The characteristic polynomial of A is χ_A(t) = (t + 2)(t - 7)^2, so the eigenvalues are -2 and 7. The corresponding eigenspaces have bases

    λ = -2: { (-1, -1/2, 1) },    λ = 7: { (1, 0, 1), (0, 2, 1) }.

Hence, A is diagonalizable. Now we use Gram-Schmidt to find an orthogonal basis for R^3. Note that the eigenvector for λ = -2 is already orthogonal to both eigenvectors for λ = 7, so we only need to orthogonalize within the eigenspace for λ = 7:

    v_1 = (1, 0, 1),    v_2 = (-1/2, 2, 1/2),    v_3 = (-1, -1/2, 1).

Finally, we normalize each vector:

    u_1 = (1/√2, 0, 1/√2),    u_2 = (-1/√18, 4/√18, 1/√18),    u_3 = (-2/3, -1/3, 2/3).

Now the matrix U = [u_1 u_2 u_3] is orthogonal and so U^T U = I.

Theorem 2. If A is symmetric, then any two eigenvectors from different eigenspaces are orthogonal.

Proof. Let v_1, v_2 be eigenvectors for A with corresponding eigenvalues λ_1, λ_2, λ_1 ≠ λ_2. Then

    λ_1 (v_1 · v_2) = (λ_1 v_1)^T v_2 = (A v_1)^T v_2 = v_1^T A^T v_2 = v_1^T A v_2 = v_1^T (λ_2 v_2) = λ_2 (v_1 · v_2).

Hence, (λ_1 - λ_2)(v_1 · v_2) = 0. Since λ_1 ≠ λ_2, we must have v_1 · v_2 = 0. ∎
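The arithmetic in Example 1 can be checked numerically. The following sketch (using numpy, with the matrix and orthonormal eigenvectors as reconstructed in these notes) verifies that U is orthogonal and that U^T A U is diagonal:

```python
import numpy as np

# The symmetric matrix and unit eigenvectors from Example 1.
A = np.array([[3.0, -2.0, 4.0],
              [-2.0, 6.0, 2.0],
              [4.0, 2.0, 3.0]])

u1 = np.array([1, 0, 1]) / np.sqrt(2)    # eigenvalue 7
u2 = np.array([-1, 4, 1]) / np.sqrt(18)  # eigenvalue 7
u3 = np.array([-2, -1, 2]) / 3.0         # eigenvalue -2
U = np.column_stack([u1, u2, u3])

print(np.allclose(U.T @ U, np.eye(3)))                  # True: U is orthogonal
print(np.allclose(U.T @ A @ U, np.diag([7.0, 7.0, -2.0])))  # True: U^T A U diagonal
```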
Based on the previous theorem, we say that the eigenspaces of A are mutually orthogonal.

Definition 2. An n × n matrix A is orthogonally diagonalizable if there exists an orthogonal n × n matrix P and a diagonal matrix D such that A = P D P^T.

Theorem 3. If A is orthogonally diagonalizable, then A is symmetric.

Proof. Since A is orthogonally diagonalizable, A = P D P^T for some orthogonal matrix P and diagonal matrix D. A is symmetric because

    A^T = (P D P^T)^T = (P^T)^T D^T P^T = P D P^T = A. ∎

It turns out the converse of the above theorem is also true! The set of eigenvalues of a matrix A is called the spectrum of A and is denoted σ(A).

Theorem 4 (The Spectral Theorem for symmetric matrices). Let A be a (real) symmetric n × n matrix. Then the following hold.
(1) A has n real eigenvalues, counting multiplicities.
(2) For each eigenvalue λ of A, geomult_λ(A) = algmult_λ(A).
(3) The eigenspaces are mutually orthogonal.
(4) A is orthogonally diagonalizable.

Proof. We proved in HW9, Exercise 6 that every eigenvalue of a symmetric matrix is real. The second part of (1) as well as (2) are immediate consequences of (4). We proved (3) in Theorem 2. Note that (4) is trivial when A has n distinct eigenvalues by (3).

We prove (4) by induction. Clearly the result holds when A is 1 × 1. Assume all (n-1) × (n-1) symmetric matrices are orthogonally diagonalizable. Let A be n × n, let λ_1 be an eigenvalue of A, and let u_1 be a (unit) eigenvector for λ_1. By the Gram-Schmidt process, we may extend u_1 to an orthonormal basis {u_1, ..., u_n} for R^n, where {u_2, ..., u_n} is a basis for W = (Span{u_1})^⊥. Set U = [u_1 u_2 ... u_n]. Then

              [ u_1^T A u_1  ...  u_1^T A u_n ]   [ λ_1  * ]
    U^T A U = [     ...      ...      ...     ] = [  0   B ].
              [ u_n^T A u_1  ...  u_n^T A u_n ]

The first column is as indicated because u_i^T A u_1 = u_i^T (λ_1 u_1) = λ_1 (u_i · u_1) = λ_1 δ_{i1}. As U^T A U is symmetric, * = 0 and B is a symmetric (n-1) × (n-1) matrix that is orthogonally diagonalizable with eigenvalues λ_2, ..., λ_n (by the inductive hypothesis). Because A and U^T A U are similar, the eigenvalues of A are λ_1, ..., λ_n.
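A quick numerical illustration of the Spectral Theorem (a sketch, not a proof): for any symmetric matrix, numpy's `eigh` routine returns real eigenvalues and an orthogonal eigenvector matrix, exactly the orthogonal diagonalization that Theorem 4 guarantees.

```python
import numpy as np

# Build an arbitrary symmetric matrix by symmetrizing a random one.
rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))
A = (M + M.T) / 2

evals, P = np.linalg.eigh(A)             # eigh is designed for symmetric input
print(np.allclose(P.T @ P, np.eye(5)))   # True: P is orthogonal
print(np.allclose(P @ np.diag(evals) @ P.T, A))   # True: A = P D P^T
```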
Since B is orthogonally diagonalizable, there exists an orthogonal matrix Q such that Q^T B Q = D', where the diagonal entries of D' are λ_2, ..., λ_n. Now

    [[1, 0], [0, Q]]^T [[λ_1, 0], [0, B]] [[1, 0], [0, Q]] = [[λ_1, 0], [0, Q^T B Q]] = [[λ_1, 0], [0, D']].

Note that [[1, 0], [0, Q]] is orthogonal. Set V = U [[1, 0], [0, Q]]. As the product of orthogonal matrices is orthogonal, V is itself orthogonal and V^T A V is diagonal. ∎

Suppose A is orthogonally diagonalizable, so A = U D U^T where U = [u_1 ... u_n] and D is the diagonal matrix whose diagonal entries are the eigenvalues of A, λ_1, ..., λ_n. Then

    A = U D U^T = λ_1 u_1 u_1^T + ... + λ_n u_n u_n^T.

This is known as the spectral decomposition of A. Each u_i u_i^T is called a projection matrix because (u_i u_i^T) x is the projection of x onto Span{u_i}.

Example 5. Construct a spectral decomposition of the matrix A in Example 1. Recall that

        [ 3  -2   4 ]
    A = [-2   6   2 ]
        [ 4   2   3 ]

and our orthonormal basis of eigenvectors was

    u_1 = (1/√2, 0, 1/√2),    u_2 = (-1/√18, 4/√18, 1/√18),    u_3 = (-2/3, -1/3, 2/3).

Setting U = [u_1 u_2 u_3] gives U^T A U = D = diag(7, 7, -2). The projection matrices are

               [ 1/2  0  1/2 ]              [  1/18  -2/9  -1/18 ]              [  4/9   2/9  -4/9 ]
    u_1u_1^T = [  0   0   0  ],  u_2u_2^T = [ -2/9    8/9   2/9  ],  u_3u_3^T = [  2/9   1/9  -2/9 ].
               [ 1/2  0  1/2 ]              [ -1/18   2/9   1/18 ]              [ -4/9  -2/9   4/9 ]

The spectral decomposition is

    A = 7 u_1 u_1^T + 7 u_2 u_2^T - 2 u_3 u_3^T.
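The spectral decomposition in Example 5 can be verified numerically. This sketch rebuilds A as a weighted sum of the three rank-one projection matrices and checks that each u_i u_i^T really behaves like a projection (symmetric and idempotent):

```python
import numpy as np

A = np.array([[3.0, -2.0, 4.0],
              [-2.0, 6.0, 2.0],
              [4.0, 2.0, 3.0]])
u1 = np.array([1, 0, 1]) / np.sqrt(2)
u2 = np.array([-1, 4, 1]) / np.sqrt(18)
u3 = np.array([-2, -1, 2]) / 3.0

P1, P2, P3 = np.outer(u1, u1), np.outer(u2, u2), np.outer(u3, u3)
print(np.allclose(7*P1 + 7*P2 - 2*P3, A))   # True: A = 7u1u1^T + 7u2u2^T - 2u3u3^T
print(np.allclose(P1 @ P1, P1))             # True: projections are idempotent
```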
2. Quadratic Forms

Definition 3. A quadratic form is a function Q on R^n given by Q(x) = x^T A x, where A is an n × n symmetric matrix, called the matrix of the quadratic form.

Example 6. The function x ↦ ||x||^2 is a quadratic form given by setting A = I.

Quadratic forms appear in differential geometry, physics, economics, and statistics.

Example 7. Let A = [[1, -1], [-1, 1]] and x = (x_1, x_2). The corresponding quadratic form is

    Q(x) = x^T A x = x_1^2 - 2 x_1 x_2 + x_2^2.

Example 8. Find the matrix of the quadratic form

    Q(x) = 8x_1^2 + 7x_2^2 - 3x_3^2 - 6x_1x_2 + 4x_1x_3 - 2x_2x_3.

By inspection (the coefficient of x_i^2 goes in entry (i, i), and half the coefficient of x_i x_j goes in entries (i, j) and (j, i)) we see that

        [  8  -3   2 ]
    A = [ -3   7  -1 ]
        [  2  -1  -3 ]

Theorem 9 (Principal Axes Theorem). Let A be an n × n symmetric matrix. Then there is an orthogonal change of variable x = P y that transforms the quadratic form x^T A x into a quadratic form y^T D y with D diagonal.

Proof. By the Spectral Theorem, there exists an orthogonal matrix P such that P^T A P = D with D diagonal. For all x ∈ R^n, set y = P^T x. Then x = P y and

    x^T A x = (P y)^T A (P y) = y^T (P^T A P) y = y^T D y. ∎

Example 10. Let Q(x) = 3x_1^2 - 4x_1x_2 + 6x_2^2. Make a change of variable that transforms Q into a quadratic form with no cross-product terms.

We have A = [[3, -2], [-2, 6]] with eigenvalues 7 and 2, and eigenbases {(1, -2)} and {(2, 1)}, respectively. We normalize each to determine our diagonalizing matrix P, so that P^T A P = D where

    P = (1/√5) [[1, 2], [-2, 1]]    and    D = [[7, 0], [0, 2]].

Our change of variable is x = P y and the new form is Q'(y) = y^T D y = 7y_1^2 + 2y_2^2.

If A is a symmetric n × n matrix, the quadratic form Q(x) = x^T A x is a real-valued function with domain R^n. If n = 2, then the graph of Q is the set of points (x_1, x_2, z) with z = Q(x). For example, if Q(x) = 3x_1^2 + 7x_2^2, then Q(x) > 0 for all x ≠ 0. Such a form is called positive definite, and it is possible to determine this property from the eigenvalues of A.
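The change of variable in Example 10 can be checked numerically. This sketch confirms that P^T A P is diagonal and that Q(x) agrees with 7y_1^2 + 2y_2^2 at an arbitrary test point:

```python
import numpy as np

# Example 10: Q(x) = 3x1^2 - 4x1x2 + 6x2^2.
A = np.array([[3.0, -2.0], [-2.0, 6.0]])
P = np.column_stack([[1, -2], [2, 1]]) / np.sqrt(5)   # unit eigenvectors
D = np.diag([7.0, 2.0])
print(np.allclose(P.T @ A @ P, D))                    # True

x = np.array([1.0, 3.0])                              # an arbitrary test point
y = P.T @ x                                           # change of variable
print(np.isclose(x @ A @ x, 7*y[0]**2 + 2*y[1]**2))   # True
```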
Definition 4. A quadratic form Q is
(1) positive definite if Q(x) > 0 for all x ≠ 0;
(2) negative definite if Q(x) < 0 for all x ≠ 0;
(3) indefinite if Q(x) assumes both positive and negative values.

Theorem 11. Let A be an n × n symmetric matrix. Then Q(x) = x^T A x is
(1) positive definite if and only if the eigenvalues of A are all positive;
(2) negative definite if and only if the eigenvalues of A are all negative;
(3) indefinite otherwise.

Proof. By the Principal Axes Theorem, there exists an orthogonal matrix P such that x = P y and

    Q(x) = x^T A x = y^T D y = λ_1 y_1^2 + ... + λ_n y_n^2,

where λ_1, ..., λ_n are the eigenvalues of A. Since P is invertible, there is a one-to-one correspondence between nonzero x and nonzero y. The signs of the values above for y ≠ 0 are clearly determined by the signs of the λ_i; hence so are the corresponding values Q(x) for x ≠ 0. ∎

3. Constrained Optimization

We can also determine the maximum and minimum of a quadratic form when evaluated on a unit vector. This is known as constrained optimization.

Theorem 12. Let A be a symmetric matrix. Set

    m = min{x^T A x : ||x|| = 1}    and    M = max{x^T A x : ||x|| = 1}.

Then M is the greatest eigenvalue of A and m is the least eigenvalue of A. Moreover, x^T A x = M (resp. m) when x is a unit eigenvector corresponding to M (resp. m).

Example 13. Let Q(x) = 7x_1^2 + x_2^2 + 7x_3^2 - 8x_1x_2 - 4x_1x_3 - 8x_2x_3. Find a vector x at which Q(x) is maximized (minimized) subject to x^T x = 1.

The matrix of Q is

        [  7  -4  -2 ]
    A = [ -4   1  -4 ]
        [ -2  -4   7 ]

and the eigenvalues are -3 and 9. Hence, the maximum (resp. minimum) value of Q subject to the constraint is 9 (resp. -3).

An eigenvector for 9 is (1, 0, -1). Hence, setting u = (1/√2, 0, -1/√2) gives Q(u) = 9.

An eigenvector for -3 is (1, 2, 1). Hence, setting v = (1/√6, 2/√6, 1/√6) gives Q(v) = -3.
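Theorem 12 can be illustrated on Example 13 by sampling: this sketch draws random unit vectors and checks that Q never escapes the interval between the least and greatest eigenvalues of A (matrix entries as reconstructed above).

```python
import numpy as np

# The matrix of Q from Example 13; eigenvalues are -3, 9, 9.
A = np.array([[7.0, -4.0, -2.0],
              [-4.0, 1.0, -4.0],
              [-2.0, -4.0, 7.0]])
evals = np.linalg.eigvalsh(A)        # ascending; approximately [-3, 9, 9]

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # random unit vectors
Q = np.einsum('ij,jk,ik->i', X, A, X)           # Q(x) for each sample
print(evals[0] - 1e-9 <= Q.min() and Q.max() <= evals[-1] + 1e-9)   # True
```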
4. Singular Value Decomposition

The key question in this section is whether it is possible to "diagonalize" a non-square matrix.

Example 14. Let A = [[4, 11, 14], [8, 7, -2]]. A linear transformation with matrix A maps the unit sphere {x ∈ R^3 : ||x|| = 1} onto an ellipse in R^2. Find a unit vector x at which the length ||Ax|| is maximized and compute its length.

The key observation here is that ||Ax|| is maximized at the same x that maximizes ||Ax||^2, and

    ||Ax||^2 = (Ax)^T (Ax) = x^T (A^T A) x.

Thus, we want to maximize the quadratic form Q(x) = x^T (A^T A) x subject to the constraint ||x|| = 1. The eigenvalues of

            [  80  100   40 ]
    A^T A = [ 100  170  140 ]
            [  40  140  200 ]

are λ_1 = 360, λ_2 = 90, λ_3 = 0 with corresponding (unit) eigenvectors

    v_1 = (1/3, 2/3, 2/3),    v_2 = (-2/3, -1/3, 2/3),    v_3 = (2/3, -2/3, 1/3).

The maximum value is 360, obtained when x = v_1. That is, the vector A v_1 corresponds to the point on the ellipse furthest from the origin. Then

    A v_1 = (18, 6)    and    ||A v_1|| = √360 = 6√10.

The trick we utilized above is a handy one. Even though A is not symmetric (it wasn't even square), A^T A is symmetric and we can extract information about A from A^T A.

Let A be an m × n matrix. Then A^T A can be orthogonally diagonalized. Let {v_1, ..., v_n} be an orthonormal A^T A-eigenbasis for R^n with corresponding eigenvalues λ_1, ..., λ_n. For 1 ≤ i ≤ n,

    0 ≤ ||A v_i||^2 = (A v_i)^T (A v_i) = v_i^T (A^T A) v_i = v_i^T (λ_i v_i) = λ_i v_i^T v_i = λ_i.

Hence, λ_i ≥ 0 for all i. Arrange the eigenvalues so that λ_1 ≥ λ_2 ≥ ... ≥ λ_n ≥ 0 and define σ_i = √λ_i. That is, the σ_i are the lengths of the vectors A v_i. These are called the singular values of A.

Example 15. In Example 14, the singular values are σ_1 = √360 = 6√10, σ_2 = √90 = 3√10, and σ_3 = 0.

Theorem 16. Let A be an m × n matrix. Suppose {v_1, ..., v_n} is an orthonormal basis for R^n consisting of eigenvectors of A^T A with corresponding eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_n ≥ 0. Suppose A has r nonzero singular values. Then {A v_1, ..., A v_r} is an orthogonal basis for Col(A) and rank A = r.
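The trick of Example 14 is easy to check numerically: the singular values of A are the square roots of the eigenvalues of the symmetric matrix A^T A, and ||A v_1|| attains the constrained maximum at the top eigenvector. A sketch:

```python
import numpy as np

A = np.array([[4.0, 11.0, 14.0], [8.0, 7.0, -2.0]])

# Eigenvalues of A^T A, in decreasing order: 360, 90, 0 (up to roundoff).
evals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
sigmas = np.sqrt(np.clip(evals, 0, None))     # singular values 6√10, 3√10, 0

# ||A v1|| attains the maximum sqrt(360) = 6*sqrt(10) at the top eigenvector.
v1 = np.array([1, 2, 2]) / 3.0
print(np.isclose(np.linalg.norm(A @ v1), 6*np.sqrt(10)))   # True
```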
Proof. Suppose i ≠ j. Then

    (A v_i) · (A v_j) = (A v_i)^T (A v_j) = v_i^T (A^T A) v_j = λ_j (v_i^T v_j) = 0.

Hence, {A v_1, ..., A v_r} is an orthogonal set of nonzero vectors, and hence linearly independent. It is left to show that the given set spans Col(A). Since there are exactly r nonzero singular values, A v_i ≠ 0 if and only if 1 ≤ i ≤ r. Let y ∈ Col(A), so y = A x for some x ∈ R^n. Then x = c_1 v_1 + ... + c_n v_n for some scalars c_i ∈ R and so

    y = A x = c_1 A v_1 + ... + c_r A v_r + c_{r+1} A v_{r+1} + ... + c_n A v_n
            = c_1 A v_1 + ... + c_r A v_r + 0 + ... + 0.

Thus, y ∈ Span{A v_1, ..., A v_r}, and so {A v_1, ..., A v_r} is an orthogonal basis for Col(A) and rank A = dim Col(A) = r. ∎

Theorem 17 (Singular Value Decomposition). Let A be an m × n matrix with rank r. Then there exists an m × n matrix

    Σ = [[D, 0], [0, 0]],

where D is an r × r diagonal matrix whose diagonal entries are the first r singular values of A, σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0, and there exist an m × m orthogonal matrix U and an n × n orthogonal matrix V such that A = U Σ V^T.

Proof. Let {v_1, ..., v_n} be an orthonormal basis for R^n consisting of eigenvectors of A^T A, ordered as above. By Theorem 16, {A v_1, ..., A v_r} is an orthogonal basis of Col(A). For each 1 ≤ i ≤ r, set

    u_i = A v_i / ||A v_i|| = (1/σ_i) A v_i,    so that    A v_i = σ_i u_i.

Extend {u_1, ..., u_r} to an orthonormal basis {u_1, ..., u_m} of R^m. Let U = [u_1 ... u_m] and V = [v_1 ... v_n]. Then both U and V are orthogonal and

    A V = [A v_1 ... A v_r 0 ... 0] = [σ_1 u_1 ... σ_r u_r 0 ... 0] = U Σ.

Since V is orthogonal, A = A V V^T = U Σ V^T. ∎

The columns of U in the preceding theorem are called the left singular vectors of A and the columns of V are the right singular vectors of A. We summarize the method for SVD below.

Let A be an m × n matrix of rank r. Then A = U Σ V^T where U, Σ, V are obtained as follows.
(1) Find an orthonormal basis {v_1, ..., v_n} of R^n consisting of eigenvectors of A^T A.
(2) Arrange the eigenvalues of A^T A in decreasing order. The matrix V is [v_1 ... v_n] in this order.
(3) The matrix Σ is obtained by placing the r nonzero singular values along the diagonal in decreasing order.
(4) Set u_i = A v_i / ||A v_i|| for i = 1, ..., r.
(5) Extend the orthogonal set {u_1, ..., u_r} to an orthonormal basis {u_1, ..., u_m} of R^m by adding vectors not in its span and applying Gram-Schmidt. The matrix U is [u_1 ... u_m].
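The five steps above can be collected into a small function. This is only a sketch of the notes' recipe (in practice one would call `np.linalg.svd` directly): it diagonalizes A^T A, reads off Σ and V, builds the first r columns of U from A v_i / σ_i, and performs step (5) with a QR factorization in place of hand Gram-Schmidt.

```python
import numpy as np

def svd_via_ata(A, tol=1e-10):
    """Sketch of steps (1)-(5): an SVD built from the eigenpairs of A^T A."""
    m, n = A.shape
    evals, V = np.linalg.eigh(A.T @ A)          # step (1); ascending order
    order = np.argsort(evals)[::-1]             # step (2): decreasing order
    evals, V = evals[order], V[:, order]
    sigma = np.sqrt(np.clip(evals, 0, None))    # step (3): singular values
    r = int(np.sum(sigma > tol))
    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma[:r])
    U = np.zeros((m, m))
    U[:, :r] = A @ V[:, :r] / sigma[:r]         # step (4): u_i = A v_i / sigma_i
    # Step (5): complete {u_1,...,u_r} to an orthonormal basis of R^m.
    # QR of [U_r | I] carries out the Gram-Schmidt-style extension.
    Q, _ = np.linalg.qr(np.column_stack([U[:, :r], np.eye(m)]))
    U[:, r:] = Q[:, r:m]
    return U, Sigma, V

A = np.array([[4.0, 11.0, 14.0], [8.0, 7.0, -2.0]])
U, S, V = svd_via_ata(A)
print(np.allclose(U @ S @ V.T, A))              # True
```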
Example 18. Consider A as in Example 14. We have already that

        [ 1/3  -2/3   2/3 ]
    V = [ 2/3  -1/3  -2/3 ]
        [ 2/3   2/3   1/3 ]

The nonzero singular values are σ_1 = √360 = 6√10 and σ_2 = √90 = 3√10, so

    D = [[6√10, 0], [0, 3√10]]    and    Σ = [D 0] = [[6√10, 0, 0], [0, 3√10, 0]].

Now

    u_1 = (1/σ_1) A v_1 = (1/(6√10)) (18, 6) = (3/√10, 1/√10)

and

    u_2 = (1/σ_2) A v_2 = (1/(3√10)) (3, -9) = (1/√10, -3/√10).

Note that {u_1, u_2} is already a basis for R^2, so U = [u_1 u_2]. Now we check that A = U Σ V^T.

Example 19. Construct the singular value decomposition of

        [ 7  1 ]
    A = [ 0  0 ]
        [ 5  5 ]

We compute A^T A = [[74, 32], [32, 26]]. The eigenvalues of A^T A are λ_1 = 90 and λ_2 = 10 (note that λ_1 > λ_2 > 0). The corresponding eigenvectors are (2, 1) and (1, -2), respectively. Normalizing gives

    v_1 = (2/√5, 1/√5)    and    v_2 = (1/√5, -2/√5),    so    V = (1/√5) [[2, 1], [1, -2]].

The singular values are σ_1 = √90 = 3√10 and σ_2 = √10. Hence,

    D = [[3√10, 0], [0, √10]]    and    Σ = [[3√10, 0], [0, √10], [0, 0]].

A has rank 2 and

    u_1 = (1/σ_1) A v_1 = (1/√2, 0, 1/√2)    and    u_2 = (1/σ_2) A v_2 = (1/√2, 0, -1/√2).

Choose u_3 = (0, 1, 0) so that {u_1, u_2, u_3} is an orthonormal basis of R^3. Then U = [u_1 u_2 u_3] and A = U Σ V^T.
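Example 19's factorization can be verified end-to-end. This sketch (with the matrix entries as reconstructed in these notes) builds U, Σ, V and checks both that U is orthogonal and that the product recovers A:

```python
import numpy as np

A = np.array([[7.0, 1.0], [0.0, 0.0], [5.0, 5.0]])

V = np.column_stack([[2, 1], [1, -2]]) / np.sqrt(5)
Sigma = np.array([[3*np.sqrt(10), 0.0], [0.0, np.sqrt(10)], [0.0, 0.0]])
u1 = A @ V[:, 0] / (3*np.sqrt(10))   # (1/sqrt(2), 0,  1/sqrt(2))
u2 = A @ V[:, 1] / np.sqrt(10)       # (1/sqrt(2), 0, -1/sqrt(2))
u3 = np.array([0.0, 1.0, 0.0])       # completes the orthonormal basis of R^3
U = np.column_stack([u1, u2, u3])

print(np.allclose(U.T @ U, np.eye(3)))   # True
print(np.allclose(U @ Sigma @ V.T, A))   # True
```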
Example 20. Construct the singular value decomposition of

        [  1  -1 ]
    A = [ -2   2 ]
        [  2  -2 ]

We compute A^T A = [[9, -9], [-9, 9]]. The eigenvalues of A^T A are λ_1 = 18 and λ_2 = 0, with normalized eigenvectors

    v_1 = (1/√2, -1/√2)    and    v_2 = (1/√2, 1/√2),    so    V = (1/√2) [[1, 1], [-1, 1]].

The singular values are σ_1 = √18 = 3√2 and σ_2 = 0. Hence,

    Σ = [[3√2, 0], [0, 0], [0, 0]]    and    u_1 = (1/σ_1) A v_1 = (1/3, -2/3, 2/3).

To find u_2, u_3 such that {u_1, u_2, u_3} is an orthonormal basis of R^3, we find a basis for Nul(u_1^T). Set u_1^T x = 0. Then

    (1/3) x_1 - (2/3) x_2 + (2/3) x_3 = 0,    which implies    x_1 - 2x_2 + 2x_3 = 0.

Hence, the parametric solution is

    x = (2x_2 - 2x_3, x_2, x_3) = x_2 (2, 1, 0) + x_3 (-2, 0, 1).

Set w_2 = (2, 1, 0) and w_3 = (-2, 0, 1). By construction, w_2 and w_3 are orthogonal to u_1 but not to each other, so we apply Gram-Schmidt. Set ũ_2 = w_2. Then

    ũ_3 = w_3 - proj_{Span{ũ_2}} w_3 = w_3 - ((w_3 · ũ_2)/(ũ_2 · ũ_2)) ũ_2 = (-2, 0, 1) + (4/5)(2, 1, 0) = (-2/5, 4/5, 1).

Normalizing gives

    u_2 = (2/√5, 1/√5, 0)    and    u_3 = (-2/(3√5), 4/(3√5), 5/(3√5)).

Now {u_1, u_2, u_3} is an orthonormal basis of R^3. Then U = [u_1 u_2 u_3] and A = U Σ V^T.
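The Gram-Schmidt completion in Example 20 can be checked numerically. In this rank-one case only u_1 comes from A; u_2 and u_3 are built from Nul(u_1^T) exactly as above, and the assembled factors reproduce A:

```python
import numpy as np

A = np.array([[1.0, -1.0], [-2.0, 2.0], [2.0, -2.0]])
u1 = np.array([1, -2, 2]) / 3.0

w2 = np.array([2.0, 1.0, 0.0])          # solutions of x1 - 2x2 + 2x3 = 0
w3 = np.array([-2.0, 0.0, 1.0])
t3 = w3 - (w3 @ w2) / (w2 @ w2) * w2    # Gram-Schmidt: t3 = (-2/5, 4/5, 1)
u2 = w2 / np.linalg.norm(w2)
u3 = t3 / np.linalg.norm(t3)
U = np.column_stack([u1, u2, u3])

Sigma = np.array([[3*np.sqrt(2), 0.0], [0.0, 0.0], [0.0, 0.0]])
V = np.column_stack([[1, -1], [1, 1]]) / np.sqrt(2)
print(np.allclose(U.T @ U, np.eye(3)))   # True
print(np.allclose(U @ Sigma @ V.T, A))   # True
```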