Lecture Notes: Orthogonal and Symmetric Matrices

Yufei Tao
Department of Computer Science and Engineering
Chinese University of Hong Kong
taoyf@cse.cuhk.edu.hk

Orthogonal Matrix

Definition. An $n \times n$ matrix $A$ is orthogonal if (i) its inverse $A^{-1}$ exists, and (ii) $A^T = A^{-1}$.

Example 1. Consider

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$

It is orthogonal because

$$A^T = A^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$$

The following is a $3 \times 3$ orthogonal matrix:

$$\begin{bmatrix} 2/3 & -2/3 & 1/3 \\ 1/3 & 2/3 & 2/3 \\ 2/3 & 1/3 & -2/3 \end{bmatrix}.$$

Lemma 1. If $A$ is orthogonal, then $A^T$ is also orthogonal.

Proof. $(A^T)^T = (A^{-1})^T = (A^T)^{-1}$. The lemma thus follows.

To explain the next property of orthogonal matrices, we need to define two new concepts. Let $S$ be a set of non-zero vectors $v_1, v_2, \ldots, v_k$ of the same dimensionality. We say that $S$ is orthogonal if $v_i \cdot v_j = 0$ for any $i \neq j$. Furthermore, we say that $S$ is orthonormal if (i) $S$ is orthogonal, and (ii) $|v_i| = \sqrt{v_i \cdot v_i} = 1$ for any $i \in [1, k]$.

For example, the set $\{(1, -1, 0)^T, (1, 1, -2)^T, (1, 1, 1)^T\}$ is orthogonal but not orthonormal. If, however, we scale each of the above vectors to have length $1$, then the resulting vector set becomes orthonormal:

$$\left\{ \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{bmatrix}, \begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix}, \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix} \right\}.$$

Lemma 2. An orthogonal set of vectors must be linearly independent.
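Before proving this, the definitions above can be sanity-checked numerically. The sketch below (using numpy; the angle $\theta = 0.7$ is an arbitrary illustrative choice) verifies that the rotation matrix is orthogonal and that scaling an orthogonal set yields an orthonormal one:

```python
import numpy as np

# Rotation matrix: its transpose should equal its inverse.
# theta = 0.7 is an arbitrary choice for illustration.
theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(A.T, np.linalg.inv(A))   # A^T = A^{-1}, so A is orthogonal
assert np.allclose(A @ A.T, np.eye(2))      # equivalently, A A^T = I

# An orthogonal (but not orthonormal) set of vectors:
v1, v2, v3 = np.array([1, -1, 0]), np.array([1, 1, -2]), np.array([1, 1, 1])
assert v1 @ v2 == 0 and v1 @ v3 == 0 and v2 @ v3 == 0

# Scaling each vector to length 1 makes the set orthonormal.
u1, u2, u3 = (v / np.linalg.norm(v) for v in (v1, v2, v3))
assert np.isclose(u1 @ u1, 1) and np.isclose(u2 @ u2, 1) and np.isclose(u3 @ u3, 1)
```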
Proof. Suppose that $S = \{v_1, v_2, \ldots, v_k\}$. Assume, on the contrary, that $S$ is not linearly independent. Hence, there exist real values $c_1, c_2, \ldots, c_k$ that are not all zero, and make the following hold:

$$c_1 v_1 + c_2 v_2 + \ldots + c_k v_k = 0.$$

Suppose, without loss of generality, that $c_i \neq 0$ for some $i \in [1, k]$. Then, we take the dot product of both sides of the above equation with $v_i$, and obtain:

$$c_1 v_1 \cdot v_i + c_2 v_2 \cdot v_i + \ldots + c_k v_k \cdot v_i = 0 \;\Rightarrow\; c_i (v_i \cdot v_i) = 0.$$

The above equation contradicts the fact that $c_i \neq 0$ and $v_i$ is a non-zero vector.

We are now ready to reveal another way to define an orthogonal matrix:

Lemma 3. Let $A$ be an $n \times n$ matrix with row vectors $r_1, r_2, \ldots, r_n$, and column vectors $c_1, c_2, \ldots, c_n$. Both of the following statements are true:

$A$ is orthogonal if and only if $\{r_1, r_2, \ldots, r_n\}$ is orthonormal.
$A$ is orthogonal if and only if $\{c_1, c_2, \ldots, c_n\}$ is orthonormal.

Proof. We will prove only the first statement because applying the same argument to $A^T$ proves the second. Let $B = AA^T$. Denote by $b_{ij}$ the element of $B$ at the $i$-th row and $j$-th column. We know that $b_{ij} = r_i \cdot r_j$ (note that the $j$-th column of $A^T$ has the same components as $r_j$). $A$ is orthogonal if and only if $B$ is an identity matrix, which in turn is true if and only if $b_{ij} = 1$ when $i = j$, and $b_{ij} = 0$ otherwise. The lemma thus follows.

Lemma 4. The determinant of an orthogonal matrix $A$ can only be $1$ or $-1$.

Proof. From $A^T = A^{-1}$, we know that $AA^T = I$ where $I$ is an identity matrix. Hence, $\det(AA^T) = \det(A)\det(A^T) = (\det(A))^2 = 1$. The lemma thus follows.

Symmetric Matrix

Recall that an $n \times n$ matrix $A$ is symmetric if $A = A^T$. Next, we give several nice properties of such matrices.

Lemma 5. All the eigenvalues of a symmetric matrix must be real values (i.e., they cannot be complex numbers).

We omit the proof of the lemma. Note that the above lemma is not true for general square matrices (i.e., it is possible for an eigenvalue to be a complex number).

Lemma 6. Let $\lambda_1$ and $\lambda_2$ be two different eigenvalues of a symmetric matrix $A$.
Also, suppose that $x_1$ is an eigenvector of $A$ corresponding to $\lambda_1$, and $x_2$ is an eigenvector of $A$ corresponding to $\lambda_2$. It must hold that $x_1 \cdot x_2 = 0$.
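Before the proof, the lemma is easy to check numerically. A sketch with numpy (the symmetric matrix below is an arbitrary illustrative choice; `np.linalg.eigh` is numpy's eigensolver for symmetric matrices):

```python
import numpy as np

# An arbitrary symmetric matrix for illustration;
# its eigenvalues happen to be 1, 2, and 4 (all distinct).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)

# eigh returns real eigenvalues (cf. Lemma 5) in ascending order,
# with the corresponding eigenvectors as the columns of V.
w, V = np.linalg.eigh(A)
assert np.all(np.isreal(w))

# Eigenvectors of distinct eigenvalues are orthogonal (Lemma 6).
for i in range(3):
    for j in range(i + 1, 3):
        if not np.isclose(w[i], w[j]):
            assert np.isclose(V[:, i] @ V[:, j], 0.0)
```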
Proof. By the definition of eigenvalues and eigenvectors, we know:

$$Ax_1 = \lambda_1 x_1 \qquad (1)$$
$$Ax_2 = \lambda_2 x_2 \qquad (2)$$

From (1), we have

$$x_1^T A^T = \lambda_1 x_1^T$$
$$x_1^T A = \lambda_1 x_1^T \qquad \text{(as } A = A^T\text{)}$$
$$x_1^T A x_2 = \lambda_1 x_1^T x_2$$
$$x_1^T \lambda_2 x_2 = \lambda_1 x_1^T x_2 \qquad \text{(by (2))}$$
$$x_1^T x_2 (\lambda_2 - \lambda_1) = 0$$
$$x_1^T x_2 = 0 \qquad \text{(by } \lambda_1 \neq \lambda_2\text{)}$$

The lemma then follows from the fact that $x_1 \cdot x_2 = x_1^T x_2$.

Example 2. Consider

$$A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}.$$

We know that $A$ has two eigenvalues $\lambda_1 = -1$ and $\lambda_2 = 2$. For eigenvalue $\lambda_1 = -1$, all the eigenvectors can be represented as $x = (x_1, x_2, x_3)^T$ satisfying:

$$x_1 = -u - v, \quad x_2 = u, \quad x_3 = v$$

with $u, v \in \mathbb{R}$. Setting $(u, v)$ to $(1, 0)$ and $(0, 1)$ respectively gives us two linearly independent eigenvectors:

$$x_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \quad x_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}.$$

For eigenvalue $\lambda_2 = 2$, all the eigenvectors can be represented as $x = (x_1, x_2, x_3)^T$ satisfying $x_1 = t$, $x_2 = t$, $x_3 = t$ with $t \in \mathbb{R}$. Setting $t = 1$ gives us another eigenvector:

$$x_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$

Vectors $x_1$, $x_2$, and $x_3$ are linearly independent. According to Lemma 6, both $x_1 \cdot x_3$ and $x_2 \cdot x_3$ must be $0$. You can verify that this is indeed the case.

From an earlier lecture, we already know that every symmetric matrix can be diagonalized because it definitely has $n$ linearly independent eigenvectors. The next lemma strengthens this fact:
Lemma 7. Every $n \times n$ symmetric matrix has an orthogonal set of $n$ eigenvectors.

We omit the proof of the lemma (which is rather non-trivial). Note that the $n$ eigenvectors in the lemma must be linearly independent, according to Lemma 2.

Example 3. Let us consider again the matrix $A$ in Example 2. We have obtained eigenvectors $x_1, x_2, x_3$. Clearly, they do not constitute an orthogonal set because $x_1, x_2$ are not orthogonal. We will replace $x_2$ with a different $x_2'$ that is still an eigenvector of $A$ for eigenvalue $\lambda_1 = -1$, and is orthogonal to $x_1$. From Example 2, we know that all eigenvectors corresponding to $\lambda_1$ have the form

$$\begin{bmatrix} -u - v \\ u \\ v \end{bmatrix}.$$

For such a vector to be orthogonal to $x_1 = (-1, 1, 0)^T$, we need:

$$(-1)(-u - v) + u = 0 \;\Leftrightarrow\; v = -2u.$$

As you can see, there are infinitely many such vectors, any of which can be $x_2'$ except the zero vector. To produce one, we can choose $u = 1$, $v = -2$, which gives

$$x_2' = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}.$$

$\{x_1, x_2', x_3\}$ is thus an orthogonal set of eigenvectors of $A$.

Corollary 1. Every $n \times n$ symmetric matrix has an orthonormal set of $n$ eigenvectors.

Proof. The orthonormal set can be obtained by scaling all vectors in the orthogonal set of Lemma 7 to have length $1$.

Now we prove an important lemma about symmetric matrices.

Lemma 8. Let $A$ be an $n \times n$ symmetric matrix. There exists an orthogonal matrix $Q$ such that $A = Q\,\mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]\,Q^{-1}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are eigenvalues of $A$.

Proof. From an earlier lecture, we know that given a set of linearly independent eigenvectors $v_1, v_2, \ldots, v_n$ corresponding to eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ respectively, we can produce $Q$ by placing $v_i$ as the $i$-th column of $Q$, for each $i \in [1, n]$, such that $A = Q\,\mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_n]\,Q^{-1}$. From Corollary 1, we know that we can find an orthonormal set of $v_1, v_2, \ldots, v_n$. By Lemma 3, it follows that $Q$ is an orthogonal matrix.

Example 4. Consider once again the matrix $A$ in Example 2. In Example 3, we have obtained an orthogonal set of eigenvectors $x_1, x_2', x_3$.
By scaling, we obtain the following orthonormal set of eigenvectors:

$$\begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix}, \quad \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix}.$$

Recall that these eigenvectors correspond to eigenvalues $-1$, $-1$, and $2$, respectively. We thus produce:

$$Q = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & 1/\sqrt{6} & 1/\sqrt{3} \\ 0 & -2/\sqrt{6} & 1/\sqrt{3} \end{bmatrix}$$

such that $A = Q\,\mathrm{diag}[-1, -1, 2]\,Q^{-1}$.
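The whole construction of Lemma 8 can be verified end-to-end in a few lines. The sketch below uses a $3 \times 3$ symmetric matrix chosen for illustration (zeros on the diagonal, ones elsewhere), normalizes an orthogonal set of its eigenvectors into the columns of $Q$, and confirms that $Q$ is orthogonal and diagonalizes $A$:

```python
import numpy as np

# A symmetric matrix chosen for illustration.
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])

# An orthogonal set of eigenvectors: the first two correspond to
# eigenvalue -1, the third to eigenvalue 2.
x1 = np.array([-1.0, 1.0,  0.0])
x2 = np.array([ 1.0, 1.0, -2.0])
x3 = np.array([ 1.0, 1.0,  1.0])
assert np.allclose(A @ x1, -1 * x1)
assert np.allclose(A @ x2, -1 * x2)
assert np.allclose(A @ x3,  2 * x3)
assert x1 @ x2 == 0 and x1 @ x3 == 0 and x2 @ x3 == 0

# Scale each eigenvector to length 1 and place it as a column of Q.
Q = np.column_stack([x / np.linalg.norm(x) for x in (x1, x2, x3)])
assert np.allclose(Q.T, np.linalg.inv(Q))       # Q is orthogonal (Lemma 3)
D = np.diag([-1.0, -1.0, 2.0])
assert np.allclose(A, Q @ D @ Q.T)              # A = Q diag[...] Q^{-1}
```

Since $Q$ is orthogonal, $Q^{-1} = Q^T$, which is why the last check can use `Q.T` in place of an explicit inverse.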