The Orthogonal Geometry of R^n


Chapter 9: The Orthogonal Geometry of R^n

In this chapter we study various geometric problems, such as finding the minimal distance from a point of R^n to a subspace. This is a version of the least squares problem. We also prepare the way for the topic of the next chapter, the Principal Axis Theorem. In particular, we will need to show that every subspace of R^n has an orthonormal basis. The last section is devoted to studying the rotations of R^3 and to giving examples of rotation groups.

9.1 A Fundamental Problem

The purpose of this section is to solve the problem of finding the distance from a vector in R^n to a subspace. This is a problem we already solved for a plane through the origin in R^3. Recall that the distance from (x_0, y_0, z_0) ∈ R^3 to the plane ax + by + cz = 0 is

    d = |ax_0 + by_0 + cz_0| / (a^2 + b^2 + c^2)^{1/2}.

What this formula represents, of course, is the length of the projection of (x_0, y_0, z_0) onto the line through the origin normal to the plane.
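This distance formula is easy to sanity-check numerically. Here is a minimal Python sketch (numpy assumed; the plane coefficients and the point are made-up values for illustration) comparing the formula with the length of the projection onto the normal line.

```python
import numpy as np

# Hypothetical plane ax + by + cz = 0 and point p = (x0, y0, z0).
a, b, c = 1.0, -2.0, 2.0
p = np.array([3.0, 1.0, -1.0])
n = np.array([a, b, c])                          # normal vector to the plane

# Distance from p to the plane, by the formula above.
d_formula = abs(n @ p) / np.linalg.norm(n)

# The same number as the length of the projection of p onto the normal line.
proj_onto_normal = (n @ p) / (n @ n) * n
print(d_formula, np.linalg.norm(proj_onto_normal))   # the two values agree
```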

9.1.1 The orthogonal complement of a subspace

We can easily formulate an n-dimensional version of this problem, the solution of which has many important applications. Consider a subspace W of R^n.

Definition 9.1. The orthogonal complement of W is the subspace W⊥ of R^n defined as the set of all vectors v ∈ R^n which are orthogonal to every vector in W. That is,

    W⊥ = {v ∈ R^n : v · w = 0 for all w ∈ W}.    (9.1)

The orthogonal complement W⊥ is clearly also a subspace of R^n. It is easy to visualize in terms of matrices. Suppose W is the column space of an n × k real matrix A. Then clearly W⊥ = N(A^T). In other words, ignoring the distinction between row and column vectors, the row space and null space of a matrix are the orthogonal complements of one another. Applying our old principle that the number of variables in a homogeneous linear system is the number of corner variables plus the number of free variables, and using the fact that A and A^T have the same rank, we thus see that

    dim(W) + dim(W⊥) = n.

This leads to a basic result.

Proposition 9.1. Let W be a subspace of R^n and W⊥ the orthogonal complement of W. Then
(i) W ∩ W⊥ = {0};
(ii) every v ∈ R^n can be expressed in exactly one way as the sum of a vector in W and a vector in W⊥. In particular, W + W⊥ = R^n.

Proof. Part (i) follows immediately from the fact that if v ∈ W ∩ W⊥, then v · v = 0, so v = 0. The proof of (ii) is harder. Let w_1, ..., w_r be a basis of W and v_1, ..., v_s a basis of W⊥. We just showed r + s = n, so all we have to show is that w_1, ..., w_r, v_1, ..., v_s are independent, since n independent vectors in R^n form a basis. But we know from (i) that if we have a sum w + v = 0, where w ∈ W and v ∈ W⊥, then w = v = 0. Thus if a linear combination of w_1, ..., w_r, v_1, ..., v_s is 0, then the coefficients of the w_i are all 0 and, similarly, the coefficients of the v_j are all 0. Therefore we have the independence, so the proof is finished.
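The orthogonal complement is also easy to compute in practice: if W = col(A), then W⊥ = N(A^T). The following sketch (numpy assumed; the matrix A is an arbitrary illustration, not one from the text) checks the dimension formula above.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis of N(M), read off from the SVD of M."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T                     # rows of vt past the rank span N(M)

A = np.array([[1., 0.], [2., 1.], [1., 1.], [0., 3.]])   # W = col(A) in R^4
W_perp = null_space(A.T)                                  # W^perp = N(A^T)

n, k = A.shape
print(np.linalg.matrix_rank(A) + W_perp.shape[1] == n)    # dim W + dim W^perp = n
print(np.allclose(A.T @ W_perp, 0))        # columns of W_perp are orthogonal to every column of A
```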

We can now solve the following distance problem.

Problem: Let x ∈ R^n be arbitrary, and let W be a subspace of R^n. Minimize ‖x − y‖, where y is an arbitrary vector in W.

One thing to observe is that minimizing ‖x − y‖ is equivalent to solving the least squares problem of minimizing the sum of squares ‖x − y‖^2. Recall that when x = w + v with w ∈ W and w · v = 0, we call w the component of x in W. For any y ∈ W,

    ‖x − y‖^2 = ‖(x − w) + (w − y)‖^2 = ‖x − w‖^2 + ‖w − y‖^2 ≥ ‖x − w‖^2,

since (x − w) · (w − y) = 0. This gives the solution.

Proposition 9.2. The minimum distance from x ∈ R^n to the subspace W is ‖x − w‖, where w is the component of x in W. Put another way, the minimum distance from x to W is the length of the component of x in W⊥.

9.1.2 The projection on a subspace

In view of our solution to the least squares problem, the next step is to find an expression for the component of x in W. As above, let w_1, ..., w_r be a basis of W, and put A = (w_1 ⋯ w_r). Thus W is the column space col(A) of A. The conditions that define w are that, for some y ∈ R^r,

    w = Ay  and  A^T(x − w) = 0,

since the second condition says that x − w ∈ W⊥, due to the fact that the rows of A^T span W. Substituting, we get A^T x = A^T A y. We already know that when A is a real matrix with independent columns, A^T A is invertible. Thus y = (A^T A)^{-1} A^T x. Hence,

    w = Ay = A(A^T A)^{-1} A^T x.    (9.2)

Hence we have an expression for the component w of x. We can now define the projection P_W of R^n onto W by putting P_W(x) = w, where w is the component of x ∈ R^n in W. Thus, by (9.2),

    P_W(x) = A(A^T A)^{-1} A^T x.    (9.3)

Since P_W is given by a matrix, it is obviously a linear map.
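Formula (9.3) translates directly into a computation. A minimal sketch (numpy assumed; the spanning vectors and the vector x are arbitrary choices, not taken from the text):

```python
import numpy as np

A = np.array([[1., 0.], [1., 1.], [0., 1.], [1., 2.]])   # columns span W in R^4
P = A @ np.linalg.inv(A.T @ A) @ A.T                      # P_W = A (A^T A)^{-1} A^T

x = np.array([1., 2., 3., 4.])
w = P @ x                                                 # component of x in W
print(np.allclose(A.T @ (x - w), 0))                      # x - w lies in W^perp

y = A @ np.array([1., 1.])                                # some other point of W
print(np.linalg.norm(x - w) <= np.linalg.norm(x - y))     # w is at least as close (Proposition 9.2)
```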

Example 9.1. Let W be the line in R^n spanned by w. Here the projection P_W is simply

    P_W = w (w^T w)^{-1} w^T.

This is a formula we already saw in Chapter 1.

Example 9.2. Let

    A = [ 1 −1 ]
        [ 2  1 ]
        [ 1  0 ]
        [ 1  1 ].

Then A has rank 2, and we find by direct computation that

    P_W = A(A^T A)^{-1} A^T = (1/17) [ 14   1  5  −4 ]
                                     [  1  11  4   7 ]
                                     [  5   4  3   1 ]
                                     [ −4   7  1   6 ].

The next proposition simply says that projections on a subspace behave exactly like the projections on a line considered in Chapter 1.

Proposition 9.3. The projection P_W of R^n onto a subspace W has the following properties:
(i) if w ∈ W, then P_W(w) = w;
(ii) P_W P_W = P_W;
(iii) P_W is symmetric; and finally,
(iv) for any x ∈ R^n, x − P_W(x) is orthogonal to W.

Consequently, x = P_W(x) + (x − P_W(x)) is the orthogonal decomposition of x into the sum of a component in W and a component orthogonal to W.
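Proposition 9.3 can be confirmed numerically for Example 9.2. The sketch below uses the matrix A with the signs as reconstructed above, so treat it as an illustration rather than the book's own computation.

```python
import numpy as np

A = np.array([[1., -1.], [2., 1.], [1., 0.], [1., 1.]])   # Example 9.2, signs as reconstructed
P = A @ np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(17 * P, [[14, 1, 5, -4],
                           [ 1, 11, 4, 7],
                           [ 5, 4, 3, 1],
                           [-4, 7, 1, 6]]))        # matches the displayed matrix
print(np.allclose(P @ A, A))                       # (i)  P_W fixes every vector of W
print(np.allclose(P @ P, P))                       # (ii) P_W P_W = P_W
print(np.allclose(P, P.T))                         # (iii) P_W is symmetric
x = np.array([1., 0., 2., -1.])
print(np.allclose(A.T @ (x - P @ x), 0))           # (iv) x - P_W(x) is orthogonal to W
```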

9.2 Orthonormal Sets in R^n

9.2.1 Orthonormal Bases

Recall that the dot product of two vectors v, w ∈ R^n is defined to be

    v · w := v^T w = Σ_{i=1}^n v_i w_i,

and the length ‖v‖ of v is obtained from the dot product as ‖v‖ := (v · v)^{1/2}.

Definition 9.2. A collection of unit vectors in R^n is called orthonormal, or ON for short, if the vectors are mutually perpendicular to each other. An orthonormal basis of R^n (ONB for short) is a basis that is ON. More generally, an orthonormal basis of a subspace W of R^n is a basis of W which is ON.

Proposition 9.4. The vectors u_1, u_2, ..., u_n give an ONB of R^n if and only if the matrix U = (u_1 u_2 ⋯ u_n) is orthogonal.

Proof. This is clear, since U^T U = (u_i^T u_j) = (u_i · u_j), and this matrix equals I_n exactly when the u_i are orthonormal.

Proposition 9.5. Any ON set in R^n is linearly independent.

The proof is an exercise. In order to prove the Principal Axis Theorem, we will need to know that every subspace W of R^n admits an ONB. This will be proved in the next section.

Example 9.3. Here are some examples of ONBs.
(a) The standard basis e_1, ..., e_n is an ONB of R^n.
(b) (1/√3)(1, 1, 1)^T, (1/√6)(1, −2, 1)^T, (1/√2)(1, 0, −1)^T give an ONB of R^3. The first two basis vectors are an ONB of the plane x − z = 0.
(c) Both the columns and the rows of an n × n orthogonal matrix Q are an ONB of R^n (why?). Using the matrix

    Q = (1/2) [  1  1  1  1 ]
              [  1 −1  1 −1 ]
              [  1  1 −1 −1 ]
              [ −1  1  1 −1 ],

we thus get two distinct ONBs of R^4.
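Proposition 9.4 gives a quick way to verify the claims of Example 9.3: stack the vectors as the columns of a matrix and check that it is orthogonal. A sketch (numpy assumed; the sign patterns are the ones shown in the reconstruction above):

```python
import numpy as np

# Example 9.3(b): an ONB of R^3, checked via U^T U = I (Proposition 9.4).
U = np.column_stack([np.array([1., 1., 1.]) / np.sqrt(3),
                     np.array([1., -2., 1.]) / np.sqrt(6),
                     np.array([1., 0., -1.]) / np.sqrt(2)])
print(np.allclose(U.T @ U, np.eye(3)))

# Example 9.3(c): a 4x4 orthogonal matrix, so both its rows and its columns are ONBs of R^4.
Q = 0.5 * np.array([[ 1,  1,  1,  1],
                    [ 1, -1,  1, -1],
                    [ 1,  1, -1, -1],
                    [-1,  1,  1, -1]], dtype=float)
print(np.allclose(Q.T @ Q, np.eye(4)), np.allclose(Q @ Q.T, np.eye(4)))
```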

9.2.2 Projecting via an ONB

Let's first consider the problem of expanding a vector in R^n in terms of an ONB. After that we will find a formula for the projection P_W onto a subspace W of R^n. The solution to the first problem is very neat.

Proposition 9.6. Assume u_1, u_2, ..., u_n is an ONB of R^n. Then any w ∈ R^n has the unique expression

    w = Σ_{i=1}^n (w · u_i) u_i = Σ_{i=1}^n (u_i^T w) u_i.    (9.4)

We will call (9.4) the projection formula for R^n, since it says that any vector in R^n is the sum of its projections on an ONB. The coefficients x_i = w · u_i are often called the Fourier coefficients of w with respect to the orthonormal basis. The projection formula can also be stated in matrix form as follows:

    I_n = Σ_{i=1}^n u_i u_i^T.    (9.5)

In other words, the sum of the projections on an ONB is the identity.

Example 9.4. For example, expanding (1, 0, 0, 0)^T in the ONB given by the rows of the matrix Q of Example 9.3(c), each Fourier coefficient is ±1/2, and

    (1, 0, 0, 0) = (1/4)(1, 1, 1, 1) + (1/4)(1, −1, 1, −1) + (1/4)(1, 1, −1, −1) − (1/4)(−1, 1, 1, −1).

To prove the projection formula (9.4), write w = Σ_{i=1}^n x_i u_i. To find the x_i, we consider the system Qx = w, where Q = (u_1 u_2 ⋯ u_n). Since Q is orthogonal, the unique solution is x = Q^T w. But this says that each x_i = u_i^T w = w · u_i, which gives the desired formula.
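The projection formula (9.4) and its matrix form (9.5) can be checked mechanically for any ONB. Here is a sketch using the columns of the matrix Q of Example 9.3(c) (sign pattern as above; numpy assumed):

```python
import numpy as np

# Columns u_1,...,u_4 of Q form an ONB of R^4.
Q = 0.5 * np.array([[ 1,  1,  1,  1],
                    [ 1, -1,  1, -1],
                    [ 1,  1, -1, -1],
                    [-1,  1,  1, -1]], dtype=float)

w = np.array([1., 0., 0., 0.])
coeffs = Q.T @ w                               # Fourier coefficients w . u_i
print(np.allclose(Q @ coeffs, w))              # w equals the sum of its projections, as in (9.4)

S = sum(np.outer(Q[:, i], Q[:, i]) for i in range(4))
print(np.allclose(S, np.eye(4)))               # the projections u_i u_i^T sum to I_4, as in (9.5)
```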

More generally, suppose W is a subspace of R^n with an ONB u_1, u_2, ..., u_k. Then, by a similar argument, any w ∈ W has the unique expansion

    w = Σ_{i=1}^k (w · u_i) u_i.    (9.6)

To see this, first write w = Σ_{i=1}^k x_i u_i. Then observe that

    w · u_j = (Σ_{i=1}^k x_i u_i) · u_j = Σ_{i=1}^k x_i (u_i · u_j) = x_j,

since u_i · u_j = 0 if i ≠ j and u_i · u_i = 1.

We now claim that the projection P_W of R^n onto W is given by

    P_W(x) = Σ_{i=1}^k (x · u_i) u_i.    (9.7)

That is, the matrix of P_W is

    P_W = Σ_{i=1}^k u_i u_i^T = QQ^T,    (9.8)

where Q = (u_1 ⋯ u_k). To see this, apply the formula P_W = A(A^T A)^{-1} A^T to the case A = Q and notice that Q^T Q = I_k (check this).

Proposition 9.7. Let Q be the n × k matrix Q = (u_1 u_2 ⋯ u_k), where u_1, u_2, ..., u_k is an ONB of the subspace W. Then the matrix of P_W is

    P_W = QQ^T.    (9.9)

Example 9.5. Consider the subspace W = span{(1, 1, 1, 1)^T, (1, −1, −1, 1)^T}. In order to find the matrix of P_W, we must compute P_W(e_i) for i = 1, 2, 3, 4. Observe that u_1 = (1/2)(1, 1, 1, 1)^T and u_2 = (1/2)(1, −1, −1, 1)^T are an ONB of W. Now, by a straightforward computation,

    P_W(e_1) = (1/2)(1, 0, 0, 1)^T,  P_W(e_2) = (1/2)(0, 1, 1, 0)^T.

By inspection, P_W(e_3) = P_W(e_2) and P_W(e_4) = P_W(e_1). Hence the matrix A of P_W is

    A = (1/2) [ 1 0 0 1 ]
              [ 0 1 1 0 ]
              [ 0 1 1 0 ]
              [ 1 0 0 1 ].

Note that we could also have calculated QQ^T, where Q = (u_1 u_2).
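Proposition 9.7 gives the quickest route to the matrix found in Example 9.5: stack the ONB u_1, u_2 into Q and form QQ^T. A sketch (vectors as reconstructed above; numpy assumed):

```python
import numpy as np

u1 = 0.5 * np.array([1., 1., 1., 1.])
u2 = 0.5 * np.array([1., -1., -1., 1.])
Q = np.column_stack([u1, u2])

P = Q @ Q.T                                   # Proposition 9.7: P_W = Q Q^T
expected = 0.5 * np.array([[1, 0, 0, 1],
                           [0, 1, 1, 0],
                           [0, 1, 1, 0],
                           [1, 0, 0, 1]])
print(np.allclose(P, expected))               # agrees with the matrix A of Example 9.5
```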

The projection onto a hyperplane W in R^n with unit normal u clearly has the form

    P_W(x) = (I_n − uu^T) x.

Using the same reasoning as in Chapter 1, we define the reflection through W to be the linear transformation

    H_u = I_n − 2P_u,    (9.10)

where P_u = uu^T is the projection on the line spanned by u. We leave it as an exercise to show that H_u is a symmetric orthogonal matrix, H_u u = −u, and H_u x = x if x ∈ W. That is, H_u has the expected properties of a reflection.

The notions of Fourier coefficients and orthogonal projections are very useful in infinite-dimensional situations also. A set S of functions in C[a, b] is called orthonormal if for any f, g ∈ S,

    (f, g) = ∫_a^b f(t)g(t) dt = 0 if f ≠ g, and 1 if f = g.

The formula for projecting C[a, b] onto the subspace W spanned by a finite set of functions is exactly as given above, once an ONB of W has been constructed. This is our next step.

9.2.3 The Pseudo-Inverse and Least Squares

Suppose A has independent columns, and W denotes the column space of A. Then the matrix

    A^+ = (A^T A)^{-1} A^T

is called the pseudo-inverse of A. If A is square, then A and A^T are both invertible, so A^+ = A^{-1}. However, A^+ is always a left inverse of A. That is, A^+ A = I_k.

To see what is going on, it is helpful to consider A as a linear transformation A : R^k → R^n. Since the columns of A are independent, N(A) = {0}, and thus we know that A is one-to-one. That is, Ax = Ay implies x = y. We have just shown that a one-to-one linear transformation A always has a left inverse B, that is, a k × n matrix B such that BA = I_k, namely the pseudo-inverse A^+. However, when k < n there are many left inverses of A. In fact, if C is any k × n matrix such that col(A) ⊆ N(C) (for example, a syndrome matrix of A), then CA = O, so (A^+ + C)A = A^+ A + CA = I_k + O = I_k. Hence A^+ + C is also a left inverse of A. (Indeed, every left inverse of A has the form A^+ + C.)

The special property of the pseudo-inverse A^+ is that not only is A^+ A = I_k, but AA^+ = P_W. Thus A^+ solves the least squares problem for W: given b ∈ R^n, find x ∈ R^k so that Ax is the element of W nearest b. The solution is of course x = A^+ b.
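The two defining identities of the pseudo-inverse, A^+A = I_k and AA^+ = P_W, are easy to confirm numerically. A minimal sketch (numpy assumed; A is an arbitrary matrix with independent columns):

```python
import numpy as np

A = np.array([[1., 0.], [2., 1.], [1., 1.]])        # independent columns, so A^T A is invertible
A_plus = np.linalg.inv(A.T @ A) @ A.T               # pseudo-inverse (A^T A)^{-1} A^T

print(np.allclose(A_plus @ A, np.eye(2)))           # A^+ is a left inverse of A
P_W = A @ A_plus                                    # A A^+ is the projection onto W = col(A)
print(np.allclose(P_W @ A, A), np.allclose(P_W @ P_W, P_W))

b = np.array([1., 0., 2.])
x = A_plus @ b                                      # least squares solution: Ax is the point of W nearest b
print(np.allclose(A.T @ (b - A @ x), 0))            # the residual b - Ax is orthogonal to col(A)
```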

A useful way to look at the least squares problem is as a method of solving inconsistent systems. If the system Ax = b is inconsistent, then this system should be replaced by the consistent system Ax = P_W b, since P_W b is the vector in the column space W of A nearest b. The solution to the system Ax = P_W b is x = A^+ b.

Now let us consider a typical application. Suppose one has points (a_i, b_i), 1 ≤ i ≤ k, in R^2, which represent the outcome of some experiment, and one wants to find a line which fits these points as well as possible. If the equation of the line is y = mx + n, then the line will pass through (a_i, b_i) if and only if b_i = ma_i + n. These equations can be expressed in matrix form as Ax = b, where A is the k × 2 matrix whose i-th row is (a_i, 1), x = (m, n)^T, and b = (b_1, ..., b_k)^T. Note that the unknowns are now m and n. The effect of applying least squares is to replace b_1, ..., b_k by c_1, ..., c_k so that all (a_i, c_i) lie on a line and the sum

    Σ_{i=1}^k (b_i − c_i)^2

is minimized. The solution is easily written down using the pseudo-inverse A^+ of A. We have

    x = (m, n)^T = A^+ b = (A^T A)^{-1} A^T b.

More precisely,

    [ m ]   [ Σ a_i^2   Σ a_i ]^{-1} [ Σ a_i b_i ]
    [ n ] = [ Σ a_i       k   ]      [ Σ b_i     ].

Note that the 2 × 2 matrix in this solution is invertible just as long as the a_i are not all equal, i.e. as long as the columns of A are independent.

The problem of fitting a set of points (a_i, b_i, c_i) to a plane is similar. The method can also be adapted to the problem of fitting a set of points in R^2 to a nonlinear curve, such as an ellipse. This is apparently the genesis of the method of least squares. Its inventor, K. F. Gauss, astonished the astronomical world in 1801 by being able to predict, on the basis of only 9 degrees of observed orbit, the approximate position of the asteroid Ceres 11 months after his initial observations were made.
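The line-fitting recipe is a one-liner once A and b are assembled. Here is a sketch on made-up data (not the data of the exercises below), using the pseudo-inverse directly:

```python
import numpy as np

# Made-up data points (a_i, b_i); we fit y = m x + n in the least squares sense.
a = np.array([0., 1., 2., 3., 4.])
b = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

A = np.column_stack([a, np.ones_like(a)])           # i-th row is (a_i, 1); unknowns are (m, n)
m, n = np.linalg.inv(A.T @ A) @ A.T @ b             # x = A^+ b

residual = b - (m * a + n)
print(m, n, residual @ residual)                    # slope, intercept, minimized sum of squares
print(np.allclose(A.T @ residual, 0))               # normal equations: residual orthogonal to col(A)
```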

Least squares applies to function spaces as well.

Example 9.6. Suppose we want to minimize

    ∫_{−1}^{1} (cos x − (a + bx + cx^2))^2 dx.

The solution proceeds exactly as in the Euclidean situation. We first apply GS to 1, x, x^2 on [−1, 1] to obtain ON polynomials f_0, f_1, f_2 on [−1, 1], and then compute the Fourier coefficients of cos x with respect to the f_i. Clearly f_0 = 1/√2. Moreover, since x is odd, (x, f_0) = 0. Hence

    f_1 = x / (x, x)^{1/2} = √(3/2) x.

To get f_2, we calculate x^2 − (x^2, f_0) f_0 − (x^2, f_1) f_1, which turns out to be x^2 − 1/3. Computing (x^2 − 1/3, x^2 − 1/3), we get 8/45, so f_2 = √(45/8) (x^2 − 1/3). The Fourier coefficients (cos x, f_i) = ∫_{−1}^{1} cos x f_i(x) dx turn out to be √2 sin 1, 0, and √(45/8)(4 cos 1 − (8/3) sin 1). Thus the best least squares approximation is

    sin 1 + (45/8)(4 cos 1 − (8/3) sin 1)(x^2 − 1/3).

The calculation is greatly simplified by the fact that we chose the interval to be symmetric about 0, since x and x^2 are already orthogonal on [−1, 1], as are any even and odd polynomials.

Exercises

Exercise 9.1. Expand (1, 0, 0)^T using the orthonormal basis of Example 9.3(b). Do the same for (1, 0, 0, 0)^T using the columns of the matrix Q of Example 9.3(c).

Exercise 9.2. Find an ONB for the plane x − 2y + 3z = 0 in R^3. Now extend this ON set to an ONB of R^3.

Exercise 9.3. Show that the product of two orthogonal matrices is orthogonal and that the inverse of an orthogonal matrix is orthogonal. Why does the second statement imply that the transpose of an orthogonal matrix is orthogonal?

Exercise 9.4. Show that any ON set of vectors is linearly independent. (Use the projection formula.)

Exercise 9.5. What are the eigenvalues of a projection P_W? Can the matrix of a projection be diagonalized? That is, does there exist an eigenbasis of R^n for any P_W? If so, does there exist an ON eigenbasis?

Exercise 9.6. Show that the matrix of a projection is symmetric.

Exercise 9.7. Diagonalize the matrix A of P_W in Example 9.5.

Exercise 9.8. Find the matrix of the reflection H_u through the hyperplane orthogonal to u, defined in (9.10), in the following cases:
(a) u is a unit normal to the hyperplane x_1 + 2x_2 + x_3 = 0 in R^3;
(b) u is a unit normal to the hyperplane x_1 + x_2 + x_3 + x_4 = 0 in R^4.

Exercise 9.9. Show that the matrix H_u defined in (9.10) is a symmetric orthogonal matrix such that H_u u = −u and H_u x = x if x · u = 0.

Exercise 9.10. Show that H_u admits an ON eigenbasis.

Exercise 9.11. Let Q be the matrix of the reflection H_u.
(a) What are the eigenvalues of Q?
(b) Use the result of (a) to show that det(Q) = −1.
(c) Show that Q can be diagonalized by explicitly finding an eigenbasis of R^n for Q.

Exercise 9.12. Using the formula P_W = A(A^T A)^{-1} A^T, show that
(a) every projection matrix P_W satisfies the identity P_W P_W = P_W, and give a geometric interpretation of this, and
(b) every projection matrix is symmetric.

Exercise 9.13. Let A have independent columns. Verify the formula P = QQ^T using A = QR.

Exercise 9.14. Prove the Pythagorean relation used in the solution of the distance problem in Section 9.1.1. That is, show that if p · q = 0, then

    ‖p + q‖^2 = ‖p − q‖^2 = ‖p‖^2 + ‖q‖^2.

Conversely, if this identity holds for p and q, then p and q are orthogonal.

Exercise 9.15. Let A be the matrix of Problem 2 in Section 16. Find the matrix of the projection of R^4 onto the column space W of A. Also, find the projection of (2, 1, 1, 1)^T onto W.

Exercise 9.16. Suppose H is a hyperplane in R^n with normal line L. Interpret each of P_H + P_L, P_H P_L, and P_L P_H by giving a formula for each.

Exercise 9.17. Find the line that best fits the points (−1, 1), (0, 0.5), (1, 2), and (1.5, 2.5).

Exercise 9.18. Suppose coordinates have been put on the universe so that the sun's position is (0, 0, 0). Four observations of a planet orbiting the sun tell us that the planet passed through the points (5, 0.1, 0), (4.2, 2, 1.4), (0, 4, 3), and (−3.5, 2.8, 2). Find the plane (through the origin) that best fits the planet's orbit.

Exercise 9.19. Find the pseudo-inverse of the matrix

    [ 1 0 ]
    [ 2 1 ]
    [ 1 1 ].

Exercise 9.20. Assuming A has independent columns, find the pseudo-inverse A^+ from the QR factorization of A.

Exercise 9.21. Show that if A has independent columns, then any left inverse of A has the form A^+ + C, where CA = O.

Exercise 9.22. Suppose A has independent columns and let A = QR be the QR factorization of A. Find a left inverse of A in terms of Q and R.

Exercise 9.23. Consider the matrix

    A = [ 1 2 ]
        [ 0 1 ]
        [ 1 0 ].

(a) Find the pseudo-inverse A^+ of A, and
(b) compute the QR factorization of A and use the result to find another left inverse of A.

Exercise 9.24. Let W be a subspace of R^n with basis w_1, ..., w_k and put A = (w_1 ⋯ w_k). Show that A^T A is always invertible. (HINT: It is sufficient to show that A^T A x = 0 implies x = 0 (why?). Now consider x^T A^T A x.)

9.3 Gram-Schmidt and the QR Factorization

9.3.1 The Gram-Schmidt Method

We are now going to show that every subspace of R^n has an orthonormal basis. In fact, given a subspace W with a basis w_1, ..., w_k, we will construct an ONB u_1, ..., u_k of W such that for each index m = 1, ..., k,

    span{u_1, ..., u_m} = span{w_1, ..., w_m}.

The method is called the Gram-Schmidt method, or GS method for short. In fact, Gram-Schmidt is simply the technique we used in R^2, and extended to R^n in the previous section (cf. Proposition 9.3), to decompose a vector into two orthogonal components using projections. GS works in the abstract setting also, and we will take this theme up in the exercises. If V is any real vector space admitting an inner product, such as C[a, b], GS also gives a method for constructing an ONB of any finite-dimensional subspace W of V.

Let's begin with a subspace W of R^n having a basis w_1, ..., w_k. Recall that the basis property implies no proper subset of w_1, ..., w_k can span W. Now proceed as follows: first let

    u_1 = ‖w_1‖^{-1} w_1.

Next find a nonzero vector v_2 in the plane spanned by w_1 and w_2 which is orthogonal to w_1. The natural solution is to project w_2 on w_1 and put

    v_2 := w_2 − P_{w_1}(w_2) = w_2 − (w_2 · u_1) u_1,

and then set

    u_2 := ‖v_2‖^{-1} v_2.

Then u_1 and u_2 are ON. Moreover,

    w_1 = (w_1 · u_1) u_1  and  w_2 = (w_2 · u_1) u_1 + (w_2 · u_2) u_2.

Thus u_1 and u_2 are in fact an ONB of the plane W_2 spanned by w_1 and w_2.

To continue, let W_3 be the three-dimensional subspace spanned by w_1, w_2 and w_3.

By Proposition 9.3, the vector

    v_3 = w_3 − P_{W_2}(w_3) = w_3 − (w_3 · u_1) u_1 − (w_3 · u_2) u_2

is orthogonal to W_2. Moreover, v_3 ∈ W_3 and v_3 ≠ 0 (why?). Now put u_3 = ‖v_3‖^{-1} v_3. Hence u_1, u_2, u_3 are an ONB of W_3.

In general, if j ≤ k, let W_j denote span{w_1, ..., w_j}, and suppose an ONB u_1, ..., u_{j−1} of W_{j−1} is already defined. Then one defines

    v_j := w_j − P_{W_{j−1}}(w_j)

(so v_j is w_j minus the component of w_j in W_{j−1}). In other words,

    v_j = w_j − (w_j · u_1) u_1 − (w_j · u_2) u_2 − ⋯ − (w_j · u_{j−1}) u_{j−1}.

Finally put u_j = ‖v_j‖^{-1} v_j. Then u_1, ..., u_{j−1}, u_j is an ONB of the subspace W_j spanned by w_1, ..., w_j. Continuing in this manner, we eventually arrive at an ONB u_1, ..., u_k of W with the property that the span of u_1, ..., u_i coincides with the span of w_1, ..., w_i for each i ≤ k. To summarize, we state

Proposition 9.8. Suppose w_1, ..., w_k are linearly independent vectors in R^n. Then the Gram-Schmidt method produces ON vectors u_1, ..., u_k such that

    span{u_1, ..., u_i} = span{w_1, ..., w_i}

for each i ≤ k.

9.3.2 The QR Decomposition

The GS method can be summarized in an important matrix form called the QR factorization. This is the starting point of several methods in applied linear algebra, such as the QR algorithm. Let us consider the case k = 3, the higher cases being analogous. Let W be a subspace of R^n with a basis w_1, w_2, w_3. Applying GS to this basis gives an ONB u_1, u_2, u_3 of W such that the following matrix identity holds:

    (w_1 w_2 w_3) = (u_1 u_2 u_3) [ w_1·u_1  w_2·u_1  w_3·u_1 ]
                                  [    0     w_2·u_2  w_3·u_2 ]
                                  [    0        0     w_3·u_3 ].

In general, if A = (w_1 ⋯ w_k) is an n × k matrix over R with linearly independent columns, Q = (u_1 ⋯ u_k) is the associated n × k matrix produced by the GS method, and R is the k × k upper triangular matrix of Fourier coefficients, then A = QR. This is known as the QR decomposition or QR factorization of A.
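The Gram-Schmidt loop and the resulting QR factorization fit in a few lines of code. The sketch below (numpy assumed; the input matrix is an arbitrary example with independent columns, not one from the exercises) follows the construction above: each entry of R is a Fourier coefficient w_j · u_i, and each diagonal entry is ‖v_j‖.

```python
import numpy as np

def gram_schmidt(A):
    """Return Q, R with A = QR, where Q has ON columns and R is upper triangular."""
    n, k = A.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for j in range(k):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]      # Fourier coefficient w_j . u_i
            v -= R[i, j] * Q[:, i]           # subtract the component of w_j in W_{j-1}
        R[j, j] = np.linalg.norm(v)          # length of v_j
        Q[:, j] = v / R[j, j]                # normalize to get u_j
    return Q, R

A = np.array([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.], [1., 1., 1.]])
Q, R = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(3)))       # the columns of Q are ON
print(np.allclose(np.triu(R), R))            # R is upper triangular
print(np.allclose(Q @ R, A))                 # A = QR
```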

Summarizing, we have

Proposition 9.9. Every real n × k matrix A with independent columns (i.e. of rank k) can be factored as A = QR, where Q is an n × k matrix with ON columns and R is an invertible upper triangular k × k matrix.

The fact that R is invertible is because it is upper triangular and its diagonal entries are nonzero (why?). If A is square, then both Q and R are square. In particular, Q is an orthogonal matrix. Constructing the factorization A = QR is the first step in the QR algorithm, which is an important method for approximating the eigenvalues of A.

Exercises

Exercise 9.25. Find an ONB of the subspace W of R^4 spanned by (1, 0, 1, 1), (−1, 1, 0, 0), and (1, 0, 1, 1). Expand (0, 0, 0, 1) and (1, 0, 0, 0) in terms of this basis.

Exercise 9.26. Let

    A := [ 1 1 1 ]
         [ 0 1 0 ]
         [ 1 0 1 ]
         [ 1 0 1 ].

Find the QR factorization of A.

Exercise 9.27. Find a 4 × 4 orthogonal matrix Q whose first three columns are the columns of A in the previous problem.

Exercise 9.28. What would happen if the GS method were applied to a set of vectors that were not linearly independent? In other words, why can't we produce an ONB from nothing?

Exercise 9.29. In the QR decomposition, we claimed that the diagonal entries of R are nonzero, hence R is invertible. Explain why they are indeed nonzero.

Exercise 9.30. Suppose A = QDQ^{-1} with Q orthogonal and D diagonal. Show that A is always symmetric and that A is orthogonal if and only if all diagonal entries of D are ±1. Show that A is the matrix of a reflection H_u precisely when D = diag(−1, 1, ..., 1), that is, exactly one diagonal entry of D is −1 and all the others are +1.

Exercise 9.31. How would you define the reflection H through a subspace W of R^n? What properties should the matrix of H have? For example, what should the eigenvalues of H be?

Exercise 9.32. Check directly that if R = I_n − P_W, then R^2 = R. Verify also that the eigenvalues of R are 0 and 1, and that E_0 = W and E_1 = W⊥.

Exercise 9.33. Show that for any subspace W of R^n, P_W can be expressed as P_W = QDQ^T, where D is diagonal and Q is orthogonal. Find the diagonal entries of D, and describe Q.

Exercise 9.34. Let W be the plane in R^4 spanned by (0, 1, 0, 1) and (1, 1, 0, 0). Find an ONB of W, an ONB of W⊥, and an ONB of R^4 containing the ONBs of W and W⊥. Also verify that if W is any subset of R^n, then W⊥ is a subspace of R^n. What is (W⊥)⊥?

Exercise 9.35. The GS method applies to the inner product on C[a, b] as well.
(a) Apply GS to the functions 1, x, x^2 on the interval [−1, 1] to produce an ON basis of the set of polynomials on [−1, 1] of degree at most two. The resulting functions P_0, P_1, P_2 are the first three normalized orthogonal polynomials of Legendre type.
(b) Show that your nth polynomial P_n satisfies the differential equation

    (1 − x^2) y'' − 2x y' + n(n + 1) y = 0.

(c) The nth degree Legendre polynomial satisfies this second order differential equation for all n ≥ 0. This and the orthogonality condition can be used to generate all the Legendre polynomials. Find P_3 and P_4 without GS.

Exercise 9.36. Using the result of the previous exercise, find the projection of x^4 + x onto the subspace of C[−1, 1] spanned by 1, x, x^2.

9.4 The group of rotations of R^3

One of the mathematical problems one encounters in crystallography is to determine the set of rotations of a particular molecule. In other words, the problem is to determine the rotational symmetries of some object in R^3. The first question we should consider is: what is a rotation of R^3?

We will use a characterization, apparently due to Euler, that a rotation ρ of R^3 is determined by an axis through the origin, which ρ fixes pointwise, such that every plane orthogonal to this axis is rotated through the same fixed angle θ. Using this as the basic definition, we will now describe the set of rotations of R^3 in terms of matrix theory.

9.4.1 The set Rot(R^3)

Let Rot(R^3) denote the set of rotations of R^3. It is clear that a rotation ρ of R^3 about 0 should preserve lengths and angles. Recalling that for any x, y ∈ R^3,

    x · y = ‖x‖ ‖y‖ cos α,

where α is the angle between x and y, we see that any transformation of R^3 preserving both lengths and angles also preserves the dot product of any two vectors. Thus if ρ ∈ Rot(R^3),

    ρ(x) · ρ(y) = x · y.    (9.11)

Therefore, every rotation is given by an orthogonal matrix, and we see that Rot(R^3) ⊆ O(3, R), the set of 3 × 3 orthogonal matrices. In particular, every rotation of R^3 is a linear transformation.

However, not every orthogonal matrix gives a rotation of R^3. For example, a reflection of R^3 through a plane through the origin clearly isn't a rotation, because if a rotation fixes two orthogonal vectors in R^3, it fixes all of R^3. On the other hand, a reflection does fix two orthogonal vectors without fixing R^3. In fact, I claim that every rotation ρ has a positive determinant. Indeed, ρ fixes a line L through the origin pointwise, so 1 is an eigenvalue. Moreover, the plane orthogonal to L is rotated, so there exists a basis of R^3 for which the matrix of ρ has the form

    [ 1    0       0    ]
    [ 0  cos θ  −sin θ  ]
    [ 0  sin θ   cos θ  ].

Hence if ρ ∈ Rot(R^3), then det(ρ) = 1.

We now introduce SO(3). Recall that SL(3, R) denotes the set of all 3 × 3 real matrices of determinant 1. Put

    SO(3) = SL(3, R) ∩ O(3, R).

We therefore deduce that Rot(R^3) ⊆ SO(3). In fact, the next thing we will show is

Theorem 9.10. Rot(R^3) = SO(3).

Proof. It suffices to show SO(3) ⊆ Rot(R^3), i.e. every element of SO(3) is a rotation. Note that the identity transformation I_3 is a rotation, namely the rotation through zero degrees. We will first prove that if σ ∈ SO(3) and σ ≠ I_3, then 1 is an eigenvalue of σ and the corresponding eigenspace has dimension 1. That is, E_1 is a line.

We know that every 3 × 3 real matrix has a real eigenvalue, and we also know that the real eigenvalues of an orthogonal matrix are either 1 or −1. Hence, for σ ∈ SO(3), the eigenvalues of σ are one of the following possibilities: (i) 1 of multiplicity three; (ii) 1 and −1, where −1 has multiplicity two; and (iii) 1, λ, λ̄, where λ ≠ λ̄, since the complex roots of the characteristic polynomial of a real matrix occur in conjugate pairs. Hence 1 is always an eigenvalue of σ, so dim E_1 ≥ 1.

I claim that if σ ∈ SO(3) and σ ≠ I_3, then dim E_1 = 1. Indeed, dim E_1 = 3 is impossible, since σ ≠ I_3. If dim E_1 = 2, then σ fixes the plane E_1 pointwise. Since σ preserves angles, it also has to send the line L = E_1⊥ to itself. Thus L is an eigenspace. But the only real eigenvalue different from 1 is −1, so if σ ≠ I_3, there is a basis of R^3 so that the matrix of σ is

    [ 1 0  0 ]
    [ 0 1  0 ]
    [ 0 0 −1 ].

But then det(σ) = −1, so dim E_1 = 2 cannot happen. This gives us the claim that dim E_1 = 1.

Therefore σ fixes every point on a unique line L through the origin and maps the plane L⊥ orthogonal to L into itself. We now need to show that σ rotates L⊥. Let u_1, u_2, u_3 be an ONB of R^3 such that u_1, u_2 ∈ L⊥ and σ(u_3) = u_3, and let Q = (u_1 u_2 u_3). Since σu_1 and σu_2 are orthogonal unit vectors in L⊥, we can choose an angle θ such that

    σu_1 = cos θ u_1 + sin θ u_2  and  σu_2 = ±(sin θ u_1 − cos θ u_2).

In matrix terms, this says

    σQ = Q [ cos θ  ±sin θ  0 ]
           [ sin θ  ∓cos θ  0 ]
           [   0       0    1 ].

Since det(σ) = 1 and det(Q) = ±1, it follows that

    det [ cos θ  ±sin θ  0 ]
        [ sin θ  ∓cos θ  0 ] = 1.
        [   0       0    1 ]

The only possibility is that

    σ = Q [ cos θ  −sin θ  0 ]
          [ sin θ   cos θ  0 ] Q^{-1}.    (9.12)
          [   0       0    1 ]

This tells us that σ rotates the plane L⊥ through θ, hence σ ∈ Rot(R^3). This completes the proof that SO(3) = Rot(R^3).

Notice that the matrix Q defined above may be chosen to be a rotation. Therefore, the above argument gives another result.

Proposition 9.11. The matrix of a rotation σ ∈ SO(3) is similar, via another rotation Q, to a matrix of the form

    [ cos θ  −sin θ  0 ]
    [ sin θ   cos θ  0 ]
    [   0       0    1 ].

We also get a surprising conclusion.

Corollary 9.12. The composition of two rotations of R^3 is another rotation.

Proof. This is clear, since the product of two elements of SO(3) is another element of SO(3). Indeed, SO(3) = SL(3, R) ∩ O(3, R); by the product theorem for determinants, the product of two elements of SL(3, R) is another element of SL(3, R), and we also know that the product of two elements of O(3, R) is also in O(3, R).
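Formula (9.12) also gives a concrete way to build rotations and to test membership in SO(3). A small sketch (numpy assumed; the axis and angle are arbitrary choices, not taken from the text):

```python
import numpy as np

theta = 0.7
axis = np.array([1., 2., 2.]) / 3.0                   # unit vector spanning the fixed line L

# Complete the axis to an ONB (u1, u2, axis) of R^3 and apply (9.12).
u1 = np.array([2., -1., 0.]) / np.sqrt(5.)            # unit vector orthogonal to the axis
u2 = np.cross(axis, u1)
Q = np.column_stack([u1, u2, axis])

R_theta = np.array([[np.cos(theta), -np.sin(theta), 0.],
                    [np.sin(theta),  np.cos(theta), 0.],
                    [0.,             0.,            1.]])
sigma = Q @ R_theta @ Q.T                              # rotation through theta about the chosen axis

print(np.allclose(sigma.T @ sigma, np.eye(3)))         # sigma is orthogonal
print(np.isclose(np.linalg.det(sigma), 1.0))           # det = +1, so sigma lies in SO(3)
print(np.allclose(sigma @ axis, axis))                 # the axis is fixed pointwise
```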

9.4.2 Rotation groups

We begin with a definition.

Definition 9.3. Let S be a solid in R^3. The rotation group of S is defined to be the set of all σ ∈ SO(3) such that σ(S) = S. We denote the rotation group of S by Rot(S).

Proposition 9.13. Let S be a solid in R^3. If σ and τ are rotations of S, then so are στ and σ^{-1}.

Proof. Clearly σ^{-1} ∈ SO(3). By Corollary 9.12, στ ∈ SO(3) as well. It's also clear that στ(S) = S, so the proof is finished.

Example 9.7. We can now determine the group of rotations of a cube. Let S denote, for example, the cube with vertices at the points (A, B, C), where A, B, C = ±1. Every rotation of R^3 which maps S to itself maps each one of its six faces to another face, and, for any two faces, there is a rotation which maps one to the other. Moreover, there are 4 rotations which map any given face to itself. It follows from Proposition 9.13 that there have to be at least 24 rotations of S. Now consider the 4 diagonals of S, i.e. the segments which join a vertex (A, B, C) to (−A, −B, −C). Every rotation of S permutes these segments. Moreover, if two rotations define the same permutation of the diagonals, they coincide (why?). Since the number of permutations of 4 objects is 4! = 24, it follows that Rot(S) has exactly 24 elements, and these 24 elements realize the 24 permutations of the diagonals of S.

Example 9.8. Consider the set consisting of the midpoints of the 6 faces of the cube S. The solid polyhedron S′ determined by these 6 points is called the regular octahedron. It is a solid with 8 triangular faces, all congruent to each other. The cube and the regular octahedron are two of the 5 Platonic solids, which we will consider in a later chapter. Since each element of Rot(S) must send each midpoint to another midpoint, it follows that Rot(S) ⊆ Rot(S′). But the other containment clearly also holds, so we deduce that Rot(S) = Rot(S′).

9.4.3 Reflections of R^3

We now know that rotations of R^3 are characterized by the property that their determinants are +1, and we know that the determinant of any element of O(3, R) is ±1. Hence every element of O(3, R) that isn't a rotation has determinant −1. We also know that every orthogonal 2 × 2 matrix is either a rotation or a reflection: a rotation when the determinant is +1 and a reflection when the determinant is −1. A natural question is whether this is also true in O(3, R).

It turns out that the determinant of a reflection of R^3 is indeed −1. This is due to the fact that a reflection leaves a plane pointwise fixed and maps every vector orthogonal to the plane to its negative. Thus, for a reflection, dim E_1 = 2 and dim E_{−1} = 1, so the determinant is −1. It turns out, however, that there exist elements σ ∈ O(3, R) with det(σ) = −1 which are not reflections. For example, such a σ has eigenvalues −1, λ, λ̄. It is left as an easy exercise to describe how such a σ acts on R^3. As to reflections, we have the following fact.

Proposition 9.14. An element Q ∈ O(3, R) is a reflection if and only if Q is symmetric and det(Q) = −1.

We leave the proof as an exercise. It is useful to recall that a reflection can be expressed as I_3 − 2P_L, where P_L is the projection on the line L orthogonal to the plane E_1 of the reflection.

One final comment is that every reflection of R^2 actually defines a rotation of R^3. For if σ reflects R^2 through a line L, then the rotation ρ of R^3 through π with L as the axis of rotation acts the same way as σ on R^2, hence the claim. Note that the eigenvalues of ρ are 1, −1, −1; that is, −1 occurs with multiplicity two.

Remark: The term "group", as in rotation group, will be defined in a later chapter. In essence, a group is a set that has a structure like that of a rotation group. In particular, elements can be multiplied and every element has an inverse.

Exercises

Exercise 9.37. Prove Proposition 9.14.

Exercise 9.38. Let S be a regular tetrahedron in R^3, that is, a solid whose 4 faces are congruent equilateral triangles. How many elements does Sym(S) have?

Exercise 9.39. Compute Rot(S) in the following cases:
(a) S is the half ball {x^2 + y^2 + z^2 ≤ 1, z ≥ 0}; and
(b) S is the solid rectangular box {−1 ≤ x ≤ 1, −2 ≤ y ≤ 2, −1 ≤ z ≤ 1}.

239 is left as an easy exercise to describe how σ acts on R 3. As to reflections, we have the following fact. Proposition 9.14. An element Q O(3, R) is a reflection if and only if Q is symmetric and det(q) = 1. We leave the proof as an exercise. It is useful to recall a reflection can be expressed as I 3 2P L,whereP L is the projection on the line L orthogonal to the plane E 1 of the reflection. One final comment is that every reflection of R 2 actually defines a rotation of R 3.Forifσ reflects R 2 through a line L, the rotation ρ of R 3 through π with L as the axis of rotation acts the same way as σ on R 2, hence the claim. Note: the eigenvalues of ρ are 1, 1, 1, that is 1 occurs with multiplicity two. Remark: The term group as in rotation group will be defined in Chapter??. In essence, a group is a set that has a structure like that of a rotation group. In particular, elements can be multiplied and every element has an inverse. Exercises Exercise 9.37. Prove Proposition 9.14. Exercise 9.38. Let S be a regular quadrilateral in R 3,thatisS has 2 faces made up of congruent triangles. How many elements does Sym(S) have? Exercise 9.39. Compute Rot(S) in the following cases: (a) S is the half ball x 2 + y 2 + z 2 1,z 0}, and (b) S is the solid rectangle { 1 x 1, 2 y 2, 1 z 1}.