
MTH 2032 Semester II, 2010-11
Linear Algebra Worked Examples

Dr. Tony Yee
Department of Mathematics and Information Technology
The Hong Kong Institute of Education

December 28, 2011


Contents

1  Systems of Linear Equations
2  Matrix Algebra
3  Eigenvalues and Eigenvectors
4  Vector Spaces
5  Orthogonality

Chapter 5  Orthogonality

Example 5.1 (Dot product)
Find the dot product of x = (2, 2, 1) and y = (2, 5, -3).

Solution  The dot product (or inner product or scalar product) of x and y is given by

    x · y = (2, 2, 1) · (2, 5, -3) = 2·2 + 2·5 + 1·(-3) = 11.

That is, x · y is obtained by multiplying corresponding components and adding the resulting products. The vectors x and y are said to be orthogonal (or perpendicular) if their dot product is zero, that is, if x · y = 0. Therefore, for this example, the two given vectors x and y are not orthogonal.

Example 5.2 (Norm)
For x = (1, -2, 3, -4, 5), find the norm ‖x‖.

Solution  The norm of x is given by

    ‖x‖ = √(x · x) = √(1² + (-2)² + 3² + (-4)² + 5²) = √55.

The norm (or length) of a vector x in R^n, denoted by ‖x‖, is defined to be the nonnegative square root of x · x. In particular, if x = (x1, x2, …, xn), then

    ‖x‖ = √(x1² + x2² + ⋯ + xn²).

That is, ‖x‖ is the square root of the sum of the squares of the components of x. Thus ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0.

Example 5.3 (Normalize a vector)
Find the rescaled vector x/‖x‖, where x = (2, 2, 1).

Solution  By finding the norm ‖x‖ = √(2² + 2² + 1²) = 3, we can normalize x to the following unit vector:

    x/‖x‖ = (1/3)(2, 2, 1) = (2/3, 2/3, 1/3).

Verify that the norm of the rescaled vector is √((2/3)² + (2/3)² + (1/3)²) = 1.

In general, a vector x is called a unit vector if ‖x‖ = 1 or, equivalently, if x · x = 1. For any nonzero vector x in R^n, the vector x̂ = (1/‖x‖)x = x/‖x‖ is the unique unit vector in the same direction as x. The process of finding x̂ from x is called normalizing x.
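The computations in Examples 5.1 to 5.3 are easy to check numerically. The following is a minimal NumPy sketch (mine, not part of the original notes) that reproduces the dot product, the norm, and the normalization above.

```python
import numpy as np

x = np.array([2.0, 2.0, 1.0])
y = np.array([2.0, 5.0, -3.0])

# Example 5.1: dot product; orthogonal means the dot product is zero
print(np.dot(x, y))                        # 11.0, so x and y are not orthogonal

# Example 5.2: norm as the square root of the sum of squared components
z = np.array([1.0, -2.0, 3.0, -4.0, 5.0])
print(np.dot(z, z), np.linalg.norm(z))     # 55.0 and sqrt(55) ~ 7.4162

# Example 5.3: normalizing a nonzero vector gives a unit vector
x_hat = x / np.linalg.norm(x)
print(x_hat)                               # [2/3, 2/3, 1/3]
print(np.linalg.norm(x_hat))               # 1.0
```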

Example 5.4 (Schwarz inequality)
Prove that |x · y| ≤ ‖x‖ ‖y‖.

Solution  For any real number t, we have

    0 ≤ (tx + y) · (tx + y) = t²(x · x) + 2t(x · y) + (y · y) = ‖x‖² t² + 2(x · y) t + ‖y‖².

Let a = ‖x‖², b = 2(x · y), c = ‖y‖². Then, for every value of t, at² + bt + c ≥ 0. This means that the quadratic polynomial cannot have two distinct real roots, which implies that the discriminant D = b² - 4ac ≤ 0. Equivalently, b² ≤ 4ac. Thus,

    4(x · y)² ≤ 4‖x‖² ‖y‖².

Dividing by 4 and taking the square root of both sides gives the inequality.

Remark. The angle θ between nonzero vectors x and y in R^n is defined by

    cos θ = (x · y) / (‖x‖ ‖y‖).

This definition is well defined since, by the Schwarz inequality,

    -1 ≤ (x · y) / (‖x‖ ‖y‖) ≤ 1.

Thus -1 ≤ cos θ ≤ 1, and so the angle exists and is unique. Note that if x · y = 0, then θ = π/2. This then agrees with our previous definition of orthogonality.

Example 5.5 (Angle between vectors)
Consider the vectors x = (2, 3, 5) and y = (1, -4, 3) in R^3. Find cos θ, where θ is the angle between them.

Solution  The angle θ between x and y is given by

    cos θ = (x · y) / (‖x‖ ‖y‖) = 5 / (√38 √26).

Note that θ is an acute angle, since cos θ is positive.

Example 5.6 (Minkowski's inequality)
Prove that ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Solution  By the Schwarz inequality and other properties of the dot product, we have

    ‖x + y‖² = (x + y) · (x + y) = (x · x) + 2(x · y) + (y · y) ≤ ‖x‖² + 2‖x‖‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)².

Taking the square root of both sides gives the inequality.

Remark. Minkowski's inequality is often known as the triangle inequality, because if we view x + y as the side of the triangle formed with sides x and y, then the inequality just says that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides.
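As a quick numerical sanity check (again a sketch of mine, not part of the original notes), the angle in Example 5.5 and both inequalities can be verified with NumPy:

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])
y = np.array([1.0, -4.0, 3.0])

# Example 5.5: cos(theta) = (x . y) / (||x|| ||y||) = 5 / (sqrt(38) sqrt(26))
cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(cos_theta, 5 / np.sqrt(38 * 26))                                 # both ~0.159

# Schwarz inequality |x . y| <= ||x|| ||y|| and the triangle inequality
print(abs(np.dot(x, y)) <= np.linalg.norm(x) * np.linalg.norm(y))      # True
print(np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y))  # True
```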

Example 5.7 (Orthogonal / Orthonormal set)
Determine whether the vectors u1 = (1, 3, 4), u2 = (-1, -1, 1) form an orthogonal set. If they do, further make them orthonormal.

Solution  Since u1 · u2 = -1 - 3 + 4 = 0, we know that the vectors u1 and u2 are orthogonal. In other words, u1, u2 form an orthogonal set. Then by the norms ‖u1‖ = √26, ‖u2‖ = √3, we further obtain an orthonormal set {v1, v2}, where

    v1 = u1/‖u1‖ = (1, 3, 4)/√26 = (1/√26, 3/√26, 4/√26),
    v2 = u2/‖u2‖ = (-1, -1, 1)/√3 = (-1/√3, -1/√3, 1/√3).

Remark. Generally speaking, vectors v1, v2, …, vk in R^n are said to form an orthogonal set of vectors if each pair of vectors is orthogonal, that is, vi · vj = 0 for i ≠ j. Stronger than that, vectors v1, v2, …, vk in R^n are said to form an orthonormal set of vectors if the vectors are unit vectors and they form an orthogonal set, that is,

    vi · vj = 0 if i ≠ j,   and   vi · vj = 1 if i = j.

Normalizing an orthogonal set refers to the process of multiplying each vector in the set by the reciprocal of its length in order to transform the set into an orthonormal set of vectors. Of course, we have assumed that there is no zero vector in the orthogonal set; otherwise division by zero would occur.

Example 5.8 (Orthogonal / Orthonormal set)
Determine all values of k so that the two vectors (1, 2, 3) and (k², 1, -k) are orthogonal.

Solution  Two vectors are orthogonal if and only if their dot product is zero. Therefore, by

    (1, 2, 3) · (k², 1, -k) = 0  ⟺  k² - 3k + 2 = 0  ⟺  (k - 1)(k - 2) = 0,

the two vectors are orthogonal for k = 1 or k = 2.

Remark. Now think about how to verify whether three or more vectors form an orthogonal set. For more than two vectors we must be more careful in the verification process. The vectors v1, v2, …, vk are said to form an orthogonal set if they are pairwise orthogonal (or mutually orthogonal), that is,

    v1 · v2 = 0, v1 · v3 = 0, …, v1 · vk = 0,
    v2 · v3 = 0, …, v2 · vk = 0,
    ⋮
    v_{k-1} · vk = 0.

We emphasize that among the vectors of an orthogonal set it is permissible that some vj are zero vectors. This is one important difference between an orthogonal set and an orthonormal set: in an orthonormal set all vectors are unit vectors, so none of them can be zero.
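Pairwise orthogonality of several vectors is exactly the kind of bookkeeping a few lines of code handle well. Below is a small sketch (mine, not from the original notes) that checks every pair, as the remark above describes:

```python
import numpy as np
from itertools import combinations

def is_orthogonal_set(vectors, tol=1e-12):
    """Return True if every pair of vectors has (numerically) zero dot product."""
    return all(abs(np.dot(u, v)) < tol for u, v in combinations(vectors, 2))

# Example 5.7: an orthogonal pair
print(is_orthogonal_set([np.array([1.0, 3.0, 4.0]),
                         np.array([-1.0, -1.0, 1.0])]))   # True

# Example 5.9 below: fails because v2 . v3 = -6
print(is_orthogonal_set([np.array([1.0, 2.0, 5.0]),
                         np.array([1.0, 2.0, -1.0]),
                         np.array([1.0, -3.0, 1.0])]))    # False
```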

Example 5.9 (Orthogonal / Orthonormal set)
Let v1 = (1, 2, 5), v2 = (1, 2, -1), v3 = (1, -3, 1). Do v1, v2, v3 form an orthogonal set? If yes, rescale the vectors to make them orthonormal.

Solution  It can be verified that

    v1 · v2 = 1 + 4 - 5 = 0,   v1 · v3 = 1 - 6 + 5 = 0,   v2 · v3 = 1 - 6 - 1 = -6.

Since v2 · v3 ≠ 0, the vectors v1, v2, v3 do not form an orthogonal set.

Example 5.10 (Orthogonal / Orthonormal set)
Let v1 = (1, 2, 1), v2 = (-1, 1, -1), v3 = (1, 0, -1). Do v1, v2, v3 form an orthogonal set? If yes, rescale the vectors to make them orthonormal.

Solution  It can be verified that

    v1 · v2 = -1 + 2 - 1 = 0,   v1 · v3 = 1 + 0 - 1 = 0,   v2 · v3 = -1 + 0 + 1 = 0.

Hence v1, v2, v3 form an orthogonal set. By the norms ‖v1‖ = √6, ‖v2‖ = √3, ‖v3‖ = √2, we obtain an orthonormal set consisting of the three unit vectors

    v1/‖v1‖ = (1/√6, 2/√6, 1/√6),   v2/‖v2‖ = (-1/√3, 1/√3, -1/√3),   v3/‖v3‖ = (1/√2, 0, -1/√2).

Remark. By a basis of R^n we mean a set of n vectors in R^n which are linearly independent. In addition, if the n vectors in R^n form an orthogonal set, then we call it an orthogonal basis of R^n. An orthogonal basis of R^n can always be normalized to form an orthonormal basis of R^n.

Example 5.11 (Orthogonal basis)
Show that the standard basis of R^n is orthonormal for every n.

Solution  We consider n = 3 only. Let {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the standard basis of R^3. It is clear that

    e1 · e2 = e1 · e3 = e2 · e3 = 0,   e1 · e1 = e2 · e2 = e3 · e3 = 1.

Namely, {e1, e2, e3} is an orthonormal basis of R^3. More generally, the standard basis of R^n is orthonormal for every n.

Example 5.12 (Orthogonal basis)
Show that v1 = (1, 3, -1), v2 = (1, -1, -2) are orthogonal. Find a third vector v3 so that v1, v2, v3 form an orthogonal basis of R^3.

Solution  Verify that v1 · v2 = 1 - 3 + 2 = 0, so v1 and v2 are orthogonal. We need an additional vector v3 = (x, y, z) such that v3 is orthogonal to both v1 and v2, that is, v3 · v1 = 0 and v3 · v2 = 0. This yields the homogeneous system

    x + 3y - z = 0,
    x - y - 2z = 0.

Let A be the coefficient matrix of the system. By doing row operations,

    A = [ 1   3  -1 ]   ~   [ 1  0  -7/4 ]
        [ 1  -1  -2 ]       [ 0  1   1/4 ].

Here only the third column is nonpivot and z is the only free variable. The general solution is given by

    (x, y, z) = (7z/4, -z/4, z) = (z/4)(7, -1, 4).

Thus we may take v3 = (7, -1, 4). Now, as we want, the three (nonzero) vectors v1, v2, v3 form an orthogonal set. Furthermore, it follows from Theorem 10.2.1 (Lecture Notes, page 191) that v1, v2, v3 in R^3 are indeed linearly independent. Accordingly, v1, v2, v3 form an orthogonal basis of R^3.

Remark. We may interpret Theorem 10.2.1 (Lecture Notes, page 191) as follows.

    Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.   (5.1)

Proof. Suppose S = {v1, v2, …, vk} and suppose

    c1 v1 + c2 v2 + ⋯ + ck vk = 0.   (5.2)

Taking the dot product of (5.2) with v1, we get

    0 = 0 · v1 = (c1 v1 + c2 v2 + ⋯ + ck vk) · v1
      = c1 (v1 · v1) + c2 (v2 · v1) + ⋯ + ck (vk · v1)
      = c1 (v1 · v1) + c2 · 0 + ⋯ + ck · 0
      = c1 (v1 · v1).

Since v1 ≠ 0, we have v1 · v1 ≠ 0, so c1 = 0. Similarly, for i = 2, …, k, taking the dot product of (5.2) with vi,

    0 = 0 · vi = (c1 v1 + c2 v2 + ⋯ + ck vk) · vi = c1 (v1 · vi) + ⋯ + ci (vi · vi) + ⋯ + ck (vk · vi) = ci (vi · vi).

But vi · vi ≠ 0, and hence ci = 0. Thus S is linearly independent.

Example 5.13 (Orthogonal basis)
Show that the columns of the following matrix U form an orthogonal basis of R^3. Hence find U^(-1) and express (1, 2, 3) as a linear combination of the columns of U.

    U = [ 1   2  -2 ]
        [ 0   1   5 ]
        [ 2  -1   1 ]

Solution  By direct multiplication of matrices, we have

    U^t U = [ 1   0   2 ] [ 1   2  -2 ]     [ 5  0   0 ]
            [ 2   1  -1 ] [ 0   1   5 ]  =  [ 0  6   0 ],
            [-2   5   1 ] [ 2  -1   1 ]     [ 0  0  30 ]

and we denote the last diagonal matrix by D := diag(5, 6, 30). That is, U^t U = D. Now we recall the fact

    U^t U is diagonal  ⟺  the columns of U are orthogonal,   (5.3)

so the three columns of U form an orthogonal set. Since none of the columns is zero, by (5.1) the three columns of U in R^3 are linearly independent and hence they form an orthogonal basis of R^3. Since det D ≠ 0, D is invertible; then D^(-1) U^t U = I implies that U is invertible and its inverse is

    U^(-1) = D^(-1) U^t = [ 1/5   0    0   ] [ 1   0   2 ]     [  1/5    0     2/5  ]
                          [ 0    1/6   0   ] [ 2   1  -1 ]  =  [  1/3   1/6   -1/6  ].
                          [ 0     0   1/30 ] [-2   5   1 ]     [ -1/15  1/6    1/30 ]

Let v1 = (1, 0, 2), v2 = (2, 1, -1), v3 = (-2, 5, 1) be the three columns of U. That is, U = [v1 v2 v3]. By Theorem 10.2.1 (Lecture Notes, page 191), since v1, v2, v3 form an orthogonal basis of R^3, we can express x = (1, 2, 3) as a linear combination of v1, v2, v3. We first find

    x · v1 = 7,   x · v2 = 1,   x · v3 = 11,   v1 · v1 = 5,   v2 · v2 = 6,   v3 · v3 = 30.

Therefore,

    x = (x · v1)/(v1 · v1) v1 + (x · v2)/(v2 · v2) v2 + (x · v3)/(v3 · v3) v3 = (7/5) v1 + (1/6) v2 + (11/30) v3.

Remark. (i) We note that the matrix U in (5.3) could be non-square and even some columns of U could be zero vectors. However, if all columns of U are unit vectors, we have a stronger version of (5.3):

    U^t U = I (the identity matrix)  ⟺  the columns of U are orthonormal.   (5.4)

In fact, (5.3) (resp. (5.4)) can be used to verify whether the columns of a given matrix U form an orthogonal set (resp. an orthonormal set). In the case of a square matrix U none of whose columns is zero, one can further verify, by (5.1), whether the columns of U form an orthogonal basis (resp. an orthonormal basis) of R^n. In this case, since U is square,

    U^t U = I  ⟺  U U^t = I  ⟺  U is invertible and U^(-1) = U^t.

A square matrix U satisfying U^(-1) = U^t is called an orthogonal matrix.

(ii) By Theorem 10.2.1 (Lecture Notes, page 191), if we write x = c1 v1 + c2 v2 + ⋯ + ck vk, the scalars c1, c2, …, ck can be formally determined provided that v1, v2, …, vk are nonzero and form an orthogonal set. In Example 5.13 the three columns of U satisfy this requirement, since they form an orthogonal basis of R^3, as indeed the question required us to show.
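The whole of Example 5.13 can be replayed numerically. The sketch below (mine, not part of the notes) verifies that U^t U is diagonal, builds U^(-1) = D^(-1) U^t, and recovers the expansion coefficients 7/5, 1/6, 11/30:

```python
import numpy as np

U = np.array([[1.0,  2.0, -2.0],
              [0.0,  1.0,  5.0],
              [2.0, -1.0,  1.0]])
x = np.array([1.0, 2.0, 3.0])

D = U.T @ U                                # diagonal => columns of U are orthogonal
print(np.diag(D))                          # [ 5.  6. 30.]

U_inv = np.diag(1.0 / np.diag(D)) @ U.T    # U^(-1) = D^(-1) U^t
print(np.allclose(U_inv @ U, np.eye(3)))   # True

# Coefficients of x in the orthogonal basis given by the columns of U
coeffs = (U.T @ x) / np.diag(D)            # (x . v_i) / (v_i . v_i)
print(coeffs)                              # [1.4, 0.1667, 0.3667] = 7/5, 1/6, 11/30
```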

Example 5.14 (Orthogonal basis)
Show that the columns of the following matrix U form an orthogonal basis of R^4. Then express (1, 0, 0, 0) and (0, 1, 2, 3) as linear combinations of the columns of U.

    U = [ 1  -1  -1   2 ]
        [ 1   1  -2  -1 ]
        [ 1   1   2   1 ]
        [ 1  -1   1  -2 ]

Solution  The columns of U are orthogonal because

    U^t U = [ 1   1   1   1 ] [ 1  -1  -1   2 ]     [ 4  0   0   0 ]
            [-1   1   1  -1 ] [ 1   1  -2  -1 ]  =  [ 0  4   0   0 ]
            [-1  -2   2   1 ] [ 1   1   2   1 ]     [ 0  0  10   0 ]
            [ 2  -1   1  -2 ] [ 1  -1   1  -2 ]     [ 0  0   0  10 ]

is a diagonal matrix. The four nonzero columns of U therefore form an orthogonal basis of R^4. Let x = (1, 0, 0, 0), y = (0, 1, 2, 3) and write U = [u1 u2 u3 u4], where uj is the j-th column of U. Then

    x = (x · u1)/(u1 · u1) u1 + (x · u2)/(u2 · u2) u2 + (x · u3)/(u3 · u3) u3 + (x · u4)/(u4 · u4) u4
      = (1/4) u1 - (1/4) u2 - (1/10) u3 + (1/5) u4,

    y = (y · u1)/(u1 · u1) u1 + (y · u2)/(u2 · u2) u2 + (y · u3)/(u3 · u3) u3 + (y · u4)/(u4 · u4) u4
      = (3/2) u1 + 0 u2 + (1/2) u3 - (1/2) u4.

Remark. As we mentioned in the previous remark, if U is an n × n real matrix, then the following are equivalent:
1. U is an orthogonal matrix.
2. The columns of U form an orthonormal basis of R^n.
3. U^t U = I.
4. U is invertible, and U^(-1) = U^t.
In fact, since U is square, we further have

    U is an orthogonal matrix  ⟺  the columns of U form an orthonormal basis of R^n  ⟺  U^t U = I
    ⟺  U U^t = I  ⟺  the rows of U form an orthonormal basis of R^n.

In the above, U^t U = I ⟺ U U^t = I follows from the fact that for any square matrices A, B, AB = I implies BA = I (Review Notes for Linear Algebra True or False, 5.12). Then

    U U^t = I  ⟺  (U^t)^t U^t = I  ⟺  the columns of U^t form an orthonormal basis of R^n
    ⟺  the rows of U form an orthonormal basis of R^n.

So next time when you see the keyword "orthogonal matrix", you may recall any of the above equivalent statements if necessary.

Example 5.15 (Orthogonal matrix)
(a) Let

    P = [ 1/√3   1/√3   1/√3 ]
        [ 0      1/√2  -1/√2 ]
        [ 2/√6  -1/√6  -1/√6 ].

The columns (as well as the rows) of P are orthogonal to each other and are unit vectors. Thus P is an orthogonal matrix.

(b) Let P be a 2 × 2 orthogonal matrix. Then, for some real number θ, we have

    P = [ cos θ  -sin θ ]    or    P = [ cos θ   sin θ ]
        [ sin θ   cos θ ]              [ sin θ  -cos θ ].

Example 5.16 (Orthogonal matrix)
Prove that the product of two orthogonal matrices is again orthogonal.

Solution  Suppose P and Q are orthogonal matrices. It follows that P^(-1) = P^t and Q^(-1) = Q^t. Then

    (PQ)^(-1) = Q^(-1) P^(-1) = Q^t P^t = (PQ)^t,

which implies that PQ is again orthogonal.

Example 5.17 (Orthogonal matrix)
Prove that the determinant of an orthogonal matrix is ±1.

Solution  Suppose P is an orthogonal matrix. Then P^(-1) = P^t. It follows that

    1 = det I = det(P P^(-1)) = det(P P^t) = det P · det P^t.

But recall that det P^t = det P, and hence (det P)² = 1. Therefore, det P = ±1.

Example 5.18 (Orthogonal matrix)
Let U be a square matrix. Show that if the columns of U are orthonormal, then the rows of U are also orthonormal. Give an example of a square matrix A such that the columns of A are orthogonal, but the rows of A are not.

Solution  By (5.4), if the columns of U are orthonormal, then U^t U = I. Given that U is a square matrix, U is indeed an orthogonal matrix and U^(-1) = U^t. It follows that

    (U^t)^t (U^t) = U U^t = U U^(-1) = I.

This implies that the columns of U^t are orthonormal. Equivalently, the rows of U are orthonormal.

Take

    A = [ 2   1/2 ]
        [ 1  -1   ].

Then the columns of A are orthogonal but the rows are not.
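A quick numerical check of the orthogonal-matrix facts above (a sketch of mine, not part of the notes): P^t P = I, det P = ±1, and a product of orthogonal matrices is orthogonal.

```python
import numpy as np

# P from Example 5.15(a): orthonormal columns and rows
P = np.array([[1/np.sqrt(3),  1/np.sqrt(3),  1/np.sqrt(3)],
              [0.0,           1/np.sqrt(2), -1/np.sqrt(2)],
              [2/np.sqrt(6), -1/np.sqrt(6), -1/np.sqrt(6)]])

print(np.allclose(P.T @ P, np.eye(3)))   # True: P is orthogonal
print(np.allclose(P @ P.T, np.eye(3)))   # True: rows are orthonormal too
print(round(np.linalg.det(P), 6))        # +1 or -1 (Example 5.17)

# Example 5.16: a product of orthogonal matrices is orthogonal
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
print(np.allclose((P @ Q).T @ (P @ Q), np.eye(3)))   # True
```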

Example 5.19 (Orthogonal projection)
Find the orthogonal projection of x = (4, 4, 3) onto the line spanned by v = (5, 1, 2). Then find the distance from x to the line.

Solution  Let S be the subspace (line) of R^3 spanned by v. That is, S = span(v) = {kv : k any number}. Then the orthogonal projection of x onto S is

    proj_S x = (x · v)/(v · v) v = (30/30) v = (5, 1, 2).

Since proj_S x ≠ x, we see that x is not in the span S. The distance from x to S is

    dist(x, S) = ‖x - proj_S x‖ = ‖(4, 4, 3) - (5, 1, 2)‖ = ‖(-1, 3, 1)‖ = √11.

Remark. (i) Recall from Theorem 10.3.1 (Lecture Notes, page 195) that the orthogonal projection of x onto a subspace S of R^n can be formally determined provided that an orthogonal basis of S is known. For questions in which the subspace S is given as a span of some vectors (say v1, v2, …, vk), you are indeed required to prove that the vectors v1, v2, …, vk are orthogonal (and none of them zero), so that they are linearly independent by (5.1) and hence form an orthogonal basis for S. In general, if you are required to find the orthogonal projection of a vector x onto some subspace of R^n (for example, the null space, row space, or column space of some matrix), again you should first find an orthogonal basis of that subspace and then use Theorem 10.3.1 (Lecture Notes, page 195) to construct the orthogonal projection.

(ii) If proj_S x ≠ x, then x ∉ S. Writing y = proj_S x and z = x - y, the vector z is the orthogonal projection of x onto the orthogonal complement of S, and the (shortest) distance from x to S is given by its norm. That is,

    dist(x, S) = ‖z‖ = ‖x - y‖ = ‖x - proj_S x‖.

(iii) We have just mentioned the keyword "orthogonal complement", so it is better to give a formal definition of this concept. Let S be a subspace of R^n. The orthogonal complement of S, denoted by S⊥, consists of those vectors in R^n that are orthogonal to every vector y in S, that is,

    S⊥ = {z ∈ R^n : z · y = 0 for every y ∈ S}.

For example, suppose u is the orthogonal projection of v onto W; then what is the orthogonal projection of -v onto W⊥: v + u, v - u, -v - u or -v + u? In fact, the projection of -v onto W is -u. Therefore the projection of -v onto W⊥ is (-v) - (-u) = u - v.

Example 5.20 (Orthogonal projection)
Find the orthogonal projection of x = (1, -2, 7) onto the plane spanned by the orthogonal vectors

    v = (1, 1, 1),   w = (1, -2, 1).

Then find the distance from x to the plane.

Solution  By v · w = 1 - 2 + 1 = 0, we know that v and w are indeed orthogonal. Let S be the subspace of R^3 spanned by v and w. Since both v and w are nonzero, by (5.1) they are linearly independent and hence they form an orthogonal basis for S. That is, S = span(v, w) is a plane in R^3. By Theorem 10.3.1 (Lecture Notes, page 195), the orthogonal projection of x onto S is

    proj_S x = (x · v)/(v · v) v + (x · w)/(w · w) w = 2v + 2w = (4, -2, 4).

Since proj_S x ≠ x, we see that x is not in the plane S. The distance from x to the plane S is

    dist(x, S) = ‖x - proj_S x‖ = ‖(1, -2, 7) - (4, -2, 4)‖ = ‖(-3, 0, 3)‖ = 3√2.
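Projection onto a span with a known orthogonal basis is a one-line sum of component projections. Here is a small sketch (mine, not the notes') that reproduces Examples 5.19 and 5.20:

```python
import numpy as np

def project_onto_orthogonal_basis(x, basis):
    """Orthogonal projection of x onto span(basis); the basis vectors must be
    pairwise orthogonal and nonzero, as Theorem 10.3.1 requires."""
    return sum((np.dot(x, v) / np.dot(v, v)) * v for v in basis)

# Example 5.19: projection onto a line
x = np.array([4.0, 4.0, 3.0]); v = np.array([5.0, 1.0, 2.0])
p = project_onto_orthogonal_basis(x, [v])
print(p, np.linalg.norm(x - p))            # [5. 1. 2.]  and sqrt(11) ~ 3.3166

# Example 5.20: projection onto a plane spanned by two orthogonal vectors
x = np.array([1.0, -2.0, 7.0])
p = project_onto_orthogonal_basis(x, [np.array([1.0, 1.0, 1.0]),
                                      np.array([1.0, -2.0, 1.0])])
print(p, np.linalg.norm(x - p))            # [ 4. -2.  4.]  and 3*sqrt(2) ~ 4.2426
```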

Example 5.21 (Orthogonal projection)
Find the orthogonal projection of x = (2, 1, 3, -2) onto the plane spanned by the orthogonal vectors

    v = (1, 1, 1, 1),   w = (2, -1, 1, -2).

Then find the distance from x to the plane.

Solution  By v · w = 2 - 1 + 1 - 2 = 0, we know that v and w are indeed orthogonal. Let S be the subspace of R^4 spanned by v and w. Since both v and w are nonzero, by (5.1) they are linearly independent and hence they form an orthogonal basis for S. That is, S = span(v, w) is a plane in R^4. By Theorem 10.3.1 (Lecture Notes, page 195), the orthogonal projection of x onto S is

    proj_S x = (x · v)/(v · v) v + (x · w)/(w · w) w = (4/4) v + (10/10) w = (3, 0, 2, -1).

Since proj_S x ≠ x, we see that x is not in the plane S. The distance from x to the plane S is

    dist(x, S) = ‖x - proj_S x‖ = ‖(2, 1, 3, -2) - (3, 0, 2, -1)‖ = ‖(-1, 1, 1, -1)‖ = 2.

Example 5.22 (Orthogonal projection)
Find the orthogonal projection of x = (3, 4, -2) onto the subspace of R^3 which has the orthonormal basis

    v1 = (1/3)(2, 1, 2),   v2 = (1/√18)(-1, 4, -1).

Then find the distance from x to this subspace of R^3.

Solution  Let S be the subspace of R^3 spanned by v1 and v2. Recall from Theorem 10.3.1 (Lecture Notes, page 195) that we need an orthogonal basis for S before we can write down the orthogonal projection. For this example we are given an orthonormal basis, which is in particular an orthogonal basis. For easier hand calculations, we may take

    u1 = (2, 1, 2),   u2 = (-1, 4, -1)

as the orthogonal basis for S. That is, S = span(u1, u2) is a plane in R^3. We then use {u1, u2} to construct the orthogonal projection. The orthogonal projection of x onto S is

    proj_S x = (x · u1)/(u1 · u1) u1 + (x · u2)/(u2 · u2) u2 = (6/9) u1 + (15/18) u2 = (1/2, 4, 1/2).

Since proj_S x ≠ x, we see that x is not in the span S. The distance from x to S is

    dist(x, S) = ‖x - proj_S x‖ = ‖(3, 4, -2) - (1/2, 4, 1/2)‖ = √(25/2) = 5√2/2.

Remark. In Examples 5.19 to 5.22 each subspace S of R^n is given as a span of vectors in R^n which are already pairwise orthogonal. In general, however, the given vectors need not be orthogonal, and you need to orthogonalize them first to make them usable. We will see an example of this kind later (Example 5.27).

Example 5.23 (Orthogonal diagonalization)
Orthogonally diagonalize the symmetric matrix

    A = [ 1   0   1 ]
        [ 0   1  -1 ]
        [ 1  -1   2 ].

That is, find an orthogonal matrix Q and a diagonal matrix D so that Q^(-1) A Q = D.

Solution  Recall the fact that all real symmetric matrices are diagonalizable. Hence A is diagonalizable and has a diagonalization A = P D P^(-1) for some invertible P and diagonal D. Normally, P is constructed as a column-partitioned matrix with eigenvectors of A as its columns, that is, P = [v1 v2 v3], where v1, v2, v3 are eigenvectors of A. Now back to our problem: we need to orthogonally diagonalize A. The keyword here is "orthogonally", so we need to find an invertible matrix that is also an orthogonal matrix. Here we use Q (instead of P) to denote this matrix, to stress its orthogonality. We therefore need to guarantee that the eigenvectors v1, v2, v3 of A form an orthonormal set (pairwise orthogonal unit vectors).

The characteristic equation of A is det(A - λI) = 0, or

    det [ 1-λ    0     1  ]
        [  0    1-λ   -1  ]  =  λ(1 - λ)(λ - 3) = 0,
        [  1    -1    2-λ ]

which gives the distinct real eigenvalues λ1 = 0, λ2 = 1, λ3 = 3.

For λ1 = 0,

    A - 0I = [ 1   0   1 ]      [ 1   0   1 ]
             [ 0   1  -1 ]  ~   [ 0   1  -1 ]
             [ 1  -1   2 ]      [ 0   0   0 ].

By solving (A - 0I)x = 0, x = (x1, x2, x3), we have x1 = -x3 and x2 = x3. Thus

    x = (-x3, x3, x3) = x3 (-1, 1, 1),

and we get the eigenvector v1 = (-1, 1, 1).

For λ2 = 1,

    A - 1I = [ 0   0   1 ]      [ 1  -1   0 ]
             [ 0   0  -1 ]  ~   [ 0   0   1 ]
             [ 1  -1   1 ]      [ 0   0   0 ].

By solving (A - 1I)x = 0, x = (x1, x2, x3), we have x1 = x2 and x3 = 0. Thus

    x = (x2, x2, 0) = x2 (1, 1, 0),

and we get the eigenvector v2 = (1, 1, 0).

For λ3 = 3,

    A - 3I = [ -2   0   1 ]      [ 1   0  -1/2 ]
             [  0  -2  -1 ]  ~   [ 0   1   1/2 ]
             [  1  -1  -1 ]      [ 0   0   0   ].

By solving (A - 3I)x = 0, x = (x1, x2, x3), we have x1 = x3/2 and x2 = -x3/2. Thus

    x = (x3/2, -x3/2, x3) = (x3/2)(1, -1, 2),

and we get the eigenvector v3 = (1, -1, 2).

Verify that

    v1 · v2 = -1 + 1 + 0 = 0,   v1 · v3 = -1 - 1 + 2 = 0,   v2 · v3 = 1 - 1 + 0 = 0.

Luckily, v1, v2, v3 are pairwise orthogonal and hence they form an orthogonal set. Since eigenvectors must be nonzero, v1, v2, v3 are linearly independent by (5.1) and hence they form an orthogonal basis of R^3. Obviously, v1, v2, v3 are not unit vectors, so we need to normalize them first. By ‖v1‖ = √3, ‖v2‖ = √2, ‖v3‖ = √6, we further obtain an orthonormal basis of R^3 consisting of

    x1 = v1/‖v1‖ = (1/√3)(-1, 1, 1),   x2 = v2/‖v2‖ = (1/√2)(1, 1, 0),   x3 = v3/‖v3‖ = (1/√6)(1, -1, 2).

Thus, if we construct Q = [x1 x2 x3], we can use it to diagonalize A such that Q^(-1) A Q = D, where

    Q = [ -1/√3   1/√2   1/√6 ]         D = [ 0  0  0 ]
        [  1/√3   1/√2  -1/√6 ],            [ 0  1  0 ]
        [  1/√3   0      2/√6 ]             [ 0  0  3 ].

Here we emphasize that Q is an orthogonal matrix, (1) which satisfies Q^(-1) = Q^t, and (2) whose columns form an orthonormal basis of R^3. Please review the equivalent statements in the remark following Example 5.14.

Remark. In Example 5.23 the three eigenvectors of A are already orthogonal by our construction. However, in many cases the eigenvectors are not orthogonal, and we then need an algorithm to convert them into orthogonal vectors. One method for this purpose is the Gram-Schmidt orthogonalization process. We illustrate the details in the following.

Suppose {u1, u2, …, un} is a basis of a subspace V. One can use this basis to construct an orthogonal basis {v1, v2, …, vn} of V as follows. Set

    v1 = u1,
    v2 = u2 - (u2 · v1)/(v1 · v1) v1,
    v3 = u3 - (u3 · v1)/(v1 · v1) v1 - (u3 · v2)/(v2 · v2) v2,
    ⋮
    vn = un - (un · v1)/(v1 · v1) v1 - (un · v2)/(v2 · v2) v2 - ⋯ - (un · v_{n-1})/(v_{n-1} · v_{n-1}) v_{n-1}.

In other words, for k = 2, 3, …, n, we define

    vk = uk - c_{k1} v1 - c_{k2} v2 - ⋯ - c_{k,k-1} v_{k-1},   where   c_{ki} = (uk · vi)/(vi · vi)

is the component of uk along vi. In fact, each vk is orthogonal to the preceding v's. Thus v1, v2, …, vn form an orthogonal basis for V as claimed. Normalizing each vk will then yield an orthonormal basis for V. The above construction is the so-called Gram-Schmidt process. We have some remarks on this process.

(1) Each vector vk is a linear combination of uk and the preceding v's. Hence one can easily show, by induction, that each vk is a linear combination of u1, u2, …, uk. This accounts for span(v1, v2, …, vn) = span(u1, u2, …, un).

(2) Since taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new vk, by multiplying vk by an appropriate scalar, before obtaining the next v_{k+1}.

(3) Suppose w1, w2, …, wm are linearly independent, so that they form a basis for W = span(w1, w2, …, wm). Applying the Gram-Schmidt process to the w's yields an orthogonal basis for W.
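The Gram-Schmidt recursion above translates almost line by line into code. The following is a minimal sketch (mine, not part of the notes), without the hand-calculation trick of clearing fractions:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal list v1, ..., vn with the same span as the input,
    obtained by subtracting from each u_k its components along v1, ..., v_{k-1}."""
    ortho = []
    for u in vectors:
        v = np.array(u, dtype=float)
        for w in ortho:
            v = v - (np.dot(v, w) / np.dot(w, w)) * w   # remove component along w
        ortho.append(v)
    return ortho

# Example 5.24 below: u1 = (1, 0, 2), u2 = (1, 3, 7)  ->  v1 = (1, 0, 2), v2 = (-2, 3, 1)
print(gram_schmidt([[1, 0, 2], [1, 3, 7]]))

# Example 5.25 below: normalizing afterwards gives an orthonormal basis
basis = gram_schmidt([[0, 2, 1, 0], [1, -1, 0, 0], [1, 2, 0, -1]])
print([v / np.linalg.norm(v) for v in basis])
```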

Example 5.24 (Gram-Schmidt orthogonalization)
Apply the Gram-Schmidt process to the following vectors to produce an orthogonal set:

    u1 = (1, 0, 2),   u2 = (1, 3, 7).

Solution  Apply the Gram-Schmidt process:

    v1 = u1 = (1, 0, 2),
    v2 = u2 - (u2 · v1)/(v1 · v1) v1 = (1, 3, 7) - (15/5)(1, 0, 2) = (-2, 3, 1).

Now v1, v2 form an orthogonal set. Also, span(v1, v2) = span(u1, u2).

Example 5.25 (Gram-Schmidt orthogonalization)
Find an orthonormal basis for the subspace of R^4 spanned by

    u1 = (0, 2, 1, 0),   u2 = (1, -1, 0, 0),   u3 = (1, 2, 0, -1).

Solution  Let S be the subspace of R^4 spanned by u1, u2, u3. We find that u1, u2, u3 are linearly independent because all three columns of [u1 u2 u3] are pivot columns:

    [u1 u2 u3] = [ 0   1   1 ]      [ 1  0  0 ]
                 [ 2  -1   2 ]  ~   [ 0  1  0 ]
                 [ 1   0   0 ]      [ 0  0  1 ]
                 [ 0   0  -1 ]      [ 0  0  0 ].

Thus u1, u2, u3 indeed form a basis for S. However, u1, u2, u3 are not pairwise orthogonal (u1 · u2 ≠ 0). We then use the Gram-Schmidt process to obtain an orthogonal set {v1, v2, v3} from {u1, u2, u3}. We set v1 = u1 = (0, 2, 1, 0), and from

    u2 - (u2 · v1)/(v1 · v1) v1 = (1, -1, 0, 0) - (-2/5)(0, 2, 1, 0) = (1, -1/5, 2/5, 0) = (1/5)(5, -1, 2, 0),

we may take v2 = (5, -1, 2, 0), and from

    u3 - (u3 · v1)/(v1 · v1) v1 - (u3 · v2)/(v2 · v2) v2 = (1, 2, 0, -1) - (4/5)(0, 2, 1, 0) - (3/30)(5, -1, 2, 0) = (1/2, 1/2, -1, -1),

we may take v3 = (1, 1, -2, -2). Now the three vectors v1, v2, v3 form an orthogonal set. Since none of them is zero, v1, v2, v3 form an orthogonal basis for the subspace S such that S = span(u1, u2, u3) = span(v1, v2, v3). Finally, we normalize them to obtain an orthonormal basis for S, that is,

    { (0, 2/√5, 1/√5, 0),  (5/√30, -1/√30, 2/√30, 0),  (1/√10, 1/√10, -2/√10, -2/√10) }.

Example 5.26 (Gram-Schmidt orthogonalization)
Find an orthonormal set from u1 = (1, 2, 1), u2 = (1, 3, 1), u3 = (2, 2, 1).

Solution  We note that in some cases we do not even need the Gram-Schmidt process to generate an orthogonal set. For this example, u1, u2, u3 are linearly independent and form a basis of R^3. In particular, u1, u2, u3 span R^3. So one simple orthonormal set that spans R^3 is the standard basis, i.e., {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Example 5.27 (Gram-Schmidt orthogonalization)
Consider the subspace W spanned by

    u1 = (1, 0, 2, 0),   u2 = (1, 2, 2, 4),   u3 = (-1, 1, 3, 2).

Find the orthogonal projection of (1, 1, 0, 0) onto W. Then find the distance from the vector to W.

Solution  By row operations we may reduce the matrix [u1 u2 u3] and find that all columns are pivot columns, so u1, u2, u3 are linearly independent. Since they span W, they indeed form a basis for W. Apply the Gram-Schmidt process:

    v1 = u1 = (1, 0, 2, 0),
    v2 = u2 - (u2 · v1)/(v1 · v1) v1 = u2 - (5/5) v1 = (0, 2, 0, 4),
    v3 = u3 - (u3 · v1)/(v1 · v1) v1 - (u3 · v2)/(v2 · v2) v2 = u3 - (5/5) v1 - (10/20) v2 = (-2, 0, 1, 0).

Then W = span(v1, v2, v3). Let x = (1, 1, 0, 0). Then the orthogonal projection of x onto W is

    proj_W x = (x · v1)/(v1 · v1) v1 + (x · v2)/(v2 · v2) v2 + (x · v3)/(v3 · v3) v3
             = (1/5) v1 + (2/20) v2 + (-2/5) v3 = (1, 1/5, 0, 2/5).

The distance from x to W is

    dist(x, W) = ‖x - proj_W x‖ = ‖(1, 1, 0, 0) - (1, 1/5, 0, 2/5)‖ = ‖(0, 4/5, 0, -2/5)‖ = 2/√5.

Example 5.28 (Gram-Schmidt orthogonalization)
Verify that the vectors

    v1 = (1, 2, 1, 1),   v2 = (-1, 1, 0, -1)

are orthogonal to each other. Extend v1, v2 to an orthogonal basis of R^4.

Solution  Verify that v1 · v2 = -1 + 2 + 0 - 1 = 0, so v1 and v2 are orthogonal. We need two independent vectors u3, u4 which are orthogonal to v1 and v2. Let u = (x, y, z, w) be such that u · v1 = 0 and u · v2 = 0. This yields a homogeneous system. Let A be the coefficient matrix of the system. By doing row operations,

    A = [ 1   2   1   1 ]   ~   [ 1  0  1/3  1 ]
        [-1   1   0  -1 ]       [ 0  1  1/3  0 ].

Here columns 3 and 4 are nonpivot and hence z and w are free variables. The general solution is given by

    (x, y, z, w) = (-z/3 - w, -z/3, z, w) = (z/3)(-1, -1, 3, 0) + w(-1, 0, 0, 1).

Thus we may take u3 = (-1, -1, 3, 0), u4 = (-1, 0, 0, 1). Since u3 and u4 are not orthogonal (u3 · u4 ≠ 0), we use the Gram-Schmidt process to obtain

    v3 = u3 = (-1, -1, 3, 0),
    u4 - (u4 · v3)/(v3 · v3) v3 = (-1, 0, 0, 1) - (1/11)(-1, -1, 3, 0) = (1/11)(-10, 1, -3, 11),

and take v4 = (-10, 1, -3, 11). Then the vectors v1, v2, v3, v4 form an orthogonal basis of R^4.

Example 5.29 (Gram-Schmidt orthogonalization)
Orthogonally diagonalize the symmetric matrix

    A = [  0   2  -1 ]
        [  2   3  -2 ]
        [ -1  -2   0 ].

That is, find an orthogonal matrix Q and a diagonal matrix D so that Q^(-1) A Q = D.

Solution  The characteristic equation of A is det(A - λI) = 0, or

    det [ -λ    2    -1  ]
        [  2   3-λ   -2  ]  =  (5 - λ)(λ + 1)² = 0,
        [ -1   -2    -λ  ]

which gives the eigenvalues λ1 = 5 and λ2 = -1 (with λ2 a repeated eigenvalue).

For λ1 = 5,

    A - 5I = [ -5    2   -1 ]      [ 1   0   1 ]
             [  2   -2   -2 ]  ~   [ 0   1   2 ]
             [ -1   -2   -5 ]      [ 0   0   0 ].

By solving (A - 5I)x = 0, x = (x1, x2, x3), we have x1 = -x3 and x2 = -2x3. Thus

    x = (-x3, -2x3, x3) = x3 (-1, -2, 1),

and we get the eigenvector u1 = (-1, -2, 1).

For λ2 = -1,

    A - (-1)I = [  1   2  -1 ]      [ 1   2  -1 ]
                [  2   4  -2 ]  ~   [ 0   0   0 ]
                [ -1  -2   1 ]      [ 0   0   0 ].

By solving (A - (-1)I)x = 0, x = (x1, x2, x3), we have x1 = -2x2 + x3. Thus

    x = (-2x2 + x3, x2, x3) = x2 (-2, 1, 0) + x3 (1, 0, 1),

and we get the two eigenvectors u2 = (-2, 1, 0), u3 = (1, 0, 1). Verify that

    u1 · u2 = 2 - 2 + 0 = 0,   u1 · u3 = -1 + 0 + 1 = 0,   u2 · u3 = -2 + 0 + 0 = -2.

Since u2 · u3 ≠ 0, the vectors u1, u2, u3 are not orthogonal. We then use the Gram-Schmidt process on these last two vectors. Therefore we take

    v1 = u1 = (-1, -2, 1),   v2 = u2 = (-2, 1, 0),

and from

    u3 - (u3 · v2)/(v2 · v2) v2 = (1, 0, 1) - (-2/5)(-2, 1, 0) = (1/5, 2/5, 1) = (1/5)(1, 2, 5),

take v3 = (1, 2, 5). Now, as we want, the vectors v1, v2, v3 form an orthogonal set. Since none of them is zero, v1, v2, v3 are linearly independent by (5.1) and hence they form an orthogonal basis of R^3.

By ‖v1‖ = √6, ‖v2‖ = √5, ‖v3‖ = √30, we further obtain an orthonormal basis of R^3 consisting of

    x1 = v1/‖v1‖ = (1/√6)(-1, -2, 1),   x2 = v2/‖v2‖ = (1/√5)(-2, 1, 0),   x3 = v3/‖v3‖ = (1/√30)(1, 2, 5).

Thus, if we construct Q = [x1 x2 x3], we can use it to diagonalize A such that Q^(-1) A Q = D, where

    Q = [ -1/√6  -2/√5   1/√30 ]         D = [ 5   0   0 ]
        [ -2/√6   1/√5   2/√30 ],            [ 0  -1   0 ]
        [  1/√6   0      5/√30 ]             [ 0   0  -1 ].

Here we emphasize that Q is an orthogonal matrix, (1) which satisfies Q^(-1) = Q^t, and (2) whose columns form an orthonormal basis of R^3. Thus we also have Q^t A Q = D.

Example 5.30 (Gram-Schmidt orthogonalization)
In R^3, find the distance from the point (1, 1, 1) to the plane x1 + x2 + 2x3 = 0.

Solution  Any vector in the subspace (plane) can be expressed as

    (x1, x2, x3) = (-x2 - 2x3, x2, x3) = x2 (-1, 1, 0) + x3 (-2, 0, 1).

Hence u1 = (-1, 1, 0) and u2 = (-2, 0, 1) form a basis of the subspace. Apply the Gram-Schmidt process:

    v1 = u1 = (-1, 1, 0),
    v2 = u2 - (u2 · v1)/(v1 · v1) v1 = (-2, 0, 1) - (2/2)(-1, 1, 0) = (-1, -1, 1).

Thus the orthogonal projection of the vector x = (1, 1, 1) onto the subspace is

    (x · v1)/(v1 · v1) v1 + (x · v2)/(v2 · v2) v2 = (0/2)(-1, 1, 0) + (-1/3)(-1, -1, 1) = (1/3)(1, 1, -1).

Thus the distance from x to the plane is

    ‖(1, 1, 1) - (1/3)(1, 1, -1)‖ = ‖(2/3, 2/3, 4/3)‖ = 2√6/3.

Alternatively, from the equation x1 + x2 + 2x3 = 0 we know that the vector z = (1, 1, 2) is perpendicular to the plane. Let W = span(z). Then the distance is given by

    ‖proj_W x‖ = ‖(x · z)/(z · z) z‖ = (4/6)‖(1, 1, 2)‖ = 2√6/3.
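For symmetric matrices, numpy.linalg.eigh returns exactly this kind of orthogonal diagonalization: eigenvalues in ascending order and orthonormal eigenvectors as the columns of Q. A short sketch of mine checking Example 5.29:

```python
import numpy as np

A = np.array([[ 0.0,  2.0, -1.0],
              [ 2.0,  3.0, -2.0],
              [-1.0, -2.0,  0.0]])

# eigh is for symmetric (Hermitian) matrices
eigvals, Q = np.linalg.eigh(A)
print(eigvals)                                      # [-1. -1.  5.]
print(np.allclose(Q.T @ Q, np.eye(3)))              # True: Q is orthogonal
print(np.allclose(Q.T @ A @ Q, np.diag(eigvals)))   # True: Q^t A Q = D
```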

Example 5.31 (Data fitting problem using a straight line)
Find a straight line y = c + mx that best fits the following set of data on the xy-plane:

    (2, 1), (5, 2), (7, 3), (8, 3).

Solution  If we could find a straight line y = c + mx passing through all the points, it would of course best fit the data. However, the corresponding system admits no solution:

    c + 2m = 1,
    c + 5m = 2,            [ 1  2 ]            [ 1 ]
    c + 7m = 3,     i.e.   [ 1  5 ] [ c ]   =  [ 2 ]   is inconsistent.
    c + 8m = 3,            [ 1  7 ] [ m ]      [ 3 ]
                           [ 1  8 ]            [ 3 ]

In this example, the sum of the squared differences is

    (c + 2m - 1)² + (c + 5m - 2)² + (c + 7m - 3)² + (c + 8m - 3)².

Note that a vector in Col A has the form b0 = (c + 2m, c + 5m, c + 7m, c + 8m), so the sum of squared differences is exactly ‖b0 - b‖², and therefore the technique of the normal equation gives us the best approximate solution. Now, by direct computation, we have

    A^t A = [ 4    22 ],   A^t b = [  9 ],
            [ 22  142 ]            [ 57 ]

and the corresponding normal equation A^t A x = A^t b has the unique solution (2/7, 5/14), so the straight line that best fits the given set of data is

    y = 2/7 + (5/14) x.

Example 5.32 (Data fitting problem using a polynomial curve)
Find a polynomial of degree at most 2 that best fits the following set of data on the xy-plane:

    (2, 1), (5, 2), (7, 3), (8, 3).

Solution  A general polynomial of degree at most 2 can be represented in the form y = a0·1 + a1·x + a2·x². If such a polynomial curve could pass through all four points, it would of course best fit the data. However, the corresponding system admits no solution:

    a0·1 + a1·2 + a2·2² = 1,
    a0·1 + a1·5 + a2·5² = 2,          [ 1  2   4 ] [ a0 ]     [ 1 ]
    a0·1 + a1·7 + a2·7² = 3,    i.e.  [ 1  5  25 ] [ a1 ]  =  [ 2 ]   is inconsistent.
    a0·1 + a1·8 + a2·8² = 3,          [ 1  7  49 ] [ a2 ]     [ 3 ]
                                      [ 1  8  64 ]            [ 3 ]

Again, the technique of the normal equation helps. By direct computation, we have

    A^t A = [  4    22    142 ],   A^t b = [   9 ].
            [ 22   142    988 ]            [  57 ]
            [ 142  988   7138 ]            [ 393 ]

We note that A^t A is invertible, so the solution of the normal equation is

    (A^t A)^(-1) A^t b = ( 19/132, 19/44, -1/132 ).

This shows that the polynomial of degree at most 2 that best fits the data is

    y = 19/132 + (19/44) x - (1/132) x².
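The normal-equation computations in Examples 5.31 and 5.32 are easy to reproduce. A small NumPy sketch of mine:

```python
import numpy as np

xs = np.array([2.0, 5.0, 7.0, 8.0])
b  = np.array([1.0, 2.0, 3.0, 3.0])

# Example 5.31: design matrix for y = c + m x, then solve A^t A x = A^t b
A = np.column_stack([np.ones_like(xs), xs])
c, m = np.linalg.solve(A.T @ A, A.T @ b)
print(c, m)                                    # 0.2857... = 2/7,  0.3571... = 5/14

# Example 5.32: add an x^2 column for the quadratic fit
A2 = np.column_stack([np.ones_like(xs), xs, xs**2])
print(np.linalg.solve(A2.T @ A2, A2.T @ b))    # [19/132, 19/44, -1/132]
```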

Example 5.33 (Data fitting problem using a general curve)
Find a curve of the form y = a0 + a1 sin x + a2 sin 2x that best fits the following set of data:

    (π/6, 1), (π/4, 2), (π/3, 3), (π/2, 3).

Solution  The system we are considering is again an inconsistent system:

    a0·1 + a1 sin(π/6) + a2 sin(2π/6) = 1,
    a0·1 + a1 sin(π/4) + a2 sin(2π/4) = 2,                 [ 1  sin(π/6)  sin(2π/6) ]
    a0·1 + a1 sin(π/3) + a2 sin(2π/3) = 3,    with   A  =  [ 1  sin(π/4)  sin(2π/4) ]
    a0·1 + a1 sin(π/2) + a2 sin(2π/2) = 3,                 [ 1  sin(π/3)  sin(2π/3) ]
                                                           [ 1  sin(π/2)  sin(2π/2) ].

By direct computation, we have

    A^t A = [ 4                  (3 + √2 + √3)/2     1 + √3            ]
            [ (3 + √2 + √3)/2    5/2                 (3 + 2√2 + √3)/4  ],
            [ 1 + √3             (3 + 2√2 + √3)/4    5/2               ]

    A^t b = [ 9                 ]
            [ 7/2 + √2 + 3√3/2  ]
            [ 2 + 2√3           ].

As we are looking for an approximate solution, exact calculation is not necessary. An approximate solution of the normal equation is

    a0 ≈ -2.29169,   a1 ≈ 5.31308,   a2 ≈ 0.673095.

Then the best-fitting curve is

    y = (-2.29169) + (5.31308) sin x + (0.673095) sin 2x.
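Example 5.33 fits a curve that is nonlinear in x but still linear in the unknowns a0, a1, a2, so the same least-squares machinery applies; only the columns of the design matrix change. A sketch of mine using numpy.linalg.lstsq:

```python
import numpy as np

xs = np.array([np.pi/6, np.pi/4, np.pi/3, np.pi/2])
b  = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix with columns 1, sin(x), sin(2x); the model is linear in a0, a1, a2
A = np.column_stack([np.ones_like(xs), np.sin(xs), np.sin(2 * xs)])

# lstsq solves the least-squares problem (equivalent to the normal equations here)
a, *_ = np.linalg.lstsq(A, b, rcond=None)
print(a)   # roughly [-2.2917, 5.3131, 0.6731], matching Example 5.33
```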