Chapter 4: Euclidean Spaces

1 Inner Product Spaces

Definition 1.1. Let $V$ be a vector space over $\mathbb{R}$. A real inner product on $V$ is a real-valued function on $V \times V$, denoted by $(\cdot,\cdot)$, which satisfies

(1) $(x, y) = (y, x)$;
(2) $(kx, y) = k(x, y)$;
(3) $(x + y, z) = (x, z) + (y, z)$;
(4) $(x, x) \ge 0$ for all $x \in V$, and $(x, x) = 0$ if and only if $x = 0$,

where $x, y, z$ are any vectors in $V$ and $k \in \mathbb{R}$ is a real number. The vector space $V$, together with the inner product, is called a real inner product space, or a Euclidean space.

Example. In $\mathbb{R}^n$, for $x = (\xi_1, \xi_2, \dots, \xi_n)$ and $y = (\eta_1, \eta_2, \dots, \eta_n)$, define
$$(x, y) = \sum_{i=1}^n \xi_i \eta_i.$$
It is easy to verify that this satisfies (1)-(4). So $\mathbb{R}^n$, together with $(\cdot,\cdot)$, is an inner product space.

Example. Let $V = C[a, b] = \{\text{all continuous functions on } [a, b]\}$. Define
$$(f, g) = \int_a^b f(x) g(x)\,dx \qquad \text{for } f, g \in C[a, b].$$
It is an inner product space.

By (1) the inner product is symmetric, and, similarly to (2) and (3), we have

(2') $(\alpha, k\beta) = (k\beta, \alpha) = k(\beta, \alpha) = k(\alpha, \beta)$;
(3') $(\alpha, \beta + \gamma) = (\beta + \gamma, \alpha) = (\beta, \alpha) + (\gamma, \alpha) = (\alpha, \beta) + (\alpha, \gamma)$.
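The axioms (1)-(4) can be checked numerically for the $C[a, b]$ example. The following Python/NumPy sketch (not part of the original notes; the sample functions $\sin$ and $\cos$ and the midpoint-rule approximation are my choices for illustration) verifies symmetry, positivity, and homogeneity:

```python
import numpy as np

def ip(f, g, a=0.0, b=1.0, n=100_000):
    # (f, g) = integral of f(x) g(x) over [a, b], midpoint-rule approximation
    x = a + (b - a) * (np.arange(n) + 0.5) / n
    return (b - a) * np.mean(f(x) * g(x))

f, g = np.sin, np.cos

# axiom (1): symmetry
assert abs(ip(f, g) - ip(g, f)) < 1e-9
# axiom (4): positivity for f != 0
assert ip(f, f) > 0
# axiom (2): (k f, g) = k (f, g)
k = 3.0
assert abs(ip(lambda x: k * f(x), g) - k * ip(f, g)) < 1e-9
```

Any continuous functions could be substituted for `f` and `g`; the tolerances only absorb the quadrature error.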
Definition 1.2. For any $x \in V$, define
$$\|x\| = \sqrt{(x, x)}.$$
It is called the norm of $x$.

Theorem 1.1 (Cauchy-Schwarz inequality). For any $x, y \in V$,
$$|(x, y)| \le \|x\|\,\|y\|.$$
The equality holds if and only if $\{x, y\}$ is linearly dependent.

Proof. When $y = 0$, the left side equals the right side. Suppose that $y \ne 0$, and let $t$ be a real number. Consider
$$(x + ty, x + ty) \ge 0$$
for all $t$. Therefore
$$(x, x) + 2(x, y)t + (y, y)t^2 \ge 0.$$
Let $t = -\frac{(x, y)}{(y, y)}$. Then
$$(x, x) - \frac{(x, y)^2}{(y, y)} \ge 0,$$
i.e. $(x, y)^2 \le (x, x)(y, y)$, i.e. $|(x, y)| \le \|x\|\,\|y\|$.

If $\{x, y\}$ is linearly dependent, i.e. $x = ky$, the equality holds. On the other hand, if the equality holds, from the above proof we deduce that either $y = 0$ or
$$x - \frac{(x, y)}{(y, y)}\, y = 0,$$
i.e. $\{x, y\}$ is linearly dependent. $\square$

Example. Applying the Cauchy-Schwarz inequality in $\mathbb{R}^n$, we get
$$\Big|\sum_{i=1}^n a_i b_i\Big| \le \Big(\sum_{i=1}^n a_i^2\Big)^{1/2} \Big(\sum_{i=1}^n b_i^2\Big)^{1/2}.$$

Example. Applying the Cauchy-Schwarz inequality in $C[a, b]$, we get
$$\Big|\int_a^b f(x) g(x)\,dx\Big| \le \Big(\int_a^b f^2(x)\,dx\Big)^{1/2} \Big(\int_a^b g^2(x)\,dx\Big)^{1/2}.$$
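The inequality and its equality case can be observed numerically. A minimal Python/NumPy sketch (not part of the notes; the vectors are arbitrary sample data):

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])
y = np.array([3.0, 0.0, 4.0])

lhs = abs(np.dot(x, y))                       # |(x, y)|
rhs = np.linalg.norm(x) * np.linalg.norm(y)   # ||x|| ||y||
assert lhs <= rhs + 1e-12                     # Cauchy-Schwarz in R^n

# equality holds exactly when {x, y} is linearly dependent: take z = k x
z = -2.5 * x
assert np.isclose(abs(np.dot(x, z)), np.linalg.norm(x) * np.linalg.norm(z))
```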
Definition 1.3. $x$ and $y$ are orthogonal (often denoted by $x \perp y$) if $(x, y) = 0$.

Definition 1.4. Let $x, y \in V$ be nonzero vectors. The angle between $x$ and $y$ is defined as
$$\vartheta = \cos^{-1} \frac{(x, y)}{\|x\|\,\|y\|}.$$
For orthogonal vectors $x$ and $y$, we have
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$

2 Orthonormal Bases

Definition 2.1. We say that a basis $\{x_1, x_2, \dots, x_n\}$ is an orthogonal basis for $V$ if the $x_i$ are mutually orthogonal; that is, $(x_i, x_j) = 0$ whenever $i \ne j$. If in addition each $\|x_i\| = 1$ (that is, each $x_i$ is a unit vector), we say that the basis is orthonormal.

It is easy to verify that a set of nonzero, mutually orthogonal vectors is linearly independent. If $\{x_1, x_2, \dots, x_n\}$ is an orthonormal basis, then $(x_i, x_j) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta:
$$\delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$

Under an orthonormal basis $\{x_1, x_2, \dots, x_n\}$, any vector $x \in V$ can be expressed as
$$x = (x_1, x)x_1 + \dots + (x_n, x)x_n.$$
In fact, if $x = \sum_{i=1}^n a_i x_i$, then $(x_i, x) = a_i$.

Theorem 2.1 (Gram-Schmidt Theorem). Suppose that $x_1, x_2, \dots, x_m$ are mutually orthogonal nonzero vectors in an inner product space $V$. Let $y \in V$ and set
$$x_{m+1} = y - \sum_{j=1}^m \frac{(y, x_j)}{(x_j, x_j)}\, x_j = y - \sum_{j=1}^m \mathrm{proj}_{x_j}(y).$$
Then the vectors $x_1, x_2, \dots, x_m, x_{m+1}$ are mutually orthogonal and
$$\mathrm{span}\{x_1, \dots, x_m, y\} = \mathrm{span}\{x_1, x_2, \dots, x_m, x_{m+1}\}.$$
Further, $x_{m+1} = 0$ if and only if $y \in \mathrm{span}\{x_1, \dots, x_m\}$.
Remark 2.1. $\mathrm{proj}_u(v) = \frac{(u, v)}{(u, u)}\, u$ is called the projection of the vector $v$ onto $u$.

Proof. By the definition of $x_{m+1}$, one sees immediately that $y \in \mathrm{span}\{x_1, \dots, x_m, x_{m+1}\}$ and $x_{m+1} \in \mathrm{span}\{x_1, \dots, x_m, y\}$. From this, it follows that
$$\mathrm{span}\{x_1, \dots, x_{m+1}\} = \mathrm{span}\{x_1, \dots, x_m, y\}.$$
We only need to prove $x_{m+1} \perp x_i$ for $i = 1, 2, \dots, m$. Note that
$$(x_{m+1}, x_i) = \Big(y - \sum_{j=1}^m \frac{(y, x_j)}{(x_j, x_j)}\, x_j,\; x_i\Big) = (y, x_i) - \sum_{j=1}^m \frac{(y, x_j)}{(x_j, x_j)}\,(x_j, x_i) = (y, x_i) - (y, x_i) = 0.$$
For the final statement, if $x_{m+1} = 0$, then $y \in \mathrm{span}\{x_1, \dots, x_m\}$. Conversely, suppose $y \in \mathrm{span}\{x_1, \dots, x_m\}$. Then the set $\{u_1, u_2, \dots, u_m\}$ is an orthonormal basis for $\mathrm{span}\{x_1, \dots, x_m\}$, where $u_i = \frac{x_i}{\|x_i\|}$. Therefore
$$y = \sum_{i=1}^m (y, u_i) u_i = \sum_{i=1}^m \frac{(y, x_i)}{(x_i, x_i)}\, x_i,$$
i.e. $x_{m+1} = 0$. The Theorem is proved. $\square$

Theorem 2.2. Any finite dimensional inner product space $V$ has an orthogonal basis (and hence an orthonormal basis).

Proof. Let $\{x_1, x_2, \dots, x_n\}$ be a basis of $V$. We construct an orthogonal basis $\{w_1, \dots, w_n\}$ for $V$ using $\{x_1, \dots, x_n\}$. First, set $w_1 = x_1\ (\ne 0)$. Obviously $\mathrm{span}\{w_1\} = \mathrm{span}\{x_1\}$. For $k$ with $2 \le k \le n$, we define $w_k$ inductively. We set
$$w_k = x_k - \sum_{j=1}^{k-1} \frac{(x_k, w_j)}{(w_j, w_j)}\, w_j$$
(the Gram-Schmidt orthogonalization). By Theorem 2.1,
$$\mathrm{span}\{w_1, w_2, \dots, w_k\} = \mathrm{span}\{x_1, x_2, \dots, x_k\}$$
and $w_1, \dots, w_k$ are mutually orthogonal. Since $\{x_1, \dots, x_{k-1}, x_k\}$ is linearly independent, we deduce that $w_k \ne 0$. After $n$ steps, we get an orthogonal basis $\{w_1, w_2, \dots, w_n\}$ of $V$. We may set $u_i = \frac{w_i}{\|w_i\|}$ for $i = 1, 2, \dots, n$ to get an orthonormal basis $\{u_1, u_2, \dots, u_n\}$. $\square$
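The construction in the proof of Theorem 2.2 can be sketched in code. A Python/NumPy illustration (not part of the notes); the sample vectors are the ones used in the example below, as reconstructed from the partly garbled source:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (Gram-Schmidt, Theorem 2.2)."""
    ortho = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in ortho:
            w = w - np.dot(w, u) * u   # subtract proj_u(w); u is a unit vector
        ortho.append(w / np.linalg.norm(w))
    return ortho

# sample vectors in R^4
u1, u2, u3 = gram_schmidt([(1, 1, 0, 1), (0, 2, 1, 4), (3, 3, 3, 0)])

# the output is orthonormal
for a in (u1, u2, u3):
    assert np.isclose(np.linalg.norm(a), 1.0)
assert np.isclose(np.dot(u1, u2), 0.0)
assert np.isclose(np.dot(u1, u3), 0.0)
assert np.isclose(np.dot(u2, u3), 0.0)
assert np.allclose(u2, [-2/3, 0, 1/3, 2/3])   # matches w2/||w2|| below
```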
Example. Consider the subspace $V$ of $\mathbb{R}^4$ with basis the three vectors
$$v_1 = (1, 1, 0, 1), \quad v_2 = (0, 2, 1, 4), \quad v_3 = (3, 3, 3, 0),$$
where $V$ has the usual inner product. We apply the process of Theorem 2.2 as follows. Since $\|v_1\| = \sqrt{3}$, we set
$$u_1 = \frac{v_1}{\|v_1\|} = \Big(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, 0, \tfrac{1}{\sqrt{3}}\Big).$$
Next,
$$w_2 = v_2 - \mathrm{proj}_{u_1}(v_2) = v_2 - (v_2, u_1)u_1 = (0, 2, 1, 4) - \tfrac{6}{\sqrt{3}}\Big(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, 0, \tfrac{1}{\sqrt{3}}\Big) = (-2, 0, 1, 2).$$
Since $\|w_2\| = 3$, we obtain
$$u_2 = \frac{w_2}{\|w_2\|} = \Big(-\tfrac{2}{3}, 0, \tfrac{1}{3}, \tfrac{2}{3}\Big).$$
Finally, we set
$$w_3 = v_3 - \mathrm{proj}_{u_1}(v_3) - \mathrm{proj}_{u_2}(v_3) = (3, 3, 3, 0) - 2\sqrt{3}\Big(\tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, 0, \tfrac{1}{\sqrt{3}}\Big) - (-1)\Big(-\tfrac{2}{3}, 0, \tfrac{1}{3}, \tfrac{2}{3}\Big) = \Big(\tfrac{1}{3}, 1, \tfrac{10}{3}, -\tfrac{4}{3}\Big).$$
Since $\|w_3\| = \sqrt{14}$, we set
$$u_3 = \frac{w_3}{\|w_3\|} = \frac{1}{\sqrt{14}}\Big(\tfrac{1}{3}, 1, \tfrac{10}{3}, -\tfrac{4}{3}\Big).$$

Example. Consider the space $P_2(\mathbb{R})$ of real polynomials of degree at most $2$. $P_2(\mathbb{R})$ has an inner product given by
$$(p, q) = \int_0^1 p(t) q(t)\,dt \qquad \text{for } p, q \in P_2(\mathbb{R}).$$
The standard basis $\{1, t, t^2\}$ is not orthogonal for this inner product. We apply the Gram-Schmidt procedure to obtain an orthogonal basis. We take $w_1 = 1$; we see easily that $(1, 1) = 1$ and $(t, 1) = \tfrac{1}{2}$. Hence
$$w_2 = t - \frac{(t, 1)}{(1, 1)} = t - \tfrac{1}{2}.$$
To compute $w_3$, we calculate
$$(w_2, t^2) = \int_0^1 t^2 \big(t - \tfrac{1}{2}\big)\,dt = \tfrac{1}{12}, \qquad (w_2, w_2) = \int_0^1 \big(t - \tfrac{1}{2}\big)^2\,dt = \tfrac{1}{12},$$
and $(t^2, 1) = \tfrac{1}{3}$. Consequently,
$$w_3 = t^2 - \frac{1/12}{1/12}\big(t - \tfrac{1}{2}\big) - \tfrac{1}{3} = t^2 - t + \tfrac{1}{6}.$$
This gives an orthogonal basis $\{w_1, w_2, w_3\}$ for $P_2(\mathbb{R})$.

Now let us study the transition matrix from one orthonormal basis to another. Let $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$ be two orthonormal bases of an inner product space $V$, and let the transition matrix from $X$ to $Y$ be $A = (a_{ij})$, i.e.,
$$(y_1, \dots, y_n) = (x_1, \dots, x_n) \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}.$$
Since $(y_1, \dots, y_n)$ is orthonormal,
$$(y_i, y_j) = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
Note that the columns of the matrix $A$ are the coordinates of $y_1, \dots, y_n$ under the orthonormal basis $X = (x_1, \dots, x_n)$. Therefore
$$(y_i, y_j) = a_{1i} a_{1j} + a_{2i} a_{2j} + \dots + a_{ni} a_{nj} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
That is, $A^t A = E$, or $A^{-1} = A^t$.

Definition 2.2. An $n \times n$ real matrix $A$ is called orthogonal if $A^t A = E$.

From the above analysis, we deduce that the transition matrix from one orthonormal basis to another is orthogonal. Conversely, if the first basis is orthonormal and the transition matrix from this basis to another is orthogonal, then the second basis is also orthonormal. From $A^t A = E$, we deduce that $A A^t = E$ as well.
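Both facts from this section can be checked numerically. A Python/NumPy sketch (not part of the notes; the rotation angle and the midpoint-rule quadrature are my choices for illustration) verifies that the polynomial basis computed above is orthogonal, and that a transition matrix between orthonormal bases satisfies $A^t A = E$:

```python
import numpy as np

# 1) the orthogonal basis {1, t - 1/2, t^2 - t + 1/6} from the example,
#    with (p, q) = integral over [0, 1] approximated by the midpoint rule
t = (np.arange(100_000) + 0.5) / 100_000
w1, w2, w3 = np.ones_like(t), t - 0.5, t**2 - t + 1.0 / 6.0
assert abs(np.mean(w1 * w2)) < 1e-6   # (w1, w2) = 0
assert abs(np.mean(w1 * w3)) < 1e-6   # (w1, w3) = 0
assert abs(np.mean(w2 * w3)) < 1e-6   # (w2, w3) = 0

# 2) a transition matrix between orthonormal bases is orthogonal: A^t A = E
c, s = np.cos(0.7), np.sin(0.7)
A = np.array([[c, -s],
              [s,  c]])               # rotates the standard basis of R^2
assert np.allclose(A.T @ A, np.eye(2))
assert np.allclose(A @ A.T, np.eye(2))   # hence A A^t = E as well
```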
3 Isometries

Definition 3.1. Two Euclidean spaces $V$ and $V'$ over $\mathbb{R}$ are said to be isometric if there exists a bijection $\sigma$ from $V$ to $V'$ satisfying

(i) $\sigma(x + y) = \sigma(x) + \sigma(y)$, for all $x, y \in V$;
(ii) $\sigma(kx) = k\sigma(x)$, for all $x \in V$ and $k \in \mathbb{R}$;
(iii) $(\sigma(x), \sigma(y)) = (x, y)$, for all $x, y \in V$.

It is easy to verify that being isometric is an equivalence relation. As a consequence, we have:

Theorem 3.1. Two finite dimensional Euclidean spaces are isometric if and only if they have the same dimension.

4 Orthogonal Transformations

Definition 4.1. A linear transformation $\mathcal{A}$ on a Euclidean space $V$ is called an orthogonal transformation if it keeps the inner product invariant, i.e., for any $x, y \in V$,
$$(\mathcal{A}x, \mathcal{A}y) = (x, y).$$
An orthogonal transformation can be characterized as follows.

Theorem 4.1. Let $\mathcal{A}$ be a linear transformation on a Euclidean space $V$. The following four statements are equivalent:

(i) $\mathcal{A}$ is an orthogonal transformation;
(ii) $\mathcal{A}$ keeps the length of vectors invariant, i.e., $\|\mathcal{A}x\| = \|x\|$ for all $x \in V$;
(iii) if $\{x_1, \dots, x_n\}$ is an orthonormal basis of $V$, so is $\{\mathcal{A}x_1, \dots, \mathcal{A}x_n\}$;
(iv) the matrix $A$ of $\mathcal{A}$ under any orthonormal basis is orthogonal.

Proof. We prove that (i) is equivalent to (ii). If $\mathcal{A}$ is an orthogonal transformation, then
$$\|\mathcal{A}x\|^2 = (\mathcal{A}x, \mathcal{A}x) = (x, x) = \|x\|^2.$$
Therefore $\|\mathcal{A}x\| = \|x\|$. Conversely, if $\mathcal{A}$ keeps the length of vectors invariant, then $(\mathcal{A}x, \mathcal{A}x) = (x, x)$, $(\mathcal{A}y, \mathcal{A}y) = (y, y)$, and
$$(\mathcal{A}(x + y), \mathcal{A}(x + y)) = (x + y, x + y).$$
Expanding the last equation, we get
$$(\mathcal{A}x, \mathcal{A}x) + 2(\mathcal{A}x, \mathcal{A}y) + (\mathcal{A}y, \mathcal{A}y) = (x, x) + 2(x, y) + (y, y).$$
Therefore $(\mathcal{A}x, \mathcal{A}y) = (x, y)$, that is, $\mathcal{A}$ is an orthogonal transformation.

We prove that (i) and (iii) are equivalent. Let $\{x_1, x_2, \dots, x_n\}$ be an orthonormal basis and $\mathcal{A}$ an orthogonal transformation. Then
$$(\mathcal{A}x_i, \mathcal{A}x_j) = (x_i, x_j) = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
That is, $\{\mathcal{A}x_1, \mathcal{A}x_2, \dots, \mathcal{A}x_n\}$ is an orthonormal basis. Conversely, let $\{\mathcal{A}x_1, \dots, \mathcal{A}x_n\}$ be an orthonormal basis. Assume that
$$x = \xi_1 x_1 + \xi_2 x_2 + \dots + \xi_n x_n, \qquad y = \eta_1 x_1 + \eta_2 x_2 + \dots + \eta_n x_n.$$
Then $\mathcal{A}x = \sum_{i=1}^n \xi_i \mathcal{A}x_i$ and $\mathcal{A}y = \sum_{i=1}^n \eta_i \mathcal{A}x_i$, and
$$(\mathcal{A}x, \mathcal{A}y) = \Big(\sum_{i=1}^n \xi_i \mathcal{A}x_i, \sum_{j=1}^n \eta_j \mathcal{A}x_j\Big) = \sum_{i,j=1}^n \xi_i \eta_j (\mathcal{A}x_i, \mathcal{A}x_j) = \sum_{i=1}^n \xi_i \eta_i = (x, y).$$
That is, $\mathcal{A}$ is an orthogonal transformation.

We prove that (iii) and (iv) are equivalent. Let $X = \{x_1, \dots, x_n\}$ be an orthonormal basis and $A$ the matrix representation of the linear transformation $\mathcal{A}$ under the basis $\{x_1, \dots, x_n\}$, i.e., $A = \Phi_X(\mathcal{A})$, or
$$(\mathcal{A}x_1, \dots, \mathcal{A}x_n) = (x_1, \dots, x_n) A.$$
If $\{\mathcal{A}x_1, \dots, \mathcal{A}x_n\}$ is orthonormal, then $A$ is the transition matrix from the basis $\{x_1, \dots, x_n\}$ to the basis $\{\mathcal{A}x_1, \dots, \mathcal{A}x_n\}$, so it is orthogonal. Conversely, if $A$ is an orthogonal matrix, then $\{\mathcal{A}x_1, \dots, \mathcal{A}x_n\}$ is an orthonormal basis. $\square$

Since an orthogonal matrix is invertible, so is an orthogonal transformation. In fact, an orthogonal transformation is an isometry of $V$ onto itself. Therefore the product of orthogonal transformations is orthogonal, and the product of orthogonal matrices is orthogonal. If $A$ is an orthogonal matrix, then $A A^t = E$, so $|A|^2 = 1$ and $|A| = \pm 1$. If $|A| = 1$, we say $A$ is of the first type, or a rotation. If $|A| = -1$, we say $A$ is of the second type.
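The two types can be seen concretely. A Python/NumPy sketch (not part of the notes; the reflection and rotation matrices are standard examples chosen for illustration):

```python
import numpy as np

# a reflection across the x-axis: orthogonal, but of the second type
B = np.array([[1.0,  0.0],
              [0.0, -1.0]])
assert np.allclose(B.T @ B, np.eye(2))      # orthogonal: B^t B = E
assert np.isclose(np.linalg.det(B), -1.0)   # second type: |B| = -1

# it still preserves lengths and inner products
x, y = np.array([3.0, 4.0]), np.array([-1.0, 2.0])
assert np.isclose(np.linalg.norm(B @ x), np.linalg.norm(x))
assert np.isclose((B @ x) @ (B @ y), x @ y)

# products of orthogonal matrices are orthogonal
C = np.array([[0.0, -1.0],
              [1.0,  0.0]])                 # rotation, first type: |C| = 1
assert np.allclose((B @ C).T @ (B @ C), np.eye(2))
```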
5 Orthogonal Complements

Definition 5.1. Suppose that $V_1, V_2$ are two subspaces of a Euclidean space $V$. If $(x, y) = 0$ for any $x \in V_1$ and $y \in V_2$, we say $V_1$ is orthogonal to $V_2$, denoted by $V_1 \perp V_2$. If $x \in V$ and $(x, y) = 0$ for all $y \in V_1$, we say $x$ is orthogonal to $V_1$, denoted $x \perp V_1$.

Homework. If $V_1 \perp V_2$, then $V_1 \cap V_2 = \{0\}$.

Theorem 5.1. If $V_1, V_2, \dots, V_s$ are subspaces of $V$ and are pairwise orthogonal, then the sum $V_1 + V_2 + \dots + V_s$ is a direct sum.

Proof. Suppose that $\alpha_i \in V_i$ for $i = 1, 2, \dots, s$ and
$$\alpha_1 + \alpha_2 + \dots + \alpha_s = 0.$$
We prove that $\alpha_1 = \alpha_2 = \dots = \alpha_s = 0$. Taking the inner product of both sides of the above equation with $\alpha_i$ and using the orthogonality, we get $(\alpha_i, \alpha_i) = 0$, which implies $\alpha_i = 0$ for all $i$. That is, $V_1 + V_2 + \dots + V_s$ is a direct sum. $\square$

Definition 5.2. A subspace $V_2$ is called the orthogonal complement of $V_1$ if $V_1 \perp V_2$ and $V = V_1 + V_2$.

Obviously, if $V_2$ is the orthogonal complement of $V_1$, then $V_1$ is the orthogonal complement of $V_2$.

Theorem 5.2. Any subspace $V_1$ of a Euclidean space $V$ has a unique orthogonal complement.

Proof. If $V_1 = \{0\}$, then its orthogonal complement is $V$ and uniqueness is obvious. Assume $V_1 \ne \{0\}$. Since $V_1$ is a Euclidean space with the inner product $(\cdot,\cdot)$ inherited from $V$, it has an orthogonal basis, denoted by $\{x_1, x_2, \dots, x_m\}$. Augment it into an orthogonal basis of $V$, that is, $\{x_1, \dots, x_m, x_{m+1}, \dots, x_n\}$. Obviously, $\mathrm{span}\{x_{m+1}, \dots, x_n\}$ is an orthogonal complement of $V_1$.

Let us prove the uniqueness. Let $V_2$ and $V_3$ both be orthogonal complements of $V_1$. Then
$$V = V_1 \oplus V_2 \qquad \text{and} \qquad V = V_1 \oplus V_3.$$
If α V, then from the second equation above, α = α +α 3 where α V and α 3 V 3. Since α α and α 3 α, (α, α ) = (α, α ) + (α 3, α ) = (α + α 3, α ) = (α, α ) = 0. We deduce that α = 0. Therefore α = α 3 V 3. That is V V 3. Similarly, we can prove that V 3 V, that is V = V 3. The orthogonal complement of V is denoted by V. By the definition, Corollary 5.3. V = {x V x V }. dim V + dim V = n. From V = V V, for any x V there exist (uniquely) x V, x V such that x = x + x Define P roj V (x) = x. It is called the projection of x onto V. 6 Standard Form of Symmetric Matrices Recall that an n n matrix A is symmetric if A t = A. We will prove that for any n n real symmetric matrix, there is an orthogonal matrix T such that T t AT = T AT is a diagonal matrix. Let us study properties of symmetric matrices. Lemma 6.. lem Let A be a real symmetric matrix then all eigenvalues of A are real. Proof. Let λ 0 be an eigenvalue of A and x = (x, x,..., x n ) t C n be a corresponding eigenvector, i.e., Ax = λ 0 x. Let x = (x, x,..., x n ) t C n, where x i is the complex conjugate of x i. Then Ax = λ 0 x. Thus, (Ax, x) = x t Ax = x t A t x = (Ax) t x = (Ax) t x The left side of the above is λ 0 x t x and the right is λ 0 x t x. So λ 0 x t x = λ 0 x t x Therefore λ 0 = λ 0 is a real number since x t x 0. 0
Note also that, using the symmetry $a_{ij} = a_{ji}$,
$$(Ax, x) = \sum_{i,j=1}^n a_{ij} x_j \bar{x}_i = \sum_{j=1}^n x_j \Big(\sum_{i=1}^n a_{ji} \bar{x}_i\Big) = (x, Ax).$$
Moreover, $(Ax, x) = \lambda_0 (x, x)$ and $(x, Ax) = \bar{\lambda}_0 (x, x)$. Therefore $\lambda_0 = \bar{\lambda}_0$, i.e. $\lambda_0$ is a real number.

For a real symmetric matrix $A$, define a linear transformation $\mathcal{A}$ on $\mathbb{R}^n$ as follows:
$$\mathcal{A}x = Ax.$$

Lemma 6.2. Let $A$ be a real symmetric matrix and $\mathcal{A}$ be defined as above. Then for any $x, y \in \mathbb{R}^n$,
$$(\mathcal{A}x, y) = (x, \mathcal{A}y), \tag{6.1}$$
or $y^t A x = x^t A y$.

Proof. In fact, $y^t A x = y^t A^t x = (Ay)^t x = x^t A y$. $\square$

Definition 6.1. A transformation $\mathcal{A}$ on $\mathbb{R}^n$ satisfying (6.1) is called a self-adjoint operator on $\mathbb{R}^n$.

Lemma 6.3. Let $\mathcal{A}$ be a self-adjoint operator and $V_1$ an $\mathcal{A}$-invariant subspace. Then $V_1^\perp$ is also an $\mathcal{A}$-invariant subspace.

Proof. Let $y \in V_1^\perp$. We need to show that $\mathcal{A}y \in V_1^\perp$. For any $x \in V_1$, since $\mathcal{A}x \in V_1$ and $y \in V_1^\perp$, we have
$$(y, \mathcal{A}x) = 0.$$
Therefore $(\mathcal{A}y, x) = (y, \mathcal{A}x) = 0$, i.e. $\mathcal{A}y \perp V_1$, or $\mathcal{A}y \in V_1^\perp$. $\square$

Lemma 6.4. Let $A$ be a real symmetric matrix. Then eigenvectors in $\mathbb{R}^n$ associated with distinct eigenvalues are orthogonal.
Proof. Let $\lambda$ and $\mu$ be two distinct eigenvalues and $x, y$ be eigenvectors associated with $\lambda$ and $\mu$, respectively, i.e., $Ax = \lambda x$ and $Ay = \mu y$. Since $(Ax, y) = (x, Ay)$, we have
$$\lambda(x, y) = (Ax, y) = (x, Ay) = \mu(x, y).$$
This implies $(x, y) = 0$ since $\lambda \ne \mu$. $\square$

Now let us prove the main theorem.

Theorem 6.5. Let $A$ be a real symmetric $n \times n$ matrix. There exists an $n \times n$ orthogonal matrix $T$ such that $T^t A T = T^{-1} A T$ is diagonal. In other words, any real symmetric matrix is diagonalizable.

Proof. We only need to show that there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$. We prove it by induction on $n$. The theorem is true for $n = 1$. Suppose that the theorem holds for $n - 1$. For the $n$-dimensional space $\mathbb{R}^n$, the linear transformation $\mathcal{A}$ has a real eigenvalue $\lambda_1$. Let $x_1 \in \mathbb{R}^n$ be an associated eigenvector; we may normalize $x_1$ so that $\|x_1\| = 1$. Let $V_1 = \mathrm{span}\{x_1\}$. Then $V_1$ is an $\mathcal{A}$-invariant subspace; by Lemma 6.3, $V_2 = V_1^\perp$ is also $\mathcal{A}$-invariant, and the dimension of $V_2$ is $n - 1$. Consider $\mathcal{A}|_{V_2}$. It is obvious that $\mathcal{A}|_{V_2}$ satisfies (6.1), that is, $\mathcal{A}|_{V_2}$ is self-adjoint. By the induction assumption, $V_2$ has an orthonormal basis $\{x_2, x_3, \dots, x_n\}$ consisting of eigenvectors of $\mathcal{A}|_{V_2}$. Then $\{x_1, x_2, \dots, x_n\}$ is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$. The theorem is proved. $\square$

Let $A$ be a real symmetric matrix. How do we diagonalize $A$? By the above theorem, finding the orthogonal matrix $T$ is equivalent to finding an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$. In fact, if
$$\eta_1 = \begin{pmatrix} t_{11} \\ t_{21} \\ \vdots \\ t_{n1} \end{pmatrix}, \quad \eta_2 = \begin{pmatrix} t_{12} \\ t_{22} \\ \vdots \\ t_{n2} \end{pmatrix}, \quad \dots, \quad \eta_n = \begin{pmatrix} t_{1n} \\ t_{2n} \\ \vdots \\ t_{nn} \end{pmatrix}$$
is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$, then the transition matrix from the standard basis $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ to $\eta_1, \eta_2, \dots, \eta_n$ is
$$T = \begin{pmatrix} t_{11} & t_{12} & \dots & t_{1n} \\ t_{21} & t_{22} & \dots & t_{2n} \\ \vdots & \vdots & & \vdots \\ t_{n1} & t_{n2} & \dots & t_{nn} \end{pmatrix}$$
and $T^{-1} A T = T^t A T$ is diagonal. Therefore, one can find the orthogonal matrix $T$ as follows:

(1) Find all eigenvalues of $A$. Let $\lambda_1, \dots, \lambda_r$ be the distinct eigenvalues of $A$.

(2) For each $\lambda_i$, solve the homogeneous system $(\lambda_i E - A)x = 0$ to find a basis for the solution space $V_{\lambda_i}$. Then use the Gram-Schmidt process to find an orthonormal basis $\eta_{i1}, \eta_{i2}, \dots, \eta_{ik_i}$ for $V_{\lambda_i}$.

(3) Since $\lambda_1, \lambda_2, \dots, \lambda_r$ are distinct, the vectors $\{\eta_{11}, \eta_{12}, \dots, \eta_{1k_1}, \dots, \eta_{r1}, \eta_{r2}, \dots, \eta_{rk_r}\}$ are pairwise orthogonal and form an orthonormal basis of $\mathbb{R}^n$, whose vectors are the columns of the orthogonal matrix $T$.

Example. Let
$$A = \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & -1 & 1 \\ 1 & -1 & 0 & 1 \\ -1 & 1 & 1 & 0 \end{pmatrix}.$$
(1) Find the eigenvalues of $A$. Expanding the characteristic determinant, we have
$$|\lambda E - A| = \begin{vmatrix} \lambda & -1 & -1 & 1 \\ -1 & \lambda & 1 & -1 \\ -1 & 1 & \lambda & -1 \\ 1 & -1 & -1 & \lambda \end{vmatrix} = (\lambda - 1)^3 (\lambda + 3).$$
Therefore the eigenvalues of $A$ are $\lambda_1 = 1$ (with multiplicity $3$) and $\lambda_2 = -3$.

For $\lambda_1 = 1$, solve $(\lambda_1 E - A)X = 0$ to find a basis for the solution space $V_{\lambda_1}$:
$$\alpha_1 = (1, 1, 0, 0), \quad \alpha_2 = (1, 0, 1, 0), \quad \alpha_3 = (-1, 0, 0, 1).$$
Orthogonalizing, we get
$$\beta_1 = \alpha_1 = (1, 1, 0, 0),$$
$$\beta_2 = \alpha_2 - \frac{(\alpha_2, \beta_1)}{(\beta_1, \beta_1)}\,\beta_1 = \Big(\tfrac{1}{2}, -\tfrac{1}{2}, 1, 0\Big),$$
$$\beta_3 = \alpha_3 - \frac{(\alpha_3, \beta_1)}{(\beta_1, \beta_1)}\,\beta_1 - \frac{(\alpha_3, \beta_2)}{(\beta_2, \beta_2)}\,\beta_2 = \Big(-\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}, 1\Big).$$
Normalizing, we get
$$\eta_1 = \Big(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0, 0\Big), \quad \eta_2 = \Big(\tfrac{1}{\sqrt{6}}, -\tfrac{1}{\sqrt{6}}, \tfrac{2}{\sqrt{6}}, 0\Big), \quad \eta_3 = \Big(-\tfrac{1}{2\sqrt{3}}, \tfrac{1}{2\sqrt{3}}, \tfrac{1}{2\sqrt{3}}, \tfrac{\sqrt{3}}{2}\Big).$$
For $\lambda_2 = -3$, solve $(\lambda_2 E - A)X = 0$ to find a basis for the solution space: $V_{\lambda_2} = \mathrm{span}\{(1, -1, -1, 1)\}$. Normalizing it, we get
$$\eta_4 = \Big(\tfrac{1}{2}, -\tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2}\Big).$$
Then $\{\eta_1, \eta_2, \eta_3, \eta_4\}$ forms an orthonormal basis of $\mathbb{R}^4$. Therefore the orthogonal matrix is
$$T = \begin{pmatrix} \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{6}} & -\tfrac{1}{2\sqrt{3}} & \tfrac{1}{2} \\ \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{6}} & \tfrac{1}{2\sqrt{3}} & -\tfrac{1}{2} \\ 0 & \tfrac{2}{\sqrt{6}} & \tfrac{1}{2\sqrt{3}} & -\tfrac{1}{2} \\ 0 & 0 & \tfrac{\sqrt{3}}{2} & \tfrac{1}{2} \end{pmatrix}$$
and
$$T^{-1} A T = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & -3 \end{pmatrix}.$$

7 Orthogonal Projection and Direct Sum

Let $A$ be an $m \times n$ matrix. We claim that $\ker(A) = \mathrm{row}(A)^\perp$ in $\mathbb{R}^n$. Write $\alpha_1, \dots, \alpha_n$ for the columns of $A$. If $AX = 0$ for $X = (x_1, \dots, x_n)^t$, then
$$x_1 \alpha_1 + x_2 \alpha_2 + \dots + x_n \alpha_n = 0,$$
and, for $v = X$, $(R_i, v) = 0$ for each row $R_i$ of $A$. So $v \in \mathrm{row}(A)^\perp$ if and only if $v \in \ker(A)$.

Corollary 7.1. Suppose $A$ is an $m \times n$ matrix of rank $n$. Then $A^t A$ is an invertible $n \times n$ matrix.

Proof. We need to show that $A^t A X = 0$ has only the zero solution. Indeed, $A^t A X = 0$ implies that $AX \in \ker(A^t)$. However, $AX \in \mathrm{col}(A) = \mathrm{row}(A^t)$, while $\ker(A^t) = \mathrm{row}(A^t)^\perp$. This implies that
$$AX \in \mathrm{row}(A^t) \cap \mathrm{row}(A^t)^\perp = \{0\}.$$
Therefore $AX = 0$. Since $\mathrm{rank}(A) = n$, the system $AX = 0$ has only the zero solution, so $X = 0$. $\square$

Note that $AX = 0$ has only the zero solution if and only if $\mathrm{rank}(A) = n$.
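To close the chapter, the two main computational facts can be checked with Python/NumPy (not part of the notes; the $4 \times 4$ matrix entries are as reconstructed in the example of Section 6, and the $3 \times 2$ matrix $M$ is a hypothetical example):

```python
import numpy as np

# 1) diagonalizing the symmetric matrix from the example in Section 6
#    (entries reconstructed from its stated eigenvalues and eigenvectors)
A = np.array([[ 0.,  1.,  1., -1.],
              [ 1.,  0., -1.,  1.],
              [ 1., -1.,  0.,  1.],
              [-1.,  1.,  1.,  0.]])
eigvals, T = np.linalg.eigh(A)            # columns of T: orthonormal eigenvectors
assert np.allclose(np.sort(eigvals), [-3., 1., 1., 1.])
assert np.allclose(T.T @ T, np.eye(4))    # T is orthogonal
assert np.allclose(T.T @ A @ T, np.diag(eigvals))   # T^t A T is diagonal

# 2) Corollary 7.1: if rank(M) = n, then M^t M is invertible (here m=3, n=2)
M = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
assert np.linalg.matrix_rank(M) == 2
AtA = M.T @ M                             # the 2 x 2 matrix M^t M
assert abs(np.linalg.det(AtA)) > 1e-12    # invertible
```

`numpy.linalg.eigh` is designed for symmetric input and returns real eigenvalues with orthonormal eigenvector columns, which is exactly the content of Theorem 6.5.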