6.. Length, Angle, and Orthogonality

In this section, we discuss the definition of length and angle for vectors and define what it means for two vectors to be orthogonal. Then, we see that linear systems are much easier to solve if their set of vectors is orthogonal. We begin by defining the dot product, which has already shown up in defining matrix-vector and matrix-matrix multiplication.

Definition 6... The dot product of two vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$ is given by
$$\vec{x} \cdot \vec{y} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \cdot \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n.$$
For example, 7 = () + (7) () =.

In addition, the dot product has the following properties.

Proposition 6... Given $\vec{x}, \vec{y}, \vec{z} \in \mathbb{R}^n$ and a scalar $c$,
(i) $\vec{x} \cdot \vec{y} = \vec{y} \cdot \vec{x}$,
(ii) $\vec{x} \cdot \vec{x} \geq 0$,
(iii) $\vec{x} \cdot \vec{x} = 0$ if and only if $\vec{x} = \vec{0}$,
(iv) $\vec{x} \cdot (c\vec{y}) = (c\vec{x}) \cdot \vec{y} = c(\vec{x} \cdot \vec{y})$,
(v) $(\vec{x} + \vec{y}) \cdot \vec{z} = \vec{x} \cdot \vec{z} + \vec{y} \cdot \vec{z}$,
(vi) $\vec{x} \cdot (\vec{y} + \vec{z}) = \vec{x} \cdot \vec{y} + \vec{x} \cdot \vec{z}$.

Proof. Problem 1.

Definition 6... The length of a vector $\vec{x} \in \mathbb{R}^n$ is given by
$$\|\vec{x}\| = \sqrt{\vec{x} \cdot \vec{x}} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$

We leave it to the reader to show that Definition 6.. of length agrees with the notion of length from geometry, Problem 2. Note that Definition 6.. can also be written as
$$\|\vec{x}\|^2 = \vec{x} \cdot \vec{x}.$$

Here are some properties of the length function.

Proposition 6... Given any $\vec{x}, \vec{y} \in \mathbb{R}^n$ and scalar $c$,
(i) $\|\vec{x}\| \geq 0$,
(ii) $\|\vec{x}\| = 0$ if and only if $\vec{x} = \vec{0}$,
(iii) $\|c\vec{x}\| = |c| \, \|\vec{x}\|$,
(iv) $\|\vec{x} + \vec{y}\|^2 + \|\vec{x} - \vec{y}\|^2 = 2\left(\|\vec{x}\|^2 + \|\vec{y}\|^2\right)$, called the Parallelogram Law.

Proof. Problem 3.

In addition, the length and dot product have another important relationship called the Cauchy-Schwarz Inequality, which bounds the size of the dot product of two vectors in terms of their lengths. It is quite important in more advanced mathematics and other applications. In addition, the triangle inequality tells us that going from point A directly to point B is never longer than going from A to C to B. Various forms of the triangle inequality appear all over mathematics.

Theorem 6.. (Cauchy-Schwarz Inequality). For all $\vec{x}, \vec{y} \in \mathbb{R}^n$,
$$|\vec{x} \cdot \vec{y}| \leq \|\vec{x}\| \, \|\vec{y}\|.$$
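Proof. Problems 15 and 16.

Proposition 6..6 (Triangle Inequality). For all $\vec{x}, \vec{y} \in \mathbb{R}^n$,
$$\|\vec{x} + \vec{y}\| \leq \|\vec{x}\| + \|\vec{y}\|.$$

Proof. Problem 18.

Definition 6..7. A vector $\vec{x} \in \mathbb{R}^n$ is a unit vector if $\|\vec{x}\| = 1$.

The set of all vectors in $\mathbb{R}^n$ represents the set of all magnitudes and directions, while the set of unit vectors in $\mathbb{R}^n$ represents the set of directions in $\mathbb{R}^n$, since magnitude is taken out of the equation by restricting to unit length.

Definition 6..8. The angle between two nonzero vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$ is given by
$$\angle(\vec{x}, \vec{y}) = \cos^{-1}\left(\frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|}\right).$$

Again, we should check that Definition 6..8 agrees with the notion of angle from geometry. Consider nonzero $\vec{x}, \vec{y} \in \mathbb{R}^n$. Then $\vec{x}$, $\vec{y}$, and $\vec{y} - \vec{x}$ form a possibly degenerate triangle in $\operatorname{span}(\vec{x}, \vec{y})$. The angle between $\vec{x}$ and $\vec{y}$ refers to the angle in this triangle between $\vec{x}$ and $\vec{y}$. That way, $0 \leq \angle(\vec{x}, \vec{y}) \leq \pi$. Note that whether this angle is directed counterclockwise from $\vec{x}$ to $\vec{y}$ or from $\vec{y}$ to $\vec{x}$ depends on $\vec{x}$ and $\vec{y}$. We visualize this within the 1 or 2 dimensional subspace $\operatorname{span}(\vec{x}, \vec{y})$ below:

[Figure: the triangle with sides $\vec{x}$, $\vec{y}$, and $\vec{y} - \vec{x}$, with the angle $\angle(\vec{x}, \vec{y})$ marked between $\vec{x}$ and $\vec{y}$.]

Now, by the Law of Cosines,
$$\|\vec{y} - \vec{x}\|^2 = \|\vec{y}\|^2 + \|\vec{x}\|^2 - 2\|\vec{x}\| \, \|\vec{y}\| \cos(\angle(\vec{x}, \vec{y})).$$
Using $\|\vec{x}\|^2 = \vec{x} \cdot \vec{x}$ and the properties of the dot product in Proposition 6..,
$$\begin{aligned}
(\vec{y} - \vec{x}) \cdot (\vec{y} - \vec{x}) &= \vec{y} \cdot \vec{y} + \vec{x} \cdot \vec{x} - 2\|\vec{x}\| \, \|\vec{y}\| \cos(\angle(\vec{x}, \vec{y})) \\
\vec{y} \cdot \vec{y} - 2(\vec{x} \cdot \vec{y}) + \vec{x} \cdot \vec{x} &= \vec{y} \cdot \vec{y} + \vec{x} \cdot \vec{x} - 2\|\vec{x}\| \, \|\vec{y}\| \cos(\angle(\vec{x}, \vec{y})) \\
-2(\vec{x} \cdot \vec{y}) &= -2\|\vec{x}\| \, \|\vec{y}\| \cos(\angle(\vec{x}, \vec{y})) \\
\cos(\angle(\vec{x}, \vec{y})) &= \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|} \\
\angle(\vec{x}, \vec{y}) &= \cos^{-1}\left(\frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|}\right).
\end{aligned}$$
Thus, the definition of angle agrees with our geometric definition of angle.

We are particularly interested in when two nonzero vectors are orthogonal (perpendicular), which is when their angle is $\frac{\pi}{2}$. We have
$$\angle(\vec{x}, \vec{y}) = \frac{\pi}{2} \iff \cos^{-1}\left(\frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|}\right) = \frac{\pi}{2} \iff \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|} = 0 \iff \vec{x} \cdot \vec{y} = 0.$$
This means the following definition agrees with our geometric notion as well.

Definition 6..9. Two vectors $\vec{x}, \vec{y} \in \mathbb{R}^n$ are orthogonal, denoted $\vec{x} \perp \vec{y}$, if $\vec{x} \cdot \vec{y} = 0$.

The Pythagorean Theorem generalizes to vectors as follows:

Proposition 6... If $\vec{x}, \vec{y} \in \mathbb{R}^n$ are orthogonal, then
$$\|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + \|\vec{y}\|^2.$$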
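The definitions of dot product, length, angle, and orthogonality above translate directly into a few lines of code. The following is a minimal Python sketch, not part of the original notes: the function names `dot`, `length`, `angle`, and `is_orthogonal`, the tolerance `tol`, and the example vectors are all assumptions chosen for illustration.

```python
import math

def dot(x, y):
    """Dot product x1*y1 + ... + xn*yn of two vectors with the same number of entries."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length of x, defined as the square root of the dot product of x with itself."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle between nonzero vectors x and y, in radians, between 0 and pi."""
    cosine = dot(x, y) / (length(x) * length(y))
    # Floating-point rounding can push the quotient slightly outside [-1, 1]; clamp it.
    return math.acos(max(-1.0, min(1.0, cosine)))

def is_orthogonal(x, y, tol=1e-12):
    """True when the dot product of x and y is zero up to the (assumed) tolerance tol."""
    return abs(dot(x, y)) <= tol

# Hypothetical example vectors, chosen only for illustration.
x = [1, 2, 2]
y = [2, 1, -2]
s = [a + b for a, b in zip(x, y)]  # the sum x + y

print(dot(x, y))             # dot product of x and y
print(length(x), length(y))  # lengths of x and y
print(angle(x, y))           # angle between x and y in radians
print(is_orthogonal(x, y))   # orthogonality test
# Pythagorean Theorem for orthogonal vectors: |x + y|^2 == |x|^2 + |y|^2
print(math.isclose(length(s) ** 2, length(x) ** 2 + length(y) ** 2))
```

With floating-point entries an exact comparison with zero is unreliable, which is why the orthogonality test in this sketch uses a small tolerance instead of `== 0`.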
We extend the notion of orthogonality to finite lists of vectors by requiring that each pair of vectors in that list be orthogonal.

Definition 6... A list $(\vec{u}_1, \dots, \vec{u}_k)$ of vectors in $\mathbb{R}^n$ is orthogonal if each vector is nonzero, and $\vec{u}_i \cdot \vec{u}_j = 0$ for all $i \neq j$. Also, $(\vec{u}_1, \dots, \vec{u}_k)$ is orthonormal if in addition all of the vectors are unit vectors, so
$$\vec{u}_i \cdot \vec{u}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$$

The canonical example of an orthonormal set is the set of standard basis vectors in $\mathbb{R}^n$, $(\vec{e}_1, \dots, \vec{e}_n)$. We have
$$\vec{e}_i \cdot \vec{e}_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j, \end{cases}$$
which we let the reader verify, Problem 5. Here are some other examples of orthogonal lists of vectors:,,,,,,,,, Even one pair of vectors not being orthogonal forces the whole set to not be orthogonal. For example,,,, is not orthogonal because and are not orthogonal, but every other pair is. Although $\vec{0}$ is orthogonal to every vector, we exclude $\vec{0}$ from orthogonal lists for reasons that will be clear later.

Why do we care about orthogonal sets of vectors? They have applications to physics, differential equations, and other areas of advanced mathematics. But for this class, they make linear systems easy to solve, and they will play an important role in constructing orthogonal projections and orthogonal linear maps (Sections 6. and 6.). If a linear system uses an orthogonal set of vectors, we can simplify our computations by a lot. It is then an immediate consequence that orthogonal subsets of $\mathbb{R}^n$ are linearly independent.

Proposition 6... Suppose $(\vec{u}_1, \dots, \vec{u}_k)$ in $\mathbb{R}^n$ is orthogonal, and $\vec{x} \in \mathbb{R}^n$. Then,
$$c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k = \vec{x}$$
has at most one solution. Moreover, if it has a solution, that solution is
$$c_j = \frac{\vec{x} \cdot \vec{u}_j}{\|\vec{u}_j\|^2}, \quad \text{for all } j.$$

Proof. Suppose that $c_1, \dots, c_k$ satisfy
$$c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k = \vec{x}.$$
In order to use that $(\vec{u}_1, \dots, \vec{u}_k)$ is orthogonal, we have to get the dot products $\vec{u}_i \cdot \vec{u}_j$ involved somehow. We can do that simply by taking the dot product of both sides with $\vec{u}_j$. Then, for all $j$, using the properties of
the dot product and that $(\vec{u}_1, \dots, \vec{u}_k)$ is orthogonal,
$$\begin{aligned}
(c_1 \vec{u}_1 + \cdots + c_j \vec{u}_j + \cdots + c_k \vec{u}_k) \cdot \vec{u}_j &= \vec{x} \cdot \vec{u}_j \\
(c_1 \vec{u}_1) \cdot \vec{u}_j + \cdots + (c_j \vec{u}_j) \cdot \vec{u}_j + \cdots + (c_k \vec{u}_k) \cdot \vec{u}_j &= \vec{x} \cdot \vec{u}_j \\
c_1 (\vec{u}_1 \cdot \vec{u}_j) + \cdots + c_j (\vec{u}_j \cdot \vec{u}_j) + \cdots + c_k (\vec{u}_k \cdot \vec{u}_j) &= \vec{x} \cdot \vec{u}_j \\
c_1 (0) + \cdots + c_j \left(\|\vec{u}_j\|^2\right) + \cdots + c_k (0) &= \vec{x} \cdot \vec{u}_j \\
c_j \|\vec{u}_j\|^2 &= \vec{x} \cdot \vec{u}_j \\
c_j &= \frac{\vec{x} \cdot \vec{u}_j}{\|\vec{u}_j\|^2}.
\end{aligned}$$
We can divide by $\|\vec{u}_j\|^2$ because $\|\vec{u}_j\|^2 \neq 0$ since $\vec{u}_j \neq \vec{0}$. This is part of why we assumed the vectors were nonzero in the definition of orthogonality. Since we started with an arbitrary solution and found only one possibility, this is the only solution if there is one.

Corollary 6... Any orthogonal list of vectors in $\mathbb{R}^n$ is linearly independent.

Proof. Let $(\vec{u}_1, \dots, \vec{u}_k)$ in $\mathbb{R}^n$ be orthogonal. By Proposition 6.., the linear system
$$c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k = \vec{0}$$
has at most one solution, which can only be the trivial solution. Hence, $(\vec{u}_1, \dots, \vec{u}_k)$ is linearly independent.

Example 6... Solve the following linear systems.

(a) x + x + x = 6,
(b) x + x + x =.

Solutions:

(a) Notice that,, is orthogonal. By Proposition 6.., if (a) has a solution, it is 6 6 6 c = =, c = =, c = = 7 Note that we do not know that this is a solution yet. It is just the only possible solution if there is one. So, let us check if it is: + 7 = 6. Yes, it is a solution. Thus, c =, c =, c = 7 is the only solution to (a).
(b) Notice that,, is orthogonal. By Proposition 6.., if (b) has a solution, it is c = = =, c = = = 6, c = = = Again, all we know is that this is the only solution if there is one. So, let us check if it is: + 6 + = / / / 7/ No, it is not. Thus, (b) has no solution.

Exercises:

1. Determine if the following lists are orthogonal.
(a) ([ ], [ ])
(b) ([ ], [ ])
(c),,
(d),, 7 8
2. Add vectors to the following sets to make orthogonal bases for $\mathbb{R}^n$, with the appropriate $n$, if possible. If it is not possible, why not?
(a) ([ ])
(b),
(c),
(d),
(e)
3. Solve the following linear systems using Proposition 6...
(a) x [ ] [ ] [ 7 + x = ] (b) (c) (d) (e) x x x x + x = 7 + x = 6 7 + x + x + x + x 7 = =

Problems:

1. () Prove the properties of the dot product in Proposition 6...
2. () Use the Pythagorean Theorem to show that Definition 6.. agrees with the notion of length from geometry.
3. () Prove the properties of length in Proposition 6...
4. () Show that if $(\vec{u}_1, \dots, \vec{u}_k)$ is orthogonal, then $\left(\frac{\vec{u}_1}{\|\vec{u}_1\|}, \dots, \frac{\vec{u}_k}{\|\vec{u}_k\|}\right)$ is orthonormal.
5. () Show that $(\vec{e}_1, \dots, \vec{e}_n)$ is an orthonormal list in $\mathbb{R}^n$.
6. () Show that for any $\vec{y} \in \mathbb{R}^n$, $T : \mathbb{R}^n \to \mathbb{R}$ given by $T(\vec{x}) = \vec{x} \cdot \vec{y}$ is a linear map, and that any linear map $T : \mathbb{R}^n \to \mathbb{R}$ is of this form.
7. (+) Show that any linear map $T : \mathbb{R}^n \to \mathbb{R}^m$ is of the form
$$T(\vec{x}) = \begin{bmatrix} \vec{x} \cdot \vec{u}_1 \\ \vdots \\ \vec{x} \cdot \vec{u}_m \end{bmatrix}$$
for some $\vec{u}_1, \dots, \vec{u}_m \in \mathbb{R}^n$.
8. () Suppose that $(\vec{u}_1, \dots, \vec{u}_k)$ and $(\vec{v}_1, \dots, \vec{v}_m)$ satisfy $\vec{u}_i \perp \vec{v}_j$ for all $i$ and $j$. Show that any linear combination of $(\vec{u}_1, \dots, \vec{u}_k)$ is orthogonal to any linear combination of $(\vec{v}_1, \dots, \vec{v}_m)$.
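9. () Suppose that $(\vec{u}_1, \dots, \vec{u}_k)$ and $(\vec{v}_1, \dots, \vec{v}_m)$ are orthogonal and satisfy $\vec{u}_i \perp \vec{v}_j$ for all $i$ and $j$. Show that $(\vec{u}_1, \dots, \vec{u}_k, \vec{v}_1, \dots, \vec{v}_m)$ is orthogonal.
10. () Show that if $(\vec{u}_1, \dots, \vec{u}_n)$ is an orthonormal basis for $\mathbb{R}^n$, then for all $\vec{x} \in \mathbb{R}^n$,
$$\vec{x} = (\vec{x} \cdot \vec{u}_1)\vec{u}_1 + \cdots + (\vec{x} \cdot \vec{u}_n)\vec{u}_n.$$
11. () Show that for all $\vec{x}, \vec{y} \in \mathbb{R}^n$,
$$\vec{x} \cdot \vec{y} = \frac{1}{4}\|\vec{x} + \vec{y}\|^2 - \frac{1}{4}\|\vec{x} - \vec{y}\|^2.$$
12. () Prove Proposition 6...
13. (+) Show that if $(\vec{u}_1, \dots, \vec{u}_k)$ is orthogonal, then
$$\|\vec{u}_1 + \cdots + \vec{u}_k\|^2 = \|\vec{u}_1\|^2 + \cdots + \|\vec{u}_k\|^2.$$
14. (+) Show that if $(\vec{u}_1, \dots, \vec{u}_k)$ is orthonormal and $c_1, \dots, c_k$ are scalars, then
$$\|c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k\|^2 = c_1^2 + \cdots + c_k^2.$$
15. Prove Theorem 6.. using Definition 6..8.
16. The goal of this problem is to give another proof of Theorem 6... Fix $\vec{x}, \vec{y} \in \mathbb{R}^n$, and consider the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(c) = \|\vec{x} - c\vec{y}\|^2$.
(a) () Find the minimum value of $f$ and where it occurs.
(b) () Deduce that $|\vec{x} \cdot \vec{y}| \leq \|\vec{x}\| \, \|\vec{y}\|$.
17. () Show that for any orthogonal list $(\vec{u}_1, \dots, \vec{u}_k)$ in $\mathbb{R}^n$ with $k < n$, there exists $\vec{u}_{k+1} \in \mathbb{R}^n$ so that $(\vec{u}_1, \dots, \vec{u}_k, \vec{u}_{k+1})$ is orthogonal.
18. (+) Prove Proposition 6..6.
19. () Show that for all $\vec{u}_1, \dots, \vec{u}_k \in \mathbb{R}^n$ and scalars $c_1, \dots, c_k$,
$$\|c_1 \vec{u}_1 + \cdots + c_k \vec{u}_k\| \leq |c_1| \, \|\vec{u}_1\| + \cdots + |c_k| \, \|\vec{u}_k\|.$$
20. () Show that for all $\vec{x}, \vec{y} \in \mathbb{R}^n$,
$$\big| \|\vec{x}\| - \|\vec{y}\| \big| \leq \|\vec{x} - \vec{y}\|.$$
21. Suppose $T : \mathbb{R}^n \to \mathbb{R}^n$ is linear and diagonalizable with nonnegative eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ and corresponding eigenbasis $(\vec{v}_1, \dots, \vec{v}_n)$, so that $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$.
(a) () If $(\vec{v}_1, \dots, \vec{v}_n)$ is orthogonal, show that for all $\vec{x} \in \mathbb{R}^n$,
$$\lambda_1 \|\vec{x}\| \leq \|T(\vec{x})\| \leq \lambda_n \|\vec{x}\|.$$
(b) () Does either inequality in (a) need to hold if $(\vec{v}_1, \dots, \vec{v}_n)$ is not orthogonal? Why or why not?
22. Suppose $L : \mathbb{R}^n \to \mathbb{R}$ satisfies the properties of length in Proposition 6... $L$ need not coincide with the length function from this section! Then, define
$$\langle \vec{x}, \vec{y} \rangle = \frac{1}{4} L(\vec{x} + \vec{y})^2 - \frac{1}{4} L(\vec{x} - \vec{y})^2.$$
Show that the following hold for all $\vec{x}, \vec{y}, \vec{z} \in \mathbb{R}^n$:
(i) () $\langle \vec{x}, \vec{x} \rangle = L(\vec{x})^2$.
(ii) () $\langle \vec{x}, \vec{y} \rangle = \langle \vec{y}, \vec{x} \rangle$.
(iii) () $\langle -\vec{x}, \vec{y} \rangle = -\langle \vec{x}, \vec{y} \rangle$.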
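(iv) () $\langle \vec{x} + \vec{y}, \vec{z} \rangle = \langle \vec{x}, \vec{z} \rangle + \langle \vec{y}, \vec{z} \rangle$.
(v) () $\langle n\vec{x}, \vec{y} \rangle = n \langle \vec{x}, \vec{y} \rangle$ for all positive integers $n$.
(vi) (+) $\langle r\vec{x}, \vec{y} \rangle = r \langle \vec{x}, \vec{y} \rangle$ for all rational scalars $r$, where $r = \frac{a}{b}$ for some integers $a, b$.
(vii) () (Requires a bit of Real Analysis) $\langle c\vec{x}, \vec{y} \rangle = c \langle \vec{x}, \vec{y} \rangle$ for all scalars $c$.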
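
As a numerical companion to the proposition on linear systems with an orthogonal list of vectors, and to the two-step method of the worked example (compute the only possible coefficients, then substitute back to check), here is a minimal Python sketch. It is an illustration under stated assumptions, not the notes' own method: the function names `dot` and `solve_orthogonal` and the example vectors are hypothetical, the entries are assumed to be integers so that exact fractions can be used, and the input list is assumed to already be orthogonal.

```python
from fractions import Fraction

def dot(x, y):
    """Dot product of two vectors with the same number of entries."""
    return sum(a * b for a, b in zip(x, y))

def solve_orthogonal(us, x):
    """Solve c_1 u_1 + ... + c_k u_k = x for an orthogonal list us.

    Assumes (does not verify) that us is orthogonal with integer entries.
    Returns the list of coefficients, or None if the system has no solution.
    """
    # Step 1: the only possible solution, c_j = (x . u_j) / (u_j . u_j).
    cs = [Fraction(dot(x, u), dot(u, u)) for u in us]
    # Step 2: substitute back, since the candidate need not be a solution.
    combo = [sum(c * u[i] for c, u in zip(cs, us)) for i in range(len(x))]
    return cs if combo == list(x) else None

# Hypothetical orthogonal list in R^3, chosen only for illustration.
us = [[1, 1, 0], [1, -1, 0], [0, 0, 2]]
print(solve_orthogonal(us, [3, 1, 4]))      # a system that has a solution
# Dropping the third vector leaves a system with no solution for the same x.
print(solve_orthogonal(us[:2], [3, 1, 4]))
```

Exact fractions are used here so that the substitution check in step 2 is an exact comparison; with floating-point entries one would compare up to a tolerance instead.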