NOTES ON LINEAR ALGEBRA CLASS HANDOUT

ANTHONY S. MAIDA

CONTENTS
1. Introduction
2. Basis Vectors
3. Linear Transformations
   3.1. Example: Rotation Transformation
4. Matrix Multiplication and Function Composition
   4.1. Example: Rotation Transformation Revisited
5. Identity Matrices, Inverses, and Determinants
   5.1. Example: Inverse of Rotation Transformation
6. Eigenvectors and Eigenvalues of a Matrix
   6.1. Example: Finding Eigenvalues and Eigenvectors
   6.2. Example: Eigenvalues of Rotation Matrix
7. Significance of Eigenvectors and Eigenvalues
   7.1. Example: Raising M to a Power
   7.2. Example: Stability of Discrete System of Linear Equations
   7.3. Example: Dominant Mode of a Discrete Time System of Linear Equations
8. Vector Spaces
   8.1. Distance metrics
   8.2. Inner Product or Dot Product
      8.2.1. Properties of the Inner Product
9. Vector Geometry
   9.1. Perpendicular Vectors
   9.2. Cosine of the Angle Between Vectors
      9.2.1. Method 1
      9.2.2. Method 2
10. Matlab

Date: Version February 13, 2015. Copyright © 2007-2015.

1. INTRODUCTION

This write-up explains some concepts in linear algebra using the intuitive case of 2 × 2 matrices. Once the reader envisions the concepts for these simple matrices, it is hoped that his or her intuition will extend to the more general case of n × n matrices and make more advanced treatments of the topic accessible. In the following, we assume that M is a 2 × 2 matrix with elements as shown below.

    M = [ a  b ;  c  d ]    (1)

A key idea will be that a matrix represents a linear transformation. We will also need to represent two-component vectors such as x = (x1, x2). Since we are working in a linear algebra context, we will represent these as 2 × 1 matrices, also known as column vectors:

    x = [ x1 ; x2 ].    (2)

Writing [x1 x2]^T denotes the same column vector and is an alternative to expression (2). This convention saves vertical space in written documents.

2. BASIS VECTORS

We denote the continuous, two-dimensional plane of points by R^2, which is shorthand for R × R. Points in the plane are denoted by vectors. When we talk about R^2 combined with the rules of vector algebra, we are using a vector space. Vectors can be decomposed into a canonical representation, which is just a (linear) combination of so-called basis vectors. Any pair of vectors can serve as a basis for R^2 as long as they are nonzero and not collinear. When discussing the vector space R^2, we normally use the standard basis, which consists of the vectors [1, 0]^T and [0, 1]^T. An arbitrary point [x1 x2]^T in two-dimensional space can be decomposed into a linear combination of these basis vectors. For instance, the point [2, 3]^T can be represented as

    [ 2 ; 3 ] = 2 [ 1 ; 0 ] + 3 [ 0 ; 1 ].    (3)

The right-hand side of the above formula is an example of a linear combination. As noted above, any pair of vectors that are linearly independent could be used as a basis. A pair of vectors in R^2 is linearly independent if both are nonzero and their directions are not aligned. Given a linear transformation, we may choose a basis set of vectors that is convenient for understanding the structure of the transformation. A numerical illustration of decomposing a point into basis coordinates is sketched below; after that, we define linear transformations.
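As a quick check of expression (3), here is a minimal sketch in MATLAB (the tool introduced in Section 10). The second basis B is an arbitrary choice for illustration, not one used elsewhere in these notes.

    % Decompose x = [2; 3] in the standard basis and in another basis.
    x = [2; 3];
    B = [1 1; 0 1];   % another valid basis: columns are nonzero, not collinear
    alpha = B \ x;    % solves B*alpha = x, giving coordinates [-1; 3]
    B * alpha         % reconstructs [2; 3]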

3. LINEAR TRANSFORMATIONS

The first thing to learn is that a 2 × 2 matrix of real numbers is not just a table of numbers. It represents a linear transformation from R^2 to R^2. That is, it represents a mapping from points in the plane to other points in the plane. Algebraically, a linear transformation is a function f(·) with the following property, where a and b are scalars and v and w are two-dimensional vectors.

    f(a v + b w) = a f(v) + b f(w)    (4)

If a mapping has the above property, then it is a linear transformation. A linear transformation maps the basis vectors to some other (possibly the same) points. Let us call these points [a, c]^T and [b, d]^T. Specifically, we have

    f([1 ; 0]) = [a ; c]    (5)
    f([0 ; 1]) = [b ; d].    (6)

Keeping this in mind, let us see what a linear transformation f does to an arbitrary vector [x1 x2]^T.

    f([x1 ; x2]) = f(x1 [1 ; 0] + x2 [0 ; 1])
                 = x1 f([1 ; 0]) + x2 f([0 ; 1])
                 = x1 [a ; c] + x2 [b ; d]
                 = [a x1 ; c x1] + [b x2 ; d x2]
                 = [a x1 + b x2 ; c x1 + d x2]
                 = [ a  b ;  c  d ] [x1 ; x2] = M x    (7)

Because f([x1 x2]^T) is shown to equal M x, multiplying a matrix with a vector is the same as applying a linear transformation to the vector. The matrix M represents a linear transformation and is defined by what the transformation does to the basis vectors.

3.1. Example: Rotation Transformation. A counterclockwise rotation of a point in the plane about the origin is an example of a linear transformation and can be represented by a 2 × 2 matrix. The a and c values of the matrix are determined by specifying how the basis vector [1, 0]^T should be transformed. Similarly, the b and d values are determined by specifying how [0, 1]^T should be transformed (see Figure 1). The resulting matrix is given below.

    M = [ cos θ  -sin θ ;  sin θ  cos θ ]    (8)

FIGURE 1 (panels A and B). Illustration of the trigonometry for rotating the basis vectors [1, 0]^T and [0, 1]^T through angle θ: they map to (a, c)^T and (b, d)^T, respectively.
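The sketch below checks Equation 8 and the linearity property (4) numerically; the angle, vectors, and scalars are arbitrary choices.

    % Rotation by theta as a matrix; check the linearity property (4).
    theta = pi/6;
    M = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    v = [1; 2];  w = [3; -1];  a = 2;  b = -0.5;
    norm(M*(a*v + b*w) - (a*(M*v) + b*(M*w)))   % zero up to rounding error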

4. MATRIX MULTIPLICATION AND FUNCTION COMPOSITION

Let f(·), g(·), and h(·) be arbitrary functions that map values in R^2 to values in R^2, and let x denote a vector in R^2. Let (g ∘ f)(·) denote the function that results from applying g(·) to the output of f(·). In other words, (g ∘ f)(x) means the same thing as g(f(x)), which is depicted in Figure 2. The former notation is the mathematician's way of creating a name for a large procedure, (g ∘ f)(·), that is built of two subprocedures f(·) and g(·) executed in sequence. The operation of assembling functions in this fashion is called function composition, and the operator ∘ is the function composition operator. Function composition is associative. Specifically,

    (h ∘ (g ∘ f))(·) = ((h ∘ g) ∘ f)(·).    (9)

FIGURE 2. Procedural representation of the effect of composing functions g(·) and f(·) to obtain z = g(f(x)): x feeds into f, whose output y feeds into g, whose output is z.

Although function composition is associative, it is not commutative; commuting would correspond to swapping the order of subprocedures within a procedure. Now let us suppose that the functions f(·), g(·), and h(·) are linear transformations. Then they can be represented by 2 × 2 matrices F, G, and H. When we write

    H(G(F x))    (10)

it means to first multiply F with x. This yields a point in R^2 represented as a column vector. Multiplying G with this result yields another column vector, and H can be multiplied with that result. Thus, matrix multiplication corresponds to function composition. Since function composition is associative, it does not matter how we parenthesize the matrices, as long as we do not change their order. In fact, we can leave the parentheses out completely, as shown below.

    H G F x    (11)

In this vein, it is worth noting that matrix multiplication, like function composition, is associative but not commutative. If this were not true, matrix multiplication would be unable to represent function composition. A small numerical check appears below.
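This sketch verifies associativity and non-commutativity for three arbitrary example matrices.

    % Matrix multiplication is associative but not commutative.
    F = [1 2; 0 1];  G = [0 -1; 1 0];  H = [2 0; 0 3];
    x = [1; 1];
    norm(H*(G*(F*x)) - (H*G*F)*x)   % zero: parenthesization does not matter
    G*F - F*G                       % nonzero: order does matter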

4.1. Example: Rotation Transformation Revisited. Suppose one wants to apply a rotation transformation by an amount θ1 and then apply another rotation transformation by an amount θ2. This is shown below.

    [ cos θ2  -sin θ2 ;  sin θ2  cos θ2 ] [ cos θ1  -sin θ1 ;  sin θ1  cos θ1 ] [ x1 ; x2 ]    (12)

Of course, we can compose the matrix transformations by multiplying the matrices together. This gives the matrix

    [ (cos θ1 cos θ2 - sin θ1 sin θ2)   -(sin θ1 cos θ2 + cos θ1 sin θ2) ;
      (sin θ1 cos θ2 + cos θ1 sin θ2)    (cos θ1 cos θ2 - sin θ1 sin θ2) ].    (13)

If θ = θ1 + θ2, then this matrix is equivalent to that in Equation 8. Since the corresponding matrix elements are equal, we have proved the following two trigonometric identities.

    sin(θ1 + θ2) = sin θ1 cos θ2 + cos θ1 sin θ2    (14)
    cos(θ1 + θ2) = cos θ1 cos θ2 - sin θ1 sin θ2    (15)

Later, we will use the latter identity to obtain a formula for the cosine of the angle between two vectors.
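As a quick sanity check of Equations 13-15, this sketch composes two rotations for one arbitrary pair of angles.

    % Composing two rotations equals one rotation by the summed angle.
    R = @(t) [cos(t) -sin(t); sin(t) cos(t)];
    t1 = 0.4;  t2 = 1.1;
    norm(R(t2)*R(t1) - R(t1 + t2))   % zero up to rounding error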

5. IDENTITY MATRICES, INVERSES, AND DETERMINANTS

If f(·) is the identity function, then f(x) = x for all x. This function is a linear transformation and can be represented by the identity matrix, shown below.

    I = [ 1  0 ;  0  1 ]    (16)

If a function f(·) has an inverse, denoted f^{-1}(·), then f ∘ f^{-1}(·) = f^{-1} ∘ f(·), which equals the identity function. If M is a matrix that has an inverse M^{-1}, then

    M M^{-1} = M^{-1} M = I.    (17)

For a 2 × 2 matrix M, define the determinant to be the quantity ad - bc. Specifically, det(M) = ad - bc. Note that sometimes the notation |M| is used to denote the determinant of matrix M. The formula for the inverse of a matrix is given below.

    M^{-1} = (1 / det(M)) [ d  -b ;  -c  a ]    (18)

Note that this formula is defined only if det(M) ≠ 0. In particular, a matrix has an inverse if and only if its determinant is not equal to 0. This is convenient because it allows one to easily determine whether a linear transformation has an inverse.

A linear transformation maps the points falling within the unit square into a parallelogram. The unit square consists of the points in the region of R^2 where 0 ≤ x1 ≤ 1 and 0 ≤ x2 ≤ 1. The corners of the unit square map to the corners of the parallelogram (see Figure 3). The area of this parallelogram gives the magnitude of the determinant. If the transformation maps the square into a line or a point (both of which are degenerate parallelograms), then the value of the determinant is zero. Otherwise, it is nonzero.

FIGURE 3. Image of the unit square generated by the matrix transformation: a parallelogram with vertices at (a, c) and (b, d). Its area is given by ad - bc, which is nonzero unless the parallelogram degenerates to a line or a point.

If the parallelogram is not degenerate, then the mapping specified by the matrix is nonsingular. Specifically, unique points on the unit square map to unique points on the parallelogram, and the reverse is also true. If the image is a line or a point, then many points on the square map to single points on the degenerate parallelogram. In this case, the function does not (for obvious reasons) have an inverse.

5.1. Example: Inverse of Rotation Transformation. The matrix in Formula 8 gives a counterclockwise rotation by an angle θ. The inverse transformation instead gives a clockwise rotation. Using the formula for the inverse, we can obtain the clockwise rotation matrix shown below.

    M^{-1} = [ cos θ  sin θ ;  -sin θ  cos θ ]    (19)

When deriving this, remember that sin² θ + cos² θ = 1. Also note that, for the case of rotation, the inverse of the rotation matrix is its transpose. That is, M^{-1} = M^T. When this happens, we have an orthonormal transformation. This corresponds to a (possibly flipped) rigid rotation. By rigid rotation, we mean that vector lengths and angles between vectors do not change when the transformation is applied.
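This sketch checks Equation 18 against MATLAB's built-in inverse, and the transpose property from Section 5.1; the matrix entries and angle are arbitrary.

    % Inverse of a 2-by-2 matrix via the ad - bc formula.
    a = 2; b = 1; c = 5; d = 3;
    M = [a b; c d];
    Minv = (1/(a*d - b*c)) * [d -b; -c a];
    norm(Minv - inv(M))              % zero up to rounding error
    % For a rotation matrix, the inverse is the transpose.
    theta = pi/7;
    R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    norm(inv(R) - R')                % zero up to rounding error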

6. EIGENVECTORS AND EIGENVALUES OF A MATRIX

There is an effective way to perform a structural analysis of a linear transformation. It involves the eigenvectors and eigenvalues of the matrix representing the transformation. Consider the situation of multiplying a matrix with a nonzero vector, as in M x. If the vector x is chosen correctly, this operation has the effect of shrinking or stretching the vector, but it does not change the direction of the vector other than perhaps reversing it. This can be written as

    M x = λ x.    (20)

In the above, λ is a scalar. Given a matrix M, if a nonzero vector x has this property, then x is said to be an eigenvector of M, and λ is its associated eigenvalue. We shall now solve for the eigenvectors and eigenvalues of the 2 × 2 matrix M. Consider the following steps.

    M x = λ I x    (21)
    M x - λ I x = 0    (22)
    (M - λ I) x = 0    (23)

In the above, note that (M - λI) denotes a matrix. Since we have assumed that x is nonzero, the only way this matrix can map x to zero is if it maps the unit square into a degenerate parallelogram. Thus the determinant of this matrix must be zero; the resulting condition, det(M - λI) = 0, is called the characteristic equation for matrix M. It gives us some leverage to find the value of λ. The matrix (M - λI) expands as shown below.

    M - λ I = [ a  b ;  c  d ] - [ λ  0 ;  0  λ ]    (24)
            = [ a - λ   b ;  c   d - λ ]    (25)

However, we are actually interested in the determinant of this matrix, rather than the matrix itself. The determinant expands to

    |M - λ I| = (a - λ)(d - λ) - bc    (26)
              = λ² - (a + d)λ + (ad - bc) = 0.    (27)

The above is a quadratic equation where λ is the unknown, and it can be solved using the quadratic formula. Its left-hand side is called the characteristic polynomial for matrix M. When this is solved, we have obtained up to two eigenvalues for the matrix M. Once we know the eigenvalues, we can use Formula 20 to solve for the eigenvectors that go with them. For a 2 × 2 matrix with distinct eigenvalues, we will have two eigenvalues and one eigenvector (up to scaling) to go with each eigenvalue.

Notice that, in Equation 27, the quantity a + d is the sum of the diagonal elements of matrix M. This is called the trace of M. Also note that ad - bc is the determinant of M.

Let us denote the trace of M by τ and the determinant of M by Δ. Then, using the quadratic formula, we can write concise formulas for the eigenvalues.

    λ1 = (τ + sqrt(τ² - 4Δ)) / 2    (28)
    λ2 = (τ - sqrt(τ² - 4Δ)) / 2    (29)

6.1. Example: Finding Eigenvalues and Eigenvectors. Compute the eigenvalues and eigenvectors of the matrix

    M = [ 4  0 ;  0  1 ].    (30)

Solution.

    | 4 - λ   0 ;  0   1 - λ | = (4 - λ)(1 - λ) = λ² - 5λ + 4 = 0    (31)

The quadratic equation on the right has two solutions: λ1 = 4 and λ2 = 1. These are the two eigenvalues, listed in order of numerical magnitude. The corresponding eigenvectors can be obtained by substituting each value of λ back into Equation 23. Specifically, using λ1 gives the first eigenvector.

    (M - λ1 I) x = [ 4 - λ1   0 ;  0   1 - λ1 ] [ x1 ; x2 ]    (32)
                 = [ 0  0 ;  0  -3 ] [ x1 ; x2 ]    (33)
                 = [ 0 ; -3 x2 ] = [ 0 ; 0 ]    (34)

The above equations imply that x2 = 0 but place no constraint on the value of x1. For convenience, we set x1 = 1 so that the length of the vector is one. We shall denote the eigenvector that corresponds to λ1 by ξ1. Thus, ξ1 = [1, 0]^T. A similar analysis shows that ξ2 = [0, 1]^T.

6.2. Example: Eigenvalues of Rotation Matrix. If one computes the eigenvalues of the rotation matrix, one finds that they are complex numbers (unless θ is a multiple of π). This makes sense because the transformation rotates every vector in R^2, so no transformed vector points along its original direction.
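The sketch below checks Equations 28-31 numerically. Note that eig's output order is not guaranteed, so we sort for the comparison.

    % Eigenvalues of a 2-by-2 matrix from its trace and determinant.
    M = [4 0; 0 1];
    tau = trace(M);  Delta = det(M);
    lambda1 = (tau + sqrt(tau^2 - 4*Delta)) / 2   % 4
    lambda2 = (tau - sqrt(tau^2 - 4*Delta)) / 2   % 1
    sort(eig(M), 'descend')                       % the same values: [4; 1]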

7. SIGNIFICANCE OF EIGENVECTORS AND EIGENVALUES

For this section, we will assume that the eigenvalues of the matrix under discussion are distinct and that we are working with an n × n matrix. Given a matrix M, there is an alternative way to represent it using its eigenvectors and eigenvalues. This yields a canonical representation that makes the structure of the underlying linear transformation explicit. Define the matrix Λ as shown below.

    Λ = [ λ1  0   ...  0  ;
          0   λ2  ...  0  ;
          ...             ;
          0   0   ...  λn ]    (35)

This is a diagonal matrix of the eigenvalues of M, where the λi are ordered according to their magnitude. We will see that this matrix represents the same linear transformation as M, but using a more convenient basis set. To go further, we have to change the representation of the vectors we have been using so that they use the basis set assumed by the matrix Λ. Define the matrix V as shown below.

    V = [ ξ1  ξ2  ...  ξn ]    (36)

V is an n × n matrix whose first column is the first eigenvector of M, and so forth. The eigenvectors are ordered according to the corresponding eigenvalues in Λ (which are in turn ordered according to their magnitudes). The original matrix M can be factored into the product of V, Λ, and V^{-1}, as shown below. (Why? Because M ξi = λi ξi holds column by column, giving M V = V Λ, and V is invertible when the eigenvalues are distinct.)

    M = V Λ V^{-1}    (37)

In other words, the factorized transformation can be applied to x as shown below.

    M x = V Λ V^{-1} x    (38)

How do we interpret this? First notice that V and V^{-1} are inverses. V^{-1} represents a transformation that converts x to a representation that Λ understands. Λ applies the transformation of interest. Finally, V converts the result of the transformation back to the representation we started with.

Why is the diagonal matrix representation Λ desirable? Let us look in more detail at the eigenvector matrix V. The columns of this matrix (the eigenvectors) serve as an alternate basis set for points in the underlying space. Specifically, an arbitrary point x in the space can be represented as a linear combination of the eigenvectors, as shown below.

    x = α1 ξ1 + α2 ξ2 + ... + αn ξn    (39)

Put another way, the vector x, which is represented using the standard basis, is represented as α = [α1 ... αn]^T when using the alternate basis consisting of the eigenvectors of M. This allows us to create a very simple representation of the transformation M using this new basis. The derivation below shows this.

    M x = V Λ V^{-1} (α1 ξ1 + α2 ξ2 + ... + αn ξn)    (40)
        = V Λ V^{-1} V [α1 ... αn]^T    (41)
        = V Λ [α1 ... αn]^T    (42)
        = λ1 α1 ξ1 + λ2 α2 ξ2 + ... + λn αn ξn    (43)
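Here is a minimal sketch verifying the factorization (37) and the eigen-coordinate view of (39)-(43) on a small, arbitrarily chosen matrix with distinct eigenvalues.

    % Verify M = V*Lambda*inv(V) and the change of basis into eigen-coordinates.
    M = [2 1; 1 3];
    [V, Lambda] = eig(M);
    norm(M - V*Lambda/V)           % zero up to rounding error (A/B means A*inv(B))
    x = [1; 2];
    alpha = V \ x;                 % coordinates of x in the eigenvector basis
    norm(M*x - V*(Lambda*alpha))   % zero: the transformation acts diagonally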

7.1. Example: Raising M to a Power. In the next section, we will need to consider the quantity M^k, where k is a positive integer. It is very useful to represent this in terms of the matrix factorization. Specifically,

    M^k = (V Λ V^{-1})^k = V Λ^k V^{-1}    (44)

because the interior V^{-1} V pairs cancel. Further, note that Λ is a diagonal matrix. Raising a diagonal matrix to a power raises the elements on the diagonal to that power. Thus,

    Λ^k = [ λ1^k  0    ...  0    ;
            0     λ2^k ...  0    ;
            ...                  ;
            0     0    ...  λn^k ]    (45)

This representation of M^k will be used in the next example.
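A quick numerical check of Equations 44-45, again with an arbitrary example matrix:

    % Raising M to a power through its eigen-factorization.
    M = [0.9 0.2; 0.1 0.8];
    [V, Lambda] = eig(M);
    k = 20;
    norm(M^k - V*Lambda^k/V)                 % zero up to rounding error
    norm(Lambda^k - diag(diag(Lambda).^k))   % a diagonal matrix powers entrywise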

7.2. Example: Stability of Discrete System of Linear Equations. This example shows how to use eigenvalues to study the stability of a discrete system of linear equations. The discrete system may be a set of coupled equations. The same system can alternately be represented as a set of uncoupled equations, which makes it much easier to analyze. Let us represent a discrete system of linear equations with constant coefficients as shown below.

    x(k + 1) = M x(k) + b    (46)

The vector x holds the values of the state variables, which take on values at discrete time steps k = 0, 1, 2, ... Matrix M is a matrix of constant coefficients. Finally, b is a vector of inputs that are constant over the life of the system. Let us pose the question: Is this system stable? If the system is stable, there is a vector x* such that whenever the state vector x(k) is sufficiently close to x*, then x(k') tends to evolve toward x* for all k' ≥ k. That is, the difference between the current state and the stable state, represented as x(k') - x*, approaches zero as k' approaches infinity. Furthermore, the relation below also holds.

    x* = M x* + b    (47)

The above equation holds because the system is stable at the point x*. If the system state ever reaches x*, it stays at x* for all subsequent k. With this in mind, let us expand the expression representing the difference between the current state and the stable state, as shown below.

    x(k + 1) - x* = M x(k) + b - M x* - b    (48)
                  = M (x(k) - x*)    (49)

We can perform a change of variable to simplify the above expression. We shall let z(k) = x(k) - x*. This allows us to rewrite the above equation as

    z(k + 1) = M z(k).    (50)

Since z(k) represents the difference between the current state vector and the stable state, z(k) must approach zero as k approaches infinity for a system with stable state x*, and this must hold even though the initial difference z(0) need not be zero. These two facts imply that the matrix M^k approaches the zero matrix as k approaches infinity. This is because

    z(k) = M^k z(0).    (51)

From the previous example, we know that M^k can be factored as

    M^k = V Λ^k V^{-1},    (52)

with Λ^k as given in Equation 45. Thus M^k approaches zero as k approaches infinity if and only if Λ^k approaches zero as k approaches infinity, and Λ^k approaches zero if and only if |λi| < 1 for all i ∈ {1, ..., n}.

7.3. Example: Dominant Mode of a Discrete Time System of Linear Equations. Consider the discrete time system

    x(k + 1) = A x(k)    (53)

Suppose that matrix A has n distinct eigenvalues. The eigenvalue λi with the largest magnitude |λi| is the dominant eigenvalue. As k approaches infinity, the state vector x(k + 1) evolves to align with the eigenvector corresponding to the dominant eigenvalue. The initial state vector can be represented as

    x(0) = α1 ξ1 + ... + αn ξn.    (54)

Thus, the solution to the system for any time step k ≥ 1 is

    x(k) = α1 λ1^k ξ1 + ... + αn λn^k ξn.    (55)

Without loss of generality, assume that λ1 is the dominant eigenvalue. |λ1|^k grows faster than |λi|^k for any other eigenvalue. Therefore, the following holds

    |α1 λ1^k| >> |αi λi^k|    (56)

as long as α1 ≠ 0. Therefore, for sufficiently large k, the state vector x(k) is essentially aligned with ξ1.
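The sketch below simulates both examples; the matrices, input vector, and initial states are arbitrary illustrations.

    % Stability: all |lambda_i| < 1, so x(k) converges to the fixed point.
    M = [0.5 0.3; 0.2 0.4];  b = [1; 1];
    xstar = (eye(2) - M) \ b;            % solves xstar = M*xstar + b
    x = [10; -4];                        % arbitrary initial state
    for k = 1:50, x = M*x + b; end
    norm(x - xstar)                      % small: the state has converged
    max(abs(eig(M)))                     % spectral radius, less than 1

    % Dominant mode: iterates of x(k+1) = A*x(k) align with the dominant
    % eigenvector (here [1; 1]/sqrt(2), with eigenvalue 3).
    A = [2 1; 1 2];
    x = [1; 0];
    for k = 1:30, x = A*x / norm(A*x); end   % normalize to avoid overflow
    x'                                       % approximately [0.7071 0.7071]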

8. VECTOR SPACES

In Section 2, we referred to a vector space but did not define it. We need to delve into this so that we can say more about vector geometry. A vector is a quantity consisting of a direction and a magnitude, often drawn as an arrow with a head and a tail. If we assume that vectors always have their tails at the origin in Euclidean space, then we can specify a vector by listing the coordinates of its head. In three-dimensional space, the vector x can be specified using the coordinates (x1, x2, x3). Such an expression is called a tuple, and x1, x2, and x3 are called its components. For our purposes, we shall assume that the components are always real numbers and that the vector space is defined over the real numbers. That is, whenever a scalar is encountered, it is a real number. Intuitively, by vector space we mean a set of vectors which is closed under a set of operations (addition and multiplication by a scalar). For instance, if you add any two vectors, the result is another vector. Vectors in Euclidean space over the field of real numbers have the following properties.

1. The sum of two vectors is a vector (closure). You add two vectors by adding their corresponding components; both vectors must have the same number of components. Vector addition is commutative and associative. The following example shows how to add two vectors and also shows that vector addition is commutative.

    x + y = (x1, x2, ..., xn) + (y1, y2, ..., yn)
          = (x1 + y1, x2 + y2, ..., xn + yn)
          = (y1 + x1, y2 + x2, ..., yn + xn) = y + x

2. The zero vector, which has zero for each of its components, has length zero. Adding the zero vector to a vector doesn't change the vector, and the zero vector is the only vector with this property.

3. To multiply a vector by a scalar, multiply each component of the vector by the scalar. This gives you another vector. If you multiply by the scalar +1, you don't change the vector; +1 is the only scalar with this property. If you multiply each component of a vector x by the scalar -1, you get -x. The sum of x and -x equals the zero vector, and for a given vector x, -x is the only vector with this property.

4. Vectors have the following algebraic properties, where a and b are scalars.
   4.1. a(b x) = (ab) x
   4.2. (a + b) x = a x + b x
   4.3. a(x + y) = a x + a y

We shall prove property 4.1 because it gives us an opportunity to provide an example of scalar multiplication.

    a(b x) = a(b x1, b x2, ..., b xn) = (ab x1, ab x2, ..., ab xn) = (ab) x    (57)

8.1. Distance metrics. A distance metric is a convention for defining the distance between two points in an n-dimensional space. A point in n-dimensional space is specified as a vector with n components. A legal distance metric must satisfy the following properties for any points a, b, and c.

1. distance(a, a) = 0.
2. distance(a, b) > 0 if a ≠ b.
3. distance(a, b) = distance(b, a).
4. distance(a, c) ≤ distance(a, b) + distance(b, c).

Both the Euclidean distance metric, based on the Pythagorean theorem, and the city-block distance metric satisfy these properties. The third property is known as symmetry, and the fourth is known as the triangle inequality. In Euclidean space, the triangle inequality is a corollary of the fact that the shortest path between two points is a straight line. Both metrics are computed in the sketch below.
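A minimal sketch of the two metrics just mentioned, for one arbitrary pair of points:

    % Euclidean and city-block distances between two points.
    a = [1; 2];  b = [4; 6];
    norm(a - b)       % Euclidean: sqrt(3^2 + 4^2) = 5
    norm(a - b, 1)    % city block: |3| + |4| = 7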

8.2. Inner Product or Dot Product. The inner product or dot product of two n-dimensional vectors is computed by multiplying the corresponding components together and then summing the results. The inner product of vectors x and y, written x · y, is defined below in expression (58), which also shows that the inner product is commutative.

    x · y = Σ_{i=1}^{n} xi yi = Σ_{i=1}^{n} yi xi = y · x    (58)

Using matrix notation, if vectors are written as column vectors, then the inner product between two vectors x and y is written x^T y. The inner product is a measure of the degree of overlap between the vectors. For vectors of given lengths, the inner product is positive and maximum when they point in the same direction, and negative and minimum when they point in opposite directions. If the inner product is 0, the vectors are said to be orthogonal (perpendicular). It is useful to look at the inner product of a vector with itself, as shown below.

    x · x = Σ_{i=1}^{n} xi²    (59)

Since a vector points in the same direction as itself, this quantity is positive (unless the vector is the zero vector). The square root of this quantity is known as the Euclidean norm of the vector x, written ||x||. This is the length of the vector as determined by the Pythagorean theorem (Euclidean distance metric) generalized to n dimensions. If ||x|| = 1, we say x is a unit vector.

8.2.1. Properties of the Inner Product. Some basic properties of the inner product are the following, illustrated in the sketch after this list.

commutativity: If x and y are vectors, then x · y = y · x.
distributivity: If x, y, and z are vectors, then x · (y + z) = x · y + x · z.
multiplication by a scalar: If c is a scalar and x and y are vectors, then (c x) · y = c(x · y).
magnitude: x · x = 0 if and only if x is the zero vector. Otherwise, x · x > 0.
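A minimal numerical illustration of expressions (58) and (59); the vectors are arbitrary.

    % Inner products and norms.
    x = [1; 3; -5];  y = [2; 7; 4];
    dot(x, y)              % 1*2 + 3*7 + (-5)*4 = 3
    x' * y                 % the same value in matrix notation
    norm(x)                % sqrt(x' * x) = sqrt(35)
    dot([1; 0], [0; 1])    % 0: orthogonal vectors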

9. VECTOR GEOMETRY

This section develops intuitions about geometric interpretations of linear algebra in two dimensions.

9.1. Perpendicular Vectors. If two vectors are perpendicular, then their dot product is zero. To see this, take a look at Figure 4. If vectors x and y are perpendicular, then ||y - x|| = ||y + x||. From this we obtain

    (x + y) · (x + y) = (x - y) · (x - y)    (60)
    x · x + 2 x · y + y · y = x · x - 2 x · y + y · y    (61)
    x · y = -x · y    (62)

The last line above can only be true if x · y = 0.

FIGURE 4. If vectors x and y are perpendicular, then ||y - x|| = ||y + x||.

9.2. Cosine of the Angle Between Vectors. The cosine of the angle θ between two vectors x and y is defined below.

    cos θ = (x · y) / (||x|| ||y||)    (63)

This definition holds for n dimensions. To strengthen our intuitions, let us see why this definition corresponds to the cosine of an angle when the number of dimensions is two.

9.2.1. Method 1. Consider vectors x and y in the plane, shown in Figure 5. They are separated by the angle θ = β - α. From Equation 15 (applied with angles β and -α), we obtain Formula 65. Applying trigonometry to Figure 5 gives Formula 66, and simplifying gives Formula 67.

    cos θ = cos(β - α)    (64)
          = cos β cos α + sin β sin α    (65)
          = (x1 / ||x||)(y1 / ||y||) + (x2 / ||x||)(y2 / ||y||)    (66)
          = x^T y / (||x|| ||y||)    (67)

FIGURE 5. Angle θ = β - α is the angle between vectors x = (x1, x2)^T and y = (y1, y2)^T, where x makes angle β and y makes angle α with the horizontal axis.

9.2.2. Method 2. Consider the angle θ between vectors x and y in Figure 6. Consider the projection of x onto y at c y, so that the angle there is perpendicular. From the figure, it follows that

    cos θ = ||c y|| / ||x||.    (68)

It remains to obtain the value for c. Since the vectors x - c y and c y are perpendicular, it follows that (x - c y) · (c y) = 0. Solving for c, we obtain

    c = (x · y) / (y · y).    (69)

If we plug this value of c back into Equation 68, we obtain

    cos θ = ||c y|| / ||x||    (70)
          = x^T y / (||x|| ||y||).    (71)

FIGURE 6. The dot product between x - c y and c y is zero.
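This sketch computes the cosine both ways for one arbitrary pair of vectors. (Equation 68 assumes the projection coefficient c is positive, i.e., an acute angle, as in this example.)

    % Cosine of the angle between x and y, two ways.
    x = [3; 1];  y = [1; 2];
    (x' * y) / (norm(x) * norm(y))   % Equation 63: 1/sqrt(2)
    c = (x' * y) / (y' * y);         % projection coefficient, Equation 69
    norm(c * y) / norm(x)            % Equation 68: the same value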

10. MATLAB

When manipulating matrices whose dimensions are larger than 2 × 2, use MATLAB. Here are some commands. Given a square matrix M, the expression inv(M) computes its inverse, det(M) computes its determinant, trace(M) computes its trace, and diag(M) extracts the diagonal and represents it as a column vector. The expression below obtains the eigenvectors and eigenvalues.

    [V, D] = eig(M)

The above is a component-wise assignment statement, as indicated by the brackets on the left-hand side. Both variables V and D are assigned new values because the function eig() returns two values. The variable V holds the eigenvectors of M: each column of V stores one eigenvector, normalized to length 1. The variable D is a diagonal matrix holding the corresponding eigenvalues on its diagonal. The first eigenvalue corresponds to the first eigenvector, and so forth for the second, third, etcetera. Note, however, that eig does not guarantee any particular ordering of the eigenvalues; if you want them sorted by magnitude, as assumed in Section 7, sort them yourself, as in the sketch below.
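A short sketch applying these commands to a sample matrix and imposing a magnitude ordering (the matrix is an arbitrary example):

    % The commands from this section, plus sorting by eigenvalue magnitude.
    M = [2 1; 1 3];
    inv(M), det(M), trace(M), diag(M)
    [V, D] = eig(M);
    [~, idx] = sort(abs(diag(D)), 'descend');   % order of decreasing |lambda|
    D = D(idx, idx);                            % reordered eigenvalues
    V = V(:, idx);                              % matching eigenvector columns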