Introduction to Linear Algebra. Tyrone L. Vincent

Engineering Division, Colorado School of Mines, Golden, CO
E-mail address: tvincent@mines.edu
URL: http://egweb.mines.edu/~tvincent

Contents

Chapter 1. Review of Vectors and Matrices
  1. Useful Notation
  2. Vectors and Matrices
  3. Basic operations
  4. Useful Properties of the Basic Operations
Chapter 2. Vector Spaces
  1. Vector Space Definition
  2. Linear Independence and Basis
  3. Change of basis
  4. Norms, Dot Products, Orthonormal Basis
  5. QR Decomposition
Chapter 3. Projection Theorem
Chapter 4. Matrices and Linear Mappings
  1. Solutions to Systems of Linear Equations
Chapter 5. Square Matrices, Eigenvalues and Eigenvectors
  1. Matrix Exponential
  2. Other Matrix Functions
Appendix A. Appendix A

CHAPTER 1

Review of Vectors and Matrices

1. Useful Notation

1.1. Common Abbreviations. In this course, as in most branches of mathematics, we will often utilize sets of mathematical objects. For example, there is the set of natural numbers, which begins $1, 2, 3, \ldots$ This set is often denoted $\mathbb{N}$, so that $1$ is a member of $\mathbb{N}$ but $-1$ is not. To specify that an object is a member of a set, we use the notation $\in$ for "is a member of". For example, $1 \in \mathbb{N}$. Some of the sets we will use are

$\mathbb{R}$: real numbers
$\mathbb{C}$: complex numbers
$\mathbb{R}^n$: $n$ dimensional vectors of real numbers
$\mathbb{R}^{m \times n}$: $m \times n$ dimensional real matrices

For these common sets, particular notation will be used to identify members, namely lower case for a scalar or vector, and upper case for a matrix. The following table also includes some common operations.

$x$: vector or scalar
$\langle x, y \rangle$: inner product between vectors $x$ and $y$
$A$: matrix
$A^T$: transpose of $A$
$A^{-1}$: inverse of $A$
$\det(A)$, $|A|$: determinant of $A$

To specify a set, we can also use a bracket notation. For example, to specify $E$ as the set of all positive even numbers, we can say either
\[ E = \{2, 4, 6, 8, \ldots\} \]
when the pattern is clear, or use a : symbol, which means "such that":
\[ E = \{x \in \mathbb{N} : \operatorname{mod}(x, 2) = 0\}. \]
This can be read "The set of natural numbers $x$, such that $x$ is divisible evenly by 2".

When talking about sets, we will often want to say when a property holds for every member of the set, or for at least one. In this case, the symbols $\forall$, meaning "for all", and $\exists$, meaning "there exists", are useful. For example, suppose $I$ is the set of numbers consisting of the IQs for people in this class. Then, for a given number $c$, $\forall x \in I,\ x > c$

means that all students in this class have IQ greater than $c$, while $\exists x \in I : x > c$ means that at least one student in the class has IQ greater than $c$.

We will also be concerned with functions. Given a set $X$ and a set $Y$, a function $f$ from $X$ to $Y$ maps an element of $X$ to an element of $Y$ and is denoted $f : X \to Y$. The set $X$ is called the domain, and $f(x)$ is assumed to be defined for every $x \in X$. The range, or image, of $f$ is the set of $y$ for which $f(x) = y$ for some $x$:
\[ \operatorname{Range}(f) = \{ y \in Y : \exists x \in X \text{ such that } y = f(x) \}. \]
If $\operatorname{Range}(f) = Y$, then $f$ is called "onto". If, for every $y$ in the range, there is only one $x \in X$ such that $y = f(x)$, then $f$ is called "one to one".

2. Vectors and Matrices

You are probably already familiar with vectors and matrices from previous courses in mathematics or physics. We will find matrices and vectors very useful when representing dynamic systems mathematically; however, we will need to be able to manipulate and understand these objects at a fairly deep level. Some texts use bold face for vectors and matrices, but ours does not, and I will not use that convention here or during class. I will, however, use lower case letters for vectors and upper case letters for matrices. An $n$-tuple of real (or sometimes complex) numbers is represented as a vector:
\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]
so that $x$ is a vector, and the $x_i$ are each scalars. We will use the notation $x \in \mathbb{R}^n$ to show that $x$ is a length $n$ vector of real numbers (or $x \in \mathbb{C}^n$ if the elements of $x$ are complex). Sometimes we will want to index vectors as well, which can be confusing: is $x_i$ the vector $x_i$ or the $i$th element of the vector $x$? To make the difference clear, we will reserve the notation $[x]_i$ to indicate the $i$th element of $x$. As an example, consider the following illustration of addition and scalar multiplication for vectors:
\[ x_1 + x_2 = \begin{bmatrix} [x_1]_1 + [x_2]_1 \\ [x_1]_2 + [x_2]_2 \\ \vdots \\ [x_1]_n + [x_2]_n \end{bmatrix}, \qquad \alpha x_1 = \begin{bmatrix} \alpha [x_1]_1 \\ \alpha [x_1]_2 \\ \vdots \\ \alpha [x_1]_n \end{bmatrix}. \]
A matrix is an $m \times n$ array of scalars:
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]
We use the notation $A \in \mathbb{R}^{m \times n}$ to indicate that $A$ is an $m \times n$ matrix. Addition and scalar multiplication are defined the same way as for vectors.

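3. Basic operations

You should already be familiar with most of the basic operations on vectors and matrices listed in this section.

3.1. Transpose. Given a matrix $A \in \mathbb{R}^{m \times n}$, the transpose $A^T \in \mathbb{R}^{n \times m}$ is found by flipping all terms along the diagonal. That is, if
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
then
\[ A^T = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}. \]
Note that if the matrix is not square ($m \neq n$), then the shape of the matrix changes. We can also use transpose on a vector $x \in \mathbb{R}^n$ by considering it to be an $n$ by 1 matrix. In this case, $x^T$ is the 1 by $n$ matrix
\[ x^T = \begin{bmatrix} [x]_1 & [x]_2 & \cdots & [x]_n \end{bmatrix}. \]

3.2. Inner (dot) product. In three dimensional space, we are familiar with vectors as indicating direction. The inner product is an operation that allows us to tell if two vectors are pointing in a similar direction. We will use the notation $\langle x, y \rangle$ for the inner product between $x$ and $y$. In other courses, you may have seen this called the dot product with notation $x \cdot y$. The notation used here is more common in signal processing and control systems. The inner product of $x, y \in \mathbb{R}^n$ is defined to be the sum of the products of the elements:
\[ \langle x, y \rangle = \sum_{i=1}^{n} [x]_i [y]_i = x^T y. \]
Recall that if $x$ and $y$ are vectors, the angle $\theta$ between them can be found using the formula
\[ \cos\theta = \frac{\langle x, y \rangle}{\sqrt{\langle x, x \rangle \langle y, y \rangle}}. \]
Note that the inner product satisfies the following rules (inherited from transpose):
\[ \langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle, \qquad \langle \alpha y, z \rangle = \alpha \langle y, z \rangle. \]

3.3. Matrix-vector multiplication. Suppose we have an $m \times n$ matrix $A$,
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]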

and a length $n$ vector $x_1$. Note that the number of columns of $A$ is the same as the length of $x_1$. Multiplication of $A$ and $x_1$ is defined as follows:
\[ x_2 = A x_1 \tag{1.1} \]
where
\begin{align}
[x_2]_1 &= a_{11} [x_1]_1 + a_{12} [x_1]_2 + \cdots + a_{1n} [x_1]_n \tag{1.2a} \\
[x_2]_2 &= a_{21} [x_1]_1 + a_{22} [x_1]_2 + \cdots + a_{2n} [x_1]_n \tag{1.2b} \\
&\ \ \vdots \tag{1.2c} \\
[x_2]_m &= a_{m1} [x_1]_1 + a_{m2} [x_1]_2 + \cdots + a_{mn} [x_1]_n \tag{1.2d}
\end{align}
Note that the result $x_2$ is a length $m$ vector (the number of rows of $A$). The notation (1.1) is a compact representation of the system of linear algebraic equations (1.2). Note that $A$ defines a mapping from $\mathbb{R}^n$ to $\mathbb{R}^m$. Thus, we can write $A : \mathbb{R}^n \to \mathbb{R}^m$. This mapping is linear.

We can also consider a matrix to be a group of vectors. For example, if we group the vectors $x_1, x_2, \ldots, x_p$ into a matrix
\[ M = \begin{bmatrix} x_1 & x_2 & \cdots & x_p \end{bmatrix} \]
and define the vector
\[ a = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_p \end{bmatrix} \]
then all linear combinations of $x_1, x_2, \ldots, x_p$ are given by
\[ y = Ma = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p. \]

3.4. Matrix-Matrix multiplication. If matrix $A : \mathbb{R}^n \to \mathbb{R}^m$ and matrix $B : \mathbb{R}^m \to \mathbb{R}^p$, we can find the mapping $C : \mathbb{R}^n \to \mathbb{R}^p$ which is the composition of $A$ and $B$:
\[ C = BA, \qquad [c]_{ij} = \sum_k [b]_{ik} [a]_{kj}. \]
That is, the $i,j$ element of $C$ is the dot product of the $i$th row of $B$ with the $j$th column of $A$. The dimension of $C$ is $p \times n$. This can also be thought of as $B$ mapping one column of $A$ at a time: the first column of $C$, $[c]_1$, is $B[a]_1$, that is, $B$ times the first column of $A$. Clearly, two matrices can be multiplied only if they have compatible dimensions. Unlike scalars, the order of multiplication is important. If $A$ and $B$ are square matrices, $AB \neq BA$ in general. The identity matrix
\[ I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]
is a square matrix with ones along the diagonal and zeros elsewhere. If size is important, we will denote it via a subscript, so that $I_m$ is the $m \times m$ identity matrix. The identity matrix is

the multiplicative identity for matrix multiplication, in that $AI = A$ and $IA = A$ (where $I$ has compatible dimensions with $A$).

3.5. Block Matrices. Matrices can also be defined in blocks, using other matrices. For example, suppose $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{m \times p}$, $C \in \mathbb{R}^{q \times n}$ and $D \in \mathbb{R}^{q \times p}$. Then we can "block fill" an $(m + q)$ by $(n + p)$ matrix $X$ as
\[ X = \begin{bmatrix} A & B \\ C & D \end{bmatrix}. \]
Often we will want to specify some blocks as zero. We will denote a block of zeros as simply $0$. The dimension can be worked out from the other matrices. For example, if
\[ X = \begin{bmatrix} A & 0 \\ C & D \end{bmatrix} \]
the zero block must have the same number of rows as $A$ and the same number of columns as $D$. Matrix multiplication of block matrices uses the same rules as regular matrices, except as applied to the blocks. Thus
\[ \begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix} \begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix} = \begin{bmatrix} A_1 A_2 + B_1 C_2 & A_1 B_2 + B_1 D_2 \\ C_1 A_2 + D_1 C_2 & C_1 B_2 + D_1 D_2 \end{bmatrix}. \]

3.6. Determinants. If $A$ is a scalar matrix, that is, $A \in \mathbb{R}^{1 \times 1}$, then the determinant of $A$ is just equal to itself. For higher dimensions, the determinant is defined recursively. If $A \in \mathbb{R}^{n \times n}$ then, for any fixed column index $j$,
\[ \det(A) = \sum_{i=1}^{n} [a]_{ij} c_{ij} \]
where $c_{ij}$ is the $ij$th cofactor, which is the determinant of an $(n-1) \times (n-1)$ matrix (this is what makes the definition recursive), possibly times $-1$. In particular,
\[ c_{ij} = (-1)^{i+j} \det(M_{ij}) \]
where $M_{ij}$ is the $(n-1) \times (n-1)$ submatrix created by deleting the $i$th row and $j$th column from $A$. If $A \in \mathbb{R}^{2 \times 2}$, then
\[ \det(A) = a_{11} a_{22} - a_{12} a_{21} \]
(check that this matches the definition).

3.7. Inverse. Given a square matrix $A \in \mathbb{R}^{n \times n}$, the inverse of $A$ is the unique matrix $A^{-1}$ (when it exists) such that
\[ A A^{-1} = I. \]
The inverse can be calculated as
\[ A^{-1} = \frac{1}{\det(A)} C^T \]
where $[C]_{ij} = c_{ij}$, the $ij$th cofactor of $A$. $C^T$ is also called the adjugate of $A$. The inverse exists whenever $\det(A) \neq 0$.
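The recursive cofactor definition and the adjugate formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`minor`, `cofactor`, `det`, `inverse`, `matmul`) and the sample matrix are hypothetical and not part of these notes, indices are 0-based, and exact rational arithmetic from the standard library is used to avoid rounding. Cofactor expansion takes on the order of $n!$ operations, so it is useful for checking small cases by hand rather than for serious computation.

```python
from fractions import Fraction


def minor(a, i, j):
    """Submatrix M_ij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by cofactor expansion down the first column."""
    n = len(a)
    if n == 0:
        return 1  # convention: the empty matrix has determinant 1
    if n == 1:
        return a[0][0]
    return sum(a[i][0] * cofactor(a, i, 0) for i in range(n))


def cofactor(a, i, j):
    """c_ij = (-1)^(i+j) det(M_ij)."""
    return (-1) ** (i + j) * det(minor(a, i, j))


def inverse(a):
    """A^{-1} = C^T / det(A), where C is the matrix of cofactors."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    # Entry (i, j) of the adjugate C^T is the cofactor c_ji.
    return [[Fraction(cofactor(a, j, i)) / d for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix-matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Illustrative matrix (an assumption, not taken from the notes).
A = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]
A_inv = inverse(A)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(det(A))
print(matmul(A, A_inv) == identity)
```

The final two lines print the determinant and check the defining property $A A^{-1} = I$ for the sample matrix.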

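4. Useful Properties of the Basic Operations

Transpose
\[ (A^T)^T = A, \qquad (A + B)^T = A^T + B^T, \qquad (AB)^T = B^T A^T \]

Determinants (for square $A$ and $B$)
\[ \det(AB) = \det(BA) = \det(A)\det(B), \qquad \det(A^T) = \det(A), \qquad \det(I) = 1 \]

Determinants for Block Matrices. Very useful: the determinant of a block triangular matrix is the product of the determinants of the diagonal blocks. In particular, if $A$ and $D$ are square,
\[ \det \begin{bmatrix} A & B \\ 0 & D \end{bmatrix} = \det \begin{bmatrix} A & 0 \\ C & D \end{bmatrix} = \det(A)\det(D). \]
If $A$ and $D$ are square and $D^{-1}$ exists, then
\[ \det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A - B D^{-1} C)\det(D). \]

Inverse
\[ A A^{-1} = A^{-1} A = I, \qquad (A^T)^{-1} = (A^{-1})^T, \qquad (AB)^{-1} = B^{-1} A^{-1} \]

Inverses for Block Matrices. If $A$ and $D$ are square and invertible,
\[ \begin{bmatrix} A & B \\ 0 & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} & -A^{-1} B D^{-1} \\ 0 & D^{-1} \end{bmatrix}. \]
If $A$ and $D$ are square and $D$ and $\Delta = A - B D^{-1} C$ are invertible,
\[ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} \Delta^{-1} & -\Delta^{-1} B D^{-1} \\ -D^{-1} C \Delta^{-1} & D^{-1} C \Delta^{-1} B D^{-1} + D^{-1} \end{bmatrix}. \]

4.1. Exercises.
(1) Show that $\det(A^{-1}) = 1/\det(A)$.
(2) Let
\[ N = \begin{bmatrix} I_m & A \\ 0 & I_n \end{bmatrix}, \qquad Q = \begin{bmatrix} I_m & 0 \\ -B & I_n \end{bmatrix}, \qquad P = \begin{bmatrix} I_m & -A \\ B & I_n \end{bmatrix}. \]
(a) Explain why $\det(N) = \det(Q) = 1$.
(b) Compute $NP$ and $QP$.
(c) Show that $\det(NP) = \det(I_m + AB)$ and $\det(QP) = \det(I_n + BA)$.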

(d) Show that $\det(I_m + AB) = \det(I_n + BA)$.
(3) Using the results of problem 2, explain why $(I_m + AB)^{-1}$ exists if and only if $(I_n + BA)^{-1}$ exists. Show by verifying the properties of the inverse that $(I_m + AB)^{-1} = I_m - A(I_n + BA)^{-1}B$. (That is, multiply the right hand side by $I_m + AB$ and show that you get the identity.)
(4) Verify the block inversion equations.

CHAPTER 2

Vector Spaces

1. Vector Space Definition

Definition 1. A vector space $(F, X)$ consists of a set of elements $X$, called vectors, and a field $F$ (such as the real numbers) which satisfy the following conditions:
(1) To every pair of vectors $x_1$ and $x_2$ in $X$, there corresponds a vector $x_3 = x_1 + x_2$ in $X$.
(2) Addition is commutative: $x_1 + x_2 = x_2 + x_1$.
(3) Addition is associative: $(x_1 + x_2) + x_3 = x_1 + (x_2 + x_3)$.
(4) $X$ contains a vector, denoted $0$, such that $0 + x = x$ for every $x$ in $X$.
(5) To every $x$ in $X$ there is a vector $\bar{x}$ in $X$ such that $x + \bar{x} = 0$.
(6) To every $\alpha$ in $F$ and every $x$ in $X$, there corresponds a vector $\alpha x$ in $X$.
(7) Scalar multiplication is associative: for any $\alpha, \beta$ in $F$ and any $x$ in $X$, $\alpha(\beta x) = (\alpha\beta)x$.
(8) Scalar multiplication is distributive with respect to vector addition: $\alpha(x_1 + x_2) = \alpha x_1 + \alpha x_2$.
(9) Scalar multiplication is distributive with respect to scalar addition: $(\alpha + \beta)x = \alpha x + \beta x$.
(10) For any $x$ in $X$, $1x = x$.

You can verify that $\mathbb{R}^n$ (or $\mathbb{C}^n$) is a vector space. It is interesting to see that other mathematical objects also qualify as vector spaces. For example:

Example 1. $X = \mathbb{R}_n[s]$, the set of all polynomials with real coefficients with degree less than $n$, and $F = \mathbb{R}$, with addition and multiplication defined in the usual way: if
\[ x_1 = a_1 s^{n-1} + a_2 s^{n-2} + \cdots + a_n, \qquad x_2 = b_1 s^{n-1} + b_2 s^{n-2} + \cdots + b_n, \]
then
\[ x_1 + x_2 = (a_1 + b_1)s^{n-1} + (a_2 + b_2)s^{n-2} + \cdots + (a_n + b_n), \qquad k x_1 = k a_1 s^{n-1} + k a_2 s^{n-2} + \cdots + k a_n. \]
We can show that this is a vector space by verifying that it satisfies the conditions:
(1) Given any $x_1 = a_1 s^{n-1} + a_2 s^{n-2} + \cdots + a_n$ and $x_2 = b_1 s^{n-1} + b_2 s^{n-2} + \cdots + b_n$, we see that $x_1 + x_2 = (a_1 + b_1)s^{n-1} + (a_2 + b_2)s^{n-2} + \cdots + (a_n + b_n)$ is indeed a polynomial of degree less than $n$, so $x_1 + x_2$ is in $X$.
(2) Obvious from the definition of addition.
(3) Obvious from the definition of addition.
(4) Select $x = 0$ as the zero vector.

(5) Given $x = a_1 s^{n-1} + a_2 s^{n-2} + \cdots + a_n$, select $\bar{x} = -a_1 s^{n-1} - a_2 s^{n-2} - \cdots - a_n$.
(6) Given $x = a_1 s^{n-1} + a_2 s^{n-2} + \cdots + a_n$ and a scalar $k$, we see that $kx = k a_1 s^{n-1} + k a_2 s^{n-2} + \cdots + k a_n$ is a polynomial of degree less than $n$, so that $kx$ is in $X$.
(7) Obvious from the definition of scalar multiplication.
(8) Obvious from the definition of addition and scalar multiplication.
(9) Obvious from the definition of addition and scalar multiplication.
(10) Select the scalar $1$ as the unit, so that $1x = x$.

1.1. Exercises.
(1) Show that $X = C$, the set of all continuous functions, is a vector space with $F = \mathbb{R}$, with addition and multiplication defined as $x_1 = f(t)$, $x_2 = g(t)$, $x_1 + x_2 = f(t) + g(t)$, $a x_1 = a f(t)$. This can be shown to be a vector space in the same way as above.
(2) Show that $X = \mathbb{C}^n$, the set of all $n$-tuples of complex numbers, is a vector space with $F = \mathbb{C}$, the field of complex numbers.
(3) Show that $X = \mathbb{R}^n$, $F = \mathbb{C}$ is not a vector space.
(4) Show that X =fx : x + _x + = g is a vector space with $F = \mathbb{R}$.

2. Linear Independence and Basis

2.1. Linear Independence.

Definition 2. A linear combination of the vectors $x_1, x_2, \ldots, x_p$ is a sum of the form
\[ \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p. \]
A linear combination can also be written in matrix-vector form,
\[ \begin{bmatrix} x_1 & x_2 & \cdots & x_p \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_p \end{bmatrix}. \]
A vector $x$ is said to be linearly dependent upon a set $S$ of vectors if $x$ can be expressed as a linear combination of vectors from $S$. A vector $x$ is said to be linearly independent of $S$ if it is not linearly dependent on $S$. A set of vectors is said to be a linearly independent set if each vector in the set is linearly independent of the remainder of the set. This definition immediately leads to the following tests:

Theorem 1. A set of vectors $S = \{x_1, x_2, \ldots, x_p\}$ is linearly dependent if there exist $\alpha_i$, with at least one $\alpha_i \neq 0$, such that
\[ \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p = 0. \]

Theorem 2. A set of vectors $S = \{x_1, x_2, \ldots, x_p\}$ is linearly independent if and only if
\[ \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p = 0 \]
implies $\alpha_i = 0$, $i = 1, 2, \ldots, p$.
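For vectors in $\mathbb{R}^n$, the test in Theorem 2 can be carried out mechanically by row reduction: the vectors are independent exactly when elimination never produces a zero row. The following is a minimal sketch under stated assumptions; the function names `count_independent` and `is_linearly_independent` and the sample vectors are hypothetical, and exact fractions are used so that "equal to zero" is not blurred by rounding.

```python
from fractions import Fraction


def count_independent(vectors):
    """Count linearly independent vectors by Gaussian elimination.

    Each vector is treated as one row; a row that reduces to all zeros
    is a linear combination of the rows above it.
    """
    rows = [[Fraction(entry) for entry in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Find a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True when no nontrivial combination of the vectors gives zero."""
    return count_independent(vectors) == len(vectors)


# Illustrative vectors (assumptions, not the examples from the notes).
print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))
```

In the second call the third vector is the sum of the first two, so a nontrivial combination equal to zero exists and the set is dependent.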

Example. Consider the set of vectors x = 5 ; x = 5 ; x = 5 6 This set is linearly dependent, for if we select = have x + x + x =

Example. Consider the set of vectors x = ; x = ; = and = ; we This set is linearly dependent, for if we select = and = ; then x + x =

Note that the zero vector is linearly dependent on all other vectors.

The maximal number of linearly independent vectors in a vector space is an important characteristic of that vector space.

Definition 3. The maximal number of linearly independent vectors in a vector space is called the dimension of the vector space.

Example 4. Show that the dimension of the vector space $(\mathbb{R}^2, \mathbb{R})$ is 2. Note that the vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ are linearly independent. Thus the dimension of $(\mathbb{R}^2, \mathbb{R})$ is greater than or equal to 2. Given three vectors
\[ x_1 = \begin{bmatrix} a \\ b \end{bmatrix}, \quad x_2 = \begin{bmatrix} c \\ d \end{bmatrix}, \quad x_3 = \begin{bmatrix} e \\ f \end{bmatrix}, \]
we have $\alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 = 0$ if $\alpha_3 = -1$ and $\alpha_1$ and $\alpha_2$ are solutions to the system of equations
\[ \alpha_1 a + \alpha_2 c = e, \qquad \alpha_1 b + \alpha_2 d = f, \]
which has a solution whenever $x_1$ and $x_2$ are linearly independent; if they are not, the set is already linearly dependent. Thus no set of three vectors is linearly independent, and the dimension of $(\mathbb{R}^2, \mathbb{R})$ is less than 3, implying that the dimension of $(\mathbb{R}^2, \mathbb{R})$ is 2.

2.2. Basis.

Definition 4. A set of linearly independent vectors from a vector space $(F, X)$ is a basis for $X$ if every vector in $X$ can be expressed as a unique linear combination of these vectors.

It is a fact that in an $n$-dimensional vector space, any set of $n$ linearly independent vectors qualifies as a basis. We have seen that there are many different mathematical objects which qualify as vector spaces. However, all $n$-dimensional vector spaces $(X, \mathbb{R})$ have a one to one correspondence with the vector space $(\mathbb{R}^n, \mathbb{R})$ once a basis has been chosen. Suppose $e_1, e_2, \ldots, e_n$ is a basis for $X$. Then for all $x$ in $X$,
\[ x = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \beta \]
where $\beta = \begin{bmatrix} \beta_1 & \cdots & \beta_n \end{bmatrix}^T$ and the $\beta_i$ are scalars. Thus the vector $x$ can be identified with the unique vector $\beta$ in $\mathbb{R}^n$. Consider the vector space $(\mathbb{R}_3[s], \mathbb{R})$

where $\mathbb{R}_3[s]$ is the set of all real polynomials of degree less than 3. This vector space has dimension 3, with one basis given by $e_1 = 1$, $e_2 = s$, $e_3 = s^2$. The vector x = s + s can be written as x = e e e 5 So that the representation with respect to this basis is : However, if we choose the basis e = ; e = s ; e = s s (verify that this set of vectors is independent), x = s + s = + 5(s ) + (s s) = e e e 5 5 so that the representation of x with respect to this basis is 5 :

2.3. Standard basis. For $\mathbb{R}^n$, the standard basis consists of the unit vectors that point in the direction of each axis:
\[ i_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad i_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \ldots, \quad i_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}. \]

2.4. Exercises.
(1) Find the dimension of the vector space given by all (real) linear combinations of x = 5 ; x = 5 ; x = 5 ; 5 That is, $X = \{x : x = \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3,\ \alpha_i \in \mathbb{R}\}$. This is called the vector space spanned by $\{x_1, x_2, x_3\}$.
(2) Show that the space of all solutions to the differential equation x + _x + x = t is a dimensional vector space. (Verify the properties of a vector space.)

3. Change of basis

Since the vectors in the polynomial example are mathematical objects quite different from $n$-tuples of numbers, the separation between vectors and their representations with respect to a basis is fairly clear. This becomes more complicated when we consider the native vector space $(\mathbb{R}^n, \mathbb{R})$. When $n = 2$, it is natural to visualize these vectors in the plane, as shown in Figure 1. In order to represent the vector $x$, we need to choose a basis. The most natural basis for $(\mathbb{R}^2, \mathbb{R})$ is the array
\[ i_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad i_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]

Figure 1. A two-dimensional real vector space

In this basis, we have the following representation for x : x = = i i Note that the vector and its representation look identical. However, if we choose a different basis, say e = e = then x = = e e so the representation of x in this basis is

We have seen that a vector $x$ can have different representations for different bases. A natural desired operation would be to transform between one basis and another. Suppose a vector $x$ has representations with respect to $\begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$ as $\beta$ and with respect to $\begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix}$ as $\bar{\beta}$, so that
\[ x = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \beta = \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} \bar{\beta}. \tag{2.1} \]
What is the relationship between $\beta$ and $\bar{\beta}$? The answer is most easily found by finding the relationship between the bases themselves. Each basis vector has a representation in the other basis. That is, there exists $p_i$ such that
\[ e_i = \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} p_i. \]

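If we group the vectors $e_i$ into a matrix, we can write
\begin{align}
\begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} &= \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} \begin{bmatrix} p_1 & p_2 & \cdots & p_n \end{bmatrix} \notag \\
&= \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix} \notag \\
&= \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} P \tag{2.2}
\end{align}
where we see that the matrix $P$ takes the vectors $p_i$ as its columns. Substituting (2.2) into (2.1), we get
\[ \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} P \beta = \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix} \bar{\beta}. \]
Since the representation of a vector with respect to a basis is unique, we must have $\bar{\beta} = P\beta$. Thus, in order to transform from the basis $\begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$ to the basis $\begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix}$, we must form the matrix $P$, where the $i$th column of $P$ is the representation of basis vector $e_i$ with respect to the basis $\begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_n \end{bmatrix}$. It turns out that $P$ will always be an invertible matrix, so that $\beta = P^{-1}\bar{\beta}$, and we must have $P^{-1} = Q$, where the $i$th column of $Q$ is the representation of basis vector $\bar{e}_i$ with respect to the basis $\begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$.

4. Norms, Dot Products, Orthonormal Basis

4.1. Vector Norms. A vector norm, denoted $\|x\|$, is a real valued function of $x$ which is a measure of its length. You are probably already familiar with a common norm defined by the Euclidean length of a vector, but in fact there are many possibilities. A valid norm satisfies the following properties:
(1) (Always positive unless $x = 0$) $\|x\| \geq 0$ for every $x$, and $\|x\| = 0$ implies $x = 0$.
(2) (Homogeneity) $\|\alpha x\| = |\alpha| \|x\|$ for scalar $\alpha$.
(3) (Triangle inequality) $\|x_1 + x_2\| \leq \|x_1\| + \|x_2\|$.
The most common vector norms are the following:

4.1.1. 1-norm. The 1-norm is the sum of the absolute values of the elements of $x$:
\[ \|x\|_1 := \sum_{i=1}^{n} |[x]_i| \]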

4.1.2. 2-norm. The 2-norm corresponds to Euclidean distance, and is the square root of the sum of squares of the elements of $x$:
\[ \|x\|_2 := \sqrt{\sum_{i=1}^{n} ([x]_i)^2}. \]
Note that the sum of squares of the elements can also be written as $x^T x$. Thus
\[ \|x\|_2 = \sqrt{x^T x}. \]

4.1.3. $\infty$-norm. The $\infty$-norm is simply the largest component of $x$ in absolute value:
\[ \|x\|_\infty = \max_i |[x]_i|. \]

4.2. Dot products, orthogonality and projection. As discussed earlier, the dot product between two vectors is given by
\[ \langle x, y \rangle = \sum_{i=1}^{n} [x]_i [y]_i = x^T y. \]
Note that
\[ \|x\|_2 = \sqrt{\langle x, x \rangle}. \]
If two vectors have a dot product of zero, then they are said to be orthogonal. A set of vectors $\{x_i\}$ which are pairwise orthogonal and have unit 2-norm are said to be orthonormal and will satisfy
\[ \langle x_i, x_j \rangle = x_i^T x_j = \begin{cases} 1 & i = j \\ 0 & i \neq j. \end{cases} \]
The projection of one vector (say $x$) on another (say $y$) is given by
\[ z = \frac{\langle x, y \rangle}{\|y\|_2^2}\, y. \]
The vector $z$ points in the same direction as $y$, but the length is chosen so that the difference between $z$ and $x$ is orthogonal to $y$:
\[ \langle z - x, y \rangle = \left\langle \frac{\langle x, y \rangle}{\|y\|_2^2}\, y - x,\ y \right\rangle = \frac{\langle x, y \rangle}{\|y\|_2^2} \langle y, y \rangle - \langle x, y \rangle = 0, \]
since $\langle y, y \rangle = \|y\|_2^2$.

4.3. Orthonormal Basis - Gram-Schmidt Procedure. An orthonormal basis is a vector space basis which is also orthonormal. Operations are often much easier when vectors are defined using an orthonormal basis. The Gram-Schmidt procedure can be used to transform a general basis into an orthonormal basis. It does so by building up the orthonormal basis one vector at a time.
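As a rough illustration of building the basis one vector at a time, the sketch below subtracts from each new vector its projections onto the orthonormal vectors already found and then normalizes what is left. The names `dot` and `gram_schmidt` and the sample basis are assumptions made for this example, and the code follows the classical form of the procedure, which is simple but not the most numerically robust variant.

```python
import math


def dot(x, y):
    """Inner product <x, y> of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(basis):
    """Return an orthonormal list q_1..q_n spanning the same space."""
    qs = []
    for e in basis:
        u = list(e)
        # Remove the part of e that points along each earlier q_k.
        for q in qs:
            coeff = dot(q, e)
            u = [ui - coeff * qi for ui, qi in zip(u, q)]
        norm = math.sqrt(dot(u, u))
        if norm == 0.0:
            raise ValueError("input vectors are linearly dependent")
        qs.append([ui / norm for ui in u])
    return qs


# Illustrative basis of R^3 (an assumption, not taken from the notes).
Q = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for qi in Q:
    print([round(dot(qi, qj), 12) for qj in Q])
```

Each printed row lists the inner products of one $q_i$ with all the others, so the output should be close to the rows of an identity matrix.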

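Suppose we had a basis of two vectors $\{e_1, e_2\}$. We can make an orthonormal basis as follows. Set the first basis vector to point in the same direction as $e_1$, but with unit length:
\[ q_1 = \frac{e_1}{\|e_1\|_2}. \]
We need to pick a second vector which is orthogonal to $q_1$, but spans the same space as $\{e_1, e_2\}$. This can be done by subtracting the part of $e_2$ which points in the same direction as $q_1$. Let
\[ u_2 = e_2 - \langle q_1, e_2 \rangle q_1. \]
Then
\begin{align*}
\langle q_1, u_2 \rangle &= \langle q_1, e_2 - \langle q_1, e_2 \rangle q_1 \rangle \\
&= \langle q_1, e_2 \rangle - \langle q_1, \langle q_1, e_2 \rangle q_1 \rangle \\
&= \langle q_1, e_2 \rangle - \langle q_1, e_2 \rangle \langle q_1, q_1 \rangle \\
&= \langle q_1, e_2 \rangle - \langle q_1, e_2 \rangle = 0
\end{align*}
since $\langle q_1, q_1 \rangle = 1$. Thus $u_2$ is orthogonal to $q_1$. We can get an orthonormal set by letting
\[ q_2 = \frac{u_2}{\|u_2\|_2}. \]
Yet,
\[ \begin{bmatrix} q_1 & q_2 \end{bmatrix} = \begin{bmatrix} e_1 & e_2 \end{bmatrix} \begin{bmatrix} \dfrac{1}{\|e_1\|_2} & -\dfrac{\langle q_1, e_2 \rangle}{\|e_1\|_2 \|u_2\|_2} \\ 0 & \dfrac{1}{\|u_2\|_2} \end{bmatrix} \]
which is clearly an invertible change of basis.

The general procedure is as follows. Let $\{e_1, \ldots, e_n\}$ be a basis. Let
\begin{align*}
u_1 &= e_1, & q_1 &= \frac{u_1}{\|u_1\|_2} \\
u_2 &= e_2 - \langle q_1, e_2 \rangle q_1, & q_2 &= \frac{u_2}{\|u_2\|_2} \\
&\ \ \vdots \\
u_n &= e_n - \sum_{k=1}^{n-1} \langle q_k, e_n \rangle q_k, & q_n &= \frac{u_n}{\|u_n\|_2}
\end{align*}
The orthonormal basis given by $\{q_1, \ldots, q_n\}$ spans the same space as $\{e_1, \ldots, e_n\}$.

5. QR Decomposition

The Gram-Schmidt procedure can be viewed as a matrix decomposition. Let $E = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$ be a matrix with columns made up of $n$ independent vectors $e_i$. Then the relationship between the orthonormal vectors $q_i$ obtained via the Gram-Schmidt procedure and the original vectors can be written as
\[ \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} = \begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix} \begin{bmatrix} \|u_1\|_2 & \langle q_1, e_2 \rangle & \cdots & \langle q_1, e_n \rangle \\ 0 & \|u_2\|_2 & \cdots & \langle q_2, e_n \rangle \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \|u_n\|_2 \end{bmatrix} \]
or
\[ E = QR \]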

where $Q$ is a matrix with orthonormal columns, and $R$ is an upper triangular matrix. Since $Q$ has orthonormal columns, you can verify that $Q^T Q = I$, implying that $Q^{-1} = Q^T$. A matrix whose transpose is also its inverse is called an orthonormal matrix, and satisfies $QQ^T = Q^T Q = I$ (so its rows are also orthonormal).

It turns out that the Gram-Schmidt procedure as described in the last section is not very well conditioned numerically, meaning that small errors will accumulate as the algorithm progresses. However, much more numerically stable algorithms are available using Householder or Givens transformations. We will examine the former, but both are covered in detail in textbooks on numerical linear algebra, such as G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, 1989.

Consider the following problem: we have a vector $x$, and we would like to find an orthonormal matrix $P$ such that
\[ Px = \begin{bmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \alpha i_1 \]
where $\alpha$ is an arbitrary number. It turns out that a matrix of the form
\[ P = I - 2vv^T \]
will do the job, where $v$ is restricted to be unit length ($\|v\|_2 = 1$). First, let's check that $P$ is indeed orthonormal for any such $v$:
\begin{align*}
PP^T &= (I - 2vv^T)(I - 2vv^T)^T \\
&= (I - 2vv^T)(I - 2vv^T) \\
&= I - 4vv^T + 4(vv^T vv^T) \\
&= I - 4vv^T + 4(v \|v\|_2^2 v^T) \\
&= I - 4vv^T + 4vv^T \\
&= I
\end{align*}
where we have used the fact that $\|v\|_2 = 1$. Now, let's see if we can indeed pick an appropriate $v$:
\[ Px = (I - 2vv^T)x = x - 2v(v^T x) \]

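Let's pick
\[ v = \frac{x + \|x\|_2 i_1}{\|x + \|x\|_2 i_1\|_2}. \]
Then
\begin{align*}
Px &= x - \frac{2}{\|x + \|x\|_2 i_1\|_2^2} (x + \|x\|_2 i_1)\left((x + \|x\|_2 i_1)^T x\right) \\
&= x - \frac{2}{\|x + \|x\|_2 i_1\|_2^2} (x + \|x\|_2 i_1)\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right) \\
&= x - \frac{2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}\, x - \frac{2\|x\|_2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}\, i_1 \\
&= \frac{\|x\|_2^2 + 2\|x\|_2 x^T i_1 + \|x\|_2^2 \|i_1\|_2^2}{\|x + \|x\|_2 i_1\|_2^2}\, x - \frac{2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}\, x - \frac{2\|x\|_2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}\, i_1.
\end{align*}
Note that $\|i_1\|_2 = 1$, so the first two terms cancel and
\[ Px = -\frac{2\|x\|_2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}\, i_1, \]
so that the desired transformation occurs with
\[ \alpha = -\frac{2\|x\|_2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)}{\|x + \|x\|_2 i_1\|_2^2}. \]
Since $\|x + \|x\|_2 i_1\|_2^2 = 2\left(\|x\|_2^2 + \|x\|_2 x^T i_1\right)$, this simplifies to $\alpha = -\|x\|_2$. You can verify in a similar manner that another possible choice for $v$ is
\[ v = \frac{x - \|x\|_2 i_1}{\|x - \|x\|_2 i_1\|_2}. \]
In practice, one would choose the $v$ for which $\|x \pm \|x\|_2 i_1\|_2$ is largest, to avoid dividing by a small number.

Now, a QR decomposition can be accomplished as follows.
(1) Given $E \in \mathbb{R}^{n \times n}$, $E = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}$, pick $v_1 = \dfrac{e_1 \pm \|e_1\|_2 i_1}{\|e_1 \pm \|e_1\|_2 i_1\|_2}$. Apply $P_1 = I - 2v_1 v_1^T$ to get
\[ P_1 E = \begin{bmatrix} \alpha_1 & * & \cdots & * \\ 0 & \tilde{e}_2 & \cdots & \tilde{e}_n \end{bmatrix} \]
where $\alpha_1$ is an arbitrary number, $0$ is a zero vector of length $n - 1$, $*$ denotes arbitrary numbers, and the $\tilde{e}_i$ are arbitrary vectors of length $n - 1$.
(2) Pick $\tilde{v}_2 = \dfrac{\tilde{e}_2 \pm \|\tilde{e}_2\|_2 i_1}{\|\tilde{e}_2 \pm \|\tilde{e}_2\|_2 i_1\|_2}$ (with $i_1$ now of length $n - 1$), set $v_2 = \begin{bmatrix} 0 \\ \tilde{v}_2 \end{bmatrix}$, and apply $P_2 = I - 2v_2 v_2^T$ to get
\[ P_2 P_1 E = \begin{bmatrix} \alpha_1 & * & * & \cdots & * \\ 0 & \alpha_2 & * & \cdots & * \\ 0 & 0 & \hat{e}_3 & \cdots & \hat{e}_n \end{bmatrix} \]
where the $\hat{e}_i$ are vectors of length $n - 2$. Note that because of the way we chose $P_2$, the first column of $P_1 E$ remains the same, and we zero out the correct parts of the second column.
(3) Continue in this manner, until with $P = P_{n-1} P_{n-2} \cdots P_1$ we get $PE = R$, where $R$ is an upper triangular matrix. Then with $Q = P^T$, $E = QR$.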

The keys to numerical stability are that at each step the modification of $E$ involves an orthonormal matrix, and that the specification of this orthonormal matrix is well conditioned when the norm of the unnormalized vector $x \pm \|x\|_2 i_1$ that defines $v$ is away from zero.
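The Householder steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function name `householder_qr` and the sample matrix are hypothetical, matrices are stored as lists of rows, and the sign in $x \pm \|x\|_2 i_1$ is chosen to match the sign of the first entry of $x$, which makes the norm of that vector as large as possible. Each reflection is applied to $R$ on the left and accumulated into $Q$ on the right, so that $Q = P_1 P_2 \cdots P_{n-1} = P^T$.

```python
import math


def householder_qr(E):
    """Return (Q, R) with E = Q R, Q orthonormal and R upper triangular."""
    n = len(E)
    R = [row[:] for row in E]
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        # x is column k of the current R, from the diagonal down.
        x = [R[i][k] for i in range(k, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # nothing to zero out in this column
        # Choose the sign that makes ||x +/- ||x|| i_1|| largest.
        sign = 1.0 if x[0] >= 0.0 else -1.0
        w = x[:]
        w[0] += sign * norm_x
        norm_w = math.sqrt(sum(t * t for t in w))
        # Unit vector v, padded with zeros so earlier rows are untouched.
        v = [0.0] * k + [t / norm_w for t in w]
        # R <- (I - 2 v v^T) R, one column at a time.
        for j in range(n):
            proj = sum(v[i] * R[i][j] for i in range(n))
            for i in range(n):
                R[i][j] -= 2.0 * v[i] * proj
        # Q <- Q (I - 2 v v^T), one row at a time.
        for i in range(n):
            proj = sum(Q[i][j] * v[j] for j in range(n))
            for j in range(n):
                Q[i][j] -= 2.0 * proj * v[j]
    return Q, R


# Illustrative matrix (an assumption, not taken from the notes).
E = [[2.0, 1.0, 1.0], [1.0, 3.0, 2.0], [1.0, 0.0, 0.0]]
Q, R = householder_qr(E)
for row in R:
    print([round(t, 10) for t in row])
```

The printed rows of $R$ should show zeros (up to rounding) below the diagonal. With this sign choice the diagonal entries of $R$ may be negative, unlike the Gram-Schmidt form where they are the norms $\|u_k\|_2$.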

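CHAPTER 3

Projection Theorem

The close connection between the inner product and the 2-norm comes into play in vector minimization problems that involve the 2-norm. Suppose we have a matrix $A$ and a vector $y$. We would like to find the vector $x$ which gets mapped through $A$ to a vector which is as close as possible to $y$. That is, we have the following problem:
\[ \min_x \|Ax - y\|_2 \tag{3.1} \]
Useful facts:
(1) When $A$ is a matrix, there is always a solution to this minimization problem.
(2) When $A$ is an arbitrary linear operator, there is always a solution to this minimization problem if the image (or range space) of $A$ is closed.
(3) The solution can be found using dot products.
Let's try to understand the minimization problem.

Theorem 3. (Projection Theorem) $\hat{x}$ is a minimizer of (3.1) if and only if $\langle y - A\hat{x}, Ax \rangle = 0$ for all $x \in X$.

Proof. (if) Suppose $\hat{x}$ satisfies $\langle y - A\hat{x}, Ax \rangle = 0$ for all $x \in X$. Let $x$ be another vector in $X$. Then
\begin{align*}
\|Ax - y\|_2^2 &= \|A(\hat{x} + x - \hat{x}) - y\|_2^2 \\
&= \|A\hat{x} - y + A(x - \hat{x})\|_2^2 \\
&= \|A\hat{x} - y\|_2^2 + 2\langle A\hat{x} - y, A(x - \hat{x}) \rangle + \|A(x - \hat{x})\|_2^2.
\end{align*}
Now, $x - \hat{x}$ is a vector in $X$, so that $\langle A\hat{x} - y, A(x - \hat{x}) \rangle = 0$, and
\[ \|Ax - y\|_2^2 = \|A\hat{x} - y\|_2^2 + \|A(x - \hat{x})\|_2^2. \]
Since $\|A(x - \hat{x})\|_2^2 \geq 0$, we have $\|Ax - y\|_2^2 \geq \|A\hat{x} - y\|_2^2$. Thus $\hat{x}$ is a minimizer.

(only if) Now, suppose $\hat{x}$ does not satisfy $\langle y - A\hat{x}, Ax \rangle = 0$ for some $x \in X$, e.g. $\langle y - A\hat{x}, Ax_d \rangle = c \neq 0$. Then for any scalar $\alpha$,
\begin{align*}
\|A(\hat{x} + \alpha x_d) - y\|_2^2 &= \|A\hat{x} - y\|_2^2 + 2\alpha\langle A\hat{x} - y, Ax_d \rangle + \alpha^2\|Ax_d\|_2^2 \\
&= \|A\hat{x} - y\|_2^2 - 2\alpha c + \alpha^2\|Ax_d\|_2^2.
\end{align*}
Because $c \neq 0$, we have $Ax_d \neq 0$, and choosing $\alpha = c/\|Ax_d\|_2^2$ gives $\|A(\hat{x} + \alpha x_d) - y\|_2^2 = \|A\hat{x} - y\|_2^2 - c^2/\|Ax_d\|_2^2 < \|A\hat{x} - y\|_2^2$. Thus $\hat{x}$ is not a minimizer.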
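The orthogonality condition in the Projection Theorem can be checked numerically on a small example. Requiring $\langle y - A\hat{x}, Ax \rangle = 0$ for all $x$ is the same as requiring the residual $y - A\hat{x}$ to be orthogonal to every column of $A$, which for a two-column $A$ gives two linear equations in the two entries of $\hat{x}$. The sketch below solves that $2 \times 2$ system directly; the function name `least_squares_two_columns` and the data are assumptions made for illustration, and the columns of $A$ are assumed to be linearly independent.

```python
from fractions import Fraction


def dot(x, y):
    """Inner product <x, y> of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))


def least_squares_two_columns(a1, a2, y):
    """Minimize ||A x - y||_2 for A = [a1 a2] via <a_i, y - A x> = 0."""
    g11, g12, g22 = dot(a1, a1), dot(a1, a2), dot(a2, a2)
    b1, b2 = dot(a1, y), dot(a2, y)
    d = g11 * g22 - g12 * g12  # nonzero when a1, a2 are independent
    if d == 0:
        raise ValueError("columns of A are linearly dependent")
    # Cramer's rule for the 2x2 system of orthogonality conditions.
    return [(g22 * b1 - g12 * b2) / d, (g11 * b2 - g12 * b1) / d]


# Illustrative data (assumptions, not taken from the notes).
a1 = [Fraction(1), Fraction(1), Fraction(1)]
a2 = [Fraction(0), Fraction(1), Fraction(2)]
y = [Fraction(1), Fraction(0), Fraction(2)]

x_hat = least_squares_two_columns(a1, a2, y)
residual = [yi - (x_hat[0] * p + x_hat[1] * q)
            for yi, p, q in zip(y, a1, a2)]
print(x_hat)
print(dot(residual, a1), dot(residual, a2))
```

The second printed line shows the inner products of the residual with each column of $A$; both vanish at the minimizer, which is exactly the condition in the theorem.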

CHAPTER 4

Matrices and Linear Mappings

A matrix is an $m \times n$ array of scalars that represents a linear map from one vector space to another. If we group the vectors $x_1, x_2, \ldots, x_p$ into a matrix
\[ M = \begin{bmatrix} x_1 & x_2 & \cdots & x_p \end{bmatrix} \]
and define the vector
\[ a = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_p \end{bmatrix} \]
then all linear combinations of $x_1, x_2, \ldots, x_p$ are given by
\[ y = Ma = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_p x_p. \]
The space
\[ S = \{y : y = Ma,\ a \in \mathbb{R}^p\} \]
is called the range space of $M$.

Definition 5. For $M \in \mathbb{R}^{n \times m}$, the range space (or just range) of $M$ is the subset of $\mathbb{R}^n$ to which vectors are mapped by $M$, that is
\[ R(M) = \{y : y = Mx,\ x \in \mathbb{R}^m\}. \]

In matrix notation, we can say that the column vectors of $M$ are linearly independent if and only if $Ma = 0$ implies $a = 0$. If a matrix does not consist of linearly independent column vectors, then there exists a set of nonzero $a$ such that $Ma = 0$. Together with $a = 0$, this set is a subspace of $\mathbb{R}^p$, and is called the null space.

Definition 6. The null space of $M$ is the subset of $\mathbb{R}^m$ which is mapped to the zero vector, that is
\[ N(M) = \{x : 0 = Mx,\ x \in \mathbb{R}^m\}. \]

The dimension of the range space is called the rank of a matrix.

Theorem 4. The rank of $M$ is given by the maximal number of linearly independent columns (or rows).

MATLAB. MATLAB has commands to find the range space, null space and rank of a matrix: orth, null and rank.

Exercises.

1. Solutions to Systems of Linear Equations

Given vector spaces $X$ and $Y$ and a linear operator $A : X \to Y$, consider the equation
\[ Ax = y \tag{4.1} \]
(1) Determine if there is a solution for a particular $y$.
(2) Determine if there is a solution for every $y$.
(3) Determine if the solution is unique.
(4) If the solution exists, find it.

Existence and uniqueness of a solution. $y$ in range space of $A$. Null space of $A$.

Finding the solution, Case 1: $A$ invertible. Definition of the inverse of $A$.

Finding the solution, Case 2: $A$ not invertible.

CHAPTER 5

Square Matrices, Eigenvalues and Eigenvectors

If $A$ is a square matrix, i.e. $A \in \mathbb{R}^{n \times n}$, then $A$ maps vectors back to the same space. These matrices can be characterized by their eigenvalues and eigenvectors.

Definition 7. Given a matrix $A \in \mathbb{R}^{n \times n}$ (or $\mathbb{C}^{n \times n}$), if there exists a scalar $\lambda$ and a vector $x \neq 0$ such that
\[ Ax = \lambda x \]
then $\lambda$ is an eigenvalue of $A$, and $x$ is an eigenvector of $A$.

To find eigenvalues and eigenvectors, we can use the concept of rank. If $\lambda$ is an eigenvalue of $A$, then there exists $x \neq 0$ such that
\[ Ax - \lambda x = 0 \]
or
\[ (A - \lambda I)x = 0. \]
From above, an $x \neq 0$ only exists if the rank of $(A - \lambda I)$ is less than $n$. Recall that the rank of a square matrix is less than $n$ if and only if the determinant of the matrix is zero. This implies that the eigenvalues of $A$ satisfy
\[ \det(A - \lambda I) = 0. \]

Example 5. Let A = Then A I = det(a I) = ( )( ) = = ( )( + ) Thus the eigenvalues are and : For = ; we find the corresponding eigenvector via (A + I) x = x = x which has solution x = : Note that there are many possible solutions, each of which can be obtained from another through a scale factor.

For = ; we find the corresponding eigenvector via (A I) x = x = x which has solution x = :

1. Matrix Exponential

Given $A \in \mathbb{R}^{n \times n}$, we define the matrix exponential as follows:
\[ e^A := I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{A^k}{k!}. \]
Using the definition, you can verify the following properties:
\[ e^0 = I \text{, where } 0 \text{ is a matrix of zeros,} \qquad (e^A)^{-1} = e^{-A}, \qquad \frac{d}{dt} e^{At} = Ae^{At} = e^{At}A. \]
For the last property, note that
\begin{align*}
\frac{d}{dt} \sum_{k=0}^{\infty} \frac{(At)^k}{k!} &= \sum_{k=1}^{\infty} \frac{k A^k t^{k-1}}{k!} \\
&= \sum_{\ell=0}^{\infty} \frac{(\ell + 1) A^{\ell+1} t^{\ell}}{(\ell + 1)!} \\
&= A \sum_{\ell=0}^{\infty} \frac{A^{\ell} t^{\ell}}{\ell!} \\
&= Ae^{At}
\end{align*}
and similarly for $e^{At}A$.

2. Other Matrix Functions

Note that the definition of the matrix exponential corresponds with the usual scalar definition, that is
\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots, \qquad e^A = I + A + \frac{A^2}{2!} + \cdots \]
In general, a matrix function is defined using its series expansion. In particular, we define the matrix natural log
\[ \ln(A) = (A - I) - \frac{1}{2}(A - I)^2 + \frac{1}{3}(A - I)^3 - \frac{1}{4}(A - I)^4 + \cdots \]
It turns out that for a given matrix, the value of the matrix function can be found using a finite expansion as well. The key result is called the Cayley-Hamilton Theorem, which will be stated without proof:

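Theorem 5. (Cayley-Hamilton) Given an $n$-dimensional square matrix $A$, let
\[ a(\lambda) = \det(\lambda I - A) = \lambda^n + \alpha_1 \lambda^{n-1} + \cdots + \alpha_n, \]
so that $a(\lambda) = 0$ is the characteristic equation. Then $a(A) = 0$, that is
\[ A^n + \alpha_1 A^{n-1} + \cdots + \alpha_n I = 0 \]
($A$ satisfies its own characteristic equation).

In particular, this indicates that
\begin{align*}
A^n &= -\alpha_1 A^{n-1} - \cdots - \alpha_n I \\
A^{n+1} &= -\alpha_1 A^n - \alpha_2 A^{n-1} - \cdots - \alpha_n A \\
&= -\alpha_1\left(-\alpha_1 A^{n-1} - \cdots - \alpha_n I\right) - \alpha_2 A^{n-1} - \cdots - \alpha_n A
\end{align*}
which is again a linear combination of $I, A, \ldots, A^{n-1}$. In general, $A^k$ for $k \geq n$ can be written as a linear combination of the terms $I, A, A^2, \ldots, A^{n-1}$. Thus for a given $A$ and a given matrix function $f(\lambda)$, there is a polynomial representation of $f(A)$ of degree $n - 1$, that is
\[ f(A) = \beta_1 A^{n-1} + \beta_2 A^{n-2} + \cdots + \beta_n I. \]
Although we will not prove this in detail, since both $A$ and $\lambda_i$ have the property that $a(\lambda_i) = a(A) = 0$, and because $a(\lambda)$ is what is used to simplify $f(A)$, we can use the function values at the eigenvalues to more easily evaluate the matrix function. Given an $n$-dimensional matrix $A$ and a function $f(\lambda)$, to evaluate $f(A)$ when the eigenvalues of $A$ are simple (not repeated), perform the following steps:

Find the eigenvalues of $A$, that is, $\lambda_1, \ldots, \lambda_n$.
Solve the following equations for $\beta_1$ through $\beta_n$:
\begin{align*}
f(\lambda_1) &= \beta_1 \lambda_1^{n-1} + \beta_2 \lambda_1^{n-2} + \cdots + \beta_n \\
f(\lambda_2) &= \beta_1 \lambda_2^{n-1} + \beta_2 \lambda_2^{n-2} + \cdots + \beta_n \\
&\ \ \vdots \\
f(\lambda_n) &= \beta_1 \lambda_n^{n-1} + \beta_2 \lambda_n^{n-2} + \cdots + \beta_n
\end{align*}
Find $f(A) = \beta_1 A^{n-1} + \beta_2 A^{n-2} + \cdots + \beta_n I$.

When the eigenvalues are repeated, there will not be enough equations to solve for the $\beta_i$, so the additional conditions
\begin{align*}
\left.\frac{d}{d\lambda} f(\lambda)\right|_{\lambda = \lambda_i} &= \left.\frac{d}{d\lambda}\left(\beta_1 \lambda^{n-1} + \beta_2 \lambda^{n-2} + \cdots + \beta_n\right)\right|_{\lambda = \lambda_i} \\
&\ \ \vdots \\
\left.\frac{d^{m-1}}{d\lambda^{m-1}} f(\lambda)\right|_{\lambda = \lambda_i} &= \left.\frac{d^{m-1}}{d\lambda^{m-1}}\left(\beta_1 \lambda^{n-1} + \beta_2 \lambda^{n-2} + \cdots + \beta_n\right)\right|_{\lambda = \lambda_i}
\end{align*}
are added, where $m$ is the index of the repeated eigenvalue.

#! Example 6. Find ln det " p " p p p : The characteristic equation is #! = p! + = p +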

and = p q = p j = ej 6 : Now ln e j 6 = j + k k = ; ; ; 6 ln e j 6 = j + k k = ; ; ; 6 Note that there are multiple solutions. This is because the exponential function is not one to one, so the function ln is multivalued. The solution of interest depends on the context. For now, let's pick the solutions with k = : Then j 6 = e j 6 + or This has solution and j = 6 e j 6 + " p # j 6 j = + j p 6 j = = ln(a) = p A 6 I = = " p # + j p j 6 j j 6 p p 6 p 6 6 6 6 p 6 6 p 6

For an example with repeated roots, see Example B. in the text. Note that, because $\ln(e^x) = x$, we have $\ln(e^A) = A$, and the matrix log is the inverse function for the matrix exponential.
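The eigenvalue-matching steps for simple eigenvalues can be sketched for the $2 \times 2$ case, where $f(A) = \beta_1 A + \beta_2 I$ and the two unknowns follow from $f(\lambda_1)$ and $f(\lambda_2)$. The code below is only an illustration under stated assumptions: the function names `matrix_function_2x2` and `exp_series`, and both sample matrices, are hypothetical and are not the examples from these notes. The result for the exponential is compared against a truncated version of the defining series, and `cmath.log` returns the principal value of the logarithm, which corresponds to picking one particular branch.

```python
import cmath
import math


def matrix_function_2x2(A, f):
    """f(A) = beta1*A + beta2*I for a 2x2 A with distinct eigenvalues."""
    (a, b), (c, d) = A
    trace, determinant = a + d, a * d - b * c
    # Roots of the characteristic equation lambda^2 - trace*lambda + det = 0.
    disc = cmath.sqrt(trace * trace - 4 * determinant)
    lam1, lam2 = (trace + disc) / 2, (trace - disc) / 2
    if lam1 == lam2:
        raise ValueError("repeated eigenvalue: derivative conditions needed")
    # Solve f(lam_i) = beta1*lam_i + beta2 for beta1 and beta2.
    beta1 = (f(lam1) - f(lam2)) / (lam1 - lam2)
    beta2 = f(lam1) - beta1 * lam1
    return [[beta1 * a + beta2, beta1 * b],
            [beta1 * c, beta1 * d + beta2]]


def exp_series(A, terms=40):
    """Truncated series I + A + A^2/2! + ... for a 2x2 matrix."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term becomes A^k / k! by multiplying the previous term by A / k.
        term = [[sum(term[i][m] * A[m][j] for m in range(2)) / k
                 for j in range(2)] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)]
                 for i in range(2)]
    return total


# Illustrative matrix with eigenvalues -1 and -2 (an assumption).
A = [[0.0, 1.0], [-2.0, -3.0]]
print(matrix_function_2x2(A, cmath.exp))
print(exp_series(A))

# Illustrative rotation by pi/6, whose eigenvalues are exp(+/- j*pi/6).
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
print(matrix_function_2x2([[c, -s], [s, c]], cmath.log))
```

The first two printed matrices should agree to within rounding. For the rotation matrix, the principal logarithm has zero diagonal entries and off-diagonal entries of magnitude $\pi/6$, up to rounding.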

APPENDIX A

Appendix A