Linear Algebra Review (Course Notes for Math 308H - Spring 2016)

Dr. Michael S. Pilant

February 12, 2016

1 Background

We begin with one of the most fundamental notions in $\mathbb{R}^2$: distance. Letting $(x_1, y_1)$ denote a vector in $\mathbb{R}^2$, the length of this vector is given by $\sqrt{x_1^2 + y_1^2}$. If we define the inner product of two vectors by
$$\langle (x_1, y_1), (x_2, y_2) \rangle := x_1 x_2 + y_1 y_2$$
then the square of the length of the vector $x$ is given by
$$\|x\|^2 = \langle x, x \rangle$$
This generalizes to $N$-dimensional vectors in $\mathbb{R}^N$ by
$$\|x\|^2 = \langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_N^2$$
We define another operation, the transpose, by
$$(x_1, x_2)^T := \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}^T := (x_1, x_2)$$
This allows us to write $\langle x, y \rangle = x^T y$ when $x, y$ are column vectors, and $\langle x, y \rangle = x y^T$ when $x, y$ are row vectors. If we interpret a two-dimensional matrix as a column vector of row vectors (or a row vector of column vectors), we have a natural definition of the matrix-vector product:
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1m} x_m \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2m} x_m \\ \vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nm} x_m \end{pmatrix}$$
where we take the inner product of the rows of the matrix with the column vector that follows. This provides us with the tools to multiply two matrices together, assuming the number of columns of the first matrix equals the number of rows of the second matrix. We can write this more efficiently using summation notation. If $AB = C$, where $A, B, C$ are matrices of appropriate dimensions, then the value of $C$ at row $i$, column $j$ is given by
$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}$$
where $n$ is the number of columns of $A$, which must equal the number of rows of $B$. This is often expressed as
$$C_{ij} = A_{ik} B_{kj}$$
where a repeated index implies summation over that index. It is also possible to have arrays with more than two indices; for example, $A_{ijk}$ denotes a three-dimensional array of scalars. Such arrays are called tensors and occur in many Physics and Engineering contexts.

2 Matrix Transpose

The transpose of a matrix is actually defined in terms of the inner product:
$$\langle Ax, y \rangle = \langle x, A^T y \rangle$$
if $x, y$ are real vectors. If $x, y$ are complex, then we define the adjoint of $A$ by
$$\langle Ax, y \rangle = \langle x, A^* y \rangle$$
where $A^* = \overline{(A^T)} = (\bar{A})^T$.

If a square matrix satisfies $A = A^T$ then it is said to be symmetric. If $A = A^*$ then it is said to be self-adjoint. Such matrices must be square.

3 Solving Linear Equations

Given a system of linear equations, we can express them in terms of matrices in a systematic way:
$$\begin{aligned} a_{11} x_1 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + \cdots + a_{2n} x_n &= b_2 \\ &\;\;\vdots \\ a_{m1} x_1 + \cdots + a_{mn} x_n &= b_m \end{aligned}$$
This is equivalent, by the definition of the matrix-vector product, to
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

3.1 Row Operations

There are three permissible operations that one can perform that will not affect the solution of a system of linear equations:

1. Interchange two equations.
2. Multiply an equation by a nonzero constant.
3. Add one equation to another.

If we write the system as an augmented matrix
$$\left[ \begin{array}{ccc|c} a_{11} & \cdots & a_{1n} & b_1 \\ a_{21} & \cdots & a_{2n} & b_2 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{array} \right]$$
then, letting $R_i$ denote row $i$ (or equation $i$), the three allowable operations are the row operations

1. $R_i \leftrightarrow R_j$
2. $c R_i \rightarrow R_i$ (with $c \neq 0$)
3. $R_i + R_j \rightarrow R_i$

Solving the system means reducing it to the form
$$x_1 = a_1, \quad x_2 = a_2, \quad \ldots, \quad x_n = a_n$$
which is equivalent to row reducing the augmented matrix:
$$[\, A \mid b \,] \rightarrow [\, I \mid a \,]$$
This process is known as Gaussian Elimination. One can show that the inverse of a square matrix can be computed by row reduction as follows:
$$[\, A \mid I \,] \rightarrow [\, I \mid B \,]$$
where $B = A^{-1}$.

4 Inverse of a Square Matrix

The inverse of a square matrix $A$ is a square matrix $B$ of the same dimension with the following property:
$$A B = B A = I \qquad (1)$$
where $I$ is the identity matrix, which has ones along the diagonal and zeros everywhere else.

Note: Non-square matrices do not have inverses satisfying (1), since the products $AB$ and $BA$ would have different dimensions!

If the inverse of a matrix is known, the system of equations can be solved easily and explicitly:
$$Ax = b \quad \Rightarrow \quad x = A^{-1} b$$

Note: It actually takes more operations to compute $A^{-1}$ and perform the multiplication $A^{-1} b$ than it does to row reduce (perform Gaussian Elimination on) the system $Ax = b$.

5 Linear Independence

A set of $k$ vectors in $\mathbb{R}^N$ is linearly dependent if one vector can be represented as a linear combination of the remaining vectors, that is,
$$x_i = c_1 x_1 + \cdots + c_{i-1} x_{i-1} + c_{i+1} x_{i+1} + \cdots + c_k x_k$$
This is equivalent to
$$c_1 x_1 + \cdots + c_k x_k = 0 \qquad (2)$$
or
$$[X] c = \begin{pmatrix} x_1 & x_2 & \cdots & x_k \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$$
having nonzero solutions $c_j$. We say a set of vectors is linearly independent if (2) does not have any nonzero solutions. For $k = N$ this is equivalent to the matrix $[X]$ being invertible: an $n \times n$ matrix is invertible (non-singular) if and only if its columns are linearly independent.
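The remarks above can be illustrated numerically. The following is a minimal sketch, assuming Python with numpy (the matrix and right-hand side are made-up examples): it solves $Ax = b$ both by elimination, which `np.linalg.solve` performs internally, and by forming $A^{-1}$ explicitly, and it uses the determinant to test the columns for linear independence.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

# Preferred route: solve() performs Gaussian elimination internally,
# reducing [A | b] to [I | a] without ever forming A^{-1}
x_via_solve = np.linalg.solve(A, b)

# Explicit route: x = A^{-1} b (more operations than elimination)
x_via_inverse = np.linalg.inv(A) @ b

# A square matrix is invertible iff its columns are linearly
# independent, i.e. iff det(A) != 0
assert abs(np.linalg.det(A)) > 1e-12
assert np.allclose(x_via_solve, x_via_inverse)
```

Both routes give $x = (1, 3)$; in practice `solve` is both cheaper and numerically more stable than forming the inverse, as the Note above suggests.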
6 Algebra of Square Matrices

The set of square $N \times N$ matrices forms an algebra under the operations of

1. scalar multiplication $cA$, where $c$ is a scalar
2. matrix addition and subtraction $A + B$, $A - B$
3. matrix multiplication $AB$ and $BA$

Since $AB \neq BA$ in most cases, the set of $N \times N$ matrices forms a non-commutative algebra.

7 Eigenvalues and Eigenvectors

A square $N \times N$ matrix $A$ takes a vector $x \in \mathbb{R}^N$ and yields a vector in $\mathbb{R}^N$. In general, the input and output vectors will not be in the same direction. Under certain circumstances, however, the input and output can be proportional (i.e. in the same direction). In this case we have
$$Ax = kx, \quad x \neq 0$$
where $k$ is a scalar. For this to happen we must have
$$(A - kI)x = 0, \quad x \neq 0$$
which can only happen if $A - kI$ is NOT invertible, i.e.
$$\det(A - kI) = 0$$
This is a polynomial equation of degree $N$ in $k$; the polynomial is called the characteristic polynomial of $A$. In general, it will have $N$ real or complex roots, each of multiplicity greater than or equal to 1.

8 Self-Adjoint Matrices

If $A$ is a self-adjoint matrix ($A = A^*$), then the following are true:

1. All eigenvalues of $A$ are real. This follows from the fact that the eigenvalues of $A^*$ are given by $\bar{\lambda}$. If $\lambda = \bar{\lambda}$, then $\lambda$ must be real.
2. Eigenvectors corresponding to different eigenvalues are linearly independent. (This is true for any matrix.) Furthermore, the eigenvectors of a self-adjoint matrix are orthogonal (not true for arbitrary matrices).
3. $A$ has a full set of $N$ orthogonal eigenvectors, which are linearly independent and hence form a basis.
4. $A A^* = A^* A$, hence $A$ is normal.
5. If $Q$ is the matrix of eigenvectors, then $AQ = QD$, where $D$ is the diagonal matrix of eigenvalues of $A$. This means $A$ can be diagonalized: $Q^{-1} A Q = D$.
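These facts admit a small numerical check. The following sketch assumes Python with numpy, and the symmetric matrix is an arbitrary example: it verifies the defining property of the transpose, finds eigenvalues as roots of $\det(A - kI) = 0$, and confirms the orthogonality and diagonalization properties of a self-adjoint matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # real symmetric, hence self-adjoint

# The transpose is characterized by <Ax, y> = <x, A^T y>
x, y = np.array([1.0, -2.0]), np.array([0.5, 3.0])
assert np.isclose(np.dot(A @ x, y), np.dot(x, A.T @ y))

# eigh is specialized to self-adjoint matrices: it returns real
# eigenvalues and orthonormal eigenvectors (the columns of Q)
eigvals, Q = np.linalg.eigh(A)          # eigenvalues 1 and 3

# Each eigenvalue k is a root of the characteristic equation det(A - k I) = 0
for k in eigvals:
    assert abs(np.linalg.det(A - k * np.eye(2))) < 1e-10

# Eigenvectors for distinct eigenvalues are orthogonal,
# and Q diagonalizes A: Q^{-1} A Q = D (here Q^{-1} = Q^T)
assert abs(np.dot(Q[:, 0], Q[:, 1])) < 1e-10
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))
```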
Self-adjoint matrices behave very much like scalars, in the sense that we can easily define functions of them. For example, the scalar exponential is defined as
$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}$$
The exponential of $A$ is defined as
$$e^A = I + A + \frac{A^2}{2!} + \cdots + \frac{A^n}{n!} + \cdots = \sum_{k=0}^{\infty} \frac{A^k}{k!}$$
If $A$ is self-adjoint, it can be diagonalized ($A = Q D Q^{-1}$). This implies that
$$A^2 = Q D Q^{-1} Q D Q^{-1} = Q D^2 Q^{-1}$$
and, for any power, $A^k = Q D^k Q^{-1}$. This means that
$$e^A = I + Q D Q^{-1} + \frac{Q D^2 Q^{-1}}{2!} + \cdots = \sum_{k=0}^{\infty} \frac{Q D^k Q^{-1}}{k!} = Q e^D Q^{-1}$$
In fact, $f(A) = Q f(D) Q^{-1}$ for any piecewise continuous $f$, where $f(D)$ is the diagonal matrix
$$f(D) = \begin{pmatrix} f(\lambda_1) & 0 & \cdots & 0 \\ 0 & f(\lambda_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda_N) \end{pmatrix}$$
This is known as the spectral mapping theorem. Since self-adjoint matrices (and operators) are so fundamental to Science and Engineering applications, it is important to know that we can deal with them almost as if they were scalars.

9 Matrix solution of systems of ODEs

If we consider the system of differential equations
$$\frac{d\vec{y}}{dt} = A \vec{y}(t), \qquad \vec{y}(0) = \vec{y}_0 \qquad (3)$$
where $A$ is a square matrix, we can write down a solution of the form
$$\vec{y}(t) = e^{At} \vec{y}_0$$
It remains to verify that this is a solution of (3). Differentiation yields
$$\frac{d}{dt} \vec{y}(t) = \frac{d}{dt} e^{At} \vec{y}_0 = \frac{d}{dt} \left[ I + At + A^2 \frac{t^2}{2!} + \cdots + A^k \frac{t^k}{k!} + \cdots \right] \vec{y}_0 = \left[ A + A^2 t + \cdots + A^k \frac{t^{k-1}}{(k-1)!} + \cdots \right] \vec{y}_0 = A \left[ I + At + \cdots \right] \vec{y}_0 = A e^{At} \vec{y}_0 = A \vec{y}(t)$$
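The recipe $e^A = Q e^D Q^{-1}$ can be checked directly against the power series definition. A minimal sketch, assuming Python with numpy (the symmetric matrix and the series truncation at 29 terms are arbitrary choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # self-adjoint

# Diagonalize: A = Q D Q^{-1} (Q orthogonal here, so Q^{-1} = Q^T)
eigvals, Q = np.linalg.eigh(A)

# Spectral mapping: e^A = Q e^D Q^{-1}, with e^D diagonal
expA_spectral = Q @ np.diag(np.exp(eigvals)) @ Q.T

# Compare against the truncated power series I + A + A^2/2! + ...
expA_series = np.zeros_like(A)
term = np.eye(2)
for k in range(1, 30):
    expA_series += term        # accumulate A^{k-1} / (k-1)!
    term = term @ A / k        # next term A^k / k!

assert np.allclose(expA_spectral, expA_series)
```

The eigenvalues of $e^A$ are $e^{\lambda_i}$, exactly as the spectral mapping theorem predicts.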
If we have $A = A(t)$, then the solution of
$$\frac{d\vec{y}}{dt} = A(t) \vec{y}(t), \qquad \vec{y}(0) = \vec{y}_0 \qquad (4)$$
is given by
$$\vec{y}(t) = e^{\int A(t)\,dt} \vec{y}_0$$
(this formula is valid provided $A(t)$ commutes with its integral, as it automatically does in the scalar case). For systems of linear equations of the form
$$\frac{d\vec{y}}{dt} + A(t) \vec{y}(t) = \vec{b}(t), \qquad \vec{y}(0) = \vec{y}_0$$
we have the integrating factor (matrix)
$$\mu(t) = e^{\int A(t)\,dt}$$
leading to the system of differential equations
$$\frac{d[\mu(t) \vec{y}(t)]}{dt} = \mu(t) \vec{b}(t), \qquad \vec{y}(0) = \vec{y}_0 \qquad (5)$$
which can be integrated as in the scalar case.

10 Solving Systems by Diagonalization

Given the linear system
$$\frac{dy}{dt} = [A] y(t) + b(t)$$
change variables by $y(t) = [Q] z(t)$, where $[Q]$ is any non-singular matrix. This implies $y'(t) = [Q] z'(t)$, and therefore
$$[Q] z'(t) = [A][Q] z(t) + b(t)$$
or
$$z'(t) = [Q]^{-1} [A][Q] z(t) + [Q]^{-1} b(t)$$
If $A$ can be diagonalized by $[Q]$, then
$$z'(t) = [D] z(t) + \tilde{b}(t)$$
where $\tilde{b}(t) = [Q]^{-1} b(t)$. This results in a fully decoupled system of equations
$$\begin{aligned} z_1'(t) &= \lambda_1 z_1(t) + \tilde{b}_1(t) \\ z_2'(t) &= \lambda_2 z_2(t) + \tilde{b}_2(t) \\ &\;\;\vdots \\ z_N'(t) &= \lambda_N z_N(t) + \tilde{b}_N(t) \end{aligned}$$
each of which can be solved independently of the others. Unfortunately, this does not work as easily if $[A] = [A(t)]$.
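The decoupling above can be sketched for a constant-coefficient homogeneous system ($b = 0$). This assumes Python with numpy; the matrix, initial data, and the closed-form check against $\cosh$ and $\sinh$ are made-up for illustration:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])      # self-adjoint; eigenvalues -1 and 1
y0 = np.array([1.0, 0.0])

# Diagonalize and change variables z = Q^{-1} y, so each z_k' = lambda_k z_k
lam, Q = np.linalg.eigh(A)
z0 = np.linalg.solve(Q, y0)

def y(t):
    # Each decoupled equation solves to z_k(t) = z_k(0) e^{lambda_k t};
    # transform back with y = Q z
    return Q @ (z0 * np.exp(lam * t))

# Sanity check: for this A the exact solution is y(t) = (cosh t, sinh t)
t = 0.7
assert np.allclose(y(t), [np.cosh(t), np.sinh(t)])
```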
11 Solving Systems by Eigenvectors

If $A$ is self-adjoint ($A = A^*$), then we can solve the differential equation
$$\frac{du}{dt} = Au, \qquad u(0) = u_0$$
very easily. Since the eigenvectors $e_k$ of $A$ form an orthogonal basis, we have $\langle e_i, e_j \rangle = 0$ if $i \neq j$, and we can normalize the vectors so that $\langle e_i, e_i \rangle = 1$. If we write
$$u = c_1 e_1 + \cdots + c_N e_N = \sum_{i=1}^{N} c_i e_i$$
we can take the inner product with $e_k$ to get
$$\langle u, e_k \rangle = c_1 \langle e_1, e_k \rangle + \cdots + c_k \langle e_k, e_k \rangle + \cdots + c_N \langle e_N, e_k \rangle = c_k$$
Writing $u(t) = \sum_{i=1}^{N} c_i(t) e_i$ and substituting, we have
$$\frac{du}{dt} = \sum_{i=1}^{N} c_i'(t) e_i = Au = \sum_{i=1}^{N} c_i A e_i = \sum_{i=1}^{N} c_i \lambda_i e_i$$
Taking the inner product of this with $e_k$, we get
$$c_k'(t) = \lambda_k c_k(t)$$
which has the solution $c_k(t) = c_k(0) e^{\lambda_k t}$. Hence the solution can be written as
$$u(t) = \sum_{i=1}^{N} c_i(0) e^{\lambda_i t} e_i$$
One can show (setting $t = 0$) that $c_i(0) = \langle u(0), e_i \rangle$, so
$$u(t) = \sum_{i=1}^{N} \langle u(0), e_i \rangle e^{\lambda_i t} e_i$$
This is the finite-dimensional analog of Fourier Series, which consists of solving equations by writing functions as expansions in orthogonal eigenfunctions $\sin(kt)$, $\cos(kt)$.

12 Linearization

If we have a single nonlinear equation,
$$\frac{dy}{dt} = f(t, y), \qquad y(t_0) = y_0$$
we can expand $f(t, y)$ in a Taylor series about the point $(t, y_0)$ as
$$f(t, y) = f(t, y_0) + \frac{\partial f}{\partial y}(t, y_0)(y - y_0) + \frac{1}{2!} \frac{\partial^2 f}{\partial y^2}(t, y_0)(y - y_0)^2 + \cdots = b(t) + a(t)(y - y_0) + \cdots$$
The nonlinear equation becomes
$$\frac{dy}{dt} = a(t)(y - y_0) + b(t) + \cdots$$
If we let $z(t) = y(t) - y_0$, then
$$\frac{dz}{dt} = a(t) z + b(t) + \cdots, \qquad z(t_0) = 0$$
If we have a system of nonlinear equations
$$\frac{d\vec{y}}{dt} = \vec{f}(t, \vec{y}), \qquad \vec{y}(t_0) = \vec{y}_0$$
we can write
$$\frac{dy_i}{dt} = f_i(t, \vec{y}_0) + \left[ \frac{\partial f_i}{\partial y_j}(t, \vec{y}_0) \right](y_j - y_{0,j}) + \cdots = b_i(t) + [A_{i,j}(t)](y_j - y_{0,j}) + \cdots$$
If we neglect the nonlinear terms, the behavior in the neighborhood of $t = t_0$, $\vec{y} = \vec{y}_0$ is determined by the eigenvalues and eigenvectors of the Jacobian matrix
$$A_{i,j} = \frac{\partial f_i}{\partial y_j}(t_0, \vec{y}_0)$$
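As a sketch of this linearization step, assuming Python with numpy: the pendulum right-hand side, the step size `h`, and the helper `jacobian` below are illustrative choices, not from the notes. The Jacobian is approximated by finite differences and its eigenvalues are examined at an equilibrium.

```python
import numpy as np

# Nonlinear system dy/dt = f(t, y); we linearize about (t0, y0)
def f(t, y):
    return np.array([y[1], -np.sin(y[0])])   # pendulum equations, as an example

def jacobian(t, y, h=1e-6):
    """Finite-difference approximation of A_ij = df_i/dy_j at (t, y)."""
    n = len(y)
    J = np.zeros((n, n))
    f0 = f(t, y)
    for j in range(n):
        dy = np.zeros(n)
        dy[j] = h
        J[:, j] = (f(t, y + dy) - f0) / h    # column j: df/dy_j
    return J

# Near the equilibrium y0 = (0, 0) the Jacobian is [[0, 1], [-1, 0]],
# whose purely imaginary eigenvalues +-i indicate oscillatory behavior
J = jacobian(0.0, np.array([0.0, 0.0]))
eigs = np.linalg.eigvals(J)
```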