Chapter 1: Systems of linear equations and matrices. Section 1.1: Introduction to systems of linear equations

Definition: A linear equation in n variables can be expressed in the form

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$$

where $x_1, x_2, \ldots, x_n$ are unknowns (variables) and $a_1, a_2, \ldots, a_n, b$ are real constants.

Linear or not?
1. $2x = 1$
2. $x + 2y = 5$
3. $2x + y - z = 3$
4. $x^2 + y^2 = 1$
5. $y = \cos(x)$

Definition: A solution of a linear equation is a sequence of numbers $s_1, \ldots, s_n$ that satisfy the equation when we substitute $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$.

Definition: A system of linear equations (or linear system) is a finite set of linear equations in the variables $x_1, x_2, \ldots, x_n$.

Definition: A sequence of numbers $s_1, \ldots, s_n$ is a solution of the system if $x_1 = s_1, x_2 = s_2, \ldots, x_n = s_n$ is a solution of every equation in the system.

Definition: The solution set of a linear system is the set of all possible solutions of that system.

Definition: A linear system of equations that has at least one solution is called consistent. A linear system that has no solution is called inconsistent.

Note: Every system of linear equations has no solutions, exactly one solution, or infinitely many solutions.
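The definition of a solution of a system can be checked mechanically. The sketch below is only an illustration: the helper names `satisfies` and `is_solution` and the two-equation system are hypothetical, not taken from the notes.

```python
from fractions import Fraction

def satisfies(coeffs, b, s):
    """True if a_1*s_1 + ... + a_n*s_n equals b for one linear equation."""
    return sum(Fraction(a) * Fraction(x) for a, x in zip(coeffs, s)) == Fraction(b)

def is_solution(system, s):
    """system is a list of (coeffs, b) pairs; s must satisfy every equation."""
    return all(satisfies(coeffs, b, s) for coeffs, b in system)

# Hypothetical system: x1 + x2 = 3 and x1 - x2 = 1
system = [([1, 1], 3), ([1, -1], 1)]

print(is_solution(system, (2, 1)))  # True: both equations hold
print(is_solution(system, (3, 0)))  # False: the second equation fails
```

Since (2, 1) is a solution, this particular system is consistent.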

Notation: We can rewrite a linear system as a rectangular array of numbers (its augmented matrix).

Definition: Two or more linear systems are equivalent systems if they have the same solution set.

Idea for solving linear systems: replace the given system of equations with an equivalent system that is easier to solve.

Method 1: Substitution

Method 2: System reduction

Method 3: Matrix row reduction

Definition: The steps we used in Method 3 are called elementary row operations. Specifically, these operations are
1. Scaling (multiply all entries in a row by a nonzero constant)
2. Interchange (interchange two rows)
3. Replacement (add a multiple of one row to another row)

Example: Solve

$$\begin{aligned} x_1 + x_3 &= 0 \\ 2x_1 + x_2 - x_3 &= 1 \\ x_2 + x_3 &= 0 \end{aligned}$$
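The three elementary row operations can be sketched as small functions acting on a matrix stored as a list of rows. The function names and the 2 x 3 augmented matrix below are assumptions chosen for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def scale(M, i, c):
    """Scaling: multiply every entry of row i by the nonzero constant c."""
    if c == 0:
        raise ValueError("the scaling constant must be nonzero")
    M[i] = [c * x for x in M[i]]

def interchange(M, i, j):
    """Interchange: swap rows i and j."""
    M[i], M[j] = M[j], M[i]

def replace(M, i, j, c):
    """Replacement: add c times row j to row i."""
    M[i] = [x + c * y for x, y in zip(M[i], M[j])]

# Hypothetical augmented matrix for x1 + 2*x2 = 5, 3*x1 + 4*x2 = 11
M = [[Fraction(1), Fraction(2), Fraction(5)],
     [Fraction(3), Fraction(4), Fraction(11)]]

replace(M, 1, 0, -3)          # R2 <- R2 - 3*R1 gives [0, -2, -4]
scale(M, 1, Fraction(-1, 2))  # R2 <- (-1/2)*R2 gives [0, 1, 2]
replace(M, 0, 1, -2)          # R1 <- R1 - 2*R2 gives [1, 0, 1]

print(M == [[1, 0, 1], [0, 1, 2]])  # True, i.e. x1 = 1 and x2 = 2
```

Rows are indexed from 0 in the code, so row 0 is the first row of the matrix.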

Section 1.2: Gaussian elimination

Definition: A matrix is in row-echelon form if
1. The first nonzero number in each nonzero row (starting from the left) is a 1, called the leading 1. Note that a row may consist entirely of 0s.
2. If any rows consist entirely of 0s, they are grouped at the bottom of the matrix.
3. For any two nonzero rows, the leading 1 of the lower row is further to the right than the leading 1 of the higher row.

Definition: A matrix is in reduced row-echelon form if
1. It is in row-echelon form
2. Each column with a leading 1 has 0s everywhere else in that column

Note: Every matrix has a unique reduced row-echelon form but does not have a unique row-echelon form (different sequences of elementary row operations can lead to different row-echelon forms).

Why should we care? By reducing a matrix to row-echelon or reduced row-echelon form, we make it easier to find a solution to a given system of equations.

What kind of matrices are these?

0 1 5 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0 3 1 0 1 1 0 0 1 0 1 0 1 0 1 1 0 0 1 0 3 0 0 0 1 0 0 0 0 0 0
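The two definitions translate directly into checks on the rows of a matrix. The following is a minimal sketch; the function names and the three test matrices are hypothetical and are not the matrices from the notes.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for a zero row."""
    return next((j for j, x in enumerate(row) if x != 0), None)

def is_row_echelon(M):
    prev = -1               # column of the previous leading 1
    seen_zero_row = False
    for row in M:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
            continue
        # Fail if a nonzero row follows a zero row, the leading entry is
        # not 1, or the leading 1 is not strictly further to the right.
        if seen_zero_row or row[j] != 1 or j <= prev:
            return False
        prev = j
    return True

def is_reduced_row_echelon(M):
    if not is_row_echelon(M):
        return False
    for i, row in enumerate(M):
        j = leading_index(row)
        if j is not None and any(M[k][j] != 0 for k in range(len(M)) if k != i):
            return False    # a leading-1 column has another nonzero entry
    return True

ref_only = [[1, 2, 0], [0, 1, 3], [0, 0, 0]]
reduced = [[1, 0, -3], [0, 1, 3], [0, 0, 0]]
neither = [[0, 0, 0], [1, 0, 0]]

print(is_row_echelon(ref_only), is_reduced_row_echelon(ref_only))  # True False
print(is_row_echelon(reduced), is_reduced_row_echelon(reduced))    # True True
print(is_row_echelon(neither))                                     # False
```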

Definition: Gaussian elimination is a method used to reduce an arbitrary matrix to row-echelon form.

Definition: Gauss-Jordan elimination is a method used to reduce an arbitrary matrix to reduced row-echelon form.

Method and example: Reduce the augmented matrix

$$\left[\begin{array}{ccc|c} 0 & 2 & 3 & 3 \\ 3 & 1 & 5 & 2 \\ 2 & 4 & 1 & 1 \end{array}\right]$$

Step 1: Locate the left-most nonzero column.
Step 2: If necessary, interchange the top row with another row to put a nonzero entry at the top of the column from step 1.
Step 3: If the entry at the top of the column from step 1 is $a$, with $a \neq 1$, multiply the row by $1/a$ to get a leading 1.
Step 4: Add multiples of the top row to the rows below so that all entries below the leading 1 are 0.
Step 5: Cover the top row (pretend it's not there) and repeat from step 1. Continue steps 1-5 until the entire matrix is in row-echelon form.

The procedure of steps 1-5 is Gaussian elimination. Adding one more step:

Step 6: Beginning with the last nonzero row and working up, add multiples of each row to the rows above to introduce 0s above the leading 1.

The procedure of steps 1-6 is Gauss-Jordan elimination.

We can use Gaussian or Gauss-Jordan elimination to find the solution set of a system of linear equations.

Example:
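Steps 1-6 can be sketched in code as follows. This is one possible implementation under stated assumptions, not the only way to organize the procedure: the function names and the augmented matrix `M` (chosen so that the solution is $x_1 = 1$, $x_2 = 2$, $x_3 = 1$) are hypothetical.

```python
from fractions import Fraction

def row_echelon(M):
    """Steps 1-5 (Gaussian elimination): return a row-echelon form of M."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    top = 0                                   # rows above `top` are covered
    for col in range(cols):                   # Step 1: left-most usable column
        pivot = next((r for r in range(top, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue                          # nothing nonzero in this column
        A[top], A[pivot] = A[pivot], A[top]   # Step 2: interchange
        a = A[top][col]
        A[top] = [x / a for x in A[top]]      # Step 3: scale to a leading 1
        for r in range(top + 1, rows):        # Step 4: zeros below the leading 1
            c = A[r][col]
            A[r] = [x - c * y for x, y in zip(A[r], A[top])]
        top += 1                              # Step 5: cover the top row
        if top == rows:
            break
    return A

def reduced_row_echelon(M):
    """Steps 1-6 (Gauss-Jordan elimination)."""
    A = row_echelon(M)
    for i in range(len(A) - 1, -1, -1):       # Step 6: work from the bottom up
        j = next((k for k, x in enumerate(A[i]) if x != 0), None)
        if j is None:
            continue                          # skip zero rows
        for r in range(i):
            c = A[r][j]
            A[r] = [x - c * y for x, y in zip(A[r], A[i])]
    return A

# Hypothetical augmented matrix; note the 0 in the top-left corner,
# which forces an interchange in step 2.
M = [[0, 2, 1, 5],
     [1, 1, 1, 4],
     [2, 1, 0, 4]]

print(reduced_row_echelon(M) == [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 1]])  # True
```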

Terminology: For a system of equations expressed in row-echelon or reduced row-echelon form, the variables corresponding to the leading 1s are called leading variables or pivots. Any variable without a corresponding leading 1 is called a free variable or parameter.

Example:

Sometimes it's easier to use Gaussian elimination to get a row-echelon form and then solve by back substitution, rather than continuing with Gauss-Jordan elimination (especially when solving by hand).

Example: Solve

$$\begin{aligned} 2x_1 - 4x_2 + 2x_3 &= 8 \\ x_1 - 2x_2 + 2x_3 &= 6 \\ x_2 + 6x_3 &= 1 \end{aligned}$$
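Back substitution can be sketched as below for the simplest case: an augmented matrix already in row-echelon form with a leading 1 in every row, so that the solution is unique. The function name and the matrix `R` are hypothetical; free variables are not handled in this sketch.

```python
from fractions import Fraction

def back_substitute(R):
    """R is an n x (n+1) augmented matrix in row-echelon form with a
    leading 1 on every diagonal position (assumed, not checked)."""
    n = len(R)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Row i reads: x_i + sum over j > i of R[i][j] * x_j = R[i][n]
        x[i] = R[i][n] - sum(R[i][j] * x[j] for j in range(i + 1, n))
    return x

# Hypothetical row-echelon form
R = [[1, 1, 1, 4],
     [0, 1, Fraction(1, 2), Fraction(5, 2)],
     [0, 0, 1, 1]]

print(back_substitute(R) == [1, 2, 1])  # True
```

The last row gives $x_3$ directly, and each earlier row is solved for its leading variable using the values already found.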

Definition: A system of linear equations is homogeneous if the constant terms are all zero.

Homogeneous or not?

$$\begin{aligned} x_1 + 17x_2 - 2x_3 &= 0 \\ 811x_1 - 118x_2 + 0x_3 &= 0 \\ 14x_2 + \pi x_3 &= 0 \end{aligned}$$

$$\left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{array}\right]$$

Note: All homogeneous linear systems are consistent, since they all have at least the solution $x_1 = 0, x_2 = 0, \ldots, x_n = 0$, which is called the trivial solution. If the system has any other solutions, they are called nontrivial solutions (recall the example of the lines from earlier).

Example: Solve the homogeneous linear system

$$\begin{aligned} x_1 + x_2 - 2x_3 &= 0 \\ 3x_1 + 2x_2 + 4x_3 &= 0 \\ 4x_1 + 3x_2 + 2x_3 &= 0 \end{aligned}$$
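As a check on the last example, assume its first equation reads $x_1 + x_2 - 2x_3 = 0$ (with this reading the third equation is the sum of the first two, so one equation is redundant). Eliminating by hand then suggests the one-parameter family $x_1 = -8t$, $x_2 = 10t$, $x_3 = t$. The sketch below, with hypothetical helper names, verifies that family by substitution.

```python
# Coefficient rows of the homogeneous example, with the first equation
# read as x1 + x2 - 2*x3 = 0 (an assumption about the sign).
A = [[1, 1, -2],
     [3, 2, 4],
     [4, 3, 2]]

def left_hand_sides(A, x):
    """Evaluate the left-hand side of every equation at the point x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def candidate(t):
    """Proposed solution with free variable x3 = t."""
    return (-8 * t, 10 * t, t)

for t in (0, 1, -3):
    print(t, left_hand_sides(A, candidate(t)))  # every list is [0, 0, 0]
```

Taking $t = 0$ gives the trivial solution; any $t \neq 0$ gives a nontrivial one.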

Note: Elementary row operations do not change the column of zeros, so the equations corresponding to a reduced row-echelon form will also be homogeneous.

Theorem 1.2.1: A homogeneous system of linear equations with more unknowns than equations has infinitely many solutions.

In other words, if there are enough zero rows in the reduced row-echelon form to reduce the number of equations to fewer than the number of unknowns, or we start out with fewer equations than unknowns, the best we can do for a reduced row-echelon matrix looks like

$$\left[\begin{array}{cccccc|c} 1 & 0 & 0 & 0 & a_1 & a_2 & 0 \\ 0 & 1 & 0 & 0 & b_1 & b_2 & 0 \\ 0 & 0 & 1 & 0 & c_1 & c_2 & 0 \\ \vdots & & & & & & \vdots \\ 0 & 0 & 0 & 1 & d_1 & d_2 & 0 \end{array}\right]$$

Note: This theorem applies only to homogeneous systems.

Section 1.3: Matrices and matrix operations

Definition: A matrix is a rectangular array of numbers. The numbers in the array are called the entries in the matrix.

Note: We've been using matrices to represent systems of linear equations, but matrices can be used to represent many different types of data.

Definition: The size of a matrix is given by the number of rows by the number of columns it contains, so a matrix with m rows and n columns has size $m \times n$. An $m \times 1$ matrix is called a column matrix (or column vector) and a $1 \times n$ matrix is called a row matrix (or row vector).

Notation: Capital letters are used to denote matrices (e.g. $A$) and lower case letters are used to denote entries. General entries often have numerical subscripts, with the first number representing the row and the second number representing the column (e.g. $a_{ij}$). Row or column matrices are usually represented by a lower case letter with an arrow (e.g. $\vec{r}$; the book represents them as bold letters).

Note: The same letter is usually used for a matrix (upper case) and its entries (lower case). We often write $(A)_{ij} = a_{ij}$ to denote an entry and $A = [a_{ij}]$ to denote a matrix.

Definition: A matrix with n rows and n columns is called a square matrix of order n. The main diagonal of a square matrix consists of the entries $a_{11}, a_{22}, \ldots, a_{nn}$.

Example:

Definition: Two matrices are equal if they have the same size and the corresponding entries are equal.

Definition: If A and B are matrices of the same size, then the sum $A + B$ is the matrix obtained by adding the entries of B to the corresponding entries of A. The difference $A - B$ is the matrix obtained by subtracting the entries of B from the corresponding entries of A. That is,

$$(A + B)_{ij} = (A)_{ij} + (B)_{ij} = a_{ij} + b_{ij}$$

$$(A - B)_{ij} = (A)_{ij} - (B)_{ij} = a_{ij} - b_{ij}$$

Definition: If A is any matrix and c is any scalar (real or complex number), then the scalar product $cA$ is the matrix obtained by multiplying each entry of the matrix A by c. That is,

$$(cA)_{ij} = c(A)_{ij} = c\,a_{ij}$$

Example:

Definition: A linear combination of matrices $A_1, A_2, \ldots, A_n$, all of the same size, with scalar coefficients $c_1, c_2, \ldots, c_n$, is an expression of the form

$$c_1 A_1 + c_2 A_2 + \cdots + c_n A_n$$

Example:

Definition: If A is an $m \times n$ matrix and B is an $n \times p$ matrix, then the product $AB$ is the $m \times p$ matrix with entries

$$(AB)_{ij} = \sum_{k=1}^{n} (A)_{ik} (B)_{kj} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

for $i = 1, \ldots, m$ and $j = 1, \ldots, p$.

Example:

Alternate Definition 1: If A is an $m \times n$ matrix and B is an $n \times p$ matrix, then the product $AB$ has entries $(AB)_{ij}$ found by multiplying the entries in the ith row of A with the corresponding entries in the jth column of B and then taking the sum of those products.

Example:

Note: For the product $AB$ to be defined, the number of columns in A must be the same as the number of rows in B. The product has the same number of rows as A and the same number of columns as B.

Example:

Note: In general, $AB \neq BA$!
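The row-times-column rule and the failure of commutativity can be seen in a small sketch. The function `matmul` and the two 2 x 2 matrices are hypothetical illustrations.

```python
def matmul(A, B):
    """Entry (i, j) of AB is the sum over k of A[i][k] * B[k][j]."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)        # [[2, 1], [4, 3]]  (columns of A swapped)
print(BA)        # [[3, 4], [1, 2]]  (rows of A swapped)
print(AB == BA)  # False
```

A size mismatch raises an error, matching the note that the number of columns of A must equal the number of rows of B.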

Definition: Any matrix can be partitioned into smaller submatrices by inserting vertical/horizontal rules between columns/rows.

Alternate Definition 2: If A is an $m \times n$ matrix and B is an $n \times p$ matrix, then

ith row matrix of $AB$ = [ith row matrix of A] $B$

and

jth column matrix of $AB$ = $A$ [jth column matrix of B].

Note: We can use this method to find a particular row or column of $AB$ without calculating the entire product.

Now we can use matrix multiplication to rewrite our linear system of equations

$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\ &\;\;\vdots \\ a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n \end{aligned}$$

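So we can compact the notation and write our system of equations as $A\vec{x} = \vec{b}$, where $A = [a_{ij}]$ is the coefficient matrix, $\vec{x}$ is the column matrix of unknowns, and $\vec{b}$ is the column matrix of constants. The augmented matrix is the partitioned matrix $[A \mid \vec{b}]$. We will see much more about this later...

Definition: If A is an $m \times n$ matrix, then the transpose of A, written $A^T$, is the $n \times m$ matrix obtained by interchanging the rows and columns of A. That is,

$$(A^T)_{ij} = (A)_{ji}$$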

Note: For a square matrix, the transpose can be obtained by interchanging entries symmetrically positioned across the main diagonal.

Definition: If A is a square matrix of order n, the trace of A, written $\operatorname{tr}(A)$, is the sum of the entries on the main diagonal of A. That is,

$$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn} = \sum_{k=1}^{n} a_{kk}$$
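Both definitions are short enough to sketch directly. The function names and sample matrices are hypothetical.

```python
def transpose(A):
    """Entry (i, j) of the transpose is entry (j, i) of A."""
    return [list(col) for col in zip(*A)]

def trace(A):
    """Sum of the main-diagonal entries; defined only for square matrices."""
    if any(len(row) != len(A) for row in A):
        raise ValueError("trace is defined only for square matrices")
    return sum(A[k][k] for k in range(len(A)))

A = [[1, 2, 3],
     [4, 5, 6]]          # size 2 x 3
S = [[1, 2],
     [3, 4]]             # square of order 2

print(transpose(A))      # [[1, 4], [2, 5], [3, 6]], size 3 x 2
print(transpose(S))      # [[1, 3], [2, 4]]
print(trace(S))          # 5
```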

Section 1.4: Inverses; Rules of matrix arithmetic

Theorem 1.4.1: Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules are valid (upper case letters denote matrices, lower case letters denote scalars).
1. $A + B = B + A$
2. $A + (B + C) = (A + B) + C$
3. $A(BC) = (AB)C$
4. $A(B + C) = AB + AC$
5. $(A + B)C = AC + BC$
6. $A(B - C) = AB - AC$
7. $(A - B)C = AC - BC$
8. $a(B + C) = aB + aC$
9. $a(B - C) = aB - aC$
10. $(a + b)C = aC + bC$
11. $(a - b)C = aC - bC$
12. $a(bC) = (ab)C$
13. $a(BC) = (aB)C = B(aC)$

Proof: See the text for the general method for proving these rules.

Note: Many (but not all!) of the properties of real numbers are also valid for matrices.

Definition: An $m \times n$ matrix whose entries are all zero is called a zero matrix, denoted $0$ or $0_{m \times n}$.

Theorem 1.4.2: Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules are valid.
1. $A + 0 = 0 + A = A$
2. $A - A = 0$
3. $0 - A = -A$
4. $A0 = 0$; $0A = 0$

Definition: A square matrix of order n whose entries are 1 on the main diagonal and 0 everywhere else is called the identity matrix, denoted $I$ or $I_n$.

Note: The identity matrix behaves like the number 1 when it multiplies another matrix. That is, if A is an $m \times n$ matrix, then $I_m A = A$ and $A I_n = A$.

Theorem 1.4.3: If R is the reduced row-echelon form of an $n \times n$ matrix A, then either R has a row of zeros or R is the identity matrix $I_n$.

Proof: See the text.

Definition: If A is a square matrix of order n and there exists a matrix B of the same size such that $AB = BA = I$, then A is invertible and B is the inverse of A, denoted $A^{-1}$. If no such matrix B can be found, then A is said to be singular.

Theorem 1.4.4: If B and C are both inverses of the matrix A, then $B = C$.

Note: This means that any invertible matrix A has only one inverse, so we can talk about the inverse of a matrix.

Theorem 1.4.5: The matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is invertible if $ad - bc \neq 0$, in which case the inverse is given by the formula

$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

Example:
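The formula of Theorem 1.4.5 can be sketched as follows. The function name `inverse_2x2` and the sample matrices are hypothetical; returning `None` when $ad - bc = 0$ is a design choice of this sketch.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] by the ad - bc formula, or None if ad - bc = 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None                      # the formula does not apply
    k = Fraction(1) / det
    return [[k * d, -k * b],
            [-k * c, k * a]]

A = [[2, 1],
     [5, 3]]                             # ad - bc = 6 - 5 = 1
print(inverse_2x2(A) == [[3, -1], [-5, 2]])   # True
print(inverse_2x2([[1, 2], [2, 4]]))          # None, since ad - bc = 0
```

Multiplying out confirms the first result: the product of `A` with `[[3, -1], [-5, 2]]` is the identity matrix in either order.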

Theorem 1.4.6: If A and B are invertible matrices of the same size, then $AB$ is invertible and

$$(AB)^{-1} = B^{-1} A^{-1}$$

Proof: $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$, and similarly $(B^{-1}A^{-1})(AB) = I$.

Note: This can be extended to any number of factors: the product of any number of invertible matrices is invertible, and the inverse of the product is the product of the inverses in reverse order. That is,

$$(A_1 A_2 \cdots A_n)^{-1} = A_n^{-1} \cdots A_2^{-1} A_1^{-1}$$

Definition: If A is a square matrix of order n, we define the nonnegative integer powers of A to be

$$A^0 = I, \qquad A^n = \underbrace{A A \cdots A}_{n \text{ factors}} \quad (n > 0)$$

and, if A is invertible, the negative integer powers of A to be

$$A^{-n} = (A^{-1})^n = \underbrace{A^{-1} A^{-1} \cdots A^{-1}}_{n \text{ factors}}$$

Theorem 1.4.7: If A is a square matrix of order n and r, s are integers, then

$$A^r A^s = A^{r+s}, \qquad (A^r)^s = A^{rs}$$

Theorem 1.4.8: If A is an invertible matrix, then
1. $A^{-1}$ is invertible and $(A^{-1})^{-1} = A$
2. $A^n$ is invertible and $(A^n)^{-1} = (A^{-1})^n$ for $n = 0, 1, 2, \ldots$
3. For any nonzero scalar k, the matrix $kA$ is invertible and $(kA)^{-1} = \frac{1}{k} A^{-1}$

Proof: See text.

Definition: If A is a square matrix of order n and $p(x)$ is the polynomial

$$p(x) = a_0 + a_1 x + \cdots + a_n x^n$$

then

$$p(A) = a_0 I_n + a_1 A + \cdots + a_n A^n.$$

That is, $p(A)$ is the $n \times n$ matrix found by substituting A for x.

Recall the transpose of a matrix...

Theorem 1.4.9: Assuming that the sizes of the matrices are such that the indicated operations can be performed, the following rules are valid.
1. $(A^T)^T = A$
2. $(A + B)^T = A^T + B^T$ and $(A - B)^T = A^T - B^T$
3. $(kA)^T = kA^T$, where k is a scalar
4. $(AB)^T = B^T A^T$

Proof: See text.

Note: Property 4 can be extended to the product of any number of matrices. That is, $(A_1 A_2 \cdots A_n)^T = A_n^T \cdots A_2^T A_1^T$. Note the similarity to the inverse of the product of matrices.

Theorem 1.4.10: If A is an invertible matrix, then $A^T$ is also invertible and $(A^T)^{-1} = (A^{-1})^T$.

Proof: By property 4 of Theorem 1.4.9, $A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I$ and $(A^{-1})^T A^T = (AA^{-1})^T = I^T = I$.

Example:

Section 1.5: Elementary matrices and a method for finding $A^{-1}$

Definition: A square matrix of order n is an elementary matrix if it can be obtained from the identity matrix $I_n$ by performing a single elementary row operation.

Theorem 1.5.1: If the elementary matrix E results from performing a specific row operation on $I_m$, and if A is an $m \times n$ matrix, then the product $EA$ is the matrix that results from performing the same row operation on A.

Note: This theorem is mainly of theoretical interest (we will use it later in the course). In practice it is much easier to perform the row operation on A directly than to compute the product.

Theorem 1.5.2: Every elementary matrix is invertible and the inverse is also an elementary matrix.
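Theorems 1.5.1 and 1.5.2 can be illustrated with one replacement operation. In this sketch the helper names, the matrix `A`, and the choice of operation (add $-3$ times row 1 to row 2) are all hypothetical.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 5],
     [3, 4, 11]]

# E: apply "add -3 times row 1 to row 2" to the identity matrix I_2.
E = identity(2)
E[1][0] = -3

# The inverse operation "add +3 times row 1 to row 2" gives the inverse of E.
E_inv = identity(2)
E_inv[1][0] = 3

print(matmul(E, A))      # [[1, 2, 5], [0, -2, -4]]: same as the row operation on A
print(matmul(E_inv, E))  # [[1, 0], [0, 1]]
```

The inverse of `E` is itself elementary, because it comes from the identity by the single row operation that undoes the first one.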

Theorem 1.5.3: If A is an $n \times n$ matrix, then the following are equivalent (that is, the statements are either all true or all false).
1. A is invertible
2. $A\vec{x} = \vec{0}$ has only the trivial solution
3. The reduced row-echelon form of A is $I_n$
4. A is expressible as a product of elementary matrices

Proof: See text.

Definition: If a matrix B can be obtained from a matrix A by a finite sequence of elementary row operations (and conversely we can get A from B by applying the inverse row operations in reverse order), then A and B are row equivalent.

Consequence of Theorem 1.5.3: If A is an $n \times n$ matrix and we can find a sequence of row operations that reduces it to $I_n$ (i.e. A and $I_n$ are row equivalent), then applying those same operations in the same order to $I_n$ will yield $A^{-1}$. That is, to find $A^{-1}$, we form the augmented matrix $[A \mid I_n]$ and perform row operations until we get $[I_n \mid A^{-1}]$.

Example:
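The $[A \mid I_n]$ method can be sketched as below. This is a minimal version under stated assumptions: the function name `inverse` is hypothetical, the sample matrices are made up, and entries above and below each leading 1 are cleared in the same pass (a common variation on the step order, which reaches the same reduced form).

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce [A | I]; return the right-hand block, or None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # the left block cannot reach I_n
        M[col], M[pivot] = M[pivot], M[col]   # interchange
        a = M[col][col]
        M[col] = [x / a for x in M[col]]      # scale to get a leading 1
        for r in range(n):                    # replacement: clear the column
            if r != col:
                c = M[r][col]
                M[r] = [x - c * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

print(inverse([[2, 1], [5, 3]]) == [[3, -1], [-5, 2]])  # True
print(inverse([[1, 2], [2, 4]]))                        # None (singular)
```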

Note: If A is not invertible, then it can't be reduced to $I_n$ by elementary row operations, so its reduced row-echelon form must contain at least one row of zeros (Theorem 1.4.3). If we are using the above method to find $A^{-1}$ and we get a row of zeros on the left hand side of the augmented matrix, then we can stop, since we know A is not invertible.

Section 1.6: Further results on systems of equations and invertibility

Theorem 1.6.2: If A is an invertible $n \times n$ matrix, then for each $n \times 1$ matrix $\vec{b}$, the system of equations $A\vec{x} = \vec{b}$ has exactly one solution, namely

$$\vec{x} = A^{-1} \vec{b}$$

Example: Solve

$$\begin{aligned} 2x_1 - x_2 &= 1 \\ 3x_1 + 2x_2 &= 1 \end{aligned}$$

Note: This method for finding $\vec{x}$ only works when you have as many equations as unknowns and the coefficient matrix A is invertible.

Suppose you want to solve a sequence of systems $A\vec{x} = \vec{b}_1, A\vec{x} = \vec{b}_2, \ldots, A\vec{x} = \vec{b}_n$ with a common coefficient matrix A. If A is invertible, we can find $A^{-1}$ and then multiply by each $\vec{b}_i$ to find the solution $\vec{x}_i$. Or, we can form the augmented matrix $[A \mid \vec{b}_1 \mid \vec{b}_2 \mid \cdots \mid \vec{b}_n]$ and apply row operations to bring A to reduced row-echelon form. The columns to the right of A then give the solutions $\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n$.

Example: Solve

a)
$$\begin{aligned} x_1 + x_2 &= 1 \\ x_1 + 2x_2 &= 0 \\ 3x_1 + x_2 &= 5 \end{aligned}$$

b)
$$\begin{aligned} x_1 + x_2 &= 2 \\ x_1 + 2x_2 &= 1 \\ 3x_1 + x_2 &= 8 \end{aligned}$$

Note: This method works even when A is not invertible.
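Systems a) and b) share the coefficient matrix, so both right-hand sides can be carried through one row reduction. The sketch below uses a hypothetical helper `rref` (a compact Gauss-Jordan routine using exact fractions) on the augmented matrix with two right-hand columns.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form of M (list of rows), using exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    top = 0
    for col in range(len(A[0])):
        pivot = next((r for r in range(top, len(A)) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[top], A[pivot] = A[pivot], A[top]
        a = A[top][col]
        A[top] = [x / a for x in A[top]]
        for r in range(len(A)):
            if r != top:
                c = A[r][col]
                A[r] = [x - c * y for x, y in zip(A[r], A[top])]
        top += 1
        if top == len(A):
            break
    return A

# Columns: x1, x2 | right-hand side of a) | right-hand side of b)
M = [[1, 1, 1, 2],
     [1, 2, 0, 1],
     [3, 1, 5, 8]]

R = rref(M)
print(R == [[1, 0, 2, 3], [0, 1, -1, -1], [0, 0, 0, 0]])  # True
```

Reading off the reduced matrix: a) has solution $x_1 = 2$, $x_2 = -1$ and b) has solution $x_1 = 3$, $x_2 = -1$. The zero row shows that the third equation is redundant in both systems, and the method works even though the $3 \times 2$ coefficient matrix is not invertible.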

Theorem 1.6.3: Let A be a square matrix.
1. If B is a square matrix satisfying $BA = I$, then $B = A^{-1}$
2. If B is a square matrix satisfying $AB = I$, then $B = A^{-1}$

Now we have two more statements about invertible matrices (see Theorem 1.5.3):

Theorem 1.6.4: If A is an $n \times n$ matrix, then the following are equivalent (that is, the statements are either all true or all false).
1. A is invertible
2. $A\vec{x} = \vec{0}$ has only the trivial solution
3. The reduced row-echelon form of A is $I_n$
4. A is expressible as a product of elementary matrices
5. $A\vec{x} = \vec{b}$ is consistent for every $n \times 1$ matrix $\vec{b}$
6. $A\vec{x} = \vec{b}$ has exactly one solution for every $n \times 1$ matrix $\vec{b}$

Proof: See text.

Theorem 1.6.5: Let A and B be square matrices of order n. If $AB$ is invertible, then A and B must also be invertible.

We are often interested in determining the conditions under which a system of equations is consistent. That is, given an $m \times n$ matrix A, find all $m \times 1$ matrices $\vec{b}$ such that $A\vec{x} = \vec{b}$ is consistent.

Example and method: Find conditions on $b_1$, $b_2$, and $b_3$ to ensure the following system of equations is consistent.

$$\begin{aligned} x_1 - x_2 - 2x_3 &= b_1 \\ 2x_1 + x_3 &= b_2 \\ 3x_1 + x_2 + 4x_3 &= b_3 \end{aligned}$$
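One way to mechanize this kind of question is to row-reduce $[A \mid I]$ instead of $[A \mid \vec{b}]$: the right-hand block then records which combination of $b_1, b_2, b_3$ ends up next to each row, and every row whose left part becomes zero gives a condition that must equal 0. The sketch below assumes the first equation reads $x_1 - x_2 - 2x_3 = b_1$; the function name `consistency_conditions` is hypothetical.

```python
from fractions import Fraction

def consistency_conditions(A):
    """Forward-eliminate [A | I]. Each returned row (c_1, ..., c_m) means
    c_1*b_1 + ... + c_m*b_m = 0 is required for A x = b to be consistent."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(A)]
    top = 0
    for col in range(n):                      # pivot only in the columns of A
        pivot = next((r for r in range(top, m) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        for r in range(top + 1, m):
            c = M[r][col] / M[top][col]
            M[r] = [x - c * y for x, y in zip(M[r], M[top])]
        top += 1
        if top == m:
            break
    return [row[n:] for row in M[top:]]       # rows whose A-part is all zeros

A = [[1, -1, -2],
     [2, 0, 1],
     [3, 1, 4]]

print(consistency_conditions(A) == [[1, -2, 1]])  # True
```

Under the stated assumption, the single condition is $b_1 - 2b_2 + b_3 = 0$, that is, $b_3 = 2b_2 - b_1$. This agrees with the observation that the third row of A equals twice the second row minus the first. For an invertible matrix the function returns an empty list, matching statement 5 of Theorem 1.6.4.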