Direct Methods for Solving Linear Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le

Overview
General Linear Systems
    Gaussian Elimination
    Triangular Systems
    The LU Factorization
    Pivoting
Special Linear Systems
    Strictly Diagonally Dominant Matrices
    The $LDM^T$ and $LDL^T$ Factorizations
    Positive Definite Systems
    Tridiagonal Systems
2

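Linear Systems of Equations
For a linear system of equations
\[
\begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2 + \cdots + a_{1,n}x_n &= b_1,\\
a_{2,1}x_1 + a_{2,2}x_2 + \cdots + a_{2,n}x_n &= b_2,\\
&\ \,\vdots\\
a_{n,1}x_1 + a_{n,2}x_2 + \cdots + a_{n,n}x_n &= b_n;
\end{aligned}
\]
or equivalently, in matrix/vector notation, $(A)_{n \times n}\,(x)_{n \times 1} = (b)_{n \times 1}$:
\[
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n}\\
a_{2,1} & a_{2,2} & \cdots & a_{2,n}\\
\vdots & \vdots & & \vdots\\
a_{n,1} & a_{n,2} & \cdots & a_{n,n}
\end{bmatrix}
\begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix},
\tag{1}
\]
find $x = [x_1, x_2, \ldots, x_n]^T$ such that the relation $Ax = b$ holds.
3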

Gaussian Elimination
A process of reducing the given linear system to a new linear system in which the unknowns $x_i$ are systematically eliminated;
The reduction is done via elementary row operations;
It may be necessary to reorder the equations to accomplish this, i.e., use equation pivoting.
4

Elementary Row Operations
Type 1: interchange two rows of a matrix: $(i) \leftrightarrow (j)$;
Type 2: replace a row by the same row multiplied by a nonzero constant: $(i) \leftarrow \lambda (i)$, $\lambda \in \mathbb{R} \setminus \{0\}$;
Type 3: replace a row by the same row plus a constant multiple of another row: $(j) \leftarrow (j) + \lambda (i)$, $\lambda \in \mathbb{R}$.
5

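Gaussian Elimination: an Example
Start from the augmented matrix and apply $(2) \leftarrow (2) - (a_{2,1}/a_{1,1})(1)$ and $(3) \leftarrow (3) - (a_{3,1}/a_{1,1})(1)$:
\[
\left[\begin{array}{ccc|c} 1 & 4 & 7 & 30\\ 2 & 5 & 8 & 36\\ 3 & 6 & 10 & 45 \end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|c} 1 & 4 & 7 & 30\\ 0 & -3 & -6 & -24\\ 0 & -6 & -11 & -45 \end{array}\right].
\]
Then apply $(3) \leftarrow (3) - (a_{3,2}/a_{2,2})(2)$:
\[
\Longrightarrow
\left[\begin{array}{ccc|c} 1 & 4 & 7 & 30\\ 0 & -3 & -6 & -24\\ 0 & 0 & 1 & 3 \end{array}\right].
\]
Back substitution: $x_3 = 3$; $-3x_2 - 6x_3 = -24 \Rightarrow x_2 = 2$; $x_1 + 4x_2 + 7x_3 = 30 \Rightarrow x_1 = 1$.
6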
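The elimination and back substitution steps of this example can be sketched in Python. This is a minimal illustration without row interchanges, using exact rational arithmetic; the function name `gaussian_elimination` is a choice made here, not something from the slides.

```python
from fractions import Fraction

def gaussian_elimination(aug):
    """Solve a square system given as an augmented matrix [A | b], no pivoting."""
    n = len(aug)
    m = [[Fraction(v) for v in row] for row in aug]  # exact arithmetic
    # Forward elimination: (k) <- (k) - (a_{k,i}/a_{i,i}) (i)
    for i in range(n - 1):
        if m[i][i] == 0:
            raise ZeroDivisionError("zero pivot: a row interchange is needed")
        for k in range(i + 1, n):
            mult = m[k][i] / m[i][i]
            for j in range(i, n + 1):
                m[k][j] -= mult * m[i][j]
    # Back substitution on the resulting upper triangular system
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

solution = gaussian_elimination([[1, 4, 7, 30], [2, 5, 8, 36], [3, 6, 10, 45]])
print(solution)  # the three Fractions equal 1, 2 and 3
```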

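Triangular Systems: Forward Substitution
Consider the following 2-by-2 lower triangular system:
\[
\begin{bmatrix} l_{1,1} & 0\\ l_{2,1} & l_{2,2} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2 \end{bmatrix}
=
\begin{bmatrix} b_1\\ b_2 \end{bmatrix}.
\]
If $l_{1,1}\, l_{2,2} \neq 0$, then $x_1 = b_1/l_{1,1}$, $x_2 = (b_2 - l_{2,1}x_1)/l_{2,2}$.
The general procedure is obtained by solving the $i$th equation in $Lx = b$ for $x_i$:
\[
x_i = \Big( b_i - \sum_{j=1}^{i-1} l_{i,j}x_j \Big) \Big/ l_{i,i}.
\]
Flop count (the leading 1 is the division that gives $x_1$):
\[
1 + \sum_{i=2}^{n} \Big[ \underbrace{(i-1)}_{\text{mul}} + \underbrace{(i-2)}_{\text{add}} + \underbrace{1}_{\text{sub}} + \underbrace{1}_{\text{div}} \Big] = n^2.
\]
7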
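A minimal Python sketch of forward substitution follows. The function name and the 3-by-3 test system are illustrative choices, and the code assumes every diagonal entry is nonzero.

```python
def forward_substitution(l, b):
    """Solve L x = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contribution of the already computed x_1, ..., x_{i-1}
        s = sum(l[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / l[i][i]
    return x

L_demo = [[2, 0, 0], [1, 3, 0], [4, 5, 6]]
b_demo = [2, 7, 32]          # chosen so that the solution is [1, 2, 3]
x_demo = forward_substitution(L_demo, b_demo)
print(x_demo)
```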

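Triangular Systems: Back Substitution
Consider the following 2-by-2 upper triangular system:
\[
\begin{bmatrix} u_{1,1} & u_{1,2}\\ 0 & u_{2,2} \end{bmatrix}
\begin{bmatrix} x_1\\ x_2 \end{bmatrix}
=
\begin{bmatrix} b_1\\ b_2 \end{bmatrix}.
\]
If $u_{1,1}\, u_{2,2} \neq 0$, then $x_2 = b_2/u_{2,2}$, $x_1 = (b_1 - u_{1,2}x_2)/u_{1,1}$.
The general procedure is obtained by solving the $i$th equation in $Ux = b$ for $x_i$:
\[
x_i = \Big( b_i - \sum_{j=i+1}^{n} u_{i,j}x_j \Big) \Big/ u_{i,i}.
\]
Flop count: $n^2$.
8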
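Back substitution mirrors the forward case, running from the last equation to the first. The sketch below uses an illustrative function name and test system and assumes a nonzero diagonal.

```python
def back_substitution(u, b):
    """Solve U x = b for an upper triangular U with nonzero diagonal."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Subtract the contribution of the already computed x_{i+1}, ..., x_n
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / u[i][i]
    return x

U_demo = [[2, 1, 1], [0, 3, 2], [0, 0, 4]]
b_demo = [7, 12, 12]         # chosen so that the solution is [1, 2, 3]
x_demo = back_substitution(U_demo, b_demo)
print(x_demo)
```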

The Algebra of Triangular Matrices
Definition. A unit triangular matrix is a triangular matrix with ones on the diagonal.
Properties:
The inverse of a nonsingular upper (lower) triangular matrix is upper (lower) triangular;
The product of two upper (lower) triangular matrices is upper (lower) triangular;
The inverse of a unit upper (lower) triangular matrix is unit upper (lower) triangular;
The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular.
9

The LU Factorization
1. Compute a unit lower triangular $L$ and an upper triangular $U$ such that $A = LU$;
2. Solve $Lz = b$ (forward substitution);
3. Solve $Ux = z$ (back substitution).
Example.
\[
\underbrace{\begin{bmatrix} 3 & 5\\ 6 & 7 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix} 1 & 0\\ 2 & 1 \end{bmatrix}}_{L}
\underbrace{\begin{bmatrix} 3 & 5\\ 0 & -3 \end{bmatrix}}_{U}
\]
For $b = [1, 4]^T$, solving $Lz = b$ yields $z = [1, 2]^T$, and solving $Ux = z$ yields $x = [13/9, -2/3]^T$.
10

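LU Factorization: an Example
Example. Apply $(2) \leftarrow (2) - (a_{2,1}/a_{1,1})(1)$ and $(3) \leftarrow (3) - (a_{3,1}/a_{1,1})(1)$:
\[
A = \begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 10 \end{bmatrix}
\Longrightarrow
\begin{bmatrix} 1 & 4 & 7\\ 0 & -3 & -6\\ 0 & -6 & -11 \end{bmatrix} = A_1.
\]
Note that $M_1 A = A_1$ where $M_1$ is the unit lower triangular matrix
\[
M_1 = \begin{bmatrix} 1 & 0 & 0\\ -\dfrac{a_{2,1}}{a_{1,1}} & 1 & 0\\ -\dfrac{a_{3,1}}{a_{1,1}} & 0 & 1 \end{bmatrix}.
\]
11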

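Apply $(3) \leftarrow (3) - (a_{3,2}/a_{2,2})(2)$, where $a_{3,2}$ and $a_{2,2}$ now denote entries of $A_1$:
\[
A_1 = \begin{bmatrix} 1 & 4 & 7\\ 0 & -3 & -6\\ 0 & -6 & -11 \end{bmatrix}
\Longrightarrow
\begin{bmatrix} 1 & 4 & 7\\ 0 & -3 & -6\\ 0 & 0 & 1 \end{bmatrix} = A_2.
\]
Note that $M_2 A_1 = A_2$ where $M_2$ is the unit lower triangular matrix
\[
M_2 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\dfrac{a_{3,2}}{a_{2,2}} & 1 \end{bmatrix}.
\]
Hence, $M_2 M_1 A = A_2$, or equivalently,
\[
A = \underbrace{M_1^{-1} M_2^{-1}}_{L}\, \underbrace{A_2}_{U}.
\]
12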

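Also,
\[
M_1^{-1} = \begin{bmatrix} 1 & 0 & 0\\ -\dfrac{a_{2,1}}{a_{1,1}} & 1 & 0\\ -\dfrac{a_{3,1}}{a_{1,1}} & 0 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 & 0\\ \dfrac{a_{2,1}}{a_{1,1}} & 1 & 0\\ \dfrac{a_{3,1}}{a_{1,1}} & 0 & 1 \end{bmatrix},
\qquad
M_2^{-1} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -\dfrac{a_{3,2}}{a_{2,2}} & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & \dfrac{a_{3,2}}{a_{2,2}} & 1 \end{bmatrix},
\]
\[
L = M_1^{-1} M_2^{-1} = \begin{bmatrix} 1 & 0 & 0\\ \dfrac{a_{2,1}}{a_{1,1}} & 1 & 0\\ \dfrac{a_{3,1}}{a_{1,1}} & \dfrac{a_{3,2}}{a_{2,2}} & 1 \end{bmatrix}.
\]
13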
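As a numerical check for the 3-by-3 example, the multipliers are $a_{2,1}/a_{1,1} = 2$, $a_{3,1}/a_{1,1} = 3$ and, taking the second-step entries from $A_1$, $a_{3,2}/a_{2,2} = (-6)/(-3) = 2$. Substituting them gives the following factors, which can be multiplied out by hand to recover $A$:

```latex
\[
L = \begin{bmatrix} 1 & 0 & 0\\ 2 & 1 & 0\\ 3 & 2 & 1 \end{bmatrix},
\qquad
U = A_2 = \begin{bmatrix} 1 & 4 & 7\\ 0 & -3 & -6\\ 0 & 0 & 1 \end{bmatrix},
\qquad
LU = \begin{bmatrix} 1 & 4 & 7\\ 2 & 5 & 8\\ 3 & 6 & 10 \end{bmatrix} = A.
\]
```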

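Gauss Transformation
Suppose $x \in \mathbb{R}^n$ with $x_k \neq 0$. If
\[
\tau^T = [\underbrace{0, \ldots, 0}_{k}, \tau_{k+1}, \ldots, \tau_n], \qquad \tau_i = \frac{x_i}{x_k} \ \text{ for } k+1 \le i \le n,
\]
and we define $M_k = I - \tau e_k^T$, where $e_k^T = [\underbrace{0, \ldots, 0}_{k-1}, 1, 0, \ldots, 0]$, then
\[
M_k x =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & \vdots & & \vdots\\
0 & \cdots & 1 & 0 & \cdots & 0\\
0 & \cdots & -\tau_{k+1} & 1 & \cdots & 0\\
\vdots & & \vdots & \vdots & \ddots & \vdots\\
0 & \cdots & -\tau_n & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix} x_1\\ \vdots\\ x_k\\ x_{k+1}\\ \vdots\\ x_n \end{bmatrix}
=
\begin{bmatrix} x_1\\ \vdots\\ x_k\\ 0\\ \vdots\\ 0 \end{bmatrix}.
\]
$M_k$: Gauss transformation; $\tau_{k+1}, \ldots, \tau_n$: multipliers.
14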
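The effect of a Gauss transformation can be checked on a small vector. The sketch below builds $M_k = I - \tau e_k^T$ explicitly (something a practical code would avoid) with exact rational arithmetic; the helper names and the sample vector are illustrative.

```python
from fractions import Fraction

def gauss_transform(x, k):
    """Return M_k = I - tau e_k^T (k is 1-based) that zeros x_{k+1}, ..., x_n."""
    n = len(x)
    if x[k - 1] == 0:
        raise ZeroDivisionError("x_k must be nonzero")
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(k, n):  # 0-based rows k..n-1 are rows k+1..n of the slide
        m[i][k - 1] = -Fraction(x[i]) / Fraction(x[k - 1])  # entry -tau_i
    return m

def matvec(m, x):
    """Plain matrix-vector product."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

x_demo = [2, 4, 6, 8]
M2 = gauss_transform(x_demo, 2)      # multipliers tau_3 = 3/2, tau_4 = 2
result = matvec(M2, x_demo)
print(result)                        # entries below position k become zero
```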

Upper Triangularizing
Assume that $A \in \mathbb{R}^{n \times n}$. Gauss transformations $M_1, \ldots, M_{n-1}$ can usually be found such that $M_{n-1} \cdots M_2 M_1 A = U$ is upper triangular.
During the $k$th step:
We are confronted with the matrix $A^{(k-1)} = M_{k-1} \cdots M_1 A$ that is upper triangular in columns $1$ to $k-1$;
The multipliers in $M_k$ are based on $a^{(k-1)}_{k+1,k}, \ldots, a^{(k-1)}_{n,k}$. In particular, we need $a^{(k-1)}_{k,k} \neq 0$ to proceed.
15

The LU Factorization
Let $M_1, \ldots, M_{n-1}$ be the Gauss transforms such that $M_{n-1} \cdots M_1 A = U$ is upper triangular.
If $M_k = I - \tau^{(k)} e_k^T$, then $M_k^{-1} = I + \tau^{(k)} e_k^T$.
Hence, $A = LU$ where $L = M_1^{-1} \cdots M_{n-1}^{-1}$.
$L$ is a unit lower triangular matrix since each $M_k^{-1}$ is unit lower triangular (p. 9).
Let $\tau^{(k)}$ be the vector of multipliers associated with $M_k$; then upon termination, $A[k+1:n,\, k] = \tau^{(k)}[k+1:n]$.
16

The LU Factorization: an Algorithm Description
Algorithm LUFactorization(A)
input: an n-by-n (square) matrix A
output: the LU factorization of A, provided it exists (U is stored in the upper triangle of A, the multipliers of L below the diagonal)
for i from 1 to n do
    for k from i + 1 to n do
        mult := a[k,i] / a[i,i];
        a[k,i] := mult;
        for j from i + 1 to n do
            a[k,j] := a[k,j] - mult * a[i,j];
        od;
    od;
od;
return A
17
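The algorithm translates almost line by line into Python. The sketch below keeps the packed storage (multipliers below the diagonal, $U$ on and above it), uses exact rational arithmetic, and adds an explicit zero-pivot check; `lu_factorization` and `unpack` are names chosen for this illustration.

```python
from fractions import Fraction

def lu_factorization(a):
    """Return the packed LU factors of a square matrix, without pivoting."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]   # work on a copy
    for i in range(n - 1):
        if a[i][i] == 0:
            raise ZeroDivisionError("zero pivot encountered")
        for k in range(i + 1, n):
            mult = a[k][i] / a[i][i]
            a[k][i] = mult                           # store the multiplier
            for j in range(i + 1, n):
                a[k][j] -= mult * a[i][j]
    return a

def unpack(lu):
    """Split the packed result into unit lower triangular L and upper triangular U."""
    n = len(lu)
    l = [[lu[i][j] if j < i else Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    u = [[lu[i][j] if j >= i else Fraction(0) for j in range(n)] for i in range(n)]
    return l, u

packed = lu_factorization([[1, 4, 7], [2, 5, 8], [3, 6, 10]])
L, U = unpack(packed)
print(L)
print(U)
```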

The LU Factorization: Flop Count
Definition. A flop is a floating-point operation. There is no standard agreement on this terminology. We shall adopt the MATLAB convention, which is to count the total number of operations: the flop count includes the total number of adds + multiplies + divides + subtracts.
For LU factorization:
\[
\sum_{i=1}^{n} \sum_{k=i+1}^{n} \Big( 1 + \sum_{j=i+1}^{n} 2 \Big) = \frac{2}{3}n^3 - \frac{1}{2}n^2 - \frac{1}{6}n = O(n^3) \text{ flops.}
\]
Recall that forward substitution (p. 7) and back substitution (p. 8) each take $O(n^2)$ flops.
18
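The closed form can be checked by counting operations directly with the same loop bounds as the LU algorithm. This is only a sanity check; the function names are illustrative.

```python
def lu_flops(n):
    """Count flops with the same triple loop as the LU algorithm."""
    flops = 0
    for i in range(1, n + 1):
        for k in range(i + 1, n + 1):
            flops += 1                      # one division for the multiplier
            for j in range(i + 1, n + 1):
                flops += 2                  # one multiply and one subtract
    return flops

def lu_flops_formula(n):
    """(2/3)n^3 - (1/2)n^2 - (1/6)n, written over the common denominator 6."""
    return (4 * n**3 - 3 * n**2 - n) // 6

for n in (1, 2, 3, 10):
    print(n, lu_flops(n), lu_flops_formula(n))
```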

The LU Factorization: Sufficient Conditions
Definition. A leading principal submatrix of a matrix $A$ is a matrix of the form
\[
A_k = \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k}\\
a_{2,1} & a_{2,2} & \cdots & a_{2,k}\\
\vdots & \vdots & & \vdots\\
a_{k,1} & a_{k,2} & \cdots & a_{k,k}
\end{bmatrix}
\]
for some $1 \le k \le n$.
Theorem. $A \in \mathbb{R}^{n \times n}$ has an LU factorization if $\det(A_k) \neq 0$ for $1 \le k \le n-1$, where the $A_k$ are the leading principal submatrices of $A$. If the LU factorization exists and $A$ is nonsingular, then the LU factorization is unique and $\det(A) = u_{1,1} \cdots u_{n,n}$.
19

Partial Pivoting
Row interchanges, i.e., elementary row operations of type 1 (see p. 5), are needed when one of the pivot elements $a^{(k)}_{k,k} = 0$.
Row interchanges are often necessary even when the pivot is nonzero:
If $|a^{(k)}_{k,k}| \ll |a^{(k)}_{j,k}|$ for some $j$ ($k+1 \le j \le n$), the multiplier $m_{j,k} = a^{(k)}_{j,k} / a^{(k)}_{k,k}$ will be very large. Roundoff error that was produced in the computation of $a^{(k)}_{k,l}$ will be multiplied by the large factor $m_{j,k}$ when computing $a^{(k+1)}_{j,l}$;
Roundoff error can also be dramatically increased in the back substitution step
\[
x_k = \frac{a^{(k)}_{k,n+1} - \sum_{j=k+1}^{n} a^{(k)}_{k,j}\, x_j}{a^{(k)}_{k,k}}
\]
when the pivot $a^{(k)}_{k,k}$ is small.
20

Example. For the linear system
\[
E_1:\ 0.003000\,x_1 + 59.14\,x_2 = 59.17, \qquad E_2:\ 5.291\,x_1 - 6.130\,x_2 = 46.78,
\]
the exact solution is $(x_1)_E = 10.00$, $(x_2)_E = 1.000$, while Gaussian elimination with four-digit rounding arithmetic gives $(x_1)_A = -10.00$, $(x_2)_A = 1.001$.
The pivot is $a^{(1)}_{1,1} = 0.003000$, so $m_{2,1} = 5.291/0.003000 \approx 1764$.
Apply $(E_2) \leftarrow (E_2 - m_{2,1}E_1)$:
\[
E_1:\ 0.003000\,x_1 + 59.14\,x_2 \approx 59.17, \qquad E_2:\ -104300\,x_2 \approx -104400.
\]
Back substitution gives $x_2 \approx 1.001$. However, since the pivot $a_{1,1}$ is small,
\[
x_1 \approx \frac{59.17 - (59.14)(1.001)}{0.003000} = -10.00
\]
contains the small error $0.001$ multiplied by $59.14/0.003000 \approx 20000$.
21

Pivoting Strategy
Select the element of largest magnitude $a^{(k)}_{i,k}$ on or below the pivot $a^{(k)}_{k,k}$: if
\[
|a^{(k)}_{k^*,k}| = \max_{j=k,\ldots,n} |a^{(k)}_{j,k}|,
\]
then swap row $k$ with row $k^*$, and use the new $a^{(k)}_{k,k}$ to form the multipliers.
Example. For the linear system $\{E_1, E_2\}$ on p. 21, since $|a_{1,1}| < |a_{2,1}|$, rows 1 and 2 are swapped. This leads to the correct solution $x_1 = 10.00$, $x_2 = 1.000$.
22
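The effect of the row swap on the 2-by-2 example can be reproduced with Python's `decimal` module, which rounds every operation to a chosen number of significant digits. The sketch assumes the coefficients of the system are 0.003000, 59.14, 59.17 and 5.291, -6.130, 46.78 (decimal points and signs restored from the garbled slide text) and uses round-half-up; the helper `solve_2x2` is illustrative.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# Every operation performed through CTX is rounded to 4 significant digits.
CTX = Context(prec=4, rounding=ROUND_HALF_UP)

def solve_2x2(row1, row2):
    """Eliminate x1 from row2 using row1 as the pivot row, then back substitute."""
    a11, a12, b1 = row1
    a21, a22, b2 = row2
    m = CTX.divide(a21, a11)                         # multiplier m_{2,1}
    a22_new = CTX.subtract(a22, CTX.multiply(m, a12))
    b2_new = CTX.subtract(b2, CTX.multiply(m, b1))
    x2 = CTX.divide(b2_new, a22_new)
    x1 = CTX.divide(CTX.subtract(b1, CTX.multiply(a12, x2)), a11)
    return x1, x2

E1 = (Decimal("0.003000"), Decimal("59.14"), Decimal("59.17"))
E2 = (Decimal("5.291"), Decimal("-6.130"), Decimal("46.78"))

no_pivot = solve_2x2(E1, E2)     # tiny pivot 0.003000: x1 is badly wrong
with_pivot = solve_2x2(E2, E1)   # rows swapped first: pivot 5.291
print(no_pivot, with_pivot)
```

Swapping the rows makes the multiplier small (about 0.000567 instead of 1764), so the rounding error in $x_2$ is no longer amplified when $x_1$ is computed.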

Permutation Matrices
A permutation matrix is just the identity matrix with its rows re-ordered. If $P$ is a permutation and $A$ is a matrix, then $PA$ is a row permuted version of $A$, and $AP$ is a column permuted version of $A$, e.g.,
\[
P = \begin{bmatrix} 0 & 0 & 0 & 1\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 \end{bmatrix},
\qquad
PA = \begin{bmatrix}
a_{4,1} & a_{4,2} & a_{4,3} & a_{4,4}\\
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4}\\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4}\\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4}
\end{bmatrix}.
\]
If $P$ is a permutation, then $P^{-1} = P^T$ ($P$ is orthogonal).
23

An interchange permutation is obtained by merely swapping two rows in the identity.
If $P = E_n \cdots E_1$ and each $E_k$ is the identity with rows $k$ and $p(k)$ interchanged, then the vector $p(1:n)$ is a useful vector encoding of $P$, e.g., $[4, 4, 3, 4]$ is the vector encoding of $P$ on p. 23.
No floating point arithmetic is involved in a permutation operation. However, permutation matrix operations often involve the irregular movement of data, and can represent a significant computational overhead.
24
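The vector encoding can be applied without ever forming $P$: process the interchanges $E_1, E_2, \ldots$ in order. A minimal sketch with 1-based entries in `p` as on the slide (the function name is illustrative):

```python
def apply_interchanges(p, rows):
    """Apply E_1 first, then E_2, ...: E_k swaps rows k and p[k] (1-based)."""
    rows = list(rows)                       # leave the caller's list untouched
    for k, pk in enumerate(p, start=1):
        rows[k - 1], rows[pk - 1] = rows[pk - 1], rows[k - 1]
    return rows

permuted = apply_interchanges([4, 4, 3, 4], ["row1", "row2", "row3", "row4"])
print(permuted)  # ['row4', 'row1', 'row3', 'row2']
```

The result lists the rows of $A$ in the order 4, 1, 3, 2, which matches $PA$ for the 4-by-4 example permutation.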

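Partial Pivoting: the Basic Idea
\[
A = \begin{bmatrix} 3 & 17 & 10\\ 2 & 4 & -2\\ 6 & 18 & -12 \end{bmatrix}
\]
$(1) \leftrightarrow (3)$, $p[1] = 3$:
\[
E_1 = \begin{bmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{bmatrix},
\qquad
E_1 A = \begin{bmatrix} 6 & 18 & -12\\ 2 & 4 & -2\\ 3 & 17 & 10 \end{bmatrix},
\]
\[
M_1 = \begin{bmatrix} 1 & 0 & 0\\ -1/3 & 1 & 0\\ -1/2 & 0 & 1 \end{bmatrix},
\qquad
M_1 E_1 A = \begin{bmatrix} 6 & 18 & -12\\ 0 & -2 & 2\\ 0 & 8 & 16 \end{bmatrix}.
\]
25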

$(2) \leftrightarrow (3)$, $p[2] = 3$:
\[
E_2 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix},
\qquad
M_2 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1/4 & 1 \end{bmatrix},
\qquad
M_2 E_2 M_1 E_1 A = \begin{bmatrix} 6 & 18 & -12\\ 0 & 8 & 16\\ 0 & 0 & 6 \end{bmatrix}.
\]
In general, upon completion we emerge with $M_{n-1} E_{n-1} \cdots M_1 E_1 A = U$, an upper triangular matrix.
To solve the linear system $Ax = b$, we
Compute $y = M_{n-1} E_{n-1} \cdots M_1 E_1 b$;
Solve the upper triangular system $Ux = y$.
All the information necessary to do this is contained in the array $A$ and the vector $p$.
Cost: $O(n^2)$ comparisons associated with the search for the pivots. The overall algorithm involves $2n^3/3$ flops.
26
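A compact Python sketch of elimination with partial pivoting follows. It is one possible bookkeeping, not necessarily the one intended on the slide: whole rows are swapped, so the stored multipliers move with their rows and the result corresponds to $PA = LU$. The pivot vector `p` is 0-based here, the arithmetic is exact, and the matrix entries assume the signs $-2$ and $-12$ in the example.

```python
from fractions import Fraction

def lu_partial_pivoting(a):
    """Return (packed LU, pivot vector p) with p[k] = row swapped with row k."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    p = []
    for k in range(n - 1):
        # Search column k, rows k..n-1, for the entry of largest magnitude
        piv = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[piv][k] == 0:
            raise ZeroDivisionError("matrix is singular")
        a[k], a[piv] = a[piv], a[k]
        p.append(piv)
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]                      # multiplier, stored in place
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return a, p

def solve(lu, p, b):
    """Solve Ax = b from the packed factors: permute b, then two triangular solves."""
    n = len(b)
    y = [Fraction(v) for v in b]
    for k, piv in enumerate(p):
        y[k], y[piv] = y[piv], y[k]
    for i in range(n):                              # forward: L has unit diagonal
        y[i] -= sum(lu[i][j] * y[j] for j in range(i))
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):                  # back substitution with U
        s = sum(lu[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / lu[i][i]
    return x

A = [[3, 17, 10], [2, 4, -2], [6, 18, -12]]
lu, p = lu_partial_pivoting(A)
x = solve(lu, p, [30, 4, 12])       # right-hand side equals A times [1, 1, 1]
print(p, x)
```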

Scaled Partial Pivoting
Scale the coefficients before deciding on row exchanges;
Scale factor: for row $i$, $S_i = \max_{j=1,2,\ldots,n} |a_{i,j}|$;
Let $k+1 \le j \le n$ be such that
\[
|a^{(k)}_{j,k}| / S_j = \max_{i=k+1,\ldots,n} |a^{(k)}_{i,k}| / S_i.
\]
If $|a^{(k)}_{j,k}| / S_j > |a^{(k)}_{k,k}| / S_k$, then rows $j$ and $k$ are exchanged.
27

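Example. For
\[
A = \begin{bmatrix} 2.11 & -4.21 & 0.921\\ 4.01 & 10.2 & -1.12\\ 1.09 & 0.987 & 0.832 \end{bmatrix},
\]
$s_1 = 4.21$, $s_2 = 10.2$, $s_3 = 1.09$, and
\[
\frac{|a_{1,1}|}{s_1} = 0.501, \qquad \frac{|a_{2,1}|}{s_2} = 0.393, \qquad \frac{|a_{3,1}|}{s_3} = 1.00,
\]
so $(1) \leftrightarrow (3)$:
\[
\begin{bmatrix} 1.09 & 0.987 & 0.832\\ 4.01 & 10.2 & -1.12\\ 2.11 & -4.21 & 0.921 \end{bmatrix}.
\]
Compute the multipliers $m_{2,1}, m_{3,1}, \ldots$
Cost: additional $O(n^2)$ comparisons, and $O(n^2)$ flops (divisions).
28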
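The row selection in scaled partial pivoting can be sketched as follows. The matrix entries assume decimal points and signs restored as 2.11, -4.21, 0.921 / 4.01, 10.2, -1.12 / 1.09, 0.987, 0.832 (the signs do not affect the scale factors), indices are 0-based, and the function name is illustrative.

```python
def scaled_pivot_row(a, s, k):
    """Return the row index i >= k that maximizes |a[i][k]| / s[i]."""
    return max(range(k, len(a)), key=lambda i: abs(a[i][k]) / s[i])

A = [[2.11, -4.21, 0.921],
     [4.01, 10.2, -1.12],
     [1.09, 0.987, 0.832]]
S = [max(abs(v) for v in row) for row in A]      # scale factor of each row
ratios = [abs(A[i][0]) / S[i] for i in range(3)]
pivot = scaled_pivot_row(A, S, 0)
print(S, [round(r, 3) for r in ratios], pivot)
```

Row index 2 (the third row) is selected even though its first entry 1.09 is the smallest in the column, because it is the largest relative to the size of its own row.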

Complete Pivoting
Search all the entries $a_{i,j}$, $k \le i, j \le n$, to find the entry with the largest magnitude. Both row and column interchanges are performed to bring this entry into the pivot position.
Only recommended for systems where accuracy is essential, since it requires an additional $O(n^3)$ comparisons.
29

An Application: Multiple Right Hand Sides
Suppose $A$ is nonsingular and $n$-by-$n$, and that $B$ is $n$-by-$p$. Consider the problem of finding $X$ ($n$-by-$p$) so that $AX = B$. If $X = [x_1, \ldots, x_p]$ and $B = [b_1, \ldots, b_p]$ are column partitions, then
Compute PA = LU;
for k from 1 to p do
    Solve L y = P b_k;
    Solve U x_k = y;
od;
Note that $A$ is factored just once. If $B = I_n$, then we emerge with a computed $A^{-1}$.
30
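The loop over right-hand sides can be sketched in Python as follows: factor once, then do one forward and one back substitution per column. The names `factor` and `solve_columns` are illustrative, the row order of $PA$ is kept as an index list `perm`, and the arithmetic is exact.

```python
from fractions import Fraction

def factor(a):
    """PA = LU with partial pivoting; returns packed LU and the row order perm."""
    n = len(a)
    lu = [[Fraction(v) for v in row] for row in a]
    perm = list(range(n))                 # row i of PA is row perm[i] of A
    for k in range(n - 1):
        piv = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[piv] = lu[piv], lu[k]
        perm[k], perm[piv] = perm[piv], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm

def solve_columns(a, b):
    """Solve AX = B column by column, reusing a single factorization of A."""
    n, p = len(a), len(b[0])
    lu, perm = factor(a)                  # the O(n^3) work happens once
    x = [[Fraction(0)] * p for _ in range(n)]
    for c in range(p):
        y = [Fraction(b[perm[i]][c]) for i in range(n)]      # P b_c
        for i in range(n):                                    # solve L y = P b_c
            y[i] -= sum(lu[i][j] * y[j] for j in range(i))
        for i in range(n - 1, -1, -1):                        # solve U x_c = y in place
            s = sum(lu[i][j] * y[j] for j in range(i + 1, n))
            y[i] = (y[i] - s) / lu[i][i]
        for i in range(n):
            x[i][c] = y[i]
    return x

A = [[3, 5], [6, 7]]
A_inv = solve_columns(A, [[1, 0], [0, 1]])   # B = I gives the inverse
print(A_inv)
```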

Strictly Diagonally Dominant Matrices
Definition. An $n \times n$ matrix $A$ is strictly diagonally dominant (sdd) if
\[
|a_{i,i}| > \sum_{j=1,\, j \neq i}^{n} |a_{i,j}|
\]
holds for each $i = 1, 2, \ldots, n$.
Theorem. An sdd matrix is nonsingular.
Proof. Suppose there is $x \neq 0$ such that $Ax = 0$. Then there is $1 \le k \le n$ such that $|x_k| = \max_{1 \le j \le n} |x_j| > 0$. Since $x$ is a solution to $Ax = 0$,
\[
a_{k,k}x_k + \sum_{j=1,\, j \neq k}^{n} a_{k,j}x_j = 0.
\]
Hence,
\[
|a_{k,k}| \le \sum_{j=1,\, j \neq k}^{n} |a_{k,j}|\, \frac{|x_j|}{|x_k|} \le \sum_{j=1,\, j \neq k}^{n} |a_{k,j}|,
\]
a contradiction since $A$ is sdd. Hence, $x = 0$ is the only solution to $Ax = 0$. Equivalently, $A$ is nonsingular.
31
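The definition is easy to test row by row. A minimal sketch (the function name and the two sample matrices are illustrative):

```python
def is_strictly_diagonally_dominant(a):
    """True if |a_ii| > sum of |a_ij| over j != i holds for every row i."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )

sdd_example = [[4, 1, 1], [1, 5, 2], [0, 1, 3]]
not_sdd = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # middle row: 2 is not > 1 + 1
print(is_strictly_diagonally_dominant(sdd_example),
      is_strictly_diagonally_dominant(not_sdd))
```

The second matrix shows that the inequality must be strict in every row: equality in a single row is enough to fail the test.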

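Theorem. Let $A$ be an sdd matrix. Then Gaussian elimination can be performed on any linear system of the form $Ax = b$ to obtain its unique solution without row or column interchanges, and the computations are stable with respect to the growth of roundoff errors.
Proof sketch. Each elimination step uses only type 3 row operations and preserves strict diagonal dominance:
\[
\underbrace{A^{(1)}}_{\text{sdd}} \xrightarrow{\ \text{type 3}\ } \underbrace{A^{(2)}}_{\text{sdd}} \xrightarrow{\ \text{type 3}\ } \cdots \xrightarrow{\ \text{type 3}\ } \underbrace{A^{(n)}}_{\text{sdd}}
\]
32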

$LDM^T$ and $LDL^T$ Factorizations
Theorem. If all the leading principal submatrices of $A \in \mathbb{R}^{n \times n}$ are nonsingular, then there exist unique unit lower triangular matrices $L$ and $M$ and a unique diagonal matrix $D = \mathrm{diag}(d_1, \ldots, d_n)$ such that $A = LDM^T$.
Proof. By the theorem on p. 19, $A = LU$ exists. Set $D = \mathrm{diag}(d_1, \ldots, d_n)$ with $d_i = u_{i,i}$ for $1 \le i \le n$. Since $D$ is nonsingular and $M^T = D^{-1}U$ is unit upper triangular, $A = LU = LD(D^{-1}U) = LDM^T$. Uniqueness follows from the uniqueness of the LU factorization.
Theorem. If $A = LDM^T$ is the $LDM^T$ factorization of a nonsingular symmetric matrix $A$, then $L = M$.
33

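Example. For the matrix
\[
A = \begin{bmatrix} 1 & 4 & 7\\ 4 & 5 & 8\\ 7 & 8 & 10 \end{bmatrix},
\]
the LU factorization of $A$ is
\[
L = \begin{bmatrix} 1 & 0 & 0\\ 4 & 1 & 0\\ 7 & \tfrac{20}{11} & 1 \end{bmatrix},
\qquad
U = \begin{bmatrix} 1 & 4 & 7\\ 0 & -11 & -20\\ 0 & 0 & -\tfrac{29}{11} \end{bmatrix}.
\]
Set
\[
D = \mathrm{diag}(U) = \begin{bmatrix} 1 & 0 & 0\\ 0 & -11 & 0\\ 0 & 0 & -\tfrac{29}{11} \end{bmatrix}.
\]
34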

Then the matrix $M^T$ in the $LDM^T$ factorization of $A$ is
\[
M^T = D^{-1}U = \begin{bmatrix} 1 & 4 & 7\\ 0 & 1 & \tfrac{20}{11}\\ 0 & 0 & 1 \end{bmatrix}.
\]
Since $A$ is symmetric, $M^T = L^T$ and we have the $LDL^T$ factorization of $A$.
Remark. It is possible to compute the matrices $L$, $D$, and $M$ directly, instead of computing them via the LU factorization.
35
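The route through the LU factorization can be sketched in Python with exact arithmetic: eliminate without pivoting, read $D$ off the diagonal of $U$, and scale the rows of $U$ to get $M^T$. The function name `ldmt` is illustrative, and the code assumes all pivots are nonzero.

```python
from fractions import Fraction

def ldmt(a):
    """Return (L, d, M^T) with A = L diag(d) M^T, computed via LU without pivoting."""
    n = len(a)
    u = [[Fraction(v) for v in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        for k in range(i + 1, n):
            l[k][i] = u[k][i] / u[i][i]            # multiplier goes into L
            for j in range(i, n):
                u[k][j] -= l[k][i] * u[i][j]
    d = [u[i][i] for i in range(n)]                # D = diag(U)
    mt = [[u[i][j] / d[i] for j in range(n)] for i in range(n)]   # M^T = D^{-1} U
    return l, d, mt

A = [[1, 4, 7], [4, 5, 8], [7, 8, 10]]
L, d, MT = ldmt(A)
L_transposed = [list(col) for col in zip(*L)]
print(d)
print(MT == L_transposed)   # True because A is symmetric
```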

Positive Definite Matrices
Definition. A matrix $A$ is positive definite if $x^T A x > 0$ for every nonzero $n$-dimensional column vector $x$.
Theorem. If $A$ is an $n \times n$ symmetric positive definite matrix, then
a. $A$ is nonsingular;
b. $a_{i,i} > 0$, for each $1 \le i \le n$;
c. $\max_{1 \le k,j \le n} |a_{k,j}| \le \max_{1 \le i \le n} |a_{i,i}|$;
d. $(a_{i,j})^2 < a_{i,i}\, a_{j,j}$, for each $i \neq j$.
Theorem. A symmetric matrix $A$ is positive definite if and only if each of its leading principal submatrices has a positive determinant.
36

Example. For
\[
A = \begin{bmatrix} 2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -1 & 2 \end{bmatrix},
\]
\[
\det A_1 = \det [2] = 2 > 0, \qquad
\det A_2 = \det \begin{bmatrix} 2 & -1\\ -1 & 2 \end{bmatrix} = 3 > 0, \qquad
\det A_3 = \det A = 4 > 0.
\]
Since $A$ is also symmetric, $A$ is positive definite.
Theorem. A symmetric matrix $A$ is positive definite if and only if Gaussian elimination without row exchanges can be performed on the linear system $Ax = b$ with all the pivot elements positive. Moreover, in this case, the computations are stable with respect to the growth of roundoff errors.
37
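The determinant test for a symmetric matrix can be sketched as follows, using exact arithmetic. The helper names are illustrative, and the example matrix assumes off-diagonal entries of $-1$. Computing $n$ determinants is wasteful compared with a single elimination pass, so this is meant only to mirror the theorem.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination with row swaps, in exact arithmetic."""
    n = len(m)
    m = [[Fraction(v) for v in row] for row in m]
    result = Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)              # whole column is zero: singular
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            result = -result                # a row swap flips the sign
        result *= m[i][i]
        for k in range(i + 1, n):
            mult = m[k][i] / m[i][i]
            for j in range(i, n):
                m[k][j] -= mult * m[i][j]
    return result

def leading_principal_minors(a):
    """Determinants of A_1, ..., A_n."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def is_symmetric_positive_definite(a):
    n = len(a)
    symmetric = all(a[i][j] == a[j][i] for i in range(n) for j in range(n))
    return symmetric and all(d > 0 for d in leading_principal_minors(a))

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
minors = leading_principal_minors(A)
print(minors, is_symmetric_positive_definite(A))
```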

Theorem (Cholesky Factorization). If $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite, then there exists a unique lower triangular $G \in \mathbb{R}^{n \times n}$ with positive diagonal entries such that $A = GG^T$.
Proof. By the second theorem of p. 33, there exist a unit lower triangular $L$ and a diagonal $D = \mathrm{diag}(d_1, \ldots, d_n)$ such that $A = LDL^T$. Since the $d_k$ are positive, the matrix $G = L\, \mathrm{diag}(\sqrt{d_1}, \ldots, \sqrt{d_n})$ is real lower triangular with positive diagonal entries. It also satisfies $A = GG^T$. Uniqueness follows from the uniqueness of the $LDL^T$ factorization.
38

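Example. For the positive definite matrix $A$ on p. 37, the matrices $L$ and $D$ in the $LDL^T$ factorization of $A$ are
\[
L = \begin{bmatrix} 1 & 0 & 0\\ -1/2 & 1 & 0\\ 0 & -2/3 & 1 \end{bmatrix},
\qquad
D = \begin{bmatrix} 2 & 0 & 0\\ 0 & 3/2 & 0\\ 0 & 0 & 4/3 \end{bmatrix}.
\]
Hence, the matrix $G = L\, \mathrm{diag}(\sqrt{d_1}, \sqrt{d_2}, \sqrt{d_3})$ in the $GG^T$ factorization of $A$ is
\[
G = \begin{bmatrix}
\sqrt{2} & 0 & 0\\
-\tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{6} & 0\\
0 & -\tfrac{1}{3}\sqrt{6} & \tfrac{2}{3}\sqrt{3}
\end{bmatrix}.
\]
39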
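The factor $G$ can also be computed directly, column by column, rather than through $LDL^T$. The sketch below uses that direct column-oriented scheme in floating point, which is a swapped-in technique and not the construction used in the proof; the function name is illustrative and the example matrix assumes off-diagonal entries of $-1$.

```python
import math

def cholesky(a):
    """Return lower triangular G with positive diagonal such that A = G G^T."""
    n = len(a)
    g = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(g[j][k] ** 2 for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        g[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            # Entry below the diagonal in column j
            g[i][j] = (a[i][j] - sum(g[i][k] * g[j][k] for k in range(j))) / g[j][j]
    return g

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
G = cholesky(A)
for row in G:
    print(row)
```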

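Tridiagonal Matrices
Matrices of the form
\[
A = \begin{bmatrix}
a_{1,1} & a_{1,2} & 0 & \cdots & \cdots & 0\\
a_{2,1} & a_{2,2} & a_{2,3} & \ddots & & \vdots\\
0 & a_{3,2} & a_{3,3} & a_{3,4} & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \ddots & 0\\
\vdots & & \ddots & \ddots & \ddots & a_{n-1,n}\\
0 & \cdots & \cdots & 0 & a_{n,n-1} & a_{n,n}
\end{bmatrix}
\]
40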

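Now suppose $A$ can be factored into the triangular matrices $L$ and $U$. Suppose that the matrices can be found in the form
\[
L = \begin{bmatrix}
l_{1,1} & 0 & \cdots & \cdots & 0\\
l_{2,1} & l_{2,2} & \ddots & & \vdots\\
0 & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & 0\\
0 & \cdots & 0 & l_{n,n-1} & l_{n,n}
\end{bmatrix},
\qquad
U = \begin{bmatrix}
1 & u_{1,2} & 0 & \cdots & 0\\
0 & 1 & \ddots & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & 0\\
\vdots & & \ddots & 1 & u_{n-1,n}\\
0 & \cdots & \cdots & 0 & 1
\end{bmatrix}.
\]
The zero entries of $A$ are automatically generated by $LU$.
41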

Multiplying $A = LU$, we also find the following conditions:
\[
\begin{aligned}
a_{1,1} &= l_{1,1}; && &&(2)\\
a_{i,i-1} &= l_{i,i-1}, && \text{for each } i = 2, 3, \ldots, n; &&(3)\\
a_{i,i} &= l_{i,i-1}u_{i-1,i} + l_{i,i}, && \text{for each } i = 2, 3, \ldots, n; &&(4)\\
a_{i,i+1} &= l_{i,i}u_{i,i+1}, && \text{for each } i = 1, 2, \ldots, n-1. &&(5)
\end{aligned}
\]
This system is straightforward to solve: (2) and (3) give us $l_{1,1}$ and the off-diagonal entries of $L$; (4) and (5) are used alternately to obtain the remaining entries of $L$ and $U$.
This solution technique is often referred to as Crout factorization.
Cost: $(5n - 4)$ multiplications/divisions; $(3n - 3)$ additions/subtractions.
42
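Conditions (2) to (5) lead to a short solver for tridiagonal systems. The sketch below stores only the three diagonals, interleaves the factorization with the forward substitution $Lz = b$, and finishes with the back substitution $Ux = z$. The function name and the argument layout (`sub`, `diag`, `sup` as lists) are assumptions made for this illustration, and no pivoting is performed.

```python
from fractions import Fraction

def crout_tridiagonal_solve(sub, diag, sup, b):
    """Solve a tridiagonal system Ax = b.

    sub[i-1] = a_{i+1,i}, diag[i] = a_{i+1,i+1}, sup[i] = a_{i+1,i+2} (0-based lists).
    """
    n = len(diag)
    l = [Fraction(0)] * n            # diagonal entries l_{i,i}
    u = [Fraction(0)] * (n - 1)      # superdiagonal entries u_{i,i+1}
    z = [Fraction(0)] * n
    l[0] = Fraction(diag[0])                          # (2)
    z[0] = Fraction(b[0]) / l[0]
    for i in range(1, n):
        u[i - 1] = Fraction(sup[i - 1]) / l[i - 1]    # (5)
        l[i] = diag[i] - sub[i - 1] * u[i - 1]        # (4), using l_{i,i-1} = a_{i,i-1} from (3)
        z[i] = (b[i] - sub[i - 1] * z[i - 1]) / l[i]  # forward substitution L z = b
    x = z[:]
    for i in range(n - 2, -1, -1):
        x[i] = z[i] - u[i] * x[i + 1]                 # back substitution U x = z
    return x

# Tridiagonal matrix with 2 on the diagonal and -1 beside it; b = A [1, 1, 1]^T
x = crout_tridiagonal_solve([-1, -1], [2, 2, 2], [-1, -1], [1, 0, 1])
print(x)
```

Only $O(n)$ storage and work are needed, in line with the operation counts quoted for this technique.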