SOLVING LINEAR SYSTEMS

We want to solve the linear system

    a_{1,1} x_1 + ... + a_{1,n} x_n = b_1
                    ...
    a_{n,1} x_1 + ... + a_{n,n} x_n = b_n

This will be done by the method used in beginning algebra: successively eliminating unknowns from equations, until eventually we have only one equation in one unknown. This process is known as Gaussian elimination. To put it onto a computer, however, we must be more precise than is generally the case in high school algebra.

We begin with the linear system

     3x_1 - 2x_2 -  x_3 =  0    (E1)
     6x_1 - 2x_2 + 2x_3 =  6    (E2)
    -9x_1 + 7x_2 +  x_3 = -1    (E3)
[1] Eliminate x_1 from equations (E2) and (E3). Subtract 2 times (E1) from (E2); and subtract -3 times (E1) from (E3):

    3x_1 - 2x_2 -  x_3 =  0    (E1)
           2x_2 + 4x_3 =  6    (E2)
            x_2 - 2x_3 = -1    (E3)

[2] Eliminate x_2 from equation (E3). Subtract 1/2 times (E2) from (E3):

    3x_1 - 2x_2 -  x_3 =  0
           2x_2 + 4x_3 =  6
                 -4x_3 = -4

Using back substitution, obtain x_3 = 1, x_2 = 1, x_1 = 1.
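The elimination steps and back substitution above can be carried out mechanically. Here is a short illustrative sketch in Python (the notes themselves use Matlab; this translation is ours) applied to the 3 x 3 example system:

```python
# Gaussian elimination on the 3x3 example:
#   3x1 - 2x2 -  x3 =  0
#   6x1 - 2x2 + 2x3 =  6
#  -9x1 + 7x2 +  x3 = -1
A = [[ 3.0, -2.0, -1.0],
     [ 6.0, -2.0,  2.0],
     [-9.0,  7.0,  1.0]]
b = [0.0, 6.0, -1.0]
n = 3

# Forward elimination: for each pivot column k, subtract a multiple
# of the pivot row from every row below it.
for k in range(n - 1):
    for i in range(k + 1, n):
        m = A[i][k] / A[k][k]          # multiplier m_{i,k}
        for j in range(k, n):
            A[i][j] -= m * A[k][j]
        b[i] -= m * b[k]

# Back substitution on the resulting upper triangular system.
x = [0.0] * n
for k in range(n - 1, -1, -1):
    s = sum(A[k][j] * x[j] for j in range(k + 1, n))
    x[k] = (b[k] - s) / A[k][k]

print(x)   # [1.0, 1.0, 1.0]
```

The multipliers formed along the way (2, -3, and 1/2) are exactly the ones used in the hand computation.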
In the computer, we work on the arrays rather than on the equations. To illustrate this, we repeat the preceding example using array notation. The original system is Ax = b, with

    A = [  3  -2  -1 ]        b = [  0 ]
        [  6  -2   2 ]            [  6 ]
        [ -9   7   1 ]            [ -1 ]

We often write these in combined form as an augmented matrix:

    [A | b] = [  3  -2  -1 |  0 ]
              [  6  -2   2 |  6 ]
              [ -9   7   1 | -1 ]

In step 1, we eliminate x_1 from equations 2 and 3. We multiply row 1 by 2 and subtract it from row 2; and we multiply row 1 by -3 and subtract it from row 3. This yields

    [  3  -2  -1 |  0 ]
    [  0   2   4 |  6 ]
    [  0   1  -2 | -1 ]
In step 2, we eliminate x_2 from equation 3. We multiply row 2 by 1/2 and subtract from row 3. This yields

    [  3  -2  -1 |  0 ]
    [  0   2   4 |  6 ]
    [  0   0  -4 | -4 ]

Then we proceed with back substitution as previously.
For the general case, we reduce

    [A | b] = [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
              [     ...               ...     |   ...   ]
              [ a^(1)_{n,1}  ...  a^(1)_{n,n} | b^(1)_n ]

in n-1 steps to the form

    [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
    [          ...          ...     |   ...   ]
    [      0       ...  a^(n)_{n,n} | b^(n)_n ]
More simply, and introducing new notation, this is equivalent to the matrix-vector equation Ux = g:

    [ u_{1,1}  ...  u_{1,n} ] [ x_1 ]   [ g_1 ]
    [           ...         ] [ ... ] = [ ... ]
    [    0          u_{n,n} ] [ x_n ]   [ g_n ]
This is the linear system

    u_{1,1} x_1 + u_{1,2} x_2 + ... + u_{1,n-1} x_{n-1} + u_{1,n} x_n = g_1
                             ...
                      u_{n-1,n-1} x_{n-1} + u_{n-1,n} x_n = g_{n-1}
                                                u_{n,n} x_n = g_n

We solve for x_n, then x_{n-1}, and backwards to x_1. This process is called back substitution:

    x_n = g_n / u_{n,n}

    x_k = ( g_k - (u_{k,k+1} x_{k+1} + ... + u_{k,n} x_n) ) / u_{k,k}

for k = n-1, ..., 1. What we have done here is simply a more carefully defined and methodical version of what you have done in high school algebra.
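The back substitution formula translates directly into a loop. The following is an illustrative Python sketch (the function name back_substitute is our own):

```python
def back_substitute(U, g):
    """Solve the upper triangular system Ux = g by back substitution.

    Implements x_n = g_n / u_{n,n} and, for k = n-1, ..., 1,
    x_k = (g_k - (u_{k,k+1} x_{k+1} + ... + u_{k,n} x_n)) / u_{k,k}.
    """
    n = len(g)
    x = [0.0] * n
    for k in range(n - 1, -1, -1):     # k = n, n-1, ..., 1 (0-based here)
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (g[k] - s) / U[k][k]
    return x

# A small upper triangular example: 2x1 + x2 = 3, 4x2 = 4.
print(back_substitute([[2.0, 1.0], [0.0, 4.0]], [3.0, 4.0]))   # [1.0, 1.0]
```

Note that the loop runs from the last equation upward, so each x_k is available by the time it is needed.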
How do we carry out the conversion of

    [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
    [     ...               ...     |   ...   ]
    [ a^(1)_{n,1}  ...  a^(1)_{n,n} | b^(1)_n ]

to

    [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
    [          ...          ...     |   ...   ]
    [      0       ...  a^(n)_{n,n} | b^(n)_n ]  ?
To help us keep track of the steps of this process, we will denote the initial system by

    [A^(1) | b^(1)] = [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [     ...               ...     |   ...   ]
                      [ a^(1)_{n,1}  ...  a^(1)_{n,n} | b^(1)_n ]

Initially we will make the assumption that every pivot element will be nonzero; later we remove this assumption.
Step 1: We will eliminate x_1 from equations 2 through n. Begin by defining the multipliers

    m_{i,1} = a^(1)_{i,1} / a^(1)_{1,1},    i = 2, ..., n

Here we are assuming the pivot element a^(1)_{1,1} is nonzero. Then in succession, multiply m_{i,1} times row 1 (called the pivot row) and subtract the result from row i. This yields new matrix elements

    a^(2)_{i,j} = a^(1)_{i,j} - m_{i,1} a^(1)_{1,j},    j = 2, ..., n
    b^(2)_i     = b^(1)_i - m_{i,1} b^(1)_1

for i = 2, ..., n. Note that the index j does not include j = 1. The reason is that with the definition of the multiplier m_{i,1}, it is automatic that

    a^(2)_{i,1} = a^(1)_{i,1} - m_{i,1} a^(1)_{1,1} = 0,    i = 2, ..., n
The augmented matrix now is

    [A^(2) | b^(2)] = [ a^(1)_{1,1}  a^(1)_{1,2}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [      0       a^(2)_{2,2}  ...  a^(2)_{2,n} | b^(2)_2 ]
                      [     ...          ...               ...     |   ...   ]
                      [      0       a^(2)_{n,2}  ...  a^(2)_{n,n} | b^(2)_n ]
Step k: Assume that for i = 1, ..., k-1 the unknown x_i has been eliminated from equations i+1 through n. We have the augmented matrix

    [A^(k) | b^(k)] =
        [ a^(1)_{1,1}  a^(1)_{1,2}  ...              a^(1)_{1,n} | b^(1)_1 ]
        [      0       a^(2)_{2,2}  ...              a^(2)_{2,n} | b^(2)_2 ]
        [                  ...                                   |   ...   ]
        [      0  ...      0        a^(k)_{k,k} ...  a^(k)_{k,n} | b^(k)_k ]
        [                  ...          ...              ...     |   ...   ]
        [      0  ...      0        a^(k)_{n,k} ...  a^(k)_{n,n} | b^(k)_n ]

We want to eliminate unknown x_k from equations k+1 through n. Begin by defining the multipliers

    m_{i,k} = a^(k)_{i,k} / a^(k)_{k,k},    i = k+1, ..., n
The pivot element is a^(k)_{k,k}, and we assume it is nonzero. Using these multipliers, we eliminate x_k from equations k+1 through n. Multiply m_{i,k} times row k (the pivot row) and subtract from row i, for i = k+1 through n:

    a^(k+1)_{i,j} = a^(k)_{i,j} - m_{i,k} a^(k)_{k,j},    j = k+1, ..., n
    b^(k+1)_i     = b^(k)_i - m_{i,k} b^(k)_k

This yields the augmented matrix [A^(k+1) | b^(k+1)]:

    [ a^(1)_{1,1}  ...                     ...      a^(1)_{1,n}     | b^(1)_1       ]
    [     ...                                           ...        |   ...         ]
    [  0  ...  a^(k)_{k,k}  a^(k)_{k,k+1}      ...  a^(k)_{k,n}     | b^(k)_k       ]
    [  0  ...      0        a^(k+1)_{k+1,k+1}  ...  a^(k+1)_{k+1,n} | b^(k+1)_{k+1} ]
    [     ...                   ...                     ...        |   ...         ]
    [  0  ...      0        a^(k+1)_{n,k+1}    ...  a^(k+1)_{n,n}   | b^(k+1)_n     ]
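Taking the step-k formulas for k = 1, ..., n-1 gives the full forward elimination. A minimal Python sketch (our own, assuming no zero pivots are encountered, as in the discussion so far):

```python
def forward_eliminate(A, b):
    """Reduce Ax = b to an upper triangular system Ux = g in place.

    For each pivot column k, forms m_{i,k} = a_{i,k}/a_{k,k} and
    subtracts m_{i,k} times row k from row i (i = k+1, ..., n).
    Assumes every pivot a^(k)_{k,k} is nonzero.
    """
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i][k] = 0.0                 # automatic, by choice of m
            for j in range(k + 1, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    return A, b

A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
b = [4.0, 10.0, 24.0]                     # exact solution is (1, 1, 1)
U, g = forward_eliminate(A, b)
print(U)   # [[2.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
print(g)   # [4.0, 2.0, 2.0]
```

Only the subdiagonal entry being eliminated is set to zero explicitly; the formulas guarantee it would be zero anyway, so no arithmetic is wasted on it.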
Doing this for k = 1, 2, ..., n-1 leads to the upper triangular system with the augmented matrix

    [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
    [          ...          ...     |   ...   ]
    [      0       ...  a^(n)_{n,n} | b^(n)_n ]

We later remove the assumption

    a^(k)_{k,k} nonzero,    k = 1, 2, ..., n
QUESTIONS

- How do we remove the assumption on the pivot elements?
- How many operations are involved in this procedure?
- How much error is there in the computed solution due to rounding errors in the calculations?
- How does the machine architecture affect the implementation of this algorithm?
PARTIAL PIVOTING

Recall the reduction of

    [A^(1) | b^(1)] = [ a^(1)_{1,1}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [     ...               ...     |   ...   ]
                      [ a^(1)_{n,1}  ...  a^(1)_{n,n} | b^(1)_n ]

to

    [A^(2) | b^(2)] = [ a^(1)_{1,1}  a^(1)_{1,2}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [      0       a^(2)_{2,2}  ...  a^(2)_{2,n} | b^(2)_2 ]
                      [     ...          ...               ...     |   ...   ]
                      [      0       a^(2)_{n,2}  ...  a^(2)_{n,n} | b^(2)_n ]

What if a^(1)_{1,1} = 0? In that case we look for an equation in which x_1 is present. To do this in such a way as to avoid zero pivots to the maximum extent possible, we do the following.
Look at all the elements in the first column,

    a^(1)_{1,1}, a^(1)_{2,1}, ..., a^(1)_{n,1}

and pick the largest in size. Say it is in row k:

    |a^(1)_{k,1}| = max over j = 1, ..., n of |a^(1)_{j,1}|

Then interchange equations 1 and k, which means interchanging rows 1 and k in the augmented matrix [A^(1) | b^(1)]. Then proceed with the elimination of x_1 from equations 2 through n as before.

Having obtained

    [A^(2) | b^(2)] = [ a^(1)_{1,1}  a^(1)_{1,2}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [      0       a^(2)_{2,2}  ...  a^(2)_{2,n} | b^(2)_2 ]
                      [     ...          ...               ...     |   ...   ]
                      [      0       a^(2)_{n,2}  ...  a^(2)_{n,n} | b^(2)_n ]

what if a^(2)_{2,2} = 0? Then we proceed in the same way.
Among the elements

    a^(2)_{2,2}, a^(2)_{3,2}, ..., a^(2)_{n,2}

pick the one of largest size, say in row k:

    |a^(2)_{k,2}| = max over j = 2, ..., n of |a^(2)_{j,2}|

Interchange rows 2 and k. Then proceed as before to eliminate x_2 from equations 3 through n, thus obtaining

    [A^(3) | b^(3)] = [ a^(1)_{1,1}  a^(1)_{1,2}  a^(1)_{1,3}  ...  a^(1)_{1,n} | b^(1)_1 ]
                      [      0       a^(2)_{2,2}  a^(2)_{2,3}  ...  a^(2)_{2,n} | b^(2)_2 ]
                      [      0            0       a^(3)_{3,3}  ...  a^(3)_{3,n} | b^(3)_3 ]
                      [     ...          ...          ...              ...      |   ...   ]
                      [      0            0       a^(3)_{n,3}  ...  a^(3)_{n,n} | b^(3)_n ]

This is done at every stage of the elimination process. This technique is called partial pivoting, and it is a part of most Gaussian elimination programs (including the one in the text).
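Partial pivoting as described above (pick the largest-magnitude entry in the pivot column, interchange rows, then eliminate) can be sketched as follows. This is an illustrative Python version of ours, not the program from the text:

```python
def eliminate_with_pivoting(A, b):
    """Forward elimination with partial pivoting.

    At step k, the row with the largest |a_{i,k}| (i >= k) is swapped
    into the pivot position before eliminating, so every multiplier
    satisfies |m_{i,k}| <= 1.
    """
    n = len(b)
    for k in range(n - 1):
        # choose the largest pivot candidate in column k
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i][k] = 0.0
            for j in range(k + 1, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    return A, b

# A system whose first pivot is zero: without the row interchange,
# elimination would fail immediately with a division by zero.
A = [[0.0, 2.0], [1.0, 1.0]]
b = [2.0, 2.0]
U, g = eliminate_with_pivoting(A, b)
print(U, g)   # [[1.0, 1.0], [0.0, 2.0]] [2.0, 2.0]
```

The same row interchanges must of course be applied to b, which is why the augmented-matrix point of view is convenient.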
Consequences of partial pivoting. Recall the definitions used in eliminating x_1 from equations 2 through n:

    m_{i,1} = a^(1)_{i,1} / a^(1)_{1,1}

    a^(2)_{i,j} = a^(1)_{i,j} - m_{i,1} a^(1)_{1,j},    j = 2, ..., n
    b^(2)_i     = b^(1)_i - m_{i,1} b^(1)_1

for i = 2, ..., n. By our choice of the pivot element a^(1)_{1,1}, we have

    |m_{i,1}| <= 1,    i = 2, ..., n

Thus in the calculation of a^(2)_{i,j} and b^(2)_i, the elements do not grow rapidly in size. This is in comparison to what might happen otherwise, in which the multipliers m_{i,1} might have been very large. This property holds for the multipliers at every step of the elimination process:

    |m_{i,k}| <= 1,    i = k+1, ..., n,    k = 1, ..., n-1
The property |m_{i,k}| <= 1, i = k+1, ..., n, leads to good error propagation properties in Gaussian elimination with partial pivoting. The only error in Gaussian elimination is that derived from the rounding errors in the arithmetic operations. For example, at the first elimination step (eliminating x_1 from equations 2 through n),

    a^(2)_{i,j} = a^(1)_{i,j} - m_{i,1} a^(1)_{1,j},    j = 2, ..., n
    b^(2)_i     = b^(1)_i - m_{i,1} b^(1)_1

The above bound on the size of the multipliers prevents these numbers, and the errors in their calculation, from growing as rapidly as they might if no partial pivoting were used. As an example of the improvement in accuracy obtained with partial pivoting, see the worked example in the text.
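The effect is easy to demonstrate. In the sketch below (our own example, not the one from the text), the system has exact solution very close to x_1 = x_2 = 1, but its first pivot is tiny. Naive elimination in double precision loses x_1 entirely, while a single row interchange recovers full accuracy:

```python
def solve2(A, b):
    """Solve a 2x2 system by elimination without pivoting."""
    m = A[1][0] / A[0][0]              # multiplier m_{2,1}
    a22 = A[1][1] - m * A[0][1]
    b2 = b[1] - m * b[0]
    x2 = b2 / a22
    x1 = (b[0] - A[0][1] * x2) / A[0][0]
    return [x1, x2]

# 1e-20 * x1 + x2 = 1,  x1 + x2 = 2; true solution is near (1, 1).
A = [[1e-20, 1.0], [1.0, 1.0]]
b = [1.0, 2.0]

naive = solve2([row[:] for row in A], b[:])      # no pivoting: m = 1e20
pivoted = solve2([A[1], A[0]], [b[1], b[0]])     # rows interchanged first

print(naive)     # [0.0, 1.0]  -- x1 is completely wrong
print(pivoted)   # [1.0, 1.0]  -- both components accurate
```

Without pivoting, the huge multiplier 1e20 swamps the entries 1 and 2 in the rounding of a22 and b2, and the information needed to recover x_1 is destroyed; with pivoting, every multiplier is at most 1 in size.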
OPERATION COUNTS

One of the major ways in which we compare the efficiency of different numerical methods is to count the number of needed arithmetic operations. For solving the linear system

    a_{1,1} x_1 + ... + a_{1,n} x_n = b_1
                    ...
    a_{n,1} x_1 + ... + a_{n,n} x_n = b_n

using Gaussian elimination, we have the following operation counts.

A -> U, where we are converting Ax = b to Ux = g:

    Divisions:        n(n-1)/2
    Additions:        n(n-1)(2n-1)/6
    Multiplications:  n(n-1)(2n-1)/6
b -> g:

    Additions:        n(n-1)/2
    Multiplications:  n(n-1)/2

Solving Ux = g:

    Divisions:        n
    Additions:        n(n-1)/2
    Multiplications:  n(n-1)/2
On some machines, the cost of a division is much more than that of a multiplication; whereas on others there is not any important difference. We assume the latter; and then the operation costs are as follows (MD = multiplications and divisions, AS = additions and subtractions):

    MD(A -> U)  = n(n^2 - 1)/3
    MD(b -> g)  = n(n-1)/2
    MD(Find x)  = n(n+1)/2

    AS(A -> U)  = n(n-1)(2n-1)/6
    AS(b -> g)  = n(n-1)/2
    AS(Find x)  = n(n-1)/2
Thus the total number of operations is

    Additions:                      n^3/3 + n^2/2 - 5n/6
    Multiplications and divisions:  n^3/3 + n^2 - n/3

Both are around n^3/3, and thus the total operation count is approximately 2n^3/3. What happens to the cost when n is doubled?
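These formulas can be checked empirically by instrumenting the algorithm with counters. The sketch below (our own instrumentation) counts multiplications and divisions during a full solve and compares the count against n^3/3 + n^2 - n/3:

```python
def solve_counting_md(A, b):
    """Gaussian elimination plus back substitution, counting the
    multiplications and divisions (MD) as they are performed."""
    n = len(b)
    md = 0
    for k in range(n - 1):                    # A -> U and b -> g
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]; md += 1    # one division per multiplier
            for j in range(k + 1, n):
                A[i][j] -= m * A[k][j]; md += 1
            b[i] -= m * b[k]; md += 1
    x = [0.0] * n                             # solve Ux = g
    for k in range(n - 1, -1, -1):
        s = 0.0
        for j in range(k + 1, n):
            s += A[k][j] * x[j]; md += 1
        x[k] = (b[k] - s) / A[k][k]; md += 1
    return x, md

n = 10
# an n x n test matrix with safely nonzero pivots (diagonally dominant)
A = [[11.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
b = [sum(row) for row in A]                   # exact solution is all ones
x, md = solve_counting_md(A, b)
print(md, (n**3 + 3 * n**2 - n) // 3)         # 430 430
```

For n = 10 the formula gives (1000 + 300 - 10)/3 = 430, and the counter agrees; doubling n multiplies the count by roughly 8.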
Solving Ax = b and Ax = c: what is the cost? Only the modification of the right side is different in these two cases. Thus the additional cost is

    MD(b -> g) + MD(Find x) = n(n-1)/2 + n(n+1)/2 = n^2
    AS(b -> g) + AS(Find x) = n(n-1)/2 + n(n-1)/2 = n(n-1)

The total is around 2n^2 operations, which is quite a bit smaller than 2n^3/3 when n is even moderately large, say n = 100. Thus one can solve the linear system Ax = c at little additional cost to that for solving Ax = b. This has important consequences when it comes to estimation of the error in computed solutions.
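The reason the second solve is cheap is that the work done in triangulating A can be saved and reused: store the multipliers m_{i,k} (this stored form is the LU factorization of A) and apply them to each new right-hand side. A Python sketch of ours:

```python
def factor(A):
    """Triangulate A once (about 2n^3/3 operations), storing each
    multiplier m_{i,k} below the diagonal in place of the created zero."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            A[i][k] = m                       # save the multiplier
            for j in range(k + 1, n):
                A[i][j] -= m * A[k][j]
    return A

def solve_factored(LU, b):
    """Given the stored factorization, solve for one right-hand side in
    about 2n^2 operations: apply the multipliers, then back substitute."""
    n = len(b)
    g = b[:]
    for k in range(n - 1):                    # b -> g
        for i in range(k + 1, n):
            g[i] -= LU[i][k] * g[k]
    x = [0.0] * n                             # solve Ux = g
    for k in range(n - 1, -1, -1):
        s = sum(LU[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (g[k] - s) / LU[k][k]
    return x

LU = factor([[3.0, -2.0, -1.0], [6.0, -2.0, 2.0], [-9.0, 7.0, 1.0]])
print(solve_factored(LU, [0.0, 6.0, -1.0]))   # [1.0, 1.0, 1.0]
print(solve_factored(LU, [-4.0, 8.0, 8.0]))   # [1.0, 2.0, 3.0]
```

The expensive factor step runs once; each additional right-hand side costs only the cheap solve_factored step.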
CALCULATING THE MATRIX INVERSE

Consider finding the inverse of a 3 x 3 matrix

    A = [ a_{1,1}  a_{1,2}  a_{1,3} ]
        [ a_{2,1}  a_{2,2}  a_{2,3} ]  =  [A_{*,1}, A_{*,2}, A_{*,3}]
        [ a_{3,1}  a_{3,2}  a_{3,3} ]

where A_{*,j} denotes column j of A. We want to find a matrix

    X = [X_{*,1}, X_{*,2}, X_{*,3}]

for which AX = I. This means we want to solve

    A [X_{*,1}, X_{*,2}, X_{*,3}] = [e_1, e_2, e_3]

    [A X_{*,1}, A X_{*,2}, A X_{*,3}] = [e_1, e_2, e_3]

    A X_{*,1} = e_1,    A X_{*,2} = e_2,    A X_{*,3} = e_3

That is, we want to solve three linear systems, all with the same matrix of coefficients A.
In augmented matrix notation, we want to work with

    [A | I] = [ a_{1,1}  a_{1,2}  a_{1,3} | 1  0  0 ]
              [ a_{2,1}  a_{2,2}  a_{2,3} | 0  1  0 ]
              [ a_{3,1}  a_{3,2}  a_{3,3} | 0  0  1 ]

Then we proceed as before with a single linear system, only now we have three right-hand sides. We first introduce zeros into positions 2 and 3 of column 1; and then we introduce a zero into position 3 of column 2. Following that, we will need to solve three upper triangular linear systems by back substitution.
MATRIX INVERSE EXAMPLE

Starting from [A | I], carry out the elimination exactly as for a single system: form the multipliers m_{2,1} and m_{3,1} to introduce zeros into column 1, and then m_{3,2} to introduce a zero into column 2, applying each row operation across all three right-hand-side columns. Then, by using back substitution to solve the three upper triangular systems, we obtain the three columns of A^(-1).

Alternatively, we can continue to perform elementary row operations, introducing zeros above the diagonal as well and then dividing each row by its diagonal entry, until the left half of the augmented matrix is transformed to the identity. The right half of the augmented matrix is then A^(-1).
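The whole procedure (eliminate once on [A | I], then back substitute for each column) can be sketched in Python. The matrix below is a sample nonsingular matrix of our choosing, and the result is checked by verifying that A X is (approximately) the identity:

```python
def invert(A):
    """Invert A by Gaussian elimination on [A | I] with n right-hand
    sides, followed by back substitution for each column.
    Assumes the pivots encountered are nonzero."""
    n = len(A)
    # build the augmented matrix [A | I]
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for k in range(n - 1):                    # forward elimination
        for i in range(k + 1, n):
            m = aug[i][k] / aug[k][k]
            for j in range(k, 2 * n):         # all n + n columns
                aug[i][j] -= m * aug[k][j]
    X = [[0.0] * n for _ in range(n)]         # back substitute per column
    for c in range(n):
        for k in range(n - 1, -1, -1):
            s = sum(aug[k][j] * X[j][c] for j in range(k + 1, n))
            X[k][c] = (aug[k][n + c] - s) / aug[k][k]
    return X

A = [[3.0, -2.0, -1.0], [6.0, -2.0, 2.0], [-9.0, 7.0, 1.0]]
X = invert(A)
# check: A X should be the identity up to rounding error
prod = [[sum(A[i][k] * X[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
ok = all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
         for i in range(3) for j in range(3))
print(ok)   # True
```

The single elimination pass is shared by all three columns; only the back substitutions are repeated.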
COST OF MATRIX INVERSION

In calculating A^(-1), we solve

    A [X_{*,1}, X_{*,2}, ..., X_{*,n}] = [e_1, e_2, ..., e_n]

for the matrix X = [X_{*,1}, X_{*,2}, ..., X_{*,n}], where e_j is column j of the identity matrix. Thus we solve n linear systems

    A X_{*,1} = e_1,  A X_{*,2} = e_2,  ...,  A X_{*,n} = e_n    (1)

all with the same coefficient matrix. Returning to the earlier operation counts for solving a single linear system, we have:

    cost of triangulating A:    approximately 2n^3/3 operations
    cost of solving Ax = b:     approximately 2n^2 operations

Thus solving the n linear systems in (1) costs approximately

    2n^3/3 + n (2n^2) = 8n^3/3 operations

It costs approximately four times as many operations to invert A as to solve a single linear system. With attention to the special form of the right-hand sides in (1), this can be reduced to approximately 2n^3 operations.
MATLAB MATRIX OPERATIONS

To solve the linear system Ax = b in Matlab, use

    x = A \ b

In Matlab, the command

    inv(A)

will calculate the inverse of A. There are many matrix operations built into Matlab, both for general matrices and for special classes of matrices. We do not discuss those here, but recommend that the student investigate them through the Matlab help options.