CS227 Scientific Computing
Lecture 4: A Crash Course in Linear Algebra

1 Linear Transformation of Variables

A common phenomenon: two sets of quantities are linearly related:

y_1 = 3x_1 + x_2 - 4x_3
y_2 = -2.7x_2 - x_3

z_1 = y_1 + 3y_2
z_2 = y_1 - y_2

You can compose these two sets of equations to get another linear relation between the z_i and the x_i:

z_1 = 3x_1 + (1 + 3(-2.7))x_2 + ((-4) + 3(-1))x_3 = 3x_1 - 7.1x_2 - 7x_3
z_2 = 3x_1 + (1 - (-2.7))x_2 + ((-4) - (-1))x_3 = 3x_1 + 3.7x_2 - 3x_3
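The course itself uses MATLAB, but the composition above is easy to check numerically in any language. Here is an illustrative sketch in Python with NumPy (my addition, not part of the notes): it multiplies the two coefficient matrices and confirms that composing the relations agrees with the single composed matrix.

```python
import numpy as np

# Coefficient matrices of the two linear relations from the text:
# y = A x and z = B y, so z = (B A) x.
A = np.array([[3.0,  1.0, -4.0],
              [0.0, -2.7, -1.0]])   # y1, y2 in terms of x1, x2, x3
B = np.array([[1.0,  3.0],
              [1.0, -1.0]])         # z1, z2 in terms of y1, y2

C = B @ A    # matrix of the composed relation z = (BA) x

# Applying the two relations in turn gives the same z as applying C once:
x = np.array([1.0, 2.0, 3.0])       # an arbitrary test vector
print(C)
print(np.allclose(B @ (A @ x), C @ x))   # True
```

Computing `C` by hand reproduces the coefficients derived above: [[3, -7.1, -7], [3, 3.7, -3]].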
2 Matrix Form

We can write these equations in more compact matrix form:

[ y_1 ]   [ 3    1   -4 ] [ x_1 ]
[ y_2 ] = [ 0  -2.7  -1 ] [ x_2 ]
                          [ x_3 ]

[ z_1 ]   [ 1  3 ] [ y_1 ]
[ z_2 ] = [ 1 -1 ] [ y_2 ]

[ z_1 ]   [ 1  3 ] [ 3    1   -4 ] [ x_1 ]   [ 3  -7.1  -7 ] [ x_1 ]
[ z_2 ] = [ 1 -1 ] [ 0  -2.7  -1 ] [ x_2 ] = [ 3   3.7  -3 ] [ x_2 ]
                                   [ x_3 ]                   [ x_3 ]

3 Matrix Multiplication

Implicit in this is a way of multiplying matrices that captures the composition of two linear relations: the (i, k) entry of AB is the sum over j of a_ij b_jk, the product of the ith row of A with the kth column of B. Observe that if A is an r × s matrix (r rows, s columns) and B is an m × n matrix, then the product AB is defined only if s = m. The product is associative (i.e., (AB)C = A(BC) whenever the dimensions of A, B, and C are such that the products are defined), but not commutative (AB and BA are not generally equal, even if both products are defined).

4 Example

The code below takes the points on the circle {(cos t, sin t) : 0 ≤ t ≤ 2π} and applies a 2 × 2 matrix to them. The result is an ellipse.

>> t=0:0.01:6.3;
>> x=cos(t);
>> y=sin(t);
>> a=rand(2)*[x;y];
>> plot(x,y,a(1,:),a(2,:));

The result looks like this:

5 Solving Linear Systems

A ubiquitous problem in applied mathematics is solving systems of linear equations. For a (small) example: find x, y such that

3x + 2y = 7
4x + y = 6

In general such systems have the form

Ax = b,

where A is an m × n matrix, x is an unknown n × 1 matrix (a column vector), and b is an m × 1 matrix (another column vector). The problem is to find all vectors x making the equation true. Most often, the number of equations in the system will be equal to the number of unknowns, so that the matrix A is square.

6 Geometric View

A single equation in two unknowns is the equation of a line in the plane. The solution of a system of two equations in two unknowns is the intersection of the two lines.
Figure 1: A linear transformation turns a circle into an ellipse.
Figure 2: The solution to a 2 × 2 linear system is the intersection of two straight lines.

7 Gaussian Elimination

The basic algorithm for solving linear systems. Successively eliminate variables from equations until one equation has the form ax_j = b. Solve this for x_j, then use the value found for x_j to solve for another variable, etc. If you encounter a zero on the diagonal during the elimination step, interchange that row with a row below it. This is illustrated in the example that follows.

In matrix terms, this solves Ax = b by successively adding multiples of rows of A to other rows, and interchanging rows of A, performing the same operations on b, until you have an equivalent upper-triangular system, which is quickly solved by back-substitution.
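The procedure just described can be sketched in code. Below is a minimal Python/NumPy translation (my addition; the course uses MATLAB, and in practice you would call a library routine such as `numpy.linalg.solve` rather than roll your own):

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b for square A by elimination with row interchanges."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    # Elimination phase: reduce A to upper-triangular form.
    for k in range(n):
        # Bring the largest entry in column k (rows k..n-1) to the diagonal.
        # This also handles an exact zero on the diagonal (partial pivoting).
        p = k + np.argmax(np.abs(A[k:, k]))
        if A[p, k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        # Subtract multiples of row k from the rows below it.
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # Back-substitution phase: solve from the last equation upward.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x
```

On the example worked in the next section, `gaussian_elimination([[3,1,2],[6,2,22],[3,6,9]], [1,1,1])` agrees with the hand computation, roughly (0.344, 0.0778, -0.0556).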
8 Gaussian Elimination: An Example

Solve

[ 3  1   2 ]       [ 1 ]
[ 6  2  22 ] x  =  [ 1 ]
[ 3  6   9 ]       [ 1 ]

9 Elimination Phase

Working on the augmented matrix [A | b]:

[ 3  1   2 |  1 ]    [ 3  1   2 |  1 ]    [ 3  1   2 |  1 ]
[ 6  2  22 |  1 ] →  [ 0  0  18 | -1 ] →  [ 0  5   7 |  0 ]
[ 3  6   9 |  1 ]    [ 0  5   7 |  0 ]    [ 0  0  18 | -1 ]

Note the row interchange in the second step.

10 Back-substitution Phase

[ 3  1   2 ]       [  1 ]
[ 0  5   7 ] x  =  [  0 ]
[ 0  0  18 ]       [ -1 ]
18x_3 = -1, so x_3 = -1/18
5x_2 + 7(-1/18) = 0, so x_2 = 7/90
3x_1 + 7/90 + 2(-1/18) = 1, so x_1 = 31/90

So the solution is

    [ 31/90 ]
x = [  7/90 ]
    [ -1/18 ]

11 Singular and Nonsingular Square Matrices

If the result of elimination is an upper-triangular matrix U in which every diagonal entry is nonzero, then Ax = b has a unique solution for every b: A is nonsingular. A geometric realization is given by two lines in the plane intersecting in a single point, as illustrated earlier, or three planes in three-dimensional space intersecting in a single point.
Figure 3: Both the original matrix and the upper-triangular matrix are nonsingular. There is a unique solution for each right-hand side.

If the result of elimination is an upper-triangular matrix U in which some diagonal entry is zero, then the last row of U is zero, and the system will have either infinitely many solutions or no solutions, depending on the right-hand side b: A is singular. For a pair of lines in the plane, singularity means that the lines are parallel (no intersection) or coincide (infinitely many points of intersection). For three planes in three-dimensional space, think of a situation where the first two planes intersect in a line, and the third plane is either parallel to this line (no solution) or contains the line (infinitely many solutions).
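In floating-point practice a library solver detects the zero pivot and reports singularity. A small illustration in Python/NumPy (my addition; the coincident-lines system is a made-up example):

```python
import numpy as np

# Two coincident lines: x + 2y = 1 and 2x + 4y = 2. The matrix is singular,
# so elimination produces a zero pivot and the solver raises an error.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
try:
    np.linalg.solve(A, np.array([1.0, 2.0]))
except np.linalg.LinAlgError as e:
    print("solver refused:", e)
```

Whether such a system has no solution or infinitely many depends on the right-hand side, which is exactly why the solver cannot return a single answer.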
Figure 4: Both the original matrix and the upper-triangular matrix are singular. There is either no solution or infinitely many solutions, depending on the right-hand side.

12 Nonsquare Matrices

Gaussian elimination works for nonsquare matrices as well (though we won't do much with them now). If A is an m × n matrix with m < n, and Gaussian elimination leads to a U in which the last row is all zeros, then Ax = b has either no solutions or infinitely many solutions, depending on the value of b (A is rank-deficient). Geometrically, with m = 2 and n = 3, this corresponds to the case of two planes that are either parallel or coincident.
Figure 5: A rank-deficient matrix with more columns than rows.

Otherwise, there are infinitely many solutions for every value of b (A has full rank). If m = 2 and n = 3, this corresponds to the typical case of two planes intersecting in a line.
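The rank-deficient and full-rank cases for m = 2, n = 3 can be checked numerically; a quick Python/NumPy sketch (my addition, with made-up plane equations):

```python
import numpy as np

# Two parallel planes: the second row is a multiple of the first, so the
# matrix is rank-deficient (rank 1 instead of 2).
A_deficient = np.array([[1.0, 2.0, 3.0],
                        [2.0, 4.0, 6.0]])

# Two planes meeting in a line: the rows are independent (full rank, 2).
A_full = np.array([[1.0, 2.0, 3.0],
                   [0.0, 1.0, 4.0]])

print(np.linalg.matrix_rank(A_deficient))  # 1
print(np.linalg.matrix_rank(A_full))       # 2
```

`matrix_rank` computes the rank from a singular value decomposition rather than from elimination, but the answer matches what elimination would find.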
Figure 6: A full-rank matrix with more columns than rows.

If m > n, then Gaussian elimination must lead to a matrix U in which the last row is all zeros. Thus Ax = b has no solutions for some values of b. If U has nonzero values in every diagonal entry (full rank), then for the other values of b there is a unique solution to the system. Think of three lines in the plane: typically the first two lines will intersect in a point, and the third may or may not pass through the point where the first two intersect.
Figure 7: A full-rank matrix with more rows than columns. Either no solution or a unique solution.

If U has a zero on the diagonal (A is rank-deficient), then for the remaining values of b there are infinitely many solutions. This corresponds to the case where three lines in the plane all have the same slope: there is either no solution, or the three lines coincide and there are infinitely many solutions.
Figure 8: A rank-deficient matrix with more rows than columns.

13 Gaussian Elimination as Matrix Factorization

Adding a multiple c of row i of an m × n matrix A to row j is the same thing as replacing A by MA, where M is the m × m matrix that has 1s on the diagonal, c in the (j, i) entry, and zeros elsewhere. If j > i, as in Gaussian elimination, then M is lower triangular with 1s on the diagonal, or unit lower triangular. Note that the product of two unit lower triangular matrices is also unit lower triangular.

When we perform Gaussian elimination, we can do all the row interchanges before the elimination. Permuting the rows of A is the same as replacing A by PA, where P is a permutation matrix (an identity matrix with its rows scrambled). So Gaussian elimination yields a factorization

L_1 P A = U,

where U is upper triangular, P is a permutation matrix, and
L_1 is unit lower triangular. If we multiply both sides by the inverse L = L_1^(-1) of L_1, which is also unit lower triangular, we obtain the LU-factorization of A:

P A = L U.

14 The Pivoting Strategy

In our simple version of Gaussian elimination, we switch rows whenever a diagonal entry becomes 0. Practical implementations do more: when working with the jth row, they find the largest element (in absolute value) in the jth column on or below the diagonal, and interchange the row containing it with the jth row. We can't pivot on 0, but we also try to avoid pivoting on small values. To see why, try solving a slightly different system from the one in our example, using 3-digit decimal arithmetic:

[ 3  1      2 ]       [ 1 ]
[ 6  2.01  22 ] x  =  [ 1 ]
[ 3  6      9 ]       [ 1 ]

Solving this system without row interchanges in three-digit decimal arithmetic gives

    [  0.370  ]
x = [  0      ]
    [ -0.0556 ]
This differs from the correct solution

    [  0.344  ]
x = [  0.0778 ]
    [ -0.0556 ]

and the residual Ax - b is quite large in the last component:

     [ 0.999 ]
Ax = [ 0.997 ]
     [ 0.610 ]

Solving the system with the second and third rows interchanged, again in three-digit decimal arithmetic, gives

    [  0.344  ]
x = [  0.0778 ]
    [ -0.0556 ]

and

     [ 0.999 ]
Ax = [ 0.997 ]
     [ 0.998 ]

so the residual is quite small.

So Gaussian elimination is implemented with "partial pivoting": when the algorithm reaches the kth row, it searches for the entry a_mk (row m, column k) with m ≥ k and the largest possible absolute value, then interchanges rows m and k.
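The same failure shows up in IEEE double precision, not just in 3-digit decimal arithmetic. A standard illustration (my own made-up 2 × 2 system, not the one from the notes) pivots on 1e-20:

```python
import numpy as np  # numpy only used for context; the demo is pure Python

def solve2_no_pivot(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] (x, y) = (e, f) by naive elimination on a."""
    m = c / a            # multiplier; enormous when the pivot a is tiny
    d2 = d - m * b       # the true value of d is swamped and rounded away
    f2 = f - m * e
    y = f2 / d2
    x = (e - b * y) / a
    return x, y

# Exact solution is very close to x = y = 1, but the (1,1) pivot is tiny:
x_bad, y_bad = solve2_no_pivot(1e-20, 1.0, 1.0, 1.0, 1.0, 2.0)
print(x_bad, y_bad)    # x comes out as 0.0: completely wrong

# Interchanging the two rows first, so we pivot on 1.0, repairs it:
x_good, y_good = solve2_no_pivot(1.0, 1.0, 1e-20, 1.0, 2.0, 1.0)
print(x_good, y_good)  # both close to 1
```

The point is the same as in the decimal example: dividing by a tiny pivot creates a huge multiplier that wipes out the information in the rows below it.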
15 How Accurate is Gaussian Elimination?

What is the effect of roundoff error in Gaussian elimination? We can measure the accuracy in several different ways.

We can look at the residual. This is the number ‖Ax̂ − b‖, where x̂ is the solution computed by Gaussian elimination, and ‖v‖ denotes the length of the vector v. When partial pivoting is used, the residual will generally be small (on the order of machine epsilon) relative to the size of the entries in the matrix A and the vector b.

But we can also measure the accuracy in the more obvious way, by the distance between x̂ and the true solution x. The relative error

‖x̂ − x‖ / ‖x‖

is on the order of machine epsilon times κ(A), where κ(A) is the condition number of A: a measure of how close A is to being singular. Imagine two lines in the plane that are nearly parallel: a slight change in the right-hand side of the corresponding system of equations will produce a large change in the coordinates of the solution. Since the computer can only represent the right-hand side to limited accuracy, this problem won't go away.
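The nearly-parallel-lines picture can be made concrete. A Python/NumPy sketch (my addition, with a made-up mildly ill-conditioned system):

```python
import numpy as np

# Two nearly parallel lines: x + y = 2 and x + 1.0001 y = 2.0001.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])

print(np.linalg.cond(A))       # roughly 4e4: the lines are nearly parallel
print(np.linalg.solve(A, b))   # solution is close to (1, 1)

# Changing the right-hand side in the 4th decimal place moves the
# solution by an amount of order 1:
b2 = np.array([2.0, 2.0002])
print(np.linalg.solve(A, b2))  # solution is now close to (0, 2)
```

A perturbation of size 1e-4 in b produced a change of size 1 in x, consistent with a condition number of order 1e4.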
Some matrices are very ill-conditioned. In class we will demonstrate this with the Hilbert matrix. (In MATLAB, type hilb(n) to get the n × n Hilbert matrix, a symmetric matrix which has 1/(i + j − 1) in the (i, j) entry.)

16 Solving Linear Systems in MATLAB

For square matrices A, you can use the backslash operator to solve Ax = b. This uses Gaussian elimination and back-substitution to produce a solution.

>> A=[3 1 2;6 2 22;6 6 9];
>> b=[1;1;1];
>> A\b

ans =

    0.4306
   -0.1806
   -0.0556

17 Computing the LU Decomposition in MATLAB

You can compute the LU decomposition explicitly. The lu function in MATLAB gives different results depending on how many output arguments you specify.

>> [L,U]=lu(A)

L =

    0.5000         0    1.0000
    1.0000         0         0
    1.0000    1.0000         0

U =

     6     2    22
     0     4   -13
     0     0    -9

>> [L,U,P]=lu(A)

L =

    1.0000         0         0
    1.0000    1.0000         0
    0.5000         0    1.0000

U =

     6     2    22
     0     4   -13
     0     0    -9

P =

     0     1     0
     0     0     1
     1     0     0

So another way to solve Ax = b is

>> [L,U]=lu(A);
>> x=U\(L\b);

This is actually less efficient than the backslash operator. So why would you ever compute the LU-decomposition to solve the system? Because you may have to solve several linear systems with the same left-hand side A and different right-hand
sides b. Invoking A\b for each of these requires the expensive elimination step to be performed over and over again. If you use the LU-decomposition, the elimination step is performed only once, and only the cheaper forward- and back-substitution steps are repeated.

18 How Fast is Gaussian Elimination?

For an n × n matrix, the elimination step takes time proportional to n^3, while back- and forward-substitution each take time proportional to n^2. So on the whole, the running time of the algorithm is roughly proportional to n^3. In practice, this means that if the number of variables in your problem doubles, solving the associated linear equations takes about 8 times as long.

For matrices having a special form (triangular, tridiagonal), there are faster methods. The MATLAB backslash operator first checks whether the matrix has one of these special forms before solving.
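The reuse pattern carries over to other environments. In Python, SciPy offers `scipy.linalg.lu_factor` / `lu_solve` for exactly this purpose; with plain NumPy you can get the same effect by stacking all the right-hand sides as columns of one matrix, since `numpy.linalg.solve` then factors A only once. A sketch (my addition, using the matrix from the MATLAB session above):

```python
import numpy as np

A = np.array([[3.0, 1.0,  2.0],
              [6.0, 2.0, 22.0],
              [6.0, 6.0,  9.0]])

# Three different right-hand sides, stacked as the columns of one matrix:
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# One factorization, three (cheap) substitution passes:
X = np.linalg.solve(A, B)

# Each column of X solves the system for the corresponding column of B:
print(np.allclose(A @ X, B))   # True
```

(With B equal to the identity, X is of course the inverse of A, but forming the inverse explicitly is rarely the right way to solve a system.)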