AM 205: lecture 7. Last time: LU factorization. Today's lecture: Cholesky factorization, timing, QR factorization. Reminder: assignment 1 due at 5 PM on Friday September 22.

LU Factorization

LU factorization is the most common way of solving linear systems: $Ax = b$ becomes $LUx = b$.

Let $y \equiv Ux$, so that $Ly = b$: solve for $y$ via forward substitution, then solve $Ux = y$ via back substitution.

(Here $y = L^{-1}b$ is the transformed right-hand side vector, i.e. the $Mb$ from earlier, that we are familiar with from Gaussian elimination.)
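
To make the factor-then-substitute pattern concrete, here is a minimal sketch using SciPy (the example matrix is arbitrary). Note that scipy.linalg.lu returns $P$, $L$, $U$ with $A = PLU$ (it applies partial pivoting, discussed later in this lecture), so the forward substitution uses $P^T b$:

import numpy as np
import scipy.linalg

# Example system (values are arbitrary, for illustration only)
A = np.array([[4.0, 3.0, 2.0],
              [2.0, 4.0, 1.0],
              [1.0, 2.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])

# Factorize: scipy.linalg.lu returns P, L, U with A = P @ L @ U
P, L, U = scipy.linalg.lu(A)

# Forward substitution: solve L y = P^T b
y = scipy.linalg.solve_triangular(L, P.T @ b, lower=True)

# Back substitution: solve U x = y
x = scipy.linalg.solve_triangular(U, y)

print(np.allclose(A @ x, b))  # True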

LU Factorization

Next question: how should we determine $M_1, M_2, \ldots, M_{n-1}$?

We need to be able to annihilate selected entries of $A$ below the diagonal in order to obtain an upper-triangular matrix. To do this, we use elementary elimination matrices.

Let $L_j$ denote the $j$th elimination matrix (we use $L_j$ rather than $M_j$ from now on, since elimination matrices are lower triangular).

LU Factorization

Let $X \equiv L_{j-1} L_{j-2} \cdots L_1 A$ denote the matrix at the start of step $j$, and let $x_{(:,j)} \in \mathbb{R}^n$ denote column $j$ of $X$. Then we define $L_j$ such that

$$
L_j x_{(:,j)} =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -x_{j+1,j}/x_{jj} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -x_{nj}/x_{jj} & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix}
x_{1j} \\ \vdots \\ x_{jj} \\ x_{j+1,j} \\ \vdots \\ x_{nj}
\end{bmatrix}
=
\begin{bmatrix}
x_{1j} \\ \vdots \\ x_{jj} \\ 0 \\ \vdots \\ 0
\end{bmatrix}
$$

LU Factorization

To simplify notation, we let $l_{ij} \equiv \dfrac{x_{ij}}{x_{jj}}$ in order to obtain

$$
L_j \equiv
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -l_{j+1,j} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -l_{nj} & 0 & \cdots & 1
\end{bmatrix}
$$

LU Factorization

Using elementary elimination matrices we can reduce $A$ to upper triangular form, one column at a time. Schematically, for a $4 \times 4$ matrix, we have

$$
\underbrace{\begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{bmatrix}}_{A}
\xrightarrow{\;L_1\;}
\underbrace{\begin{bmatrix} \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & \times & \times & \times \end{bmatrix}}_{L_1 A}
\xrightarrow{\;L_2\;}
\underbrace{\begin{bmatrix} \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \times & \times \\ 0 & 0 & \times & \times \end{bmatrix}}_{L_2 L_1 A}
$$

Key point: $L_k$ does not affect columns $1, 2, \ldots, k-1$ of $L_{k-1} L_{k-2} \cdots L_1 A$.

LU Factorization

After $n-1$ steps, we obtain the upper triangular matrix

$$
U = L_{n-1} \cdots L_2 L_1 A =
\begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & 0 & \times
\end{bmatrix}
$$

LU Factorization

Finally, we wish to form the factorization $A = LU$, hence we need

$$L = (L_{n-1} \cdots L_2 L_1)^{-1} = L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1}$$

This turns out to be surprisingly simple, due to two strokes of luck!

First stroke of luck: $L_j^{-1}$ is obtained simply by negating the subdiagonal entries of $L_j$:

$$
L_j =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -l_{j+1,j} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -l_{nj} & 0 & \cdots & 1
\end{bmatrix},
\qquad
L_j^{-1} =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & l_{j+1,j} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & l_{nj} & 0 & \cdots & 1
\end{bmatrix}
$$

LU Factorization

Explanation: Let $l_j \equiv [0, \ldots, 0, l_{j+1,j}, \ldots, l_{nj}]^T$, so that $L_j = I - l_j e_j^T$. Now consider $L_j (I + l_j e_j^T)$:

$$L_j (I + l_j e_j^T) = (I - l_j e_j^T)(I + l_j e_j^T) = I - l_j e_j^T l_j e_j^T = I - l_j (e_j^T l_j) e_j^T$$

Also, $e_j^T l_j = 0$ (why? the first $j$ entries of $l_j$ are zero), so that $L_j (I + l_j e_j^T) = I$.

By the same argument $(I + l_j e_j^T) L_j = I$, and hence $L_j^{-1} = I + l_j e_j^T$.

LU Factorization

Next we want to form the matrix $L \equiv L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1}$. Note that we have

$$
L_j^{-1} L_{j+1}^{-1} = (I + l_j e_j^T)(I + l_{j+1} e_{j+1}^T)
= I + l_j e_j^T + l_{j+1} e_{j+1}^T + l_j (e_j^T l_{j+1}) e_{j+1}^T
= I + l_j e_j^T + l_{j+1} e_{j+1}^T
$$

since $e_j^T l_{j+1} = 0$. Interestingly, this convenient result doesn't hold for $L_{j+1}^{-1} L_j^{-1}$, why? (There the cross term involves $e_{j+1}^T l_j = l_{j+1,j}$, which is generally nonzero.)

LU Factorization

Similarly,

$$
L_j^{-1} L_{j+1}^{-1} L_{j+2}^{-1} = (I + l_j e_j^T + l_{j+1} e_{j+1}^T)(I + l_{j+2} e_{j+2}^T)
= I + l_j e_j^T + l_{j+1} e_{j+1}^T + l_{j+2} e_{j+2}^T
$$

That is, to compute the product $L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1}$ we simply collect the subdiagonals for $j = 1, 2, \ldots, n-1$.

LU Factorization

Hence, second stroke of luck:

$$
L \equiv L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1} =
\begin{bmatrix}
1 & & & & \\
l_{21} & 1 & & & \\
l_{31} & l_{32} & 1 & & \\
\vdots & \vdots & \ddots & \ddots & \\
l_{n1} & l_{n2} & \cdots & l_{n,n-1} & 1
\end{bmatrix}
$$

LU Factorization

Therefore, the basic LU factorization algorithm is:

1: U = A, L = I
2: for j = 1 : n−1 do
3:   for i = j+1 : n do
4:     l_ij = u_ij / u_jj
5:     for k = j : n do
6:       u_ik = u_ik − l_ij u_jk
7:     end for
8:   end for
9: end for

Note that the entries of $U$ are updated each iteration, so at the start of step $j$ we have $U = L_{j-1} L_{j-2} \cdots L_1 A$. Here line 4 comes straight from the definition $l_{ij} \equiv u_{ij} / u_{jj}$.
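
For concreteness, a direct NumPy transcription of this pseudocode might look like the following sketch (no pivoting, so it assumes every pivot $u_{jj}$ is nonzero; the example matrix is arbitrary):

import numpy as np

def lu_no_pivoting(A):
    """Basic LU factorization (no pivoting), following the pseudocode above.

    Assumes every pivot u[j, j] is nonzero. Returns L (unit lower
    triangular) and U (upper triangular) with A = L @ U.
    """
    n = A.shape[0]
    U = A.astype(float).copy()   # U = A
    L = np.eye(n)                # L = I
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]      # line 4: l_ij = u_ij / u_jj
            U[i, j:] -= L[i, j] * U[j, j:]   # lines 5-7: update row i, cols j:n
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivoting(A)
print(np.allclose(L @ U, A))  # True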

LU Factorization

Line 6 accounts for the effect of $L_j$ on columns $k = j, j+1, \ldots, n$ of $U$. For $k = j : n$ we have

$$
L_j u_{(:,k)} =
\begin{bmatrix}
1 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & 0 & \cdots & 0 \\
0 & \cdots & -l_{j+1,j} & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & -l_{nj} & 0 & \cdots & 1
\end{bmatrix}
\begin{bmatrix}
u_{1k} \\ \vdots \\ u_{jk} \\ u_{j+1,k} \\ \vdots \\ u_{nk}
\end{bmatrix}
=
\begin{bmatrix}
u_{1k} \\ \vdots \\ u_{jk} \\ u_{j+1,k} - l_{j+1,j} u_{jk} \\ \vdots \\ u_{nk} - l_{nj} u_{jk}
\end{bmatrix}
$$

The vector on the right is the updated $k$th column of $U$, which is computed in line 6.

LU Factorization

LU factorization involves a triply-nested for-loop, hence $O(n^3)$ calculations. Careful operation counting shows LU factorization requires $\sim \frac{1}{3} n^3$ additions and $\sim \frac{1}{3} n^3$ multiplications, i.e. $\sim \frac{2}{3} n^3$ operations in total.

Solving a linear system using LU

Hence to solve $Ax = b$, we perform the following three steps:
- Step 1: Factorize $A$ into $L$ and $U$: $\sim \frac{2}{3} n^3$ operations
- Step 2: Solve $Ly = b$ by forward substitution: $\sim n^2$ operations
- Step 3: Solve $Ux = y$ by back substitution: $\sim n^2$ operations

Total work is dominated by Step 1: $\sim \frac{2}{3} n^3$ operations.
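
The two triangular solves in Steps 2 and 3 are short loops; here is a minimal sketch (assuming nonsingular triangular factors; the L, U pair below is illustrative):

import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L (~n^2 operations)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve U x = y for upper-triangular U (~n^2 operations)."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# Example: given LU factors of A, solve A x = b in two triangular solves
L = np.array([[1.0, 0.0], [0.5, 1.0]])
U = np.array([[2.0, 1.0], [0.0, 1.5]])
b = np.array([3.0, 3.0])
x = back_substitution(U, forward_substitution(L, b))
print(np.allclose(L @ U @ x, b))  # True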

Solving a linear system using LU

An alternative approach would be to compute $A^{-1}$ explicitly and evaluate $x = A^{-1} b$, but this is a bad idea!

Question: How would we compute $A^{-1}$?

Solving a linear system using LU

Answer: Let $a^{\mathrm{inv}}_{(:,k)}$ denote the $k$th column of $A^{-1}$; then $a^{\mathrm{inv}}_{(:,k)}$ must satisfy

$$A \, a^{\mathrm{inv}}_{(:,k)} = e_k$$

Therefore, to compute $A^{-1}$, we first LU factorize $A$, then back/forward substitute for the right-hand side vector $e_k$, for $k = 1, 2, \ldots, n$.

The $n$ back/forward substitutions alone require $\sim 2n^3$ operations: inefficient!

A rule of thumb in numerical linear algebra: it is almost always a bad idea to compute $A^{-1}$ explicitly.
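
As a sketch of why this is expensive, here is the column-by-column construction using SciPy's lu_factor and lu_solve (each lu_solve performs one forward and one back substitution, $\sim 2n^2$ operations, repeated $n$ times; the matrix is an arbitrary illustration):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
n = A.shape[0]

lu, piv = lu_factor(A)   # one O(n^3) factorization

# Solve A a_k = e_k for each column a_k of A^{-1}:
# n solves at ~2n^2 operations each, hence ~2n^3 in total
A_inv = np.column_stack([lu_solve((lu, piv), np.eye(n)[:, k])
                         for k in range(n)])

print(np.allclose(A @ A_inv, np.eye(n)))  # True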

Solving a linear system using LU

Another case where LU factorization is very helpful is when we want to solve $Ax = b_i$ for several different right-hand sides $b_i$, $i = 1, \ldots, k$.

We incur the $\sim \frac{2}{3} n^3$ factorization cost only once, and then each subsequent forward/back substitution costs only $\sim 2n^2$ operations. This makes a huge difference if $n$ is large!
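
In SciPy this factor-once, solve-many pattern looks like the following sketch (matrix and right-hand sides are arbitrary illustrations):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[3.0, 1.0, 2.0],
              [6.0, 3.0, 4.0],
              [3.0, 1.0, 5.0]])

lu, piv = lu_factor(A)   # pay the ~(2/3) n^3 cost once

# Each additional right-hand side costs only ~2n^2
for b in [np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([1.0, 2.0, 3.0])]:
    x = lu_solve((lu, piv), b)
    print(np.allclose(A @ x, b))  # True each time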

Stability of Gaussian Elimination

There is a problem with the LU algorithm presented above. Consider the matrix

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$$

$A$ is nonsingular and well-conditioned ($\kappa(A) \approx 2.62$), but LU factorization fails at the first step (division by zero).

Stability of Gaussian Elimination

LU factorization doesn't fail for

$$A = \begin{bmatrix} 10^{-20} & 1 \\ 1 & 1 \end{bmatrix}$$

but we get

$$L = \begin{bmatrix} 1 & 0 \\ 10^{20} & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} 10^{-20} & 1 \\ 0 & 1 - 10^{20} \end{bmatrix}$$

Stability of Gaussian Elimination

Let's suppose that $10^{-20} \in \mathbb{F}$ (a floating point number) and that $\mathrm{round}(1 - 10^{20}) = -10^{20}$. Then in finite precision arithmetic we get

$$\tilde{L} = \begin{bmatrix} 1 & 0 \\ 10^{20} & 1 \end{bmatrix}, \qquad \tilde{U} = \begin{bmatrix} 10^{-20} & 1 \\ 0 & -10^{20} \end{bmatrix}$$

Stability of Gaussian Elimination

Hence due to rounding error we obtain

$$\tilde{L}\tilde{U} = \begin{bmatrix} 10^{-20} & 1 \\ 1 & 0 \end{bmatrix},$$

which is not close to

$$A = \begin{bmatrix} 10^{-20} & 1 \\ 1 & 1 \end{bmatrix}$$

Then, for example, let $b = [3, 3]^T$. Using $\tilde{L}\tilde{U}$, we get $\tilde{x} = [3, 3]^T$, whereas the true answer is $x = [0, 3]^T$. Hence we have a large relative error (rel. err. = 1) even though the problem is well-conditioned.
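
We can reproduce this failure numerically. The following sketch performs the elimination step without pivoting in double precision, where 1 − 10^20 indeed rounds to −10^20:

import numpy as np

A = np.array([[1e-20, 1.0],
              [1.0,   1.0]])
b = np.array([3.0, 3.0])

# One elimination step without pivoting: the pivot is 1e-20
l21 = A[1, 0] / A[0, 0]    # ~1e20: a huge multiplier
L_t = np.array([[1.0, 0.0],
                [l21, 1.0]])
U_t = np.array([[A[0, 0], A[0, 1]],
                [0.0, A[1, 1] - l21 * A[0, 1]]])  # 1 - 1e20 rounds to -1e20

print(L_t @ U_t)                       # ~[[1e-20, 1], [1, 0]]: not close to A!
print(np.linalg.solve(L_t @ U_t, b))   # ~[3, 3], but the true solution is [0, 3]
print(np.linalg.solve(A, b))           # [0, 3] via pivoted LU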

Stability of Gaussian Elimination

In this example, standard Gaussian elimination yields a large residual. Or equivalently, it yields the exact solution to a problem corresponding to a large input perturbation: $\Delta b = [0, 3]^T$. Hence this is an unstable algorithm!

In this case the cause of the large error in $x$ is numerical instability, not ill-conditioning. To stabilize Gaussian elimination, we need to permute rows, i.e. perform pivoting.

Pivoting

Recall the Gaussian elimination process: at each step we use the diagonal entry $x_{jj}$ as the pivot and annihilate the entries below it:

$$
\begin{bmatrix}
\times & \times & \times & \times \\
0 & x_{jj} & \times & \times \\
0 & \times & \times & \times \\
0 & \times & \times & \times
\end{bmatrix}
\longrightarrow
\begin{bmatrix}
\times & \times & \times & \times \\
0 & x_{jj} & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & \times & \times
\end{bmatrix}
$$

But we could just as easily use another entry $x_{ij}$ ($i \geq j$) of column $j$ as the pivot, and annihilate the remaining entries in that column:

$$
\begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & x_{ij} & \times & \times \\
0 & \times & \times & \times
\end{bmatrix}
\longrightarrow
\begin{bmatrix}
\times & \times & \times & \times \\
0 & 0 & \times & \times \\
0 & x_{ij} & \times & \times \\
0 & 0 & \times & \times
\end{bmatrix}
$$

Partial Pivoting

The entry $x_{ij}$ is called the pivot, and flexibility in choosing the pivot is essential: otherwise we can't deal with, e.g.,

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$$

From a numerical stability point of view, it is crucial to choose the pivot to be the largest entry in column $j$: partial pivoting. (Full pivoting refers to also searching through columns $j:n$ for the largest entry; this is more expensive and gives only marginal benefit to stability in practice.)

This ensures that each $l_{ij}$ entry, which acts as a multiplier in the LU factorization process, satisfies $|l_{ij}| \leq 1$.

Partial Pivoting

To maintain the triangular LU structure, we permute rows by premultiplying by permutation matrices:

$$
\underbrace{\begin{bmatrix}
\times & \times & \times & \times \\
0 & \times & \times & \times \\
0 & \times & \times & \times \\
0 & x_{ij} & \times & \times
\end{bmatrix}}_{\text{pivot selection}}
\xrightarrow{\;P_1\;}
\underbrace{\begin{bmatrix}
\times & \times & \times & \times \\
0 & x_{ij} & \times & \times \\
0 & \times & \times & \times \\
0 & \times & \times & \times
\end{bmatrix}}_{\text{row interchange}}
\xrightarrow{\;L_1\;}
\begin{bmatrix}
\times & \times & \times & \times \\
0 & x_{ij} & \times & \times \\
0 & 0 & \times & \times \\
0 & 0 & \times & \times
\end{bmatrix}
$$

In this case

$$P_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

and each $P_j$ is obtained by swapping two rows of $I$.

Partial Pivoting

Therefore, with partial pivoting we obtain

$$L_{n-1} P_{n-1} \cdots L_2 P_2 L_1 P_1 A = U$$

It can be shown (we omit the details here; see Trefethen & Bau) that this can be rewritten as

$$PA = LU$$

where $P \equiv P_{n-1} \cdots P_2 P_1$. (The $L$ matrix here is lower triangular, but it is not the same as the $L$ in the non-pivoting case: we have to account for the row swaps.)

Theorem: Gaussian elimination with partial pivoting produces nonsingular factors $L$ and $U$ if and only if $A$ is nonsingular.

Partial Pivoting

Pseudocode for LU factorization with partial pivoting (the pivoting additions are the initialization of P and lines 3-6):

1: U = A, L = I, P = I
2: for j = 1 : n−1 do
3:   Select i (≥ j) that maximizes |u_ij|
4:   Interchange rows of U: u_(j, j:n) ↔ u_(i, j:n)
5:   Interchange rows of L: l_(j, 1:j−1) ↔ l_(i, 1:j−1)
6:   Interchange rows of P: p_(j, :) ↔ p_(i, :)
7:   for i = j+1 : n do
8:     l_ij = u_ij / u_jj
9:     for k = j : n do
10:      u_ik = u_ik − l_ij u_jk
11:    end for
12:  end for
13: end for

Again this requires $\sim \frac{2}{3} n^3$ floating point operations.
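
A NumPy transcription of this pseudocode might look like the following sketch (tested here on the troublesome 2 × 2 matrix from earlier):

import numpy as np

def lu_partial_pivoting(A):
    """LU factorization with partial pivoting: returns P, L, U with P A = L U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for j in range(n - 1):
        # Select the row i >= j that maximizes |u_ij|
        i = j + np.argmax(np.abs(U[j:, j]))
        # Interchange rows of U (cols j:n), L (cols < j), and P
        U[[j, i], j:] = U[[i, j], j:]
        L[[j, i], :j] = L[[i, j], :j]
        P[[j, i], :] = P[[i, j], :]
        # Elimination step, as before
        for r in range(j + 1, n):
            L[r, j] = U[r, j] / U[j, j]
            U[r, j:] -= L[r, j] * U[j, j:]
    return P, L, U

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
P, L, U = lu_partial_pivoting(A)
print(np.allclose(P @ A, L @ U))   # True
print(np.all(np.abs(L) <= 1.0))    # multipliers satisfy |l_ij| <= 1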

Partial Pivoting: Solve Ax = b

To solve $Ax = b$ using the factorization $PA = LU$:
- Multiply through by $P$ to obtain $PAx = LUx = Pb$
- Solve $Ly = Pb$ using forward substitution
- Then solve $Ux = y$ using back substitution

Partial Pivoting in Python

Python's scipy.linalg.lu function can do LU factorization with pivoting:

Python 2.7.5 (default, Mar  9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import scipy.linalg
>>> a = np.random.random((4,4))
>>> a
array([[ 0.30178809,  0.09895414,  0.75341645,  0.55745407],
       [ 0.08879282,  0.97137694,  0.04768167,  0.28140464],
       [ 0.87253281,  0.66021495,  0.4941091 ,  0.52966743],
       [ 0.7990001 ,  0.45251929,  0.55493106,  0.15781707]])
>>> (p, l, u) = scipy.linalg.lu(a)
>>> p
array([[ 0.,  0.,  1.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
>>> l
array([[ 1.        ,  0.        ,  0.        ,  0.        ],
       [ 0.10176445,  1.        ,  0.        ,  0.        ],
       [ 0.34587592, -0.14310957,  1.        ,  0.        ],
       [ 0.91572499, -0.16816814,  0.17525841,  1.        ]])
>>> u
array([[ 0.87253281,  0.66021495,  0.4941091 ,  0.52966743],
       [ 0.        ,  0.90419053, -0.00260107,  0.22750332],
       [ 0.        ,  0.        ,  0.58214377,  0.40681276],
       [ 0.        ,  0.        ,  0.        , -0.36025118]])

Stability of Gaussian Elimination

Numerical stability of Gaussian elimination has been an important research topic since the 1940s. A major figure in this field was James H. Wilkinson (English numerical analyst, 1919-1986). He showed that for $Ax = b$ with $A \in \mathbb{R}^{n \times n}$:
- Gaussian elimination without partial pivoting is numerically unstable (as we've already seen)
- Gaussian elimination with partial pivoting satisfies

$$\frac{\|r\|}{\|A\| \, \|x\|} \leq 2^{n-1} n^2 \epsilon_{\mathrm{mach}}$$

Stability of Gaussian Elimination

That is, pathological cases exist where the relative residual, $\|r\| / (\|A\| \, \|x\|)$, grows exponentially with $n$ due to rounding error. The worst-case behavior of Gaussian elimination with partial pivoting is explosive instability, but such pathological cases are extremely rare!

In over 50 years of scientific computation, instability has only been encountered due to deliberate construction of pathological cases. In practice, Gaussian elimination is stable in the sense that it produces a small relative residual.

Stability of Gaussian Elimination

In practice, we typically obtain

$$\frac{\|r\|}{\|A\| \, \|x\|} \lesssim n \, \epsilon_{\mathrm{mach}},$$

i.e. the relative residual grows only linearly with $n$, and is scaled by $\epsilon_{\mathrm{mach}}$. Combining this result with our earlier inequality,

$$\frac{\|x - \tilde{x}\|}{\|x\|} \leq \kappa(A) \frac{\|r\|}{\|A\| \, \|x\|},$$

implies that in practice Gaussian elimination gives a small error for well-conditioned problems!

Cholesky Factorization

Cholesky factorization

Suppose that $A \in \mathbb{R}^{n \times n}$ is an SPD matrix, i.e.:
- Symmetric: $A^T = A$
- Positive definite: for any $v \neq 0$, $v^T A v > 0$

Then the LU factorization of $A$ can be arranged so that $U = L^T$, i.e. $A = LL^T$ (but in this case $L$ may not have 1s on the diagonal).

Consider the $2 \times 2$ case:

$$\begin{bmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix} \begin{bmatrix} l_{11} & l_{21} \\ 0 & l_{22} \end{bmatrix}$$

Equating entries gives

$$l_{11} = \sqrt{a_{11}}, \qquad l_{21} = a_{21}/l_{11}, \qquad l_{22} = \sqrt{a_{22} - l_{21}^2}$$

Cholesky factorization

This approach of equating entries can be used to derive the Cholesky factorization for the general $n \times n$ case:

1: L = A
2: for j = 1 : n do
3:   l_jj = sqrt(l_jj)
4:   for i = j+1 : n do
5:     l_ij = l_ij / l_jj
6:   end for
7:   for k = j+1 : n do
8:     for i = k : n do
9:       l_ik = l_ik − l_ij l_kj
10:    end for
11:  end for
12: end for
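
A direct NumPy transcription of this pseudocode might look as follows (a sketch assuming A is SPD; the example matrix is arbitrary):

import numpy as np

def cholesky(A):
    """Cholesky factorization of an SPD matrix, following the pseudocode above.

    Returns lower-triangular L with A = L @ L.T.
    """
    n = A.shape[0]
    L = np.tril(A.astype(float))       # work on the lower triangle of A
    for j in range(n):
        L[j, j] = np.sqrt(L[j, j])     # line 3
        L[j+1:, j] /= L[j, j]          # lines 4-6
        for k in range(j + 1, n):      # lines 7-11
            L[k:, k] -= L[k:, j] * L[k, j]
    return L

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
L = cholesky(A)
print(np.allclose(L @ L.T, A))  # True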

Cholesky factorization

Notes on Cholesky factorization:
- For an SPD matrix $A$, Cholesky factorization is numerically stable and does not require any pivoting
- Operation count: $\sim \frac{1}{3} n^3$ operations in total, i.e. about half as many as Gaussian elimination
- Only need to store $L$, hence uses less memory than LU

Sparse Matrices

Sparse Matrices

In applications we often encounter sparse matrices. A prime example is the discretization of partial differential equations (covered in the next section).

"Sparse matrix" is not precisely defined; roughly speaking, it is a matrix that is mostly zeros. From a computational point of view it is advantageous to store only the non-zero entries. The set of non-zero entries of a sparse matrix is referred to as its sparsity pattern.

Sparse Matrices

For example, the matrix

$$A = \begin{bmatrix} 2 & 2 & 2 & 2 & 2 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

can be stored sparsely as a list of its non-zero entries, as (i, j) value pairs (here ordered by column):

A_sparse =
   (1,1)  2
   (1,2)  2
   (2,2)  1
   (1,3)  2
   (3,3)  1
   (1,4)  2
   (4,4)  1
   (1,5)  2
   (5,5)  1

Sparse Matrices

From a mathematical point of view, sparse matrices are no different from dense matrices. But from a scientific computing perspective, sparse matrices require different data structures and algorithms for computational efficiency.

E.g., we can apply LU or Cholesky to a sparse $A$, but new non-zeros (i.e. entries outside the sparsity pattern of $A$) are introduced in the factors. These new non-zero entries are called fill-in; many methods exist for reducing fill-in by permuting rows and columns of $A$.
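
As a sketch of sparse storage in Python, scipy.sparse keeps only the non-zero entries; here it is applied to the 5 × 5 example from above (CSR is one of several available storage formats):

import numpy as np
import scipy.sparse

# The 5x5 example from above: first row all 2s, identity below
A = np.eye(5)
A[0, :] = 2.0

A_sparse = scipy.sparse.csr_matrix(A)   # compressed sparse row storage
print(A_sparse.nnz)    # 9 non-zero entries stored (instead of 25)
print(A_sparse)        # lists (i, j) -> value pairs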

QR Factorization

QR Factorization

A square matrix $Q \in \mathbb{R}^{n \times n}$ is called orthogonal if its columns and rows are orthonormal vectors. Equivalently, $Q^T Q = Q Q^T = I$.

Orthogonal matrices preserve the Euclidean norm of a vector, i.e.

$$\|Qv\|_2^2 = v^T Q^T Q v = v^T v = \|v\|_2^2$$

Hence, geometrically, we picture orthogonal matrices as reflection or rotation operators. Orthogonal matrices are very important in scientific computing: norm-preservation implies no amplification of numerical error!

QR Factorization

A matrix $A \in \mathbb{R}^{m \times n}$, $m \geq n$, can be factorized as

$$A = QR$$

where $Q \in \mathbb{R}^{m \times m}$ is orthogonal, and

$$R = \begin{bmatrix} \hat{R} \\ 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$$

with $\hat{R} \in \mathbb{R}^{n \times n}$ upper triangular.

QR is very good for solving overdetermined linear least-squares problems, $Ax \approx b$. (QR can also be used to solve a square system $Ax = b$, but it requires about twice as many operations as Gaussian elimination, hence it is not the standard choice.)
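
As an illustration (the data below are arbitrary), the following sketch uses scipy.linalg.qr to solve a small least-squares problem: since $Q$ preserves the 2-norm, minimizing $\|Ax - b\|_2$ reduces to the triangular solve $\hat{R}x = (Q^T b)_{1:n}$:

import numpy as np
from scipy.linalg import qr, solve_triangular

# Overdetermined system: 5 equations, 2 unknowns (illustrative data)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

Q, R = qr(A)        # full QR: Q is 5x5, R is 5x2 = [R_hat; 0]
n = A.shape[1]
c = Q.T @ b         # rotate b; only the first n entries matter
x = solve_triangular(R[:n, :], c[:n])   # solve R_hat x = (Q^T b)_{1:n}

print(x)
print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))  # True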