CS475: Linear Equations, Gaussian Elimination, LU Decomposition
Wim Bohm, Colorado State University

Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 license.

Linear Equations

a_00 x_0 + a_01 x_1 + ... + a_{0,n-1} x_{n-1} = b_0
a_10 x_0 + a_11 x_1 + ... + a_{1,n-1} x_{n-1} = b_1
a_20 x_0 + a_21 x_1 + ... + a_{2,n-1} x_{n-1} = b_2
...
a_{n-1,0} x_0 + a_{n-1,1} x_1 + ... + a_{n-1,n-1} x_{n-1} = b_{n-1}

In matrix notation: Ax = b, with matrix A and column vectors x and b.

Solving Linear Equations: Gaussian Elimination

Reduce Ax = b to Ux = y, where U is an upper triangular matrix with diagonal elements U_ii = 1:

x_0 + u_01 x_1 + ... + u_{0,n-1} x_{n-1} = y_0
      x_1 + ... + u_{1,n-1} x_{n-1} = y_1
      ...
                              x_{n-1} = y_{n-1}

Then back substitute, from x_{n-1} up to x_0.

Example

[ 2 4 2 ] [x_0]   [16]
[ 3 7 4 ] [x_1] = [29]
[ 2 5 4 ] [x_2]   [24]

Normalize row 0:
[ 1 2 1 ] [x_0]   [ 8]
[ 3 7 4 ] [x_1] = [29]
[ 2 5 4 ] [x_2]   [24]

Eliminate column 0:
[ 1 2 1 ] [x_0]   [ 8]
[ 0 1 1 ] [x_1] = [ 5]
[ 0 1 2 ] [x_2]   [ 8]

Normalize row 1 (already done) and eliminate column 1:
[ 1 2 1 ] [x_0]   [ 8]
[ 0 1 1 ] [x_1] = [ 5]
[ 0 0 1 ] [x_2]   [ 3]

Back substitution:
x_0 + 2 x_1 + x_2 = 8   =>  x_0 = 1
      x_1 + x_2   = 5   =>  x_1 = 2
            x_2   = 3   =>  x_2 = 3

Upper Triangularization - Sequential

Two phases, repeated n times. Consider the k-th iteration (0 <= k < n):

Phase 1: normalize the k-th row of A
    for j = k+1 to n-1
        A_kj /= A_kk
    y_k = b_k / A_kk
    A_kk = 1

Phase 2: eliminate; using the k-th row, make the k-th column of A zero for all rows below k
    for i = k+1 to n-1
        for j = k+1 to n-1
            A_ij -= A_ik * A_kj
        b_i -= A_ik * y_k
        A_ik = 0

O(n^2) divides, O(n^3) subtracts and multiplies.
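As a concrete reference, here is a minimal sequential C sketch of the two phases above, assuming a dense row-major n x n matrix; the function and variable names are illustrative, not the course's code.

```c
#include <stddef.h>

/* Sequential upper triangularization (no pivoting): reduces A x = b to
   U x = y in place, with U_kk = 1.  A is dense, row-major (A[i*n+j]).
   Illustrative sketch, not the course's reference implementation. */
void upper_triangularize(size_t n, double *A, const double *b, double *y)
{
    for (size_t i = 0; i < n; i++) y[i] = b[i];

    for (size_t k = 0; k < n; k++) {
        /* Phase 1: normalize row k by the pivot A[k][k] */
        double pivot = A[k * n + k];
        for (size_t j = k + 1; j < n; j++) A[k * n + j] /= pivot;
        y[k] /= pivot;
        A[k * n + k] = 1.0;

        /* Phase 2: eliminate column k below the diagonal */
        for (size_t i = k + 1; i < n; i++) {
            double factor = A[i * n + k];
            for (size_t j = k + 1; j < n; j++)
                A[i * n + j] -= factor * A[k * n + j];
            y[i] -= factor * y[k];
            A[i * n + k] = 0.0;
        }
    }
}
```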

Upper Triangularization - Pipelined Parallel (p = n, row-striped partition)

for all P_i (i = 0 .. n-1) do in parallel
    for k = 0 to n-1
        if (i == k)
            perform k-th phase 1: normalize row k
            send normalized row k down
        if (i > k)
            receive row k, send it down
            perform k-th phase 2: eliminate with row k
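The slide above states the pipeline abstractly. Below is one possible message-passing realization in C with MPI, assuming exactly one rank per row (p = n) and that rank i initially holds row i of A plus the scalar b_i; the tags and names are illustrative and this is a sketch, not the course's implementation.

```c
#include <mpi.h>
#include <stdlib.h>

#define TAG_ROW 0
#define TAG_Y   1

/* Pipelined upper triangularization, one MPI rank per row (p = n).
   Each rank owns one row of A ('myrow') and one entry of b ('b_i');
   after the loop the ranks together hold U and y. */
void pipelined_triangularize(int n, double *myrow, double *b_i, int rank)
{
    double *krow = malloc(n * sizeof(double));
    double ky;

    for (int k = 0; k < n; k++) {
        if (rank == k) {
            /* phase 1: normalize own row, then pass it down the pipeline */
            double pivot = myrow[k];
            for (int j = k + 1; j < n; j++) myrow[j] /= pivot;
            *b_i /= pivot;
            myrow[k] = 1.0;
            if (rank + 1 < n) {
                MPI_Send(myrow, n, MPI_DOUBLE, rank + 1, TAG_ROW, MPI_COMM_WORLD);
                MPI_Send(b_i, 1, MPI_DOUBLE, rank + 1, TAG_Y, MPI_COMM_WORLD);
            }
        } else if (rank > k) {
            /* receive row k from the rank above and forward it down */
            MPI_Recv(krow, n, MPI_DOUBLE, rank - 1, TAG_ROW, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Recv(&ky, 1, MPI_DOUBLE, rank - 1, TAG_Y, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            if (rank + 1 < n) {
                MPI_Send(krow, n, MPI_DOUBLE, rank + 1, TAG_ROW, MPI_COMM_WORLD);
                MPI_Send(&ky, 1, MPI_DOUBLE, rank + 1, TAG_Y, MPI_COMM_WORLD);
            }
            /* phase 2: eliminate with row k */
            double factor = myrow[k];
            for (int j = k + 1; j < n; j++) myrow[j] -= factor * krow[j];
            *b_i -= factor * ky;
            myrow[k] = 0.0;
        }
    }
    free(krow);
}
```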

Example

[ 2 4 2 | 16 ]
[ 3 7 4 | 29 ]
[ 2 5 4 | 24 ]

Normalize row 0: [ 1 2 1 | 8 ]
Row 1: [ 3 7 4 | 29 ] - 3*[ 1 2 1 | 8 ] = [ 0 1 1 | 5 ]
Row 2: [ 2 5 4 | 24 ] - 2*[ 1 2 1 | 8 ] = [ 0 1 2 | 8 ]

Row 1 is already normalized: [ 0 1 1 | 5 ]
Row 2: [ 0 1 2 | 8 ] - 1*[ 0 1 1 | 5 ] = [ 0 0 1 | 3 ]

Back substitution: x_2 = 3, x_1 = 2, x_0 = 1.

Pivoting in Gaussian Elimination

What if A_kk ~ 0? We get a big error, because we divide by A_kk.
  - Find the lower row e with the largest |A_ek| and exchange rows k and e.
What if all A_ek ~ 0?
  - Find the column with the largest |A_ke| and exchange columns.
This complicates the parallel algorithm: we need to keep track of the permutations.

Back-Substitution - Pipelined, Row-Striped

x_0 + u_01 x_1 + ... + u_{0,n-1} x_{n-1} = y_0
      x_1 + ... + u_{1,n-1} x_{n-1} = y_1
      ...
                              x_{n-1} = y_{n-1}

for k = n-1 down to 0
    P_k:          x_k = y_k; send x_k up
    P_i (i < k):  receive x_k, send it up, y_i = y_i - x_k * u_ik
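For reference, the corresponding sequential back substitution for U x = y with unit diagonal (as produced by the triangularization above) is a short loop; a minimal sketch with flat row-major storage and illustrative names.

```c
/* Sequential back substitution for U x = y, where U is upper triangular
   with unit diagonal (U_kk == 1).  U is stored row-major in a flat array. */
void back_substitute(int n, const double *U, const double *y, double *x)
{
    for (int k = n - 1; k >= 0; k--) {
        double s = y[k];
        for (int j = k + 1; j < n; j++)
            s -= U[k * n + j] * x[j];
        x[k] = s;
    }
}
```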

LU Decomposition

In Gaussian elimination we transform Ax = b into Ux = y, i.e., we transform both A and b. So if we get a new system Ax = c with the same A, we need to redo the whole transformation. LU decomposition treats A independently of the right-hand side:

A = LU, with L a unit lower triangular matrix and U an upper triangular matrix.

Forward / Backward Substitution

Ax = b becomes LUx = b. Define Ux = y; first solve Ly = b, then Ux = y.

Solving Ly = b by forward substitution. Forward substitute y_1:

[ 1 0 0 ] [y_1]   [b_1]           [ 1 0 ] [y_2]   [b_2 - 2 b_1]
[ 2 1 0 ] [y_2] = [b_2]     =>    [ 4 1 ] [y_3] = [b_3 - 3 b_1]
[ 3 4 1 ] [y_3]   [b_3]

Now forward substitute y_2, and then y_3.

Solving Ux = y uses backward substitution, the same mechanism we studied for Gaussian elimination.

Both forward and backward substitution are easily pipelined.
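A hedged C sketch of the two solves, assuming L (unit lower triangular, diagonal implicit) and U (general upper triangular) are given as separate flat row-major arrays; in practice they could just as well live packed inside A.

```c
#include <stdlib.h>

/* Solve A x = b given the factorization A = L U, where L is unit lower
   triangular and U is upper triangular.  Illustrative sketch only. */
void lu_solve(int n, const double *L, const double *U,
              const double *b, double *x)
{
    double *y = malloc(n * sizeof(double));

    /* forward substitution: L y = b (unit diagonal of L is implicit) */
    for (int i = 0; i < n; i++) {
        double s = b[i];
        for (int j = 0; j < i; j++)
            s -= L[i * n + j] * y[j];
        y[i] = s;
    }

    /* backward substitution: U x = y */
    for (int i = n - 1; i >= 0; i--) {
        double s = y[i];
        for (int j = i + 1; j < n; j++)
            s -= U[i * n + j] * x[j];
        x[i] = s / U[i * n + i];
    }

    free(y);
}
```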

LU Decomposition: Recursive Approach

n = 1: done! L = I_1, U = A.

n > 1: partition A as

    [ a_11  a_12 ... a_1n ]
A = [ a_21  a_22 ... a_2n ]  =  [ a_11  w^T ]
    [  ...            ...  ]    [ v     A'  ]
    [ a_n1  a_n2 ... a_nn ]

where v and w are (n-1)-sized vectors and A' is the (n-1) x (n-1) lower-right block. Then

A = [ a_11  w^T ] = [ 1       0     ] [ a_11   w^T             ]
    [ v     A'  ]   [ v/a_11  I_n-1 ] [ 0      A' - v w^T/a_11 ]

You can check this by matrix multiplication. Do it by hand for n = 2 and n = 3.
We have created the first step of the LU decomposition.

n = 2

[ a_11  a_12 ]   [ 1           0 ] [ a_11  a_12 ]
[ a_21  a_22 ] = [ a_21/a_11   1 ] [ 0     p    ]

p = ?

Multiplying out the (2,2) entry: a_22 = (a_21 * a_12)/a_11 + p, so

p = a_22 - (a_21 * a_12)/a_11

LU Decomposition: Step 1

A = [ a_11  w^T ] = [ 1       0     ] [ a_11   w^T             ]
    [ v     A'  ]   [ v/a_11  I_n-1 ] [ 0      A' - v w^T/a_11 ]

You can check this by matrix multiplication. This is the first step of the LU decomposition.

Inductive Leap

Solve the rest recursively, i.e., A' - v w^T/a_11 = L'U'.

Element-wise this is A_ik /= A_kk (forming v/a_11) followed by A_ij -= A_ik * A_kj (forming A' - v w^T/a_11), very similar to Gauss (there: A_kj /= A_kk, then A_ij -= A_ik * A_kj). Then

A = [ 1       0     ] [ a_11   w^T  ]  =  [ 1       0  ] [ a_11   w^T ]
    [ v/a_11  I_n-1 ] [ 0      L'U' ]     [ v/a_11  L' ] [ 0      U'  ]

In place: L and U are stored and computed in A; the 0s and 1s of L and U are implicit (not stored).

The Code

for k = 1 to n {
    for i = k+1 to n
        A_ik /= A_kk
    for i = k+1 to n
        for j = k+1 to n
            A_ij -= A_ik * A_kj
}

Example:

[ 1  4  5 ]      [ 1  4  5 ]      [ 1  4  5 ]
[ 2 10 16 ]  =>  [ 2  2  6 ]  =>  [ 2  2  6 ]
[ 3 20 42 ]      [ 3  8 27 ]      [ 3  4  3 ]

k = 1: divide column 1 below the pivot by 1, then multiply-subtract
k = 2: divide column 2 below the pivot by 2, then multiply-subtract

CHECK!!! how?

The packed result splits into a unit lower triangular L and an upper triangular U, and L x U reproduces the original A:

[ 1  4  5 ]        [ 1  0  0 ]   [ 1  4  5 ]   [ 1  4  5 ]
[ 2  2  6 ]   =>   [ 2  1  0 ] x [ 0  2  6 ] = [ 2 10 16 ]
[ 3  4  3 ]        [ 3  4  1 ]   [ 0  0  3 ]   [ 3 20 42 ]
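One way to run that check mechanically: a small C sketch that factors a copy of the example in place, rebuilds L x U from the packed result, and compares it against the original A. The array contents come from the slides; everything else is illustrative.

```c
#include <math.h>
#include <stdio.h>

#define N 3

/* In-place LU (no pivoting): after the loops, A holds U on and above the
   diagonal and the multipliers of L (unit diagonal implicit) below it. */
static void lu_inplace(double A[N][N])
{
    for (int k = 0; k < N; k++) {
        for (int i = k + 1; i < N; i++) A[i][k] /= A[k][k];
        for (int i = k + 1; i < N; i++)
            for (int j = k + 1; j < N; j++)
                A[i][j] -= A[i][k] * A[k][j];
    }
}

int main(void)
{
    double orig[N][N] = {{1, 4, 5}, {2, 10, 16}, {3, 20, 42}};
    double A[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) A[i][j] = orig[i][j];

    lu_inplace(A);

    /* Rebuild L*U from the packed result and compare with the original. */
    double maxerr = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double s = 0.0;
            for (int k = 0; k < N; k++) {
                double l = (k < i) ? A[i][k] : (k == i) ? 1.0 : 0.0;
                double u = (k <= j) ? A[k][j] : 0.0;
                s += l * u;
            }
            double e = fabs(s - orig[i][j]);
            if (e > maxerr) maxerr = e;
        }
    printf("max |L*U - A| = %g\n", maxerr);  /* expect 0 for this example */
    return 0;
}
```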

Pivoting

Why? A_kk can be close to 0, causing a big error in 1/A_kk, so we should find the A_ik with the largest absolute value (i >= k) and swap rows i and k.

How? We can record this in a permutation matrix, or better, in an n-sized array π[i]: the position of row i.

Without pivoting, the LU code is a straightforward translation of our recurrence.
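A possible way to fold partial pivoting into the in-place code, using an n-sized permutation array rather than an explicit permutation matrix. Here pi[i] records which original row now sits in position i (one of several equivalent conventions); the whole routine is a sketch, not the course's code.

```c
#include <math.h>

/* In-place LU with partial pivoting, flat row-major A (n x n).
   pi[i] records which original row ended up in position i, so the
   factorization satisfies P A = L U.  Returns -1 if no usable pivot. */
int lu_pivot(int n, double *A, int *pi)
{
    for (int i = 0; i < n; i++) pi[i] = i;

    for (int k = 0; k < n; k++) {
        /* find the row at or below k with the largest |A[i][k]| */
        int p = k;
        for (int i = k + 1; i < n; i++)
            if (fabs(A[i * n + k]) > fabs(A[p * n + k])) p = i;
        if (A[p * n + k] == 0.0) return -1;   /* singular to working precision */

        /* swap rows k and p of A, and the bookkeeping entries in pi */
        if (p != k) {
            for (int j = 0; j < n; j++) {
                double t = A[k * n + j]; A[k * n + j] = A[p * n + j]; A[p * n + j] = t;
            }
            int t = pi[k]; pi[k] = pi[p]; pi[p] = t;
        }

        /* same update as before */
        for (int i = k + 1; i < n; i++) A[i * n + k] /= A[k * n + k];
        for (int i = k + 1; i < n; i++)
            for (int j = k + 1; j < n; j++)
                A[i * n + j] -= A[i * n + k] * A[k * n + j];
    }
    return 0;
}
```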

LUD: Data Dependence

for k = 1 to n {
    for i = k+1 to n
        A_ik /= A_kk
    for i = k+1 to n
        for j = k+1 to n
            A_ij -= A_ik * A_kj
}

[Figure: iteration k reads row k and column k of A and updates only the trailing submatrix A[k+1:n, k+1:n].]

Pipelining

Iteration k+1 can start after iteration k has finished row k+1: neither row k nor column k are read or written anymore after that.
This naturally leads to a streaming algorithm:
  - every iteration becomes a process
  - results from iteration/process k flow to process k+1
  - process k uses row k to update the submatrix A[k+1:n, k+1:n]
Actually, only a fixed number P of processes run in parallel; after that, the data is re-circulated.

Pipelined LU

- Pipeline of P processes (parallel sections, in OpenMP lingo)
- The algorithm runs in ceil(N/P) sweeps (sketched below)
  - in one sweep, P outer k-iterations run in parallel
  - after each sweep, P rows and columns are done
  - the n x n problem is reduced to (n-P) x (n-P)
  - the rest of the (updated) array is re-circulated
- Each process owns a row
  - process p owns row p + (i-1)P in sweep i, i = 1 to ceil(N/P)
- Array elements are updated while streaming through the pipe of processes
- Total work O(n^3), speedup ~P
- Forward / backward substitution are simple loops
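The pipeline itself (OpenMP parallel sections with streams) is described on the following slides. As a stepping stone, the hedged sketch below only restructures the sequential in-place LU into ceil(n/P) sweeps of up to P pivot steps each, i.e., the loop structure that the pipelined version overlaps in time; names are illustrative.

```c
/* The same in-place LU, restructured into ceil(n/P) sweeps of up to P
   pivot steps each.  Sequential sketch only: in the pipelined algorithm
   the P pivot steps of a sweep run overlapped, one per process, with the
   trailing submatrix re-circulated between sweeps. */
void lu_by_sweeps(int n, int P, double *A)
{
    for (int sweep = 0; sweep * P < n; sweep++) {
        int k_lo = sweep * P;
        int k_hi = (k_lo + P < n) ? k_lo + P : n;

        for (int k = k_lo; k < k_hi; k++) {
            for (int i = k + 1; i < n; i++) A[i * n + k] /= A[k * n + k];
            for (int i = k + 1; i < n; i++)
                for (int j = k + 1; j < n; j++)
                    A[i * n + j] -= A[i * n + k] * A[k * n + j];
        }
    }
}
```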

[Animation: the matrix is first flipped vertically, then fed row by row into Processor 1; as rows finish, the stream is fed on to Processor 2, then Processor 3, and so on down the pipeline (P1, P2, P3, P4, ...).]

Process Pipeline

[Figure: processes 0, 1, ..., P-1 connected by streams S0, S1, ..., with memories M1 and M2 feeding the pipeline and M3 collecting finished results.]

Memories:
  M1 -> M2 on even sweeps, M2 -> M1 on odd sweeps
  M3: siphons off finished results
Processes are grouped in OpenMP-style parallel sections, with intermediate data in streams:
  Process 0 reads from either M1 or M2 and writes to stream S0
  Process i reads from stream S_i-1 and writes to S_i
  Process P-1 reads from the last stream, writes finished data to M3 and the rest to M1 or M2