
Chapter 2  Solving linear systems (6 lectures)

2.1 Solving linear systems: LU factorization (1 lecture)

Reference: [Trefethen, Bau III], Lectures 20, 21

How do you solve

    Ax = b?                                                    (2.1.1)

In numerical linear algebra, NEVER compute A^(-1) and then form A^(-1) b. Reason: it is much more expensive, it requires extra storage, and it is less accurate than solving the system directly.

2.1.1 Gaussian elimination

Basic idea: use elementary row operations to zero out the entries below the diagonal, one column at a time, until the matrix becomes upper triangular:

    [ x x x x ]      [ x x x x ]      [ x x x x ]      [ x x x x ]
    [ x x x x ]  ->  [ 0 x x x ]  ->  [ 0 x x x ]  ->  [ 0 x x x ]
    [ x x x x ]      [ 0 x x x ]      [ 0 0 x x ]      [ 0 0 x x ]
    [ x x x x ]      [ 0 x x x ]      [ 0 0 x x ]      [ 0 0 0 x ]

Example 1 (Gaussian elimination). Solve a 3x3 system

    A x = b,   x = (x_1, x_2, x_3)^T,                          (2.1.2)

by eliminating the entries below the diagonal, column by column.

Carrying out the eliminations reduces A to an upper triangular matrix U. Gaussian elimination is therefore the same as computing the factorization

    A = L U,                                                   (2.1.3)

where U is the upper triangular matrix produced by the elimination and L is the unit lower triangular matrix that records the multipliers used at each step. This gives rise to LU factorization: the system becomes

    L U x = b,                                                 (2.1.4)

which is solved by one triangular solve with L followed by one triangular solve with U.

2.1.2 LU factorization, forward/backward substitutions

The exact solver (Gaussian elimination) of Ax = b is equivalent to:

  - LU-factorize the matrix A:  A = LU.
  - Solve Ly = b for the intermediate solution y (forward substitution).
  - Solve Ux = y for the final solution x (backward substitution).

We can derive the LU factorization from the process of Gaussian elimination, as follows:

Algorithm 1  LU factorization
 1: for k = 1, ..., n-1 do                      // iterate over the columns to be eliminated
 2:   for i = k+1, ..., n do                    // iterate over all rows beneath row k
 3:     mult = a_ik / a_kk                      // multiplier for row i (i > k)
 4:     a_ik = mult                             // store the k-th column of the lower triangular matrix
 5:     for j = k+1, ..., n do                  // iterate over the remaining columns in row i
 6:       a_ij = a_ij - mult * a_kj             // subtract the scaled row k; this builds row i of the upper triangular matrix
 7:     end for
 8:   end for
 9: end for

Theorem 1 (LU factorization).

    A = LU,                                                    (2.1.8)

where L is a lower triangular matrix with unit diagonal and U is an upper triangular matrix. In addition, after the elimination the array A^(n-1) stores both factors: the upper triangle (including the diagonal) holds U, and the strictly lower triangle holds the multipliers, i.e. the below-diagonal entries of L:

    A^(n-1) = [ U on and above the diagonal          ]
              [ multipliers (= L below the diagonal) ]

We can also derive the forward substitution from the process of Gaussian elimination, as follows:

Algorithm 2  Forward substitution for Ly = b
 1: for i = 1, ..., n do                        // iterate over all rows
 2:   y_i = b_i
 3:   for j = 1, ..., i-1 do                    // iterate over the (lower-triangular) columns in row i
 4:     y_i = y_i - l_ij * y_j                  // subtract the known terms to solve for y_i
 5:   end for
 6:   y_i = y_i / l_ii                          // skip this line if l_ii = 1
 7: end for
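To make Algorithms 1 and 2 concrete, here is a minimal NumPy sketch (not part of the original notes). The helper names lu_inplace and forward_substitution and the 3x3 test matrix are chosen here purely for illustration; the loops follow the pseudocode above.

    import numpy as np

    def lu_inplace(A):
        """Algorithm 1: overwrite A with U (upper triangle) and the multipliers (strict lower triangle)."""
        A = A.astype(float).copy()
        n = A.shape[0]
        for k in range(n - 1):                   # columns to eliminate
            for i in range(k + 1, n):            # rows beneath row k
                mult = A[i, k] / A[k, k]         # multiplier for row i
                A[i, k] = mult                   # store the k-th column of L
                A[i, k+1:] -= mult * A[k, k+1:]  # update row i of U
        return A

    def forward_substitution(L, b, unit_diagonal=True):
        """Algorithm 2: solve Ly = b for a lower triangular L."""
        n = len(b)
        y = np.array(b, dtype=float)
        for i in range(n):
            y[i] -= L[i, :i] @ y[:i]
            if not unit_diagonal:
                y[i] /= L[i, i]
        return y

    A = np.array([[3.0, 1.0, 2.0],
                  [6.0, 3.0, 4.0],
                  [3.0, 1.0, 5.0]])
    b = np.array([1.0, 2.0, 3.0])

    F = lu_inplace(A)
    L = np.tril(F, -1) + np.eye(3)               # unit lower triangular factor
    U = np.triu(F)                               # upper triangular factor
    print(np.allclose(L @ U, A))                 # True: A = LU
    y = forward_substitution(L, b)               # Ly = b; a backward substitution with U would finish the solve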

Exercise 1 (backward substitution). Write down the algorithm for the backward substitution Ux = y.

2.1.3 Complexity

We evaluate complexity by counting FLOPs (floating point operations): +, -, *, /.

The complexity of LU factorization: from Algorithm 1, we have

    sum_{i=1}^{n-1} sum_{k=i+1}^{n} sum_{j=i+1}^{n} 2
        = sum_{i=1}^{n-1} sum_{k=i+1}^{n} 2(n-i)
        = sum_{i=1}^{n-1} 2(n-i)^2
        = (2/3) n^3 + O(n^2),                                  (2.1.9)

where we have used

    sum_{i=1}^{n} i = n(n+1)/2,      sum_{i=1}^{n} i^2 = n(n+1)(2n+1)/6.    (2.1.10)

Exercise 2 (Forward/backward substitutions). Analyze the complexity of the forward substitution Ly = b and the backward substitution Ux = y.

The overall complexity of the exact solver:

    (2/3) n^3 + O(n^2).                                        (2.1.11)

2.1.4 Pivoting

Example 2 (Instability of Gaussian elimination when a_11 = 0). Consider solving

    [ 0  2  3 ] [x_1]   [ 1 ]
    [ 1  0  4 ] [x_2] = [ 6 ].                                 (2.1.12)
    [ 3  0  2 ] [x_3]   [ 8 ]

Since the pivot a_11 = 0, we cannot perform the LU factorization!

Example 3 (Partial pivoting). The solution to this instability issue is pivoting. Reorder the equations (the rows of A and of the right hand side b) such that the largest among a_i1 (i = 1, 2, 3) becomes the pivot:

    [ 3  0  2 ] [x_1]   [ 8 ]
    [ 1  0  4 ] [x_2] = [ 6 ].                                 (2.1.13)
    [ 0  2  3 ] [x_3]   [ 1 ]

Performing the first elimination step gives

    [ 3  0   2   ] [x_1]   [  8   ]
    [ 0  0  10/3 ] [x_2] = [ 10/3 ].                           (2.1.14)
    [ 0  2   3   ] [x_3]   [  1   ]

Next we perform Gaussian elimination on the remaining 2x2 submatrix, which requires another pivoting (the new pivot is again 0):

    [ 3  0   2   ] [x_1]   [  8   ]
    [ 0  2   3   ] [x_2] = [  1   ].                           (2.1.15)
    [ 0  0  10/3 ] [x_3]   [ 10/3 ]

Example 4 (Complete pivoting). Instead, reorder both the equations (the rows of A and b) and the unknowns (the columns of A and x), such that the largest among a_ij (i = 1, 2, 3; j = 1, 2, 3) becomes the pivot.

Remark 1 (Reordering). Reordering the equations = reordering the rows of A and b:

    [ 1  0  4 ] [x_1]   [ 6 ]
    [ 0  2  3 ] [x_2] = [ 1 ].                                 (2.1.16)
    [ 3  0  2 ] [x_3]   [ 8 ]

Reordering the unknowns = reordering the columns of A and x:

    [ 4  0  1 ] [x_3]   [ 6 ]
    [ 3  2  0 ] [x_2] = [ 1 ].                                 (2.1.17)
    [ 2  0  3 ] [x_1]   [ 8 ]

In general, suppose the first (k-1) steps of Gaussian elimination are done, and consider the remaining submatrix A^(k-1) whose pivot a_kk^(k-1) sits in row k. If the original pivot a_kk^(k-1) = 0, or a_kk^(k-1) ≈ 0, then pivoting is required.

Two possible pivoting strategies:

  - Complete pivoting: search for the largest (in absolute value) element in all of A^(k-1), and pivot.
  - Partial pivoting: search for the largest (in absolute value) element in column k of A^(k-1), and pivot.

In practice we use partial pivoting, since complete pivoting is more expensive and does not yield much additional gain.

To summarize the essence of Gaussian elimination with partial pivoting: the exact solver (Gaussian elimination) of Ax = b with partial pivoting is equivalent to:

  - Permute the rows of A:  A_P = P A, where P is a certain permutation matrix.
  - LU-factorize the matrix A_P:  A_P = LU.
  - Solve Ly = P b (the correspondingly permuted right hand side) for the intermediate solution y (forward substitution).
  - Solve Ux = y for the final solution x (backward substitution).

Example 5 (Permutation matrix for partial pivoting). In Example 3, the permutation matrix is given by

    P = P_2 P_1 = [ 1  0  0 ] [ 0  0  1 ]   [ 0  0  1 ]
                  [ 0  0  1 ] [ 0  1  0 ] = [ 1  0  0 ],       (2.1.18)
                  [ 0  1  0 ] [ 1  0  0 ]   [ 0  1  0 ]

where P_1 swaps rows 1 and 3 (the first pivoting) and P_2 swaps rows 2 and 3 (the second pivoting).

The algorithm of LU factorization with partial pivoting is given as follows:

Algorithm 3  LU factorization with partial pivoting
 1: for k = 1, ..., n-1 do                      // iterate over the columns to be eliminated
 2:   Select i = argmax_{i >= k} |a_ik|         // find the largest pivot candidate in column k
 3:   a_{k,k:n} <-> a_{i,k:n}                   // interchange rows k and i (upper triangular part)
 4:   a_{k,1:k-1} <-> a_{i,1:k-1}               // interchange rows k and i (lower triangular part, i.e. the stored multipliers)
 5:   for i = k+1, ..., n do                    // iterate over all rows beneath row k
 6:     mult = a_ik / a_kk                      // multiplier for row i (i > k)
 7:     a_ik = mult                             // store the k-th column of the lower triangular matrix
 8:     for j = k+1, ..., n do                  // iterate over the remaining columns in row i
 9:       a_ij = a_ij - mult * a_kj             // subtract the scaled row k; this builds row i of the upper triangular matrix
10:     end for
11:   end for
12: end for

2.1.5 When is pivoting unnecessary?

In some situations we can prove that pivoting is unnecessary. Any of the following conditions on A ensures that pivoting is not needed and that the LU factorization is stable:

  - A is symmetric positive definite (SPD),
  - A is row diagonally dominant, or
  - A is column diagonally dominant.

Here we show that pivoting is unnecessary for SPD matrices. A quick review of SPD matrices can be found in the supplementary notes.

Theorem 2 (Pivoting unnecessary for SPD matrices). Suppose A is SPD. Then during Gaussian elimination, a_kk^(k-1) > 0.

Proof. For simplicity, consider k = 1. Suppose

    A = [ a_11  v^T ]
        [ v     B   ]                                          (2.1.19)

is SPD. Here a_11 is a number, v ∈ R^(n-1) and B ∈ R^((n-1) x (n-1)). Then a_11 > 0 (take x = e_1 in x^T A x > 0). Now eliminate v using a_11 as the pivot:

    [ a_11  v^T ]  ->  [ a_11   v^T             ]              (2.1.20)
    [ v     B   ]      [ 0      B - v v^T/a_11  ]

Hence the Gaussian elimination gives

    A^(1) = B - v v^T / a_11.                                  (2.1.21)

Next we prove that A^(1) is SPD. It is easy to see that A^(1) is symmetric, so our focus is to prove that A^(1) is PD. Let x ∈ R^(n-1) and

    y := ( -(x^T v)/a_11 )  ∈ R^n.                             (2.1.22)
         (       x       )

Some straightforward algebra shows that

    y^T A y = x^T ( B - v v^T/a_11 ) x = x^T A^(1) x.          (2.1.23)

Since A is PD, we have

    y^T A y > 0   =>   x^T A^(1) x > 0.                        (2.1.24)

Hence A^(1) is PD, and in particular a_22^(1) > 0. The case k > 1 follows in the same fashion (induction).

Remark 2. You will prove that pivoting is unnecessary for row/column diagonally dominant matrices in your assignment.
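As a quick sanity check on the factor-solve framework with partial pivoting, the sketch below (not from the notes) applies SciPy's standard routines scipy.linalg.lu, lu_factor and lu_solve to the small system from Example 2, whose leading pivot is zero.

    import numpy as np
    from scipy.linalg import lu, lu_factor, lu_solve

    A = np.array([[0.0, 2.0, 3.0],
                  [1.0, 0.0, 4.0],
                  [3.0, 0.0, 2.0]])
    b = np.array([1.0, 6.0, 8.0])

    # P, L, U such that A = P L U (SciPy's convention), i.e. P^T A = L U.
    P, L, U = lu(A)
    print(np.allclose(P @ L @ U, A))     # True, even though a_11 = 0

    # Factor once, then reuse the factorization for (possibly many) right hand sides.
    lu_piv = lu_factor(A)                # LU with partial pivoting, packed storage
    x = lu_solve(lu_piv, b)              # forward + backward substitution
    print(np.allclose(A @ x, b))         # True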

2.2 Solving symmetric positive definite systems: Cholesky factorization (1 lecture)

Reference: [Trefethen, Bau III] Lecture 23

The complexity of LU factorization is (2/3) n^3, which is still very expensive! Consider a 1000 x 1000 image: the dimension of the resulting linear system is n = 10^6, so the computational cost is on the order of 10^18 flops!

Consider special linear systems: exploiting the special structure of the linear system leads to a more efficient LU factorization. We will see in this lecture that:

  - Generic matrix: LU factorization = LDM^T factorization.
  - Symmetric matrix: LDL^T factorization.
  - Positive definite matrix: LDM^T factorization, where D > 0.
  - Symmetric positive definite matrix: LDL^T factorization, where D > 0, i.e. the Cholesky factorization (A = GG^T).

2.2.1 Generic matrix: LDM^T factorization

Theorem 3 (LDM^T factorization). If all the leading principal submatrices of A are nonsingular, then there exist unique unit lower triangular matrices L and M, and a unique diagonal matrix D, such that

    A = L D M^T.                                               (2.2.1)

Proof.

    A = LU = L D D^(-1) U = L D M^T,                           (2.2.2)

where D is the diagonal part of U and M^T = D^(-1) U (rescale each row of U so that it has a unit diagonal).

Remark 3. The LDM^T factorization is simply a variant of LU factorization. Nothing new!
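A quick numerical illustration of the proof of Theorem 3 (not from the notes): take any LU factorization, pull out D = diag(U), and rescale. The test matrix is an arbitrary choice that needs no pivoting.

    import numpy as np
    from scipy.linalg import lu

    # Arbitrary matrix whose leading principal submatrices are nonsingular.
    A = np.array([[4.0, 2.0, 1.0],
                  [2.0, 5.0, 3.0],
                  [1.0, 3.0, 6.0]])

    P, L, U = lu(A)                       # here P = I, so A = L U
    D = np.diag(np.diag(U))               # diagonal part of U
    M_T = np.diag(1.0 / np.diag(U)) @ U   # M^T = D^(-1) U, unit upper triangular
    print(np.allclose(A, L @ D @ M_T))    # True: A = L D M^T
    print(np.allclose(M_T.T, L))          # True here because this A is symmetric (see LDL^T below)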

2.2.2 Symmetric matrix: LDL^T factorization

Theorem 4 (LDL^T factorization). If A is symmetric, then M = L, or equivalently,

    A = L D L^T.                                               (2.2.3)

Proof. A = L D M^T  =>  M^(-1) A M^(-T) = M^(-1) L D M^T M^(-T) = M^(-1) L D.

  - M^(-1) A M^(-T) is symmetric  =>  M^(-1) L D is symmetric.
  - Both M and L are unit lower triangular  =>  M^(-1) L is unit lower triangular (why?)  =>  M^(-1) L D is lower triangular.
  - Hence M^(-1) L D is both symmetric and lower triangular, i.e. a diagonal matrix  =>  M^(-1) L is a diagonal matrix.
  - M^(-1) L is unit lower triangular and diagonal  =>  M^(-1) L is the identity matrix  =>  M = L.

Remark 4. Why does this matter? We save half the work by computing only L and D.

2.2.3 Positive definite (PD) matrix: LDM^T factorization, where D > 0

Theorem 5 (LDM^T factorization for PD). If A is PD, then for A = L D M^T we have D > 0.

Proof. A = L D M^T  =>  L^(-1) A L^(-T) = L^(-1) L D M^T L^(-T) = D M^T L^(-T).

  - L^(-1) A L^(-T) is PD  =>  D M^T L^(-T) is PD.
  - By Corollary 8 (supplementary notes), diag(D M^T L^(-T)) > 0.
  - Both M and L are unit lower triangular  =>  M^T L^(-T) is unit upper triangular (why?)  =>  diag(D M^T L^(-T)) = D.
  - Hence D > 0.

2.2.4 Symmetric positive definite (SPD) matrix: Cholesky factorization

Symmetric matrix:

    A = L D M^T  =>  M = L.                                    (2.2.4)

PD matrix:

    A = L D M^T  =>  D > 0.                                    (2.2.5)

Then, for an SPD matrix:

    A = L D M^T  =>  M = L,  D > 0.                            (2.2.6)

Rewrite:

    A = L D L^T = L D^(1/2) D^(1/2) L^T = (L D^(1/2)) (L D^(1/2))^T = G G^T.   (2.2.7)

This gives rise to the Cholesky factorization

    A = G G^T,                                                 (2.2.8)

where G is lower triangular.

2.2.5 Cholesky factorization algorithm

Naively: do the LU factorization and go through the process above to obtain the Cholesky factor. But can we compute the Cholesky factorization directly? The answer is yes! Write A in block form and verify that

    A = [ α  v^T ]
        [ v  B   ]
      = [ sqrt(α)    0 ] [ 1   0            ] [ sqrt(α)  v^T/sqrt(α) ]
        [ v/sqrt(α)  I ] [ 0   B - v v^T/α  ] [ 0        I           ].   (2.2.9)

If A is SPD, then B - v v^T / α is SPD (exercise). Let

    B - v v^T / α = G_1 G_1^T.                                 (2.2.10)

Then

    A = [ sqrt(α)    0   ] [ sqrt(α)  v^T/sqrt(α) ]
        [ v/sqrt(α)  G_1 ] [ 0        G_1^T       ]  = G G^T.  (2.2.11)

This implies that we can perform the Cholesky factorization recursively.

Exercise 3. Prove that B - v v^T / α is SPD. Hint: check X^T A X, where

    X := [ 1   -v^T/α ]
         [ 0    I     ].                                       (2.2.12)

You should get

    X^T A X = [ α   0            ]
              [ 0   B - v v^T/α  ].                            (2.2.13)

Example 6 (Cholesky factorization). Based on (2.2.9), Cholesky-factorize the following 3x3 matrix:

    A = [  9  -3   3 ]
        [ -3   5   1 ]                                         (2.2.14)
        [  3   1  18 ]

The answer is

    G = [  3  0  0 ]
        [ -1  2  0 ]                                           (2.2.15)
        [  1  1  4 ]

To summarize the Cholesky factorization:

Algorithm 4  Cholesky factorization
 1: for k = 1, ..., n do                        // iterate from top to bottom along the diagonal
 2:   a_kk = sqrt(a_kk)                         // take the square root of the diagonal element alpha
 3:   for i = k+1, ..., n do
 4:     a_ik = a_ik / a_kk                      // update the current column below the diagonal: v = v / sqrt(alpha)
 5:   end for
 6:   for j = k+1, ..., n do
 7:     for i = j, ..., n do
 8:       a_ij = a_ij - a_ik * a_jk             // update the lower right block B = B - v v^T / alpha (below the diagonal only)
 9:     end for
10:   end for
11: end for

Complexity: The complexity of the Cholesky factorization is

    n^3/3 + O(n^2).                                            (2.2.16)

Exercise 4. Verify the complexity.

Remark 5. In order to solve Ax = b after the Cholesky factorization A = GG^T, we use forward/backward substitution to solve G G^T x = b.
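A minimal NumPy sketch of Algorithm 4 (not part of the notes), applied to the matrix of Example 6 with the sign convention used above; np.linalg.cholesky is used only to cross-check the hand-rolled loop, and the helper name cholesky_inplace is chosen here for illustration.

    import numpy as np

    def cholesky_inplace(A):
        """Algorithm 4: overwrite the lower triangle of A with the Cholesky factor G."""
        A = A.astype(float).copy()
        n = A.shape[0]
        for k in range(n):
            A[k, k] = np.sqrt(A[k, k])            # take the square root of the diagonal element
            A[k+1:, k] /= A[k, k]                 # v = v / sqrt(alpha)
            for j in range(k + 1, n):             # update B = B - v v^T / alpha (lower triangle only)
                A[j:, j] -= A[j:, k] * A[j, k]
            A[k, k+1:] = 0.0                      # zero the upper triangle so that A now stores G
        return A

    A = np.array([[ 9.0, -3.0,  3.0],
                  [-3.0,  5.0,  1.0],
                  [ 3.0,  1.0, 18.0]])

    G = cholesky_inplace(A)
    print(G)                                      # [[3, 0, 0], [-1, 2, 0], [1, 1, 4]]
    print(np.allclose(G @ G.T, A))                # True
    print(np.allclose(G, np.linalg.cholesky(A)))  # True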

2.3 From partial differential equations to sparse linear systems (1 lecture)

Reference: [Saad] 2.1-2.2

2.3.1 Partial differential equations (PDEs)

Three of the most basic linear partial differential equations (PDEs). Here t denotes time and x, y denote space.

Wave equation:

    u_t + a u_x = 0,    u(x, 0) = sin(2 pi x).                 (2.3.1)

Solution:

    u(x, t) = sin(2 pi (x - a t)).                             (2.3.2)

Figure: plot of the travelling-wave solution on 0 <= x <= 1.

Heat equation:

    u_t - sigma u_xx = 0,    u(x, 0) = sin(k pi x),  k an integer,
    u(0, t) = 0,    u(1, t) = 0.                               (2.3.3)

Solution:

    u(t, x) = e^(-k^2 pi^2 sigma t) sin(k pi x).               (2.3.4)

Figure: plot of the decaying heat-equation solution on 0 <= x <= 1.

Poisson equation:

    -(u_xx + u_yy) = 2 pi^2 sin(pi x) sin(pi y),   inside (0,1) x (0,1),
    u = 0,   on the boundary of [0,1] x [0,1].                 (2.3.5)

Solution:

    u(x, y) = sin(pi x) sin(pi y).                             (2.3.6)

Figure: surface plot of the solution u(x, y).

Remark 6. In this course we only discuss time-independent problems (boundary value problems, steady state problems). Time dependent problems, and more generally numerical techniques for all types of PDEs, are covered in AMATH 342, AMATH 442, AMATH 741 / CS 778.

Remark 7. In general it is difficult to find analytical solutions. We need numerical solutions!

2.3.2 1D Poisson equation

Consider solving for a steady heat distribution or electric potential on a line:

    -u_xx(x) = f(x),   inside (0, 1),                          (2.3.7)
    u(0) = a,   u(1) = b.                                      (2.3.8)

Idea: continuous -> discrete. Discretize the computational domain [0, 1] into a uniform grid and find u(x) at each grid point. This is a finite difference discretization.

Construct a grid:

  - Grid size: m = 4.
  - Grid spacing: h = 1/(m+1) = 1/5.
  - Grid coordinates: x_0 = 0, x_1 = 1/5, x_2 = 2/5, x_3 = 3/5, x_4 = 4/5, x_5 = 1.
  - Right hand side: f(x_0), f(x_1), f(x_2), f(x_3), f(x_4), f(x_5), denoted f_0, f_1, f_2, f_3, f_4, f_5.

Our goal: solve for the unknowns u(x_0), u(x_1), u(x_2), u(x_3), u(x_4), u(x_5), denoted u_0, u_1, u_2, u_3, u_4, u_5.

On the boundary, Equation (2.3.8)

    u(0) = a,   u(1) = b                                       (2.3.9)

gives

    u_0 = a,   u_5 = b.                                        (2.3.10)

Inside 0 < x < 1, we enforce Equation (2.3.7).

The 2nd derivative:

    u_xx(x) = lim_{h -> 0} [ u(x-h) - 2u(x) + u(x+h) ] / h^2.  (2.3.11)

On the grid, approximate it by

    u_xx(x_i) ≈ [ u(x_i - h) - 2u(x_i) + u(x_i + h) ] / h^2
              = ( u_{i-1} - 2u_i + u_{i+1} ) / h^2,   i = 1, 2, 3, 4.      (2.3.12)

Equation (2.3.7),

    -u_xx(x) = f(x),   inside (0, 1),                          (2.3.13)

becomes

    ( -u_{i-1} + 2u_i - u_{i+1} ) / h^2 = f_i,   i = 1, 2, 3, 4.           (2.3.14)

i = 1:
    ( -u_0 + 2u_1 - u_2 ) / h^2 = f_1.
Note that u_0 = a:
    ( 2u_1 - u_2 ) / h^2 = f_1 + a/h^2.

i = 2:
    ( -u_1 + 2u_2 - u_3 ) / h^2 = f_2.

i = 3:
    ( -u_2 + 2u_3 - u_4 ) / h^2 = f_3.

i = 4:
    ( -u_3 + 2u_4 - u_5 ) / h^2 = f_4.
Note that u_5 = b:
    ( -u_3 + 2u_4 ) / h^2 = f_4 + b/h^2.

Align the unknowns:

     (2/h^2) u_1 - (1/h^2) u_2                               = f_1 + a/h^2,
    -(1/h^2) u_1 + (2/h^2) u_2 - (1/h^2) u_3                 = f_2,
                  -(1/h^2) u_2 + (2/h^2) u_3 - (1/h^2) u_4   = f_3,         (2.3.15)
                                -(1/h^2) u_3 + (2/h^2) u_4   = f_4 + b/h^2.

This gives a linear system:

    (1/h^2) [  2  -1   0   0 ] [u_1]   [ f_1 + a/h^2 ]
            [ -1   2  -1   0 ] [u_2] = [ f_2         ]
            [  0  -1   2  -1 ] [u_3]   [ f_3         ]                      (2.3.16)
            [  0   0  -1   2 ] [u_4]   [ f_4 + b/h^2 ]

For general m, the discretization of Equation (2.3.7) gives rise to a linear system

    A u = f,                                                   (2.3.17)

where the pattern is

    A = (1/h^2) *
        [  2  -1                 ]
        [ -1   2  -1             ]
        [     -1   2  -1         ]
        [        ...  ...  ...   ]
        [            -1   2  -1  ]
        [                -1   2  ],

    u = ( u_1, u_2, u_3, ..., u_{m-1}, u_m )^T,
    f = ( f_1 + a/h^2, f_2, f_3, ..., f_{m-1}, f_m + b/h^2 )^T.             (2.3.18)

Remark 8. The matrix A is SPD (see supplementary notes).

Algorithm 5  1D discrete Laplacian
 1: for i = 1, ..., m do
 2:   A_{i,i} = 2/h^2
 3:   if i > 1 then
 4:     A_{i,i-1} = -1/h^2
 5:   end if
 6:   if i < m then
 7:     A_{i,i+1} = -1/h^2
 8:   end if
 9: end for
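Here is a minimal SciPy sketch (not from the notes) that assembles the matrix of Algorithm 5 in sparse form and solves the system. The helper name poisson_1d and the manufactured solution sin(pi x) are chosen only for illustration.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def poisson_1d(m, f, a, b):
        """Assemble and solve -u_xx = f on (0,1), u(0) = a, u(1) = b (Algorithm 5 plus a sparse solve)."""
        h = 1.0 / (m + 1)
        x = np.linspace(h, 1 - h, m)                 # interior grid points x_1, ..., x_m
        main = 2.0 * np.ones(m) / h**2
        off = -1.0 * np.ones(m - 1) / h**2
        A = sp.diags([off, main, off], [-1, 0, 1], format='csr')
        rhs = f(x)
        rhs[0] += a / h**2                           # fold the boundary values into the right hand side
        rhs[-1] += b / h**2
        return x, spsolve(A, rhs)

    # Manufactured example: exact solution u(x) = sin(pi x), so f = pi^2 sin(pi x) and a = b = 0.
    x, u = poisson_1d(100, lambda x: np.pi**2 * np.sin(np.pi * x), 0.0, 0.0)
    print(np.max(np.abs(u - np.sin(np.pi * x))))     # small O(h^2) discretization error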

2.3.3 2D Poisson equation

Consider solving for a steady heat distribution or electric potential in a square box:

    -u_xx - u_yy = f,   inside (0,1) x (0,1),
    u = g,   on the boundary of [0,1] x [0,1].                 (2.3.19)

Idea: continuous -> discrete. Discretize the computational domain [0,1] x [0,1] into a square grid and find u(x, y) at each grid point (x, y). This is a finite difference discretization.

Construct a grid:

  - Grid size: m = 4.
  - Grid spacing: h = 1/(m+1) = 1/5.
  - Grid coordinates (boundary excluded), in lexicographic order:
      (x_1, y_1), (x_1, y_2), (x_1, y_3), (x_1, y_4),
      (x_2, y_1), (x_2, y_2), (x_2, y_3), (x_2, y_4),
      (x_3, y_1), (x_3, y_2), (x_3, y_3), (x_3, y_4),
      (x_4, y_1), (x_4, y_2), (x_4, y_3), (x_4, y_4).
  - Right hand side: f at these points, denoted f_{1,1}, f_{1,2}, ..., f_{4,4}.

Our goal: solve for the unknowns u_{1,1}, u_{1,2}, ..., u_{4,4}.

Inside (0,1) x (0,1), we enforce Equation (2.3.19). The 2nd derivatives:

    u_xx(x, y) = lim_{h -> 0} [ u(x-h, y) - 2u(x, y) + u(x+h, y) ] / h^2,   (2.3.20)
    u_yy(x, y) = lim_{h -> 0} [ u(x, y-h) - 2u(x, y) + u(x, y+h) ] / h^2.   (2.3.21)

On the grid, approximate them by

    u_xx(x_i, y_j) ≈ [ u(x_i - h, y_j) - 2u(x_i, y_j) + u(x_i + h, y_j) ] / h^2
                   = ( u_{i-1,j} - 2u_{i,j} + u_{i+1,j} ) / h^2,            (2.3.22)
    u_yy(x_i, y_j) ≈ [ u(x_i, y_j - h) - 2u(x_i, y_j) + u(x_i, y_j + h) ] / h^2
                   = ( u_{i,j-1} - 2u_{i,j} + u_{i,j+1} ) / h^2.            (2.3.23)

PDE (2.3.19) can then be approximated by

    ( -u_{i-1,j} + 2u_{i,j} - u_{i+1,j} ) / h^2 + ( -u_{i,j-1} + 2u_{i,j} - u_{i,j+1} ) / h^2 = f_{i,j},
    i, j = 1, 2, 3, 4.                                                      (2.3.24)

This gives a linear system A u = f, with u = (u_{1,1}, u_{1,2}, u_{1,3}, u_{1,4}, u_{2,1}, ..., u_{4,4})^T and (zeros shown as dots)

    A = (1/h^2) *
        [  4 -1  .  . -1  .  .  .  .  .  .  .  .  .  .  . ]
        [ -1  4 -1  .  . -1  .  .  .  .  .  .  .  .  .  . ]
        [  . -1  4 -1  .  . -1  .  .  .  .  .  .  .  .  . ]
        [  .  . -1  4  .  .  . -1  .  .  .  .  .  .  .  . ]
        [ -1  .  .  .  4 -1  .  . -1  .  .  .  .  .  .  . ]
        [  . -1  .  . -1  4 -1  .  . -1  .  .  .  .  .  . ]
        [  .  . -1  .  . -1  4 -1  .  . -1  .  .  .  .  . ]
        [  .  .  . -1  .  . -1  4  .  .  . -1  .  .  .  . ]
        [  .  .  .  . -1  .  .  .  4 -1  .  . -1  .  .  . ]
        [  .  .  .  .  . -1  .  . -1  4 -1  .  . -1  .  . ]
        [  .  .  .  .  .  . -1  .  . -1  4 -1  .  . -1  . ]
        [  .  .  .  .  .  .  . -1  .  . -1  4  .  .  . -1 ]
        [  .  .  .  .  .  .  .  . -1  .  .  .  4 -1  .  . ]
        [  .  .  .  .  .  .  .  .  . -1  .  . -1  4 -1  . ]
        [  .  .  .  .  .  .  .  .  .  . -1  .  . -1  4 -1 ]
        [  .  .  .  .  .  .  .  .  .  .  . -1  .  . -1  4 ]

    f = ( f_{1,1} + g_{0,1}/h^2 + g_{1,0}/h^2,  f_{1,2} + g_{0,2}/h^2,  f_{1,3} + g_{0,3}/h^2,  f_{1,4} + g_{0,4}/h^2 + g_{1,5}/h^2,
          f_{2,1} + g_{2,0}/h^2,  f_{2,2},  f_{2,3},  f_{2,4} + g_{2,5}/h^2,
          f_{3,1} + g_{3,0}/h^2,  f_{3,2},  f_{3,3},  f_{3,4} + g_{3,5}/h^2,
          f_{4,1} + g_{5,1}/h^2 + g_{4,0}/h^2,  f_{4,2} + g_{5,2}/h^2,  f_{4,3} + g_{5,3}/h^2,  f_{4,4} + g_{5,4}/h^2 + g_{4,5}/h^2 )^T.   (2.3.25)

For general m, the discretization of Equation (2.3.19) gives rise to a linear system

    A u = f,                                                   (2.3.26)

where A has the same five-point pattern, now of size m^2 x m^2: each row has 4/h^2 on the diagonal and -1/h^2 in the columns of its horizontal neighbours (offset ±m) and vertical neighbours (offset ±1, except across block boundaries),

    A ∈ R^(m^2 x m^2).                                         (2.3.27)

For convenience, we write it in block form:

    A = (1/h^2) *
        [ B_1  -I                      ]
        [ -I   B_2  -I                 ]
        [      ...   ...    ...        ]   ∈ R^(m^2 x m^2),    (2.3.28)
        [           -I  B_{m-1}  -I    ]
        [                -I      B_m   ]

where each block B_i (i = 1, ..., m) reads

    B_i = [  4  -1               ]
          [ -1   4  -1           ]
          [      ...  ...  ...   ]   ∈ R^(m x m).              (2.3.29)
          [          -1   4  -1  ]
          [              -1   4  ]

Remark 9. By writing the matrix A in block form, it becomes much easier to handle the boundary terms that are moved to the right hand side of the equation: whenever a matrix entry would fall outside a block or outside the full matrix, drop it (its contribution is a known boundary value, moved to the right hand side).

Remark 10. The matrix A is, again, SPD (see supplementary notes).

Algorithm 6  2D discrete Laplacian
 1: for i = 1, ..., m do
 2:   for j = 1, ..., m do
 3:     A_{m(i-1)+j, m(i-1)+j} = 4/h^2
 4:     if i > 1 then
 5:       A_{m(i-1)+j, m(i-2)+j} = -1/h^2
 6:     end if
 7:     if i < m then
 8:       A_{m(i-1)+j, mi+j} = -1/h^2
 9:     end if
10:     if j > 1 then
11:       A_{m(i-1)+j, m(i-1)+j-1} = -1/h^2
12:     end if
13:     if j < m then
14:       A_{m(i-1)+j, m(i-1)+j+1} = -1/h^2
15:     end if
16:   end for
17: end for

2.3.4 Convection-diffusion equation

Consider solving for a steady state of (fluid, gas) particles that undergo both diffusion and convection:

    -u_xx - u_yy + a u_x + b u_y = f,   inside (0,1) x (0,1),
    u = g,   on the boundary of [0,1] x [0,1].                 (2.3.30)

In this case we also need to approximate the first derivatives. Take u_x(x, y) as an example. There are three different possibilities.

Central difference:

    u_x(x, y) = lim_{h -> 0} [ u(x+h, y) - u(x-h, y) ] / (2h).  (2.3.31)

On the grid, approximate it by

    u_x(x_i, y_j) ≈ [ u(x_i + h, y_j) - u(x_i - h, y_j) ] / (2h) = ( u_{i+1,j} - u_{i-1,j} ) / (2h).   (2.3.32)

Forward difference:

    u_x(x, y) = lim_{h -> 0} [ u(x+h, y) - u(x, y) ] / h.      (2.3.33)

On the grid, approximate it by

    u_x(x_i, y_j) ≈ [ u(x_i + h, y_j) - u(x_i, y_j) ] / h = ( u_{i+1,j} - u_{i,j} ) / h.   (2.3.34)

Backward difference:

    u_x(x, y) = lim_{h -> 0} [ u(x, y) - u(x-h, y) ] / h.      (2.3.35)

On the grid, approximate it by

    u_x(x_i, y_j) ≈ [ u(x_i, y_j) - u(x_i - h, y_j) ] / h = ( u_{i,j} - u_{i-1,j} ) / h.   (2.3.36)

For stability reasons, we choose forward/backward differences depending on the signs of a and b. When a > 0 and b > 0, use backward differences for u_x and u_y. Hence, PDE (2.3.30) can be approximated by

    ( -u_{i-1,j} + 2u_{i,j} - u_{i+1,j} ) / h^2 + ( -u_{i,j-1} + 2u_{i,j} - u_{i,j+1} ) / h^2
        + a ( u_{i,j} - u_{i-1,j} ) / h + b ( u_{i,j} - u_{i,j-1} ) / h = f_{i,j}.   (2.3.37)

This gives rise to a linear system

    A u = f,                                                   (2.3.38)

where the full matrix is

    A = (1/h^2) *
        [ B_1        -I                                ]
        [ -(1+ah)I   B_2      -I                       ]
        [             ...      ...       ...           ]       (2.3.39)
        [                  -(1+ah)I   B_{m-1}   -I     ]
        [                             -(1+ah)I   B_m   ]

and the submatrices are

    B_i = [ 4+ah+bh    -1                                  ]
          [ -(1+bh)    4+ah+bh    -1                       ]
          [              ...        ...        ...        ]   (2.3.40)
          [                     -(1+bh)    4+ah+bh    -1   ]
          [                                -(1+bh)  4+ah+bh ]

Exercise 5. When a > 0 and b < 0, for stability reasons we use the backward difference for u_x and the forward difference for u_y:

    ( -u_{i-1,j} + 2u_{i,j} - u_{i+1,j} ) / h^2 + ( -u_{i,j-1} + 2u_{i,j} - u_{i,j+1} ) / h^2
        + a ( u_{i,j} - u_{i-1,j} ) / h + b ( u_{i,j+1} - u_{i,j} ) / h = f_{i,j}.   (2.3.41)

Write down the matrix A.

Remark 11. This course does not require you to know how to discretize PDEs; that is a substantial subject and research topic in its own right (AMATH 342, AMATH 442, AMATH 741 / CS 778). The requirement is that, once a PDE expert tells you how to discretize a PDE, you can write down the resulting linear system Au = f. In general, a discretization of a partial differential equation gives rise to a band system or, more generally, a sparse linear system.
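The sketch below (not from the notes) assembles the 2D Poisson matrix of (2.3.28) with SciPy's sparse Kronecker products and solves it; the helper name laplacian_2d is hypothetical, and the manufactured right hand side matches the test problem (2.3.5)-(2.3.6).

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def laplacian_2d(m):
        """Assemble the m^2 x m^2 2D discrete Laplacian of (2.3.28) using Kronecker products."""
        h = 1.0 / (m + 1)
        I = sp.identity(m)
        T = sp.diags([-np.ones(m - 1), 2 * np.ones(m), -np.ones(m - 1)], [-1, 0, 1])
        # kron(I, T) couples vertical neighbours, kron(T, I) couples horizontal neighbours.
        return (sp.kron(I, T) + sp.kron(T, I)) / h**2

    m = 50
    A = laplacian_2d(m).tocsc()

    # Manufactured example: f = 2 pi^2 sin(pi x) sin(pi y) with zero boundary data,
    # whose exact solution is u = sin(pi x) sin(pi y), as in (2.3.5)-(2.3.6).
    h = 1.0 / (m + 1)
    x = np.linspace(h, 1 - h, m)
    X, Y = np.meshgrid(x, x, indexing='ij')              # lexicographic ordering: x-index outer
    f = (2 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)).ravel()

    u = spsolve(A, f)
    u_exact = (np.sin(np.pi * X) * np.sin(np.pi * Y)).ravel()
    print(np.max(np.abs(u - u_exact)))                   # small O(h^2) discretization error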

2.4 Solving sparse systems (2 lectures)

Reference: [Saad] 3.1-3.3

We have seen that PDE discretization gives rise to a sparse linear system, or more precisely, a band linear system.

2.4.1 LU factorization of band systems

A band system with upper bandwidth q and lower bandwidth p has all of its nonzeros confined to a band around the diagonal:

    a_ij = 0   whenever j > i + q or i > j + p.                (2.4.1)

Example 7 (Band system).

    A = [ 3  1          ]
        [ 2  3  1       ]
        [ 1  2  3  1    ]                                      (2.4.2)
        [    1  2  3  1 ]
        [       1  2  3 ]

We have q = 1, p = 2.

Example 8 (1D Poisson matrix).

    A = (1/h^2) *
        [  2  -1               ]
        [ -1   2  -1           ]
        [      ...  ...  ...   ]                               (2.4.3)
        [          -1   2  -1  ]
        [              -1   2  ]

We have p = q = 1.

Example 9 (2D Poisson matrix). For the 2D Poisson matrix of (2.3.28) we have p = q = m (note that the size of the matrix A is m^2 x m^2).

Theorem 6 (LU factorization of a band system). If A has upper bandwidth q and lower bandwidth p, then for A = LU, U has upper bandwidth q and L has lower bandwidth p.

Algorithm 7  LU factorization for a band system
 1: for k = 1, ..., n-1 do                          // iterate over the columns to be eliminated
 2:   for i = k+1, ..., min(k+p, n) do              // only the rows within the lower bandwidth are nonzero below row k
 3:     mult = a_ik / a_kk                          // multiplier for row i
 4:     a_ik = mult                                 // store the k-th column of the lower triangular matrix
 5:     for j = k+1, ..., min(k+q, n) do            // only the columns within the upper bandwidth need updating
 6:       a_ij = a_ij - mult * a_kj                 // subtract the scaled row k; this builds row i of the upper triangular matrix
 7:     end for
 8:   end for
 9: end for
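A minimal NumPy sketch of Algorithm 7 (not from the notes), checked against the matrix of Example 7; the helper name band_lu is hypothetical, and no pivoting is performed (as in the algorithm above).

    import numpy as np

    def band_lu(A, p, q):
        """Algorithm 7: LU of a band matrix with lower bandwidth p and upper bandwidth q (no pivoting)."""
        A = A.astype(float).copy()
        n = A.shape[0]
        for k in range(n - 1):
            for i in range(k + 1, min(k + p, n - 1) + 1):    # rows within the lower bandwidth
                mult = A[i, k] / A[k, k]
                A[i, k] = mult
                jmax = min(k + q, n - 1)
                A[i, k+1:jmax+1] -= mult * A[k, k+1:jmax+1]  # columns within the upper bandwidth
        return np.tril(A, -1) + np.eye(n), np.triu(A)

    # The band matrix from Example 7 (p = 2, q = 1).
    A = np.array([[3, 1, 0, 0, 0],
                  [2, 3, 1, 0, 0],
                  [1, 2, 3, 1, 0],
                  [0, 1, 2, 3, 1],
                  [0, 0, 1, 2, 3]], dtype=float)

    L, U = band_lu(A, p=2, q=1)
    print(np.allclose(L @ U, A))                    # True
    print(np.allclose(np.tril(L, -3), 0))           # L keeps lower bandwidth 2 (Theorem 6)
    print(np.allclose(np.triu(U, 2), 0))            # U keeps upper bandwidth 1 (Theorem 6)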

Complexity: If n >> p and n >> q, then the computational complexity is approximately 2npq.

Exercise 6. Verify the complexity.

Remark 12. Compared to (2/3) n^3 for generic LU, band LU is much faster!

2.4.2 Issues with sparse systems

Band matrices are only special instances of sparse matrices. Consider more general sparse matrices. Something we can do:

  - Usually there is a constant number of non-zeros per row, i.e. O(n) non-zeros in total.
  - O(n) storage of a sparse matrix: CRS (compressed row storage).
  - In the LU factorization, skip all the zero entries when computing

        a_ij = a_ij - (a_ik / a_kk) a_kj.                      (2.4.5)

However, there are still issues!

Example 10 (Arrow matrix). Consider solving

    Ax = b,                                                    (2.4.6)

where A is an arrow matrix (the first row, the first column and the diagonal are nonzero):

    A = [ x  x  x  x  x ]
        [ x  x          ]
        [ x     x       ]                                      (2.4.7)
        [ x        x    ]
        [ x           x ]

The LU factorization of A:

    A = [ 1             ] [ x  x  x  x  x ]
        [ x  1          ] [    x  x  x  x ]
        [ x  x  1       ] [       x  x  x ]                    (2.4.8)
        [ x  x  x  1    ] [          x  x ]
        [ x  x  x  x  1 ] [             x ]

L and U are dense! Storage: O(n^2). Cost: O(n^3). Bad!

However, if we reorder both the unknowns (the columns of A) and the equations (the rows of A),

    A_P x_P = b_P,                                             (2.4.9)

so that the dense row and column come last, then the factors stay sparse:

    A_P = [ x           x ]   [ 1             ] [ x           x ]
          [    x        x ]   [    1          ] [    x        x ]
          [       x     x ] = [       1       ] [       x     x ]    (2.4.10)
          [          x  x ]   [          1    ] [          x  x ]
          [ x  x  x  x  x ]   [ x  x  x  x  1 ] [             x ]

Example 11 (2D Poisson matrix with m_x < m_y). Consider the Poisson matrix with m_x < m_y grid points in the two directions. The total number of grid points is n = m_x m_y.

  - Ordering the x-axis first, the y-axis second: the bandwidth is m_x, and the computational cost is O(m_x^2 n).
  - Ordering the y-axis first, the x-axis second: the bandwidth is m_y, and the computational cost is O(m_y^2 n).

To summarize:

  - A sparse matrix A can still result in dense L and U.
  - The ordering of the sparse matrix A can dramatically affect the sparsity of the resulting L and U.

2.4.3 Graph representation of matrices

Our goal: reordering helps reduce storage and computational cost. Our tool: the graph representation of matrices.

A sparse matrix A can be represented by a graph: if a_{i,j} != 0, then there is an edge from node i to node j.

Example 12 (Graph representation). A small sparse matrix and its graph: each row/column corresponds to a node, and each nonzero a_{i,j} corresponds to an edge from node i to node j.   (2.4.11)

Example 13 (Graph representation). Graph representation of the 1D and 2D Poisson matrices: the graph of the 1D Poisson matrix is a path (each interior grid point is connected to its two neighbours), and the graph of the 2D Poisson matrix is the grid itself (each interior grid point is connected to its four neighbours).

  - The graph of a matrix with symmetric structure remains unchanged under reordering (only the node labels change).
  - The graph structure often has a physical or geometrical interpretation for the underlying system.

What does Gaussian elimination do to the graph?

Example 14 (Graph representation). Eliminating one node of the graph: the node and its edges are removed, and new edges appear among its former neighbours (fill-in).   (2.4.12)

Gaussian elimination of node i deletes node i and all the edges incident to it, and creates a new edge from j to k within the remaining subgraph whenever there is a fill-in at position (j, k).

2.4.4 Ordering algorithm (I): Cuthill-McKee ordering

Idea:

  - In each row, the fill-in of L only occurs between the first non-zero of the row and the diagonal.
  - Keep this envelope as close to the diagonal as possible.
  - Try to label the nodes such that the labels of neighbouring nodes in the graph are as close to each other as possible.

Algorithm 8  Cuthill-McKee ordering
 1: Pick a starting node and label it 1
 2: for i = 1, ..., n do
 3:   Find all unnumbered neighbours of node i
 4:   Label them in increasing order of degree (smallest first)
 5: end for
 6: Reverse Cuthill-McKee (RCM): relabel node i as node n-i+1, for i = 1, ..., n

The reverse order is better!

Example 15 (Cuthill-McKee ordering). (Figure: a small graph labelled by the Cuthill-McKee ordering and the resulting envelope structure of the reordered matrix.)
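SciPy ships an implementation of this ordering, scipy.sparse.csgraph.reverse_cuthill_mckee. The sketch below (not from the notes) scrambles a 2D Poisson matrix with a random permutation to simulate a bad ordering and then recovers a small bandwidth with RCM; the setup is purely illustrative.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    def bandwidth(M):
        coo = M.tocoo()
        return int(np.max(np.abs(coo.row - coo.col)))

    # 2D Poisson matrix on an m x m grid, as in (2.3.28) (the 1/h^2 factor does not affect sparsity).
    m = 20
    I = sp.identity(m)
    T = sp.diags([-np.ones(m - 1), 2 * np.ones(m), -np.ones(m - 1)], [-1, 0, 1])
    A = (sp.kron(I, T) + sp.kron(T, I)).tocsr()

    # Scramble the unknowns with a random permutation to simulate a "bad" ordering.
    rng = np.random.default_rng(0)
    p_bad = rng.permutation(m * m)
    A_bad = A[p_bad, :][:, p_bad]

    # Reverse Cuthill-McKee brings the nonzeros back close to the diagonal.
    p_rcm = reverse_cuthill_mckee(A_bad, symmetric_mode=True)
    A_rcm = A_bad[p_rcm, :][:, p_rcm]

    print(bandwidth(A_bad), bandwidth(A_rcm))   # the RCM-reordered matrix has a much smaller bandwidth, on the order of m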

Example 16 (Cuthill-McKee ordering). (Figure: a second example graph with its Cuthill-McKee labelling.)

Example 17 (Why is the reversed order better?).
  - CM ordering: 1-g, 2-h, 3-e, 4-b, 5-f, 6-c, 7-j, 8-a, 9-d, 10-i.
  - RCM ordering: 1-i, 2-d, 3-a, 4-j, 5-c, 6-f, 7-b, 8-e, 9-h, 10-g.
  (Figure: the nonzero patterns of the reordered matrix and of its LU factors under both orderings.)

Example 18 (Why is the reversed order better?).
  - CM ordering: 1-A, 2-G, 3-B, 4-C, 5-D, 6-E, 7-F.
  - Reversed CM ordering: 1-F, 2-E, 3-D, 4-C, 5-B, 6-G, 7-A.
  (Figure: the corresponding matrix patterns for both orderings.)

Remark 13. The reverse ordering tends to create a matrix A that is similar to the low-fill, downward-pointing arrow matrix.

Remark 14. RCM does not necessarily produce an optimal ordering. Indeed, producing an optimal ordering is an NP-complete problem.

2.4.5 Ordering algorithm (II-1): Local strategy (optional)

Local strategy, idea: after k steps of Gaussian elimination are done, the first k rows and columns are finished and what remains is the trailing submatrix A^(k), whose first row is row k+1 of the overall matrix (2.4.13). In the worst case, pivoting on an entry of A^(k) fills in the entire submatrix spanned by the other nonzeros of that entry's row and column (2.4.14).

Markowitz products: the worst case fill-in if a_{i,j}^(k) is chosen as the pivot is given by

    ( r_i^(k) - 1 ) ( c_j^(k) - 1 ),                           (2.4.15)

where r_i^(k) (resp. c_j^(k)) is the number of non-zero entries in row i (resp. column j) of A^(k).

Objective: minimize the worst case fill-in for the current k-th step of Gaussian elimination, i.e. find the (i, j) that has the minimum Markowitz product:

    min_{k+1 <= i,j <= n} ( r_i^(k) - 1 ) ( c_j^(k) - 1 ).     (2.4.16)

Implementation of pivoting: pick the a_{i,j}^(k) that has the minimum Markowitz product, and swap it into the top-left position of A^(k).

Example 19 (Markowitz products). Consider a 4x4 sparse matrix whose nonzero entries are labelled a, b, ..., i, and compute the Markowitz product of each nonzero entry.   (2.4.17)

Note that a_44 = i has a Markowitz product of 0. This means that using it as the pivot introduces no fill at all! Hence we pick a_44 and swap it into the top-left position of A, by interchanging rows 1 and 4 and columns 1 and 4.   (2.4.18)

2.4.6 Ordering algorithm (II-2): Minimum degree ordering

Consider the local strategy for the symmetric case. Then

    r_i^(k) = c_i^(k).                                         (2.4.19)

Thus, rather than (2.4.16), it suffices to find

    min_{k+1 <= i <= n} ( r_i^(k) - 1 )                        (2.4.20)

and use a_ii^(k) as the pivot.

Example 20 (Markowitz products for a symmetric matrix). (A symmetric sparse matrix and its Markowitz products.)   (2.4.21)

Graph of this matrix: (Figure.) In fact, evaluating (2.4.20) is equivalent to choosing the node i that has the minimum degree as the pivot!

Minimum degree ordering: at the k-th step, choose the node with the minimum degree.

Algorithm 9  Minimum degree ordering
 1: for k = 1, ..., n do
 2:   Number the node with the (currently) least degree
 3:   Remove the node and its edges
 4:   Add new edges connecting all of its neighbours together   // these correspond to fill-in
 5: end for

Remark 15. Possible strategies for tie breaking:
  - Select the node with the smallest node number in the original ordering.
  - Pre-order with RCM.

Example 21 (Minimum degree ordering). Factor A = LU with the original ordering and with the minimum degree ordering, and compare the fill-in.   (2.4.22)

Original ordering: 1-A, 2-B, 3-C, 4-D, 5-E, 6-F, 7-G.
Minimum degree ordering: 1-A, 2-C, 3-D, 4-E, 5-B, 6-F, 7-G.

    A_P = LU.                                                  (2.4.23)

Remark 16. Minimum degree ordering is a local (greedy) strategy. There is no guarantee that it will produce the globally minimal fill-in.

Example 22 (Not the global minimum fill-in). (Figure: an example where minimum degree ordering does not achieve the least possible fill.)

This is not the end of the story: Matlab's symamd (symmetric approximate minimum degree permutation) can do even better!
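SciPy does not ship symamd itself, but a related fill-reducing ordering (COLAMD, an approximate minimum degree column ordering) is available through its SuperLU interface. The sketch below (not from the notes) compares the fill-in of the LU factors of the 2D Poisson matrix under the natural ordering and under COLAMD.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import splu

    # 2D Poisson matrix on an m x m grid, as in (2.3.28).
    m = 40
    I = sp.identity(m)
    T = sp.diags([-np.ones(m - 1), 2 * np.ones(m), -np.ones(m - 1)], [-1, 0, 1])
    A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()

    # Factor with the natural (lexicographic) ordering and with COLAMD, and compare the fill-in.
    lu_natural = splu(A, permc_spec='NATURAL')
    lu_amd = splu(A, permc_spec='COLAMD')

    print("nnz(A)        =", A.nnz)
    print("fill, natural =", lu_natural.L.nnz + lu_natural.U.nnz)
    print("fill, COLAMD  =", lu_amd.L.nnz + lu_amd.U.nnz)   # typically noticeably smaller than the natural ordering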


2.5 Application: Image denoising (1 lecture)

2.5.1 Image denoising

Images often contain random noise (small errors), which may result from, e.g., the sensors, the capture process, or the conditions under which the image was captured. Often there is enough signal amidst the noise that we can try to recover a version of the image with the noise removed or reduced.

Image denoising: given some observations, reconstruct the source/factors that generated them.

2.5.2 Mathematical formulation

We treat (grayscale) images as 2D scalar functions:

    u_{i,j} = pixel intensity value at row i, column j.

Mathematical formulation: given the observed image u_0 and the true underlying image u, find an approximation of u. To eliminate/reduce the noise in the solution, we want to minimize the total fluctuation of the pixel values, R(u):

    min_u R(u),                                                (2.5.1)

and, at the same time, preserve as much information as possible:

    min_u || u - u_0 ||_2^2 = min_u ∫ | u(x) - u_0(x) |^2 dx.  (2.5.2)

Image denoising is a trade-off between (2.5.1) and (2.5.2). The optimization problem is given by

    min_u ( α R(u) + || u - u_0 ||_2^2 ).                      (2.5.3)

Here α is a user-specified parameter:

  - α -> 0:  u -> u_0 (no denoising).
  - α -> ∞:  u -> constant (everything is smoothed away).

We want something in between. So, how do we characterize the total fluctuation of the pixel values, R(u)?

2.5.3 Attempt 1: Laplacian regularization

Choose

    R(u) = || ∇u ||_2^2 = ∫ | ∇u |^2 dx.                       (2.5.4)

The optimization (2.5.3) becomes

    min_u ( α || ∇u ||_2^2 + || u - u_0 ||_2^2 ).              (2.5.5)

The Euler-Lagrange equation gives us

    -α ∇²u + u - u_0 = 0,                                      (2.5.6)

i.e.

    -α ∇²u + u = u_0.                                          (2.5.7)

This is very similar to the 2D Poisson equation. Using finite differences,

    α ( 4u_{i,j} - u_{i-1,j} - u_{i+1,j} - u_{i,j-1} - u_{i,j+1} ) / h^2 + u_{i,j} = (u_0)_{i,j}.   (2.5.8)

This gives a linear system:

    ( αA + I ) u = u_0.                                        (2.5.9)
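A minimal sketch of (2.5.9) in SciPy (not from the notes), assuming pixel spacing h = 1 and a synthetic test image; the helper name denoise_laplacian and the parameter values are chosen here for illustration only.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def denoise_laplacian(u0, alpha):
        """Solve (alpha*A + I) u = u0, with A the 2D discrete Laplacian (h = 1, simplified boundary handling)."""
        m = u0.shape[0]                      # assume a square m x m image
        I_m = sp.identity(m)
        T = sp.diags([-np.ones(m - 1), 2 * np.ones(m), -np.ones(m - 1)], [-1, 0, 1])
        A = sp.kron(I_m, T) + sp.kron(T, I_m)
        system = (alpha * A + sp.identity(m * m)).tocsc()
        return spsolve(system, u0.ravel()).reshape(m, m)

    # Synthetic test image (illustrative): a smooth bump plus random noise.
    rng = np.random.default_rng(0)
    m = 64
    x = np.linspace(0, 1, m)
    X, Y = np.meshgrid(x, x, indexing='ij')
    clean = np.sin(np.pi * X) * np.sin(np.pi * Y)
    noisy = clean + 0.1 * rng.standard_normal((m, m))

    denoised = denoise_laplacian(noisy, alpha=2.0)
    print(np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean))  # the denoised image is closer to the clean one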

2.5.4 Attempt 2: Total variation regularization

Choose the L^1 norm instead of the L^2 norm:

    R(u) = || ∇u ||_1 = ∫ | ∇u | dx.                           (2.5.10)

The optimization (2.5.3) becomes

    min_u ( α || ∇u ||_1 + || u - u_0 ||_2^2 ).                (2.5.11)

The Euler-Lagrange equation gives us

    -α ∇ · ( (1/|∇u|) ∇u ) + u - u_0 = 0,                      (2.5.12)

i.e.

    -α ∇ · ( (1/|∇u|) ∇u ) + u = u_0.                          (2.5.13)

Remark 17 (How does it work?). The coefficient c in

    -α ∇ · ( c ∇u ) + u = u_0                                  (2.5.14)

characterizes the degree of smoothing. For (2.5.13), the coefficient c = 1/|∇u| depends on the gradients of the solution, so this is a nonlinear PDE.

  - Near an edge: |∇u| is large, so c = 1/|∇u| is small and there is a small degree of smoothing (edges are preserved).
  - In a flat region: |∇u| is small, so c = 1/|∇u| is large and there is a large degree of smoothing (noise is removed).

The previous approach (Laplacian regularization) is roughly the same, but with the constant coefficient c = 1.

We can again apply finite differences and obtain a system of equations

    ( α A(u) + I ) u = u_0.                                    (2.5.15)

Unlike the previous example, the matrix A now depends on u, so this is a nonlinear system of equations. A simple approach to nonlinear equations is fixed point iteration: freeze the coefficients to make the equations linear, solve, update, and repeat.

Algorithm 10  Fixed point iteration for (2.5.15)
 1: Pick u^(0)
 2: for k = 1, 2, ... until convergence do
 3:   Solve ( α A(u^(k-1)) + I ) u^(k) = u_0
 4: end for

Results: (Figures: the noisy input image and the denoised results obtained with Laplacian regularization and with total variation regularization.)
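A rough sketch of Algorithm 10 (not from the notes), under several simplifying assumptions: an anisotropic variant that penalizes |u_x| and |u_y| separately, pixel spacing h = 1, and a small eps added to the gradient magnitude to avoid division by zero. The helper names and parameters are illustrative only.

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import spsolve

    def difference_operators(m):
        """Forward difference operators D_x, D_y on an m x m pixel grid (h = 1), acting on flattened images."""
        D = sp.diags([-np.ones(m - 1), np.ones(m - 1)], [0, 1], shape=(m - 1, m))  # 1D forward difference
        I = sp.identity(m)
        return sp.kron(D, I), sp.kron(I, D)          # d/dx acts on the row index, d/dy on the column index

    def tv_denoise(u0, alpha, iters=20, eps=1e-2):
        """Algorithm 10 (lagged-coefficient fixed point iteration) for (alpha*A(u) + I) u = u0."""
        m = u0.shape[0]
        Dx, Dy = difference_operators(m)
        u = u0.ravel().copy()
        for _ in range(iters):
            gx, gy = Dx @ u, Dy @ u
            cx = sp.diags(1.0 / (np.abs(gx) + eps))  # freeze the coefficients c = 1/(|grad u| + eps)
            cy = sp.diags(1.0 / (np.abs(gy) + eps))
            A = Dx.T @ cx @ Dx + Dy.T @ cy @ Dy      # discrete -div(c grad .)
            u = spsolve((alpha * A + sp.identity(m * m)).tocsc(), u0.ravel())
        return u.reshape(m, m)

    # Illustrative test: a piecewise-constant image plus noise; total variation keeps the edge sharp.
    rng = np.random.default_rng(0)
    m = 64
    clean = np.zeros((m, m)); clean[:, m // 2:] = 1.0
    noisy = clean + 0.1 * rng.standard_normal((m, m))
    denoised = tv_denoise(noisy, alpha=0.2)
    print(np.linalg.norm(denoised - clean) < np.linalg.norm(noisy - clean))   # expected True: the noise level is reduced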
