Review: From problem to parallel algorithm


1 Review: From problem to parallel algorithm
Mathematical formulations of interesting problems abound.
Poisson's equation. Sources: electrostatics, gravity, fluid flow, image processing (!)
Numerical solution: discretize and solve Ax = b.
Many methods, which serve as building blocks for other problems and algorithms.
Jacobi's method: an easy-to-parallelize iterative method with slow convergence.

2 Recall: Algorithms for 2-D (3-D) Poisson, N = n^2 (= n^3)

Algorithm        | Serial            | PRAM                     | Memory            | # procs
-----------------|-------------------|--------------------------|-------------------|------------
Dense LU         | N^3               | N                        | N^2               | N^2
Band LU          | N^2 (N^{7/3})     | N                        | N^{3/2} (N^{5/3}) | N (N^{4/3})
Jacobi           | N^2 (N^{5/3})     | N (N^{2/3})              | N                 | N
Explicit inverse | N^2               | log N                    | N^2               | N^2
Conj. grad.      | N^{3/2} (N^{4/3}) | N^{1/2} (N^{1/3}) log N  | N                 | N
RB SOR           | N^{3/2} (N^{4/3}) | N^{1/2} (N^{1/3})        | N                 | N
Sparse LU        | N^{3/2} (N^2)     | N^{1/2}                  | N log N (N^{4/3}) | N
FFT              | N log N           | log N                    | N                 | N
Multigrid        | N                 | log^2 N                  | N                 | N
Lower bound      | N                 | log N                    | N                 |

Entries in parentheses are the 3-D (N = n^3) costs. PRAM = idealized parallel model with zero communication cost. Source: Demmel (1997)

3 Recall: 2-D Poisson equation: graph and stencil
[Figure: the n x n grid graph with the 5-point stencil, and the corresponding sparse matrix T (4 on the diagonal, -1 for each grid neighbor)]

4 Recall: Jacobi's method is easy to parallelize
Parallelism: update all points independently.
Block partition the domain: n^2/p elements per block.
Communicate at boundaries: n/p values per neighbor; small if n >> p.
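To make the update rule concrete, here is a minimal serial sketch of one Jacobi sweep (Python/NumPy; the array conventions and function name are mine, not the lecture's). A parallel version would give each processor one block of u and exchange boundary rows/columns between sweeps.

```python
import numpy as np

def jacobi_sweep(u, f, h):
    """One Jacobi sweep for the 5-point discrete Poisson problem.

    u : (n+2) x (n+2) array including boundary values
    f : (n+2) x (n+2) array of source-term values
    h : mesh spacing
    Every interior point is computed from the *old* values only,
    so all n^2 updates are independent (perfectly parallel).
    """
    u_new = u.copy()
    u_new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:] +
                                h**2 * f[1:-1, 1:-1])
    return u_new
```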

5 I: More Poisson. II: Performance metrics and models
Prof. Richard Vuduc, Georgia Institute of Technology
CSE/CS 8803 PNA, Spring 2008 [L.04]
Thursday, January 17, 2008

6 Sources for today's material
CS 267 (Yelick & Demmel, UCB)
Sourcebook, eds. Dongarra et al.
Mike Heath at UIUC
"Intro to the CG method w/o the agonizing pain," by Jonathan Shewchuk (UCB)

7 Algorithms for 2-D (3-D) Poisson, N = n^2 (= n^3)
[Same table as slide 2; repeated for reference. Source: Demmel (1997)]

8 Can we speed up the rate at which information propagates?
[Figure: four panels comparing the right-hand side, the true solution, 5 steps of Jacobi, and 5 steps of another technique]

9 Can we speed up info propagation?
Jacobi:
u^{t+1}_{i,j} = (1/4) ( u^t_{i-1,j} + u^t_{i+1,j} + u^t_{i,j-1} + u^t_{i,j+1} + h^2 f_{i,j} )
If processing in lexicographic order, we can use the most recent values:
u^{t+1}_{i,j} = (1/4) ( u^{t+1}_{i-1,j} + u^t_{i+1,j} + u^{t+1}_{i,j-1} + u^t_{i,j+1} + h^2 f_{i,j} )
This is the Gauss-Seidel algorithm.
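A lexicographic Gauss-Seidel sweep in the same sketch style (same assumed array conventions as the Jacobi snippet above): updating u in place is exactly what lets the new values at (i-1, j) and (i, j-1) feed into later points of the same sweep.

```python
def gauss_seidel_sweep(u, f, h):
    """One lexicographic Gauss-Seidel sweep, updating u in place.

    By the time (i, j) is visited, points (i-1, j) and (i, j-1)
    already hold their step-(t+1) values, so they are used directly.
    """
    n = u.shape[0] - 2
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                              u[i, j - 1] + u[i, j + 1] +
                              h**2 * f[i, j])
    return u
```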

10 Red-black Gauss-Seidel
Alternately update the red and black subsets.
General graphs? Not much improvement.
Convergence: (only) 2x faster.
PRAM: 2x parallel steps.
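A sketch of one red-black sweep under the same conventions. Each half-sweep touches only points of one color, whose stencils read only the other color, so each half-sweep is as parallel as Jacobi.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep, updating u in place.

    Red points (i + j even) are updated first from black neighbors;
    black points then see the freshly updated red values.
    """
    n = u.shape[0] - 2
    for parity in (0, 1):          # 0 = red half-sweep, 1 = black
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if (i + j) % 2 == parity:
                    u[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                                      u[i, j - 1] + u[i, j + 1] +
                                      h**2 * f[i, j])
    return u
```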

11 Successive overrelaxation (SOR)
Rewrite Jacobi as the original value plus a correction:
u^{t+1}_{i,j} = u^t_{i,j} + Δ_{i,j}
If the correction is a good direction, accelerate by a relaxation factor ω > 1:
u^{t+1}_{i,j} = u^t_{i,j} + ω Δ_{i,j}
Red-black SOR: alternately apply the following to the red and black subsets:
u^{t+1}_{i,j} = (1 − ω) u^t_{i,j} + (ω/4) ( u^t_{i-1,j} + u^t_{i+1,j} + u^t_{i,j-1} + u^t_{i,j+1} + h^2 f_{i,j} )

12 Red-black SOR
Red-black SOR: alternately apply the following to the red and black subsets:
u^{t+1}_{i,j} = (1 − ω) u^t_{i,j} + (ω/4) ( u^t_{i-1,j} + u^t_{i+1,j} + u^t_{i,j-1} + u^t_{i,j+1} + h^2 f_{i,j} )
Can show, for Poisson, that the error is minimized when [Demmel (1997)]:
1 < ω = 2 / (1 + sin(π/(N+1))) < 2
Can also show the number of steps to converge is O(n), vs. Jacobi's O(n^2).
Serial complexity: O(n^3 = N^{3/2}), vs. Jacobi's O(n^4 = N^2). [PRAM?]
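A sketch of the red-black SOR update with the slide's relaxation factor. Reading the slide's N in the ω formula as the grid dimension is my assumption; everything else follows the update formula above.

```python
import numpy as np

def sor_sweep(u, f, h, omega):
    """One red-black SOR sweep, updating u in place.

    Each point is relaxed toward its Gauss-Seidel value:
    u_new = (1 - omega) * u_old + omega * (Gauss-Seidel update).
    """
    n = u.shape[0] - 2
    for parity in (0, 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if (i + j) % 2 == parity:
                    gs = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                                 u[i, j - 1] + u[i, j + 1] +
                                 h**2 * f[i, j])
                    u[i, j] = (1 - omega) * u[i, j] + omega * gs
    return u

n = 31                                          # grid dimension (assumed reading of N)
omega = 2.0 / (1.0 + np.sin(np.pi / (n + 1)))   # the slide's optimal relaxation factor
```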

13 Algorithms for 2-D (3-D) Poisson, N = n^2 (= n^3)
[Same table as slide 2; repeated for reference. Source: Demmel (1997)]

14 Ax = b solves a minimization problem
If A is symmetric positive definite, then the quadratic form
φ(x) = (1/2) x^T A x − x^T b
is minimized when Ax = b.
Intuition? Consider
A = [ 3  2 ]      b = [  2 ]
    [ 2  6 ]          [ -8 ]
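A quick numerical check (NumPy) that the minimizer of φ is the solution of Ax = b for this example. The −8 entry of b follows Shewchuk's worked example, which this lecture cites; the original transcription lost the sign.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])

def phi(x):
    """Quadratic form phi(x) = 1/2 x^T A x - x^T b."""
    return 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)   # minimizer; equals [2, -2]
print(x_star, phi(x_star))
# Any perturbation increases phi, since A is positive definite:
print(phi(x_star + np.array([0.1, -0.1])) > phi(x_star))   # True
```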

15 φ(x) = (1/2) x^T A x − x^T b
[Figure: surface plot of the quadratic form φ]

16 φ(x) = (1/2) x^T A x − x^T b
[Figure: contour plot of φ; each curve is a set of constant φ]

17 [Figure: quadratic-form surfaces for four cases: positive-definite, negative-definite, singular (positive-indefinite), and indefinite A]

18 Why the quadratic form?
φ(x) = (1/2) x^T A x − x^T b
Consider the error between the approximate and true solutions at step k of some method:
e = x_approx − x
‖e‖_A^2 = e^T A e
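The link the slide is hinting at takes one line to write out: if x* solves Ax* = b and x = x* + e, then, using the symmetry of A,

φ(x* + e) = (1/2)(x* + e)^T A (x* + e) − (x* + e)^T b = φ(x*) + (1/2) e^T A e,

so minimizing φ over x is exactly minimizing the squared A-norm of the error.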

19 Some additional observations about this minimization problem
In general, an iterative numerical optimization method has the form
x_{k+1} = x_k + α s_k
Choose α to minimize φ(x_k + α s_k).
The negative gradient is the residual vector:
−∇φ(x) = b − Ax ≡ r
Can show analytically that
α = (r_k^T s_k) / (s_k^T A s_k)
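A minimal steepest-descent sketch (Python/NumPy) that instantiates these formulas with s_k = r_k; the tolerance and iteration cap are arbitrary choices of mine.

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-8, max_iter=1000):
    """Solve Ax = b (A symmetric positive definite) by steepest descent.

    Each step moves along s_k = r_k with the exact line-search step
    alpha = (r_k^T s_k) / (s_k^T A s_k) from the slide.
    """
    x = x0.copy()
    for _ in range(max_iter):
        r = b - A @ x                    # residual = negative gradient of phi
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))  # s_k = r_k here
        x = x + alpha * r
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(steepest_descent(A, b, np.zeros(2)))   # approx. [2, -2]
```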

20 Conjugate gradient algorithm for solving linear systems
[Figure: CG pseudocode; the dominant kernel in each iteration is a (sparse) matrix-vector multiply]
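Since the slide's pseudocode survives only as an image, here is a standard unpreconditioned CG sketch in the same NumPy style, with the single matrix-vector product per iteration marked; this is textbook CG, not necessarily the exact variant on the slide.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, max_iter=None):
    """Unpreconditioned CG for Ax = b, A symmetric positive definite."""
    x = x0.copy()
    r = b - A @ x
    s = r.copy()                      # first search direction
    rr = r @ r
    for _ in range(max_iter or len(b)):
        As = A @ s                    # the (sparse) matrix-vector multiply
        alpha = rr / (s @ As)
        x = x + alpha * s
        r = r - alpha * As
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        s = r + (rr_new / rr) * s     # keep search directions A-conjugate
        rr = rr_new
    return x
```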

21 Refer to the Templates book for a broad survey of iterative linear solvers
[Decision tree from Templates, roughly:]
Is A symmetric?
  No: is A^T available?
    Yes: try QMR
    No: is storage expensive?
      No: try GMRES
      Yes: try CGS or Bi-CGSTAB
  Yes: is A definite?
    No: try MINRES or CG
    Yes: are eigenvalues known?
      No: try CG
      Yes: try CG or Chebyshev

22 Algorithms for 2-D (3-D) Poisson, N = n^2 (= n^3)
[Same table as slide 2; repeated for reference. Source: Demmel (1997)]

23 Administrivia

24 Administrative stuff
Accounts: apparently, you already have them or will soon (!)
Try logging into warp1 with your UNIX account password.
If it doesn't work, go see the TSO Help Desk (and good luck!): CCB 148 / M-F 7a-5p / AIM: tsohlpdsk
Summer internships at national and industrial research labs

25 Metrics and models of efficiency and scalability

26 Outline
Parallel efficiency: effectiveness of a parallel algorithm compared to serial
Scalability: definitions, problem scaling, isoefficiency
Simple models
Slides in this section taken from Heath (UIUC)

27 Parallel efficiency: 4 scenarios
Consider load balance, concurrency, and overhead.

28 (a) Perfect load balance and concurrency

29 (b) Good initial concurrency but poor load balance

30 (c) Good load balance but poor concurrency

31 (d) Good load balance and concurrency, but with overheads

32 Basic definitions

Symbol | Quantity                 | Meaning
-------|--------------------------|---------------------------------------------------
M      | Memory complexity        | Storage for given problem (e.g., words)
W      | Computational complexity | Amount of work for given problem (e.g., flops)
V      | Processor speed          | Ops / time (e.g., flop/s)
T      | Execution time           | Elapsed wallclock (e.g., secs)
C      | Computational cost       | (No. procs) x (exec. time) [e.g., processor-hours]

33 Basic definitions (continued)
[Same table as slide 32.]
Subscripts denote the number of processors used, e.g., T_1 = serial time, W_p = work for p procs.

34 Basic definitions (continued)
[Same table as slide 32.]
Assumptions: M_p ≥ M_1 and W_p ≥ W_1.

35 Quantities may be functions of one another
Write W(M) to indicate that work depends on memory complexity.
Example: multiplying two n x n matrices:
M = O(n^2), W = O(n^3)  ⇒  W = O(M^{3/2})

36 Comments on processor speed, V
Processor speed will depend on M, due to memory hierarchies: V(M) ?= V(N)?
Homogeneous vs. heterogeneous processors: V_p(M) ?= V_1(M)?
Aggregate speed: p · V(M/p), vs. V(M)?

37 Execution time vs. cost
Serial and parallel execution time: work / speed:
T_1 = W_1 / V(M)
T_p = W_p / (p · V(M/p))
Cost = (no. procs) × (execution time):
C_1 ≡ T_1
C_p ≡ p · T_p = W_p / V(M/p)

38 Efficiency and speedup
Efficiency:
E_p ≡ C_1 / C_p = T_1 / (p · T_p) = (W_1 / W_p) · (V(M/p) / V(M))
Speedup:
S_p ≡ T_1 / T_p = p · E_p
Question: when might superlinear speedup occur?

39 Example: Summation using a tree algorithm
Memory usage is the same: M_1 = M_p = n

40 Example: Summation using a tree algorithm
Work: the parallel case does more work, for the intermediate sums:
W_1 ≈ n
W_p ≈ n + p log p

41 Example: Summation using a tree algorithm
Time: assume perfect load balance and concurrency:
T_1 ≈ n
T_p ≈ n/p + log p

42 Example: Summation using a tree algorithm
Cost: (no. procs) × (time):
C_1 ≈ n
C_p ≈ n + p log p

43 Example: Summation using a tree algorithm
Efficiency:
E_p ≡ C_1 / C_p ≈ n / (n + p log p) = 1 / (1 + (p/n) log p)

44 Example: Summation using a tree algorithm
Speedup:
S_p ≡ T_1 / T_p ≈ n / (n/p + log p) = p / (1 + (p/n) log p)
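A tiny evaluation (Python) of the efficiency and speedup formulas just derived, to see the (p/n) log p penalty grow with p; the base-2 logarithm and the sample sizes are illustrative assumptions.

```python
import math

def tree_sum_metrics(n, p):
    """Model efficiency and speedup of tree summation of n numbers on p procs."""
    overhead = (p / n) * math.log2(p)   # the (p/n) log p term
    E = 1.0 / (1.0 + overhead)
    S = p * E
    return E, S

for p in (16, 256, 4096):
    E, S = tree_sum_metrics(n=10**6, p=p)
    print(f"p={p:5d}  efficiency={E:.4f}  speedup={S:.1f}")
```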

45 Parallel scalability
An algorithm is scalable if E_p = Θ(1) as p → ∞.
Why use more processors?
Solve a fixed problem in less time.
Solve a larger problem in the same time (or in any given time).
Obtain sufficient aggregate memory.
Tolerate latency and/or use all available bandwidth (Little's Law).

46 Problem scaling: fixed serial work
Adding more processors eventually hits diminishing returns.
The summation algorithm is not scalable in the fixed-work case:
E_p ≡ C_1 / C_p ≈ n / (n + p log p) = 1 / (1 + (p/n) log p) → 0 as p grows with n fixed.

47 Problem scaling: fixed execution time
Relevant when a strict time limit applies.
T_1 = W_1 / V(M),  T_p = W_p / (p · V(M/p))
The algorithm scales only if work scales linearly with p.
The summation algorithm does not scale in this scenario:
T_1 ≈ n,  T_p ≈ n/p + log p

48 Problem scaling: scaled speedup
Fixed work per processor:
T_1 = W_1 / V(M),  T_p = W_p / (p · V(M/p))
The summation algorithm does not scale in this scenario:
E_p ≈ W_1 / W_p = pn / (pn + p log p) = 1 / (1 + (log p)/n) → 0 as p → ∞.
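The same model in the scaled-speedup regime, with a fixed n numbers per processor; efficiency decays like 1/(1 + (log p)/n), slowly but inevitably toward 0, which the snippet below makes concrete (again with an assumed base-2 log).

```python
import math

def scaled_efficiency(n_per_proc, p):
    """Scaled-speedup efficiency of tree summation: 1 / (1 + log2(p)/n)."""
    return 1.0 / (1.0 + math.log2(p) / n_per_proc)

for p in (16, 1024, 2**20):
    print(p, round(scaled_efficiency(n_per_proc=1000, p=p), 6))
```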

49 Problem scaling (continued)
Other scenarios, next time:
Fixed memory per processor
Fixed accuracy
Fixed efficiency (isoefficiency)

50 In conclusion
