Introduction to Simulation - Lecture 2 Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy
Outline Reminder about FEM and F-D -D Example Finite Difference Matrices in, 2 and 3D Gaussian Elimination Costs Krylov Method Communication Lower bound Preconditioners based on improving communication
Heat Flow -D Example Normalized -D Equation Normalized Poisson Equation T( x) u( x) κ = h = s x x x 2 2 f ( x) u ( x) = f ( x) xx
FD Matrix properties -D Poisson u = xx Finite Differences f x0 x x2 x uˆ 2uˆ uˆ + + = x j j j xn x n + 2 f( x ) uˆ f( x) 2 0 0 L M M 2 O M M M 0 O O O 0 M = M M O O O M M 0 L 0 2 M M uˆ n f( xn) A 2 j
Using Basis Functions Residual Equation Partial Differential Equation form 2 u = 2 x Basis Function Representation f u(0) = 0 u() = 0 u( x) u ( ) ( ) h x ωi i x { ϕ Plug Basis Function Representation into the Equation d ϕ ( x) R x f x ( ) n = i= n 2 i = ωi + 2 i= dx Basis Functions ( )
Using Basis functions Basis Weights Galerkin Scheme Force the residual to be orthogonal to the basis functions ϕ ( ) ( ) l x R x dt = 0 Generates n equations in n unknowns 0 0 ϕ l ( ) d ( ) n 2 ϕi x ω ( ) i f x dx 2 i= dx x + = 0 l {,..., n}
Using Basis Functions Basis Weights Galerkin with integration by parts Only first derivatives of basis functions ( x) n d ωϕ ( x) i i i= dx ϕ ( x) f ( x) dx = i dx dϕ l dx 0 0 0 l {,..., n}
Structural Analysis of Automobiles Equations Force-displacement relationships for mechanical elements (plates, beams, shells) and sum of forces = 0. Partial Differential Equations of Continuum Mechanics
Drag Force Analysis of Aircraft Equations Navier-Stokes Partial Differential Equations.
Engine Thermal Analysis Equations The Poisson Partial Differential Equation.
FD Matrix properties m x x2 x m + m 2-D Discretized Problem xm x2m Discretized Poisson uj+ 2uj + uj uj+ m 2uj + uj m + = 2 2 442443 x y 442443 u xx u yy f( x ) j
FD Matrix properties 2-D Discretized Problem Matrix Nonzeros, 5x5 example
FD Matrix properties m m 3-D Discretization Discretized Poisson x j m x 2 j m x j x j m x j + x 2 j + m x j + m uˆ 2 2 2ˆ ˆ ˆ ˆ 2 ˆ ˆ 2ˆ ˆ u uj u j+ uj+ uj uj+ m uj+ u + j m j+ m j m + + = 2 2 2 ( x) ( y) ( z) 442443 442443 44424443 u u u xx yy zz f ( x ) j
FD Matrix properties 3-D Discretization Matrix nonzeros, m = 4 example
FD Matrix properties Summary Numerical Properties Matrix is Irreducibly Diagonally Dominant Matrix is symmetric positive definite Assuming uniform discretization, diagonal is Aii Aij j i Each row is strictly diagonally dominant, or path connected to a strictly diagonally dominant row D 2, 2 D 4, 3 D 6, 2 2 2
FD Matrix properties Summary Structural Properties Matrices in 3-D are LARGE 2 2 3 3 D m m, 2 D m m, 3 D m m 00x00x00 grid in 3-D = million x million matrix Matrices are very sparse Nonzeros per row D 3, 2 D 5, 3 D 7 Matrices are banded D A = 0 i j > 2 D A = 0 i j > m 3 D A = 0 i j > m ij ij ij 2
Basics of GE Triangularizing Picture A A A A 2 3 4 A0 A A A A22 23 2 22 23 24 A0 A A A A 32 0 A A24 A33 A 34 3 32 33 34 A0 A0 0A A 43 A 42 A A34 A A 44 A 44 4 42 43 44
GE Basics Triangularizing Algorithm For i = to n- { For each Row For j = i+ to n { For each Row below pivot For k = i+ to n { For each element beyond Pivot } } } A A A A ji jk jk ik Aii Multiplie r Pivot Form n- reciprocals (pivots) Form n i= Perform 2 n ( n i) = 2 n 2 ( n i) n 3 i= multipliers 2 3 Multiply-adds
Complexity of GE On = Om 3 3 D ( ) ( ) 00 pt grid On = Om 3 6 2 D ( ) ( ) 00 0 On = Om 0grid 6 (0 ) ops O( 2 0 ) ops 00 g O(0 ) ops 3 9 8 3 D ( ) ( ) 00 00 rid O For 2- D and 3-D problems Need a Faster Solver!
b Banded GE b NONZEROS 0 Triangularizing Algorithm For i = to n- { For j = i+ to n i+b- { { For k = i+ to n { A A A A i+b- { ji jk jk ik Aii 0 } } } n i= Perform (min(, )) ( ) 2 2 b n i O b n Multiply-adds
Complexity of Banded GE Obn = Om 2 D ( ) ( ) 00 pt grid Obn = Om 2 4 2 D ( ) ( ) 00 00 grid 3 D Obn = Om O(00) ops 2 7 ( ) ( ) 00 00 00 grid O 8 (0 ) ops O 4 (0 ) ops For 3 - D problems Need a Faster Solver!
The World According to Krylov Preconditioning Start with Ax = b, Form PAx = Pb 0 Determine the Krylov Subspace r = Pb PAx 0 0 k Krylov Subspace span r, PAr,..., PA r 0 { ( ) } Select Solution from the Krylov Subspace k k k k x + = x + y, y span r, PAr,..., PA r { ( ) } 0 0 0 0 k GCR picks a residual minimizi ng y.
Krylov Methods Preconditioning Diagonal Preconditioners Let A = D + A nd Apply GCR to ( ) ( ) nd D A x = I + D A x= D b The Inverse of a diagonal is cheap to compute Usually improves convergence
Krylov Methods Convergence Analysis Optimality of GCR poly r GCR Optimality Property th % ( PA) r where % is any k order k + 0 k+ k+ polynomial such that % 0 = Therefore Any polynomial which satisfies the constraints can be used to get an upper bound on k+ ( ) k r r + 0
Residual Poly Picture for Heat Conducting Bar Matrix No loss to air (n=0) ( ) Keep as small as possible: k λi Easier if eigenvalues are clustered!
The World According to Krylov Krylov Vectors for diagonal Preconditioned A x ( digit) 0.9-0.7 0.6-0.5 0.4-0.3 0. exact b 0 = r 0 0 0 0 0 0 D A D A -.5 0 0 0 0 0.25 - -.5 0 0 0 0 0.5 0 0 0.5 0 O 0 O O 0.5 0 0 0.5 444 42444443 D A 0 0.5 M 0 -D Discretized PDE =.25 0.5 M0.5 0
The World According to Krylov Krylov Vectors for Diagonal Preconditioned A Communication Lower Bound x ( digit) 0.9-0.7 0.6-0.5 0.4-0.3 0. exact b 0 = r 0 0 0 0 0 0 D A D A -.5 0 0 0 0 0.25 - -.5 0 0 0 0 Communication Lower Bound for m gridpoints ( ) k 0 th D A r is nonzero in m entry after k = m iters k ( k+ ) 0 j Need at least m iterations for x = x + α jr 0 m j= m
m The World According to Krylov m m 2 Krylov Vectors for Diagonal Preconditioned A Two Dimensional Case For an mxm Grid If r 0 0 = M 0 ( ) Takes 2m= O m iters ( x k + ) for 0 m 2
The World According to Krylov Convergence for GCR Eigenanalysis 0.5 0 0 0.5 0 O 0 O O 0.5 0 0 0.5 444 42444443 D A 0 0.5 M 0 =.25 0.5 M0.5 0 kπ D A = m + Recall Eigenvalues of cos
The World According to Krylov k r r 0 κ Convergence for GCR Eigenanalysis max For D A, cos cos λmin m # Iters for GCR to achieve convergence k λ mπ π = + m+ k 2 κ convergence γ κ + tolerance γ log = 2 κ m log κ + Om ( ) GCR achieves Communication lower bound O(m)!
The World According to Krylov Work for Banded Gaussian Elimination, Relaxation and GCR Dimension Banded GE Sparse GE GCR 2 3 ( ) ( ) ( 2 ) O m O m O m ( 4) ( 3) ( 3) O m O m O m ( 7) ( 6) ( 4) O m O m O m GCR faster than banded GE in 2 and 3 dimensions Could be faster, 3-D matrix only m 3 nonzeros. Must defeat the communication lower bound!
The World According to Krylov How to get Faster Converging GCR Preconditioning is the Only Hope GCR already achieves Communication Lower bound for a diagonally preconditioned A Preconditioner must accelerate communication Multiplying by PA must move values by more than one grid point.
Preconditioning Approaches u 0 (new) û (new) û (old) û 2 (new) û 2 (old) û 3 Gauss-Seidel Preconditioning Physical View -D Discretized PDE Gauss Seidel (new) uˆn (new) uˆn u ˆn+ Each Iteration of Gauss-Seidel Relaxation moves data across the grid
Preconditioning Approaches x ( digit) 0.9-0.7 0.6-0.5 0.4-0.3 0. exact b ( + ) ( + ) D L A D L A b ( + ) ( + ) D L A D L A 0 = r 0 0 0 0 0 0 x x x x x x x x x x x x x x 0 = r 0 0 0 0 0 0 0 0 0 0 0 x x 0 0 0 0 x x x Gauss-Seidel Preconditioning Krylov Vectors Gauss-Seidel communicates quickly in only one direction -D Discretized PDE X=nonzero
Preconditioning Approaches u 0 (new) û (new) û (old) û 2 (new) û 2 (old) û 3 Gauss-Seidel Preconditioning Symmetric Gauss-Seidel (new) uˆn (new) uˆn u ˆn+ (new) uˆn 2 (newer) uˆn ˆ ( new u ) n u (newer) û 0 (newer) û 2 This symmetric Gauss-Seidel Preconditioner Communicates both directions
Preconditioning Approaches Gauss-Seidel Preconditioning Symmetric Gauss-Seidel Derivation of the SGS Iteration Equations Forward Sweep half step : k D + L x + Ux = b ( ) ( ) k+ /2 Backward Sweep half step : x = + D+ U x + Lx = b ( ) ( ) k k+ /2 ( ) ( ) k D U L D + L Ux ( ) ( ) + D+ U D+ U L( + ) k+ ( ) ( ) b D L b k k k x + = x D+ U D D+ L Ax + D + U D D+ L b ( ) ( )
Preconditioning Approaches Grid Block Diagonal Preconditioners Line Schemes Matrix m 2 Tridiagonal Matrices factor quickly
Preconditioning Approaches Grid Block Diagonal Preconditioners Line Schemes Problem Lines preconditioners communicate rapidly in only one direction Solution m 2 Do lines first in x, then in y. The Preconditioner is now two Tridiagonal solves, with variable reordering in between.
Preconditioning Approaches Block Diagonal Preconditioners Domain Decomposition Approach Break the domain into small blocks each with the same # of grid points The trade-off m 2 Fewer blocks means faster convergence, but more costly iterates
Preconditioning Approaches m points 2 l Block index Block Diagonal Preconditioners Domain Decomposition m 2 l m l 2 points m ( 2 B lock cost: factoring l l grids, O m l ), sparse GE l m GCR iters: Communication bound gives O iterations. l 3 Suggests insensitivity t o l: Algorithm is O m. ( ) Do you have to refactor every GCR iteration?
Preconditioning Approaches Grid Siedelerized Block Diagonal Preconditioners Line Schemes Matrix m 2
Preconditioning Approaches Grid Overlapping Domain Preconditioners Line based Schemes Matrix m 2 Bigger systems to solve, but can have faster convergence on tougher problems (not just Poisson).
Preconditioning Approaches i Incomplete Factorization Schemes Outline Reminder about Gaussian Elimination Computational Steps Fill-in for Sparse Matrices Greatly increases factorization cost Fill-in in a 2-D grid Incomplete Factorization Idea
Sparse Matrices Matrix Non zero structure X X X Fill-In Example Matrix after one GE step X X X X X 0 X X X0 X 0 X X X0 X X= Non zero Fill-ins
Sparse Matrices Fill-In Second Example Fill-ins Propagate X X X X X X X0 X0 0 X X X0 0 X X0 X0 Fill-ins from Step result in Fill-ins in step 2
Sparse Matrices Fill-In Pattern of a Filled-in Matrix Very Sparse Very Sparse Dense
Sparse Matrices Fill-In Unfactored Random Matrix
Sparse Matrices Fill-In Factored Random Matrix
Factoring 2-D Finite-Difference matrices Generated Fill-in Makes Factorization Expensive
FD Matrix properties 3-D Discretization Matrix nonzeros, m = 4 example
Preconditioning Approaches i Incomplete Factorization Schemes Key idea THROW AWAY FILL-INS! Throw away all fill-ins Throw away only fill-ins with small values Throw away fill-ins produced by other fill-ins Throw away fill-ins produced by fill-ins of other fill-ins, etc.
Summary 3-D BVP Examples Aerodynamics, Continuum Mechanics, Heat-Flow Finite Difference Matrices in, 2 and 3D Gaussian Elimination Costs Krylov Method Communication Lower bound Preconditioners based on improving communication