High Performance Multigrid Software
1 High Performance Multigrid Software
Ulrike Meier Yang
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA. Lawrence Livermore National Security, LLC
2 Motivation: Scalable Solvers
[Figure: time to solution vs. number of processors (problem size) for Diag-CG and Multigrid-CG; the Multigrid-CG curve is labeled "scalable".]
- Multigrid solvers are essential components of LLNL simulations
- Multigrid solvers are optimal (O(N) operations)
- Scalable solvers mean faster simulations and better science
- Concerns about scalability on future architectures: higher communication cost, increased parallelism
3 Outline
hypre newest developments:
- Brief overview
- Efforts to reduce communication
- New elasticity interpolation
- Future plans
XBraid:
- Overview
- Compressible Navier-Stokes application
4 The hypre Team
Current: Rob Falgout, Tzanio Kolev, Jacob Schroder, Ulrike Yang
Former: Allison Baker, Chuck Baldwin, Guillermo Castilla, Edmond Chow, Andy Cleary, Noah Elliott, Van Henson, Ellen Hill, David Hysom, Jim Jones, Mike Lambert, Barry Lee, Jeff Painter, Charles Tong, Tom Treadway, Panayot Vassilevski, Deborah Walker
5 The best solver for a given application usually takes advantage of the setting: structured grids, constant coefficients, FE discretization, etc.
Traditional linear solver libraries take in only generic matrix-vector information.
[Diagram: linear system interfaces connect data layouts (structured, composite, block-structured, unstructured, CSR) to linear solvers (PFMG, ..., FAC, ..., Split, ..., MLI, ..., AMG, ...).]
6 Current solver / preconditioner availability via hypre's linear system interfaces
Data layouts: structured, semi-structured, sparse matrix, matrix free
Solver       Struct  SStruct  FEI  IJ
Jacobi       x       x
SMG          x       x
PFMG         x       x
Split                x
SysPFMG              x
FAC                  x
Maxwell              x
AMS, ADS             x        x    x
BoomerAMG            x        x    x
MLI                  x        x    x
ParaSails            x        x    x
Euclid               x        x    x
PILUT                x        x    x
PCG          x       x        x    x
GMRES        x       x        x    x
BiCGSTAB     x       x        x    x
Hybrid       x       x        x    x
7 New features in next hypre release (hypre b expected soon)
New approaches in AMG with reduced communication:
- Non-Galerkin AMG: R. D. Falgout, J. B. Schroder, Non-Galerkin coarse grids for algebraic multigrid, SIAM Journal on Scientific Computing 36 (3), C309-C334.
- (Mult-)Additive AMG: P. S. Vassilevski, U. M. Yang, Reducing communication in algebraic multigrid using additive variants, Numerical Linear Algebra with Applications 21 (2).
Interpolation for elasticity:
- A. H. Baker, T. V. Kolev, U. M. Yang, Improving algebraic multigrid interpolation operators for linear elasticity problems, Numerical Linear Algebra with Applications 17 (2-3).
8 Algebraic Multigrid (AMG)
Iterative method for solving Ax = b, commonly used as a preconditioner.
Setup phase:
- Select coarse grids
- Define interpolation $P^{(m)}$, $m = 1, 2, \ldots$
- Define restriction $R^{(m)} = (P^{(m)})^T$
- Define coarse-grid operators: $A^{(m+1)} = (P^{(m)})^T A^{(m)} P^{(m)}$
Solve phase:
[Diagram: smooth on the finest grid, restrict to the first coarse grid, smooth, and so on; interpolate back up to the finest grid.]
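The setup steps listed on this slide can be illustrated on a tiny problem. The following is a minimal sketch, not hypre code: it assumes a 1D Poisson matrix and linear interpolation with every second point as a coarse point (both chosen here only for illustration), and the helper names `poisson_1d` and `linear_interpolation` are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    """Unscaled 1D Poisson matrix tridiag(-1, 2, -1) (illustrative model problem)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_interpolation(n_coarse):
    """Linear interpolation from n_coarse coarse points to 2*n_coarse + 1 fine points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        c = 2 * j + 1            # fine-grid index of coarse point j
        P[c - 1, j] = 0.5        # left fine neighbor
        P[c, j] = 1.0            # the coarse point itself
        P[c + 1, j] = 0.5        # right fine neighbor
    return P

A = poisson_1d(7)                # A^(m) on the fine grid
P = linear_interpolation(3)      # interpolation P^(m)
R = P.T                          # restriction R^(m) = (P^(m))^T
A_coarse = R @ A @ P             # Galerkin operator A^(m+1) = (P^(m))^T A^(m) P^(m)
print(A_coarse)
```

For this model problem the Galerkin operator is again tridiagonal, namely 0.5 * tridiag(-1, 2, -1).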
9 AMG communication patterns, 128 cores
Performance degradation is caused by increased communication complexity on coarser grids!
10 Non-Galerkin AMG
Goal: replace the standard Galerkin coarse grid matrix, $A^{(m+1)} = (P^{(m)})^T A^{(m)} P^{(m)}$, with a sparser approximation.
Non-Galerkin coarse grid:
- Let $A_g = P^T A P$
- Sparsify $A_g$ to yield $A_c$ for the coarse grid
- Goal: less expensive method, especially in parallel
- Desire good spectral equivalence between $A_g$ and $A_c$
- Heuristic target: $\| I - A_c A_g^{-1} \|_2 \le \theta$
- Can show this implies AMG convergence
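The heuristic target on this slide can be evaluated numerically for small matrices. The sketch below is illustrative only: it uses plain diagonal lumping (dropped entries are added to the diagonal, so row sums are preserved) as a simple stand-in for the sparsification step, which is not claimed to be the authors' algorithm. The test matrix and the names `lump_to_pattern` and `spectral_gap` are assumptions made for this example.

```python
import numpy as np

def lump_to_pattern(A_g, keep):
    """Keep entries of A_g where `keep` is True; add each dropped entry to the
    diagonal of its row so that row sums are preserved (simple lumping)."""
    A_c = np.where(keep, A_g, 0.0)
    dropped = A_g - A_c
    A_c[np.diag_indices_from(A_c)] += dropped.sum(axis=1)
    return A_c

def spectral_gap(A_c, A_g):
    """Return || I - A_c A_g^{-1} ||_2 (dense computation, small matrices only)."""
    n = A_g.shape[0]
    return np.linalg.norm(np.eye(n) - A_c @ np.linalg.inv(A_g), 2)

n = 8
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])

# Synthetic "Galerkin-like" matrix: a tridiagonal part plus weak
# second-neighbor couplings that play the role of unwanted fill.
T = 2.0 * np.eye(n) - (dist == 1).astype(float)
W = (dist == 2).astype(float)
S = np.diag(W.sum(axis=1)) - W
A_g = T + 0.01 * S

A_c = lump_to_pattern(A_g, keep=(dist <= 1))   # restrict to the tridiagonal pattern
theta = spectral_gap(A_c, A_g)
print("nonzeros:", np.count_nonzero(A_g), "->", np.count_nonzero(A_c))
print("theta =", theta)
```

Because the dropped couplings are weak, `theta` stays well below 1 here; stronger dropped couplings would increase it.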
11 Algorithm Outline
Step 1: Choose an appropriate sparsity pattern for the coarse grid matrix.
- Leverage the fine grid matrix graph to reproduce stencil patterns on the coarse grid.
- A row-wise drop tolerance parameter controls the information eliminated from each row.
Step 2: Form $A^{(m+1)} = R^{(m)} A^{(m)} P^{(m)}$ and eliminate unwanted matrix entries through a stencil collapsing approach.
- Preserves important near-nullspace modes and spectral equivalence between the Galerkin and non-Galerkin operators.
[Diagram: remove a symmetric edge, then collapse.]
12 Results: 3D Diffusion
- Timings on Vulcan (IBM BG/Q), scaled up to 131,072 cores
- AMG convergence largely unchanged
- Comparison to best-practices Galerkin AMG
13 Additive AMG
- Originally invented in the 1980s to increase parallelism in multigrid (Greenbaum, 1986; BPX; Bastian, Hackbusch, Wittum, 1998)
- Generally leads to an increased number of iterations
- But: shows potential for reduced communication
14 Perform in parallel
[Diagram: the smooth and solve steps of the multigrid cycle, performed in parallel.]
15 Performance profile of AMG solve cycle for 64 MPI tasks on Hera
[Figure: time vs. processor id, showing computation, idle time, and MPI calls.]
Cannot take advantage of parallelism, but:
- Most communication is generated by coarse grid operators, not by interpolation or restriction
- Too little computation compared to communication on the coarse grid, prohibiting overlap
16 Performance profile of AMG solve cycle for 64 MPI tasks on Hera (continued)
Combine communication.
[Diagram: smooth, smooth, smooth, solve.]
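17 Multiplicative V-cycle and mult-additive cycle
Multiplicative V-cycle:
$x_0 = 0$, $r_0 = b$
For $k = 0, \ldots, l-1$ (seq):
    $x_k = M_k^{-1} r_k$
    $r_{k+1} = (P_{k+1}^{k})^T (r_k - A_k x_k)$
$x_l = M_l^{-1} r_l$
$x_l \leftarrow x_l + M_l^{-T} (r_l - A_l x_l)$
For $k = l-1, \ldots, 0$ (seq):
    $x_k \leftarrow x_k + P_{k+1}^{k} x_{k+1}$
    $x_k \leftarrow x_k + M_k^{-T} (r_k - A_k x_k)$
Mult-additive cycle:
$x_0 = 0$, $r_0 = b$
For $k = 0, \ldots, l-1$ (seq):
    $r_{k+1} = (\bar{P}_{k+1}^{k})^T r_k$
For $k = 0, \ldots, l$ (parallel):
    $x_k = M_k^{-1} r_k$
    $x_k \leftarrow x_k + M_k^{-T} (r_k - A_k x_k)$    (simplified)
For $k = l-1, \ldots, 0$ (seq):
    $x_k \leftarrow x_k + \bar{P}_{k+1}^{k} x_{k+1}$
Both algorithms are equivalent for $\bar{P}_{k+1}^{k} = (I - M_k^{-T} A_k) P_{k+1}^{k}$.
Additive AMG: $\bar{P}_{k+1}^{k} = P_{k+1}^{k}$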
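The equivalence stated on this slide can be checked numerically. The sketch below is an illustration under stated assumptions, not hypre code: it uses a 1D Poisson hierarchy with linear interpolation and Galerkin coarse operators, takes Gauss-Seidel (the lower triangle of A_k) as the smoother M_k, and compares one multiplicative V-cycle with one mult-additive cycle that uses the smoothed interpolation (I - M^{-T} A) P. All function names are hypothetical.

```python
import numpy as np

def poisson_1d(n):
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_interp(n_coarse):
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        c = 2 * j + 1
        P[c - 1, j], P[c, j], P[c + 1, j] = 0.5, 1.0, 0.5
    return P

def build_hierarchy(n_fine, n_coarsenings):
    A, P = [poisson_1d(n_fine)], []
    for _ in range(n_coarsenings):
        P.append(linear_interp((A[-1].shape[0] - 1) // 2))
        A.append(P[-1].T @ A[-1] @ P[-1])          # Galerkin coarse operator
    Minv = [np.linalg.inv(np.tril(Ak)) for Ak in A]  # Gauss-Seidel: M_k = tril(A_k)
    return A, P, Minv

def symmetric_smooth(Ak, Mi, r):
    """x = M^{-1} r, then x <- x + M^{-T} (r - A x)."""
    x = Mi @ r
    return x + Mi.T @ (r - Ak @ x)

def multiplicative_cycle(A, P, Minv, b):
    l = len(P)
    r, x = [b], []
    for k in range(l):                               # sequential downward sweep
        x.append(Minv[k] @ r[k])
        r.append(P[k].T @ (r[k] - A[k] @ x[k]))
    x.append(symmetric_smooth(A[l], Minv[l], r[l]))  # coarsest level
    for k in range(l - 1, -1, -1):                   # sequential upward sweep
        x[k] = x[k] + P[k] @ x[k + 1]
        x[k] = x[k] + Minv[k].T @ (r[k] - A[k] @ x[k])
    return x[0]

def mult_additive_cycle(A, P, Minv, b):
    l = len(P)
    # Smoothed interpolation Pbar = (I - M^{-T} A) P on each level
    Pbar = [(np.eye(A[k].shape[0]) - Minv[k].T @ A[k]) @ P[k] for k in range(l)]
    r = [b]
    for k in range(l):                               # restriction only (sequential)
        r.append(Pbar[k].T @ r[k])
    # Smoothing on all levels is independent and could run in parallel
    x = [symmetric_smooth(A[k], Minv[k], r[k]) for k in range(l + 1)]
    for k in range(l - 1, -1, -1):                   # interpolation only (sequential)
        x[k] = x[k] + Pbar[k] @ x[k + 1]
    return x[0]

A, P, Minv = build_hierarchy(15, 2)                  # grid sizes 15 -> 7 -> 3
b = np.linspace(1.0, 2.0, 15)
x_mult = multiplicative_cycle(A, P, Minv, b)
x_madd = mult_additive_cycle(A, P, Minv, b)
print(np.max(np.abs(x_mult - x_madd)))
```

The two results agree to rounding error because the level matrices are symmetric; the difference between the cycles lies only in where the smoothing work sits relative to the grid transfers.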
18 [Figure panels: multiplicative V-cycle, weighted Jacobi; additive V-cycle, weighted Jacobi; mult-additive V-cycle with interpolation truncated to at most 8 elements per row, weighted Jacobi.]
19 [Figure: solve times in seconds vs. number of cores, one panel for the 7pt problem and one for the 27pt problem; curves: mult-gs, mult-l1j, map8-gal, sp8-gal.]
20 [Figure: solve times in seconds vs. number of cores, one panel for the 7pt problem and one for the 27pt problem; curves: mult-gs, mult-l1j, map8-gal, sp8-gal, mult-ng-gs, mult-ng-l1j, map8-ng, sp8-ng.]
21 Definitions
- point or node: physical point of the grid
- unknown: function being approximated (e.g. component of displacement)
Approaches
- unknown-based: coarsen/interpolate only between variables of the same function (unknown)
- nodal-based: coarsen/interpolate in a nodal or pointwise fashion
- hybrid approach: nodal coarsening / unknown interpolation
22 For effective AMG methods the smooth error vectors should be in the range of the interpolation operator P (the near-null-space should be preserved on all coarse levels).
For systems of PDEs the nullspace can contain more than just constant vectors, such as the rigid body modes (RBMs) in linear elasticity.
Idea: let P be an initial AMG interpolation (any). Augment P to $\tilde{P} = [\,P \;\; Q\,]$ such that $s_F = \tilde{P} \begin{pmatrix} s_C \\ 1 \end{pmatrix}$, where $s_C$ is the coarse grid restriction of $s_F$ and the entries $Q_{ij}$ are built from $P_{ij}$, $s_F$ and $s_C$.
For linear elasticity:
- 2D, 1 new dof at each node: $\tilde{P} = \begin{pmatrix} P_u & 0 & Q_u \\ 0 & P_v & Q_v \end{pmatrix}$
- 3D, 3 new dofs at each node
23 Linear elasticity problem on a two-dimensional beam domain
- The beam is fixed on one side and a volume force is applied
- Here E = 210, ν = 0.3
- Quadratic finite elements, ~51,000 dofs per process
[Figure: time in seconds vs. number of processes for the variants U, H, and H-GM.]
24 Future plans
New release b soon, which will contain:
- communication-reducing approaches
- linear elasticity interpolation (requires adding rigid body modes info)
- various bug fixes and added threading
Add rectangular matrix structure to (semi-)structured interface
Investigate transition to heterogeneous architectures
25 Description of method: R. Falgout, S. Friedhoff, Tz. Kolev, S. MacLachlan, and J. Schroder, Parallel Time Integration with Multigrid, to appear in SIAM J. Sci. Comput.
A fluid dynamics application: R. Falgout, A. Katz, Tz. Kolev, J. Schroder, A. Wissink, and U. M. Yang, Parallel Time Integration with Multigrid Reduction for a Compressible Fluid Dynamics Application, submitted to J. Comp. Phys.
Our team: Veselin Dobrev, Rob Falgout, Tzanio Kolev, Anders Petersson, Jacob Schroder, Ulrike Yang
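26 Consider a system of ODEs of the form
$u'(t) = f(t, u(t)), \quad u(0) = g_0, \quad t \in [0, T].$
Let $t_i = i\,\delta t$, $i = 0, \ldots, N$, with $\delta t = T/N$. A general one-step method is now given by
$u_i = \Phi_i u_{i-1} + g_i, \quad i = 1, \ldots, N.$
Time stepping can also be expressed as an O(N), sequential direct method for the system
$A u = \begin{pmatrix} I & & & \\ -\Phi_1 & I & & \\ & \ddots & \ddots & \\ & & -\Phi_N & I \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \\ \vdots \\ u_N \end{pmatrix} = \begin{pmatrix} g_0 \\ g_1 \\ \vdots \\ g_N \end{pmatrix} = g.$
We propose solving this system iteratively with a multigrid method:
- Extend multigrid reduction (MGR, 1979) to the time dimension
- Coarsens only in time (non-intrusive)
- O(N), highly parallel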
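The relation between time stepping and the global system can be made concrete with a scalar example. The sketch below assumes the linear test equation u' = lam * u discretized with backward Euler, so that every Φ_i is the same scalar `phi`; these choices and all variable names are illustrative assumptions, not part of the slides.

```python
import numpy as np

lam, T, N = -2.0, 1.0, 8
dt = T / N
phi = 1.0 / (1.0 - lam * dt)      # backward Euler propagator for u' = lam * u

g = np.zeros(N + 1)
g[0] = 1.0                        # initial condition u(0) = g_0; no forcing

# Sequential time stepping: u_i = phi * u_{i-1} + g_i
u_seq = np.empty(N + 1)
u_seq[0] = g[0]
for i in range(1, N + 1):
    u_seq[i] = phi * u_seq[i - 1] + g[i]

# The same computation as one lower bidiagonal system A u = g,
# with I on the diagonal and -phi on the subdiagonal
A = np.eye(N + 1) - phi * np.eye(N + 1, k=-1)
u_sys = np.linalg.solve(A, g)

print(np.max(np.abs(u_seq - u_sys)))
```

Forward substitution on A reproduces the time-stepping loop exactly, which is why the sequential method costs O(N) but offers no parallelism in time.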
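27 [Diagram: fine time grid $t_0, t_1, t_2, t_3, \ldots, t_N$ with spacing $\delta t$ and coarse time grid $T_0, T_1, \ldots$ with spacing $\Delta T = m\,\delta t$. F-points exist on the fine grid only; C-points form the coarse grid.]
Relaxation is highly parallel:
- Alternates between F-points and C-points
- F-point relaxation = propagation of the C-point value across a coarse time interval
Define the coarse-grid system with $N_\Delta = N/m$ grid points:
$A_\Delta u_\Delta = \begin{pmatrix} I & & & \\ -\Phi_{\Delta,1} & I & & \\ & \ddots & \ddots & \\ & & -\Phi_{\Delta,N_\Delta} & I \end{pmatrix} \begin{pmatrix} u_{\Delta,0} \\ u_{\Delta,1} \\ \vdots \\ u_{\Delta,N_\Delta} \end{pmatrix} = \begin{pmatrix} g_{\Delta,0} \\ g_{\Delta,1} \\ \vdots \\ g_{\Delta,N_\Delta} \end{pmatrix} = g_\Delta,$
where $\Phi_{\Delta,i}$ should be at least as cheap to apply as $\Phi_i$.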
28 Two-grid algorithm (FAS form)
1. Apply FCF-relaxation to $A(u) = g$.
2. Restrict the fine grid approximation and residual to the coarse grid: $u_{\Delta,i} \leftarrow u_{mi}$, $r_{\Delta,i} \leftarrow g_{mi} - (A(u))_{mi}$, $i = 0, \ldots, N_\Delta$.
3. Solve $A_\Delta(v_\Delta) = A_\Delta(u_\Delta) + r_\Delta$.
4. Compute the coarse grid error approximation $e_\Delta = v_\Delta - u_\Delta$.
5. Add the error to the values of u at the C-points: $u_{mi} \leftarrow u_{mi} + e_{\Delta,i}$.
6. Correct u by applying an F-relaxation step.
Apply the procedure recursively to the coarse grid solve (step 3) for a multilevel method.
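The six steps above can be sketched for a scalar linear problem. This is a minimal illustration and not XBraid code: it assumes u' = lam * u with backward Euler on both grids (the coarse propagator `phi_c` is a rediscretization with step m * dt), a number of time steps divisible by m, and hypothetical function names throughout.

```python
import numpy as np

def sequential(phi, g):
    """Forward substitution: u_0 = g_0, u_i = phi * u_{i-1} + g_i."""
    u = np.empty_like(g)
    u[0] = g[0]
    for i in range(1, len(g)):
        u[i] = phi * u[i - 1] + g[i]
    return u

def f_relax(u, g, phi, m):
    """Propagate each C-point value across the F-points that follow it."""
    N = len(u) - 1
    for c in range(0, N, m):
        for i in range(c + 1, c + m):
            u[i] = phi * u[i - 1] + g[i]

def c_relax(u, g, phi, m):
    """Update each C-point from the F-point just before it."""
    N = len(u) - 1
    u[0] = g[0]
    for c in range(m, N + 1, m):
        u[c] = phi * u[c - 1] + g[c]

def mgrit_two_level(u, g, phi, phi_c, m):
    # 1. FCF-relaxation
    f_relax(u, g, phi, m); c_relax(u, g, phi, m); f_relax(u, g, phi, m)
    # 2. Restrict approximation and residual to the C-points
    cpts = np.arange(0, len(u), m)
    u_c = u[cpts].copy()
    r_c = np.empty_like(u_c)
    r_c[0] = g[0] - u[0]
    r_c[1:] = g[cpts[1:]] - (u[cpts[1:]] - phi * u[cpts[1:] - 1])
    # 3. Solve A_c v = A_c u_c + r_c by forward substitution
    rhs = r_c.copy()
    rhs[0] += u_c[0]
    rhs[1:] += u_c[1:] - phi_c * u_c[:-1]
    v = sequential(phi_c, rhs)
    # 4./5. Coarse error approximation, added at the C-points
    u[cpts] += v - u_c
    # 6. F-relaxation
    f_relax(u, g, phi, m)
    return u

lam, T, N, m = -1.0, 4.0, 64, 4
dt = T / N
phi = 1.0 / (1.0 - lam * dt)          # fine-grid propagator
phi_c = 1.0 / (1.0 - lam * m * dt)    # coarse-grid propagator (assumed choice)
g = np.zeros(N + 1)
g[0] = 1.0
u_exact = sequential(phi, g)

u = np.zeros(N + 1)
errors = []
for _ in range(N // m):
    u = mgrit_two_level(u, g, phi, phi_c, m)
    errors.append(np.max(np.abs(u - u_exact)))
print(errors[:4])
```

In this sketch the iterate is compared against the sequential time-stepping solution; the loop runs for as many iterations as there are coarse intervals, by which point the two agree to rounding error.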
29 Our code (XBraid) is agnostic to spatial decomposition and only parallelizes in time
[Diagram: space-time grids with x (space) and t (time) axes.]
- Serial time stepping: parallelize in space only; store only one time step
- Multigrid-in-time: parallelize in space and time; store several time steps
30 XBraid approach
- Full approximation scheme (FAS) formulation for nonlinear problems
- Non-intrusive approach with unchanged time discretization; the user provides the time integrator
- MGRIT is optimal for simple parabolic problems (implicit and explicit)
- In practice, store and solve one space-time slab at a time
- MGRIT with F-relaxation and two levels is equivalent to parareal
- Two levels still requires a significant sequential solve
- The multigrid perspective proved useful for achieving the additional parallelism of a full multilevel method without sacrificing optimality
- Parallel time integration is only useful beyond some scale: there is a crossover point, but we have already observed speedups around 10x
31 User defines two objects: App and Vector
User also writes several wrapper routines:
- Phi, Init, Clone, Free, Sum, Dot, Write, BufPack, BufUnpack
- Coarsen, Restrict (optional, for spatial coarsening)
Phi(app, tstart, tstop, accuracy, u, &rfactor):
- Advances vector u from time tstart to tstop
- Return value rfactor specifies a requested temporal refinement factor
Code stores only C-points to minimize storage.
Consider relaxation over a processor's portion of the time interval. Each proc starts with the right-most interval to overlap comm/comp:
1) Post receive
2) Compute and send
32 Use the serial Strand2D code***
- Solves compressible Navier-Stokes (nonlinear)
- Problem is unsteady vortex shedding over a cylinder
- Implicit time stepping and a spatial FAS multigrid cycle for implicit solves
- Our tests use backward Euler
- 3rd-order finite-differencing on Strand Grids for efficiency and accuracy
*** A. J. Katz and D. Work, High-order flux correction/finite difference schemes for strand grids, 52nd Aerospace Sciences Meeting, American Institute of Aeronautics and Astronautics, Jan
33 Plot velocity magnitude
[Figure: four solution snapshots at successive times, exhibiting unsteady vortex shedding.]
34 Strand2D code: approximately 13,500 lines
- Not counting library code like LAPACK
We added 129 lines to Strand2D:
- 20 lines to facilitate file output in parallel
- 109 lines to enable restarting the code at a new time for a new state vector
- With little outside help, this took about 3-4 weeks
- If restarting had already been enabled, this would have been much shorter
XBraid wrapper code is about 475 lines:
- Includes main(), command line parsing, etc.
- Important code is much shorter
35 Plot velocity magnitude at the 5120th time step (t = 2.56 s)
After 13 XBraid iterations, accuracy is good.
[Figure panels: iteration 1, iteration 5, iteration 9, iteration 13.]
36 Fix time domain with t_final = 2.56 s, then refine in time
- Linux cluster: Intel Sandy Bridge, InfiniBand QDR interconnect
- XBraid: F-cycles, FCF relaxation, coarsen by 5, relative tol of 1e-5
- Strand: 24,960 d.o.f. spatial mesh
- The 1280 time step case barely resolves unsteady behavior
[Table columns: Num steps | nprocs | XBraid iters | Run time XBraid | Run time serial | Speedup. Surviving run-time entries, one per row: 37 min, 163 min, 655 min.]
39 XBraid added to Strand2D with relative ease
- XBraid enabled running the sequential Strand2D code on up to 4096 cores
- Achieved speedup up to ~8
- Further theoretical investigation
- Add more features to the code
- Apply XBraid to an unsteady 3D CFD production code to increase parallelism
40 This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA. Partial support for this work was provided through the Scientific Discovery through Advanced Computing (SciDAC) program funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (and Basic Energy Sciences/Biological and Environmental Research/High Energy Physics/Fusion Energy Sciences/Nuclear Physics) and by the Applied Mathematics Program, DOE ASCR.
More informationDiscretization of PDEs and Tools for the Parallel Solution of the Resulting Systems
Discretization of PDEs and Tools for the Parallel Solution of the Resulting Systems Stan Tomov Innovative Computing Laboratory Computer Science Department The University of Tennessee Wednesday April 4,
More informationAn Introduction to Algebraic Multigrid (AMG) Algorithms Derrick Cerwinsky and Craig C. Douglas 1/84
An Introduction to Algebraic Multigrid (AMG) Algorithms Derrick Cerwinsky and Craig C. Douglas 1/84 Introduction Almost all numerical methods for solving PDEs will at some point be reduced to solving A
More informationComparison of V-cycle Multigrid Method for Cell-centered Finite Difference on Triangular Meshes
Comparison of V-cycle Multigrid Method for Cell-centered Finite Difference on Triangular Meshes Do Y. Kwak, 1 JunS.Lee 1 Department of Mathematics, KAIST, Taejon 305-701, Korea Department of Mathematics,
More informationEfficient domain decomposition methods for the time-harmonic Maxwell equations
Efficient domain decomposition methods for the time-harmonic Maxwell equations Marcella Bonazzoli 1, Victorita Dolean 2, Ivan G. Graham 3, Euan A. Spence 3, Pierre-Henri Tournier 4 1 Inria Saclay (Defi
More informationc 2015 Society for Industrial and Applied Mathematics
SIAM J. SCI. COMPUT. Vol. 37, No. 2, pp. C169 C193 c 2015 Society for Industrial and Applied Mathematics FINE-GRAINED PARALLEL INCOMPLETE LU FACTORIZATION EDMOND CHOW AND AFTAB PATEL Abstract. This paper
More informationDivergence Formulation of Source Term
Preprint accepted for publication in Journal of Computational Physics, 2012 http://dx.doi.org/10.1016/j.jcp.2012.05.032 Divergence Formulation of Source Term Hiroaki Nishikawa National Institute of Aerospace,
More informationNewton-Krylov-Schwarz Method for a Spherical Shallow Water Model
Newton-Krylov-Schwarz Method for a Spherical Shallow Water Model Chao Yang 1 and Xiao-Chuan Cai 2 1 Institute of Software, Chinese Academy of Sciences, Beijing 100190, P. R. China, yang@mail.rdcps.ac.cn
More informationBootstrap AMG. Kailai Xu. July 12, Stanford University
Bootstrap AMG Kailai Xu Stanford University July 12, 2017 AMG Components A general AMG algorithm consists of the following components. A hierarchy of levels. A smoother. A prolongation. A restriction.
More informationDistributed Memory Parallelization in NGSolve
Distributed Memory Parallelization in NGSolve Lukas Kogler June, 2017 Inst. for Analysis and Scientific Computing, TU Wien From Shared to Distributed Memory Shared Memory Parallelization via threads (
More informationLinear Solvers. Andrew Hazel
Linear Solvers Andrew Hazel Introduction Thus far we have talked about the formulation and discretisation of physical problems...... and stopped when we got to a discrete linear system of equations. Introduction
More informationOptimizing Runge-Kutta smoothers for unsteady flow problems
Optimizing Runge-Kutta smoothers for unsteady flow problems Philipp Birken 1 November 24, 2011 1 Institute of Mathematics, University of Kassel, Heinrich-Plett-Str. 40, D-34132 Kassel, Germany. email:
More informationToward black-box adaptive domain decomposition methods
Toward black-box adaptive domain decomposition methods Frédéric Nataf Laboratory J.L. Lions (LJLL), CNRS, Alpines Inria and Univ. Paris VI joint work with Victorita Dolean (Univ. Nice Sophia-Antipolis)
More informationAdaptive Multigrid for QCD. Lattice University of Regensburg
Lattice 2007 University of Regensburg Michael Clark Boston University with J. Brannick, R. Brower, J. Osborn and C. Rebbi -1- Lattice 2007, University of Regensburg Talk Outline Introduction to Multigrid
More informationAdvanced numerical methods for nonlinear advectiondiffusion-reaction. Peter Frolkovič, University of Heidelberg
Advanced numerical methods for nonlinear advectiondiffusion-reaction equations Peter Frolkovič, University of Heidelberg Content Motivation and background R 3 T Numerical modelling advection advection
More informationCase Study: Quantum Chromodynamics
Case Study: Quantum Chromodynamics Michael Clark Harvard University with R. Babich, K. Barros, R. Brower, J. Chen and C. Rebbi Outline Primer to QCD QCD on a GPU Mixed Precision Solvers Multigrid solver
More informationNEWTON-GMRES PRECONDITIONING FOR DISCONTINUOUS GALERKIN DISCRETIZATIONS OF THE NAVIER-STOKES EQUATIONS
NEWTON-GMRES PRECONDITIONING FOR DISCONTINUOUS GALERKIN DISCRETIZATIONS OF THE NAVIER-STOKES EQUATIONS P.-O. PERSSON AND J. PERAIRE Abstract. We study preconditioners for the iterative solution of the
More informationProject 4: Navier-Stokes Solution to Driven Cavity and Channel Flow Conditions
Project 4: Navier-Stokes Solution to Driven Cavity and Channel Flow Conditions R. S. Sellers MAE 5440, Computational Fluid Dynamics Utah State University, Department of Mechanical and Aerospace Engineering
More informationRobust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations
Robust Preconditioned Conjugate Gradient for the GPU and Parallel Implementations Rohit Gupta, Martin van Gijzen, Kees Vuik GPU Technology Conference 2012, San Jose CA. GPU Technology Conference 2012,
More informationTime-dependent Dirichlet Boundary Conditions in Finite Element Discretizations
Time-dependent Dirichlet Boundary Conditions in Finite Element Discretizations Peter Benner and Jan Heiland November 5, 2015 Seminar Talk at Uni Konstanz Introduction Motivation A controlled physical processes
More informationKasetsart University Workshop. Multigrid methods: An introduction
Kasetsart University Workshop Multigrid methods: An introduction Dr. Anand Pardhanani Mathematics Department Earlham College Richmond, Indiana USA pardhan@earlham.edu A copy of these slides is available
More informationADAPTIVE ALGEBRAIC MULTIGRID
ADAPTIVE ALGEBRAIC MULTIGRID M. BREZINA, R. FALGOUT, S. MACLACHLAN, T. MANTEUFFEL, S. MCCORMICK, AND J. RUGE Abstract. Efficient numerical simulation of physical processes is constrained by our ability
More informationParallelism in FreeFem++.
Parallelism in FreeFem++. Guy Atenekeng 1 Frederic Hecht 2 Laura Grigori 1 Jacques Morice 2 Frederic Nataf 2 1 INRIA, Saclay 2 University of Paris 6 Workshop on FreeFem++, 2009 Outline 1 Introduction Motivation
More informationAcceleration of Time Integration
Acceleration of Time Integration edited version, with extra images removed Rick Archibald, John Drake, Kate Evans, Doug Kothe, Trey White, Pat Worley Research sponsored by the Laboratory Directed Research
More informationRobust and Adaptive Multigrid Methods: comparing structured and algebraic approaches
NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 0000; 00:1 25 Published online in Wiley InterScience www.interscience.wiley.com). Robust and Adaptive Multigrid Methods: comparing
More informationPreconditioning Techniques for Large Linear Systems Part III: General-Purpose Algebraic Preconditioners
Preconditioning Techniques for Large Linear Systems Part III: General-Purpose Algebraic Preconditioners Michele Benzi Department of Mathematics and Computer Science Emory University Atlanta, Georgia, USA
More information6. Multigrid & Krylov Methods. June 1, 2010
June 1, 2010 Scientific Computing II, Tobias Weinzierl page 1 of 27 Outline of This Session A recapitulation of iterative schemes Lots of advertisement Multigrid Ingredients Multigrid Analysis Scientific
More informationAn Efficient Low Memory Implicit DG Algorithm for Time Dependent Problems
An Efficient Low Memory Implicit DG Algorithm for Time Dependent Problems Per-Olof Persson and Jaime Peraire Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A. We present an efficient implicit
More information