IMPLEMENTATION OF A PARALLEL AMG SOLVER Tony Saad May 2005 http://tsaad.utsi.edu - tsaad@utsi.edu
PLAN
INTRODUCTION: 2 min.
MULTIGRID METHODS: 3 min.
PARALLEL IMPLEMENTATION
  PARTITIONING: 1 min.
  RENUMBERING: 1 min.
  THE SOLVER: 3 min.
RESULTS: 3 min.
CONCLUSION: 2 min.
INTRODUCTION
Computational Fluid Dynamics (CFD) is an advanced computing technology developed from traditional fluid mechanics.
It uses a set of numerical methods to solve the PDEs arising from general transport phenomena.
CFD is used to model applications in several fields.
INTRODUCTION
Application areas: aerospace, appliances, sports, environment.
INTRODUCTION
The Finite Volume Method is a very popular method in CFD.
It is based on conservation principles.
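INTRODUCTION
$$\underbrace{\frac{\partial(\rho\phi)}{\partial t}}_{\text{Transient Term}} + \underbrace{\nabla\cdot(\rho\mathbf{u}\phi)}_{\text{Convection Term}} = \underbrace{\nabla\cdot(\Gamma\nabla\phi)}_{\text{Diffusion Term}} + \underbrace{S_\phi}_{\text{Source Term}}$$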
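INTRODUCTION
$$\frac{a_P^{t} + a_P^{CD}}{\mathrm{urf}}\,\phi_P + \sum_{i=NB(P)} a_{N_i}^{CD}\,\phi_{N_i} = Q_P V_P + a_P^{t}\,\phi_P^{0} + \frac{1-\mathrm{urf}}{\mathrm{urf}}\left(a_P^{t} + a_P^{CD}\right)\phi_P^{*}$$
$$a_P\,\phi_P + \sum_{i=NB(P)} a_{N_i}\,\phi_{N_i} = b_P$$
$$a_{N_i} = -\Gamma_f\,\frac{E_f}{d_{PN_i}} - \left\| -\dot{m}_f,\, 0 \right\|$$
$$a_P = a_P^{t} - \sum_{i=NB(P)} a_{N_i} + \sum_{i=nb(P)} \dot{m}_{f_i}$$
$$b_P = Q_P V_P + a_P^{t}\,\phi_P^{0} + \frac{1-\mathrm{urf}}{\mathrm{urf}}\left(a_P^{t} + a_P^{CD}\right)\phi_P^{*} + \sum_{i=nb(P)} \left(\Gamma\nabla\phi\right)_{f_i}\cdot T_{f_i}$$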
INTRODUCTION
Discretization yields a linear system of equations:
$$a_P\,\phi_P + \sum_{i=NB(P)} a_{N_i}\,\phi_{N_i} = b_P \quad\Longleftrightarrow\quad A\phi = b$$
The solution of this system requires an iterative procedure because the coefficients depend on the solution (the problem is non-linear).
Well-known solvers: SOR, ILU.
Their convergence rate slows down as the error becomes smooth.
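SOR
Start with a Gauss-Seidel iterate:
$$\phi_P^{*(k)} = \frac{b_P - \sum_{i=NB(P)} a_{N_i}\,\phi_{N_i}^{(k)}}{a_P}$$
Relax and update the value:
$$\phi_P^{(k)} = \omega\,\phi_P^{*(k)} + (1-\omega)\,\phi_P^{(k-1)}$$
Combined in one step:
$$\phi_P^{(k)} = \omega\,\frac{b_P - \sum_{i=NB(P)} a_{N_i}\,\phi_{N_i}^{(k)}}{a_P} + (1-\omega)\,\phi_P^{(k-1)}$$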
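The two SOR steps can be sketched in a few lines of Python. This is a minimal illustration, not the solver presented in these slides: it assumes a dense NumPy matrix, and the function name `sor`, the relaxation factor 1.2, the stopping tolerance and the small tridiagonal test system are all assumptions chosen for the example.

```python
import numpy as np

def sor(A, b, omega=1.2, tol=1e-10, max_iter=10_000):
    """Solve A x = b by successive over-relaxation; returns (x, sweeps used)."""
    n = len(b)
    x = np.zeros(n)
    for k in range(1, max_iter + 1):
        for p in range(n):
            # Gauss-Seidel iterate: neighbour sum uses the latest available values
            sigma = A[p] @ x - A[p, p] * x[p]
            x_star = (b[p] - sigma) / A[p, p]
            # Relax between the new iterate and the previous value
            x[p] = omega * x_star + (1.0 - omega) * x[p]
        # Stop once the relative residual is small enough
        if np.linalg.norm(b - A @ x) < tol * np.linalg.norm(b):
            return x, k
    return x, max_iter

# Hypothetical test system: 1D diffusion-like tridiagonal matrix
n = 20
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, sweeps = sor(A, b)
```

With `omega = 1` the loop reduces to plain Gauss-Seidel, which makes it easy to compare sweep counts for different relaxation factors.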
MULTIGRID METHODS
Multigrid methods are designed to overcome this problem by deriving a set of coarse grids from the original grid, on which the error is no longer smooth.
Algebraic Multigrid (AMG) is suitable for unstructured grids and the finite volume (FV) formulation.
MULTIGRID METHODS
At each level, a linear system of equations is solved.
The coarse grid linear systems are derived from the finer grids by agglomerating the coefficients:
$$A_{I,J}^{(l+1)} = \sum_{i\in G_I}\sum_{j\in G_J} A_{i,j}^{(l)}, \qquad b_I^{(l+1)} = \sum_{i\in G_I} r_i^{(l)}, \qquad r^{k} = b - A x^{k}$$
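The agglomeration sums can be written as products with a 0/1 restriction matrix. The sketch below assumes dense NumPy arrays and a hypothetical `groups` array that maps each fine cell to its coarse group; the function name `agglomerate` and the pairing of neighbouring cells are illustrative only, since the slides do not say how the groups are chosen.

```python
import numpy as np

def agglomerate(A, r, groups):
    """Build the coarse system from fine matrix A and fine residual r.

    groups[i] is the coarse index I of the group G_I containing fine cell i.
    """
    groups = np.asarray(groups)
    n_fine = len(groups)
    n_coarse = groups.max() + 1
    # R[I, i] = 1 when fine cell i belongs to group G_I
    R = np.zeros((n_coarse, n_fine))
    R[groups, np.arange(n_fine)] = 1.0
    A_coarse = R @ A @ R.T   # sums A[i, j] over i in G_I and j in G_J
    b_coarse = R @ r         # sums the fine residuals over each group
    return A_coarse, b_coarse

# Hypothetical 4-cell example, agglomerated in pairs
A = 2 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
r = np.array([1.0, 2.0, 3.0, 4.0])
A_c, b_c = agglomerate(A, r, [0, 0, 1, 1])
```

For this example the coarse matrix keeps the same tridiagonal pattern as the fine one, which is why the same smoother can be reused at every level.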
PARALLEL IMPLEMENTATION
Why think parallel?
CFD models are becoming computationally expensive.
Parallel computing is the future of scientific computing.
Parallelization requires several steps: partitioning, partition renumbering, and the iterative solver.
PARTITIONING
Partitioning divides the domain into partitions of equal size.
Each partition is then assigned to a different processor.
PARTITION RENUMBERING
Each partition is renumbered locally so as to have an independent problem on each processor.
Each processor solves the same set of equations, but on its own partition.
Coupling between processors is achieved by defining shadow and sender elements at the interface.
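One possible reading of the renumbering step is sketched below. It assumes that a shadow element is a local copy of a cell owned by a neighbouring partition, and that a sender element is an owned cell whose value a neighbour needs; the slides do not define these terms further. The function name `renumber` and the 6-cell chain are hypothetical.

```python
def renumber(partition_of, neighbours, rank):
    """Build the local numbering of one partition.

    partition_of[g] is the partition that owns global cell g;
    neighbours[g] lists the global cells adjacent to g.
    Core cells are numbered first, shadow cells after them.
    """
    core = [g for g, p in enumerate(partition_of) if p == rank]
    local = {g: i for i, g in enumerate(core)}   # global index -> local index
    shadow, senders = [], []
    for g in core:
        for nb in neighbours[g]:
            if partition_of[nb] != rank:
                if nb not in local:
                    # Cell owned elsewhere: append it as a shadow element
                    local[nb] = len(local)
                    shadow.append(nb)
                if g not in senders:
                    # Owned cell on the interface: its value must be sent
                    senders.append(g)
    return local, core, shadow, senders

# Hypothetical 1D chain of 6 cells split into two equal partitions
partition_of = [0, 0, 0, 1, 1, 1]
neighbours = [[j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)]
local0, core0, shadow0, senders0 = renumber(partition_of, neighbours, 0)
local1, core1, shadow1, senders1 = renumber(partition_of, neighbours, 1)
```

Numbering the core cells first keeps the owned unknowns contiguous, so the local matrix rows can be assembled without knowing anything about the other partitions.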
SOLVER SYNCHRONIZATION
As each processor iterates on its system of equations, updates need to be made at the interface.
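The interface update can be emulated in a single process to show the idea. In the sketch below two "processors" each sweep their own half of a small 1D system and copy interface values into each other's shadow entries before every sweep; a real implementation would exchange these values by message passing. The Gauss-Seidel smoother, the 8-cell problem, the fixed count of 500 sweeps and all names are assumptions made for the example.

```python
import numpy as np

def local_sweep(A_loc, b_loc, x_loc, n_core):
    """One Gauss-Seidel sweep over the core rows; shadow entries stay fixed."""
    for p in range(n_core):
        sigma = A_loc[p] @ x_loc - A_loc[p, p] * x_loc[p]
        x_loc[p] = (b_loc[p] - sigma) / A_loc[p, p]

n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
half = n // 2

# Local column ordering: core cells first, the single shadow cell last
cols0 = list(range(half)) + [half]          # partition 0 shadows global cell 4
cols1 = list(range(half, n)) + [half - 1]   # partition 1 shadows global cell 3
A0, b0 = A[:half][:, cols0], b[:half]
A1, b1 = A[half:][:, cols1], b[half:]
x0 = np.zeros(half + 1)
x1 = np.zeros(half + 1)

for sweep in range(500):
    # Interface update: sender values are copied into the neighbour's shadow
    x0[-1] = x1[0]           # global cell 4 -> shadow of partition 0
    x1[-1] = x0[half - 1]    # global cell 3 -> shadow of partition 1
    local_sweep(A0, b0, x0, half)
    local_sweep(A1, b1, x1, half)

# Gather the core values back into global ordering
x = np.concatenate([x0[:half], x1[:half]])
```

Because the shadow values lag one sweep behind, the partitioned iteration behaves like a mix of Jacobi across the interface and Gauss-Seidel inside each partition.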
AMG SYNCHRONIZATION
Synchronizing only the linear solver affects the number of outer iterations, and scalability degrades as a result.
The solution is to parallelize the AMG solver itself.
Each processor performs agglomeration while enforcing the same number of levels across the domain.
Shadow and sender elements are defined for each level.
Updates at the interface are made at each level.
RESULTS
Non-linear diffusion problem on a square domain:
$\nabla\cdot(\Gamma\nabla\phi) = 0$, Γ = φ 0.1
RESULTS
[Figures: Speedup, Solution Time, Efficiency and Shadow-to-core Ratio, plotted against Number of Processors and Number of Partitions (axis label: Maximum shadow/core elements ratio). Series labeled 99,454; 199,222; 298,154; 397,393; reference curve labeled Ideal.]
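The slides do not state how speedup and efficiency were computed. Assuming the usual definitions, with $T_1$ the solution time on one processor and $T_p$ the solution time on $p$ processors (both symbols introduced here for illustration), the quantities would be:

```latex
S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p} = \frac{T_1}{p\,T_p}
```

Under these definitions the ideal curve corresponds to $S_p = p$, that is, $E_p = 1$.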
CONCLUSION
A parallel AMG solver has been implemented and tested using available resources.
Good scalability was achieved.
Further improvement is expected from a better network switch.
Future plans include implementing the solution of the Navier-Stokes equations.
THANK YOU FOR ATTENDING