Numerical Solution of Weather and Climate Systems


submitted by Sean Buckeridge for the degree of Doctor of Philosophy of the University of Bath, Department of Mathematical Sciences, November 2010

COPYRIGHT

Attention is drawn to the fact that copyright of this thesis rests with its author. A copy of this thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and they must not copy it or use material from it except as permitted by law or with the consent of the author. This thesis may be made available for consultation within the University Library and may be photocopied or lent to other libraries for the purposes of consultation.

Signature of Author ... Sean Buckeridge

Summary

The subject of this thesis is an optimal and scalable parallel geometric multigrid solver for elliptic problems on the sphere, crucial to the forecasting and data assimilation tools used at the UK Met Office. The optimality of multilevel techniques for elliptic problems makes them a suitable choice for these applications. The Met Office uses spherical polar grids which, although structured, have the drawback of creating strong anisotropies near the poles. Moreover, a higher resolution in the radial direction introduces further anisotropies, and so modifications to the standard multigrid relaxation and coarsening procedures are necessary to retain optimal efficiency. Since the strength of anisotropy varies, we propose a non-uniform strategy, coarsening the grid only in regions that are sufficiently isotropic. This is combined with line relaxation in the radial direction.

The inspiration for a non-uniform coarsening strategy comes from algebraic multigrid (AMG) methods, which have already demonstrated the success of this technique. The large setup cost required by AMG, however, means that geometric multigrid methods are typically more efficient. Since all the problems dealt with in this thesis have the convenient feature of a grid-aligned anisotropy, we can exploit this and retain the efficiency of a geometric approach to produce an optimal method that is cheaper than AMG.

We demonstrate both theoretically and experimentally that the non-uniform geometric multigrid method is robust with respect to grid refinement, and therefore an optimal method for solving elliptic problems on the sphere. The theory, which exploits the grid-aligned anisotropy in these problems, is based on a separation of coordinate directions using a tensor product approach, as done in [14], and uses existing theory from Hackbusch et al. [44]. The advantages of the method are shown experimentally on model problems, both sequentially and in parallel. The experiments show robustness and optimal efficiency of the method, with constant convergence factors of less than 0.1. It substantially outperforms Krylov subspace methods with one-level preconditioners and the BoomerAMG implementation of AMG on typical grid resolutions used at the Met Office. The parallel implementation scales almost optimally on up to 56 processors, so that a global solve of 3D problems with a maximum horizontal resolution of about 10km takes about 60 seconds.

The non-uniform multigrid method is also applicable to certain elliptic problems arising in a new development within data assimilation, called the potential vorticity (PV)-based control variable transform. These problems are so ill-conditioned that the solvers currently used at the Met Office fail to converge to a solution. However, using the non-uniform multigrid method as a preconditioner to a Krylov subspace method, it has been possible not only to solve these problems, but to solve them almost optimally.

Acknowledgements

I would like to begin by thanking my main PhD supervisor, Robert Scheichl, who tirelessly supported me with precise, efficient and detailed guidance throughout the project. His enthusiasm, commitment and expertise are second to none, and I thank him for the sheer amount of time and effort he has dedicated to ensure the success of this project. Thank you to Chris Budd for his useful advice and support during my PhD and to John Thuburn for his input from the University of Exeter. Thank you also to my external examiner Andy Wathen for his comments on and interest in my work (and, of course, for passing me).

I would also like to thank the team at the Met Office, in particular Mike Cullen and Marek Wlasak for their supervision. Mike's comprehensive knowledge of Data Assimilation has been instrumental to my understanding of the field, and I was grateful for his willingness to always answer the many questions I had for him in great detail. Marek's excellent knowledge of the VAR system made life easier for me during my placements at the Met Office, and his patience and aptness were very much appreciated.

Thank you to the staff of the Department of Mathematical Sciences for their help with technical and administrative issues, and for funding several visits to national and international conferences. Thank you also to the Numerical Analysis group for many interesting and useful seminars, and for showing a genuine interest in my work. Many of them helped me and gave me fruitful comments; these include Ivan Graham, Alastair Spence, Jan Van Lent, Melina Freitag and Emily Walsh.

Special thanks to Great Western Research and the Met Office who financed this project and supported the collaboration with the University of Bath. I hope the collaboration will continue for many years and lead to several successful projects in the future.

I would like to pay a particular respect to my family who got me here with continued support and belief, without whom none of this would have been possible.

Finally I would like to thank my friends who have made my PhD experience such an enjoyable one. I thank Phil, my housemate for the past three years, for his unique brand of highly intellectual humour which is still lost on me, and for his impeccable decorum at all times. Thanks to Fynn and Jane for making away days useful and fun, neither of which were perceived to be possible before! Finally, thank you to Maddy for her understanding and encouragement from the start of my PhD, and for giving me a push in the right direction when I needed it. Her proof reading skills have also been invaluable several times.

Contents

1 Introduction
    The Subject of the Thesis
    The Aims of the Thesis
    The Achievements of the Thesis
    The Structure of the Thesis

2 Elliptic Problems in Numerical Weather Prediction
    The Helmholtz Problem in NWP
    Data Assimilation in NWP
    D-VAR
    D-VAR
    Control Variable Transforms in 3D-VAR
    Choice of Control Variable Transform
    PV-Based Control Variable Transformations
    Grid Structure for the UM
    Summary of the Key Elliptic Equations in NWP

3 Model Problem and Discretisation
    Model Problem
    Finite Volume Discretisation
    Discretisation of the General Problem
    Discretisation of the Terms on the Boundary
    Special Case: Spherical Polar Coordinates
    Two Dimensional Problems on the Surface of the Sphere
    Finite Elements and a Link to Finite Volumes Using Quadrature
    Piecewise Bilinear Finite Elements in Two Dimensions
    Link to the Finite Volume Scheme
    Extension to Three Dimensions

4 Numerical Solution of Large Sparse Systems of Linear Equations
    Introduction and Model Problems
    Basic Iterative Methods
    Point-Wise Relaxation Schemes
    Block Relaxation Schemes
    Error Analysis
    Preconditioned Krylov Subspace Methods
    Multigrid Methods
    Heuristics
    Grid Transfer Operators
    The Two Grid Method (TGM)
    The Multigrid Method (MGM)
    Basic Theory
    Anisotropic Problems
    Semi Coarsening with Point Relaxation
    Line Relaxation with Full Coarsening
    Semi Coarsening with Line Relaxation
    Convergence Analysis of the Multigrid Method
    Analysis of the Two-Grid Method
    Analysis of the Multigrid V-Cycle
    Algebraic Multigrid (AMG)
    Algebraic Smoothness
    Strong Coupling
    Selecting the Coarse Grid
    The Interpolation Operator

5 Non-Uniform Multigrid for Spherical Polar Grids
    Motivation by Studying Anisotropic Problems on the Unit Square
    Standard Geometric Multigrid Approaches
    Algebraic Multigrid (AMG) for Anisotropic Problems
    New Approach: Conditional Semi-Coarsening
    Heuristic Explanation of the Effectiveness
    Numerical Experiments
    Extensions to Three Dimensions
    Dealing with the Additional Anisotropy in the z-direction
    Reducing the Anisotropy in the z-direction
    Line Relaxation and No Coarsening in z
    Comparison with Full Coarsening and Semi Coarsening
    Convergence Theory Using a Tensor Product Approach
    Convergence Theory for NUMG in Two Dimensions
    Convergence Theory for NUMG in Three Dimensions
    Non-Uniform Multigrid in Spherical Geometries
    Poisson-Type Equations on the Unit Sphere
    Dealing with the Singularity of the Problem
    Numerical Results
    The Galerkin Approach
    Comparison with Algebraic Multigrid (AMG)
    Extension to 3D Elliptic Problems in NWP
    The Galerkin Approach
    Reducing the Anisotropy in the Radial Direction
    Comparison with Algebraic Multigrid (AMG)
    The Quasi-Geostrophic Omega Equation
    The Helmholtz Problem

6 Parallelization
    Introduction to Parallelism
    Message Passing
    Performance of Parallel Algorithms
    Multigrid in Parallel
    Grid Partitioning
    Communication
    Parallel Components of Multigrid
    Parallelization Strategy of NUMG
    Parallel Multigrid Algorithm
    Parallel Numerical Results
    Speedup and Scaling
    D results
    D results

7 Application of Multigrid in the Potential Vorticity Based Control Variable Transform
    PV-Based Control Variable Transform
    Strategy for Solving the Balanced PV-Equation
    Results for the Simplified Model Problem
    Results for the Full Problem
    Summary

8 Future Work and Extensions

A Convergence Theory Using Fourier Analysis
B The Multigrid Iteration Matrix
C Proof of the Approximation Property

Bibliography

List of Figures

2-1 4D-VAR
Arakawa C-grid used in the horizontal
Charney-Phillips grid used in the vertical
The Met Office grid staggering
(a) The computational grid in the λ-φ plane, and (b) the graded mesh in the r-direction
The polar region, with the pole at the centre of n_λ half cells
A spy plot showing the sparsity pattern of matrix A for the 3D problem
A spy plot showing the sparsity pattern of the matrix for the 2D problem
Finite element mesh
Element τ i+ 1,j
The two-dimensional grid on the unit square
The error of a random initial guess after zero iterations (top left), two iterations (top right), four iterations (bottom left) and six iterations (bottom right) of the Gauss-Seidel method. The oscillatory components are removed after very few iterations, but the smooth components are damped very slowly
Full coarsening on an 8 × 8 uniform grid
Linear interpolation in two dimensions for (a) a cell centred grid and (b) a vertex centred grid. Circles denote the coarse grid nodes and crosses denote fine grid nodes
The multigrid (a) V-cycle and (b) W-cycle on four grids
Semi coarsening on an 8 × 8 uniform grid
Conditional semi-coarsening on a 16 × 8 grid
Conditional semi-coarsening on a grid (after four refinements)
5-3 The weighting of values from adjacent nodes in (a) a uniform mesh, and (b) a non-uniform mesh
Aspect ratio h_y/h_x for (top left) zero, (top right) one, (bottom left) four and (bottom right) seven refinements
A_F has a 5-point stencil and A_ℓ, for ℓ = 1, ..., F-1, have a 1-point stencil, the extra fill-in created as a result of the Galerkin product
Snapshot of the vertical velocity, w, after one iteration of VAR at θ-level 0 (roughly 8.8km above ground level)
The MPI_BCAST operation
The MPI_ALLREDUCE operation, with an arithmetic/logical operation
Partitioning Ω into four subdomains
The communication needed for matrix-vector multiplications
(a) Linear interpolation: some entries of the matrix depend on data from processors associated with adjacent subdomains only, and (b) bilinear interpolation: some entries of the matrix depend on data also from processors associated with subdomains that share a corner with Ω_i. Dotted lines denote the fine grid, the thin lines denote the coarse grid and the thick lines denote the subdomain boundaries
(a) Full weighting restriction: some entries of the matrix depend on data from processors associated with adjacent subdomains, and (b) four point average: all the entries of the matrix depend on the data from the local processor. Dotted lines denote the fine grid, the thin lines denote the coarse grid and the thick lines denote the subdomain boundaries
Parallel non-uniform coarsening keeping the location of the processor interface fixed. Grids for one refinement (top) and three refinements (bottom) with a uniform 4 × 4 partitioning onto 16 processors
Communication for processor subdomains on the periodic boundary
Speedup (strong scalability) test of the 2D NUMG method on wolf for two different problem sizes
Scaled efficiency (weak scalability) test of 2D NUMG on wolf: problem size on each processor
Scaled efficiency (weak scalability) test of 2D NUMG on aquila: problem size on each processor
Speedup (strong scalability) test on wolf for 3D NUMG (solid blue line) and the r-line preconditioned CG method (dotted black line). Global problem size:
6-13 Scaled efficiency (weak scalability) test on wolf for 3D NUMG: problem size on each processor
Scaled efficiency (weak scalability) test on aquila for 3D NUMG: problem size on each processor
Scaled efficiency (weak scalability) test on wolf for the r-line preconditioned CG method: problem size on each processor
Snapshot of the vertical velocity, w, after one iteration of VAR at θ-level 0 (roughly 8.8km above ground level), using eight processors
The finite volume grid with (a) p-points as cell centres and (b) ψ-points as cell centres
Snapshot of the balanced streamfunction, ψ_b, at ρ-level 40 (roughly 9.8km above ground level)

List of Tables

5.1 Number of iterations, N_its, required to solve (5.1.)
Total CPU time taken (including setup time) to solve (5.1.)
Average convergence factor, µ_avg, when solving (5.1.)
The BoomerAMG method applied to model problem (5.1.) as a stand-alone solver and a preconditioner for the CG method. CPU times in seconds
Model problem (5.1.) solved using NUMG as a stand-alone solver. CPU time in seconds
Model problem (5.1.) solved using NUMG as a preconditioner to the CG method. CPU time in seconds
Three dimensional model problem (5.3.1) with L_z = 10^4 solved using NUMG. CPU time in seconds
Three dimensional model problem (5.3.1) with L_z = 10^4 solved using z-line relaxation. CPU time in seconds
NUMG used to solve (5.3.1) for varying strengths of the vertical anisotropy. CPU time in seconds
NUMG with conditional semi-coarsening on the x-y plane, z-line relaxation and no vertical coarsening applied to (5.3.1). CPU time in seconds. The method is optimal for all values of L_z >
Solving (5.3.1) using NUMG with z-line relaxation, no vertical coarsening and full coarsening on the x-y plane. CPU time in seconds
Solving (5.3.1) using NUMG with z-line relaxation, no vertical coarsening and coarsening only in the x-direction. CPU time in seconds
Two dimensional Poisson's equation on the unit sphere solved using NUMG (with a projection onto the range of A_F in each iteration). CPU time in seconds
Two dimensional Poisson's equation on the unit sphere with varying anisotropy solved using NUMG. CPU time in seconds
5.15 Two dimensional Poisson's equation on the unit sphere solved using NUMG, the Galerkin product and full-weighting restriction
Two dimensional Poisson's equation on the unit sphere solved using NUMG, the Galerkin product and four-point averaging restriction
BoomerAMG used to solve model problem (5.5.1). CPU times in seconds
Three dimensional Poisson's equation on the unit sphere solved using NUMG. CPU time in seconds
Three dimensional Poisson's equation on the unit sphere solved using (column 1) NUMG with coarse grid operators created using the Galerkin product, (column 2) r-line Gauss-Seidel and (column 3) CG preconditioned with the r-line Gauss-Seidel smoother. CPU time in seconds
Preconditioned CG method (with r-line Jacobi preconditioner) used to solve (5.6.1) with L_λ = L_φ = 1 and L_r =
NUMG with conditional semi-coarsening on the φ-λ plane, r-line relaxation and no vertical coarsening applied to (5.6.1), with L_λ = L_φ = 1 and various values of L_r. CPU time in seconds
Three dimensional Poisson-type equation (5.6.1) with L_λ = L_φ = 1 and L_r = 10^4 solved using BoomerAMG as a preconditioner to CG
Three dimensional Helmholtz problem (2.1.4), as used in NWP, solved on the unit sphere using NUMG. CPU time in seconds
The magnitude of errors in the PV-based CVT. Column 5 denotes the number of PCG iterations required for solving (7.3.2) and (7.3.3), respectively. Similarly, column 6 denotes the CPU time (in seconds) required for solving (7.3.2) and (7.3.3), respectively
Details of each component when solving the 3D balanced equation (7.3.2). CPU time in seconds. Identical results are found when solving the 3D unbalanced equation (7.3.3)
The magnitude of errors in the PV-based CVT. Column 5 denotes the number of PCG iterations required for solving (7.1.4) and (7.1.6), respectively. Similarly, column 6 denotes the CPU time (in seconds) required for solving (7.1.4) and (7.1.6), respectively
Details of each component when solving the 3D balanced equation (7.1.4). The same results are found when solving the 3D unbalanced equation (7.1.6). CPU time in seconds

List of Algorithms

4.1 The Gauss-Seidel Method: gs(a,u,b)
The Block Gauss-Seidel Method: blockgs(a,u,b)
Preconditioned conjugate gradient method: pcg(a,u,b)
The Two Grid Method (TGM)
The Multigrid method (MGM)
The Multigrid V-cycle: Vcycle(A, u, b, ℓ)
Conditional semi-coarsening for model problem (5.1.): Conditional(h_x^(ℓ), h_y^(ℓ), n_x^(ℓ), n_y^(ℓ), ℓ, h_x^(ℓ-1), h_x^(ℓ-1), n_x^(ℓ-1), n_y^(ℓ-1))
Conditional semi-coarsening for the Poisson-type equation on the sphere: Cond_spherical(h_λ^(ℓ), h_φ^(ℓ), n_λ^(ℓ), n_φ^(ℓ), ℓ, h_λ^(ℓ-1), h_λ^(ℓ-1), n_λ^(ℓ-1), n_φ^(ℓ-1))
The hybrid Jacobi/Gauss-Seidel Method: jac_gs(a_i, u_i, b_i)
The parallel V-cycle on processor i: PVcycle((A_ℓ)_i, (u_ℓ)_i, (b_ℓ)_i, ℓ)
Parallel conditional semi-coarsening for the Poisson-type equation on the sphere: Cond_parallel(h_θ^(ℓ), h_φ^(ℓ), n_θ^(ℓ), n_φ^(ℓ), ℓ, h_θ^(ℓ-1), h_θ^(ℓ-1), n_θ^(ℓ-1), n_φ^(ℓ-1))
apply_operator(a,p,q)
Preconditioned Conjugate Gradient method: pcg(a, u, b)
Bi-CGSTAB method with left preconditioning: pcg(a, u, b)

Chapter 1

Introduction

1.1 The Subject of the Thesis

Two and three dimensional elliptic partial differential equations (PDEs) play a major role in the weather and climate numerical model at the UK Met Office, and in its variational data assimilation code (VAR). The fully compressible and non-hydrostatic Euler equations form the basis of the dynamical core of the Met Office's weather and climate prediction unified model (UM). These equations form a hyperbolic system, but through the application of a semi-implicit time discretisation, an elliptic Helmholtz problem is naturally derived for the increment of the pressure field, denoted $\Pi$, between two adjacent time steps. The equation, given in spherical polar coordinates, has the form

$$-\nabla_r^2 \Pi - \frac{1}{r^2}\frac{\partial}{\partial r}\left(a(r)\frac{\partial \Pi}{\partial r}\right) + b(r)\frac{\partial \Pi}{\partial r} + c(r)\,\Pi = \Phi, \qquad (1.1.1)$$

where $a(r), b(r), c(r) > 0$ for all $r$, $\nabla_r^2$ is the two-dimensional (2D) Laplacian in spherical polar coordinates at a constant height $r$, and the right-hand side contains variables computed at the previous time step. The numerical solution of (1.1.1) plays a vital role in the UM, as it must be solved at each time step. The full equation, including its derivation, is found in [8, 31]. The UM is in fact currently undergoing a significant overhaul, and its new models are described in [57, 73, 89], but once again the key problem within these models is the solution of the Helmholtz equation, which retains the form of (1.1.1).

In variational data assimilation, the main task is to produce the best estimate for the current state of the atmosphere, which is then used as an initial condition for a future forecast. Every day the Met Office receives around half a million observations recording the atmospheric conditions around the world. However, even with this many observations there is not enough information on the behaviour of the atmosphere at all points on and above the Earth's surface. There are large areas of ocean, inaccessible regions on land and remote levels in the atmosphere where very few, or no, observations are recorded. To fill in the gaps, the available observations must be combined with forecasts of the expected conditions in the atmosphere, known as the background state. The Met Office uses a set of model variables, such as wind velocity, pressure and temperature, to describe the state of the atmosphere. Thus, given the background state and a set of observations in a certain time window, variational data assimilation is the task of adjusting the model variables with a view to gaining the best estimate of the current state of the atmosphere. This is done via the minimization of a cost function, which unfortunately involves the inversion of a large and dense matrix related to error covariances of the model variables, which is operationally unfeasible. The matrix, known as the background error covariance matrix, is dense because the errors of the model variables in the background state are highly correlated.

A key task in simplifying the minimization process is the control variable transform (CVT), which is used to transform between the model variables and a set of variables known as the control variables. The purpose of the CVT is to perform a transformation to a set of variables that are assumed less correlated, so that the background error covariance matrix of these variables has a sparser representation. The atmospheric state can be partitioned into components that are known to be balanced or unbalanced, which evolve independently and are therefore assumed to be uncorrelated. Hence the choice of control variables is made according to whether they are balanced or unbalanced.
The main bottleneck in the CVT currently operational at the Met Office is in the numerical solution of the Quasi-Geostrophic Omega equation, which has the form

$$-N^2(r)\,\nabla_r^2 w - f_0^2\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial w}{\partial r}\right) = g. \qquad (1.1.2)$$

The equation is solved to find the vertical velocity increment (i.e. the deviation from the vertical velocity, $w$, given in the background state) in scales important for weather forecasting in the atmosphere. $f_0$ is the Coriolis parameter, which in the quasi-geostrophic regime is assumed constant. Note however that changing this to a (more realistic) variable parameter $f(\varphi)$ poses no additional difficulties for the methods we propose in this thesis. $N^2(r)$ is related to the frequency of vertical buoyancy oscillations, which depends on the temperature gradient and varies smoothly with $r$. The right-hand side term $g$ encompasses all the sources of quasi-geostrophic forcing for vertical motion, such as temperature gradients, quasi-geostrophic wind, quasi-geostrophic vorticity, and latent heat release. Details of the equation, including its derivation and the asymptotic regimes in which it is valid, can be found in [1, 6]. Operationally at the Met Office, the solution $w$ is currently simply set to zero because solving the equation takes too long with the existing solvers.

The current CVT is limited to certain regimes of the atmosphere. However, several studies in [5, 6, 5, 53, 81] have demonstrated that it is possible to eliminate the shortcomings of the current CVT by choosing a set of new potential vorticity (PV) based control variables. PV is a completely balanced variable, so a control variable related to PV will be better suited to describe the balanced part of the atmospheric flow. Likewise, anti-PV is completely unbalanced, and so control variables related to anti-PV will accurately describe the unbalanced components of the flow. Using these control variables, a new PV-based CVT has been proposed to satisfy the assumption of non-correlation between the variables in all regimes of the atmosphere, thus overcoming the limitations of the existing CVT. Unfortunately, the PV-based CVT poses further difficulties, where more highly ill-conditioned three dimensional (3D) elliptic problems must be solved in addition to (1.1.2). The equations, written abstractly, have the form

α 0 ( ru ε 0 r r r f r u ) = f, (1.1.3)

meaning a two-dimensional (2D) solve is embedded within the 3D problems, and so they cannot be discretised directly. As with (1.1.2), solving (1.1.3) is so far unfeasible using the solvers at the Met Office, and so the PV-based CVT cannot be made operational until a better solver is developed for these problems.

For the spatial discretisation of the above problems, many of the standard meteorological codes, in particular at the Met Office, use spherical polar grids, which lead to strong grid anisotropies near the poles where the grid lines converge.
Alternative grids that avoid the pole problem, such as Yin-Yang or icosahedral grids, are becoming increasingly popular in numerical weather prediction, as discussed in [9, 79]. Nevertheless, for all the negative things the spherical polar grids might entail, these grids are very structured. This greatly simplifies the discretisation and coding for problems such as (1.1.1) and (1.1.2), which is why they are still widely used. We will show in this thesis that from a solver point of view the bad reputation of spherical polar grids (e.g. in [79, 3.b]) is unjustified, provided the solvers are suitably adapted. Before we expand a bit further on this, let us note that the grid spacings in the radial direction are in general much smaller than in the horizontal ones, since the thickness of the atmosphere is two orders of magnitude smaller than the circumference of the Earth. This creates a further source of anisotropy. Additionally, the grid is usually strongly graded in the radial direction, with smaller grid spacings near the surface of the Earth to obtain a better resolution in the regions of most interest.

A standard finite volume discretisation of (1.1.1) or (1.1.2) on this anisotropic mesh leads to a system of equations

$$A u = b, \qquad (1.1.4)$$

where $A$ is a large, sparse, symmetric positive definite (SPD) matrix. The discretisation which we use is basically identical to that given in [7] for Poisson's equation on a spherical polar grid. The matrix $A$ contains a 7-point stencil for each node on the grid, with non-zero entries only for the node itself and for its immediate neighbours. Typical grid resolutions currently used in data assimilation at the Met Office are 216, 163 and 70 nodes in the latitudinal, longitudinal and radial directions, respectively. This leads to a large problem size of over a million degrees of freedom and an ill-conditioned system matrix $A$, making (1.1.4) very difficult to solve efficiently. The solver currently used at the Met Office, i.e. a Krylov subspace method preconditioned with simple r-line relaxation or ADI-type methods, performs increasingly poorly as the problem size is increased, and so restricts the grid resolutions that are currently feasible for global simulations, as highlighted in [79, 3.b].¹

It is the subject of this thesis to analyze and solve elliptic problems arising in numerical weather prediction using an optimal and scalable parallel iterative solver. We focus on solving (1.1.1), (1.1.2) and (1.1.3), all of which are given in spherical polar coordinates. By optimal we mean that the time for solving the discretised problem is proportional to the (discrete) problem size. Similarly, we say that an algorithm has optimal parallel scalability if the solution time remains constant when the problem size and the number of processors are increased proportionally.
The aforementioned solver used at the Met Office is not optimal because the increase in solve times with respect to the problem size is not linear. Note that fast direct solvers for elliptic problems in spherical geometries based on fast Fourier transforms (FFT) have also been investigated in [54], but these solvers are not quite optimal either.

It is well known that it is necessary to resort to multilevel techniques to obtain optimality of iterative methods for large elliptic problems. Many variants of these iterative methods exist, but as outlined in [0, 75] they all rely on reducing high frequency errors of an initial approximation using a relaxation method (the smoother), and approximating the remaining low frequency errors on a succession of coarser grids (the coarse grid correction). For isotropic problems with smoothly varying coefficients, standard geometric multigrid with simple point-wise smoothing and uniform coarsening is the most efficient method, as demonstrated experimentally in several articles such as [7, 41]. For anisotropic problems, this standard approach unfortunately does not lead to an optimal method. However, if the anisotropies are aligned with the grid then simple modifications achieve optimality even with strong anisotropies. These modifications are line/plane smoothing and/or semi-coarsening. Line smoothing involves collectively relaxing all unknowns on an entire grid line by solving a tridiagonal system corresponding to the unknowns on that line. Semi-coarsening uses a family of coarse grids that are only coarsened in the direction of the larger coefficient.

¹ Note that the mean radius of the Earth is about 6370km, and so the horizontal grid size in the latitudinal direction for 216 nodes is about 185km near the equator.

In (1.1.4) there are two sources of anisotropy: one due to the large aspect ratio between the radial and horizontal grid spacings, the second due to the converging grid lines at the poles. In this thesis we propose a robust geometric multigrid method that is able to deal with both these problems by applying a simple non-uniform partial coarsening combined with an r-line smoother. The robustness of the non-uniform coarsening strategy is first demonstrated on a two-dimensional model problem: Poisson's equation on the unit sphere. The idea is simple. The spherical polar grid introduces anisotropy near the poles but not near the equator, so the grid is semi-coarsened near the poles but fully coarsened near the equator. We compare the off-diagonal matrix entries in the latitudinal and longitudinal direction at each line of latitude, and the grid line is fully coarsened only if the coefficients in both directions are of similar magnitude. This will be true near the equator, where we coarsen in both directions, but not near the poles, where we only coarsen in the longitudinal direction, thus leading to coarse grids that are better and better adapted to the anisotropy. This coarsening strategy is the key heuristic for the popular algebraic multigrid (AMG) methods (see e.g.
[0, 71]), and the robustness of these methods even on highly anisotropic problems demonstrates the success of the strategy. In 3D we deal with the strong anisotropy in the radial direction by using r-line relaxation and no r-coarsening. This is then combined with the non-uniform coarsening strategy in the longitudinal and latitudinal directions. Although this partial coarsening only leads to a coarsening factor of about 3 from one grid to the next (instead of 8 for uniform coarsening), it guarantees that the method is fully robust to the anisotropies induced by the geometry and the grid, and leads to an optimal method with an average V-cycle convergence factor of less than 0.1, as our numerical tests show.

Geometric multigrid methods with line and plane smoothers, but with uniform semi-coarsening, have already been studied in [7]. PDEs of the type (1.1.2) from meteorological applications have already been solved with geometric multigrid methods, but only on cube-like domains with doubly-periodic boundary conditions and not on the entire globe (cf. [3, 39, 6, 85, 83, 84]). The most closely related paper is [84], where r-line relaxation and partial coarsening (i.e. uniform coarsening in the horizontal directions and no coarsening in the radial direction) were already studied extensively for the quasi-geostrophic equations in Cartesian coordinates. However, since the domain was not the entire atmosphere, the additional complication of the anisotropy at the poles played no role. Multigrid algorithms have also already been proposed for alternative grids on the sphere, such as the icosahedral or Yin-Yang grids, in [9, 51].

The idea of conditional semi-coarsening in the longitudinal direction proposed here has only been explored for edge and corner singularities so far (cf. [40, 55, 90]), but not for spherical polar grids (even in two dimensions). To the best of our knowledge, it seems to be a novel approach. It is clearly inspired by AMG ideas. AMG methods are fully automatic and only based on algebraic information in the matrix $A$. Coarse grid unknowns are chosen based on the relative size of the off-diagonal entries in the matrix, which in the application here will lead to very similar coarse grids. However, AMG methods are known to require a large setup cost to design these coarse grids and the operator-dependent interpolation and restriction operators, especially in three dimensions. Our geometric method, on the other hand, requires almost no setup cost to obtain the same robustness, which is why it easily outperforms established AMG implementations. Numerical tests (cf. Chapter 5) for a variety of problem sizes confirm this. In that chapter, we also give a comparison to preconditioned Krylov solvers as currently used by the Met Office [8] and their collaborators []. As expected, Krylov methods are only optimal when preconditioned with a robust multigrid method, such as AMG or the non-uniform geometric method proposed in this thesis.
With standard preconditioners used at the Met Office, such as r-line relaxation or ADI-type preconditioners (on a single grid), the number of iterations grows with the problem size. We can make theoretical justifications for the optimality of the non-uniform geometric multigrid method based on the heuristic that discretisations on coarser grids produced by conditional semi-coarsening yield more isotropic problems. Note that no such theory exists for AMG.

All sequential computations will be carried out using the Fortran95 compiler ifort on a single processor of a Dual dual-core 64bit AMD Opteron 10 processor with clock speed of 1.8GHz, cache size 1.0MB and GB memory. The initial guess for each iterative scheme is always taken to be zero. The stopping criterion is a prescribed relative reduction of the residual. Parallel computations are carried out using two different clusters, a 64-bit AMD Opteron 10 cluster (wolf) with a total of 4 processors (each the same as above) and a 64-bit Intel Xeon E546 cluster (aquila) of over 800 processors, each with GB memory and 3MB cache. Both clusters use an Infinipath network.

1.2 The Aims of the Thesis

The main aim of this thesis is to provide a fast, efficient and robust iterative solver for the numerical solution of elliptic problems in spherical geometries arising in numerical weather prediction at the Met Office, via a novel non-uniform multigrid method. Since the Met Office runs its codes using massively parallel computers, we also aim for the solver to have optimal parallel scalability. We aim to show the optimality of the method both experimentally and theoretically.

A second major aim of the thesis is to prove theoretically the robustness of the non-uniform multigrid method with respect to problem size, when applied to model problems of type (1.1.1) and (1.1.2). The classical result for the uniform convergence of the multigrid V-cycle [44, 16] is only applicable to isotropic problems. For some anisotropic problems, theoretical results for multigrid using a variety of techniques can be found in, for example, [10, 17, 58, 87]. In particular, for planar polar coordinates, multigrid theory with line smoothers and (uniform) semi-coarsening can be found in [14] (see also [13]). In this thesis a tensor product analysis is used in order to achieve a separation of coordinate directions for an anisotropic 2D problem. The analysis relies on the fact that the anisotropy is grid aligned, and reduces the 2D problem to a family of 1D problems to which the standard theory of Hackbusch can be applied. We aim to prove the robustness of the non-uniform multigrid method applied to anisotropic 3D elliptic problems of the form (1.1.1) and (1.1.2) using a similar approach.

A final aim of the thesis is to provide an efficient solver for the 3D problems in the new PV-based CVT. Since (1.1.3) cannot be discretised directly, no explicit discrete operator is available. Thus we must apply the operator by means of 2D Poisson solves, the solutions to which are applied to the remaining components of the operator. Then we can use a preconditioned Krylov subspace method to solve (1.1.3).
The preconditioning step involves using multigrid to solve a simplified form of (1.1.3) that resembles the Quasi-Geostrophic omega equation, which can be done optimally.

1.3 The Achievements of the Thesis

The following are the main achievements of the thesis:

1. A robust multigrid method for the 2D Poisson equation on the unit sphere was successfully developed using the non-uniform coarsening strategy. The uniform convergence of the method was shown experimentally, and some heuristic arguments were also given to back up the experimental results.

2. The non-uniform multigrid method was developed further and extended to 3D to

optimally solve the Helmholtz problem (1.1.1) and the Quasi-Geostrophic omega equation (1.1.2). Note that the direction of strongest coupling is not always fixed, so an r-line smoother is used to handle the strong coupling in the radial direction, whilst conditional semi-coarsening deals with the anisotropy on the 2D plane. The same multigrid method solves both problems optimally.

3. We use heuristics for the convergence of the V-cycle for the 2D Poisson problem and theory from [14] to show that non-uniform multigrid converges uniformly for 3D elliptic problems with grid-aligned anisotropy, with a contraction factor that is independent of mesh refinement and the varying coefficients. The proof is based on a tensor product analysis that leads to a separation of coordinate directions to reduce the analysis of the 3D problem to that of a family of simpler 2D problems which can be solved optimally with the non-uniform multigrid method.

4. A highly scalable parallel implementation of the non-uniform multigrid method was developed. The method was shown to be almost optimally scalable up to 56 processors, and was observed to perform significantly faster than several existing methods that were tested.

5. The non-uniform multigrid method was also used successfully as a preconditioner to Krylov subspace methods, accelerating these methods to an extent that they perform optimally. Using multigrid as a preconditioner enabled a unique method for solving the elliptic problems in the PV-based CVT, which was not possible when using the solvers currently employed at the Met Office.

1.4 The Structure of the Thesis

The majority of the subsequent chapters begin with a preamble to motivate the work of the chapter, in addition to a literature review related to the work. The contents of each chapter in the remainder of this thesis are as follows: In Chapter 2 we present the main elliptic problems that arise in numerical weather prediction, namely from the Met Office's Unified Model and its variational data assimilation scheme.
The derivation of each problem is described, as well as the structure of the computational grids that are used at the Met Office. In Chapter 3, we describe the finite volume discretisation of the 2D and 3D elliptic problems, using the grid structure of the Met Office described in Chapter 2. We present the discretisation in such a way that it applies to all the elliptic problems of interest in this thesis. We also present an alternative discretisation method using finite elements,

which will be necessary for the theoretical analysis. However, we show that the two discretisation schemes can be linked using a quadrature rule that is at least second order accurate. Chapter 4 gives an overview of existing iterative solvers for elliptic problems, starting with relaxation schemes and Krylov subspace methods. We then describe the idea of multigrid methods for isotropic elliptic problems solved on a uniform grid, and how the method needs to be adapted for anisotropic problems. This chapter also includes an overview of the convergence proofs of multigrid methods, following the work of Hackbusch [44], and it is concluded with a description of AMG methods. Chapter 5 is the main chapter of the thesis. We describe the novel idea of a non-uniform geometric multigrid method that is adapted to the particular anisotropies induced by the geometry and by the grid, and highlight some similarities with the AMG methods described in Chapter 4. Numerical results are given for the sequential solver applied to 2D and 3D model problems on the unit square/cube with degenerate coefficients, as well as comparisons with AMG and preconditioned Krylov methods. The heart of this chapter is a new convergence theory that follows the techniques used by Börm and Hiptmair [14] and confirms the robustness of the method. However, the theory relies on a finite element discretisation of the elliptic problem, so the quadrature rule from Chapter 3 is used to show that the theory also carries over to a finite volume setting. The chapter is concluded with the application of the method to elliptic problems on the sphere, in particular the Helmholtz problem and the Quasi-Geostrophic omega equation. In Chapter 6, we outline how we parallelized our method and demonstrate its parallel scalability for up to 56 processors, as well as comparing the scalability to that of parallel versions of the other solvers. Finally in Chapter 7, we outline another novel method for solving the 3D problems present in the PV-based CVT.
We describe how the operator is applied without the use of a matrix, and how Krylov methods can be accelerated using a multigrid preconditioner that solves a simplified problem. Numerical results are given for the performance of the method and for the accuracy of the complete cycle of the PV-based CVT when using this novel approach.

Chapter 2

Elliptic Problems in Numerical Weather Prediction

In this chapter we describe the main elliptic problems that arise in numerical weather prediction (NWP). The process of NWP involves the assimilation of estimates to the initial conditions for a numerical weather forecast model, and once these conditions are calculated, the changes in weather are predicted by advancing the model in time. The flagship numerical model developed and used at the Met Office is called the Unified Model (UM), and the assimilation of the initial conditions is accomplished by variational data assimilation (VAR).

For two decades, the UM has been used at the Met Office for both low resolution climate modelling and high resolution operational NWP. It is versatile and capable of modelling a wide range of time and space scales, and is run in many different configurations at the Met Office. It has been in continual development since the early 1990s, taking advantage of increasing supercomputer power, improved understanding of atmospheric physics and an increasing range of observational data sources. The most current version of the UM uses governing equations with a non-hydrostatic, fully compressible and deep atmosphere formulation (see [4, 8, 31, 78]), which describe the rates of change of the wind components, the potential temperature, density, humidity and pressure variables. The equations are formulated in spherical polar coordinates and discretised using a staggered grid in each coordinate direction. The discretisation in time is done using a two-time level predictor-corrector implementation of a semi-implicit scheme, as described in [8, 31]. The discretisation in this form leads to increased stability, thus allowing larger time steps to be used, but is complicated by the need to solve a three-dimensional Helmholtz equation at each time step, which takes up a significant fraction of the cost of the UM. This Helmholtz problem is the first

of the elliptic problems we are interested in. To solve it, a preconditioned generalized conjugate residual (GCR) method, as described in [34], is currently used operationally in the UM.

The UM also relies on accurate initial conditions for the current state of the atmosphere, which are provided via data assimilation. By combining observational data, statistical data, knowledge of atmospheric dynamics and previous forecasts, the best estimate of the initial conditions is found. Due to the sheer problem size involved there are inherent problems in defining the required matrices, and a process known as the control variable transform (CVT) is used to simplify the problem, as described in [30, 5, 53, 80]. A key equation in the CVT is the Quasi-Geostrophic omega (QG-Ω) equation [38], which is the second of the elliptic problems of interest.

The CVT is limited to certain regimes of atmospheric flow. Some of these limitations are thought to be overcome by a newly proposed CVT based on potential vorticity (PV). Many theoretical studies of this new scheme exist (see [5, 53, 6, 5, 5, 78, 81]), but it is not yet operational as the current solvers are not capable of solving the problems involved. Some of the problems that appear in the PV-transform are so ill-conditioned that the current solvers at the Met Office do not converge to a sufficiently accurate solution even after several hundred iterations. Thus the third of the elliptic problems of interest appears within the PV-based CVT.

Each of the elliptic problems described above has different characteristics, but all of them are highly important in improving the efficiency and accuracy of NWP at the Met Office. The chapter is organized as follows. Section 2.1 describes the governing equations used in the UM and how the Helmholtz problem is derived from the semi-implicit discretisation of the equations. In Section 2.2 we focus our attention on data assimilation.
We describe the CVT that is currently operational at the Met Office and that is central to simplifying the VAR process, and the QG-Ω equation that needs to be solved as a result of the transformations. In Section 2.3 we discuss the new PV-based CVT as a possible improvement to the currently operational CVT, and the equations involved. Finally in Section 2.4 the staggered grids used at the Met Office are defined, before summarizing the key elliptic problems in NWP in Section 2.5.

2.1 The Helmholtz Problem in NWP

The fully compressible, non-hydrostatic and deep atmosphere equations form the basis of the Met Office weather and climate prediction unified model (UM). A fully compressible system is one where the density of fluid in the system changes with respect to

pressure, which is the case with the Earth's atmosphere. Many climate models assume hydrostatic balance, where the two main vertical forces, gravity and the pressure gradient, balance each other out. Hydrostatic balance explains why the Earth's atmosphere does not collapse to a thin layer on the ground, and is an accurate approximation on large horizontal scales. However, the UM does not assume complete hydrostatic balance, which allows it to take into account vertical wind acceleration. For accuracy, a deep atmosphere formulation is used instead of the shallow atmosphere approximations (see [78]).

The prognostic model variables used in atmospheric modelling are the three-dimensional wind $\mathbf{u}$ with components $(u, v, w)$, potential temperature $\theta$, pressure $p$, density $\rho$ and the specific humidity $q$. In the UM the Exner function is used as the pressure variable, defined as

\[ \Pi = \left( \frac{p}{p_{\mathrm{ref}}} \right)^{\kappa}, \qquad \kappa = \frac{R}{C_p}. \tag{2.1.1} \]

$R$ is the gas constant, $C_p = 1005$ the specific heat at constant pressure and $p_{\mathrm{ref}} = 10^5$ Pa a constant reference value of the pressure. Potential temperature is defined in terms of temperature, $T$, as $\theta = T/\Pi$. The governing equations are written generically as

\[ \frac{D\mathbf{X}}{Dt} = L(\mathbf{x}, t, \mathbf{X}) + N(\mathbf{x}, t, \mathbf{X}), \tag{2.1.2} \]

where

\[ \frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{u} \cdot \nabla_z \]

is the Lagrangian derivative [8], which describes the time rate of change of the prognostic variables while moving with a velocity field. $\mathbf{X}$ is a vector of the prognostic variables, and $\mathbf{x}$ and $t$ denote position and time. $L$ and $N$ represent terms that are linear and nonlinear in $\mathbf{X}$, respectively. $\nabla_z$ is the horizontal gradient operator.

The equations are discretised in the UM using a predictor-corrector implementation of a two-time level (2TL), semi-implicit (SI) time discretisation scheme, as outlined in [8], though further developments are being made to the current methods in a scheme called ENDGame [57, 73, 89]. An SI discretisation of (2.1.2) is given by

\[ \frac{\mathbf{X}^{n+1} - \mathbf{X}^{n}_{d}}{\Delta t} = (1 - \alpha)(L + N)^{n}_{d} + \alpha (L + N)^{n+1}, \]

which advances the equation from time level $n$ to time level $n+1$, where $\alpha \in [0, 1]$.
The subscript $d$ denotes evaluation at a departure point $\mathbf{x}_d$, which is the point in space at which $\mathbf{X}$ was measured at time level $n$.
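The role of the off-centring parameter α in the SI scheme can be illustrated with a minimal sketch. It assumes a scalar linear test equation dX/dt = λX with purely imaginary λ = iω (a stand-in for a fast oscillatory mode), and it ignores the departure point and the nonlinear terms N entirely. The names `si_step` and `amplification` are hypothetical and are not taken from the UM.

```python
def si_step(x, dt, lam, alpha):
    """One off-centred semi-implicit step for the test equation dX/dt = lam * X.

    Solves (X_new - X_old) / dt = (1 - alpha) * lam * X_old + alpha * lam * X_new.
    Assumption: no advection, so the departure-point value is just X_old.
    """
    return x * (1.0 + (1.0 - alpha) * dt * lam) / (1.0 - alpha * dt * lam)


def amplification(dt, lam, alpha):
    """Modulus of the one-step amplification factor."""
    return abs(si_step(1.0, dt, lam, alpha))


omega = 1.0   # frequency of the oscillatory test mode (illustrative value)
dt = 10.0     # deliberately large time step, omega * dt >> 1

explicit = amplification(dt, 1j * omega, 0.0)   # alpha = 0: fully explicit
centred = amplification(dt, 1j * omega, 0.5)    # alpha = 1/2: centred
implicit = amplification(dt, 1j * omega, 1.0)   # alpha = 1: fully implicit
```

For this test equation the explicit choice amplifies the mode at large time steps, while α ≥ 1/2 does not, which is the usual motivation for treating the fast linear terms implicitly.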

The predictor-corrector approach is then described as follows:

\[ \mathbf{X}^{(1)} = \mathbf{X}^{n}_{d} + (1 - \alpha)\Delta t\,(L + N)^{n}_{d} + \alpha \Delta t\,(L + N)^{n}, \]
\[ \mathbf{X}^{(2)} = \mathbf{X}^{(1)} + \alpha \Delta t\, L^{(2)} + \alpha \Delta t\,\left(N^{(1)} - N^{n} - L^{n}\right), \]

where $\mathbf{X}^{(1)}$ is the predicted value at time level $n+1$, and $\mathbf{X}^{(2)}$ is the corrected estimate, such that $\mathbf{X}^{n+1} = \mathbf{X}^{(2)}$. Also, let primed quantities denote the increment from time level $n$ to $n+1$, i.e. $\mathbf{X}' = \mathbf{X}^{n+1} - \mathbf{X}^{n}$, for each of the prognostic variables.

Equations (2.1.3) to (2.1.7), below, are the governing equations used in the UM, as taken from [8], but with some simplifications made for purposes of clarity. The simplifications are that we write the equations in Cartesian coordinates and assume a dry atmosphere with no orography and a uniform rotation rate of $f = 2\Omega \cos y$, where $\Omega$ is the Earth's angular velocity.

Momentum Equations

\[ \frac{Du}{Dt} = f v - C_p \theta \frac{\partial \Pi}{\partial x} + F_u, \tag{2.1.3} \]
\[ \frac{Dv}{Dt} = -f u - C_p \theta \frac{\partial \Pi}{\partial y} + F_v, \tag{2.1.4} \]
\[ \frac{Dw}{Dt} = -g - C_p \theta \frac{\partial \Pi}{\partial z} + F_w. \tag{2.1.5} \]

Potential Temperature Equation

\[ \frac{D\theta}{Dt} = S. \]

Continuity Equation (Conservation of Mass)

\[ \frac{\partial \rho}{\partial t} + \nabla_z \cdot (\rho \mathbf{u}) = 0. \tag{2.1.6} \]

Equation of State

\[ \kappa \Pi \theta \rho = \frac{p}{C_p}. \tag{2.1.7} \]

$g$ is the Earth's gravitational acceleration, and $F_u$, $F_v$, $F_w$ and $S$ are the tendencies from the physics parameterizations.

Let us now discuss how these equations are discretised and which further approximations are used. The equations for $\theta$ and $\mathbf{u}$ are discretised

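using the predictor-corrector approach, and after the correction step, they are:

\[ \theta^{(2)} = \theta^{(1)} - \alpha \Delta t\, \frac{\partial \theta^{(1)}}{\partial z}\, w', \tag{2.1.8} \]
\[ u^{(2)} = u^{(1)} + \alpha \Delta t \left[ f\left(v^{(2)} - v^{n}\right) - C_p \left\{ \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial x} + \theta^{(1)} \frac{\partial \Pi'}{\partial x} \right\} \right], \tag{2.1.9} \]
\[ v^{(2)} = v^{(1)} - \alpha \Delta t \left[ f\left(u^{(2)} - u^{n}\right) + C_p \left\{ \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial y} + \theta^{(1)} \frac{\partial \Pi'}{\partial y} \right\} \right], \tag{2.1.10} \]
\[ w^{(2)} = w^{(1)} + \alpha \Delta t\, C_p \left\{ \left(\theta^{n} - \theta^{(2)}\right) \frac{\partial \Pi^{n}}{\partial z} - \theta^{(1)} \frac{\partial \Pi'}{\partial z} \right\}. \tag{2.1.11} \]

Substituting $\theta^{(2)}$ from (2.1.8) into (2.1.11) and setting

\[ G = 1 - \alpha^2 \Delta t^2 C_p \frac{\partial \theta^{(1)}}{\partial z} \frac{\partial \Pi^{n}}{\partial z}, \]

we obtain

\[ G w^{(2)} = w^{(1)} - (1 - G) w^{n} - \alpha \Delta t\, C_p \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial z} - \alpha \Delta t\, C_p\, \theta^{(1)} \frac{\partial \Pi'}{\partial z}. \tag{2.1.12} \]

The evolution of the equation for density is handled in an Eulerian flux-form (see [8]):

\[ \rho^{(2)} = \rho^{n} - \Delta t \left[ \frac{\partial}{\partial x}\left( \rho^{n} (u^{n} + \alpha u') \right) + \frac{\partial}{\partial y}\left( \rho^{n} (v^{n} + \alpha v') \right) + \frac{\partial}{\partial z}\left( \rho^{n} (w^{n} + \alpha w') \right) \right]. \]

Finally, there is no time derivative of $\Pi$ in the equation of state, and so the equation is assumed to hold for the latest estimates of each variable, denoted by superscript (2):

\[ \kappa \Pi^{(2)} \theta^{(2)} \rho^{(2)} = \frac{p^{(2)}}{C_p}. \tag{2.1.13} \]

Since these equations are coupled, we must solve them simultaneously to obtain $\mathbf{X}^{(2)}$. The aim of the SI discretisation is to advance the model variables from $\mathbf{X}^{n}$ to $\mathbf{X}^{n+1}$, and this can be done by finding $\mathbf{X}'$ instead. (2.1.13) is rewritten in terms of the increments as

\[ \kappa (\Pi^{n} + \Pi')(\theta^{n} + \theta')(\rho^{n} + \rho') = \frac{p^{n} + p'}{C_p}. \tag{2.1.14} \]

Now using the definition of $\Pi$ from (2.1.1) and a linear approximation,

\[ \Pi^{n} + \Pi' = \left( \frac{p^{n} + p'}{p_{\mathrm{ref}}} \right)^{\kappa} = \left( \frac{p^{n}}{p_{\mathrm{ref}}} \right)^{\kappa} \left( 1 + \frac{p'}{p^{n}} \right)^{\kappa} \approx \Pi^{n} \left( 1 + \frac{\kappa p'}{p^{n}} \right), \]

and we obtain the following approximation to the pressure increment:

\[ p' \approx \frac{p^{n}}{\kappa} \frac{\Pi'}{\Pi^{n}}. \tag{2.1.15} \]

Now by expanding (2.1.14) and neglecting products of two or more primed quantities (e.g. $\Pi' \theta' \rho' \approx \Pi' \theta' \rho^{n} \approx 0$) we have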

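\[ \kappa \Pi' \theta^{n} \rho^{n} + \kappa \Pi^{n} \theta' \rho^{n} + \kappa \Pi^{n} \theta^{n} \rho' + \kappa \Pi^{n} \theta^{n} \rho^{n} = \frac{p^{n} + p'}{C_p}. \tag{2.1.16} \]

Then using (2.1.15) to eliminate $p'$ from (2.1.16) and rearranging gives the following linearized form of the equation of state:

\[ -\left( \frac{p^{n}}{\kappa C_p \Pi^{n}} - \kappa \theta^{n} \rho^{n} \right) \Pi' + \kappa \Pi^{n} \theta^{n} \rho' + \kappa \Pi^{n} \rho^{n} \theta' = \frac{p^{n}}{C_p} - \kappa \Pi^{n} \theta^{n} \rho^{n}. \tag{2.1.17} \]

The unknown quantities in (2.1.17) are $\Pi'$, $\theta'$ and $\rho'$. We find that

\[ \theta' = \left(\theta^{(1)} - \theta^{n}\right) - \alpha \Delta t\, w' \frac{\partial \theta^{(1)}}{\partial z} = -\alpha \Delta t\, w' \frac{\partial \theta^{(1)}}{\partial z} + X_\theta, \]
\[ \rho' = -\Delta t \left[ \frac{\partial}{\partial x}\left( \rho^{n} (u^{n} + \alpha u') \right) + \frac{\partial}{\partial y}\left( \rho^{n} (v^{n} + \alpha v') \right) + \frac{\partial}{\partial z}\left( \rho^{n} (w^{n} + \alpha w') \right) \right], \]

where

\[ u' = \left(u^{(1)} - u^{n}\right) + \alpha \Delta t \left[ f\left(v^{(2)} - v^{n}\right) - C_p \left\{ \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial x} + \theta^{(1)} \frac{\partial \Pi'}{\partial x} \right\} \right] = -A \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial x} - F \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial y} + X_u, \tag{2.1.18} \]
\[ v' = \left(v^{(1)} - v^{n}\right) - \alpha \Delta t \left[ f\left(u^{(2)} - u^{n}\right) + C_p \left\{ \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial y} + \theta^{(1)} \frac{\partial \Pi'}{\partial y} \right\} \right] = -A \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial y} + F \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial x} + X_v, \tag{2.1.19} \]
\[ w' = G^{-1} \left\{ \left(w^{(1)} - w^{n}\right) - \alpha \Delta t\, C_p \left(\theta^{(1)} - \theta^{n}\right) \frac{\partial \Pi^{n}}{\partial z} \right\} - G^{-1} \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial z} = -G^{-1} \alpha \Delta t\, C_p \theta^{(1)} \frac{\partial \Pi'}{\partial z} + X_w, \tag{2.1.20} \]

where $A = 1/(1 + \alpha^2 \Delta t^2 f^2)$, $F = \alpha \Delta t f A$, and $X_\theta$, $X_u$, $X_v$ and $X_w$ contain explicit terms that are already known. We substitute the equations for $u'$, $v'$ and $w'$ into the equation for $\rho'$, and substitute $w'$ into $\theta'$. Then we substitute $\rho'$ and $\theta'$ into (2.1.17) to yield the following Helmholtz equation for the remaining unknown $\Pi'$:

\[ -\kappa \Pi^{n} \theta^{n} \Delta t^2 \alpha^2 C_p \left\{ \frac{\partial}{\partial x}\left[ \rho^{n} \theta^{(1)} \left( A \frac{\partial \Pi'}{\partial x} + F \frac{\partial \Pi'}{\partial y} \right) \right] + \frac{\partial}{\partial y}\left[ \rho^{n} \theta^{(1)} \left( A \frac{\partial \Pi'}{\partial y} - F \frac{\partial \Pi'}{\partial x} \right) \right] + \frac{\partial}{\partial z}\left[ G^{-1} \rho^{n} \theta^{(1)} \frac{\partial \Pi'}{\partial z} \right] \right\} - \kappa \Pi^{n} \Delta t^2 \alpha^2 C_p\, \rho^{n} \theta^{(1)} G^{-1} \frac{\partial \theta^{(1)}}{\partial z} \frac{\partial \Pi'}{\partial z} + \left( \frac{p^{n}}{\kappa C_p \Pi^{n}} - \kappa \theta^{n} \rho^{n} \right) \Pi' = \kappa \Pi^{n} \theta^{n} \rho^{n} - \frac{p^{n}}{C_p} + \Phi^{n}, \tag{2.1.21} \]

where $\Phi^{n}$ contains physical parameters and terms evaluated at the previous time step.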

This equation contains cross derivative terms that introduce several unwanted difficulties. It is therefore more practical to introduce further simplifications to the model, and one way of simplifying the model is to assume a constant Coriolis term, i.e. set $f = f_0$. This results in $A$ and $F$ being constant. In addition we can enforce an averaging for $\rho^{n}$ and $\theta^{(1)}$, meaning they will not vary with latitude or longitude. This means the cross derivative terms cancel, and the $\rho^{n} \theta^{(1)}$ terms can be pulled out of the derivatives. With these simplifications imposed, we obtain the following simplified model equation that is more amenable to practical implementation:

\[ -\kappa \Pi^{n} \theta^{n} \Delta t^2 \alpha^2 C_p\, \rho^{n} \theta^{(1)} A \left\{ \frac{\partial^2 \Pi'}{\partial x^2} + \frac{\partial^2 \Pi'}{\partial y^2} + \frac{\partial}{\partial z}\left( \frac{1}{GA} \frac{\partial \Pi'}{\partial z} \right) \right\} - \kappa \Pi^{n} \Delta t^2 \alpha^2 C_p\, \rho^{n} \theta^{(1)} G^{-1} \frac{\partial \theta^{(1)}}{\partial z} \frac{\partial \Pi'}{\partial z} + \left( \frac{p^{n}}{\kappa C_p \Pi^{n}} - \kappa \theta^{n} \rho^{n} \right) \Pi' = \kappa \Pi^{n} \theta^{n} \rho^{n} - \frac{p^{n}}{C_p} + \Phi^{n}. \tag{2.1.22} \]

We assume that the Earth's surface and the top of the atmosphere are rigid boundaries, and thus impose rigid boundary conditions $\mathbf{u} \cdot \mathbf{n} = 0$ at these boundaries, where $\mathbf{n}$ is a unit vector pointing outward from the boundary. This implies the Neumann boundary condition

\[ \frac{\partial \Pi'}{\partial n} = X_n, \tag{2.1.23} \]

using equations (2.1.18) to (2.1.20). Once the equation is solved for $\Pi'$, we use this to update the Exner pressure at time level $n+1$. This is used to find $u^{n+1}$, $v^{n+1}$ and $w^{n+1}$, which are then used to find $\theta^{n+1}$ and $\rho^{n+1}$.

The papers [8, 31] also describe the derivation of the Helmholtz equation in the UM but without the assumptions made in this section. In [57, 3.], the Helmholtz equation used in ENDGame is described, which is a simplified equation comparable to (2.1.22), obtained by making suitable assumptions, as in this section, to simplify the model. The solution to the Helmholtz equation is therefore central to advancing the model at each time step, and so it must be solved efficiently to reduce the costs of the UM.
The equation is solved operationally using a conjugate residual solver [34]. This method alone is not adequate for solving the problem efficiently, thus it is preconditioned using a two-dimensional alternating direction implicit (ADI) method [11]. The ADI method uses a tridiagonal solver in the x- and z-directions, and the effect of this as a preconditioner is investigated in [, 69]. Although the preconditioner accelerates the method, it is not optimal and the time taken to solve the equation is still not adequate. So further improvements are sought, especially for the huge problem sizes used in the UM.
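The tridiagonal solves that underlie ADI-type preconditioners (and line relaxation in general) can be sketched with the standard Thomas algorithm. This is a generic textbook version, assuming a diagonally dominant system so that no pivoting is needed; it is not the Met Office implementation, and the function name `thomas` and its argument layout are illustrative choices.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    a: sub-diagonal   (a[0] is unused)
    b: main diagonal
    c: super-diagonal (c[-1] is unused)
    d: right-hand side
    Assumes the matrix is diagonally dominant, so no pivoting is performed.
    """
    n = len(b)
    cp = [0.0] * n   # modified super-diagonal
    dp = [0.0] * n   # modified right-hand side

    # Forward elimination.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Example: 1D Poisson-type stencil (-1, 2, -1) on four unknowns.
sub = [0.0, -1.0, -1.0, -1.0]
diag = [2.0, 2.0, 2.0, 2.0]
sup = [-1.0, -1.0, -1.0, 0.0]
rhs = [0.0, 0.0, 0.0, 5.0]
solution = thomas(sub, diag, sup, rhs)
```

The cost is linear in the number of unknowns along the line, which is why line-wise solves are cheap enough to be used inside every preconditioning or smoothing step.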

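We conclude this section by defining the Helmholtz equation in spherical polar coordinates, since we are interested in solving this equation on a spherical geometry. We have $r \in [a, a + d]$ (the radius), $\phi \in [0, \pi]$ (the polar angle) and $\lambda \in [0, 2\pi]$ (the azimuthal angle), where $a \approx 6371$ km is the radius of the Earth and $d \approx 63$ km is the depth of the atmosphere. The governing equations in spherical polar coordinates, given in [8], are:

\[ \frac{Du}{Dt} = f v - \frac{C_p \theta}{r \sin\phi} \frac{\partial \Pi}{\partial \lambda} + F_u, \]
\[ \frac{Dv}{Dt} = -f u - \frac{C_p \theta}{r} \frac{\partial \Pi}{\partial \phi} + F_v, \]
\[ \frac{Dw}{Dt} = -g - C_p \theta \frac{\partial \Pi}{\partial r} + F_w, \]
\[ \frac{D\theta}{Dt} = S, \]
\[ \frac{\partial \rho}{\partial t} + \nabla_z \cdot (\rho \mathbf{u}) = \frac{\partial \rho}{\partial t} + \frac{1}{r^2} \frac{\partial}{\partial r}\left( r^2 \rho w \right) + \frac{1}{r \sin\phi} \frac{\partial}{\partial \phi}\left( \sin\phi\, \rho v \right) + \frac{1}{r \sin\phi} \frac{\partial}{\partial \lambda}\left( \rho u \right) = 0, \]
\[ \kappa \Pi \theta \rho = \frac{p}{C_p}, \]

where $f = 2\Omega \cos\phi$. Then the Helmholtz equation, with the same simplifications made for (2.1.22), is

\[ -\kappa \Pi^{n} \theta^{n} \Delta t^2 \alpha^2 C_p\, \rho^{n} \theta^{(1)} A \left\{ \frac{1}{r^2 \sin^2\phi} \frac{\partial^2 \Pi'}{\partial \lambda^2} + \frac{1}{r^2 \sin\phi} \frac{\partial}{\partial \phi}\left( \sin\phi \frac{\partial \Pi'}{\partial \phi} \right) + \frac{1}{r^2} \frac{\partial}{\partial r}\left( \frac{r^2}{GA} \frac{\partial \Pi'}{\partial r} \right) \right\} - \kappa \Pi^{n} \Delta t^2 \alpha^2 C_p\, \rho^{n} \theta^{(1)} G^{-1} \frac{\partial \theta^{(1)}}{\partial r} \frac{\partial \Pi'}{\partial r} + \left( \frac{p^{n}}{\kappa C_p \Pi^{n}} - \kappa \theta^{n} \rho^{n} \right) \Pi' = \kappa \Pi^{n} \theta^{n} \rho^{n} - \frac{p^{n}}{C_p} + \Phi^{n}, \tag{2.1.24} \]

with Neumann boundary conditions (2.1.23) at the rigid boundaries, i.e. the Earth's surface and the top of the atmosphere.

2.2 Data Assimilation in NWP

Forecasts using NWP have improved greatly since they were introduced over 50 years ago, and one of the main reasons for this is the improvement in obtaining the initial conditions, $\mathbf{x}_0$, at a given time, $t_0$, for the model forecast. $\mathbf{x}_0$ is a state of the atmosphere at time $t_0$ obtained for the model variables. The model variables of the UM used at the Met Office are $\mathbf{x} = (u, v, w, \theta, \rho, p, q)$,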

where $(u, v, w)$ are the three components of wind velocity, $\theta$ is the potential temperature, $\rho$ is density, $p$ is pressure and $q$ is specific humidity. These variables are highly correlated (e.g. a change in pressure will affect the density of the air), so any errors in these variables will also be correlated. Given the size of the model domain, the state vector of the model variables will have $O(10^7)$ elements for each variable. The analysis grid used for data assimilation, however, has a lower resolution, and so the initial conditions obtained using data assimilation are interpolated before they are used in the forecast.

Data assimilation is the process of finding the best estimate of the current state of a system. In NWP, this system is the atmosphere and the oceans, and data assimilation is used to estimate $\mathbf{x}_0$ by combining a previous forecast with observational data, knowledge of atmospheric dynamics and statistical data that measures the accuracy of the forecast and observations. The uncertainties are quantified with Gaussian probability density functions (PDF). Typically there are $O(10^6)$ observations and $O(10^7)$ model variables, thus the system is underobserved and an operator is required to map the model space to the observation space. A model state is found that is the statistically optimal estimate of the truth given the previous forecast and new observations, together with estimates of the errors in each. The errors in the observations and the previous forecast are minimized in order to produce the best estimate, or analysis, of the current state of the atmosphere.

Due to the chaotic nature of the governing equations in the model, any errors in the initial conditions will be amplified in the forecast. Thus, despite the continuous advancements in computational power and numerical methods, these benefits cannot be fully realized in NWP without accurate data assimilation techniques.
There are several types of data assimilation methods, such as optimal statistical interpolation, Newtonian relaxation, 3D-VAR and 4D-VAR (see [56] for details). Also, ensemble-based methods such as ensemble Kalman filter methods [35] are being explored as possible additions to current operational analysis techniques. However, in this section we focus on 3D- and 4D-VAR, as these are currently the most commonly used data assimilation methods in weather forecasting, in particular by the Met Office, who use an incremental version of 4D-VAR, described in [53, 61].

2.2.1 3D-VAR

We denote the current state of the atmosphere obtained from a previous forecast as the background state, $\mathbf{x}_b$. The background state will have uncertainties, and these are quantified in the background error covariance matrix, $\mathbf{B}$, using the PDF associated with the background state. The matrix $\mathbf{B}$ is square, with dimension equal to the number of elements in the state vector. If $\mathbf{x}$ is the true state

of the atmosphere, then the error in the background state is quantified as

\[ \mathbf{x}' = \mathbf{x} - \mathbf{x}_b, \tag{2.2.1} \]

where $\mathbf{x}'$ is referred to as the increment, or perturbation, to the background state. We combine the forecast error with the observation errors, which is quantified in the following cost function:

\[ J(\mathbf{x}) = J_b + J_o = (\mathbf{x} - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b) + (\mathbf{y} - \mathcal{H}(\mathbf{x}))^T \mathbf{R}^{-1} (\mathbf{y} - \mathcal{H}(\mathbf{x})), \]

where $\mathbf{y}$ is the set of observations, $\mathcal{H}$ is a nonlinear observation operator that maps the model space to the observation space, and $\mathbf{R}$ is the observation error covariance matrix, whose dimension is the number of observations. $J_b$ and $J_o$ are the cost functions for the background state and observations respectively, and the objective of 3D-VAR is to find the model state $\mathbf{x}$ that minimizes the nonlinear cost function $J$. However, for operational purposes, it is more practical to linearise the cost function with respect to the increments to the background state:

\[ J(\mathbf{x}') = \mathbf{x}'^T \mathbf{B}^{-1} \mathbf{x}' + (\mathbf{y} - \mathcal{H}(\mathbf{x}_b) - \mathbf{H}\mathbf{x}')^T \mathbf{R}^{-1} (\mathbf{y} - \mathcal{H}(\mathbf{x}_b) - \mathbf{H}\mathbf{x}'), \tag{2.2.2} \]

where $\mathbf{H}$ is a linear approximation to the observation operator $\mathcal{H}$.

2.2.2 4D-VAR

In 4D-VAR, the background error is modified not only in space but also over a certain time span called the assimilation window. The assimilation window typically spans a 1 hour time period around the analysis time, and the objective of 4D-VAR is to minimize the misfit between the previous forecast and the observations during this time period. We describe a variant of 4D-VAR called strong constraint 4D-VAR, in which the model state $\mathbf{x}_0$ at time $t = t_0$ is found that minimizes the following cost function

\[ J(\mathbf{x}_0) = (\mathbf{x}_0 - \mathbf{x}_b)^T \mathbf{B}^{-1} (\mathbf{x}_0 - \mathbf{x}_b) + \sum_{i=0}^{n} (\mathbf{y}_i - \mathcal{H}_i(\mathbf{x}_i))^T \mathbf{R}^{-1} (\mathbf{y}_i - \mathcal{H}_i(\mathbf{x}_i)), \]

with the constraint

\[ \mathbf{x}_i = M(t_i, t_0, \mathbf{x}_0), \]

where $M$ is the nonlinear time dependent model from Section 2.1 and $i$ is an index of timesteps. By minimizing this cost function, $\mathbf{x}_0$ is found at the start of the assimilation

window. Then the corrected forecast across the assimilation window, produced by evolving the model $M$, minimizes the errors in the forecast and the observations during this time period. Figure 2-1 demonstrates the process of 4D-VAR, where $t_n$ is the end of the assimilation window.

Figure 2-1: 4D-VAR

As with 3D-VAR, the 4D-VAR cost function is linearized with respect to the model state increments for a more practical implementation. In addition to the observation operator, a linearized approximation to the model is also used, which updates the model space increments at each time step in the assimilation window.

2.2.3 Control Variable Transforms in 3D-VAR

Minimizing the cost function in 3D- or 4D-VAR is hugely costly due to the huge problem size involved and the sheer cost of inverting the $\mathbf{B}$ matrix. The matrix represents the errors in the background state, which are highly correlated due to the strong correlations between the model variables, such as density and pressure. These correlations in the errors are represented in the matrix, hence $\mathbf{B}$ is dense and is in fact too large to be used or even stored explicitly. Therefore attempting to invert the matrix operationally is completely impractical. Instead, a new representation of the matrix is devised which is better conditioned and has a simpler structure, consequently allowing the minimization process to be performed more efficiently.

As with other assimilation schemes, a new representation of the $\mathbf{B}$ matrix is achieved in the Met Office VAR by a transformation from the space of model variables, e.g. pressure, potential temperature and wind velocity, to a space of variables known as the control variables, e.g. streamfunction, velocity potential and geostrophically unbalanced pressure. Then it is in this control space that the minimization process is performed.

The process of the control variable transform (CVT) is done operationally at the Met Office within incremental 4D-VAR, but in this section we describe the process in 3D-VAR for simplicity. Firstly, instead of carrying out the transformations using the model variables, $\mathbf{x}$, we formulate the problem in terms of their increments from the background state, $\mathbf{x}'$ (recall (2.2.1)), because it is with respect to these increments that the cost function is minimized (see (2.2.2)). We have $\mathbf{x}' = (u', v', w', \theta', \rho', p', q')$, and we let, for example,

\[ u = \bar{u} + u', \qquad v = \bar{v} + v', \qquad p = \bar{p} + p', \]

where $\bar{u}$, $\bar{v}$ and $\bar{p}$ are the background states of the variables. Now, let $\mathbf{v}$ be the set of control variables. Then $\mathbf{v}'$ is the set of increments of the control variables, and without giving the details the transformation from the model variable increments to the control variable increments can be compactly formulated as

\[ \mathbf{v}' = \mathbf{T}\mathbf{x}', \tag{2.2.3} \]

for some matrix $\mathbf{T}$, which we denote the T-transform. Its inverse, i.e. the transformation from $\mathbf{v}'$ to $\mathbf{x}'$, is denoted the U-transform

\[ \mathbf{x}' = \mathbf{U}\mathbf{v}'. \tag{2.2.4} \]

The choice of the control variables is based on the fact that the background errors for each control variable are assumed uncorrelated with each other. Then the background error covariance matrix for the control variables, $\mathbf{B}_v$, will only contain spatial correlations in the errors for each control variable, and can be assumed to be block diagonal. $\mathbf{B}_v$ is obtained from $\mathbf{B}$ by applying the U-transform (2.2.4) to the first term of the cost function (2.2.2):

\[ \mathbf{x}'^T \mathbf{B}^{-1} \mathbf{x}' = \mathbf{v}'^T \mathbf{U}^T \mathbf{B}^{-1} \mathbf{U} \mathbf{v}' \quad \Rightarrow \quad \mathbf{B}_v^{-1} = \mathbf{U}^T \mathbf{B}^{-1} \mathbf{U}. \]
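The identity for the transformed background term can be checked numerically on a toy example. The sketch below assumes a hypothetical 2×2 transform U and, purely for illustration, constructs B as U Uᵀ, so that the transformed covariance is exactly the identity. None of these numbers come from the Met Office system, and the helper names (`matmul`, `inv2`, `quad_form`) are made up for this example.

```python
def matmul(A, B):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quad_form(M, x):
    """Quadratic form x^T M x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical U-transform; B is *constructed* as U U^T (an assumption made
# so that the control-space covariance is the identity).
U = [[2.0, 0.0], [1.0, 1.0]]
B = matmul(U, transpose(U))

# B_v^{-1} = U^T B^{-1} U
Bv_inv = matmul(matmul(transpose(U), inv2(B)), U)

# Background term evaluated in model space and in control space.
v_inc = [1.0, 3.0]                                   # control increment v'
x_inc = [sum(U[i][j] * v_inc[j] for j in range(2))   # x' = U v'
         for i in range(2)]
Jb_model = quad_form(inv2(B), x_inc)
Jb_control = quad_form(Bv_inv, v_inc)
```

In this toy case the dense model-space matrix B is replaced by a diagonal one in control space without changing the value of the background term, which is the effect the CVT is designed to achieve.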

With this simpler structure of the matrix, the minimization process can be done more easily in the control space. A simpler form of (2.2.2) is now obtained:

\[ J(\mathbf{v}') = \mathbf{v}'^T \mathbf{B}_v^{-1} \mathbf{v}' + (\mathbf{y} - \mathcal{H}(\mathbf{U}\mathbf{v}_b) - \mathbf{H}\mathbf{U}\mathbf{v}')^T \mathbf{R}^{-1} (\mathbf{y} - \mathcal{H}(\mathbf{U}\mathbf{v}_b) - \mathbf{H}\mathbf{U}\mathbf{v}'). \]

Once the minimization is completed, the solution $\mathbf{v}'$ is transformed back to the model space via the U-transform to give the solution $\mathbf{x}'$ in the model space.

Operationally at the Met Office, the transformations are further extended to remove the spatial correlations of each variable using vertical and horizontal transforms [49]. By doing this, the matrix can be assumed to be diagonal, hence it does not need to be stored explicitly. Let $\mathbf{z}$ be the set of variable increments which have undergone a control variable and spatial transform. Then

\[ \mathbf{z} = \mathbf{T}_h \mathbf{T}_v \mathbf{T} \mathbf{x}', \qquad \mathbf{x}' = \mathbf{U} \mathbf{U}_v \mathbf{U}_h \mathbf{z}, \]

where $\mathbf{T}_h$ and $\mathbf{T}_v$ are the horizontal and vertical transforms which remove the spatial correlations of the variables, and $\mathbf{U}_h$ and $\mathbf{U}_v$ are the respective inverses. The transformed background error covariance matrix, $\Lambda$, is simply the identity matrix:

\[ \mathbf{I} = \Lambda^{-1} = \mathbf{T}_h \mathbf{T}_v \mathbf{T} \mathbf{B}^{-1} \mathbf{U} \mathbf{U}_v \mathbf{U}_h. \]

Therefore, the problem of storing and inverting the B-matrix has been shifted to defining suitable control variable and spatial transforms, and so the matrix never needs to be explicitly represented.

2.2.4 Choice of Control Variable Transform

We now discuss the physics behind the choice of the control variables. It is important that these variables are uncorrelated, or at least nearly uncorrelated, otherwise the assumption of the transformed background error covariance matrix having a diagonal structure will not be accurate. Although the Met Office uses compressible non-hydrostatic equations, here we only analyze the hydrostatic motions. In [49, 53], it was identified that there are two modes of atmospheric motion: geostrophically balanced Rossby waves and geostrophically unbalanced inertia-gravity waves. Balanced and unbalanced flows are assumed to have little or no interaction with each other, meaning that their errors will be uncorrelated.
In a linear shallow atmosphere system, one third of the modes are characterized as balanced and two thirds as unbalanced [5], and this can be extended to a 3D atmosphere with the assumption that the modes will still be uncorrelated. This provides the inspiration for the choice of control variables in the CVT for the operational VAR system. Good control variables are ones which capture either the balanced or the unbalanced modes, and separating the control variables in this way forms the basis of the idea that their errors will be uncorrelated. At the Met Office the control variables currently used operationally are:

• The streamfunction, $\psi$
• The velocity potential, $\chi$
• The unbalanced pressure, $p_u$

The streamfunction is assumed to be a completely balanced variable, whilst the velocity potential is assumed unbalanced. In general, pressure and streamfunction have balanced and unbalanced components, and likewise for their perturbations:

\[ p' = p'_u + p'_b, \tag{2.2.5} \]
\[ \psi' = \psi'_u + \psi'_b. \tag{2.2.6} \]

However, for the CVT, it is assumed that $\psi'_u = 0$ because the streamfunction is assumed to be completely balanced. Since a shallow atmosphere approximation is used for the analysis of the modes, the depth of the atmosphere is assumed to have a constant value of $a$.

The T-transform: $(u', v', w', \theta', \rho', p', q') \to (\psi', \chi', p'_u)$

We now give the details of the T-transform, i.e. how to calculate each of the control variables from the model variables. Note that there are more model variables than control variables, but in order to carry out the T-transform we only need the three model variables $(u', v', p')$. The streamfunction and velocity potential increments are found from the horizontal wind components by solving the following elliptic equations on each vertical level of the sphere, where $\nabla_r^2$ is the 2D Laplacian on the sphere:

\[ \nabla_r^2 \psi' = \frac{1}{a\sin\phi}\frac{\partial v'}{\partial \lambda} - \frac{1}{a}\frac{\partial u'}{\partial \phi}, \tag{2.2.7} \]
\[ \nabla_r^2 \chi' = \frac{1}{a\sin\phi}\frac{\partial u'}{\partial \lambda} + \frac{1}{a}\frac{\partial v'}{\partial \phi}, \tag{2.2.8} \]
\[ \nabla_r^2 = \frac{1}{a^2\sin\phi}\left( \frac{\partial}{\partial \phi}\left( \sin\phi \frac{\partial}{\partial \phi} \right) + \frac{\partial}{\partial \lambda}\left( \frac{1}{\sin\phi}\frac{\partial}{\partial \lambda} \right) \right). \]

Since the right-hand sides of (2.2.7) and (2.2.8) have zero mean on the sphere, a solution exists for both equations and is unique up to a constant. Then, the linear balance equation

\[ \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_b \right) - \nabla_r^2 p'_b = 0, \tag{2.2.9} \]

for the increments $\psi'_b$ and $p'_b$ is used to calculate the balanced pressure field, where $\rho_0$ is a reference state density and $f$ is the Coriolis parameter, which is latitude dependent. Since the streamfunction is assumed to be completely balanced, we have $\psi' = \psi'_b$, and so $p'_b$ can be calculated. The full pressure increment $p'$ is known because it is a model variable, so the unbalanced pressure is calculated trivially from (2.2.5).

The U-transform: $(\psi', \chi', p'_u) \to (u', v', w', \theta', \rho', p', q')$

We now give the details of the transformation from the control variables to the model variables, i.e. the U-transform. The three control variables are transformed to the model variables $(u', v', p')$, which are then used to obtain the remaining model variables $(w', \theta', \rho', q')$. Firstly, a Helmholtz decomposition is used to separate velocities into rotational and divergent parts. Therefore we have

\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \nabla_r^{\perp} \psi' + \nabla_r \chi', \tag{2.2.10} \]

which gives

\[ u' = \frac{1}{a\sin\phi}\frac{\partial \chi'}{\partial \lambda} - \frac{1}{a}\frac{\partial \psi'}{\partial \phi}, \tag{2.2.11} \]
\[ v' = \frac{1}{a\sin\phi}\frac{\partial \psi'}{\partial \lambda} + \frac{1}{a}\frac{\partial \chi'}{\partial \phi}. \tag{2.2.12} \]

We then obtain the pressure by first calculating the balanced pressure, $p'_b$. This is calculated by solving (2.2.9) on each vertical layer, again with the assumption that $\psi' = \psi'_b$. We know $p'_u$ as it is a control variable, so we obtain the full pressure field using (2.2.5). With the pressure increment obtained, we then use (2.1.1) and the following linearized hydrostatic equation (as stated in [80]) to obtain the virtual potential temperature increment $\theta'_v$:

\[ \frac{\partial \Pi}{\partial r} = -\frac{g}{c_p \theta_v}. \]

The potential temperature, $\theta'$, and specific humidity, $q'$, increments are then calculated by linearising the following equations

\[ \theta_v = \theta \left( 1 + [\epsilon^{-1} - 1]\, q \right), \tag{2.2.13} \]
\[ rh = q \left( \frac{p}{\epsilon\, e_s(T)} \right), \tag{2.2.14} \]

and then solving them for $\theta'$ and $q'$, where $\epsilon$ is the ratio of the molecular weight of water to the molecular weight of dry air, $e_s(T)$ is the saturated vapour pressure of water and $rh$ is the relative humidity increment. The density increment is obtained by rearranging the following equation from [80]:

\[ p = \left[ \frac{R \rho\, \theta_v}{p_{ref}^{\kappa}} \right]^{\frac{1}{1-\kappa}}. \tag{2.2.15} \]

Finally we obtain the vertical velocity increment $w'$. Operationally in the UM, this is currently set as $w' = 0$, but a more accurate value can be obtained by solving a particular equation known as the Quasi Geostrophic Omega (QG Ω) equation [38]:

\[ -\frac{N^2(r)}{\theta_{ref}\,\rho_{ref}} \nabla_r^2 (\rho_{ref} w') - f_0^2 \frac{\partial}{\partial r}\left( \frac{1}{\theta_{ref}\,\rho_{ref}} \frac{\partial}{\partial r}(\rho_{ref} w') \right) = P_w \tag{2.2.16} \]

with boundary conditions $w' = 0$ on the upper and lower vertical boundaries. $\theta_{ref}$ and $\rho_{ref}$ are hydrostatically balanced reference states of the potential temperature and density, $N^2(r) = \frac{g}{\theta_0}\frac{\partial \theta}{\partial r}$, and $f_0$ is the Coriolis parameter, assumed to be a constant. The form of this equation is valid only if the Boussinesq approximation is valid, and the equation is derived in the Met Office VAR scientific paper no. 16 (yet to be published). Clearly this is not the case for a nonhydrostatic model, but the errors are likely to be small enough for equation (2.2.16) to be a useful approximation.

The QG Ω equation is a three dimensional elliptic problem for $\rho_{ref} w'$, which the Met Office currently attempts to solve using the generalized conjugate residual (GCR) method [34, 69]. It is a highly ill-conditioned problem due to the large variation in coefficients between the poles, in addition to the large problem size of $O(10^6)$. Hence, the GCR solver used at the Met Office takes over 700 iterations to solve this problem, and so there is considerable scope for improvement before the solver can be used operationally.

2.3 PV-Based Control Variable Transformations

The control variables described in Section 2.2.3 are chosen on the basis that they satisfy the linear balance equation (2.2.9) exactly. However, this assumption is only valid for balanced components of flow because the unbalanced components cannot obey the linear balance equation. The separation between balanced and unbalanced modes is different in different Burger regimes [6]. The Burger number, $Bu$, is the ratio $L_R/L$, where $L$ is the horizontal length scale and $L_R$ is the Rossby radius of deformation,

\[ L_R = \frac{\sqrt{gH}}{f}, \]

where $H$ is the depth scale. $Bu$ is used to characterize the flow regime, where the two regimes are as follows:

• High Burger regimes ($Bu \gg 1$): This is achieved when $L \ll L_R$, i.e. when $f$ is small (e.g. at the tropics). Here the streamfunction is dominated by balanced components, so $\psi' = \psi'_b$.
• Low Burger regimes ($Bu \ll 1$): This is achieved when $L \gg L_R$, where the pressure field is dominated by balanced components, so $p' \approx p'_b$ and in general $\psi'_u \neq 0$.

This indicates that the control variables defined in Section 2.2.4 will capture high Burger regimes accurately because of the assumption that $\psi'$ is completely balanced. Solving the linear balance equation (2.2.9) for $p'_b$ is therefore only valid if $L \ll L_R$, i.e. in the high Burger regimes. However, the assumption that $\psi' = \psi'_b$ is only satisfied in high Burger regimes and so the low Burger regimes are not captured accurately.

These limitations are overcome by using a different set of control variables which represent the separation of balanced and unbalanced components across all flow regimes. The balanced mode is captured using a quantity known as the potential vorticity (PV). According to the theory of [5], balanced flow is associated with PV but unbalanced flow has no PV and so is completely independent of balanced flow, hence they are uncorrelated. We use this as motivation to choose PV-based control variables that will more accurately represent the flow in the two Burger regimes.
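The regime classification described above can be made concrete with a short calculation. This is an illustrative sketch only: the function names, the default value of $g$, the Earth's rotation rate and the example scales are assumptions chosen for illustration, not values taken from the thesis.

```python
import math

def coriolis_parameter(latitude_deg, omega=7.292e-5):
    """f = 2 * Omega * sin(latitude); Omega is the Earth's rotation rate in rad/s."""
    return 2.0 * omega * math.sin(math.radians(latitude_deg))

def rossby_radius(depth_scale_m, f, g=9.81):
    """Rossby radius of deformation, L_R = sqrt(g * H) / f."""
    return math.sqrt(g * depth_scale_m) / f

def burger_number(length_scale_m, depth_scale_m, f, g=9.81):
    """Bu = L_R / L, as defined in the text."""
    return rossby_radius(depth_scale_m, f, g) / length_scale_m

# Hypothetical scales: L = 1000 km and H = 10 km at two latitudes.
bu_tropics = burger_number(1.0e6, 1.0e4, coriolis_parameter(5.0))
bu_midlat = burger_number(1.0e6, 1.0e4, coriolis_parameter(60.0))
```

For fixed $L$ and $H$ the Burger number grows as $f$ decreases, so the same horizontal scale sits in a higher Burger regime near the tropics than at mid-latitudes.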
These new variables are:

• The balanced streamfunction, $\psi_b$
• The velocity potential, $\chi$
• The unbalanced pressure, $p_u$

Clearly the only difference is that $\psi_b$ is used instead of $\psi$, thus the assumption that the streamfunction is balanced across all flow regimes is no longer used.

This new control variable transform using PV recognizes the presence of unbalanced streamfunction in low Burger regimes, but resembles the existing scheme (which is accurate) for high Burger regimes. Consequently the control variables can be partitioned into purely balanced and unbalanced parts.

The PV-based control variable transform is based around two equations. In the UM, PV is called Ertel PV, $Q$, which is calculated from $\psi$ and $p$, and an incremental form of the equation is given as follows:

\[ \alpha_0 \nabla_r^2 \psi' + \beta_0 p' + \gamma_0 \frac{\partial p'}{\partial r} + \varepsilon_0 \frac{\partial^2 p'}{\partial r^2} = Q', \tag{2.3.1} \]

where $\alpha_0$, $\beta_0$, $\gamma_0$ and $\varepsilon_0$ are reference state quantities of the model variables. The second is the linear balance equation (2.2.9), which relates the balanced components of $\psi$ and $p$. Now, since there is no PV in unbalanced components of flow, (2.3.1) is written in terms of $\psi'_u$ and $p'_u$ as

\[ \alpha_0 \nabla_r^2 \psi'_u + \beta_0 p'_u + \gamma_0 \frac{\partial p'_u}{\partial r} + \varepsilon_0 \frac{\partial^2 p'_u}{\partial r^2} = 0. \tag{2.3.2} \]

By substituting (2.2.5) and (2.2.6) into (2.3.1) and using (2.3.2) we get

\[ \alpha_0 \nabla_r^2 \psi'_b + \beta_0 p'_b + \gamma_0 \frac{\partial p'_b}{\partial r} + \varepsilon_0 \frac{\partial^2 p'_b}{\partial r^2} = Q'. \tag{2.3.3} \]

Only the balanced component of flow satisfies the linear balance equation (2.2.9), so replacing $\psi'_b$ and $p'_b$ with $\psi'_u$ and $p'_u$ respectively gives a residual, denoted anti-PV, $\bar{Q}'$, which we calculate from $p'$ and $\psi'$:

\[ \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_u \right) - \nabla_r^2 p'_u = \bar{Q}' = \nabla_r^2 \xi', \tag{2.3.4} \]

where $\xi'$ is a measure of imbalance. Then adding the linear balance equation (2.2.9) to (2.3.4), we obtain

\[ \nabla_r \cdot \left( f \rho_0 \nabla_r \psi' \right) - \nabla_r^2 p' = \nabla_r^2 \xi', \tag{2.3.5} \]

by using again (2.2.5) and (2.2.6). We have now calculated PV and anti-PV from $p'$ and $\psi'$, and thus we now use these to calculate the control variables $p'_u$ and $\psi'_b$.

The balanced streamfunction is found by first using the linear balance equation (2.2.9) to solve for $p'_b$, which can be formally written using the inverse Laplacian operator $\nabla_r^{-2}$ as

\[ p'_b = \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_b \right). \tag{2.3.6} \]

Substituting (2.3.6) into (2.3.3) we finally obtain the balanced equation for $\psi'_b$:

\[ \alpha_0 \nabla_r^2 \psi'_b + \beta_0 \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_b \right) + \gamma_0 \frac{\partial}{\partial r}\left( \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_b \right) \right) + \varepsilon_0 \frac{\partial^2}{\partial r^2}\left( \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_b \right) \right) = Q'. \tag{2.3.7} \]

A similar approach is used to find the unbalanced pressure. Rearranging (2.3.5) we get

\[ p'_u = \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_u \right) - \xi', \tag{2.3.8} \]

and substituting (2.3.8) into (2.3.2), we obtain the unbalanced equation for $\psi'_u$:

\[ \alpha_0 \nabla_r^2 \psi'_u + \beta_0 \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_u \right) + \gamma_0 \frac{\partial}{\partial r}\left( \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_u \right) \right) + \varepsilon_0 \frac{\partial^2}{\partial r^2}\left( \nabla_r^{-2} \nabla_r \cdot \left( f \rho_0 \nabla_r \psi'_u \right) \right) = \beta_0 \xi' + \gamma_0 \frac{\partial \xi'}{\partial r} + \varepsilon_0 \frac{\partial^2 \xi'}{\partial r^2}. \tag{2.3.9} \]

Then we use (2.3.8) to find $p'_u$.

Despite the convincing theory for the new PV-based control variable transformations (see [53, 6, 5, 5, 78, 81]), these are not yet operational. The main reason is the difficulty of solving (2.3.7) and (2.3.9) to obtain the control variables. The coefficients in the equations vary with height and latitude, and there are also two dimensional solves with $\nabla_r^{-2}$ required within the three dimensional problem. The result is a highly ill-conditioned problem, and previous attempts at implementing a solver for this have not resulted in satisfactory convergence to the solution. However, if these equations can be solved at all, let alone with great efficiency, then there is considerable scope for improving the accuracy of the initial conditions produced by VAR, particularly in the low Burger regimes.

2.4 Grid Structure for the UM

In this section we define the grids used for the discretisation of the equations in the UM and the CVT. The grid is regular in longitude $\lambda$ and latitude $\phi$, but graded in $r$ with a higher resolution of grid points near the surface of the Earth. A graded vertical grid spacing is desirable since there are larger vertical gradients and fluxes of variables near the surface of the Earth. Also, a staggered grid is used in all three coordinate directions, where the Arakawa C-grid staggering [] is used in the horizontal (i.e. the $\lambda$-$\phi$ plane, see Figure 2-2), while the Charney-Phillips grid staggering [1] is used in the vertical (see Figure 2-3).

Figure 2-2: Arakawa C-grid used in the horizontal

Figure 2-3: Charney-Phillips grid used in the vertical

Figure 2-4: The Met Office grid staggering

In the horizontal, the variables $\Pi$, $\rho$, $\theta$, $w$, $q$, $\chi$ and $u$ are located at the poles. In the vertical, the grid is staggered into $\theta$- and $\rho$-levels. There is an extra $\theta$-level because these occupy both the upper and lower boundary, and the $\rho$-levels are located halfway between. The grid staggering and the position of each variable is visualized in three dimensions in Figure 2-4.

The equations in the UM and in the CVT are discretised spatially using finite differences, with a special treatment of the variables that are discretised at the poles. Details of the spatial discretisation can be found in [8], but will also be covered in detail in Chapter 3 using a finite volume discretisation, where the discretisation at the poles can be derived more naturally.

2.5 Summary of the Key Elliptic Equations in NWP

We have seen throughout this chapter that the process of NWP and data assimilation leads to three types of three dimensional elliptic problems which are solved on a model

domain representing the atmosphere of the Earth:

• The Helmholtz problem (2.1.4), arising from the semi-implicit discretisation of the fully compressible non-hydrostatic equations used in the dynamical core of the UM.
• The Quasi Geostrophic Omega (QG Ω) equation (2.2.16) for finding the vertical velocity increment in the control variable transform, used in variational data assimilation.
• The balanced and unbalanced equations (2.3.7) and (2.3.9) for obtaining the control variables in the new PV-based control variable transform.

The three problems each have a particular role in NWP, and so it is essential to be able to solve them efficiently and accurately using sophisticated numerical methods. Although each of these equations is elliptic, they all pose different difficulties. For example, the coefficients present in each problem are different, so different techniques are required to compensate for the variation in these coefficients. The balanced and unbalanced equations require additional two dimensional solves, so further techniques must be investigated to tackle this problem. Moreover, solving these equations on a spherical domain poses additional difficulties because the coefficients in the equations degenerate towards the poles.

Therefore, it is clear that advanced numerical techniques are needed to help the Met Office solve these equations more efficiently than is currently possible. This will consequently improve the forecasts from the NWP models by allowing for finer grid resolutions and by improving the quality of the initial conditions for the forecasts.

The solution of these types of problems using iterative numerical techniques will form the core of this thesis. In subsequent chapters we investigate various solvers that are known to be able to deal with the type of issues that arise from these problems, and adapt them accordingly for the three main elliptic problems of interest.
Comparisons will be made with the solvers currently used at the Met Office, and if those solvers are significantly outperformed, it is likely that they will be replaced by the new solvers described in this thesis.

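Chapter 3

Model Problem and Discretisation

3.1 Model Problem

All the problems described in the previous chapter (summarized in Section 2.5), when formulated in spherical coordinates and suitably scaled, take the general tensor product form

\[ -\nabla \cdot \left( K \nabla u(\xi) \right) + a(\xi) \cdot \nabla u(\xi) + c(\xi) u(\xi) = g(\xi), \qquad \xi = (\xi_1, \xi_2, \xi_3) \in (0,1)^3, \tag{3.1.1} \]

with

\[ K = K(\xi) = \begin{pmatrix} \alpha_1(\xi) & & \\ & \alpha_2(\xi) & \\ & & \alpha_3(\xi) \end{pmatrix}, \qquad a = a(\xi) = [a_1(\xi), a_2(\xi), a_3(\xi)]^T, \]

and separable functions $\alpha_i(\xi) = \alpha_i^1(\xi_1)\, \alpha_i^2(\xi_2)\, \alpha_i^3(\xi_3)$, for $i \in \{1,2,3\}$. The differential operators in (3.1.1) are the usual gradient and divergence operators (see [15]) in Cartesian form, i.e.

\[ \nabla u = \left[ \frac{\partial u}{\partial \xi_1}, \frac{\partial u}{\partial \xi_2}, \frac{\partial u}{\partial \xi_3} \right]^T, \qquad \nabla \cdot F = \frac{\partial F_1}{\partial \xi_1} + \frac{\partial F_2}{\partial \xi_2} + \frac{\partial F_3}{\partial \xi_3}. \]

Let $\Omega = (0,1)^3$ with boundary $\Gamma$. $K$ is assumed to be positive definite almost everywhere (a.e.) in $\Omega$, i.e. $\alpha_i(\xi) > 0$ for all $i \in \{1,2,3\}$ a.e., but we are particularly concerned with the highly anisotropic and degenerate cases where $\alpha_i(\xi) \to 0$ as $\xi_i \to 0$ or $\xi_i \to 1$. These cases arise, for example, in spherical polar coordinates where the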

functions $\alpha_i$ degenerate near the poles. In addition, for problems in numerical weather prediction, there will usually be a second anisotropy in that $\alpha_1(\xi) \gg \alpha_2(\xi), \alpha_3(\xi)$. This comes from the fact that the Earth's atmosphere is very thin compared to the dimensions of the Earth's surface. Note however that, crucially, these anisotropies are grid-aligned.

In particular, for the special case of the Poisson equation with source term $f$ posed in the Earth's atmosphere and written in spherical polar coordinates (see [15]), we have

\[ K = K(\xi) = \begin{pmatrix} \left( \frac{a + d\xi_1}{d} \right)^2 \sin(\pi \xi_2) & & \\ & \frac{\sin(\pi \xi_2)}{\pi^2} & \\ & & \frac{1}{4\pi^2 \sin(\pi \xi_2)} \end{pmatrix}, \]

where $\xi_1$ is associated with the radial direction, $\xi_2$ with the polar angle and $\xi_3$ with the azimuthal angle. Note that we have nondimensionalized the problem by setting

\[ \xi_1 := \frac{R - a}{d}, \qquad \xi_2 := \frac{\Phi}{\pi}, \qquad \xi_3 := \frac{\Lambda}{2\pi}, \]

where $\{(R, \Phi, \Lambda) : a \le R \le a + d,\ 0 \le \Phi \le \pi,\ 0 \le \Lambda \le 2\pi\}$ are the usual spherical polar coordinates and $a \approx 6371$ km and $d \approx 63$ km are the Earth's radius and the depth of the atmosphere, respectively. Because $a \gg d$ and because $\sin(\pi \xi_2) \to 0$ as $\xi_2 \to 0$ or $\xi_2 \to 1$, we see that this problem is highly anisotropic as suggested above. Note that the equation has to be scaled by the volume element of a sphere, $(a + d\xi_1)^2 \sin(\pi \xi_2)$, defined in [15], so that it can be written in the general tensor product form of (3.1.1). With $a = 0$ and $c = 0$ we have the Poisson equation:

\[ -\frac{\partial}{\partial \xi_1}\left( \frac{(a + d\xi_1)^2 \sin(\pi \xi_2)}{d^2} \frac{\partial u}{\partial \xi_1} \right) - \frac{\partial}{\partial \xi_2}\left( \frac{\sin(\pi \xi_2)}{\pi^2} \frac{\partial u}{\partial \xi_2} \right) - \frac{\partial}{\partial \xi_3}\left( \frac{1}{4\pi^2 \sin(\pi \xi_2)} \frac{\partial u}{\partial \xi_3} \right) = f(\gamma_1, \gamma_2, \gamma_3)\, (a + d\xi_1)^2 \sin(\pi \xi_2), \tag{3.1.2} \]

where $g(\xi_1, \xi_2, \xi_3) = f(\gamma_1, \gamma_2, \gamma_3)$, and $\gamma$ is a reparameterization that maps the unit cube to the spherical shell with radius ranging from $a$ to $a + d$ as follows:

\[ \gamma(\xi_1, \xi_2, \xi_3) = \left( (a + d\xi_1) \sin(\pi \xi_2) \cos(2\pi \xi_3),\ (a + d\xi_1) \sin(\pi \xi_2) \sin(2\pi \xi_3),\ (a + d\xi_1) \cos(\pi \xi_2) \right)^T. \]
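To give a feel for the size of the two anisotropies, the diagonal entries of $K$ for the Poisson case can be evaluated directly. The sketch below is an illustration under stated assumptions: it uses the values $a \approx 6371$ km and $d \approx 63$ km from the text, the function names are hypothetical, and it compares the raw coefficients only (the effective anisotropy seen by a solver also depends on the mesh widths in each direction).

```python
import math

A_EARTH_KM = 6371.0   # Earth's radius a, as quoted in the text
D_ATMOS_KM = 63.0     # depth of the atmosphere d, as quoted in the text

def alpha(xi1, xi2, a=A_EARTH_KM, d=D_ATMOS_KM):
    """Diagonal entries (alpha_1, alpha_2, alpha_3) of K for the Poisson case."""
    s = math.sin(math.pi * xi2)
    alpha1 = (a + d * xi1) ** 2 * s / d ** 2      # radial direction
    alpha2 = s / math.pi ** 2                      # polar angle
    alpha3 = 1.0 / (4.0 * math.pi ** 2 * s)        # azimuthal angle
    return alpha1, alpha2, alpha3

def anisotropy_ratios(xi1, xi2):
    """Return (alpha_1 / alpha_2, alpha_3 / alpha_2) at the point (xi1, xi2)."""
    a1, a2, a3 = alpha(xi1, xi2)
    return a1 / a2, a3 / a2

radial_ratio_equator, polar_ratio_equator = anisotropy_ratios(0.0, 0.5)
_, polar_ratio_near_pole = anisotropy_ratios(0.0, 0.01)
```

The radial ratio $\alpha_1/\alpha_2 = \pi^2 (a + d\xi_1)^2 / d^2$ is independent of $\xi_2$ and is of order $10^5$, while $\alpha_3/\alpha_2 = 1/(4\sin^2(\pi\xi_2))$ grows without bound as the poles are approached.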

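3.2 Finite Volume Discretisation

In this section we describe the discretisation of the general problem (3.1.1) with various different boundary conditions using the finite volume method, as described in [36, 77]. We then look at the special case of spherical polar coordinates with periodic boundary conditions at the east and west boundaries and polar boundaries at the north and south boundaries. This will closely follow the discretisation presented in [7] for the two dimensional (2D) Poisson equation on the surface of the sphere, and [7] which also describes the discretisation on the three dimensional (3D) spherical shell.

3.2.1 Discretisation of the General Problem

In order to discretise (3.1.1) on the unit cube, let us subdivide $\Omega$ into $n_1 \times n_2 \times n_3$ cubes with cell centres $\{(\xi_{1,i}, \xi_{2,j}, \xi_{3,k}) : i = 1, \ldots, n_1,\ j = 1, \ldots, n_2,\ k = 1, \ldots, n_3\}$, and edge lengths $h_1 = 1/n_1$, $h_2 = 1/n_2$ and $h_3 = 1/n_3$. We discretise (3.1.1) using the cell centred finite volume method. The PDE is integrated over each mesh cell, or control volume, corresponding to a grid point $(\xi_{1,i}, \xi_{2,j}, \xi_{3,k})$, i.e.

\[ \Omega_{i,j,k} = \left[ \xi_{1,i-\frac12}, \xi_{1,i+\frac12} \right] \times \left[ \xi_{2,j-\frac12}, \xi_{2,j+\frac12} \right] \times \left[ \xi_{3,k-\frac12}, \xi_{3,k+\frac12} \right], \]

where $\xi_{1,i\pm\frac12} = \xi_{1,i} \pm \frac{h_1}{2}$, $\xi_{2,j\pm\frac12} = \xi_{2,j} \pm \frac{h_2}{2}$ and $\xi_{3,k\pm\frac12} = \xi_{3,k} \pm \frac{h_3}{2}$. The boundary of each control volume consists of six faces, i.e.

\[ \Gamma_{i,j,k} = \Gamma_{i-\frac12,j,k} \cup \Gamma_{i+\frac12,j,k} \cup \Gamma_{i,j-\frac12,k} \cup \Gamma_{i,j+\frac12,k} \cup \Gamma_{i,j,k-\frac12} \cup \Gamma_{i,j,k+\frac12}, \]

and the cell faces are denoted by

\[ \Gamma_{i\pm\frac12,j,k} = \{ \xi_{1,i\pm\frac12} \} \times \left[ \xi_{2,j-\frac12}, \xi_{2,j+\frac12} \right] \times \left[ \xi_{3,k-\frac12}, \xi_{3,k+\frac12} \right], \]

with analogous definitions for $\Gamma_{i,j\pm\frac12,k}$ and $\Gamma_{i,j,k\pm\frac12}$.

On each of the faces of $\Omega$ we impose either homogeneous Dirichlet boundary conditions, i.e.

\[ u(\xi) = 0 \quad \forall \xi \in \Gamma, \tag{3.2.1} \]

homogeneous Neumann boundary conditions, i.e.

\[ \frac{\partial u}{\partial n}(\xi) = 0 \quad \forall \xi \in \Gamma, \tag{3.2.2} \]

(where $n$ denotes the outward normal to the boundary $\Gamma$), or periodic boundary conditions, i.e.

\[ u(0, \xi_2, \xi_3) = u(1, \xi_2, \xi_3), \tag{3.2.3} \]
\[ \frac{\partial u}{\partial \xi_1}(0, \xi_2, \xi_3) = \frac{\partial u}{\partial \xi_1}(1, \xi_2, \xi_3) \quad \forall \xi_2 \in [0,1],\ \xi_3 \in [0,1], \tag{3.2.4} \]

and similarly for the boundary faces at $\xi_2 = 0, 1$ and $\xi_3 = 0, 1$.

We begin by deriving the discrete equations at points on the computational domain whose control volume does not contain a face on the boundary. The finite volume discretisation is obtained by first integrating (3.1.1) over each control volume $\Omega_{i,j,k}$, i.e.

\[ \int_{\Omega_{i,j,k}} -\nabla \cdot \left( K \nabla u(\xi) \right) + a(\xi) \cdot \nabla u(\xi) + c(\xi) u(\xi) \, dV = \int_{\Omega_{i,j,k}} g(\xi) \, dV \quad \forall i, j, k, \tag{3.2.5} \]

where $dV$ denotes the three dimensional volume element on $\Omega$, i.e.

\[ dV = d\xi_1 \, d\xi_2 \, d\xi_3. \tag{3.2.6} \]

Discretisation of the Second-Order Term

Let us first concentrate on finding the discrete equations for the second-order term. We use the divergence theorem to simplify the integrand of equation (3.2.5) by reducing a second order term to a first order term on the boundary of the control volume, which requires the equation to be written in divergence form. Applying the divergence theorem, we obtain

\[ -\int_{\Omega_{i,j,k}} \nabla \cdot (K \nabla u) \, dV = -\int_{\partial\Omega_{i,j,k}} K \nabla u \cdot n \, dS, \]

where $n$ is the outward unit normal vector to $\Omega_{i,j,k}$. Integrating over each cell face of the control volume, we have

\[ \begin{aligned} -\int_{\partial\Omega_{i,j,k}} K \nabla u \cdot n \, dS = {} & -\int_{\Gamma_{i+\frac12,j,k}} \alpha_1(\xi) \frac{\partial u}{\partial \xi_1} \, dS_{i+\frac12,j,k} + \int_{\Gamma_{i-\frac12,j,k}} \alpha_1(\xi) \frac{\partial u}{\partial \xi_1} \, dS_{i-\frac12,j,k} \\ & -\int_{\Gamma_{i,j+\frac12,k}} \alpha_2(\xi) \frac{\partial u}{\partial \xi_2} \, dS_{i,j+\frac12,k} + \int_{\Gamma_{i,j-\frac12,k}} \alpha_2(\xi) \frac{\partial u}{\partial \xi_2} \, dS_{i,j-\frac12,k} \\ & -\int_{\Gamma_{i,j,k+\frac12}} \alpha_3(\xi) \frac{\partial u}{\partial \xi_3} \, dS_{i,j,k+\frac12} + \int_{\Gamma_{i,j,k-\frac12}} \alpha_3(\xi) \frac{\partial u}{\partial \xi_3} \, dS_{i,j,k-\frac12}, \end{aligned} \tag{3.2.7} \]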

where the individual surface elements on $\Gamma$ are:

\[ dS_{i\pm\frac12,j,k} = d\xi_2 \, d\xi_3, \qquad dS_{i,j\pm\frac12,k} = d\xi_1 \, d\xi_3, \qquad dS_{i,j,k\pm\frac12} = d\xi_1 \, d\xi_2. \]

We approximate the derivative across the cell faces using central differences, and then each of the surface integrals is approximated using the midpoint rule, which gives:

\[ \begin{aligned} -\int_{\partial\Omega_{i,j,k}} K \nabla u \cdot n \, dS \approx {} & -\alpha_1(\xi_{i+\frac12,j,k}) \frac{U_{i+1,j,k} - U_{i,j,k}}{h_1} h_2 h_3 + \alpha_1(\xi_{i-\frac12,j,k}) \frac{U_{i,j,k} - U_{i-1,j,k}}{h_1} h_2 h_3 \\ & -\alpha_2(\xi_{i,j+\frac12,k}) \frac{U_{i,j+1,k} - U_{i,j,k}}{h_2} h_3 h_1 + \alpha_2(\xi_{i,j-\frac12,k}) \frac{U_{i,j,k} - U_{i,j-1,k}}{h_2} h_3 h_1 \\ & -\alpha_3(\xi_{i,j,k+\frac12}) \frac{U_{i,j,k+1} - U_{i,j,k}}{h_3} h_1 h_2 + \alpha_3(\xi_{i,j,k-\frac12}) \frac{U_{i,j,k} - U_{i,j,k-1}}{h_3} h_1 h_2, \end{aligned} \]

where $\xi_{i,j,k} = (\xi_{1,i}, \xi_{2,j}, \xi_{3,k})$ and $U_{i,j,k}$ is the discrete approximation to the solution $u$ at $\xi_{i,j,k}$.

Discrete approximations to derivatives at grid points are often written in stencil notation, which is a representation of the non-zero entries of a row of the matrix corresponding to a particular node on the grid. The finite volume discretisation of the second order term in (3.2.5) is a 7-point stencil (i.e. non-zero only at entries of the matrix corresponding to the node itself and its immediate neighbours) for each node on the grid whose control cell does not contain a face on the boundary. The 7-point stencil for node $(i,j,k)$ is:

\[ -\alpha_1(\xi_{i-\frac12,j,k}) H_1 \quad \begin{bmatrix} & -\alpha_2(\xi_{i,j+\frac12,k}) H_2 & \\ -\alpha_3(\xi_{i,j,k-\frac12}) H_3 & -\Sigma & -\alpha_3(\xi_{i,j,k+\frac12}) H_3 \\ & -\alpha_2(\xi_{i,j-\frac12,k}) H_2 & \end{bmatrix} \quad -\alpha_1(\xi_{i+\frac12,j,k}) H_1, \tag{3.2.8} \]

with $\Sigma$ denoting the sum of all the off-diagonal entries, and where $H_1 = \frac{h_2 h_3}{h_1}$, $H_2 = \frac{h_1 h_3}{h_2}$ and $H_3 = \frac{h_1 h_2}{h_3}$. The numbers in the square brackets give the 5-point stencil in the $\xi_2$-$\xi_3$ plane in the usual way (see, for example, [0]). The numbers outside the brackets denote the entries corresponding to the upwards and downwards neighbours. Note that we have used a similar notation to that in [7] to present the 7-point stencil. Note also that this stencil assumes a uniform grid in each coordinate direction. The following generalizations are made to the stencil to account for non-uniform grids:

\[ -\alpha_1(\xi_{i-\frac12,j,k}) H_1^- \quad \begin{bmatrix} & -\alpha_2(\xi_{i,j+\frac12,k}) H_2^+ & \\ -\alpha_3(\xi_{i,j,k-\frac12}) H_3^- & -\Sigma & -\alpha_3(\xi_{i,j,k+\frac12}) H_3^+ \\ & -\alpha_2(\xi_{i,j-\frac12,k}) H_2^- & \end{bmatrix} \quad -\alpha_1(\xi_{i+\frac12,j,k}) H_1^+, \tag{3.2.9} \]

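where

\[ H_1^\pm = \frac{h_{3,k} h_{2,j}}{h_{1,i}^\pm}, \qquad H_2^\pm = \frac{h_{3,k} h_{1,i}}{h_{2,j}^\pm} \qquad \text{and} \qquad H_3^\pm = \frac{h_{2,j} h_{1,i}}{h_{3,k}^\pm}, \]

$h_{1,i}$, $h_{2,j}$ and $h_{3,k}$ are the edge lengths of control volume $\Omega_{i,j,k}$, and $h^+$ and $h^-$ define the distances between adjacent cell centres, e.g.

\[ h_{1,i}^+ = \frac{h_{1,i} + h_{1,i+1}}{2}, \qquad h_{1,i}^- = \frac{h_{1,i} + h_{1,i-1}}{2}. \]

Discretisation of the First-Order Term

We now discretise the first-order term from (3.2.5). By integrating over the control volume, we have

\[ \int_{\Omega_{i,j,k}} a(\xi) \cdot \nabla u(\xi) \, dV, \]

where $dV$ is defined in (3.2.6). We use the average of forward and backward differences to approximate the first order derivatives across each cell, i.e.

\[ \left. \frac{\partial u}{\partial \xi_1} \right|_i \approx \frac{1}{2} \left\{ \frac{U_{i+1} - U_i}{h_{1,i}^+} + \frac{U_i - U_{i-1}}{h_{1,i}^-} \right\}. \tag{3.2.10} \]

Using (3.2.10) and the midpoint rule to approximate the volume integral, we obtain

\[ \begin{aligned} \int_{\Omega_{i,j,k}} a(\xi) \cdot \nabla u(\xi) \, dV & = \int_{\Omega_{i,j,k}} a_1(\xi) \frac{\partial u}{\partial \xi_1} + a_2(\xi) \frac{\partial u}{\partial \xi_2} + a_3(\xi) \frac{\partial u}{\partial \xi_3} \, dV \\ & \approx \int_{\Omega_{i,j,k}} a_1(\xi) \frac{U_{i+1,j,k} - U_{i-1,j,k}}{2h_1} + a_2(\xi) \frac{U_{i,j+1,k} - U_{i,j-1,k}}{2h_2} + a_3(\xi) \frac{U_{i,j,k+1} - U_{i,j,k-1}}{2h_3} \, d\xi_1 \, d\xi_2 \, d\xi_3 \\ & \approx a_1(\xi_{i,j,k}) (U_{i+1,j,k} - U_{i-1,j,k}) \frac{h_2 h_3}{2} + a_2(\xi_{i,j,k}) (U_{i,j+1,k} - U_{i,j-1,k}) \frac{h_1 h_3}{2} + a_3(\xi_{i,j,k}) (U_{i,j,k+1} - U_{i,j,k-1}) \frac{h_1 h_2}{2}, \end{aligned} \]

if a uniform mesh is used. In this case the 7-point stencil for the first-order term is

\[ -a_1(\xi_{i,j,k}) \frac{h_2 h_3}{2} \quad \begin{bmatrix} & +a_2(\xi_{i,j,k}) \frac{h_3 h_1}{2} & \\ -a_3(\xi_{i,j,k}) \frac{h_1 h_2}{2} & 0 & +a_3(\xi_{i,j,k}) \frac{h_1 h_2}{2} \\ & -a_2(\xi_{i,j,k}) \frac{h_3 h_1}{2} & \end{bmatrix} \quad +a_1(\xi_{i,j,k}) \frac{h_2 h_3}{2}, \]

or more generally, without the assumption of a uniform mesh,

\[ -a_1(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{1,i}^-} \quad \begin{bmatrix} & +a_2(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{2,j}^+} & \\ -a_3(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{3,k}^-} & -\Sigma & +a_3(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{3,k}^+} \\ & -a_2(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{2,j}^-} & \end{bmatrix} \quad +a_1(\xi_{i,j,k}) \frac{V_{i,j,k}}{2h_{1,i}^+}, \]

where $V_{i,j,k} = h_{1,i} h_{2,j} h_{3,k}$ and $\Sigma$ again denotes the sum of the off-diagonal entries. Note that the matrix resulting from the discretisation of the first order term is non-symmetric (or skew-symmetric for a regular grid).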
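The symmetry properties of the two contributions can be seen in a one dimensional analogue. The sketch below is illustrative only and makes several simplifying assumptions that are not part of the thesis: one space dimension, a uniform periodic mesh, and constant coefficients $\alpha$ and $a$. The function names are hypothetical.

```python
import numpy as np

def diffusion_matrix_periodic(n, alpha=1.0):
    """1D analogue of the second-order stencil on a uniform periodic mesh.

    Each face contributes alpha / h, giving the row (-alpha/h, 2*alpha/h, -alpha/h).
    """
    h = 1.0 / n
    D = np.zeros((n, n))
    for i in range(n):
        D[i, i] += 2.0 * alpha / h
        D[i, (i + 1) % n] -= alpha / h
        D[i, (i - 1) % n] -= alpha / h
    return D

def advection_matrix_periodic(n, a_const=1.0):
    """1D analogue of the first-order stencil: a * (U[i+1] - U[i-1]) / 2.

    The cell width h cancels after integrating the central difference over the cell.
    """
    C = np.zeros((n, n))
    for i in range(n):
        C[i, (i + 1) % n] += 0.5 * a_const
        C[i, (i - 1) % n] -= 0.5 * a_const
    return C

D = diffusion_matrix_periodic(8)
C = advection_matrix_periodic(8)
A = D + C   # combined operator: symmetric part D plus skew-symmetric part C
```

With a variable coefficient $a(\xi)$ or a graded mesh the first-order block is no longer exactly skew-symmetric, but it remains non-symmetric, so the assembled matrix is symmetric only when $a = 0$.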

Discretisation of the Zeroth-Order Term and Right-Hand-Side

Lastly we discretise the zeroth order term and the right-hand-side. Approximating the integral of the zeroth order term over the control volume, again using the midpoint rule, we get

\[ \int_{\Omega_{i,j,k}} c(\xi) u(\xi) \, dV \approx c(\xi_{i,j,k})\, u(\xi_{i,j,k})\, h_{1,i} h_{2,j} h_{3,k}. \]

Similarly the integration of the right hand side yields

\[ \int_{\Omega_{i,j,k}} g(\xi) \, dV \approx g(\xi_{i,j,k})\, h_{1,i} h_{2,j} h_{3,k}. \]

3.2.2 Discretisation of the Terms on the Boundary

Dirichlet Boundary Conditions

Suppose we have homogeneous Dirichlet boundary conditions (3.2.1) on the boundary $\Gamma$. Let us without loss of generality (w.l.o.g.) consider the nodes whose cell contains a face on the boundary $\xi_1 = 0$, i.e. nodes at which $i = 1$. Recall the following component from equation (3.2.7):

\[ \int_{\Gamma_{i-\frac12,j,k}} \alpha_1(\xi) \frac{\partial u}{\partial \xi_1} \, dS_{i-\frac12,j,k}. \tag{3.2.11} \]

We now use one-sided differences to approximate $\frac{\partial u}{\partial \xi_1}$, rather than central differences, in order to use the value of the solution at the boundary $\xi_1 = 0$. The exact solution is known at $\xi_1 = 0$, namely $U_{\frac12,j,k} = u(0, \xi_2, \xi_3) = 0$, and the distance between the two points is $h_{1,1}/2$ instead of $h_{1,1}$, so (3.2.11) becomes

\[ \alpha_1(\xi_{\frac12,j,k}) \frac{U_{1,j,k} - 0}{h_{1,1}/2} h_2 h_3 = 2\alpha_1(\xi_{\frac12,j,k}) \frac{U_{1,j,k}}{h_{1,1}} h_2 h_3. \]

The stencil for the nodes at which $i = 1$, without the assumption of a uniform mesh, becomes a six-point stencil:

\[ \begin{bmatrix} & -\alpha_2(\xi_{1,j+\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^+} & \\ -\alpha_3(\xi_{1,j,k-\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^-} & -\Sigma + 2\alpha_1(\xi_{\frac12,j,k}) \frac{h_{3,k} h_{2,j}}{h_{1,1}} & -\alpha_3(\xi_{1,j,k+\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^+} \\ & -\alpha_2(\xi_{1,j-\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^-} & \end{bmatrix} \quad -\alpha_1(\xi_{\frac32,j,k}) \frac{h_{3,k} h_{2,j}}{h_{1,1}^+}. \]

Analogously, we can obtain similar stencils at $i = n_1$, $j = 1$, $j = n_2$, $k = 1$ and $k = n_3$, i.e. at the edges and corners of the domain.
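The Dirichlet boundary treatment can be tried out in one dimension. The following is a minimal sketch, assuming a 1D model problem $-(\alpha u')' = g$ on $(0,1)$ with a uniform mesh and the manufactured solution $u = \sin(\pi x)$; these choices and the function names are assumptions for illustration and do not come from the thesis. Interior faces use the central difference over the distance $h$, while boundary faces use the one-sided difference over the half distance $h/2$, which adds $2\alpha/h$ to the diagonal.

```python
import numpy as np

def assemble_dirichlet_1d(n, alpha=lambda x: 1.0):
    """Cell-centred finite volume matrix for -(alpha u')' with u = 0 at x = 0 and x = 1."""
    h = 1.0 / n
    A = np.zeros((n, n))
    for i in range(n):
        x_left, x_right = i * h, (i + 1) * h
        if i == 0:
            # Boundary face: one-sided difference over the half cell width h/2.
            A[i, i] += 2.0 * alpha(x_left) / h
        else:
            A[i, i] += alpha(x_left) / h
            A[i, i - 1] -= alpha(x_left) / h
        if i == n - 1:
            A[i, i] += 2.0 * alpha(x_right) / h
        else:
            A[i, i] += alpha(x_right) / h
            A[i, i + 1] -= alpha(x_right) / h
    return A

def solve_model_problem(n):
    """Solve -u'' = pi^2 sin(pi x); the exact solution is u = sin(pi x)."""
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h                 # cell centres
    A = assemble_dirichlet_1d(n)
    b = np.pi ** 2 * np.sin(np.pi * x) * h       # midpoint rule for the right-hand side
    U = np.linalg.solve(A, b)
    return x, U, A

x, U, A = solve_model_problem(64)
max_error = np.max(np.abs(U - np.sin(np.pi * x)))
```

The resulting matrix is symmetric positive definite, in line with the properties stated for the general problem when $a = 0$ and Dirichlet conditions are imposed.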

Neumann Boundary Conditions

Now suppose we want to impose homogeneous Neumann boundary conditions (3.2.2) on $\Gamma$, and consider the boundary face where $\xi_1 = 0$ (i.e. $i = 1$). Recall again the component (3.2.11) from (3.2.7). By boundary condition (3.2.2), $\left. \frac{\partial u}{\partial \xi_1} \right|_{\xi_1 = 0} = 0$, and so the component (3.2.11) vanishes and we are left with the following stencil:

\[ \begin{bmatrix} & -\alpha_2(\xi_{1,j+\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^+} & \\ -\alpha_3(\xi_{1,j,k-\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^-} & -\Sigma & -\alpha_3(\xi_{1,j,k+\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^+} \\ & -\alpha_2(\xi_{1,j-\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^-} & \end{bmatrix} \quad -\alpha_1(\xi_{\frac32,j,k}) \frac{h_{3,k} h_{2,j}}{h_{1,1}^+}. \]

Similar stencils are obtained analogously at the remaining edges and corners of $\Omega$.

Periodic Boundary Conditions

Finally, suppose periodic boundary conditions (3.2.3)-(3.2.4) are imposed on $\Gamma$, and consider the boundary faces where $\xi_1 = 0$ and $\xi_1 = 1$ (i.e. $i = 1$ and $i = n_1$). We recall again the component (3.2.11) from (3.2.7) as before, and consider the nodes whose cell contains a face on the boundary $\xi_1 = 0$. By (3.2.3), $U_{\frac12,j,k} = U_{n_1+\frac12,j,k}$, so (3.2.11) becomes

\[ \alpha_1(\xi_{\frac12,j,k}) \frac{U_{1,j,k} - U_{n_1,j,k}}{h_{1,1}^-} h_2 h_3, \]

where $h_{1,1}^- = 0.5\,(h_{1,1} + h_{1,n_1})$, and so the seven-point stencil is

\[ -\alpha_1(\xi_{\frac12,j,k}) \frac{h_{3,k} h_{2,j}}{h_{1,1}^-} \quad \begin{bmatrix} & -\alpha_2(\xi_{1,j+\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^+} & \\ -\alpha_3(\xi_{1,j,k-\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^-} & -\Sigma & -\alpha_3(\xi_{1,j,k+\frac12}) \frac{h_{2,j} h_{1,1}}{h_{3,k}^+} \\ & -\alpha_2(\xi_{1,j-\frac12,k}) \frac{h_{3,k} h_{1,1}}{h_{2,j}^-} & \end{bmatrix} \quad -\alpha_1(\xi_{\frac32,j,k}) \frac{h_{3,k} h_{2,j}}{h_{1,1}^+}. \]

Once again, we obtain similar stencils at each of the edges and corners of $\Omega$.

Resulting System of Linear Equations

The discretisation results in a system of linear equations of the form

\[ A u = b, \]

where $A \in \mathbb{R}^{n \times n}$, and $n = n_\lambda n_\phi n_r$ is the dimension of the problem. Matrix $A$ represents the discretisation of the zeroth, first and second order terms. It is a sparse and symmetric positive definite (SPD) matrix if the coefficient $a$ to the first

order term in (3.1.1) is zero (otherwise it is non-symmetric) and if Dirichlet boundary conditions are imposed (otherwise it is positive semi-definite). $u \in \mathbb{R}^n$ is the unknown solution vector corresponding to the values of the unknown function $u$ at the cell centres, and $b \in \mathbb{R}^n$ is the right-hand side with entries $g(\xi_{i,j,k})\, h_{1,i} h_{2,j} h_{3,k}$. A lexicographical ordering of the unknowns is used.

3.2.3 Special Case: Spherical Polar Coordinates

Here we consider the discretisation of (3.1.1) for the particular case of spherical polar coordinates, with $\xi = (r, \phi, \lambda) \in (0,1)^3$, where $r$, $\phi$ and $\lambda$ are the nondimensionalized radial direction, polar angle and azimuthal angle respectively. This special case is of particular interest in numerical weather prediction, as the elliptic problems of interest from Chapter 2 are all given in spherical polar coordinates.

Recall that the problem needs to be scaled by $(a + dr)^2 \sin(\pi\phi)$ so that it can be written in the general tensor product form. The Poisson equation on a sphere is given in (3.1.2), but in general, elliptic problems on a sphere may have additional coefficients, which is certainly the case for the problems from Chapter 2. These can be written generally as

\[ \begin{aligned} \alpha_1(r, \phi, \lambda) & = L_r(\lambda, \phi)\, (a + dr)^2 \sin(\pi\phi) / d^2, \\ \alpha_2(r, \phi, \lambda) & = L_\phi(r, \lambda)\, \sin(\pi\phi) / \pi^2, \\ \alpha_3(r, \phi, \lambda) & = L_\lambda(r, \phi) / (4\pi^2 \sin(\pi\phi)), \end{aligned} \]

where the coefficients $L_r$, $L_\lambda$ and $L_\phi$ again have to be assumed to be separable, e.g. $L_r(\phi, \lambda) = L_r^\lambda(\lambda)\, L_r^\phi(\phi)$. For simplicity we assume that the $i$th coefficient $L_i$ does not depend on the $i$th variable, which is always the case for the problems in Chapter 2. In general we also have $c \neq 0$ and $a \neq 0$, but for the problems in Chapter 2 these are restricted to $c \ge 0$ and

\[ a(\xi) = \left[ a_1(r)\, (a + dr)^2 \sin(\pi\phi) / d,\ 0,\ 0 \right]^T. \]

The main difference in this setting is the boundary conditions in the $\phi$-direction coming from the pole, which also affect the mesh defined on $\Omega$ at these boundaries.
The mesh is defined by subdividing $\Omega$ into $n_\lambda \times n_\phi \times n_r$ cubes with cell centres $\{(r_i, \phi_j, \lambda_k) : i = 1, \ldots, n_r,\ j = 1, \ldots, n_\phi,\ k = 1, \ldots, n_\lambda\}$, and edge lengths $h_\lambda = 1/n_\lambda$, $h_\phi = 1/(n_\phi + 1)$ and $h_{r,i}$, $i = 1, \ldots, n_r$. In addition there are $2n_r$ cells at the poles with edge lengths $1$, $h_\phi/2$ and $h_{r,i}$. The computational grid in the $\lambda$-$\phi$ plane can be seen in Figure 3-1(a), where the top and bottom boundary represent the North and South pole, respectively. The grid on the $\lambda$-$\phi$ plane is uniform,

except at the poles. The nodes are located at the cell centres, where $\lambda_k = (k - \frac12) h_\lambda$ and $\phi_j = j h_\phi$. At the poles we use half cells, so that the poles themselves are located at the centres of the cells in the physical domain $\Omega$, and so that a discrete equation can be derived at these points in the same fashion as at the other points. In the radial direction, the mesh is graded as shown in Figure 3-1(b) with the cell centres located at $r_i = \sum_{t=1}^{i-1} h_{r,t} + \frac12 h_{r,i}$, and with the mesh widths $h_{r,i}$ increasing with $i$. Thus, the total number of unknowns, including the unknowns at the poles, is $(n_\lambda n_\phi + 2)\, n_r$. This computational grid is the same as the Arakawa C-grid and the Charney-Phillips grid introduced in Chapter 2, where the cell centred nodes are located at the $p$-points and $\rho$-levels.

Figure 3-1: (a) The computational grid in the $\lambda$-$\phi$ plane, and (b) the graded mesh in the $r$-direction.

As described in Section 3.2.1, we discretise (3.1.1) using the cell centred finite volume method, where first the PDE is integrated over each mesh cell (or control volume) corresponding to an interior grid point $(r_i, \phi_j, \lambda_k)$, i.e. $\Omega_{i,j,k}$, or corresponding to a grid point at the pole. Denoting the control volumes corresponding to a vertical grid point at level $i$ on the south and north poles as $\Omega_i^S$ and $\Omega_i^N$ respectively, we have

\[ \Omega_i^S = \left[ r_{i-\frac12}, r_{i+\frac12} \right] \times \left[ 0, \phi_{\frac12} \right] \times [0,1], \qquad \text{and} \qquad \Omega_i^N = \left[ r_{i-\frac12}, r_{i+\frac12} \right] \times \left[ \phi_{n_\phi+\frac12}, 1 \right] \times [0,1], \]

where $\phi_{\frac12} = h_\phi/2$. Except at the poles, the boundary of each control volume, $\Gamma_{i,j,k}$, consists of six faces, as described in Section 3.2.1. Each control volume at the poles,

however, has $n_\lambda + 2$ faces, i.e.

\[ \Gamma_i^S = \bigcup_{k=1}^{n_\lambda} \Gamma_{i,\frac12,k} \cup \Gamma^S_{i-\frac12} \cup \Gamma^S_{i+\frac12}, \]

where $\Gamma^S_{i\pm\frac12} = \{ r_{i\pm\frac12} \} \times \left[ 0, \phi_{\frac12} \right] \times [0,1]$ are the top and bottom faces of the control volume at the south pole on vertical grid level $i$.

Since the problem is discretised on a sphere, it is necessary to impose periodic boundary conditions on the lateral boundary, i.e.

\[ u(r, \phi, 0) = u(r, \phi, 1), \qquad \frac{\partial u}{\partial \lambda}(r, \phi, 0) = \frac{\partial u}{\partial \lambda}(r, \phi, 1) \quad \forall \phi \in [0,1],\ r \in [0,1]. \]

In addition, for the upper and lower boundaries of the atmosphere (corresponding to $r = 0$ and $r = 1$), the problems from Chapter 2 either have homogeneous Dirichlet boundary conditions, i.e.

\[ u(0, \phi, \lambda) = u(1, \phi, \lambda) = 0, \]

or homogeneous Neumann boundary conditions, i.e.

\[ \frac{\partial u}{\partial r}(0, \phi, \lambda) = \frac{\partial u}{\partial r}(1, \phi, \lambda) = 0. \]

The discretisation at the interior nodes is identical to the general case, resulting in the same stencil (3.2.9). Recall that for spherical polar coordinates we have

\[ \begin{aligned} \alpha_1(\xi_{i\pm\frac12,j,k}) & = \alpha_1(r_{i\pm\frac12}, \phi_j, \lambda_k) = L_r(\phi_j, \lambda_k)\, (a + d r_{i\pm\frac12})^2 \sin(\pi\phi_j) / d^2, \\ \alpha_2(\xi_{i,j\pm\frac12,k}) & = \alpha_2(r_i, \phi_{j\pm\frac12}, \lambda_k) = L_\phi(r_i, \lambda_k)\, \sin(\pi\phi_{j\pm\frac12}) / \pi^2, \\ \alpha_3(\xi_{i,j,k\pm\frac12}) & = \alpha_3(r_i, \phi_j, \lambda_{k\pm\frac12}) = L_\lambda(r_i, \phi_j) / (4\pi^2 \sin(\pi\phi_j)), \end{aligned} \]

and $h_1 = h_r$, $h_2 = h_\phi$, $h_3 = h_\lambda$.

We now come to the discretisation at the boundaries. The homogeneous Dirichlet and Neumann boundary conditions at $r = 0$ and $r = 1$ have been covered in the general case, as well as the periodic boundary conditions on the lateral boundary. What remains is the treatment of the north-south boundaries at the polar regions.

Poles

We now tackle the discretisation at the poles, which occupy an entire $r$-$\lambda$ plane. Each pole cell (apart from those at the top or bottom boundary) has an entire $\lambda$-line of

neighbours in addition to upper and lower neighbours, thus a total of $n_\lambda + 2$ neighbours, leading to an $(n_\lambda + 3)$-point stencil at these cells. The pole nodes on level $i$ are located at the centre of the control volume $\Omega^S_i$ or $\Omega^N_i$, which is comprised of $n_\lambda$ half cells (see Figure 3-2), with $\Gamma^S_{i\pm\frac{1}{2}}$ or $\Gamma^N_{i\pm\frac{1}{2}}$ corresponding to the top and bottom faces of the control volume.

Figure 3-2: The polar region, with the pole at the centre of $n_\lambda$ half cells.

For the south pole (where $\phi = 0$), the integration over control volume $\Omega^S_i$ leads to the following discretisation:

$$-\int_{\Gamma^S_i} K\nabla u \cdot n \, dS = -\int_{\Gamma^S_{i+\frac{1}{2}}} \alpha_1(\xi)\frac{\partial u}{\partial r}\, dS^S_{i+\frac{1}{2}} + \int_{\Gamma^S_{i-\frac{1}{2}}} \alpha_1(\xi)\frac{\partial u}{\partial r}\, dS^S_{i-\frac{1}{2}} - \sum_{k=1}^{n_\lambda} \int_{\Gamma_{i,\frac{1}{2},k}} \alpha_2(\xi)\frac{\partial u}{\partial \phi}\, dS_{i,\frac{1}{2},k}$$

$$\approx \sum_{k=1}^{n_\lambda} \left[ -\alpha_1(r_{i+\frac{1}{2}},\phi_{\frac{1}{4}},\lambda_k)\, \frac{h_\lambda h_\phi}{2\,h^+_{r,i}} \left(U^S_{i+1} - U^S_i\right) + \alpha_1(r_{i-\frac{1}{2}},\phi_{\frac{1}{4}},\lambda_k)\, \frac{h_\lambda h_\phi}{2\,h^-_{r,i}} \left(U^S_i - U^S_{i-1}\right) - \alpha_2(r_i,\phi_{\frac{1}{2}},\lambda_k)\, \frac{h_{r,i}\, h_\lambda}{h_\phi} \left(U_{i,1,k} - U^S_i\right) \right],$$

where $U^S_i$ is the discrete solution at the south pole on level $i$ and $\phi_{\frac{1}{4}} = h_\phi/4$. For the right hand side at the south pole, again we discretise by approximating the integral using the midpoint rule:

$$\int_{\Omega^S_i} g\,(a+dr)^2 \sin(\pi\phi)\, dV \approx g^S_i\,(a + d\,r_i)^2 \sin(\pi\phi_{\frac{1}{4}})\, h_{r,i}\,\frac{h_\phi}{2},$$

where $h_{r,i}\, h_\phi/2$ is an approximation to the volume of the control cell $\Omega^S_i$, and we see from Figure 3-1(a) that $(r_i, \phi_{\frac{1}{4}}, 0.5)$ is the midpoint of $\Omega^S_i$ in the computational grid. Note that the midpoint of $\Omega^S_i$ in the physical domain is actually located at the south pole (see Figure 3-2), but the special treatment we use at the poles is consistent with the work of Barros [7] and also with the discretisation used at the Met Office. Analogously for the north pole (where $\phi = \pi$) we obtain

$$\sum_{k=1}^{n_\lambda} \left[ -\alpha_1(r_{i+\frac{1}{2}},\phi_{n_\phi+\frac{3}{4}},\lambda_k)\, \frac{h_\lambda h_\phi}{2\,h^+_{r,i}} \left(U^N_{i+1} - U^N_i\right) + \alpha_1(r_{i-\frac{1}{2}},\phi_{n_\phi+\frac{3}{4}},\lambda_k)\, \frac{h_\lambda h_\phi}{2\,h^-_{r,i}} \left(U^N_i - U^N_{i-1}\right) - \alpha_2(r_i,\phi_{n_\phi+\frac{1}{2}},\lambda_k)\, \frac{h_{r,i}\, h_\lambda}{h_\phi} \left(U_{i,n_\phi,k} - U^N_i\right) \right] = g^N_i\,(a + d\,r_i)^2 \sin(\pi\phi_{n_\phi+\frac{3}{4}})\, h_{r,i}\,\frac{h_\phi}{2}.$$

Resulting System of Linear Equations

The discretisation results in a system of linear equations of the form

$$Au = b.$$

Here $A \in \mathbb{R}^{n\times n}$, and $n = (n_\lambda n_\phi + 2)\, n_r$ is the dimension of the problem. The vector $u \in \mathbb{R}^n$ is the unknown solution vector corresponding to the values of the unknown function $u$ at the cell centres, and $b \in \mathbb{R}^n$ is the right-hand side containing the source terms. The matrix $A$ contains seven non-zero entries per row corresponding to an interior node, and $n_\lambda + 3$ non-zeros per row corresponding to pole nodes that are not at the upper or lower boundaries. For typical problem sizes used at the Met Office, $n_\lambda + 3 \gg 7$, hence the rows corresponding to the pole nodes are significantly more dense. For nodes whose cell face is on the upper or lower boundary, there are six non-zeros (for non-pole nodes) or $n_\lambda + 2$ non-zeros per row (for pole nodes). Figure 3-3 is a spy plot showing the sparsity pattern of matrix $A$ with problem size $n = 152$ ($n_\lambda = 6$, $n_\phi = 6$, $n_r = 4$). Periodic boundary conditions are imposed at the lateral boundary, in addition to the polar boundary at the north-south boundaries.
If the vertical boundary conditions are of Neumann-type, then these boundary conditions yield a singular system of linear equations, meaning that the solution to this system is unique only up to a constant. Techniques for dealing with this singularity will be discussed in Section 5.5.2.
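The unknown and non-zero counts quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `system_size` and `nonzeros_per_row` are hypothetical and not part of any Met Office code, and the row counts simply encode the numbers stated in the text.

```python
def system_size(n_lambda: int, n_phi: int, n_r: int) -> int:
    """Unknowns: n_lambda*n_phi regular columns plus two pole columns, on n_r levels."""
    return (n_lambda * n_phi + 2) * n_r


def nonzeros_per_row(n_lambda: int, is_pole: bool, on_vertical_boundary: bool) -> int:
    """Non-zero entries in one matrix row, following the counts given in the text."""
    if is_pole:
        # n_lambda lateral neighbours + upper + lower neighbour + the node itself
        count = n_lambda + 3
    else:
        # six face neighbours + the node itself
        count = 7
    if on_vertical_boundary:
        # one vertical neighbour is missing at the top or bottom boundary
        count -= 1
    return count


if __name__ == "__main__":
    # The small example used for the spy plot: n_lambda = 6, n_phi = 6, n_r = 4
    print(system_size(6, 6, 4))
    print(nonzeros_per_row(6, is_pole=True, on_vertical_boundary=False))
```

Even for this tiny grid a pole row is denser than an interior row, and the gap grows linearly with $n_\lambda$.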

Figure 3-3: A spy plot showing the sparsity pattern of matrix $A$ for the 3D problem.

Two Dimensional Problems on the Surface of the Sphere

In NWP, the solution of Poisson-type equations on the sphere in two dimensions is also highly important. One such use of a two dimensional Poisson solver is for the balanced and unbalanced equations (2.3.7) and (2.3.9), as well as regularly within the control variable transform (CVT) described in Chapter 2. Therefore in this section we derive the finite volume discretisation of the two dimensional Poisson-type equation. Consider the following two dimensional abstract equation:

$$-\nabla\cdot\left(K\nabla u(\xi)\right) = g(\xi) \quad \text{on } \Omega_{2D}, \qquad (3.2.12)$$

with

$$K = K(\xi) = \begin{pmatrix} \alpha_1(\xi) & 0 \\ 0 & \alpha_2(\xi) \end{pmatrix},$$

for $\xi = (\lambda,\phi)$ and for separable functions $\alpha_1(\phi,\lambda) = \alpha_1^1(\phi)\,\alpha_1^2(\lambda) > 1$ a.e. and $\alpha_2(\phi,\lambda) = \alpha_2^1(\phi)\,\alpha_2^2(\lambda) > 1$ a.e. Of particular interest is the case of a Poisson-type equation in spherical polar coordinates which is given by $\alpha_1(\phi,\lambda) = L_\phi(\lambda)\sin(\pi\phi)/\pi$, $\alpha_2(\phi,\lambda) = L_\lambda(\phi)/(4\pi \sin(\pi\phi))$ and a scaling by $\sin(\pi\phi)$:

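$$-\frac{\partial}{\partial \phi}\left(\frac{L_\phi(\lambda)\sin(\pi\phi)}{\pi}\frac{\partial u}{\partial \phi}\right) - \frac{\partial}{\partial \lambda}\left(\frac{L_\lambda(\phi)}{4\pi \sin(\pi\phi)}\frac{\partial u}{\partial \lambda}\right) = g\sin(\pi\phi). \qquad (3.2.13)$$

This is solved on the unit square $\Omega_{2D} = (0,1)^2$, which is subdivided into $n_\phi n_\lambda$ cells $\{(\phi_j,\lambda_k) : j = 1,\dots,n_\phi,\ k = 1,\dots,n_\lambda\}$, in addition to one cell for each pole. This computational grid is pictured in Figure 3-1(a), and the edge lengths $h_\phi$ and $h_\lambda$ are defined as for the three dimensional grid. As in Section 3.2.1, (3.2.12) is integrated over each control volume, and using the Divergence theorem it is simplified to a first order term on the boundary of the control volume. Each of the line integrals is approximated by the midpoint rule and the derivatives involved on it by central differences, producing a 5-point stencil at an interior node $(j,k)$:

$$\begin{bmatrix} & -L_\phi(\lambda_k)\,\dfrac{h_\lambda}{h_\phi}\,\dfrac{\sin(\pi\phi_{j+\frac{1}{2}})}{\pi} & \\[2ex] -L_\lambda(\phi_j)\,\dfrac{h_\phi}{h_\lambda}\,\dfrac{1}{4\pi\sin(\pi\phi_j)} & \Sigma & -L_\lambda(\phi_j)\,\dfrac{h_\phi}{h_\lambda}\,\dfrac{1}{4\pi\sin(\pi\phi_j)} \\[2ex] & -L_\phi(\lambda_k)\,\dfrac{h_\lambda}{h_\phi}\,\dfrac{\sin(\pi\phi_{j-\frac{1}{2}})}{\pi} & \end{bmatrix}, \qquad (3.2.14)$$

where the central entry $\Sigma$ is minus the sum of the four off-diagonal entries, and the sparsity pattern for the matrix is plotted in Figure 3-4.

Figure 3-4: A spy plot showing the sparsity pattern of the matrix for the 2D problem.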
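The 5-point stencil at an interior node can be evaluated directly from these formulas. The sketch below is illustrative only: the function name `stencil_2d` is hypothetical, $L_\phi$ and $L_\lambda$ are taken to be constants, the mesh widths are assumed to be $h_\phi = 1/(n_\phi+1)$ and $h_\lambda = 1/n_\lambda$ (consistent with $\phi_j = jh_\phi$ and poles at $\phi = 0$ and $\phi = 1$), and the central entry is assumed to be minus the sum of the off-diagonal entries.

```python
import math


def stencil_2d(j, n_phi, n_lambda, L_phi=1.0, L_lambda=1.0):
    """Finite volume stencil entries at interior latitude index j (1 <= j <= n_phi).

    Assumptions (not taken from the source): constant L_phi and L_lambda,
    h_phi = 1/(n_phi + 1), h_lambda = 1/n_lambda.
    """
    h_phi = 1.0 / (n_phi + 1)
    h_lambda = 1.0 / n_lambda

    def sin_at(index):
        # sin(pi * phi) evaluated at phi = index * h_phi (index may be a half integer)
        return math.sin(math.pi * index * h_phi)

    # Neighbours in the phi-direction (towards the north and south pole)
    north = -L_phi * (h_lambda / h_phi) * sin_at(j + 0.5) / math.pi
    south = -L_phi * (h_lambda / h_phi) * sin_at(j - 0.5) / math.pi
    # Neighbours in the lambda-direction share the same coefficient
    east = west = -L_lambda * (h_phi / h_lambda) / (4.0 * math.pi * sin_at(j))
    # Central entry: minus the sum of the off-diagonal entries (zero row sum)
    centre = -(north + south + east + west)
    return {"north": north, "south": south, "east": east, "west": west, "centre": centre}


if __name__ == "__main__":
    for name, value in stencil_2d(j=1, n_phi=6, n_lambda=6).items():
        print(f"{name:>6}: {value: .6f}")
```

Evaluating the stencil for $j$ close to 1 or $n_\phi$ shows the east and west entries growing like $1/\sin(\pi\phi_j)$, which is the grid-aligned anisotropy near the poles discussed in the thesis.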

3.3 Finite Elements and a Link to Finite Volumes Using Quadrature

Although the majority of the work and computations in this thesis is based on using the finite volume discretisation, it is also necessary to use the finite element scheme (described in e.g. [19, 3, 67]) to prove some theoretical results (which we will see in Chapter 5). However, as discussed in [14], a finite element discretisation of an elliptic problem will cover most finite volume discretisations, since the two schemes can be shown to agree if suitable quadrature formulas are used. Thus, with slight modifications, the theoretical results in Chapter 5 will carry over to a finite volume discretisation of the same problem. In Section 3.3.1 the discretisation of the two-dimensional problem (3.2.12) using bilinear finite elements is described, and in Section 3.3.2 a quadrature rule is devised which links the finite volume and finite element schemes. Finally in Section 3.3.3 we discuss how the ideas are extended to three dimensions.

3.3.1 Piecewise Bilinear Finite Elements in Two Dimensions

Consider an abstract two-dimensional problem:

$$-\nabla\cdot\left(K(\xi)\nabla u(\xi)\right) = g(\xi) \quad \text{on } \Omega = \Omega_x \times \Omega_y, \qquad (3.3.1)$$

with a continuous boundary $\Gamma$, homogeneous Dirichlet boundary conditions $u = 0$ on $\Gamma$, and $\xi = (x,y)$. As shown in [67], the finite element method involves writing problem (3.3.1) in a weak form and then approximating this weak form by formulating an approximate weak form in a finite dimensional space.

To find the weak form of (3.3.1) we introduce a space $V$ of functions in $\Omega$ that vanish on the boundary. Here, the appropriate choice for $V$ is the Hilbert space $H^1_0(\Omega)$:

$$H^1_0(\Omega) = \left\{ v : \Omega \to \mathbb{R} \;:\; \int_\Omega \left(|\nabla v|^2 + |v|^2\right) < \infty \ \text{and}\ v = 0 \text{ on } \Gamma \right\}.$$

If $u$ solves (3.3.1), then it also satisfies

$$-\int_\Omega v\, \nabla\cdot(K\nabla u)\, dV = \int_\Omega g v \, dV, \qquad (3.3.2)$$

for any arbitrary function $v \in V$. Now using Green's formula [19, Chapter 0], we obtain

from (3.3.2):

$$\int_\Omega \nabla v \cdot (K\nabla u)\, dV - \underbrace{\int_\Gamma v\,(K\nabla u)\cdot n \, dS}_{=0 \text{ as } v|_\Gamma = 0} = \int_\Omega g v\, dV.$$

We now write the weak form of (3.3.1) as: Find $u \in V$ such that

$$a(u,v) = (g,v)_{L^2(\Omega)}, \quad \forall v \in V, \qquad (3.3.3)$$

where $a : V \times V \to \mathbb{R}$ is a bilinear form defined as

$$a(u,v) = \int_\Omega \nabla v \cdot (K\nabla u)\, dx\,dy,$$

and $(\cdot,\cdot)_{L^2(\Omega)}$ is the scalar product of the function space $L^2(\Omega)$, i.e.

$$(g,v)_{L^2(\Omega)} = \int_\Omega g v \, dx\,dy.$$

(Note that $dx\,dy$ will now be used to denote the 2D volume element instead of $dV$.) The solution $u \in V$ of the weak form (3.3.3) is then approximated by choosing a finite dimensional space $V_h \subset V$ and by introducing the approximate weak form: Find $u_h \in V_h$ such that

$$a(u_h,v_h) = (g,v_h)_{L^2(\Omega)}, \quad \forall v_h \in V_h. \qquad (3.3.4)$$

In finite element methods the finite dimensional space $V_h$ is constructed by decomposing $\Omega$ into a mesh of triangular or rectangular elements whose vertices are the nodes of the mesh. We denote the collection of elements as $\mathcal{T}$ and a typical element including its boundary as $\tau \in \mathcal{T}$. The set of nodes in the mesh, excluding the boundary because of the Dirichlet boundary conditions, is denoted $\mathcal{N}$. We also require a suitable basis for $V_h$. In the lowest order (bilinear) case on rectangular elements, as used here, the basis functions are (usually) associated with nodes of the grid and have support only in elements containing that node. For any $n_i \in \mathcal{N}$, denote the basis for $V_h$ as $\{\varphi_i(x,y) : n_i \in \mathcal{N}\}$. The solution $u_h \in V_h$ to the approximate weak form can be written as a linear combination of the basis functions

$$u_h = \sum_{n_j \in \mathcal{N}} U_j \varphi_j,$$

and since $a$ is a bilinear form (linear in both its arguments) and the $L^2(\Omega)$ scalar product is also linear, we get

$$\sum_{n_j \in \mathcal{N}} \underbrace{a(\varphi_j,\varphi_i)}_{A_{i,j}}\, U_j = \underbrace{(g,\varphi_i)_{L^2(\Omega)}}_{b_i} \quad \forall n_i \in \mathcal{N},$$

or equivalently

$$Au = b, \qquad (3.3.5)$$

where $A$ is known as the stiffness matrix. The system (3.3.5) is assembled via element stiffness matrices. We have

$$A_{i,j} = \int_\Omega \nabla\varphi_i \cdot (K\nabla\varphi_j)\, dx\,dy = \sum_{\tau \in \mathcal{T}} \underbrace{\int_\tau \nabla\varphi_i \cdot (K\nabla\varphi_j)\, dx\,dy}_{A^\tau_{i,j}}, \qquad (3.3.6)$$

where $A^\tau_{i,j}$ is the element stiffness matrix for element $\tau \in \mathcal{T}$. Since $\varphi_i$ has support only in elements containing node $n_i$, $A^\tau_{i,j} = 0$ unless $n_i$ and $n_j$ are both nodes of element $\tau$. Hence, $A^\tau_{i,j}$ can be stored as a $4 \times 4$ matrix with rows and columns corresponding to the four nodes of $\tau$. Depending on the entries in the coefficient matrix $K$, it may be possible to evaluate the integral over $\tau$ exactly. However, it is common to use quadrature rules to approximate the integral, and for a suitable quadrature rule, this will lead to the finite volume discretisation of (3.3.1) given above, as we will see in the following section.

3.3.2 Link to the Finite Volume Scheme

Let us now restrict our problem to a uniform rectangular mesh on the unit square, with mesh widths $h_x$ and $h_y$ in the $x$- and $y$-directions respectively. We denote the set of nodes of the mesh as $\{(x_i,y_j) : i = 1,\dots,n_x,\ j = 1,\dots,n_y\}$, where $x_i = ih_x$ and $y_j = jh_y$. The element centred at $(x_{i+\frac{1}{2}}, y_{j+\frac{1}{2}})$ is denoted $\tau_{i+\frac{1}{2},j+\frac{1}{2}}$, where $\tau_{i+\frac{1}{2},j+\frac{1}{2}} = [x_i,x_{i+1}] \times [y_j,y_{j+1}]$. If the mesh widths in the $x$- and $y$-directions are equal, then $h_x = h_y = h$. The mesh is visualized in Figure 3-5. We choose $V_h$ to be a space of continuous and piecewise bilinear functions on $\Omega$ that vanish on the boundary. We can do this by resorting

to $H^1_0(\Omega)$-conforming finite dimensional spaces built in a tensor product fashion, as done in [14]. That is, we decompose $\Omega_x$ and $\Omega_y$ into two 1D grids and introduce finite dimensional spaces $V^x_h \subset H^1_0(\Omega_x)$ and $V^y_h \subset H^1_0(\Omega_y)$ on these grids.

Figure 3-5: Finite element mesh.

We see from [67] that suitable nodal basis functions for $V^x_h$ and $V^y_h$ are the linear hat functions $\{\varphi^x_i : i = 1,\dots,n_x\}$ and $\{\varphi^y_j : j = 1,\dots,n_y\}$ respectively, defined by

$$\varphi^x_i(x) = \begin{cases} (x - x_{i-1})/h_x & x \in [x_{i-1},x_i] \\ (x_{i+1} - x)/h_x & x \in [x_i,x_{i+1}] \\ 0 & \text{elsewhere,} \end{cases} \qquad \varphi^y_j(y) = \begin{cases} (y - y_{j-1})/h_y & y \in [y_{j-1},y_j] \\ (y_{j+1} - y)/h_y & y \in [y_j,y_{j+1}] \\ 0 & \text{elsewhere.} \end{cases}$$

The $H^1_0(\Omega)$-conforming finite dimensional space $V_h$ can then be defined as $V_h = V^x_h \otimes V^y_h$, i.e.

$$V_h = \mathrm{span}\{u(x)v(y) : u \in V^x_h,\ v \in V^y_h\}.$$

The nodal basis $\{\varphi_{i,j} : i = 1,\dots,n_x,\ j = 1,\dots,n_y\}$ of $V_h$ is the set of products of any two basis functions of $V^x_h$ and $V^y_h$, i.e.

$$\varphi_{i,j}(x,y) = \begin{cases} \dfrac{(x - x_{i-1})}{h_x}\dfrac{(y - y_{j-1})}{h_y} & x \in [x_{i-1},x_i],\ y \in [y_{j-1},y_j] \\[1.5ex] \dfrac{(x - x_{i-1})}{h_x}\dfrac{(y_{j+1} - y)}{h_y} & x \in [x_{i-1},x_i],\ y \in [y_j,y_{j+1}] \\[1.5ex] \dfrac{(x_{i+1} - x)}{h_x}\dfrac{(y - y_{j-1})}{h_y} & x \in [x_i,x_{i+1}],\ y \in [y_{j-1},y_j] \\[1.5ex] \dfrac{(x_{i+1} - x)}{h_x}\dfrac{(y_{j+1} - y)}{h_y} & x \in [x_i,x_{i+1}],\ y \in [y_j,y_{j+1}] \\[1.5ex] 0 & \text{elsewhere.} \end{cases} \qquad (3.3.7)$$

We calculate the $4 \times 4$ element stiffness matrix for element $\tau_{i+\frac{1}{2},j+\frac{1}{2}}$ by calculating

$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(k,l)} = \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \nabla\varphi_{i,j} \cdot (K\nabla\varphi_{k,l})\, dx\,dy, \qquad (3.3.8)$$

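where $(x_i,y_j)$ and $(x_k,y_l)$ are two nodes of the mesh.

Figure 3-6: Element $\tau_{i+\frac{1}{2},j+\frac{1}{2}}$, with corner nodes $(x_i,y_j)$, $(x_{i+1},y_j)$, $(x_i,y_{j+1})$, $(x_{i+1},y_{j+1})$ and edge midpoints $(x_{i+\frac{1}{2}},y_j)$, $(x_{i+\frac{1}{2}},y_{j+1})$, $(x_i,y_{j+\frac{1}{2}})$, $(x_{i+1},y_{j+\frac{1}{2}})$.

(3.3.8) expands as

$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(k,l)} = \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_1(x,y)\frac{\partial\varphi_{i,j}}{\partial x}\frac{\partial\varphi_{k,l}}{\partial x}\, dx\,dy + \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_2(x,y)\frac{\partial\varphi_{i,j}}{\partial y}\frac{\partial\varphi_{k,l}}{\partial y}\, dx\,dy. \qquad (3.3.9)$$

The integrals in (3.3.9) are approximated as usual by quadrature rules. However, to find a scheme that resembles the finite volume discretisation, the two integrals have to be approximated by different quadrature rules. For the first integral in (3.3.9) we use the midpoint rule in $x$ and the trapezoidal rule in $y$, i.e.

$$\int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} f(x,y)\, dx\,dy \approx \frac{h_x h_y}{2}\left( f(x_{i+\frac{1}{2}},y_j) + f(x_{i+\frac{1}{2}},y_{j+1}) \right). \qquad (3.3.10)$$

For the second integral, we use the midpoint rule in $y$ and the trapezoidal rule in $x$, i.e.

$$\int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} f(x,y)\, dx\,dy \approx \frac{h_x h_y}{2}\left( f(x_i,y_{j+\frac{1}{2}}) + f(x_{i+1},y_{j+\frac{1}{2}}) \right). \qquad (3.3.11)$$

Both these rules are second order accurate. It is easy to verify that

$$\frac{\partial\varphi_{k,l}}{\partial x}(x_{k+\frac{1}{2}},y_l) = -\frac{1}{h_x}, \qquad \frac{\partial\varphi_{k,l}}{\partial x}(x_{k-\frac{1}{2}},y_l) = \frac{1}{h_x}, \qquad (3.3.12)$$

$$\frac{\partial\varphi_{k,l}}{\partial y}(x_k,y_{l+\frac{1}{2}}) = -\frac{1}{h_y}, \qquad \frac{\partial\varphi_{k,l}}{\partial y}(x_k,y_{l-\frac{1}{2}}) = \frac{1}{h_y}, \qquad (3.3.13)$$

and $\frac{\partial\varphi_{k,l}}{\partial x}$, $\frac{\partial\varphi_{k,l}}{\partial y}$ vanish at all other points needed in the above quadrature rules. Hence, using (3.3.10) and (3.3.11) in (3.3.9) we get for the diagonal entries in the element stiffness matrix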

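$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(i,j)} = \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_1(x,y)\left(\frac{\partial\varphi_{i,j}}{\partial x}\right)^2 dx\,dy + \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_2(x,y)\left(\frac{\partial\varphi_{i,j}}{\partial y}\right)^2 dx\,dy$$

$$\approx \frac{h_x h_y}{2}\left[ \alpha_1(x_{i+\frac{1}{2}},y_j)\left(\frac{\partial\varphi_{i,j}}{\partial x}\right)^2(x_{i+\frac{1}{2}},y_j) + \alpha_1(x_{i+\frac{1}{2}},y_{j+1})\left(\frac{\partial\varphi_{i,j}}{\partial x}\right)^2(x_{i+\frac{1}{2}},y_{j+1}) \right]$$

$$+ \frac{h_x h_y}{2}\left[ \alpha_2(x_i,y_{j+\frac{1}{2}})\left(\frac{\partial\varphi_{i,j}}{\partial y}\right)^2(x_i,y_{j+\frac{1}{2}}) + \alpha_2(x_{i+1},y_{j+\frac{1}{2}})\left(\frac{\partial\varphi_{i,j}}{\partial y}\right)^2(x_{i+1},y_{j+\frac{1}{2}}) \right],$$

and using (3.3.12) and (3.3.13) we get

$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(i,j)} = \frac{h_x h_y}{2 h_x^2}\,\alpha_1(x_{i+\frac{1}{2}},y_j) + \frac{h_x h_y}{2 h_y^2}\,\alpha_2(x_i,y_{j+\frac{1}{2}}). \qquad (3.3.14)$$

Finally, assembling the diagonal entries of the global stiffness matrix $A$ by using (3.3.6), i.e. by adding the entries of the element stiffness matrices corresponding to the four elements containing node $(x_i,y_j)$, we get

$$A_{(i,j)(i,j)} = \frac{h_y}{h_x}\left( \alpha_1(x_{i+\frac{1}{2}},y_j) + \alpha_1(x_{i-\frac{1}{2}},y_j) \right) + \frac{h_x}{h_y}\left( \alpha_2(x_i,y_{j+\frac{1}{2}}) + \alpha_2(x_i,y_{j-\frac{1}{2}}) \right).$$

The off-diagonal entries are calculated in the same way. Using the quadrature rules we have

$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(i+1,j)} = \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_1(x,y)\frac{\partial\varphi_{i,j}}{\partial x}\frac{\partial\varphi_{i+1,j}}{\partial x}\, dx\,dy + \int_{\tau_{i+\frac{1}{2},j+\frac{1}{2}}} \alpha_2(x,y)\frac{\partial\varphi_{i,j}}{\partial y}\frac{\partial\varphi_{i+1,j}}{\partial y}\, dx\,dy$$

$$\approx \frac{h_x h_y}{2}\left[ \alpha_1(x_{i+\frac{1}{2}},y_j)\left(\frac{\partial\varphi_{i,j}}{\partial x}\frac{\partial\varphi_{i+1,j}}{\partial x}\right)(x_{i+\frac{1}{2}},y_j) + \alpha_1(x_{i+\frac{1}{2}},y_{j+1})\left(\frac{\partial\varphi_{i,j}}{\partial x}\frac{\partial\varphi_{i+1,j}}{\partial x}\right)(x_{i+\frac{1}{2}},y_{j+1}) \right]$$

$$+ \frac{h_x h_y}{2}\left[ \alpha_2(x_i,y_{j+\frac{1}{2}})\left(\frac{\partial\varphi_{i,j}}{\partial y}\frac{\partial\varphi_{i+1,j}}{\partial y}\right)(x_i,y_{j+\frac{1}{2}}) + \alpha_2(x_{i+1},y_{j+\frac{1}{2}})\left(\frac{\partial\varphi_{i,j}}{\partial y}\frac{\partial\varphi_{i+1,j}}{\partial y}\right)(x_{i+1},y_{j+\frac{1}{2}}) \right],$$

which simplifies, using (3.3.12) and (3.3.13), to

$$A^{\tau_{i+\frac{1}{2},j+\frac{1}{2}}}_{(i,j)(i+1,j)} = -\frac{h_y}{2h_x}\,\alpha_1(x_{i+\frac{1}{2}},y_j).$$

Then by assembling the two elements that contain nodes $(x_i,y_j)$ and $(x_{i+1},y_j)$ we get

$$A_{(i,j)(i+1,j)} = -\frac{h_y}{h_x}\,\alpha_1(x_{i+\frac{1}{2}},y_j).$$

Similarly

$$A_{(i,j)(i-1,j)} = -\frac{h_y}{h_x}\,\alpha_1(x_{i-\frac{1}{2}},y_j), \qquad A_{(i,j)(i,j\pm1)} = -\frac{h_x}{h_y}\,\alpha_2(x_i,y_{j\pm\frac{1}{2}}),$$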

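and

$$A_{(i,j)(i-1,j-1)} = A_{(i,j)(i+1,j-1)} = A_{(i,j)(i-1,j+1)} = A_{(i,j)(i+1,j+1)} = 0.$$

All the remaining entries in $A$ will be zero because the supports of the basis functions at the two nodes do not overlap. Thus the stencil at an interior node $(x_i,y_j)$ is

$$\begin{bmatrix} & -\dfrac{h_x}{h_y}\,\alpha_2(x_i,y_{j+\frac{1}{2}}) & \\[2ex] -\dfrac{h_y}{h_x}\,\alpha_1(x_{i-\frac{1}{2}},y_j) & \Sigma & -\dfrac{h_y}{h_x}\,\alpha_1(x_{i+\frac{1}{2}},y_j) \\[2ex] & -\dfrac{h_x}{h_y}\,\alpha_2(x_i,y_{j-\frac{1}{2}}) & \end{bmatrix}, \qquad (3.3.15)$$

where the central entry $\Sigma$ is the diagonal entry $A_{(i,j)(i,j)}$, i.e. minus the sum of the four off-diagonal entries. This is equivalent to the finite volume stencil (3.2.14), with $x = \lambda$ and $y = \phi$.

Remark. Note that this stencil is not what we would have obtained by integrating the integrals in (3.3.9) exactly. For example, in the simple case $K = I$ with mesh widths $h_x = h_y = h$, the stencil (using exact integration) resulting from the finite element discretisation at some interior node $(x_i,y_j)$ is

$$\frac{1}{3}\begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix},$$

whereas using the quadrature rules (3.3.10) and (3.3.11), we obtain the finite volume stencil:

$$\begin{bmatrix} & -1 & \\ -1 & 4 & -1 \\ & -1 & \end{bmatrix}. \qquad (3.3.16)$$

3.3.3 Extension to Three Dimensions

Consider the three dimensional problem

$$-\nabla\cdot(K\nabla u) = g \quad \text{on } \Omega_{3D} = \Omega_{1D} \times \Omega_{2D}, \qquad (3.3.17)$$

with Dirichlet boundary conditions $u(\xi) = 0$ for $\xi \in \Gamma$, where $\Omega_{1D} \subset \mathbb{R}$, $\Omega_{2D} \subset \mathbb{R}^2$ and $\xi = (x,y,z)$. As in Section 3.3.1 we solve (3.3.17) by writing it in a weak form which is then approximated on a finite dimensional space. The weak form of (3.3.17) in 3D is: Find $u \in V$ such that

$$a(u,v) = (g,v)_{L^2(\Omega)} \quad \forall v \in V,$$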

where

$$a(u,v) = \int_{\Omega_{3D}} \nabla v \cdot (K\nabla u)\, dx\,dy\,dz, \qquad (g,v)_{L^2(\Omega)} = \int_{\Omega_{3D}} g v \, dx\,dy\,dz.$$

A suitable choice for $V$ is the Hilbert space $H^1_0(\Omega_{3D})$. The abstract approximation of the weak form is: Choose $V_h \subset V$ and find $u_h \in V_h$ such that

$$a(u_h,v_h) = (g,v_h)_{L^2(\Omega)} \quad \forall v_h \in V_h.$$

To find $V_h$ we first decompose $\Omega_{3D}$ into a mesh of cubic elements, where the nodes of the mesh are the vertices of the elements and denoted $\{(x_i,y_j,z_k) : i = 1,\dots,n_x,\ j = 1,\dots,n_y,\ k = 1,\dots,n_z\}$. The element centred at $(x_{i+\frac{1}{2}},y_{j+\frac{1}{2}},z_{k+\frac{1}{2}})$ is denoted $\tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}$. As in Section 3.3.1 we choose $V_h \subset H^1_0(\Omega_{3D})$ in a tensor product fashion by decomposing $\Omega_{1D}$ and $\Omega_{2D}$ into a 1D and 2D grid respectively, and introducing finite dimensional spaces $V^{1D}_h \subset H^1_0(\Omega_{1D})$ and $V^{2D}_h \subset H^1_0(\Omega_{2D})$ on these grids. The nodal basis functions of $V^{2D}_h$ are $\{\varphi_{i,j} : i = 1,\dots,n_x,\ j = 1,\dots,n_y\}$ (see (3.3.7)) and of $V^{1D}_h$ are

$$\varphi^z_k(z) = \begin{cases} (z - z_{k-1})/h_z & z \in [z_{k-1},z_k] \\ (z_{k+1} - z)/h_z & z \in [z_k,z_{k+1}] \\ 0 & \text{elsewhere.} \end{cases}$$

We then choose $V_h = V^{2D}_h \otimes V^{1D}_h$ with the nodal basis functions $\{\varphi_{i,j,k} : i = 1,\dots,n_x,\ j = 1,\dots,n_y,\ k = 1,\dots,n_z\}$ of $V_h$ as the set of products of any two basis functions of $V^{1D}_h$ and $V^{2D}_h$. As we have already seen, the solution $u_h \in V_h$ to the approximate weak form can be written as a linear combination of the basis functions of $V_h$, and this leads to a system of equations $Au = b$ assembled via the element stiffness matrices. We have

$$A_{(i,j,k)(l,m,n)} = \sum_{i=1}^{n_x}\sum_{j=1}^{n_y}\sum_{k=1}^{n_z} \underbrace{\int_{\tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}} \nabla\varphi_{i,j,k} \cdot (K\nabla\varphi_{l,m,n})\, dx\,dy\,dz}_{A^{\tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}}_{(i,j,k)(l,m,n)}}, \qquad (3.3.18)$$

where $(x_i,y_j,z_k)$ and $(x_l,y_m,z_n)$ are two nodes of the mesh and where $A^{\tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}}_{(i,j,k)(l,m,n)}$ is an $8 \times 8$ matrix with rows and columns corresponding to the eight nodes of $\tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}$.

The element stiffness matrix for $\tau = \tau_{i+\frac{1}{2},j+\frac{1}{2},k+\frac{1}{2}}$ can be written as

$$A^{\tau}_{(i,j,k)(l,m,n)} = \int_\tau \alpha_1(x,y,z)\frac{\partial\varphi_{i,j,k}}{\partial x}\frac{\partial\varphi_{l,m,n}}{\partial x}\, dx\,dy\,dz + \int_\tau \alpha_2(x,y,z)\frac{\partial\varphi_{i,j,k}}{\partial y}\frac{\partial\varphi_{l,m,n}}{\partial y}\, dx\,dy\,dz + \int_\tau \alpha_3(x,y,z)\frac{\partial\varphi_{i,j,k}}{\partial z}\frac{\partial\varphi_{l,m,n}}{\partial z}\, dx\,dy\,dz. \qquad (3.3.19)$$

The first integral is approximated by the midpoint rule in $x$ and the trapezoidal rule in $y$ and $z$, i.e.

$$\int_\tau f(x,y,z)\, dx\,dy\,dz \approx \frac{h_x h_y h_z}{4}\left[ f(x_{i+\frac{1}{2}},y_j,z_k) + f(x_{i+\frac{1}{2}},y_{j+1},z_k) + f(x_{i+\frac{1}{2}},y_j,z_{k+1}) + f(x_{i+\frac{1}{2}},y_{j+1},z_{k+1}) \right].$$

The second integral is approximated by the midpoint rule in $y$ and the trapezoidal rule in $x$ and $z$, i.e.

$$\int_\tau f(x,y,z)\, dx\,dy\,dz \approx \frac{h_x h_y h_z}{4}\left[ f(x_i,y_{j+\frac{1}{2}},z_k) + f(x_{i+1},y_{j+\frac{1}{2}},z_k) + f(x_i,y_{j+\frac{1}{2}},z_{k+1}) + f(x_{i+1},y_{j+\frac{1}{2}},z_{k+1}) \right],$$

and the third integral is approximated by the midpoint rule in $z$ and the trapezoidal rule in $x$ and $y$, i.e.

$$\int_\tau f(x,y,z)\, dx\,dy\,dz \approx \frac{h_x h_y h_z}{4}\left[ f(x_i,y_j,z_{k+\frac{1}{2}}) + f(x_i,y_{j+1},z_{k+\frac{1}{2}}) + f(x_{i+1},y_j,z_{k+\frac{1}{2}}) + f(x_{i+1},y_{j+1},z_{k+\frac{1}{2}}) \right].$$

For the diagonal entries of the element stiffness matrix, we have

$$A^{\tau}_{(i,j,k)(i,j,k)} = \frac{h_y h_z}{4h_x}\,\alpha_1(x_{i+\frac{1}{2}},y_j,z_k) + \frac{h_x h_z}{4h_y}\,\alpha_2(x_i,y_{j+\frac{1}{2}},z_k) + \frac{h_x h_y}{4h_z}\,\alpha_3(x_i,y_j,z_{k+\frac{1}{2}}).$$

Assembling the diagonal entries of the global stiffness matrix $A$ by using (3.3.18), we get

$$A_{(i,j,k)(i,j,k)} = \frac{h_y h_z}{h_x}\left( \alpha_1(x_{i+\frac{1}{2}},y_j,z_k) + \alpha_1(x_{i-\frac{1}{2}},y_j,z_k) \right) + \frac{h_x h_z}{h_y}\left( \alpha_2(x_i,y_{j+\frac{1}{2}},z_k) + \alpha_2(x_i,y_{j-\frac{1}{2}},z_k) \right) + \frac{h_x h_y}{h_z}\left( \alpha_3(x_i,y_j,z_{k-\frac{1}{2}}) + \alpha_3(x_i,y_j,z_{k+\frac{1}{2}}) \right).$$
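The three mixed quadrature rules differ only in which coordinate direction uses the midpoint rule. The following sketch (the function name `mixed_rule` and its argument layout are hypothetical, chosen for illustration) applies the rule on a single box element and checks it on integrands that it integrates exactly.

```python
from itertools import product


def mixed_rule(f, corner, h, mid_axis):
    """Midpoint rule along `mid_axis`, trapezoidal rule along the other two axes.

    f        : callable f(x, y, z)
    corner   : (x_i, y_j, z_k), the lower corner of the element
    h        : (h_x, h_y, h_z), the element widths
    mid_axis : 0, 1 or 2, the direction treated with the midpoint rule
    """
    volume = h[0] * h[1] * h[2]
    trap_axes = [axis for axis in range(3) if axis != mid_axis]
    total = 0.0
    # Four quadrature points: the two end points in each trapezoidal direction
    for offsets in product((0.0, 1.0), repeat=2):
        point = [0.0, 0.0, 0.0]
        point[mid_axis] = corner[mid_axis] + 0.5 * h[mid_axis]
        for axis, offset in zip(trap_axes, offsets):
            point[axis] = corner[axis] + offset * h[axis]
        total += f(*point)
    return volume * total / 4.0


if __name__ == "__main__":
    # f = x*y*z is linear in each variable, so all three rules are exact on it
    f = lambda x, y, z: x * y * z
    for axis in range(3):
        print(axis, mixed_rule(f, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), axis))
```

Because both the midpoint and the trapezoidal rule are exact for linear functions in one variable, each tensor-product rule is exact for integrands that are linear in every variable separately, which is consistent with the second order accuracy stated for the 2D rules.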

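Similarly we have

$$A_{(i,j,k)(i\pm1,j,k)} = -\frac{h_y h_z}{h_x}\,\alpha_1(x_{i\pm\frac{1}{2}},y_j,z_k), \quad A_{(i,j,k)(i,j\pm1,k)} = -\frac{h_x h_z}{h_y}\,\alpha_2(x_i,y_{j\pm\frac{1}{2}},z_k), \quad A_{(i,j,k)(i,j,k\pm1)} = -\frac{h_x h_y}{h_z}\,\alpha_3(x_i,y_j,z_{k\pm\frac{1}{2}}).$$

All other entries in $A$ will be zero. Then the stencil at the interior node $(x_i,y_j,z_k)$ is

$$\Big[ -H_z\alpha_3(x_i,y_j,z_{k-\frac{1}{2}}) \Big] \quad \begin{bmatrix} & -H_y\alpha_2(x_i,y_{j+\frac{1}{2}},z_k) & \\ -H_x\alpha_1(x_{i-\frac{1}{2}},y_j,z_k) & \Sigma & -H_x\alpha_1(x_{i+\frac{1}{2}},y_j,z_k) \\ & -H_y\alpha_2(x_i,y_{j-\frac{1}{2}},z_k) & \end{bmatrix} \quad \Big[ -H_z\alpha_3(x_i,y_j,z_{k+\frac{1}{2}}) \Big], \qquad (3.3.20)$$

which is the same as the finite volume stencil (3.2.8) with $x = \xi_1$, $y = \xi_2$ and $z = \xi_3$, where $H_x = \frac{h_y h_z}{h_x}$, $H_y = \frac{h_x h_z}{h_y}$ and $H_z = \frac{h_x h_y}{h_z}$, and where the central entry $\Sigma$ is the diagonal entry $A_{(i,j,k)(i,j,k)}$, i.e. minus the sum of the six off-diagonal entries.

Remark. For the case $K = I$ with mesh widths $h_x = h_y = h_z = h$, the stencil (using exact integration of (3.3.19)) resulting from the finite element discretisation at some interior node $(x_i,y_j,z_k)$ is

$$\frac{h}{12}\begin{bmatrix} -1 & -2 & -1 \\ -2 & 0 & -2 \\ -1 & -2 & -1 \end{bmatrix} \quad \frac{h}{12}\begin{bmatrix} -2 & 0 & -2 \\ 0 & 32 & 0 \\ -2 & 0 & -2 \end{bmatrix} \quad \frac{h}{12}\begin{bmatrix} -1 & -2 & -1 \\ -2 & 0 & -2 \\ -1 & -2 & -1 \end{bmatrix}, \qquad (3.3.21)$$

where the terms in the first and third brackets represent entries corresponding to neighbours on $z$-level $k-1$ and $k+1$ respectively. By using the three quadrature rules from above instead, we obtain the finite volume stencil:

$$\Big[ -h \Big] \quad \begin{bmatrix} & -h & \\ -h & 6h & -h \\ & -h & \end{bmatrix} \quad \Big[ -h \Big].$$

Remark. The tensor product, $\otimes$, of square matrices $A = (a_{i,j})_{i,j} \in \mathbb{R}^{n,n}$ and $B \in \mathbb{R}^{m,m}$ is defined by

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1}B & a_{n2}B & \cdots & a_{nn}B \end{pmatrix}. \qquad (3.3.22)$$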

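Using this definition, the 3D operator represented by the stencil (3.3.20) can be found using the 2D operator represented by stencil (3.3.15) and the finite volume discretisation of a 1D Poisson-type operator whose stencil representation is

$$\left[\; -\alpha_3^3(z_{k-\frac{1}{2}})\frac{1}{h_z} \qquad \left(\alpha_3^3(z_{k-\frac{1}{2}}) + \alpha_3^3(z_{k+\frac{1}{2}})\right)\frac{1}{h_z} \qquad -\alpha_3^3(z_{k+\frac{1}{2}})\frac{1}{h_z} \;\right].$$

Here, as for the separable coefficients in two dimensions, $\alpha_i^m$ denotes the factor of $\alpha_i$ that depends on the $m$-th coordinate. Denoting by $A_{1D}$, $A_{2D}$ and $A_{3D}$ the above one, two and three dimensional operators, respectively, we find $A_{3D}$ by

$$A_{3D} = A_{2D} \otimes h_z\,\alpha_1^3(z_k)\, I_{1D} + h_x h_y\, \alpha_3^1(x_i)\,\alpha_3^2(y_j)\, I_{2D} \otimes A_{1D},$$

where $I_{1D}$ and $I_{2D}$ are the identity matrices of the same dimension as $A_{1D}$ and $A_{2D}$, respectively.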
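For constant coefficients the tensor product construction is easy to check numerically. The sketch below is a simplified illustration, not the thesis implementation: it assumes $K = I$, equal mesh widths $h_x = h_y = h_z = h$, the same number of nodes in each direction, and an ordering of the unknowns in which the $z$-index runs fastest. All function names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def second_difference(n):
    """Tridiagonal matrix tridiag(-1, 2, -1) of size n."""
    return [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def kron(A, B):
    """Tensor (Kronecker) product as in definition (3.3.22)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def operators(n, h):
    """Return (A_1D, A_2D, A_3D) for K = I on a uniform grid with n nodes per direction."""
    T = second_difference(n)
    I1 = identity(n)
    A1 = scale(1.0 / h, T)                  # 1D finite volume operator (1/h)[-1 2 -1]
    A2 = add(kron(I1, T), kron(T, I1))      # 2D operator: 5-point stencil with centre 4
    I2 = identity(n * n)
    # A_3D = A_2D (x) h I_1D + h^2 I_2D (x) A_1D
    A3 = add(kron(A2, scale(h, I1)), scale(h * h, kron(I2, A1)))
    return A1, A2, A3


if __name__ == "__main__":
    _, _, A3 = operators(3, 0.25)
    centre = 13  # the single interior node of a 3 x 3 x 3 grid
    print(A3[centre][centre], A3[centre][centre - 1], A3[centre][centre + 3])
```

With $n = 3$ the middle row reproduces the 7-point finite volume stencil of the preceding remark: a diagonal entry $6h$ and six off-diagonal entries $-h$.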

Chapter 4

Numerical Solution of Large Sparse Systems of Linear Equations

4.1 Introduction and Model Problems

In this chapter we describe numerical methods for solving large sparse linear systems of equations arising from the discretisation of second order elliptic partial differential equations (PDEs). We will mainly follow Briggs, Henson and McCormick [20]. Consider the following boundary value problem for the Poisson-type elliptic equation in one dimension:

$$-u''(x) + cu(x) = f(x) \quad \text{on } \Omega_{1D} = (0,1), \quad c \ge 0, \qquad (4.1.1)$$
$$u(0) = u(1) = 0.$$

With $c = 0$ this is the Poisson equation. The simplicity of this equation means it can be solved analytically, but for the purpose of this chapter we solve it numerically instead. The domain $\Omega_{1D}$ is split up into $n$ subintervals which introduces the grid $T_h$ with grid points $x_i = ih$, $i = 1,2,\dots,n-1$, where $h$ is the constant width of the subintervals (mesh width). For simplicity we use a uniform mesh which is sufficient for this chapter, but the results in this chapter also extend to non-uniform (but shape regular) meshes. (Note that for a uniform mesh, finite volume and finite element discretisations lead to the same algebraic systems as finite differences, apart from a scaling by a power of $h$. We saw in Chapter 3 how they can also be linked in more general cases via the use of suitable quadrature rules.) A discrete second order finite difference approximation replaces equation (4.1.1) as

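follows:

$$\frac{-U_{i+1} + 2U_i - U_{i-1}}{h^2} + cU_i = f(x_i), \quad 1 \le i \le n-1, \qquad (4.1.2)$$
$$U_0 = U_n = 0,$$

where the solution vector $u = (U_1,U_2,\dots,U_{n-1})^T$ approximates the exact solution $u$ at the grid points. (4.1.2) is a system of $n-1$ linear equations, represented in matrix form as

$$\underbrace{\frac{1}{h^2}\begin{pmatrix} 2+ch^2 & -1 & & \\ -1 & 2+ch^2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2+ch^2 \end{pmatrix}}_{A} \underbrace{\begin{pmatrix} U_1 \\ \vdots \\ U_{n-1} \end{pmatrix}}_{u} = \underbrace{\begin{pmatrix} f_1 \\ \vdots \\ f_{n-1} \end{pmatrix}}_{b}.$$

The matrix $A$ is sparse, symmetric ($A = A^T$) and tridiagonal. Analogously, the two-dimensional version of this problem is

$$-u_{xx} - u_{yy} + cu = f(x,y) \quad \text{on } \Omega_{2D} = [0,1]^2, \quad c \ge 0,$$
$$u = 0 \quad \text{on } \Gamma_{2D},$$

where $\Gamma_{2D}$ is the boundary of $\Omega_{2D}$, and the finite difference approximation yields the following system of $N = (n_x - 1)(n_y - 1)$ linear equations

$$\frac{-U_{i-1,j} + 2U_{i,j} - U_{i+1,j}}{h_x^2} + \frac{-U_{i,j-1} + 2U_{i,j} - U_{i,j+1}}{h_y^2} + cU_{i,j} = f_{ij}, \qquad (4.1.3)$$
$$U_{i,0} = U_{i,n_y} = U_{0,j} = U_{n_x,j} = 0, \quad 1 \le i \le n_x - 1, \quad 1 \le j \le n_y - 1,$$

where $h_x = 1/n_x$ and $h_y = 1/n_y$, and the grid points $(x_i,y_j) = (ih_x, jh_y)$ belong to the two-dimensional grid shown in Figure 4-1. This system in matrix form is sparse, symmetric and block-tridiagonal:

$$\begin{pmatrix} B & \alpha I & & \\ \alpha I & B & \ddots & \\ & \ddots & \ddots & \alpha I \\ & & \alpha I & B \end{pmatrix} \begin{pmatrix} U_1 \\ \vdots \\ U_{(n_x-1)(n_y-1)} \end{pmatrix} = \begin{pmatrix} f_1 \\ \vdots \\ f_{(n_x-1)(n_y-1)} \end{pmatrix},$$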

where each diagonal block $B$ is an $(n_x-1) \times (n_x-1)$ tridiagonal matrix which looks like the matrix $A$ for the one-dimensional problem. Each off-diagonal block is a multiple, $\alpha = -\frac{1}{h_y^2}$, of the $(n_x-1) \times (n_x-1)$ identity matrix.

Figure 4-1: The two-dimensional grid on the unit square.

It is convenient to use stencils to represent discrete equations at certain points on the grid, and the stencil notation was described in Chapter 3. The stencil representation for the linear system (4.1.2) is

$$\frac{1}{h^2}\begin{bmatrix} -1 & 2+ch^2 & -1 \end{bmatrix},$$

and for (4.1.3) is

$$\frac{1}{h^2}\begin{bmatrix} & -1 & \\ -1 & 4+ch^2 & -1 \\ & -1 & \end{bmatrix},$$

assuming $h = h_x = h_y$.

The matrices produced by the discretisation of boundary value problems have desirable properties for numerical methods. Matrices that are sparse and symmetric can be exploited by certain numerical methods. Further desirable properties include diagonal dominance, where the entries $a_{ij}$ of the matrix $A$ satisfy

$$\sum_{j \ne i}^{N} |a_{ij}| \le |a_{ii}|, \quad 1 \le i \le N,$$

and positive definiteness, where $x^T A x > 0$ for any non-zero vector $x$, or equivalently (for symmetric $A$) all the eigenvalues of $A$ are real and positive. Symmetric matrices with positive diagonal elements that are strictly (or irreducibly) diagonally dominant are positive definite, and matrices that are symmetric positive definite (SPD) with non-positive off-diagonal entries are M-matrices [20].
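These properties can be verified directly for the one-dimensional model matrix. The following sketch builds the matrix of (4.1.2) as a dense list of lists and checks symmetry, diagonal dominance and the sign pattern; the function names are hypothetical and dense storage is used only to keep the example short.

```python
def model_matrix_1d(n, c=0.0):
    """Matrix of (4.1.2): (1/h^2) * tridiag(-1, 2 + c h^2, -1) with h = 1/n, size n - 1."""
    h = 1.0 / n
    size = n - 1
    A = [[0.0] * size for _ in range(size)]
    for i in range(size):
        A[i][i] = (2.0 + c * h * h) / (h * h)
        if i > 0:
            A[i][i - 1] = -1.0 / (h * h)
        if i < size - 1:
            A[i][i + 1] = -1.0 / (h * h)
    return A


def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


def is_diagonally_dominant(A):
    """Row-wise check: sum of |a_ij| over j != i does not exceed |a_ii|."""
    n = len(A)
    return all(
        sum(abs(A[i][j]) for j in range(n) if j != i) <= abs(A[i][i])
        for i in range(n)
    )


def has_nonpositive_offdiagonal(A):
    n = len(A)
    return all(A[i][j] <= 0.0 for i in range(n) for j in range(n) if i != j)


if __name__ == "__main__":
    A = model_matrix_1d(8)
    print(is_symmetric(A), is_diagonally_dominant(A), has_nonpositive_offdiagonal(A))
```

For $c = 0$ the interior rows satisfy the dominance condition with equality and only the first and last rows are strictly dominant, which is why irreducibility is needed to conclude positive definiteness.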

During the past 50 years, an enormous amount of work has been devoted to the solution of large, sparse linear systems. Existing methods fall into two main categories: direct or iterative methods. Direct methods, such as Gaussian elimination or QR factorization [74], compute the solution exactly (up to rounding errors) in a finite number of arithmetic steps. These methods are very robust but generally inefficient, and only specific problems concerning simple geometries can be solved quickly (e.g. using Fast Fourier Transforms [1]). Note, however, that there are sparse factorisations, based on a clever ordering of the unknowns, that can be highly competitive even up to moderately large problem sizes, especially in one and two dimensions, as shown in [48] (see also [9,33]). Iterative methods on the other hand begin with an initial guess to the solution which generally converges to the actual solution through a succession of simple updating steps (iterations). A convergence criterion and a tolerance tol are specified in order to determine when a sufficiently accurate solution has been found. Examples include basic iterative methods such as the Jacobi or Gauss-Seidel methods [64] and Krylov subspace methods such as the conjugate gradient (CG) method [47, 64, 74].

Two factors have to be considered for an effective solution of the system of linear equations. The first is the memory $M$ that is needed, and the second, and more important factor, is the number $Q$ of arithmetic operations (FLOPs) required to solve the system. The number of FLOPs required is directly proportional to the CPU time on a simple computer. A method is optimal if the number of FLOPs and the memory grow linearly with problem size, i.e. $Q = O(N)$ and $M = O(N)$ (see [66]). In general, both $Q$ and $M$ are too high for direct methods to be feasible. For iterative methods, the memory usage is normally optimal (i.e. $M = O(N)$) and the number of FLOPs per iteration is normally $O(N)$.
Thus we require the number of iterations relative to the stopping criteria, $I(\mathrm{tol})$, to be $O(1)$. However, $I(\mathrm{tol})$ usually depends on the condition number of the matrix, which becomes worse as the problem size grows, thus iterative methods alone are usually not optimal. One technique for achieving optimality is preconditioning. A matrix $P$ is chosen such that preconditioning systems can be solved quickly (typically $O(N)$ operations), and such that the iteration matrix $P^{-1/2}AP^{-1/2}$ is close to the identity matrix, meaning the conditioning of $P^{-1/2}AP^{-1/2}$ is significantly better than that of $A$. Preconditioning techniques can be very effective, especially for Krylov subspace methods.

Another technique for reducing the number of FLOPs is to use a multigrid technique [20, 75]. Standard iterative methods such as Jacobi or Gauss-Seidel suffer from disabling limitations as they are only effective in reducing the high frequency components of the error (i.e. they only smooth the error). Thus in order to treat the low frequency components, a coarser grid is introduced. The smoothed residual is restricted

onto the coarse grid and the same procedure, known as the coarse grid correction, is applied recursively. The system on the coarsest grid can be solved using direct or iterative methods. Multigrid techniques are optimal for most systems resulting from the discretisation of elliptic boundary value problems.

In the following sections of this chapter, both techniques for accelerating the iterative methods are applied. In Section 4.2, we analyze the basic iterative methods which, when taking into account the particular structure of the problem, can be used as an effective preconditioner to Krylov subspace methods and as an effective smoother for multigrid methods. Section 4.3 describes the Krylov subspace methods and how preconditioners are used to accelerate them. Amongst the most effective preconditioners are multigrid methods, and in Section 4.4 we describe the main components of the method required for Poisson-type problems and a variety of algorithms that can be used, with some basic theory to confirm the optimal performance of the method. Section 4.5 describes some modifications to the method for solving elliptic problems with anisotropy, and a detailed convergence theory is given in Section 4.6. Finally in Section 4.7, we describe an alternative multigrid method known as algebraic multigrid (AMG), which has been shown experimentally to be very robust for a wide class of problems and is therefore a very popular method for many industrial applications.

4.2 Basic Iterative Methods

Consider the system of linear equations

$$Au = b \qquad (4.2.1)$$

and let $\tilde{u}$ be an approximation to the exact solution $u$. We denote the algebraic error of $\tilde{u}$ as

$$e = u - \tilde{u},$$

where $e$ is a vector whose magnitude can be measured by any of the vector norms. By obtaining the error of $\tilde{u}$ we can obtain the exact solution, but since the exact solution is not known, $e$ is not accessible. Another measure of how well $\tilde{u}$ approximates $u$ is the residual, given by

$$r = b - A\tilde{u}.$$
Hence the residual is the amount by which $\tilde{u}$ fails to satisfy (4.2.1). If the system has a unique solution then $r = 0 \Leftrightarrow e = 0$ and we have

$$Ae = A(u - \tilde{u}) = b - A\tilde{u} = r.$$
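A small numerical example may help to see how the residual equation is used. The sketch below takes an arbitrary $2 \times 2$ system (chosen purely for illustration), computes the residual of a rough approximation, solves $Ae = r$ exactly with Cramer's rule, and corrects the approximation. The helper names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def solve_2x2(A, rhs):
    """Exact solve of a 2 x 2 system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x0 = (rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det
    x1 = (A[0][0] * rhs[1] - rhs[0] * A[1][0]) / det
    return [x0, x1]


def correct(A, b, u_approx):
    """Return (r, e, u_approx + e), where r = b - A u_approx and A e = r."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, u_approx))]
    e = solve_2x2(A, r)
    corrected = [ui + ei for ui, ei in zip(u_approx, e)]
    return r, e, corrected


if __name__ == "__main__":
    A = [[2.0, -1.0], [-1.0, 2.0]]
    b = [1.0, 0.0]
    r, e, u = correct(A, b, [0.5, 0.5])
    print("residual:", r)
    print("error:   ", e)
    print("solution:", u)
```

Here the residual equation is solved exactly, so the corrected vector is the exact solution. In multigrid the residual equation is only solved approximately (on a coarser grid), so the correction improves the approximation rather than completing it.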

This relationship between the error and the residual, called the residual equation, has a vital role in multigrid. By finding an approximate solution, $\tilde{e}$, of the residual equation, a new approximation is obtained for $u$ which is given by $\tilde{u} + \tilde{e}$. A second vital tool for multigrid methods is the use of basic iterative methods, and we begin by analyzing the performance of these methods on the system of equations (4.2.1).

4.2.1 Point-Wise Relaxation Schemes

Additive Splitting of the Matrix A

Basic iterative methods, or relaxation schemes, for solving (4.2.1) are based on splitting $A \in \mathbb{R}^{N\times N}$ into $B - C$ where $B$ is nonsingular. Then the system of equations (4.2.1) is equivalent to

$$Bu = Cu + b,$$

and so the iterative method is cast in the following form:

$$Bu^{(k)} = Cu^{(k-1)} + b, \qquad (4.2.2)$$

where the matrix $S = B^{-1}C$ is known as the iteration matrix. An obvious condition on (4.2.2) is that the solution of (4.2.1) is a fixed point of (4.2.2).

The Jacobi Method

One such iterative method is known as the Jacobi method, which splits $A$ into the following:

$$A = D + (L + U),$$

with $D$ the diagonal of $A$, $L$ the strictly lower triangular part of $A$ and $U$ the strictly upper triangular part of $A$, i.e.

$$D = \begin{pmatrix} a_{11} & & \\ & \ddots & \\ & & a_{NN} \end{pmatrix}, \quad L = \begin{pmatrix} 0 & & & \\ a_{21} & \ddots & & \\ \vdots & \ddots & \ddots & \\ a_{N1} & \cdots & a_{N,N-1} & 0 \end{pmatrix}, \quad U = \begin{pmatrix} 0 & a_{12} & \cdots & a_{1N} \\ & \ddots & \ddots & \vdots \\ & & \ddots & a_{N-1,N} \\ & & & 0 \end{pmatrix}.$$

Then the Jacobi method is defined by choosing

$$B = D \quad \text{and} \quad C = -(L + U).$$
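The splitting translates almost literally into code. The following sketch implements the iteration $Bu^{(k)} = Cu^{(k-1)} + b$ for the Jacobi choice $B = D$, $C = -(L+U)$ on a dense list-of-lists matrix; the function name `jacobi` and the small test system are illustrative choices, not taken from the thesis.

```python
def jacobi(A, b, u0, iterations):
    """Jacobi iteration written as the splitting B u_new = C u_old + b with B = D."""
    n = len(b)
    u = list(u0)
    for _ in range(iterations):
        # Right-hand side C u^(k-1) + b, where C = -(L + U) holds the off-diagonal part
        rhs = [
            b[i] - sum(A[i][j] * u[j] for j in range(n) if j != i)
            for i in range(n)
        ]
        # Solving with B = D is a division by the diagonal entries
        u = [rhs[i] / A[i][i] for i in range(n)]
    return u


if __name__ == "__main__":
    # Small diagonally dominant system whose exact solution is (1, 1, 1)
    A = [[4.0, -1.0, 0.0],
         [-1.0, 4.0, -1.0],
         [0.0, -1.0, 4.0]]
    b = [3.0, 2.0, 3.0]
    print(jacobi(A, b, [0.0, 0.0, 0.0], 50))
```

All components of the new iterate are computed from the old iterate only, so the update order does not matter and the sweep parallelises trivially.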

The method is well defined if and only if all the diagonal terms are non-zero. So given an initial guess $u^{(0)}$, the Jacobi method in matrix/vector form is given by

$$u^{(k)} = \underbrace{-D^{-1}(L+U)}_{S_J}\, u^{(k-1)} + D^{-1}b,$$

where $S_J$ is the Jacobi iteration matrix. We rewrite this as

$$u^{(k)} = u^{(k-1)} + D^{-1}(b - Au^{(k-1)}), \qquad (4.2.3)$$

which can be written in component form as follows:

$$u^{(k)}_i = u^{(k-1)}_i + a_{ii}^{-1}\left( b_i - \sum_{j=1}^{N} a_{ij}u^{(k-1)}_j \right), \quad i = 1,\dots,N.$$

Since the method updates the approximate solution one component at a time, methods of this type are known as point-wise relaxation schemes. Using a weighting factor, $\omega$, we obtain the weighted Jacobi method

$$u^{(k)} = \underbrace{[(1-\omega)I + \omega S_J]}_{S_{J,\omega}}\, u^{(k-1)} + \omega D^{-1}b,$$

with $S_{J,\omega}$ being the weighted Jacobi iteration matrix. We will see that for an appropriate choice of $\omega$, the weighted Jacobi method has better smoothing properties.

Gauss-Seidel Method

The (forward) Gauss-Seidel method is defined similarly, with the same splitting $A = D + L + U$, but choosing

$$B = D + L \quad \text{and} \quad C = -U.$$

Hence we have, after some algebraic manipulations,

$$u^{(k)} = u^{(k-1)} + D^{-1}(b - Lu^{(k)} - Uu^{(k-1)} - Du^{(k-1)}), \qquad (4.2.4)$$

or equivalently in component form,

$$u^{(k)}_i = u^{(k-1)}_i + a_{ii}^{-1}\left( b_i - \sum_{j=1}^{i-1} a_{ij}u^{(k)}_j - \sum_{j=i}^{N} a_{ij}u^{(k-1)}_j \right), \quad i = 1,\dots,N.$$
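The component forms of the weighted Jacobi and Gauss-Seidel methods differ only in whether the sweep reads old or already updated values. The sketch below shows one sweep of each on a dense matrix; the function names and the test system are hypothetical and chosen for illustration.

```python
def weighted_jacobi_sweep(A, b, u, omega):
    """One weighted Jacobi sweep: u_new = u + omega * D^{-1} (b - A u)."""
    n = len(b)
    # The residual is computed entirely from the old iterate
    residual = [b[i] - sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
    return [u[i] + omega * residual[i] / A[i][i] for i in range(n)]


def gauss_seidel_sweep(A, b, u):
    """One forward Gauss-Seidel sweep in the given ordering of the unknowns."""
    n = len(b)
    u = list(u)
    for i in range(n):
        # Entries u[0..i-1] already hold new values, u[i..n-1] still hold old values
        residual_i = b[i] - sum(A[i][j] * u[j] for j in range(n))
        u[i] += residual_i / A[i][i]
    return u


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0],
         [-1.0, 4.0, -1.0],
         [0.0, -1.0, 4.0]]
    b = [3.0, 2.0, 3.0]  # exact solution (1, 1, 1)
    u_jacobi = [0.0, 0.0, 0.0]
    u_gs = [0.0, 0.0, 0.0]
    for _ in range(10):
        u_jacobi = weighted_jacobi_sweep(A, b, u_jacobi, omega=2.0 / 3.0)
        u_gs = gauss_seidel_sweep(A, b, u_gs)
    print("weighted Jacobi:", u_jacobi)
    print("Gauss-Seidel:   ", u_gs)
```

Setting `omega=1.0` recovers the plain Jacobi sweep (4.2.3). The Gauss-Seidel sweep overwrites the vector in place, so it needs no second copy of the iterate but depends on the ordering of the unknowns.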

Unlike the Jacobi method, the approximate solution is updated immediately after each component is determined, which typically results in faster convergence. Here, the ordering of the components is important, and we use a lexicographical ordering of the grid points. Algorithm 4.1 shows the implementation of the Gauss-Seidel method.

Algorithm 4.1 The Gauss-Seidel Method: gs(A, u, b)
    Choose $u^{(0)}$
    for $k = 1, 2, \dots$, until convergence
        for $i = 1, \dots, N$
            $u_i^{(k)} = u_i^{(k-1)} + a_{ii}^{-1}\bigl(b_i - \sum_{j=1}^{i-1} a_{ij} u_j^{(k)} - \sum_{j=i}^{N} a_{ij} u_j^{(k-1)}\bigr)$
        end for
    end for

4.2.2 Block Relaxation Schemes

A simple generalization of the above methods is known as block relaxation. These methods update an entire set of components simultaneously, typically a subvector corresponding to an entire line of grid points. Thus we partition the vectors $u$ and $b$ into subvectors and ensure that these are compatible with the partitioning of $A$. Suppose each subvector corresponds to an entire x-line on the grid; then the partitioning is as follows:
\[
A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n_y} \\ A_{21} & A_{22} & \cdots & A_{2n_y} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n_y1} & A_{n_y2} & \cdots & A_{n_yn_y} \end{pmatrix}, \quad
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n_y} \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_{n_y} \end{pmatrix}.
\]
Similarly to the point-wise case, the splitting of the matrix $A$ is given by
\[
A = D + L + U,
\]
with
\[
D = \begin{pmatrix} A_{11} & & \\ & \ddots & \\ & & A_{n_yn_y} \end{pmatrix}, \quad
L = \begin{pmatrix} 0 & & & \\ A_{21} & 0 & & \\ \vdots & \ddots & \ddots & \\ A_{n_y1} & \cdots & A_{n_y,n_y-1} & 0 \end{pmatrix}, \quad
U = \begin{pmatrix} 0 & A_{12} & \cdots & A_{1n_y} \\ & \ddots & \ddots & \vdots \\ & & 0 & A_{n_y-1,n_y} \\ & & & 0 \end{pmatrix}.
\]
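To make the x-line partitioning concrete, the sketch below performs block Gauss-Seidel sweeps for a model problem in which every diagonal block is tridiagonal. Everything specific here is an assumption made for illustration and is not taken from the thesis: the 5-point stencil (4 on the diagonal, $-1$ for each of the four neighbours), the grid size, the function names, and the use of a standard tridiagonal elimination without pivoting to apply $A_{ii}^{-1}$.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    lower[i] multiplies x[i-1] (lower[0] is unused) and upper[i] multiplies
    x[i+1] (upper[-1] is unused). No pivoting: assumes e.g. diagonal dominance.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def stencil_row(u, j):
    """Return (A u) restricted to x-line j; u[j][i] is the value at x-index i."""
    ny, nx = len(u), len(u[0])
    row = []
    for i in range(nx):
        au = 4.0 * u[j][i]
        if i > 0:
            au -= u[j][i - 1]
        if i < nx - 1:
            au -= u[j][i + 1]
        if j > 0:
            au -= u[j - 1][i]
        if j < ny - 1:
            au -= u[j + 1][i]
        row.append(au)
    return row


def line_gauss_seidel_sweep(u, b):
    """One block Gauss-Seidel sweep over the x-lines, updating u in place."""
    ny, nx = len(u), len(u[0])
    lower, diag, upper = [-1.0] * nx, [4.0] * nx, [-1.0] * nx  # block A_jj
    for j in range(ny):
        # w_j = b_j - (A u)_j, where lines before j already hold new values.
        w = [bi - ai for bi, ai in zip(b[j], stencil_row(u, j))]
        v = thomas_solve(lower, diag, upper, w)  # v_j = A_jj^{-1} w_j
        u[j] = [uj + vj for uj, vj in zip(u[j], v)]
    return u


# Illustrative 6 x 6 problem with a manufactured exact solution.
nx = ny = 6
u_exact = [[float((i + 1) * (j + 2) % 5) for i in range(nx)] for j in range(ny)]
b = [stencil_row(u_exact, j) for j in range(ny)]
u = [[0.0] * nx for _ in range(ny)]
for _ in range(100):
    line_gauss_seidel_sweep(u, b)
err = max(abs(u[j][i] - u_exact[j][i]) for j in range(ny) for i in range(nx))
```

The cost of each line solve is linear in the number of points on the line, which is why a whole line can be updated at once without changing the overall order of work per sweep.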

The methods in matrix/vector form are the same as before, (4.2.3) for the block Jacobi method and (4.2.4) for the block Gauss-Seidel method. In component form the block Gauss-Seidel method is
\[
u_i^{(k)} = u_i^{(k-1)} + A_{ii}^{-1}\Bigl(b_i - \sum_{j=1}^{i-1} A_{ij} u_j^{(k)} - \sum_{j=i}^{n_y} A_{ij} u_j^{(k-1)}\Bigr), \qquad i = 1,\dots,n_y.
\]
The block relaxation methods therefore require solving a system of equations $A_{ii} v_i = w_i$ for $i = 1,\dots,n_y$, each of which corresponds to an entire line of grid points. These are normally tridiagonal systems which can be solved very efficiently using the Thomas algorithm (also known as the tridiagonal matrix algorithm) [50, Chapter 8]. We see how the block relaxation schemes are implemented in Algorithm 4.2.

Algorithm 4.2 The Block Gauss-Seidel Method: blockgs(A, u, b)
    Choose $u^{(0)}$
    for $k = 1, 2, \dots$, until convergence
        for $i = 1, \dots, n_y$
            $w_i = b_i - \sum_{j=1}^{i-1} A_{ij} u_j^{(k)} - \sum_{j=i}^{n_y} A_{ij} u_j^{(k-1)}$
            Solve $A_{ii} v_i = w_i$ for $v_i$
            $u_i^{(k)} = u_i^{(k-1)} + v_i$
        end for
    end for

4.2.3 Error Analysis

For any of the above iterative methods, we have that
\[
u^{(k)} = S u^{(k-1)} + g \quad \text{and} \quad u = S u + g,
\]
for the iteration matrix $S$. Thus the algebraic error, $e^{(k)}$, after $k$ iterations, is
\[
e^{(k)} = S e^{(k-1)}, \tag{4.2.5}
\]
and it follows by induction that $e^{(k)} = S^k e^{(0)}$, and so taking norms on each side we have
\[
\|e^{(k)}\| \le \|S\|^k \, \|e^{(0)}\|,
\]

where $\|\cdot\|$ denotes any vector norm and its induced matrix norm. This implies that the methods are convergent if $\|S\| < 1$, for any initial guess $u^{(0)}$.

Now consider the weighted Jacobi method applied to the one-dimensional Poisson's equation (the one-dimensional problem is used for simplicity):
\[
-u''(x) = f(x), \quad 0 < x < 1, \qquad u(0) = u(1) = 0, \tag{4.2.6}
\]
which is discretised using the finite difference method with uniform mesh width $h$ and $n$ subintervals to form a discrete operator $A \in \mathbb{R}^{(n-1) \times (n-1)}$. Recalling that $S_{J,\omega} = (1-\omega) I + \omega S_J$ and $S_J = -D^{-1}(L+U)$, we can obtain $S_{J,\omega}$ in terms of $A$ as follows:
\[
S_{J,\omega} = I - \frac{h^2}{2}\,\omega A.
\]
It is known that the eigenvalues and eigenvectors of $A$ are
\[
\lambda_i(A) = \frac{4}{h^2} \sin^2\Bigl(\frac{i\pi}{2n}\Bigr), \qquad v_j^{(i)} = \sin\Bigl(\frac{ij\pi}{n}\Bigr), \qquad i, j = 1,\dots,n-1,
\]
respectively, where $i$ is known as the wavenumber. Therefore the eigenvalues of $S_{J,\omega}$ are
\[
\lambda_i(S_{J,\omega}) = 1 - \frac{h^2}{2}\,\omega \lambda_i(A) = 1 - 2\omega \sin^2\Bigl(\frac{i\pi}{2n}\Bigr), \qquad i = 1,\dots,n-1, \tag{4.2.7}
\]
and the eigenvectors of $S_{J,\omega}$ are the same as those of $A$. Now looking at the eigenvectors of $A$, observe that higher values of $i$ correspond to more highly oscillatory sine waves while lower values of $i$ produce longer smooth waves.

It is possible to expand an arbitrary vector in terms of the eigenvectors because the eigenvectors span $\mathbb{R}^{n-1}$. Thus we can write
\[
e^{(0)} = \sum_{i=1}^{n-1} c_i v^{(i)},
\]
where $e^{(0)}$ is the initial error, and since $m$ sweeps of the iteration yield $e^{(m)} = S_{J,\omega}^m e^{(0)}$, the error after $m$ iterations becomes
\[
e^{(m)} = \sum_{i=1}^{n-1} c_i S_{J,\omega}^m v^{(i)} = \sum_{i=1}^{n-1} c_i \lambda_i^m(S_{J,\omega})\, v^{(i)}, \tag{4.2.8}
\]
because the eigenvectors of $A$ and $S_{J,\omega}$ are equal. Thus $|\lambda_i^m(S_{J,\omega})|$ must be as small as possible in order to produce the best convergence rate. From (4.2.7), we establish that


\[
|\lambda_i^m(S_{J,\omega})| < 1 \quad \text{for all } 0 < \omega \le 1.
\]
Now for small values of $i$,
\[
\lambda_i(S_{J,\omega}) = 1 - 2\omega \sin^2\Bigl(\frac{h i \pi}{2}\Bigr) \approx 1 - \frac{\omega h^2 i^2 \pi^2}{2} \approx 1,
\]
because $h \ll 1$, and so $h^2 i^2 \pi^2 \approx 0$. Thus for small $i$, $\lambda_i$ will always be close to 1, meaning that the smooth components of the error cannot be damped very well.

For higher values of $i$ (we arbitrarily set these to be $n/2 \le i \le n-1$), i.e. for the oscillatory components of the error, the value of $\omega$ which provides the best damping is found by imposing the condition $\lambda_{n/2} = -\lambda_n$. From (4.2.7), $\lambda_{n/2} = 1 - \omega$ and $\lambda_n = 1 - 2\omega$, so the condition reads $1 - \omega = 2\omega - 1$, and solving this for $\omega$ gives an optimal value of $\omega = 2/3$. In this case $|\lambda_k| \le 1/3$ for all $n/2 \le k \le n-1$, so all the oscillatory components of the error are reduced to at most one third of their size in each iteration.

We see that the weighted Jacobi method damps the oscillatory components of the error very quickly, within a couple of iterations, but the smooth components slowly. It is also possible to show (though not necessarily very easily) that other iterative schemes, such as Gauss-Seidel, possess the same smoothing property, and Figure 4-2 demonstrates this in effect for a two-dimensional problem. Hence these methods are known as smoothers or relaxation methods, and they play a vital role within multigrid methods.

Figure 4-2: The error of a random initial guess after zero iterations (top left), two iterations (top right), four iterations (bottom left) and six iterations (bottom right) of the Gauss-Seidel method. The oscillatory components are removed after very few iterations, but the smooth components are damped very slowly.
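The smoothing behaviour described above can be checked numerically straight from the eigenvalue formula (4.2.7). The following Python sketch is an illustration only: the function names, the choice $n = 64$, and the particular values of $\omega$ compared are assumptions made for the example. It evaluates the largest $|\lambda_i(S_{J,\omega})|$ over the oscillatory range $n/2 \le i \le n-1$ and contrasts it with the eigenvalue of the smoothest mode, $i = 1$.

```python
import math


def wj_eigenvalue(i, n, omega):
    """Eigenvalue of the weighted Jacobi iteration matrix for wavenumber i."""
    return 1.0 - 2.0 * omega * math.sin(i * math.pi / (2 * n)) ** 2


def smoothing_factor(n, omega):
    """Largest |lambda_i| over the oscillatory modes n/2 <= i <= n - 1."""
    return max(abs(wj_eigenvalue(i, n, omega)) for i in range(n // 2, n))


n = 64  # number of subintervals (illustrative)
for omega in (1.0, 2.0 / 3.0, 0.5):
    print(
        "omega = %.4f: oscillatory damping factor = %.4f, smoothest mode = %.6f"
        % (omega, smoothing_factor(n, omega), wj_eigenvalue(1, n, omega))
    )
```

With $\omega = 2/3$ the worst oscillatory mode is the one at $i = n/2$, where the factor equals $1/3$, whereas the eigenvalue for $i = 1$ stays very close to 1 for every choice of $\omega$. With $\omega = 1$ the highest wavenumbers are barely damped at all, which is why the unweighted Jacobi method is a poor smoother.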
