Finite Element Solver for Flux-Source Equations


Finite Element Solver for Flux-Source Equations
Weston B. Lowrie
A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Aeronautics and Astronautics
University of Washington, 2008
Program Authorized to Offer Degree: Aeronautics and Astronautics

University of Washington Graduate School
This is to certify that I have examined this copy of a master's thesis by Weston B. Lowrie and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.
Committee Members: Uri Shumlak, Thomas Jarboe
Date:

In presenting this thesis in partial fulfillment of the requirements for a master's degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this thesis is allowable only for scholarly purposes, consistent with fair use as prescribed in the U.S. Copyright Law. Any other reproduction for any purpose or by any means shall not be allowed without my written permission.
Signature:
Date:

University of Washington
Abstract
Finite Element Solver for Flux-Source Equations
Weston B. Lowrie
Chair of the Supervisory Committee: Professor Uri Shumlak, Aeronautics and Astronautics

An implicit finite element solver is being developed. The solver uses the flux-source equation form so that many equation sets can be implemented easily. Keeping the specification of the physics separate from the numerics also simplifies the discretization of the finite element method. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used for parallel matrix solvers and parallel data structures. The motivation behind the development is to have a general solver that can handle many equation sets, run on large parallel machines, and eventually be expandable to multiple dimensions. The development of the 1D solver and results for several test case solutions to the pseudo-1D Euler equations are discussed. Accuracy, convergence, and computational timing studies of the method are also described.

TABLE OF CONTENTS

List of Figures
List of Tables
Chapter 1: Introduction
    Motivation
Chapter 2: Finite Element Method
    Flux-Source Equation Form
    Galerkin's Method - Weak Form of Equations
    Nodal Basis Function using Lagrange Polynomials
    Modal Basis Functions using Jacobi Polynomials
    Basis Function Amplitudes
    Gaussian Quadrature
Chapter 3: Solver Formulation
    General Equation Form
    Spatial Discretization
    Mass and Stiffness Matrices Construction
    Nonlinear Solver with Implicit Time Advance
    Jacobians with Respect to Basis Functions
Chapter 4: General Boundary Conditions
    Natural Boundary Condition
    Specifying Different Boundary Equations
    Summary of Boundary Conditions
Chapter 5: Artificial Dissipation
Chapter 6: PETSc Parallelization and Solvers
    Vectors and Matrices
    Scalable Linear Equations Solvers (KSP)
    Scalable Nonlinear Equations Solvers (SNES)
    SuperLU Direct Solver
Chapter 7: Pseudo-1D Euler Equations
    Diverging Nozzle Setup
    Boundary Condition Considerations
    Supersonic Inflow and Outflow in a Diverging Nozzle
    Supersonic Inflow and Subsonic Outflow in a Diverging Nozzle
    Subsonic Inflow and Outflow in a Diverging Nozzle
    Euler Shock Tube
Chapter 8: Accuracy, Convergence, and Timing Studies
    Varying Polynomial Order
    Varying Timestep
    Errors with Large Timesteps
    Computational Timing
Chapter 9: Future Developments and Plans
    Incorporate Quadrilateral/Hexahedral Structured Grid Generator
    Extend Algorithm to Three Dimensions
Chapter 10: Conclusions
    Flux-Source Form
    Nodal versus Modal Basis Functions
    PETSc Data Structures
    Implicit Time Advance
Bibliography
Appendix A: 1D Finite Element Equation Solver Manual
    A.1 Introduction
    A.2 Compiling with PETSc libraries
    A.3 Running the Code
    A.4 Algorithm Outline
    A.5 Data Structures
    A.6 Physics and Equation Specification Module
Appendix B: Source Code

LIST OF FIGURES

Figure 2.1: Sixth order nodal (Lagrange) polynomials on the domain $x \in [-1, 1]$. Each node has a corresponding polynomial with a value of one at the node. At all other nodes, the value of the same polynomial is zero.

Figure 2.2: A three element system with second order nodal (Lagrange) polynomials. Each node has a corresponding polynomial, and the nodes that share element boundaries have polynomials that span both elements. For instance $\alpha_3^1$ and $\alpha_1^2$ from the first and second element provide the $C^0$ continuity through the shared node.

Figure 2.3: Modal (Jacobi) polynomials $J_n^{(1,1)}$ with highest order of 7 on the domain $x \in [-1, 1]$. All polynomials are defined within the domain and go to zero at the domain boundaries, with the exception of the linear polynomials. The two linear polynomials range from zero on one boundary to one at the other boundary. These linear polynomials provide the continuity between elements.

Figure 2.4: A three element system with second order modal (Jacobi) polynomials $\alpha_n$. Each polynomial is not associated with any particular node, but defined at all points. The linear polynomials provide the $C^0$ continuity by spanning across element boundaries. Quadrature points are distributed evenly in this case and include the element boundaries, but they could also be defined only on the interior parts of the elements.

Figure 7.1: Nozzle used in solving the pseudo-1D Euler equations. Dashed box indicates the diverging section of the nozzle that is used in the simulations. The subscript c indicates the chamber, t represents the nozzle throat, i the inflow (for the computational domain), e the exit (outflow), and the shock subscript indicates a possible shock location when supersonic inflow and subsonic outflow conditions exist. The analytic cross sectional area function A indicates the modeled section geometry.

Figure 7.2: Supersonic inflow and outflow in a nozzle after reaching a steady state (t = 20). Plots of pressure p, density ρ, velocity u, and energy e. The dashed line represents the initial condition, while the solid line represents the solution at t = 20. Each variable is normalized to freestream values: pressure and energy by $\rho a^2$, density by $\rho$, and velocity by $a$. Time is normalized to a characteristic time $\tau = L/a$, where L is the physical length of the domain, and length to the characteristic length $a\tau$.

Figure 7.3: Supersonic inflow and outflow after reaching a steady state (t = 10) with over specified boundary conditions. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t = 10. A dissipation of ε = 5e-2 was used to resolve the boundary layer/shock. Each variable is normalized to freestream values: pressure and energy by $\rho a^2$, density by $\rho$, and velocity by $a$. Time is normalized to a characteristic time $\tau = L/a$, where L is the physical length of the domain, and length to the characteristic length $a\tau$.

Figure 7.4: Supersonic inflow and subsonic outflow after reaching a steady state (t = 50) for $p_e$ = 1.20 (blue), 1.30 (red), 1.40 (green), and 1.50 (magenta). Each variable is normalized to freestream values: pressure and energy by $\rho a^2$, density by $\rho$, and velocity by $a$. Time is normalized to a characteristic time $\tau = L/a$, where L is the physical length of the domain, and length to the characteristic length $a\tau$. A dissipation factor of ε = [...] was used to resolve the shocks and control the dispersion.

Figure 7.5: Subsonic inflow and outflow conditions after reaching a steady state for pressure, density, velocity, and energy after t = 300. Each variable is normalized to freestream values: pressure and energy by $\rho a^2$, density by $\rho$, and velocity by $a$. Time is normalized to a characteristic time $\tau = L/a$, where L is the physical length of the domain, and length to the characteristic length $a\tau$.

Figure 7.6: Euler shock tube result after t = 1.5. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t = 1.5. Each variable is normalized to freestream values: pressure and energy by $\rho a^2$, density by $\rho$, and velocity by $a$. Time is normalized to a characteristic time $\tau = L/a$, where L is the physical length of the domain, and length to the characteristic length $a\tau$. A dissipation factor of ε = [...] was used to resolve the shocks and control dispersion.

Figure 8.1: Nozzle convergence for varying polynomial order. The L2 norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Polynomial degrees of 2, 3, 4, 5, 6, 7, and 8 are shown.

Figure 8.2: Nozzle convergence for varying time step sizes Δt. The L2 norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Several different time step sizes are shown: Δt = 0.25, 0.333, 0.50, 0.667, 1.0, 1.333, 1.667, and [...].

Figure 8.3: Relative deviation from a steady state solution due to increase in time step size. The deviation measured is an infinity norm of the difference between the steady state solution and peak error due to oscillations. Figure 8.4 shows an example oscillatory error that results from large time steps. Five different spatial resolutions are compared, $N_e$ = 30, 40, 50, 100, and [...].

Figure 8.4: Velocity deviation from supersonic inflow and outflow steady state solution due to large time steps. The initial condition is the flat dashed line, the curved dashed line is the steady state solution, and the solid line is the erroneous solution due to large time steps.

Figure 8.5: Matrix structure for a 10 element system with 4th order polynomials (left) and 5 element system with 7th order polynomials (right). Both have 31 degrees of freedom.

Figure 9.1: A circle geometry showing the partitions (a) and after a structured quadrilateral mesh on each piece (b).

Figure 9.2: A cylinder geometry showing the partitions (a) and after a structured hexahedral mesh on each piece (b).

Figure 9.3: A cylinder geometry with cutaway showing the partitions (a) and after a structured hexahedral mesh on each piece (b).

Figure 9.4: A HIT like geometry showing the partitions (a) and after a structured hexahedral mesh on each partition (b).

LIST OF TABLES

Table 4.1: Function $R_b$ and Jacobian of $R_b$ equations for both Neumann and Dirichlet boundary conditions applied to a primary variable and a non-primary variable. This table applies to both nodal and modal basis functions, where all but one basis function are zero at the boundary.

Table 7.1: Inflow and outflow boundary condition requirements for the pseudo-1D Euler equations [8]. Characteristics are the eigenvalues for the pseudo-1D Euler system of equations, where u is the bulk fluid velocity, and a is the sound speed in the fluid. The (+) indicates a right moving characteristic and the (−) indicates a left moving characteristic.

Table 7.2: Numerical versus analytical shock location in nozzle for several inflow/outflow pressure ratios.

Table 8.1: Average Newton and average linear solver (GMRES) iteration counts for varying time steps after an equal number of time steps (100). Average Newton iterations are per time step, and the linear iterations are also averaged per time step.

Table 8.2: CPU timing and average Newton and linear (GMRES) iteration counts for varying spatial resolution with polynomial order 4. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

Table 8.3: CPU timing and average Newton and linear (GMRES) iteration counts for varying polynomial degree. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

Table 8.4: CPU timing, average CPU time per time step, and average Newton and linear (GMRES) iteration counts for varying time step with parameters: poly = 6, Θ = 0.50, $N_e$ = 50, $t_{final}$ = 10.0, ε = [...]. Δt is the time step size, and $N_t$ is the total number of time steps. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

Table 8.5: Various linear solver types included with the PETSc libraries, and the SuperLU direct solver, with their descriptions and parameters used for the runs in Table 8.6.

Table 8.6: CPU timing and iteration results for different iterative linear solver methods described in Table 8.5.

Chapter 1

INTRODUCTION

A finite element solver is being developed to solve equations in the flux-source form. This form allows physics equations of many types and levels of complexity to be solved with relatively little editing of the code. The finite element method is chosen for its ability to solve systems of equations with smooth solutions effectively on arbitrarily defined geometries. The solver takes advantage of the Portable, Extensible Toolkit for Scientific Computation (PETSc) libraries for parallel data structure and solver management. It makes use of both the linear and nonlinear solvers built into PETSc as well as the interface to the SuperLU direct solver for large sparse matrices. Using these optimized solver libraries enables the code to scale to large machines without major rewrites.

1.1 Motivation

The motivation behind developing a one dimensional code of this type is to prepare for developing a 3D, fully implicit, parallel, finite/spectral element code that can solve the extended magnetohydrodynamic (MHD) equations and other plasma systems, such as the two-fluid equations, on general body-fitted grids. This is a large and complicated undertaking, and a one dimensional code can greatly simplify algorithm development and ease the transition to three dimensions. The pseudo-1D Euler equations are implemented in this formulation because they are a relatively simple equation set that can be posed in the flux-source equation form. This equation set is complex enough that a solver can be developed and tested against it, yet simple enough that it will not impede development.

Chapter 2

FINITE ELEMENT METHOD

The finite element method is a robust method for solving partial differential equations on complex geometries. The method splits a large problem into many small elements and solves each piece simultaneously. Each element makes up a piecewise continuous solution of the larger problem. Within each element the solution is represented by basis (interpolation) functions that determine the solution in the interior of the element. With careful selection of basis functions the solution can be guaranteed continuous on the element boundaries. To take advantage of the piecewise representation, the PDE must be posed not in the differential form but in the integral (weak) form. This form gives an approximate solution to the problem over any specified range, and therefore can be broken into elements.

2.1 Flux-Source Equation Form

The flux-source equation form is used for its convenience. Many equation sets can be represented in this form, and thus a solver can be formulated that generally solves this equation type. The form is also known as divergence form, and is written

\[
\frac{\partial q}{\partial t} + \nabla \cdot f = s \tag{2.1}
\]

where $q$ is a vector of primary variables and $f$ and $s$ are the fluxes and sources associated with each of the primary variables.

2.2 Galerkin's Method - Weak Form of Equations

Galerkin's method converts a continuous PDE to a discrete problem by formulating the equation in the weak form. The weak form is constructed by multiplying the equation by a test function and integrating over the problem domain. For the Galerkin formulation the test function is taken to be the same interpolating (basis) function that is used to represent each variable.
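To make the flux-source form of Eqn. 2.1 concrete before it is discretized, the following minimal sketch shows how a physics set could be supplied as nothing more than a flux function and a source function. It is an illustration only: the `FluxSourceSystem` container and the function names are hypothetical, and the equation set is the ordinary constant-area 1D Euler system for an ideal gas with an assumed $\gamma = 1.4$, not the pseudo-1D system used later in the thesis.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]

GAMMA = 1.4  # assumed ratio of specific heats (ideal gas)


@dataclass
class FluxSourceSystem:
    """Hypothetical container: the physics is just flux(q) and source(q)."""
    flux: Callable[[Vector], Vector]
    source: Callable[[Vector], Vector]


def euler_flux(q: Vector) -> Vector:
    """Flux f(q) for q = [rho, rho*u, e], with e the total energy density."""
    rho, mom, e = q
    u = mom / rho
    p = (GAMMA - 1.0) * (e - 0.5 * rho * u * u)  # ideal-gas pressure
    return [mom, mom * u + p, (e + p) * u]


def euler_source(q: Vector) -> Vector:
    """Source s(q); zero for the constant-area Euler equations."""
    return [0.0, 0.0, 0.0]


euler_1d = FluxSourceSystem(flux=euler_flux, source=euler_source)

# Gas at rest: only the pressure contributes to the momentum flux.
f_rest = euler_1d.flux([1.0, 0.0, 2.5])
# Moving gas with rho = 1, u = 1, e = 3 (so p = 1).
f_moving = euler_1d.flux([1.0, 1.0, 3.0])
```

A solver written against such an interface never needs to know which equations it is advancing, which is the separation of physics and discretization that the flux-source form is meant to provide.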

Using the Galerkin discretization, the PDE is converted to the weak form

\[
\int_\Omega \alpha_i \frac{\partial q}{\partial t}\, d\mathbf{x} + \int_\Omega \alpha_i \nabla \cdot f\, d\mathbf{x} = \int_\Omega \alpha_i s\, d\mathbf{x} \tag{2.2}
\]

where the $\alpha_i$ represent interpolating basis functions and $\Omega$ is the domain that the basis functions span. Additionally, the variables are expanded in terms of the basis functions and the amplitudes $\hat{q}_i$ of the basis functions,

\[
q = \sum_{i=1}^{n} \alpha_i(x)\, \hat{q}_i \tag{2.3}
\]

2.3 Nodal Basis Function using Lagrange Polynomials

A nodal basis function set has one particular function associated with each node. All other functions are zero at this node. The degree of the polynomial thus determines the number of nodes required in the system. A common nodal basis set is the Lagrange polynomials, which are used for the $C^0$ continuity they provide and for their simplicity. They are defined as a set of polynomials of degree $(n-1)$ built on $n$ points. They have the form

\[
\alpha(x) = \sum_{j=1}^{n} \alpha_j(x) \tag{2.4}
\]

where

\[
\alpha_j(x) = \prod_{k=1,\, k \neq j}^{n} \frac{x - x_k}{x_j - x_k} \tag{2.5}
\]

Written out in full for values $y_j$ given at the nodes $x_j$, the interpolant that passes through all $n$ points weights each basis function by its nodal value [3]:

\[
y(x) = \frac{(x - x_2)(x - x_3) \cdots (x - x_n)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)}\, y_1 + \frac{(x - x_1)(x - x_3) \cdots (x - x_n)}{(x_2 - x_1)(x_2 - x_3) \cdots (x_2 - x_n)}\, y_2 + \cdots + \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_n - x_1)(x_n - x_2) \cdots (x_n - x_{n-1})}\, y_n \tag{2.6}
\]

Figure 2.1 shows sixth order Lagrange polynomials and Figure 2.2 shows a second-order, three-element system. Notice that the basis functions associated with element boundaries provide the continuity. A special property of the Lagrange polynomials is that a basis function has a value of one at its corresponding node and zero at every other node. This property is useful because it makes the basis function amplitude the same as the primary variable value at that node.

Figure 2.1: Sixth order nodal (Lagrange) polynomials on the domain $x \in [-1, 1]$. Each node has a corresponding polynomial with a value of one at the node. At all other nodes, the value of the same polynomial is zero.
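The nodal property described above can be checked directly. The sketch below evaluates the product formula of Eqn. 2.5 and confirms that each basis function is one at its own node and zero at the others. The seven equally spaced nodes are an assumption made for illustration (the node placement used in the thesis is not stated here), and indices are zero-based in the code.

```python
def lagrange_basis(nodes, j, x):
    """Evaluate alpha_j(x) = prod_{k != j} (x - x_k) / (x_j - x_k), Eqn. 2.5."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != j:
            value *= (x - xk) / (nodes[j] - xk)
    return value


def interpolate(nodes, values, x):
    """Weight each basis function by its nodal value, as in Eqn. 2.6."""
    return sum(v * lagrange_basis(nodes, j, x) for j, v in enumerate(values))


# Seven equally spaced nodes on [-1, 1] give sixth order polynomials.
nodes = [-1.0 + 2.0 * i / 6 for i in range(7)]

# Table of alpha_j evaluated at every node: expected to be the identity.
kronecker = [[lagrange_basis(nodes, j, xi) for j in range(7)] for xi in nodes]

# A quadratic is reproduced exactly by the degree-6 interpolant.
y_at_0p3 = interpolate(nodes, [xn * xn for xn in nodes], 0.3)
```

Because the table of nodal values is the identity, the amplitudes of a nodal basis are simply the variable values at the nodes.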

Figure 2.2: A three element system with second order nodal (Lagrange) polynomials. Each node has a corresponding polynomial, and the nodes that share element boundaries have polynomials that span both elements. For instance $\alpha_3^1$ and $\alpha_1^2$ from the first and second element provide the $C^0$ continuity through the shared node.

2.4 Modal Basis Functions using Jacobi Polynomials

Modal basis sets have an arbitrary number of functions defined within each element and are not associated with any specific nodes. They have a polynomial defined for each order up to the highest specified. For instance, a third order element will have a linear, a quadratic, and a cubic basis function defined. This differs from the nodal basis sets, in which all polynomials are of the highest order specified: for a third order nodal element, all basis functions are cubic. A common modal basis set is the Jacobi polynomials, which are solutions to the Jacobi differential equation. They can be used effectively as modal basis functions in the finite element method because of their ability to provide $C^0$ continuity and their complete spectral sampling. It is also simple to compute the functions for an arbitrary polynomial order, which makes them a convenient choice for numerical methods. They are defined by Rodrigues' formula

\[
J_n^{(\alpha_p,\beta_p)}(x) = \frac{(-1)^n}{2^n\, n!}\, (1-x)^{-\alpha_p} (1+x)^{-\beta_p}\, \frac{d^n}{dx^n}\left[ (1-x)^{\alpha_p+n} (1+x)^{\beta_p+n} \right] \tag{2.7}
\]

for $\alpha_p, \beta_p > -1$, where $\alpha_p$ and $\beta_p$ are polynomial parameters and not the basis functions. A special case of the Jacobi polynomials is the Legendre polynomial, obtained when $\alpha_p = \beta_p = 0$.
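In practice the derivative form of Eqn. 2.7 is rarely evaluated directly. A minimal sketch, assuming the standard three-term recurrence for Jacobi polynomials (an equivalent definition, not necessarily the one used in the thesis code), is shown below. The function name `jacobi` and its argument order are illustrative.

```python
def jacobi(n, a, b, x):
    """Evaluate J_n^{(a,b)}(x) with the standard three-term recurrence.

    a and b play the role of alpha_p and beta_p in Eqn. 2.7 (both > -1).
    """
    if n == 0:
        return 1.0
    p_prev = 1.0                                  # J_0
    p_curr = 0.5 * (a - b + (a + b + 2.0) * x)    # J_1
    for m in range(2, n + 1):
        c = 2.0 * m + a + b
        lhs = 2.0 * m * (m + a + b) * (c - 2.0)
        t1 = (c - 1.0) * (c * (c - 2.0) * x + a * a - b * b)
        t2 = 2.0 * (m + a - 1.0) * (m + b - 1.0) * c
        p_prev, p_curr = p_curr, (t1 * p_curr - t2 * p_prev) / lhs
    return p_curr


# Legendre special case (a = b = 0): P_2(x) = (3x^2 - 1)/2.
legendre_2 = jacobi(2, 0.0, 0.0, 0.5)
# J_2^{(1,1)}(x) = (15x^2 - 3)/4.
jacobi_2_11 = jacobi(2, 1.0, 1.0, 0.5)
```

The recurrence costs a fixed amount of work per order, which is what makes arbitrary polynomial order inexpensive to support.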

In order to provide $C^0$ continuity with Jacobi polynomials, at least one of the functions must span continuously from one element to another. For simplicity the linear function is defined twice with opposite slopes, and all other functions go to zero at the element boundaries. These functions have the form

\[
P_0(x) = 1, \qquad P_{1a}(x) = \frac{1+x}{2}, \qquad P_{1b}(x) = \frac{1-x}{2}, \qquad P_n(x) = (1-x^2)\, J_{n-2}^{(\alpha_p,\beta_p)}(x) \quad (n \geq 2) \tag{2.8}
\]

where $J^{(\alpha_p,\beta_p)}$ is the Jacobi polynomial. This provides functions on the interval $x \in [-1, 1]$, which can be mapped linearly onto the domain range of choice. Figure 2.3 shows these polynomials up to seventh order for $\alpha_p = \beta_p = 1$. The linear functions provide the continuity between elements. Figure 2.4 shows how these linear functions provide the continuity in a three element system. The quadrature points are placed at the element boundaries and at the roots of the polynomials. These points could be placed anywhere in the element domain as long as there are at least as many points as basis functions.

Figure 2.3: Modal (Jacobi) polynomials $J_n^{(1,1)}$ with highest order of 7 on the domain $x \in [-1, 1]$. All polynomials are defined within the domain and go to zero at the domain boundaries, with the exception of the linear polynomials. The two linear polynomials range from zero on one boundary to one at the other boundary. These linear polynomials provide the continuity between elements.

Figure 2.4: A three element system with second order modal (Jacobi) polynomials $\alpha_n$. Each polynomial is not associated with any particular node, but defined at all points. The linear polynomials provide the $C^0$ continuity by spanning across element boundaries. Quadrature points are distributed evenly in this case and include the element boundaries, but they could also be defined only on the interior parts of the elements.

2.5 Basis Function Amplitudes

The finite element solver advances the amplitudes $\hat{q}$ of the basis functions as the solution. The actual primary variables can be recovered by evaluating the summation from Eqn. 2.3. This formulation is convenient because it enables the solution to be represented continuously within elements, rather than just at nodal locations. Consequently, the initial condition, flux, and source must also be represented as amplitudes of the basis functions rather than by the primary variables.

Initial Condition

A set of nodes with a size determined by the number of degrees of freedom in the problem is defined. The initial condition is then defined on this set of nodes. This initial condition represents the primary variables, but the solver needs to know the amplitudes of the basis functions corresponding to the primary variables. The amplitudes are found by solving a linear system for the whole domain. Figures 2.2 and 2.4 show an example domain consisting of three quadratic elements for nodal (Lagrange) and modal (Jacobi) polynomials respectively. The system of equations can be expressed in matrix form for the three element system

\[
\begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \\ q_5 \\ q_6 \\ q_7 \end{bmatrix}
=
\begin{bmatrix}
\alpha_1^1(1) & \alpha_2^1(1) & \alpha_3^1(1) & 0 & 0 & 0 & 0 \\
\alpha_1^1(2) & \alpha_2^1(2) & \alpha_3^1(2) & 0 & 0 & 0 & 0 \\
\alpha_1^1(3) & \alpha_2^1(3) & \alpha_3^1(3) & \alpha_2^2(3) & \alpha_3^2(3) & 0 & 0 \\
0 & 0 & \alpha_1^2(4) & \alpha_2^2(4) & \alpha_3^2(4) & 0 & 0 \\
0 & 0 & \alpha_1^2(5) & \alpha_2^2(5) & \alpha_3^2(5) & \alpha_2^3(5) & \alpha_3^3(5) \\
0 & 0 & 0 & 0 & \alpha_1^3(6) & \alpha_2^3(6) & \alpha_3^3(6) \\
0 & 0 & 0 & 0 & \alpha_1^3(7) & \alpha_2^3(7) & \alpha_3^3(7)
\end{bmatrix}
\begin{bmatrix} \hat{q}_1 \\ \hat{q}_2 \\ \hat{q}_3 \\ \hat{q}_4 \\ \hat{q}_5 \\ \hat{q}_6 \\ \hat{q}_7 \end{bmatrix}
\tag{2.9}
\]

where $q_n$ represents the primary variables defined at some point $n$, $\hat{q}_n$ are the amplitudes, superscripts denote the element, subscripts denote the basis function within the element, and the argument in parentheses is the point at which the function is evaluated. This equation is a simple linear system and can be solved by inverting the matrix of basis functions to find the corresponding amplitudes. The size of the system is determined by the number of degrees of freedom, which corresponds to the total number of polynomial basis functions defined in the problem. For instance, the system shown in Figure 2.4 requires the initial condition to be defined at seven points. There must be the same number of points as there are basis functions in order for the system to be solved. With the Jacobi polynomials described in Eqn. 2.8 and for the system shown in Figure 2.4, the matrix in Eqn. 2.9 is

\[
M_{\text{Jacobi}} = [\ldots] \tag{2.10}
\]

(Note: When using a nodal basis representation like Lagrange functions, this matrix is merely the identity matrix because each function is one at its corresponding node and zero at every other node. No inversion is required!)

Flux and Source

The flux and source amplitudes must also be found in a similar way to the initial primary variables. Since the flux and source are defined in terms of the primary variables, the primary variable amplitudes must be calculated first.

\[
f = \sum_i \alpha_i(x)\, \hat{f}_i, \qquad s = \sum_i \alpha_i(x)\, \hat{s}_i \tag{2.11}
\]

The variables $q$ from Eqn. 2.3 are used to compute the flux $f$ and source $s$ at the initial points. Then a system of equations similar to the initial condition system is formed for the flux and source. Writing $[\alpha_n]$ for the $7 \times 7$ matrix of basis function values in Eqn. 2.9, the systems are

\[
\begin{bmatrix} f_1 & f_2 & \cdots & f_7 \end{bmatrix}^T = [\alpha_n] \begin{bmatrix} \hat{f}_1 & \hat{f}_2 & \cdots & \hat{f}_7 \end{bmatrix}^T \tag{2.12}
\]

\[
\begin{bmatrix} s_1 & s_2 & \cdots & s_7 \end{bmatrix}^T = [\alpha_n] \begin{bmatrix} \hat{s}_1 & \hat{s}_2 & \cdots & \hat{s}_7 \end{bmatrix}^T \tag{2.13}
\]

Notice that the matrices are identical because they are a representation of the geometry and element connectivity, which remains constant for the flux and source. These equations must be solved every time the flux and source are evaluated. At minimum this occurs once per time step, although since the matrix is identical it only needs to be inverted or factored once before the time stepping begins.

2.6 Gaussian Quadrature

The integrals arising from the weak form of the equations need to be calculated in some way. A numerical quadrature is a simple way to integrate some arbitrary function $G(x)$ where the analytic integral might not be known. The method approximates the integral as a summation of the function evaluated at some quadrature points $x_i$ multiplied by some weighting values $w_i$. This has the form

\[
\int_a^b G(x)\, dx \approx \sum_{i=1}^{n} w_i\, G(x_i) \tag{2.14}
\]

where each quadrature point $x_i \in [a, b]$ has a corresponding weight $w_i$ associated with it. The method finds the quadrature points using the roots of some polynomial set. Usually these points are found on the interval $[-1, 1]$ and are then transformed to some physical interval $[a, b]$. The method also finds corresponding weight values specific to the polynomial set. With the quadrature points and weighting values known, the summation is evaluated to approximate the integral in Eqn. 2.14. The polynomial set used plays a role in the convergence rates of the solution. For instance, using the weights and roots of the Jacobi polynomials to perform numerical integration of Jacobi functions provides spectral convergence of the solution [1]. Different quadrature types can be used for different basis functions, but this will not necessarily ensure spectral convergence.
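The quadrature procedure can be sketched as follows. This illustration uses Gauss-Legendre points, i.e. the roots of the Legendre polynomials (the $\alpha_p = \beta_p = 0$ Jacobi case), found by Newton iteration; the helper names and the choice of Legendre rather than general Jacobi points are assumptions for the example. It then integrates products of the two linear functions $(1 \pm x)/2$, the kind of integrand that arises from the weak form.

```python
import math


def legendre_and_derivative(n, x):
    """Return P_n(x) and P_n'(x) for the Legendre polynomial (n >= 1, |x| < 1)."""
    p_prev, p_curr = 1.0, x
    for m in range(2, n + 1):
        p_prev, p_curr = p_curr, ((2 * m - 1) * x * p_curr - (m - 1) * p_prev) / m
    dp = n * (x * p_curr - p_prev) / (x * x - 1.0)
    return p_curr, dp


def gauss_legendre(n):
    """Points and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    points, weights = [], []
    for i in range(n):
        x = math.cos(math.pi * (i + 0.75) / (n + 0.5))  # initial guess for a root
        for _ in range(100):                             # Newton iteration on P_n
            p, dp = legendre_and_derivative(n, x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        _, dp = legendre_and_derivative(n, x)
        points.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return points, weights


def integrate(g, a, b, n):
    """Approximate the integral of g over [a, b] as in Eqn. 2.14."""
    points, weights = gauss_legendre(n)
    half, mid = 0.5 * (b - a), 0.5 * (a + b)  # linear map from [-1, 1] to [a, b]
    return half * sum(w * g(mid + half * x) for x, w in zip(points, weights))


def hat_a(x):
    return 0.5 * (1.0 + x)


def hat_b(x):
    return 0.5 * (1.0 - x)


m_aa = integrate(lambda x: hat_a(x) * hat_a(x), -1.0, 1.0, 2)  # exact value 2/3
m_ab = integrate(lambda x: hat_a(x) * hat_b(x), -1.0, 1.0, 2)  # exact value 1/3
```

An $n$-point Gauss-Legendre rule integrates polynomials up to degree $2n-1$ exactly, so two points suffice for these quadratic integrands.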

Chapter 3

SOLVER FORMULATION

3.1 General Equation Form

To make the solver general, the flux-source equation form is used. This equation involves a vector of primary variables $q$ and the fluxes $f$ and sources $s$ associated with each of the primary variables.

\[
\frac{\partial q}{\partial t} + \nabla \cdot f = s \tag{3.1}
\]

By applying the Galerkin spatial discretization described in Section 2.2, the weak form of the equation results.

\[
\int_\Omega \alpha_i \frac{\partial q}{\partial t}\, d\mathbf{x} + \int_\Omega \alpha_i \nabla \cdot f\, d\mathbf{x} = \int_\Omega \alpha_i s\, d\mathbf{x} \tag{3.2}
\]

3.2 Spatial Discretization

Further spatial discretization is performed by expanding $q$, $f$, and $s$ with respect to the basis functions and their amplitudes.

\[
q = \sum_j \alpha_j(x)\, \hat{q}_j(t), \qquad f = \sum_j \alpha_j(x)\, \hat{f}_j(t), \qquad s = \sum_j \alpha_j(x)\, \hat{s}_j(t) \tag{3.3}
\]

where $\alpha_j(x)$ is the $j$th basis function, and $\hat{q}_j$, $\hat{f}_j$, and $\hat{s}_j$ are the $j$th amplitudes of the basis functions. In one dimension with this representation, after dropping the summation notation, Eqn. 3.2 becomes

\[
\left\{ \int_\Omega \alpha_i \alpha_j\, dx \right\} \frac{\partial \hat{q}_j}{\partial t} + \left\{ \int_\Omega \alpha_i \frac{\partial \alpha_j}{\partial x}\, dx \right\} \hat{f}_j = \left\{ \int_\Omega \alpha_i \alpha_j\, dx \right\} \hat{s}_j \tag{3.4}
\]

Notice that the spatial component of the primary variables is entirely represented by the basis functions, and therefore the amplitudes can be taken outside the integrals. Each of the integrals has a two index summation and can be represented as an element matrix.

\[
M_e \frac{\partial \hat{q}}{\partial t} + K_e \hat{f} = M_e \hat{s} \tag{3.5}
\]

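where $\hat{q}$, $\hat{f}$, and $\hat{s}$ are vectors of $\hat{q}_j$, $\hat{f}_j$, and $\hat{s}_j$ from Eqn. 3.3 and

\[
M_e = \int_\Omega \alpha_i \alpha_j\, dx, \qquad K_e = \int_\Omega \alpha_i \frac{\partial \alpha_j}{\partial x}\, dx. \tag{3.6}
\]

Each element matrix can be assembled into a global matrix that represents the whole domain

\[
M \frac{\partial \hat{q}}{\partial t} + K \hat{f} = M \hat{s}. \tag{3.7}
\]

3.3 Mass and Stiffness Matrices Construction

The mass matrix $M$ and stiffness matrix $K$ arise from the weak form of the flux-source equation (Eqn. 2.1) when the basis functions are separated from the primary variables. The integrals are calculated using numerical quadrature, and an element matrix is calculated. These matrices represent the coupling between spatial functions. For the system shown in Figure 2.4 the element mass matrix for element 1 is

\[
M_{e1} = \sum_k w_k
\begin{bmatrix}
\alpha_1^1 \alpha_1^1 & \alpha_1^1 \alpha_2^1 & \alpha_1^1 \alpha_3^1 \\
\alpha_2^1 \alpha_1^1 & \alpha_2^1 \alpha_2^1 & \alpha_2^1 \alpha_3^1 \\
\alpha_3^1 \alpha_1^1 & \alpha_3^1 \alpha_2^1 & \alpha_3^1 \alpha_3^1
\end{bmatrix}
\tag{3.8}
\]

where $e1$ represents the first element, the superscripts represent the element number, the subscripts represent the basis function, and each product is evaluated at the quadrature point $x_k$. The other elements are analogous. The global mass matrix is assembled by adding each element matrix into a large $N \times N$ matrix, where $N$ is the total number of basis functions for the system. When elements share basis functions, the element matrices overlap in the global matrix and are added together. This summation is really just adding together the two parts of the integral, which is split at the element boundary. For the three element system the mass matrix is

\[
\sum_k w_k
\begin{bmatrix}
\alpha_1^1 \alpha_1^1 & \alpha_1^1 \alpha_2^1 & \alpha_1^1 \alpha_3^1 & 0 & 0 & 0 & 0 \\
\alpha_2^1 \alpha_1^1 & \alpha_2^1 \alpha_2^1 & \alpha_2^1 \alpha_3^1 & 0 & 0 & 0 & 0 \\
\alpha_3^1 \alpha_1^1 & \alpha_3^1 \alpha_2^1 & \alpha_3^1 \alpha_3^1 + \alpha_1^2 \alpha_1^2 & \alpha_1^2 \alpha_2^2 & \alpha_1^2 \alpha_3^2 & 0 & 0 \\
0 & 0 & \alpha_2^2 \alpha_1^2 & \alpha_2^2 \alpha_2^2 & \alpha_2^2 \alpha_3^2 & 0 & 0 \\
0 & 0 & \alpha_3^2 \alpha_1^2 & \alpha_3^2 \alpha_2^2 & \alpha_3^2 \alpha_3^2 + \alpha_1^3 \alpha_1^3 & \alpha_1^3 \alpha_2^3 & \alpha_1^3 \alpha_3^3 \\
0 & 0 & 0 & 0 & \alpha_2^3 \alpha_1^3 & \alpha_2^3 \alpha_2^3 & \alpha_2^3 \alpha_3^3 \\
0 & 0 & 0 & 0 & \alpha_3^3 \alpha_1^3 & \alpha_3^3 \alpha_2^3 & \alpha_3^3 \alpha_3^3
\end{bmatrix}
\tag{3.9}
\]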

where the superscripts represent the element number. The summed values (i.e. $\alpha_3^1 \alpha_3^1 + \alpha_1^2 \alpha_1^2$) should be the same between elements, since they only represent an integral of the spatial basis function over the same size domain. The stiffness matrix is similar except that it represents the coupling between the basis functions $\alpha$ and their derivatives $\partial \alpha / \partial x$. The matrix should have a sparsity pattern similar to that of the mass matrix.

3.4 Nonlinear Solver with Implicit Time Advance

For an explicit time advance Eqn. 3.7 is modified to

\[
M \left( \frac{q^{n+1} - q^n}{\Delta t} \right) = M s^n - K f^n = X^n \tag{3.10}
\]

where $n$ signifies the time step and the vector notation has been dropped for $q$, $f$, and $s$. With an implicit time advance using the $\Theta$ scheme, the equation is

\[
M \left( \frac{q^{n+1} - q^n}{\Delta t} \right) = \left[ \Theta X^{n+1} + (1 - \Theta) X^n \right] \tag{3.11}
\]

Since $X^{n+1}$ is not known, an iterative scheme is used to solve the equation, and it is rewritten in terms of a residual $R$ as a function of the unknown $q^{n+1}$,

\[
R(q^{n+1}) = M \left( \frac{q^{n+1} - q^n}{\Delta t} \right) - \left[ \Theta X^{n+1} + (1 - \Theta) X^n \right] = 0 \tag{3.12}
\]

where $X^{n+1}$ is also a function of $q^{n+1}$. Newton's method is used, which solves the equation for $R(q^{n+1}) = 0$. The method is formulated by approximating the function $R$ using a Taylor series expansion,

\[
R(q^{n+1}) \approx R(q^k) + \left( \frac{\partial R}{\partial q^{n+1}} \right)_{q^k} \Delta q = 0 \tag{3.13}
\]

where $\Delta q = q^{k+1} - q^k$ and the index $k$ is the iterate. This is then rewritten as

\[
\left( \frac{\partial R}{\partial q^{n+1}} \right)_{q^k} \Delta q = -R(q^k) \tag{3.14}
\]

which is a linear system and can be solved for $\Delta q$ provided the Jacobian $\left( \partial R / \partial q^{n+1} \right)_{q^k}$ and the function $R(q^k)$ are known. The Jacobian is found by taking a derivative of $R$ with respect to $q^{n+1}$.

\[
\left( \frac{\partial R}{\partial q^{n+1}} \right)_{q^k} = \left[ \frac{M}{\Delta t} \frac{\partial q^{n+1}}{\partial q^{n+1}} - \frac{\partial}{\partial q^{n+1}} \left( \Theta X^{n+1} + (1 - \Theta) X^n \right) \right]_{q^k} \tag{3.15}
\]

This equation simplifies to

\[
\left( \frac{\partial R}{\partial q^{n+1}} \right)_{q^k} = \frac{M}{\Delta t} - \Theta \left( \frac{\partial X^{n+1}}{\partial q^{n+1}} \right)_{q^k} \tag{3.16}
\]

The resulting Jacobian can be used in Eqn. 3.14, along with the function evaluation at the current iterate, to solve the linear system for $\Delta q$. The iterate value is then updated

\[
q^{k+1} = q^k + \Delta q \tag{3.17}
\]

Since $q^k$ is only an estimate for $q^{n+1}$, the solution to the linear system is approximate. The inaccuracy can be measured by evaluating Eqn. 3.12 with the updated iterate value $q^{k+1}$ and comparing to some tolerance.

\[
R(q^{n+1}) \big|_{q^{k+1}} \leq \text{tol} \approx 0 \tag{3.18}
\]

If the evaluation of the function is within the tolerance limits, the solution is considered converged. Otherwise the process is repeated: the function and Jacobian are evaluated with $q^{k+1}$, the linear system from Eqn. 3.14 is solved again, the iterate value is updated, and the function is again checked against the tolerance. When the tolerance is met,

\[
q^k \approx q^{n+1} \tag{3.19}
\]

and the iterate value is considered the solution at the next time step $q^{n+1}$.

3.5 Jacobians with Respect to Basis Functions

The Jacobian from Eqn. 3.16 is needed in the Newton method solution process and is defined using derivatives with respect to the basis function amplitudes. Since the Jacobian is defined in terms of amplitudes of basis functions, it needs to be calculated in much the same way as for the initial condition and the flux and source amplitudes. After using the original definition for $X$,

\[
\frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}} \left[ \frac{M}{\Delta t} \hat{q} - \Theta \left( M s^{n+1} - K f^{n+1} \right) \right] \tag{3.20}
\]

which is rewritten as

\[
\frac{\partial R}{\partial \hat{q}} = \frac{M}{\Delta t} - \Theta \left[ M \frac{\partial s}{\partial \hat{q}} - K \frac{\partial f}{\partial \hat{q}} \right] \tag{3.21}
\]
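The Newton iteration of Section 3.4 can be sketched on a scalar problem, where the mass matrix reduces to a number and the linear solve of Eqn. 3.14 to a division. The test problem $dq/dt = X(q) = -q^2$ with $q(0) = 1$, whose exact solution is $q(t) = 1/(1+t)$, and the function name `theta_step` are assumptions made for this illustration; the thesis solver itself uses the PETSc nonlinear solvers.

```python
def theta_step(q_n, dt, theta, X, dX_dq, mass=1.0, tol=1e-12, max_iter=20):
    """Advance M dq/dt = X(q) by one step of the Theta scheme using Newton's method."""
    x_n = X(q_n)
    q_k = q_n  # initial iterate: solution at the previous time level
    for _ in range(max_iter):
        # Residual, Eqn. 3.12
        residual = mass * (q_k - q_n) / dt - (theta * X(q_k) + (1.0 - theta) * x_n)
        if abs(residual) <= tol:  # convergence check, Eqn. 3.18
            break
        jacobian = mass / dt - theta * dX_dq(q_k)  # Eqn. 3.16
        q_k += -residual / jacobian                # Eqns. 3.14 and 3.17
    return q_k


def X(q):
    return -q * q


def dX_dq(q):
    return -2.0 * q


# Theta = 0.5 (time-centered): ten steps of dt = 0.1 from q(0) = 1 to t = 1.
q = 1.0
for _ in range(10):
    q = theta_step(q, 0.1, 0.5, X, dX_dq)

# Theta = 1 (fully implicit): a single step, which satisfies dt*q^2 + q - 1 = 0.
q_backward = theta_step(1.0, 0.1, 1.0, X, dX_dq)
```

With $\Theta = 0.5$ the result after ten steps is close to the exact value $q(1) = 0.5$, and with $\Theta = 1$ the single step reproduces the positive root of the quadratic it implicitly solves.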

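The flux, $\partial\hat{f}/\partial\hat{q}$, and source, $\partial\hat{s}/\partial\hat{q}$, Jacobians are needed in terms of the amplitudes $\hat{q}$. It can be seen that

$$ \frac{\partial \hat{f}}{\partial \hat{q}} = \frac{\partial f}{\partial q}\frac{\partial q}{\partial \hat{q}} \quad\text{and}\quad \frac{\partial \hat{s}}{\partial \hat{q}} = \frac{\partial s}{\partial q}\frac{\partial q}{\partial \hat{q}} \tag{3.22} $$

If $q$ is expanded in terms of the basis functions it can be seen that

$$ \frac{\partial q}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}}\sum_j \alpha_j \hat{q}_j = \alpha_j \tag{3.23} $$

and thus

$$ \frac{\partial \hat{f}}{\partial \hat{q}} = \frac{\partial f}{\partial q}\alpha_j \quad\text{and}\quad \frac{\partial \hat{s}}{\partial \hat{q}} = \frac{\partial s}{\partial q}\alpha_j \tag{3.24} $$

With the equalities from Eqn. 3.24 a linear system can be constructed in much the same manner as in section 2.5 for the initial condition, flux, and source amplitudes. The constructed linear system can be solved for $\partial\hat{s}/\partial\hat{q}_l$ and $\partial\hat{f}/\partial\hat{q}_l$ with respect to a particular basis function $l$. Each $l$ represents a column in a resulting matrix.

$$ \begin{Bmatrix} \left[\frac{\partial s}{\partial q}\right]_1 \alpha_1 \\ \left[\frac{\partial s}{\partial q}\right]_2 \alpha_2 \\ \vdots \\ \left[\frac{\partial s}{\partial q}\right]_n \alpha_n \end{Bmatrix}_l = [\alpha_n] \begin{Bmatrix} \left[\frac{\partial \hat{s}}{\partial \hat{q}}\right]_1 \alpha_1 \\ \left[\frac{\partial \hat{s}}{\partial \hat{q}}\right]_2 \alpha_2 \\ \vdots \\ \left[\frac{\partial \hat{s}}{\partial \hat{q}}\right]_n \alpha_n \end{Bmatrix}_l \tag{3.25} $$

where $[\alpha_n]$ is a matrix that is identical to the matrix from section 2.5. In the case of multiple primary variables each block in Eqn. 3.25 is an $N_{eq} \times N_{eq}$ matrix, where $N_{eq}$ is the number of primary variables. For a system with three variables, the first block would look like

$$ \begin{bmatrix} \dfrac{\partial s^1}{\partial q^1} & \dfrac{\partial s^1}{\partial q^2} & \dfrac{\partial s^1}{\partial q^3} \\ \dfrac{\partial s^2}{\partial q^1} & \dfrac{\partial s^2}{\partial q^2} & \dfrac{\partial s^2}{\partial q^3} \\ \dfrac{\partial s^3}{\partial q^1} & \dfrac{\partial s^3}{\partial q^2} & \dfrac{\partial s^3}{\partial q^3} \end{bmatrix}_1 \tag{3.26} $$

where the superscript represents the different primary variables.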

Equation 3.25 is analogous to equations 2.3 and 2.9 in section 2.5, where variables at known points are used to solve for the amplitudes of the basis functions. In this case the Jacobian is known at some specific locations, and a Jacobian defined in terms of the amplitudes of the basis functions is needed. The vector on the left hand side in Eqn. 3.25 represents the known values, which are used to solve for the amplitude values.
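The Newton iteration of Section 3.4 can be illustrated on the simplest possible case. The following is a minimal sketch, not the thesis implementation: it assumes a single scalar unknown with $M = 1$, so the linear solve of Eqn. 3.14 reduces to a division, and the names `theta_step`, `X`, and `dXdq` are hypothetical.

```python
def theta_step(q_n, dt, X, dXdq, theta=0.5, tol=1e-12, max_iter=50):
    """Advance dq/dt = X(q) by one step with the Theta scheme and Newton's method.

    Scalar illustration only (M = 1). X is the right-hand side and dXdq its
    derivative; both are user-supplied callables (hypothetical names).
    """
    q_k = q_n                # initial iterate: the previous time level
    X_n = X(q_n)             # explicit part, fixed during the iteration
    for _ in range(max_iter):
        # Residual, Eqn. 3.12
        R = (q_k - q_n) / dt - (theta * X(q_k) + (1.0 - theta) * X_n)
        if abs(R) <= tol:    # convergence check, Eqn. 3.18
            return q_k
        J = 1.0 / dt - theta * dXdq(q_k)   # Jacobian, Eqn. 3.16 with M = 1
        dq = -R / J                        # linear system, Eqn. 3.14
        q_k = q_k + dq                     # update, Eqn. 3.17
    raise RuntimeError("Newton iteration did not converge")


# Example: one Crank-Nicolson step (theta = 0.5) of dq/dt = -q
q_new = theta_step(1.0, 0.1, lambda q: -q, lambda q: -1.0, theta=0.5)
```

For a linear right-hand side the iteration converges after a single update, which makes this a convenient check before moving to a nonlinear flux or source.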

Chapter 4 GENERAL BOUNDARY CONDITIONS

The goal is to have a generalized form of the boundary conditions such that it is easy to specify boundary fluxes or to specify a separate equation to be solved on the boundary. This is accomplished by keeping lists of boundary nodes and interior nodes. With these lists the equations that are specified for boundaries are applied only to boundary nodes, while the interior equations are solved on all the interior nodes. Two major types of boundary specification are used. One is the natural boundary condition, where the flux is controlled, and the second involves specifying an alternative arbitrary boundary equation.

4.1 Natural Boundary Condition

A natural boundary condition is applied by specifying the flux term of the weak form of the governing equation (Eqn. 2.2). In one dimension the equation looks like

$$ \underbrace{\int_\Omega \alpha \frac{\partial f}{\partial x}\,dx}_{\text{flux term}} = \underbrace{-\int_\Omega \frac{\partial \alpha}{\partial x} f\,dx}_{\text{volume term}} + \underbrace{\big[\alpha f\big]_{\partial\Omega}}_{\text{surface term}} \tag{4.1} $$

which is derived by integrating the term by parts and separating it into a volume term and a surface term, where $\Omega$ is the domain of interest and $\partial\Omega$ is its boundary. In one dimension, the surface term is a point evaluation, because each boundary consists of one node. This evaluation represents the amount of flux through the boundary nodes. Therefore the surface term can be specified to control the flux of the primary variables. For instance, if one were to examine the fluid continuity equation

$$ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = 0 \tag{4.2} $$

the resulting surface term is $[\alpha\rho u]_{\partial\Omega}$, which is the momentum $\rho u$ multiplied by the basis function $\alpha$ evaluated at the boundary. The momentum flux boundary condition can be controlled by specifying the value of this term.

The specification of the flux term has several variants. The surface term can be treated identically to an interior equation, zeroed, or explicitly set to some value. When the surface term is treated identically to the interior elements, the flux originates through the surface term, and contributions to the term only originate from the element interior. It is as if the contribution from a neighboring element were excluded, but in this case it is a physical boundary. This is a useful boundary condition when no reflections are desired at the boundary. When the flux term is zeroed it is called a zero-flux boundary condition. This means the term is completely removed, which is useful for specifying a solid wall boundary. The third variant involves explicitly specifying the flux, which is useful for specifying inflow and outflow conditions on a boundary.

4.2 Specifying Different Boundary Equations

As an alternative to specifying the boundary flux, a separate equation can be specified for boundary nodes. The governing equation is replaced by another equation on the boundary nodes, while the interior nodes remain with the standard governing equation. This is effective for specifying Dirichlet and Neumann boundary conditions.

4.2.1 Dirichlet Boundary Condition

Dirichlet on Primary Variable

Specifying a Dirichlet boundary condition involves changing the governing equation to

$$ q = \beta_D \tag{4.3} $$

where $\beta_D$ is some specified value for the primary variable $q$, which can potentially be time dependent. In order to solve this equation in the finite element method described, the equation is modified on the boundary to

$$ R_b = q - \beta_D = 0 \tag{4.4} $$

Similarly to the interior equation, this is converted to the weak form in one dimension and the variable is expanded in terms of the basis functions

$$ R_b = \int_\Omega \delta(x - x_b)\left[\sum_j \alpha_j(x)\,\hat{q}_j - \beta_D\right] dx = 0 \tag{4.5} $$

In this case, rather than integrating over the whole domain with the basis function, an evaluation at the boundary is performed using a delta function $\delta(x - x_b)$ about the boundary location $x_b$. The delta function is critical because it reduces the integral to an evaluation and excludes the contribution of the basis functions integrated over the element domain. Despite the fact that all but one of the basis functions are zero at the element boundary, their integrals over the element domain are nonzero and would impact the boundary node. The primary variable $q$ is expanded in terms of basis functions and amplitudes and the delta function collapses the integral.

$$ R_b = \sum_j \alpha_j(x_b)\,\hat{q}_j - \beta_D = 0 \tag{4.6} $$

The summation is now over each of the basis functions at the boundary, and since only one has a nonzero value there, the summation is dropped and the equation simplifies

$$ R_b = \sum_j \alpha_j(x_b)\,\hat{q}_j - \beta_D \;\;\rightarrow\;\; R_b = \alpha_j(x_b)\,\hat{q}_j - \beta_D \tag{4.7} $$

where $x_b$ is $x$ at the boundary. (Note: in general all the basis functions can have nonzero values at the element boundaries, and this would lead to different continuity properties between elements. For simplicity this formulation uses only one nonzero basis function to provide the continuity, while all others are zero at the boundary.)

The Jacobian also needs to be altered for the boundary equation.

$$ \frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}_j}\left[\alpha_j(x_b)\,\hat{q}_j - \beta_D\right] \tag{4.8} $$

This simplifies to

$$ \frac{\partial R}{\partial \hat{q}} = \sum_j \alpha_j(x_b) \;\;\rightarrow\;\; \alpha_j(x_b) \tag{4.9} $$

where $x_b$ is $x$ at the boundary.
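The Dirichlet boundary residual and its Jacobian row can be written out in a few lines. This is a minimal sketch under stated assumptions: a two-node linear nodal basis on a reference element $\xi \in [0, 1]$ is assumed, and the function names are hypothetical rather than taken from the thesis code.

```python
def linear_basis(xi):
    """Values of the two linear nodal basis functions at reference coordinate xi in [0, 1]."""
    return [1.0 - xi, xi]


def dirichlet_residual_and_jacobian(q_hat, xi_b, beta_D):
    """Boundary residual R_b (Eqn. 4.6) and its Jacobian row (Eqn. 4.9).

    q_hat  : amplitudes of the primary variable on the boundary element
    xi_b   : reference coordinate of the boundary node (0.0 or 1.0)
    beta_D : prescribed Dirichlet value
    """
    alpha = linear_basis(xi_b)
    R_b = sum(a * q for a, q in zip(alpha, q_hat)) - beta_D
    dR_dq = list(alpha)   # dR_b / dq_hat_j = alpha_j(x_b)
    return R_b, dR_dq


# Left boundary node: only the first basis function is nonzero there
R_b, row = dirichlet_residual_and_jacobian([2.0, 5.0], 0.0, 1.5)
```

With a nodal basis the Jacobian row has a single nonzero entry, so the boundary equation couples only to the amplitude at the boundary node.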

Dirichlet on Non-Primary Variable

To hold a non-primary variable fixed at the boundary the condition is

$$ R_b = \frac{q_2}{q_1} - \beta_D = 0 \tag{4.10} $$

where $q_1$ and $q_2$ are each primary variables and some combination (possibly nonlinear) yields the desired condition. For example, if $q_1 = \rho$ and $q_2 = \rho u$, then $q_2/q_1 = u$, and $u$ is the quantity to be held fixed. In the weak form using a delta function, with $q_1$ and $q_2$ expanded in terms of the basis functions, the equation is

$$ R_b = \int_\Omega \delta(x - x_b)\left(\frac{\sum_j \alpha_j(x)\,\hat{q}_j^{2}}{\sum_j \alpha_j(x)\,\hat{q}_j^{1}} - \beta_D\right) dx = 0 \tag{4.11} $$

Similar to the other case, this simplifies to

$$ R_b = \frac{\sum_j \alpha_j(x_b)\,\hat{q}_j^{2}}{\sum_j \alpha_j(x_b)\,\hat{q}_j^{1}} - \beta_D = 0 \tag{4.12} $$

The function $R_b$ is trivial to evaluate, but since the equation is a function of more than one of the primary variables, the Jacobian is more complicated.

$$ \frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}}\left[\frac{\sum_j \alpha_j(x)\,\hat{q}_j^{2}}{\sum_j \alpha_j(x)\,\hat{q}_j^{1}}\right] \;\;\rightarrow\;\; \frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}}\left[\frac{\alpha_j(x)\,\hat{q}_j^{2}}{\alpha_j(x)\,\hat{q}_j^{1}}\right]_{x = x_b} \tag{4.13} $$

where $\hat{q}$ includes all primary variables $q_1, q_2, \ldots, q_n$, and $x_b$ is $x$ at the boundary. This equation must be evaluated and used as the Jacobian at the boundary.

4.2.2 Neumann Boundary Condition

Neumann on Primary Variable

A Neumann condition imposed on the boundary has the form

$$ \left.\frac{\partial q}{\partial x}\right|_{\partial\Omega} = \beta_N \tag{4.14} $$

The boundary equation is now

$$ R_b = \frac{\partial q}{\partial x} - \beta_N = 0 \tag{4.15} $$

and the equation solved at the boundary in the weak form using a delta function is

$$ R_b = \int_\Omega \delta(x - x_b)\left(\frac{\partial q}{\partial x} - \beta_N\right) dx = 0 \tag{4.16} $$

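Again the delta function $\delta(x - x_b)$ is used to evaluate at the boundary rather than integrating over the whole domain. By expanding $q$ in terms of the basis functions the equation can be rewritten as

$$ R_b = \left.\sum_j \frac{\partial \alpha_j(x)}{\partial x}\,\hat{q}_j\right|_{x = x_b} - \beta_N = 0 \tag{4.17} $$

where $x_b$ is $x$ at the boundary. Similar to the Dirichlet conditions, this amounts to changing the $R$ function at the boundary to Eqn. 4.17. The Jacobian will also be different and has the form

$$ \frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}}\left[\sum_j \alpha'_j(x_b)\,\hat{q}_j - \beta_N\right] \tag{4.18} $$

where $\alpha' = \partial\alpha/\partial x$. This equation simplifies to

$$ \frac{\partial R}{\partial \hat{q}} = \sum_j \alpha'_j(x_b) \tag{4.19} $$

This is analogous to the Dirichlet case, except that the basis function evaluation is a derivative.

Neumann on Non-Primary Variable

A Neumann boundary condition on a non-primary variable is slightly more complicated than the primary variable case. Again, as an example, $q_2/q_1$ is used as the non-primary variable.

$$ \frac{\partial}{\partial x}\left(\frac{q_2}{q_1}\right) = \beta_N \tag{4.20} $$

The weak form using a delta function is

$$ R_b = \int_\Omega \delta(x - x_b)\left[\frac{\partial}{\partial x}\left(\frac{q_2}{q_1}\right) - \beta_N\right] dx = 0 \tag{4.21} $$

Expanding $q_1$ and $q_2$ with respect to the basis functions and collapsing the integral and delta function gives

$$ R_b = \left.\frac{\sum_j \alpha'_j(x)\,\hat{q}_j^{2}}{\sum_j \alpha_j(x)\,\hat{q}_j^{1}}\right|_{x = x_b} - \left.\frac{\left(\sum_j \alpha_j(x)\,\hat{q}_j^{2}\right)\left(\sum_j \alpha'_j(x)\,\hat{q}_j^{1}\right)}{\left(\sum_j \alpha_j(x)\,\hat{q}_j^{1}\right)^{2}}\right|_{x = x_b} - \beta_N = 0 \tag{4.22} $$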

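where $x_b$ is $x$ at the boundary. Again the Jacobian is more complicated and looks like

$$ \frac{\partial R}{\partial \hat{q}} = \frac{\partial}{\partial \hat{q}}\left[\frac{\sum_j \alpha'_j(x)\,\hat{q}_j^{2}}{\sum_j \alpha_j(x)\,\hat{q}_j^{1}} - \frac{\left(\sum_j \alpha_j(x)\,\hat{q}_j^{2}\right)\left(\sum_j \alpha'_j(x)\,\hat{q}_j^{1}\right)}{\left(\sum_j \alpha_j(x)\,\hat{q}_j^{1}\right)^{2}}\right]_{x = x_b} \tag{4.23} $$

4.2.3 Summary of Boundary Conditions

For both Neumann and Dirichlet boundary conditions applied to a primary variable (e.g. $q_1$, $q_2$, $q_3$, etc.) and to a non-primary variable (e.g. $q_2/q_1$), the function evaluation and Jacobian differ from the interior equations. Table 4.1 summarizes the different equation forms for the function $R_b$ and Jacobian $R'_b$ at boundaries.

Table 4.1: Function $R_b$ and Jacobian $R'_b$ equations for both Neumann and Dirichlet boundary conditions applied to a primary (conserved) variable and a non-primary (non-conserved) variable. This table applies to both nodal and modal basis functions, where all but one basis function is zero at the boundary. All sums are evaluated at $x = x_b$.

Dirichlet, Conserved
    $R_b = \sum_j \alpha_j \hat{q}_j - \beta$
    $R'_b = \alpha_j$

Dirichlet, Non-Conserved
    $R_b = \dfrac{\sum_j \alpha_j \hat{q}_j^{2}}{\sum_j \alpha_j \hat{q}_j^{1}} - \beta$
    $R'_b = \dfrac{\partial}{\partial \hat{q}}\left[\dfrac{\alpha_j \hat{q}_j^{2}}{\alpha_j \hat{q}_j^{1}}\right]$

Neumann, Conserved
    $R_b = \sum_j \alpha'_j \hat{q}_j - \beta$
    $R'_b = \alpha'_j$

Neumann, Non-Conserved
    $R_b = \dfrac{\sum_j \alpha'_j \hat{q}_j^{2}}{\sum_j \alpha_j \hat{q}_j^{1}} - \dfrac{\left(\sum_j \alpha_j \hat{q}_j^{2}\right)\left(\sum_j \alpha'_j \hat{q}_j^{1}\right)}{\left(\sum_j \alpha_j \hat{q}_j^{1}\right)^{2}} - \beta$
    $R'_b = \dfrac{\partial}{\partial \hat{q}}\left[\dfrac{\sum_j \alpha'_j \hat{q}_j^{2}}{\sum_j \alpha_j \hat{q}_j^{1}} - \dfrac{\left(\sum_j \alpha_j \hat{q}_j^{2}\right)\left(\sum_j \alpha'_j \hat{q}_j^{1}\right)}{\left(\sum_j \alpha_j \hat{q}_j^{1}\right)^{2}}\right]$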
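The non-primary rows of Table 4.1 leave the derivative with respect to the amplitudes unexpanded. The sketch below carries out that differentiation by the quotient rule for the ratio $q_2/q_1$. It is an illustration under assumptions, not the thesis code: the function name `ratio_bc` and its argument layout are hypothetical, and the basis values and derivatives at $x_b$ are passed in as plain lists.

```python
def ratio_bc(alpha, dalpha, q1_hat, q2_hat, beta, kind):
    """Residual and Jacobian rows for a boundary condition on q2/q1.

    alpha, dalpha  : basis values alpha_j(x_b) and derivatives alpha'_j(x_b)
    q1_hat, q2_hat : amplitudes of the two primary variables
    kind           : "dirichlet" (Eqn. 4.12) or "neumann" (Eqn. 4.22)
    Returns (R_b, dR/dq1_hat, dR/dq2_hat); the Jacobian rows are lists over j.
    """
    N1 = sum(a * q for a, q in zip(alpha, q1_hat))    # q1 at x_b
    N2 = sum(a * q for a, q in zip(alpha, q2_hat))    # q2 at x_b
    if kind == "dirichlet":
        R = N2 / N1 - beta
        dR_dq1 = [-N2 * a / N1**2 for a in alpha]
        dR_dq2 = [a / N1 for a in alpha]
    elif kind == "neumann":
        D1 = sum(d * q for d, q in zip(dalpha, q1_hat))   # dq1/dx at x_b
        D2 = sum(d * q for d, q in zip(dalpha, q2_hat))   # dq2/dx at x_b
        R = D2 / N1 - N2 * D1 / N1**2 - beta
        dR_dq1 = [-(D2 * a + N2 * d) / N1**2 + 2.0 * N2 * D1 * a / N1**3
                  for a, d in zip(alpha, dalpha)]
        dR_dq2 = [d / N1 - D1 * a / N1**2 for a, d in zip(alpha, dalpha)]
    else:
        raise ValueError("kind must be 'dirichlet' or 'neumann'")
    return R, dR_dq1, dR_dq2


# Linear element of length 0.25, evaluated at its left node
alpha, dalpha = [1.0, 0.0], [-4.0, 4.0]
R, J1, J2 = ratio_bc(alpha, dalpha, [1.0, 1.2], [0.5, 0.9], 0.3, "neumann")
```

Unlike the primary-variable case, the Neumann row has nonzero entries for every amplitude on the boundary element and for both variables, which is why the boundary Jacobian has to be assembled separately from the interior one.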

Chapter 5 ARTIFICIAL DISSIPATION

When solving problems using continuous finite elements, resolving shocks or other sharp changes in the flow can be difficult and can lead to numerical instabilities. The solution is constrained to be continuous by virtue of the method, so whenever a sharp discontinuity is present, the solution develops high frequency oscillations (Gibbs phenomenon) that ultimately destroy the solution. One way to counter the high frequency oscillations is to add a dissipation term to the governing equations. The goal is to give finite width to shocks and other sharp features that would otherwise have large changes from one node to the next. A simple addition of a second order term like a Laplacian suffices to dampen the high frequency oscillations that occur. When added to the governing equations, the term can alter the physics of the problem. One way to minimize the impact of adding the dissipation term is to scale it such that differing levels of dissipation can be added. To do this the term is multiplied by a scalar, $\epsilon$. The governing equation now looks like

$$ \frac{\partial q}{\partial t} + \nabla\cdot f + \epsilon\nabla^2 q = s \tag{5.1} $$

where the Laplacian can be applied to only the primary variables of choice. For instance, it is common to apply the term only to the velocity or momentum. Applying the Galerkin spatial discretization method to the term yields

$$ \int_\Omega \alpha_i\,\epsilon\nabla^2 q\,dx \tag{5.2} $$

In one dimension this simplifies to

$$ \int_\Omega \alpha_i\,\epsilon\frac{\partial^2 q}{\partial x^2}\,dx \tag{5.3} $$

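In order to reduce the order of the derivatives this term is now integrated by parts.

$$ \int_\Omega \alpha_i\,\epsilon\frac{\partial^2 q}{\partial x^2}\,dx = -\epsilon\int_\Omega \frac{\partial \alpha_i}{\partial x}\frac{\partial q}{\partial x}\,dx + \epsilon\left[\alpha_i\frac{\partial q}{\partial x}\right]_{\partial\Omega} \tag{5.4} $$

where $\partial\Omega$ represents the domain boundary. $q$ is expanded in terms of the basis functions and amplitudes

$$ \int_\Omega \alpha_i\,\epsilon\frac{\partial^2 q}{\partial x^2}\,dx = -\epsilon\int_\Omega \alpha'_i\sum_j \alpha'_j\,\hat{q}_j\,dx + \epsilon\left[\alpha_i(x_b)\sum_j \alpha'_j(x_b)\,\hat{q}_j\right]_{\partial\Omega} \tag{5.5} $$

After dropping the summation notation and moving $\hat{q}$ outside each of these terms, Eqn. 5.5 simplifies to

$$ \int_\Omega \alpha_i\,\epsilon\frac{\partial^2 q}{\partial x^2}\,dx = \left[-\epsilon\int_\Omega \alpha'_i\alpha'_j\,dx + \epsilon\big[\alpha_i\alpha'_j\big]_{\partial\Omega}\right]\{\hat{q}_j\} \tag{5.6} $$

This can now be represented as a linear combination of matrices and a vector of amplitudes $\hat{q}$,

$$ [V_1 + V_2]\{\hat{q}\} \tag{5.7} $$

where

$$ V_1 = -\epsilon\int_\Omega \alpha'_i\alpha'_j\,dx \quad\text{and}\quad V_2 = \epsilon\big[\alpha_i(x_b)\,\alpha'_j(x_b)\big]_{\partial\Omega} $$

Equation 3.7 can now be modified to include the dissipation terms. The new equation is

$$ M\frac{\partial \hat{q}}{\partial t} + K\hat{f} + [V_1 + V_2]\,\hat{q} = M\hat{s} \tag{5.8} $$

This is a relatively simple modification to the governing equations. It allows for solutions that develop sharp discontinuities during their evolution, as well as solutions that already contain shocks.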
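The volume integral behind $V_1$ is easy to assemble for the simplest basis. The following is a minimal sketch under stated assumptions: uniform two-node linear elements, pure Python lists instead of PETSc matrices, and hypothetical function names. It builds only the integral $S_{ij} = \int_\Omega \alpha'_i \alpha'_j\,dx$; the factor $\epsilon$ and the sign convention of Eqn. 5.7 are left to the caller.

```python
def assemble_gradient_matrix(n_elem, h):
    """Assemble S_ij = integral of alpha'_i * alpha'_j over a uniform 1D mesh.

    Linear two-node elements of length h; the element matrix is
    (1/h) * [[1, -1], [-1, 1]].
    """
    n_nodes = n_elem + 1
    S = [[0.0] * n_nodes for _ in range(n_nodes)]
    k = 1.0 / h
    for e in range(n_elem):          # element e couples nodes e and e + 1
        S[e][e] += k
        S[e][e + 1] -= k
        S[e + 1][e] -= k
        S[e + 1][e + 1] += k
    return S


def quadratic_form(S, q):
    """Return q^T S q, which equals the integral of (dq/dx)^2 for linear elements."""
    n = len(q)
    return sum(q[i] * S[i][j] * q[j] for i in range(n) for j in range(n))


S = assemble_gradient_matrix(4, 0.25)
smooth = quadratic_form(S, [0.0, 0.25, 0.5, 0.75, 1.0])    # linear ramp
sawtooth = quadratic_form(S, [1.0, -1.0, 1.0, -1.0, 1.0])  # node-to-node oscillation
```

The quadratic form is zero for a constant state, small for the smooth ramp, and much larger for the node-to-node oscillation, which is the mode the dissipation term is meant to act on most strongly.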

Chapter 6 PETSC PARALLELIZATION AND SOLVERS

The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used for the solver data structures. These include the vectors and matrices, the nonlinear solver (SNES), the Krylov subspace iterative linear solver (KSP), and an interface to the SuperLU direct linear solver. Using these data structures and solvers allows for a relatively simple implementation and provides the groundwork for a scalable parallel solver. All of these data structures are designed for parallel implementations, so once the variables are defined in the proper way, the parallelization is mostly automatic.

6.1 Vectors and Matrices

The PETSc vectors and matrices are created by using the PETSc commands VecCreate() and MatCreate(). These objects need to know the global dimensions as well as the range assigned to each processor. The processor range can also be calculated by PETSc by using PETSC_DECIDE for the size. This feature allows for a fairly automatic partitioning of parallel data to each processor.

6.2 Scalable Linear Equations Solvers (KSP)

The PETSc libraries include a variety of linear solvers based on Krylov subspace iterative methods. Some of these methods are generalized minimal residual (GMRES), conjugate gradient (CG), and bi-conjugate gradient (BiCG). Several more iterative methods are available to suit specific problem types. The convergence parameters for the KSP solver are:

Relative tolerance - Tolerance relative to the previous iteration. (Default: RTOL = $10^{-5}$)

Absolute tolerance - Global tolerance for convergence. (Default: ABSTOL = )
Divergence - Number of iterations until the solution is considered diverged. (Default: DIVERGENCE = $10^{4}$)
Preconditioning Side - The side of the matrix on which the preconditioner is applied. (Default: Left)

6.3 Scalable Nonlinear Equations Solvers (SNES)

A nonlinear solver is needed to approximate the solution of most interesting physical systems. Therefore a nonlinear solver is employed in the method to allow for these types of systems. The solver is the scalable nonlinear equation solver (SNES), which is built into PETSc. It uses a Newton-based method, which solves the approximate linear system

$$ R'\,\Delta q = -R \tag{6.1} $$

where $R$ is the function and $R'$ is the Jacobian. The solvers employ KSP for solutions to the linear systems while using a trust region method.[5] They require a user specified routine to evaluate the function, as well as one to evaluate the Jacobian.

6.3.1 Linear Function and Jacobian Evaluation

The function evaluation and Jacobian evaluation subroutines are specified using the SNESSetFunction() and SNESSetJacobian() functions respectively. This provides an easy way to modularize the code such that these subroutines are defined for the physics equations at hand.

6.3.2 Convergence Criteria

There are several convergence criteria for the SNES solver:

Absolute Tolerance - Tolerance for global root calculations. (Default: ABSTOL = )

Relative Tolerance - Tolerance of the norm compared to the previous iteration's quantity. (Default: RTOL = $10^{-8}$)
Step Tolerance - Tolerance in terms of the norm of the change in the solution between steps. (Default: STOL = $10^{-8}$)
Maximum Iterations - Maximum number of Newton nonlinear iterations per time step. (Default: MAXIT = 50)
Maximum Evaluations - Maximum number of function evaluations per time step. (Default: MAXF = $10^{4}$)

These can all be set using the SNESSetTolerances() function, or set using runtime parameters (e.g. -snes_rtol <value>).

6.4 SuperLU Direct Solver

SuperLU is an optimized direct solver for large, sparse, nonsymmetric systems of linear equations.[6] PETSc has an interface to the solver through the KSP linear solver. This provides an easy way to use the solver with PETSc sparse matrices. Use of the solver is simple, and only requires specification of the SuperLU solver type and a conversion of the Jacobian matrix to the SUPERLU sparse matrix type.
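The SNES convergence criteria of Section 6.3 can be pictured as a sequence of checks made after every Newton iteration. The sketch below is plain Python written for illustration only; it does not call PETSc, and the exact tests PETSc applies should be taken from its documentation. Two assumptions are made: the relative test is measured against a caller-supplied reference norm (here taken to be the initial residual norm), and the absolute tolerance is a required argument because no default value is quoted above. The function name and return strings are hypothetical.

```python
def check_convergence(fnorm, fnorm_ref, step_norm, sol_norm, iteration, n_evals,
                      abstol, rtol=1e-8, stol=1e-8, max_it=50, max_funcs=10_000):
    """Classify the state of a Newton iteration using SNES-style tolerances.

    fnorm     : norm of the current residual R
    fnorm_ref : reference residual norm for the relative test (assumed: initial norm)
    step_norm : norm of the latest update to the solution
    sol_norm  : norm of the current solution
    """
    if fnorm < abstol:
        return "converged_abstol"
    if fnorm <= rtol * fnorm_ref:
        return "converged_rtol"
    if step_norm < stol * sol_norm:
        return "converged_stol"
    if iteration >= max_it:
        return "diverged_max_it"
    if n_evals >= max_funcs:
        return "diverged_max_funcs"
    return "iterating"


status = check_convergence(fnorm=3e-10, fnorm_ref=1.0, step_norm=1e-3,
                           sol_norm=1.0, iteration=4, n_evals=9, abstol=1e-14)
```

In this example the residual has dropped by more than eight orders of magnitude relative to the reference, so the relative tolerance is the criterion that stops the iteration.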

Chapter 7 PSEUDO-1D EULER EQUATIONS

The pseudo-1D Euler equations provide a good test problem for the flux-source equation form. The equation set is nonlinear and has a source term, which provides enough complexity to sufficiently test the finite element algorithm. Euler equations in one dimension can only model some very simple flows, like the shock tube problem. The pseudo-1D Euler equations include cross sectional area as a variable, and as a result can model flow through a variable width nozzle or pipe. The equations remain approximately 1D by assuming that flow is uniform at each cross section. [13] The equations have the form

$$ \frac{\partial q}{\partial t} + \frac{\partial f}{\partial x} = s \tag{7.1} $$

where $q$, $f$, and $s$ are vectors of primary variables, fluxes, and sources respectively

$$ q = \begin{bmatrix} \rho A \\ \rho u A \\ e A \end{bmatrix}, \quad f = \begin{bmatrix} \rho u A \\ (\rho u^2 + p) A \\ u(e + p) A \end{bmatrix}, \quad\text{and}\quad s = \begin{bmatrix} 0 \\ p\,\dfrac{dA}{dx} \\ 0 \end{bmatrix} $$

and $e = \dfrac{p}{\gamma - 1} + \dfrac{1}{2}\rho u^2$ for an ideal gas. $\gamma$ is the ratio of specific heats, and $\gamma = 1.4$ is used in the test problems.

7.1 Diverging Nozzle Setup

The setup of a diverging nozzle problem involves specifying the area function, the initial density, velocity, and energy or pressure, and the boundary conditions. The area function used for the simulations is

A = tanh(0.8x − 4.0) (7.2)

where $x$ is the dimension along the length of the nozzle. Figure 7.1 shows a picture of the nozzle used in the simulations, where the dashed box represents the section modeled. This

section has the area defined by Eqn. 7.2. The initial conditions are defined within this section and the boundary conditions are applied at either end of the modeled section. In this case the inflow conditions are applied at x = 0 and the outflow at x = .

[Figure 7.1 image: nozzle schematic labeled "Modeled Section" and "Shock in Nozzle (for Supersonic Inflow / Subsonic Outflow)", with $P_c$, $P_e$, $u_c$, $A_t$, $A_i$, $A_{shock}$, $A_e$, $M_t$, $M_i$, $M_{shock}$, and $M_e$ marked.]

Figure 7.1: Nozzle used in solving the pseudo-1D Euler equations. Dashed box indicates the diverging section of the nozzle that is used in the simulations. The subscript c indicates the chamber, t represents the nozzle throat, i the inflow (for the computational domain), e the exit (outflow), and the shock subscript indicates a possible shock location when supersonic inflow and subsonic outflow conditions exist. The analytic cross sectional area function A indicates the modeled section geometry.

7.2 Boundary Condition Considerations

The pseudo-1D Euler equations can model various flow conditions in a nozzle, whether it be all subsonic flow, all supersonic flow, or partially supersonic and partially subsonic. When considering the different cases it is important to consider how the boundary conditions are to be treated.

A PDE must be well posed to have a unique solution. To achieve a well posed problem the initial and boundary conditions must be properly specified. The pseudo-1D Euler equations are no exception, and actually require more boundary conditions than the strictly mathematical requirements for a well posed problem. An intuitive explanation for this peculiarity can be found by studying the method of characteristics. The eigenvalues of the flux Jacobian ($\partial f/\partial q$) are $u$, $u + a$, and $u - a$, where $u$ is the bulk flow velocity and $a$ is the sound speed. This means that, depending on the type of flow (subsonic or supersonic), the characteristics will change direction. For a supersonic flow at an inlet, all characteristics are positive and therefore flow into the domain and affect the solution. Conversely, at an outlet all characteristics flow out of the domain and do not affect the solution in the interior. For a subsonic case two of the characteristics are positive and the other is negative, which results in information propagating in both directions. This implies that at an inlet two characteristics affect the solution, and at an outlet one of the characteristics affects the solution in the interior domain.

What does this mean in terms of required boundary conditions? For every characteristic entering the domain, a corresponding fixed analytic condition is required on one variable at that boundary. A fixed analytic condition can be a Dirichlet boundary condition. Additionally, for every characteristic leaving the domain, a numerical boundary condition is required. For the finite element case, the numerical condition could be either a Neumann or a natural boundary condition. The purpose is to prevent reflections such that extraneous information does not collect in the domain.

Table 7.1 [8] summarizes the boundary conditions required for each flow condition at both the inlet and outlet. Notice that for every characteristic entering the domain an analytic boundary condition is required, and for every characteristic leaving the domain a numerical (Neumann or natural) boundary condition is required.
These findings are supported only by empirical results rather than by strict mathematical proof. The following sections show results for various flow conditions that use the guidelines of Table 7.1 to pick the boundary conditions. Scenarios that deviate from these guidelines are also shown.
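The counting rule described above can be written down directly. This is a minimal sketch, assuming flow from left to right with the inflow boundary on the left and the outflow boundary on the right; the function name is hypothetical, and a characteristic with exactly zero speed is counted as left moving.

```python
def boundary_condition_counts(u, a, boundary):
    """Return (analytic, numerical) boundary condition counts at a boundary.

    u        : bulk flow velocity (positive means left to right)
    a        : sound speed
    boundary : "inflow" (left end of the domain) or "outflow" (right end)
    """
    eigenvalues = (u, u + a, u - a)
    right_moving = sum(1 for lam in eigenvalues if lam > 0.0)
    left_moving = len(eigenvalues) - right_moving
    if boundary == "inflow":
        entering = right_moving    # right-moving characteristics enter at the left end
    elif boundary == "outflow":
        entering = left_moving     # left-moving characteristics enter at the right end
    else:
        raise ValueError("boundary must be 'inflow' or 'outflow'")
    leaving = len(eigenvalues) - entering
    return entering, leaving       # one analytic B.C. per entering, one numerical per leaving


subsonic_inflow = boundary_condition_counts(0.5, 1.0, "inflow")
supersonic_outflow = boundary_condition_counts(2.0, 1.0, "outflow")
```

For a subsonic inflow this gives two analytic and one numerical condition, and for a supersonic outflow it gives three numerical conditions and no analytic ones.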

Table 7.1: Inflow and outflow boundary condition requirements for the pseudo-1D Euler equations [8]. Characteristics are the eigenvalues of the pseudo-1D Euler system of equations, where $u$ is the bulk fluid velocity and $a$ is the sound speed in the fluid. The (+) indicates a right moving characteristic and the (−) indicates a left moving characteristic.

                            Inflow                    Outflow
                            Subsonic    Supersonic    Subsonic    Supersonic
Characteristics
  u                         (+)         (+)           (+)         (+)
  u + a                     (+)         (+)           (+)         (+)
  u − a                     (−)         (+)           (−)         (+)
Number of Analytic B.C.     2           3             1           0
Number of Numerical B.C.    1           0             2           3

7.3 Supersonic Inflow and Outflow in a Diverging Nozzle

A completely supersonic flow is studied. Supersonic conditions are initialized and maintained by specifying a high enough initial Mach number and specifying the boundary conditions recommended by Table 7.1. Boundary conditions that deviate from Table 7.1 are also explored to show how the system reacts when it is over specified.

7.3.1 Correctly Specified Boundary Conditions

Table 7.1 recommends fixing three variables on the inflow and having natural boundary conditions on the outflow for supersonic flow. This means any three physical variables can be specified on the inflow in order for the problem to be well posed. One possibility is specifying the pressure, density, and momentum. Energy could also be specified instead of pressure, and velocity instead of momentum, and the system would remain correctly specified. The choice of which variables to apply boundary conditions to is problem dependent, but as long as the correct number are fixed the problem is well defined.

Figure 7.2 shows the completely supersonic solution after reaching a steady state. In this case the pressure, density, and momentum are specified to be fixed to their initial condition

at the inflow. On the outflow natural boundary conditions are applied such that waves are not reflected back into the computational domain.

Supersonic Inflow and Outflow
    Inflow:   $\rho_{in} = \rho_o$,  $\rho u_{in} = \rho u_o$,  $p_{in} = p_o$
    Outflow:  $\rho_{out}$ = Natural,  $\rho u_{out}$ = Natural,  $e_{out}$ = Natural

7.3.2 Overspecified Boundary Conditions

If the problem is over specified, the system compensates by forming a boundary/shock layer. The system pushes for the correct physics, but when an extraneous boundary condition does not allow for this, it comes as close as possible. In this case an extra Dirichlet boundary condition is applied to the outflow pressure. The boundary conditions are satisfied, but the boundary layer forms as a result. This is essentially applying subsonic boundary conditions to a supersonic flow, and thus creating a discontinuity or shock at the boundary. Figure 7.3 shows this result. Notice that variables other than pressure also have this boundary layer.

Over Specified Supersonic Inflow and Outflow
    Inflow:   $\rho_{in} = \rho_o$,  $\rho u_{in} = \rho u_o$,  $p_{in} = p_o$
    Outflow:  $\rho_{out}$ = Natural,  $\rho u_{out}$ = Natural,  $p_{out} = p_o$

7.4 Supersonic Inflow and Subsonic Outflow in a Diverging Nozzle

A case where the flow is supersonic at the inlet and subsonic at the exit can exist when the pressure ratio between the inflow and outflow is small enough (i.e. the back pressure is high enough). In this type of flow a shock forms within the nozzle. Due to the shock in the flow, some numerical dissipation is added to prevent instabilities and to give some finite width to the shock. The dissipation parameter $\epsilon$ controls the amount of dissipation, and $\epsilon$ =

[Figure 7.2 image: four panels titled Pressure, Density, Velocity, and Energy, with vertical axes P, ρ, v, and e.]

Figure 7.2: Supersonic inflow and outflow in a nozzle after reaching a steady state (t = 20). Plots of pressure $p$, density $\rho$, velocity $u$, and energy $e$. The dashed line represents the initial condition, while the solid line represents the solution at t = 20. Each of the variables is normalized to freestream values: $\tilde{p} = p/(\rho_\infty a_\infty^2)$, $\tilde{\rho} = \rho/\rho_\infty$, $\tilde{u} = u/a_\infty$, $\tilde{e} = e/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time, $\tilde{t} = t/\tau$, and the length of the domain to a characteristic length, $\tilde{x} = x/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain.

[Figure 7.3 image: four panels titled Pressure, Density, Velocity, and Energy, with vertical axes P, ρ, v, and e.]

Figure 7.3: Supersonic inflow and outflow after reaching a steady state (t = 10) with over specified boundary conditions. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t = 10. A dissipation of $\epsilon$ = 5e−2 was used to resolve the boundary layer/shock. Each of the variables is normalized to freestream values: $\tilde{p} = p/(\rho_\infty a_\infty^2)$, $\tilde{\rho} = \rho/\rho_\infty$, $\tilde{u} = u/a_\infty$, $\tilde{e} = e/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time, $\tilde{t} = t/\tau$, and the length of the domain to a characteristic length, $\tilde{x} = x/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain.

was used to give the shock a finite width. This value was determined by first using a larger amount of dissipation, and then reducing the value until the solution shows only a small amount of dispersion. Reducing the dissipation further would give an overly dispersive solution. A larger amount of dissipation yields a more diffuse solution, and the shock spans more nodes. Chapter 5 discusses the details of adding numerical dissipation to the solver. Figure 7.4 shows plots for pressure, density, momentum, and energy after reaching a steady state.

7.4.1 Boundary Conditions

Referring to Table 7.1, it can be seen that three Dirichlet boundary conditions are required on the inflow for supersonic flow, and one Dirichlet and two numerical conditions on the outflow boundary. For this case it is convenient to hold the density, momentum, and pressure fixed on the inflow and the pressure fixed on the outflow. Natural boundary conditions are applied to density and momentum on the outflow to satisfy the two numerical conditions. To apply a pressure ratio, different values of pressure are held fixed on each boundary. Due to this difference, a linear profile is given to the initial pressure to avoid discontinuities at the boundary. These conditions yield a steady state shock in the domain at a location that depends on the magnitude of the pressure ratio.

Supersonic Inflow and Subsonic Outflow
    Inflow:   $\rho_{in} = \rho_o$,  $\rho u_{in} = \rho u_o$,  $p_{in} = p_o$
    Outflow:  $\rho_{out}$ = Natural,  $\rho u_{out}$ = Natural,  $p_{out} = p_o$

7.4.2 Shock Location in Nozzle

The shock location in a pseudo-1D nozzle can be calculated analytically. This shock location can then be compared to the numerical shock location predicted by the pseudo-1D Euler equations.
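The analytical procedure detailed in the following subsection (Eqns. 7.3 to 7.7) can be sketched in a few functions. This is an illustration under stated assumptions rather than the thesis code: the relations are written in their standard quasi-one-dimensional gas dynamics form, the nonlinear equation is solved by simple bisection, all function names are hypothetical, and the final inversion of the area function (Eqn. 7.2) for the physical shock position is left out because it depends on the nozzle geometry.

```python
import math


def area_ratio(M, g=1.4):
    """A/A_t from the area-Mach number relation (Eqn. 7.3)."""
    term = (2.0 / (g + 1.0)) * (1.0 + 0.5 * (g - 1.0) * M**2)
    return math.sqrt(term ** ((g + 1.0) / (g - 1.0)) / M**2)


def stagnation_ratio(M, g=1.4):
    """Isentropic ratio of stagnation to static pressure (Eqn. 7.4)."""
    return (1.0 + 0.5 * (g - 1.0) * M**2) ** (g / (g - 1.0))


def shock_total_pressure_ratio(M, g=1.4):
    """Total pressure ratio across a normal shock with upstream Mach number M (Eqn. 7.7)."""
    a = ((g + 1.0) * M**2 / ((g - 1.0) * M**2 + 2.0)) ** (g / (g - 1.0))
    b = ((g + 1.0) / (2.0 * g * M**2 - (g - 1.0))) ** (1.0 / (g - 1.0))
    return a * b


def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi] by bisection; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def shock_area_ratio(pe_over_pc, Ae_over_At, g=1.4):
    """Return (M_shock, A_shock / A_t) for a given exit-to-chamber pressure ratio."""
    # Exit Mach number for non-isentropic flow (Eqn. 7.5)
    C = (2.0 / (g + 1.0)) ** ((g + 1.0) / (g - 1.0)) / (pe_over_pc * Ae_over_At) ** 2
    Me = math.sqrt(-1.0 / (g - 1.0)
                   + math.sqrt(1.0 / (g - 1.0) ** 2 + 2.0 * C / (g - 1.0)))
    # Total pressure ratio across the shock (Eqn. 7.6)
    p02_over_p01 = pe_over_pc * stagnation_ratio(Me, g)
    if not 0.0 < p02_over_p01 < 1.0:
        raise ValueError("no normal shock is consistent with these inputs")
    # Upstream Mach number of the shock (Eqn. 7.7), searched between Mach 1 and 10
    M_shock = bisect(lambda M: shock_total_pressure_ratio(M, g) - p02_over_p01, 1.0, 10.0)
    return M_shock, area_ratio(M_shock, g)
```

A convenient consistency check is a round trip: place a shock at a chosen upstream Mach number, compute the exit pressure that results, and confirm that `shock_area_ratio` recovers the chosen Mach number. The returned area ratio is then matched against the area function to obtain the shock position.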

Analytical Calculation

To find the shock location analytically it is important to think of the nozzle not just in terms of the diverging section, but as a whole converging-diverging nozzle system. The system in mind is shown in Figure 7.1. The whole picture is needed because the theoretical chamber pressure, $p_c$, and throat area, $A_t$, are needed to find the shock location.

As a first step the throat area is needed. Since the flow velocity cannot exceed Mach 1 at the throat, and a Mach number of $M_i = 1.25$ is initialized at the domain inflow, there must be some smaller cross section where the flow velocity is sonic. The cross sectional area of this point in the flow is the throat area and can be found by solving Eqn. 7.3 for $A_t$ [7],

$$ \left(\frac{A_i}{A_t}\right)^2 = \frac{1}{M_i^2}\left[\frac{2}{\gamma + 1}\left(1 + \frac{\gamma - 1}{2}M_i^2\right)\right]^{(\gamma + 1)/(\gamma - 1)} \tag{7.3} $$

where $A_i$ and $M_i$ are the area and Mach number initialized at the inflow boundary.

The next step is to find the exit Mach number assuming a non-isentropic flow. First the chamber pressure, $p_c$, is needed. This pressure represents the stagnant gas feeding the flow of the nozzle (see Figure 7.1). This pressure assumes there is no flow (or very close to no flow) and is the pressure compared to the exit pressure when determining the location of the shock. The chamber pressure is found using the isentropic relation

$$ p_c = p_i\left(1 + \frac{\gamma - 1}{2}M_i^2\right)^{\frac{\gamma}{\gamma - 1}} \tag{7.4} $$

Isentropic flow is assumed prior to the inflow point and thus the chamber pressure can be deduced from this relationship. Now the non-isentropic exit Mach number can be found from the following equation, using the values obtained for $A_t$ and $p_c$:

$$ M_e^2 = -\frac{1}{\gamma - 1} + \sqrt{\frac{1}{(\gamma - 1)^2} + \left(\frac{2}{\gamma - 1}\right)\left(\frac{2}{\gamma + 1}\right)^{\frac{\gamma + 1}{\gamma - 1}}\left(\frac{p_c A_t}{p_e A_e}\right)^2} \tag{7.5} $$

Now the flow conditions at both the inflow and outflow are known and the next step is to find the conditions at the location of the shock. First the pressure ratio about the shock

can be found by using the relation

$$ \frac{p_{o2}}{p_{o1}} = \frac{p_e}{p_c}\left(1 + \frac{\gamma - 1}{2}M_e^2\right)^{\frac{\gamma}{\gamma - 1}} \tag{7.6} $$

With the pressure on either side of the shock known, the Mach number on the upstream side of the shock can be found.

$$ \left(\frac{\gamma + 1}{2\gamma M_{shock}^2 - (\gamma - 1)}\right)^{\frac{1}{\gamma - 1}}\left[\frac{(\gamma + 1)M_{shock}^2}{(\gamma - 1)M_{shock}^2 + 2}\right]^{\frac{\gamma}{\gamma - 1}} - \frac{p_{o2}}{p_{o1}} = 0 \tag{7.7} $$

The non-linear Eqn. 7.7 is solved for $M_{shock}$ using any non-linear method desired.

Once the Mach number upstream of the shock is known, the area at the shock can be found using Eqn. 7.3, except that $M_i$ and $A_i$ are replaced by $M_{shock}$ and $A_{shock}$, and the equation is solved algebraically for $A_{shock}$. The final step is to find the physical location $x_{shock}$ by comparing the cross sectional area at the shock ($A_{shock}$) to the area function (Eqn. 7.2). The location that corresponds to the area at the shock is where the shock is predicted to reside.

Results for several pressure ratios, $p_e/p_c$, are listed in Table 7.2. These are compared to results obtained numerically.

Numerical Calculation

Several cases are run with differing inflow/outflow pressure ratios and compared to the analytical shock location. Results are summarized in Table 7.2. Figure 7.4 shows plots of pressure, density, momentum, and energy for these same cases. Notice the close agreement between the analytical and numerical results.

7.5 Subsonic Inflow and Outflow in a Diverging Nozzle

A completely subsonic flow is initialized in a diverging nozzle. This is done by specifying a subsonic initial Mach number throughout the domain, and being careful not to set a pressure ratio that will accelerate the flow into the supersonic regime.

[Figure 7.4 image: four panels titled Pressure, Density, Velocity, and Energy, with vertical axes P, ρ, v, and e.]

Figure 7.4: Supersonic inflow and subsonic outflow after reaching a steady state (t = 50) for $p_e$ = 1.20 (blue), 1.30 (red), 1.40 (green), and 1.50 (magenta). Each of the variables is normalized to freestream values: $\tilde{p} = p/(\rho_\infty a_\infty^2)$, $\tilde{\rho} = \rho/\rho_\infty$, $\tilde{u} = u/a_\infty$, $\tilde{e} = e/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time, $\tilde{t} = t/\tau$, and the length of the domain to a characteristic length, $\tilde{x} = x/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where $L$ is the physical length of the domain. A dissipation factor of $\epsilon$ = was used to resolve the shocks and control the dispersion.

Table 7.2: Numerical versus analytical shock location in nozzle for several inflow/outflow pressure ratios.

$p_e/p_c$    $x_{numerical}$    $x_{analytical}$

Boundary Conditions

Referring to Table 7.1, the boundary conditions required for subsonic inflow are two Dirichlet and one natural, and for subsonic outflow one Dirichlet and two natural boundary conditions. This case proves to be somewhat of a special case, and these boundary conditions are not completely followed. Instead of natural boundary conditions, Neumann conditions are used, and only one variable on the inflow boundary is held fixed with a Dirichlet condition. It is not well understood why this deviation from the prescribed boundary conditions works, but with any combination of two fixed variables on the inflow, the system never reaches a steady state. The momentum is held fixed on the inflow, and the density is held fixed on the outflow. All other variables have Neumann conditions.

Subsonic Inflow and Subsonic Outflow

Inflow                                Outflow
$\partial_x \rho_{in} = 0$            $\rho_{out} = \rho_o$
$(\rho u)_{in} = (\rho u)_o$          $\partial_x (\rho u)_{out} = 0$
$\partial_x p_{in} = 0$               $\partial_x p_{out} = 0$

Euler Shock Tube

The Euler shock tube is a simplification of the pseudo-1D Euler equations where the area, A, is uniform and a discontinuity is initialized inside the pipe. Figure 7.6 shows the initial

[Figure 7.5: four panels titled Pressure, Density, Velocity, and Energy, with vertical axes P, ρ, v, and e.]

Figure 7.5: Subsonic inflow and outflow conditions after reaching a steady state for pressure, density, velocity, and energy after t = 300. Each of the variables is normalized to freestream values: $\tilde{p} = p/(\rho_\infty a_\infty^2)$, $\tilde{\rho} = \rho/\rho_\infty$, $\tilde{u} = u/a_\infty$, $\tilde{e} = e/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time, $\tilde{t} = t/\tau$, and the length of the domain to a characteristic length, $\tilde{x} = x/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where L is the physical length of the domain.

condition as a dashed line, and the result at t = 1.5. Notice on the density plot that the shock wave, contact discontinuity, and rarefaction wave are all resolved. A dissipation parameter of ɛ = was used to resolve the shocks. This is an order of magnitude less dissipation compared to the nozzle problems with shocks. Less dissipation is needed because the number of time steps for this solution is significantly smaller, and the solution does not have time to develop large dispersive errors. Additionally, the shock tube problem is fully conservative (no source terms) and thus is easier to stabilize.

Boundary Conditions

The boundary conditions are trivial for the Euler shock problem because the domain of influence resides completely within the computational domain and is not determined by the boundary. The only requirement is to prevent reflections at the boundaries, and therefore natural boundary conditions for all variables suffice. When the shock front reaches the boundary, the problem is effectively over.

[Figure 7.6: four panels titled Pressure, Density, Velocity, and Energy, with vertical axes P, ρ, v, and e.]

Figure 7.6: Euler shock tube result after t = 1.5. Plots of pressure, density, velocity, and energy. The dashed line represents the initial condition, while the solid line represents the solution at t = 1.5. Each of the variables is normalized to freestream values: $\tilde{p} = p/(\rho_\infty a_\infty^2)$, $\tilde{\rho} = \rho/\rho_\infty$, $\tilde{u} = u/a_\infty$, $\tilde{e} = e/(\rho_\infty a_\infty^2)$. Time is normalized to a characteristic time, $\tilde{t} = t/\tau$, and the length of the domain to a characteristic length, $\tilde{x} = x/(a_\infty \tau)$. The characteristic time is defined as $\tau = L/a_\infty$, where L is the physical length of the domain. A dissipation factor of ɛ = was used to resolve the shocks and control dispersion.

Chapter 8

ACCURACY, CONVERGENCE, AND TIMING STUDIES

The accuracy, convergence properties, and computational timing of the finite element solver are investigated. This is done by varying parameters such as polynomial degree, spatial resolution, and size of time step. The pseudo-1D Euler equations were solved to perform the investigations.

8.1 Varying Polynomial Order

A fundamental parameter in finite element methods is the highest polynomial order in the basis functions. The convergence properties of the solver are investigated by solving a test problem and varying the polynomial order. Other parameters are held fixed. Figure 8.1 shows a plot of normalized L2 norms versus spatial resolution for several polynomial orders using a nodal basis set. The L2 norms are normalized to the total number of nodes in the problem. From Figure 8.1 it can be seen that for increasing spatial resolution, the total error in the solution decreases. It can also be seen that for higher polynomial order the errors decrease faster for a corresponding increase in spatial resolution. This means that for higher polynomial order the convergence rate of the solution is faster. This is useful because when the computational cost of increasing the polynomial order can be afforded, a high rate of convergence can be expected.

8.2 Varying Timestep

Figure 8.2 shows a plot of the normalized L2 norms versus the spatial resolution. The L2 norms are normalized to the number of nodes in the domain. Several different time steps were compared to show that with smaller time steps the magnitude of error decreases. This is clear from the figure where the smallest time step, Δt = 0.25, has an L2 norm

[Figure 8.1: plot of L2 Norm / $N_p$ versus $N_e$; legend entries Poly = 2, 3, 4, 5, 6, 7, 8.]

Figure 8.1: Nozzle convergence for varying polynomial order. The L2 norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Polynomial degrees of 2, 3, 4, 5, 6, 7, and 8 are shown.
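The convergence rate read off a plot such as Figure 8.1 is the slope of the error curve on log-log axes. A minimal sketch of how that slope could be estimated is given below. The error values are synthetic, generated from an assumed power law purely to exercise the function; they are not results from this thesis, and the function name is hypothetical.

```python
import math


def observed_order(n_elems, errors):
    """Least-squares slope of log(error) against log(h), taking h ~ 1/N_e."""
    xs = [math.log(1.0 / n) for n in n_elems]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Synthetic errors of the form C * N_e**(-p); assumed data for illustration only.
n_elems = [10, 20, 40, 80]
synthetic_errors = {p: [3.0 * n ** (-p) for n in n_elems] for p in (2, 4)}
orders = {p: observed_order(n_elems, errs) for p, errs in synthetic_errors.items()}
```

A steeper slope corresponds to the faster error reduction described above for higher polynomial order. With measured data the fit should be restricted to the asymptotic range, before time-stepping or round-off errors begin to dominate.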

approximately order

Table 8.1: Average Newton and average linear solver (GMRES) iteration counts for varying time steps after an equal number of time steps (100). Average Newton iterations are per time step, and the linear iterations are also averaged per time step.

Δt    Avg. Linear Iterations    Avg. Newton Iterations

Table 8.1 has results for the average number of Newton iterations per time step and the average number of linear solver (GMRES) iterations for the same time steps. (Note: The linear solver is used within the nonlinear solver, and since the GMRES method is an iterative method, it also has an iteration count. Alternatively, if a direct solver were used for the linear solver, it would take only one iteration per Newton iteration.) The averages are found for several different time step sizes. It can be seen that as the time step decreases the Newton iteration count decreases. Eventually the iteration count reaches one and the problem has essentially become linearized. This means that the Newton convergence criterion is met on the first iteration, and therefore any dominant non-linear effects are not present on the small timescales. One can deduce that with a smaller number of Newton iterations, the corresponding error in the solution also decreases. This is intuitive since each Newton iteration has some error tolerance associated with it, and for every subsequent iteration the total error compounds. It is a good idea to minimize the Newton iteration count to ensure good convergence and accuracy of the solution.
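The trend of Newton iteration count with time step size can be reproduced on a much simpler problem. The sketch below applies backward Euler to the scalar model equation du/dt = -u^3 and counts Newton updates per step. The model equation, the tolerance, and the function names are all assumptions chosen for illustration; this is not the pseudo-1D Euler system or the solver's actual Newton loop.

```python
def newton_step(u_old, dt, tol=1e-10, max_iter=50):
    """One backward Euler step of du/dt = -u**3; returns (u_new, Newton updates)."""
    u = u_old  # initial guess is the previous time level
    for iteration in range(1, max_iter + 1):
        residual = u - u_old + dt * u**3
        jacobian = 1.0 + 3.0 * dt * u**2
        u -= residual / jacobian
        # Convergence check on the nonlinear residual after the update.
        if abs(u - u_old + dt * u**3) < tol:
            return u, iteration
    raise RuntimeError("Newton iteration did not converge")


def average_newton_count(dt, n_steps=100, u0=1.0):
    """Average number of Newton updates per time step over n_steps steps."""
    u, total = u0, 0
    for _ in range(n_steps):
        u, iterations = newton_step(u, dt)
        total += iterations
    return total / n_steps


# Same number of steps (100) for each step size, as in Table 8.1.
counts = {dt: average_newton_count(dt) for dt in (1.0, 1e-2, 1e-4)}
```

For the smallest step the residual falls below the tolerance after a single update, which is the "essentially linearized" regime described above; larger steps need more updates because the nonlinear term changes more over one step.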

[Figure 8.2: plot of L2 Norm / $N_p$ versus $N_e$; one curve per time step size Δt.]

Figure 8.2: Nozzle convergence for varying time step sizes Δt. The L2 norms normalized by the number of degrees of freedom $N_p$ in the system versus the number of elements in the system $N_e$ are compared. Several different time step sizes are shown: Δt = 0.25, 0.333, 0.50, 0.667, 1.0, 1.333, 1.667, and 2.0.

Errors with Large Timesteps

With an implicit time advance scheme the explicit time step limit can be exceeded without the risk of developing numerical instabilities. There can however be a decrease in accuracy due to excessively large implicit time steps. Figure 8.3 shows a plot of relative deviation from the steady state solution for supersonic inflow and outflow conditions. Notice that the relative deviation from the steady state solution grows roughly linearly with the time step. Several spatial resolutions are overlaid to show that the deviation depends on the time step size rather than on spatial effects. For small time steps the spatial errors dominate and the curves for differing resolutions do not overlap, but for larger time steps the curves for all spatial resolutions clearly overlap. Figure 8.4 is an example of the result obtained when the time step is too large. An oscillation forms and yields an inaccurate solution. The peak of the oscillation was used when calculating the difference from steady state for Figure 8.3.

Computational Timing

The time required to perform computations with the solver depends on several parameters. For instance, the domain size directly influences the amount of computational time required to solve the problem, since it sets the matrix size. The more nodes in the problem, the larger the matrix, and therefore the more time it takes to perform the calculation. Other less obvious parameters are the polynomial order of the elements, the linear solver type, and the size of the time steps. These parameters are investigated to show how the computational cost changes.

Varying Polynomial Order and Resolution

Increasing the number of degrees of freedom in the problem increases the size of the matrix to be inverted, and therefore increases the amount of computational effort to solve the problem. Both decreasing the element size in the domain and increasing the polynomial order have this effect. 
Tables 8.2 and 8.3 show results for two cases of increasing resolution. Table 8.2 has

[Figure 8.3: plot of deviation from steady state (infinity norm) versus Δt; legend entries $N_e$ = 30, 40, 50, 100, 200.]

Figure 8.3: Relative deviation from a steady state solution due to increase in time step size. The deviation measured is an infinity norm of the difference between the steady state solution and peak error due to oscillations. Figure 8.4 shows an example oscillatory error that results from large time steps. Five different spatial resolutions are compared: $N_e$ = 30, 40, 50, 100, and 200.
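One textbook way to see how an implicit scheme can remain stable yet lose accuracy at large time steps is the amplification factor of the θ-scheme on the linear test equation du/dt = -λu. The sketch below is an illustration on that scalar model only. It assumes Θ = 0.5, the value listed with Table 8.4, and it is not claimed to be the mechanism behind the oscillation reported for the nozzle problem; the function names are hypothetical.

```python
def amplification_factor(theta, z):
    """Per-step factor G of the theta scheme for du/dt = -lam*u, with z = lam*dt."""
    # (u_new - u_old)/dt = -lam*(theta*u_new + (1 - theta)*u_old)
    return (1.0 - (1.0 - theta) * z) / (1.0 + theta * z)


def advance(theta, z, n_steps, u0=1.0):
    """Discrete solution history u_0, u_1, ..., u_n for the scalar test problem."""
    factor = amplification_factor(theta, z)
    history = [u0]
    for _ in range(n_steps):
        history.append(factor * history[-1])
    return history


small_step = advance(theta=0.5, z=1.0, n_steps=4)  # G = 1/3: monotone decay
large_step = advance(theta=0.5, z=8.0, n_steps=4)  # G = -0.6: decaying, alternating sign
```

For θ = 0.5 the magnitude of G stays below one for every positive z, so the scheme never blows up, but G becomes negative once z exceeds 2 and the discrete solution then oscillates about the exact, monotonically decaying one.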

results for increasing the total number of elements (smaller element size) while holding the polynomial order fixed. Table 8.3 shows results for increasing polynomial degree while holding the element size fixed. For both tables the average linear iterations per Newton iteration, the average Newton iterations per time step, and CPU time are shown. Notice the linear iterations and CPU time increase, but the Newton iterations generally remain constant.

[Figure 8.4: velocity plot; legend entries IC and Steady State Solution.]

Figure 8.4: Velocity deviation from supersonic inflow and outflow steady state solution due to large time steps. The initial condition is the flat dashed line, the curved dashed line is the steady state solution, and the solid line is the erroneous solution due to large time steps.

For the case of increasing the polynomial order, the matrix to be inverted becomes less sparse due to the coupling between basis functions. This requires more computational operations to solve the problem and thus will take longer to solve. Figure 8.5 shows the matrix structure for two systems with an equal number of degrees of freedom and differing polynomial order. Notice the higher polynomial order is less sparse and less banded. This is the price paid for the increased resolution of the higher order polynomials. Notice also in Tables 8.2 and 8.3 for the case of $N_p$ = 301 (number of degrees of freedom) that the CPU times for 4th and 7th order elements respectively are and . This is precisely because the higher order elements require more computational effort.
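The loss of sparsity with polynomial order can be checked by building the nonzero pattern directly. The sketch below assumes a single unknown per node and a continuous nodal basis in which every node of an element couples to every other node of that element. It also assumes that "4th order" and "7th order" in Figure 8.5 mean 4 and 7 nodes per element, which is inferred from the 31 degrees of freedom quoted in that caption. The function names are hypothetical.

```python
def sparsity_pattern(n_elems, nodes_per_elem):
    """Return (n_dof, set of (row, col) nonzeros) for a 1D mesh of nodal elements."""
    entries = set()
    for elem in range(n_elems):
        first = elem * (nodes_per_elem - 1)  # neighbouring elements share one node
        local_nodes = range(first, first + nodes_per_elem)
        for row in local_nodes:
            for col in local_nodes:
                entries.add((row, col))
    n_dof = n_elems * (nodes_per_elem - 1) + 1
    return n_dof, entries


def bandwidth(entries):
    """Largest distance of a nonzero entry from the diagonal."""
    return max(abs(row - col) for row, col in entries)


dof_low, pattern_low = sparsity_pattern(n_elems=10, nodes_per_elem=4)
dof_high, pattern_high = sparsity_pattern(n_elems=5, nodes_per_elem=7)
```

Under these assumptions both systems have 31 degrees of freedom, the 5 element case has 241 nonzeros, in agreement with the nz = 241 label that accompanies Figure 8.5, and the 10 element case has 151. The half bandwidth grows from 3 to 6, which is the "less sparse and less banded" structure described above.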

[Figure 8.5: two matrix sparsity plots; the surviving panel label reads nz = 241.]

Figure 8.5: Matrix structure for a 10 element system with 4th order polynomials (left) and 5 element system with 7th order polynomials (right). Both have 31 degrees of freedom.

Table 8.2: CPU timing and average Newton and Linear (GMRES) iteration counts for varying spatial resolution with polynomial order 4. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

$N_e$    $N_p$    Avg. Newton    Avg. Linear    CPU time

Table 8.3: CPU timing and average Newton and Linear (GMRES) iteration counts for varying polynomial degree. $N_e$ is the number of elements, and $N_p$ is the total number of degrees of freedom. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

Poly    $N_p$    Avg. Newton    Avg. Linear    CPU time

Varying Timestep Size

The time step size plays an important role not only in the accuracy and convergence of the solution but also in the computational effort required. Generally speaking there is a trade-off between the size of the implicit time step and the time it takes to advance the solution. Table 8.1 shows that as the time steps get smaller the Newton iteration count approaches 1. This means the problem has essentially become linear and each step behaves like an explicit time step. These time steps do not take as much computational effort because both the linear and Newton iteration counts are small. As the time steps increase the iteration counts also increase, and therefore each step requires more computational effort.

Table 8.4 shows average Newton and linear iteration counts as well as CPU time and average CPU time per time step for various time steps, all finishing at $t_{final}$ = 10. This means that for larger time steps there are fewer total time steps, $N_t$, in the solution. Notice the iterations increase with an increasing time step, as expected. Associated with the iteration count increase, the average CPU time per time step also has an increasing trend. This increase is small enough that, as the time steps get larger, the total CPU time still decreases. This means that the implicit time step gives a net savings in CPU time because it can compute to the same $t_{final}$ in a fraction of the time compared to a smaller time step.

Varying Linear Solver Type

The choice of linear solver is mostly problem dependent, and usually one solver (or a few solvers) solves a given problem more efficiently than the others. Table 8.5 lists various solver types, their descriptions, and some of the key parameters used. All of these solvers are included in the PETSc KSP libraries with the exception of SuperLU, which is interfaced into the PETSc framework. 
The goal is to show the many available solvers included in the PETSc libraries and show that each one can sufficiently solve the problem. The KSP defaults are: relative tolerance, RTOL = $10^{-5}$, absolute tolerance, ABSTOL = $10^{-50}$, a divergence iteration count of $10^{4}$, zero initial guess, and left preconditioning. Table 8.6 shows the total CPU time used, and the average linear iterations taken during

Table 8.4: CPU timing, average CPU time per time step and average Newton and Linear (GMRES) iteration counts for varying time step with parameters: poly = 6, Θ = 0.50, $N_e$ = 50, $t_{final}$ = 10.0, ɛ = . Δt is the time step size, and $N_t$ is the total number of time steps. The average Newton iterations are per time step, and the average linear iterations are also per time step. The CPU time is measured using the intrinsic Fortran routine CPU_TIME.

Δt    $N_t$    Avg. Newton    Avg. Linear    Total CPU Time    Avg. CPU Time / ts

each Newton iteration to solve the same exact problem. Notice that SuperLU is a direct solver and therefore by definition has only one iteration per call. The other iterative solver methods have various timing results, but the methods have not been fully optimized so this is expected. The GMRES, FGMRES, CGS, BICG, BiCGStab, BCGSL, TFQMR, and RICHARDSON methods proved to be approximately comparable with respect to the CPU timing for this specific problem. For other more complicated problems, these methods would probably differ more widely in their success.

Table 8.5: Various linear solver types included with the PETSc libraries, and the SuperLU direct solver, with their descriptions and parameters used for the runs in Table 8.6.

Solver        Method Description                         Solver Parameters
SuperLU       SuperLU Sparse Direct Solver               Zero Pivot Tol =
GMRES         Generalized Minimum Residual               Converg. Tol. =
FGMRES        Flexible Generalized Minimal Residual      Converg. Tol. =
CG            Conjugate Gradient                         KSP Defaults
CGS           Conjugate Gradient Squared                 KSP Defaults
BICG          Biconjugate Gradient                       KSP Defaults
BiCGStab      Stabilized BiConjugate Gradient Squared    KSP Defaults
BCGSL         Enhanced BiCGStab                          L = 2, = 0
MINRES        Minimum Residual                           Converg. Tol. =
TFQMR         Transpose Free Quasi Minimal Residual      KSP Defaults
CHEBYCHEV     Chebychev Iterative                        emin = $10^{-2}$, emax = $10^{+2}$
RICHARDSON    Richardson Iterative                       Damping Factor = 1.0

Table 8.6: CPU timing and iteration results for different iterative linear solver methods described in Table 8.5.

Solver        CPU time    Avg. Linear
SuperLU
GMRES
FGMRES
CG
CGS
BICG
BiCGStab
BCGSL
MINRES
TFQMR
CHEBYCHEV
RICHARDSON
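To make the notion of a linear iteration count concrete, the simplest method in Table 8.5, Richardson iteration, can be sketched in a few lines. The example uses a damping factor of 1.0, a zero initial guess, and a relative tolerance of $10^{-5}$, mirroring the settings quoted for these runs. The small test matrix, the absence of a preconditioner, and the function name are assumptions for illustration; this is not the PETSc implementation and it does not reproduce any entry of Table 8.6.

```python
def richardson(matrix, rhs, damping=1.0, rtol=1e-5, max_iter=10_000):
    """Damped Richardson iteration x <- x + damping*(b - A x); returns (x, updates)."""
    n = len(rhs)
    x = [0.0] * n  # zero initial guess
    rhs_norm = sum(b * b for b in rhs) ** 0.5
    for updates in range(max_iter + 1):
        residual = [rhs[i] - sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        residual_norm = sum(r * r for r in residual) ** 0.5
        if residual_norm <= rtol * rhs_norm:
            return x, updates
        x = [x[i] + damping * residual[i] for i in range(n)]
    raise RuntimeError("Richardson iteration did not converge")


# Assumed test system whose exact solution is [1, 1, 1]; the eigenvalues of A lie
# close enough to 1 that the undamped iteration converges.
A = [[1.0, 0.2, 0.0],
     [0.2, 1.0, 0.2],
     [0.0, 0.2, 1.0]]
b = [1.2, 1.4, 1.2]
solution, iterations = richardson(A, b)
```

A direct solver would return the same solution in what is counted as a single iteration, which is why SuperLU's iteration count is one by definition. For matrices whose eigenvalues are not clustered near 1, unpreconditioned Richardson iteration with unit damping can fail to converge, so preconditioning or a smaller damping factor would be needed.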

Chapter 9

FUTURE DEVELOPMENTS AND PLANS

9.1 Incorporate Quadrilateral/Hexahedral Structured Grid Generator

When expanding a code to higher dimensions, the generation of the computational grid is an important part of the process. A structured hexahedral grid (quadrilateral in 2D) is desired to simplify the matrix structures in the solver. There are some cases where a fully structured grid is either impossible or introduces undesirable distortions in the grid. A circle is a good example because distortions occur when mapping a logical rectangle to the circle. In the logical corners, the quadrilateral is deformed such that two adjacent sides are parallel rather than perpendicular. This would create problems in the solver.

One way to reconcile the difficulty in creating a structured mesh for a circle is to use a semi-structured technique. This is done by partitioning the circle into pieces that can easily be meshed structurally. Figure 9.1(a) shows a circular domain partitioned and Figure 9.1(b) shows a resulting quadrilateral mesh on this geometry. By meshing in this fashion, the issue with poorly shaped quadrilaterals is minimized, but another problem emerges. Each of the partitions might have a structured mesh, but the interfaces between the partitions might not have a structured mesh pattern. Notice in Figure 9.1(b) that the grid pattern is unstructured at the interfaces near the corners of the square partition. This small amount of unstructured gridding is a compromise that avoids the poorly shaped grid cells that result on a completely structured circular grid. Consequently, when the code reads in the grid structure, it must know how to handle the interfaces between the structured partitions.

Analogous to the two dimensional case is having a collection of structured blocks in three dimensions. For example, the circle shown in Figure 9.1(a) can be extruded to a cylinder. This cylinder is partitioned only in the cross section. 
Figure 9.2(a) shows the resulting 3D geometry partitioned into five pieces. Figure 9.2(b) shows the resulting hexahedral mesh on this geometry where the lengthwise dimension of the cylinder is meshed uniformly. Figures

9.4(a) and 9.4(b) show a HIT like geometry, partitioned and meshed with hexahedrons respectively. This shows that non-simply connected geometries are possible with this type of meshing.

Figure 9.1: A circle geometry showing the partitions (a) and after a structured quadrilateral mesh on each piece (b)

Having a domain meshed as a collection of structured meshes, which are mapped together in an unstructured fashion, is advantageous over an unstructured mesh. This is because a structured mesh provides a far simpler data structure and therefore has a simpler matrix sparsity pattern. As a result the solver will be faster, at the expense of the slightly more complicated coding required to handle the mesh partition interfaces.

9.2 Extend Algorithm to Three Dimensions

A three dimensional code that can implicitly solve complicated equation sets in the flux-source form on complicated domains is of great interest. The motivation behind developing the one dimensional finite element solver is to gain the experience needed so that a three dimensional solver can be developed easily. Rather than continue to develop the one dimensional

Figure 9.2: A cylinder geometry showing the partitions (a) and after a structured hexahedral mesh on each piece (b)

Figure 9.3: A cylinder geometry with cutaway showing the partitions (a) and after a structured hexahedral mesh on each piece (b)

code, an existing two dimensional code called SEL [14] will be expanded to three dimensions. The existing two dimensional code is a spectral/finite element code that uses the flux-source equation formulation. SEL has successfully solved the extended MHD equations, among several other equation sets in two dimensions. This provides the framework needed to test and expand to a three dimensional solver.

The semi-structured grid generation will fit well into the SEL framework because each structured partition will be a logical rectangle. SEL already solves problems on a logically rectangular domain, and therefore it will have to be expanded to handle multiple logical rectangles. Once the solver can handle multiple adjacent logical rectangles, the next step will be to expand to logical cubes (or rectangular parallelepipeds). Once this is achieved, handling multiple logical cubes is the next natural progression. With this capability, an equation set can be solved on complicated three dimensional domains.

Figure 9.4: A HIT like geometry showing the partitions (a) and after a structured hexahedral mesh on each partition (b)
