
List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I. M. Neytcheva, M. Do-Quang and X. He. Element-by-element Schur complement approximations for general nonsymmetric matrices of two-by-two block form. Springer Lecture Notes in Computer Science (LNCS), 5910/2010.

II. X. He, M. Neytcheva and S. Serra Capizzano. On an augmented Lagrangian-based preconditioning of Oseen type problems. BIT Numerical Mathematics, 51.

III. X. He and M. Neytcheva. Preconditioning the incompressible Navier-Stokes equations with variable viscosity. J. Comput. Math., in press, 2012.

IV. X. He, Marcus Holm and M. Neytcheva. Efficient implementations of the inverse Sherman-Morrison algorithm. Submitted to the conference proceedings of the PARA 2012 conference.

V. X. He and M. Neytcheva. On preconditioning incompressible non-Newtonian flow problems. Technical report, Department of Information Technology, Uppsala University.

VI. O. Axelsson, X. He and M. Neytcheva. Numerical solution of the time-dependent Navier-Stokes equations for variable density variable viscosity. Technical report, Department of Information Technology, Uppsala University.

Reprints were made with permission from the publishers.


Contents

1 Introduction
2 Incompressible Navier-Stokes equations
  2.1 Introduction
  2.2 Weak formulation and linearization
  2.3 Preconditioning of the two-by-two block systems
    2.3.1 Element-by-element approximation of a Schur complement matrix
    2.3.2 Augmented Lagrangian method
    2.3.3 Fast solutions with the modified pivot block in the augmented Lagrangian method
3 Incompressible Navier-Stokes equations with variable viscosity
  3.1 Introduction
  3.2 Effect of the variable viscosity on the augmented Lagrangian method
4 Incompressible Navier-Stokes equations with variable viscosity and density
  4.1 Introduction
  4.2 Reformulation of the coupled system
  4.3 Discretization in time, operator splitting scheme and linearization
  4.4 Preconditioning techniques
  4.5 Coupling with the phase-field model to solve the multi-phase flow problems
5 Computational challenges and some open problems to be addressed in future research
  5.1 Efficient solutions of the modified pivot block in the augmented Lagrangian method
  5.2 Element-by-element Schur complement approximation method
  5.3 Adaptive mesh refinement
  5.4 Stable numerical schemes with higher order of accuracy in time
6 Summary of papers
  6.1 Paper I
  6.2 Paper II
  6.3 Paper III
  6.4 Paper IV
  6.5 Paper V
  6.6 Paper VI
Summary in Swedish
Acknowledgements
References

1. Introduction

Computational fluid dynamics (CFD) is an important branch of fluid mechanics and computational mathematics. Numerical simulations have become indispensable in modern research, not only because traditional laboratory experiments are costly, but also because simulations enable us to model processes that cannot be tested experimentally, and extend our ability to reproduce physical phenomena in order to gain deeper insight into the underlying processes and their interactions.

The incompressible Navier-Stokes (N-S) equations are the governing equations for incompressible flows and are derived from the general physical conservation laws for mass and momentum. The dynamics of the physical process is described by a mathematical model consisting of a set of coupled nonlinear partial differential equations (PDEs). These equations, in turn, depend on various problem parameters, such as density and viscosity, which may themselves vary in time and space, exhibit discontinuities (as in multiphase systems) and take critical values. Furthermore, the equations included in the coupled system are of different types: elliptic, parabolic or hyperbolic, which adds to the complexity of simulating them numerically.

Given the mathematical model, computer simulations consist of three closely related tasks. First, the continuous equations (the PDEs) have to be discretized appropriately and in a stable manner, guaranteeing sufficiently small and uniformly bounded discretization errors in time and space. Second, the nonlinearities have to be handled. Due to the very high complexity of the original model equations, it is in many cases not possible to solve them as one coupled system, and so-called splitting techniques come into play, entailing the need to estimate the related splitting error and to balance it against the discretization errors. The third step is to choose suitable numerical solution methods for the arising nonlinear or linearized algebraic systems of equations. The time integration, sometimes over long time intervals, the large, even huge, size of the linear systems, the dependence on problem parameters, etc., impose strong requirements on those solution methods: they must be robust with respect to the parameters involved, be of optimal order of computational complexity and, last but not least, use the available computer resources to their full extent. The latter creates a tight connection between the choice of the numerical methods and their efficient implementation on today's complex and hierarchical computer architectures. The dynamic evolution of computer architectures and the very demanding nature of fluid flow simulations suggest that,
when choosing or designing solution methods, it might be profitable to utilize readily available computational kernels and toolboxes that have been shown to be numerically efficient and are highly optimized for the computer platform used for the simulations.

The focus of this work is on numerical solution methods for the linearized N-S equations, targeting iterative methods and preconditioners. We adopt the following strategy. As a first step, we consider the stationary N-S equations with constant coefficients, in this case density and viscosity, and efficient preconditioned iterative methods for them. Next, we consider the stationary N-S equations with variable viscosity and study the applicability of the solution methods in that case. Finally, we consider the N-S equations in their full complexity, namely, when both density and viscosity vary in space and time, and when the N-S equations have to be coupled with additional PDEs in order to properly describe the physics of the processes. Throughout these stages we aim to show that efficient computational kernels for the simplest setting remain a method of choice for the most involved setting.

In general, after discretization and linearization, the original nonlinear problem is converted into a sequence of linear problems to be solved. These are linear systems of equations of the form A x = b with a coefficient matrix A of large dimension. Because of the underlying mathematical model, the matrix A is indefinite and nonsymmetric, of two-by-two block structure. Sparsity is also an important property of the coefficient matrix. There are two classes of solution strategies for such linear systems, namely, direct methods and iterative methods. Although direct methods are robust, reliable and relatively well parallelizable, for large scale problems they are not feasible due to their high memory requirements and unacceptably long computing times. In this work we focus on Krylov subspace iterative methods [1, 65], which are computationally cheaper than direct methods because they mainly involve matrix-vector operations with sparse matrices.

In order to accelerate the convergence rate of the iterative methods, efficient preconditioning techniques become essential. A preconditioner, denoted here by P, is in general a linear operator, defined explicitly as a matrix or implicitly as a procedure, that transforms the above linear system into an equivalent one of the form P^{-1} A x = P^{-1} b. When using preconditioning, the main aim is to define P such that the transformed matrix P^{-1} A has more favourable properties than A itself. In general, we would like P^{-1} A to act similarly to the identity operator, which would result in fast convergence of the iterative method. Ideally, P should be as close to A as possible, but at the same time P = A is not a realistic choice,
since it leads back to the complexity of the original problem. Thus, when constructing a preconditioner we seek the right balance: the preconditioner should conserve some important properties of the original system matrix but allow for a much more efficient solution procedure. How to estimate the efficiency of a preconditioner depends on the properties of A (symmetry, nonsymmetry, positive definiteness, etc.). Generally speaking, a spectrum of P^{-1} A contained in one or a few clusters away from zero results in fast convergence. An efficient preconditioner should have the following properties. Constructing the preconditioner and solving systems with it should be computationally cheap and parallelizable. The preconditioned system should be much better conditioned than the original system itself, to yield fast convergence. Finally, the preconditioner should be robust with respect to all parameters involved: problem parameters (such as material coefficients), discretization parameters in space and time, and method parameters (if any).

The general structure of the matrix A that arises when solving N-S problems is of two-by-two block form,

  A = [ A  B^T ]
      [ C  D   ].

The matrix is indefinite. The main pivot block A might be symmetric or nonsymmetric, singular or nonsingular, and the block D may be zero. The blocks B and C may be of full or lower rank, and quite often C = B. Efficient preconditioners for the matrices arising from Navier-Stokes equations with constant density and viscosity have been studied intensively during the past decades, see the survey papers [3, 17, 62], the book [31] and the references therein. A widely-used class of preconditioners is derived from an exact block factorization of the matrix A, followed by approximations of A and of the Schur complement S = D - C A^{-1} B^T, see the papers [5, 7, 8, 17] and the numerous references therein. A major prerequisite for the efficiency of such preconditioners is to find high quality approximations of A and S, or of their inverses. The approximation of A can be defined implicitly as an inner iterative solution method with a proper stopping tolerance. Compared to approximating the pivot block A, finding efficient approximations of the Schur complement is much more difficult, due to the fact that S is in general a dense matrix: forming S explicitly and solving systems with it are computationally heavy tasks. In Papers I, II and IV, we contribute to the search for efficient preconditioners of the Schur complement by thoroughly analyzing and testing the element-by-element Schur complement approximation method [6, 45, 51] and the so-called augmented Lagrangian method [5, 18, 20, 33].

For the Navier-Stokes equations with constant density but varying viscosity, the matrix of the discrete linear system of equations is also indefinite and nonsymmetric, of two-by-two block form, and the difference appears in the pivot block of A. Variable viscosity has an impact on the behavior of preconditioners that are known to be efficient in the constant viscosity case. Those preconditioners
have to be reconsidered and analyzed in order to show their robustness with respect to the varying viscosity. In Paper III and Paper V we choose the augmented Lagrangian method and show that the corresponding preconditioner preserves its high quality also for spatially varying viscosity.

The full complexity N-S model, i.e., when variations in time and space of the unknowns (velocity and pressure) as well as of the problem parameters (density and viscosity) are taken into account, includes one or more additional partial differential equations. In Paper VI, we reformulate the equations using the so-called momentum instead of the classical unknown velocity. A good reason for the change of variable is that the momentum is smoother than the velocity. Another benefit is that, within the operator splitting, which is indispensable in this case, the matrices arising in the discrete linearized equations are analogous to those in the two simplified formulations. Therefore, the already known preconditioners can be straightforwardly re-utilized here.

The rest of this thesis is organized as follows. Chapter 2 is an overview of the linearization methods, the finite element discretization and the most frequently used preconditioners proposed for the incompressible Navier-Stokes equations with constant viscosity and density. In Chapter 3 the augmented Lagrangian method is reconsidered and its behavior is analyzed for the case of spatially varying viscosity. Navier-Stokes equations in their full complexity, i.e., with time-dependence, nonlinearity, variable viscosity and variable density, are considered in Chapter 4. A reformulation of the N-S problem and its stable numerical solution are introduced there. In Chapter 5 some computational challenges for further improvements of fast and reliable numerical solutions of the Navier-Stokes equations are discussed, and possible research directions are outlined. A summary of the papers included in this thesis is given in Chapter 6.
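The preconditioning idea described above can be made concrete with a few lines of SciPy. The sketch below is purely illustrative (the model matrix, the ILU-based preconditioner and all parameter values are assumptions, not choices from the papers); it shows how P is supplied implicitly, as a procedure applying P^{-1} to a vector, to a Krylov solver.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Sparse model matrix and right-hand side (stand-ins for a discretized PDE).
n = 1000
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format='csc')
b = np.ones(n)

# The preconditioner P is defined implicitly: an incomplete LU factorization
# of A, wrapped as a procedure that applies P^{-1} to a vector.
ilu = spla.spilu(A, drop_tol=1e-3)
P_inv = spla.LinearOperator((n, n), matvec=ilu.solve)

# GMRES applied to the preconditioned system P^{-1} A x = P^{-1} b.
x, info = spla.gmres(A, b, M=P_inv)
print('converged' if info == 0 else 'GMRES did not converge')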

2. Incompressible Navier-Stokes equations

2.1 Introduction

In this chapter we consider numerical solution methods for incompressible flow problems with constant viscosity and density, modeled by the Navier-Stokes equations. The mathematical model reads as follows:

  ∂u/∂t − νΔu + (u·∇)u + ∇p = f    in Ω × [0,T],
  ∇·u = 0                          in Ω × [0,T],
  u = g                            on ∂Ω_D × [0,T],
  ν ∂u/∂n − np = 0                 on ∂Ω_N × [0,T],
  u(x,0) = u_0(x)                  in Ω.                      (2.1)

Here u is the unknown velocity and p is the unknown pressure, Ω is a bounded and connected domain in R^d (d = 2, 3), and ∂Ω = ∂Ω_D ∪ ∂Ω_N is its boundary, where ∂Ω_D and ∂Ω_N denote the parts of the boundary on which Dirichlet and Neumann boundary conditions are imposed, respectively. The terms f : Ω → R^d, g and u_0 are, correspondingly, a given force field, Dirichlet boundary data and an initial condition for the velocity. The coefficient ν > 0 is the kinematic viscosity, assumed here to be constant (it is related to the so-called Reynolds number Re as Re = UL/ν, where L denotes the characteristic length scale of the domain and U is some reference value of the velocity). The operator Δ is the Laplace operator in R^d, ∇ denotes the gradient, ∇·( ) is the divergence operator, and n denotes the outward-pointing normal to the boundary.

The above Navier-Stokes equations constitute the fundamental model for incompressible flows in computational fluid dynamics. Due to the presence of the nonlinear term (u·∇)u, some linearization method must be used. The discretization of (2.1) has to obey certain requirements. We limit ourselves to the finite element method (FEM) and proper time discretization methods, and outline the above requirements in Section 2.2.

The main focus of this work is on fast and reliable numerical solution methods for the systems of algebraic equations arising after discretizing and linearizing the N-S equations. The numerical solution of those linear systems is the major computational kernel of incompressible flow simulations, as well as a major difficulty, since it is performed repeatedly and has to be both reliable and as fast as possible.

We note that we are concerned only with large scale simulations, for which the use of direct solution methods, applied to the whole system, is not feasible. Instead, preconditioned Krylov subspace methods are to be utilized. Then, the construction of numerically and computationally efficient preconditioners becomes the major concern. In this chapter, some linearization methods and preconditioning techniques are introduced, and our contributions to the search for efficient preconditioners are also presented.

2.2 Weak formulation and linearization

For the weak formulation of the Navier-Stokes equations (2.1), we define the solution space for the velocity and the test space, namely,

  H^1_E = {u ∈ H^1(Ω)^d : u = g on ∂Ω_D},
  H^1_{E0} = {v ∈ H^1(Ω)^d : v = 0 on ∂Ω_D},
  H^1(Ω)^d = {u : Ω → R^d : u_i, ∂u_i/∂x_j ∈ L^2(Ω), i, j = 1, ..., d},

and define L^2(Ω), the space for the pressure p,

  L^2(Ω) = {p : Ω → R : ∫_Ω p^2 < ∞}.

Then the weak formulation of (2.1) reads as follows (see e.g. [31]): Find u ∈ H^1_E and p ∈ L^2(Ω) such that

  ∫_Ω ∂u/∂t · v dΩ + ν ∫_Ω ∇u : ∇v dΩ + ∫_Ω (u·∇u) · v dΩ − ∫_Ω p (∇·v) dΩ = ∫_Ω f · v dΩ,
  ∫_Ω q (∇·u) dΩ = 0,                                                                      (2.2)

for all v ∈ H^1_{E0} and q ∈ L^2(Ω). Here ∇u : ∇v represents the componentwise scalar product, e.g., in two dimensions ∇u_1 · ∇v_1 + ∇u_2 · ∇v_2 (u = (u_1, u_2) and v = (v_1, v_2)). The pressure is uniquely defined only up to a constant term. To make it unique, one normally imposes the additional constraint ∫_Ω p dΩ = 0. We also assume that the discretization is done using a stable pair of FEM spaces, satisfying the LBB condition (see e.g. [31]).

Due to the presence of the convective term (u·∇)u in (2.1), or ∫_Ω (u·∇u) · v dΩ in (2.2), the Navier-Stokes system is nonlinear. There are two widely used linearization methods, namely Newton's method and Picard's method, see e.g. [31]. Next we briefly introduce linearization by Newton's method, followed by Picard's method.

13 Let (u 0, p 0 ) be an initial guess and let (u k, p k ) be the approximate solution at the kth nonlinear step. Substituting into the weak formulation (2.2), the nonlinear residual is obtained as R k = f vd u k t vd ν u k : vd + P k = (u k u k ) vd p k ( v)d q ( u k )d, for all v H 1 E 0 and q L 2 (). We update the approximation of the velocity and pressure as u k+1 = u k + δu k, p k+1 = p k + δ p k, where δu k H 1 E 0 and δ p k L 2 () (provided that u k H 1 E and p k L 2 ()). Then, the correction (δu k,δ p k ) should satisfy the following problem: Find δu k H 1 E 0 and δ p k L 2 () such that (δu k ) vd + ν δu k : vd + (u k δu k ) vd t + (δu k u k ) vd + (δu k δu k ) vd δ p k ( v)d = R k q ( δu k )d = P k, for all v H 1 E 0 and q L 2 (). By dropping the term (δu k δu k ) v, we obatin Newton s linearization: Find δu k H 1 E 0 and δ p k L 2 () such that (δu k ) t vd + ν δu k : vd + (u k δu k ) vd + (δu k u k ) vd δ p k ( v)d = R k q ( δu k )d = P k for all v H 1 E 0 and q L 2 (). After the correction (δu k,δ p k ) has been computed, the next approximation is updated as u k+1 = u k + δu k and p k+1 = p k + δ p k. Picard s linearization process is obtained in a similar way as Newton s linearization, except that an additional term, (δu k u k ) vd is also dropped. Thus, Picard s linearization reads as follows: Find δu k H 1 E 0 and δ p k L 2 () such that (δu k ) t vd + ν δu k : vd + (u k δu k ) vd δ p k ( v)d = R k q ( δu k )d = P k, 13

for all v ∈ H^1_{E0} and q ∈ L^2(Ω). Similarly, we update the approximations as u_{k+1} = u_k + δu_k and p_{k+1} = p_k + δp_k for k = 0, 1, ... until convergence. The linear system to be solved at each Picard step is also known as the Oseen problem.

In summary, the linearization of the Navier-Stokes equations (2.1) by Newton's method results in a sequence of problems of the form: Find (u ∈ H^1_{E0}, p ∈ L^2(Ω)) such that

  ∂u/∂t − νΔu + (w·∇)u + (u·∇)w + ∇p = f    in Ω × (0,T],
  ∇·u = 0                                   in Ω × (0,T],

with proper boundary conditions for u. Here the field w denotes the approximation of the velocity computed at the previous Newton iteration. Picard's linearization results in a sequence of Oseen problems, namely: Find (u ∈ H^1_{E0}, p ∈ L^2(Ω)) such that

  ∂u/∂t − νΔu + (w·∇)u + ∇p = f    in Ω × (0,T],
  ∇·u = 0                          in Ω × (0,T],

with proper boundary conditions for u. As is well known, provided that the initial guess is sufficiently close to the exact solution, Newton's method shows locally quadratic convergence. However, besides the additional work to assemble the required matrices and vectors, another disadvantage of Newton's method is that the radius of its ball of convergence is proportional to the viscosity [31]. Therefore, for small viscosity it is essential to run a few Picard iterations first, to feed a sufficiently good initial guess to the Newton iterations, since Picard's method has a larger radius of convergence than Newton's method [42].

2.3 Preconditioning of the two-by-two block systems

Let X^h_{E0} and P^h be finite dimensional subspaces of H^1_{E0} and L^2(Ω), and let {ϕ_i}_{1≤i≤n_u} be the nodal basis of X^h_{E0} and {φ_i}_{1≤i≤n_p} be the nodal basis of P^h. According to the Galerkin framework, the discrete velocity and pressure are represented as

  u_h = Σ_{i=1}^{n_u} u_i ϕ_i,    p_h = Σ_{i=1}^{n_p} p_i φ_i,

where n_u and n_p are the total numbers of degrees of freedom for the velocity and the pressure. The linear system arising from Newton's or Picard's linearization is of the form

  [ A  B^T ] [ u_h ]   [ f ]
  [ B  O   ] [ p_h ] = [ g ]      or    A x = b,        (2.3)
where the system matrix A = [ A  B^T ; B  O ] is nonsymmetric and indefinite, of saddle point form. The unknown vector u_h is the discrete velocity and p_h is the discrete pressure; combining them we set x^T = [u_h^T, p_h^T]. The matrix B ∈ R^{n_p×n_u} corresponds to the discrete (negative) divergence operator and B^T corresponds to the discrete gradient operator. In Newton's method the pivot block A has the form A = σM + νL + N + W, where M is the velocity mass matrix, L is the Laplacian matrix, N denotes the convection matrix and W denotes the Newton derivative matrix. The parameter σ denotes a function reciprocal to the time step (σ = 0 for a stationary problem). Given the approximation u_h, the entries of N and W are given by

  N = [n_ij],  n_ij = ∫_Ω (u_h·∇ϕ_j)·ϕ_i,    W = [w_ij],  w_ij = ∫_Ω (ϕ_j·∇u_h)·ϕ_i.

In Picard's method, the derivative matrix W is neglected.

Linear systems of the form (2.3) are often referred to as two-by-two block systems. Fast and reliable numerical solution methods for two-by-two block systems have been studied intensively in the past decades, see the milestone papers [5, 7, 8, 17], the book [31] and the numerous references therein. As is well known, direct solution methods are highly robust with respect to both problem and discretization parameters, and are therefore a preferred choice in numerical simulations performed by engineers and applied scientists. The limiting factors of sparse direct solvers are most often the demands on memory resources and the need to repeatedly factorize matrices that are recomputed during the simulation process, for instance the Jacobians in nonlinear problems. For real industrial applications, where the models are mostly three dimensional and result in very large scale linear systems of the type (2.3), rapidly convergent iterative methods, accelerated by a proper preconditioner, become the methods of choice. In this thesis we consider preconditioned Krylov subspace methods, see [1, 4, 65]. To accelerate the convergence rate of the Krylov subspace methods, efficient preconditioning techniques are crucial. Preconditioning refers to transforming the linear system into an equivalent one,

  A x = b   →   P^{-1} A x = P^{-1} b,

with the aim that the coefficient matrix P^{-1} A has more favorable properties for iterative solution methods than A itself. A preconditioner, denoted here by P, is in general a linear operator, defined explicitly as a matrix or implicitly as a procedure. The requirements for efficient preconditioners have been presented in Chapter 1.
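To fix ideas, the block structure of (2.3) can be assembled as follows once the blocks are available. The sketch below is illustrative only: the sparse random blocks and all sizes are placeholders for an actual FEM discretization.

import numpy as np
import scipy.sparse as sp

n_u, n_p = 200, 50
# Stand-ins for the assembled FEM blocks: A plays the role of sigma*M + nu*L + N (+ W),
# B the discrete (negative) divergence.
A = sp.random(n_u, n_u, density=0.05, random_state=0) + 10.0 * sp.eye(n_u)
B = sp.random(n_p, n_u, density=0.05, random_state=1)

# Two-by-two block matrix of (2.3); 'None' stands for the zero block.
K = sp.bmat([[A, B.T], [B, None]], format='csr')
rhs = np.concatenate([np.ones(n_u), np.zeros(n_p)])   # [f; g]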

How to construct efficient preconditioners for the two-by-two block systems arising in the incompressible Navier-Stokes equations is one of the main concerns in this work. There are several strategies to construct preconditioners. The first class of preconditioners are referred to as purely algebraic preconditioners. The term algebraic means that only information from the coefficient matrix and the right hand side vector is needed when constructing the preconditioner. This class includes the incomplete LU factorization method, sparse approximate inverses and algebraic multilevel and multigrid methods (see, for example, the survey paper [15] and references therein). The study of these preconditioners is outside the scope of this thesis. In this work, we limit ourselves to preconditioners based on some approximate block factorization of the original matrix. In general, the exact block factorization of a matrix of two-by-two block form is

  A = [ A_11  A_12 ] = [ A_11  0 ] [ I_1  A_11^{-1} A_12 ]
      [ A_21  A_22 ]   [ A_21  S ] [ 0    I_2            ],        (2.4)

where I_1 and I_2 are identity matrices of proper dimensions. The pivot block A_11 is assumed to be nonsingular and S = A_22 − A_21 A_11^{-1} A_12 is the exact Schur complement matrix. The preconditioners for such matrices of two-by-two block form are either of full block-factorized form or of block lower- or upper-triangular form,

  M_F = [ Ã_11  0 ] [ I_1  Ã_11^{-1} A_12 ]
        [ A_21  S̃ ] [ 0    I_2            ],        (2.5)

  M_L = [ Ã_11  0 ],        M_U = [ Ã_11  A_12 ]
        [ A_21  S̃ ]                [ 0     S̃   ].        (2.6)

Here Ã_11^{-1} denotes some approximation of A_11^{-1}, given either in explicit form or defined implicitly via an inner iterative solution method with a proper stopping tolerance. The matrix S̃ is some approximation of the exact Schur complement S. The literature on this class of preconditioners is huge, see [5, 7, 8, 11, 13, 32, 41, 50, 57, 60, 62, 64], the surveys [3, 17], the book [31] and the references therein. In [7] it is pointed out that for indefinite systems the block-triangular preconditioners M_L and M_U are more efficient than the full block-factorized preconditioner M_F. Furthermore, when solving systems with the preconditioner M_F, the action of Ã_11^{-1} is needed twice, which is clearly computationally heavier than for M_L and M_U, where the action of Ã_11^{-1} is needed only once. Thus, for the Navier-Stokes equations linearized by Newton's or Picard's method, the block-triangular preconditioners M_L and M_U are the ones to choose; in practice the two are equally efficient.
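The action of the block lower-triangular preconditioner M_L in (2.6) amounts to a block forward substitution. A minimal sketch is given below; the two inner solves are passed in as procedures (direct factorizations, multigrid cycles or inner iterations), and the function is an illustration rather than code from the papers.

import numpy as np

def apply_ML_inverse(r, n_u, solve_A11, A21, solve_S):
    # y = M_L^{-1} r for M_L = [[A11~, 0], [A21, S~]]: solve with the
    # (approximate) pivot block, update the second block, then solve with
    # the Schur complement approximation.
    r1, r2 = r[:n_u], r[n_u:]
    y1 = solve_A11(r1)
    y2 = solve_S(r2 - A21 @ y1)
    return np.concatenate([y1, y2])

Wrapped in a scipy.sparse.linalg.LinearOperator, such a routine can be handed directly to a Krylov solver as the preconditioner.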

17 For à 11 = A 11 and S = S in the preconditioner M L (2.6), the preconditioned matrix ML 1 A (A is defined as in (2.4)) is of the form [ ML 1 A = A 1 ][ ] [ 11 O A11 A 12 I1 A S 1 A 21 A 1 11 S 1 = 1 11 A ] 12 A 21 A 22 0 I 2 where the matrices I 1 and I 2 are the identity matrices of proper dimensions. It is known (cf., e.g. [7]) that (i) in this case the minimal polynomial of ML 1 A, i.e., the polynomial P( ) of the smallest degree for P(ML 1 A ) = 0 takes the form P = (1 t) 2 and there will be at most two iterations when solving systems with the matrix A preconditioned by M L ; (ii) in the general case, where à 1 11 A 1 11 and S S, the eigenvalues of ML 1 A are located in disks and the radii of the disks are controlled by making a sufficient number of inner iterations when iteratively solving systems with the pivot block matrix A 11 and by choosing a sufficiently accurate approximation S of S. Thus, we can see that the quality of the preconditioner M L applied to the matrix A depends on the accurate solutions of the pivot block matrix and how well the Schur complement matrix is approximated. The most challenging task, however, turns out to be the construction of numerically and computationally efficient approximations of the Schur complement matrix, which is in general dense and it is not practical to form it explicitly. The research on Schur complement approximations for the Stokes and Oseen s problem with constant viscosity has been quite active during the past decades [3, 17, 57, 62, 31]. Some of the well-known (problem-dependent) Schur complement approximations are the following. (1) The pressure mass matrix M p The matrix M p can be used for the Stokes problem and for relatively mild values of ν in the Oseen problem (see, e.g., [52]), but it is not efficient for more general settings. We note also that M p is always symmetric and positive definite while S is in general nonsymmetric. (2) The pressure convection-diffusion (PCD) preconditioner S PCD This preconditioner is first suggested in [43] and is an approximation of the Schur complement matrix of the form S 1 PCD = M p 1 A p L 1 where A p and L p are the pressure convection-diffusion and Laplace matrices correspondingly, and M p 1 denotes some approximate solution with M p (pressure mass matrix). As can be seen, some non-physical boundary conditions are needed for A p and L p. (3) The BFBt preconditioner This preconditioner is also an approximation of the Schur complement matrix and is defined as S 1 p, BFBt = (B M u 1 B T ) 1 B M u 1 A M u 1 B T (B M u 1 B T ) 1, 17

where M̂_u is a diagonal approximation of the velocity mass matrix M_u. This preconditioner was suggested in [29]. As seen from its definition, no artificial boundary conditions have to be set, and the preconditioner is fairly easy to construct.

(4) The Hermitian/skew-Hermitian splitting (HSS) method [12, 16] and the augmented Lagrangian (AL) method [18, 19, 20].

These approximations may be costly to apply (e.g., the BFBt preconditioner), may need the construction of an additional convection-diffusion operator on the finite element space for the pressure together with artificial boundary conditions for the pressure (e.g., the PCD preconditioner), or may need special care in choosing the method parameters in order to achieve the optimal convergence rate, the so-obtained optimal parameters being problem-dependent (e.g., the HSS method and the AL method). These approximations are fairly robust with respect to the discretization and problem parameters, i.e., the mesh size h and the viscosity ν.

Our contribution to the search for efficient preconditioners for the constant coefficient N-S equations is contained in Papers I and II. In Paper I we contribute to the techniques for approximating the Schur complement matrix using an element-by-element framework. In Paper II we present a more general analysis of the so-called augmented Lagrangian method.

2.3.1 Element-by-element approximation of a Schur complement matrix

When discretizing the linearized Navier-Stokes equations or the Stokes equations with the finite element method, the resulting linear system admits the two-by-two block structure shown in (2.3), or the more general form (2.4). We note that the local stiffness matrix, corresponding to the coefficient matrix of the linear system obtained by discretizing the linearized Navier-Stokes equations or the Stokes equations on one finite element, also admits a two-by-two block form, namely,

  A^k = [ A^k_11  A^k_12 ]
        [ A^k_21  A^k_22 ],

and the whole coefficient matrix is obtained as A = Σ_{k=1}^{n_E} R_k^T A^k R_k. The matrices R_k are the Boolean matrices that prescribe the local-to-global correspondence of the degrees of freedom and n_E denotes the total number of finite elements. The element-by-element approximation of the Schur complement [6, 45, 51] is constructed based on the local features of the finite element discretization, and is of the form

  S_EBE = Σ_{k=1}^{n_E} R_k^T S^k R_k,    or    S_EBE^{-1} = Σ_{k=1}^{n_E} R_k^T (S^k)^{-1} R_k,        (2.7)

where S^k = A^k_22 − A^k_21 (A^k_11)^{-1} A^k_12 is the local Schur complement on each element (A^k_11 and S^k are assumed to be nonsingular for all elements).
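The assembly in (2.7) is straightforward to sketch. In the illustration below (an assumed data layout, not code from Paper I), local_blocks holds the dense local blocks A^k_11, A^k_12, A^k_21, A^k_22 of each element matrix, and dofmap holds the global indices of the local second-block unknowns, i.e., the information carried by the Boolean matrices R_k.

import numpy as np
import scipy.sparse as sp

def ebe_schur(local_blocks, dofmap, n_p):
    # Assemble S_EBE = sum_k R_k^T S^k R_k from the local Schur complements.
    rows, cols, vals = [], [], []
    for (A11, A12, A21, A22), idx in zip(local_blocks, dofmap):
        Sk = A22 - A21 @ np.linalg.solve(A11, A12)   # local Schur complement S^k
        idx = np.asarray(idx)
        r, c = np.meshgrid(idx, idx, indexing='ij')
        rows.append(r.ravel())
        cols.append(c.ravel())
        vals.append(Sk.ravel())
    # Duplicate (row, col) pairs are summed when converting to CSR, which is
    # exactly the element-by-element assembly.
    S = sp.coo_matrix((np.concatenate(vals),
                       (np.concatenate(rows), np.concatenate(cols))),
                      shape=(n_p, n_p))
    return S.tocsr()

Assembling (S^k)^{-1} instead of S^k in the same loop gives the approximation of S_EBE^{-1} in (2.7).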

From the formula (2.7) we see that the construction of such an approximation is numerically cheap and fully parallelizable. For a uniform mesh, we only need to compute the local Schur complement, or its inverse, on one element and assemble it over all elements. In Paper I this method is used to approximate the Schur complement of the system matrices arising from the Stokes problem and the Oseen problem with constant viscosity. For the Stokes problem, this approximation is independent of the mesh refinement. However, for the Oseen problem, as for most of the widely-used preconditioners, this approximation is not fully robust with respect to the mesh size and the value of the viscosity. Due to its low computational cost and sparse structure, it is still very attractive for the incompressible Navier-Stokes equations with moderate viscosity, i.e., ν ∈ [0.1, 0.01], corresponding to Re ∈ [10, 100]. Although in Paper I we only test this preconditioner for the Oseen problem, it is applicable also to the linear algebraic systems arising from Newton's linearization. After the FEM discretization, the arising coefficient matrices, both the global system matrix and the local stiffness matrices, are already in two-by-two block form due to the underlying Oseen or Stokes problem. In other cases, this two-by-two block structure can be obtained from some proper block partitioning, such as one based on consecutive regular mesh refinements. Hence, this preconditioner is suitable for a broader class of applications.

In several works (see e.g. [6, 45]) it has been shown that S_EBE is a high quality approximation of the exact Schur complement S, i.e., (1 − ζ^2) S ≤ S_EBE ≤ S, where ζ is a positive constant, strictly less than one, independent of the mesh size and easily computable. However, so far this has been shown rigorously only in the case of symmetric and positive definite matrices. In Paper I, we also suggest a framework to study this preconditioner for general nonsymmetric matrices. As mentioned in Paper I, more effort is needed to analyse and optimize this preconditioning method for nonsymmetric matrices, and this is also one possible direction for my future research.

2.3.2 Augmented Lagrangian method

Aiming at a preconditioner which is fully independent of the mesh size and the viscosity, we consider the so-called augmented Lagrangian (AL) approach (see e.g. [5, 33]). To this end we first transform the linear system (2.3) into an equivalent one with the same solution, which is of the form

  [ A + γ B^T W^{-1} B   B^T ] [ u_h ]   [ f̃ ]
  [ B                    0   ] [ p_h ] = [ g  ]      or    A_AL x = b,        (2.8)

where γ > 0 and W are suitable scalar and matrix parameters. The modified right hand side vector is f̃ = f + γ B^T W^{-1} g. It is clear that the transformation
(2.8) can be done for any value of γ and any nonsingular matrix W. In [18] and [20], AL type preconditioners are proposed for the transformed system (2.8), which are of block lower- or upper-triangular form,

  M_L^AL = [ A + γ B^T W^{-1} B   0          ]        M_U^AL = [ A + γ B^T W^{-1} B   B^T        ]
           [ B                    −(1/γ) W  ],                  [ 0                    −(1/γ) W  ].        (2.9)

We now briefly explain the purpose of the transformation (2.8). Comparing the matrix A_AL in (2.8) with its AL type preconditioner in (2.9), and the general two-by-two block matrix A in (2.4) with its block triangular preconditioner in (2.6), we can see that the Schur complement S_AL = B(A + γB^T W^{-1} B)^{-1} B^T of the transformed matrix A_AL is approximated by (1/γ)W (the minus sign in (2.9) reflects the fact that the exact Schur complement of the saddle point matrix A_AL is −B(A + γB^T W^{-1} B)^{-1} B^T). Here the matrix W can be chosen to be the pressure mass matrix, as shown in [18], or even the identity matrix, as shown in [5]. It has been shown in [18, 39] that, to make the AL type preconditioner work efficiently for any choice of W, the value of γ should be large, i.e., the spectrum of (M_L^AL)^{-1} A_AL clusters at 1 as γ grows. Thus, for large values of γ, and provided that we accurately solve the systems with the modified pivot block A_AL = A + γB^T W^{-1} B, the AL preconditioner will lead to very fast convergence, within a few iterations. However, with increasing γ the modified pivot block becomes increasingly ill-conditioned and finding fast solutions of systems with it becomes increasingly difficult, which contradicts the requirement that γ should be large. How to balance the effect of the value of the parameter γ is studied in a recent PhD thesis [69], where the matrix W is fixed as the pressure mass matrix and the focus is on how to choose the optimal value of γ. Indeed, some optimal values of γ are derived based on special techniques, and the optimal γ is found to be small, γ ∈ [0.001, 0.01]. These small values contradict the requirement that γ needs to be large in order for the AL preconditioner (2.9) to work efficiently for the transformed linear system (2.8). Further, the so-obtained optimal values are still problem dependent and it is difficult to apply them in broad applications. Focusing on the choice of γ only and ignoring the other, matrix, parameter W cannot give a complete insight into the behavior of the AL preconditioner. The main contribution of Paper II is that we analyse the effects of γ and W in a more general framework. The analysis reveals that when we aim at balancing the solution of the modified pivot block A_AL (the inner solver) and the solution of the whole system, preconditioned by M_L^AL or M_U^AL, the optimal value of γ turns out to be one. The latter, however, entails that W should be a good approximation of the original Schur complement, i.e., W ≈ B A^{-1} B^T. To the best of the author's knowledge, Paper II is the first paper in this area to point out theoretically that an efficient Schur complement approximation of the original system matrix A is still not circumvented even when using the AL preconditioner.
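Setting up the transformation (2.8) is cheap once the blocks are available. The sketch below is illustrative only; it assumes that A, B, the pressure mass matrix Mp, the right hand sides f, g and the scalar gamma are given, and it takes W to be the diagonal of Mp so that W^{-1} is trivial to apply and B^T W^{-1} B remains reasonably sparse.

import scipy.sparse as sp

def augmented_lagrangian_system(A, B, Mp, f, g, gamma):
    W_inv = sp.diags(1.0 / Mp.diagonal())          # W taken as diag(Mp)
    A_AL = A + gamma * (B.T @ W_inv @ B)           # modified pivot block
    f_AL = f + gamma * (B.T @ (W_inv @ g))         # modified right-hand side
    K = sp.bmat([[A_AL, B.T], [B, None]], format='csr')
    return K, f_AL

The preconditioner M_L^AL in (2.9) then reuses the same modified pivot block, with the scaled matrix W/γ in the Schur complement position.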

The structure of the matrices arising in the above outlined AL method can be related to those arising after applying the so-called grad-div stabilization to the momentum equation. The AL method begins with the discretization, followed by the stabilization (transformation). Provided that the incompressibility constraint ∇·u = 0 is satisfied, it is possible to add the gradient of the divergence-free constraint, −γ∇(∇·u), pre-multiplied by a stabilization constant γ, to the momentum equation in the Navier-Stokes equations (2.1). Thus, we obtain the so-called grad-div stabilized formulation [22, 38, 54]

  ∂u/∂t − νΔu − γ∇(∇·u) + (u·∇)u + ∇p = f,
  ∇·u = 0.

Then, after linearization and discretization with FEM, the resulting nonsymmetric and indefinite linear system takes the form

  [ A + γ A_GD   B^T ] [ u_h ]   [ f ]
  [ B            0   ] [ p_h ] = [ g ]      or    A_GD x = b,        (2.10)

where the matrix A_GD denotes the discrete operator corresponding to the term ∫_Ω (∇·u)(∇·v) dΩ, which is similar to the matrix B^T B (v is a test function). As can be seen, the matrix in (2.10) is similar to the matrix A_AL in (2.8) with W the identity matrix. It is clear that the matrix A_GD is sparser than B^T W^{-1} B, and an incomplete LU factorization can be used to construct a preconditioner for A + γA_GD. However, in this case there is only the parameter γ to tune the grad-div stabilization, while in the AL method we possess both γ and W to play with.

2.3.3 Fast solutions with the modified pivot block in the augmented Lagrangian method

In some sense, the grad-div stabilization method can be seen as an attempt to find efficient solutions for the modified pivot block A_AL = A + γB^T W^{-1} B, which is dense and not practical to form explicitly. In [69], a specific multigrid method is used for this block, which works efficiently for small values of γ. Clearly, as γ → 0 the modified block A_AL converges to the original pivot block A, which is sparse. At the same time, the efficiency of the AL preconditioner decreases when γ becomes small. In order to find efficient solutions independent of the value of γ, in Paper II we propose a numerical algorithm to compute the exact or an approximate inverse of A_AL based on the inverse Sherman-Morrison (ISM) formula [24, 25, 26], where the approximate inverse can be used as a multiplicative preconditioner when iteratively solving systems with A_AL.

The matrix A_AL can be rewritten in a more general form, namely, H = A_0 + XY^T. Here the matrix A_0 is assumed to be nonsingular and its inverse to be easily computed (e.g., A_0 could be diagonal, even the identity matrix). There are various application areas where matrices of the form of H arise. For example, large matrices of that type appear in statistical problems and their exact inverses are to be computed. Thus, an efficient implementation of the ISM algorithm can serve as a useful computational kernel for a broader class of applications. In earlier works, the ISM algorithm is implemented with only level 1 BLAS routines [24, 25], which have limited efficiency on modern parallel architectures. Here, we give the ISM formula and briefly introduce the BLAS-1 ISM algorithm (see below). Let H be of the form H = A_0 + XY^T with X, Y ∈ R^{n×m}, and let I_m ∈ R^{m×m} be the identity matrix of size m. The Sherman-Morrison-Woodbury formula provides an explicit form of (A_0 + XY^T)^{-1}, given by the expression

  (A_0 + XY^T)^{-1} = A_0^{-1} − A_0^{-1} X (I_m + Y^T A_0^{-1} X)^{-1} Y^T A_0^{-1},        (2.11)

provided that the matrix I_m + Y^T A_0^{-1} X is nonsingular. Applying formula (2.11) column by column to X and Y, an algorithm is derived in [24, 25] that computes A^{-1} in the form

  A^{-1} = A_0^{-1} − A_0^{-1} U R V^T A_0^{-1},        (2.12)

where R ∈ R^{m×m} is a diagonal matrix and U, V ∈ R^{n×m}. The computational procedure is presented in Algorithm 1 below. We use Matlab-type notation, and IA and IA0 denote A^{-1} and A_0^{-1}, respectively.

Algorithm 1 (BLAS-1 ISM)

for k=1:m,
  U(:,k) = X(:,k); V(:,k) = Y(:,k);
  for l=1:k-1,
    % orthogonalize-like updates of the k-th columns against previous ones
    U(:,k) = U(:,k) - (V(:,l)'*IA0*X(:,k))*R(l,l)*U(:,l);
    V(:,k) = V(:,k) - (Y(:,k)'*IA0*U(:,l))*R(l,l)*V(:,l);
  end
  R(k,k) = 1/(1 + V(:,k)'*IA0*X(:,k));
end
% IA holds the inverse of A0 + X*Y'
IA = IA0 - IA0*U*R*V'*IA0;

As we can see, Algorithm 1 consists of vector and matrix-vector operations only, which are relatively inefficient on today's computers. In Paper IV we propose two parallel strategies based on a block version of the ISM algorithm, which involve only matrix-matrix operations. In this way, the highly optimized and parallelized BLAS-3 routines can be used, which have been shown to be more efficient than the BLAS-1 routines. One of the two block
ISM algorithms is designed to achieve good performance in terms of computational complexity and is referred to as the block ISM (BISM) algorithm. The other, referred to as the reduced memory block ISM (RMISM) algorithm, reduces the memory demands to make it more suitable for large scale problems. Both block ISM algorithms are implemented on CPU- and GPU-based computers, and their efficiency is compared and demonstrated in Paper IV. Here we only present the parallel speedup as an example. For arbitrary dense matrices X, Y ∈ R^{n×m} and a diagonal A_0 ∈ R^{n×n}, the parallel speedup of the two block ISM algorithms using OpenMP is plotted in Figure 2.1. As can be seen, both algorithms exhibit linear speedup.

Figure 2.1. Scaling behavior on a multicore system.

A common way to compute an approximate inverse via ISM is to drop relatively small-valued entries during the ISM factorization [24, 26]. In Paper IV, we only consider the efficient implementation of the block ISM algorithm for dense matrices. The analysis of the effect of sparse matrices on the performance of the block ISM algorithm, of data structures and parallelization techniques, as well as obtaining an approximate inverse of a sparse matrix, are planned as future research directions.
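For reference, a plain NumPy transcription of the scalar recursion in Algorithm 1 is sketched below, under the assumption that A_0 is diagonal and represented by the vector of its inverse diagonal entries; it is an illustration only and not the blocked BISM/RMISM implementations of Paper IV.

import numpy as np

def ism_factors(A0_inv_diag, X, Y):
    # Build U, V and the diagonal entries r of R in (2.12), column by column,
    # following Algorithm 1.
    n, m = X.shape
    U, V = X.astype(float), Y.astype(float)
    r = np.zeros(m)
    for k in range(m):
        for l in range(k):
            U[:, k] -= (V[:, l] @ (A0_inv_diag * X[:, k])) * r[l] * U[:, l]
            V[:, k] -= (Y[:, k] @ (A0_inv_diag * U[:, l])) * r[l] * V[:, l]
        r[k] = 1.0 / (1.0 + V[:, k] @ (A0_inv_diag * X[:, k]))
    return U, V, r

def apply_ism_inverse(A0_inv_diag, U, V, r, b):
    # Apply (A0 + X Y^T)^{-1} b = A0^{-1} b - A0^{-1} U R V^T A0^{-1} b
    # without forming the (dense) inverse explicitly.
    t = A0_inv_diag * b
    return t - A0_inv_diag * (U @ (r * (V.T @ t)))

For a general nonsingular A_0, the products A0_inv_diag * (vector) are simply replaced by solves with A_0.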

3. Incompressible Navier-Stokes equations with variable viscosity

3.1 Introduction

In this chapter we consider fast solution methods for the incompressible Navier-Stokes equations with variable viscosity. There are mainly two classes of applications with non-constant viscosity. The first class incorporates non- and quasi-Newtonian flows. For example, in non-Newtonian flows the viscosity may be a function of the pressure and the rate-of-strain tensor (e.g., [21, 59]). In some quasi-Newtonian flows the variable viscosity may also depend on pressure and shear (e.g., [40, 48, 58]). The other class of applications with variable viscosity arises in multi-phase flow models, where each phase is assumed to be immiscible and incompressible. The incompressible Navier-Stokes equations with variable viscosity read as follows:

  ∂u/∂t + u·∇u − ∇·(2ν(·)Du) + ∇p = f    in Ω × (0,T],
  ∇·u = 0                                in Ω × (0,T],        (3.1)

with some given boundary and initial conditions for u. Here Ω ⊂ R^d (d = 2, 3) is a bounded, connected domain with boundary ∂Ω and f : Ω → R^d is a given force field. The operator Du = (∇u + ∇^T u)/2 denotes the rate-of-strain tensor. As mentioned, in non-Newtonian flows the kinematic viscosity ν may depend on the second invariant of the rate-of-deformation tensor, D_II(u) = (1/2) tr(D²u), and on the pressure p, i.e., ν(·) = ν(D_II, p) (e.g. [34]). In multiphase flows, the kinematic viscosity ν is a function of time and space, i.e., ν(·) = ν(x,t), to be determined.

Variable viscosity is an important factor affecting the behavior of the known preconditioners. In this chapter we choose the augmented Lagrangian method to study the impact of the variation of the viscosity. Here, we keep the density constant. An illustrative example of such a system is a mixture of water and oil, which have almost the same density, while their viscosities differ substantially. Other examples of problems of practical importance are considered in [68], namely, extrusion with variable viscosity and a geodynamic problem with a sharp viscosity contrast.
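As a small illustration (not taken from the papers), the rate-of-strain tensor and its second invariant, which the variable viscosity ν(D_II, p) may depend on, can be evaluated pointwise from the velocity gradient as follows.

import numpy as np

def rate_of_strain(grad_u):
    # D(u) = (grad u + grad u^T) / 2, for grad_u[i, j] = du_i / dx_j at a point.
    return 0.5 * (grad_u + grad_u.T)

def second_invariant(grad_u):
    # D_II(u) = 1/2 * tr(D(u)^2)
    D = rate_of_strain(grad_u)
    return 0.5 * np.trace(D @ D)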

3.2 Effect of the variable viscosity on the augmented Lagrangian method

As already stated, we assume that the kinematic viscosity coefficient is a smooth function such that 0 < ν_min ≤ ν(x,t) ≤ ν_max, where ν_min and ν_max denote its minimal and maximal values. When solving (3.1), we limit ourselves to the stationary case and apply Picard's linearization method. This technique requires the solution of a sequence of Oseen problems, which read as follows: at each Picard iteration, find u : Ω → R^d and p : Ω → R satisfying

  −∇·(2ν(x,t)Du) + (w·∇)u + ∇p = f    in Ω,
  ∇·u = 0                             in Ω,        (3.2)

where w = u^{(k−1)} is the velocity computed at the previous Picard iteration, updated at every nonlinear step. The weak formulation of (3.2) reads as follows: Find u ∈ H^1_E and p ∈ L^2(Ω) such that

  ∫_Ω 2ν(x,t)Du : Dv dΩ + ∫_Ω (w·∇)u · v dΩ − ∫_Ω p (∇·v) dΩ = ∫_Ω f · v dΩ,
  ∫_Ω q ∇·u dΩ = 0,        (3.3)

for all v ∈ H^1_{E0} and all q ∈ L^2(Ω). The linear systems arising from the weak formulation (3.3) are again of the form

  [ F  B^T ] [ u_h ]   [ f ]
  [ B  O   ] [ p_h ] = [ g ]      or    A x = b,        (3.4)

where the system matrix A = [ F  B^T ; B  O ] is nonsymmetric and indefinite, of two-by-two block form. The unknown vector u_h ∈ R^{n_u} is the discrete velocity and p_h ∈ R^{n_p} is the discrete pressure. We also assume that the discretization is done using a stable pair of FEM spaces, satisfying the LBB condition [31]. Clearly, when considering variable viscosity, the difference compared to the case of constant viscosity appears in the pivot block F ∈ R^{n_u×n_u}, which in the case of variable viscosity has the form F = A_ν + N. The matrix A_ν is the discrete operator corresponding to the term ∫_Ω 2ν(x,t)Du : Dv dΩ, i.e., with {ϕ_i}_{1≤i≤n_u} the nodal basis functions,

  A_ν ∈ R^{n_u×n_u},    [A_ν]_{i,j} = ∫_Ω 2ν(x,t) Dϕ_i : Dϕ_j dΩ.

Thus, the matrix A_ν is symmetric and positive definite. Note, however, that it is not block-diagonal. The other matrices, i.e., N and B, are the same as defined in Chapter 2. Here, we also consider preconditioned iterative methods to solve the linear system (3.4), and the chosen preconditioning approach is the augmented Lagrangian method. As in Chapter 2, we first transform the linear system (3.4) algebraically into an equivalent one,

  [ F + γ B^T W^{-1} B   B^T ] [ u_h ]   [ f̃ ]
  [ B                    0   ] [ p_h ] = [ g  ]      or    A_γ x = b,        (3.5)

where f̃ = f + γ B^T W^{-1} g, and γ > 0 and W are suitable scalar and matrix parameters. Clearly, the transformation (3.5) does not change the solution for any value of γ and any nonsingular matrix W. The equivalent system (3.5) is what we intend to solve, and the AL-type preconditioner for the transformed system matrix A_γ is of the block triangular form

  M_L = [ F + γ B^T W^{-1} B   0         ]
        [ B                    −(1/γ) W ].        (3.6)

As analyzed in [5, 18, 39], the eigenvalues λ of the preconditioned matrix M_L^{-1} A_γ consist of two parts, namely a multiple eigenvalue 1 and eigenvalues of the form 1/(1 + 1/(γµ)), where µ are the eigenvalues of Q = W^{-1} B F^{-1} B^T. Supposing that the µ are bounded in a rectangular box with bounds independent of the mesh size, the eigenvalues λ are also bounded and the bounds are again robust with respect to the mesh size. The main theoretical contribution regarding the AL preconditioner in Paper III is that, with the choice W = M (the pressure mass matrix), we derive the following bounds on the eigenvalues µ, namely,

  c_0^2 ν_min^2 / (ν_max (ν_min^2 + c_1^2)) ≤ Re(µ) ≤ 1/ν_min    and    |Im(µ)| ≤ 1/(2ν_min),

where the constants c_0 and c_1 and the bounds ν_min and ν_max are independent of the mesh size. This result generalizes Theorem 1 in [30], where the Oseen problem with constant viscosity is considered.

Still, the bottleneck of the AL preconditioning method is how to efficiently solve systems with the modified pivot block F + γB^T W^{-1} B. Besides the ISM method introduced in Chapter 2, in Paper III we test an aggregation-based algebraic multigrid method (AGMG) [53, 55, 56] for the modified pivot block F + γB^T W^{-1} B, and numerical experiments show that the AGMG method behaves reasonably well and can be used as a method of choice in practice. The task of finding fast solution methods for the pivot block remains a very challenging problem and is also one direction for my future research.

For non-Newtonian flows, the velocity u and the pressure p satisfy the following generalized incompressible Navier-Stokes equations:

  ∂u/∂t + u·∇u − ∇·(2ν(D_II(u), p)Du) + ∇p = f    in Ω × (0,T],
  ∇·u = 0                                         in Ω × (0,T],

where the kinematic viscosity ν may depend on the second invariant of the rate-of-deformation tensor, D_II(u) = (1/2) tr(D²u), and on the pressure p [34]. Because the viscosity function ν(D_II(u), p) depends on the velocity u through D_II(u), two of the terms exhibit nonlinear behavior: ∇·(2ν(D_II(u), p)Du) and u·∇u. Whether Newton's or Picard's linearization is used, the linear system arising from the FEM discretization is still of two-by-two block form as in (3.4), and the augmented Lagrangian preconditioning method is straightforwardly applicable. The contribution of Paper V is that, to the best of the author's knowledge, it is the first to propose the AL type preconditioning method as a fast solution method for non-Newtonian flow problems. The numerical experiments included in Paper V confirm the usefulness of the approach.
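Within a Picard (frozen-coefficient) step, the viscosity is simply re-evaluated at the previous iterate before the pivot block F is reassembled. The sketch below is a hypothetical illustration: nu_law stands for an application-supplied constitutive relation ν(D_II, p) (no specific model from Paper V is implied), and the result is kept within the bounds 0 < ν_min ≤ ν ≤ ν_max assumed in the analysis.

import numpy as np

def frozen_viscosity(nu_law, D_II_prev, p_prev, nu_min, nu_max):
    # Evaluate nu(D_II, p) at the previous Picard iterate (values given, e.g.,
    # at quadrature points) and keep it within the assumed bounds.
    nu = nu_law(D_II_prev, p_prev)
    return np.clip(nu, nu_min, nu_max)

For instance, a purely artificial shear-thinning-like law could be supplied as nu_law = lambda d2, p: 0.1 / (1.0 + d2).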

4. Incompressible Navier-Stokes equations with variable viscosity and density

4.1 Introduction

Variable density, variable viscosity problems arise in many complex flow processes, and have been studied intensively by numerical simulations. For example, density and viscosity can be functions of the temperature in convection flows; variable density ground water flow phenomena occur, in particular, when a fluid of high density overlays a fluid of lower density and the flow is driven by gravity and the density variation; in rising bubble phenomena, a bubble of light fluid surrounded by a heavier fluid rises and deforms due to gravity and surface tension forces; splashing phenomena occur when a solid is dropped into a liquid, typically when penetrating the surface of the liquid; etc. The major difficulty of the numerical approximation of such flows is the high complexity of the nonlinear, time-dependent, variable density and variable viscosity Navier-Stokes equations, combined with the mass conservation equations for density and viscosity. Normally, operator splitting schemes are used to solve the coupled system. Fast and reliable solution methods for the arising discrete linear systems of equations are crucial for the numerical simulations. In this chapter all the above aspects are discussed.

4.2 Reformulation of the coupled system

We now consider the incompressible Navier-Stokes equations in their full complexity, including time-dependence and spatial and temporal variation of density and viscosity. The formulation reads as follows:

  ρ(∂u/∂t + u·∇u) − ∇·(2µDu) + ∇p = ρf    in Ω × (0,T],
  ∇·u = 0                                 in Ω × (0,T],
  ∂ρ/∂t + u·∇ρ = 0                        in Ω × (0,T],        (4.1)

with some given boundary and initial conditions for u and ρ. The operator Du = (∇u + ∇^T u)/2 denotes the rate-of-strain tensor. Using the mass conservation equation, i.e., the third equation in (4.1), namely

  ∂(ρu)/∂t + (u·∇)(ρu) = ρ(∂u/∂t + u·∇u) + u(∂ρ/∂t + u·∇ρ) = ρ(∂u/∂t + u·∇u),
the momentum equation can be reformulated as

  ∂(ρu)/∂t + (u·∇)(ρu) − ∇·(2µDu) + ∇p = ρf.

We assume that the viscosity depends on the density through some Lipschitz-continuous function µ(ρ). Then an equation similar to that for the density also holds for the viscosity, namely,

  ∂µ/∂t + u·∇µ = (∂µ/∂ρ)(∂ρ/∂t + u·∇ρ) = 0.

By introducing the momentum variable v = ρu, the incompressible Navier-Stokes equations with variable viscosity and density can be rewritten in the following form:

  ∂v/∂t + (u·∇)v − ∇·(2µD(v/ρ)) + ∇p = ρf    in Ω × (0,T],
  ∇·v = u·∇ρ                                 in Ω × (0,T],
  ∂ρ/∂t + u·∇ρ = 0                           in Ω × (0,T],
  ∂µ/∂t + u·∇µ = 0                           in Ω × (0,T].        (4.2)

The second equation in (4.2) can be seen as a consequence of the incompressibility constraint. It is obtained from the relation ∇·v = ∇·(ρu) = ρ∇·u + u·∇ρ. Since ∇·u = 0, then ∇·v = u·∇ρ. The boundary and initial conditions are assumed to be

  ρ|_{t=0} = ρ_0,    µ|_{t=0} = µ_0 = µ(ρ_0),    u|_{t=0} = u_0,    v|_{t=0} = v_0 = ρ_0 u_0,
  u|_Γ = b,    ρ|_{Γ_in} = a,

where Γ = ∂Ω, a > 0 and µ(ρ) is a given function. Since the advection equations for density and viscosity are first order hyperbolic equations, the boundary conditions are given at the inflow boundary, Γ_in = {x ∈ Γ : u·n < 0}. Therefore, [ρ; µ]|_{Γ_in} = [a; µ(a)]. For the Navier-Stokes equations (4.2), there is no need to impose any boundary or initial conditions for the pressure variable.

In our work we advocate two ideas. The first is to use the momentum v = ρu instead of the velocity u as a variable in the model. The second is that one should solve the N-S equations as a coupled system for v and p instead of splitting them, in this way avoiding the need to impose non-physical boundary conditions for the pressure. The rationale for the latter idea is that, as we have
already seen, we can solve the arising saddle point systems via fast and robust preconditioned iterative solution methods.

There are several reasons why we should use the momentum variable v = ρu instead of the velocity u. First, one can expect that v has a smoother behavior, i.e., less strong variations, than u, and can therefore be more accurately approximated in numerical simulations. Second, when using the variable v, the momentum equation coupled with the divergence constraint for the momentum, i.e., the first two equations in (4.2), has a form analogous to the Navier-Stokes equations with constant viscosity and density (Chapter 2) and with constant density and variable viscosity (Chapter 3). After the operator splitting and linearization by the frozen coefficient approach, in a similar way as is done in Picard's method, the resulting linear problem is still of Oseen type (see details in the next section). Therefore, all the preconditioning methods proposed for the discrete linear systems of equations arising in the Oseen problem are straightforwardly applicable.

4.3 Discretization in time, operator splitting scheme and linearization

As already mentioned, in order to handle the high computational complexity of the mathematical model, normally some operator splitting method is utilized, see [14, 27, 35, 36, 37, 61]. To get an insight into these operator splitting methods, we take the scheme given in [37] as an example. With the initialization (ρ^0, u^0, p^0), the approximate sequences {ρ^n, u^n, p^n}_{n=0,1,...,N} on all time levels are computed by solving:

  (ρ^{n+1} − ρ^n)/τ + ∇·(ρ^{n+1} u^n) − (ρ^{n+1}/2) ∇·u^n = 0,

  ρ^n (u^{n+1} − u^n)/τ + ρ^{n+1}(u^n·∇)u^{n+1} − µΔu^{n+1} + (ρ^{n+1}/4)(∇·u^n) u^{n+1} + ∇p^n = f^{n+1},    u^{n+1}|_{∂Ω_D} = g,

  Δp^{n+1} = (χ/τ) ∇·u^{n+1},    ∂p^{n+1}/∂n = 0.        (4.3)

As can be seen, the diffusion-convection term is advanced at each time step (i.e., the second equation in (4.3)) without enforcing the incompressibility constraint. The resulting intermediate velocity field is then projected onto the space of discretely divergence-free vector fields (i.e., the last equation in (4.3)). However, one needs to impose some non-physical boundary conditions for the pressure, i.e., ∂p^{n+1}/∂n = 0. This scheme is proposed for the original Navier-Stokes equations (4.1) with constant viscosity and variable density, and the two terms (ρ^{n+1}/2)∇·u^n and (ρ^{n+1}/4)(∇·u^n)u^{n+1} are added for stability reasons. Also, it
is indicated in [35] that the resulting pressure is still a reasonable approximation to the true pressure, at least in the interior of the domain, and the errors are mainly located near the boundary.

The computational procedure used in Paper VI is motivated by two facts. First, since in general the initial pressure is not known, we must keep the momentum (diffusion) equation and the incompressibility constraint in coupled form, which also enables the computation of the pressure without the use of artificial pressure boundary conditions. Therefore, we instead split off the advection part, which can be handled separately. Second, for reasons of stability and to avoid the use of very small time steps, we must use a stable implicit time integration method of second order of accuracy. At each time level, we first compute the density, and then compute the velocity and pressure simultaneously by solving the momentum equation coupled with the divergence constraint for the momentum. Implicitly, the incompressibility constraint ∇·u = 0 is also satisfied. Furthermore, we linearize the coupled equations using a frozen coefficient approach, in a similar way as is done in the Oseen problem. To this end, we compute the approximate sequences {ρ^n, µ^n, v^n, u^n, p^n}_{n=0,1,...,N} with the initial conditions (ρ^0 = ρ_0, µ^0 = µ_0, v^0 = v_0, u^0 = v_0/ρ_0) for all time steps n from 0 to N − 1. We also assume that µ is a known function of ρ.

Algorithm 2 (Backward Euler scheme)

A1-1: Compute ρ^{n+1} from

  (ρ^{n+1} − ρ^n)/τ + u^n·∇ρ^{n+1} = 0.        (4.4)

A1-2: Compute (v^{n+1}, p^{n+1}) from

  (v^{n+1} − v^n)/τ + (u^n·∇)v^{n+1} − ∇·(µ^{n+1} D(v^{n+1}/ρ^{n+1})) + ∇p^{n+1} = ρ^{n+1} f^{n+1},
  ∇·v^{n+1} − τ²Δp^{n+1} = u^n·∇ρ^{n+1}.        (4.5)

A1-3: Finally, obtain u^{n+1} as u^{n+1} = v^{n+1}/ρ^{n+1}.

The above equations are obtained by using a first-order semi-implicit discretization. To fully avoid unphysical oscillations (see, e.g., [37]), we additionally regularize the problem by adding the term −τ²Δp^{n+1} (τ² times the negative Laplacian of the pressure, where τ is the time step) to the divergence constraint. To obtain an algorithm of second-order accuracy in time, we can replace the first-order backward Euler time discretization with the three-level backward differentiation formula (BDF2). This scheme proceeds as follows. First, one initializes (ρ^0, µ^0, v^0, u^0), and computes (ρ^1, µ^1, v^1, u^1, p^1) by using one step of the first-order Algorithm 2. Then, for n ≥ 1, one proceeds as follows.

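To illustrate the semi-implicit character of the density update (4.4), the following small sketch (a hypothetical one-dimensional finite difference illustration, not taken from Paper VI) advances a density profile with a frozen velocity field, using first-order upwind differences on a periodic grid; since the transport term is treated implicitly, the time step is not restricted by a CFL condition.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical 1D illustration of the semi-implicit transport step (4.4):
# (rho^{n+1} - rho^n)/tau + u^n * d(rho^{n+1})/dx = 0 on a periodic grid.
# First-order upwind differences are used, assuming u^n > 0 everywhere.
N, tau, nsteps = 200, 1.0e-2, 50
x = np.linspace(0.0, 1.0, N, endpoint=False)
h = x[1] - x[0]
u = 0.7 * np.ones(N)                                  # frozen velocity from the previous time level
rho = 1.0 + 0.5 * np.exp(-200.0 * (x - 0.3) ** 2)     # initial density profile

# Periodic upwind derivative: (rho_i - rho_{i-1}) / h, with wrap-around in row 0.
D = (sp.eye(N) - sp.eye(N, k=-1) - sp.eye(N, k=N - 1)) / h
A = (sp.eye(N) / tau + sp.diags(u) @ D).tocsc()       # implicit transport operator

for step in range(nsteps):
    rho = spla.spsolve(A, rho / tau)                  # one backward Euler transport step

print("density bounds after %d steps: [%.4f, %.4f]" % (nsteps, rho.min(), rho.max()))
```

In the actual two- and three-dimensional computations the transport equation is of course discretized by finite elements and the velocity varies in space, but the algebraic structure of the step, namely one linear solve with a nonsymmetric transport operator per time level, is the same.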
To obtain an algorithm of second-order accuracy in time, we can replace the first-order backward Euler time discretization with the three-level backward differentiation formula (BDF2). This scheme proceeds as follows. First, one initializes $(\rho^0, \mu^0, v^0, u^0)$ and computes $(\rho^1, \mu^1, v^1, u^1, p^1)$ by one step of the first-order Algorithm 2. Then, for $n \ge 1$, one proceeds as follows.

Algorithm 3 (BDF2)

A2-1: Set the linearly extrapolated velocity at time level $n+1$ as
\[
u^{*} = 2u^{n} - u^{n-1}.
\]

A2-2: Compute $\rho^{n+1}$ from
\[
\frac{3\rho^{n+1} - 4\rho^{n} + \rho^{n-1}}{2\tau} + u^{*}\cdot\nabla\rho^{n+1} = 0. \tag{4.6}
\]

A2-3: Compute $(v^{n+1}, p^{n+1})$ from
\[
\begin{aligned}
&\frac{3v^{n+1} - 4v^{n} + v^{n-1}}{2\tau} + (u^{*}\cdot\nabla)v^{n+1} - \nabla\cdot\Big(\mu^{n+1}\,D\big(\tfrac{v^{n+1}}{\rho^{n+1}}\big)\Big) + \nabla p^{n+1} = \rho^{n+1} f^{n+1},\\
&\nabla\cdot v^{n+1} - \tau^{2}\,\Delta p^{n+1} = u^{*}\cdot\nabla\rho^{n+1}.
\end{aligned}
\tag{4.7}
\]

A2-4: Recover the velocity $u^{n+1}$ as $u^{n+1} = v^{n+1}/\rho^{n+1}$.

Remark. The second-order backward difference time stepping method is simple to implement. The linearized coupled system of equations (4.7) can be seen as the Oseen problem introduced already in Chapter 2. Therefore, after discretization with the finite element method, the resulting nonsymmetric and indefinite linear system is of two-by-two block form, and the preconditioning techniques proposed for such two-by-two block systems are applicable, typically the preconditioners used for the Oseen problem, see the examples presented in Chapter 2. In the following section of this chapter we discuss another efficient preconditioning technique for the discrete equations (4.7).

However, the BDF2 method is not fully stable in the sense of A- and B-stability for systems of nonlinear ordinary differential equations (cf., e.g., [70]). Such stability holds for linear problems with all eigenvalues of the operator located in the stable half of the complex plane. The stability analysis for nonlinear problems is more complicated, however, and one cannot rely solely on the eigenvalues of the linearized (Jacobian) operator. It can be shown that methods such as BDF2, or the traditional form of the trapezoidal method, are not fully stable in this sense, which prevents their use over long time intervals. In [2] it has been shown that the so-called one-leg (one-sided) form of the $\theta$-method with $\theta = 1/2 - O(\tau)$ is stable for monotone operators uniformly in time and is, hence, applicable for infinitely long time integration intervals. It retains second order of accuracy for $\theta = 1/2 - O(\tau)$, where $\tau$ is the time step. For $\theta = 1/2 - O(\tau^{\varsigma})$, $\varsigma < 1$, the method is not fully of second order but has increased stability properties. For reasons of simplicity, when we do not need to integrate over very long time intervals, as well as for reasons of comparison with other related work, such as [37], the BDF2 method is also used in the numerical experiments.

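As a simple check of the accuracy claims above, the following sketch (a hypothetical scalar illustration, not taken from Papers I-VI) integrates the linear test equation $d\xi/dt + \lambda\xi = g(t)$ with BDF2 and with the one-leg $\theta$-method for $\theta = 1/2$ (recalled formally below), and prints the observed convergence orders at the final time; both methods should come out close to second order.

```python
import numpy as np

# Hypothetical accuracy check on the scalar test problem
#   d(xi)/dt + lam*xi = g(t),  xi(0) = 1,  exact solution xi(t) = exp(-lam*t) + sin(t),
# so that g(t) = cos(t) + lam*sin(t).  In the notation d(xi)/dt + F = 0, F(t, xi) = lam*xi - g(t).
lam, T = 5.0, 2.0
exact = lambda t: np.exp(-lam * t) + np.sin(t)
g = lambda t: np.cos(t) + lam * np.sin(t)

def one_leg_theta(tau, theta=0.5):
    # One-leg theta-method: xi_{k+1} - xi_k + tau*F(tbar, xibar) = 0,
    # with tbar = theta*t_k + (1-theta)*t_{k+1}, xibar = theta*xi_k + (1-theta)*xi_{k+1}.
    nsteps = int(round(T / tau))
    xi, t = 1.0, 0.0
    for _ in range(nsteps):
        tbar = theta * t + (1.0 - theta) * (t + tau)
        # Solve the (linear) update equation for xi_{k+1}.
        xi = (xi * (1.0 - tau * lam * theta) + tau * g(tbar)) / (1.0 + tau * lam * (1.0 - theta))
        t += tau
    return abs(xi - exact(T))

def bdf2(tau):
    # BDF2: (3*xi_{k+1} - 4*xi_k + xi_{k-1})/(2*tau) + F(t_{k+1}, xi_{k+1}) = 0,
    # started with one backward Euler step, as in Algorithm 3.
    nsteps = int(round(T / tau))
    xi_old, t = 1.0, 0.0
    xi = (xi_old + tau * g(t + tau)) / (1.0 + tau * lam)   # backward Euler start
    t += tau
    for _ in range(nsteps - 1):
        xi_new = (4.0 * xi - xi_old + 2.0 * tau * g(t + tau)) / (3.0 + 2.0 * tau * lam)
        xi_old, xi, t = xi, xi_new, t + tau
    return abs(xi - exact(T))

for method, name in [(bdf2, "BDF2"), (one_leg_theta, "one-leg theta=1/2")]:
    errs = [method(tau) for tau in (0.1, 0.05, 0.025, 0.0125)]
    orders = [np.log2(errs[i] / errs[i + 1]) for i in range(len(errs) - 1)]
    print(name, "errors:", ["%.2e" % e for e in errs], "orders:", ["%.2f" % o for o in orders])
```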
We first briefly recall the one-leg $\theta$-method (OLTM). Consider the evolution equation
\[
\frac{d\xi}{dt} + F(t,\xi) = 0, \quad t > 0, \qquad \xi(0) = \xi_0.
\]
The classical $\theta$-method in implicit one-leg form reads
\[
\xi(t+\tau) - \xi(t) + \tau F(\bar{t}, \bar{\xi}) = 0, \quad t = 0, \tau, 2\tau, \dots, \qquad \xi(0) = \xi_0,
\]
where $\tau$ is the time step and
\[
\bar{t} = \theta t + (1-\theta)(t+\tau), \qquad \bar{\xi} = \theta\,\xi(t) + (1-\theta)\,\xi(t+\tau), \qquad 0 \le \theta \le 1.
\]
We refer to [2, 67] for more details on the one-leg form of the $\theta$-scheme and its properties. We present next an implementation of OLTM for the density and the momentum equations. Note that we need to compute $(u(t_n+\tau/2), p(t_n+\tau/2))$, which is done by solving a Stokes problem of the form (4.8). To further simplify the computational procedure, we split the momentum $v^{n+1}$ into two parts, i.e., $v^{n+1} = (v_1^{n+1} + v_2^{n+1})/2$, where the component $v_1^{n+1}$ recovers the convective character as in (4.10) and the other component $v_2^{n+1}$ takes care of the diffusion and the divergence constraint for the momentum as in (4.11). We choose $\theta = 1/2$ to guarantee second-order accuracy in time.

Algorithm 4 (OLTM)

A3-1: Compute $v(t_n+\tau/2)$, $u(t_n+\tau/2)$, $p(t_n+\tau/2)$ by solving
\[
\begin{aligned}
&\frac{v^{n+\frac12} - v^{n}}{\tau/2} - \nabla\cdot\Big(\mu^{n}\,D\big(\tfrac{v^{n+\frac12}}{\rho^{n}}\big)\Big) + \nabla p^{n+\frac12} = \rho^{n} f^{n+\frac12} - (u^{n}\cdot\nabla)v^{n},\\
&\nabla\cdot v^{n+\frac12} - \tau^{2}\,\Delta p^{n+\frac12} = u^{n}\cdot\nabla\rho^{n},
\end{aligned}
\tag{4.8}
\]
and set $u^{n+\frac12} = v^{n+\frac12}/\rho^{n}$.

A3-2: Compute $\rho^{n+1}$ by solving
\[
\frac{\rho^{n+1} - \rho^{n}}{\tau} + u^{n+\frac12}\cdot\nabla\,\frac{\rho^{n+1} + \rho^{n}}{2} = 0. \tag{4.9}
\]

A3-3: Defining $\rho^{n+\frac12} = (\rho^{n+1}+\rho^{n})/2$ and $\mu^{n+\frac12} = (\mu^{n+1}+\mu^{n})/2$, compute $(v_1^{n+1}, v_2^{n+1}, p^{n+1})$ by solving
\[
\frac{v_1^{n+1} - v^{n}}{\tau} + (u^{n+\frac12}\cdot\nabla)\,\frac{v_1^{n+1} + v^{n}}{2}
= \rho^{n+\frac12} f^{n+\frac12} + \nabla\cdot\Big(\mu^{n+\frac12}\,D(u^{n+\frac12})\Big) - \nabla p^{n+\frac12},
\tag{4.10}
\]

and
\[
\begin{aligned}
&\frac{v_2^{n+1} - v^{n}}{\tau} - \nabla\cdot\Big(\mu^{n+\frac12}\,D\big(\tfrac12\big(\tfrac{v_2^{n+1}}{\rho^{n+1}} + u^{n}\big)\big)\Big) + \nabla p^{n+1}
= \rho^{n+\frac12} f^{n+\frac12} - (u^{n+\frac12}\cdot\nabla)\,\frac{v_1^{n+1} + v^{n}}{2},\\
&\nabla\cdot v_2^{n+1} - \tau^{2}\,\Delta p^{n+1} = 2\,u^{n+\frac12}\cdot\nabla\rho^{n+1} - \nabla\cdot v_1^{n+1}.
\end{aligned}
\tag{4.11}
\]

A3-4: Finally, we compute $(v^{n+1}, u^{n+1}, p^{n+1})$ as
\[
v^{n+1} = \frac{v_1^{n+1} + v_2^{n+1}}{2}, \qquad u^{n+1} = \frac{v^{n+1}}{\rho^{n+1}}, \qquad p^{n+1} = \frac{p^{n+1} + p^{n+\frac12}}{2}.
\]

The form of the constraint in (4.11) is motivated by
\[
\nabla\cdot v^{n+1} = \nabla\cdot\frac{v_1^{n+1} + v_2^{n+1}}{2} = \nabla\cdot(\rho^{n+1}u^{n+1}) = u^{n+1}\cdot\nabla\rho^{n+1} + \rho^{n+1}\,\nabla\cdot u^{n+1}.
\]
Using the assumption $\nabla\cdot u^{n+1} = 0$, one gets $\nabla\cdot v_2^{n+1} = 2\,u^{n+1}\cdot\nabla\rho^{n+1} - \nabla\cdot v_1^{n+1}$. Since $u^{n+1}$ is not yet known, we replace it by $u^{n+\frac12}$. Also, as can be seen, the incompressibility constraint $\nabla\cdot u = 0$ is satisfied.

Table 4.1 summarizes the computational complexity of Algorithm 3 (BDF2) and Algorithm 4 (OLTM) at each time level.

Table 4.1. Comparison of the computational complexity of BDF2 and OLTM per time level

BDF2:
- solve the hyperbolic equation (4.6) once;
- solve the Oseen-type problem (4.7) once;
- reassemble the matrices corresponding to $u^{*}\cdot\nabla$ and $\nabla\cdot(\mu^{n+1}D(\cdot))$.

OLTM:
- solve the hyperbolic equations twice, as in (4.9) and (4.10);
- solve the Stokes-type problem twice, as in (4.8) and (4.11);
- reassemble the matrices corresponding to $u^{n}\cdot\nabla$, $u^{n+\frac12}\cdot\nabla$, $\nabla\cdot(\mu^{n}D(\cdot))$ and $\nabla\cdot(\mu^{n+\frac12}D(\cdot))$;
- recompute more right-hand side vectors.

As is well known, preconditioning and solving the Stokes problem is much easier than preconditioning and solving the Oseen problem, especially for small values of the viscosity. Thus, the efficient preconditioned iterative solution methods available for the Stokes equations offset the heavier assembly and computational work in Algorithm 4, making it a very attractive approach.

Figure 4.1. The difference between the computed and the true pressure obtained with BDF2 (a) and OLTM (b), with τ = h and final time T (the specific values are given in Paper VI).

Numerical experiments in Paper VI show that both the BDF2 and the one-leg θ-scheme are stable and second-order accurate in time. Compared to the computation of the velocity, density and viscosity, the more difficult task is to compute the pressure unknowns correctly. In Figure 4.1, for a test problem with a known analytical solution (see Paper VI), we see that both schemes capture the pressure quite well.

Since in the two algorithms we do not impose any artificial boundary conditions for the pressure unknowns, the small difference between the computed and the true pressure appears globally within the domain. This is in contrast to the results in [35], see, e.g., Figure 4.2, where the difference is mainly located around the boundary, due to the non-physical boundary conditions imposed on the pressure.

Figure 4.2. Pressure error field at time T = 1 in a square, scanned from [35].

4.4 Preconditioning techniques

After discretizing in space using some proper finite element pair, we can rewrite the linearized system (4.7) in the BDF2 scheme into a block matrix structure as follows:
\[
\mathcal{A}\begin{bmatrix} v_h(t+\tau)\\ p_h(t+\tau)\end{bmatrix} = \mathrm{rhs},
\qquad\text{where}\quad
\mathcal{A} = \begin{bmatrix} A & B^{T}\\ B & -\tau^{2} L_p \end{bmatrix}.
\tag{4.12}
\]
The system (4.12) is solved by a generalized conjugate gradient method, such as GMRES ([65]) or GCGMR ([1]). The matrix block $A$ has the form $A = \sigma M + E$, where $M$ is a mass matrix, $E$ comes from the discrete diffusion and convection terms, and $\sigma$ is a function of the reciprocal of the time step $\tau$. The block $B$ arises from the discrete negative divergence operator, and the term $\tau^{2}L_p$ corresponds to the discrete stabilization operator, i.e., the term $\tau^{2}\Delta p$ in (4.7), where $L_p$ is the discrete Laplacian operator for the pressure unknowns. The preconditioner used for the system (4.12) is of block triangular form
\[
\mathcal{P} = \begin{bmatrix} \tilde{A} & O\\ B & S \end{bmatrix},
\tag{4.13}
\]
where $S$ approximates the exact Schur complement of (4.12), i.e., $-\tau^{2}L_p - BA^{-1}B^{T}$. As can be seen, the preconditioner used here follows the same block lower-triangular construction strategy as introduced in Section 2.3.

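To make the action of the block triangular preconditioner (4.13) concrete, the following sketch (a hypothetical small-scale illustration with stand-in matrices, not the implementation used in Paper VI) assembles a system with the block structure of (4.12) and performs, per preconditioning step inside GMRES, one solve with the pivot block followed by one solve with an approximate Schur complement; with accurate blocks, GMRES converges in very few iterations.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical illustration of the block lower-triangular preconditioner (4.13)
# for a system with the block structure of (4.12).  The blocks are simple stand-ins
# (1D difference operators and a random coupling block), not an actual finite
# element discretization of the Navier-Stokes problem.
n, m, tau = 80, 30, 1.0e-2
lap_v = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))      # diffusion stand-in
conv_v = sp.diags([-0.5, 0.5], [-1, 1], shape=(n, n))              # convection-like skew part
A = (sp.eye(n) / tau + lap_v + conv_v).tocsr()                     # pivot block: sigma*M + E
B = sp.random(m, n, density=0.2, random_state=1, format="csr")     # divergence stand-in
Lp = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m)).tocsr() # pressure Laplacian stand-in

K = sp.bmat([[A, B.T], [B, -tau**2 * Lp]], format="csr")
rhs = np.ones(n + m)

# Preconditioner P = [[A, 0], [B, S]]: the pivot block is used exactly and the Schur
# complement is formed explicitly (affordable only for this toy size); in practice
# both solves are replaced by cheap approximations or inner iterations.
A_solve = spla.factorized(A.tocsc())
S = (-tau**2 * Lp - B @ spla.spsolve(A.tocsc(), B.T.tocsc())).toarray()

def apply_Pinv(r):
    rv, rp = r[:n], r[n:]
    zv = A_solve(rv)                        # solve with the pivot block
    zp = np.linalg.solve(S, rp - B @ zv)    # then with the (approximate) Schur complement
    return np.concatenate([zv, zp])

Pinv = spla.LinearOperator((n + m, n + m), matvec=apply_Pinv, dtype=float)

its = {"count": 0}
x, info = spla.gmres(K, rhs, M=Pinv, restart=50, maxiter=200,
                     callback=lambda r: its.__setitem__("count", its["count"] + 1))
print("GMRES flag:", info, " iterations (approx.):", its["count"],
      " residual:", np.linalg.norm(K @ x - rhs))
```

In the computations reported in Paper VI the pivot-block and Schur-complement solves are of course not exact; the efficiency of the overall method hinges on replacing them by inexpensive approximations, which is the topic of the discussion that follows in the thesis.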