SPECIAL TOPICS IN NUMERICS (Advanced Aspects of PDE Numerics)

Rolf Rannacher
Institute of Applied Mathematics
Heidelberg University

Lecture Notes, SS/WS 2016/2017

February 5, 2017

Address of Author:

Institute of Applied Mathematics
Heidelberg University
Im Neuenheimer Feld 205 (MATHEMATIKON)
D Heidelberg, Germany

Contents

5 Adaptivity
   5.1 Introduction to adaptivity
       General principles of adaptivity
       A first example: computation of drag coefficient
       More demanding examples of goal-oriented mesh adaptation
       General principles of error estimation
   5.2 A linear PDE model case
       Finite element approximation
       Global a posteriori error estimates
       A posteriori error estimates for output functionals
   5.3 Practical aspects
       Evaluation of the error identity and indicators
       Mesh adaptation
   5.4 Balancing discretization and iteration errors
       Iterative solution methods
       Derivation of a combined error estimator
       Evaluation of the a posteriori error estimators
       Nested adaptive algorithm
       Numerical example
   5.5 Attempts towards theoretical justification
       Complexity analysis
       Goal-oriented versus energy-norm error estimation
       Convergence of residuals
       Approximation of weights
   5.6 Abstract framework of a posteriori error estimation
       A nested solution approach
       Control of the nonlinear and linear iteration errors
       Special case: inexact Newton method
       Numerical examples
   5.7 Application to eigenvalue problems
       A posteriori error analysis
       Practical evaluation of the error representation
       Balancing of discretization and iteration error
       Numerical examples
   5.8 Application to optimization problems
       A posteriori error analysis via Euler-Lagrange formalism
       Application to a concrete boundary control problem
       Application to parameter estimation
   Exercises

Bibliography


5 Adaptivity

In this chapter, we develop a general approach to the design of adaptive methods for the finite element Galerkin approximation of variational problems. The applications considered include elliptic boundary value problems, linear and nonlinear, the corresponding eigenvalue and optimization problems, as well as time-dependent problems of parabolic or hyperbolic type. The main goal is to derive criteria for the systematic control of the combined errors due to model approximation, discretization, and approximate algebraic solution. This results in the so-called Dual Weighted Residual (DWR) method, which combines concepts from optimal control with special properties of finite element Galerkin approximations. The material of this chapter is largely taken from Bangerth & Rannacher [24].

5.1 Introduction to adaptivity

5.1.1 General principles of adaptivity

We begin with a brief introduction to the philosophy underlying the approach to self-adaptivity which will be discussed in this chapter. Let the goal of a simulation be the accurate and efficient computation of the value of a functional $J(u)$, the "goal quantity", to an accuracy level TOL, from the solution $u$ of a continuous model by using an approximating discrete model of dimension $N$, indicated by a subscript $h \in \mathbb{R}_+$:
$$A(u) = 0, \qquad A_h(u_h) = 0.$$
The evaluation of the solution by the functional $J(\cdot)$ represents what exactly we want to know of the solution. Then, the goal of adaptivity is the optimal use of computer resources according to either one of the following principles:

- Minimal work $N$ for prescribed accuracy TOL: $N \to \min$, TOL given.
- Maximal accuracy for prescribed work: $\mathrm{TOL} \to \min$, $N$ given.

These goals may be approached by automatic model, discretization, and iteration adaptation on the basis of local error indicators taken from the computed solution. The main ingredients of this process are the following (a generic sketch of the resulting adaptive loop is given below):

- rigorous a posteriori error estimates in terms of data and the computed solution, used as stopping criterion in the adaptation process;
- local error indicators extracted from the a posteriori error estimates for steering the adaptation process;
- automatic mesh adaptation strategies based on the local error indicators;
- effective stopping criteria for the various iteration processes, obtained by balancing their contributions to the total error estimator.
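The interplay of these ingredients is usually organized as the loop "solve - estimate - mark - refine". The following is a minimal sketch of this control flow only; the callables solve, estimate_indicators, and refine are hypothetical placeholders for a concrete finite element backend and are not part of the text above.

```python
# Minimal sketch of the generic adaptive loop (SOLVE - ESTIMATE - MARK - REFINE).
# 'mesh', 'solve', 'estimate_indicators', 'refine' are assumed, hypothetical objects.

def adaptive_loop(mesh, solve, estimate_indicators, refine, TOL, N_max=10**6):
    while True:
        u_h = solve(mesh)                       # discrete solution of A_h(u_h) = 0
        eta_T = estimate_indicators(mesh, u_h)  # local indicators from the a posteriori estimate
        eta = sum(eta_T.values())               # estimate for |J(u) - J(u_h)|
        if eta <= TOL:                          # principle 1: prescribed accuracy reached
            return u_h, eta
        if mesh.n_cells() >= N_max:             # principle 2: prescribed work exhausted
            return u_h, eta
        mesh = refine(mesh, eta_T)              # adapt the mesh using the local indicators
```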

We will demonstrate by examples that the appropriate choice of each of these steps is crucial for an economical simulation. Inappropriate realizations which violate the characteristic features of the underlying problem may drastically reduce efficiency and reliability. The traditional approach to adaptivity aims at estimating the error with respect to the generic energy norm of the problem or the global $L^2$-norm. However, this is generally not what applications need. In the following, we present a collection of examples of situations in which one is really interested in computing more locally defined quantities.

5.1.2 A first example: computation of drag coefficient

In order to illustrate the role of adaptivity in the design of a computational mesh, we consider a viscous incompressible flow around a cylinder in a channel with a narrowed outlet as shown in Figure 5.1. The flow quantities, velocity $v$ and pressure $p$, are determined by the classical incompressible Navier-Stokes equations
$$\partial_t v - \nu \Delta v + v\cdot\nabla v + \nabla p = f, \qquad \nabla\cdot v = 0.$$
The configuration is two-dimensional, with Poiseuille inflow and viscosity $\nu = 0.02$, such that the flow is laminar and stationary. The narrowing of the outlet causes a so-called corner singularity.

Figure 5.1: Configuration and streamline plot for flow around a cylinder ($\nu = 0.02$).

The discretization of this problem is by a low-order finite element method using continuous, cellwise bilinear shape functions for velocity as well as for pressure, with additional pressure stabilization.

The discretization is on form-regular quadrilateral meshes $\mathbb{T}_h = \{T\}$, which may contain hanging nodes for easing local mesh refinement. The goal is the accurate computation of the corresponding drag coefficient of the obstacle,
$$J(v,p) := c_{\mathrm{drag}} := \frac{2}{\bar{U}^2 D} \int_S n^T (2\nu\tau - pI)\, d \,\mathrm{d}s,$$
where $S$ is the surface of the body, $D$ its diameter, $\bar{U}$ the maximal inflow velocity, $\tau := \tfrac{1}{2}(\nabla v + \nabla v^T)$ the strain tensor, and $d = (0,1)^T$ the main flow direction. In order to control the mesh adaptation process in this simulation, one may find good reasons to use one of the following heuristic refinement indicators $\eta_T$ on the mesh cells $T$ with sides $\Gamma \subset \partial T$:

- Vorticity: $\eta_T := h_T \|\nabla\times v_h\|_T$.
- First-order pressure gradient: $\eta_T := h_T \|\nabla p_h\|_T$.
- Second-order velocity gradient: $\eta_T := h_T \|\nabla_h^2 v_h\|_T$.
- Residual-based indicator: $\eta_T := h_T \|R_h\|_T + h_T^{1/2} \|r_h\|_{\partial T} + h_T \|\nabla\cdot v_h\|_T$, with
$$R_h|_T := f + \nu\Delta v_h - v_h\cdot\nabla v_h - \nabla p_h, \qquad
r_h|_\Gamma := \begin{cases} \tfrac{1}{2}[\nu\partial_n v_h - n p_h], & \text{if } \Gamma \not\subset \partial\Omega, \\ 0, & \text{if } \Gamma \subset \Gamma_{\mathrm{rigid}} \cup \Gamma_{\mathrm{in}}, \\ \nu\partial_n v_h - n p_h, & \text{if } \Gamma \subset \Gamma_{\mathrm{out}}, \end{cases}$$
where $n$ is the unit normal vector and $[\cdot]$ denotes the jump across cell interfaces $\partial T$.

The mesh refinement process is organized according to the relative size of these indicators, i.e., cells $T$ of the current mesh with values $\eta_T$ exceeding a certain threshold are refined and the others are kept unchanged (or coarsened), as sketched below. Then, the problem is solved on the new mesh and the adaptation process is continued. The mesh-dependent scaling in these indicators is problem dependent and reflects the structure of certain a posteriori error estimates. The vorticity as well as the pressure and velocity gradient indicators measure the smoothness of the discrete solution $\{v_h, p_h\}$, while the heuristic residual-based indicator additionally contains information about local conservation of momentum and mass. As competitors, we additionally consider global uniform refinement and refinement by a more systematic approach, the Dual Weighted Residual (DWR) method (the main theme of this chapter), which uses the same residual terms as the heuristic residual-based indicator but multiplied by weights obtained by solving a global dual problem.

The above test case has been designed to demonstrate the ability of the different refinement indicators to produce meshes on which the main features of the flow, such as boundary layers along rigid walls, vortices behind the cylinder, and the corner singularity at the outlet, are sufficiently resolved. Figure 5.2 shows locally adapted meshes obtained on the basis of the different refinement indicators.
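The threshold-based marking step described above can be sketched in a few lines. This is only an illustration under the assumption that per-cell quantities such as $h_T$ and $\|\nabla\times v_h\|_T$ have already been computed by some flow solver; the arrays and the chosen threshold fraction are hypothetical.

```python
import numpy as np

# Sketch of threshold-based refinement: cells whose indicator eta_T exceeds a fixed
# fraction of the largest indicator are flagged for refinement.
# 'h' and 'vort_norm' are hypothetical per-cell arrays (h_T and ||curl v_h||_T).

def vorticity_indicator(h, vort_norm):
    return h * vort_norm                      # eta_T = h_T * ||curl v_h||_T

def mark_by_threshold(eta, fraction=0.5):
    threshold = fraction * eta.max()
    return np.where(eta > threshold)[0]       # indices of cells to be refined

# usage with made-up data for 1000 cells
h = np.full(1000, 0.05)
vort = np.random.rand(1000)
refine_cells = mark_by_threshold(vorticity_indicator(h, vort))
```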

The results shown in Figure 5.3 demonstrate that in this case the two ad hoc indicators involving only the norm of the vorticity or of the pressure gradient are even less efficient than simple uniform refinement. This demonstrates that a systematic approach to goal-oriented mesh adaptation is needed, which takes into account not only local properties of the solution but also the global dependence of the error in the target quantity on these properties.

Figure 5.2: Meshes with N ≈ 5,000 cells obtained by the vorticity indicator (left), the heuristic residual-based indicator (middle), and the DWR indicator (right).

Figure 5.3: Error J(e) in the drag coefficient versus number of cells N, for uniform refinement, the weighted indicator obtained by the DWR approach, the vorticity indicator, and the heuristic residual-based indicator.

5.1.3 More demanding examples of goal-oriented mesh adaptation

Let us illustrate the need for goal-oriented mesh adaptation by two further examples of different types of complexity: a 3-d flow problem where sufficient accuracy is achieved with acceptable cost only on locally adapted meshes, and a 2-d flow problem in which the complication arises through the strong interaction of mass transport and heat transfer. In both situations, the main problem is the generation of error estimates which reflect the local and global dependence of the error on the locally observed solution properties. In fact, mesh adaptation in solving a coupled system of equations for a set of physical quantities $u = (u^i)_{i=1}^n$ is usually based on smoothness or residual information like
$$\eta_T := \sum_{i=1}^n \omega_T^i\, \|D_h^2 u_h^i\|_T \qquad\text{or}\qquad \eta_T := \sum_{i=1}^n \omega_T^i\, \|R_i(u_h)\|_T.$$
Here, $D_h^2 u_h^i$ stands for certain second-order difference quotients, and $R_i(u_h)$ are certain residuals as introduced in the previous example. Both kinds of indicators are easily evaluated from the computed solution and are widely used in practice. The proper choice of the weights $\omega_T^i$ is crucial for the effectivity of the adaptation process. They should include both a scaling due to different orders of magnitude of the solution components $u^i$ and the influence of the present cell $T$ on the requested quantity of interest.

A cylinder flow benchmark in 3-D (Braack & Richter [43])

We consider the 3-dimensional flow in a channel around a cylinder with square cross section as shown in Figure 5.4. The viscosity is $\nu = 0.05$, such that the flow is laminar and stationary. This is part of a benchmark suite for the computation of viscous, incompressible fluid flow.

Figure 5.4: Configuration of 3D flow around a square cylinder in a channel.

The goal is again the accurate computation of the drag coefficient
$$J(v,p) := c_d = \frac{2}{\bar{U}^2 D H} \int_S n^T (2\nu\tau - pI)\, d \,\mathrm{d}s,$$
where $D$, $H$ are geometrical parameters, $\bar{U}$ the maximum inflow velocity, and $d = (1,0,0)^T$ the main flow direction. The desired accuracy is TOL ≈ 1%, which turns out to be a rather demanding task even for this simple flow situation.

In Table 5.1, we compare the efficiency of global uniform refinement against that of local refinement on the basis of a heuristic residual-based indicator and the DWR method as already used in the first example. The superiority of mesh adaptation by the DWR method is clearly seen. Figure 5.5 shows meshes which have been obtained by refinement on the basis of the heuristic residual-based indicator and by the DWR method. The heuristic indicator fails to properly refine the area behind the cylinder, which causes its poorer drag approximation.

Table 5.1: Drag results: a) $Q_2/Q_1$ element with global refinement, b) $Q_1/Q_1$ element with local refinement by the heuristic residual indicator, c) $Q_1/Q_1$ element with local refinement by the DWR method; the first mesh level (N = number of unknowns) on which an error of less than 1% is achieved is indicated in boldface.

Figure 5.5: Refined meshes and zoom into the cylinder vicinity obtained by the heuristic residual-based indicator (upper row) and by the DWR method (lower row).

A heat-driven cavity benchmark in 2-D (Becker & Braack [36], Braack et al. [42])

We consider a 2-dimensional cavity flow. The flow in a square box with side length L = 1 (see Figure 5.6) is driven by a temperature difference $\theta_h - \theta_c = 720\,\mathrm{K}$ between the left ("hot") and the right ("cold") wall, under the action of gravity $g$ in the vertical direction. The Rayleigh number is $\mathrm{Ra} \approx 10^6$, making this problem computationally demanding. Here, the quantity to be computed is the average Nusselt number (mean heat flux) along the cold wall, defined by
$$J(u) = \mathrm{Nu}_c := \frac{\mathrm{Pr}}{0.3\,\mu_0\,\theta_0} \int_{\Gamma_{\mathrm{cold}}} \kappa\, \partial_n \theta\,\mathrm{d}s,$$
where Pr is the Prandtl number and $\mu_0$, $\theta_0$ are certain reference values for viscosity and temperature. The underlying mathematical model is the low-Mach-number approximation of the stationary compressible Navier-Stokes equations, expressed in terms of the set of primitive variables $u = \{v, p, \theta\}$ denoting velocity, pressure, and temperature. In this case, due to the large temperature difference, the usual Boussinesq approximation is not sufficient.

Figure 5.6: Flow in the heat-driven cavity: velocity norm isolines (left) and temperature isolines (right).

The meshes shown in Figure 5.7 indicate that the heuristic residual-based indicator induces mesh refinement mainly in those areas where the velocity is dominant, while the weighted error indicator obtained by the DWR method puts more emphasis on the region along the hot boundary where the temperature gradient is dominant. The latter, given the quantity we want to compute, seems to be more important for capturing the heat transfer through the cavity. This is confirmed by the results shown in Table 5.2, which show that on the meshes generated by the properly weighted error indicator the accuracy in computing the Nusselt number is better by almost one order of magnitude.

Table 5.2: Computation of the Nusselt number in the heat-driven cavity by the heuristic residual-based indicator (left) and the weighted indicator obtained by the DWR method (right); comparable error magnitudes are indicated in boldface (N = number of cells).

Figure 5.7: Sequence of refined meshes for the heat-driven cavity with N ≈ 500, 5500, ... cells: heuristic residual-based indicator (top row), weighted error indicator obtained by the DWR method (bottom row).

5.1.4 General principles of error estimation

In the following, we introduce some of the main concepts of the DWR method for goal-oriented a posteriori error estimation within the framework of linear algebra. We will use the same concepts later for differential equations as well and use this simple setting only to introduce the most important aspects.

Traditional error estimation

For regular matrices $A, A_h \in \mathbb{R}^{N\times N}$ and vectors $b, b_h \in \mathbb{R}^N$, consider the problems of finding $x, x_h \in \mathbb{R}^N$ from
$$Ax = b, \qquad A_h x_h = b_h, \qquad (5.1.1)$$
where $h \in \mathbb{R}_+$ is a parameter indicating the quality of approximation, i.e., $A_h \to A$ and $b_h \to b$ as $h \to 0$. In this context, we introduce the notation of the approximation error $e := x - x_h$, the truncation error $\tau := A_h x - b_h$, and the residual $\rho := b - A x_h$.

Usually, a priori error analysis is based on the truncation error and uses the identity
$$A_h e = A_h x - A_h x_h = A_h x - b_h = \tau,$$
to derive an a priori error bound involving a discrete stability constant $c_{S,h}$:
$$\|e\| \le c_{S,h}\, \|\tau\|, \qquad c_{S,h} := \|A_h^{-1}\|. \qquad (5.1.2)$$
In contrast to that, the a posteriori error analysis uses the relation
$$Ae = Ax - Ax_h = b - Ax_h = \rho,$$
to derive an a posteriori error bound involving a continuous stability constant:
$$\|e\| \le c_S\, \|\rho\|, \qquad c_S := \|A^{-1}\|. \qquad (5.1.3)$$
Notice that the a priori error analysis is based on assumptions on the stability properties of the discrete operator $A_h$, which may be difficult to establish for the particular approximation, while the a posteriori error analysis uses stability properties of the unperturbed continuous operator $A$, which are often available from regularity theory. Further, the truncation error $\tau$ is not so easily computable in practical applications.

Duality-based error estimation

In order to avoid the aforementioned drawbacks and to estimate the error also with respect to arbitrary "moments" of the solution, we employ a duality argument well known from the a priori error analysis of Galerkin methods. For some given $j \in \mathbb{R}^N$, assume that we want to estimate the error with respect to the (linear) functional $J(\cdot) := (\cdot, j)$:
$$J(x) - J(x_h) = J(e) = (e, j).$$
This corresponds to the goal of computing the quantity $J(x)$ rather than the whole vector $x \in \mathbb{R}^N$. For the estimation of this error, consider the solution $z \in \mathbb{R}^N$ of the associated dual (or adjoint) problem
$$A^T z = j. \qquad (5.1.4)$$
This leads us to the error identity
$$J(e) = (e, j) = (e, A^T z) = (Ae, z) = (\rho, z),$$
and then to the weighted a posteriori error estimate
$$|J(e)| \le \sum_{i=1}^N |\rho_i\, z_i|. \qquad (5.1.5)$$
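The following small NumPy experiment illustrates the duality argument in exactly this linear-algebra setting: for randomly perturbed data it compares the true functional error $J(e)$ with the error identity $(\rho, z)$ and the weighted bound (5.1.5). The random matrices and the functional $j$ are, of course, only assumed test data.

```python
import numpy as np

# Duality-based error estimation for Ax = b vs. a perturbed system A_h x_h = b_h:
# J(e) = (e, j) equals (rho, z) with rho = b - A x_h and the dual solution A^T z = j.

rng = np.random.default_rng(0)
N = 50
A = np.eye(N) + 0.1 * rng.standard_normal((N, N))   # "continuous" operator
A_h = A + 1e-3 * rng.standard_normal((N, N))        # perturbed ("discrete") operator
b = rng.standard_normal(N)
b_h = b + 1e-3 * rng.standard_normal(N)
j = rng.standard_normal(N)                          # defines J(x) = (x, j)

x = np.linalg.solve(A, b)
x_h = np.linalg.solve(A_h, b_h)
z = np.linalg.solve(A.T, j)                         # dual problem A^T z = j

rho = b - A @ x_h                                   # computable residual
print(j @ (x - x_h))                                # true error J(e)
print(rho @ z)                                      # error identity (rho, z), identical up to rounding
print(np.sum(np.abs(rho * z)))                      # weighted a posteriori bound (5.1.5)
```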

In this estimate, the residual components $\rho_i$ are easily computable, but the determination of the weights $z_i$ requires the solution of the auxiliary problem (5.1.4), which may be as expensive as solving the original problem. The gain in using these weights is that they determine the influence of the local residuals $\rho_i$ on the error in the target quantity $J(x)$.

5.2 A linear PDE model case

In this section, we develop the basics of the DWR method for linear elliptic partial differential equations as originally described in Becker and Rannacher [39]. As a model configuration, we consider the Poisson equation on a polygonal or polyhedral domain $\Omega \subset \mathbb{R}^d$, with Dirichlet boundary conditions,
$$-\Delta u = f \ \text{ in } \Omega, \qquad u|_{\partial\Omega} = 0. \qquad (5.2.6)$$
The discretization will be by a Galerkin finite element method which is based on the variational formulation of (5.2.6): Find $u \in V := H_0^1(\Omega)$, such that
$$a(u,\psi) := (\nabla u, \nabla\psi) = (f,\psi) \quad \forall \psi \in V. \qquad (5.2.7)$$
In this section, we will exemplarily consider the a posteriori control of the resulting approximation $u_h$ with respect to the following goals ("output quantities"):

- Computation of an overview of the solution's structure (global norm error): $\|\nabla(u-u_h)\| \le \mathrm{TOL}$ or $\|u-u_h\| \le \mathrm{TOL}$.
- Computation of displacement or stress components at some point $a \in \Omega$ (point-value error): $J(u) := u(a)$ or $J(u) := \partial_i u(a)$.
- Computation of a "mean normal flux": $J(u) := \int_{\partial\Omega} \partial_n u\,\mathrm{d}s$.

However, we will always have in mind the accurate computation of arbitrary output functional values $J(u)$ of the solution.

Here and below, we use the following notation: For a domain $\Omega \subset \mathbb{R}^d$, $L^2(\Omega)$ is the Lebesgue space of square-integrable functions on $\Omega$, which is a Hilbert space with the scalar product and norm
$$(v,w)_\Omega = \int_\Omega v\,w\,\mathrm{d}x, \qquad \|v\|_\Omega = \Big(\int_\Omega |v|^2\,\mathrm{d}x\Big)^{1/2}.$$

Analogously, $L^2(\partial\Omega)$ is the space of square-integrable functions defined on the boundary $\partial\Omega$, equipped with the scalar product and norm
$$(v,w)_{\partial\Omega} = \int_{\partial\Omega} v\,w\,\mathrm{d}s, \qquad \|v\|_{\partial\Omega} = \Big(\int_{\partial\Omega} |v|^2\,\mathrm{d}s\Big)^{1/2}.$$
The Sobolev spaces $H^1(\Omega)$ and $H^2(\Omega)$ consist of those functions $v \in L^2(\Omega)$ which possess first- and second-order (distributional) derivatives $\nabla v \in L^2(\Omega)^d$ and $\nabla^2 v \in L^2(\Omega)^{d\times d}$, respectively. For functions in these spaces, we use the semi-norms
$$|v|_{1;\Omega} := \|\nabla v\|_\Omega, \qquad |v|_{2;\Omega} := \|\nabla^2 v\|_\Omega.$$
This notation can be extended to Sobolev spaces $H^m(\Omega)$ of arbitrary order $m \ge 1$, and more generally to spaces $H^{m,p}(\Omega)$ of $p$th-order integrability. For the limit case $p = \infty$ of essentially bounded functions, we use the common notation $W^{m,\infty}(\Omega)$. The space $H^1(\Omega)$ can be embedded into the space $L^2(\partial\Omega)$, such that for each $v \in H^1(\Omega)$ there exists a trace $v|_{\partial\Omega} \in L^2(\partial\Omega)$. Further, the functions in the subspace $H_0^1(\Omega) \subset H^1(\Omega)$ are characterized by the property $v|_{\partial\Omega} = 0$. By the Poincaré inequality,
$$\|v\|_\Omega \le c\, \|\nabla v\|_\Omega, \quad v \in H_0^1(\Omega), \qquad (5.2.8)$$
the $H^1$-semi-norm $\|\nabla v\|_\Omega$ is a norm on the subspace $H_0^1(\Omega)$. If the set $\Omega$ is identical with the domain on which the differential equation is posed, we usually omit the subscript $\Omega$ in the notation of norms and scalar products, e.g., $\|v\| = \|v\|_\Omega$. All of the above notation will be used synonymously for vector- or matrix-valued functions $v : \Omega \to \mathbb{R}^d$ or $\mathbb{R}^{d\times d}$.

5.2.1 Finite element approximation

The discretization of the model problem (5.2.6) seeks an approximation $u_h \in V_h$, the so-called Ritz projection of $u$, in a certain finite-dimensional subspace $V_h \subset V$:
$$a(u_h,\psi_h) = (f,\psi_h) \quad \forall \psi_h \in V_h. \qquad (5.2.9)$$
The main feature of the Galerkin method for linear problems is the so-called Galerkin orthogonality of the error $e := u - u_h$,
$$a(e,\psi_h) = 0 \quad \forall \psi_h \in V_h. \qquad (5.2.10)$$
The subspaces (finite element spaces) considered have the form
$$V_h = \{v \in V : v|_T \in P(T),\ T \in \mathbb{T}_h\},$$
defined on decompositions $\mathbb{T}_h$ of $\Omega$ into cells $T$ (triangles or quadrilaterals in $\mathbb{R}^2$, and tetrahedra or hexahedra in $\mathbb{R}^3$) of width $h_T = \mathrm{diam}(T)$; we write $h = \max_{T\in\mathbb{T}_h} h_T$ for the global mesh width.

Here, $P(T)$ denotes a suitable space of polynomial-like functions defined on the cell $T \in \mathbb{T}_h$. In the numerical results discussed below, we have mostly used bilinear or trilinear finite elements on quadrilateral or hexahedral meshes, respectively, in which case $P(T)$ consists of shape functions obtained via a bilinear transformation from the space of bilinears $Q_1(\hat{T}) = \mathrm{span}\{1, x_1, x_2, x_1x_2\}$ or trilinears $Q_1(\hat{T}) = \mathrm{span}\{1, x_1, x_2, x_3, x_1x_2, x_2x_3, x_3x_1, x_1x_2x_3\}$ on the reference cell $\hat{T} = [0,1]^d$. Local mesh refinement or coarsening may be realized by using hanging nodes in such a way that global conformity is preserved, that is, $V_h \subset V$.

We consider the control of the error with respect to some output functional $J(\cdot)$, i.e., we want to have estimates for the difference $J(e) = J(u) - J(u_h)$. For simplicity, we assume here $J(\cdot)$ to be linear. Following the general concept of the DWR method, let $z \in V$ be the solution of the associated dual problem and $z_h \in V_h$ its finite element approximation, defined by
$$a(\varphi,z) = J(\varphi) \quad \forall \varphi \in V, \qquad (5.2.11)$$
$$a(\varphi_h,z_h) = J(\varphi_h) \quad \forall \varphi_h \in V_h. \qquad (5.2.12)$$
Using this construction together with Galerkin orthogonality, we obtain
$$J(e) = a(e,z) = a(e,z-\psi_h) = (f,z-\psi_h) - a(u_h,z-\psi_h) =: \rho(u_h)(z-\psi_h), \quad \psi_h \in V_h.$$
The so-called "residual" $\rho(u_h)(\cdot)$ of the Galerkin approximation $u_h$ may be viewed as a functional on the solution space $V$. Cell-wise integration by parts implies
$$\rho(u_h)(z-\psi_h) = \sum_{T\in\mathbb{T}_h} \big\{ (f+\Delta u_h, z-\psi_h)_T - (\partial_n u_h, z-\psi_h)_{\partial T} \big\}
= \sum_{T\in\mathbb{T}_h} \big\{ (f+\Delta u_h, z-\psi_h)_T - \tfrac{1}{2}([\partial_n u_h], z-\psi_h)_{\partial T\setminus\partial\Omega} \big\},$$
where $[\partial_n u_h]$ denotes the jump of $\partial_n u_h$ across the inter-element edges (in 2-D) or faces (in 3-D), i.e., for two neighboring cells $T, T' \in \mathbb{T}_h$ with common edge $\Gamma$ and unit normal vector $n$ pointing from $T$ to $T'$, we set
$$[\partial_n u_h] = [\nabla u_h \cdot n] := \big(\nabla u_h|_{T\cap\Gamma} - \nabla u_h|_{T'\cap\Gamma}\big)\cdot n.$$
Since $n' = -n$, this actually defines the jump of $\partial_n u_h$ across the edge $\Gamma$. For later use, we define the cell and edge residuals $R_h$ and $r_h$, respectively, by
$$R_h|_T := f + \Delta u_h, \qquad r_h|_\Gamma := \begin{cases} -\tfrac{1}{2}[\partial_n u_h], & \text{if } \Gamma \subset \partial T\setminus\partial\Omega, \\ 0, & \text{if } \Gamma \subset \partial\Omega. \end{cases}$$
We collect the previous results in the following proposition.

Proposition 5.1: For the finite element approximation (5.2.9) of the Poisson problem, we have the a posteriori error representation
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big\} =: E(u_h), \qquad (5.2.13)$$
with an arbitrary $\psi_h \in V_h$, and as a consequence the a posteriori error estimate
$$|J(e)| \le \sum_{T\in\mathbb{T}_h} \big| (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big| =: \sum_{T\in\mathbb{T}_h} \eta_T(u_h) \le \eta_\omega := \sum_{T\in\mathbb{T}_h} \rho_T\, \omega_T, \qquad (5.2.14)$$
where the cell residuals ("smoothness indicators") $\rho_T$ and weights ("influence factors") $\omega_T$ are given by
$$\rho_T := \big( \|R_h\|_T^2 + h_T^{-1} \|r_h\|_{\partial T}^2 \big)^{1/2}, \qquad \omega_T := \big( \|z-\psi_h\|_T^2 + h_T \|z-\psi_h\|_{\partial T}^2 \big)^{1/2},$$
for an arbitrary $\psi_h \in V_h$.

Remark 5.1: The dual solution $z$ has the features of a generalized Green function $G(T,T')$, as it describes the dependence of the output error quantity $J(e)$, which may be concentrated at some cell $T'$, on local properties of the data, i.e., in this case the residuals $\rho_T$ on cells $T$. In order to evaluate the a posteriori error representation (5.2.13) or the resulting a posteriori error estimate (5.2.14), we need information about the continuous dual solution $z$. Since in practice $z$ is not explicitly known, such information has to be obtained either through a priori analysis in the form of bounds for $z$ in certain Sobolev norms, or through computation by solving the dual problem numerically. In the following, we will present some examples in which $z$ can be bounded a priori or can even be explicitly determined. In later sections, dealing with real-life problems, we will always have to rely on the computational estimation of $z$.
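In anticipation of the practical evaluation discussed in Section 5.3, the cellwise bound (5.2.14) is easily assembled once the norms entering $\rho_T$ and $\omega_T$ are available. The following sketch only reproduces the algebra of Proposition 5.1; the per-cell arrays of norms are assumed to be provided by a finite element code.

```python
import numpy as np

# Evaluation of the bound (5.2.14): eta_T <= rho_T * omega_T.
# Hypothetical per-cell input arrays:
#   h[T]         = h_T
#   R_norm[T]    = ||R_h||_T,        r_norm[T]    = ||r_h||_{dT}
#   zerr_cell[T] = ||z - psi_h||_T,  zerr_edge[T] = ||z - psi_h||_{dT}

def dwr_indicators(h, R_norm, r_norm, zerr_cell, zerr_edge):
    rho = np.sqrt(R_norm**2 + r_norm**2 / h)            # rho_T
    omega = np.sqrt(zerr_cell**2 + h * zerr_edge**2)     # omega_T
    return rho * omega                                   # cellwise contributions

# the global estimate is then |J(e)| <= dwr_indicators(...).sum()
```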

5.2.2 Global a posteriori error estimates

In the following, we demonstrate how the previous approach can be used for deriving the known a posteriori error estimates with respect to global norms.

Example 5.1 (Energy-norm error): For deriving the usual error estimate with respect to the natural energy norm associated with problem (5.2.7), we choose the error functional
$$J(\varphi) := (\nabla\varphi, \nabla e)\, \|\nabla e\|^{-1},$$
considering the error $e$ as a fixed quantity. The corresponding dual solution $z \in V$ is determined by
$$a(\varphi,z) = (\nabla\varphi, \nabla e)\, \|\nabla e\|^{-1} \quad \forall \varphi \in V,$$
and admits the trivial a priori bound $\|\nabla z\| \le 1 =: c_S$ (stability constant). Clearly, in this particular case the dual solution is just given by $z = e\, \|\nabla e\|^{-1}$. From Proposition 5.1 we infer the estimate
$$J(e) = \|\nabla e\| \le \sum_{T\in\mathbb{T}_h} \rho_T\, \omega_T \le \Big(\sum_{T\in\mathbb{T}_h} h_T^2 \rho_T^2\Big)^{1/2} \Big(\sum_{T\in\mathbb{T}_h} h_T^{-2} \omega_T^2\Big)^{1/2}.$$
We recall the interpolation error estimate
$$\inf_{\psi_h\in V_h} \Big(\sum_{T\in\mathbb{T}_h} \big\{ h_T^{-2}\|z-\psi_h\|_T^2 + h_T^{-1}\|z-\psi_h\|_{\partial T}^2 \big\}\Big)^{1/2} \le c_I\, \|\nabla z\|, \qquad (5.2.15)$$
to estimate further,
$$\|\nabla e\| \le c_I \Big(\sum_{T\in\mathbb{T}_h} h_T^2 \rho_T^2\Big)^{1/2} \|\nabla z\| \le c_I\, c_S \Big(\sum_{T\in\mathbb{T}_h} h_T^2 \rho_T^2\Big)^{1/2}.$$
This results in the classical energy-norm error estimate
$$\|\nabla e\| \le \eta_E := c_I\, c_S \Big(\sum_{T\in\mathbb{T}_h} h_T^2 \rho_T^2\Big)^{1/2}, \qquad (5.2.16)$$
with $\rho_T$ as defined in Proposition 5.1.

Remark 5.2: By the projection property, using the Galerkin orthogonality $(\nabla u, \nabla u_h) = (\nabla u_h, \nabla u_h)$, there holds
$$\|\nabla e\|^2 = \|\nabla u\|^2 - 2(\nabla u, \nabla u_h) + \|\nabla u_h\|^2 = \|\nabla u\|^2 - 2(\nabla u_h, \nabla u_h) + \|\nabla u_h\|^2 = \|\nabla u\|^2 - \|\nabla u_h\|^2,$$
such that energy-norm error control turns out to be equivalent to "energy error" control. However, this requires the energy form to be a scalar product.

Example 5.2 ($L^2$-norm error): To derive an estimate with respect to the $L^2$ norm, we choose the error functional
$$J(\varphi) := (\varphi, e)\, \|e\|^{-1}.$$
Suppose that the (polygonal or polyhedral) domain $\Omega$ is convex. Then, the corresponding dual solution $z \in V \cap H^2(\Omega)$ admits the a priori bound $\|\nabla^2 z\| \le 1 =: c_S$ (stability constant). From the result of Proposition 5.1, we infer the estimate
$$\|e\| \le \sum_{T\in\mathbb{T}_h} \rho_T\, \omega_T \le \Big(\sum_{T\in\mathbb{T}_h} h_T^4 \rho_T^2\Big)^{1/2} \Big(\sum_{T\in\mathbb{T}_h} h_T^{-4} \omega_T^2\Big)^{1/2}.$$

Using the interpolation error estimate
$$\inf_{\psi_h\in V_h} \Big(\sum_{T\in\mathbb{T}_h} \big\{ h_T^{-4}\|z-\psi_h\|_T^2 + h_T^{-3}\|z-\psi_h\|_{\partial T}^2 \big\}\Big)^{1/2} \le c_I\, \|\nabla^2 z\|, \qquad (5.2.17)$$
we obtain
$$\|e\| \le c_I \Big(\sum_{T\in\mathbb{T}_h} h_T^4 \rho_T^2\Big)^{1/2} \|\nabla^2 z\| \le c_I\, c_S \Big(\sum_{T\in\mathbb{T}_h} h_T^4 \rho_T^2\Big)^{1/2}.$$
This results in the well-known $L^2$-norm error estimate
$$\|e\| \le \eta_{L^2} := c_I\, c_S \Big(\sum_{T\in\mathbb{T}_h} h_T^4 \rho_T^2\Big)^{1/2}, \qquad (5.2.18)$$
with $\rho_T$ again as defined in Proposition 5.1. In comparison to the energy-norm error estimate (5.2.16), the $L^2$-norm estimate involves the weighting $h_T^4$, which reflects its higher order of convergence.

5.2.3 A posteriori error estimates for output functionals

Next, we turn to estimating the error with respect to local output functionals. Let TOL again denote the accuracy we want to achieve.

Example 5.3 (Point-value error): To estimate the error at some point $a \in \Omega$, we use the regularized functional
$$J(u) := |B_\varepsilon|^{-1} \int_{B_\varepsilon} u\,\mathrm{d}x = u(a) + O(\varepsilon^2),$$
assuming $u$ to be locally smooth, where $B_\varepsilon$ is the $\varepsilon$-ball around the point $a$ and $\varepsilon := \mathrm{TOL}$. The corresponding dual solution $z$ behaves like a regularized Green function, i.e., in 2-D,
$$z(x) = g_\varepsilon^a(x) \approx \log(r(x)) + 1, \qquad r(x) := \big(|x-a|^2 + \varepsilon^2\big)^{1/2}.$$
By choosing $\psi_h$ in the estimate (5.2.14) suitably, the weights here have the form
$$\omega_T \approx h_T^2\, \|\nabla^2 z\|_T \approx h_T^2\, |T|^{1/2}\, r_T^{-2}, \qquad r_T := \max_{x\in T} r(x),$$
such that, assuming TOL small,
$$|e(a)| \le \eta_\omega := c_I\, c_S \sum_{T\in\mathbb{T}_h} h_T^3\, r_T^{-2}\, \rho_T. \qquad (5.2.19)$$

Example 5.4 (Derivative point-value error): To estimate the error in the derivative in direction $x_i$ at some interior point $a \in \Omega$, we use the regularized output functional
$$J(u) := |B_\varepsilon|^{-1} \int_{B_\varepsilon} \partial_i u\,\mathrm{d}x = \partial_i u(a) + O(\varepsilon^2),$$
where again $\varepsilon := \mathrm{TOL}$. In this case the dual solution behaves like a regularized "derivative Green function", i.e., in 2-D,
$$z(x) = \partial_i g_\varepsilon^a(x) \approx (x-a)_i\, r(x)^{-2} + 1, \qquad r(x) := \big(|x-a|^2 + \varepsilon^2\big)^{1/2},$$
and the corresponding weights behave like
$$\omega_T \approx h_T^2\, \|\nabla^2 z\|_T \approx h_T^2\, |T|^{1/2}\, r_T^{-3}.$$
This results in the a posteriori error estimate
$$|\partial_i e(a)| \le \eta_\omega := c_I\, c_S \sum_{T\in\mathbb{T}_h} h_T^3\, r_T^{-3}\, \rho_T. \qquad (5.2.20)$$
Compared with (5.2.19), this localizes the region of influence towards the point $a$ even more.

Figure 5.8: Examples of computed dual solutions: regularized Green function and derivative Green function (scaled differently), for evaluating $u(a)$ and $\partial_1 u(a)$.

Example 5.5 (Mean normal-flux error): Another type of error functional involves integrals over lower-dimensional manifolds. As an example, we consider the computation of the mean normal flux across the boundary,
$$J(u) = \int_{\partial\Omega} \partial_n u\,\mathrm{d}s,$$
where for simplicity $\Omega$ is assumed to be the unit circle. The question is: What is an efficient mesh-size distribution for computing $J(u)$?

Notice that in this simple context the computational goal is trivial, since it can be reduced to evaluating data,
$$J(u) = \int_{\partial\Omega} \partial_n u\,\mathrm{d}s = \int_\Omega \Delta u\,\mathrm{d}x = -\int_\Omega f\,\mathrm{d}x.$$
However, in more complex situations error functionals of this type cannot be computed so easily and are of high interest (e.g., the drag coefficient or the average Nusselt number mentioned in the Introduction). Here, the corresponding dual problem
$$a(\varphi,z) = (1, \partial_n\varphi)_{\partial\Omega} \quad \forall \varphi \in V \cap C^1(\bar\Omega)$$
has a measure solution of the type $z \equiv -1$ in $\Omega$, $z = 0$ on $\partial\Omega$. Hence, to avoid dealing with measures, we use the regularized output functional
$$J_\varepsilon(\varphi) = \varepsilon^{-1} \int_{S_\varepsilon} \partial_n\varphi\,\mathrm{d}x = \int_{\partial\Omega} \partial_n\varphi\,\mathrm{d}s + O(\varepsilon),$$
where $S_\varepsilon = \{x \in \Omega : \mathrm{dist}\{x,\partial\Omega\} < \varepsilon\}$ and $\varepsilon := \mathrm{TOL}$. The corresponding dual solution is explicitly given by
$$z_\varepsilon = \begin{cases} -1 & \text{in } \Omega\setminus S_\varepsilon, \\ -\varepsilon^{-1}\,\mathrm{dist}\{x,\partial\Omega\} & \text{in } S_\varepsilon. \end{cases}$$

Figure 5.9: Refined mesh and computed dual solution for the mean normal flux.

On cells $T \subset \Omega\setminus S_\varepsilon$ there holds $z - I_h z \equiv 0$, which leads us to the error estimate
$$|J_\varepsilon(e)| \le \eta_\omega := \sum_{T\in\mathbb{T}_h,\ T\cap S_\varepsilon \ne \emptyset} \rho_T\, \omega_T.$$
The conclusion is: there is no contribution to the error from cells in the interior of $\Omega$. Hence, whatever the right-hand side $f$ and resulting solution $u$, the optimal strategy is to refine the elements adjacent to the boundary and to leave the others unchanged.

In practice, due to the regularity constraints on the mesh, this may however also lead to some refinement in the interior. This conclusion holds provided that the integrals involving the right-hand side $f$ are evaluated exactly.

Example 5.6 (Heterogeneous $L^2$-norm error): Finally, we consider again the $L^2$-norm error estimate, but for a problem with strongly varying diffusion coefficient,
$$-\nabla\cdot\{a\nabla u\} = f \ \text{ in } \Omega, \qquad u|_{\partial\Omega} = 0. \qquad (5.2.21)$$
This example is intended to show that, even when estimating the error in global norms, it can be beneficial to keep the dual weights inside the error estimator rather than condensing them into just one global stability constant. In the considered situation the dual problem reads in strong form
$$-\nabla\cdot\{a\nabla z\} = e\,\|e\|^{-1} \ \text{ in } \Omega, \qquad z|_{\partial\Omega} = 0, \qquad (5.2.22)$$
and the local residuals take the form
$$R_h|_T = f + \nabla\cdot\{a\nabla u_h\}, \qquad r_h|_\Gamma = -\tfrac{1}{2}\, n\cdot[a\nabla u_h].$$
By the same argument which led us to Proposition 5.1 and then to the standard $L^2$-error estimate, we infer the following two types of a posteriori error estimates:

Weighted error estimate:
$$\|e\| \le \eta_{L^2}^\omega := c_I \sum_{T\in\mathbb{T}_h} h_T^2\, \rho_T\, \omega_T, \qquad \omega_T := \|\nabla^2 z\|_T. \qquad (5.2.23)$$

Global error estimate:
$$\|e\| \le \eta_{L^2} := c_I\, c_S \Big(\sum_{T\in\mathbb{T}_h} h_T^4\, \rho_T^2\Big)^{1/2}, \qquad c_S := \|\nabla^2 z\|_\Omega. \qquad (5.2.24)$$

The residual terms $\rho_T$ are defined as before. In both cases, the stability terms can be evaluated by replacing the true dual solution by its Galerkin approximation, $\|\nabla^2 z\|_T \approx \|\nabla_h^2 z_h\|_T$, with some second-order difference operator $\nabla_h^2$. The interpolation constant is typically of size $c_I \approx 0.2$. The error-dependent functional $J(\cdot) = (\cdot, e)\,\|e\|^{-1}$ is evaluated by replacing the unknown solution $u$ by a patch-wise higher-order interpolation $I_{2h}^{(2)} u_h$ of $u_h$, i.e., $e \approx I_{2h}^{(2)} u_h - u_h$. This gives us approximate $L^2$-error estimators denoted by $\eta_{L^2}^\omega(u_h)$ and $\eta_{L^2}(u_h)$, respectively.

We want to compare the performance of these two $L^2$-error estimators in a numerical experiment. To this end, consider the particular setting $\Omega = (-1,1)^2$ and $a(x) = 0.1 + e^{3(x_1+x_2)}$, with a sinusoidal solution $u(x)$ and corresponding right-hand side $f$. In this calculation the mesh adaptation tries to equilibrate the local error indicators $\eta_T = h_T^2\, \rho_T\, \omega_T$ and $\eta_T = h_T^4\, \rho_T^2$, respectively.

This and alternative strategies for mesh adaptation will be discussed in more detail below.

Figure 5.10: Point-error distribution obtained by $\eta_{L^2}$ (left) and $\eta_{L^2}^\omega$ (right, scaled by 1:3) on meshes with N ≈ 10,000 cells.

Figure 5.10 shows the (scaled) error distribution on meshes obtained by the two estimators. The results shown in Table 5.3 indicate that efficient control of the $L^2$-norm error in the case of heterogeneous coefficients requires the use of weighted a posteriori error estimates, i.e., the dual weights should be explicitly kept in the estimator and evaluated computationally.

Table 5.3: Results obtained by $\eta_{L^2}$ and $\eta_{L^2}^\omega$ (columns: TOL, N, $\|e\|$, $\eta_{L^2}$, $c_S$ for the global estimator; N, $\|e\|$, $\eta_{L^2}^\omega$ for the weighted estimator).

Remark 5.3 (Curved boundaries): Most examples considered so far have been posed on polygonal domains which can be exactly matched by the finite element mesh domain, i.e.,
$$\Omega_h := \bigcup\{T \in \mathbb{T}_h\} = \Omega.$$

This assumption largely simplifies the error analysis and the resulting error estimators. However, in many practical cases at least parts of the boundary of the domain are curved and cannot be matched exactly by a polynomial approximation, e.g., in the cylinder-flow examples presented in the Introduction. Therefore, we have to deal with this complication.

Figure 5.11: Standard situations of cells with curved edges.

We consider the two typical situations depicted in Figure 5.11, in which a cell $T$ at the boundary has a curved edge $\Gamma \subset \partial T \cap \partial\Omega$. The computational domain can be made to satisfy $\Omega_h = \Omega$ by simply extending or truncating the domain of definition of the shape functions. Shape functions and transformations are left unchanged. This approximation results in a non-conforming finite element scheme,
$$(\nabla u_h, \nabla\psi_h) = (f,\psi_h) \quad \forall \psi_h \in V_h, \qquad (5.2.25)$$
in which $V_h \not\subset V$, since the elements of $V_h$ will usually not satisfy zero boundary conditions. For the error analysis, we assume that the error functional $J(\cdot)$ has an $L^2$ representation, i.e., there is a $j \in L^2(\Omega)$ such that $J(\varphi) = (\varphi, j)$. Situations in which this is not the case require special considerations. Using the solution $z \in V$ of the dual problem
$$-\Delta z = j \ \text{ in } \Omega, \qquad z|_{\partial\Omega} = 0, \qquad (5.2.26)$$
we obtain the error identity
$$J(e) = (j,e) = (e, -\Delta z) = (\nabla e, \nabla z) - (e, \partial_n z)_{\partial\Omega}.$$
For any $\psi_h \in V_h$, there holds
$$(\nabla e, \nabla\psi_h) = (\nabla u, \nabla\psi_h) - (\nabla u_h, \nabla\psi_h) = -(\Delta u, \psi_h) + (\partial_n u, \psi_h)_{\partial\Omega} - (f,\psi_h) = (\partial_n u, \psi_h)_{\partial\Omega},$$
and consequently,
$$J(e) = (\nabla e, \nabla(z-\psi_h)) - (\partial_n u, z-\psi_h)_{\partial\Omega} + (u_h, \partial_n z)_{\partial\Omega}.$$

Now, integrating cellwise by parts, we obtain
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (f+\Delta u_h, z-\psi_h)_T + (\partial_n e, z-\psi_h)_{\partial T} \big\} - (\partial_n u, z-\psi_h)_{\partial\Omega} + (u_h, \partial_n z)_{\partial\Omega}.$$
This implies the error representation
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big\} + (u_h, \partial_n z)_{\partial\Omega}, \qquad (5.2.27)$$
with cell and edge residuals defined as above by $R_h|_T := f + \Delta u_h$ and
$$r_h|_\Gamma := \begin{cases} -\tfrac{1}{2}[\partial_n u_h], & \text{if } \Gamma \subset \partial T\setminus\partial\Omega, \\ -\partial_n u_h, & \text{if } \Gamma \subset \partial\Omega. \end{cases}$$
We note that the situation considered above is not very practical, since the evaluation of the discrete equations (5.2.25) requires the use of numerical integration, which in general introduces further errors that are not considered here.

Remark 5.4 (Nonhomogeneous Dirichlet data): Nonhomogeneous Dirichlet data, $u = g$ on $\partial\Omega$, are usually treated by introducing a representative function $\hat{u} \in H^1(\Omega)$ satisfying $\hat{u}|_{\partial\Omega} = g$, and a corresponding finite element approximation $\hat{u}_h$, which is the interpolation (or the $L^2$ projection) of $g$ along $\partial\Omega$. The solution is then sought in the form $u_h = \hat{u}_h + u_h^0$, where $u_h^0$ again has zero boundary values. Then, assuming again the domain to be polygonal or polyhedral, the error representation (5.2.13) takes the form
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big\} - (g - g_h, \partial_n z)_{\partial\Omega}. \qquad (5.2.28)$$

Remark 5.5 (Neumann boundary conditions): The treatment of Neumann boundary conditions, $\partial_n u = g$ on $\Gamma_N \subset \partial\Omega$, does not cause any problems even in the case of a curved boundary, provided that the mesh $\mathbb{T}_h$ is compatible with the decomposition of the boundary $\partial\Omega = \Gamma_D \cup \Gamma_N$. In this case, we simply assume the cells adjacent to the Neumann boundary to have, possibly, a curved edge or face matching the boundary exactly. Now, the variational formulation reads
$$a(u,\psi) = (f,\psi) + (g,\psi)_{\Gamma_N} \quad \forall \psi \in V,$$
where $V := \{v \in H^1(\Omega) : v|_{\Gamma_D} = 0\}$. For the error of the corresponding Galerkin approximation, Galerkin orthogonality again holds, i.e., $a(e,\psi_h) = 0$ for $\psi_h \in V_h \subset V$.

Then, the error representation (5.2.13) remains valid with the only modification that the edge residuals are now defined by
$$r_h|_\Gamma := \begin{cases} -\tfrac{1}{2}[\partial_n u_h], & \text{if } \Gamma \subset \partial T\setminus\partial\Omega, \\ 0, & \text{if } \Gamma \subset \Gamma_D, \\ g - \partial_n u_h, & \text{if } \Gamma \subset \Gamma_N. \end{cases}$$

Remark 5.6 (Higher-order finite elements): We briefly describe the use of the DWR method for higher-order finite elements and will see that it can be used in this case without essential changes. Let $V_h^{(p)} \subset V$ be finite element spaces of order $p+1$, i.e., they possess the local approximation properties of polynomials of degree $p$. For simplicity, we assume again that the domain $\Omega$ is polygonal/polyhedral. We recall that, by setting $\psi_h = I_h^{(p)} z$ in (5.2.13), we have
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z - I_h^{(p)} z)_T + (r_h, z - I_h^{(p)} z)_{\partial T} \big\},$$
with the cell and edge residuals $R_h$ and $r_h$ as defined above and some local interpolation $I_h^{(p)} z \in V_h^{(p)}$. In order to extract cell-error indicators for local mesh adaptation, we may proceed as before in the low-order case:
$$|J(e)| \le \sum_{T\in\mathbb{T}_h} \big| (R_h, z - I_h^{(p)} z)_T + (r_h, z - I_h^{(p)} z)_{\partial T} \big| \le \sum_{T\in\mathbb{T}_h} \big\{ \|R_h\|_T\, \|z - I_h^{(p)} z\|_T + \|r_h\|_{\partial T}\, \|z - I_h^{(p)} z\|_{\partial T} \big\}.$$

Remark 5.7 (Anisotropic mesh refinement): Sometimes isotropic mesh refinement as discussed so far is not efficient for properly resolving direction-dependent features of the solution. For example, in singularly perturbed problems of the form
$$-\varepsilon\Delta u + bu = f \ \text{ in } \Omega, \qquad u|_{\partial\Omega} = 0,$$
with a small coefficient $\varepsilon$, boundary layers may occur in which the solution has a large derivative in the direction normal to the boundary while it varies only slowly in the tangential direction. A similar phenomenon occurs in 3-D along reentrant edges of the domain ("edge singularities"). In such a situation it is appropriate to use meshes which are anisotropically refined in the sense that the cells along the boundary are much thinner in the normal than in the tangential direction, see Figure 5.12.

Figure 5.12: Locally anisotropic tensor-product meshes.

The questions are now whether the weighted error estimator contains information about anisotropy in the exact solution, and whether we can extract local indicators which tell us how to adapt the mesh according to (i) the orientation of cells and (ii) the optimal stretching of cells. This aspect of automatic mesh adaptation is a rather difficult one and still the subject of current research (cf. Bangerth & Rannacher [24] and Richter [67]).

5.3 Practical aspects

In this section, we discuss several aspects of the practical realization of the DWR method described in the previous section. These are (i) the practical and efficient evaluation of the a posteriori error representations, (ii) the extraction of local refinement indicators, and (iii) the design of strategies for economical mesh adaptation. The starting point is the a posteriori error representation
$$J(e) = \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big\} =: E(u_h), \qquad (5.3.29)$$
as derived above for the Poisson problem on polygonal/polyhedral domains. For its direct evaluation, due to Galerkin orthogonality, the subtraction of the arbitrary element $\psi_h \in V_h$ could be suppressed. From this, we extract local refinement indicators $\eta_T$ of the form
$$\eta_T := \big| (R_h, z-\psi_h)_T + (r_h, z-\psi_h)_{\partial T} \big|,$$
which are used to steer the mesh adaptation. Collecting these error indicators, we obtain an a posteriori error estimator, i.e., an upper bound for the error,
$$|J(e)| \le \eta := \sum_{T\in\mathbb{T}_h} \eta_T. \qquad (5.3.30)$$
We emphasize that, in practice, it does not make much sense to estimate further within the indicators $\eta_T$, since we would inevitably lose sharpness of the error bound.

Refinement and coarsening of quadrilateral meshes involving the use of hanging nodes proceeds as indicated in Figures 5.13 and 5.14. Here, the hanging nodes do not carry unknowns, as these are eliminated by linear interpolation of the values at neighboring regular nodal points, which preserves the global conformity of the finite element ansatz, i.e., $V_h \subset V$. The occurrence of hanging nodes can be avoided by using special transition cells (triangles or quadrilaterals), which bridge from cells of width $h$ to those of width $h/2$. The construction of such cells may be complicated in 3-D. It may also cause a spreading of the refinement zone, which complicates the data structure and implies extra work for de-refining. However, it is basically a question of taste which technique of mesh organization one prefers.

Figure 5.13: Refinement and coarsening in quadrilateral meshes.

Figure 5.14: $Q_1$ nodal basis function on a patch of cells with hanging nodes.

For practical use of the error representation (5.3.29) and the error bound (5.3.30), we have to approximate the terms involving the unknown dual solution $z$, resulting in approximate error representations $\tilde{E}(u_h)$ and error indicators $\tilde\eta_T$. This may be expensive, while the evaluation of the residuals $R_h$ and $r_h$ is usually cheap. We have to distinguish two related questions:

- Sharpness of the approximate error representation $\tilde{E}(u_h)$?
- Effectivity of the approximate local error indicators $\tilde\eta_T$ for mesh refinement?

For measuring the accuracy of the resulting error estimators, we will use the so-called (reciprocal) effectivity index defined by
$$I_{\mathrm{eff}} := \frac{\tilde{E}(u_h)}{J(e)},$$
which represents the degree of overestimation on the current mesh and should desirably be close to one.

Remark 5.8: Often the physical quantity to be computed can be expressed in different forms which coincide on the continuous level but differ from each other for discrete functions and may lead to more or less robust approximations. A typical example is the mean normal flux and, more interestingly, the drag and lift coefficients computed from solutions of the Navier-Stokes equations. Both can be expressed as surface or, after integration by parts, as volume integrals. If the desired output functional is not properly defined on the solution space $V$, such as point evaluation in two or more dimensions, it requires regularization,
$$J_\varepsilon(u) = J(u) + O(\varepsilon), \qquad \varepsilon \approx \mathrm{TOL}.$$
In the following, we will mostly suppress the index $\varepsilon$ indicating this regularization.

Remark 5.9: If, on the basis of a numerical approximation to the dual solution $z$, an approximate error representation $\tilde{E}(u_h)$ has been generated, one may hope to obtain an improved approximation to the target quantity by setting
$$\tilde{J}(u_h) := J(u_h) + \tilde{E}(u_h) \approx J(u).$$
This post-processing can significantly improve the accuracy in computing $J(u)$, but the resulting error can then no longer be estimated on the basis of the available information.

5.3.1 Evaluation of the error identity and indicators

Consider the approximation of the model Poisson problem
$$-\Delta u = f \ \text{ in } \Omega, \qquad u|_{\partial\Omega} = 0, \qquad (5.3.31)$$
on a polygonal (or polyhedral) domain $\Omega \subset \mathbb{R}^d$ ($d = 2$ or $3$) by piecewise bilinear (or trilinear) finite elements (in short "$Q_1$ elements"). For this prototypical case, we will compare several strategies for evaluating the error representation (5.3.29) and the corresponding local error indicators. Notice that approximating $z$ in $E(u_h)$ simply by its Ritz projection $z_h \in V_h$ does not work since, in view of Galerkin orthogonality, it would result in the useless approximation $\tilde{E}(u_h) = 0$.

Approximation by a higher-order method: A first possibility is to solve the dual problem by using biquadratic finite elements on the current mesh, yielding an approximation $z_h^{(2)} \in V_h^{(2)}$ to $z$. This yields the approximate error representation
$$E^{(1)}(u_h) := \sum_{T\in\mathbb{T}_h} \big\{ (R_h, z_h^{(2)} - I_h z_h^{(2)})_T + (r_h, z_h^{(2)} - I_h z_h^{(2)})_{\partial T} \big\},$$
and the corresponding local error indicators
$$\eta_T^{(1)} := \big| (R_h, z_h^{(2)} - I_h z_h^{(2)})_T + (r_h, z_h^{(2)} - I_h z_h^{(2)})_{\partial T} \big|.$$
For the special situation considered below (see Table 5.4), this approximation results in an asymptotically optimal effectivity index, $\lim_{\mathrm{TOL}\to 0} I_{\mathrm{eff}}^{(1)} = 1$. Notice that here the subtraction of $I_h z_h^{(2)}$ may be dropped without spoiling the quality of the approximation. However, approximating the dual problem by a higher-order method than used for $u_h$ itself seems not very attractive and can actually be avoided in most cases.

Approximation by higher-order interpolation: A simplification is achieved by patch-wise higher-order interpolation of the bilinear Ritz projection $z_h \in V_h$ of $z$. Here, we only consider biquadratic interpolation in 2-D. On square blocks of four neighboring cells, the 9 nodal values of $z_h$ are used to define a biquadratic interpolation $I_{2h}^{(2)} z_h$. This is then used in the error representation instead of $z$, resulting in the approximate error representation
$$E^{(2)}(u_h) := \sum_{T\in\mathbb{T}_h} \big\{ (R_h, I_{2h}^{(2)} z_h - z_h)_T + (r_h, I_{2h}^{(2)} z_h - z_h)_{\partial T} \big\},$$
and the corresponding local error indicators
$$\eta_T^{(2)} := \big| (R_h, I_{2h}^{(2)} z_h - z_h)_T + (r_h, I_{2h}^{(2)} z_h - z_h)_{\partial T} \big|.$$
For the special situation considered below (see Table 5.4), this approximation also results in an almost optimal effectivity index, $\lim_{\mathrm{TOL}\to 0} I_{\mathrm{eff}}^{(2)} \approx 1$. In the case of higher-order finite elements, with $p \ge 2$, the evaluation of the error identity (5.3.29) may be done in a similar way as for $p = 1$, employing a patch-wise interpolation $I_{2h}^{(p')} z_h$, with $p' > p$, of the Ritz projection $z_h \in V_h^{(p)}$. We emphasize that in this case the subtraction of $z_h$ in the cell-error terms should not be dropped, since $I_{2h}^{(2)} z_h$ itself is not necessarily a better approximation to $z$ in the pointwise sense than $z_h$, but the difference $I_{2h}^{(2)} z_h - z_h$ is usually a good approximation to $z - I_h z$. The (heuristic) reasoning for this is that on not too irregular meshes there holds on each mesh cell
$$|z - I_h z| \approx h_T^2\, |\nabla^2 z|, \qquad I_{2h}^{(2)} z_h - z_h = I_{2h}^{(2)} z_h - I_h^{(1)} I_{2h}^{(2)} z_h \approx h_T^2\, |\nabla^2 I_{2h}^{(2)} z_h|,$$
and (the difference quotient) $\nabla^2 I_{2h}^{(2)} z_h$ is a reasonable approximation to $\nabla^2 z$.
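To see why the difference $I_{2h}^{(2)} z_h - z_h$ is a usable substitute for the weight $z - I_h z$, consider the following 1-D toy computation. It assumes, for illustration only, that the nodal values of $z_h$ coincide with those of $z$ on a patch of two cells (an idealization), and compares the true weight at a cell midpoint with its computable approximation; the function $z$ below is an arbitrary example.

```python
import numpy as np

# 1-D illustration of the patchwise higher-order interpolation used in E^(2)(u_h):
# the quadratic through the three nodal values of the piecewise linear z_h replaces z,
# and I2h_z - z_h approximates the weight z - I_h z on the patch.

z = lambda x: np.sin(3 * x)                  # assumed "exact" dual solution (for comparison only)
x_nodes = np.array([0.0, 0.1, 0.2])          # patch of two cells
z_h = z(x_nodes)                             # nodal values of z_h (idealized as exact)

quad = np.polynomial.Polynomial.fit(x_nodes, z_h, deg=2)   # I^(2)_{2h} z_h on the patch

x_mid = 0.05                                 # midpoint of the first cell
lin = 0.5 * (z_h[0] + z_h[1])                # linear interpolant I_h z_h at the midpoint
print(z(x_mid) - lin)                        # true weight z - I_h z at x_mid
print(quad(x_mid) - lin)                     # computable approximation I^(2)_{2h} z_h - z_h
```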

Approximation by difference quotients: The error representation (5.3.29) is estimated by
$$|E(u_h)| \le \sum_{T\in\mathbb{T}_h} \rho_T\, \omega_T,$$
with the notation of Proposition 5.1. Applying cellwise interpolation estimates, we have
$$\omega_T^2 = \|z - I_h z\|_T^2 + h_T \|z - I_h z\|_{\partial T}^2 \le c_I^2\, h_T^4\, \|\nabla^2 z\|_T^2,$$
with an interpolation constant $c_I \in (0.1, 1)$ depending on the geometry of the cell $T$. Now, the second derivatives $\nabla^2 z$ are replaced by suitable second-order difference quotients $\nabla_h^2 z_h$ of the Ritz projection $z_h$ of $z$. For the corresponding error estimator
$$E^{(3)}(u_h) := c_I \sum_{T\in\mathbb{T}_h} h_T^{3/2}\, \rho_T\, \|[\partial_n z_h]\|_{\partial T},$$
we usually observe strong over-estimation, i.e., $\lim_{\mathrm{TOL}\to 0} I_{\mathrm{eff}}^{(3)} \gg 1$ (see Table 5.4), depending on what value we set for the interpolation constant $c_I$.

Numerical test

The effectivity of these error estimators has been tested for the 2-D model problem (5.3.31) with the solution $u(x) = (1-x_1^2)(1-x_2^2)\sin(4x_1)\sin(4x_2)$ for the two output functionals
$$J_1(u) := |S|^{-1} \int_S u\,\mathrm{d}x, \quad S := [-1,0]\times[0,\tfrac{1}{2}], \qquad J_2(u) := u(\tfrac{1}{2},\tfrac{1}{2}).$$

Table 5.4: Effectivity of the weighted error indicators for the mean error $J_1(e)$ (left) and the point error $J_2(e)$ (right); columns: $N$, $J_i(e)$, $I_{\mathrm{eff}}^{(1)}$, $I_{\mathrm{eff}}^{(2)}$, $I_{\mathrm{eff}}^{(3)}$.

Table 5.4 shows the corresponding effectivity indices obtained on sequences of locally refined meshes on the basis of the local error indicators $\eta_T^{(i)}$, $i = 1,2,3$. It turns out that the cheap estimator $E^{(2)}(u_h)$ based on local post-processing is almost as effective as the more expensive estimator $E^{(1)}(u_h)$. Therefore, in the following, we will almost exclusively use the former in the presented numerical tests.

Remark 5.10: There are other techniques for the evaluation of the cellwise error indicators $\eta_T$ which are vertex-oriented, avoiding the computation of strong residuals and normal jumps across cell boundaries. This is advantageous in some cases, e.g., when the data structure used does not support cell-boundary manipulations (cf. Braack & Ern [41] and Meidner et al. [53]).

Remark 5.11: The traditional energy-norm error estimator $\eta_E$ is known to be not only reliable, i.e., it provides a safe upper bound for the energy error, but also efficient, i.e., it is asymptotically sharp in the sense that
$$c_1\, \|\nabla e\| \le \eta_E \le c_2\, \big\{ \|\nabla e\| + \|f - f_h\| \big\},$$
where $f_h$ is the piecewise constant interpolation of $f$. This means that the estimator is, up to a constant, asymptotically correct. A corresponding result is not possible in general for estimators $\eta_\omega$ of locally defined error quantities. Already the transition from the error representation to the error estimate in terms of (non-negative) cell-wise error indicators is critical, since by this localization the asymptotic sharpness of the global error representation may get lost. To illustrate this, consider the case $J(u) = u(0)$ and assume that the exact as well as the approximate solution are anti-symmetric with respect to the $x_1$-axis. Then, $e(0) = 0$, but usually $\sum_{T\in\mathbb{T}_h} \eta_T \ne 0$.

5.3.2 Mesh adaptation

Next, we discuss strategies for successive mesh adaptation. Suppose that on the mesh $\mathbb{T}_h$ we have local error indicators $\eta_T$ extracted from an a posteriori error estimate
$$|J(e)| \le \eta(u_h) := \sum_{T\in\mathbb{T}_h} \eta_T, \qquad N := \#\{T \in \mathbb{T}_h\}. \qquad (5.3.32)$$
Using this information, the computational mesh may be adapted using various strategies. For quadrilateral meshes, as considered here, refinement and coarsening are facilitated by using hanging nodes as described above. At first, we have to check whether on the current mesh $\mathbb{T}_h$ the stopping criterion $\eta(u_h) \le \mathrm{TOL}$ is already satisfied. If this is the case, then $u_h$ is accepted as an approximation to $u$ that represents the target quantity $J(u)$ by $J(u_h)$ within the desired tolerance TOL. Otherwise, the next refinement cycle is started. To this end, the cells in the current mesh are ordered according to
$$\eta_{T_1} \ge \dots \ge \eta_{T_i} \ge \dots \ge \eta_{T_N}.$$

Then, the mesh adaptation may be organized using one of the following strategies.

Error-balancing strategy: Below, we will give an argument which indicates that an optimal mesh is characterized by equilibrated error indicators, such as
$$\eta_{T_i} \approx \frac{\mathrm{TOL}}{N}, \quad i = 1,\dots,N,$$
which implies $\eta(u_h) \approx \mathrm{TOL}$. However, the so-called error-balancing strategy based on this idea is implicit, as it involves the current number of mesh cells $N$, which is obtained only at the end of the adaptation cycle. To overcome this difficulty, we check, starting from $i = 1$, $j = 0$, whether
$$\eta_{T_i} \le \frac{\mathrm{TOL}}{N + 3j}.$$
If this is not satisfied, then the cell $T_i$ is refined, the counters $j$ and $i$ are increased by one, and one proceeds to the next smaller $\eta_{T_i}$. But if the condition is satisfied, then the new mesh $\mathbb{T}_h^{\mathrm{new}}$ is reached. This strategy is potentially optimal but involves many expensive checking operations and is therefore not really practicable.

Fixed-error-reduction or fixed-fraction strategies: For fractions $X, Y$ with $1 \ge X > Y$, determine indices $N^*, N_* \in \{1,\dots,N\}$, such that
$$\sum_{i=1}^{N^*} \eta_{T_i} \approx X\,\eta, \qquad \sum_{i=N_*}^{N} \eta_{T_i} \approx Y\,\eta.$$
Then, the cells $T_1,\dots,T_{N^*}$ are refined and the cells $T_{N_*},\dots,T_N$ are coarsened. Common choices are $X = 0.2$ and $Y = 0.1$. Alternatively, in the "fixed-rate" strategy, one refines $X\cdot N$ and coarsens $Y\cdot N$ cells with the largest and smallest error indicators, respectively. For appropriate choices of $X, Y$ this manages to keep the number of cells almost constant in the course of the mesh adaptation process. A sketch of the fixed-fraction marking step is given below.
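The fixed-fraction marking referred to above reduces to a few array operations; the fractions X = 0.2, Y = 0.1 are those of the text, while the indicator array itself is an assumed input.

```python
import numpy as np

# Fixed-fraction marking: refine the cells carrying the top fraction X of the total
# estimated error, coarsen those carrying at most the bottom fraction Y.
# 'eta' is a hypothetical array of nonnegative cell indicators.

def fixed_fraction_marking(eta, X=0.2, Y=0.1):
    order = np.argsort(eta)[::-1]             # cells sorted by decreasing indicator
    cum = np.cumsum(eta[order])
    total = cum[-1]
    n_ref = np.searchsorted(cum, X * total) + 1        # smallest N* with sum of top N* >= X*eta
    n_coarse = np.searchsorted(cum, (1.0 - Y) * total) # tail beyond this index carries <= Y*eta
    refine = order[:n_ref]
    coarsen = order[n_coarse + 1:]
    return refine, coarsen
```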

Look-ahead strategy: Let the current mesh $\mathbb{T}_h$ have $N$ cells. We denote by $\mathbb{T}_h^m$ the locally refined mesh resulting from $\mathbb{T}_h$ by refining the first $m$ cells (with largest error indicators). The corresponding number of cells (under isotropic bisection) becomes
$$N_m = N + m(2^d - 1).$$
The goal is now to determine $m$ in such a way that the refinement step leads to error reduction and improved equilibration of the error indicators, which is supposed to be a criterion for the quality of a mesh-size distribution. To this end, we further assume that the cellwise error indicators $\eta_{T_i}$ have a predictable behavior under cell refinement,
$$\eta_{T_i} \to 2^{-\alpha}\, \eta_{T_i},$$
with some $\alpha > 0$. For example, for $Q_1$-elements as considered here, we may assume $\alpha = 2$. Then, the error estimator $\eta_m$ on the refined mesh $\mathbb{T}_h^m$ takes the form
$$\eta_m := \eta - \beta \sum_{i=1}^m \eta_{T_i}, \qquad \beta := 1 - 2^{-\alpha}.$$
For the mesh $\mathbb{T}_h^m$, we consider the global mean mesh width $\bar{h} = N_m^{-1/d}$. Then, if the mesh $\mathbb{T}_h^m$ were equilibrated, there would hold
$$\eta_m = \sum_{i=1}^{N_m} \eta_{T_i}^m \approx \bar{h}^\alpha = N_m^{-\alpha/d}.$$
Hence, the constant $c_m := \eta_m N_m^{\alpha/d}$ may be considered as a measure for the deviation of $\mathbb{T}_h^m$ from being equilibrated, such that the optimal choice for $m$ would be
$$m^* := \arg\min_{1\le m\le N} c_m.$$
We remark that for a mesh $\mathbb{T}_h$ with equilibrated error indicators $\eta_T = \eta/N$ there holds $\eta \approx N^{-\alpha/d}$ and therefore
$$\eta_m = \eta(1 - \beta x), \quad x := \frac{m}{N}, \qquad
c_m = \eta(1-\beta x)\, N_m^{\alpha/d} = \eta(1-\beta x)\, N^{\alpha/d}\big(1 + (2^d-1)x\big)^{\alpha/d} \approx (1-\beta x)\big(1 + (2^d-1)x\big)^{\alpha/d}.$$
For $d = 2$ and $\alpha = 2$, this number $c_m$ attains its minimum in $(0,1]$ at $x = 1$. This leads to $m = N$, i.e., to global uniform refinement (once the error indicators are well balanced). A sketch of this selection of $m$ is given below.
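As an illustration, the choice of $m^*$ in the look-ahead strategy can be carried out with a few array operations; the prediction rule $\eta_{T_i} \to 2^{-\alpha}\eta_{T_i}$ and the formulas for $\eta_m$, $N_m$, $c_m$ are those given above, while the input array of indicators is assumed.

```python
import numpy as np

# Look-ahead strategy: predict eta_m and N_m for every candidate m and pick the m
# minimizing the equilibration measure c_m = eta_m * N_m^(alpha/d).
# 'eta' is a hypothetical array of cell indicators; alpha = 2 for Q1 elements, d = 2 in 2-D.

def look_ahead_m(eta, d=2, alpha=2.0):
    eta_sorted = np.sort(eta)[::-1]                    # eta_T1 >= eta_T2 >= ...
    N = eta_sorted.size
    beta = 1.0 - 2.0**(-alpha)
    m = np.arange(1, N + 1)
    eta_m = eta.sum() - beta * np.cumsum(eta_sorted)   # predicted estimator after refining m cells
    N_m = N + m * (2**d - 1)                           # predicted number of cells
    c_m = eta_m * N_m**(alpha / d)                     # deviation from equilibration
    return int(m[np.argmin(c_m)])                      # optimal number of cells to refine
```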

Mesh-optimization strategy: The information contained in the error representation (5.3.29) may be used directly to construct an optimal mesh on which the error tolerance $\eta \le \mathrm{TOL}$ is achieved, i.e., skipping the intermediate one-level refinement steps. Here, a mesh $\mathbb{T}_h = \{T\}$ is characterized by a continuous mesh-size function $h(x)$, with $h(x)|_T \approx h_T$. Further, it is assumed that the error estimator is related to a continuous limit of the form
$$\eta = \sum_{T\in\mathbb{T}_h} \eta_T \approx \int_\Omega h(x)^2\, \Phi(x)\,\mathrm{d}x =: \eta(h),$$
with $\Phi := (\Phi_1 + \Phi_2)\Phi_3$ a mesh-independent weighting function. We note that the latter requirement rules out cases in which the functional $J(\cdot)$, and hence also the dual solution, implicitly depends on the mesh size, as for example in the estimation of the error in the energy or the $L^2$ norm (cf. Section 5.2.2). The components of $\Phi$ are defined by the limiting behavior of residuals and weights for $\mathrm{TOL} \to 0$:
$$\max_{x\in T} |R_h| \to \max_{x\in T} \Phi_1, \qquad \tfrac{1}{2}\, h_T^{-1} \max_{x\in\partial T} |[\partial_n u_h]| \to \max_{x\in T} \Phi_2, \qquad (5.3.33)$$
$$h_T^{-2} \max_{x\in T} |z - I_h z| \to \max_{x\in T} \Phi_3. \qquad (5.3.34)$$
Here, the mesh-size power $h(x)^2$ is related to the order of the considered finite element ansatz. The justification of these assumptions will be discussed in the next section. The function $\Phi(x)$ may have strong (regularized) singularities, which will possibly require the mesh size $h(x)$ to decrease towards zero even for a tolerance $\mathrm{TOL} > 0$. Further, we introduce the mesh complexity
$$N = \sum_{T\in\mathbb{T}_h} h_T^d\, h_T^{-d} \approx \int_\Omega h(x)^{-d}\,\mathrm{d}x =: N(h),$$
which denotes the number of degrees of freedom associated with a mesh-size function $h(x)$. With this notation, we obtain the following result.

Proposition 5.2: The mesh-optimization problem
$$\eta(h) \to \min, \qquad N(h) \le N_{\max}, \qquad (5.3.35)$$
is solved by the mesh-size function
$$h_{\mathrm{opt}}(x) = \Big(\frac{W}{N_{\max}}\Big)^{1/d}\, \Phi(x)^{-1/(2+d)}, \qquad (5.3.36)$$
provided that
$$W := \int_\Omega \Phi(x)^{d/(2+d)}\,\mathrm{d}x < \infty.$$
For an optimal mesh, there hold
$$\mathrm{TOL} = W^{(2+d)/d}\, N^{-2/d} \qquad\text{and}\qquad N = \frac{W^{(2+d)/2}}{\mathrm{TOL}^{d/2}}. \qquad (5.3.37)$$

Proof. Following the classical Lagrange approach, we introduce the Lagrangian
$$L(h,\lambda) = \eta(h) + \lambda\,\{N(h) - N_{\max}\},$$

with Lagrange multiplier $\lambda \in \mathbb{R}$. Then, the optimal mesh-size function $h_{\mathrm{opt}}$ is characterized by the first-order optimality conditions
$$\frac{d}{dt} L(h+t\varphi,\lambda)\Big|_{t=0} = 0, \qquad \frac{d}{dt} L(h,\lambda+t\mu)\Big|_{t=0} = 0,$$
for all admissible variations $\varphi$ and $\mu$. This means that
$$2h(x)\Phi(x) - d\lambda\, h(x)^{-d-1} = 0, \qquad \int_\Omega h(x)^{-d}\,\mathrm{d}x - N_{\max} = 0,$$
and, consequently,
$$h(x) = \Big(\frac{d\lambda}{2\,\Phi(x)}\Big)^{1/(2+d)}, \qquad \lambda = \frac{2}{d}\, h(x)^{2+d}\, \Phi(x), \qquad
\int_\Omega \Big(\frac{2}{d\lambda}\Big)^{d/(2+d)} \Phi(x)^{d/(2+d)}\,\mathrm{d}x = N_{\max}.$$
From this, we deduce the desired relation
$$h_{\mathrm{opt}}(x) = \Big(\frac{W}{N_{\max}}\Big)^{1/d}\, \Phi(x)^{-1/(2+d)}.$$
From the formula for $h_{\mathrm{opt}}$, we conclude that on the optimal mesh there holds
$$\mathrm{TOL} = \Big(\frac{W}{N}\Big)^{2/d} \int_\Omega \Phi(x)^{-2/(2+d)}\, \Phi(x)\,\mathrm{d}x = W^{(2+d)/d}\, N^{-2/d},$$
which proves (5.3.37). Q.E.D.

Remark 5.12: We note that in two dimensions the optimal mesh complexity is $\mathrm{TOL} = O(N^{-1})$ or $N = O(\mathrm{TOL}^{-1})$ for linear or bilinear finite elements, provided that $\sup_{\mathrm{TOL}\to 0} W < \infty$ is satisfied. This is the case even for rather irregular functionals $J(\cdot)$. For example, the evaluation of $J(u) = \partial_i u(a)$ leads to $\Phi(x) \approx (|x-a|^2 + \mathrm{TOL}^2)^{-3/2}$ and, consequently,
$$\sup_{\mathrm{TOL}\to 0} W \approx \int_\Omega |x-a|^{-3/2}\,\mathrm{d}x < \infty.$$
In the case that $\sup_{\mathrm{TOL}\to 0} W = \infty$, as for example for $J(u) = \partial_i^2 u(a)$, the optimal mesh complexity becomes $\mathrm{TOL} = O(N^{\alpha-1})$ or $N = O(\mathrm{TOL}^{\alpha-1})$, with some $\alpha > 0$. In particular, for the latter functional, we easily find $\mathrm{TOL} = O(N^{-1}\log(N)^2)$.
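A direct numerical evaluation of Proposition 5.2 might look as follows: sample an assumed weight function $\Phi$ on a background grid, approximate $W = \int_\Omega \Phi^{d/(2+d)}\,\mathrm{d}x$ by a Riemann sum, and evaluate $h_{\mathrm{opt}}$ together with the predicted accuracy (5.3.37). The concrete $\Phi$, taken here of the regularized point-value type discussed in Remark 5.12, and all numbers are assumptions for illustration only.

```python
import numpy as np

# Evaluation of the optimal mesh-size function (5.3.36) and the relation (5.3.37)
# for an assumed weight Phi(x) ~ (|x|^2 + eps^2)^(-3/2) on Omega = (-1,1)^2.

d, N_max, eps = 2, 10_000, 1e-2
x = np.linspace(-1.0, 1.0, 401)
X, Y = np.meshgrid(x, x)
Phi = (X**2 + Y**2 + eps**2)**(-1.5)                   # regularized point-value type weight

dA = (x[1] - x[0])**2
W = np.sum(Phi**(d / (2.0 + d))) * dA                  # W = int_Omega Phi^{d/(2+d)} dx
h_opt = (W / N_max)**(1.0 / d) * Phi**(-1.0 / (2.0 + d))   # optimal mesh-size field (5.3.36)

TOL_pred = W**((2.0 + d) / d) * N_max**(-2.0 / d)      # predicted accuracy (5.3.37)
print(W, TOL_pred)
```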

Remark 5.13: The result of Proposition 5.2 implies that the optimal mesh-size distribution is characterized by the equilibration property
$$h(x)^{2+d}\,\Phi(x) \approx \Big(\frac{\mathrm{TOL}}{W}\Big)^{(2+d)/d} = \mathrm{const}. \qquad (5.3.38)$$
This justifies the strategy of equilibrating the local error indicators $\eta_T$,
$$\eta(h) = \int_\Omega h(x)^2\,\Phi(x)\,\mathrm{d}x \approx \sum_{T\in\mathbb{T}_h} h_T^{2+d}\,\Phi_T = \sum_{T\in\mathbb{T}_h} \eta_T,$$
as used in the error-balancing strategy. Let the weight function $\Phi(x)$ be bounded. Then, once a balanced mesh satisfying (5.3.38) is reached, a maximum increase of accuracy is achieved by uniform mesh refinement. If $\Phi$ is singular, then the optimal mesh size tends to zero at these singularities even for $\mathrm{TOL} > 0$.

Remark 5.14: Alternatively to (5.3.35), we can also consider the mesh-optimization problem
$$N(h) \to \min, \qquad \eta(h) \le \mathrm{TOL},$$
which has the solution (exercise)
$$h_{\mathrm{opt}}(x) = \Big(\frac{\mathrm{TOL}}{W}\Big)^{1/2}\, \Phi(x)^{-1/(2+d)}.$$

Remark 5.15: Although the mesh-optimization strategy seems very attractive, its realization involves several problems:

- The derivation of the formula for an optimal mesh-size distribution is based on the assumption that on the considered meshes the cell residuals behave in an optimal way under refinement, i.e., $\rho_T \sim h_T$. This is hard to prove and is not true in general (transport problems).
- The numerical approximation of the weighting function $\Phi(x)$ would need to provide more information than can be cheaply obtained using only information from the current mesh.
- The explicit formulas for $h_{\mathrm{opt}}(x)$ have to be used with care in designing a mesh, as their derivation implicitly assumes that they actually correspond to scalar mesh-size functions of isotropic meshes, a condition which, however, is not incorporated into the formulation of the mesh-optimization problems.

5.4 Balancing discretization and iteration errors

The a posteriori error analysis developed so far has always assumed the discrete solution $u_h$ to be known exactly. However, in practice this is not the case, as usually an iterative method is used to obtain $u_h$. Hence, we have to cope with an additional iteration error and its effect on the error quantity $J(e)$.

additional iteration error and its effect on the error quantity $J(e)$. On the other hand, even in the best situation the exact finite element solution approximates the continuous solution only up to discretization accuracy. It therefore seems natural to stop the iteration of the linear solver when the error due to the approximate solution of the discrete equations is comparable to the error due to the finite element discretization itself. To this end, we derive an a posteriori error estimator which assesses the influences of the discretization and of the inexact solution of the arising algebraic equations. This allows us to balance both sources of error. The material of this section is taken from Becker et al. [37] and Meidner et al. [53].

We consider again the elliptic model problem
\[
-\Delta u = f \ \text{ in }\Omega, \qquad u = 0 \ \text{ on }\partial\Omega, \tag{5.4.39}
\]
with a right-hand side $f\in L^2(\Omega)$ on a polygonal/polyhedral domain $\Omega\subset\mathbb{R}^d$ (d = 1, 2) and its discretization by low-order, conforming linear or $d$-linear finite elements. More general settings can be treated by analogous arguments. We continue using the notation of function spaces, scalar products and norms introduced above. For the exact discrete solution $u_h\in V_h\subset V := H^1_0(\Omega)$ of the finite element equations
\[
a(u_h,\varphi_h) := (\nabla u_h,\nabla\varphi_h) = (f,\varphi_h) \quad\forall\varphi_h\in V_h, \tag{5.4.40}
\]
and a certain (linear) output functional $J(\cdot)$ defined on $V$, we have obtained the a posteriori error representation
\[
J(e) = \rho(u_h)(z-\hat z_h), \tag{5.4.41}
\]
with the residual functional $\rho(u_h)(\cdot) := (f,\cdot) - a(u_h,\cdot)$ and the solution $z\in V$ of the associated dual problem
\[
a(\varphi,z) = J(\varphi) \quad\forall\varphi\in V. \tag{5.4.42}
\]
From this error identity, we have then derived a posteriori error estimates of the form
\[
|J(e)| \le \eta(u_h) := \sum_{T\in\mathbb{T}_h}\eta_T, \tag{5.4.43}
\]
with certain cellwise error indicators $\eta_T\ge 0$. Here, the subtraction of an arbitrary element $\hat z_h\in V_h$ is possible due to Galerkin orthogonality, $a(e,\psi_h) = 0$, $\psi_h\in V_h$.

Let us now assume that the discrete algebraic problem (5.4.40) is solved only approximately, e.g., by an iterative method such as the Gauß-Seidel (GS) method, the conjugate gradient (CG) method, or the multigrid (MG) method, yielding a $\tilde u_h\approx u_h$. In general this approximate solution $\tilde u_h$ does not satisfy Galerkin orthogonality. Our goal is now to

derive an a posteriori estimate for the total error $\tilde e := u - \tilde u_h$ of the form
\[
|J(\tilde e)| \le \eta_h + \eta_{it}, \tag{5.4.44}
\]
where $\eta_h$ assesses the error due to the finite element discretization and $\eta_{it}$ the error due to the inexact solution of the discrete problem, both being evaluated from the actually computed solution $\tilde u_h$.

5.4.1 Iterative solution methods

For the following, we provide some additional notation. The discrete problem (5.4.40) is equivalent to the linear system $A\xi = \beta$, where $u_h = \sum_{i=1}^N\xi_i\varphi_i$ with the usual nodal basis $\{\varphi_i,\ i = 1,\dots,N\}$ of $V_h$ and the coefficient vector $\xi = (\xi_i)_{i=1}^N$. Further, $A = (a_{ij})_{i,j=1}^N$ is the stiffness matrix with entries $a_{ji} = a(\varphi_i,\varphi_j)$ and $\beta = (b_j)_{j=1}^N$ the load vector with $b_j = (f,\varphi_j)$ representing the right-hand side. We introduce the usual $L^2$ projection $P_h: V\to V_h$ and the Ritz projection $Q_h: V\to V_h$, defined by
\[
(P_h u,\varphi_h) = (u,\varphi_h) \quad\forall\varphi_h\in V_h, \tag{5.4.45}
\]
\[
a(Q_h u,\varphi_h) = a(u,\varphi_h) \quad\forall\varphi_h\in V_h. \tag{5.4.46}
\]
Further, we define the discrete operator $A_h: V_h\to V_h$ by
\[
(A_h v_h,\varphi_h) = a(v_h,\varphi_h) \quad\forall v_h,\varphi_h\in V_h. \tag{5.4.47}
\]
With these notations, we can rewrite equation (5.4.40) equivalently in operator form,
\[
A_h u_h = P_h f. \tag{5.4.48}
\]
Within a nested mesh-adaptation and solution process, we will consider hierarchies of meshes $\mathbb{T}_l := \mathbb{T}_{h_l}$, $l = 0,\dots,L$, with mesh size parameters $h_l$. Accordingly, the notation $V_l := V_{h_l}$, $u_l := u_{h_l}$, $P_l := P_{h_l}$, $Q_l := Q_{h_l}$, and $A_l := A_{h_l}$ will be used.

First, we recall the definition of the multigrid (MG) method. We suppose that we are given a sequence of refined grids $\mathbb{T}_l$ with corresponding finite element spaces $V_l$. We assume the meshes $\mathbb{T}_l$ to become finer with increasing $l$ and the spaces $V_l$ to be nested, $V_l\subset V_{l+1}$. We denote by $S_l: V_l\to V_l$ the smoothing operator on level $l$ (a simple fixed-point iteration, e.g., the Jacobi or Gauß-Seidel method). The grid transfer operations are $p^{l+1}_l: V_l\to V_{l+1}$ (prolongation) and $r^{l-1}_l: V_l\to V_{l-1}$ (restriction). We aim at finding an approximation $\tilde u_L\in V_L$ on the finest mesh $\mathbb{T}_L$ to the solution $u_L\in V_L$ of the equation
\[
A_L u_L = f_L := P_L f \tag{5.4.49}
\]
using a multigrid algorithm based on the hierarchy of meshes $\mathbb{T}_l$, $l = 0,1,\dots,L$.

Starting with an initial guess $u^0_L$, the multigrid process produces a sequence of approximations $\tilde u_L = u^{t+1}_L$ via the procedure $u^{t+1}_L = MG(L,\gamma,u^t_L,f_L)$ described in the following algorithm.

Multigrid cycle $u^{t+1}_l = MG(l,\gamma,u^t_l,f_l)$

1. If $l = 0$, solve $A_0 u^{t+1}_0 = f_0$ exactly.
2. Pre-smoothing: $\bar u^t_l := S^\nu_l(u^t_l)$.
3. Form the residual $d^t_l := f_l - A_l\bar u^t_l$ and its restriction $\tilde d^t_{l-1} := r^{l-1}_l d^t_l$.
4. For $r = 1$ to $\gamma$ (usually $\gamma = 1$ or $\gamma = 2$) iterate $v^r_{l-1} := MG(l-1,\gamma,v^{r-1}_{l-1},\tilde d^t_{l-1})$, starting with $v^0_{l-1} := 0$.
5. Compute the correction $\bar u^t_l := \bar u^t_l + p^l_{l-1}v^\gamma_{l-1}$.
6. Post-smoothing: $u^{t+1}_l := S^\mu_l(\bar u^t_l)$.

The parameters $\nu$ and $\mu$ indicate, respectively, the number of pre- and post-smoothing steps on the different mesh levels. The structure of the multigrid algorithm is determined by the parameter $\gamma$. The case $\gamma = 1$ corresponds to the so-called V-cycle and $\gamma = 2$ to the W-cycle. For comparison, we also consider the Gauß-Seidel (GS) method for the nodal-value vector $\xi_L\in\mathbb{R}^{N_L}$ corresponding to the finite element solution $u_L\in V_L$,
\[
(L_L + D_L)\xi^t_L = \beta_L - R_L\xi^{t-1}_L, \quad t = 1,2,\dots, \qquad \xi^0_L = \xi_{L-1}, \tag{5.4.50}
\]
with the usual splitting $A_L = L_L + D_L + R_L$, or the conjugate gradient (CG) method (without preconditioning),
\[
\xi^t_L\in\mathbb{R}^{N_L}:\quad \|\beta_L - A_L\xi^t_L\|_{A_L^{-1}} = \min_{y_L\in K_t}\|\beta_L - A_L y_L\|_{A_L^{-1}}, \quad t = 1,2,\dots, \tag{5.4.51}
\]
with the Krylov spaces $K_t := \operatorname{span}\{\beta_L, A_L\beta_L,\dots,A_L^{t-1}\beta_L\}$, on the different mesh levels. All three iterative methods yield approximative discrete solutions on the finest mesh $\mathbb{T}_L$, which are denoted by $\tilde u_L\in V_L$.

5.4.2 Derivation of a combined error estimator

We consider the control of the error with respect to some quantity of interest $J(u)$, which is assumed to be given in terms of a linear, continuous functional $J: V\to\mathbb{R}$. To this end, we introduce again the associated continuous dual problem and its discrete analogue,
\[
a(\varphi,z) = J(\varphi) \quad\forall\varphi\in V, \tag{5.4.52}
\]
\[
a(\varphi_h,z_h) = J(\varphi_h) \quad\forall\varphi_h\in V_h. \tag{5.4.53}
\]
To derive an error estimator which includes the error due to the inexact solution of the discrete problems, we replace the exact discrete solution $u_L$ on the current finest mesh

45 5.4 Balancing discretization and iteration errors 41 T L = T hl in the a posteriori error identity (5.4.41) by the actually computed discrete solution ũ L. Due to the lacking Galerkin orthogonality this produces an additional term, which is used to control the accuray of the algebraic iteration. Proposition 5.3: i) Let u V be the solution of problem (5.4.39) and ũ L V L the approximative finite element solution of the discrete problem (5.4.49) on the finest mesh T L. Then, we have the following general representation for the error ẽ := u ũ L : J(ẽ) = ρ(ũ L )(z ẑ L )+ρ(ũ L )(ẑ L ), (5.4.54) where the residual functional is defined as above, ρ(ũ L )( ) := (f, ) a(ũ L, ), and ẑ L V L can be arbitrarily chosen, e.g., ẑ L = z L. ii) If the multigrid method has been used for computing ũ L the following refined representation holds for the iteration residual: ρ(ũ L )(ẑ L ) = L (R j (ũ L ),ẑ l ẑ l 1 )+(R 0 (ũ L ),ẑ 0 ), (5.4.55) l=1 where R l (ũ L ) := P l (f L A L ũ L ). Here, ẑ l V l (l = 0,1...,L) can be chosen arbitrarily, e.g., ẑ l = z l. Proof. i) We again consider the continuous dual problem (5.4.52). There holds J(ẽ) = a(ẽ,z) = a(ẽ,z ẑ L )+a(ẽ,ẑ L ) = (f,z ẑ L ) a(ũ L,z ẑ L )+(f,ẑ L ) a(ũ L,ẑ L ) = ρ(ũ L )(z ẑ L )+ρ(ũ L )(ẑ L ). (5.4.56) This is the asserted error representation (5.4.54). ii) The first term on the right-hand side of (5.4.54) corresponds to the discretization error. The second term corresponding to the multigrid error can be rewritten in the form ρ(ũ L )(ẑ L ) = L { (f,ẑl ẑ l 1 ) a(ũ L,ẑ l ẑ l 1 ) } + { (f,ẑ 0 ) a(ũ L,ẑ 0 ) }. (5.4.57) l=1 Since V l V L for l L, we observe by the definitions of Q l, P l, and A l that for ϕ l V l there holds (f,ϕ l ) a(ũ L,ϕ l ) = (P l f,ϕ l ) (A l Q l ũ L,ϕ l ) Further, by means of the identity A l Q l = P l A L on V l for l L, we have (P l f,ϕ l ) (A l Q l ũ L,ϕ l ) = (P l (f A L ũ L ),ϕ l ) = (R l (ũ L ),ϕ l ). Using these identities for ϕ l = ẑ l ẑ l 1 and ϕ 0 = ẑ 0 in (5.4.57) completes the proof. Q.E.D.
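To make the interplay between the multigrid cycle above and the level residuals $R_l(\tilde v_l)$ used in (5.4.55) concrete, the following minimal NumPy sketch implements a V-cycle recursion for a 1D model problem and records, on each level, the residual formed before the coarse-grid correction. It is an illustrative simplification (finite-difference style grids, damped Jacobi smoothing, an arbitrary right-hand side); all function names and parameters are our own choices and are not taken from the lecture notes.

\begin{verbatim}
import numpy as np

def poisson_matrix(n):
    """1D stiffness matrix for -u'' on n interior nodes, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1)) / h

def restrict(r):
    """Full weighting: fine residual (n = 2m+1 nodes) -> coarse (m nodes)."""
    return 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])

def prolong(v, n_fine):
    """Linear interpolation: coarse correction -> fine grid with n_fine nodes."""
    w = np.zeros(n_fine)
    w[1:-1:2] = v                       # coarse nodes coincide with odd fine nodes
    w[0:-2:2] += 0.5 * v
    w[2::2] += 0.5 * v
    return w

def v_cycle(u, f, level, residuals, nu=2, omega=0.6):
    """One V-cycle (gamma = 1); records the pre-correction residual of each level."""
    n = len(u)
    A = poisson_matrix(n)
    if level == 0:                      # coarsest level: solve exactly
        return np.linalg.solve(A, f)
    for _ in range(nu):                 # pre-smoothing (damped Jacobi)
        u += omega * (f - A @ u) / np.diag(A)
    d = f - A @ u                       # residual d_l, cf. step 3 of the MG cycle
    residuals[level] = d.copy()         # level contribution, analogous to R_l(v_l)
    v = v_cycle(np.zeros((n - 1) // 2), restrict(d), level - 1, residuals, nu, omega)
    u += prolong(v, n)                  # coarse-grid correction
    for _ in range(nu):                 # post-smoothing
        u += omega * (f - A @ u) / np.diag(A)
    return u

# toy run: -u'' = 1 on (0,1), u(0)=u(1)=0, four levels (right-hand side arbitrary)
L = 3
n = 2**(L + 2) - 1
f = np.ones(n)
u, res = np.zeros(n), {}
for t in range(5):
    u = v_cycle(u, f, L, res)
print("fine-grid residual norm:", np.linalg.norm(f - poisson_matrix(n) @ u))
\end{verbatim}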

46 42 Adaptivity The error representation (5.4.54) can be used for approximative solutions ũ L obtained by any solution process. Below, we will describe how this can be used for designing automatic stopping criteria for the iteration on the finest mesh level depending on the actual discretization error. We emphasize that the choice of the dual weights in the iteration error representation depends on the weights in the discretization error. Further, carrying the algebraic iteration to convergence, we have and therefore, ρ(u t L)(ẑ L ) ρ(u L )(ẑ L ) = 0 (t ), ρ(u t L)(z ẑ L )+ρ(u t L)(ẑ L ) ρ(u L )(z ẑ L ) (t ). This means that for ρ(ũ L )(ẑ L ) sufficiently small the remaining term ρ(ũ L )(z ẑ L ) can be safely used for estimating the error term J(ẽ). Remark 5.16: The result (ii) of Proposition 5.3 does not depend on the special form of the multigrid cycle, i. e., V-, W- or F-cycles are allowed. Moreover, there is no restriction concerning the application of pre/post-smoothing and the choice of the smoother. The error representation (5.4.55) for the multigrid method exploits the structure of this optimal iteration method. This allows us also to tune the smoothing iteration on the different mesh levels in order to get an easier balancing with the discretization error. In the case of canonically chosen grid transfer operations, p l+1 l = id, r l 1 l = P l 1, and if only pre-smoothing is used, the discrete residuals R l (ũ L ) can be identified with the iteration residuals obtained in the course of the correction process from the equation A l v l = P l d l+1, R l (ũ L ) = P l (f L A L ũ L ) = P l f L P l A L S ν L (ũ0 L ) P la L p L L 1ṽL 1 = P l (d L A L 1 ṽ L 1 ) = P l d L P l A L 1 S ν L 1 (ṽ L 1) P l A L 1 p L 1 L 2ṽL 2. = P l (d l+2 A l ṽ l+1 ) = P l d l+2 P l A l+1 S ν l+1(ṽ l+1 ) P l A l+1 p l+1 l ṽ l = P l (d l+1 A l ṽ l ) = R l (ṽ l ). (5.4.58) This shows that the discrete residuals can be evaluated on the intermediate grid levels T l without explicitly referring to the fine-grid solution ũ L. Combining this with the error representation in Proposition 5.3 yields the following:

Corollary 5.1: Assume the grid transfer operations in the multigrid algorithm are chosen canonically and the multigrid residual $R_0(v_0)$ on the coarsest level vanishes. Then, under the conditions of Proposition 5.3, the following error representation holds:
\[
J(\tilde e) = \rho(\tilde u_L)(z-\hat z_L) + \sum_{l=1}^L(R_l(\tilde v_l),\hat z_l-\hat z_{l-1}), \tag{5.4.59}
\]
where $\hat z_l\in V_l$ can be chosen arbitrarily.

Remark 5.17: From the a posteriori error representation stated in Proposition 5.3, we can in the usual way derive combined a posteriori error estimates with respect to the energy and the $L^2$ norm (exercise):
\[
\|\tilde e\|_E \le c_S c_I\Big(\sum_{T\in\mathbb{T}_L} h_T^2\,\rho_T(\tilde u_L)^2\Big)^{1/2} + c_S c_I\,L\max_{l=1,\dots,L}\Big(\sum_{T\in\mathbb{T}_{l-1}} h_T^2\,\|R_l(\tilde v_l)\|_T^2\Big)^{1/2}, \tag{5.4.60}
\]
\[
\|\tilde e\| \le c_S c_I\Big(\sum_{T\in\mathbb{T}_L} h_T^4\,\rho_T(\tilde u_L)^2\Big)^{1/2} + c_S c_I\,L\max_{l=1,\dots,L}\Big(\sum_{T\in\mathbb{T}_{l-1}} h_T^4\,\|R_l(\tilde v_l)\|_T^2\Big)^{1/2}, \tag{5.4.61}
\]
with the cell error indicators $\rho_T(\tilde u_L)$ and the iteration residuals $R_l(\tilde v_l)$. The factor $L$ in the iteration-error estimator can be reduced by exploiting that, depending on the refinement strategy used, a large percentage (usually more than 70%) of the cells remain unchanged in each refinement cycle. Hence, on the majority of cells the differences $\hat z_l-\hat z_{l-1}$ can be made to vanish.

5.4.3 Evaluation of the a posteriori error estimators

The error representation (5.4.54) involves the dual solution $z\in V$ and possibly its Ritz projection $z_h = Q_h z\in V_h$. The difference $z - Q_h z$ may again be approximated by local cellwise higher-order interpolation as described above,
\[
(z - z_h)\big|_T \approx \big(I^{(2)}_{2h}z_h - z_h\big)\big|_T, \quad T\in\mathbb{T}_h.
\]
For this, we need to solve the discrete dual problem (5.4.53) on the current mesh. This may be done employing a multigrid method on the same hierarchy of meshes as used for solving the primal problem. To be more precise, we carry out one multigrid iteration for the primal problem and then one multigrid iteration for the dual problem. With these computed solutions, we evaluate the error estimators. This alternating solving is continued until the stopping criterion for the multigrid iteration is fulfilled. For solving the discrete dual problem, there is no need to assemble a new stiffness matrix; the matrix corresponding to the primal problem can simply be transposed. The estimator for the iteration error does not require any approximation as it only

involves available discrete quantities, i.e., for a general iteration method:
\[
\eta^{(1)}_{it} := (f,\tilde z_L) - a(\tilde u_L,\tilde z_L), \tag{5.4.62}
\]
where $\tilde z_L$ is the approximate solution of the dual problem on the finest mesh $\mathbb{T}_L$. In the special case of the multigrid method, the corresponding refined error estimator (5.4.59) requires the approximations $\hat z_l$ and $\hat z_{l-1}$, which are defined on the different grid levels $\mathbb{T}_l$ and $\mathbb{T}_{l-1}$, respectively. One possibility is to store the calculated dual solutions on each level. Let us assume that we want to calculate $\hat z_l-\hat z_{l-1}$ for one fixed $l$. Then, we prolongate the approximate dual solution $\tilde z_{l-1}$ from the coarser level to the finer level $l$. Since we have nested finite element spaces $V_l\subset V_{l+1}$, we use the canonical embedding $p^l_{l-1} = id$ as the prolongation operator in the multigrid method. By this means, we approximate $\hat z_l-\hat z_{l-1}\approx\tilde z_l - p^l_{l-1}\tilde z_{l-1}$ and obtain the following approximate estimator for the multigrid error:
\[
\eta^{(2)}_{it} := \sum_{l=1}^L\big(R_l(\tilde v_l),\ \tilde z_l - p^l_{l-1}\tilde z_{l-1}\big), \tag{5.4.63}
\]
where $\tilde v_l$ is the approximative solution of the defect equation on mesh $\mathbb{T}_l$.

There is still an alternative way of estimating the multigrid error using the identity (5.4.59). The computed dual solution $\tilde z_L$ on the finest mesh level is restricted to the lower mesh levels. Defining the functions $r^l_L\tilde z_L := r^l_{l+1}r^{l+1}_{l+2}\cdots r^{L-1}_L\tilde z_L$ for $0\le l < L$, the dual weights are approximated like $\hat z_l-\hat z_{l-1}\approx r^l_L\tilde z_L - p^l_{l-1}r^{l-1}_L\tilde z_L$. On the finest level, we compute the difference $\tilde z_L - p^L_{L-1}r^{L-1}_L\tilde z_L$. Thus, we obtain the following a posteriori error estimator for the multigrid error:
\[
\eta^{(3)}_{it} := \sum_{l=1}^{L-1}\big(R_l(\tilde v_l),\ r^l_L\tilde z_L - p^l_{l-1}r^{l-1}_L\tilde z_L\big) + \big(R_L(\tilde v_L),\ \tilde z_L - p^L_{L-1}r^{L-1}_L\tilde z_L\big). \tag{5.4.64}
\]
In numerical experiments it has been observed that all three iteration error estimators $\eta^{(i)}_{it}$, $i = 1,2,3$, are equally efficient. Therefore, in all the numerical tests shown below involving the multigrid algorithm, only the iteration error estimator $\eta^{(2)}_{it}$ is used, which also allows us to adapt the smoothing iteration on the different mesh levels.

5.4.4 Nested adaptive algorithm

We propose an adaptive algorithm in which the discretization and multigrid errors are balanced. That is, we carry out the multigrid iteration until the following relation holds: $\eta_{it}\approx\eta_h$.

Moreover, we may use the additional information provided by the multigrid error estimator $\eta^{(2)}_{it}$ and allow the number of smoothing steps to vary over the different mesh levels in order to reduce the amount of work. In the following, we denote by $\nu_l$ and $\mu_l$ the number of pre- and post-smoothing steps, respectively, on mesh level $l$ in the multigrid method. On the levels with the biggest error contribution, we perform four pre- and post-smoothing steps, while only one pre- and post-smoothing step is used otherwise. Assuming that we want to compute the value of the quantity of interest up to a given accuracy $TOL$, we use the adaptive process described in the following.

Nested adaptive multigrid algorithm

1. Initialization: Choose an initial discretization $\mathbb{T}_0 := \mathbb{T}_{h_0}$ and set $l = 0$.
2. Multigrid iteration: For fixed $l\ge 1$ set $t = 0$.
3. If $t\le 1$ set $\nu_j = 1$, $\mu_j = 1$ for $j = 0,\dots,l$.
4. Apply one multigrid cycle to the problem $A_l u_l = f_l$ and increment $t\to t+1$.
5. Error estimation: Evaluate the estimators $\eta_l$ and $\eta^{(2)}_{it,l}$.
6. According to the error indicators on the different levels, $(R_j(\tilde v_j),\ \tilde z_j - p^j_{j-1}\tilde z_{j-1})$, determine the subset of levels $I = \{i_1,\dots,i_l\}$ with the biggest contribution to the error estimator and increase the number of smoothing steps by setting $\nu_{i_j} = 4$, $\mu_{i_j} = 4$.
7. If $\eta_{it,l}\le\kappa\,\eta_l$, then accept $u^t_l$ as approximation. Otherwise increment $t\to t+1$ and go to (3).
8. If $\eta_l + \eta_{it,l}\le TOL$, then stop.
9. Mesh adaptation: Refine the mesh $\mathbb{T}_l\to\mathbb{T}_{l+1}$ according to the error indicators $\eta_l$.
10. Interpolate the previous solution $\tilde u_l$ to the new mesh $\mathbb{T}_{l+1}$.
11. Increment $l\to l+1$ and go to (2).

To start the multigrid algorithm, we use the values from the computation on the previous mesh level as starting data. This allows us to avoid unnecessary iterations on the current mesh. Furthermore, we use an equilibration factor $\kappa\in(0,1]$ for comparing the two estimators $\eta_{it}$ and $\eta_h$: $\eta_{it}\le\kappa\,\eta_h$. This also ensures that the local mesh refinement results from the value of the discretization error estimator. For the numerical test below, for safety, we have chosen the factor $\kappa = 0.1$. Selecting a smaller value does not improve the accuracy of the computed value very much but increases the number of multigrid iterations. A larger value, however, can affect the local mesh refinement. As described in the above nested algorithm, we have to evaluate the error estimators $\eta_h$ and $\eta_{it}$ in every multigrid step. In order to reduce the computational work, we propose the following strategy for the adaptive algorithm: After solving the discrete equations via

multigrid on the mesh $\mathbb{T}_{h_l}$, we save the value $\eta_{h_l}$ by setting $\eta_{old} := \eta_{h_l}$. On the next finer mesh $\mathbb{T}_{h_{l+1}}$, we do not evaluate the discretization error estimator $\eta_{h_{l+1}}$ until
\[
\eta_{it,l+1} \le \kappa\,\eta_{old}. \tag{5.4.65}
\]
Then, we save the new value $\eta_{h_{l+1}}$ in $\eta_{old}$. In the next multigrid iteration condition (5.4.65) is verified again. Thus, we reduce the number of evaluations of the discretization error estimator on each mesh. In the numerical tests presented below the error estimator $\eta_h$ has been evaluated at most twice on every mesh.

5.4.5 Numerical example

In the following, we demonstrate the efficiency and reliability of the proposed adaptive algorithm. We compare the adaptive multigrid method described above with a multigrid method employing a residual-based stopping criterion, as commonly used for iterative methods. To be more precise, we replace the stopping criterion in the multigrid solver by requiring that the initial multigrid residual be reduced by a prescribed factor. The discretization error estimator will still be used for the construction of locally refined meshes and for the error control. We denote this algorithm by MG I and the adaptive multigrid algorithm by MG II. Further, we show the results obtained by using the Gauß-Seidel and the CG method for computing the discrete solutions on the different mesh levels. In the tables below, the following notation is used. For the discretization error:
$e := u - u_h$: exact discretization error;
$J(e)$: exact functional discretization error;
$\eta_h$: estimator of the discretization error;
$I^h_{\rm eff} := \eta_h/J(e)$: effectivity index of the discretization error estimator.
For the iteration error:
$e_{it} := u_h - \tilde u_h$: exact iteration error;
$J(e_{it})$: exact functional iteration error;
$\eta^{(1)}_{it}$: estimator of the general iteration error;
$\eta^{(2)}_{it}$: first special estimator of the multigrid iteration error;
$\eta^{(3)}_{it}$: second special estimator of the multigrid iteration error;
$I^{it,l}_{\rm eff} := \eta^{(l)}_{it}/J(e_{it})$: effectivity index of the iteration error estimators;

and for the total error:
$\tilde e := u - \tilde u_h$: total error;
$I^{tot}_{\rm eff} := (\eta_h+\eta_{it})/J(\tilde e)$: effectivity index of the total error estimator.

In order to assess the exact iteration error $J(e_{it})$, we solve the discrete equations on each mesh additionally by the algorithm MG I, using the ILU iteration as smoother and requiring the multigrid residual to be reduced down to round-off error level. The exact value of $J(e)$ is obtained by using an approximation to $u$ on a very fine mesh. For our numerical test, we consider the Poisson problem
\[
-\Delta u = 1 \ \text{ in }\Omega, \qquad u = 0 \ \text{ on }\partial\Omega, \tag{5.4.66}
\]
on an L-shaped domain $\Omega\subset(-1,1)^2\subset\mathbb{R}^2$. As target functional $J(\cdot)$, we choose the point value $J(u) := u(a)$ with $a = (0.2,0.2)$. Since this functional is not defined on $V := H^1_0(\Omega)$, it has to be regularized,
\[
J(u) := |B_\varepsilon(a)|^{-1}\int_{B_\varepsilon(a)} u(x)\,dx = u(a) + O(\varepsilon^2),
\]
where $B_\varepsilon(a)$ is a ball around $a$ with radius $\varepsilon = TOL$. In this case, the resulting meshes are strongly refined towards the point $a$ and, less strongly, also close to the reentrant corner. We solve the discrete equations by the multigrid algorithm using a V-cycle and four ILU steps for pre- and post-smoothing, i.e., we set $\gamma = 1$ and $\nu = \mu = 4$ in the multigrid algorithm. In Tables 5.5 and 5.6, we show the convergence history of the two algorithms MG I and MG II. We see that on most adapted meshes, only two or three multigrid iterations are needed to reduce the iteration error well below the discretization error.

Table 5.5: Iteration with MG I (iteration towards round-off error level). Columns: $N$, #It, $J(\tilde e)$, $\eta_h+\eta^{(2)}_{it}$, $\eta_h$, $\eta^{(2)}_{it}$, $I^{tot}_{\rm eff}$. [Numerical entries not recoverable from the transcription.]

Table 5.6: Iteration with MG II (adaptive stopping criterion). Columns: $N$, #It, $J(\tilde e)$, $\eta_h+\eta^{(2)}_{it}$, $\eta_h$, $\eta^{(2)}_{it}$, $I^{tot}_{\rm eff}$. [Numerical entries not recoverable from the transcription.]

In this test, we have not exploited all details provided by the error estimator for the MG iteration, particularly the possibility of adaptively steering the number of smoothing steps on the different mesh levels. This is not so relevant on isotropic meshes as considered here, but it may become important in the case of strongly anisotropic meshes as used in resolving boundary layers. On such meshes the proper choice of smoothing is particularly critical for the efficiency of the multigrid solution. Finally, we consider the computation of the approximate solution $u_h$ on a fixed locally refined, but still rather coarse, mesh by the Gauß-Seidel (GS) and the CG method.
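To illustrate how the general iteration-error estimator $\eta^{(1)}_{it}$ from (5.4.62) can serve as a stopping criterion for such an algebraic iteration, the following sketch runs an unpreconditioned CG iteration on a small SPD system and stops as soon as the algebraic indicator $|\tilde z^\top(b - A\xi^t)|$, the matrix-vector analogue of $(f,\tilde z_L)-a(\tilde u_L,\tilde z_L)$, drops below a fraction $\kappa$ of a given discretization-error estimate. The system, the stand-in dual vector, and the value of eta_h are arbitrary illustrative choices, not data from the tables.

\begin{verbatim}
import numpy as np

def cg_goal_oriented(A, b, z_dual, eta_h, kappa=0.1, x0=None, max_it=200):
    """Unpreconditioned CG for A x = b (A SPD); stops when the algebraic error
    indicator |z_dual . (b - A x)| (discrete analogue of eta_it^(1)) drops
    below kappa * eta_h, i.e. when the iteration error is negligible
    compared to the given discretization error estimate."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    p = r.copy()
    for t in range(max_it):
        eta_it = abs(z_dual @ r)          # goal-oriented iteration indicator
        if eta_it <= kappa * eta_h:
            return x, t, eta_it
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x, max_it, abs(z_dual @ r)

# toy data: a small SPD system and a stand-in 'dual' vector (illustrative only)
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)             # SPD test matrix
b = rng.standard_normal(50)
z_dual = np.linalg.solve(A.T, rng.standard_normal(50))  # stand-in discrete dual solution
x, its, eta_it = cg_goal_oriented(A, b, z_dual, eta_h=1e-6)
print(f"stopped after {its} CG steps with |eta_it| = {eta_it:.2e}")
\end{verbatim}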

Table 5.7: Gauß-Seidel iteration on a locally refined mesh with 721 cells (starting value taken from the preceding mesh). Columns: It, $J(e)$, $\eta_h$, $I^h_{\rm eff}$, $J(e_{it})$, $\eta^{(1)}_{it}$, $I^{it,1}_{\rm eff}$, $\|u^{(t)}_L - u_L\|$. [Numerical entries not recoverable from the transcription.]

Table 5.8: CG iteration on a locally refined mesh with 721 cells (starting value taken from the preceding mesh). Columns: It, $J(e)$, $\eta_h$, $I^h_{\rm eff}$, $J(e_{it})$, $\eta^{(1)}_{it}$, $I^{it,1}_{\rm eff}$, $\|b_L - A_L\xi^{(t)}\|_{A^{-1}}$. [Numerical entries not recoverable from the transcription.]

54 50 Adaptivity 5.5 Attempts towards theoretical justification In the following, we discuss some questions concerning the theoretical justification of the DWR method for goal-oriented mesh adaptation presented so far. We will see that this task is rather demanding and poses several new questions for the theoretical analysis of the finite element method. In fact, relying on the available results from the literature, we do not reach very far, yet. Since several not very practical assumptions will be used, we dispense with stating formal propositions Complexity analysis As an illustrative example, let us consider the evaluation of the derivative of a smooth solution u C 2 ( Ω) at some point a Ω R 2. For this, we use the regularized output functional J ε (u) := B ε (a) 1 1 u(a)dx = 1 u(a)+o(ε 2 ), ε := TOL. B ε(a) The corresponding dual solution behaves like a regularized derivative Green function of the Laplacian (assuming that diam(ω) 1) k z(x) r(x) k 1, r(x) := ( x a 2 +ε 2 ) 1/2. Then, for bilinear elements, the corresponding a posteriori error estimate J ε (e) T T h ρ T ω T, ω T = ( ) 1/2 z I h z 2 T +h T z I h z 2 T ci h 2 T 2 z T, takes the form J ε (e) η ω := T T h h 3 T r 3 T ρ T, r T := max x T r(x). Now, let us assume that the local residuals are related to the local mesh size like ρ T h T, (5.5.67) uniformly for every T T h and h > 0. This condition may be checked a posteriori in the course of the mesh adaptation process. Then, we obtain η ω h 4 T. (5.5.68) r 3 T T h T

In Section 5.3.2, we have seen that the optimal mesh for prescribed accuracy $TOL$ is characterized by the equilibration property
\[
\eta_T = h_T^4\,r_T^{-3} \approx \frac{TOL}{N}, \qquad |J_\varepsilon(e)| \approx \sum_{T\in\mathbb{T}_h}\eta_T \approx N\,\frac{TOL}{N} \approx TOL.
\]
From this, we derive
\[
h_T^2 \approx r_T^{3/2}\Big(\frac{TOL}{N}\Big)^{1/2},
\]
and consequently,
\[
N = \sum_{T\in\mathbb{T}_h} h_T^{-2}\,h_T^2 \approx \Big(\frac{N}{TOL}\Big)^{1/2}\sum_{T\in\mathbb{T}_h} h_T^2\,r_T^{-3/2} \approx \Big(\frac{N}{TOL}\Big)^{1/2}.
\]
This implies that (for any choice of the regularization parameter $\varepsilon > 0$)
\[
N \approx \frac{1}{TOL}, \tag{5.5.69}
\]
which is better than the $N\approx TOL^{-2}$ that could be achieved on uniformly refined meshes on the basis of the general a priori convergence estimate $\partial_1 e(a) = O(h)$. This predicted asymptotic behavior is well confirmed by the results shown in Table 5.9. We emphasize that in this example strong mesh refinement occurs, although the solution is smooth. In fact, this phenomenon should rather be interpreted as mesh coarsening away from the point of evaluation. Further, observing that $r_{\min}\approx TOL$ and $r_{\max}\approx 1$, for the optimized mesh there holds
\[
h_{\min} \approx \varepsilon^{3/4}\,TOL^{1/2} = TOL^{5/4}, \qquad h_{\max}\approx TOL^{1/2}.
\]
This means that, starting from a coarse initial mesh $\mathbb{T}_0$ with mesh width $h_0\approx 1$ and performing mesh refinement by cell bisection, i.e., $h_{\min}\approx 2^{-L}h_0$,
\[
L \approx \tfrac54\,|\log_2(TOL)| \tag{5.5.70}
\]
refinement cycles are needed to reach the optimal mesh. Notice that after each refinement cycle usually a new solution has to be computed. Therefore, it is desirable to keep $L$ as small as possible. For other choices of the regularization parameter $\varepsilon$, e.g., $\varepsilon = TOL^{1/2}$ or $\varepsilon = h_{\min}$, we obtain different values for $L$, which are more or less favorable (exercise).
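As a quick back-of-the-envelope illustration of these scaling relations, the following lines tabulate the predicted number of refinement cycles $L\approx\tfrac54\log_2(1/TOL)$ and the predicted complexities $N\sim TOL^{-1}$ (adaptive) versus $N\sim TOL^{-2}$ (uniform) for a few tolerances; the proportionality constants are unknown and simply suppressed here.

\begin{verbatim}
import math

# Predicted refinement depth and complexity scaling from the analysis above:
#   L ~ (5/4) * log2(1/TOL),  N_adaptive ~ C/TOL,  N_uniform ~ C'/TOL^2.
# The constants C, C' are unknown; only the scaling is tabulated (set to 1).
for tol in [1e-2, 1e-3, 1e-4, 1e-5]:
    L = math.ceil(1.25 * math.log2(1.0 / tol))
    print(f"TOL={tol:.0e}:  L={L:2d},  N_adaptive ~ {1.0/tol:.0e},"
          f"  N_uniform ~ {1.0/tol**2:.0e}")
\end{verbatim}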

56 52 Adaptivity Table 5.9: Computing 1 u(0) using the estimator η ω (L refinement levels). TOL N L J ε (e) η ω Remark 5.18: An alternative, more explicit strategy for mesh adaptation may be based on the balancing condition h T ρ T r 3 T TOL Ω J ε (e) T T h h 2 T TOL Ω TOL. Here, the complexity analysis gives us (assuming again that ρ T h T ) h T ( TOL ) 1/2r 3/2 T Ω, and, consequently, N = T T h h 2 T h 2 T Ω TOL h2 T r T T T 3 h Ω TOL 2, which shows that this approach is not efficient for very singular error functionals Goal-oriented versus energy-norm error estimation As already mentioned above the energy-norm error estimator can be proven to be reliable as well as efficient, c 1 e E η E c 2 e E. (5.5.71) It can further be shown that in certain situations (essentially linear finite elements for Poisson-like problems), particularly for a certain refinement strategy, we have strict error reduction under refinement, e l+1 E κ e l E, l = 0,1,..., (5.5.72) with a mesh-independent reduction factor κ (0, 1). This implies convergence of the adaptive algorithm(see Dörfler [46]). Under similar conditions even optimal complexity can beshown for theadaptive algorithm, i.e., whenever, forsome s > 0, the solution(and

57 5.5 Attempts towards theoretical justification 53 the right hand side) can be approximated within a tolerance ε > 0 in energy norm by linear finite elements (piecewise constant functions) on some mesh with N = O(ε 1/s ), then the adaptive method produces approximations that converge with this rate, taking a number of operations of the order O(N) (see Stevenson [69]). Such rigorous results are not available for general goal-oriented adaptive algorithms. There has been an attempt in this direction based on the simple estimate J(e) = a(e,z) = a(e,e ) e E e E. (5.5.73) where e := z z h (see Mommer&Stevenson [54]). However, the relevance of this approach is rather questionable as the multiplicative splitting in (5.5.73) is not appropriate for very local goal-functionals (such as considered above). Here, a splitting like J(e) e L e L 1 would be preferable. Further, recalling that energy-norm error estimation is the same as estimation of the scalar energy-error, e 2 E = u 2 E u h 2 E, (5.5.74) suggests that control of the energy-norm error does not provide sufficient information for controling the error in an arbitrary goal quantity, J(e) ( u 2 E u h 2 E) 1/2 ( z 2 E z h 2 E ) 1/ Convergence of residuals The above example illustrates that the assumed asymptotic relation (5.5.67), ρ T h T, for the residuals ρ T is crucial for deriving optimal complexity results. For analyzing this question in the case of (bi)-linear finite elements on triangulations or Cartesian quadrilateral meshes, it suffices to consider the edge-residual part [ n u h ] T, since on Cartesian meshes in 2D the cell-residual term automatically satisfies for bounded f. Now, notice that h 1/2 T [ n u h ] T = h 1/2 f+ u h T = f T c(f)h T, (5.5.75) T h T h 1 T [ nu h ] T = h T D 2 h u h T, where D 2 h u h T is a second-order difference quotient of u h on the cell-patch T containing T and its neighbors. Hence, in order to establish the bound (5.5.67), we have to seek for an estimate of the form supmax h>0 x Ω D2 h u h c(u), (5.5.76) where the constant c(u) depends on bounds for higher-order derivatives of the solution u. For proving (5.5.76), one may use the local inverse relation, q T c r h 1 T q T, q P r (T),

58 54 Adaptivity and the natural nodal interpolation I h u V h, Dhu 2 h T h 1 T D2 hu h T h 1 T D2 h (u h I h u) T +h 1 T D2 h I hu T ch 2 T e T +ch 2 T (u I hu) T +h 1 T D2 h I hu T ch 2 T e T +ch 1 T 2 u T c 2 u, where again T denotes a cell-patch neighborhood of T. Unfortunately, this argument only works on quasi-uniform meshes, for which h max /h min c is assumed, since the local error estimate e ;T h T c(u) does not hold in this strong form on meshes with h max /h min. Alternatives may be found by adopting weighted-norm techniques from the classical pointwise error analysis of finite element approximation. However, most of these results also require the mesh to be quasi-uniform. Hence, though supported by some numerical evidence, this problem must be left open Approximation of weights Next, we analyze the effect of approximating the dual solution z in the weights ω T, on the accuracy of the error estimate. For simplicity, we again restrict our attention to the two-dimensional case. In the following, we will examine two methods of approximating z: i) Approximation by a higher-order method. First, we consider the approximation of the dual solution z by its Ritz projection z (2) h into the space V (2) h of biquadratic finite elements, ( ϕ h, z (2) h ) = J(ϕ h) ϕ h V (2) h. The resulting approximate error representation then reads Ê(u h ) := T T h { (Rh,z (2) h I hz (2) h ) T +(r h,z (2) h I hz (2) h ) T}. Its difference to the exact error representation E(u h ) can be written in the form E(u h ) Ê(u h) = ρ(u h )(ê I h ê ) = ( e, (ê I h ê )), with the abbreviation ê := z z (2) h. For estimating this error, we recall the well-known a priori error estimate for biquadratic finite elements: ( T T h k ê 2 T ) 1/2 ch 3 k 3 z, k = 0,1,2.

59 5.5 Attempts towards theoretical justification 55 Using this, we can then estimate as follows: E(u h ) Ê(u h) e (ê I h ê ( ) 1/2 ch 2 u h 2 T 2 ê 2 T ch 3 2 u 3 z. T T h (5.5.77) This estimate is unsatisfactory as it requires the primal as well as the dual solution to be smooth, which rules out most interesting applications. However, this objection can be somewhat weakened by the following modification of the argument: E(u h ) Ê(u h) = ( e, ê ) = ( (u I h u), ê ) ( ) 1/2 ch 2 h 2 T 2 u 2 T 3 z, T T h (5.5.78) or, setting ê := u u (2) h the biquadratic Ritz projection error, E(u h ) Ê(u h) = ( e, ê ) = ( ê, (z I h z)) ( ) 1/2. ch 2 3 u h 2 T 2 z 2 (5.5.79) T T T h Here only smoothness of one of the solutions u and z is required and singularities in the other one can be compensated by proper mesh refinement. Though the estimates (5.5.78) and (5.5.79) are better than (5.5.77), they still do not cover point-error evaluation since in this case h 2 T 2 z 2 T h 4 Tr 4 T O(1), T T h T T h and generally h 2 is not smaller than TOL. To get a better result, we may use an L -L 1 - duality argument as follows: E(u h ) Ê(u h) = ( e, ê ) = ( (u I (2) h u), ê )) cmax T {h2 T 3 u ;T } ê dx ch log(h min ) max T {h2 T 3 u ;T }. Ω (5.5.80) The L 1 -error estimate for the regularized Green function, ê dx ch log(h min ), Ω used in deriving (5.5.80) can be proven also on certain locally refined meshes (polynomial size-regular). The estimates (5.5.77) (5.5.80) are useful provided that h 3 TOL on the current mesh. According to the discussion at the beginning, this is satisfied even in the case of derivative point value evaluation with TOL h 2. However, since computing the dual solution by higher-order elements is too expensive in most practical situations, we

60 56 Adaptivity will not pursue this discussion further. ii) Approximation by higher-order interpolation. Next, we consider the theoretical justification of the practically more relevant approximate error estimator Ẽ(u h ) := T T h { (Rh,I (2) 2h z h z h ) T +(r h,i (2) 2h z h z h ) T }, where I (2) 2h z h is the patch-wise biquadratic interpolation of the bilinear Ritz projection z h as defined above. This raises the question: Why should I (2) 2h z h be a better approximation to z than z h? In fact, the construction of I (2) 2h z h is based on nodal point information of z h, and the point error (z z h )(a) behaves generally not better then O(h 2 ), even on uniform meshes. Hence, it seems unlikely that z I (2) 2h z h T z z h T. However, this is not the right point of view. We could also seek to prove the weaker relation ρ(u h )(z I (2) 2h z h) ρ(u h )(z z h ), which has the flavor of a global super-approximation property. Therefore, it will probably depend on some uniformity property of the mesh. In order to pursue this thought further, we rewrite the error identity J(e) = ρ(u h )(z z h ) in the form J(e) = ρ(u h )(z I (2) 2h z)+ρ(u h)(i (2) 2h z I(2) 2h z h)+ρ(u h )(I (2) 2h z h z h ), (5.5.81) where the last term on the right is just our approximate error estimator. The first and second terms will beestimated separately. To thisend, we have to assume thatthemeshes have already been optimized such that ρ T ch T. By the approximation properties of the interpolant I (2) 2h z, ( (2) z I 2h z 2 T + 1h 2 T z I (2) ) 1/2 2h z 2 (2) T c I h 3 T 3 z S(T), where S(T) is the cell-patch on which I (2) 2h z is defined, we obtain for the first term in (5.5.81): ( ρ(u h )(z I (2) 2h z) c ) 1/2 3 z. T T h h 6 T ρ2 T Consequently, using (5.5.67), ρ T h T, we arrive at the estimate ρ(u h )(z I (2) 2h z) c(u,z)h3. (5.5.82) The second term in (5.5.81) is the hard one, which requires more work, as it relates properties of the non-local Ritz projection and the local interpolation. Its estimation strongly relies on uniformity properties of the mesh T h. The idea is that the scaled error h 2 e is a smooth function, such that it can be approximated in V h. To make this concept clear, suppose that the mesh T h is uniform with mesh-width h. Then, it is

known that in the nodal points the error $z - z_h$ allows an asymptotic expansion in powers of $h$ such that (see Blum et al. [40])
\[
I_h z - z_h = I_h(z - z_h) = h^2 I_h w + h^3\tau_h,
\]
with some $h$-independent function $w\in H^1_0(\Omega)$ and a remainder satisfying $\|\tau_h\|\le c\,\|\nabla^3 z\|$. From this, observing that $I^{(2)}_{2h}z = I^{(2)}_{2h}I_h z$, we conclude
\[
\rho(u_h)(I^{(2)}_{2h}z - I^{(2)}_{2h}z_h) = h^2\rho(u_h)(I^{(2)}_{2h}w) + h^3\rho(u_h)(\tau_h),
\]
and, using Galerkin orthogonality,
\[
\rho(u_h)(I^{(2)}_{2h}z - I^{(2)}_{2h}z_h) = h^2\rho(u_h)(I^{(2)}_{2h}w - I_h w) + h^3\rho(u_h)(\tau_h).
\]
Assuming that the interpolation operators $I_h$ and $I^{(2)}_{2h}$ behave like the modified $H^1$-stable interpolation operators, we have
\[
\|I^{(2)}_{2h}w - I_h w\|_T + h^{1/2}\|I^{(2)}_{2h}w - I_h w\|_{\partial T} \le c\,h\,\|\nabla w\|_{\tilde T},
\]
with a cell-neighborhood $\tilde T$ of $T$. Collecting the foregoing estimates, we find
\[
|\rho(u_h)(I^{(2)}_{2h}z - I^{(2)}_{2h}z_h)| \le c(u,z)\,h^3. \tag{5.5.83}
\]
Finally, inserting the estimates (5.5.82) and (5.5.83) into (5.5.81), we conclude that
\[
J(e) = \tilde E(u_h) + O(h^3). \tag{5.5.84}
\]
We emphasize that this estimate has been derived for a smooth dual solution $z$, which excludes almost all interesting applications. Furthermore, the meshes are required to be uniform, which conflicts with our ultimate goal of mesh adaptation. Summarizing the discussion of this subsection, the fully rigorous theoretical justification of the DWR method is still a largely open problem.

5.6 Abstract framework of a posteriori error estimation

In this section, we present a very general approach to a posteriori error estimation for the Galerkin approximation of nonlinear variational problems. The framework is kept at an abstract level in order to allow later for a unified application to rather different situations, such as nonlinear PDEs, stationary as well as nonstationary, but also eigenvalue and optimization problems. Let $A(\cdot)(\cdot)$ be a semilinear form (linear in the second argument) and $J(\cdot)$ an output functional, not necessarily linear, defined on some function space $V$. The goal is the evaluation of $J(u)$ from the solution of the variational problem
\[
A(u)(\psi) = 0 \quad\forall\psi\in V. \tag{5.6.85}
\]

The corresponding Galerkin approximation uses finite dimensional subspaces $V_h\subset V$ to determine $u_h\in V_h$ by
\[
A(u_h)(\psi_h) = 0 \quad\forall\psi_h\in V_h. \tag{5.6.86}
\]
For this approximation there holds the general Galerkin orthogonality relation
\[
A(u)(\psi_h) - A(u_h)(\psi_h) = 0, \quad \psi_h\in V_h. \tag{5.6.87}
\]
We assume the existence of directional derivatives (Gateaux derivatives) of $A$ and $J$ up to order three, denoted by $A'(u)(\varphi,\cdot)$, $A''(u)(\psi,\varphi,\cdot)$, $A'''(u)(\xi,\psi,\varphi,\cdot)$, and $J'(u)(\varphi)$, $J''(u)(\psi,\varphi)$, $J'''(u)(\xi,\psi,\varphi)$, respectively, for increments $\varphi,\psi,\xi\in V$, e.g.,
\[
A'(u)(\varphi,\psi) := \lim_{t\to 0}\frac1t\big\{A(u+t\varphi)(\psi) - A(u)(\psi)\big\}.
\]
In these forms the dependence on the first argument in parentheses may be nonlinear, while the dependence on all further arguments in the second set of parentheses is linear.

Example 5.7: A typical example of a nonlinear problem of the form we are interested in is the so-called vector Burgers equation
\[
-\nu\Delta u + u\cdot\nabla u = f \ \text{ in }\Omega, \qquad u|_{\partial\Omega} = 0,
\]
for a vector function $u\in V := H^1_0(\Omega)^d$. This is the natural generalization of the classical 1-d Burgers equation to multiple dimensions. The corresponding variational formulation has the form (5.6.85) with the semilinear form
\[
A(u)(\psi) := \nu(\nabla u,\nabla\psi) + (u\cdot\nabla u,\psi) - (f,\psi).
\]

Example 5.8: An example with stronger nonlinearity is the quasi-linear p-Laplace-like problem
\[
-\nabla\cdot\Big(\frac{\nabla u}{(1+|\nabla u|)^{p-1}}\Big) = f \ \text{ in }\Omega, \qquad u|_{\partial\Omega} = 0,
\]
for a scalar function $u\in V := H^1_0(\Omega)\cap W^{1,\infty}(\Omega)$. In this case the corresponding semilinear form is given by
\[
A(u)(\psi) := \big((1+|\nabla u|)^{1-p}\nabla u,\nabla\psi\big) - (f,\psi).
\]

Example 5.9: An example with weaker nonlinearity is the diffusion-reaction problem
\[
-\Delta u - u^3 = f \ \text{ in }\Omega, \qquad u|_{\partial\Omega} = 0,
\]
for a scalar function $u\in V := H^1_0(\Omega)$. This problem is interesting in the context of bifurcation theory. In this case the corresponding semilinear form is given by
\[
A(u)(\psi) := (\nabla u,\nabla\psi) - (u^3,\psi) - (f,\psi).
\]
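As a small sanity check of the directional-derivative definition above, the following sketch discretizes the diffusion-reaction form of Example 5.9 on a uniform 1D grid by simple quadrature and compares the difference quotient $t^{-1}\{A(u+t\varphi)(\psi)-A(u)(\psi)\}$ with the analytic derivative $A'(u)(\varphi,\psi) = (\nabla\varphi,\nabla\psi) - (3u^2\varphi,\psi)$. The discretization and the chosen test functions are illustrative simplifications, not the finite element setting of the notes.

\begin{verbatim}
import numpy as np

n, h = 200, 1.0 / 200                      # uniform grid on (0,1), homogeneous BC
x = np.linspace(0.0, 1.0, n + 1)

def grad(v):
    """Forward-difference gradient on the grid (piecewise constant per interval)."""
    return np.diff(v) / h

def A(u, psi, f):
    """Discrete version of the semilinear form of Example 5.9:
       A(u)(psi) = (u',psi') - (u^3,psi) - (f,psi), simple quadrature."""
    return (np.sum(grad(u) * grad(psi)) * h
            - np.sum(u**3 * psi) * h
            - np.sum(f * psi) * h)

def A_prime(u, phi, psi):
    """Analytic Gateaux derivative: A'(u)(phi,psi) = (phi',psi') - (3u^2 phi,psi)."""
    return (np.sum(grad(phi) * grad(psi)) * h
            - np.sum(3.0 * u**2 * phi * psi) * h)

# smooth test functions vanishing at the boundary
u = np.sin(np.pi * x)
phi = np.sin(2 * np.pi * x)
psi = x * (1.0 - x)
f = np.ones_like(x)

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    dq = (A(u + t * phi, psi, f) - A(u, psi, f)) / t
    print(f"t={t:.0e}:  difference quotient = {dq:+.6f},"
          f"  A'(u)(phi,psi) = {A_prime(u, phi, psi):+.6f}")
\end{verbatim}

As $t$ decreases, the difference quotient approaches the analytic derivative linearly in $t$, in accordance with the definition.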

63 5.6 Abstract framework of a posteriori error estimation 59 The above examples may also involve other types of boundary conditions, e. g., nonhomogeneous Dirichlet conditions, u Ω = g, or Neumann conditions, n u Ω = 0. For estimating the error J(u) J(u h ), we employ the Euler-Lagrange method of constrained optimization, i. e., we interprete the task of determining J(u) from a solution of equation (5.6.85) as the optimization problem of minimizing J(u) on the set of such solutions. Introducing a dual variable z V ( adjoint variable or Lagrangian multiplier ), we define the Lagrangian functional L(u,z) := J(u) A(u)(z), and seek for stationary points {u,z} V V of L(, ), i.e., L (u,z)(ϕ,ψ) = { J (u)(ϕ) A (u)(ϕ,z) A(u)(ψ) } = 0 {ϕ,ψ} V V. (5.6.88) This is the so-called Karush-Kuhn-Tucker (KKT) system in optimization. Clearly, the u-component of any such stationary point is a solution of the original problem (5.6.85). The corresponding Galerkin approximations {u h,z h } V h V h are defined by the discrete Euler-Lagrange system { } J L (u h )(ϕ h ) A (u h )(ϕ h,z h ) (u h,z h )(ϕ h,ψ h ) = = 0 {ϕ h,ψ h } V h V h, A(u h )(ψ h ) (5.6.89) where, again, the u h -component of any stationary point is a solution of the discrete problem (5.6.86). Now, the goal is to estimate the error J(u) J(u h ) in terms of the residuals associated with this set of equations. We prepare for this by considering first the general situation of the Galerkin approximation of stationary points of functionals. Proposition 5.4: Let L( ) be a three-times (Gateaux) differentiable functional defined on a (real or complex) vector space X which possesses a stationary point x X, i.e., L (x)(y) = 0 y X. (5.6.90) Suppose that on a finite dimensional subspace X h X, the Galerkin approximation L (x h )(y h ) = 0 y h X h, (5.6.91) also possesses a solution x h X h. Then, there holds the error representation L(x) L(x h ) = 1 2 L (x h )(x y h )+R h, y h X h, (5.6.92) with a remainder term R h which is cubic in the error e := x x h, R h := L (x h +se)(e,e,e)s(s 1)ds.

64 60 Adaptivity Proof. Using the fundamental theorem of analysis and observing that L (x)(e) = 0, we can write L(x) L(x h ) = = d dt L(x h+se)ds = 1 0 L (x h +se)(e)ds L (x h +se)(e)ds+ 1 2 L (x h )(e) 1 2{ L (x h )(e)+l (x)(e) }. Thelast termontherightisjust theapproximationoftheintegral termby thetrapezoidal rule. For this, we have the well-known error representation Hence, we obtain 1 0 f(t)dt = 1 2{ f(0)+f(1) } f (s)s(s 1)ds. L(x) L(x h ) = 1 2 L (x h )(e) L (x h +se)(e,e,e)s(s 1)ds. Finally, observing that L (x h )(y h ) = 0 for all y h X h, we have that This completes the proof. L (x h )(e) = L (x h )(x y h ), y h X h. Q.E.D. As an immediate consequence of Proposition 5.4, we obtain the following result for the Galerkin approximation of variational equations. Proposition 5.5: For any solution of equations (5.6.85) and (5.6.86), we have the error representation J(u) J(u h ) = 1 2 ρ(u h)(z ψ h ) ρ (u h,z h )(u ϕ h ) + R (3) h, (5.6.93) with arbitrary ϕ h,ψ h V h, and the primal and dual residuals ρ(u h )( ) := A(u h )( ), ρ (u h,z h )( ) := J (u h )( ) A (u h )(,z h ). The remainder R (3) h is cubic in the primal and dual errors e := u u h and e := z z h, R (3) h = { J (u h +se)(e,e,e) A (u h +se)(e,e,e,z h +se ) 3A (u h +se)(e,e,e ) } s(s 1)ds. Proof. In order to apply Proposition 5.4, we define the space X := V V and for arguments x = {u,z} X the functional L(x) := L(u,z). In this context the stationary

65 5.6 Abstract framework of a posteriori error estimation 61 points are denoted by x := {u,z} and x h := {u h,z h } with the error e x := x x h. Then, we have J(u) J(u h ) = L(x) A(u)(z) L(x h )+A h (u h )(z h ) = L(x) L(x h ). Hence, the error representation of Proposition 5.4 gives us with J(u) J(u h ) = 1 2 L (x h )(x y h )+R h, y h X h, R h := L (x h +se x )(e x,e x,e x )s(s 1)ds. By construction, we have for arbitrary y h = {ϕ h,ψ h } X h : L (x h )(x y h ) = L u(u h,z h )(u ϕ h )+L z(u h,z h )(z ψ h ) = J (u h )(u ϕ h ) A (u h )(u ϕ h,z h ) A(u h )(z ψ h ) = ρ (u h,z h )(u ϕ h )+ρ(u h )(z ψ h ). Notice that L(u, z) is linear in z. Consequently, the third derivative of L( ), consists of only three terms, namely, L xxx = L uuu +3L uuz +3L uzz +L zzz, J (u h +se)(e,e,e) A (u h +se)(e,e,e,z h +se ) 3A (u h +se)(e,e,e ). This implies the asserted form of the remainder term R (3) h. Q.E.D. Remark 5.19: The derivation of the error representations (5.6.92) and (5.6.93) does not require the uniqueness of solutions; this is important, for example, for the application to eigenvalue problems. In cases with non-unique solutions, the a priori assumption x h x (h 0) makes the result meaningful as then the remainder term can be assumed to be small. Remark 5.20: Thecubicremainder R (3) h can usually be neglected. However, in parameterdependent problems when approaching a bifurcation point, the derivatives of A(u)( ) and consequently R (3) h may become large. In such a situation the abstract theory can still be applied but with special care. The extreme situation occurs in the context of eigenvalue problems where one is directly working in the singular case. In the linear case, we have seen that the primal and dual residual terms coincide, i.e. ρ(u h )(z ψ h ) = ρ (z h )(u ϕ h ). This is no longer true in the nonlinear case, but the deviation from this property, i.e. the degree of nonlinearity of the problem, can be estimated as the following proposition shows.

66 62 Adaptivity Proposition 5.6: With the notation from above, there holds for any ϕ h, ψ h V h, with ρ := 1 0 ρ (u h,z h )(u ϕ h ) = ρ(u h )(z ψ h )+ ρ, (5.6.94) { A (u h +se)(e,e,z h +se ) J (u h +se)(e,e) } ds. Further, we have the simplified error representation for any ϕ h V h, with the quadratic remainder R (2) h := J(u) J(u h ) = ρ(u h )(z ϕ h )+R (2) h, (5.6.95) 1 0 {A (u h +se)(e,e,z) J (u h +se)(e,e)}sds Proof. We introduce the scalar function g( ) by g(s) := J (u h +se)(e) A (u h +se)(e,z h +se ). By the definition of z and z h, there holds and g(1) = J (u)(e) A (u)(e,z) = 0, g(0) = J (u h )(e) A (u h )(e,z h ) = ρ (u h,z h )(e), g (s) = J (u h +se)(e,e) A (u h +se)(e,e,z h +se ) A (u h +se)(e,e ). Therefore, using Galerkin orthogonality, ρ (u h,z h )(u ϕ h ) = ρ (u h,z h )(e) = g(0) = g(0) g(1) = = g (s)ds { A (u h +se)(e,e,z h +se ) J (u h +se)(e,e) } ds 0 A (u h +se)(e,e )ds = ρ+ρ(u h )(e ) = ρ+ρ(u h )(z ψ h ). this proves (5.6.94). In order to prove (5.6.95), we use integration by parts, obtaining R (2) h = 1 = 0 1 {A (u h +se)(e,e,z) J (u h +se)(e,e)}sds 0 {A (u h +se)(e,z) J (u h +se)(e)} ds+a (u)(e,z) J (u)(e),

67 5.6 Abstract framework of a posteriori error estimation 63 where the last two terms vanish by definition of z. Consequently, employing again Galerkin orthogonality, we obtain R (2) h = ρ(u h )(z ψ h )+J(u) J(u h ), for arbitrary ψ h V h. This completes the proof. We note that the simplified error representation (5.6.95) could have been derived also from (5.6.93) using the relation (5.6.94). However, this involves lengthy calculations, so that we preferred to present a more direct argument. Q.E.D. Example 5.10: We want to apply the above error analysis in a concrete situation. Consider the nonlinear vector-burgers equation from Example 5.7, where V := H 1 0 (Ω)d and A(u)(ψ) := ν( u, ψ)+(u u,ψ) (f,ψ). Suppose that the goal functional J( ) is linear, i.e., J(u) = (l,u) with some l L 2 (Ω) d. In this particular case, we have A (u)(ϕ,ψ) = ν( ϕ, ψ)+(u ϕ,ψ)+(ϕ u,ψ) A (u)(ξ,ϕ,ψ) = (ξ ϕ,ψ)+(ϕ ξ,ψ), A (χ,ξ,ϕ,ψ) = 0, and the remainder terms in the error representations (5.6.93) and (5.6.95), respectively, take the special form R (3) h (u,u h,z,z h ) = 1 2 R (2) h (u,u h) = = 3 = { J (u h +se)(e,e,e) A (u h +se)(e,e,e,z h +se ) 3A (u h +se)(e,e,e ) } s(s 1)ds (e e,e )s(s 1)ds = 1 2 (e e,e ), {A (u h +se)(e,e,z) J (u h +se)(e,e)}sds (e e,z)sds = (e e,z). As in the linear case the corresponding residual terms can be written in the following form as sums over cell residuals on the current mesh: ρ(u h )(z ψ h ) = A(u h )(z ψ h ) = (f,z ψ h ) ν( u h, (z ψ h )) (u h u h,z ψ h ) = T T h { (Rh,z ψ h ) T +(r h,z ψ h ) T }

68 64 Adaptivity with the equation and jump residuals defined by and, analogously, R h T := f +ν u h u h u h, r h Γ := ρ (z h )(u ϕ h ) = J() A (u h )(u ϕ h,z h ) { 1 2 ν[ hu h ], Γ Ω, 0, else, = (u ϕ h,j) ν( (u ϕ h ), z h ) (u h (u ϕ h ),z h ) ((u ϕ h ) u h,z h ) = (u ϕ h,j) ν( (u ϕ h ), z h )+(u ϕ h,( u h )z h )+(u ϕ h,u h z h ) (u ϕ h,( u h ) T z h ) = T T h { (u ϕh,r h ) T +(u ϕ h,r h ) T}, with the corresponding cell residuals R h T := j+ν z h u h z h +( u h )z h ( u h ) T z h, r h Γ := { 1 2 ν[ hz h ], Γ Ω, 0, else. In order to use the error representations (5.6.93) or (5.6.95) for practical mesh adaptation, we have to evaluate the primal and dual residual terms. As in the linear case, this requires approximation of the dual solution z and in the context of (5.6.93) additionally that of the primal solution u. This may be achieved again by post-processing of the Galerkin solutions z h and u h using patchwise higher-order interpolation. Let the resulting approximations be denoted by z h and ũ h. Then, neglecting the remainder terms the approximate error representations take the form η(u h,z h ) := 1 2 ρ(u h)( z h z h )+ 1 2 ρ (u h,z h )(ũ h u h ), (5.6.96) η(u h ) := ρ(u h )( z h z h ). (5.6.97) Remark 5.21: The identity (5.6.94) is useful as it offers the possibility of controlling the remainder R (2) h in the simplified error representation (5.6.95). In fact, comparing the two error representations (5.6.93), (5.6.95) and using (5.6.94), we see that R (2) h = ρ(u h )(z ψ h )+J(u) J(u h ) = ρ(u h )(z ψ h )+ 1 2 ρ(u h)(z ψ h )+ 1 2 ρ (u h,z h )(u ϕ h )+R (3) h = 1 2 ρ (u h,z h )(u ϕ h ) 1 2 ρ(u h)(z ψ h )+R (3) h = 1 2 ρ+r(3) h. Hence, we may try to control the linearization error by a posteriori checking the condition ρ ρ (u h,z h )(ũ h u h ) ρ(u h )( z h z h ) TOL. (5.6.98) where ũ h u and z h z are higher-order approximations and R (3) h is neglected.
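The quantities entering (5.6.96)-(5.6.98) can be evaluated verbatim in a small finite-dimensional model. The sketch below replaces the PDE by a nonlinear algebraic system $Ku + u^3 = b$ with linear goal $J(u) = j^\top u$, takes for $V_h$ the span of a few eigenmodes of $K$, and uses the exact primal and dual solutions in place of the higher-order approximations $\tilde u_h$, $\tilde z_h$; all concrete choices (matrices, goal vector, subspace dimension, the damped Newton routine) are illustrative assumptions, not part of the lecture notes.

\begin{verbatim}
import numpy as np

n, m = 40, 12
x = np.linspace(0.0, 1.0, n)
Lap = np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
K = Lap + np.eye(n)                          # well-conditioned symmetric test matrix
b = np.sin(np.pi * x) + 0.3 * np.sin(3 * np.pi * x)
j = x * (1.0 - x)                            # linear goal functional J(u) = j . u

F  = lambda u: K @ u + u**3 - b              # A(u)(psi)  = psi . F(u)
dF = lambda u: K + np.diag(3.0 * u**2)       # A'(u)(phi,psi) = psi . (dF(u) phi)

def newton(res, jac, x0, it=25):
    """Damped Newton iteration (residual-norm line search for robustness)."""
    xk = x0
    for _ in range(it):
        d = np.linalg.solve(jac(xk), -res(xk))
        t = 1.0
        while np.linalg.norm(res(xk + t * d)) > np.linalg.norm(res(xk)) and t > 1e-8:
            t *= 0.5
        xk = xk + t * d
    return xk

# 'continuous' and Galerkin solutions (V_h spanned by the m lowest eigenmodes of K)
u = newton(F, dF, np.zeros(n))
B = np.linalg.eigh(K)[1][:, :m]
c = newton(lambda c: B.T @ F(B @ c), lambda c: B.T @ dF(B @ c) @ B, np.zeros(m))
u_h = B @ c
# continuous and discrete dual solutions
z = np.linalg.solve(dF(u).T, j)
z_h = B @ np.linalg.solve(B.T @ dF(u_h).T @ B, B.T @ j)

rho      = lambda phi: -phi @ F(u_h)                      # primal residual rho(u_h)(phi) = -A(u_h)(phi)
rho_star = lambda phi: j @ phi - z_h @ (dF(u_h) @ phi)    # dual residual rho*(u_h,z_h)(phi)

err        = j @ u - j @ u_h                              # true goal error J(u) - J(u_h)
eta_full   = 0.5 * rho(z - z_h) + 0.5 * rho_star(u - u_h) # cf. (5.6.96), with exact weights
eta_simple = rho(z - z_h)                                 # cf. (5.6.97)
rho_bar    = rho_star(u - u_h) - rho(z - z_h)             # linearization indicator, cf. (5.6.98)
print(f"J(u)-J(u_h) = {err:+.3e},  eta_full = {eta_full:+.3e},"
      f"  eta_simple = {eta_simple:+.3e},  |rho_bar| = {abs(rho_bar):.3e}")
\end{verbatim}

For this smooth setup the errors are small, so both estimators should reproduce the true goal error up to the higher-order remainders, and the indicator rho_bar quantifies the (here mild) effect of the nonlinearity.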

Remark 5.22: A posteriori error estimates for the Galerkin finite element approximation of nonlinear variational problems can also be derived using extensions of the classical energy-norm-based approach. These rely on assumptions on the monotonicity of the underlying problem, i.e., the coercivity of certain derivative forms, or involve nonlinear stability constants which depend on the unknown solution. Such estimates usually represent the worst-case scenario and are of practical value only in special cases.

5.6.1 A nested solution approach

For solving the nonlinear problems by a Galerkin finite element method, we employ the following iterative scheme. Starting from a coarse initial mesh $\mathbb{T}_0$, a hierarchy of refined meshes $\mathbb{T}_l$, $l = 0,\dots,L$, and corresponding nested finite element spaces $V_0\subset\dots\subset V_l\subset\dots\subset V_L$ with dimensions $N_l$ is generated by the following nested solution process:

Adaptive solution algorithm

1. Initialization: For $l = 0$, compute a solution $u_0\in V_0$ on the mesh $\mathbb{T}_0$.
2. Defect correction iteration: For $l\ge 1$, start with $u^0_l = u_{l-1}\in V_l$. For a computed iterate $u^j_l\in V_l$ evaluate the defect
\[
(d^j_l,\psi_l) = -A(u^j_l)(\psi_l), \quad \psi_l\in V_l,
\]
and solve the correction equation ("quasi-Newton method")
\[
\tilde A'(u^j_l)(v^j_l,\psi_l) = (d^j_l,\psi_l) \quad\forall\psi_l\in V_l,
\]
by Krylov-space or multigrid iterations using the hierarchy of previously constructed meshes $\{\mathbb{T}_{l-1},\dots,\mathbb{T}_0\}$. Here, $\tilde A'(u^j_l)(\cdot,\cdot)$ stands for an approximation to the exact derivative form $A'(u^j_l)(\cdot,\cdot)$. Update $u^{j+1}_l = u^j_l + \alpha^j_l v^j_l$, with a step-length parameter $\alpha^j_l$, set $j = j+1$, and repeat the iteration. This process is carried on until a limit $u_l\in V_l$ is reached with some required accuracy. Corresponding a posteriori stopping criteria will be developed below.
3. Error estimation: Solve the (linearized) discrete dual problem
\[
z_l\in V_l:\quad A'(u_l)(\varphi_l,z_l) = J'(u_l)(\varphi_l) \quad\forall\varphi_l\in V_l,
\]
on the current mesh and evaluate the a posteriori error estimator (5.6.96):
\[
J(u) - J(u_l) \approx \eta(u_l,z_l).
\]
If $|\eta(u_l,z_l)|\le TOL$, or $N_l\ge N_{\max}$, then stop. Otherwise, cell-wise mesh adaptation yields the new mesh $\mathbb{T}_{l+1}$. Then, set $l := l+1$ and go back to (2).
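The control flow of this adaptive solution algorithm can be summarized in a short schematic driver. In the sketch below, the routines solve_on_mesh, estimate, refine, and interpolate are placeholders for problem-specific components that the notes do not specify, and the mesh object is assumed to expose an n_dofs attribute; the sketch only fixes the loop structure.

\begin{verbatim}
def nested_adaptive_solve(mesh0, solve_on_mesh, estimate, refine, interpolate,
                          tol, n_max):
    """Schematic driver for the adaptive solution algorithm above.

    solve_on_mesh(mesh, u_start) -> (u_l, z_l): defect-correction (Newton-type)
        iteration for the primal problem plus one linear solve for the dual.
    estimate(mesh, u_l, z_l)     -> (eta, indicators): DWR estimator (5.6.96)
        together with its cellwise localization.
    refine(mesh, indicators)     -> refined mesh T_{l+1}.
    interpolate(u_l, new_mesh)   -> starting value on the next mesh.
    """
    mesh, u_start = mesh0, None
    while True:
        u_l, z_l = solve_on_mesh(mesh, u_start)          # step (2): nonlinear solve
        eta, indicators = estimate(mesh, u_l, z_l)       # step (3): error estimation
        if abs(eta) <= tol or mesh.n_dofs >= n_max:      # stopping criterion
            return u_l, eta, mesh
        new_mesh = refine(mesh, indicators)              # goal-oriented refinement
        u_start = interpolate(u_l, new_mesh)             # nested start on finer mesh
        mesh = new_mesh
\end{verbatim}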

70 66 Adaptivity Remark 5.23: The nonlinear iteration described above is oriented at the solution of stationary elliptic problems in which a global transfer of information is present. In the case of transport-dominated problems, particularly those with information transfer into one direction only such as in nonstationary problems, one would organize the solution process differently, taking this transport direction into account. Remark 5.24: In the described Newton-like iteration the mesh adaptation is done for the limit solution on the current mesh in order to have a rigorous theoretical basis. However, it may be inefficient to carry the iteration on a coarser mesh to the limit knowing that the discretization accuracy on this mesh is still insufficient. Hence, one would like to combine the estimation of the discretization error with that of the iteration error, both in accordance with the accuracy in the target quantity. Such combined error estimation will be considered in more detail below. Remark 5.25: The solution of the linear dual problem usually requires much less work compared to solving the nonlinear primal problem (5.6.85). In fact, in the context of the nested solution method described above the primal solution is obtained by a Newtonlike iteration which requires the solution of several linear problems until convergence is reached. Then, solving the linear dual problem for the converged primal solution normally corresponds to about one additional Newton step. Further, this extra work is spent on optimized meshes adapted to the particular goal of the computation. Hence, particularly for nonlinear problems, the duality-based approach to adaptivity becomes relatively cheap Control of the nonlinear and linear iteration errors The accuracy in the algebraic solution process can be balanced with that due to discretization using computable a posteriori error estimates in which the outer nonlinear and inner linear iteration errors are separated from the discretization error. This results in effective stopping criteria for the algebraic iteration. The material presented in this section is largely taken from Rannacher& Vihharev [65]. We consider again the abstract problem posed in variational form A(u)(ψ) = 0 ψ V, (5.6.99) and its discretization by a Galerkin method using subspaces V h V, A(u h )(ψ h ) = 0 ψ h V h. ( ) Both problems are assumed to be (not necessarily uniquely) solvable. The goal is to estimatetheerror e := u u h withrespect tosomecontinuous outputfunctional J( ) defined on the solution space V. The corresponding error has been estimated in Proposition 5.5

71 5.6 Abstract framework of a posteriori error estimation 67 using the solutions z V and z h V h of the associated (linear) dual problems A (u)(ϕ,z) = J (u)(ϕ) ϕ V, ( ) A (u h )(ϕ h,z h ) = J (u h )(ϕ h ) ϕ h V h, ( ) We now turn to the question of estimating the error in case that the discrete primal and dual solutions u h,z h V h are computed only approximately. Proposition 5.7: Let {ũ h, z h } V h V h be approximations to the solutions {u,z} V V of the primal and dual problems (5.6.88) (KKT system) obtained by any iterative process on the mesh T h. Then, there holds the following error representation: with the residual terms J(u) J(ũ h ) = 1ρ(ũ 2 h)(z z h )+ 1 2 ρ (ũ h, z h )(u ũ h ) +ρ(ũ h )( z h )+R (3) h, ( ) ρ(ũ h )( ) := A(ũ h )( ), ρ (ũ h, z h )( ) := J (ũ h )( ) A (ũ h )(, z h ). The remainder term R (3) h is cubic in the primal and dual errors ẽ := u ũ h and ẽ := z z h, R (3) h = { J (ũ h +sẽ)(ẽ,ẽ,ẽ) A (ũ h +sẽ)(ẽ,ẽ,ẽ, z h +sẽ ) 3A (ũ h +sẽ)(ẽ,ẽ,ẽ )) } s(s 1)ds. ( ) Proof. We set x := {u,z}, x h := {ũ h, z h }, ẽ = x x,and L(x) := L(u,z), L( x h ) := L(ũ h, z h ). With this notation there holds L(x) L( x h ) = 1 0 L ( x+sẽ))(ẽ)ds. Using again the general error representation for the trapezoidal rule we conclude 1 0 f(s)ds = 1 2 (f(0)+f(1)) f (s)s(s 1)ds L(x) L( x h ) = 1 2 L (x)(ẽ)+ 1 2 L ( x h )(ẽ)+r (3) h = 1 2 L ( x h )(ẽ)+r (3) h, with the remainder term R (3) h given by ( ). Recalling the particular structure of the functional L( ) and observing that u satisfies (5.6.99), we obtain J(u) J(ũ h ) = L(x)+A(u)(z) L( x h ) A(ũ h )( z h ) = L(x) L( x h ) A(ũ h )( z h ) = 1 2 L ( x h )(x x h )+R (3) h A(ũ h)( z h ).

72 68 Adaptivity Observing that completes the proof. L ( x h )( ) = J (ũ h )( ) A (ũ h )(, z h ) A(ũ h )( ) Q.E.D. Remark 5.26: We note that the extra third residual term ρ(ũ h )( z h ) on the right-hand side of ( ) would vanish if evaluated for the exact discrete solution u h. In the present situation with an only approximate solution ũ h it will be used for controlling the iteration error. As in the linear case it can be interpreted as a measure for the deviation of ũ h from satisfying Galerkin orthogonality. The practical evaluation of the a posteriori error representation ( ) follows the same strategy as already used in the linear case. The unknown errors ẽ = u ũ h and ẽ = z z h are approximated by local higher-order interpolation: e I (2) 2h ũh ũ h, e I (2) 2h z h z h. Neglecting the higher-order remainder R h (3) this results in the following approximate error estimator for the discretization error: η h := 1 ρ(ũh )(I 2 2h z h z h )+ρ (ũ h, z h )(I 2h ũ h ũ h ). ( ) This error estimator is used for controlling the discretization error and for steering mesh refinement. For the latter purpose, the error estimator η h must be localized to cellwise contributions. This is done here by an analogous procedure as described above for the linear case. The additional term supposed to estimate the iteration error as in the linear case does not require any approximation and can directly be evaluated with he computed solutions, The total error is then approximately bounded by η it := ρ(ũ h )( z h ) = A(ũh )( z h ). ( ) J(u) J(ũ h ) η(ũ h ) := η h +η it. ( ) The estimator η it for the iteration error vanishes in the limit ũ h u h and will therefore eventually fall below the estimator η h for the discretization error. Based on the estimator ( ) the algorithm for adaptively balancing discretization and nonlinear iteration errors reads as follows. Solution by general fixed-point iteration 1. Initialization: Choose an initial discretization T 0 := T h0 and set l = Nonlinear iteration: On mesh T l,l 1, choose an initial value u 0 l (e.g., approximate solution on the preceding coarser mesh T l 1 ) and set t = 0.

3. Apply one step of the nonlinear iteration (e.g., of a Newton-like iteration) u_l^t → u_l^{t+1}. This usually requires an inner linear iteration (e.g., a multigrid iteration), which also needs a stopping criterion.
4. Error estimation: Solve the corresponding discrete dual problem for z̃_l^{t+1} (by some linear iteration),

  A'(u_l^{t+1})(ϕ_l, z_l^{t+1}) = J'(u_l^{t+1})(ϕ_l)   ∀ϕ_l ∈ V_l,

and evaluate the estimators η_l^{t+1} and η_{it,l}^{t+1}.
5. If η_{it,l}^{t+1} > κ η_l^{t+1}: increment t and go to (3).
6. If η_l^{t+1} + η_{it,l}^{t+1} ≤ TOL: STOP.
7. Mesh adaptation: Refine T_l → T_{l+1} using the information provided by η_l^{t+1}.
8. Increment l and go to (2).

As initial value in step (2) of the above algorithm, we use the values from the computation on the previous mesh, thus avoiding unnecessary iterations on fine meshes. In the numerical tests below this leads to rather small iteration errors already at the very beginning. As in the linear case, we use an equilibration factor κ = 0.1. This ensures that the local mesh refinement really results from the values of the discretization error estimator. Selecting an even smaller value does not improve the accuracy of the computed values but increases the number of iterations, while a larger value can affect the local mesh refinement. In order to reduce the computational work, we save the value η_l obtained on the current mesh. On the next finer mesh T_{l+1}, we evaluate the discretization error estimator only once the iteration error estimator satisfies η_{it,l+1}^{t+1} ≤ κ η_l.

5.6.3 Special case: inexact Newton method

As a special choice of fixed-point iteration, we consider a so-called inexact Newton method. The solution process then uses the following algorithm.

Solution by inexact Newton method
1. Initialization: Choose an initial discretization T_0 := T_{h_0} and set l = 0.
2. Defect-correction iteration: Choose an initial value u_l^0 ∈ V_l on the mesh T_l := T_{h_l} and set t = 0.
3. Compute an approximate solution ṽ_l^t of the defect (Newton correction) equation

  A'(u_l^t)(v_l^t, ψ_l) = −A(u_l^t)(ψ_l)   ∀ψ_l ∈ V_l,

by some iterative method v_l^{t−1} → v_l^t.
4. Update: u_l^{t+1} = u_l^t + θ_t ṽ_l^t with some damping factor θ_t ∈ (0,1].

5. Error estimation: Compute an approximate solution z̃_l^{t+1} of the corresponding linear dual problem

  A'(u_l^{t+1})(ϕ_l, z_l^{t+1}) = J'(u_l^{t+1})(ϕ_l)   ∀ϕ_l ∈ V_l,

by some iterative method z̃_l^t → z̃_l^{t+1}, and evaluate the corresponding error estimators η_l^{t+1} and η_{it,l}^{t+1}.
6. If η_l^{t+1} + η_{it,l}^{t+1} ≤ TOL, then STOP.
7. If η_{it,l}^{t+1} > κ η_l^{t+1}: increment t → t+1 and go to (3).
8. Mesh adaptation: Refine T_l → T_{l+1} using the information provided by η_l^{t+1}.
9. Increment l → l+1 and go to (2).

As initial value for the Newton method in step (2), we again use the result from the previous coarser mesh, which again leads to rather small iteration errors already at the very beginning. The evaluation of the discretization error estimator is done using the strategy described above.

For the further discussion, we define the following residuals corresponding to the inner linear iterations in steps (3) and (5) of the above algorithm:

  ⟨r_l^t, ·⟩ := A'(u_l^t)(ṽ_l^t, ·) + A(u_l^t)(·),
  ⟨d_l^t, ·⟩ := A'(u_l^{t+1})(·, z̃_l^{t+1}) − J'(u_l^{t+1})(·).

For the exact Newton method the linear residuals caused by the inexact solution of the correction equations vanish, i.e., for the inner iterations of the primal and dual problems there holds ⟨r_h^t, ·⟩ ≡ 0 and ⟨d_h^t, ·⟩ ≡ 0.

We aim at the equilibration of the errors due to the linearization, the inexact iterative solution of the correction equation, and the inexact solution of the dual problem. For this purpose, we consider the difference between two Newton iterates. For the sake of brevity, we assume the functional J(·) to be at most quadratic. Hence, using Taylor expansion and the definition of the dual problem, we obtain

  J(u_h^{t+1}) − J(u_h^t) = J(u_h^t + θ_t ṽ_h^t) − J(u_h^t)
    = θ_t J'(u_h^t)(ṽ_h^t) + ½ θ_t² J''(u_h^t)(ṽ_h^t, ṽ_h^t)
    = θ_t A'(u_h^t)(ṽ_h^t, z̃_h^t) − θ_t ⟨d_h^t, ṽ_h^t⟩ + ½ θ_t² J''(u_h^t)(ṽ_h^t, ṽ_h^t),

and, recalling the structure of the Newton method,

  J(u_h^{t+1}) − J(u_h^t) = −θ_t A(u_h^t)(z̃_h^t) + θ_t ⟨r_h^t, z̃_h^t⟩ − θ_t ⟨d_h^t, ṽ_h^t⟩ + ½ θ_t² J''(u_h^t)(ṽ_h^t, ṽ_h^t).

The first term on the right-hand side corresponds to the linearization error, whereas the second and the third terms are due to the inexact solution of the correction equations for the primal and the dual problem. The last term occurs due to the quadratic structure of the error functional and dominates the residual terms of the linear problems; it vanishes, e.g., if J(·) is linear. Taking this expansion into account suggests the following stopping criterion for the linear subiterations:

  max{ |⟨r_h^t, z̃_h^t⟩|, |⟨d_h^t, ṽ_h^t⟩| } ≤ κ η_it^t = κ |A(u_h^t)(z̃_h^t)|.

In the case of a linear error functional, we have the following representation of the full linear iteration error.

Lemma 5.1: Let v_h^t and ṽ_h^t be the exact and an inexact solution, respectively, of the correction equation in the Newton method starting from the preceding iterate u_h^t and with the Jacobian assembled at u_h^t. Further, let z̃_h^t be the inexact solution of the corresponding linear dual problem. Then, for the error in the next inexact iterate u_h^{t+1} = u_h^t + θ_t ṽ_h^t and its exact analogue û_h^{t+1} := u_h^t + θ_t v_h^t with respect to a linear functional J(·), there holds

  J(û_h^{t+1}) − J(u_h^{t+1}) = −θ_t ⟨r_h^t, z̃_h^t⟩ − θ_t ⟨d_h^t, v_h^t − ṽ_h^t⟩.

Proof. By the definition of the Newton method and the linearity of J(·) there holds

  J(û_h^{t+1}) − J(u_h^{t+1}) = J(u_h^t + θ_t v_h^t) − J(u_h^t + θ_t ṽ_h^t) = θ_t J(v_h^t) − θ_t J(ṽ_h^t).

Now, we use the definition of the dual problem to obtain

  J(û_h^{t+1}) − J(u_h^{t+1}) = θ_t A'(u_h^t)(v_h^t, z̃_h^t) − θ_t A'(u_h^t)(ṽ_h^t, z̃_h^t) − θ_t ⟨d_h^t, v_h^t − ṽ_h^t⟩
    = −θ_t A(u_h^t)(z̃_h^t) + θ_t A(u_h^t)(z̃_h^t) − θ_t ⟨r_h^t, z̃_h^t⟩ − θ_t ⟨d_h^t, v_h^t − ṽ_h^t⟩,

which proves the assertion.   Q.E.D.

Remark 5.27: Under the same assumptions as in Lemma 5.1, for a quadratic functional there holds

  J(û_h^{t+1}) − J(u_h^{t+1}) = −θ_t ⟨r_h^t, z̃_h^t⟩ − θ_t ⟨d_h^t, v_h^t − ṽ_h^t⟩ + ½ θ_t² ( J''(u_h^t)(v_h^t, v_h^t) − J''(u_h^t)(ṽ_h^t, ṽ_h^t) ).

Remark 5.28: The error representation of Lemma 5.1 may be considered as an additional justification of the stopping criterion stated above. It is valid for any linear solver. If a multigrid method is used, then, following the argument in Meidner et al. [53], this error representation can be developed further into a form which also allows one to adaptively tune the inner smoothing iterations on each mesh level. This detail is omitted here for the sake of brevity.
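The interplay of the outer mesh loop, the nonlinear iteration, and the two stopping tests may become clearer in pseudocode form. The following Python sketch is only illustrative: the callables passed in (newton_step, solve_dual, estimate_eta_h, eval_residual_form, localize, refine_mesh, interpolate) are hypothetical placeholders for the operations described in the text, not calls to an existing library.

```python
def adaptive_solve(mesh, u0, TOL, *, newton_step, solve_dual, estimate_eta_h,
                   eval_residual_form, localize, refine_mesh, interpolate,
                   kappa=0.1, max_newton=50):
    """Balance discretization and iteration errors as in the algorithms above."""
    u = u0
    while True:
        # nonlinear iteration on the current mesh
        for t in range(max_newton):
            u = newton_step(mesh, u)                  # one (inexact) Newton update
            z = solve_dual(mesh, u)                   # discrete dual: A'(u)(., z) = J'(u)(.)
            eta_h = estimate_eta_h(mesh, u, z)        # 1/2 |rho(u)(I2h z - z) + rho*(u, z)(I2h u - u)|
            eta_it = abs(eval_residual_form(mesh, u, z))  # |A(u)(z)| = |rho(u)(z)|
            if eta_it <= kappa * eta_h:
                break                                 # iteration error negligible vs. discretization error
        # stopping test and mesh adaptation
        if eta_h + eta_it <= TOL:
            return mesh, u, eta_h, eta_it
        indicators = localize(mesh, u, z)             # cellwise contributions of eta_h
        mesh = refine_mesh(mesh, indicators)
        u = interpolate(u, mesh)                      # reuse the old solution as initial guess
```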

5.6.4 Numerical examples

In this section, we demonstrate the efficiency and reliability of the proposed adaptive algorithm. We compare the adaptive Newton method described above with a Newton method employing an algebraic stopping criterion, which requires the initial residual to be reduced by a prescribed fixed factor. Moreover, we employ the adaptive stopping rule for the inner linear solver in the Newton steps. This is compared analogously with a linear solver using an algebraic stopping rule, again requiring the initial residual to be reduced by a fixed factor. In all cases the resulting discretization error estimator is used for error control as well as for local mesh refinement. We denote the exact Newton method by ALG-I, the adaptive Newton method with exact inner iteration by ALG-II, and the fully adaptive Newton method by ALG-III.

In the tables below, the following notation is used for the different errors and estimators:

  E_h := J(u) − J(u_h)        exact discretization error (with estimator η_h),
  I_eff^h := η_h / E_h        effectivity index of the discretization error estimator,
  E_it := J(u_h) − J(ũ_h)     exact iteration error (with estimator η_it),
  I_eff^it := η_it / E_it     effectivity index of the iteration error estimator,
  E := J(u) − J(ũ_h)          exact total error,
  I_eff^tot := (η_h + η_it)/E effectivity index of the total error estimator.

Example 1. The first example examines the sharpness and the practical relevance of the a posteriori error estimators derived above on locally refined meshes. The test problem involves only a weak nonlinearity, such that the assumptions of Proposition 5.7 are satisfied. Determine u = {u_1, u_2} ∈ V := W_0^{1,2}(Ω)² satisfying

  −Δu_1 + 2u_2² = 1,   u_1|_∂Ω = 0,
  −Δu_2 + u_1 u_2 = 0,   u_2|_∂Ω = 0,

on the slit domain Ω := (−1,1)² \ Γ, where Γ is the slit connecting the points (0,0) and (0,−1), and evaluate J(u) := u_1(a) at the point a = (−0.75, −0.75). Since in this case the functional J(·) is not defined on the space V, it has to be regularized, e.g., by

  J_ε(u) := |B_ε(a)|^{−1} ∫_{B_ε(a)} u_1(x) dx = u_1(a) + O(ε²),

where B_ε(a) := {x ∈ Ω : |x − a| ≤ ε} is the ball with center a and radius ε ∼ TOL. The corresponding variational formulation reads

  A(u)(ϕ) = 0   ∀ϕ ∈ V := H_0^1(Ω)²,

where, for ϕ = {ϕ_1, ϕ_2} ∈ V,

  A(u)(ϕ) := (∇u_1, ∇ϕ_1) + 2(u_2², ϕ_1) + (∇u_2, ∇ϕ_2) + (u_1 u_2, ϕ_2) − (1, ϕ_1).

For the discretization of this problem, we use again a standard finite element method with conforming Q_1 elements. The resulting nonlinear algebraic problems on the meshes T_h are solved by a damped Newton method with damping factor θ = 0.5,

  A'(u_h^t)(u_h^{t+1}, ϕ_h) = A'(u_h^t)(u_h^t, ϕ_h) − θ A(u_h^t)(ϕ_h)   ∀ϕ_h ∈ V_h.

The obtained results are shown in Tables 5.10 and 5.11. We observe significant work savings by using the adaptive stopping criterion in the exact Newton iteration (ALG-II). The effectivity indices are relatively close to one, even on coarser meshes, which demonstrates the sharpness of the error estimator.

Table 5.10: Example 1, exact Newton iteration towards round-off error level (ALG-I). Columns: N, #It, E, η_h + η_it, η_h, η_it, I_eff^tot.

Table 5.11: Example 1, exact Newton iteration with adaptive stopping criterion (ALG-II). Columns: N, #It, E, η_h + η_it, η_h, η_it, I_eff^tot.
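On the algebraic level, the damped Newton update written above amounts to solving one linear system with the Jacobian and adding a scaled correction. The following SciPy sketch is only illustrative; assemble_residual and assemble_jacobian are hypothetical assembly routines for the discrete form A(·)(·) and its derivative, passed in by the caller.

```python
import scipy.sparse.linalg as spla

def damped_newton_step(u, assemble_residual, assemble_jacobian, theta=0.5):
    """One damped Newton step u -> u + theta*v with A'(u) v = -A(u).

    assemble_residual(u): algebraic residual vector with entries A(u)(phi_i),
    assemble_jacobian(u): sparse Jacobian matrix with entries A'(u)(phi_j, phi_i).
    Both are placeholders for whatever finite element assembly is available.
    """
    residual = assemble_residual(u)
    jacobian = assemble_jacobian(u)
    v = spla.spsolve(jacobian.tocsc(), -residual)   # Newton correction
    return u + theta * v
```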

Example 2. The second example is a strongly nonlinear problem with local degeneration of ellipticity. Also in this irregular case the proposed adaptive algorithms work well. First, we compare the computational work on optimized against uniformly refined meshes, and then the performance of the algorithms ALG-I, ALG-II and ALG-III. We consider the following quasi-linear boundary value problem for the so-called p-Laplace operator:

  −∇·( |∇u|^{p−2} ∇u ) = 0  in Ω,   u = g  on ∂Ω,

on the L-shaped domain Ω := (−1,1)² \ ([0,1]×[−1,0]). The Dirichlet boundary data are g := exp(4|x|^{2/3}), and we set p = 4. As error functional, we consider a point value of a derivative of the solution,

  J(u) := ∂_{x_1} u(a),   a = (−0.75, −0.5),

respectively its regularization

  J_ε(u) := |B_ε(a)|^{−1} ∫_{B_ε(a)} ∂_{x_1} u(x) dx = ∂_{x_1} u(a) + O(ε²).

In this case the appropriate solution space is V := W_0^{1,p}(Ω). The weak form of the problem then reads: Find u ∈ g + V such that

  ( |∇u|^{p−2} ∇u, ∇ϕ ) = 0   ∀ϕ ∈ V.

The p-Laplace operator is strictly monotone, coercive and continuous on V, such that the Browder fixed-point theorem yields the existence of a unique solution. For the standard a posteriori error analysis of this kind of nonlinear problem, we refer to Carstensen & Klose [44]. The discrete problems are solved using the Newton method with a multigrid solver in V-cycle form with one ILU step for pre- and post-smoothing in the inner iteration.

Even for this irregular problem the use of goal-oriented mesh refinement leads to significant work savings. In order to achieve an accuracy of E ≈ … on uniformly refined meshes, we need to compute the solution on a mesh with … nodes, whereas with goal-oriented mesh refinement a mesh with … nodes is sufficient for achieving the same accuracy. In order to demonstrate the reliability of the adaptive algorithms, Table 5.12 shows the effectivity indices of the derived error estimators on a locally refined mesh with … nodes.

Table 5.12: Example 2, effectivity of the different error estimators on a locally refined mesh. Columns: #It, E_h, η_h, I_eff^h, E_it, η_it, I_eff^it.

In order to compare the adaptive algorithms ALG-I, ALG-II and ALG-III, we decrease the error tolerance to TOL = … . The results show that ALG-II clearly outperforms ALG-I (exact Newton iteration) and is approximately four times faster. The use of an adaptive stopping criterion for the inner iteration in the Newton method (ALG-III) results in an additional 50% work savings. The whole history of the solution process on locally refined meshes is shown in Tables 5.13, 5.14 and 5.15. Again the proposed adaptive algorithm turns out to be efficient and reliable.

Table 5.13: Example 2, iteration with ALG-I (iteration towards round-off error). Columns: N, #It, E, η_h + η_it, η_h, η_it, I_eff^tot.
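The Newton method used for this example requires the derivative of the p-Laplace form. This standard computation is not spelled out in the text; for the weak form a(u)(ϕ) := (|∇u|^{p−2}∇u, ∇ϕ) the directional derivative in the direction v reads

```latex
a'(u)(v,\varphi)
  \;=\; \bigl(|\nabla u|^{p-2}\,\nabla v,\ \nabla\varphi\bigr)
  \;+\; (p-2)\,\bigl(|\nabla u|^{p-4}\,(\nabla u\cdot\nabla v)\,\nabla u,\ \nabla\varphi\bigr).
```

For p = 4 this reduces to a'(u)(v,ϕ) = (|∇u|²∇v, ∇ϕ) + 2((∇u·∇v)∇u, ∇ϕ); the form degenerates where ∇u = 0, which is the local degeneration of ellipticity mentioned above.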

Table 5.14: Example 2, iteration with ALG-II (adaptive stopping rule for the Newton iteration). Columns: N, #It, E, η_h + η_it, η_h, η_it, I_eff^tot.

For the detailed comparison of the algorithms ALG-II and ALG-III, we consider the number of linear iterations needed to satisfy the corresponding stopping rule for the multigrid solver. In the fully adaptive algorithm ALG-III, only 2-3 inner iterations are needed in order to make the linear residuals ⟨r_l^t, z̃_l⟩ and ⟨d_l^t, ṽ_l^t⟩ ten times smaller than the linearization error estimator. In ALG-II, which uses the exact inner iteration, approximately 20 inner linear iterations are needed on each refinement level.
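In algebraic terms, the adaptive stopping test for the inner linear solver can be realized as follows. This is only a sketch under the assumption that the discrete forms are represented by a Jacobian matrix and right-hand-side vectors; all argument names are hypothetical and the signs of the algebraic residuals are irrelevant because only absolute values enter the test.

```python
import numpy as np

def inner_residuals_small_enough(J_mat, b_primal, v_approx,
                                 JT_mat, b_dual, z_approx,
                                 eta_it, kappa=0.1):
    """Adaptive stopping test for the inner linear solves (cf. the criterion above).

    J_mat, b_primal : matrix and right-hand side of the correction equation A'(u) v = -A(u),
    v_approx        : current multigrid/Krylov iterate for the correction,
    JT_mat, b_dual  : transposed matrix and right-hand side of the discrete dual problem,
    z_approx        : current iterate for the discrete dual solution.
    """
    r = b_primal - J_mat @ v_approx     # algebraic residual of the correction equation (up to sign)
    d = JT_mat @ z_approx - b_dual      # algebraic residual of the dual equation (up to sign)
    # weight the residuals with the relevant dual/primal vectors, as in the stopping criterion
    weighted = max(abs(np.dot(r, z_approx)), abs(np.dot(d, v_approx)))
    return weighted <= kappa * eta_it
```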

Table 5.15: Example 2, iteration with ALG-III (fully adaptive Newton iteration). Columns: N, #It, E, η_h + η_it, η_h, η_it, I_eff^tot.

5.7 Application to eigenvalue problems

In the following, we will apply the abstract theory of the DWR method developed above to error control in the approximation of eigenvalue problems. We mention some prototypical examples in which we are particularly interested:

The symmetric eigenvalue problem of the Laplace operator:

  −Δu = λu in Ω,   u = 0 on ∂Ω.

The nonsymmetric eigenvalue problem of a convection-diffusion operator:

  −Δu + b·∇u = λu in Ω,   u = 0 on ∂Ω.

The stability eigenvalue problem governed by the linearized Navier-Stokes equations:

  −νΔv + v̂·∇v + (∇v̂)ᵀ v + ∇p = λv in Ω,   ∇·v = 0 in Ω,   v = 0 on ∂Ω,

where v̂ is some base solution of the Navier-Stokes equations, the stability of which is to be investigated.

The results presented in this section for the approximation of these problems can mainly be found in Heuveline & Rannacher [49], Bangerth & Rannacher [24], and Rannacher et

al. [66]. The above eigenvalue problems are all treated within an abstract setting, which will be laid out in the following. Let V be a (in general complex) Hilbert space. We seek {u, λ} ∈ V × ℂ satisfying

  Au = λMu,

with linear operators A and M on V. The operator M is assumed to be hermitian (conjugate symmetric) and positive semi-definite. It is introduced to allow for the situation of the stability eigenvalue problem of the Navier-Stokes equations, where the eigenvalue term occurs only in the first of the two equations (the momentum equation). We do not go deeper into the abstract functional analytic setting of eigenvalue problems, since below we will concentrate on concrete problems formulated in the standard function spaces. Again, we prefer the variational formulation

  a(u, ψ) = λ m(u, ψ)   ∀ψ ∈ V,

where a(·,·) := ⟨A·,·⟩ is the (generally nonsymmetric) sesquilinear form generated by A and m(·,·) is the symmetric semi-definite sesquilinear form corresponding to M (usually the L² scalar product). Eigenfunctions are assumed to be normalized by

  m(u, u) = 1.

For nonsymmetric A, one also considers the associated adjoint eigenvalue problem for the Hilbert-space adjoint A* of A,

  A* u* = λ̄ M u*.

Observing that a*(u*, ϕ) = ⟨A*u*, ϕ⟩ = conj( a(ϕ, u*) ) and m(u*, ϕ) = conj( m(ϕ, u*) ), this has the variational form

  a(ϕ, u*) = λ m(ϕ, u*)   ∀ϕ ∈ V.

Note that in the context of matrix eigenvalue problems the adjoint eigenvectors u* are also called left eigenvectors. From the definition, we see that primal and adjoint eigenvalues are related by λ* = λ̄, while the corresponding eigenvectors differ. Usually, the adjoint eigenvectors are normalized by requiring

  m(u, u*) = 1.

This normalization is possible only if u* is not m-orthogonal to the whole eigenspace of λ, which is equivalent to requiring that λ has trivial defect, i.e., its algebraic eigenspace reduces to the geometric one (only trivial Jordan blocks in the matrix case); for a more detailed discussion of this issue see Heuveline & Rannacher [49] and the literature cited therein. We will see that the simultaneous consideration of the primal and adjoint eigenvalue problems is essential for rigorous a posteriori error estimation. At this abstract stage, we do not impose further assumptions on the form a(·,·) but just assume that there exist solutions to the stated eigenvalue problems. This can be guaranteed to hold for the examples mentioned above by employing more specific properties of the

related settings (compactness of the form a(·,·) on L²(Ω), boundedness of the domain Ω, etc.).

The Galerkin approximation of the primal and adjoint eigenvalue problems uses finite element subspaces V_h ⊂ V, as described above, to determine {u_h, λ_h}, {u_h*, λ_h*} ∈ V_h × ℂ satisfying

  a(u_h, ψ_h) = λ_h m(u_h, ψ_h)   ∀ψ_h ∈ V_h,   m(u_h, u_h) = 1,
  a(ϕ_h, u_h*) = λ_h m(ϕ_h, u_h*)   ∀ϕ_h ∈ V_h,   m(u_h, u_h*) = 1.

Our goal is to control the errors λ − λ_h, u − u_h, and u* − u_h* in the eigenvalues and eigenfunctions in terms of the residuals associated with these equations, from which computable error estimators can then be derived as above.

5.7.1 A posteriori error analysis

In order to derive a posteriori error estimates, we embed the present situation into the general framework of variational equations introduced above. Consider the product spaces V := V × ℂ and V_h := V_h × ℂ ⊂ V and define for pairs U := {u, λ} ∈ V and Ψ := {ψ, ν} ∈ V the semilinear form

  A(U)(Ψ) := λ m(u, ψ) − a(u, ψ) + ν { m(u, u) − 1 }.

With this notation, the above continuous eigenvalue problem and its discrete analogue can be written in the following compact form of semilinear variational problems:

  U = {u, λ} ∈ V :    A(U)(Ψ) = 0   ∀Ψ ∈ V,
  U_h = {u_h, λ_h} ∈ V_h :    A(U_h)(Ψ_h) = 0   ∀Ψ_h ∈ V_h.

Estimation of the eigenvalue error

In order to control the error in the approximation of the eigenvalues, we use the output functional

  J(Φ) := µ m(ϕ, ϕ),   Φ = {ϕ, µ}.

Since m(u, u) = 1 at the solution U = {u, λ}, there holds J(U) = λ m(u, u) = λ, i.e., as desired this functional picks out the eigenvalue. We recall the Euler-Lagrange approach used above, particularly the Lagrangian functional for arguments U = {u, λ} and Ψ = {ψ, ν}:

  L(U, Ψ) = J(U) − A(U)(Ψ) = λ m(u, u) − λ m(u, ψ) + a(u, ψ) − ν { m(u, u) − 1 }.

The dual solution Z = {z, π} ∈ V and its Galerkin approximation Z_h = {z_h, π_h} ∈ V_h are then determined by the stationarity equations

  A'(U)(Φ, Z) = J'(U)(Φ)   ∀Φ ∈ V,
  A'(U_h)(Φ_h, Z_h) = J'(U_h)(Φ_h)   ∀Φ_h ∈ V_h.

Observing that (d_t := d/dt)

  d_t m(u + tϕ, u + tϕ)|_{t=0} = m(ϕ, u) + m(u, ϕ) = m(ϕ, u) + conj( m(ϕ, u) ) = 2 Re m(ϕ, u),

the left- and right-hand sides of the continuous stationarity equation, for Z = {z, π}, U = {u, λ}, and Φ = {ϕ, µ}, read:

  A'(U)(Φ, Z) = µ m(u, z) + λ m(ϕ, z) − a(ϕ, z) + 2 π Re m(ϕ, u),
  J'(U)(Φ) = µ m(u, u) + 2 λ Re m(ϕ, u).

Hence, the continuous dual problem takes the form

  µ { m(u, z) − m(u, u) } + λ m(ϕ, z) − a(ϕ, z) + 2 { π − λ } Re m(ϕ, u) = 0   ∀{ϕ, µ} ∈ V.

We show that the adjoint eigenpair U* = {u*, λ*} solves this equation, i.e., Z = {z, π} = U* is a solution. This is clear since for m(u, z) = 1 = m(u, u) and π = λ the equation reduces to

  a(ϕ, z) = π m(ϕ, z),

which is the defining equation for U*. Whether this is the only solution is not relevant for the present discussion. Analogously, the discrete dual problem is solved by the discrete adjoint eigenpair U_h* = {u_h*, λ_h*} determined by

  a(ϕ_h, u_h*) = λ_h m(ϕ_h, u_h*)   ∀ϕ_h ∈ V_h,   m(u_h, u_h*) = 1.
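On the algebraic level, the discrete primal and adjoint eigenvalue problems are generalized matrix eigenvalue problems. The following SciPy sketch is only illustrative: the sparse stiffness and mass matrices A and M are assumed to be provided by some finite element assembly, the shift used for the shift-invert mode is a free parameter, and the matching of the primal and adjoint eigenvalues (λ* ≈ conj(λ)) is left schematic.

```python
import numpy as np
import scipy.sparse.linalg as spla

def discrete_eigenpairs(A, M, shift=0.0):
    """Approximate one discrete primal and adjoint eigenpair of a(.,.), m(.,.).

    Primal problem:  A u = lambda M u;  adjoint problem:  A^H u* = conj(lambda) M u*.
    A and M are sparse matrices from a (hypothetical) finite element assembly.
    """
    lam, U = spla.eigs(A, k=1, M=M, sigma=shift)                       # primal eigenpair
    lam_a, Ua = spla.eigs(A.conj().T, k=1, M=M, sigma=np.conj(shift))  # adjoint eigenpair
    u, u_star = U[:, 0], Ua[:, 0]
    # normalization m(u,u) = 1 (M is hermitian positive semi-definite, so the product is real)
    u = u / np.sqrt(np.vdot(u, M @ u).real)
    # normalization m(u,u*) = 1, realized here as vdot(u*, M u) = 1 (conjugation convention
    # of m assumed); this fails if u* is m-orthogonal to u, i.e. in the defective case
    u_star = u_star / np.conj(np.vdot(u_star, M @ u))
    # in practice one should verify lam_a[0] is (close to) the conjugate of lam[0]
    return lam[0], u, lam_a[0], u_star
```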

85 5.7 Application to eigenvalue problems 81 Proof. The assertion is an immediate consequence of Proposition 5.5 applied to the present situation. We have { J(U) J(U h ) = 1 A(Uh )(Z Ψ 2 h ) } { J (U h )(U Φ h ) A (U h )(U Φ h,z h ) } +R (3) h, with arbitrary Φ h = {ϕ h,µ h }, Ψ h = {ψ h,χ h } V h, and a cubic remainder R (3) h, which will be evaluated below. We recall that with U = {u,λ},φ = {ϕ,µ},ψ = {ψ,ν},z = {z,κ}: J(U) = λm(u,u), J (U)(Φ) = µm(u,u)+2λrem(ϕ,u), A(U,Ψ) = λm(u,ψ) a(u,ψ)+ν { m(u,u) 1 }, A (U)(Φ,Ψ) = µm(u,ψ)+λm(ϕ,ψ) a(ϕ,ψ)+2 νrem(ϕ,u), and therefore, aditionally with U h = {u h,λ h },Φ h = {ϕ h,µ h },Ψ h = {ψ h,ν h },Z h = {z h,π h }, A(U h )(Z Ψ h,) = λ h m(u h,z ψ h ) a(u h,z ψ h )+( π χ h ){m(u h,u h ) 1}, J (U h )(U Φ h ) = (λ µ h )m(u h,u h )+2λ h Rem(u ϕ h,u h ), A (U h )(U Φ h,z h ) = (λ µ h )m(u h,z h )+λ h m(u ϕ h,z h ) a(u ϕ h,z h ) +2 π h Rem(u ϕ h,u h ). Using this in the above general error representation, we obtain: { λ λ h = 1 λh m(u 2 h,z ψ h )+a(u h,z ψ h ) ( π χ h ){m(u h,u h ) 1} } + 2{ 1 (λ µh )m(u h,u h )+2λ h Rem(u ϕ h,u h ) (λ µ h )m(u h,z h ) λ h m(u ϕ h,z h )+a(u ϕ h,z h ) 2 π h Rem(u ϕ h,u h ) } +R (3) h. We are free to choose µ h := λ, χ h := π, and observing that m(u,u) = 1 = m(u h,u h ), and λ h = π h = λ h, we obtain { λ λ h = 1 a(uh,z ψ 2 h ) λ h m(u h,z ψ h ) } { a(u ϕh,z h ) λ h m(u ϕ h,z h ) } + R (3) h It remains to evaluate the remainder term. Setting E := {u u h,λ λ h } and E := {u u h,λ λ h }, the general remainder term from Proposition 5.5 is R (3) h = { J (U h +se)(e,e,e) A (U h +se)(e,e,e,z h +se ) 3A (U h +se)(e,e,e ) } s(s 1)ds.

86 82 Adaptivity In the present case, we have for U = {u,λ},φ = {ϕ,µ},ψ = {ψ,ν},ξ = {ξ,κ}: and J(U) = λm(u,u). J (U)(Φ) = d t J(U +tφ) t=0 = µm(u,u)+2λrem(ϕ,u), J (U)(Ψ,Φ) = d t J (U +tψ)(φ) t=0 = 2µRem(u,ψ)+2κRem(ϕ,u)+2λRem(ϕ,ψ), J (U)(Ξ,Ψ,Φ) = d t J (U +tξ)(ψ,φ) t=0 = 2µRem(ξ,ψ)+2κRem(ϕ,ξ)+2νRem(ϕ,ψ), A(U)(Ψ) = λm(u,ψ) a(u,ψ)+ ν { m(u,u) 1 }, A (U)(Φ,Ψ) = d t A(U +tφ)(ψ) t=0 = λm(ϕ,ψ)+µm(u,ψ) a(ϕ,ψ)+2 νrem(ϕ,u), A (U)(Ξ,Φ,Ψ) = d t A(U +tξ)(φ,ψ) t=0 = κm(ϕ,ψ)+µm(ξ,ψ)+2 νrem(ϕ,ξ), A (U)(,, ) 0. Using these representations and observing that λ λ h = λ λ h, we obtain for the various terms in the third-order remainder: J (U h +se)(e,e,e) = 6(λ λ h )m(u u h,u u h ), A (U h +se)(e,e,e ) = (λ λ h )m(u u h,u u h)+(λ λ h )m(u u h,u u h) +2( λ λ h )Rem(u u h,u u h ) = 2(λ λ h )m(u u h,u u h )+2(λ λ h)m(u u h,u u h ), and A (U h +se)(e,e,e,z h +se ) 0. Consequently, R (3) h = 1 2 = { J (U h +se)(e,e,e) 3A (U h +se)(e,e,e ) } s(s 1)ds (λ λ h )m(u u h,u u h )s(s 1)de = 1 2 (λ λ h)m(u u h,u u h), which completes the proof. Q.E.D. Remark 5.29: We add the following remarks concerning Proposition 5.8: The error representation( ) can be written in the following(formally) remainder-

87 5.7 Application to eigenvalue problems 83 free form: (λ λ h )(1 σ h ) = 1 2 ρ(u h,λ h )(u ψ h )+ 1 2 ρ (u h,λ h)(u ϕ h ), ϕ h,ψ V h, ( ) with σ h := 1m(u u 2 h,u u h ) 1, what isareasonable assumption onfiner meshes. In the symmetric case, a(u,v) = a(v,u), when primal and adjoint eigenvalues and associated eigenfunctions coincide, this error representatiopn reduces to the simpler form (λ λ h )(1 σ h ) = ρ(u h,λ h )(u ψ h ), ψ h V h, ( ) with σ h := 1 2 m(u u h,u u h ). We note that, once knowing its form, this error representation can easily be derived directly without using the above general theory (exercise). The error representation ( ) holds true without any assumption on the multiplicity of the eigenvalue λ or its defect. However, such a restriction will become necessary in dealing with the error of the eigenfunctions. In the error representation ( ) only terms involving the computed primal and dual eigenpairs occur and no additional outer dual problem needs to be solved. In the nonsymmetric case the simultaneous consideration of primal and dual eigenvalue problems is essential within an optimal multigrid iteration anyway. Then, the computation of u for the error estimator does therefore not introduce extra work. In Proposition 5.8, we have assumed that the governing operator A remains unchanged under discretization, i.e. all coefficient are frozen. In considering the stability eigenvalue problem, we additionally have to allow for approximation of the operator A(û) depending on coefficients û that are also be subject to approximation errors. Estimation of the eigenfunction error Now, we turn to the estimation of the error in the eigenfunctions. To this end, let j( ) : V C be an arbitrary(for simplicity linear) functional, which is used for controlling the error in the eigenfunctions, that is, we want to estimate the error j(u) j(u h ). Let {u, λ} be an eigenpair of the eigenvalue problem( ), and suppose that the eigenvalue λ is simple. For applying the general theory of a posteriori error estimation, we now set for Φ V := V C: J(Φ) := j(ϕ), Φ = {ϕ,µ} V. Using this notation from above the associated dual problem determines Z = {z,π} V, such that A (U)(Φ,Z) = J(Φ) Φ V,

88 84 Adaptivity or in concrete terms, λm(ϕ,z)+µm(u,z) a(ϕ,z)+2 πrem(ϕ,u) = j(ϕ) Φ = {ϕ,µ} V. This reduces to the equation a(ϕ,z) λm(ϕ,z) = 2 πrem(ϕ,u) j(ϕ) ϕ V, ( ) together with the condition m(u,z) = 0. Since the eigenvalue λ is assumed to be simple, this system can be solved if its right-hand side vanishes on the eigenvector u, that is 2 πrem(u,u) j(u) = 0 π = 1 2 j(u). Consequently, for the given eigenpair {u, λ} of ( ) the reduced dual problem a(ϕ,z) λm(ϕ,z) = j(u)rem(ϕ,u) j(ϕ) ϕ V,, ( ) has a solution z V, which is uniquely determined by the condition m(u,z) = 0. By an analogous argument, the discrete dual problem A (U h )(Φ h,z h )) = J(Φ h ) Φ h = {ϕ h,µ h ) V h := V h C, ( ) is seen to have the form a(ϕ h,z h ) λ h m(ϕ h,z h ) = j(u h )Rem(ϕ h,u h )} j(ϕ h ) ϕ h V h, ( ) where {u h,λ h } is the eigenpair of the approximate eigenvalue problem ( ). This problem also has a solution z h V h, which is uniquely determined by the condition m(u h,z h ) = 0. This equation determines the dual residual ρ (z h )( ) := a(,z h ) λ h m(,z h ) j(u h ) Re m(,z h )}+j( ). ( ) After these preparations, we can state the following result for the error in the eigenfunction approximation. Proposition 5.9: Let {u h,λ h } be a computed eigenpair of ( ) and {u,λ} an associated eigenpair of ( ). Then, for the given continuous (linear) functional j( ) : V C and the associated solution z V of the reduced dual problem ( ), we have the error identity j(u u h ) = a(u h,z ψ h ) λ h m(u h,z ψ h )+R (2) h, ( ) for arbitrary ψ h V h with the quadratic remainder R (2) h := (λ λ h )m(u u h,z)+ 1 2 j(u)m(u u h,u u h ).

89 5.7 Application to eigenvalue problems 85 Proof. The assertion is again an immediate consequence of Proposition 5.5 applied to the present situation. We have { J(U) J(U h ) = 1 A(Uh )(Z Ψ 2 h ) } { J (U h )(U Φ h ) A (U h )(U Φ h,z h ) } +R (3) h, with arbitrary Φ h = {ϕ h,µ h }, Ψ h = {ψ h,χ h } V h, and the cubic remainder R (3) h = { J (U h +se)(e,e,e) A (U h +se)(e,e,e,z h +se ) 3A (U h +se)(e,e,e ) } s(s 1)ds. First, we note that with the above settings the primal residual is given by ρ(u h )( ) = a(u h, ) λm(u h, ) and the dual residual ρ (z h )( ) by ( ). Therefore, we have j(u u h ) = 1 2 { a(uh,z ψ h ) λ h (u h,z ψ h ) } { a(u ϕh,z h ) λ h (u ϕ h,z h ) j(u h )Rem(u ϕ h,u h )+j(u ϕ h ) } +R (3) h, and consequently, by taking ϕ h = u h, j(u u h ) = a(u h,z ψ h ) λ h (u h,z ψ h )+a(u u h,z h ) λ h (u u h,z h ) To identify the remainder R (3) h j(u h )Rem(u u h,u h )}+2R (3) h., we note that J (U h +se h )(E,E,E) 0 A (U h +se)(e,e,e,z h +se ), A (U h +se)(e,e,e ) = 2(λ λ h )m(u u h,z z h ) 2( π π h )m(u u h,u u h ), which yields R (3) h = 1 2 (λ λ h)m(u u h,z z h )+ 1 2 ( π π h)m(u u h,u u h ). We recall that π = 1 2 j(u) and π h = 1 2 j(u h) and obtain R (3) h = 1 2 (λ λ h)m(u u h,z z h )+ 1 4 j(u u h)m(u u h,u u h ). From this, we infer as an intermediate result that j(u u h ) = a(u h,z ψ h ) λ h m(u h,z ψ h )+a(u u h,z h ) λ h m(u u h,z h ) j(u h )Rem(u u h,u h )+(λ λ h )m(u u h,z z h )+ 1 2 j(u u h)m(u u h,u u h ). Next, by definition and since (u h,z h ) = 0, we have a(u u h,z h ) λ h m(u u h,z h ) = (λ λ h )m(u,z h ) = (λ λ h )m(u u h,z h ).

Further, noting that m(u, u) = 1 = m(u_h, u_h), there holds

  m(u − u_h, u − u_h) = m(u, u) + m(u_h, u_h) − 2 Re m(u, u_h)
                      = m(u, u) − m(u_h, u_h) − 2 Re m(u − u_h, u_h)
                      = −2 Re m(u − u_h, u_h).

Then, combining the last three relations gives us

  j(u − u_h) = a(u_h, z − ψ_h) − λ_h m(u_h, z − ψ_h) + (λ − λ_h) m(u − u_h, z_h)
             + ½ j(u) m(u − u_h, u − u_h) + (λ − λ_h) m(u − u_h, z − z_h).

From this, we finally obtain the desired identity.   Q.E.D.

Remark 5.30: Proposition 5.9 requires λ to be simple. In the case of multiplicity m > 1, we have to consider simultaneously a whole basis {v^(i), i = 1, ..., m} of the eigenspace kern(A − λI) in setting up the dual problem.

5.7.2 Practical evaluation of the error representation

Next, we want to determine the explicit form of the residuals in Proposition 5.8. To this end, we need to be more specific about the particular structure of the eigenvalue problem considered. Here, we restrict ourselves to a simple model situation which, nevertheless, is prototypical for the problems we are interested in. On a polygonal or polyhedral domain Ω ⊂ ℝ^d consider the eigenvalue problem of a second-order elliptic differential operator A such as, for example,

  Av := −Δv + b·∇v = λv in Ω,   v|_∂Ω = 0,

with a smooth (or even constant) transport coefficient vector b satisfying ∇·b ≡ 0. In this case, we have M := id and all eigenvalues are real and positive (exercise). Further, let this eigenvalue problem be approximated by the Galerkin method using piecewise linear or d-linear finite elements on meshes T_h = {T}, as described above. Within this setting, we can proceed analogously as before, obtaining

  ρ(u_h, λ_h)(ψ) = a(u_h, ψ) − λ_h m(u_h, ψ)
    = Σ_{T∈T_h} { (Au_h − λ_h Mu_h, ψ)_T − (∂_n^A u_h, ψ)_{∂T} }
    = Σ_{T∈T_h} { (Au_h − λ_h Mu_h, ψ)_T − ([∂_n^A u_h], ψ)_{∂T} },

  ρ*(u_h*, λ_h*)(ϕ) = a(ϕ, z_h) − λ_h m(ϕ, z_h)
    = Σ_{T∈T_h} { (ϕ, A*z_h − λ_h Mz_h)_T − (ϕ, ∂_n^{A*} z_h)_{∂T} }
    = Σ_{T∈T_h} { (ϕ, A*z_h − λ_h Mz_h)_T − (ϕ, [∂_n^{A*} z_h])_{∂T} }.

Hence, using again the notation of equation and jump residuals R_h, R_h*, r_h, and r_h*, respectively, introduced analogously as above, the residual admits the estimate

  | ρ(u_h, λ_h)(u* − ψ_h) + ρ*(u_h*, λ_h*)(u − ϕ_h) | ≤ Σ_{T∈T_h} { ρ_T ω_T* + ρ_T* ω_T },

with the cell residuals ρ_T, ρ_T* and weights ω_T, ω_T* defined by

  ρ_T := ( ‖R_h‖_T² + h_T^{−1} ‖r_h‖_{∂T}² )^{1/2},     ρ_T* := ( ‖R_h*‖_T² + h_T^{−1} ‖r_h*‖_{∂T}² )^{1/2},
  ω_T := ( ‖u − ϕ_h‖_T² + h_T ‖u − ϕ_h‖_{∂T}² )^{1/2},   ω_T* := ( ‖u* − ψ_h‖_T² + h_T ‖u* − ψ_h‖_{∂T}² )^{1/2}.

As a consequence of the above discussion, we obtain the following result.

Proposition 5.10: Within the above setting, assuming that |m(u − u_h, u* − u_h*)| ≤ 1, we have the weighted a posteriori error estimate

  |λ − λ_h| ≤ η_λ^ω := Σ_{T∈T_h} { ρ_T ω_T* + ρ_T* ω_T }.

Proof. Using the above estimate of the residuals in the error representation of Proposition 5.8 gives us

  |λ − λ_h| ≤ ½ Σ_{T∈T_h} { ρ_T ω_T* + ρ_T* ω_T } + |R_h^(3)|.

Since, in virtue of the assumption,

  |R_h^(3)| = ½ |λ − λ_h| |m(u − u_h, u* − u_h*)| ≤ ½ |λ − λ_h|,

the asserted estimate follows.   Q.E.D.

Remark 5.31: From the a posteriori error estimate above one can derive the following

energy-norm-type estimates:

  |λ − λ_h| ≤ η_λ^(1) := c_λ Σ_{T∈T_h} h_T² { ρ_T² + ρ_T*² },

and, assuming the eigenfunctions u and u* to have H²-regularity,

  |λ − λ_h| ≤ η_λ^(2) := c_λ ( Σ_{T∈T_h} h_T⁴ { ρ_T² + ρ_T*² } )^{1/2},

with constants c_λ growing with λ. The proofs of these error estimates can be found in Heuveline & Rannacher [49]. Both error estimators, η_λ^(1) and η_λ^(2), are asymptotically equivalent in regular situations. However, the required H²-regularity of the eigenfunctions renders the estimator η_λ^(2) useless for the typical situations in which mesh adaptivity is needed.

Remark 5.32: Neglecting the presence of the dual eigenvalue problem, one may try to control the error in approximating the eigenvalue using the primal residual part alone, i.e., using the reduced error estimator

  η_λ^red := Σ_{T∈T_h} h_T² ρ_T².

Below, we will compare the performance of the eigenvalue-error estimators η_λ^ω, η_λ^(1), η_λ^(2), and η_λ^red for a simple model situation.

5.7.3 Balancing of discretization and iteration error

In practice the discrete primal and adjoint eigenvalue problems

  a(u_h, ψ_h) = λ_h m(u_h, ψ_h)   ∀ψ_h ∈ V_h,   m(u_h, u_h) = 1,
  a(ϕ_h, u_h*) = λ_h m(ϕ_h, u_h*)   ∀ϕ_h ∈ V_h,   m(u_h, u_h*) = 1,

are solved by some costly iterative method, e.g., by variants of the QR method for smaller problems and by Krylov-space methods, such as the Arnoldi method, for large problems. That means that usually only approximations {ũ_h, λ̃_h} ≈ {u_h, λ_h} and {ũ_h*, λ̃_h*} ≈ {u_h*, λ_h*} are available. The content of the following proposition is a combined estimate for the discretization and iteration errors. It is an immediate consequence of the corresponding result, Proposition 5.7, for the approximation of general nonlinear variational equations.

Proposition 5.11: Let {ũ_h, λ̃_h}, {ũ_h*, λ̃_h*} ∈ V_h × ℂ be any approximate solutions of the discrete primal and adjoint eigenvalue problems above, respectively,

obtained by any iterative process on the mesh T_h. Then, with the notation of Proposition 5.8, there holds the following error representation:

  λ − λ̃_h = ½ ρ(ũ_h, λ̃_h)(u* − ψ_h) + ½ ρ*(ũ_h*, λ̃_h*)(u − ϕ_h) + ρ(ũ_h, λ̃_h)(ũ_h*) + R̃_h^(3),

for arbitrary ψ_h, ϕ_h ∈ V_h, with the cubic remainder

  R̃_h^(3) := ½ (λ − λ̃_h) m(u − ũ_h, u* − ũ_h*).

Proof. The perturbed error representation can be derived from the corresponding general result of Proposition 5.7,

  J(U) − J(Ũ_h) = ½ ρ(Ũ_h)(Z − Z̃_h) + ½ ρ*(Ũ_h, Z̃_h)(U − Ũ_h) + ρ(Ũ_h)(Z̃_h) + R_h^(3),

with the residual terms

  ρ(Ũ_h)(·) := −A(Ũ_h)(·),   ρ*(Ũ_h, Z̃_h)(·) := J'(Ũ_h)(·) − A'(Ũ_h)(·, Z̃_h),

and the cubic remainder term

  R_h^(3) = ½ ∫_0^1 { J'''(Ũ_h + sẽ)(ẽ, ẽ, ẽ) − A'''(Ũ_h + sẽ)(ẽ, ẽ, ẽ, Z̃_h + sẽ*) − 3 A''(Ũ_h + sẽ)(ẽ, ẽ, ẽ*) } s(s−1) ds,

by the same arguments as already used in the proof of Proposition 5.8. We skip the technical details.   Q.E.D.

5.7.4 Numerical examples

We consider again the convection-diffusion model eigenvalue problem

  −Δv + b·∇v = λv in Ω,   v|_∂Ω = 0,

on the slit domain Ω = (−1,1)×(−1,3) \ {x ∈ ℝ² : x_1 = 0, −1 < x_2 ≤ 0}, with the transport vector b = (0, b_y)ᵀ; for a sketch of this configuration, see Figure 5.15. In the computations for this test problem the mesh refinement is organized according to the fixed-rate strategy with refinement rate X = 0.2 (see the section on mesh adaptation above). All discrete eigenvalue problems are solved exactly, i.e., the additional iteration error does not need to be taken into account.

Test 1: Symmetric case. At first, we consider the symmetric eigenvalue problem, i.e., b = 0. The eigenfunction corresponding to the smallest eigenvalue and the computational mesh generated on the

basis of the weighted error estimator η_λ^ω are shown in Figure 5.15. Figure 5.16 shows the mesh efficiencies achieved on the basis of the different error estimators η_λ^ω, η_λ^(1) and η_λ^(2) introduced above, compared with that of uniform mesh refinement. (In this case η_λ^red and η_λ^(1) are equivalent.) We see that in the symmetric case all estimators show an almost equally good performance compared to that of uniform refinement. This is due to the dominance of the error caused by the slit singularity, which is well captured by all residual-based error estimators. In fact, it is known that in the symmetric case the eigenvalue error is proportional to the square of the energy-norm error,

  |λ − λ_h| ≈ ‖∇(u − u_h)‖².

Figure 5.15: Configuration for b ≠ 0 (left), adapted mesh with 12,000 cells (middle), eigenfunction (right).
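The proportionality noted above can be made precise by a classical identity, which is not spelled out in the text but follows directly from the variational equations and the normalizations m(u,u) = m(u_h,u_h) = 1 in the symmetric case a(v,w) = (∇v, ∇w):

```latex
% Expanding and using a(u,\psi)=\lambda m(u,\psi), a(u_h,u_h)=\lambda_h:
%   a(u-u_h,u-u_h) = a(u,u) - 2a(u,u_h) + a(u_h,u_h) = \lambda - 2\lambda\,m(u,u_h) + \lambda_h,
%   m(u-u_h,u-u_h) = 2 - 2\,m(u,u_h),
% and subtracting \lambda times the second line from the first gives
\lambda_h - \lambda \;=\; \|\nabla(u-u_h)\|^2 \;-\; \lambda\,\|u-u_h\|^2 .
```

Since ‖u − u_h‖ is of higher order than ‖∇(u − u_h)‖, this explains why the eigenvalue error behaves like the square of the energy-norm error.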

Figure 5.16: Mesh efficiency (|λ − λ_h| versus N) achieved using the different error estimators η_λ^(1), η_λ^(2), and η_λ^ω, compared against uniform refinement. The curves for η_λ^(1) and η_λ^ω lie on top of each other.

Test 2: Nonsymmetric case. Next, we consider the nonsymmetric version of the test eigenvalue problem with vertical transport, b_y = 3. In this case, due to the Dirichlet boundary conditions, the primal eigenfunction has a steeper gradient at the top boundary, while the dual eigenfunction has one at the bottom boundary; see Figure 5.17. The latter boundary layer strongly interferes with the slit singularity. Figure 5.18 shows adapted meshes obtained using the energy-type error estimator η_λ^(1), the reduced error estimator η_λ^red, and the weighted error estimator η_λ^ω. The superiority of the latter is clearly seen in Figure 5.19.

Figure 5.17: Primal (left) and dual eigenfunction (right).

Figure 5.18: Adapted meshes with 10,000 cells on the basis of the error estimators η_λ^(1) (left), η_λ^red (middle), and η_λ^ω (right).

Figure 5.19: Mesh efficiencies (|λ − λ_h| versus N) achieved on the basis of the different error estimators η_λ^(1), η_λ^red, and η_λ^ω, compared against uniform refinement.

5.8 Application to optimization problems

As another important application of the general theory of the DWR method developed above, we consider optimization problems with PDE constraints, as discussed in Becker et al. [38]. In abstract variational notation, such problems are posed in a state space V and a control space Q, on which the cost functional J(·,·) is defined. We consider a special class of such problems in which the state and control operators are associated with an (in general) semi-linear form A(·)(·) and a bilinear form B(·,·), respectively. The problem then reads: Minimize J(u,q) over pairs {u,q} ∈ V × Q, subject to the constraint

  A(u)(ψ) + B(q, ψ) = 0   ∀ψ ∈ V.

Its Galerkin approximation uses subspaces V_h × Q_h ⊂ V × Q and reads as follows: Minimize J(u_h, q_h) over pairs {u_h, q_h} ∈ V_h × Q_h, subject to the constraint

  A(u_h)(ψ_h) + B(q_h, ψ_h) = 0   ∀ψ_h ∈ V_h.

As before, the semi-linear form A(·)(·) is assumed to be sufficiently (directionally) differentiable, while for simplicity the cost functional J(·,·) is assumed to be linear or quadratic. This abstract setting is illustrated by the model situation described in the following example.

98 94 Adaptivity Example 5.11: We consider the state equation u+s(u) = f in Ω, ( ) posed on the T-shaped domain Ω R 2 shown in Figure 5.20, with a (usually nonlinear) reaction term s( ) and the Neumann boundary conditions n u = q on Γ C, n u = 0 on Ω\Γ C. Here, the control variable q prescribed along the control boundary Γ C is to be adjusted to minimize the cost functional (so-called tracking problem ) J(u,q) = 1 2 u u 0 2 Γ O α q 2 Γ C. The cost functional J(u, q) measures the deviation of the Dirichlet values of the solution u = u(q) along the observation boundary Γ O (the observations) from the prescribed function u 0. The parameter α 0 expresses the amount of regularization in the cost functional, which is usually needed for insuring the solvability of the optimization problem. In this problem, the generic solution spaces are V := H 1 (Ω) for the state variable u, and Q := L 2 (Γ C ) for the control variable q. The corresponding variational formulation of the state equation is ( u, ψ)+(s(u),ψ) (q,ψ) ΓC = (f,ψ) ψ V, ( ) with the control form B(q,ψ) := (q,ψ) ΓC. In the numerical example below, the nonlinear reaction term will be chosen as s(u) = u 2 (u 1). The economical numerical treatment of optimization problems with PDE constraints requires some preliminary thoughts: What is the appropriate notion of admissible states u = u(q)? In general, under discretization the state equation cannot be satisfied exactly, but only in an approximate sense. Since achieving accuracy in the approximation of PDEs is expensive the amount of admissibility of the state which is relevant for the optimization process becomes a crucial question. How to measure admissibility? In solving ODEs, we may require the error to be uniformly small. However, in the context of PDEs, the choice of an adequate error measure is not clear. Possible candidates such as the global energy-norm and L 2 norm or alternatively more localized error measures may lead to rather different degrees of admissibility of the state variable. For example, in minimizing the drag coefficient of a body embedded in a viscous, incompressible fluid the state corresponding to an optimal control may turn out as not at all incompressible in large parts of the flow domain considered. The practical answer to these questions should be taken from the optimization problem itself and is the theme of the following analysis.

99 5.8 Application to optimization problems 95 Observation boundary Γ O Control/Observation boundary Γ C = Γ O Control boundary ΓC Figure 5.20: Configuration of the boundary control model problem on a T-domain: configuration 1 (left), configuration 2 (right) A posteriori error analysis via Euler-Lagrange formalism In order to develop the adaptive discretization of optimal control problems as introduced above, we embed them into the abstract framework laid out in Section 5.6. This means that we use the so-called indirect method of optimization in which stationary points of the associated Lagrangian functional are computed among which the possible local minima can be found. The more traditional direct method of optimization tries to minimize the cost functional by working only on the set of admissible functions, i.e. those pairs {u,q} satisfying the state equation. In this case, we define the Lagrangian functional by L(u,q,λ) := J(u,q) A(u)(λ) B(q,λ), with the primal variable u V, the control variable q Q, and the adjoint variable λ V (Lagrangian multiplier playing here the role of the dual solution z from above). Again, we seek stationary points x := {u,q,λ} X := V Q V of L(x) := L(u,q,λ), which are determined by the following system of equations (KKT system): J u(u,q)(ϕ) A (u)(ϕ,λ) = 0 ϕ V, ( ) J q (u,q)(χ) B(χ,λ) = 0 χ Q, ( ) A(u)(ψ) B(q,ψ) = 0 ψ V. ( ) Notice that the third equation is just the state equation to be satisfied by any admissible pair {u,q}. The Galerkin approximation determines x h := {u h,q h,λ h } X h := V h Q h V h by a corresponding system of discrete equations: J u(u h,q h )(ϕ h ) A (u h )(ϕ h,λ h ) = 0 ϕ h V h, ( ) J q (u h,q h )(χ h ) B(χ h,λ h ) = 0 χ h Q h, ( ) A(u h )(ψ h ) B(q h,ψ h ) = 0 ψ h V h. ( )

100 96 Adaptivity We assume that both systems ( ) ( ) and ( ) ( ) have solutions. This has to be shown in a concrete situations using the particular structural properties of the problem considered. It appears quite natural to control the error of this approximation with respect to the cost functional, i.e. by estimating J(u,q) J(u h,q h ) in terms of the dual, control, and primal residuals defined by ρ (u h,q h,λ h )( ) := J u(u h,q h )( ) A (u h )(,λ h ), ρ q (u h,q h,λ h )( ) := J q (u h,q h )( ) B(,λ h ), ρ(u h,q h )( ) := A(u h )( ) B(,q h ). To this situation, we can directly apply the abstract result of Proposition 5.4. Proposition 5.12: For the approximation of the KKT system ( ) ( ) by the system ( ) ( ), we have the error representation J(u,q) J(u h,q h ) = 1 2 ρ (u h,q h,λ h )(u ϕ h )+ 1 2 ρq (u h,q h,λ h )(q χ h ) + 1 ρ(u 2 h,q h )(λ ψ h ) + R (3) h, ( ) for arbitrary ϕ h, ψ h V h and χ h Q h. The remainder R (3) h e u := u u h, e q := q q h, e λ := λ λ h, is cubic in the errors R (3) h = { A (u h +se u )(e u,e u,e u,λ h +se λ )+3A (u h +se u )(e u,e u,e λ ) } s(s 1)ds. Proof. For the proof, we recall the abstract error representation from Proposition 5.4, which applied to the Lagrangian functional reads as follows: L(x) L(x h ) = 1 2 L (x h )(x y h )+R (3) h, y h X h, ( ) with a remainder term R (3) h, which is cubic in the error ex := x x h, R (3) h := L (x h +se x )(e x,e x,e x )s(s 1)ds, where x = {u,q,λ} and x h = {u h,q h,λ h }. By computing the explicit form of the derivative L and noting that L(x) L(x h ) = J(u,q) J(u h,q h ), we obtain the residual part of the asserted representation. Since the control form B(, ) is assumed to be linear, L(u,q,λ) is linear in λ, and J(u,q) is assumed to be at most quadratic, the third derivative of L( ) consists of only two terms, namely, A (u h +se u )(e u,e u,e u,λ h +se λ ) 3A (u h +se u )(e u,e u,e λ ).

101 5.8 Application to optimization problems 97 This completes the proof. Q.E.D. Remark 5.33: We note that for the concrete situation of equation ( ) the remainder term in the error representation ( ) has the form R (3) h = { (s (u h +se u )e u3,λ h +se λ )+3(s (u h +se u )e u2,e λ ) } s(s 1)ds. Remark 5.34: In the a posteriori error representation ( ) only the primary variables x = {u, q, λ} occur which have to be computed simultaneously in the indirect method of optimization. Hence, no extra dual problem has to be solved, which makes this approach of error estimation with respect to the natural cost functional especially attractive. The mechanism underlying this property is the same, which was already seen in the context of eigenvalue computation. In all these examples, the error is to be controlled with respect to the functional from which the variational equations are derived as first-order stationarity conditions. Remark 5.35: For the practical solution of the nonlinear coupled system ( ) ( ), we use a nested iteration consisting of an outer Newton iteration with automatic mesh adaptation and an inner linear multigrid iteration; see Section Since this process starts from a coarse mesh, which is then successively refined, we may also speak of model enrichment in adaptively solving the optimality system. Remark 5.36: The indirect method of optimization combined with discretization yields approximations {u h,q h,λ h } which may be admissible only in a very weak sense. Nevertheless, by construction, on the optimally adapted mesh T h the obtained control q opt h yields a value of the cost functional J(u opt h,qopt h ) which differs from the exact optimal value J(u,q) only by the prescribed tolerance TOL. If for certain reasons a more admissible state variable ũ h is needed, then we may generate this in a post-processing step by solving the state equation on a finer mesh with the previously computed q opt h as fixed data: A(ũ h )(ϕ h ) = B(q opt h,ϕ h) ϕ h Ṽh Application to a concrete boundary control problem We consider the example described above with the state equation (5.20) and nonlinearity s(u) := u 3 u, posed on the T-domain shown in Figure Then, the KKT system reads as follows: (ϕ,u u 0 ) ΓO ( ϕ, λ) Ω (s (u)ϕ,λ) Ω = 0 ϕ V, α(q,χ) ΓC +(λ,χ) ΓC = 0 χ Q, ( u, ψ) Ω (s(u),ψ) Ω +(f,ψ) Ω +(q,ψ) ΓC = 0 ψ V. ( )

102 98 Adaptivity For illustration, we also state the strong form of this system: λ+s (u)λ = 0 in Ω, n λ ΓO = u u 0, n λ Ω\ΓO = 0, λ ΓC = αq ΓC, u+s(u) = f in Ω, n u ΓC = q, n u Ω\ΓC = 0. ( ) For the Galerkin approximation of this optimality system, we use the standard spaces V h of bilinear finite elements on meshes T h for the primal and adjoint variables u h and λ h, respectively. For the discretization of the control variable q, we use the space of traces of normal derivatives of functions in V h on Γ C, i.e. piecewise linear shape functions. The discrete optimality system reads (ϕ h,u h u 0 ) ΓO ( ϕ h, λ h ) Ω (s (u h )ϕ h,λ h ) Ω = 0 ϕ h V h, α(q h,χ h ) ΓC +(λ h,χ h ) ΓC = 0 χ h Q h, ( u h, ψ h ) Ω (s(u h ),ψ h ) Ω +(f,ψ h ) Ω +(q h,ψ h ) ΓC = 0 ψ h V h. ( ) For steering the mesh adaptation, we use the following a posteriori error estimator which is obtained from Proposition 5.12: { } J(u,q) J(u h,q h ) η ω := 1 2 ρ u T ωt λ +ρ q T ωq T +ρλ T ωt u, ( ) T T h where the residuals and weights are for x h = {u h,q h,λ h } defined by ρ λ T := Rλ h T +h 1/2 T r λ h T, ω u T := u I hu T +h 1/2 T u I hu T, ρ q T := h 1/2 T r q h T, ω q T := h1/2 T q I hq T, ρ u T := Ru h T +h 1/2 T r u h T, ω λ T := λ I hλ T +h 1/2 T λ I hλ T. with suitable approximations {I h u,i h q,i h λ} V h Q h V h. The cell residuals are given by R u h T := f + u h s(u h ), R λ h T := λ h s (u h )λ h, and edge residuals by 1 [ rh Γ u 2 nu h ], if Γ Ω, := n u h, if Γ Ω\Γ C, n u h q h, if Γ Γ C, 1 [ rh Γ λ 2 nλ h ], if Γ Ω, := n λ h, if Γ Ω\Γ O, n λ h u h +u 0, if Γ Γ O. r q h Γ := { λh +αq h, if Γ Γ C, 0, if Γ Γ C, We will compare the performance of the weighted error estimator ( ) with more traditional error estimators. Control of the error in the energy-norm of the state equation

103 5.8 Application to optimization problems 99 alone leads us to the reduced a posteriori error estimator ( ) 1/2, ηe red := c I h 2 T (ρ u T) 2 ( ) T T h while taking into account the full optimality system ( ) ( ) to η E := c I ( T T h h 2 T { (ρ u T ) 2 +(ρ λ T) 2 +(ρ q T )2}) 1/2, ( ) with the residual terms as defined above and interpolation constants c I usually set to be one. These ad hoc criteria aim at satisfying the state equation or the whole set of optimality conditions uniformly with good accuracy. However, this concept seems questionable since it does not take into account the sensitivity of the cost functional with respect to the discretization in different parts of the domain. Capturing these dependencies is the particular feature of the DWR method. Numerical results for Configuration 1 The configuration is as shown in Figure 5.20, with u 0 = sin(0.19x) and α = 0. This test case represents an extreme situation since here the observation u ΓO is evaluated right at the control boundary Γ C, i.e., the flow of information does not have to pass through the domain. Hence, the corner singularities do not much affect the optimization process and should not induce mesh refinement. N E rel I eff J(e) e-05 1e-06 1e-07 Reduced energy estimator Energy estimator DWR estimator Figure 5.21: Relative error E rel and effectivity index I eff obtained by the error estimator η ω (left), and efficiency of the meshes generated by the estimators ηe red (solid line), η E (dotted line, +), and η ω (dotted line). N

104 100 Adaptivity Figure 5.22: Size of cell-error indicators η T in the weighted error estimator η ω (left) and the energy-norm error estimator ηe red (right). In Figure 5.21, we compare the efficiency of the meshes generated by the weighted error estimator and the energy-norm error indicators. The first one yields significantly more economical meshes; the optimal value (J(u h,q h ) = ) of the cost function is obtained with only 3,500 cells compared to the 100,000 cells needed by the energy-norm error estimator. Figure 5.22 shows the distribution of cell-error indicators derived from the different a posteriori error estimators, and Figure 5.23 shows the discrete solutions obtained on the corresponding adapted meshes. Figure 5.23: Comparison of discrete solutions on adapted meshes obtained by the error estimators η ω (left) and ηe red (right). Numerical results for Configuration 2 For the second test, we take the observations as u 0 1 and set the regularization parameter to α = 0.1. Now, there exist several stationary points of the Euler-Lagrange equations. By varying the starting values for the Newton iteration, we can approximate

105 5.8 Application to optimization problems 101 each of these solutions. The trivial solution corresponding to u 1 is actually the global minimum of the cost functional. The two other computed solutions correspond to a local minimum and a local maximum. The effectivity of the error estimator η ω for computing the local minimum is shown in Figure Figure 5.25 shows the distribution of the local cell indicators η K for the two error estimators η ω and ηe red ; the corresponding meshes are seen in Figure Obviously, the weighted error estimator induces a stronger refinement along the observation and control boundaries which seems more relevant for the optimization process than resolving the corner singularities. However, in this case the error contribution by the interior cells around the reentrant corners is dominant over that by the boundary cells, such that also the energy-norm error estimator yields reasonable meshes for the optimization process. This explains why the gain in efficiency (about 25%) of the error estimator η ω over ηe red is less significant here compared to the previous example. N E rel I eff J(e) e-05 1e-06 1e-07 1e-08 1e-09 1e-10 Reduced energy estimator Energy estimator DWR estimator N Figure 5.24: Relative error E rel and effectivity index I eff obtained by the error estimator η ω (left), and efficiency of the meshes generated by the estimators ηe red (solid line), η E (dashed line, +), and η ω (dotted line).

106 102 Adaptivity Figure 5.25: Size of cell-error indicators η T in the weighted error estimator η ω (left) and the energy-norm error estimator ηe red (right). Figure 5.26: Comparison of discrete solutions on adapted meshes obtained by the error estimators η ω (left) and ηe red (right) Application to parameter estimation Another important class of optimization problems related to PDEs are so-called parameter estimation problems. Here, we consider the model problem u+qu = f in Ω, u Ω = 0. ( )


Key words. optimal control, heat equation, control constraints, state constraints, finite elements, a priori error estimates A PRIORI ERROR ESTIMATES FOR FINITE ELEMENT DISCRETIZATIONS OF PARABOLIC OPTIMIZATION PROBLEMS WITH POINTWISE STATE CONSTRAINTS IN TIME DOMINIK MEIDNER, ROLF RANNACHER, AND BORIS VEXLER Abstract. In this

More information

Newton-Multigrid Least-Squares FEM for S-V-P Formulation of the Navier-Stokes Equations

Newton-Multigrid Least-Squares FEM for S-V-P Formulation of the Navier-Stokes Equations Newton-Multigrid Least-Squares FEM for S-V-P Formulation of the Navier-Stokes Equations A. Ouazzi, M. Nickaeen, S. Turek, and M. Waseem Institut für Angewandte Mathematik, LSIII, TU Dortmund, Vogelpothsweg

More information

RESIDUAL BASED ERROR ESTIMATES FOR THE SPACE-TIME DISCONTINUOUS GALERKIN METHOD APPLIED TO NONLINEAR HYPERBOLIC EQUATIONS

RESIDUAL BASED ERROR ESTIMATES FOR THE SPACE-TIME DISCONTINUOUS GALERKIN METHOD APPLIED TO NONLINEAR HYPERBOLIC EQUATIONS Proceedings of ALGORITMY 2016 pp. 113 124 RESIDUAL BASED ERROR ESTIMATES FOR THE SPACE-TIME DISCONTINUOUS GALERKIN METHOD APPLIED TO NONLINEAR HYPERBOLIC EQUATIONS VÍT DOLEJŠÍ AND FILIP ROSKOVEC Abstract.

More information

Adaptive C1 Macroelements for Fourth Order and Divergence-Free Problems

Adaptive C1 Macroelements for Fourth Order and Divergence-Free Problems Adaptive C1 Macroelements for Fourth Order and Divergence-Free Problems Roy Stogner Computational Fluid Dynamics Lab Institute for Computational Engineering and Sciences University of Texas at Austin March

More information

Finite Element Error Estimates in Non-Energy Norms for the Two-Dimensional Scalar Signorini Problem

Finite Element Error Estimates in Non-Energy Norms for the Two-Dimensional Scalar Signorini Problem Journal manuscript No. (will be inserted by the editor Finite Element Error Estimates in Non-Energy Norms for the Two-Dimensional Scalar Signorini Problem Constantin Christof Christof Haubner Received:

More information

Discontinuous Galerkin Methods

Discontinuous Galerkin Methods Discontinuous Galerkin Methods Joachim Schöberl May 20, 206 Discontinuous Galerkin (DG) methods approximate the solution with piecewise functions (polynomials), which are discontinuous across element interfaces.

More information

Scientific Computing I

Scientific Computing I Scientific Computing I Module 8: An Introduction to Finite Element Methods Tobias Neckel Winter 2013/2014 Module 8: An Introduction to Finite Element Methods, Winter 2013/2014 1 Part I: Introduction to

More information

Local discontinuous Galerkin methods for elliptic problems

Local discontinuous Galerkin methods for elliptic problems COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING Commun. Numer. Meth. Engng 2002; 18:69 75 [Version: 2000/03/22 v1.0] Local discontinuous Galerkin methods for elliptic problems P. Castillo 1 B. Cockburn

More information

Maximum norm estimates for energy-corrected finite element method

Maximum norm estimates for energy-corrected finite element method Maximum norm estimates for energy-corrected finite element method Piotr Swierczynski 1 and Barbara Wohlmuth 1 Technical University of Munich, Institute for Numerical Mathematics, piotr.swierczynski@ma.tum.de,

More information

A very short introduction to the Finite Element Method

A very short introduction to the Finite Element Method A very short introduction to the Finite Element Method Till Mathis Wagner Technical University of Munich JASS 2004, St Petersburg May 4, 2004 1 Introduction This is a short introduction to the finite element

More information

HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS

HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS JASON ALBRIGHT, YEKATERINA EPSHTEYN, AND QING XIA Abstract. Highly-accurate numerical methods that can efficiently

More information

Remarks on the analysis of finite element methods on a Shishkin mesh: are Scott-Zhang interpolants applicable?

Remarks on the analysis of finite element methods on a Shishkin mesh: are Scott-Zhang interpolants applicable? Remarks on the analysis of finite element methods on a Shishkin mesh: are Scott-Zhang interpolants applicable? Thomas Apel, Hans-G. Roos 22.7.2008 Abstract In the first part of the paper we discuss minimal

More information

A Framework for Analyzing and Constructing Hierarchical-Type A Posteriori Error Estimators

A Framework for Analyzing and Constructing Hierarchical-Type A Posteriori Error Estimators A Framework for Analyzing and Constructing Hierarchical-Type A Posteriori Error Estimators Jeff Ovall University of Kentucky Mathematics www.math.uky.edu/ jovall jovall@ms.uky.edu Kentucky Applied and

More information

Anisotropic meshes for PDEs: a posteriori error analysis and mesh adaptivity

Anisotropic meshes for PDEs: a posteriori error analysis and mesh adaptivity ICAM 2013, Heraklion, 17 September 2013 Anisotropic meshes for PDEs: a posteriori error analysis and mesh adaptivity Simona Perotto MOX-Modeling and Scientific Computing Department of Mathematics, Politecnico

More information

HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS

HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS HIGH-ORDER ACCURATE METHODS BASED ON DIFFERENCE POTENTIALS FOR 2D PARABOLIC INTERFACE MODELS JASON ALBRIGHT, YEKATERINA EPSHTEYN, AND QING XIA Abstract. Highly-accurate numerical methods that can efficiently

More information

Singularly Perturbed Partial Differential Equations

Singularly Perturbed Partial Differential Equations WDS'9 Proceedings of Contributed Papers, Part I, 54 59, 29. ISN 978-8-7378--9 MTFYZPRESS Singularly Perturbed Partial Differential Equations J. Lamač Charles University, Faculty of Mathematics and Physics,

More information

R T (u H )v + (2.1) J S (u H )v v V, T (2.2) (2.3) H S J S (u H ) 2 L 2 (S). S T

R T (u H )v + (2.1) J S (u H )v v V, T (2.2) (2.3) H S J S (u H ) 2 L 2 (S). S T 2 R.H. NOCHETTO 2. Lecture 2. Adaptivity I: Design and Convergence of AFEM tarting with a conforming mesh T H, the adaptive procedure AFEM consists of loops of the form OLVE ETIMATE MARK REFINE to produce

More information

PREPRINT 2010:25. Fictitious domain finite element methods using cut elements: II. A stabilized Nitsche method ERIK BURMAN PETER HANSBO

PREPRINT 2010:25. Fictitious domain finite element methods using cut elements: II. A stabilized Nitsche method ERIK BURMAN PETER HANSBO PREPRINT 2010:25 Fictitious domain finite element methods using cut elements: II. A stabilized Nitsche method ERIK BURMAN PETER HANSBO Department of Mathematical Sciences Division of Mathematics CHALMERS

More information

Due Tuesday, November 23 nd, 12:00 midnight

Due Tuesday, November 23 nd, 12:00 midnight Due Tuesday, November 23 nd, 12:00 midnight This challenging but very rewarding homework is considering the finite element analysis of advection-diffusion and incompressible fluid flow problems. Problem

More information

DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement

DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement HONOM 2011 in Trento DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement Institute of Aerodynamics and Flow Technology DLR Braunschweig 11. April 2011 1 / 35 Research

More information

A Finite Element Method for the Surface Stokes Problem

A Finite Element Method for the Surface Stokes Problem J A N U A R Y 2 0 1 8 P R E P R I N T 4 7 5 A Finite Element Method for the Surface Stokes Problem Maxim A. Olshanskii *, Annalisa Quaini, Arnold Reusken and Vladimir Yushutin Institut für Geometrie und

More information

An Equal-order DG Method for the Incompressible Navier-Stokes Equations

An Equal-order DG Method for the Incompressible Navier-Stokes Equations An Equal-order DG Method for the Incompressible Navier-Stokes Equations Bernardo Cockburn Guido anschat Dominik Schötzau Journal of Scientific Computing, vol. 40, pp. 188 10, 009 Abstract We introduce

More information

u = f in Ω, u = q on Γ. (1.2)

u = f in Ω, u = q on Γ. (1.2) ERROR ANALYSIS FOR A FINITE ELEMENT APPROXIMATION OF ELLIPTIC DIRICHLET BOUNDARY CONTROL PROBLEMS S. MAY, R. RANNACHER, AND B. VEXLER Abstract. We consider the Galerkin finite element approximation of

More information

Solving PDEs with freefem++

Solving PDEs with freefem++ Solving PDEs with freefem++ Tutorials at Basque Center BCA Olivier Pironneau 1 with Frederic Hecht, LJLL-University of Paris VI 1 March 13, 2011 Do not forget That everything about freefem++ is at www.freefem.org

More information

Institut de Recherche MAthématique de Rennes

Institut de Recherche MAthématique de Rennes LMS Durham Symposium: Computational methods for wave propagation in direct scattering. - July, Durham, UK The hp version of the Weighted Regularization Method for Maxwell Equations Martin COSTABEL & Monique

More information

A Multigrid Method for Two Dimensional Maxwell Interface Problems

A Multigrid Method for Two Dimensional Maxwell Interface Problems A Multigrid Method for Two Dimensional Maxwell Interface Problems Susanne C. Brenner Department of Mathematics and Center for Computation & Technology Louisiana State University USA JSA 2013 Outline A

More information

Numerical Solutions to Partial Differential Equations

Numerical Solutions to Partial Differential Equations Numerical Solutions to Partial Differential Equations Zhiping Li LMAM and School of Mathematical Sciences Peking University The Residual and Error of Finite Element Solutions Mixed BVP of Poisson Equation

More information

PDEs, part 1: Introduction and elliptic PDEs

PDEs, part 1: Introduction and elliptic PDEs PDEs, part 1: Introduction and elliptic PDEs Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2013 Partial di erential equations The solution depends on several variables,

More information

Lehrstuhl Informatik V. Lehrstuhl Informatik V. 1. solve weak form of PDE to reduce regularity properties. Lehrstuhl Informatik V

Lehrstuhl Informatik V. Lehrstuhl Informatik V. 1. solve weak form of PDE to reduce regularity properties. Lehrstuhl Informatik V Part I: Introduction to Finite Element Methods Scientific Computing I Module 8: An Introduction to Finite Element Methods Tobias Necel Winter 4/5 The Model Problem FEM Main Ingredients Wea Forms and Wea

More information

Introduction. J.M. Burgers Center Graduate Course CFD I January Least-Squares Spectral Element Methods

Introduction. J.M. Burgers Center Graduate Course CFD I January Least-Squares Spectral Element Methods Introduction In this workshop we will introduce you to the least-squares spectral element method. As you can see from the lecture notes, this method is a combination of the weak formulation derived from

More information

i=1 α i. Given an m-times continuously

i=1 α i. Given an m-times continuously 1 Fundamentals 1.1 Classification and characteristics Let Ω R d, d N, d 2, be an open set and α = (α 1,, α d ) T N d 0, N 0 := N {0}, a multiindex with α := d i=1 α i. Given an m-times continuously differentiable

More information

Discontinuous Galerkin methods for nonlinear elasticity

Discontinuous Galerkin methods for nonlinear elasticity Discontinuous Galerkin methods for nonlinear elasticity Preprint submitted to lsevier Science 8 January 2008 The goal of this paper is to introduce Discontinuous Galerkin (DG) methods for nonlinear elasticity

More information

Chapter Two: Numerical Methods for Elliptic PDEs. 1 Finite Difference Methods for Elliptic PDEs

Chapter Two: Numerical Methods for Elliptic PDEs. 1 Finite Difference Methods for Elliptic PDEs Chapter Two: Numerical Methods for Elliptic PDEs Finite Difference Methods for Elliptic PDEs.. Finite difference scheme. We consider a simple example u := subject to Dirichlet boundary conditions ( ) u

More information

Solving PDEs with Multigrid Methods p.1

Solving PDEs with Multigrid Methods p.1 Solving PDEs with Multigrid Methods Scott MacLachlan maclachl@colorado.edu Department of Applied Mathematics, University of Colorado at Boulder Solving PDEs with Multigrid Methods p.1 Support and Collaboration

More information

Convergence and optimality of an adaptive FEM for controlling L 2 errors

Convergence and optimality of an adaptive FEM for controlling L 2 errors Convergence and optimality of an adaptive FEM for controlling L 2 errors Alan Demlow (University of Kentucky) joint work with Rob Stevenson (University of Amsterdam) Partially supported by NSF DMS-0713770.

More information

Chapter 1: The Finite Element Method

Chapter 1: The Finite Element Method Chapter 1: The Finite Element Method Michael Hanke Read: Strang, p 428 436 A Model Problem Mathematical Models, Analysis and Simulation, Part Applications: u = fx), < x < 1 u) = u1) = D) axial deformation

More information

Numerical Solutions to Partial Differential Equations

Numerical Solutions to Partial Differential Equations Numerical Solutions to Partial Differential Equations Zhiping Li LMAM and School of Mathematical Sciences Peking University Numerical Methods for Partial Differential Equations Finite Difference Methods

More information

A brief introduction to finite element methods

A brief introduction to finite element methods CHAPTER A brief introduction to finite element methods 1. Two-point boundary value problem and the variational formulation 1.1. The model problem. Consider the two-point boundary value problem: Given a

More information

ENERGY NORM A POSTERIORI ERROR ESTIMATES FOR MIXED FINITE ELEMENT METHODS

ENERGY NORM A POSTERIORI ERROR ESTIMATES FOR MIXED FINITE ELEMENT METHODS ENERGY NORM A POSTERIORI ERROR ESTIMATES FOR MIXED FINITE ELEMENT METHODS CARLO LOVADINA AND ROLF STENBERG Abstract The paper deals with the a-posteriori error analysis of mixed finite element methods

More information

A note on discontinuous Galerkin divergence-free solutions of the Navier-Stokes equations

A note on discontinuous Galerkin divergence-free solutions of the Navier-Stokes equations A note on discontinuous Galerkin divergence-free solutions of the Navier-Stokes equations Bernardo Cockburn Guido anschat Dominik Schötzau June 1, 2007 Journal of Scientific Computing, Vol. 31, 2007, pp.

More information

Week 2 Notes, Math 865, Tanveer

Week 2 Notes, Math 865, Tanveer Week 2 Notes, Math 865, Tanveer 1. Incompressible constant density equations in different forms Recall we derived the Navier-Stokes equation for incompressible constant density, i.e. homogeneous flows:

More information

Numerical Solutions to Partial Differential Equations

Numerical Solutions to Partial Differential Equations Numerical Solutions to Partial Differential Equations Zhiping Li LMAM and School of Mathematical Sciences Peking University Variational Problems of the Dirichlet BVP of the Poisson Equation 1 For the homogeneous

More information

ALGEBRAIC FLUX CORRECTION FOR FINITE ELEMENT DISCRETIZATIONS OF COUPLED SYSTEMS

ALGEBRAIC FLUX CORRECTION FOR FINITE ELEMENT DISCRETIZATIONS OF COUPLED SYSTEMS Int. Conf. on Computational Methods for Coupled Problems in Science and Engineering COUPLED PROBLEMS 2007 M. Papadrakakis, E. Oñate and B. Schrefler (Eds) c CIMNE, Barcelona, 2007 ALGEBRAIC FLUX CORRECTION

More information

LEAST-SQUARES FINITE ELEMENT MODELS

LEAST-SQUARES FINITE ELEMENT MODELS LEAST-SQUARES FINITE ELEMENT MODELS General idea of the least-squares formulation applied to an abstract boundary-value problem Works of our group Application to Poisson s equation Application to flows

More information

Axioms of Adaptivity (AoA) in Lecture 3 (sufficient for optimal convergence rates)

Axioms of Adaptivity (AoA) in Lecture 3 (sufficient for optimal convergence rates) Axioms of Adaptivity (AoA) in Lecture 3 (sufficient for optimal convergence rates) Carsten Carstensen Humboldt-Universität zu Berlin 2018 International Graduate Summer School on Frontiers of Applied and

More information

Turbulent Boundary Layers & Turbulence Models. Lecture 09

Turbulent Boundary Layers & Turbulence Models. Lecture 09 Turbulent Boundary Layers & Turbulence Models Lecture 09 The turbulent boundary layer In turbulent flow, the boundary layer is defined as the thin region on the surface of a body in which viscous effects

More information

Discrete Projection Methods for Incompressible Fluid Flow Problems and Application to a Fluid-Structure Interaction

Discrete Projection Methods for Incompressible Fluid Flow Problems and Application to a Fluid-Structure Interaction Discrete Projection Methods for Incompressible Fluid Flow Problems and Application to a Fluid-Structure Interaction Problem Jörg-M. Sautter Mathematisches Institut, Universität Düsseldorf, Germany, sautter@am.uni-duesseldorf.de

More information

Aspects of Multigrid

Aspects of Multigrid Aspects of Multigrid Kees Oosterlee 1,2 1 Delft University of Technology, Delft. 2 CWI, Center for Mathematics and Computer Science, Amsterdam, SIAM Chapter Workshop Day, May 30th 2018 C.W.Oosterlee (CWI)

More information

Least-Squares Finite Element Methods

Least-Squares Finite Element Methods Pavel В. Bochev Max D. Gunzburger Least-Squares Finite Element Methods Spri ringer Contents Part I Survey of Variational Principles and Associated Finite Element Methods 1 Classical Variational Methods

More information

A posteriori error estimation for elliptic problems

A posteriori error estimation for elliptic problems A posteriori error estimation for elliptic problems Praveen. C praveen@math.tifrbng.res.in Tata Institute of Fundamental Research Center for Applicable Mathematics Bangalore 560065 http://math.tifrbng.res.in

More information

IMPROVED LEAST-SQUARES ERROR ESTIMATES FOR SCALAR HYPERBOLIC PROBLEMS, 1

IMPROVED LEAST-SQUARES ERROR ESTIMATES FOR SCALAR HYPERBOLIC PROBLEMS, 1 Computational Methods in Applied Mathematics Vol. 1, No. 1(2001) 1 8 c Institute of Mathematics IMPROVED LEAST-SQUARES ERROR ESTIMATES FOR SCALAR HYPERBOLIC PROBLEMS, 1 P.B. BOCHEV E-mail: bochev@uta.edu

More information

A WEAK GALERKIN MIXED FINITE ELEMENT METHOD FOR BIHARMONIC EQUATIONS

A WEAK GALERKIN MIXED FINITE ELEMENT METHOD FOR BIHARMONIC EQUATIONS A WEAK GALERKIN MIXED FINITE ELEMENT METHOD FOR BIHARMONIC EQUATIONS LIN MU, JUNPING WANG, YANQIU WANG, AND XIU YE Abstract. This article introduces and analyzes a weak Galerkin mixed finite element method

More information

Fundamentals of Fluid Dynamics: Elementary Viscous Flow

Fundamentals of Fluid Dynamics: Elementary Viscous Flow Fundamentals of Fluid Dynamics: Elementary Viscous Flow Introductory Course on Multiphysics Modelling TOMASZ G. ZIELIŃSKI bluebox.ippt.pan.pl/ tzielins/ Institute of Fundamental Technological Research

More information

AMS subject classifications. Primary, 65N15, 65N30, 76D07; Secondary, 35B45, 35J50

AMS subject classifications. Primary, 65N15, 65N30, 76D07; Secondary, 35B45, 35J50 A SIMPLE FINITE ELEMENT METHOD FOR THE STOKES EQUATIONS LIN MU AND XIU YE Abstract. The goal of this paper is to introduce a simple finite element method to solve the Stokes equations. This method is in

More information

arxiv: v1 [physics.comp-ph] 10 Aug 2015

arxiv: v1 [physics.comp-ph] 10 Aug 2015 Numerical experiments on the efficiency of local grid refinement based on truncation error estimates Alexandros Syrakos a,, Georgios Efthimiou a, John G. Bartzis a, Apostolos Goulas b arxiv:1508.02345v1

More information

Weak Galerkin Finite Element Scheme and Its Applications

Weak Galerkin Finite Element Scheme and Its Applications Weak Galerkin Finite Element Scheme and Its Applications Ran Zhang Department of Mathematics Jilin University, China IMS, Singapore February 6, 2015 Talk Outline Motivation WG FEMs: Weak Operators + Stabilizer

More information

Key words. Incompressible magnetohydrodynamics, mixed finite element methods, discontinuous Galerkin methods

Key words. Incompressible magnetohydrodynamics, mixed finite element methods, discontinuous Galerkin methods A MIXED DG METHOD FOR LINEARIZED INCOMPRESSIBLE MAGNETOHYDRODYNAMICS PAUL HOUSTON, DOMINIK SCHÖTZAU, AND XIAOXI WEI Journal of Scientific Computing, vol. 40, pp. 8 34, 009 Abstract. We introduce and analyze

More information

Numerische Mathematik

Numerische Mathematik Numer. Math. (2012) 122:61 99 DOI 10.1007/s00211-012-0456-x Numerische Mathematik C 0 elements for generalized indefinite Maxwell equations Huoyuan Duan Ping Lin Roger C. E. Tan Received: 31 July 2010

More information

Lecture Introduction

Lecture Introduction Lecture 1 1.1 Introduction The theory of Partial Differential Equations (PDEs) is central to mathematics, both pure and applied. The main difference between the theory of PDEs and the theory of Ordinary

More information

Numerical Methods for the Navier-Stokes equations

Numerical Methods for the Navier-Stokes equations Arnold Reusken Numerical Methods for the Navier-Stokes equations January 6, 212 Chair for Numerical Mathematics RWTH Aachen Contents 1 The Navier-Stokes equations.............................................

More information

A posteriori error estimation of approximate boundary fluxes

A posteriori error estimation of approximate boundary fluxes COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING Commun. Numer. Meth. Engng 2008; 24:421 434 Published online 24 May 2007 in Wiley InterScience (www.interscience.wiley.com)..1014 A posteriori error estimation

More information

DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement

DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement 16th International Conference on Finite Elements in Flow Problems DG Methods for Aerodynamic Flows: Higher Order, Error Estimation and Adaptive Mesh Refinement Institute of Aerodynamics and Flow Technology

More information

CONVERGENCE THEORY. G. ALLAIRE CMAP, Ecole Polytechnique. 1. Maximum principle. 2. Oscillating test function. 3. Two-scale convergence

CONVERGENCE THEORY. G. ALLAIRE CMAP, Ecole Polytechnique. 1. Maximum principle. 2. Oscillating test function. 3. Two-scale convergence 1 CONVERGENCE THEOR G. ALLAIRE CMAP, Ecole Polytechnique 1. Maximum principle 2. Oscillating test function 3. Two-scale convergence 4. Application to homogenization 5. General theory H-convergence) 6.

More information

Contents as of 12/8/2017. Preface. 1. Overview...1

Contents as of 12/8/2017. Preface. 1. Overview...1 Contents as of 12/8/2017 Preface 1. Overview...1 1.1 Introduction...1 1.2 Finite element data...1 1.3 Matrix notation...3 1.4 Matrix partitions...8 1.5 Special finite element matrix notations...9 1.6 Finite

More information

Block-Structured Adaptive Mesh Refinement

Block-Structured Adaptive Mesh Refinement Block-Structured Adaptive Mesh Refinement Lecture 2 Incompressible Navier-Stokes Equations Fractional Step Scheme 1-D AMR for classical PDE s hyperbolic elliptic parabolic Accuracy considerations Bell

More information

A posteriori error estimates applied to flow in a channel with corners

A posteriori error estimates applied to flow in a channel with corners Mathematics and Computers in Simulation 61 (2003) 375 383 A posteriori error estimates applied to flow in a channel with corners Pavel Burda a,, Jaroslav Novotný b, Bedřich Sousedík a a Department of Mathematics,

More information

Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems

Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems Elliptic boundary value problems often occur as the Euler equations of variational problems the latter

More information

FINITE ELEMENT APPROXIMATION OF STOKES-LIKE SYSTEMS WITH IMPLICIT CONSTITUTIVE RELATION

FINITE ELEMENT APPROXIMATION OF STOKES-LIKE SYSTEMS WITH IMPLICIT CONSTITUTIVE RELATION Proceedings of ALGORITMY pp. 9 3 FINITE ELEMENT APPROXIMATION OF STOKES-LIKE SYSTEMS WITH IMPLICIT CONSTITUTIVE RELATION JAN STEBEL Abstract. The paper deals with the numerical simulations of steady flows

More information

7 The Navier-Stokes Equations

7 The Navier-Stokes Equations 18.354/12.27 Spring 214 7 The Navier-Stokes Equations In the previous section, we have seen how one can deduce the general structure of hydrodynamic equations from purely macroscopic considerations and

More information

An Extended Finite Element Method for a Two-Phase Stokes problem

An Extended Finite Element Method for a Two-Phase Stokes problem XFEM project An Extended Finite Element Method for a Two-Phase Stokes problem P. Lederer, C. Pfeiler, C. Wintersteiger Advisor: Dr. C. Lehrenfeld August 5, 2015 Contents 1 Problem description 2 1.1 Physics.........................................

More information

Analysis of an Adaptive Finite Element Method for Recovering the Robin Coefficient

Analysis of an Adaptive Finite Element Method for Recovering the Robin Coefficient Analysis of an Adaptive Finite Element Method for Recovering the Robin Coefficient Yifeng Xu 1 Jun Zou 2 Abstract Based on a new a posteriori error estimator, an adaptive finite element method is proposed

More information

Acceleration of a Domain Decomposition Method for Advection-Diffusion Problems

Acceleration of a Domain Decomposition Method for Advection-Diffusion Problems Acceleration of a Domain Decomposition Method for Advection-Diffusion Problems Gert Lube 1, Tobias Knopp 2, and Gerd Rapin 2 1 University of Göttingen, Institute of Numerical and Applied Mathematics (http://www.num.math.uni-goettingen.de/lube/)

More information

Weierstraß-Institut. für Angewandte Analysis und Stochastik. Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN

Weierstraß-Institut. für Angewandte Analysis und Stochastik. Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN Weierstraß-Institut für Angewandte Analysis und Stochastik Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN 2198-5855 On the divergence constraint in mixed finite element methods for incompressible

More information

c 2008 Society for Industrial and Applied Mathematics

c 2008 Society for Industrial and Applied Mathematics SIAM J. CONTROL OPTIM. Vol. 47, No. 3, pp. 1301 1329 c 2008 Society for Industrial and Applied Mathematics A PRIORI ERROR ESTIMATES FOR SPACE-TIME FINITE ELEMENT DISCRETIZATION OF PARABOLIC OPTIMAL CONTROL

More information

Electronic Transactions on Numerical Analysis Volume 32, 2008

Electronic Transactions on Numerical Analysis Volume 32, 2008 Electronic Transactions on Numerical Analysis Volume 32, 2008 Contents 1 On the role of boundary conditions for CIP stabilization of higher order finite elements. Friedhelm Schieweck. We investigate the

More information

Divergence-conforming multigrid methods for incompressible flow problems

Divergence-conforming multigrid methods for incompressible flow problems Divergence-conforming multigrid methods for incompressible flow problems Guido Kanschat IWR, Universität Heidelberg Prague-Heidelberg-Workshop April 28th, 2015 G. Kanschat (IWR, Uni HD) Hdiv-DG Práha,

More information

A local-structure-preserving local discontinuous Galerkin method for the Laplace equation

A local-structure-preserving local discontinuous Galerkin method for the Laplace equation A local-structure-preserving local discontinuous Galerkin method for the Laplace equation Fengyan Li 1 and Chi-Wang Shu 2 Abstract In this paper, we present a local-structure-preserving local discontinuous

More information

Chapter 1. Introduction and Background. 1.1 Introduction

Chapter 1. Introduction and Background. 1.1 Introduction Chapter 1 Introduction and Background 1.1 Introduction Over the past several years the numerical approximation of partial differential equations (PDEs) has made important progress because of the rapid

More information

Energy norm a-posteriori error estimation for divergence-free discontinuous Galerkin approximations of the Navier-Stokes equations

Energy norm a-posteriori error estimation for divergence-free discontinuous Galerkin approximations of the Navier-Stokes equations INTRNATIONAL JOURNAL FOR NUMRICAL MTHODS IN FLUIDS Int. J. Numer. Meth. Fluids 19007; 1:1 [Version: 00/09/18 v1.01] nergy norm a-posteriori error estimation for divergence-free discontinuous Galerkin approximations

More information

Numerical Solutions to Partial Differential Equations

Numerical Solutions to Partial Differential Equations Numerical Solutions to Partial Differential Equations Zhiping Li LMAM and School of Mathematical Sciences Peking University Nonconformity and the Consistency Error First Strang Lemma Abstract Error Estimate

More information

Glowinski Pironneau method for the 3D ω-ψ equations

Glowinski Pironneau method for the 3D ω-ψ equations 280 GUERMOND AND QUARTAPELLE Glowinski Pironneau method for the 3D ω-ψ equations Jean-Luc Guermond and Luigi Quartapelle 1 LIMSI CNRS, Orsay, France, and Dipartimento di Fisica, Politecnico di Milano,

More information