Approximation in Banach Spaces by Galerkin Methods

In this chapter, we consider an abstract linear problem which serves as a generic model for engineering applications. Our first goal is to specify the conditions under which this problem is well-posed. We use the definition proposed by Hadamard [Had32]: a problem is well-posed if it admits a unique solution and if it is endowed with a stability property, namely the solution is controlled by the data. Two important results asserting well-posedness are presented: the Lax–Milgram Lemma and the Banach–Nečas–Babuška Theorem. The former provides a sufficient condition for well-posedness, whereas the latter, relying on slightly more sophisticated assumptions, gives necessary and sufficient conditions. Then, we study approximation techniques based on the so-called Galerkin method. Both conforming and non-conforming settings are considered. We investigate under which conditions the stability properties of the abstract problem are transferred to the approximate problem, and we obtain a priori estimates for the approximation error. The last section of this chapter investigates a particular form of the Banach–Nečas–Babuška Theorem relevant to problems endowed with a saddle-point structure.

2.1 The Banach–Nečas–Babuška (BNB) Theorem

In this section, we introduce an abstract problem and determine the conditions under which this problem is well-posed.

2.1.1 Well-posedness

Consider the following (abstract) problem:
\[
\left\{
\begin{aligned}
&\text{Seek } u \in W \text{ such that} \\
&a(u, v) = f(v), \quad \forall v \in V,
\end{aligned}
\right.
\tag{2.1}
\]
where:

(i) $W$ and $V$ are vector spaces equipped with norms denoted by $\|\cdot\|_W$ and $\|\cdot\|_V$, respectively. In many applications, $W$ and $V$ are Hilbert spaces, but a more general case where $V$ is a reflexive Banach space and $W$ a Banach space can be considered. See Appendix A for an introduction to Banach and Hilbert spaces. Unless stated otherwise, we henceforth assume that $W$ and $V$ are Banach spaces and that $V$ is reflexive. $W$ is called the solution space, and $V$ is called the test space.

(ii) $a$ is a continuous bilinear form on $W \times V$, i.e., $a \in \mathcal{L}(W \times V; \mathbb{R})$; henceforth, we shall also say that $a$ is bounded on $W \times V$.

(iii) $f$ is a continuous linear form on $V$, i.e., $f \in V' = \mathcal{L}(V; \mathbb{R})$. To simplify the notation, we write $f(v)$ instead of $\langle f, v \rangle_{V',V}$.

Henceforth, well-posedness is understood in the sense introduced by Hadamard [Had32].

Definition 2.1 (Hadamard). Problem (2.1) is said to be well-posed if it admits one and only one solution and if the following a priori estimate holds:
\[
\exists c > 0, \quad \forall f \in V', \quad \|u\|_W \le c \, \|f\|_{V'}.
\]

In many applications, the bilinear form $a$ results from the weak formulation of PDEs posed on a domain $\Omega \subset \mathbb{R}^d$ with boundary conditions enforced on $\partial\Omega$. The linearity of $a$ with respect to $v$ directly results from the weak formulation, whereas the linearity with respect to $u$ is a consequence of the linearity of the model problem itself. The elements of $W$ and $V$ are scalar- or vector-valued functions defined on $\Omega$. Three important examples falling into the framework of the abstract problem (2.1) are the Laplace equation, the Stokes equations, and the advection equation. These problems (and many variants thereof) are thoroughly investigated in Chapters 3, 4, and 5, respectively. They are now briefly introduced for the sake of illustration.

The Laplace equation. Consider the PDE $-\Delta u = f$ in $\Omega$ supplemented with the homogeneous Dirichlet condition $u|_{\partial\Omega} = 0$.
This problem can be reformulated in the form (2.1) by setting $W = V = H^1_0(\Omega)$, $a(u, v) = \int_\Omega \nabla u \cdot \nabla v$, and $f(v) = \int_\Omega f v$ if $f \in L^2(\Omega)$ or $f(v) = \langle f, v \rangle_{H^{-1}, H^1_0}$ if $f \in H^{-1}(\Omega)$; see §3.1.1.

The Stokes equations. Consider the PDEs $-\Delta u + \nabla p = s$ and $\nabla \cdot u = g$ in $\Omega$ supplemented with the homogeneous Dirichlet condition $u|_{\partial\Omega} = 0$. This problem falls into the above framework by setting $W = V = [H^1_0(\Omega)]^d \times L^2_{\int=0}(\Omega)$,
\[
a((u, p), (v, q)) = \int_\Omega \nabla u : \nabla v - \int_\Omega p \, \nabla \cdot v + \int_\Omega q \, \nabla \cdot u,
\]
and $f(v, q) = \int_\Omega (s \cdot v + g q)$ provided $s \in [L^2(\Omega)]^d$ and $g \in L^2(\Omega)$, or $f(v, q) = \langle s, v \rangle_{H^{-1}, H^1_0} + \int_\Omega g q$ if $s \in [H^{-1}(\Omega)]^d$; here, $L^2_{\int=0}(\Omega)$ is the space of square-summable functions with zero mean in $\Omega$. One important difference with the Laplace equation is that the solution and the test functions are now vector-valued; see §4.1.2.

The advection equation. Let $\beta \in [C^1(\Omega)]^d$ be a given vector field and denote by $\partial\Omega^- = \{x \in \partial\Omega; \ (\beta \cdot n)(x) < 0\}$ the so-called inflow boundary, $n$ being the outward normal to $\Omega$. Consider the PDE $\beta \cdot \nabla u = f$ in $\Omega$ supplemented with the boundary condition $u|_{\partial\Omega^-} = 0$. This problem can be reformulated in the form (2.1) by setting $W = \{u \in L^2(\Omega); \ \beta \cdot \nabla u \in L^2(\Omega); \ u = 0 \text{ on } \partial\Omega^-\}$, $V = L^2(\Omega)$, $a(u, v) = \int_\Omega v \, (\beta \cdot \nabla u)$, and $f(v) = \int_\Omega f v$ provided $f \in L^2(\Omega)$. The main difference with the Laplace and the Stokes equations is that the solution space and the test space are different; see §5.2.3.

2.1.2 The Lax–Milgram Lemma

Consider the case where the solution space and the test space are identical. Thus, the model problem is:
\[
\left\{
\begin{aligned}
&\text{Seek } u \in V \text{ such that} \\
&a(u, v) = f(v), \quad \forall v \in V.
\end{aligned}
\right.
\tag{2.2}
\]

Lemma 2.2 (Lax–Milgram). Let $V$ be a Hilbert space, let $a \in \mathcal{L}(V \times V; \mathbb{R})$, and let $f \in V'$. Assume that the bilinear form $a$ is coercive, i.e.,
\[
\text{(lm)} \qquad \exists \alpha > 0, \quad \forall u \in V, \quad a(u, u) \ge \alpha \|u\|_V^2.
\]
Then, problem (2.2) is well-posed with a priori estimate
\[
\forall f \in V', \quad \|u\|_V \le \frac{1}{\alpha} \|f\|_{V'}.
\tag{2.3}
\]

Proof. Since this lemma is a consequence of the BNB Theorem, the proof is postponed to Lemma 2.8; see also Exercise 2.11 for a direct proof.

Remark 2.3. The Lax–Milgram Lemma holds in Hilbert spaces only (i.e., not in Banach spaces) since coercivity is essentially a Hilbertian property; see Exercise 2.8.

In the particular case where the bilinear form $a$ is symmetric and positive, problem (2.2) can be interpreted as an optimization problem.

Proposition 2.4. Along with the hypotheses of Lemma 2.2, assume that:
(i) $a$ is symmetric: $a(u, v) = a(v, u)$, $\forall u, v \in V$;
(ii) $a$ is positive: $a(u, u) \ge 0$, $\forall u \in V$.
Then, setting $J(v) = \frac{1}{2} a(v, v) - f(v)$, $u$ solves (2.2) if and only if $u$ minimizes $J$ over $V$.

Proof. The proof relies on the following identity: for all $u, v \in V$ and $t \in \mathbb{R}$,
\[
J(u + tv) = J(u) + t \, (a(u, v) - f(v)) + \frac{t^2}{2} a(v, v),
\tag{2.4}
\]
which results from the symmetry of $a$. Assume that $u$ solves (2.2). Then, owing to the positivity of $a$, (2.4) implies that $u$ minimizes $J$ over $V$. Conversely, assume that $u$ minimizes $J$ over $V$. Let $v \in V$ with $a(v, v) \ne 0$ and take $t = -\frac{a(u, v) - f(v)}{a(v, v)}$ in (2.4). A straightforward calculation yields
\[
0 \ge J(u) - J(u + tv) = \frac{(a(u, v) - f(v))^2}{2 a(v, v)}.
\]
Owing to the positivity of $a$, this implies $a(u, v) = f(v)$. If $a(v, v) = 0$, one can conclude similarly by taking $t = -a(u, v) + f(v)$.

Remark 2.5.
(i) When $a$ is symmetric and coercive, the Lax–Milgram Lemma implies that the optimization problem $\inf_{v \in V} J(v)$ has a unique solution. The coercivity of $a$ can be interpreted as a strong convexity property of the functional $J$. Problem (2.2) is termed a variational formulation.
(ii) In several applications, the functional $J$ represents an energy. For instance, in continuum mechanics, consider an elastic body deformed under an externally applied load; see §3.4. Then, $\frac{1}{2} a(v, v)$ is the elastic deformation energy in the equilibrium configuration, and $-f(v)$ is the potential energy under the external load.

2.1.3 The BNB Theorem: inf-sup conditions

The BNB Theorem plays a fundamental role in this book.
Although it is by no means standard, we have adopted the terminology BNB Theorem since, to our knowledge, the result in the form below was first stated by Nečas in 1962 [Neč62] and popularized by Babuška in 1972 in the context of finite element methods [BaA72, p. 112]. From a functional analysis point of view, this theorem is a rephrasing of two fundamental results by Banach: the Closed Range Theorem and the Open Mapping Theorem.

Theorem 2.6 (Banach–Nečas–Babuška). Let $W$ be a Banach space and let $V$ be a reflexive Banach space. Let $a \in \mathcal{L}(W \times V; \mathbb{R})$ and $f \in V'$. Then, problem (2.1) is well-posed if and only if:
\[
\text{(bnb1)} \qquad \exists \alpha > 0, \quad \inf_{w \in W} \sup_{v \in V} \frac{a(w, v)}{\|w\|_W \|v\|_V} \ge \alpha,
\]
\[
\text{(bnb2)} \qquad \forall v \in V, \quad (\forall w \in W, \ a(w, v) = 0) \implies (v = 0).
\]
Moreover, the following a priori estimate holds:
\[
\forall f \in V', \quad \|u\|_W \le \frac{1}{\alpha} \|f\|_{V'}.
\tag{2.5}
\]

Proof. Owing to Corollary A.46, the conditions (bnb1) and (bnb2) are equivalent to the well-posedness of (2.1). Moreover, the a priori estimate (2.5) directly results from the inequalities
\[
\alpha \|u\|_W \le \sup_{v \in V} \frac{a(u, v)}{\|v\|_V} = \sup_{v \in V} \frac{f(v)}{\|v\|_V} = \|f\|_{V'}.
\]

Remark 2.7. Let $A \in \mathcal{L}(W; V')$ be defined by
\[
\forall w \in W, \ \forall v \in V, \quad \langle Aw, v \rangle_{V',V} = a(w, v).
\tag{2.6}
\]
Then, problem (2.1) amounts to seeking $u \in W$ such that $Au = f$ in $V'$. From the results of Appendix A, we infer
\[
\begin{aligned}
\text{(bnb1)} &\iff (\mathrm{Ker}(A) = \{0\} \text{ and } \mathrm{Im}(A) \text{ closed}) \iff (A^T \text{ is surjective}), \\
\text{(bnb2)} &\iff (\mathrm{Ker}(A^T) = \{0\}) \iff (A^T \text{ is injective}).
\end{aligned}
\]

We are now in the position to prove the Lax–Milgram Lemma. It suffices to verify that condition (lm) implies conditions (bnb1) and (bnb2).

Lemma 2.8. Assume $W = V$. Then, (lm) implies (bnb1) and (bnb2).

Proof. Assume (lm) and let $w \in V$, $w \ne 0$. Condition (bnb1) is readily deduced from
\[
\alpha \|w\|_V \le \frac{a(w, w)}{\|w\|_V} \le \sup_{v \in V} \frac{a(w, v)}{\|v\|_V}.
\]
Let now $v \in V$. Taking $w = v$ yields
\[
\sup_{w \in W} a(w, v) \ge a(v, v) \ge \alpha \|v\|_V^2.
\]
Therefore, $\sup_{w \in W} a(w, v) = 0$ implies $v = 0$, thus proving (bnb2).

Remark 2.9.
(i) The BNB Theorem is sometimes referred to in the literature as the generalized Lax–Milgram Theorem. The reason we shall not use this terminology

is that, as already mentioned in Remark 2.3, coercivity is essentially a Hilbertian property, i.e., the Lax–Milgram Lemma is meaningful only in Hilbert spaces, whereas the proper setting for the BNB Theorem is that of Banach spaces. Hence, calling this theorem the generalized Lax–Milgram Theorem would amount to calling Banach spaces generalized Hilbert spaces, which would be somewhat misleading.
(ii) Condition (bnb1) is usually termed an inf-sup condition. It can be recast in the following form, which will often be used in the sequel:
\[
\forall w \in W, \quad \alpha \|w\|_W \le \sup_{v \in V} \frac{a(w, v)}{\|v\|_V}.
\]
(iii) The converse of Lemma 2.8 is false: conditions (bnb1) and (bnb2) do not imply property (lm). Hence, (lm) is not optimal in general (recall that (bnb1)–(bnb2) are necessary and sufficient for well-posedness). However, when the bilinear form $a$ is symmetric and positive, coercivity is both necessary and sufficient; see Corollary A.55.
(iv) In finite dimension, (lm) is equivalent to stating that the matrix associated with the operator $A$ is positive definite, whereas the conditions (bnb1) and (bnb2) are equivalent to its invertibility; see Remark 2.20(i) and Proposition 2.21.

2.1.4 Non-homogeneous Dirichlet boundary conditions

This section analyzes a particular form of problem (2.1) arising when non-homogeneous Dirichlet boundary conditions are enforced. This type of boundary condition is often encountered in engineering applications. It takes the form $u|_{\partial\Omega} = g$ where $g$ is a given function on $\partial\Omega$. Let $B$ be a vector space of functions defined on $\partial\Omega$ and assume that there exists a trace operator $\gamma_0 \in \mathcal{L}(W; B)$ mapping functions of $W$ to their restriction to $\partial\Omega$. The non-homogeneous version of problem (2.1) is:
\[
\left\{
\begin{aligned}
&\text{Seek } u \in W \text{ such that} \\
&a(u, v) = f(v), \quad \forall v \in V, \\
&\gamma_0(u) = g, \quad \text{in } B.
\end{aligned}
\right.
\tag{2.7}
\]
The main result of this section is the following:

Proposition 2.10. Let $W$, $V$, and $B$ be three Banach spaces, $V$ being reflexive. Let $\gamma_0 \in \mathcal{L}(W; B)$ and let $a \in \mathcal{L}(W \times V; \mathbb{R})$. Assume that $\gamma_0$ is surjective and that the restriction of $a$ to $W_0 \times V$, where $W_0 = \mathrm{Ker}(\gamma_0)$, satisfies the conditions of the BNB Theorem. Then, problem (2.7) is well-posed, and there exists $c > 0$ such that, for all $f \in V'$ and $g \in B$,
\[
\|u\|_W \le c \, (\|f\|_{V'} + \|g\|_B).
\]

Proof. Since $\gamma_0$ is continuous and surjective, the Open Mapping Theorem implies that there exists $c > 0$ such that, for all $g \in B$, there is $u_g \in W$ satisfying $\gamma_0 u_g = g$ and $\|u_g\|_W \le c \|g\|_B$. Clearly, (2.7) is equivalent to setting $\varphi = u - u_g$ and considering the following problem:
\[
\left\{
\begin{aligned}
&\text{Seek } \varphi \in W_0 \text{ such that} \\
&a(\varphi, v) = f(v) - a(u_g, v), \quad \forall v \in V.
\end{aligned}
\right.
\tag{2.8}
\]
From the inequalities
\[
|f(v) - a(u_g, v)| \le (\|f\|_{V'} + \|a\| \, \|u_g\|_W) \|v\|_V \le (\|f\|_{V'} + c \, \|a\| \, \|g\|_B) \|v\|_V,
\]
we deduce that the linear form $f - a(u_g, \cdot)$ is continuous on $V$. Since the restriction of $a$ to $W_0 \times V$ fulfills the conditions of the BNB Theorem, problem (2.8) has a unique solution. Therefore, problem (2.7) is also well-posed. Finally, the a priori estimate directly results from the above inequalities.

Remark 2.11. The function $u_g$ is called a lifting of the boundary condition. From a theoretical viewpoint, non-homogeneous Dirichlet boundary conditions are thus handled very simply: after lifting the boundary condition and changing variable, the problem falls into the framework of the BNB Theorem. However, the argument presented above is not constructive since it does not provide an explicit expression for the lifting. In §3.2.2, we investigate a finite element approximation to (2.7) that accounts explicitly for non-homogeneous Dirichlet boundary conditions.

Example 2.12.
(i) Consider the Laplace equation $-\Delta u = f$ in a domain $\Omega$ with the non-homogeneous Dirichlet boundary condition $u|_{\partial\Omega} = g$. The weak formulation of this problem is:
\[
\left\{
\begin{aligned}
&\text{Seek } u \in H^1(\Omega) \text{ such that} \\
&\int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v, \quad \forall v \in H^1_0(\Omega), \\
&\gamma_0(u) = g, \quad \text{in } H^{1/2}(\partial\Omega).
\end{aligned}
\right.
\tag{2.9}
\]
This problem clearly falls into the framework of (2.7) by setting $W = H^1(\Omega)$, $V = H^1_0(\Omega)$, $\gamma_0(v) = v|_{\partial\Omega}$, $W_0 = H^1_0(\Omega)$, and $B = H^{1/2}(\partial\Omega)$. The operator $\gamma_0$ is indeed bounded and surjective from $H^1(\Omega)$ into $H^{1/2}(\partial\Omega)$ (see §B.3.5), and the homogeneous version of problem (2.9) is well-posed (see Remark 3.9(ii)).
(ii) The non-homogeneous Stokes problem can be recast in the framework of (2.7) by setting $W = [H^1(\Omega)]^d \times L^2_{\int=0}(\Omega)$, $V = [H^1_0(\Omega)]^d \times L^2_{\int=0}(\Omega)$, $W_0 = [H^1_0(\Omega)]^d \times L^2_{\int=0}(\Omega)$, $\gamma_0(u, p) = u|_{\partial\Omega}$, and $B = [H^{1/2}(\partial\Omega)]^d$.
(iii) The non-homogeneous advection equation can be recast in the framework of (2.7) by setting $W = \{v \in L^2(\Omega); \ \beta \cdot \nabla v \in L^2(\Omega)\}$, $V = L^2(\Omega)$, $\gamma_0(v) = v|_{\partial\Omega^-}$, where $\partial\Omega^-$ is the inflow boundary, and $W_0 = \{v \in W; \ v|_{\partial\Omega^-} = 0\}$. The characterization of the space $B$ is somewhat technical in this case. It is possible to take $B = L^2_{\mathrm{loc}}(\partial\Omega^-, |\beta \cdot n|)$, the space of locally square-integrable functions over $\partial\Omega^-$ for the surface measure $|\beta \cdot n|$.

2.2 Galerkin Methods

This section is concerned with the approximation of the abstract problem (2.1) using Galerkin methods.

2.2.1 The setting

The key idea underlying Galerkin methods is to replace the spaces $W$ and $V$ by finite-dimensional spaces $W_h$ and $V_h$. The space $W_h$ is termed the solution space or the trial space, and the space $V_h$ is termed the test space. The finite element interpolation techniques presented in Chapter 1 provide practical means to construct such spaces, the index $h$ referring to the mesh size. Henceforth, we assume that $W_h$ and $V_h$ are equipped with some norms, say $\|\cdot\|_{W_h}$ and $\|\cdot\|_{V_h}$, respectively. Setting
\[
W(h) = W + W_h,
\tag{2.10}
\]
we make the important assumption that this space can be equipped with a norm, say $\|\cdot\|_{W(h)}$, such that:
(i) $\|w_h\|_{W(h)} = \|w_h\|_{W_h}$ for all $w_h \in W_h$;
(ii) $\|w\|_{W(h)} \le c \, \|w\|_W$ for all $w \in W$; this property means that $W$ is continuously embedded in $W(h)$.

In its most general form, the Galerkin method constructs an approximation of $u$ by solving the following approximate problem:
\[
\left\{
\begin{aligned}
&\text{Seek } u_h \in W_h \text{ such that} \\
&a_h(u_h, v_h) = f_h(v_h), \quad \forall v_h \in V_h.
\end{aligned}
\right.
\tag{2.11}
\]
Problem (2.11) involves an approximation $a_h$ to the bilinear form $a$ and an approximation $f_h$ to the linear form $f$. A particular case of (2.11) is one in which the same approximation space $V_h$ is chosen for the solution and the test functions, leading to the approximate problem:
\[
\left\{
\begin{aligned}
&\text{Seek } u_h \in V_h \text{ such that} \\
&a_h(u_h, v_h) = f_h(v_h), \quad \forall v_h \in V_h.
\end{aligned}
\right.
\tag{2.12}
\]
In this case, we say that a standard Galerkin method is used to approximate (2.1). When the solution and test spaces are different, the approximation method is sometimes called a Petrov–Galerkin method, but in this book we refer to it as a non-standard Galerkin method.

Definition 2.13 (Conformity). The approximation setting is said to be conforming if $W_h \subset W$ and $V_h \subset V$; it is said to be non-conforming otherwise.

Definition 2.14 (Approximability). The approximation setting is said to have the approximability property if
\[
\forall w \in W, \quad \lim_{h \to 0} \Big( \inf_{w_h \in W_h} \|w - w_h\|_{W(h)} \Big) = 0.
\tag{2.13}
\]

Definition 2.15 (Consistency and asymptotic consistency). Let $u$ solve (2.1).
(i) The approximation setting is said to be consistent if $a_h$ can be extended to $W(h) \times V_h$ and if the exact solution $u$ satisfies the approximate problem (2.11), i.e., if
\[
\forall v_h \in V_h, \quad a_h(u, v_h) = f_h(v_h).
\tag{2.14}
\]
It is said to be non-consistent otherwise.
(ii) When $a_h$ is uniformly bounded on $W_h \times V_h$, the approximation method is said to be asymptotically consistent if there is an operator $\Pi_h : W \to W_h$ such that, for all $w \in W$, $\|\Pi_h w - w\|_{W(h)} \le c \inf_{w_h \in W_h} \|w - w_h\|_{W(h)}$ where $c$ is independent of $w$, and
\[
\lim_{h \to 0} \Big( \sup_{v_h \in V_h} \frac{|f_h(v_h) - a_h(\Pi_h u, v_h)|}{\|v_h\|_{V_h}} \Big) = 0.
\tag{2.15}
\]
The consistency error $R_h(u)$ is defined to be
\[
R_h(u) = \sup_{v_h \in V_h} \frac{|f_h(v_h) - a_h(\Pi_h u, v_h)|}{\|v_h\|_{V_h}}.
\tag{2.16}
\]

Remark 2.16. The definition of asymptotic consistency is independent of the operator $\Pi_h$ provided the approximation setting has the approximability property. Indeed, assume there is an operator $\Pi_h^1$ as in Definition 2.15(ii). Let $\Pi_h^2 : W \to W_h$ be an operator such that, for all $w \in W$, $\|\Pi_h^2 w - w\|_{W(h)} \le c \inf_{w_h \in W_h} \|w - w_h\|_{W(h)}$. Then, for all $v_h \in V_h$,
\[
|f_h(v_h) - a_h(\Pi_h^2 u, v_h)| \le |f_h(v_h) - a_h(\Pi_h^1 u, v_h)| + |a_h(\Pi_h^1 u - \Pi_h^2 u, v_h)|.
\]
Denote by $R_{1h}(u)$ and $R_{2h}(u)$ the consistency errors using $\Pi_h^1$ and $\Pi_h^2$, respectively.
Using the uniform boundedness of $a_h$, the triangle inequality, and the approximation property of $\Pi_h^1$ and $\Pi_h^2$ yields
\[
R_{2h}(u) \le R_{1h}(u) + c \, \|a_h\|_{W_h,V_h} \inf_{w_h \in W_h} \|u - w_h\|_{W(h)},
\]
implying that the method is asymptotically consistent using $\Pi_h^2$; in other words, up to a quantity controlled by the approximability property, the consistency error does not depend on the operator $\Pi_h$ chosen to measure it.

As an immediate consequence of Definition 2.15, we state the following:

Lemma 2.17 (Galerkin orthogonality). If the approximation setting is consistent, the so-called Galerkin orthogonality holds:
\[
\forall v_h \in V_h, \quad a_h(u - u_h, v_h) = 0.
\tag{2.17}
\]

Remark 2.18. In practice, non-consistent methods must be considered for various reasons. For instance, quadratures are often employed to evaluate the integrals defining the exact forms $a$ and $f$; see §8.1. Another important example arises in the context of non-conforming methods where the exact forms $a$ and $f$ are no longer defined on the approximation spaces $W_h$ and $V_h$; see, e.g., §3.2.3, §3.2.4, §4.2.8, §5.6, and §5.7. Non-consistent methods are also considered in conforming stabilized finite element approximations; see, e.g., the subgrid viscosity method described in §5.5.

2.2.2 The linear system

The approximate problem (2.11) is simply a linear system. To see this, let $M = \dim W_h$ and $N = \dim V_h$. Let $\{\psi_1, \dots, \psi_M\}$ be a basis of $W_h$ and let $\{\varphi_1, \dots, \varphi_N\}$ be a basis of $V_h$. In the framework of finite element methods, the functions $\{\psi_1, \dots, \psi_M\}$ (resp., $\{\varphi_1, \dots, \varphi_N\}$) can be taken to be the global shape functions in $W_h$ (resp., $V_h$); see Chapter 1. Consider the expansion of $u_h$ in the basis of $W_h$,
\[
u_h = \sum_{i=1}^{M} U_i \psi_i,
\]
and introduce the coordinate vector of $u_h$, $U = (U_i)_{1 \le i \le M} \in \mathbb{R}^M$, relative to the basis $\{\psi_1, \dots, \psi_M\}$. Let $A \in \mathbb{R}^{N,M}$ be the stiffness matrix with entries
\[
A_{ij} = a_h(\psi_j, \varphi_i), \quad 1 \le i \le N, \ 1 \le j \le M,
\]
and let $F \in \mathbb{R}^N$ be the vector with components
\[
F_i = f_h(\varphi_i), \quad 1 \le i \le N.
\]
It is readily verified that
\[
(u_h \text{ solves (2.11)}) \iff (AU = F).
\]

2.2.3 Well-posedness

The goal of this section is to investigate the well-posedness of the approximate problem (2.11). We shall see that this property is automatically granted in the coercive case for a conforming, consistent approximation.
However, in the general case, there is no guarantee that the conditions of the BNB Theorem are automatically transferred from the exact problem to the approximate problem. As a result, non-trivial inf-sup conditions must be proven at the discrete level; see the numerous examples investigated in Chapters 4 and 5.
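As an illustration of the linear system $AU = F$ of §2.2.2, the following minimal sketch assembles it for the Laplace equation $-u'' = f$ on $(0, 1)$ with homogeneous Dirichlet conditions, using P1 hat functions on a uniform mesh in a standard Galerkin setting ($\psi_i = \varphi_i$). The function name `assemble_p1_laplace`, the mesh, and the one-point quadrature for the load vector are assumptions made for this example only; the quadrature plays the role of $f_h$ and happens to be exact for the constant load used here.

```python
import numpy as np

def assemble_p1_laplace(n, f):
    """Stiffness matrix A and load vector F for -u'' = f on (0, 1),
    u(0) = u(1) = 0, with P1 hat functions at n uniform interior nodes."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)  # interior nodes x_1, ..., x_n
    # A_ij = a(phi_j, phi_i) = integral of phi_j' * phi_i' (tridiagonal)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    # F_i = f_h(phi_i): one-point rule h * f(x_i), exact when f is constant
    F = h * f(x)
    return A, F, x

# Constant load f = 1; the exact solution is u(x) = x (1 - x) / 2.
A, F, x = assemble_p1_laplace(9, lambda t: np.ones_like(t))
U = np.linalg.solve(A, F)  # coordinate vector of u_h in the hat-function basis
```

In this one-dimensional setting the computed coordinates coincide with the nodal values of the exact solution, which gives a quick sanity check on the assembly.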

A particular case: Conformity, consistency, and coercivity. Consider the following approximation of problem (2.2):
\[
\left\{
\begin{aligned}
&\text{Seek } u_h \in V_h \text{ such that} \\
&a(u_h, v_h) = f(v_h), \quad \forall v_h \in V_h,
\end{aligned}
\right.
\tag{2.18}
\]
with an approximation space $V_h \subset V$. Note that (2.18) involves the same bilinear form $a$ and the same linear form $f$ as (2.2).

Proposition 2.19. Let $V$ be a Hilbert space, let $a \in \mathcal{L}(V \times V; \mathbb{R})$, and let $f \in V'$. Let $V_h$ be a finite-dimensional space. Assume that: (i) $a$ is coercive on $V$; (ii) $V_h \subset V$. Then, the approximate problem (2.18) is well-posed. In particular, for all $f \in V'$, the a priori estimate $\|u_h\|_V \le \frac{1}{\alpha} \|f\|_{V'}$ holds.

Proof. Since $V_h \subset V$, the bilinear form $a$ is coercive on $V_h$ with constant $\alpha$. To conclude, use the Lax–Milgram Lemma.

Remark 2.20.
(i) For a conforming, consistent approximation of a coercive problem, the stiffness matrix is positive definite. Indeed, let $N = \dim V_h$ and for $X \in \mathbb{R}^N$ with $X = (X_i)_{1 \le i \le N}$, set $\xi = \sum_{i=1}^{N} X_i \varphi_i \in V_h$, where $\{\varphi_1, \dots, \varphi_N\}$ is a basis of $V_h$. A straightforward calculation yields
\[
\forall X \in \mathbb{R}^N, \quad (AX, X)_N = a(\xi, \xi) \ge \alpha \|\xi\|_V^2,
\]
implying that $(AX, X)_N \ge 0$. Moreover, $(AX, X)_N = 0$ implies $\xi = 0$, whence we deduce $X = 0$ since $\{\varphi_1, \dots, \varphi_N\}$ is a basis of $V_h$.
(ii) If $a$ is symmetric, the stiffness matrix is also symmetric.

The general case. Consider the approximate problem (2.11). Owing to the BNB Theorem, the well-posedness of (2.11) is equivalent to the two following discrete conditions:
\[
\text{(bnb1}_h\text{)} \qquad \exists \alpha_h > 0, \quad \inf_{w_h \in W_h} \sup_{v_h \in V_h} \frac{a_h(w_h, v_h)}{\|w_h\|_{W_h} \|v_h\|_{V_h}} \ge \alpha_h,
\]
\[
\text{(bnb2}_h\text{)} \qquad \forall v_h \in V_h, \quad (\forall w_h \in W_h, \ a_h(w_h, v_h) = 0) \implies (v_h = 0).
\]
Condition (bnb1$_h$) is often termed a discrete inf-sup condition. Let us interpret conditions (bnb1$_h$) and (bnb2$_h$) in terms of the stiffness matrix.

Proposition 2.21.
(i) (bnb1$_h$) $\iff$ ($\mathrm{Ker}(A) = \{0\}$).
(ii) (bnb2$_h$) $\iff$ ($\mathrm{rank}\, A = \dim V_h$).
(iii) If $\dim W_h = \dim V_h$, (bnb1$_h$) $\iff$ (bnb2$_h$).

Proof. (i) The following equivalences hold:
\[
(X \in \mathrm{Ker}(A)) \iff \Big( \forall i, \ \sum_{j=1}^{M} A_{ij} X_j = 0 \Big) \iff (\forall i, \ a_h(\xi, \varphi_i) = 0) \iff (\forall v \in V_h, \ a_h(\xi, v) = 0),
\]
with $\xi = \sum_{i=1}^{M} X_i \psi_i$; hence,
\[
\text{(bnb1}_h\text{)} \implies \Big( \forall \xi \in W_h, \ \big( a_h(\xi, v) = 0, \ \forall v \in V_h \big) \implies (\xi = 0) \Big) \implies (\mathrm{Ker}(A) = \{0\}).
\]
Conversely, assume that $\mathrm{Ker}(A) = \{0\}$. Reasoning by contradiction, consider a sequence $w_{hn} \in W_h$ with $\|w_{hn}\|_{W_h} = 1$ and such that
\[
\sup_{v \in V_h} \frac{a_h(w_{hn}, v)}{\|v\|_{V_h}} \le \frac{1}{n}.
\]
Since the unit sphere in $W_h$ is compact, there exists a subsequence, still denoted by $w_{hn}$, that converges to some $w_h$ as $n \to \infty$. The limit $w_h$ satisfies $\|w_h\|_{W_h} = 1$ and $a_h(w_h, v) = 0$ for all $v \in V_h$. This implies that $X \in \mathrm{Ker}(A)$ where $X$ is the coordinate vector of $w_h$, i.e., $w_h = \sum_{i=1}^{M} X_i \psi_i$. Since $\mathrm{Ker}(A) = \{0\}$, $X = 0$, thus contradicting $\|w_h\|_{W_h} = 1$.
(ii) The proof of (ii) is similar to that of (i) (consider $A^T$ instead of $A$).
(iii) Direct consequence of (i), (ii), and the Rank Theorem.

Theorem 2.22. Let $W_h$ and $V_h$ be two finite-dimensional spaces equipped with the norms $\|\cdot\|_{W_h}$ and $\|\cdot\|_{V_h}$, respectively. Assume that:
(i) $a_h$ is bounded on $W_h \times V_h$ and $f_h$ is continuous on $V_h$.
(ii) The discrete inf-sup condition (bnb1$_h$) is fulfilled.
(iii) $V_h$ and $W_h$ have the same dimension.
Then, the approximate problem (2.11) is well-posed, and the a priori estimate $\|u_h\|_{W_h} \le \frac{1}{\alpha_h} \|f_h\|_{V_h'}$ holds.

Proof. Use Proposition 2.21(iii) and the BNB Theorem.

Remark 2.23.
(i) Even in the case of a consistent, conforming approximation, neither condition (bnb1) nor condition (bnb2) implies its discrete counterpart.
(ii) By comparing Proposition 2.21(i) and (ii) with Remark 2.7, we realize that the interpretation of (bnb1$_h$) and (bnb2$_h$) in matrix terms is almost identical to that of (bnb1) and (bnb2) in operator terms. The only difference is that in finite dimension, the range of $A$ is automatically closed.

(iii) In the linear algebra framework, the inf-sup condition has a very simple interpretation. Given a matrix $A \in \mathbb{R}^{N,M}$, set
\[
\alpha = \min_{W \in \mathbb{R}^M} \max_{V \in \mathbb{R}^N} \frac{(AW, V)_N}{\|W\|_M \|V\|_N},
\]
where $(\cdot, \cdot)_N$ denotes the Euclidean scalar product in $\mathbb{R}^N$ with associated norm $\|\cdot\|_N$. Then, a straightforward calculation shows that
\[
\alpha = \min_{\|W\|_M = 1} \|AW\|_N = \lambda_{\min}(A^T A)^{1/2},
\]
where $\lambda_{\min}(A^T A)$ is the smallest eigenvalue of $A^T A$, i.e., $\alpha$ is the smallest singular value of $A$.

2.3 Error Analysis

In this section, we derive estimates for the approximation error $u - u_h$, where $u$ solves the exact problem (2.1) and $u_h$ solves the approximate problem (2.11).

2.3.1 The general case

Theorem 2.24. Assume the following:
(i) Condition (bnb1$_h$) holds uniformly in $h$ and $\dim(W_h) = \dim(V_h)$.
(ii) The bilinear form $a_h$ is uniformly bounded on $W_h \times V_h$.
(iii) The approximation setting is asymptotically consistent.
(iv) The approximation setting has the approximability property.
Then, denoting by $R_h(u)$ the consistency error, the following estimate holds:
\[
\|u - u_h\|_{W(h)} \le \frac{1}{\alpha_h} R_h(u) + c \inf_{w_h \in W_h} \|u - w_h\|_{W(h)},
\tag{2.19}
\]
and $\lim_{h \to 0} \|u - u_h\|_{W(h)} = 0$.

Proof. Let $\Pi_h$ be an operator involved in the definition of the asymptotic consistency of $a_h$. (Recall that the uniform boundedness of $a_h$ allows one to measure the consistency error by any operator $\Pi_h$ satisfying the assumptions of Definition 2.15; see Remark 2.16.) Clearly,
\[
a_h(u_h - \Pi_h u, v_h) = f_h(v_h) - a_h(\Pi_h u, v_h).
\]
Condition (bnb1$_h$) yields $\alpha_h \|u_h - \Pi_h u\|_{W_h} \le R_h(u)$. Since the norms $\|\cdot\|_{W_h}$ and $\|\cdot\|_{W(h)}$ coincide on $W_h$, the triangle inequality implies
\[
\|u_h - u\|_{W(h)} \le \frac{1}{\alpha_h} R_h(u) + \|u - \Pi_h u\|_{W(h)}.
\]
Estimate (2.19) then results from $\|u - \Pi_h u\|_{W(h)} \le c \inf_{w_h \in W_h} \|u - w_h\|_{W(h)}$. Finally, since $\alpha_h$ is bounded from below uniformly in $h$, the asymptotic consistency and the approximability property readily yield $\lim_{h \to 0} \|u - u_h\|_{W(h)} = 0$.
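Since the stability constant $\alpha_h$ enters estimate (2.19), it can be useful to evaluate it numerically. When both $W_h$ and $V_h$ are equipped with Euclidean norms on the coordinate vectors (an assumption of this sketch, which is generally not the case for finite element norms), Remark 2.23(iii) identifies the discrete inf-sup constant with the smallest singular value of the stiffness matrix. The helper name `inf_sup_constant` and the random test matrix are hypothetical choices made for illustration.

```python
import numpy as np

def inf_sup_constant(A):
    """Inf-sup constant of A in R^{N x M} for Euclidean norms:
    min_W max_V (AW, V) / (|W| |V|) = lambda_min(A^T A)^{1/2}."""
    N, M = A.shape
    if N < M:
        return 0.0  # Ker(A) is non-trivial, so the infimum is zero
    # Singular values are returned in descending order; for N >= M there
    # are M of them, the square roots of the eigenvalues of A^T A.
    return float(np.linalg.svd(A, compute_uv=False)[-1])

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
alpha = inf_sup_constant(A)

# For a fixed W, the maximum over V is attained at V = AW and equals
# |AW| / |W|; sampling W therefore gives upper bounds on the minimum.
W = rng.standard_normal((4, 1000))
ratios = np.linalg.norm(A @ W, axis=0) / np.linalg.norm(W, axis=0)
```

Every sampled ratio stays above `alpha`, and the bound is attained at the right singular vector associated with the smallest singular value. Monitoring this quantity over a sequence of meshes is one possible way to probe whether (bnb1$_h$) holds uniformly in $h$, provided the matrices are first rescaled to account for the actual norms of $W_h$ and $V_h$.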

Theorem 2.24 clearly indicates the four properties which are required to prove the convergence of the approximate solution: uniform stability, uniform boundedness, asymptotic consistency, and approximability. A loose principle in numerical analysis, known as the Lax Principle, is that stability and consistency imply convergence. The fact that this principle does not mention continuity and approximability does not mean that these two properties should be taken for granted. For instance, the counterexample discussed in §2.3.3 shows that the approximability property may not hold in some circumstances.

2.3.2 Particular cases

The non-consistent, non-conforming case. We assume that the bilinear form $a_h(\cdot, \cdot)$ can be extended to $W(h) \times V_h$ so that $a_h(w, v_h)$ makes sense for $w \in W$ and $v_h \in V_h$. The following result is known as the Second Strang Lemma [Str72]:

Lemma 2.25 (Strang 2). Assume the following:
(i) Condition (bnb1$_h$) holds and $\dim(W_h) = \dim(V_h)$.
(ii) The bilinear form $a_h$ is bounded on $W(h) \times V_h$.
Then, the following error estimate holds:
\[
\|u - u_h\|_{W(h)} \le \Big( 1 + \frac{\|a_h\|_{W(h),V_h}}{\alpha_h} \Big) \inf_{w_h \in W_h} \|u - w_h\|_{W(h)} + \frac{1}{\alpha_h} \sup_{v_h \in V_h} \frac{|f_h(v_h) - a_h(u, v_h)|}{\|v_h\|_{V_h}}.
\tag{2.20}
\]

Proof. Let $w_h \in W_h$. Then,
\[
\begin{aligned}
a_h(u_h - w_h, v_h) &= a_h(u_h - u, v_h) + a_h(u - w_h, v_h) \\
&= f_h(v_h) - a_h(u, v_h) + a_h(u - w_h, v_h).
\end{aligned}
\]
Condition (bnb1$_h$) implies
\[
\alpha_h \|u_h - w_h\|_{W(h)} \le \sup_{v_h \in V_h} \frac{|f_h(v_h) - a_h(u, v_h)|}{\|v_h\|_{V_h}} + \|a_h\|_{W(h),V_h} \|u - w_h\|_{W(h)}.
\]
Estimate (2.20) then results from the triangle inequality.

Remark 2.26. When the method is consistent, (2.20) simplifies into
\[
\|u - u_h\|_{W(h)} \le \Big( 1 + \frac{\|a_h\|_{W(h),V_h}}{\alpha_h} \Big) \inf_{w_h \in W_h} \|u - w_h\|_{W(h)}.
\]

The non-consistent, conforming case. We now assume that $W_h \subset W$ and $V_h \subset V$, i.e., the approximation setting is conforming. As a result, $W(h) = W$. However, we do not assume that the extended norm $\|\cdot\|_{W(h)}$ is the same as that of $W$. Furthermore, as opposed to the previous case, we do not assume that $a_h(\cdot, \cdot)$ can be extended to $W \times V_h$, i.e., we accept the fact that $a_h(w, v_h)$ may not make sense for $(w, v_h) \in W \times V_h$. This is the case, for instance, when $a_h(u_h, \cdot)$ involves point values of $u_h$, which may not necessarily exist for functions in $W$. It is also the case when $a_h$ involves a direct decomposition of $u_h$ in $W_h$, which may not make sense for functions in $W$. A case corresponding to the second situation is investigated in §5.5; see Remark 5.55(i). The following result is known as the First Strang Lemma [Str72]:

Lemma 2.27 (Strang 1). Assume the following:
(i) $W_h \subset W$ and $V_h \subset V$.
(ii) Condition (bnb1$_h$) holds and $\dim(W_h) = \dim(V_h)$.
(iii) The bilinear form $a_h$ is bounded on $W_h \times V_h$, and $a$ is bounded on $W \times V_h$ when $W$ is equipped with the extended norm $\|\cdot\|_{W(h)}$.
Then, the following error estimate holds:
\[
\begin{aligned}
\|u - u_h\|_{W(h)} \le{} & \frac{1}{\alpha_h} \sup_{v_h \in V_h} \frac{|f(v_h) - f_h(v_h)|}{\|v_h\|_{V_h}} \\
& + \inf_{w_h \in W_h} \Big[ \Big( 1 + \frac{\|a\|_{W(h),V_h}}{\alpha_h} \Big) \|u - w_h\|_{W(h)} + \frac{1}{\alpha_h} \sup_{v_h \in V_h} \frac{|a(w_h, v_h) - a_h(w_h, v_h)|}{\|v_h\|_{V_h}} \Big].
\end{aligned}
\tag{2.21}
\]

Proof. Let $w_h \in W_h$. Condition (bnb1$_h$) implies
\[
\alpha_h \|u_h - w_h\|_{W(h)} \le \sup_{v_h \in V_h} \frac{a_h(u_h - w_h, v_h)}{\|v_h\|_{V_h}}.
\]
A straightforward calculation yields
\[
a_h(u_h - w_h, v_h) = a(u - w_h, v_h) + a(w_h, v_h) - a_h(w_h, v_h) + f_h(v_h) - f(v_h).
\]
Therefore,
\[
\alpha_h \|u_h - w_h\|_{W(h)} \le \|a\|_{W(h),V_h} \|u - w_h\|_{W(h)} + \sup_{v_h \in V_h} \frac{|a(w_h, v_h) - a_h(w_h, v_h)|}{\|v_h\|_{V_h}} + \sup_{v_h \in V_h} \frac{|f(v_h) - f_h(v_h)|}{\|v_h\|_{V_h}}.
\]
Conclude using the triangle inequality.

The consistent, conforming case. For a consistent, conforming approximation with $a_h = a$ and $f_h = f$, the approximate problem is:
\[
\left\{
\begin{aligned}
&\text{Seek } u_h \in W_h \text{ such that} \\
&a(u_h, v_h) = f(v_h), \quad \forall v_h \in V_h.
\end{aligned}
\right.
\tag{2.22}
\]
The following result is known as Céa's Lemma [Céa64]:

Lemma 2.28 (Céa). Let the hypotheses of Lemma 2.25 hold with $V_h \subset V$, $W_h \subset W$, $a_h = a$, and $f_h = f$. Let $u_h$ solve the approximate problem (2.22). Then, the following error estimate holds:
\[
\|u - u_h\|_W \le \Big( 1 + \frac{\|a\|_{W,V}}{\alpha_h} \Big) \inf_{w_h \in W_h} \|u - w_h\|_W.
\tag{2.23}
\]

Proof. For completeness, we present a direct proof without using (2.20). Let $w_h \in W_h$. Galerkin orthogonality implies
\[
\forall v_h \in V_h, \quad a(u_h - w_h, v_h) = a(u - w_h, v_h).
\]
Using property (bnb1$_h$) and the continuity of $a$ yields
\[
\alpha_h \|u_h - w_h\|_W \le \sup_{v_h \in V_h} \frac{a(u_h - w_h, v_h)}{\|v_h\|_V} = \sup_{v_h \in V_h} \frac{a(u - w_h, v_h)}{\|v_h\|_V} \le \|a\|_{W,V} \|u - w_h\|_W.
\]
Conclude using the triangle inequality.

The coercive case. Assume that $W_h = V_h$, $W = V$, $a_h = a$, and $f_h = f$. Assume that the bilinear form $a$ is coercive with coercivity constant $\alpha$ and continuity constant $\|a\|$. Galerkin orthogonality implies
\[
\forall v_h \in V_h, \quad a(u - u_h, u - u_h) = a(u - u_h, u - v_h).
\]
Using the coercivity and the continuity of $a$ yields the estimate (2.23) with the constant $\frac{\|a\|}{\alpha}$ instead of $1 + \frac{\|a\|_{W,V}}{\alpha_h}$. This estimate can be sharpened even further if $a$ is also symmetric. In this case, define the scalar product $a(\cdot, \cdot)$ with associated norm $\|u\|_e^2 = a(u, u)$ for $u \in V$. The norm $\|\cdot\|_e$ is called the energy norm and is equivalent to the norm $\|\cdot\|_V$ since
\[
\forall u \in V, \quad \alpha^{1/2} \|u\|_V \le \|u\|_e \le \|a\|^{1/2} \|u\|_V.
\]
Let $P_{e,V_h} : V \to V_h$ be the orthogonal projection for the scalar product $a(\cdot, \cdot)$. Owing to Galerkin orthogonality, it is clear that
\[
u_h = P_{e,V_h}(u).
\tag{2.24}
\]
The Pythagoras identity yields, for all $w_h \in V_h$,
\[
a(u - u_h, u - u_h) = \|u - u_h\|_e^2 \le \|u - w_h\|_e^2 = a(u - w_h, u - w_h).
\]
As a result,
\[
\alpha \|u - u_h\|_V^2 \le a(u - w_h, u - w_h) \le \|a\| \, \|u - w_h\|_V^2.
\]
Hence,
\[
\|u - u_h\|_V \le \Big( \frac{\|a\|}{\alpha} \Big)^{1/2} \inf_{w_h \in V_h} \|u - w_h\|_V.
\]
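The optimality of $u_h$ in the energy norm can be checked in coordinates. Since $a(u, v_h) = f(v_h)$ for all $v_h \in V_h$, one has $\|u - v_h\|_e^2 = \|u\|_e^2 + 2 J(v_h)$ with $J$ as in Proposition 2.4, so that minimizing the energy-norm error over $V_h$ amounts to minimizing $J(V) = \frac{1}{2} V^T A V - F^T V$ over $\mathbb{R}^N$. The sketch below verifies this on the one-dimensional P1 stiffness matrix for $-u'' = 1$; the mesh size, the random perturbations, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

n = 9
h = 1.0 / (n + 1)
# P1 stiffness matrix for -u'' on (0, 1): symmetric positive definite
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
F = h * np.ones(n)  # exact load vector for f = 1

def J(V):
    """Discrete energy J(v_h) = a(v_h, v_h) / 2 - f(v_h) in coordinates."""
    return 0.5 * V @ A @ V - F @ V

U = np.linalg.solve(A, F)  # Galerkin solution
rng = np.random.default_rng(1)
D = rng.standard_normal((200, n))  # 200 arbitrary perturbations d_h

# Identity (2.4) with t = 1 and a(u_h, d_h) = f(d_h):
# J(u_h + d_h) - J(u_h) = a(d_h, d_h) / 2, the Pythagoras identity.
gaps = np.array([J(U + d) - J(U) for d in D])
half_energy = np.array([0.5 * d @ A @ d for d in D])
```

Each entry of `gaps` is positive and matches the corresponding entry of `half_energy`, so no competitor in $V_h$ has a smaller energy-norm error than the Galerkin solution.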

2.3.3 A counterexample to the approximability hypothesis

The approximability property (2.13) may seem unnecessary to verify, and one may believe that it is likely to be always satisfied for polynomial-based finite elements. Generally, practitioners do not bother to check this hypothesis. However, there are situations of engineering interest where this hypothesis may fail. It may happen, for instance, in electromagnetism. In the computational electromagnetism literature, there is a debate pitting proponents of edge finite elements against those of the all-purpose Lagrange finite elements. Both techniques work perfectly well in many cases, but there are a few situations where the two methods yield different answers, irrespective of the level of mesh refinement. It turns out that the origin of the discrepancy lies in the approximability property. This fact has been clarified by Costabel [Cos91].

Let $\Omega$ be a domain in $\mathbb{R}^3$ with boundary $\partial\Omega$ and outward normal $n$. In many electromagnetism problems governed by the Maxwell equations, a natural solution space is $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$ where $H_0(\mathrm{curl}; \Omega) = \{v \in H(\mathrm{curl}; \Omega); \ v \times n|_{\partial\Omega} = 0\}$.

Lemma 2.29 (Costabel). Assume that $\Omega$ is a polyhedron. If $\Omega$ is not convex, $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$ is a closed proper subspace of $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$.

This result has the following striking consequence:

Corollary 2.30. Let $\{W_h\}_{h>0}$ be a family of finite element spaces conforming in $[H^1(\Omega)]^3$ and set $W_{h0} = \{w_h \in W_h; \ w_h \times n|_{\partial\Omega} = 0\}$. Then, under the hypotheses of Lemma 2.29, $\{W_{h0}\}_{h>0}$ cannot have the approximability property in $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$.

Proof. It is clear that $W_{h0} \subset [H^1(\Omega)]^3$ and $W_{h0} \subset H_0(\mathrm{curl}; \Omega)$; hence, $W_{h0} \subset H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$. Since $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$ is closed in $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$, the limits of all the Cauchy sequences in $W_{h0}$ are in $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$.
Moreover, since $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$ is a proper subspace of $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$, there are functions of $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$ that lie at a positive distance from $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$. Therefore, Cauchy sequences in $W_{h0}$ cannot reach these functions, i.e., $\{W_{h0}\}_{h>0}$ does not have the approximability property in $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$.

In the light of Corollary 2.30, we now understand why Lagrange finite elements may fail in electromagnetism. If the solution to be approximated is so rough as to be only in $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$ and not in more regular spaces, then Lagrange finite elements cannot interpolate it, whereas edge finite elements can. However, if, by some argument, it is known a priori that the solution is somewhat smoother, i.e., lives in a space that is slightly more regular than $H_0(\mathrm{curl}; \Omega) \cap H(\mathrm{div}; \Omega)$, say $H_0(\mathrm{curl}; \Omega) \cap [H^1(\Omega)]^3$, then Lagrange finite elements yield the approximability property. In particular, if $\Omega$ is convex, the necessary extra regularity holds.

The above counterexample shows that the approximability property is not a hypothesis to be forgotten or to be treated too lightly.

2.3.4 The Aubin–Nitsche Lemma

The goal of this section is to derive an error estimate in a weaker norm than that of $W(h)$. For the sake of simplicity, we restrict the analysis to the approximation of problem (2.2) in a standard, conforming, and consistent setting, i.e., $W_h = V_h$ and the discrete problem is (2.18); see, e.g., [Bra97, p. 108] for nonconforming approximation settings. Problems (2.2) and (2.18) are assumed to be well-posed. Furthermore, we make the following additional assumptions:

(an1) There exists a Hilbert space $L$ into which $V$ can be continuously embedded. We assume that $L$ is equipped with a continuous, symmetric, and positive bilinear form $l(\cdot,\cdot)$, and we denote by $|\cdot|_L = \sqrt{l(\cdot,\cdot)}$ the corresponding seminorm. We further assume that there exists a Banach space $Z \subset V$ and a stability constant $c_S > 0$ such that, for all $g \in L$, the solution $\varsigma(g)$ to the following adjoint problem:
\[ \begin{cases} \text{Seek } \varsigma(g) \in V \text{ such that} \\ a(v, \varsigma(g)) = l(g, v), \quad \forall v \in V, \end{cases} \tag{2.25} \]
satisfies the a priori estimate $\|\varsigma(g)\|_Z \le c_S |g|_L$.

(an2) There exists an interpolation constant $c_i > 0$ such that
\[ \forall h, \ \forall v \in Z, \quad \inf_{v_h \in V_h} \|v - v_h\|_V \le c_i h \|v\|_Z. \]

Whenever property (an1) holds, problem (2.25) is said to be regularizing. The following lemma yields an error estimate in the seminorm $|\cdot|_L$ [Aub87]:

Lemma 2.31 (Aubin–Nitsche). Under the above assumptions,
\[ \forall h, \quad |u - u_h|_L \le c\, h\, \|u - u_h\|_V, \]
where $c = c_i c_S \|a\|_{W,V}$.

Proof. Setting $e_h = u - u_h$, it is clear that
\[ |e_h|_L = \sup_{g \in L} \frac{l(g, e_h)}{|g|_L} = \sup_{g \in L} \frac{a(e_h, \varsigma(g))}{|g|_L}. \]
Galerkin orthogonality implies $a(e_h, \varsigma(g)) = a(e_h, \varsigma(g) - v_h)$ for all $v_h \in V_h$. Hence,
\[ \begin{aligned} a(e_h, \varsigma(g)) &\le \|a\|_{W,V} \|e_h\|_V \inf_{v_h \in V_h} \|\varsigma(g) - v_h\|_V \\ &\le \|a\|_{W,V} \|e_h\|_V \, c_i h \|\varsigma(g)\|_Z && \text{from (an2)} \\ &\le \|a\|_{W,V} \|e_h\|_V \, c_i h\, c_S |g|_L && \text{from (an1)}. \end{aligned} \]
The conclusion is straightforward.
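The gain of one power of $h$ predicted by Lemma 2.31 can be observed numerically. The sketch below is an illustration under assumptions that are not in the text: the model problem is $-u'' = f$ on $(0,1)$ with $u(0) = u(1) = 0$ and manufactured solution $u(x) = \sin(\pi x)$, the discretization uses continuous piecewise linear elements on a uniform mesh, the load vector is approximated by Simpson's rule, $|\cdot|_L$ is the $L^2(0,1)$ norm, and $\|\cdot\|_V$ is taken to be the $H^1$-seminorm. The function names `solve_p1` and `errors` are hypothetical helpers.

```python
import math

def solve_p1(n):
    """Nodal values of the P1 Galerkin solution of -u'' = f on n uniform cells."""
    h = 1.0 / n
    f = lambda x: math.pi ** 2 * math.sin(math.pi * x)
    m = n - 1                                  # number of interior nodes
    # Load vector: Simpson's rule on the two cells supporting each hat function.
    rhs = [h / 3.0 * (f(i * h - h / 2) + f(i * h) + f(i * h + h / 2))
           for i in range(1, n)]
    diag = [2.0 / h] * m                       # stiffness matrix: tridiag(-1, 2, -1)/h
    off = -1.0 / h
    for i in range(1, m):                      # Thomas algorithm: elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    sol = [0.0] * m
    sol[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):             # Thomas algorithm: back substitution
        sol[i] = (rhs[i] - off * sol[i + 1]) / diag[i]
    return [0.0] + sol + [0.0]

def errors(n):
    """Return (L2 error, H1-seminorm error) of the P1 solution on n cells."""
    h = 1.0 / n
    uh = solve_p1(n)
    g = math.sqrt(0.6)                         # 3-point Gauss-Legendre rule on [-1, 1]
    rule = [(-g, 5.0 / 9.0), (0.0, 8.0 / 9.0), (g, 5.0 / 9.0)]
    l2 = h1 = 0.0
    for k in range(n):
        xl = k * h
        slope = (uh[k + 1] - uh[k]) / h
        for xi, w in rule:
            x = xl + 0.5 * h * (1.0 + xi)
            val = uh[k] + slope * (x - xl)
            l2 += 0.5 * h * w * (math.sin(math.pi * x) - val) ** 2
            h1 += 0.5 * h * w * (math.pi * math.cos(math.pi * x) - slope) ** 2
    return math.sqrt(l2), math.sqrt(h1)

ns = [8, 16, 32]
errs = [errors(n) for n in ns]
# Observed convergence rates between the two finest meshes.
rate_l2 = math.log2(errs[1][0] / errs[2][0])
rate_v = math.log2(errs[1][1] / errs[2][1])
# Ratio |e_h|_L / (h ||e_h||_V): Lemma 2.31 predicts that it stays bounded.
ratios = [l2 * n / v for n, (l2, v) in zip(ns, errs)]
```

The observed rates should be close to 2 in the $L^2$ norm and close to 1 in the $H^1$-seminorm, and the ratio $|e_h|_L / (h \|e_h\|_V)$ should remain essentially constant as the mesh is refined, which is the behavior Lemma 2.31 describes.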

Example 2.32.
(i) For a model problem with the Laplace operator, set $Z = H^2(\Omega)$, $V = H^1(\Omega)$, $L = L^2(\Omega)$, $|\cdot|_L = \|\cdot\|_{0,\Omega}$. Assumption (an1) is not straightforward but can be proven when $\Omega$ satisfies some regularity properties; see §3.1.3. Assumption (an2) is a direct consequence of Corollary 1.109.
(ii) Lemma 2.31 can also be applied to the Stokes problem. In this case, $|\cdot|_L$ is only a seminorm; see §2.4.2.

2.4 Saddle-Point Problems

This section treats a particular form of problem (2.1) encountered, for instance, when dealing with the Stokes problem. Owing to the particular form of this problem (a saddle-point problem), we give a more precise, although equivalent, characterization of well-posedness. Then, we analyze the approximation of saddle-point problems using Galerkin methods.

2.4.1 Well-posedness

Let $X$ and $M$ be two reflexive Banach spaces, $f \in X'$, $g \in M'$, and consider two bilinear forms $a \in \mathcal{L}(X \times X; \mathbb{R})$ and $b \in \mathcal{L}(X \times M; \mathbb{R})$. The abstract problem we investigate is:
\[ \begin{cases} \text{Seek } u \in X \text{ and } p \in M \text{ such that} \\ a(u, v) + b(v, p) = f(v), \quad \forall v \in X, \\ b(u, q) = g(q), \quad \forall q \in M. \end{cases} \tag{2.26} \]

Example 2.33. The prototype example for (2.26) is the Stokes problem; see Chapter 4 for a thorough presentation. In this case, $X = [H_0^1(\Omega)]^d$, $M = L^2_{\int=0}(\Omega)$, $a(u, v) = \int_\Omega \nabla u : \nabla v$, $b(v, p) = -\int_\Omega p\, \nabla\cdot v$, $f(v) = \int_\Omega f \cdot v$, and $g(q) = -\int_\Omega g q$.

Another way of looking at problem (2.26) consists of introducing $W = X \times M$, $c((u, p), (v, q)) = a(u, v) + b(v, p) + b(u, q)$, and $k(v, q) = f(v) + g(q)$. One can then consider the following problem:
\[ \begin{cases} \text{Seek } (u, p) \in W \text{ such that} \\ c((u, p), (v, q)) = k(v, q), \quad \forall (v, q) \in W. \end{cases} \tag{2.27} \]
It is clear that (2.26) and (2.27) are equivalent. As a result, necessary and sufficient conditions for the well-posedness of (2.26) are the two conditions (bnb1) and (bnb2) for the bilinear form $c$. However, owing to the particular structure of (2.26), it is possible to reformulate (bnb1) and (bnb2) in terms of

conditions on the bilinear forms $a$ and $b$. The goal of this section is to explore this point of view.

Introduce the operators $A$ and $B$ such that $A : X \to X'$ with $\langle Au, v \rangle_{X',X} = a(u, v)$ and $B : X \to M'$ (and $B^T : M = M'' \to X'$ since $M$ is reflexive) with $\langle Bv, q \rangle_{M',M} = b(v, q)$. Problem (2.26) is equivalent to:
\[ \begin{cases} \text{Seek } u \in X \text{ and } p \in M \text{ such that} \\ Au + B^T p = f, \\ Bu = g. \end{cases} \]
Let $\mathrm{Ker}(B) = \{v \in X;\ \forall q \in M,\ b(v, q) = 0\}$ be the nullspace of $B$ and let $\pi A : \mathrm{Ker}(B) \to \mathrm{Ker}(B)'$ be such that $\langle \pi A u, v \rangle = \langle Au, v \rangle_{X',X}$ for all $u, v \in \mathrm{Ker}(B)$.

Theorem 2.34. Under the above framework, problem (2.26) is well-posed if and only if
\[ \begin{cases} \exists \alpha > 0, \quad \displaystyle \inf_{u \in \mathrm{Ker}(B)} \sup_{v \in \mathrm{Ker}(B)} \frac{a(u, v)}{\|u\|_X \|v\|_X} \ge \alpha, \\ \forall v \in \mathrm{Ker}(B), \quad (\forall u \in \mathrm{Ker}(B),\ a(u, v) = 0) \Rightarrow (v = 0), \end{cases} \tag{2.28} \]
and
\[ \exists \beta > 0, \quad \inf_{q \in M} \sup_{v \in X} \frac{b(v, q)}{\|v\|_X \|q\|_M} \ge \beta. \tag{2.29} \]
Furthermore, the following a priori estimates hold:
\[ \begin{cases} \|u\|_X \le c_1 \|f\|_{X'} + c_2 \|g\|_{M'}, \\ \|p\|_M \le c_3 \|f\|_{X'} + c_4 \|g\|_{M'}, \end{cases} \tag{2.30} \]
with $c_1 = \frac{1}{\alpha}$, $c_2 = \frac{1}{\beta}\big(1 + \frac{\|a\|}{\alpha}\big)$, $c_3 = \frac{1}{\beta}\big(1 + \frac{\|a\|}{\alpha}\big)$, and $c_4 = \frac{\|a\|}{\beta^2}\big(1 + \frac{\|a\|}{\alpha}\big)$.

Proof. Problem (2.26) is well-posed if and only if the conditions (i) and (ii) of Theorem A.56 are satisfied. Owing to Corollary A.45 and the fact that $\mathrm{Ker}(B)$ is reflexive, the two conditions in (2.28) are equivalent to the fact that $\pi A$ is an isomorphism. Furthermore, inequality (2.29) is equivalent to the fact that $B$ is surjective owing to condition (A.9) of Lemma A.40 and the fact that $M$ is reflexive. Therefore, the well-posedness of problem (2.26) is equivalent to conditions (2.28) and (2.29).

Let us now prove the a priori estimates (2.30). From condition (2.29) and Lemma A.42 (since $M$ is reflexive), we deduce that there exists $u_g \in X$ such that $Bu_g = g$ and $\beta \|u_g\|_X \le \|g\|_{M'}$. Setting $\varphi = u - u_g$ yields
\[ \forall v \in \mathrm{Ker}(B), \quad a(\varphi, v) = f(v) - a(u_g, v). \]
Noting that

\[ |f(v) - a(u_g, v)| \le \big(\|f\|_{X'} + \|a\| \|u_g\|_X\big) \|v\|_X \le \Big(\|f\|_{X'} + \frac{\|a\|}{\beta} \|g\|_{M'}\Big) \|v\|_X, \]
where $\|a\| = \|a\|_{X,X}$, and taking the supremum for $v$ in $\mathrm{Ker}(B)$ yields
\[ \alpha \|\varphi\|_X \le \|f\|_{X'} + \frac{\|a\|}{\beta} \|g\|_{M'}, \]
owing to condition (2.28). The estimate on $u$ then results from this inequality and the triangle inequality $\|u\|_X \le \|u - u_g\|_X + \|u_g\|_X$. To prove the estimate on $p$, deduce from condition (2.29) and Lemma A.40 that $\beta \|p\|_M \le \|B^T p\|_{X'}$, yielding $\beta \|p\|_M \le \|a\| \|u\|_X + \|f\|_{X'}$. The estimate on $\|p\|_M$ then results from that on $\|u\|_X$.

Remark 2.35.
(i) If the bilinear form $a$ is coercive on $\mathrm{Ker}(B)$, the conditions in (2.28) are clearly fulfilled. These conditions are also fulfilled if $a$ is coercive on the whole space $X$.
(ii) Saddle-point problems are historically important in the engineering literature since they contributed to the popularization of inf-sup conditions. In particular, (2.29) is known as the Babuška–Brezzi condition [Bab73a, Bre74].

To stress the fact that (2.28) and (2.29) are nothing more than a restatement of the conditions (bnb1) and (bnb2) for problem (2.27), we state the following:

Proposition 2.36. Equip the space $W = X \times M$ with the norm $\|(u, p)\|_W = \|u\|_X + \|p\|_M$. Then, the bilinear form $c$ satisfies (bnb1) and (bnb2) if and only if (2.28) and (2.29) hold.

Proof. Let us prove that (2.28) and (2.29) imply (bnb1). Let $(u, p) \in W$. Let $\hat u \in X$ be such that $B\hat u = Bu$ and $\beta \|\hat u\|_X \le \|Bu\|_{M'}$. Clearly,
\[ \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W} \ge \sup_{q \in M} \frac{b(\hat u, q)}{\|q\|_M} = \|B\hat u\|_{M'} \ge \beta \|\hat u\|_X. \]
Moreover, owing to the fact that $u - \hat u$ is in $\mathrm{Ker}(B)$,
\[ \begin{aligned} \alpha \|u - \hat u\|_X &\le \sup_{v \in \mathrm{Ker}(B)} \frac{a(u - \hat u, v)}{\|v\|_X} = \sup_{v \in \mathrm{Ker}(B)} \frac{a(u - \hat u, v) + b(v, p) + b(u, 0)}{\|(v, 0)\|_W} \\ &\le \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W} + \|a\| \|\hat u\|_X \le \Big(1 + \frac{\|a\|}{\beta}\Big) \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W}. \end{aligned} \]

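Using the triangle inequality yields the following bound on $\|u\|_X$:
\[ \|u\|_X \le \|\hat u\|_X + \|u - \hat u\|_X \le \Big(\frac{1}{\beta} + \frac{1}{\alpha}\Big(1 + \frac{\|a\|}{\beta}\Big)\Big) \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W}. \]
To bound $\|p\|_M$, proceed as follows:
\[ \begin{aligned} \beta \|p\|_M &\le \sup_{v \in X} \frac{b(v, p)}{\|v\|_X} \le \sup_{v \in X} \frac{a(u, v) + b(v, p) + b(u, 0)}{\|(v, 0)\|_W} + \sup_{v \in X} \frac{a(u, v)}{\|v\|_X} \\ &\le \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W} + \|a\| \|u\|_X, \end{aligned} \]
implying
\[ \|p\|_M \le \frac{1}{\beta}\Big(1 + \|a\|\Big(\frac{1}{\beta} + \frac{1}{\alpha}\Big(1 + \frac{\|a\|}{\beta}\Big)\Big)\Big) \sup_{(v,q) \in W} \frac{c((u, p), (v, q))}{\|(v, q)\|_W}. \]
This proves (bnb1); the rest of the proof is left as an exercise.

One can generalize Proposition 2.4 to the abstract problem (2.26) assuming that the bilinear form $a$ is symmetric and positive. In particular, one can prove that problem (2.26) is equivalent to a saddle-point problem. Recall the following:

Definition 2.37. Given two sets $X$ and $M$, consider a mapping $\mathcal{L} : X \times M \to \mathbb{R}$. A pair $(u, p)$ is said to be a saddle-point of $\mathcal{L}$ if
\[ \forall (v, q) \in X \times M, \quad \mathcal{L}(u, q) \le \mathcal{L}(u, p) \le \mathcal{L}(v, p). \tag{2.31} \]

Lemma 2.38. $(u, p)$ is a saddle-point of $\mathcal{L}$ if and only if
\[ \inf_{v \in X} \sup_{q \in M} \mathcal{L}(v, q) = \sup_{q \in M} \mathcal{L}(u, q) = \mathcal{L}(u, p) = \inf_{v \in X} \mathcal{L}(v, p) = \sup_{q \in M} \inf_{v \in X} \mathcal{L}(v, q). \]

Proof. Definition 2.37 implies
\[ \inf_{v \in X} \sup_{q \in M} \mathcal{L}(v, q) \le \sup_{q \in M} \mathcal{L}(u, q) \le \mathcal{L}(u, p) \le \inf_{v \in X} \mathcal{L}(v, p) \le \sup_{q \in M} \inf_{v \in X} \mathcal{L}(v, q). \]
Moreover, for all pairs $(v, q) \in X \times M$,
\[ \inf_{v' \in X} \mathcal{L}(v', q) \le \mathcal{L}(v, q) \le \sup_{q' \in M} \mathcal{L}(v, q'), \]
yielding
\[ \sup_{q \in M} \inf_{v \in X} \mathcal{L}(v, q) \le \inf_{v \in X} \sup_{q \in M} \mathcal{L}(v, q). \]
Therefore,
\[ \inf_{v \in X} \sup_{q \in M} \mathcal{L}(v, q) = \sup_{q \in M} \mathcal{L}(u, q) = \mathcal{L}(u, p) = \inf_{v \in X} \mathcal{L}(v, p) = \sup_{q \in M} \inf_{v \in X} \mathcal{L}(v, q). \tag{2.32} \]
Note that the first equality means that the infimum over $v$ is reached at $u$, and the last equality means that the supremum over $q$ is reached at $p$.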
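Since Definition 2.37 only requires $X$ and $M$ to be sets, the chain of equalities in Lemma 2.38 can be checked by brute force on finite sets. The sketch below uses an illustrative choice that is not taken from the text: $X = M$ is a uniform grid of 17 points in $[-2, 2]$ and $\mathcal{L}(v, q) = v^2 + vq - q^2$, whose saddle-point is $(u, p) = (0, 0)$.

```python
def L(v, q):
    """Hypothetical convex-concave function with saddle-point (0, 0)."""
    return v * v + v * q - q * q

# Finite sets X = M: 17 equally spaced points in [-2, 2], including 0.
grid = [k / 4.0 for k in range(-8, 9)]

u, p = 0.0, 0.0

# Saddle-point inequalities (2.31): L(u, q) <= L(u, p) <= L(v, p).
is_saddle = all(L(u, q) <= L(u, p) <= L(v, p) for v in grid for q in grid)

# The quantities appearing in Lemma 2.38.
inf_sup = min(max(L(v, q) for q in grid) for v in grid)
sup_at_u = max(L(u, q) for q in grid)
inf_at_p = min(L(v, p) for v in grid)
sup_inf = max(min(L(v, q) for v in grid) for q in grid)
```

All four quantities coincide with $\mathcal{L}(u, p) = 0$ on this grid: the outer infimum is attained at $v = u$ and the outer supremum at $q = p$.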

Proposition 2.39. Assume that $a$ is symmetric and positive. Then, the pair $(u, p)$ solves (2.26) if and only if $(u, p)$ is a saddle-point of the Lagrangian functional
\[ \mathcal{L}(v, q) = \tfrac{1}{2} a(v, v) + b(v, q) - f(v) - g(q). \tag{2.33} \]

Proof. Let $(u, p)$ be an arbitrary pair in $X \times M$. Clearly,
\[ (\forall q \in M,\ \mathcal{L}(u, q) \le \mathcal{L}(u, p)) \iff (\forall q \in M,\ b(u, q - p) \le g(q - p)) \iff (\forall q \in M,\ b(u, q) = g(q)). \]
(In the last equivalence, the fact that $M$ is a vector space has been used.) Therefore, the first inequality in (2.31) is equivalent to stating that $u$ satisfies the second equality in problem (2.26). For $p \in M$, consider now the functional $J_p(v) = \tfrac{1}{2} a(v, v) + b(v, p) - f(v)$. One readily verifies that
\[ (\forall v \in X,\ \mathcal{L}(u, p) \le \mathcal{L}(v, p)) \iff \Big(J_p(u) = \min_{v \in X} J_p(v)\Big) \iff (\forall v \in X,\ a(u, v) + b(v, p) = f(v)), \]
where the last equivalence is a direct consequence of Proposition 2.4. Therefore, the second inequality in (2.31) is equivalent to stating that the pair $(u, p)$ satisfies the first equality in problem (2.26).

Remark 2.40. When $a$ is symmetric and positive, (2.26) is often termed a saddle-point problem.

Corollary 2.41. Assume that $a$ is symmetric and positive and that the two conditions (2.28) and (2.29) are fulfilled. Then:
(i) Problem (2.26) admits a unique solution.
(ii) This solution is the unique saddle-point of the functional (2.33).
(iii) This solution satisfies (2.32).

2.4.2 Approximation

This section studies conforming approximations to problem (2.26). Let $X_h$ be a subspace of $X$ and let $M_h$ be a subspace of $M$. Assume that $X_h$ and $M_h$ are finite-dimensional and consider the approximate problem:
\[ \begin{cases} \text{Seek } u_h \in X_h \text{ and } p_h \in M_h \text{ such that} \\ a(u_h, v_h) + b(v_h, p_h) = f(v_h), \quad \forall v_h \in X_h, \\ b(u_h, q_h) = g(q_h), \quad \forall q_h \in M_h. \end{cases} \tag{2.34} \]
Let $B_h : X_h \to M_h'$ be the operator induced by $b$ such that $\langle B_h v_h, q_h \rangle_{M_h',M_h} = b(v_h, q_h)$. Let $\mathrm{Ker}(B_h)$ be the nullspace of $B_h$, i.e.,
\[ \mathrm{Ker}(B_h) = \{v_h \in X_h;\ \forall q_h \in M_h,\ b(v_h, q_h) = 0\}. \]
We first address the well-posedness of the approximate problem (2.34).
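Once bases of $X_h$ and $M_h$ are chosen, problem (2.34) becomes a block linear system with matrix $\begin{pmatrix} A & B^T \\ B & 0 \end{pmatrix}$. The sketch below assembles and solves such a system and checks the saddle-point inequalities (2.31) for the Lagrangian (2.33) on sampled perturbations. All data are illustrative assumptions, not taken from the text: $X_h = \mathbb{R}^2$, $M_h = \mathbb{R}$, and the matrices `A`, `B` and vectors `f`, `g` are hypothetical; `solve` is a small dense Gaussian elimination written for this example only.

```python
def solve(mat, rhs):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(mat, rhs)]   # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[piv] = m[piv], m[k]
        for i in range(k + 1, n):
            w = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= w * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# Hypothetical data: a(u, v) = v^T A u (symmetric positive definite),
# b(v, q) = q^T B v, f(v) = f^T v, g(q) = g^T q.
A = [[2.0, 0.0], [0.0, 1.0]]
B = [[1.0, 1.0]]
f = [1.0, 3.0]
g = [2.0]
nx, nm = len(A), len(B)

# Block matrix [[A, B^T], [B, 0]] of the discrete problem.
K = [A[i] + [B[k][i] for k in range(nm)] for i in range(nx)]
K += [B[k] + [0.0] * nm for k in range(nm)]
sol = solve(K, f + g)
u, p = sol[:nx], sol[nx:]

def lagrangian(v, q):
    """Lagrangian (2.33) written for the matrices above."""
    quad = 0.5 * sum(v[i] * A[i][j] * v[j] for i in range(nx) for j in range(nx))
    coupling = sum(q[k] * B[k][i] * v[i] for k in range(nm) for i in range(nx))
    load = sum(fi * vi for fi, vi in zip(f, v)) + sum(gk * qk for gk, qk in zip(g, q))
    return quad + coupling - load

# Check L(u, q) <= L(u, p) <= L(v, p) on a grid of perturbations.
steps = [k / 2.0 for k in range(-2, 3)]
L0 = lagrangian(u, p)
tol = 1e-12
saddle_ok = all(
    lagrangian(u, [p[0] + dq]) <= L0 + tol
    and L0 <= lagrangian([u[0] + d0, u[1] + d1], p) + tol
    for dq in steps for d0 in steps for d1 in steps
)
```

For these data the solution is $u_h = (0, 2)$ and $p_h = 1$. Since $b(u_h, q) = g(q)$ for all $q$, the Lagrangian is constant in $q$ at $v = u_h$, which is the equality case of the first inequality in (2.31).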

Proposition 2.42. Problem (2.34) is well-posed if and only if
\[ \exists \alpha_h > 0, \quad \inf_{u_h \in \mathrm{Ker}(B_h)} \sup_{v_h \in \mathrm{Ker}(B_h)} \frac{a(u_h, v_h)}{\|u_h\|_X \|v_h\|_X} \ge \alpha_h, \tag{2.35} \]
\[ \exists \beta_h > 0, \quad \inf_{q_h \in M_h} \sup_{v_h \in X_h} \frac{b(v_h, q_h)}{\|v_h\|_X \|q_h\|_M} \ge \beta_h. \tag{2.36} \]

Proof. Apply Theorem 2.34 and use the fact that in finite dimension, condition (2.35) implies both conditions in (2.28); see Proposition 2.21(iii).

Remark 2.43. Condition (2.36) is equivalent to assuming that $B_h$ is surjective; see Lemma A.40.

Our next goal is to estimate the approximation errors $u - u_h$ and $p - p_h$. We first derive an a priori estimate similar to Céa's Lemma.

Lemma 2.44. Under the assumptions (2.35) and (2.36), letting $\|a\| = \|a\|_{X,X}$ and $\|b\| = \|b\|_{X,M}$, the solution $(u_h, p_h)$ to (2.34) satisfies the estimates
\[ \begin{aligned} \|u - u_h\|_X &\le c_{1h} \inf_{v_h \in X_h} \|u - v_h\|_X + c_{2h} \inf_{q_h \in M_h} \|p - q_h\|_M, \\ \|p - p_h\|_M &\le c_{3h} \inf_{v_h \in X_h} \|u - v_h\|_X + c_{4h} \inf_{q_h \in M_h} \|p - q_h\|_M, \end{aligned} \]
with $c_{1h} = \big(1 + \frac{\|a\|}{\alpha_h}\big)\big(1 + \frac{\|b\|}{\beta_h}\big)$, $c_{2h} = \frac{\|b\|}{\alpha_h}$ if $\mathrm{Ker}(B_h) \not\subset \mathrm{Ker}(B)$ and $c_{2h} = 0$ otherwise, $c_{3h} = c_{1h} \frac{\|a\|}{\beta_h}$, and $c_{4h} = 1 + \frac{\|b\|}{\beta_h} + c_{2h} \frac{\|a\|}{\beta_h}$.

Proof. Introduce the notation
\[ Z_h(g) = \{w_h \in X_h;\ \forall q_h \in M_h,\ b(w_h, q_h) = g(q_h)\}. \tag{2.37} \]
Clearly, $Z_h(g)$ is non-empty because the operator $B_h$ is surjective. Let $v_h$ be arbitrary in $X_h$. Since $B_h$ verifies (2.36), the reciprocal of Lemma A.42 implies the existence of $r_h$ in $X_h$ such that
\[ \forall q_h \in M_h, \quad b(r_h, q_h) = b(u - v_h, q_h) \]
and $\beta_h \|r_h\|_X \le \|b\| \|u - v_h\|_X$. It is clear that $b(r_h + v_h, q_h) = g(q_h)$, showing that $r_h + v_h$ is in $Z_h(g)$. Let $w_h = r_h + v_h$. Since $w_h$ is in $Z_h(g)$, $u_h - w_h$ is in $\mathrm{Ker}(B_h)$, yielding
\[ \begin{aligned} \alpha_h \|u_h - w_h\|_X &\le \sup_{y_h \in \mathrm{Ker}(B_h)} \frac{a(u_h - w_h, y_h)}{\|y_h\|_X} = \sup_{y_h \in \mathrm{Ker}(B_h)} \frac{a(u_h - u, y_h) + a(u - w_h, y_h)}{\|y_h\|_X} \\ &= \sup_{y_h \in \mathrm{Ker}(B_h)} \frac{b(y_h, p - p_h) + a(u - w_h, y_h)}{\|y_h\|_X}. \end{aligned} \]
If $\mathrm{Ker}(B_h) \subset \mathrm{Ker}(B)$, then $b(y_h, p - p_h) = 0$ for $y_h \in \mathrm{Ker}(B_h)$; hence,

\[ \alpha_h \|u_h - w_h\|_X \le \|a\| \|u - w_h\|_X. \]
Using the triangle inequality yields
\[ \|u - u_h\|_X \le \Big(1 + \frac{\|a\|}{\alpha_h}\Big) \|u - w_h\|_X. \]
In the general case, $b(y_h, p_h) = 0 = b(y_h, q_h)$ for all $q_h \in M_h$ since $y_h$ is in $\mathrm{Ker}(B_h)$, implying
\[ \alpha_h \|u_h - w_h\|_X \le \|a\| \|u - w_h\|_X + \|b\| \|p - q_h\|_M. \]
Using the triangle inequality yields
\[ \|u - u_h\|_X \le \Big(1 + \frac{\|a\|}{\alpha_h}\Big) \|u - w_h\|_X + \frac{\|b\|}{\alpha_h} \|p - q_h\|_M. \]
The estimate on $\|u - u_h\|_X$ then results from the inequality
\[ \|u - w_h\|_X \le \|u - v_h\|_X + \|r_h\|_X \le \Big(1 + \frac{\|b\|}{\beta_h}\Big) \|u - v_h\|_X. \]
We now estimate $\|p - p_h\|_M$. Since $b(v_h, p - p_h) = a(u_h - u, v_h)$ for all $v_h$ in $X_h$, we can introduce an arbitrary $q_h \in M_h$ to obtain
\[ \forall v_h \in X_h, \quad b(v_h, q_h - p_h) = a(u_h - u, v_h) + b(v_h, q_h - p). \]
Condition (2.36) then implies
\[ \beta_h \|q_h - p_h\|_M \le \|a\| \|u - u_h\|_X + \|b\| \|p - q_h\|_M. \]
The final result readily follows from the triangle inequality.

We now establish an error estimate based on the Aubin–Nitsche Lemma. To this end, we introduce the following assumptions:

(anm1) There exists a Hilbert space $H$ into which $X$ can be continuously embedded. Denote by $\|\cdot\|_H$ and $(\cdot,\cdot)_H$ the norm and the scalar product in $H$, respectively. We further assume that there exist two Banach spaces $Y \subset X$ and $N \subset M$ and a stability constant $c_S > 0$ such that, for all $g \in H$, the solution to the adjoint problem:
\[ \begin{cases} \text{Seek } \phi(g) \in X \text{ and } \vartheta(g) \in M \text{ such that} \\ a(v, \phi(g)) + b(v, \vartheta(g)) = (g, v)_H, \quad \forall v \in X, \\ b(\phi(g), q) = 0, \quad \forall q \in M, \end{cases} \]
satisfies the a priori estimate $\|\phi(g)\|_Y + \|\vartheta(g)\|_N \le c_S \|g\|_H$.

(anm2) There exists an interpolation constant $c_i > 0$ such that, for all $h$ and $(v, q) \in Y \times N$,
\[ \inf_{(v_h, q_h) \in X_h \times M_h} \big(\|v - v_h\|_X + \|q - q_h\|_M\big) \le c_i h \big(\|v\|_Y + \|q\|_N\big). \]

Lemma 2.45. Under the assumptions (anm1)–(anm2), there is $c$ such that
\[ \forall h, \quad \|u - u_h\|_H \le c\, h \big(\|u - u_h\|_X + \|p - p_h\|_M\big). \]

Proof. Set $V = X \times M$, $Z = Y \times N$, and $L = H \times M$ equipped with the product norms. Define the symmetric positive bilinear form $l((v, q), (w, r)) = (v, w)_H$ and the seminorm $|(v, q)|_L = \|v\|_H$. To conclude, apply Lemma 2.31 using the bilinear form $c((u, p), (v, q)) = a(u, v) + b(v, p) + b(u, q)$.

2.5 Exercises

Exercise 2.1. Let $V$ and $W$ be two Banach spaces and let $a \in \mathcal{L}(W \times V; \mathbb{R})$. Let $A : W \to V'$ be the mapping defined in (2.6). Show that $\|A\|_{\mathcal{L}(W;V')} = \|a\|_{W,V}$.

Exercise 2.2. Use Proposition 2.4 to prove Proposition A.31.

Exercise 2.3. Prove Lemmas A.39 and A.40. (Hint: Use the Closed Range Theorem and the Open Mapping Theorem.)

Exercise 2.4. Let $V$ be a real Hilbert space equipped with the scalar product $(\cdot,\cdot)_V$ and norm $\|\cdot\|_V$. Let $U$ be a nonempty, closed, and convex subset of $V$.
(i) Let $f \in V$. Show that there is a unique $u$ in $U$ such that $\|f - u\|_V = \min_{v \in U} \|f - v\|_V$. (Hint: Consider a minimizing sequence and show that it is a Cauchy sequence.)
(ii) Show that $u$ is the solution to the above minimization problem if and only if $(f - u, v - u)_V \le 0$ for all $v \in U$.
(iii) Let $a$ be a continuous, symmetric, and $V$-coercive bilinear form. Let $L$ be a continuous linear form on $V$. Set $J(v) = \tfrac{1}{2} a(v, v) - L(v)$. Show that there is a unique $u \in U$ such that $J(u) = \min_{v \in U} J(v)$ and that $u$ is a minimizing solution if and only if $a(u, v - u) \ge L(v - u)$ for all $v \in U$.

Exercise 2.5. Use the notation and results of Exercise 2.4. Let $u$ be the unique element in $U$ such that $a(u, v - u) \ge L(v - u)$ for all $v \in U$. Let $V_h$ be a finite-dimensional subspace of $V$, and let $U_h$ be a nonempty, closed, and convex subset of $V_h$. Owing to Exercise 2.4, there is a unique $u_h$ in $U_h$ such that $a(u_h, v_h - u_h) \ge L(v_h - u_h)$ for all $v_h \in U_h$.
(i) Show that there is $c_1(u)$ such that, for all $v \in U$ and all $v_h \in U_h$,
\[ \|u - u_h\|_V^2 \le c_1(u) \big(\|u - v_h\|_V + \|u_h - v\|_V + \|u - u_h\|_V \|u - v_h\|_V\big). \]
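(Hint: Prove $\alpha \|u - u_h\|_V^2 \le a(u, v - u_h) - L(v - u_h) + a(u_h, v_h - u) - L(v_h - u)$.)
(ii) Show that there is $c_2(u)$ such that
\[ \|u - u_h\|_V \le c_2(u) \Big( \inf_{v_h \in U_h} \big(\|u - v_h\|_V + \|u - v_h\|_V^2\big)^{1/2} + \inf_{v \in U} \|u_h - v\|_V^{1/2} \Big). \]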

Exercise 2.6. Prove Lemmas A.39 and A.40.

Exercise 2.7. Let $A \in \mathbb{R}^{N,N}$ be a non-singular matrix. Show that
\[ \min_{w \in \mathbb{R}^N} \max_{v \in \mathbb{R}^N} \frac{(Aw, v)_N}{\|v\|_N \|w\|_N} = \min_{v \in \mathbb{R}^N} \max_{w \in \mathbb{R}^N} \frac{(A^T v, w)_N}{\|v\|_N \|w\|_N} > 0. \]
Does this property still hold when $A \in \mathcal{L}(W; V')$ is a bijective Banach operator?

Exercise 2.8. Let $V$ be a Banach space. Prove that $V$ can be equipped with a Hilbert structure with the same topology if and only if there is a coercive operator in $\mathcal{L}(V; V')$. (Hint: Think of $\langle Au, v \rangle_{V',V} + \langle Av, u \rangle_{V',V}$.)

Exercise 2.9. Let $V$ be a reflexive Banach space and let $A \in \mathcal{L}(V; V')$ be a monotone self-adjoint operator; see §A.2.4. Prove that $A$ is bijective if and only if $A$ is coercive. (Hint: Prove that if $A$ is monotone and self-adjoint, the following inequality holds: $\forall v, w \in V$, $\langle Av, w \rangle_{V',V} \le \langle Av, v \rangle_{V',V}^{1/2} \langle Aw, w \rangle_{V',V}^{1/2}$; then, use this inequality in the inf-sup condition satisfied by $A$.)

Exercise 2.10. Let $a \in \mathcal{L}(V \times V; \mathbb{R})$ be a symmetric coercive bilinear form on a Hilbert space $V$. Explain why the Lax–Milgram Lemma is nothing more than a rephrasing of the Riesz–Fréchet Theorem.

Exercise 2.11. The goal of this exercise is to prove the Lax–Milgram Lemma without using the BNB Theorem. Assume the hypotheses of the Lax–Milgram Lemma and let $A : V \ni u \mapsto a(u, \cdot) \in V'$.
(i) Prove that coercivity implies $\|Au\|_{V'} \ge \alpha \|u\|_V$.
(ii) Prove that $A$ is injective and $\mathrm{Im}(A)$ is closed.
(iii) Prove that $\mathrm{Im}(A)$ is dense in $V'$. (Hint: Use Corollary A.18.)
(iv) Conclude.

Exercise 2.12. Complete the proof of Proposition 2.36.

Exercise 2.13. Prove Proposition 6.55.

Exercise 2.14. Let $X_1$, $X_2$, $M_1$, and $M_2$ be four reflexive Banach spaces, $f \in X_2'$, $g \in M_2'$. Let $A \in \mathcal{L}(X_1; X_2')$, $B_1 \in \mathcal{L}(X_2; M_1')$, and $B_2 \in \mathcal{L}(X_1; M_2')$. Consider the problem:
\[ \begin{cases} \text{Seek } u \in X_1 \text{ and } p \in M_1 \text{ such that} \\ Au + B_1^T p = f, \\ B_2 u = g. \end{cases} \]