LECTURE 1: SOURCES OF ERRORS. MATHEMATICAL TOOLS. A PRIORI ERROR ESTIMATES

Sergey Korotov, Institute of Mathematics, Helsinki University of Technology, Finland
Academy of Finland

Main Problem in Mathematical Modeling and Numerical Analysis

One of the most challenging problems in mathematical modeling and numerical analysis, which is also of great importance for most industrial applications, is the

RELIABLE VERIFICATION OF THE ACCURACY OF APPROXIMATE SOLUTIONS OBTAINED IN COMPUTER SIMULATIONS

Mathematically, this task is related to the so-called a posteriori error estimates, which give computable bounds for errors of various types and detect zones where such errors are excessively high and some mesh-refinement algorithm should be used.

SOURCES OF ERRORS IN MODELING AND COMPUTATIONS

In this section we discuss the sources of the main types of errors in the modeling and computational phases.

Modeling Error

In the modeling phase, a physical process (or object) of interest is usually described by means of a certain mathematical model. Then the so-called modeling error

$$\varepsilon_1 = \|U - u\|$$

arises, where $U$ is the value characterizing the process (object) and $u$ is the corresponding value obtained from the mathematical model.

This error exists because, e.g., second-order phenomena are usually neglected, there is uncertainty in the problem data, dimensional reduction is often used to simplify models, etc.

The symbol $\|\cdot\|$ denotes a convenient measure of the difference between $U$ and $u$ (e.g., the absolute value of the difference or a norm in a suitable functional space).

Discretization Error

In most cases, the exact solution $u$ cannot be obtained in analytical (explicit) form because of the high complexity of models constructed for real-life problems. Indeed, adequate models for complicated processes normally involve several differential and integral equations, various algebraic relations, etc. The solution $u$ of such complicated models is often understood in an abstract sense, as an element of a certain functional space.

Qualitative properties of $u$ can be studied by purely mathematical methods, but quantitative analysis normally requires replacing the original problem by a sequence of simpler problems whose solutions can be found relatively easily.

Let $u_h$ be the solution of an approximate problem obtained on a mesh with characteristic size $h$. Then the discretization error

$$\varepsilon_2 = \|u - u_h\|$$

appears.

Computational Error

In turn, those approximate problems are themselves solved approximately, using concrete computers and concrete software packages, so a third type of error, called the computational error,

$$\varepsilon_3 = \|u_h - u_h^{\varepsilon}\|,$$

appears, where $u_h^{\varepsilon}$ is what we really obtain in computer simulations.

The error $\varepsilon_3$ includes roundoff errors, errors due to forcibly stopped iterative processes, and errors caused by bugs in computer codes. It is clear that estimation of $\varepsilon_3$ is an extremely difficult task.

Everything said above is summarized in the following diagram.
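To make the distinction between $\varepsilon_2$ and $\varepsilon_3$ concrete, here is a minimal sketch under stated assumptions. It is not the finite element setting of these slides: as a stand-in, it uses a second-order finite-difference discretization of $-u'' = \pi^2 \sin(\pi x)$ on $(0,1)$ with $u(0)=u(1)=0$, whose exact solution is $u = \sin(\pi x)$. The discrete problem is solved "exactly" (up to roundoff) by the Thomas algorithm to obtain $u_h$, and approximately by a Jacobi iteration stopped after a fixed number of sweeps to mimic $u_h^{\varepsilon}$. All function names and the maximum-norm error measure are illustrative choices.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm (exact up to roundoff)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def jacobi(rhs, h, iterations):
    """Jacobi sweeps for (2u_i - u_{i-1} - u_{i+1}) / h^2 = rhs_i, started from zero."""
    n = len(rhs)
    u = [0.0] * n
    for _ in range(iterations):
        padded = [0.0] + u + [0.0]  # homogeneous Dirichlet values at both ends
        u = [0.5 * (h * h * rhs[i] + padded[i] + padded[i + 2]) for i in range(n)]
    return u

def max_diff(a, b):
    """Maximum-norm distance between two nodal vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

n = 16                      # number of mesh cells (assumed for illustration)
h = 1.0 / n
xs = [i * h for i in range(1, n)]            # interior nodes
u_exact = [math.sin(math.pi * x) for x in xs]
rhs = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
m = n - 1

u_h = thomas([-1.0 / h**2] * m, [2.0 / h**2] * m, [-1.0 / h**2] * m, rhs)

eps2 = max_diff(u_exact, u_h)                       # discretization error
eps3_early = max_diff(u_h, jacobi(rhs, h, 50))      # iteration stopped early
eps3_late = max_diff(u_h, jacobi(rhs, h, 2000))     # iteration run much longer

print("eps2 =", eps2)
print("eps3 after 50 sweeps   =", eps3_early)
print("eps3 after 2000 sweeps =", eps3_late)
```

In this toy setting a prematurely stopped iteration produces a computational error that dominates the discretization error, while a sufficiently long iteration makes $\varepsilon_3$ negligible compared with $\varepsilon_2$.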

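$U$ (physical process)
  | $\varepsilon_1$: modeling error
$u$ (mathematical model): $Au = f$
  | $\varepsilon_2$: discretization error
$u_h$ (discrete model): $A_h u_h = f_h$
  | $\varepsilon_3$: computational error
$u_h^{\varepsilon}$ (numerical solution): $A_h^{\varepsilon} u_h^{\varepsilon} = f_h^{\varepsilon}$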

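Two Principal Relations

Computations on the basis of a reliable (certified) model. Here $\varepsilon_1$ is assumed to be small, and the computed $u_h^{\varepsilon}$ gives the desired information on $U$:

$$\|U - u_h^{\varepsilon}\| \le \varepsilon_1 + \varepsilon_2 + \varepsilon_3 \qquad (1)$$

Verification of a mathematical model. Here the physical data $U$ and the computed results $u_h^{\varepsilon}$ are compared in order to judge the quality of the mathematical model:

$$\varepsilon_1 \le \|U - u_h^{\varepsilon}\| + \varepsilon_2 + \varepsilon_3 \qquad (2)$$

Both relations follow from the triangle inequality.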

Thus, the two major problems of mathematical modeling,

reliable computer simulation, and
verification of mathematical models by comparing physical and mathematical experiments,

require efficient methods able to provide COMPUTABLE AND REALISTIC estimates of $\varepsilon_2 + \varepsilon_3$.

No Error Control - No Reliability

In order to have really reliable computations we have to do both things: obtain an approximate solution and also explicitly control the error $\varepsilon_2 + \varepsilon_3$.

In practical computations the second part of the total work is often ignored. Mostly this happens because the problem is very complicated and computable error estimates are simply unknown. In other cases, it happens because analysts believe that approximations obtained with sufficiently fine meshes and powerful computers have values of $\varepsilon_2$ and $\varepsilon_3$ so small that they need not be taken into account at all.

Such an approach does not provide reliable computer simulations. Approximate solutions are quite sensitive to mesh perturbations, and restructuring the mesh can lead to very different results.

[Figure: exact solution and three meshes with the corresponding errors. Mesh with 741 nodes: error = 0.09212. Mesh with 785 nodes: error = 0.11562. Mesh with 855 nodes: error = 0.44662.]

Mathematical Model: Solution $u$ and Norm $\|\cdot\|$

Computing approximate solutions without real error control is just a part of the whole business. However, to develop tools for error control we must first establish the "rules of the game":

What do we mean by a solution of the mathematical model (e.g., of the given BVP)?
How do we measure the difference between the exact solution of the given model and computed approximations?

In order to answer these two questions we shall introduce the main definitions and notation (the "mathematical tools") needed.

MATHEMATICAL TOOLS: DEFINITIONS AND TERMINOLOGY

Basic Facts from Functional Analysis

Definition: A set $V$ is called a linear vector space if it has the following properties: if $v, w \in V$ are arbitrary and $\alpha$ is a real (or complex) number, then $v + w$ and $\alpha v$ also belong to $V$; moreover, these operations obey the usual rules of algebra, i.e.:

$v + w = w + v$,
$v + (w + z) = (v + w) + z$,
$\alpha(v + w) = \alpha v + \alpha w$,
$(\alpha + \beta)v = \alpha v + \beta v$,
$\alpha(\beta v) = (\alpha\beta)v$,
$1 \cdot v = v$,
if $v + w = v + z$ then $w = z$.

Definition: A norm on a linear vector space $V$ is a real function denoted by $\|\cdot\|_V$ which satisfies the following conditions:

$\|v + w\|_V \le \|v\|_V + \|w\|_V$ (triangle inequality),
$\|\alpha v\|_V = |\alpha| \, \|v\|_V$,
$\|v\|_V \ne 0$ if $v \ne 0$,

for any $v, w \in V$ and any scalar $\alpha$ (by a scalar we mean a real or complex number).

It follows easily from the above that a norm is a nonnegative function on $V$, i.e., $\|v\|_V \ge 0$ for any $v \in V$.

A linear vector space equipped with a norm is called a normed linear vector space.

Definition: A Banach space $V$ is a normed linear vector space which is complete, that is, for each Cauchy sequence $\{v_j\}_{j=1}^{\infty} \subset V$ there exists an element $v \in V$ such that $\|v_j - v\|_V \to 0$ as $j \to \infty$.

Definition: We say that $T$ is a linear operator from a Banach space $V$ to a Banach space $U$ if $T(v + w) = T(v) + T(w)$ and $T(\alpha v) = \alpha T(v)$ for any $v, w \in V$ and any scalar $\alpha$.

Definition: A linear operator $T : V \to U$ is continuous if there exists a constant $C > 0$ such that

$$\|Tv\|_U \le C \|v\|_V \quad \forall v \in V.$$

Let $V$ be a linear vector space and let $\alpha^{C}$ denote the complex conjugate of $\alpha$.

Definition: A scalar (inner) product on $V \times V$ is a complex function denoted by $(\cdot,\cdot)_V$ which satisfies the following conditions:

$(v + y, w)_V = (v, w)_V + (y, w)_V$,
$(\alpha v, w)_V = \alpha (v, w)_V$,
$(v, w)_V = ((w, v)_V)^{C}$,
$(v, v)_V \ge 0$, and $(v, v)_V \ne 0$ if $v \ne 0$,

for any $v, w, y \in V$ and any scalar $\alpha$.

If $V$ is a real space then $(\cdot,\cdot)_V$ is a real function and the symbol $(\cdot)^{C}$ can be omitted.

For the so-called induced norm

$$\|v\|_V = \sqrt{(v, v)_V}, \quad v \in V,$$

we have the well-known Cauchy-Schwarz inequality

$$|(v, w)_V| \le \|v\|_V \, \|w\|_V \quad \forall v, w \in V.$$

Moreover,

$$\|v\|_V = \sup_{w \ne 0} \frac{|(v, w)_V|}{\|w\|_V}.$$

A linear operator whose values are scalars is called a linear form (linear functional).

A Banach space with a scalar product is called a Hilbert space.

Riesz Theorem: Let $V$ be a Hilbert space. Then for any linear continuous form $F$ defined on $V$ there exists exactly one element $u \in V$ such that

$$F(v) = (v, u)_V \quad \forall v \in V.$$

Definition: A scalar mapping $a(\cdot,\cdot)$ defined on $V \times V$, where $V$ is a linear vector space, is said to be a sesquilinear form if for any fixed $v \in V$ the mappings $a(\cdot, v)$ and $(a(v, \cdot))^{C}$ are linear. If, moreover, $a(\cdot,\cdot)$ is real-valued, then $a(\cdot,\cdot)$ is called a bilinear form.

A sesquilinear form $a(\cdot,\cdot)$ is said to be continuous if there exists a constant $C_1 > 0$ such that

$$|a(v, w)| \le C_1 \|v\|_V \, \|w\|_V \quad \forall v, w \in V.$$

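Lax-Milgram Lemma: Let $V$ be a Hilbert space and let $a(\cdot,\cdot)$ be a continuous sesquilinear form for which there exists a constant $C_2 > 0$ such that

$$|a(v, v)| \ge C_2 \|v\|_V^2 \quad \forall v \in V \quad \text{($V$-ellipticity condition)}.$$

Then for any linear continuous form $F$ defined on $V$ there exists exactly one element $u \in V$ such that

$$a(v, u) = F(v) \quad \forall v \in V. \qquad \text{(LM)}$$

In fact, continuity and $V$-ellipticity together say that $\sqrt{|a(v, v)|}$ is equivalent to the $V$-norm; when $a(\cdot,\cdot)$ is in addition Hermitian, $\sqrt{a(v,v)}$ is itself a norm on $V$ equivalent to $\|\cdot\|_V$.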

A sesquilinear form $a(\cdot,\cdot)$ is said to be Hermitian if

$$a(v, w) = (a(w, v))^{C} \quad \forall v, w \in V. \qquad \text{(H)}$$

Theorem: Let the assumptions of the Lax-Milgram lemma be satisfied and let the sesquilinear form $a(\cdot,\cdot)$ be Hermitian. Then the problem (LM) of the Lax-Milgram lemma (find $u \in V$ with $a(v, u) = F(v)$ for all $v \in V$) is equivalent to the problem: Find $u \in V$ such that

$$J(u) = \inf_{v \in V} J(v), \qquad \text{(M)}$$

where $J$ is the quadratic functional given by

$$J(v) = \tfrac{1}{2} a(v, v) - \mathrm{Re}(F(v)), \quad v \in V. \qquad \text{(J)}$$

The functional (J) is called the energy functional.

Note that in the real case the relation (H) reduces to $a(v, w) = a(w, v)$, and the corresponding bilinear form is then said to be symmetric.

Functional Spaces $L^p(\Omega)$ and $C^k(\overline{\Omega})$

Let $R^d$ stand for the Euclidean space equipped with the norm

$$\|x\| = \Big( \sum_{j=1}^{d} x_j^2 \Big)^{1/2}, \quad x = (x_1, \ldots, x_d)^T \in R^d.$$

The Lebesgue space of real or complex-valued functions defined over an open set $\Omega \subset R^d$ which are integrable with the power $p \in [1, \infty)$ is denoted by $L^p(\Omega)$ and equipped with the norm

$$\|v\|_{0,p,\Omega} = \Big( \int_\Omega |v|^p \, dx \Big)^{1/p}, \quad v \in L^p(\Omega).$$

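When $p = 2$, we write for short $\|\cdot\|_{0,\Omega} = \|\cdot\|_{0,2,\Omega}$ and define the scalar (inner) product as follows:

$$(v, w)_{0,\Omega} = \int_\Omega v \, w^{C} \, dx.$$

Recall the well-known Hölder inequality

$$\int_\Omega |v w| \, dx \le \|v\|_{0,p,\Omega} \, \|w\|_{0,q,\Omega}, \quad v \in L^p(\Omega), \; w \in L^q(\Omega),$$

which holds for any $p, q \in (1, \infty)$ satisfying the equality

$$\frac{1}{p} + \frac{1}{q} = 1.$$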
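As a quick numerical sanity check of the Hölder inequality, the following sketch evaluates both sides on $\Omega = (0,1)$ for the illustrative choices $v(x) = x$, $w(x) = 1 - x$, and $p = 3$ (so $q = 3/2$). The functions, the exponent, and the midpoint quadrature rule are assumptions made only for this example.

```python
def integrate(func, a=0.0, b=1.0, cells=2000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))

def lp_norm(func, p):
    """Approximate L^p(0,1) norm of func for finite p."""
    return integrate(lambda x: abs(func(x)) ** p) ** (1.0 / p)

v = lambda x: x
w = lambda x: 1.0 - x

p = 3.0
q = p / (p - 1.0)           # conjugate exponent, 1/p + 1/q = 1

lhs = integrate(lambda x: abs(v(x) * w(x)))   # integral of |v w|, exactly 1/6
rhs = lp_norm(v, p) * lp_norm(w, q)

print("integral of |vw| =", lhs)
print("||v||_p * ||w||_q =", rhs)
```

Equality in Hölder's inequality occurs only when $|v|^p$ and $|w|^q$ are proportional, which is not the case for this pair, so the inequality is strict here.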

The Lebesgue space of measurable essentially bounded functions over $\Omega$ is denoted by $L^\infty(\Omega)$ and equipped with the norm

$$\|v\|_{0,\infty,\Omega} = \operatorname*{ess\,sup}_{x \in \Omega} |v(x)|.$$

For a bounded set $\Omega$ and any $p, q \in [1, \infty]$, $p \le q$, the following algebraic imbedding holds:

$$L^q(\Omega) \subset L^p(\Omega).$$

Moreover, we also have the topological imbedding; namely, there exists a constant $C > 0$ such that

$$\|v\|_{0,p,\Omega} \le C \|v\|_{0,q,\Omega} \quad \forall v \in L^q(\Omega).$$

We will not specify the term "real" or "complex" in further definitions of function spaces, since these definitions are essentially the same in both cases.

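By $\overline{\Omega}$ we denote the closure of $\Omega$ and by $\partial\Omega$ the boundary of $\Omega$:

$$\overline{\Omega} = \Omega \cup \partial\Omega, \qquad \partial\Omega = \overline{\Omega} \cap (R^d \setminus \Omega).$$

Recall that a domain $\Omega$ is an open and connected set in $R^d$.

The symbol $d \in \{1, 2, \ldots\}$ is solely reserved for the dimension of $\Omega$.

If $\Omega$ is a bounded domain then the space of continuous functions over $\overline{\Omega}$ is denoted by $C(\overline{\Omega})$ and equipped with the norm

$$\|v\|_{C(\overline{\Omega})} = \max_{x \in \overline{\Omega}} |v(x)|.$$

Obviously we have

$$\|v\|_{C(\overline{\Omega})} = \|v\|_{0,\infty,\Omega} \quad \forall v \in C(\overline{\Omega}).$$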

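The symbol $C^k(\overline{\Omega})$, $k \in \{0, 1, \ldots\}$, stands for the space of functions whose classical derivatives up to order $k$ belong to $C(\overline{\Omega})$.

Moreover, we set

$$C^\infty(\overline{\Omega}) = \bigcap_{k=1}^{\infty} C^k(\overline{\Omega}),$$

i.e., $C^\infty(\overline{\Omega})$ is the space of infinitely differentiable functions over $\overline{\Omega}$.

Finally, by $C_0^\infty(\Omega)$ we denote the space of infinitely differentiable functions with compact support in $\Omega$, that is,

$$C_0^\infty(\Omega) = \{ v \in C^\infty(\overline{\Omega}) \mid \operatorname{supp} v \subset \Omega \},$$

where

$$\operatorname{supp} v = \overline{\{ x \in \Omega \mid v(x) \ne 0 \}}.$$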

Sobolev Spaces $H^k(\Omega)$ and $W^k_p(\Omega)$

The (weak) solutions of many problems are sought in Sobolev spaces. First we briefly recall some basic properties of the Sobolev spaces $H^k(\Omega)$, $k \in \{0, 1, 2, \ldots\}$, i.e., classes of real or complex functions defined on a domain $\Omega \subset R^d$ whose generalized derivatives up to the $k$-th order belong to $L^2(\Omega)$.

We shall consider Sobolev spaces defined only on bounded domains with a Lipschitz continuous boundary, which is a sufficiently wide class of domains for practical purposes.

Definition: A bounded set $\Omega \subset R^d$ is said to have a Lipschitz continuous boundary if for any $z \in \partial\Omega$ there exists a neighbourhood $U = U(z)$ such that the set $U \cap \Omega$ can be expressed, in some Cartesian coordinate system $(x_1, \ldots, x_d)$, by the inequality $x_d < F(x_1, \ldots, x_{d-1})$, where $F$ is a Lipschitz continuous function.

Denote by $\mathcal{L}$ the set of all bounded domains with a Lipschitz continuous boundary.

If $\Omega \in \mathcal{L}$ then the outward normal exists almost everywhere (a.e.) on $\partial\Omega$. This is the reason why we confine ourselves to domains with a Lipschitz continuous boundary, since later we shall deal with, e.g., the normal derivative, the normal component of a flux, etc.

Another reason is that there are several definitions of the Sobolev spaces, and in the case $\Omega \notin \mathcal{L}$ the Sobolev spaces need not be uniquely defined.

Also, some important theorems formulated later on need not be valid when $\Omega \notin \mathcal{L}$.

Let us point out that there are other (nonequivalent) definitions of the Lipschitz continuity of $\partial\Omega$.

From now on, we always assume that $\Omega \in \mathcal{L}$.

Definition: For any $v \in C^\infty(\overline{\Omega})$ and the multi-index $m = (m_1, \ldots, m_d)$ we define the $m$-th classical derivative as follows:

$$D^m v = \frac{\partial^{|m|} v}{\partial x_1^{m_1} \cdots \partial x_d^{m_d}},$$

where $m_1, \ldots, m_d$ are non-negative integers and $|m| = m_1 + \ldots + m_d$.

Definition: A function $v \in L^2(\Omega)$ is said to have the $m$-th generalized derivative in $L^2(\Omega)$ if there exists a function $z \in L^2(\Omega)$ such that

$$\int_\Omega z \, w \, dx = (-1)^{|m|} \int_\Omega v \, D^m w \, dx \quad \forall w \in C_0^\infty(\Omega).$$

Then $z$ is called the $m$-th generalized derivative of $v$ and we set $D^m v = z$. One may easily check that $D^m v$ is well-defined.

Note that the generalized derivative is sometimes sought in the larger space $L_{1,\mathrm{loc}}(\Omega)$.
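The defining identity of the generalized derivative can be checked numerically on a standard example. The sketch below assumes $\Omega = (-1, 1)$, $v(x) = |x|$ (not classically differentiable at $0$), the candidate generalized derivative $z(x) = \operatorname{sign}(x)$, and one particular test function $w \in C_0^\infty(\Omega)$: a shifted bump function with support $[-0.3, 0.7]$. The choice of $w$ and the midpoint quadrature are illustrative assumptions; a single test function of course only illustrates the identity and does not prove it.

```python
import math

CENTER, RADIUS = 0.2, 0.5   # support [-0.3, 0.7] lies strictly inside (-1, 1)
CUTOFF = 1e-12              # below this the bump is numerically zero anyway

def bump(x):
    """Smooth test function w with compact support in (-1, 1)."""
    s = RADIUS ** 2 - (x - CENTER) ** 2
    return math.exp(-1.0 / s) if s > CUTOFF else 0.0

def bump_prime(x):
    """Classical derivative w' of the bump function."""
    s = RADIUS ** 2 - (x - CENTER) ** 2
    if s <= CUTOFF:
        return 0.0
    return math.exp(-1.0 / s) * (-2.0 * (x - CENTER)) / s ** 2

def integrate(func, a=-1.0, b=1.0, cells=20000):
    """Composite midpoint rule; x = 0 is a cell edge because cells is even."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))

v = abs                                      # v(x) = |x|
z = lambda x: 1.0 if x > 0.0 else -1.0       # candidate generalized derivative

lhs = integrate(lambda x: z(x) * bump(x))            # integral of z w
rhs = -integrate(lambda x: v(x) * bump_prime(x))     # (-1)^{|m|} integral of v D^m w, |m| = 1

print("int z w dx    =", lhs)
print("-int v w' dx  =", rhs)
```

The bump is deliberately not centred at the origin: for a test function symmetric about $0$ both integrals would vanish and the check would be uninformative.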

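Now for $k = 0, 1, \ldots$, the Sobolev spaces $H^k(\Omega)$ are defined as

$$H^k(\Omega) = \{ v \in L^2(\Omega) \mid D^m v \in L^2(\Omega) \;\; \forall |m| \le k \}.$$

It can be verified that $H^k(\Omega)$, equipped with the scalar product

$$(v, w)_{k,\Omega} = \sum_{|m| \le k} \int_\Omega D^m v \, (D^m w)^{C} \, dx, \quad v, w \in H^k(\Omega),$$

is a Hilbert space. Let us further introduce the induced norm and seminorm

$$\|v\|_{k,\Omega} = \Big( \sum_{|m| \le k} \int_\Omega |D^m v|^2 \, dx \Big)^{1/2}, \quad v \in H^k(\Omega),$$

$$|v|_{k,\Omega} = \Big( \sum_{|m| = k} \int_\Omega |D^m v|^2 \, dx \Big)^{1/2}, \quad v \in H^k(\Omega).$$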

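The same symbols will also be used for vector functions, i.e.,

$$\|v\|_{k,\Omega} = \Big( \sum_{l=1}^{q} \|v_l\|_{k,\Omega}^2 \Big)^{1/2} \quad \text{for } v = (v_1, \ldots, v_q)^T \in (H^k(\Omega))^q.$$

Moreover, the subscript $\Omega$ will often be omitted, i.e.,

$$(\cdot,\cdot)_k = (\cdot,\cdot)_{k,\Omega}, \qquad \|\cdot\|_k = \|\cdot\|_{k,\Omega}, \qquad |\cdot|_k = |\cdot|_{k,\Omega}.$$

Clearly,

$$\|v\|_{k-1} \le \|v\|_k \quad \forall v \in H^k(\Omega), \; k = 1, 2, \ldots$$

and

$$L^2(\Omega) = H^0(\Omega) \supset H^1(\Omega) \supset H^2(\Omega) \supset \cdots.$$

Note that each classical derivative is also a generalized derivative, and thus we have

$$C^k(\overline{\Omega}) \subset H^k(\Omega), \quad k = 0, 1, \ldots.$$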

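Definition: A set $\Gamma \subset \partial\Omega$ is said to be a (relatively) open set in $\partial\Omega$ if for any $x \in \Gamma$ there exists an open ball $B \subset R^d$ containing $x$ such that $B \cap \partial\Omega \subset \Gamma$.

The Lebesgue space of square integrable functions over an open set $\Gamma \subset \partial\Omega$ is denoted by $L^2(\Gamma)$ and equipped with the standard norm

$$\|v\|_{0,\Gamma} = \Big( \int_\Gamma |v|^2 \, ds \Big)^{1/2}, \quad v \in L^2(\Gamma).$$

Further we recall some important properties of Sobolev spaces.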

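Trace Theorem: Let $\Omega \in \mathcal{L}$. Then there exists exactly one linear continuous operator $\gamma : H^1(\Omega) \to L^2(\partial\Omega)$ such that

$$\gamma v = v|_{\partial\Omega} \quad \forall v \in C^\infty(\overline{\Omega}).$$

The function $\gamma v$ for $v \in H^1(\Omega)$ is called the trace of $v$ and we denote it, for simplicity, by $v|_{\partial\Omega}$.

The Trace Theorem, in fact, says that there exists a constant $C > 0$ such that

$$\|v\|_{0,\partial\Omega} \le C \|v\|_{1,\Omega} \quad \forall v \in H^1(\Omega).$$

The trace theorem enables us to define the space

$$H_0^1(\Omega) = \{ v \in H^1(\Omega) \mid v = 0 \text{ on } \partial\Omega \}.$$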

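Note that the spaces $H_0^1(\Omega)$ and $H^1(\Omega)$ can also be defined as the completions of $C_0^\infty(\Omega)$ and $C^\infty(\overline{\Omega})$, respectively, under the norm $\|\cdot\|_{1,\Omega}$, i.e.,

$$H_0^1(\Omega) = \overline{C_0^\infty(\Omega)}, \qquad H^1(\Omega) = \overline{C^\infty(\overline{\Omega})}.$$

Denote by $H^{1/2}(\partial\Omega)$ the space of traces of all functions from $H^1(\Omega)$. Then we have the following density relation:

$$L^2(\partial\Omega) = \overline{H^{1/2}(\partial\Omega)},$$

where the closure is taken under the $\|\cdot\|_{0,\partial\Omega}$ norm.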

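Rellich Theorem: Let $\Omega \in \mathcal{L}$. Then the identity mapping from $H^1(\Omega)$ to $L^2(\Omega)$ is compact (i.e., any bounded sequence in $H^1(\Omega)$ contains a subsequence converging in $L^2(\Omega)$).

Green Theorem: Let $\Omega \in \mathcal{L}$. Then for each $i \in \{1, \ldots, d\}$

$$\int_\Omega w \, \partial_i v \, dx + \int_\Omega v \, \partial_i w \, dx = \int_{\partial\Omega} n_i \, v \, w \, ds \quad \forall v, w \in H^1(\Omega),$$

where $n_i$ are the components of the outward unit normal to $\partial\Omega$ and $\partial_i v = \partial v / \partial x_i$.

Sobolev Imbedding Theorem: Let $k$ be an integer such that $2k > d$. Then

$$H^k(\Omega) \subset C(\overline{\Omega})$$

and there exists a constant $C > 0$ such that

$$\|v\|_{C(\overline{\Omega})} \le C \|v\|_k \quad \forall v \in H^k(\Omega).$$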

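Generalized Poincaré Inequality: Let $\Omega \in \mathcal{L}$ and let $\omega$ be an open set either in $\partial\Omega$ or in $\Omega_0$, where $\Omega_0 \subset \Omega$, $\Omega_0 \in \mathcal{L}$. Then

$$\|v\|_1 \le C \Big( |v|_1^2 + \Big| \int_\omega v \, d\omega \Big|^2 \Big)^{1/2} \quad \forall v \in H^1(\Omega),$$

where $d\omega$ stands for $dx$ or $ds$.

Corollary: By the Cauchy-Schwarz inequality,

$$\Big| \int_\omega v \, d\omega \Big|^2 \le \operatorname{meas} \omega \int_\omega |v|^2 \, d\omega \quad \forall v \in H^1(\Omega).$$

Hence, we have

$$\|v\|_1 \le C \big( |v|_1^2 + \|v\|_{0,\omega}^2 \big)^{1/2} \quad \forall v \in H^1(\Omega).$$

This inequality is usually called the Friedrichs inequality when $\omega \subset \partial\Omega$. In particular,

$$\|v\|_1 \le C |v|_1 \quad \forall v \in H_0^1(\Omega).$$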
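The last inequality can be made quantitative in one dimension. As a sketch, assume $\Omega = (0, 1)$; it is a standard fact that then $\|v\|_0 \le \pi^{-1} |v|_1$ for all $v \in H_0^1(0,1)$, with equality for $v(x) = \sin(\pi x)$, so that $\|v\|_1 \le (1 + \pi^{-2})^{1/2} |v|_1$. The code below evaluates the ratio $\|v\|_0 / |v|_1$ for two illustrative functions vanishing at both endpoints; the function choices and the midpoint quadrature are assumptions of this example.

```python
import math

def integrate(func, cells=4000):
    """Composite midpoint rule on (0, 1)."""
    h = 1.0 / cells
    return h * sum(func((i + 0.5) * h) for i in range(cells))

def friedrichs_ratio(v, dv):
    """Return ||v||_0 / |v|_1 on (0, 1), given v and its derivative dv."""
    l2_norm = math.sqrt(integrate(lambda x: v(x) ** 2))
    h1_seminorm = math.sqrt(integrate(lambda x: dv(x) ** 2))
    return l2_norm / h1_seminorm

# Two functions in H^1_0(0, 1)
ratio_poly = friedrichs_ratio(lambda x: x * (1.0 - x), lambda x: 1.0 - 2.0 * x)
ratio_sine = friedrichs_ratio(lambda x: math.sin(math.pi * x),
                              lambda x: math.pi * math.cos(math.pi * x))

bound = 1.0 / math.pi                     # sharp constant on the unit interval
c_friedrichs = math.sqrt(1.0 + bound ** 2)  # resulting C in ||v||_1 <= C |v|_1

print("ratio for x(1-x)    =", ratio_poly)
print("ratio for sin(pi x) =", ratio_sine)
print("1/pi                =", bound)
```

The polynomial gives a ratio of $\sqrt{1/10} \approx 0.3162$, just below $1/\pi \approx 0.3183$, which illustrates how close a simple function can come to the extremal one.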

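Finally, in this section we briefly introduce the Sobolev spaces $W^k_p(\Omega)$ of functions whose generalized derivatives up to order $k \in \{0, 1, \ldots\}$ belong to $L^p(\Omega)$, $p \in [1, \infty]$. For $p < \infty$ they are equipped with the norm

$$\|v\|_{k,p,\Omega} = \Big( \sum_{|m| \le k} \int_\Omega |D^m v|^p \, dx \Big)^{1/p}, \quad v \in W^k_p(\Omega),$$

and the seminorm

$$|v|_{k,p,\Omega} = \Big( \sum_{|m| = k} \int_\Omega |D^m v|^p \, dx \Big)^{1/p}, \quad v \in W^k_p(\Omega).$$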

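For $p = \infty$,

$$\|v\|_{k,\infty,\Omega} = \max_{|m| \le k} \|D^m v\|_{0,\infty,\Omega}, \qquad |v|_{k,\infty,\Omega} = \max_{|m| = k} \|D^m v\|_{0,\infty,\Omega}.$$

We will again write

$$\|\cdot\|_{k,p} = \|\cdot\|_{k,p,\Omega}, \qquad |\cdot|_{k,p} = |\cdot|_{k,p,\Omega}.$$

For $p < \infty$, the Sobolev space $W^k_p(\Omega)$ is the completion of $C^\infty(\overline{\Omega})$ in the $\|\cdot\|_{k,p}$-norm. Sobolev spaces are Banach spaces, and for $p = 2$ they become Hilbert spaces, i.e., $H^k(\Omega) = W^k_2(\Omega)$.

The Sobolev imbedding theorem for $p \in [1, \infty]$ and $k \in \{1, 2, \ldots\}$ takes the form

$$W^k_p(\Omega) \subset C(\overline{\Omega}) \quad \text{if } pk > d.$$

Note that the condition $pk > d$ is not necessary but only sufficient. For instance, we have $W^2_1(\Omega) \subset C(\overline{\Omega})$ for $d = 2$.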

Classical and Variational Formulation of a 2nd Order Elliptic Problem

Using a model elliptic boundary value problem with mixed boundary conditions (which describes, for example, the electric or magnetic potential in a stationary linear problem), we shall first demonstrate the main idea of what we understand as the solution of a BVP, i.e., what our $u$ is.

Classical Formulation: Find $u \in C^2(\overline{\Omega})$ such that

$$-\operatorname{div}(A \nabla u) = f \quad \text{in } \Omega, \qquad (1)$$
$$u = u_0 \quad \text{on } \Gamma_D, \qquad (2)$$
$$n^T A \nabla u = g \quad \text{on } \Gamma_N. \qquad (3)$$

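Conditions: Assume that $\Omega \in \mathcal{L}$, $n$ is the outward unit normal, and $A = (a_{ij}) \in (C^1(\overline{\Omega}))^{d \times d}$ is a symmetric matrix which is uniformly positive definite, i.e., there exist constants $C_1, C_2 > 0$ such that

$$C_1 \|\xi\|^2 \le \xi^T A(x) \xi \le C_2 \|\xi\|^2 \quad \forall \xi \in R^d, \; \forall x \in \overline{\Omega}.$$

Further, $f \in C(\overline{\Omega})$, $u_0 \in C^2(\overline{\Omega})$, $g \in C(\overline{\Gamma}_N)$, and $\Gamma_D$, $\Gamma_N$ are relatively open sets with respect to the topology on $\partial\Omega$,

$$\overline{\Gamma}_D \cup \overline{\Gamma}_N = \partial\Omega \quad \text{and} \quad \operatorname{meas}_{d-1}(\Gamma_D \cap \Gamma_N) = 0$$

(i.e., $\Gamma_D$ and $\Gamma_N$ are disjoint). In addition, we assume that $\Gamma_D$ and $\Gamma_N$ have a finite number of components.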

A function $u \in C^2(\overline{\Omega})$ satisfying (1)-(3) is called the classical solution.

Note that the classical solution is sometimes sought in the space $C^1(\overline{\Omega}) \cap C^2(\Omega)$, i.e., $u \in C^1(\overline{\Omega})$ and $u|_\Omega \in C^2(\Omega)$.

The conditions (2) and (3) are called, respectively, the Dirichlet and Neumann boundary conditions.

The problem (1)-(3) is referred to as the Dirichlet problem if $\Gamma_N = \emptyset$ and as the Neumann problem if $\Gamma_D = \emptyset$.

The matrix $A$ describes physical properties of the medium $\Omega$. If $A$ is independent of $x \in \Omega$ we call the medium homogeneous (otherwise inhomogeneous).

If $a_{ii}(x) = a_{11}(x)$ for $i = 2, \ldots, d$, and $a_{ij}(x) = 0$ for $i \ne j$, the medium is said to be isotropic (otherwise anisotropic).

If $A$ is the identity matrix, (1) is called the Poisson equation.

If, moreover, $f = 0$, then (1) is called the Laplace equation, and the associated operator

$$\Delta u = \operatorname{div} \nabla u = \partial_{11} u + \ldots + \partial_{dd} u$$

is called the Laplace operator, where $\partial_{ii} u = \partial^2 u / \partial x_i^2$.

In practical problems the coefficients $a_{ij}$ are often not continuous (they are, e.g., piecewise constant). Then the classical divergence operator in (1) cannot be employed. That is why we introduce a weak formulation of the problem (1)-(3), which enables us to consider also nonsmooth functions $a_{ij}$, $f$, or $g$.

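Weak Formulation: Let us rewrite the equation (1) as follows:

$$-\sum_{i,j=1}^{d} \partial_i (a_{ij} \, \partial_j u) = f \qquad (\ast)$$

and introduce the space of test functions

$$V = \{ v \in H^1(\Omega) \mid v = 0 \text{ on } \Gamma_D \}. \qquad (V)$$

Multiplying $(\ast)$ by an arbitrary test function $v \in V$ and then integrating over $\Omega$, we obtain

$$-\int_\Omega \sum_{i,j} \partial_i (a_{ij} \, \partial_j u) \, v \, dx = \int_\Omega f v \, dx.$$

Applying now the Green Theorem, we find that

$$\int_\Omega \sum_{i,j} \partial_i v \, a_{ij} \, \partial_j u \, dx = \int_\Omega f v \, dx + \int_{\partial\Omega} \sum_{i,j} n_i \, a_{ij} \, \partial_j u \, v \, ds.$$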

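Since $v = 0$ on $\Gamma_D$ and $n^T A \nabla u = g$ on $\Gamma_N$, we obtain

$$\int_\Omega (A \nabla u) \cdot \nabla v \, dx = \int_\Omega f v \, dx + \int_{\Gamma_N} g v \, ds.$$

Defining now the bilinear and linear forms

$$a(v, w) = \int_\Omega (A \nabla w) \cdot \nabla v \, dx, \quad v, w \in H^1(\Omega), \qquad (a)$$

$$F(v) = \int_\Omega f v \, dx + \int_{\Gamma_N} g v \, ds, \quad v \in V, \qquad (F)$$

we see that any classical solution $u$ of (1)-(3) (if it exists) satisfies the equation

$$a(v, u) = F(v) \quad \forall v \in V.$$

From now on we shall assume that $a_{ij} \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, $g \in L^2(\Gamma_N)$ and $u_0 \in H^1(\Omega)$. Moreover, let $a_{ij}(x) = a_{ji}(x)$ and let the ellipticity condition on $A$ hold only for a.e. $x \in \Omega$. We see that the integrals in (a) and (F) are then still well-defined.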

First, we recall that the integral form of a physical law is more natural than its differential form.

Starting from formula (V), we implicitly use the concept of completion. Instead of looking for $u \in C^2(\overline{\Omega})$ satisfying the differential equation (1) with the boundary conditions (2)-(3), we shall solve the integral form of the equation,

$$a(v, u) = F(v) \quad \forall v \in V.$$

Its solution will be sought in the Hilbert space $H^1(\Omega)$, which is, of course, complete and bigger than $C^2(\overline{\Omega})$. This allows us to employ some useful results from functional analysis (e.g., the Lax-Milgram Lemma) and, e.g., to apply the finite element method to find approximations to $u$.

Definition: Let $\Omega \in \mathcal{L}$ and $\Gamma_D \ne \emptyset$. A function $u \in H^1(\Omega)$ is called a weak (or generalized) solution of the problem (1)-(3) if $u - u_0 \in V$ and

$$a(v, u) = F(v) \quad \forall v \in V.$$

For convenience, the weak solution is denoted by $u$, like the classical one.

Theorem: Let $\Omega \in \mathcal{L}$ and $\Gamma_D \ne \emptyset$. Then there exists exactly one weak solution $u \in H^1(\Omega)$ of the problem (1)-(3).

Note that the classical solution $u \in C^2(\overline{\Omega})$ of (1)-(3) need not exist, in general. However, if it exists, then it is also the weak solution.
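The weak formulation translates almost directly into a small linear system once a finite-dimensional space of test functions is chosen. The following minimal sketch assumes a one-dimensional instance of (1)-(3) that is not taken from the slides: $\Omega = (0,1)$, $A = 1$, $f = 1$, $\Gamma_D = \{0\}$ with $u_0 = 0$, and $\Gamma_N = \{1\}$ with $u'(1) = g$. It uses piecewise linear "hat" functions on a uniform mesh, so that $a(v, u) = \int_0^1 u' v' \, dx$ and $F(v) = \int_0^1 f v \, dx + g \, v(1)$. The helper names are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_mixed_problem(n, f_const, g):
    """P1 finite elements for -u'' = f_const on (0,1), u(0) = 0, u'(1) = g.

    Unknowns are the nodal values at x_1, ..., x_n; the Dirichlet node x_0 = 0
    is eliminated because test functions vanish on Gamma_D.
    """
    h = 1.0 / n
    # Stiffness matrix a(phi_i, phi_j): 2/h on the diagonal for interior nodes,
    # 1/h for the last (Neumann) node, -1/h off the diagonal.
    diag = [2.0 / h] * (n - 1) + [1.0 / h]
    off = [-1.0 / h] * n
    # Load vector F(phi_i): integral of f * phi_i, plus g * phi_i(1) at the last node.
    rhs = [f_const * h] * (n - 1) + [f_const * h / 2.0 + g]
    return [0.0] + thomas(off, diag, off, rhs)

n, g = 8, 0.5
u_h = solve_mixed_problem(n, 1.0, g)

# Exact solution of this particular model problem
exact = lambda x: -0.5 * x * x + (1.0 + g) * x
nodal_error = max(abs(u_h[i] - exact(i / n)) for i in range(n + 1))
print("max nodal error =", nodal_error)
```

Note how the Neumann datum $g$ never appears as a constraint on the solution space; it enters only through the linear form $F$, whereas the Dirichlet condition is built into the space $V$. For this particular one-dimensional problem the piecewise linear solution happens to coincide with the exact solution at the nodes, which is a special feature of the example and not a general property.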

Now that we have defined what we mean by the solution of the model and discussed its existence and uniqueness, we can define how to measure the accuracy of computed approximations.

It is clear that the most preferable situation is when we construct approximations in the same class of functions to which the exact solution belongs or, if this is not so, when we can easily map the computed approximations to this class. For example, for the problem considered above this class of functions is $H^1(\Omega)$, i.e., we shall measure the error in the corresponding norm $\|\cdot\|_{1,\Omega}$ (or in some norm equivalent to it).

In most cases error estimation is not easy at all, since the exact solution is normally unknown in analytical form. However, several different approaches aimed at controlling the accuracy of approximations have been developed. On the basis of our model problem we shall survey the most important ones.

A PRIORI ERROR ESTIMATES

Error Estimate via Energy Functional

In this section we shall discuss the so-called a priori error estimates, which are essentially based on the assumption that the approximations are computed exactly by the finite element method, i.e., the error $\varepsilon_3$ is neglected in this approach.

The weak solution $u$ can be equivalently found as follows: Find a function $\tilde{u} \in V$ which minimizes the quadratic functional

$$J(v) = \tfrac{1}{2} a(v, v) - F(v) + a(v, u_0)$$

over the space $V$, and then set $u = u_0 + \tilde{u}$ $(\in u_0 + V)$.

If $w = u_0 + \tilde{w}$ is another function from $u_0 + V$, then

$$a(u - w, u - w) = 2 \big( J(\tilde{w}) - J(\tilde{u}) \big). \qquad \text{(EE)}$$
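For completeness, here is a short derivation of the identity (EE) in the real symmetric case. The notation $\tilde{u}, \tilde{w} \in V$ (with $u = u_0 + \tilde{u}$, $w = u_0 + \tilde{w}$) and the auxiliary symbol $e = \tilde{w} - \tilde{u}$ are introduced here for the derivation. It uses only the symmetry and bilinearity of $a(\cdot,\cdot)$ and the weak formulation in the form $a(v, \tilde{u}) = F(v) - a(v, u_0)$ for all $v \in V$.

```latex
\begin{align*}
J(\tilde{w}) &= J(\tilde{u} + e)
   = \tfrac12 a(\tilde{u} + e, \tilde{u} + e) - F(\tilde{u} + e) + a(\tilde{u} + e, u_0) \\
  &= \underbrace{\tfrac12 a(\tilde{u}, \tilde{u}) - F(\tilde{u}) + a(\tilde{u}, u_0)}_{=\,J(\tilde{u})}
     + \underbrace{a(e, \tilde{u}) - F(e) + a(e, u_0)}_{=\,0 \text{ since } e \in V}
     + \tfrac12 a(e, e) \\
  &= J(\tilde{u}) + \tfrac12 a(e, e).
\end{align*}
% Since u - w = \tilde{u} - \tilde{w} = -e, we get
\[
  a(u - w, u - w) = a(e, e) = 2\bigl(J(\tilde{w}) - J(\tilde{u})\bigr).
\]
```

In particular, $J(\tilde{w}) \ge J(\tilde{u})$ for every $\tilde{w} \in V$, which confirms that $\tilde{u}$ is indeed the minimizer.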

Projection Estimate

It is clear that we can rarely find the exact (weak) solution $u$ of the problem (1)-(3) in analytical form. Usually we can only solve it numerically, e.g., by the finite element method.

For this purpose we introduce some finite-dimensional subspace $V_h$ of $V$ and solve the problem: Find a function $u_h$ such that $u_h - u_0 \in V_h$ and

$$a(v_h, u_h) = F(v_h) \quad \forall v_h \in V_h.$$

The approximate solution $u_h$ can be equivalently found as follows: Find a function $\tilde{u}_h \in V_h$ which minimizes the quadratic functional

$$J(v) = \tfrac{1}{2} a(v, v) - F(v) + a(v, u_0)$$

over the space $V_h$, and then set $u_h = u_0 + \tilde{u}_h$.

Thus, if $v_h = u_0 + \tilde{v}_h$ is another function from $u_0 + V_h$, then

$$J(\tilde{u}_h) \le J(\tilde{v}_h).$$

Thus, with $\tilde{u} = u - u_0 \in V$ denoting the minimizer of $J$ over $V$, we have

$$a(u - u_h, u - u_h) = 2 \big( J(\tilde{u}_h) - J(\tilde{u}) \big) \le 2 \big( J(\tilde{v}_h) - J(\tilde{u}) \big) = a(u - v_h, u - v_h).$$

From the above we get the basic relation for the difference between the finite element approximation and the exact solution:

$$a(u - u_h, u - u_h) = \inf_{v_h \in u_0 + V_h} a(u - v_h, u - v_h).$$

With $\tilde{u} = u - u_0$ and $\tilde{u}_h = u_h - u_0$, we can also rewrite it as follows:

$$a(\tilde{u} - \tilde{u}_h, \tilde{u} - \tilde{u}_h) = \inf_{\tilde{v}_h \in V_h} a(\tilde{u} - \tilde{v}_h, \tilde{u} - \tilde{v}_h).$$

The projection estimate presented above serves as a basis for deriving a priori error estimates, since it can be written in the following form, called the projection theorem or Céa Lemma:

$$\|\tilde{u} - \tilde{u}_h\|_1 \le C \inf_{\tilde{v}_h \in V_h} \|\tilde{u} - \tilde{v}_h\|_1 = C \|\tilde{u} - \pi_h \tilde{u}\|_1,$$

where $\pi_h \tilde{u}$ is the projection of $\tilde{u}$ onto $V_h$.

The estimate above shows that the error is bounded from above by the distance between the exact solution and its orthogonal projection onto the finite-dimensional space $V_h$.
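The step from the energy-norm relation to the Céa Lemma is a two-line argument. As a sketch, let $C_1$ and $C_2$ denote the continuity and $V$-ellipticity constants of the form $a(\cdot,\cdot)$ with respect to $\|\cdot\|_1$ on $V$ (the notation follows the Lax-Milgram lemma; these are not the bounds on the matrix $A$), and write $\tilde{u} = u - u_0$, $\tilde{u}_h = u_h - u_0$. Then, for any $\tilde{v}_h \in V_h$:

```latex
\[
  C_2 \,\|\tilde{u} - \tilde{u}_h\|_1^2
  \;\le\; a(\tilde{u} - \tilde{u}_h,\, \tilde{u} - \tilde{u}_h)
  \;\le\; a(\tilde{u} - \tilde{v}_h,\, \tilde{u} - \tilde{v}_h)
  \;\le\; C_1 \,\|\tilde{u} - \tilde{v}_h\|_1^2 ,
\]
% and taking the infimum over \tilde{v}_h in V_h gives
\[
  \|\tilde{u} - \tilde{u}_h\|_1
  \;\le\; \sqrt{\frac{C_1}{C_2}}\;
  \inf_{\tilde{v}_h \in V_h} \|\tilde{u} - \tilde{v}_h\|_1 .
\]
```

So in the symmetric case one may take $C = \sqrt{C_1 / C_2}$: the quasi-optimality constant depends only on the ratio of the continuity and ellipticity constants, not on the mesh.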

If the approximation spaces $V_h$ are constructed in such a way that for any $v \in V$

$$\inf_{v_h \in V_h} \|v - v_h\|_1 \to 0 \quad \text{as } h \to 0,$$

then it follows that the finite element approximations converge to the exact solution.

Estimates of the convergence rate require a detailed analysis of the interpolation properties of the spaces $V_h$. Piecewise polynomial functions form the most convenient class for constructing $V_h$.

Since $u$ is an element of some Sobolev space, the problem reduces to the question: how accurately can the elements of a Sobolev space be interpolated by piecewise polynomial functions?

This question is studied by interpolation theory (see, e.g., the well-known monograph by Ciarlet), which together with projection-type estimates produces a priori error estimates expressed in terms of the mesh size $h$.

For our model problem they have the following form:

$$\|u - u_h\|_1 \le C h^k \|u\|_{k+1}, \quad k \in \{1, 2, \ldots\},$$

provided that

the exact solution is sufficiently smooth,
the approximation $u_h$ is the exact finite element solution,
mesh elements do not degenerate during the refinement process.
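The predicted rate can be observed experimentally on a problem with a known solution. The sketch below assumes a one-dimensional model problem that is not taken from the slides: $-u'' = \pi^2 \sin(\pi x)$ on $(0,1)$ with $u(0) = u(1) = 0$, exact solution $u = \sin(\pi x)$, and piecewise linear elements on uniform meshes, for which the estimate is expected to hold with $k = 1$. The error is measured in the $H^1$ seminorm (equivalent to the full norm on $H_0^1$); load vector and error integrals are approximated by two-point Gauss quadrature. All names are illustrative.

```python
import math

# Two-point Gauss nodes on the reference interval [0, 1] (weights 1/2 each)
GAUSS = (0.5 - 0.5 / math.sqrt(3.0), 0.5 + 0.5 / math.sqrt(3.0))

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def h1_seminorm_error(n):
    """Return |u - u_h|_1 for P1 elements on a uniform mesh with n cells."""
    h = 1.0 / n
    f = lambda x: math.pi ** 2 * math.sin(math.pi * x)
    du = lambda x: math.pi * math.cos(math.pi * x)
    rhs = [0.0] * (n - 1)                 # one entry per interior node
    for e in range(n):                    # element [e*h, (e+1)*h]
        for t in GAUSS:
            x = (e + t) * h
            if e >= 1:                    # left node is interior; hat value 1 - t
                rhs[e - 1] += 0.5 * h * f(x) * (1.0 - t)
            if e + 1 <= n - 1:            # right node is interior; hat value t
                rhs[e] += 0.5 * h * f(x) * t
    m = n - 1
    u = [0.0] + thomas([-1.0 / h] * m, [2.0 / h] * m, [-1.0 / h] * m, rhs) + [0.0]
    err_sq = 0.0
    for e in range(n):
        slope = (u[e + 1] - u[e]) / h     # u_h' is constant on each element
        for t in GAUSS:
            err_sq += 0.5 * h * (du((e + t) * h) - slope) ** 2
    return math.sqrt(err_sq)

mesh_sizes = (8, 16, 32)
errors = [h1_seminorm_error(n) for n in mesh_sizes]
# Observed convergence rate between consecutive meshes (h is halved each time)
rates = [math.log(errors[i] / errors[i + 1], 2) for i in range(len(errors) - 1)]

for n, err in zip(mesh_sizes, errors):
    print("n =", n, " |u - u_h|_1 =", err)
print("observed rates:", rates)
```

Such an experiment confirms the order of convergence for a smooth solution, but it says nothing about the size of the constant $C$ or about the error on one particular mesh for a problem whose solution is unknown.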

Drawbacks of A Priori Error Estimation

A priori error estimates cannot guarantee that the error decreases monotonically as $h \to 0$.

The constants $C$ in those estimates are often not known at all or are highly overestimated.

A priori error estimates have only theoretical meaning: they show that the approximation method is correct "in principle". In practice, however, we are strongly interested in the error of a concrete approximation on a concrete mesh.

For these reasons, a different approach to error control has been developed since the 1970s. It is called a posteriori error control for partial differential equations.