Discontinuous Galerkin Time Discretization Methods for Parabolic Problems with Linear Constraints

January 2018, Preprint 474

Igor Voulis* and Arnold Reusken†

Institut für Geometrie und Praktische Mathematik, Templergraben 55, 52062 Aachen, Germany

* Institut für Geometrie und Praktische Mathematik, RWTH Aachen University, Templergraben 55, D-52062 Aachen, Germany (voulis@igpm.rwth-aachen.de)
† Institut für Geometrie und Praktische Mathematik, RWTH Aachen University, Templergraben 55, D-52062 Aachen, Germany (reusken@igpm.rwth-aachen.de)

Abstract. We consider time discretization methods for abstract parabolic problems with inhomogeneous linear constraints. Prototype examples that fit into the general framework are the heat equation with inhomogeneous (time-dependent) Dirichlet boundary conditions and the time-dependent Stokes equation with an inhomogeneous divergence constraint. Two common ways of treating such linear constraints, namely explicitly or implicitly (via Lagrange multipliers), are studied. These different treatments lead to different variational formulations of the parabolic problem. For these formulations we introduce a modification of the standard discontinuous Galerkin (DG) time discretization method in which an appropriate projection is used in the discretization of the constraint. For these discretizations (optimal) error bounds, including superconvergence results, are derived. Discretization error bounds for the Lagrange multiplier are presented. Results of experiments confirm the theoretically predicted optimal convergence rates and show that without the modification the (standard) DG method has sub-optimal convergence behavior.

Key words. abstract parabolic problem, discontinuous Galerkin methods, discretization of linear constraints, optimal discretization error bounds

AMS subject classifications. 65M60, 65J10

1. Introduction. Nowadays the discontinuous Galerkin (DG) finite element method is a popular discretization technique for many classes of ordinary and partial differential equations, cf. the overviews in [1, 0, 7]. For ordinary differential equations DG finite element methods have been analyzed in e.g. [6, 13, ] and the references therein. For parabolic partial differential equations the DG finite element method for time discretization has been introduced in [1] and was further studied in [9, 10, 11, 3, 3, 8, 1].
In this paper we study DG finite element time discretization methods for parabolic problems with constraints. As far as we know, in the DG error analyses for parabolic problems available in the literature only the case with homogeneous constraints has been considered. In practice one often has to deal with inhomogeneous constraints. Typical examples are a heat equation with nonzero time-dependent Dirichlet boundary conditions, an instationary Stokes flow with a time-dependent inflow boundary condition and a Stokes flow with an inhomogeneous time-dependent divergence constraint. Runge-Kutta methods for parabolic equations with a non-homogeneous constraint have been analyzed in the recent paper [3]. A DG time discretization method for the wave equation with an inhomogeneous boundary condition is analyzed in [33]. In this paper we consider a DG time discretization of a parabolic problem with non-homogeneous linear constraints and present an error analysis resulting in optimal error bounds. It turns out that for obtaining such optimal bounds we have to modify the standard formulation of a DG finite element method as given in [3]. Results of numerical experiments will show that without an appropriate modification the standard DG finite element method applied to a parabolic problem with an inhomogeneous constraint does not yield optimal convergence rates.

In this paper the parabolic problem and the error analysis of the DG time discretization are presented in an abstract setting (as in e.g. [8]). The linear constraints that are part of the problem can be treated in different ways: explicitly, by eliminating variables; implicitly, by means of a Lagrange multiplier; or by a combination of these. The analysis in the paper covers all three cases. Prototype examples that fit into the abstract framework are the standard scalar heat equation with inhomogeneous (time-dependent) Dirichlet boundary conditions (usually treated explicitly) and the instationary Stokes equations with inhomogeneous (time-dependent) Dirichlet boundary conditions (usually treated explicitly) and/or an inhomogeneous divergence constraint (usually treated implicitly by the pressure Lagrange multiplier). These examples will be discussed in detail in Subsection 2.3.

The main results of the paper are the following. Firstly, we present a modification of the standard formulation of a DG time discretization method that yields optimal order discretization errors also for the case of inhomogeneous constraints. Secondly, for this modified method we prove optimal discretization error bounds. Both optimal global energy norm bounds and optimal superconvergence results are derived. Finally, for the case that the constraints are treated implicitly (e.g. the divergence constraint in the Stokes problem) we derive discretization error bounds for the corresponding Lagrange multiplier. In the example of the Stokes problem this yields error bounds for the DG time discretization of the pressure variable. We note that even for the case of homogeneous constraints we are not aware of error analyses of DG time discretization methods for Stokes equations that yield (optimal order) error bounds for the pressure variable. A more detailed discussion of how the results of this paper are related to other literature is given in Remark 2.1.

The paper is organized as follows. In Section 2 we introduce the class of abstract parabolic problems that we consider.
Different formulations, related to the different treatments of the constraints, are presented. Furthermore, specific concrete examples that fit in the abstract setting are discussed. In Section 3 we introduce a DG scheme for the time discretization of the abstract parabolic problem. In Section 4 we give an error analysis of the DG scheme. We derive optimal order error bounds in the energy norm and prove a nodal superconvergence result. In Section 5, for the case of an implicit treatment of the constraint, we analyze the discretization error in the Lagrange multiplier. We derive, under very mild assumptions, a sub-optimal result for the convergence of the Lagrange multiplier, and using an additional regularity condition we are able to prove an optimal discretization error bound. In Section 6 we introduce a fully discrete numerical scheme. In Section 7 this scheme is used to perform numerical experiments, and results of a few experiments are presented that illustrate the convergence behavior of the DG method. Finally, we give an outlook in Section 8.

2. Parabolic problem with linear constraints. In this section we introduce the class of parabolic problems with linear constraints that we treat in this paper. For this we first introduce some notation and recall relevant well-known results on well-posedness of (abstract) parabolic problems, e.g. [3, 8]. Let $U$, $H$ be real separable Hilbert spaces with a dense continuous embedding $U \hookrightarrow H$. The norms are denoted by $\|\cdot\|_U$, $\|\cdot\|_H$, respectively. These spaces induce a Gelfand triple $U \hookrightarrow H \cong H' \hookrightarrow U'$. Let $a : U \times U \to \mathbb{R}$ be a symmetric continuous coercive bilinear form on $U$:
\[ a(u,v) \le \Gamma \|u\|_U \|v\|_U \quad \text{for all } u, v \in U, \tag{2.1} \]
\[ a(v,v) \ge \gamma \|v\|_U^2 \quad \text{for all } v \in U, \tag{2.2} \]

with $\gamma > 0$. The corresponding operator is denoted by $A : U \to U'$, $Au(v) = a(u,v)$. Let $I = (0,T)$ be a given time interval and for Hilbert spaces $Z_1$, $Z_2$ we define
\[ W^1(Z_1;Z_2) := \{\, u \in L^2(I;Z_1) \mid u' \in L^2(I;Z_2) \,\}, \qquad H^m(I;Z_1) := H^m(I) \otimes Z_1, \]
cf. [34]. A standard weak formulation of an abstract parabolic problem is as follows: find $u \in W^1(U;U')$ such that $u(0) = u_0$ and
\[ u' + Au = f \quad \text{in } L^2(I;U'). \tag{2.3} \]
We assume $u_0 \in H$, $f \in L^2(I;U')$. This problem is well-posed [34, 8]. Standard examples of problems that fit in this abstract framework are the heat equation with homogeneous BC ($U = H^1_0(\Omega)$) and the Stokes equation with homogeneous BC ($U = \{\, u \in H^1_0(\Omega)^d \mid \operatorname{div} u = 0 \,\}$), cf. Examples 2.1 and 2.2 below. In the literature one can find analyses of DG time discretizations applied to these parabolic problems [3, 8]. In this paper we present an extension of the time discretization error analysis that applies to such problems with inhomogeneous boundary conditions or an inhomogeneous divergence constraint $\operatorname{div} u = g$. For this we introduce an abstract (variational) formulation of a problem as in (2.3) with linear constraints.

For a Hilbert space $Q$ let $b(\cdot,\cdot) : U \times Q \to \mathbb{R}$ be a continuous bilinear form and $B : U \to Q'$ the corresponding operator representation: $Bu(q) = b(u,q)$ for all $u \in U$, $q \in Q$. This bilinear form and the linear operator $B$ are called constraint operators. We define
\[ V := \ker(B) = \{\, u \in U \mid Bu = 0 \,\}, \]
which is a Hilbert space. We assume that
\[ B : U \to Q' \ \text{is surjective}. \tag{2.4} \]
This is equivalent ([1], Lemma A.40) to the inf-sup property:
\[ \exists\, \alpha > 0 : \quad \inf_{q \in Q} \sup_{u \in U} \frac{b(u,q)}{\|u\|_U \|q\|_Q} \ge \alpha. \tag{2.5} \]
In the subsections below we give different formulations for the abstract parabolic problem with constraints which are relevant in particular applications, cf. the examples in Subsection 2.3.

2.1. Constrained formulation. For given $f \in L^2(I;V')$, $g \in L^2(I;Q')$ we consider the following abstract parabolic problem: find $u \in W^1(U;V')$ such that $u(0) = u_0$ and
\[ u' + Au = f \quad \text{in } L^2(I;V'), \qquad Bu = g \quad \text{in } L^2(I;Q'). \tag{2.6} \]
We relax the ellipticity condition (2.2)
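and only require ellipticity of $a(\cdot,\cdot)$ to hold on the subspace $V \subset U$. We thus have
\[ a(u,v) \le \Gamma \|u\|_U \|v\|_U \quad \text{for all } u, v \in U, \tag{2.7} \]
\[ a(v,v) \ge \gamma \|v\|_U^2 \quad \text{for all } v \in V, \tag{2.8} \]
with $\gamma, \Gamma > 0$. Concerning existence of a unique solution of (2.6) we note the following. If $g = 0$ then the constraint in (2.6) implies $u \in V$, and thus we are in the standard setting described above, with $U$ replaced by $V$ (and the Gelfand triple $V \hookrightarrow \overline{V}^{H} \hookrightarrow V'$, where $\overline{V}^{H}$ denotes the closure of $V$ in $H$).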

Hence we have well-posedness. For the inhomogeneous case we apply a standard translation argument. Note that from the inf-sup property it follows that there exists a unique $v \in L^2(I;V^\perp)$ such that $Bv = g$, where $V^\perp$ is the orthogonal complement of $V$ in $U$. Assume that $g$ has sufficient regularity, e.g. $g \in H^1(I;Q')$, such that there exists a $v \in W^1(U;V')$ that satisfies $Bv = g$. Application of the standard analysis yields that there exists (a unique) $w \in W^1(V;V')$ with $w(0) = u_0 - v(0)$ and $w' + Aw = f - Av - v'$ in $L^2(I;V')$. Hence $u := w + v \in W^1(U;V')$ satisfies $u(0) = u_0$ and the variational equation (2.6). Uniqueness follows from the fact that (2.6) with $g = 0$, $f = 0$ and $u_0 = 0$ has only the trivial solution.

2.2. Mixed formulation. In practice, in particular in the context of Galerkin finite element discretization methods, it may not be convenient to use the kernel space $V$ in the first equation in (2.6). One then introduces a suitable Lagrange multiplier to enlarge the constrained space $V$. We introduce such a mixed formulation. The structure of this mixed formulation is inspired by the examples considered in Subsection 2.3, in which a corresponding Lagrange multiplier is used only for a part of the constraint equation $Bu = g$. For this we decompose the constraint bilinear form as
\[ b(u,q) = b_1(u,q_1) + b_2(u,q_2), \qquad q = (q_1,q_2) \in Q_1 \times Q_2 = Q, \]
with $Q_i$, $i = 1, 2$, Hilbert spaces, and $b_i : U \times Q_i \to \mathbb{R}$ continuous bilinear forms. This splitting is such that for the $b_1(\cdot,\cdot)$ part a Lagrange multiplier will be introduced, whereas for the $b_2(\cdot,\cdot)$ part no Lagrange multiplier is used. We assume that $Q_1 \neq \{0\}$. The bilinear forms $b_i(\cdot,\cdot)$ induce corresponding operators $B_i : U \to Q_i'$. From (2.5) it follows that
\[ \inf_{q_i \in Q_i} \sup_{u \in U} \frac{b_i(u,q_i)}{\|u\|_U \|q_i\|_{Q_i}} \ge \alpha, \qquad i = 1, 2, \tag{2.9} \]
holds, provided $b_i(\cdot,\cdot)$ is not identically zero. Define $V_i := \ker(B_i) = \{\, u \in U \mid b_i(u,q_i) = 0 \ \ \forall\, q_i \in Q_i \,\}$, $i = 1, 2$. Hence, $V = V_1 \cap V_2$. We also have $V = \{\, u \in V_2 \mid b_1(u,q_1) = 0 \ \ \forall\, q_1 \in Q_1 \,\}$.
(2.10) We allow $Q_2 = \{0\}$, $Q_1 = Q$, in which case we have $V_2 = U$, $V = V_1$. Note that $B : V^\perp \to Q_1' \times Q_2'$ is a bijection and the preimage of $Q_1' \times \{0\}$ is $V^\perp \cap V_2$. It follows that $B_1 : V^\perp \cap V_2 \to Q_1'$ is a bijection, from which we conclude that $B_1 : V_2 \to Q_1'$ is surjective. This is equivalent ([1], Lemma A.40) to the inf-sup property:
\[ \exists\, \beta > 0 : \quad \inf_{q_1 \in Q_1} \sup_{u \in V_2} \frac{b_1(u,q_1)}{\|u\|_U \|q_1\|_{Q_1}} \ge \beta. \tag{2.11} \]
Note that if $Q_2 = \{0\}$ then (2.11) is the same as (2.9) with $i = 1$. From (2.11) and (2.10) it follows that for the adjoint $B_1'$ of $B_1 : V_2 \to Q_1'$ we have that
\[ B_1' : Q_1 \to V^0 := \{\, g \in V_2' \mid g(v) = 0 \ \text{for all } v \in V \,\} \tag{2.12} \]

is an isomorphism and
\[ \|B_1' p\|_{V_2'} \ge \beta \|p\|_{Q_1} \quad \text{for all } p \in Q_1, \tag{2.13} \]
cf. Lemma 4.1 in [15]. We now introduce a mixed formulation of (2.6). For $f \in L^2(I;V_2')$ determine $u \in W^1(U;V_2')$, $p \in L^2(I;Q_1)$ such that
\[ u' + Au + B_1' p = f \quad \text{in } L^2(I;V_2'), \qquad Bu = g \quad \text{in } L^2(I;Q'). \tag{2.14} \]
Here $B_1' : Q_1 \to V_2'$ is the adjoint of $B_1 : V_2 \to Q_1'$. Note that for $v \in V_2$ we have $b_2(v,q_2) = 0$ for all $q_2 \in Q_2$; hence, $(B_1' p)(v) = b_1(v,p) = b(v,(p,0))$ holds for $v \in V_2$, $p \in Q_1$.

We now discuss the relation between the constrained formulation (2.6) and the mixed formulation (2.14). Let $u \in W^1(U;V')$ be the unique solution of (2.6). We assume that $u$ has additional regularity such that $u \in W^1(U;V_2')$. Then $l(t) := f(t) - u'(t) - Au(t) \in V_2'$ (a.e. in $t$) and $l(t)(v) = 0$ for all $v \in V$, i.e., $l(t) \in V^0$ holds. Hence, there exists a unique $p(t) \in Q_1$ such that $B_1' p(t) = l(t)$. Using (2.13) it follows that $p \in L^2(I;Q_1)$. Thus, if the unique solution $u$ of the constrained formulation (2.6) has regularity $u \in W^1(U;V_2')$, there is a unique $p \in L^2(I;Q_1)$ such that $(u,p)$ is the unique solution of the mixed formulation (2.14).

2.3. Examples. We describe three examples that fit into the general abstract framework presented above. In these examples we consider different types of constraints, which are allowed to be inhomogeneous.

Example 2.1 (Heat equation). The heat equation in $\Omega \subset \mathbb{R}^d$, with (time-dependent) inhomogeneous Dirichlet boundary conditions $h \in L^2(I;H^{1/2}(\partial\Omega))$, in constrained formulation is as follows:
\[ u' - \Delta u = f \quad \text{in } L^2(I;H^{-1}(\Omega)), \qquad u|_{\partial\Omega} = h. \]
For this problem we have $U = H^1(\Omega)$, $H = L^2(\Omega)$, $Q = H^{1/2}(\partial\Omega)$, $a(u,v) := \int_\Omega \nabla u \cdot \nabla v$, $b(u,q) = (\operatorname{tr}(u), q)_{H^{1/2}(\partial\Omega)}$, with $\operatorname{tr} : H^1(\Omega) \to H^{1/2}(\partial\Omega)$ the trace operator, which is surjective. This yields $V = \ker(B) = H^1_0(\Omega)$. The assumptions (2.4), (2.7), (2.8) are satisfied. We do not consider a mixed formulation, because in the constrained formulation (2.6) we use the space $V = H^1_0(\Omega)$, which is the natural one for the bilinear form $a(\cdot,\cdot)$.

Example 2.2 (Stokes equations).
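We consider the Stokes equation in $\Omega \subset \mathbb{R}^d$ with homogeneous boundary conditions and a non-zero (time-dependent) divergence $g \in L^2(I;L^2_0(\Omega))$. In this setting we have $U = H^1_0(\Omega)^d$, $H = L^2(\Omega)^d$, $Q = L^2_0(\Omega) := L^2(\Omega)/\mathbb{R}$, $a(u,v) := \int_\Omega \nabla u : \nabla v$, $b(u,q) = \int_\Omega \operatorname{div} u \, q$, $V = \{\, u \in H^1_0(\Omega)^d \mid \operatorname{div} u = 0 \,\}$. Recall that $u \in H^1_0(\Omega)^d \mapsto \operatorname{div} u \in L^2_0(\Omega)$ is surjective. The Stokes equation in constrained formulation is as follows:
\[ u' - \Delta u = f \quad \text{in } L^2(I;V'), \qquad \operatorname{div} u = g \quad \text{in } L^2(I;L^2_0(\Omega)). \]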

The assumptions (2.4), (2.7), (2.8) are satisfied. For the mixed formulation we take $b_2 \equiv 0$, i.e., $b_1(u,q) = \int_\Omega \operatorname{div} u \, q$, $q \in Q_1 = Q$, and $V_2 = U = H^1_0(\Omega)^d$. Hence, the formulation (2.14) is the standard mixed formulation of a time-dependent Stokes problem with homogeneous boundary conditions, cf. [1] Sect. 4.1.

Example 2.3. We consider the Stokes equation in $\Omega \subset \mathbb{R}^d$ with non-zero (time-dependent) boundary data $h \in L^2(I;H^{1/2}(\partial\Omega)^d)$ and a non-zero (time-dependent) divergence $g \in L^2(I;L^2_0(\Omega))$. We then have $U = H^1(\Omega)^d$, $H = L^2(\Omega)^d$, $Q = L^2_0(\Omega) \times H^{1/2}(\partial\Omega)^d$, $a(u,v) := \int_\Omega \nabla u : \nabla v$, $b(u,(q_1,q_2)) = \int_\Omega \operatorname{div} u \, q_1 + (\operatorname{tr} u, q_2)_{H^{1/2}(\partial\Omega)}$, $V = \{\, u \in H^1_0(\Omega)^d \mid \operatorname{div} u = 0 \,\}$. Note that $u \in H^1(\Omega)^d \mapsto (\operatorname{div} u, \operatorname{tr} u) \in L^2_0(\Omega) \times H^{1/2}(\partial\Omega)^d$ is surjective. The Stokes equation in constrained formulation is given by
\[ u' - \Delta u = f \ \ \text{in } L^2(I;V'), \qquad \operatorname{div} u = g \ \ \text{in } L^2(I;L^2_0(\Omega)), \qquad \operatorname{tr} u = h \ \ \text{in } L^2(I;H^{1/2}(\partial\Omega)^d). \]
The assumptions (2.4), (2.7), (2.8) are satisfied. For the mixed formulation we take $Q_1 = L^2_0(\Omega)$, $Q_2 = H^{1/2}(\partial\Omega)^d$, $b_1(u,q_1) = \int_\Omega \operatorname{div} u \, q_1$, $b_2(u,q_2) = (\operatorname{tr} u, q_2)_{H^{1/2}(\partial\Omega)}$. Note that $V_2 = H^1_0(\Omega)^d$.

Remark 2.1. In the numerical approximation of these parabolic problems one has to consider discretization in space and in time. In this paper we focus on the time discretization. We now briefly discuss the space discretization and continue this discussion in Section 6. The two formulations (2.6) and (2.14) lead to two different numerical schemes. In the former case one has to replace the space $V$ by a suitable discrete (e.g., finite element) space. In the mixed formulation the spaces $V_2$ and $Q_1$ have to be replaced by a suitable pair of discrete (finite element) spaces. In the specific case of the Stokes problem this leads to two different well-known finite element approaches. Numerous conforming and non-conforming finite element (or wavelet) schemes for space discretization of the Stokes problem exist. By far most of these schemes use the mixed formulation (2.14), cf. [15, 1, 7, 4, 8].
In that setting we need a pair of discrete spaces $V_2^h$, $Q_1^h$ for approximation of functions from the spaces $V_2 = H^1_0(\Omega)^d$ and $Q_1 = L^2_0(\Omega)$. Extensive literature concerning the choice of such a pair of spaces is available, cf. [1, 15] and the references therein. Recently so-called pressure robust mixed methods have been studied, e.g. [4, 5, 6]. To treat the Stokes problem by means of the constrained formulation (2.6) a discrete (finite element or wavelet) space $V^h$ with divergence-free functions is required. Examples of such discrete spaces are treated in, e.g., [31, 35, 14, 19, 5]. By far most of the error analyses of these methods (for both approaches) known in the literature result in bounds for the spatial discretization error in case of a stationary problem and for the discretization error in the time-dependent semidiscrete (i.e., discrete in space only) problem. In only very few papers, e.g., [1, ], the error in the fully discrete problem is treated. In these papers, however, only the case with homogeneous constraints (i.e., $\operatorname{div} u = 0$ and homogeneous Dirichlet boundary conditions) is treated and the error in the discretization of the Lagrange multiplier (pressure) is not considered or only suboptimal error bounds are derived. We also note that instead of these popular Rothe or method of lines techniques one can apply a direct full space-time approach, cf., e.g., the recent papers [9, 30].

In this paper we do not compare all these different approaches. We present and analyze a finite element method that is of Rothe type. For discretization in time we apply a DG finite element method. An important aspect is that this method, which is a modification of the standard DG time discretization method ([3]), allows the treatment of time-dependent inhomogeneous constraints with an optimal order of accuracy. For the semidiscrete problem (i.e., discrete in time only) optimal discretization error bounds for both approaches (constrained and mixed) are derived. In the mixed formulation we derive optimal bounds for both the primal variable $u$ and the Lagrange multiplier $p$.

Remark 2.2. In the analysis it is essential that the constraint operator $B$ does not depend on $t$. Above we also assumed that $A$ is independent of $t$. This assumption, however, is only essential for the superconvergence result and for Subsection 5.1; all other results remain valid for a time-dependent $A$. Furthermore, the analysis also applies to a generalization of the problem (2.6) where the term $u'(t)$ is replaced by $Mu'(t)$ with $M$ a symmetric, elliptic bounded linear operator on $H$. Without loss of generality we can assume that $Mu(v) = (u,v)_H$. An example of the latter generalization is given in Subsection 7.2.

3. Discontinuous Galerkin discretization. In this section we present DG time discretization methods for the variational problems (2.6) and (2.14). First some notation is introduced. We take a fixed $q \in \mathbb{N}$, $q \ge 1$. In the discretization we will use polynomials of degree $q-1$ ($q$ degrees of freedom) in time. The space of polynomials of degree $q-1$ is denoted by $P_{q-1}$. For $N \in \mathbb{N}$, introduce $0 = t_0 < \dots < t_N = T$, $I_n = (t_{n-1}, t_n]$, and $k_n = |I_n|$ for $n = 1, \dots, N$. For simplicity, we assume that $k := \max_n k_n \le 1$. We define the broken spaces
\[ P^b(I) := \bigoplus_{n=1}^N P_{q-1}(I_n) \subset L^2(I), \qquad H^{1,b}(I) := \bigoplus_{n=1}^N H^1(I_n), \]
\[ P^b(I;\hat H) := P^b(I) \otimes \hat H, \qquad H^{1,b}(I;\hat H) := H^{1,b}(I) \otimes \hat H, \]
with $\hat H$ a given Hilbert space.
For $U \in H^{1,b}(I;\hat H)$ and $n = 1, \dots, N$ we will write $U^n = U|_{I_n}(t_n)$, $U^{n-1}_+ = \lim_{t \downarrow t_{n-1}} U|_{I_n}(t)$. We also define $[U]^n = U^n_+ - U^n$ for $n = 1, \dots, N-1$. We define $U'$ by taking the derivative on each interval $I_n$: $U' = \sum_{n=1}^N (U|_{I_n})' \chi_n$, with $\chi_n$ the characteristic function for $I_n$. We define the following bilinear form on $H^{1,b}(I;\hat H)$ which corresponds to a discrete time derivative:
\[ D_{\hat H}(Y,X) := \sum_{n=1}^N \int_{I_n} (Y',X)_{\hat H} + \sum_{n=1}^{N-1} ([Y]^n, X^n_+)_{\hat H} + (Y^0_+, X^0_+)_{\hat H}, \]
and we define
\[ D^*_{\hat H}(Y,X) := \sum_{n=1}^N \int_{I_n} (Y,X')_{\hat H} + \sum_{n=1}^{N-1} (Y^n, [X]^n)_{\hat H} - (Y^N, X^N)_{\hat H}. \]
We will need some basic properties for $D_{\hat H}$ and $D^*_{\hat H}$. These results are standard and can be found in [3, Chapter 1].
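The discrete time derivative form can be evaluated exactly for scalar piecewise polynomials. The following is a minimal sketch, assuming the scalar case (the Hilbert space is the real line), a small non-uniform grid and randomly chosen polynomial pieces; the helper name `dg_derivative_form` and all parameter values are hypothetical and only serve as illustration. It computes D(u, u) and the lower bound given by half the squared end value.

```python
import numpy as np
from numpy.polynomial import Polynomial


def dg_derivative_form(pieces_y, pieces_x, nodes):
    """Evaluate D(Y, X) for scalar piecewise polynomials.

    pieces_y[n] and pieces_x[n] are Polynomial objects describing Y and X
    on the interval (nodes[n], nodes[n + 1]].
    """
    total = 0.0
    n_int = len(nodes) - 1
    for n in range(n_int):
        a, b = nodes[n], nodes[n + 1]
        # exact integral of Y' X over the interval via an antiderivative
        antiderivative = (pieces_y[n].deriv() * pieces_x[n]).integ()
        total += antiderivative(b) - antiderivative(a)
    for n in range(1, n_int):
        t = nodes[n]
        jump = pieces_y[n](t) - pieces_y[n - 1](t)  # right limit minus left value
        total += jump * pieces_x[n](t)              # tested with the right limit of X
    # initial term: right limits of Y and X at the first node
    total += pieces_y[0](nodes[0]) * pieces_x[0](nodes[0])
    return total


rng = np.random.default_rng(0)
nodes = np.array([0.0, 0.3, 0.7, 1.0])
degree = 2  # polynomial degree q - 1 with q = 3 (arbitrary choice)
u = [Polynomial(rng.standard_normal(degree + 1)) for _ in range(len(nodes) - 1)]

d_uu = dg_derivative_form(u, u, nodes)
end_value = u[-1](nodes[-1])        # value of u at the final time, taken from the left
lower_bound = 0.5 * end_value ** 2
print(d_uu, lower_bound)
```

In this sketch `d_uu` exceeds `lower_bound` by half the squared initial value plus half the sum of the squared jumps, which is the quantity that is discarded when the lower bound for D(u, u) is derived.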

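Lemma 3.1. The following holds for any Hilbert space $\hat H$:
\[ D_{\hat H}(u,v) = -D^*_{\hat H}(u,v) \quad \text{for all } u, v \in H^{1,b}(I;\hat H), \tag{3.1} \]
\[ D_{\hat H}(u,u) \ge \tfrac12 \|u^N\|_{\hat H}^2 \quad \text{for all } u \in H^{1,b}(I;\hat H). \tag{3.2} \]
Proof. Equation (3.1) follows by integration by parts on each interval. The second part follows from
\[ D_{\hat H}(u,u) = \sum_{n=1}^N \int_{I_n} (u',u)_{\hat H} + \sum_{n=1}^{N-1} ([u]^n, u^n_+)_{\hat H} + (u^0_+, u^0_+)_{\hat H} \]
\[ = \tfrac12 \sum_{n=1}^N \big( \|u^n\|_{\hat H}^2 - \|u^{n-1}_+\|_{\hat H}^2 \big) + \sum_{n=1}^{N-1} ([u]^n, u^n_+)_{\hat H} + \|u^0_+\|_{\hat H}^2 \]
\[ = \tfrac12 \|u^N\|_{\hat H}^2 + \tfrac12 \|u^0_+\|_{\hat H}^2 + \sum_{n=1}^{N-1} \big( \tfrac12 \|u^n\|_{\hat H}^2 - \tfrac12 \|u^n_+\|_{\hat H}^2 + \|u^n_+\|_{\hat H}^2 - (u^n, u^n_+)_{\hat H} \big) \]
\[ = \tfrac12 \|u^N\|_{\hat H}^2 + \tfrac12 \|u^0_+\|_{\hat H}^2 + \tfrac12 \sum_{n=1}^{N-1} \big( \|u^n\|_{\hat H}^2 + \|u^n_+\|_{\hat H}^2 - 2 (u^n, u^n_+)_{\hat H} \big) \ge \tfrac12 \|u^N\|_{\hat H}^2, \]
where we have used $(u^n - u^n_+, u^n - u^n_+)_{\hat H} \ge 0$ in the last step.

We now recall a projection operator that plays a key role in the error analyses presented in [3, 8]. This operator is not used explicitly in a standard DG time discretization method applied to a parabolic problem without constraints. It will, however, be used in the DG method for the discretization of the problem with constraints (2.6) that is introduced below.

Definition 3.1 (as in [8]). Let $J = (a,b)$. For a scalar function $\varphi \in L^2(J)$ which is continuous at $t = b$ we define its projection $\Pi^{q-1}_J \varphi \in P_{q-1}(J)$ by the $q$ conditions
\[ \Pi^{q-1}_J \varphi(b) = \varphi(b), \qquad \int_a^b (\Pi^{q-1}_J \varphi - \varphi)\, \psi = 0 \quad \forall\, \psi \in P_{q-2}(J) \]
(for $q = 1$ only the first condition is used). For $\varphi \in L^2_c(I) := \{\, \psi \in L^2(I) \mid \psi|_{I_n} \text{ is continuous at } t_n,\ 1 \le n \le N \,\}$ we define the projection $\mathcal{I}_{P^b} : L^2_c(I) \to P^b(I)$ by $(\mathcal{I}_{P^b} \varphi)|_{I_n} := \Pi^{q-1}_{I_n}(\varphi|_{I_n})$. Finally, for any Hilbert space $\hat H$ and $u \in L^2_c(I;\hat H) = L^2_c(I) \otimes \hat H$, we define $\mathcal{I}_q u := (\mathcal{I}_{P^b} \otimes \mathrm{id}_{\hat H}) u$.

For the derivation of properties of this projection operator we refer to [8, 3]. A useful characterization of the projection is given in the following lemma.

Lemma 3.2 ([3, p. 07-08]). Let $w \in H^{1,b}(I;\hat H)$ for any separable Hilbert space $\hat H$. The solution $W \in P^b(I;\hat H)$ of
\[ D_{\hat H}(W,X) = D_{\hat H}(w,X) \quad \text{for all } X \in P^b(I;\hat H) \tag{3.3} \]
fulfills $W = \mathcal{I}_q w$.

Proof. By taking an orthogonal basis of $\hat H$ it suffices to establish this for the scalar version of (3.3), which reads as follows: for given $f \in H^{1,b}(I)$ find $F \in P^b(I)$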

such that for all $n = 1, \dots, N$:
\[ \int_{I_n} (F - f)' \cdot 1 + (F - f)^{n-1}_+ - (F - f)^{n-1} = 0 \quad \text{with } (F - f)^0 := 0 \tag{3.4} \]
and
\[ \int_{I_n} (F - f)' (t - t_{n-1})^{j+1} = 0 \quad \text{for } j = 0, \dots, q-2. \tag{3.5} \]
After integration (3.4) becomes $(F - f)^n = (F - f)^{n-1}$ with $(F - f)^0 := 0$. From this we obtain $F(t_n) = f(t_n)$ for all $n = 1, \dots, N$ by induction. Integration by parts in (3.5), together with $F(t_n) = f(t_n)$, shows that (3.5) is equivalent to $\int_{I_n} (F - f)\, p = 0$ for all $p \in P_{q-2}(I_n)$. Hence $F = \mathcal{I}_{P^b} f$, cf. Definition 3.1.

We now describe the DG time discretizations of the constrained and mixed formulations (2.6) and (2.14), respectively. We introduce the notation
\[ K(u,v) := D_H(u,v) + \int_I a(u,v), \qquad u, v \in H^{1,b}(I;H) \cap L^2(I;U). \]

Time-discrete constrained formulation. The discrete (in time) version of (2.6) reads: find $U \in P^b(I;U)$ such that
\[ K(U,X) = f(X) + (u_0, X^0_+)_H \quad \text{for all } X \in P^b(I;V), \tag{3.6} \]
\[ b(U,R) = (\mathcal{I}_q g)(R) \quad \text{for all } R \in P^b(I;Q). \tag{3.7} \]

Time-discrete mixed formulation. The discrete (in time) version of (2.14) reads: find $U \in P^b(I;U)$, $P \in P^b(I;Q_1)$ such that
\[ K(U,X) + b_1(X,P) = f(X) + (u_0, X^0_+)_H \quad \text{for all } X \in P^b(I;V_2), \tag{3.8} \]
\[ b(U,R) = (\mathcal{I}_q g)(R) \quad \text{for all } R \in P^b(I;Q). \tag{3.9} \]
Note that in the discretization of the constraints the projection operator $\mathcal{I}_q$ is used, cf. Remark 3.1 below. If $B$ corresponds to a trace operator (boundary condition) this is the usual treatment of an inhomogeneous Dirichlet boundary condition, but with the Dirichlet data projected in a suitable manner in time by means of $\mathcal{I}_q$. In case of a Stokes problem with a homogeneous Dirichlet boundary condition, the condition in (3.9) is the divergence constraint with projected data $\mathcal{I}_q g$.

Remark 3.1. The formulations of the constraint equations in (3.7) and (3.9) are given in variational form because this is closest to the actual implementation of the method, cf. Subsection 6.2. For the analysis below it is useful to introduce an equivalent operator formulation. In operator form the equations (3.7) and (3.9) are equivalent to $BU = \mathcal{I}_q g$ in $P^b(I;Q')$.
Due to the tensor product structure of $\mathcal{I}_q$ and the facts that $B$ does not depend on $t$ and $U \in P^b(I;U)$ we get $BU = B \mathcal{I}_q U = \mathcal{I}_q BU$. Hence the constraint conditions (3.7) and (3.9) have the equivalent operator formulation
\[ \mathcal{I}_q (BU - g) = 0 \quad \text{in } P^b(I;Q'). \]
In the error analysis we use this more compact representation of the discrete constraints. In the mixed formulation, for the case $b_2 \not\equiv 0$, cf. Example 2.3, it is natural to split the constraint equation (3.9) in two parts:
\[ b_1(U,R) = b(U,(R,0)) = (\mathcal{I}_q g_1)(R) \quad \text{for all } R \in P^b(I;Q_1), \tag{3.10} \]
\[ b_2(U,R) = b(U,(0,R)) = (\mathcal{I}_q g_2)(R) \quad \text{for all } R \in P^b(I;Q_2), \tag{3.11} \]
where $g = (g_1, g_2)$. Note that in (3.10) the same bilinear form $b_1(\cdot,\cdot)$ and Lagrange multiplier space $P^b(I;Q_1)$ as in (3.8) are used.

In the following lemma we derive consistency and well-posedness of these discretizations.

Lemma 3.3. Assume that the solution $u$ of (2.6) has regularity $u \in H^1(I;H)$ and assume that $g \in L^2_c(I;Q')$. The time-discrete constrained and mixed formulations (3.6)-(3.7) and (3.8)-(3.9) are consistent, i.e.:
\[ K(u,X) = f(X) + (u_0, X^0_+)_H \quad \text{for all } X \in P^b(I;V), \tag{3.12} \]
\[ \mathcal{I}_q (Bu - g) = 0, \tag{3.13} \]
and
\[ K(u,X) + b_1(X,p) = f(X) + (u_0, X^0_+)_H \quad \text{for all } X \in P^b(I;V_2), \tag{3.14} \]
\[ \mathcal{I}_q (Bu - g) = 0 \tag{3.15} \]
hold. For $f \in L^2(I;V')$ the time-discrete problem (3.6)-(3.7) has a unique solution $U$. If $f$ has regularity $f \in L^2(I;V_2')$ then there is a unique $P \in P^b(I;Q_1)$ such that with the unique solution $U$ of (3.6)-(3.7) the pair $(U,P)$ is the unique solution of the time-discrete mixed problem (3.8)-(3.9).

Proof. The consistency properties are evident (note that due to continuity in time of $u$ the jump terms in $D_H(\cdot,\cdot)$ vanish). Existence of a discrete solution of (3.6)-(3.7) is proved by a standard shift argument as follows. Using that $B : U \to Q'$ is surjective, we find a $v \in L^2_c(I;U)$ which satisfies the constraint $Bv = g$. Take $V := \mathcal{I}_q v \in P^b(I;U)$ (a discrete function, not to be confused with the kernel space $V$). Due to ellipticity of $a(\cdot,\cdot)$ on the kernel space $V$ and $D_H(X,X) \ge 0$ for all $X \in P^b(I;U)$ the variational problem: find $W \in P^b(I;V)$ such that
\[ K(W,X) = f(X) + (u_0, X^0_+)_H - K(V,X) \quad \text{for all } X \in P^b(I;V) \]
has a unique solution. Take $U := W + V \in P^b(I;U)$.
Then $U$ satisfies (3.6) and we have
\[ \mathcal{I}_q (BU - g) = \mathcal{I}_q (BV - g) = \mathcal{I}_q (B \mathcal{I}_q v - g) = \mathcal{I}_q (\mathcal{I}_q Bv - g) = \mathcal{I}_q (Bv - g) = 0, \]
hence also the discrete constraint is satisfied. Suppose that there are two solutions $U_1$, $U_2$ of (3.6)-(3.7). Then $BU_i = B(\mathcal{I}_q U_i) = \mathcal{I}_q BU_i = \mathcal{I}_q g$ holds, hence $U_1 - U_2 \in L^2(I;V)$. From the ellipticity of $K(\cdot,\cdot)$ on $P^b(I;V)$ we get $U_1 = U_2$.
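The argument above relies on the fact that the time projection of Definition 3.1 commutes with a time-independent constraint operator. The following is a minimal sketch of this property on a single interval, assuming a finite-dimensional stand-in in which the constraint operator is a fixed matrix and the function values are vectors in R^3; the helper name `project`, the matrix `B`, the interval and the component functions are hypothetical choices made only for illustration. The projection is expressed in Legendre polynomials on the reference interval [-1, 1], where the orthogonality conditions fix the first q - 1 coefficients and the end point condition fixes the last one.

```python
import numpy as np
from numpy.polynomial import legendre


def project(phi, a, b, q, n_quad=40):
    """Legendre coefficients (reference interval [-1, 1]) of the projection of
    degree q - 1 that is exact at t = b and orthogonal to degree q - 2."""
    x, w = legendre.leggauss(n_quad)
    t = 0.5 * (b - a) * x + 0.5 * (a + b)   # quadrature nodes mapped to (a, b)
    values = phi(t)
    coeffs = np.zeros(q)
    for j in range(q - 1):
        basis = legendre.legval(x, np.eye(q)[j])    # Legendre polynomial of degree j
        coeffs[j] = 0.5 * (2 * j + 1) * np.sum(w * values * basis)
    # every Legendre polynomial equals 1 at x = 1, so the last coefficient
    # is determined by the interpolation condition at the right end point
    coeffs[q - 1] = phi(b) - coeffs[: q - 1].sum()
    return coeffs


a, b, q = 0.2, 0.5, 3
v = [np.sin, np.cos, np.exp]                         # components of v(t) in R^3
B = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])    # time-independent stand-in for the constraint operator

proj_v = np.array([project(f, a, b, q) for f in v])              # projection of v, shape (3, q)
Bv = [lambda t, row=row: sum(r * f(t) for r, f in zip(row, v)) for row in B]
proj_Bv = np.array([project(f, a, b, q) for f in Bv])            # projection of B v, shape (2, q)

commutation_gap = np.max(np.abs(B @ proj_v - proj_Bv))
end_point_gap = abs(legendre.legval(1.0, proj_v[0]) - np.sin(b))
print(commutation_gap, end_point_gap)
```

Both gaps are at the level of rounding errors in this setting, since the projection acts only on the time variable and is linear.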

It remains to prove that (3.8)-(3.9) has a unique solution. If $(U,P)$ solves (3.8)-(3.9), then $U$ solves (3.6)-(3.7). Conversely, if $U$ solves (3.6)-(3.7), then $(U,P)$ solves (3.8)-(3.9) if and only if
\[ B_1' P(X) = f(X) + (u_0, X^0_+)_H - K(U,X) \quad \text{for all } X \in P^b(I;V_2) \]
holds. Since the right-hand side vanishes for all $X \in P^b(I;V)$ and $B_1' : Q_1 \to V^0$ is an isomorphism, there exists a unique $P \in P^b(I;Q_1)$ which satisfies this relationship.

Remark 3.2. In the analysis below we will see that the projection operator $\mathcal{I}_q$ used in the time-discrete constraints (3.7) and (3.9) is essential for optimal order convergence results. Applying Lemma 3.2 to (3.7) (or (3.9)) shows that (3.7) is equivalent to $D_{Q'} BU = D_{Q'} g$, which is the discrete analogue of $Bu' = g'$ with $Bu(0) = g(0)$. The discretization of the constraint by $D_{Q'} BU = D_{Q'} g$ is very similar to the index reduction approach used in [3] for obtaining accurate Runge-Kutta discretizations of DAEs with differentiation index 2 (such as the Stokes equation), cf. equation (3.3c) in [3]. The projection operator $\mathcal{I}_q$ is also used in [33] to obtain optimal error bounds for the wave equation with non-homogeneous boundary conditions.

4. Error analysis for time-discrete constrained formulation. In this section we derive optimal discretization error bounds for the discrete constrained formulation (3.6)-(3.7). We first derive global bounds in the energy norm and then give a superconvergence result.

4.1. Optimal global error bounds. We apply a standard argument as in [8] and show that the discretization error can be bounded by the projection error. For this projection there are error bounds available.

Theorem 4.1. Assume that the solution $u$ of (2.6) has regularity $u \in L^2_c(I;U) \cap H^1(I;H)$ and let $U \in P^b(I;U)$ be the solution of (3.6)-(3.7). The following holds:
\[ \|u(T) - U(T)\|_H^2 + \|u - U\|_{L^2(I;U)}^2 \le (1 + c_\gamma \Gamma)^2 \|u - \mathcal{I}_q u\|_{L^2(I;U)}^2, \tag{4.1} \]
with $c_\gamma := \max\{\tfrac{1}{\gamma}, 1\}$.

Proof. Define $E := U - \mathcal{I}_q u$. Note that
\[ BE = BU - B(\mathcal{I}_q u) = B(\mathcal{I}_q U) - B(\mathcal{I}_q u) = \mathcal{I}_q (BU - Bu) = \mathcal{I}_q (BU - g) = 0. \tag{4.2} \]
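Hence $E \in P^b(I;V)$ can be used as a test function in (3.6) and in the consistency relation (3.12), which gives $K(U - u, E) = 0$. This then yields, using the ellipticity of $a(\cdot,\cdot)$ on $V$ and $D_H(E,E) \ge \tfrac12 \|E(T)\|_H^2$, with $c_\gamma = \max\{\tfrac{1}{\gamma}, 1\}$:
\[ \tfrac12 \|E(T)\|_H^2 + \|E\|_{L^2(I;U)}^2 \le c_\gamma K(E,E) = c_\gamma K(u - \mathcal{I}_q u, E) = c_\gamma \Big( D_H(u - \mathcal{I}_q u, E) + \int_I a(u - \mathcal{I}_q u, E) \Big). \]
Using (3.3) we get
\[ D_H(u - \mathcal{I}_q u, E) = 0, \tag{4.3} \]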

and thus
\[ \tfrac12 \|E(T)\|_H^2 + \|E\|_{L^2(I;U)}^2 \le c_\gamma \Gamma \|E\|_{L^2(I;U)} \|u - \mathcal{I}_q u\|_{L^2(I;U)} \le \tfrac12 c_\gamma^2 \Gamma^2 \|u - \mathcal{I}_q u\|_{L^2(I;U)}^2 + \tfrac12 \|E\|_{L^2(I;U)}^2 \]
holds. This yields $\|E(T)\|_H^2 + \|E\|_{L^2(I;U)}^2 \le c_\gamma^2 \Gamma^2 \|u - \mathcal{I}_q u\|_{L^2(I;U)}^2$. Using this, a triangle inequality and $u(T) - (\mathcal{I}_q u)(T) = 0$ we obtain the result (4.1).

Remark 4.1. For this analysis to work and the projection error bound (4.1) to hold it is crucial, cf. (4.2), that in the discretization of the constraint we use $BU = \mathcal{I}_q g$ and not $BU = g$ or some other interpolation (in time) of the data $g$. In Subsection 7.1 we will present a numerical experiment where the use of $BU = g$ instead of $BU = \mathcal{I}_q g$ leads to sub-optimal results.

Results for the projection error $\|u - \mathcal{I}_q u\|_{L^2(I;U)}$ are known in the literature [8, Theorem 3.10]. Using these results we obtain the following optimal discretization error bound.

Theorem 4.2. Let $U \in P^b(I;U)$ be the solution of (3.6)-(3.7), and assume that the solution $u$ of (2.6) has smoothness $u \in H^m(I;U)$ for an $m$ with $1 \le m \le q$. The following holds:
\[ \|u(T) - U(T)\|_H + \|u - U\|_{L^2(I;U)} \le c \Big( \sum_{n=1}^N k_n^{2m} \|u^{(m)}\|_{L^2(I_n;U)}^2 \Big)^{1/2} \tag{4.4} \]
\[ \le c\, k^m \|u^{(m)}\|_{L^2(I;U)} \tag{4.5} \]
for some $c > 0$ which only depends on $q$, $\gamma$ and $\Gamma$.

Remark 4.2. Typically one is interested in the case $q = m$. The other cases will, however, also be of some use in what follows. For the homogeneous case $Bu = 0$ the result (4.4) is the same as the one in [3, Theorem 1.1]. In applications the constraint $Bu = g$ may be incompatible with the initial condition $u(0) = u_0$. This leads to low regularity of the solution $u$ at $t = 0$, i.e., $\|u^{(m)}\|_{L^2(I_n;U)}$ is unbounded for larger $m$ and small $n$. In such a setting the step size $k_n$ in (4.4) needs to be taken sufficiently small in the initial steps. In the specific case of Example 2.1 and Example 2.3 this is the case if the Dirichlet boundary condition is incompatible with the initial condition, cf. [8].

We also derive a bound for the error in the time derivative, which will be used in the analysis of superconvergence.

Theorem 4.3. Assume that the solution $u$ of (2.6) has regularity $u \in H^1(I;H)$ and assume that $g \in L^2_c(I;Q')$.
Let $U \in P^b(I;U)$ be the solution of (3.6)-(3.7). The following holds:
\[ \|u' - U'\|_{L^2(I;V')} \le c \big( \|u' - (\mathcal{I}_q u)'\|_{L^2(I;V')} + \|u - U\|_{L^2(I;U)} \big) \tag{4.6} \]
for some $c > 0$ which only depends on $q$ and $\Gamma$. If we additionally assume that $u$ has smoothness $u \in H^{m-1}(I;U) \cap H^m(I;V')$ for an $m$ with $2 \le m \le q$, then the following

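holds:
\[ \|u' - U'\|_{L^2(I;V')} \le c \Big( \sum_{n=1}^N k_n^{2m-2} \big( \|u^{(m)}\|_{L^2(I_n;V')}^2 + \|u^{(m-1)}\|_{L^2(I_n;U)}^2 \big) \Big)^{1/2} \le c\, k^{m-1} \big( \|u^{(m)}\|_{L^2(I;V')} + \|u^{(m-1)}\|_{L^2(I;U)} \big), \tag{4.7} \]
for some $c > 0$ which only depends on $q$, $\gamma$ and $\Gamma$.

Proof. We will use $E = U - \mathcal{I}_q u \in P^b(I;H)$. Using the Riesz representation theorem in $V$ (with $\|\cdot\|_V = \|\cdot\|_U$), we find $\tilde E \in \bigoplus_{n=1}^N P_{q-2}(I_n;V)$ such that $\|E'(t)\|_{V'}^2 = (E'(t), \tilde E(t))_H = \|\tilde E(t)\|_U^2$ for almost all $t \in I$. We define $X = \sum_{n=1}^N k_n^{-1} (t - t_{n-1}) \tilde E(t) \chi_n \in P^b(I;V)$ and using Lemma 3.2 and the Galerkin orthogonality, cf. Lemma 3.3, we get
\[ D_H(E,X) = D_H(U - \mathcal{I}_q u, X) = D_H(U - u, X) = K(U - u, X) - \int_I a(U - u, X) = \int_I a(u - U, X). \]
This yields
\[ D_H(E,X) \le \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) \Gamma \|u - U\|_U \|\tilde E\|_U \le \tfrac12 \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) \big( \Gamma^2 \|u - U\|_U^2 + \|E'\|_{V'}^2 \big). \]
Noting that the left-hand side is
\[ D_H(E,X) = \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) (E', \tilde E)_H = \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) \|E'\|_{V'}^2, \]
we get
\[ \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) \|E'\|_{V'}^2 \le \sum_{n=1}^N \int_{I_n} k_n^{-1} (t - t_{n-1}) \Gamma^2 \|u - U\|_U^2 \le \Gamma^2 \|U - u\|_{L^2(I;U)}^2. \]
Using that the $L^1$-norm on $P_{2q-4}([0,1])$ is equivalent to the norm $\varphi \mapsto \int_0^1 t\, |\varphi(t)| \, dt$, we get
\[ \|E'\|_{L^2(I;V')}^2 = \sum_{n=1}^N \int_{I_n} \|E'\|_{V'}^2 \le C \sum_{n=1}^N k_n^{-1} \int_{I_n} (t - t_{n-1}) \|E'\|_{V'}^2 \le C \Gamma^2 \|U - u\|_{L^2(I;U)}^2, \]
for some constant $C > 0$ which depends only on $q$. Using a triangle inequality completes the proof of (4.6). Results for the projection error $\|u' - (\mathcal{I}_q u)'\|_{L^2(I;V')}$ are known in the literature [3, p. 14]. Using these results and Theorem 4.1 we obtain the optimal discretization error bound (4.7).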
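The lowest-order case q = 1 of the time-discrete constrained formulation can be written down explicitly for a small finite-dimensional model problem. The following is a minimal sketch, not the experiment of Section 7: it assumes U = H = R^2, the bilinear form a(u, v) = v^T A u with the symmetric positive definite matrix A = [[2, 1], [1, 2]], and the constraint u_1 = g(t) = sin t, so that the kernel space is spanned by the second unit vector. The data are chosen (hypothetically) such that the exact solution is u(t) = (sin t, cos t). For q = 1 the projected constraint data on each interval is the value of g at the right end point, and testing with a function from the kernel space gives a backward-Euler-type update with exactly integrated right-hand side.

```python
import numpy as np


def dg0_constrained(n_steps, t_end=1.0):
    """Piecewise-constant DG (q = 1) for the toy problem
         u2' + u1 + 2 u2 = 2 cos t,   u1 = g(t) = sin t,   u2(0) = 1,
    with exact solution u = (sin t, cos t). Returns the error in u2 at t_end."""
    k = t_end / n_steps
    u2 = 1.0                                     # initial value of the free component
    for n in range(1, n_steps + 1):
        t_old, t_new = (n - 1) * k, n * k
        u1 = np.sin(t_new)                       # projected constraint data: g at the right end point
        load = 2.0 * (np.sin(t_new) - np.sin(t_old))   # exact integral of 2 cos t over the step
        # (U2_new - U2_old) + k * (u1 + 2 * U2_new) = load
        u2 = (u2 + load - k * u1) / (1.0 + 2.0 * k)
    return abs(u2 - np.cos(t_end))


errors = [dg0_constrained(n) for n in (40, 80, 160)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors, ratios)
```

Halving the step size roughly halves the end-time error in this sketch, which is the first-order behavior that the bound (4.5) predicts for m = q = 1. The example does not illustrate the difference between projected and unprojected constraint data; that comparison requires q larger than 1 and is the subject of the experiment referred to in Remark 4.1.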

4.2. Optimal superconvergence result. As is known from the literature (see [3, Theorem 1.3]), for the case with a homogeneous constraint in (2.6) we get a superconvergence result provided the solution $u$ is sufficiently smooth. The analysis in [3, Theorem 1.3] is not directly applicable to the case with an inhomogeneous constraint. Note that in the latter case the projection operator $\mathcal{I}_q$ is used in the discretization method, cf. (3.7). In this subsection we derive an optimal superconvergence result for the discretization (3.6)-(3.7). For this we first introduce an abstract notion of regularity, which is very similar to the one used in [3].

We define $V_0 := \overline{V}^H$ (with norm $\|\cdot\|_H$), $V_1 := V$ (with norm $\|\cdot\|_U$), which induces the Gelfand triple $V \hookrightarrow V_0 \cong (V_0)' \hookrightarrow V'$. Furthermore $V_2 := \{\, u \in V \mid Au \in (V_0)' \,\}$. Based on the Riesz representation theorem in $V_0$, for any $u \in U$ with $Au \in (V_0)'$ we define $\bar L u \in V_0$ by $(\bar L u, v)_H = a(u,v)$ for all $v \in V$. Let us consider the restriction $L := \bar L|_{V_2}$, i.e., $L : V_2 \to V_0$ with
\[
(Lu, v)_H = (Au)(v) = a(u,v) \quad \text{for all } u \in V_2,\ v \in V. \tag{4.8}
\]
For $u \in V_2$ with $Lu = 0$ it follows that $a(u,v) = 0$ for all $v \in V$, hence, $u = 0$. Take $f \in V_0$. There exists a unique $u \in V$ such that $a(u,v) = (f,v)_H$ for all $v \in V$. This implies $\|Au\|_{(V_0)'} = \|f\|_H$, hence $u \in V_2$. We conclude that $L : V_2 \to V_0$ is a bijection. Its inverse $L^{-1} : V_0 \to V_2 \subset V_0$ is symmetric, bounded and positive. The unique square root operator of $L^{-1}$ (that commutes with $L^{-1}$) is denoted by $L^{-1/2} : V_0 \to V_0$. We define $L^{1/2} := L^{-1/2}L : V_2 \to V_0$, hence, $(L^{1/2}v, L^{1/2}v)_H = (Lv, v)_H = a(v,v)$ for all $v \in V_2$.

The definition of $L^{1/2}$ can be extended to $V$. First note that $A : V \to V'$ is a homeomorphism and $V_2$ is defined as the preimage of $(V_0)'$. Since $(V_0)' \subset V'$ is dense, it follows that $V_2 \subset V$ is dense. Using the ellipticity and continuity of $a(\cdot,\cdot)$ we see that $\gamma^{1/2}\|v\|_U \le a(v,v)^{1/2} = \|L^{1/2}v\|_H \le \Gamma^{1/2}\|v\|_U$ for all $v \in V_2$. Thus, using the density of $V_2$ in $V$, we conclude that $L^{1/2} : V \to V_0$ is a homeomorphism.
Since $L^{1/2}$ is symmetric (due to $L^{-1/2}L = LL^{-1/2}$), the dual norm equivalence reads $\gamma^{1/2}\|v\|_H \le \|L^{1/2}v\|_{V'} \le \Gamma^{1/2}\|v\|_H$ for all $v \in V$. For $m \in \mathbb{N}$, $m > 2$, we define $V_m = \{\, u \in V \mid Lu \in V_{m-2} \,\}$. Note that $L^{1/2}Lv = LL^{1/2}v$ for all $v \in V_3$ and $D(L^{m/2}) = V_m$, $m \in \mathbb{N}$. As a measure for regularity we define
\[
|u|_m := \sup_{v \in V_m} \frac{(u, L^{\frac{m-2}{2}} v)_H}{\|v\|_H}, \quad u \in H,\ m \in \mathbb{N}
\]
(note that $|u|_m = \infty$ is allowed). Below we use the notation $L^{m/2}u := (\mathrm{id}_{L^2(I)} \otimes L^{m/2})u$, $\bar L u = (\mathrm{id}_{L^2(I)} \otimes \bar L)u$, for $m \in \mathbb{Z}$ and for functions $u$ for which the right-hand side expressions are well-defined.

For the analysis we introduce a standard dual homogeneous problem, i.e., source term $f = 0$ and with a homogeneous constraint $Bu = 0$; due to duality $t$ is replaced by $T - t$ and we prescribe an initial value at $t = T$. This leads to the following problem formulation: find $z \in W^1(V;V')$ with $z(T) = \phi \in V_0$ and
\[
-z'(t) + Az(t) = 0 \quad \text{in } L^2(I;V'). \tag{4.9}
\]
This problem is well posed. In the analysis below we need that the solution $z$ has regularity $z \in H^1(I;H)$. It is easy to see that $z$ has this regularity property if $\phi \in V$. Indeed, for $\phi \in V$ let $y \in W^1(V;V')$ be the solution of the parabolic problem (4.9) with $y(T) = L^{1/2}\phi \in V_0$, then $z = L^{-1/2}y$ solves (4.9) with $z(T) = \phi$ and $z' = L^{1/2}y \in L^2(I;H)$. Hence, for $\phi \in V$ we have $z \in H^1(I;H)$. This implies that (4.9) holds in $L^2(I;(V_0)')$. The corresponding discrete problem is as follows: determine $Z \in \mathcal{P}^b(I;V)$ such that
\[
K(X,Z) = D_H(X,Z) + \int_I a(Z,X) = (\phi, X^N)_H \quad \text{for all } X \in \mathcal{P}^b(I;V). \tag{4.10}
\]
This is the analogue of (3.6) for (4.9). This homogeneous backward problem has been discussed in detail in [3, p. 1-13]. If we replace $t$ by $T - t$, then we can apply the results from the previous subsection to $z$ and $Z$. In particular we can derive the following error estimate.

Lemma 4.4. Let $z$ and $Z$ be the solution of (4.9) and (4.10), respectively and assume that $z \in H^1(I;H)$. The following error estimate holds for $1 \le m \le q$:
\[
\|L^{-m+1+\frac12}(z - Z)\|_{L^2(I;H)} + \|L^{-m+\frac12}(z' - Z')\|_{L^2(I;H)} \le C\,k^{m-1}\|\phi\|_H \tag{4.11}
\]
for some $C > 0$ which only depends on $q$, $\gamma$ and $\Gamma$.

Proof. First consider the case $m = 1$. The homogeneous problem (4.9) can be formulated as $z' = Lz$ in $L^2(I;V_0)$. From $(z', z)_{L^2(I;H)} = (L^{1/2}z, L^{1/2}z)_{L^2(I;H)}$ and integration by parts, we get $\|L^{1/2}z\|_{L^2(I;H)}^2 \le \tfrac12\|\phi\|_H^2$. Similarly, taking $X = Z$ in (4.10) and applying Lemma 3.1 shows that
\[
\tfrac12\|Z^N\|_H^2 + \|L^{1/2}Z\|_{L^2(I;H)}^2 \le D_H(Z,Z) + \int_I a(Z,Z) \le \tfrac12\|\phi\|_H^2 + \tfrac12\|Z^N\|_H^2.
\]
We conclude that
\[
\|L^{1/2}z\|_{L^2(I;H)}^2 + \|L^{1/2}Z\|_{L^2(I;H)}^2 \le \|\phi\|_H^2. \tag{4.12}
\]
From (4.6) and the standard bound $\|z' - (\mathcal{I}_q z)'\|_{L^2(I;V')} \le c\,\|z'\|_{L^2(I;V')}$ (for some $c$ which depends only on $q$, cf. [3, p. 14]), we know that
\begin{align*}
\|L^{-1/2}(z' - Z')\|_{L^2(I;H)} &\le \gamma^{-1/2}\|z' - Z'\|_{L^2(I;V')} \le c\big(\|z'\|_{L^2(I;V')} + \|z\|_{L^2(I;U)} + \|Z\|_{L^2(I;U)}\big)\\
&\le c\big(\|z'\|_{L^2(I;V')} + \gamma^{-1/2}\|L^{1/2}z\|_{L^2(I;H)} + \gamma^{-1/2}\|L^{1/2}Z\|_{L^2(I;H)}\big),
\end{align*}
with a constant $c$ which only depends on $q$, $\gamma$ and $\Gamma$.
Noting that $\|z'\|_{L^2(I;V')} = \|Lz\|_{L^2(I;V')} \le \Gamma^{1/2}\|L^{1/2}z\|_{L^2(I;H)}$ and using (4.12), completes the proof for $m = 1$. We now consider the case $m \ge 2$. Let $y = L^{1-m}z$, $Y = L^{1-m}Z$ and $\psi = L^{1-m}\phi$. Note that the solution operators $\phi \mapsto z$ and $\phi \mapsto Z$ in (4.9) and (4.10) commute with $L$.

Thus $y$ solves (4.9) with initial condition $y(T) = \psi$ and $Y$ solves the corresponding discrete problem
\[
K(X,Y) = D_H(X,Y) + \int_I a(Y,X) = (\psi, X^N)_H \quad \text{for all } X \in \mathcal{P}^b(I;V).
\]
Note that $y^{(m-1)} = L^{m-1}y = z \in W^1(V;V')$. Using the estimates (4.4), (4.12) and $y^{(m-1)} = z$ we get
\[
\|y - Y\|_{L^2(I;U)} \le c\,k^{m-1}\|y^{(m-1)}\|_{L^2(I;U)} \le c\,k^{m-1}\gamma^{-1/2}\|L^{1/2}z\|_{L^2(I;H)} \le c\,\gamma^{-1/2}k^{m-1}\|\phi\|_H
\]
and
\[
\|L^{-m+1+\frac12}(z - Z)\|_{L^2(I;H)} = \|L^{1/2}(y - Y)\|_{L^2(I;H)} \le c\,k^{m-1}\Gamma^{1/2}\gamma^{-1/2}\|\phi\|_H,
\]
with a constant $c$ which only depends on $q$, $\gamma$ and $\Gamma$. From this we obtain the estimate for the first term in (4.11). In order to bound the second term we use (4.7), (4.12) and $y^{(m-1)} = z$:
\begin{align*}
\|L^{-m+\frac12}(z' - Z')\|_{L^2(I;H)} &= \|L^{-1/2}(y' - Y')\|_{L^2(I;H)} \le \gamma^{-1/2}\|y' - Y'\|_{L^2(I;V')}\\
&\le c\,k^{m-1}\big(\|y^{(m)}\|_{L^2(I;V')} + \|y^{(m-1)}\|_{L^2(I;U)}\big)\\
&\le c\,k^{m-1}\big(\Gamma^{1/2}\|L^{1/2}z\|_{L^2(I;H)} + \gamma^{-1/2}\|L^{1/2}z\|_{L^2(I;H)}\big)\\
&\le c\,k^{m-1}\big(\Gamma^{1/2} + \gamma^{-1/2}\big)\|L^{1/2}z\|_{L^2(I;H)} \le c\,k^{m-1}\big(\Gamma^{1/2} + \gamma^{-1/2}\big)\|\phi\|_H,
\end{align*}
with a constant $c$ which only depends on $q$, $\gamma$ and $\Gamma$. This yields the bound for the second term in (4.11).

Using this we can derive the following result, which has a superconvergence result as an easy corollary.

Theorem 4.5. Assume that the solution $u$ of (2.6) has regularity $u \in L^2_c(I;U) \cap H^1(I;H)$ and $Au \in L^2_c(I;(V_0)')$. Let $U \in \mathcal{P}^b(I;U)$ be the solution of (3.6)-(3.7). Let $l \le 2q-1$. The following holds:
\[
\|U^N - u(T)\|_H \le c\,k^{\frac{l-1}{2}}\Big(\int_I |\bar L u - \mathcal{I}_q \bar L u|_l^2\Big)^{1/2}
\]
for some $c > 0$ which only depends on $q$, $\gamma$ and $\Gamma$.

Proof. We introduce $E := U - u$, $\phi := E(T) \in V$, $\hat E = U - \mathcal{I}_q u \in \mathcal{P}^b(I;V)$ and $\theta = \mathcal{I}_q u - u = E - \hat E$. We take $\phi$ in the homogeneous backward problem (4.9). The corresponding solutions of (4.9) and (4.10) are denoted by $z$ and $Z$, respectively. We define $\zeta = z - Z \in H^{1,b}(I;H)$. One cannot apply a straightforward duality argument for the following reason. We (only) have $\mathcal{I}_q B(U - u) = 0$ and thus in general $E \notin L^2(I;V)$. Hence, we cannot use $E$ as a test function in (4.9). To overcome this problem, below we consider a transformed dual problem with endpoint condition $L^{-1}\phi$ (instead of $\phi$) and use a test function $\bar L E$. The precise arguments are as follows. Due to the regularity assumption $Au \in L^2_c(I;(V_0)')$ we have that $\bar L u \in L^2_c(I;V_0)$ is well-defined.
Using the Galerkin property $K(U - u, \cdot) = 0$ in $\mathcal{P}^b(I;V)'$ we have $AU = D_H(u - U, \cdot) + Au$ in $\mathcal{P}^b(I;V)'$.

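Due to the density of $V$ in $V_0$ it follows that $AU \in \mathcal{P}^b(I;(V_0)')$ and thus $\bar L U \in \mathcal{P}^b(I;V_0) \subset L^2_c(I;V_0)$ is well-defined. It follows that $\bar L E \in L^2_c(I;V_0)$. Using this and that $\bar L\theta \in L^2_c(I;V_0)$ is well defined, we find that $L\hat E = \bar L\hat E = \bar L E - \bar L\theta$ is well defined. Therefore, we have $\hat E \in L^2_c(I;V_2)$. Note that due to $E(T) = \hat E(T) \in V_2$ we have $(\bar L E)(T) = \bar L E(T) = LE(T) = L\phi$. The dual problem with endpoint condition $L^{-1}\phi$ has corresponding solutions of (4.9) and (4.10) given by $y := L^{-1}z$ and $Y := L^{-1}Z$. Hence,
\[
(L^{-1}\phi, v)_H = K(v,y) = D_H(v,y) + \int_I a(v,y) = D_H(v,y) + \int_I (v, Ly)_H \quad \forall\, v \in L^2(I;V).
\]
Using the density of $V$ in $V_0$ we obtain
\[
(L^{-1}\phi, v)_H = D_H(v, L^{-1}z) + \int_I (v,z)_H \quad \text{for all } v \in L^2(I;V_0).
\]
We use $v = \bar L E$ as test function, which yields
\[
\|\phi\|_H^2 = (L^{-1}\phi, L\phi)_H = (L^{-1}\phi, \bar L E(T))_H = D_H(\bar L E, L^{-1}z) + \int_I (\bar L E, z)_H. \tag{4.13}
\]
Note that from Lemma 3.2, the Galerkin property $0 = K(E,X)$ for all $X \in \mathcal{P}^b(I;V)$ and $\hat E \in \mathcal{P}^b(I;V_2)$ we get
\begin{align*}
0 = K(E,Z) &= D_H(E,Z) + \int_I a(E,Z) = D_H(\hat E, Z) + \int_I a(E,Z)\\
&= D_H(L\hat E, L^{-1}Z) + \int_I (\bar L E, Z)_H = D_H(\bar L\hat E, L^{-1}Z) + \int_I (\bar L E, Z)_H\\
&= D_H(\bar L E, L^{-1}Z) + \int_I (\bar L E, Z)_H.
\end{align*}
Combining this with (4.13) and using $K(\hat E, \zeta) = 0$ and (3.1), we obtain:
\begin{align*}
\|\phi\|_H^2 &= D_H(\bar L E, L^{-1}\zeta) + \int_I (\bar L E, \zeta)_H\\
&= D_H(\bar L\theta, L^{-1}\zeta) + \int_I (\bar L\theta, \zeta)_H + D_H(\bar L\hat E, L^{-1}\zeta) + \int_I (\bar L\hat E, \zeta)_H\\
&= D_H(\bar L\theta, L^{-1}\zeta) + \int_I (\bar L\theta, \zeta)_H + D_H(L\hat E, L^{-1}\zeta) + \int_I a(\hat E, \zeta)\\
&= D_H(\bar L\theta, L^{-1}\zeta) + \int_I (\bar L\theta, \zeta)_H + D_H(\hat E, \zeta) + \int_I a(\hat E, \zeta)\\
&= D_H(\bar L\theta, L^{-1}\zeta) + \int_I (\bar L\theta, \zeta)_H + K(\hat E, \zeta) = D_H(\bar L\theta, L^{-1}\zeta) + \int_I (\bar L\theta, \zeta)_H.
\end{align*}
Noting that $\theta^n = 0$ for all $n = 1,\dots,N$, we get
\[
\|\phi\|_H^2 = -\int_I (L^{-1}\zeta', \bar L\theta)_H + \int_I (\zeta, \bar L\theta)_H. \tag{4.14}
\]
By the definition of $|\cdot|_l$, we have
\[
\|\phi\|_H^2 \le \int_I \|L^{-\frac{l}{2}}\zeta'\|_H\,|\bar L\theta|_l + \|L^{-\frac{l}{2}+1}\zeta\|_H\,|\bar L\theta|_l.
\]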

Using the Cauchy-Schwarz inequality, we get
\[
\|\phi\|_H^2 \le \Big(\int_I |\bar L\theta|_l^2\Big)^{1/2}\Big(\|L^{-\frac{l}{2}}\zeta'\|_{L^2(I;H)} + \|L^{-\frac{l}{2}+1}\zeta\|_{L^2(I;H)}\Big). \tag{4.15}
\]
We will estimate the second factor in (4.15) by using bounds from Lemma 4.4. First we consider the case that $l$ is odd. Then $m := \frac{l+1}{2}$ is a natural number with $1 \le m \le q$. We then obtain
\begin{align}
\|L^{-\frac{l}{2}+1}\zeta\|_{L^2(I;H)} + \|L^{-\frac{l}{2}}\zeta'\|_{L^2(I;H)} &= \|L^{-(m-1)+\frac12}\zeta\|_{L^2(I;H)} + \|L^{-m+\frac12}\zeta'\|_{L^2(I;H)} \notag\\
&\le C\,k^{m-1}\|\phi\|_H = C\,k^{\frac{l-1}{2}}\|\phi\|_H, \tag{4.16}
\end{align}
and in combination with (4.15) this yields the desired result. For $l$ even we use the following property:
\[
\forall\, x \in V_2:\quad \|L^{1/2}x\|_H^2 = (Lx, x)_H \le \|Lx\|_H\,\|x\|_H.
\]
The term $\|L^{-\frac{l}{2}}\zeta'\|_{L^2(I;H)}$ can be treated as follows, where we use the bound derived in (4.16) and Lemma 4.4:
\[
\|L^{-\frac{l}{2}}\zeta'\|_{L^2(I;H)} \le \|L^{-\frac{l-1}{2}}\zeta'\|_{L^2(I;H)}^{1/2}\,\|L^{-\frac{l+1}{2}}\zeta'\|_{L^2(I;H)}^{1/2} \le C\big(k^{\frac{l-2}{2}}\big)^{1/2}\big(k^{\frac{l}{2}}\big)^{1/2}\|\phi\|_H = C\,k^{\frac{l-1}{2}}\|\phi\|_H.
\]
The term $\|L^{-\frac{l}{2}+1}\zeta\|_{L^2(I;H)}$ can be treated in the same way. Thus also for $l$ even we get the bound as in (4.16).

Results for the projection error $|\bar L u - \mathcal{I}_q \bar L u|_l$ are known in the literature [8, Theorem 3.10]. Using these results we obtain the following optimal discretization error bound.

Theorem 4.6. Assume that the solution $u$ of (2.6) has regularity $u \in L^2_c(I;U) \cap H^m(I;H)$ and $Au \in H^m(I;(V_0)')$. Let $U \in \mathcal{P}^b(I;U)$ be the solution of (3.6)-(3.7). For any $l \le 2q-1$, $1 \le m \le q$, we have
\[
\|u(T) - U(T)\|_H \le c\,k^{\frac{l-1}{2}}\Big(\sum_{n=1}^N k_n^{2m}\int_{I_n}|\bar L u^{(m)}|_l^2\Big)^{1/2} \le c\,k^{\frac{l-1}{2}+m}\Big(\int_I |\bar L u^{(m)}|_l^2\Big)^{1/2} \tag{4.17}
\]
for some $c > 0$ which depends only on $q$, $\gamma$ and $\Gamma$.

Note that for the case of full regularity, i.e., $l = 2q-1$, $m = q$, we obtain the optimal superconvergence bound of order $k^{2q-1}$. For $U = V$ this corresponds to the result in [3, Theorem 1.3]. Our estimate is more general because it states which convergence order one has, depending on the regularity of the solution, without assuming full regularity $l = 2q-1$. For the general case $V \ne U$, we have introduced the appropriate semi-norms $|\cdot|_l$ which express regularity.

5. Error analysis for time-discrete mixed formulation. In this section we derive discretization error bounds for the discrete mixed formulation (3.8)-(3.9).
Let $(u,p)$ be the solution of (2.14) and $(U,P)$ the solution of (3.8)-(3.9), then $u$ and $U$ solve (2.6) and (3.6)-(3.7), respectively, cf. Lemma 3.3. Thus, the optimal error bounds derived in Section 4 also hold for the solution $U$ of the discrete mixed formulation. It remains to bound the error for the Lagrange multiplier. We present two results. The first result gives a sub-optimal bound of order $O(k^{q-\frac12})$, without making any further specific assumptions. In the second part we introduce a certain regularity assumption under which an optimal error bound is derived.

5.1. Sub-optimal result. In this subsection we derive a sub-optimal discretization error bound for the Lagrange multiplier. Below we use the notation introduced in Subsection 2.2, in particular the bilinear form $b_1(\cdot,\cdot)$ and the corresponding operator $B_1 : V \to Q_1'$ with $\ker(B_1) = V_1$. To simplify the presentation we assume that we have a uniform step size $k_n = k$. We start with two lemmas that provide useful bounds for the error term $\mathcal{I}_q u - U$.

Lemma 5.1. Let $k_n = k$ for all $n = 1,\dots,N$. Assume that the solution $u$ of (2.6) has regularity $u \in H^1(I;U)$. Let $U \in \mathcal{P}^b(I;U)$ be the solution of (3.6)-(3.7). The following holds:
\[
\|(\mathcal{I}_q u - U)'\|_{L^2(I;H)} \le c\big(k^{1/2}\|(u - \mathcal{I}_q u)'\|_{L^2(I;U)} + k^{-1/2}\|\mathcal{I}_q u - u\|_{L^2(I;U)}\big)
\]
for some $c > 0$ which depends only on $q$, $\gamma$ and $\Gamma$.

Proof. Let $E = \mathcal{I}_q u - U \in \mathcal{P}^b(I;V)$. From the consistency property (3.1) and Lemma 3.2 we obtain the Galerkin relation
\[
D_H(E,X) + \int_I a(u - U, X) = 0 \quad \text{for all } X \in \mathcal{P}^b(I;V).
\]
For $X := \sum_{n=1}^N (t - t_{n-1})\chi_{I_n}E' \in \mathcal{P}^b(I;V)$ we get
\[
\int_I (E', X)_H + \int_I a(E,X) + \int_I a(u - \mathcal{I}_q u, X) = 0. \tag{5.1}
\]
The $L^1$-norm on $\mathcal{P}_{2q-4}([0,1])$ is equivalent to $\phi \mapsto \int_0^1 t\,|\phi(t)|\,dt$. Thus, by scaling, we get the following inverse inequality
\[
k\int_{I_n}\|E'\|_H^2 \le C\int_{I_n}(t - t_{n-1})\|E'\|_H^2
\]
for each interval $I_n$, for some $C$ which depends only on $q$. Using this and (5.1) we get
\[
k\,\|E'\|_{L^2(I;H)}^2 \le C\int_I (E', X)_H = C\Big(-\int_I a(E,X) - \int_I a(u - \mathcal{I}_q u, X)\Big). \tag{5.2}
\]
Using integration by parts on each interval we obtain
\[
-\int_I a(E,X) = -\sum_{n=1}^N\int_{I_n} a(E, (t - t_{n-1})E') = -\sum_{n=1}^N \frac{k}{2}\,a(E^n, E^n) + \frac12\int_I a(E,E) \le \frac12\,\Gamma\,\|E\|_{L^2(I;U)}^2. \tag{5.3}
\]

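Integration by parts on each interval yields
\begin{align}
\Big|\int_I a(u - \mathcal{I}_q u, X)\Big| &= \Big|\sum_{n=1}^N\int_{I_n} a(u - \mathcal{I}_q u, (t - t_{n-1})E')\Big| \notag\\
&= \Big|-\sum_{n=1}^N\int_{I_n} a((u - \mathcal{I}_q u)', (t - t_{n-1})E) - \int_I a(u - \mathcal{I}_q u, E)\Big| \notag\\
&\le \sum_{n=1}^N\int_{I_n}|t - t_{n-1}|\,|a((u - \mathcal{I}_q u)', E)| + \int_I |a(u - \mathcal{I}_q u, E)| \notag\\
&\le k\,\Gamma\,\|(u - \mathcal{I}_q u)'\|_{L^2(I;U)}\|E\|_{L^2(I;U)} + \Gamma\,\|u - \mathcal{I}_q u\|_{L^2(I;U)}\|E\|_{L^2(I;U)}. \tag{5.4}
\end{align}
Using (5.3) and (5.4) in (5.2) we obtain
\[
k\,\|E'\|_{L^2(I;H)}^2 \le C\big(\|E\|_{L^2(I;U)}^2 + \|u - \mathcal{I}_q u\|_{L^2(I;U)}^2 + k^2\|(u - \mathcal{I}_q u)'\|_{L^2(I;U)}^2\big),
\]
with a constant $C$ which depends only on $q$ and $\Gamma$. Using the triangle inequality and Theorem 4.1 completes the proof.

Lemma 5.2. Let $k_n = k$ for all $n = 1,\dots,N$. Assume that the solution $u$ of (2.6) has regularity $u \in H^1(I;U)$. Let $U \in \mathcal{P}^b(I;U)$ be the solution of (3.6)-(3.7). The following holds:
\[
k^{-1/2}\Big(\|(\mathcal{I}_q u - U)^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[\mathcal{I}_q u - U]^n\|_H^2\Big)^{1/2} \le c\big(k^{1/2}\|(u - \mathcal{I}_q u)'\|_{L^2(I;U)} + k^{-1/2}\|\mathcal{I}_q u - u\|_{L^2(I;U)}\big)
\]
for some $c > 0$ which depends only on $q$, $\gamma$ and $\Gamma$.

Proof. Let $E := \mathcal{I}_q u - U \in \mathcal{P}^b(I;V)$. Let $E_n = \chi_{I_n}E$ for all $n = 1,\dots,N$. Take for $n \ge 2$, $\tilde E_n \in \mathcal{P}_{q-1}(I_n;U)$ such that $\tilde E_n(t_{n-1} + t) = E_{n-1}(t_{n-1} - t)$ for $t \in (0,k)$. Take $\tilde E_1 = 0$ and define $V_n := E_n - \tilde E_n$ for all $n$. Note that $(V_1)^0_+ = E^0_+$ and $(V_n)^{n-1}_+ = [E]^{n-1}$ for $n = 2,\dots,N$. For all $n = 1,\dots,N$ and for $t \in I_n$ we have
\begin{align}
\|V_n(t)\|_H &\le \|(V_n)^{n-1}_+\|_H + \int_{t_{n-1}}^t \|V_n'\|_H \le \|(V_n)^{n-1}_+\|_H + \int_{I_n}\|V_n'\|_H \notag\\
&\le \|(V_n)^{n-1}_+\|_H + \int_{I_n}\big(\|E_n'\|_H + \|\tilde E_n'\|_H\big) \notag\\
&\le \|(V_n)^{n-1}_+\|_H + k^{1/2}\big(\|E'\|_{L^2(I_n;H)} + \|E'\|_{L^2(I_{n-1};H)}\big), \tag{5.5}
\end{align}
with $I_0 := \emptyset$. From the consistency property (3.1) and Lemma 3.2 we obtain the Galerkin relation
\[
D_H(E,X) + \int_I a(u - U, X) = 0 \quad \text{for all } X \in \mathcal{P}^b(I;V).
\]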

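If we take $X = \sum_{n=1}^N V_n$, we obtain
\[
\|E^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[E]^n\|_H^2 = -\int_I (E', X)_H - \int_I a(u - U, X). \tag{5.6}
\]
Using (5.5) we obtain for the first term on the right-hand side:
\begin{align*}
\Big|\int_I (E', X)_H\Big| &\le \sum_{n=1}^N\int_{I_n}|(E', V_n)_H| \le \sum_{n=1}^N \|E'\|_{L^1(I_n;H)}\|V_n\|_{L^\infty(I_n;H)}\\
&\le \sum_{n=1}^N k^{1/2}\|E'\|_{L^2(I_n;H)}\Big(\|(V_n)^{n-1}_+\|_H + k^{1/2}\big(\|E'\|_{L^2(I_n;H)} + \|E'\|_{L^2(I_{n-1};H)}\big)\Big)\\
&\le \frac{k}{2}\|E'\|_{L^2(I;H)}^2 + \frac12\|E^0_+\|_H^2 + \frac12\sum_{n=1}^{N-1}\|[E]^n\|_H^2 + 2k\,\|E'\|_{L^2(I;H)}^2.
\end{align*}
The second term can be bounded as follows:
\begin{align*}
\Big|\int_I a(u - U, X)\Big| &\le \sum_{n=1}^N \Gamma\,\|u - U\|_{L^2(I_n;U)}\|V_n\|_{L^2(I_n;U)}\\
&\le \sum_{n=1}^N \Gamma\,\|u - U\|_{L^2(I_n;U)}\big(\|E_n\|_{L^2(I_n;U)} + \|\tilde E_n\|_{L^2(I_n;U)}\big)\\
&\le \Gamma\,\|u - U\|_{L^2(I;U)}^2 + \Gamma\,\|E\|_{L^2(I;U)}^2.
\end{align*}
Using these bounds in (5.6) we get
\[
k^{-1}\Big(\|(\mathcal{I}_q u)^0_+ - U^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[E]^n\|_H^2\Big) \le C\Big(\|E'\|_{L^2(I;H)}^2 + k^{-1}\big(\|E\|_{L^2(I;U)}^2 + \|u - U\|_{L^2(I;U)}^2\big)\Big)
\]
with a constant $C$ which depends only on $\Gamma$. Using a triangle inequality and the results in Lemma 5.1 and Theorem 4.1, we obtain the desired result.

Theorem 5.3. Let $k_n = k$ for all $n = 1,\dots,N$. Assume that the solution $(u,p)$ of (2.14) has regularity $u \in H^1(I;U)$. Let $(U,P)$, with $U \in \mathcal{P}^b(I;U)$, be the solution of (3.8)-(3.9). Let $\mathcal{I}_q^{L^2}$ be the $L^2$-orthogonal projection from $L^2(I)$ to $\mathcal{P}^b(I)$. The following holds:
\[
\|p - P\|_{L^2(I;Q_1)} \le c\big(k^{1/2}\|(\mathcal{I}_q u - u)'\|_{L^2(I;U)} + k^{-1/2}\|\mathcal{I}_q u - u\|_{L^2(I;U)}\big) + \|p - \mathcal{I}_q^{L^2}p\|_{L^2(I;Q_1)}
\]
with a constant $c$ which depends only on $q$, $\gamma$, $\Gamma$, $\beta$ from (2.11) and $c_H := \sup_{x \in U}\frac{\|x\|_H}{\|x\|_U}$.

Proof. From the consistency property (3.14) we obtain the Galerkin relation
\[
D_H(u - U, X) + \int_I a(u - U, X) + \int_I b_1(X, p - P) = 0 \quad \text{for all } X \in \mathcal{P}^b(I;V_2).
\]
Let $\bar P = \mathcal{I}_q^{L^2}p$ be the orthogonal projection of $p \in L^2(I;Q_1)$ to $\mathcal{P}^b(I;Q_1)$. From the inf-sup property (2.11) it follows that for all $t \in I$ there exists a unique $\hat V(t) \in V_2 \cap V_1^{\perp_{V_2}}$ such that
\[
b_1(\hat V(t), q) = (\bar P(t) - P(t), q)_{Q_1} \quad \text{for all } q \in Q_1 \quad \text{and} \quad \|\hat V(t)\|_U \le \frac{1}{\beta}\|\bar P(t) - P(t)\|_{Q_1}.
\]
This, $\bar P - P \in \mathcal{P}^b(I;Q_1)$ and a tensor-product argument imply $\hat V \in \mathcal{P}^b(I;V_2)$. Let $E = \mathcal{I}_q u - U$. By Lemma 3.2 we have
\[
D_H(E, \hat V) + \int_I a(u - U, \hat V) + \int_I b_1(\hat V, p - P) = 0. \tag{5.7}
\]
We first consider the first term of (5.7). Using the continuous embedding $U \hookrightarrow H$ we get
\[
\Big|\int_I (E', \hat V)_H\Big| \le \|E'\|_{L^2(I;H)}\|\hat V\|_{L^2(I;H)} \le c_H\,\|E'\|_{L^2(I;H)}\|\hat V\|_{L^2(I;U)},
\]
and using the inverse estimate $\|\hat V^n_+\|_U \le C\,k^{-1/2}\|\hat V\|_{L^2(I_{n+1};U)}$, with a constant $C$ which depends only on $q$, we get
\begin{align*}
\Big|(E^0_+, \hat V^0_+)_H + \sum_{n=1}^{N-1}([E]^n, \hat V^n_+)_H\Big| &\le c_H\,\|E^0_+\|_H\|\hat V^0_+\|_U + c_H\sum_{n=1}^{N-1}\|[E]^n\|_H\|\hat V^n_+\|_U\\
&\le C c_H k^{-1/2}\|E^0_+\|_H\|\hat V\|_{L^2(I_1;U)} + C c_H k^{-1/2}\sum_{n=1}^{N-1}\|[E]^n\|_H\|\hat V\|_{L^2(I_{n+1};U)}\\
&\le C c_H k^{-1/2}\Big(\|E^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[E]^n\|_H^2\Big)^{1/2}\|\hat V\|_{L^2(I;U)}.
\end{align*}
For the second term, we have
\[
\Big|\int_I a(u - U, \hat V)\Big| \le \Gamma\,\|u - U\|_{L^2(I;U)}\|\hat V\|_{L^2(I;U)}.
\]
For the third term we have
\[
\|\bar P - P\|_{L^2(I;Q_1)}^2 = \int_I (\bar P - P, \bar P - P)_{Q_1} = \int_I (\bar P - P, p - P)_{Q_1} = \int_I b_1(\hat V, p - P).
\]
Combining these results with the definition of $\hat V$, we get
\begin{align*}
\|\bar P - P\|_{L^2(I;Q_1)}^2 &= \int_I b_1(\hat V, p - P) = -D_H(E, \hat V) - \int_I a(u - U, \hat V)\\
&\le c\Big(\|E'\|_{L^2(I;H)} + k^{-1/2}\Big(\|E^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[E]^n\|_H^2\Big)^{1/2} + \|u - U\|_{L^2(I;U)}\Big)\|\hat V\|_{L^2(I;U)}\\
&\le \frac{c}{\beta}\Big(\|E'\|_{L^2(I;H)} + k^{-1/2}\Big(\|E^0_+\|_H^2 + \sum_{n=1}^{N-1}\|[E]^n\|_H^2\Big)^{1/2} + \|u - U\|_{L^2(I;U)}\Big)\|\bar P - P\|_{L^2(I;Q_1)}.
\end{align*}
Combining this with the previous two lemmas, Theorem 4.1 and the triangle inequality completes the proof.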
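The last term in the bound of Theorem 5.3 is the error of the $L^2$-orthogonal projection in time onto piecewise polynomials. The following minimal sketch, for a scalar-valued function instead of a $Q_1$-valued one, computes this projection error on a uniform grid with polynomials of degree $q-1$ per interval and estimates the convergence rate. The function names, the choice $p(t) = \sin t$ and the quadrature order are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def l2_projection_error(p, T, n_steps, q, n_quad=12):
    """L2(0,T) error of the piecewise L2 projection of p onto degree q-1 polynomials."""
    x, w = leg.leggauss(n_quad)            # Gauss-Legendre rule on [-1, 1]
    k = T / n_steps
    err_sq = 0.0
    for n in range(n_steps):
        t = (n + 0.5) * k + 0.5 * k * x    # quadrature nodes mapped to the n-th interval
        vals = p(t)
        proj = np.zeros_like(vals)
        for j in range(q):                 # Legendre polynomials of degree 0, ..., q-1
            coeffs = np.zeros(j + 1)
            coeffs[j] = 1.0
            Pj = leg.legval(x, coeffs)
            # Legendre coefficient: (2j+1)/2 * int_{-1}^{1} p * P_j
            cj = 0.5 * (2 * j + 1) * np.sum(w * vals * Pj)
            proj += cj * Pj
        err_sq += 0.5 * k * np.sum(w * (vals - proj) ** 2)
    return np.sqrt(err_sq)

def observed_rate(q, T=1.0, n_steps=8):
    """Rate estimated from halving the step size once."""
    e_coarse = l2_projection_error(np.sin, T, n_steps, q)
    e_fine = l2_projection_error(np.sin, T, 2 * n_steps, q)
    return np.log2(e_coarse / e_fine)

if __name__ == "__main__":
    for q in (1, 2, 3):
        print(f"q = {q}: observed rate = {observed_rate(q):.2f}")
```

For a smooth function the estimated rate should be close to $q$, in agreement with the standard projection error bounds of order $k^m$ for $m = q$.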

Results for the projection error $\|p - \mathcal{I}_q^{L^2}p\|_{L^2(I;Q_1)}$ with respect to the orthogonal projection are standard. Results for the projection errors $\|(\mathcal{I}_q u - u)'\|_{L^2(I;U)}$ and $\|\mathcal{I}_q u - u\|_{L^2(I;U)}$ are also standard, cf. Subsection 4.1. Using these we obtain the following sub-optimal discretization error bound.

Theorem 5.4. Let $(u,p)$ be the solution of (2.14) and $(U,P)$ the solution of (3.8)-(3.9). Assume that the solution $(u,p)$ has smoothness properties $u \in H^m(I;U)$, $p \in H^m(I;Q_1)$, for an $m$ with $1 \le m \le q$. The following holds:
\[
\|p - P\|_{L^2(I;Q_1)} \le c\,k^{m-\frac12}\big(\|u^{(m)}\|_{L^2(I;U)} + k^{1/2}\|p^{(m)}\|_{L^2(I;Q_1)}\big) \tag{5.8}
\]
with a constant $c$ which depends only on $q$, $\gamma$, $\Gamma$, $\beta$.

5.2. Optimal result. Below we again use the notation introduced in Subsection 2.2, in particular the bilinear form $b_1(\cdot,\cdot)$ and the corresponding operator $B_1 : U \to Q_1'$, with $\ker(B_1) = V_1$. We introduce the $H$-orthogonal projection $P_H : H \to \overline{V}^H =: V_H$ onto the closed subspace $V_H$ of $H$, and define $P_H^\perp := \mathrm{id}_H - P_H : H \to H$. The following (regularity type) assumption on $P_H$ is crucial for our analysis.

Assumption 5.1. $P_H V_2 \subset V_1$ and there exists a constant $c_P$ such that $\|P_H v\|_U \le c_P\|v\|_U$ for all $v \in V_2$.

In the remainder of this subsection we assume that Assumption 5.1 is satisfied. Using this assumption we derive an optimal discretization error bound for the Lagrange multiplier.

Example 5.1. We consider this assumption for Example 2.3. In this case we have $U = H^1(\Omega)^d$, $H = L^2(\Omega)^d$, $V_1 = \{\, u \in H^1(\Omega)^d \mid \operatorname{div} u = 0 \,\}$, $V_2 = H^1_0(\Omega)^d$ and $V = V_1 \cap V_2 = \{\, u \in H^1_0(\Omega)^d \mid \operatorname{div} u = 0 \,\}$. The space $V_H$ is given by $V_H = \overline{V}^H = \{\, u \in L^2(\Omega)^d \mid \operatorname{div} u = 0,\ u \cdot n = 0 \text{ on } \partial\Omega \,\}$, where $n$ denotes the normal of $\partial\Omega$, cf. [15, p. 9]. The projection $P_H$ (also known as the Leray projector) can be characterized as follows. For given $u \in L^2(\Omega)^d$, $P_H u = u + \nabla q$, with $q$ such that $(u + \nabla q, \nabla w)_{L^2(\Omega)^d} = 0$ for all $w \in H^1(\Omega)/\mathbb{R}$. When $u \in H^1_0(\Omega)^d$, this $q$ solves the Poisson problem with homogeneous Neumann boundary conditions:
\[
-\Delta q = \operatorname{div} u \ \text{ on } \Omega, \qquad \frac{\partial q}{\partial n} = 0 \ \text{ on } \partial\Omega, \qquad \int_\Omega q = 0.
\]
We assume $H^2$-regularity of this problem, hence,
\[
\|\nabla q\|_{H^1(\Omega)^d} \le \|q\|_{H^2(\Omega)} \le c\,\|\operatorname{div} u\|_{L^2(\Omega)} \le c\,\|u\|_{H^1(\Omega)^d},
\]
which implies that $P_H : H^1_0(\Omega)^d \to H^1(\Omega)^d$ is continuous. This and $\operatorname{div} P_H u = \operatorname{div} u + \Delta q = 0$ imply that Assumption 5.1 holds. This $H^2$-regularity assumption, which holds when $\Omega$ is convex or has a $C^2$-boundary, is exactly the same as in [18, Assumption 3.1]. There it is used in the well-posedness analysis of time-dependent