A PRIORI ERROR ESTIMATES FOR FINITE ELEMENT DISCRETIZATIONS OF PARABOLIC OPTIMIZATION PROBLEMS WITH POINTWISE STATE CONSTRAINTS IN TIME

DOMINIK MEIDNER, ROLF RANNACHER, AND BORIS VEXLER

Dominik Meidner: Lehrstuhl für Mathematische Optimierung, Technische Universität München, Fakultät für Mathematik, Boltzmannstraße 3, 85748 Garching b. München, Germany (meidner@ma.tum.de)
Rolf Rannacher: Institut für Angewandte Mathematik, Ruprecht-Karls-Universität Heidelberg, INF 294, 69120 Heidelberg, Germany (rannacher@iwr.uni-heidelberg.de)
Boris Vexler: Lehrstuhl für Mathematische Optimierung, Technische Universität München, Fakultät für Mathematik, Boltzmannstraße 3, 85748 Garching b. München, Germany (vexler@ma.tum.de)

Abstract. In this paper, we consider an optimal control problem, which is governed by a linear parabolic equation and is subject to state constraints pointwise in time. Optimal order error estimates are developed for a space-time finite element discretization of this problem. Numerical examples confirm the theoretical results. As a byproduct of our analysis, we derive a new regularity result for the optimal control.

Key words. optimal control, heat equation, control constraints, state constraints, finite elements, a priori error estimates

AMS subject classifications. 49J20, 35K20, 49M05, 49M15, 49M25, 49M29, 65M12, 65M50, 65M60

1. Introduction. In this paper, we consider the following optimal control problem governed by the heat equation and subject to control and state constraints:

Minimize
\[ \frac{1}{2} \int_0^T \int_\Omega \bigl(u(t,x) - \hat u(t,x)\bigr)^2 \, dx \, dt + \frac{\alpha}{2} \int_0^T \int_\Omega q(t,x)^2 \, dx \, dt, \tag{1.1a} \]
subject to the equation constraints
\[ \partial_t u - \Delta u = f + q \ \text{in } (0,T) \times \Omega, \qquad u = 0 \ \text{on } (0,T) \times \partial\Omega, \qquad u = u_0 \ \text{in } \{0\} \times \Omega, \tag{1.1b} \]
the control constraints
\[ q_a \le q(t,x) \le q_b \quad \text{a.e. in } (0,T) \times \Omega \tag{1.1c} \]
for $q_a, q_b \in \mathbb{R}$, and the state constraint
\[ \int_\Omega u(t,x)\,\omega(x) \, dx \le b \quad \text{in } [0,T] \tag{1.1d} \]
for given $\omega \in L^2(\Omega)$ and $b \in \mathbb{R}$.

The precise functional analytic setting of (1.1) is formulated in Section 2 below. Here, $q$ denotes the (distributed) control and $u$ the state variable. The cost functional (1.1a) is a quadratic functional of tracking type, and the control $q$ enters the state equation (1.1b) via the right-hand side. Besides the box constraints (1.1c) on the control variable, we consider the state constraints (1.1d), which are integrated in space and are understood pointwise in time.

Parabolic optimal control problems with state constraints formulated pointwise in space and time, i.e.,
\[ u(t,x) \le b \quad \text{for all } (t,x) \in [0,T] \times \Omega, \tag{1.2} \]

are discussed in several publications; see, e.g., Casas [2] and Raymond & Zidani [25] for the corresponding optimality conditions and Neitzel & Tröltzsch [21, 22] for regularization issues. The case of spatially integrated state constraints (1.1d) serves as a model for applications in which a constraint formulated as a spatial functional (for instance, drag or lift coefficients in CFD) should hold continuously in time. Optimal control problems of this type are considered in Goldberg & Tröltzsch [14] and Bonnans & Jaisson [1]. In these publications necessary and sufficient optimality conditions as well as regularity results are discussed.

The main goal of this paper is to provide an a priori error analysis for a finite element discretization of the parabolic optimal control problem under consideration. To this end, we follow the strategy developed in Meidner & Vexler [18, 19], where optimal control problems are analyzed in the absence of state constraints. We consider a discontinuous Galerkin scheme, the dg(0) method, for temporal discretization, conforming (bi-/tri-)linear finite elements for spatial discretization, and cellwise constants for the discretization of the control variable; see Section 3 for details.

The main difficulty in the numerical analysis of optimal control problems with state constraints is the lack of regularity caused by the fact that the Lagrange multiplier corresponding to the state constraint (1.1d) is a Borel measure $\mu \in C([0,T])^*$. This affects the regularity of the adjoint state and of the optimal control $\bar q$. In particular, the lack of temporal regularity complicates the derivation of a priori error estimates for the corresponding finite element discretization.

Error estimates for optimal control problems with state constraints governed by elliptic equations are derived in several publications. In Casas [3] error estimates are given for an optimal control problem with finitely many state constraints. In Deckelnick & Hinze [7, 8] error estimates of order $h^{1-\varepsilon}$ in 2d and $h^{1/2-\varepsilon}$ in 3d are derived for a problem with pointwise state constraints. A similar result is obtained in Meyer [20] with a different technique that avoids the consideration of Lagrange multipliers on the discrete level. The latter technique is extended to problems governed by the Stokes equations in Reyes, Meyer & Vexler [5]. The publications Deckelnick, Günther & Hinze [6] and Ortner & Wollner [23] are devoted to problems with pointwise state constraints on the gradient of the state.

We denote by $k$ the maximum step size in the temporal discretization and by $h$ the maximum cell size of the spatial mesh. The main result of this paper is the following estimate of the error between the optimal solution $\bar q$ of the continuous problem and the optimal solution $\bar q_\sigma$ of the discrete one:
\[ \|\bar q - \bar q_\sigma\|_{L^2(0,T;L^2(\Omega))} \le \frac{C}{\alpha} \Bigl( \ln \frac{T}{k} \Bigr)^{1/2} \bigl\{ k^{1/2} + h \bigr\}. \tag{1.3} \]
This is to be compared to related results in Deckelnick & Hinze [9] for problem (1.1), but with state constraints pointwise in space and time ((1.2) instead of (1.1d)), which are of the lower order $O\bigl(|\ln h|^{1/4} (h^{1/2} + k^{1/4})\bigr)$ in 2d and $O\bigl(h^{1/4} + h^{-1/4} k^{1/4}\bigr)$ in 3d.

One of the essential tools for the proof of the estimate (1.3) is a set of error estimates with respect to the $L^\infty(0,T;L^2(\Omega))$-norm for the state equation with low regularity of the data. The derivation of these estimates (see Section 5) is based on the techniques from Luskin & Rannacher [17] and Rannacher [24].

The paper is organized as follows: In the next section the optimal control problem is precisely formulated on the continuous level and optimality conditions are discussed. In Section 3 the three steps of discretization, i.e., temporal, spatial, and control discretization, are described. In Section 4, we provide some stability estimates, which

are needed in the following analysis. Section 5 is devoted to error estimates for the state equation with respect to the $L^\infty(0,T;L^2(\Omega))$ norm. The main result (1.3) is proved in Section 6. As a byproduct of our error analysis, we obtain a new regularity result for the optimal control $\bar q$ in Section 7. In the last section, Section 8, we present a numerical example illustrating our theoretical results.

2. Continuous problem. To set up a weak formulation of the state equation (1.1b), we introduce the following notation. For a convex polygonal or polyhedral domain $\Omega \subset \mathbb{R}^n$, $n \in \{2, 3\}$, we denote by $V$ the Sobolev space $H_0^1(\Omega)$. Together with $H = L^2(\Omega)$, the Hilbert space $V$ and its dual $V^*$ form a Gelfand triple $V \hookrightarrow H \hookrightarrow V^*$. Here and in what follows, we employ the usual notation for Lebesgue and Sobolev spaces.

For a time interval $I = (0,T)$, we introduce the state space
\[ X := \bigl\{ v : I \times \Omega \to \mathbb{R} \bigm| v \in L^2(I,V) \text{ and } \partial_t v \in L^2(I,V^*) \bigr\} \]
and the control space $Q = L^2(I,H)$.

Remark 2.1. By obvious modifications, the error analysis derived below also applies to the case of finitely many (time-dependent) parameters instead of distributed control, i.e., for control spaces $Q$ chosen as $Q = \mathbb{R}^l$ or $Q = L^2(I,\mathbb{R}^l)$ for $l \in \mathbb{N}$.

We use the following notation for the inner products and norms on $L^2(\Omega)$ and $L^2(I,L^2(\Omega))$:
\[ (v,w) := (v,w)_{L^2(\Omega)}, \quad \|v\| := \|v\|_{L^2(\Omega)}, \quad (v,w)_I := (v,w)_{L^2(I,L^2(\Omega))}, \quad \|v\|_I := \|v\|_{L^2(I,L^2(\Omega))}. \]
Further, we write
\[ \|v\|_{H^{-1}(\Omega)} := \|\nabla \Delta^{-1} v\|, \quad \|v\|_{L^2(I,H^{-1}(\Omega))} := \|\nabla \Delta^{-1} v\|_I, \quad \|v\|_{H^{-2}(\Omega)} := \|\Delta^{-1} v\|, \quad \|v\|_{L^2(I,H^{-2}(\Omega))} := \|\Delta^{-1} v\|_I \]
for the norms of the dual spaces $H^{-1}(\Omega)$, $H^{-2}(\Omega)$, $L^2(I,H^{-1}(\Omega))$, and $L^2(I,H^{-2}(\Omega))$, respectively.

In this setting, the weak formulation of the state equation (1.1b) for given $q \in Q$, $f \in L^2(I,H)$, and $u_0 \in H$ reads as follows: Find a state $u \in X$ satisfying
\[ (\partial_t u, \varphi)_I + (\nabla u, \nabla \varphi)_I = (f + q, \varphi)_I \quad \forall \varphi \in X, \qquad u(0) = u_0. \tag{2.1} \]

Assumption 1. Throughout, we assume the data $f$ and $u_0$ to exhibit the higher regularity $f \in L^\infty(I,L^2(\Omega))$ and $u_0 \in H^2(\Omega) \cap V$.

To formulate the optimal control problem, we account for the control constraint (1.1c) by introducing the admissible set $Q_{ad}$ as
\[ Q_{ad} := \bigl\{ q \in Q \bigm| q_a \le q(t,x) \le q_b \text{ a.e. in } I \times \Omega \bigr\}, \]
where the bounds $q_a, q_b \in \mathbb{R}$ fulfill $q_a < q_b$. Furthermore, for the given weight $\omega \in H$, we define the functional $G \colon H \to \mathbb{R}$ by
\[ G(v) := (v, \omega). \]

The application of $G$ to time-dependent functions $u \colon I \to H$ is defined by setting $G(u)(t) := G(u(t))$. The state constraint (1.1d) is then formulated as
\[ G(u) \le b \quad \text{in } \bar I. \tag{2.2} \]

Remark 2.2. For $u \in X$, we have $G(u)(\cdot) \in C(\bar I)$ by construction and due to the continuous embedding $X \hookrightarrow C(\bar I, H)$.

With the cost functional $J \colon Q \times L^2(I,H) \to \mathbb{R}$ defined as
\[ J(q,u) := \frac{1}{2} \|u - \hat u\|_I^2 + \frac{\alpha}{2} \|q\|_I^2, \]
the weak formulation of the optimal control problem (1.1) reads as
\[ \text{Minimize } J(q,u) \text{ for } (q,u) \in Q_{ad} \times X \text{ subject to (2.1) and (2.2)}, \tag{2.3} \]
where $\hat u \in L^2(I,H)$ is the target state and $\alpha > 0$ the regularization parameter.

Assumption 2. Throughout, we assume the following Slater condition to be satisfied:
\[ \exists\, \tilde q \in Q_{ad} : \quad G(u(\tilde q)) < b \quad \text{in } \bar I, \tag{2.4} \]
where $u(\tilde q)$ is the solution of (2.1) for the particular control $\tilde q$.

Remark 2.3. In view of the initial condition $u(0) = u_0 \in H$, the relation $G(u_0) < b$ is necessary for the assumed Slater condition to be satisfied.

By standard arguments the feasibility of the Slater point $\tilde q$ ensures the existence and uniqueness of optimal solutions to the considered problem (2.3). To formulate necessary optimality conditions, we employ the dual space of $C(\bar I)$, denoted by $C(\bar I)^*$, with its natural norm
\[ \|\mu\|_{C(\bar I)^*} = \sup \bigl\{ \langle v, \mu \rangle \bigm| v \in C(\bar I),\ \|v\|_{C(\bar I)} \le 1 \bigr\}, \]
where the duality product $\langle \cdot, \cdot \rangle$ between $C(\bar I)$ and $C(\bar I)^*$ is given by
\[ \langle v, \mu \rangle := \int_{\bar I} v \, d\mu. \]

Theorem 2.4. A control $\bar q \in Q_{ad}$ with associated state $\bar u = u(\bar q)$ is an optimal solution of problem (2.3) if and only if $G(\bar u) \le b$ and there exist an adjoint state $\bar z \in L^2(I,V)$ and a Lagrange multiplier $\bar\mu \in C(\bar I)^*$ with $\bar\mu \ge 0$ such that
\[ (\partial_t \varphi, \bar z)_I + (\nabla \varphi, \nabla \bar z)_I = (\varphi, \bar u - \hat u)_I + \langle G(\varphi), \bar\mu \rangle \quad \forall \varphi \in X,\ \varphi(0) = 0, \tag{2.5} \]
\[ (\alpha \bar q + \bar z, q - \bar q)_I \ge 0 \quad \forall q \in Q_{ad}, \tag{2.6} \]
\[ \langle b - G(\bar u), \bar\mu \rangle = 0. \tag{2.7} \]

Proof. For given $u_0 \in H$ and $f \in L^2(I,H)$ the state equation (2.1) defines a continuous affine linear mapping $q \mapsto u$ from $Q$ to $X$. Referring to [12], this mapping can be extended to a continuous affine linear mapping $\tilde S \colon L^2(I,V^*) \to X$. We denote the concatenation of $\tilde S$ with the embedding $X \hookrightarrow L^2(I,H)$ by $S$. Since $G$ is a continuous linear mapping from $X$ to $C(\bar I)$ (cf. Remark 2.2), we can define $\mathcal{G} \colon L^2(I,V^*) \to C(\bar I)$ by $\mathcal{G} := G \circ \tilde S$. Furthermore, we define $K \subset C(\bar I)$ by
\[ K := \bigl\{ v \in C(\bar I) \bigm| v \le b \text{ in } \bar I \bigr\}. \]

These definitions enable us to embed (2.3) into the following abstract setting of optimization problems (cf., e.g., [15]):
\[ \text{Minimize } j(q) := J(q, S(q)) \text{ for } q \in Q_{ad} \text{ subject to } \mathcal{G}(q) \in K. \]
Then, by the generalized KKT theory (see, e.g., [16, 30]), the assumed Slater condition (2.4) (which postulates the existence of $\tilde q \in Q_{ad}$ such that $\mathcal{G}(\tilde q) \in \operatorname{int} K$) implies that the optimality of $\bar q$ is equivalent to the existence of a Lagrange multiplier $\bar\mu \in C(\bar I)^*$ and an adjoint state
\[ \bar z = S'(\bar q)^* (S(\bar q) - \hat u) + \mathcal{G}'(\bar q)^* \bar\mu \in L^2(I,V) \]
fulfilling
\[ (\alpha \bar q + \bar z, q - \bar q)_I \ge 0 \quad \forall q \in Q_{ad} \qquad \text{and} \qquad \langle v - \mathcal{G}(\bar q), \bar\mu \rangle \le 0 \quad \forall v \in K. \]
Recalling the definitions of $S$ and $\mathcal{G}$, we finally obtain that the derived expression for $\bar z$ is equivalent to $\bar z$ being the solution of (2.5), and we get the equivalence
\[ \langle v - \mathcal{G}(\bar q), \bar\mu \rangle \le 0 \quad \forall v \in K \qquad \Longleftrightarrow \qquad \bar\mu \ge 0 \ \text{ and } \ \langle b - G(\bar u), \bar\mu \rangle = 0. \]
This completes the proof.

Remark 2.5. The variational inequality (2.6) can be equivalently rewritten using the pointwise projection $P_{Q_{ad}}$ onto the set of admissible controls $Q_{ad}$ as follows:
\[ \bar q = P_{Q_{ad}} \bigl( -\alpha^{-1} \bar z \bigr). \]
Therefore, the regularity $\bar z \in L^2(I,V)$ implies $\bar q \in L^2(I,H^1(\Omega)) \cap L^\infty(I \times \Omega)$. In Section 7 we will provide a stronger regularity result for the optimal control $\bar q$.

3. Discretization. In this section we describe the space-time finite element discretization of the optimal control problem (2.3).

3.1. Semidiscretization in time. At first, we define the semidiscretization in time of the state equation by discontinuous Galerkin methods, cf. [10, 18]. To this end, we consider a partitioning of the time interval $\bar I = [0,T]$ such as
\[ \bar I = \{0\} \cup I_1 \cup I_2 \cup \dots \cup I_M \tag{3.1} \]
with subintervals $I_m = (t_{m-1}, t_m]$ of size $k_m$ and time points
\[ 0 = t_0 < t_1 < \dots < t_{M-1} < t_M = T. \]
The discretization parameter $k$ is viewed as a piecewise constant function by setting $k|_{I_m} = k_m$ for $m = 1, 2, \dots, M$. The maximum size of the time steps is also denoted by $k$, i.e., $k = \max_{m=1,2,\dots,M} k_m$. We impose the following conditions on the time mesh:

(i) There are constants $c, \gamma > 0$ such that $\min_{m=1,2,\dots,M} k_m \ge c\, k^\gamma$.

(ii) There is a constant $\kappa > 0$ such that for all $m = 1, 2, \dots, M-1$
\[ \kappa^{-1} \le \frac{k_m}{k_{m+1}} \le \kappa. \]

(iii) It holds $k \le \frac{1}{4} T$.

The semidiscrete trial and test spaces are defined as
\[ X_k^r = \bigl\{ v_k \in L^2(I,V) \bigm| v_k|_{I_m} \in P_r(I_m,V),\ m = 1, 2, \dots, M \bigr\}. \]
Here, $P_r(I_m,V)$ is the space of polynomials of maximum degree $r$ defined on $I_m$ with values in $V$. On $X_k^r$ we use the notation
\[ (v,w)_{I_m} := (v,w)_{L^2(I_m,L^2(\Omega))}, \qquad \|v\|_{I_m} := \|v\|_{L^2(I_m,L^2(\Omega))}. \]
To define the discontinuous Galerkin (abbreviated as dg(r)) approximation using the space $X_k^r$, we employ the following notation for functions $v_k \in X_k^r$:
\[ v_{k,m}^+ := \lim_{t \to 0^+} v_k(t_m + t), \qquad v_{k,m}^- := \lim_{t \to 0^+} v_k(t_m - t) = v_k(t_m), \qquad [v_k]_m := v_{k,m}^+ - v_{k,m}^-, \]
and define the bilinear form $B(\cdot,\cdot)$ for arguments $u_k, \varphi \in X_k^r$ by
\[ B(u_k,\varphi) := \sum_{m=1}^M (\partial_t u_k, \varphi)_{I_m} + (\nabla u_k, \nabla \varphi)_I + \sum_{m=2}^M ([u_k]_{m-1}, \varphi_{m-1}^+) + (u_{k,0}^+, \varphi_0^+). \tag{3.2} \]
Then, the dg(r) semidiscretization of the state equation (2.1) for given control $q \in Q$ reads as follows: Find a state $u_k = u_k(q) \in X_k^r$ satisfying
\[ B(u_k,\varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_k^r. \tag{3.3} \]
The existence and uniqueness of solutions to (3.3) can be shown by using Fourier analysis; see [27] for details.

Remark 3.1. Using a density argument it is possible to show that the exact solution $u = u(q) \in X$ of the state equation (2.1) satisfies the identity
\[ B(u,\varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_k^r. \]
Thus, the dg(r) time discretization satisfies the Galerkin orthogonality equation
\[ B(u - u_k, \varphi) = 0 \quad \forall \varphi \in X_k^r. \]

Throughout the paper, we restrict ourselves to the lowest-order case $r = 0$, i.e., piecewise constant approximation in time. The resulting dg(0) scheme is a variant of the implicit Euler method. Because of this, the notation for the discontinuous piecewise constant functions $v_k \in X_k^0$ can be simplified. Setting $v_{k,m} := v_{k,m}^-$, we have $v_{k,m}^+ = v_{k,m+1}$ and $[v_k]_m = v_{k,m+1} - v_{k,m}$.

Since $u_k \in X_k^0$ is piecewise constant in time, the state constraint $G(u_k) \le b$ can be written in the form of finitely many constraints,
\[ G(u_k)|_{I_m} \le b \quad \text{for } m = 1, 2, \dots, M. \tag{3.4} \]
Then, for the dg(0) time discretization the semidiscrete optimization problem has the following form:
\[ \text{Minimize } J(q_k, u_k) \text{ for } (q_k, u_k) \in Q_{ad} \times X_k^0 \text{ subject to (3.3) and (3.4)}. \tag{3.5} \]
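The three conditions (i)-(iii) imposed on the time mesh in Section 3.1 can be checked numerically for a concrete partition. The following Python sketch is only an illustration under stated assumptions: the function name `check_time_mesh` and its interface are hypothetical, and the constants $c$, $\gamma$, $\kappa$ have to be supplied by the user, since the conditions only require that such constants exist.

```python
def check_time_mesh(times, c, gamma, kappa):
    """Check conditions (i)-(iii) for time points 0 = t_0 < t_1 < ... < t_M = T.

    Hypothetical helper: c, gamma, kappa are user-chosen candidate constants.
    Returns a tuple of three booleans, one per condition.
    """
    steps = [t1 - t0 for t0, t1 in zip(times, times[1:])]  # step sizes k_m
    if not steps or any(k_m <= 0.0 for k_m in steps):
        raise ValueError("time points must be strictly increasing")
    T = times[-1] - times[0]
    k = max(steps)                                         # maximum step size

    # (i)  min_m k_m >= c * k**gamma
    cond_i = min(steps) >= c * k ** gamma
    # (ii) 1/kappa <= k_m / k_{m+1} <= kappa for m = 1, ..., M-1
    ratios = [a / b for a, b in zip(steps, steps[1:])]
    cond_ii = all(1.0 / kappa <= r <= kappa for r in ratios)
    # (iii) k <= T / 4
    cond_iii = k <= T / 4.0
    return cond_i, cond_ii, cond_iii


# Example meshes on [0, 1]: a uniform mesh with ten steps and a mesh with two steps.
uniform = [0.1 * m for m in range(11)]
too_coarse = [0.0, 0.5, 1.0]
```

With, e.g., `c = 0.5`, `gamma = 1.0`, `kappa = 2.0`, the uniform mesh passes all three checks, whereas the two-step mesh violates condition (iii) because its maximum step exceeds $T/4$.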

Remark 3.2. We note that the optimal control $\bar q_k$ is searched for in the subset $Q_{ad}$ of the continuous control space $Q$, and the subscript $k$ indicates the usage of the semidiscretized state equation.

Similar to the continuous setting, we can formulate the following optimality condition:

Theorem 3.3. A control $\bar q_k \in Q_{ad}$ with associated state $\bar u_k = u_k(\bar q_k)$ is an optimal solution of problem (3.5) if and only if $G(\bar u_k)|_{I_m} \le b$ for $m = 1, 2, \dots, M$ and there exist an adjoint state $\bar z_k \in X_k^0$ and a Lagrange multiplier $\bar\mu_k \in C(\bar I)^*$ given for any $v \in C(\bar I)$ by
\[ \langle v, \bar\mu_k \rangle = \sum_{l=1}^M \frac{\mu_{k,l}}{k_l} \int_{I_l} v(t) \, dt \quad \text{with } \mu_{k,l} \in \mathbb{R}_+ \ (l = 1, 2, \dots, M) \tag{3.6} \]
such that
\[ B(\varphi, \bar z_k) = (\varphi, \bar u_k - \hat u)_I + \langle G(\varphi), \bar\mu_k \rangle \quad \forall \varphi \in X_k^0, \tag{3.7} \]
\[ (\alpha \bar q_k + \bar z_k, q - \bar q_k)_I \ge 0 \quad \forall q \in Q_{ad}, \tag{3.8} \]
\[ \langle b - G(\bar u_k), \bar\mu_k \rangle = 0. \tag{3.9} \]

Proof. Following the argument used in the proof of Theorem 2.4, we extend the mapping $q \mapsto u_k \in X_k^0$ to a linear mapping $\tilde S_k \colon (X_k^0)^* \to X_k^0$ and denote the concatenation of $\tilde S_k$ with the embedding $X_k^0 \hookrightarrow L^2(I,H)$ by $S_k$. We directly obtain the continuity of $\tilde S_k$ and consequently also that of $S_k$. The finitely many state constraints are described using the continuous linear mapping $\mathcal{G}_k \colon (X_k^0)^* \to \mathbb{R}^M$ with
\[ (\mathcal{G}_k)_m := (G \circ \tilde S_k)|_{I_m} \quad \text{for } m = 1, 2, \dots, M. \]
By means of the set
\[ K_k := \bigl\{ v \in \mathbb{R}^M \bigm| v_m \le b,\ m = 1, 2, \dots, M \bigr\} \]
we can rewrite problem (3.5) as follows:
\[ \text{Minimize } j_k(q) := J(q, S_k(q)) \text{ for } q \in Q_{ad} \text{ subject to } \mathcal{G}_k(q) \in K_k. \]
In view of the Slater condition (2.4), by arguments as used later on in the proof of Lemma 6.2, we obtain that $\mathcal{G}_k(\tilde q) \in \operatorname{int} K_k$ is fulfilled for $k$ small enough. Hence, as in the proof of Theorem 2.4, we obtain that the optimality of $\bar q_k$ is equivalent to the existence of a Lagrange multiplier $(\mu_{k,l})_{l=1}^M \in \mathbb{R}_+^M$ and an adjoint state $\bar z_k \in X_k^0$ fulfilling (3.7), (3.8), and (3.9). Via the construction given in (3.6), $\bar\mu_k$ is then defined as an element of $C(\bar I)^*$.

Remark 3.4. As on the continuous level (see Remark 2.5), the variational inequality (3.8) can be equivalently rewritten using the pointwise projection $P_{Q_{ad}}$ as
\[ \bar q_k = P_{Q_{ad}} \bigl( -\alpha^{-1} \bar z_k \bigr). \]
Although the control has not yet been explicitly discretized, we obtain from this projection formula that $\bar q_k|_{I_m} \in P_0(I_m, H^1(\Omega))$ for $m = 1, 2, \dots, M$.

Remark 3.5. We note that, using integration by parts in time, the bilinear form $B(\varphi, z_k)$ in (3.7) defined by (3.2) can equivalently be expressed as follows:
\[ B(\varphi, z_k) = -\sum_{m=1}^M (\varphi, \partial_t z_k)_{I_m} + (\nabla \varphi, \nabla z_k)_I - \sum_{m=1}^{M-1} (\varphi_m^-, [z_k]_m) + (\varphi_M^-, z_{k,M}^-). \tag{3.10} \]
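The projection formula of Remarks 2.5 and 3.4 acts pointwise, so it can be illustrated on a list of sampled adjoint values. The sketch below is an assumption-laden illustration rather than part of the analysis: the function names are hypothetical, and the samples `z` stand for values of the adjoint state at arbitrary points $(t,x)$. The second function checks the pointwise counterpart of the variational inequality (3.8); since the expression is affine in the test value $p$, it suffices to test the two bounds.

```python
def project_control(z, alpha, q_a, q_b):
    """Pointwise projection P_{Q_ad}(-z / alpha) onto the interval [q_a, q_b]."""
    return [min(max(-z_i / alpha, q_a), q_b) for z_i in z]


def satisfies_variational_inequality(q, z, alpha, q_a, q_b, tol=1e-12):
    """Check (alpha * q_i + z_i) * (p - q_i) >= 0 for p in {q_a, q_b} and every sample i.

    The expression is affine in p, so checking both bounds covers all p in [q_a, q_b].
    """
    return all(
        (alpha * q_i + z_i) * (p - q_i) >= -tol
        for q_i, z_i in zip(q, z)
        for p in (q_a, q_b)
    )


# Illustrative data (assumed): five adjoint samples, alpha = 1, bounds [-1, 1].
z_samples = [-3.0, -0.5, 0.0, 0.5, 3.0]
q_projected = project_control(z_samples, alpha=1.0, q_a=-1.0, q_b=1.0)
```

Where the unconstrained value $-z/\alpha$ lies inside the bounds, the projection leaves it unchanged; elsewhere the control sits on the active bound, which is the source of the kinks discussed later for the discrete control.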

3.2. Discretization in space. To define the Galerkin finite element discretization in space, we consider families of two- or three-dimensional meshes covering the computational domain $\Omega$, which satisfy the usual regularity conditions such as conformity and shape regularity (see, e.g., [4]). The meshes consist of quadrilateral or hexahedral cells $K$ and are denoted by $\mathcal{T}_h = \{K\}$, where we define the discretization parameter $h$ as a cellwise constant function by setting $h|_K = h_K$ with the diameter $h_K$ of the cell $K$. We use the symbol $h$ also for the maximum cell size, i.e., $h = \max h_K$.

On the mesh $\mathcal{T}_h$, we construct a conforming finite element space $V_h \subset V$ in a standard way:
\[ V_h^s = \bigl\{ v \in V \bigm| v|_K \in Q_s(K) \text{ for } K \in \mathcal{T}_h \bigr\}. \]
Here, $Q_s(K)$ consists of shape functions obtained via (bi-/tri-)linear transformations of polynomials in $Q_s(\hat K)$ defined on the reference cell $\hat K = (0,1)^n$, where
\[ Q_s(\hat K) = \operatorname{span} \Bigl\{ \prod_{j=1}^n x_j^{\alpha_j} \Bigm| \alpha_j \in \mathbb{N}_0,\ \alpha_j \le s \Bigr\}. \]
To obtain the fully discretized version of the time-discretized state equation (3.3), we introduce the space-time finite element space
\[ X_{k,h}^{r,s} = \bigl\{ v_{kh} \in L^2(I, V_h^s) \bigm| v_{kh}|_{I_m} \in P_r(I_m, V_h^s),\ m = 1, 2, \dots, M \bigr\} \subset X_k^r. \]
Then, the so-called cg(s)dg(r) discretization of the state equation for given control $q \in Q$ has the following form: Find a state $u_{kh} = u_{kh}(q) \in X_{k,h}^{r,s}$ satisfying
\[ B(u_{kh}, \varphi) = (f + q, \varphi)_I + (u_0, \varphi_0^+) \quad \forall \varphi \in X_{k,h}^{r,s}. \tag{3.11} \]
Throughout this paper we will restrict our analysis to the lowest-order case of (bi-/tri-)linear elements, i.e., we set $s = 1$ and consider the cg(1)dg(0) scheme.

The state constraint on this level of discretization is given as in Section 3.1 by
\[ G(u_{kh})|_{I_m} \le b \quad \text{for } m = 1, 2, \dots, M. \tag{3.12} \]
Then, the corresponding fully discrete optimal control problem reads as follows:
\[ \text{Minimize } J(q_{kh}, u_{kh}) \text{ for } (q_{kh}, u_{kh}) \in Q_{ad} \times X_{k,h}^{0,1} \text{ subject to (3.11) and (3.12)}, \tag{3.13} \]
and the optimality conditions are given by the following theorem.

Theorem 3.6. A control $\bar q_{kh} \in Q_{ad}$ with associated state $\bar u_{kh} = u_{kh}(\bar q_{kh})$ is an optimal solution of problem (3.13) if and only if $G(\bar u_{kh})|_{I_m} \le b$ for $m = 1, 2, \dots, M$ and there exist an adjoint state $\bar z_{kh} \in X_{k,h}^{0,1}$ and a Lagrange multiplier $\bar\mu_{kh} \in C(\bar I)^*$ given for any $v \in C(\bar I)$ by
\[ \langle v, \bar\mu_{kh} \rangle = \sum_{l=1}^M \frac{\mu_{kh,l}}{k_l} \int_{I_l} v(t) \, dt \quad \text{with } \mu_{kh,l} \in \mathbb{R}_+ \ (l = 1, 2, \dots, M) \tag{3.14} \]
such that
\[ B(\varphi, \bar z_{kh}) = (\varphi, \bar u_{kh} - \hat u)_I + \langle G(\varphi), \bar\mu_{kh} \rangle \quad \forall \varphi \in X_{k,h}^{0,1}, \tag{3.15} \]
\[ (\alpha \bar q_{kh} + \bar z_{kh}, q - \bar q_{kh})_I \ge 0 \quad \forall q \in Q_{ad}, \tag{3.16} \]
\[ \langle b - G(\bar u_{kh}), \bar\mu_{kh} \rangle = 0. \tag{3.17} \]

Proof. The theorem can be proved by repeating the steps of the proof of Theorem 3.3.

Remark 3.7. As for $\bar q$ and $\bar q_k$ (see Remark 2.5 and Remark 3.4), we obtain
\[ \bar q_{kh} = P_{Q_{ad}} \bigl( -\alpha^{-1} \bar z_{kh} \bigr) \tag{3.18} \]
and therefore $\bar q_{kh}|_{I_m} \in P_0(I_m, H^1(\Omega))$ for $m = 1, 2, \dots, M$. We note that, since $\bar z_{kh}$ is cellwise (bi-/tri-)linear, $\bar q_{kh}|_{I_m}$ may have kinks in the interior of a cell and therefore is in general not in $P_0(I_m, V_h)$.

3.3. Discretization of the controls. In this subsection, we describe the discretization of the control variable by lowest-order finite elements, i.e., cellwise constant functions. We employ the same time partitioning and the same spatial mesh as for the discretization of the state variable and set
\[ Q_d = \bigl\{ q \in Q \bigm| q|_{I_m \times K} \in P_0(I_m \times K),\ m = 1, 2, \dots, M,\ K \in \mathcal{T}_h \bigr\}. \]
For this choice of the subspace $Q_d \subset Q$, we introduce the corresponding admissible set $Q_{d,ad}$ by
\[ Q_{d,ad} := Q_d \cap Q_{ad}. \]
The state constraint can be expressed as in the previous sections by the conditions
\[ G(u_\sigma)|_{I_m} \le b \quad \text{for } m = 1, 2, \dots, M. \tag{3.19} \]
Then, the optimal control problem on this level of discretization reads as follows:
\[ \text{Minimize } J(q_\sigma, u_\sigma) \text{ for } (q_\sigma, u_\sigma) \in Q_{d,ad} \times X_{k,h}^{0,1} \text{ subject to (3.11) and (3.19)}. \tag{3.20} \]
The unique optimal solution of (3.20) is denoted by $(\bar q_\sigma, \bar u_\sigma)$, where the subscript $\sigma$ represents all three discretization parameters $k$, $h$, and $d$. The corresponding first-order necessary optimality conditions are stated in the following theorem.

Theorem 3.8. A control $\bar q_\sigma \in Q_{d,ad}$ with associated state $\bar u_\sigma = u_{kh}(\bar q_\sigma)$ is an optimal solution of problem (3.20) if and only if $G(\bar u_\sigma)|_{I_m} \le b$ for $m = 1, 2, \dots, M$, and there exist an adjoint state $\bar z_\sigma \in X_{k,h}^{0,1}$ and a Lagrange multiplier $\bar\mu_\sigma \in C(\bar I)^*$ given for any $v \in C(\bar I)$ by
\[ \langle v, \bar\mu_\sigma \rangle = \sum_{l=1}^M \frac{\mu_{\sigma,l}}{k_l} \int_{I_l} v(t) \, dt \quad \text{with } \mu_{\sigma,l} \in \mathbb{R}_+ \ (l = 1, 2, \dots, M) \tag{3.21} \]
such that
\[ B(\varphi, \bar z_\sigma) = (\varphi, \bar u_\sigma - \hat u)_I + \langle G(\varphi), \bar\mu_\sigma \rangle \quad \forall \varphi \in X_{k,h}^{0,1}, \tag{3.22} \]
\[ (\alpha \bar q_\sigma + \bar z_\sigma, q - \bar q_\sigma)_I \ge 0 \quad \forall q \in Q_{d,ad}, \tag{3.23} \]
\[ \langle b - G(\bar u_\sigma), \bar\mu_\sigma \rangle = 0. \tag{3.24} \]

Proof. The theorem can be proved by repeating the steps of the proof of Theorem 3.3.
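To make the cg(1)dg(0) state equation (3.11) and the finitely many state constraints (3.12) more concrete, the following Python sketch treats a simplified one-dimensional analogue. It is not the authors' implementation. The following are assumptions made purely for illustration: the domain $\Omega = (0,1)$, a uniform mesh with linear elements, nodal interpolation of $u_0$, $\omega$, and $f + q$ (the latter sampled at the right end point of each time interval), and all function names. Each dg(0) step solves the implicit Euler type system $(M + k_m A)\, u_m = M u_{m-1} + k_m M F_m$ with the mass matrix $M$ and the stiffness matrix $A$.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    Adequate here because M + k_m * A is strictly diagonally dominant.
    """
    n = len(diag)
    d, b = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i - 1] / d[i - 1]
        d[i] -= w * upper[i - 1]
        b[i] -= w * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x


def mass_times(h, v):
    """Apply the P1 mass matrix (h/6) * tridiag(1, 4, 1) to interior nodal values."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(h / 6.0 * (left + 4.0 * v[i] + right))
    return out


def cg1_dg0_heat(u0, rhs, omega, n_cells, time_steps):
    """Return the values G(u_kh)|_{I_m} = (u_{kh,m}, omega) for m = 1, ..., M.

    u0(x), rhs(t, x) (standing for f + q), and omega(x) are callables.
    """
    h = 1.0 / n_cells
    x = [h * (i + 1) for i in range(n_cells - 1)]      # interior nodes
    n = len(x)
    u = [u0(xi) for xi in x]                           # nodal interpolant of u_0
    m_omega = mass_times(h, [omega(xi) for xi in x])   # M * omega
    g_values, t = [], 0.0
    for k_m in time_steps:
        t += k_m
        load = mass_times(h, [rhs(t, xi) for xi in x])
        m_u = mass_times(h, u)
        b = [m_u[i] + k_m * load[i] for i in range(n)]
        # System matrix M + k_m * A with A = (1/h) * tridiag(-1, 2, -1).
        off = [h / 6.0 - k_m / h] * (n - 1)
        diag = [4.0 * h / 6.0 + 2.0 * k_m / h] * n
        u = solve_tridiagonal(off, diag, off, b)
        g_values.append(sum(w * ui for w, ui in zip(m_omega, u)))
    return g_values


# Illustrative run (assumed data): u_0 = sin(pi x), f + q = 0, omega = 1, b = 0.7.
g_values = cg1_dg0_heat(lambda x: math.sin(math.pi * x), lambda t, x: 0.0,
                        lambda x: 1.0, n_cells=16, time_steps=[0.05] * 20)
feasible = all(g <= 0.7 for g in g_values)             # constraint (3.12)
```

Because the discrete state is piecewise constant in time, feasibility only has to be checked for the $M$ numbers in `g_values`, which mirrors the reduction from (2.2) to (3.12).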

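4. Stability estimates. In this section, we provide several stability estimates for adjoint solutions arising from the optimality conditions of the optimization problem and for additional auxiliary solutions defined below in Section 4.2.

4.1. Semidiscrete and discrete adjoint solution. At first, we consider the solution of the discrete adjoint equation (3.15).

Theorem 4.1. For the solution $\bar z_{kh} \in X_{k,h}^{0,1}$ of (3.15) there holds
\[ \|\nabla \bar z_{kh}\|_I \le C \bigl\{ \|\bar u_{kh} - \hat u\|_I + \|\omega\| \, \|\bar\mu_{kh}\|_{C(\bar I)^*} \bigr\}. \tag{4.1} \]

Proof. By means of the definition of $\bar\mu_{kh}$, the definition of $G$, and the setting $\varphi_l = \varphi|_{I_l}$, (3.15) can be rewritten in the form
\[ B(\varphi, \bar z_{kh}) = (\varphi, \bar u_{kh} - \hat u)_I + \sum_{l=1}^M \mu_{kh,l} (\varphi_l, \omega) \quad \forall \varphi \in X_{k,h}^{0,1}. \]
Defining the solutions $z_{kh}^l \in X_{k,h}^{0,1}$ for $l = 0, 1, \dots, M$ by
\[ B(\varphi, z_{kh}^0) = (\varphi, \bar u_{kh} - \hat u)_I \quad \forall \varphi \in X_{k,h}^{0,1}, \]
\[ B(\varphi, z_{kh}^l) = (\varphi_l, \omega) \quad \forall \varphi \in X_{k,h}^{0,1}, \quad l = 1, 2, \dots, M, \]
we have the representation
\[ \bar z_{kh} = z_{kh}^0 + \sum_{l=1}^M \mu_{kh,l} \, z_{kh}^l. \]
Hence, we get
\[ \|\nabla \bar z_{kh}\|_I \le \|\nabla z_{kh}^0\|_I + \sum_{l=1}^M \mu_{kh,l} \|\nabla z_{kh}^l\|_I \le \|\nabla z_{kh}^0\|_I + \max_{l=1,2,\dots,M} \|\nabla z_{kh}^l\|_I \sum_{l=1}^M \mu_{kh,l}. \]
To estimate $\|\nabla z_{kh}^l\|_I$ for $l = 0, 1, \dots, M$, we consider the solution $\tilde z_{kh} \in X_{k,h}^{0,1}$ of
\[ B(\varphi, \tilde z_{kh}) = (\varphi, g)_I + (\varphi_M, z_T) \quad \forall \varphi \in X_{k,h}^{0,1}, \]
with $g \in L^2(I,H)$ and $z_T \in H$. By means of (3.10) and the setting $\tilde z_{kh,M+1} := z_T$, this can be rewritten as the following system of equations:
\[ (\nabla \varphi, \nabla \tilde z_{kh})_{I_m} - (\varphi_m, [\tilde z_{kh}]_m) = (\varphi, g)_{I_m} \quad \forall \varphi \in P_0(I_m, V_h), \quad m = 1, 2, \dots, M. \]
Choosing $\varphi = \tilde z_{kh}$ and using the algebraic identity
\[ (y_m, [y]_m) = \frac{1}{2} \|y_{m+1}\|^2 - \frac{1}{2} \|[y]_m\|^2 - \frac{1}{2} \|y_m\|^2 \tag{4.2} \]
implies
\[ \|\tilde z_{kh,m}\|^2 + 2 \|\nabla \tilde z_{kh}\|_{I_m}^2 \le \|\tilde z_{kh,m+1}\|^2 + 2 \|g\|_{I_m} \|\tilde z_{kh}\|_{I_m}. \]
Hence, by the inequalities of Poincaré and Young and summing up for $m = 1, 2, \dots, M$, we end up with
\[ \|\nabla \tilde z_{kh}\|_I^2 \le C \bigl\{ \|g\|_I^2 + \|z_T\|^2 \bigr\}. \]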

Application of this estimate to the solutions $z_{kh}^l$ with $l = 0, 1, \dots, M$ yields
\[ \|\nabla z_{kh}^0\|_I \le C \|\bar u_{kh} - \hat u\|_I \quad \text{and} \quad \|\nabla z_{kh}^l\|_I \le C \|\omega\|, \quad l = 1, 2, \dots, M. \]
Then, these estimates together with
\[ \sum_{l=1}^M \mu_{kh,l} = \langle 1, \bar\mu_{kh} \rangle \le \|\bar\mu_{kh}\|_{C(\bar I)^*} \]
imply the assertion.

A similar result holds for the solution $\bar z_\sigma$ of the discrete adjoint equation (3.22):

Corollary 4.2. For the solution $\bar z_\sigma \in X_{k,h}^{0,1}$ of (3.22) there holds
\[ \|\nabla \bar z_\sigma\|_I \le C \bigl\{ \|\bar u_\sigma - \hat u\|_I + \|\omega\| \, \|\bar\mu_\sigma\|_{C(\bar I)^*} \bigr\}. \tag{4.3} \]

Proof. The assertion follows immediately by repeating the steps of the proof of Theorem 4.1.

4.2. Continuous and semidiscrete auxiliary solutions. We consider the following forward and backward auxiliary problems: Find $v \in X$ fulfilling
\[ (\partial_t v, \varphi)_I + (\nabla v, \nabla \varphi)_I = 0 \quad \forall \varphi \in X, \qquad v(0) = v_0, \tag{4.4} \]
with initial value $v_0 \in H$, and find $y \in X$ fulfilling
\[ -(\varphi, \partial_t y)_I + (\nabla \varphi, \nabla y)_I = 0 \quad \forall \varphi \in X, \qquad y(T) = y_T, \tag{4.5} \]
with terminal value $y_T \in H$. The corresponding semidiscrete analogues are given as follows: Find $v_k \in X_k^0$ fulfilling
\[ B(v_k, \varphi) = (v_0, \varphi_1) \quad \forall \varphi \in X_k^0, \tag{4.6} \]
and find $y_k \in X_k^0$ fulfilling
\[ B(\varphi, y_k) = (\varphi_M, y_T) \quad \forall \varphi \in X_k^0. \tag{4.7} \]
Furthermore, the discrete variants are given by the following formulation: Find $v_{kh} \in X_{k,h}^{0,1}$ fulfilling
\[ B(v_{kh}, \varphi) = (v_0, \varphi_1) \quad \forall \varphi \in X_{k,h}^{0,1}, \tag{4.8} \]
and find $y_{kh} \in X_{k,h}^{0,1}$ fulfilling
\[ B(\varphi, y_{kh}) = (\varphi_M, y_T) \quad \forall \varphi \in X_{k,h}^{0,1}. \tag{4.9} \]
For the solution of (4.5), we have the following stability result.

Theorem 4.3. For the solution $y \in X$ of (4.5) there holds
\[ \|\nabla y\|_I + \max_{t \in \bar I} \|y(t)\| \le C \|y_T\|. \tag{4.10} \]

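Proof. See for instance [12].

Next, we prove an a priori estimate for the solution of (4.5) with respect to time-weighted norms.

Theorem 4.4. For the solution $y \in X$ of (4.5) there hold the a priori estimates
\[ \int_I (T-t)\,\|\partial_t y(t)\|^2\,dt \le C\|y_T\|^2, \tag{4.11} \]
\[ \int_{I\setminus I_M} \|\partial_t y(t)\|\,dt \le C\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\|y_T\|. \tag{4.12} \]

Proof. To estimate $\int_I (T-t)\|\partial_t y(t)\|^2\,dt$, we choose $\varphi = (T-t)\,\partial_t y$ in (4.5), obtaining
\[ 0 = \int_I (T-t)\|\partial_t y(t)\|^2\,dt - \bigl((T-t)\nabla y, \nabla\partial_t y\bigr)_I = \int_I (T-t)\|\partial_t y(t)\|^2\,dt - \frac12\int_I \frac{d}{dt}\bigl((T-t)\|\nabla y(t)\|^2\bigr)\,dt - \frac12\|\nabla y\|^2_I. \]
From this, we conclude
\[ 2\int_I (T-t)\|\partial_t y\|^2\,dt + T\|\nabla y(0)\|^2 \le \|\nabla y\|^2_I. \]
Then, the application of the a priori estimate from Theorem 4.3 yields the first one of the asserted estimates. The second one then follows immediately from
\[ \int_{I\setminus I_M} \|\partial_t y(t)\|\,dt \le \Bigl(\int_{I\setminus I_M} (T-t)^{-1}\,dt\Bigr)^{1/2}\Bigl(\int_{I\setminus I_M} (T-t)\|\partial_t y(t)\|^2\,dt\Bigr)^{1/2} \le C\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\Bigl(\int_I (T-t)\|\partial_t y(t)\|^2\,dt\Bigr)^{1/2}. \]
The proof is complete.

In the following theorem, we derive a stability estimate for the semidiscrete solutions of (4.6) and (4.7).

Theorem 4.5. For the solutions $v_k \in X^0_k$ of (4.6) and $y_k \in X^0_k$ of (4.7) there hold the a priori estimates
\[ T^2\|\Delta v_{k,M}\|^2 + T\|\nabla v_{k,M}\|^2 + \|v_{k,M}\|^2 + \sum_{m=1}^{M} t_m\|\Delta v_k\|^2_{I_m} + \|\nabla v_k\|^2_I + \sum_{m=2}^{M} \frac{t_{m-1}^2}{k_m}\|[\nabla v_k]_{m-1}\|^2 \le C\|v_0\|^2 \tag{4.13} \]
and
\[ T^2\|\Delta y_{k,1}\|^2 + T\|\nabla y_{k,1}\|^2 + \|y_{k,1}\|^2 + \sum_{m=1}^{M} \tau_{k,m}\|\Delta y_k\|^2_{I_m} + \|\nabla y_k\|^2_I + \sum_{m=1}^{M-1} \frac{\tau_{k,m+1}^2}{k_m}\|[\nabla y_k]_m\|^2 \le C\|y_T\|^2, \tag{4.14} \]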

with $\tau_{k,m} = \tau_k|_{I_m} := T - t_{m-1}$.

Proof. For proving the assertion for $v_k$, we recall (4.6), which by means of the setting $v_{k,0} := v_0$ can be rewritten as the following system of equations:
\[ (\nabla v_k, \nabla\varphi)_{I_m} + ([v_k]_{m-1}, \varphi_m) = 0 \quad \forall\varphi \in P_0(I_m, V), \quad m = 1, 2, \dots, M. \tag{4.15} \]

(i) Choosing $\varphi = v_k$ in (4.15) leads us to
\[ \|\nabla v_k\|^2_{I_m} + ([v_k]_{m-1}, v_{k,m}) = 0. \]
Then, the algebraic identity
\[ ([v]_{m-1}, v_m) = \tfrac12\|v_m\|^2 + \tfrac12\|[v]_{m-1}\|^2 - \tfrac12\|v_{m-1}\|^2 \tag{4.16} \]
is used to obtain
\[ \|v_{k,m}\|^2 + 2\|\nabla v_k\|^2_{I_m} \le \|v_{k,m-1}\|^2. \]
By adding these inequalities for $m = 1, 2, \dots, M$, we arrive at
\[ \|v_{k,M}\|^2 + 2\|\nabla v_k\|^2_I \le \|v_0\|^2. \tag{4.17} \]

(ii) Integrating by parts in (4.15) and choosing $\varphi|_{I_m} = -t_{m-1}^2[\Delta v_k]_{m-1}$ gives us
\[ t_{m-1}^2(\Delta v_k, [\Delta v_k]_{m-1})_{I_m} + t_{m-1}^2\|[\nabla v_k]_{m-1}\|^2 = 0 \]
and thus
\[ t_{m-1}^2(\Delta v_{k,m}, [\Delta v_k]_{m-1}) + \frac{t_{m-1}^2}{k_m}\|[\nabla v_k]_{m-1}\|^2 = 0. \]
Then, the algebraic identity (4.16) and the relation
\[ t_{m-1}^2 \ge t_m^2 - 2 t_m k_m \]
are used to obtain
\[ t_m^2\|\Delta v_{k,m}\|^2 + 2\frac{t_{m-1}^2}{k_m}\|[\nabla v_k]_{m-1}\|^2 \le t_{m-1}^2\|\Delta v_{k,m-1}\|^2 + 2 t_m k_m\|\Delta v_{k,m}\|^2. \]
By adding these inequalities for $m = 2, 3, \dots, M$ and using $t_1 = k_1$, we arrive at
\[ T^2\|\Delta v_{k,M}\|^2 + 2\sum_{m=2}^{M} \frac{t_{m-1}^2}{k_m}\|[\nabla v_k]_{m-1}\|^2 \le t_1 k_1\|\Delta v_{k,1}\|^2 + 2\sum_{m=2}^{M} t_m\|\Delta v_k\|^2_{I_m} \le 2\sum_{m=1}^{M} t_m\|\Delta v_k\|^2_{I_m}. \tag{4.18} \]

(iii) Integrating by parts in (4.15) and choosing $\varphi|_{I_m} = -t_m\Delta v_k$ gives us
\[ t_m\|\Delta v_k\|^2_{I_m} + t_m([\nabla v_k]_{m-1}, \nabla v_{k,m}) = 0. \]

Then, the algebraic identity (4.16) and the relation $k_m \le \kappa^{-1}k_{m-1}$ imply that
\[ t_m\|\nabla v_{k,m}\|^2 + 2t_m\|\Delta v_k\|^2_{I_m} \le t_{m-1}\|\nabla v_{k,m-1}\|^2 + \kappa^{-1}k_{m-1}\|\nabla v_{k,m-1}\|^2. \]
By adding these inequalities for $m = 2, 3, \dots, M$ and using $t_1 = k_1$, we arrive at
\[ T\|\nabla v_{k,M}\|^2 + 2\sum_{m=2}^{M} t_m\|\Delta v_k\|^2_{I_m} \le k_1\|\nabla v_{k,1}\|^2 + \kappa^{-1}\sum_{m=1}^{M-1}\|\nabla v_k\|^2_{I_m} \le (1+\kappa^{-1})\|\nabla v_k\|^2_I. \tag{4.19} \]
Hence, it remains to estimate $t_1\|\Delta v_k\|^2_{I_1}$.

(iv) Integrating by parts in (4.15) and choosing $\varphi = -\Delta v_k$ leads us for $m = 1$ to
\[ \|\Delta v_k\|^2_{I_1} = (v_{k,1} - v_0, \Delta v_{k,1}) \le \|v_{k,1} - v_0\|\,\|\Delta v_{k,1}\| = k_1^{-1/2}\|v_{k,1} - v_0\|\,\|\Delta v_k\|_{I_1} \]
and consequently to
\[ \|\Delta v_k\|_{I_1} \le k_1^{-1/2}\|v_{k,1} - v_0\|. \]
This implies
\[ k_1\|\Delta v_{k,1}\| = k_1^{1/2}\|\Delta v_k\|_{I_1} \le \|v_{k,1} - v_0\| \]
and, using $t_1 = k_1$,
\[ t_1\|\Delta v_k\|^2_{I_1} = k_1^2\|\Delta v_{k,1}\|^2 \le 2\bigl\{\|v_0\|^2 + \|v_{k,1}\|^2\bigr\}. \tag{4.20} \]
Combining the estimates (4.17), (4.18), (4.19), and (4.20) yields the first one of the asserted estimates.

The second one on $y_k$ follows by inspection of (4.7), which by means of the setting $y_{k,M+1} := y_T$ can be rewritten as the following system of equations:
\[ (\nabla\varphi, \nabla y_k)_{I_m} - (\varphi_m, [y_k]_m) = 0 \quad \forall\varphi \in P_0(I_m, V), \quad m = 1, 2, \dots, M. \tag{4.21} \]
We repeat the above steps (i) to (iv) and employ the algebraic identity (4.2) and the relation
\[ \tau_{k,m+1}^2 \ge \tau_{k,m}^2 - 2\tau_{k,m}k_m \]
to derive the desired result.

For the case of more regular initial and terminal values $v_0$ and $y_T$ in (4.6) and (4.7), respectively, we have the following results.

Theorem 4.6. If $v_0, y_T \in H^2(\Omega) \cap V$, for the solutions $v_k \in X^0_k$ of (4.6) and $y_k \in X^0_k$ of (4.7) there hold the a priori estimates
\[ T\|\nabla\Delta v_{k,M}\|^2 + \|\Delta v_{k,M}\|^2 + \|\nabla\Delta v_k\|^2_I + \sum_{m=2}^{M} \frac{t_{m-1}}{k_m}\|[\Delta v_k]_{m-1}\|^2 \le C\|\Delta v_0\|^2 \tag{4.22} \]
and
\[ T\|\nabla\Delta y_{k,1}\|^2 + \|\Delta y_{k,1}\|^2 + \|\nabla\Delta y_k\|^2_I + \sum_{m=1}^{M-1} \frac{\tau_{k,m+1}}{k_m}\|[\Delta y_k]_m\|^2 \le C\|\Delta y_T\|^2, \tag{4.23} \]
with $\tau_{k,m} = \tau_k|_{I_m} := T - t_{m-1}$.

Proof. The proof of the assertion for $v_k$ is based on equation (4.15).

(i) Integrating by parts in (4.15) and choosing $\varphi = \Delta^2 v_k$ gives us
\[ \|\nabla\Delta v_k\|^2_{I_m} + ([\Delta v_k]_{m-1}, \Delta v_{k,m}) = 0 \]
and further, applying (4.16),
\[ \|\Delta v_{k,m}\|^2 + 2\|\nabla\Delta v_k\|^2_{I_m} \le \|\Delta v_{k,m-1}\|^2. \]
By summing up for $m = 1, 2, \dots, M$, we obtain
\[ \|\Delta v_{k,M}\|^2 + 2\|\nabla\Delta v_k\|^2_I \le \|\Delta v_0\|^2. \tag{4.24} \]

(ii) Integrating by parts in (4.15) and choosing $\varphi|_{I_m} = t_{m-1}[\Delta^2 v_k]_{m-1}$ gives us
\[ t_{m-1}(\nabla\Delta v_k, [\nabla\Delta v_k]_{m-1})_{I_m} + t_{m-1}\|[\Delta v_k]_{m-1}\|^2 = 0 \]
and, consequently,
\[ t_{m-1}(\nabla\Delta v_{k,m}, [\nabla\Delta v_k]_{m-1}) + \frac{t_{m-1}}{k_m}\|[\Delta v_k]_{m-1}\|^2 = 0. \]
Then, the algebraic identity (4.16) implies
\[ t_m\|\nabla\Delta v_{k,m}\|^2 + 2\frac{t_{m-1}}{k_m}\|[\Delta v_k]_{m-1}\|^2 \le t_{m-1}\|\nabla\Delta v_{k,m-1}\|^2 + k_m\|\nabla\Delta v_{k,m}\|^2. \]
By adding these inequalities for $m = 2, 3, \dots, M$ and using $t_1 = k_1$, we arrive at
\[ T\|\nabla\Delta v_{k,M}\|^2 + 2\sum_{m=2}^{M} \frac{t_{m-1}}{k_m}\|[\Delta v_k]_{m-1}\|^2 \le k_1\|\nabla\Delta v_{k,1}\|^2 + \sum_{m=2}^{M}\|\nabla\Delta v_k\|^2_{I_m} = \|\nabla\Delta v_k\|^2_I. \tag{4.25} \]
Finally, the estimates (4.24) and (4.25) imply the assertion. The assertion for $y_k$ follows by repeating the steps (i) and (ii) for (4.21) employing identity (4.2).

5. Analysis of the discretization error for the state equation. The aim of this section is to prove a priori error estimates for the (uncontrolled) state equation (2.1) in the norm of $L^\infty(I, L^2(\Omega))$. These estimates form the basis of the error analysis for the whole optimization problem (2.3), which will be developed in Section 6 below. In contrast to the $L^\infty(I, L^2(\Omega))$ estimates available in the literature (cf. [10, 11]), the estimates we derive here only require the right-hand side $f$ to be in $L^\infty(I, L^2(\Omega))$. Later this requirement carries over to the boundedness of the control $q$ in $L^\infty(I, L^2(\Omega))$, which is fulfilled due to the prescribed control constraints.

Let $u \in X$ be the solution of the state equation (2.1) for $q = 0$, $u_k \in X^r_k$ be the solution of the corresponding semidiscretized equation (3.3), and $u_{kh} \in X^{r,s}_{k,h}$ be the solution of the fully discretized state equation (3.11).
In order to separate the influences of the space and time discretization, we split the total discretization error $e := u - u_{kh}$ into its temporal part $e_k := u - u_k$ and its spatial part $e_h := u_k - u_{kh}$. The temporal discretization error will be estimated in the following subsection; the spatial discretization error is treated in Section 5.2.
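The splitting $e = e_k + e_h$ can be observed numerically in a toy setting. The following is a minimal sketch under stated assumptions, not the discretization code used for this paper: it assumes $\Omega = (0,1)$, $f = 0$, $u_0 = \sin(\pi x)$, $T = 0.1$, uniform time steps and a uniform P1 mesh, and it replaces the $L^2$ projection of $u_0$ by nodal interpolation. All function names are hypothetical. For this single eigenmode the dG(0) semidiscrete solution is available in closed form, $u_{k,m} = (1 + k\pi^2)^{-m}\sin(\pi x)$, so both error parts at $t = T$ can be measured separately.

```python
import numpy as np

T = 0.1  # final time (assumed for this sketch)


def p1_matrices(n):
    """P1 mass and stiffness matrices on a uniform grid of (0, 1), n interior nodes."""
    h = 1.0 / (n + 1)
    off = np.ones(n - 1)
    mass = h / 6.0 * (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1))
    stiff = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h
    return mass, stiff, h


def temporal_error(M):
    """L2 norm of u(T) - u_{k,M} for u0 = sin(pi x), f = 0 (both in closed form)."""
    k = T / M
    exact = np.exp(-np.pi**2 * T)
    semidiscrete = (1.0 + k * np.pi**2) ** (-M)
    return abs(exact - semidiscrete) / np.sqrt(2.0)  # ||sin(pi x)|| = 1/sqrt(2)


def spatial_error(n, M):
    """Nodal (discrete L2) approximation of ||u_{k,M} - u_{kh,M}||."""
    mass, stiff, h = p1_matrices(n)
    x = np.linspace(h, 1.0 - h, n)
    k = T / M
    u = np.sin(np.pi * x)      # nodal interpolant of u0 (simplifying assumption)
    step = mass + k * stiff    # one dG(0) step with f = 0: (M + k K) u_m = M u_{m-1}
    for _ in range(M):
        u = np.linalg.solve(step, mass @ u)
    u_k = (1.0 + k * np.pi**2) ** (-M) * np.sin(np.pi * x)
    return np.sqrt(h * np.sum((u_k - u) ** 2))


if __name__ == "__main__":
    for M in (20, 40, 80):
        print("M =", M, " temporal part:", temporal_error(M))
    for n in (15, 31, 63):
        print("n =", n, " spatial part:", spatial_error(n, 20))
```

In this smooth example, halving $k$ should roughly halve the temporal part and halving $h$ should roughly quarter the spatial part. The logarithmic factors of the general estimates are not visible for such regular data.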

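5.1. Analysis of the temporal discretization error. In this section, we will prove an error estimate for the temporal discretization error $e_k$. For this, we need in addition to the solution $y \in X$ of (4.5) the solution $\tilde y \in X$ of the auxiliary equation
\[ -(\varphi, \partial_t\tilde y)_{\tilde I} + (\nabla\varphi, \nabla\tilde y)_{\tilde I} = 0 \quad \forall\varphi \in X, \qquad \tilde y(\tilde t) = y_T, \tag{5.1} \]
where $\tilde I = (0, \tilde t)$ with some $\tilde t \in I_M = (T - k_M, T]$. The following lemma provides an estimate for the error between $y$ and $\tilde y$.

Lemma 5.1. For the solutions $y \in X$ of (4.5) and $\tilde y \in X$ of (5.1) there holds
\[ \|y - \tilde y\|_{L^1(\tilde I, L^2(\Omega))} + \|(y - \tilde y)(0)\|_{H^{-2}(\Omega)} \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\|y_T\|. \tag{5.2} \]

Proof. Using the notation $\xi := \tilde y - y$, we have to estimate the two quantities $\|\xi\|_{L^1(\tilde I, L^2(\Omega))}$ and $\|\xi(0)\|_{H^{-2}(\Omega)}$. Since $y$ satisfies
\[ -(\varphi, \partial_t y)_{\tilde I} + (\nabla\varphi, \nabla y)_{\tilde I} = 0 \quad \forall\varphi \in X \]
and $y(T) = y_T$, the difference $\xi$ solves
\[ -(\varphi, \partial_t\xi)_{\tilde I} + (\nabla\varphi, \nabla\xi)_{\tilde I} + (\varphi(\tilde t), \xi(\tilde t)) = (\varphi(\tilde t), y(T) - y(\tilde t)) \quad \forall\varphi \in X. \tag{5.3} \]

(i) Integrating by parts in (5.3) and choosing $\varphi = \Delta^{-2}\xi$, we obtain
\[ -(\Delta^{-2}\xi, \partial_t\xi)_{\tilde I} - (\Delta^{-1}\xi, \xi)_{\tilde I} + (\Delta^{-2}\xi(\tilde t), \xi(\tilde t)) = (\Delta^{-2}\xi(\tilde t), y(T) - y(\tilde t)). \]
This implies
\[ \|\Delta^{-1}\xi(\tilde t)\|^2 + \|\Delta^{-1}\xi(0)\|^2 + 2\|\nabla\Delta^{-1}\xi\|^2_{\tilde I} \le \|\Delta^{-1}\xi(\tilde t)\|^2 + \|\Delta^{-1}(y(T) - y(\tilde t))\|^2 \]
and, consequently,
\[ \|\Delta^{-1}\xi(0)\|^2 + 2\|\nabla\Delta^{-1}\xi\|^2_{\tilde I} \le \|\Delta^{-1}(y(T) - y(\tilde t))\|^2. \]
By virtue of $\partial_t y = -\Delta y$ and the stability estimate from Theorem 4.3, the right-hand side can be estimated as
\[ \|\Delta^{-1}(y(T) - y(\tilde t))\|^2 = \int_\Omega\Bigl(\int_{\tilde t}^{T}\Delta^{-1}\partial_t y(t)\,dt\Bigr)^2 dx \le k_M\int_{\tilde t}^{T}\|\Delta^{-1}\partial_t y(t)\|^2\,dt = k_M\int_{\tilde t}^{T}\|y(t)\|^2\,dt \le Ck^2\|y_T\|^2. \]
By the definition of the norms of $H^{-1}(\Omega)$ and $H^{-2}(\Omega)$ this leads us to
\[ \|\xi(0)\|^2_{H^{-2}(\Omega)} + \|\xi\|^2_{L^2(\tilde I, H^{-1}(\Omega))} \le Ck^2\|y_T\|^2. \tag{5.4} \]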

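(ii) Integrating by parts in (5.3) and choosing $\varphi = -\tau\Delta^{-1}\xi$ with
\[ \tau(t) := \max\{\tilde t - t, k\} \]
implies
\[ (\tau\Delta^{-1}\xi, \partial_t\xi)_{\tilde I} + (\tau\xi, \xi)_{\tilde I} - k(\Delta^{-1}\xi(\tilde t), \xi(\tilde t)) = -k(\Delta^{-1}\xi(\tilde t), y(T) - y(\tilde t)). \]
By the relation
\[ (\tau\Delta^{-1}\xi, \partial_t\xi)_{\tilde I} = \frac12\int_{\tilde I}\frac{d}{dt}\bigl(\tau(\Delta^{-1}\xi(t), \xi(t))\bigr)\,dt - \frac12\int_{\tilde I}\tau'\,(\Delta^{-1}\xi(t), \xi(t))\,dt \]
and observing $|\tau'| \le 1$, we conclude
\[ k\|\nabla\Delta^{-1}\xi(\tilde t)\|^2 + \tilde t\,\|\nabla\Delta^{-1}\xi(0)\|^2 + 2\|\tau^{1/2}\xi\|^2_{\tilde I} \le \|\nabla\Delta^{-1}\xi\|^2_{\tilde I} + k\|\nabla\Delta^{-1}\xi(\tilde t)\|^2 + k\|\nabla\Delta^{-1}(y(T) - y(\tilde t))\|^2 \]
and, consequently,
\[ 2\|\tau^{1/2}\xi\|^2_{\tilde I} + \tilde t\,\|\nabla\Delta^{-1}\xi(0)\|^2 \le \|\nabla\Delta^{-1}\xi\|^2_{\tilde I} + k\|\nabla\Delta^{-1}(y(T) - y(\tilde t))\|^2. \]
For estimating the second term on the right-hand side, we use $\partial_t y = -\Delta y$ to obtain
\[ \|\nabla\Delta^{-1}(y(T) - y(\tilde t))\|^2 = \int_\Omega\Bigl(\int_{\tilde t}^{T}\nabla\Delta^{-1}\partial_t y(t)\,dt\Bigr)^2 dx \le k_M\int_{\tilde t}^{T}\|\nabla\Delta^{-1}\partial_t y(t)\|^2\,dt = k_M\int_{\tilde t}^{T}\|\nabla y(t)\|^2\,dt \le Ck\|y_T\|^2. \]
From this, using (5.4), we obtain by the definition of the $H^{-1}$ norm that
\[ \|\tau^{1/2}\xi\|^2_{\tilde I} \le Ck^2\|y_T\|^2. \]
Then, the estimate of $\|\xi\|_{L^1(\tilde I, L^2(\Omega))}$ follows from
\[ \|\xi\|^2_{L^1(\tilde I, L^2(\Omega))} \le \|\tau^{-1/2}\|^2_{\tilde I}\,\|\tau^{1/2}\xi\|^2_{\tilde I} \le \int_{\tilde I}\tau(t)^{-1}\,dt\;\|\tau^{1/2}\xi\|^2_{\tilde I} \le Ck^2\Bigl(\ln\frac{T}{k} + 1\Bigr)\|y_T\|^2 \le Ck^2\ln\frac{T}{k}\,\|y_T\|^2, \]
where in the last inequality the assumption $k \le \frac14 T$ is used.

Next, we provide an estimate for the error between $y$ and its discrete analogue $y_k$.

Lemma 5.2. For the solutions $y \in X$ of (4.5) and $y_k \in X^0_k$ of (4.7) there holds
\[ \|y - y_k\|_{L^1(I, L^2(\Omega))} + \|y(0) - y_{k,1}\|_{H^{-2}(\Omega)} \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\|y_T\|. \tag{5.5} \]

Proof. We define a semidiscrete projection $\pi_k \colon C(\bar I \setminus I_M, V) \to X^r_k$, for $m = 1, 2, \dots, M$, by
\[ \pi_k y|_{I_m} = y(t_{m-1}). \tag{5.6} \]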

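By inserting $\pi_k y$, we obtain due to the definition of $\pi_k$ that
\[ \|y - y_k\|_{L^1(I, L^2(\Omega))} + \|y(0) - y_{k,1}\|_{H^{-2}(\Omega)} \le \|y - \pi_k y\|_{L^1(I, L^2(\Omega))} + \|\pi_k y - y_k\|_{L^1(I, L^2(\Omega))} + \|\pi_k y(0) - y_{k,1}\|_{H^{-2}(\Omega)}. \]
For the first term, we have
\[ \|y - \pi_k y\|_{L^1(I, L^2(\Omega))} = \int_{I\setminus I_M}\|y(t) - \pi_k y(t)\|\,dt + \int_{I_M}\|y(t) - \pi_k y(t)\|\,dt \le Ck\Bigl\{\int_{I\setminus I_M}\|\partial_t y(t)\|\,dt + \max_{t\in\bar I}\|y(t)\|\Bigr\}. \]
Then, Theorem 4.4 and the a priori estimate from Theorem 4.3 imply
\[ \|y - \pi_k y\|_{L^1(I, L^2(\Omega))} \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\|y_T\|. \]
Using the notation $\xi_k := \pi_k y - y_k$, we have to estimate the two quantities $\|\xi_k\|_{L^1(I, L^2(\Omega))}$ and $\|\xi_{k,1}\|_{H^{-2}(\Omega)}$. Employing Galerkin orthogonality and the definition of $\pi_k$, for $\varphi \in X^0_k \cap L^2(I, H^2(\Omega))$, we have
\[ B(\varphi, \xi_k) = -B(\varphi, y - \pi_k y) = -(\nabla\varphi, \nabla(y - \pi_k y))_I = \sum_{m=1}^{M}\Bigl\{\int_{I_m}(\Delta\varphi_m, y(t))\,dt - k_m(\Delta\varphi_m, y(t_{m-1}))\Bigr\} = \sum_{m=1}^{M}\int_{I_m}(t_m - t)(\Delta\varphi_m, \partial_t y(t))\,dt. \]
By means of (3.10) and the definition $\xi_{k,M+1} := 0$, this equality can be rewritten as the following system of equations, for $m = 1, 2, \dots, M$:
\[ (\nabla\varphi, \nabla\xi_k)_{I_m} - (\varphi_m, [\xi_k]_m) = \int_{I_m}(t_m - t)(\Delta\varphi, \partial_t y(t))\,dt \quad \forall\varphi \in P_0(I_m, H^2(\Omega) \cap V). \tag{5.7} \]

(i) Setting $\varphi = \Delta^{-2}\xi_k$ in (5.7), after integration by parts and observing $\partial_t y = -\Delta y$, we obtain
\[ -(\Delta^{-1}\xi_k, \xi_k)_{I_m} - (\Delta^{-1}\xi_{k,m}, [\Delta^{-1}\xi_k]_m) = -\int_{I_m}(t_m - t)(\xi_k, y(t))\,dt. \]
Estimating the right-hand side by
\[ \Bigl|\int_{I_m}(t_m - t)(\xi_k, y(t))\,dt\Bigr| \le \frac12\|\nabla\Delta^{-1}\xi_k\|^2_{I_m} + \frac12\int_{I_m}(t_m - t)^2\|\nabla y(t)\|^2\,dt \]
and applying the identity (4.2) to the left-hand side leads us to
\[ \|\Delta^{-1}\xi_{k,m}\|^2 + \|\nabla\Delta^{-1}\xi_k\|^2_{I_m} \le \|\Delta^{-1}\xi_{k,m+1}\|^2 + k_m^2\|\nabla y\|^2_{I_m}. \]
Summing this for $m = 1, 2, \dots, M$ yields
\[ \|\Delta^{-1}\xi_{k,1}\|^2 + \|\nabla\Delta^{-1}\xi_k\|^2_I \le k^2\|\nabla y\|^2_I. \]
Consequently, by the a priori estimate from Theorem 4.3 and the definition of the norms of $H^{-1}(\Omega)$ and $H^{-2}(\Omega)$, we get
\[ \|\xi_{k,1}\|^2_{H^{-2}(\Omega)} + \|\xi_k\|^2_{L^2(I, H^{-1}(\Omega))} \le Ck^2\|y_T\|^2. \tag{5.8} \]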

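(ii) Setting $\varphi|_{I_m} = -\tau_{k,m}\Delta^{-1}\xi_k$ in (5.7), after integration by parts, we obtain
\[ \tau_{k,m}\|\xi_k\|^2_{I_m} - \tau_{k,m}(\nabla\Delta^{-1}\xi_{k,m}, [\nabla\Delta^{-1}\xi_k]_m) = -\tau_{k,m}\int_{I_m}(t_m - t)(\xi_k, \partial_t y(t))\,dt. \]
Estimating the right-hand side by
\[ \Bigl|\tau_{k,m}\int_{I_m}(t_m - t)(\xi_k, \partial_t y(t))\,dt\Bigr| \le \frac{\tau_{k,m}}{2}\|\xi_k\|^2_{I_m} + \frac{\tau_{k,m}}{2}\int_{I_m}(t_m - t)^2\|\partial_t y\|^2\,dt \]
and using the identities (4.2) and $\tau_{k,m} = \tau_{k,m+1} + k_m$ leads us to
\[ \tau_{k,m}\|\xi_k\|^2_{I_m} + \tau_{k,m}\|\nabla\Delta^{-1}\xi_{k,m}\|^2 \le \tau_{k,m+1}\|\nabla\Delta^{-1}\xi_{k,m+1}\|^2 + k_m\|\nabla\Delta^{-1}\xi_{k,m+1}\|^2 + \tau_{k,m}\int_{I_m}(t_m - t)^2\|\partial_t y\|^2\,dt. \]
Observing $k_m \le \kappa k_{m+1}$ and summing these inequalities for $m = 1, 2, \dots, M$, we obtain
\[ T\|\nabla\Delta^{-1}\xi_{k,1}\|^2 + \sum_{m=1}^{M}\tau_{k,m}\|\xi_k\|^2_{I_m} \le \kappa\|\nabla\Delta^{-1}\xi_k\|^2_I + \sum_{m=1}^{M}\tau_{k,m}\int_{I_m}(t_m - t)^2\|\partial_t y\|^2\,dt. \tag{5.9} \]
Since for $t \in I_m$, with $m \le M - 1$, we have
\[ \tau_{k,m} = T - t_m + k_m \le T - t_m + \kappa k_{m+1} \le (1+\kappa)(T - t_m) \le (1+\kappa)(T - t) \]
and for $m = M$ we have $\tau_{k,M} = k_M$, the second term on the right-hand side of (5.9) can be estimated as follows:
\[ \sum_{m=1}^{M}\tau_{k,m}\int_{I_m}(t_m - t)^2\|\partial_t y\|^2\,dt \le \sum_{m=1}^{M-1}k_m^2\int_{I_m}\tau_{k,m}\|\partial_t y\|^2\,dt + k_M^2\int_{I_M}(T - t)\|\partial_t y\|^2\,dt \le (1+\kappa)k^2\int_I(T - t)\|\partial_t y\|^2\,dt. \]
Using this, (5.8), and Theorem 4.4, we conclude from (5.9) that
\[ \sum_{m=1}^{M}\tau_{k,m}\|\xi_k\|^2_{I_m} \le Ck^2\|y_T\|^2. \]
Then, the desired estimate for $\|\xi_k\|_{L^1(I, L^2(\Omega))}$ follows from
\[ \|\xi_k\|^2_{L^1(I, L^2(\Omega))} \le \sum_{m=1}^{M}k_m\tau_{k,m}^{-1}\,\sum_{m=1}^{M}\tau_{k,m}\|\xi_k\|^2_{I_m} \le Ck^2\Bigl(\ln\frac{T}{k} + 1\Bigr)\|y_T\|^2 \le Ck^2\ln\frac{T}{k}\,\|y_T\|^2, \]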

where in the last inequality the assumption $k \le \frac14 T$ is used.

After these preparations, we can prove the following two theorems leading to the main result of this subsection. We begin with an estimate for the interpolation error $u(\cdot) - u(t_m)$:

Theorem 5.3. On each time interval $I_m$ with $m = 1, 2, \dots, M$, for the solution $u \in X$ of (2.1), there holds
\[ \|u(\cdot) - u(t_m)\|_{L^\infty(I_m, L^2(\Omega))} \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}. \tag{5.10} \]

Proof. For simplicity, we only consider the last time interval $I_M$ and a fixed time point $\tilde t \in I_M$. Let $y$ and $\tilde y$ be the solutions of (4.5) and (5.1) with $y_T = u(\tilde t) - u(T)$. Using (4.5), (5.1), and (2.1), we obtain by integration by parts in time and the condition $u(0) = u_0$ that
\[ (u(T), u(\tilde t) - u(T)) = (f, y)_I + (u_0, y(0)), \]
\[ (u(\tilde t), u(\tilde t) - u(T)) = (f, \tilde y)_{\tilde I} + (u_0, \tilde y(0)). \]
This implies the relation
\[ \|u(T) - u(\tilde t)\|^2 = (f, \tilde y)_{\tilde I} - (f, y)_I + (u_0, (\tilde y - y)(0)) = (f, \tilde y - y)_{\tilde I} - \int_{\tilde t}^{T}(f(t), y(t))\,dt + (u_0, (\tilde y - y)(0)) \]
\[ \le \bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}\bigl\{\|\tilde y - y\|_{L^1(\tilde I, L^2(\Omega))} + k\|y\|_{L^\infty(I, L^2(\Omega))} + \|(\tilde y - y)(0)\|_{H^{-2}(\Omega)}\bigr\}. \]
By the a priori estimate from Theorem 4.3, we directly obtain
\[ \|y\|_{L^\infty(I, L^2(\Omega))} \le C\|u(T) - u(\tilde t)\|, \]
and the assertion of Lemma 5.1 completes the proof.

Furthermore, we estimate the error $u(t_m) - u_k(t_m)$:

Theorem 5.4. For the solution $u \in X$ of (2.1) and the dG(0) semidiscretized solution $u_k \in X^0_k$ of (3.3), there holds on each time interval $I_m$ with $m = 1, 2, \dots, M$ the error estimate
\[ \|u(t_m) - u_{k,m}\| \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}. \tag{5.11} \]

Proof. For simplicity, we only consider the last time point $t_M = T$. The proof employs a duality argument. Let $y \in X$ and $y_k \in X^0_k$ be the solutions of (4.5) and (4.7) with $y_T = e_{k,M} = u(t_M) - u_{k,M}$. By Galerkin orthogonality, we have
\[ \|e_{k,M}\|^2 = B(e_k, y) = B(e_k, y - y_k) = B(u, y - y_k) = (f, y - y_k)_I + (u_0, y(0) - y_{k,1}) \]
\[ \le \bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}\bigl\{\|y - y_k\|_{L^1(I, L^2(\Omega))} + \|y(0) - y_{k,1}\|_{H^{-2}(\Omega)}\bigr\}. \]
Then, the assertion of Lemma 5.2 completes the proof.
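The logarithmic factor in (5.5) and (5.11) stems from the weighted sum $\sum_{m} k_m\tau_{k,m}^{-1}$ with $\tau_{k,m} = T - t_{m-1}$ that appears in the proof of Lemma 5.2. The following small sketch, which assumes uniform time steps $k_m = k = T/M$ and uses a hypothetical helper name, evaluates this sum and compares it with $\ln(T/k) + 1$ and with $2\ln(T/k)$. The second comparison is where the assumption $k \le \frac14 T$ enters, since it guarantees $\ln(T/k) \ge \ln 4 > 1$.

```python
import math


def weighted_step_sum(T, M):
    """Sum of k_m / tau_{k,m}, tau_{k,m} = T - t_{m-1}, on a uniform grid k_m = T/M."""
    k = T / M
    return sum(k / (T - (m - 1) * k) for m in range(1, M + 1))


if __name__ == "__main__":
    T = 1.0
    for M in (4, 16, 256, 4096):
        s = weighted_step_sum(T, M)
        log_term = math.log(M)  # equals ln(T/k) for k = T/M
        print(M, s, log_term + 1.0, 2.0 * log_term)
```

On a uniform grid the sum is the harmonic number $1 + \frac12 + \dots + \frac1M$, which is why it grows only logarithmically in $T/k$.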
Based on the previous theorems, we can now state the main result of this subsection.

Corollary 5.5. For the error $e_k := u - u_k$ between the solution $u \in X$ of (2.1) and the dG(0) semidiscretized solution $u_k \in X^0_k$ of (3.3), there holds the estimate
\[ \|e_k\|_{L^\infty(I, L^2(\Omega))} \le Ck\Bigl(\ln\frac{T}{k}\Bigr)^{1/2}\bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}. \tag{5.12} \]

Proof. We decompose the error on $I_m$ for $m = 1, 2, \dots, M$ as follows:
\[ \|e_k\|_{L^\infty(I_m, L^2(\Omega))} \le \|u(\cdot) - u(t_m)\|_{L^\infty(I_m, L^2(\Omega))} + \|u(t_m) - u_k(\cdot)\|_{L^\infty(I_m, L^2(\Omega))} = \|u(\cdot) - u(t_m)\|_{L^\infty(I_m, L^2(\Omega))} + \|u(t_m) - u_{k,m}\|. \]
Then, the assertion of the corollary follows from Theorems 5.3 and 5.4.

5.2. Analysis of the spatial discretization error. In this section, we analyze the spatial discretization error $e_h$. To derive the main result, we first need to prove a sequence of auxiliary lemmas.

Lemma 5.6. For the solutions $v_k \in X^0_k$ of (4.6) and $v_{kh} \in X^{0,1}_{k,h}$ of (4.8) and the solutions $y_k \in X^0_k$ of (4.7) and $y_{kh} \in X^{0,1}_{k,h}$ of (4.9), there holds
(a) if $v_0, y_T \in H$:
\[ \|v_k - v_{kh}\|_I \le Ch\|v_0\|, \tag{5.13} \]
\[ \|y_k - y_{kh}\|_I \le Ch\|y_T\|, \tag{5.14} \]
(b) if $v_0, y_T \in V \cap H^2(\Omega)$:
\[ \|v_k - v_{kh}\|_I \le C\sqrt{T}\,h^2\|\Delta v_0\|, \tag{5.15} \]
\[ \|y_k - y_{kh}\|_I \le C\sqrt{T}\,h^2\|\Delta y_T\|. \tag{5.16} \]

Proof. We prove only the assertions for $v_k - v_{kh}$. The assertions for $y_k - y_{kh}$ can be obtained by similar arguments. We use the splitting
\[ \eta_h := v_k - v_{kh} = v_k - P_h v_k + P_h\eta_h, \]
where $P_h \colon V \to V_h$ denotes the $L^2$ projection in space. The application of $P_h$ to time-dependent arguments has to be understood pointwise in time. By Galerkin orthogonality, it holds
\[ B(P_h\eta_h, \varphi) = B(P_h v_k - v_k, \varphi) \quad \forall\varphi \in X^{0,1}_{k,h}, \]
which can be rewritten by means of the definitions $P_h\eta_{h,0} := 0$ and $P_h v_{k,0} - v_{k,0} := 0$ as the following system of equations, for $m = 1, 2, \dots, M$:
\[ (\nabla P_h\eta_h, \nabla\varphi)_{I_m} + ([P_h\eta_h]_{m-1}, \varphi_m) = (\nabla(P_h v_k - v_k), \nabla\varphi)_{I_m} + ([P_h v_k - v_k]_{m-1}, \varphi_m) \quad \forall\varphi \in P_0(I_m, V_h). \tag{5.17} \]
For $u \in V$ and $u_h \in V_h$, we define the Ritz projection $R_h \colon V \to V_h$ and the discrete Laplacian $\Delta_h \colon V_h \to V_h$ by the relations
\[ (\nabla R_h u, \nabla\varphi) = (\nabla u, \nabla\varphi) \quad \forall\varphi \in V_h \qquad\text{and}\qquad -(\Delta_h u_h, \varphi) = (\nabla u_h, \nabla\varphi) \quad \forall\varphi \in V_h. \]

As for $P_h$, the application of $\Delta_h$ and $R_h$ to time-dependent arguments has to be understood pointwise in time. Taking $\varphi = -\Delta_h^{-1}P_h\eta_h$ in (5.17) and observing the definitions of the projectors $P_h$ and $R_h$, we conclude
\[ \|P_h\eta_h\|^2_{I_m} - ([P_h\eta_h]_{m-1}, \Delta_h^{-1}P_h\eta_{h,m}) = -(\nabla(P_h v_k - v_k), \nabla\Delta_h^{-1}P_h\eta_h)_{I_m} \]
\[ = -(\nabla(P_h v_k - R_h v_k), \nabla\Delta_h^{-1}P_h\eta_h)_{I_m} - (\nabla(R_h v_k - v_k), \nabla\Delta_h^{-1}P_h\eta_h)_{I_m} = (P_h v_k - R_h v_k, P_h\eta_h)_{I_m}. \]
By the definition of $\Delta_h$ and $R_h$, this implies
\[ \|P_h\eta_h\|^2_{I_m} + ([\nabla\Delta_h^{-1}P_h\eta_h]_{m-1}, \nabla\Delta_h^{-1}P_h\eta_{h,m}) = (P_h v_k - R_h v_k, P_h\eta_h)_{I_m}. \]
We remark that the trick of comparing $v_{kh}$ to $P_h v_k$ and $R_h v_k$ rather than directly to $v_k$ is crucial and has been introduced into the error analysis of parabolic problems in [29]. Then, the algebraic identity (4.16) and Young's inequality lead us to
\[ \|P_h\eta_h\|^2_{I_m} + \|\nabla\Delta_h^{-1}P_h\eta_{h,m}\|^2 \le \|\nabla\Delta_h^{-1}P_h\eta_{h,m-1}\|^2 + \|P_h v_k - R_h v_k\|^2_{I_m}. \]
By adding these inequalities for $m = 1, 2, \dots, M$, we arrive at
\[ \|P_h\eta_h\|^2_I \le \|P_h v_k - R_h v_k\|^2_I, \]
and observing $\eta_h = v_k - P_h v_k + P_h\eta_h$ gives us
\[ \|\eta_h\|^2_I \le \|v_k - P_h v_k\|^2_I + \|P_h v_k - R_h v_k\|^2_I. \]
(a) For $v_0 \in H$, we have
\[ \|\eta_h\|^2_I \le Ch^2\|\nabla v_k\|^2_I, \]
and Theorem 4.5 implies the asserted estimate.
(b) For $v_0 \in V \cap H^2(\Omega)$, we have
\[ \|\eta_h\|^2_I \le Ch^4\|\Delta v_k\|^2_I \le CTh^4\max_{m=1,2,\dots,M}\|\Delta v_{k,m}\|^2, \]
and Theorem 4.6 implies the asserted estimate. The proof is complete.

Lemma 5.7. For the solutions $v_k \in X^0_k$ of (4.6) and $v_{kh} \in X^{0,1}_{k,h}$ of (4.8) and the solutions $y_k \in X^0_k$ of (4.7) and $y_{kh} \in X^{0,1}_{k,h}$ of (4.9), there holds
(a) if $v_0, y_T \in H$:
\[ \sqrt{T}\,\|v_{k,M} - v_{kh,M}\| \le Ch\|v_0\|, \tag{5.18} \]
\[ \sqrt{T}\,\|y_{k,1} - y_{kh,1}\| \le Ch\|y_T\|, \tag{5.19} \]
(b) if $v_0, y_T \in V \cap H^2(\Omega)$:
\[ \|v_{k,M} - v_{kh,M}\| \le Ch^2\|\Delta v_0\|, \tag{5.20} \]
\[ \|y_{k,1} - y_{kh,1}\| \le Ch^2\|\Delta y_T\|. \tag{5.21} \]

Proof. We only prove the assertion for $v_k - v_{kh}$. The assertion for $y_k - y_{kh}$ can be proved similarly. We use the splitting
\[ \eta_h := v_k - v_{kh} = v_k - R_h v_k + R_h\eta_h, \]
where $R_h$ denotes the Ritz projection onto $V_h$. By Galerkin orthogonality, there holds
\[ B(R_h\eta_h, \varphi) = B(R_h v_k - v_k, \varphi) \quad \forall\varphi \in X^{0,1}_{k,h}. \]
By means of the definitions $R_h\eta_{h,0} := 0$ and $R_h v_{k,0} - v_{k,0} := 0$ this can be rewritten as the following system of equations, for $m = 1, 2, \dots, M$:
\[ (\nabla R_h\eta_h, \nabla\varphi)_{I_m} + ([R_h\eta_h]_{m-1}, \varphi_m) = (\nabla(R_h v_k - v_k), \nabla\varphi)_{I_m} + ([R_h v_k - v_k]_{m-1}, \varphi_m) \]
for all $\varphi \in P_0(I_m, V_h)$. Taking $\varphi|_{I_m} = t_{m-1}R_h\eta_h$ and using the definition of the projector $R_h$ yields
\[ t_{m-1}\|\nabla R_h\eta_h\|^2_{I_m} + t_{m-1}([R_h\eta_h]_{m-1}, R_h\eta_{h,m}) = t_{m-1}([R_h v_k - v_k]_{m-1}, R_h\eta_{h,m}). \]
The identity (4.16) implies
\[ t_m\|R_h\eta_{h,m}\|^2 + 2t_{m-1}\|\nabla R_h\eta_h\|^2_{I_m} \le t_{m-1}\|R_h\eta_{h,m-1}\|^2 + k_m\|R_h\eta_{h,m}\|^2 + \frac{t_{m-1}^2}{k_m}\|[R_h v_k - v_k]_{m-1}\|^2 + k_m\|R_h\eta_{h,m}\|^2. \]
Summing this for $m = 2, 3, \dots, M$ and using $t_1 = k_1$, we obtain
\[ T\|R_h\eta_{h,M}\|^2 \le k_1\|R_h\eta_{h,1}\|^2 + \sum_{m=2}^{M}\frac{t_{m-1}^2}{k_m}\|[R_h v_k - v_k]_{m-1}\|^2 + 2\sum_{m=2}^{M}k_m\|R_h\eta_{h,m}\|^2. \]
Consequently,
\[ T\|R_h\eta_{h,M}\|^2 \le \sum_{m=2}^{M}\frac{t_{m-1}^2}{k_m}\|[R_h v_k - v_k]_{m-1}\|^2 + 2\|R_h\eta_h\|^2_I. \]
Since $\eta_h = v_k - R_h v_k + R_h\eta_h$, we conclude
\[ T\|\eta_{h,M}\|^2 \le 2T\|R_h v_{k,M} - v_{k,M}\|^2 + 2\sum_{m=2}^{M}\frac{t_{m-1}^2}{k_m}\|[R_h v_k - v_k]_{m-1}\|^2 + 4\|R_h v_k - v_k\|^2_I + 4\|\eta_h\|^2_I. \]
(a) For $v_0 \in H$, we obtain using the approximation properties of $R_h$ that
\[ T\|\eta_{h,M}\|^2 \le Ch^2\Bigl\{T\|\nabla v_{k,M}\|^2 + \|\nabla v_k\|^2_I + \sum_{m=2}^{M}\frac{t_{m-1}^2}{k_m}\|[\nabla v_k]_{m-1}\|^2\Bigr\} + C\|\eta_h\|^2_I. \]
Then, Theorem 4.5 and Lemma 5.6(a) imply the asserted estimate.
(b) For $v_0 \in V \cap H^2(\Omega)$, we obtain using the error estimates for $R_h$ that
\[ T\|\eta_{h,M}\|^2 \le Ch^4\,T\Bigl\{\max_{m=1,2,\dots,M}\|\Delta v_{k,m}\|^2 + \sum_{m=2}^{M}\frac{t_{m-1}}{k_m}\|[\Delta v_k]_{m-1}\|^2\Bigr\} + C\|\eta_h\|^2_I. \]
Then, Theorem 4.6 and Lemma 5.6(b) imply the asserted estimate.

The proof is complete.

Lemma 5.8. For the solutions $v_k \in X^0_k$ of (4.6) and $v_{kh} \in X^{0,1}_{k,h}$ of (4.8) and the solutions $y_k \in X^0_k$ of (4.7) and $y_{kh} \in X^{0,1}_{k,h}$ of (4.9), there holds
\[ \|v_{k,M} - v_{kh,M}\|_{H^{-2}(\Omega)} \le Ch^2\|v_0\|, \tag{5.22} \]
\[ \|y_{k,1} - y_{kh,1}\|_{H^{-2}(\Omega)} \le Ch^2\|y_T\|. \tag{5.23} \]

Proof. We only prove the assertion for $v_k - v_{kh}$. The assertion for $y_k - y_{kh}$ can be proved similarly. For the proof, we employ another duality argument. For $\eta_h := v_k - v_{kh}$, we recall that
\[ \|\eta_{h,M}\|_{H^{-2}(\Omega)} = \sup_{\psi\in H^2(\Omega)\cap V}\frac{(\eta_{h,M}, \psi)}{\|\Delta\psi\|}. \]
For any fixed $\psi \in H^2(\Omega) \cap V$, we consider the solutions $y_k \in X^0_k$ of (4.7) and $y_{kh} \in X^{0,1}_{k,h}$ of (4.9) with $y_T = \psi$. Using (4.7) with $\varphi = v_k$, (4.6) with $\varphi = y_k$, (4.9) with $\varphi = v_{kh}$, and (4.8) with $\varphi = y_{kh}$, we obtain
\[ (v_{k,M}, \psi) = B(v_k, y_k) = (v_0, y_{k,1}) \]
and
\[ (v_{kh,M}, \psi) = B(v_{kh}, y_{kh}) = (v_0, y_{kh,1}). \]
Consequently,
\[ (\eta_{h,M}, \psi) = (v_0, y_{k,1} - y_{kh,1}) \le \|v_0\|\,\|y_{k,1} - y_{kh,1}\|. \]
Then, Lemma 5.7(b) implies
\[ (\eta_{h,M}, \psi) \le Ch^2\|v_0\|\,\|\Delta\psi\|, \]
which yields the asserted estimate.

Lemma 5.9. For the solutions $y_k \in X^0_k$ of (4.7) and $y_{kh} \in X^{0,1}_{k,h}$ of (4.9), there holds
\[ T\|y_{k,1} - y_{kh,1}\| \le Ch^2\|y_T\|. \tag{5.24} \]

Proof. The proof employs a bootstrap argument based on the sub-optimal error estimate of Lemma 5.7. We use the solutions $v_k \in X^0_k$ of (4.6) and $v_{kh} \in X^{0,1}_{k,h}$ of (4.8) with $v_0 = y_{k,1} - y_{kh,1}$. Considering the equations (4.6) with $\varphi = y_k$, (4.7) with $\varphi = v_k$, (4.8) with $\varphi = y_{kh}$, and (4.9) with $\varphi = v_{kh}$ on $\{0\} \cup I_1 \cup I_2 \cup \dots \cup I_{\tilde m}$ with some $\tilde m \le M$ and $\varphi = 0$ elsewhere yields
\[ (y_{k,1} - y_{kh,1}, y_{k,1}) = B(v_k, y_k) = (v_{k,\tilde m}, y_{k,\tilde m+1}) \]
and
\[ (y_{k,1} - y_{kh,1}, y_{kh,1}) = B(v_{kh}, y_{kh}) = (v_{kh,\tilde m}, y_{kh,\tilde m+1}). \]
Hence, with $\xi_h := y_k - y_{kh}$ and $\eta_h := v_k - v_{kh}$, we obtain
\[ \|\xi_{h,1}\|^2 = (\eta_{h,\tilde m}, y_{k,\tilde m+1}) - (\eta_{h,\tilde m}, \xi_{h,\tilde m+1}) + (v_{k,\tilde m}, \xi_{h,\tilde m+1}) \]
\[ \le \|\eta_{h,\tilde m}\|_{H^{-2}(\Omega)}\|\Delta y_{k,\tilde m+1}\| + \|\eta_{h,\tilde m}\|\,\|\xi_{h,\tilde m+1}\| + \|\Delta v_{k,\tilde m}\|\,\|\xi_{h,\tilde m+1}\|_{H^{-2}(\Omega)}. \tag{5.25} \]

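From Theorem 4.5, we have
\[ t_{\tilde m}\|\Delta v_{k,\tilde m}\| \le C\|\xi_{h,1}\| \quad\text{and}\quad \tau_{k,\tilde m+1}\|\Delta y_{k,\tilde m+1}\| \le C\|y_T\|, \]
and further, by Lemma 5.8,
\[ \|\eta_{h,\tilde m}\|_{H^{-2}(\Omega)} \le Ch^2\|\xi_{h,1}\| \quad\text{and}\quad \|\xi_{h,\tilde m+1}\|_{H^{-2}(\Omega)} \le Ch^2\|y_T\|, \]
and by Lemma 5.7
\[ t_{\tilde m}^{1/2}\|\eta_{h,\tilde m}\| \le Ch\|\xi_{h,1}\| \quad\text{and}\quad \tau_{k,\tilde m+1}^{1/2}\|\xi_{h,\tilde m+1}\| \le Ch\|y_T\|. \]
For the three terms on the right-hand side of (5.25), we obtain
\[ \|\eta_{h,\tilde m}\|_{H^{-2}(\Omega)}\|\Delta y_{k,\tilde m+1}\| \le Ch^2\tau_{k,\tilde m+1}^{-1}\|\xi_{h,1}\|\,\|y_T\|, \]
\[ \|\eta_{h,\tilde m}\|\,\|\xi_{h,\tilde m+1}\| \le Ch^2 t_{\tilde m}^{-1/2}\tau_{k,\tilde m+1}^{-1/2}\|\xi_{h,1}\|\,\|y_T\|, \]
\[ \|\Delta v_{k,\tilde m}\|\,\|\xi_{h,\tilde m+1}\|_{H^{-2}(\Omega)} \le Ch^2 t_{\tilde m}^{-1}\|\xi_{h,1}\|\,\|y_T\|. \]
We choose $\tilde m$ such that $\frac12 T \in I_{\tilde m}$. Using the assumption $k \le \frac14 T$, this implies
\[ t_{\tilde m} \ge \tfrac12 T \quad\text{and}\quad \tau_{k,\tilde m+1} = T - t_{\tilde m} \ge \tfrac12 T - k_{\tilde m} \ge \tfrac14 T. \]
Hence, we conclude
\[ \|\xi_{h,1}\|^2 \le C\frac{h^2}{T}\|\xi_{h,1}\|\,\|y_T\|, \]
which implies the asserted estimate.

After these preparations, we can now prove the main result of this subsection.

Theorem 5.10. For the dG(0) semidiscretized solution $u_k \in X^0_k$ of (3.3) and the fully discretized solution $u_{kh} \in X^{0,1}_{k,h}$ of (3.11), we have the error estimate
\[ \max_{m=1,2,\dots,M}\|u_{k,m} - u_{kh,m}\| \le Ch^2\ln\frac{T}{k}\bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}. \tag{5.26} \]

Proof. For simplicity, we only consider the last time point $t_M = T$. The proof again employs a duality argument. Let $y_k \in X^0_k$ and $y_{kh} \in X^{0,1}_{k,h}$ be the solutions of (4.7) and (4.9), respectively, with $y_T = e_{h,M} = u_{k,M} - u_{kh,M}$. Using Galerkin orthogonality, we obtain
\[ \|e_{h,M}\|^2 = B(e_h, y_k) = B(e_h, y_k - y_{kh}) = B(u_k, y_k - y_{kh}) = (f, y_k - y_{kh})_I + (u_0, y_{k,1} - y_{kh,1}) \]
\[ \le \bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}\bigl\{\|y_k - y_{kh}\|_{L^1(I, L^2(\Omega))} + \|y_{k,1} - y_{kh,1}\|_{H^{-2}(\Omega)}\bigr\}. \]
Then, in view of the assumption $k \le \frac14 T$, Lemma 5.9 implies
\[ \|y_k - y_{kh}\|_{L^1(I, L^2(\Omega))} \le \sum_{m=1}^{M}k_m\tau_{k,m}^{-1}\max_{m=1,2,\dots,M}\bigl(\tau_{k,m}\|y_{k,m} - y_{kh,m}\|\bigr) \le Ch^2\Bigl(\ln\frac{T}{k} + 1\Bigr)\|e_{h,M}\| \le Ch^2\ln\frac{T}{k}\,\|e_{h,M}\|, \]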

and the assertion of Lemma 5.8 completes the proof.

The following corollary, which is a direct consequence of the previous theorem, states the main result of this subsection.

Corollary 5.11. For the error $e_h := u_k - u_{kh}$ between the dG(0) semidiscretized solution $u_k \in X^0_k$ of (3.3) and the fully discretized solution $u_{kh} \in X^{0,1}_{k,h}$ of (3.11), there holds the estimate
\[ \|e_h\|_{L^\infty(I, L^2(\Omega))} \le Ch^2\ln\frac{T}{k}\bigl\{\|f\|_{L^\infty(I, L^2(\Omega))} + \|\Delta u_0\|\bigr\}. \tag{5.27} \]

Proof. Since $u_k$ and $u_{kh}$ are constant on the time intervals $I_m$, $m = 1, 2, \dots, M$, the assertion is directly implied by Theorem 5.10.

6. Error analysis for the optimal control problem. In this section, we will prove the main result of this article.

Theorem 6.1. Let $\bar q \in Q_{ad}$ be the solution of the optimal control problem (2.3) with optimal state $\bar u \in X$ and $\bar q_\sigma \in Q_{d,ad}$ be the solution of the fully discrete optimal control problem (3.20) with discrete optimal state $\bar u_\sigma \in X^{0,1}_{k,h}$. Then, the following error estimate holds:
\[ \sqrt{\alpha}\,\|\bar q - \bar q_\sigma\|_I + \|\bar u - \bar u_\sigma\|_I \le C\Bigl\{k^{1/2}\Bigl(\ln\frac{T}{k}\Bigr)^{1/4} + h\Bigl(\ln\frac{T}{k}\Bigr)^{1/2} + \frac{1}{\sqrt{\alpha}}\,h\Bigr\}. \tag{6.1} \]

The proof of this result is divided into three steps reflecting the three steps of discretization introduced in Section 3. In each step the important tools will be the estimates for the state equation from the previous section and the (uniform) boundedness of the discrete Lagrange multipliers, cf. [8, 6].

6.1. Estimates for the error due to time discretization of the state.

Lemma 6.2. Let $\bar q_k \in Q_{ad}$ be the solution of (3.5) with state $\bar u_k \in X^0_k$ and corresponding Lagrange multiplier $\mu_k \in C(\bar I)^*$. Then, there exists $k_0 > 0$ such that
\[ \|\bar q_k\|_I + \|\bar u_k\|_I + \|\mu_k\|_{C(\bar I)^*} \le C \quad \forall k \le k_0. \tag{6.2} \]

Proof. Since $G(u(\tilde q))(\cdot) \in C(\bar I)$, the Slater condition (2.4) ensures the existence of $\delta > 0$ such that
\[ G(u(\tilde q)) \le b - \delta \quad\text{in } \bar I. \]
Since $Q_{ad} \subset L^\infty(I, L^2(\Omega))$, we have from Corollary 5.5 for $q \in Q_{ad}$ that
\[ \|u(q) - u_k(q)\|_{L^\infty(I, L^2(\Omega))} \to 0 \quad (k \to 0). \]
Hence, we also have
\[ G(u_k(\tilde q)) = G(u(\tilde q)) + G(u_k(\tilde q) - u(\tilde q)) \le b - \delta + \|\omega\|\,\|u(\tilde q) - u_k(\tilde q)\|_{L^\infty(I, L^2(\Omega))} < b \]
for $k \le k_0$.
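This implies
\[ J(\bar q_k, \bar u_k) \le J(\tilde q, u_k(\tilde q)) = \frac12\|u_k(\tilde q) - \hat u\|^2_I + \frac12\alpha\|\tilde q\|^2_I \le \|u_k(\tilde q) - u(\tilde q)\|^2_I + \|u(\tilde q) - \hat u\|^2_I + \frac12\alpha\|\tilde q\|^2_I \le C \]
for $k \le k_0$ and, consequently, the bound
\[ \|\bar q_k\|_I + \|\bar u_k\|_I \le C. \tag{6.3} \]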