Robust error estimates for regularization and discretization of bang-bang control problems


Daniel Wachsmuth, September 2, 2015

Abstract. We investigate the simultaneous regularization and discretization of an optimal control problem with pointwise control constraints. Typically such problems exhibit bang-bang solutions: the optimal control takes values at the control bounds almost everywhere. We derive discretization error estimates that are robust with respect to the regularization parameter. These estimates can be used to make an optimal choice of the regularization parameter with respect to discretization error estimates.

Keywords. Optimal control, bang-bang control, Tikhonov regularization, parameter choice rule.

AMS classification. 49K20, 49N45, 65K15.

1 Introduction

In this article we investigate the regularization and discretization of bang-bang control problems. The class of problems that we consider can be described as the minimization of

    $\frac12 \|Su - z\|_Y^2$    (P)

over all $u \in L^2(D)$ satisfying the constraint

    $u_a \le u \le u_b$ a.e. on $D$.    (1.1)

In line with the usual nomenclature in optimal control, the variable $u$ will be called control, while the variable $y := Su$ will be called state. In the problem above, $D$ is a bounded subset of $\mathbb{R}^n$. The operator $S$ is assumed to be linear and continuous from $L^2(D)$ to $Y$, with $Y$ being a Hilbert space. The target state $z \in Y$ is a given desired state. Moreover, we assume that the Hilbert space adjoint operator $S^*$ of $S$ maps from $Y$ to $L^\infty(D)$. Here, we have in mind optimal control problems for linear partial differential equations. In order to numerically solve (P), let us introduce a family of linear and continuous operators $\{S_h\}_{h>0}$ from $L^2(D)$ to $Y$ with finite-dimensional range $Y_h \subset Y$,

(This work was partially funded by Austrian Science Fund (FWF) grant P23848. Institut für Mathematik, Universität Würzburg, 97074 Würzburg, Germany, daniel.wachsmuth@mathematik.uni-wuerzburg.de)

where $h$ denotes the discretization parameter. We assume $S_h \to S$ for $h \to 0$ in a sense clarified below, see Section 1.4. Furthermore, we consider the Tikhonov regularization with parameter $\alpha > 0$. The regularized and discretized problem now reads: Minimize

    $\frac12 \|S_h u - z\|_Y^2 + \frac{\alpha}{2} \|u\|_{L^2(D)}^2$    (P_{\alpha,h})

subject to (1.1). Let us note that the control space is not discretized, which is the variational discretization concept introduced in [5]. Here, one is interested in convergence results with respect to $(\alpha, h) \to 0$. Moreover, the choice of the parameter $\alpha$ depending on discretization parameters is important for actual numerical computations.

Let us briefly review the existing literature on this subject. Most of the existing results assume a bang-bang structure of the optimal control $u_0$: $u_0(x) \in \{u_a(x), u_b(x)\}$ a.e. on $D$. Convergence rate estimates with respect to $\alpha \to 0$ for the undiscretized problem can be found in [2, 3]. There, an assumption on the measure of the almost-inactive set is used. Such an assumption has been applied in different situations in the literature as well; we mention only [4, 10]. Convergence rate estimates of the regularization error are also available without the assumption of bang-bang structure. There one has to resort to source conditions, see e.g. [7], and combinations of source conditions and active set conditions [3]. Discretization error estimates are well known in the literature. A-priori discretization error estimates concerning the discretization of (P_\alpha) for fixed $\alpha > 0$ can be found for instance in [5, 6]. In the case $\alpha = 0$ the analysis is much more delicate; a-priori discretization error estimates for this case can be found in [3]. Let us mention that the error estimates for the regularized problem ($\alpha > 0$) are not robust with respect to $\alpha$. Consequently, the results in the case $\alpha = 0$ cannot be obtained by passing to the limit $\alpha \to 0$. The coupling of discretization and regularization was discussed in [1].
There the regularization parameter $\alpha$ is chosen depending on a-priori or a-posteriori discretization error estimates. Let us emphasize that almost all of the literature cited above is concerned with convex problems. The extension to the non-convex case is not straightforward; we refer to [1] for results on second-order sufficient optimality conditions for bang-bang control problems. In a recent work, multi-bang control problems were investigated [2].

In this article, we discuss robust discretization error estimates for $\alpha > 0$. In these estimates, the regularization parameter can tend to zero, and the resulting estimate coincides with that of [3] in the case $\alpha = 0$. Of course, the robust estimates are not optimal in the discretization parameters. A second result is the resulting choice of the regularization parameter depending on discretization quantities. It turns out that it is optimal to choose $\alpha$ proportional to the $L^\infty$-discretization error. The surprising fact is that this choice is independent of the unknown constants appearing in the assumption on the almost-inactive set.

Notation. In the sequel, we will frequently use generic constants $c > 0$ that may change from line to line, but which are independent of relevant quantities such as $\alpha$ and $h$.
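To make the setting concrete, the interplay of discretization, Tikhonov regularization, and control constraints can be reproduced in a few lines. The following is a minimal numerical sketch, not the paper's implementation: the finite-difference surrogate for $S_h$, the grid size, the data $z$, the bounds, and the projected-gradient solver are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch: a 1-D finite-difference surrogate for S_h and a
# projected-gradient solve of the regularized, discretized problem
#   min 1/2 ||S_h u - z||^2 + alpha/2 ||u||^2   s.t.  u_a <= u <= u_b.
# All data (n, alpha, u_a, u_b, z) are illustrative choices.

n = 50                        # interior grid points on (0, 1)
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

# S_h: inverse of the 1-D Dirichlet Laplacian (finite differences)
L = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
S_h = np.linalg.inv(L)        # dense inverse is fine at this size

z = np.sin(np.pi * x)         # desired state
u_a, u_b = -1.0, 1.0          # control bounds
alpha = 1e-3                  # Tikhonov parameter

def solve(alpha, iters=5000):
    """Projected gradient for the discretized, regularized problem."""
    # step size below 2 / (||S_h||^2 + alpha) guarantees convergence
    tau = 1.0 / (np.linalg.norm(S_h, 2) ** 2 + alpha)
    u = np.zeros(n)
    for _ in range(iters):
        grad = S_h.T @ (S_h @ u - z) + alpha * u
        u = np.clip(u - tau * grad, u_a, u_b)   # projection onto U_ad
    return u

u = solve(alpha)
# fraction of grid points where the control sits at a bound
frac_at_bounds = np.mean((np.abs(u - u_a) < 1e-6) | (np.abs(u - u_b) < 1e-6))
```

For small $\alpha$ the computed control sits at the bounds on (almost) the whole grid, which is the bang-bang behavior discussed above.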

1.1 Assumptions and preliminary results

Let us specify the standing assumptions that we will use throughout the paper. Let $(D, \Sigma, \mu)$ be a given finite measure space. The operator $S$ is assumed to be linear and continuous from $L^2(D)$ to the Hilbert space $Y$. Moreover, we assume that the Hilbert space adjoint operator $S^*$ of $S$ maps from $Y$ to $L^\infty(D)$. Let us remark that the requirements on $S^*$ could be relaxed to allow $S^*$ mapping into $L^p(D)$, $p \in (2, \infty)$, see [3]. The control constraints are given with $u_a, u_b \in L^\infty(D)$ and $u_a \le u_b$ a.e. on $D$. The set of admissible controls is defined by

    $U_{ad} := \{u \in L^2(D) : u_a \le u \le u_b \text{ a.e. on } D\}$.

As already introduced, we will work with a family of operators $\{S_h\}_{h>0}$, $S_h \in \mathcal{L}(L^2(D), Y)$, with finite-dimensional range. The adjoint operators $S_h^*$ are assumed to map from $Y$ to $L^\infty(D)$. For completeness, let us define the regularized version of the undiscretized problem: Given $\alpha > 0$, minimize

    $\frac12 \|Su - z\|_Y^2 + \frac{\alpha}{2} \|u\|_{L^2(D)}^2$    (P_\alpha)

subject to the inequality constraints (1.1). Due to classical results, the problems (P), (P_\alpha), (P_{\alpha,h}) admit solutions.

Proposition 1.1. The problems (P) and (P_{\alpha,h}) are solvable with convex and bounded sets of solutions. The problems (P_\alpha) and (P_{\alpha,h}) are uniquely solvable for $\alpha > 0$. The set of optimal states of (P), $\{Su : u \text{ solves (P)}\}$, is a singleton. Moreover, (P) is uniquely solvable if $S$ is injective.

Proof. Due to the control constraints, the admissible set $U_{ad}$ is bounded in $L^2(D)$. Moreover, it is convex and closed in $L^2(D)$, hence weakly closed. By assumption, the functionals to be minimized are convex and continuous from $L^2(D)$ to $\mathbb{R}$, which implies their weak lower semicontinuity. The existence of solutions of the problems (P), (P_\alpha), and (P_{\alpha,h}) follows from the Weierstraß theorem, where $L^2(D)$ is supplied with the weak topology. Since the objective functional and the admissible set are convex, the set of solutions of these problems is convex.
If $\alpha > 0$ then the objective functionals of (P_\alpha) and (P_{\alpha,h}) are strictly convex, which implies that the solutions are unique. Moreover, the mapping $Su \mapsto \frac12\|Su - z\|^2$ is strictly convex, hence the sets of optimal states for (P) and (P_{\alpha,h}) with $\alpha = 0$ are singletons. If in addition $S$ is injective, then the uniqueness of solutions of (P) follows.

Let us note that the solutions of (P) and (P_{\alpha,h}) may not be uniquely determined if $\alpha = 0$. However, the optimal states of (P) and (P_{0,h}) are uniquely determined due to the strict convexity of the cost functional with respect to $Su$.

1.2 Necessary optimality conditions

The solutions of the considered problems can be characterized by means of first-order necessary optimality conditions, which are sufficient as well due to the convexity of the problems.

Proposition.2. For α 0 let u α and u α,h be solutions of P α ) and P α,h ), respectively. Let us define y α := Su α, y α,h := S h u α,h, p α := S y α z), and p α,h := S h y α,h z). Then it holds αu α + p α, u u α ) 0 u U ad and αu α,h + p α,h, u u α,h ) 0 u U ad. These variational inequalities imply the following pointwise variational inequality [9, Lemma 2.26] αu α x) + p α x))u u α x)) 0 u [u a x), u b x)] f.a.a. x D..2) In the case α > 0 this yields pointwise a.e. representations for the optimal control u α x) = proj [uax),u b x)] ) α p αx) f.a.a. x D. If α = 0 we have = u a x) if p 0 x) > 0 u 0 x) [u a x), u b x)] if p 0 x) = 0 = u b x) if p 0 x) < 0 f.a.a. x D..3) Similar relations hold for u α,h and u 0,h. For α = 0, the controls u 0 and u 0,h are bang-bang if p 0 0 and p 0,h 0 a.e. on D, respectively. Moreover, if p 0 = 0 and p 0,h = 0 on sets of positive measure then the values of u 0 and u 0,h cannot be determined by the respective variational inequalities.2)..3 Regularization error estimate Let us now recall the assumption on the almost-inactive sets. It is widely used in the literature, as it can be viewed as a strengthened complementarity condition. Assumption. Let us assume that there are κ > 0, c > 0 such that holds for all ɛ > 0. meas {x D : p 0 x) ɛ} c ɛ κ As discussed above, the optimal state y 0 = Su 0 and consequently the optimal adjoint state p 0 = S y 0 z) are uniquely determined. Under the assumption above, the optimal control u 0 is uniquely determined as well and has bang-bang type. Corollary.3. Let Assumption be satisfied. Then the problem P) is uniquely solvable, and its solution u 0 is bang-bang. Proof. By proposition. the optimal state y 0 := Su 0 with u 0 being a solution of P) is uniquely determined. Hence, also the adjoint p 0 = S y 0 z) is uniquely determined. Assumption implies that the set {x D : p 0 x) = 0} has measure zero. 
Then the pointwise representation (1.3) of solutions of (P) completely determines the values of a solution $u_0$ on $D$ up to sets of zero measure. This implies that (P) is uniquely solvable. Moreover, it follows that $u_0(x) \in \{u_a(x), u_b(x)\}$ for almost all $x \in D$ by (1.3). Hence, $u_0$ is of bang-bang type.
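The pointwise projection formula for $u_\alpha$ and the bang-bang limit (1.3) can be illustrated directly. A small sketch with illustrative adjoint values and bounds (none of this data is from the paper):

```python
import numpy as np

# Pointwise projection formula u_alpha(x) = proj_[ua,ub](-p_alpha(x)/alpha)
# and its alpha -> 0 limit (1.3). The adjoint values p are illustrative.

def u_alpha(p, alpha, ua=-1.0, ub=1.0):
    """Regularized control computed from adjoint values via clipping."""
    return np.clip(-p / alpha, ua, ub)

p = np.array([0.5, -0.2, 1e-8, -1e-8])   # sample adjoint values

# for alpha much smaller than |p(x)| the control sits at a bound:
u = u_alpha(p, alpha=1e-4)
# where p > 0 the control is at the lower bound, where p < 0 at the upper
assert u[0] == -1.0 and u[1] == 1.0
# where p(x) is nearly 0 (here 1e-8 << alpha) the value stays off the bounds
assert abs(u[2]) < 1e-3 and abs(u[3]) < 1e-3
```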

Assumption 1 not only guarantees uniqueness of solutions of (P), but it is also sufficient to prove convergence rates with respect to $\alpha$ for $\alpha \to 0$.

Proposition 1.4. Let Assumption 1 be satisfied. Let $d$ be defined by

    $d = \begin{cases} \frac{1}{2-\kappa} & \text{if } \kappa \le 1, \\ \frac{\kappa+1}{2} & \text{if } \kappa > 1. \end{cases}$

Then for every $\alpha_{max} > 0$ there exists a constant $c > 0$ such that

    $\|y_0 - y_\alpha\|_Y + \|p_0 - p_\alpha\|_{L^\infty(D)} \le c\,\alpha^d$,
    $\|u_0 - u_\alpha\|_{L^2(D)} \le c\,\alpha^{d - 1/2}$,
    $\|u_0 - u_\alpha\|_{L^1(D)} \le c\,\alpha^{d - 1/2 + \frac{\kappa}{2}\min(d,1)}$

holds for all $\alpha \in (0, \alpha_{max}]$. The constant $c$ depends on $\alpha_{max}$.

Proof. For the proof we refer to [3, Theorem 3.2].

Let us present a small result to ease the work with the convergence rates stated above.

Lemma 1.5. Let $\kappa > 0$ be given. Let $d$ satisfy

    $d = \begin{cases} \frac{1}{2-\kappa} & \text{if } \kappa \le 1, \\ \frac{\kappa+1}{2} & \text{if } \kappa > 1. \end{cases}$

Then it holds $\kappa \min(1, d) + 1 = 2d$.

Proof. In the case $\kappa \le 1$ we have $\kappa \min(1, d) + 1 = \frac{\kappa}{2-\kappa} + 1 = \frac{2}{2-\kappa} = 2d$, whereas in the case $\kappa > 1$ we obtain $\kappa \min(1, d) + 1 = \kappa + 1 = 2d$.

Remark 1.6. With this small lemma at hand, we can simplify the convergence rate of Proposition 1.4 to

    $\|u_0 - u_\alpha\|_{L^1(D)} \le c\,\alpha^{2d-1}$,

since it holds $\frac{\kappa}{2}\min(1, d) = d - \frac12$ by Lemma 1.5 above.

1.4 Discretization error estimates

Let us now turn to the discretization of the considered optimal control problems. As already mentioned in the introduction, we consider approximations $S_h$ of the operator $S$. In order to control the discretization error we make the following assumption.

Assumption 2. There exist continuous and monotonically increasing functions $\delta_2(h), \delta_\infty(h) : \mathbb{R}^+ \to \mathbb{R}^+$ with $\delta_2(0) = \delta_\infty(0) = 0$ such that it holds

    $\|(S - S_h)u_{\alpha,h}\|_Y + \|(S^* - S_h^*)(y_{\alpha,h} - z)\|_{L^2(D)} \le \delta_2(h)$,
    $\|(S^* - S_h^*)(y_{\alpha,h} - z)\|_{L^\infty(D)} \le \delta_\infty(h)$    (1.4)

for all $h > 0$ and $\alpha \ge 0$.
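The case distinction in the definition of $d$ and the identity of Lemma 1.5 are easy to check numerically; a small sketch over a few sample values of $\kappa$:

```python
# Numerical check of Lemma 1.5: with d = 1/(2 - kappa) for kappa <= 1 and
# d = (kappa + 1)/2 for kappa > 1, it holds kappa*min(1, d) + 1 = 2d.

def d_of(kappa):
    """Exponent d from Proposition 1.4 as a function of kappa."""
    return 1.0 / (2.0 - kappa) if kappa <= 1.0 else (kappa + 1.0) / 2.0

for kappa in [0.25, 0.5, 1.0, 1.5, 2.0, 3.0]:
    d = d_of(kappa)
    assert abs(kappa * min(1.0, d) + 1.0 - 2.0 * d) < 1e-12
```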

Note that this assumption contains the unknown solutions of discretized optimal control problems. The functions $\delta_2(h)$ and $\delta_\infty(h)$ can be realized by means of a-posteriori error estimators, see e.g. [1]. In the analysis, we can as well allow for a-priori discretization error estimates, see the comments below in Remark 1.9. Under this assumption one can prove discretization error estimates for (P_\alpha).

Proposition 1.7. Let Assumption 2 be satisfied. Let $\alpha > 0$. Then there is a constant $c > 0$ independent of $\alpha, h$ such that

    $\|y_\alpha - y_{\alpha,h}\|_Y + \alpha^{1/2}\|u_\alpha - u_{\alpha,h}\|_{L^2(D)} \le c\,\big(1 + \alpha^{-1/2}\big)\,\delta_2(h)$,
    $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)} \le c\,\Big(\delta_\infty(h) + \big(1 + \alpha^{-1/2}\big)\,\delta_2(h)\Big)$

holds for all $h > 0$.

Proof. For the proof, we refer to [5, Theorem 2.4] and [1, Proposition 1.8].

Obviously these error estimates are not robust with respect to $\alpha \to 0$. In the case $\alpha = 0$ we have an independent discretization error estimate of [3], which relies on Assumption 1.

Proposition 1.8. Let Assumptions 1 and 2 be satisfied. Let $d$ be as in Proposition 1.4. Then for every $h_{max} > 0$ there is a constant $c > 0$ such that

    $\|y_0 - y_{0,h}\|_Y \le c\,\big(\delta_2(h) + \delta_\infty(h)^d\big)$,
    $\|p_0 - p_{0,h}\|_{L^\infty(D)} \le c\,\big(\delta_2(h) + \delta_\infty(h)^{\min(d,1)}\big)$,
    $\|u_0 - u_{0,h}\|_{L^1(D)} \le c\,\big(\delta_2(h)^\kappa + \delta_\infty(h)^{\kappa \min(d,1)}\big)$

holds for all $h < h_{max}$. The constant $c$ depends on $h_{max}$.

Proof. For the proof, we refer to [3, Theorem 2.2] and [1, Proposition 1.9].

Remark 1.9. As already mentioned, the estimates above are also valid if Assumption 2 on the discretization error is replaced by an a-priori variant. To this end, let us assume that there exist continuous and monotonically increasing functions $\tilde\delta_2(h), \tilde\delta_\infty(h) : \mathbb{R}^+ \to \mathbb{R}^+$ with $\tilde\delta_2(0) = \tilde\delta_\infty(0) = 0$ such that it holds

    $\|(S - S_h)u_\alpha\|_Y + \|(S^* - S_h^*)(y_\alpha - z)\|_{L^2(D)} \le \tilde\delta_2(h)$,
    $\|(S^* - S_h^*)(y_\alpha - z)\|_{L^\infty(D)} \le \tilde\delta_\infty(h)$    (1.5)

for all $h > 0$ and $\alpha \ge 0$. Here, the error estimate still depends on unknown solutions of (P_\alpha). However, it is standard to prove uniform norm bounds on $y_\alpha$ and $u_\alpha$, which then can be used to apply standard (e.g.
finite element) a-priori error estimates to deduce the existence of $\tilde\delta_2$ and $\tilde\delta_\infty$, see [3, 5] for the related analysis with $S$ being the solution operator of an elliptic equation. According to the discussion in [1], the results of Propositions 1.7 and 1.8 remain valid if $\delta_2$ and $\delta_\infty$ are replaced by $\tilde\delta_2$ and $\tilde\delta_\infty$, respectively.

2 Robust discretization error estimates

As one can see from the statement of Proposition 1.7, the error estimate is not robust with respect to $\alpha \to 0$, as the constant on the right-hand side behaves like $\alpha^{-1/2}$ for $\alpha \to 0$. In particular, this means that the estimates of Proposition 1.8 cannot be obtained from those of Proposition 1.7. The purpose of this section is to prove an a-priori error estimate for $\alpha > 0$ that is robust for $\alpha \to 0$.

Lemma 2.1. Let Assumption 2 be satisfied. Let $\alpha > 0$. Then it holds

    $\frac12\|y_\alpha - y_{\alpha,h}\|_Y^2 + \alpha\|u_\alpha - u_{\alpha,h}\|_{L^2(D)}^2 \le \frac12\delta_2(h)^2 + \delta_\infty(h)\,\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$

for all $h > 0$ and $\alpha > 0$.

Proof. Since $u_{\alpha,h}$ and $u_\alpha$ are feasible for (P_\alpha) and (P_{\alpha,h}), respectively, we can use both functions as test functions in the variational inequalities of Proposition 1.2. Adding the obtained inequalities yields

    $(\alpha u_\alpha + p_\alpha,\, u_{\alpha,h} - u_\alpha) + (\alpha u_{\alpha,h} + p_{\alpha,h},\, u_\alpha - u_{\alpha,h}) \ge 0$.

This implies

    $\alpha\|u_\alpha - u_{\alpha,h}\|_{L^2(D)}^2 \le (p_\alpha - p_{\alpha,h},\, u_{\alpha,h} - u_\alpha)$.    (2.1)

Using the definitions of $p_\alpha$ and $p_{\alpha,h}$, we obtain

    $(p_\alpha - p_{\alpha,h},\, u_{\alpha,h} - u_\alpha) = (S^*(y_\alpha - z) - S_h^*(y_{\alpha,h} - z),\, u_{\alpha,h} - u_\alpha)$
    $= (S^*(y_\alpha - y_{\alpha,h}) + (S^* - S_h^*)(y_{\alpha,h} - z),\, u_{\alpha,h} - u_\alpha)$.

Here the second addend can be estimated as

    $((S^* - S_h^*)(y_{\alpha,h} - z),\, u_{\alpha,h} - u_\alpha) \le \delta_\infty(h)\,\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$,    (2.2)

where we used Assumption 2. (If we had estimated

    $((S^* - S_h^*)(y_{\alpha,h} - z),\, u_{\alpha,h} - u_\alpha) \le \|(S^* - S_h^*)(y_{\alpha,h} - z)\|_{L^2(D)}\,\|u_{\alpha,h} - u_\alpha\|_{L^2(D)}$

instead, we would obtain the non-robust estimate of Proposition 1.7.) We continue by investigating the first addend in the above estimate:

    $(S^*(y_\alpha - y_{\alpha,h}),\, u_{\alpha,h} - u_\alpha) = (y_\alpha - y_{\alpha,h},\, Su_{\alpha,h} - Su_\alpha)_Y$
    $= (y_\alpha - y_{\alpha,h},\, (S - S_h)u_{\alpha,h} + S_h u_{\alpha,h} - Su_\alpha)_Y$
    $= -\|y_\alpha - y_{\alpha,h}\|_Y^2 + (y_\alpha - y_{\alpha,h},\, (S - S_h)u_{\alpha,h})_Y$
    $\le -\frac12\|y_\alpha - y_{\alpha,h}\|_Y^2 + \frac12\delta_2(h)^2$,    (2.3)

where we used Assumption 2 in the last step. Combining the estimates (2.1), (2.2), and (2.3) yields

    $\frac12\|y_\alpha - y_{\alpha,h}\|_Y^2 + \alpha\|u_\alpha - u_{\alpha,h}\|_{L^2(D)}^2 \le \frac12\delta_2(h)^2 + \delta_\infty(h)\,\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$,

which is the claim.

The $L^1$-error in the previous estimate can be bounded if the regularity assumption, i.e. Assumption 1, is fulfilled.

Lemma 2.2. Let Assumption 1 be satisfied. Let $\alpha > 0$. Let $v_\alpha, q_\alpha \in L^\infty(D)$ be given satisfying the projection formula

    $v_\alpha(x) = \mathrm{proj}_{[u_a(x),\,u_b(x)]}\big({-\tfrac{1}{\alpha}} q_\alpha(x)\big)$ f.a.a. $x \in D$.

Then there is a constant $c > 0$ independent of $\alpha$ and $(v_\alpha, q_\alpha)$ such that

    $\|u_0 - v_\alpha\|_{L^1(D)} \le c\,\big(\alpha^\kappa + \|p_0 - q_\alpha\|_{L^\infty(D)}^\kappa\big)$,
    $\|u_0 - v_\alpha\|_{L^2(D)} \le c\,\big(\alpha^{\kappa/2} + \|p_0 - q_\alpha\|_{L^\infty(D)}^{\kappa/2}\big)$

holds for all $\alpha > 0$.

Proof. This follows from [3, Lemma 3.3].

Now we have everything at hand to derive the robust error estimate.

Theorem 2.3. Let Assumptions 1 and 2 be satisfied. Then for every $h_{max} > 0$ and $\alpha_{max} > 0$ there is a constant $c > 0$ such that

    $\|y_\alpha - y_{\alpha,h}\|_Y \le c\,\big(\delta_2(h) + \delta_\infty(h)^d + \alpha^{d-1/2}\delta_\infty(h)^{1/2}\big)$,
    $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)} \le c\,\big(\delta_2(h) + \delta_\infty(h)^{\min(d,1)} + \alpha^{d-1/2}\delta_\infty(h)^{1/2}\big)$,
    $\|u_\alpha - u_{\alpha,h}\|_{L^1(D)} \le c\,\big(\delta_2(h)^\kappa + \delta_\infty(h)^{\kappa\min(d,1)} + \alpha^{\kappa(d-1/2)}\delta_\infty(h)^{\kappa/2} + \alpha^{\kappa\min(1,d)}\big)$

holds for all $h < h_{max}$ and $\alpha \in (0, \alpha_{max}]$. Here, $d$ is given by Proposition 1.4. The constant $c$ depends on $\alpha_{max}$ and $h_{max}$.

Proof. Let $h_{max} > 0$ and $\alpha_{max} > 0$ be given, and let $h < h_{max}$ and $\alpha \in (0, \alpha_{max}]$ be arbitrary. In order to invoke Lemma 2.1, we start with an estimate of $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$. With the result of Lemma 2.2 we obtain

    $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)} \le \|u_{\alpha,h} - u_0\|_{L^1(D)} + \|u_0 - u_\alpha\|_{L^1(D)}$
    $\le c\,\big(\alpha^\kappa + \|p_0 - p_{\alpha,h}\|_{L^\infty(D)}^\kappa + \|p_0 - p_\alpha\|_{L^\infty(D)}^\kappa\big)$
    $\le c\,\big(\alpha^\kappa + \|p_0 - p_\alpha\|_{L^\infty(D)}^\kappa + \|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)}^\kappa\big)$    (2.4)

with constants $c > 0$ independent of $\alpha$ and $h$. Due to the regularization error estimate of Proposition 1.4, we have $\|p_0 - p_\alpha\|_{L^\infty(D)}^\kappa \le c\,\alpha^{\kappa d}$. Here, $c > 0$ depends on $\alpha_{max}$. The discretization error $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)}^\kappa$ can be estimated by

    $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)} \le \|p_\alpha - S^*(y_{\alpha,h} - z)\|_{L^\infty(D)} + \|S^*(y_{\alpha,h} - z) - p_{\alpha,h}\|_{L^\infty(D)}$
    $\le \|S^*(y_\alpha - y_{\alpha,h})\|_{L^\infty(D)} + \|(S^* - S_h^*)(y_{\alpha,h} - z)\|_{L^\infty(D)}$
    $\le c\,\big(\|y_\alpha - y_{\alpha,h}\|_Y + \delta_\infty(h)\big)$.    (2.5)

This proves

    $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)} \le c\,\big(\alpha^{\kappa\min(1,d)} + \|y_\alpha - y_{\alpha,h}\|_Y^\kappa + \delta_\infty(h)^\kappa\big)$
    $= c\,\big(\alpha^{2d-1} + \|y_\alpha - y_{\alpha,h}\|_Y^\kappa + \delta_\infty(h)^\kappa\big)$,    (2.6)

where in the last step we used Lemma 1.5. With the result of Lemma 2.1 we obtain

    $\frac12\|y_\alpha - y_{\alpha,h}\|_Y^2 \le \frac12\delta_2(h)^2 + \delta_\infty(h)\,\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$
    $\le \frac12\delta_2(h)^2 + c\,\delta_\infty(h)\,\big(\alpha^{2d-1} + \|y_\alpha - y_{\alpha,h}\|_Y^\kappa + \delta_\infty(h)^\kappa\big)$.    (2.7)

Let us first consider the case $\kappa < 2$, in which the term $\|y_\alpha - y_{\alpha,h}\|_Y^\kappa$ can be absorbed by the left-hand side. By Young's inequality we find

    $c\,\delta_\infty(h)\,\|y_\alpha - y_{\alpha,h}\|_Y^\kappa \le \frac14\|y_\alpha - y_{\alpha,h}\|_Y^2 + c\,\delta_\infty(h)^{\frac{2}{2-\kappa}}$.

This implies

    $\frac14\|y_\alpha - y_{\alpha,h}\|_Y^2 \le c\,\big(\delta_2(h)^2 + \alpha^{2d-1}\delta_\infty(h) + \delta_\infty(h)^{\frac{2}{2-\kappa}} + \delta_\infty(h)^{\kappa+1}\big) \le c\,\big(\delta_2(h)^2 + \alpha^{2d-1}\delta_\infty(h) + \delta_\infty(h)^{2d}\big)$

with $c > 0$ depending additionally on $h_{max}$. Using (2.5) and (2.6) to estimate $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)}$ and $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$, respectively, yields the claim in the case $\kappa < 2$.

Let us now prove the claim for the case $\kappa \ge 2$. Here, it is not possible to absorb $\|y_\alpha - y_{\alpha,h}\|_Y^\kappa$. We will derive a rough estimate of this quantity first. Due to the control constraints, the term $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$ is uniformly bounded. Hence, we obtain by Lemma 2.1

    $\|y_\alpha - y_{\alpha,h}\|_Y^2 \le c\,\big(\delta_2(h)^2 + \delta_\infty(h)\big)$.

Using this upper bound of $\|y_\alpha - y_{\alpha,h}\|_Y$ in (2.7) yields

    $\frac12\|y_\alpha - y_{\alpha,h}\|_Y^2 \le \frac12\delta_2(h)^2 + c\,\delta_\infty(h)\,\big(\alpha^{2d-1} + \delta_2(h)^{2\kappa} + \delta_\infty(h)^\kappa\big)$.

By Young's inequality we obtain

    $\delta_\infty(h)\,\delta_2(h)^{2\kappa} \le c\,\big(\delta_\infty(h)^{\kappa+1} + \delta_2(h)^{2\kappa\cdot\frac{\kappa+1}{\kappa}}\big) = c\,\big(\delta_\infty(h)^{\kappa+1} + \delta_2(h)^{2(\kappa+1)}\big)$.

Since $\kappa + 1 = 2d > 1$, this proves

    $\|y_\alpha - y_{\alpha,h}\|_Y^2 \le c\,\big(\delta_2(h)^2 + \delta_\infty(h)\,\alpha^{2d-1} + \delta_\infty(h)^{\kappa+1}\big)$,

with $c > 0$ depending additionally on $h_{max}$. The estimates of $\|p_\alpha - p_{\alpha,h}\|_{L^\infty(D)}$ and $\|u_{\alpha,h} - u_\alpha\|_{L^1(D)}$ now follow from (2.5) and (2.6), respectively.

Let us compare the results of this theorem to the results of Propositions 1.7 and 1.8. Clearly, the convergence rates of Theorem 2.3 with respect to the discretization quantities are smaller than the rates given by Proposition 1.7 in

the case $\alpha > 0$. But the estimates of Theorem 2.3 are not only robust with respect to $\alpha \to 0$, they are also optimal in the case $\alpha = 0$, as they coincide with the rates given by Proposition 1.8. Here, it would be interesting to search for results that combine both advantages: namely, results that provide convergence rates similar to Proposition 1.7 in the case $\alpha > 0$ and that are at the same time robust with respect to $\alpha \to 0$ and yield the convergence rate of Proposition 1.8 in the case $\alpha = 0$.

3 A-priori regularization parameter choice

The results of the previous section give rise to a-priori parameter choice rules $\alpha(h)$, where $\alpha$ is chosen depending on $\delta_\infty(h)$. Here it is important to ensure that the additional error introduced by the regularization is of the same order as the discretization error. It is favorable to obtain error estimates that are of the same order as the ones that are available for $\alpha = 0$, see Proposition 1.8 above.

Theorem 3.1. Let Assumptions 1 and 2 be satisfied. Let $\alpha$ be chosen such that $\alpha(h) = \delta_\infty(h)$. Then for every $h_{max} > 0$ there is $c > 0$ such that

    $\|y_0 - y_{\alpha(h),h}\|_Y \le c\,\big(\delta_2(h) + \delta_\infty(h)^d\big)$,
    $\|p_0 - p_{\alpha(h),h}\|_{L^\infty(D)} \le c\,\big(\delta_2(h) + \delta_\infty(h)^{\min(d,1)}\big)$,
    $\|u_0 - u_{\alpha(h),h}\|_{L^1(D)} \le c\,\big(\delta_2(h)^\kappa + \delta_\infty(h)^{\kappa\min(d,1)}\big)$

holds for all $h < h_{max}$, where $c$ depends on $h_{max}$ but is independent of $h$.

Proof. Let us first investigate the error $\|y_0 - y_{\alpha(h),h}\|_Y$. We have

    $\|y_0 - y_{\alpha(h),h}\|_Y \le \|y_0 - y_{\alpha(h)}\|_Y + \|y_{\alpha(h)} - y_{\alpha(h),h}\|_Y$.

The first addend can be estimated by Proposition 1.4 using the assumption $\alpha(h) = \delta_\infty(h)$:

    $\|y_0 - y_{\alpha(h)}\|_Y \le c\,\alpha(h)^d \le c\,\delta_\infty(h)^d$.

Applying Theorem 2.3 we can bound the second term from above as

    $\|y_{\alpha(h)} - y_{\alpha(h),h}\|_Y \le c\,\big(\delta_2(h) + \delta_\infty(h)^d + \alpha(h)^{d-1/2}\delta_\infty(h)^{1/2}\big) \le c\,\big(\delta_2(h) + \delta_\infty(h)^d\big)$.

This implies the claimed estimate

    $\|y_0 - y_{\alpha(h),h}\|_Y \le c\,\big(\delta_2(h) + \delta_\infty(h)^d\big)$.

Similarly, we can prove $\|p_0 - p_{\alpha(h),h}\|_{L^\infty(D)} \le c\,\big(\delta_2(h) + \delta_\infty(h)^{\min(1,d)}\big)$ and the estimate of $\|u_0 - u_{\alpha(h),h}\|_{L^1(D)}$.

This result proves convergence rates with respect to the discretization while still allowing for some regularization. The obtained convergence rate with respect to $\delta_2$ and $\delta_\infty$ is optimal in the following sense: we prove the same convergence rates as in the case $\alpha = 0$. The surprising fact about this result is that the optimal parameter choice $\alpha(h) = \delta_\infty(h)$ is independent of the unknown parameter $\kappa$ in Assumption 1, which played a key role in all the analysis above. Hence, this result is perfectly suited to be used in computations that are adaptive both in the discretization and in the regularization. Moreover, the theorem above yields the optimal convergence rate for all $\kappa > 0$, whereas the discrepancy-principle-based parameter choice rule of the previous work [1] only yields optimal rates for $\kappa \ge 1$. Additionally, the result gives a theoretical explanation for the numerical results in [1]. There, the discrepancy principle selects $\alpha(h) \approx h^2$, which is (up to logarithmic terms) the underlying convergence rate of the $L^\infty$-error for the problem considered there.

Let us mention that an a-priori choice of the regularization parameter based on the non-robust estimate of Proposition 1.7 does not yield optimal convergence rates. With the estimates of Propositions 1.4 and 1.7 we have

    $\|y_0 - y_{\alpha,h}\|_Y \le \|y_0 - y_\alpha\|_Y + \|y_\alpha - y_{\alpha,h}\|_Y \le c\,\big(\alpha^d + \alpha^{-1/2}\delta_2(h) + \delta_2(h)\big)$.

In order to balance the regularization and discretization error, one would have to choose $\hat\alpha(h) \approx \delta_2(h)^{\frac{2}{2d+1}}$, which would lead to

    $\|y_0 - y_{\hat\alpha(h),h}\|_Y \le c\,\big(\delta_2(h)^{\frac{2d}{2d+1}} + \delta_2(h)\big)$.

In the case $\kappa = 1$, we would get $\|y_0 - y_{\hat\alpha(h),h}\|_Y \le c\,\delta_2(h)^{2/3}$. The resulting convergence rate thus is much lower than the optimal one obtained for the choice $\alpha \approx \delta_\infty(h)$.

Remark 3.2. The same results can be obtained if we want to use a-priori type discretization error estimates as in Remark 1.9 above. The results of Theorems 2.3 and 3.1 remain valid if $\delta_2$ and $\delta_\infty$ are replaced by $\tilde\delta_2$ and $\tilde\delta_\infty$, respectively.
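The gap between the two parameter choices can be illustrated with synthetic numbers. Assuming, for illustration only, $\kappa = 1$ (hence $d = 1$) and $\delta_2(h) = \delta_\infty(h) = h^2$ (a typical second-order discretization, up to logarithmic factors), the robust choice predicts a state error of order $h^2$, while balancing the non-robust estimate only predicts order $h^{4/3}$:

```python
import numpy as np

# Illustrative comparison for kappa = 1 (hence d = 1), assuming
# delta_2(h) = delta_inf(h) = h**2.
# Robust choice alpha = delta_inf(h):  error ~ delta + delta**d   = O(h**2).
# Non-robust balancing alpha_hat ~ delta**(2/(2d+1)):
#                                      error ~ delta**(2d/(2d+1)) = O(h**(4/3)).

h = np.array([1/8, 1/16, 1/32, 1/64, 1/128])
delta = h**2

err_robust = delta + delta**1.0            # d = 1
err_nonrobust = delta**(2.0/3.0) + delta   # 2d/(2d+1) = 2/3

# observed convergence orders between consecutive mesh levels
order = lambda e: np.log2(e[:-1] / e[1:])
assert np.all(order(err_robust) > 1.9)     # close to O(h^2)
assert np.all(order(err_nonrobust) < 1.5)  # only about O(h^(4/3))
```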
4 Relation to parameter choice rules

In the previous work [1] the following parameter choice rule was studied. There $\alpha$ was chosen by

    $\bar\alpha(h) := \sup\{\alpha > 0 : I_{\alpha,h} \le \delta_2(h)^2 + \delta_\infty(h)^2\}$.    (4.1)

Here, the discrepancy measure $I_{\alpha,h}$ was defined by

    $I_{\alpha,h} := \int_{\{x :\, p_{\alpha,h} > 0\}} (u_{\alpha,h} - u_a)\,p_{\alpha,h}\,d\mu + \int_{\{x :\, p_{\alpha,h} < 0\}} (u_{\alpha,h} - u_b)\,p_{\alpha,h}\,d\mu$.

This choice was motivated by the estimate

    $\|y_{0,h} - y_{\alpha,h}\|_Y^2 \le (u_{\alpha,h} - u_{0,h},\, p_{\alpha,h})_{L^2(D)} \le I_{\alpha,h}$,

see [1, Lemma 2.1]. Let us now study whether the choice $\alpha(h) = \delta_\infty(h)$ satisfies the relation

    $I_{\alpha(h),h} \le \delta_2(h)^2 + \delta_\infty(h)^2$,

which would imply $\bar\alpha(h) \ge \alpha(h)$. In order to show this, let us cite the following estimate of $I_{\alpha,h}$:

    $I_{\alpha,h} \le c\,\alpha\,\big(\|p_0 - p_{\alpha,h}\|_{L^\infty(D)}^\kappa + \alpha^\kappa\big)$,    (4.2)

which is taken from [1, (2.3)].

Theorem 4.1. Let Assumptions 1 and 2 be satisfied. Let $\alpha$ be chosen such that $\alpha(h) = \delta_\infty(h)$. Then for every $h_{max} > 0$ there is $c > 0$ such that

    $I_{\alpha(h),h} \le c\,\big(\delta_\infty(h)^{2d} + \delta_2(h)^{2d}\big)$

holds for all $h < h_{max}$. Hence, if $\kappa > 1$ then there is $h_0 > 0$ such that

    $I_{\alpha(h),h} \le c\,\big(\delta_2(h)^2 + \delta_\infty(h)^2\big)$

holds for all $h < h_0$. The constant $c$ depends on $h_{max}$ but is independent of $h$.

Proof. Using estimate (4.2) and Theorem 3.1 we find

    $I_{\alpha(h),h} \le c\,\alpha(h)\,\big(\|p_0 - p_{\alpha(h),h}\|_{L^\infty(D)}^\kappa + \alpha(h)^\kappa\big) \le c\,\big(\delta_\infty(h)\,\delta_2(h)^\kappa + \delta_\infty(h)^{\kappa\min(1,d)+1} + \delta_\infty(h)^{\kappa+1}\big)$.

By Young's inequality we obtain, using $2d \le \kappa + 1$,

    $\delta_\infty(h)\,\delta_2(h)^\kappa \le c\,\big(\delta_\infty(h)^{\kappa+1} + \delta_2(h)^{\kappa\cdot\frac{\kappa+1}{\kappa}}\big) \le c\,\big(\delta_\infty(h)^{2d} + \delta_2(h)^{2d}\big)$.

With the help of Lemma 1.5 we find

    $I_{\alpha(h),h} \le c\,\big(\delta_\infty(h)^{2d} + \delta_2(h)^{2d}\big)$,

which proves the claim.

This shows that the a-priori choice of the regularization parameter given by Theorem 3.1 satisfies the discrepancy principle (4.1) above in the case $\kappa > 1$. In the case $\kappa = 1$ (and hence $d = 1$) one can prove a similar result if one replaces (4.1) with

    $\bar\alpha(h) := \sup\{\alpha > 0 : I_{\alpha,h} \le \tau\,(\delta_2(h)^2 + \delta_\infty(h)^2)\}$,

where $\tau > 0$ has to be chosen sufficiently small. The theorem above shows that for sufficiently small $h$ the inequality $\bar\alpha(h) \ge \alpha(h)$ is satisfied. Here one would like to prove the reverse estimate. Such an estimate seems not to be available, as it would require working with estimates of the discrepancy measure $I_{\alpha,h}$ from below.
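The discrepancy measure $I_{\alpha,h}$ is straightforward to evaluate for discrete data. A minimal sketch with illustrative vectors (a uniform cell measure `w` is assumed; this is not the paper's code):

```python
import numpy as np

# Sketch of the discrepancy measure I_{alpha,h}: given discrete control and
# adjoint vectors u, p and a uniform cell measure w, I >= 0 always, and
# I = 0 exactly when the control sits at the correct bound wherever p != 0.
def discrepancy(u, p, ua=-1.0, ub=1.0, w=1.0):
    pos, neg = p > 0, p < 0
    return w * (np.sum((u[pos] - ua) * p[pos])
                + np.sum((u[neg] - ub) * p[neg]))

p = np.array([0.4, -0.3, 0.1])
u_exact = np.array([-1.0, 1.0, -1.0])   # bang-bang, consistent with (1.3)
assert discrepancy(u_exact, p) == 0.0

u_reg = np.array([-0.5, 0.8, -1.0])     # regularized control, off the bounds
assert discrepancy(u_reg, p) > 0.0
```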

5 Application to an elliptic optimal control problem

Let us now report on numerical experiments for an elliptic optimal control problem. This problem is given by: Minimize

J(y, u) := (1/2) ‖y − y_d‖²_{L²(Ω)}

subject to the elliptic equation

−Δy = u + f in Ω,    y = 0 on ∂Ω,

and the control constraints

u_a ≤ u ≤ u_b a.e. on Ω.

Here, Ω ⊂ R^n is a given bounded, polygonal, and convex domain with boundary ∂Ω. Moreover, we have y_d, f, u_a, u_b ∈ L∞(Ω) with u_a ≤ u_b a.e. on Ω. The solution operator S, S : u ↦ y, satisfies the assumptions made at the beginning of the paper. We set z := y_d − S f. Then the control problem can be rephrased equivalently in the form (P).

Let us now describe the discretization by finite elements. We use a family F = {T_h}_{h>0} of regular meshes T_h, consisting of closed cells T ∈ T_h. For T ∈ T_h let us define h_T := diam T. We assume that there is a constant R such that h_T ≤ R · R_T for all h > 0, T ∈ T_h, where R_T is the diameter of the largest ball contained in T. The discrete space V_h is defined as

V_h := {v ∈ H¹_0(Ω) : v|_T ∈ P_1(T)}.

The discrete operator S_h is defined as the solution mapping of the discrete weak formulation of the elliptic equation. That is, S_h u := y_h ∈ V_h, where y_h solves

∫_Ω ∇y_h · ∇v_h dμ = ∫_Ω u v_h dμ    for all v_h ∈ V_h.

Please note that S and S_h are self-adjoint, S = S* and S_h = S_h*. We will report on experiments using a-priori as well as a-posteriori estimates to calculate δ_∞(h).

5.1 A-priori error estimates

Let us assume in addition that the meshes T_h are quasi-uniform, that is, there is a constant M > 1 such that max_{T ∈ T_h} h_T ≤ M min_{T ∈ T_h} h_T for all h > 0. In this case, we have the following optimal a-priori rates, which hold under additional regularity assumptions, see [3]:

‖(S − S_h) f‖_{L∞(Ω)} ≤ c h² |log h|^{r(n)} ‖f‖_{L∞(Ω)}

with r(2) = 2, r(3) = 11/4.
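As a small illustration of the discrete solution operator, the sketch below builds a five-point finite-difference stand-in for S_h on the unit square (the paper uses piecewise linear finite elements; the stencil is merely a simple way to exhibit the symmetry S_h = S_h* numerically). All names are illustrative.

```python
import numpy as np

def discrete_laplacian(m):
    """System matrix of the 5-point discretization of -Δ on the unit square
    with homogeneous Dirichlet conditions; m interior points per direction,
    h = 1/(m+1). A stand-in for the finite element stiffness matrix."""
    h = 1.0 / (m + 1)
    n = m * m
    A = np.zeros((n, n))
    for i in range(m):
        for j in range(m):
            k = i * m + j
            A[k, k] = 4.0 / h**2
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < m and 0 <= jj < m:
                    A[k, ii * m + jj] = -1.0 / h**2
    return A

def apply_Sh(A, u):
    """y_h = S_h u: solve the discrete state equation -Δ_h y = u."""
    return np.linalg.solve(A, u)

m = 15
A = discrete_laplacian(m)
# S_h is self-adjoint because the system matrix is symmetric:
assert np.allclose(A, A.T)
# Discrete maximum principle: a positive source gives a positive state.
y = apply_Sh(A, np.ones(m * m))
```

For the constant source u ≡ 1, the discrete maximum of y approximates the known continuum value ≈ 0.0737 of the Poisson problem on the unit square.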

5.2 A-posteriori error estimates

For comparison, we also used a-posteriori error estimates to determine the discretization error. We used the reliable and efficient error estimator from [8], which is defined as follows. Let us set η_{y_{α,h},∞} := max_{T ∈ T_h} η_{T,y_{α,h},∞} and

η_{T,y_{α,h},∞} := |log h_min|² ( h_T² ‖Δy_{α,h} + u_{α,h} + f‖_{L∞(T)} + h_T ‖[∇y_{α,h} · n]‖_{L∞(∂T \ Γ)} ),

where h_min := min_{T ∈ T_h} h_T, and [v]_E denotes the jump of the quantity v across an edge E. Then it holds

‖(S − S_h) u_{α,h}‖_{L∞(Ω)} ≤ c η_{y_{α,h},∞}

with c > 0 independent of h and α. The error estimator η_{p_{α,h},∞} for the error in the adjoint equation is defined analogously.

5.3 Numerical experiments

Let us now report on the outcome of the numerical experiments. The following data was used: Ω = (0, 1)², u_a = −1, u_b = +1,

z(x_1, x_2) = sin(πx_1) sin(πx_2) + sin(2πx_1) sin(2πx_2),
f(x_1, x_2) = −sign(sin(2πx_1) sin(2πx_2)) + 2π² sin(πx_1) sin(πx_2).

It is easy to check that (P) admits the following unique solution:

u_0(x_1, x_2) = sign(sin(2πx_1) sin(2πx_2)),
y_0(x_1, x_2) = sin(πx_1) sin(πx_2),
p_0(x_1, x_2) = −(1/(8π²)) sin(2πx_1) sin(2πx_2).

In addition, it turns out that the regularity assumption is satisfied for all κ < 1. This implies by Proposition 1.4, with d = 1,

‖u_0 − u_α‖_{L¹(D)} ≤ c α.

Moreover, with the choice α(h) = δ_∞(h), we have by Theorem 3.1

‖u_0 − u_{α(h),h}‖_{L¹(D)} ≤ c (δ_2(h) + δ_∞(h)).

Since Ω is bounded by assumption, we can estimate the L²-error by the L∞-error to obtain

‖u_0 − u_{α(h),h}‖_{L¹(D)} ≤ c δ_∞(h).

The initial mesh was obtained by dividing the domain Ω into 32 triangles with mesh size h = 0.3536. This mesh was then successively refined using uniform refinement. Let us first describe the different tests of regularization parameter choices. We emphasize that both choices of the regularization parameter do not need information about the value of κ. We comment on the results below in Section 5.4.
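The stated solution triple can be verified against the optimality system with a few lines of standard-library Python. The check below assumes the adjoint convention −Δp_0 = y_0 − z and the bang-bang relation u_0 = −sign(p_0); the closed-form Laplacians use Δ(sin(aπx_1) sin(aπx_2)) = −2a²π² sin(aπx_1) sin(aπx_2).

```python
import math

def s(x1, x2):   # sin(2πx1) sin(2πx2), the common factor in u0 and p0
    return math.sin(2*math.pi*x1) * math.sin(2*math.pi*x2)

def sgn(t):
    return (t > 0) - (t < 0)

def y0(x1, x2): return math.sin(math.pi*x1) * math.sin(math.pi*x2)
def p0(x1, x2): return -s(x1, x2) / (8*math.pi**2)
def u0(x1, x2): return sgn(s(x1, x2))
def f(x1, x2):  return -sgn(s(x1, x2)) + 2*math.pi**2 * y0(x1, x2)
def z(x1, x2):  return y0(x1, x2) + s(x1, x2)

# Closed-form negative Laplacians of y0 and p0
def neg_lap_y0(x1, x2): return 2*math.pi**2 * y0(x1, x2)
def neg_lap_p0(x1, x2): return -s(x1, x2)

for x1 in (0.13, 0.37, 0.81):
    for x2 in (0.22, 0.64, 0.93):
        # state equation:  -Δ y0 = u0 + f
        assert abs(neg_lap_y0(x1, x2) - (u0(x1, x2) + f(x1, x2))) < 1e-12
        # adjoint equation:  -Δ p0 = y0 - z
        assert abs(neg_lap_p0(x1, x2) - (y0(x1, x2) - z(x1, x2))) < 1e-12
        # bang-bang relation:  u0 = -sign(p0) wherever p0 != 0
        assert u0(x1, x2) == -sgn(p0(x1, x2))
```

The sample points avoid the zero set of p_0, where the sign relation leaves the control undetermined.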

h             ‖u_0 − u_{α(h),h}‖_{L¹(D)}   ‖p_0 − p_{α(h),h}‖_{L∞(D)}   α(h)
3.5355 · 10⁻¹   3.2840 · 10⁻¹   1.0098 · 10⁻²   9.7656 · 10⁻⁴
1.7678 · 10⁻¹   1.045 · 10⁻¹    3.4007 · 10⁻³   2.4414 · 10⁻⁴
8.8388 · 10⁻²   3.455 · 10⁻²    8.7735 · 10⁻⁴   1.2207 · 10⁻⁴
4.4194 · 10⁻²   7.7750 · 10⁻³   2.340 · 10⁻⁴    1.5259 · 10⁻⁵
2.2097 · 10⁻²   1.864 · 10⁻³    5.677 · 10⁻⁵    3.8147 · 10⁻⁶
1.1049 · 10⁻²   4.4220 · 10⁻⁴   1.2606 · 10⁻⁵   9.5367 · 10⁻⁷
(rate)          h²              h²              h²

Table 1: Choice by discrepancy principle, a-posteriori error estimate

h             ‖u_0 − u_{α(h),h}‖_{L¹(D)}   ‖p_0 − p_{α(h),h}‖_{L∞(D)}   α(h)
3.5355 · 10⁻¹   5.8409 · 10⁻¹   1.028 · 10⁻²    6.2500 · 10⁻³
1.7678 · 10⁻¹   1.766 · 10⁻¹    3.4236 · 10⁻³   1.5625 · 10⁻³
8.8388 · 10⁻²   5.2228 · 10⁻²   8.803 · 10⁻⁴    3.9063 · 10⁻⁴
4.4194 · 10⁻²   1.4560 · 10⁻²   2.445 · 10⁻⁴    9.7656 · 10⁻⁵
2.2097 · 10⁻²   3.408 · 10⁻³    5.2005 · 10⁻⁵   2.4414 · 10⁻⁵
1.1049 · 10⁻²   8.086 · 10⁻⁴    1.2699 · 10⁻⁵   6.1035 · 10⁻⁶
(rate)          h²              h²              h²

Table 2: A-priori choice of regularization parameter, a-priori error estimates

5.3.1 Choice by discrepancy principle, a-posteriori error estimate

First, let us report on the outcome of the computations if the regularization parameter is chosen by the discrepancy principle (4.1). The costly computation of the supremum in (4.1) was replaced by the following principle:

α(h) := sup{α : α = 2^{−j} α_0, j ∈ N, I_{α,h} ≤ τ (η_{y_{α,h},∞} + η_{p_{α,h},∞})}

with τ = 2 · 10⁻⁴ and α_0 = 10⁻⁴. If z ≠ 0, then the supremum exists, see [11, Lemma 3.1]. The results of the computation can be found in Table 1.

5.3.2 A-priori choice of regularization parameter, a-priori error estimates

Second, we applied the following strategy. According to Theorem 3.1, we choose α(h) ∼ δ_∞(h). Here, we used δ_∞(h) = h², ignoring the logarithmic term, cf. Section 5.1. The results of the computation can be found in Table 2.

5.3.3 A-priori choice of regularization parameter, a-posteriori error estimates

The third strategy we investigated was the following. Again α was chosen proportionally to the L∞-error. Here, we used the a-posteriori estimates of Section 5.2. Then the regularization parameter α(h) was chosen iteratively: For

h             ‖u_0 − u_{α(h),h}‖_{L¹(D)}   ‖p_0 − p_{α(h),h}‖_{L∞(D)}   α(h)
3.5355 · 10⁻¹   4.8075 · 10⁻¹   1.077 · 10⁻²    3.6464 · 10⁻³
1.7678 · 10⁻¹   1.625 · 10⁻¹    3.4200 · 10⁻³   1.2837 · 10⁻³
8.8388 · 10⁻²   4.898 · 10⁻²    8.7957 · 10⁻⁴   3.3204 · 10⁻⁴
4.4194 · 10⁻²   1.3059 · 10⁻²   2.395 · 10⁻⁴    8.377 · 10⁻⁵
2.2097 · 10⁻²   3.73 · 10⁻³     5.785 · 10⁻⁵    2.0964 · 10⁻⁵
1.1049 · 10⁻²   7.6073 · 10⁻⁴   1.2645 · 10⁻⁵   5.2427 · 10⁻⁶
(rate)          h²              h²              h²

Table 3: A-priori choice of regularization parameter, a-posteriori error estimates

a fixed mesh, we solved the regularized discretized problem. If α ≤ η_{y_{α,h},∞} + η_{p_{α,h},∞} was satisfied, we set α(h) := α and accepted the discrete solution. If α was larger than η_{y_{α,h},∞} + η_{p_{α,h},∞}, we set α := (1/2)(η_{y_{α,h},∞} + η_{p_{α,h},∞}) and solved the discrete problem again. The results of the computation can be found in Table 3.

5.4 Comments on the computational results

All three tests showed similar performance. Although the choices of the regularization parameter were conceptually different, the outcomes are very similar. All the methods choose α(h) ∼ h². The resulting errors in control, state, and adjoint behaved like

‖u_0 − u_{α(h),h}‖_{L¹(D)}, ‖y_0 − y_{α(h),h}‖_{L²(D)}, ‖p_0 − p_{α(h),h}‖_{L∞(D)} ∼ h².

Moreover, the results using the optimistic a-priori rate δ_∞(h) = h² and the results using the a-posteriori estimator agree to a large extent. Hence, the choice of the a-priori rate is justified. These results show that both parameter choices, a-priori as well as a-posteriori, lead to comparable results.

6 Conclusion

In this paper, we presented a robust error estimate for the optimization problem under consideration. This robust estimate lends itself to an a-priori regularization parameter choice, which results in optimal convergence rates with respect to the discretization parameter. The numerical results confirm these findings. Moreover, they show that this a-priori choice leads to the same accuracy as the a-posteriori choice based on a discrepancy principle.

References

[1] E. Casas. Second order analysis for bang-bang control problems of PDEs. SIAM J. Control Optim., 50(4):2355–2372, 2012.

[2] C. Clason and K. Kunisch. Multi-bang control of elliptic systems. Ann. Inst. H. Poincaré Anal. Non Linéaire, 2013.

[3] K. Deckelnick and M. Hinze. A note on the approximation of elliptic control problems with bang-bang controls. Comput. Optim. Appl., 51(2):931–939, 2012.

[4] U. Felgenhauer. On stability of bang-bang type controls. SIAM J. Control Optim., 41(6):1843–1867, 2003.

[5] M. Hinze. A variational discretization concept in control constrained optimization: the linear-quadratic case. Comput. Optim. Appl., 30:45–63, 2005.

[6] C. Meyer and A. Rösch. Superconvergence properties of optimal control problems. SIAM J. Control Optim., 43(3):970–985, 2004.

[7] A. Neubauer. Tikhonov-regularization of ill-posed linear operator equations on closed convex sets. J. Approx. Theory, 53(3):304–320, 1988.

[8] R. H. Nochetto, A. Schmidt, K. G. Siebert, and A. Veeser. Pointwise a posteriori error estimates for monotone semi-linear equations. Numer. Math., 104(4):515–538, 2006.

[9] F. Tröltzsch. Optimal control of partial differential equations, volume 112 of Graduate Studies in Mathematics. AMS, Providence, RI, 2010.

[10] M. Ulbrich and S. Ulbrich. Superlinear convergence of affine-scaling interior-point Newton methods for infinite-dimensional nonlinear problems with pointwise bounds. SIAM J. Control Optim., 38(6):1938–1984, 2000.

[11] D. Wachsmuth. Adaptive regularization and discretization of bang-bang optimal control problems. ETNA, 40:249–267, 2013.

[12] G. Wachsmuth and D. Wachsmuth. Convergence and regularization results for optimal control problems with sparsity functional. ESAIM Control Optim. Calc. Var., 17(3):858–886, 2011.

[13] G. Wachsmuth and D. Wachsmuth. On the regularization of optimization problems with inequality constraints. Control and Cybernetics, 41:25–54, 2011.