Optimal Control Theory


Adv. Scientific Computing III: Optimal Control Theory
Catalin Trenchea, Department of Mathematics, University of Pittsburgh, Fall 2008

Control systems in Banach spaces

Let $X$ be a Banach space with norm $\|\cdot\|$. Consider the differential equation

$y'(t) = Ay(t) + Bu(t), \quad t \in (0,T], \qquad y(0) = y_0$  (1.1)

where
- $A$ is the infinitesimal generator of a semigroup $S(t) = e^{At}$,
- $u \in L^1(0,T;U)$, with $U$ a Banach space with norm $\|\cdot\|_U$,
- $B \in L(U,X)$, $y_0 \in X$.

Under these conditions there exists a mild solution $y \in C([0,T];X)$:

$y(t) = e^{At}y_0 + \int_0^t e^{A(t-s)}Bu(s)\,ds, \quad t \in [0,T].$
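For a finite-dimensional surrogate of this setting, the mild-solution formula can be evaluated directly by quadrature. A minimal sketch, assuming the scalar system $a = -1$, $b = 1$ with the constant control $u \equiv 1$ (illustrative choices, not from the lecture):

```python
import numpy as np

def mild_solution(a, b, u, y0, t, n=2000):
    """Evaluate y(t) = e^{at} y0 + integral_0^t e^{a(t-s)} b u(s) ds (trapezoid rule)."""
    s = np.linspace(0.0, t, n)
    integrand = np.exp(a * (t - s)) * b * u(s)
    ds = s[1] - s[0]
    integral = np.sum((integrand[1:] + integrand[:-1]) / 2.0) * ds
    return np.exp(a * t) * y0 + integral

# Test case: y' = -y + 1, y(0) = 0, whose exact solution is y(t) = 1 - e^{-t}.
y_num = mild_solution(a=-1.0, b=1.0, u=lambda s: np.ones_like(s), y0=0.0, t=2.0)
y_exact = 1.0 - np.exp(-2.0)
print(y_num, y_exact)
```

The same variation-of-constants structure carries over verbatim to matrix-valued $A$ with $e^{At}$ replaced by the matrix exponential.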

Definition. (1.1) is called the state system or controlled system; $u(t)$ is the control (or command) and $U$ is the space of controls; $y(t)$ is the state of the system and $X$ is the space of states.

Example (the heat equation).

$y_t = \Delta y + u$ in $\Omega$, $\quad y = 0$ on $\partial\Omega$, $\quad y(x,0) = y_0(x)$,

where $u$ is a heat source.

(1) Controllability: given $y_1 \in X$, find a control $u$ such that the state satisfies $y(T) = y_1$.

(2) Stabilizability: assume $u : \mathbb{R}_+ \to U$ (defined on the semiaxis) and that there exists an operator $F \in L(X,U)$ such that $u(t) = Fy(t)$ is a feedback control. Find the operator $F$ such that the feedback law makes the system asymptotically stable:

$y' = (A+BF)y, \quad y(0) = y_0, \qquad y(t) = e^{(A+BF)t}y_0, \quad \text{with } \|e^{(A+BF)t}\| \le M e^{-\omega t}.$
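In finite dimensions the stabilizability question is concrete: choose $F$ so that $A + BF$ is Hurwitz (all eigenvalues have negative real part). A small sketch; the matrices $A$, $B$ and the gain $F$ below are illustrative choices, not from the lecture:

```python
import numpy as np

# Unstable open-loop dynamics y' = Ay (eigenvalues +/- sqrt(2)).
A = np.array([[0.0, 1.0],
              [2.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
# A hand-picked feedback gain; A + BF then has eigenvalues -2 +/- sqrt(2).
F = np.array([[-4.0, -4.0]])

closed_loop = A + B @ F
eigs = np.linalg.eigvals(closed_loop)
print(eigs.real)  # all negative: the feedback law is asymptotically stabilizing
```

In the infinite-dimensional setting the same idea requires the semigroup bound $\|e^{(A+BF)t}\| \le M e^{-\omega t}$ rather than an eigenvalue condition.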

Figure: $y' = (A+BF)y$, a closed-loop system (input $u$, output $y$, feedback $u = Fy$).

Remark. All systems in nature are closed-loop systems: observe the state (output) $y$ and give the command (input) $u$.

(3) Optimal control: find the control $u(t)$ solving

$\inf_u \left\{ \int_0^T \left(\|Cy\|^2 + \|u\|^2\right) dt + \|Gy(T)\|^2 \right\}$, with $y' = Ay + Bu$, $y(0) = y_0$.

(4) $H^\infty$ control: find a feedback law that stabilizes the system and is immune to perturbations.

Optimal control of linear-quadratic distributed parameter systems

Assume
- $A$ is the infinitesimal generator of a $C_0$ semigroup $S(t) = e^{At}$ on a real Hilbert space $H$ (with norm $\|\cdot\|_H$ and inner product $\langle\cdot,\cdot\rangle_H$),
- $B \in L(U,H)$, where $U$ is a Hilbert space with $\|\cdot\|_U$, $\langle\cdot,\cdot\rangle_U$,
- $u \in L^2(0,T;U)$, $f \in L^2(0,T;H)$.

The state system

$y'(t) = Ay(t) + Bu(t) + f(t), \quad t \in (0,T], \qquad y(0) = y_0$  (2.1)

admits a unique mild solution

$y(t) = e^{At}y_0 + \int_0^t e^{A(t-s)}(Bu(s) + f(s))\,ds.$  (2.2)

Remark. The mild solution is a.e. differentiable if $y_0 \in D(A)$ and $A$ generates an analytic semigroup.

Consider the operators $C \in L(H,H)$ (the observation operator) and $Q \in L(H,H)$, $Q = Q^*$, symmetric positive definite. The optimal control problem is

$\inf_u \left\{ \int_0^T \left(\|Cy(t)-g(t)\|_H^2 + \|u(t)\|_U^2\right) dt + \langle Qy(T), y(T)\rangle_H : y' = Ay+Bu+f,\ y(0)=y_0 \right\}$  (2.3)

where $g \in L^2(0,T;H)$. Hence we seek the control $u$ that minimizes (2.3).

Definition. A control $u^* \in L^2(0,T;U)$ that minimizes (2.3) is called an optimal control; the state $y^* = y(u^*)$ corresponding to $u^*$ is the optimal state; $(y^*, u^*)$ is the optimal pair.

$J(u) = \int_0^T \left(\|Cy(t)-g(t)\|_H^2 + \|u(t)\|_U^2\right) dt + \langle Qy(T), y(T)\rangle_H$

is the objective, or cost, functional.

- $\|u(t)\|^2$: minimal energy expenses.
- $\|Cy(t)-g(t)\|^2$: drives the state toward the target $g$; the terminal term penalizes the final state as well.
- $Cy$ is a measurement of the state, since the state itself could be inaccessible, or accessible only through measurements.

Questions:
1. Existence of an optimal pair. Uniqueness.
2. Necessary conditions for optimality: if $u^*(t), y^*(t)$ are optimal, then the following conditions hold...
3. Sufficient conditions for optimality: if $u^*(t), y^*(t)$ satisfy the following conditions..., then $u^*(t), y^*(t)$ are optimal.

Existence of an optimal pair

Theorem (existence and uniqueness of the optimal pair). The optimal control problem (2.3) admits a unique solution.

Denote by $y_u$ the solution of system (2.1) and define the cost functional

$\varphi(u) = \int_0^T \left(\|Cy_u(t)-g(t)\|_H^2 + \|u(t)\|_U^2\right) dt + \langle Qy_u(T), y_u(T)\rangle_H.$

The functional $\varphi : L^2(0,T;U) \to \mathbb{R}$ is
1. continuous,
2. strictly convex,
3. coercive.

The continuity of $\varphi$ is an easy consequence of formula (2.2), which shows that $y_u$ depends affinely on $u$. (Exercise)

The affine dependence of $u \mapsto y_u$ means that $\varphi(u)$ is a quadratic function of $u$, which gives the convexity of $\varphi$. Let us recall:

- $\varphi$ is convex: $\varphi(\lambda u + (1-\lambda)v) \le \lambda\varphi(u) + (1-\lambda)\varphi(v)$ for all $\lambda \in [0,1]$, $u,v \in U$.
- $\varphi$ is strictly convex: $\varphi\!\left(\frac{u+v}{2}\right) = \frac12\varphi(u) + \frac12\varphi(v) \implies u = v$.
- $\varphi$ is coercive: $\lim_{\|u\|_{L^2(0,T;U)} \to \infty} \varphi(u) = +\infty$.
- $\varphi$ is Gâteaux differentiable.

$\varphi$ is convex since $\|u\|^2$ is convex:

$\|\lambda u + (1-\lambda)v\|^2 = \lambda^2\|u\|^2 + 2\lambda(1-\lambda)\langle u,v\rangle + (1-\lambda)^2\|v\|^2 \le \lambda\|u\|^2 + (1-\lambda)\|v\|^2.$

Indeed, this holds since

$0 \le \lambda(1-\lambda)\left(\|u\|^2 - 2\langle u,v\rangle + \|v\|^2\right) = \lambda(1-\lambda)\|u-v\|^2,$

for all $\lambda \in [0,1]$, $u,v \in U$.

Also $\|u\|^2$ is strictly convex. We have

$\left\|\frac{u+v}{2}\right\|^2 = \frac14\left(\|u\|^2 + \|v\|^2 + 2\langle u,v\rangle\right) = \frac12\left(\|u\|^2+\|v\|^2\right) + \frac14\underbrace{\left[2\langle u,v\rangle - \left(\|u\|^2+\|v\|^2\right)\right]}_{\le 0}$

which implies

$\left\|\frac{u+v}{2}\right\|^2 = \frac12\left(\|u\|^2+\|v\|^2\right) \implies u = v.$

Finally, $\varphi$ is coercive since $\varphi(u) \ge \|u\|^2_{L^2(0,T;U)}$.

So the theorem is proved once we have the following

Lemma. If $\varphi : U \to \mathbb{R}$ is a convex, l.s.c. and coercive functional, then $\varphi$ attains its global minimum: there exists $u_0 \in U$ such that $\varphi(u_0) = \inf_{u\in U}\varphi(u)$. If $\varphi$ is strictly convex, the minimum point is unique.

Proof of Lemma. Let us first show that the minimizer is unique when $\varphi$ is strictly convex. Assume $u_1, u_2$ satisfy $\varphi(u_1) = \varphi(u_2) = d = \inf_{u\in U}\varphi(u)$. Then

$d \le \varphi\!\left(\frac{u_1+u_2}{2}\right) \le \frac12\varphi(u_1) + \frac12\varphi(u_2) = d \implies \varphi\!\left(\frac{u_1+u_2}{2}\right) = \frac12\varphi(u_1)+\frac12\varphi(u_2) \implies u_1 = u_2.$

Denote $d = \inf_{u\in U}\varphi(u)$. Let us show that $d > -\infty$. Assume $d = -\infty$; then there exists $\{u_n\}_n$ with $\varphi(u_n) \to -\infty$.
1. If $\{u_n\}_n$ is unbounded, then $\|u_n\| \to \infty$ along a subsequence, which contradicts the coercivity condition.
2. If $\{u_n\}_n$ is bounded, $\|u_n\| \le M$, then ($U$ is a Hilbert space, hence reflexive; Alaoglu's theorem implies bounded sequences are weakly sequentially compact) $u_{n_k} \rightharpoonup \tilde u$. Since $\varphi$ is convex and lower semicontinuous, it is also weakly l.s.c. (Mazur's lemma: convex closed sets are also weakly closed).

Hence

$\liminf_{k} \varphi(u_{n_k}) \ge \varphi(\tilde u)$ whenever $u_{n_k} \rightharpoonup \tilde u.$

Now $\varphi(u_{n_k}) \to -\infty$ together with $\liminf_k \varphi(u_{n_k}) \ge \varphi(\tilde u) > -\infty$ is a contradiction. So $-\infty < d < \infty$. By the definition of the infimum, there exists $\{u_n\}_n$ such that

$d \le \varphi(u_n) < d + \frac1n$  (2.4)

and hence $\varphi(u_n) \to d$. As above, $\{u_n\}_n$ is bounded (due to the coercivity condition). Extracting a weakly convergent subsequence $u_{n_k} \rightharpoonup u_0$ and taking the weak lim-inf,

$\liminf_{k} \varphi(u_{n_k}) \ge \varphi(u_0).$  (2.5)

Now from (2.4) and (2.5) we deduce

$\varphi(u_0) \le \liminf_{k} \varphi(u_{n_k}) = \lim_{n} \varphi(u_n) \le \lim_{n} \left(d + \frac1n\right) = d.$  (2.6)

Also we have (since $u_0 \in U$, the whole space $U$ being weakly closed)

$d \le \varphi(u_0),$  (2.7)

which along with (2.6) yields $d = \varphi(u_0)$, concluding the proof of the lemma and of the theorem.

Necessary and sufficient conditions of optimality

Theorem. The pair $(y^*, u^*)$ is optimal in problem (2.3) if and only if there exists $p \in C([0,T];H)$ satisfying

$-p' = A^*p + C^*(Cy^* - g), \qquad p(T) = Qy^*(T)$  (2.8)

and

$u^*(t) = -B^*p(t) \quad \text{a.e. } t \in [0,T].$  (2.9)

The optimal control $u^*$ has the form given by the optimality condition (2.9), where the adjoint state $p$ is a solution of the adjoint system (2.8), a linear system that marches backward in time from a final condition.

Example.

$\inf_u \left\{ \int_0^1 \left((y-1)^2 + u^2\right) dt : y' = u,\ y(0) = 0 \right\}$

Here $A = 0$, $B = I$, $f = 0$, $y_0 = 0$, $C = I$, $g = 1$, $Q = 0$, $T = 1$. From the optimality condition (2.9) we have $u^* = -B^*p = -p$, where the adjoint state $p$ solves the adjoint system (2.8):

$-p' = A^*p + C^*(Cy^* - g) = y^* - 1, \qquad p(1) = Qy^*(1) = 0.$

Hence, from $y^{*\prime} = u^* = -p$ we get $y^{*\prime\prime} = -p' = y^* - 1$, so

$y^{*\prime\prime} = y^* - 1, \quad y^*(0) = 0, \quad y^{*\prime}(1) = -p(1) = 0.$

Find $y^*$, then $p$, then $u^*$: done.
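The resulting two-point boundary value problem $y'' = y - 1$, $y(0) = 0$, $y'(1) = 0$ is linear, so it can be checked numerically by shooting; solving it by hand gives the optimal initial control $u^*(0) = y^{*\prime}(0) = \tanh(1)$. A sketch (the RK4 integrator and grid size are implementation choices):

```python
import numpy as np

def shoot(s, n=1000):
    """RK4 for y'' = y - 1 on [0,1] with y(0) = 0, y'(0) = s; returns y'(1)."""
    h = 1.0 / n
    y, v = 0.0, s                      # v = y'
    f = lambda y, v: (v, y - 1.0)      # rhs of the first-order system (y', v')
    for _ in range(n):
        k1 = f(y, v)
        k2 = f(y + 0.5*h*k1[0], v + 0.5*h*k1[1])
        k3 = f(y + 0.5*h*k2[0], v + 0.5*h*k2[1])
        k4 = f(y + h*k3[0], v + h*k3[1])
        y += h*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]) / 6
        v += h*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]) / 6
    return v

# The BVP is linear, so y'(1) is affine in the unknown slope s: two shots suffice.
r0, r1 = shoot(0.0), shoot(1.0)
s_star = -r0 / (r1 - r0)               # slope giving y'(1) = 0
print(s_star, np.tanh(1.0))
```

The recovered slope is the optimal control at $t = 0$, since $u^* = y^{*\prime}$ here.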

Sufficient conditions of optimality

Proof. Assuming (2.8) and (2.9), let us show that $(y^*, u^*)$ is optimal, i.e.,

$\varphi(u^*) \le \varphi(u)$ for all $u \in L^2(0,T;U).$

Using $\|a\|^2 \le \|b\|^2 + 2\langle a, a-b\rangle$ we have

$\|u^*(t)\|_U^2 \le \|u(t)\|_U^2 + 2\langle u^*(t), u^*(t)-u(t)\rangle_U$
$\|Cy_{u^*}(t)-g(t)\|^2 \le \|Cy_u(t)-g(t)\|^2 + 2\langle Cy_{u^*}(t)-g(t),\ C(y_{u^*}(t)-y_u(t))\rangle$
$\langle Qy_{u^*}(T), y_{u^*}(T)\rangle \le \langle Qy_u(T), y_u(T)\rangle + 2\langle Qy_{u^*}(T),\ y_{u^*}(T)-y_u(T)\rangle$

hence

$\varphi(u^*) \le \varphi(u) + 2\int_0^T \left[\langle Cy_{u^*}(t)-g(t),\ C(y_{u^*}(t)-y_u(t))\rangle + \langle u^*(t), u^*(t)-u(t)\rangle_U\right] dt + 2\langle Qy_{u^*}(T),\ y_{u^*}(T)-y_u(T)\rangle$  (2.10)

where we denote $y_{u^*} = y^*$, $y_u = y$.

On the other hand, multiplying (2.8) by $y^* - y$ and integrating by parts (there is no boundary term at $t = 0$ since $y^*(0) = y(0) = y_0$), we get

$\int_0^T \langle -p', y^*-y\rangle\, dt = \int_0^T \left[\langle A^*p, y^*-y\rangle + \langle C^*(Cy^*-g), y^*-y\rangle\right] dt,$

and since $\int_0^T \langle -p', y^*-y\rangle\, dt = -\langle p(T), y^*(T)-y(T)\rangle + \int_0^T \langle p, (y^*-y)'\rangle\, dt$, this yields

$\langle p(T), y^*(T)-y(T)\rangle = \int_0^T \left[\langle p, (y^*-y)'\rangle - \langle p, A(y^*-y)\rangle\right] dt - \int_0^T \langle Cy^*-g,\ C(y^*-y)\rangle\, dt.$

Adv. Scientific Computing III > Optimal control of linear-quadratic distributed parameter systems Sufficient conditions of optimality p(t ), y (T ) y(t ) = = = T 0 T 0 T 0 p, (y y) A(y y) + }{{} (2.3) = B(u u) T p, B(u u) + B p }{{} (2.9) = u, u u + Substituting this in (2.10) we obtain 0 T 0 T 0 Cy g, C(y y) dt Cy g, C(y y) dt Cy g, C(y y) dt ϕ(u ) ϕ(u) + 2 p(t ), y (T ) y(t ) + 2 Q(y (T )), y (T ) y(t ) }{{} Q(y (T )) Hence (2.8)-(2.9) are sufficient to u to be optimal.

Necessary conditions of optimality

Now assume $u^*$ is optimal; we show (2.8)-(2.9). The optimality

$\varphi(u^*) \le \varphi(u^* + \lambda u)$ for all $u \in L^2(0,T;U)$, $\lambda \in \mathbb{R}$

implies that $u^*$ is a critical point, i.e., the Gâteaux derivative vanishes:

$\varphi'(u^*) = 0.$  (2.11)

Denote by $y_\lambda$ the state corresponding to $u^* + \lambda u$, which satisfies

$y_\lambda' = Ay_\lambda + B(u^* + \lambda u) + f, \qquad y_\lambda(0) = y_0.$

So

$\lim_{\lambda\to 0} \frac{\varphi(u^*+\lambda u)-\varphi(u^*)}{\lambda} = \lim_{\lambda\to 0} \frac1\lambda \left[ \int_0^T \left(\|Cy_\lambda-g\|^2 + \|u^*+\lambda u\|^2\right) dt - \int_0^T \left(\|Cy^*-g\|^2 + \|u^*\|^2\right) dt + \langle Qy_\lambda(T), y_\lambda(T)\rangle - \langle Qy^*(T), y^*(T)\rangle \right].$

$= \lim_{\lambda\to 0} \frac1\lambda \int_0^T \left[\langle C(y_\lambda+y^*)-2g,\ C(y_\lambda-y^*)\rangle + \lambda\langle u, 2u^*+\lambda u\rangle\right] dt + \lim_{\lambda\to 0} \frac1\lambda \langle Q(y_\lambda(T)-y^*(T)),\ y_\lambda(T)+y^*(T)\rangle$

(basically: $a^2 - b^2 = (a-b)(a+b)$, applied term by term).

From the linearity of the state equation and the definition (2.2) of the mild solution we see that

$\frac{y_\lambda - y^*}{\lambda} = z$ (independent of $\lambda$), and $y_\lambda \to y^*$ as $\lambda \to 0$,

where $z$ is the solution of the sensitivity equation

$z' = Az + Bu, \qquad z(0) = 0.$  (2.12)

So

$\langle \varphi'(u^*), u\rangle_{L^2(0,T;U)} = \lim_{\lambda\to 0} \frac{\varphi(u^*+\lambda u)-\varphi(u^*)}{\lambda} = 2\int_0^T \left[\langle Cz(t), Cy^*-g\rangle + \langle u, u^*\rangle\right] dt + 2\langle \underbrace{Qy^*(T)}_{= p(T)}, z(T)\rangle.$  (2.13)

Let $p$ be the adjoint state satisfying (2.8). Multiplying (2.8) by $z$ and integrating by parts (using $z(0) = 0$) we have

$\langle p(T), z(T)\rangle = \int_0^T \left[\langle p', z\rangle + \langle p, z'\rangle\right] dt = \int_0^T \left[\langle p, \underbrace{z'-Az}_{= Bu \text{ by }(2.12)}\rangle - \langle C^*(Cy^*-g), z\rangle\right] dt = \int_0^T \left[\langle B^*p, u\rangle - \langle Cy^*-g, Cz\rangle\right] dt.$

Substituting in (2.13) we get

$\langle \varphi'(u^*), u\rangle_{L^2(0,T;U)} = 2\int_0^T \langle u^* + B^*p,\ u\rangle\, dt$ for all $u \in L^2(0,T;U),$

hence from the optimality condition (2.11)

$0 = \varphi'(u^*) = 2(u^* + B^*p) \implies u^* = -B^*p,$

i.e., the optimality condition (2.9). This concludes the proof.
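The adjoint computation above has an exact discrete counterpart that is easy to verify: discretize the scalar example $y' = u$, $y(0) = 0$ with forward Euler, march a discrete adjoint backward, and compare the resulting gradient with a finite difference of the cost. A sketch, assuming the gradient formula $\varphi'(u) = 2(u + B^*p)$ with the sign convention $u^* = -B^*p$ (grid size and the random test control are illustrative choices):

```python
import numpy as np

n = 200
h = 1.0 / n
rng = np.random.default_rng(2)
u = rng.standard_normal(n)          # an arbitrary (not optimal) control

def state(u):
    """Forward Euler for y' = u, y(0) = 0; returns y_0, ..., y_{n-1}."""
    y = np.zeros(n)
    for k in range(n - 1):
        y[k + 1] = y[k] + h * u[k]
    return y

def cost(u):
    y = state(u)
    return h * np.sum((y - 1.0)**2 + u**2)

# Discrete adjoint: p_k accumulates the sensitivity of the tracking term to
# y_{k+1}, ..., marching backward in time (the discrete analogue of (2.8)).
y = state(u)
p = np.zeros(n)
for k in range(n - 2, -1, -1):
    p[k] = p[k + 1] + 2.0 * h * (y[k + 1] - 1.0)
grad = 2.0 * h * u + h * p          # discrete analogue of 2(u + B*p)

# Central finite-difference check (exact for a quadratic cost, up to roundoff).
k, eps = 7, 1e-5
e = np.zeros(n); e[k] = 1.0
fd = (cost(u + eps * e) - cost(u - eps * e)) / (2 * eps)
print(grad[k], fd)
```

Since the discrete cost is quadratic in $u$, the central difference agrees with the adjoint gradient to roundoff, mirroring the exactness of the Gâteaux derivative computation.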

Synthesis of optimal control

Recall that the open-loop problem

$\inf_u \left\{ \int_0^T \left(\|Cx(t)-g(t)\|^2 + \|u(t)\|_U^2\right) dt + \langle Qx(T), x(T)\rangle : x' = Ax+Bu+f,\ x(0) = x_0 \right\}$

has the solution $u^*$ given by

$u^* = -B^*p(t), \quad t \in [0,T],$  (2.14)

where

$-p' = A^*p + C^*(Cx^*-g), \qquad p(T) = Qx^*(T).$  (2.15)

So, to find the optimal control $u^*$ we need to solve (2.15), which is difficult because it carries a final condition.

Automatic control prefers closed-loop controls, i.e., controls given as a function of the state variable:

$u(t) = \Lambda(t, x(t)).$

One reason is that open-loop controls are sensitive to errors and do not improve over time. The synthesis of linear-quadratic control problems is given by a Riccati equation. Recall that the optimal control $u^*$ exists and is characterized by (2.14)-(2.15). We seek, formally,

$p(t) = P(t)x^*(t) + r(t)$  (2.16)

where $P : [0,T] \to L(H,H)$ is a linear continuous operator, $r : [0,T] \to H$, and $P$ is self-adjoint and positive definite.

Introducing formally $p(t) = P(t)x^*(t) + r(t)$ from (2.16) into the adjoint equation (2.15), $-p' = A^*p + C^*(Cx^*-g)$, we get

$-P'(t)x^*(t) - P(t)x^{*\prime}(t) - r'(t) = A^*P(t)x^*(t) + A^*r(t) + C^*Cx^*(t) - C^*g(t).$

Using the state equation and the optimality condition $u^* = -B^*p$,

$x^{*\prime}(t) = Ax^*(t) + Bu^*(t) + f(t) = Ax^*(t) - BB^*p(t) + f(t),$

so

$-P'(t)x^*(t) - P(t)\left(Ax^*(t) - BB^*p(t) + f(t)\right) - r'(t) = A^*P(t)x^*(t) + A^*r(t) + C^*Cx^*(t) - C^*g(t).$

Using again $p(t) = P(t)x^*(t) + r(t)$ from (2.16) and collecting terms,

$\left[\underbrace{P' + PA + A^*P - PBB^*P + C^*C}_{\text{Riccati equation}}\right]x^* + \left[r' + A^*r - PBB^*r + Pf - C^*g\right] = 0.$

Therefore

$u^* = -B^*p = -B^*Px^* - B^*r$

is the feedback optimal control, provided we can solve:

1. the nonlinear operator Riccati equation

$P' + PA + A^*P - PBB^*P + C^*C = 0, \quad t \in [0,T), \qquad P(T) = Q;$  (2.17)

2. the differential equation

$r' + A^*r - PBB^*r = -Pf + C^*g, \quad t \in [0,T), \qquad r(T) = 0,$  (2.18)

where the final conditions in (2.17) and (2.18) are chosen such that

$Qx^*(T) \overset{(2.15)}{=} p(T) \overset{(2.16)}{=} P(T)x^*(T) + r(T).$

Once the Riccati equation is solved, the differential equation (2.18) is easily solvable. Indeed, assuming $P$ continuous, $A^* - PBB^*$ is a continuous perturbation of $A^*$, and (2.18) has a unique mild solution, being a Volterra equation. Recall that for $x' = Ax + f$ we have

$x(t) = e^{At}x_0 + \int_0^t e^{A(t-s)}f(s)\,ds,$

and for $F : [0,T] \to L(X,X)$ a continuous operator-valued function and

$x' = (A + F(t))x + f$

we have

$x(t) = e^{At}x_0 + \int_0^t e^{A(t-s)}F(s)x(s)\,ds + \int_0^t e^{A(t-s)}f(s)\,ds.$
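In finite dimensions ($H = \mathbb{R}^n$) the operator Riccati equation (2.17) is a matrix ODE that can be integrated backward from $P(T) = Q$; for a long horizon, $P(0)$ should approach the solution of the corresponding algebraic Riccati equation. A sketch (the matrices below are illustrative choices, and scipy's `solve_continuous_are` is used only as a reference solution):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[-1.0, 1.0], [0.0, -1.0]])
B = np.eye(2)
CtC = np.eye(2)        # C = I
Q = np.zeros((2, 2))   # terminal weight P(T) = Q

def riccati_rhs(P):
    # P' = -(P A + A^T P - P B B^T P + C^T C), as in (2.17)
    return -(P @ A + A.T @ P - P @ B @ B.T @ P + CtC)

T, n = 30.0, 30000
h = T / n
P = Q.copy()
for _ in range(n):          # RK4 with step -h: march from t = T down to t = 0
    k1 = riccati_rhs(P)
    k2 = riccati_rhs(P - 0.5 * h * k1)
    k3 = riccati_rhs(P - 0.5 * h * k2)
    k4 = riccati_rhs(P - h * k3)
    P = P - h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

P_are = solve_continuous_are(A, B, CtC, np.eye(2))
print(np.abs(P - P_are).max())   # small: P(0) is near the steady-state solution
```

The iteration also preserves the symmetry of $P$, consistent with the self-adjointness required of the synthesis operator.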

Definition.

$C_s([0,T]; L(H)) = \{P : [0,T] \to L(H,H);\ t \mapsto P(t)x \text{ is continuous for every } x \in H\}$

is the space of strongly continuous operator-valued functions: strong continuity means pointwise continuity (weaker than continuity in the operator norm). We seek a solution of the Riccati equation (2.17) in this space. Since the boundary condition is at the final time $t = T$, we transform it into an initial condition by the change of variable $\tilde P(t) = P(T-t)$ and get

$\tilde P' - \tilde P A - A^*\tilde P + \tilde P BB^*\tilde P = C^*C, \quad t \in (0,T], \qquad \tilde P(0) = Q.$

We say that $P$ is a mild solution of the previous Riccati equation if $P \in C_s([0,T];L(H))$ and

$P(t) = e^{A^*t}Qe^{At} + \int_0^t e^{A^*(t-s)}\left[C^*C - P(s)BB^*P(s)\right]e^{A(t-s)}\,ds.$  (2.19)

Theorem. Assuming $Q$ is self-adjoint and positive definite, the Riccati equation has a unique mild solution $P \in C_s([0,T];L(H))$, which is self-adjoint and positive definite.

Proof.
1. Local existence: via the Banach contraction principle.
2. The local solution on $[0,T_0]$ can be prolonged to $[0,T]$.

For local existence: define the metric space

$Y = \left\{\tilde P \in C_s([0,T_0];L(H,H)) : \|\tilde P(t) - e^{A^*t}Qe^{At}\|_{L(H,H)} \le \alpha\right\}$

and let $L$ be the fixed-point map given by the right-hand side of (2.19), so that

$\langle L(\tilde P)(t)x, x\rangle = \langle Qe^{At}x, e^{At}x\rangle + \int_0^t \left(\|Ce^{A(t-s)}x\|^2 - \|B^*\tilde P(s)e^{A(t-s)}x\|^2\right) ds.$

One shows:
(1) $L$ maps $Y$ into $Y$;
(2) for $T_0$ sufficiently small, $L$ is a contraction.

Hence $\tilde P = L(\tilde P)$ has a unique local solution defined on $[0,T_0]$. For global existence it suffices to show that $\|P(t)\|$ stays bounded in a neighborhood of $T_0$.

Example.

$\inf_u \left\{ \int_0^1 (x^2 + u^2)\,dt + \frac12 x^2(1) : x' = u \right\}$

Here $C = I$, $g = 0$, $Q = \frac12 I$, $A = 0$, $B = I$, $f = 0$. Note that $f = 0$, $g = 0$ imply $r(t) \equiv 0$. The Riccati equation (2.17) becomes

$p' - p^2 + 1 = 0, \qquad p(1) = Q = \frac12.$

So

$\frac{dp}{p^2-1} = dt, \qquad \frac12 \ln\left|\frac{p-1}{p+1}\right| = t + c \implies p(t) = \frac{3 - e^{2(t-1)}}{3 + e^{2(t-1)}},$

and

$u^* = -B^*P(t)x^*(t) - B^*r(t) = -\frac{3 - e^{2(t-1)}}{3 + e^{2(t-1)}}\,x^*(t)$

is the feedback optimal control.
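The closed-form solution can be sanity-checked numerically: $p(t) = (3 - e^{2(t-1)})/(3 + e^{2(t-1)})$ should satisfy $p' = p^2 - 1$ on $[0,1]$ and the terminal condition $p(1) = 1/2$. A quick finite-difference check:

```python
import numpy as np

def p(t):
    w = np.exp(2.0 * (t - 1.0))
    return (3.0 - w) / (3.0 + w)

t = np.linspace(0.0, 1.0, 1001)
h = 1e-6
dp = (p(t + h) - p(t - h)) / (2.0 * h)        # central difference for p'
residual = np.abs(dp - (p(t)**2 - 1.0)).max()  # residual of p' = p^2 - 1
print(residual, p(1.0))
```

The residual is at the level of finite-difference error, and $p(1) = (3-1)/(3+1) = 1/2$ as required.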

Controllability of linear distributed parameter systems: null controllability

Consider the state equation

$y' = Ay + Bu, \qquad y(0) = y_0$  (3.1)

where $A$ generates the $C_0$ semigroup $S(t) = e^{At}$ on the Banach space $X$, and $B \in L(U,X)$. Equation (3.1) has a unique mild solution $y \in C([0,T];X)$,

$y(t) = S(t)y_0 + \int_0^t S(t-s)Bu(s)\,ds, \quad t \in [0,T],$  (3.2)

where the control $u \in L^p(0,T;U)$, $p \ge 1$.

The null controllability problem: given $y_0 \in X$ and $T > 0$, find $u \in L^p(0,T;U)$ such that $y(T) = 0$.

Denote

$B(0,1) = \{x \in X;\ \|x\| \le 1\}, \qquad \Sigma(0,\rho) = \{u \in L^p(0,T;U);\ \|u\|_{L^p(0,T;U)} \le \rho\}.$

Definition. Equation (3.1) is $\rho$-null controllable if for every $y_0 \in B(0,1)$ there exists $u \in \Sigma(0,\rho)$ such that $y(T) = 0$.

Note that if (3.1) is $\rho$-null controllable for some $\rho > 0$, then one can also steer any point from outside $B(0,1)$ to zero, but with controls from outside $\Sigma(0,\rho)$. Indeed, for $\alpha = \|y_0\| > 1$, $z = y/\alpha$ satisfies

$z' = Az + B\left(\frac{u}{\alpha}\right), \qquad z(0) = \frac{y_0}{\alpha} \in B(0,1).$

Theorem. Equation (3.1) is $\rho$-null controllable if and only if

$\|S^*(T)x\|_{X^*} \le \rho\,\|B^*S^*(\cdot)x\|_{L^q(0,T;U^*)} = \rho\left(\int_0^T \|B^*S^*(t)x\|_{U^*}^q\,dt\right)^{1/q}$ for all $x \in X^*$.  (3.3)

Here $X^*$ is the dual of $X$; $S(t) \in L(X,X)$ and $S^*(t)$ is its dual ($\langle S^*(t)x, y\rangle = \langle x, S(t)y\rangle$); $B^* \in L(X^*,U^*)$; $\frac1p + \frac1q = 1$.

Proof. Define the convex sets

$M = \{S(T)y_0;\ y_0 \in B(0,1)\}, \qquad N = \left\{\int_0^T S(T-t)Bu(t)\,dt;\ u \in \Sigma(0,\rho)\right\}.$

Recall that (3.1) is $\rho$-null controllable iff for every $y_0 \in B(0,1)$ there exists $u \in \Sigma(0,\rho)$ with

$y(T) = S(T)y_0 + \int_0^T S(T-t)Bu(t)\,dt = 0,$

i.e., $-M \subseteq N$; since $N$ is symmetric ($u \in \Sigma(0,\rho) \implies -u \in \Sigma(0,\rho)$), this means $M \subseteq N$.

Definition. For a convex set $M$, its support function is

$H_M(p) = \sup\{\langle p, x\rangle_{(X^*,X)};\ x \in M\}.$

Lemma. For $M$, $N$ closed convex sets we have the equivalence

$M \subseteq N \iff H_M(p) \le H_N(p)$ for all $p \in X^*.$

Proof of lemma. ($\Rightarrow$) $M \subseteq N$ implies $\sup_M \langle p, x\rangle \le \sup_N \langle p, x\rangle$ for all $p \in X^*$. ($\Leftarrow$) Assume $H_M(p) \le H_N(p)$ for all $p$ but $M \not\subseteq N$: then there exists $x_0 \in M \setminus N$.

Since $x_0 \in M$, $x_0 \notin N$, the sets $\{x_0\}$ and $N$ are disjoint convex sets. By the Hahn-Banach separation theorem there exists a linear functional that separates them: $\hat p \in X^*$ such that

$\hat p(x) \le 1 < \hat p(x_0)$ for all $x \in N$, so $H_N(\hat p) \le 1 < \hat p(x_0) \le H_M(\hat p),$

which contradicts the assumption $H_M(p) \le H_N(p)$. This concludes the proof of the lemma.

Proof of theorem, continued. Compute

$H_M(p) = \sup\{\langle p, x\rangle;\ x \in M\} = \sup\{\langle p, S(T)y_0\rangle;\ y_0 \in B(0,1)\} = \sup\{\langle S^*(T)p, y_0\rangle;\ y_0 \in B(0,1)\} = \|S^*(T)p\|.$

Similarly,

$H_N(p) = \sup\{\langle p, x\rangle_{X^*,X};\ x \in N\} = \sup\left\{\int_0^T \langle B^*S^*(T-t)p,\ u(t)\rangle_{U^*,U}\,dt;\ u \in \Sigma(0,\rho)\right\}$
$= \rho \sup\left\{\int_0^T \langle B^*S^*(T-t)p,\ u(t)\rangle_{U^*,U}\,dt;\ u \in \Sigma(0,1)\right\} = \rho\,\|B^*S^*(T-\cdot)p\|_{L^q(0,T;U^*)},$

and, after the change of variable $t \mapsto T-t$,

$H_N(p) = \rho\,\|B^*S^*(\cdot)p\|_{L^q(0,T;U^*)}.$

Now from the definitions and the lemma we obtain (3.3).
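The support-function lemma is easy to visualize in finite dimensions, where the support function of a polytope is a maximum over its vertices. A small illustration, assuming hypothetical convex sets in the plane (the vertex lists and the sampled directions are illustrative choices):

```python
import numpy as np

def support(vertices, p):
    """Support function of conv(vertices) in direction p: H(p) = max_v <p, v>."""
    return np.max(vertices @ p)

N_vertices = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])  # square
M_vertices = 0.5 * N_vertices    # shrunk square, so M is contained in N
K_vertices = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]])  # sticks out of N

rng = np.random.default_rng(1)
dirs = rng.standard_normal((500, 2))
ok_MN = all(support(M_vertices, p) <= support(N_vertices, p) for p in dirs)
ok_KN = all(support(K_vertices, p) <= support(N_vertices, p) for p in dirs)
print(ok_MN, ok_KN)  # inclusion holds for (M, N) but fails for (K, N)
```

Sampling directions cannot prove inclusion, of course; the lemma says the inequality over all directions is equivalent to it, which is exactly what the controllability criterion (3.3) exploits.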

Corollary (1). Equation (3.1) is null controllable (for every $y_0 \in X$ there exists $u \in L^p(0,T;U)$ such that $y(T) = 0$) if and only if there exists $\rho > 0$ such that condition (3.3) holds. Indeed, using the transformation $z = y/\alpha$ we can extend condition (3.3), which covers $y_0 \in B(0,1)$, to the whole space $X$: for $y_0 \in X$ with $\|y_0\| > 1$ take $\alpha = \|y_0\|$, so that

$z' = Az + B\left(\frac{u}{\alpha}\right), \qquad z(0) = \frac{y_0}{\alpha} \in B(0,1).$

We obtain $\alpha\rho$-controllability: there exists $u$ such that

$\|S^*(T)x\| \le \alpha\rho\left(\int_0^T \|B^*S^*(t)x\|_{U^*}^q\,dt\right)^{1/q}$ for all $x \in X^*.$

Corollary (2). If (3.3) holds, then for every $y_0 \in B(0,1)$ the state equation (3.1) is null controllable with controls $u \in \Sigma(0,\rho)$.

Corollary (3). The equation $y' = Ay + u$, $y(0) = y_0$ (i.e., $B = I$, $U = X$) is null controllable: for every $y_0 \in X$ there exists $u \in L^p(0,T;U)$ such that $y(T) = 0$; hence there exists $\rho > 0$ satisfying

$\|S^*(T)x\| \le \rho\,\|S^*(\cdot)x\|_{L^q(0,T;U^*)}$ for all $x \in X^*.$

Proof. From $S^*(T) = S^*(T-t)S^*(t)$,

$\|S^*(T)x\| \le \sup_{t\in[0,T]}\|S^*(T-t)\|\,\|S^*(t)x\| = M\,\|S^*(t)x\|,$

and by Hölder's inequality,

$\|S^*(T)x\| = \frac1T\int_0^T \|S^*(T)x\|\,dt \le \frac{M}{T}\int_0^T \|S^*(t)x\|\,dt \le \underbrace{M\,T^{\frac1p - 1}}_{=\rho}\left(\int_0^T \|S^*(t)x\|^q\,dt\right)^{1/q}.$

Application (controllability of the heat equation). The heat equation

$y_t - \Delta y = u(x,t)$ in $\Omega\times(0,T)$, $\quad y = 0$ on $\partial\Omega\times(0,T)$, $\quad y(x,0) = y_0(x)$ in $\Omega$,

where $u$ is a heat source, is null controllable. If we take $u \in L^2(\Omega\times(0,T))$, $y_0 \in L^2(\Omega)$, then $A = \Delta$, $D(A) = H_0^1(\Omega)\cap H^2(\Omega)$, $U = X = L^2(\Omega)$, $p = 2$. By Corollary (3) there exists $u \in L^2(0,T;L^2(\Omega))$ such that $y(x,T) = 0$, $x \in \Omega$. Hence we can bring any initial temperature to zero.
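For the heat semigroup, which is a contraction ($M \le 1$), Corollary (3) with $p = q = 2$ reduces to the bound $\|S(T)x\|^2 \le \frac1T \int_0^T \|S(t)x\|^2\,dt$: the final norm is dominated by the time average, since $t \mapsto \|S(t)x\|$ is nonincreasing. A sketch on a finite-difference discretization of the 1D Dirichlet Laplacian (grid size and the random initial datum are illustrative choices):

```python
import numpy as np

n, T = 50, 1.0
dx = 1.0 / (n + 1)
# 1D Dirichlet Laplacian on (0,1): tridiagonal (-2, 1) / dx^2
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / dx**2
lam, V = np.linalg.eigh(L)          # all eigenvalues are negative

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
coeffs_sq = (V.T @ x)**2

def heat_norm_sq(t):
    """||S(t)x||^2 via the spectral representation S(t) = V e^{Lambda t} V^T."""
    return float(np.sum(np.exp(2.0 * lam * t) * coeffs_sq))

ts = np.linspace(0.0, T, 2001)
vals = np.array([heat_norm_sq(t) for t in ts])
dt = ts[1] - ts[0]
avg = np.sum((vals[1:] + vals[:-1]) / 2.0) * dt / T   # (1/T) * trapezoid rule
final = heat_norm_sq(T)
print(final, avg)
```

The discrete $\ell^2$ norm is used throughout; the $h$-scaling of the continuous norm cancels in the inequality.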

Application.

$y_t - \Delta y = \sum_{i=1}^m u_i(t)a_i(x)$ in $\Omega\times(0,T)$, $\quad y = 0$ on $\partial\Omega\times(0,T)$, $\quad y(x,0) = y_0(x)$ in $\Omega$,  (3.4)

with $u_i \in L^2(0,T)$ and $a_i \in L^2(\Omega)$ characteristic functions,

$a_i(x) = \chi_{\Omega_i}(x) = \begin{cases}1, & x \in \Omega_i\\ 0, & x \notin \Omega_i\end{cases}$

Here $A = \Delta$, $X = L^2(\Omega)$, $U = \mathbb{R}^m$, $u = (u_1,\dots,u_m)$, $B : \mathbb{R}^m \to X = L^2(\Omega)$, $Bu = \sum_{i=1}^m u_i a_i(x)$, and $B^* : L^2(\Omega) \to \mathbb{R}^m$ is defined by

$\langle Bu, y\rangle_X = \langle u, B^*y\rangle_U = \sum_{i=1}^m u_i \int_\Omega y(x)a_i(x)\,dx,$

so

$B^*y = \left(\int_\Omega y(x)a_1(x)\,dx,\ \int_\Omega y(x)a_2(x)\,dx,\ \dots,\ \int_\Omega y(x)a_m(x)\,dx\right).$

Now $S^*(t) = S(t)$ is a self-adjoint semigroup, and $S^*(t)x$ is the solution of

$y_t - \Delta y = 0$ in $\Omega\times(0,T)$, $\quad y = 0$ on $\partial\Omega\times(0,T)$, $\quad y(\cdot,0) = x(\cdot)$ in $\Omega$.

Hence the heat equation (3.4) is null controllable if condition (3.3) is satisfied, i.e., $\|S^*(T)x\| \le \rho\,\|B^*S^*(\cdot)x\|_{L^2(0,T;\mathbb{R}^m)}$ for all $x \in X = L^2(\Omega)$:

$\int_\Omega y(z,T)^2\,dz \le \rho^2 \int_0^T \sum_{i=1}^m \left(\int_\Omega y(z,t)a_i(z)\,dz\right)^2 dt.$

Controllability of the wave equation

The wave equation

$y_{tt} - \Delta y = u$ in $\Omega\times(0,T)$, $\quad y = 0$ on $\partial\Omega\times(0,T)$, $\quad y(x,0) = y_0(x)$, $y_t(x,0) = y_1(x)$ in $\Omega$,  (3.5)

with $u \in L^2(\Omega\times(0,T)) = L^2(0,T;L^2(\Omega))$ (Fubini), $A = -\Delta$, $D(A) = H_0^1(\Omega)\cap H^2(\Omega)$, $H = L^2(\Omega)$, $V = H_0^1(\Omega)$, can be written as

$y_{tt} + Ay = u, \quad t \in [0,T], \qquad y(0) = y_0, \quad y_t(0) = y_1, \qquad u \in L^2(0,T;H),$  (3.6)

or equivalently as the first-order system

$y_t = z, \quad z_t + Ay = u, \qquad y(0) = y_0, \quad z(0) = y_1$, on $X = V\times H$.  (3.7)

We can write it on $X = V\times H$ as

$\frac{d}{dt}\begin{pmatrix} y\\ z\end{pmatrix} = \mathcal{A}\begin{pmatrix} y\\ z\end{pmatrix} + \mathcal{B}u, \qquad \begin{pmatrix} y\\ z\end{pmatrix}(0) = \begin{pmatrix} y_0\\ y_1\end{pmatrix},$  (3.8)

where

$\mathcal{A} = \begin{pmatrix} 0 & I\\ -A & 0\end{pmatrix}, \qquad \mathcal{B}u = \begin{pmatrix} 0\\ Bu\end{pmatrix},$

and $\mathcal{A}$ generates a $C_0$ semigroup.

Question: is (3.8) null controllable? (Equivalently: is (3.5) null controllable?)

Lemma (existence of a solution). If $y_0 \in V$, $z_0 \in H$, $u \in L^2(0,T;H)$, then (3.8) admits a unique mild solution: $y \in C([0,T],V)$, $z \in C([0,T],H)$.

Theorem. The equation (3.5) (equivalently (3.8)) is null controllable in time $T$: for all $y_0 \in V$, $y_1 \in H$ there exists $u \in L^2(0,T;H)$ such that $y(T) = 0$, $y_t(T) = 0$.

Example (harmonic oscillator).

$y_{tt} + y = u, \qquad y(0) = y_0, \quad y_t(0) = y_1.$

Not physically intuitive: the theorem gives a control $u$ such that $y(T) = 0$, in finite time.
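For the harmonic oscillator the null control can be built explicitly in finite dimensions via the controllability Gramian: with $x' = \mathcal{A}x + \mathcal{B}u$, the choice $u(t) = -\mathcal{B}^\top e^{\mathcal{A}^\top(T-t)}W^{-1}e^{\mathcal{A}T}x_0$ with $W = \int_0^T e^{\mathcal{A}s}\mathcal{B}\mathcal{B}^\top e^{\mathcal{A}^\top s}\,ds$ steers $x_0$ to $0$ at time $T$. This Gramian formula is the standard finite-dimensional construction, not taken from the slides; grid sizes below are implementation choices. For $\mathcal{A} = \begin{pmatrix}0&1\\-1&0\end{pmatrix}$, $e^{\mathcal{A}s}$ is simply a rotation:

```python
import numpy as np

def eA(s):
    """Matrix exponential of s*[[0,1],[-1,0]]: a rotation matrix."""
    return np.array([[np.cos(s), np.sin(s)],
                     [-np.sin(s), np.cos(s)]])

B = np.array([[0.0], [1.0]])
T, n = 2.0, 2000
ts = np.linspace(0.0, T, n + 1)

# Controllability Gramian W = int_0^T e^{As} B B^T e^{A^T s} ds (trapezoid rule).
mats = np.array([eA(s) @ B @ B.T @ eA(s).T for s in ts])
W = np.sum((mats[1:] + mats[:-1]) / 2.0, axis=0) * (ts[1] - ts[0])

x0 = np.array([1.0, 0.0])                 # start at y = 1, y' = 0
c = np.linalg.solve(W, eA(T) @ x0)
u = lambda t: -(B.T @ eA(T - t).T @ c)    # Gramian-based null control

# Simulate x' = Ax + Bu with RK4 and check the state reaches (near) rest.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
x, h = x0.copy(), T / n
for k in range(n):
    t = k * h
    f = lambda t, x: A @ x + (B @ u(t)).ravel()
    k1 = f(t, x); k2 = f(t + h/2, x + h/2*k1)
    k3 = f(t + h/2, x + h/2*k2); k4 = f(t + h, x + h*k3)
    x = x + h*(k1 + 2*k2 + 2*k3 + k4)/6
print(np.linalg.norm(x))   # small: the oscillator is steered to rest at time T
```

The infinite-dimensional theorem plays the role of guaranteeing that an analogous control exists for the full wave equation, where no finite Gramian is available.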

Proof. For (3.8) to be null controllable we have to verify condition (3.3). Recall that $X = V\times H$ ($V = H_0^1$, $H = L^2$), $X^* = V^*\times H$ ($V^* = H^{-1}$), and

$\mathcal{A} = \begin{pmatrix} 0 & I\\ -A & 0\end{pmatrix}, \qquad \mathcal{B}u = \begin{pmatrix} 0\\ u\end{pmatrix}.$

The adjoint semigroup $S^*$ is generated by

$\mathcal{A}^* = \begin{pmatrix} 0 & -A\\ I & 0\end{pmatrix},$

i.e., $S^*(t)\begin{pmatrix}p_0\\ q_0\end{pmatrix} = \begin{pmatrix}p(t)\\ q(t)\end{pmatrix}$ where

$p' = -Aq, \quad q' = p, \qquad p(0) = p_0 \in V, \quad q(0) = q_0 \in H.$  (3.9)

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

Let us determine $\mathcal{B}^*$:
\[ \Big\langle \mathcal{B}^*\begin{pmatrix} p \\ q \end{pmatrix}, u \Big\rangle = \Big\langle \begin{pmatrix} p \\ q \end{pmatrix}, \mathcal{B}u \Big\rangle = \Big\langle \begin{pmatrix} p \\ q \end{pmatrix}, \begin{pmatrix} 0 \\ u \end{pmatrix} \Big\rangle = \langle q, u \rangle, \qquad \text{so } \mathcal{B}^*\begin{pmatrix} p \\ q \end{pmatrix} = q. \]
Now the inequality (3.3),
\[ \|S^*(T)z\|^2 \le \rho^2 \int_0^T \|\mathcal{B}^* S^*(t) z\|^2\,dt, \]
writes
\[ \|p(T)\|_{V^*}^2 + \|q(T)\|_H^2 \le \rho^2 \int_0^T \|q(t)\|_H^2\,dt \tag{3.10} \]
for all $(p_0, q_0) \in V^*\times H$ and $(p,q)$ solution of (3.9).

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

Since (3.9) is the first-order form of the second-order equation
\[ q'' + Aq = 0, \qquad q(0) = q_0 \in H, \ q'(0) = -p_0 \in V^*, \tag{3.11} \]
we have to prove (3.10) for all solutions of (3.11). With the change of variable $z = A^{-1}q$, (3.11) is equivalent to
\[ z'' + Az = 0, \qquad z(0) = z_0 = A^{-1}q_0 \in D(A), \ z'(0) = z_1 = -A^{-1}p_0 \in V, \tag{3.12} \]
and (3.10) is equivalent to
\[ \underbrace{\|Az(T)\|_H^2}_{q(T)} + \underbrace{\|z'(T)\|_V^2}_{p(T)} \le \rho^2 \int_0^T \underbrace{\|Az(t)\|_H^2}_{q(t)}\,dt \tag{3.13} \]
for all $z$ solutions of (3.12).

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

The initial conditions in (3.12) yield a unique solution $z \in C([0,T],V)$ with $Az \in C([0,T],H)$. The equation is invariant under time reversal $z(t) \mapsto z(T-t)$: the initial conditions become final conditions and (3.13) writes
\[ \|Az_0\|_H^2 + \|z_1\|_V^2 \le \rho^2 \int_0^T \|Az(t)\|_H^2\,dt. \tag{3.14} \]
Let us define the operator $\Gamma : D(A)\times V \to R(\Gamma) \subset L^2(0,T;H)$ by
\[ \Gamma(z_0, z_1) = Az. \]

Theorem (open mapping theorem of S. Banach): Let $T : X \to Y$ be a continuous linear surjective operator, $X$, $Y$ Banach spaces. Then $T$ is an open map (if $D$ is an open set in $X$, then $T(D)$ is an open set in $Y$).

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

Corollary (bounded inverse theorem): If $T : X \to Y$ is a continuous linear bijective operator, $X$, $Y$ Banach spaces, then the inverse operator $T^{-1} : Y \to X$ is also a linear continuous map.

We denote $X = D(A)\times V$ and $Y = R(\Gamma) \subset L^2(0,T;H)$, $Y$ endowed with the $L^2(0,T;H)$ norm. Hence, if we show that $\Gamma : D(A)\times V \to Y$ is (1) injective, (2) surjective onto $Y$ (i.e., $R(\Gamma)$ is closed), and (3) linear continuous, then the open mapping theorem gives that $\Gamma^{-1}$ is continuous, $\|\Gamma^{-1}(f)\| \le c\,\|f\|$, i.e.,
\[ \|Az_0\|_H^2 + \|z_1\|_V^2 \le \rho^2 \int_0^T \|Az(t)\|_H^2\,dt, \]
which is (3.14) and finally yields that the wave equation is null controllable.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

(1) $\Gamma$ is injective:
\[ \Gamma(z_0, z_1) = 0 \ \Rightarrow\ Az(t) = 0 \ \forall t \ \Rightarrow\ z(t) \equiv 0 \ \Rightarrow\ z_0 = z(0) = 0, \ z_1 = z'(0) = 0. \]

(2) $R(\Gamma)$ is closed: let $(z_0^n, z_1^n) \in D(A)\times V$ with $(z_0^n, z_1^n) \to (z_0, z_1)$ and $\Gamma(z_0^n, z_1^n) = Az^n \to f$ in $L^2(0,T;H)$; we must show $f = Az$ with $z$ the solution of (3.12). From
\[ (z^n)'' + Az^n = 0, \qquad z^n(0) = z_0^n, \ (z^n)'(0) = z_1^n, \]
we get
\[ z^n(t) = z_0^n + t\,z_1^n - \int_0^t (t-s)\,\underbrace{Az^n(s)}_{\text{bdd in } L^2(0,T;H)}\,ds, \qquad (z^n)'(t) = z_1^n - \int_0^t Az^n(s)\,ds, \]
so $\{z^n\}_n$ is bounded in $L^2(0,T;H)$.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

Now, multiplying (3.12) by $(z^n)'$ and integrating by parts:
\[ \frac{1}{2}\frac{d}{dt}\Big( \|(z^n)'(t)\|_H^2 + \|z^n(t)\|_V^2 \Big) = 0 \ \Rightarrow\ \|(z^n)'(t)\|_H^2 + \|z^n(t)\|_V^2 = \|z_1^n\|_H^2 + \|z_0^n\|_V^2. \]
Similarly, multiplying (3.12) by $A(z^n)'$ and integrating by parts:
\[ \frac{1}{2}\frac{d}{dt}\Big( \|(z^n)'(t)\|_V^2 + \|Az^n(t)\|_H^2 \Big) = 0 \ \Rightarrow\ \|(z^n)'(t)\|_V^2 + \|Az^n(t)\|_H^2 = \|z_1^n\|_V^2 + \|Az_0^n\|_H^2. \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Controllability of the wave equation

(3) $\Gamma$ is linear continuous: multiply (3.12) by $Az'$ and integrate by parts:
\[ 0 = (z'', Az') + (Az, Az') = \frac{1}{2}\frac{d}{dt}\Big[ (z', Az') + \|Az\|_H^2 \Big], \]
hence
\[ (z'(t), Az'(t)) + \|Az(t)\|_H^2 = (z_1, Az_1) + \|Az_0\|_H^2 = \text{constant}, \]
i.e., $\|z'(t)\|_V^2 + \|Az(t)\|_H^2 = \|z_1\|_V^2 + \|Az_0\|_H^2$, and therefore
\[ \int_0^T \|Az\|_H^2\,dt \le T\big( \|z_1\|_V^2 + \|Az_0\|_H^2 \big). \]
This concludes the proof of the controllability of the wave equation.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Wave equation with boundary control

\[ y_{tt} - \Delta y = 0 \ \text{in } \Omega\times(0,T), \quad y = u \ \text{on } \Sigma = \partial\Omega\times(0,T), \quad y(x,0) = y_0(x), \ y_t(x,0) = y_1(x) \ \text{in } \Omega, \tag{3.15} \]
where $y_0 \in L^2(\Omega)$, $y_1 \in H^{-1}(\Omega) = (H_0^1(\Omega))^*$, $u \in L^2(\Sigma)$.

Problem: Is there a boundary control $u$ such that $y(T) = 0$, $y'(T) = 0$?

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Wave equation with boundary control

Let us define the lift operator $B : L^2(\partial\Omega) \to L^2(\Omega)$ by
\[ \Delta(Bu) = 0 \ \text{in } \Omega, \qquad Bu = u \ \text{on } \partial\Omega. \]
So $Bu = z$ is a weak solution of the non-homogeneous Dirichlet elliptic problem
\[ \Delta z = 0 \ \text{in } \Omega, \qquad z = u \ \text{on } \partial\Omega, \tag{3.16} \]
i.e., $z \in L^2(\Omega)$ is defined by a variational (transposition) formula: for all $\varphi \in H_0^1(\Omega)\cap H^2(\Omega)$, Green's formula gives
\[ 0 = \int_\Omega \varphi\,\Delta z\,dx = \int_\Omega z\,\Delta\varphi\,dx + \int_{\partial\Omega} \underbrace{\varphi}_{=0}\,\frac{\partial z}{\partial\nu}\,d\sigma - \int_{\partial\Omega} \underbrace{z}_{=u}\,\frac{\partial\varphi}{\partial\nu}\,d\sigma, \]
hence
\[ \int_\Omega z\,\Delta\varphi\,dx = \int_{\partial\Omega} u\,\frac{\partial\varphi}{\partial\nu}\,d\sigma. \tag{3.17} \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Wave equation with boundary control

For $f \in L^2(\Omega)$, let $\varphi$ denote the solution of
\[ \Delta\varphi = f \ \text{in } \Omega, \qquad \varphi = 0 \ \text{on } \partial\Omega. \]
The operator $B$ is well defined: $z \in L^2(\Omega)$ satisfying (3.17) means
\[ \int_\Omega z f\,dx = \int_{\partial\Omega} u\,\frac{\partial\varphi}{\partial\nu}\,d\sigma. \tag{3.18} \]

Theorem (F. Riesz representation): Let $X$ be a Hilbert space and $f$ a bounded linear functional on $X$. Then there exists a uniquely determined vector $y_f \in X$ such that $f(x) = (x, y_f)$ for all $x \in X$, and $\|f\| = \|y_f\|$.

Therefore it is enough to show that the functional $f \mapsto \int_{\partial\Omega} u\,\frac{\partial\varphi}{\partial\nu}\,d\sigma$ is continuous:
\[ \Big| \int_{\partial\Omega} u\,\frac{\partial\varphi}{\partial\nu}\,d\sigma \Big| \le \|u\|_{L^2(\partial\Omega)} \Big\| \frac{\partial\varphi}{\partial\nu} \Big\|_{L^2(\partial\Omega)} \le C\,\|u\|_{L^2(\partial\Omega)}\,\|\varphi\|_{H^2(\Omega)} \le C\,\|u\|_{L^2(\partial\Omega)}\,\|f\|_{L^2(\Omega)} \]
(the trace operator $v \mapsto \big( v|_{\partial\Omega}, \frac{\partial v}{\partial\nu} \big)$ is linear, continuous, and surjective from $W^{2,p}(\Omega)$ onto $W^{2-1/p,p}(\partial\Omega)\times W^{1-1/p,p}(\partial\Omega)$).

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Wave equation with boundary control

With the lift $Bu$ given by (3.16) we can homogenize (3.15):
\[ y_{tt} - \Delta(y - Bu) = 0 \ \text{in } \Omega\times(0,T), \quad y - Bu = 0 \ \text{on } \Sigma = \partial\Omega\times(0,T), \quad y(x,0) = y_0(x), \ y_t(x,0) = y_1(x) \ \text{in } \Omega, \]
and with $A = -\Delta$, $D(A) = H_0^1(\Omega)\cap H^2(\Omega)$, self-adjoint with continuous inverse,
\[ y_{tt} + A(y - Bu) = 0 \ \text{in } (0,T), \qquad y(0) = y_0, \ y'(0) = y_1. \tag{3.19} \]
Taking $\theta = A^{-1}y$, then
\[ \theta_{tt} + A\theta = Bu \ \text{in } (0,T), \qquad \theta(0) = A^{-1}y_0, \ \theta'(0) = A^{-1}y_1, \]
i.e.,
\[ \theta_{tt} - \Delta\theta = Bu \ \text{in } \Omega\times(0,T), \quad \theta = 0 \ \text{on } \partial\Omega\times(0,T), \quad \theta(0) = \underbrace{A^{-1}y_0}_{\in H_0^1(\Omega)}, \ \theta'(0) = \underbrace{A^{-1}y_1}_{\in L^2(\Omega)}. \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Wave equation with boundary control

So we have reduced the boundary controllability problem to the distributed case, which can be treated with the general theorem. Alternatively, one can use the Hilbert Uniqueness Method (HUM), specific to the wave equation [J.-L. Lions, 1984].

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

Theorem: If $T$ is sufficiently large, then $\forall y_0 \in L^2(\Omega)$, $y_1 \in H^{-1}(\Omega)$, there exists a boundary control $u \in L^2(\partial\Omega\times(0,T))$ such that $y(T,x) \equiv 0$, $y_t(T,x) \equiv 0$, $\forall x \in \Omega$.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

Lemma: Let $\varphi$ be the solution of
\[ \varphi_{tt} - \Delta\varphi = 0, \qquad \varphi = 0 \ \text{on } \partial\Omega, \qquad \varphi(0) = \varphi_0 \in H_0^1(\Omega), \ \varphi_t(0) = \varphi_1 \in L^2(\Omega); \tag{3.20} \]
then the following (observability) inequality holds:
\[ \|\varphi_0\|_{H_0^1(\Omega)} + \|\varphi_1\|_{L^2(\Omega)} \le c\,(T - T_0)^{-1} \Big( \int_0^T\!\!\int_{\partial\Omega} \Big( \frac{\partial\varphi}{\partial\nu} \Big)^2 d\sigma\,dt \Big)^{1/2}. \tag{3.21} \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

By changing the sense of time, the boundary controllability of the wave equation is equivalent to: $\forall (y_0, y_1) \in L^2(\Omega)\times H^{-1}(\Omega)$, $\exists u = \frac{\partial\varphi}{\partial\nu}$ such that the solution of the equation
\[ y_{tt} - \Delta y = 0, \qquad y = u \ \text{on } \partial\Omega\times(0,T), \qquad y(0) = 0, \ y_t(0) = 0 \tag{3.22} \]
verifies the final condition
\[ y(T) = y_0, \qquad y_t(T) = y_1. \]
This is what we are going to prove.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

Let us construct the problem
\[ y_{tt} - \Delta y = 0, \qquad y = \frac{\partial\varphi}{\partial\nu} \ \text{on } \partial\Omega\times(0,T), \qquad y(0) = 0, \ y_t(0) = 0, \tag{3.23} \]
where $\varphi$ is the solution of
\[ \varphi_{tt} - \Delta\varphi = 0, \qquad \varphi = 0 \ \text{on } \partial\Omega, \qquad \varphi(T) = \varphi_0 \in H_0^1(\Omega), \ \varphi_t(T) = \varphi_1 \in L^2(\Omega), \tag{3.24} \]
and define the operator
\[ \Lambda : H_0^1(\Omega)\times L^2(\Omega) \to H^{-1}(\Omega)\times L^2(\Omega), \qquad \Lambda(\varphi_0, \varphi_1) = (-y_t(T), y(T)). \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

$\Lambda$ is linear continuous and satisfies
\[ \langle \Lambda(\varphi_0, \varphi_1), (\varphi_0, \varphi_1) \rangle \ge \omega\,\|(\varphi_0, \varphi_1)\|_{H_0^1\times L^2}^2. \tag{3.25} \]
Indeed,
\[ \langle \Lambda(\varphi_0, \varphi_1), (\varphi_0, \varphi_1) \rangle = \langle (-y_t(T), y(T)), (\varphi_0, \varphi_1) \rangle = \int_\Omega \big( {-y_t(x,T)}\,\varphi_0(x) + y(x,T)\,\varphi_1(x) \big)\,dx. \tag{3.26} \]
To evaluate the right-hand side, we multiply (3.23) by $\varphi$ and (3.24) by $y$, and integrate over $\Omega\times(0,T)$:
\[ \int_0^T\!\!\int_\Omega y_{tt}\,\varphi + \int_0^T\!\!\int_\Omega \nabla\varphi\cdot\nabla y - \int_0^T\!\!\int_{\partial\Omega} \frac{\partial y}{\partial\nu}\,\varphi = 0, \qquad \int_0^T\!\!\int_\Omega \varphi_{tt}\,y + \int_0^T\!\!\int_\Omega \nabla\varphi\cdot\nabla y - \int_0^T\!\!\int_{\partial\Omega} \frac{\partial\varphi}{\partial\nu}\,y = 0, \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

and subtract:
\[ 0 = \int_0^T\!\!\int_\Omega (y_{tt}\,\varphi - \varphi_{tt}\,y) - \int_0^T\!\!\int_{\partial\Omega} \underbrace{\frac{\partial y}{\partial\nu}\,\varphi}_{=0 \text{ since } \varphi|_{\partial\Omega}=0} + \int_0^T\!\!\int_{\partial\Omega} \frac{\partial\varphi}{\partial\nu}\,\underbrace{y}_{\overset{(3.23)}{=}\partial\varphi/\partial\nu}. \]
Integrating the first term by parts in time (the cross terms $\int\!\!\int y_t\varphi_t$ cancel) and using $y(0) = y_t(0) = 0$ from (3.23),
\[ \int_0^T\!\!\int_\Omega (y_{tt}\,\varphi - \varphi_{tt}\,y) = \int_\Omega \Big( y_t(x,T)\,\underbrace{\varphi(x,T)}_{\overset{(3.24)}{=}\varphi_0} - y(x,T)\,\underbrace{\varphi_t(x,T)}_{\overset{(3.24)}{=}\varphi_1} \Big)\,dx, \]
hence
\[ \underbrace{\int_\Omega \big( {-y_t(x,T)}\,\varphi_0 + y(x,T)\,\varphi_1 \big)\,dx}_{\overset{(3.26)}{=}\ \langle \Lambda(\varphi_0,\varphi_1), (\varphi_0,\varphi_1) \rangle} = \int_0^T\!\!\int_{\partial\Omega} \Big( \frac{\partial\varphi}{\partial\nu} \Big)^2. \]

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

Therefore
\[ \langle \Lambda(\varphi_0, \varphi_1), (\varphi_0, \varphi_1) \rangle = \int_0^T\!\!\int_{\partial\Omega} \Big( \frac{\partial\varphi}{\partial\nu} \Big)^2 d\sigma\,dt \overset{\text{Lemma}}{\ge} c(T - T_0)\,\|(\varphi_0, \varphi_1)\|_{H_0^1\times L^2}^2. \]
By the surjectivity lemma below (a Lax-Milgram-type argument) this implies that $\Lambda$ is surjective, i.e.,
\[ R(\Lambda) = H^{-1}(\Omega)\times L^2(\Omega). \]
Hence, for the data $(y_0, y_1) \in L^2(\Omega)\times H^{-1}(\Omega)$ we have $(-y_1, y_0) \in H^{-1}(\Omega)\times L^2(\Omega)$, so there exists $(\varphi_0, \varphi_1) \in H_0^1(\Omega)\times L^2(\Omega)$ such that
\[ \Lambda(\varphi_0, \varphi_1) = (-y_1, y_0), \qquad \text{i.e., } y(T) = y_0, \ y_t(T) = y_1. \]
Hence this $\varphi$, solving (3.24) with $(\varphi_0, \varphi_1)$ as final conditions, gives the boundary control $u = \frac{\partial\varphi}{\partial\nu}$ for equation (3.23). This concludes the proof of the theorem.

Adv. Scientific Computing III > Controllability of linear distributed parameter systems > Hilbert Uniqueness Method

Lemma ($\Lambda$ is surjective): Let $X$ be a Hilbert space and $\Lambda : X \to X^*$ a linear continuous operator. If $\Lambda$ is also coercive, $\langle \Lambda x, x \rangle \ge \omega\,\|x\|^2$ $\forall x \in X$, then $R(\Lambda) = X^*$.

Proof: we show that $R(\Lambda)$ is a dense closed set in $X^*$.

Dense (with the Hahn-Banach separation corollary below): assume $\overline{R(\Lambda)} \ne X^*$; then there is $y \in X^{**} = X$, $y \ne 0$, such that $\langle f, y \rangle = 0$ $\forall f \in R(\Lambda)$. Taking $f = \Lambda y$, this implies $0 = \langle \Lambda y, y \rangle \ge \omega\,\|y\|^2$, so $y = 0$: every functional that vanishes on $R(\Lambda)$ is identically zero, hence $R(\Lambda)$ is dense.

$R(\Lambda)$ is closed: let $f \in \overline{R(\Lambda)}$, i.e., $\Lambda x_n \to f$; we show $f \in R(\Lambda)$. From
\[ \omega\,\|x_n - x_m\|^2 \le \langle \Lambda x_n - \Lambda x_m, x_n - x_m \rangle \le \|\Lambda x_n - \Lambda x_m\|\,\|x_n - x_m\|, \]
$\{x_n\}$ is a Cauchy sequence, hence convergent, $x_n \to x$, and by continuity $\Lambda x = f$.

Corollary (of the Hahn-Banach separation theorem): Let $F \subset E$ be a closed vector subspace such that $F \ne E$. Then $\exists f \in E^*$, $f \ne 0$, such that $\langle f, x \rangle = 0$ $\forall x \in F$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

In our basic optimal control problem for ODEs, we use $u(t)$ for the control and $x(t)$ for the state. The state variable satisfies a differential equation which depends on the control variable:
\[ x'(t) = g(t, x(t), u(t)). \]
As the control function is changed, the solution of the differential equation will change. Thus, we can view the control-to-state relationship as a map $u(t) \mapsto x = x(u)$ (of course, $x$ is really a function of the independent variable $t$; we write $x(u)$ simply to remind us of the dependence on $u$).

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

Our basic optimal control problem consists of finding a piecewise continuous control $u(t)$ and the associated state variable $x(t)$ to maximize the given objective functional, i.e.,
\[ \max_u \int_{t_0}^{t_1} f(t, x(t), u(t))\,dt \quad \text{subject to} \quad x'(t) = g(t, x(t), u(t)), \ x(t_0) = x_0, \ x(t_1) \ \text{free}. \tag{4.1} \]
Such a maximizing control is called an optimal control. By $x(t_1)$ free, it is meant that the value of $x(t_1)$ is unrestricted. For our purposes, $f$ and $g$ will always be $C^1$ in all three arguments. Thus, as the control(s) will always be piecewise continuous, the associated states will always be piecewise differentiable.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

The principal technique for such an optimal control problem is to solve a set of necessary conditions that an optimal control and corresponding state must satisfy. It is important to understand the logical difference between necessary conditions and sufficient conditions for solution sets.

Necessary conditions: If $u^*(t)$, $x^*(t)$ are optimal, then the following conditions hold...
Sufficient conditions: If $u^*(t)$, $x^*(t)$ satisfy the following conditions..., then $u^*(t)$, $x^*(t)$ are optimal.

Let us derive the necessary conditions. Express our objective functional in terms of the control:
\[ J(u) = \int_{t_0}^{t_1} f(t, x(t), u(t))\,dt, \]
where $x = x(u)$ is the corresponding state.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

The necessary conditions that we derive were developed by Pontryagin and his co-workers in Moscow in the 1950s. Pontryagin introduced the idea of the adjoint functions to append the differential equation to the objective functional. Adjoint functions have a similar purpose as Lagrange multipliers in multivariate calculus, which append constraints to the function of several variables to be maximized or minimized.
1 Thus, we begin by finding appropriate conditions that the adjoint function should satisfy.
2 Then, by differentiating the map from the control to the objective functional, we will derive a characterization of the optimal control in terms of the optimal state and corresponding adjoint.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

Assume a (piecewise continuous) optimal control exists, and that $u^*$ is such a control, with $x^*$ the corresponding state. Namely, $J(u) \le J(u^*) < \infty$ for all controls $u$. Let $h(t)$ be a piecewise continuous variation function and $\varepsilon \in \mathbb{R}$ a constant. Then
\[ u^\varepsilon(t) = u^*(t) + \varepsilon h(t) \]
is another piecewise continuous control. Let $x^\varepsilon$ be the state corresponding to the control $u^\varepsilon$, namely, $x^\varepsilon$ satisfies
\[ \frac{d}{dt} x^\varepsilon(t) = g(t, x^\varepsilon(t), u^\varepsilon(t)) \tag{4.2} \]
wherever $u^\varepsilon$ is continuous. Since all trajectories start at the same position, we take $x^\varepsilon(t_0) = x_0$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

It is easily seen that $u^\varepsilon(t) \to u^*(t)$ for all $t$ as $\varepsilon \to 0$. Further, for all $t$,
\[ \frac{\partial u^\varepsilon}{\partial\varepsilon}\Big|_{\varepsilon=0} = h(t). \]
In fact, something similar is true for $x^\varepsilon$. Because of the assumptions made on $g$, it follows that $x^\varepsilon(t) \to x^*(t)$ for each fixed $t$. Further, the derivative $\frac{\partial}{\partial\varepsilon} x^\varepsilon(t)\big|_{\varepsilon=0}$ exists for each $t$. The actual value of this quantity will prove unimportant; we need only know that it exists.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

The objective functional at $u^\varepsilon$ is
\[ J(u^\varepsilon) = \int_{t_0}^{t_1} f(t, x^\varepsilon(t), u^\varepsilon(t))\,dt. \]
We are now ready to introduce the adjoint function $\lambda$. Let $\lambda$ be a piecewise differentiable function on $[t_0, t_1]$, to be determined. By the Fundamental Theorem of Calculus,
\[ \int_{t_0}^{t_1} \frac{d}{dt}\big[ \lambda(t)\,x^\varepsilon(t) \big]\,dt = \lambda(t_1)\,x^\varepsilon(t_1) - \lambda(t_0)\,x^\varepsilon(t_0), \]
which implies
\[ \int_{t_0}^{t_1} \frac{d}{dt}\big[ \lambda(t)\,x^\varepsilon(t) \big]\,dt + \lambda(t_0)\,x^\varepsilon(t_0) - \lambda(t_1)\,x^\varepsilon(t_1) = 0. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

Adding this $0$ to our $J(u^\varepsilon)$ gives
\[ J(u^\varepsilon) = \int_{t_0}^{t_1} \Big[ f(t, x^\varepsilon(t), u^\varepsilon(t)) + \frac{d}{dt}\big( \lambda(t)\,x^\varepsilon(t) \big) \Big]\,dt + \lambda(t_0)\,x^\varepsilon(t_0) - \lambda(t_1)\,x^\varepsilon(t_1) \]
\[ = \int_{t_0}^{t_1} \big[ f(t, x^\varepsilon(t), u^\varepsilon(t)) + \lambda'(t)\,x^\varepsilon(t) + \lambda(t)\,g(t, x^\varepsilon, u^\varepsilon) \big]\,dt + \lambda(t_0)\,x^\varepsilon(t_0) - \lambda(t_1)\,x^\varepsilon(t_1), \]
where we used the product rule and the fact that $g(t, x^\varepsilon, u^\varepsilon) = \frac{d}{dt} x^\varepsilon$ at all but finitely many points. Since the maximum of $J$ with respect to the control $u$ occurs at $u^*$, the derivative of $J(u^\varepsilon)$ with respect to $\varepsilon$ (in the direction $h$) is zero, i.e.,
\[ 0 = \frac{d}{d\varepsilon} J(u^\varepsilon)\Big|_{\varepsilon=0} = \lim_{\varepsilon\to 0} \frac{J(u^\varepsilon) - J(u^*)}{\varepsilon}. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

This gives a limit of an integral expression. A version of the Lebesgue Dominated Convergence Theorem allows us to move the limit (and thus the derivative) inside the integral. This is due to the compact interval of integration and the piecewise differentiability of the integrand. Therefore,
\[ 0 = \frac{d}{d\varepsilon} J(u^\varepsilon)\Big|_{\varepsilon=0} = \int_{t_0}^{t_1} \frac{\partial}{\partial\varepsilon}\big[ f(t, x^\varepsilon(t), u^\varepsilon(t)) + \lambda'(t)\,x^\varepsilon(t) + \lambda(t)\,g(t, x^\varepsilon(t), u^\varepsilon(t)) \big]\Big|_{\varepsilon=0}\,dt - \lambda(t_1)\,\frac{\partial}{\partial\varepsilon} x^\varepsilon(t_1)\Big|_{\varepsilon=0}. \]
Applying the chain rule to $f$ and $g$, it follows that
\[ 0 = \int_{t_0}^{t_1} \Big[ f_x\,\frac{\partial x^\varepsilon}{\partial\varepsilon} + f_u\,h + \lambda'(t)\,\frac{\partial x^\varepsilon}{\partial\varepsilon} + \lambda(t)\Big( g_x\,\frac{\partial x^\varepsilon}{\partial\varepsilon} + g_u\,h \Big) \Big]\Big|_{\varepsilon=0}\,dt - \lambda(t_1)\,\frac{\partial x^\varepsilon}{\partial\varepsilon}(t_1)\Big|_{\varepsilon=0}, \tag{4.3} \]
where the arguments of the $f_x$, $f_u$, $g_x$, and $g_u$ terms are $(t, x^*(t), u^*(t))$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

Rearranging the terms in (4.3) gives
\[ 0 = \int_{t_0}^{t_1} \Big[ \big( f_x + \lambda(t)\,g_x + \lambda'(t) \big)\,\frac{\partial x^\varepsilon}{\partial\varepsilon}\Big|_{\varepsilon=0}(t) + \big( f_u + \lambda(t)\,g_u \big)\,h(t) \Big]\,dt - \lambda(t_1)\,\frac{\partial x^\varepsilon}{\partial\varepsilon}(t_1)\Big|_{\varepsilon=0}. \tag{4.4} \]
We want to choose the adjoint function to simplify (4.4) by making the coefficients of $\frac{\partial x^\varepsilon}{\partial\varepsilon}\big|_{\varepsilon=0}$ vanish. Thus, we choose the adjoint function $\lambda(t)$ to satisfy
\[ \lambda'(t) = -\big[ f_x(t, x^*(t), u^*(t)) + \lambda(t)\,g_x(t, x^*(t), u^*(t)) \big] \quad \text{(adjoint equation)} \]
and the boundary condition
\[ \lambda(t_1) = 0 \quad \text{(transversality condition)}. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

Now (4.4) reduces to
\[ 0 = \int_{t_0}^{t_1} \big( f_u(t, x^*(t), u^*(t)) + \lambda(t)\,g_u(t, x^*(t), u^*(t)) \big)\,h(t)\,dt. \]
Since this holds for any piecewise continuous variation function $h(t)$, it holds in particular for
\[ h(t) = f_u(t, x^*(t), u^*(t)) + \lambda(t)\,g_u(t, x^*(t), u^*(t)). \]
In this case
\[ 0 = \int_{t_0}^{t_1} \big( f_u(t, x^*(t), u^*(t)) + \lambda(t)\,g_u(t, x^*(t), u^*(t)) \big)^2\,dt, \]
which implies the optimality condition
\[ f_u(t, x^*(t), u^*(t)) + \lambda(t)\,g_u(t, x^*(t), u^*(t)) = 0 \quad \text{for all } t_0 \le t \le t_1. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Necessary Conditions

These equations form a set of necessary conditions that an optimal control and state must satisfy. In practice, one does not need to rederive the above equations in this way for a particular problem. In fact, we can generate the above necessary conditions from the Hamiltonian $H$, which is defined as follows:
\[ H(t, x, u, \lambda) = f(t, x, u) + \lambda\,g(t, x, u) = \text{integrand} + \text{adjoint} \times \text{RHS of ODE}. \]
We are maximizing $H$ with respect to $u$ at $u^*$, and the above conditions can be written in terms of the Hamiltonian:
\[ \frac{\partial H}{\partial u} = 0 \ \text{at } u^* \iff f_u + \lambda g_u = 0 \quad \text{(optimality condition)}, \]
\[ \lambda' = -\frac{\partial H}{\partial x} \iff \lambda' = -(f_x + \lambda g_x) \quad \text{(adjoint equation)}, \]
\[ \lambda(t_1) = 0 \quad \text{(transversality condition)}. \]
We are given the dynamics of the state equation:
\[ x' = g(t, x, u) = \frac{\partial H}{\partial\lambda}, \qquad x(t_0) = x_0. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

These conclusions can be extended to a version of Pontryagin's Maximum Principle.

Theorem: If $u^*(t)$ and $x^*(t)$ are optimal, then there exists a piecewise differentiable adjoint variable $\lambda(t)$ such that
\[ H(t, x^*(t), u(t), \lambda(t)) \le H(t, x^*(t), u^*(t), \lambda(t)) \]
for all controls $u$ at each time $t$, where the Hamiltonian $H$ is
\[ H = f(t, x(t), u(t)) + \lambda(t)\,g(t, x(t), u(t)), \]
and
\[ \lambda'(t) = -\frac{\partial H}{\partial x}(t, x^*(t), u^*(t), \lambda(t)), \qquad \lambda(t_1) = 0. \]
We have already shown that, with this adjoint and Hamiltonian, $H_u = 0$ at $u^*$ for each $t$. Namely, the Hamiltonian has a critical point, in the $u$ variable, at $u^*$ for each $t$. It is not surprising that this critical point is a maximum, considering the optimal control problem. However, the proof of this theorem is quite technical and difficult, and we omit it here.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Theorem: Suppose that $f(t, x, u)$ and $g(t, x, u)$ are both continuously differentiable functions in their three arguments and concave in $u$. Suppose $u^*$ is an optimal control for problem (4.1), with associated state $x^*$, and $\lambda$ a piecewise differentiable function with $\lambda(t) \ge 0$ for all $t$. Suppose for all $t_0 \le t \le t_1$
\[ 0 = H_u(t, x^*(t), u^*(t), \lambda(t)). \]
Then for all controls $u$ and each $t_0 \le t \le t_1$, we have
\[ H(t, x^*(t), u(t), \lambda(t)) \le H(t, x^*(t), u^*(t), \lambda(t)). \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Proof. Fix a control $u$ and a point in time $t_0 \le t \le t_1$. Then
\[ H(t, x^*(t), u^*(t), \lambda(t)) - H(t, x^*(t), u(t), \lambda(t)) = \big[ f(t, x^*(t), u^*(t)) + \lambda(t)\,g(t, x^*(t), u^*(t)) \big] - \big[ f(t, x^*(t), u(t)) + \lambda(t)\,g(t, x^*(t), u(t)) \big] \]
\[ = \big[ f(t, x^*(t), u^*(t)) - f(t, x^*(t), u(t)) \big] + \lambda(t)\big[ g(t, x^*(t), u^*(t)) - g(t, x^*(t), u(t)) \big] \]
\[ \ge (u^*(t) - u(t))\,f_u(t, x^*(t), u^*(t)) + \lambda(t)\,(u^*(t) - u(t))\,g_u(t, x^*(t), u^*(t)) = (u^*(t) - u(t))\,H_u(t, x^*(t), u^*(t), \lambda(t)) = 0, \]
using the concavity of $f$ and $g$ in $u$ and $\lambda(t) \ge 0$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

An identical argument generates the same necessary conditions when the problem is minimization rather than maximization. In a minimization problem, we are minimizing the Hamiltonian pointwise, and the inequality in Pontryagin's Maximum Principle is reversed. Indeed, for a minimization problem with $f$, $g$ being convex in $u$, we can derive
\[ H(t, x^*(t), u(t), \lambda(t)) \ge H(t, x^*(t), u^*(t), \lambda(t)) \]
by the same argument as in the previous theorem. We have converted the problem of finding a control that maximizes (or minimizes) the objective functional subject to the differential equation and initial condition into maximizing the Hamiltonian pointwise with respect to the control. Thus, to find the necessary conditions, we do not need to calculate the integral in the objective functional, but only use the Hamiltonian. Later, we will see the usefulness of the property that the Hamiltonian is maximized by an optimal control.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

We can also check concavity conditions to distinguish between controls that maximize and those that minimize the objective functional. If $\frac{\partial^2 H}{\partial u^2} < 0$ at $u^*$, then the problem is maximization, while $\frac{\partial^2 H}{\partial u^2} > 0$ at $u^*$ goes with minimization.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

We can view our control problem as having two unknowns, $u^*$ and $x^*$, at the start. We have introduced an adjoint variable $\lambda$, which is similar to a Lagrange multiplier; it attaches the differential equation onto the maximization of the objective functional. The following is an outline of how this theory can be applied to solve the simplest problems.
1 Form the Hamiltonian for the problem.
2 Write the adjoint differential equation, the transversality boundary condition, and the optimality condition. Now there are three unknowns: $u^*$, $x^*$, and $\lambda$.
3 Try to eliminate $u^*$ by using the optimality equation $H_u = 0$, i.e., solve for $u^*$ in terms of $x^*$ and $\lambda$.
4 Solve the two differential equations for $x^*$ and $\lambda$ with the two boundary conditions, substituting $u^*$ in the differential equations with the expression for the optimal control from the previous step.
5 After finding the optimal state and adjoint, solve for the optimal control.
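This recipe can also be carried out numerically by a forward-backward sweep (a standard iterative scheme, sketched here as an illustration and not derived in these notes). The toy problem below, $\min \int_0^1 u^2\,dt$ with $x' = x + u$, $x(0) = 1$, has optimality condition $u = -\lambda/2$ and adjoint $\lambda' = -\lambda$, $\lambda(1) = 0$; all discretization choices (Euler stepping, the relaxation factor) are arbitrary:

```python
import numpy as np

# Forward-backward sweep for:  min int_0^1 u^2 dt,  x' = x + u,  x(0) = 1.
# H = u^2 + lam*(x + u);  H_u = 0 => u = -lam/2;  lam' = -H_x = -lam, lam(1) = 0.
N = 1000
t = np.linspace(0.0, 1.0, N + 1)
h = t[1] - t[0]
u = np.zeros(N + 1)                     # initial guess for the control

for sweep in range(50):
    # forward sweep for the state (explicit Euler)
    x = np.empty(N + 1); x[0] = 1.0
    for i in range(N):
        x[i + 1] = x[i] + h * (x[i] + u[i])
    # backward sweep for the adjoint, from lam(1) = 0
    lam = np.empty(N + 1); lam[N] = 0.0
    for i in range(N, 0, -1):
        lam[i - 1] = lam[i] + h * lam[i]    # discretizes lam' = -lam backward
    # update the control from the optimality condition, with relaxation
    u_new = -lam / 2.0
    if np.max(np.abs(u_new - u)) < 1e-10:
        u = u_new
        break
    u = 0.5 * u + 0.5 * u_new

print(np.max(np.abs(u)))    # the optimal control here is u* = 0
print(x[-1])                # x*(t) = e^t, so x(1) should be near e
```

For this toy problem the adjoint equation forces $\lambda \equiv 0$, so the sweep converges immediately; replacing the integrand, dynamics, and characterization handles richer problems with the same loop structure.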

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin s Maximum Principle If the Hamiltonian is linear in the control variable u, it can be difficult to solve for u from the optimality equation. If we can solve for u from the optimality equation, we are then left with two unknowns x and λ satisfying two differential equations with two boundary conditions. We solve that system of differential equations for the optimal state and adjoint and then obtain the optimal control. We can solve that analytically or numerically. When we are able to solve for the optimal control in terms of x and λ, we will call that formula for u the characterization of the optimal control. The state equations and the adjoint equations together with the characterization of the optimal control and the boundary conditions are called the optimality system. For now, let us try to better understand these ideas with a few examples.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Example:
\[ \min_u \int_0^1 u^2(t)\,dt \quad \text{subject to} \quad x'(t) = x(t) + u(t), \ x(0) = 1, \ x(1) \ \text{free}. \]
Can we see what the optimal control should be? The goal of the problem is to minimize this integral, which does not involve the state; only the integral of the control is to be minimized. Therefore, we expect the optimal control to be zero. We verify the necessary conditions. We begin by forming the Hamiltonian
\[ H = u^2 + \lambda(x + u). \]
The optimality condition is
\[ 0 = H_u = 2u + \lambda \ \text{at } u^* \quad \Longrightarrow \quad u^* = -\tfrac{1}{2}\lambda. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

We see the problem is indeed minimization, as $\frac{\partial^2 H}{\partial u^2} = 2 > 0$. The adjoint equation is given by
\[ \lambda' = -H_x = -\lambda \quad \Longrightarrow \quad \lambda(t) = c\,e^{-t}, \ \text{for some constant } c. \]
But the transversality condition is
\[ \lambda(1) = 0 \ \Rightarrow\ c\,e^{-1} = 0 \ \Rightarrow\ c = 0. \]
Thus $\lambda \equiv 0$, so that $u^* = -\lambda/2 \equiv 0$. So $x^*$ satisfies $x' = x$ and $x(0) = 1$. Hence, the optimal solutions are
\[ \lambda \equiv 0, \qquad u^* \equiv 0, \qquad x^*(t) = e^t. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Example:
\[ \min_u \frac{1}{2}\int_0^1 \big( 3x(t)^2 + u^2(t) \big)\,dt \quad \text{subject to} \quad x'(t) = x(t) + u(t), \ x(0) = 1. \]
The $\frac{1}{2}$ which appears before the integral will have no effect on the minimizing control and, thus, no effect on the problem. It is inserted in order to make the computations slightly neater. Also note we have omitted the phrase "$x(1)$ free" from the statement of the problem. This is standard notation, in that a term which is unrestricted is simply not mentioned.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Form the Hamiltonian of the problem:
\[ H = \frac{3}{2}x^2 + \frac{1}{2}u^2 + \lambda(x + u). \]
The optimality condition gives
\[ 0 = H_u = u + \lambda \ \text{at } u^* \quad \Longrightarrow \quad u^* = -\lambda. \]
Notice the $\frac{1}{2}$ cancels with the $2$ which comes from the square on the control $u$. Also, the problem is a minimization problem, as $\frac{\partial^2 H}{\partial u^2} = 1 > 0$. We use the Hamiltonian to find the differential equation of the adjoint $\lambda$:
\[ \lambda'(t) = -H_x = -3x - \lambda, \qquad \lambda(1) = 0. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Pontryagin's Maximum Principle

Substituting the derived characterization $u^* = -\lambda$ for the control variable in the equation for $x$, we arrive at
\[ \begin{pmatrix} x \\ \lambda \end{pmatrix}' = \begin{pmatrix} 1 & -1 \\ -3 & -1 \end{pmatrix} \begin{pmatrix} x \\ \lambda \end{pmatrix}. \]
The eigenvalues of the coefficient matrix are $2$ and $-2$. Finding the eigenvectors, the equations for $x$ and $\lambda$ are
\[ \begin{pmatrix} x \\ \lambda \end{pmatrix}(t) = c_1\,e^{2t} \begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2\,e^{-2t} \begin{pmatrix} 1 \\ 3 \end{pmatrix}. \]
Using $x(0) = 1$ and $\lambda(1) = 0$, we find $c_1 = 3c_2\,e^{-4}$ and $c_2 = \frac{1}{3e^{-4} + 1}$. Thus, using the optimality equation, the optimal solutions are
\[ u^*(t) = \frac{3e^{-4}}{3e^{-4}+1}\,e^{2t} - \frac{3}{3e^{-4}+1}\,e^{-2t}, \qquad x^*(t) = \frac{3e^{-4}}{3e^{-4}+1}\,e^{2t} + \frac{1}{3e^{-4}+1}\,e^{-2t}. \]
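These closed-form expressions can be sanity-checked numerically (a quick sketch; it verifies the boundary conditions and both differential equations of the optimality system by central differences):

```python
import numpy as np

# Closed-form candidates from the necessary conditions of
#   min (1/2) int_0^1 (3x^2 + u^2) dt,   x' = x + u,   x(0) = 1
c2 = 1.0 / (3.0 * np.exp(-4.0) + 1.0)
c1 = 3.0 * c2 * np.exp(-4.0)

x   = lambda t: c1 * np.exp(2 * t) + c2 * np.exp(-2 * t)
lam = lambda t: -c1 * np.exp(2 * t) + 3 * c2 * np.exp(-2 * t)
u   = lambda t: -lam(t)                    # optimality condition u* = -lambda

print(x(0.0))      # boundary condition x(0) = 1
print(lam(1.0))    # transversality  lambda(1) = 0

# Check the ODEs x' = x + u and lam' = -3x - lam by central differences
tt = np.linspace(0.05, 0.95, 19)
d = 1e-6
xdot   = (x(tt + d) - x(tt - d)) / (2 * d)
lamdot = (lam(tt + d) - lam(tt - d)) / (2 * d)
print(np.max(np.abs(xdot - (x(tt) + u(tt)))))           # small residual
print(np.max(np.abs(lamdot - (-3 * x(tt) - lam(tt)))))  # small residual
```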

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

In the last chapter we developed necessary conditions to solve basic optimal control problems. However, some difficulties can arise with this method. It is possible that the necessary conditions yield multiple solution sets, only some of which are optimal controls. Further, recall that in the development of the necessary conditions, we began by assuming an optimal control exists. It is also possible that the necessary conditions are solvable when the original optimal control problem has no solution. We expect the objective functional evaluated at the optimal state and control to give a finite answer. If this objective functional value turns out to be $\infty$ or $-\infty$, we would say the problem has no solution.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

Example:
\[ \max_u \int_0^1 \big( x(t) + u(t) \big)\,dt \quad \text{subject to} \quad x'(t) = 1 - u^2(t), \ x(0) = 1. \]
The Hamiltonian and the optimality condition are:
\[ H(t, x, u, \lambda) = x + u + \lambda(1 - u^2), \qquad H_u = 1 - 2\lambda u = 0 \ \Rightarrow\ u = \frac{1}{2\lambda}. \]
From the adjoint equation and its boundary condition,
\[ \lambda' = -H_x = -1 \ \text{and} \ \lambda(1) = 0, \]
we can directly calculate $\lambda(t) = 1 - t$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

Note that the concavity with respect to the control $u$ is correct for a maximization problem,
\[ H_{uu} = -2\lambda \le 0, \ \text{as } \lambda(t) \ge 0. \]
Next, we calculate the optimal state using the differential equation and its boundary condition,
\[ x' = 1 - u^2 = 1 - \frac{1}{4(1-t)^2} \quad \text{and} \quad x(0) = 1, \]
and find that
\[ x^*(t) = t - \frac{1}{4(1-t)} + \frac{5}{4} \qquad \text{and} \qquad u^*(t) = \frac{1}{2(1-t)}. \]

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

However, notice that when the objective functional is evaluated at the optimal control and state, we do not obtain a finite answer:
\[ \int_0^1 \big[ x^*(t) + u^*(t) \big]\,dt = \int_0^1 \Big( t + \frac{5}{4} + \frac{1}{4(1-t)} \Big)\,dt = \infty. \]
There is no optimal control in this case, since we are considering problems with finite maximum (or minimum) objective functional values, even though the solutions we found satisfy the necessary conditions. In this simple example, the optimal control and state become unbounded at $t = 1$. In most applications we want the optimal control and state values to remain bounded.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

What caused this unbounded difficulty in this example? One explanation is the quadratic nonlinearity $u^2$ in the differential equation. For example, consider the simple differential equation $x' = x^2$ with $x(0) = 1$, which has a quadratic term. The solution is easily shown to be
\[ x(t) = \frac{1}{1 - t}, \]
which becomes unbounded in finite time (at $t = 1$). This illustrates how difficulty can arise with quadratic terms in differential equations.
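This blow-up is easy to observe numerically (a sketch comparing a standard ODE solver against the exact solution $x(t) = 1/(1-t)$ stated above):

```python
import numpy as np
from scipy.integrate import solve_ivp

# x' = x^2, x(0) = 1 has exact solution x(t) = 1/(1-t): blow-up at t = 1.
# Integrate up to t = 0.999, where the exact solution already reaches 1000.
sol = solve_ivp(lambda t, x: x**2, (0.0, 0.999), [1.0],
                rtol=1e-10, atol=1e-12, dense_output=True)

for t in (0.5, 0.9, 0.99):
    print(t, sol.sol(t)[0], 1.0 / (1.0 - t))   # numeric vs exact
```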

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

Existence and Uniqueness Results. It should be clear now that simply solving the necessary conditions is not enough to solve an optimal control problem. To completely justify the methods used, some existence and uniqueness results should be examined. First, what conditions can guarantee the existence of a finite objective functional value at the optimal control and state? The following theorem is an example of a sufficient condition result.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

Theorem: Consider
\[ J(u) = \int_{t_0}^{t_1} f(t, x(t), u(t))\,dt \quad \text{subject to} \quad x'(t) = g(t, x(t), u(t)), \ x(t_0) = x_0. \]
Suppose that $f(t,x,u)$ and $g(t,x,u)$ are both $C^1$ in their three arguments and concave in $x$ and $u$. Suppose $u^*$ is a control, with associated state $x^*$, and $\lambda$ a piecewise differentiable function, such that $u^*$, $x^*$, and $\lambda$ together satisfy on $t_0 \le t \le t_1$:
\[ f_u + \lambda g_u = 0, \qquad \lambda' = -(f_x + \lambda g_x), \qquad \lambda(t_1) = 0, \qquad \lambda(t) \ge 0. \]
Then for all controls $u$, we have $J(u^*) \ge J(u)$.

Adv. Scientific Computing III > Basic Optimal Control Problems [4] > Existence and solution properties

Proof. Let $u$ be any control, and $x$ its associated state. Note that, as $f(t,x,u)$ is concave in both the $x$ and $u$ variables, we have by the tangent line property
\[ f(t, x, u) - f(t, x^*, u^*) \le (x - x^*)\,f_x(t, x^*, u^*) + (u - u^*)\,f_u(t, x^*, u^*). \]

Adv. Scientific Computing III > PDEs [4]

The idea of optimal control of PDEs starts with a PDE with state variable $w$ and control $u$. Let $A$ be a partial differential operator with appropriate boundary conditions (BC) and initial conditions (IC):
\[ Aw = f(w, u) \ \text{in } \Omega\times(0,T), \quad \text{along with BC and IC}. \]
We will treat problems with space variable $x$ and time variable $t$, but one could also treat steady-state problems with only spatial variables.

Adv. Scientific Computing III > PDEs [4]

The goal of the problem is modeled by the cost functional, in integral form.

(P) Find the optimal control $u^*$, in an appropriate (admissible) control set, such that
\[ J(u^*) = \inf_u J(u), \]
with the objective functional
\[ J(u) = \int_0^T\!\!\int_\Omega g(x, t, w(x,t), u(x,t))\,dx\,dt. \]
After specifying a control set and a solution space for the states, one can usually obtain the existence of a state solution given a control. Namely, for a given control $u$, there exists a state solution $w = w(u)$, the notation showing the dependence of $w$ on $u$.

Adv. Scientific Computing III > PDEs [4] > Existence of an optimal control

Proving the existence of an optimal control in the PDE case requires more work than the ODE case. A priori estimates of the norms of the states in the solution space are needed to justify convergence. If the controls are bounded above and below, one can usually obtain corresponding bounds in the solution space for the states. This boundedness gives the existence of a minimizing sequence $\{u_n\}$ where
\[ \lim_{n\to\infty} J(u_n) = \inf_u J(u). \]
In the appropriate weak sense, this usually gives
\[ u_n \rightharpoonup u^* \ \text{in } L^2(\Omega\times(0,T)), \qquad w_n = w(u_n) \rightharpoonup w^* \ \text{in the solution space}, \]
for some $u^*$ and $w^*$. One must show that $w^* = w(u^*)$, which means that the solution depends continuously on the controls. We must also show that $u^*$ is an optimal control, i.e., $J(u^*) = \inf_u J(u)$.

Adv. Scientific Computing III > PDEs [4] > Existence of an optimal control To derive the necessary conditions for optimality, one needs to differentiate the cost functional with respect to the control, namely, differentiate the map u J(u). Note that w = w(u) usually contributes to J(u), so we must also differentiate the map u w(u). In the usual well-posed PDE problem, continuous dependence of the solution on the control would imply continuity of this map (for counterexamples, see e.g., [7, 6, 5, 3]), but differentiability is needed here.

Adv. Scientific Computing III > PDEs [4] > Sensitivities and necessary conditions
Assume that the map u ↦ w(u) is weakly differentiable in the directional derivative (Gâteaux) sense:
    lim_{ε→0} (w(u + εl) − w(u))/ε = ψ.
The function ψ is called the sensitivity of the state with respect to the control. A priori estimates of the quotients in the norm of the solution space give the existence of the limit function ψ and show that it solves a PDE, the linearized version of the state PDE:
    Lψ = F(w, l, u), with the appropriate BC, IC.
Note that the linear operator L comes from linearizing the state PDE operator A. Usually, ψ will have zero BC and IC since w(u + εl) and w(u) have the same BC and IC.

Adv. Scientific Computing III > PDEs [4] > Sensitivities and necessary conditions
Differentiating the objective functional J(u) with respect to u at u*,
    0 ≤ lim_{ε→0^+} (J(u* + εl) − J(u*))/ε.
We use the adjoint variable λ (solving an adjoint equation, adjoint to the state equation) and ψ to simplify this limit and obtain the explicit characterization u* = G(w, λ) of the optimal control. The operator in the adjoint equation is the adjoint operator (in the functional analysis sense) of the operator L acting on ψ in the sensitivity equation. The boundary conditions for the adjoint system come from the boundary conditions for the sensitivity equation and properties of the adjoint operator.

Adv. Scientific Computing III > PDEs [4] > Sensitivities and necessary conditions
Formally, the operator L and the adjoint operator L* are related by
    ⟨λ, Lψ⟩ = ⟨L*λ, ψ⟩,
where ⟨·, ·⟩ is the L²-inner product. A key tool is integration by parts in multiple dimensions, used to throw all the derivatives acting on ψ in the operator L onto λ, producing the derivatives in the operator L*.
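As a minimal illustration of this duality, take the heat operator (no discount factor), using the zero initial condition of ψ, the zero final-time condition of λ, and zero spatial boundary data:

```latex
\int_0^T\!\!\int_\Omega \lambda\,(\psi_t - \alpha\,\Delta\psi)\,dx\,dt
  = \int_0^T\!\!\int_\Omega \psi\,(-\lambda_t - \alpha\,\Delta\lambda)\,dx\,dt ,
```

since the time boundary terms vanish ($\psi(\cdot,0)=0$, $\lambda(\cdot,T)=0$) and the spatial boundary terms vanish ($\psi=\lambda=0$ on $\partial\Omega\times(0,T)$); hence $L\psi=\psi_t-\alpha\Delta\psi$ has adjoint $L^*\lambda=-\lambda_t-\alpha\Delta\lambda$.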

Adv. Scientific Computing III > PDEs [4] > Sensitivities and necessary conditions
In a time dependent problem, the adjoint problem usually has final time conditions. The non-homogeneous term of the adjoint equation comes from differentiating the integrand of the objective functional with respect to the state. Informally,
    L*λ = ∂(integrand of J)/∂w,
where L is the operator from the linearized PDE in ψ. We then obtain a characterization of an optimal control in terms of the solutions of the state and adjoint equations. This system, consisting of the state and adjoint equations together with the control characterization (called the optimality condition), is called the optimality system.

Adv. Scientific Computing III > PDEs [4] > Uniqueness of the optimal control
Uniqueness of the solutions to the optimality system,
    Aw = f(w, G(w, λ)),
    L*λ = ∂(integrand of J)/∂w evaluated at u = G(w, λ),
with BC, state IC, and final conditions for the adjoint system, will imply uniqueness of the optimal control, since an optimal control and the corresponding state and adjoint satisfy this system. In the usual time dependent case, like the diffusion equation, the adjoint equation has final time conditions while the state equation has an initial time condition. This means that the state equation and the adjoint equation have opposite time orientations:
    w(·, 0) = w_0(·) in Ω,   λ(·, T) = 0 in Ω.
Typically, one can only prove uniqueness for small T. Numerical algorithms are usually able to solve the optimality systems for larger times. Note that an alternative approach to proving uniqueness of the optimal control is to verify directly the strict convexity of the map u ↦ J(u); this involves calculating the second derivative of the control-to-state map u ↦ w(u).

Adv. Scientific Computing III > PDEs [4] > Uniqueness of the optimal control
A backward-forward sweep iteration can be used to solve such optimality systems. Each sweep must be done by some type of PDE solver, such as finite differences or finite elements. We repeat the iterations until successive controls and state solutions are close together. If there is a problem with uniqueness of the solutions of the optimality system, one might choose to also monitor the change in the objective functional.
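The sweep's structure can be sketched on a scalar LQ stand-in (the dynamics w' = −w + u, cost ∫_0^T (w² + u²)/2 dt with w(0) = 1, horizon, and tolerances below are all made up for illustration; in the PDE setting the two inner solves would be PDE solvers). For this toy problem the adjoint equation is λ' = −w + λ, λ(T) = 0, and the optimality condition is u = −λ.

```python
# Backward-forward sweep on a toy LQ problem (a stand-in for the PDE
# optimality system): minimize J(u) = ∫ (w^2 + u^2)/2 dt subject to
# w' = -w + u, w(0) = 1; adjoint: lam' = -w + lam, lam(T) = 0; u = -lam.
T, N = 1.0, 1000
h = T / N

def solve_state(u):
    # forward Euler sweep for w' = -w + u
    w = [1.0] * (N + 1)
    for i in range(N):
        w[i + 1] = w[i] + h * (-w[i] + u[i])
    return w

def solve_adjoint(w):
    # backward Euler sweep for lam' = -w + lam, lam(T) = 0
    lam = [0.0] * (N + 1)
    for i in range(N - 1, -1, -1):
        lam[i] = lam[i + 1] - h * (-w[i + 1] + lam[i + 1])
    return lam

def cost(u):
    w = solve_state(u)
    return h * sum(0.5 * (w[i] ** 2 + u[i] ** 2) for i in range(N))

u = [0.0] * (N + 1)            # initial control guess
delta = float("inf")
for sweep in range(200):
    w = solve_state(u)          # forward sweep
    lam = solve_adjoint(w)      # backward sweep
    u_new = [0.5 * (ui - li) for ui, li in zip(u, lam)]  # relaxed u = -lam
    delta = max(abs(a - b) for a, b in zip(u, u_new))
    u = u_new
    if delta < 1e-9:            # successive controls close together
        break

J_opt, J_zero = cost(u), cost([0.0] * (N + 1))
```

For small T the control-to-adjoint map is a contraction, so the relaxed updates converge, and the converged control lowers the cost relative to u ≡ 0.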

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
Consider the problem of harvesting in a diffusing population:
    w_t(x, t) − α Δw(x, t) = w(x, t)(1 − w(x, t)) − u(x, t)w(x, t) in Ω × (0, T),
    w(x, t) = 0 on ∂Ω × (0, T),
    w(x, 0) = w_0(x) ≥ 0 on Ω.
The state w(x, t) is the population density and the harvesting control is u(x, t). Note that the state equation has logistic growth w(1 − w) and constant diffusion coefficient α.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
The profit objective functional is
    J(u) = ∫_0^T ∫_Ω e^{−δt} ( p u(x, t) w(x, t) − B u(x, t)² ) dx dt,
which is a discounted revenue-less-cost stream. With p representing the price of the harvested population, puw represents the revenue from the harvested amount uw. We use a quadratic cost for the harvesting effort with a weight coefficient B. At first we consider the case of a positive constant B. The coefficient e^{−δt} is a discount term with 0 ≤ δ < 1. For convenience, we now take the price to be p = 1.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
A point of interest in a fishery application is where the marine reserves should be placed, i.e., the regions of no harvesting, where u*(x, t) = 0. We seek u* such that
    J(u*) = max_u J(u),
where the maximization is over all measurable controls with 0 ≤ u(x, t) ≤ M < 1 a.e. Under this set-up, we note that any state solution will satisfy w(x, t) ≥ 0 on Ω × (0, T), by the Maximum Principle for parabolic equations.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
First, we differentiate the map u ↦ w(u). Given a control u, consider another control u^ε = u + εl, where l is a variation function and ε > 0. Let w = w(u) and w^ε = w(u^ε). The state PDEs corresponding to the controls u and u^ε are
    w_t − α Δw = w(1 − w) − uw,
    w^ε_t − α Δw^ε = w^ε(1 − w^ε) − u^ε w^ε.
The difference quotient (w^ε − w)/ε satisfies the PDE
    ((w^ε − w)/ε)_t − α Δ((w^ε − w)/ε) = (w^ε − w)/ε − ((w^ε)² − w²)/ε − u (w^ε − w)/ε − l w^ε.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
Assume that as ε → 0,
    w^ε → w   and   (w^ε − w)/ε → ψ.
For the nonlinear term note that
    ((w^ε)² − w²)/ε = (w^ε + w)(w^ε − w)/ε → 2wψ.
The corresponding derivative quotients will converge:
    ((w^ε − w)/ε)_t → ψ_t,   α Δ((w^ε − w)/ε) → α Δψ.
The resulting PDE for ψ is
    ψ_t − α Δψ = ψ − 2wψ − uψ − lw in Ω × (0, T),
    ψ = 0 on ∂Ω × (0, T),
    ψ(x, 0) = 0 on Ω.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
Given an optimal control u* and the corresponding state w*, we write the sensitivity PDE as Lψ = −lw*, where
    Lψ = ψ_t − α Δψ − ψ + 2w* ψ + u* ψ.
To find the adjoint equation, i.e., the L* operator in the adjoint PDE, we use
    ∫_0^T ∫_Ω e^{−δt} λ Lψ dx dt = ∫_0^T ∫_Ω e^{−δt} ψ (L*λ + δλ) dx dt.
To see the specific terms of L*, use integration by parts:
    ∫_0^T ∫_Ω e^{−δt} λ ψ_t dx dt = ∫_0^T ∫_Ω e^{−δt} (δλ − λ_t) ψ dx dt.
The boundary terms on Ω × {T} and Ω × {0} vanish since λ and ψ are zero on the top and the bottom of our domain, respectively. The term with δ comes from the discount term in the objective functional.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
Next notice
    ∫_0^T ∫_Ω e^{−δt} λ ψ_xx dx dt = −∫_0^T ∫_Ω e^{−δt} λ_x ψ_x dx dt = ∫_0^T ∫_Ω e^{−δt} λ_xx ψ dx dt,
since λ and ψ are zero on ∂Ω × (0, T). The linear terms of L go directly into L* as the same types of terms. Our operator L* is
    L*λ = −λ_t − α Δλ − λ + 2w* λ + u* λ
and the adjoint PDE is
    L*λ + δλ = u* in Ω × (0, T),
    λ = 0 on ∂Ω × (0, T),
    λ(·, T) = 0 on Ω.
The non-homogeneous term u* in the adjoint PDE comes from
    ∂(integrand of J)/∂(state) = ∂(uw)/∂w = u,
where we use the integrand of J without the discount factor e^{−δt}, which came into play in the integration by parts above.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
Next we use the sensitivity and adjoint functions in the differentiation of the map u ↦ J(u). At the optimal control u*, the quotient is non-positive since J(u*) is the maximum value, i.e.,
    0 ≥ lim_{ε→0^+} (J(u* + εl) − J(u*))/ε.
Rewriting the adjoint equation as L*λ + δλ = u*, this limit simplifies to
    0 ≥ lim_{ε→0^+} ∫_0^T ∫_Ω e^{−δt} (1/ε) [ ((u* + εl)w^ε − u*w*) − (B(u* + εl)² − B(u*)²) ] dx dt
      = ∫_0^T ∫_Ω e^{−δt} (u*ψ + lw* − 2Bu*l) dx dt
      = ∫_0^T ∫_Ω e^{−δt} (ψ(L*λ + δλ) + lw* − 2Bu*l) dx dt
      = ∫_0^T ∫_Ω e^{−δt} (λLψ + lw* − 2Bu*l) dx dt
      = ∫_0^T ∫_Ω e^{−δt} (−λlw* + lw* − 2Bu*l) dx dt
      = ∫_0^T ∫_Ω e^{−δt} l (w*(1 − λ) − 2Bu*) dx dt,
where we used that the RHS of the ψ PDE is −lw*.

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
On the set {(x, t) : 0 < u*(x, t) < M}, the variation l can have any sign, because the optimal control can be modified a little up or down and still stay inside the bounds. Thus on this set, since B > 0, the rest of the integrand must be zero, so that
    u* = w*(1 − λ)/(2B).
Taking the upper and lower bounds into account, we obtain
    u* = min( M, max( w*(1 − λ)/(2B), 0 ) ).
This completes the analysis in the case of a positive balancing constant B.
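The bounded characterization is a pointwise projection of the interior formula onto [0, M]; a minimal sketch (the sample values below are made up):

```python
def harvest_control(w, lam, B, M):
    """Pointwise projection of w*(1 - lambda)/(2B) onto the bounds [0, M]."""
    return min(M, max(w * (1.0 - lam) / (2.0 * B), 0.0))

# made-up sample values
u1 = harvest_control(w=0.8, lam=0.5, B=1.0, M=0.9)   # interior value 0.2
u2 = harvest_control(w=0.8, lam=2.0, B=1.0, M=0.9)   # negative, clipped to 0
u3 = harvest_control(w=5.0, lam=0.0, B=1.0, M=0.9)   # 2.5, clipped to M
```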

Adv. Scientific Computing III > PDEs [4] > Harvesting example [1, 2]
However, the B = 0 case is also important. In this case, we are maximizing profit (yield) only. When B = 0, the problem is linear in the control u. The argument above goes through with B = 0 until the end, before we solve for u*. On the set {(x, t) : 0 < u*(x, t) < M}, the variation l can have any sign, which corresponds to λ = 1. This case is called singular because the integrand of the objective functional drops out on this set. Suppose λ = 1 on some set of positive measure. By looking at the adjoint PDE, and noting that the derivatives of λ are 0 there, we can solve for the state: w* = (1 − δ)/2. Now use this constant for w* in the state equation and solve for the optimal control: u* = (1 + δ)/2. By an argument similar to the one above, we get
    u*(x, t) = 0 if λ > 1,   (1 + δ)/2 if λ = 1,   M if λ < 1.

Adv. Scientific Computing III > PDEs [4] > Beaver example
Consider harvesting a beaver population that causes damage through flooding and destroying trees, i.e., consider only the nuisance side of the beavers, not the benefit side. The population density of the beaver species is given by the model
    z_t(x, y, t) − α Δz(x, y, t) = z(x, y, t)( a − b z(x, y, t) − u(x, y, t) ) in Ω × (0, T),
    z(x, y, t) = 0 on ∂Ω × (0, T),
    z(x, y, 0) = z_0(x, y) on Ω,
where α is the constant diffusion coefficient, and a and b are spatially dependent growth parameters. The control is u, the proportion of the population z to be trapped per unit time. The zero boundary conditions reflect the unsuitability of the surrounding habitat.

Adv. Scientific Computing III > PDEs [4] > Beaver example
Given the control set
    U = {u ∈ L²(Ω × (0, T)) : 0 ≤ u(x, y, t) ≤ M},
we seek to minimize the cost functional
    J(u) = ∫_0^T ∫_Ω e^{−rt} ( ½ γ z(x, y, t)² + c u(x, y, t)² z(x, y, t) ) dx dy dt.
In this objective functional, ½γz² represents the density dependent damage that beavers cause. The cu²z term represents the cost of trapping, which is composed of two factors, (cu, unit cost) × (uz, amount trapped). The term e^{−rt} discounts the accrued future cost. We find the optimal control u* that minimizes the objective functional:
    J(u*) = min_{u ∈ U} J(u).

Adv. Scientific Computing III > PDEs [4] > Beaver example
The model is very similar to the previous example, so we omit the derivation of the sensitivity equation: for a variation l, the resulting PDE for ψ is
    ψ_t − α Δψ = aψ − 2bz ψ − uψ − lz in Ω × (0, T),
    ψ = 0 on ∂Ω × (0, T),
    ψ(·, ·, 0) = 0 in Ω.
The form of the cost functional is slightly different, which affects the adjoint equation. Given an optimal control u* and corresponding state z*, the adjoint problem is given by
    −λ_t − α Δλ = aλ − 2bz* λ − rλ − u* λ + γz* + c(u*)² in Ω × (0, T),
    λ = 0 on ∂Ω × (0, T),
    λ(·, ·, T) = 0 in Ω.
The non-homogeneous term γz* + c(u*)² in the adjoint equation comes from differentiating the integrand ½γz² + cu²z of the objective functional (without the discount factor) with respect to the state z. Note again that the r term comes from the discount factor. The operator in the adjoint PDE is the adjoint operator of
    Lψ = ψ_t − α Δψ − aψ + 2bz* ψ + u* ψ.

Adv. Scientific Computing III > PDEs [4] > Beaver example
At the optimal control u*, we again differentiate J(u) with respect to u. The quotient is nonnegative since J(u*) is the minimum value:
    0 ≤ lim_{ε→0^+} (J(u* + εl) − J(u*))/ε.
We simplify this limit:
    0 ≤ lim_{ε→0^+} ∫_0^T ∫_Ω e^{−rt} (1/ε) [ ½ γ ((z^ε)² − (z*)²) + c ((u* + εl)² z^ε − (u*)² z*) ] dx dt
      = ∫_0^T ∫_Ω e^{−rt} ( γz*ψ + c(u*)²ψ + 2cu*z*l ) dx dt
      = ∫_0^T ∫_Ω e^{−rt} ( ψ(L*λ + rλ) + 2cu*z*l ) dx dt
      = ∫_0^T ∫_Ω e^{−rt} ( λLψ + 2cu*z*l ) dx dt
      = ∫_0^T ∫_Ω e^{−rt} ( −λlz* + 2cu*z*l ) dx dt
      = ∫_0^T ∫_Ω e^{−rt} l z* ( 2cu* − λ ) dx dt.
As before, we used
    ((z^ε)² − (z*)²)/(2ε) → z*ψ
and that the RHS of the ψ PDE is −lz*.

Adv. Scientific Computing III > PDEs [4] > Beaver example
Considering the possible variations l, we obtain
    u* = λ/(2c)
on the interior of the control set (away from the bounds), since z* is positive inside the domain. Taking the bounds into account, we conclude
    u* = min( max( 0, λ/(2c) ), M ).

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
Let us treat an example with a system of PDEs to show how to handle the adjoint system. We consider optimal control of a parabolic system with Neumann boundary conditions. Solutions of the state system represent the population densities of the prey and predator species. The system has Lotka-Volterra type growth terms and local interaction terms between the populations. The two controls represent harvesting (actually, rates of harvest). The spatial domain is a bounded set in R^n.

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
    w_t − b_1 Δw = (a_1 − d_1 w)w − c_1 wv − u_1 w,
    v_t − b_2 Δv = (a_2 − d_2 v)v + c_2 wv − u_2 v   in Q = Ω × (0, T),
with initial and boundary conditions
    w(x, 0) = w_0(x), v(x, 0) = v_0(x) for x ∈ Ω,
    ∂w/∂ν (x, t) = 0, ∂v/∂ν (x, t) = 0 on ∂Ω × (0, T).
Here, the terms and coefficients are
    u_1(x, t), u_2(x, t) = controls
    w(x, t) = prey population (1st state variable)
    v(x, t) = predator population (2nd state variable)
    b_1(x, t), b_2(x, t) = diffusion coefficients, strictly positive
    a_1(x, t), a_2(x, t), d_1(x, t), d_2(x, t) = standard logistic growth terms
    c_1(x, t), c_2(x, t) = interaction coefficients
    ν(x) = outward unit normal vector at x on ∂Ω.

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
The class of admissible controls is
    U = { (u_1, u_2) ∈ (L²(Q))² : 0 ≤ u_i ≤ Γ_i a.e. for i = 1, 2 }.
We want to maximize the cost functional J(u_1, u_2) defined by
    J(u_1, u_2) = ∫_Q ( K_1 u_1(x, t) w(x, t) + K_2 u_2(x, t) v(x, t) − M_1 u_1²(x, t) − M_2 u_2²(x, t) ) dx dt,
where K_1 u_1 w, K_2 u_2 v represent the revenue of harvesting, and M_1 u_1², M_2 u_2² denote the cost of the controls.

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
Let us calculate the sensitivities. First we assume that the map (u_1, u_2) ∈ U ↦ (w, v) is differentiable, i.e.,
    (w(u + εk) − w(u))/ε → ψ_1   and   (v(u + εk) − v(u))/ε → ψ_2   as ε → 0,
for any u = (u_1, u_2) ∈ U and bounded k = (k_1, k_2) such that (u + εk) ∈ U for ε small. The sensitivities satisfy
    (ψ_1)_t − b_1 Δψ_1 = a_1 ψ_1 − 2d_1 ψ_1 w − c_1 (wψ_2 + vψ_1) − u_1 ψ_1 − k_1 w in Q,
    (ψ_2)_t − b_2 Δψ_2 = a_2 ψ_2 − 2d_2 ψ_2 v + c_2 (wψ_2 + vψ_1) − u_2 ψ_2 − k_2 v in Q,
    ψ_1(x, 0) = 0 = ψ_2(x, 0) in Ω,
    ∂ψ_1/∂ν = 0 = ∂ψ_2/∂ν on ∂Ω × (0, T).

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
To derive the optimality system and to characterize the optimal control, we need adjoint variables and adjoints of the operators associated with the ψ_1, ψ_2 system. We write the ψ_1, ψ_2 PDE system as
    L (ψ_1; ψ_2) = (−k_1 w; −k_2 v),
where
    L (ψ_1; ψ_2) = (L_1 ψ_1; L_2 ψ_2) + M (ψ_1; ψ_2),
    L_1 ψ_1 = (ψ_1)_t − b_1 Δψ_1,   L_2 ψ_2 = (ψ_2)_t − b_2 Δψ_2,
and
    M = [ −a_1 + 2w d_1 + c_1 v + u_1      c_1 w
           −c_2 v                          −a_2 + 2d_2 v − c_2 w + u_2 ].

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
The adjoint PDE system is
    L* (λ_1; λ_2) = (K_1 u_1*; K_2 u_2*),
where K_1, K_2 are the constants from the objective functional. The adjoint operator is
    L* (λ_1; λ_2) = (L_1* λ_1; L_2* λ_2) + M^T (λ_1; λ_2),
where M^T denotes the transpose of the matrix. The derivative terms of the adjoint system are the same as in our other examples, yielding the adjoint system
    ( −(λ_1)_t − b_1 Δλ_1; −(λ_2)_t − b_2 Δλ_2 ) + M^T (λ_1; λ_2) = (K_1 u_1*; K_2 u_2*).
For the adjoint system, we have the appropriate BC, namely, zero Neumann conditions and zero final-time conditions. The adjoint system is evaluated at the optimal controls u* = (u_1*, u_2*) and corresponding states w*, v*.

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
    0 ≥ lim_{ε→0^+} (J(u* + εk) − J(u*))/ε
      = lim_{ε→0} ∫_Q [ K_1 u_1* (w^ε − w*)/ε + K_2 u_2* (v^ε − v*)/ε + K_1 k_1 w^ε + K_2 k_2 v^ε ] dx dt
        − lim_{ε→0} ∫_Q (1/ε) [ M_1 ((u_1* + εk_1)² − (u_1*)²) + M_2 ((u_2* + εk_2)² − (u_2*)²) ] dx dt
      = ∫_Q (ψ_1, ψ_2) · (K_1 u_1*; K_2 u_2*) dx dt + ∫_Q ( k_1 (K_1 w* − 2M_1 u_1*) + k_2 (K_2 v* − 2M_2 u_2*) ) dx dt
      = ∫_Q [ −(λ_1)_t ψ_1 − (λ_2)_t ψ_2 − b_1 Δλ_1 ψ_1 − b_2 Δλ_2 ψ_2 + (ψ_1, ψ_2) M^T (λ_1; λ_2) ] dx dt
        + ∫_Q ( k_1 (K_1 w* − 2M_1 u_1*) + k_2 (K_2 v* − 2M_2 u_2*) ) dx dt
      = ∫_Q (λ_1, λ_2) · (−k_1 w*; −k_2 v*) dx dt + ∫_Q ( k_1 (K_1 w* − 2M_1 u_1*) + k_2 (K_2 v* − 2M_2 u_2*) ) dx dt
      = ∫_Q ( k_1 (K_1 w* − λ_1 w* − 2M_1 u_1*) + k_2 (K_2 v* − λ_2 v* − 2M_2 u_2*) ) dx dt.

Adv. Scientific Computing III > PDEs [4] > Predator-prey example
We obtain the characterization of the optimal control pair:
    u_1*(x, t) = min( Γ_1, max( (K_1 − λ_1) w*/(2M_1), 0 ) ),
    u_2*(x, t) = min( Γ_2, max( (K_2 − λ_2) v*/(2M_2), 0 ) ).

Adv. Scientific Computing III > PDEs [4] > Identification example
We want to identify the unknown coefficient of the interaction term in a predator-prey system with Neumann BC in a 2-D bounded domain. Let w represent the prey population concentration and v the predator concentration. The system has local interaction terms representing a predator-prey situation, and the prey equation has a Lotka-Volterra type growth term.

Adv. Scientific Computing III > PDEs [4] > Identification example
The state system is
    w_t(x, t) − b_1 Δw(x, t) = ( a_1(x, t) − d(x, t) w(x, t) ) w(x, t) − u(x) w(x, t) v(x, t),
    v_t(x, t) − b_2 Δv(x, t) = −a_2(x, t) v(x, t) + u(x) w(x, t) v(x, t),
in Q = Ω × (0, T), with initial and boundary conditions
    w(x, 0) = w_0(x), v(x, 0) = v_0(x), x ∈ Ω,
    ∂w/∂ν (x, t) = 0, ∂v/∂ν (x, t) = 0 on ∂Ω × (0, T).
The terms and coefficients are
    u(x) = coefficient of the interaction term, to be identified
    w(x, t) = prey population (1st state variable)
    v(x, t) = predator population (2nd state variable)
    b_1, b_2 = diffusion coefficients
    a_1(x, t), a_2(x, t), d(x, t) = standard logistic growth terms
    ν(x) = outward unit normal vector at x on ∂Ω.

Adv. Scientific Computing III > PDEs [4] > Identification example
The class of admissible interaction coefficients (controls) is
    U = {u ∈ L²(Ω) : 0 ≤ u ≤ M a.e. in Ω}.
We want to minimize the objective functional
    J(u) = ½ ∫_W [ (w(x, t) − z_1(x, t))² + (v(x, t) − z_2(x, t))² ] dx dt + (β/2) ∫_Ω u²(x) dx,
where W ⊂ Q has positive measure, and z_1, z_2 are observations of the prey and predator populations. Namely, given partial (and perhaps noisy) observations z_1, z_2 of the true solution w, v on a subdomain W of Q, we seek to identify the u(x) which best matches the model to the data. Note that the control u is only a function of x, while our PDE system has space and time variables. The second term of the objective functional is artificial; we have added it to prevent the problem from being linear in the control. In practice, we choose β very small, so the emphasis of the problem lies in making the states close to the observed data. This method of solving the identification problem is based on Tikhonov regularization. Our eventual estimate for u will depend on the choice of β.
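The role of β can be seen on a toy Tikhonov problem (scalar least squares with made-up data, not the PDE problem): minimize ½ Σ (a_i u − z_i)² + (β/2) u², whose minimizer is u_β = Σ a_i z_i / (Σ a_i² + β).

```python
# Toy Tikhonov regularization: the regularized estimate shrinks with beta
# and approaches the unregularized least-squares fit as beta -> 0.
a = [1.0, 2.0, 3.0]
z = [1.1, 1.9, 3.2]   # made-up noisy observations

def u_beta(beta):
    num = sum(ai * zi for ai, zi in zip(a, z))
    den = sum(ai * ai for ai in a) + beta
    return num / den

u_ls = u_beta(0.0)       # unregularized least-squares estimate
u_small = u_beta(1e-3)   # small beta: close to u_ls
u_large = u_beta(10.0)   # large beta: shrunk toward 0
```

As β → 0 the estimate approaches the unregularized fit; larger β shrinks it, so the identified coefficient depends on β, as noted above.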

Adv. Scientific Computing III > PDEs [4] > Identification example
The mapping u ∈ U ↦ (w, v) is differentiable, i.e.,
    (w(u + εk) − w(u))/ε → ψ_1   and   (v(u + εk) − v(u))/ε → ψ_2   as ε → 0,
for any u ∈ U and bounded k such that (u + εk) ∈ U for ε small. As in the previous two examples, one can show that ψ_1, ψ_2 satisfy
    (ψ_1)_t − b_1 Δψ_1 = a_1 ψ_1 − 2d ψ_1 w − u (wψ_2 + vψ_1) − k wv,
    (ψ_2)_t − b_2 Δψ_2 = −a_2 ψ_2 + u (wψ_2 + vψ_1) + k wv in Q,
    ψ_1(x, 0) = 0 = ψ_2(x, 0) for x ∈ Ω,
    ∂ψ_1/∂ν = 0 = ∂ψ_2/∂ν on ∂Ω × (0, T).

Adv. Scientific Computing III > PDEs [4] > Identification example
To derive the optimality system and to characterize the optimal control, we need adjoint variables and adjoint operators associated with the ψ_1, ψ_2 system. We write the ψ_1, ψ_2 PDE system as
    L (ψ_1; ψ_2) = (−k wv; k wv),
where
    L (ψ_1; ψ_2) = (L_1 ψ_1; L_2 ψ_2) + M (ψ_1; ψ_2),
    L_1 ψ_1 = (ψ_1)_t − b_1 Δψ_1,   L_2 ψ_2 = (ψ_2)_t − b_2 Δψ_2,
and
    M = [ −a_1 + 2wd + uv      uw
           −uv                 a_2 − uw ].
We define the adjoint PDE system as
    L* (λ_1; λ_2) = ( (w − z_1) χ_W; (v − z_2) χ_W ),
where χ_W is the characteristic function of the set W,
    L* (λ_1; λ_2) = (L_1* λ_1; L_2* λ_2) + M^T (λ_1; λ_2),
and
    L_1* λ_1 = −(λ_1)_t − b_1 Δλ_1,   L_2* λ_2 = −(λ_2)_t − b_2 Δλ_2.

Adv. Scientific Computing III > PDEs [4] > Identification example
The components of the RHS of this system are the derivatives of the first integrand of the objective functional with respect to each state. We rewrite the objective functional as
    ½ ∫_Q [ (w(u) − z_1)² + (v(u) − z_2)² ] χ_W dx dt + (β/2) ∫_Ω u²(x) dx.
For the adjoint system, we have the appropriate boundary conditions, zero Neumann conditions and zero final-time conditions. For a fixed β, the adjoint system is evaluated at the optimal control u* and corresponding states w*, v*:
    L_1* λ_1 = (w* − z_1) χ_W + a_1 λ_1 − 2d w* λ_1 − u* (v* λ_1 − v* λ_2) in Q,   ∂λ_1/∂ν = 0 on ∂Ω × (0, T),
    L_2* λ_2 = (v* − z_2) χ_W − a_2 λ_2 + u* (w* λ_2 − w* λ_1) in Q,   ∂λ_2/∂ν = 0 on ∂Ω × (0, T),
with the transversality conditions λ_1(x, T) = 0 and λ_2(x, T) = 0 for x ∈ Ω.

Adv. Scientific Computing III > PDEs [4] > Identification example
The directional derivative of J(u) w.r.t. u in the direction k at u* is
    0 ≤ lim_{ε→0} (J(u* + εk) − J(u*))/ε
      = lim_{ε→0} ∫_W [ ((w^ε − z_1)² − (w* − z_1)²)/(2ε) + ((v^ε − z_2)² − (v* − z_2)²)/(2ε) ] dx dt
        + lim_{ε→0} (β/(2ε)) ∫_Ω [ (u* + εk)² − (u*)² ] dx
      = ∫_Q (ψ_1, ψ_2) · ( (w* − z_1) χ_W; (v* − z_2) χ_W ) dx dt + β ∫_Ω u* k dx
      = ∫_Q [ −(λ_1)_t ψ_1 − (λ_2)_t ψ_2 − b_1 Δλ_1 ψ_1 − b_2 Δλ_2 ψ_2 + (ψ_1, ψ_2) M^T (λ_1; λ_2) ] dx dt + β ∫_Ω u* k dx
      = ∫_Q (λ_1, λ_2) · (−k w* v*; k w* v*) dx dt + β ∫_Ω u* k dx
      = ∫_Q ( −k λ_1 w* v* + k λ_2 w* v* ) dx dt + β ∫_Ω u* k dx
      = ∫_Ω k(x) [ β u* + ∫_0^T ( −λ_1 + λ_2 ) w* v* dt ] dx.

Adv. Scientific Computing III > PDEs [4] > Identification example
The characterization of the optimal control is
    u_β(x) = min( M, max( (1/β) ∫_0^T (λ_1 − λ_2) w_β v_β dt, 0 ) ).

Adv. Scientific Computing III > PDEs [4] > Controlling boundary terms
    w_t − Δw = f in Q = Ω × (0, T),
    ∂w/∂ν = u on ∂Ω × (0, T),
    w(x, 0) = w_0(x) in Ω.
The control is the flux on the boundary and the source term f is a given bounded function. The control set is
    U = {u ∈ L²(∂Ω × (0, T)) : 0 ≤ u(x, t) ≤ M}.
We seek to adjust the flux control to drive the solution toward a desired profile and minimize the cost of control, by minimizing the cost functional
    J(u) = ½ ∫_Q (w(x, t) − z(x, t))² dx dt + (B/2) ∫_{∂Ω × (0, T)} u(x, t)² dσ dt,
with z ∈ L²(Q) a given function.

Adv. Scientific Computing III > PDEs [4] > Controlling boundary terms
Since the state equation is linear, the sensitivity equation is
    ψ_t − Δψ = 0 in Q = Ω × (0, T),
    ∂ψ/∂ν = k on ∂Ω × (0, T),
    ψ(x, 0) = 0 in Ω,
where ψ is the directional derivative of w with respect to u in the direction k. The adjoint equation is
    −λ_t − Δλ = w − z in Q = Ω × (0, T),
    ∂λ/∂ν = 0 on ∂Ω × (0, T),
    λ(x, T) = 0 in Ω.

Adv. Scientific Computing III > PDEs [4] > Controlling boundary terms
The optimality condition yields
    0 ≤ lim_{ε→0^+} (J(u* + εk) − J(u*))/ε
      = lim_{ε→0^+} [ (1/(2ε)) ∫_Q ((w^ε − z)² − (w* − z)²) dx dt + (B/(2ε)) ∫_{∂Ω × (0, T)} ((u* + εk)² − (u*)²) dσ dt ]
      = ∫_Q ψ(w* − z) dx dt + B ∫_{∂Ω × (0, T)} u* k dσ dt
      = ∫_Q ( −ψλ_t + ∇λ · ∇ψ ) dx dt + B ∫_{∂Ω × (0, T)} u* k dσ dt
      = ∫_{∂Ω × (0, T)} k (λ + Bu*) dσ dt,
and the optimal control is
    u* = min( M, max( −λ/B, 0 ) ).

Adv. Scientific Computing III > Appendix: Convex Functions
Let X be a real linear space, R̄ = [−∞, ∞].
Definition. The function ψ : X → R̄ is called convex if
    ψ(λx + (1 − λ)y) ≤ λψ(x) + (1 − λ)ψ(y)   (6.1)
for every λ ∈ [0, 1] and all x, y ∈ X. The function ψ : X → R̄ is called strictly convex if strict inequality holds in (6.1) for λ ∈ (0, 1) and x ≠ y with ψ(x) < ∞, ψ(y) < ∞.
Definition. A level set of ψ : X → R̄ is
    {x ∈ X : ψ(x) ≤ λ}, λ ∈ R.
If ψ is a convex function, its level sets are convex sets. The converse is not true.
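A numerical spot-check of (6.1), and of the remark that the converse fails, using the made-up examples ψ(x) = x² (convex) and φ(x) = |x|^{1/2} (convex level sets, the intervals [−c², c²], but not convex):

```python
# Spot-check the convexity inequality (6.1) for psi(x) = x^2, and show that
# phi(x) = |x|**0.5 violates it even though its level sets are intervals.
def psi(x):
    return x * x

def phi(x):
    return abs(x) ** 0.5

def ineq_61(f, x, y, lam):
    # f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y), up to rounding
    return f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y) + 1e-12

convex_ok = all(ineq_61(psi, x, y, lam)
                for x in (-2.0, 0.5, 3.0)
                for y in (-1.0, 2.0)
                for lam in (0.0, 0.25, 0.5, 1.0))
phi_fails = not ineq_61(phi, 0.0, 1.0, 0.5)   # phi(0.5) ~ 0.707 > 0.5
```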

Adv. Scientific Computing III > Appendix: Convex Functions
Definition. ψ is called proper convex if ψ(x) > −∞ for all x and ψ ≢ +∞. For a convex function ψ : X → R̄, we call the effective domain of ψ the convex set
    D(ψ) = {x ∈ X : ψ(x) < +∞}.
Definition. Given A ⊂ X, the indicator function of A, denoted I_A : X → R̄, is given by
    I_A(x) = 0 if x ∈ A, +∞ if x ∉ A.
Proposition. The subset A ⊂ X is convex if and only if its indicator I_A is convex.

Adv. Scientific Computing III > Appendix: Convex Functions
Definition. Let ψ : X → R̄. The epigraph of ψ is the set
    epi(ψ) = {(x, λ) : x ∈ X, λ ∈ R, λ ≥ ψ(x)}.
Proposition. A function ψ : X → R̄ is convex if and only if its epigraph is a convex subset of X × R.

Adv. Scientific Computing III > Appendix: Convex Functions
Definition. Let X be a topological space. Set
    lim inf_{x→x_0} ψ(x) = sup_{V ∈ V(x_0)} inf_{x ∈ V} ψ(x)
and
    lim sup_{x→x_0} ψ(x) = inf_{V ∈ V(x_0)} sup_{x ∈ V} ψ(x),
where V(x_0) is a base of neighborhoods of x_0 in X.
Definition. The function ψ : X → R̄ is lower-semicontinuous (upper-semicontinuous) at x_0 if
    lim inf_{x→x_0} ψ(x) ≥ ψ(x_0)   (lim sup_{x→x_0} ψ(x) ≤ ψ(x_0)).

Adv. Scientific Computing III > Appendix: Convex Functions
Proposition. Let X be a topological space and ψ : X → R̄, R̄ = [−∞, +∞]. The following conditions are equivalent:
(i) ψ is lower-semicontinuous on X;
(ii) the level sets {x ∈ X : ψ(x) ≤ λ}, λ ∈ R, are closed;
(iii) the epigraph of the function ψ is closed in X × R.
Corollary. A ⊂ X is closed if and only if its indicator I_A is lower-semicontinuous.
Theorem (Weierstrass). A lower-semicontinuous function ψ on a compact topological space X attains a minimum value on X. Moreover, if it takes only finite values, it is bounded from below.
Proposition. Any convex, proper and lower-semicontinuous function is bounded from below by an affine function.

Adv. Scientific Computing III > Appendix: Convex Functions
Definition. Let X, Y be two real linear normed spaces, F : D ⊂ X → Y. We say that the operator F has a directional derivative at a point x ∈ int D in direction h ∈ X if the limit
    F′(x, h) = lim_{λ→0} (F(x + λh) − F(x))/λ
exists.
- F′(x, 0) = 0 for any x ∈ int D;
- the operator h ↦ F′(x, h) is homogeneous, i.e., F′(x, αh) = αF′(x, h);
- in general F′(x, ·) is not linear (not additive).

Adv. Scientific Computing III > Appendix: Convex Functions
Definition. F is Gâteaux differentiable at a point x ∈ int D if the directional derivative F′(x, h) exists in every direction h and the operator h ↦ F′(x, h) is linear and continuous (i.e., F′(x, ·) ∈ L(X, Y)). In this case F′(x) ∈ L(X, Y),
    F′(x)h = lim_{λ→0} (F(x + λh) − F(x))/λ,
is called the Gâteaux derivative of F at the point x.
Definition (equivalent). The operator F is Gâteaux differentiable at a point x ∈ int D if there exists an operator A : X → Y such that
    lim_{λ→0} ‖F(x + λh) − F(x) − λAh‖ / λ = 0 for any h ∈ X.
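A difference-quotient check of the Gâteaux derivative for the made-up example F(x) = Σ x_i² on R³, where F′(x)h = 2⟨x, h⟩:

```python
# Difference-quotient check of the Gateaux derivative of F(x) = sum(x_i^2),
# whose derivative in direction h is F'(x)h = 2 * <x, h>.
def F(x):
    return sum(xi * xi for xi in x)

def dF(x, h):
    return 2.0 * sum(xi * hi for xi, hi in zip(x, h))

x = [1.0, -2.0, 0.5]
h = [0.3, 0.1, -0.7]
lam = 1e-6
quotient = (F([xi + lam * hi for xi, hi in zip(x, h)]) - F(x)) / lam
exact = dF(x, h)   # = -0.5 for these values
```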

Adv. Scientific Computing III > Appendix: Numerical examples
Identification of a diffusion coefficient. Consider the following 1-dimensional differential equation:
    −d/dx ( q(x) du(x)/dx ) = f(x), a < x < b,
    u(a) = u(b) = 0,   (7.1)
and a given function û ∈ L²(a, b). Find a control function q that minimizes the cost functional
    J(u(q)) = ½ ∫_a^b (u(x) − û(x))² dx + (α/2) ∫_a^b (dq(x)/dx)² dx.   (7.2)

Adv. Scientific Computing III > Appendix: Numerical examples Example problem
    [a, b] = [0, 1],
    û(x) = x(1 − x²),
    f(x) = 15x⁴ − 3x² + 6x.
For this problem, if the coefficient α in (7.2) is zero, then the optimal control function q(x) can be shown analytically to be
    q(x) = x³ + 1.
We use an iterative procedure to determine the optimal control function. The value of α is a parameter that can be set by the user. If it is not zero, then the control function will vary from the exact solution known for α = 0.
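The exact pair can be spot-checked by differentiating the flux: with q(x) = x³ + 1 and u = û, we should have −(q û′)′ = f. A quick numerical check (central differences; step size and sample points chosen arbitrarily):

```python
# Verify -(q(x) * uhat'(x))' = f(x) for q(x) = x^3 + 1, uhat(x) = x*(1 - x^2),
# f(x) = 15x^4 - 3x^2 + 6x, using a central difference for the derivative of
# the flux F(x) = q(x) * uhat'(x).
def q(x):
    return x ** 3 + 1.0

def uhat_prime(x):
    return 1.0 - 3.0 * x ** 2   # derivative of x*(1 - x^2)

def f(x):
    return 15.0 * x ** 4 - 3.0 * x ** 2 + 6.0 * x

def flux(x):
    return q(x) * uhat_prime(x)

h = 1e-5
errors = [abs(-(flux(x + h) - flux(x - h)) / (2.0 * h) - f(x))
          for x in (0.1, 0.25, 0.5, 0.75, 0.9)]
```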

Adv. Scientific Computing III > Appendix: Numerical examples Example problem
The Gradient Algorithm
(a) initialization:
  (i) choose ε and an initial iterate q⁰; set k = 0 and λ = 1;
  (ii) get u⁰ by solving (7.1) with q = q⁰;
  (iii) evaluate J(q⁰);
(b) main loop:
  (iv) set k = k + 1;
  (v) get p^k from (Adjoint Eq) with u = u^{k−1};
  (vi) set q^k = q^{k−1} − λ dJ(q^{k−1})/dq;
  (vii) get u^k from (7.1) with q = q^k;
  (viii) evaluate J(q^k);
  (ix) if J(q^k) ≥ J(q^{k−1}), set λ = λ/2 and go to (vi); otherwise continue;
  (x) if |J(q^k) − J(q^{k−1})| / |J(q^k)| > ε, set λ = 1.5λ and go to (iv); otherwise stop.
The bulk of the computational cost is in the backward-in-time solution of the discrete adjoint system in step (v) and in the forward-in-time solution of the discrete state system in step (vii).
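Steps (vi) and (ix)-(x) amount to gradient descent with an adaptive step size; the logic can be sketched on a made-up scalar cost J(q) = (q − 2)², standing in for the PDE-constrained J (where evaluating J and dJ/dq would require the state and adjoint solves):

```python
# Gradient descent with the halve-on-increase / grow-on-success step-size
# rule of steps (vi)-(x), applied to a made-up scalar cost J(q) = (q - 2)^2.
def J(q):
    return (q - 2.0) ** 2

def dJ(q):
    return 2.0 * (q - 2.0)

eps = 1e-8
q, lam = 0.0, 1.0        # initial iterate and step size
J_old = J(q)
for k in range(1000):
    # (vi) + (ix): halve lam until the cost actually decreases
    while True:
        q_new = q - lam * dJ(q)
        if J(q_new) < J_old:
            break
        lam /= 2.0
    J_new = J(q_new)
    # (x): stop on small relative change, otherwise grow the step
    done = J_new == 0.0 or abs(J_new - J_old) / abs(J_new) <= eps
    q, J_old = q_new, J_new
    lam *= 1.5
    if done:
        break
```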

Adv. Scientific Computing III > Appendix: Numerical examples Example problem
[Figures: iterates u^k vs. û and q^k vs. q exact, for k = 0, 1, …, 7.]

Adv. Scientific Computing III > Appendix: Numerical examples Plankton-Fish System
State equations:
    u_t = d_1 Δu + ru(1 − u) − a uv/(1 + bu) in Q,
    v_t = d_2 Δv + a uv/(1 + bu) − mv − f(x, t) g v²/(1 + g² v²) in Q,
    ∂u/∂ν = ∂v/∂ν = 0 on ∂Ω × (0, T),
    u(x, 0) = u_0(x), v(x, 0) = v_0(x) in Ω.
state variables: u, v (phyto-/zooplankton)
control variable: f (predation rate)
u_0, v_0 ∈ H¹(Ω) ∩ L^∞(Ω), d_1, d_2, r, a, b, g, m > 0

Adv. Scientific Computing III > Appendix: Numerical examples Controlling plankton densities
Cost functional:
    J(u, v, f) = ½ ∫_Q ( |u − ū|² + |v − v̄|² ) dx dt + (α/2) ∫_Q |f_t|² dx dt
target densities: ū, v̄;   α > 0

Adv. Scientific Computing III > Appendix: Numerical examples Department of Mathematics Math 2603

Adv. Scientific Computing III > Appendix: Numerical examples Magnetohydrodynamics
The state equations. The Navier-Stokes equations:
    u_t − (1/Re) Δu + (u · ∇)u − S curl B × B + ∇p = ϕ,
    div u = 0, in Ω × (0, T).
The Maxwell equations:
    B_t + (1/Re_m) curl (curl B) − curl (u × B) = curl ψ,
    div B = 0, in Ω × (0, T).


Adv. Scientific Computing III > Appendix: Numerical examples
[Gierer and Meinhardt (1972) activator-inhibitor model]
    u_t = D_u Δu + r u²/v − µu + r,
    v_t = D_v Δv + r u² − αv,
where r, µ, α, D_u, D_v > 0.
