Optimal Control. Historical Overview and Theoretical Development. Nicolai Christov

Laboratoire Franco-Chinoise d'Automatique et Signaux, Université Lille1 Sciences et Technologies and Nanjing University of Science and Technology
École d'été CA NTI, Université "Politehnica" de Bucarest, 26-30 May 2014

Short contents

1. The Optimal Control Problem
2. Calculus of Variations
3. Optimal Control without Inequality Constraints
4. Pontryagin Maximum Principle

1. The Optimal Control Problem

1.1 The brachistochrone problem

"I, Johann Bernoulli, address the most brilliant mathematicians in the world. Nothing is more attractive to intelligent people than an honest, challenging problem, whose possible solution will bestow fame and remain as a lasting monument. Following the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the whole scientific community by placing before the finest mathematicians of our time a problem which will test their methods and the strength of their intellect. If someone communicates to me the solution of the proposed problem, I shall publicly declare him worthy of praise.

Given two points A and B in a vertical plane, what is the curve traced out by a point acted on only by gravity, which starts at A and reaches B in the shortest time?" (Acta Eruditorum, June 1696)

J. Bernoulli (1667-1748); the journal Acta Eruditorum

Johann Bernoulli made a number of contributions to infinitesimal calculus and educated Leonhard Euler.

In the 17th century there were only four scientific journals: Journal des savants (1665), Philosophical Transactions of the Royal Society (1665), Giornale de' letterati (1668) and Acta Eruditorum (1682).

The solution of the brachistochrone problem, which is a segment of a cycloid, was found by Newton, Leibniz, L'Hôpital, and the two Bernoulli brothers, Johann and Jacob.

The Royal Society published Newton's solution anonymously in the Philosophical Transactions of the Royal Society in January 1697.

The May 1697 issue of Acta Eruditorum contained Leibniz's solution to the brachistochrone problem, Johann Bernoulli's solution, Jacob Bernoulli's solution, and a Latin translation of Newton's solution.

The solution by de L'Hôpital was not published until 1988.
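The claim that a cycloid beats other curves can be checked numerically. The following is a minimal sketch, not part of the original lecture: it assumes a point starting at rest at A = (0, 0), a target B = (x_b, y_b) with y_b measured downward, the cycloid parametrization x = r(θ − sin θ), y = r(1 − cos θ), and g = 9.81 m/s². The function names are illustrative.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def cycloid_angle(x_b, y_b, tol=1e-12):
    """Solve (theta - sin theta) / (1 - cos theta) = x_b / y_b by bisection.

    The left-hand side increases monotonically on (0, 2*pi), so bisection
    on a slightly shrunken bracket is sufficient for moderate x_b / y_b.
    """
    target = x_b / y_b
    lo, hi = 1e-3, 2.0 * math.pi - 1e-3
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        ratio = (mid - math.sin(mid)) / (1.0 - math.cos(mid))
        if ratio < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cycloid_time(x_b, y_b):
    """Descent time along the cycloid through (0, 0) and (x_b, y_b)."""
    theta = cycloid_angle(x_b, y_b)
    radius = y_b / (1.0 - math.cos(theta))
    # Along the cycloid dt = sqrt(r / g) * d(theta), so T = theta * sqrt(r / g).
    return theta * math.sqrt(radius / G)


def straight_line_time(x_b, y_b):
    """Descent time along the straight segment from (0, 0) to (x_b, y_b)."""
    length_sq = x_b ** 2 + y_b ** 2
    # Constant acceleration g * y_b / L along a ramp of length L.
    return math.sqrt(2.0 * length_sq / (G * y_b))
```

For B = (1, 1), `cycloid_time(1.0, 1.0)` comes out at roughly 0.58 s against roughly 0.64 s for `straight_line_time(1.0, 1.0)`.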

I. Newton (1643-1727), G.W. Leibniz (1646-1716)

Isaac Newton is considered by many to be the greatest and most influential scientist who ever lived.

Gottfried Wilhelm Leibniz was one of the great thinkers of the seventeenth and eighteenth centuries and is known as the last "universal genius".

Jacob Bernoulli (1654-1705), G. de l'Hôpital (1661-1704)

Jacob Bernoulli made important contributions to probability theory and differential equations.

Guillaume de l'Hôpital was a student of Johann Bernoulli and is the author of the first textbook on differential calculus, Analyse des infiniment petits pour l'intelligence des lignes courbes (1696).

1.2 General assumptions

Consider continuous-time dynamical systems

$$\dot{x}(t) = f(x(t), u(t), t) \tag{1}$$

The state and control vectors are subject to constraints:

$$x(t) \in X, \quad u(t) \in U \tag{2}$$

The function $f$ is continuous (or at least piecewise continuous) and has partial derivatives with respect to all its arguments.

The function $f$ satisfies the Lipschitz condition

$$\| f(x, u, t) - f(\tilde{x}, u, t) \| \le K \| x - \tilde{x} \|$$

1.3 Statement of the optimal control problem

Find the control $u(t)$ which transfers the state $x(t)$ from an initial value $x(t_0)$ to a final value $x(t_f)$, satisfies the constraint $x(t) \in X$, and minimizes a performance index (optimality criterion)

$$I = \int_{t_0}^{t_f} F(x(t), u(t), t)\,dt + \varphi(x(t_0), x(t_f), t_0, t_f) \tag{3}$$

or

$$I = \int_{t_0}^{t_f} F(x(t), u(t), t)\,dt + \varphi(x(t_f), t_f) \tag{4}$$

1.4 Types of optimal control problems

According to the optimality criterion:
- time optimal control problem
- fuel optimal control problem
- minimization of the state vector deviation from a reference value

According to the boundary conditions:
- fixed or free final state or time

According to the state equation:
- linear / nonlinear
- time-invariant or not
- with or without external perturbations

The solution of the optimal control problem depends on whether the bounds for $u$ and $x$ are attained or not:

First case: the boundary values of $u$ and/or $x$ are not attained $\Rightarrow$ calculus of variations.

Second case: the boundary values of $u$ and/or $x$ are attained $\Rightarrow$ Pontryagin maximum principle.

2. Calculus of Variations

2.1 Preliminaries: definitions

Function: Variable $\to$ Variable

Functional: Function $\to$ Variable

Standard problem: find $x(t)$ minimizing the functional

$$I = \int_{t_0}^{t_f} F(x(t), \dot{x}(t), t)\,dt, \quad x(t) \in \mathbb{R}^n. \tag{5}$$

Thus one looks for $x^*(t)$ such that $I(x) > I(x^*)$ for all $x(t) \ne x^*(t)$.

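2.1 Preliminaries: variational approach

The standard problem can be reformulated in the following way: find $x(t)$ such that

$$I(\varepsilon) = \int_{t_0 + \varepsilon\tau_0}^{t_f + \varepsilon\tau_f} F(x(t) + \varepsilon\eta(t), \dot{x}(t) + \varepsilon\dot{\eta}(t), t)\,dt \tag{6}$$

is minimal for $\varepsilon = 0$, whatever the constants $\tau_0$ and $\tau_f$ and regardless of the function $\eta(t)$.

Necessary condition for a minimum:

$$\left(\frac{dI(\varepsilon)}{d\varepsilon}\right)_{\varepsilon=0} = 0$$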

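2.2 Euler equation

In the case when the limits $t_0$, $t_f$, $x(t_0)$, $x(t_f)$ are fixed, we have

$$\left(\frac{dI(\varepsilon)}{d\varepsilon}\right)_{\varepsilon=0} = \int_{t_0}^{t_f}\left(\frac{\partial F}{\partial x}\eta + \frac{\partial F}{\partial \dot{x}}\dot{\eta}\right)dt = \int_{t_0}^{t_f}\left(\frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}\right)\eta\,dt$$

L. Euler showed that $\left(\frac{dI(\varepsilon)}{d\varepsilon}\right)_{\varepsilon=0}$ is zero for each $\eta(t)$ if and only if the equation

$$\frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} = 0 \tag{7}$$

is satisfied. This equation is called the Euler equation.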
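As an illustration that is not part of the original slides, the Euler equation (7) can be applied to the shortest-path problem for a scalar $x(t)$ with fixed end points, where the integrand $F = \sqrt{1 + \dot{x}^2}$ is the arc-length element. A sketch of the computation:

```latex
\[
\begin{aligned}
F(x, \dot{x}, t) &= \sqrt{1 + \dot{x}^2},
\qquad
\frac{\partial F}{\partial x} = 0,
\qquad
\frac{\partial F}{\partial \dot{x}} = \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}, \\[4pt]
\frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} = 0
&\;\Longrightarrow\;
\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} = \text{const}
\;\Longrightarrow\;
\dot{x}(t) = c, \\[4pt]
x(t) &= x(t_0) + \frac{x(t_f) - x(t_0)}{t_f - t_0}\,(t - t_0).
\end{aligned}
\]
```

The extremal is the straight line joining the two end points, as expected.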

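2.3 Transversality conditions

If the limits $t_0$, $t_f$, $x(t_0)$, $x(t_f)$ are not fixed, we have

$$\left(\frac{dI(\varepsilon)}{d\varepsilon}\right)_{\varepsilon=0} = \int_{t_0}^{t_f}\left(\frac{\partial F}{\partial x} - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}\right)\eta\,dt + \left.\frac{\partial F}{\partial \dot{x}^T}\eta\right|_{t_f} - \left.\frac{\partial F}{\partial \dot{x}^T}\eta\right|_{t_0} + \left.\left(F - \frac{\partial F}{\partial \dot{x}^T}\dot{x}\right)\right|_{t_f}\tau_f - \left.\left(F - \frac{\partial F}{\partial \dot{x}^T}\dot{x}\right)\right|_{t_0}\tau_0$$

Thus the minimization of $I$ consists of satisfying both the Euler equation (7) and the transversality conditions

$$\left.\frac{\partial F}{\partial \dot{x}}\right|_{t_f} = 0, \quad \left.\frac{\partial F}{\partial \dot{x}}\right|_{t_0} = 0, \quad \left.\left(F - \frac{\partial F}{\partial \dot{x}^T}\dot{x}\right)\right|_{t_f} = 0, \quad \left.\left(F - \frac{\partial F}{\partial \dot{x}^T}\dot{x}\right)\right|_{t_0} = 0 \tag{8}$$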

2.4 Variational problem with equality constraint

In this case one seeks the function $x(t)$ that minimizes

$$I = \int_{t_0}^{t_f} F(x(t), \dot{x}(t), t)\,dt \tag{9}$$

under a constraint

$$g(x(t), \dot{x}(t), t) = 0. \tag{10}$$

J.L. Lagrange showed that the solution of this problem can be obtained by solving the standard problem

$$I = \int_{t_0}^{t_f}\left(F(x(t), \dot{x}(t), t) + \lambda^T(t)\,g(x(t), \dot{x}(t), t)\right)dt \to \min. \tag{11}$$

The function $\lambda(t)$ is called the Lagrange multiplier.

L. Euler (1707-1783), J.L. Lagrange (1736-1813)

L. Euler made enormous contributions to a wide range of mathematics and physics.

J.L. Lagrange made important contributions to all fields of analysis, number theory, and classical and celestial mechanics.

L. Euler is regarded as the equivalent of a doctoral advisor to J.L. Lagrange.

3. Optimal Control without Inequality Constraints

3.1 General case

For the dynamical system

$$\dot{x}(t) = f(x(t), u(t), t), \quad x(t_0) = x_0 \tag{12}$$

one seeks the control $u(t)$ that minimizes the performance index

$$I = \int_{t_0}^{t_f} F(x(t), u(t), t)\,dt. \tag{13}$$

Define the function

$$L(x, \dot{x}, u, \lambda, t) = F(x, u, t) + \lambda^T(t)\left(f(x, u, t) - \dot{x}\right) \tag{14}$$

called the Lagrange function (Lagrangian).

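Thus the problem is to minimize

$$J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt. \tag{15}$$

In this case the necessary condition for a minimum is

$$
\begin{aligned}
\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = 0 \quad &\Longrightarrow \quad \frac{\partial L}{\partial x} + \dot{\lambda} = 0 \\
\frac{\partial L}{\partial \lambda} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\lambda}} = 0 \quad &\Longrightarrow \quad \frac{\partial L}{\partial \lambda} = 0 \\
\frac{\partial L}{\partial u} - \frac{d}{dt}\frac{\partial L}{\partial \dot{u}} = 0 \quad &\Longrightarrow \quad \frac{\partial L}{\partial u} = 0
\end{aligned}
\tag{16}
$$

These equations are called the Euler-Lagrange equations.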
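A small worked case may help to see how the three Euler-Lagrange equations (16) combine. The scalar system $\dot{x} = u$ with $F = \tfrac{1}{2}(x^2 + u^2)$ is an assumption chosen for illustration, not an example from the lecture:

```latex
\[
\begin{aligned}
L &= \tfrac{1}{2}\left(x^2 + u^2\right) + \lambda\,(u - \dot{x}), \\[4pt]
\frac{\partial L}{\partial x} + \dot{\lambda} = 0
&\;\Longrightarrow\; \dot{\lambda} = -x, \\
\frac{\partial L}{\partial \lambda} = 0
&\;\Longrightarrow\; \dot{x} = u, \\
\frac{\partial L}{\partial u} = 0
&\;\Longrightarrow\; u = -\lambda, \\[4pt]
\ddot{x} = \dot{u} = -\dot{\lambda} = x
&\;\Longrightarrow\; x(t) = c_1 e^{t} + c_2 e^{-t}.
\end{aligned}
\]
```

The constants $c_1$, $c_2$ follow from the initial state and from the final state or the corresponding transversality condition.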

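If $t_f$ and $x(t_f)$ are free, it is necessary to take into account the transversality conditions:

$$x(t_f) \text{ free} \quad \Longrightarrow \quad \left.\frac{\partial L}{\partial \dot{x}}\right|_{t_f} = 0$$

$$t_f \text{ free} \quad \Longrightarrow \quad \left.\left(L - \frac{\partial L}{\partial \dot{x}^T}\dot{x}\right)\right|_{t_f} = 0$$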

3.2 Linear-Quadratic problem

For the linear system

$$\dot{x}(t) = Ax(t) + Bu(t), \quad x(0) = x_0 \tag{17}$$

one seeks the control $u(t)$ that minimizes the quadratic criterion

$$I = \frac{1}{2}\int_{0}^{t_f}\left(x^T(t)Qx(t) + u^T(t)Ru(t)\right)dt + \frac{1}{2}x^T(t_f)Sx(t_f) \tag{18}$$

where $Q$ and $S$ are symmetric positive semi-definite matrices and $R$ is positive definite.

The final time $t_f$ is fixed and the final state $x(t_f)$ is free.

In this case the Euler-Lagrange equations are

$$
\begin{aligned}
\dot{\lambda}(t) &= -Qx(t) - A^T\lambda(t) \\
\dot{x}(t) &= Ax(t) + Bu(t) \\
Ru(t) + B^T\lambda(t) &= 0 \quad \Longrightarrow \quad u(t) = -R^{-1}B^T\lambda(t)
\end{aligned}
\tag{19}
$$

with the transversality condition

$$\lambda(t_f) = Sx(t_f) \tag{20}$$

In 1960 R.E. Kalman showed that

$$\lambda(t) = P(t)x(t), \quad P(t_f) = S \tag{21}$$

where $P(t)$ satisfies the differential Riccati equation

$$-\dot{P} = A^TP + PA - PBR^{-1}B^TP + Q, \quad P(t_f) = S. \tag{22}$$

It follows that

$$u(t) = -K(t)x(t), \quad K(t) = R^{-1}B^TP(t) \tag{23}$$

i.e. we have a feedback optimal control.
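Because the Riccati equation carries a terminal condition, it is integrated backward in time. The following is a minimal sketch for the scalar case only, assuming the system $\dot{x} = ax + bu$ with scalar weights q, r, s and a fixed-step RK4 scheme; the function names and the step count are illustrative choices, not part of the lecture.

```python
import math


def riccati_rhs(p, a, b, q, r):
    """Scalar Riccati right-hand side in reversed time tau = t_f - t.

    From -dP/dt = 2 a P - (b^2 / r) P^2 + q it follows that
    dP/dtau = 2 a P - (b^2 / r) P^2 + q.
    """
    return 2.0 * a * p - (b * b / r) * p * p + q


def solve_scalar_riccati(a, b, q, r, s, horizon, steps=10000):
    """Integrate backward from P(t_f) = s with RK4 and return P(t_f - horizon)."""
    h = horizon / steps
    p = s
    for _ in range(steps):
        k1 = riccati_rhs(p, a, b, q, r)
        k2 = riccati_rhs(p + 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p + 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p + h * k3, a, b, q, r)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def feedback_gain(p, b, r):
    """Gain K = b P / r of the feedback law u = -K x (scalar case)."""
    return b * p / r
```

With a = b = q = r = 1, s = 0 and a horizon of 10, `solve_scalar_riccati(1.0, 1.0, 1.0, 1.0, 0.0, 10.0)` settles close to 1 + √2, the positive root of the stationary equation 2aP − (b²/r)P² + q = 0.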

3.2 Linear-Quadratic problem: infinite horizon

In this case one seeks the control $u(t)$ minimizing the criterion

$$I = \frac{1}{2}\int_{0}^{\infty}\left(x^T(t)Qx(t) + u^T(t)Ru(t)\right)dt \tag{24}$$

with $x_f = 0$.

R.E. Kalman showed that if the pair $(A, B)$ is controllable and the pair $(A, Q)$ is observable, the matrix $P$ in the relation $\lambda(t) = Px(t)$ is constant and is the unique positive definite solution of the algebraic Riccati equation

$$A^TP + PA - PBR^{-1}B^TP + Q = 0. \tag{25}$$

Thus the optimal control is

$$u(t) = -R^{-1}B^TPx(t) = -Kx(t). \tag{26}$$
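For a concrete instance of (25) and (26), consider the double integrator with $Q = I$ and $R = 1$. This example is an assumption added for illustration; it is not taken from the lecture. Writing $P$ as a symmetric matrix and equating the entries of (25) to zero gives:

```latex
\[
A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
Q = I, \quad R = 1, \quad
P = \begin{pmatrix} p_{11} & p_{12} \\ p_{12} & p_{22} \end{pmatrix}
\]
\[
\begin{aligned}
(1,1):&\quad 1 - p_{12}^2 = 0, \\
(1,2):&\quad p_{11} - p_{12}\,p_{22} = 0, \\
(2,2):&\quad 2p_{12} - p_{22}^2 + 1 = 0
\end{aligned}
\;\Longrightarrow\;
P = \begin{pmatrix} \sqrt{3} & 1 \\ 1 & \sqrt{3} \end{pmatrix},
\quad
K = R^{-1}B^TP = \begin{pmatrix} 1 & \sqrt{3} \end{pmatrix}.
\]
```

The positive definite root is selected ($p_{12} = 1$, $p_{22} = \sqrt{3}$). The closed-loop matrix $A - BK$ then has the characteristic polynomial $s^2 + \sqrt{3}\,s + 1$, so the feedback $u = -x_1 - \sqrt{3}\,x_2$ is stabilizing.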

R.E. Kalman (born 1930), L.S. Pontryagin (1908-1988)

In the 1960s, R.E. Kalman was the leader in the development of modern control theory. His main contributions are the "Kalman filter" and the Linear-Quadratic Regulator.

L.S. Pontryagin made important contributions to topology, algebra, and dynamical systems. In 1961 he published The Mathematical Theory of Optimal Processes with his students V.G. Boltyanskii, R.V. Gamkrelidze and E.F. Mishchenko.

4. Pontryagin Maximum Principle

4.1 Maximum principle formulation

Consider now the optimal control problem with inequality constraints $u(t) \in U$. In this case the optimal control consists of parts where $u(t)$ is in the interior of $U$ and parts where $u(t)$ is on the boundary of $U$.

- $u(t)$ is in the interior of $U$: one can use the classical variational methods.
- $u(t)$ is on the boundary of $U$: one cannot use the classical variational methods, because variations cannot be taken outside the boundary.

One seeks the control $u(t) \in U$ that transfers the system

$$\dot{x}(t) = f(x(t), u(t)) \tag{27}$$

from the state $x_0$ to the state $x_f$ and minimizes the criterion

$$I = \int_{t_0}^{t_f} F(x(t), u(t))\,dt. \tag{28}$$

To the $n$ components $x_i$ of the state vector $x$ the component

$$x_0 = \int_{t_0}^{t} F(x(\tau), u(\tau))\,d\tau$$

is added (here $x_0$ denotes the additional zeroth state component, not the initial state), so that if $x = x_f$, then $x_0(t_f) = I$. One has

$$\dot{x}_0(t) = F(x(t), u(t)) \equiv f_0(x(t), u(t)) \tag{29}$$

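Introduce the adjoint vector $\psi$ such that

$$\dot{\psi} = -\left(\frac{\partial f}{\partial x}\right)^T\psi \quad \Longleftrightarrow \quad \dot{\psi}_i = -\sum_{j=0}^{n}\frac{\partial f_j(x, u)}{\partial x_i}\psi_j, \quad i = 0, \ldots, n \tag{30}$$

and the Hamiltonian

$$H(\psi, x, u) = \psi^T f = \sum_{i=0}^{n}\psi_i f_i. \tag{31}$$

Equations (27), (29) and (30) can be rewritten in the Hamilton canonical form

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial \psi_i}, \quad \frac{d\psi_i}{dt} = -\frac{\partial H}{\partial x_i}, \quad i = 0, \ldots, n \tag{32}$$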

Denote $M(\psi, x) = \sup_{u \in U} H(\psi, x, u)$.

In 1956 L.S. Pontryagin and his students V.G. Boltyanskii, R.V. Gamkrelidze and E.F. Mishchenko formulated the so-called Pontryagin Maximum Principle:

In order for a control $u(t)$ and a trajectory $x(t)$ to be optimal, it is necessary that there exists a nonzero continuous vector function $\psi(t)$ satisfying the system (32) such that:

1. The Hamiltonian $H(\psi(t), x(t), u)$ attains its maximum at $u = u(t)$ for all $t \in [t_0, t_f]$, i.e.

$$H(\psi(t), x(t), u(t)) = M(\psi(t), x(t)) \tag{33}$$

2. At the terminal time $t_f$, the relations

$$\psi_0(t_f) \le 0, \quad M(\psi(t_f), x(t_f)) = 0 \tag{34}$$

are satisfied.

4.2 Time optimal control

In this case one seeks the control $u(t)$ transferring the system

$$\dot{x}(t) = f(x(t), u(t)) = a(x(t)) + B(x(t))u(t) \tag{35}$$

from the state $x_0$ to the state $x_f$ and minimizing the criterion

$$I = \int_{t_0}^{t_f} dt = t_f - t_0 \tag{36}$$

under the constraint $U_{i\,\min} \le u_i(t) \le U_{i\,\max}$, $i = 1, 2, \ldots, m$.

With the normalization $\psi_0 = -1$, consistent with (34), the Hamiltonian for this problem is

$$H(\psi(t), x(t), u(t)) = -1 + \psi^T(t)a(x(t)) + \psi^T(t)B(x(t))u(t). \tag{37}$$

The task is therefore to find the maximum of $\psi^T(t)B(x(t))u(t)$, the only term of the Hamiltonian that depends on $u$.

Decompose $B(x(t))$ into columns $b_i(x(t))$:

$$\psi^T(t)B(x(t))u(t) = \sum_{i=1}^{m}\psi^T(t)b_i(x(t))u_i(t).$$

If the $u_i$ are independent, it is necessary to find the maximum of each term $\psi^T(t)b_i(x(t))u_i(t)$.

For each $t$ one thus finds the value of $u_i(t)$:

$$
\begin{aligned}
\psi^T(t)b_i(x(t)) > 0 \quad &\Longrightarrow \quad u_i(t) = U_{i\,\max} \\
\psi^T(t)b_i(x(t)) < 0 \quad &\Longrightarrow \quad u_i(t) = U_{i\,\min} \\
\psi^T(t)b_i(x(t)) = 0 \quad &\Longrightarrow \quad u_i(t) \text{ undetermined}
\end{aligned}
$$

If the term $\psi^T(t)b_i(x(t))$ does not vanish identically on any time interval, the system is called normal; otherwise it is called singular. For singular systems, the maximum principle alone does not determine the control on such intervals.

The optimal control is thus a succession of $U_{i\,\min}$ and $U_{i\,\max}$ (hence the name of this type of control: Bang-Bang).

Algorithm for solving the time optimal problem (written for bounds normalized to $-1$ and $+1$): integrate

$$
\begin{aligned}
u_i(t) &= \operatorname{sign}\left(\psi^T(t)b_i(x(t))\right) \\
\dot{x}(t) &= a(x(t)) + B(x(t))u(t) \\
\dot{\psi}^T(t) &= -\psi^T(t)\frac{\partial a(x(t))}{\partial x} - \psi^T(t)\sum_{i=1}^{m}\frac{\partial b_i(x(t))}{\partial x}u_i(t)
\end{aligned}
\tag{38}
$$

The problem thus reduces to finding $\psi(t_0)$ such that the trajectory attains $x(t_f)$. The time required to attain $x(t_f)$ is the minimal time.

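4.2 Time optimal control: linear systems

Consider the system

$$\dot{x}(t) = Ax(t) + Bu(t), \quad -1 \le u_i(t) \le 1. \tag{39}$$

Equations (38) become

$$
\begin{aligned}
u_i(t) &= \operatorname{sign}\left(\psi^T(t)b_i\right) \\
\dot{x}(t) &= Ax(t) + Bu(t) \\
\dot{\psi}(t) &= -A^T\psi(t)
\end{aligned}
\tag{40}
$$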

The system is normal (i.e. there is no time interval $[t_1, t_2]$ where $\psi^Tb_i = 0$) if the system is controllable with respect to every component $u_i$ of the input vector.

To check whether a system is normal, it is sufficient to verify that

$$\operatorname{rank}\begin{bmatrix} b_i & Ab_i & A^2b_i & \cdots & A^{n-1}b_i \end{bmatrix} = n$$

for each control component.

For a linear time-invariant system of order $n$ that is normal and has real poles, each control component switches at most $n - 1$ times.

Example: a system of order 2 $\Rightarrow$ at most a single switching.
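The single switching of a second-order system can be reproduced with equations (40). The sketch below assumes the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with $|u| \le 1$, for which the costate is $\psi_1 = c_1$, $\psi_2(t) = c_2 - c_1 t$ and $\psi^Tb = \psi_2$. The initial costate values are picked by hand for this example; in general they have to be searched for, as described above. All names are illustrative.

```python
def sign(value):
    """Sign function used for the bang-bang law (zero is mapped to +1)."""
    return 1.0 if value >= 0.0 else -1.0


def simulate_bang_bang(x1, x2, c1, c2, t_final, steps=2000):
    """Simulate x1' = x2, x2' = u with u = sign(psi2), psi2(t) = c2 - c1 * t.

    The control is evaluated at the midpoint of each step and held constant,
    so the state update below is exact for the double integrator.
    Returns the final state and the number of control switchings.
    """
    dt = t_final / steps
    switches = 0
    previous_u = None
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        u = sign(c2 - c1 * t_mid)
        if previous_u is not None and u != previous_u:
            switches += 1
        previous_u = u
        x1 += x2 * dt + 0.5 * u * dt * dt  # uses x2 at the start of the step
        x2 += u * dt
    return x1, x2, switches
```

Starting from (1, 0) with c1 = c2 = -1, so that $\psi_2(t) = t - 1$, the call `simulate_bang_bang(1.0, 0.0, -1.0, -1.0, 2.0)` applies u = -1 for one time unit and u = +1 for another, and brings the state to the origin with one switching.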
