Elements of Optimal Control Theory: Pontryagin's Principle


STATEMENT OF THE CONTROL PROBLEM

Given a system of ODEs (x_1, ..., x_n are state variables, u is the control variable)

    dx_1/dt = f_1(x_1, x_2, ..., x_n; u)
    dx_2/dt = f_2(x_1, x_2, ..., x_n; u)
    ...
    dx_n/dt = f_n(x_1, x_2, ..., x_n; u)

with initial data x_j(0) = x_j^0, j = 1, ..., n, the objective is to find an optimal control u = u*(t) and the corresponding path x = x*(t), 0 ≤ t ≤ T, which maximizes the functional

    J(u) = c_1 x_1(T) + c_2 x_2(T) + ... + c_n x_n(T) + ∫_0^T F(x_1(t), ..., x_n(t); u(t)) dt

among all admissible controls u = u(t). Pontryagin's Principle gives a necessary condition for u = u*(t) and the corresponding x = x*(t) to be optimal, i.e. for

    J(u*) = max_u J(u).

METHOD FOR FINDING THE OPTIMAL SOLUTION

Construct the Hamiltonian

    H(λ_1, ..., λ_n, x_1, ..., x_n, u) = λ_1 f_1 + λ_2 f_2 + ... + λ_n f_n + F,

where λ_1, ..., λ_n are the so-called adjoint variables, satisfying the adjoint system

    dλ_j/dt = −∂H/∂x_j,   j = 1, ..., n.

(Note: dx_j/dt = ∂H/∂λ_j, j = 1, ..., n.)

For each fixed time t (0 ≤ t ≤ T), choose u* to be the value of u that maximizes the Hamiltonian H = H(u) among all admissible u's (λ and x are held fixed):

    H(u*) = max_u H(u).

Case 1. Fixed terminal time T, free terminal state x(T). Set terminal conditions for the adjoint system: λ_j(T) = c_j, j = 1, ..., n. Solve the adjoint system backward in time.

Case 2. Time-optimal problem: free terminal time T, fixed terminal state x(T). Here c_j = 0, j = 1, ..., n, and F ≡ −1. There are no terminal conditions for the adjoint system; instead, solve the adjoint system together with the condition H(λ(T), x(T), u(T)) = 0.
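Before the worked examples, the recipe can be exercised on a toy problem (a minimal sketch in Python rather than the Matlab used later in these notes; the problem itself is assumed for illustration and is not from the notes): maximize J(u) = x(T) for ẋ = −x + u, 0 ≤ u ≤ 1, x(0) = 0. Here H = λ(−x + u), the adjoint equation is λ̇ = −∂H/∂x = λ with λ(T) = 1, so λ(t) = e^(t−T) > 0 for all t, and pointwise maximization of H gives u* ≡ 1.

```python
import math

# Toy problem (assumed for illustration): maximize J = x(T) subject to
#   x' = -x + u,  0 <= u <= 1,  x(0) = 0,  T = 1.
# Adjoint: lambda' = -dH/dx = lambda, lambda(T) = 1  =>  lambda(t) = e^(t-T) > 0,
# so maximizing H = lambda * (-x + u) over u in [0, 1] gives u* = 1 for all t.
T, dt = 1.0, 1e-4
x, t = 0.0, 0.0
while t < T - 1e-12:
    lam = math.exp(t - T)         # adjoint, known in closed form here
    u = 1.0 if lam > 0 else 0.0   # pointwise maximization of H (H is linear in u)
    x += dt * (-x + u)            # forward Euler step for the state
    t += dt

# The optimal state is x(t) = 1 - e^(-t), so x(T) = 1 - e^(-1) ≈ 0.632.
print(x)
```

The same two-step pattern (solve the adjoint backward in time, maximize H pointwise in u) reappears in all four examples below.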

Example 1. Landing a Probe on Mars

Problem. Consider the time-optimal problem of landing a probe on Mars. Assume the descent starts close to the surface of Mars and is affected only by the gravitational field and by friction with the air (proportional to the velocity). The lander is equipped with a braking system, which applies a force u in order to slow down the descent. The braking force cannot exceed a given (maximum) value U. Let x(t) be the distance from the landing site. Then Newton's law gives

    m ẍ = −mg − k ẋ + u.

The objective is to find the optimal control u(t) such that the probe lands in the shortest time. For convenience assume m = 1, k = 1, U = 2 and g = 1, so that 0 ≤ u ≤ 2. The system reads

    ẍ = −1 − ẋ + u.

As initial conditions take x(0) = 10 and ẋ(0) = 0.

Solution. This is a standard time-optimal problem (the terminal time T for the return to the origin has to be minimized). It can be formulated as a Pontryagin Maximum Principle problem as follows. Write the second-order ODE for x = x(t) as a system of two equations for the two state variables x_1 = x (position) and x_2 = ẋ (velocity):

    dx_1/dt = x_2
    dx_2/dt = −1 − x_2 + u        (1)

with initial conditions x_1(0) = 10 and x_2(0) = 0. The objective is to minimize T subject to the constraint 0 ≤ u ≤ 2 and the terminal conditions x_1(T) = 0, x_2(T) = 0. We set the Hamiltonian (for the time-optimal problem)

    H(λ_1, λ_2, x_1, x_2; u) = λ_1 x_2 + λ_2(−1 − x_2 + u) − 1,

where λ_1 and λ_2 satisfy the adjoint system

    dλ_1/dt = −∂H/∂x_1 = 0
    dλ_2/dt = −∂H/∂x_2 = −λ_1 + λ_2        (2)

There are no terminal conditions for the adjoint system. Pontryagin's principle asks us to maximize H as a function of u ∈ [0, 2] at each fixed time t. Since H is linear in u, the maximum occurs at one of the endpoints u = 0 or u = 2, hence the control

is bang-bang! More precisely,

    u*(t) = 0 whenever λ_2(t) < 0,
    u*(t) = 2 whenever λ_2(t) > 0.

We now solve the adjoint system. Since λ_1 ≡ c_1 is constant, we obtain

    λ_2(t) = c_1 + c_2 e^t.        (3)

It is clear that, no matter what constants c_1 and c_2 we use, λ_2 can have no more than one sign change. In order to land with zero velocity at time T, one must clearly choose u = 0 during the initial stage and then switch (at a precise time t_s) to u = 2. The remainder of the problem is to find the switching time t_s and the landing time T. To this end, we solve the state system (1) with u(t) ≡ 0 for t ∈ [0, t_s] and u(t) ≡ 2 for t ∈ [t_s, T].

In the time interval t ∈ [0, t_s], u ≡ 0 yields

    dx_1/dt = x_2
    dx_2/dt = −1 − x_2

which has the solution x_1(t) = 11 − t − e^{−t} and x_2(t) = −1 + e^{−t}.

In the time interval t ∈ [t_s, T], u ≡ 2 yields

    dx_1/dt = x_2
    dx_2/dt = 1 − x_2

which has the solution x_1(t) = −1 + t − T + e^{T−t} and x_2(t) = 1 − e^{T−t}.

The matching conditions for x_1 and x_2 at t = t_s yield the system of equations for t_s and T:

    11 − t_s − e^{−t_s} = −1 + t_s − T + e^{T−t_s}
    −1 + e^{−t_s} = 1 − e^{T−t_s}

which can (surprisingly) be solved explicitly: adding the two equations gives T = 2 t_s − 10, and substituting z = e^{t_s−10} into e^{−t_s} + e^{t_s−10} = 2 gives z² − 2z + e^{−10} = 0, whence t_s = 10 + ln(1 + √(1 − e^{−10})). Whether you use a computer or do it by hand, the solution turns out to be t_s ≈ 10.693 and T ≈ 11.386 (quite a short braking period...).

Conclusion: To achieve the optimal solution (landing in the shortest amount of time), the probe is left to fall freely for the first 10.693 seconds, and then the maximum brake is applied for the next 0.693 seconds.
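These values are easy to confirm numerically. A sketch in Python (the original computation could equally be done in Matlab): use the reduction T = 2 t_s − 10 and solve e^{−t_s} + e^{t_s−10} = 2 by bisection, cross-checking against the closed form.

```python
import math

# Matching conditions reduce to: e^(-ts) + e^(ts-10) = 2, with T = 2*ts - 10.
# Substituting z = e^(ts-10) gives z^2 - 2z + e^(-10) = 0, hence the closed form:
ts_closed = 10 + math.log(1 + math.sqrt(1 - math.exp(-10)))

# Cross-check with simple bisection on f(ts) = e^(-ts) + e^(ts-10) - 2.
def f(ts):
    return math.exp(-ts) + math.exp(ts - 10) - 2

lo, hi = 10.0, 11.0            # f(10) < 0 < f(11), so a root is bracketed
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if f(mid) < 0:
        lo = mid
    else:
        hi = mid
ts = 0.5 * (lo + hi)
T = 2 * ts - 10

print(ts, T)   # ts ≈ 10.693, T ≈ 11.386
```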

Example 2. Insects as Optimizers

Problem. Many insects, such as wasps (including hornets and yellowjackets), live in colonies and have an annual life cycle. Their population consists of two castes: workers and reproductives (the latter comprising queens and males). At the end of each summer all members of the colony die out except for the young queens, who may start new colonies in early spring. From an evolutionary perspective, it is clear that a colony, in order to best perpetuate itself, should attempt to program its production of reproductives and workers so as to maximize the number of reproductives at the end of the season; in this way it maximizes the number of colonies established the following year. In reality, of course, this programming is not deduced consciously, but is determined by complex genetic characteristics of the insects. We may hypothesize, however, that those colonies that adopt nearly optimal policies of production will have an advantage over competitors that do not. Thus, it is expected that through continued natural selection, existing colonies should be nearly optimal. We shall formulate and solve a simple version of the insects' optimal control problem to test this hypothesis.

Let w(t) and q(t) denote, respectively, the worker and reproductive population levels in the colony. At any time t, 0 ≤ t ≤ T, in the season the colony can devote a fraction u(t) of its effort to enlarging the worker force and the remaining fraction 1 − u(t) to producing reproductives. Accordingly, we assume that the two populations are governed by the equations

    ẇ = b u w − µ w
    q̇ = c (1 − u) w

These equations assume that only workers gather resources. The positive constants b and c depend on the environment and represent the availability of resources and the efficiency with which these resources are converted into new workers and new reproductives. The per capita mortality rate of workers is µ, and for simplicity the small mortality rate of the reproductives is neglected.
For the colony to be productive during the season it is assumed that b > µ. The problem of the colony is to maximize J = q(T) subject to the constraint 0 ≤ u(t) ≤ 1 and starting from the initial conditions

    w(0) = 1,   q(0) = 0.

(The founding queen is counted as a worker since she, unlike subsequent reproductives, forages to feed the first brood.)

Solution. We apply Pontryagin's maximum principle to this problem. Construct the Hamiltonian

    H(λ_w, λ_q, w, q, u) = λ_w (b u − µ) w + λ_q c (1 − u) w,

where λ_w and λ_q are adjoint variables, satisfying the adjoint system

    λ̇_w = −∂H/∂w = −[(b u − µ) λ_w + c (1 − u) λ_q]        (4)
    λ̇_q = −∂H/∂q = 0        (5)

and the terminal conditions

    λ_w(T) = 0,   λ_q(T) = 1.

It is clear from the second adjoint equation that λ_q(t) ≡ 1 for all t. Note also that the adjoint equation for λ_w cannot be solved directly, since it depends on the unknown function u = u(t). The other necessary conditions must be used in conjunction with the adjoint equations to determine the adjoint variables.
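The analysis below will show that the optimal policy is bang-bang: all effort into workers early in the season, then all effort into reproductives. That conclusion can be previewed numerically. A sketch with assumed parameter values (b = 1, c = 1, µ = 0.2, T = 5, chosen purely for illustration) simulates every single-switch policy "workers until t_s, then reproductives" and compares the best of them with a constant fifty-fifty allocation:

```python
# Assumed illustrative parameters (not from the notes): b = 1, c = 1, mu = 0.2, T = 5.
b, c, mu, T = 1.0, 1.0, 0.2, 5.0

def final_q(switch_time, dt=1e-3):
    """Simulate w' = (b*u - mu)*w, q' = c*(1-u)*w with u = 1 before
    switch_time and u = 0 after; return q(T)."""
    w, q, t = 1.0, 0.0, 0.0
    while t < T - 1e-12:
        u = 1.0 if t < switch_time else 0.0
        w, q = w + dt * (b * u - mu) * w, q + dt * c * (1 - u) * w
        t += dt
    return q

# Best single-switch policy over a grid of switch times in [0, T].
best_q = max(final_q(0.05 * k) for k in range(101))

# Constant mixed policy u = 0.5 for comparison.
def final_q_mixed(dt=1e-3):
    w, q, t = 1.0, 0.0, 0.0
    while t < T - 1e-12:
        w, q = w + dt * (0.5 * b - mu) * w, q + dt * c * 0.5 * w
        t += dt
    return q

print(best_q, final_q_mixed())   # the bang-bang policy wins by a wide margin
```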

Since this Hamiltonian is linear as a function of u (for a fixed time t),

    H = H(u) = [(λ_w b − c) w] u + (c − λ_w µ) w,

and since w > 0, it follows that H is maximized with respect to 0 ≤ u ≤ 1 by either u = 0 or u = 1, depending on whether λ_w b − c is negative or positive, respectively. Using pplane7.m, we plot the solution curves in the two regions where the switch function S(w, λ_w) = λ_w b − c is positive (there u = 1) and negative (there u = 0), respectively. (See figures.)

It is now possible to solve the adjoint equations and determine the optimal u*(t) by moving backward in time from the terminal point T. In view of the known condition on λ_w at t = T, we find

    λ_w(T) b − c = −c < 0,

and hence the condition for maximization of the Hamiltonian yields u*(T) = 0. Therefore near the terminal time T the first adjoint equation becomes

    λ̇_w(t) = µ λ_w(t) − c,   λ_w(T) = 0,

which has the solution

    λ_w(t) = (c/µ)(1 − e^{µ(t−T)}),   t ∈ [t_s, T].

Viewed backward in time, λ_w(t) increases from its terminal value of 0. When it reaches the point t_s < T where λ_w(t_s) = c/b, the value of u switches from 0 to 1. [The point t_s is found to be t_s = T + (1/µ) ln(1 − µ/b).] Past that point, i.e. for t < t_s, the first adjoint equation becomes

    λ̇_w(t) = −(b − µ) λ_w(t),

which, in view of the assumption that b > µ, implies that, moving backward in time, λ_w(t) continues to increase. Thus there is no additional switch in u.

Exercise: Show that the optimal number of reproductives is

    q*(T) = (c/b) e^{(b−µ) t_s} = (c/b) e^{(b−µ)[T + (1/µ) ln(1 − µ/b)]}.

Conclusion: In terms of the original colony problem, the optimal solution is for the colony to produce only workers for the first part of the season (u*(t) = 1, t ∈ [0, t_s]) and then, beyond a critical time, to produce only reproductives (u*(t) = 0, t ∈ [t_s, T]). Social insect colonies do in fact closely follow this policy, and experimental evidence indicates that they adopt a switch time that is nearly optimal for their natural environment.

Example 3. The Ferry Problem

Problem. Consider the problem of navigating a ferry across a river in the presence of a strong current, so as to get to a specified point on the other side in minimum time. We assume that the magnitude of the boat's speed with respect to the water is a constant S. The downstream current at any point depends only on the distance from the bank. The equations of the boat's motion are

    ẋ_1 = S cos θ + v(x_2)        (6)
    ẋ_2 = S sin θ,        (7)

where x_1 is the downstream position along the river, x_2 is the distance from the origin bank, v(x_2) is the downstream current, and θ is the heading angle of the boat. The heading angle is the control, which may vary along the path.

Solution. This is a time-optimal problem with terminal constraints, since both initial and final values of x_1 and x_2 are specified. The objective function is (negative) total time, so we choose F ≡ −1.

The adjoint equations are found to be

    λ̇_1 = 0        (8)
    λ̇_2 = −v′(x_2) λ_1        (9)

There are no terminal conditions on the adjoint variables. The Hamiltonian is

    H = λ_1 S cos θ + λ_1 v(x_2) + λ_2 S sin θ − 1.        (10)

The condition for maximization yields

    0 = ∂H/∂θ = −λ_1 S sin θ + λ_2 S cos θ,

and hence

    tan θ = λ_2 / λ_1.

Using this relation, we rewrite (10) (along an optimal path) as

    H = λ_1 ( S / cos θ + v(x_2) ) − 1.        (11)

Next, differentiating (10) with respect to t, we observe

    dH/dt = λ̇_1 S cos θ − λ_1 S sin θ · θ̇ + λ̇_1 v(x_2) + λ_1 v′(x_2) ẋ_2 + λ̇_2 S sin θ + λ_2 S cos θ · θ̇
          = λ̇_1 (S cos θ + v(x_2)) + θ̇ (−λ_1 S sin θ + λ_2 S cos θ) + λ_1 v′(x_2) ẋ_2 + λ̇_2 S sin θ
          = 0

(where we used (8), (9), and the maximization condition). Thus, along an optimal path, H is constant in time:

    H ≡ const.        (12)

Using (11), (12), and the fact that λ_1 is constant (λ_1 ≠ 0), we obtain

    S / cos θ + v(x_2) = A

for some constant A. Once A is specified, this determines θ as a function of x_2:

    cos θ = S / (A − v(x_2)).        (13)

Back in the state space, (6) and (7) become

    ẋ_1 = S² / (A − v(x_2)) + v(x_2)        (14)
    ẋ_2 = S √( 1 − S² / (A − v(x_2))² ).        (15)

The constant A is chosen to obtain the proper landing point. A special case is when v(x_2) = v_0 is independent of x_2; then so is θ, and the optimal paths are straight lines.
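The constancy of H just derived can be checked numerically. A Python sketch with an assumed current profile v(x₂) = x₂ and assumed data S = 1, λ₁ = 1, λ₂(0) = 1 (chosen for illustration only): integrate the state and adjoint equations with θ chosen from tan θ = λ₂/λ₁ at every step, and monitor H.

```python
import math

# Assumed illustrative data: S = 1, current v(x2) = x2 (so v'(x2) = 1),
# lambda1 = 1 (constant, by the adjoint equation), lambda2(0) = 1.
S = 1.0
lam1 = 1.0

def v(x2):  return x2
def vp(x2): return 1.0   # v'(x2)

def hamiltonian(x2, lam2):
    th = math.atan2(lam2, lam1)   # maximization condition: tan(theta) = lam2/lam1
    return lam1 * (S * math.cos(th) + v(x2)) + lam2 * S * math.sin(th) - 1.0

x1, x2, lam2 = 0.0, 0.0, 1.0
dt, t = 1e-5, 0.0
H0 = hamiltonian(x2, lam2)
while t < 1.0 - 1e-12:
    th = math.atan2(lam2, lam1)
    x1   += dt * (S * math.cos(th) + v(x2))   # state equation (6)
    x2   += dt * S * math.sin(th)             # state equation (7)
    lam2 += dt * (-vp(x2) * lam1)             # adjoint equation (9)
    t += dt

drift = abs(hamiltonian(x2, lam2) - H0)
print(H0, drift)   # H = sqrt(2) - 1 ≈ 0.414 at t = 0; the drift stays tiny
```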

Example 4. Harvesting for Profit

Problem. Consider the problem of determining the optimal plan for harvesting a crop of fish in a large lake. Let x = x(t) denote the number of fish in the lake, and let u = u(t) denote the intensity of harvesting activity. Assume that in the absence of harvesting x = x(t) grows according to the logistic law

    ẋ = r x (1 − x/K),

where r > 0 is the intrinsic growth rate and K is the maximum sustainable fish population. The rate at which fish are caught when there is harvesting at intensity u is h = q u x for some constant q > 0 (the catchability). (Thus the rate increases as the intensity increases or as there are more fish to be harvested.) There is a fixed unit price p for the harvested crop and a fixed cost c per unit of harvesting intensity. The problem is to maximize the total profit over the period 0 ≤ t ≤ T, subject to the constraint 0 ≤ u ≤ E, where E is the maximum harvesting intensity allowed (by the WeCARE environmental agency). The system is

    ẋ(t) = r x(t) (1 − x(t)/K) − h(t),   x(0) = x_0,

and the objective is to maximize

    J = ∫_0^T [ p h(t) − c u(t) ] dt.

[The objective function represents the integral of the profit rate, that is, the total profit.]

1. Write the necessary conditions for an optimal solution, using the maximum principle.
2. Describe the optimal control u = u*(t) and the resulting population dynamics x = x*(t).
3. Find the maximum value of J for this optimal plan.

[You may assume (at some point) the following numerical values: intrinsic growth rate r = 0.5, maximum sustainable population K = 250,000, catchability q = 10⁻⁵, maximum level of intensity E = 50,000 (in boat-days), selling price per fish p = 1 (in dollars), unit cost c = 1 (in dollars), initial population x_0 = 150,000 (fish), and length of time T = 1 (in years).]

Solution. This is an optimal control problem with fixed terminal time, hence the Pontryagin maximum principle can be used. We first set up the problem. The objective function is

    J = ∫_0^T [ p q x(t) u(t) − c u(t) ] dt.
The state variable is x = x(t) and satisfies the differential equation

    ẋ = r x (1 − x/K) − q x u

with initial condition x(0) = x_0. The Hamiltonian

    H(λ, x, u) = p q x u − c u + λ ( r x (1 − x/K) − q x u ) = [ q x (p − λ) − c ] u + λ r x (1 − x/K)

is linear in u, hence to maximize it over the interval u ∈ [0, E] we need to use the boundary values, i.e. u = 0 or u = E, depending on the sign of the coefficient of u. More precisely,

    u*(t) = 0 whenever q x(t) (p − λ(t)) − c < 0,
    u*(t) = E whenever q x(t) (p − λ(t)) − c > 0.

The adjoint variable λ satisfies the differential equation

    λ̇ = −∂H/∂x = −( p q u + λ ( r (1 − 2x/K) − q u ) )

and the terminal condition λ(T) = 0.

Based on the computed values of u* (either 0 or E) we get two systems (for x and λ) to consider. For u ≡ 0 we get

    ẋ = r x (1 − x/K)
    λ̇ = −λ r (1 − 2x/K)

For u ≡ E we get

    ẋ = r x (1 − x/K) − q x E
    λ̇ = −( p q E + λ ( r (1 − 2x/K) − q E ) )

We can employ the Matlab code pplane7.m to help decide whether any switching is needed. Here is a phase portrait for each (x, λ) system, together with the zero level set of the switch function S(x, λ) = q x (p − λ) − c, using the numerical values given in the problem: q = 10⁻⁵, p = 1, c = 1. We plot the solution curves only in the appropriate regions of the phase plane, that is, where S(x, λ) < 0 for u ≡ 0 and S(x, λ) > 0 for u ≡ E. The direction of the solution curves is determined by the two systems above: increasing x (left to right) for the first plot (u = 0) and decreasing x (right to left) for the second plot (u = E).

Figure 1. Solution curves (x(t), λ(t)): in the region S(x, λ) < 0 (TOP, u = 0) and in the region S(x, λ) > 0 (BOTTOM, u = E).

It is deduced from these plots that in order to achieve λ(T) = 0 when we start with x(0) = 150,000, we must have u*(T) = E (that is, we finish the year at the maximum level of effort). Hence, u*(t) = E for t ∈ [t_s, T] for some t_s < T. By solving the system (with u = E) backward in time, we can compute the value of the switching time t_s.

It turns out that, for the given values of the parameters x_0, p, c, the switching time is t_s = 0 (see below); in other words, the optimal control is u*(t) = E for all times t ∈ [0, T], meaning that NO SWITCHING is required in the fishing effort (it is always at its maximum).

As a matter of fact, we can solve both systems explicitly. Notice that the numerical values are such that qE = r = 0.5, hence the system for u ≡ E simplifies to

    ẋ = −(r/K) x²
    λ̇ = −p r + (2r/K) x λ.

Hence, when u ≡ E, we can first solve explicitly for x = x* (separation of variables):

    x*(t) = 1 / ( 1/x_0 + (r/K) t ),   t ∈ [0, T].

The equation for λ can also be solved explicitly (it is linear in λ, hence the method of integrating factors applies):

    λ(t) = p (K̃ + r t) + C (K̃ + r t)²,   where K̃ = K/x_0 (= 5/3)

and C is a constant. It is clear that one can choose the constant so that λ(T) = 0: more precisely, choose C = −p (K̃ + r T)^{−1}. One can verify that, in this case, S(x*(t), λ(t)) > 0 for all t ∈ [0, T], hence t_s = 0. All of the above implies that the optimal control and optimal path are

    u*(t) = E,   x*(t) = 1 / ( 1/x_0 + (r/K) t ),   t ∈ [0, T].

The graphs are included below.

(c) The maximum value of the profit, given by the optimal strategy found above, is then

    max_u J(x_0, u) = J(x_0, u*) = ∫_0^T ( p q E x*(t) − c E ) dt = p K ln(1 + r T / K̃) − c E T ≈ 15,591   (using Matlab!)

Conclusion: To achieve maximum profit, the company needs to fish at its maximum capacity for the entire duration of the year: u*(t) = E = 50,000 boat-days, t ∈ [0, T].