Optimal Control, TSRT08: Exercises. This version: October 2015



Preface

Acknowledgment

The exercises are modified problems from different sources: Exercises 2.1, 2.2, 3.5, 4.1, 4.2, 4.3, 4.5, 5.1a) and 5.3 are from [?], Exercises 1.1, 1.2, 1.3 and 1.4 are from [?], and Exercises 8.3, 8.4, 8.6 and 8.8 are based on [?].

Abbreviations

are: algebraic Riccati equation
dare: discrete-time algebraic Riccati equation
dre: discrete-time Riccati equation
dp: dynamic programming
hjbe: Hamilton-Jacobi-Bellman equation
lqc: linear quadratic control
mil: matrix inversion lemma
ode: ordinary differential equation
pde: partial differential equation
pmp: Pontryagin minimization principle
tpbvp: two-point boundary value problem


Exercises. This version: October 2015


1 Discrete Optimization

In this section, we will consider solving optimal control problems on the form

$$\begin{aligned} \underset{u_k}{\text{minimize}}\quad & \phi(x_N) + \sum_{k=0}^{N-1} f_0(k, x_k, u_k) \\ \text{subject to}\quad & x_{k+1} = f(k, x_k, u_k), \quad x_0 \text{ given}, \\ & x_k \in X_k, \quad u_k \in U(k, x_k), \end{aligned}$$

using the dynamic programming (dp) algorithm. This algorithm finds the optimal feedback control $u_k$, for $k = 0, 1, \dots, N-1$, via the backwards recursion

$$J(N, x) = \phi(x), \qquad J(n, x) = \min_{u \in U(n,x)} \left\{ f_0(n, x, u) + J(n+1, f(n, x, u)) \right\}.$$

The optimal cost is $J(0, x_0)$ and the optimal feedback control at stage $n$ is

$$u_n = \arg\min_{u \in U(n,x)} \left\{ f_0(n, x, u) + J(n+1, f(n, x, u)) \right\}.$$

1.1 A certain material is passed through a sequence of two ovens. Denote by

$x_0$: the initial temperature of the material,
$x_k$, $k = 1, 2$: the temperature of the material at the exit of oven $k$,
$u_{k-1}$, $k = 1, 2$: the prevailing temperature in oven $k$.

The ovens are modeled as

$$x_{k+1} = (1-a)x_k + a u_k, \quad k = 0, 1,$$

where $a$ is a known scalar from the interval $(0, 1)$. The objective is to get the final temperature $x_2$ close to a given target $T$, while expending relatively little energy. This is expressed by a cost function of the form

$$r(x_2 - T)^2 + u_0^2 + u_1^2,$$

where $r > 0$ is a given scalar. For simplicity, we assume no constraints on $u_k$.

(a) Formulate the problem as an optimization problem.
(b) Solve the problem using the dp algorithm when $a = 1/2$, $T = 0$ and $r = 1$.
(c) Solve the problem for all feasible $a$, $T$ and $r$.

1.2 Consider the system

$$x_{k+1} = x_k + u_k, \quad k = 0, 1, 2, 3,$$

with the initial state $x_0 = 5$ and the cost function

$$\sum_{k=0}^{3} (x_k^2 + u_k^2).$$

Apply the dp algorithm to the problem when the control constraints are

$$U(k, x_k) = \{u \mid 0 \le x_k + u \le 5,\ u \in \mathbb{Z}\}.$$
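As a complement (not part of the original compendium), the backwards recursion above can be tabulated directly in Matlab for Exercise 1.2; the following is a minimal sketch over the integer grid $x \in \{0, \dots, 5\}$, with all variable names ad hoc:

N = 4;                            % stages k = 0,...,3, terminal stage k = 4
J = zeros(N+1, 6);                % J(k+1, x+1) stores J(k, x) for x = 0..5
U = zeros(N, 6);                  % minimizing u at stage k and state x
for k = N-1:-1:0
    for x = 0:5
        best = Inf;
        for u = -x:(5-x)          % control constraint: 0 <= x + u <= 5, u integer
            c = x^2 + u^2 + J(k+2, x+u+1);
            if c < best, best = c; U(k+1, x+1) = u; end
        end
        J(k+1, x+1) = best;
    end
end
J(1, 6)                           % J(0, 5); should match the value 41 in the solutions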

1.3 Consider the cost function

$$\alpha^N \phi(x_N) + \sum_{k=0}^{N-1} \alpha^k f_0(k, x_k, u_k),$$

where $\alpha$ is a discount factor with $0 < \alpha < 1$. Show that an alternate form of the dp algorithm is given by

$$V(N, x) = \phi(x), \qquad V(n, x) = \min_{u \in U(n,x)} \left\{ f_0(n, x, u) + \alpha V(n+1, f(n, x, u)) \right\},$$

where the state evolution is given as $x_{k+1} = f(k, x_k, u_k)$.

1.4 A decision maker must continually choose between two activities over a time interval $[0, T]$. Choosing activity $i$ at time $t$, where $i = 1, 2$, earns a reward at a rate $g_i(t)$, and every switch between the two activities costs $c > 0$. Thus, for example, the reward for starting with activity 1, switching to 2 at time $t_1$, and switching back to 1 at time $t_2 > t_1$ earns total reward

$$\int_0^{t_1} g_1(t)\,dt + \int_{t_1}^{t_2} g_2(t)\,dt + \int_{t_2}^{T} g_1(t)\,dt - 2c.$$

We want to find a set of switching times that maximizes the total reward. Assume that the function $g_1(t) - g_2(t)$ changes sign a finite number of times in the interval $[0, T]$. Formulate the problem as an optimization problem and write the corresponding dp algorithm.

1.5 Consider the infinite time horizon linear quadratic control (lqc) problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \sum_{k=0}^{\infty} \left( x_k^T Q x_k + u_k^T R u_k \right) \\ \text{subject to}\quad & x_{k+1} = A x_k + B u_k, \quad x_0 \text{ given}, \end{aligned}$$

where $Q$ is a symmetric positive semidefinite matrix, $R$ is a symmetric positive definite matrix, and the pair $(A, B)$ is controllable. Show by using the Bellman equation that the optimal state feedback controller is given by

$$u = \mu(x) = -(B^T P B + R)^{-1} B^T P A x,$$

where $P$ is the positive definite solution to the discrete-time algebraic Riccati equation (dare)

$$P = A^T P A - A^T P B (B^T P B + R)^{-1} B^T P A + Q.$$

1.6 Consider the finite horizon lqc problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & x_N^T Q_N x_N + \sum_{k=0}^{N-1} \left( x_k^T Q x_k + u_k^T R u_k \right) \\ \text{subject to}\quad & x_{k+1} = A x_k + B u_k, \quad x_0 \text{ given}, \end{aligned}$$

where $Q_N$, $Q$ and $R$ are symmetric positive semidefinite matrices.

(a) Apply the discrete version of the Pontryagin minimization principle (pmp) and show that the solution to the two-point boundary value problem (tpbvp)

$$x_{k+1} = A x_k - \tfrac{1}{2} B R^{-1} B^T \lambda_{k+1}, \quad x_0 \text{ given}, \qquad \lambda_k = 2 Q x_k + A^T \lambda_{k+1}, \quad \lambda_N = 2 Q_N x_N,$$

yields the minimizing sequence

$$u_k = -\tfrac{1}{2} R^{-1} B^T \lambda_{k+1}, \quad k = 0, \dots, N-1.$$

(b) Show that the above tpbvp can be solved using the Lagrange multiplier $\lambda_k = 2 S_k x_k$, where $S_k$ is given by the backward recursion

$$S_N = Q_N, \qquad S_k = A^T (S_{k+1}^{-1} + B R^{-1} B^T)^{-1} A + Q.$$

(c) Use the matrix inversion lemma (mil) to show that the above recursion for $S_k$ is equivalent to the discrete-time Riccati equation (dre)

$$S_k = A^T S_{k+1} A - A^T S_{k+1} B (B^T S_{k+1} B + R)^{-1} B^T S_{k+1} A + Q.$$

The mil can be expressed as

$$(A + UBV)^{-1} = A^{-1} - A^{-1} U (B^{-1} + V A^{-1} U)^{-1} V A^{-1}$$

for arbitrary matrices $A$, $B$, $U$ and $V$ such that the required inverses exist.

(d) Use your results to calculate a feedback from the state $x_k$, $u_k = \mu(k, x_k)$.
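The dre of Exercise 1.6c can also be iterated numerically until it reaches the dare of Exercise 1.5. The following Matlab lines are a minimal sketch (an addition to the text; the system matrices are chosen here only as an illustration):

A = [1 1; 0 1]; B = [0; 1];          % an arbitrary controllable example pair
Q = eye(2); R = 1;
S = Q;                               % S_N = Q_N, here chosen as Q
for k = 1:500                        % iterate the dre backwards
    S = A'*S*A - A'*S*B/(B'*S*B + R)*B'*S*A + Q;
end
L = (B'*S*B + R)\(B'*S*A);           % stationary feedback gain, u_k = -L*x_k

After convergence, S satisfies the dare of Exercise 1.5 up to numerical precision, and $-L x_k$ is the stationary version of the feedback asked for in Exercise 1.6d.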

2 Dynamic Programming

In this chapter, we will consider solving optimal control problems on the form

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \phi(x(t_f)) + \int_{t_i}^{t_f} f_0(t, x(t), u(t))\,dt \\ \text{subject to}\quad & \dot{x}(t) = f(t, x(t), u(t)), \quad x(t_i) = x_i, \\ & u(t) \in U, \quad t \in [t_i, t_f], \end{aligned}$$

using dynamic programming via the Hamilton-Jacobi-Bellman equation (hjbe):

1. Define the Hamiltonian as

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u).$$

2. Pointwise minimization of the Hamiltonian yields

$$\mu(t, x, \lambda) \triangleq \arg\min_{u \in U} H(t, x, u, \lambda),$$

and the optimal control is given by

$$u^*(t) = \mu(t, x, V_x(t, x)),$$

where $V(t, x)$ is obtained by solving the hjbe in Step 3.

3. Solve the hjbe with boundary conditions:

$$-V_t = H(t, x, \mu(t, x, V_x), V_x), \qquad V(t_f, x) = \phi(x).$$

2.1 Find the optimal solution to the problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \int_0^{t_f} \left( (x(t) - \cos t)^2 + u^2(t) \right) dt \\ \text{subject to}\quad & \dot{x}(t) = u(t), \quad x(0) = 0, \end{aligned}$$

expressed as a system of odes.

2.2 A producer with production rate $x(t)$ at time $t$ may allocate a portion $u(t)$ of his/her production rate to reinvestments in the factory (thus increasing the production rate) and use the rest $(1 - u(t))$ to store goods in a warehouse. Thus $x(t)$ evolves according to

$$\dot{x}(t) = \alpha u(t) x(t),$$

where $\alpha$ is a given constant. The producer wants to maximize the total amount of goods stored summed with the capacity of the factory at final time.

(a) Formulate the continuous-time optimal control problem.
(b) Solve the problem analytically.

3 PMP: A Special Case

In this chapter, we will start solving optimal control problems on the form

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \phi(x(t_f)) + \int_{t_i}^{t_f} f_0(t, x(t), u(t))\,dt \\ \text{subject to}\quad & \dot{x}(t) = f(t, x(t), u(t)), \quad x(t_i) = x_i, \end{aligned}$$

using a special case of the pmp. Assuming that the final time is fixed and that there are no terminal constraints, the problem may be solved using the following sequence of steps:

1. Define the Hamiltonian as

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u).$$

2. Pointwise minimization of the Hamiltonian yields

$$\mu(t, x, \lambda) \triangleq \arg\min_{u} H(t, x, u, \lambda),$$

and the optimal control candidate is given by

$$u^*(t) = \mu(t, x(t), \lambda(t)).$$

3. Solve the adjoint equations

$$\dot{\lambda}(t) = -\frac{\partial H}{\partial x}(t, x(t), \mu(t, x(t), \lambda(t)), \lambda(t)), \qquad \lambda(t_f) = \frac{\partial \phi}{\partial x}(x^*(t_f)).$$

4. Compare the candidate solutions obtained using pmp.

3.1 Solve the optimal control problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \int_0^{t_f} \left( x(t) + u^2(t) \right) dt \\ \text{subject to}\quad & \dot{x}(t) = x(t) + u(t) + 1, \quad x(0) = 0. \end{aligned}$$

(See also the numerical sketch after Exercise 3.4.)

3.2 Find the extremals of the functionals

(a) $J = \int_0^1 \dot{y}\,dt$,
(b) $J = \int_0^1 y\dot{y}\,dt$,

given that $y(0) = 0$ and $y(1) = 1$.

3.3 Find the extremals of the following functionals

(a) $J = \int_0^1 (y^2 + \dot{y}^2 - 2y\sin t)\,dt$,
(b) $J = \int \dot{y}^2 t^3\,dt$, where $y(0) = 0$,
(c) $J = \int (y^2 + \dot{y}^2 + 2ye^t)\,dt$.

3.4 Among all curves of length $\ell$ in the upper half plane passing through the points $(-a, 0)$ and $(a, 0)$, find the one which encloses the largest area in the interval $[-a, a]$, i.e., solve

$$\begin{aligned} \underset{x(\cdot)}{\text{maximize}}\quad & \int_{-a}^{a} x(t)\,dt \\ \text{subject to}\quad & x(-a) = 0, \quad x(a) = 0, \\ & K(x) = \int_{-a}^{a} \sqrt{1 + \dot{x}(t)^2}\,dt = \ell. \end{aligned}$$
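The tpbvp of Exercise 3.1 can also be solved numerically. Below is a minimal Matlab sketch (an addition to the text, with the assumed choice $t_f = 1$) using bvp4c; the result can be compared with the analytic adjoint $\lambda(t) = e^{t_f - t} - 1$ derived in the solutions:

tf = 1;                                       % assumed final time
odefun = @(t, z) [z(1) - z(2)/2 + 1;          % xdot = x + u + 1 with u = -lambda/2
                  -1 - z(2)];                 % adjoint: lambdadot = -(1 + lambda)
bcfun = @(za, zb) [za(1);                     % x(0) = 0
                   zb(2)];                    % lambda(t_f) = 0
sol = bvp4c(odefun, bcfun, bvpinit(linspace(0, tf, 10), [0; 0]));
max(abs(sol.y(2,:) - (exp(tf - sol.x) - 1)))  % discrepancy; should be close to zero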

3.5 From the point $(0, 0)$ on the bank of a wide river, a boat starts with relative speed to the water equal to $\nu$. The stream of the river becomes faster as it departs from the bank, and the speed is $g(y)$ parallel to the bank, see Figure 3.5a.

Figure 3.5a. The so called Zermelo problem in Exercise 3.5.

The movement of the boat is described by

$$\begin{aligned} \dot{x}(t) &= \nu \cos(\varphi(t)) + g(y(t)), \\ \dot{y}(t) &= \nu \sin(\varphi(t)), \end{aligned}$$

where $\varphi$ is the angle between the boat direction and the bank. We want to determine the movement angle $\varphi(t)$ so that $x(T)$ is maximized, where $T$ is a fixed transfer time. Show that the optimal control must satisfy the relation

$$\frac{1}{\cos(\varphi^*(t))} + \frac{g(y(t))}{\nu} = \text{constant}.$$

In addition, determine the direction of the boat at the final time.

3.6 Consider the electromechanical system in Figure 3.6a. The electrical subsystem contains a voltage source $u(t)$ which passes a current $i(t)$ through a resistor $R$ and a solenoid of inductance $l(z)$, where $z$ is the position of the solenoid core. The mechanical subsystem is the core of mass $m$, which is constrained to move horizontally and is restrained by a linear spring and a damper.

Figure 3.6a. Electromechanical system.

(a) Use the following principle:

$$\delta \int_{t_i}^{t_f} L(q(t), u(t))\,dt + \int_{t_i}^{t_f} \tau(q(t))^T \delta q\,dt = 0,$$

where $q$ are generalized coordinates, $\tau$ are generalized forces and $\delta \dot{q} = \delta u$, to derive the generalization of the Euler-Lagrange equations for nonconservative mechanical systems without energy dissipation (no damping):

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = \tau. \qquad (3.1)$$

Also, apply the resulting equation to the electromechanical system above.

(b) For a system with energy dissipation the Euler-Lagrange equations are given by

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = \tau - \frac{\partial F}{\partial \dot{q}}, \qquad (3.2)$$

where $q$ are generalized coordinates, $\tau$ are generalized forces, and $F$ is Rayleigh's dissipation energy defined as

$$F(q, \dot{q}) = \frac{1}{2} \sum_j k_j \dot{q}_j^2, \qquad (3.3)$$

where $k_j$ are damping constants. $2F$ is the rate of energy dissipation due to the damping. Derive the system equations with the damping included.

3.7 Consider the optimal control problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & J = \frac{\gamma}{2} x^2(t_f) + \frac{1}{2}\int_{t_i}^{t_f} u^2(t)\,dt \\ \text{subject to}\quad & \dot{x}(t) = u(t), \quad x(t_i) = x_i. \end{aligned}$$

Derive an optimal feedback policy using pmp.

3.8 Consider the dynamical system

$$\dot{x}(t) = u(t) + w(t),$$

where $x(t)$ is the state, $u(t)$ is the control signal, $w(t)$ is a disturbance signal, and where $x(0) = 0$. In so-called $H_\infty$-control the objective is to find a control signal that solves

$$\min_{u(\cdot)} \max_{w(\cdot)} J(u(\cdot), w(\cdot)),$$

where

$$J(u(\cdot), w(\cdot)) = \int_0^T \left( \rho^2 x^2(t) + u^2(t) - \gamma^2 w^2(t) \right) dt,$$

where $\rho$ is a given parameter and where $\gamma < 1$.

(a) Express the solution as a tpbvp problem using the pmp. Note that you do not need to solve the equations yet.
(b) Solve the tpbvp and express the solution in feedback form, i.e., $u(t) = \mu(t, x(t))$ for some $\mu$.
(c) For what values of $\gamma$ does the control signal exist for all $t \in [0, T]$?

4 PMP: General Results

In this chapter, we will consider solving optimal control problems on the form

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \phi(x(t_f)) + \int_{t_i}^{t_f} f_0(t, x(t), u(t))\,dt \\ \text{subject to}\quad & \dot{x}(t) = f(t, x(t), u(t)), \quad x(t_i) = x_i, \\ & x(t_f) \in S_f(t_f), \quad u(t) \in U, \quad t \in [t_i, t_f], \end{aligned}$$

using the general case of the pmp:

1. Define the Hamiltonian as

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u).$$

2. Pointwise minimization of the Hamiltonian yields

$$\mu(t, x, \lambda) \triangleq \arg\min_{u \in U} H(t, x, u, \lambda),$$

and the optimal control candidate is given by

$$u^*(t) = \mu(t, x(t), \lambda(t)).$$

3. Solve the tpbvp

$$\dot{\lambda}(t) = -\frac{\partial H}{\partial x}(t, x(t), \mu(t, x(t), \lambda(t)), \lambda(t)), \qquad \lambda(t_f) - \frac{\partial \phi}{\partial x}(x^*(t_f)) \perp S_f(t_f),$$

$$\dot{x}(t) = \frac{\partial H}{\partial \lambda}(t, x(t), \mu(t, x(t), \lambda(t)), \lambda(t)), \qquad x(t_i) = x_i, \quad x(t_f) \in S_f(t_f).$$

4. Compare the candidate solutions obtained using pmp.

4.1 A producer with production rate $x(t)$ at time $t$ may allocate a portion $u(t)$ of his/her production rate to reinvestments in a factory (thus increasing the production rate) and use the rest $(1 - u(t))$ to store goods in a warehouse. Thus $x(t)$ evolves according to

$$\dot{x}(t) = \alpha u(t) x(t), \qquad (4.1)$$

where $\alpha$ is a given constant. The producer wants to maximize the total amount of goods stored summed with the capacity of the factory at final time. This gives us the following problem:

$$\begin{aligned} \underset{u(\cdot)}{\text{maximize}}\quad & x(t_f) + \int_0^{t_f} \left( 1 - u(t) \right) x(t)\,dt \\ \text{subject to}\quad & \dot{x}(t) = \alpha u(t) x(t), \quad 0 < \alpha < 1, \\ & x(0) = x_0 > 0, \\ & 0 \le u(t) \le 1, \quad t \in [0, t_f]. \end{aligned}$$

Find an analytical solution to the problem above using the pmp.

4.2 A spacecraft approaching the face of the moon can be described by the following equations

$$\begin{aligned} \dot{x}_1(t) &= x_2(t), \\ \dot{x}_2(t) &= \frac{c\,u(t)}{x_3(t)} - g(1 - k x_1(t)), \\ \dot{x}_3(t) &= -u(t), \end{aligned}$$

where $u(t) \in [0, M]$ for all $t$, and the initial conditions are given by $x(0) = (h, \nu, m)^T$ with the positive constants $c$, $g$, $k$, $M$, $h$, $\nu$ and $m$. The state $x_1$ is the altitude above the surface, $x_2$ the velocity and $x_3$ the mass of the spacecraft. Calculate the structure of the fuel minimizing control that brings the craft to rest on the surface of the moon. The fuel consumption is

$$\int_0^{t_f} u(t)\,dt,$$

and the transfer time $t_f$ is free. Show that the optimal control law is bang-bang with at most two switches.

4.3 A wasp community can in the time interval $[0, t_f]$ be described by

$$\begin{aligned} \dot{x}(t) &= (a u(t) - b) x(t), \quad x(0) = 1, \\ \dot{y}(t) &= c(1 - u(t)) x(t), \quad y(0) = 0, \end{aligned}$$

where $x$ is the number of workers and $y$ the number of queens. The constants $a$, $b$ and $c$ are given positive numbers, where $b$ denotes the death rate of the workers and $a$, $c$ are constants depending on the environment. The control $u(t) \in [0, 1]$, for all $t \in [0, t_f]$, is the proportion of the community resources dedicated to increasing the number of workers. Calculate the control $u$ that maximizes the number of queens at time $t_f$, i.e., the number of new communities in the next season.

4.4 Consider a dc motor with input signal $u(t)$, output signal $y(t)$ and the transfer function

$$G(s) = \frac{1}{s^2}.$$

Introduce the states $x_1(t) \triangleq y(t)$ and $x_2(t) \triangleq \dot{y}(t)$ and compute a control signal that takes the system from an arbitrary initial state to the origin in minimal time, when the control signal is bounded by $|u(t)| \le 1$ for all $t$. (A simulation sketch of the resulting feedback law is given at the end of this chapter.)

4.5 Consider a rocket, modeled as a particle of constant mass $m$ moving in zero gravity empty space, see Figure 4.5a. Let $u$ be the mass flow, assumed to be a known function of time, and let $c$ be the constant thrust velocity. The equations of motion are given by

$$\begin{aligned} \dot{x}_1(t) &= x_3(t), \\ \dot{x}_2(t) &= x_4(t), \\ \dot{x}_3(t) &= \frac{c\,u(t)}{m} \cos v(t), \\ \dot{x}_4(t) &= \frac{c\,u(t)}{m} \sin v(t). \end{aligned}$$

Figure 4.5a. A rocket. (The figure shows the velocities $x_3$, $x_4$, the control angle $v$ and the mass flow $u$.)

(a) Assume that $u(t) > 0$ for all $t$. Show that cost functionals of the class

$$\underset{v(\cdot)}{\text{minimize}} \int_0^{t_f} x_1\,dt \qquad \text{or} \qquad \underset{v(\cdot)}{\text{minimize}}\ \phi(x(t_f))$$

give an optimal control of the form

$$\tan v^*(t) = \frac{a + bt}{c + dt}$$

(the bilinear tangent law).

(b) Assume that the rocket starts at rest at the origin and that we want to drive it to a given height $x_{2f}$ in a given time $t_f$ such that the final velocity in the horizontal direction $x_3(t_f)$ is maximized while $x_{4f} = 0$. Show that the optimal control is reduced to a linear tangent law,

$$\tan v^*(t) = a + bt.$$

(c) Let the rocket in the example above represent a missile whose target is at rest. A natural objective is then to minimize the transfer time $t_f$ from the state $(0, 0, x_{3i}, x_{4i})$ to the state $(x_{1f}, x_{2f}, \text{free}, \text{free})$. Solve the problem under the assumption that $u$ is constant.

(d) To increase the realism we now assume that the motion is under a constant gravitational force $g$. The only difference compared to the system above is that the equation for the fourth state is replaced by

$$\dot{x}_4(t) = \frac{c\,u(t)}{m} \sin v(t) - g.$$

Show that the bilinear tangent law still is optimal for the cost functional

$$\underset{v(\cdot)}{\text{minimize}}\quad \phi(x(t_f)) + \int_0^{t_f} x_1\,dt.$$

(e) In most cases the assumption of constant mass is inappropriate, at least if longer periods of time are considered. To remedy this flaw we add the mass of the rocket as a fifth state. The equations of motion become

$$\begin{aligned} \dot{x}_1(t) &= x_3(t), \\ \dot{x}_2(t) &= x_4(t), \\ \dot{x}_3(t) &= \frac{c\,u(t)}{x_5(t)} \cos v(t), \\ \dot{x}_4(t) &= \frac{c\,u(t)}{x_5(t)} \sin v(t) - g, \\ \dot{x}_5(t) &= -u(t), \end{aligned}$$

where $u(t) \in [0, u_{\max}]$, $t \in [0, t_f]$. Show that the optimal solution to transferring the rocket from a state of given position, velocity and mass to a given altitude $x_{2f}$, using a given amount of fuel, such that the distance $x_1(t_f) - x_1(0)$ is maximized, is

$$v^*(t) = \text{constant}, \qquad u^*(t) \in \{u_{\max}, 0\}.$$
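For Exercise 4.4, the time-optimal control of a double integrator with $|u| \le 1$ is the classical bang-bang feedback with switching curve $x_1 = -\tfrac{1}{2}x_2|x_2|$. The following Matlab sketch (an addition to the text, with an arbitrary initial state) simulates the associated feedback law:

u = @(x) -sign(x(1) + x(2)*abs(x(2))/2);   % bang-bang switching-curve feedback
f = @(t, x) [x(2); u(x)];
[t, x] = ode45(f, [0 6], [2; 1]);          % arbitrary initial state (2, 1)
plot(x(:,1), x(:,2)), xlabel('x_1'), ylabel('x_2')
% Note: the solver chatters along the switching curve; in practice the
% simulation should be stopped once the state reaches a neighborhood of the origin.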

5 Infinite Horizon Optimal Control

In this chapter, we will consider solving optimal control problems on the form

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \int_0^{\infty} f_0(x(t), u(t))\,dt \\ \text{subject to}\quad & \dot{x}(t) = f(x(t), u(t)), \quad x(0) = x_0, \\ & u(t) \in U(x), \end{aligned}$$

using dynamic programming via the infinite horizon Hamilton-Jacobi-Bellman equation (hjbe):

1. Define the Hamiltonian as

$$H(x, u, \lambda) \triangleq f_0(x, u) + \lambda^T f(x, u).$$

2. Pointwise minimization of the Hamiltonian yields

$$\mu(x, \lambda) \triangleq \arg\min_{u \in U} H(x, u, \lambda),$$

and the optimal control is given by

$$u^*(t) = \mu(x, V_x(x)),$$

where $V(x)$ is obtained by solving the infinite horizon hjbe in Step 3.

3. Solve the infinite horizon hjbe:

$$0 = H(x, \mu(x, V_x), V_x).$$

5.1 Consider the following infinite horizon optimal control problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \frac{1}{2}\int_0^{\infty} \left( u^2(t) + x^4(t) \right) dt \\ \text{subject to}\quad & \dot{x}(t) = u(t), \quad x(0) = x_0. \end{aligned}$$

(a) Show that the solution is given by $u^*(t) = -x^2(t)\,\mathrm{sgn}\,x(t)$.
(b) Add the constraint $|u| \le 1$. What is now the optimal cost-to-go function $J^*(x)$ and the optimal feedback?

5.2 Solve the optimal control problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & \frac{1}{2}\int_0^{\infty} \left( u^2(t) + x^2(t) + 2x^4(t) \right) dt \\ \text{subject to}\quad & \dot{x}(t) = x^3(t) + u(t), \quad x(0) = x_0. \end{aligned}$$

5.3 Consider the problem

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & J = \int_0^{\infty} \left( \alpha^2 u^2(t) + \beta^2 y^2(t) \right) dt \\ \text{subject to}\quad & \ddot{y}(t) = u(t), \quad u(t) \in \mathbb{R}, \quad t \ge 0, \end{aligned}$$

where $\alpha > 0$ and $\beta > 0$.

(a) Write the system in a state space form with a state vector $x$.
(b) Write the cost function $J$ on the form $\int_0^\infty \left( u(t)^T R u(t) + x(t)^T Q x(t) \right) dt$.
(c) What is the optimal feedback $u(t)$?
(d) To obtain the optimal feedback, the algebraic Riccati equation (are) must be solved. Compute the optimal feedback by using the Matlab command are for the case when $\alpha = 1$ and $\beta = 2$. In addition, compute the eigenvalues of the system with optimal feedback. (A sketch of this computation is given at the end of this chapter.)

(e) In this case the are can be solved by hand since the dimension of the state space is low. Solve the are by hand or by using a symbolic tool for arbitrary $\alpha$ and $\beta$. Verify that your solution is the same as in the previous case when $\alpha = 1$ and $\beta = 2$.

(f) (Optional) Compute an analytical expression for the eigenvalues of the system with optimal feedback.

5.4 Assume that the lateral equations of motion (i.e., the horizontal dynamics) for a Boeing 747 can be expressed as

$$\dot{x}(t) = A x(t) + B u(t), \qquad (5.1)$$

where $x = (v\ r\ p\ \phi\ \psi\ y)^T$, $u = (\delta_a\ \delta_r)^T$, and

v: lateral velocity
r: yaw rate
p: roll rate
φ: roll angle
ψ: yaw angle
y: lateral distance from a straight reference path
δa: aileron angle
δr: rudder angle

with $A$, $B$ and $C$ given numerically in (5.2) (entries omitted here).

The task is now to design an autopilot that keeps the aircraft flying along a straight line during a long period of time. First an intuitive approach will be used and then an optimal control approach.

(a) A reasonable feedback, by intuition, is to connect the aileron with the roll angle and the rudder with the (negative) yaw angle. Create a matrix $K$ in Matlab that makes these connections such that $u(t) = Kx(t)$. Then compute the poles of the feedback system and explain why this controller will have some problems.

(b) Use the Matlab command lqr to compute the optimal control feedback. (Make sure you understand the similarities and the differences between lqr and are.) Compute the poles for the case when $Q = I_{6\times 6}$ and $R = I_{2\times 2}$.

(c) Explain how different choices of (diagonal) $Q$ and $R$ will affect the performance of the autopilot.
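As a complement, here is a minimal Matlab sketch (an addition to the text) of the computations in Exercises 5.3d and 5.4b, assuming the state vector $x = (y\ \dot{y})^T$ in 5.3:

% Exercise 5.3d with alpha = 1, beta = 2: x = (y, ydot)
A = [0 1; 0 0]; B = [0; 1];
Q = diag([4 0]); R = 1;                 % Q = diag(beta^2, 0), R = alpha^2
P = are(A, B*(R\B'), Q);                % solves A'P + PA - P*B*R^-1*B'*P + Q = 0
K = R\(B'*P);                           % optimal feedback u = -K*x
eig(A - B*K)                            % closed-loop eigenvalues, here -1 +/- i
% Exercise 5.4b: with the numerical A, B of (5.2) loaded, lqr packages the
% same computation: [K, P, e] = lqr(A, B, eye(6), eye(2));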

6 Model Predictive Control

In this exercise session we will focus on computing the explicit solution to the model predictive control (mpc) problem (see [?]) using the multi-parametric toolbox (mpt) for Matlab. The toolbox is added to the Matlab path via the command initcourse('tsrt08'), and the mpt manual can be downloaded from the course homepage.

6.1 Here we are going to consider the second order system

$$y(t) = \frac{2}{s^2 + 3s + 2}\,u(t), \qquad (6.1)$$

sampled with $T_s = 0.1$ s to obtain the discrete-time state-space representation

$$x_{k+1} = \Phi x_k + \Gamma u_k, \qquad y_k = (0\ 1)\,x_k, \qquad (6.2)$$

where the numerical values of $\Phi$ and $\Gamma$ follow from the sampling (entries omitted here).

(a) Use the mpt toolbox to find the mpc law for (6.2) using the 2-norm with $N = 2$, $Q = I_{2\times 2}$, $R = 0.1$, and the input and states satisfying

$$-2 \le u \le 2 \qquad \text{and} \qquad \begin{pmatrix} -1 \\ -1 \end{pmatrix} \le x \le \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

How many regions are there? Present plots of the control partition and the time trajectory for the initial condition $x_0 = (1\ 1)^T$.

(b) Find the number of regions in the control law for $N = 1, 2, \dots, 14$. How does the number of regions seem to depend on $N$, e.g., is the complexity polynomial or exponential? Estimate the order of the complexity via

$$\arg\min_{\alpha, \beta} \left\| \alpha N^{\beta} - n_r \right\|_2^2 \qquad \text{and} \qquad \arg\min_{\gamma, \delta} \left\| \gamma(\delta^N - 1) - n_r \right\|_2^2,$$

where $n_r$ denotes the number of regions. Discuss the result.
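A minimal sketch of 6.1a, assuming the MPT2-style sysStruct/probStruct interface (an addition to the text; the field names and read-out of the number of regions should be checked against the mpt manual, and Phi, Gam stand for the sampled matrices of (6.2)):

sysStruct.A = Phi; sysStruct.B = Gam;      % discrete-time matrices from (6.2)
sysStruct.C = [0 1]; sysStruct.D = 0;
sysStruct.umin = -2;  sysStruct.umax = 2;  % input constraints
sysStruct.xmin = [-1; -1]; sysStruct.xmax = [1; 1];   % state constraints
probStruct.norm = 2; probStruct.N = 2;     % 2-norm cost, horizon N = 2
probStruct.Q = eye(2); probStruct.R = 0.1;
ctrl = mpt_control(sysStruct, probStruct); % explicit mpc law
mpt_plotpartition(ctrl)                    % plot and count the control regions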

7 Numerical Algorithms

All Matlab files that are needed are available to download from the course pages, but some lines of code are missing and the task is to complement the files to be able to solve the given problems. All places where you have to write some code are indicated by three question marks (???).

A basic second-order system will be used in the exercises below, and the model and the problem are presented here. Consider a 1D motion model of a particle with position $z(t)$ and speed $v(t)$. Define the state vector $x = (z\ v)^T$ and the continuous time model

$$\dot{x}(t) = f(x(t), u(t)) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} x(t) + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u(t). \qquad (7.1)$$

The problem is to go from the state $x_i = x(0) = (1\ 1)^T$ to $x(t_f) = (0\ 0)^T$, where $t_f = 2$, but such that the control input energy $\int_0^{t_f} u^2(t)\,dt$ is minimized. Thus, the optimization problem is

$$\begin{aligned} \min_u\quad & J = \int_0^{t_f} f_0(x, u)\,dt \\ \text{s.t.}\quad & \dot{x}(t) = f(x(t), u(t)), \\ & x(0) = x_i, \\ & x(t_f) = x_f, \end{aligned} \qquad (7.2)$$

where $f_0(t) = u^2(t)$ and $f(\cdot)$ is given in (7.1).

7.1 Discretization Method 1. In this approach we will use the discrete time model

$$x[k+1] = f(x[k], u[k]) = \begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix} x[k] + \begin{pmatrix} 0 \\ h \end{pmatrix} u[k] \qquad (7.3)$$

to represent the system, where $t = kh$, $h = t_f/N$ and $x[k] = x(kh)$. The discretized version of the problem (7.2) is then

$$\begin{aligned} \min_{u[n],\,n=0,\dots,N-1}\quad & h \sum_{k=0}^{N-1} u^2[k] \\ \text{s.t.}\quad & x[k+1] = f(x[k], u[k]), \\ & x[0] = x_i, \\ & x[N] = x_f. \end{aligned} \qquad (7.4)$$

Define the optimization parameter vector as

$$y = \left( x[0]^T\ u[0]\ x[1]^T\ u[1]\ \dots\ x[N-1]^T\ u[N-1]\ x[N]^T\ u[N] \right)^T. \qquad (7.5)$$

Note that $u[N]$ is superfluous, but is included to make the presentation and the code more convenient. Furthermore, define

$$F(y) = h \sum_{k=0}^{N-1} u^2[k] \qquad (7.6)$$

and

$$G(y) = \begin{pmatrix} g_1(y) \\ g_2(y) \\ g_3(y) \\ g_4(y) \\ \vdots \\ g_{N+2}(y) \end{pmatrix} = \begin{pmatrix} x[0] - x_i \\ x[N] - x_f \\ x[1] - f(x[0], u[0]) \\ x[2] - f(x[1], u[1]) \\ \vdots \\ x[N] - f(x[N-1], u[N-1]) \end{pmatrix}. \qquad (7.7)$$

Then the optimization problem in (7.4) can be expressed as the constrained problem

$$\min_y\ F(y) \qquad (7.8)$$
$$\text{s.t.}\quad G(y) = 0. \qquad (7.9)$$
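Since $F$ is quadratic and $G$ is linear here, (7.8)-(7.9) is an equality-constrained QP whose KKT conditions form one linear system; this is the point of Exercise 7.1c below. A minimal Matlab sketch (an addition to the text, with all names ad hoc):

tf = 2; N = 40; h = tf/N;
Ad = [1 h; 0 1]; Bd = [0; h];                 % Euler model (7.3)
xi = [1; 1]; xf = [0; 0];
n = 3*(N+1);                                  % y as in (7.5), blocks (x[k]', u[k])
H = 2*h*kron(eye(N+1), diag([0 0 1]));        % Hessian of F(y); the weight on the
                                              % superfluous u[N] just drives it to 0
E = [eye(2) zeros(2,1)]; F = [-Ad -Bd];       % cf. the hint to Exercise 7.1c
Aeq = zeros(2*(N+2), n);
beq = [xi; xf; zeros(2*N, 1)];
Aeq(1:2, 1:3) = E;                            % g1: x[0] = xi
Aeq(3:4, n-2:n) = E;                          % g2: x[N] = xf
for k = 0:N-1                                 % g_{k+3}: x[k+1] = Ad*x[k] + Bd*u[k]
    Aeq(4+2*k+(1:2), 3*k+(1:6)) = [F E];
end
z = [H Aeq'; Aeq zeros(2*(N+2))] \ [zeros(n,1); beq];   % one KKT linear system
y = z(1:n);                                   % stacked optimal states and controls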

(a) Show that the discrete time model in (7.3) can be obtained from (7.1) by using the Euler approximation

$$x[k+1] = x[k] + h f(x[k], u[k]) \qquad (7.10)$$

and complement the file secordersyseqdisc.m with this model.

(b) Complement the file secordersyscostcon.m with a discrete time version of the cost function.

(c) Note that the constraints are linear for this system model and this is very important to exploit in the optimization routine. In fact the problem is a quadratic programming problem, and since it only contains equality constraints it can be solved as a linear system of equations. Complete the file maindisclinconsecordersys.m by creating the linear constraint matrix $A_{eq}$ and the vector $B_{eq}$ such that the constraints defined by $G(y) = 0$ are expressed as $A_{eq} y = B_{eq}$. Then run the Matlab script.

(d) Now suppose that the constraints are nonlinear. This case can be handled by passing a function that computes $G(y)$ and its Jacobian. Let the initial and terminal state constraints, $g_1(y) = 0$ and $g_2(y) = 0$, be handled as above and complete the function secordersysnonlcon.m by computing

$$c_{eq} = \left( g_3(y)\ g_4(y)\ \dots\ g_{N+2}(y) \right) \qquad (7.11)$$

and the Jacobian

$$\texttt{ceqjac} = \left[ \nabla_y c_{eq} \right]^T = \begin{pmatrix} \frac{\partial g_3}{\partial x[0]} & \frac{\partial g_3}{\partial u[0]} & \frac{\partial g_3}{\partial x[1]} & \frac{\partial g_3}{\partial u[1]} & \dots \\ \frac{\partial g_4}{\partial x[0]} & \frac{\partial g_4}{\partial u[0]} & \frac{\partial g_4}{\partial x[1]} & \frac{\partial g_4}{\partial u[1]} & \dots \\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial g_{N+2}}{\partial x[0]} & \frac{\partial g_{N+2}}{\partial u[0]} & \frac{\partial g_{N+2}}{\partial x[1]} & \frac{\partial g_{N+2}}{\partial u[1]} & \dots \end{pmatrix}^T \qquad (7.12)$$

Each row in (7.12) is computed in one loop of the for-statement in secordersysnonlcon.m. Note that the length of the state vector is 2. Also note that Matlab wants the Jacobian to be transposed, but that operation is already added last in the file. Finally, run the Matlab script maindiscnonlinconsecordersys.m to solve the problem again.

7.2 Discretization Method 2. In this exercise we will also use the discrete time model in (7.3). However, the terminal constraints are removed by including them in the objective function: if the terminal constraints are not fulfilled they penalize the objective function. Now the optimization problem can be defined as

$$\begin{aligned} \min_{u[n],\,n=0,\dots,N-1}\quad & c\,\|x[N] - x_f\|^2 + h \sum_{k=0}^{N-1} u^2[k] \\ \text{s.t.}\quad & x[k+1] = f(x[k], u[k]), \\ & x[0] = x_i, \end{aligned} \qquad (7.13)$$

where $f(\cdot)$ is given in (7.3), $N = t_f/h$ and $c$ is a predefined constant. This problem can be rewritten as an unconstrained nonlinear problem

$$\min_{u[n],\,n=0,\dots,N-1} J = F(x_i, x_f, h, u[0], u[1], \dots, u[N-1]), \qquad (7.14)$$

where $F(\cdot)$ contains the system model recursion (7.3). The optimization problem will be solved by using the Matlab function fminunc.

(a) Write down an explicit algorithm for the computation of $F(\cdot)$.
(b) Complement the file secordersyscostunc.m with the cost function $F(\cdot)$. Solve the problem by running the script maindiscuncsecordersys.m.
(c) Compare methods 1 and 2. What are the advantages and disadvantages? Which method handles constraints best? Suppose the task was also to implement the optimization algorithm itself; which method would be easiest to implement an algorithm for (assuming that the state dynamics is nonlinear)?
(d) (Optional) Implement an unconstrained gradient method that can replace the fminunc function in maindiscuncsecordersys.m.
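A minimal sketch of the approach in Exercise 7.2 (an addition to the text; file and variable names are ad hoc, and the recursion (7.3) eliminates the states):

tf = 2; N = 40; h = tf/N; c = 100;            % penalty constant c is a tuning choice
Ad = [1 h; 0 1]; Bd = [0; h];
xi = [1; 1]; xf = [0; 0];
u = fminunc(@(u) penaltycost(u, Ad, Bd, xi, xf, h, c), zeros(N, 1));

function J = penaltycost(u, Ad, Bd, xi, xf, h, c)   % F in (7.14)
x = xi;
for k = 1:numel(u)                            % forward recursion (7.3)
    x = Ad*x + Bd*u(k);
end
J = c*norm(x - xf)^2 + h*sum(u.^2);           % objective (7.13)
end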

7.3 Gradient Method. As above, the terminal constraint is added as a penalty in the objective function, in order to introduce the pmp gradient method here in a more straightforward way. Thus, the optimization problem is

$$\begin{aligned} \min_u\quad & J = c\,\|x(t_f) - x_f\|^2 + \int_0^{t_f} u^2(t)\,dt \qquad & (7.15) \\ \text{s.t.}\quad & \dot{x}(t) = f(x(t), u(t)), \qquad & (7.16) \\ & x(0) = x_i. \qquad & (7.17) \end{aligned}$$

(a) Write down the Hamiltonian and show that the Hamiltonian partial derivatives w.r.t. $x$ and $u$ are

$$\frac{\partial H}{\partial x} = \begin{pmatrix} 0 \\ \lambda_1 \end{pmatrix}, \qquad \frac{\partial H}{\partial u} = 2u(t) + \lambda_2, \qquad (7.18)$$

respectively.

(b) What is the adjoint equation and its terminal constraint?

(c) Complement the file secordersyseq.m with the system model, the file secordersysadjointeq.m with the adjoint equations, the file secordersysfinallambda.m with the terminal values of $\lambda$, and the file secordersysgradient.m with the control signal gradient. Finally, complement the script maingradientsecordersys.m and solve the problem. (A sketch of such a gradient loop is given at the end of this chapter.)

(d) Try some different values of the penalty constant $c$. What happens if $c$ is small? What happens if $c$ is large?

7.4 Shooting Method. Shooting methods are based on successive improvements of the unspecified terminal conditions of the two-point boundary value problem. In this case we can use the original problem formulation in (7.2). Complement the file secordersyseqandadjointeq.m with the combined system and adjoint equations, and the file theta.m with the final constraints. The main script is mainshootingsecordersys.m.

7.5 Reflections.

(a) What are the advantages and disadvantages of discretization methods?
(b) Discuss the advantages and disadvantages of the problem formulation in 7.1 compared to the formulation in 7.2.
(c) What are the advantages and disadvantages of gradient methods?
(d) Compare the methods in 7.1, 7.2 and 7.3 in terms of accuracy and complexity/speed. Also compare the results for the different algorithms used by fmincon in Exercise 7.1. Can all algorithms handle the optimization problem?
(e) Assume that there are constraints on the control signal and you can use either the discretization method in 7.1 or the gradient method in 7.3. Which method would you use?
(f) What are the advantages and disadvantages of shooting methods?
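To make the gradient method of Exercise 7.3 concrete, here is a minimal Matlab sketch (an addition to the text): an Euler forward pass for the state, a backward pass for the adjoint, and a fixed-step descent along $H_u = 2u + \lambda_2$. The step size must be small relative to the penalty weight $c$; a line search would be more robust.

tf = 2; N = 100; h = tf/N;
c = 10; gamma = 5e-3;                     % penalty weight and (conservative) step size
xi = [1; 1]; xf = [0; 0];
u = zeros(1, N);
for it = 1:2000
    x = [xi zeros(2, N)];
    for k = 1:N                           % forward pass: xdot = (x2, u)
        x(:,k+1) = x(:,k) + h*[x(2,k); u(k)];
    end
    lam = zeros(2, N+1);
    lam(:,N+1) = 2*c*(x(:,N+1) - xf);     % terminal adjoint from the penalty term
    for k = N:-1:1                        % backward pass: -lamdot = H_x = (0, lam1)'
        lam(:,k) = lam(:,k+1) + h*[0; lam(1,k+1)];
    end
    u = u - gamma*(2*u + lam(2, 1:N));    % descend along H_u, cf. (7.18)
end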

8 The PROPT toolbox

PROPT is a Matlab toolbox for solving optimal control problems. See the PROPT manual [?] for more information. PROPT uses a so-called Gauss pseudospectral method (GPM) to solve optimal control problems. A GPM is a direct method for discretizing a continuous optimal control problem into a nonlinear program (NLP) that can be solved by well-developed numerical algorithms that attempt to satisfy the Karush-Kuhn-Tucker (KKT) conditions associated with the NLP. In pseudospectral methods the state and control trajectories are parameterized using global and orthogonal polynomials, see Chapter 1.5 in the lecture notes. In GPM the points at which the optimal control problem is discretized (the collocation points) are the Legendre-Gauss (LG) points. See [?] for details.

PROPT initialization. PROPT is available in the ISY computer rooms with Linux, e.g. Egypten. Follow these steps to initialize PROPT:

1. In a terminal window, run module add matlab/tomlab
2. Start Matlab, e.g. by running matlab & in a terminal window
3. At the Matlab prompt, run initcourse tsrt08
4. At the Matlab prompt, run initpropt
5. Check that the printed information looks OK

8.1 Minimum Curve Length; Example 1. Consider the problem of finding the curve with the minimum length from a point $(0, 0)$ to $(x_f, y_f)$. The solution is of course obvious, but we will use this example as a starting point. The problem can be formulated by using an expression for the length of the curve from $(0, 0)$ to $(x_f, y_f)$:

$$s = \int_0^{x_f} \sqrt{1 + y'(x)^2}\,dx. \qquad (8.1)$$

Note that $x$ is the time variable. The optimal control problem is solved in mincurvelength.m by using PROPT/Matlab.

(a) Derive the expression (8.1) for the length of the curve.
(b) Write the problem on a standard optimal control form. Thus, what are $\phi(\cdot)$, $f(\cdot)$, $f_0(\cdot)$, etc.?
(c) Use Pontryagin's minimum principle to show that the solution is a straight line.
(d) Examine and run the script mincurvelength.m.

8.2 Minimum Curve Length; Example 2. Consider the minimum length curve problem again. The problem can be re-formulated by using a constant speed model where the control signal is the heading angle. This problem is solved in the PROPT/Matlab file mincurvelengthheadingctrl.m.

(a) Examine the PROPT/Matlab file and write down the optimal control problem that is solved in this example, i.e., what are $f$, $f_0$ and $\phi$ in the standard optimal control formulation?
(b) Run the script and compare with the result from the previous exercise.

8.3 The Brachistochrone problem; Example 1. Find the curve between two points, A and B, that is covered in the least time by a body that starts in A with zero speed and is constrained to move along the curve to point B, under the action of gravity and assuming no friction. The Brachistochrone problem (Gr. brachistos - the shortest, chronos - time) was posed by Johann Bernoulli in Acta Eruditorum in 1696 and the history of this problem is indeed very interesting and

involves several of the greatest scientists ever, like Galileo, Pascal, Fermat, Newton, Lagrange, and Euler.

Let the motion of the particle, under the influence of gravity $g$, be defined by a time continuous state space model $\dot{X} = f(X, \theta)$, where the state vector is defined as $X = (x\ y\ v)^T$, where $(x\ y)^T$ is the Cartesian position of the particle in a vertical plane and $v$ is the speed, i.e.,

$$\begin{aligned} \dot{x} &= v \sin(\theta), \\ \dot{y} &= v \cos(\theta). \end{aligned} \qquad (8.2)$$

The motion of the particle is constrained by a path that is defined by the angle $\theta(t)$, see Figure 8.3a.

Figure 8.3a. The brachistochrone problem.

(a) Give an explicit expression for $f(X, \theta)$ (only the expression for $\dot{v}$ is missing).
(b) Define the Brachistochrone problem as an optimal control problem based on this state space model. Assume that the initial position of the particle is in the origin and the initial speed is zero. The final position of the particle is $(x(t_f)\ y(t_f))^T = (x_f\ y_f)^T = (1\ 3)^T$.
(c) Modify the script mincurvelengthheadingctrl.m of the minimum curve length example 8.2 above and solve this optimal control problem with PROPT.

8.4 The Brachistochrone problem; Example 2. The time for a particle to travel on a curve between the points $p_0 = (0, 0)$ and $p_f = (x_f, y_f)$ is

$$t_f = \int_{p_0}^{p_f} \frac{1}{v}\,ds, \qquad (8.3)$$

where $ds$ is an element of arc length and $v$ is the speed.

(a) Show that the travel time $t_f$ is

$$t_f = \int_0^{x_f} \sqrt{\frac{1 + y'(x)^2}{2 g y(x)}}\,dx \qquad (8.4)$$

in the Brachistochrone problem, where the speed of the particle is due to gravity and its initial speed is zero.

(b) Define the Brachistochrone problem as an optimal control problem based on the expression in (8.4). Note that the time variable is eliminated and that $y$ is a function of $x$ now.

(c) Use the pmp to show (or derive) that the optimal trajectory is a cycloid with the solution

$$y'(x) = \sqrt{\frac{C - y(x)}{y(x)}}, \qquad (8.5)$$

$$x = \frac{C}{2}\left( \phi - \sin(\phi) \right), \qquad y = \frac{C}{2}\left( 1 - \cos(\phi) \right). \qquad (8.6)$$

(d) Try to solve this optimal control problem with PROPT/Matlab, by modifying the script mincurvelength.m from the minimum curve length example. Compare the result with the solution in Example 1 above. Try to explain why we have problems here.

8.5 The Brachistochrone problem; Example 3. To avoid the problem when solving the previous problem in PROPT/Matlab we can use the results in Exercise 8.4c. The cycloid equation (8.5) can be rewritten as

$$y(x)\left( y'(x)^2 + 1 \right) = C. \qquad (8.7)$$

PROPT can now be used to solve the Brachistochrone problem by defining a feasibility problem

$$\begin{aligned} \min\quad & C \\ \text{s.t.}\quad & y(x)\left( y'(x)^2 + 1 \right) = C, \\ & y(0) = 0, \quad y(x_f) = y_f. \end{aligned} \qquad (8.8)$$

Solve this feasibility problem with PROPT/Matlab. Note that the system dynamics in Example 3 (and 4) are not modeled on the standard form as an ode. Instead, differential equations on the form $F(\dot{x}, x) = 0$ are called differential algebraic equations (DAE), and this is a more general form of system of differential equations.

8.6 The Brachistochrone problem; Example 4. An alternative formulation of the Brachistochrone problem can be obtained by considering the law of conservation of energy (which is related to the principle of least action, see Example 17 in the lecture notes). Consider the position $(x\ y)^T$ of the particle and its velocity $(\dot{x}\ \dot{y})^T$.

(a) Write the kinetic energy $E_k$ and the potential energy $E_p$ as functions of $x$, $y$, $\dot{x}$, $\dot{y}$, the mass $m$, and the gravity constant $g$.
(b) Define the Brachistochrone problem as an optimal control problem based on the law of conservation of energy.
(c) Solve this optimal control problem with PROPT. Assume that the mass of the particle is $m = 1$.

8.7 The Brachistochrone problem; Reflections.

(a) Is the problem in Exercise 8.4 due to the problem formulation or the solution method or both?
(b) Discuss why not only the solution method (i.e. the optimization algorithm) is important, but also the problem formulation.
(c) Compare the different approaches to the Brachistochrone problem in Exercises 8.3-8.6. Try to explain the advantages and disadvantages of the different formulations of the Brachistochrone problem. Which approach can handle additional constraints best?

8.8 The Zermelo problem. Consider a special case of Exercise 3.5 where $g(y) = y$. Thus, from the point $(0, 0)$ on the bank of a wide river, a boat starts with relative speed to the water equal to $\nu$. The stream of the river becomes faster as it departs from the bank, and the speed is $g(y) = y$ parallel to the bank. The movement of the boat is described by

$$\begin{aligned} \dot{x}(t) &= \nu \cos(\varphi(t)) + y(t), \\ \dot{y}(t) &= \nu \sin(\varphi(t)), \end{aligned}$$

where $\varphi$ is the angle between the boat direction and the bank. We want to determine the movement angle $\varphi(t)$ so that $x(T)$ is maximized, where $T$ is a fixed transfer time. Use PROPT to solve this optimal control problem by complementing the file mainzermelopropt.m, i.e. replace all question marks (???) with some proper code.

8.9 Second-order system. Solve the problem in (7.2) by using PROPT.

Hints. This version: October 2015


1 Discrete Optimization

1.2 Note that the control constraint set and the system equation give lower and upper bounds on $x_k$. Also note that both $u$ and $x$ are integers, and because of that, you can use tables to examine the different cases, where each row corresponds to one valid value of $x$ and the columns are $x$, $J(k, x)$ and the minimizing $u$.

1.3 Define $V(n, x) := \alpha^{-n} J(n, x)$, where $J$ is the cost-to-go function.

1.4 Let $t_1 < t_2 < \dots < t_{N-1}$ denote the times where $g_1(t) = g_2(t)$. It is never optimal to switch activity at any other times! We can therefore divide the problem into $N - 1$ stages, where we want to determine for each stage whether or not to switch.

1.5 Consider a quadratic cost $V(x) = x^T P x$, where $P$ is a symmetric positive definite matrix.

2 Dynamic Programming

2.1 Try a $V(t, x)$ that is quadratic in $x$.

2.2 b) What is the sign of $x(t)$? In the hjbe, how does $V(t, x)$ seem to depend on $x$?

3 PMP: A Special Case

3.4 Rewrite the equation $K(x) = \ell$ as a differential equation $\dot{z} = f(z, x, \dot{x})$, plus constraints, and then introduce two adjoint variables corresponding to $x$ and $z$, respectively.

3.6 a) The generalized force is $u(t)$.

3.8 a) You can consider $(u\ w)^T$ as a control signal in the pmp framework.

5 Infinite Horizon Optimal Control

5.2 Make the ansatz $V(x) = \alpha x^2 + \beta x^4$.

5.3 c) Use Theorem 1 in the lecture notes.
e) Note that the solution matrix $P$ is symmetric and positive definite, thus $P = P^T > 0$.

6 Model Predictive Control

6.1 Useful commands: mpt_control, mpt_simplify, mpt_plotpartition, mpt_plottimetrajectory and lsqnonlin.

7 Numerical Algorithms

7.1 c) The structure of the matrix $A_{eq}$ is

$$A_{eq} = \begin{pmatrix} E & & & & \\ & & & & E \\ F & E & & & \\ & F & E & & \\ & & \ddots & \ddots & \\ & & & F & E \end{pmatrix}, \qquad (7.1)$$

where the size of $F$ and $E$ is $2 \times 3$.

d) You can use the function checkceqnonlcon to check that the Jacobian of the equality constraints is similar to the numerical Jacobian. You can also compare the Jacobian to the matrix $A_{eq}$ in this example.

8 The PROPT toolbox

8.4 Use the fact that $H(y, u, \lambda) = \text{const}$ in this case (explain why!).


Solutions. This version: October 2015


1 Discrete Optimization

1.1 (a) The discrete optimization problem is given by

$$\begin{aligned} \underset{u_0,\,u_1}{\text{minimize}}\quad & r(x_2 - T)^2 + \sum_{k=0}^{1} u_k^2 \\ \text{subject to}\quad & x_{k+1} = (1-a)x_k + a u_k, \quad k = 0, 1, \\ & x_0 \text{ given}, \quad u_k \in \mathbb{R}, \quad k = 0, 1. \end{aligned}$$

(b) With $a = 1/2$, $T = 0$ and $r = 1$ this can be cast on standard form with $N = 2$, $\phi(x) = x^2$, $f_0(k, x, u) = u^2$, and $f(k, x, u) = \tfrac{1}{2}x + \tfrac{1}{2}u$. The dp algorithm gives us:

Stage $k = N = 2$:

$$J(2, x) = x^2.$$

Stage $k = 1$:

$$J(1, x) = \min_u \{u^2 + J(2, f(1, x, u))\} = \min_u \{u^2 + (\tfrac{1}{2}x + \tfrac{1}{2}u)^2\}. \qquad (1.1)$$

The minimization is done by setting the derivative, w.r.t. $u$, to zero, since the function is strictly convex in $u$. Thus, we have

$$2u + (\tfrac{1}{2}x + \tfrac{1}{2}u) = 0,$$

which gives the control function

$$u_1 = \mu(1, x) = -\tfrac{1}{5}x.$$

Note that we now have computed the optimal control for each possible state $x$. By substituting the optimal $u_1$ into (1.1) we obtain

$$J(1, x) = \tfrac{1}{5}x^2. \qquad (1.2)$$

Stage $k = 0$:

$$J(0, x) = \min_u \{u^2 + J(1, f(0, x, u))\} = \min_u \{u^2 + J(1, \tfrac{1}{2}x + \tfrac{1}{2}u)\}.$$

Substituting $J(1, x)$ by (1.2) and minimizing by setting the derivative, w.r.t. $u$, to zero gives (after some calculations) the optimal control

$$u_0 = \mu(0, x) = -\tfrac{1}{21}x \qquad (1.3)$$

and the optimal cost

$$J(0, x) = \tfrac{1}{21}x^2. \qquad (1.4)$$

(c) This can be cast on standard form with $N = 2$, $\phi(x) = r(x - T)^2$, $f_0(k, x, u) = u^2$, and $f(k, x, u) = (1-a)x + au$. The dp algorithm gives us:

Stage $k = N = 2$:

$$J(2, x) = r(x - T)^2.$$

Stage $k = 1$:

$$J(1, x) = \min_u \{u^2 + J(2, f(1, x, u))\} = \min_u \{u^2 + r((1-a)x + au - T)^2\}. \qquad (1.5)$$

The minimization is done by setting the derivative, w.r.t. $u$, to zero, since the function is strictly convex in $u$. Thus, we have

$$2u + 2ra\left( (1-a)x + au - T \right) = 0,$$

which gives the control function

$$u_1 = \mu(1, x) = \frac{ra\left( T - (1-a)x \right)}{1 + ra^2}.$$
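This stationary point is easy to cross-check symbolically; a minimal sketch (an addition to the text) using the Symbolic Math Toolbox:

syms a T r x u real
g = u^2 + r*((1-a)*x + a*u - T)^2;            % stage-1 expression to minimize
us = solve(diff(g, u) == 0, u);               % unique stationary point (convex in u)
simplify(us - a*r*(T - (1-a)*x)/(1 + r*a^2))  % returns 0, confirming mu(1, x)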

Note that we now have computed the optimal control for each possible state $x$. By substituting the optimal $u_1$ into (1.5) we obtain (after some work)

$$J(1, x) = \frac{r\left( (1-a)x - T \right)^2}{1 + ra^2}. \qquad (1.6)$$

Stage $k = 0$:

$$J(0, x) = \min_u \{u^2 + J(1, f(0, x, u))\} = \min_u \{u^2 + J(1, (1-a)x + au)\}.$$

Substituting $J(1, x)$ by (1.6) and minimizing by setting the derivative, w.r.t. $u$, to zero gives (after some calculations) the optimal control

$$u_0 = \mu(0, x) = \frac{r(1-a)a\left( T - (1-a)^2 x \right)}{1 + ra^2\left( 1 + (1-a)^2 \right)} \qquad (1.7)$$

and the optimal cost

$$J(0, x) = \frac{r\left( (1-a)^2 x - T \right)^2}{1 + ra^2\left( 1 + (1-a)^2 \right)}. \qquad (1.8)$$

1.2 We have the optimization problem on standard form with $N = 4$, $\phi(x) = 0$, $f_0(k, x, u) = x^2 + u^2$, and $f(k, x, u) = x + u$. Note that $0 \le x_k + u_k \le 5 \Leftrightarrow 0 \le x_{k+1} \le 5$. So we know that $0 \le x_k \le 5$ for $k = 0, 1, 2, 3$, if $u_k \in U(k, x)$.

Stage $k = N = 4$:

$$J(4, x) = 0.$$

Stage $k = 3$:

$$J(3, x) = \min_{u \in U(3,x)} \{f_0(3, x, u) + J(4, f(3, x, u))\} = \min_{u \in U(3,x)} \{x^2 + u^2\} = x^2, \qquad (1.9)$$

since $u = 0 \in U(3, x)$ for all $x$ satisfying $0 \le x \le 5$.

Stage $k = 2$:

$$J(2, x) = \min_{u \in U(2,x)} \{f_0(2, x, u) + J(3, f(2, x, u))\} = \min_{u \in U(2,x)} \{x^2 + u^2 + (x + u)^2\} = \tfrac{3}{2}x^2 + \min_{-x \le u \le 5-x} \{2(\tfrac{1}{2}x + u)^2\}. \qquad (1.10)$$

x | J(2, x) | minimizing u
0 | $\min_{0 \le u \le 5}\{2u^2\} = 0$ | $0$
1 | $3/2 + \min_{-1 \le u \le 4}\{2(1/2 + u)^2\} = 2$ | $-1$, $0$
2 | $6 + \min_{-2 \le u \le 3}\{2(1 + u)^2\} = 6$ | $-1$
3 | $27/2 + \min_{-3 \le u \le 2}\{2(3/2 + u)^2\} = 14$ | $-2$, $-1$
4 | $24 + \min_{-4 \le u \le 1}\{2(2 + u)^2\} = 24$ | $-2$
5 | $75/2 + \min_{-5 \le u \le 0}\{2(5/2 + u)^2\} = 38$ | $-3$, $-2$

Stage $k = 1$:

$$J(1, x) = \min_{u \in U(1,x)} \{f_0(1, x, u) + J(2, f(1, x, u))\} = x^2 + \min_{-x \le u \le 5-x} \{u^2 + J(2, x + u)\}. \qquad (1.11)$$

x | J(1, x) | minimizing u
0 | $0 + \min_{0 \le u \le 5}\{u^2 + J(2, 0 + u)\} = 0$ | $0$
1 | $1 + \min_{-1 \le u \le 4}\{u^2 + J(2, 1 + u)\} = 2$ | $-1$
2 | $4 + \min_{-2 \le u \le 3}\{u^2 + J(2, 2 + u)\} = 7$ | $-1$
3 | $9 + \min_{-3 \le u \le 2}\{u^2 + J(2, 3 + u)\} = 15$ | $-2$
4 | $16 + \min_{-4 \le u \le 1}\{u^2 + J(2, 4 + u)\} = 26$ | $-2$
5 | $25 + \min_{-5 \le u \le 0}\{u^2 + J(2, 5 + u)\} = 40$ | $-3$

since the expressions inside the brackets, $u^2 + J(2, x + u)$, are, for the different feasible $x$ and $u$:

x \ u | -5 | -4 | -3 | -2 | -1 |  0 |  1 |  2 |  3 |  4 |  5
0     |  - |  - |  - |  - |  - |  0 |  3 | 10 | 23 | 40 | 63
1     |  - |  - |  - |  - |  1 |  2 |  7 | 18 | 33 | 54 |  -
2     |  - |  - |  - |  4 |  3 |  6 | 15 | 28 | 47 |  - |  -
3     |  - |  - |  9 |  6 |  7 | 14 | 25 | 42 |  - |  - |  -
4     |  - | 16 | 11 | 10 | 15 | 24 | 39 |  - |  - |  - |  -
5     | 25 | 18 | 15 | 18 | 25 | 38 |  - |  - |  - |  - |  -

Stage $k = 0$: Note that $x_0 = 5$!

$$J(0, x) = \min_{u \in U(0,x)} \{f_0(0, x, u) + J(1, f(0, x, u))\} = \{x_0 = 5\} = 25 + \min_{-5 \le u \le 0} \{u^2 + J(1, 5 + u)\} = 41, \qquad (1.12)$$

given by $u = -3$, since the expressions inside the brackets, $u^2 + J(1, 5 + u)$, are, for $x = 5$ and the different feasible $u$:

u                 | -5 | -4 | -3 | -2 | -1 |  0
u² + J(1, 5 + u)  | 25 | 18 | 16 | 19 | 27 | 40

1.3 Apply the general dp algorithm to the given problem:

$$J(N, x) = \alpha^N \Phi(x) \quad\Longrightarrow\quad \underbrace{\alpha^{-N} J(N, x)}_{:=V(N,x)} = \Phi(x),$$

and

$$J(n, x) = \min_{u \in U(n,x)} \left\{ \alpha^n f_0(n, x, u) + J(n+1, f(n, x, u)) \right\}$$

so that, dividing by $\alpha^n$,

$$\underbrace{\alpha^{-n} J(n, x)}_{:=V(n,x)} = \min_{u \in U(n,x)} \left\{ f_0(n, x, u) + \alpha\, \underbrace{\alpha^{-(n+1)} J(n+1, f(n, x, u))}_{:=V(n+1,\,f(n,x,u))} \right\}.$$

In general, defining $V(n, x) := \alpha^{-n} J(n, x)$ yields

$$V(N, x) = \Phi(x), \qquad V(n, x) = \min_{u \in U(n,x)} \left\{ f_0(n, x, u) + \alpha V(n+1, f(n, x, u)) \right\}.$$

1.4 Define

$$x_k = \begin{cases} 0, & \text{if on activity } g_1 \text{ just before time } t_k, \\ 1, & \text{if on activity } g_2 \text{ just before time } t_k, \end{cases} \qquad u_k = \begin{cases} 0, & \text{continue current activity}, \\ 1, & \text{switch between activities}. \end{cases}$$

The state transition is

$$x_{k+1} = f(x_k, u_k) = (x_k + u_k) \bmod 2,$$

i.e., the state stays the same when $u_k = 0$ and toggles when $u_k = 1$, and the profit for stage $k$ is

$$f_0(x, u) = \int_{t_k}^{t_{k+1}} g_{1 + f(x,u)}(t)\,dt - u_k c.$$

The dp algorithm is then

$$J(N, x) = 0, \qquad J(n, x) = \max_u \left\{ f_0(x, u) + J(n+1, (x + u) \bmod 2) \right\}.$$

1.5 The Bellman equation is given by

$$V(x) = \min_u \{f_0(x, u) + V(f(x, u))\}, \qquad (1.13)$$

where

$$f_0(x, u) = x^T Q x + u^T R u, \qquad f(x, u) = A x + B u.$$

Assume that the cost is on the form $V(x) = x^T P x$, $P > 0$. Then the optimal control law is obtained by minimizing (1.13):

$$\begin{aligned} \mu(x) &= \arg\min_u \{f_0(x, u) + V(f(x, u))\} \\ &= \arg\min_u \{x^T Q x + u^T R u + (Ax + Bu)^T P (Ax + Bu)\} \\ &= \arg\min_u \{x^T Q x + u^T R u + x^T A^T P A x + u^T B^T P B u + 2 x^T A^T P B u\}. \end{aligned}$$

The stationary point of the expression inside the brackets is obtained by setting the derivative to zero, thus

$$2Ru + 2B^T P B u + 2 B^T P A x = 0 \quad\Longrightarrow\quad u = -(B^T P B + R)^{-1} B^T P A x,$$

which is a minimum since $R > 0$ and $P > 0$. Now the Bellman equation (1.13) can be expressed as

$$\begin{aligned} x^T P x &= x^T Q x + u^T \underbrace{\left( R u + B^T P B u + B^T P A x \right)}_{=0} + x^T A^T P A x + x^T A^T P B u \\ &= x^T Q x + x^T A^T P A x - x^T A^T P B (B^T P B + R)^{-1} B^T P A x, \end{aligned}$$

which holds for all $x$ if and only if the dare

$$P = A^T P A - A^T P B (B^T P B + R)^{-1} B^T P A + Q \qquad (1.14)$$

holds. Optimal feedback control and a globally convergent closed loop system are guaranteed by Theorem 2, assuming that there is a positive definite solution $P$ to (1.14).

1.6 (a) 1. The Hamiltonian is

$$H(k, x, u, \lambda) = f_0(k, x, u) + \lambda^T f(k, x, u) = x^T Q x + u^T R u + \lambda^T (A x + B u).$$

2. Pointwise minimization yields

$$0 = \frac{\partial H}{\partial u}(k, x_k, u_k, \lambda_{k+1}) = 2 R u_k + B^T \lambda_{k+1} \quad\Longrightarrow\quad u_k = -\tfrac{1}{2} R^{-1} B^T \lambda_{k+1}. \qquad (1.15)$$

3. The adjoint equations are

$$\lambda_k = \frac{\partial H}{\partial x_k}(x_k, u_k, \lambda_{k+1}) = 2 Q x_k + A^T \lambda_{k+1}, \quad k = 1, \dots, N-1, \qquad \lambda_N = \frac{\partial \phi}{\partial x}(x_N) = 2 Q_N x_N. \qquad (1.16)$$

Hence, we obtain

$$x_{k+1} = A x_k - \tfrac{1}{2} B R^{-1} B^T \lambda_{k+1}. \qquad (1.17)$$

Thus, the minimizing control sequence is (1.15), where $\lambda_{k+1}$ is the solution to the tpbvp in (1.16) and (1.17).

(b) Assume that $S_k$ is invertible for all $k$. Using

$$\lambda_k = 2 S_k x_k \quad\Longleftrightarrow\quad x_k = \tfrac{1}{2} S_k^{-1} \lambda_k$$

in (1.17) yields

$$\tfrac{1}{2} S_{k+1}^{-1} \lambda_{k+1} = A x_k - \tfrac{1}{2} B R^{-1} B^T \lambda_{k+1} \quad\Longrightarrow\quad \lambda_{k+1} = 2 \left( S_{k+1}^{-1} + B R^{-1} B^T \right)^{-1} A x_k. \qquad (1.18)$$

Insert this result in the adjoint equation (1.16):

$$\lambda_k = 2 S_k x_k = 2 Q x_k + 2 A^T \left( S_{k+1}^{-1} + B R^{-1} B^T \right)^{-1} A x_k, \qquad S_N x_N = Q_N x_N,$$

and since these equations must hold for all $x_k$ we see that the backwards recursion for $S_k$ is

$$S_N = Q_N, \qquad S_k = A^T \left( S_{k+1}^{-1} + B R^{-1} B^T \right)^{-1} A + Q.$$

(c) Now, the mil yields

$$\begin{aligned} S_k &= A^T \left( S_{k+1}^{-1} + B R^{-1} B^T \right)^{-1} A + Q \\ &= A^T \left( S_{k+1} - S_{k+1} B (R + B^T S_{k+1} B)^{-1} B^T S_{k+1} \right) A + Q \\ &= A^T S_{k+1} A - A^T S_{k+1} B (R + B^T S_{k+1} B)^{-1} B^T S_{k+1} A + Q. \end{aligned}$$

This recursion converges to the dare in (1.14) as the horizon tends to infinity, if $(A, B)$ is controllable and $(A, C)$ is observable, where $Q = C^T C$.

(d) By combining (1.15) and (1.18) we get

$$u_k = -R^{-1} B^T \left( S_{k+1}^{-1} + B R^{-1} B^T \right)^{-1} A x_k. \qquad (1.19)$$

Another matrix identity, also referred to as the mil, is

$$B V (A + U B V)^{-1} = (B^{-1} + V A^{-1} U)^{-1} V A^{-1}.$$

With this version of the mil we get

$$u_k = -(R + B^T S_{k+1} B)^{-1} B^T S_{k+1} A x_k, \qquad (1.20)$$

which converges to the optimal feedback for the infinite time horizon lqc problem as the horizon tends to infinity.

2 Dynamic Programming

2.1 1. The Hamiltonian is given by

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u) = (x - \cos t)^2 + u^2 + \lambda u.$$

2. Pointwise optimization yields

$$\mu(t, x, \lambda) \triangleq \arg\min_{u \in \mathbb{R}} H(t, x, u, \lambda) = \arg\min_{u \in \mathbb{R}} \{(x - \cos t)^2 + u^2 + \lambda u\} = -\frac{\lambda}{2},$$

and the optimal control is

$$u^*(t) = \mu(t, x(t), V_x(t, x(t))) = -\tfrac{1}{2} V_x(t, x(t)),$$

where $V_x(t, x(t))$ is obtained by solving the hjbe.

3. The hjbe is given by

$$-V_t = H(t, x, \mu(t, x, V_x), V_x), \qquad V(t_f, x) = \phi(x),$$

where $V_t \triangleq \partial V/\partial t$ and $V_x \triangleq \partial V/\partial x$. In our case, this is equivalent to

$$V_t + (x - \cos t)^2 - V_x^2/4 = 0, \qquad V(t_f, x) = 0,$$

which is a partial differential equation (pde) in $t$ and $x$. These equations are quite difficult to solve directly and one often needs to utilize the structure of the problem. Since the original cost function

$$f_0(x(t), u(t)) \triangleq (x(t) - \cos t)^2 + u^2(t)$$

is quadratic in $x(t)$, we make the guess

$$V(t, x) \triangleq P(t)x^2 + 2q(t)x + c(t).$$

This yields

$$V_t = \dot{P}(t)x^2 + 2\dot{q}(t)x + \dot{c}(t), \qquad V_x = 2P(t)x + 2q(t).$$

Now, substituting these expressions into the hjbe, we get

$$\underbrace{\dot{P}(t)x^2 + 2\dot{q}(t)x + \dot{c}(t)}_{=V_t} + (x - \cos t)^2 - \underbrace{\left( 2P(t)x + 2q(t) \right)^2}_{=V_x^2}/4 = 0,$$

which can be rewritten as

$$\left( \dot{P}(t) + 1 - P^2(t) \right) x^2 + 2\left( \dot{q}(t) - \cos t - P(t)q(t) \right) x + \dot{c}(t) + \cos^2 t - q^2(t) = 0.$$

Since this equation must hold for all $x$, the coefficients need to be zero, i.e.,

$$\dot{P}(t) = -1 + P^2(t), \qquad \dot{q}(t) = \cos t + P(t)q(t), \qquad \dot{c}(t) = -\cos^2 t + q^2(t),$$

which is the sought system of odes. The boundary conditions are given by

$$P(t_f) = q(t_f) = c(t_f) = 0,$$

which come from $V(t_f, x) = 0$. (A numerical integration sketch of these odes is given after Solution 2.2 below.)

2.2 (a) The problem is a maximization problem and can be formulated as

$$\begin{aligned} \underset{u(\cdot)}{\text{minimize}}\quad & -x(t_f) - \int_0^{t_f} \left( 1 - u(t) \right) x(t)\,dt \\ \text{subject to}\quad & \dot{x}(t) = \alpha u(t) x(t), \quad x(0) = x_0 > 0, \\ & u(t) \in [0, 1], \quad t \ge 0. \end{aligned}$$

(b) 1. The Hamiltonian is given by

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u) = -(1 - u)x + \lambda \alpha u x.$$

2. Pointwise optimization yields

$$\mu(t, x, \lambda) \triangleq \arg\min_{u \in [0,1]} H(t, x, u, \lambda) = \arg\min_{u \in [0,1]} \{-(1-u)x + \lambda \alpha u x\} = \arg\min_{u \in [0,1]} \{(1 + \lambda\alpha)ux\} = \begin{cases} 1, & (1 + \lambda\alpha)x < 0, \\ 0, & (1 + \lambda\alpha)x > 0, \\ \tilde{u}, & (1 + \lambda\alpha)x = 0, \end{cases}$$

where $\tilde{u}$ is arbitrary in $[0, 1]$. Noting that $x(t) > 0$ for all $t$, we find that

$$\mu(t, x, \lambda) = \begin{cases} 1, & \alpha\lambda < -1, \\ 0, & \alpha\lambda > -1, \\ \tilde{u}, & \alpha\lambda = -1. \end{cases} \qquad (2.1)$$

3. The hjbe is given by

$$-V_t = H(t, x, \mu(t, x, V_x), V_x), \qquad V(t_f, x) = \phi(x),$$

where $V_t \triangleq \partial V/\partial t$ and $V_x \triangleq \partial V/\partial x$. In our case, this may be written as

$$V_t - (1 - \mu(t, x, V_x))x + V_x \alpha\, \mu(t, x, V_x)\, x = 0.$$

By collecting $\mu(t, x, V_x)$ this may be written as

$$V_t - x + (1 + \alpha V_x)\,\mu(t, x, V_x)\,x = 0,$$

and inserting (2.1) yields

$$\begin{cases} V_t + \alpha V_x x = 0, & \alpha V_x < -1, \\ V_t - x = 0, & \alpha V_x > -1, \end{cases} \qquad (2.2)$$

with the boundary condition $V(t_f, x) = \phi(x) = -x$. Both the equations above are linear in $x$, which suggests that $V(t, x) = g(t)x$ for some function $g(t)$. Substituting this into (2.2) yields

$$\begin{cases} \dot{g}(t)x + \alpha g(t)x = 0, & g(t) < -1/\alpha, \\ \dot{g}(t)x - x = 0, & g(t) > -1/\alpha. \end{cases}$$

These equations are valid for every $x > 0$ and we can thus cancel the $x$ term, which yields

$$\begin{cases} \dot{g}(t) + \alpha g(t) = 0, & g(t) < -1/\alpha, \\ \dot{g}(t) - 1 = 0, & g(t) > -1/\alpha. \end{cases}$$

The solutions to the equations above are

$$\begin{cases} g_1(t) = c_1 e^{-\alpha(t - t')}, & g(t) < -1/\alpha, \\ g_2(t) = t + c_2, & g(t) > -1/\alpha, \end{cases}$$

for some constants $c_1$, $c_2$ and $t'$. The boundary value is $V(t_f, x) = -x$, so $g(t_f) = -1 > -1/\alpha$ (since $\alpha \in (0, 1)$). This gives $g_2(t) = t + c_2$ in the end. From $g_2(t_f) = -1$, we obtain $c_2 = -1 - t_f$ and

$$g(t) = t - 1 - t_f,$$

in some interval $(t', t_f]$. Here, $t'$ fulfills $g_2(t') = -1/\alpha$, and hence

$$t' = 1 - 1/\alpha + t_f,$$

which is the same time at which $g$ switches. Now, use this as a boundary value for the solution to the other pde, $g_1(t) = c_1 e^{-\alpha(t - t')}$:

$$g_1(t') = -1/\alpha \quad\Longrightarrow\quad c_1 = -1/\alpha.$$

Thus, we have

$$g(t) = \begin{cases} -\frac{1}{\alpha} e^{-\alpha(t - t')}, & t < t', \\ t - 1 - t_f, & t > t', \end{cases}$$

where $t' = 1 - 1/\alpha + t_f$. The optimal cost function is given by

$$V(t, x) = \begin{cases} -\frac{1}{\alpha} e^{-\alpha(t - t')} x, & t < t' \quad (u^* = 1), \\ (t - 1 - t_f)x, & t > t' \quad (u^* = 0), \end{cases}$$

where $t' = 1 - 1/\alpha + t_f$. Thus, the optimal control strategy is

$$u^*(t) = \mu(t, x(t), V_x(t, x(t))) = \begin{cases} 1, & t < t', \\ 0, & t > t', \end{cases}$$

and we are done. Note also that $V_x$ and $V_t$ are continuous at the switch boundary, thus fulfilling the $C^1$ condition on $V(t, x)$ of the verification theorem.
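The odes derived in Solution 2.1 are easy to integrate numerically backwards from the terminal conditions; a minimal Matlab sketch (an addition to the text, with the assumed choice $t_f = 5$):

tf = 5;
rhs = @(t, z) [z(1)^2 - 1;            % Pdot = -1 + P^2
               cos(t) + z(1)*z(2);    % qdot = cos(t) + P*q
               z(2)^2 - cos(t)^2];    % cdot = -cos(t)^2 + q^2
[t, z] = ode45(rhs, [tf 0], [0; 0; 0]);   % P(tf) = q(tf) = c(tf) = 0, backwards in t
plot(t, z), legend('P', 'q', 'c')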

3 PMP: A Special Case

3.1 The Hamiltonian is given by

$$H(t, x, u, \lambda) \triangleq f_0(t, x, u) + \lambda^T f(t, x, u) = x + u^2 + \lambda x + \lambda u + \lambda.$$

Pointwise minimization yields

$$\mu(t, x, \lambda) \triangleq \arg\min_u H(t, x, u, \lambda) = -\tfrac{1}{2}\lambda,$$

and the optimal control can be written as

$$u^*(t) = \mu(t, x(t), \lambda(t)) = -\tfrac{1}{2}\lambda(t).$$

The adjoint equation is given by

$$\dot{\lambda}(t) = -\lambda(t) - 1,$$

which is a first order linear ode and can thus be solved easily. The standard integrating factor method yields

$$\lambda(t) = C e^{-t} - 1,$$

for some constant $C$. The boundary condition is given by

$$\lambda(t_f) = \frac{\partial \phi}{\partial x}(t_f, x(t_f)) = 0.$$

Thus,

$$\lambda(t) = e^{t_f - t} - 1,$$

and the optimal control is found by

$$u^*(t) = -\tfrac{1}{2}\lambda(t) = \frac{1 - e^{t_f - t}}{2}.$$

3.2 (a) Since

$$J = \int_0^1 \dot{y}\,dt = y(1) - y(0) = 1,$$

it holds that all $y(\cdot) \in C^1[0, 1]$ such that $y(0) = 0$ and $y(1) = 1$ are minimal.

(b) Since

$$J = \int_0^1 y\dot{y}\,dt = \frac{1}{2}\int_0^1 \frac{d}{dt}(y^2)\,dt = \frac{1}{2}\left( y(1)^2 - y(0)^2 \right) = \frac{1}{2},$$

it holds that all $y(\cdot) \in C^1[0, 1]$ such that $y(0) = 0$ and $y(1) = 1$ are minimal.

3.3 (a) Define $x = y$, $u = \dot{y}$ and $f_0(t, x, u) = x^2 + u^2 - 2x\sin t$; then the Hamiltonian is given by

$$H(t, x, u, \lambda) = x^2 + u^2 - 2x\sin t + \lambda u.$$

The following equations must hold:

$$\begin{aligned} y(0) &= 0, & (3.1a) \\ 0 &= \frac{\partial H}{\partial u}(t, x, u, \lambda) = 2u + \lambda, & (3.1b) \\ \dot{\lambda} &= -\frac{\partial H}{\partial x}(t, x, u, \lambda) = -2x + 2\sin t, & (3.1c) \\ \lambda(1) &= 0. & (3.1d) \end{aligned}$$

Equation (3.1b) gives $\lambda = -2u = -2\dot{y}$, and with (3.1c) we have

$$\ddot{y} - y = -\sin t,$$

with the solution

$$y(t) = c_1 e^t + c_2 e^{-t} + \tfrac{1}{2}\sin t,$$

where (3.1a) gives the relationship between $c_1$ and $c_2$ as $c_2 = -c_1$, and where (3.1d) gives

$$c_1 = -\frac{e \cos 1}{2(e^2 + 1)}.$$

Consequently, we have

$$y(t) = -\frac{e \cos 1}{2(e^2 + 1)}\left( e^t - e^{-t} \right) + \tfrac{1}{2}\sin t.$$


More information

Quadratic Stability of Dynamical Systems. Raktim Bhattacharya Aerospace Engineering, Texas A&M University

Quadratic Stability of Dynamical Systems. Raktim Bhattacharya Aerospace Engineering, Texas A&M University .. Quadratic Stability of Dynamical Systems Raktim Bhattacharya Aerospace Engineering, Texas A&M University Quadratic Lyapunov Functions Quadratic Stability Dynamical system is quadratically stable if

More information

4F3 - Predictive Control

4F3 - Predictive Control 4F3 Predictive Control - Lecture 3 p 1/21 4F3 - Predictive Control Lecture 3 - Predictive Control with Constraints Jan Maciejowski jmm@engcamacuk 4F3 Predictive Control - Lecture 3 p 2/21 Constraints on

More information

Optimal Control. Quadratic Functions. Single variable quadratic function: Multi-variable quadratic function:

Optimal Control. Quadratic Functions. Single variable quadratic function: Multi-variable quadratic function: Optimal Control Control design based on pole-placement has non unique solutions Best locations for eigenvalues are sometimes difficult to determine Linear Quadratic LQ) Optimal control minimizes a quadratic

More information

CHAPTER 3 THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS. p. 1/73

CHAPTER 3 THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS. p. 1/73 CHAPTER 3 THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS p. 1/73 THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS Mixed Inequality Constraints: Inequality constraints involving control and possibly

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

Chapter 5. Pontryagin s Minimum Principle (Constrained OCP)

Chapter 5. Pontryagin s Minimum Principle (Constrained OCP) Chapter 5 Pontryagin s Minimum Principle (Constrained OCP) 1 Pontryagin s Minimum Principle Plant: (5-1) u () t U PI: (5-2) Boundary condition: The goal is to find Optimal Control. 2 Pontryagin s Minimum

More information

Module 02 CPS Background: Linear Systems Preliminaries

Module 02 CPS Background: Linear Systems Preliminaries Module 02 CPS Background: Linear Systems Preliminaries Ahmad F. Taha EE 5243: Introduction to Cyber-Physical Systems Email: ahmad.taha@utsa.edu Webpage: http://engineering.utsa.edu/ taha/index.html August

More information

Numerical Optimal Control Overview. Moritz Diehl

Numerical Optimal Control Overview. Moritz Diehl Numerical Optimal Control Overview Moritz Diehl Simplified Optimal Control Problem in ODE path constraints h(x, u) 0 initial value x0 states x(t) terminal constraint r(x(t )) 0 controls u(t) 0 t T minimize

More information

A Gauss Lobatto quadrature method for solving optimal control problems

A Gauss Lobatto quadrature method for solving optimal control problems ANZIAM J. 47 (EMAC2005) pp.c101 C115, 2006 C101 A Gauss Lobatto quadrature method for solving optimal control problems P. Williams (Received 29 August 2005; revised 13 July 2006) Abstract This paper proposes

More information

Module 05 Introduction to Optimal Control

Module 05 Introduction to Optimal Control Module 05 Introduction to Optimal Control Ahmad F. Taha EE 5243: Introduction to Cyber-Physical Systems Email: ahmad.taha@utsa.edu Webpage: http://engineering.utsa.edu/ taha/index.html October 8, 2015

More information

1. Find the solution of the following uncontrolled linear system. 2 α 1 1

1. Find the solution of the following uncontrolled linear system. 2 α 1 1 Appendix B Revision Problems 1. Find the solution of the following uncontrolled linear system 0 1 1 ẋ = x, x(0) =. 2 3 1 Class test, August 1998 2. Given the linear system described by 2 α 1 1 ẋ = x +

More information

4F3 - Predictive Control

4F3 - Predictive Control 4F3 Predictive Control - Lecture 2 p 1/23 4F3 - Predictive Control Lecture 2 - Unconstrained Predictive Control Jan Maciejowski jmm@engcamacuk 4F3 Predictive Control - Lecture 2 p 2/23 References Predictive

More information

Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters

Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters Linear-Quadratic-Gaussian (LQG) Controllers and Kalman Filters Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 204 Emo Todorov (UW) AMATH/CSE 579, Winter

More information

Control, Stabilization and Numerics for Partial Differential Equations

Control, Stabilization and Numerics for Partial Differential Equations Paris-Sud, Orsay, December 06 Control, Stabilization and Numerics for Partial Differential Equations Enrique Zuazua Universidad Autónoma 28049 Madrid, Spain enrique.zuazua@uam.es http://www.uam.es/enrique.zuazua

More information

Robotics. Dynamics. Marc Toussaint U Stuttgart

Robotics. Dynamics. Marc Toussaint U Stuttgart Robotics Dynamics 1D point mass, damping & oscillation, PID, dynamics of mechanical systems, Euler-Lagrange equation, Newton-Euler recursion, general robot dynamics, joint space control, reference trajectory

More information

ACM/CMS 107 Linear Analysis & Applications Fall 2016 Assignment 4: Linear ODEs and Control Theory Due: 5th December 2016

ACM/CMS 107 Linear Analysis & Applications Fall 2016 Assignment 4: Linear ODEs and Control Theory Due: 5th December 2016 ACM/CMS 17 Linear Analysis & Applications Fall 216 Assignment 4: Linear ODEs and Control Theory Due: 5th December 216 Introduction Systems of ordinary differential equations (ODEs) can be used to describe

More information

Optimal Control. McGill COMP 765 Oct 3 rd, 2017

Optimal Control. McGill COMP 765 Oct 3 rd, 2017 Optimal Control McGill COMP 765 Oct 3 rd, 2017 Classical Control Quiz Question 1: Can a PID controller be used to balance an inverted pendulum: A) That starts upright? B) That must be swung-up (perhaps

More information

Robot Control Basics CS 685

Robot Control Basics CS 685 Robot Control Basics CS 685 Control basics Use some concepts from control theory to understand and learn how to control robots Control Theory general field studies control and understanding of behavior

More information

6.241 Dynamic Systems and Control

6.241 Dynamic Systems and Control 6.241 Dynamic Systems and Control Lecture 24: H2 Synthesis Emilio Frazzoli Aeronautics and Astronautics Massachusetts Institute of Technology May 4, 2011 E. Frazzoli (MIT) Lecture 24: H 2 Synthesis May

More information

Implicit Functions, Curves and Surfaces

Implicit Functions, Curves and Surfaces Chapter 11 Implicit Functions, Curves and Surfaces 11.1 Implicit Function Theorem Motivation. In many problems, objects or quantities of interest can only be described indirectly or implicitly. It is then

More information

The first order quasi-linear PDEs

The first order quasi-linear PDEs Chapter 2 The first order quasi-linear PDEs The first order quasi-linear PDEs have the following general form: F (x, u, Du) = 0, (2.1) where x = (x 1, x 2,, x 3 ) R n, u = u(x), Du is the gradient of u.

More information

Introduction to Modern Control MT 2016

Introduction to Modern Control MT 2016 CDT Autonomous and Intelligent Machines & Systems Introduction to Modern Control MT 2016 Alessandro Abate Lecture 2 First-order ordinary differential equations (ODE) Solution of a linear ODE Hints to nonlinear

More information

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,

More information

Appendix A Solving Linear Matrix Inequality (LMI) Problems

Appendix A Solving Linear Matrix Inequality (LMI) Problems Appendix A Solving Linear Matrix Inequality (LMI) Problems In this section, we present a brief introduction about linear matrix inequalities which have been used extensively to solve the FDI problems described

More information

Optimal Control Theory

Optimal Control Theory Optimal Control Theory The theory Optimal control theory is a mature mathematical discipline which provides algorithms to solve various control problems The elaborate mathematical machinery behind optimal

More information

Optimal Control Theory - Module 3 - Maximum Principle

Optimal Control Theory - Module 3 - Maximum Principle Optimal Control Theory - Module 3 - Maximum Principle Fall, 215 - University of Notre Dame 7.1 - Statement of Maximum Principle Consider the problem of minimizing J(u, t f ) = f t L(x, u)dt subject to

More information

Outline. 1 Linear Quadratic Problem. 2 Constraints. 3 Dynamic Programming Solution. 4 The Infinite Horizon LQ Problem.

Outline. 1 Linear Quadratic Problem. 2 Constraints. 3 Dynamic Programming Solution. 4 The Infinite Horizon LQ Problem. Model Predictive Control Short Course Regulation James B. Rawlings Michael J. Risbeck Nishith R. Patel Department of Chemical and Biological Engineering Copyright c 217 by James B. Rawlings Outline 1 Linear

More information

Lecture 2 and 3: Controllability of DT-LTI systems

Lecture 2 and 3: Controllability of DT-LTI systems 1 Lecture 2 and 3: Controllability of DT-LTI systems Spring 2013 - EE 194, Advanced Control (Prof Khan) January 23 (Wed) and 28 (Mon), 2013 I LTI SYSTEMS Recall that continuous-time LTI systems can be

More information

Suppose that we have a specific single stage dynamic system governed by the following equation:

Suppose that we have a specific single stage dynamic system governed by the following equation: Dynamic Optimisation Discrete Dynamic Systems A single stage example Suppose that we have a specific single stage dynamic system governed by the following equation: x 1 = ax 0 + bu 0, x 0 = x i (1) where

More information

Optimal Control and Applications

Optimal Control and Applications V. M. Becerra - Session 1-2nd AstroNet-II Summer School Optimal Control and Applications Session 1 2nd AstroNet-II Summer School Victor M. Becerra School of Systems Engineering University of Reading 5-6

More information

u n 2 4 u n 36 u n 1, n 1.

u n 2 4 u n 36 u n 1, n 1. Exercise 1 Let (u n ) be the sequence defined by Set v n = u n 1 x+ u n and f (x) = 4 x. 1. Solve the equations f (x) = 1 and f (x) =. u 0 = 0, n Z +, u n+1 = u n + 4 u n.. Prove that if u n < 1, then

More information

Controlled Diffusions and Hamilton-Jacobi Bellman Equations

Controlled Diffusions and Hamilton-Jacobi Bellman Equations Controlled Diffusions and Hamilton-Jacobi Bellman Equations Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 2014 Emo Todorov (UW) AMATH/CSE 579, Winter

More information

28. Pendulum phase portrait Draw the phase portrait for the pendulum (supported by an inextensible rod)

28. Pendulum phase portrait Draw the phase portrait for the pendulum (supported by an inextensible rod) 28. Pendulum phase portrait Draw the phase portrait for the pendulum (supported by an inextensible rod) θ + ω 2 sin θ = 0. Indicate the stable equilibrium points as well as the unstable equilibrium points.

More information

Convergence of a Gauss Pseudospectral Method for Optimal Control

Convergence of a Gauss Pseudospectral Method for Optimal Control Convergence of a Gauss Pseudospectral Method for Optimal Control Hongyan Hou William W. Hager Anil V. Rao A convergence theory is presented for approximations of continuous-time optimal control problems

More information

Chap. 3. Controlled Systems, Controllability

Chap. 3. Controlled Systems, Controllability Chap. 3. Controlled Systems, Controllability 1. Controllability of Linear Systems 1.1. Kalman s Criterion Consider the linear system ẋ = Ax + Bu where x R n : state vector and u R m : input vector. A :

More information

LMI Methods in Optimal and Robust Control

LMI Methods in Optimal and Robust Control LMI Methods in Optimal and Robust Control Matthew M. Peet Arizona State University Lecture 15: Nonlinear Systems and Lyapunov Functions Overview Our next goal is to extend LMI s and optimization to nonlinear

More information

Automatic Control II Computer exercise 3. LQG Design

Automatic Control II Computer exercise 3. LQG Design Uppsala University Information Technology Systems and Control HN,FS,KN 2000-10 Last revised by HR August 16, 2017 Automatic Control II Computer exercise 3 LQG Design Preparations: Read Chapters 5 and 9

More information

Module 07 Controllability and Controller Design of Dynamical LTI Systems

Module 07 Controllability and Controller Design of Dynamical LTI Systems Module 07 Controllability and Controller Design of Dynamical LTI Systems Ahmad F. Taha EE 5143: Linear Systems and Control Email: ahmad.taha@utsa.edu Webpage: http://engineering.utsa.edu/ataha October

More information

OR MSc Maths Revision Course

OR MSc Maths Revision Course OR MSc Maths Revision Course Tom Byrne School of Mathematics University of Edinburgh t.m.byrne@sms.ed.ac.uk 15 September 2017 General Information Today JCMB Lecture Theatre A, 09:30-12:30 Mathematics revision

More information

Written Examination

Written Examination Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes

More information

Numerical approximation for optimal control problems via MPC and HJB. Giulia Fabrini

Numerical approximation for optimal control problems via MPC and HJB. Giulia Fabrini Numerical approximation for optimal control problems via MPC and HJB Giulia Fabrini Konstanz Women In Mathematics 15 May, 2018 G. Fabrini (University of Konstanz) Numerical approximation for OCP 1 / 33

More information

Newtonian Mechanics. Chapter Classical space-time

Newtonian Mechanics. Chapter Classical space-time Chapter 1 Newtonian Mechanics In these notes classical mechanics will be viewed as a mathematical model for the description of physical systems consisting of a certain (generally finite) number of particles

More information

Numerical Methods for Optimal Control Problems. Part I: Hamilton-Jacobi-Bellman Equations and Pontryagin Minimum Principle

Numerical Methods for Optimal Control Problems. Part I: Hamilton-Jacobi-Bellman Equations and Pontryagin Minimum Principle Numerical Methods for Optimal Control Problems. Part I: Hamilton-Jacobi-Bellman Equations and Pontryagin Minimum Principle Ph.D. course in OPTIMAL CONTROL Emiliano Cristiani (IAC CNR) e.cristiani@iac.cnr.it

More information

ECSE.6440 MIDTERM EXAM Solution Optimal Control. Assigned: February 26, 2004 Due: 12:00 pm, March 4, 2004

ECSE.6440 MIDTERM EXAM Solution Optimal Control. Assigned: February 26, 2004 Due: 12:00 pm, March 4, 2004 ECSE.6440 MIDTERM EXAM Solution Optimal Control Assigned: February 26, 2004 Due: 12:00 pm, March 4, 2004 This is a take home exam. It is essential to SHOW ALL STEPS IN YOUR WORK. In NO circumstance is

More information

A brief introduction to ordinary differential equations

A brief introduction to ordinary differential equations Chapter 1 A brief introduction to ordinary differential equations 1.1 Introduction An ordinary differential equation (ode) is an equation that relates a function of one variable, y(t), with its derivative(s)

More information

Hamiltonian. March 30, 2013

Hamiltonian. March 30, 2013 Hamiltonian March 3, 213 Contents 1 Variational problem as a constrained problem 1 1.1 Differential constaint......................... 1 1.2 Canonic form............................. 2 1.3 Hamiltonian..............................

More information

3 Space curvilinear motion, motion in non-inertial frames

3 Space curvilinear motion, motion in non-inertial frames 3 Space curvilinear motion, motion in non-inertial frames 3.1 In-class problem A rocket of initial mass m i is fired vertically up from earth and accelerates until its fuel is exhausted. The residual mass

More information

Math 2a Prac Lectures on Differential Equations

Math 2a Prac Lectures on Differential Equations Math 2a Prac Lectures on Differential Equations Prof. Dinakar Ramakrishnan 272 Sloan, 253-37 Caltech Office Hours: Fridays 4 5 PM Based on notes taken in class by Stephanie Laga, with a few added comments

More information

Old Math 330 Exams. David M. McClendon. Department of Mathematics Ferris State University

Old Math 330 Exams. David M. McClendon. Department of Mathematics Ferris State University Old Math 330 Exams David M. McClendon Department of Mathematics Ferris State University Last updated to include exams from Fall 07 Contents Contents General information about these exams 3 Exams from Fall

More information

Optimization-Based Control. Richard M. Murray Control and Dynamical Systems California Institute of Technology

Optimization-Based Control. Richard M. Murray Control and Dynamical Systems California Institute of Technology Optimization-Based Control Richard M. Murray Control and Dynamical Systems California Institute of Technology Version v2.1b (2 Oct 218) c California Institute of Technology All rights reserved. This manuscript

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

The Inverted Pendulum

The Inverted Pendulum Lab 1 The Inverted Pendulum Lab Objective: We will set up the LQR optimal control problem for the inverted pendulum and compute the solution numerically. Think back to your childhood days when, for entertainment

More information

Introduction to Optimal Control Theory and Hamilton-Jacobi equations. Seung Yeal Ha Department of Mathematical Sciences Seoul National University

Introduction to Optimal Control Theory and Hamilton-Jacobi equations. Seung Yeal Ha Department of Mathematical Sciences Seoul National University Introduction to Optimal Control Theory and Hamilton-Jacobi equations Seung Yeal Ha Department of Mathematical Sciences Seoul National University 1 A priori message from SYHA The main purpose of these series

More information

Mathematics for Engineers II. lectures. Differential Equations

Mathematics for Engineers II. lectures. Differential Equations Differential Equations Examples for differential equations Newton s second law for a point mass Consider a particle of mass m subject to net force a F. Newton s second law states that the vector acceleration

More information

Pontryagin s maximum principle

Pontryagin s maximum principle Pontryagin s maximum principle Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 2012 Emo Todorov (UW) AMATH/CSE 579, Winter 2012 Lecture 5 1 / 9 Pontryagin

More information

Chapter III. Stability of Linear Systems

Chapter III. Stability of Linear Systems 1 Chapter III Stability of Linear Systems 1. Stability and state transition matrix 2. Time-varying (non-autonomous) systems 3. Time-invariant systems 1 STABILITY AND STATE TRANSITION MATRIX 2 In this chapter,

More information

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012 NOTES ON CALCULUS OF VARIATIONS JON JOHNSEN September 13, 212 1. The basic problem In Calculus of Variations one is given a fixed C 2 -function F (t, x, u), where F is defined for t [, t 1 ] and x, u R,

More information

EE363 homework 7 solutions

EE363 homework 7 solutions EE363 Prof. S. Boyd EE363 homework 7 solutions 1. Gain margin for a linear quadratic regulator. Let K be the optimal state feedback gain for the LQR problem with system ẋ = Ax + Bu, state cost matrix Q,

More information

An introduction to Mathematical Theory of Control

An introduction to Mathematical Theory of Control An introduction to Mathematical Theory of Control Vasile Staicu University of Aveiro UNICA, May 2018 Vasile Staicu (University of Aveiro) An introduction to Mathematical Theory of Control UNICA, May 2018

More information

INVERSION IN INDIRECT OPTIMAL CONTROL

INVERSION IN INDIRECT OPTIMAL CONTROL INVERSION IN INDIRECT OPTIMAL CONTROL François Chaplais, Nicolas Petit Centre Automatique et Systèmes, École Nationale Supérieure des Mines de Paris, 35, rue Saint-Honoré 7735 Fontainebleau Cedex, France,

More information

Static and Dynamic Optimization (42111)

Static and Dynamic Optimization (42111) Static and Dynamic Optimization (42111) Niels Kjølstad Poulsen Build. 303b, room 016 Section for Dynamical Systems Dept. of Applied Mathematics and Computer Science The Technical University of Denmark

More information

Nonlinear Systems Theory

Nonlinear Systems Theory Nonlinear Systems Theory Matthew M. Peet Arizona State University Lecture 2: Nonlinear Systems Theory Overview Our next goal is to extend LMI s and optimization to nonlinear systems analysis. Today we

More information

Principles of Optimal Control Spring 2008

Principles of Optimal Control Spring 2008 MIT OpenCourseWare http://ocw.mit.edu 16.323 Principles of Optimal Control Spring 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 16.323 Lecture

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

EE C128 / ME C134 Final Exam Fall 2014

EE C128 / ME C134 Final Exam Fall 2014 EE C128 / ME C134 Final Exam Fall 2014 December 19, 2014 Your PRINTED FULL NAME Your STUDENT ID NUMBER Number of additional sheets 1. No computers, no tablets, no connected device (phone etc.) 2. Pocket

More information

YET ANOTHER ELEMENTARY SOLUTION OF THE BRACHISTOCHRONE PROBLEM

YET ANOTHER ELEMENTARY SOLUTION OF THE BRACHISTOCHRONE PROBLEM YET ANOTHER ELEMENTARY SOLUTION OF THE BRACHISTOCHRONE PROBLEM GARY BROOKFIELD In 1696 Johann Bernoulli issued a famous challenge to his fellow mathematicians: Given two points A and B in a vertical plane,

More information

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods

AM 205: lecture 19. Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods AM 205: lecture 19 Last time: Conditions for optimality Today: Newton s method for optimization, survey of optimization methods Optimality Conditions: Equality Constrained Case As another example of equality

More information

Gauss Pseudospectral Method for Solving Infinite-Horizon Optimal Control Problems

Gauss Pseudospectral Method for Solving Infinite-Horizon Optimal Control Problems AIAA Guidance, Navigation, and Control Conference 2-5 August 21, Toronto, Ontario Canada AIAA 21-789 Gauss Pseudospectral Method for Solving Infinite-Horizon Optimal Control Problems Divya Garg William

More information