Static and Dynamic Optimization (42111)

Size: px

Start display at page:

Download "Static and Dynamic Optimization (42111)"

Arthur Higgins
5 years ago
Views:

1 Static and Dynamic Optimization (42111) Niels Kjølstad Poulsen Build. 303b, room 016 Section for Dynamical Systems Dept. of Applied Mathematics and Computer Science The Technical University of Denmark phone: mobile: :21 Lecture 8: Free dynamic optimization (D+C) 1/53

2 Outline of lecture Recap L7 Analytical solutions and Numerical methods Continuous time system Exercise DO.2 Reading guidance (DO: 11-14, 27-34) 2/53

3 Dynamic Optimization (D) Minimize J (ie. determine the sequence u i R m, i = 0,..., N 1) where: N 1 J = φ(x N )+ i=0 L i (x i,u i ) subject to (for i = 0, 1,... N 1): x i+1 = f i (x i,u i ) x 0 = x 0 Given: N horizon (number of intervals) x 0 R n initial state is known f i dynamics (R n+m+1 R n ) L i (x i,u i ) stage cost φ(x N ) terminal cost 3/53

4 Euler-Lagrange equations Defining the Hamiltonian function H i = L i (x i,u i )+λ T i+1 f i(x i,u i ) The KKT conditions on this problem (with equality constraints) results in: x i+1 = f i = λ T i = 0 T = H i x i = H i u i = ( ) T H i λ i L i +λ T i+1 f i x i x i L i +λ T i+1 f i u i u i with boundary conditions: x 0 = x 0 λ T N = x N φ N where for short: f i = f i (x i,u i ) L i = L i (x i,u i ) H i = H i (x i,u i,λ i+1 ) 4/53

5 This is a two-point boundary value problem (TPBVP) with N(2n +m) unknowns and equations. The quantity u i H i is the gradient of J wrt. u i. Equal to zero on optimal trajectories. λ 0 is the gradient of J wrt. x 0. 5/53

6 Type of solutions: Analytical solutions (for very simple problems) Semi analytical solutions (eg. the LQ problem) Numerical solutions 6/53

7 LQ problem I (Simple version). LQ Let us now focus on the problem of bringing the linear, first order system given by: x i+1 = ax i +bu i x 0 = x 0 along a trajectory from the initial state, such the cost function (for chosen p 0, q 0 and r > 0): J = 1 N 1 2 px2 N qx2 i ru2 i i=0 is minimized. The Hamiltonian for this problem is H i = 1 2 qx2 i ru2 i +λ [ ] i+1 axi +bu i and the Euler-Lagrange equations are: x i+1 = ax i +bu i λ i = qx i +aλ i+1 0 = ru i +bλ i+1 (1) (2) (3) which has the two boundary conditions x 0 = x 0 λ N = px N 7/53

8 The stationarity conditions (3), 0 = ru i +bλ i+1 give us a sequence of decisions u i = b r λ i+1 (4) if the costate is known. 8/53

9 Inspired from the boundary condition we will postulate a relationship λ i = s i x i (5) Actually separation of the variables (i and x). If we insert the control law, (4), and the costate candidate, (5), in the state equation, (1), we find or x i+1 = ax i b b r s i+1x i+1 x i+1 = [ 1+ b2 r s i+1] 1axi 9/53

10 From the costate equation, (2), we have s i x i = qx i +as i+1 x i+1 = [ q +as i+1 (1+ b2 r s i+1) 1 a ] x i which has to be fulfilled for any x i. This is the case if s i is given by the backwards recursion or s i = as i+1 (1 + b2 r s i+1) 1 a+q }{{} x 1 1+x = 1 x 1+x s i = q +s i+1 a 2 (abs i+1) 2 r +b 2 s i+1 s N = p (6) This can be solved (recursively and backwards). 10/53

11 With this solution (the sequence of s i ) we can determine the (sequence of) control actions or u i = b r λ i+1 = b r s i+1x i+1 = b r s i+1(ax i +bu i ) where u i = abs i+1 r +b 2 s i+1 x i K i = s i+1ab r +s i+1 b 2 = K i x i For the costate we have: λ i = s i x i which can be compared with (which can be proven) J = 1 2 s 0x /53

12 LQ problem - II Linear Dynamics: x i+1 = Ax+Bu x 0 = x 0 x i R n u i R m and a Quadratic objective function: J = 1 2 xt N Px N + 1 N 1 ( ) x T i 2 Qx i +u T i Ru i R > 0 i=0 The matrices, Q, R and P, are symmetric and positive semidefinite. The problem has the Hamiltonian: H i = 1 ( ) x T i 2 Qx i +u T i Ru i +λ T i+1 (Ax i +Bu i ) and the Euler-Lagrange equations are (necessary conditions): ( ) T λ H i = x i+1 = Ax+Bu x 0 = x 0 x H i = λ T i = x T i Q+λT i+1 A λt N = xt N P u H i = 0 T = u T i R+λT i+1 B 12/53

13 The solution to the LQ problem is: u i = K i x i where the gain is given by K i = ( B T S i+1 B +R ) 1 B T S i+1 A and S i is a solution to the Riccati equation S i = Q+A T S i+1 A A T S i+1 B [ B T S i+1 B +R ] 1 B T S i+1 A S N = P The matrix, S i is a symmetric, positive semidefinite matrix. Notice the Costate λ i = S i x i S i 0 which might be compared to: J = 1 2 xt 0 S 0x 0 13/53

14 Riccati Count Jacopo Francesco Riccati Born: 28 May 1676, Venice Dead: 15 April 1754, Treviso University of Padua Source: Wikipedia 14/53

15 Pause 15/53

16 Numerical methods Shooting methods (forward or backward) Gradient methods Brute force 16/53

17 Numerical methods - illustrated on a simple LQ problem. (LQ problem in the simple version). Let us now focus on the problem of bringing the linear, first order system given by: x i+1 = ax i +bu i x 0 = x 0 along a trajectory from the initial state, such the cost function: J = 1 N 1 2 px2 N + i=0 1 2 qx2 i ru2 i is minimized. The Euler-Lagrange equations are: x i+1 = ax i +bu i λ i = qx i +aλ i+1 0 = ru i +bλ i+1 (7) (8) (9) which has the two boundary conditions x 0 = x 0 λ N = px N 17/53

18 Forward shooting The stationarity condition (9) 0 = ru i +bλ i+1 gives us simply: u i = b r λ i+1 We can (for a 0) reverse the costate equation into λ i = qx i +aλ i+1 λ i+1 = λ i qx i a 18/53

19 Starting with x 0 and λ 0 ( guessed value ) we can for i = 0, 1,... N 1 iterate: λ i+1 = λ i qx i a u i = b r λ i+1 x i+1 = ax i +bu i We end up with x N and λ N which ( for correct λ 0 ) should fulfill λ N = px N ε N = λ N px N = 0 Notice: a problem if a << 1 or a >> 1. 19/53

20 Contents of a file (parms.m) setting the parameters. % Constants etc. alf=0.05; a=1+alf; b=-1; x0=50000; N=10; q=alf^2; r=q; p=q; 20/53

21 The following code (fejlf.m) solves these recursions. function err=fejlf(la0) parms % set parameters a,b,p,q,r,x0 la=la0; x=x0; for i=0:n-1, la=(la-q*x)/a; u=-b*la/r; x=a*x+b*u; end err=la-p*x; λ i+1 = λ i qx i a u i = b r λ i+1 x i+1 = ax i +bu i 21/53

22 Extented version of fejlf.m (for plotting). function [err,xt,ut,lat]=fejlf(la0) parms % set parameters a,b,p,q,r,x0 la=la0; x=x0; ut=[]; lat=la; xt=x; for i=0:n-1, la=(la-q*x)/a; u=-b*la/r; x=a*x+b*u; xt=[xt;x]; lat=[lat;la]; ut=[ut;u]; end err=la-p*x; 22/53

23 Master program (script). % The search for la0 la0g=10; % a wild guess% la0=fsolve( fejlf,la0g) % The simulation with the correct la0 [err,xt,ut,lat]=fejlf(la0); subplot(211); bar(ut); grid; title( Input sequence ); subplot(212); bar(xt); grid; title( Saldo ); 23/53

24 5 x 104 Input sequence x 104 Balance /53

25 Forward shooting method - Simplified If separation possible: reverse the costate equation and find u i from the stationarity condition. The Euler-Lagrange equations x i+1 = f i (x i,u i ) λ T i = L i (x i,u i )+λ T i+1 f i (x i,u i ) x i x i λ i+1 = h i (x i,λ i ) 0 T = L i (x i,u i )+λ T i+1 f i (x i,u i ) u i u i u i = g i (x i,λ i+1 ) Guess λ 0 (or another parameterization) and use x 0. Iterate for i = 0,1... N 1: 1 Knowing x i and λ i, determine u i and λ i+1 from the stationarity and the costate equation. 2 Update the state equation i.e. find x i+1 from x i and u i. At the end (i = N) check if λ T N = x N φ(x N ) ε = λ T N x N φ(x N ) = 0 T 25/53

26 Forward shooting method - General The Euler-Lagrange equations x i+1 = f i (x i,u i ) λ T i = L i (x i,u i )+λ T i+1 f i (x i,u i ) x i x i 0 T = L i (x i,u i )+λ T i+1 f i (x i,u i ) u i u i ( ) ( ) λi+1 xi = g u i i λ i Guess λ 0 (or another parameterization) and use x 0. Iterate for i = 0,1... N 1: 1 Knowing x i and λ i, determine u i and λ i+1 from the stationarity and the costate equation. 2 Update the state equation i.e. find x i+1 from x i and u i. At the end (i = N) check if λ T N = x N φ(x N ) ε = λ T N x N φ(x N ) = 0 T 26/53

27 Gradient methods The Euler-Lagrange equations are: x i+1 = f i (x i,u i ) λ T i = L i (x i,u i )+λ T i+1 f i (x i,u i ) x i x i 0 T = L i (x i,u i )+λ T i+1 f i (x i,u i ) u i u i = x i H i = u i H i which has the two boundary conditions x 0 = x 0 λ T N = x N φ(x N ) Guess a sequence of decisions, u i i = 0, 1,... N 1. Search for an optimal sequence of decisions, u i i = 0, 1,... N 1 using e.g. a Newton Raphson iteration: [ u j+1 i = u j 2 ] 1 i u 2 H i H i i u i 27/53

28 Brute force Guess a sequence of decisions, u i i = 0, 1,... N 1. 1 Start in x 0 and iterate the state equation forwards i.e. determine the state sequence, x i. 2 Determine the performance index. Search (e.g. using an amoeba method) for an optimal sequence of decisions, u i i = 0, 1,... N 1. 28/53

29 Pause 29/53

30 Continuous time Optimization Discrete time 0 N i Continuous time 0 T t The Schaefer model (Fish in the Baltics) x i+1 = x i +rhx i (1 αx i ) x 0 = u 0 h is the length of the intervals. The model can in continuous time be given as: ẋ t = dxt dt = rxt(1 αxt) x 0 = x 0 30/53

31 The fox(f) and rabbit(r) example. [ ] [ ] [ ṙ α = 1 r β 1 rf r F α 2 F +β 2 rf F ] 0 [ ] r0 = F 0 In general Dynamic (continuous time) state space model: ẋ 1 x 1 u ẋ 2 x. = 2 1 ft,. u ẋ m n t. x n t x 1 x 2. x n 0 = x 1 x 2.. x n 0 or in short: ẋ t = f t(x t,u t) x 0 = x 0 f : R n+m+1 R n The function, f, should be sufficiently smooth (existence and uniqueness). 31/53

32 Solution to ODE Analytical methods Numerical methods ẋ = f t(x t) x 0 = x 0 Euler integration (the most simple method) x t+h = x t +hf t(x t) is the most simple method. Others and more efficient numerical methods do exists. 32/53

33 The fox(f) and rabbit(r) example. [ ṙ F ] [ = α 1 r β 1 rf α 2 F +β 2 rf ] [ ] ur u f [ r F ] 0 [ ] r0 = F Lotka Volterra rabbit # fox, rabbit fox period 33/53

34 % function dx=dfoxc(t,x,u,a1,a2,b1,b2) % % Dynamic function for the continuous Lotka-Volterra system. % It determine the state derivative as function of the time, state vector % and system parameters. r=x(1); F=x(2); dx=[ a1*r-b1*r*f; -a2*f+b2*r*f ]-u; % # of Rabbits is the first state % # of Foxes is the second state % dx: derivative of x [ ṙ F ] [ = α 1 r β 1 rf α 2 F +β 2 rf ] [ ] ur u f [ r F ] 0 [ ] r0 = F 0 34/53

35 % function foxc % % This program simulates the trajectories for the Lotka-Volterra system. % a1=0.03; a2=0.03; b1=0.03/100; b2=b1; r=100; f=50; x0=[r;f]; Tstp=500; dt=0.1; Tspan=0:dT:Tstp; %Tspan=[0 Tstp]; u=[0.0;0.0]; % System parameters (enters in dfoxc function) % Initial # of rabbits % Initial # of foxes % Initial state value % Stop time % Step size in output data % Time span of which the solution is to be found % Alternative time span % Shootings of rabbits and foxes [time,xt]=ode45(@dfoxc,tspan,x0,[],u,a1,a2,b1,b2); % ODE solver % See dfox for dynamic % function 35/53

36 % % The rest of this program (until next function declaration) is just % plot and plot related commands plot(time,xt); grid; title( Lotka-Volterra ); ylabel( # fox, rabbit ); xlabel( period ); text(50,80, fox ); text(220,140, rabbit ); % print foxc.pps % Just a printing command % % end of main program % /53

37 Analytical solutions Discrete time Continuous time x i+1 = x i x i = C ẋ = 0 x = C x i+1 = x i +α x i = C +αi ẋ = α x t = C +αt x i+1 = ax i x i = Ca i ẋ = ax x t = Cexp(at) x i+1 = Ax i +Bu i x 0 = x 0 i x i = A i x 0 + A n j 1 Bu j j=0 ẋ = Ax+Bu x 0 = x 0 t x t = e At x 0 + e A(t s) Bu sds 0 Constant as C can be determined from boundary conditions. Examples are x 0 = x 0 or x N = x N 37/53

38 The Performance Index In discrete time we search for a sequence of decisions (u i performance index i = 0, 1,... N 1) such that the N 1 J = φ(x N )+ i=0 L i (x i,u i ) h is optimized s.t. the dynamics (and other possible constraints). In continuous time we search for a decision function (u t 0 t T) such that the performance index T J = φ T (x T )+ L t(x t,u t) dt 0 is optimized s.t. the dynamics (and other possible constraints). 38/53

39 Dynamic Optimization Free dynamic optimization: Minimize J (ie. determine the function u t, 0 t T) where: T J = φ T (x T )+ 0 L t(x t,u t) dt Objective subject to ẋ = f t(x t,u t) x 0 = x 0 Dynamics T is given x 0 is given f t(x t,u t) dynamics L t(x t,u t) kernel or running cost φ T (x T ) terminal loss and x T is free. 39/53

40 Euler-Lagrange Equations ẋ t = f t(x t,u t) λ T t = L t(x t,u t)+λ T t f t(x t,u t) x t x t 0 T = u t L t(x t,u t)+λ T t u t f t(x t,u t) with boundary conditions: x 0 = x 0 λ T T = x φ T(x T ) 40/53

41 Hamilton function Define the Hamilton function as: H(x,u,λ,t) = L t(x t,u t)+λ T t ft(xt,ut) Then the Euler-Lagrange equations (KKT conditions) for this problem can be written as: ẋ T = λ Ht λ T = x Ht 0T = u Ht H is the gradient of J wrt. u. u λ T 0 is the gradient of J wrt. x 0. The first equation is just the state equation ẋ = f t(x t,u t) 41/53

42 Properties of the Hamiltonian H t(x t,u t) = L t(x t,u t)+λ T t ft(xt,ut) Ḣ = t H + u H u+ x H ẋ+ λ H λ = t H + u H u+ = t H + u H u+ x H f +ft λ [ ] x H + λ T f = H = 0 for time invariant problems t along the optimal trajectories for x, u and λ. Motion control 42/53

43 Proof The Lagrange function for the problem is: T J L = φ T (x T )+ L t(x t,u t)dt 0 T + 0 λ T t [ft(xt,ut) ẋt]dt If partial integration T T λ T ẋdt+ λ T xdt = λ T T x T λ T 0 x is introduced the Lagrange function can be written as: J L = φ T (x T )+λ T 0 x 0 λ T T x T T + (L t(x t,u t)+λ T t f t(x t,u t)+ λ ) T t x t dt 0 and the Euler-Lagrange equations emerge from the stationarity of the Lagrange function. ( ) dj L = φ T λ T t dx T x T T ( + ) 0 x L+λT x f + λ T δxdt T ( + ) 0 u L+λT u f δudt 43/53

44 The Fundamental Lemma of the calculus of variation The Fundamental Lemma: Let f t be a continuous real-values function defined on a t b and suppose that: b f tδ t dt = 0 a for any δ t C 2 [a,b] satisfying δ a = δ b = 0. Then f t 0 t [a,b] 44/53

45 Motion control Optimal stepping (in continuous time) but in one dimension. Consider the problem of bringing the system ẋ = u x 0 = x 0 from the initial state along a trajectory such the cost J = 1 2 px2 T + T u2 t is minimized. The Hamiltonian function is H t = 1 2 u2 t +λtut and the Euler-Lagrange equations are: ẋ = u λ = 0 0 = u t +λ t with boundary conditions: x 0 = x 0 λ T = px T 45/53

46 The last two are easily solved: λ t = px T u t = λ t = px T The state equation (with the solution to u t) gives us x t = x 0 px T t from which we can find x T = and then x t = λ t = x T = x 0 px T T 1 1+pT x 0 0 for p ( ) p 1 1+pT t x 0 p 1+pT x 0 u t = p 1+pT x 0 and the Hamilton function is constant: H = 1 [ ] p 2 1+pT x 0 46/53

47 LQ problem Continuous time and Free Consider the linear dynamic system ẋ = Ax t +Bu t x 0 = x 0 and the cost function J = 1 2 xt T Px T + 1 T x T t Qx t +u T t Ru t dt 2 0 The problem has the Hamiltonian: H = 1 2 xt t Qxt ut t Rut +λt (Ax t +Bu t) and the Euler-Lagrange equations: ẋ = Ax t +Bu t x 0 = x 0 λ T t = x T t Q+λ T t A λ T T = xt T P 0 = u T t R+λT t B or ẋ = Ax t +Bu t x 0 = x 0 λ t = Qx t +A T λ t λ T = Px T u t = R 1 B T λ t 47/53

48 We will try the candidate function: Then λ t = S tx t λ t = Ṡtxt +Stẋt = Ṡtxt +St ( Ax t BR 1 B T S tx t ) If inserted in the costate equation λ t = Qx t +A T λ t Ṡtxt St ( Ax t BR 1 B T S tx t ) = Qx t +A T S tx t then for every x t: Ṡtxt = StAxt +AT S tx t +Qx t S tbr 1 B T S tx t which fulfilled if (the Riccati equation): Ṡt = StA+AT S t +Q S tbr 1 B T S t S T = P u t = R 1 B T S tx t 48/53

49 Minimum drag nose shape (Newton 1686) Find the shape i.e. r(x) of a axial symmetric nose, such that the drag is minimized. y θ a Flow x V l The decision u(x) is the slope of the profile: r = u = tan(θ) x r(0) = a 49/53

50 Minimum drag nose shape (Newton) Find the shape i.e. r(x) of a axial symmetric nose, such that the drag is minimized. a D = q C p(θ)2πrdr 0 q = 1 2 ρv 2 (Dynamic pressure) C p(θ) = 2sin(θ) 2 for θ 0 y θ a Flow x V l 50/53

51 Minimum drag nose shape (Newton) y θ a Flow x V Dynamic: r x = u r 0 = a tan(θ) = u Cost function (drag coefficient, including a blunt nose): C d = D l qπa 2 = 2r2 l +4 ru u 2dx 1 l 51/53

52 Minimum drag nose shape (Newton) r//a Cd = x/a 52/53

53 Reading guidance DO: 11-14, /53

Static and Dynamic Optimization (42111)

Static and Dynamic Optimization (421) Niels Kjølstad Poulsen Build. 0b, room 01 Section for Dynamical Systems Dept. of Applied Mathematics and Computer Science The Technical University of Denmark Email: