ES22 Lecture Notes #11

Theoretical Justification for LQ Problems

Sufficiency condition: the LQ problem is the second-order expansion of a nonlinear optimal control problem

    J = φ(x(t_f)) + ∫_{t_0}^{t_f} L(x,u,t) dt ;  ẋ = f(x,u,t) ;  x(t_0) = x_0 ;  ψ(x(t_f)) = 0        (P)

Suppose all necessary conditions for (P) have been satisfied. What about sufficiency conditions? Recall that for a static optimization problem with equality constraints, the trick is to expand the augmented criterion to second order and the constraints to first order. Here the augmented criterion is

    J̄ = φ(x(t_f)) + ∫_{t_0}^{t_f} [λ^T (f − ẋ) + L(x,u,t)] dt

and its expansion is

    δJ = [φ_x − λ^T] δx(t_f) + ∫_{t_0}^{t_f} [(λ̇^T + H_x) δx(t) + H_u δu(t)] dt
         + (1/2) δx(t_f)^T φ_xx δx(t_f) + (1/2) ∫_{t_0}^{t_f} [δx; δu]^T [H_xx, H_xu; H_ux, H_uu] [δx; δu] dt

with the constraint expanded to first order as

    δẋ = f_x δx + f_u δu ;  δx(t_0) = δx_0.

Choosing λ̇^T = −H_x with λ^T(t_f) = φ_x(t_f), and H_u = 0 for all t, we are left with

    δẋ = f_x δx + f_u δu ;  δx(t_0) = δx_0

and

    δJ = (1/2) δx(t_f)^T φ_xx δx(t_f) + (1/2) ∫_{t_0}^{t_f} [δx; δu]^T [H_xx, H_xu; H_ux, H_uu] [δx; δu] dt,

which is recognized as an LQ problem in (δx, δu). Thus if we can show that the minimum of this accessory problem, which is (1/2) δx(t_0)^T S(t_0) δx(t_0), is zero, or S(t) > 0, then the stationary solution must be a local minimum also. Consequently, the sufficient condition is simply the existence of a positive definite solution of the Riccati equation (also known as the Jacobi condition or the conjugate-point condition; see the optics example).

Practical Rationale for LQ Problems

Aerospace guidance and control
Chemical process control

YCHo 11/2/98
All kinds of automotive applications: cruise control, engine control, temperature control
Economic growth and resource models for nations, industries, and firms
Communication networks, computer systems
Manufacturing plants
Stock market
Seismic data processing
Image analysis
Weather prediction

Infinite-Time Regulator Problem

This is the case t_f = ∞ in the LQ problem with A, B, Q, R constant.

Intuition: dS/dt → 0, since (integrating the Riccati equation backward in time) the Q term, which increases S, and the S B R^{-1} B^T S term, which decreases S, fight to a standstill. If S(−∞) = constant, then u = −R^{-1} B^T S x = Kx ⟹ ẋ = (A + BK)x is a constant-coefficient linear system. Question: is it stable? Or does optimality imply stability?

Answer: Stability. The optimal return function V(x) = x^T S x is a Lyapunov function. We have

    dV/dt = ẋ^T S x + x^T Ṡ x + x^T S ẋ
          = x^T [(A+BK)^T S − SA − A^T S − Q + S B R^{-1} B^T S + S(A+BK)] x
          = x^T [−Q − S B R^{-1} B^T S] x < 0  ⟹  stability.

Convergence: If the system is controllable, then we know that some finite-time control followed by u(t) ≡ 0 will drive the system to zero, yielding a finite value of J. The optimal control must achieve a smaller value of J ⟹ convergence, since for a nonzero state the integrand of J is positive and J would otherwise keep increasing.

Inhomogeneous LQ Problem

    J = (1/2)(x(t_f) − x_f)^T S_f (x(t_f) − x_f) + (1/2) ∫_{t_0}^{t_f} [x − x̄; u − ū]^T [Q, N^T; N, R] [x − x̄; u − ū] dt

    ẋ = Ax + Bu + f(t) ;  x(t_0) = x_0

where x̄, ū, f are given functions of time and x_f is a given constant. The optimal V(x,t) is (1/2)x^T S x + α^T x + β, where S still obeys the same Riccati equation, and α and β obey linear ODEs that depend on S(t). It is also possible to add linear terms in x and u to the terminal and in-flight portions of the criterion without changing the general nature of the solution.

Two special cases:

(i) Terminal constraints:

    J = (1/2) ∫_{t_0}^{t_f} u^T R u dt ;  ẋ = A(t)x + B(t)u ;  x(t_f) = 0.
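Both claims for the infinite-time regulator — stability of A + BK and negativity of dV/dt — can be checked numerically. Below is a minimal sketch using SciPy's continuous-time algebraic Riccati solver; the particular A, B, Q, R are arbitrary illustrative choices, not values from these notes:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative controllable system (open loop is unstable: eigenvalues 1 and -2)
A = np.array([[0.0, 1.0],
              [2.0, -1.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)            # state weighting
R = np.array([[1.0]])    # control weighting

# Steady-state S solves  A^T S + S A - S B R^{-1} B^T S + Q = 0
S = solve_continuous_are(A, B, Q, R)

# Optimal feedback u = Kx with K = -R^{-1} B^T S
K = -np.linalg.solve(R, B.T @ S)

# Optimality implies stability: all eigenvalues of A + BK lie in the left half plane
print(np.all(np.linalg.eigvals(A + B @ K).real < 0))   # True

# dV/dt = x^T (-Q - S B R^{-1} B^T S) x is negative definite
M = -Q - S @ B @ np.linalg.solve(R, B.T) @ S
print(np.all(np.linalg.eigvalsh(M) < 0))               # True
```

Any controllable (A, B) with Q > 0, R > 0 would do here; the conclusion does not depend on the numbers chosen.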
Consider the adjoined criterion

    J̄ = ν^T x(t_f) + (1/2) ∫_{t_0}^{t_f} u^T R u dt

and the HJB PDE. Assuming V(x,t) = (1/2)x^T S x + α^T x + β, we get

    −(1/2) x^T Ṡ x − α̇^T x − β̇ = (1/2) x^T [SA + A^T S − S B R^{-1} B^T S] x + α^T A x − (1/2) α^T B R^{-1} B^T α − α^T B R^{-1} B^T S x

Collecting and equating terms in x^T(·)x, x, and scalars, we get

    −Ṡ = SA + A^T S − S B R^{-1} B^T S ;  S(t_f) = 0
    −α̇ = (A^T − S B R^{-1} B^T) α ;  α^T(t_f) = ν^T
    −β̇ = −(1/2) α^T B R^{-1} B^T α ;  β(t_f) = 0

which implies S(t) = 0 for all t, so that dα/dt = −A^T α and

    α(t) = Φ^T(t_f, t) ν

    β(t) = −(1/2) ν^T { ∫_t^{t_f} Φ(t_f, τ) B R^{-1} B^T Φ^T(t_f, τ) dτ } ν

where Φ is the state transition matrix of ẋ = A x. We can solve for ν, assuming controllability, via

    x(t_f) = 0 = −Φ(t_f, t_0) x_0 + { ∫_{t_0}^{t_f} Φ(t_f, τ) B R^{-1} B^T Φ^T(t_f, τ) dτ } ν

NOTE: This solution should be taken with a grain of salt. Since ν depends on x_0 and α(t) depends on ν, we don't really have a feedback solution in the strict sense.

(ii) The Least-Squares Fit Problem (this was a take-home quiz problem in 1990)

Let the scalar time function z(t) = sin(t) for t ∈ [0, π], and let x(t) = a + bt. Determine a and b such that

    J = (1/2) ∫_0^π (z − x)^2 dt

is minimized.

(i) (20%) Convert this problem to a nonstandard inhomogeneous linear-quadratic problem in which the optimizing variables are the constants a and b. (I have lectured in class extensively on how this portion of the problem can be solved and have warned everybody that this part of my lecture is important and subject to quiz.)
(ii) (35%) Solve this problem from first principles (i.e., derive the appropriate conditions for an optimum; don't just assert them) using the Lagrangian method.
(iii) (35%) Solve the same problem again from first principles using dynamic programming, and show that you can get the same answer.
(iv) (10%) Argue from intuitive grounds that the optimal b must be equal to zero.
(v) Explain in what sense this problem captures the essential elements of the course.
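Before working through the analytical solutions, the answer can be sanity-checked numerically: discretize [0, π] on a fine grid and fit a + bt to sin(t) by ordinary least squares. This is a sketch only (the grid size is an arbitrary choice, and discretization approximates the continuous integral):

```python
import numpy as np

# Least-squares fit of x(t) = a + b t to z(t) = sin(t) on [0, pi]:
# minimize J = (1/2) * integral of (z - a - b t)^2 dt,
# approximated by ordinary least squares on a fine uniform grid.
t = np.linspace(0.0, np.pi, 100001)
z = np.sin(t)

# Columns of the design matrix multiply the free constants a and b
Phi = np.stack([np.ones_like(t), t], axis=1)
(a, b), *_ = np.linalg.lstsq(Phi, z, rcond=None)

print(a)   # close to 2/pi = 0.6366...
print(b)   # close to 0
```

Discretizing turns the continuous L² fit into a standard regression; the numbers agree with the variational answer a = 2/π, b = 0 obtained in the solution.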
Solution:

Define x(t) = x_1(t), with

    dx_1/dt = x_2 ,  dx_2/dt = 0        (1)

Then we can let

    x_1(0) = a ,  x_2(0) = b.        (2)

(i) The inhomogeneous LQ problem is to choose a and b to minimize J subject to (1) and (2).

(ii) Adjoining (1) to J via multiplier functions λ_1(t) and λ_2(t), we get

    J̄ = (1/2) ∫_0^π (z − x_1)^2 dt + ∫_0^π [λ_1 (x_2 − ẋ_1) + λ_2 (−ẋ_2)] dt

Integrating by parts,

    J̄ = [−λ_1 x_1 − λ_2 x_2]_0^π + ∫_0^π [(1/2)(z − x_1)^2 + λ_1 x_2 + λ̇_1 x_1 + λ̇_2 x_2] dt

Taking variations of J̄, we get

    δJ̄ = [−λ_1 δx_1 − λ_2 δx_2]_0^π + ∫_0^π [−(z − x_1) δx_1 + (λ_1 + λ̇_2) δx_2 + λ̇_1 δx_1] dt

Let us choose for convenience

    λ̇_1 = z − x_1 ;  λ_1(π) = 0        (3)

    λ̇_2 = −λ_1 ;  λ_2(π) = 0        (4)

Then

    δJ̄ = λ_1(0) δx_1(0) + λ_2(0) δx_2(0) = λ_1(0) δa + λ_2(0) δb        (5)

For optimum a and b, we must have in addition

    λ_1(0) = λ_2(0) = 0        (6)

Thus the necessary conditions are (3), (4), (6), and x_1(t) = a + bt, i.e., (1). Integrating (3) and (4) forward from t = 0 using (6), we get

    λ_1(π) = ∫_0^π z dt − πa − (1/2)π² b

    λ_2(π) = −∫_0^π ∫_0^t z dτ dt + (1/2)π² a + (1/6)π³ b

Setting λ_1(π) = λ_2(π) = 0, the optimum a and b are now obtained by solving
    [−π        −(1/2)π²] [a]   [−∫_0^π z dt          ]
    [−(1/2)π²  −(1/6)π³] [b] = [−∫_0^π ∫_0^t z dτ dt ]        (7)

For z(t) = sin(t), we obtain from (7): a = 2/π and b = 0.

(iii) For the dynamic programming solution, define V(x_1, x_2, t) as the value of J when starting at time t from x_1(t) and x_2(t). Then the usual DP argument yields

    V(x_1, x_2, t) = V(x_1 + Δx_1, x_2 + Δx_2, t + Δt) + (1/2)(z − x_1)² Δt

    ⟹  −∂V/∂t = (∂V/∂x_1) x_2 + (∂V/∂x_2)·0 + (1/2)(z(t) − x_1)²        (8)

(the ∂V/∂x_2 term drops out since ẋ_2 = 0). Try a solution of the PDE (8) of the form V(x_1, x_2, t) = (1/2)S_11(t) x_1² + S_12(t) x_1 x_2 + (1/2)S_22(t) x_2² + α_1(t) x_1 + α_2(t) x_2 + β(t). Substituting into (8), we get

    −[(1/2)Ṡ_11 x_1² + Ṡ_12 x_1 x_2 + (1/2)Ṡ_22 x_2² + α̇_1 x_1 + α̇_2 x_2 + β̇] = [S_11 x_1 + S_12 x_2 + α_1] x_2 + (1/2)(z − x_1)²

Collecting terms in x_1², x_1 x_2, x_2², x_1, x_2 and equating their coefficients, we get

    Ṡ_11 = −1 ;  Ṡ_12 = −S_11 ;  Ṡ_22 = −2S_12 ,  with S_11(π) = S_12(π) = S_22(π) = 0

    −α̇_1 = −z ;  −α̇_2 = α_1 ,  with α_1(π) = α_2(π) = 0

    β̇ = −(1/2) z² ;  β(π) = 0        (9)

Integrating (9),

    S_11(t) = π − t
    S_12(t) = π²/2 + (1/2)t² − πt
    S_22(t) = π³/3 − π² t − (1/3)t³ + π t²
    α_1(t) = −1 − cos(t)
    α_2(t) = −π + t + sin(t)

Now for optimum a and b, we differentiate V(x_1(0), x_2(0), 0) with respect to a and b. Setting ∂V/∂a = 0 and ∂V/∂b = 0 gives

    [S_11(0)  S_12(0)] [a]   [−α_1(0)]
    [S_12(0)  S_22(0)] [b] = [−α_2(0)]        (10)

Solving (10), we obtain once again b = 0, a = 2/π.

(iv) Symmetry dictates that b = 0: both z(t) = sin(t) and the interval [0, π] are symmetric about t = π/2, so the best linear fit must share that symmetry and hence have zero slope.
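As a check, the 2×2 system (10) can be solved numerically with the coefficients just computed at t = 0; a minimal sketch:

```python
import numpy as np

pi = np.pi

# Coefficients from integrating (9), evaluated at t = 0:
S11 = pi                       # S11(t) = pi - t
S12 = pi**2 / 2                # S12(t) = pi^2/2 + t^2/2 - pi*t
S22 = pi**3 / 3                # S22(t) = pi^3/3 - pi^2*t - t^3/3 + pi*t^2
a1 = -1.0 - np.cos(0.0)        # alpha1(t) = -1 - cos(t)   -> alpha1(0) = -2
a2 = -pi + 0.0 + np.sin(0.0)   # alpha2(t) = -pi + t + sin(t) -> alpha2(0) = -pi

# Equation (10): [S11 S12; S12 S22][a; b] = [-alpha1(0); -alpha2(0)]
M = np.array([[S11, S12],
              [S12, S22]])
rhs = np.array([-a1, -a2])
a, b = np.linalg.solve(M, rhs)

print(a, 2.0 / pi)   # a = 2/pi
print(b)             # b = 0
```

This reproduces a = 2/π and b = 0, agreeing with the Lagrangian solution of (7).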