Constrained Infinite-Time Nonlinear Quadratic Optimal Control
V. Manousiouthakis, D. Chmielewski
Chemical Engineering Department, UCLA
1998 AIChE Annual Meeting
Outline
Unconstrained Infinite-Time Nonlinear Quadratic-Optimal Control
  Hamilton-Jacobi-Bellman Equation
  State Dependent Riccati Equation and the Curl Condition
  Closed-Loop Stability
  Example
Constrained Infinite-Time Nonlinear Quadratic-Optimal Control
  Family of Finite Time Problems
  Equivalence to the Infinite Time Problem
  Example
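Nonlinear Quadratic Optimal Control
Infinite-time Problem:
$$\min_u \int_0^\infty \left( x^T Q(x) x + u^T R(x) u \right) dt \quad \text{s.t.} \quad \dot{x} = f(x) + B(x) u$$
The solution can be found by solving the HJB equation:
$$\left(\frac{\partial J}{\partial x}\right)^T f(x) + x^T Q(x) x - \frac{1}{4} \left(\frac{\partial J}{\partial x}\right)^T B(x) R^{-1}(x) B^T(x) \frac{\partial J}{\partial x} = 0$$
The optimal policy is given by
$$u^*(x) = -\frac{1}{2} R^{-1}(x) B^T(x) \frac{\partial J}{\partial x}$$
The determination of $J(x)$ is nearly impossible in all but the simplest cases.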
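One of the "simplest cases" can be checked by hand. The sketch below is an illustration, not part of the original slides. It assumes a scalar linear system $f(x) = a x$, $B = b$, with constant weights $q$ and $r$, so that $J(x) = p x^2$ with $p$ the positive root of the scalar Riccati equation. The function names and the numerical values are hypothetical.

```python
import math


def lq_value_coefficient(a, b, q, r):
    """Positive root p of the scalar Riccati equation 2*a*p + q - (b**2 / r) * p**2 = 0."""
    s = b * b / r
    return (a + math.sqrt(a * a + q * s)) / s


def hjb_residual(x, a, b, q, r):
    """Left-hand side of the HJB equation for the candidate J(x) = p * x**2."""
    p = lq_value_coefficient(a, b, q, r)
    dJ = 2.0 * p * x  # dJ/dx
    return dJ * (a * x) + q * x * x - 0.25 * dJ * (b * b / r) * dJ


def optimal_input(x, a, b, q, r):
    """u*(x) = -(1/2) * R^{-1} * B^T * dJ/dx in the scalar case."""
    p = lq_value_coefficient(a, b, q, r)
    return -0.5 * (b / r) * (2.0 * p * x)


# Hypothetical numbers: an open-loop unstable plant with unit weights.
for x in (-2.0, 0.5, 3.0):
    print(x, hjb_residual(x, a=1.0, b=1.0, q=1.0, r=1.0))
```

The residual vanishes (up to rounding) for every x, which is what "solving the HJB equation" means in this scalar setting.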
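Infinite-time Nonlinear Quadratic Optimal Control
$J(x)$ can be written as $J(x) = x^T P(x) x$, where $P(x)$ is symmetric. In this case:
$$\frac{\partial J}{\partial x} = 2 \left( P(x) + \frac{1}{2} \begin{bmatrix} x^T \frac{\partial P}{\partial x_1} \\ \vdots \\ x^T \frac{\partial P}{\partial x_n} \end{bmatrix} \right) x$$
Define
$$\Pi(x) \triangleq P(x) + \frac{1}{2} \begin{bmatrix} x^T \frac{\partial P}{\partial x_1} \\ \vdots \\ x^T \frac{\partial P}{\partial x_n} \end{bmatrix}$$
Then the HJB becomes:
$$x^T \left[ \Pi(x) A(x) + A^T(x) \Pi(x) + Q(x) - \Pi(x) B(x) R^{-1}(x) B^T(x) \Pi(x) \right] x = 0$$
where $A(x)$ is s.t. $f(x) = A(x) x$.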
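The identity between the gradient of $x^T P(x) x$ and twice the modified matrix times $x$ can be checked numerically. The sketch below is an illustration only: the state-dependent matrix `P` is a hypothetical two-state example chosen so that its partial derivatives are easy to write down, and the gradient is approximated by central differences.

```python
def P(x):
    """Hypothetical symmetric, state-dependent weight (illustration only)."""
    x1, x2 = x
    return [[2.0 + x1 * x1, 0.5], [0.5, 1.0 + x2 * x2]]


def J(x):
    """J(x) = x^T P(x) x."""
    p = P(x)
    return sum(x[i] * p[i][j] * x[j] for i in range(2) for j in range(2))


def Pi(x):
    """P(x) plus one half of the matrix whose i-th row is x^T dP/dx_i."""
    x1, x2 = x
    p = P(x)
    # dP/dx1 = [[2*x1, 0], [0, 0]]  ->  x^T dP/dx1 = [2*x1**2, 0]
    # dP/dx2 = [[0, 0], [0, 2*x2]]  ->  x^T dP/dx2 = [0, 2*x2**2]
    rows = [[2.0 * x1 * x1, 0.0], [0.0, 2.0 * x2 * x2]]
    return [[p[i][j] + 0.5 * rows[i][j] for j in range(2)] for i in range(2)]


def gradient_fd(x, h=1e-6):
    """Central-difference approximation of dJ/dx."""
    grad = []
    for k in range(2):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((J(xp) - J(xm)) / (2.0 * h))
    return grad


def two_Pi_x(x):
    """Closed-form gradient 2 * Pi(x) * x."""
    m = Pi(x)
    return [2.0 * sum(m[i][j] * x[j] for j in range(2)) for i in range(2)]


point = [0.7, -1.3]
print(gradient_fd(point), two_Pi_x(point))
```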
Assumptions
(H1) $R(x) > 0 \ \forall x \in \Re^n$, and $Q(x) > 0 \ \forall x \in D \subseteq \Re^n$
(H2) $A(x)$ and $B(x)$ are analytic matrix valued functions $\forall x \in D$
(H3) The pair $(A(x), B(x))$ is controllable (in the linear system sense) $\forall x \in D$
(H4) The gradients $\frac{\partial (A(x)x)}{\partial x}$ and $\frac{\partial b_i(x)}{\partial x}$ exist and are continuous and bounded $\forall x \in D$ (where $b_i(x)$ is the $i$th column of $B(x)$)
State Dependent Riccati Equation
If $\Pi(x)$ satisfies the SDRE:
$$\Pi(x) A(x) + A^T(x) \Pi(x) - \Pi(x) B(x) R^{-1}(x) B^T(x) \Pi(x) + Q(x) = 0$$
then the HJB will appear to be satisfied. In this case, the optimal policy is
$$u^*(x) = -R^{-1}(x) B^T(x) \Pi(x) x$$
and the optimal value function is $J(x) = x^T P(x) x$.
Gradient of a Scalar Function
For the HJB to actually be satisfied, $2x^T \Pi(x)$ must equal $\frac{\partial J}{\partial x}$.
The conditions for a vector function, $v(x)$, to be a gradient are:
$$\frac{\partial v_i}{\partial x_j} = \frac{\partial v_j}{\partial x_i} \quad \forall\, i, j$$
Assumption:
(H5) $A(x)$ is s.t. the solution to the SDRE, $\Pi(x)$, satisfies $\mathrm{curl}(x^T \Pi(x)) = 0 \ \forall x \in D$
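The symmetry condition on the partial derivatives can be spot-checked numerically at sample points. The sketch below is an illustration under stated assumptions: the two vector fields are hypothetical examples, the Jacobian is approximated by central differences, and passing the check at a few points is only evidence, not a proof, that a field is a gradient on all of the domain.

```python
def jacobian(v, x, h=1e-6):
    """Central-difference Jacobian: jac[i][j] approximates d v_i / d x_j."""
    n = len(x)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        vp, vm = v(xp), v(xm)
        for i in range(n):
            jac[i][j] = (vp[i] - vm[i]) / (2.0 * h)
    return jac


def satisfies_curl_condition(v, x, tol=1e-5):
    """True if d v_i / d x_j equals d v_j / d x_i (within tol) at the point x."""
    jac = jacobian(v, x)
    n = len(x)
    return all(abs(jac[i][j] - jac[j][i]) <= tol
               for i in range(n) for j in range(i + 1, n))


def v_gradient(x):
    """Gradient of x1**2 + x1*x2 + x2**2 (hypothetical example)."""
    return [2.0 * x[0] + x[1], x[0] + 2.0 * x[1]]


def v_rotation(x):
    """A rotation field, which is not the gradient of any scalar function."""
    return [-x[1], x[0]]


print(satisfies_curl_condition(v_gradient, [0.3, -0.8]))
print(satisfies_curl_condition(v_rotation, [0.3, -0.8]))
```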
Existence of a $\Pi(x)$
The following guarantees the existence of a SDRE such that the curl condition is satisfied (Lu and Huang, 199):
If there exists a solution to the HJB equation, $J(x)$, then there exists $A(x)$ s.t. the solution to the SDRE, $\Pi(x)$, satisfies
$$\frac{\partial J}{\partial x} = 2x^T \Pi(x)$$
Stability and Optimality
Theorem (Manousiouthakis and Chmielewski, 1998): Let (H1)-(H5) hold. Then the scalar function
$$J(x) = x^T \left[ \int_0^1 \Pi(\omega x)\, d\omega \right] x$$
satisfies the necessary conditions for optimality (HJB) for all $x \in D$. Furthermore, the feedback policy
$$u(x) = -R^{-1}(x) B^T(x) \Pi(x) x$$
is asymptotically stabilizing.
Corollary: Let (H1)-(H5) hold, and assume $D = \Re^n$. Then the feedback policy is globally asymptotically stabilizing.
An Inverse Method
Consider $\Pi(x)$ and $A(x)$ to be design functions. First we must guarantee
$$\Pi(x) > 0, \qquad \mathrm{curl}(x^T \Pi(x)) = 0, \qquad A(x) x = f(x)$$
Then simply calculate $Q(x)$ from the SDRE:
$$Q(x) = \Pi(x) B(x) R^{-1}(x) B^T(x) \Pi(x) - A^T(x) \Pi(x) - \Pi(x) A(x)$$
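A scalar instance makes the inverse method concrete. The sketch below is an illustration, not the authors' example: it assumes the hypothetical dynamics $f(x) = -x - x^3$ (so $A(x) = -1 - x^2$), $B = R = 1$, and a constant design choice for the Riccati solution equal to 1. In one dimension the curl condition holds trivially.

```python
B = 1.0   # input gain (assumed)
R = 1.0   # input weight (assumed)
PI = 1.0  # constant design choice for the SDRE solution (assumed, positive)


def a_of_x(x):
    """State-dependent coefficient with a_of_x(x) * x = f(x) = -x - x**3."""
    return -1.0 - x * x


def q_of_x(x):
    """State weight recovered from the SDRE: Q = PI*B*R^-1*B*PI - A*PI - PI*A."""
    return PI * B * (1.0 / R) * B * PI - a_of_x(x) * PI - PI * a_of_x(x)


def hjb_residual(x):
    """HJB left-hand side for J(x) = PI * x**2 with the recovered Q(x)."""
    dJ = 2.0 * PI * x
    f = a_of_x(x) * x
    return dJ * f + q_of_x(x) * x * x - 0.25 * dJ * B * (1.0 / R) * B * dJ


for x in (-1.5, 0.2, 2.0):
    print(x, q_of_x(x), hjb_residual(x))
```

Here the recovered weight is $Q(x) = 3 + 2x^2$, which is positive everywhere, so the design is admissible for all x in this hypothetical case.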
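An Inverse Method (continued)
Let $Z$ be defined as (a positively invariant set)
$$Z = \left\{ x \in \Re^n \ \middle|\ J(x) = x^T \left[ \int_0^1 \Pi(\vartheta x)\, d\vartheta \right] x < \gamma \right\}$$
Define $\gamma$ s.t.
$$x^T Q(x) x + u^T R(x) u = x^T \left[ Q(x) + \Pi(x) B(x) R^{-1}(x) B^T(x) \Pi(x) \right] x > 0 \quad \forall x \in Z$$
Then a sufficient condition for closed-loop stability is that $x(0) \in Z$.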
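Example
Consider a CSTR with reaction $2A \rightleftharpoons B$ (forward rate constant $k_a$, reverse rate constant $k_b$):
$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} -2k_a x_1^2 - \left(4k_a C_{Ass} + \frac{F}{V}\right) x_1 + 2k_b x_2 \\ k_a x_1^2 + 2k_a C_{Ass} x_1 - \left(\frac{F}{V} + k_b\right) x_2 \end{bmatrix} + \begin{bmatrix} \frac{F}{V} \\ 0 \end{bmatrix} u$$
where
$$x_1 = C_A - C_{Ass}, \qquad x_2 = C_B - C_{Bss}, \qquad u = C_{Ain} - C_{Ainss}$$
and the steady state satisfies
$$C_{Bss} = \frac{k_a C_{Ass}^2}{k_b + \frac{F}{V}} = \frac{1}{2}\left(C_{Ainss} - C_{Ass}\right)$$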
Example (continued)
The following SS design parameters are used (all in SI units):
$$k_a = 0.5, \qquad k_b = 0.1, \qquad \frac{F}{V} = 0.2, \qquad C_{Ainss} = 10, \qquad C_{Ass} = 1.59, \qquad C_{Bss} = 4.21$$
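The steady-state values can be cross-checked against the two steady-state relations of the CSTR model, $C_{Bss} = k_a C_{Ass}^2 / (k_b + F/V)$ and $C_{Bss} = \frac{1}{2}(C_{Ainss} - C_{Ass})$. The sketch below assumes the parameters read as $k_a = 0.5$, $k_b = 0.1$, $F/V = 0.2$ and $C_{Ainss} = 10$; the function name is hypothetical.

```python
import math

K_A = 0.5        # forward rate constant (assumed reading of the slide)
K_B = 0.1        # reverse rate constant (assumed reading of the slide)
F_OVER_V = 0.2   # dilution rate F/V (assumed reading of the slide)
C_AIN_SS = 10.0  # steady-state inlet concentration of A (assumed reading)


def steady_state(k_a, k_b, f_over_v, c_ain):
    """Solve k_a*C_A**2/(k_b + F/V) = (c_ain - C_A)/2 for the positive root."""
    d = k_b + f_over_v
    # Rearranged: 2*k_a*C_A**2 + d*C_A - d*c_ain = 0
    c_a = (-d + math.sqrt(d * d + 8.0 * k_a * d * c_ain)) / (4.0 * k_a)
    c_b = 0.5 * (c_ain - c_a)
    return c_a, c_b


c_ass, c_bss = steady_state(K_A, K_B, F_OVER_V, C_AIN_SS)
print(round(c_ass, 2), round(c_bss, 2))
```

With these inputs the two computed concentrations round to the steady-state values quoted for A and B.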
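Example (continued)
Let $\Pi(x) = P_o$ where $P_o$ satisfies
$$A_o^T P_o + P_o A_o + Q_o - P_o B R^{-1} B^T P_o = 0$$
where $R = (F/V)^2$,
$$A_o = \begin{bmatrix} -\left(4k_a C_{Ass} + \frac{F}{V}\right) & 2k_b \\ 2k_a C_{Ass} & -\left(\frac{F}{V} + k_b\right) \end{bmatrix} \quad \text{and} \quad Q_o = \begin{bmatrix} q^o_{11} & 0 \\ 0 & q^o_{22} \end{bmatrix}$$
Clearly this choice of $\Pi(x)$ satisfies the "Curl" condition.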
Example (continued)
$Q(x)$ is calculated, via the SDRE, to be:
$$Q(x) = \begin{bmatrix} q^o_{11} + 2k_a x_1 (2p_{11} - p_{12}) & k_a x_1 (2p_{12} - p_{22}) \\ k_a x_1 (2p_{12} - p_{22}) & q^o_{22} \end{bmatrix}$$
Then $Q(x) + P_o B R^{-1} B^T P_o$ is positive definite iff
$$q^o_{11} + 2k_a x_1 (2p_{11} - p_{12}) + p_{11}^2 > 0$$
and
$$\left( q^o_{11} + 2k_a x_1 (2p_{11} - p_{12}) + p_{11}^2 \right)\left( q^o_{22} + p_{12}^2 \right) - \left( k_a x_1 (2p_{12} - p_{22}) + p_{11} p_{12} \right)^2 > 0$$
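The state-dependent part of $Q(x)$ can be reproduced by evaluating the SDRE expression $Q = P_o B R^{-1} B^T P_o - A^T(x) P_o - P_o A(x)$ directly. The sketch below is an illustration under assumptions: $A(x)$ is the natural factorization of the CSTR model, $B R^{-1} B^T$ is taken as diag(1, 0) (that is, $R = (F/V)^2$), and `P_O` holds hypothetical entries rather than the actual Riccati solution, so only the increment $Q(x) - Q(0)$ is meaningful.

```python
K_A, K_B, F_OVER_V, C_ASS = 0.5, 0.1, 0.2, 1.59  # assumed parameter values

E = [[1.0, 0.0], [0.0, 0.0]]      # B R^-1 B^T with B = [F/V, 0]^T, R = (F/V)^2
P_O = [[0.2, 0.1], [0.1, 0.5]]    # hypothetical symmetric P_o (not the Riccati solution)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def a_matrix(x1):
    """A(x) with A(x) x = f(x) for the CSTR deviation model."""
    return [[-2.0 * K_A * x1 - (4.0 * K_A * C_ASS + F_OVER_V), 2.0 * K_B],
            [K_A * x1 + 2.0 * K_A * C_ASS, -(F_OVER_V + K_B)]]


def q_from_sdre(p, x1):
    """Q(x) = P E P - A(x)^T P - P A(x)."""
    a = a_matrix(x1)
    pep = matmul(matmul(p, E), p)
    atp = matmul(transpose(a), p)
    pa = matmul(p, a)
    return [[pep[i][j] - atp[i][j] - pa[i][j] for j in range(2)] for i in range(2)]


def q_increment(p, x1):
    """State-dependent part Q(x) - Q(0)."""
    qx, q0 = q_from_sdre(p, x1), q_from_sdre(p, 0.0)
    return [[qx[i][j] - q0[i][j] for j in range(2)] for i in range(2)]


print(q_increment(P_O, 0.8))
```

The printed increment can be compared entry by entry with the closed-form terms that multiply $k_a x_1$ in $Q(x)$.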
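Example (continued)
For $q^o_{11} = 1$ and $q^o_{22} = .1$:
$$x_1 \in (-1, 39) \ \Rightarrow\ Q(x) + P_o B R^{-1} B^T P_o > 0$$
The largest positively invariant set
$$Z = \left\{ \xi \in \Re^2 \ \middle|\ \xi^T P_o \xi < \gamma \right\}$$
such that $Q(x) + P_o B R^{-1} B^T P_o > 0 \ \forall x \in Z$, corresponds to $\gamma = 1.8$.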
Example 1: Closed-Loop Stability Region
[Figure: region where $Q(x) + P_o B R^{-1} B^T P_o > 0$, plotted as Concentration of B versus Concentration of A]
Example 1: Closed-Loop Simulations
[Figure: Concentration ($C_A$ and $C_B$) versus Time, comparing Nominal Inputs with Optimal Control]
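Constrained ITNQOC Problem
$$\Phi(\xi) = \inf_{u(\cdot)} \int_0^\infty \left( x^T Q x + \sigma^T(u) R \sigma(u) \right) dt \quad \text{s.t.} \quad \dot{x} = A(x) x + B(x) \sigma(u), \quad x(0) = \xi$$
where $\sigma(\cdot) : \Re^m \to U$ is defined as
$$\sigma(u) = \arg\min_{\sigma \in U} \| u - \sigma \|$$
Further Assumptions:
(H6) $U$ is convex and contains the origin in its interior
(H7) $\xi \in X_o \triangleq \{ \xi \in \Re^n \mid \exists\, u \text{ s.t. } \Phi(\xi) < \infty \}$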
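For one common special case the arg-min map has a closed form. The sketch below is an illustration that assumes the constraint set $U$ is a box (independent lower and upper bounds on each input) and the norm is Euclidean; under those assumptions the nearest point in $U$ is obtained by clipping each component. The function name and bounds are hypothetical.

```python
def saturate(u, lower, upper):
    """Nearest point to u in the box [lower, upper] (componentwise clipping)."""
    return [min(max(ui, lo), hi) for ui, lo, hi in zip(u, lower, upper)]


# Hypothetical bounds; the box contains the origin in its interior.
lower = [-1.0, -2.0]
upper = [1.0, 0.5]

print(saturate([0.3, -0.4], lower, upper))  # already feasible, returned unchanged
print(saturate([2.5, -3.0], lower, upper))  # clipped to the nearest corner
```

For a general convex $U$ the projection has no such closed form and would need a small optimization problem at each evaluation.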
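Constrained FTNQOC Problem
Consider the following family of finite-time optimal control problems:
$$\Phi_T(\xi) = \inf_{u} \left\{ \int_0^T \left( x^T Q x + \sigma^T(u) R \sigma(u) \right) dt + J(x(T)) \right\} \quad \text{s.t.} \quad \dot{x} = A(x) x + B(x) \sigma(u), \quad x(0) = \xi$$
where
$$J(x) = x^T \left[ \int_0^1 \Pi(\vartheta x)\, d\vartheta \right] x$$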
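Equivalence of Constrained and Unconstrained Problems
Define
$$X = \{ \xi \in \Re^n \mid K(\xi)\xi \in U \}$$
$$O_\infty \triangleq \{ \xi \in \Re^n \mid x(t) \in X \ \forall\, t > 0 \}$$
where $K(x)$ is the unconstrained optimal feedback gain.
Lemma: Let $K(x)$ be bounded for all $x$ in a neighborhood of the origin. Then
$$0 \in \mathrm{int}\{U\} \ \Longrightarrow\ 0 \in \mathrm{int}\{O_\infty\}$$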
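Membership in the set of initial conditions whose unconstrained closed-loop trajectory never violates the input constraint can be tested approximately by forward simulation. The sketch below is an illustration under several assumptions: a hypothetical scalar plant $\dot{x} = -x - x^3 + u$ with unconstrained feedback $u = K(x)x$, $K(x) = -1$, a hypothetical constraint set $U = [-1, 1]$, explicit Euler integration, and a finite horizon, so a `True` result is only evidence of membership.

```python
def gain(x):
    """Hypothetical unconstrained feedback gain K(x); the input is K(x) * x."""
    return -1.0


def closed_loop(x):
    """Closed-loop vector field: f(x) + b*K(x)*x with f(x) = -x - x**3 and b = 1."""
    return -x - x ** 3 + gain(x) * x


def in_o_inf(xi, u_min=-1.0, u_max=1.0, dt=1e-3, horizon=20.0):
    """Approximate test: does K(x(t)) x(t) stay in [u_min, u_max] along the trajectory?"""
    x = xi
    for _ in range(int(horizon / dt)):
        u = gain(x) * x
        if not (u_min <= u <= u_max):
            return False
        x += dt * closed_loop(x)  # explicit Euler step
    return True


print(in_o_inf(0.5), in_o_inf(1.5))
```

In this hypothetical case the closed-loop state decays monotonically, so the test reduces to checking the constraint at the initial condition; for non-monotone trajectories the simulation is what detects a later violation.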
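Finite-Time Solution to Infinite-Time Problem
Lemma: Let (H1-H7) hold and assume the solution to $\Phi_T$ is unique. Then
(i) $\Phi_\infty \triangleq \lim_{T \to \infty} \Phi_T$ exists for all $\xi \in X_o$
(ii) $\Phi_T = \Phi_\infty$, $\Phi_T = \Phi_{T+\delta} \ \forall\, \delta \ge 0$, $u_T = u_{T+\delta} \ \forall\, \delta \ge 0$, $x_T(T) \in O_\infty$
(iii) If $\exists\, T > 0$ such that $x_T(T) \in O_\infty$, then $\Phi_T = \Phi$
Theorem: Let (H1-H7) hold and assume the solution to $\Phi_T$ is unique. Then
$$\forall\, \xi \in X_o \ \exists\, T \text{ s.t. } x_T(T) \in O_\infty$$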
Equivalence of Initial Condition Sets
Define the set of constrained stabilizable initial conditions as
$$X_{max} = \left\{ \xi \in \Re^n \ \middle|\ \exists\, u \text{ s.t. } \lim_{t \to \infty} \| x(t) \| = 0 \right\}$$
Theorem: Let (H1-H7) hold and assume the solution to $\Phi_T$ is unique. Then
$$X_{max} = X_o$$
Example 2
Consider the CSTR of Example 1 with $k_b = 0$:
$$\dot{x}_1 = -2k_a x_1^2 - \left(4k_a C_{Ass} + \frac{F}{V}\right) x_1 + \frac{F}{V} u$$
with objective function ($r = (F/V)^2$)
$$\int_0^\infty \left( x_1(t)^2 + r\, u(t)^2 \right) dt$$
and constraints
u(t) + C_{Ainss} ∈ [ 1] ∀ t ⇒ X_{max} = (-3.89, ∞)
Example 2 (continued)
The resulting SDRE is:
$$2\left(-2k_a x_1 - 2k_a C_{Ass}\right) \Pi(x_1) + 1 - \Pi^2(x_1) = 0$$
Which has solution
$$\Pi(x_1) = 2\left( k_a x_1 + k_a C_{Ass} + \sqrt{(k_a x_1 + k_a C_{Ass})^2 + 1} \right)$$
Example 2 (continued)
The Unconstrained Optimal Control Policy is
$$u(x) = -\frac{V}{F}\left( k_a x_1 + k_a C_{Ass} + \sqrt{(k_a x_1 + k_a C_{Ass})^2 + 1} \right) x_1$$
and $O_\infty = (-.4, \infty)$.
Finally, the Closed-Loop System is
$$\dot{x}_1 = -x_1 \sqrt{(k_a x_1 + k_a C_{Ass})^2 + 1}$$
Example 2: Optimal State Trajectories
[Figure: Concentration of A versus Time for three cases: No Constraints, Constrained Input, Nominal Input]
Example 2: Optimal Input Policies
[Figure: $C_{Ain}$ versus Time for two cases: No Constraints, Constrained Input]
Acknowledgments The Following Financial Support is Gratefully Acknowledged: NSF GER-95545 NSF CTS-9312489 DOEd P2A432-95