Topic #17: 16.31 Feedback Control Systems

Deterministic LQR — optimal control and the Riccati equation; weight selection

Fall 2007 16.31 17-1

Linear Quadratic Regulator (LQR)

We have seen the solution to the LQR problem, which results in linear full-state feedback control. We would now like to demonstrate from first principles that this is the optimal form of the control.

Deterministic Linear Quadratic Regulator

Plant:

    ẋ(t) = A(t)x(t) + B_u(t)u(t),   x(t_0) = x_0
    z(t) = C_z(t)x(t)

Cost:

    J_LQR = (1/2) ∫_{t_0}^{t_f} [ z^T(t) R_zz(t) z(t) + u^T(t) R_uu(t) u(t) ] dt + (1/2) x^T(t_f) P_{t_f} x(t_f)

where P_{t_f} ≥ 0, R_zz(t) > 0, and R_uu(t) > 0. Define R_xx(t) = C_z^T R_zz(t) C_z ≥ 0.

A(t) is a continuous function of time; B_u(t), C_z(t), R_zz(t), and R_uu(t) are piecewise continuous functions of time, and all are bounded.

Problem statement: find the input u(t) for t ∈ [t_0, t_f] that minimizes J_LQR. Note that u is not necessarily specified to be a feedback controller.

This is the most general form of the LQR problem; we rarely need this level of generality, and we often suppress the time dependence of the matrices.

The finite-horizon problem is important for short-duration control tasks, such as landing an aircraft.

The control design problem is a constrained optimization, with the constraints being the dynamics of the system. The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier, which results in an unconstrained optimization.

Example: minimize f(x, y) = x² + y² subject to the constraint c(x, y) = x + y + 2 = 0.

Figure 1: Optimization results

Clearly the unconstrained minimum is at x = y = 0.

To find the constrained minimum, form the augmented cost function

    L = f(x, y) + λ c(x, y) = x² + y² + λ(x + y + 2)

where λ is the Lagrange multiplier. Note that if the constraint is satisfied, then L = f.

The solution approach without constraints is to find the stationary point of f(x, y) (∂f/∂x = ∂f/∂y = 0). With constraints, we find the stationary points of L:

    ∂L/∂x = ∂L/∂y = ∂L/∂λ = 0

which gives

    ∂L/∂x = 2x + λ = 0
    ∂L/∂y = 2y + λ = 0
    ∂L/∂λ = x + y + 2 = 0

This gives 3 equations in 3 unknowns; solving yields x = y = −1.

The key point here is that, due to the constraint, the selections of x and y during the minimization are not independent. The Lagrange multiplier captures this dependency.
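The three stationarity conditions above are linear in (x, y, λ), so the constrained minimum can be checked numerically. A minimal sketch in plain NumPy, using nothing beyond the equations themselves:

```python
import numpy as np

# Stationary point of L = x^2 + y^2 + lam*(x + y + 2):
#   dL/dx   = 2x + lam     = 0
#   dL/dy   = 2y + lam     = 0
#   dL/dlam = x + y + 2    = 0
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, -2.0])
x, y, lam = np.linalg.solve(A, b)   # x = y = -1, lam = 2
```

The multiplier λ = 2 is exactly what couples x and y through the constraint.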

The LQR optimization follows the same path, but it is greatly complicated by the fact that the cost involves an integration over time.

To optimize the cost, we follow the procedure of adjoining the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:

    H = (1/2) ( x^T(t) R_xx x(t) + u^T(t) R_uu u(t) ) + p^T(t) ( A x(t) + B_u u(t) )

where p(t) ∈ R^{n×1} is called the adjoint variable or costate. It is the Lagrange multiplier in this problem.

The necessary conditions for optimality (see the 16.323 notes online) are:

1. ẋ(t) = (∂H/∂p)^T = A x(t) + B_u u(t), with x(t_0) = x_0
2. ṗ(t) = −(∂H/∂x)^T = −R_xx x(t) − A^T p(t), with p(t_f) = P_{t_f} x(t_f)
3. ∂H/∂u = 0  ⇒  R_uu u + B_u^T p(t) = 0, so u = −R_uu^{−1} B_u^T p(t)

We can check for a minimum by verifying that ∂²H/∂u² = R_uu ≥ 0, which holds since R_uu > 0 by assumption.

The key point is that we now have

    ẋ(t) = A x(t) + B_u u(t) = A x(t) − B_u R_uu^{−1} B_u^T p(t)

which can be combined with the equation for the adjoint variable

    ṗ(t) = −R_xx x(t) − A^T p(t) = −C_z^T R_zz C_z x(t) − A^T p(t)

to give

    [ ẋ(t) ]   [ A                 −B_u R_uu^{−1} B_u^T ] [ x(t) ]
    [ ṗ(t) ] = [ −C_z^T R_zz C_z   −A^T                 ] [ p(t) ]

which is called the Hamiltonian matrix. It describes the coupled closed-loop dynamics of both x(t) and p(t).

The dynamics of x(t) and p(t) are coupled, but x(t) is known initially while p(t) is known only at the terminal time, since p(t_f) = P_{t_f} x(t_f). This is a two-point boundary value problem, which is typically hard to solve.

However, in this case we can introduce a new matrix variable P(t), and it is relatively easy to show that:

1. p(t) = P(t) x(t)
2. It is relatively easy to find P(t).

How to proceed?

1. For the 2n-dimensional system

    [ ẋ(t) ]   [ A                 −B_u R_uu^{−1} B_u^T ] [ x(t) ]
    [ ṗ(t) ] = [ −C_z^T R_zz C_z   −A^T                 ] [ p(t) ]

define a transition matrix

    F(t_1, t_0) = [ F_11(t_1, t_0)   F_12(t_1, t_0) ]
                  [ F_21(t_1, t_0)   F_22(t_1, t_0) ]

and use it to relate x(t) and p(t) to x(t_f) and p(t_f):

    [ x(t) ]   [ F_11(t, t_f)   F_12(t, t_f) ] [ x(t_f) ]
    [ p(t) ] = [ F_21(t, t_f)   F_22(t, t_f) ] [ p(t_f) ]

so

    x(t) = F_11(t, t_f) x(t_f) + F_12(t, t_f) p(t_f)
         = [ F_11(t, t_f) + F_12(t, t_f) P_{t_f} ] x(t_f)

2. Now find p(t) in terms of x(t_f):

    p(t) = [ F_21(t, t_f) + F_22(t, t_f) P_{t_f} ] x(t_f)

3. Eliminate x(t_f) to get:

    p(t) = [ F_21(t, t_f) + F_22(t, t_f) P_{t_f} ] [ F_11(t, t_f) + F_12(t, t_f) P_{t_f} ]^{−1} x(t) ≡ P(t) x(t)

We now have p(t) = P(t) x(t), and must find the equation for P(t). Differentiating, and using the costate equation,

    ṗ(t) = −C_z^T R_zz C_z x(t) − A^T p(t) = Ṗ(t) x(t) + P(t) ẋ(t)

so

    −Ṗ(t) x(t) = C_z^T R_zz C_z x(t) + A^T p(t) + P(t) ẋ(t)
               = C_z^T R_zz C_z x(t) + A^T p(t) + P(t) ( A x(t) − B_u R_uu^{−1} B_u^T p(t) )
               = ( C_z^T R_zz C_z + P(t) A ) x(t) + ( A^T − P(t) B_u R_uu^{−1} B_u^T ) p(t)
               = ( C_z^T R_zz C_z + P(t) A ) x(t) + ( A^T − P(t) B_u R_uu^{−1} B_u^T ) P(t) x(t)
               = [ A^T P(t) + P(t) A + C_z^T R_zz C_z − P(t) B_u R_uu^{−1} B_u^T P(t) ] x(t)

This must hold for arbitrary x(t), so P(t) must satisfy

    −Ṗ(t) = A^T P(t) + P(t) A + C_z^T R_zz C_z − P(t) B_u R_uu^{−1} B_u^T P(t)

which is the matrix differential Riccati equation. Note that the optimal P(t) must be found by solving this equation backwards in time from t_f, with P(t_f) = P_{t_f}.
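The backward sweep can be sketched numerically with SciPy's ODE integrator. The double-integrator plant and weights below are illustrative assumptions, not data from the notes; with P(t_f) = 0 and a long enough horizon, P(t) near t = 0 settles to the steady-state Riccati solution (for this plant, P_ss = [[√3, 1], [1, √3]]):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Backward integration of the differential Riccati equation
#   -Pdot = A'P + PA + Rxx - P Bu Ruu^{-1} Bu' P,   P(tf) = Ptf
# Illustrative double-integrator data (an assumption for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
Bu = np.array([[0.0], [1.0]])
Rxx = np.eye(2)            # plays the role of Cz' Rzz Cz
Ruu = np.array([[1.0]])
Ptf = np.zeros((2, 2))
tf = 10.0

def riccati_rhs(t, p_flat):
    P = p_flat.reshape(2, 2)
    Pdot = -(A.T @ P + P @ A + Rxx - P @ Bu @ np.linalg.solve(Ruu, Bu.T) @ P)
    return Pdot.ravel()

# solve_ivp accepts a decreasing time span, so integrate from tf down to 0.
sol = solve_ivp(riccati_rhs, [tf, 0.0], Ptf.ravel(), rtol=1e-8, atol=1e-10)
P0 = sol.y[:, -1].reshape(2, 2)             # P(t) at t = 0
K0 = np.linalg.solve(Ruu, Bu.T @ P0)        # feedback gain at t = 0
```

For this example K0 ≈ [1, √3], matching the infinite-horizon LQR gain.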

The control gains are then

    u_opt(t) = −R_uu^{−1} B_u^T p(t) = −R_uu^{−1} B_u^T P(t) x(t) = −K(t) x(t)

So the optimal control inputs can in fact be computed using linear feedback on the full system state.

Key point: this controller works equally well for SISO and MIMO regulator designs.

Normally we are interested in problems with t_0 = 0 and t_f = ∞, in which case we can just use the steady-state value P_ss. If we assume LTI dynamics and let t_f → ∞, then at any finite time t we would expect the differential Riccati equation to settle down to a steady-state value (this requires that (A, B_u) be stabilizable and (A, C_z) detectable), which is the solution of

    A^T P + P A + R_xx − P B_u R_uu^{−1} B_u^T P = 0

called the (control) algebraic Riccati equation (CARE). The optimal steady-state feedback gains u(t) = −K_ss x(t) are found in Matlab using

    K_ss = lqr(A, B_u, C_z^T R_zz C_z, R_uu)
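In Python, the same steady-state gain can be obtained from SciPy's CARE solver, the open-source counterpart of Matlab's lqr. A sketch, again on illustrative double-integrator data (an assumption, not from the notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Steady-state LQR gain K_ss from the CARE:
#   A'P + PA + Rxx - P Bu Ruu^{-1} Bu' P = 0,  K_ss = Ruu^{-1} Bu' P
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
Bu = np.array([[0.0], [1.0]])
Rxx = np.eye(2)                          # plays the role of Cz' Rzz Cz
Ruu = np.array([[1.0]])

P_ss = solve_continuous_are(A, Bu, Rxx, Ruu)
K_ss = np.linalg.solve(Ruu, Bu.T @ P_ss)

# The closed-loop matrix A - Bu K_ss should have all eigenvalues in the LHP.
eigs = np.linalg.eigvals(A - Bu @ K_ss)
```

For this plant K_ss = [1, √3], and the closed-loop poles sit at the stable roots of s² + √3 s + 1 = 0.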

LQR Stability Margins

The LQR approach selects closed-loop poles that balance between system errors and the control effort.

- Easy design iteration using R_uu.
- Sometimes it is difficult to relate the desired transient response to the LQR cost function.
- A particularly nice thing about the LQR approach is that the designer is focused on system performance issues.

It turns out that the news is even better than that, because LQR exhibits very good stability margins.

Consider the LQR stability robustness for

    ẋ = A x + B u,   z = C_z x,   R_xx = C_z^T C_z

    J = (1/2) ∫_0^∞ ( z^T z + ρ u^T u ) dt

[Block diagram: u → B → (sI − A)^{−1} → x, with output z = C_z x and feedback u = −K x]

Study robustness in the frequency domain:

- Loop transfer function: L(s) = K(sI − A)^{−1} B
- Cost transfer function: C(s) = C_z(sI − A)^{−1} B

We can develop a relationship between the open-loop cost C(s) and the closed-loop return difference I + L(s), called the Kalman Frequency Domain Equality:

    [I + L(−s)]^T [I + L(s)] = 1 + (1/ρ) C^T(−s) C(s)

Sketch of proof:

- Start with u = −Kx, K = (1/ρ) B^T P, where 0 = −A^T P − P A − R_xx + (1/ρ) P B B^T P
- Introduce the Laplace variable s using ±sP:

    0 = (−sI − A^T) P + P (sI − A) − R_xx + (1/ρ) P B B^T P

- Pre-multiply by B^T(−sI − A^T)^{−1}, post-multiply by (sI − A)^{−1} B
- Complete the square to obtain the equality above.

The proof can handle the MIMO case, but look at the SISO case to develop further insight (s = jω):

    [I + L(−jω)][I + L(jω)] = (1 + L_r(ω) − jL_i(ω))(1 + L_r(ω) + jL_i(ω)) = |1 + L(jω)|²

and

    C^T(−jω) C(jω) = C_r² + C_i² = |C(jω)|² ≥ 0

Thus the KFE becomes

    |1 + L(jω)|² = 1 + (1/ρ) |C(jω)|² ≥ 1

Implication: the Nyquist plot of L(jω) will always be outside the unit circle centered at (−1, 0).

[Figure: Nyquist plot showing L_N(jω) with the return difference 1 + L_N(jω), staying outside the unit circle about (−1, 0)]

Great, but why is this so significant? Recall the SISO form of the Nyquist Stability Theorem:

    If the loop transfer function L(s) has P poles in the RHP of the s-plane (and lim_{s→∞} L(s) is a constant), then for closed-loop stability the locus of L(jω) for ω ∈ (−∞, ∞) must encircle the critical point (−1, 0) P times in the counterclockwise direction (Ogata, p. 528).
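The equality is easy to verify numerically for a SISO example. The sketch below uses an illustrative double-integrator plant (an assumption, not from the notes) and checks |1 + L(jω)|² = 1 + (1/ρ)|C(jω)|² at a few frequencies:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Numerical check of the Kalman Frequency Domain Equality on an
# illustrative SISO plant (double integrator, an assumption).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Cz = np.array([[1.0, 0.0]])
rho = 1.0

P = solve_continuous_are(A, B, Cz.T @ Cz, np.array([[rho]]))
K = (B.T @ P) / rho

I2 = np.eye(2)
for w in [0.1, 1.0, 3.0, 10.0]:
    G = np.linalg.inv(1j * w * I2 - A) @ B   # (jwI - A)^{-1} B
    L = (K @ G).item()                       # loop TF  K(jwI - A)^{-1} B
    C = (Cz @ G).item()                      # cost TF  Cz(jwI - A)^{-1} B
    lhs = abs(1 + L) ** 2
    rhs = 1 + abs(C) ** 2 / rho
    assert np.isclose(lhs, rhs), (w, lhs, rhs)
    assert lhs >= 1.0   # Nyquist locus outside the unit circle at (-1, 0)
```

Every sampled frequency satisfies both the equality and the |1 + L(jω)| ≥ 1 bound, consistent with the robustness claim.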

So we can directly prove stability from the Nyquist plot of L(s). But what if the model is wrong, and the actual loop transfer function L_A(s) turns out to be

    L_A(s) = L_N(s) [1 + Δ(s)],   |Δ(jω)| ≤ 1  ∀ω

We need to determine whether these perturbations to the loop transfer function will change the decision about closed-loop stability. We can do this directly by determining whether it is possible to change the number of encirclements of the critical point.

Figure 2: Perturbation to the LTF causing a change in the number of encirclements (stable open-loop case, showing L_N(jω) and L_A(jω))

The claim is that since the LTF L(jω) is guaranteed to be far from the critical point at all frequencies, LQR is VERY robust.

We can study this by introducing a modification β at the plant input, K(sI − A)^{−1} B β, where nominally β = 1, and we consider:

- a gain perturbation β ∈ R
- a phase perturbation β = e^{jφ}

In fact, it can be shown that:

- If the open-loop system is stable, then any β ∈ (0, ∞) yields a stable closed-loop system.
- For an unstable system, any β ∈ (1/2, ∞) yields a stable closed-loop system, so the gain margins are (1/2, ∞).
- The phase margins are at least ±60°.

Both of these margins are huge.
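A quick numerical illustration of the gain margin: scale the loop gain by β and check the eigenvalues of A − βBK. The double-integrator plant and sample gains below are illustrative assumptions; the claim is that every β above 1/2 leaves the loop stable.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Gain-margin check: is A - beta*B*K stable for gains beta in (1/2, inf)?
# Illustrative double-integrator plant (an assumption for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

def stable(beta):
    """True if all closed-loop eigenvalues of A - beta*B*K are in the LHP."""
    return bool(np.all(np.linalg.eigvals(A - beta * B @ K).real < 0))

# Sample a few gains across (1/2, inf); all should be stable.
betas = [0.6, 1.0, 5.0, 100.0]
results = {b: stable(b) for b in betas}
```

For this plant the closed-loop characteristic polynomial is s² + √3βs + β, which is stable for every β > 0, comfortably inside the guaranteed (1/2, ∞) interval.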

Figure 3: Example of the LTF for an open-loop stable system (Nyquist locus showing 1 + L at ω = 0)

Figure 4: Example loop transfer functions for an open-loop unstable system.

Weighting Matrix Selection

A good rule of thumb when selecting the weighting matrices R_xx and R_uu is to normalize the signals:

    R_xx = diag( α_1²/(x_1)²_max,  α_2²/(x_2)²_max,  …,  α_n²/(x_n)²_max )

    R_uu = ρ · diag( β_1²/(u_1)²_max,  β_2²/(u_2)²_max,  …,  β_m²/(u_m)²_max )

- The (x_i)_max and (u_i)_max represent the largest desired response/control input for that component of the state/actuator signal.
- The weights, normalized so that Σ_i α_i² = 1 and Σ_i β_i² = 1, add a further relative weighting on the various components of the state/control.
- ρ is used as the last relative weighting between the control and state penalties.

This gives us a relatively concrete way to discuss the relative sizes of R_xx and R_uu and their ratio R_xx/R_uu.
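This normalization (often known as Bryson's rule) is simple to wrap in a helper. The signal limits and relative weights below are illustrative assumptions, not values from the notes:

```python
import numpy as np

def make_weights(x_max, u_max, alpha, beta, rho):
    """Diagonal Rxx, Ruu normalized by the largest desired signal values,
    with sum(alpha_i^2) = 1 and sum(beta_i^2) = 1 enforced by rescaling."""
    alpha = np.asarray(alpha, float) / np.linalg.norm(alpha)
    beta = np.asarray(beta, float) / np.linalg.norm(beta)
    Rxx = np.diag(alpha**2 / np.asarray(x_max, float)**2)
    Ruu = rho * np.diag(beta**2 / np.asarray(u_max, float)**2)
    return Rxx, Ruu

# Illustrative limits: |x1| <= 1, |x2| <= 0.5, |u| <= 2, control weight rho = 0.1
Rxx, Ruu = make_weights(x_max=[1.0, 0.5], u_max=[2.0],
                        alpha=[1.0, 1.0], beta=[1.0], rho=0.1)
```

Shrinking a state's (x_i)_max raises its penalty, and ρ then trades the whole state penalty against the control penalty in one knob.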

LQR Summary

While we have large margins, be careful: changes to some of the parameters in A or B_u can produce a very large change in L(s).

Similar statements hold for the MIMO case, but they require singular-value analysis tools.

LQ Servo

Using the scaling N we can achieve zero steady-state error, but that approach is sensitive to accurate knowledge of all plant parameters. We can modify the LQ formulation to ensure that zero steady-state error is robustly achieved in response to a constant reference command.

This is done in the LQ formulation by augmenting integrators to the system output and then including a penalty on the integrated output in the cost.

If the relevant system output is y(t) = C_y x(t) and the reference is r(t), add extra states x_I(t), where

    ẋ_I(t) = e(t) = r(t) − y(t)

and then penalize both x(t) and x_I(t) in the cost.

If the state of the original system is x(t), then the dynamics are modified to be

    [ ẋ(t)   ]   [ A     0 ] [ x(t)   ]   [ B_u ]        [ 0 ]
    [ ẋ_I(t) ] = [ −C_y  0 ] [ x_I(t) ] + [ 0   ] u(t) + [ I ] r(t)

Define x̄(t) = [ x^T(t)  x_I^T(t) ]^T. The optimal feedback for the cost

    J = ∫_0^∞ [ x̄^T R_xx x̄ + u^T R_uu u ] dt

is of the form

    u(t) = −[ K  K_I ] [ x(t) ; x_I(t) ] = −K̄ x̄(t)
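The augmentation is mechanical to set up numerically. A sketch using SciPy's CARE solver on an illustrative double-integrator plant with assumed weights (neither is from the notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQ servo gain design on the integrator-augmented plant.
# Illustrative plant and weights (assumptions for this sketch).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
Bu = np.array([[0.0], [1.0]])
Cy = np.array([[1.0, 0.0]])
n, m = A.shape[0], Bu.shape[1]

# Augmented dynamics: xbar = [x; xI] with xI_dot = r - y
Abar = np.block([[A, np.zeros((n, 1))],
                 [-Cy, np.zeros((1, 1))]])
Bbar = np.vstack([Bu, np.zeros((1, m))])

Rxx = np.diag([1.0, 1.0, 10.0])   # last entry penalizes the integrator state
Ruu = np.array([[1.0]])

P = solve_continuous_are(Abar, Bbar, Rxx, Ruu)
Kbar = np.linalg.solve(Ruu, Bbar.T @ P)
K, KI = Kbar[:, :n], Kbar[:, n:]       # state gain and integrator gain

# Augmented closed loop Abar - Bbar*Kbar should be stable.
eigs = np.linalg.eigvals(Abar - Bbar @ Kbar)
```

Raising the last diagonal entry of Rxx pushes the integrator state (and hence the steady-state tracking error) toward zero more aggressively.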

Once we have used LQR to design the control gains K and K_I, we typically implement the controller using the architecture shown below.

Figure 5: Typical implementation of the LQ servo — the error r − y is integrated (1/s) to give x_I, which is fed back through K_I; the state x is fed back through K; the resulting u drives G(s) to produce y.

For the DOFB problem, the estimator and integrator dynamics are

    ẋ̂(t) = A x̂(t) + B u(t) + L( y(t) − ŷ(t) ) = (A − LC) x̂(t) + B u(t) + L y(t)
    ẋ_I(t) = r(t) − y(t)
    u(t) = −[ K  K_I ] [ x̂(t) ; x_I(t) ]

so with compensator state x̄(t) = [ x̂^T(t)  x_I^T(t) ]^T we have

    ẋ̄(t) = [ A − LC − BK   −BK_I ] x̄(t) + [ L  ] y(t) + [ 0 ] r(t)
            [ 0              0    ]         [ −I ]        [ I ]

    u(t) = −[ K  K_I ] x̄(t) = −K̄ x̄(t)

Writing A_c for the compensator dynamics matrix above, this gives the closed-loop system

    [ ẋ(t) ]   [ A           −B K̄ ] [ x(t) ]   [ 0      ]
    [ ẋ̄(t) ] = [ [L; −I] C    A_c  ] [ x̄(t) ] + [ [0; I] ] r(t)

    y(t) = [ C  0 ] [ x(t) ; x̄(t) ]

Example: G(s) = (s + 2) / [ (s + 1)(s² + s + 1) ]

Use LQR and its dual to find K and L using the dynamics augmented with an integrator. Implement as given before to get the following results.

Figure 6: LQ servo root locus — closed-loop poles, open-loop poles, compensator poles, compensator zeros

Figure 7: Step response of the LQ servo

LQ Servo Code (lqservo.m)

% LQ servo
G = tf([1 2], conv(conv([1 1], [1 1 1]), [1]));
[a, b, c, d] = ssdata(G); ns = size(a, 1);
klqr = lqr(a, b, c'*c, 1);
z = zeros(ns, 1);

Rxx1 = [c'*c z; z' 10];
Rxx2 = [c'*c z; z' 0.1];

abar = [a z; -c 0]; bbar = [b; 0];
kbar1 = lqr(abar, bbar, Rxx1, 1);
kbar2 = lqr(abar, bbar, Rxx2, 1);

l = lqr(a', c', b*b', 0.001); l = l';  % dual problem gives the estimator gain

ac1 = [a-l*c-b*kbar1(1:ns) -b*kbar1(end); z' 0];
ac2 = [a-l*c-b*kbar2(1:ns) -b*kbar2(end); z' 0];
by = [l; -1]; br = [l*0; 1];
cc1 = -[kbar1]; cc2 = -[kbar2];
Gc_uy = tf(ss(ac1, by, cc1, 0)); [nn, dd] = tfdata(Gc_uy, 'v');
Gc_ur = tf(ss(ac1, br, cc1, 0));

acl1 = [a -b*kbar1; [l; -1]*c ac1];
acl2 = [a -b*kbar2; [l; -1]*c ac2];
bcl = [b*0; br];
ccl = [c c*0 0];

figure(1);
[y2, t2] = step(ss(acl2, bcl, ccl, 0));
[y1, t1] = step(ss(acl1, bcl, ccl, 0), t2);
plot(t1, y1, t2, y2, 'LineWidth', 2)
legend('Int Penalty 10', 'Int Penalty 0.1', 'Location', 'SouthEast')
print -dpng -r300 lqservo1.png

figure(2);
rlocus(-G*Gc_uy)  % is a positive feedback loop now
hold on; plot(eig(acl1)+j*eps, 'bd', 'MarkerFaceColor', 'b'); hold off
hold on; plot(eig(a)+j*eps, 'mv', 'MarkerFaceColor', 'm'); hold off
hold on; plot(roots(nn)+j*eps, 'ro', 'MarkerFaceColor', 'r'); hold off
hold on; plot(roots(dd)+j*eps, 'rs', 'MarkerFaceColor', 'r'); hold off
axis([-7 2 -7 7])
print -dpng -r300 lqservo2.png