A game theoretic approach to moving horizon control

Sanjay Lall and Keith Glover

Abstract

A control law is constructed for a linear time varying system by solving a two player zero sum differential game on a moving horizon, the game being that which is used to construct an $H_\infty$ controller on a finite horizon. Conditions are given under which this controller results in a stable system and satisfies an infinite horizon $H_\infty$ norm bound. A risk sensitive formulation is used to provide a state estimator in the observation feedback case.

1 Introduction

The moving horizon control technique was developed in the 1970's, one of the first papers being that by Kleinman (Kleinman 1970). The techniques in his paper were later reformulated and generalized by various authors; see for example (Chen & Shaw 1982, Kwon & Pearson 1977, Kwon, Bruckstein & Kailath 1983). Kwon and Pearson (Kwon & Pearson 1977) proved stability for linear time varying systems using a controller which optimised a quadratic cost function integrated over a time interval from the current time $t$ to a fixed distance ahead $t + T$. The solution to this problem was given in terms of a Riccati differential equation, integrated backwards from time $t + T$ at each time $t$. For time varying systems, this gives a practical method of stabilising the system. Kwon et al. (Kwon et al. 1983) formulated a general procedure for the recursive update of such Riccati equations, which generalises easily to the indefinite Riccati equations in this paper. Further, this method allows stabilization of systems which are known only for the short term future. In dealing with long term variation of systems, there are at present two extreme options. The first is to assume a time invariant system and design a controller robust against time varying perturbations, which usually leads to very conservative controller design. The other is to design fully for time varying systems.
In the case of both linear quadratic and $H_\infty$ controllers, this requires the backwards integration from infinity of a Riccati equation (see for example (Tadmor 1993, Ravi, Nagpal & Khargonekar 1991)), and somewhat optimistically assumes knowledge of the system throughout future time. The moving horizon method can be viewed as a compromise between these two methods. The aim of the present paper is to extend the work of Tadmor (Tadmor 1992), in which a receding horizon controller is formulated with each finite horizon optimisation based upon an $H_\infty$ optimisation. It is hoped that the practical advantages of receding horizon control might be combined with the robustness advantages of $H_\infty$ control. Tadmor proved that this controller is stable and satisfies an infinite horizon norm bound. Although derived using different approaches to the $H_\infty$ problem, the controller described in this paper is essentially the same in the state feedback case, differing only in the terminal constraint used. In the observation feedback case, however, it is difficult to produce a natural formulation of the $H_\infty$ receding horizon control problem. For the finite horizon problem, typically the assumptions made are that the initial state at the beginning of each optimization interval is completely known, or is completely unknown. In the latter case it is treated as part of the disturbance and subject to a quadratic weighting (Uchida & Fujita 1992, Khargonekar, Nagpal & Poolla 1991). In the receding horizon problem, though, we have observations from before the optimization interval. We attempt to make use of prior observations using the theory of risk sensitive control, as developed by Whittle (Whittle 1990), which has been shown to be equivalent to $H_\infty$ control in many situations.

2 Preliminaries

We consider the system

$\dot x(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)u(t)$  (1)
$z(t) = C_1(t)x(t) + D_{12}(t)u(t)$
$y(t) = C_2(t)x(t) + D_{21}(t)w(t)$

where the coefficient matrices are bounded matrix valued functions of time. We use the following assumptions

$D_{12}'C_1 = 0, \quad D_{12}'D_{12} = I, \quad D_{21}B_1' = 0, \quad D_{21}D_{21}' = I$  (2)

which can be removed by suitable changes of variables; see for example (Doyle, Glover, Khargonekar & Francis 1989, Ravi et al. 1991, Tadmor 1993). Further, we shall assume there exists $\varepsilon > 0$ such that

$B_1(t)B_1(t)' \geq \varepsilon I \text{ for all } t \geq 0, \qquad C_1(t)'C_1(t) \geq \varepsilon I \text{ for all } t \geq 0$  (3)

These assumptions can be removed by assuming uniform complete controllability and observability of the system, and slightly modifying the following arguments.
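As a concrete illustration (not part of the original development), the state equation in (1) can be simulated by a simple forward-Euler scheme. The function below is a minimal sketch; the coefficient functions, signals, and step size in the usage example are placeholders chosen for illustration.

```python
import numpy as np

def simulate_ltv(A, B1, B2, x0, w, u, dt, steps):
    """Forward-Euler simulation of the linear time varying state equation
    x'(t) = A(t) x + B1(t) w + B2(t) u, where A, B1, B2 and the signals
    w, u are supplied as functions of time."""
    x = np.array(x0, dtype=float)
    t = 0.0
    traj = [x.copy()]
    for _ in range(steps):
        x = x + dt * (A(t) @ x + B1(t) @ w(t) + B2(t) @ u(t))
        t += dt
        traj.append(x.copy())
    return np.array(traj)

# Example: scalar system x' = -x with w = u = 0, so x(1) is close to exp(-1).
traj = simulate_ltv(lambda t: np.array([[-1.0]]),
                    lambda t: np.zeros((1, 1)), lambda t: np.zeros((1, 1)),
                    [1.0], lambda t: np.zeros(1), lambda t: np.zeros(1),
                    0.001, 1000)
```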
Let $L_2^m[t_i, t_f]$ be the Hilbert space of square integrable functions on $[t_i, t_f] \subset \mathbb{R}$ taking values in $\mathbb{R}^m$, and write $\mathbb{R}_+ = [0, \infty)$. Denote the usual norm on $L_2$ by $\|\cdot\|_2$, and if $F: L_2 \to L_2$ then denote by $\|F\|_\infty$ the norm on $F$ induced by $\|\cdot\|_2$. Then $x: \mathbb{R}_+ \to \mathbb{R}^n$, $u \in L_2^{m_2}[t_i, t_f]$, $w \in L_2^{m_1}[t_i, t_f]$, $y: \mathbb{R}_+ \to \mathbb{R}^p$. The signal $w$ represents all external inputs including disturbances, sensor noise, and commands; $z$ represents the error signal, $y$ is the measured variable and $u$ is the control input. We shall omit the space dimensions of signals in the sequel.
3 State feedback

3.1 Finite horizon $H_\infty$ control

Define the cost function

$J(u, w) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt$

We can regard $J$ as a function of either $L_2$ signals or feedback strategies. Let $M = \{\mu : [t_i, t_f] \times \mathbb{R}^n \to \mathbb{R}^{m_2}\}$ and $N = \{\nu : [t_i, t_f] \times \mathbb{R}^n \to \mathbb{R}^{m_1}\}$. These spaces are the strategy spaces, and we shall write strategies as $\mu, \nu$ to distinguish them from signals $u, w$. With the control $u(t) = \mu(t, x(t))$ in place, the operator $T_{zw}$ maps $w$ to $z$. If $x(t_i) = 0$, then we can define

$\|T_{zw}\|_\infty = \sup_{w \in L_2[t_i, t_f]} \|T_{zw} w\|_2 / \|w\|_2$

The finite interval $H_\infty$ problem is to find a linear causal control such that $\|T_{zw}\|_\infty < \gamma$ for a given $\gamma > 0$. Now $\|T_{zw}\|_\infty < \gamma$ if and only if there exists $\varepsilon > 0$ such that

$J(u, w) \leq -\varepsilon \|w\|_2^2 \quad \forall w \in L_2[t_i, t_f]$  (4)

We formulate the finite time differential game

$\inf_{\mu \in M} \sup_{\nu \in N} J(\mu, \nu)$  (5)

which is a zero sum game, where $u$ is the minimizing player and $w$ is the maximizing player. The designer chooses $u(t) = \mu(t, x(t))$ such that even if nature is malicious and chooses the worst case $w$, equation (4) is satisfied. If the extremizing operators in equation (5) are interchangeable, then we call the optimal $u$ and worst case $w$ saddle point strategies. Theorem 1 gives sufficient conditions for the existence of a saddle point solution. A saddle point solution $u(t) = \mu^*(t, x(t))$, $w(t) = \nu^*(t, x(t))$ will satisfy

$J(\mu^*, w) \leq J(\mu^*, \nu^*) \leq J(u, \nu^*) \quad \forall u, w \in L_2[t_i, t_f]$

Theorem 1 (Basar & Olsder 1982, Limebeer, Anderson, Khargonekar & Green 1992) Let

$J(u, w) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt + x(t_f)'F x(t_f)$

where $F > 0$ is some weighting matrix. If

$-\dot P = PA + A'P + C_1'C_1 - P(B_2 B_2' - \gamma^{-2} B_1 B_1')P, \quad P(t_f) = F$  (6)

has a unique symmetric bounded solution on $t \in [t_i, t_f]$, then

$J(u, w) = x(t_i)'P(t_i)x(t_i) + \int_{t_i}^{t_f} |u(\sigma) + B_2(\sigma)'P(\sigma)x(\sigma)|^2\, d\sigma - \gamma^2 \int_{t_i}^{t_f} |w(\sigma) - \gamma^{-2} B_1(\sigma)'P(\sigma)x(\sigma)|^2\, d\sigma$
Further, with full state information available to $u(t) = \mu(t, x(t))$ and $w(t) = \nu(t, x(t))$, there exists a unique feedback saddle point solution given by

$\mu^*(t, x(t)) = -B_2(t)'P(t)x(t)$
$\nu^*(t, x(t)) = \gamma^{-2} B_1(t)'P(t)x(t)$

and the saddle point value of the game is given by

$J(\mu^*, \nu^*) = x(t_i)'P(t_i)x(t_i)$

3.2 The moving horizon differential game

We now take $u(t) = \mu^*(t, x(t))$ at time $t$, where $\mu^*$ is the saddle point strategy for player $u$ for

$\inf_{\mu \in M_t} \sup_{\nu \in N_t} \left\{ \int_t^{t+T} \left\{ z(s)'z(s) - \gamma^2 w(s)'w(s) \right\} ds + x(t+T)'F(t+T)x(t+T) \right\}$

where $M_t = \{\mu : [t, t+T] \times \mathbb{R}^n \to \mathbb{R}^{m_2}\}$, $N_t = \{\nu : [t, t+T] \times \mathbb{R}^n \to \mathbb{R}^{m_1}\}$, and $w(t) = \nu(t, x(t))$. The idea behind this is to try to gain both the robustness benefits of the $H_\infty$ formulation and the practical control of time varying systems associated with receding horizon control. At each time $t$ we try to solve the finite horizon $H_\infty$ problem on the optimisation interval $[t, t+T]$ to find a feedback strategy $u(t) = \mu^*(t, x(t))$. In practice, this would be implemented for a finite time $\delta > 0$, after which a new calculation on the interval $[t + \delta, t + \delta + T]$ would be performed. In this paper we idealise this situation and assume that the controller is updated continuously. For each separate $H_\infty$ optimisation, the initial state may not be zero. Therefore, an induced norm interpretation on each finite horizon is not possible, since the zero input $u$ gives a nonzero output $z$ on the interval $[t_i, t_f]$, and thus the induced norm is undefined. We also have a penalty on the terminal state, $F(t) > 0$. This is often incorporated into finite horizon problems to allow for compromises between the norm of $z$ and the size of $x(t_f)$. In the moving horizon case, we shall show that a sufficiently large $F$ will cause the closed loop to be stable. In order to find $u(t)$ for the moving horizon problem we must integrate equation (6) backwards from the boundary condition at $t + T$ to time $t$. We can rewrite this as a partial differential equation as follows

$-\frac{\partial P(\sigma, \tau)}{\partial \sigma} = P(\sigma,\tau)A(\sigma) + A(\sigma)'P(\sigma,\tau) - P(\sigma,\tau)(B_2(\sigma)B_2(\sigma)' - \gamma^{-2}B_1(\sigma)B_1(\sigma)')P(\sigma,\tau) + C_1(\sigma)'C_1(\sigma), \quad P(t, t) = F(t) \text{ for all } t$

The moving horizon saddle point controller is given by

$u(t) = -B_2(t)'P(t, t+T)x(t)$
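A minimal numerical sketch of this controller (not in the original): integrate the game Riccati equation (6) backwards over $[t, t+T]$ and form the gain $B_2(t)'P(t, t+T)$. The fixed-step Euler scheme and the function names are illustrative assumptions, not the forward recursion derived later in Section 4.

```python
import numpy as np

def mh_gain(A, B1, B2, C1, F, gamma, t, T, dt=1e-3):
    """Moving horizon H-infinity gain: integrate the game Riccati
    equation backwards over [t, t+T] from P(t+T) = F(t+T), then
    return K = B2' P(t, t+T), so that u(t) = -K x(t).  All matrix
    arguments are functions of time."""
    P = F(t + T).copy()
    s = t + T
    for _ in range(int(round(T / dt))):
        N = B2(s) @ B2(s).T - gamma**-2 * B1(s) @ B1(s).T
        Pdot = -(P @ A(s) + A(s).T @ P + C1(s).T @ C1(s) - P @ N @ P)
        P = P - dt * Pdot          # step backwards in time
        s -= dt
    return B2(t).T @ P

# Scalar check: A = 0, B2 = C1 = 1, B1 = 0, F = 0 gives -P' = 1 - P^2,
# so P(t, t+T) = tanh(T) and the gain is tanh(T).
K = mh_gain(lambda s: np.zeros((1, 1)), lambda s: np.zeros((1, 1)),
            lambda s: np.ones((1, 1)), lambda s: np.ones((1, 1)),
            lambda s: np.zeros((1, 1)), 1.0, 0.0, 1.0)
```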
In the case when all the system matrices $A, B_1, B_2, C_1, D_{12}$ are time invariant, $P(t, t+T)$ is independent of $t$ also.

3.3 Stability

Tadmor (Tadmor 1992) proves stability for the above controller when $F = \infty$. In this case, we can consider the Riccati equation for $P^{-1}$, and the interpretation on the finite horizon is that the controller is subject to the constraint $x(t+T) = 0$. The linear quadratic problem with a similar terminal weight was considered by Kwon, Bruckstein and Kailath (Kwon et al. 1983). We use a modified form of their methods to handle the terminal weight. Before the stability proof, we need the following lemmas, in which we will write $P(\sigma, \tau, F)$ for the solution of the Riccati equation with boundary condition $P(\tau, \tau, F) = F$, with $\tau$ fixed. The following results are derived in a slightly different way in (Kwon et al. 1983), and various similar results can be found in (Anderson & Moore 1989, Reid 1970, Bucy 1967, Ran & Vreugdenhil 1988, Basar & Bernhard 1991).

Lemma 2 Let $P$ satisfy the following Riccati equation on $[t_i, t_f]$, with $F \geq 0$ and $Q \geq 0$:

$-\dot P = PA + A'P - P(B_2 B_2' - \gamma^{-2} B_1 B_1')P + Q, \quad P(t_f) = F$

If $P$ exists, and either $F > 0$ or $Q > 0$, then $P(t) > 0$ for all $t \in [t_i, t_f]$.

Lemma 3 Suppose $F$ satisfies

$\dot F + A'F + FA - F(B_2 B_2' - \gamma^{-2} B_1 B_1')F + C_1'C_1 \leq 0, \quad F(t) > 0, \quad t \geq 0$  (7)

and $P$ satisfies

$-\dot P = A'P + PA - P(B_2 B_2' - \gamma^{-2} B_1 B_1')P + C_1'C_1$

Then, if both sides exist,

$P(\sigma, \tau_1, F(\tau_1)) \leq P(\sigma, \tau_2, F(\tau_2)), \quad \tau_1 \geq \tau_2$

Proof We know (see (Basar & Bernhard 1991))

$x(t_i)'P(t_i, t_f, F(t_f))x(t_i) = \min_u \max_w \int_{t_i}^{t_f} \left( x(s)'C_1(s)'C_1(s)x(s) + u(s)'u(s) - \gamma^2 w(s)'w(s) \right) ds + x(t_f)'F(t_f)x(t_f)$  (8)

(i) If $\tilde C_1(t)'\tilde C_1(t) \geq C_1(t)'C_1(t)$ for all $t \in [t_i, t_f]$, then replacing $C_1$ by $\tilde C_1$ results in a new solution $\tilde P$ satisfying $\tilde P(t) \geq P(t)$ for all $t \in [t_i, t_f]$.

(ii) Similarly, if $\tilde F(t) \geq F(t)$ for all $t \in [t_i, t_f]$, then $\tilde P(t) \geq P(t)$ for all $t \in [t_i, t_f]$.
With either of these changes, equation (8) becomes

$x(t_i)'\tilde P(t_i, t_f, F(t_f))x(t_i) = \min_u \max_w \left\{ \int_{t_i}^{t_f} \left( x(s)'C_1(s)'C_1(s)x(s) + u(s)'u(s) - \gamma^2 w(s)'w(s) \right) ds + x(t_f)'F(t_f)x(t_f) \right\} + h(u, w)$

where $h(u, w) \geq 0$ for all $u, w \in L_2$. We can write equation (7) as

$-\dot F = A'F + FA - F(B_2 B_2' - \gamma^{-2} B_1 B_1')F + (C_1'C_1 + \Delta)$

where $\Delta \geq 0$. By definition, $P(\tau_2, \tau_2, F(\tau_2)) = F(\tau_2)$, so, since the Riccati equations for $P$ and $F$ are equal at $t = \tau_2$, and that for $F$ has a bigger constant term,

$F(t) \geq P(t, \tau_2, F(\tau_2))$ for all $t \leq \tau_2$

In particular, at $t = \tau_1 \leq \tau_2$,

$P(\tau_1, \tau_2, F(\tau_2)) \leq P(\tau_1, \tau_1, F(\tau_1))$

so at $t = \tau_1$ the LHS is smaller than the RHS. We can take this point as a boundary condition, and, using (ii) above,

$P(t, \tau_2, F(\tau_2)) \leq P(t, \tau_1, F(\tau_1))$ for all $t \leq \tau_1$

Theorem 4 For the system described by equations (1)-(3), suppose that at each time $t$ the saddle point solution to the differential game on $[t, t+T]$ exists, that $F$ satisfies equation (7), and that there exist constants $\alpha_1 > 0$ and $\alpha_2 > 0$ such that

$\alpha_1 I \leq P(t, t+T, F(t+T)) \leq \alpha_2 I$

Then the control $u(t) = -B_2(t)'P(t, t+T, F(t+T))x(t)$ is stabilising.

Proof Let $\bar A(t) = A(t) - B_2(t)B_2(t)'P(t, t+T, F(t+T))$. Then, with the controller in place, $\dot x = \bar A(t)x(t) + B_1(t)w(t)$. Let $V(t) = x(t)'P(t, t+T, F(t+T))x(t)$. Then, with $w = 0$,

$\dot V = x'\left( -P B_2 B_2' P - \gamma^{-2} P B_1 B_1' P - C_1'C_1 + \left.\frac{\partial P(t, \tau, F(\tau))}{\partial \tau}\right|_{\tau = t+T} \right) x$

Using Lemma 3,

$\left.\frac{\partial P(t, \tau, F(\tau))}{\partial \tau}\right|_{\tau = t+T} \leq 0$
Since $C_1'C_1 \geq \varepsilon I$, we have $\dot V(t) \leq -\varepsilon x(t)'x(t)$. Then, since by assumption $P$ is bounded above and below, $V$ is a Lyapunov function for the closed loop system, and hence the system is globally exponentially stable.

If we can prove boundedness of $P(t, t+T, F(t+T))$, then we shall have a sufficient condition for the controller to be stabilising. Tadmor (Tadmor 1992) proves that $P(t, t+T, \infty)$ is bounded above and below. Slight modification of his proof shows:

Theorem 5 Let $G_i(t_0, t_1) = \int_{t_0}^{t_1} \Phi(t_0, t)B_i(t)B_i(t)'\Phi(t_0, t)'\, dt$, where $\Phi$ is the state transition matrix of $A$. If $(A, B_2)$ is uniformly controllable and $G_2(t_0, t_1)^{-1}G_1(t_0, t_1)$ is uniformly bounded for all $t_1 - t_0 \leq T$, then there exists $\gamma > 0$ such that $P(t, t+T, F(t+T))$ is bounded above for all $t \geq 0$.

3.4 Infinite horizon norm bounds

Following (Tadmor 1992), with the controller $u(t) = -B_2(t)'P(t, t+T, F(t+T))x(t)$ and $w \in L_2[t_0, \infty)$, let $\hat w(t) = w(t) - \gamma^{-2}B_1(t)'P(t, t+T, F(t+T))x(t)$. Then we know

$\int_{t_0}^\infty \frac{d}{dt}\left\{ x(t)'P(t, t+T, F(t+T))x(t) \right\} dt = \int_{t_0}^\infty \left( -z'z + \gamma^2 w'w - \gamma^2|\hat w|^2 + x'\left(\left.\frac{\partial P(t, \tau, F(\tau))}{\partial \tau}\right|_{\tau = t+T}\right)x \right) dt$

If the closed loop is stable, then

$\int_{t_0}^\infty \frac{d}{dt}\left\{ x(t)'P(t, t+T, F(t+T))x(t) \right\} dt = -x(t_0)'P(t_0, t_0+T, F(t_0+T))x(t_0)$

Then, if $x(t_0) = 0$,

$\|z\|_2^2 - \gamma^2\|w\|_2^2 = -\gamma^2\|\hat w\|_2^2 + \int_{t_0}^\infty x'\left(\left.\frac{\partial P(t, \tau, F(\tau))}{\partial \tau}\right|_{\tau = t+T}\right)x\, dt$

Using Lemma 3,

$\|z\|_2^2 - \gamma^2\|w\|_2^2 \leq 0$

Hence $\|T_{zw}\|_\infty \leq \gamma$.

4 Calculation of $P(\sigma, \tau, F(\tau))$

In order to implement the moving horizon controller, we need to calculate the value of $P(t, t+T, F(t+T))$ for all $t \geq 0$. This requires the solution of a Riccati differential equation over the interval $[t, t+T]$ for each time $t$, given the boundary condition $P(t+T, t+T, F(t+T)) = F(t+T)$ for all $t \geq 0$. Kwon, Bruckstein and Kailath (Kwon et al. 1983) applied results from scattering theory in their solution to the quadratic problem, to give a forwards differential equation for $P(t, t+T, F(t+T))$. Here we state the analogous solution for the indefinite Riccati equation. See (Verghese, Friedlander & Kailath 1980) for details of the derivation of these formulae.
In this section, $P(\sigma, \tau)$ is used to mean $P(\sigma, \tau, 0)$. Define for convenience

$N(\sigma) = B_2(\sigma)B_2(\sigma)' - \gamma^{-2}B_1(\sigma)B_1(\sigma)'$
Let

$S(\sigma, \tau) = \begin{bmatrix} \Phi(\sigma,\tau) & L(\sigma,\tau) \\ P(\sigma,\tau) & \Psi(\sigma,\tau) \end{bmatrix}$

Consider the system of equations

$\frac{\partial \Phi(\sigma,\tau)}{\partial \sigma} = \Phi(\sigma,\tau)[N(\sigma)P(\sigma,\tau) - A(\sigma)]$  (9)

$\frac{\partial \Psi(\sigma,\tau)}{\partial \sigma} = [P(\sigma,\tau)N(\sigma) - A(\sigma)']\Psi(\sigma,\tau)$  (10)

$\frac{\partial L(\sigma,\tau)}{\partial \sigma} = \Phi(\sigma,\tau)N(\sigma)\Psi(\sigma,\tau)$  (11)

$-\frac{\partial P(\sigma,\tau)}{\partial \sigma} = A(\sigma)'P(\sigma,\tau) + P(\sigma,\tau)A(\sigma) - P(\sigma,\tau)N(\sigma)P(\sigma,\tau) + C_1(\sigma)'C_1(\sigma)$  (12)

with boundary conditions

$S(\tau, \tau) = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}$

The following equations

$\frac{\partial \Phi(\sigma,\tau)}{\partial \tau} = [A(\tau) + L(\sigma,\tau)C_1(\tau)'C_1(\tau)]\Phi(\sigma,\tau)$  (13)

$\frac{\partial \Psi(\sigma,\tau)}{\partial \tau} = \Psi(\sigma,\tau)[A(\tau)' + C_1(\tau)'C_1(\tau)L(\sigma,\tau)]$  (14)

$\frac{\partial L(\sigma,\tau)}{\partial \tau} = A(\tau)L(\sigma,\tau) + L(\sigma,\tau)A(\tau)' + L(\sigma,\tau)C_1(\tau)'C_1(\tau)L(\sigma,\tau) - N(\tau)$  (15)

$\frac{\partial P(\sigma,\tau)}{\partial \tau} = \Phi(\sigma,\tau)C_1(\tau)'C_1(\tau)\Psi(\sigma,\tau)$  (16)

give the partial derivatives with respect to $\tau$. It is straightforward to verify this by taking both sets of second partial derivatives and showing that they are equal, and by verifying that along the boundary the total derivative

$\frac{d}{dt}S(t, t) = \left.\frac{\partial S(\sigma, t)}{\partial \sigma}\right|_{\sigma = t} + \left.\frac{\partial S(t, \tau)}{\partial \tau}\right|_{\tau = t}$

is zero, since $\Phi, \Psi, L, P$ are constant along the boundary. We can find a differential equation for $P(t, t+T)$ in terms of $t$, since

$\frac{d}{dt}S(t, t+T) = \left.\frac{\partial S(\sigma, t+T)}{\partial \sigma}\right|_{\sigma = t} + \left.\frac{\partial S(t, \tau)}{\partial \tau}\right|_{\tau = t+T}$

Hence

$\frac{d\Phi}{dt} = \Phi[N(t)P - A(t)] + [A(t+T) + L\, C_1(t+T)'C_1(t+T)]\Phi$  (17)

$\frac{d\Psi}{dt} = [P N(t) - A(t)']\Psi + \Psi[A(t+T)' + C_1(t+T)'C_1(t+T)L]$  (18)

$\frac{dL}{dt} = \Phi N(t)\Psi + A(t+T)L + L A(t+T)' + L\, C_1(t+T)'C_1(t+T)L - N(t+T)$  (19)

$\frac{dP}{dt} = -A(t)'P - P A(t) + P N(t)P - C_1(t)'C_1(t) + \Phi\, C_1(t+T)'C_1(t+T)\Psi$  (20)
where $\Phi = \Phi(t, t+T)$ etc. This gives a differential equation for $P(t, t+T, 0)$. The boundary conditions for equations (17)-(20) are given by solving equations (9)-(12) backwards in time from $\sigma = \tau = T$ to $\sigma = 0$, $\tau = T$. This backwards solution has to be performed only once, when $t = 0$. We now make use of the following identity, which can be verified in a straightforward manner:

$P(\sigma, \tau, F(\tau)) = P(\sigma, \tau) + \Phi(\sigma,\tau)F(\tau)[I - L(\sigma,\tau)F(\tau)]^{-1}\Psi(\sigma,\tau)$

Then

$P(t, t+T, F(t+T)) = P + \Phi F(t+T)[I - L F(t+T)]^{-1}\Psi$

These equations are simpler than they appear, since $\Psi = \Phi'$. They give us a quadratic differential equation in matrices of size $n$, which is integrated forwards to give the controller.

5 State estimation and output feedback

In this section we consider the problem of optimally estimating the state. Let

$J(t_i, t_f) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt$

Let $u_{[0,t]} = \{ u(s) : 0 \leq s \leq t \}$, and let $t \in [t_i, t_f]$ be the current time. We then wish to find the feedback saddle point strategy for the $u$ player given the information $u_{[0,t]}$ and $y_{[0,t]}$. For the moving horizon controller we only need to know this strategy at time $t = t_i$. In recent years $H_\infty$ theory has been furnished with a certainty equivalence principle in several different formulations (Doyle et al. 1989, Whittle 1990, Basar & Bernhard 1991). For the finite horizon time varying $H_\infty$ problem, Basar and Bernhard (Basar & Bernhard 1991, p. 119) give a derivation of the optimal estimator for output feedback. Stated in simple terms, their result indicates that one should look, at each instant of time $t$, for the worst possible disturbance $w_{[t_i,t]}$ compatible with the information available at that time about $u_{[t_i,t]}$ and $y_{[t_i,t]}$. This generates a corresponding worst possible state trajectory, which should be used by the feedback law as if it were the actual state. The worst case approach to our problem is to extend the maximisation problem into the past, and maximise the cost function with respect to the unobservables before $t_i$, back to time $t = 0$.
This would require us to assume that, for times $0 < t < t_i$, $w$ was trying to maximize the cost function on $[t_i, t_f]$. However, if we do this then $w$ can make $J$ infinite, since there is no disincentive on the size of $w$ outside $[t_i, t_f]$. This means that we cannot use this principle to estimate the state without further assumptions. In fact, given the values of past observables, we can only draw conclusions about the current value of the state if we know some information about $w$ (since we have the nondegeneracy assumption that $D_{21}D_{21}' = I$). In the finite horizon problem, we assume that the past player $w$ was trying to maximize the cost function over the entire finite interval. One possible alternative approach is to assume that $w$ is trying to maximize $J(0, t_f)$, while $u$ is trying to minimize $J(t_i, t_f)$. For $u$, this is in fact equivalent to playing a moving horizon game in which only the endpoint moves.
Alternatively, it is possible to consider a stochastic formulation. In this case we should like to assume that outside the optimisation interval, $w$ is accurately modelled by white noise. In order to do this, we need a stochastic model for $H_\infty$ control.

5.1 Risk sensitive control

There is a strong connection between the formulation of risk sensitive control developed by Whittle (Whittle 1990) and $H_\infty$ control. This is specified in (Glover & Doyle 1988). We consider here the system described by equations (1)-(3). However, we now change our assumptions so that $w$ is white noise with covariance function $\delta(t)I$, the identity matrix multiplied by a delta function. Define the cost function

$C = \int_{t_i}^{t_f} z(t)'z(t)\, dt + x(t_f)'F(t_f)x(t_f)$

where $x(t_i)$ is normally distributed with mean $\hat x_0$ and covariance matrix $V$, and $F(t) > 0$. Let

$L_\mu = 2\gamma^2 \log\left( E_\mu\left( \exp\left( C / 2\gamma^2 \right) \right) \right)$

where in this case $\mu(t, y(t))$ is a policy, and $E_\mu$ indicates the expectation when the input is $u(t) = \mu(t, y(t))$. The risk sensitive control problem is

$\min_{\mu \in M} L_\mu$

where $M$ is the space of all policy functions. Glover and Doyle (Glover & Doyle 1988) show that this is equivalent to an $H_\infty$ problem. However, in this case, a sufficient statistic for the initial conditions is given by the mean and variance of the initial state. If the system has been running previous to the implementation of a risk sensitive controller, these can be provided by a Kalman filter. The solution to the risk sensitive control problem is given by Whittle (Whittle 1990):

Theorem 6 Suppose there exists $X_\infty \geq 0$ satisfying

$-\dot X_\infty = A'X_\infty + X_\infty A - X_\infty(B_2 B_2' - \gamma^{-2} B_1 B_1')X_\infty + C_1'C_1, \quad X_\infty(t_f) = F(t_f)$

and $Y_\infty$ satisfying

$\dot Y_\infty = A Y_\infty + Y_\infty A' - Y_\infty(C_2'C_2 - \gamma^{-2} C_1'C_1)Y_\infty + B_1 B_1', \quad Y_\infty(t_i) = V$

such that $Y_\infty(t)^{-1} - \gamma^{-2}X_\infty(t) > 0$. Let

$\dot{\hat x} = A\hat x + B_2 u + Y_\infty C_2'(y - C_2 \hat x) + \gamma^{-2} Y_\infty C_1'C_1 \hat x, \quad \hat x(t_i) = \hat x_0$

and

$\bar x = (I - \gamma^{-2} Y_\infty X_\infty)^{-1}\hat x$

Then the optimal risk sensitive feedback control law is given by

$u = -B_2' X_\infty \bar x$
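As a small numerical aside (not in the original), the coupling condition of Theorem 6 can be checked before forming the certainty equivalence state. The helper below is an illustrative sketch; its name and the matrices in the usage example are assumptions.

```python
import numpy as np

def ce_state(xhat, X, Y, gamma):
    """Form the certainty equivalence state xbar = (I - gamma^-2 Y X)^-1 xhat,
    after checking the coupling condition Y^-1 - gamma^-2 X > 0.
    X and Y are symmetric matrices with Y > 0."""
    n = len(xhat)
    M = np.linalg.inv(Y) - gamma**-2 * X
    # symmetrise before the eigenvalue test to guard against round-off
    if np.min(np.linalg.eigvalsh((M + M.T) / 2)) <= 0:
        raise ValueError("coupling condition Y^-1 - gamma^-2 X > 0 fails")
    return np.linalg.solve(np.eye(n) - gamma**-2 * Y @ X, xhat)

# Scalar example: Y = X = 1, gamma = 2 gives xbar = xhat / (1 - 1/4).
xbar = ce_state(np.array([3.0]), np.array([[1.0]]), np.array([[1.0]]), 2.0)
```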
This has exactly the form of an $H_\infty$ controller (see, for example, Basar and Bernhard (Basar & Bernhard 1991, pp. 14-16)) with unknown weighted initial state. However, it is important to note that the assumptions are very different, and that $w$ is white noise. In the moving horizon case, we only need $u(t)$ at $t = t_i$. At this time, $\hat x(t_i)$ and $Y_\infty(t_i)$ are the mean and variance of the state. Note that the moving horizon objective at time $t_i$ is the expectation of $L_\mu$ over the current state variability and future disturbances. The information required, that is the mean and variance of the current state, is precisely that provided by a Kalman filter, as follows.

Theorem 7 Suppose $x(0)$ has mean $\hat x_0$ and covariance matrix $V$. Let $Y > 0$ be the solution to

$\dot Y = AY + YA' - Y C_2'C_2 Y + B_1 B_1', \quad Y(t_i) = V$

and let $\hat x$ satisfy

$\dot{\hat x} = A\hat x + B_2 u + Y C_2'(y - C_2 \hat x), \quad \hat x(t_i) = \hat x_0$

Then $\hat x(t)$ is the mean of $x$ and $Y(t)$ is the variance of $x$ at time $t$.

Hence we propose a moving horizon controller of the form

$\dot Y = AY + YA' - Y C_2'C_2 Y + B_1 B_1', \quad Y(t_i) = V$  (21)

$\dot{\hat x} = A\hat x + B_2 u + Y C_2'(y - C_2 \hat x), \quad \hat x(t_i) = \hat x_0$  (22)

$-\frac{\partial X_\infty(\sigma,\tau)}{\partial \sigma} = X_\infty(\sigma,\tau)A(\sigma) + A(\sigma)'X_\infty(\sigma,\tau) - X_\infty(\sigma,\tau)(B_2(\sigma)B_2(\sigma)' - \gamma^{-2}B_1(\sigma)B_1(\sigma)')X_\infty(\sigma,\tau) + C_1(\sigma)'C_1(\sigma), \quad X_\infty(t, t) = F(t) \text{ for all } t$  (23)

Then the controller is given by

$u(t) = -B_2(t)'X_\infty(t, t+T)\left(I - \gamma^{-2}Y(t)X_\infty(t, t+T)\right)^{-1}\hat x(t)$

Let

$Z_\infty(t) = (I - \gamma^{-2}Y(t)X_\infty(t, t+T))^{-1}Y(t)$

Then we can rewrite equations (21)-(23) in terms of $\bar x = (I - \gamma^{-2}Y X_\infty)^{-1}\hat x$ as

$\dot{\bar x} = (A - (B_2 B_2' - \gamma^{-2}B_1 B_1')X_\infty)\bar x - Z_\infty(C_2'C_2 + \gamma^{-2}R)\bar x + Z_\infty C_2' y$

where

$u(t) = -B_2' X_\infty(t, t+T)\bar x(t)$

$R(t) = C_1(t)'C_1(t) - \left.\frac{\partial X_\infty(t, \tau)}{\partial \tau}\right|_{\tau = t+T}$

Theorem 8 If $X_\infty$ and $Z_\infty$ satisfy

$\alpha_1 I \leq X_\infty(t, t+T) \leq \alpha_2 I, \qquad \alpha_1 I \leq Z_\infty(t) \leq \alpha_2 I$

for fixed $\alpha_1, \alpha_2 > 0$ and for all $t \geq 0$, then the above controller is globally exponentially stable.
Proof Let $e = x - \bar x$. Then, with $w = 0$,

$\frac{d}{dt}\begin{bmatrix} x \\ e \end{bmatrix} = \bar F \begin{bmatrix} x \\ e \end{bmatrix}$

where

$\bar F = \begin{bmatrix} A - B_2 B_2' X_\infty & B_2 B_2' X_\infty \\ -\gamma^{-2}B_1 B_1' X_\infty + \gamma^{-2}Z_\infty R & A - Z_\infty(C_2'C_2 + \gamma^{-2}R) + \gamma^{-2}B_1 B_1' X_\infty \end{bmatrix}$

Let

$\Pi = \begin{bmatrix} X_\infty & 0 \\ 0 & \gamma^2 Z_\infty^{-1} \end{bmatrix}$

and let $G = \bar F'\Pi + \Pi\bar F + \dot\Pi$. Then

$\frac{d}{dt}\left( \begin{bmatrix} x \\ e \end{bmatrix}' \Pi \begin{bmatrix} x \\ e \end{bmatrix} \right) = \begin{bmatrix} x \\ e \end{bmatrix}' G \begin{bmatrix} x \\ e \end{bmatrix}$

Using

$\frac{dZ_\infty^{-1}}{dt} = -Z_\infty^{-1}A - A'Z_\infty^{-1} + \gamma^{-2}R + C_2'C_2 - Y^{-1}B_1 B_1'Y^{-1} - \gamma^{-2}X_\infty(B_2 B_2' - \gamma^{-2}B_1 B_1')X_\infty$

some algebra reveals

$G = \begin{bmatrix} -X_\infty B_2 B_2' X_\infty - \gamma^{-2}X_\infty B_1 B_1' X_\infty - R & X_\infty B_2 B_2' X_\infty - X_\infty B_1 B_1' Z_\infty^{-1} + R \\ X_\infty B_2 B_2' X_\infty - Z_\infty^{-1}B_1 B_1' X_\infty + R & -X_\infty B_2 B_2' X_\infty - \gamma^2 Z_\infty^{-1}B_1 B_1' Z_\infty^{-1} - \gamma^2(C_2'C_2 + \gamma^{-2}R) \end{bmatrix}$

hence

$\frac{d}{dt}\left( \begin{bmatrix} x \\ e \end{bmatrix}' \Pi \begin{bmatrix} x \\ e \end{bmatrix} \right) = -|B_2'X_\infty \bar x|^2 - \gamma^2|C_2 e|^2 - \bar x' R \bar x - \gamma^{-2}\left|B_1'(X_\infty x + \gamma^2 Z_\infty^{-1}e)\right|^2$

We know that $\bar x = 0$ if and only if $x = e$, and in this case

$\frac{d}{dt}\left( \begin{bmatrix} x \\ e \end{bmatrix}' \Pi \begin{bmatrix} x \\ e \end{bmatrix} \right) = -\gamma^2|B_1'Y^{-1}x|^2 \leq -\delta|x|^2$

for some $\delta > 0$, since by assumption $B_1 B_1' \geq \varepsilon I$. Using Lemma 3, if $F$ satisfies equation (7), then $R > 0$, since by assumption $C_1'C_1 > 0$. If we have a bound for $X_\infty$, then in order to bound $Z_\infty$ we need only bound $Y$. Bounds for $Y$ are given by Kalman (Kalman 1960). Therefore stability is again conditional on the existence of a bound for $X_\infty$, of which we are assured in the case when $F = \infty$.
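The completion-of-squares step in this proof can be spot-checked numerically (this check is not in the original): with the blocks of $G$ written out, the quadratic form $[x'\; e']G[x;\, e]$ collapses to a sum of squares for arbitrary data. The matrices below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 4, 1.7
g2 = gamma**2

def sym_pd(n):
    # random symmetric positive definite matrix
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

X, R = sym_pd(n), sym_pd(n)          # X_inf and R
Zi = sym_pd(n)                       # Z_inf^{-1}
B1, B2, C2 = (rng.standard_normal((n, n)) for _ in range(3))
x, e = rng.standard_normal(n), rng.standard_normal(n)

# the three distinct blocks of G (G21 is the transpose of G12)
G11 = -X @ B2 @ B2.T @ X - X @ B1 @ B1.T @ X / g2 - R
G12 = X @ B2 @ B2.T @ X - X @ B1 @ B1.T @ Zi + R
G22 = -X @ B2 @ B2.T @ X - g2 * Zi @ B1 @ B1.T @ Zi - g2 * C2.T @ C2 - R

quad = x @ G11 @ x + 2 * x @ G12 @ e + e @ G22 @ e
xbar = x - e
rhs = -(np.linalg.norm(B2.T @ X @ xbar)**2
        + g2 * np.linalg.norm(C2 @ e)**2
        + xbar @ R @ xbar
        + np.linalg.norm(B1.T @ (X @ x + g2 * Zi @ e))**2 / g2)
assert np.isclose(quad, rhs)
```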
6 Conclusions

With the terminal constraint $x(t+T) = 0$, the moving horizon state feedback problem is known to be stable and to result in a global norm bound for the closed loop system, under the assumptions of Theorem 5. Several interesting problems remain. It is possible within the current formulation to use a time varying $\gamma$, which might allow both for attempts to improve performance if the system becomes easier to control with time, and to reduce performance demands if the system becomes more difficult to control. It is also feasible to extend the above approach for the observation feedback case so that the horizon becomes $[t - T_1, t + T]$. In this case, we might expect a more conservative controller. One motivation for this control methodology has been that it is possible to implement it knowing only details of the system up to time $t + T$ ahead. In this formulation that knowledge is discrete: we assume that up to time $t + T$ we know the system matrices exactly, and after that we know nothing about them. This seems unnatural, since we might hope that an online identification process would give us a description of the future system with gradually increasing uncertainty in time. The linking of the present controller with a mathematical model for such uncertainty might produce a more realistic control system.

References

Anderson, B. D. O. & Moore, J. B. (1989), Optimal Control - Linear Quadratic Methods, Prentice Hall.

Basar, T. & Bernhard, P. (1991), H-infinity Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, Systems and Control: Foundations and Applications, Birkhauser.

Basar, T. & Olsder, G. J. (1982), Dynamic Noncooperative Game Theory, Mathematics in Science and Engineering, Academic Press.

Bucy, R. S. (1967), 'Global theory of the Riccati equation', Journal of Computer and System Sciences 1, 349-361.

Chen, C. C. & Shaw, L. (1982), 'On receding horizon feedback control', Automatica 18(3), 349-352.

Doyle, J. C., Glover, K., Khargonekar, P. P. & Francis, B. A.
(1989), 'State-space solutions to standard H-2 and H-infinity control problems', IEEE Transactions on Automatic Control 34(8), 831-847.

Glover, K. & Doyle, J. C. (1988), 'State-space formulae for all stabilizing controllers that satisfy an H-infinity-norm bound and relations to risk sensitivity', Systems and Control Letters 11, 167-172.

Kalman, R. E. (1960), 'Contributions to the theory of optimal control', Boletin de la Sociedad Matematica Mexicana 5(1), 102-119.
Khargonekar, P. P., Nagpal, K. M. & Poolla, K. R. (1991), 'H-infinity control with transients', SIAM Journal on Control and Optimization 29(6), 1373-1393.

Kleinman, D. L. (1970), 'An easy way to stabilize a linear constant system', IEEE Transactions on Automatic Control 15(6), 692.

Kwon, W. H., Bruckstein, A. M. & Kailath, T. (1983), 'Stabilizing state-feedback design via the moving horizon method', International Journal of Control 37(3), 631-643.

Kwon, W. H. & Pearson, A. E. (1977), 'A modified quadratic cost problem and feedback stabilization of a linear system', IEEE Transactions on Automatic Control 22(5), 838-842.

Limebeer, D. J. N., Anderson, B. D. O., Khargonekar, P. P. & Green, M. (1992), 'A game theoretic approach to H-infinity control for time varying systems', SIAM Journal on Control and Optimization 30(2), 262-283.

Mayne, D. Q. & Michalska, H. (1990), 'Receding horizon control of nonlinear systems', IEEE Transactions on Automatic Control 35(7), 814-824.

Ran, A. C. M. & Vreugdenhil, R. (1988), 'Existence and comparison theorems for algebraic Riccati equations for continuous and discrete time systems', Linear Algebra and its Applications 99, 63-83.

Ravi, R., Nagpal, K. M. & Khargonekar, P. P. (1991), 'H-infinity control of linear time varying systems: A state space approach', SIAM Journal on Control and Optimization 29(6), 1394-1413.

Reid, W. T. (1970), Riccati Differential Equations, Academic Press.

Tadmor, G. (1992), 'Receding horizon revisited: An easy way to robustly stabilize an LTV system', Systems and Control Letters 18, 285-294.

Tadmor, G. (1993), 'The standard H-infinity problem and the maximum principle: the general linear case', SIAM Journal on Control and Optimization 31(4), 813-846.

Uchida, K. & Fujita, M. (1992), 'Finite horizon H-infinity control problems with terminal penalties', IEEE Transactions on Automatic Control 37(11), 1762-1767.

Verghese, G., Friedlander, B. & Kailath, T.
(1980), 'Scattering theory and linear least squares estimation, Part III: The estimates', IEEE Transactions on Automatic Control 25(4), 794-803.

Whittle, P. (1990), Risk-sensitive Optimal Control, Wiley-Interscience Series in Systems and Optimization, Wiley.