A game theoretic approach to moving horizon control

Sanjay Lall and Keith Glover

Abstract. A control law is constructed for a linear time varying system by solving a two player zero sum differential game on a moving horizon, the game being that which is used to construct an $H_\infty$ controller on a finite horizon. Conditions are given under which this controller results in a stable system and satisfies an infinite horizon $H_\infty$ norm bound. A risk sensitive formulation is used to provide a state estimator in the observation feedback case.

1 Introduction

The moving horizon control technique was developed in the 1970's, one of the first papers being that of Kleinman (Kleinman 1970). The techniques in his paper were later reformulated and generalized by various authors, see for example (Chen & Shaw 1982, Kwon & Pearson 1977, Kwon, Bruckstein & Kailath 1983). Kwon and Pearson (Kwon & Pearson 1977) proved stability for linear time varying systems using a controller which optimised a quadratic cost function integrated over a time interval from the current time $t$ to a fixed distance ahead $t + T$. The solution to this problem was given in terms of a Riccati differential equation, integrated backwards from time $t + T$ at each time $t$. For time varying systems, this gives a practical method of stabilising the system. Kwon et al. (Kwon et al. 1983) formulated a general procedure for the recursive update of such Riccati equations, which generalises easily to the indefinite Riccati equations in this paper. Further, this method allows stabilization of systems which are known only for the short term future.

In dealing with long term variation of systems, there are at present two extreme options. The first is to assume a time invariant system and design a controller robust against time varying perturbations, which usually leads to very conservative controller design. The other is to design fully for time varying systems. In the case of both linear quadratic and $H_\infty$ controllers, this requires the backwards integration from infinity of a Riccati equation (see for example (Tadmor 1993, Ravi, Nagpal & Khargonekar 1991)), and somewhat optimistically assumes knowledge of the system throughout future time. The moving horizon method can be viewed as a compromise between these two methods.

The aim of the present paper is to extend the work of Tadmor (Tadmor 1992), in which a receding horizon controller is formulated with each finite horizon optimisation based upon an

$H_\infty$ optimisation. It is hoped that the practical advantages of receding horizon control might be combined with the robustness advantages of $H_\infty$ control. Tadmor proved that this controller is stable and satisfies an infinite horizon norm bound. Although derived using different approaches to the $H_\infty$ problem, the controller described in this paper is essentially the same in the state feedback case, differing only in the terminal constraint used.

In the observation feedback case, however, it is difficult to produce a natural formulation of the $H_\infty$ receding horizon control problem. For the finite horizon problem, the assumptions typically made are that the initial state at the beginning of each optimization interval is either completely known or completely unknown. In the latter case it is treated as part of the disturbance and subjected to a quadratic weighting (Uchida & Fujita 1992, Khargonekar, Nagpal & Poolla 1991). In the receding horizon problem, though, we have observations from before the optimization interval. We attempt to make use of prior observations using the theory of risk sensitive control, as developed by Whittle (Whittle 1990), which has been shown to be equivalent to $H_\infty$ control in many situations.

2 Preliminaries

We consider the system

$$\dot{x}(t) = A(t)x(t) + B_1(t)w(t) + B_2(t)u(t)$$
$$z(t) = C_1(t)x(t) + D_{12}(t)u(t) \qquad (1)$$
$$y(t) = C_2(t)x(t) + D_{21}(t)w(t)$$

where the coefficient matrices are bounded matrix valued functions of time. We use the following assumptions

$$D_{12}'C_1 = 0, \qquad D_{12}'D_{12} = I, \qquad D_{21}B_1' = 0, \qquad D_{21}D_{21}' = I \qquad (2)$$

which can be removed by suitable changes of variables, see for example (Doyle, Glover, Khargonekar & Francis 1989, Ravi et al. 1991, Tadmor 1993). Further, we shall assume there exists $\varepsilon > 0$ such that

$$B_1(t)B_1(t)' \geq \varepsilon I, \qquad C_1(t)'C_1(t) \geq \varepsilon I \qquad \text{for all } t > 0 \qquad (3)$$

These assumptions can be removed by assuming uniform complete controllability and observability of the system, and slightly modifying the following arguments.

Let $L_2^m[t_i, t_f]$ be the Hilbert space of square integrable functions on $[t_i, t_f] \subset \mathbb{R}$ taking values in $\mathbb{R}^m$, and write $\mathbb{R}_+ = [0, \infty)$. Denote the usual norm on $L_2$ by $\|\cdot\|_2$, and if $F: L_2 \to L_2$ then denote by $\|F\|_\infty$ the norm of $F$ induced by $\|\cdot\|_2$. Then $x: \mathbb{R}_+ \to \mathbb{R}^n$, $u \in L_2^{m_1}[t_i, t_f]$, $w \in L_2^{m_2}[t_i, t_f]$ and $y: \mathbb{R}_+ \to \mathbb{R}^p$. The signal $w$ represents all external inputs, including disturbances, sensor noise and commands; $z$ represents the error signal; $y$ is the measured variable and $u$ is the control input. We shall omit the space dimensions of signals in the sequel.
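To make the standing assumptions concrete, the following minimal sketch (our illustration, not code from the paper; the matrices are arbitrary choices) builds a constant-coefficient instance of system (1) and checks (2) and (3) numerically. Note that (2) and (3) together force the disturbance dimension $m_2$ to be at least $n + p$.

```python
# Illustrative check of assumptions (2) and (3); all matrices are
# arbitrary example data, not taken from the paper.
import numpy as np

A   = np.array([[0.0, 1.0], [-1.0, 0.5]])            # n = 2 states
B2  = np.array([[0.0], [1.0]])                        # control input, m1 = 1
B1  = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])    # disturbance, m2 = 3
C1  = np.vstack([np.eye(2), np.zeros((1, 2))])        # error output weighting
D12 = np.array([[0.0], [0.0], [1.0]])
C2  = np.array([[1.0, 0.0]])                          # measurement, p = 1
D21 = np.array([[0.0, 0.0, 1.0]])

# Assumptions (2): no cross weighting, normalised direct terms.
assert np.allclose(D12.T @ C1, 0.0) and np.allclose(D12.T @ D12, np.eye(1))
assert np.allclose(D21 @ B1.T, 0.0) and np.allclose(D21 @ D21.T, np.eye(1))

# Assumptions (3): B1 B1' and C1' C1 bounded below by eps I.
eps = 1e-9
assert np.linalg.eigvalsh(B1 @ B1.T).min() >= eps
assert np.linalg.eigvalsh(C1.T @ C1).min() >= eps
print("assumptions (2) and (3) hold for this example")
```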

3 State feedback

3.1 Finite horizon $H_\infty$ control

Define the cost function

$$J(u, w) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt$$

We can regard $J$ as a function of either $L_2$ signals or feedback strategies. Let $M = \{\mu : [t_i, t_f] \times \mathbb{R}^n \to \mathbb{R}^{m_1}\}$ and $N = \{\nu : [t_i, t_f] \times \mathbb{R}^n \to \mathbb{R}^{m_2}\}$. These are the strategy spaces, and we shall write strategies as $\mu, \nu$ to distinguish them from signals $u, w$. With the control $u(t) = \mu(t, x(t))$ in place, the operator $T_{zw}$ maps $w$ to $z$. If $x(t_i) = 0$, then we can define

$$\|T_{zw}\|_\infty = \sup_{w \in L_2[t_i, t_f]} \|T_{zw}w\|_2 / \|w\|_2$$

The finite interval $H_\infty$ problem is to find a linear causal control such that $\|T_{zw}\|_\infty < \gamma$ for a given $\gamma > 0$. We have $\|T_{zw}\|_\infty < \gamma$ if and only if there exists $\varepsilon > 0$ such that

$$J(u, w) \leq -\varepsilon\|w\|_2^2 \qquad \forall w \in L_2[t_i, t_f] \qquad (4)$$

We formulate the finite time differential game

$$\inf_{\mu \in M} \sup_{\nu \in N} J(\mu, \nu) \qquad (5)$$

which is a zero sum game, where $u$ is the minimizing player and $w$ is the maximizing player. The designer chooses $u(t) = \mu(t, x(t))$ such that even if nature is malicious and chooses the worst case $w$, equation (4) is satisfied. If the extremizing operators in equation (5) are interchangeable, then we call the optimal $u$ and worst case $w$ saddle point strategies. Theorem 1 gives sufficient conditions for the existence of a saddle point solution. A saddle point solution $u(t) = \mu^*(t, x(t))$, $w(t) = \nu^*(t, x(t))$ will satisfy

$$J(\mu^*, w) \leq J(\mu^*, \nu^*) \leq J(u, \nu^*) \qquad \forall u, w \in L_2[t_i, t_f].$$

Theorem 1 (Basar & Olsder 1982, Limebeer, Anderson, Khargonekar & Green 1992) Let

$$J(u, w) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt + x(t_f)'Fx(t_f)$$

where $F > 0$ is some weighting matrix. If

$$-\dot{P} = PA + A'P + C_1'C_1 - P(B_2B_2' - \gamma^{-2}B_1B_1')P, \qquad P(t_f) = F \qquad (6)$$

has a unique symmetric bounded solution on $t \in [t_i, t_f]$, then

$$J(u, w) = x(t_i)'P(t_i)x(t_i) + \int_{t_i}^{t_f} \left| u(\tau) + B_2(\tau)'P(\tau)x(\tau) \right|^2 d\tau - \gamma^2 \int_{t_i}^{t_f} \left| w(\tau) - \gamma^{-2}B_1(\tau)'P(\tau)x(\tau) \right|^2 d\tau$$

Further, with full state information available to $u(t) = \mu(t, x(t))$ and $w(t) = \nu(t, x(t))$, there exists a unique feedback saddle point solution given by

$$\mu^*(t, x(t)) = -B_2(t)'P(t)x(t), \qquad \nu^*(t, x(t)) = \gamma^{-2}B_1(t)'P(t)x(t)$$

and the saddle point value of the game is given by $J(\mu^*, \nu^*) = x(t_i)'P(t_i)x(t_i)$.
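The existence condition in Theorem 1 can be probed numerically. The sketch below (our code, with time-invariant illustrative data) integrates equation (6) backwards with scipy; when $\gamma$ is too small the solution passes through a conjugate point and escapes in finite time, which the integrator reports as a failure.

```python
# Backward integration of the indefinite Riccati equation (6); a sketch
# with illustrative data, not the paper's code.
import numpy as np
from scipy.integrate import solve_ivp

def solve_game_riccati(A, B1, B2, C1, F, gamma, ti, tf):
    """Return P(ti) from -dP/dt = PA + A'P + C1'C1 - P(B2B2' - g^-2 B1B1')P,
    P(tf) = F; return None if the solution blows up (no saddle point)."""
    n = A.shape[0]
    S = B2 @ B2.T - B1 @ B1.T / gamma**2
    def rhs(t, p):
        P = p.reshape(n, n)
        return (-(P @ A + A.T @ P + C1.T @ C1 - P @ S @ P)).ravel()
    sol = solve_ivp(rhs, (tf, ti), F.ravel(), rtol=1e-8, atol=1e-10)
    if sol.status != 0 or not np.all(np.isfinite(sol.y[:, -1])):
        return None
    return sol.y[:, -1].reshape(n, n)

A  = np.array([[0.0, 1.0], [1.0, 0.5]])      # open-loop unstable example
B2 = np.array([[0.0], [1.0]]); B1 = np.eye(2); C1 = np.eye(2)
F  = 0.1 * np.eye(2)
for gamma in (5.0, 1.0, 0.2):
    P = solve_game_riccati(A, B1, B2, C1, F, gamma, 0.0, 2.0)
    print("gamma = %.1f:" % gamma,
          "bounded solution" if P is not None else "finite escape")
```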

3.2 The moving horizon differential game

We now try to find $u(t) = \mu^*(t, x(t))$ at time $t$, where $\mu^*$ is the saddle point strategy for player $u$ for

$$\inf_{\mu \in M_t} \sup_{\nu \in N_t} \left\{ \int_t^{t+T} \left( z(s)'z(s) - \gamma^2 w(s)'w(s) \right) ds + x(t+T)'F(t+T)x(t+T) \right\}$$

where $M_t = \{\mu : [t, t+T] \times \mathbb{R}^n \to \mathbb{R}^{m_1}\}$, $N_t = \{\nu : [t, t+T] \times \mathbb{R}^n \to \mathbb{R}^{m_2}\}$, and $w(t) = \nu(t, x(t))$.

The idea behind this is to try to gain both the robustness benefits of the $H_\infty$ formulation and the practical control of time varying systems associated with receding horizon control. At each time $t$ we try to solve the finite horizon $H_\infty$ problem on the optimisation interval $[t, t+T]$ to find a feedback strategy $u(t) = \mu^*(t, x(t))$. In practice, this would be implemented for a finite time $\delta > 0$, after which a new calculation on the interval $[t+\delta, t+\delta+T]$ would be performed. In this paper we idealise this situation and assume that the controller is updated continuously.

For each separate $H_\infty$ optimisation, the initial state may not be zero. Therefore, an induced norm interpretation on each finite horizon is not possible, since the zero input $u$ gives a nonzero output $z$ on the interval $[t_i, t_f]$, and thus the induced norm is undefined. We also have a penalty on the terminal state, $F(t) > 0$. This is often incorporated into finite horizon problems to allow for compromises between the norm of $z$ and the size of $x(t_f)$. In the moving horizon case, we shall show that a sufficiently large $F$ will cause the closed loop to be stable.

In order to find $u(t)$ for the moving horizon problem we must integrate equation (6) backwards from the boundary condition at $t+T$ to time $t$. We can rewrite this as a partial differential equation as follows:

$$-\frac{\partial}{\partial\tau}P(\tau, \sigma) = P(\tau, \sigma)A(\tau) + A(\tau)'P(\tau, \sigma) - P(\tau, \sigma)\left( B_2(\tau)B_2(\tau)' - \gamma^{-2}B_1(\tau)B_1(\tau)' \right)P(\tau, \sigma) + C_1(\tau)'C_1(\tau)$$

$$P(t, t) = F(t) \quad \text{for all } t$$

The moving horizon saddle point controller is given by

$$u(t) = -B_2(t)'P(t, t+T)x(t)$$

In the case when all the system matrices $A, B_1, B_2, C_1, D_{12}$ are time invariant, $P(t, t+T)$ is independent of $t$ also.
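The following sketch (ours; illustrative time-invariant data, and the existence of the saddle point on each interval is simply assumed) implements the idealised loop in discretised form: at each update time the Riccati equation is re-integrated backwards over $[t, t+T]$ and the gain $-B_2'P(t, t+T)$ is applied for a short interval $\delta$.

```python
# Receding horizon H-infinity state feedback: a discretised sketch of the
# controller defined above, with arbitrary example data.
import numpy as np
from scipy.integrate import solve_ivp

A  = np.array([[0.0, 1.0], [1.0, 0.5]])      # open-loop unstable
B2 = np.array([[0.0], [1.0]]); B1 = np.eye(2); C1 = np.eye(2)
F  = 10.0 * np.eye(2)                        # terminal weight
gamma, T, delta = 2.0, 1.0, 0.05
S = B2 @ B2.T - B1 @ B1.T / gamma**2

def P_horizon(t):
    """P(t, t+T): equation (6) integrated backwards from F at t + T."""
    def rhs(s, p):
        P = p.reshape(2, 2)
        return (-(P @ A + A.T @ P + C1.T @ C1 - P @ S @ P)).ravel()
    return solve_ivp(rhs, (t + T, t), F.ravel(),
                     rtol=1e-8).y[:, -1].reshape(2, 2)

x, t = np.array([1.0, -0.5]), 0.0
for _ in range(100):
    K = B2.T @ P_horizon(t)                  # constant here (LTI data), but
    step = solve_ivp(lambda s, v: (A - B2 @ K) @ v,   # recomputed anyway to
                     (t, t + delta), x)               # mimic the LTV case
    x, t = step.y[:, -1], t + delta
print("||x(5)|| =", np.linalg.norm(x))       # driven towards zero
```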

3.3 Stability

Tadmor (Tadmor 1992) proves stability for the above controller when $F = \infty$. In this case we can consider the Riccati equation for $P^{-1}$, and the interpretation on the finite horizon is that the controller is subject to the constraint $x(t+T) = 0$. The linear quadratic problem with a similar terminal weight was considered by Kwon, Bruckstein and Kailath (Kwon et al. 1983). We use a modified form of their methods to handle the terminal weight.

Before the stability proof we need the following lemmas, in which we will write $P(\tau, \sigma; F)$ for the solution of the Riccati equation with boundary condition $P(\sigma, \sigma; F) = F$ at $\tau = \sigma$, with $\sigma$ fixed. The following results are derived in a slightly different way in (Kwon et al. 1983), and various similar results can be found in (Anderson & Moore 1989, Reid 1970, Bucy 1967, Ran & Vreugdenhil 1988, Basar & Bernhard 1991).

Lemma 2 Let $P$ satisfy the following Riccati equation on $[t_i, t_f]$, with $F \geq 0$ and $Q \geq 0$:

$$-\dot{P} = PA + A'P - P(B_2B_2' - \gamma^{-2}B_1B_1')P + Q, \qquad P(t_f) = F$$

If $P$ exists, and either $F > 0$ or $Q > 0$, then $P(t) > 0$ for all $t \in [t_i, t_f]$.

Lemma 3 Suppose

$$\dot{F} + A'F + FA - F(B_2B_2' - \gamma^{-2}B_1B_1')F + C_1'C_1 \leq 0, \qquad F(t) > 0, \quad t > 0 \qquad (7)$$

and $P$ satisfies

$$-\dot{P} = A'P + PA - P(B_2B_2' - \gamma^{-2}B_1B_1')P + C_1'C_1$$

Then, if both sides exist,

$$P(\tau, \sigma_1; F(\sigma_1)) \geq P(\tau, \sigma_2; F(\sigma_2)), \qquad \tau \leq \sigma_1 \leq \sigma_2$$

Proof We know (see (Basar & Bernhard 1991)) that

$$x(t_i)'P(t_i, t_f; F(t_f))x(t_i) = \min_u \max_w \int_{t_i}^{t_f} \left( x(s)'C_1(s)'C_1(s)x(s) + u(s)'u(s) - \gamma^2 w(s)'w(s) \right) ds + x(t_f)'F(t_f)x(t_f) \qquad (8)$$

Two comparison facts follow.

(i) If $\tilde{C}_1(t)'\tilde{C}_1(t) \geq C_1(t)'C_1(t)$ for all $t \in [t_i, t_f]$, then replacing $C_1$ by $\tilde{C}_1$ results in a new solution $\tilde{P}$ satisfying $\tilde{P}(t) \geq P(t)$ for all $t \in [t_i, t_f]$.

(ii) Similarly, if $\tilde{F}(t) \geq F(t)$ for all $t \in [t_i, t_f]$, then $\tilde{P}(t) \geq P(t)$ for all $t \in [t_i, t_f]$.

With either of these changes, equation (8) becomes

$$x(t_i)'\tilde{P}(t_i, t_f; F(t_f))x(t_i) = \min_u \max_w \left\{ \int_{t_i}^{t_f} \left( x(s)'C_1(s)'C_1(s)x(s) + u(s)'u(s) - \gamma^2 w(s)'w(s) \right) ds + x(t_f)'F(t_f)x(t_f) + h(u, w) \right\}$$

where $h(u, w) > 0$ for all $u, w \in L_2$. We can write equation (7) as

$$-\dot{F} = A'F + FA - F(B_2B_2' - \gamma^{-2}B_1B_1')F + (C_1'C_1 + \Delta)$$

where $\Delta \geq 0$. By definition $P(\sigma_2, \sigma_2; F(\sigma_2)) = F(\sigma_2)$, so, since the Riccati equations for $P$ and $F$ agree at $t = \sigma_2$ except that the equation for $F$ has the larger constant term,

$$F(t) \geq P(t, \sigma_2; F(\sigma_2)) \qquad \text{for all } t \leq \sigma_2$$

At $t = \sigma_1$ we can rewrite this as $P(\sigma_1, \sigma_2; F(\sigma_2)) \leq P(\sigma_1, \sigma_1; F(\sigma_1))$, so the LHS is smaller than the RHS. Taking this point as a boundary condition and using (ii) above,

$$P(t, \sigma_2; F(\sigma_2)) \leq P(t, \sigma_1; F(\sigma_1)) \qquad \text{for all } t \leq \sigma_1 \leq \sigma_2$$

Theorem 4 For the system described by equations (1)-(3), suppose that at each time $t$ the saddle point solution to the differential game on $[t, t+T]$ exists, that

$$\dot{F} + A'F + FA - F(B_2B_2' - \gamma^{-2}B_1B_1')F + C_1'C_1 \leq 0, \qquad F(t) > 0, \quad t > 0$$

and that there exist constants $\beta_1 > 0$ and $\beta_2 > 0$ such that

$$\beta_1 I \leq P(t, t+T; F(t+T)) \leq \beta_2 I$$

Then the control $u(t) = -B_2(t)'P(t, t+T; F(t+T))x(t)$ is stabilising.

Proof Let $\Lambda(t) = A(t) - B_2(t)B_2(t)'P(t, t+T; F(t+T))$. Then, with the controller in place, $\dot{x} = \Lambda(t)x(t) + B_1(t)w(t)$. Let $V(t) = x(t)'P(t, t+T; F(t+T))x(t)$. Then, with $w = 0$,

$$\dot{V} = x'\left( -PB_2B_2'P - \gamma^{-2}PB_1B_1'P - C_1'C_1 + \frac{\partial}{\partial\sigma}P(t, \sigma; F(\sigma))\Big|_{\sigma = t+T} \right)x$$

and, using Lemma 3,

$$\frac{\partial}{\partial\sigma}P(t, \sigma; F(\sigma))\Big|_{\sigma = t+T} \leq 0$$

Since $C_1'C_1 \geq \varepsilon I$, this gives $\dot{V}(t) < -\varepsilon\, x(t)'x(t)$. Then, since by assumption $P$ is bounded, $V$ is a Lyapunov function for the closed loop system, and hence the system is globally exponentially stable.
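Both hypotheses of Theorem 4 can be checked numerically in a simple case. In the sketch below (ours; time-invariant illustrative data with a weak disturbance channel), a terminal weight satisfying inequality (7) is manufactured by integrating the Riccati equation with a doubled state weight until it is near stationarity; the monotonicity asserted in Lemma 3 and the decrease of $V$ are then verified.

```python
# Numerical check of Lemma 3 and Theorem 4 on illustrative data (a sketch,
# not the paper's code).
import numpy as np
from scipy.integrate import solve_ivp

A  = np.array([[0.0, 1.0], [1.0, 0.5]]); B2 = np.array([[0.0], [1.0]])
B1 = 0.1 * np.eye(2); C1 = np.eye(2)     # weak disturbance channel
gamma, T = 1.0, 1.0
S = B2 @ B2.T - B1 @ B1.T / gamma**2

def backward_riccati(Pf, tf, t0, Q):
    """-dP/dtau = PA + A'P + Q - P S P, integrated backwards from P(tf)=Pf."""
    def rhs(s, p):
        P = p.reshape(2, 2)
        return (-(P @ A + A.T @ P + Q - P @ S @ P)).ravel()
    return solve_ivp(rhs, (tf, t0), Pf.ravel(),
                     rtol=1e-10, atol=1e-12).y[:, -1].reshape(2, 2)

# Terminal weight: near-stationary solution with doubled state weight, so
# that A'F + FA - F S F + C1'C1 is approximately -C1'C1 < 0 and (7) holds.
F = backward_riccati(np.eye(2), 40.0, 0.0, 2.0 * C1.T @ C1)

# Lemma 3: P(0, sigma; F) is nonincreasing in the horizon end sigma.
P1 = backward_riccati(F, T, 0.0, C1.T @ C1)
P2 = backward_riccati(F, 2.0 * T, 0.0, C1.T @ C1)
assert np.linalg.eigvalsh(P1 - P2).min() >= -1e-7

# Theorem 4: V = x' P x decreases along the unforced closed loop.
K = B2.T @ P1
sim = solve_ivp(lambda t, x: (A - B2 @ K) @ x, (0.0, 8.0), [1.0, -0.5],
                dense_output=True, rtol=1e-9)
ts = np.linspace(0.0, 8.0, 400)
V = [sim.sol(t) @ P1 @ sim.sol(t) for t in ts]
assert all(b <= a + 1e-9 for a, b in zip(V, V[1:]))
print("V(0) = %.4f, V(8) = %.2e" % (V[0], V[-1]))
```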

If we can prove boundedness of $P(t, t+T; F(t+T))$, then we shall have a sufficient condition for the controller to be stable. Tadmor (Tadmor 1992) proves that $P(t, t+T; \infty)$ is bounded above and below. Slight modification of his proof shows:

Theorem 5 Let $G_i(t_0, t_1) = \int_{t_0}^{t_1} \Phi(t_0, t)B_i(t)B_i(t)'\Phi(t_0, t)'\, dt$, where $\Phi$ denotes the transition matrix of $A$. If $(A, B_2)$ is uniformly controllable and $G_2(t_0, t_1)^{-1}G_1(t_0, t_1)$ is uniformly bounded for all $t_1 - t_0 \leq T$, then there exists $\gamma > 0$ such that $P(t, t+T; F(t+T))$ is bounded above for all $t > 0$.

3.4 Infinite horizon norm bounds

Following (Tadmor 1992), with the controller $u(t) = -B_2(t)'P(t, t+T; F(t+T))x(t)$ in place and $w \in L_2[t_0, \infty)$, we know

$$\int_{t_0}^{\infty} \frac{d}{dt}\left\{ x(t)'P(t, t+T; F(t+T))x(t) \right\} dt = \int_{t_0}^{\infty} \left( -z'z + 2w'B_1'Px - \gamma^{-2}x'PB_1B_1'Px + x'\,\frac{\partial}{\partial\sigma}P(t, \sigma; F(\sigma))\Big|_{\sigma = t+T}\, x \right) dt$$

If the closed loop is stable, then

$$\int_{t_0}^{\infty} \frac{d}{dt}\left\{ x(t)'P(t, t+T; F(t+T))x(t) \right\} dt = -x(t_0)'P(t_0, t_0+T; F(t_0+T))x(t_0)$$

Let $\hat{w}(t) = w(t) - \gamma^{-2}B_1(t)'P(t, t+T; F(t+T))x(t)$. Then, if $x(t_0) = 0$,

$$\|z\|_2^2 - \gamma^2\|w\|_2^2 = -\gamma^2\|\hat{w}\|_2^2 + \int_{t_0}^{\infty} x'\,\frac{\partial}{\partial\sigma}P(t, \sigma; F(\sigma))\Big|_{\sigma = t+T}\, x\, dt$$

Using Lemma 3, $\|z\|_2^2 - \gamma^2\|w\|_2^2 \leq 0$. Hence $\|T_{zw}\|_\infty \leq \gamma$.
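The bound $\|T_{zw}\|_\infty \leq \gamma$ can be illustrated empirically (a sketch with assumed data, continuing the previous examples): starting from $x(t_0) = 0$, accumulate $\|z\|_2^2$ and $\|w\|_2^2$ along the closed loop for a test disturbance and compare.

```python
# Empirical check of the infinite horizon norm bound (a sketch with
# illustrative data, not the paper's code).
import numpy as np
from scipy.integrate import solve_ivp

A  = np.array([[0.0, 1.0], [1.0, 0.5]]); B2 = np.array([[0.0], [1.0]])
B1 = 0.1 * np.eye(2); C1 = np.eye(2)
gamma, T = 1.0, 1.0
S = B2 @ B2.T - B1 @ B1.T / gamma**2

def backward_riccati(Pf, tf, t0, Q):
    def rhs(s, p):
        P = p.reshape(2, 2)
        return (-(P @ A + A.T @ P + Q - P @ S @ P)).ravel()
    return solve_ivp(rhs, (tf, t0), Pf.ravel(),
                     rtol=1e-10).y[:, -1].reshape(2, 2)

F   = backward_riccati(np.eye(2), 40.0, 0.0, 2.0 * C1.T @ C1)  # satisfies (7)
P_T = backward_riccati(F, T, 0.0, C1.T @ C1)    # P(t, t+T; F), constant here
K   = B2.T @ P_T

w = lambda t: np.exp(-0.2 * t) * np.array([np.sin(3.0 * t), np.cos(t)])

def rhs(t, s):                 # augmented state: x, running |z|^2, |w|^2
    x, u = s[:2], -K @ s[:2]
    dx = A @ x + B1 @ w(t) + B2 @ u
    # |z|^2 = x'C1'C1 x + |u|^2 because D12'D12 = I and D12'C1 = 0.
    return np.concatenate([dx, [x @ (C1.T @ C1) @ x + float(u @ u),
                                float(w(t) @ w(t))]])

sol = solve_ivp(rhs, (0.0, 60.0), np.zeros(4), rtol=1e-9)
z2, w2 = sol.y[2, -1], sol.y[3, -1]
print("||z||^2 = %.5f <= gamma^2 ||w||^2 = %.5f" % (z2, gamma**2 * w2))
```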

4 Calculation of $P(\tau, \sigma; F(\sigma))$

In order to implement the moving horizon controller we need to calculate the value of $P(t, t+T; F(t+T))$ for all $t > 0$. This requires the solution of a Riccati differential equation over the interval $[t, t+T]$ for each time $t$, given the boundary condition $P(t+T, t+T; F(t+T)) = F(t+T)$ for all $t > 0$. Kwon, Bruckstein and Kailath (Kwon et al. 1983) applied results from scattering theory in their solution to the quadratic problem, to give a forwards differential equation for $P(t, t+T; F(t+T))$. Here we state the same solution for the indefinite Riccati equation. See (Verghese, Friedlander & Kailath 1980) for details on the derivation of these formulae. In this section, $P(\tau, \sigma)$ is used to mean $P(\tau, \sigma; 0)$. Define for convenience

$$N(\tau) = B_2(\tau)B_2(\tau)' - \gamma^{-2}B_1(\tau)B_1(\tau)'$$

Let

$$S(\tau, \sigma) = \begin{pmatrix} \Phi(\tau, \sigma) & L(\tau, \sigma) \\ P(\tau, \sigma) & \Psi(\tau, \sigma) \end{pmatrix}$$

and consider the system of equations

$$\frac{\partial}{\partial\tau}\Phi(\tau, \sigma) = \Phi(\tau, \sigma)\left[ N(\tau)P(\tau, \sigma) - A(\tau) \right] \qquad (9)$$

$$\frac{\partial}{\partial\tau}\Psi(\tau, \sigma) = \left[ P(\tau, \sigma)N(\tau) - A(\tau)' \right]\Psi(\tau, \sigma) \qquad (10)$$

$$\frac{\partial}{\partial\tau}L(\tau, \sigma) = \Phi(\tau, \sigma)N(\tau)\Psi(\tau, \sigma) \qquad (11)$$

$$-\frac{\partial}{\partial\tau}P(\tau, \sigma) = A(\tau)'P(\tau, \sigma) + P(\tau, \sigma)A(\tau) - P(\tau, \sigma)N(\tau)P(\tau, \sigma) + C_1(\tau)'C_1(\tau) \qquad (12)$$

with boundary conditions

$$S(\sigma, \sigma) = \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}$$

The following equations

$$\frac{\partial}{\partial\sigma}\Phi(\tau, \sigma) = \left[ A(\sigma) + L(\tau, \sigma)C_1(\sigma)'C_1(\sigma) \right]\Phi(\tau, \sigma) \qquad (13)$$

$$\frac{\partial}{\partial\sigma}\Psi(\tau, \sigma) = \Psi(\tau, \sigma)\left[ A(\sigma)' + C_1(\sigma)'C_1(\sigma)L(\tau, \sigma) \right] \qquad (14)$$

$$\frac{\partial}{\partial\sigma}L(\tau, \sigma) = A(\sigma)L(\tau, \sigma) + L(\tau, \sigma)A(\sigma)' + L(\tau, \sigma)C_1(\sigma)'C_1(\sigma)L(\tau, \sigma) - N(\sigma) \qquad (15)$$

$$\frac{\partial}{\partial\sigma}P(\tau, \sigma) = \Psi(\tau, \sigma)C_1(\sigma)'C_1(\sigma)\Phi(\tau, \sigma) \qquad (16)$$

give the partial derivatives with respect to $\sigma$. It is straightforward to verify this by taking both sets of second partial derivatives and showing that they are equal, and by verifying that along the boundary the total derivative

$$\frac{d}{dt}S(t, t) = \frac{\partial}{\partial\tau}S(\tau, t)\Big|_{\tau = t} + \frac{\partial}{\partial\sigma}S(t, \sigma)\Big|_{\sigma = t}$$

is zero, since $\Phi, \Psi, L, P$ are constant along the boundary. We can find a differential equation for $P(t, t+T)$ in terms of $t$, since

$$\frac{d}{dt}S(t, t+T) = \frac{\partial}{\partial\tau}S(\tau, t+T)\Big|_{\tau = t} + \frac{\partial}{\partial\sigma}S(t, \sigma)\Big|_{\sigma = t+T}$$

Hence

$$\frac{d\Phi}{dt} = \Phi\left[ N(t)P - A(t) \right] + \left[ A(t+T) + LC_1(t+T)'C_1(t+T) \right]\Phi \qquad (17)$$

$$\frac{d\Psi}{dt} = \left[ PN(t) - A(t)' \right]\Psi + \Psi\left[ A(t+T)' + C_1(t+T)'C_1(t+T)L \right] \qquad (18)$$

$$\frac{dL}{dt} = \Phi N(t)\Psi + A(t+T)L + LA(t+T)' + LC_1(t+T)'C_1(t+T)L - N(t+T) \qquad (19)$$

$$-\frac{dP}{dt} = A(t)'P + PA(t) - PN(t)P + C_1(t)'C_1(t) - \Psi C_1(t+T)'C_1(t+T)\Phi \qquad (20)$$

where $\Phi = \Phi(t, t+T)$ etc. This gives a differential equation for $P(t, t+T; 0)$. The boundary conditions for equations (17)-(20) are given by solving equations (9)-(12) backwards in time from $\tau = \sigma = T$ to $\tau = 0$, $\sigma = T$. This backwards solution has to be performed only once, when $t = 0$.

We now make use of the following identity, which can be verified in a straightforward manner:

$$P(\tau, \sigma; F(\sigma)) = P(\tau, \sigma) + \Psi(\tau, \sigma)F(\sigma)\left[ I - L(\tau, \sigma)F(\sigma) \right]^{-1}\Phi(\tau, \sigma)$$

Then

$$P(t, t+T; F(t+T)) = P + \Psi F(t+T)\left[ I - LF(t+T) \right]^{-1}\Phi$$

These equations are simpler than they appear. They give us a quadratic differential equation in matrices of size $n$, which is integrated forwards to give the controller.
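A sketch of the resulting computational scheme (ours; illustrative time-invariant data, with signs following the reconstruction above): one backward pass of equations (9)-(12) initialises $S(0, T)$; equations (17)-(20) then propagate $S(t, t+T)$ forwards, and the identity above recovers $P(t, t+T; F)$. For time-invariant data $P(t, t+T; F)$ must be constant in $t$, which provides a consistency check.

```python
# Forward propagation of S(t, t+T) via the scattering equations; a sketch
# on illustrative LTI data, not the paper's code.
import numpy as np
from scipy.integrate import solve_ivp

A  = np.array([[0.0, 1.0], [1.0, 0.5]]); B2 = np.array([[0.0], [1.0]])
B1 = 0.1 * np.eye(2); C1 = np.eye(2)
gamma, T, n = 1.0, 1.0, 2
N = B2 @ B2.T - B1 @ B1.T / gamma**2
Q = C1.T @ C1
F = np.eye(n)                                # terminal weight

def unpack(s):
    return [s[i*n*n:(i+1)*n*n].reshape(n, n) for i in range(4)]

def pack(*ms):
    return np.concatenate([m.ravel() for m in ms])

def backward_rhs(tau, s):
    # Equations (9)-(12): derivatives in tau, with sigma = T fixed.
    Phi, Psi, L, P = unpack(s)
    return pack(Phi @ (N @ P - A),
                (P @ N - A.T) @ Psi,
                Phi @ N @ Psi,
                -(A.T @ P + P @ A - P @ N @ P + Q))

def forward_rhs(t, s):
    # Equations (17)-(20): total derivative of S(t, t+T), LTI data.
    Phi, Psi, L, P = unpack(s)
    return pack(Phi @ (N @ P - A) + (A + L @ Q) @ Phi,
                (P @ N - A.T) @ Psi + Psi @ (A.T + Q @ L),
                Phi @ N @ Psi + A @ L + L @ A.T + L @ Q @ L - N,
                -(A.T @ P + P @ A - P @ N @ P + Q) + Psi @ Q @ Phi)

def P_of_F(s):
    # Identity: P(t, t+T; F) = P + Psi F (I - L F)^{-1} Phi.
    Phi, Psi, L, P = unpack(s)
    return P + Psi @ F @ np.linalg.solve(np.eye(n) - L @ F, Phi)

s0 = pack(np.eye(n), np.eye(n), np.zeros((n, n)), np.zeros((n, n)))
s_init = solve_ivp(backward_rhs, (T, 0.0), s0, rtol=1e-11,
                   atol=1e-12).y[:, -1]      # one backward pass only
fwd = solve_ivp(forward_rhs, (0.0, 2.0), s_init, rtol=1e-11, atol=1e-12)
print("drift of P(t, t+T; F) over t in [0, 2]:",
      np.abs(P_of_F(fwd.y[:, -1]) - P_of_F(s_init)).max())  # ~0 for LTI
```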

5 State estimation and output feedback

In this section we consider the problem of optimally estimating the state. Let

$$J(t_i, t_f) = \int_{t_i}^{t_f} \left\{ z(t)'z(t) - \gamma^2 w(t)'w(t) \right\} dt$$

and let $u_{[0, t_i]} = \{ u(t);\ 0 \leq t \leq t_i \}$. Let $t \in [t_i, t_f]$ be the current time. We wish to find the feedback saddle point strategy for the $u$ player given the information $u_{[0,t]}$ and $y_{[0,t]}$. For the moving horizon controller we only need to know this strategy at time $t = t_i$.

In recent years $H_\infty$ theory has been furnished with a certainty equivalence principle in several different formulations (Doyle et al. 1989, Whittle 1990, Basar & Bernhard 1991). For the finite horizon time varying $H_\infty$ problem, Basar and Bernhard (Basar & Bernhard 1991, p. 119) give a derivation of the optimal estimator for output feedback. Stated in simple terms, their result indicates that one should look, at each instant of time $t$, for the worst possible disturbance $w_{[t_i, t]}$ compatible with the information available at that time about $u_{[t_i, t]}$ and $y_{[t_i, t]}$. This generates a corresponding worst possible state trajectory, which should be used by the feedback law as if it were the actual state.

The worst case approach to our problem is to extend the maximisation problem into the past, and to maximise the cost function with respect to the unobservables before $t_i$, back to time $t = 0$. This would require us to assume that, for times $0 < t < t_i$, $w$ was trying to maximize the cost function on $[t_i, t_f]$. However, if we do this then $w$ can make $J$ infinite, since there is no disincentive on the size of $w$ outside $[t_i, t_f]$. This means that we cannot use this principle to estimate the state without further assumptions. In fact, given the values of past observables, we can only draw conclusions about the current value of the state if we know some information about $w$ (since we have the nondegeneracy assumption that $D_{21}D_{21}' = I$). In the finite horizon problem, we assume that the past player $w$ was trying to maximize the cost function over the entire finite interval.

One possible alternative approach is to assume that $w$ is trying to maximize $J(0, t_f)$, while $u$ is trying to minimize $J(t_i, t_f)$. For $u$, in fact, this is equivalent to playing a moving horizon game in which only the endpoint moves.

Alternatively, it is possible to consider a stochastic formulation. In this case we should like to assume that, outside the optimisation interval, $w$ is accurately modelled by white noise. In order to do this, we need a stochastic model for $H_\infty$ control.

5.1 Risk sensitive control

There is a strong connection between the formulation of risk sensitive control developed by Whittle (Whittle 1990) and $H_\infty$ control. This connection is specified in (Glover & Doyle 1988). We consider here the system described by equations (1)-(3). However, we now change our assumptions so that $w$ is white noise with covariance function $\delta(t)I$, the identity matrix multiplied by a delta function. Define the cost function

$$C = \int_{t_i}^{t_f} z(t)'z(t)\, dt + x(t_f)'F(t_f)x(t_f)$$

where $x(t_i)$ is normally distributed with mean $\hat{x}_0$ and covariance matrix $V$, and $F(t) > 0$. Let

$$L_\mu = \gamma^2 \log\left( E_\mu\left( \exp\left( C/\gamma^2 \right) \right) \right)$$

where $\mu(t, y(t))$ is a policy and $E_\mu$ indicates the expectation when the input is $u(t) = \mu(t, y(t))$. The risk sensitive control problem is

$$\min_{\mu \in M} L_\mu$$

where $M$ is the space of all policy functions. Glover and Doyle (Glover & Doyle 1988) show that this is equivalent to an $H_\infty$ problem. However, in this case a sufficient statistic for the initial conditions is given by the mean and variance of the initial state. If the system has been running previous to the implementation of a risk sensitive controller, these can be provided by a Kalman filter. The solution to the risk sensitive control problem is given by Whittle (Whittle 1990):

Theorem 6 If there exists $X_\infty \geq 0$ satisfying

$$-\dot{X}_\infty = A'X_\infty + X_\infty A - X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty + C_1'C_1, \qquad X_\infty(t_f) = F(t_f)$$

and $Y_\infty$ satisfying

$$\dot{Y}_\infty = AY_\infty + Y_\infty A' - Y_\infty(C_2'C_2 - \gamma^{-2}C_1'C_1)Y_\infty + B_1B_1', \qquad Y_\infty(t_i) = V$$

such that $Y_\infty(t)^{-1} - \gamma^{-2}X_\infty(t) > 0$, then let

$$\dot{\hat{x}} = A\hat{x} + B_2u + Y_\infty C_2'(y - C_2\hat{x}) + \gamma^{-2}Y_\infty C_1'C_1\hat{x}, \qquad \hat{x}(t_i) = \hat{x}_0$$

and

$$\bar{x} = (I - \gamma^{-2}Y_\infty X_\infty)^{-1}\hat{x}$$

Then the optimal risk sensitive feedback control law is given by

$$u = -B_2'X_\infty\bar{x}$$

This has exactly the form of an $H_\infty$ controller (see, for example, Basar and Bernhard (Basar & Bernhard 1991, pp. 14-16)) with unknown weighted initial state. However, it is important to note that the assumptions are very different, and that $w$ is white noise. In the moving horizon case we only need $u(t)$ at $t = t_i$. At this time, $\hat{x}(t_i)$ and $Y_\infty(t_i)$ are the mean and variance of the state. Note that the moving horizon objective at time $t_i$ is the expectation of $L_\mu$ over the current state variability and future disturbances. The information required, that is, the mean and variance of the current state, is precisely that provided by a Kalman filter, as follows.

Theorem 7 Suppose $x(0)$ has mean $\hat{x}_0$ and covariance matrix $V$. Let $Y > 0$ be the solution to

$$\dot{Y} = AY + YA' - YC_2'C_2Y + B_1B_1', \qquad Y(0) = V$$

and let $\hat{x}$ satisfy

$$\dot{\hat{x}} = A\hat{x} + B_2u + YC_2'(y - C_2\hat{x}), \qquad \hat{x}(0) = \hat{x}_0$$

Then $\hat{x}(t)$ is the mean and $Y(t)$ is the variance of $x$ at time $t$.

Hence we propose a moving horizon controller of the form

$$\dot{Y} = AY + YA' - YC_2'C_2Y + B_1B_1', \qquad Y(0) = V$$

$$\dot{\hat{x}} = A\hat{x} + B_2u + YC_2'(y - C_2\hat{x}), \qquad \hat{x}(0) = \hat{x}_0$$

$$-\frac{\partial}{\partial\tau}X_\infty(\tau, \sigma) = X_\infty(\tau, \sigma)A(\tau) + A(\tau)'X_\infty(\tau, \sigma) - X_\infty(\tau, \sigma)\left( B_2(\tau)B_2(\tau)' - \gamma^{-2}B_1(\tau)B_1(\tau)' \right)X_\infty(\tau, \sigma) + C_1(\tau)'C_1(\tau)$$

$$X_\infty(t, t) = F(t) \quad \text{for all } t$$

The controller is then given by

$$u(t) = -B_2(t)'X_\infty(t, t+T)\left( I - \gamma^{-2}Y(t)X_\infty(t, t+T) \right)^{-1}\hat{x}(t)$$

Let

$$Z_\infty(t) = \left( I - \gamma^{-2}Y(t)X_\infty(t, t+T) \right)^{-1}Y(t)$$

Then we can rewrite the above equations as

$$\dot{\bar{x}} = \left( A - (B_2B_2' - \gamma^{-2}B_1B_1')X_\infty \right)\bar{x} - Z_\infty(C_2'C_2 + \gamma^{-2}R)\bar{x} + Z_\infty C_2'y$$

$$u(t) = -B_2'X_\infty(t, t+T)\bar{x}(t)$$

where

$$R(t) = C_1(t)'C_1(t) - \frac{\partial}{\partial\sigma}X_\infty(t, \sigma)\Big|_{\sigma = t+T}$$
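The proposed controller is straightforward to simulate. The sketch below (ours, with illustrative data and an Euler-Maruyama discretisation; all parameters are arbitrary) runs the Kalman filter of Theorem 7 alongside the game Riccati solution $X_\infty(t, t+T)$ and applies $u = -B_2'X_\infty(I - \gamma^{-2}YX_\infty)^{-1}\hat{x}$.

```python
# Output feedback moving horizon controller: a discretised sketch on
# illustrative data, not the paper's code.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
A  = np.array([[0.0, 1.0], [1.0, 0.5]]); B2 = np.array([[0.0], [1.0]])
B1 = 0.1 * np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # D21 B1' = 0
C1 = np.eye(2); C2 = np.array([[1.0, 0.0]]); D21 = np.array([[0.0, 0.0, 1.0]])
gamma, T, dt = 5.0, 1.0, 0.01
F = 10.0 * np.eye(2)
N = B2 @ B2.T - B1 @ B1.T / gamma**2

def X_horizon(t):
    """X(t, t+T): game Riccati integrated backwards from F at t + T."""
    def rhs(s, p):
        X = p.reshape(2, 2)
        return (-(X @ A + A.T @ X - X @ N @ X + C1.T @ C1)).ravel()
    return solve_ivp(rhs, (t + T, t), F.ravel(),
                     rtol=1e-9).y[:, -1].reshape(2, 2)

X = X_horizon(0.0)          # LTI data: X(t, t+T) is the same for every t;
                            # for time varying data recompute in the loop.
x = np.array([1.0, -0.5])   # true state, unknown to the controller
x_hat = np.zeros(2)         # filter mean, initialised at the prior mean
Y = 0.5 * np.eye(2)         # prior covariance V
for k in range(2000):
    w = rng.standard_normal(3) / np.sqrt(dt)          # white noise sample
    y = C2 @ x + D21 @ w                              # measurement
    v = np.linalg.solve(np.eye(2) - Y @ X / gamma**2, x_hat)
    u = -B2.T @ X @ v                                 # proposed control law
    x     = x + dt * (A @ x + B1 @ w + B2 @ u)        # Euler plant step
    Y     = Y + dt * (A @ Y + Y @ A.T - Y @ C2.T @ C2 @ Y + B1 @ B1.T)
    x_hat = x_hat + dt * (A @ x_hat + B2 @ u + Y @ C2.T @ (y - C2 @ x_hat))
print("|x| = %.3f, |x - x_hat| = %.3f" % (np.linalg.norm(x),
                                          np.linalg.norm(x - x_hat)))
```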

Theorem 8 If $X_\infty$ and $Z_\infty$ satisfy

$$\beta_1 I \leq X_\infty(t, t+T) \leq \beta_2 I, \qquad \beta_1 I \leq Z_\infty(t) \leq \beta_2 I$$

for fixed $\beta_1, \beta_2$ and for all $t > 0$, then the above controller is globally exponentially stable.

Proof Let $e = x - \bar{x}$. Then

$$\frac{d}{dt}\begin{pmatrix} x \\ e \end{pmatrix} = F\begin{pmatrix} x \\ e \end{pmatrix}$$

where

$$F = \begin{pmatrix} A - B_2B_2'X_\infty & B_2B_2'X_\infty \\ -\gamma^{-2}B_1B_1'X_\infty + \gamma^{-2}Z_\infty R & A - Z_\infty(C_2'C_2 + \gamma^{-2}R) + \gamma^{-2}B_1B_1'X_\infty \end{pmatrix}$$

Let $\Pi = \mathrm{diag}(X_\infty, Z_\infty^{-1})$ and let $G = F'\Pi + \Pi F + \dot{\Pi}$, so that

$$\frac{d}{dt}\left[ \begin{pmatrix} x \\ e \end{pmatrix}'\Pi\begin{pmatrix} x \\ e \end{pmatrix} \right] = \begin{pmatrix} x \\ e \end{pmatrix}'G\begin{pmatrix} x \\ e \end{pmatrix}$$

Using

$$\frac{dZ_\infty^{-1}}{dt} = -Z_\infty^{-1}A - A'Z_\infty^{-1} + \gamma^{-2}R + C_2'C_2 - Y^{-1}B_1B_1'Y^{-1} - X_\infty(B_2B_2' - \gamma^{-2}B_1B_1')X_\infty$$

some algebra reveals

$$G = \begin{pmatrix} -X_\infty B_2B_2'X_\infty - \gamma^{-2}X_\infty B_1B_1'X_\infty - R & X_\infty B_2B_2'X_\infty - X_\infty B_1B_1'Z_\infty^{-1} + \gamma^{-2}R \\ X_\infty B_2B_2'X_\infty - Z_\infty^{-1}B_1B_1'X_\infty + \gamma^{-2}R & -X_\infty B_2B_2'X_\infty - \gamma^2 Z_\infty^{-1}B_1B_1'Z_\infty^{-1} - C_2'C_2 - \gamma^{-2}R \end{pmatrix}$$

hence

$$\frac{d}{dt}\left[ \begin{pmatrix} x \\ e \end{pmatrix}'\Pi\begin{pmatrix} x \\ e \end{pmatrix} \right] = -\begin{pmatrix} x \\ e \end{pmatrix}'\hat{G}\begin{pmatrix} x \\ e \end{pmatrix} - |B_2'X_\infty\bar{x}|^2 - |C_2e|^2 - \gamma^{-2}\bar{x}'R\bar{x} - (1 - \gamma^{-2})x'Rx$$

where

$$\hat{G} = \gamma^2\begin{pmatrix} \gamma^{-2}X_\infty \\ Z_\infty^{-1} \end{pmatrix}B_1B_1'\begin{pmatrix} \gamma^{-2}X_\infty \\ Z_\infty^{-1} \end{pmatrix}'$$

We know that $\bar{x} = 0$ if and only if $x = e$, and in this case, since $\gamma^{-2}X_\infty + Z_\infty^{-1} = Y^{-1}$,

$$-\begin{pmatrix} x \\ e \end{pmatrix}'\hat{G}\begin{pmatrix} x \\ e \end{pmatrix} = -\gamma^2\left| B_1'Y^{-1}x \right|^2 \leq -\alpha|x|^2$$

for some $\alpha > 0$, since by assumption $B_1B_1' \geq \varepsilon I$ for some $\varepsilon > 0$. Using Lemma 3, if $F$ satisfies equation (7), then $R > 0$, since by assumption $C_1'C_1 > 0$. If we have a bound for $X_\infty$, then in order to bound $Z_\infty$ we need only bound $Y$. Bounds for $Y$ are given by Kalman (Kalman 1960). Therefore stability is again conditional on the existence of a bound for $X_\infty$, of which we are assured in the case $F = \infty$.

6 Conclusions

With the terminal constraint $x(t+T) = 0$, the moving horizon state feedback problem is known to be stable and to result in a global norm bound for the closed loop system, under the assumptions of Theorem 5. Several interesting problems remain. It is possible within the current formulation to use a time varying $\gamma$, which might allow both for attempts to improve performance if the system becomes easier to control with time, and for reduced performance demands if the system becomes more difficult to control. It is also feasible to extend the above approach in the observation feedback case so that the horizon becomes $[t - T_1, t + T_2]$. In this case, we might expect a more conservative controller.

One motivation for this control methodology has been that it is possible to implement it knowing only details of the system up to time $t + T$ ahead. In this formulation that knowledge is discrete: we assume that up to time $t + T$ we know the system matrices exactly, and that after that we know nothing about them. This seems unnatural, since we might hope that an online identification process would give us a description of the future system with gradually increasing uncertainty in time. The linking of the present controller with a mathematical model for such uncertainty might produce a more realistic control system.

References

Anderson, B. D. O. & Moore, J. B. (1989), Optimal Control: Linear Quadratic Methods, Prentice Hall.

Basar, T. & Bernhard, P. (1991), $H_\infty$ Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, Systems and Control: Foundations and Applications, Birkhauser.

Basar, T. & Olsder, G. J. (1982), Dynamic Noncooperative Game Theory, Mathematics in Science and Engineering, Academic Press.

Bucy, R. S. (1967), 'Global theory of the Riccati equation', Journal of Computer and System Sciences 1, 349–361.

Chen, C. C. & Shaw, L. (1982), 'On receding horizon feedback control', Automatica 18(3), 349–352.

Doyle, J. C., Glover, K., Khargonekar, P. P. & Francis, B. A. (1989), 'State-space solutions to standard $H_2$ and $H_\infty$ control problems', IEEE Transactions on Automatic Control 34(8), 831–847.

Glover, K. & Doyle, J. C. (1988), 'State-space formulae for all stabilizing controllers that satisfy an $H_\infty$-norm bound and relations to risk sensitivity', Systems and Control Letters 11, 167–172.

Kalman, R. E. (1960), 'Contributions to the theory of optimal control', Boletin de la Sociedad Matematica Mexicana 5(1), 102–119.

Khargonekar, P. P., Nagpal, K. M. & Poolla, K. R. (1991), '$H_\infty$ control with transients', SIAM Journal on Control and Optimization 29(6), 1373–1393.

Kleinman, D. L. (1970), 'An easy way to stabilize a linear constant system', IEEE Transactions on Automatic Control, p. 692.

Kwon, W. H., Bruckstein, A. M. & Kailath, T. (1983), 'Stabilizing state-feedback design via the moving horizon method', International Journal of Control 37(3), 631–643.

Kwon, W. H. & Pearson, A. E. (1977), 'A modified quadratic cost problem and feedback stabilization of a linear system', IEEE Transactions on Automatic Control 22(5), 838–842.

Limebeer, D. J., Anderson, B. D. O., Khargonekar, P. P. & Green, M. (1992), 'A game theoretic approach to $H_\infty$ control for time varying systems', SIAM Journal on Control and Optimization 30(2), 262–283.

Mayne, D. Q. & Michalska, H. (1990), 'Receding horizon control of nonlinear systems', IEEE Transactions on Automatic Control 35(7), 814–824.

Ran, A. C. M. & Vreugdenhil, R. (1988), 'Existence and comparison theorems for algebraic Riccati equations for continuous- and discrete-time systems', Linear Algebra and its Applications 99, 63–83.

Ravi, R., Nagpal, K. M. & Khargonekar, P. P. (1991), '$H_\infty$ control of linear time varying systems: A state space approach', SIAM Journal on Control and Optimization 29(6), 1394–1413.

Reid, W. T. (1970), Riccati Differential Equations, Academic Press.

Tadmor, G. (1992), 'Receding horizon revisited: An easy way to robustly stabilize an LTV system', Systems and Control Letters 18, 285–294.

Tadmor, G. (1993), 'The standard $H_\infty$ problem and the maximum principle: the general linear case', SIAM Journal on Control and Optimization 31(4), 813–846.

Uchida, K. & Fujita, M. (1992), 'Finite horizon $H_\infty$ control problems with terminal penalties', IEEE Transactions on Automatic Control 37(11), 1762–1767.

Verghese, G., Friedlander, B. & Kailath, T. (1980), 'Scattering theory and linear least squares estimation, Part II: The estimates', IEEE Transactions on Automatic Control 25(4), 794–802.

Whittle, P. (1990), Risk-Sensitive Optimal Control, Wiley-Interscience Series in Systems and Optimization, Wiley.