Nonlinear Output-Feedback Model Predictive Control with Moving Horizon Estimation

Noninear Output-Feedback Mode Predictive Contro with Moving Horizon Estimation David A. Copp and João P. Hespanha Abstract We introduce an output-feedback approach to mode predictive contro that combines state estimation and contro into a singe -max optimization. Under appropriate assumptions that ensure controabiity and observabiity of the noninear process to be controed, we prove that the state of the system remains bounded and estabish bounds on the tracking error for trajectory tracking probems. The resuts appy both to infinite and finite-horizon optimizations, the atter requiring reversibe dynamics and the use of a tera cost that is an ISS-contro Lyapunov function with respect to a disturbance input. A numerica exampe is presented that iustrates these resuts. I. INTRODUCTION Advances in computer technoogy have made onine optimization a viabe and powerfu too for soving contro probems in practica appications. Mode predictive contro (MPC) is an approach that uses onine optimization to sove an open-oop optima contro probem at each samping time and is now quite mature, as evidenced by [1 3]. Severa papers in this area are focused on the robustness to mode uncertainty, input disturbances, and measurement noise. These studies incude robust, worst-case, and -max MPC. Robust and worst-case MPC are discussed in works such as [4 7]. Min-max MPC for constrained inear systems is considered in [8, 9], and a game theoretic approach for robust constrained noninear MPC is proposed in [10]. Noa or inherent robustness of MPC has aso been studied in [3, 11]. MPC is often formuated assug fu-state feedback. In practica cases, however, the fu state often cannot be measured and is not avaiabe for feedback. This motivates the investigation of output-feedback MPC in which an independent agorithm for state estimation is often used. Exampes of agorithms for state estimation incude observers, fiters, and moving horizon estimation, some of which are discussed in [12]. Of these methods, moving horizon estimation (MHE) is attractive for use with MPC because it expicity handes constraints and computes the optima current estimate of the state by soving an onine optimization probem over a fixed number of past measurements. Therefore, the computationa cost does not grow as more measurements become avaiabe. Noninear MPC and MHE are both discussed in [13]. A usefu overview of constrained noninear moving horizon state estimation is given in [14], and more recent resuts regarding stabiity of MHE can be found in [15]. D. A. Copp and J. P. Hespanha are with the Center for Contro, Dynamica Systems, and Computation, University of Caifornia, Santa Barbara, CA 93106 USA. dacopp@engr.ucsb.edu,hespanha@ece.ucsb.edu Thus far, resuts on the stabiity of output-feedback contro schemes based on MPC and MHE (especiay for noninear systems) are imited. Some joint stabiity resuts for state estimation and contro of inear systems are given in [16], but output-feedback is not considered. Resuts on robust outputfeedback MPC for constrained inear systems can be found in [17] using a state observer for estimation, and in [18] using MHE for estimation. Fewer resuts are avaiabe for noninear output-feedback MPC, athough notabe exceptions are [3, 19, 20]. Recent studies of input-to-state stabiity of -max MPC can be found in [21 23]; however, these references aso do not investigate the use of output-feedback. In this paper, we consider the output-feedback of noninear systems with uncertainty and disturbances and formuate the MPC probem as a -max optimization. In this formuation, a desired cost function is maximized over disturbance and noise variabes and imized over contro input variabes. In this way, we can sove both the MPC and MHE probems using a singe -max optimization, which gives us an optima contro input sequence at each samping time for a worst-case estimate of the current state. For both infinite-horizon and finite-horizon optimizations, we show that the state remains bounded under the proposed feedback contro aw. We aso show that the tracking error in trajectory tracking probems is bounded in the presence of measurement noise and input disturbances. The main assumption for these resuts is that a saddepoint soution exists for the -max optimization that is soved at each samping time. This assumption is a common requirement in game theoretica approaches to contro design [24] and presumes observabiity and controabiity of the cosed-oop system. For the finite-horizon case, we additionay require that the dynamics are reversibe and that there exists a tera cost that is an ISS-contro Lyapunov function with respect to a disturbance input. The paper is organized as foows. In Section II, we formuate the contro probem we woud ike to sove and discuss its reationship to MPC and MHE. In Section III, we state the main cosed-oop stabiity resuts. Simuation resuts showing robust trajectory tracking in the presence of additive disturbances and measurement noise are presented in Section IV. Finay, we provide some concusions and directions for future research in Section V. II. PROBLEM FORMULATION We consider the contro of a time-varying noninear discrete-time process of the form x t`1 f t px t, u t, d t q, y t g t px t q ` n t, @t P Z ě0 (1)

with state x t taking vaues in a set X Ă R nx. The inputs to this system are the contro input u t that must be restricted to the set U Ă R nu, the unmeasured disturbance d t that is known to beong to the set D Ă R n d, and the measurement noise n t beonging to the set N Ă R nn. The signa y t, beonging to the set Y Ă R ny, denotes the measured output that is avaiabe for feedback. The contro objective is to seect the contro signa u t P U, @t P Z ě0 so as to imize a criterion of the form c t px t, u t, d t q η t pn t q ρ t pd t q, (2) t 0 t 0 t 0 for worst-case vaues of the unmeasured disturbance d t P D, @t P Z ě0 and the measurement noise n t P N, @t P Z ě0. The functions c t p q, η t p q, and ρ t p q in (2) are a assumed to take non-negative vaues. The negative sign in front of ρ t p q penaizes the maximizer for using arge vaues of d ř t. Boundedness of (2) by a constant γ guarantees that 8 t 0 c tpx t, u t, d t q ď γ ` ř8 t 0 η tpn t q ` ř8 t 0 ρ tpd t q. In what foows, we aow the functions η t p q and ρ t p q in the criterion (2) to take the vaue `8. This provides a convenient formaism to consider bounded disturbances and noise whie formay aowing n t and d t to take vaues in the whoe spaces R nn and R n d, respectivey. Specificay, considering extended-vaue extensions [25] of the form # # ρ t pd t q d t P D η t pn t q n t P N ρ t pd t q η t pn t q (3) 8 d t R D, 8 n t R N, with ρ t and η t bounded in D and N, respectivey, the imization of (2) with respect to the contro signa u t need not consider cases where d t and n t take vaues outside D and N, respectivey, as this woud directy ead to the cost 8 for any contro signa u t that keeps the positive term bounded. Remark 1: Whie the resuts presented here are genera, the reader is encouraged to consider the quadratic case c t px t, u t, d t q }x t } 2 ` }u t } 2, η t pn t q }n t } 2, ρ t pd t q }d t } 2 to gain intuition on the resuts. In this case, boundedness of (2) woud guarantee that the state x t and input u t are 2 provided that the disturbance d t and noise n t are aso 2. A. Infinite-Horizon Onine Optimization To overcome the conservativeness of an open-oop contro, we use onine optimization to generate the contro signas. Specificay, at each time t P Z ě0, we compute the contro u t so as to imize c s px s, u s, d s q η s pn s q ρ s pd s q (4) under worst-case assumptions on the unknown system s initia condition x 0, unmeasured disturbances d t, and measurement noise n t, subject to the constraints imposed by the system dynamics and the measurements y t coected up to the current time t. Since the goa is to optimize this cost at the current time t to compute the contro inputs at times s ě t, there is no point in penaizing the running cost c s px s, u s, d s q for past time instants s ă t, which expains the fact that the first summation in (4) starts at time t. There is aso no point in considering the vaues of future measurement noise at times s ą t, as they wi not affect choices made at time t, which expains the fact that the second summation in (4) stops at time t. However, we do need to consider a vaues for the unmeasured disturbance d s because past vaues affect the (unknown) current state x t, and future vaues affect the future vaues of the running cost. The foowing notation faciitates formaizing the contro aw proposed: Given a discrete-time signa z : Z ě0 Ñ R n and two times t 0, t P Z ě0 with t 0 ď t, we denote by z t0:t the sequence tz t0, z t0`1,..., z t u. Given a contro input sequence u t0:t1 and a disturbance input sequence d t0:t1, we denote by 1 ϕpt; t 0, x 0, u t0:t1, d t0:t1q the state x t of the system (1) at time t for the given inputs and initia condition x t0 x 0. In addition, to faciitate expressing the corresponding output and running cost, we define gϕpt; t 0, x 0, u t0:t1, d t0:t1q cϕpt; t 0, x 0, u t0:t, d t0:tq g t`ϕpt; t0, x 0, u t0:t1, d t0:t1q, c t`ϕpt; t0, x 0, u t0:t1, d t0:t1q, u t, d t. This notation aows us to re-write (4) as J 8 t px 0, u 0:8, d 0:8, y 0:t q cϕps; 0, x 0, u 0:s, d 0:s q η s`ys gϕps; 0, x 0, u 0:s1, d 0:s1 q ρ s pd s q, which emphasizes the dependence of (4) on the unknown initia state x 0, the unknown disturbance input sequence d 0:8, the measured output sequence y 0:t, and the contro input sequence u 0:8. Regarding the atter, one shoud recognize that u 0:8 is composed of two distinct sequences: the (known) past inputs u 0:t1 that have aready been appied, and the future inputs u t:8 that sti need to be seected. At a given time t P Z ě0, we do not know the vaue of the variabes x 0 and d 0:8 on which the vaue of criterion (5) depends, so we optimize this criterion under worst-case assumptions on these variabes, eading to the foowing max optimization û t:8 t PU (5) max J ˆx 0 t PX, ˆd t 8 pˆx 0 t, u 0:t1, û t:8 t, ˆd 0:8 t, y 0:t q, 0:8 t PD (6) where the arguments u 0:t1, û t:8 t to the function Jt 8 p q in (6) correspond to the argument u 0:8 in the definition of Jt 8 p q in the eft-hand side of (5). The subscript t in the (dummy) optimization variabes in (6) emphasizes that this optimization is repeated at each time step t P Z ě0. At different time steps, these optimizations typicay ead 1 When t t 0, it is understood that we drop a terms that depend on previous vaues of t, i.e., we write ϕpt 0 ; t 0, x 0 q.

to different soutions, which generay do not coincide with the rea contro input, disturbances, and noise. We can view the optimization variabes ˆx 0 t and ˆd 0:8 t as (worst-case) estimates of the initia state and disturbances, respectivey, based on the past inputs u 0:t1 and outputs y 0:t avaiabe at time t. Inspired by mode predictive contro, at each time t, we use as the contro input the first eement of the sequence û t:8 t tû t t, û t`1 t, û t`2 t,... u P U that imizes (6), eading to the foowing contro aw: B. Finite-Horizon Onine Optimization u t û t t, @t ě 0. (7) To avoid soving the infinite-dimensiona optimization in (6) that resuted from the infinite-horizon criterion (4), we aso consider a finite-horizon version of the criterion (4) of the form c s px s, u s, d s q ` q t`t px t`t q η s pn s q ρ s pd s q, (8) where now the optimization criterion ony contains T P Z ě1 terms of the running cost c s px s, u s, d s q, which recede as the current time t advances. The optimization criterion aso ony contains L`1 P Z ě1 terms of the measurement cost η s pn s q. Specificay, the summations in the criterion evauated at time t, which in (5) started at time 0 and went up to time `8, now start at time t L and ony go up to time t ` T 1. We aso added a tera cost q t`t px t`t q to penaize the fina state at time t ` T. Defining qϕpt; t L, x tl, u tl:t1, d tl:t1 q q t`ϕpt; t L, xtl, u tl:t1, d tl:t1 q, the cost (8) eads to the foowing finite-dimensiona optimization max ˆx tl t PX, ˆd tl:t`t 1 t PD J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:t q, (9) where J t px tl, u tl:t`t 1, d tl:t`t 1, y tl:t q cϕps; t L, x tl, u tl:s, d tl:s q ` qϕpt ` T ; t L, x tl, u tl:t`t 1, d tl:t`t 1 q η s`ys gϕps; t L, x tl, u tl:s1, d tl:s1 q ÿ t`t 1 ρ s pd s q. (10) In this formuation, we sti use a contro aw of the form (7), but now û t t denotes the first eement of the sequence that imizes (9). û t:t`t 1 t C. Reationship with Mode Predictive Contro When the state of (1) can be measured exacty and the maps d t ÞÑ f t px t, u t, d t q are injective (for each fixed x t and u t ), the initia state x tl and past vaues for the disturbance d tl:t1 are uniquey defined by the measurements x tl:t. In this case, the contro aw (7) that imizes (9) can aso be detered by the optimization with max ˆd t:t`t 1 t PD J t px tl, u tl:t1, û t:t`t 1 t, d tl:t1, ˆd t:t`t 1 t q, J t px t, u t:t`t 1, d t:t`t 1 q ` qϕpt ` T ; t, x t, u t:t`t 1, d t:t`t 1 q cϕps; t, x t, u t:s, d t:s q ρ s pd s q, which is essentiay the robust mode predictive contro probem with tera cost considered in [10, 26]. Remark 2 (Economic MPC): It is worth noting that our framework is more genera than standard forms of MPC where the ima cost is achieved at the optima feasibe state and input in order to ensure stabiity of the desired state. It can aso appy to economic MPC where the operating cost of the pant is used directy in the objective function, and therefore the cost need not be zero or ima at the optima state and input [27]. D. Reationship with Moving-Horizon Estimation When setting both c s p q and q t`t p q equa to zero in the criterion (10), this optimization no onger depends on u t:t`t 1 and d t:t`t 1, so the optimization in (9) simpy becomes max J t pˆx tl t, u tl:t1, ˆd tl:t1 t, y tl:t q, ˆx tl t PX, ˆd tl:t1 t PD where now the optimization criterion ony contains a finite number of terms that recede as the current time t advances: J t px tl, u tl:t1, d tl:t1, y tl:t q t1 ÿ ρ s pd s q η s`ys gϕps; t L, x tl, u tl:s1, d tl:s1 q, which is essentiay the moving horizon estimation probem considered in [14, 15].

III. MAIN RESULTS We now show that both for the infinite-horizon and finitehorizon cases introduced in Sections II-A and II-B, respectivey, the contro aw (7) eads to boundedness of the state of the cosed-oop system under appropriate assumptions, which we discuss next. A necessary condition for the impementation of the contro aw (7) is that the outer imizations in (6) or (9) ead to finite vaues for the optima that are achieved at specific sequences û t:8 t P U, t P Z ě0. However, for the stabiity resuts in this section we actuay ask for the existence of a sadde-point soution to the -max optimizations in (6) or (9), which is a common requirement in game theoretica approaches to contro design [24]: Assumption 1 (Sadde-point): The -max optimization (9) aways has a sadde-point soution for which the and max commute. Specificay, for every time t P Z ě0, past contro input sequence u tl:t1 P U, and past measured output sequence y tl:t P Y, there exists a finite scaar J t pu tl:t1, y tl:t q P R, an initia condition ˆx and sequences û t:t`t 1 t P U, ˆd tl:t`t 1 t J t pu tl:t1, y tl:t q tl t P X, P D such that J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:tq max ˆx tl t PX, ˆd tl:t`t 1 t PD J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:t q max ˆx tl t PX, ˆd tl:t`t 1 t PD J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:t q (11a) max ˆx tl t PX, ˆd tl:t`t 1 t PD J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:t q J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:tq (11b) ă 8. (11c) For the infinite-horizon case (6), the integer T in (11a) (11b) shoud be repaced by 8, and the integer t L shoud be repaced by 0. Assumption 1 presumes an appropriate form of observabiity/detectabiity ř adapted to the criterion t`t 1 c s px s, u s, d s q because (11a) impies that, for every initia condition ˆx tl t P X and disturbance sequence ˆd tl:t`t 1 t P D, cϕpt; t L, ˆx tl t, u tl:t1, û t t, ˆd tl:t t q ď J t pu 0:t1, y 0:t q ` ρ s p ˆd s t q ` η s`ysgϕps; tl, ˆx tl t, u tl:s1, ˆd tl:s1 t q. (12) This means that we can bound the size of the current state using past outputs and past/future input disturbances. For the infinite-horizon case, Assumption 1 aso presumes an appropriate form of controabiity/stabiizabiity adapted to the criterion ř 8 c spx s, u s, d s q because (11a) impies that the future contro sequence û t:8 t P U is abe to keep sma the size of future states as ong as the noise and disturbance remain sma. For the finite-horizon case, a subsequent assumption is needed to ensure controabiity. A. Infinite-Horizon Onine Optimization The foowing theorem is the main resut of this section and provides a bound that can be used to prove boundedness of the state when the contro signa is constructed using the infinite-horizon criterion (5). Theorem 1 (Infinite-horizon cost-to-go bound): Suppose that Assumption 1 hods. Then, for every t P Z ě0, the trajectories of the process (1) with contro (7) defined by the infinite-horizon optimization (6) satisfy cϕpt; 0, x 0, u 0:t, d 0:t q ď J 8 0 py 0 q ` η s pn s q ` ρ s pd s q. (13) Next we discuss the impications of Theorem 1 in terms of estabishing bounds on the state of the cosed-oop system, asymptotic stabiity, and the abiity of the cosed-oop to asymptoticay track desired trajectories. 1) State boundedness and asymptotic stabiity: When we seect criterion (5), for which there exists a cass K 8 function αp q and cass K functions 2 βp q, δp q such that c t px, u, dq ě αp}x}q, η t pnq ď βp}n}q, ρ t pdq ď δp}d}q, @x P R nx, u P R nu, d P R n d, n P R nn, we concude from (13) that, aong trajectories of the cosedoop system, the foowing inequaity hods for a t P Z ě0 : αp}x t }q ď J 8 0 py 0 q ` βp}n s }q ` δp}d s }q. (14) This provides a bound on the state provided that the noise and disturbances are vanishing, in the sense that βp}n s }q ă 8, δp}d s }q ă 8. Theorem 1 aso provides bounds on the state for nonvanishing noise and disturbances when we use exponentiay time-weighted functions c t p q, η t p q, and ρ t p q that satisfy c t px, u, dq ě λt αp}x}q, 2 A function α : R ě0 Ñ R ě0 is said to beong to cass K if it is continuous, zero at zero, and stricty increasing and is said to beong to cass K 8 if it beongs to cass K and is unbounded.

η t pnq ď λt βp}n}q, ρ t pdq ď λt δp}d}q, (15) for a x P R nx, u P R nu, d P R n d, n P R nn and some λ P p0, 1q. In this case we concude from (13) that for a t P Z ě0, αp}x t }q ď λ t J0 8 py 0 q ` λ ts βp}n s }q ` λ ts δp}d s }q. Therefore, x t remains bounded provided that the measurement noise n t and the unmeasured disturbance d t are both uniformy bounded. Moreover, }x t } converges to zero as t Ñ 8, when the noise and disturbances vanish asymptoticay. We have proved the foowing: Coroary 1: Suppose that Assumption 1 hods and aso that (15) hods for a cass K 8 function αp q, cass K functions βp q, δp q, and λ P p0, 1q. Then, for every initia condition x 0, uniformy bounded measurement noise sequence n 0:8, and uniformy bounded disturbance sequence d 0:8, the state x t remains uniformy bounded aong the trajectories of the process (1) with contro (7) defined by the infinite-horizon optimization (6). Moreover, when d t and n t converge to zero as t Ñ 8, the state x t aso converges to zero. Remark 3 (Time-weighted criteria): The exponentiay time-weighted functions (15) typicay arise from a criterion of the form λs cpx s, u s, d s q λs ηpn s q λs ρpd s q that weight the future more than the past. In this case, (15) hods for functions αp q, βp q, and δp q such that cpx, u, dq ě αp}x}q, ηpnq ď βp}n}q, and ρpdq ď δp}d}q, @x, u, d, n. 2) Reference tracking: When the contro objective is for the state x t to foow a given trajectory z t, the optimization criterion can be seected of the form λs cpx s z s, u s, d s q λs ηpn s q λs ρpd s q, with cpx, u, dq ě αp}x}q, @x, u, d for some cass K 8 function α and λ P p0, 1q. In this case, we concude from (13) that, for a t P Z ě0, αp}x tz t }q ď λ t J 8 0 py 0 q` λ ts ηpn s q` λ ts ρpd s q, which aows us to concude that x t converges to z t as t Ñ 8 when both n t and d t are vanishing sequences, and aso that, when these sequences are utimatey sma, the tracking error x t z t wi converge to a sma vaue. B. Finite-Horizon Onine Optimization To estabish state boundedness under the contro (7) defined by the finite-horizon optimization criterion (10), one needs additiona assumptions regarding the dynamics and the tera cost q t p q. Assumption 2 (Reversibe Dynamics): For every t P Z ě0, x t`1 P X, and u t P U, there exists a state x t P X and a disturbance d t P D such that x t`1 f t p x t, u t, d t q. (16) Assumption 3 (ISS-contro Lyapunov function): The tera cost q t p q is an ISS-contro Lyapunov function, in the sense that, for every t P Z ě0, x P X, there exists a contro u P U such that for a d P D, q t`1`ft px, u, dq q t pxq ď c t px, u, dq ` ρ t pdq. (17) Assumption 2 is very mid and essentiay means that the sets of disturbances D and past states X are sucienty rich to aow for a jump to any future state in X. Assumption 3 pays the roe of a common assumption in mode predictive contro, namey that the tera cost must be a contro Lyapunov function for the cosed-oop [2]. In the absence of the disturbance d t, (17) woud mean that q t p q coud be viewed as a contro Lyapunov function that decreases aong system trajectories for an appropriate contro input u t [28]. With disturbances, q t p q needs to be viewed as an ISS-contro Lyapunov function that satisfies an ISS stabiity condition for the disturbance input d t and an appropriate contro input u t [29]. Remark 4: When the dynamics are inear, for instance, Assumption 2 is satisfied if the state-space A matrix has no eigenvaues at the origin (e.g., if it resuts from the time-discretization of a continous-time system). When, the dynamics are inear and the cost function is quadratic, a tera cost q t p q satisfying Assumption 3 is typicay found by soving a system of inear matrix inequaities. We are now ready to state the finite-horizon counter-part to Theorem 1. Theorem 2 (Finite-horizon cost-to-go bound): Suppose that Assumptions 1, 2, and 3 hod. Then, for every t P Z ě0, there exist vectors d s P R n d, ñ s P R nn, for which η s pñ s q ă 8, ρ s p d s q ă 8, @s P t0, 1,..., t L 1u, and the trajectories of the process (1) with contro (7) defined by the finite-horizon optimization (9) satisfy tl1 ÿ cϕpt; tl, x tl, u tl:t, d tl:t q ď J 0 py 0 q` η s pñ s q ÿ tl1 ` ρ s p d s q ` η s pn s q ` ρ s pd s q. (18) The terms ř tl1 η s pñ s q ` řtl1 ρ s p d s q in the righthand side of (18) can be thought of as the arriva cost that appears in the MHE iterature to capture the quaity of the estimate at the beginning of the current estimation window [14]. For the extended-vaue extensions in (3), these terms can be bounded by tl1 ÿ sup n spn tl1 ÿ η s pn s q ` sup d spd ρ s pd s q.

Since (13) and (18) provide neary identica bounds, the discussion presented after Theorem 1 regarding state boundedness and reference tracking appies aso to the finitehorizon case, so we do not repeat it here. Proofs of these resuts can be found in the technica report [30]. IV. VALIDATION THROUGH SIMULATION To impement the contro aw (7) we need to find the contro sequence û t:t`t 1 t P U that achieves the outer imizations in (9). In view of Assumption 1, the desired contro sequence must be part of the sadde-point defined by (11a) (11b). It turns out that, from the perspective of numericay computing this sadde-point, it is more convenient to use the foowing equivaent characterization of the sadde-point: J t pu tl:t1, y tl:t q ˆx tl t PX, ˆd tl:t`t 1 t PD J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:t q, (19a) J t pu tl:t1, y tl:t q J t pˆx tl t, u tl:t1, û t:t`t 1 t, ˆd tl:t`t 1 t, y tl:tq (19b) (see, e.g., [24]). We introduce the sign in (19a) simpy to obtain two imizations, instead of a maximization and one imization, which somewhat simpifies the presentation. Since the process dynamics (1) has a unique soution for any given initia condition, contro input, and unmeasured disturbance, the couped optimizations in (19) can be rewritten as J t pu tl:t1, y tl:t q where p ˆd tl:t`t 1 t, x tl:t`t t qp Dru tl:t1,û t:t`t 1 s ÿ t`t 1 ` J t pu tl:t1, y tl:t q c s p x s, û s, ˆd s q q t`t p x t`t q η s`ys g s p x s q ` ρ s p ˆd s q, (20a) pû t:t`t 1 t, x tl`1:t`t t qpūr x tl, ˆd tl:t`t 1 s c s p x s, û s, ˆd s q ` q t`t p x t`t q Dru tl:t1, û t:t`t 1s η s`ys g s p x s q ρ s p ˆd s q, (20b) ˆd tl:t`t 1 t P D, x tl:t`t t P X,! p ˆd tl:t`t 1 t, x tl:t`t t q : x s`1 f s p x s, u s, ˆd s q, @s P tt L,..., t 1u, x s`1 f s p x s, û s, ˆd ) s q, @s P tt,..., t ` T 1u, (21a)! pû t:t`t 1 t, x tl`1:t`t t q : Ūr x tl, ˆd tl:t`t 1s û t:t`t 1 t P U, x tl`1:t`t t P X, x tl`1 f s p x tl, u tl, ˆd tlq, x s`1 f s p x s, u s, ˆd s q, @s P tt L ` 1,..., t 1u, ) x s`1 f s p x s, û s, ˆd s q, @s P tt,..., t ` T 1u. (21b) Essentiay, in each of the optimizations in (19) we introduced the vaues of the state as additiona optimization variabes that are constrained by the system dynamics. Whie this introduces additiona optimization variabes, it avoids the need to expicity evauate the soution ϕpt; t L, x tl, u tl:t1, d tl:t1 q that appears in the origina optimizations (19) and that can be numericay poory conditioned, e.g., for systems with unstabe dynamics. Whie soving either (19) or (20) gives the same resuts, we choose to sove the atter because it generay eads to simper optimization probems. An ecient prima-dua-ike interiorpoint method was deveoped in order to numericay compute soutions to this -max optimization. A description of this agorithm is of interest on its own and can be found in [30]. The foowing exampe uses this approach. Exampe 1 (Fexibe beam): Consider a singe-ink fexibe beam ike the one described in [31], where the contro objective is to reguate the mass on the tip of the beam to a desired reference trajectory. The contro input is the appied torque at the base, and the outputs are the tip s position, the ange at the base, the anguar veocity of the base, and a strain gauge measurement coected around the midde of the beam, respectivey. An approximate inearized discrete-time state-space mode of the dynamics, with a samping time T s 1 second, is given by x t`1 Ax t ` Bpu t ` d t q, y t Cx t ` n t, where d t is a disturbance, n t is measurement noise, and the system matrices are given by» fi A 1.0 1.016 0.676 1.084 1.0 0.585 0.233 0.032 0 0.665 1.241 1.783 0 0.042 0.288 0.023 0 0.009 0.439 0.143 0 0.002 0.012 0.007 0 0.001 0.014 0.308 0 0.000 0.001 0.001 0 1.264 37.070 10.581 1.0 1.016 0.676 1.084 0 2.109 59.920 16.883 0 0.665 1.241 1.783 0 0.413 9.156 3.695 0 0.009 0.439 0.143 0 0.012 0.371 3.929 0 0.001 0.014 0.308 B r0.800 0.797 0.003 0.001 1.327 1.163 0.197 0.006s 1, ff C «1.13 0.7225 0.2028 0.1220 0 0 0 0 1.0 0 0 0 0 0 0 0 0 0 0 0 1.0 0 0 0 0 0.9282 12.001 35.294 0 0 0 0 f,. (22) This matrix A has a doube eigenvaue at 1 with a singe independent eigenvector. Therefore this is an unstabe system. The optima contro input is found by soving the foowing optimization probem u t:t`t 1 t PU max }x t:t`t r t:t`t } 2 x tl PX,d tl:t`t 1 t PD ` λ u }u t:t`t 1 } 2 λ d }d tl:t`t 1 } 2 λ n }n tl:t } 2, where } } is the Eucidean norm, U tu t P R u max ď u t ď u max u, X R 8, and D td t P R d max ď d t ď d max u. The numerica computation of soutions to the -max optimization was performed using the agorithm described in [30].

1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 0 10 20 30 40 50 60 70 80 t Fig. 1. Simuation resuts of Fexibe Beam exampe. The reference is in red, the measured output in bue, the contro sequence in green, and the disturbance sequence in magenta. The resuts depicted in Figure 1 show the response of the cosed oop system under the contro aw (7) when our goa is to reguate the mass at the tip of the beam to a desired reference rptq α sgnpsinpωtqq with α 0.5 and ω 0.1. The other parameters in the optimization have vaues λ u 1, λ d 2, λ n 100, L 5, T 5, u max 1, d max 1. The state of the system starts cose to zero and evoves with zero contro input and sma random disturbance input unti time t 6, at which time the optima contro input (7) started to be appied aong with the optima worst-case disturbance d t t obtained from the -max optimization. The noise process n t was seected to be a zero-mean Gaussian independent and identicay distributed random process with standard deviation of 0.01. V. CONCLUSIONS We presented an output-feedback approach to noninear mode predictive contro using moving horizon state estimation. Soutions to the combined contro and state estimation probems were found by soving a singe -max optimization probem. Under the assumption that a saddepoint soution exists (which presumes appropriate forms of observabiity and controabiity), Theorem 1 gives bounds on the state of the system and the tracking error for reference tracking probems. Simiar resuts are given in Theorem 2 for the finite-horizon case under the additiona assumptions of reversibe dynamics and a tera cost that is an ISScontro Lyapunov function with respect to the disturbance input. Directions for future work incude investigating under what conditions, such as controabiity and observabiity, a sadde-point soution exists. Resuts specific to certain types of uncertainty, noise, and disturbances, such as mode uncertainty, may aso be investigated. REFERENCES [1] M. Morari and J. H Lee, Mode predictive contro: past, present and future, Computers & Chemica Engineering, vo. 23, no. 4, pp. 667 682, 1999. [2] D. Q. Mayne, J. B. Rawings, C. V. Rao, and P. O. Scokaert, Constrained mode predictive contro: Stabiity and optimaity, Automatica, vo. 36, no. 6, pp. 789 814, 2000. u* d* y ref [3] J. B. Rawings and D. Q. Mayne, Mode Predictive Contro: Theory and Design. Nob Hi Pubishing, 2009. [4] P. J. Campo and M. Morari, Robust mode predictive contro, in American Contro Conference, 1987, pp. 1021 1026, IEEE, 1987. [5] J. a. Lee and Z. Yu, Worst-case formuations of mode predictive contro for systems with bounded parameters, Automatica, vo. 33, no. 5, pp. 763 781, 1997. [6] A. Bemporad and M. Morari, Robust mode predictive contro: A survey, in Robustness in identification and contro, pp. 207 226, Springer, 1999. [7] L. Magni, G. De Nicoao, R. Scattoini, and F. Agöwer, Robust mode predictive contro for noninear discrete-time systems, Internationa Journa of Robust and Noninear Contro, vo. 13, no. 3-4, pp. 229 246, 2003. [8] P. Scokaert and D. Mayne, Min-max feedback mode predictive contro for constrained inear systems, Automatic Contro, IEEE Transactions on, vo. 43, no. 8, pp. 1136 1142, 1998. [9] A. Bemporad, F. Borrei, and M. Morari, Min-max contro of constrained uncertain discrete-time inear systems, Automatic Contro, IEEE Transactions on, vo. 48, no. 9, pp. 1600 1606, 2003. [10] H. Chen, C. Scherer, and F. Agöwer, A game theoretic approach to noninear robust receding horizon contro of constrained systems, in American Contro Conference, 1997. Proceedings of the 1997, vo. 5, pp. 3073 3077, IEEE, 1997. [11] G. Grimm, M. J. Messina, S. E. Tuna, and A. R. Tee, Noay robust mode predictive contro with state constraints, Automatic Contro, IEEE Transactions on, vo. 52, no. 10, pp. 1856 1870, 2007. [12] J. B. Rawings and B. R. Bakshi, Partice fitering and moving horizon estimation, Computers & chemica engineering, vo. 30, no. 10, pp. 1529 1541, 2006. [13] F. Agöwer, T. A. Badgwe, J. S. Qin, J. B. Rawings, and S. J. Wright, Noninear predictive contro and moving horizon estimation-an introductory overview, in Advances in contro, pp. 391 449, Springer, 1999. [14] C. V. Rao, J. B. Rawings, and D. Q. Mayne, Constrained state estimation for noninear discrete-time systems: Stabiity and moving horizon approximations, Automatic Contro, IEEE Transactions on, vo. 48, no. 2, pp. 246 258, 2003. [15] A. Aessandri, M. Bagietto, and G. Battistei, Moving-horizon state estimation for noninear discrete-time systems: New stabiity resuts and approximation schemes, Automatica, vo. 44, no. 7, pp. 1753 1765, 2008. [16] J. Löfberg, Towards joint state estimation and contro in imax MPC, in Proceedings of 15th IFAC Word Congress, Barceona, Spain, 2002. [17] D. Q. Mayne, S. Raković, R. Findeisen, and F. Agöwer, Robust output feedback mode predictive contro of constrained inear systems: Time varying case, Automatica, vo. 45, no. 9, pp. 2082 2087, 2009. [18] D. Sui, L. Feng, and M. Hovd, Robust output feedback mode predictive contro for inear systems via moving horizon estimation, in American Contro Conference, 2008, pp. 453 458, IEEE, 2008. [19] R. Findeisen, L. Imsand, F. Agöwer, and B. A. Foss, State and output feedback noninear mode predictive contro: An overview, European Journa of Contro, vo. 9, no. 2, pp. 190 206, 2003. [20] L. Imsand, R. Findeisen, E. Buinger, F. Agöwer, and B. A. Foss, A note on stabiity, robustness and performance of output feedback noninear mode predictive contro, Journa of Process Contro, vo. 13, no. 7, pp. 633 644, 2003. [21] M. Lazar, D. Muñoz de a Peña, W. Heemes, and T. Aamo, On input-to-state stabiity of max noninear mode predictive contro, Systems & Contro Letters, vo. 57, no. 1, pp. 39 48, 2008. [22] D. Limon, T. Aamo, D. Raimondo, D. M. de a Pena, J. Bravo, A. Ferramosca, and E. Camacho, Input-to-state stabiity: a unifying framework for robust mode predictive contro, in Noninear mode predictive contro, pp. 1 26, Springer, 2009. [23] D. M. Raimondo, D. Limon, M. Lazar, L. Magni, and E. Camacho, Min-max mode predictive contro of noninear systems: A unifying overview on stabiity, European Journa of Contro, vo. 15, no. 1, pp. 5 21, 2009. [24] T. Başar and G. J. Osder, Dynamic Noncooperative Game Theory. London: Academic Press, 1995. [25] S. P. Boyd and L. Vandenberghe, Convex optimization. Cambridge university press, 2004. [26] L. Magni and R. Scattoini, Robustness and robust design of MPC for noninear discrete-time systems, in Assessment and future directions of noninear mode predictive contro, pp. 239 254, Springer, 2007. [27] J. B. Rawings, D. Angei, and C. N. Bates, Fundamentas of economic mode predictive contro, in Decision and Contro (CDC), 2012 IEEE 51st Annua Conference on, pp. 3851 3861, IEEE, 2012. [28] E. D. Sontag, Contro-yapunov functions, in Open probems in mathematica systems and contro theory, pp. 211 216, Springer, 1999. [29] D. Liberzon, E. D. Sontag, and Y. Wang, Universa construction of feedback aws achieving ISS and integra-iss disturbance attenuation, Systems & Contro Letters, vo. 46, no. 2, pp. 111 127, 2002. [30] D. A. Copp and J. P. Hespanha, Noninear output-feedback mode predictive contro with moving horizon estimation, tech. rep., Univ. Caifornia, Santa Barbara, 2014. http://www.ece.ucsb.edu/ hespanha/techrep.htm. [31] E. Schmitz, Experiments on the End-Point Position Contro of a Very Fexibe One-Link Manipuator. PhD thesis, Stanford University, 1985.