Outline. 1 Full information estimation. 2 Moving horizon estimation - zero prior weighting. 3 Moving horizon estimation - nonzero prior weighting

Outline Moving Horizon Estimation MHE James B. Rawlings Department of Chemical and Biological Engineering University of Wisconsin Madison SADCO Summer School and Workshop on Optimal and Model Predictive Control Universit of Bayreuth Bayreuth, Germany September 9, 2013 1 Full information estimation 2 Moving horizon estimation - zero prior weighting 3 Moving horizon estimation - nonzero prior weighting 4 Moving horizon estimation - constrained estimation 5 Moving horizon estimation - smoothing and filtering update SADCO 2013 MHE state estimation 1 / 65 SADCO 2013 MHE state estimation 2 / 65 State Estimation Full Information Estimation Now turn to the general problem of estimating the state of a noisy dynamic system given noisy measurements: x + = f x, w y = hx + v 1 in which the process disturbance, w, measurement disturbance, v, and system initial state, x0, are independent random variables with stationary probability densities. Full information estimation will prove to have the best theoretical properties in terms of stability and optimality. Unfortunately, it will also prove to be computationally intractable except for the simplest cases, such as a linear system model. One method for practical estimator design therefore is to come as close as possible to the properties of full information estimation while maintaining a tractable online computation MHE. SADCO 2013 MHE state estimation 3 / 65 SADCO 2013 MHE state estimation 4 / 65

Variables Notation required to distinguish the system variables from the estimator variables: System Decision Optimal variable variable decision state x χ ˆx process disturbance w ω ŵ measured output y η ŷ measurement disturbance v ν ˆv The relationships between these variables: x + = f x, w y = hx + v χ + = f χ, ω y = hχ + ν η = hχ ˆx + = f ˆx, ŵ y = hˆx + ˆv ŷ = hˆx Full Information Estimation The full information objective function is subject to T 1 V T χ0, ω = l x χ0 x 0 + χ + = f χ, ω i=0 y = hχ + ν l i ωi, νi 2 in which T is the current time, yi is the measurement at time i, and x 0 is the prior information on the initial state. The full information estimator is then defined as the solution to min V T χ0, ω 3 χ0,ω SADCO 2013 MHE state estimation 5 / 65 SADCO 2013 MHE state estimation 6 / 65 i-ioss What class of systems have a stable state estimator? Assume system observability? Too restrictive for even linear systems recall the definition of detectability. We need a similar detectability definition for nonlinear systems i-ioss: Definition 1 i-ioss The system x + = f x, w, y = hx is incrementally input/output-to-state stable i-ioss if there exists some β KL and γ 1, γ 2 K such that for every two initial states z 1 and z 2, and any two disturbance sequences w 1 and w 2 xk; z 1, w 1 xk; z 2, w 2 β z 1 z 2, k+ γ 1 w1 w 2 0:k 1 + γ2 yz1,w 1 y z2,w 2 0:k SADCO 2013 MHE state estimation 7 / 65 Convergent states under i-ioss The notation xk; x 0, w denotes the solution to x + = f x, w satisfying initial condition x0 = x 0 with disturbance sequence w = {w0, w1,...}. The notation xk; x 1, k 1, w denotes the solution to x + = f x, w satisfying the condition xk 1 = x 1 with disturbance sequence w = {w0, w1,...}. One of the most important and useful implications of the i-ioss property: Proposition 2 Convergence of state under i-ioss If system x + = f x, w, y = hx is i-ioss, w 1 k w 2 k and y 1 k y 2 k as k, then xk; z 1, w 1 xk; z 2, w 2 for all z 1, z 2 The proof of this proposition is discussed in Exercise 4.3. SADCO 2013 MHE state estimation 8 / 65

Convergent disturbances under i-ioss The class of disturbances w, v affecting the system is defined as: Assumption 3 Convergent disturbances The sequence wk, vk for k I 0 are bounded and converge to zero as k. Remark 4 Summable disturbances If the disturbances satisfy Assumption 3, then there exists a K-function γ w such that the disturbances are summable γ w wi, vi i=0 is bounded See Sontag 1998, Proposition 7 for a statement and proof of this result. Stage Cost Given such class of disturbances, the estimator stage cost is chosen to satisfy the following property: Assumption 5 Positive definite stage cost The initial state cost and stage costs are continuous functions and satisfy the following inequalities for all x R n, w R g, and v in R p γ x x l x x γ x x 4 γ w w, v l i w, v γ w w, v i 0 5 in which γ x, γ w, γ x, γ w K and γ w is defined in Remark 4. Notice that if we change the class of disturbances affecting the system, we may also have to change the stage cost in the state estimator to satisfy l i w, v γ w w, v in 5. SADCO 2013 MHE state estimation 9 / 65 SADCO 2013 MHE state estimation 10 / 65 Stability Global Asymptotic Stability Zero error system First consider the zero estimate error solution for all k 0 initial state is equal to the estimator s prior and zero disturbances. In this case, the optimal solution is: ˆx0 T = x 0 ŵi T = 0 for all 0 i T, T 1 hˆxi T = yi for all 0 i T, T 1 The perturbation to this solution are: the system s initial state distance from x 0, and the process and measurement disturbances. We next define stability properties so that: asymptotic stability considers the case x 0 x 0 with zero disturbances. robust stability considers the case in which wi, vi 0. Definition 6 Global asymptotic stability The estimate is based on the noise-free measurement y = hxx 0, 0. The estimate is nominally globally asymptotically stable GAS if there exists a KL-function β such that for all x 0, x 0 and k I 0 xk; x 0, 0 ˆxk β x 0 x 0, k The standard definition of estimator stability for linear systems is consistent with Definition 6. SADCO 2013 MHE state estimation 11 / 65 SADCO 2013 MHE state estimation 12 / 65

Robust Global Asymptotic Stability Robust GAS of full information estimates Definition 7 Robust global asymptotic stability The estimate is based on the noisy measurement y = hxx 0, w + v. The estimate is robustly GAS if for all x 0 and x 0, and w, v satisfying Assumption 3, the following hold. 1 The estimate converges to the state; as k ˆxk xk; x 0, w 2 For every ε > 0 there exists δ > 0 such that γ x x 0 x 0 + γ w wi, vi δ 6 i=0 The first part of the definition ensures that converging disturbances lead to converging estimates. The second part provides a bound on the transient estimate error given a bound on the disturbances. Note that robust GAS implies GAS see also Exercise 4.9. With the pieces in place, we can state the main result of this section: Theorem 8 Robust GAS of full information estimates Given an i-ioss detectable system and measurement sequence generated by 1 with disturbances satisfying Assumption 3, then the full information estimate with stage cost satisfying Assumption 5 is robustly GAS. implies xk; x 0, w ˆxk ε for all k I 0. SADCO 2013 MHE state estimation 13 / 65 SADCO 2013 MHE state estimation 14 / 65 Robust GAS of full information estimates Proof: a First we establish that the full information cost is bounded for all T 1 including T =. Consider a candidate set of decision variables χ0 = x 0 ωi = wi 0 i T 1 The full information cost for this choice is T 1 V T χ0, ω = l x x 0 x 0 + l i wi, vi From Remark 4, the sum is bounded for all T including the limit T =. Therefore, let V be an upper bound for the right-hand side. The optimal cost exists for all T 0 because V T is a continuous function and goes to infinity as any of its arguments goes to infinity due to the lower bounds in Assumption 5. i=0 SADCO 2013 MHE state estimation 15 / 65 Robust GAS of full information estimates Next we show that the optimal cost sequence converges: Evaluate the cost at time T 1 using the optimal solution from time T. We have that V T 1 ˆx0 T, ŵ T = V 0 T l T ŵt T, ˆvT T Optimization at time T 1 can only improve the cost giving V 0 T V 0 T 1 + l T ŵt T, ˆvT T and we see that the optimal sequence {VT 0 } is nondecreasing and bounded above by V. Therefore the sequence converges and the convergence implies l T ŵt T, ˆvT T 0 as T. The lower bound in 5 then gives that ˆvT = yt hˆxt T 0 and ŵt T 0 as T. SADCO 2013 MHE state estimation 16 / 65

Robust GAS of full information estimates Robust GAS of full information estimates Since the measurement satisfies y = hx + v, and vt converges to zero, we have that hxt hˆxt T 0 ŵt T 0 T Because the system is i-ioss, we have the following inequality for all x 0, ˆx0 k, w, ŵ k, and k 0, xk; x 0, w xk; ˆx0 k, ŵ k β x 0 ˆx0 k, k+ γ 1 w ŵk 0:k 1 + γ2 hx h xk 0:k 7 From Proposition 2 we conclude that xk; x 0, w xk; ˆx0 k, ŵ k converges to zero. Since the state estimate is ˆxk := xk; ˆx0 k, ŵ k and the state is xk = xk; x 0, w, we have that ˆxk xk k and the estimate converges to the system state. This establishes part 1 of the robust GAS definition. 1 Since wk converges to zero, wk ŵk converges to zero, and hxk hˆxk converges to zero. SADCO 2013 MHE state estimation 17 / 65 1 It is not difficult to extend this argument to conclude ˆxi k xi; x 0, w as k for k N i k and any finite N 0. SADCO 2013 MHE state estimation 18 / 65 Robust GAS of full information estimates b Assume that 6 holds for some arbitrary δ > 0. This gives immediately an upper bound on the optimal full information cost function for all T, 0 T, i.e, V = δ. We then have the following bounds on the initial state estimate for all k 0, and the initial state γ x ˆx0 k x 0 δ γ x x 0 x 0 δ These two imply a bound on the initial estimate error, x 0 ˆx0 k γ 1 δ + γ 1 x x δ. The process disturbance bounds are for all k 0, 0 i k γ w ŵi k δ and we have that wi ŵi k γ 1 w γ w wk δ δ + γ 1 w δ. SADCO 2013 MHE state estimation 19 / 65 Robust GAS of full information estimates A similar argument gives for the measurement disturbance vi ˆvi k γ 1 δ + γ 1 w w δ. Since vi ˆvi k = hxi hˆxi k, we have that hxi hˆxi k γ 1 w δ + γ 1 w δ We substitute these bounds in 7 and obtain for all k 0 xk ˆxk β γ 1 x δ + γ 1 x δ + γ 1 + γ 2 γ 1 δ + γ 1 w w δ in which βs := βs, 0 is a K-function. Finally we choose δ such that the right-hand side is less than ε, which is possible since the right-hand side defines a K-function, which goes to zero with δ. This gives for all k 0 xk ˆxk ε and part 2 of the robust GAS definition is established. SADCO 2013 MHE state estimation 20 / 65

State Estimation as Optimal Control of Estimate Error State Estimation as Optimal Control of Estimate Error Consider again the estimation problem in the simplest possible setting with a linear time invariant model and Gaussian noise x + = Ax + Gw w N0, Q y = Cx + v v N0, R 8 and random initial state x0 Nx0, P 0. In full information estimation, we define the objective function V T χ0, ω = 1 2 T 1 χ0 x0 2 P 0 1 + subject to χ + = Aχ + Gω, y = Cχ + ν. i=0 ωi 2 Q 1 + νi 2 R 1 Denote the solution to this optimization as ˆx0 T, ŵ T = arg min χ0,ω V T χ0, ω ˆxi + 1 T = Aˆxi T + Gŵi T Because the system is linear, the estimator is stable if and only if it is stable with zero process and measurement disturbances. An equivalent question If noise-free data are provided to the estimator, wi, vi = 0 for all i 0 in 8, is the estimate error asymptotically stable as T for all x 0? SADCO 2013 MHE state estimation 21 / 65 SADCO 2013 MHE state estimation 22 / 65 State Estimation as Optimal Control of Estimate Error State Estimation as Optimal Control of Estimate Error Then we make this statement precise: Define estimate error as xi T = xi ˆxi T for 0 i T 1, T 1. The noise-free measurement satisfies yi C ˆxi T = C xi T, 0 i T. The initial condition term can be written in estimate error as ˆx0 x0 = x0 a in which a = x0 x0. The cost function of noise-free measurement is V T a, x0, w = 1 2 T 1 x0 a 2 P 0 1 + i=0 C xi 2 R 1 + wi 2 Q 1 9 For estimation we solve subject to x + = A x + Gw. x 0 0; a, w 0 a = min x0,w V T a, x0, w 10 Now consider problem 10 as an optimal control problem using w as manipulated variable and minimizing an objective that measures size of estimate error x and control w. The stability analysis in estimation is to show that the origin for x is asymptotically stable, i.e., if there exists KL-function β such that x 0 T ; a β a, T for all T I 0. SADCO 2013 MHE state estimation 23 / 65 SADCO 2013 MHE state estimation 24 / 65

State Estimation as Optimal Control of Estimate Error Differences between standard regulation and the estimation problem 10: 10 is slightly nonstandard because it contains an extra decision variable, the initial state, and an extra term in the cost function, 9. Convergence is a question about the terminal state in a sequence of different optimal control problems with increasing horizon length T. It is also not the standard regulator convergence question which asks how the state trajectory evolves using the optimal control law. In standard regulation, we inject the optimal first input and ask whether we are successfully moving the system to the origin as time increases. In estimation, we do not inject anything into the system; we are provided more information as time increases and ask whether our explanation of the data is improving terminal estimate error is decreasing as time increases. State Estimation as Optimal Control of Estimate Error Choose forward DP to seek the optimal terminal state x 0 T ; a as a function of the parameter a appearing in the cost function. Check Exercise 4.12 to see how to solve 10. There exists the following recursion for the optimal terminal state x 0 k + 1; a = A LkC x 0 k; a 11 for k 0. The initial condition for the recursion is x 0 0; a = a. The time-varying gains Lk and associated cost matrices P k are P k + 1 = GQG + AP ka AP kc CP kc + R 1 CP ka 12 Lk = AP kc CP kc + R 1 13 in which P 0 is specified in the problem. SADCO 2013 MHE state estimation 25 / 65 SADCO 2013 MHE state estimation 26 / 65 State Estimation as Optimal Control of Estimate Error Asymptotic stability of the estimate error can be established by showing that V k, x := 1/2 x Pk 1 x is a Lyapunov function for 11 Jazwinski, 1970, Theorem 7.4. Although one can find Lyapunov functions valid for estimation, they do not have the same simple connection to optimal cost functions as in standard regulation problems, even in the linear, unconstrained case. Stability arguments based instead on properties of VT 0 a are simpler and more easily adapted to cover new situations arising in research problems. If a Lyapunov function is required for further analysis, a converse theorem guarantees its existence. Duality of Linear Estimation and Regulation Estimator problem: xk + 1 = Axk + Gwk yk = Cxk + vk R > 0 Q > 0 A, C detectable A, G stabilizable xk + 1 = A LC xk Regulator problem: xk + 1 = Axk + Buk yk = Cxk R > 0 Q > 0 A, B stabilizable A, C detectable xk + 1 = A + BK xk SADCO 2013 MHE state estimation 27 / 65 SADCO 2013 MHE state estimation 28 / 65

Duality of Linear Estimation and Regulation MHE introduction Regulator Estimator R > 0, Q > 0 R > 0, Q > 0 A, B stabilizable A, C detectable A, C detectable A, G stabilizable In MHE we consider only the N most recent measurements, y N T = {yt N, yt N + 1,... yt 1}. For T > N, the MHE problem is defined to be: min ˆV T χt N, ω = Γ T N χt N + χt N,ω T 1 i=t N l i ωi, νi Lemma 9 Duality of controllability and observability A, B is controllable stabilizable if and only if A, B is observable detectable. subject to: χ + = f χ, ω y = hχ + ν ω = {ωt N,..., ωt 1} SADCO 2013 MHE state estimation 29 / 65 SADCO 2013 MHE state estimation 30 / 65 Prior Information The designer chooses the prior weighting Γ k for k > N until the data horizon is full. For times T N, we generally define the MHE problem to be the full information problem. Zero Prior Weighting When we choose Γ i = 0 for all i N: ˆV T χt N, ω = T 1 i=t N l i wi, vi Because it discounts the past data completely, this form of MHE must be able to asymptotically reconstruct the state using only the most recent N measurements. Establishing the convergence of the estimate error to zero for this form of MHE is straightforward. SADCO 2013 MHE state estimation 31 / 65 Observability To ensure solution existence, the system needs to be restricted further than i-ioss: Definition 10 Observability The system x + = f x, w, y = hx is observable if there exist finite N o I 1, γ 1, γ 2 K such that for every two initial states z 1 and z 2, and any two disturbance sequences w 1, w 2, and all k N o z 1 z 2 γ 1 w1 w 2 0:k 1 + γ2 yz1,w 1 y z2,w 2 0:k At any time T N, for the decision variables of χt N = xt N and ωi = wi for T N i T 1, recall the cost function: ˆV T χt N, ω = T 1 i=t N l i wi, vi 14 which is less than V defined in the full information problem. SADCO 2013 MHE state estimation 32 / 65

Observability Ensures Existence of Solution Final-state observability Observability ensures that for all k N N o, xk N ˆxk N k γ 2 v k N:k Since vk is bounded for all k 0, observability has bounded the distance between the initial estimate in the horizon and the system state for all k N. This fact along with continuity of ˆV T χ, ω ensures existence of the solution to the MHE problem by the Weierstrass theorem. But the solution does not have to be unique. Definition 11 Final-state observability The system x + = f x, w, y = hx is final-state observable FSO if there exist finite N o I 1, γ 1, γ 2 K such that for every two initial states z 1 and z 2, and any two disturbance sequences w 1, w 2, and all k N o xk; z 1, w 1 xk; z 2, w 2 γ 1 w1 w 2 0:k 1 + γ 2 yz1,w 1 y z2,w 2 0:k FSO is the natural system requirement for MHE with zero prior weighting to provide stability and convergence. FSO is weaker than observability and stronger than i-ioss detectability as discussed in Exercise 4.11. SADCO 2013 MHE state estimation 33 / 65 SADCO 2013 MHE state estimation 34 / 65 Convergence of MHE Cost Function Robust GAS of MHE with zero prior weighting Since wi, vi converges to zero, 31 implies that ˆV T converges to zero as T. The optimal cost at T, ˆV 0 T, is bounded above by ˆV T so ˆV 0 T also converges to zero: ˆV 0 T = T 1 i=t N l i ŵi T, yi hˆxi T 0 Therefore yi hˆxi T 0 and ŵi T 0. Since y = hx + v and vi converges to zero, and wi converges to zero, we also have Theorem 12 Robust GAS of MHE with zero prior weighting Consider an observable system and measurement sequence generated by 1 with disturbances satisfying Assumption 3. The MHE estimate with zero prior weighting, N N o, and stage cost satisfying 5, is robustly GAS. hxi hˆxi T 0 wi ŵi T 0 15 for T N i T 1, T N. SADCO 2013 MHE state estimation 35 / 65 SADCO 2013 MHE state estimation 36 / 65

Nonzero Prior Weighting Full information arrival cost There are two drawbacks of zero prior weighting: The system had to be assumed observable rather than detectable to ensure existence of the solution to the MHE problem. A large horizon N may be required to obtain performance comparable to full information estimation. We address these disadvantages by using nonzero prior weighting: min ˆV T χt N, ω = Γ T N χt N + χt N,ω T 1 i=t N l i ωi, νi Definition 13 Full information arrival cost The full information arrival cost is defined as subject to Z T p = min χ0,ω V T χ0, ω 16 χ + = f χ, ω y = hχ + ν χt ; χ0, ω = p Here forward DP is used to decompose the full information problem exactly into the MHE problem 30 in which Γ is chosen as arrival cost. Lemma 14 MHE and full information estimation The MHE problem 30 is equivalent to the full information problem 3 for the choice Γ k = Z k for all k > N and N 1. SADCO 2013 MHE state estimation 37 / 65 SADCO 2013 MHE state estimation 38 / 65 MHE arrival cost Definition 15 MHE arrival cost The MHE arrival cost Ẑ is defined for T > N as subject to Ẑ T p = min z,ω ˆV T z, ω = min z,ω Γ T Nz + T 1 i=t N l i ωi, νi 17 χ + = f χ, ω y = hχ + ν χt ; z, T N, ω = p For T N, usually define the MHE problem to be the full information problem, so Ẑ T = Z T and ˆV 0 T = V 0 T. SADCO 2013 MHE state estimation 39 / 65 Prior weighting Choosing a prior weighting that underbounds the MHE arrival cost is the key sufficient condition for stability and convergence of MHE. Assumption 16 Prior weighting We assume that Γ k is continuous and satisfies the following inequalities for all k > N 1 Upper bound Γ k p Ẑ k p = min z,ω Γ k Nz + k 1 i=k N l i ωi, νi 18 subject to χ + = f χ, ω, y = hχ + ν, χk; z, k N, ω = p. 2 Lower bound in which γ p K. Γ k p ˆV k 0 + γ p ˆxk 19 p SADCO 2013 MHE state estimation 40 / 65

Prior weighting Upper Bound of MHE Arrival Cost Ẑ k p Γ k p An upper bound for the MHE optimal cost is necessary to establish convergence of the MHE estimates. A stronger result is that the MHE arrival cost is bounded above by the full information arrival cost. Proposition 17 Arrival cost of full information greater than MHE Ẑ T Z T T 1 20 ˆV 0 k Given 20 we also have the analogous inequality for the optimal costs of MHE and full information ˆxk p ˆV 0 T V 0 T T 1 21 SADCO 2013 MHE state estimation 41 / 65 SADCO 2013 MHE state estimation 42 / 65 MHE Detectable Robust GAS of MHE Assumption 18 MHE detectable system We say a system x + = f x, w, y = hx is MHE detectable if the system augmented with an extra disturbance w 2 x + = f x, w 1 + w 2 y = hx is i-ioss with respect to the augmented disturbance w 1, w 2. Note that MHE detectable is stronger than i-ioss detectable but weaker than observable and FSO. See also Exercise 4.10. Theorem 19 Robust GAS of MHE Consider an MHE detectable system and measurement sequence generated by 1 with disturbances satisfying Assumption 3. The MHE estimate defined by 30 using the prior weighting function Γ k satisfying Assumption 16 and stage cost satisfying Assumption 5 is robustly GAS. SADCO 2013 MHE state estimation 43 / 65 SADCO 2013 MHE state estimation 44 / 65

Robust GAS of MHE Constraints x ŵ0 N ˆx0 N 0 ˆx1 N ŵn 2N ˆxN 2N ŵn 1 N ˆxN N ˆxN 1 N N ŵ2n 1 2N ˆx2N 2N ˆx2N 1 2N ˆxN + 1 2N ˆxN 2N ˆxN N 2N k For estimation problem, some physically known facts should also be enforced such as: Concentrations of impurities must be nonnegative, Fluxes of mass must have the correct sign given concentration gradients. Fluxes of energy must have the correct sign given temperature gradients.... However, unlike the regulator, the estimator has no way to enforce these constraints on the system. It is important that any constraints imposed on the estimator are satisfied by the system generating the measurements. SADCO 2013 MHE state estimation 45 / 65 SADCO 2013 MHE state estimation 46 / 65 Constrained MHE It is straightforward to add constraints in MHE since it is posed as an optimization formulation. Assumption 20 Estimator constraint sets 1 For all k I 0, the sets W k, X k, and V k are nonempty and closed, and W k and V k contain the origin. 2 For all k I 0, the disturbances and state satisfy 3 The prior satisfies x 0 X 0. xk X k wk W k vk V k Constrained full information Constrained full information. The constrained full information estimation objective function is: subject to T 1 V T χ0, ω = l x χ0 x 0 + χ + = f χ, ω i=0 y = hχ + ν l i ωi, νi 22 χi X i ωi W i νi V i i I 0:T 1 The constrained full information problem is min V T χ0, ω 23 χ0,ω SADCO 2013 MHE state estimation 47 / 65 SADCO 2013 MHE state estimation 48 / 65

Robust GAS of constrained full information Constrained MHE. Constrained MHE. The constrained moving horizon estimation objective function is Theorem 21 Robust GAS of constrained full information Consider an i-ioss detectable system and measurement sequence generated by 1 with constrained, convergent disturbances satisfying Assumptions 3 and 20. The constrained full information estimator 23 with stage cost satisfying Assumption 5 is robustly GAS. ˆV T χt N, ω = Γ T N χt N + subject to χ + = f χ, ω T 1 i=t N y = hχ + ν l i ωi, νi 24 χi X i ωi W i νi V i i I T N:T 1 The constrained MHE is given by the solution to the following problem min ˆV T χt N, ω 25 χt N,ω SADCO 2013 MHE state estimation 49 / 65 SADCO 2013 MHE state estimation 50 / 65 Robust GAS of constrained MHE Constrained Linear Systems For the constrained linear systems: Theorem 22 Robust GAS of constrained MHE Consider an MHE detectable system and measurement sequence generated by 1 with convergent, constrained disturbances satisfying Assumptions 3 and 20. The constrained MHE estimator 25 using the prior weighting function Γ k satisfying Assumption 16 and stage cost satisfying Assumption 5 is robustly GAS. Because the system satisfies the state and disturbance constraints due to Assumption 20, both full information and MHE optimization problems are feasible at all times. The proofs of Theorems 21 and 22 closely follow the proofs of their respective unconstrained versions, Theorems 8 and 19, and are omitted. x + = Ax + Gw y = Cx + v 26 First, the i-ioss assumption of full information estimation and the MHE detectability assumption both reduce to the assumption that A, C is detectable in this case. We usually choose a constant quadratic function for the estimator stage cost for all i I 0 l i w, v = 1/2 w 2 Q 1 + v 2 R 1 Q, R > 0 27 In the unconstrained linear problem, we can of course find the full information arrival cost exactly; it is: Z k z = V 0 k + 1/2 z ˆxk P k 1 k 0 SADCO 2013 MHE state estimation 51 / 65 SADCO 2013 MHE state estimation 52 / 65

Constrained Linear Systems Robust GAS of constrained MHE We use this unconstrained arrival cost to be the prior weighting in MHE: Assumption 23 Prior weighting for linear system in which ˆV 0 k Γ k z = ˆV k 0 + 1/2 z ˆxk P k 1 k > N 28 is the optimal MHE cost at time k. This choice implies robust GAS of the MHE estimator also for the constrained case as we next demonstrate. To ensure the form of the estimation problem to be solved online is a quadratic program, we specialize the constraint sets to be polyhedral regions. Corollary 25 Robust GAS of constrained MHE Consider a detectable linear system and measurement sequence generated by 26 with convergent, constrained disturbances satisfying Assumptions 3 and 24. The constrained MHE estimator 25 using prior weighting function satisfying 28 and stage cost satisfying 27 is robustly GAS. This corollary follows as a special case of Theorem 22. Assumption 24 Polyhedral constraint sets For all k I 0, the sets W k, X k, and V k in Assumption 20 are nonempty, closed polyhedral regions containing the origin. SADCO 2013 MHE state estimation 53 / 65 SADCO 2013 MHE state estimation 54 / 65 Filtering Update Smoothing Update The MHE approach uses the MHE estimate ˆxT N and prior weighting function Γ T N derived from the unconstrained arrival cost as shown in 28. We call this approach a filtering update because the prior weight at time T is derived from the solution of the MHE filtering problem at time T N, i.e., the estimate of ˆxT N := ˆxT N T N given measurements up to time T N 1. For implementation, this choice requires storage of a window of N prior filtering estimates to be used in the prior weighting functions as time progresses. In the smoothing update we wish to use ˆxT N T 1 instead of ˆxT N T N for the prior and wish to find an appropriate prior weighting based on this choice. When constraints are added to the problem, the smoothing update provides a different MHE than the filtering update. The smoothing prior weighting maintains the stability and GAS robustness properties of MHE with the filtering update. Recall the unconstrained full information arrival cost is given by Z T N z = VT 0 N + 1/2 z ˆxT N 2 P T N 1 T > N 29 Now consider the proper weight when using ˆxT N T 2 in place of ˆxT N := ˆxT N T 1... SADCO 2013 MHE state estimation 55 / 65 SADCO 2013 MHE state estimation 56 / 65

Smoothing Update Smoothing Update Should it be the smoothed covariance PT N T 2 instead of P T N := PT N T 1? Correct, but not complete. The smoothed prior ˆxT N T 2 is influenced by the measurements y 0:T 2. But the sum of stage costs in the MHE problem at time T depends on measurements y T N:T 1 Therefore we have to adjust the prior weighting so we do not double count the data y T N:T 2. For any square matrix R and integer k 1, define diag k R to be the following 0 R R C diag k R :=... O k = CA C..... R }{{} CA k 2 CA k 3 C k times W k = diag k R + O k diag k QO k SADCO 2013 MHE state estimation 57 / 65 SADCO 2013 MHE state estimation 58 / 65 Smoothing Update Smoothing Update The smoothing arrival cost is then given by 2 From the following recursion Rauch, Tung, and Striebel, 1965; Bryson and Ho, 1975 Pk T = Pk+ PkA P k + 1 Pk 1 + 1 T P k + 1 P k + 1 1 APk We iterate this equation backwards N 1 times starting from the known value PT 1 T 2 := P T 1 to obtain PT N T 2. Z T N z = ˆV 0 T 1 + 1/2 z ˆxT N T 2 2 PT N T 2 1 1/2 y T N:T 2 O N 1 z 2 W N 1 1 T > N The second term accounts for the use of the smoothed covariance and the smoothed estimate. The third term subtracts the effect of the measurements that have been double counted in the MHE objective as well as the smoothed prior estimate. SADCO 2013 MHE state estimation 59 / 65 2 See Rao, Rawlings, and Lee 2001 and Rao 2000, pp.80 93 for a derivation that shows Z T = Z T for T > N. SADCO 2013 MHE state estimation 60 / 65

Smoothing Update Recommended exercises yk MHE problem at T y T N:T 1 smoothing update y T N 1:T 2 Observability, detectability, i-ioss. Exercises 4.1, 4.2, 4.3, 4.4, 4.5, 4.7, 4.8, 4.10, 4.11. 3 Estimator convergence. Exercises 4.14, 4.15. filtering update y T 2N:T N 1 T 2N T N k T SADCO 2013 MHE state estimation 61 / 65 3 Rawlings and Mayne 2009, Chapter 4. Downloadable from www.che.wisc.edu/~jbraw/mpc. SADCO 2013 MHE state estimation 62 / 65 Further Reading I Further Reading II A. E. Bryson and Y. Ho. Applied Optimal Control. Hemisphere Publishing, New York, 1975. A. H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press, New York, 1970. C. V. Rao. Moving Horizon Strategies for the Constrained Monitoring and Control of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin Madison, 2000. C. V. Rao, J. B. Rawlings, and J. H. Lee. Constrained linear state estimation a moving horizon approach. Automatica, 3710: 1619 1628, 2001. H. E. Rauch, F. Tung, and C. T. Striebel. Maximum likelihood estimates of linear dynamic systems. AIAA J., 38:1445 1450, 1965. J. B. Rawlings and D. Q. Mayne. Model Predictive Control: Theory and Design. Nob Hill Publishing, Madison, WI, 2009. 576 pages, ISBN 978-0-9759377-0-9. E. D. Sontag. Comments on integral variants of ISS. Sys. Cont. Let., 34: 93 100, 1998. SADCO 2013 MHE state estimation 63 / 65 SADCO 2013 MHE state estimation 64 / 65

Acknowledgments The author is indebted to Luo Ji of the University of Wisconsin and Ganzhou Wang of the Universität Stuttgart for their help in organizing the material and preparing the overheads. SADCO 2013 MHE state estimation 65 / 65