The Convergence of the Minimum Energy Estimator

Arthur J. Krener

Department of Mathematics, University of California, Davis, CA 95616-8633, USA, ajkrener@ucdavis.edu

Research supported in part by NSF DMS-439 and AFOSR F496-1-1-.

Summary. We show that, under suitable hypotheses, the minimum energy estimate of the state of a partially observed dynamical system converges to the true state. The main assumption is that the system is uniformly observable for any input.

Key words: Nonlinear Observer, State Estimation, Nonlinear Filtering, Minimum Energy Estimation, High Gain Observers, Extended Kalman Filter, Uniformly Observable for Any Input.

1 Introduction

We consider the problem of estimating the current state $x(t) \in \mathbb{R}^n$ of a nonlinear system

$$\begin{aligned} \dot x &= f(x,u) \\ y &= h(x,u) \\ x(0) &= x^0 \end{aligned} \tag{1}$$

from the past controls $u(s) \in U \subset \mathbb{R}^m$, $0 \le s \le t$, the past observations $y(s) \in \mathbb{R}^p$, $0 \le s \le t$, and some information about the initial condition $x^0$. The functions $f, h$ are assumed to be known.

We assume that $f, h$ are Lipschitz continuous on $\mathbb{R}^n$ and satisfy linear growth conditions

$$\begin{aligned} |f(x,u) - f(z,u)| &\le L|x - z| \\ |h(x,u) - h(z,u)| &\le L|x - z| \\ |f(x,u)| &\le L(1 + |x|) \\ |h(x,u)| &\le L(1 + |x|) \end{aligned} \tag{2}$$

for some $L > 0$ and all $x, z \in \mathbb{R}^n$ and $u \in U$. We also assume that $u(s)$, $0 \le s \le t$ is piecewise continuous. Piecewise continuous means continuous from the left with limits from the right (collor) and with a finite number of discontinuities

in any bounded interval. The symbol $|\cdot|$ denotes the Euclidean norm.

The equations (1) model a real system which probably operates over some compact subset of $\mathbb{R}^n$. Therefore we may only need (2) to hold on this compact set, as we may be able to extend $f, h$ so that (2) holds on all of $\mathbb{R}^n$.

To construct an estimator, we follow an approach introduced by Mortenson [9] and refined by Hijab [5], [6]. To account for possible inaccuracies in the model (1), we add deterministic but unknown noises,

$$\begin{aligned} \dot x &= f(x,u) + g(x)w \\ y &= h(x,u) + k(x)v \end{aligned} \tag{3}$$

where $w(t) \in \mathbb{R}^l$, $v(t) \in \mathbb{R}^p$ are $L^2[0,\infty)$ functions. The driving noise, $w(t)$, represents modeling errors in $f$ and other possible errors in the dynamics. The observation noise, $v(t)$, represents modeling errors in $h$ and other possible errors in the observations. We assume that

$$\begin{aligned} |g(x) - g(z)| &\le L|x - z| \\ |k(x) - k(z)| &\le L|x - z| \\ |g(x)| &\le L \\ |k(x)| &\le L. \end{aligned}$$

Note that $g(x), k(x)$ are matrices, so $|g(x)|, |k(x)|$ denote the induced Euclidean matrix norms. Define

$$\Gamma(x) = g(x)g'(x), \qquad R(x) = k(x)k'(x) \tag{4}$$

and assume that there exist positive constants $m_1, m_2$ such that for all $x \in \mathbb{R}^n$,

$$m_1 I^{p \times p} \le R(x) \le m_2 I^{p \times p}. \tag{5}$$

In particular this implies that $k(x)$ and $R(x)$ are invertible for all $x$.

The initial condition $x^0$ of (1) is also unknown and viewed as another noise. We are given a function $Q^0(x^0)$ which is a measure of the minimal amount of energy in the past that it would take to put the system in state $x^0$ at time $0$. We shall assume that $Q^0$ is Lipschitz continuous on every compact subset of $\mathbb{R}^n$.

Given the output $y(s)$, $0 \le s \le t$, we define the minimum discounted energy necessary to reach the state $x$ at time $t$ as

$$Q(x,t) = \inf \left\{ e^{-\alpha t} Q^0(z(0)) + \frac{1}{2} \int_0^t e^{-\alpha(t-s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds \right\} \tag{6}$$

where the infimum is over all triples $w(\cdot), v(\cdot), z(\cdot)$ satisfying

$$\begin{aligned} \dot z(s) &= f(z(s),u(s)) + g(z(s))w(s) \\ y(s) &= h(z(s),u(s)) + k(z(s))v(s) \\ z(t) &= x. \end{aligned} \tag{7}$$
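To make the functional in (6) concrete, the following is a minimal sketch that evaluates the discounted energy of one candidate driving noise for a scalar system. It is not the estimator itself: the infimum over $w(\cdot)$ is not computed, so the returned value is only an upper bound on $Q(x,t)$. The function name `discounted_energy`, the uniform time grid, and the explicit Euler discretization are assumptions made for illustration.

```python
import math

def discounted_energy(f, g, h, k, q0, x, us, ys, ws, dt, alpha):
    """Approximate the cost inside the infimum in (6) for ONE candidate noise ws.

    Scalar state, input and output (an illustrative assumption).
    us, ys, ws hold samples on the grid s_i = i*dt, i = 0..N-1, with t = N*dt.
    """
    n = len(ws)
    t = n * dt
    z = x                       # terminal condition z(t) = x from (7)
    cost = 0.0
    for i in range(n - 1, -1, -1):
        s = i * dt
        # Step backward in time from s + dt to s (explicit Euler).
        z = z - dt * (f(z, us[i]) + g(z) * ws[i])
        # Observation noise forced by the output equation in (7).
        v = (ys[i] - h(z, us[i])) / k(z)
        cost += 0.5 * math.exp(-alpha * (t - s)) * (ws[i] ** 2 + v ** 2) * dt
    # After the loop z approximates z(0).
    return math.exp(-alpha * t) * q0(z) + cost
```

Minimizing this quantity over the sequence `ws` (for instance with a generic optimizer) would approximate $Q(x,t)$ on the chosen grid; that step is deliberately left out of the sketch.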

The discount rate is $\alpha \ge 0$. Notice that $Q(x,t)$ depends on the past control $u(s)$, $0 \le s \le t$ and the past output $y(s)$, $0 \le s \le t$.

A minimum energy estimate $\hat x(t)$ of $x(t)$ is a state of minimum discounted energy given the system (3), the initial energy $Q^0(z)$ and the observations $y(s)$, $0 \le s \le t$,

$$\hat x(t) \in \arg\min_x Q(x,t). \tag{8}$$

Of course the minimum need not be unique, but we assume that there is a piecewise continuous selection $\hat x(t)$. Clearly $Q$ satisfies

$$Q(x,0) = Q^0(x). \tag{9}$$

In the next section we shall show that $Q(x,t)$ is locally Lipschitz continuous and that it satisfies, in the viscosity sense, the Hamilton-Jacobi PDE

$$0 = \alpha Q(x,t) + Q_t(x,t) + Q_x(x,t) f(x,u(t)) + \frac{1}{2} |Q_x(x,t)|^2_{\Gamma} - \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}} \tag{10}$$

where the subscripts $x$, $t$, $x_i$, etc. denote partial derivatives and

$$\begin{aligned} |Q_x(x,t)|^2_{\Gamma} &= Q_x(x,t)\,\Gamma(x)\,Q_x'(x,t) \\ |y(t) - h(x,u(t))|^2_{R^{-1}} &= (y(t) - h(x,u(t)))'\,R^{-1}(x)\,(y(t) - h(x,u(t))). \end{aligned}$$

To simplify the notation we have suppressed the arguments of $\Gamma$, $R^{-1}$ on the left, but they should be clear from context.

In the next section we introduce the concept of a viscosity solution to the Hamilton-Jacobi PDE (10) and show that $Q(x,t)$ defined by (6) is one. Section 3 is devoted to the properties of smooth solutions to (10) and their relationship with the extended Kalman filter [4]. The principal result of this paper is presented in Section 4: under suitable hypotheses, any piecewise continuous selection of (8) globally converges to the corresponding trajectory of the noise free system (1), and this convergence is exponential if $\alpha > 0$. We close with some remarks.

2 Viscosity Solutions

The following is a slight modification of the standard definition [2].

Definition 1. A viscosity solution of the partial differential equation (10) is a continuous function $Q(x,t)$ which is Lipschitz continuous with respect to $x$ on every compact subset of $\mathbb{R}^{n+1}$ and such that for each $x \in \mathbb{R}^n$, $t > 0$ the following conditions hold.

1. If $\Phi(\xi,\tau)$ is any $C^\infty$ function such that for $\xi, \tau$ near $x, t$

$$\Phi(x,t) - Q(x,t) \le e^{-\alpha(t-\tau)} \left( \Phi(\xi,\tau) - Q(\xi,\tau) \right)$$

then

$$0 \ge \alpha \Phi(x,t) + \Phi_t(x,t) + \Phi_x(x,t) f(x,u(t)) + \frac{1}{2} |\Phi_x(x,t)|^2_{\Gamma} - \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}}.$$

2. If $\Phi(\xi,\tau)$ is any $C^\infty$ function such that for $\xi, \tau$ near $x, t$

$$\Phi(x,t) - Q(x,t) \ge e^{-\alpha(t-\tau)} \left( \Phi(\xi,\tau) - Q(\xi,\tau) \right)$$

then

$$0 \le \alpha \Phi(x,t) + \Phi_t(x,t) + \Phi_x(x,t) f(x,u(t)) + \frac{1}{2} |\Phi_x(x,t)|^2_{\Gamma} - \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}}.$$

Theorem 1. The function $Q(x,t)$ defined by (6) is a viscosity solution of the Hamilton-Jacobi PDE (10) and it satisfies the initial condition (9).

Proof. Clearly the initial condition is satisfied and $Q(\cdot,0)$ is Lipschitz continuous with respect to $x$ on compact subsets of $\mathbb{R}^n$.

We start by showing that $Q(\cdot,t)$ is Lipschitz continuous with respect to $x$ on compacta. Let $K$ be a compact subset of $\mathbb{R}^n$, $x \in K$, $T > 0$ and $0 \le t \le T$. Now

$$Q(x,t) \le e^{-\alpha t} Q^0(z(0)) + \frac{1}{2} \int_0^t e^{-\alpha(t-s)} |y(s) - h(z(s),u(s))|^2_{R^{-1}} \, ds \tag{11}$$

where

$$\dot z = f(z,u), \qquad z(t) = x.$$

By standard arguments, $z(s)$, $0 \le s \le t$ is a continuous function of $x \in K$ and the right side of (11) is a continuous functional of $z(s)$, $0 \le s \le t$. Hence the composition is bounded on the compact set $K$ and there exists $c$ large enough so that

$$K \subset \{ x : Q(x,t) \le c \text{ for all } 0 \le t \le T \}.$$

Fix $x \in K$ and $t \in [0,T]$. Given $\epsilon > 0$ we know that there exists $w(s)$ such that

$$Q(x,t) + \epsilon \ge e^{-\alpha t} Q^0(z(0)) + \frac{1}{2} \int_0^t e^{-\alpha(t-s)} \left( |w(s)|^2 + |y(s) - h(z(s),u(s))|^2_{R^{-1}} \right) ds \tag{12}$$

where

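$$\dot z = f(z,u) + g(z)w, \qquad z(t) = x.$$

Now

$$\int_0^t |w(s)|^2 \, ds \le \int_0^t e^{\alpha s} |w(s)|^2 \, ds \le e^{\alpha t} \int_0^t e^{-\alpha(t-s)} |w(s)|^2 \, ds \le 2 e^{\alpha t} (c + \epsilon).$$

Using the Cauchy-Schwarz inequality we also have

$$\int_0^t |w(s)| \, ds \le \left( \int_0^t 1 \, ds \right)^{1/2} \left( \int_0^t |w(s)|^2 \, ds \right)^{1/2} \le M$$

where

$$M = \left( 2 T e^{\alpha T} (c + \epsilon) \right)^{1/2}.$$

Notice that this bound does not depend on the particular $x \in K$ and $0 \le t \le T$, only on the fact that $w(\cdot)$ has been chosen so that (12) holds.

Let $\xi \in K$ and define $\zeta(s)$, $0 \le s \le t$ by

$$\dot\zeta = f(\zeta,u) + g(\zeta)w, \qquad \zeta(t) = \xi$$

where $w(\cdot)$ is as above. Now for $0 \le s \le t$ we have

$$\begin{aligned} |\zeta(s)| &\le |\zeta(t)| + \int_s^t |f(\zeta(r),u(r)) + g(\zeta(r))w(r)| \, dr \\ &\le |\zeta(t)| + \int_s^t L \left( 1 + |\zeta(r)| + |w(r)| \right) dr \end{aligned}$$

so using Gronwall's inequality

$$|\zeta(s)| \le e^{LT} \left( |\xi| + LT + LM \right).$$

Since $\xi$ lies in a compact set, we conclude that there is a compact set containing $\zeta(s)$ for $0 \le s \le t \le T$ for all $\xi \in K$. Now

$$Q(\xi,t) \le e^{-\alpha t} Q^0(\zeta(0)) + \frac{1}{2} \int_0^t e^{-\alpha(t-s)} \left( |w(s)|^2 + |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} \right) ds$$

so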

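$$\begin{aligned} Q(\xi,t) - Q(x,t) \le{} & \epsilon + e^{-\alpha t} \left( Q^0(\zeta(0)) - Q^0(z(0)) \right) \\ & + \frac{1}{2} \int_0^t e^{-\alpha(t-s)} |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} \, ds \\ & - \frac{1}{2} \int_0^t e^{-\alpha(t-s)} |y(s) - h(z(s),u(s))|^2_{R^{-1}} \, ds. \end{aligned} \tag{13}$$

Again by Gronwall, for $0 \le s \le t$

$$|z(s) - \zeta(s)| \le e^{(LT + LM)} |x - \xi|.$$

The trajectories $z(s), \zeta(s)$, $0 \le s \le t$ lie in a compact set where $Q^0$ and the integrands are Lipschitz continuous, so there exists $L_1$ such that

$$Q(\xi,t) - Q(x,t) \le \epsilon + L_1 |x - \xi|.$$

But $\epsilon$ was arbitrary so

$$Q(\xi,t) - Q(x,t) \le L_1 |x - \xi|.$$

Reversing the roles of $(x,t)$ and $(\xi,t)$ yields the other inequality. We have shown that $Q(\cdot,t)$ is Lipschitz continuous on $K$ for $0 \le t \le T$.

Next we show that $Q(x,t)$ is continuous with respect to $t > 0$ for fixed $x \in K$. Suppose $x \in K$. If $\tau < t$, let $w(\cdot)$ satisfy (12) and define

$$\zeta(s) = z(s + t - \tau) \quad \text{and} \quad \bar w(s) = w(s + t - \tau).$$

Then

$$\dot\zeta = f(\zeta,u) + g(\zeta)\bar w, \qquad \zeta(\tau) = x$$

so

$$Q(x,\tau) \le e^{-\alpha\tau} Q^0(\zeta(0)) + \frac{1}{2} \int_0^{\tau} e^{-\alpha(\tau-s)} \left( |\bar w(s)|^2 + |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} \right) ds$$

$$\begin{aligned} Q(x,\tau) - Q(x,t) \le{} & \epsilon + e^{-\alpha\tau} Q^0(z(t-\tau)) - e^{-\alpha t} Q^0(z(0)) \\ & - \frac{1}{2} \int_0^{t-\tau} e^{-\alpha(t-s)} |w(s)|^2 \, ds - \frac{1}{2} \int_0^{t-\tau} e^{-\alpha(t-s)} |y(s) - h(z(s),u(s))|^2_{R^{-1}} \, ds \\ & + \frac{1}{2} \int_{t-\tau}^{t} e^{-\alpha(t-s)} \left( |y(s+\tau-t) - h(z(s),u(s))|^2_{R^{-1}} - |y(s) - h(z(s),u(s))|^2_{R^{-1}} \right) ds. \end{aligned}$$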

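Clearly the quantities

$$e^{-\alpha\tau} Q^0(z(t-\tau)) - e^{-\alpha t} Q^0(z(0)),$$

$$\frac{1}{2} \int_0^{t-\tau} e^{-\alpha(t-s)} |y(s) - h(z(s),u(s))|^2_{R^{-1}} \, ds,$$

$$\frac{1}{2} \int_{t-\tau}^{t} e^{-\alpha(t-s)} \left( |y(s+\tau-t) - h(z(s),u(s))|^2_{R^{-1}} - |y(s) - h(z(s),u(s))|^2_{R^{-1}} \right) ds$$

all go to zero as $\tau \to t$. Let $\chi_{t-\tau}(s)$ be the characteristic function of $[0, t-\tau]$ and $T > 0$. For $t \le T$

$$\frac{1}{2} \int_0^{t-\tau} e^{-\alpha(t-s)} |w(s)|^2 \, ds \le \frac{1}{2} \int_0^{t-\tau} |w(s)|^2 \, ds \le \frac{1}{2} \int_0^{T} \chi_{t-\tau}(s) |w(s)|^2 \, ds$$

which goes to zero as $\tau \to t$ by the Lebesgue dominated convergence theorem, so

$$\lim_{\tau \to t^-} Q(x,\tau) - Q(x,t) \le \epsilon.$$

If $\tau > t$, let $w(\cdot)$ satisfy (12) and define

$$\bar w(s) = \begin{cases} 0 & \text{if } 0 \le s < \tau - t \\ w(s + t - \tau) & \text{if } \tau - t \le s \le \tau \end{cases}$$

$$\dot\zeta = f(\zeta,u) + g(\zeta)\bar w, \qquad \zeta(\tau) = x.$$

Then $\zeta(s) = z(s + t - \tau)$ for $\tau - t \le s \le \tau$ and so

$$Q(x,\tau) \le e^{-\alpha\tau} Q^0(\zeta(0)) + \frac{1}{2} \int_0^{\tau} e^{-\alpha(\tau-s)} \left( |\bar w(s)|^2 + |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} \right) ds$$

$$\begin{aligned} Q(x,\tau) - Q(x,t) \le{} & \epsilon + e^{-\alpha\tau} Q^0(\zeta(0)) - e^{-\alpha t} Q^0(\zeta(\tau-t)) \\ & + \frac{1}{2} \int_0^{\tau-t} e^{-\alpha(\tau-s)} |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} \, ds \\ & + \frac{1}{2} \int_{\tau-t}^{\tau} e^{-\alpha(\tau-s)} \left( |y(s) - h(\zeta(s),u(s))|^2_{R^{-1}} - |y(s+t-\tau) - h(\zeta(s),u(s))|^2_{R^{-1}} \right) ds. \end{aligned}$$

This clearly goes to $\epsilon$ as $\tau \to t$ so we conclude that

$$\lim_{\tau \to t^+} Q(x,\tau) - Q(x,t) \le \epsilon.$$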

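But $\epsilon$ was arbitrary and we can reverse $t$ and $\tau$, so

$$\lim_{\tau \to t} Q(x,\tau) = Q(x,t).$$

Now by the Lipschitz continuity with respect to $x$, for all $x, \xi \in K$, $0 \le \tau, t \le T$

$$\begin{aligned} |Q(\xi,\tau) - Q(x,t)| &\le |Q(\xi,\tau) - Q(x,\tau)| + |Q(x,\tau) - Q(x,t)| \\ &\le L_1 |\xi - x| + |Q(x,\tau) - Q(x,t)| \end{aligned}$$

and this goes to zero as $(\xi,\tau) \to (x,t)$. We conclude that $Q(x,t)$ is continuous.

Next we show that 1 and 2 of Definition 1 hold. Let $0 \le \tau < t$; then the principle of optimality implies that

$$Q(x,t) = \inf \left\{ e^{-\alpha(t-\tau)} Q(z(\tau),\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds \right\}$$

where the infimum is over all $w(\cdot), v(\cdot), z(\cdot)$ satisfying on $[\tau,t]$

$$\begin{aligned} \dot z &= f(z,u) + g(z)w \\ y &= h(z,u) + k(z)v \\ z(t) &= x. \end{aligned} \tag{14}$$

Let $\Phi(\xi,\tau)$ be any $C^\infty$ function such that near $x, t$

$$\Phi(x,t) - Q(x,t) \le e^{-\alpha(t-\tau)} \left( \Phi(\xi,\tau) - Q(\xi,\tau) \right). \tag{15}$$

Suppose $w(s) = w$, a constant, on $[\tau,t]$ and let $\xi = z(\tau)$ where $v(\cdot), z(\cdot)$ satisfy (14). For any constant $w$ we have

$$Q(x,t) \le e^{-\alpha(t-\tau)} Q(\xi,\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} \left( |w|^2 + |v(s)|^2 \right) ds \tag{16}$$

so adding (15, 16) together yields

$$\Phi(x,t) \le e^{-\alpha(t-\tau)} \Phi(\xi,\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} \left( |w|^2 + |v(s)|^2 \right) ds.$$

Recall that $u(t)$ is continuous from the left. Assume $t - \tau$ is small; then for any constant $w$

$$\begin{aligned} \Phi(x,t) \le{} & (1 - \alpha(t-\tau)) \, \Phi\big( x - (f(x,u(t)) + g(x)w)(t-\tau), \tau \big) \\ & + \frac{1}{2} \left( |w|^2 + |y(t) - h(x,u(t))|^2_{R^{-1}} \right) (t-\tau) + o(t-\tau) \end{aligned}$$

$$\begin{aligned} \Phi(x,t) \le{} & \Phi(x,t) - \alpha \Phi(x,t)(t-\tau) - \Phi_t(x,t)(t-\tau) - \Phi_x(x,t)(f(x,u(t)) + g(x)w)(t-\tau) \\ & + \frac{1}{2} \left( |w|^2 + |y(t) - h(x,u(t))|^2_{R^{-1}} \right) (t-\tau) + o(t-\tau) \end{aligned}$$

$$0 \ge \alpha \Phi(x,t) + \Phi_t(x,t) + \Phi_x(x,t)(f(x,u(t)) + g(x)w) - \frac{1}{2} \left( |w|^2 + |y(t) - h(x,u(t))|^2_{R^{-1}} \right).$$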

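We let

$$w = g'(x) \Phi_x'(x,t)$$

to obtain

$$0 \ge \alpha \Phi(x,t) + \Phi_t(x,t) + \Phi_x(x,t) f(x,u(t)) + \frac{1}{2} |\Phi_x(x,t)|^2_{\Gamma} - \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}}. \tag{17}$$

On the other hand, suppose

$$\Phi(x,t) - Q(x,t) \ge e^{-\alpha(t-\tau)} \left( \Phi(\xi,\tau) - Q(\xi,\tau) \right) \tag{18}$$

in some neighborhood of $x, t$. Given any $\epsilon > 0$ and $0 \le \tau < t$ there is a $w(s)$ such that

$$Q(x,t) \ge e^{-\alpha(t-\tau)} Q(\xi,\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds - \epsilon(t-\tau) \tag{19}$$

where $\xi = z(\tau)$ from (14). Adding (18, 19) together yields, for some $w(s)$,

$$\Phi(x,t) \ge e^{-\alpha(t-\tau)} \Phi(\xi,\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds - \epsilon(t-\tau),$$

$$\begin{aligned} \Phi(x,t) \ge{} & \Phi(x,t) - \alpha \Phi(x,t)(t-\tau) - \Phi_t(x,t)(t-\tau) - \Phi_x(x,t) f(x,u(t))(t-\tau) \\ & + \int_{\tau}^{t} \left( -\Phi_x(z(s),s) g(z(s)) w(s) + \frac{1}{2} |w(s)|^2 \right) ds + \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}} (t-\tau) \\ & + o(t-\tau) - \epsilon(t-\tau). \end{aligned}$$

At each $s \in [\tau,t]$, the minimum of the right side with respect to $w(s)$ occurs at

$$w(s) = g'(z(s)) \Phi_x'(z(s),s)$$

so we obtain

$$0 \le \alpha \Phi(x,t) + \Phi_t(x,t) + \Phi_x(x,t) f(x,u(t)) + \frac{1}{2} |\Phi_x(x,t)|^2_{\Gamma} - \frac{1}{2} |y(t) - h(x,u(t))|^2_{R^{-1}}. \tag{20}$$

Note that we have an initial value problem (9) for the Hamilton-Jacobi PDE (10) and this determines the directions of the inequalities (17, 20).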
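Both halves of the proof above rest on the same completion of the square: the quadratic $\Phi_x g w - \frac{1}{2}|w|^2$ is largest at $w = g'\Phi_x'$, where it equals $\frac{1}{2}|\Phi_x|^2_\Gamma$. The short sketch below checks this numerically for a small example. The helper names `hamiltonian_term` and `maximizing_noise`, and the use of plain Python lists for the row vector `p` (standing in for $\Phi_x$) and the matrix `G` (standing in for $g(x)$), are assumptions made for illustration.

```python
def hamiltonian_term(p, G, w):
    """Value of p (G w) - 0.5 |w|^2 for a row vector p, an n x l matrix G and noise w."""
    Gw = [sum(G[i][j] * w[j] for j in range(len(w))) for i in range(len(G))]
    return sum(p[i] * Gw[i] for i in range(len(p))) - 0.5 * sum(wj * wj for wj in w)

def maximizing_noise(p, G):
    """w* = G' p', the stationary point of the concave quadratic above."""
    return [sum(G[i][j] * p[i] for i in range(len(G))) for j in range(len(G[0]))]

def half_gamma_norm(p, G):
    """0.5 * p Gamma p' with Gamma = G G'."""
    w_star = maximizing_noise(p, G)
    return 0.5 * sum(wj * wj for wj in w_star)

p = [1.0, 2.0]
G = [[1.0, 0.0], [1.0, 1.0]]
w_star = maximizing_noise(p, G)
best = hamiltonian_term(p, G, w_star)
```

In this example `best` coincides with `half_gamma_norm(p, G)`, and perturbing `w_star` in any direction lowers the value, which is the step that turns the inequality in $w$ into (17) and (20).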

3 Smooth Solutions

In this section we review some known facts about viscosity solutions in general and $Q(x,t)$ in particular. If $Q$ is differentiable at $x, t$ then it satisfies the Hamilton-Jacobi PDE (10) in the classical sense [2]. There is at most one viscosity solution to the Hamilton-Jacobi PDE (10) [2]. Furthermore [9], [5], if $Q$ is differentiable at $(\hat x(t),t)$ then

$$0 = Q_x(\hat x(t),t). \tag{21}$$

If, in addition, $\hat x$ is differentiable at $t$ then

$$\frac{d}{dt} Q(\hat x(t),t) = Q_t(\hat x(t),t) + Q_x(\hat x(t),t) \dot{\hat x}(t) = Q_t(\hat x(t),t)$$

so this and (10) imply that

$$\frac{d}{dt} Q(\hat x(t),t) = -\alpha Q(\hat x(t),t) + \frac{1}{2} |y(t) - h(\hat x(t),u(t))|^2_{R^{-1}}. \tag{22}$$

Suppose that $Q$ is $C^2$ in a neighborhood of $(\hat x(t),t)$ and $\hat x$ is differentiable in a neighborhood of $t$. We differentiate (21) with respect to $t$ to obtain

$$0 = Q_{x_i t}(\hat x(t),t) + Q_{x_i x_j}(\hat x(t),t) \dot{\hat x}_j(t).$$

We are using the convention of summing on repeated indices. We differentiate the Hamilton-Jacobi PDE (10) with respect to $x_i$ at $\hat x(t)$ to obtain

$$0 = Q_{t x_i}(\hat x(t),t) + Q_{x_j x_i}(\hat x(t),t) f_j(\hat x(t),u(t)) + h_{r x_i}(\hat x(t),u(t)) R^{-1}_{rs}(\hat x(t)) \left( y_s(t) - h_s(\hat x(t),u(t)) \right)$$

so by the commutativity of mixed partials

$$Q_{x_i x_j}(\hat x(t),t) \dot{\hat x}_j(t) = Q_{x_i x_j}(\hat x(t),t) f_j(\hat x(t),u(t)) + h_{r x_i}(\hat x(t),u(t)) R^{-1}_{rs}(\hat x(t)) \left( y_s(t) - h_s(\hat x(t),u(t)) \right).$$

If $Q_{xx}(\hat x(t),t)$ is invertible, we define $P(t) = Q^{-1}_{xx}(\hat x(t),t)$ and obtain an ODE for $\hat x(t)$,

$$\dot{\hat x}(t) = f(\hat x(t),u(t)) + P(t) h_x'(\hat x(t),u(t)) R^{-1}(\hat x(t)) \left( y(t) - h(\hat x(t),u(t)) \right). \tag{23}$$

Suppose that $\Gamma(x), R(x)$ are constant, $f, h$ are $C^2$ in a neighborhood of $\hat x(t)$ and $Q$ is $C^3$ in a neighborhood of $(\hat x(t),t)$; then we differentiate the PDE (10) twice with respect to $x_i$ and $x_j$ at $\hat x(t)$ to obtain

$$\begin{aligned} 0 ={} & \alpha Q_{x_i x_j}(\hat x(t),t) + Q_{x_i x_j t}(\hat x(t),t) \\ & + Q_{x_i x_k}(\hat x(t),t) f_{k x_j}(\hat x(t),u(t)) + Q_{x_j x_k}(\hat x(t),t) f_{k x_i}(\hat x(t),u(t)) \\ & + Q_{x_i x_j x_k}(\hat x(t),t) f_k(\hat x(t),u(t)) + Q_{x_i x_k}(\hat x(t),t) \Gamma_{kl} Q_{x_l x_j}(\hat x(t),t) \\ & - h_{r x_i}(\hat x(t),u(t)) R^{-1}_{rs} h_{s x_j}(\hat x(t),u(t)) + h_{r x_i x_j}(\hat x(t),u(t)) R^{-1}_{rs} \left( y_s(t) - h_s(\hat x(t),u(t)) \right). \end{aligned}$$

If we set to zero $\alpha$, the second partials of $f, h$ and the third partials of $Q$ then we obtain

$$\begin{aligned} 0 ={} & Q_{x_i x_j t}(\hat x(t),t) + Q_{x_i x_k}(\hat x(t),t) f_{k x_j}(\hat x(t),u(t)) + Q_{x_j x_k}(\hat x(t),t) f_{k x_i}(\hat x(t),u(t)) \\ & + Q_{x_i x_k}(\hat x(t),t) \Gamma_{kl} Q_{x_l x_j}(\hat x(t),t) - h_{r x_i}(\hat x(t),u(t)) R^{-1}_{rs} h_{s x_j}(\hat x(t),u(t)) \end{aligned}$$

and so, if it exists, $P(t)$ satisfies

$$\dot P(t) = f_x(\hat x(t),u(t)) P(t) + P(t) f_x'(\hat x(t),u(t)) + \Gamma - P(t) h_x'(\hat x(t),u(t)) R^{-1} h_x(\hat x(t),u(t)) P(t). \tag{24}$$

We recognize (23, 24) as the equations of the extended Kalman filter [4]. Baras, Bensoussan and James [1] have shown that, under suitable assumptions, the extended Kalman filter converges to the true state provided that the initial error is not too large. Their conditions are quite restrictive and hard to verify. Recently Krener [8] proved that the extended Kalman filter is locally convergent under broad and verifiable conditions. There is a typographical error in the proof; the corrected version is available from the web.

4 Convergence

In this section we shall prove the main result of this paper, that is, under certain conditions, the minimum energy estimate converges to the true state.

Lemma 1. Suppose $Q(x,t)$ is defined by (6) and $\hat x(t)$ is a piecewise continuous selection of (8). Then for any $0 \le \tau \le t$

$$Q(\hat x(t),t) = e^{-\alpha(t-\tau)} Q(\hat x(\tau),\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} |y(s) - h(\hat x(s),u(s))|^2_{R^{-1}} \, ds.$$

Proof. With sufficient smoothness the lemma follows from (22). If $Q, \hat x$ are not smooth we proceed as follows. Let $\tau \le s_{i-1} < s_i \le t$; then

$$Q(\hat x(s_i),s_i) = \inf \left\{ e^{-\alpha(s_i - s_{i-1})} Q(z(s_{i-1}),s_{i-1}) + \frac{1}{2} \int_{s_{i-1}}^{s_i} e^{-\alpha(s_i - s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds \right\}$$

where the infimum is over all $w(\cdot), v(\cdot), z(\cdot)$ satisfying

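$$\begin{aligned} \dot z &= f(z,u) + g(z)w \\ y &= h(z,u) + k(z)v \\ z(s_i) &= \hat x(s_i). \end{aligned}$$

If $\hat x(s), u(s)$ are continuous on $[s_{i-1},s_i]$ then

$$\begin{aligned} Q(\hat x(s_i),s_i) \ge{} & \inf \left\{ e^{-\alpha(s_i - s_{i-1})} Q(z(s_{i-1}),s_{i-1}) \right\} + \inf \left\{ \frac{1}{2} \int_{s_{i-1}}^{s_i} e^{-\alpha(s_i - s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds \right\} \\ \ge{} & e^{-\alpha(s_i - s_{i-1})} Q(\hat x(s_{i-1}),s_{i-1}) + \inf \left\{ \frac{1}{2} \int_{s_{i-1}}^{s_i} e^{-\alpha(s_i - s)} |v(s)|^2 \, ds \right\} \\ \ge{} & e^{-\alpha(s_i - s_{i-1})} Q(\hat x(s_{i-1}),s_{i-1}) + \frac{1}{2} |y(s_i) - h(\hat x(s_i),u(s_i))|^2_{R^{-1}} (s_i - s_{i-1}) + o(s_i - s_{i-1}). \end{aligned}$$

Since $\hat x(s), u(s)$ are piecewise continuous on $[\tau,t]$, they have only a finite number of discontinuities. Let $\tau = s_0 < s_1 < \ldots < s_k = t$; then for most $i$ the above holds, so

$$Q(\hat x(t),t) \ge e^{-\alpha(t-\tau)} Q(\hat x(\tau),\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} |y(s) - h(\hat x(s),u(s))|^2_{R^{-1}} \, ds.$$

On the other hand

$$Q(\hat x(s_i),s_i) \le e^{-\alpha(s_i - s_{i-1})} Q(z(s_{i-1}),s_{i-1}) + \frac{1}{2} \int_{s_{i-1}}^{s_i} e^{-\alpha(s_i - s)} \left( |w(s)|^2 + |v(s)|^2 \right) ds$$

for any $w(\cdot), v(\cdot), z(\cdot)$ satisfying

$$\begin{aligned} \dot z &= f(z,u) + g(z)w \\ y &= h(z,u) + k(z)v \\ z(s_{i-1}) &= \hat x(s_{i-1}). \end{aligned}$$

In particular if we set $w = 0$ and assume $\hat x(s)$ is continuous on $[s_{i-1},s_i]$ then

$$Q(\hat x(s_i),s_i) \le e^{-\alpha(s_i - s_{i-1})} Q(\hat x(s_{i-1}),s_{i-1}) + \frac{1}{2} \int_{s_{i-1}}^{s_i} e^{-\alpha(s_i - s)} |v(s)|^2 \, ds$$

$$Q(\hat x(s_i),s_i) \le e^{-\alpha(s_i - s_{i-1})} Q(\hat x(s_{i-1}),s_{i-1}) + \frac{1}{2} |y(s_i) - h(\hat x(s_{i-1}),u(s_{i-1}))|^2_{R^{-1}} (s_i - s_{i-1}) + o(s_i - s_{i-1}).$$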

Therefore, since $\hat x(s), u(s)$ are piecewise continuous on $[\tau,t]$ with only a finite number of discontinuities,

$$Q(\hat x(t),t) \le e^{-\alpha(t-\tau)} Q(\hat x(\tau),\tau) + \frac{1}{2} \int_{\tau}^{t} e^{-\alpha(t-s)} |y(s) - h(\hat x(s),u(s))|^2_{R^{-1}} \, ds.$$

Definition 2. [3] The system

$$\begin{aligned} \dot z &= f(z,u) + g(z)w \\ y &= h(z,u) \end{aligned} \tag{25}$$

is uniformly observable for any input if there exist coordinates

$$\{ x_{ij} : i = 1, \ldots, p, \ j = 1, \ldots, l_i \}$$

where $1 \le l_1 \le \ldots \le l_p$ and $\sum l_i = n$ such that in these coordinates the system takes the form

$$\begin{aligned} y_i &= x_{i1} + h_i(u) \\ \dot x_{i1} &= x_{i2} + f_{i1}(x_1,u) + g_{i1}(x_1)w \\ &\ \ \vdots \\ \dot x_{ij} &= x_{i,j+1} + f_{ij}(x_j,u) + g_{ij}(x_j)w \\ &\ \ \vdots \\ \dot x_{i,l_i-1} &= x_{i l_i} + f_{i,l_i-1}(x_{l_i-1},u) + g_{i,l_i-1}(x_{l_i-1})w \\ \dot x_{i l_i} &= f_{i,l_i}(x_{l_i},u) + g_{i,l_i}(x_{l_i})w \end{aligned} \tag{26}$$

for $i = 1, \ldots, p$ where $x_j$ is defined by

$$x_j = (x_{11}, \ldots, x_{1,j \wedge l_1}, x_{21}, \ldots, x_{pj}). \tag{27}$$

Notice that in $x_j$ the indices range over $i = 1, \ldots, p$; $k = 1, \ldots, j \wedge l_i = \min\{j, l_i\}$ and the coordinates are ordered so that the right index moves faster than the left.

We also require that each $f_{ij}, g_{ij}$ be Lipschitz continuous and satisfy growth conditions: there exists an $L$ such that for all $x, z \in \mathbb{R}^n$, $u \in U$,

$$\begin{aligned} |f_{ij}(x_j,u) - f_{ij}(z_j,u)| &\le L |x_j - z_j| \\ |g_{ij}(x_j) - g_{ij}(z_j)| &\le L |x_j - z_j| \\ |f_{ij}(x_j,u)| &\le (L + 1) |x_j| \\ |g_{ij}(x_j)| &\le L. \end{aligned} \tag{28}$$

A system as above but without inputs is said to be uniformly observable [3].
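The truncated coordinate vector $x_j$ of (27) is easy to misread, so here is a minimal sketch that lists which index pairs $(i,k)$ it contains, in the stated order. The function name `truncated_indices` and the representation of the block sizes $l_1, \ldots, l_p$ as a Python list are assumptions made for illustration.

```python
def truncated_indices(block_sizes, j):
    """Index pairs (i, k) of the coordinates x_ik that make up x_j in (27).

    block_sizes = [l_1, ..., l_p]; for each block i the second index k runs
    over 1..min(j, l_i), and k (the right index) moves faster than i.
    """
    return [(i, k)
            for i, l_i in enumerate(block_sizes, start=1)
            for k in range(1, min(j, l_i) + 1)]

# Example with p = 2 blocks of sizes l_1 = 1 and l_2 = 3 (so n = 4).
x_2 = truncated_indices([1, 3], 2)
x_3 = truncated_indices([1, 3], 3)
```

With these block sizes `x_2` contains the pairs for $x_{11}, x_{21}, x_{22}$, while `x_3` contains all four coordinates, which reflects the triangular dependence of $f_{ij}, g_{ij}$ on the state in (26).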

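Let

$$A_i = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}^{l_i \times l_i} \qquad A = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_p \end{bmatrix}^{n \times n}$$

$$C_i = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^{1 \times l_i} \qquad C = \begin{bmatrix} C_1 & 0 & \cdots & 0 \\ 0 & C_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & C_p \end{bmatrix}^{p \times n}$$

$$\bar f_i(x,u) = \begin{bmatrix} f_{i1}(x_1,u) \\ \vdots \\ f_{i l_i}(x_{l_i},u) \end{bmatrix}^{l_i \times 1} \qquad \bar f(x,u) = \begin{bmatrix} \bar f_1(x,u) \\ \vdots \\ \bar f_p(x,u) \end{bmatrix}^{n \times 1}$$

$$\bar g_i(x) = \begin{bmatrix} g_{i1}(x_1) \\ \vdots \\ g_{i l_i}(x_{l_i}) \end{bmatrix}^{l_i \times l} \qquad \bar g(x) = \begin{bmatrix} \bar g_1(x) \\ \vdots \\ \bar g_p(x) \end{bmatrix}^{n \times l} \tag{29}$$

$$\bar h(u) = \begin{bmatrix} h_1(u), \ldots, h_p(u) \end{bmatrix}' \tag{30}$$

then (26) becomes

$$\begin{aligned} \dot x &= Ax + \bar f(x,u) + \bar g(x)w \\ y &= Cx + \bar h(u). \end{aligned} \tag{31}$$

We recall the high gain observer of Gauthier, Hammouri and Othman [3]. Their estimate $\bar x(t)$ of $x(t)$ given $\bar x^0$, $y(s)$, $0 \le s \le t$ is given by the observer

$$\begin{aligned} \dot{\bar x} &= A\bar x + \bar f(\bar x,u) + \bar g(\bar x)w + S^{-1}(\theta) C' \left( y - C\bar x - \bar h(u) \right) \\ \bar x(0) &= \bar x^0 \end{aligned} \tag{32}$$

where $\theta > 0$ and $S(\theta)$ is the solution of

$$A' S(\theta) + S(\theta) A - C'C = -\theta S(\theta). \tag{33}$$

It is not hard to see that $S(\theta)$ is positive definite for $\theta > 0$, for it satisfies the Lyapunov equation

$$\left( -A - \frac{\theta}{2} I \right)' S(\theta) + S(\theta) \left( -A - \frac{\theta}{2} I \right) = -C'C$$

where $C, \left( -A - \frac{\theta}{2} I \right)$ is an observable pair and $\left( -A - \frac{\theta}{2} I \right)$ has all eigenvalues equal to $-\frac{\theta}{2}$.

Gauthier, Hammouri and Othman [3] showed that when $\theta$ is sufficiently large, $p = 1$, $u = 0$ and $w(\cdot)$ is $L^2[0,\infty)$ then $|x(t) - \bar x(t)| \to 0$ exponentially as $t \to \infty$. We shall modify their proof to show that when $\theta$ is sufficiently large, $p$ is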

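arbitrary, $u(\cdot)$ is piecewise continuous and $w(\cdot)$ is $L^2[0,\infty)$ then $|x(t) - \bar x(t)| \to 0$ exponentially as $t \to \infty$. The key to both results is the following lemma. We define

$$|x|^2_\theta = x' S(\theta) x.$$

Since $S(\theta)$ is positive definite, for each $\theta > 0$ there exist constants $M_1(\theta), M_2(\theta)$ so that

$$M_1(\theta) |x| \le |x|_\theta \le M_2(\theta) |x|. \tag{34}$$

Lemma 2. [3] Suppose $\bar g$ is of the form (29) and satisfies the Lipschitz conditions (28). Then there exists a Lipschitz constant $L$ which is independent of $\theta$ such that for all $x, z \in \mathbb{R}^n$,

$$|\bar g(x) - \bar g(z)|_\theta \le L |x - z|_\theta. \tag{35}$$

Note that $\bar g(x)$ is an $n \times l$ matrix so $|\bar g(x) - \bar g(z)|_\theta$ denotes the induced operator norm.

Proof. It follows from (33) that

$$S_{ij,rs}(\theta) = \frac{S_{ij,rs}(1)}{\theta^{j+s-1}} = \frac{(-1)^{j+s}}{\theta^{j+s-1}} \binom{j+s-2}{j-1} \tag{36}$$

for entries in the same block ($i = r$); the entries with $i \ne r$ vanish because $A$ and $C$ are block diagonal. Let $C = \frac{1}{M_1(1)}$; then $|x| \le C |x|_1$. Let $\sigma = \max\{ |S_{ij,rs}(1)| \}$; then for each constant $w \in \mathbb{R}^l$

$$\begin{aligned} |\bar g(x)w - \bar g(z)w|^2_\theta &\le \sum \left| \bar g_{ij}(x)w - \bar g_{ij}(z)w \right| \frac{|S_{ij,rs}(1)|}{\theta^{j+s-1}} \left| \bar g_{rs}(x)w - \bar g_{rs}(z)w \right| \\ &\le \sigma L^2 \sum \frac{1}{\theta^{j+s-1}} |x_j - z_j| \, |x_s - z_s| \, |w|^2. \end{aligned}$$

Define

$$\xi_{ij} = \frac{x_{ij}}{\theta^j}, \qquad \zeta_{ij} = \frac{z_{ij}}{\theta^j}$$

and $\xi_j, \zeta_j$ as with $x_j, z_j$. Then

$$\frac{1}{\theta^j} |x_j - z_j| \le |\xi_j - \zeta_j|$$

and so

$$\begin{aligned} |\bar g(x)w - \bar g(z)w|^2_\theta &\le \sigma L^2 \theta \sum |\xi_j - \zeta_j| \, |\xi_s - \zeta_s| \, |w|^2 \\ &\le \sigma L^2 \theta n^2 |\xi - \zeta|^2 |w|^2 \\ &\le \sigma L^2 \theta C^2 n^2 |\xi - \zeta|^2_1 |w|^2 \\ &\le \sigma L^2 C^2 n^2 |x - z|^2_\theta |w|^2 \end{aligned}$$

$$|\bar g(x) - \bar g(z)|_\theta \le \sqrt{\sigma} L C n |x - z|_\theta.$$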
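The scaling law (36) that drives the proof can be checked directly against (33). The sketch below builds the block-diagonal $A$ and $C$ of (29) for given block sizes, fills in $S(\theta)$ from the closed form in (36) within each block, and evaluates the residual $A'S + SA - C'C + \theta S$ in exact rational arithmetic. The helper names (`shift_and_output`, `closed_form_S`, `residual_33`) and the list-of-lists matrix representation are assumptions made for illustration, and setting the cross-block entries of $S$ to zero is the sketch's own choice, consistent with $A$ and $C$ being block diagonal.

```python
from fractions import Fraction
from math import comb

def shift_and_output(block_sizes):
    """Block-diagonal A (ones on each block's superdiagonal) and C (first state of each block)."""
    n, p = sum(block_sizes), len(block_sizes)
    A = [[Fraction(0)] * n for _ in range(n)]
    C = [[Fraction(0)] * n for _ in range(p)]
    start = 0
    for i, l_i in enumerate(block_sizes):
        C[i][start] = Fraction(1)
        for j in range(l_i - 1):
            A[start + j][start + j + 1] = Fraction(1)
        start += l_i
    return A, C

def closed_form_S(block_sizes, theta):
    """S(theta) with in-block entries (-1)^(j+s) * binom(j+s-2, j-1) / theta^(j+s-1)."""
    n, theta = sum(block_sizes), Fraction(theta)
    S = [[Fraction(0)] * n for _ in range(n)]
    start = 0
    for l_i in block_sizes:
        for j in range(1, l_i + 1):
            for s in range(1, l_i + 1):
                sign_binom = Fraction((-1) ** (j + s) * comb(j + s - 2, j - 1))
                S[start + j - 1][start + s - 1] = sign_binom / theta ** (j + s - 1)
        start += l_i
    return S

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def residual_33(block_sizes, theta):
    """A'S + SA - C'C + theta*S, which is the zero matrix when S solves (33)."""
    A, C = shift_and_output(block_sizes)
    S = closed_form_S(block_sizes, theta)
    AtS, SA, CtC = matmul(transpose(A), S), matmul(S, A), matmul(transpose(C), C)
    n = len(S)
    return [[AtS[i][j] + SA[i][j] - CtC[i][j] + Fraction(theta) * S[i][j]
             for j in range(n)] for i in range(n)]
```

Because the arithmetic is exact, a zero residual for a few choices of block sizes and $\theta$ is a reasonable sanity check of (36), though of course not a proof.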

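Notice that for each $u \in U$, $\bar f(\cdot,u)$ also satisfies the hypothesis of the above lemma, so

$$|\bar f(x,u) - \bar f(z,u)|_\theta \le L |x - z|_\theta.$$

Theorem 2. Suppose

• the system (25) is uniformly observable for any input so that it can be transformed to (31), which satisfies the Lipschitz and growth conditions (28),

• $u(\cdot)$ is piecewise continuous,

• $w(\cdot)$ is $L^2[0,\infty)$, i.e., $\int_0^\infty |w(s)|^2 \, ds < \infty$,

• $x(t), y(t)$ are any state and output trajectories generated by system (31) with inputs $u(\cdot)$ and $w(\cdot)$,

• $\theta$ is sufficiently large,

• $\bar x(t)$ is the solution of (32).

Then $|x(t) - \bar x(t)| \to 0$ exponentially as $t \to \infty$.

Proof. Let $\tilde x(t) = x(t) - \bar x(t)$; then

$$\begin{aligned} \frac{d}{dt} |\tilde x|^2_\theta &= 2 \tilde x' S(\theta) \dot{\tilde x} \\ &= 2 \tilde x' S(\theta) \left( A\tilde x + \bar f(x,u) - \bar f(\bar x,u) + (\bar g(x) - \bar g(\bar x)) w - S^{-1}(\theta) C'C \tilde x \right) \\ &\le -\theta |\tilde x|^2_\theta + 2 |\tilde x|_\theta \left| \bar f(x,u) - \bar f(\bar x,u) + (\bar g(x) - \bar g(\bar x)) w \right|_\theta \\ &\le \left( -\theta + 2L(1 + |w|) \right) |\tilde x|^2_\theta. \end{aligned}$$

Define

$$\beta(t,\tau) = \int_{\tau}^{t} \left( -\theta + 2L(1 + |w(s)|) \right) ds.$$

We choose $\theta \ge 5L$ and $\tau$ large enough so that

$$\left( \int_{\tau}^{\infty} |w(s)|^2 \, ds \right)^{1/2} \le 1.$$

Then for $\tau + 1 \le t$

$$\begin{aligned} \beta(t,\tau) &= (-\theta + 2L)(t - \tau) + 2L \int_{\tau}^{t} |w(s)| \, ds \\ &\le (-\theta + 2L)(t - \tau) + 2L \left( \int_{\tau}^{t} 1 \, ds \right)^{1/2} \left( \int_{\tau}^{t} |w(s)|^2 \, ds \right)^{1/2} \\ &\le (-\theta + 2L)(t - \tau) + 2L (t - \tau)^{1/2} \left( \int_{\tau}^{t} |w(s)|^2 \, ds \right)^{1/2} \\ &\le -L(t - \tau). \end{aligned}$$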

By Gronwall's inequality, for $0 \le \tau < \tau + 1 \le t$

$$|\tilde x(t)|^2_\theta \le e^{\beta(t,\tau)} |\tilde x(\tau)|^2_\theta \le e^{-L(t-\tau)} |\tilde x(\tau)|^2_\theta$$

and we conclude that $|x(t) - \bar x(t)| \to 0$ exponentially as $t \to \infty$.

Theorem 3. (Main Theorem) Suppose

• the system (1) is uniformly observable for any input and so without loss of generality we can assume that it is in the form

$$\begin{aligned} \dot x &= Ax + \bar f(x,u) \\ y &= Cx + \bar h(u) \end{aligned} \tag{37}$$

with $A, C, \bar f, \bar h$ as above,

• $g(x)$ has been chosen so that (25) is uniformly observable for any input and WLOG (25) is in the form (31) with $A, C, \bar f, \bar g, \bar h$ as above,

• $k(x)$ has been chosen to satisfy condition (5),

• $x(t), u(t), y(t)$ are any state, control and output trajectories generated by the noise free system (37),

• $Q(x,t)$ is defined by (6) with $\alpha \ge 0$ for the system

$$\begin{aligned} \dot x &= Ax + \bar f(x,u) + \bar g(x)w \\ y &= Cx + \bar h(u) + k(x)v \end{aligned} \tag{38}$$

where $Q^0(x^0)$ is Lipschitz continuous on compact subsets of $\mathbb{R}^n$,

• $\hat x(t)$ is a piecewise continuous minimizing selection of $Q(x,t)$, (8).

Then $|x(t) - \hat x(t)| \to 0$ as $t \to \infty$. If $\alpha > 0$ then the convergence is exponential.

Proof. Let $\bar x(t)$ be the solution of the high gain observer (32) with $\bar g = 0$, driven by $u(t), y(t)$, where $\bar x^0 = \hat x^0$ and the gain is high enough to ensure exponential convergence,

$$|x(t) - \bar x(t)| \to 0 \text{ as } t \to \infty. \tag{39}$$

We know that for any $T \ge 0$ there exists $w_T(t)$ such that the solution $z_T(t)$ of

$$\begin{aligned} \dot z_T &= A z_T + \bar f(z_T,u) + \bar g(z_T) w_T \\ z_T(T) &= \hat x(T) \end{aligned}$$

satisfies for $0 \le \tau \le T$

$$e^{-\alpha(T-\tau)} Q(z_T(\tau),\tau) + \frac{1}{2} \int_{\tau}^{T} e^{-\alpha(T-s)} \left( |w_T(s)|^2 + |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \right) ds \le \frac{e^{-\alpha T}}{T+1} + Q(\hat x(T),T).$$

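From Lemma 1 we have for $0 \le \tau \le T$,

$$Q(\hat x(T),T) = e^{-\alpha(T-\tau)} Q(\hat x(\tau),\tau) + \frac{1}{2} \int_{\tau}^{T} e^{-\alpha(T-s)} |y(s) - C\hat x(s) - \bar h(u(s))|^2_{R^{-1}} \, ds.$$

By the definition (6) of $Q$, since $x(s), y(s)$ satisfy the noise free system (37),

$$Q(x(\tau),\tau) \le e^{-\alpha\tau} Q^0(x(0))$$

so

$$Q(\hat x(\tau),\tau) \le Q(x(\tau),\tau) \le e^{-\alpha\tau} Q^0(x(0)).$$

Hence $Q(\hat x(T),T)$ is bounded if $\alpha = 0$ and goes to zero exponentially as $T \to \infty$ if $\alpha > 0$.

From the definition (8) of $\hat x(\tau)$ we have

$$Q(\hat x(\tau),\tau) \le Q(z_T(\tau),\tau).$$

From these we conclude that

$$\begin{aligned} \frac{1}{2} \int_{\tau}^{T} e^{\alpha s} \left( |w_T(s)|^2 + |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \right) ds &\le \frac{1}{T+1} + e^{\alpha\tau} Q(\hat x(\tau),\tau) + \frac{1}{2} \int_{\tau}^{T} e^{\alpha s} |y(s) - C\hat x(s) - \bar h(u(s))|^2_{R^{-1}} \, ds \\ &\le \frac{1}{T+1} + Q^0(x(0)) \end{aligned}$$

and it follows that

$$\int_0^{\infty} e^{\alpha s} |y(s) - C\hat x(s) - \bar h(u(s))|^2_{R^{-1}} \, ds < \infty.$$

Therefore given any $\epsilon$ there is a $\tau$ large enough so that for all $T \ge \tau$

$$\frac{1}{2} \int_{\tau}^{T} e^{\alpha s} |y(s) - C\hat x(s) - \bar h(u(s))|^2_{R^{-1}} \, ds < \epsilon$$

and

$$\frac{1}{2} \int_{\tau}^{T} \left( |w_T(s)|^2 + |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \right) ds < e^{-\alpha\tau} \left( \frac{1}{T+1} + Q^0(x^0) + \epsilon \right). \tag{40}$$

Let $\bar z_T(t)$ be the solution of the following high gain observer for $z_T(t)$ driven by $u(t)$, $w_T(t)$ and $C z_T(t)$,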

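$$\begin{aligned} \dot{\bar z}_T &= A \bar z_T + \bar f(\bar z_T,u) + \bar g(\bar z_T) w_T + S^{-1}(\theta) C' (C z_T - C \bar z_T) \\ \bar z_T(0) &= \hat x^0 \end{aligned} \tag{41}$$

then the error $\tilde z_T(t) = z_T(t) - \bar z_T(t)$ satisfies

$$\dot{\tilde z}_T = A \tilde z_T + \bar f(z_T,u) - \bar f(\bar z_T,u) + (\bar g(z_T) - \bar g(\bar z_T)) w_T - S^{-1}(\theta) C'C \tilde z_T.$$

Proceeding as in the proof of Theorem 2 we obtain

$$\frac{d}{dt} |\tilde z_T|^2_\theta \le \left( -\theta + 2L(1 + |w_T|) \right) |\tilde z_T|^2_\theta.$$

Define

$$\beta_T(t,\tau) = \int_{\tau}^{t} \left( -\theta + 2L(1 + |w_T(s)|) \right) ds.$$

We choose $\theta \ge 5L$ and $\tau$ large enough so that for any $T \ge \tau$

$$\left( \int_{\tau}^{T} |w_T(s)|^2 \, ds \right)^{1/2} \le 1.$$

Then for $\tau + 1 \le t \le T$

$$\begin{aligned} \beta_T(t,\tau) &= (-\theta + 2L)(t - \tau) + 2L \int_{\tau}^{t} |w_T(s)| \, ds \\ &\le (-\theta + 2L)(t - \tau) + 2L \left( \int_{\tau}^{t} 1 \, ds \right)^{1/2} \left( \int_{\tau}^{T} |w_T(s)|^2 \, ds \right)^{1/2} \\ &\le (-\theta + 2L)(t - \tau) + 2L (t - \tau)^{1/2} \left( \int_{\tau}^{T} |w_T(s)|^2 \, ds \right)^{1/2} \\ &\le -L(t - \tau). \end{aligned}$$

By Gronwall's inequality, for $\tau + 1 \le T$

$$|\tilde z_T(T)|^2_\theta \le e^{\beta_T(T,\tau)} |\tilde z_T(\tau)|^2_\theta \le e^{-L(T-\tau)} |\tilde z_T(\tau)|^2_\theta$$

and we conclude that $|z_T(T) - \bar z_T(T)| \to 0$ exponentially as $T \to \infty$, hence

$$|\hat x(T) - \bar z_T(T)| = |z_T(T) - \bar z_T(T)| \to 0$$

exponentially.

The last step of the proof is to show $|\bar x(T) - \bar z_T(T)| \to 0$ (exponentially if $\alpha > 0$). Now

$$\begin{aligned} \frac{d}{dt} (\bar x - \bar z_T) ={} & A\bar x + \bar f(\bar x,u) + S^{-1}(\theta) C' (y - C\bar x - \bar h(u)) \\ & - \left( A \bar z_T + \bar f(\bar z_T,u) + \bar g(\bar z_T) w_T + S^{-1}(\theta) C' (C z_T - C \bar z_T) \right) \\ ={} & \left( A - S^{-1}(\theta) C'C \right) (\bar x - \bar z_T) + \bar f(\bar x,u) - \bar f(\bar z_T,u) - \bar g(\bar z_T) w_T \\ & + S^{-1}(\theta) C' (y - C z_T - \bar h(u)) \end{aligned}$$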

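so

$$\begin{aligned} \frac{d}{dt} |\bar x - \bar z_T|^2_\theta ={} & 2 (\bar x - \bar z_T)' S(\theta) \Big( \left( A - S^{-1}(\theta) C'C \right) (\bar x - \bar z_T) + \bar f(\bar x,u) - \bar f(\bar z_T,u) \\ & \qquad - \bar g(\bar z_T) w_T + S^{-1}(\theta) C' (y - C z_T - \bar h(u)) \Big) \\ \le{} & -\theta |\bar x - \bar z_T|^2_\theta + 2 |\bar x - \bar z_T|_\theta \left| \bar f(\bar x,u) - \bar f(\bar z_T,u) - \bar g(\bar z_T) w_T \right|_\theta \\ & + 2 \left| (\bar x - \bar z_T)' C' (y - C z_T - \bar h(u)) \right| \\ \le{} & (-\theta + 2L) |\bar x - \bar z_T|^2_\theta + 2 L M_2(\theta) |\bar x - \bar z_T|_\theta |w_T| \\ & + 2 \left| (\bar x - \bar z_T)' C' (y - C z_T - \bar h(u)) \right|. \end{aligned}$$

We have chosen $\theta \ge 5L$. Using (28) and (34) we conclude that there is an $M_3(\theta) > 0$ such that

$$\left| (\bar x - \bar z_T)' C' (y - C z_T - \bar h(u)) \right| \le M_3(\theta) |\bar x - \bar z_T|_\theta \, |y - C z_T - \bar h(u)|_{R^{-1}}.$$

Therefore

$$\begin{aligned} \frac{d}{dt} |\bar x - \bar z_T|^2_\theta \le{} & (-\theta + 2L) |\bar x - \bar z_T|^2_\theta + 2 L M_2(\theta) |\bar x - \bar z_T|_\theta |w_T| \\ & + 2 M_3(\theta) |\bar x - \bar z_T|_\theta \, |y - C z_T - \bar h(u)|_{R^{-1}} \end{aligned}$$

$$\frac{d}{dt} |\bar x - \bar z_T|_\theta \le -L |\bar x - \bar z_T|_\theta + L M_2(\theta) |w_T| + M_3(\theta) |y - C z_T - \bar h(u)|_{R^{-1}}.$$

By Gronwall's inequality, for $\tau \le t \le T$

$$\begin{aligned} |\bar x(t) - \bar z_T(t)|_\theta \le{} & e^{-L(t-\tau)} |\bar x(\tau) - \bar z_T(\tau)|_\theta + \int_{\tau}^{t} e^{-L(t-s)} L M_2(\theta) |w_T(s)| \, ds \\ & + \int_{\tau}^{t} e^{-L(t-s)} M_3(\theta) |y(s) - C z_T(s) - \bar h(u(s))|_{R^{-1}} \, ds \\ \le{} & e^{-L(t-\tau)} |\bar x(\tau) - \bar z_T(\tau)|_\theta \\ & + L M_2(\theta) \left( \int_0^{t-\tau} e^{-2Ls} \, ds \right)^{1/2} \left( \int_{\tau}^{t} |w_T(s)|^2 \, ds \right)^{1/2} \\ & + M_3(\theta) \left( \int_0^{t-\tau} e^{-2Ls} \, ds \right)^{1/2} \left( \int_{\tau}^{t} |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \, ds \right)^{1/2} \\ \le{} & e^{-L(t-\tau)} |\bar x(\tau) - \bar z_T(\tau)|_\theta \\ & + L M_2(\theta) \left( \int_0^{\infty} e^{-2Ls} \, ds \right)^{1/2} \left( \int_{\tau}^{t} |w_T(s)|^2 \, ds \right)^{1/2} \\ & + M_3(\theta) \left( \int_0^{\infty} e^{-2Ls} \, ds \right)^{1/2} \left( \int_{\tau}^{t} |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \, ds \right)^{1/2} \end{aligned}$$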

$$\begin{aligned} \le{} & e^{-L(t-\tau)} |\bar x(\tau) - \bar z_T(\tau)|_\theta + L M_2(\theta) \left( \frac{1}{2L} \right)^{1/2} \left( \int_{\tau}^{t} |w_T(s)|^2 \, ds \right)^{1/2} \\ & + M_3(\theta) \left( \frac{1}{2L} \right)^{1/2} \left( \int_{\tau}^{t} |y(s) - C z_T(s) - \bar h(u(s))|^2_{R^{-1}} \, ds \right)^{1/2}. \end{aligned}$$

As before, from (40) we see that given any $\delta > 0$ we can choose $\tau$ large enough so that for all $\alpha \ge 0$ and all $\tau \le t \le T$ we have

$$|\bar x(t) - \bar z_T(t)|_\theta \le e^{-L(t-\tau)} |\bar x(\tau) - \bar z_T(\tau)|_\theta + \delta$$

so we conclude that $|\bar x(t) - \bar z_T(t)| \to 0$ as $t \to \infty$. In particular, $|\bar x(T) - \bar z_T(T)| \to 0$ as $T \to \infty$.

If $\alpha > 0$ then (40) implies that for $\tau = \frac{T}{2}$

$$\begin{aligned} |\bar x(T) - \bar z_T(T)|_\theta \le{} & e^{-L\frac{T}{2}} \left| \bar x\left( \tfrac{T}{2} \right) - \bar z_T\left( \tfrac{T}{2} \right) \right|_\theta \\ & + L M_2(\theta) \left( \frac{1}{2L} \right)^{1/2} \left( 2 e^{-\alpha\frac{T}{2}} \left( \frac{1}{T+1} + Q^0(x^0) + \epsilon \right) \right)^{1/2} \\ & + M_3(\theta) \left( \frac{1}{2L} \right)^{1/2} \left( 2 e^{-\alpha\frac{T}{2}} \left( \frac{1}{T+1} + Q^0(x^0) + \epsilon \right) \right)^{1/2}. \end{aligned}$$

Since we have already shown that $\left| \bar x\left( \tfrac{T}{2} \right) - \bar z_T\left( \tfrac{T}{2} \right) \right| \to 0$ as $T \to \infty$, we conclude that $|\bar x(T) - \bar z_T(T)| \to 0$ exponentially as $T \to \infty$.

5 Conclusion

We have shown the global convergence of the minimum energy estimate to the true state under suitable assumptions. The proof utilized a high gain observer, but it should be emphasized that the minimum energy estimator is not necessarily high gain. It is low gain if the discount rate $\alpha$ is small and the observation noise is substantial, i.e. $R(x)$ is not small relative to $\Gamma(x)$. It becomes higher gain as $\alpha$ is increased, $\Gamma(x)$ is increased or $R(x)$ is decreased. For any size gain, the minimum energy estimator can make instantaneous transitions in the estimate as the location of the minimum of $Q(x,t)$ jumps around.

The principal drawback of the minimum energy estimator is that it requires the solution, in the viscosity sense, of the Hamilton-Jacobi PDE (10) that is driven by the observations. This is very challenging numerically in all but the smallest state dimensions, and the accuracy of the estimate is limited by the fineness of the spatial and temporal mesh. Krener and Duarte [7] have offered a hybrid approach to this difficulty. The solution of (10) is computed on a very coarse grid and this is used to initiate multiple extended Kalman filters (23) which track the local minima of $Q(\cdot,t)$. The one that best explains the observations is taken as the estimate.
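For readers who want to experiment with the extended Kalman filter equations (23, 24) mentioned above, the following is a minimal sketch of a single explicit Euler step for a scalar state and scalar output with constant $\Gamma$ and $R$. It is not the hybrid scheme of [7]: the function name `ekf_euler_step`, the Euler discretization, and the passing of $f$, $h$ and their derivatives as callables are all assumptions made for illustration.

```python
def ekf_euler_step(xhat, P, u, y, dt, f, f_x, h, h_x, gamma, r):
    """One explicit Euler step of the scalar versions of (23) and (24).

    xhat, P : current estimate and the scalar P(t)
    u, y    : current input and observation
    gamma, r: constant scalars standing in for Gamma and R (assumed constant)
    """
    innovation = y - h(xhat, u)
    hx = h_x(xhat, u)
    fx = f_x(xhat, u)
    # (23): estimate dynamics driven by the weighted innovation.
    xhat_dot = f(xhat, u) + P * hx * innovation / r
    # (24): scalar Riccati equation, f_x P + P f_x + Gamma - P h_x R^{-1} h_x P.
    P_dot = 2.0 * fx * P + gamma - (P * hx) ** 2 / r
    return xhat + dt * xhat_dot, P + dt * P_dot

def run_filter(xhat, P, ys, dt, f, f_x, h, h_x, gamma, r, u=0.0):
    """Apply ekf_euler_step along a sequence of observations ys (input held at u)."""
    for y in ys:
        xhat, P = ekf_euler_step(xhat, P, u, y, dt, f, f_x, h, h_x, gamma, r)
    return xhat, P
```

In the hybrid approach described above, several such filters would be started from different local minima found on the coarse grid; how they are initialized and compared is specified in [7] and is not reproduced here.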

References

1. Baras JS, Bensoussan A, James MR (1988) SIAM J on Applied Mathematics 48:1147–1158
2. Evans LC (1998) Partial Differential Equations. American Mathematical Society, Providence, RI
3. Gauthier JP, Hammouri H, Othman S (1992) IEEE Trans Auto Contr 37:875–880
4. Gelb A (1974) Applied Optimal Estimation. MIT Press, Cambridge, MA
5. Hijab O (1980) Minimum Energy Estimation. PhD Thesis, University of California, Berkeley, California
6. Hijab O (1984) Annals of Probability 12:890–902
7. Krener AJ, Duarte A (1996) A hybrid computational approach to nonlinear estimation. In: Proc. of 35th Conference on Decision and Control, 1815–1819, Kobe, Japan
8. Krener AJ (2002) The convergence of the extended Kalman filter. In: Rantzer A, Byrnes CI (eds) Directions in Mathematical Systems Theory and Optimization, 173–182. Springer, Berlin Heidelberg New York, also at http://arxiv.org/abs/math.oc/155
9. Mortenson RE (1968) J. Optimization Theory and Applications 2:386–394