Extended Kalman Filter Tutorial

Gabriel A. Terejanu
Department of Computer Science and Engineering
University at Buffalo, Buffalo, NY 14260
terejanu@buffalo.edu

1 Dynamic process

Consider the following nonlinear system, described by the difference equation and the observation model with additive noise:

    x_k = f(x_{k-1}) + w_{k-1}                                        (1)
    z_k = h(x_k) + v_k                                                (2)

The initial state x_0 is a random vector with known mean µ_0 = E[x_0] and covariance P_0 = E[(x_0 - µ_0)(x_0 - µ_0)^T]. In the following we assume that the random vector w_k captures uncertainties in the model and v_k denotes the measurement noise. Both are temporally uncorrelated (white noise), zero-mean random sequences with known covariances, and both are uncorrelated with the initial state x_0:

    E[w_k] = 0,   E[w_k w_k^T] = Q_k,   E[w_k w_j^T] = 0 for k ≠ j,   E[w_k x_0^T] = 0 for all k    (3)
    E[v_k] = 0,   E[v_k v_k^T] = R_k,   E[v_k v_j^T] = 0 for k ≠ j,   E[v_k x_0^T] = 0 for all k    (4)

The two random vectors w_k and v_k are also uncorrelated:

    E[w_k v_j^T] = 0  for all k and j                                 (5)

The vector functions f(.) and h(.) are assumed to be C^1 functions (the function and its first derivative are continuous on the given domain).

Dimensions and descriptions of the variables:

    x_k     n x 1   State vector
    w_k     n x 1   Process noise vector
    z_k     m x 1   Observation vector
    v_k     m x 1   Measurement noise vector
    f(.)    n x 1   Process nonlinear vector function
    h(.)    m x 1   Observation nonlinear vector function
    Q_k     n x n   Process noise covariance matrix
    R_k     m x m   Measurement noise covariance matrix
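As a concrete illustration of the model (1)-(2), the sketch below simulates one step of a hypothetical two-state system. The specific choices of f, h, Q, and R (a pendulum-like process observed through the sine of its angle) are assumptions made for illustration only, not part of the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nonlinear process model f: R^2 -> R^2 (illustrative choice)
def f(x, dt=0.1):
    # pendulum-like dynamics: state is (angle, angular rate)
    return np.array([x[0] + dt * x[1],
                     x[1] - dt * np.sin(x[0])])

# Hypothetical observation model h: R^2 -> R^1 (observe only the angle, nonlinearly)
def h(x):
    return np.array([np.sin(x[0])])

Q = 1e-4 * np.eye(2)   # process noise covariance, as in eq. (3)
R = 1e-2 * np.eye(1)   # measurement noise covariance, as in eq. (4)

x_prev = np.array([0.5, 0.0])                         # x_{k-1}
w = rng.multivariate_normal(np.zeros(2), Q)           # w_{k-1}
x_next = f(x_prev) + w                                # eq. (1)
v = rng.multivariate_normal(np.zeros(1), R)           # v_k
z = h(x_next) + v                                     # eq. (2)
print(x_next.shape, z.shape)                          # state is n x 1 with n=2, observation m x 1 with m=1
```

The shapes of `x_next` and `z` match the dimension table above (n = 2, m = 1).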
2 EKF derivation

Assuming the nonlinearities in the dynamic and the observation model are smooth, we can expand f(x_{k-1}) and h(x_k) in Taylor series and in this way approximate the forecast and the next estimate of x_k.

Model Forecast Step

Initially, since the only available information is the mean µ_0 and the covariance P_0 of the initial state, the initial optimal estimate x^a_0 and its error covariance are:

    x^a_0 = µ_0 = E[x_0]                                              (6)
    P_0 = E[(x_0 - x^a_0)(x_0 - x^a_0)^T]                             (7)

Assume now that we have an optimal estimate x^a_{k-1} ≡ E[x_{k-1} | Z_{k-1}] with covariance P_{k-1} at time k-1. The predictable part of x_k is given by:

    x^f_k ≡ E[x_k | Z_{k-1}]                                          (8)
          = E[f(x_{k-1}) + w_{k-1} | Z_{k-1}]
          = E[f(x_{k-1}) | Z_{k-1}]

Expanding f(.) in Taylor series about x^a_{k-1} we get:

    f(x_{k-1}) ≈ f(x^a_{k-1}) + J_f(x^a_{k-1})(x_{k-1} - x^a_{k-1}) + H.O.T.    (9)

where J_f is the Jacobian of f(.) and the higher order terms (H.O.T.) are considered negligible. Hence, the Extended Kalman Filter is also called the First-Order Filter. The Jacobian is defined as:

    J_f ≡ [ ∂f_1/∂x_1  ∂f_1/∂x_2  ...  ∂f_1/∂x_n ]
          [    ...        ...     ...     ...    ]                    (10)
          [ ∂f_n/∂x_1  ∂f_n/∂x_2  ...  ∂f_n/∂x_n ]

where f(x) = (f_1(x), f_2(x), ..., f_n(x))^T and x = (x_1, x_2, ..., x_n)^T. Equation (9) becomes:

    f(x_{k-1}) ≈ f(x^a_{k-1}) + J_f(x^a_{k-1}) e_{k-1}                (11)

where e_{k-1} ≡ x_{k-1} - x^a_{k-1}. The expected value of f(x_{k-1}) conditioned on Z_{k-1} is:

    E[f(x_{k-1}) | Z_{k-1}] ≈ f(x^a_{k-1}) + J_f(x^a_{k-1}) E[e_{k-1} | Z_{k-1}]    (12)

where E[e_{k-1} | Z_{k-1}] = 0. Thus the forecast value of x_k is:

    x^f_k ≈ f(x^a_{k-1})                                              (13)

Substituting (11) in the forecast error equation results in:

    e^f_k ≡ x_k - x^f_k                                               (14)
          = f(x_{k-1}) + w_{k-1} - f(x^a_{k-1})
          ≈ J_f(x^a_{k-1}) e_{k-1} + w_{k-1}
The forecast error covariance is given by:

    P^f_k ≡ E[e^f_k (e^f_k)^T]                                        (15)
          = J_f(x^a_{k-1}) E[e_{k-1} e_{k-1}^T] J_f^T(x^a_{k-1}) + E[w_{k-1} w_{k-1}^T]
          = J_f(x^a_{k-1}) P_{k-1} J_f^T(x^a_{k-1}) + Q_{k-1}

Data Assimilation Step

At time k we have two pieces of information: the forecast value x^f_k with covariance P^f_k, and the measurement z_k with covariance R_k. Our goal is to approximate the best unbiased estimate x^a_k of x_k, in the least squares sense. One way is to assume the estimate is a linear combination of both x^f_k and z_k [4]. Let:

    x^a_k = a + K_k z_k                                               (16)

From the unbiasedness condition:

    0 = E[x_k - x^a_k | Z_k]                                          (17)
      = E[(x^f_k + e^f_k) - (a + K_k h(x_k) + K_k v_k) | Z_k]
      = x^f_k - a - K_k E[h(x_k) | Z_k]

    a = x^f_k - K_k E[h(x_k) | Z_k]                                   (18)

Substituting (18) in (16):

    x^a_k = x^f_k + K_k (z_k - E[h(x_k) | Z_k])                       (19)

Following the same steps as in the model forecast step, expanding h(.) in Taylor series about x^f_k we have:

    h(x_k) ≈ h(x^f_k) + J_h(x^f_k)(x_k - x^f_k) + H.O.T.              (20)

where J_h is the Jacobian of h(.) and the higher order terms (H.O.T.) are considered negligible. The Jacobian of h(.) is defined as:

    J_h ≡ [ ∂h_1/∂x_1  ∂h_1/∂x_2  ...  ∂h_1/∂x_n ]
          [    ...        ...     ...     ...    ]                    (21)
          [ ∂h_m/∂x_1  ∂h_m/∂x_2  ...  ∂h_m/∂x_n ]

where h(x) = (h_1(x), h_2(x), ..., h_m(x))^T and x = (x_1, x_2, ..., x_n)^T. Taking the expectation on both sides of (20) conditioned on Z_k:

    E[h(x_k) | Z_k] ≈ h(x^f_k) + J_h(x^f_k) E[e^f_k | Z_k]            (22)

where E[e^f_k | Z_k] = 0. Substituting in (19), the state estimate is:

    x^a_k ≈ x^f_k + K_k (z_k - h(x^f_k))                              (23)
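The forecast step, eqs. (13) and (15), can be sketched numerically. The finite-difference Jacobian and the specific process model f and covariances below are illustrative assumptions (the tutorial itself leaves f generic):

```python
import numpy as np

def jacobian(func, x, eps=1e-6):
    """Finite-difference approximation of the Jacobian in eq. (10)."""
    fx = func(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        dx = np.zeros(x.size)
        dx[j] = eps
        J[:, j] = (func(x + dx) - fx) / eps
    return J

# Illustrative process model (an assumption, not from the tutorial)
def f(x, dt=0.1):
    return np.array([x[0] + dt * x[1], x[1] - dt * np.sin(x[0])])

Q = 1e-4 * np.eye(2)            # process noise covariance Q_{k-1}
x_a = np.array([0.5, 0.0])      # analysis estimate x^a_{k-1}
P = 0.1 * np.eye(2)             # analysis covariance P_{k-1}

# Model forecast step
x_f = f(x_a)                    # eq. (13): x^f_k = f(x^a_{k-1})
J_f = jacobian(f, x_a)          # J_f(x^a_{k-1})
P_f = J_f @ P @ J_f.T + Q       # eq. (15): forecast error covariance P^f_k
```

Note that eq. (15) propagates only the linearized dynamics: `P_f` stays symmetric positive definite as long as `P` is and `Q` is positive definite.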
The error in the estimate x^a_k is:

    e_k ≡ x_k - x^a_k                                                 (24)
        = f(x_{k-1}) + w_{k-1} - x^f_k - K_k (z_k - h(x^f_k))
        ≈ f(x_{k-1}) - f(x^a_{k-1}) + w_{k-1} - K_k (h(x_k) - h(x^f_k) + v_k)
        ≈ J_f(x^a_{k-1}) e_{k-1} + w_{k-1} - K_k (J_h(x^f_k) e^f_k + v_k)
        ≈ J_f(x^a_{k-1}) e_{k-1} + w_{k-1} - K_k J_h(x^f_k)(J_f(x^a_{k-1}) e_{k-1} + w_{k-1}) - K_k v_k
        = (I - K_k J_h(x^f_k)) J_f(x^a_{k-1}) e_{k-1} + (I - K_k J_h(x^f_k)) w_{k-1} - K_k v_k

Then the posterior covariance of the new estimate is:

    P_k ≡ E[e_k e_k^T]                                                (25)
        = (I - K_k J_h(x^f_k)) J_f(x^a_{k-1}) P_{k-1} J_f^T(x^a_{k-1}) (I - K_k J_h(x^f_k))^T
          + (I - K_k J_h(x^f_k)) Q_{k-1} (I - K_k J_h(x^f_k))^T + K_k R_k K_k^T
        = (I - K_k J_h(x^f_k)) P^f_k (I - K_k J_h(x^f_k))^T + K_k R_k K_k^T
        = P^f_k - K_k J_h(x^f_k) P^f_k - P^f_k J_h^T(x^f_k) K_k^T
          + K_k J_h(x^f_k) P^f_k J_h^T(x^f_k) K_k^T + K_k R_k K_k^T

The posterior covariance formula holds for any K_k. As in the standard Kalman Filter, we find K_k by minimizing tr(P_k) with respect to K_k:

    0 = ∂tr(P_k)/∂K_k                                                 (26)
      = -(J_h(x^f_k) P^f_k)^T - P^f_k J_h^T(x^f_k) + 2 K_k J_h(x^f_k) P^f_k J_h^T(x^f_k) + 2 K_k R_k

Hence the Kalman gain is:

    K_k = P^f_k J_h^T(x^f_k) (J_h(x^f_k) P^f_k J_h^T(x^f_k) + R_k)^{-1}    (27)

Substituting this back in (25) results in:

    P_k = (I - K_k J_h(x^f_k)) P^f_k - (I - K_k J_h(x^f_k)) P^f_k J_h^T(x^f_k) K_k^T + K_k R_k K_k^T    (28)
        = (I - K_k J_h(x^f_k)) P^f_k - (P^f_k J_h^T(x^f_k) - K_k J_h(x^f_k) P^f_k J_h^T(x^f_k) - K_k R_k) K_k^T
        = (I - K_k J_h(x^f_k)) P^f_k - [P^f_k J_h^T(x^f_k) - K_k (J_h(x^f_k) P^f_k J_h^T(x^f_k) + R_k)] K_k^T
        = (I - K_k J_h(x^f_k)) P^f_k - [P^f_k J_h^T(x^f_k) - P^f_k J_h^T(x^f_k)] K_k^T
        = (I - K_k J_h(x^f_k)) P^f_k

3 Summary of Extended Kalman Filter

Model and Observation:

    x_k = f(x_{k-1}) + w_{k-1}
    z_k = h(x_k) + v_k
Initialization:

    x^a_0 = µ_0  with error covariance P_0

Model Forecast Step / Predictor:

    x^f_k ≈ f(x^a_{k-1})
    P^f_k = J_f(x^a_{k-1}) P_{k-1} J_f^T(x^a_{k-1}) + Q_{k-1}

Data Assimilation Step / Corrector:

    x^a_k ≈ x^f_k + K_k (z_k - h(x^f_k))
    K_k = P^f_k J_h^T(x^f_k) (J_h(x^f_k) P^f_k J_h^T(x^f_k) + R_k)^{-1}
    P_k = (I - K_k J_h(x^f_k)) P^f_k

4 Iterated Extended Kalman Filter

In the EKF, h(.) is linearized about the predicted state estimate x^f_k. The IEKF instead linearizes it about the most recent estimate, thereby improving the accuracy [3, 1]. This is achieved by recalculating x^a_k, K_k, and P_k at each iteration. Denote by x^a_{k,i} the estimate at time k and iteration i. The iteration process is initialized with x^a_{k,0} = x^f_k. The measurement update step then becomes, for each i:

    x^a_{k,i+1} ≈ x^f_k + K_{k,i} (z_k - h(x^a_{k,i}))
    K_{k,i} = P^f_k J_h^T(x^a_{k,i}) (J_h(x^a_{k,i}) P^f_k J_h^T(x^a_{k,i}) + R_k)^{-1}
    P_{k,i} = (I - K_{k,i} J_h(x^a_{k,i})) P^f_k

If there is little improvement between two consecutive iterations, the iterative process is stopped. The improved accuracy comes at the cost of higher computational time.

5 Stability

Since Q_k and R_k are symmetric positive definite matrices, we can write:

    Q_k = G_k G_k^T                                                   (29)
    R_k = D_k D_k^T                                                   (30)

Denote by ϕ and χ the higher order terms resulting from the following subtractions:

    f(x_k) - f(x^a_k) = J_f(x^a_k) e_k + ϕ(x_k, x^a_k)                (31)
    h(x_k) - h(x^a_k) = J_h(x^a_k) e_k + χ(x_k, x^a_k)                (32)

Konrad Reif showed in [2] that the estimation error remains bounded if the following conditions hold:
1. There are positive real numbers α, β, γ_1, γ_2 > 0 such that for every k:

       ||J_f(x^a_k)|| ≤ α                                             (33)
       ||J_h(x^a_k)|| ≤ β                                             (34)
       γ_1 I ≤ P_k ≤ γ_2 I                                            (35)

2. J_f is nonsingular for every k.

3. There are positive real numbers ε_ϕ, ε_χ, κ_ϕ, κ_χ > 0 such that the nonlinear functions ϕ, χ are bounded via:

       ||ϕ(x_k, x^a_k)|| ≤ ε_ϕ ||x_k - x^a_k||^2   for ||x_k - x^a_k|| ≤ κ_ϕ    (36)
       ||χ(x_k, x^a_k)|| ≤ ε_χ ||x_k - x^a_k||^2   for ||x_k - x^a_k|| ≤ κ_χ    (37)

Then the estimation error e_k is exponentially bounded in mean square and bounded with probability one, provided that the initial estimation error satisfies:

    ||e_0|| ≤ ε                                                       (38)

and the covariance matrices of the noise terms are bounded via:

    G_k G_k^T ≥ δ I                                                   (39)
    D_k D_k^T ≥ δ I                                                   (40)

for some ε, δ > 0.

6 Conclusion

In the EKF the state distribution is propagated analytically through a first-order linearization of the nonlinear system. It does not take into account that x_k is a random variable with inherent uncertainty, and it requires that the first two terms of the Taylor series dominate the remaining terms. A second-order version exists [4, 5], but the computational complexity it requires makes it unfeasible for practical usage in real-time applications or high-dimensional systems.

References

[1] Arthur Gelb. Applied Optimal Estimation. M.I.T. Press, 1974.
[2] K. Reif, S. Gunther, E. Yaz, and R. Unbehauen. Stochastic Stability of the Discrete-Time Extended Kalman Filter. IEEE Trans. Automatic Control, 1999.
[3] Tine Lefebvre and Herman Bruyninckx. Kalman Filters for Nonlinear Systems: A Comparison of Performance.
[4] John M. Lewis and S. Lakshmivarahan. Dynamic Data Assimilation: A Least Squares Approach. 2006.
[5] R. van der Merwe. Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models. Technical report, 2003.
Figure 1: The block diagram for the Extended Kalman Filter
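As a closing illustration, the predictor-corrector cycle summarized in Section 3 can be sketched end to end. The test system (f, h, Q, R) and the finite-difference Jacobians below are assumptions made for the demo, standing in for the generic models of eqs. (1)-(2):

```python
import numpy as np

def jacobian(func, x, eps=1e-6):
    """Finite-difference Jacobian, standing in for eqs. (10) and (21)."""
    fx = np.atleast_1d(func(x))
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        dx = np.zeros(x.size); dx[j] = eps
        J[:, j] = (np.atleast_1d(func(x + dx)) - fx) / eps
    return J

def ekf_step(x_a, P, z, f, h, Q, R):
    """One predictor-corrector cycle from the Section 3 summary."""
    # Model Forecast Step / Predictor
    x_f = f(x_a)                               # x^f_k = f(x^a_{k-1}), eq. (13)
    Jf = jacobian(f, x_a)
    P_f = Jf @ P @ Jf.T + Q                    # P^f_k, eq. (15)
    # Data Assimilation Step / Corrector
    Jh = jacobian(h, x_f)
    S = Jh @ P_f @ Jh.T + R                    # innovation covariance
    K = P_f @ Jh.T @ np.linalg.inv(S)          # Kalman gain, eq. (27)
    x_a_new = x_f + K @ (np.atleast_1d(z) - np.atleast_1d(h(x_f)))  # eq. (23)
    P_new = (np.eye(P.shape[0]) - K @ Jh) @ P_f                     # eq. (28)
    return x_a_new, P_new

# Illustrative pendulum-like system (an assumption for the demo)
dt = 0.1
f = lambda x: np.array([x[0] + dt * x[1], x[1] - dt * np.sin(x[0])])
h = lambda x: np.array([np.sin(x[0])])
Q, R = 1e-4 * np.eye(2), 1e-2 * np.eye(1)

rng = np.random.default_rng(1)
x_true = np.array([0.8, 0.0])                  # true state
x_a, P = np.array([0.0, 0.0]), np.eye(2)       # initialization, eq. (6)-(7)
for _ in range(50):
    x_true = f(x_true) + rng.multivariate_normal(np.zeros(2), Q)    # eq. (1)
    z = h(x_true) + rng.multivariate_normal(np.zeros(1), R)         # eq. (2)
    x_a, P = ekf_step(x_a, P, z, f, h, Q, R)

print(np.abs(x_a - x_true))                    # estimation error after 50 steps
```

Despite the deliberately wrong initial estimate, the filter should track the true angle to within the measurement noise level after a few dozen steps, consistent with the boundedness conditions of Section 5 (small noise, bounded Jacobians).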