Kalman Filter

Man-Wai MAK
Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University
enmwmak@polyu.edu.hk
http://www.eie.polyu.edu.hk/~mwmak

References:
- S. Gannot and A. Yeredor, "The Kalman Filter," in Springer Handbook of Speech Processing, J. Benesty, M. M. Sondhi, and Y. Huang (Eds.), Springer, 2008.
- R. Faragher, "Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation," IEEE Signal Processing Magazine, Sept. 2012.
- https://www.youtube.com/watch?v=mwn8xhgnpfy
- https://www.youtube.com/watch?v=4oerjmppkrg
- https://www.youtube.com/watch?v=ul3u2ylpwu0
- http://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures
- https://www.youtube.com/watch?v=caccowjpytq

Man-Wai MAK (EIE) Kalman Filter April 20, 2017 1 / 24
Overview
1. Fundamentals of the Kalman Filter
   - What is the Kalman Filter
   - MMSE Linear Optimal Estimator
2. The Kalman Filter and Its Applications
   - Kalman Filter Formulations
   - 1-Dimensional Example
   - Kalman Filter's Output and Optimal Linear Estimate
What is the Kalman Filter
Named after Rudolf E. Kalman, the Kalman filter is one of the most important and most widely used data fusion algorithms in use today. The most famous early use of the Kalman filter was in the Apollo navigation computer that took Neil Armstrong to the Moon.
The Kalman filter has found numerous applications in fields related to the control of dynamical systems, e.g., predicting the trajectories of celestial bodies, missiles, and microscopic particles.
Kalman filters are at work in every satellite navigation device, every smartphone, in UAVs, and in many computer games.
The Kalman filter is a Bayesian model similar to a hidden Markov model, but one in which the state space of the latent variables is continuous and all latent and observed variables have Gaussian distributions.
Areas of Applications
The Kalman filter is suitable for applications where
- the variables of interest can only be measured indirectly;
- measurements are available from various sensors but might be subject to noise;
- you have uncertain information about a dynamic system, but you can make an educated guess about what the system is going to do next.
https://www.youtube.com/watch?v=mwn8xhgnpfy
MMSE Linear Optimal Estimator
Let x and y be two random vectors with an arbitrary distribution having finite moments up to second order. Denote $z = [x^T\ y^T]^T$. The mean and covariance of z are given by
$$\eta_z = E\{z\} = \begin{bmatrix} \eta_x \\ \eta_y \end{bmatrix}, \qquad C_{zz} = E\{(z-\eta_z)(z-\eta_z)^T\} = \begin{bmatrix} C_{xx} & C_{xy} \\ C_{xy}^T & C_{yy} \end{bmatrix}$$
We estimate x from the observation y using
$$\hat{x} = Ay + b \tag{1}$$
where $\epsilon = \hat{x} - x$ is the estimation error. It is desired to find a linear estimator that attains the smallest mean square error (MSE) matrix:
$$P = E\{\epsilon\epsilon^T\}$$
MMSE Linear Optimal Estimator
To find the optimal A and b, we rewrite $\epsilon$ as
$$\epsilon = Ay + b - x = A(y - \eta_y) - (x - \eta_x) + c, \qquad c = b + A\eta_y - \eta_x$$
Therefore,
$$\begin{aligned}
P &= E\left\{\left[A(y-\eta_y) - (x-\eta_x) + c\right]\left[A(y-\eta_y) - (x-\eta_x) + c\right]^T\right\} \\
&= A E\{(y-\eta_y)(y-\eta_y)^T\} A^T - A E\{(y-\eta_y)(x-\eta_x)^T\} - E\{(x-\eta_x)(y-\eta_y)^T\} A^T + E\{(x-\eta_x)(x-\eta_x)^T\} \\
&\quad + cc^T - E\{x-\eta_x\} c^T - c E\{(x-\eta_x)^T\} + A E\{y-\eta_y\} c^T + c E\{(y-\eta_y)^T\} A^T \\
&= A C_{yy} A^T - A C_{yx} - C_{xy} A^T + C_{xx} + cc^T
\end{aligned} \tag{2}$$
where the terms involving $E\{x-\eta_x\}$ and $E\{y-\eta_y\}$ vanish because these expectations are zero.
MMSE Linear Optimal Estimator
Note that in Eq. 2 only c depends on b, and that $a^T cc^T a \ge 0$ for every non-zero vector a. Therefore, P is minimal when $c = 0$. This suggests
$$b = \eta_x - A\eta_y \tag{3}$$
Substituting Eq. 3 into Eq. 2, we have
$$\begin{aligned}
P &= A C_{yy} A^T - A C_{yx} - C_{xy} A^T + C_{xx} \\
&= (A - C_{xy} C_{yy}^{-1}) C_{yy} (A - C_{xy} C_{yy}^{-1})^T + C_{xx} - C_{xy} C_{yy}^{-1} C_{xy}^T
\end{aligned}$$
Therefore, P is minimal when $A = C_{xy} C_{yy}^{-1}$, which gives
$$P_{\min} = C_{xx} - C_{xy} C_{yy}^{-1} C_{xy}^T$$
The optimal linear estimator is
$$\hat{x}_{\text{opt}} = Ay + b = Ay + \eta_x - A\eta_y = \eta_x + C_{xy} C_{yy}^{-1}(y - \eta_y)$$
MMSE Linear Optimal Estimator
In summary, when the linear estimator $\hat{x}$ is optimal, we have
$$E\{\epsilon\} = c = 0$$
$$\begin{aligned}
E\{\epsilon y^T\} &= E\left\{\left[A(y-\eta_y) - (x-\eta_x) + c\right] y^T\right\} \\
&= E\left\{\left[A(y-\eta_y) - (x-\eta_x) + c\right](y-\eta_y)^T\right\} + E\left\{\left[A(y-\eta_y) - (x-\eta_x) + c\right]\right\}\eta_y^T \\
&= A C_{yy} - C_{xy} + c\,\eta_y^T = 0
\end{aligned}$$
$$\iff A = C_{xy} C_{yy}^{-1} \ \text{and}\ c = 0.$$
The condition that the error $\epsilon$ and the observation y are uncorrelated agrees with the orthogonality principle. The orthogonality principle will be used in deriving the formulas of the Kalman filter.
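The closed-form estimator above can be checked numerically. The sketch below assumes a hypothetical jointly Gaussian pair (x, y): it forms the optimal linear estimate from the partitioned mean and covariance and verifies that the resulting error is zero-mean and uncorrelated with y, as the orthogonality principle requires.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint distribution for z = [x; y] (x is 2-D, y is 3-D).
n_x, n_y = 2, 3
L = rng.standard_normal((n_x + n_y, n_x + n_y))
C_zz = L @ L.T + np.eye(n_x + n_y)        # a valid (positive-definite) covariance
eta_z = rng.standard_normal(n_x + n_y)

# Draw samples of z and split them into x and y.
z = rng.multivariate_normal(eta_z, C_zz, size=1_000_000)
x, y = z[:, :n_x], z[:, n_x:]

# Partition the mean and covariance as in the slides.
eta_x, eta_y = eta_z[:n_x], eta_z[n_x:]
C_xy = C_zz[:n_x, n_x:]
C_yy = C_zz[n_x:, n_x:]

# Optimal linear estimator: x_hat = eta_x + C_xy C_yy^{-1} (y - eta_y).
A = C_xy @ np.linalg.inv(C_yy)
x_hat = eta_x + (y - eta_y) @ A.T

# Orthogonality: the estimation error is uncorrelated with the observation.
err = x_hat - x
cross = (err.T @ (y - eta_y)) / len(y)
print(np.abs(err.mean(axis=0)).max())     # small (sampling noise only)
print(np.abs(cross).max())                # small (sampling noise only)
```

Both printed values shrink toward zero as the sample size grows, confirming $E\{\epsilon\} = 0$ and $E\{\epsilon y^T\} = 0$ for the optimal A and b.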
Kalman Filter
For every new y, we would need to recompute the optimal A and b, which involves the computation of covariance matrices and their inverses. The computation becomes very demanding when the number of observations is very large, and the required storage space grows as more measurements become available.
It is computationally more appealing to estimate the optimal A and b recursively whenever a new observation is received, without storing the past observations. Such a recursive (linear, optimal) estimation scheme is known as Kalman filtering.
Kalman Filter Formulations
The Kalman filter assumes that the state of a system at time t evolves from the prior state at time t-1 according to the equation
$$x_t = F_t x_{t-1} + B_t u_t + w_t \tag{4}$$
where
- $x_t$ is the state vector containing the terms of interest for the system at time t, e.g., position and velocity;
- $u_t$ is the vector containing any control inputs, e.g., steering angle, throttle setting, and braking force;
- $F_t$ is the state transition matrix, which applies the effect of each system state parameter at time t-1 on the system state at time t, e.g., the position and velocity at time t-1 both affect the position at time t;
- $B_t$ is the control input matrix, which applies the effect of each control input parameter in the vector $u_t$ on the state vector, e.g., the effect of the throttle setting on the system velocity and position;
- $w_t$ is the vector containing the process noise, with covariance matrix $Q_t$, for each parameter in the state vector.
Kalman Filter Formulations
In Eq. 4, $x_{t-1}$ is analogous to y in Eq. 1, $x_t$ is analogous to $\hat{x}$, and $w_t$ is analogous to $\epsilon$. The system is measured according to the model
$$z_t = H_t x_t + v_t \tag{5}$$
where
- $z_t$ is the vector of measurements;
- $H_t$ is the transformation matrix that maps the state vector parameters into the measurement domain;
- $v_t$ is a vector containing the measurement noise, with covariance matrix $R_t$.
The true state $x_t$ cannot be directly observed; the Kalman filter provides an algorithm to determine an estimate $\hat{x}_t$ by combining models of the system and noisy measurements of certain parameters or linear functions of parameters. The covariance of the estimation error at time t is denoted $P_t$.
Kalman Filter Formulations
The Kalman filter algorithm involves two stages: prediction and measurement update.
Prediction stage (estimate $\hat{x}_{t|t-1}$ from previous measurements up to $z_{t-1}$):
$$\hat{x}_{t|t-1} = F_t \hat{x}_{t-1|t-1} + B_t u_t \tag{6}$$
$$P_{t|t-1} = F_t P_{t-1|t-1} F_t^T + Q_t \tag{7}$$
Eq. 6 means that the new best estimate is a prediction made from the previous best estimate, plus a correction for known external influences.
Eq. 7 means that the new uncertainty is predicted from the old uncertainty, with some additional uncertainty from the environment.
Kalman Filter Formulations
Measurement update (obtain $\hat{x}_{t|t}$ from the current measurement $z_t$):
$$\hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t (z_t - H_t \hat{x}_{t|t-1}) \tag{8}$$
$$P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1} \tag{9}$$
$$K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1} \tag{10}$$
In Eq. 8, if the measurement $z_t$ totally agrees with the prediction, then the current prediction is good enough and no update is necessary.
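Eqs. 6-10 translate directly into code. The sketch below implements one predict-update cycle for a two-dimensional (position, velocity) state with a constant-velocity model; all numerical values (time step, noise covariances, initial state, measurement) are hypothetical, chosen only for illustration.

```python
import numpy as np

dt = 1.0                                   # hypothetical time step (s)
F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity transition matrix
B = np.array([[0.5 * dt**2], [dt]])        # control input acts like an acceleration
H = np.array([[1.0, 0.0]])                 # we measure position only
Q = 0.01 * np.eye(2)                       # process-noise covariance (assumed)
R = np.array([[0.25]])                     # measurement-noise covariance (assumed)

def predict(x, P, u):
    """Eq. 6 and Eq. 7: propagate the state estimate and its uncertainty."""
    x = F @ x + B @ u
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z):
    """Eqs. 8-10: correct the prediction with the measurement z."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain (Eq. 10)
    x = x + K @ (z - H @ x)                        # Eq. 8
    P = P - K @ H @ P                              # Eq. 9
    return x, P

# One predict/update cycle from an initial estimate.
x = np.array([[0.0], [1.0]])               # position 0 m, velocity 1 m/s
P = np.eye(2)                              # initial uncertainty
x, P = predict(x, P, u=np.array([[0.0]]))
x, P = update(x, P, z=np.array([[1.2]]))
print(x.ravel(), np.diag(P))
```

Note that the position variance in P shrinks after the measurement update, reflecting the information gained from $z_t$.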
Kalman Filter Formulations
To show Eq. 7, we subtract Eq. 6 from Eq. 4 and compute the covariance of the prediction error:
$$\begin{aligned}
P_{t|t-1} &= E\{(x_t - \hat{x}_{t|t-1})(x_t - \hat{x}_{t|t-1})^T\} \\
&= E\left\{(F_t x_{t-1} - F_t \hat{x}_{t-1|t-1} + w_t)(F_t x_{t-1} - F_t \hat{x}_{t-1|t-1} + w_t)^T\right\} \\
&= F_t E\left\{(x_{t-1} - \hat{x}_{t-1|t-1})(x_{t-1} - \hat{x}_{t-1|t-1})^T\right\} F_t^T \\
&\quad + F_t E\{(x_{t-1} - \hat{x}_{t-1|t-1}) w_t^T\} + E\{w_t (x_{t-1} - \hat{x}_{t-1|t-1})^T\} F_t^T + E\{w_t w_t^T\} \\
&= F_t P_{t-1|t-1} F_t^T + Q_t
\end{aligned}$$
where we have used the property that the estimation error and the process noise are uncorrelated, so the cross terms vanish.
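The derivation above can be spot-checked with a Monte Carlo sketch (all matrices below are hypothetical): drawing independent estimation errors and process-noise samples, the sample covariance of $F_t e + w_t$ should match $F_t P_{t-1|t-1} F_t^T + Q_t$ up to sampling noise.

```python
import numpy as np

rng = np.random.default_rng(1)

F = np.array([[1.0, 0.5], [0.0, 1.0]])    # hypothetical transition matrix
P = np.array([[0.4, 0.1], [0.1, 0.3]])    # covariance of the previous estimation error
Q = np.array([[0.05, 0.0], [0.0, 0.02]])  # process-noise covariance

n = 500_000
e = rng.multivariate_normal(np.zeros(2), P, size=n)   # x_{t-1} - xhat_{t-1|t-1}
w = rng.multivariate_normal(np.zeros(2), Q, size=n)   # process noise, independent of e

pred_err = e @ F.T + w                    # x_t - xhat_{t|t-1} = F e + w
P_pred = np.cov(pred_err.T)

print(P_pred)
print(F @ P @ F.T + Q)                    # the two agree up to sampling noise
```

Because e and w are drawn independently, the cross terms average out to zero, exactly as in the derivation.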
Kalman Filter Formulations
If $w_t$ in Eq. 4 and $v_t$ in Eq. 5 follow Gaussian distributions, we may write them as
$$p(x_t \,|\, x_{t-1}) = \mathcal{N}(x_t \,|\, F_t x_{t-1} + B_t u_t,\ Q_t)$$
$$p(z_t \,|\, x_t) = \mathcal{N}(z_t \,|\, H_t x_t,\ R_t)$$
The initial state also follows a Gaussian distribution:
$$p(x_1) = \mathcal{N}(x_1 \,|\, B_1 u_1,\ Q_1)$$
1-Dimensional Example
In the derivation that follows, we consider a simple one-dimensional tracking problem, specifically that of a train moving along a railway line (Figure 1 of Faragher, 2012, shows the one-dimensional system under consideration, with a noisy measurement and a predicted estimate of the train's position).
The state vector $x_t$ contains the position and velocity of the train. With time step $\delta t$, example matrices for this problem are
$$F_t = \begin{bmatrix} 1 & \delta t \\ 0 & 1 \end{bmatrix}, \qquad B_t = \begin{bmatrix} \delta t^2/2 \\ \delta t \end{bmatrix}$$
The train driver may apply a braking or accelerating input $u_t$ to the system; $w_t$ is zero-mean Gaussian process noise, and $v_t$ is zero-mean Gaussian white measurement noise with covariance $R_t$.
Because the unknown true state cannot be directly observed, the estimates of the parameters of interest in the state vector are described by probability density functions (pdfs) rather than discrete values. The Kalman filter is based on Gaussian pdfs, as will become clear below.
1-Dimensional Example
We have the following variables and parameters:
- $x_t$: the unknown (hidden-state) distance from the pole to the train's position at time t;
- $z_t$: a noisy measurement of the distance between the pole and the train's RF antenna at time t, obtained using time-of-flight techniques;
- $H$: a proportionality factor that makes the time-of-flight measurement compatible with the distance $x_t$;
- $u_t$: the amount of throttle or brake applied by the train's operator at time t;
- $B$: a proportionality factor that converts $u_t$ into distance;
- $w_t$: process noise on $x_t$;
- $v_t$: measurement noise.
1-Dimensional Example
At t = 0, we have an initial estimate (with error) of the train's position. The initial knowledge of the system is represented by a Gaussian pdf expressing our initial confidence in the estimate of the train's position, together with the known initial velocity of the train (Figure 2 of Faragher, 2012).
At each time step t, we combine two information sources:
- Predictions ($\hat{x}_{t-1|t-1}$) based on the last known position and velocity of the train;
- Measurements ($z_t$) from a radio ranging system deployed at the track side.
The predictions and measurements are combined to provide the best possible estimate of the location of the train.
1-Dimensional Example
At t = 0, the best estimate of the location of the train (or, more precisely, of the radio antenna mounted on the train roof) is given by a Gaussian pdf.
At t = 1, the new train position is predicted from the last known position and velocity (Eq. 6), and the prediction is represented by a new Gaussian pdf with a larger variance (Eq. 7). Our confidence in the position estimate has decreased because we do not know whether the train underwent any accelerations or decelerations in the intervening period from t = 0 to t = 1 (Figure 3 of Faragher, 2012).
At t = 1, we also have a measurement $z_t$ of the train position. The measurement error is represented by another Gaussian pdf (Figure 4 of Faragher, 2012).
1-Dimensional Example
The best estimate can be obtained by combining our knowledge from the prediction and the measurement. This is achieved by multiplying the two Gaussian pdfs together, resulting in a new (green) Gaussian pdf that provides the best estimate of the location of the train by fusing the data from the prediction and the measurement (Figure 5 of Faragher, 2012).
Note that the variance of the combined Gaussian is smaller than that of the pdf representing the predicted position; the fused estimate has smaller variation.
1-Dimensional Example
To simplify notation, we omit the subscript t and write the pdf of the measurement (the blue Gaussian) as
$$\mathcal{N}(z; \mu_z, \sigma_z) = \frac{1}{\sqrt{2\pi}\,\sigma_z} \exp\left\{-\frac{(z-\mu_z)^2}{2\sigma_z^2}\right\}$$
Because the predicted position is in metres and the measurement (time-of-flight of a radio signal) is in seconds, we need to make the two units compatible. This can be done by converting the prediction x into the measurement domain, $r = x/c$, i.e., setting $H = 1/c$, where c is the speed of light.
The pdf of the prediction becomes
$$\mathcal{N}\!\left(r; \frac{\mu_x}{c}, \frac{\sigma_x}{c}\right) = \frac{1}{\sqrt{2\pi}\,\sigma_x/c} \exp\left\{-\frac{(r-\mu_x/c)^2}{2(\sigma_x/c)^2}\right\}$$
1-Dimensional Example
Note that both random variables r and z now have the same unit (seconds). Their pdfs can be combined by multiplication, written in terms of one random variable s; the product is proportional to a new Gaussian:
$$\frac{1}{\sqrt{2\pi}\,\sigma_x/c} \exp\left\{-\frac{(s-\mu_x/c)^2}{2(\sigma_x/c)^2}\right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_z} \exp\left\{-\frac{(s-\mu_z)^2}{2\sigma_z^2}\right\} \ \propto\ \exp\left\{-\frac{(s-\mu_f/c)^2}{2(\sigma_f/c)^2}\right\}$$
where, expressed back in the distance domain,
$$\mu_f = \mu_x + \left(\frac{\sigma_x^2/c}{\sigma_x^2/c^2 + \sigma_z^2}\right)\left(\mu_z - \frac{\mu_x}{c}\right), \qquad \sigma_f^2 = \sigma_x^2 - \left(\frac{\sigma_x^2/c}{\sigma_x^2/c^2 + \sigma_z^2}\right)\frac{\sigma_x^2}{c} \tag{11}$$
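The fusion formulas in Eq. 11 can be verified numerically: multiply the two pdfs on a fine grid over s, normalise the product, and compare its mean and variance (converted back to the distance domain) with $\mu_f$ and $\sigma_f^2$. All numerical values below are hypothetical.

```python
import numpy as np

# Hypothetical numbers: prediction in metres, measurement in seconds.
c = 3.0e8                              # speed of light (m/s)
mu_x, sigma_x = 1000.0, 20.0           # predicted position and its std (m)
mu_z, sigma_z = 1002.0 / c, 5.0 / c    # time-of-flight measurement and its std (s)

# Closed-form fusion (Eq. 11), expressed in the distance domain.
w = (sigma_x**2 / c) / (sigma_x**2 / c**2 + sigma_z**2)
mu_f = mu_x + w * (mu_z - mu_x / c)
var_f = sigma_x**2 - w * sigma_x**2 / c

# Numerical check: multiply the two pdfs on a fine grid over s (seconds).
s = np.linspace((mu_x - 200.0) / c, (mu_x + 200.0) / c, 200_001)
ds = s[1] - s[0]
pdf_pred = np.exp(-(s - mu_x / c)**2 / (2 * (sigma_x / c)**2))
pdf_meas = np.exp(-(s - mu_z)**2 / (2 * sigma_z**2))
prod = pdf_pred * pdf_meas
prod /= prod.sum() * ds                        # normalise the product

mean_s = (s * prod).sum() * ds                 # fused mean (seconds)
var_s = ((s - mean_s)**2 * prod).sum() * ds    # fused variance (seconds^2)

print(mean_s * c, mu_f)      # agree (metres)
print(var_s * c**2, var_f)   # agree (metres^2)
```

The grid-based mean and variance of the product pdf match the closed-form $\mu_f$ and $\sigma_f^2$, confirming that multiplying two Gaussians yields another Gaussian with the stated parameters.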
1-Dimensional Example
Substituting $H = 1/c$ and $K = \dfrac{H\sigma_x^2}{H^2\sigma_x^2 + \sigma_z^2}$ into Eq. 11, we have
$$\mu_f = \mu_x + K(\mu_z - H\mu_x), \qquad \sigma_f^2 = \sigma_x^2 - KH\sigma_x^2 \tag{12}$$
Now we set
$$\hat{x}_{t|t} \equiv \mu_f \ \text{(posterior mean)}, \qquad \hat{x}_{t|t-1} \equiv \mu_x$$
$$P_{t|t} \equiv \sigma_f^2 \ \text{(posterior variance)}, \qquad P_{t|t-1} \equiv \sigma_x^2$$
$$z_t \equiv \mu_z, \qquad R_t \equiv \sigma_z^2, \qquad H_t \equiv H, \qquad K_t \equiv K$$
1-Dimensional Example
Substituting these variables into Eq. 12, we have
$$\mu_f = \mu_x + K(\mu_z - H\mu_x) \ \longrightarrow\ \hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t(z_t - H_t\hat{x}_{t|t-1})$$
$$\sigma_f^2 = \sigma_x^2 - KH\sigma_x^2 \ \longrightarrow\ P_{t|t} = P_{t|t-1} - K_t H_t P_{t|t-1}$$
$$K = \frac{H\sigma_x^2}{H^2\sigma_x^2 + \sigma_z^2} \ \longrightarrow\ K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1}$$
Note that these are exactly the Kalman filter equations (Eq. 8 to Eq. 10).
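This correspondence can be confirmed with a short sketch (hypothetical numbers, $H = 1/c$): computing the fused mean and variance via Eq. 11 and via the scalar Kalman measurement update with the variable mapping above gives identical results.

```python
# Hypothetical scalar example: prediction in metres, measurement in seconds.
c = 3.0e8                              # speed of light (m/s)
mu_x, sigma_x = 1000.0, 20.0           # predicted position and its std (m)
mu_z, sigma_z = 1002.0 / c, 5.0 / c    # time-of-flight measurement and its std (s)

# Route 1: Gaussian-fusion result (Eq. 11), written with c explicitly.
w = (sigma_x**2 / c) / (sigma_x**2 / c**2 + sigma_z**2)
mu_f = mu_x + w * (mu_z - mu_x / c)
var_f = sigma_x**2 - w * sigma_x**2 / c

# Route 2: Kalman measurement update (Eqs. 8-10) with the slide's mapping.
H, R = 1.0 / c, sigma_z**2             # H_t = 1/c, R_t = sigma_z^2
x_pred, P_pred, z = mu_x, sigma_x**2, mu_z
K = P_pred * H / (H * P_pred * H + R)          # Eq. 10 (scalar form)
x_post = x_pred + K * (z - H * x_pred)         # Eq. 8
P_post = P_pred - K * H * P_pred               # Eq. 9

print(x_post, mu_f)    # identical posterior means (metres)
print(P_post, var_f)   # identical posterior variances (metres^2)
```

Both routes weight the prediction and the measurement by their relative uncertainties; the Kalman gain K is exactly the fusion weight of Eq. 11.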
Kalman Filter's Output and Optimal Linear Estimate
In a Gaussian framework, the Kalman filter's output is the optimal linear estimate:
$$\hat{x}_{t|t} = E\{x_t \,|\, z_0, z_1, \ldots, z_t\} = \mu_{f,t} \ \text{in the 1-D example}$$
The covariance of the estimation error, given the measurements up to time t, is
$$P_{t|t} = E\left\{(\hat{x}_{t|t} - x_t)(\hat{x}_{t|t} - x_t)^T \,\big|\, z_0, z_1, \ldots, z_t\right\} = \sigma_{f,t}^2 \ \text{in the 1-D example}$$