THE KALMAN FILTER RAUL ROJAS Abstract. This paper provides a getle itroductio to the Kalma filter, a umerical method that ca be used for sesor fusio or for calculatio of trajectories. First, we cosider the Kalma filter for a oe-dimesioal system. The mai idea is that the Kalma filter is simply a liear weighted average of two sesor values. The, we show that the geeral case has a similar structure ad that the mathematical formulatio is quite similar. (Der Text ethlt och eiige Tippfehler, der aufmerksame Studet ka diese fide).. A example of data filterig The Kalma filter is widely used i aeroautics ad egieerig for two mai purposes: for combiig measuremets of the same variables but from differet sesors, ad for combiig a iexact forecast of a system s state with a iexact measuremet of the state. The Kalma filter has also applicatios i statistics ad fuctio approximatio. Whe dealig with a time series of data poits x, x,..., x, a forecaster computes the best guess for the poit x +. A smoother looks back at the data, ad computes the best possible x i takig ito accout the poits before ad after x i. A filter provides a correctio for x + takig ito accout all the poits x, x,..., x ad a iexact measuremet of x +. A example of a filter is the followig: Assume that we have a system whose oe-dimesioal state we ca measure at successive steps. The readigs of the measuremets are x, x,..., x. Our task is to compute the average µ of the time series give poits. The solutio is µ = x i. If a ew poit x + is measured, we ca recompute µ, but it is more efficiet to use the old value of µ ad make a small correctio usig x +. The correctio is easy to derive, sice µ + = + + ad so, µ + ca be writte as x i = ( + ) x i + x + (.) µ + = + µ + + x + = µ + K(x + µ) where K = /( + ). K is called the gai factor. The ew average µ + is a weighted average of the old estimate µ ad the ew value x +. We trust µ more tha the sigle value x + ; therefore, the weight of µ is larger tha the weight of x +. Equatio. ca be also read as statig that the average is corrected
RAUL ROJAS usig the differece of x + ad the old value µ. The gai K adjusts how big the correctio will be. We ca also recalculate recursively the quadratic stadard deviatio of the time series (the variace). Give poits, the quadratic stadard deviatio is give by: = (x i µ). If a ew poit x + is measured, the ew variace is + = + (x i µ + ) = + (x i µ K(x + µ ) ) + + where K is the gai factor defied above ad we used Equatio.. The expressio above ca be expaded as follows: [ + = ] (x i µ ) + K (x i µ )(x + µ ) + K (x + µ ) + ( K) (x + µ ) + The secod term iside the brackets is zero, because (x i µ ) = 0. Therefore, the whole expressio reduces to + = + ( + (x + µ) (K + ( K) ). Sice K + ( K) = K the last expressio reduces to + = + ( + K(x + µ) ) = ( K)( + K(x + µ) ). The whole process ca ow be casted ito as a series of steps to be followed iteratively. Give the first poits, ad our calculatio of µ ad, the Whe a ew poit x + is measured, we compute the gai factor K = /( + ). We compute the ew estimatio of the average µ + = µ + K(x + µ) We compute also a provisioal estimate of the ew stadard deviatio = + K(x + µ) Fially, we fid the correct + usig the correctio + = ( K) This kid of iterative computatio is used i calculators for updatig the average ad stadard deviatio of umbers etered sequetially ito the calculator without havig to store all umbers. This kid of iterative procedure shows the geeral flavor of the Kalma filter, which is a kid of recursive least squares estimator for data poits.
THE KALMAN FILTER 3 Figure. Gaussias. The oe-dimesioal Kalma Filter The example above showed how to update a statistical quatity oce more iformatio becomes available. Assume ow that we are dealig with two differet istrumets that provide a readig for some quatity of iterest x. We call x the readig from the first istrumet ad x the readig from the secod istrumet. We kow that the first istrumet has a error modelled by a Gaussia with stadard deviatio. The error of the secod istrumet is also ormally distributed aroud zero with stadard deviatio. We would like to combie both readigs ito a sigle estimatio. If both istrumets are equally good ( = ), we just take the average of both umbers. If the first istrumet is absolutely superior ( << ), we will keep x as our estimate, ad vice versa if the secod istrumet is clearly superior to the first. I ay other case we would like to form a weighted average of both readigs to geerate a estimate of x which we call ˆx. The questio ow is which is the best weighted average. Oe possibility is weightig each readig iversely proportioal to its precisio, that is, or simplifyig ˆx = x + x + ˆx = x + x + This estimatio of ˆx fulfills the boudary coditios metioed above. Note that the above estimate ca be also rewritte as ˆx = x + K(x x ) where ow the gai K = /( + ). The update equatio has the same geeral form as i the example i Sectio. The expressio used above is optimal give our state of kowledge about the measuremets. Sice the error curve from the istrumets is a Gaussia, we ca write the probability of x beig the right measuremet as p (x) = π e (x x) / for istrumet, ad as p (x) = π e (x x) / for istrumet. I the first case, x is the most probable measuremet, i the secod, x, but all poits x have a o-vaishig probability of beig the right measuremet due to the istrumets errors. Sice the two measuremets are idepedet we ca combie them best, by multiplyig their probability distributios ad ormalizig. Multiplyig we obtai: p(x) = p (x)p (x) = Ce (x x) / (x x) /
4 RAUL ROJAS where C is a costat obtaied after the multiplicatio (icludig the ormalizatio factor eeded for the ew Gaussia). The expressio for p(x) ca be expaded ito p(x) = Ce (( x which groupig some terms reduces to xx + x p(x) = Ce (x ( + )+( x ) x( x xx + x )) + x ))+D where D is a costat. The expressio ca be rewritte as p(x) = Ce ( + Completig the square i the expoet we obtai: p(x) = F e + (x x( x +x )))+D +. [ ] x (x +x ) + where all costats i the expoet (also those arisig from completig the square) ad i frot of the expoetial fuctio have bee absorbed ito the costat F. From this result, we see that the most probable result ˆx obtaied from combiig the two measuremets (that is, the ceter of the distributio) is ˆx = (x + x ) + ad the variace of the combied result is ˆ = + If we itroduce a gai factor K = /( + ) we ca rewrite the best estimate ˆx of the state as ˆx = x + K(x x ) ad the chage to the variace as ˆ = ( K) This is the geeral form of the classical Kalma filter. Note that x does ot eed to be a measuremet. It ca be a forecast of the system state, with a variace, ad x ca be a measuremet with the error variace. The Kalma filter would i that case combie the forecast with the measuremet i order to provide the best possible liear combiatio of both as the fial estimate. 3. Multidimesioal Kalma Filter - ucorrelated dimesios We ca geeralize the result obtaied above to the case of multiple dimesios, whe every dimesio is idepedet of each other, that is, whe there is o correlatio betwee the measuremets obtaied for each dimesio. I that case the measuremets x ad x are vectors, ad we ca combie their coordiates idepedetly. I the case of ucorrelated dimesios, give measuremets x ad x ( vectors of dimesio ), the variaces for each dimesio ca be arraged i the diagoal of two matrices Σ ad Σ, as follows:
THE KALMAN FILTER 5 ad Σ = Σ = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 where i represets the variace of the i-th coordiate for the first istrumet, ad i the variace of the i-th coordiate for the secod. The best estimate for the system state is give by ˆx = (Σ x + Σ x )(Σ + Σ ) Itroducig the gai K = Σ (Σ + Σ ), we ca rewrite the expressio above as ˆx = x + K(x x ) ad the ew estimate of all variaces ca be put i the diagoal of a matrix ˆΣ computed as ˆΣ = (I K)Σ where I is the idetity matrix. The reader ca verify all these expressios. The matrices Σ ad Σ are the covariace matrices for the two istrumets. I the special case of ucorrelated dimesios, the covariace matrix is diagoal. Remember that the covariace matrix Σ for m -dimesioal data poits x, x,..., x m is defied as Σ = (x x T + x x T + + x m x T m). 4. The geeral case The geeral case is very similar to the oe-dimesioal case. The mathematics is almost idetical, but ow we have to operate with vectors ad matrices. Give two -dimesioal measuremets x ad x, we assume that the two istrumets that produced them have a ormal error distributio. The first measuremet is cetered at x with covariace matrix Σ. The secod is cetered at x with covariace matrix Σ. The first distributio is deoted by N(x, Σ ), the secod by N(x, Σ ) If the two istrumets are idepedet, the combied distributio is the product of the two distributios. The product of two ormalized Gaussias, after ormalizatio, is aother ormalized Gaussia. We oly have to add the expoets of the two Gaussias i order to fid the expoet of the ew Gaussia. Ay costats i the expoet ca be moved out of the expoet: they ca be coverted to costats i frot of the expoetial fuctio, which will be later ormalized. The ew distributio is N(ˆx, ˆΣ). Remember that i a Gaussia with ceter µ ad covariace matrix Σ, the expoet of the Gaussia is (x µ)t Σ (x µ). The sum of the expoets of the two Gaussias is:
6 RAUL ROJAS [ (x x ) T Σ (x x ) + (x x ) T Σ (x x ) ] We will drop the factor / i the itermediate steps ad will recover it at the ed. The expressio above (without cosiderig the factor /) ca be expaded as: x T (Σ + Σ )x (xt Σ + x T Σ )x + C where C is a costat. Sice the product of symmetric matrices is commutative ad the two covariace matrices Σ ad Σ are commutative, we ca trasform the expressio above ito: x T (Σ + Σ )(Σ Σ ) x (x T Σ + x T Σ )(Σ Σ ) (Σ + Σ ) (Σ + Σ )x + C Note that we used the fact that (Σ +Σ )Σ Σ = Σ +Σ. Now we ca complete the square i the expoet ad rewrite this expressio as: (x (x Σ +x Σ )(Σ +Σ ) ) T (Σ +Σ )(Σ Σ ) (x (x Σ +x Σ )(Σ +Σ ) )+C where C is a costat absorbig C ad ay other costats eeded for the square completio. Multiplyig the above expressio by /, the expoet has the required form for a Gaussia, ad by direct ispectio we see that the expected average of the combiatio of measuremets is ˆx = (x Σ + x Σ )(Σ + Σ ) ad the ew covariace matrix is ˆΣ = (Σ + Σ ) (Σ Σ ) These expressios correspod to the Kalma filter for the combiatio of two measuremets. To see this more clearly we ca defie the Kalma gai K as K = Σ (Σ + Σ ) The the expressios above ca be rewritte as ˆx = x + K(x x ) ad ˆΣ = (I K)Σ where I is the idetity matrix. Notice that i all the algebraic maipulatios above we used the fact that the product ad the sum of symmetric matrices is symmetric, ad that the iverse of a symmetric matrix is symmetric. Symmetric matrices commute uder matrix multiplicatio. 5. Forecast ad measuremet of differet dimesio The equatios prited i books for the Kalma filter are more geeral tha the expressios we have derived, because they hadle the more geeral case i which the measuremet ca have a differet dimesio tha the system s state. But it is easy to derive the more geeral form, all is eeded is to move all calculatios to measuremet space. I the geeral case we have a process goig through state trasitios x, x,.... The propagator matrix A allows us to forecast the state at time k + give our best past estimate ˆx k of the state at time k: ˆx k+ = Aˆx k + ν
THE KALMAN FILTER 7 The forecast is error proe. The error is give by ν, a variable with Gaussia distributio cetered at 0. If the covariace matrix of the system state estimate at time k is called P, the covariace matrix of ˆx k+ is give by P k+ = AP ka T + Q where Q is the covariace matrix of the oise ν. The measuremet z k+ is also error proe. The measuremet is liearly related to the state, amely as follows: z k = Hx k + R where H is a matrix ad R the covariace matrix of the measuremet error. Kalma gai: K k+ = P k+ HT [HP k+ HT + R] State update: Covariace update: x k+ ˆ = ˆx k+ + K k+[z k+ H ˆx k+ ] P k+ = (I K k+ H)P k+ Refereces. Peter Maybeck, Stochastic Models, Estimatiod ad Cotrol, Academic Press. Ic, 979.. N.D. Sigpurwalla ad R.J. Meihold, Uderstadig the Kalma Filter, The America Statisticia, Vol. 37, N., pp. 3-7. 3. Greg Welch ad Gary Bishop, A Itroductio to the Kalma Filter, TR-95-04, Departmet of Computer Sciece, Uiversity of North Carolia at Chapell Hill, updated i 00. Freie Uiversität Berli, Istitut für Iformatik