STATIC AND DYNAMIC RECURSIVE LEAST SQUARES

3rd February 2006

1 Problem #1: additional information

Problem. At step k we want to solve by least squares

    [ A_1 ]       [ b_1 ]
    [ A_2 ] x  ≈  [ b_2 ]
    [  :  ]       [  :  ]
    [ A_k ]       [ b_k ]

that is, A^(k) x ≈ b^(k), where A^(k) is the block column of A_1, ..., A_k and b^(k) is the corresponding stacking of the right-hand sides, with block diagonal weight matrix

    W^(k) := diag(W_1, W_2, ..., W_k),

meaning that measurements at different times (steps) are independent.

Sizes. The matrices A_j are m_j x n. We assume that they have maximum column rank. The matrices W_j are symmetric and positive definite.

Simple case. m_j = m for all j, meaning that the number of observations is constant.

Magnitudes. We assume that n is small, whereas m_j can be relatively large.

The aim. To approximate A^(k) x ≈ b^(k) we solve

    x_k = ((A^(k))^T W^(k) A^(k))^{-1} (A^(k))^T W^(k) b^(k).

The idea is to use x_{k-1} to calculate x_k in a shorter time.
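As an illustration of the aim, the batch solution and the recursive update derived next can be sketched in NumPy. This is a minimal sketch with my own function names; the recursive version uses the update x_j = x_{j-1} + P_j A_j^T W_j (b_j - A_j x_{j-1}), with the matrix P_j^{-1} accumulated explicitly.

```python
import numpy as np

def batch_wls(As, bs, Ws):
    """Weighted least squares for the stacked system: solves the
    normal equations (A^T W A) x = A^T W b with block diagonal W."""
    M = sum(A.T @ W @ A for A, W in zip(As, Ws))
    c = sum(A.T @ W @ b for A, b, W in zip(As, bs, Ws))
    return np.linalg.solve(M, c)

def recursive_wls(As, bs, Ws):
    """Same solution computed recursively: each new block of
    observations (A_j, b_j, W_j) updates the previous estimate."""
    M = As[0].T @ Ws[0] @ As[0]            # accumulates A^T W A
    x = np.linalg.solve(M, As[0].T @ Ws[0] @ bs[0])
    for A, b, W in zip(As[1:], bs[1:], Ws[1:]):
        M = M + A.T @ W @ A                # P_j^{-1} = P_{j-1}^{-1} + A_j^T W_j A_j
        r = b - A @ x                      # residual of the old estimate on new data
        x = x + np.linalg.solve(M, A.T @ W @ r)
    return x
```

After processing all k blocks, both functions return the same x_k (up to rounding).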
Notation. We write

    P_k^{-1} := sum_{j=1}^{k} A_j^T W_j A_j = (A^(k))^T W^(k) A^(k).

Theorem (A recursion). Beginning with

    x_1 = P_1 A_1^T W_1 b_1,    P_1 = (A_1^T W_1 A_1)^{-1},

the solution is given by the recurrent calculations (j = 2, ..., k)

    x_j = x_{j-1} + P_j A_j^T W_j (b_j - A_j x_{j-1}),    P_j^{-1} = P_{j-1}^{-1} + A_j^T W_j A_j,

where the matrix K_j := P_j A_j^T W_j multiplying the residual is the gain.

Proof. It is a straightforward verification.

Remark. The vector b_j - A_j x_{j-1} is the residual of the observations at level j when taking into account the values of the parameters at level j-1.

A practical algorithm.

    M := A_1^T W_1 A_1
    x := solve(M, A_1^T W_1 b_1)
    for j = 2 : k
        M := M + A_j^T W_j A_j
        r := b_j - A_j x              % residual of previous calculations with new data
        δ := solve(M, A_j^T W_j r)
        x := x + δ
    end for

Computational features. Explicit assembling of M could be avoided if iterative methods were to be used.

2 Problem #2: dynamic least squares

The problem. Least squares solution to

    [  A_0                   ]  [ x_0   ]     [ b_0 ]
    [ -F_0   I               ]  [ x_1   ]     [  0  ]
    [        A_1             ]  [  :    ]     [ b_1 ]
    [       -F_1   I         ]  [x_{k-1}]  ≈  [  0  ]
    [              ...       ]  [ x_k   ]     [  :  ]
    [        -F_{k-1}   I    ]               [  0  ]
    [                   A_k  ]               [ b_k ]
with weight matrix

    W = diag(W_0, Ŵ_1, W_1, Ŵ_2, W_2, ..., Ŵ_k, W_k).

Interpretation. Observations at different times (of the different magnitudes) are contained in the equations A_j x_j ≈ b_j, whereas x_{j+1} ≈ F_j x_j is an approximate dynamic model, relating the consecutive sets of parameters.

Simplified notations. We write A for the whole matrix of the system above, and S for the same matrix without its last block row (the one containing A_k):

    A = [ A_0; -F_0 I; A_1; -F_1 I; ... ; -F_{k-1} I; A_k ],
    S = [ A_0; -F_0 I; A_1; -F_1 I; ... ; -F_{k-1} I ]

(block rows separated by semicolons).

Names. Smoothed values (at earlier states): x_{j|k} for j < k; filtered value (at the current state): x_{k|k}; predicted value (next state in the future, not yet observed): x_{k+1|k} := F_k x_{k|k}.

Theorem (Kalman filter recurrence). Starting with

    x_{0|0} := (A_0^T W_0 A_0)^{-1} A_0^T W_0 b_0,    P_{0|0} := (A_0^T W_0 A_0)^{-1},

the calculations of x_{k|k} for successive values of k can be carried out by doing:

    x_{k|k-1} := F_{k-1} x_{k-1|k-1},
    x_{k|k}   := x_{k|k-1} + K_k (b_k - A_k x_{k|k-1}),

where

    P_{k|k-1} := F_{k-1} P_{k-1|k-1} F_{k-1}^T + Ŵ_k^{-1},
    K_k       := P_{k|k-1} A_k^T (A_k P_{k|k-1} A_k^T + W_k^{-1})^{-1},
    P_{k|k}   := (I - K_k A_k) P_{k|k-1}.

Proof. See Section 4.
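The recurrence in the theorem can be sketched in NumPy as follows (a minimal version for illustration: the function name is mine, dense inverses are used for clarity, and the covariances Ŵ_k^{-1} and W_k^{-1} are passed directly).

```python
import numpy as np

def kalman_step(x, P, F, What_inv, A, W_inv, b):
    """One step: (x_{k-1|k-1}, P_{k-1|k-1}) -> (x_{k|k}, P_{k|k}).
    What_inv is Ŵ_k^{-1} (m x m), W_inv is W_k^{-1} (n x n)."""
    # prediction
    x_pred = F @ x                               # x_{k|k-1}
    P_pred = F @ P @ F.T + What_inv              # P_{k|k-1}
    # correction
    C = A @ P_pred @ A.T + W_inv                 # n x n
    K = P_pred @ A.T @ np.linalg.inv(C)          # gain K_k, m x n
    x_new = x_pred + K @ (b - A @ x_pred)        # x_{k|k}
    P_new = (np.eye(len(x)) - K @ A) @ P_pred    # P_{k|k}
    return x_new, P_new
```

After one step starting from (x_{0|0}, P_{0|0}), the filtered value x_{1|1} coincides with the last block of the weighted least squares solution of the full dynamic system with k = 1, which is what the theorem asserts.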
Names. For the two steps of the method we write:

    prediction:   x_{k|k-1} := F_{k-1} x_{k-1|k-1},
    correction:   x_{k|k}   := x_{k|k-1} + K_k (b_k - A_k x_{k|k-1}).

For the matrices intervening in the computations:

    predicted covariance:  P_{k|k-1},
    gain matrix:           K_k,
    corrected covariance:  P_{k|k}.

Sizes in the typical simple case. The number of parameters in the dynamical system is fixed, i.e., x_j ∈ R^m for all j. The number of observations in each time step is fixed, i.e., b_j ∈ R^n for all j:

    m = #{parameters},    n = #{observations}.

Hence the sizes of the matrices are as follows: W_j is n x n, Ŵ_j is m x m, F_j is m x m, A_j is n x m, P_{k|k} and P_{k|k-1} are m x m, and K_k is m x n (it acts on observations, returning parameters).

Computational features (when n is large and m is small). We have to compute

    C := A_k P_{k|k-1} A_k^T + W_k^{-1},

which is n x n. Typically we have W_k^{-1} (the covariance of the observations) instead of W_k. To compute K_k (b_k - A_k x_{k|k-1}) = P_{k|k-1} A_k^T C^{-1} (b_k - A_k x_{k|k-1}) we solve one n x n system. To compute K_k A_k = P_{k|k-1} A_k^T C^{-1} A_k we solve m systems of size n x n. All these calculations can be done in parallel.
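The organization by linear solves described above can be sketched as follows (a sketch under the assumption that C is formed explicitly; the function name is mine).

```python
import numpy as np

def correction_via_solves(x_pred, P_pred, A, W_inv, b):
    """Correction step computed with n x n solves against
    C = A P_pred A^T + W^{-1}, without forming K explicitly."""
    C = A @ P_pred @ A.T + W_inv
    # K (b - A x_pred) = P_pred A^T C^{-1} (b - A x_pred): one n x n system
    x_new = x_pred + P_pred @ (A.T @ np.linalg.solve(C, b - A @ x_pred))
    # K A = P_pred A^T C^{-1} A: m systems of size n x n (columns of A)
    KA = P_pred @ A.T @ np.linalg.solve(C, A)
    P_new = (np.eye(P_pred.shape[0]) - KA) @ P_pred
    return x_new, P_new
```

The result is identical (in exact arithmetic) to forming the gain K = P_pred A^T C^{-1} explicitly; the two solves are independent and can run in parallel.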
Steady state. We remark that the steady state case, where the dynamics reduces to

    x_{j+1} ≈ x_j  for all j,  that is,  F_j = I  for all j,

is not equivalent to the static case (Problem #1).

3 Normal equations of the Kalman filter

A first observation. In the normal equations A^T W A x = A^T W b we can write

    A^T W A = [ D_0    U_0                              ]
              [ U_0^T  D_1    U_1                       ]
              [        ...    ...        ...            ]
              [        U_{k-2}^T  D_{k-1}   U_{k-1}     ]
              [                   U_{k-1}^T   D_k^*     ]

    A^T W b = (c_0, c_1, ..., c_k),    c_j := A_j^T W_j b_j,

where (taking Ŵ_0 = 0)

    D_j   := Ŵ_j + A_j^T W_j A_j + F_j^T Ŵ_{j+1} F_j,   j < k,
    U_j   := -F_j^T Ŵ_{j+1},
    D_k^* := Ŵ_k + A_k^T W_k A_k.

The symbol * marks the only element of the matrix which depends on the size (k). Notice also that for the next time step

    D_k = D_k^* + F_k^T Ŵ_{k+1} F_k.

Another look at the normal equations. The system

    [ D_0    U_0                        ] [ x_0   ]   [ c_0   ]
    [ U_0^T  D_1    U_1                 ] [ x_1   ]   [ c_1   ]
    [        ...    ...      ...        ] [  :    ] = [  :    ]
    [        U_{k-2}^T  D_{k-1} U_{k-1} ] [x_{k-1}]   [c_{k-1}]
    [                 U_{k-1}^T  D_k^*  ] [ x_k   ]   [ c_k   ]

is symmetric block tridiagonal.
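The block formulas above can be checked numerically. The sketch below (function name mine) assembles the blocks D_j and U_j from the formulas, so they can be compared with the corresponding blocks of A^T W A built by brute force.

```python
import numpy as np

def normal_blocks(As, Ws, Whats, Fs):
    """Blocks D_j (diagonal) and U_j (off-diagonal) of A^T W A,
    taking Ŵ_0 = 0.  As = [A_0..A_k], Ws = [W_0..W_k],
    Whats = [Ŵ_1..Ŵ_k], Fs = [F_0..F_{k-1}]."""
    k = len(As) - 1
    D, U = [], []
    for j in range(k + 1):
        Dj = As[j].T @ Ws[j] @ As[j]
        if j > 0:
            Dj = Dj + Whats[j - 1]                 # Ŵ_j
        if j < k:
            Dj = Dj + Fs[j].T @ Whats[j] @ Fs[j]   # F_j^T Ŵ_{j+1} F_j
            U.append(-Fs[j].T @ Whats[j])          # U_j = -F_j^T Ŵ_{j+1}
        D.append(Dj)
    return D, U
```

For j = k the loop produces D_k^* = Ŵ_k + A_k^T W_k A_k, with no F term, as in the formulas above.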
Reduced equations. Before considering the inclusion of the observations at time k (A_k x_k ≈ b_k) we can consider the reduced system

    [  A_0                 ]  [ y_0   ]     [ b_0 ]
    [ -F_0   I             ]  [ y_1   ]     [  0  ]
    [        A_1           ]  [  :    ]  ≈  [ b_1 ]
    [       -F_1   I       ]  [y_{k-1}]     [  0  ]
    [              ...     ]  [ y_k   ]     [  :  ]
    [        -F_{k-1}   I  ]                [  0  ]

Theorem. The vector

    (x_{0|k-1}, x_{1|k-1}, ..., x_{k-1|k-1}, F_{k-1} x_{k-1|k-1}),

whose last component is x_{k|k-1}, is the weighted least squares solution to the reduced system.

Proof. Denote S as above, and let

    W = diag(W_0, Ŵ_1, W_1, ..., W_{k-1}, Ŵ_k)

be the weight matrix. The matrix of the normal equations is

    S^T W S = [ D_0    U_0                        ]
              [ U_0^T  D_1    U_1                 ]
              [        ...    ...     ...         ]
              [        U_{k-2}^T  D_{k-1} U_{k-1} ]
              [                 U_{k-1}^T   Ŵ_k   ]

that is, the same matrix as for the full system except for its last diagonal block, which is Ŵ_k = D_k^* - A_k^T W_k A_k, and the right-hand side is (c_0, ..., c_{k-1}, 0). It is simple to prove that the proposed solution satisfies the equations.
A direct block method. If we only want to solve for the last unknown x_k = x_{k|k}, we can apply an elimination technique to reach the upper block triangular form

    [ E_0  U_0                   ] [ x_0   ]   [ d_0   ]
    [      E_1  U_1              ] [ x_1   ]   [ d_1   ]
    [           ...    ...       ] [  :    ] = [  :    ]
    [           E_{k-1}  U_{k-1} ] [x_{k-1}]   [d_{k-1}]
    [                    E_k     ] [ x_k   ]   [ d_k   ]

by using the well known tridiagonal algorithm:

    E_0 := D_0
    d_0 := c_0
    for j = 1 : k-1
        E_j := D_j - U_{j-1}^T E_{j-1}^{-1} U_{j-1}
        d_j := c_j - U_{j-1}^T E_{j-1}^{-1} d_{j-1}
    end for
    E_k := D_k^* - U_{k-1}^T E_{k-1}^{-1} U_{k-1}
    d_k := c_k - U_{k-1}^T E_{k-1}^{-1} d_{k-1}
    x_k := solve(E_k, d_k)

The same elimination process applied to the reduced equations yields

    [ E_0  U_0                   ] [ y_0   ]   [ d_0   ]
    [      E_1  U_1              ] [ y_1   ]   [ d_1   ]
    [           ...    ...       ] [  :    ] = [  :    ]
    [           E_{k-1}  U_{k-1} ] [y_{k-1}]   [d_{k-1}]
    [                    H_k     ] [ y_k   ]   [ e_k   ]

where

    H_k := Ŵ_k - U_{k-1}^T E_{k-1}^{-1} U_{k-1} = E_k - A_k^T W_k A_k,
    e_k := -U_{k-1}^T E_{k-1}^{-1} d_{k-1} = d_k - A_k^T W_k b_k = d_k - c_k.

The next step. The equality D_k = D_k^* + F_k^T Ŵ_{k+1} F_k implies that the next step only requires the following calculations:

    E_k := E_k + F_k^T Ŵ_{k+1} F_k          (updating the value computed at step k)
    D_{k+1}^* := Ŵ_{k+1} + A_{k+1}^T W_{k+1} A_{k+1}
    U_k := -F_k^T Ŵ_{k+1}
    E_{k+1} := D_{k+1}^* - U_k^T E_k^{-1} U_k
    d_{k+1} := c_{k+1} - U_k^T E_k^{-1} d_k
The method as a whole. Notice that we can write all the steps in the preceding recursive way. Notice however that this scheme does not use the solutions at previous states, unlike the true Kalman filter implementation.

    D_0 := A_0^T W_0 A_0
    E_0 := D_0
    d_0 := A_0^T W_0 b_0
    x_0 := solve(E_0, d_0)
    for j = 1 : k
        E_{j-1} := E_{j-1} + F_{j-1}^T Ŵ_j F_{j-1}
        D_j := Ŵ_j + A_j^T W_j A_j
        U_{j-1} := -F_{j-1}^T Ŵ_j
        E_j := D_j - U_{j-1}^T E_{j-1}^{-1} U_{j-1}
        d_j := A_j^T W_j b_j - U_{j-1}^T E_{j-1}^{-1} d_{j-1}
        x_j := solve(E_j, d_j)
    end for

Writing the program in a more computational way, we can see the memory requirements of the algorithm. Note that d must be updated before E is overwritten, since both updates use the value of E from the previous step:

    A := A_0; b := b_0              % storage of observations
    W := W_0                        % storage of weights
    D := A^T W A
    E := D
    d := A^T W b
    x := solve(E, d)
    for j = 1 : k
        F := F_{j-1}; Ŵ := Ŵ_j      % storage of the state equation
        U := -F^T Ŵ
        E := E + F^T Ŵ F
        A := A_j; b := b_j; W := W_j
        D := Ŵ + A^T W A
        d := A^T W b - U^T solve(E, d)
        E := D - U^T solve(E, U)
        x := solve(E, d)
    end for
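A NumPy sketch of the recursion above (the function name is mine; explicit inverses are replaced by solves, and d is updated before E is overwritten, since both use the previous value of E).

```python
import numpy as np

def dynamic_ls_filter(As, bs, Ws, Whats, Fs):
    """Filtered estimates x_{0|0}, ..., x_{k|k} via the E_j / d_j
    recursion.  As = [A_0..A_k], Ws = [W_0..W_k],
    Whats = [Ŵ_1..Ŵ_k], Fs = [F_0..F_{k-1}]."""
    E = As[0].T @ Ws[0] @ As[0]                  # E_0 = D_0
    d = As[0].T @ Ws[0] @ bs[0]                  # d_0 = c_0
    xs = [np.linalg.solve(E, d)]
    for j in range(1, len(As)):
        F, What = Fs[j - 1], Whats[j - 1]
        U = -F.T @ What                          # U_{j-1}
        E = E + F.T @ What @ F                   # update E_{j-1} for the new step
        A, b, W = As[j], bs[j], Ws[j]
        D = What + A.T @ W @ A                   # D_j (the starred block)
        d = A.T @ W @ b - U.T @ np.linalg.solve(E, d)   # d_j (uses the old E)
        E = D - U.T @ np.linalg.solve(E, U)      # E_j
        xs.append(np.linalg.solve(E, d))
    return xs
```

Since the block elimination is exact, the final value xs[-1] = x_{k|k} must equal the last block of the batch weighted least squares solution of the full dynamic system.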
4 A proof of the Kalman filter recursion

Lemma. Let C and P be invertible. Then the inverse of

    Q := P - P B^T (B P B^T + C^{-1})^{-1} B P

is

    Q^{-1} = P^{-1} + B^T C B.

Proof. It can be easily proven by direct verification. Alternatively, we can consider the block matrix

    [ C^{-1} + B P B^T   B P ]
    [ P B^T              P   ]

and compute the (2,2) block of its inverse in the two possible ways (eliminating the first block row or the second one). One way gives

    (P - P B^T (C^{-1} + B P B^T)^{-1} B P)^{-1} = Q^{-1};

the other gives

    P^{-1} + B^T C B,

since C^{-1} + B P B^T - (B P) P^{-1} (P B^T) = C^{-1}. Comparison of the two expressions for the (2,2) block gives the result.

Theorem. For all k,

    P_{k|k}^{-1} = E_k,    P_{k|k-1}^{-1} = H_k.

Proof. The couple (H_k, E_k) is given by the recurrence

    H_k = Ŵ_k - Ŵ_k F_{k-1} (E_{k-1} + F_{k-1}^T Ŵ_k F_{k-1})^{-1} F_{k-1}^T Ŵ_k,
    E_k = H_k + A_k^T W_k A_k,

starting at E_0 = A_0^T W_0 A_0. On the other hand, the couple (P_{k|k-1}, P_{k|k}) satisfies the recurrence

    P_{k|k-1} = Ŵ_k^{-1} + F_{k-1} P_{k-1|k-1} F_{k-1}^T,
    P_{k|k} = P_{k|k-1} - P_{k|k-1} A_k^T (A_k P_{k|k-1} A_k^T + W_k^{-1})^{-1} A_k P_{k|k-1},

starting with P_{0|0} = (A_0^T W_0 A_0)^{-1}. The lemma clearly shows that both recurrences are equivalent: apply it once with (P, B, C) = (Ŵ_k, F_{k-1}^T, P_{k-1|k-1}) and once with (P, B, C) = (P_{k|k-1}, A_k, W_k).

Theorem. For all k,

    x_{k|k} = (I - K_k A_k) x_{k|k-1} + K_k b_k.

Proof. Notice first that, by the preceding result and the form of the corresponding normal equations and their solutions,

    x_{k|k} = P_{k|k} d_k,    x_{k|k-1} = P_{k|k-1} e_k = P_{k|k-1} (d_k - A_k^T W_k b_k)
and also

    I - K_k A_k = P_{k|k} P_{k|k-1}^{-1}.

The result is then equivalent to showing that

    P_{k|k} d_k = P_{k|k} d_k - P_{k|k} A_k^T W_k b_k + K_k b_k,

but

    P_{k|k} A_k^T W_k = K_k,

as can be deduced from the definitions of P_{k|k} and K_k.

Further reading and sources

Gilbert Strang, Introduction to Applied Mathematics. Wellesley-Cambridge Press.

Gilbert Strang and Kai Borre, Linear Algebra, Geodesy and GPS. Wellesley-Cambridge Press.

Neil Gershenfeld, The Nature of Mathematical Modeling. Cambridge University Press, 1999 (Chapter 15).

by F. J. Sayas
Mathematics, Ho!, May 2001

Mathematics, Ho! is a personal project for individual and communal achievement in higher Maths among Mathematicians, with a bias towards Applied Mathematics. It is a collection of class notes and small courses. I do not claim entire originality in these, since they have been much influenced by the works I report as references.

Conditions of use. You are allowed to use these pages and to share this material. This is by no means aimed to be given to undergraduate students, since the style is often too dry. Anyway, you are requested to quote this as a source whenever you make extensive use of it. Note also that this is permanently a work in progress.