Smoothers: Types and Benchmarks
Patrick N. Raanes, Oxford University / NERSC
8th International EnKF Workshop, May 27, 2013
With: Chris Farmer, Irene Moroz, Laurent Bertino, Geir Evensen
Abstract
This talk builds on "Smoothing problems in a Bayesian framework and their linear Gaussian solutions" (Cosme et al., 2012).
Aim: introduce a new organisation of smoothers.
Aim: comparisons with the less known RTS smoother.
Outline Bayesian DA Classic Form Ensemble Formulations Benchmarking Appendix
Notation — Kalman filter equations

Forecast:
x_{t|1:t-1} = M_{t-1} x_{t-1|1:t-1}
P_{t|1:t-1} = M_{t-1} P_{t-1|1:t-1} M_{t-1}^T + Q_t

Analysis:
K_t = P_{t|1:t-1} H^T C^{-1}
x_{t|1:t} = x_{t|1:t-1} + K_t d_t
P_{t|1:t} = (I - K_t H) P_{t|1:t-1}

where C = H P_{t|1:t-1} H^T + R_t and d_t = y_t - H x_{t|1:t-1}.

PDF of state x conditioned on data y: p(x|y)
x_{1:t} = (x_1, x_2, ..., x_{t-1}, x_t)
Estimate of x_t given data y_{1:t}: x_{t|1:t}
The n-th ensemble member: x^n_{t|1:t}; all ensemble members: E_{t|1:t}
Non-linear models denoted by curly symbols: H, M
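As a concrete sketch of the equations above, one forecast-analysis cycle in NumPy (the function name and the linear-model inputs are illustrative, not from the slides):

```python
import numpy as np

def kf_step(x, P, y, M, H, Q, R):
    """One forecast-analysis cycle of the Kalman filter (slide notation)."""
    # Forecast: x_{t|1:t-1}, P_{t|1:t-1}
    x_f = M @ x
    P_f = M @ P @ M.T + Q
    # Analysis: gain K_t, innovation d_t, then the updates
    C = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(C)
    d = y - H @ x_f
    x_a = x_f + K @ d
    P_a = (np.eye(len(x)) - K @ H) @ P_f
    return x_a, P_a
```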
Bayesian DA — A useful generalisation

Hidden Markov chain model:
x_0 →(M_0) x_1 →(M_1) x_2 → ... →(M_{t-1}) x_t → ... →(M_{T-1}) x_T,
with observations y_t = H_t(x_t) at each time t = 1, ..., T.

Bayes' rule: p(x|y) = p(y|x) p(x) / p(y)
Marginalisation: p(x_1) = ∫ p(x_1, x_2) dx_2
Bayesian DA — A useful generalisation

Three methods:
- Augmented Filter (AF) smoothers
- Backward-Forward / Rauch-Tung-Striebel (RTS) smoothers
- Two-Filter (2F) smoothers
Bayesian DA — Augmented Filter (AF) smoothers

Augment the state vector: x̃_t = (x_t; x_{Σ_t}), for some Σ_t ⊆ (0:t-1).

Find the filter solution:
p(x_t, x_{Σ_t} | y_{1:t}) = p(y_t|x_t) ∫ p(x_t|x_{t-1}) p(x_{t-1}, x_{Σ_{t-1}} | y_{1:t-1}) dx_{Σ_t^c \ Σ_{t-1}^c},
provided Σ_{t-1}^c ⊆ Σ_t^c, where Σ_t^c = (0:t-1) \ Σ_t.
Bayesian DA — AF special cases: The Filter

Σ_t = ∅, i.e. (x_t; x_{Σ_t}) = x_t

p(x_t | y_{1:t}) = p(y_t|x_t) ∫ p(x_t|x_{t-1}) p(x_{t-1} | y_{1:t-1}) dx_{t-1}
Bayesian DA — AF special cases: The Global, Sequential Smoother

Σ_t = (0:t-1), i.e. (x_t; x_{Σ_t}) = x_{0:t}

p(x_{0:t} | y_{1:t}) = p(y_t|x_t) p(x_t|x_{t-1}) p(x_{0:t-1} | y_{1:t-1})
Bayesian DA — AF special cases: The Fixed-Lag Smoother

Σ_t = (t-L : t-1), i.e. (x_t; x_{Σ_t}) = x_{t-L:t}

p(x_{t-L:t} | y_{1:t}) = p(y_t|x_t) ∫ p(x_t|x_{t-1}) p(x_{t-1-L:t-1} | y_{1:t-1}) dx_{t-1-L}
Bayesian DA — AF special cases: The Fixed, Lagging Point Smoother

Σ_t = {t-L}, i.e. (x_t; x_{Σ_t}) = (x_t; x_{t-L})

Does not satisfy the condition Σ_{t-1}^c ⊆ Σ_t^c ⇒ no recursive relation available.
But the fixed-lag smoother should suffice.
Bayesian DA — RTS smoother

Seek a recursive, marginal smoother for a fixed analysis window:

p(x_t | y_{1:T}) = ∫ p(x_t | x_{t+1}, y_{1:t}) p(x_{t+1} | y_{1:T}) dx_{t+1}

First runs the filter; then corrects backwards.
Recursive, but not with new data.
Bayesian DA — Two-Filter (2F) smoother

Seek a smoother that directly utilises the filter solution:

p(x_t | y_{1:T}) ∝ p(x_t | y_{1:t}) p(y_{t+1:T} | x_t)

Recursive, but not with new data.
Involves a backwards likelihood/information filter ⇒ requires the adjoint model.
Will not be discussed further.
Classic Form

Derivable from the Bayesian formulations assuming Gaussian noise inputs and linear models.
One can simply insert the augmented matrices into the Kalman filter equations.
Classic Form — Augmented Filter Smoother

Augmented system matrices:

H̃_t = ( H_t  0  ⋯  0 )

Fixed point / fixed interval (stored states kept as they are):
M̃_t = ( M_t  0
         0    I )

Fixed-lag (x_t is shifted into storage and the oldest state dropped):
M̃_t = ( M_t  0  ⋯  0
         I    0  ⋯  0
              ⋱
         0  ⋯  I   0 )
Classic Form — Augmented Filter Smoother

In either case, the other block matrices are

Q̃_t = ( Q_t  0        P̃_t = ( P_t       P_{t,Σ_t}
         0    0 ),              P_{Σ_t,t}  P_{Σ_t} ),

which reduces the Kalman gain to

K̃_{t+1} = ( K_{t+1}     ) = ( P_{t+1|1:t}           ) H_{t+1}^T C^{-1}
            ( K_{Σ_{t+1}} )   ( P_{Σ_{t+1},t+1|1:t} )
Classic Form — Augmented Filter Smoother

Thus, in addition to the usual filter updates, we get

x_{Σ_t|1:t} = x_{Σ_t|1:t-1} + K_{Σ_t} d_t
P_{t,Σ_t|1:t} = (I - K_t H) P_{t,Σ_t|1:t-1}
P_{Σ_t|1:t} = P_{Σ_t|1:t-1} - K_{Σ_t} H P_{t,Σ_t|1:t-1}
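A minimal sketch of this smoother half of the analysis, assuming the gains K_t and K_{Σ_t} have already been formed from the joint covariance (function and variable names are ours, not from the slides):

```python
import numpy as np

def af_smoother_update(x_sig, P_tsig, P_sig, K, K_sig, H, d):
    """Smoother part of the augmented-filter analysis (slide notation):
    updates the stored states x_Sigma and the associated covariances."""
    nx = K.shape[0]
    x_sig_a = x_sig + K_sig @ d
    P_tsig_a = (np.eye(nx) - K @ H) @ P_tsig
    P_sig_a = P_sig - K_sig @ H @ P_tsig
    return x_sig_a, P_tsig_a, P_sig_a
```

Because the subtracted term in the last update is positive semi-definite, the stored-state covariance can only shrink, as expected when conditioning on new data.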
Classic Form — RTS Smoother

Derivable using the usual linear, Gaussian assumptions.
The recursive, retrospective corrections become

J_t = P_{t|1:t} M_t^T P_{t+1|1:t}^{-1}
x_{t|1:T} = x_{t|1:t} + J_t ( x_{t+1|1:T} - x_{t+1|1:t} )
P_{t|1:T} = P_{t|1:t} + J_t ( P_{t+1|1:T} - P_{t+1|1:t} ) J_t^T
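These corrections amount to a single backward sweep over stored filter output. A sketch, assuming the filter analyses and forecasts have been saved (names and the list-based storage are our choices):

```python
import numpy as np

def rts_sweep(xa, Pa, xf, Pf, M):
    """Backward RTS pass.  xa[t], Pa[t]: filter analysis at time t;
    xf[t], Pf[t]: forecast at time t (xf[t] = M @ xa[t-1]).
    Returns smoothed means and covariances."""
    xs, Ps = list(xa), list(Pa)
    for t in range(len(xa) - 2, -1, -1):
        J = Pa[t] @ M.T @ np.linalg.inv(Pf[t + 1])
        xs[t] = xa[t] + J @ (xs[t + 1] - xf[t + 1])
        Ps[t] = Pa[t] + J @ (Ps[t + 1] - Pf[t + 1]) @ J.T
    return xs, Ps
```

Note that the sweep touches no observations: it only reuses the filter quantities, which is why the RTS smoother is recursive but cannot absorb new data.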
Ensemble formulation — Augmented Filter Smoother

Classic form:
K_{Σ_t} = P_{Σ_t,t|1:t-1} H^T ( H P_{t|1:t-1} H^T + R )^{-1}
x_{Σ_t|1:t} = x_{Σ_t|1:t-1} + K_{Σ_t} d_t

Ensemble formulation ⇒ EnKF (Evensen, 1994), EnKS (Evensen and Van Leeuwen, 2000):
K_{Σ_t} = A_{Σ_t|1:t-1} S_t^T ( S_t S_t^T + R )^{-1}
x^n_{Σ_t|1:t} = x^n_{Σ_t|1:t-1} + K_{Σ_t} d^n_t,

where A_{t|1:t} = E_{t|1:t} [ I - (1/N) 𝟙𝟙^T ] and S_t = H( E_{t|1:t-1} ) [ I - (1/N) 𝟙𝟙^T ].
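In code, the ensemble gain might be formed as follows. The √(N-1) normalisation of the anomalies is our assumption (the slides leave the scaling implicit), and all data here are synthetic placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
N, nx, ny = 50, 4, 2                      # members, state dim, obs dim

E_f = rng.normal(size=(nx, N))            # forecast ensemble at time t
E_sig = rng.normal(size=(nx, N))          # stored (lagged) ensemble
H = np.eye(ny, nx)
R = 0.5 * np.eye(ny)

Pi = np.eye(N) - np.ones((N, N)) / N      # mean-removal projector I - (1/N) 11^T
A_sig = (E_sig @ Pi) / np.sqrt(N - 1)     # anomalies of the stored states
S = (H @ E_f @ Pi) / np.sqrt(N - 1)       # observed forecast anomalies
K_sig = A_sig @ S.T @ np.linalg.inv(S @ S.T + R)

# each member n is updated with its own (perturbed) innovation d^n
D = rng.normal(size=(ny, N))              # placeholder innovations
E_sig_a = E_sig + K_sig @ D
```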
Ensemble formulation — Augmented Filter Smoother

Possibility: save up multiple updates before analysing ⇒ EnS (Van Leeuwen and Evensen, 1996).
May save significant time because

max_n Σ_t T^n_t  <  Σ_t max_n T^n_t,

where T^n_t is the time required to propagate ensemble member n from one observation point to the next.

Possibility: process observations sequentially ⇒ EnSS (Cosme et al., 2012).
Ensemble formulation — RTS Smoother

Classic form:
J_t = P_{t|1:t} M_t^T P_{t+1|1:t}^{-1}
x_{t|1:T} = x_{t|1:t} + J_t ( x_{t+1|1:T} - x_{t+1|1:t} )

Ensemble formulation:
A_{t+1|1:t} = U Σ V^T
J_t = A_{t|1:t} V Σ^{-1} U^T
x^n_{t|1:T} = x^n_{t|1:t} + J_t ( x^n_{t+1|1:T} - x^n_{t+1|1:t} )

⇒ EnRTS (Lermusiaux and Robinson, 1999)
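The SVD construction of J_t is a Moore-Penrose pseudo-inverse of the forecast anomalies in disguise. A sketch with synthetic anomalies (the shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N, nx = 40, 5
A_a = rng.normal(size=(nx, N))   # analysis anomalies A_{t|1:t}
A_f = rng.normal(size=(nx, N))   # forecast anomalies A_{t+1|1:t}

# SVD of the forecast anomalies, then J_t = A_a V S^{-1} U^T
U, s, Vt = np.linalg.svd(A_f, full_matrices=False)
J = A_a @ Vt.T @ np.diag(1.0 / s) @ U.T
```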
Classic Form — Marginal and joint smoothers

Joint and marginal estimates may differ in general.
But the marginal and joint means are always equal.
Moreover, the marginals of a Gaussian random variable are Gaussians given by the corresponding components of the joint mean and covariance.
⇒ Classic smoothers, whether marginal or not, yield the same estimate if the system is linear and Gaussian.
Benchmarking — RMS

RMS = sqrt( (1/T) Σ_{t=1}^T ‖ x̂_t - x_t ‖² ),   (estimate minus truth)

normalised by the RMS of climatology.
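One plausible reading of this score in code; the array shapes and the explicit climatology argument are our assumptions:

```python
import numpy as np

def normalised_rms(est, truth, clim):
    """Time-mean RMS error of an estimate, divided by that of climatology.
    All arguments are (T, n_x) arrays of states over the analysis window."""
    rms = np.sqrt(np.mean(np.sum((est - truth) ** 2, axis=1)))
    rms_clim = np.sqrt(np.mean(np.sum((clim - truth) ** 2, axis=1)))
    return rms / rms_clim
```

With this normalisation a score of 1 means "no better than climatology" and 0 means a perfect estimate.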
Benchmarking — Lorenz 63

A convenient, chaotic toy model:
dx/dt = σ(y - x)
dy/dt = x(ρ - z) - y
dz/dt = xy - βz

with σ = 10, β = 8/3, ρ = 28,
initial condition (x, y, z)_{t=0} = (1.508870, -1.531271, 25.460910),
and Q_t = diag(2, 12.13, 12.31).
Direct observations of all 3 components, with R_t = 2I.
T = 10; Δt between observations = 0.25.
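A sketch of the model dynamics; the RK4 integrator and its step size are our choices, since the slide does not specify one:

```python
import numpy as np

SIGMA, BETA, RHO = 10.0, 8.0 / 3, 28.0

def lorenz63(v):
    """Right-hand side of the Lorenz-63 ODEs."""
    x, y, z = v
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def rk4(f, v, dt):
    """Classical 4th-order Runge-Kutta step."""
    k1 = f(v)
    k2 = f(v + dt / 2 * k1)
    k3 = f(v + dt / 2 * k2)
    k4 = f(v + dt * k3)
    return v + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

v = np.array([1.508870, -1.531271, 25.460910])
for _ in range(250):          # integrate to t = 0.25, one observation interval
    v = rk4(lorenz63, v, 0.001)
```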
Benchmarking Lorenz 63
Benchmarking — 1D Linear Advection

A linear, high-dimensional toy model:
∂u/∂t + c ∂u/∂x = 0
with c = Δx = Δt = 1,
a periodic waveform with decorrelation length 20,
a periodic domain with 300 nodes, and no forward-model noise.
Direct observations 60 nodes apart, assimilated every 10 time steps, with R_t = 0.01I.
T = 305.
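With c = Δx = Δt = 1 on a periodic domain, the exact model step is just a circular shift by one node. A sketch (the sine waveform is illustrative; the slide's waveform has decorrelation length 20):

```python
import numpy as np

nx, c = 300, 1
u = np.sin(2 * np.pi * np.arange(nx) / 20.0)   # a periodic waveform

def step(u):
    """Exact advection step: with c = dx = dt = 1, a one-node circular shift."""
    return np.roll(u, 1)

u10 = u.copy()
for _ in range(10):        # advance to the next assimilation time
    u10 = step(u10)
```

Because the model is an exact shift, it introduces no numerical diffusion, so all error growth in the benchmark comes from the DA method itself.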
Benchmarking 1D Linear Advection
Benchmarking — Method performance

- EnS superior on the LA model
- EnKS and RTS superior to EnS(S) on Lorenz 63
- EnKS and RTS perform similarly on both models
- EnS superior to the sequential EnS (EnSS)
- Difference between square-root and perturbed-observation variants comparatively insignificant
Benchmarking — Costs

Asymptotic flop counts, not including application of M and H to the ensemble.
Assumes N ≪ n_x, n_y (N: number of ensemble members).

Method     | Flops
-----------|---------------------------------------
EnKF       | T N² (n_x + n_y)  +  0
EnKS       | T N² (n_x + n_y)  +  ½ T² N² n_x
EnKS-Lag   | T N² (n_x + n_y)  +  L T N² n_x
EnS        | 0                 +  T N² n_x
EnRTS      | T N² (n_x + n_y)  +  T N² n_x
Future work More models More parameter test regimes More assessment scores than RMS More careful cost analysis More DA methods
References

Cosme, E., J. Verron, P. Brasseur, J. Blum, and D. Auroux, 2012: Smoothing problems in a Bayesian framework and their linear Gaussian solutions. Monthly Weather Review, 140, 683-695.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research, 99(C5), 10143-10162.
Evensen, G., 2009: The ensemble Kalman filter for combined state and parameter estimation. IEEE Control Systems, 29, 83-104.
Evensen, G. and P. Van Leeuwen, 2000: An ensemble Kalman smoother for nonlinear dynamics. Monthly Weather Review, 128, 1852-1867.
Jazwinski, A., 1970: Stochastic Processes and Filtering Theory, volume 63. Academic Press.
Lermusiaux, P. F. and A. Robinson, 1999: Data assimilation via error subspace statistical estimation. Part I: Theory and schemes. Monthly Weather Review, 127, 1385-1407.
Skjervheim, J. and G. Evensen, 2011: An ensemble smoother for assisted history matching. SPE Reservoir Simulation Symposium.
Van Leeuwen, P. J. and G. Evensen, 1996: Data assimilation and inverse methods in terms of a probabilistic formulation. Monthly Weather Review, 124, 2898-2913.
Bayesian DA — AF special cases: The Fixed Point Smoother

Σ_t = {k}, i.e. (x_t; x_{Σ_t}) = (x_t; x_k)

p(x_t, x_k | y_{1:t}) = p(y_t|x_t) ∫ p(x_t|x_{t-1}) p(x_{t-1}, x_k | y_{1:t-1}) dx_{t-1}