Data assimilation with and without a model
Tyrus Berry, George Mason University
NJIT, Feb. 28, 2017
Postdoc supported by NSF
This work is in collaboration with: Tim Sauer (GMU) and Franz Hamilton (Postdoc, NCSU)
DATA ASSIMILATION

x_k = f(x_{k-1}) + η_k,  η_k ~ N(0, Q)
y_k = h(x_k) + ν_k,  ν_k ~ N(0, R)

Main Problem: Given the model above plus observations y_k,
- Filtering: estimate the current state p(x_k | y_1, ..., y_k)
- Forecasting: estimate a future state p(x_{k+l} | y_1, ..., y_k)
- Smoothing: estimate a past state p(x_{k-l} | y_1, ..., y_k)
- Parameter estimation

Apply the ensemble Kalman filter (EnKF) to achieve these goals.
DATA ASSIMILATION

x_k = f(x_{k-1}) + η_k,  η_k ~ N(0, Q)
y_k = h(x_k) + ν_k,  ν_k ~ N(0, R)

Possible obstructions:
- Observations y_k mix system noise η_k with observation noise ν_k
- Observations may be sparse in space or time
- Model error
- Q and R may be unknown
- Known model with unknown parameters
- Wrong model, even with best-fit parameters
- Have a model for some, but not all, of the variables
EXAMPLE 1. LORENZ-96

dx_i/dt = (x_{i+1} − x_{i−2}) x_{i−1} − x_i + F

[Figure: space-time plot of the 40 Lorenz-96 sites over time 0-50]
EXAMPLE 2. MEA RECORDINGS

Given: voltages + model
Want to find:
- State: sodium, potassium, currents
- Parameters
- Neuron network
TWO-STEP FILTERING TO FIND p(x_k | y_1, ..., y_k)

Assume we have p(x_{k-1} | y_1, ..., y_{k-1})
Forecast step: find p(x_k | y_1, ..., y_{k-1})
Assimilation step: perform a Bayesian update,

p(x_k | y_1, ..., y_k) ∝ p(x_k | y_1, ..., y_{k-1}) p(y_k | x_k, y_1, ..., y_{k-1})
    (Posterior ∝ Prior × Likelihood)

Alternative: variational methods (e.g. 3DVAR, 4DVAR)
- Minimize a cost functional
- Hard optimization problem
BEST POSSIBLE SCENARIO

x_k = f(x_{k-1}) + η_k,  η_k ~ N(0, Q)
y_k = h(x_k) + ν_k,  ν_k ~ N(0, R)

f and h are linear, all parameters known:

x_k = F_{k-1} x_{k-1} + η_k,  η_k ~ N(0, Q)
y_k = H_k x_k + ν_k,  ν_k ~ N(0, R)
KALMAN FILTER

Assume linear dynamics/observations and additive Gaussian noise:

x_k = F_{k-1} x_{k-1} + ω_k,  ω_k ~ N(0, Q)
y_k = H_k x_k + ν_k,  ν_k ~ N(0, R)

For linear systems there is an easy observability condition: the matrix

O_l = [ H_k ; H_{k+1} F_k ; ... ; H_{k+l} F_{k+l-1} ··· F_k ]

must be full rank for some l. Then the KF is guaranteed to work!
KALMAN FILTER

Assume linear dynamics/observations and additive Gaussian noise:

x_k = F_{k-1} x_{k-1} + ω_k,  ω_k ~ N(0, Q)
y_k = H_k x_k + ν_k,  ν_k ~ N(0, R)

Assume the current estimate is Gaussian: p(x_{k-1} | y_1, ..., y_{k-1}) = N(x^a_{k-1}, P^a_{k-1})

Forecast: linear combinations of Gaussians
Prior: p(x_k | y_1, ..., y_{k-1}) = N(x^f_k, P^f_k)
    x^f_k = F_{k-1} x^a_{k-1}
    P^f_k = F_{k-1} P^a_{k-1} F^T_{k-1} + Q

Likelihood: p(y_k | x_k, y_1, ..., y_{k-1}) = N(y^f_k, P^y_k)
    y^f_k = H_k x^f_k
    P^y_k = H_k P^f_k H^T_k + R
KALMAN FILTER

Forecast: linear combinations of Gaussians
Prior: p(x_k | y_1, ..., y_{k-1}) = N(x^f_k, P^f_k)
    x^f_k = F_{k-1} x^a_{k-1}
    P^f_k = F_{k-1} P^a_{k-1} F^T_{k-1} + Q
Likelihood: p(y_k | x_k, y_1, ..., y_{k-1}) = N(y^f_k, P^y_k)
    y^f_k = H_k x^f_k
    P^y_k = H_k P^f_k H^T_k + R

Assimilation: product of Gaussians (complete the square)

p(x_k | y_1, ..., y_k) ∝ N(x^f_k, P^f_k) N(y^f_k, P^y_k) = N(x^a_k, P^a_k)

Define the Kalman gain: K_k = P^f_k H^T_k (P^y_k)^{-1}
    x^a_k = x^f_k + K_k (y_k − y^f_k)
    P^a_k = (I − K_k H_k) P^f_k
KALMAN FILTER SUMMARY

Forecast:                  x^f_k = F_{k-1} x^a_{k-1},   y^f_k = H_k x^f_k
Covariance update:         P^f_k = F_{k-1} P^a_{k-1} F^T_{k-1} + Q,   P^y_k = H_k P^f_k H^T_k + R
Kalman gain & innovation:  K_k = P^f_k H^T_k (P^y_k)^{-1},   ɛ_k = y_k − y^f_k
Assimilation:              x^a_k = x^f_k + K_k ɛ_k,   P^a_k = (I − K_k H_k) P^f_k
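The summary above can be sketched directly in code. This is a minimal illustration of one Kalman filter step, not the authors' implementation; the function name and calling convention are our own.

```python
import numpy as np

def kalman_step(xa, Pa, y, F, H, Q, R):
    """One forecast + assimilation step of the linear Kalman filter,
    following the summary: forecast, covariance update, gain, innovation."""
    # Forecast
    xf = F @ xa
    Pf = F @ Pa @ F.T + Q
    yf = H @ xf
    Py = H @ Pf @ H.T + R
    # Kalman gain and innovation
    K = Pf @ H.T @ np.linalg.inv(Py)
    eps = y - yf
    # Assimilation
    xa_new = xf + K @ eps
    Pa_new = (np.eye(len(xf)) - K @ H) @ Pf
    return xa_new, Pa_new
```

For a scalar random walk (F = H = 1) the gain reduces to the familiar ratio P^f / (P^f + R), pulling the estimate toward the observation.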
WHAT ABOUT NONLINEAR SYSTEMS?

Consider a system of the form:

x_{k+1} = f(x_k) + ω_{k+1},  ω_{k+1} ~ N(0, Q)
y_{k+1} = h(x_{k+1}) + ν_{k+1},  ν_{k+1} ~ N(0, R)

More complicated observability condition (Lie derivatives)
Extended Kalman Filter (EKF): linearize F_k = Df(x^a_k) and H_k = Dh(x^f_k)
Problem: the state estimate x^a_k may not be well localized
Solution: Ensemble Kalman Filter (EnKF)
ENSEMBLE KALMAN FILTER (ENKF)

Generate an ensemble with the current statistics (use a matrix square root), placing the sigma points x^i_t on the semimajor axes:

x^f_t = (1/2n) Σ_i F(x^i_t)
P^f_xx = (2n−1)^{-1} Σ_i (F(x^i_t) − x^f_t)(F(x^i_t) − x^f_t)^T + Q
ENSEMBLE KALMAN FILTER (ENKF)

Calculate y^i_t = H(F(x^i_t)) and set y^f_t = (1/2n) Σ_i y^i_t.

P_yy = (2n−1)^{-1} Σ_i (y^i_t − y^f_t)(y^i_t − y^f_t)^T + R
P_xy = (2n−1)^{-1} Σ_i (F(x^i_t) − x^f_t)(y^i_t − y^f_t)^T
K = P_xy P_yy^{-1}
x^a_t = x^f_t + K (y_t − y^f_t)  and  P^a_xx = P^f_xx − K P_yy K^T
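The covariance formulas above can be sketched as follows. Note the slides use an unscented (sigma-point) ensemble; this sketch uses the common perturbed-observation EnKF variant instead, which shares the same P_xx, P_yy, P_xy and gain K = P_xy P_yy^{-1}. The function and argument names are illustrative.

```python
import numpy as np

def enkf_step(ens, y, f, h, Q, R, rng):
    """One perturbed-observation EnKF step.
    ens: state ensemble, shape (state_dim, N)."""
    n, N = ens.shape
    # Forecast each member through the model, adding process noise
    fens = np.array([f(ens[:, i]) for i in range(N)]).T
    fens = fens + rng.multivariate_normal(np.zeros(n), Q, N).T
    xf = fens.mean(axis=1, keepdims=True)
    # Map forecasts through the observation function
    yens = np.array([h(fens[:, i]) for i in range(N)]).T
    yf = yens.mean(axis=1, keepdims=True)
    # Sample covariances (R added to P_yy, as on the slide)
    Ax, Ay = fens - xf, yens - yf
    Pyy = Ay @ Ay.T / (N - 1) + R
    Pxy = Ax @ Ay.T / (N - 1)
    K = Pxy @ np.linalg.inv(Pyy)
    # Update each member against a perturbed observation
    for i in range(N):
        yi = y + rng.multivariate_normal(np.zeros(len(y)), R)
        fens[:, i] += K @ (yi - yens[:, i])
    return fens
```

With identity dynamics and observation, the updated ensemble mean moves most of the way from the prior mean toward the observation, exactly as the gain K dictates.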
PARAMETER ESTIMATION (STATE AUGMENTATION)

When the model has parameters θ, x_{k+1} = f(x_k, θ) + ω_{k+1}

Augment the state x'_k = [x_k, θ_k], with block covariance

Q' = [ Q    0
       0  Q_θ ]

Introduce trivial dynamics (dθ = √Q_θ dω_θ):

x_{k+1} = f(x_k, θ_k) + ω_{k+1}
θ_{k+1} = θ_k + ω^θ_{k+1}

Need to tune the covariance Q_θ of ω^θ
Can track slowly varying parameters, Q_θ ~ var(θ)
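The augmentation step can be sketched as a small helper that wraps any parameterized model; this is an illustration of the construction, with hypothetical names, not the authors' code.

```python
import numpy as np

def augment(f, theta_dim, Q, Q_theta):
    """Turn a parameterized model x -> f(x, theta) into an augmented model
    on w = [x; theta] with trivial parameter dynamics theta_{k+1} = theta_k,
    and build the block-diagonal process covariance diag(Q, Q_theta).
    Q_theta is the tuning knob discussed above."""
    def f_aug(w):
        x, theta = w[:-theta_dim], w[-theta_dim:]
        # Parameters are copied through unchanged; the filter's process
        # noise ω^θ ~ N(0, Q_theta) supplies their random walk.
        return np.concatenate([f(x, theta), theta])
    n = Q.shape[0]
    Q_aug = np.block([[Q, np.zeros((n, theta_dim))],
                      [np.zeros((theta_dim, n)), Q_theta]])
    return f_aug, Q_aug
```

The augmented pair (f_aug, Q_aug) can then be handed to any EnKF implementation unchanged.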
EXAMPLE OF PARAMETER ESTIMATION

Consider a network of n Hodgkin-Huxley neurons:

V̇_i = −g_Na m_i^3 h_i (V_i − E_Na) − g_K n_i^4 (V_i − E_K) − g_L (V_i − E_L) + I + Σ_{j≠i} Γ_HH(V_j) V_j
ṁ_i = a_m(V_i)(1 − m_i) − b_m(V_i) m_i
ḣ_i = a_h(V_i)(1 − h_i) − b_h(V_i) h_i
ṅ_i = a_n(V_i)(1 − n_i) − b_n(V_i) n_i
Γ_HH(V_j) = β_ij / (1 + e^{−10(V_j + 40)})

Only observe the voltages V_i
Recover all variables and the connection parameters β
EXAMPLE OF PARAMETER ESTIMATION

Can even turn connections on and off (grey dashes)
The variance estimate gives a statistical test for each connection (black dashes)
ROBUSTNESS TO MODEL ERROR?

Fit a generic spiking model (Hindmarsh-Rose):

V̇_i = a_i V_i^2 − V_i^3 − y_i − z_i + I_i + Σ_{j≠i} Γ_HH(V_j) V_j
ẏ_i = (a_i + α_i) V_i^2 − y_i
ż_i = µ_i (b_i V_i + c_i − z_i)
Γ_HH(V_j) = β_ij / (1 + e^{−10(V_j + 40)})

Observe voltages V_i from Hodgkin-Huxley!
Fit parameters to match neuron characteristics
Recover the connection parameters β
LINK DETECTION FROM NETWORKS OF MODEL NEURONS Network of Hindmarsh-Rose neurons, modeled by Hindmarsh-Rose Network of Hodgkin-Huxley neurons, modeled by Hindmarsh-Rose
LINK DETECTION FROM MEA RECORDINGS
LINK DETECTION FROM MEA RECORDINGS

MEA recording → recovered network
Percentage of the 160-second recording during which each connection was statistically significant
ENKF: INFLUENCE OF Q AND R

Simple example with full observation and diagonal noise covariances
Red indicates the RMSE of the unfiltered observations
Black is the RMSE of the optimal filter (true covariances known)

[Figure: filter RMSE as a function of the variance used in R_k (top) and in Q_k (bottom)]
ENKF: INFLUENCE OF Q AND R

Standard Kalman update:

P^f_k = F_{k-1} P^a_{k-1} F^T_{k-1} + Q_{k-1}
P^y_k = H_k P^f_k H^T_k + R_{k-1}
K_k = P^f_k H^T_k (P^y_k)^{-1}
P^a_k = (I − K_k H_k) P^f_k
ɛ_k = y_k − y^f_k = y_k − H_k x^f_k
x^a_k = x^f_k + K_k ɛ_k

[Figure: filter RMSE as a function of the variance used in R_k and in Q_k]
ADAPTIVE FILTER: ESTIMATING Q AND R

The innovations contain information about Q and R:

ɛ_k = y_k − y^f_k = h(x_k) + ν_k − h(x^f_k)
    = h(f(x_{k-1}) + ω_k) − h(f(x^a_{k-1})) + ν_k
    ≈ H_k F_{k-1} (x_{k-1} − x^a_{k-1}) + H_k ω_k + ν_k

IDEA: use the innovations to produce samples of Q and R:

E[ɛ_k ɛ^T_k] ≈ H P^f H^T + R
E[ɛ_{k+1} ɛ^T_k] ≈ H F P^e H^T − H F K E[ɛ_k ɛ^T_k]
P^e ≈ F P^a F^T + Q

In the linear case this is rigorous and was first solved by Mehra in 1970.
ADAPTIVE FILTER: ESTIMATING Q AND R

To find Q and R we estimate H_k and F_{k-1} from the ensemble and invert the equations:

E[ɛ_k ɛ^T_k] ≈ H P^f H^T + R
E[ɛ_{k+1} ɛ^T_k] ≈ H F P^e H^T − H F K E[ɛ_k ɛ^T_k]

This gives the following empirical estimates of Q_k and R_k:

P^e_k = (H_{k+1} F_k)^{-1} (ɛ_{k+1} ɛ^T_k + H_{k+1} F_k K_k ɛ_k ɛ^T_k) H^{-T}_k
Q^e_k = P^e_k − F_{k-1} P^a_{k-1} F^T_{k-1}
R^e_k = ɛ_k ɛ^T_k − H_k P^f_k H^T_k

Note: P^e_k is an empirical estimate of the background covariance.
ADAPTIVE ENKF

We combine the estimates of Q and R with a moving average.

Original Kalman equations:
P^f_k = F_{k-1} P^a_{k-1} F^T_{k-1} + Q_{k-1}
P^y_k = H_k P^f_k H^T_k + R_{k-1}
K_k = P^f_k H^T_k (P^y_k)^{-1}
P^a_k = (I − K_k H_k) P^f_k
ɛ_k = y_k − y^f_k
x^a_k = x^f_k + K_k ɛ_k

Our additional update:
P^e_{k-1} = F^{-1}_{k-1} H^{-1}_k ɛ_k ɛ^T_{k-1} H^{-T}_{k-1} + K_{k-1} ɛ_{k-1} ɛ^T_{k-1} H^{-T}_{k-1}
Q^e_{k-1} = P^e_{k-1} − F_{k-2} P^a_{k-2} F^T_{k-2}
R^e_{k-1} = ɛ_{k-1} ɛ^T_{k-1} − H_{k-1} P^f_{k-1} H^T_{k-1}
Q_k = Q_{k-1} + (Q^e_{k-1} − Q_{k-1})/τ
R_k = R_{k-1} + (R^e_{k-1} − R_{k-1})/τ
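The additional update above can be sketched as a single function: form the empirical estimates P^e, Q^e, R^e from consecutive innovations, then blend them into the running Q and R with a moving average of window τ. This is an illustrative transcription of the slide's formulas (it assumes F and H are square and invertible), not the authors' implementation.

```python
import numpy as np

def adaptive_qr_update(Q, R, eps_k, eps_km1, F_km1, H_k, H_km1, K_km1,
                       F_km2, Pa_km2, Pf_km1, tau=100.0):
    """One adaptive Q/R update step from lagged innovation products."""
    Fi, Hi = np.linalg.inv(F_km1), np.linalg.inv(H_k)
    Him1T = np.linalg.inv(H_km1).T
    # Empirical background covariance P^e_{k-1}
    Pe = Fi @ Hi @ np.outer(eps_k, eps_km1) @ Him1T \
         + K_km1 @ np.outer(eps_km1, eps_km1) @ Him1T
    # Empirical samples of Q and R
    Qe = Pe - F_km2 @ Pa_km2 @ F_km2.T
    Re = np.outer(eps_km1, eps_km1) - H_km1 @ Pf_km1 @ H_km1.T
    # Moving-average blend with window tau
    Q_new = Q + (Qe - Q) / tau
    R_new = R + (Re - R) / tau
    return Q_new, R_new
```

The single noisy samples Q^e, R^e are useless on their own; the 1/τ blending is what turns them into stable running estimates.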
ADAPTIVE FILTER: APPLICATION TO LORENZ-96

We apply the adaptive EnKF to the 40-dimensional Lorenz-96 model integrated over a time step Δt = 0.05:

dx_i/dt = (x_{i+1} − x_{i−2}) x_{i−1} − x_i + F

We augment the model with Gaussian white noise:

x_k = f(x_{k-1}) + ω_k,  ω_k ~ N(0, Q)
y_k = h(x_k) + ν_k,  ν_k ~ N(0, R)

The adaptive EnKF uses F = 8
We will consider model error where the true F_i ~ N(8, 16)
RECOVERING Q AND R, PERFECT MODEL

[Figure: true covariance, initial guess, final estimate, and difference; RMSE of the Q_k estimate vs. filter step; filter RMSE vs. filter steps]

RMSE (bottom right) for the initial guess covariances (red), the true Q and R (black), and the adaptive filter (blue)
COMPENSATING FOR MODEL ERROR

The adaptive filter compensates for errors in the forcing F_i

[Figure: true covariance, initial guess, final estimate, and difference for Q and R; relative change of (Q^f)_ii vs. site number against the model error (%) in F_i; filter RMSE vs. filter steps]

RMSE (bottom right) for the initial guess covariances (red), the perfect model (black), and the adaptive filter (blue)
OPEN PROBLEM

Determining Q_θ for parameters fails: dθ = √Q_θ dω_θ
- Often Q_θ increases unrealistically or diverges (Brownian motion)
- Non-identifiability?

Ornstein-Uhlenbeck works: dθ = α(θ̄ − θ) dt + √Q_θ dω_θ
But now we need to estimate α and θ̄!
KALMAN-TAKENS FILTER: THROWING OUT THE MODEL...

Starting with historical observations {y_0, ..., y_n}:
- Form Takens delay-embedding state vectors x_i = (y_i, y_{i-1}, ..., y_{i-d})
- Build an EnKF: apply an analog forecast to each ensemble member
- Use the observation function H x_i = y_i
- Crucial to estimate Q and R
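The delay-embedding step can be sketched in a few lines; the function name and layout are illustrative, assuming a scalar observation series.

```python
import numpy as np

def delay_embed(y, d):
    """Build Takens delay-coordinate state vectors
    x_i = (y_i, y_{i-1}, ..., y_{i-d}) from a scalar series y."""
    y = np.asarray(y)
    # Each row holds the current observation followed by d lagged values
    return np.array([y[i - d:i + 1][::-1] for i in range(d, len(y))])
```

These delay vectors play the role of the model state in the Kalman-Takens filter: the EnKF runs on them directly, with no governing equations.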
KALMAN RECONSTRUCTION

[Figure: noisy observed time series (top) and its Kalman reconstruction (bottom)]
ANALOG FORECAST

[Figure: animation frames of a time series with an analog forecast appended]
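A simple nonparametric analog forecast can stand in for the model: find the nearest delay vectors in the historical library and average their observed successors. This is a sketch of the general idea with illustrative names and an illustrative neighbor count k; the authors' forecast may differ in detail.

```python
import numpy as np

def analog_forecast(x, library, next_states, k=5):
    """Forecast the successor of delay vector x by averaging the
    successors of its k nearest neighbors in the historical library.
    library[i] is a delay vector, next_states[i] its observed successor."""
    d2 = np.sum((library - x) ** 2, axis=1)   # squared distances to x
    idx = np.argsort(d2)[:k]                  # k nearest analogs
    return next_states[idx].mean(axis=0)
```

Applied to every ensemble member, this replaces the forecast step F(x^i_t) of the EnKF, which is exactly how the Kalman-Takens filter dispenses with the model.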
KALMAN-TAKENS FILTER: LORENZ-63

[Figure: x component vs. time, EnKF with model (left) and Kalman-Takens (right)]
KALMAN-TAKENS FILTER

Comparing Kalman-Takens (red) with the full model (blue)

[Figure: RMSE vs. percent noise]
KALMAN-TAKENS FILTER: LORENZ-96

[Figure: space-time plots of the 40 Lorenz-96 sites over time 0-50]
MODEL ERROR

[Figures: Lorenz-63 (left) and Lorenz-96 (right)]
KALMAN-TAKENS FILTER

Forecast error: El Niño index

[Figure: RMSE vs. forecast steps (months)]
LIMITS OF PARAMETER ESTIMATION

Depends on model complexity and observability
Typically can estimate about 4 parameters per observation
New work: F. Hamilton et al., Hybrid modeling and prediction of dynamical systems
Problem: too many parameters, model is useless
Solution: to fit the parameters, replace the other equations with Kalman-Takens
Extends the boundaries of parameter estimation
ESTIMATING UNMODELED DYNAMICS (FRANZ)

Want to track an un-modeled variable S:

ẇ = F(w) + ω_t,   [y, ..., y, S]^T = H(w) + ν_t

Run m models with different parameters p_i:

w = [x_1, ..., x_m, c_1, ..., c_m, d]^T
F = [f(x_1, p_1), ..., f(x_m, p_m), 0, ..., 0, 0]^T
H(w) = [h(x_1), ..., h(x_m), Σ_{i,j} c_i^j x_i^j + d]^T

Fit the regression parameters c_i^j and d from training data for S.
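The regression map S ≈ Σ c_i^j x_i^j + d at the heart of this scheme can be illustrated with an offline least-squares fit. In the filter the coefficients c and d live in the augmented state and are estimated online; this sketch, with hypothetical names, only shows the fit itself.

```python
import numpy as np

def fit_unmodeled(S_train, model_states):
    """Least-squares fit of S ≈ model_states @ c + d.
    model_states: one column per model-state component (the x_i^j),
    S_train: measured values of the unmodeled variable."""
    # Append a constant column for the offset d
    A = np.column_stack([model_states, np.ones(len(S_train))])
    coef, *_ = np.linalg.lstsq(A, S_train, rcond=None)
    return coef[:-1], coef[-1]  # regression weights c, offset d
```

Once fitted, the same linear map serves as the extra observation-function component that lets the filter track S from the ensemble of model states.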
RECONSTRUCTING UNMODELED IONIC DYNAMICS

Observing seizure voltage, reconstruct unmodeled potassium and sodium dynamics (assimilation model is Hindmarsh-Rose)

J. Cressman, G. Ullah, J. Ziburkus, S. Schiff, and E. Barreto, Journal of Comp. Neuroscience 26, 159 (2009).

[Figure: actual, observed, and predicted traces]
RECONSTRUCTING EXTRACELLULAR POTASSIUM FROM AN In Vitro NETWORK

We want to track extracellular potassium dynamics in a network, but measurements are difficult and spatially limited
Extracellular potassium is an unmodeled variable
RECONSTRUCTING EXTRACELLULAR POTASSIUM FROM AN In Vitro NETWORK

The local extracellular potassium in an MEA network can be reconstructed and predicted using our approach (assimilation model is Hindmarsh-Rose)

[Figure: actual, observed, and predicted traces]
SUMMARY

EnKF is a useful data assimilation technique for neurodynamics and other types of data
- Parameter estimation
- Adaptive Q/R estimation is helpful when Q and R are unknown
Difficulties: model error, unmodeled variables, no model
- Kalman-Takens filter
- Multimodel data assimilation
REFERENCES

T. Berry, T. Sauer, Adaptive ensemble Kalman filtering of nonlinear systems. Tellus A 65, 20331 (2013)
F. Hamilton, T. Berry, N. Peixoto, T. Sauer, Real-time tracking of neuronal network structure using data assimilation. Phys. Rev. E 88, 052715 (2013)
F. Hamilton, J. Cressman, N. Peixoto, T. Sauer, Reconstructing neural dynamics using data assimilation with multiple models. Europhys. Lett. 107, 68005 (2014)
F. Hamilton, T. Berry, T. Sauer, Ensemble Kalman filtering without a model. Phys. Rev. X 6, 011021 (2016)
F. Hamilton, T. Berry, T. Sauer, Kalman-Takens filtering in the presence of dynamical noise. To appear, Eur. Phys. J.
F. Hamilton, A. Lloyd, K. Flores, Hybrid modeling and prediction of dynamical systems. Submitted.
KALMAN FILTER: FORECAST STEP

At time k−1 we have mean x^a_{k-1} and covariance P^a_{k-1}, and

x_k = F_{k-1} x_{k-1} + ω_k

Linear combinations of Gaussians are still Gaussian, so:

p(F_{k-1} x_{k-1} | y_1, ..., y_{k-1}) = N(F_{k-1} x^a_{k-1}, F_{k-1} P^a_{k-1} F^T_{k-1})
p(x_k | y_1, ..., y_{k-1}) = N(F_{k-1} x^a_{k-1}, F_{k-1} P^a_{k-1} F^T_{k-1} + Q)

Define the forecast mean: x^f_k ≡ F_{k-1} x^a_{k-1}
Define the forecast covariance: P^f_k ≡ F_{k-1} P^a_{k-1} F^T_{k-1} + Q
KALMAN FILTER: DEFINING THE LIKELIHOOD FUNCTION

Recall that y_k = H_k x_k + ν_k where ν_k ~ N(0, R) is Gaussian
The forecast distribution: p(x_k | y_1, ..., y_{k-1}) = N(x^f_k, P^f_k)
Likelihood: p(y_k | x_k, y_1, ..., y_{k-1}) = N(H_k x^f_k, H_k P^f_k H^T_k + R)

Define the observation mean: y^f_k = H_k x^f_k
Define the observation covariance: P^y_k = H_k P^f_k H^T_k + R
KALMAN FILTER: ASSIMILATION STEP

Gaussian prior × Gaussian likelihood = Gaussian posterior:

p(y|x) p(x) ∝ exp{ −(1/2)(y − Hx)^T R^{-1} (y − Hx) − (1/2)(x − x^f)^T (P^f)^{-1} (x − x^f) }
            ∝ exp{ −(1/2) x^T ((P^f)^{-1} + H^T R^{-1} H) x + x^T (H^T R^{-1} y + (P^f)^{-1} x^f) }

Posterior covariance: P^a = ((P^f)^{-1} + H^T R^{-1} H)^{-1}
Posterior mean: x^a = P^a (H^T R^{-1} y + (P^f)^{-1} x^f)
KALMAN FILTER: ASSIMILATION STEP

Kalman equations (after some linear algebra...):

Kalman gain: K_k = P^f_k H^T_k (P^y_k)^{-1}
Innovation: ɛ_k = y_k − y^f_k
Posterior mean: x^a_k = x^f_k + K_k ɛ_k
Posterior covariance: P^a_k = (I − K_k H_k) P^f_k

x^a_k is the least-squares / minimum-variance estimator of x_k