Data assimilation as an optimal control problem and applications to UQ
Walter Acevedo, Angwenyi David, Jana de Wiljes & Sebastian Reich
Universität Potsdam / University of Reading
IPAM, November 13th 2017
Numerical Weather Prediction
Model: highly nonlinear discretized partial differential equations.
Data: heterogeneous mix of ground-, airborne-, satellite-based and radar data.
24/7 data assimilation service for optimal weather prediction.
Problem setting
Model:
dx_t = f(x_t, λ) dt + σ(x_t) dw_t,  (x_0, λ) ∼ π_0.
Data/Observations:
(A) continuous-in-time: dy_t = h(x_t, λ) dt + R^{1/2} dv_t,
(B) discrete-in-time: y_{t_n} = h(x_{t_n}, λ) + R^{1/2} Ξ_{t_n}.
Goal: Approximate the conditional PDF π_t(x, λ) = π_t(x, λ | Y_t), where Y_t contains all the data up to time t and π_0 is the given distribution at initial time.
Problem setting
Time-evolved marginal distributions: [figure showing P(x | Y_{0:t}) evolving from the prior P(x) over time t]
Problem setting
Applications:
- assimilation of observations into a computer model (e.g. weather forecasting),
- assimilation of synthetic data from a high-resolution model into a model of lower resolution (e.g. parameter estimation),
- rare event simulation (data comes from possible rare event scenarios),
- solution of inverse problems,
- minimization (model dynamics is trivial, e.g., dx_t = 0).
Problem setting
Particle filters: Approximate the conditional PDF π_t by a set of particles
z^i_t = (x^i_t, λ^i_t),  i = 1, ..., M,
with weights w^i_t ≥ 0 such that Σ_i w^i_t = 1; e.g. sequential Monte Carlo.
- Eulerian: z^i_t = z^i_0 (particle locations are fixed, weights evolve); e.g. grid-based methods.
- Lagrangian: w^i_t = 1/M (weights are fixed, particles move); the subject of this talk.
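The weighted-particle approximation above can be made concrete with a minimal sketch (not from the slides): a single Bayesian update of an Eulerian-style ensemble, where particle locations stay fixed and only the weights change. The Gaussian prior, linear observation and all parameter values are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a weighted particle approximation: fixed particle
# locations, evolving importance weights (one Bayesian update step).
rng = np.random.default_rng(0)
M = 50_000

x = rng.normal(0.0, 1.0, size=M)       # particles from the prior N(0, 1)
y, R = 1.0, 0.5                        # observation y = x + noise, noise variance R

logw = -0.5 * (y - x) ** 2 / R         # log-likelihood of each particle
w = np.exp(logw - logw.max())
w /= w.sum()                           # normalized weights, sum to 1

post_mean = np.sum(w * x)              # weighted estimate of E[x | y]
# conjugate-Gaussian reference value: posterior mean = y * P0 / (P0 + R), P0 = 1
exact = y * 1.0 / (1.0 + 0.5)
```

With M = 50 000 particles the weighted estimate agrees with the conjugate-Gaussian posterior mean to within Monte Carlo error.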
Ensemble Kalman-Bucy filter
A control approach to the continuous-in-time filtering problem:
Kalman gain factor: K_t = P^{xh}_t R^{-1}, where P^{xh}_t denotes the covariance matrix between x_t and h(x_t).
Innovation:
di_t = dy_t - (1/2)(h(x_t) + h̄_t) dt  or  di_t = dy_t - h(x_t) dt - R^{1/2} du_t.
Ensemble Kalman-Bucy filter:
dx_t = f(x_t, λ) dt + σ(x_t) dw_t + K_t di_t.
Numerical implementation
(i) Draw samples x^i_0, i = 1, ..., M, from the initial distribution π_0.
(ii) Solve the interacting particle system
dx^i_t = f(x^i_t, λ) dt + σ(x^i_t) dw^i_t + K_t di^i_t,  i = 1, ..., M,
with innovation
di^i_t = dy_t - dy^i_t,  dy^i_t := h(x^i_t) dt + R^{1/2} du^i_t,
and gain factor
K_t = P^{xh}_t R^{-1},  P^{xh}_t := 1/(M-1) Σ_{i=1}^M (x^i_t - x̄_t)(h(x^i_t) - h̄_t)^T.
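Steps (i)-(ii) can be sketched for a scalar linear test model. This is a hedged illustration, not the authors' code: the model dx_t = -x_t dt + σ dw_t, dy_t = x_t dt + R^{1/2} dv_t, the deterministic innovation variant, and all parameter values are assumptions made for the example.

```python
import numpy as np

# Euler-Maruyama sketch of the ensemble Kalman-Bucy filter for
#   dx_t = -x_t dt + sigma dw_t,   dy_t = x_t dt + R^{1/2} dv_t,
# using the deterministic innovation di^i = dy - (h(x^i) + hbar)/2 dt.
rng = np.random.default_rng(1)
M, dt, nsteps = 200, 0.005, 1000
sigma, R = 1.0, 0.01

x_true = 0.0
x = rng.normal(0.0, 1.0, size=M)     # (i) ensemble drawn from pi_0 = N(0, 1)

for _ in range(nsteps):
    # reference trajectory and observed increment dy_t
    x_true += -x_true * dt + sigma * np.sqrt(dt) * rng.normal()
    dy = x_true * dt + np.sqrt(R * dt) * rng.normal()

    # empirical gain K_t = P^{xh} R^{-1}  (here h(x) = x)
    Pxh = ((x - x.mean()) ** 2).sum() / (M - 1)
    K = Pxh / R

    # (ii) interacting particle system with deterministic innovation
    di = dy - 0.5 * (x + x.mean()) * dt
    x += -x * dt + sigma * np.sqrt(dt) * rng.normal(size=M) + K * di
```

After the transient, the ensemble spread settles near the stationary Riccati variance, well below the prior's stationary variance σ²/2.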
Feedback particle filter motivation
Consider the zero-drift scalar SDE dx_t = σ(x_t) dw_t with time-evolved expectation values
π_t[f] = π_0[f] + ∫_0^t π_s[Lf] ds,  Lf := (1/2) σ ∂_x(σ ∂_x f).
Interacting particle representation:
ẋ_t = (1/2) σ(x_t) I_t(x_t),  I_t := -π_t^{-1} ∂_x(π_t σ).
It holds that (Liouville plus integration by parts)
π̂_t[f] := π_0[f] + (1/2) ∫_0^t π̂_s[(∂_x f)(σ I_s)] ds = π_0[f] + ∫_0^t π_s[Lf] ds = π_t[f].
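The deterministic particle representation above can be tried numerically. The following hedged sketch (not from the slides) takes σ constant, so that I_t = -σ ∂_x ln π_t, and uses a Gaussian ansatz with empirical ensemble moments as the score estimate; that ansatz, and all parameter values, are assumptions for illustration.

```python
import numpy as np

# Deterministic particle transport for dx_t = sigma dw_t with constant sigma:
# I_t = -sigma * d/dx ln(pi_t), and under a Gaussian ansatz pi_t ~ N(m, P)
# the flow reduces to xdot = sigma^2 (x - m) / (2 P) with empirical m, P.
rng = np.random.default_rng(2)
sigma, dt, T, M = 1.0, 1e-3, 1.0, 1000

x = rng.normal(0.0, 1.0, size=M)
P0 = x.var()                              # empirical initial variance

for _ in range(int(T / dt)):
    m, P = x.mean(), x.var()
    x += dt * 0.5 * sigma**2 * (x - m) / P   # Liouville transport of particles

# The ensemble variance should grow like P0 + sigma^2 * t, matching the
# diffusion dx_t = sigma dw_t in distribution.
```

The particles spread deterministically, yet the ensemble variance reproduces the diffusive growth P0 + σ²t of the underlying SDE.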
Feedback particle filter
Generalization of the ensemble Kalman-Bucy filter:
dx_t = f(x_t, λ) dt + σ(x_t) dw_t + K_t di_t,
with innovation di_t as before, i.e.
di_t = dy_t - (1/2)(h(x_t) + h̄_t) dt  or  di_t = dy_t - h(x_t) dt - R^{1/2} du_t,
and Kalman gain K_t = ∇_x ψ_t with
∇_x · (π̂_t ∇_x ψ_t) = -R^{-1} π̂_t (h - π̂_t[h]).
It can be shown that π_t[f] = π̂_t[f].
Feedback particle filter
Structural form of filter formulations: [table/figure omitted]
Feedback particle filter
Numerical implementation I
A) Diffusion maps: Define the elliptic operator L_π ψ := π^{-1} ∇_x·(π ∇_x ψ) and introduce the approximation
(e^{ε L_π} - Id)/ε ψ ≈ -(h - π[h]).
The semigroup e^{ε L_π} can be approximated using diffusion maps.
B) Optimal transport: Find the optimal coupling T_ε between π and π_ε = π(1 + ε(h - π[h])). Then
∇_x ψ(x) ≈ (T_ε(x) - x)/ε.
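As a sanity check on the gain equation, here is a hedged sketch (not from the talk) that solves the 1D Poisson equation ∂_x(π ∂_x ψ) = -(1/R) π (h - π[h]) by direct integration on a grid, a simple stand-in for the diffusion-map and optimal-transport constructions above. The Gaussian density, linear h and grid parameters are illustrative assumptions.

```python
import numpy as np

# Grid-based solve of the gain (Poisson) equation
#   d/dx ( pi(x) d/dx psi(x) ) = -(1/R) pi(x) (h(x) - pi[h]).
x = np.linspace(-6.0, 6.0, 4001)
dx = x[1] - x[0]
P0, R = 1.0, 1.0
pi = np.exp(-x**2 / (2 * P0)) / np.sqrt(2 * np.pi * P0)   # density of N(0, P0)

h = x                                   # linear observation map h(x) = x
hbar = np.sum(pi * h) * dx              # pi[h], approximately 0 by symmetry

# integrate once: pi(x) psi'(x) = -(1/R) * int_{-inf}^x pi(s) (h(s) - hbar) ds
flux = -np.cumsum(pi * (h - hbar)) * dx / R
grad_psi = flux / pi                    # pointwise gain K(x) = psi'(x)

# for Gaussian pi and linear h the exact gain is the Kalman gain P0 / R
core = np.abs(x) < 3.0
```

For Gaussian π and linear h the feedback particle filter gain reduces to the constant Kalman gain P0/R, which the grid solution reproduces away from the low-density tails.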
Feedback particle filter
Numerical implementation II
The time evolution of an ensemble of M particles x^i_t is given by
dx^i_t = f(x^i_t, λ) dt + σ(x^i_t) dw^i_t + K^i_t di^i_t.
Gain:
K^i_t := Σ_{j=1}^M x^j_t d_{ji} ≈ ∇_x ψ_t(x^i_t).
Innovation:
di^i_t = dy_t - (1/2)(h(x^i_t) + h̄_t) dt  or  di^i_t = dy_t - h(x^i_t) dt - R^{1/2} du^i_t.
References
EnKBF: SR (BIT, 2011); Bergemann & SR (Meteorol. Zeitschrift, 2012)
FPF: de Wiljes, Stannat & SR (arXiv:1612.06065, 2016); Del Moral, Kurtzmann & Tugaut (arXiv:1606.08256, 2016); Yang, Mehta & Meyn (IEEE Trans. Autom. Contr., 2013); Taghvaei, de Wiljes, Mehta & SR (arXiv:1702.07241); David, de Wiljes & SR (arXiv:1709.09199)
Alternative control formulation: Crisan & Xiong (ESAIM, 2007); Crisan & Xiong (Stochastics, 2010)
Bayesian inference
Bayes: π(x | y) ∝ π(y | x) π(x)
Task: Given a set of samples x^i_0 ∼ π(x), produce an estimator for expectation values with respect to the posterior distribution.
Application: Learn models y_out = Ψ_x(y_in), parametrized by x, which give rise to the likelihood π(y | x).
Standard methodologies:
- importance sampling
- MCMC
- homotopy methods: learning models by making them interact
Homotopy method I
Homotopy: Define the family of distributions
π_α(x) ∝ e^{αL(x)} π(x), where L(x) := ln π(y | x) and α ≥ 0.
The posterior is obtained for α = 1. The function L can also stand for a tilting potential in rare event simulation / importance sampling.
Dynamic formulation:
Bayes: ∂_α π_α = π_α (L - π_α[L]),  π[e^L] = exp(∫_0^1 π_α[L] dα),
Liouville: ∂_α π_α = -∇_x·(π_α ∇_x ψ_α),
with parameter dynamics ẋ = ∇_x ψ_α(x).
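A quick check of the "Bayes" identity above, writing Z_α := π[e^{αL}] for the normalization constant:

```latex
\pi_\alpha = \frac{e^{\alpha L}\,\pi}{Z_\alpha},
\qquad
\partial_\alpha Z_\alpha = \pi\!\left[L\,e^{\alpha L}\right] = Z_\alpha\,\pi_\alpha[L],
\qquad\Longrightarrow\qquad
\partial_\alpha \pi_\alpha
= L\,\pi_\alpha - \pi_\alpha\,\frac{\partial_\alpha Z_\alpha}{Z_\alpha}
= \pi_\alpha\bigl(L - \pi_\alpha[L]\bigr).
```

Integrating ∂_α ln Z_α = π_α[L] from α = 0 to α = 1 also recovers the normalization identity for π[e^L].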
Homotopy method II
Particle flow filter:
ẋ^i_α = ∇_x ψ_α(x^i_α),  x^i_0 ∼ π(x),  α ∈ [0, 1],
with potential ψ_α solving
∇_x·(π_α ∇_x ψ_α) = -π_α (L - π_α[L]).
Numerical implementation:
- diffusion maps
- optimal transportation
- (iterative) ensemble Kalman filter
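The ensemble Kalman filter realization of the particle flow can be sketched for a linear-Gaussian toy problem. This is a hedged illustration under stated assumptions: for a Gaussian prior and linear observation y = x + noise, the flow ẋ = ∇_x ψ_α reduces to the affine update dx/dα = -(1/2) P_α R^{-1}(x + x̄_α - 2y); the specific prior, observation and step size are illustrative choices.

```python
import numpy as np

# Ensemble-Kalman realization of the homotopy particle flow for
#   prior N(m0, P0),  observation y = x + noise,  noise variance R.
# The flow is affine: dx/dalpha = -(1/2) P_alpha / R * (x + xbar - 2 y),
# with P_alpha the empirical ensemble variance.
rng = np.random.default_rng(3)
M, R, y = 1000, 0.5, 1.0
x = rng.normal(0.0, 1.0, size=M)         # samples from the prior

m0, P0 = x.mean(), x.var()               # empirical prior moments
dalpha = 1e-4
for _ in range(int(1 / dalpha)):         # homotopy parameter alpha: 0 -> 1
    P = x.var()
    x += -0.5 * dalpha * (P / R) * (x + x.mean() - 2 * y)

# conjugate-Gaussian posterior computed from the same empirical prior moments
P_post = 1.0 / (1.0 / P0 + 1.0 / R)
m_post = P_post * (m0 / P0 + y / R)
```

At α = 1 the transported ensemble matches the conjugate-Gaussian posterior mean and variance up to the Euler discretization error in α.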
Homotopy method III
Extensions:
Randomized particle flow:
dx_α = ε ∇_x ln π_α(x) dα + (2ε)^{1/2} dw_α + ∇_x ψ_α(x) dα
     = (∇_x ψ_α(x) + ε ∇_x ln π_0(x) + εα ∇_x L(x)) dα + (2ε)^{1/2} dw_α,
for α ≥ 0. Here ε ≥ 0 is a parameter. It still holds that
∂_α π_α = π_α (L - π_α[L]).
Minimization: Method for finding
x* = arg min V(x)
in the limit α → ∞, with π_α ∝ e^{-αV} π_0.
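The minimization limit can be illustrated with a hedged sketch that keeps only the randomized part of the flow (ψ_α is dropped here purely for illustration, an assumption not made on the slide): Langevin-type dynamics targeting π_α ∝ e^{-αV} π_0 while α grows, for the illustrative choices V(x) = (x - 2)² and π_0 = N(0, 4).

```python
import numpy as np

# Randomized flow only: dx = eps * grad ln pi_alpha(x) dalpha + sqrt(2 eps) dw,
# with pi_alpha ∝ exp(-alpha V) pi_0, V(x) = (x - 2)^2, pi_0 = N(0, 4).
# As alpha grows, particles concentrate near argmin V = 2.
rng = np.random.default_rng(4)
M, eps, dalpha = 500, 0.1, 0.01
x = rng.normal(0.0, 2.0, size=M)          # samples from pi_0

alpha = 0.0
for _ in range(5000):                     # alpha: 0 -> 50
    alpha += dalpha
    grad_log_pi = -x / 4.0 - alpha * 2.0 * (x - 2.0)   # grad ln pi_0 - alpha grad V
    x += eps * grad_log_pi * dalpha + np.sqrt(2 * eps * dalpha) * rng.normal(size=M)
```

Because α increases slowly relative to the relaxation rate of the dynamics, the ensemble stays close to π_α and contracts onto the minimizer, in the spirit of simulated annealing.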
References
Particle flow: Moser (1965); Daum et al. (SPIE, 2010); Daum et al. (SPIE, 2017); SR (BIT, 2011); Heng, Doucet & Pokern (arXiv:1509.08787, 2015)
Extensions:
- iterative Kalman filters (Dean Oliver, Marc Bocquet, Marco Iglesias, SR, ...)
- sequential Monte Carlo & rejuvenation (..., Beskos, Crisan & Jasra (Annals of Applied Prob., 2014), ...)
- Kalman filter as minimizer (Schillings & Stuart (arXiv:1602.02020, 2016))
- stochastic minimization techniques / simulated annealing (...)
Conclusions
- The ensemble Kalman filter has triggered renewed interest in Lagrangian filtering and data assimilation.
- Stability and accuracy of the resulting interacting particle systems are largely unknown.
- There are many applications outside the classic filtering context, such as rare event simulation and, more generally, whenever a change of measure arises.
- Exploitation of geometric properties? E.g. data assimilation for highly oscillatory Hamiltonian systems (Reinhardt, Hastermann, Klein, SR, arXiv:1708.03570).