Data assimilation Schrödinger s perspective

Similar documents
Data assimilation as an optimal control problem and applications to UQ

EnKF-based particle filters

arxiv: v2 [math.na] 22 Sep 2016

Gaussian Process Approximations of Stochastic Differential Equations

A nonparametric ensemble transform method for Bayesian inference

Lagrangian Data Assimilation and Manifold Detection for a Point-Vortex Model. David Darmon, AMSC Kayo Ide, AOSC, IPST, CSCAMM, ESSIC

Fundamentals of Data Assimilation

Smoothers: Types and Benchmarks

Stochastic Collocation Methods for Polynomial Chaos: Analysis and Applications

A Note on the Particle Filter with Posterior Gaussian Resampling

Gaussian Process Approximations of Stochastic Differential Equations

Lecture 6: Bayesian Inference in SDE Models

Convergence of the Ensemble Kalman Filter in Hilbert Space

Lecture 6: Multiple Model Filtering, Particle Filtering and Other Approximations

State Space Models for Wind Forecast Correction

Lagrangian Data Assimilation and Its Application to Geophysical Fluid Flows

A variational radial basis function approximation for diffusion processes

State and Parameter Estimation in Stochastic Dynamical Models

A State Space Model for Wind Forecast Correction

Ergodicity in data assimilation methods

An introduction to data assimilation. Eric Blayo University of Grenoble and INRIA

The Ensemble Kalman Filter:

EVALUATING SYMMETRIC INFORMATION GAP BETWEEN DYNAMICAL SYSTEMS USING PARTICLE FILTER

A variance limiting Kalman filter for data assimilation: I. Sparse observational grids II. Model error

Hidden Markov Models for precipitation

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Data assimilation in high dimensions

Human Pose Tracking I: Basics. David Fleet University of Toronto

Variational inference

Relative Merits of 4D-Var and Ensemble Kalman Filter

Particle Filters in High Dimensions

Gaussian processes for inference in stochastic differential equations

Ensemble Kalman Filter

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 13: SEQUENTIAL DATA

Seminar: Data Assimilation

EnKF and Catastrophic filter divergence

Fundamentals of Data Assimila1on

Data assimilation in high dimensions

Sequential Monte Carlo Samplers for Applications in High Dimensions

(Extended) Kalman Filter

Introduction to Particle Filters for Data Assimilation

CSC487/2503: Foundations of Computer Vision. Visual Tracking. David Fleet

Controlled sequential Monte Carlo

Expectation propagation for signal detection in flat-fading channels

Auxiliary Particle Methods

What do we know about EnKF?

A new Hierarchical Bayes approach to ensemble-variational data assimilation

Introduction to Machine Learning

Short tutorial on data assimilation

EnKF and filter divergence

Large-scale Ordinal Collaborative Filtering

WHITE NOISE APPROACH TO FEYNMAN INTEGRALS. Takeyuki Hida

Par$cle Filters Part I: Theory. Peter Jan van Leeuwen Data- Assimila$on Research Centre DARC University of Reading

Feedback Particle Filter and its Application to Coupled Oscillators Presentation at University of Maryland, College Park, MD

Pattern Recognition and Machine Learning

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Imprecise Filtering for Spacecraft Navigation

Statistical Inference and Methods

Nonparametric Drift Estimation for Stochastic Differential Equations

Expectation Propagation in Dynamical Systems

Data assimilation in the geosciences An overview

Filtering the Navier-Stokes Equation

Relationship between Singular Vectors, Bred Vectors, 4D-Var and EnKF

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering

4. DATA ASSIMILATION FUNDAMENTALS

At A Glance. UQ16 Mobile App.

(Exact and efficient) Simulation of Conditioned Markov Processes

Stochastic Spectral Approaches to Bayesian Inference

Markov chain Monte Carlo methods for visual tracking

2D Image Processing (Extended) Kalman and particle filter

The Kalman Filter ImPr Talk

Convective-scale data assimilation in the Weather Research and Forecasting model using a nonlinear ensemble filter

arxiv: v1 [stat.me] 6 Nov 2013

Lecture 13 : Variational Inference: Mean Field Approximation

Four-Dimensional Ensemble Kalman Filtering

DATA ASSIMILATION FOR FLOOD FORECASTING

Chapter 2 Assimilating data into scientific models: An optimal coupling perspective

Numerical Solutions of ODEs by Gaussian (Kalman) Filtering

Bayesian Inverse problem, Data assimilation and Localization

Der SPP 1167-PQP und die stochastische Wettervorhersage

A Spectral Approach to Linear Bayesian Updating

A Maximum Entropy Method for Particle Filtering

Lecture 7: Optimal Smoothing

Adaptive ensemble Kalman filtering of nonlinear systems

A Note on Auxiliary Particle Filters

Bayesian Calibration of Simulators with Structured Discretization Uncertainty

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

Maximum Likelihood Ensemble Filter Applied to Multisensor Systems

Sequential Monte Carlo Methods for Bayesian Computation

Four-dimensional ensemble Kalman filtering

arxiv: v1 [physics.ao-ph] 23 Jan 2009

Data assimilation with and without a model

Methods of Data Assimilation and Comparisons for Lagrangian Data

Relationship between Singular Vectors, Bred Vectors, 4D-Var and EnKF

Fundamentals of Data Assimila1on

Ensemble forecasting and flow-dependent estimates of initial uncertainty. Martin Leutbecher

Particle Filters. Outline

A nested sampling particle filter for nonlinear data assimilation

At-A-Glance SIAM Events Mobile App

Sampling the posterior: An approach to non-gaussian data assimilation

Transcription:

Data assimilation Schrödinger s perspective Sebastian Reich (www.sfb1294.de) Universität Potsdam/ University of Reading IMS NUS, August 3, 218 Universität Potsdam/ University of Reading 1

Core components of DA Mean Field Equations Interacting Particle Systems Coupling of Measures Universität Potsdam/ University of Reading 2

Entropic couplings Universität Potsdam/ University of Reading 3

Ensemble prediction I Ensemble prediction system with M members: d dt Zi t = f (Zi t ), Zi π, i = 1,..., M. Source: ECMWF Universität Potsdam/ University of Reading 4

Ensemble prediction II Source: The quiet revolution of numerical weather prediction, Nature, 215 Universität Potsdam/ University of Reading 5

Ensemble based data assimilation Continuous in time assimilation of precipitation data y t : d dt Zi t = f (Zi t ) + α 1Q t (Z i t Z t) + α 2 K t (y t h(z i t )) Additional terms: Inflation: α 1 >, Q t R N z N z spd, Z t = 1 M i Z i t Nudging: α 2 >, gain matrix K t R N z N y, forward operator h. Universität Potsdam/ University of Reading 6

SDEs & inflation I Forward SDE X + π, t [, T], W + t dz + t = f t (Z + t )dt + γ1/2 dw + t, standard Brownian motion forward in time. Generates probability measure P [,T] over C([, T], R N z) with marginal densities π t, i.e. Z t π t. The same measure is generated by backward SDE dz t = b t (Z t )dt + γ1/2 dw t, W t Brownian motion backward in time, X T π T. It holds that b t (z) = f t (z) γ z log π t (z). Universität Potsdam/ University of Reading 7

SDEs & inflation II Fokker-Planck equation for marginals: t π t = z (π t f t ) + γ 2 zπ t = z (π t b t ) γ 2 zπ t = z (π t u t ) with u t (z) = 1 2 (f t(z) + b t (z)) = f t (z) γ 2 z log π t (z). Replace forward SDE by mean field equation d dt Z t = f t (Z t ) γ 2 z log π t (Z t ), Z π. Remark. Generates path measure Q [,T] which is different from SDE measure P [,T] ; only marginals π t agree! Universität Potsdam/ University of Reading 8

SDEs & inflation III Lagrangian interacting particles (Gaussian approximation to π t ): d dt Zi t = f t(z i t ) + γ 2 (P t) 1 (Z i t Z t), Z i π, i = 1,..., M, empirical covariance matrix Connection to inflation: P t = 1 M 1 (Z i t Z t)(z i t Z t) T. i Q t = (P t ) 1, α 1 = γ/2. Universität Potsdam/ University of Reading 9

References Daley, R., Atmospheric Data Analysis, Cambridge University Press, 1991. Nelson, E., Dynamical Theories of Brownian Motion, Princeton University Press, 1967. del Moral, P., Mean field simulation for Monte Carlo integration, CRC Press, 213. SR, Cotter, J., Probabilistic Forecasting and Bayesian Data Assimilation, Cambridge University Press, 215. Law, K. et al., Data Assimilation A Mathematical Introduction, Springer, 215. Qiang Liu, Stein Variational Gradient Descent as Gradient Flow, arxiv:174.752v2, 217. Universität Potsdam/ University of Reading 1

Recap: Assimilation of precipitation data Continuous in time assimilation of precipitation data y t : d dt Zi t = f (Zi t ) + α 1Q t (Z i t Z t) + α 2 K t (y t h(z i t )) Inflation: α 1 >, Q t R N z N z spd, Z t = 1 M i Z i t Nudging: α 2 >, gain matrix K t R N z N y, forward operator h. Universität Potsdam/ University of Reading 11

SDEs & nudging I Given a likelihood function T L(z [,T] ) := exp V t (z t )dt. For example V t (z) = β 2 h(z) y t 2. Bayes theorem (Radon Nikodym): d P [,T] dp [,T] (z [,T] ) := L(z [,T]) P [,T] [L]. The measure P [,T] solves the filtering/smoothing problem of SDE inference. Universität Potsdam/ University of Reading 12

SDEs & nudging II Mean field formulation: d Z t = f t ( Z t ) + P t z ψ t ( Z t ) dt + γdw t with the potential ψ t satisfying the elliptic PDE z ( π t P t z ψ t ) = π t (V t V t ) Z π = π, V t = π t [V t ], P t = cov ( Z t ). Universität Potsdam/ University of Reading 13

SDEs & nudging III If π t Gaussian and h(z) = Hz in V t, then P t z ψ t (z) = βp t H T y t Hz + HZ t. 2 Compare to nudging scheme: α 2 K t (y t Hz), i.e., K t = P t H T, β = α 2, but innovation different. Universität Potsdam/ University of Reading 14

Combining nudging and inflation Ensemble Kalman Bucy filter (Gaussian approximation to π t ): d dt Zi t = f t(z i t ) + γ 2 (P t) 1 (Z i t Z t) + βk t y t h(zi t ) + h t 2 Z i π, i = 1,..., M, empirical covariance matrices P t = 1 M 1 K t = 1 M 1 (Z i t Z t)(z i t Z t) T, i (Z i t Z t)(h(z i t ) h t) T. Remark. Feedback particle filter for likelihood with i V t (z)dt 1 2 h(z) 2 dt h(z) T dy t. Universität Potsdam/ University of Reading 15

References Moser, J., On the volume elements on a manifold, Trans. Amer. Math. Soc., 1965. SR, A dynamical systems framework for intermittent data assimilation, BIT, 21. Daum, F., Huang, J., Particle filter for nonlinear filters, ICASSP, IEEE, 211. Bergemann, K, SR, An ensemble Kalman Bucy filter for continuous time data assimilation, Meteorolog. Zeitschrift, 212. Yang, T. et al., Feedback particle filter, IEEE Trans. Autom. Control, 213. Taghvaei, A. et al., Kalman filter and its modern extensions for the continuous time nonlinear filtering problem, J. Dyn. Sys., Meas., and Control, 218. de Wiljes, et al., Long time stability and accuracy of the ensemble Kalman Bucy filter for fully observed processes and small measurement noise, SIAM J. Appl. Dyn. Sys., 218. Universität Potsdam/ University of Reading 16

Discrete time observations I State Model State Model State Assimilation Assimilation Assimilation Data Data Data time Universität Potsdam/ University of Reading 17

Discrete time observations II Discrete time observations: y tn = h(z tn ) + R 1/2 Ξ tn, n = 1,..., N. Likelihood function: L(z [,T] ) := exp 1 (y tn h(z tn )) T R 1 (y tn h(z tn )). 2 n Bayes: dp + [,T] (Z + L(Z [,T] dp ) := [,T] ) [,T] P [,T] [L]. Universität Potsdam/ University of Reading 18

Discrete time observations III For simplicity: single observation, i.e. N = 1, R = I, t 1 = T, L(z) = 1 2 y T h(z) 2. But keep recursive nature of sequential DA in mind! Four main players: last analysis: π forecast based on last analysis: π T new analysis at time t = T (Bayes, filtering distribution): π T smoothing distribution at t = : π Universität Potsdam/ University of Reading 19

Discrete time observations IV Prediction 1 Smoothing Schrödinger Filtering b Optimal Control b 1 Data y 1 t = t =1 Time Universität Potsdam/ University of Reading 2

Example: Gaussian mixture I π 1 is a Gaussian mixture, Gaussian likelihood, π 1 weighted Gaussian mixture..1 prior.5 prediction.8.4.6.3.4.2.2.1.2-1 -.5.5 1 space smoothing -4-2 2 4 space filtering 1.5.15.1.5 1.5-1 -.5.5 1 space -4-2 2 4 space Universität Potsdam/ University of Reading 21

Example: Gaussian mixture II 2.5 optimal control proposals 2.5 Schroedinger proposal 2 2 1.5 1.5 1 1.5.5-2 -1.5-1 -.5.5 1 1.5 space -2-1.5-1 -.5.5 1 1.5 space Applied to smoothing PDF π. Applied to π directly! Universität Potsdam/ University of Reading 22

Example: Brownian dynamics Scalar Brownian dynamics under a double well potential (γ =.5): 2 1 4 last analysis 4 1 4 forecast 1 initial mean observation 1.5 1 3 2 8.5 1 potential 6 4-5 5 4 1 4 smoother -5 5 6 1 4 new analysis 2 3 2 4 1 2-2 -1.5-1 -.5.5 1 1.5 2 variable z -5 5-5 5 The forecast and the new analysis at T =.5 are nearly singular with respect to each other. The relation between the last analysis (π ) and the smoother ( π ) is somewhat better. Universität Potsdam/ University of Reading 23

Discrete time observations III Forward-backward smoother iteration: Forward: Z + π. Yields π t. Backward: dz + t = f (Z + t )dt + γw + t, d Z t with Z T π T and Yields π t. = f ( Z t )dt γ z log π t ( Z t )dt + γw t, π T (z) L(z)π T (z). Smoother: d Z + t Z + π, Z + T π T. = f ( Z + t )dt + γ z log π t π t ( Z + t )dt + γw + t Universität Potsdam/ University of Reading 24

Example: Brownian dynamics II Scalar Brownian dynamics under a double well potential (γ =.5): 2 1 4 last analysis 4 1 4 forecast 1 initial mean observation 1.5 1 3 2 8.5 1 potential 6 4-5 5 4 1 4 smoother -5 5 6 1 4 new analysis 2 3 2 4 1 2-2 -1.5-1 -.5.5 1 1.5 2 variable z -5 5-5 5 The forward smoother SDE links the smoother measure π with π T. Still requires transforming π into π (but now at t = ). Universität Potsdam/ University of Reading 25

Schrödinger problem of DA I A different perspective on sequential DA: Schrödinger problem. Find the measure P [,T] which minimises the Kullback-Leibler divergence 2 1.5 1 1 4 last analysis 4 3 2 1 4 forecast P [,T] = arg inf Q P KL(Q [,T] P [,T] ) subject to the constraints π = q = π, π T = q T = π T..5-5 5 4 3 2 1 1 4 smoother -5 5 1-5 5 6 4 2 1 4 new analysis -5 5 The measure P [,T] is generated by a controlled SDE d Z + t = f ( Z + t )dt + u t( Z + t )dt + γdw + t. Universität Potsdam/ University of Reading 26

Schrödinger problem of DA II Find an initial distribution ϕ + and its evolution ϕ+ t under the forward SDE dz + t = f (Z + t ) dt + γdw + t such that the associated backward SDE dz = f (Z t t ) γ z log ϕ + t (Z t ) dt + γdw t with final condition ϕ T := π T leads to marginals ϕ t such that Then the control solves the Schrödinger problem. ϕ = π. u t = γ z log ϕ t ϕ + t Universität Potsdam/ University of Reading 27

Example I: Lorenz 63 Starting point: (i) Euler-Maruyama discretization Z n+1 = Z n + f (Z n ) t + (γ t) 1/2 Ξ n giving rise to transition kernel q + (z n+1 z n ). (ii) Given M samples z j n from Z n and M samples z i n+1 from Z n+1 define the M M matrix Q by q ij = q + (z i n+1 zj n ). (iii) Observations at time t n+1 lead to importance weights w i n+1. Universität Potsdam/ University of Reading 28

Example II: Lorenz 63 Schrödinger problem: subject to M P = arg min KL(P Q), KL(P Q) := p ij log p ij, P Π q i,j=1 ij Π = Application to DA P R M M : p ij, M p ij = 1/M, i=1 M p ij = w i j=1 a) Schrödinger resample: Use P to sample from π n+1 P[ Z j n+1 ] = zi n+1 ] = Mp ij. b) Schrödinger transform: Set M z j n+1 = M z i n+1 p ij i=1 + (γ t)ξ j. Universität Potsdam/ University of Reading 29

Example III: Lorenz 63 1-2 RMS errors as a function M RMSE 1-3 particle filter Schrodinger transform Schrodinger resample EnKBF.5.1.15.2 1/M Stochastic Lorenz 63, observations at every time step, observation error scaled with inverse time step. Universität Potsdam/ University of Reading 3

References Schrödinger, E., Über die Umkehrung der Naturgesetze, Sitzungsberichte der Preuss. Phys. Math. Klasse, 1932. Yongxin Chen et al., On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint, J. Optim. Theory and Appl., 216. Fearnhead, P., Künsch, H.R., Particle filters and data assimilation, Annual Review of Statistics and Its Application, 218. Guarniero et al., The iterated auxiliary particle filter, J. American Statist. Assoc., 217. van Leeuwen, P.-J., Nonlinear data assimilation, Frontiers in Applied Dynamical Systems, Vol. 2, 215. Peyre, G., Cuturi, M., Computational Optimal Transport, arxiv:183.567, 218. SR, Data assimilation, Acta Numerica, 219, preliminary version on arxiv 187.8351. Universität Potsdam/ University of Reading 31

OT meets IS I Importance sampling (IS): Available realisations Z i T π T with importance weights w i π T π T (Z i T ). Optimal transport (OT): Instead of resampling, find coupling/transformation Z T = z ψ(z T ), Z T π T and Z π T. More abstractly, Z T (a) = Z T (a )δ(a a ψ(a))da, where A is some random reference variable. For example, A = Z T. Universität Potsdam/ University of Reading 32

OT meets IS II Replace the integral by a sum and formally write M Z j T = Z i T d ij i=1 Requires M 1 d ij = 1, d ij = w i. M i=1 j Select an "optimal" transformation through maximising correlation V(D) = 1 d ij Z i M T Zj T = 1 Z j T M Zj T. ij j In addition, either d ij (Ensemble Transform Particle Filter) or 1 M ( Z i M 1 T z T )( Z i T z M T ) T = w i (Z i T z T )(Z i T z T ) T i=1 i=1 (Nonlinear Ensemble Transform Filter). Universität Potsdam/ University of Reading 33

Numerical example I Lorenz-63 model, first component observed infrequently ( t =.12) and with large measurement noise (R = 8): Figure: RMSEs for various second-order accurate LETFs compared to the ETPF, the ESRF, and the SIR PF as a function of the sample size, M. Universität Potsdam/ University of Reading 34

Numerical example II Hybrid filter: P := P ESRF (α) P ETPF (1 α). Figure: RMSEs for hybrid ESRF (α = ) and 2nd-order corrected NETF/ETPF (α = 1) as a function of the sample size, M. Universität Potsdam/ University of Reading 35

References Evensen, G., Data assimilation the ensemble Kalman filter, Springer, 29. SR, A nonparametric ensemble transform method for Bayesian inference, SIAM J. Sci. Comput., 213. SR, Cotter, J., Probabilistic Forecasting and Bayesian Data Assimilation, Cambridge University Press, 215. Tödter J., Ahrens, B., A second order exact ensemble square root filter for nonlinear data assimilation, Month. Weather Rev., 215. Chustagulprom, N. et al., A hybrid ensemble transform particle filter for nonlinear and spatially extended systems, SIAM/ASA J. UQ, 216. Acevedo, W., et al., Second order accurate ensemble transform particle filters, SIAM J. Sci. Comput., 217. Fearnhead, P., Künsch, H.R., Particle filters and data assimilation, Annual Review of Statistics and Its Application, 218. Universität Potsdam/ University of Reading 36

Summary Continuous in time DA naturally leads to various interacting particle systems. Schrödinger problem provides an "optimal" mathematical framework for sequential DA with discrete in time observations. Numerical implementation nontrivial; good drift corrections can be derived using Gaussian approximations or kernel methods. Coupling arguments are central to derivation of interacting particle systems. Relevant to rare event simulations, optimal control problems and derivative free optimization. Universität Potsdam/ University of Reading 37

The End (source: J.B. Handelsman, New Yorker, 5/31/1969) Universität Potsdam/ University of Reading 38

Collaborators Walter Acevedo Kay Bergemann Yuan Cheng Nawinda Chustagulprom Colin Cotter Jana de Wiljes Prashant Mehta Wilhelm Stannat Amari Taghvaei Universität Potsdam/ University of Reading 39