Particle Filters Part I: Theory. Peter Jan van Leeuwen, Data-Assimilation Research Centre (DARC), University of Reading

Transcription:

Particle Filters Part I: Theory. Peter Jan van Leeuwen, Data-Assimilation Research Centre (DARC), University of Reading. Reading, July 2013

Why Data Assimilation? Prediction. Model improvement: - parameter estimation - parameterisation estimation. Increase our understanding.

Data Assimilation Ingredients. Prior knowledge, the stochastic model. Observations. The relation between the two.
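A minimal sketch of these ingredients in equations; the notation (f, H, Q, R) is assumed here for illustration and is not taken from the slide:

```latex
% Prior knowledge, the stochastic model, with model error beta^n:
x^n = f(x^{n-1}) + \beta^n, \qquad \beta^n \sim N(0, Q)
% Observations at time n (M of them):
y^n \in \mathbb{R}^M
% Relation between the two, with observation operator H and observation error epsilon^n:
y^n = H(x^n) + \epsilon^n, \qquad \epsilon^n \sim N(0, R)
```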

Data assimilation: general formulation. Bayes' theorem: the solution is a pdf! No inversion needed!
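For reference, Bayes' theorem in the form used throughout these slides (standard notation, assumed here):

```latex
p(x \mid y) \;=\; \frac{p(y \mid x)\, p(x)}{p(y)},
\qquad
p(y) \;=\; \int p(y \mid x)\, p(x)\, dx .
```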

Motivation for ensemble methods: efficient propagation of the pdf in time (for nonlinear models).

How is DA used today in geosciences? Present-day data-assimilation systems are based on linearizations, and state covariances are essential. 4DVar and the representer method (PSAS): Gaussian pdfs, solve for the posterior mode, need the error covariance of the initial state (the B matrix), and provide no posterior error covariances. (Ensemble) Kalman filter: assumes Gaussian pdfs for the state, approximates the posterior mean and covariance, doesn't minimize anything in nonlinear systems, and needs inflation and localisation. Combinations of these: hybrid methods.

Nonlinear Data Assimilation: Metropolis-Hastings, Langevin sampling, hybrid Monte Carlo, particle filters/smoothers, combinations of MH and PF. All try to sample from the posterior pdf, either the joint-in-time pdf or the marginal. Only the particle filter/smoother does this sequentially in time.

Nonlinear filtering: the particle filter. Use an ensemble together with weights.

What are these weights? The weight is the normalised value of the pdf of the observations given the model state. For Gaussian-distributed observation errors it is given by the expression below. One can just calculate this value. That is all!
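A sketch of the weight formula, assuming Gaussian observation errors with covariance R and observation operator H (standard particle-filter weights, consistent with the slide's description):

```latex
w_i \;=\; \frac{p(y \mid x_i)}{\sum_{j=1}^{N} p(y \mid x_j)},
\qquad
p(y \mid x_i) \;\propto\; \exp\!\left[-\tfrac{1}{2}\,\big(y - H(x_i)\big)^{T} R^{-1}\, \big(y - H(x_i)\big)\right].
```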

No explicit need for state covariances. 3DVar and 4DVar need a good error covariance of the prior state estimate: complicated. The performance of ensemble Kalman filters relies on the quality of the sample covariance, forcing artificial (?) inflation and localisation. The particle filter doesn't have this problem, but...

Standard particle filter. The standard particle filter is degenerate for moderate ensemble sizes in moderate-dimensional systems.

Particle filter degeneracy: resampling. With each new set of observations the old weights are multiplied by the new weights. Very soon only one particle has all the weight. Solution: resampling: duplicate high-weight particles and abandon low-weight particles.

Standard particle filter.

A simple resampling scheme. 1. Put all weights after each other on the unit interval [0, 1], as segments w1, w2, ..., w10. 2. Draw a random number from the uniform distribution over [0, 1/N], in this case with 10 members over [0, 1/10]. 3. Put that number on the unit interval: its end point is the first member drawn. 4. Add 1/N to the end point: the new end point is our second member. Repeat this until N new members are obtained. 5. In our example we choose m1 twice, m2 twice, m3, m4, m5 twice, m6 and m7.
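A minimal Python sketch of this resampling scheme (systematic resampling); the function name and the example weights are illustrative, not from the slides:

```python
import numpy as np

def systematic_resampling(weights, rng=None):
    """Systematic resampling, following the recipe on this slide.

    weights : array of N normalised particle weights (summing to one).
    Returns the indices of the N members that are kept (with duplicates).
    """
    rng = np.random.default_rng() if rng is None else rng
    N = len(weights)
    cumulative = np.cumsum(weights)       # weights laid out on the unit interval
    start = rng.uniform(0.0, 1.0 / N)     # one draw from U[0, 1/N]
    pointers = start + np.arange(N) / N   # end points spaced 1/N apart
    # Each pointer selects the member whose weight segment it falls into.
    return np.searchsorted(cumulative, pointers)

# Example with ten members: high-weight members are duplicated,
# low-weight members are abandoned.
w = np.array([0.16, 0.14, 0.05, 0.02, 0.20, 0.03, 0.18, 0.07, 0.11, 0.04])
print(systematic_resampling(w))
```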

A closer look at the weights I. Probability space in large-dimensional systems is 'empty': the curse of dimensionality. [Figure: state space spanned by u(x1), u(x2) and T(x3).]

A closer look at the weights II. Assume particle 1 is at 0.1 standard deviations of each of M independent observations. Assume particle 2 is at 0.2 standard deviations of the M observations. The weights of particle 1 and particle 2 can then be written down directly (see the worked example after the next slide).

A closer look at the weights III. The ratio of the weights follows directly; take M = 1000 to find an enormous ratio. Conclusion: the number of independent observations is responsible for the degeneracy in particle filters.
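A worked version of the calculation on slides II and III, assuming independent Gaussian observation errors:

```latex
w_1 \propto \exp\!\left(-\tfrac{1}{2} M (0.1)^2\right) = e^{-0.005\,M},
\qquad
w_2 \propto \exp\!\left(-\tfrac{1}{2} M (0.2)^2\right) = e^{-0.02\,M},
\qquad
\frac{w_1}{w_2} = e^{0.015\,M} \approx 3.3 \times 10^{6} \ \text{for } M = 1000 .
```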

A closer look at the weights IV. The volume of a hypersphere of radius r in an M-dimensional space has a closed-form expression. Taking an appropriate radius and using Stirling's approximation, we find that this volume is very small indeed (see the sketch below).
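A sketch of the volume argument; the hypersphere volume and Stirling's approximation are standard results, but the specific radius used on the slide is not reproduced here:

```latex
V_M(r) = \frac{\pi^{M/2}}{\Gamma\!\left(\tfrac{M}{2}+1\right)}\, r^{M},
\qquad
\Gamma\!\left(\tfrac{M}{2}+1\right) \approx \sqrt{\pi M}\,\left(\frac{M}{2e}\right)^{M/2}
\;\;\text{(Stirling)},
\qquad
V_M(r) \approx \frac{1}{\sqrt{\pi M}}\left(\frac{2\pi e\, r^2}{M}\right)^{M/2},
```

which shrinks extremely rapidly as M grows for a fixed radius.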

How can we make particle filters useful? The joint-in-time prior pdf can be factorised over time, so the marginal prior pdf at time n becomes an integral over the state at time n-1 (written out below). This introduces the transition densities.
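Written out (the standard Markov-model factorisation, consistent with the slide; notation assumed):

```latex
p(x^{0:n}) = p(x^0)\prod_{k=1}^{n} p(x^k \mid x^{k-1}),
\qquad
p(x^n) = \int p(x^n \mid x^{n-1})\, p(x^{n-1})\, dx^{n-1},
```

where the p(x^k | x^{k-1}) are the transition densities.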

Meaning of the transition densities. The transition density follows from the stochastic model: draw a sample from the model-error pdf and use it in the stochastic model equations. For a deterministic model this pdf is a delta function centred on the deterministic forward step. For a Gaussian model error we find a Gaussian transition density (see below).
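A sketch, assuming additive Gaussian model errors with covariance Q (notation assumed, not copied from the slide):

```latex
x^n = f(x^{n-1}) + \beta^n,\quad \beta^n \sim N(0, Q)
\;\Longrightarrow\;
p(x^n \mid x^{n-1}) \propto
\exp\!\left[-\tfrac{1}{2}\big(x^n - f(x^{n-1})\big)^{T} Q^{-1}\, \big(x^n - f(x^{n-1})\big)\right],
```

and for a deterministic model the transition density is a delta function, p(x^n | x^{n-1}) = delta(x^n - f(x^{n-1})).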

Bayes' theorem and the proposal density. Bayes' theorem now becomes a recursion from time n-1 to time n. Multiply and divide this expression by a proposal transition density q (written out below).
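The recursion and the multiply-and-divide step, written out in standard form (consistent with the slide's description; notation assumed):

```latex
p(x^n \mid y^{1:n})
= \frac{p(y^n \mid x^n)}{p(y^n)} \int p(x^n \mid x^{n-1})\, p(x^{n-1} \mid y^{1:n-1})\, dx^{n-1}
= \frac{p(y^n \mid x^n)}{p(y^n)} \int \frac{p(x^n \mid x^{n-1})}{q(x^n \mid x^{n-1}, y^n)}\,
  q(x^n \mid x^{n-1}, y^n)\, p(x^{n-1} \mid y^{1:n-1})\, dx^{n-1}.
```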

The magic: the proposal density. We found the expression above. Note that the proposal transition density q can be conditioned on the future observation y^n. The trick will be to draw samples from the transition density q instead of from the transition density p.

How to use this in practice? Start with the particle description of the conditional pdf at time n-1 (assuming equal-weight particles), i.e. a sum of delta functions centred on the particles, and substitute it into the expression above; this leads to a weighted sum over the particles.

Practice II. The standard particle filter propagates the original model by drawing from p(x^n | x^{n-1}). Now we draw from q(x^n | x^{n-1}, y^n), so we propagate the state using a different model. This model can be almost anything, e.g. the examples on the next slide.

Examples of proposal transition densities. The proposal transition density is related to a proposed model. In theory, this can be any model! For instance, add a relaxation term and change the random forcing. Or run a 4D-Var on each particle (the implicit particle filter); this is a special 4D-Var: the initial condition is fixed, model error is essential, and extra random forcing is needed. Or use the EnKF as the proposal density.

Practice III. For each particle at time n-1, draw a sample from the proposal transition density q. The posterior pdf can then be rewritten as a weighted sum over the particles, with weights that factor into a likelihood weight and a proposal weight (see below).
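The resulting particle representation and weights, in standard form (consistent with the slide):

```latex
p(x^n \mid y^{1:n}) \approx \sum_{i=1}^{N} w_i\, \delta\!\left(x^n - x_i^n\right),
\qquad
w_i \propto
\underbrace{p(y^n \mid x_i^n)}_{\text{likelihood weight}}\;
\underbrace{\frac{p(x_i^n \mid x_i^{n-1})}{q(x_i^n \mid x_i^{n-1}, y^n)}}_{\text{proposal weight}} .
```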

How to calculate p/q? Let's assume that the original model has Gaussian-distributed model errors. To calculate the value of p(x_i^n | x_i^{n-1}), realise that it is the probability density of moving from x_i^{n-1} to x_i^n. Since x_i^n and x_i^{n-1} are known once the proposed model has been run, we can calculate it directly.

Example calculation of p. Assume a specific proposed model; then p follows by substituting the realised x_i^n into the Gaussian transition density of the original model. We know all the terms, so this can be calculated.

And the q term. The deterministic part of the proposed model is known, so q(x_i^n | x_i^{n-1}, y^n) is the density of the stochastic term that was actually drawn. We did draw the stochastic terms, so we know what they are, and this term can be calculated too (see the sketch below).
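A sketch of the p and q terms for a hypothetical relaxation-type proposed model of the kind mentioned earlier; the specific form of the relaxation term (gain K, modified error covariance Q-hat) is an assumption for illustration:

```latex
% Proposed model: relax towards the observations and draw a modified random forcing
x_i^n = f(x_i^{n-1}) + K\big(y^n - H(f(x_i^{n-1}))\big) + \hat{\beta}_i^n,
\qquad \hat{\beta}_i^n \sim N(0, \hat{Q}),
% p: density of the realised x_i^n under the ORIGINAL stochastic model
p(x_i^n \mid x_i^{n-1}) \propto
\exp\!\left[-\tfrac{1}{2}\big(x_i^n - f(x_i^{n-1})\big)^{T} Q^{-1}\big(x_i^n - f(x_i^{n-1})\big)\right],
% q: density of the stochastic term that was actually drawn under the proposed model
q(x_i^n \mid x_i^{n-1}, y^n) \propto
\exp\!\left[-\tfrac{1}{2}\,(\hat{\beta}_i^n)^{T} \hat{Q}^{-1}\, \hat{\beta}_i^n\right].
```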

The weights. We can calculate p/q and we can calculate the likelihood, so we can calculate the full weights.

Example: the EnKF as proposal. Start from the EnKF update, use the model equation, and regroup terms; this writes the EnKF step in the form of a proposal transition density conditioned on the observation.

Algorithm: generate an initial set of particles; run the proposed model, conditioned on the next observation; accumulate the proposal-density weights p/q; calculate the likelihood weights; calculate the full weights and resample. Note: the original model is never used to propagate the particles.
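A minimal Python sketch of one cycle of this algorithm; the function signature and the helper callables (proposed_model, log_p, log_q, log_likelihood) are assumptions for illustration, not part of the slides:

```python
import numpy as np

def proposal_pf_step(particles, y, proposed_model, log_p, log_q, log_likelihood, rng=None):
    """One cycle of the particle filter with a proposal transition density.

    particles      : (N, d) array of equal-weight particles at time n-1.
    y              : observation vector at time n.
    proposed_model : (x_prev, y, rng) -> x_new, a draw from q(. | x_prev, y).
    log_p          : log p(x_new | x_prev) under the original stochastic model.
    log_q          : log q(x_new | x_prev, y) under the proposed model.
    log_likelihood : log p(y | x_new).
    """
    rng = np.random.default_rng() if rng is None else rng
    N = len(particles)
    new_particles = np.empty_like(particles)
    log_w = np.empty(N)
    for i, x_prev in enumerate(particles):
        # Run the proposed model, conditioned on the next observation.
        x_new = proposed_model(x_prev, y, rng)
        # Accumulate the proposal-density weight p/q and the likelihood weight.
        log_w[i] = (log_likelihood(y, x_new)
                    + log_p(x_new, x_prev)
                    - log_q(x_new, x_prev, y))
        new_particles[i] = x_new
    # Normalise the full weights (in log space for stability) and resample.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    keep = rng.choice(N, size=N, p=w)   # simple multinomial resampling
    return new_particles[keep]
```

Note that the original model only enters through the evaluation of log_p; it is never used to propagate the particles.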

Particle filter with proposal transition density.

Equivalent-weights I. 1. We know the weight expression from the previous slides. 2. Write down the expression for each weight, ignoring q for now (see below). 3. When H is linear this is a quadratic function in x_i^n for each particle; otherwise linearize.
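The weight expression of step 2, ignoring q, written out under the Gaussian model-error and observation-error assumptions used earlier (a sketch, not copied verbatim from the slide):

```latex
-\log w_i =
\tfrac{1}{2}\big(x_i^n - f(x_i^{n-1})\big)^{T} Q^{-1}\big(x_i^n - f(x_i^{n-1})\big)
+ \tfrac{1}{2}\big(y^n - H(x_i^n)\big)^{T} R^{-1}\big(y^n - H(x_i^n)\big)
+ \text{const},
```

which is indeed quadratic in x_i^n when H is linear.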

Equivalent-weights II. 4. Determine a target weight. [Figure: the weight of each of five particles as a function of x_i^n, with the target weight indicated.]

Equivalent-weights III. 5. Determine the corresponding model states; there is an infinite number of solutions. [Figure: state space showing f(x_i^{n-1}), the observation y^n, the new particle x_i^n, and the target-weight contour.] Determine x_i^n at the crossing of a chosen line with the target-weight contour.

Equivalent-Weights IV. So, by construction, 80% of the particles have equal weight! Hence the PF is not degenerate, by construction. However, we still need a stochastic move. (Why?)

Almost equal weights IV. 6. The previous step is the deterministic part of the proposal density. The stochastic part of q should not be Gaussian: because we divide by q, an unlikely value of the random vector would result in a huge weight. A uniform density would leave the weights unchanged, but has limited support. Hence we choose from a mixture density, with parameters a, b and q small.

Almost equal weights V. The full scheme is now: use the modified model up to the last time step; set the target weight (e.g. the 80% level); calculate the deterministic moves; determine the stochastic move; calculate the new weights and resample the lost particles.

How to test the accuracy of DA schemes? If the DA problem is linear, one can use the fact that the cost function is a chi-squared variable with M, the number of independent observations, as degrees of freedom, so its value should lie close to M (see below). Other consistency tests for linear DA check relations between the statistics of the forecast and analysis innovations and the covariances. From linear theory we find relations that can be checked (an example is given below).
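A sketch of these checks; both are textbook results for linear, Gaussian DA, with the notation assumed here (d is the forecast innovation, P^f the forecast error covariance):

```latex
E[J] = M,\quad \mathrm{Var}[J] = 2M
\;\Rightarrow\;
J \ \text{should lie roughly in}\ \left[\,M - \sqrt{2M},\; M + \sqrt{2M}\,\right],
\qquad
E\!\left[d\, d^{T}\right] = H P^{f} H^{T} + R,
\quad d = y - H(x^{f}).
```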

And nonlinear DA schemes? Compare the ensemble spread with the RMSE in idealised experiments. One can also use other measures, like a rank histogram or Talagrand diagram: - Take one observation y. - Add observation noise to the model equivalents H(x_i) and sort them in magnitude. - Rank the position of the observation in this sorted ensemble. - Do this every few assimilation cycles (or over similar independent observations). - Produce a histogram of the rankings. [Figure: an observation ranked within a sorted ten-member ensemble.]
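A minimal Python sketch of the rank-histogram recipe; the function name and arguments are assumptions for illustration:

```python
import numpy as np

def rank_histogram(observations, ensembles, obs_error_std, rng=None):
    """Rank (Talagrand) histogram, following the recipe on this slide.

    observations  : array of K scalar observations, one per assimilation cycle.
    ensembles     : (K, N) array of model equivalents H(x_i) for each cycle.
    obs_error_std : observation-error standard deviation used to perturb H(x_i).
    Returns counts of the K ranks over the N+1 possible positions.
    """
    rng = np.random.default_rng() if rng is None else rng
    K, N = ensembles.shape
    ranks = np.empty(K, dtype=int)
    for k in range(K):
        # Add observation noise to the model equivalents and sort them.
        perturbed = np.sort(ensembles[k] + obs_error_std * rng.standard_normal(N))
        # Rank of the observation within the sorted, perturbed ensemble.
        ranks[k] = np.searchsorted(perturbed, observations[k])
    return np.bincount(ranks, minlength=N + 1)
```

A flat histogram indicates a reliable ensemble; see the next slide.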

Examples of rank histograms. Ideally the rank histogram is flat: the observation is indistinguishable from any ensemble member, so any ensemble member could be responsible for the observations. [Figure panels: under-dispersive ensemble; over-dispersive ensemble.]