Auxiliary Particle Methods


Auxiliary Particle Methods: Perspectives & Applications
Adam M. Johansen (adam.johansen@bristol.ac.uk)
Oxford University Man Institute, 29th May 2008
Collaborators include: Arnaud Doucet, Nick Whiteley

Introduction

Outline
- Background: Particle Filters
  - Hidden Markov Models & Filtering
  - Particle Filters & Sequential Importance Resampling
  - Auxiliary Particle Filters
- Interpretation
  - Auxiliary Particle Filters are SIR Algorithms
  - Theoretical Considerations
  - Practical Implications
- Applications
  - Auxiliary SMC Samplers
  - The Probability Hypothesis Density Filter
  - On Stratification
- Conclusions

HMMs & Particle Filters: Hidden Markov Models

[Diagram: the usual HMM graphical model, $x_1 \to x_2 \to \dots \to x_6$ hidden, with each $y_n$ depending only on $x_n$.]

$X_n$ is an $\mathcal{X}$-valued Markov chain with transition density $f$:
$X_n \mid \{X_{n-1} = x_{n-1}\} \sim f(\cdot \mid x_{n-1})$.
$Y_n$ is a $\mathcal{Y}$-valued stochastic process, conditionally independent given the states:
$Y_n \mid \{X_n = x_n\} \sim g(\cdot \mid x_n)$.

HMMs & Particle Filters: Optimal Filtering

The filtering distribution may be expressed recursively as:

Prediction: $p(x_n \mid y_{1:n-1}) = \int p(x_{n-1} \mid y_{1:n-1})\, f(x_n \mid x_{n-1})\, dx_{n-1}$

Update: $p(x_n \mid y_{1:n}) = \dfrac{p(x_n \mid y_{1:n-1})\, g(y_n \mid x_n)}{\int p(x_n \mid y_{1:n-1})\, g(y_n \mid x_n)\, dx_n}$

The smoothing distribution may be expressed recursively as:

Prediction: $p(x_{1:n} \mid y_{1:n-1}) = p(x_{1:n-1} \mid y_{1:n-1})\, f(x_n \mid x_{n-1})$

Update: $p(x_{1:n} \mid y_{1:n}) = \dfrac{p(x_{1:n} \mid y_{1:n-1})\, g(y_n \mid x_n)}{\int p(x_{1:n} \mid y_{1:n-1})\, g(y_n \mid x_n)\, dx_{1:n}}$
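For a finite state space the recursion can be carried out exactly. The following is a minimal sketch (the function name and matrix layout are illustrative assumptions, not from the talk) of one pass of the prediction and update steps with a transition matrix F and observation likelihood matrix G:

```python
import numpy as np

def discrete_filter(F, G, mu, ys):
    """Exact filtering for a finite-state HMM.

    F[i, j] = f(x_n = j | x_{n-1} = i)  -- transition matrix
    G[j, y] = g(y | x_n = j)            -- observation likelihoods
    mu      = initial distribution of X_1
    ys      = integer-coded observations y_1, ..., y_T
    """
    post = mu * G[:, ys[0]]
    post /= post.sum()                  # p(x_1 | y_1)
    for y in ys[1:]:
        pred = F.T @ post               # prediction: sum over x_{n-1}
        post = pred * G[:, y]           # update: multiply by likelihood
        post /= post.sum()              # normalise
    return post                         # p(x_T | y_{1:T})
```

When the state space is continuous these integrals are generally intractable, which is what motivates the particle methods below.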

HMMs & Particle Filters: Particle Filters via Sequential Importance Resampling

At time n = 1:
- Sample $X_{1,1}^i \sim q_1(\cdot)$.
- Weight $W_1^i \propto f(X_{1,1}^i)\, g(y_1 \mid X_{1,1}^i) / q_1(X_{1,1}^i)$.
- Resample.

At times n > 1, iterate:
- Sample $X_{n,n}^i \sim q_n(\cdot \mid X_{n-1,n-1}^i)$; set $X_{n,1:n-1}^i = X_{n-1}^i$.
- Weight $W_n^i \propto \dfrac{f(X_{n,n}^i \mid X_{n,n-1}^i)\, g(y_n \mid X_{n,n}^i)}{q_n(X_{n,n}^i \mid X_{n,1:n-1}^i)}$.
- Resample.
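As a concrete illustration, here is a minimal sketch of this recipe for an assumed AR(1)-plus-noise toy model, taking the prior as proposal ($q_n = f$) so the weight reduces to $g(y_n \mid x_n)$; all names and parameter values are illustrative, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def sir_filter(ys, N=1000, phi=0.9, sx=1.0, sy=0.5):
    """SIR with q_n = f for the toy model
    X_n = phi X_{n-1} + sx V_n,  Y_n = X_n + sy W_n,  V, W iid N(0, 1).
    Returns the filtering means E[X_n | y_{1:n}]."""
    x = rng.normal(0.0, sx, N)                     # X_1^i ~ q_1 (here: the prior)
    means = []
    for n, y in enumerate(ys):
        if n > 0:
            x = phi * x + rng.normal(0.0, sx, N)   # propose from f
        logw = -0.5 * ((y - x) / sy) ** 2          # W_n^i proportional to g(y_n | x_n)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(np.sum(w * x))
        x = rng.choice(x, size=N, p=w)             # multinomial resampling
    return np.array(means)
```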

Auxiliary Particle Filters (Pitt & Shephard, 1999)

If we have access to the next observation before resampling, we can use this structure:
- Pre-weight every particle with $\lambda_n^{(i)} \propto \hat p(y_n \mid X_{n-1}^{(i)})$.
- Propose new states from the mixture distribution $\sum_{i=1}^N \lambda_n^{(i)} q(\cdot \mid X_{n-1}^{(i)}) / \sum_{i=1}^N \lambda_n^{(i)}$.
- Weight samples, correcting for the pre-weighting:
  $W_n^i \propto \dfrac{f(X_{n,n}^i \mid X_{n,n-1}^i)\, g(y_n \mid X_{n,n}^i)}{\lambda_n^i\, q_n(X_{n,n}^i \mid X_{n,1:n-1}^i)}$
- Resample the particle set.

Auxiliary Particle Filters: Some Well-Known Refinements

We can tidy things up a bit:
1. The auxiliary variable step is equivalent to multinomial resampling.
2. So there's no need to resample before the pre-weighting.

Now we have:
- Pre-weight every particle with $\lambda_n^{(i)} \propto \hat p(y_n \mid X_{n-1}^{(i)})$.
- Resample.
- Propose new states.
- Weight samples, correcting for the pre-weighting:
  $W_n^i \propto \dfrac{f(X_{n,n}^i \mid X_{n,n-1}^i)\, g(y_n \mid X_{n,n}^i)}{\lambda_n^i\, q_n(X_{n,n}^i \mid X_{n,1:n-1}^i)}$
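A matching sketch of this refined APF for the same assumed toy model, using the classical Pitt & Shephard pre-weight $\lambda_n^{(i)} = g(y_n \mid E[X_n \mid X_{n-1}^{(i)}])$ (a choice the talk cautions against later); again, names and parameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def apf(ys, N=1000, phi=0.9, sx=1.0, sy=0.5):
    """Refined APF (pre-weight, resample, propose, correct) for the
    toy model above, with q_n = f and
    lambda_n^i = g(y_n | E[X_n | x_{n-1}^i])."""
    x = rng.normal(0.0, sx, N)                       # X_1^i ~ prior
    logw = -0.5 * ((ys[0] - x) / sy) ** 2            # W_1^i prop. to g(y_1 | x_1)
    w = np.exp(logw - logw.max()); w /= w.sum()
    means = [np.sum(w * x)]
    for y in ys[1:]:
        mu = phi * x                                 # E[X_n | x_{n-1}^i]
        loglam = -0.5 * ((y - mu) / sy) ** 2         # pre-weights lambda_n^i
        lv = logw + loglam
        v = np.exp(lv - lv.max()); v /= v.sum()
        idx = rng.choice(N, size=N, p=v)             # resample w.r.t. W * lambda
        x = mu[idx] + rng.normal(0.0, sx, N)         # propose from f
        logw = (-0.5 * ((y - x) / sy) ** 2           # g(y_n | x_n) ...
                + 0.5 * ((y - mu[idx]) / sy) ** 2)   # ... corrected by 1/lambda
        w = np.exp(logw - logw.max()); w /= w.sum()
        means.append(np.sum(w * x))
    return np.array(means)
```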

Interpretation

APFs without Auxiliary Variables: General SIR Algorithms

The SIR algorithm can be used rather more generally. Given a sequence of distributions $\{\pi_n\}$, with $\pi_n$ defined on $E_n = \mathcal{X}^n$:
- Sample $X_{n,n}^i \sim q_n(\cdot \mid X_{n-1}^i)$; set $X_{n,1:n-1}^i = X_{n-1}^i$.
- Weight $W_n^i \propto \dfrac{\pi_n(X_n^i)}{\pi_{n-1}(X_{n,1:n-1}^i)\, q_n(X_{n,n}^i \mid X_{n,1:n-1}^i)}$.
- Resample.

APFs without Auxiliary Variables: An Interpretation of the APF

If we move the first step at time n + 1 to be the last step at time n, we get:
- Resample.
- Propose new states.
- Weight samples, correcting the earlier pre-weighting.
- Pre-weight every particle with $\lambda_{n+1}^{(i)} \propto \hat p(y_{n+1} \mid X_n^{(i)})$.

This is an SIR algorithm targeting the sequence of distributions
$\eta_n(x_{1:n}) \propto p(x_{1:n} \mid y_{1:n})\, \hat p(y_{n+1} \mid x_n),$
which allows estimation under the distribution of actual interest via importance sampling.
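In code, that importance sampling correction is a one-line reweighting. A sketch with hypothetical names: given particles x with weights w targeting $\eta_n$, and pre-weights lam[i] $= \hat p(y_{n+1} \mid x[i])$:

```python
import numpy as np

def filtering_estimate(phi, x, w, lam):
    """Estimate E[phi(X_n) | y_{1:n}] from particles targeting eta_n:
    divide out the pre-weighting and renormalise."""
    u = w / lam                 # correct for p_hat(y_{n+1} | x_n)
    u /= u.sum()
    return np.sum(u * phi(x))
```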

Theoretical Considerations

- Direct analysis of the APF is largely unnecessary: results can be obtained by considering the associated SIR algorithm.
- SIR has a (discrete-time) Feynman-Kac interpretation.

Theoretical Considerations: For example...

Proposition. Under standard regularity conditions,
$\sqrt{N}\left(\hat\varphi_{n,\mathrm{APF}}^N - \bar\varphi_n\right) \Rightarrow \mathcal{N}\left(0, \sigma_n^2(\varphi_n)\right)$
where
$\sigma_1^2(\varphi_1) = \int \frac{p(x_1 \mid y_1)^2}{q_1(x_1)} \left(\varphi_1(x_1) - \bar\varphi_1\right)^2 dx_1$
and
$\sigma_n^2(\varphi_n) = \int \frac{p(x_1 \mid y_{1:n})^2}{q_1(x_1)} \left( \int \varphi_n(x_{1:n})\, p(x_{2:n} \mid y_{2:n}, x_1)\, dx_{2:n} - \bar\varphi_n \right)^2 dx_1$
$\quad + \sum_{k=2}^{n-1} \int \frac{p(x_{1:k} \mid y_{1:n})^2}{\hat p(x_{1:k-1} \mid y_{1:k})\, q_k(x_k \mid x_{k-1})} \left( \int \varphi_n(x_{1:n})\, p(x_{k+1:n} \mid y_{k+1:n}, x_k)\, dx_{k+1:n} - \bar\varphi_n \right)^2 dx_{1:k}$
$\quad + \int \frac{p(x_{1:n} \mid y_{1:n})^2}{\hat p(x_{1:n-1} \mid y_{1:n})\, q_n(x_n \mid x_{n-1})} \left(\varphi_n(x_{1:n}) - \bar\varphi_n\right)^2 dx_{1:n}.$

Practical Implications

This means we are doing importance sampling:
- Choosing $\hat p(y_n \mid x_{n-1}) = p(y_n \mid x_n = E[X_n \mid x_{n-1}])$ is dangerous.
- A safer choice would be to ensure that
  $\sup_{x_{n-1}, x_n} \dfrac{g(y_n \mid x_n)\, f(x_n \mid x_{n-1})}{\hat p(y_n \mid x_{n-1})\, q(x_n \mid x_{n-1})} < \infty.$
- Using the APF does not guarantee superior performance.

Practical Implications: A Contrived Illustration

Consider the following binary state-space model with common state and observation spaces:
$\mathcal{X} = \{0, 1\}$, $p(x_1 = 0) = 0.5$, $p(x_n = x_{n-1}) = 1 - \delta$,
$\mathcal{Y} = \mathcal{X}$, $p(y_n = x_n) = 1 - \varepsilon$.
Here $\delta$ controls the ergodicity of the state process and $\varepsilon$ controls the information contained in the observations. Consider estimating $E(X_2 \mid Y_{1:2} = (0, 1))$.
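Since the state space has only two points, the quantity of interest can be computed exactly by enumeration, giving a ground truth against which both filters' variances can be judged. A small sketch (function name and test values are arbitrary assumptions):

```python
from itertools import product

def posterior_mean_x2(delta, eps):
    """E[X_2 | Y_1 = 0, Y_2 = 1] for the binary model, by enumerating
    the four paths (x1, x2)."""
    num = den = 0.0
    for x1, x2 in product((0, 1), repeat=2):
        p = 0.5                                    # p(x_1)
        p *= 1 - delta if x2 == x1 else delta      # p(x_2 | x_1)
        p *= 1 - eps if x1 == 0 else eps           # p(y_1 = 0 | x_1)
        p *= 1 - eps if x2 == 1 else eps           # p(y_2 = 1 | x_2)
        num += x2 * p
        den += p
    return num / den

print(posterior_mean_x2(delta=0.1, eps=0.25))      # arbitrary test values
```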

Practical Implications

[Figure: surface plot of Var(SIR) − Var(APF) over $\varepsilon \in [0.05, 0.5]$ and $\delta \in [0.1, 1]$; the difference ranges from roughly −0.2 to 0.05.]

Practical Implications

[Figure: variance comparison at $\varepsilon = 0.25$, plotting empirical and asymptotic variances of SISR and APF against $\delta \in [0.1, 0.9]$.]

Applications & Extensions

Auxiliary SMC Samplers: SMC Samplers (Del Moral, Doucet & Jasra, 2006)

SIR techniques can be adapted to any sequence of distributions $\{\pi_n\}$. Define
$\tilde\pi_n(x_{1:n}) = \pi_n(x_n) \prod_{k=1}^{n-1} L_k(x_{k+1}, x_k).$
At time n:
- Sample $X_n^i \sim K_n(X_{n-1}^i, \cdot)$.
- Weight $W_n^i \propto \dfrac{\pi_n(X_n^i)\, L_{n-1}(X_n^i, X_{n-1}^i)}{\pi_{n-1}(X_{n-1}^i)\, K_n(X_{n-1}^i, X_n^i)}$.
- Resample.

Auxiliary SMC Samplers

An auxiliary SMC sampler for a sequence of distributions $\{\pi_n\}$ comprises:
- an SMC sampler targeting some auxiliary sequence of distributions $\{\mu_n\}$;
- a sequence of importance weight functions $w_n(x_n) \propto \dfrac{d\pi_n}{d\mu_n}(x_n)$.

Generic approaches:
- Choose $\mu_n(x_n) \propto \pi_n(x_n)\, V_n(x_n)$.
- Incorporate as much information as possible prior to resampling.

Auxiliary SMC Samplers: Resample-Move Approaches

A common approach in SMC samplers:
- Choose $K_n(x_{n-1}, x_n)$ to be $\pi_n$-invariant.
- Set $L_{n-1}(x_n, x_{n-1}) = \dfrac{\pi_n(x_{n-1})\, K_n(x_{n-1}, x_n)}{\pi_n(x_n)}$,
  leading to $W_n(x_{n-1}, x_n) = \dfrac{\pi_n(x_{n-1})}{\pi_{n-1}(x_{n-1})}$.

It would make more sense to resample and then move (see the sketch below)...
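A minimal sketch of a resample-move SMC sampler under these choices, assuming a tempered geometric bridge $\pi_b(x) \propto \nu(x)^{1-b}\, \pi(x)^b$ from a Gaussian reference $\nu$ to a bimodal target; the incremental weight is exactly $\pi_n(x_{n-1})/\pi_{n-1}(x_{n-1})$. The target, schedule, and step size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    """Illustrative bimodal target: mixture of N(-3, 1) and N(3, 1)."""
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def log_ref(x):
    return -0.5 * (x / 10.0) ** 2            # Gaussian reference nu = N(0, 10^2)

def smc_sampler(N=2000, K=50, step=1.0):
    betas = np.linspace(0.0, 1.0, K + 1)
    x = rng.normal(0.0, 10.0, N)             # exact draw from pi_0 = nu
    for b_prev, b in zip(betas[:-1], betas[1:]):
        logw = (b - b_prev) * (log_target(x) - log_ref(x))  # pi_n / pi_{n-1}
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = x[rng.choice(N, size=N, p=w)]    # resample first...
        prop = x + step * rng.normal(size=N) # ...then move: RWM, pi_n-invariant
        lp_prop = (1 - b) * log_ref(prop) + b * log_target(prop)
        lp_curr = (1 - b) * log_ref(x) + b * log_target(x)
        accept = np.log(rng.random(N)) < lp_prop - lp_curr
        x = np.where(accept, prop, x)
    return x                                 # approximate sample from pi
```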

Auxiliary SMC Samplers: An Interpretation of Pure Resample-Move

It is SIR with auxiliary distributions:
$\mu_n(x_n) = \pi_{n+1}(x_n)$
$L_{n-1}(x_n, x_{n-1}) = \dfrac{\mu_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}{\mu_{n-1}(x_n)} = \dfrac{\pi_n(x_{n-1})\, K_n(x_{n-1}, x_n)}{\pi_n(x_n)}$
$\tilde w_n(x_{n-1:n}) = \dfrac{\mu_n(x_n)}{\mu_{n-1}(x_n)} = \dfrac{\pi_{n+1}(x_n)}{\pi_n(x_n)}$
$w_n(x_n) = \dfrac{\mu_{n-1}(x_n)}{\mu_n(x_n)} = \dfrac{\pi_n(x_n)}{\pi_{n+1}(x_n)}.$

Auxiliary SMC Samplers: Piecewise-Deterministic Processes

[Figure: an example piecewise-deterministic trajectory in the (x, y) plane.]

Auxiliary SMC Samplers: Filtering of Piecewise-Deterministic Processes (Whiteley, Johansen & Godsill, 2007)

$X_n = (k_n, \tau_{n,1:k_n}, \theta_{n,0:k_n})$ specifies a continuous-time trajectory $(\zeta_t)_{t \in [0, t_n]}$.
$\zeta_t$ is observed in the presence of noise, e.g. $Y_n = \zeta_{t_n} + V_n$.
Target a sequence of distributions $\{\pi_n\}$ on nested spaces:
$\pi_n(k, \tau_{1:k}, \theta_{0:k}) \propto q(\theta_0) \prod_{j=1}^{k} f(\tau_j \mid \tau_{j-1})\, q(\theta_j \mid \tau_j, \theta_{j-1}, \tau_{j-1})\; S(t_n, \tau_k) \prod_{p=1}^{n} g(y_p \mid \zeta_{t_p})$

Auxiliary SMC Samplers: Filtering of Piecewise-Deterministic Processes (continued)

- $\pi_n$ yields posterior marginal distributions for the recent history of $\zeta_t$.
- Employ auxiliary SMC samplers: choose $\mu_n(k, \tau_{1:k}, \theta_{0:k}) \propto \pi_n(k, \tau_{1:k}, \theta_{0:k})\, V_n(\tau_k, \theta_k)$.
- The kernel proposes new pairs $(\tau_j, \theta_j)$ and adjusts recent pairs.
- Applications in object tracking and finance.

The Auxiliary SMC-PHD Filter: Probability Hypothesis Density Filtering (Mahler, 2003; Vo et al., 2005)

Approximates the optimal filter for a class of spatial point-process-valued HMMs via a recursion for intensity functions:
$\tilde\alpha_n(x_n) = \int_E f(x_n \mid x_{n-1})\, p_S(x_{n-1})\, \alpha_{n-1}(x_{n-1})\, dx_{n-1} + \gamma(x_n)$
$\alpha_n(x_n) = \left[ 1 - p_D(x_n) + \sum_{r=1}^{m_n} \frac{\psi_{n,r}(x_n)}{Z_{n,r}} \right] \tilde\alpha_n(x_n).$
Approximate the intensity functions $\{\alpha_n\}_{n \ge 0}$ using SMC (a toy sketch of one such step follows).
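To make the recursion concrete, here is a hedged one-dimensional sketch of a single vanilla SMC-PHD step (not the auxiliary variant of Whiteley et al.), with assumed Gaussian motion and likelihood and illustrative parameters throughout:

```python
import numpy as np

rng = np.random.default_rng(3)

def smc_phd_step(x, w, ys, pS=0.95, pD=0.9, sx=1.0, sy=0.5,
                 kappa=0.1, n_birth=200, birth_mass=0.2, n_out=500):
    """One vanilla SMC-PHD step in 1-D. (x, w) approximate the intensity
    alpha_{n-1}; ys are this step's measurements; kappa is the clutter
    intensity. The total weight approximates the expected target count."""
    # Prediction: surviving particles move; birth particles represent gamma.
    xs = x + rng.normal(0.0, sx, x.size)             # f(x_n | x_{n-1})
    ws = pS * w                                      # thinned by survival
    xb = rng.normal(0.0, 5.0, n_birth)               # birth proposal (assumed)
    wb = np.full(n_birth, birth_mass / n_birth)      # birth intensity mass
    x, w = np.concatenate([xs, xb]), np.concatenate([ws, wb])
    # Update: 1 - p_D plus one association term per measurement.
    upd = (1.0 - pD) * np.ones_like(w)
    for y in ys:
        psi = pD * np.exp(-0.5 * ((y - x) / sy) ** 2) / (sy * np.sqrt(2 * np.pi))
        upd += psi / (kappa + np.sum(psi * w))       # Z_r = clutter + integral
    w = w * upd
    # Resample to a fixed size, preserving total intensity mass.
    mass = w.sum()
    idx = rng.choice(x.size, size=n_out, p=w / mass)
    return x[idx], np.full(n_out, mass / n_out)
```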

The Auxiliary SMC-PHD Filter: An ASMC Implementation (Whiteley et al., 2007)

$\alpha_{n-1}(dx_{n-1}) \approx \alpha_{n-1}^N(dx_{n-1}) = \frac{1}{N} \sum_{i=1}^N W_{n-1}^i\, \delta_{X_{n-1}^i}(dx_{n-1})$

In a simple case, the target integral at the nth iteration is
$\sum_{r=1}^{m_n} \int_E \int_E \varphi(x_n)\, \frac{\psi_{n,r}(x_n)}{Z_{n,r}}\, f(x_n \mid x_{n-1})\, p_S(x_{n-1})\, dx_n\, \alpha_{n-1}^N(dx_{n-1}).$

The proposal distribution $q_n(x_n, x_{n-1}, r)$ is built from $\alpha_{n-1}^N$ and potential functions $\{V_{n,r}\}$, and factorises: the component
$q_n(dx_{n-1} \mid r) = \frac{\sum_{i=1}^N V_{n,r}(X_{n-1}^{(i)})\, W_{n-1}^{(i)}\, \delta_{X_{n-1}^{(i)}}(dx_{n-1})}{\sum_{i=1}^N V_{n,r}(X_{n-1}^{(i)})\, W_{n-1}^{(i)}}$
is combined with $q_n(x_n \mid x_{n-1}, r)$ and $q_n(r)$; the normalising constants $\{Z_{n,r}\}$ are also estimated by importance sampling.

On Stratification

Back to the HMM setting: let $(A_p)_{p=1}^M$ denote a partition of $\mathcal{X}$ and introduce the stratum indicator variable $R_n = \sum_{p=1}^M p\, \mathbb{I}_{A_p}(X_n)$. Define the extended model:
$p(x_{1:n}, r_{1:n} \mid y_{1:n}) \propto g(y_n \mid x_n)\, f(x_n \mid r_n, x_{n-1})\, f(r_n \mid x_{n-1})\, p(x_{1:n-1}, r_{1:n-1} \mid y_{1:n-1})$
An auxiliary SMC filter resamples from a distribution with $N \times M$ support points (particles $\times$ strata). Applications in switching state-space models.

Conclusions

Conclusions

The APF is a standard SIR algorithm for nonstandard distributions. This interpretation:
- allows standard results to be applied directly,
- provides guidance on implementation, and
- allows the same technique to be applied in more general settings.

Thanks for listening... any questions?

References

[1] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[2] A. M. Johansen and A. Doucet. Auxiliary variable sequential Monte Carlo methods. Technical Report 07:09, University of Bristol, Department of Mathematics, Statistics Group, University Walk, Bristol, BS8 1TW, UK, July 2007.
[3] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 2008. To appear.
[4] A. M. Johansen and N. Whiteley. A modern perspective on auxiliary particle filters. In Proceedings of the Workshop on Inference and Estimation in Probabilistic Time Series Models, Isaac Newton Institute, June 2008. To appear.
[5] R. P. S. Mahler. Multitarget Bayes filtering via first-order multitarget moments. IEEE Transactions on Aerospace and Electronic Systems, 39(4):1152–1178, October 2003.
[6] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.
[7] B. Vo, S. S. Singh, and A. Doucet. Sequential Monte Carlo methods for multi-target filtering with random finite sets. IEEE Transactions on Aerospace and Electronic Systems, 41(4):1223–1245, October 2005.
[8] N. Whiteley, A. M. Johansen, and S. Godsill. Monte Carlo filtering of piecewise-deterministic processes. Technical Report CUED/F-INFENG/TR-592, University of Cambridge, Department of Engineering, Trumpington Street, Cambridge, CB2 1PZ, UK, December 2007.
[9] N. Whiteley, S. Singh, and S. Godsill. Auxiliary particle implementation of the probability hypothesis density filter. Technical Report CUED/F-INFENG/TR-590, University of Cambridge, Department of Engineering, Trumpington Street, Cambridge, CB2 1PZ, UK, 2007.