Adaptive Population Monte Carlo
1 Adaptive Population Monte Carlo
Olivier Cappé, Centre National de la Recherche Scientifique & Télécom Paris, 46 rue Barrault, Paris cedex 13, France
Recent Advances in Monte Carlo Based Inference Workshop, 30 October to 3 November 2006, Isaac Newton Institute
Joint work with Randal Douc, Arnaud Guillin, Jean-Michel Marin & Christian P. Robert
2 Outline Monte Carlo Basics Population Monte Carlo PMC for ECOSSTAT References
3 Monte Carlo Basics Monte Carlo Basics Monte Carlo, MCMC... Importance Sampling, SIR... Population Monte Carlo PMC for ECOSSTAT References
4 Monte Carlo Basics
General Purpose: given a density π, known up to a normalising constant, compute
π(h) = ∫ h(x) π(x) dx
for some functions of interest h. In the following, π denotes the normalised density, but we consider only algorithms that do not necessitate knowledge of the normalising constant.
5–6 Monte Carlo Basics: Monte Carlo, MCMC...
Monte Carlo: generate an iid sample X_1, ..., X_N from π and estimate π(h) by
π̂_N^MC(h) = 1/N ∑_{i=1}^N h(X_i)
Under π(h²) = ∫ h²(x) π(x) dx < ∞, we have
Consistency: π̂_N^MC(h) →_P π(h)
Asymptotic normality: √N (π̂_N^MC(h) − π(h)) →_D N(0, π{[h − π(h)]²})
with practical variance estimate 1/N ∑_{i=1}^N [h(X_i) − π̂_N^MC(h)]²
Caveat: it is often impossible to simulate directly from π. Possible answers include Markov chain Monte Carlo.
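The basic estimator and its practical variance estimate can be sketched as follows (the target, test function and sample size are my own illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (an assumption, not from the slides): pi = N(0, 1), h(x) = x^2,
# so the true value is pi(h) = 1 and pi{[h - pi(h)]^2} = Var(x^2) = 2.
N = 100_000
x = rng.standard_normal(N)                  # iid sample X_1, ..., X_N from pi
h = x**2
mc_estimate = h.mean()                      # hat pi_N^MC(h) = (1/N) sum h(X_i)
asym_var = np.mean((h - mc_estimate)**2)    # practical estimate of pi{[h - pi(h)]^2}
```

Dividing `asym_var` by N gives an estimate of the variance of the estimator itself.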
7–8 Monte Carlo Basics: Importance Sampling, SIR...
Importance Sampling: generate an iid sample X_1, ..., X_N from q and estimate π(h) by
π̂_N^IS(h) = ∑_{i=1}^N ω̄_i h(X_i), where ω_i = π(X_i)/q(X_i) and ω̄_i = ω_i / ∑_{j=1}^N ω_j
Under π[(1 + h²) π/q] < ∞, we have consistency and asymptotic normality with asymptotic variance π{[h − π(h)]² π/q} and practical variance estimate N ∑_{i=1}^N ω̄_i² [h(X_i) − π̂_N^IS(h)]²
No Free Lunch: finding a suitable proposal density q is a hard task (particularly in high dimension).
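A minimal self-normalised importance sampling sketch, under assumptions of my own (target N(0, 1), proposal N(0, 2²), h(x) = x², so π(h) = 1):

```python
import numpy as np

rng = np.random.default_rng(1)

# Self-normalised IS: sample from q, reweight by pi/q (normalising constants
# cancel when the weights are normalised).
N = 100_000
x = rng.normal(0.0, 2.0, size=N)                 # iid sample X_1, ..., X_N from q
log_w = -0.5 * x**2 - (-0.5 * (x / 2.0)**2 - np.log(2.0))  # log pi - log q, up to constants
w = np.exp(log_w - log_w.max())                  # unnormalised weights, overflow-safe
w_bar = w / w.sum()                              # normalised weights omega_bar_i
is_estimate = np.sum(w_bar * x**2)               # hat pi_N^IS(h)
# Practical variance estimate of the estimator: sum omega_bar_i^2 (h - est)^2
var_estimate = np.sum(w_bar**2 * (x**2 - is_estimate)**2)
```

Here the weights π/q are bounded, which is the favourable case the asymptotic-variance condition points to.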
9 Population Monte Carlo Monte Carlo Basics Population Monte Carlo The Algorithm Properties PMC for ECOSSTAT References
10–11 Population Monte Carlo: The Algorithm
Population Monte Carlo (PMC)
At time t = 0:
Generate {X_{i,0}}_{1≤i≤N} iid from q_0
Set ω_{i,0} = π(X_{i,0}) / q_0(X_{i,0}) and ω̄_{i,0} = ω_{i,0} / ∑_{j=1}^N ω_{j,0}
Generate {J_{i,0}}_{1≤i≤N} iid from M(1, (ω̄_{i,0})_{1≤i≤N}) and set X̃_{i,0} = X_{J_{i,0},0}
At time t (t = 1, ..., T):
Generate X_{i,t} independently from q_{i,t}(X̃_{i,t−1}, ·)
Set ω_{i,t} = π(X_{i,t}) / q_{i,t}(X̃_{i,t−1}, X_{i,t}) and ω̄_{i,t} = ω_{i,t} / ∑_{j=1}^N ω_{j,t}
Generate {J_{i,t}}_{1≤i≤N} iid from M(1, (ω̄_{i,t})_{1≤i≤N}) and set X̃_{i,t} = X_{J_{i,t},t}
Note that other forms of resampling could be used.
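The recursion above can be sketched in a one-dimensional toy setting (all concrete choices below, target N(0, 1), a fixed random-walk Gaussian kernel and multinomial resampling, are my own assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(2)

def log_pi(x):
    return -0.5 * x**2            # unnormalised log target, pi = N(0, 1)

N, T, sigma = 5_000, 10, 1.0
x_tilde = rng.uniform(-10.0, 10.0, size=N)        # crude initial population
for t in range(T):
    x = x_tilde + sigma * rng.standard_normal(N)  # X_{i,t} ~ q_t(x_tilde_{i,t-1}, .)
    log_q = -0.5 * ((x - x_tilde) / sigma)**2     # log q_t, up to constants
    log_w = log_pi(x) - log_q                     # importance weights pi/q_t
    w_bar = np.exp(log_w - log_w.max())
    w_bar /= w_bar.sum()                          # normalised weights omega_bar_{i,t}
    idx = rng.choice(N, size=N, p=w_bar)          # multinomial resampling J_{i,t}
    x_tilde = x[idx]                              # resampled population X_tilde_{i,t}
estimate = np.sum(w_bar * x**2)   # weighted estimate of pi(x^2) = 1 at the last iteration
```

The estimate uses the weighted (pre-resampling) sample, as in the importance-sampling identity; resampling only prepares the next population.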
12 Population Monte Carlo: The Algorithm
PMC has many connections with (among others):
West's (1992) mixture approximation
Hürzeler & Künsch's (1998) and Stavropoulos & Titterington's (1999) smooth bootstrap
Wong & Liang's (1997) and Liu, Liang & Wong's (2001) dynamic weighting
Chopin's (2001) progressive posteriors for large datasets
Gilks & Berzuini's (2001) resample-move
Rubinstein & Kroese's (2004) cross-entropy method
Del Moral, Doucet & Jasra's (2006) sequential Monte Carlo samplers
It may also be adequately described as an iterated Sampling Importance Resampling (SIR) approach with (possibly) Markov proposals (note that Iba, 2000, uses the term to refer to a more general class of methods).
13 Population Monte Carlo: Properties
Basic Importance Sampling Equality (preservation of unbiasedness):
E[ω_{·,t} h(X_{·,t})] = E[ E( π(X_{·,t}) / q_{·,t}(X̃_{·,t−1}, X_{·,t}) h(X_{·,t}) | {X̃_{i,t−1}}_{1≤i≤N} ) ] = π(h)
We may freely choose the way in which the proposal kernels q_{i,t} are built from {ω̄_{i,t−1}, X̃_{i,t−1}}_{1≤i≤N}. In the following, we consider only global proposals of the form q_{i,t} := q_t.
14 Population Monte Carlo: Properties
Asymptotic Analysis, Key Properties:
{X_{i,t}, ω_{i,t}}_{1≤i≤N} are conditionally iid given {X̃_{j,t−1}, ω_{j,t−1}}_{1≤j≤N}
E[ω_{·,t} h(X_{·,t}) | {X̃_{i,t−1}, ω_{i,t−1}}_{1≤i≤N}] = π(h)
However, the successive populations are not independent, due to the use of (possibly) Markovian proposals which are adaptively tuned.
In the following, we consider the behaviour of PMC when T is kept fixed and N tends to infinity (using results of Douc & Moulines, 2005).
15 Population Monte Carlo: Properties
Fundamental Asymptotic Results: assume h is such that π(|h|) < ∞
Consistency: under π⊗π{q_t(x, x′) = 0} = 0,
∑_{i=1}^N ω̄_{i,t} h(X_{i,t}) →_P π(h)
Asymptotic Normality: under π⊗π( [1 + h²(x′)] π(x′) / q_t(x, x′) ) < ∞,
√N ( ∑_{i=1}^N ω̄_{i,t} h(X_{i,t}) − π(h) ) →_D N(0, σ_t²)
16 Population Monte Carlo: Properties
Variance Estimation: the asymptotic variance is given (for t ≥ 1) by
σ_t² := lim_{N→∞} Var[ ω_{·,t} h(X_{·,t}) | {X̃_{i,t−1}, ω_{i,t−1}}_{1≤i≤N} ]
= lim_{N→∞} ∑_{i=1}^N ω̄_{i,t−1} ∫ [h(x′) − π(h)]² π(x′)/q_t(X̃_{i,t−1}, x′) π(x′) dx′
= ∫∫ [h(x′) − π(h)]² π(x′)/q_t(x, x′) π(x)π(x′) dx dx′
which may be estimated in practice by
N ∑_{i=1}^N ω̄_{i,t}² [ h(X_{i,t}) − ∑_{j=1}^N ω̄_{j,t} h(X_{j,t}) ]²
17–18 Population Monte Carlo: Properties
The Final Estimator: after T iterations of the PMC algorithm, the estimator of π(h) is given by
π̂_{N,T}^PMC(h) = ∑_{i=1}^N ω̄_{i,T} h(X_{i,T})
or, more efficiently, by the inverse-variance weighted combination
∑_{t=1}^T ( σ̂_t^{−2} / ∑_{s=1}^T σ̂_s^{−2} ) ∑_{i=1}^N ω̄_{i,t} h(X_{i,t})
How should q_t be updated from the simulations (up to time t − 1)?
19 Monte Carlo Basics Population Monte Carlo Kullback Divergence Adaptive PMC for Mixture Proposals (Toy) Examples Variance Minimisation (Toy) Example Again PMC for ECOSSTAT References
20–21 Kullback Divergence
We first need a performance criterion.
Kullback Divergence:
arg min_θ K[π⊗π ‖ π⊗q_θ]
= arg min_θ ∫∫ log( π(x)π(x′) / [π(x) q_θ(x, x′)] ) π(x)π(x′) dx dx′
= arg max_θ ∫∫ log q_θ(x, x′) π(x)π(x′) dx dx′ =: l(θ)
K[π⊗π ‖ π⊗q_θ] = 0 implies that the weights are constant
Sequential Monte Carlo interpretation (see Arnaud's talk)
Usually gives explicit solutions for θ (see Nicolas's talk)
22–23 Kullback Divergence
The estimation of θ is straightforward when q_θ belongs to an exponential family.
Example (independent Gaussian proposal):
q_{μ,Σ}(x, x′) ∝ |Σ|^{−1/2} exp[ −1/2 (x′ − μ)^T Σ^{−1} (x′ − μ) ]
gives (μ*, Σ*) = (E_π[X], Cov_π[X]), which can be estimated by
μ̂ = ∑_{i=1}^N ω̄_{i,0} X_{i,0},  Σ̂ = ∑_{i=1}^N ω̄_{i,0} (X_{i,0} − μ̂)(X_{i,0} − μ̂)^T
(the estimate can obviously be improved along the iterations)
Example (random-walk Gaussian proposal):
q_Σ(x, x′) ∝ |Σ|^{−1/2} exp[ −1/2 (x′ − x)^T Σ^{−1} (x′ − x) ]
gives Σ* = Cov_{π⊗π}[X′ − X] = 2 Cov_π[X]
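For the independent Gaussian proposal, the Kullback update is just moment matching with importance weights. A hedged two-dimensional sketch (the target, sample size and number of adaptation rounds are all assumptions of mine):

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed 2-d target: N(m, S), known only up to a constant through log_pi.
m = np.array([1.0, -1.0])
S = np.array([[2.0, 0.5], [0.5, 1.0]])
S_inv = np.linalg.inv(S)

def log_pi(x):                      # unnormalised log target, x of shape (N, 2)
    d = x - m
    return -0.5 * np.einsum('ni,ij,nj->n', d, S_inv, d)

N = 50_000
mu, Sigma = np.zeros(2), 9.0 * np.eye(2)      # deliberately poor initial proposal
for _ in range(5):
    L = np.linalg.cholesky(Sigma)
    x = mu + rng.standard_normal((N, 2)) @ L.T           # X_i ~ N(mu, Sigma)
    d = x - mu
    log_q = (-0.5 * np.einsum('ni,ij,nj->n', d, np.linalg.inv(Sigma), d)
             - 0.5 * np.log(np.linalg.det(Sigma)))
    lw = log_pi(x) - log_q
    w_bar = np.exp(lw - lw.max())
    w_bar /= w_bar.sum()                                 # normalised IS weights
    mu = w_bar @ x                                       # weighted mean -> new mu
    c = x - mu
    Sigma = (w_bar[:, None] * c).T @ c                   # weighted covariance -> new Sigma
```

After a few rounds, (mu, Sigma) should approach (E_π[X], Cov_π[X]), i.e. the Kullback-optimal independent Gaussian proposal.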
24 Kullback Divergence
Integrated-EM Formulas for More General Choices of q_θ: if q_θ is a missing-data type of proposal, i.e.
q_θ(x, x′) = ∫ f_θ(x, x′, y) dy,
we may define an intermediate quantity
L_{θ′}(θ) = E_{π⊗π}[ E_{f_{θ′}}( log f_θ(X, X′, Y) | X, X′ ) ]
which (using Jensen's inequality) satisfies
L_{θ′}(θ) ≥ L_{θ′}(θ′) ⟹ l(θ) ≥ l(θ′)
We thus obtain ascent integrated-EM updates, which can be approximated since {(X̃_{i,t−1}, X_{i,t}), ω̄_{i,t}}_{1≤i≤N} can be used to estimate expectations under π⊗π. This is used in particular in the information bottleneck algorithm.
25 Adaptive PMC for Mixture Proposals
Mixture Proposals: in the following we consider
q_α(x, x′) = ∑_{d=1}^D α_d q_d(x, x′)
where the q_d are fixed transition kernels. Note that the criterion l(α) is then concave.
26 Adaptive PMC for Mixture Proposals
Integrated EM Recursion on the Proportions: the mapping
Ψ(α) = ( ∫∫ α_d q_d(x, x′) / ∑_{j=1}^D α_j q_j(x, x′) π(x)π(x′) dx dx′ )_{1≤d≤D}
defined on the probability simplex
S = { α = (α_1, ..., α_D) : α_d ≥ 0 for 1 ≤ d ≤ D and ∑_{d=1}^D α_d = 1 }
is such that l(Ψ(α)) ≥ l(α)
27 Adaptive PMC for Mixture Proposals
Adaptive Mixture PMC: at time t (t = 1, ..., T)
(Mixture sampling) Generate iid {K_{i,t}}_{1≤i≤N} from M(1, (α_{d,t})_{1≤d≤D}) and X_{i,t} independently from q_{K_{i,t}}(X̃_{i,t−1}, ·)
(IS weights) Set ω_{i,t} = π(X_{i,t}) / ∑_{d=1}^D α_{d,t} q_d(X̃_{i,t−1}, X_{i,t})
(Proportions update) α_{d,t+1} = ∑_{i=1}^N ω̄_{i,t} 1_d(K_{i,t})
(Resampling) Generate {J_{i,t}}_{1≤i≤N} iid from M(1, (ω̄_{i,t})_{1≤i≤N}) and set X̃_{i,t} = X_{J_{i,t},t}
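The four steps above can be sketched directly; everything concrete below (target N(0, 1), three random-walk Gaussian kernels with scales 0.1, 1 and 10, the sample size) is an illustrative assumption, and refinements such as Rao-Blackwellised updates are omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

def norm_pdf(u, s):
    return np.exp(-0.5 * (u / s)**2) / (s * np.sqrt(2 * np.pi))

scales = np.array([0.1, 1.0, 10.0])          # D = 3 fixed random-walk kernels
D, N, T = 3, 20_000, 15
alpha = np.full(D, 1.0 / D)                  # uniform initial proportions
x_tilde = rng.standard_normal(N)             # initial population
for t in range(T):
    k = rng.choice(D, size=N, p=alpha)                        # mixture sampling K_{i,t}
    x = x_tilde + scales[k] * rng.standard_normal(N)          # X_{i,t} ~ q_{K_{i,t}}
    mix = sum(alpha[d] * norm_pdf(x - x_tilde, scales[d]) for d in range(D))
    w = norm_pdf(x, 1.0) / mix                                # pi(x) / q_alpha(x_tilde, x)
    w_bar = w / w.sum()
    alpha = np.array([w_bar[k == d].sum() for d in range(D)]) # proportions update
    idx = rng.choice(N, size=N, p=w_bar)                      # multinomial resampling
    x_tilde = x[idx]
```

Each kernel's new proportion is the total normalised weight of the particles it generated, so kernels whose offspring carry little weight (here, the very wide random walk) are progressively downweighted.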
28 Adaptive PMC for Mixture Proposals
Why is this correct?
E[ ω_{i,t} 1_d(K_{i,t}) | X̃_{i,t−1} ] = α_{d,t} ∫ q_d(X̃_{i,t−1}, x′) π(x′) / ∑_{j=1}^D α_{j,t} q_j(X̃_{i,t−1}, x′) dx′
so that, as N → ∞,
E[ 1/N ∑_{i=1}^N ω_{i,t} 1_d(K_{i,t}) | {X̃_{j,t−1}}_{1≤j≤N} ] → α_{d,t} ∫∫ q_d(x, x′) π(x′) / ∑_{j=1}^D α_{j,t} q_j(x, x′) dx′ π(x) dx
which is the dth coordinate of the integrated EM mapping Ψ
29 (Toy) Examples
Example (independent proposals with mixture target):
Target: 1/4 N(−1, 0.3)(x) + 1/4 N(0, 1)(x) + 1/2 N(3, 2)(x)
Proposals: N(−1, 0.3), N(0, 1) and N(3, 2)
Table: Weight evolution (N = 100,000)
30 (Toy) Examples Figure: Target and mixture evolution
31 (Toy) Examples
Example (random-walk proposals):
Target: N(0, 1)
Gaussian random-walk proposals: q_1(x, x′) = f_{N(x,0.1)}(x′), q_2(x, x′) = f_{N(x,2)}(x′) and q_3(x, x′) = f_{N(x,10)}(x′)
Table: Evolution of the weights (N = 100,000)
32–33 Variance Minimisation
Can we do better? Recall that, when h is known beforehand, the optimal importance function is given by
q*(x) = |h(x) − π(h)| π(x) / ∫ |h(x′) − π(h)| π(x′) dx′
which may yield a smaller variance than π̂_N^MC(h).
For PMC, it is natural to consider the following objective:
Minimum Variance Criterion:
arg min_θ E_{π⊗π}( [h(X′) − π(h)]² π(X′) / q_θ(X, X′) )
Note that this criterion is again convex in α for mixture proposals.
34 Variance Minimisation
There is an ascent update rule for the variance criterion:
Ψ(α) = ( ν_h( α_d q_d(x, x′) / ∑_{l=1}^D α_l q_l(x, x′) ) / σ_h²(α) )_{1≤d≤D}
where the (unnormalised) measure ν_h is defined as
ν_h(dx, dx′) = [h(x′) − π(h)]² π(x′) / ∑_{d=1}^D α_d q_d(x, x′) π(dx)π(dx′)
and σ_h²(α) = ν_h(1) is the value of the variance criterion corresponding to α. This mapping is such that σ_h²(Ψ(α)) ≤ σ_h²(α).
35–36 Variance Minimisation
Interlude: Convex Puzzles
Kullback Divergence Criterion: arg max_α ∫ log[ ∑_d α_d f_d(x) ] ν(dx)
Ascent Mapping: α_i′ = ∫ α_i f_i(x) / ∑_d α_d f_d(x) ν(dx)
Proof: concavity of log(x) & positivity of the Kullback divergence
Minimum Variance Criterion: arg min_α ∫ [ ∑_d α_d f_d(x) ]^{−1} ν(dx)
Ascent Mapping: α_i′ = ∫ α_i f_i(x) [ ∑_d α_d f_d(x) ]^{−2} ν(dx) / ∫ [ ∑_d α_d f_d(x) ]^{−1} ν(dx)
Proof: convexity of x^{−1} & ∑_d α_d = 1
Is there any more principled way of finding the second update?
37 Variance Minimisation
Updating Rule: the empirical version of the previous update is
α_{d,t+1} = ∑_{i=1}^N ω̄_{i,t}² [ h(X_{i,t}) − ∑_{j=1}^N ω̄_{j,t} h(X_{j,t}) ]² 1_d(K_{i,t}) / ∑_{i=1}^N ω̄_{i,t}² [ h(X_{i,t}) − ∑_{j=1}^N ω̄_{j,t} h(X_{j,t}) ]²
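One step of this empirical update can be sketched on a single importance-sampling population (the target, the three independent Gaussian proposals and h(x) = x are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(5)

def norm_pdf(u, s):
    return np.exp(-0.5 * (u / s)**2) / (s * np.sqrt(2 * np.pi))

alpha = np.array([1 / 3, 1 / 3, 1 / 3])   # current mixture proportions
scales = np.array([1.0, 3.0, 0.5])        # three independent Gaussian proposals
N = 100_000
k = rng.choice(3, size=N, p=alpha)                         # kernel indicators K_i
x = scales[k] * rng.standard_normal(N)                     # X_i ~ q_{K_i}
mix = sum(alpha[d] * norm_pdf(x, scales[d]) for d in range(3))
w_bar = norm_pdf(x, 1.0) / mix                             # pi / q_alpha
w_bar /= w_bar.sum()
h = x                                                      # h(x) = x, pi(h) = 0
est = np.sum(w_bar * h)
contrib = w_bar**2 * (h - est)**2          # per-particle variance contribution
alpha_new = np.array([contrib[k == d].sum() for d in range(3)]) / contrib.sum()
```

By construction the new proportions are nonnegative and sum to one; each kernel is reweighted by how much its particles contribute to the estimated variance.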
38 (Toy) Example Again
Example: N(0, 1) target, h(x) = x, and D = 3 independent proposals:
N(0, 1)
Cauchy distribution
Symmetrised Ga(0.5, 0.5) (this is the optimal choice, q*)
Table: PMC estimates for N = 100,000 and T = 20 (columns: t, estimate, α_{1,t}, α_{2,t}, α_{3,t}, variance)
39 PMC for ECOSSTAT
How does this work in real life? The ECOSSTAT project (measuring cosmological parameters from large heterogeneous surveys) is an interdisciplinary study in which we use Monte Carlo methods to infer cosmological parameters from several sets of measurements. PMC is well suited in this context because:
Evaluation of π(x) is prohibitively long, but PMC can be (mostly) parallelised (in contrast to MCMC)
Variance estimation is feasible with PMC, which is important to cosmologists
Because of the physical nature of the parameters, their values are known to some extent, which is more or less required for techniques based on importance sampling
40 PMC for ECOSSTAT
A (Somewhat) Idealised Example: Gaussian ellipsoid target
Figure: Target and mixture evolution
We use the following proposals: random-walk Gaussian proposals with covariance Σ, 2Σ and 4Σ; independent Gaussian proposals with covariance Σ/2, Σ and 2Σ; and the uniform distribution.
41 PMC for ECOSSTAT: Results
Remark: we need at least one q_d, with non-zero α_d, for which the IS variance is finite.
Table: mixture proportions, normalised variance and ESS of h for the uniform, RW 2Σ and independent Σ proposals, under the Kullback and variance criteria
Normalised ESS (effective sample size): (N ∑_{i=1}^N ω̄_{i,t}²)^{−1}
The function h is h(x) = (1 1)^T x
42 PMC for ECOSSTAT
Typical Results with N = 10,000 Particles and T = 50 Iterations
Figure: Values of α_{d,t} as a function of t (Kullback criterion), for the proposals RW Σ, RW 2Σ, RW 4Σ, Indep. Σ/2, Indep. Σ, Indep. 2Σ and Uniform
43 PMC for ECOSSTAT
Typical Results (contd.)
Figure: Normalised ESS as a function of t
Figure: Estimated variance for the function h as a function of t
Due to the stability of the asymptotic updates, the algorithm performs well even with moderate sample sizes.
44 References
Cappé, Guillin, Marin & Robert. Population Monte Carlo. J. Comput. Graph. Statist., 13(4), 2004.
Douc, Guillin, Marin & Robert. Convergence of adaptive mixtures of importance sampling schemes. Ann. Statist., 35(1), 2007 (to appear).
Douc, Guillin, Marin & Robert. Minimum variance importance sampling via population Monte Carlo. Technical report.
Advertising below this line...
Postdocs interested in the ECOSSTAT project, please contact Christian P. Robert and/or myself
People interested in adaptive Monte Carlo, please check the workshop (June 2007)
More informationWinter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo
Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte
More informationLECTURE 3. Last time:
LECTURE 3 Last time: Mutual Information. Convexity and concavity Jensen s inequality Information Inequality Data processing theorem Fano s Inequality Lecture outline Stochastic processes, Entropy rate
More informationGradient-based Monte Carlo sampling methods
Gradient-based Monte Carlo sampling methods Johannes von Lindheim 31. May 016 Abstract Notes for a 90-minute presentation on gradient-based Monte Carlo sampling methods for the Uncertainty Quantification
More informationA Review of Pseudo-Marginal Markov Chain Monte Carlo
A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the
More informationSequential Monte Carlo Methods
University of Pennsylvania Bradley Visitor Lectures October 23, 2017 Introduction Unfortunately, standard MCMC can be inaccurate, especially in medium and large-scale DSGE models: disentangling importance
More informationLecture 8: The Metropolis-Hastings Algorithm
30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:
More informationThe Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision
The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that
More informationComputer intensive statistical methods
Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of
More informationMarkov Chain Monte Carlo Lecture 1
What are Monte Carlo Methods? The subject of Monte Carlo methods can be viewed as a branch of experimental mathematics in which one uses random numbers to conduct experiments. Typically the experiments
More informationarxiv: v1 [stat.co] 1 Jun 2015
arxiv:1506.00570v1 [stat.co] 1 Jun 2015 Towards automatic calibration of the number of state particles within the SMC 2 algorithm N. Chopin J. Ridgway M. Gerber O. Papaspiliopoulos CREST-ENSAE, Malakoff,
More informationDivide-and-Conquer Sequential Monte Carlo
Divide-and-Conquer Joint work with: John Aston, Alexandre Bouchard-Côté, Brent Kirkpatrick, Fredrik Lindsten, Christian Næsseth, Thomas Schön University of Warwick a.m.johansen@warwick.ac.uk http://go.warwick.ac.uk/amjohansen/talks/
More informationPerturbed Proximal Gradient Algorithm
Perturbed Proximal Gradient Algorithm Gersende FORT LTCI, CNRS, Telecom ParisTech Université Paris-Saclay, 75013, Paris, France Large-scale inverse problems and optimization Applications to image processing
More informationParticle Filtering Approaches for Dynamic Stochastic Optimization
Particle Filtering Approaches for Dynamic Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge I-Sim Workshop,
More informationData assimilation as an optimal control problem and applications to UQ
Data assimilation as an optimal control problem and applications to UQ Walter Acevedo, Angwenyi David, Jana de Wiljes & Sebastian Reich Universität Potsdam/ University of Reading IPAM, November 13th 2017
More informationZig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017
Zig-Zag Monte Carlo Delft University of Technology Joris Bierkens February 7, 2017 Joris Bierkens (TU Delft) Zig-Zag Monte Carlo February 7, 2017 1 / 33 Acknowledgements Collaborators Andrew Duncan Paul
More informationIterative Markov Chain Monte Carlo Computation of Reference Priors and Minimax Risk
Iterative Markov Chain Monte Carlo Computation of Reference Priors and Minimax Risk John Lafferty School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 lafferty@cs.cmu.edu Abstract
More informationComputer Practical: Metropolis-Hastings-based MCMC
Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov
More informationChapter 12 PAWL-Forced Simulated Tempering
Chapter 12 PAWL-Forced Simulated Tempering Luke Bornn Abstract In this short note, we show how the parallel adaptive Wang Landau (PAWL) algorithm of Bornn et al. (J Comput Graph Stat, to appear) can be
More informationBrief Review on Estimation Theory
Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on
More informationSequential Monte Carlo Methods
National University of Singapore KAUST, October 14th 2014 Monte Carlo Importance Sampling Markov chain Monte Carlo Sequential Importance Sampling Resampling + Weight Degeneracy Path Degeneracy Algorithm
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Lecture February Arnaud Doucet
Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 13-28 February 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Limitations of Gibbs sampling. Metropolis-Hastings algorithm. Proof
More informationMonte Carlo Approximation of Monte Carlo Filters
Monte Carlo Approximation of Monte Carlo Filters Adam M. Johansen et al. Collaborators Include: Arnaud Doucet, Axel Finke, Anthony Lee, Nick Whiteley 7th January 2014 Context & Outline Filtering in State-Space
More informationAn introduction to adaptive MCMC
An introduction to adaptive MCMC Gareth Roberts MIRAW Day on Monte Carlo methods March 2011 Mainly joint work with Jeff Rosenthal. http://www2.warwick.ac.uk/fac/sci/statistics/crism/ Conferences and workshops
More informationOverlapping block proposals for latent Gaussian Markov random fields
NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET Overlapping block proposals for latent Gaussian Markov random fields by Ingelin Steinsland and Håvard Rue PREPRINT STATISTICS NO. 8/3 NORWEGIAN UNIVERSITY
More informationBayesian inference for multivariate skew-normal and skew-t distributions
Bayesian inference for multivariate skew-normal and skew-t distributions Brunero Liseo Sapienza Università di Roma Banff, May 2013 Outline Joint research with Antonio Parisi (Roma Tor Vergata) 1. Inferential
More informationBayesian estimation of the discrepancy with misspecified parametric models
Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012
More informationLTCC: Advanced Computational Methods in Statistics
LTCC: Advanced Computational Methods in Statistics Advanced Particle Methods & Parameter estimation for HMMs N. Kantas Notes at http://wwwf.imperial.ac.uk/~nkantas/notes4ltcc.pdf Slides at http://wwwf.imperial.ac.uk/~nkantas/slides4.pdf
More informationMinimum Message Length Analysis of the Behrens Fisher Problem
Analysis of the Behrens Fisher Problem Enes Makalic and Daniel F Schmidt Centre for MEGA Epidemiology The University of Melbourne Solomonoff 85th Memorial Conference, 2011 Outline Introduction 1 Introduction
More information13 Notes on Markov Chain Monte Carlo
13 Notes on Markov Chain Monte Carlo Markov Chain Monte Carlo is a big, and currently very rapidly developing, subject in statistical computation. Many complex and multivariate types of random data, useful
More informationPattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods
Pattern Recognition and Machine Learning Chapter 11: Sampling Methods Elise Arnaud Jakob Verbeek May 22, 2008 Outline of the chapter 11.1 Basic Sampling Algorithms 11.2 Markov Chain Monte Carlo 11.3 Gibbs
More information