Sampling multimodal densities in high dimensional sampling space


1 Sampling multimodal densities in high dimensional sampling space. Gersende FORT, LTCI, CNRS & Telecom ParisTech, Paris, France. Journées MAS, Toulouse, August 2014

2 Introduction. Sample from a target distribution π dλ on X ⊆ R^l, where π is (possibly) known only up to a normalizing constant. Hereafter, to simplify the notation, π is assumed to be normalized. In this context, π is multimodal and the dimension l is large. Research guided by computational Bayesian statistics: π is the a posteriori distribution, known up to a normalizing constant. Needed: algorithms to explore π, to compute expectations w.r.t. π, ...

3 Introduction Talk based on joint works with Eric Moulines, Amandine Schreck (Telecom ParisTech) Pierre Priouret (Paris VI) Benjamin Jourdain, Tony Lelièvre, Gabriel Stoltz (ENPC) Estelle Kuhn (INRA)

4 Introduction. Outline: Introduction (Usual Monte Carlo samplers; The proposal mechanism; Adaptive Monte Carlo samplers; Conclusion); Tempering-based Monte Carlo samplers; Biasing Potential-based Monte Carlo sampler; Convergence Analysis

5 Introduction / Usual Monte Carlo samplers. Markov chain Monte Carlo (MCMC): sample a Markov chain (X_k)_k having π as its unique invariant distribution. Approximation: π_n = (1/n) Σ_{k=1}^n δ_{X_k}. Example: Hastings-Metropolis algorithm with proposal kernel q(x,y): given X_k, sample Y ~ q(X_k, ·); accept-reject mechanism: X_{k+1} = Y with probability 1 ∧ [π(Y) q(Y, X_k)] / [π(X_k) q(X_k, Y)], and X_{k+1} = X_k otherwise.

6 Introduction / Usual Monte Carlo samplers. Markov chain Monte Carlo (MCMC): sample a Markov chain (X_k)_k having π as its unique invariant distribution. Approximation: π_n = (1/n) Σ_{k=1}^n δ_{X_k}. Example: Hastings-Metropolis algorithm with proposal kernel q(x,y): given X_k, sample Y ~ q(X_k, ·); accept-reject mechanism: X_{k+1} = Y with probability 1 ∧ [π(Y) q(Y, X_k)] / [π(X_k) q(X_k, Y)], and X_{k+1} = X_k otherwise. Importance Sampling (IS): sample i.i.d. points (X_k)_k with density q, a proposal distribution chosen by the user. Approximation: π_n = (1/n) Σ_{k=1}^n [π(X_k)/q(X_k)] δ_{X_k}.
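The Hastings-Metropolis step described above can be sketched in a few lines of Python (an illustrative sketch only; the function and variable names are ours, not from the talk). With a symmetric random-walk proposal, the acceptance ratio reduces to π(Y)/π(X_k):

```python
import numpy as np

def metropolis_hastings(log_pi, x0, sigma, n_steps, rng):
    """Random-walk Hastings-Metropolis with symmetric Gaussian proposal,
    so the acceptance ratio reduces to pi(Y)/pi(X_k)."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    chain = np.empty((n_steps, x.size))
    for t in range(n_steps):
        y = x + sigma * rng.standard_normal(x.size)   # Y ~ q(X_k, .)
        # accept with probability 1 ^ pi(Y)/pi(X_k), computed in log scale
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x):
            x = y
        chain[t] = x                                  # otherwise stay at X_k
    return chain

rng = np.random.default_rng(0)
# target: standard 1-d Gaussian, known up to its normalizing constant
chain = metropolis_hastings(lambda x: -0.5 * float(x @ x), 0.0, 2.0, 20000, rng)
```

The empirical mean and variance of the chain then approximate those of π.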

7 Introduction / The proposal mechanism. The proposal mechanism: MCMC. Toy example: Hastings-Metropolis algorithm with Gaussian proposal kernel q(x,y) ∝ exp(−(1/2)(y−x)^T Σ^{-1}(y−x)). Acceptance-rejection ratio: 1 ∧ π(Y)/π(X_k). Fig.: for three different values of Σ, [top] plot of the chain (in R); [bottom] autocorrelation function.

8 Introduction / The proposal mechanism. The proposal mechanism: Importance Sampling (1/2). Toy example: compute ∫_R x π(x) dx when π is the Student t(3) density, π(x) ∝ (1 + x²/3)^{-2}. Consider in turn the proposal q equal to a Student t(1) and then to a normal N(0,1). [Figures: number of samples for q ~ t(1) and for q ~ N(0,1); plot of the densities q (green, blue) and π (red); boxplots computed from independent runs of the algorithm.]

9 Introduction / The proposal mechanism. The proposal mechanism: Importance Sampling (2/2). The efficiency of the algorithm depends on the proposal distribution q: if a few weights are large and all the others are negligible, the approximation is likely to be inaccurate. Monitoring the convergence: there exist criteria measuring the proportion of ineffective draws, such as the coefficient of variation, the effective sample size and the normalized perplexity.
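The three degeneracy criteria named above can be computed directly from the normalized weights (a sketch under our own conventions; the talk does not fix formulas, but these are the standard definitions, with ESS = n/(1+CV²)):

```python
import numpy as np

def is_diagnostics(log_w):
    """Degeneracy criteria for importance sampling, computed from the
    unnormalized log-weights log(pi(X_k)/q(X_k))."""
    w = np.exp(log_w - np.max(log_w))             # stabilized weights
    wn = w / w.sum()                              # normalized weights, mean 1/n
    n = wn.size
    ess = 1.0 / np.sum(wn ** 2)                   # effective sample size, in [1, n]
    cv2 = n * np.sum((wn - 1.0 / n) ** 2)         # squared coefficient of variation
    entropy = -np.sum(wn * np.log(wn))            # Shannon entropy of the weights
    perp = np.exp(entropy) / n                    # normalized perplexity, in (0, 1]
    return ess, cv2, perp
```

With uniform weights ESS = n, CV² = 0 and perplexity = 1; a single dominating weight drives ESS towards 1.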

10 Introduction / Adaptive Monte Carlo samplers. To fix some design parameters and make the samplers more efficient, adaptive Monte Carlo samplers were proposed. Adaptive algorithms: - The optimal design parameters are defined as the solutions of an optimality criterion; in practice, this criterion cannot be solved explicitly. - Based on the past history of the sampler, solve an approximation of this criterion and compute the design parameters for the current run of the sampler. - Repeat the scheme: adaptation/sampling.

11 Introduction / Adaptive Monte Carlo samplers. Adaptive MC sampler: example of adaptive MCMC (1/2). Adaptive Hastings-Metropolis algorithm with Gaussian proposal distribution q_Σ(x,y) ∝ exp(−(1/2)(y−x)^T Σ^{-1}(y−x)). Design parameter: the covariance matrix Σ. Optimality criterion: by using the scaling approach for Markov chains, it is advocated to take Σ = (2.38)²/l × (covariance of π) — pioneering work: Roberts, Gelman, Gilks (1997). Iterative algorithm, Haario, Saksman, Tamminen (2001): Adaptation: update the covariance matrix Σ_t = (2.38)²/l × Σ_t^{(π)}, where Σ_t^{(π)} is the current empirical covariance of the chain. Sampling: one step of a Hastings-Metropolis algorithm with proposal q_{Σ_t} to sample X_{t+1}.
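The adaptation step above amounts to a recursive update of the running mean and covariance of the chain. A minimal sketch (our own naming; the step size γ_t = 1/(t+1) is one common choice, and this recursion only approximates the exact empirical covariance up to an O(γ²) term):

```python
import numpy as np

def am_update(mu, cov, x, t):
    """One recursive update of the running mean/covariance after the t-th
    draw x (t >= 0), in the spirit of Haario, Saksman and Tamminen."""
    gamma = 1.0 / (t + 1)
    dx = x - mu
    mu = mu + gamma * dx
    cov = cov + gamma * (np.outer(dx, dx) - cov)
    # scaling rule: Sigma_t = (2.38)^2 / l * empirical covariance
    return mu, cov, (2.38 ** 2 / x.size) * cov

rng = np.random.default_rng(1)
mu, cov = np.zeros(2), np.eye(2)
for t, x in enumerate(rng.standard_normal((5000, 2))):
    mu, cov, prop_cov = am_update(mu, cov, x, t)
```

After many draws from a standard 2-d Gaussian, `cov` is close to the identity and `prop_cov` to (2.38²/2)·I.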

12 Introduction / Adaptive Monte Carlo samplers. Adaptive MC sampler: example of adaptive MCMC (2/2). Nevertheless, this recipe is not designed for every context. Example: multimodality. Target distribution: a mixture of Gaussians in R²; the means of the Gaussians are indicated with a red cross. [left] 5·10⁶ i.i.d. draws; [right] adaptive Hastings-Metropolis, 5·10⁶ draws.

13 Introduction / Adaptive Monte Carlo samplers. Adaptive MC sampler: example of Adaptive Importance Sampling (1/2). Design parameter: the proposal distribution. Optimality criterion: choose the proposal density q among a (parametric) family Q as the solution of argmin_{q∈Q} ∫ log(π(x)/q(x)) π(x) λ(dx) = argmax_{q∈Q} ∫ log q(x) π(x) λ(dx). Iterative algorithm: O. Cappé, A. Guillin, J.M. Marin, C. Robert (2004). Adaptation: update the sampling distribution q_t = argmax_{q∈Q} (1/n) Σ_{k=1}^n [π(X_k^{(t-1)}) / q_{t-1}(X_k^{(t-1)})] log q(X_k^{(t-1)}). Sampling: draw points (X_k^{(t)})_k from q_t + importance reweighting π_n = (1/n) Σ_{k=1}^n [π(X_k^{(t)}) / q_t(X_k^{(t)})] δ_{X_k^{(t)}}.
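When Q is the Gaussian family, the weighted maximization in the adaptation step has a closed form: the maximizer is the weighted empirical mean and covariance of the current sample. A sketch (our own names; this is the Gaussian special case, not the general algorithm):

```python
import numpy as np

def fit_gaussian_proposal(x, log_w):
    """Solves argmax_q sum_k w_k log q(X_k) over the Gaussian family:
    the maximizer is the weighted empirical mean and covariance.
    x: (n, dim) array of draws; log_w: unnormalized log-weights."""
    w = np.exp(log_w - np.max(log_w))
    w = w / w.sum()                    # normalized importance weights
    mu = w @ x                         # weighted mean, shape (dim,)
    d = x - mu
    cov = (w[:, None] * d).T @ d       # weighted covariance
    return mu, cov

# with equal weights this reduces to the plain empirical mean/covariance
mu, cov = fit_gaussian_proposal(np.array([[0.0, 0.0], [2.0, 2.0]]), np.zeros(2))
```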

14 Introduction / Adaptive Monte Carlo samplers. Adaptive MC sampler: example of Adaptive Importance Sampling (2/2). Nevertheless, it is known that such Importance Sampling techniques are not robust to the dimension: when sampling on R^l with l > 5, the degeneracy of the importance ratios π(X_k)/q(X_k) cannot be avoided.

15 Introduction / Conclusion. Usual adaptive Monte Carlo samplers are not robust (enough) to: multimodality of the target distribution π — how to jump from mode to mode; large dimension of the sampling space. Importance Sampling: the ratio π(x)/q(x) degenerates. MCMC: the acceptance ratio π(y)q(y,x) / [π(x)q(x,y)] = π(y)/π(x) when q is a symmetric kernel. New Monte Carlo samplers combine tempering techniques and/or biasing-potential techniques with sampling steps.

16 Tempering-based Monte Carlo samplers Outline Introduction Tempering-based Monte Carlo samplers The Equi-Energy sampler Biasing Potential-based Monte Carlo sampler Convergence Analysis

17 Tempering-based Monte Carlo samplers. Tempering: the idea. Learn a well-fitted proposal mechanism by considering tempered versions π^{1/T} (T > 1) of the target distribution π. [Figure: a multimodal target density and its flatter tempered version.] Hereafter, an example where tempering is plugged into an MCMC sampler.
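Why tempering helps can be checked on a one-line example (our own illustration, not from the talk): dividing log π by T > 1 raises the relative weight of the low-density barrier between two modes, so a chain crosses it more easily.

```python
import numpy as np

def log_tempered(log_pi, T):
    """pi^(1/T) up to normalization: dividing log pi by T > 1 flattens
    the density and lowers the barriers between modes."""
    return lambda x: log_pi(x) / T

# bimodal example: equal mixture of N(-2, 1) and N(2, 1), up to a constant
def log_pi(x):
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

# ratio (density at the barrier x = 0) / (density at a mode x = 2)
cold = np.exp(log_pi(0.0) - log_pi(2.0))
hot = np.exp(log_tempered(log_pi, 4.0)(0.0) - log_tempered(log_pi, 4.0)(2.0))
```

Here `hot` = `cold`^(1/4): the tempered barrier is much less pronounced.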

18 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (1/6). Kou, Zhou and Wong (2006). In the MCMC proposal mechanism, allow picking a point from an auxiliary process designed to have better mixing properties. [Diagram: auxiliary process (Y_t)_t with target π^β, feeding the process of interest (X_t)_t with target π.]

19 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (2/6). Algorithm: at iteration t, given the current state X_t and the samples Y_1, ..., Y_t from the auxiliary process: with probability 1−ɛ, draw X_{t+1} from an MCMC kernel with invariant distribution π; with probability ɛ, choose a point Y_l among the auxiliary samples in the same energy level as X_t and accept/reject the move X_{t+1} = Y_l. [Figure: target and tempered distributions, energy-ring boundaries, a local move and an equi-energy jump from the current state.]
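One iteration of this scheme can be sketched as follows (a simplified illustration with our own names: we draw the jump proposal uniformly from all auxiliary samples instead of restricting to the energy ring of X_t, and π^β denotes the tempered auxiliary target):

```python
import numpy as np

def ee_step(x, aux, log_pi, beta, eps, local_step, rng):
    """One equi-energy iteration (sketch, energy rings omitted)."""
    if rng.uniform() > eps:
        return local_step(x)                     # local MCMC move targeting pi
    y = aux[rng.integers(len(aux))]              # equi-energy jump proposal
    # accept with ratio pi(y) pi^beta(x) / (pi(x) pi^beta(y))
    if np.log(rng.uniform()) < (1.0 - beta) * (log_pi(y) - log_pi(x)):
        return y
    return x
```

With β close to 1 the auxiliary and target laws agree and jumps are almost always accepted; with β small, the acceptance test corrects for the flatter auxiliary law.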

20 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (3/6), numerical illustration. π is a mixture of Gaussian distributions, with 4 auxiliary processes π^{β_4}, ..., π^{β_1}, where 0 < β_4 < ... < β_1 < 1. [Figure: for each temperature, the target density at that temperature, the draws, and the means of the components; the last panel shows the target π, a mixture of 2-dim Gaussians.]

21 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (4/6), numerical illustration. Schreck, F. and Moulines (2013). Problem: motif sampling in biological sequences. Objective: find where motifs (subsequences of length w) are in a DNA sequence of length L. Observation: (s_1, ..., s_L) with s_l ∈ {A, C, G, T}. Quantity of interest: the motif positions, collected in (a_1, ..., a_L) with a_j ∈ {0, ..., w}.

22 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (4/6), numerical illustration. Schreck, F. and Moulines (2013). Result: EE with 4 auxiliary chains and 3 energy rings.

23 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (5/6), design parameters. Design parameters: the probability of interaction ɛ; the number of auxiliary processes and the scale of the β_i; the energy rings; the MCMC kernels for the local moves. Convergence analysis: Andrieu, Jasra, Doucet and Del Moral (2007, 2008); F., Moulines, Priouret (2011); F., Moulines, Priouret and Vandekerkhove (2013). Adaptive version of the Equi-Energy sampler: Schreck, F. and Moulines (2013); Baragatti, Grimaud and Pommeret (2013).

24 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (6/6), transition kernel. Let us describe the conditional distribution of X_{t+1} given the past: P_{θ_t}(X_t, A) = (1−ɛ) P(X_t, A) + ɛ ∫_A g(X_t, y) θ_t(dy) [proposition with selection], where θ_t = (1/t) Σ_{k=1}^t δ_{Y_k}.

25 Tempering-based Monte Carlo samplers / The Equi-Energy sampler. Example: Equi-Energy sampler (6/6), transition kernel. Let us describe the conditional distribution of X_{t+1} given the past: P_{θ_t}(X_t, A) = (1−ɛ) P(X_t, A) + ɛ ∫_A (1 ∧ [π(y) g(y, X_t) π^β(X_t)] / [π(X_t) g(X_t, y) π^β(y)]) [acceptance-rejection] g(X_t, y) θ_t(dy) [proposition with selection], where θ_t = (1/t) Σ_{k=1}^t δ_{Y_k}.

27 Biasing Potential-based Monte Carlo sampler Outline Introduction Tempering-based Monte Carlo samplers Biasing Potential-based Monte Carlo sampler Wang-Landau samplers Convergence Analysis

28 Biasing Potential-based Monte Carlo sampler. The idea. Among the Importance Sampling Monte Carlo samplers: π_n = (1/n) Σ_{t=1}^n [π(X_t)/q*(X_t)] δ_{X_t}, where (X_t)_t approximates q*. Idea from the molecular dynamics field; see e.g. Chopin, Lelièvre and Stoltz (2012) for the extension to computational statistics. Choose a proposal distribution of the form q*(x) ∝ π(x) exp(−A(ξ(x))), where A(ξ(x)) is a biasing potential depending on a few directions of metastability ξ(x), and such that q* is less multimodal than π. Example: consider a partition of X into d strata X_1, ..., X_d and set ξ(x) = i for any x ∈ X_i.

29 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Example: Wang-Landau algorithms (1/4). Wang and Landau (2001) — a very popular algorithm in the molecular dynamics field. Wang-Landau type algorithms learn the proposal distribution q* adaptively: at the same time, they learn the proposal distribution, sample points X_k approximating it, and compute the associated importance weights. At iteration t: 1. Proposal distribution: form an approximation q_t of q*; draw X_t approximating q_t, and compute its associated importance weight π(X_t)/q_t(X_t). 2. Approximate the target: π_n = (1/n) Σ_{t=1}^n [π(X_t)/q_t(X_t)] δ_{X_t}.

30 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Wang-Landau algorithms (2/4). Key idea: q* is obtained by locally biasing the target distribution: q*(x) ∝ Σ_{i=1}^d [π(x)/θ*(i)] I_{X_i}(x), where X_1, ..., X_d is a partition of the sampling space X and the weights θ*(i) are given by θ*(i) = ∫_{X_i} π(x) dx. With this biasing strategy, the proposal distribution visits each stratum X_i with the same frequency: ∫_{X_i} q*(x) dx = 1/d.

31 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Wang-Landau algorithms (2/4). Key idea: q* is obtained by locally biasing the target distribution: q*(x) ∝ Σ_{i=1}^d [π(x)/θ*(i)] I_{X_i}(x), where X_1, ..., X_d is a partition of the sampling space X and the weights θ*(i) are given by θ*(i) = ∫_{X_i} π(x) dx. With this biasing strategy, the proposal distribution visits each stratum X_i with the same frequency: ∫_{X_i} q*(x) dx = 1/d. Unfortunately: the θ*(i) are unknown and have to be learnt on the fly; exact sampling under q* is not possible, but it can be replaced by an MCMC step.

32 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Wang-Landau algorithms (3/4). Wang-Landau algorithm: at iteration t, given the current point X_t and the current bias θ_t = (θ_t(1), ..., θ_t(d)): 1. Draw a new point X_{t+1} via MCMC with invariant distribution q_{θ_t}(x) ∝ Σ_{i=1}^d [π(x)/θ_t(i)] I_{X_i}(x). 2. Update the bias θ_{t+1}. 3. In parallel, update the approximation of π: π_n ∝ (1/n) Σ_{t=1}^n d [Σ_{i=1}^d θ_t(i) I_{X_i}(X_t)] δ_{X_t}.
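The full loop can be sketched in Python (an illustration under our own conventions: random-walk MH for step 1, the first update rule of the next slide for step 2, and step sizes γ_t = 1/(t+1)):

```python
import numpy as np

def wang_landau(log_pi, stratum, d, x0, sigma, n_steps, rng):
    """Wang-Landau sketch: random-walk MH targeting pi(x)/theta(stratum(x)),
    with the stratum weights theta learnt on the fly."""
    theta = np.full(d, 1.0 / d)          # current bias, a probability vector
    x = float(x0)
    visits = np.zeros(d)
    for t in range(1, n_steps + 1):
        y = x + sigma * rng.standard_normal()
        log_ratio = (log_pi(y) - np.log(theta[stratum(y)])
                     - log_pi(x) + np.log(theta[stratum(x)]))
        if np.log(rng.uniform()) < log_ratio:    # MH step with target q_theta
            x = y
        i = stratum(x)
        visits[i] += 1.0
        gamma, ti = 1.0 / (t + 1), theta[i]
        theta = theta - gamma * ti * theta       # penalize the visited stratum
        theta[i] = ti + gamma * ti * (1.0 - ti)  # theta stays a probability vector
    return theta, visits / n_steps

rng = np.random.default_rng(2)
# standard Gaussian target, two strata x < 0 and x >= 0 (true weights 1/2, 1/2)
theta, occ = wang_landau(lambda x: -0.5 * x * x,
                         lambda x: 0 if x < 0.0 else 1,
                         2, 0.5, 1.0, 20000, rng)
```

On this symmetric example both the learnt bias and the occupation frequencies settle near (1/2, 1/2), as the flat-histogram property predicts.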

33 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Wang-Landau algorithms (4/4). To learn θ* on the fly: different strategies in the literature, based on Stochastic Approximation algorithms with controlled Markov chain dynamics (X_t)_t: θ_{t+1}(i) = θ_t(i) + γ_{t+1} H_i(θ_t, X_{t+1}), where H_i is chosen so that: it penalizes the stratum currently visited, i.e. H_i(θ_t, X_{t+1}) > 0 iff X_{t+1} ∈ X_i; and the mean field function θ ↦ ∫ H(θ, x) q_θ(x) dx admits θ* as its unique root.

34 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Wang-Landau algorithms (4/4). To learn θ* on the fly: different strategies in the literature, based on Stochastic Approximation algorithms with controlled Markov chain dynamics (X_t)_t: θ_{t+1}(i) = θ_t(i) + γ_{t+1} H_i(θ_t, X_{t+1}), where H_i is chosen so that: it penalizes the stratum currently visited, i.e. H_i(θ_t, X_{t+1}) > 0 iff X_{t+1} ∈ X_i; and the mean field function θ ↦ ∫ H(θ, x) q_θ(x) dx admits θ* as its unique root. Two examples of updating rules: (i) if X_{t+1} ∈ X_i, set θ_{t+1}(i) = θ_t(i) + γ_{t+1} θ_t(i)(1 − θ_t(i)) and θ_{t+1}(k) = θ_t(k) − γ_{t+1} θ_t(i) θ_t(k) for k ≠ i; (ii) set S_{t+1}(j) = S_t(j) + γ θ_t(j) I_{X_j}(X_{t+1}), then θ_{t+1}(j) = S_{t+1}(j) / Σ_{r=1}^d S_{t+1}(r).

35 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Transition kernel. The conditional distribution of X_{t+1} given the past is an MCMC kernel with invariant distribution q_{θ_t}, denoted by P_{θ_t}. Example: Hastings-Metropolis with Gaussian proposal distribution: P_θ(x, A) = ∫_A (1 ∧ [π(y) θ(str(x))] / [π(x) θ(str(y))]) N(x, Σ)[dy] + δ_x(A) ∫ (1 − 1 ∧ [π(y) θ(str(x))] / [π(x) θ(str(y))]) N(x, Σ)[dy], where str(x) is the index of the stratum containing x.

36 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Numerical illustration, a toy example. Target distribution: a mixture of Gaussians in R²; the means of the Gaussians are indicated with a red cross. Wang-Landau algorithm: 5 strata, obtained by partitioning the energy levels. 5·10⁶ draws approximating q*: the sampler was able to jump the deep valleys and draw points around all the modes.

37 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Numerical illustration: structure of a protein. In biophysics: the structure of a protein from its sequence. AB model: two types of monomers, A (hydrophobic) and B (hydrophilic), linked by rigid bonds of unit length to form (2D) chains. Given a sequence, what is the optimal shape of the N monomers? Minimize the energy function H(x), where x = (x_{1,2}, x_{2,3}, ..., x_{N−2,N−1}) ∈ [−π, π]^{N−2} and H(x) = (1/4) Σ_{i=1}^{N−2} (1 − cos(x_{i,i+1})) + 4 Σ_{i=1}^{N−2} Σ_{j=i+2}^{N} (r_ij^{−12} − C(σ_i, σ_j) r_ij^{−6}); x_{i,i+1} is the angle between the i-th and (i+1)-th bond vectors, r_ij is the distance between monomers i and j, and C(σ_i, σ_j) = 1 (resp. 1/2 and −1/2) between monomers AA (resp. BB and AB).
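The energy H can be evaluated directly from the bond angles (an illustrative sketch with our own names; we use the standard AB-model couplings C(AA)=1, C(BB)=1/2, C(AB)=−1/2, since the slide's values are garbled in the transcription):

```python
import numpy as np

# couplings C(sigma_i, sigma_j) of the standard 2D AB model (assumption)
COUP = {('A', 'A'): 1.0, ('B', 'B'): 0.5, ('A', 'B'): -0.5, ('B', 'A'): -0.5}

def ab_energy(angles, seq):
    """AB-model energy of a 2D chain of N = len(seq) monomers; `angles`
    holds the N-2 bond angles x_{i,i+1}, bonds have unit length."""
    n = len(seq)
    dirs = np.concatenate(([0.0], np.cumsum(angles)))         # bond directions
    bonds = np.stack([np.cos(dirs), np.sin(dirs)], axis=1)    # N-1 unit bonds
    pos = np.vstack([np.zeros(2), np.cumsum(bonds, axis=0)])  # monomer positions
    e = 0.25 * np.sum(1.0 - np.cos(angles))                   # bending term
    for i in range(n):
        for j in range(i + 2, n):                             # non-bonded pairs
            r = np.linalg.norm(pos[j] - pos[i])
            e += 4.0 * (r ** -12 - COUP[(seq[i], seq[j])] * r ** -6)
    return e
```

For a straight 'AAA' chain (one zero angle), the only contribution is the Lennard-Jones term of the pair at distance 2.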

38 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Numerical illustration: structure of a protein. The minimization of H(x) is cast as a sampling problem: min H(x) ↔ max π_n(x), with π_n(x) ∝ exp(−β_n H(x)) for β_n > 0 large.

39 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Numerical illustration: structure of a protein. (left) WL: initial configuration, with energy .945; (center) WL: optimal configuration found, with energy 3.95; (right) optimal configuration in the literature, with energy 3.94.

40 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Design parameters (1/4). Choice of the biasing potential A(ξ(x)), i.e., in the Wang-Landau algorithms: the number of strata and the strata themselves; the update strategy for the bias vector θ_t; the MCMC kernels with target distribution q_{θ_t}. Convergence analysis: Liang (2005); Liang, Liu and Carroll (2007); Atchadé and Liu (2010); Jacob and Ryder (2014); F., Jourdain, Kuhn, Lelièvre and Stoltz (2014a); F., Jourdain, Lelièvre and Stoltz (submitted). Efficiency analysis: F., Jourdain, Kuhn, Lelièvre and Stoltz (2014b); F., Jourdain, Lelièvre and Stoltz (submitted). Adaptive Wang-Landau: Bornn, Jacob, Del Moral and Doucet (2013).

41 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Design parameters (2/4). Role on the limiting behavior of the sampler: convergence occurs whatever the number of strata and the choice of the strata, for many MCMC samplers and many update strategies of the bias vector. Role on the transient phase of the sampler: for example, how long is the exit time from a mode? Let us illustrate the role of some design parameters on the exit time from a mode when: π(x_1, x_2) ∝ exp(−β U(x_1, x_2)) I_{[−R,R]}(x_2); d strata (see the right plot); the chains are initialised in the left mode.

42 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Design parameters (3/4). [Fig.: trajectories of x_1 for two values of β; [left] Wang-Landau, d = 48 strata; [right] Hastings-Metropolis; the red line is at x_1 = 0.]

43 Biasing Potential-based Monte Carlo sampler / Wang-Landau samplers. Design parameters (4/4). F., Jourdain, Lelièvre, Stoltz (2014). We compute the mean exit times t_β from the left mode (time to reach the right mode, x_1 > 0) for different values of d and: [left] a fixed proposal scale σ in the MCMC samplers; [right] a proposal scale σ/d in the MCMC samplers. We observe t_β = C(β, σ, d) exp(βμ), with a slope μ independent of β. [Figure: mean exit time versus β on a log scale, for several values of d; the same slope fits all the curves.]

44 Convergence Analysis Outline Introduction Tempering-based Monte Carlo samplers Biasing Potential-based Monte Carlo sampler Convergence Analysis Controlled Markov chains Sufficient conditions for the cvg in distribution Convergence results

45 Convergence Analysis / Controlled Markov chains. Controlled Markov chains (1/2). These new samplers combine adaptation/interaction and sampling: the draws (X_t)_t are from a controlled Markov chain, E[h(X_{t+1}) | F_t] = ∫ h(y) P_{θ_t}(X_t, dy), where (P_θ, θ ∈ Θ) is a family of Markov kernels having an invariant distribution π_θ. Examples: 1. Wang-Landau: the conditional distribution of X_{t+1} given F_t is an MCMC kernel with invariant distribution q_{θ_t}(x) ∝ Σ_{i=1}^d [π(x)/θ_t(i)] I_{X_i}(x); here π_θ depends on θ and its expression is known. 2. Equi-Energy: the conditional distribution of X_{t+1} given F_t is an MCMC kernel indexed by the empirical distribution θ_t of the auxiliary process; here π_θ exists but its expression is unknown. 3. Adaptive Hastings-Metropolis: the conditional distribution of X_{t+1} given F_t is an MCMC kernel with invariant distribution π and proposal distribution N(X_t, θ_t); here, all the kernels have the same invariant distribution.

46 Convergence Analysis / Controlled Markov chains. Controlled Markov chains (2/2). Question: let (P_θ, θ ∈ Θ) be a family of Markov kernels having the same invariant distribution π. Let (θ_t)_t be an F_t-adapted random process, and draw X_{t+1} | F_t ~ P_{θ_t}(X_t, ·). Does (X_t)_t converge (say, in distribution) to π?

47 Convergence Analysis / Controlled Markov chains. Controlled Markov chains (2/2). No. Question: let (P_θ, θ ∈ Θ) be a family of Markov kernels having the same invariant distribution π. Let (θ_t)_t be an F_t-adapted random process, and draw X_{t+1} | F_t ~ P_{θ_t}(X_t, ·). Does (X_t)_t converge (say, in distribution) to π? Counterexample on {0, 1}: if X_t = 0, draw X_{t+1} ~ P_1(X_t, ·); if X_t = 1, draw X_{t+1} ~ P_2(X_t, ·), where P_l = [t_l, 1−t_l; 1−t_l, t_l]. We have π P_l = π with π ∝ (1, 1), but the transition matrix of (X_t)_t is P = [t_1, 1−t_1; 1−t_2, t_2], with invariant distribution π ∝ (1−t_2, 1−t_1).
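This counterexample can be checked numerically in a few lines (our own illustration of the construction above, with t_1 = 0.9 and t_2 = 0.1 chosen for concreteness):

```python
import numpy as np

def P(t):
    """Symmetric 2-state kernel: doubly stochastic, so the uniform
    distribution (1/2, 1/2) is invariant for every t."""
    return np.array([[t, 1.0 - t], [1.0 - t, t]])

t1, t2 = 0.9, 0.1
# state-dependent choice: row of P(t1) from state 0, row of P(t2) from state 1
K = np.vstack([P(t1)[0], P(t2)[1]])

# invariant distribution of the resulting (homogeneous) chain
evals, evecs = np.linalg.eig(K.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()
```

Each P(t) preserves (1/2, 1/2), yet the state-dependent mixture has invariant distribution (1−t_2, 1−t_1)/(2−t_1−t_2) = (0.9, 0.1): adaptation alone can destroy the target.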

48 Convergence Analysis / Sufficient conditions for the cvg in distribution (1/3). [Diagram: adaptive MCMC path X_{t−N} → X_{t−N+1} → ... → X_t, driven by the kernels P_{θ_{t−N}}(X_{t−N}, ·), ..., P_{θ_{t−1}}(X_{t−1}, ·) with parameters θ_{t−N}, ..., θ_t; frozen chain: the kernel P_{θ_{t−N}} iterated N times from X_{t−N}; in the limit, X_t distributed under π_{θ_{t−N}}.]

49 Convergence Analysis / Sufficient conditions for the cvg in distribution (2/3). Decompose: E[h(X_t) | past_{t−N}] − ∫ h(y) π_θ(dy) = { E[h(X_t) | past_{t−N}] − ∫ h(y) P^N_{θ_{t−N}}(X_{t−N}, dy) } + { ∫ h(y) P^N_{θ_{t−N}}(X_{t−N}, dy) − ∫ h(y) π_{θ_{t−N}}(dy) } + { ∫ h(y) π_{θ_{t−N}}(dy) − ∫ h(y) π_θ(dy) }. Diminishing adaptation condition — roughly speaking, dist(P_θ, P_θ') ≲ dist(θ, θ'): if θ_t and θ_t' are close, then the transition kernels P_{θ_t} and P_{θ_t'} are close also. Containment condition — roughly speaking, lim_{N→∞} dist(P^N_θ, π_θ) = 0 at some rate depending smoothly on θ. Regularity in θ of π_θ, so that lim_t θ_t = θ* implies dist(π_{θ_t}, π_{θ*}) → 0.

50 Convergence Analysis / Sufficient conditions for the cvg in distribution (3/3). F., Moulines, Priouret (2011). Assume: A. (Containment condition) There exists π_θ s.t. π_θ P_θ = π_θ, and for any ɛ > 0 there exists a non-decreasing positive sequence {r_ɛ(n), n ≥ 0} such that lim sup_n r_ɛ(n)/n = 0 and lim sup_n E[ || P^{r_ɛ(n)}_{θ_{n−r_ɛ(n)}}(X_{n−r_ɛ(n)}, ·) − π_{θ_{n−r_ɛ(n)}} ||_TV ] ≤ ɛ. B. (Diminishing adaptation) For any ɛ > 0, lim_n Σ_{j=1}^{r_ɛ(n)} E[ sup_x || P_{θ_{n−r_ɛ(n)+j}}(x, ·) − P_{θ_{n−r_ɛ(n)}}(x, ·) ||_TV ] = 0. C. (Convergence of the invariant distributions) (π_{θ_n})_n converges weakly to π almost surely. Then, for any bounded and continuous function f, lim_n E[f(X_n)] = π(f).

51 Convergence Analysis / Convergence results. The literature provides sufficient conditions for: convergence in distribution of (X_t)_t; a strong law of large numbers for (X_t)_t; a central limit theorem for (X_t)_t. G.O. Roberts, J.S. Rosenthal. Coupling and Ergodicity of Adaptive Markov Chain Monte Carlo Algorithms. J. Appl. Prob. (2007). G. Fort, E. Moulines, P. Priouret. Convergence of Adaptive MCMC Algorithms: Ergodicity and Law of Large Numbers. Ann. Stat. (2011). G. Fort, E. Moulines, P. Priouret, P. Vandekerkhove. A Central Limit Theorem for Adaptive and Interacting Markov Chains. Bernoulli (2013). These conditions were successfully applied to establish the convergence of adaptive Hastings-Metropolis, (adaptive) Equi-Energy, Wang-Landau, ...

52 Convergence Analysis / Convergence results. Example: application to Wang-Landau (1/2). Theorem (F., Jourdain, Kuhn, Lelièvre, Stoltz (2014a)). Under the assumptions of the next slide, for any bounded measurable function f: lim_t E[f(X_t)] = ∫ f(x) q*(x) dλ(x), and lim_T (1/T) Σ_{t=1}^T f(X_t) = ∫ f(x) q*(x) dλ(x) almost surely.

53 Convergence Analysis / Convergence results. Example: application to Wang-Landau (2/2). Theorem (F., Jourdain, Kuhn, Lelièvre, Stoltz (2014a)). Assume: 1. The target distribution π dλ satisfies 0 < inf_X π ≤ sup_X π < ∞ and inf_i π(X_i) > 0. 2. For any θ, P_θ is a Hastings-Metropolis kernel with invariant distribution proportional to Σ_{i=1}^d [π(x)/θ(i)] I_{X_i}(x) and proposal distribution q(x,y) dλ(y) such that inf q > 0. 3. The step-size sequence is non-increasing, positive, with Σ_t γ_t = ∞ and Σ_t γ_t² < ∞.

54 Convergence Analysis / Convergence results. Example: application to Wang-Landau (2/2). Sketch of proof. (1) The containment condition: there exist ρ ∈ (0,1) and C such that sup_θ sup_x || P^t_θ(x, ·) − π_θ ||_TV ≤ C ρ^t. (2) The diminishing adaptation condition: there exists C such that, for any θ, θ', sup_x || P_θ(x, ·) − P_{θ'}(x, ·) ||_TV ≤ C Σ_{i=1}^d |θ(i) − θ'(i)|; and the update of the parameter satisfies |θ_{t+1} − θ_t| ≤ C γ_{t+1} for some C. (3) Convergence of π_{θ_n}: requires proving the convergence of a Stochastic Approximation algorithm with controlled Markov chain dynamics.


More information

Controlled sequential Monte Carlo

Controlled sequential Monte Carlo Controlled sequential Monte Carlo Jeremy Heng, Department of Statistics, Harvard University Joint work with Adrian Bishop (UTS, CSIRO), George Deligiannidis & Arnaud Doucet (Oxford) Bayesian Computation

More information

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods Pattern Recognition and Machine Learning Chapter 11: Sampling Methods Elise Arnaud Jakob Verbeek May 22, 2008 Outline of the chapter 11.1 Basic Sampling Algorithms 11.2 Markov Chain Monte Carlo 11.3 Gibbs

More information

Pseudo-marginal MCMC methods for inference in latent variable models

Pseudo-marginal MCMC methods for inference in latent variable models Pseudo-marginal MCMC methods for inference in latent variable models Arnaud Doucet Department of Statistics, Oxford University Joint work with George Deligiannidis (Oxford) & Mike Pitt (Kings) MCQMC, 19/08/2016

More information

Randomized Quasi-Monte Carlo for MCMC

Randomized Quasi-Monte Carlo for MCMC Randomized Quasi-Monte Carlo for MCMC Radu Craiu 1 Christiane Lemieux 2 1 Department of Statistics, Toronto 2 Department of Statistics, Waterloo Third Workshop on Monte Carlo Methods Harvard, May 2007

More information

The Wang-Landau algorithm in general state spaces: Applications and convergence analysis

The Wang-Landau algorithm in general state spaces: Applications and convergence analysis The Wang-Landau algorithm in general state spaces: Applications and convergence analysis Yves F. Atchadé and Jun S. Liu First version Nov. 2004; Revised Feb. 2007, Aug. 2008) Abstract: The Wang-Landau

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 Sequential parallel tempering With the development of science and technology, we more and more need to deal with high dimensional systems. For example, we need to align a group of protein or DNA sequences

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

Sequential Monte Carlo Methods in High Dimensions

Sequential Monte Carlo Methods in High Dimensions Sequential Monte Carlo Methods in High Dimensions Alexandros Beskos Statistical Science, UCL Oxford, 24th September 2012 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Andrew Stuart Imperial College,

More information

Examples of Adaptive MCMC

Examples of Adaptive MCMC Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 2006; revised January 2008.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the

More information

Sequential Monte Carlo Samplers for Applications in High Dimensions

Sequential Monte Carlo Samplers for Applications in High Dimensions Sequential Monte Carlo Samplers for Applications in High Dimensions Alexandros Beskos National University of Singapore KAUST, 26th February 2014 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Alex

More information

Markov chain Monte Carlo methods in atmospheric remote sensing

Markov chain Monte Carlo methods in atmospheric remote sensing 1 / 45 Markov chain Monte Carlo methods in atmospheric remote sensing Johanna Tamminen johanna.tamminen@fmi.fi ESA Summer School on Earth System Monitoring and Modeling July 3 Aug 11, 212, Frascati July,

More information

Introduction to Markov Chain Monte Carlo & Gibbs Sampling

Introduction to Markov Chain Monte Carlo & Gibbs Sampling Introduction to Markov Chain Monte Carlo & Gibbs Sampling Prof. Nicholas Zabaras Sibley School of Mechanical and Aerospace Engineering 101 Frank H. T. Rhodes Hall Ithaca, NY 14853-3801 Email: zabaras@cornell.edu

More information

Inference in state-space models with multiple paths from conditional SMC

Inference in state-space models with multiple paths from conditional SMC Inference in state-space models with multiple paths from conditional SMC Sinan Yıldırım (Sabancı) joint work with Christophe Andrieu (Bristol), Arnaud Doucet (Oxford) and Nicolas Chopin (ENSAE) September

More information

Kernel Adaptive Metropolis-Hastings

Kernel Adaptive Metropolis-Hastings Kernel Adaptive Metropolis-Hastings Arthur Gretton,?? Gatsby Unit, CSML, University College London NIPS, December 2015 Arthur Gretton (Gatsby Unit, UCL) Kernel Adaptive Metropolis-Hastings 12/12/2015 1

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Surveying the Characteristics of Population Monte Carlo

Surveying the Characteristics of Population Monte Carlo International Research Journal of Applied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 7 (9): 522-527 Science Explorer Publications Surveying the Characteristics of

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet. Stat 535 C - Statistical Computing & Monte Carlo Methods Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Introduction to Markov chain Monte Carlo The Gibbs Sampler Examples Overview of the Lecture

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture) 10-708: Probabilistic Graphical Models 10-708, Spring 2014 18 : Advanced topics in MCMC Lecturer: Eric P. Xing Scribes: Jessica Chemali, Seungwhan Moon 1 Gibbs Sampling (Continued from the last lecture)

More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

Markov Chain Monte Carlo Methods for Stochastic Optimization

Markov Chain Monte Carlo Methods for Stochastic Optimization Markov Chain Monte Carlo Methods for Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U of Toronto, MIE,

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Particle Filtering Approaches for Dynamic Stochastic Optimization

Particle Filtering Approaches for Dynamic Stochastic Optimization Particle Filtering Approaches for Dynamic Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge I-Sim Workshop,

More information

dans les modèles à vraisemblance non explicite par des algorithmes gradient-proximaux perturbés

dans les modèles à vraisemblance non explicite par des algorithmes gradient-proximaux perturbés Inférence pénalisée dans les modèles à vraisemblance non explicite par des algorithmes gradient-proximaux perturbés Gersende Fort Institut de Mathématiques de Toulouse, CNRS and Univ. Paul Sabatier Toulouse,

More information

I. Bayesian econometrics

I. Bayesian econometrics I. Bayesian econometrics A. Introduction B. Bayesian inference in the univariate regression model C. Statistical decision theory D. Large sample results E. Diffuse priors F. Numerical Bayesian methods

More information

Consistency of the maximum likelihood estimator for general hidden Markov models

Consistency of the maximum likelihood estimator for general hidden Markov models Consistency of the maximum likelihood estimator for general hidden Markov models Jimmy Olsson Centre for Mathematical Sciences Lund University Nordstat 2012 Umeå, Sweden Collaborators Hidden Markov models

More information

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte

More information

Approximate Bayesian Computation and Particle Filters

Approximate Bayesian Computation and Particle Filters Approximate Bayesian Computation and Particle Filters Dennis Prangle Reading University 5th February 2014 Introduction Talk is mostly a literature review A few comments on my own ongoing research See Jasra

More information

Simulation - Lectures - Part III Markov chain Monte Carlo

Simulation - Lectures - Part III Markov chain Monte Carlo Simulation - Lectures - Part III Markov chain Monte Carlo Julien Berestycki Part A Simulation and Statistical Programming Hilary Term 2018 Part A Simulation. HT 2018. J. Berestycki. 1 / 50 Outline Markov

More information

Chapter 7. Markov chain background. 7.1 Finite state space

Chapter 7. Markov chain background. 7.1 Finite state space Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider

More information

Bayesian Methods with Monte Carlo Markov Chains II

Bayesian Methods with Monte Carlo Markov Chains II Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3

More information

The two subset recurrent property of Markov chains

The two subset recurrent property of Markov chains The two subset recurrent property of Markov chains Lars Holden, Norsk Regnesentral Abstract This paper proposes a new type of recurrence where we divide the Markov chains into intervals that start when

More information

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R. Ergodic Theorems Samy Tindel Purdue University Probability Theory 2 - MA 539 Taken from Probability: Theory and examples by R. Durrett Samy T. Ergodic theorems Probability Theory 1 / 92 Outline 1 Definitions

More information

Quantitative Non-Geometric Convergence Bounds for Independence Samplers

Quantitative Non-Geometric Convergence Bounds for Independence Samplers Quantitative Non-Geometric Convergence Bounds for Independence Samplers by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 28; revised July 29.) 1. Introduction. Markov chain Monte Carlo (MCMC)

More information

Adaptive HMC via the Infinite Exponential Family

Adaptive HMC via the Infinite Exponential Family Adaptive HMC via the Infinite Exponential Family Arthur Gretton Gatsby Unit, CSML, University College London RegML, 2017 Arthur Gretton (Gatsby Unit, UCL) Adaptive HMC via the Infinite Exponential Family

More information

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC Stat 451 Lecture Notes 07 12 Markov Chain Monte Carlo Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapters 8 9 in Givens & Hoeting, Chapters 25 27 in Lange 2 Updated: April 4, 2016 1 / 42 Outline

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 15-7th March 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Mixture and composition of kernels. Hybrid algorithms. Examples Overview

More information

Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation. Luke Tierney Department of Statistics & Actuarial Science University of Iowa

Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation. Luke Tierney Department of Statistics & Actuarial Science University of Iowa Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation Luke Tierney Department of Statistics & Actuarial Science University of Iowa Basic Ratio of Uniforms Method Introduced by Kinderman and

More information

Multilevel Sequential 2 Monte Carlo for Bayesian Inverse Problems

Multilevel Sequential 2 Monte Carlo for Bayesian Inverse Problems Jonas Latz 1 Multilevel Sequential 2 Monte Carlo for Bayesian Inverse Problems Jonas Latz Technische Universität München Fakultät für Mathematik Lehrstuhl für Numerische Mathematik jonas.latz@tum.de November

More information

LOWER BOUNDS ON THE CONVERGENCE RATES OF ADAPTIVE MCMC METHODS. By Scott C. Schmidler and Dawn B. Woodard Duke University and Cornell University

LOWER BOUNDS ON THE CONVERGENCE RATES OF ADAPTIVE MCMC METHODS. By Scott C. Schmidler and Dawn B. Woodard Duke University and Cornell University Submitted to the Annals of Statistics LOWER BOUNDS ON THE CONVERGENCE RATES OF ADAPTIVE MCMC METHODS By Scott C. Schmidler and Dawn B. Woodard Duke University and Cornell University We consider the convergence

More information

Markov Chain Monte Carlo Methods for Stochastic

Markov Chain Monte Carlo Methods for Stochastic Markov Chain Monte Carlo Methods for Stochastic Optimization i John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U Florida, Nov 2013

More information

SMC 2 : an efficient algorithm for sequential analysis of state-space models

SMC 2 : an efficient algorithm for sequential analysis of state-space models SMC 2 : an efficient algorithm for sequential analysis of state-space models N. CHOPIN 1, P.E. JACOB 2, & O. PAPASPILIOPOULOS 3 1 ENSAE-CREST 2 CREST & Université Paris Dauphine, 3 Universitat Pompeu Fabra

More information

Markov Chains and MCMC

Markov Chains and MCMC Markov Chains and MCMC Markov chains Let S = {1, 2,..., N} be a finite set consisting of N states. A Markov chain Y 0, Y 1, Y 2,... is a sequence of random variables, with Y t S for all points in time

More information

A regeneration proof of the central limit theorem for uniformly ergodic Markov chains

A regeneration proof of the central limit theorem for uniformly ergodic Markov chains A regeneration proof of the central limit theorem for uniformly ergodic Markov chains By AJAY JASRA Department of Mathematics, Imperial College London, SW7 2AZ, London, UK and CHAO YANG Department of Mathematics,

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

A framework for adaptive Monte-Carlo procedures

A framework for adaptive Monte-Carlo procedures A framework for adaptive Monte-Carlo procedures Jérôme Lelong (with B. Lapeyre) http://www-ljk.imag.fr/membres/jerome.lelong/ Journées MAS Bordeaux Friday 3 September 2010 J. Lelong (ENSIMAG LJK) Journées

More information

Bayesian Computation in Color-Magnitude Diagrams

Bayesian Computation in Color-Magnitude Diagrams Bayesian Computation in Color-Magnitude Diagrams SA, AA, PT, EE, MCMC and ASIS in CMDs Paul Baines Department of Statistics Harvard University October 19, 2009 Overview Motivation and Introduction Modelling

More information

arxiv: v4 [math.st] 15 Jan 2014

arxiv: v4 [math.st] 15 Jan 2014 The Annals of Applied Probability 2014, Vol. 24, No. 1, 34 53 DOI: 10.1214/12-AAP913 c Institute of Mathematical Statistics, 2014 arxiv:1110.4025v4 [math.st] 15 Jan 2014 THE WANG LANDAU ALGORITHM REACHES

More information

CONVERGENCE OF THE EQUI-ENERGY SAMPLER AND ITS APPLICATION TO THE ISING MODEL

CONVERGENCE OF THE EQUI-ENERGY SAMPLER AND ITS APPLICATION TO THE ISING MODEL Statistica Sinica 21 (2011), 1687-1711 doi:http://dx.doi.org/10.5705/ss.2009.282 CONVERGENCE OF THE EQUI-ENERGY SAMPLER AND ITS APPLICATION TO THE ISING MODEL Xia Hua and S. C. Kou Massachusetts Institute

More information

Coupling and Ergodicity of Adaptive MCMC

Coupling and Ergodicity of Adaptive MCMC Coupling and Ergodicity of Adaptive MCMC by Gareth O. Roberts* and Jeffrey S. Rosenthal** (March 2005; last revised March 2007.) Abstract. We consider basic ergodicity properties of adaptive MCMC algorithms

More information

Adaptively scaling the Metropolis algorithm using expected squared jumped distance

Adaptively scaling the Metropolis algorithm using expected squared jumped distance Adaptively scaling the Metropolis algorithm using expected squared jumped distance Cristian Pasarica Andrew Gelman January 25, 2005 Abstract Using existing theory on efficient jumping rules and on adaptive

More information

Asymptotics and Simulation of Heavy-Tailed Processes

Asymptotics and Simulation of Heavy-Tailed Processes Asymptotics and Simulation of Heavy-Tailed Processes Department of Mathematics Stockholm, Sweden Workshop on Heavy-tailed Distributions and Extreme Value Theory ISI Kolkata January 14-17, 2013 Outline

More information

Computer intensive statistical methods

Computer intensive statistical methods Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of

More information

Auxiliary Particle Methods

Auxiliary Particle Methods Auxiliary Particle Methods Perspectives & Applications Adam M. Johansen 1 adam.johansen@bristol.ac.uk Oxford University Man Institute 29th May 2008 1 Collaborators include: Arnaud Doucet, Nick Whiteley

More information

Semi-Parametric Importance Sampling for Rare-event probability Estimation

Semi-Parametric Importance Sampling for Rare-event probability Estimation Semi-Parametric Importance Sampling for Rare-event probability Estimation Z. I. Botev and P. L Ecuyer IMACS Seminar 2011 Borovets, Bulgaria Semi-Parametric Importance Sampling for Rare-event probability

More information

On Reparametrization and the Gibbs Sampler

On Reparametrization and the Gibbs Sampler On Reparametrization and the Gibbs Sampler Jorge Carlos Román Department of Mathematics Vanderbilt University James P. Hobert Department of Statistics University of Florida March 2014 Brett Presnell Department

More information

A Repelling-Attracting Metropolis Algorithm for Multimodality

A Repelling-Attracting Metropolis Algorithm for Multimodality A Repelling-Attracting Metropolis Algorithm for Multimodality arxiv:1601.05633v5 [stat.me] 20 Oct 2017 Hyungsuk Tak Statistical and Applied Mathematical Sciences Institute Xiao-Li Meng Department of Statistics,

More information

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018 Graphical Models Markov Chain Monte Carlo Inference Siamak Ravanbakhsh Winter 2018 Learning objectives Markov chains the idea behind Markov Chain Monte Carlo (MCMC) two important examples: Gibbs sampling

More information

Sampling from complex probability distributions

Sampling from complex probability distributions Sampling from complex probability distributions Louis J. M. Aslett (louis.aslett@durham.ac.uk) Department of Mathematical Sciences Durham University UTOPIAE Training School II 4 July 2017 1/37 Motivation

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

More information

ABC methods for phase-type distributions with applications in insurance risk problems

ABC methods for phase-type distributions with applications in insurance risk problems ABC methods for phase-type with applications problems Concepcion Ausin, Department of Statistics, Universidad Carlos III de Madrid Joint work with: Pedro Galeano, Universidad Carlos III de Madrid Simon

More information

Stochastic Approximation in Monte Carlo Computation

Stochastic Approximation in Monte Carlo Computation Stochastic Approximation in Monte Carlo Computation Faming Liang, Chuanhai Liu and Raymond J. Carroll 1 June 26, 2006 Abstract The Wang-Landau algorithm is an adaptive Markov chain Monte Carlo algorithm

More information

University of Toronto Department of Statistics

University of Toronto Department of Statistics Learn From Thy Neighbor: Parallel-Chain and Regional Adaptive MCMC by V. Radu Craiu Department of Statistics University of Toronto and Jeffrey S. Rosenthal Department of Statistics University of Toronto

More information

On the Containment Condition for Adaptive Markov Chain Monte Carlo Algorithms

On the Containment Condition for Adaptive Markov Chain Monte Carlo Algorithms On the Containment Condition for Adaptive Markov Chain Monte Carlo Algorithms Yan Bai, Gareth O. Roberts, and Jeffrey S. Rosenthal Last revised: July, 2010 Abstract This paper considers ergodicity properties

More information

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 14. Sampling Methods Prof. Daniel Cremers 14. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information