MALA versus Random Walk Metropolis Dootika Vats June 4, 2017


Introduction

My research thus far has predominantly been on output analysis for Markov chain Monte Carlo (MCMC). The examples on which I have implemented our methods have been Gibbs samplers, vanilla Metropolis-Hastings samplers, or Metropolis-within-Gibbs samplers, so I have been somewhat distant from the wide variety of samplers that users can choose from. One of the more popular ones is the Metropolis-adjusted Langevin algorithm (MALA), introduced in Roberts and Tweedie (1996) and further studied in Roberts and Rosenthal (1998). MALA is a Metropolis-Hastings sampler with a special proposal distribution: the proposal evaluates the gradient of the log density at the current state and shifts its center by a scaled multiple of this gradient. MALA is based on Langevin diffusions, a connection I am going to ignore in this article due to lack of knowledge at this point.

MALA

Let $\pi(x)$ be the target distribution for the MCMC sampler, defined on a $p$-dimensional space. A generic Metropolis-Hastings sampler at the current value $x$ proposes a value $y$ from a proposal distribution with density $q(x, \cdot)$. The proposed value $y$ is accepted with probability

$$\min\left\{1,\ \frac{\pi(y)\, q(y, x)}{\pi(x)\, q(x, y)}\right\}.$$

Different choices of $q$ lead to different samplers. For MALA, $q(x, y)$ is the density of the distribution

$$N\left(x + \frac{\sigma_1^2}{2} \nabla \log \pi(x),\ \sigma_1^2 D\right),$$

where $\sigma_1^2 > 0$ is the step size and $D$ is a $p \times p$ positive definite matrix. The role of $D$ is similar to that of the covariance matrix in the proposal for the random walk Metropolis sampler. In our examples we simply take $D$ to be the identity matrix. If $\sigma_1^2$ is tuned properly, the MALA proposal moves the center of the proposal distribution up the gradient of the log density. This enables the sampler to move away from the tails faster than more naive samplers.

We will compare the performance of the MALA sampler with that of the random walk Metropolis (RWM) sampler. The RWM sampler uses the proposal distribution

$$N(x,\ \sigma_2^2 D),$$

where the purpose of $\sigma_2^2$ and $D$ is the same as before. Roberts and Rosenthal (1998) concluded that an optimal acceptance rate for MALA is 0.574, and Roberts, Gelman, and Gilks (1997) concluded that the optimal acceptance rate for the RWM is 0.234. We will tune both $\sigma_1^2$ and $\sigma_2^2$ to achieve these acceptance rates.
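For implementation it can help to write the MALA acceptance ratio on the log scale. The following is only a restatement of the formulas above under the assumption $D = I$ (the choice made in the examples), not an additional result. The normalizing constant of the Gaussian proposal is the same in both directions and cancels.

```latex
\log q(x, y) = -\frac{1}{2\sigma_1^2}
  \left\| y - x - \frac{\sigma_1^2}{2}\,\nabla \log \pi(x) \right\|^2
  + \text{const},
\qquad
\log \alpha(x, y) = \min\Bigl\{ 0,\;
  \log \pi(y) - \log \pi(x) + \log q(y, x) - \log q(x, y) \Bigr\}.
```

For the RWM proposal $q(x, y) = q(y, x)$, so the last two terms cancel and only the difference of log target densities remains.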

Example 1: Gaussian

Let the target distribution be a bivariate Normal distribution (nothing too complicated),

$$N_2\left( \begin{pmatrix} 3 \\ 6 \end{pmatrix},\ \begin{pmatrix} 2 & 0.5 \\ 0.5 & 1 \end{pmatrix} \right).$$

We denote the mean vector by $\mu$ and the $2 \times 2$ covariance matrix by $\Sigma$. The density for this distribution satisfies

$$\pi(x) \propto \exp\left\{ -\frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2} \right\},$$

$$\log \pi(x) = \log \text{const} - \frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2},$$

$$\nabla \log \pi(x) = -\Sigma^{-1} (x - \mu).$$

    mu <- c(3, 6)
    Sigma <- matrix(c(2, .5, .5, 1), nrow = 2, ncol = 2)
    Sigma.inv <- solve(Sigma)

    # Calculates the log density of the bivariate normal (up to a constant)
    loglike <- function(x)
    {
      return(as.numeric(- t((x - mu)) %*% Sigma.inv %*% (x - mu)/2))
    }

First I write down the function for the RWM sampler. Note that this proposal is symmetric, so the proposal densities cancel in the acceptance ratio.

    set.seed(100)
    rwm <- function(N = 1e5, sigma)
    {
      chain.rwm <- matrix(0, nrow = N, ncol = 2)
      accept <- 0

      # Starting value is the origin
      chain.rwm[1,] <- c(0,0)
      for(i in 2:N)
      {
        prop <- rnorm(2, mean = chain.rwm[i-1, ], sd = sigma)
        log.ratio <- loglike(prop) - loglike(chain.rwm[i-1, ])

        if(log(runif(1)) < log.ratio)
        {
          chain.rwm[i, ] <- prop
          accept <- accept+1
        } else {
          chain.rwm[i, ] <- chain.rwm[i-1, ]
        }
      }
      return(list("chain" = chain.rwm, "accept" = accept/N))
    }

    out.rwm <- rwm(sigma = 1)
    out.rwm$accept
    ## [1]

    # Calibrated sigma to get close to optimal rate
    out.rwm <- rwm(sigma = 2.5)
    out.rwm$accept
    ## [1]

Coding up the MALA sampler is slightly more complicated since the proposal distribution is no longer symmetric and the proposal densities do not cancel out in the acceptance ratio. Note that the gradient enters as $-\Sigma^{-1}(x - \mu)$, so the drift points toward the mean; with the sign flipped the proposal would drift away from the mode.

    # Calculates the log proposal density (up to a constant) of moving to x from y
    proplike <- function(x, y, sigma)
    {
      # Gradient of the log density at y
      grad <- -Sigma.inv %*% (y - mu)
      mu.m <- y + sigma^2 * grad/2
      return(as.numeric(- t((x - mu.m)) %*% (x - mu.m)/(2*sigma^2)))
    }

    # MALA sampler
    mala <- function(N = 1e5, sigma)
    {
      chain.mala <- matrix(0, nrow = N, ncol = 2)
      accept <- 0

      # Starting value is the origin
      chain.mala[1,] <- c(0,0)
      for(i in 2:N)
      {
        grad <- -Sigma.inv %*% (chain.mala[i-1, ] - mu)
        mu.m <- chain.mala[i-1, ] + sigma^2 * grad/2
        prop <- rnorm(2, mean = mu.m, sd = sigma)

        # Log target ratio plus the log proposal ratio (reverse move minus forward move)
        log.ratio <- loglike(prop) - loglike(chain.mala[i-1, ]) +
          proplike(chain.mala[i-1, ], prop, sigma) -
          proplike(prop, chain.mala[i-1, ], sigma)

        if(log(runif(1)) < log.ratio)
        {
          chain.mala[i, ] <- prop
          accept <- accept+1
        } else {
          chain.mala[i, ] <- chain.mala[i-1, ]
        }
      }
      return(list("chain" = chain.mala, "accept" = accept/N))
    }

    out.mala <- mala(sigma = 1)
    out.mala$accept
    ## [1]

    # Tuning MALA
    out.mala <- mala(sigma = .5)
    out.mala$accept
    ## [1]

To compare the performance of the two samplers, we plot some graphs. The first is the trace plot for the two components.

    par(mfrow = c(1, 2))
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain[,1], 1e4), ylab = "First Component",
         main = "", type = 'l', xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain[,1], 1e4), col = "red")
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain[,2], 1e4), ylab = "Second Component",
         main = "", type = 'l', xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain[,2], 1e4), col = "red")

[Figure: trace plots of the last 1e4 iterations against index, panels "First Component" and "Second Component"; RWM in black, MALA in red.]

The two samplers look similar in their trace plots. MALA looks like it may produce thinner tails and focus on areas of high probability.

    par(mfrow = c(1,2))
    plot(density(out.rwm$chain[,1]), ylab = "First Component", main = "")
    lines(density(out.mala$chain[,1]), col = "red")
    plot(density(out.rwm$chain[,2]), ylab = "Second Component", main = "")
    lines(density(out.mala$chain[,2]), col = "red")

[Figure: kernel density estimates, panels "First Component" and "Second Component"; RWM in black, MALA in red.]

In terms of autocorrelation, we see the following results.

    par(mfrow = c(2,3))
    acf(out.rwm$chain[,1], main = "RWM: First")
    acf(out.rwm$chain[,2], main = "RWM: Second")
    ccf(out.rwm$chain[,1], out.rwm$chain[,2], main = "RWM: CCF")
    acf(out.mala$chain[,1], main = "MALA: First")
    acf(out.mala$chain[,2], main = "MALA: Second")
    ccf(out.mala$chain[,1], out.mala$chain[,2], main = "MALA: CCF")

[Figure: autocorrelation and cross-correlation plots, panels "RWM: First", "RWM: Second", "RWM: CCF", "MALA: First", "MALA: Second", "MALA: CCF".]

Interestingly, in this run MALA produces much higher autocorrelation, and significantly higher cross-correlation. This would imply that the multivariate effective sample size for MALA for estimating the mean of the Normal distribution will be smaller. These quantities are sensitive to the step size and to the sign of the gradient in the drift, so they are worth re-checking whenever the sampler code changes. Here is the implementation.

    library(mcmcse)
    ## mcmcse: Monte Carlo Standard Errors for MCMC
    ## Version created on
    ## copyright (c) 2012, James M. Flegal, University of California,Riverside
    ## John Hughes, University of Minnesota
    ## Dootika Vats, University of Minnesota
    ## For citation information, type citation("mcmcse").
    ## Type help("mcmcse-package") to get started.

    c(multiESS(out.rwm$chain), multiESS(out.mala$chain))
    ## [1]

The effective sample size for a Monte Carlo sample of size 1e5 is much smaller for MALA than it is for the RWM. So both RWM and MALA seem to yield decent density estimates, but for estimation of the mean (3, 6) the RWM is clearly favored over MALA due to the loss of efficiency.
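Because the MALA proposal depends on both the sign and the scale of the gradient, it can be worth checking an analytic gradient against a finite-difference approximation before running the sampler. The following is a minimal sketch for the Gaussian target of Example 1. The helper names `grad.loglike` and `num.grad` are hypothetical and do not appear in the original code.

```r
# Illustrative gradient check; not part of the original analysis.
mu <- c(3, 6)
Sigma <- matrix(c(2, .5, .5, 1), nrow = 2, ncol = 2)
Sigma.inv <- solve(Sigma)

# Log density of the bivariate normal, up to an additive constant
loglike <- function(x) {
  as.numeric(-t(x - mu) %*% Sigma.inv %*% (x - mu) / 2)
}

# Analytic gradient of the log density (hypothetical helper name)
grad.loglike <- function(x) {
  as.numeric(-Sigma.inv %*% (x - mu))
}

# Central finite-difference approximation to the gradient (hypothetical helper name)
num.grad <- function(f, x, h = 1e-5) {
  sapply(seq_along(x), function(j) {
    e <- replace(numeric(length(x)), j, h)  # step of size h in coordinate j
    (f(x + e) - f(x - e)) / (2 * h)
  })
}

# Compare the two at the starting value used by the samplers
x0 <- c(0, 0)
max.err <- max(abs(grad.loglike(x0) - num.grad(loglike, x0)))
print(max.err < 1e-6)
```

If the two gradients disagree, for example because of a dropped minus sign, the proposal drifts away from the mode even though the Metropolis-Hastings correction still keeps the chain valid, and the mixing can look much worse than it should.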

Intuitively I think MALA will work better for fatter-tailed distributions. Let's see.

Example 2: t distribution

Let the target be a t distribution with 5 degrees of freedom,

$$\pi(x) \propto \left(1 + \frac{x^2}{5}\right)^{-3},$$

$$\log \pi(x) = \log \text{const} - 3 \log\left(1 + \frac{x^2}{5}\right),$$

$$\nabla \log \pi(x) = -\frac{6x}{5 + x^2}.$$

    # Calculates the log density of the t distribution with 5 df (up to a constant)
    loglike <- function(x)
    {
      return( -3*log(1 + x^2/5) )
    }

Below is the RWM implementation.

    rwm <- function(N = 1e5, sigma)
    {
      chain.rwm <- numeric(length = N)
      accept <- 0

      # Starting value is far from 0
      chain.rwm[1] <- 10
      for(i in 2:N)
      {
        prop <- rnorm(1, mean = chain.rwm[i-1], sd = sigma)
        log.ratio <- loglike(prop) - loglike(chain.rwm[i-1])

        if(log(runif(1)) < log.ratio)
        {
          chain.rwm[i] <- prop
          accept <- accept+1
        } else {
          chain.rwm[i] <- chain.rwm[i-1]
        }
      }
      return(list("chain" = chain.rwm, "accept" = accept/N))
    }

    out.rwm <- rwm(sigma = 1)
    out.rwm$accept
    ## [1]

    # Calibrated sigma to get close to optimal rate
    out.rwm <- rwm(sigma = 5)
    out.rwm$accept
    ## [1]

As before, the MALA proposal is not symmetric, so the proposal densities do not cancel out in the acceptance ratio.

    # Calculates the log proposal density (up to a constant) of moving to x from y
    proplike <- function(x, y, sigma)
    {
      grad <- -6*y/(5 + y^2)
      mu.m <- y + sigma^2 * grad/2
      return(as.numeric(- t((x - mu.m)) %*% (x - mu.m)/(2*sigma^2)))
    }

    # MALA sampler
    mala <- function(N = 1e5, sigma)
    {
      chain.mala <- numeric(length = N)
      accept <- 0

      # Starting value is far from 0
      chain.mala[1] <- 10
      for(i in 2:N)
      {
        grad <- -6*chain.mala[i-1]/(5 + chain.mala[i-1]^2)
        mu.m <- chain.mala[i-1] + sigma^2 * grad/2
        prop <- rnorm(1, mean = mu.m, sd = sigma)

        # Log target ratio plus the log proposal ratio (reverse move minus forward move)
        log.ratio <- loglike(prop) - loglike(chain.mala[i-1]) +
          proplike(chain.mala[i-1], prop, sigma) -
          proplike(prop, chain.mala[i-1], sigma)

        if(log(runif(1)) < log.ratio)
        {
          chain.mala[i] <- prop
          accept <- accept+1
        } else {
          chain.mala[i] <- chain.mala[i-1]
        }
      }
      return(list("chain" = chain.mala, "accept" = accept/N))
    }

    out.mala <- mala(sigma = 1)
    out.mala$accept
    ## [1]

    # Tuning MALA
    out.mala <- mala(sigma = 2.2)
    out.mala$accept
    ## [1]

    par(mfrow = c(1, 2))
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain, 1e4), main = "", type = 'l', xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain, 1e4), col = "red")
    plot(density(out.rwm$chain), main = "")
    lines(density(out.mala$chain), col = "red")

[Figure: left, trace plot of the last 1e4 iterations against index; right, kernel density estimates. RWM in black, MALA in red.]

Ah, we see MALA exploring the space more than RWM. Let us zoom into that.

    par(mfrow = c(1, 1))
    plot(density(out.rwm$chain), main = "truth is BLUE", xlim = range(c(-5,5)))
    lines(density(out.mala$chain), col = "red")
    lines(seq(-5, 5, length = 1e4), dt(seq(-5, 5, length = 1e4), df = 5), col = "blue")

[Figure: "truth is BLUE"; density estimates on (-5, 5) for RWM (black) and MALA (red) with the true t density in blue.]

Hmm, density estimation seems similar after 1e5 iterations.

    par(mfrow = c(1,2))
    acf(out.rwm$chain, main = "RWM")
    acf(out.mala$chain, main = "MALA")

[Figure: autocorrelation plots, panels "RWM" and "MALA".]

Since the autocorrelation seems lower for MALA, the effective sample size for estimating the mean will be larger.

    c(ess(out.rwm$chain), ess(out.mala$chain))
    ## se se
    ##

Thus for a distribution with fatter tails, MALA performs (marginally) better. For more complicated high-dimensional target distributions, I can understand how MALA would be a handy tool.

References

Roberts, Gareth O., and Jeffrey S. Rosenthal. 1998. "Optimal Scaling of Discrete Approximations to Langevin Diffusions." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 60. Wiley Online Library.

Roberts, Gareth O., and Richard L. Tweedie. 1996. "Exponential Convergence of Langevin Distributions and Their Discrete Approximations." Bernoulli. JSTOR.

Roberts, Gareth O., Andrew Gelman, and Walter R. Gilks. 1997. "Weak Convergence and Optimal Scaling of Random Walk Metropolis Algorithms." The Annals of Applied Probability 7. Institute of Mathematical Statistics.

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

An introduction to adaptive MCMC

An introduction to adaptive MCMC An introduction to adaptive MCMC Gareth Roberts MIRAW Day on Monte Carlo methods March 2011 Mainly joint work with Jeff Rosenthal. http://www2.warwick.ac.uk/fac/sci/statistics/crism/ Conferences and workshops

More information

MCMC Methods: Gibbs and Metropolis

MCMC Methods: Gibbs and Metropolis MCMC Methods: Gibbs and Metropolis Patrick Breheny February 28 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/30 Introduction As we have seen, the ability to sample from the posterior distribution

More information

Examples of Adaptive MCMC

Examples of Adaptive MCMC Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September, 2006.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the Markov chain parameters

More information

ST 740: Markov Chain Monte Carlo

ST 740: Markov Chain Monte Carlo ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains

More information

Kernel Sequential Monte Carlo

Kernel Sequential Monte Carlo Kernel Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) * equal contribution April 25, 2016 1 / 37 Section

More information

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced

More information

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at

More information

Lecture 8: The Metropolis-Hastings Algorithm

Lecture 8: The Metropolis-Hastings Algorithm 30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Karl Oskar Ekvall Galin L. Jones University of Minnesota March 12, 2019 Abstract Practically relevant statistical models often give rise to probability distributions that are analytically

More information

Package mcmcse. July 4, 2017

Package mcmcse. July 4, 2017 Version 1.3-2 Date 2017-07-03 Title Monte Carlo Standard Errors for MCMC Package mcmcse July 4, 2017 Author James M. Flegal , John Hughes , Dootika Vats ,

More information

A Dirichlet Form approach to MCMC Optimal Scaling

A Dirichlet Form approach to MCMC Optimal Scaling A Dirichlet Form approach to MCMC Optimal Scaling Giacomo Zanella, Wilfrid S. Kendall, and Mylène Bédard. g.zanella@warwick.ac.uk, w.s.kendall@warwick.ac.uk, mylene.bedard@umontreal.ca Supported by EPSRC

More information

Markov chain Monte Carlo methods in atmospheric remote sensing

Markov chain Monte Carlo methods in atmospheric remote sensing 1 / 45 Markov chain Monte Carlo methods in atmospheric remote sensing Johanna Tamminen johanna.tamminen@fmi.fi ESA Summer School on Earth System Monitoring and Modeling July 3 Aug 11, 212, Frascati July,

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Gradient-based Monte Carlo sampling methods

Gradient-based Monte Carlo sampling methods Gradient-based Monte Carlo sampling methods Johannes von Lindheim 31. May 016 Abstract Notes for a 90-minute presentation on gradient-based Monte Carlo sampling methods for the Uncertainty Quantification

More information

16 : Approximate Inference: Markov Chain Monte Carlo

16 : Approximate Inference: Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution

More information

Some Results on the Ergodicity of Adaptive MCMC Algorithms

Some Results on the Ergodicity of Adaptive MCMC Algorithms Some Results on the Ergodicity of Adaptive MCMC Algorithms Omar Khalil Supervisor: Jeffrey Rosenthal September 2, 2011 1 Contents 1 Andrieu-Moulines 4 2 Roberts-Rosenthal 7 3 Atchadé and Fort 8 4 Relationship

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Log-concave sampling: Metropolis-Hastings algorithms are fast!

Log-concave sampling: Metropolis-Hastings algorithms are fast! Proceedings of Machine Learning Research vol 75:1 5, 2018 31st Annual Conference on Learning Theory Log-concave sampling: Metropolis-Hastings algorithms are fast! Raaz Dwivedi Department of Electrical

More information

Advanced Statistical Modelling

Advanced Statistical Modelling Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

The Pennsylvania State University The Graduate School RATIO-OF-UNIFORMS MARKOV CHAIN MONTE CARLO FOR GAUSSIAN PROCESS MODELS

The Pennsylvania State University The Graduate School RATIO-OF-UNIFORMS MARKOV CHAIN MONTE CARLO FOR GAUSSIAN PROCESS MODELS The Pennsylvania State University The Graduate School RATIO-OF-UNIFORMS MARKOV CHAIN MONTE CARLO FOR GAUSSIAN PROCESS MODELS A Thesis in Statistics by Chris Groendyke c 2008 Chris Groendyke Submitted in

More information

eqr094: Hierarchical MCMC for Bayesian System Reliability

eqr094: Hierarchical MCMC for Bayesian System Reliability eqr094: Hierarchical MCMC for Bayesian System Reliability Alyson G. Wilson Statistical Sciences Group, Los Alamos National Laboratory P.O. Box 1663, MS F600 Los Alamos, NM 87545 USA Phone: 505-667-9167

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

1 Geometry of high dimensional probability distributions

1 Geometry of high dimensional probability distributions Hamiltonian Monte Carlo October 20, 2018 Debdeep Pati References: Neal, Radford M. MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo 2.11 (2011): 2. Betancourt, Michael. A conceptual

More information

References. Markov-Chain Monte Carlo. Recall: Sampling Motivation. Problem. Recall: Sampling Methods. CSE586 Computer Vision II

References. Markov-Chain Monte Carlo. Recall: Sampling Motivation. Problem. Recall: Sampling Methods. CSE586 Computer Vision II References Markov-Chain Monte Carlo CSE586 Computer Vision II Spring 2010, Penn State Univ. Recall: Sampling Motivation If we can generate random samples x i from a given distribution P(x), then we can

More information

Riemann Manifold Methods in Bayesian Statistics

Riemann Manifold Methods in Bayesian Statistics Ricardo Ehlers ehlers@icmc.usp.br Applied Maths and Stats University of São Paulo, Brazil Working Group in Statistical Learning University College Dublin September 2015 Bayesian inference is based on Bayes

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution MH I Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution a lot of Bayesian mehods rely on the use of MH algorithm and it s famous

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Statistical Methods in Particle Physics Lecture 1: Bayesian methods

Statistical Methods in Particle Physics Lecture 1: Bayesian methods Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

More information

Zig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017

Zig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017 Zig-Zag Monte Carlo Delft University of Technology Joris Bierkens February 7, 2017 Joris Bierkens (TU Delft) Zig-Zag Monte Carlo February 7, 2017 1 / 33 Acknowledgements Collaborators Andrew Duncan Paul

More information

University of Toronto Department of Statistics

University of Toronto Department of Statistics Optimal Proposal Distributions and Adaptive MCMC by Jeffrey S. Rosenthal Department of Statistics University of Toronto Technical Report No. 0804 June 30, 2008 TECHNICAL REPORT SERIES University of Toronto

More information

Markov-Chain Monte Carlo

Markov-Chain Monte Carlo Markov-Chain Monte Carlo CSE586 Computer Vision II Spring 2010, Penn State Univ. References Recall: Sampling Motivation If we can generate random samples x i from a given distribution P(x), then we can

More information

TUNING OF MARKOV CHAIN MONTE CARLO ALGORITHMS USING COPULAS

TUNING OF MARKOV CHAIN MONTE CARLO ALGORITHMS USING COPULAS U.P.B. Sci. Bull., Series A, Vol. 73, Iss. 1, 2011 ISSN 1223-7027 TUNING OF MARKOV CHAIN MONTE CARLO ALGORITHMS USING COPULAS Radu V. Craiu 1 Algoritmii de tipul Metropolis-Hastings se constituie într-una

More information

arxiv: v1 [stat.co] 2 Nov 2017

arxiv: v1 [stat.co] 2 Nov 2017 Binary Bouncy Particle Sampler arxiv:1711.922v1 [stat.co] 2 Nov 217 Ari Pakman Department of Statistics Center for Theoretical Neuroscience Grossman Center for the Statistics of Mind Columbia University

More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

Dimension-Independent likelihood-informed (DILI) MCMC

Dimension-Independent likelihood-informed (DILI) MCMC Dimension-Independent likelihood-informed (DILI) MCMC Tiangang Cui, Kody Law 2, Youssef Marzouk Massachusetts Institute of Technology 2 Oak Ridge National Laboratory 2 August 25 TC, KL, YM DILI MCMC USC

More information

In many cases, it is easier (and numerically more stable) to compute

In many cases, it is easier (and numerically more stable) to compute In many cases, it is easier (and numerically more stable) to compute r u (x, y) := log p u (y) + log q u (y, x) log p u (x) log q u (x, y), and then accept if U k < exp ( r u (X k 1, Y k ) ) and reject

More information

The two subset recurrent property of Markov chains

The two subset recurrent property of Markov chains The two subset recurrent property of Markov chains Lars Holden, Norsk Regnesentral Abstract This paper proposes a new type of recurrence where we divide the Markov chains into intervals that start when

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Examples of Adaptive MCMC

Examples of Adaptive MCMC Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 2006; revised January 2008.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

Methodology for inference on the Markov modulated Poisson process and theory for optimal scaling of the random walk Metropolis

Methodology for inference on the Markov modulated Poisson process and theory for optimal scaling of the random walk Metropolis II Methodology for inference on the Markov modulated Poisson process and theory for optimal scaling of the random walk Metropolis Christopher Sherlock, MSc. Submitted for the degree of Doctor of Philosophy

More information

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester

Physics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation

More information

Metropolis-Hastings sampling

Metropolis-Hastings sampling Metropolis-Hastings sampling Gibbs sampling requires that a sample from each full conditional distribution. In all the cases we have looked at so far the conditional distributions were conjugate so sampling

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

Computer Practical: Metropolis-Hastings-based MCMC

Computer Practical: Metropolis-Hastings-based MCMC Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017

Markov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017 Markov Chain Monte Carlo (MCMC) and Model Evaluation August 15, 2017 Frequentist Linking Frequentist and Bayesian Statistics How can we estimate model parameters and what does it imply? Want to find the

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

Introduction to Bayesian methods in inverse problems

Introduction to Bayesian methods in inverse problems Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model

More information

Learning the hyper-parameters. Luca Martino
