MALA versus Random Walk Metropolis Dootika Vats June 4, 2017
Introduction

My research thus far has predominantly been on output analysis for Markov chain Monte Carlo. The examples on which I have implemented our methods have been Gibbs samplers, vanilla Metropolis-Hastings samplers, or Metropolis-within-Gibbs samplers. I have been somewhat distant from the wide variety of samplers that users can choose from. One of the more popular ones is the Metropolis-adjusted Langevin algorithm (MALA), introduced in Roberts and Tweedie (1996) and further studied in Roberts and Rosenthal (1998).

MALA is a Metropolis-Hastings sampler with a special proposal distribution. This proposal distribution evaluates the gradient of the log density at the current state and shifts the center of the proposal by a scaled multiple of this gradient. MALA is based on Langevin diffusions, a connection I am going to ignore in this article due to lack of knowledge at this point.

MALA

Let $\pi(x)$ be the target distribution for the MCMC sampler, defined on a $p$-dimensional space. A generic Metropolis-Hastings sampler at the current value $x$ proposes a value $y$ from a proposal distribution with density $q(x, \cdot)$. The proposed value $y$ is accepted with probability

$$\min\left\{1, \frac{\pi(y)\, q(y, x)}{\pi(x)\, q(x, y)}\right\}.$$

Different choices of $q$ lead to different samplers. For MALA, $q(x, \cdot)$ is the density of the distribution

$$N\left(x + \frac{\sigma_1^2}{2} \nabla \log \pi(x),\; \sigma_1^2 D\right),$$

where $\sigma_1^2 > 0$ is the step size and $D$ is a $p \times p$ positive definite matrix. The $D$ here plays the same role as the covariance matrix in the proposal for the random walk Metropolis sampler. In our examples we simply take $D$ to be the identity matrix. If $\sigma_1^2$ is tuned properly, the MALA proposal shifts the center of the proposal distribution up the gradient of the log density. This enables the sampler to move away from the tails faster than naive samplers.
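To make the accept/reject step concrete, here is a minimal sketch of a single MALA transition for a generic target, assuming $D$ is the identity. The names `mala_step`, `log_target` and `grad_log_target` are hypothetical and introduced only for illustration; the log proposal densities are computed up to an additive constant, which cancels in the ratio.

```r
# Illustrative sketch only: one MALA transition with D = identity.
# `log_target` and `grad_log_target` are hypothetical user-supplied functions.
mala_step <- function(x, sigma, log_target, grad_log_target) {
  # Center of the proposal when the chain is at `from`
  drift <- function(from) from + sigma^2 * grad_log_target(from) / 2
  # Log proposal density (up to a constant) of moving from `from` to `to`
  log_q <- function(from, to) -sum((to - drift(from))^2) / (2 * sigma^2)

  y <- rnorm(length(x), mean = drift(x), sd = sigma)
  # log of pi(y) q(y, x) / (pi(x) q(x, y))
  log_ratio <- log_target(y) - log_target(x) + log_q(y, x) - log_q(x, y)
  if (log(runif(1)) < log_ratio) {
    list(state = y, accepted = TRUE)
  } else {
    list(state = x, accepted = FALSE)
  }
}

# Usage: standard normal target in one dimension
set.seed(1)
n_iter <- 5000
draws <- numeric(n_iter)
n_accept <- 0
for (i in 2:n_iter) {
  step <- mala_step(draws[i - 1], sigma = 1,
                    log_target = function(x) -sum(x^2) / 2,
                    grad_log_target = function(x) -x)
  draws[i] <- step$state
  n_accept <- n_accept + step$accepted
}
c(mean = mean(draws), accept_rate = n_accept / (n_iter - 1))
```

Keeping the drift in one helper means the same expression is used for proposing and for the reverse-move density, which is what makes the correction term consistent.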
We will compare the performance of the MALA sampler with that of the random walk Metropolis (RWM) sampler. The RWM sampler uses the proposal distribution

$$N(x, \sigma_2^2 D),$$

where the roles of $\sigma_2^2$ and $D$ are the same as before.

Roberts and Rosenthal (1998) concluded that an optimal acceptance rate for MALA is 0.574, and Roberts, Gelman, and Gilks (1997) concluded that the optimal acceptance rate for RWM is 0.234. Both are asymptotic results, derived as the dimension grows. We will tune both $\sigma_1^2$ and $\sigma_2^2$ to achieve these acceptance rates.
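Tuning by hand amounts to trying a few step sizes and checking the resulting acceptance rate. The sketch below does this for RWM on a standard bivariate normal, which is a stand-in target chosen for illustration and not the target used in the examples; the helper name `rwm_accept_rate` and the candidate grid are assumptions.

```r
# Illustrative sketch: acceptance rate of RWM on a standard bivariate normal
# (a stand-in target) for several candidate step sizes.
rwm_accept_rate <- function(sigma, n_iter = 2e4, p = 2) {
  log_target <- function(x) -sum(x^2) / 2
  x <- rep(0, p)
  n_accept <- 0
  for (i in seq_len(n_iter)) {
    prop <- rnorm(p, mean = x, sd = sigma)
    # Symmetric proposal, so only the target ratio appears
    if (log(runif(1)) < log_target(prop) - log_target(x)) {
      x <- prop
      n_accept <- n_accept + 1
    }
  }
  n_accept / n_iter
}

set.seed(1)
sigmas <- c(0.5, 1, 2, 4)
rates <- sapply(sigmas, rwm_accept_rate)
data.frame(sigma = sigmas, accept = rates)
# Candidate whose acceptance rate is closest to the 0.234 target
sigmas[which.min(abs(rates - 0.234))]
```

Larger step sizes give lower acceptance rates, so a coarse grid like this is usually enough to bracket the target rate before refining.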
Example 1: Gaussian

Let the target distribution be a bivariate normal distribution (nothing too complicated),

$$N_2\left( \begin{pmatrix} 3 \\ 6 \end{pmatrix}, \begin{pmatrix} 2 & 0.5 \\ 0.5 & 1 \end{pmatrix} \right).$$

We denote the mean vector by $\mu$ and the $2 \times 2$ covariance matrix by $\Sigma$. The density for this distribution satisfies

$$\pi(x) \propto \exp\left\{ -\frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2} \right\}$$

$$\log \pi(x) = \log \text{const} - \frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2}$$

$$\nabla \log \pi(x) = -\Sigma^{-1} (x - \mu).$$

    mu <- c(3, 6)
    Sigma <- matrix(c(2, .5, .5, 1), nrow = 2, ncol = 2)
    Sigma.inv <- solve(Sigma)

    # Calculates the log density of the bivariate normal (up to a constant)
    loglike <- function(x) {
      return(as.numeric(-t(x - mu) %*% Sigma.inv %*% (x - mu) / 2))
    }

First I write down the function for the RWM sampler. Note that this proposal is symmetric, so the proposal densities cancel in the acceptance ratio.

    set.seed(100)
    rwm <- function(N = 1e5, sigma) {
      chain.rwm <- matrix(0, nrow = N, ncol = 2)
      accept <- 0

      # Starting value is the origin
      chain.rwm[1, ] <- c(0, 0)
      for (i in 2:N) {
        prop <- rnorm(2, mean = chain.rwm[i-1, ], sd = sigma)
        log.ratio <- loglike(prop) - loglike(chain.rwm[i-1, ])
        if (log(runif(1)) < log.ratio) {
          chain.rwm[i, ] <- prop
          accept <- accept + 1
        } else {
          chain.rwm[i, ] <- chain.rwm[i-1, ]
        }
      }
      return(list("chain" = chain.rwm, "accept" = accept / N))
    }

    out.rwm <- rwm(sigma = 1)
    out.rwm$accept
    ## [1]
    # Calibrated sigma to get close to optimal rate
    out.rwm <- rwm(sigma = 2.5)
    out.rwm$accept
    ## [1]

Coding up the MALA sampler is slightly more complicated since the proposal distribution is no longer symmetric and the proposal densities do not cancel out in the acceptance ratio.

    # Log density (up to a constant) of proposing x when the chain is at y
    proplike <- function(x, y, sigma) {
      # NOTE: the gradient of log pi for this target is -Sigma.inv %*% (y - mu);
      # the run reported here used the expression below, without the minus sign.
      grad <- Sigma.inv %*% (y - mu)
      mu.m <- y + sigma^2 * grad / 2
      return(as.numeric(-t(x - mu.m) %*% (x - mu.m) / (2 * sigma^2)))
    }

    # MALA sampler
    mala <- function(N = 1e5, sigma) {
      chain.mala <- matrix(0, nrow = N, ncol = 2)
      accept <- 0

      # Starting value is the origin
      chain.mala[1, ] <- c(0, 0)
      for (i in 2:N) {
        # Same drift expression as in proplike (see the note on its sign there)
        grad <- Sigma.inv %*% (chain.mala[i-1, ] - mu)
        mu.m <- chain.mala[i-1, ] + sigma^2 * grad / 2
        prop <- rnorm(2, mean = mu.m, sd = sigma)

        # Log MH ratio: target ratio plus the proposal-density correction
        log.ratio <- loglike(prop) - loglike(chain.mala[i-1, ]) +
          proplike(chain.mala[i-1, ], prop, sigma) -
          proplike(prop, chain.mala[i-1, ], sigma)
        if (log(runif(1)) < log.ratio) {
          chain.mala[i, ] <- prop
          accept <- accept + 1
        } else {
          chain.mala[i, ] <- chain.mala[i-1, ]
        }
      }
      return(list("chain" = chain.mala, "accept" = accept / N))
    }

    out.mala <- mala(sigma = 1)
    out.mala$accept
    ## [1]

    # Tuning MALA
    out.mala <- mala(sigma = .5)
    out.mala$accept
    ## [1]
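Because the MALA proposal depends on the gradient of the log density, a cheap sanity check is to compare the analytic gradient with a finite-difference approximation, which also confirms its sign. The sketch below does this for the bivariate normal target; the helper names `log_density`, `analytic_grad` and `numeric_grad` are hypothetical and not part of the code above.

```r
# Illustrative check (helper names are hypothetical): compare an analytic
# gradient of the log density with a central finite-difference approximation.
mu <- c(3, 6)
Sigma <- matrix(c(2, 0.5, 0.5, 1), nrow = 2, ncol = 2)
Sigma.inv <- solve(Sigma)

# Log density of the bivariate normal, up to an additive constant
log_density <- function(x) as.numeric(-t(x - mu) %*% Sigma.inv %*% (x - mu) / 2)
# Analytic gradient: -Sigma^{-1} (x - mu)
analytic_grad <- function(x) as.numeric(-Sigma.inv %*% (x - mu))

numeric_grad <- function(f, x, h = 1e-5) {
  sapply(seq_along(x), function(j) {
    e <- replace(numeric(length(x)), j, h)  # step of size h along coordinate j
    (f(x + e) - f(x - e)) / (2 * h)
  })
}

x0 <- c(0, 0)
rbind(analytic = analytic_grad(x0), numeric = numeric_grad(log_density, x0))
```

At the origin both rows should agree with $\Sigma^{-1}\mu = (0, 6)$, a vector pointing from the starting value toward the region of high density.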
To compare the performance of the two samplers, we plot some graphs. The first is the trace plot for the two components.

    par(mfrow = c(1, 2))
    # The trailing arguments of the two plot() calls were cut off in the source;
    # type = "l" and xlab = "index" are assumed, following Example 2.
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain[, 1], 1e4), ylab = "First Component",
         main = "", type = "l", xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain[, 1], 1e4), col = "red")
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain[, 2], 1e4), ylab = "Second Component",
         main = "", type = "l", xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain[, 2], 1e4), col = "red")

[Figure: trace plots of the last 1e4 iterations, panels "First Component" and "Second Component" against index; RWM in black, MALA in red.]

The two samplers look similar in their trace plots. MALA looks like it may produce thinner tails and focus on areas of high probability.

    par(mfrow = c(1, 2))
    plot(density(out.rwm$chain[, 1]), ylab = "First Component", main = "")
    lines(density(out.mala$chain[, 1]), col = "red")
    plot(density(out.rwm$chain[, 2]), ylab = "Second Component", main = "")
    lines(density(out.mala$chain[, 2]), col = "red")

[Figure: density estimates, panels "First Component" and "Second Component"; RWM in black, MALA in red.]

In terms of autocorrelation, we see the following results.

    par(mfrow = c(2, 3))
    acf(out.rwm$chain[, 1], main = "RWM: First")
    acf(out.rwm$chain[, 2], main = "RWM: Second")
    ccf(out.rwm$chain[, 1], out.rwm$chain[, 2], main = "RWM: CCF")
    acf(out.mala$chain[, 1], main = "MALA: First")
    acf(out.mala$chain[, 2], main = "MALA: Second")
    ccf(out.mala$chain[, 1], out.mala$chain[, 2], main = "MALA: CCF")

[Figure: autocorrelation and cross-correlation plots, panels "RWM: First", "RWM: Second", "RWM: CCF", "MALA: First", "MALA: Second", "MALA: CCF".]

Interestingly, MALA produces much higher autocorrelation, and significantly higher cross-correlation. This would imply that the multivariate effective sample size for MALA for estimating the mean of the normal distribution will be smaller. Here is the implementation.

    library(mcmcse)
    ## mcmcse: Monte Carlo Standard Errors for MCMC
    ## Version created on
    ## copyright (c) 2012, James M. Flegal, University of California, Riverside
    ##                     John Hughes, University of Minnesota
    ##                     Dootika Vats, University of Minnesota
    ## For citation information, type citation("mcmcse").
    ## Type help("mcmcse-package") to get started.

    c(multiESS(out.rwm$chain), multiESS(out.mala$chain))
    ## [1]

And clearly, the effective sample size for a Monte Carlo sample of size 1e5 is much smaller for MALA than it is for RWM. So both RWM and MALA seem to yield decent density estimates, but for estimating the mean (3, 6), RWM will clearly be favored over MALA due to the loss of efficiency.
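To see what an effective sample size measures, the following sketch estimates a univariate ESS with non-overlapping batch means: the chain length times the ratio of the sample variance to an estimate of the asymptotic variance in the Markov chain CLT. This is a simplified stand-in and not the mcmcse implementation; `bm_ess` is a hypothetical helper, and the square-root batch size is an assumption. It is checked on an AR(1) chain, for which the ratio ESS / n equals $(1 - \rho)/(1 + \rho)$.

```r
# Illustrative sketch: univariate effective sample size via batch means.
# This is NOT the mcmcse implementation; `bm_ess` is a hypothetical helper.
bm_ess <- function(chain, batch_size = floor(sqrt(length(chain)))) {
  n_batches <- floor(length(chain) / batch_size)
  used <- chain[seq_len(n_batches * batch_size)]
  # Each column of the matrix holds one batch of consecutive draws
  batch_means <- colMeans(matrix(used, nrow = batch_size))
  # Batch-means estimate of the asymptotic variance in the Markov chain CLT
  asym_var <- batch_size * var(batch_means)
  length(used) * var(used) / asym_var
}

# AR(1) chain with correlation rho, for which ESS / n is (1 - rho) / (1 + rho)
set.seed(1)
n <- 1e5
rho <- 0.5
chain <- as.numeric(stats::filter(rnorm(n), filter = rho, method = "recursive"))
c(estimate = bm_ess(chain) / n, theory = (1 - rho) / (1 + rho))
```

Higher autocorrelation inflates the asymptotic variance relative to the sample variance, which is why a strongly correlated chain has a smaller effective sample size for the same number of iterations.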
Intuitively I think MALA will work better for fatter-tailed distributions. Let's see.

Example 2: t distribution

Let the target be a t distribution with 5 degrees of freedom,

$$\pi(x) \propto \left(1 + \frac{x^2}{5}\right)^{-3}$$

$$\log \pi(x) = \log \text{const} - 3 \log\left(1 + \frac{x^2}{5}\right)$$

$$\nabla \log \pi(x) = -\frac{6x}{5 + x^2}.$$

    # Calculates the log density of the t distribution (up to a constant)
    loglike <- function(x) {
      return(-3 * log(1 + x^2 / 5))
    }

Below is the RWM implementation.

    rwm <- function(N = 1e5, sigma) {
      chain.rwm <- numeric(length = N)
      accept <- 0

      # Starting value is far from 0
      chain.rwm[1] <- 10
      for (i in 2:N) {
        prop <- rnorm(1, mean = chain.rwm[i-1], sd = sigma)
        log.ratio <- loglike(prop) - loglike(chain.rwm[i-1])
        if (log(runif(1)) < log.ratio) {
          chain.rwm[i] <- prop
          accept <- accept + 1
        } else {
          chain.rwm[i] <- chain.rwm[i-1]
        }
      }
      return(list("chain" = chain.rwm, "accept" = accept / N))
    }

    out.rwm <- rwm(sigma = 1)
    out.rwm$accept
    ## [1]

    # Calibrated sigma to get close to optimal rate
    out.rwm <- rwm(sigma = 5)
    out.rwm$accept
    ## [1]

Coding up the MALA sampler is slightly more complicated since the proposal distribution is no longer symmetric and the proposal densities do not cancel out in the acceptance ratio.

    # Log density (up to a constant) of proposing x when the chain is at y
    proplike <- function(x, y, sigma) {
      grad <- -6 * y / (5 + y^2)
      mu.m <- y + sigma^2 * grad / 2
      return(as.numeric(-t(x - mu.m) %*% (x - mu.m) / (2 * sigma^2)))
    }

    # MALA sampler
    mala <- function(N = 1e5, sigma) {
      chain.mala <- numeric(length = N)
      accept <- 0

      # Starting value is far from 0
      chain.mala[1] <- 10
      for (i in 2:N) {
        grad <- -6 * chain.mala[i-1] / (5 + chain.mala[i-1]^2)
        mu.m <- chain.mala[i-1] + sigma^2 * grad / 2
        prop <- rnorm(1, mean = mu.m, sd = sigma)

        # Log MH ratio: target ratio plus the proposal-density correction
        log.ratio <- loglike(prop) - loglike(chain.mala[i-1]) +
          proplike(chain.mala[i-1], prop, sigma) -
          proplike(prop, chain.mala[i-1], sigma)
        if (log(runif(1)) < log.ratio) {
          chain.mala[i] <- prop
          accept <- accept + 1
        } else {
          chain.mala[i] <- chain.mala[i-1]
        }
      }
      return(list("chain" = chain.mala, "accept" = accept / N))
    }

    out.mala <- mala(sigma = 1)
    out.mala$accept
    ## [1]

    # Tuning MALA
    out.mala <- mala(sigma = 2.2)
    out.mala$accept
    ## [1]

    par(mfrow = c(1, 2))
    plot(tail(1:1e5, 1e4), tail(out.rwm$chain, 1e4), main = "", type = "l", xlab = "index")
    lines(tail(1:1e5, 1e4), tail(out.mala$chain, 1e4), col = "red")
    plot(density(out.rwm$chain), main = "")
    lines(density(out.mala$chain), col = "red")

[Figure: left, trace plot of the last 1e4 iterations against index; right, density estimates. RWM in black, MALA in red.]

Ah, we see MALA exploring the space more than RWM. Let us zoom into that.

    par(mfrow = c(1, 1))
    plot(density(out.rwm$chain), main = " truth is BLUE", xlim = range(c(-5, 5)))
    lines(density(out.mala$chain), col = "red")
    lines(seq(-5, 5, length = 1e4), dt(seq(-5, 5, length = 1e4), df = 5), col = "blue")

[Figure: "truth is BLUE"; density estimates on (-5, 5) for RWM (black) and MALA (red), with the true t density in blue.]

Hmm, density estimation seems similar after 1e5 iterations.

    par(mfrow = c(1, 2))
    acf(out.rwm$chain, main = "RWM")
    acf(out.mala$chain, main = "MALA")
[Figure: autocorrelation plots, panels "RWM" and "MALA".]

Since the autocorrelation seems lower for MALA, the effective sample size for estimating the mean will be larger.

    c(ess(out.rwm$chain), ess(out.mala$chain))
    ## se se
    ##

Thus for a distribution with fatter tails, MALA performs (marginally) better. For more complicated high-dimensional target distributions, I can understand how MALA would be a handy tool.

References

Roberts, Gareth O., and Jeffrey S. Rosenthal. 1998. "Optimal Scaling of Discrete Approximations to Langevin Diffusions." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 60. Wiley Online Library.

Roberts, Gareth O., and Richard L. Tweedie. 1996. "Exponential Convergence of Langevin Distributions and Their Discrete Approximations." Bernoulli. JSTOR.

Roberts, Gareth O., Andrew Gelman, and Walter R. Gilks. 1997. "Weak Convergence and Optimal Scaling of Random Walk Metropolis Algorithms." The Annals of Applied Probability 7. Institute of Mathematical Statistics.
Markov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More informationAn introduction to adaptive MCMC
An introduction to adaptive MCMC Gareth Roberts MIRAW Day on Monte Carlo methods March 2011 Mainly joint work with Jeff Rosenthal. http://www2.warwick.ac.uk/fac/sci/statistics/crism/ Conferences and workshops
More informationMCMC Methods: Gibbs and Metropolis
MCMC Methods: Gibbs and Metropolis Patrick Breheny February 28 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/30 Introduction As we have seen, the ability to sample from the posterior distribution
More informationExamples of Adaptive MCMC
Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September, 2006.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the Markov chain parameters
More informationST 740: Markov Chain Monte Carlo
ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:
More informationMarkov chain Monte Carlo
Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains
More informationKernel Sequential Monte Carlo
Kernel Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) * equal contribution April 25, 2016 1 / 37 Section
More informationHastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model
UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced
More informationSlice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method
Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at
More informationLecture 8: The Metropolis-Hastings Algorithm
30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:
More informationMarkov chain Monte Carlo
Markov chain Monte Carlo Karl Oskar Ekvall Galin L. Jones University of Minnesota March 12, 2019 Abstract Practically relevant statistical models often give rise to probability distributions that are analytically
More informationPackage mcmcse. July 4, 2017
Version 1.3-2 Date 2017-07-03 Title Monte Carlo Standard Errors for MCMC Package mcmcse July 4, 2017 Author James M. Flegal , John Hughes , Dootika Vats ,
More informationA Dirichlet Form approach to MCMC Optimal Scaling
A Dirichlet Form approach to MCMC Optimal Scaling Giacomo Zanella, Wilfrid S. Kendall, and Mylène Bédard. g.zanella@warwick.ac.uk, w.s.kendall@warwick.ac.uk, mylene.bedard@umontreal.ca Supported by EPSRC
More informationMarkov chain Monte Carlo methods in atmospheric remote sensing
1 / 45 Markov chain Monte Carlo methods in atmospheric remote sensing Johanna Tamminen johanna.tamminen@fmi.fi ESA Summer School on Earth System Monitoring and Modeling July 3 Aug 11, 212, Frascati July,
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov
More informationGradient-based Monte Carlo sampling methods
Gradient-based Monte Carlo sampling methods Johannes von Lindheim 31. May 016 Abstract Notes for a 90-minute presentation on gradient-based Monte Carlo sampling methods for the Uncertainty Quantification
More information16 : Approximate Inference: Markov Chain Monte Carlo
10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution
More informationSome Results on the Ergodicity of Adaptive MCMC Algorithms
Some Results on the Ergodicity of Adaptive MCMC Algorithms Omar Khalil Supervisor: Jeffrey Rosenthal September 2, 2011 1 Contents 1 Andrieu-Moulines 4 2 Roberts-Rosenthal 7 3 Atchadé and Fort 8 4 Relationship
More informationReminder of some Markov Chain properties:
Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent
More informationLog-concave sampling: Metropolis-Hastings algorithms are fast!
Proceedings of Machine Learning Research vol 75:1 5, 2018 31st Annual Conference on Learning Theory Log-concave sampling: Metropolis-Hastings algorithms are fast! Raaz Dwivedi Department of Electrical
More informationAdvanced Statistical Modelling
Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1
More informationComputational statistics
Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated
More informationThe Pennsylvania State University The Graduate School RATIO-OF-UNIFORMS MARKOV CHAIN MONTE CARLO FOR GAUSSIAN PROCESS MODELS
The Pennsylvania State University The Graduate School RATIO-OF-UNIFORMS MARKOV CHAIN MONTE CARLO FOR GAUSSIAN PROCESS MODELS A Thesis in Statistics by Chris Groendyke c 2008 Chris Groendyke Submitted in
More informationeqr094: Hierarchical MCMC for Bayesian System Reliability
eqr094: Hierarchical MCMC for Bayesian System Reliability Alyson G. Wilson Statistical Sciences Group, Los Alamos National Laboratory P.O. Box 1663, MS F600 Los Alamos, NM 87545 USA Phone: 505-667-9167
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As
More information1 Geometry of high dimensional probability distributions
Hamiltonian Monte Carlo October 20, 2018 Debdeep Pati References: Neal, Radford M. MCMC using Hamiltonian dynamics. Handbook of Markov Chain Monte Carlo 2.11 (2011): 2. Betancourt, Michael. A conceptual
More informationReferences. Markov-Chain Monte Carlo. Recall: Sampling Motivation. Problem. Recall: Sampling Methods. CSE586 Computer Vision II
References Markov-Chain Monte Carlo CSE586 Computer Vision II Spring 2010, Penn State Univ. Recall: Sampling Motivation If we can generate random samples x i from a given distribution P(x), then we can
More informationRiemann Manifold Methods in Bayesian Statistics
Ricardo Ehlers ehlers@icmc.usp.br Applied Maths and Stats University of São Paulo, Brazil Working Group in Statistical Learning University College Dublin September 2015 Bayesian inference is based on Bayes
More informationMarkov Chain Monte Carlo
Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).
More informationMH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution
MH I Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution a lot of Bayesian mehods rely on the use of MH algorithm and it s famous
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationStatistical Methods in Particle Physics Lecture 1: Bayesian methods
Statistical Methods in Particle Physics Lecture 1: Bayesian methods SUSSP65 St Andrews 16 29 August 2009 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan
More informationZig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017
Zig-Zag Monte Carlo Delft University of Technology Joris Bierkens February 7, 2017 Joris Bierkens (TU Delft) Zig-Zag Monte Carlo February 7, 2017 1 / 33 Acknowledgements Collaborators Andrew Duncan Paul
More informationUniversity of Toronto Department of Statistics
Optimal Proposal Distributions and Adaptive MCMC by Jeffrey S. Rosenthal Department of Statistics University of Toronto Technical Report No. 0804 June 30, 2008 TECHNICAL REPORT SERIES University of Toronto
More informationMarkov-Chain Monte Carlo
Markov-Chain Monte Carlo CSE586 Computer Vision II Spring 2010, Penn State Univ. References Recall: Sampling Motivation If we can generate random samples x i from a given distribution P(x), then we can
More informationTUNING OF MARKOV CHAIN MONTE CARLO ALGORITHMS USING COPULAS
U.P.B. Sci. Bull., Series A, Vol. 73, Iss. 1, 2011 ISSN 1223-7027 TUNING OF MARKOV CHAIN MONTE CARLO ALGORITHMS USING COPULAS Radu V. Craiu 1 Algoritmii de tipul Metropolis-Hastings se constituie într-una
More informationarxiv: v1 [stat.co] 2 Nov 2017
Binary Bouncy Particle Sampler arxiv:1711.922v1 [stat.co] 2 Nov 217 Ari Pakman Department of Statistics Center for Theoretical Neuroscience Grossman Center for the Statistics of Mind Columbia University
More informationKernel adaptive Sequential Monte Carlo
Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline
More informationDimension-Independent likelihood-informed (DILI) MCMC
Dimension-Independent likelihood-informed (DILI) MCMC Tiangang Cui, Kody Law 2, Youssef Marzouk Massachusetts Institute of Technology 2 Oak Ridge National Laboratory 2 August 25 TC, KL, YM DILI MCMC USC
More informationIn many cases, it is easier (and numerically more stable) to compute
In many cases, it is easier (and numerically more stable) to compute r u (x, y) := log p u (y) + log q u (y, x) log p u (x) log q u (x, y), and then accept if U k < exp ( r u (X k 1, Y k ) ) and reject
More informationThe two subset recurrent property of Markov chains
The two subset recurrent property of Markov chains Lars Holden, Norsk Regnesentral Abstract This paper proposes a new type of recurrence where we divide the Markov chains into intervals that start when
More informationMCMC algorithms for fitting Bayesian models
MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models
More informationExamples of Adaptive MCMC
Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 2006; revised January 2008.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the
More informationSAMPLING ALGORITHMS. In general. Inference in Bayesian models
SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be
More informationMethodology for inference on the Markov modulated Poisson process and theory for optimal scaling of the random walk Metropolis
II Methodology for inference on the Markov modulated Poisson process and theory for optimal scaling of the random walk Metropolis Christopher Sherlock, MSc. Submitted for the degree of Doctor of Philosophy
More informationPhysics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester
Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation
More informationMetropolis-Hastings sampling
Metropolis-Hastings sampling Gibbs sampling requires that a sample from each full conditional distribution. In all the cases we have looked at so far the conditional distributions were conjugate so sampling
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationComputer Practical: Metropolis-Hastings-based MCMC
Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov
More information17 : Markov Chain Monte Carlo
10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo
More informationMarkov Chain Monte Carlo (MCMC) and Model Evaluation. August 15, 2017
Markov Chain Monte Carlo (MCMC) and Model Evaluation August 15, 2017 Frequentist Linking Frequentist and Bayesian Statistics How can we estimate model parameters and what does it imply? Want to find the
More informationMarkov Chain Monte Carlo
Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible
More informationCS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling
CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy
More informationIntroduction to Bayesian methods in inverse problems
Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction
More informationComputer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo
Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative
More informationSupplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements
Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model
More informationLearning the hyper-parameters. Luca Martino
Learning the hyper-parameters Luca Martino 2017 2017 1 / 28 Parameters and hyper-parameters 1. All the described methods depend on some choice of hyper-parameters... 2. For instance, do you recall λ (bandwidth
More informationMarkov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation. Luke Tierney Department of Statistics & Actuarial Science University of Iowa
Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation Luke Tierney Department of Statistics & Actuarial Science University of Iowa Basic Ratio of Uniforms Method Introduced by Kinderman and
More information27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling
10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel
More informationMARKOV CHAIN MONTE CARLO
MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationA quick introduction to Markov chains and Markov chain Monte Carlo (revised version)
A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) Rasmus Waagepetersen Institute of Mathematical Sciences Aalborg University 1 Introduction These notes are intended to
More informationStatistical Data Mining and Medical Signal Detection. Lecture Five: Stochastic Algorithms and MCMC. July 6, Motoya Machida
Statistical Data Mining and Medical Signal Detection Lecture Five: Stochastic Algorithms and MCMC July 6, 2011 Motoya Machida (mmachida@tntech.edu) Statistical Data Mining 1/20 Plotting Density Functions
More informationDarwin Uy Math 538 Quiz 4 Dr. Behseta
Darwin Uy Math 538 Quiz 4 Dr. Behseta 1) Section 16.1.4 talks about how when sample size gets large, posterior distributions become approximately normal and centered at the MLE. This is largely due to
More informationRobert Collins CSE586, PSU Intro to Sampling Methods
Robert Collins Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Robert Collins A Brief Overview of Sampling Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling
More informationStatistical analysis of neural data: Monte Carlo techniques for decoding spike trains
Statistical analysis of neural data: Monte Carlo techniques for decoding spike trains Liam Paninski Department of Statistics and Center for Theoretical Neuroscience Columbia University http://www.stat.columbia.edu/
More informationDirectional Metropolis Hastings algorithms on hyperplanes
Directional Metropolis Hastings algorithms on hyperplanes Hugo Hammer and Håon Tjelmeland Department of Mathematical Sciences Norwegian University of Science and Technology Trondheim, Norway Abstract In
More informationSampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch.
Sampling Methods Oliver Schulte - CMP 419/726 Bishop PRML Ch. 11 Recall Inference or General Graphs Junction tree algorithm is an exact inference method for arbitrary graphs A particular tree structure
More informationStat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC