Introduction to Markov Chain Monte Carlo


Jim Albert
March 18, 2018

A Selected Data Problem

Here is an interesting problem with selected data. Suppose you are measuring the speeds of cars driving on an interstate. You assume the speeds are normally distributed with mean µ and standard deviation σ. You see 10 cars pass by and you only record the minimum and maximum speeds. What have you learned about the normal parameters?

First we focus on the construction of the likelihood. Given values of the normal parameters, what is the probability of observing minimum = x and maximum = y in a sample of size n? Essentially we're looking for the joint density of two order statistics, which is a standard result. Let f and F denote the density and cdf of a normal distribution with mean µ and standard deviation σ. Then the joint density of (x, y) is given by

f(x, y | µ, σ) ∝ f(x) f(y) [F(y) - F(x)]^(n-2),  for x < y.

After we observe data, the likelihood is this sampling density viewed as a function of the parameters. Suppose we take a sample of size 10 and we observe x = 52, y = 84. Then the likelihood is given by

L(µ, σ) ∝ f(52) f(84) [F(84) - F(52)]^8.

Defining the log posterior

First I write a short function minmaxpost that computes the logarithm of the posterior density. The arguments to this function are θ = (µ, log σ) and data, which is a list with components n, min, and max. I'd recommend using the R functions pnorm and dnorm in computing the density, as it saves typing errors.

> minmaxpost <- function(theta, data){
+   mu <- theta[1]
+   sigma <- exp(theta[2])
+   dnorm(data$min, mu, sigma, log=TRUE) +
+     dnorm(data$max, mu, sigma, log=TRUE) +
+     (data$n - 2) * log(pnorm(data$max, mu, sigma) -
+       pnorm(data$min, mu, sigma))
+ }

Normal approximation to posterior

We work with the parameterization (µ, log σ), which gives us a better normal approximation. A standard noninformative prior is uniform on (µ, log σ). The function laplace is used to summarize this posterior. The arguments to laplace are the name of the log posterior function, an initial estimate of θ, and the data that is used in the log posterior function. The output of laplace includes mode, the posterior mode, and var, the corresponding estimate of the variance-covariance matrix.

> data <- list(n=10, min=52, max=84)
> library(LearnBayes)
> fit <- laplace(minmaxpost, c(70, 2), data)
> fit
$mode
[1] 67.999960  2.298369

$var
              [,1]          [,2]
[1,]  1.920690e+01 -1.901459e-06
[2,] -1.901459e-06  6.031533e-02

$int
[1] -8.020139

$converge
[1] TRUE

The normal approximation turns out to be quite accurate in this example. The mycontour function is used to display contours of the exact posterior, and a second application of mycontour overlays the matching normal approximation.

> mycontour(minmaxpost, c(45, 95, 1.5, 4), data,
+   xlab=expression(mu), ylab=expression(paste("log ", sigma)))
> mycontour(lbinorm, c(45, 95, 1.5, 4),
+   list(m=fit$mode, v=fit$var), add=TRUE, col="red")
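As a quick numerical check, the approximate posterior standard deviations of µ and log σ can be read off the Laplace output as the square roots of the diagonal of fit$var. Here is a minimal sketch using the fit object above:

# approximate posterior standard deviations of mu and log sigma
post.sd <- sqrt(diag(fit$var))
post.sd
# one could also simulate directly from the normal approximation,
# e.g. MASS::mvrnorm(1000, fit$mode, fit$var), and compare the draws
# with the exact posterior contours shown below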

[Figure: contour plot of the exact posterior of (µ, log σ) with the normal approximation overlaid in red; x-axis µ, y-axis log σ.]

Random Walk Metropolis Sampling

The rwmetrop function implements the Metropolis-Hastings random walk algorithm. There are five inputs: (1) the function defining the log posterior, (2) a list containing var, the estimated variance-covariance matrix, and scale, the random walk scale constant, (3) the starting value in the Markov chain simulation, (4) the number of iterations of the algorithm, and (5) any data and prior parameters used in the log posterior density. Here we use fit$var as our estimated variance-covariance matrix, use a scale value of 3, start the simulation at (µ, log σ) = (70, 2), and try 10,000 iterations.

> mcmc.fit <- rwmetrop(minmaxpost,
+   list(var=fit$var, scale=3),
+   c(70, 2),
+   10000,
+   data)

I display the acceptance rate; here it is about 19%, which is a reasonable value.

> mcmc.fit$accept
[1] 0.1853
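The acceptance rate depends on the choice scale = 3: much larger scales lead to frequent rejections, while much smaller scales produce tiny, highly autocorrelated steps. Here is a minimal sketch of how one might compare a few scale values with rwmetrop; the particular values 1, 3, and 9 are only illustrative:

# compare acceptance rates for several random walk scale values
for (s in c(1, 3, 9)) {
  fit.s <- rwmetrop(minmaxpost,
                    list(var=fit$var, scale=s),
                    c(70, 2),
                    10000,
                    data)
  cat("scale =", s, " acceptance rate =", fit.s$accept, "\n")
}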

We display the contours of the exact posterior and overlay the simulated draws.

> mycontour(minmaxpost, c(45, 95, 1.5, 4), data,
+   xlab=expression(mu),
+   ylab=expression(paste("log ", sigma)))
> points(mcmc.fit$par)

[Figure: contours of the exact posterior of (µ, log σ) with the simulated draws overlaid as points.]

It appears that we have been successful in getting a good sample from this posterior distribution.
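A more formal check of this visual impression is to look at trace plots and effective sample sizes of the simulated parameters. Here is a minimal sketch, assuming the coda package is available (it is not used elsewhere in this example):

# simple mixing diagnostics for the Metropolis output
library(coda)
draws <- as.mcmc(mcmc.fit$par)              # 10000 x 2 matrix of (mu, log sigma) draws
plot(mcmc.fit$par[, 1], type="l",
     xlab="iteration", ylab=expression(mu)) # trace plot of mu
effectiveSize(draws)                        # effective sample sizes for both parameters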

To illustrate simulation-based inference, suppose one is interested in learning about the upper quartile P.75 = µ + 0.674 σ of the car speed distribution. For each simulated draw of (µ, σ) from the posterior, we compute the upper quartile P.75. We then use the R density function to construct a density estimate from the simulated sample of P.75 values.

> mu <- mcmc.fit$par[, 1]
> sigma <- exp(mcmc.fit$par[, 2])
> P.75 <- mu + 0.674 * sigma
> plot(density(P.75),
+   main="Posterior Density of Upper Quartile")

[Figure: density estimate of the posterior of the upper quartile P.75; N = 10000, bandwidth = 0.7082.]
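The simulated sample of P.75 can also be summarized numerically, for example with posterior quantiles or tail probabilities. Here is a minimal sketch using the draws above; the 90 mph threshold is only an illustrative value:

# posterior summaries of the upper quartile from the simulated draws
quantile(P.75, c(0.05, 0.5, 0.95))  # posterior median and a 90% credible interval
mean(P.75 > 90)                     # posterior probability the upper quartile exceeds 90 mph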