Session 3A: Markov chain Monte Carlo (MCMC)

Session 3A: Markov chain Monte Carlo (MCMC)
John Geweke
Bayesian Econometrics and its Applications
August 15, 2012

New schedule

Today:
8:30-10:00: Lecture, Session 3A
10:00-10:30: Coffee
10:30-12:30: Lecture, Session 3B
12:30-2:30: Lunch / check your email if you haven't today
2:30-4:00: Laptop session in this room; if you have a laptop, bring it

Tomorrow:
8:30-10:00: Session 4A in this room
... After that, to be announced

Markov chain Monte Carlo (MCMC): Motivation

Some basic knowledge is needed just to read the technical applied Bayesian econometric literature.

An MCMC procedure known as the Metropolis random walk is an important component of sequential Monte Carlo (next session).

Gibbs sampling is covered in detail in several texts, including Koop (2003), Lancaster (2004), Geweke (2005) and (plug) Geweke, J., Koop, G. and van Dijk, H.K. (2011), Handbook of Bayesian Econometrics. Oxford: Oxford University Press.

Background

Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller (1953), "Equation of State Calculations by Fast Computing Machines," The Journal of Chemical Physics 21: 1087-1092.

Hastings, W.K. (1970), "Monte Carlo Sampling Methods Using Markov Chains and Their Applications," Biometrika 57: 97-109.

Geman, S. and D. Geman (1984), "Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images," IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721-741.

Gelfand, A.E. and A.F.M. Smith (1990), "Sampling Based Approaches to Calculating Marginal Densities," Journal of the American Statistical Association 85: 398-409.

Nature of the MCMC simulator

Specifies a transition rule that perpetuates a sequence $\{\theta^{(m)}\}$ by means of a transition density $p(\theta^{(m)} \mid \theta^{(m-1)}, T)$.

At a minimum this transition density has to satisfy the invariance condition
$$\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)} = p(\theta^{(m)} \mid I);$$
that is, if the previous $\theta^{(m-1)}$ comes from the distribution we're trying to learn about, then so does $\theta^{(m)}$. By recursion, so do all of the $\theta^{(m+j)}$ ($j = 1, 2, 3, \ldots$) that follow.
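(An aside, not from the slides: on a finite state space the invariance condition reduces to $\pi P = \pi$, with the integral replaced by a sum and the transition density by a matrix. A minimal Python sketch with a made-up 3-state transition matrix standing in for $p(\theta^{(m)} \mid \theta^{(m-1)}, T)$:)

```python
import numpy as np

# Made-up 3-state transition matrix (rows sum to 1), standing in for p(theta_m | theta_{m-1}, T).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# The invariant distribution pi solves the discrete invariance condition pi @ P = pi,
# i.e. pi is the left eigenvector of P associated with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi /= pi.sum()

# Check invariance: sum_i pi_i P_ij = pi_j for every j.
print(np.allclose(pi @ P, pi))   # True
print(pi)                        # the invariant distribution of this toy chain
```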

Issue #1: Uniqueness of the invariant distribution (irreducibility)

The chain starts at $\theta^{(0)}$; this is more or less arbitrary, but $\theta^{(0)} \sim p(\theta \mid A)$ is a good idea.

The chain must somehow find the invariant density $p(\theta \mid I)$.

A chain can have more than one invariant distribution. It is easy to construct practical examples in econometrics where this happens.
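(To see how uniqueness can fail, again as an aside not from the slides: a reducible chain is the simplest counterexample. With a block-diagonal transition matrix, each block has its own invariant distribution and any mixture of them is also invariant, so the chain started in one block never finds the other. A toy check with made-up numbers:)

```python
import numpy as np

# Reducible chain: states {0,1} and {2,3} never communicate.
P = np.array([[0.7, 0.3, 0.0, 0.0],
              [0.4, 0.6, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.2, 0.8]])

pi1 = np.array([4/7, 3/7, 0.0, 0.0])   # invariant distribution of the first block
pi2 = np.array([0.0, 0.0, 2/7, 5/7])   # invariant distribution of the second block

# Both, and any mixture of them, satisfy pi @ P = pi: no unique invariant distribution.
print(np.allclose(pi1 @ P, pi1), np.allclose(pi2 @ P, pi2))
print(np.allclose((0.5 * pi1 + 0.5 * pi2) @ P, 0.5 * pi1 + 0.5 * pi2))
```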

Issue #2: Ergodicity

$\theta^{(1)} \sim p(\theta \mid I)$ and the invariance condition
$$\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)} = p(\theta^{(m)} \mid I) \qquad (1)$$
imply: for any $m$, $\theta^{(m)} \sim p(\theta \mid I)$.

But we would like to know that $\lim_{M \to \infty} M^{-1} \sum_{m=1}^{M} g(\theta^{(m)})$ will be, in some sense, a good approximation of $E[g(\theta) \mid I]$.

And, of course, (1) begs the question: if we knew how to draw $\theta^{(1)} \sim p(\theta \mid I)$ we could use direct sampling.

Issues #3, ...: Practical matters

Because $\theta^{(0)} \nsim p(\theta \mid Y^o, A)$, early iterations are in general not representative of $p(\theta \mid Y^o, A)$. Some number of initial iterations are discarded (burn-in or warm-up).

To evaluate numerical accuracy we need a central limit theorem
$$M^{1/2}\left[M^{-1}\sum_{m=1}^{M} g(\theta^{(m)}) - E[g(\theta) \mid I]\right] \xrightarrow{d} N(0, \tau^2), \qquad \widehat{\tau}^2(M) \xrightarrow{a.s.} \tau^2.$$
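(As a hedged illustration of what $\widehat{\tau}^2(M)$ might look like in practice: the slides do not specify an estimator, but one common choice is the method of batch means. A minimal sketch, assuming `draws` holds post-burn-in scalar output $g(\theta^{(m)})$:)

```python
import numpy as np

def batch_means_nse(draws, n_batches=20):
    """Crude numerical standard error for the MCMC average of `draws`,
    using non-overlapping batch means (one of several possible tau^2 estimators)."""
    draws = np.asarray(draws)
    batch_size = len(draws) // n_batches          # any leftover draws are dropped
    means = np.array([draws[i * batch_size:(i + 1) * batch_size].mean()
                      for i in range(n_batches)])
    # Variance of the overall mean is approximately var(batch means) / n_batches.
    return np.sqrt(means.var(ddof=1) / n_batches)
```

A rough 95% interval for $E[g(\theta) \mid I]$ is then `draws.mean() ± 1.96 * batch_means_nse(draws)`.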

The Metropolis-Hastings algorithm

Continue to assume $\omega^{(m)} \sim p(\omega \mid \theta^{(m)}, I)$ is (relatively) easy.

The algorithm:
Arbitrary starting value $\theta^{(0)} \in \Theta$.
$\theta^* \sim q(\theta \mid \theta^{(m-1)}, H)$ is a candidate value for $\theta^{(m)}$ ($H$ for Hastings).
$\theta^{(m)} = \theta^*$ with probability
$$\alpha\!\left(\theta^* \mid \theta^{(m-1)}, H\right) = \min\left\{\frac{p(\theta^* \mid I)\,/\,q(\theta^* \mid \theta^{(m-1)}, H)}{p(\theta^{(m-1)} \mid I)\,/\,q(\theta^{(m-1)} \mid \theta^*, H)},\; 1\right\}.$$
Otherwise $\theta^{(m)} = \theta^{(m-1)}$.
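(Purely as an illustrative sketch of the algorithm just stated, not the course's code: one Metropolis-Hastings transition in Python. The names `log_p`, `q_draw` and `q_logpdf` are hypothetical user-supplied functions for $\log p(\theta \mid I)$ up to a constant, draws from $q(\cdot \mid \theta, H)$, and $\log q(x \mid \theta, H)$ respectively.)

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_step(theta, log_p, q_draw, q_logpdf):
    """One Metropolis-Hastings transition from theta (hypothetical helper names)."""
    theta_star = q_draw(theta)                      # candidate from q(. | theta, H)
    # log alpha = [log p(theta*) - log q(theta* | theta)] - [log p(theta) - log q(theta | theta*)]
    log_alpha = (log_p(theta_star) - q_logpdf(theta_star, theta)) \
              - (log_p(theta) - q_logpdf(theta, theta_star))
    # Accept with probability min(alpha, 1); otherwise stay put.
    if np.log(rng.uniform()) < min(log_alpha, 0.0):
        return theta_star
    return theta
```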

Some intuition

Why (in the world!)
$$\alpha\!\left(\theta^* \mid \theta^{(m-1)}, H\right) = \min\left\{\frac{p(\theta^* \mid I)\,/\,q(\theta^* \mid \theta^{(m-1)}, H)}{p(\theta^{(m-1)} \mid I)\,/\,q(\theta^{(m-1)} \mid \theta^*, H)},\; 1\right\}???$$

In many respects this is similar to importance sampling. If $q(\theta \mid \theta^{(m-1)}, H)$ makes a move from $\theta^{(m-1)} = \theta_A$ to $\theta^* = \theta_B$ quite likely, compared to $p(\theta_B \mid I)$, and a move back from $\theta^* = \theta_B$ to $\theta^{(m-1)} = \theta_A$ quite unlikely, relative to $p(\theta_A \mid I)$, then there is low probability of actually making the transition and high probability of staying at $\theta^{(m-1)}$.

Two-step proof (Chib and Greenberg, 1995)

Step 1: Note that if a transition probability density function $p(\theta^{(m)} \mid \theta^{(m-1)}, T)$ satisfies the reversibility condition
$$p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T) = p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, T)$$
with respect to $p(\theta \mid I)$, then
$$\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)}
= \int_\Theta p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, T)\, d\theta^{(m-1)}
= p(\theta^{(m)} \mid I) \int_\Theta p(\theta^{(m-1)} \mid \theta^{(m)}, T)\, d\theta^{(m-1)}
= p(\theta^{(m)} \mid I).$$

This is the invariance condition.

Two-step proof, Step 2

We want $H$ to meet the reversibility condition
$$p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, H) = p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, H). \qquad (2)$$

If $\theta^{(m-1)} = \theta^{(m)}$, (2) holds trivially. For $\theta^{(m-1)} \neq \theta^{(m)}$, (2) implies
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)\, \alpha(\theta^* \mid \theta^{(m-1)}, H)
= p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)\, \alpha(\theta^{(m-1)} \mid \theta^*, H). \qquad (3)$$

Suppose (without loss of generality)
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H) > p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H).$$
If $\alpha(\theta^{(m-1)} \mid \theta^*, H) = 1$, then (3) is true if and only if
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)}.$$

... conclusion of Step 2

$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)\, \alpha(\theta^* \mid \theta^{(m-1)}, H)
= p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)\, \alpha(\theta^{(m-1)} \mid \theta^*, H). \qquad (3)$$

Suppose (without loss of generality)
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H) > p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H).$$
If
$$\alpha(\theta^{(m-1)} \mid \theta^*, H) = 1 \quad \text{and} \quad
\alpha(\theta^* \mid \theta^{(m-1)}, H) = \frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)},$$
then (3) is satisfied.
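(As a numerical sanity check on this argument, an illustration not part of the slides: on a finite state space one can assemble the Metropolis-Hastings transition matrix explicitly and verify that reversibility, and hence invariance, holds. The target `p` and proposal `Q` below are made up.)

```python
import numpy as np

p = np.array([0.2, 0.5, 0.3])                 # target p(theta | I) on 3 states (made up)
Q = np.array([[0.4, 0.4, 0.2],                # proposal q(theta* | theta, H), rows sum to 1 (made up)
              [0.3, 0.3, 0.4],
              [0.5, 0.2, 0.3]])

n = len(p)
P = np.zeros((n, n))                          # Metropolis-Hastings transition matrix
for i in range(n):
    for j in range(n):
        if i != j:
            alpha = min((p[j] * Q[j, i]) / (p[i] * Q[i, j]), 1.0)
            P[i, j] = Q[i, j] * alpha         # propose j, accept with probability alpha
    P[i, i] = 1.0 - P[i].sum()                # stay put with the remaining probability

# Reversibility: p_i P_ij == p_j P_ji for all i, j, and therefore p P = p (invariance).
print(np.allclose(p[:, None] * P, (p[:, None] * P).T))
print(np.allclose(p @ P, p))
```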

Special case: Metropolis independence chain

$q(\theta^* \mid \theta, H) = q(\theta^* \mid H)$. Then
$$\alpha\!\left(\theta^* \mid \theta^{(m-1)}, H\right)
= \min\left\{\frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid H)},\; 1\right\}
= \min\left\{w(\theta^*)\,/\,w(\theta^{(m-1)}),\; 1\right\},$$
where $w(\theta) = p(\theta \mid I)\,/\,q(\theta \mid H)$.
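(A minimal independence-chain sketch using the weight form of the acceptance probability; the standard normal target and Student-t proposal are illustrative choices, not from the slides.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def log_w(theta):
    # log w(theta) = log p(theta | I) - log q(theta | H); target and proposal are illustrative.
    return stats.norm.logpdf(theta) - stats.t.logpdf(theta, df=3)

theta, draws = 0.0, []
for _ in range(5000):
    theta_star = stats.t.rvs(df=3, random_state=rng)   # q(theta | H) ignores the current state
    if np.log(rng.uniform()) < min(log_w(theta_star) - log_w(theta), 0.0):
        theta = theta_star
    draws.append(theta)

print(np.mean(draws))   # should be near 0, the mean of the illustrative target
```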

Special case: Random walk Metropolis chain

$q(\theta^* \mid \theta, H) = q(\theta^* - \theta \mid H)$, where $q(\theta^* - \theta \mid H)$ is symmetric about 0. Typically
$$\theta^* \mid (\theta, H) \sim N(\theta, \Sigma).$$

The variance matrix $\Sigma$ must be chosen with care:
Too big: acceptance rate very low.
Too small: acceptance rate very high, but the chain moves slowly through $\Theta$.
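(A minimal random-walk Metropolis sketch that reports the acceptance rate, the usual quantity monitored when tuning $\Sigma$; the bivariate normal target and the scale 0.8 are made-up illustrations.)

```python
import numpy as np

rng = np.random.default_rng(2)

def rw_metropolis(log_p, theta0, chol_Sigma, n_draws=10_000):
    """Random-walk Metropolis with N(theta, Sigma) proposals; q is symmetric, so it cancels in alpha."""
    theta, d = np.asarray(theta0, float), len(theta0)
    draws, accepted = np.empty((n_draws, d)), 0
    for m in range(n_draws):
        theta_star = theta + chol_Sigma @ rng.standard_normal(d)
        if np.log(rng.uniform()) < min(log_p(theta_star) - log_p(theta), 0.0):
            theta, accepted = theta_star, accepted + 1
        draws[m] = theta
    return draws, accepted / n_draws

# Example: bivariate standard normal target (made up). Tune chol_Sigma until the
# acceptance rate is moderate: too big -> very low; too small -> near 1 but slow mixing.
draws, acc_rate = rw_metropolis(lambda th: -0.5 * th @ th, [0.0, 0.0], 0.8 * np.eye(2))
print(acc_rate)
```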