Session 3A: Markov chain Monte Carlo (MCMC)

John Geweke
Bayesian Econometrics and its Applications
August 15, 2012

New schedule
Today:
  8:30-10:00, Session 3A (lecture)
  10:00-10:30, Coffee
  10:30-12:30, Session 3B (lecture)
  12:30-2:30, Lunch / Check your email if you haven't today
  2:30-4:00, Laptop session in this room. If you have a laptop, bring it.
Tomorrow:
  8:30-10:00, Session 4A in this room
  After that, to be announced

Markov chain Monte Carlo (MCMC): Motivation
Some basic knowledge is needed just to read the technical applied Bayesian econometric literature.
An MCMC procedure known as the Metropolis random walk is an important component of sequential Monte Carlo (next session).
Gibbs sampling is covered in detail in several texts, including Koop (2003), Lancaster (2004), Geweke (2005) and (plug) Geweke, J., G. Koop and H.K. van Dijk (2011), Handbook of Bayesian Econometrics. Oxford: Oxford University Press.

Background
Metropolis, N., A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller (1953), Equation of State Calculations by Fast Computing Machines, The Journal of Chemical Physics 21: 1087-1092.
Hastings, W.K. (1970), Monte Carlo Sampling Methods Using Markov Chains and Their Applications, Biometrika 57: 97-109.
Geman, S. and D. Geman (1984), Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images, IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 721-741.
Gelfand, A.E. and A.F.M. Smith (1990), Sampling Based Approaches to Calculating Marginal Densities, Journal of the American Statistical Association 85: 398-409.

Nature of the MCMC simulator
Specifies a transition rule that perpetuates a sequence $\{\theta^{(m)}\}$ by means of a transition density $p(\theta^{(m)} \mid \theta^{(m-1)}, T)$.
At a minimum this transition density has to satisfy the invariance condition
$$\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)} = p(\theta^{(m)} \mid I).$$
That is, if the previous $\theta^{(m-1)}$ comes from the distribution we're trying to learn about, then so does $\theta^{(m)}$.
By recursion so do all of the $\theta^{(m+j)}$ $(j = 1, 2, 3, \ldots)$ that follow.
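The invariance condition is easiest to see on a finite state space, where the integral becomes a sum and the transition density becomes a matrix. The sketch below is only an illustration: the 3-state matrix `P` is a made-up example, not taken from the lecture.

```python
import numpy as np

# Hypothetical 3-state transition matrix (an assumption for illustration).
# Row i holds p(next state = j | current state = i); rows sum to one.
P = np.array([
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
])

# The invariant distribution is the left eigenvector of P with eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()            # normalize to a probability vector

# Discrete invariance condition: sum_i pi[i] * P[i, j] == pi[j].
one_step = pi @ P
# By recursion the same holds after any number of steps.
many_steps = pi @ np.linalg.matrix_power(P, 10)

print("invariant distribution:", pi)
print("after one step:        ", one_step)
```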

Issue #1: Uniqueness of the invariant distribution (irreducibility)
Chain starts at $\theta^{(0)}$; more or less arbitrary, but $\theta^{(0)} \sim p(\theta \mid A)$ is a good idea.
Chain must somehow find the invariant density $p(\theta \mid I)$.
A chain can have more than one invariant distribution.
Easy to construct practical examples in econometrics where this happens.
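A toy illustration of non-uniqueness, assuming a finite state space: when the state space splits into two blocks that never communicate, each block carries its own invariant distribution, every mixture of the two is invariant as well, and where the chain ends up depends entirely on where it starts. The matrix below is a hypothetical example, not one of the econometric cases alluded to above.

```python
import numpy as np

# Hypothetical reducible chain: states {0, 1} and {2, 3} never communicate.
P = np.array([
    [0.7, 0.3, 0.0, 0.0],
    [0.4, 0.6, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.5, 0.5],
])

# Invariant distribution of each block, padded with zeros elsewhere.
pi_a = np.array([4 / 7, 3 / 7, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 5 / 13, 8 / 13])

# Any mixture of the two is also invariant.
mix = 0.25 * pi_a + 0.75 * pi_b

# Long-run distribution depends on the starting state.
P_long = np.linalg.matrix_power(P, 200)
from_state_0 = np.array([1.0, 0.0, 0.0, 0.0]) @ P_long
from_state_2 = np.array([0.0, 0.0, 1.0, 0.0]) @ P_long

print("start in block A:", from_state_0)
print("start in block B:", from_state_2)
```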

Issue #2: Ergodicity
$\theta^{(1)} \sim p(\theta \mid I)$ and the invariance condition
$$\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)} = p(\theta^{(m)} \mid I) \qquad (1)$$
imply: for any $m$, $\theta^{(m)} \sim p(\theta \mid I)$.
But we would like to know that
$$\lim_{M \to \infty} M^{-1} \sum_{m=1}^{M} g(\theta^{(m)})$$
will be in some sense a good approximation of $E[g(\theta) \mid I]$.
And, of course, (1) begs the question: if we knew how to draw $\theta^{(1)} \sim p(\theta \mid I)$ we could use direct sampling.
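A small deterministic sketch of what ergodicity buys, under assumptions that are not part of the lecture: a made-up 3-state transition matrix `P`, a made-up function `g`, and a chain started from a point mass rather than from the invariant distribution. It tracks only the expected value of the ergodic average, which is weaker than the almost-sure convergence one ultimately wants, but it shows the starting value washing out as $M$ grows.

```python
import numpy as np

# Hypothetical irreducible, aperiodic 3-state chain (illustrative only).
P = np.array([
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
])
g = np.array([1.0, 2.0, 3.0])     # hypothetical function g evaluated at each state

# Invariant distribution: solve pi P = pi together with sum(pi) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
target = pi @ g                   # E[g(theta) | I] under the invariant distribution

def expected_ergodic_average(M, start=0):
    """Expected value of M^{-1} sum_{m=1}^{M} g(theta^(m)) from a point-mass start."""
    dist = np.zeros(3)
    dist[start] = 1.0
    total = 0.0
    for _ in range(M):
        dist = dist @ P           # distribution of theta^(m)
        total += dist @ g
    return total / M

err_short = abs(expected_ergodic_average(50) - target)
err_long = abs(expected_ergodic_average(5000) - target)
print(f"target={target:.4f}  error(M=50)={err_short:.5f}  error(M=5000)={err_long:.6f}")
```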

Issues #3, ...: Practical matters
Because $\theta^{(0)} \nsim p(\theta \mid Y^o, A)$, early iterations are in general not representative of $p(\theta \mid Y^o, A)$.
Some number of initial iterations are discarded (burn-in or warm-up).
To evaluate numerical accuracy we need a central limit theorem
$$M^{1/2} \left[ M^{-1} \sum_{m=1}^{M} g(\theta^{(m)}) - E[g(\theta) \mid I] \right] \xrightarrow{d} N(0, \tau^2), \qquad \hat{\tau}^2(M) \xrightarrow{a.s.} \tau^2.$$
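The following sketch illustrates burn-in and a numerical standard error on a toy chain. Everything specific here is an assumption made for illustration: the chain is a Gaussian AR(1) process standing in for MCMC output, and $\hat{\tau}^2$ is estimated by non-overlapping batch means, which is one common choice and not necessarily the estimator intended in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical chain: theta_m = rho * theta_{m-1} + e_m with e_m ~ N(0, 1).
# Its invariant distribution is N(0, 1 / (1 - rho^2)), so E[theta] = 0.
rho = 0.9
n_burn, M = 1_000, 100_000
theta = 50.0                          # deliberately poor starting value
draws = np.empty(n_burn + M)
for m in range(n_burn + M):
    theta = rho * theta + rng.standard_normal()
    draws[m] = theta
kept = draws[n_burn:]                 # discard the burn-in iterations

def batch_means_nse(x, n_batches=50):
    """Numerical standard error of mean(x) from non-overlapping batch means."""
    b = len(x) // n_batches
    means = x[: b * n_batches].reshape(n_batches, b).mean(axis=1)
    return means.std(ddof=1) / np.sqrt(n_batches)

estimate = kept.mean()
nse = batch_means_nse(kept)
naive_se = kept.std(ddof=1) / np.sqrt(M)   # ignores serial correlation

print(f"mean of first 10 draws: {draws[:10].mean():.2f}")
print(f"estimate={estimate:.4f}  NSE={nse:.4f}  naive SE={naive_se:.4f}")
```

The naive standard error treats the draws as independent and is therefore too small for a positively autocorrelated chain; the batch-means figure is the one that corresponds to $\hat{\tau}(M) / M^{1/2}$.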

The Metropolis-Hastings algorithm
Continue to assume $\omega^{(m)} \sim p(\omega \mid \theta^{(m)}, I)$ is (relatively) easy.
The algorithm:
Arbitrary starting value $\theta^{(0)} \in \Theta$.
$\theta^* \sim q(\theta^* \mid \theta^{(m-1)}, H)$ is a candidate value for $\theta^{(m)}$ (H for Hastings).
$\theta^{(m)} = \theta^*$ with probability
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \min\left\{ \frac{p(\theta^* \mid I) / q(\theta^* \mid \theta^{(m-1)}, H)}{p(\theta^{(m-1)} \mid I) / q(\theta^{(m-1)} \mid \theta^*, H)},\ 1 \right\}.$$
Otherwise $\theta^{(m)} = \theta^{(m-1)}$.
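A minimal sketch of the algorithm for a scalar $\theta$, working in logs for numerical stability. The function names (`metropolis_hastings`, `log_p`, `draw_q`, `log_q`) and the example are hypothetical: the target is a Gamma(3, 1) kernel and the proposal is a multiplicative log-normal step, chosen because it is not symmetric, so the $q$ terms in $\alpha$ genuinely matter.

```python
import numpy as np

def metropolis_hastings(log_p, draw_q, log_q, theta0, n_draws, rng):
    """Metropolis-Hastings for scalar theta (illustrative sketch).

    log_p(theta)       : log of p(theta | I), up to an additive constant
    draw_q(theta, rng) : draws a candidate theta* ~ q(. | theta, H)
    log_q(to, frm)     : log of q(to | frm, H), up to an additive constant
    """
    theta = theta0
    draws = np.empty(n_draws)
    n_accept = 0
    for m in range(n_draws):
        cand = draw_q(theta, rng)
        # log of [p(cand)/q(cand | theta)] / [p(theta)/q(theta | cand)]
        log_ratio = (log_p(cand) - log_q(cand, theta)) - (log_p(theta) - log_q(theta, cand))
        if np.log(rng.uniform()) < min(log_ratio, 0.0):
            theta = cand              # accept the candidate
            n_accept += 1
        draws[m] = theta              # on rejection the chain repeats theta^(m-1)
    return draws, n_accept / n_draws

# Hypothetical example: Gamma(3, 1) target on theta > 0, whose mean is 3.
s = 0.8                               # proposal scale on the log scale (assumed)
log_p = lambda t: 2.0 * np.log(t) - t
draw_q = lambda t, rng: t * np.exp(s * rng.standard_normal())
log_q = lambda to, frm: -np.log(to) - (np.log(to) - np.log(frm)) ** 2 / (2 * s**2)

rng = np.random.default_rng(1)
draws, acc_rate = metropolis_hastings(log_p, draw_q, log_q, 1.0, 50_000, rng)
post_mean = draws[5_000:].mean()      # drop a burn-in of 5,000 draws
print(f"acceptance rate={acc_rate:.2f}  estimated mean={post_mean:.3f}")
```

Only ratios of $p$ and of $q$ enter $\alpha$, so both may be supplied up to a normalizing constant, which is why `log_p` and `log_q` omit theirs.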

Some intuition
Why (in the world!)
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \min\left\{ \frac{p(\theta^* \mid I) / q(\theta^* \mid \theta^{(m-1)}, H)}{p(\theta^{(m-1)} \mid I) / q(\theta^{(m-1)} \mid \theta^*, H)},\ 1 \right\}\ ???$$
In many respects this is similar to importance sampling.
If $q(\theta^* \mid \theta^{(m-1)}, H)$ makes a move from $\theta^{(m-1)} = \theta_A$ to $\theta^* = \theta_B$ quite likely, compared to $p(\theta_B \mid I)$, and a move back from $\theta^* = \theta_B$ to $\theta^{(m-1)} = \theta_A$ quite unlikely, relative to $p(\theta_A \mid I)$, then:
  low probability on actually making the transition
  high probability on staying at $\theta^{(m-1)}$.

Two-step proof (Chib and Greenberg, 1995)
Step 1: Note that if a transition probability density function $p(\theta^{(m)} \mid \theta^{(m-1)}, T)$ satisfies the reversibility condition
$$p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T) = p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, T)$$
with respect to $p(\theta \mid I)$, then
$$\begin{aligned}
\int_\Theta p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, T)\, d\theta^{(m-1)}
&= \int_\Theta p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, T)\, d\theta^{(m-1)} \\
&= p(\theta^{(m)} \mid I) \int_\Theta p(\theta^{(m-1)} \mid \theta^{(m)}, T)\, d\theta^{(m-1)} \\
&= p(\theta^{(m)} \mid I).
\end{aligned}$$
This is the invariance condition.

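Two-step proof, Step 2
We want H to meet the reversibility condition
$$p(\theta^{(m-1)} \mid I)\, p(\theta^{(m)} \mid \theta^{(m-1)}, H) = p(\theta^{(m)} \mid I)\, p(\theta^{(m-1)} \mid \theta^{(m)}, H). \qquad (2)$$
If $\theta^{(m-1)} = \theta^{(m)}$, (2) holds trivially.
For $\theta^{(m-1)} \neq \theta^{(m)}$, (2) implies
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)\, \alpha(\theta^* \mid \theta^{(m-1)}, H) = p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)\, \alpha(\theta^{(m-1)} \mid \theta^*, H). \qquad (3)$$
Suppose (without loss of generality)
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H) > p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H).$$
If $\alpha(\theta^{(m-1)} \mid \theta^*, H) = 1$, (3) is true if and only if
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)}.$$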

... conclusion of Step 2
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)\, \alpha(\theta^* \mid \theta^{(m-1)}, H) = p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)\, \alpha(\theta^{(m-1)} \mid \theta^*, H). \qquad (3)$$
Suppose (without loss of generality)
$$p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H) > p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H).$$
If $\alpha(\theta^{(m-1)} \mid \theta^*, H) = 1$ and
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid \theta^*, H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid \theta^{(m-1)}, H)},$$
then (3) is satisfied.
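Both steps of the proof can be checked numerically on a finite state space, where the Metropolis-Hastings transition density is a matrix. The target `p` and the non-symmetric proposal matrix `Q` below are made-up values used only for this check.

```python
import numpy as np

# Hypothetical target p(theta | I) on 4 states and a non-symmetric proposal.
p = np.array([0.1, 0.2, 0.3, 0.4])
Q = np.array([
    [0.10, 0.50, 0.20, 0.20],
    [0.30, 0.10, 0.40, 0.20],
    [0.25, 0.25, 0.25, 0.25],
    [0.60, 0.20, 0.10, 0.10],
])                                   # Q[i, j] = q(j | i, H); rows sum to one

n = len(p)
K = np.zeros((n, n))                 # Metropolis-Hastings transition matrix
for i in range(n):
    for j in range(n):
        if i != j:
            alpha = min(p[j] * Q[j, i] / (p[i] * Q[i, j]), 1.0)
            K[i, j] = Q[i, j] * alpha
    K[i, i] = 1.0 - K[i].sum()       # rejected or self-proposed moves stay put

flow = p[:, None] * K                # flow[i, j] = p(i) * K(i -> j)
reversible = np.allclose(flow, flow.T)   # Step 2: reversibility condition (2)
invariant = np.allclose(p @ K, p)        # Step 1: reversibility implies invariance

print("reversible:", reversible, " invariant:", invariant)
```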

Special case: Metropolis independence chain
$$q(\theta^* \mid \theta, H) = q(\theta^* \mid H)$$
Then
$$\alpha(\theta^* \mid \theta^{(m-1)}, H) = \min\left[ \frac{p(\theta^* \mid I)\, q(\theta^{(m-1)} \mid H)}{p(\theta^{(m-1)} \mid I)\, q(\theta^* \mid H)},\ 1 \right] = \min\left[ w(\theta^*) / w(\theta^{(m-1)}),\ 1 \right],$$
where $w(\theta) = p(\theta \mid I) / q(\theta \mid H)$.
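A sketch of an independence chain driven by the weight function $w(\theta)$, computed in logs. The target (a N(1, 1) kernel) and the proposal (N(0, 2^2), chosen to have heavier tails than the target) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_p(theta):
    """Hypothetical target kernel: N(1, 1), up to a constant."""
    return -0.5 * (theta - 1.0) ** 2

q_sd = 2.0                            # assumed proposal standard deviation

def log_q(theta):
    """Proposal N(0, q_sd^2), up to a constant; it ignores the current state."""
    return -0.5 * (theta / q_sd) ** 2

def log_w(theta):
    """log w(theta) = log p(theta | I) - log q(theta | H)."""
    return log_p(theta) - log_q(theta)

n_draws = 50_000
theta = 0.0
draws = np.empty(n_draws)
accepted = 0
for m in range(n_draws):
    cand = q_sd * rng.standard_normal()          # candidate does not depend on theta
    if np.log(rng.uniform()) < min(log_w(cand) - log_w(theta), 0.0):
        theta = cand
        accepted += 1
    draws[m] = theta

kept = draws[1_000:]
acc_rate = accepted / n_draws
print(f"acceptance={acc_rate:.2f}  mean={kept.mean():.3f}  sd={kept.std():.3f}")
```

As with importance sampling, the chain can get stuck at a point where $w$ is very large, so a proposal with thinner tails than the target is a poor choice here.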

Special case: Random walk Metropolis chain
$$q(\theta^* \mid \theta, H) = q(\theta^* - \theta \mid H),$$
where $q(\theta^* - \theta \mid H)$ is symmetric about 0.
Typically $\theta^* \mid (\theta, H) \sim N(\theta, \Sigma)$.
Variance matrix $\Sigma$ must be chosen with care:
  Too big: acceptance rate very low
  Too small: acceptance rate very high, but chain moves slowly through $\Theta$
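The trade-off in choosing $\Sigma$ can be seen in a scalar sketch, where $\Sigma$ reduces to a squared step size. The N(0, 1) target and the three step sizes are assumptions chosen to make the contrast visible; the lag-1 autocorrelation is used here as a rough stand-in for how slowly the chain moves.

```python
import numpy as np

def rw_metropolis(log_p, theta0, scale, n_draws, rng):
    """Random walk Metropolis for scalar theta with N(theta, scale^2) proposals."""
    theta = theta0
    draws = np.empty(n_draws)
    accepted = 0
    for m in range(n_draws):
        cand = theta + scale * rng.standard_normal()
        # Symmetric proposal: the q terms cancel in alpha.
        if np.log(rng.uniform()) < min(log_p(cand) - log_p(theta), 0.0):
            theta = cand
            accepted += 1
        draws[m] = theta
    return draws, accepted / n_draws

log_p = lambda t: -0.5 * t**2         # hypothetical N(0, 1) target kernel
rng = np.random.default_rng(3)

results = {}
for scale in (0.05, 2.4, 50.0):       # too small, moderate, too big (assumed values)
    draws, rate = rw_metropolis(log_p, 0.0, scale, 20_000, rng)
    lag1 = np.corrcoef(draws[:-1], draws[1:])[0, 1]
    results[scale] = (rate, lag1)
    print(f"scale={scale:5.2f}  acceptance={rate:.2f}  lag-1 autocorr={lag1:.3f}")
```

Both extremes give highly autocorrelated draws: the small step accepts nearly everything but barely moves, while the large step rarely accepts and repeats the same value for long stretches.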