Simulation - Lectures - Part III Markov chain Monte Carlo


1 Simulation - Lectures - Part III: Markov chain Monte Carlo. Julien Berestycki. Part A Simulation and Statistical Programming, Hilary Term 2018.

2 Outline: Markov Chain Monte Carlo. Recap on Markov chains; Markov chain Monte Carlo; Metropolis-Hastings.

3 Markov chain Monte Carlo Methods. Our aim is to estimate $\theta = \mathbb{E}_p(\varphi(X))$, where $X$ is a random variable on $\Omega$ with pmf (or pdf) $p$. Up to this point we have based our estimates on independent and identically distributed draws, from either $p$ itself or some proposal distribution with pmf/pdf $q$. In MCMC we simulate a correlated sequence $X_0, X_1, X_2, \ldots$ such that $X_t$ is approximately distributed from $p$ for $t$ large, and rely on the usual estimate
$$\hat\theta_n = \frac{1}{n} \sum_{t=0}^{n-1} \varphi(X_t).$$
We will suppose the state space of $X$ is discrete (finite or countable), but it should be kept in mind that MCMC methods apply to continuous state spaces as well, and in fact form one of the most versatile and widespread classes of Monte Carlo algorithms in current use.

4 Outline: Markov Chain Monte Carlo. Recap on Markov chains; Markov chain Monte Carlo; Metropolis-Hastings.

5 Markov chains. From Part A Probability. Let $(X_t)_{t=0,1,\ldots}$ be a homogeneous Markov chain of random variables on $\Omega$, with starting distribution $X_0 \sim p^{(0)}$ and transition matrix $P = (P_{i,j})_{i,j \in \Omega}$, where $P_{i,j} = \mathbb{P}(X_{t+1} = j \mid X_t = i)$ for $i, j \in \Omega$. We write $(X_t)_{t=0,1,\ldots} \sim \mathrm{Markov}(p^{(0)}, P)$. Denote by $P^{(n)}_{i,j} = \mathbb{P}(X_{t+n} = j \mid X_t = i)$ the $n$-step transition probabilities, and by $p^{(n)}(i) = \mathbb{P}(X_n = i)$ the distribution at time $n$. Recall that a Markov chain, or equivalently its transition matrix $P$, is irreducible if and only if, for each pair of states $i, j \in \Omega$, there is $n$ such that $P^{(n)}_{i,j} > 0$.

6 Markov chains. $\pi$ is a stationary, or invariant, distribution of $P$ if $\pi_j = \sum_{i \in \Omega} \pi_i P_{i,j}$ for all $j \in \Omega$. If $p^{(0)} = \pi$ then $p^{(1)}(j) = \sum_{i \in \Omega} p^{(0)}(i) P_{i,j}$, so $p^{(1)}(j) = \pi(j)$ also. Iterating, $p^{(t)} = \pi$ for each $t = 1, 2, \ldots$, so the distribution of $X_t$ does not change with $t$: it is stationary. If $P$ is irreducible and has a stationary distribution $\pi$, then this stationary distribution is unique.
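As a quick numerical illustration (ours, not from the slides): for a small made-up transition matrix, the stationary distribution can be computed as the left eigenvector of $P$ with eigenvalue 1, and the identity $\pi = \pi P$ checked directly.

```python
import numpy as np

# A made-up 3-state transition matrix (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# pi solves pi = pi P, i.e. it is a left eigenvector of P with
# eigenvalue 1; normalize it to sum to 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()

print(pi)                        # stationary distribution
print(np.allclose(pi @ P, pi))   # check pi P = pi -> True
```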

7 Ergodic Theorem for Markov chains. Theorem (Theorems 6.1, 6.2, 6.3 of Part A Probability). Let $(X_t)_{t=0,1,\ldots} \sim \mathrm{Markov}(\lambda, P)$ be an irreducible Markov chain on a discrete state space $\Omega$, and assume it admits a stationary distribution $\pi$. Then, for any initial distribution $\lambda$,
$$\frac{1}{n} \sum_{t=0}^{n-1} \mathbb{I}(X_t = i) \to \pi(i) \quad \text{almost surely, as } n \to \infty.$$
That is,
$$\mathbb{P}\left( \frac{1}{n} \sum_{t=0}^{n-1} \mathbb{I}(X_t = i) \to \pi(i) \right) = 1.$$
Additionally, if the chain is aperiodic, then for all $i \in \Omega$, $\mathbb{P}(X_n = i) \to \pi(i)$.

8 Ergodic Theorem for Markov chains. Corollary. Let $(X_t)_{t=0,1,\ldots} \sim \mathrm{Markov}(\lambda, P)$ be an irreducible Markov chain on a discrete state space $\Omega$, and assume it admits a stationary distribution $\pi$. Let $\varphi : \Omega \to \mathbb{R}$ be a bounded function, $X$ a discrete random variable on $\Omega$ with pmf $\pi$, and $\theta = \mathbb{E}_\pi[\varphi(X)] = \sum_{i \in \Omega} \varphi(i)\pi(i)$. Then, for any initial distribution $\lambda$,
$$\frac{1}{n} \sum_{t=0}^{n-1} \varphi(X_t) \to \theta \quad \text{almost surely, as } n \to \infty.$$
That is, $\mathbb{P}\left( \frac{1}{n} \sum_{t=0}^{n-1} \varphi(X_t) \to \theta \right) = 1$.

9 Ergodic Theorem for Markov chains. Proof (non-examinable). Assume wlog $|\varphi(i)| < 1$ for all $i \in \Omega$. Let $A \subseteq \Omega$ and $V_i(n) = \sum_{t=0}^{n-1} \mathbb{I}(X_t = i)$. Then
$$\begin{aligned}
\left| \frac{1}{n}\sum_{t=0}^{n-1} \varphi(X_t) - \theta \right|
&= \left| \frac{1}{n}\sum_{t=0}^{n-1} \sum_{i \in \Omega} \varphi(i)\,\mathbb{I}(X_t = i) - \sum_{i \in \Omega} \varphi(i)\pi(i) \right| \\
&= \left| \sum_{i \in \Omega} \varphi(i)\left( \frac{V_i(n)}{n} - \pi(i) \right) \right| \\
&\le \sum_{i \in A} \left| \frac{V_i(n)}{n} - \pi(i) \right| + \sum_{i \notin A} \left| \frac{V_i(n)}{n} - \pi(i) \right| \\
&\le \sum_{i \in A} \left| \frac{V_i(n)}{n} - \pi(i) \right| + \sum_{i \notin A} \left( \frac{V_i(n)}{n} + \pi(i) \right) \\
&= \sum_{i \in A} \left| \frac{V_i(n)}{n} - \pi(i) \right| + \sum_{i \in A} \left( \pi(i) - \frac{V_i(n)}{n} \right) + 2\sum_{i \notin A} \pi(i) \\
&\le 2 \sum_{i \in A} \left| \frac{V_i(n)}{n} - \pi(i) \right| + 2 \sum_{i \notin A} \pi(i),
\end{aligned}$$
where in the fifth line we have used $\sum_{i \notin A} \frac{V_i(n)}{n} = 1 - \sum_{i \in A} \frac{V_i(n)}{n} = \sum_{i} \pi(i) - \sum_{i \in A} \frac{V_i(n)}{n}$.

10 Ergodic Theorem for Markov chains. Proof (continued). Let $\epsilon > 0$, and take $A$ finite such that $\sum_{i \notin A} \pi(i) < \epsilon/4$. For $N \in \mathbb{N}$, define the event
$$E_N = \left\{ \sum_{i \in A} \left| \frac{V_i(n)}{n} - \pi(i) \right| < \epsilon/4 \ \text{ for all } n \ge N \right\}.$$
As $\mathbb{P}(V_i(n)/n \to \pi(i)) = 1$ for all $i \in \Omega$ and $A$ is finite, the event $E_N$ must occur for some $N$, hence $\mathbb{P}\left(\bigcup_N E_N\right) = 1$. It follows that, for any $\epsilon > 0$,
$$\mathbb{P}\left( \exists N \text{ such that for all } n \ge N,\ \left| \frac{1}{n}\sum_{t=0}^{n-1} \varphi(X_t) - \theta \right| < \epsilon \right) = 1.$$

11 Reversible Markov chains. In a reversible Markov chain we cannot distinguish the direction of simulation from inspection of a realization of the chain and its reversal, even with knowledge of the transition matrix. A Markov chain $(X_t)_{t \ge 0} \sim \mathrm{Markov}(\pi, P)$ is reversible iff
$$(X_0, X_1, \ldots, X_n) \stackrel{d}{=} (X_n, X_{n-1}, \ldots, X_0) \quad \text{for any } n \ge 0.$$
We also say that $P$ is reversible with respect to $\pi$, or $\pi$-reversible. Most MCMC algorithms are based on reversible Markov chains.

12 Reversible Markov chains. It seems clear that a Markov chain will be reversible if and only if $\mathbb{P}(X_{t-1} = j \mid X_t = i) = P_{i,j}$, so that any particular transition occurs with equal probability in the forward and reverse directions. Theorem. Let $P$ be a transition matrix. If there is a probability mass function $\pi$ on $\Omega$ such that $\pi$ and $P$ satisfy the detailed balance condition
$$\pi(i) P_{i,j} = \pi(j) P_{j,i} \quad \text{for all pairs } i, j \in \Omega,$$
then (I) $\pi = \pi P$, so $\pi$ is stationary for $P$, and (II) the chain $(X_t) \sim \mathrm{Markov}(\pi, P)$ is reversible.

13 Reversible Markov chains. Proof of (I): sum both sides of the detailed balance equation over $i \in \Omega$. Since $\sum_i P_{j,i} = 1$, this gives $\sum_i \pi(i) P_{i,j} = \pi(j)$. Proof of (II): $\pi$ is a stationary distribution of $P$, so $\mathbb{P}(X_t = i) = \pi(i)$ for all $t = 1, 2, \ldots$ along the chain. Then
$$\begin{aligned}
\mathbb{P}(X_{t-1} = j \mid X_t = i) &= \mathbb{P}(X_t = i \mid X_{t-1} = j)\, \frac{\mathbb{P}(X_{t-1} = j)}{\mathbb{P}(X_t = i)} && \text{(Bayes rule)} \\
&= P_{j,i}\, \pi(j)/\pi(i) && \text{(stationarity)} \\
&= P_{i,j} && \text{(detailed balance)}.
\end{aligned}$$

14 Outline: Markov Chain Monte Carlo. Recap on Markov chains; Markov chain Monte Carlo; Metropolis-Hastings.

15 Markov chain Monte Carlo method (discrete case). Let $X$ be a discrete random variable with pmf $p$ on $\Omega$, $\varphi$ a bounded function on $\Omega$, and $\theta = \mathbb{E}_p[\varphi(X)]$. Consider a homogeneous Markov chain $(X_t)_{t=0,1,\ldots} \sim \mathrm{Markov}(\lambda, P)$ with initial distribution $\lambda$ on $\Omega$ and transition matrix $P$, such that $P$ is irreducible and admits $p$ as invariant distribution. Then, for any initial distribution $\lambda$, the MCMC estimator
$$\hat\theta_n^{\mathrm{MCMC}} = \frac{1}{n} \sum_{t=0}^{n-1} \varphi(X_t)$$
is (weakly and strongly) consistent: $\hat\theta_n^{\mathrm{MCMC}} \to \theta$ almost surely as $n \to \infty$; and, if the chain is aperiodic, $X_t \to X \sim p$ in distribution as $t \to \infty$.

16 Markov chain Monte Carlo method. The proof follows directly from the ergodic theorem and its corollary. Note that the estimator is biased, as $\lambda \ne p$ (otherwise we would use a standard Monte Carlo estimator). For $t$ large, we have $X_t \approx_d X$. In order to implement the MCMC algorithm we need, for a given target distribution $p$, to find an irreducible (and aperiodic) transition matrix $P$ which admits $p$ as invariant distribution. Most MCMC algorithms use a transition matrix $P$ which is reversible with respect to $p$. The Metropolis-Hastings algorithm provides a generic way to obtain such a $P$ for any target distribution $p$.

17 Outline: Markov Chain Monte Carlo. Recap on Markov chains; Markov chain Monte Carlo; Metropolis-Hastings.

18 Metropolis-Hastings Algorithm. The Metropolis-Hastings (MH) algorithm allows us to simulate a Markov chain with any given stationary distribution. We start with simulation of a random variable $X$ on a discrete state space. Let $p(x) = \tilde p(x)/Z_p$ be a pmf on $\Omega$; we call $p$ (the pmf of) the target distribution. To simplify notation, we assume that $p(x) > 0$ for all $x \in \Omega$. Choose a proposal transition matrix $q(y \mid x)$. We use the notation $Y \sim q(\cdot \mid x)$ to mean $\mathbb{P}(Y = y \mid X = x) = q(y \mid x)$.

19 Metropolis-Hastings Algorithm. Metropolis-Hastings algorithm:
1. Set the initial state $X_0 = x_0$.
2. For $t = 1, 2, \ldots, n-1$:
  2.1 Assume $X_{t-1} = x_{t-1}$.
  2.2 Simulate $Y_t \sim q(\cdot \mid x_{t-1})$ and $U_t \sim U[0,1]$.
  2.3 If $U_t \le \alpha(Y_t \mid x_{t-1})$, where
$$\alpha(y \mid x) = \min\left\{ 1, \frac{p(y)\, q(x \mid y)}{p(x)\, q(y \mid x)} \right\},$$
set $X_t = Y_t$; otherwise set $X_t = x_{t-1}$.
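A minimal Python sketch of the algorithm above (ours, not from the slides); the names `p_tilde`, `q_sample` and `q_density` are our conventions, and the user is assumed to supply the unnormalized target and the proposal.

```python
import numpy as np

def metropolis_hastings(p_tilde, q_sample, q_density, x0, n, rng=None):
    """Generic Metropolis-Hastings sampler (sketch).

    p_tilde(x): unnormalized target; q_sample(rng, x): draw Y ~ q(.|x);
    q_density(y, x): value of q(y|x).
    """
    rng = rng or np.random.default_rng()
    x = x0
    chain = [x]
    for _ in range(n - 1):
        y = q_sample(rng, x)
        # alpha = min(1, p(y) q(x|y) / (p(x) q(y|x))); the normalizing
        # constant Z_p cancels, so p_tilde suffices.
        ratio = (p_tilde(y) * q_density(x, y)) / (p_tilde(x) * q_density(y, x))
        if rng.uniform() <= min(1.0, ratio):
            x = y              # accept the proposal
        chain.append(x)        # on rejection the current state is repeated
    return np.array(chain)
```

Note that only the ratio $\tilde p(y)/\tilde p(x)$ enters the acceptance probability, so the normalizing constant $Z_p$ is never needed.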

20 Metropolis-Hastings Algorithm. The Metropolis-Hastings algorithm defines a Markov chain with transition matrix $P$ such that, for $x, y \in \Omega$,
$$P_{x,y} = \mathbb{P}(X_t = y \mid X_{t-1} = x) = q(y \mid x)\,\alpha(y \mid x) + \rho(x)\, \mathbb{I}(y = x),$$
where $\rho(x)$ is the probability of rejection,
$$\rho(x) = 1 - \sum_{y \in \Omega} q(y \mid x)\, \alpha(y \mid x).$$

21 Metropolis-Hastings Algorithm. Theorem. The transition matrix $P$ of the Markov chain generated by the Metropolis-Hastings algorithm is reversible with respect to $p$, and therefore admits $p$ as stationary distribution. Proof: we check detailed balance. For $x \ne y$,
$$\begin{aligned}
p(x) P_{x,y} &= p(x)\, q(y \mid x)\, \alpha(y \mid x) \\
&= p(x)\, q(y \mid x) \min\left\{ 1, \frac{p(y)\, q(x \mid y)}{p(x)\, q(y \mid x)} \right\} \\
&= \min\left\{ p(x)\, q(y \mid x),\ p(y)\, q(x \mid y) \right\} \\
&= p(y)\, q(x \mid y) \min\left\{ \frac{p(x)\, q(y \mid x)}{p(y)\, q(x \mid y)},\ 1 \right\} \\
&= p(y)\, q(x \mid y)\, \alpha(x \mid y) = p(y) P_{y,x}.
\end{aligned}$$
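The theorem can also be verified numerically on a small state space (a sketch with made-up numbers, not from the slides): build the matrix $P_{x,y}$ of slide 20 for a uniform proposal and test $p(x)P_{x,y} = p(y)P_{y,x}$ for all pairs.

```python
import numpy as np

m = 5
p = np.array([1., 3., 2., 5., 4.])
p /= p.sum()                       # made-up target pmf
q = np.full((m, m), 1.0 / m)       # uniform proposal, q[x, y] = q(y|x)

# Off-diagonal P[x, y] = q(y|x) alpha(y|x); the rejected mass rho(x)
# is added back to the diagonal entry P[x, x].
P = np.zeros((m, m))
for x in range(m):
    for y in range(m):
        alpha = min(1.0, p[y] * q[y, x] / (p[x] * q[x, y]))
        P[x, y] = q[x, y] * alpha
    P[x, x] += 1.0 - P[x].sum()    # rejection probability rho(x)

# Detailed balance: the matrix p(x) P[x, y] should be symmetric.
M = p[:, None] * P
print(np.allclose(M, M.T))         # True
```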

22 Metropolis-Hastings Algorithm. To run the MH algorithm, we need to specify $X_0 = x_0$ (or $X_0 \sim \lambda$) and a proposal $q(y \mid x)$. We only need to know the target $p$ up to a normalizing constant, as $\alpha$ depends only on $p(y)/p(x) = \tilde p(y)/\tilde p(x)$. If the Markov chain simulated by the MH algorithm is irreducible and aperiodic, then the ergodic theorem applies. Verifying aperiodicity is usually straightforward: the MH algorithm may reject the candidate state $y$, so $P_{x,x} > 0$ for at least some states $x \in \Omega$. In order to check irreducibility we need to check that $q$ can take us anywhere in $\Omega$ (so $q$ itself is an irreducible transition matrix), and then that the acceptance step does not trap the chain (as might happen if $\alpha(y \mid x)$ is zero too often).

23 Example: Discrete Distribution on a finite state space. Consider a discrete random variable $X \sim p$ on $\Omega = \{1, 2, \ldots, m\}$ with $p(i) \propto i$, so $Z_p = \sum_{i=1}^m i = \frac{m(m+1)}{2}$. One simple proposal distribution is $Y \sim q$ on $\Omega$ with $q(i) = 1/m$, independent of the current state. Acceptance probability:
$$\alpha(y \mid x) = \min\left\{ 1, \frac{p(y)\, q(x \mid y)}{p(x)\, q(y \mid x)} \right\} = \min\left\{ 1, \frac{y}{x} \right\}.$$
This proposal scheme is clearly irreducible:
$$\mathbb{P}(X_{t+1} = y \mid X_t = x) \ge q(y \mid x)\, \alpha(y \mid x) = \frac{1}{m} \min(1, y/x) > 0.$$

24 Example: Discrete Distribution on a finite state space. Start from $X_0 = 1$. For $t = 1, \ldots, n-1$:
1. Simulate $Y_t \sim U\{1, 2, \ldots, m\}$ and $U_t \sim U[0,1]$.
2. If $U_t \le Y_t / X_{t-1}$, set $X_t = Y_t$; otherwise set $X_t = X_{t-1}$.
For $t$ large, $X_t \approx_d X$.
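A Python sketch of this sampler (ours), following the steps above; the empirical visit frequencies can then be compared with $p(i) = 2i/(m(m+1))$, as in the figures that follow.

```python
import numpy as np

def mh_linear_pmf(m, n, rng=None):
    """MH chain targeting p(i) proportional to i on {1, ..., m}."""
    rng = rng or np.random.default_rng()
    x = 1                                 # X_0 = 1
    chain = np.empty(n, dtype=int)
    chain[0] = x
    for t in range(1, n):
        y = rng.integers(1, m + 1)        # Y_t ~ U{1, ..., m}
        if rng.uniform() <= y / x:        # accept w.p. min(1, y/x)
            x = y
        chain[t] = x
    return chain

chain = mh_linear_pmf(m=20, n=10_000)
# Empirical visit frequencies V_i(n)/n, to compare with p(i).
freqs = np.bincount(chain, minlength=21)[1:] / len(chain)
```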

25 Example: Discrete Distribution on a finite state space. Figure: realization of the MH Markov chain for $n = 100$ with $m = 20$.

26 Example: Discrete Distribution on a finite state space. Figure: average number of visits $V_i(n)/n$ ($n = 1000$) and target pmf $p(i)$.

27 Example: Discrete Distribution on a finite state space. Figure: average number of visits $V_i(n)/n$ ($n = 10{,}000$) and target pmf $p(i)$.

28 Example: Poisson Distribution. We want to simulate $X \sim p$ with $p(x) = e^{-\lambda} \lambda^x / x! \propto \lambda^x / x!$ on $\Omega = \{0, 1, 2, \ldots\}$. For the proposal we use
$$q(y \mid x) = \begin{cases} 1/2 & \text{for } y = x \pm 1,\ x \ge 1, \\ 1 & \text{for } x = 0,\ y = 1, \\ 0 & \text{otherwise}, \end{cases}$$
i.e. toss a coin and add or subtract 1 to $x$ to obtain $y$. Acceptance probability:
$$\alpha(y \mid x) = \begin{cases} \min\left(1, \frac{\lambda}{x+1}\right) & \text{if } y = x + 1,\ x \ge 1, \\ \min\left(1, \frac{x}{\lambda}\right) & \text{if } y = x - 1,\ x \ge 2, \end{cases}$$
and $\alpha(1 \mid 0) = \min(1, \lambda/2)$, $\alpha(0 \mid 1) = \min(1, 2/\lambda)$. The Markov chain is irreducible (check!).

29 Example: Poisson Distribution. Set $X_0 = 1$. For $t = 1, \ldots, n-1$:
1. If $X_{t-1} = 0$, set $Y_t = 1$.
2. Otherwise, simulate $V_t \sim U[0,1]$:
  2.1 if $V_t \le 1/2$, set $Y_t = X_{t-1} + 1$;
  2.2 otherwise set $Y_t = X_{t-1} - 1$.
3. Simulate $U_t \sim U[0,1]$.
4. If $U_t \le \alpha(Y_t \mid X_{t-1})$, set $X_t = Y_t$; otherwise set $X_t = X_{t-1}$.
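A direct transcription of these steps into Python (a sketch; the factors $1/2$ and $2$ in the acceptance ratio come from the asymmetric proposal at the boundary state 0):

```python
import numpy as np

def mh_poisson(lam, n, rng=None):
    """MH chain targeting Poisson(lam) via a +/- 1 random-walk proposal."""
    rng = rng or np.random.default_rng()
    x = 1                                        # X_0 = 1
    chain = np.empty(n, dtype=int)
    chain[0] = x
    for t in range(1, n):
        if x == 0:
            y = 1                                # forced move from 0
        else:
            y = x + 1 if rng.uniform() <= 0.5 else x - 1
        # Ratio p(y) q(x|y) / (p(x) q(y|x)); the boundary cases x = 0
        # and y = 0 carry the extra proposal factor 1/2 or 2.
        if y == x + 1:
            ratio = lam / (x + 1) * (0.5 if x == 0 else 1.0)
        else:
            ratio = x / lam * (2.0 if y == 0 else 1.0)
        if rng.uniform() <= min(1.0, ratio):
            x = y
        chain[t] = x
    return chain
```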

30 Example: Poisson distribution. Figure: realization of the MH Markov chain for $n = 500$ with $\lambda = 20$.

31 Example: Poisson distribution. Figure: average number of visits $V_i(n)/n$ ($n = 1000$) and target pmf $p(i)$.

32 Example: Poisson distribution. Figure: average number of visits $V_i(n)/n$ ($n = 10{,}000$) and target pmf $p(i)$.

33 Example: Image. Consider an $m_1 \times m_2$ image, where $I(i,j) \in \{0, 1, \ldots, 255\}$ is the gray level of pixel $(i,j) \in \Omega = \{0, \ldots, m_1 - 1\} \times \{0, \ldots, m_2 - 1\}$. Consider a discrete random variable taking values in $\Omega$, with unnormalized pmf $\tilde p((i,j)) = I(i,j)$. Proposal transition probabilities: $q((y_1, y_2) \mid (x_1, x_2)) = q_1(y_1 \mid x_1)\, q_2(y_2 \mid x_2)$ with
$$q_1(y_1 \mid x_1) = \begin{cases} 1/3 & \text{if } y_1 = x_1 \pm 1 \pmod{m_1} \text{ or } y_1 = x_1, \\ 0 & \text{otherwise}, \end{cases}$$
and similarly for $q_2(y_2 \mid x_2)$.

34 Example: Simulation of an image. Figure: average number of visits to each pixel $(i,j)$, $V_{(i,j)}(n) = \frac{1}{n} \sum_{t=0}^{n-1} \mathbb{I}(X_t = (i,j))$.

35 Example: Simulation of an image. Figure: target pmf.

36 Metropolis-Hastings algorithm on $\mathbb{R}^d$. The Metropolis-Hastings algorithm generalizes to a continuous state space $\Omega \subseteq \mathbb{R}^d$, where 1. $p$ is a pdf on $\Omega$, and 2. $q(\cdot \mid x)$ is a pdf on $\Omega$ for any $x \in \Omega$. The Metropolis-Hastings algorithm thus defines a Markov chain on $\Omega \subseteq \mathbb{R}^d$. A precise treatment of Markov chains on $\mathbb{R}^d$ is beyond the scope of this course; we just state the most important results without proof. Assume for simplicity that $p(x) > 0$ for all $x \in \Omega$.

37 Metropolis-Hastings algorithm on $\mathbb{R}^d$. The Markov chain $X_0, X_1, \ldots$ on $\Omega \subseteq \mathbb{R}^d$ is irreducible if for any $x \in \Omega$ and any $A \subseteq \Omega$ with $\int_A p(x)\,\mathrm{d}x > 0$, there is $n$ such that $\mathbb{P}(X_n \in A \mid X_0 = x) > 0$. Theorem. If the Metropolis-Hastings chain is irreducible, then for any function $\varphi$ such that $\mathbb{E}_p[|\varphi(X)|] < \infty$, the MH estimator is strongly consistent:
$$\hat\theta_n^{\mathrm{MH}} = \frac{1}{n} \sum_{t=0}^{n-1} \varphi(X_t) \to \theta \quad \text{almost surely as } n \to \infty.$$

38 Example: Gaussian distribution. Let $X \sim N(\mu, \sigma^2)$ with $p(x) = (2\pi\sigma^2)^{-1/2} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. MH algorithm with target pdf $p$ and proposal transition pdf
$$q(y \mid x) = \begin{cases} 1 & \text{for } y \in [x - 1/2,\ x + 1/2], \\ 0 & \text{otherwise}. \end{cases}$$
Acceptance probability:
$$\alpha(y \mid x) = \min\left( 1, \frac{p(y)\, q(x \mid y)}{p(x)\, q(y \mid x)} \right) = \min\left( 1,\ e^{-\frac{(y-\mu)^2}{2\sigma^2} + \frac{(x-\mu)^2}{2\sigma^2}} \right).$$
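A Python sketch of this sampler (ours); since the uniform random-walk proposal is symmetric, $q$ cancels and only $p(y)/p(x)$ is needed, which we evaluate on the log scale for numerical stability.

```python
import numpy as np

def mh_gaussian(mu, sigma, n, x0=0.0, rng=None):
    """MH chain targeting N(mu, sigma^2) with a width-1 uniform random walk."""
    rng = rng or np.random.default_rng()
    x = x0
    chain = np.empty(n)
    chain[0] = x
    for t in range(1, n):
        y = x + rng.uniform(-0.5, 0.5)     # Y ~ U[x - 1/2, x + 1/2]
        # Symmetric proposal: alpha = min(1, p(y)/p(x)).
        log_ratio = ((x - mu) ** 2 - (y - mu) ** 2) / (2 * sigma ** 2)
        if rng.uniform() <= np.exp(min(0.0, log_ratio)):
            x = y
        chain[t] = x
    return chain
```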

39 Example: Gaussian distribution. Figure: realizations from the MH Markov chain $(X_0, \ldots, X_{1000})$ with $X_0 = 0$.

40 Example: Gaussian distribution. Figure: MH estimates $\hat\theta_t = \frac{1}{t} \sum_{i=0}^{t-1} X_i$ of $\theta = \mathbb{E}_p[X]$ for different realizations of the Markov chain.

41 Example: Mixture of Gaussians distribution. Let
$$p(x) = \pi_1 (2\pi\sigma_1^2)^{-1/2} e^{-\frac{(x-\mu_1)^2}{2\sigma_1^2}} + \pi_2 (2\pi\sigma_2^2)^{-1/2} e^{-\frac{(x-\mu_2)^2}{2\sigma_2^2}}.$$
MH algorithm with target pdf $p$ and proposal transition pdf
$$q(y \mid x) = \begin{cases} 1 & \text{for } y \in [x - 1/2,\ x + 1/2], \\ 0 & \text{otherwise}. \end{cases}$$
Acceptance probability:
$$\alpha(y \mid x) = \min\left( 1, \frac{p(y)\, q(x \mid y)}{p(x)\, q(y \mid x)} \right) = \min\left( 1,\ p(y)/p(x) \right).$$

42 Example: Mixture of Gaussians distribution. Figure: realization of the Markov chain with $X_0 = 0$.

43 Example. Consider a Metropolis algorithm for simulating samples from a bivariate normal distribution with mean $\mu = (0, 0)$ and covariance matrix $\Sigma$, using a proposal distribution for each component: $x_i' \sim U(x_i - w, x_i + w)$, $i \in \{1, 2\}$. The performance of the algorithm can be controlled by setting $w$. If $w$ is small, then we propose only small moves and the chain moves slowly around the parameter space. If $w$ is large, then we may accept only a few moves. There is an art to implementing MCMC that involves the choice of good proposal distributions; a sketch of this sampler is given below.

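A sketch of this random-walk Metropolis sampler in Python (ours); the covariance entries were lost in transcription, so the $\Sigma$ below is a made-up stand-in, and $w$ is the tuning parameter discussed above.

```python
import numpy as np

def metropolis_bvn(Sigma, w, n, x0=(0.0, 0.0), rng=None):
    """Random-walk Metropolis for a bivariate N((0,0), Sigma) target."""
    rng = rng or np.random.default_rng()
    Sigma_inv = np.linalg.inv(Sigma)
    log_p = lambda x: -0.5 * x @ Sigma_inv @ x   # log target up to a constant
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n, 2))
    chain[0] = x
    for t in range(1, n):
        y = x + rng.uniform(-w, w, size=2)       # perturb each component
        # Symmetric proposal, so alpha = min(1, p(y)/p(x)).
        if rng.uniform() <= np.exp(min(0.0, log_p(y) - log_p(x))):
            x = y
        chain[t] = x
    return chain

# Tuning illustration: small w mixes slowly, large w rejects often.
Sigma = np.array([[1.0, 0.9], [0.9, 1.0]])       # made-up covariance
chain = metropolis_bvn(Sigma, w=0.1, n=10_000)
```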

48 Example (figure): $N = 4$, $w = 0.1$.

49 Example (figure): $N = 4$, $w = 0.1$.

50 Example (figure): $N = 4$, $w = 0.1$.

51 Example (figure): $N = 4$, $w = 0.01$.

52 Example (figure): $N = 4$, $w = 0.01$.

53 Example (figure): $N = 4$, $w = 10$.

54 MCMC: Practical aspects. The MCMC chain does not start from the invariant distribution, so $\mathbb{E}[\varphi(X_t)] \ne \mathbb{E}[\varphi(X)]$, and the difference can be significant for small $t$; $X_t$ converges in distribution to $p$ as $t \to \infty$. Common practice is to discard the first $b$ values of the Markov chain, $X_0, \ldots, X_{b-1}$, where we assume that $X_b$ is approximately distributed from $p$, and to use the estimator
$$\frac{1}{n-b} \sum_{t=b}^{n-1} \varphi(X_t).$$
The initial segment $X_0, \ldots, X_{b-1}$ is called the burn-in period of the MCMC chain.
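In code, the burn-in estimator is simply a slice of the stored chain (a sketch, reusing for instance the Gaussian sampler above):

```python
import numpy as np

def mcmc_estimate(chain, phi, burn_in):
    """Estimate E_p[phi(X)] discarding the first burn_in states."""
    kept = chain[burn_in:]                  # X_b, ..., X_{n-1}
    return np.mean([phi(x) for x in kept])  # (1/(n-b)) sum_{t=b}^{n-1} phi(X_t)

# e.g. theta_hat = mcmc_estimate(mh_gaussian(0.0, 1.0, 20_000), lambda x: x, 1_000)
```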
