
Chapter 5

Markov Chain Monte Carlo

MCMC is a refinement of the Monte Carlo method. By sampling from a Markov chain whose stationary distribution is the desired sampling distribution, it is possible to generate observations from distributions that are not easy to sample by the usual Monte Carlo method.

The idea is pretty simple. Suppose that we have a target distribution $\pi(x)$, $x \in \mathbb{R}^d$, say, which is known only up to some multiplicative constant. If $\pi(x)$ is not so easy to sample from directly, an indirect method for obtaining samples from $\pi(\cdot)$ is to construct a Markov chain whose stationary distribution is $\pi(x)$. When we run the chain long enough, simulated values from the chain can be treated as a sample from the target distribution and used as a basis for summarizing important features of $\pi(\cdot)$. Under certain regularity conditions, the Markov chain sample path mimics a random sample from $\pi(\cdot)$. Given a realization $\{X_t,\ t = 0, 1, \dots\}$ from such a chain, typical asymptotic results include
\[ X_t \xrightarrow{d} \pi(x), \qquad t \to \infty, \]
and
\[ \frac{1}{n} \sum_{t=1}^{n} \theta(X_t) \to E_\pi \theta(X), \qquad n \to \infty \quad \text{(ergodic theorem)}. \]

Major MCMC application areas: Bayesian inference, conditional frequentist inference problems for categorical data, EM algorithms.
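To preview the ergodic theorem in action, here is a minimal R sketch. The two-state chain and its transition matrix are our own toy example, not part of the notes; we simulate the chain and compare the time average of $\theta(X_t) = X_t$ with its stationary expectation.

## Toy two-state chain on {0, 1}; illustrative example only.
## Its stationary distribution is (0.8, 0.2), so E_pi(X) = 0.2.
set.seed(1)
P <- matrix(c(0.9, 0.1,
              0.4, 0.6), nrow = 2, byrow = TRUE)
n <- 1e5
x <- integer(n)
x[1] <- 0                                    # arbitrary starting state
for (t in 1:(n - 1)) {
  x[t + 1] <- sample(0:1, 1, prob = P[x[t] + 1, ])
}
mean(x)    # time average; close to 0.2 for large n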

5.1 Markov chains

5.1.1 Definitions and basic properties

A Markov process $\{X_t,\ t \in T\}$ is a stochastic process with the Markov property: given the current value of $X_t$, the future values $X_s$ for $s > t$ are not influenced by the past values $X_u$ for $u < t$.

    X_u         X_t         X_s
    u < t        t          s > t
    past       current      future

Markov processes are classified according to (i) the nature of the index set $T$ of the process (discrete time or continuous time), and (ii) the nature of the state space of the process, or, essentially, the values $X_t$ assumes with positive probability ($X_t$ a discrete or a continuous random variable). This gives four basic types:

                                    State space
                         Discrete             Continuous
  Index      Discrete    Discrete-time        Discrete-time
  (time)                 Markov chain         Markov process
             Continuous  Continuous-time      Continuous-time
                         Markov chain         Markov process

A discrete-time Markov chain is a Markov process whose time index set is $T = \{0, 1, 2, \dots\}$ and whose state space is a finite or countable set. It is often convenient to label such a state space by the nonnegative integers $\{0, 1, 2, \dots\}$, and it is customary to speak of $X_n$ as being in state $i$ if $X_n = i$. In other words, a discrete-time Markov chain is a sequence of random variables $X_0, X_1, \dots$ with the following Markov property:
\[ P(X_{i+1} = y \mid X_i = x_i, \dots, X_0 = x_0) = P(X_{i+1} = y \mid X_i = x_i), \]
i.e., given the current state $X_i$, the future state $X_{i+1}$ depends only on the current state $X_i$ and not on the past states $X_{i-1}, \dots, X_0$.

Example 1. Let $Y_0, Y_1, \dots, Y_n, \dots$ be independent discrete random variables. Define
\[ X_n = \sum_{k=0}^{n} Y_k, \qquad n \in \{0, 1, 2, \dots\}. \]

It forms a Markov chain, since
\[
P(X_{i+1} = y \mid X_i = x_i, \dots, X_0 = x_0)
= \frac{P(X_{i+1} = y,\ X_i = x_i, \dots, X_0 = x_0)}{P(X_i = x_i, \dots, X_0 = x_0)}
= \frac{P(Y_{i+1} + X_i = y,\ X_i = x_i, \dots, X_0 = x_0)}{P(X_i = x_i, \dots, X_0 = x_0)}
\]
\[
= \frac{P(Y_{i+1} = y - x_i,\ X_i = x_i, \dots, X_0 = x_0)}{P(X_i = x_i, \dots, X_0 = x_0)}
= \frac{P(Y_{i+1} = y - x_i)\, P(X_i = x_i, \dots, X_0 = x_0)}{P(X_i = x_i, \dots, X_0 = x_0)}
= P(Y_{i+1} = y - x_i), \quad \text{(why?)}
\]
and, similarly, $P(X_{i+1} = y \mid X_i = x_i) = P(Y_{i+1} = y - x_i)$.

In order to specify the probability law of a Markov chain $\{X_n,\ n = 0, 1, \dots\}$, it suffices to state two types of probabilities:

(i) the probability of the initial state $X_0$,
\[ p_i = P(X_0 = i), \qquad i = 0, 1, \dots, \]

(ii) the one-step transition probability
\[ p^{n,n+1}_{ij} = P(X_{n+1} = j \mid X_n = i), \qquad i, j = 0, 1, 2, \dots,\quad n = 0, 1, 2, \dots, \]
which is the probability of $X_{n+1}$ being in state $j$ given that $X_n$ is in state $i$.

To see this, it is enough to show how to evaluate the finite-dimensional probability $P(X_0 = i_0, X_1 = i_1, X_2 = i_2, \dots, X_n = i_n)$, since any other probability involving $X_{j_1}, X_{j_2}, \dots, X_{j_k}$, say, for $j_1 < j_2 < \dots < j_k$, can be obtained by summing terms of these forms. Now
\[
P(X_0 = i_0, X_1 = i_1, \dots, X_n = i_n)
= P(X_0 = i_0, \dots, X_{n-1} = i_{n-1})\, P(X_n = i_n \mid X_0 = i_0, \dots, X_{n-1} = i_{n-1})
\]
\[
= P(X_0 = i_0, \dots, X_{n-1} = i_{n-1})\, P(X_n = i_n \mid X_{n-1} = i_{n-1})
= P(X_0 = i_0, \dots, X_{n-1} = i_{n-1})\, p^{n-1,n}_{i_{n-1}, i_n}
= \dots = p_{i_0}\, p^{0,1}_{i_0, i_1} \cdots p^{n-1,n}_{i_{n-1}, i_n}.
\]
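This factorization translates directly into code. A minimal R sketch follows; the function name path_prob, the two-state matrix, and the path are our own illustrative choices, and for simplicity we assume the transition probabilities do not depend on $n$ (the stationary case introduced in the next subsection).

## P(X_0 = i_0, ..., X_n = i_n) = p_{i_0} * p_{i_0 i_1} * ... * p_{i_{n-1} i_n}
## for states labeled 0, 1, ... (shifted to R's 1-based indexing).
path_prob <- function(path, p0, P) {
  i <- path + 1
  prob <- p0[i[1]]                       # initial probability p_{i_0}
  for (k in seq_len(length(i) - 1)) {
    prob <- prob * P[i[k], i[k + 1]]     # one-step transition factor
  }
  prob
}
P  <- matrix(c(0.9, 0.1,
               0.4, 0.6), nrow = 2, byrow = TRUE)
p0 <- c(0.5, 0.5)
path_prob(c(0, 0, 1), p0, P)   # 0.5 * 0.9 * 0.1 = 0.045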

5.1.2 Stationary transition probabilities

The one-step transition probability $p^{n,n+1}_{ij}$ is in general a function not only of the initial and final states, but of the time of transition as well. A simplification is made as follows. When the one-step transition probabilities are independent of the time variable $n$, we say that the Markov chain has stationary transition probabilities. Then $p^{n,n+1}_{ij} = p_{ij}$, and $p_{ij}$ is the conditional probability that the state value undergoes a transition from $i$ to $j$ in one trial. Sometimes we write $p_{ij}$ as $p(i, j)$, and call $p(x, y)$ the transition kernel. It is customary to put these probabilities in a (possibly infinite) matrix, called the transition probability matrix of the Markov chain,
\[
P = \begin{pmatrix}
p_{00} & p_{01} & p_{02} & \cdots \\
p_{10} & p_{11} & p_{12} & \cdots \\
\vdots & \vdots & \vdots & \\
p_{i0} & p_{i1} & p_{i2} & \cdots \\
\vdots & \vdots & \vdots &
\end{pmatrix}.
\]
Some properties of the $p_{ij}$'s are

(i) $p_{ij} \ge 0$, $i, j = 0, 1, 2, \dots$,

(ii) $\sum_{j=0}^{\infty} p_{ij} = 1$, $i = 0, 1, 2, \dots$

What does it mean? (Each row of $P$ is a probability distribution: starting from state $i$, the chain must move to some state $j$.)

Example 2. A Markov chain $X_0, X_1, X_2, \dots$ on states 0, 1, 2 has a given transition probability matrix $P = (p_{ij})$ and initial probabilities
\[ P(X_0 = 0) = 0.3, \quad P(X_0 = 1) = 0.6, \quad P(X_0 = 2) = 0.1. \]
Determine $P(X_0 = 0, X_1 = 1, X_2 = 2)$. (By the factorization above, this equals $P(X_0 = 0)\, p_{01}\, p_{12} = 0.3\, p_{01}\, p_{12}$.)

Denote the probability that a Markov chain goes from state $i$ to state $j$ in $n$ transitions by
\[ p^{(n)}_{ij} = P(X_{m+n} = j \mid X_m = i), \]
and call $P^{(n)} = (p^{(n)}_{ij})$ the $n$-step transition probability matrix.

Theorem 1. The $n$-step transition probabilities of a Markov chain satisfy
\[ p^{(n)}_{ij} = \sum_{k=0}^{\infty} p_{ik}\, p^{(n-1)}_{kj}, \]
where
\[ p^{(0)}_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]
In other words, $P^{(n)} = P^n$.
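Since $P^{(n)} = P^n$, the $n$-step probabilities are just matrix powers. A minimal R sketch; the helper name matpow and the 3-state matrix are our own illustrative choices.

## n-step transition matrix P^(n) = P^n by repeated multiplication.
matpow <- function(P, n) {
  Q <- diag(nrow(P))                 # P^(0) is the identity matrix
  for (k in seq_len(n)) Q <- Q %*% P
  Q
}
P <- matrix(c(0.5, 0.3, 0.2,
              0.1, 0.6, 0.3,
              0.2, 0.2, 0.6), nrow = 3, byrow = TRUE)
matpow(P, 2)           # p^(2)_{ij} = P(X_{m+2} = j | X_m = i)
rowSums(matpow(P, 2))  # each row still sums to 1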

Theorem 2. If $P(X_0 = j) = p_j$, $j = 0, 1, \dots$, then the probability of the Markov chain being in state $k$ at time $n$ is
\[ p^{(n)}_k = P(X_n = k) = \sum_{j=0}^{\infty} p_j\, p^{(n)}_{jk}. \]

Example 3. A Markov chain $X_0, X_1, X_2, \dots$ on the states 0, 1, 2 has a given transition probability matrix $P$. Find (i) the two-step transition matrix $P^{(2)}$, (ii) $P(X_3 = 1 \mid X_1 = 0)$, (iii) $P(X_3 = 1 \mid X_0 = 0)$.

5.1.3 Regular transition probability matrices

Consider a Markov chain on a finite number of states labeled $0, 1, \dots, N$. Suppose that a transition probability matrix $P = (p_{ij})$ has the property that, when raised to some power $k$, the matrix $P^k$ has all of its elements strictly positive. Such a transition probability matrix, or the corresponding Markov chain, is called regular.

Example 4. For a Markov chain whose transition probability matrix is
\[ P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}, \]
the $n$-step transition matrix is
\[ P^n = \frac{1}{a+b} \begin{pmatrix} b & a \\ b & a \end{pmatrix} + \frac{(1-a-b)^n}{a+b} \begin{pmatrix} a & -a \\ -b & b \end{pmatrix}. \]
The chain is regular when $0 < a, b < 1$, and in this case the limiting distribution is
\[ \pi = \left( \frac{b}{a+b},\ \frac{a}{a+b} \right). \]
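Example 4 is easy to check numerically. A short R sketch with an arbitrary choice of $a$ and $b$ (our own numbers):

## Two-state chain of Example 4, with (arbitrary) a = 0.3, b = 0.6.
a <- 0.3; b <- 0.6
P <- matrix(c(1 - a, a,
              b, 1 - b), nrow = 2, byrow = TRUE)
Pn <- diag(2)
for (k in 1:50) Pn <- Pn %*% P   # P^50; (1-a-b)^50 is negligible
Pn                      # both rows are approximately (b, a)/(a+b)
c(b, a) / (a + b)       # limiting distribution: 0.6667 0.3333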

The most important fact is the existence of a limiting probability distribution $\pi = (\pi_0, \pi_1, \dots, \pi_N)$, with $\pi_j > 0$ ($j = 0, 1, \dots, N$) and $\sum_{j=0}^{N} \pi_j = 1$, which is independent of the initial state:
\[ \lim_{n \to \infty} p^{(n)}_{ij} = \pi_j > 0, \qquad j = 0, 1, \dots, N, \]
or, in terms of the Markov chain,
\[ \lim_{n \to \infty} P(X_n = j \mid X_0 = i) = \pi_j > 0, \qquad j = 0, 1, \dots, N. \]

Theorem. Let $P$ be a regular transition probability matrix on the states $0, 1, \dots, N$. Then the limiting distribution $\pi = (\pi_0, \pi_1, \dots, \pi_N)$ is the unique nonnegative solution of the equations
\[ \pi_j = \sum_{k=0}^{N} \pi_k\, p_{kj}, \qquad j = 0, 1, \dots, N, \qquad \sum_{k=0}^{N} \pi_k = 1. \]
In other words, $\pi = \pi P$.

This limiting distribution is known as the stationary distribution. If the initial distribution $p^{(0)}$ is the stationary distribution $\pi$, then $\pi^{(1)} = p^{(0)} P = \pi P = \pi$, and continuing in the same fashion, $\pi^{(n)} = \pi$ for all $n$.

5.1.4 The classification of states

Not all Markov chains are regular.

Example 5. For the identity matrix
\[ P = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \]
clearly $P^n = P$ for all $n$.

Example 6. For
\[ P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \]
in this case
\[ P^n = \begin{cases} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, & \text{if } n \text{ is even}, \\[1ex] \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, & \text{if } n \text{ is odd}. \end{cases} \]
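For a regular chain, $\pi$ can also be computed numerically from $\pi = \pi P$, as the normalized left eigenvector of $P$ with eigenvalue 1; a minimal R sketch (the function name stat_dist is ours). For the chains of Examples 5 and 6, by contrast, $\lim_n P^n$ depends on the starting state (Example 5) or does not exist (Example 6).

## Stationary distribution from pi = pi P: t(pi) is the eigenvector
## of t(P) for eigenvalue 1, normalized to sum to 1.
stat_dist <- function(P) {
  e <- eigen(t(P))
  v <- Re(e$vectors[, which.min(abs(e$values - 1))])
  v / sum(v)
}
P <- matrix(c(0.7, 0.3,
              0.4, 0.6), nrow = 2, byrow = TRUE)
stat_dist(P)            # (4/7, 3/7), matching Example 4 with a = .3, b = .4
stat_dist(P) %*% P      # returns the same distribution: pi = pi P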

A state $i$ is periodic if there exists an integer $d > 1$ such that $p^{(n)}_{ii} = 0$ whenever $n$ is not divisible by $d$. A state $i$ is aperiodic if it is not periodic. A Markov chain is aperiodic if all states are aperiodic.

Example 7. Is there a limiting distribution for
\[ P = \begin{pmatrix} 1/2 & 1/2 \\ 0 & 1 \end{pmatrix}? \]

> P = matrix(c(1/2, 0, 1/2, 1), nrow = 2)
> P
     [,1] [,2]
[1,]  0.5  0.5
[2,]  0.0  1.0
> P %*% P
     [,1] [,2]
[1,] 0.25 0.75
[2,] 0.00 1.00
> P %*% P %*% P
      [,1]  [,2]
[1,] 0.125 0.875
[2,] 0.000 1.000
> P %*% P %*% P %*% P
       [,1]   [,2]
[1,] 0.0625 0.9375
[2,] 0.0000 1.0000
> P %*% P %*% P %*% P %*% P
        [,1]    [,2]
[1,] 0.03125 0.96875
[2,] 0.00000 1.00000

Finally,
\[ \lim_{n \to \infty} P^n = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}. \]
Here state 0 is transient: after the chain starts from state 0 there is a positive probability that it will never return to that state.

State $j$ is said to be accessible from state $i$ if there is positive probability that state $j$ can be reached starting from state $i$ in some finite number of transitions, namely, $p^{(n)}_{ij} > 0$ for some integer $n \ge 0$. Two states $i$ and $j$, each accessible from the other, are said to communicate. A Markov chain is irreducible if all states communicate with each other; that is, every state can be reached from every other state. For an irreducible, aperiodic chain (on a finite state space; in general positive recurrence is also needed), the stationary distribution exists and is unique.

5.1.5 Basic idea about Markov chain sampling

Suppose that $\pi(\cdot)$ is a distribution we wish to simulate. A way to generate values from $\pi(\cdot)$ is to construct a Markov chain with $\pi(\cdot)$ as its stationary distribution, and to run the chain from an arbitrary starting value until the distribution converges to $\pi$. Two important questions: (i) how to construct an appropriate Markov chain, and (ii) how long the chain needs to run to reach the stationary distribution.

5.2 The Metropolis-Hastings algorithm

Practical applications of Markov chain sampling started from Metropolis et al. (1953)^1, whose basic proposal Hastings (1970)^2 extended, giving some of the first applications in the statistical literature. The M-H algorithm gives a general method for constructing a Markov chain with stationary distribution given by a target (or approximating) density function $\pi(x)$.

Choose an appropriate Markov chain transition kernel $q(x, y)$ with the following properties: (i) its state space is the same as that of $\pi(\cdot)$, (ii) it is close to $\pi(y)$, and (iii) it is easy to sample from. Define
\[ \alpha(x, y) = \min\left\{ \frac{\pi(y)\, q(y, x)}{\pi(x)\, q(x, y)},\ 1 \right\}. \]
Clearly $0 \le \alpha(x, y) \le 1$. Given the state of the chain at time $n$, $X^{(n)}$, the M-H algorithm samples a trial value $X^{(n+1)}_t$ from $q(X^{(n)}, \cdot)$, and sets
\[ X^{(n+1)} = \begin{cases} X^{(n+1)}_t, & \text{with probability } \alpha(X^{(n)}, X^{(n+1)}_t), \\ X^{(n)}, & \text{with probability } 1 - \alpha(X^{(n)}, X^{(n+1)}_t). \end{cases} \]
Typically, this is established in practice by drawing $U^{(n)} \sim U(0, 1)$ and setting
\[ X^{(n+1)} = X^{(n+1)}_t\, I\{U^{(n)} \le \alpha(X^{(n)}, X^{(n+1)}_t)\} + X^{(n)}\, I\{U^{(n)} > \alpha(X^{(n)}, X^{(n+1)}_t)\}, \]
where $I(A)$ is the indicator function of the set $A$.

^1 Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. (1953). Equations of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087-1092.
^2 Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97-109.
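The update above can be coded once and reused for any kernel. A minimal R sketch; the function name mh and its argument names are our own, with pi_u a target known only up to a multiplicative constant.

## Generic M-H: run n steps from starting value x0.
##   pi_u : unnormalized target density pi(.)
##   rq   : function(x) returning one draw from q(x, .)
##   dq   : function(x, y) evaluating the kernel density q(x, y)
mh <- function(n, x0, pi_u, rq, dq) {
  x <- numeric(n)
  x[1] <- x0
  for (t in 1:(n - 1)) {
    y <- rq(x[t])                                  # trial value
    alpha <- min(1, (pi_u(y) * dq(y, x[t])) /
                    (pi_u(x[t]) * dq(x[t], y)))    # acceptance probability
    x[t + 1] <- if (runif(1) <= alpha) y else x[t] # accept or stay
  }
  x
}

Note that pi_u enters only through the ratio pi_u(y)/pi_u(x[t]), so the normalizing constant of $\pi$ never needs to be known.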

The transition probability distribution of the resulting chain is given by
\[ p(x, y) = \begin{cases} q(x, y)\, \alpha(x, y), & y \ne x, \\ 1 - \sum_{u \ne x} q(x, u)\, \alpha(x, u), & y = x, \end{cases} \]
for which $\pi(x)$ is the stationary distribution.

5.2.1 Random walk chain (Metropolis et al. (1953))

Let $g$ be a density defined on the same space as $\pi(\cdot)$, and set $q(x, y) = g(y - x)$. One choice for $g$ would be the normal density with mean zero and covariance matrix
\[ \hat{H} = \left\{ -\frac{\partial^2}{\partial x\, \partial x^T} \log \pi(x) \Big|_{x = \hat{x}} \right\}^{-1}, \]
where $\hat{x}$ is the mode of $\pi(x)$. In case $g$ is symmetric, then
\[ \alpha(x, y) = \min\left\{ \frac{\pi(y)\, g(y - x)}{\pi(x)\, g(x - y)},\ 1 \right\} = \min\left\{ \frac{\pi(y)}{\pi(x)},\ 1 \right\}. \]

5.2.2 Independence chain

Let $g$ be a density defined on the same space as $\pi(\cdot)$, and set $q(x, y) = g(y)$, so that trial values are generated independently of the current value. Now
\[ \alpha(x, y) = \min\left\{ \frac{\pi(y)\, g(x)}{\pi(x)\, g(y)},\ 1 \right\} = \min\left\{ \frac{\pi(y)/g(y)}{\pi(x)/g(x)},\ 1 \right\}, \]
so that $\alpha(X^{(n)}, X^{(n+1)}_t)$ is the ratio (capped at 1) of the importance sampling weights at the trial and the current points.

5.2.3 Rejection sampling chain

For rejection sampling, we need to find a function $g$ which dominates the density $\pi$ everywhere, i.e., $g \ge \pi$. If $g$ does not actually dominate $\pi(\cdot)$, then choose
\[ h(y) \propto \min(\pi(y), g(y)), \]
which results in $q(x, y) = h(y)$ and
\[ \alpha(x, y) = \min\left\{ \frac{\pi(y)\, h(x)}{\pi(x)\, h(y)},\ 1 \right\}. \]
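As a concrete instance of the random walk chain of 5.2.1, where the symmetric increment density cancels and $\alpha$ reduces to $\min\{\pi(y)/\pi(x), 1\}$, here is a self-contained R sketch; the unnormalized N(0,1) target and the N(0,1) increments are our own toy choices.

## Random walk Metropolis with symmetric g, so
## alpha = min(pi(y)/pi(x), 1); target is N(0,1) up to a constant.
set.seed(1)
target <- function(x) exp(-x^2 / 2)
n <- 1e4
x <- numeric(n)                       # start at x = 0, the mode
for (t in 1:(n - 1)) {
  y <- x[t] + rnorm(1)                # trial from g = N(0,1) increments
  if (runif(1) <= min(1, target(y) / target(x[t]))) {
    x[t + 1] <- y                     # accept the trial value
  } else {
    x[t + 1] <- x[t]                  # stay at the current value
  }
}
c(mean(x), var(x))                    # approximately 0 and 1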

5.2.4 Gibbs sampling

Suppose that $\pi(x)$ is the density of a random vector $(U, V)$, $\pi_{U|V}(u \mid v)$ is the conditional density of $U$ given $V$, and $\pi_{V|U}(v \mid u)$ is the conditional density of $V$ given $U$. Gibbs sampling is a block-at-a-time update scheme where the new values for each block are generated directly from the conditional distributions $\pi_{U|V}(u \mid v)$ and $\pi_{V|U}(v \mid u)$. It turns out that a variety of problems, such as hierarchical models, can be put in a framework where the full conditional distributions are easy to sample from.

Each of the conditional distributions $\pi_{U|V}(u \mid v)$ and $\pi_{V|U}(v \mid u)$ defines a Markov chain update that preserves the stationary distribution $\pi$. Given the current state $X^{(n)} = (U^{(n)}, V^{(n)})$, consider the two-step update:

(i) generate $U^{(n+1)}$ from $\pi_{U|V}(\cdot \mid V^{(n)})$,

(ii) generate $V^{(n+1)}$ from $\pi_{V|U}(\cdot \mid U^{(n+1)})$,

which may be thought of as generating a single update $X^{(n+1)}$ of the chain.
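A standard toy case where both full conditionals are available (our own illustration, not from the notes): for $(U, V)$ bivariate normal with zero means, unit variances, and correlation $\rho$, the full conditionals are $U \mid V = v \sim N(\rho v, 1 - \rho^2)$ and $V \mid U = u \sim N(\rho u, 1 - \rho^2)$. A minimal R sketch of the two-step update:

## Gibbs sampler for a bivariate normal with correlation rho.
set.seed(1)
rho <- 0.8
s <- sqrt(1 - rho^2)                 # conditional standard deviation
n <- 1e4
U <- numeric(n); V <- numeric(n)     # start at (0, 0)
for (t in 1:(n - 1)) {
  U[t + 1] <- rnorm(1, rho * V[t], s)       # (i)  U update given V
  V[t + 1] <- rnorm(1, rho * U[t + 1], s)   # (ii) V update given new U
}
cor(U, V)                            # approximately rho = 0.8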
