Advanced Sampling Algorithms


1 + Advanced Sampling Algorithms

2 + Advanced Sampling Algorithms Mobashir Mohammad, Hirak Sarkar, Parvathy Sudhir, Yamilet Serrano Llerena, Aditya Kulkarni, Tobias Bertelsen, Nirandika Wanigasekara, Malay Singh

3 + Introduction Mobashir Mohammad 3

4 + Sampling Algorithms 4 Monte Carlo algorithms: give an approximate answer with a certain probability; may terminate with a failure signal; the correct answer is not guaranteed, but the running time is bounded. Las Vegas algorithms: always give an exact answer; do not terminate in a definite time; the answer is guaranteed, but the running time is not bounded.

5 + Monte Carlo Method 5 What are the chances of winning with a properly shuffled deck? Hard to compute, because winning or losing depends on a complex procedure of reorganizing the cards. Insight: why not play just a few hands and see empirically how many we in fact win? More generally: we can approximate a probability density function using only a few samples from that density.

6 + Simplest Monte Carlo Method 6 Finding the value of π by shooting darts. π/4 is equal to the area of a circle of diameter 1: π/4 = ∫∫_{x² + y² < 1} dx dy. The integral is solved with Monte Carlo: randomly select a large number N_total of points inside the unit square and count the number N_inside_circle that fall inside the circle; then π ≈ 4 · N_inside_circle / N_total (exact as N_total → ∞).
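A minimal sketch (not from the slides) of this dart-throwing estimate in Python: draw N_total uniform points in the unit square and count how many land inside the inscribed circle of diameter 1.

```python
import random

def estimate_pi(n_total=1_000_000):
    """Estimate pi by sampling uniform points in the unit square and
    counting those inside the inscribed circle of diameter 1."""
    n_inside = 0
    for _ in range(n_total):
        x, y = random.random(), random.random()
        if (x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.25:  # circle centred at (0.5, 0.5)
            n_inside += 1
    return 4 * n_inside / n_total

print(estimate_pi())  # close to 3.14159; the error shrinks as n_total grows
```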

7 + Monte Carlo Principle 7 Given a very large set X and a distribution p(x) over it, we draw a set of N independent and identically distributed samples x_1, …, x_N. We can then approximate the distribution using these samples: p_N(x) = (1/N) Σ_{i=1}^N 1(x_i = x) ≈ p(x). We can also use these samples to compute expectations.

8 + Sampling 8 Problem: exact inference from a probability distribution is hard. Example: determine the number of people likely to go shopping at a Nike store at Vivo City on a Sunday. Solution: use random samples to get an approximate inference. How: use a Markov chain to get random samples!

9 + Markov Chains Hirak Sarkar 9

10 + Introduction to Markov Chain 10 Real world systems contain uncertainty and evolve over time. Stochastic processes and Markov chains model such systems. A discrete time stochastic process is a sequence of random variables X_0, X_1, … and is typically denoted by {X_n}.

11 + Components of a Stochastic Process 11 The state space of a stochastic process is the set of all values that {X_n} can take. Time: n = 1, 2, … The set of states is denoted by S: S = {s_1, s_2, …, s_k}; here there are k states. Clearly each X_n takes one of the k values.

12 + Markov Property and Markov Chain 12 Consider the special class of stochastic processes that satisfy the Markov property. Markov property: the state X_n depends only on X_{n−1}, not on any X_i for i < n − 1. Formally, P(X_n = i_n | X_0 = i_0, …, X_{n−1} = i_{n−1}) = P(X_n = i_n | X_{n−1} = i_{n−1}) = p(i_n, i_{n−1}). We call such a process a Markov chain. The matrix containing p(i_n, i_{n−1}) in cell (i_n, i_{n−1}) is called the transition matrix.

13 + Transition Matrix 13 The one-step transition matrix for a Markov chain with states S = {0, 1, 2} is P = [p_00 p_01 p_02; p_10 p_11 p_12; p_20 p_21 p_22], where p_ij = Pr(X_1 = j | X_0 = i). If P is the same at every point of time, the chain is called time homogeneous. Rows are probability distributions: Σ_{j=1}^m p_{i,j} = 1 and p_{i,j} ≥ 0 for all i, j.

14 + Example: Random Walk 14 [Figure: a small graph with its adjacency matrix and the corresponding transition matrix, with entries 1 and 1/2.]

15 + 15 What is a random walk? [Figure: a walker on the three-node graph A, B, C at time t = 0, with transition probabilities 1 and 1/2 marked on the edges.]

16 + 16 What is a random walk? [Figure: the same walk shown at t = 0 and t = 1.]

17 + 17 What is a random walk? [Figure: the walk continued to t = 2.]

18 + 18 What is a random walk? [Figure: the walk continued to t = 3.]

19 + 19 Probability Distributions Let x_t(i) = P(X_t = i) be the probability that the surfer is at node i at time t. Then x_{t+1}(i) = Σ_j P(X_{t+1} = i | X_t = j) P(X_t = j) = Σ_j p_{j,i} x_t(j). In vector form, x_{t+1} = x_t P = x_{t−1} P P = … = x_0 P^{t+1}. What happens when the surfer keeps walking for a long time?
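As an illustration (a sketch only: the exact transition probabilities in the slide's figure are not recoverable from this transcript, so the matrix below is a hypothetical three-node walk), repeated multiplication x_{t+1} = x_t P shows the distribution settling down:

```python
import numpy as np

# Hypothetical 3-node walk for illustration: from A go to B, from B go to C,
# from C go to A or B with probability 1/2 each (rows sum to 1).
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])

x = np.array([1.0, 0.0, 0.0])   # x_0: start at node A
for t in range(50):
    x = x @ P                   # x_{t+1} = x_t P

print(x)        # close to the stationary distribution pi
print(x @ P)    # one more step barely changes it: x is (almost) a fixed point
```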

20 + 20 Stationary Distribution When the surfer keeps walking for a long time, the distribution does not change any more, i.e. x_{T+1} = x_T. We denote it as π.

21 + Basic Limit Theorem 21 [Figure: the matrices P, P^5, P^10, P^20 for an example chain; as the power grows the rows become nearly identical, each approaching the stationary distribution.]

22 + Example 22

23 + Classification of States 23 State j is reachable from state i if the probability to go from i to j in n > 0 steps is greater than zero (state j is reachable from state i if in the state transition diagram there is a path from i to j). A subset S' of the state space S is closed if p_ij = 0 for every i ∈ S' and j ∉ S'. A state i is said to be absorbing if it is a single-element closed set. A closed set S' of states is irreducible if any state j ∈ S' is reachable from every state i ∈ S'. A Markov chain is said to be irreducible if the state space S is irreducible.

24 + Example 24 [Figure: an irreducible Markov chain on states 0, 1, 2 with transitions p_01, p_12, p_22, p_00, p_10, p_21; and a reducible Markov chain on states 0 to 4 with transitions p_01, p_12, p_14, p_32, p_22, p_33, where state 4 is absorbing and {2, 3} form a closed irreducible set.]

25 + Periodic and Aperiodic States 25 Suppose that the structure of the Markov chain is such that state i is visited only after a number of steps that is an integer multiple of an integer d > 1. Then the state is called periodic with period d. If no such integer exists (i.e., d = 1), the state is called aperiodic. Example: [Figure: a periodic state with period d and its transition matrix P.]

26 + Stationary Distribution 26 [Example: a two-state chain with transition matrix P; starting from π^(0) = (π_0^(0), π_1^(0)) = (0.5, 0.5), one step gives π^(1) = π^(0) P; the matrix entries and resulting values are shown on the slide.]

27 + Steady State Analysis 27 Recall that the probability of finding the MC at state i after the k-th step is given by π_i(k) = Pr(X_k = i), k = 0, 1, … An interesting question is what happens in the long run: π_i = lim_{k→∞} π_i(k). This is referred to as the steady state, equilibrium, or stationary state probability. Questions: Do these limits exist? If they exist, do they converge to a legitimate probability distribution, i.e., Σ_i π_i = 1? How do we evaluate π_j for all j?

28 + Steady State Analysis 28 THEOREM: In a finite irreducible aperiodic Markov chain a unique stationary state probability vector π exists such that π_j > 0 and lim_{k→∞} π_j(k) = π_j. The steady state distribution π is determined by solving π = πP and Σ_i π_i = 1.
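A small sketch (not from the slides) of solving π = πP together with Σ_i π_i = 1, here via the left eigenvector of P for eigenvalue 1, using the same hypothetical three-node matrix as above:

```python
import numpy as np

def stationary_distribution(P):
    """Return pi with pi = pi P and sum(pi) = 1, computed as the left
    eigenvector of P associated with the eigenvalue closest to 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)     # columns are left eigenvectors of P
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()                      # normalise so the entries sum to 1

P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])
pi = stationary_distribution(P)
print(pi)        # stationary vector
print(pi @ P)    # equals pi (up to rounding), i.e. pi = pi P
```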

29 + Sampling from a Markov Chain Parvathy 29

30 + Sampling from a Markov Chain 30 Sampling: create random samples from the desired probability distribution P. Construct a Markov chain whose steady state distribution π equals P. Run the Markov chain long enough; on convergence, its states are approximately distributed according to π, so random sampling of the Markov chain yields random samples of P. But! It is difficult to determine how long to run the Markov chain.

31 + Ising Model 31 Named after physicist Ernst Ising, invented by Wilhelm Lenz. A mathematical model of ferromagnetism. Consists of discrete variables representing magnetic dipole moments of atomic spins (+1 or −1).

32 + 1D Ising Model 32 Solved by Ernst Ising. N spins arranged in a ring. P(s) = e^{−βH(s)} / Z, where β is the inverse temperature, H(s) is the energy, and s = (x_1, …, x_N) is one possible configuration. Higher probability is placed on states with lower energy.

33 + 2D Ising Model 33 Spins are arranged in a lattice. Each spin interacts with 4 neighbors Simplest statistical models to show a phase transition

34 + 2D Ising Model (2) 34 G = (V, E); x_i ∈ {+1, −1} is the value of the spin at location i; S = {s_1, s_2, …, s_n} is the state space. H(s) = −Σ_{(x_i, x_j) ∈ E} s(x_i) s(x_j). Similar neighbouring spins subtract energy; opposite spins add energy.
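A short sketch of this energy function for a 2D grid of ±1 spins; the free (non-periodic) boundary condition is an assumption made here for simplicity:

```python
import numpy as np

def ising_energy(spins):
    """H(s) = -sum over nearest-neighbour pairs of s(x_i) * s(x_j),
    for a 2D array of +1/-1 spins with free boundaries."""
    horizontal = np.sum(spins[:, :-1] * spins[:, 1:])  # bonds to the right neighbour
    vertical = np.sum(spins[:-1, :] * spins[1:, :])    # bonds to the lower neighbour
    return -(horizontal + vertical)

rng = np.random.default_rng(0)
s = rng.choice([-1, 1], size=(4, 4))   # a random 4x4 configuration
print(ising_energy(s))                 # aligned neighbours lower the energy
```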

35 + 2D Ising Model (3) 35 P(s) = e^{−βH(s)} / Z = e^{β Σ_{(x_i, x_j) ∈ E} s(x_i) s(x_j)} / Z. If β = 0, all spin configurations have the same probability. If β > 0, lower energy is preferred.

36 + Exact Inference is Hard 36 Posterior distribution over s: if Y is any observed variable, P(s | Y) = p(s, Y) / p(Y). Intractable computations: the joint probability distribution (2^N possible combinations of s), the marginal probability distribution at a site, and MAP estimation.

37 + The Big Question 37 Given a distribution π on S = {s_1, s_2, …, s_n}, simulate a random object with distribution π.

38 + Methods 38 Generate random samples to estimate a quantity; the samples are generated Markov-chain style. Markov Chain Monte Carlo (MCMC). Propp-Wilson simulation: a Las Vegas variant. Sandwiching: an improvement on Propp-Wilson.

39 + MARKOV CHAIN MONTE CARLO METHOD (MCMC) YAMILET R. SERRANO LL. 39

40 + Recall 40 Given a probability distribution π on S = {s_1, …, s_k}, how do we simulate a random object with distribution π?

41 + Intuition 41 Given a probability distribution π on S = {s_1, …, s_k}, how do we simulate a random object with distribution π? High-dimensional space.

42 + MCMC 1. Construct an irreducible and aperiodic Markov chain [X_0, X_1, …] whose stationary distribution is π. 2. If we run the chain with an arbitrary initial distribution, the Markov Chain Convergence Theorem guarantees that the distribution of the chain at time n converges to π. 3. Hence, if we run the chain for a sufficiently long time n, the distribution of X_n will be very close to π, so it can be used as a sample.

43 + MCMC (2) 43 Generally, there are two types of MCMC algorithms: Metropolis-Hastings and Gibbs sampling.

44 + Metropolis Hastings Algorithm 44 Original method: Metropolis, Rosenbluth, Rosenbluth, Teller and Teller (1953). Generalized by Hastings in 1970. Rediscovered by Tanner and Wong (1987) and Gelfand and Smith (1990). It is one way to implement MCMC. Nicholas Metropolis.

45 + Metropolis Hastings Algorithm 45 Basic idea. GIVEN: a probability distribution π on S = {s_1, …, s_k}. GOAL: approximately sample from π. Start with a proposal distribution Q(x, y); Q(x, y) specifies the transitions of the Markov chain and plays the role of the transition matrix, where x is the current state and y the new proposal. By accepting/rejecting the proposals, MH simulates a Markov chain whose stationary distribution is π.

46 + Metropolis Hastings Algorithm 46 Algorithm. GIVEN: a probability distribution π on S = {s_1, …, s_k}. GOAL: approximately sample from π. Given the current sample x, draw y from the proposal distribution Q(x, ·), draw U ~ Uniform(0, 1), and move to y if U ≤ α(x, y) (otherwise stay at x), where the acceptance probability is α(x, y) = min(1, π(y) Q(y, x) / (π(x) Q(x, y))).
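A generic sketch of this accept/reject loop (the target, proposal and state space below are illustrative placeholders, not taken from the slides):

```python
import random

def metropolis_hastings(pi, propose, q, x0, n_steps):
    """Metropolis-Hastings on a finite state space.
    pi(x): target probability (unnormalised is fine)
    propose(x): draw a proposal y from Q(x, .)
    q(x, y): proposal probability Q(x, y)"""
    x, samples = x0, []
    for _ in range(n_steps):
        y = propose(x)
        # alpha(x, y) = min(1, pi(y) Q(y, x) / (pi(x) Q(x, y)))
        alpha = min(1.0, pi(y) * q(y, x) / (pi(x) * q(x, y)))
        if random.random() < alpha:
            x = y                    # accept the proposal, otherwise keep x
        samples.append(x)
    return samples

# Toy target on {0, 1, 2}: pick one of the two other states uniformly as the proposal.
weights = {0: 1.0, 1: 2.0, 2: 7.0}
chain = metropolis_hastings(
    pi=lambda x: weights[x],
    propose=lambda x: random.choice([s for s in weights if s != x]),
    q=lambda x, y: 0.5,
    x0=0, n_steps=50_000)
print({s: round(chain.count(s) / len(chain), 3) for s in weights})  # ~ 0.1, 0.2, 0.7
```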

47 + Metropolis Hastings Algorithm 47 The Ising Model. Consider m sites around a circle. Each site i can have one of two spins, x_i ∈ {−1, 1}. The target distribution is the 1D Ising distribution from slide 32: π(x) ∝ exp(β Σ_i x_i x_{i+1}), with x_{m+1} = x_1.

48 + Metropolis Hastings Algorithm The Ising Model. Target distribution: as above. Proposal distribution: 1. randomly pick one out of the m spins; 2. flip its sign. Acceptance probability (say the i-th spin is flipped): since the proposal is symmetric, α = min(1, π(x′)/π(x)) = min(1, exp(−2β x_i (x_{i−1} + x_{i+1}))).
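A sketch of this single-spin-flip sampler for the ring, under the target assumed above (π(x) ∝ exp(β Σ_i x_i x_{i+1})); flipping spin i changes the exponent by −2β x_i (x_{i−1} + x_{i+1}), which is all the acceptance step needs:

```python
import math
import random

def ising_ring_mh(m, beta, n_steps, seed=0):
    """Metropolis-Hastings for m spins on a ring,
    target pi(x) proportional to exp(beta * sum_i x_i * x_{i+1})."""
    rng = random.Random(seed)
    x = [rng.choice([-1, 1]) for _ in range(m)]
    for _ in range(n_steps):
        i = rng.randrange(m)                         # pick a spin uniformly
        left, right = x[i - 1], x[(i + 1) % m]       # ring neighbours
        delta = -2.0 * beta * x[i] * (left + right)  # log pi(x') - log pi(x)
        if rng.random() < min(1.0, math.exp(delta)):
            x[i] = -x[i]                             # accept the flip
    return x

print(ising_ring_mh(m=20, beta=0.5, n_steps=100_000))
```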

49 + Metropolis Hastings Algorithm 49 The Ising Model. Image taken from "Monte Carlo Investigation of the Ising Model" (2006), Tobin Fricke.

50 + Disadvantages of MCMC 50 No matter how large n is taken to be in the MCMC algorithm, there will still be some discrepancy between the distribution of the output and the target distribution π. In order to make this error small, we need to figure out how large n needs to be.

51 + Bounds on Convergence of MCMC Aditya Kulkarni 51

52 + Seminar on Probabilities 52 (Strasbourg) If it is difficult to obtain asymptotic bounds on the convergence time of MCMC algorithms for Ising models, use quantitative bounds. Use characteristics of the Ising model MCMC algorithm to obtain quantitative bounds. David Aldous.

53 + What we need to ensure 53 As we go along in time, we approach the stationary distribution in a monotonically decreasing fashion: d(t) → 0 as t → ∞. When we stop, the sample should follow a distribution that is no further from the stationary distribution than some factor ε: d(t) ≤ ε.

54 + Total Variation Distance TV 54 Given two distributions p and q over a finite set of states S, the total variation distance is ||p − q||_TV = (1/2) ||p − q||_1 = (1/2) Σ_{x ∈ S} |p(x) − q(x)|.

55 + Convergence Time 55 Let X = X_0, X_1, … be a Markov chain on a state space with stationary distribution π. We define d(t) as the worst total variation distance at time t: d(t) = max_{x ∈ V} ||P(X_t = · | X_0 = x) − π||_TV. The mixing time τ(ε) is the minimum time t such that d(t) is at most ε: τ(ε) = min{t : d(t) ≤ ε}. Define τ = τ(1/(2e)) = min{t : d(t) ≤ 1/(2e)}.
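A sketch (on an illustrative chain, not from the slides) that computes d(t) for every starting state and reads off τ(ε):

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance: 0.5 * sum |p(x) - q(x)|."""
    return 0.5 * np.abs(p - q).sum()

def worst_case_d(P, pi, t_max):
    """d(t) = max over starting states x of ||P^t(x, .) - pi||_TV for t = 0..t_max."""
    n = P.shape[0]
    dist = np.eye(n)             # row x = distribution of X_t given X_0 = x
    d = []
    for _ in range(t_max + 1):
        d.append(max(tv_distance(dist[x], pi) for x in range(n)))
        dist = dist @ P
    return np.array(d)

def mixing_time(d, eps):
    """tau(eps) = min { t : d(t) <= eps } (assumes it is reached within t_max)."""
    return int(np.argmax(d <= eps))

P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])
pi = np.array([0.2, 0.4, 0.4])   # stationary distribution of this P
d = worst_case_d(P, pi, t_max=200)
print(mixing_time(d, eps=1 / (2 * np.e)), mixing_time(d, eps=0.01))  # tau and tau(0.01)
```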

56 + Quantitative results 56 [Figure: d(t) plotted against t, decreasing from 1; the levels 1/(2e) and ε on the vertical axis correspond to the times τ and τ(ε) on the horizontal axis.]

57 + Lemma 57 Consider two random walks started from state i and state j. Define ρ(t) as the worst total variation distance between their respective probability distributions at time t. a. ρ(t) ≤ 2 d(t). b. d(t) is decreasing.

58 + Upper bound on d(t) proof 58 From part (a) and the definition τ = τ(1/(2e)) = min{t : d(t) ≤ 1/(2e)} we get ρ(τ) ≤ 2 · 1/(2e) = e^{−1}. Also, from part (b) we deduce that an upper bound on ρ at a particular time t_0 gives an upper bound for later times: ρ(t) ≤ (ρ(t_0))^n for n t_0 ≤ t ≤ (n + 1) t_0, hence ρ(t) ≤ (ρ(t_0))^{t/t_0 − 1}. Substituting τ for t_0: ρ(t) ≤ (ρ(τ))^{t/τ − 1}, and since d(t) ≤ ρ(t), d(t) ≤ exp(1 − t/τ) for t ≥ 0.

59 + Upper bound on τ(ε) proof 59 Algebraic calculations: d(t) ≤ exp(1 − t/τ) gives log d(t) ≤ 1 − t/τ, so t/τ ≤ 1 − log d(t) = 1 + log(1/d(t)), i.e., t ≤ τ (1 + log(1/d(t))). Taking d(t) = ε gives τ(ε) ≤ τ (1 + log(1/ε)), and hence also τ(ε) ≤ 2e τ (1 + log(1/ε)).

60 + Upper bounds 60 d(t) ≤ min(1, exp(1 − t/τ)), t ≥ 0. τ(ε) ≤ τ (1 + log(1/ε)), 0 < ε < 1. τ(ε) ≤ 2e τ (1 + log(1/ε)), 0 < ε < 1.

61 + Entropy of initial distribution 61 Measure of randomness: ent(μ) = −Σ_{x ∈ V} μ(x) log μ(x), where x ranges over initial states and μ is the initial probability distribution.

62 + Few more lemmas 1. Let (X_t) be the random walk associated with μ and let μ_t be the distribution of X_t; then ent(μ_t) ≤ t · ent(μ). 2. If ν is a distribution on S such that ||ν − π|| ≤ ε, then ent(ν) ≥ (1 − ε) log|S|.

63 + Lower bound on d(t) proof 63 From lemma 2, applied with ν = μ_t and ε = d(t) (since ||μ_t − π|| ≤ d(t)): ent(μ_t) ≥ (1 − d(t)) log|S|, so 1 − d(t) ≤ ent(μ_t)/log|S|, i.e., d(t) ≥ 1 − ent(μ_t)/log|S|. Combining with lemma 1, ent(μ_t) ≤ t · ent(μ), gives d(t) ≥ 1 − t · ent(μ)/log|S|.

64 + Lower bound on τ(ε) proof 64 From lemma 1, ent(μ_t) ≤ t · ent(μ), so t ≥ ent(μ_t)/ent(μ). From lemma 2, at t = τ(ε) we have ent(μ_t) ≥ (1 − ε) log|S|, so τ(ε) ≥ (1 − ε) log|S| / ent(μ).

65 + Lower bounds 65 d(t) ≥ 1 − t · ent(μ)/log|S|. τ(ε) ≥ (1 − ε) log|S| / ent(μ).

66 + Propp Wilson Tobias Bertelsen 66

67 + An exact version of MCMC 67 Problems with MCMC: A. It has an accuracy error, which depends on the starting state. B. We must know the number of iterations. James Propp and David Wilson proposed Coupling From The Past (1996), a.k.a. the Propp-Wilson algorithm. Idea: solve these problems by running the chain infinitely.

68 + An exact version of MCMC 68 Theoretical: runs all configurations infinitely; literally takes infinite time; impossible. Coupling from the past: runs all configurations for a finite time; might take 1000s of years; infeasible. Sandwiching: runs a few configurations for a finite time; takes seconds; practicable.

69 + Theoretical exact sampling 69 Recall the convergence theorem: we approach the stationary distribution as the number of steps goes to infinity. Intuitive approach: to sample perfectly we start a chain and run it forever; start at t = 0, sample at t = ∞. Problem: we never get a sample. Alternative approach: to sample perfectly we take a chain that has already been running for an infinite amount of time; start at t = −∞, sample at t = 0.

70 + Theoretical independence of starting state 70 A sample from a Markov chain in MCMC depends solely on the starting state X_{−∞} and the sequence of random numbers U. We want to be independent of the starting state: for a given sequence of random numbers U_{−∞}, …, U_{−1} we want to ensure that the starting state X_{−∞} has no effect on X_0.

71 + Theoretical independence of starting state 71 Collisions: for a given U_{−∞}, …, U_{−1}, if two Markov chains are at the same state at some time t, they will continue on together.

72 + Coupling from the past 72 At some finite past time t = −N, all chains started in the infinite past (N = ∞) have already coupled into one. We want to continue that coupled chain to t = 0, but we don't know which state it will be in at t = −N, so we run chains from all states starting at −N instead of −∞.

73 + Coupling from the past 1. Let U = U_{−1}, U_{−2}, U_{−3}, … be the sequence of independent uniformly random numbers. 2. For N_j ∈ {1, 2, 4, 8, …}: (a) extend U to length N_j, keeping U_{−1}, …, U_{−N_{j−1}} the same; (b) start one chain from each state at t = −N_j; (c) for t from −N_j to zero, simulate the chains using U_t; (d) if all chains have converged to s_i at t = 0, return s_i; (e) else repeat the loop.
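A sketch of this loop for a small chain, assuming a deterministic update function step(state, u) that maps every state forward with the same random number u (the example chain reuses the update rule from slide 76):

```python
import random

def cftp(states, step, seed=0):
    """Coupling from the past: an exact sample from the stationary distribution
    of the chain given by the update function step(state, u)."""
    rng = random.Random(seed)
    u = []          # u[j] is the number used at time t = -(j+1); it is never regenerated
    n = 1
    while True:
        while len(u) < n:                    # extend U, keeping earlier values the same
            u.append(rng.random())
        current = {s: s for s in states}     # one chain per state, started at t = -n
        for t in range(n, 0, -1):            # simulate from t = -n up to t = 0
            current = {s: step(x, u[t - 1]) for s, x in current.items()}
        if len(set(current.values())) == 1:  # all chains have coalesced at time 0
            return next(iter(current.values()))
        n *= 2                               # otherwise restart further in the past

# The k-state chain of slide 76: go to state 1 with probability 1/2, else move up.
k = 5
step = lambda s, v: 1 if v < 0.5 else min(s + 1, k)
print(cftp(range(1, k + 1), step, seed=42))
```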

74 + Coupling from the past 74

75 + Questions 75 Why do we double the lengths? Worst case < 4 N_opt steps, where N_opt is the minimal N at which we can achieve convergence; compare to N ∈ {1, 2, 3, 4, 5, …}, which takes O(N_opt²) steps. Why do we have to use the same random numbers? Different samples might take longer or shorter to converge; we must evaluate the same sample in each iteration. The sample should depend only on U, not on the different N.

76 + Using the same U 76 We have k states with the following update function: φ(S_i, U) = S_1 if U < 1/2, S_{i+1} otherwise (for i < k); φ(S_k, U) = S_1 if U < 1/2, S_k otherwise. The stationary distribution is π(S_i) = 1/2^i for i < k and π(S_k) = 2/2^k.

77 + Using the same U 77 The probability of S_1 only depends on the last random number: P(S_1) = P(U_{−1} < 1/2) = 1/2. Let's assume we generate a new U for each run: U^1, U^2, U^3, …

n | U | P(S_1 in this run) | P(S_1) acc.
1 | [U^1_{−1}] | P(U^1_{−1} < 1/2) | 50 %
2 | [U^2_{−2}, U^2_{−1}] | P(U^2_{−1} < 1/2), given the first run did not return | 75 %
4 | [U^3_{−4}, …, U^3_{−1}] | P(U^3_{−1} < 1/2), given the earlier runs did not return | …
8 | [U^4_{−8}, …, U^4_{−1}] | P(U^4_{−1} < 1/2), given the earlier runs did not return | …

The accumulated probability of returning S_1 keeps growing past 1/2, so regenerating U biases the output.

78 + Using the same U 78 The probability of S_1 only depends on the last random number: P(S_1) = P(U_{−1} < 1/2) = 1/2. Let's instead use the same U for each run: U^1.

n | U | P(S_1 in this run) | P(S_1) acc.
1 | [U^1_{−1}] | P(U^1_{−1} < 1/2) | 50 %
2 | [U^1_{−2}, U^1_{−1}] | P(U^1_{−1} < 1/2) | 50 %
4 | [U^1_{−4}, …, U^1_{−1}] | P(U^1_{−1} < 1/2) | 50 %
8 | [U^1_{−8}, …, U^1_{−1}] | P(U^1_{−1} < 1/2) | 50 %

79 + Problem 79 In each step we update up to k chains, so the total execution time is O(N_opt · k). BUT k = 2^{L²} in the Ising model, which is worse than the naive approach O(k).

80 + Sandwiching Nirandika Wanigasekara 80

81 + Sandwiching 81 Propp-Wilson algorithm: with many vertices, running k chains will take too long; impractical for large k. Choose a relatively small subset of the state space to run chains from. Can we still get the same results? Try sandwiching.

82 + Sandwiching 82 Idea: find two chains bounding all other chains. If we have two such boundary chains, check whether those two chains converge; then all other chains have also converged.

83 + Sandwiching 83 To come up with the boundary chains we need a way to order the states: S_1 ≤ S_2 ≤ S_3 ≤ … A chain in a higher state does not cross a chain in a lower state. This results in a Markov chain obeying certain monotonicity properties.

84 + Sandwiching 84 Let's consider a fixed set of k states, state space S = {1, …, k}, and a transition matrix with P_{1,1} = P_{1,2} = 1/2, P_{k,k} = P_{k,k−1} = 1/2, and, for i = 2, …, k−1, P_{i,i−1} = P_{i,i+1} = 1/2. All other entries are 0.

85 + Sandwiching 85 What is this Markov chain doing? At each integer time it takes one step up or one step down the ladder, each with probability 1/2. At the top (state k) or the bottom (state 1) it stays where it is with probability 1/2. This is the ladder walk on k vertices. The stationary distribution π of this Markov chain is π_i = 1/k for i = 1, …, k.

86 + Sandwiching 86 Propp-Wilson for this Markov chain, with the valid update function φ: if u < 1/2 then step down, if u ≥ 1/2 then step up; negative starting times N_1, N_2, … = −1, −2, −4, −8, …; number of states k = 5.

87 + Sandwiching 87

88 + Sandwiching 88 The update function preserves the ordering between states: for all U ∈ [0, 1] and all i, j ∈ {1, …, k} such that i ≤ j, we have φ(i, U) ≤ φ(j, U). It is therefore sufficient to run only 2 chains rather than k.
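A sketch of the sandwiched (monotone) version for the ladder walk of slides 84-86: since φ preserves the ordering, only the bottom chain (state 1) and the top chain (state k) are simulated, and all other chains are squeezed in between:

```python
import random

def ladder_step(state, u, k):
    """Ladder-walk update: u < 1/2 steps down, u >= 1/2 steps up (clamped to 1..k)."""
    return max(1, state - 1) if u < 0.5 else min(k, state + 1)

def monotone_cftp(k, seed=0):
    """Sandwiched Propp-Wilson: run only the chains started from 1 and k;
    when they meet at time 0, every chain in between has met as well."""
    rng = random.Random(seed)
    u, n = [], 1                             # u[j] is used at time -(j+1)
    while True:
        while len(u) < n:
            u.append(rng.random())
        low, high = 1, k                     # bottom and top chains at t = -n
        for t in range(n, 0, -1):
            low = ladder_step(low, u[t - 1], k)
            high = ladder_step(high, u[t - 1], k)
        if low == high:                      # the sandwich has closed: exact sample
            return low
        n *= 2

counts = [0] * 5
for s in range(10_000):
    counts[monotone_cftp(k=5, seed=s) - 1] += 1
print(counts)   # roughly uniform, matching pi_i = 1/k
```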

89 + Sandwiching 89 Are these conditions always met? No, not always. But there are frequent instances where they are met, and the technique is especially useful when k is large. The Ising model is a good example of this.

90 + Ising Model Malay Singh 90

91 + 2D Ising Model 91 A grid with sites numbered from 1 to L², where L is the grid size. Each site i can have spin x_i ∈ {−1, +1}. V is the set of sites and S = {−1, 1}^V defines all possible configurations (states). The magnetisation m of a state s is m(s) = (Σ_i s(x_i)) / L². The energy of a state s is H(s) = −Σ_{(i,j) ∈ E} s(x_i) s(x_j).

92 + Ordering of states 92 For two states s, s′ we say s ≤ s′ if s(x) ≤ s′(x) for all x ∈ V. Minimum (m = −1): s_min(x) = −1 for all x ∈ V. Maximum (m = +1): s_max(x) = +1 for all x ∈ V. Hence s_min ≤ s ≤ s_max for all s.

93 + The update function 93 We use the sequence of random numbers U_{−n}, U_{−n+1}, …, U_0. When updating the state X_t to X_{t+1} we choose a site x in {1, …, L²} uniformly, and set X_{t+1}(x) = +1 if U_{t+1} < exp(2β(K_+(x, s) − K_−(x, s))) / (exp(2β(K_+(x, s) − K_−(x, s))) + 1), and −1 otherwise, where K_+(x, s) and K_−(x, s) are the numbers of neighbours of x with spin +1 and −1 in the current state s.
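A sketch of this single-site update for one chain, with K_+ and K_- counted over the lattice neighbours (free boundaries are an assumption made here); in the sandwiched Propp-Wilson run the same site x and random number u would be applied to both the all-minus and the all-plus chain:

```python
import math
import numpy as np

def heat_bath_update(spins, site, u, beta):
    """Set the spin at `site` to +1 if u < exp(2*beta*(K+ - K-)) / (exp(2*beta*(K+ - K-)) + 1),
    and to -1 otherwise, where K+/K- count the +1/-1 neighbours of the site."""
    L = spins.shape[0]
    i, j = site
    neighbours = []
    if i > 0: neighbours.append(spins[i - 1, j])
    if i < L - 1: neighbours.append(spins[i + 1, j])
    if j > 0: neighbours.append(spins[i, j - 1])
    if j < L - 1: neighbours.append(spins[i, j + 1])
    k_plus = sum(1 for s in neighbours if s == 1)
    k_minus = len(neighbours) - k_plus
    w = math.exp(2 * beta * (k_plus - k_minus))
    spins[i, j] = 1 if u < w / (w + 1) else -1
    return spins

rng = np.random.default_rng(1)
L, beta = 4, 0.4
spins = rng.choice([-1, 1], size=(L, L))
for _ in range(1000):
    site = (int(rng.integers(0, L)), int(rng.integers(0, L)))  # choose a site uniformly
    spins = heat_bath_update(spins, site, rng.random(), beta)
print(spins)
```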

94 + Maintaining ordering 94 Ordering after the update (from X_t to X_{t+1}): we choose the same site x to update in both chains; the spin of x at X_{t+1} depends on the update function. For states s ≤ s′ we want to check that exp(2β(K_+(x, s) − K_−(x, s))) / (exp(2β(K_+(x, s) − K_−(x, s))) + 1) ≤ exp(2β(K_+(x, s′) − K_−(x, s′))) / (exp(2β(K_+(x, s′) − K_−(x, s′))) + 1). Since z ↦ e^z / (e^z + 1) is increasing, this is equivalent to checking K_+(x, s) − K_−(x, s) ≤ K_+(x, s′) − K_−(x, s′).

95 + Maintaining ordering 95 As s ≤ s′ we have K_+(x, s) ≤ K_+(x, s′) and K_−(x, s) ≥ K_−(x, s′). Subtracting the second inequality from the first gives K_+(x, s) − K_−(x, s) ≤ K_+(x, s′) − K_−(x, s′).

96 + Ising Model L = 4 96 T = 3.5 and N = 512 T = 4.8 and N = 128

97 + Ising Model L = 8 97 T = 5.9 and N = 512

98 + Ising Model L = … 98 T = 5.3 and N = …

99 + Summary 99 Exact sampling. Markov chain Monte Carlo: but when do we converge? Propp-Wilson with sandwiching to the rescue.

100 + Questions? 100

101 + Additional 101 Example 6.2

102 + Additional 102 Update function

103 + Additional 103 Proof for ordering between states?? Need to find
