Markov Chain Monte Carlo for Machine Learning. Advanced Topics in Machine Learning, California Institute of Technology, April 20th.


1 Markov Chain Monte Carlo for Machine Learning. Advanced Topics in Machine Learning, California Institute of Technology. April 20th.

2 Table of Contents

3 History of Monte Carlo methods. Enrico Fermi used to calculate incredibly accurate predictions using statistical sampling methods when he had insomnia, in order to impress his friends. Stan Ulam used sampling to approximate the probability that a solitaire layout would be successful after the combinatorics proved too difficult. Both of these examples get at the heart of Monte Carlo methods: selecting a statistical sample to approximate a hard combinatorial problem by a much simpler problem. These ideas have been extended to fields across science, spurred on by the development of computers.

7 Table of Contents

8 Bayesian Inference and Monte Carlo methods. We often need to solve the following intractable problems in Bayesian statistics, given unknown variables $x \in \mathcal{X}$ and data $y \in \mathcal{Y}$:
Normalization: computing the normalizing factor $\int_{\mathcal{X}} p(y \mid x')\,p(x')\,dx'$ in Bayes' theorem in order to compute the posterior $p(x \mid y)$ given $p(x)$ and $p(y \mid x)$.
Marginalization: computing the marginal posterior $p(x \mid y) = \int_{\mathcal{Z}} p(x, z \mid y)\,dz$ given the joint posterior of $(x, z) \in \mathcal{X} \times \mathcal{Z}$.
Expectation: obtaining the expectation $E_{p(x \mid y)}(f(x)) = \int_{\mathcal{X}} f(x)\,p(x \mid y)\,dx$ for some function $f : \mathcal{X} \to \mathbb{R}^{n_f}$, such as the conditional mean or the conditional covariance.

11 Other problems that can be approximated with Monte Carlo methods.
Statistical mechanics: computing the partition function of a system can involve a large sum over possible configurations.
Optimization: finding the optimal solution over a large feasible set.
Penalized likelihood model selection: avoiding wasting computing resources on computing maximum likelihood estimates over a large initial set of models.

12 Table of Contents

13 Introduction to Monte Carlo methods. We first draw $N$ i.i.d. samples $\{x^{(i)}\}_{i=1}^{N}$ from a target density $p(x)$ on a high-dimensional space $\mathcal{X}$. We can use these $N$ samples to estimate the target density $p(x)$ using a point-mass function: $p_N(x) = \frac{1}{N}\sum_{i=1}^{N} \delta_{x^{(i)}}(x)$ (1), where $\delta_{x^{(i)}}(x)$ is the Dirac delta mass located at $x^{(i)}$.

14 Introduction to Monte Carlo methods. This allows us to approximate integrals or large sums $I(f)$ with tractable, smaller sums $I_N(f)$. These converge by the strong law of large numbers: $I_N(f) = \frac{1}{N}\sum_{i=1}^{N} f(x^{(i)}) \xrightarrow{\text{a.s.}} I(f) = \int_{\mathcal{X}} f(x)\,p(x)\,dx$ as $N \to \infty$ (2).

15 Introduction to Monte Carlo methods. Note that $I_N(f)$ is unbiased. If the variance of $f(x)$ satisfies $\sigma_f^2 = E_{p(x)}(f^2(x)) - I^2(f) < \infty$, then $\mathrm{var}(I_N(f)) = \sigma_f^2 / N$. Bounded variance of $f$ also guarantees convergence in distribution of the error via the central limit theorem: $\sqrt{N}\,(I_N(f) - I(f)) \Rightarrow \mathcal{N}(0, \sigma_f^2)$ as $N \to \infty$ (3).
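As a concrete illustration (not from the original slides), here is a minimal Monte Carlo sketch in Python; the choice of target $p = \mathcal{N}(0, 1)$ and test function $f(x) = x^2$, so that $I(f) = 1$, is an assumption for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_estimate(f, sampler, n):
    """Monte Carlo estimator I_N(f) = (1/N) sum_i f(x^(i)) with x^(i) ~ p."""
    x = sampler(n)
    return f(x).mean()

# Assumed example: p = N(0, 1) and f(x) = x^2, so the true value is I(f) = E[x^2] = 1.
f = lambda x: x**2
sampler = lambda n: rng.standard_normal(n)

for n in [100, 10_000, 1_000_000]:
    print(n, mc_estimate(f, sampler, n))  # error shrinks like O(1/sqrt(N)), as the CLT predicts
```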

16 Inverse Transform Sampling. How do we sample from easy-to-sample distributions? Suppose we want to sample from a distribution $f$ with CDF $F$. Sample $u \sim \text{Uniform}[0, 1]$ and compute $x = F^{-1}(u)$ (this gives us the largest $x$ such that $P(-\infty < f < x) \le u$). Then $x$ will be distributed according to $f$.

17 Inverse Transform Sampling. Example: $\mathcal{N}(0, 1)$. But for more complicated distributions, we may not have an explicit formula for $F^{-1}(u)$. In these cases we can consider other sampling methods.
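A minimal sketch of inverse-transform sampling for a case where $F^{-1}$ is available in closed form; the Exponential($\lambda$) example is my choice, not from the slides. Here $F(x) = 1 - e^{-\lambda x}$, so $F^{-1}(u) = -\ln(1-u)/\lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_exponential(lam, n):
    """Inverse-transform sampling: x = F^{-1}(u) with u ~ Uniform[0, 1].
    For Exponential(lam): F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam."""
    u = rng.uniform(0.0, 1.0, size=n)
    return -np.log(1.0 - u) / lam

x = sample_exponential(lam=2.0, n=100_000)
print(x.mean())  # should be close to the true mean 1/lam = 0.5
```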

18 Rejection Sampling. Main idea: it can be difficult to sample from complicated distributions. Instead, we sample from a simple distribution that contains our complicated distribution (an upper bound), and reject any sample that is not within the complicated distribution. If we take enough samples, this gives us a good approximation by the Monte Carlo principle.

19 Rejection Sampling: sampling from a density. [Figure] This is how we sample if we can compute the inverse CDF.

20 Rejection Sampling: sampling from a density. [Figure] And this is a good way to approximate if you can't.

22 Rejection Sampling: drawbacks. Consider approximating the volume of the unit sphere by sampling uniformly from the enclosing cube and rejecting any sample not inside the sphere. In low dimensions this is fine, but as the number of dimensions gets large, the volume of the unit sphere goes to zero whereas the volume of the cube stays fixed, resulting in an intractable number of rejected samples. Rejection sampling is impractical in high dimensions!
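A small experiment illustrating this collapse (a sketch of mine, using the unit sphere inscribed in the cube $[-1, 1]^d$ as a variant of the slide's example): the acceptance rate is the ratio of the sphere's volume to the cube's, which vanishes as $d$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def acceptance_rate(dim, n=100_000):
    """Fraction of uniform samples from the cube [-1, 1]^dim that land
    inside the inscribed unit sphere, i.e. rejection sampling's acceptance rate."""
    x = rng.uniform(-1.0, 1.0, size=(n, dim))
    return np.mean(np.sum(x**2, axis=1) <= 1.0)

for d in [2, 5, 10, 20]:
    print(d, acceptance_rate(d))  # ~0.785 for d=2, ~0.0025 for d=10, essentially 0 by d=20
```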

23 Importance Sampling. Importance sampling is used to estimate properties of a particular distribution of interest: essentially, we transform a difficult integral into an expectation over a simpler proposal distribution. It relies on drawing samples from a proposal distribution in order to develop the approximation, and convergence time depends heavily on an appropriate choice of proposal distribution.

24 Importance Sampling: visual example. [Figure]

25 Importance Sampling. If we choose a proposal distribution $q(x)$ such that $\mathrm{supp}(p(x)) \subseteq \mathrm{supp}(q(x))$, then we can write $I(f) = \int f(x)\,w(x)\,q(x)\,dx$ (4), where $w(x) = \frac{p(x)}{q(x)}$ is known as the importance weight.

26 Importance Sampling. If we can simulate $N$ i.i.d. samples $\{x^{(i)}\}_{i=1}^{N}$ from $q(x)$ and evaluate $w(x^{(i)})$, then we can approximate $I(f)$ as $I_N(f) = \frac{1}{N}\sum_{i=1}^{N} f(x^{(i)})\,w(x^{(i)})$ (5), which converges to $I(f)$ by the strong law of large numbers.

27 Importance Sampling. Choosing an appropriate $q(x)$ is very important. In importance sampling we can choose an optimal $q$ by finding one that minimizes the variance of $I_N(f)$. By sampling from areas of high importance we can achieve very high sampling efficiency. It can be hard to find a candidate $q$ in high dimensions; one strategy is adaptive importance sampling, which parametrizes $q$.
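A minimal importance-sampling sketch for the estimator in (5); the concrete $p$, $q$, and $f$ below are illustrative assumptions, not from the slides:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed example: target p = N(0, 1), proposal q = N(0, 2^2), so supp(p) is inside supp(q).
# Estimate I(f) = E_p[f(x)] for f(x) = x^2 (true value 1).
n = 100_000
x = rng.normal(0.0, 2.0, size=n)                               # x^(i) ~ q
w = stats.norm.pdf(x, 0.0, 1.0) / stats.norm.pdf(x, 0.0, 2.0)  # importance weights w = p/q
print(np.mean(x**2 * w))                                       # I_N(f) = (1/N) sum f(x^(i)) w(x^(i)), ~1
```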

28 Table of Contents

29 Markov Chain Monte Carlo. Main idea: given a target distribution $p$, construct a Markov chain over the sample space $\mathcal{X}$ with invariant distribution equal to $p$. Perform a random walk and collect a sample at each iteration. Use the sample histogram to approximate the target distribution $p$. This is called the Metropolis-Hastings algorithm. Two important things to note for MCMC: we assume we can evaluate $p(x)$ for any $x \in \mathcal{X}$, and we only need to know $p(x)$ up to a normalizing constant.

31 Review: Markov chains. Suppose a stochastic process $x^{(i)}$ can only take $n$ values in $\mathcal{X} = \{x_1, x_2, \ldots, x_n\}$. Then $x^{(i)}$ is a Markov chain if $\Pr(x^{(i)} \mid x^{(i-1)}, \ldots, x^{(1)}) = T(x^{(i)} \mid x^{(i-1)})$. The transition matrix $T$ is $n \times n$ with row sums equal to 1 (a stochastic matrix), where $T_{ij} = T(x_j \mid x_i)$ is the transition probability from state $x_i$ to state $x_j$. A row vector $\pi$ is a stationary distribution if $\pi T = \pi$. A row vector $\pi$ is a (unique) invariant/limiting distribution if $\lim_{n \to \infty} \mu T^n = \pi$ for all initial distributions $\mu$. An invariant distribution is also a stationary distribution.
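A quick numerical check of these definitions (the 3-state transition matrix below is an arbitrary example of mine): iterating $\mu \mapsto \mu T$ from any initial distribution converges to the stationary $\pi$ with $\pi T = \pi$, since this particular chain is irreducible and aperiodic.

```python
import numpy as np

# An arbitrary 3x3 stochastic matrix (each row sums to 1, all entries positive).
T = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])

mu = np.array([1.0, 0.0, 0.0])    # start deterministically in state 1
for _ in range(100):
    mu = mu @ T                   # one step of the chain: mu <- mu T
print(mu)                         # the limiting (invariant) distribution pi

print(np.abs(mu @ T - mu).max())  # stationarity check: pi T = pi, so this is ~0
```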

32 Review: Markov chains. Theorem: a Markov chain has an invariant distribution if it is:
Irreducible: there is a positive probability to get from any state to all other states.
Aperiodic: the gcd of the lengths of all paths from a state to itself is 1, i.e. for all states $x_i$: $\gcd\{n > 0 : \Pr(x^{(n)} = x_i \mid x^{(0)} = x_i) > 0\} = 1$.
If we construct an irreducible and aperiodic Markov chain $T$ such that $pT = p$ (i.e. $p$ is a stationary distribution), then $p$ is necessarily the invariant distribution. This is the key behind the Metropolis-Hastings algorithm.

34 Reversibility / Detailed Balance Condition. How can we ensure that $p$ is a stationary distribution of $T$? Check whether $T$ satisfies the reversibility / detailed balance condition: $p(x^{(i+1)})\,T(x^{(i)} \mid x^{(i+1)}) = p(x^{(i)})\,T(x^{(i+1)} \mid x^{(i)})$ (6). Summing over $x^{(i)}$ on both sides gives us $p(x^{(i+1)}) = \sum_{x^{(i)}} p(x^{(i)})\,T(x^{(i+1)} \mid x^{(i)})$ (7), which is equivalent to $pT = p$.

36 Discrete vs. Continuous Markov chains. The transition matrix $T$ becomes an integral/transition kernel $K$. The detailed balance condition is the same: $p(x^{(i+1)})\,K(x^{(i)} \mid x^{(i+1)}) = p(x^{(i)})\,K(x^{(i+1)} \mid x^{(i)})$ (8). Integral instead of summation: $p(x^{(i+1)}) = \int_{\mathcal{X}} p(x^{(i)})\,K(x^{(i+1)} \mid x^{(i)})\,dx^{(i)}$ (9).

37 Metropolis-Hastings Algorithm.
Initialize $x^{(0)}$. For $i$ from 0 to $N - 1$:
Sample $u \sim \text{Uniform}[0, 1]$.
Sample $x^* \sim q(x^* \mid x^{(i)})$.
If $u < A(x^{(i)}, x^*) = \min\left\{1, \frac{p(x^*)\,q(x^{(i)} \mid x^*)}{p(x^{(i)})\,q(x^* \mid x^{(i)})}\right\}$: set $x^{(i+1)} = x^*$; else set $x^{(i+1)} = x^{(i)}$.
$q$ is called the proposal distribution and should be easy to sample. $A(x^{(i)}, x^*)$ is called the acceptance probability of moving from $x^{(i)}$ to $x^*$.


39 Example of the MH algorithm. Target: $p(x) \propto 0.1\,\exp(-0.2x^2) + \exp(-0.2(x - 10)^2)$, a bimodal distribution. Proposal: $q(x^* \mid x^{(i)}) = \mathcal{N}(x^{(i)}, 100)$.
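A minimal sketch of the MH algorithm on this target (the unnormalized density is as reconstructed above, so the exact coefficients are uncertain). Because the Gaussian proposal is symmetric, the $q$-ratio cancels in the acceptance probability:

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unnorm(x):
    """Unnormalized bimodal target, as reconstructed from the slide."""
    return 0.1 * np.exp(-0.2 * x**2) + np.exp(-0.2 * (x - 10.0)**2)

def metropolis_hastings(n_samples, x0=0.0, proposal_std=10.0):
    x = x0
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_star = rng.normal(x, proposal_std)  # q(x* | x^(i)) = N(x^(i), 100)
        # Symmetric proposal, so A = min(1, p(x*)/p(x^(i))); testing u < ratio covers both cases.
        if rng.uniform() < p_unnorm(x_star) / p_unnorm(x):
            x = x_star
        samples[i] = x
    return samples

s = metropolis_hastings(50_000)
print(s.min(), s.max())  # the histogram of s visits both modes (near 0 and near 10)
```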

40 Verify: detailed balance condition. The transition kernel for the MH algorithm is $K_{\mathrm{MH}}(x^{(i+1)} \mid x^{(i)}) = q(x^{(i+1)} \mid x^{(i)})\,A(x^{(i)}, x^{(i+1)}) + \delta_{x^{(i)}}(x^{(i+1)})\,r(x^{(i)})$ (10), where $r(x^{(i)}) = \int_{\mathcal{X}} q(x^* \mid x^{(i)})\,(1 - A(x^{(i)}, x^*))\,dx^*$ is the rejection term. We can check that $K_{\mathrm{MH}}$ satisfies the detailed balance condition: $p(x^{(i+1)})\,K_{\mathrm{MH}}(x^{(i)} \mid x^{(i+1)}) = p(x^{(i)})\,K_{\mathrm{MH}}(x^{(i+1)} \mid x^{(i)})$ (11).
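Filling in the check (a standard one-line argument, added here for completeness): for $x^{(i+1)} \neq x^{(i)}$ only the $q \cdot A$ term of (10) contributes, and

$$p(x^{(i)})\,q(x^{(i+1)} \mid x^{(i)})\,A(x^{(i)}, x^{(i+1)}) = \min\left\{\, p(x^{(i)})\,q(x^{(i+1)} \mid x^{(i)}),\; p(x^{(i+1)})\,q(x^{(i)} \mid x^{(i+1)}) \,\right\},$$

which is symmetric in $x^{(i)}$ and $x^{(i+1)}$, so (11) holds; when $x^{(i+1)} = x^{(i)}$, both sides of (11) are trivially equal.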

42 Verify: irreducibility and aperiodicity. Irreducibility holds as long as the support of $q$ includes the support of $p$. Aperiodicity holds by construction: by allowing for rejection, we are creating self-loops in our Markov chain.

43 Disadvantages of the MH algorithm: samples are not independent. Nearby samples are correlated. We can use thinning by keeping only every $d$-th sample, but this is often unnecessary: just take more samples. Theoretically, the sample histogram is an unbiased estimator for $p$.

44 Disadvantages of the MH algorithm: initial samples are biased. The starting state is not chosen according to $p$, meaning the random walk might start in a low-density region of $p$. We can employ a burn-in period, where the first few samples are discarded. How many samples to discard depends on the mixing time of the Markov chain.

45 Convergence of the random walk. How many time steps do we need to converge to $p$? Theoretically, infinitely many, but being close is often a good enough approximation.
Total variation norm: finite case $\delta(q, p) = \frac{1}{2}\sum_{x \in \mathcal{X}} |q(x) - p(x)|$; general case $\delta(q, p) = \sup_{A \subseteq \mathcal{X}} |q(A) - p(A)|$.
$\epsilon$-mixing time: $\tau_x(\epsilon) = \min\{t : \delta(K_x^{(t')}, p) \le \epsilon \ \forall\, t' \ge t\}$.
Mixing times of a Markov chain can be bounded by its second eigenvalue and its conductance (a measure of connectivity). CS139 Advanced covers this in great detail.

48 Other algorithms. Many algorithms are just special cases of the MH algorithm.
Independent sampler: assume $q(x^* \mid x^{(i)}) = q(x^*)$, independent of the current state; then $A(x^{(i)}, x^*) = \min\left\{1, \frac{p(x^*)\,q(x^{(i)})}{p(x^{(i)})\,q(x^*)}\right\} = \min\left\{1, \frac{w(x^*)}{w(x^{(i)})}\right\}$.
Metropolis algorithm: assume $q(x^* \mid x^{(i)}) = q(x^{(i)} \mid x^*)$, a symmetric random walk; then $A(x^{(i)}, x^*) = \min\left\{1, \frac{p(x^*)}{p(x^{(i)})}\right\}$.

49 Other algorithms: simulated annealing, Monte Carlo EM, hybrid Monte Carlo, slice sampler, reversible jump MCMC, Gibbs sampler.

50 Gibbs Sampler. Assume each $x$ is an $n$-dimensional vector and we know the full conditional probabilities $p(x_j \mid x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n) = p(x_j \mid x_{-j})$. Furthermore, assume the full conditionals are easy to sample. Then for each $j$ we can choose the proposal distribution $q(x^* \mid x^{(i)}) = p(x^*_j \mid x^{(i)}_{-j})$ if $x^*_{-j} = x^{(i)}_{-j}$ (meaning all other coordinates match), and 0 otherwise, with acceptance probability $A(x^{(i)}, x^*) = \min\left\{1, \frac{p(x^*)\,p(x^{(i)}_j \mid x^{(i)}_{-j})}{p(x^{(i)})\,p(x^*_j \mid x^{(i)}_{-j})}\right\} = \min\left\{1, \frac{p(x^*_{-j})}{p(x^{(i)}_{-j})}\right\} = 1$.

51 Gibbs Sampler. At each iteration, we change one coordinate of our current state; we can choose which coordinate to change randomly, or go in order. For directed graphical models (aka Bayesian networks), we can write the full conditional probability of $x_j$ in terms of its Markov blanket: $p(x_j \mid x_{-j}) \propto p(x_j \mid x_{\mathrm{parents}(j)}) \prod_{k \in \mathrm{children}(j)} p(x_k \mid x_{\mathrm{parents}(k)})$ (12). Recall: the Markov blanket $\mathrm{MB}(x_j)$ of $x_j$ is the minimal set of nodes such that $x_j$ is independent from the rest of the graph if $\mathrm{MB}(x_j)$ is observed. For directed graphical models, $\mathrm{MB}(x_j) = \{\mathrm{parents}(x_j), \mathrm{children}(x_j), \mathrm{parents}(\mathrm{children}(x_j))\}$.
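A minimal Gibbs sampling sketch for a case where the full conditionals are available in closed form: a bivariate Gaussian with correlation $\rho$ (the example is an assumption of mine, not from the slides). Each conditional $p(x_j \mid x_{-j})$ is itself a one-dimensional Gaussian, and, as derived above, every proposal is accepted ($A = 1$).

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_bivariate_normal(n_samples, rho=0.8):
    """Gibbs sampling from N(0, [[1, rho], [rho, 1]]).
    Full conditionals: x1 | x2 ~ N(rho * x2, 1 - rho^2), and symmetrically for x2 | x1."""
    x1, x2 = 0.0, 0.0
    sd = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_samples, 2))
    for i in range(n_samples):
        x1 = rng.normal(rho * x2, sd)  # resample coordinate 1 from its full conditional
        x2 = rng.normal(rho * x1, sd)  # resample coordinate 2 from its full conditional
        samples[i] = (x1, x2)
    return samples

s = gibbs_bivariate_normal(50_000)
print(np.corrcoef(s.T))  # off-diagonal entries should be close to rho = 0.8
```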

53 Table of Contents

54 Sequential MC and Particle Filters. Sequential MC methods provide online approximations of probability distributions for inherently sequential data.
Assumptions: an initial distribution $p(x_0)$, a dynamic model $p(x_t \mid x_{0:t-1}, y_{1:t-1})$ for $t \ge 1$, and a measurement model $p(y_t \mid x_{0:t}, y_{1:t-1})$ for $t \ge 1$.
Aim: estimate the posterior $p(x_{0:t} \mid y_{1:t})$, including the filtering distribution $p(x_t \mid y_{1:t})$, and sample $N$ particles $\{x^{(i)}_{0:t}\}_{i=1}^{N}$ from that distribution.
Method: generate a weighted set of paths $\{\tilde{x}^{(i)}_{0:t}\}_{i=1}^{N}$, using an importance sampling framework with an appropriate proposal distribution $q(\tilde{x}_{0:t} \mid y_{1:t})$.

58 Sequential MC and Particle Filters. Here, $q(\tilde{x}_{0:t} \mid y_{1:t}) = p(x_{0:t-1} \mid y_{1:t-1})\,q(\tilde{x}_t \mid \tilde{x}_{0:t-1}, y_{1:t})$.
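A minimal bootstrap particle filter sketch (the common special case where the proposal is the dynamic model itself, so each incremental weight reduces to the measurement likelihood); the linear-Gaussian model here is a toy assumption, not from the slides:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bootstrap_particle_filter(y, n_particles=1000):
    """Assumed toy model: x_t = 0.9 x_{t-1} + N(0, 1), y_t = x_t + N(0, 1), x_0 ~ N(0, 1).
    Proposal q = dynamic model, so each particle's weight is p(y_t | x_t)."""
    x = rng.standard_normal(n_particles)                        # particles from p(x_0)
    means = []
    for yt in y:
        x = 0.9 * x + rng.standard_normal(n_particles)          # propagate: sample p(x_t | x_{t-1})
        w = stats.norm.pdf(yt, loc=x, scale=1.0)                # weight: p(y_t | x_t)
        w /= w.sum()
        x = x[rng.choice(n_particles, size=n_particles, p=w)]   # resample by weight
        means.append(x.mean())                                  # estimate of E[x_t | y_{1:t}]
    return np.array(means)

y_obs = np.array([0.5, 1.0, 0.2, -0.3, 0.1])
print(bootstrap_particle_filter(y_obs))
```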

59 Vision Application: Tracking Multiple Interacting Targets with an MCMC-based Particle Filter. Aim: generate a sequence of states $X_{it}$ for the $i$-th ant at frame $t$. Idea: use an independent particle filter for each ant.

60 Vision Application. Particle filter assumptions. Motion model: normally distributed, centered at the location in the previous frame, $X_{t-1}$. Measurement model: image-comparison function ({how ant-like} minus {how background-like}).

61 Vision Application: independent particle filters. $P(X_{it} \mid Z^{t-1})$ is represented by the weighted particle set $\{X^{(r)}_{i,t-1}, \pi^{(r)}_{i,t-1}\}_{r=1}^{N}$. Proposal distribution: $q(X_{it}) = \sum_r \pi^{(r)}_{i,t-1}\,P(X_{i,t} \mid X^{(r)}_{i,t-1})$, where $\pi^{(r)}_{i,t-1} = P(Z_{i,t-1} \mid X^{(r)}_{i,t-1})$.

64 Vision Application: Markov Random Field (MRF) model and the joint MRF particle filter. Modify the motion model to include pairwise interactions: $P(X_t \mid X_{t-1}) \propto \prod_i P(X_{it} \mid X_{i,t-1}) \prod_{ij \in E} \psi(X_{it}, X_{jt})$. Proposal distribution (same as before!): $q(X_t) = \sum_r \pi^{(r)}_{t-1} \prod_i P(X_{i,t} \mid X^{(r)}_{i,t-1})$, but now $\pi^{(r)}_{t-1} = \prod_{i=1}^{n} P(Z_{i,t-1} \mid X^{(r)}_{i,t-1}) \prod_{ij \in E} \psi(X^{(r)}_{i,t-1}, X^{(r)}_{j,t-1})$.

65 Vision Application: MCMC-based Particle Filter. Because of the pairwise interactions, the joint MRF particle filter scales badly with the number of targets (ants). It is an intractable instance of importance sampling... which smells like a good time to use an MCMC algorithm!

66 Vision Application: Tracking Multiple Interacting Targets with an MCMC-based Particle Filter. Use Metropolis-Hastings instead of importance sampling. Proposal density: simply the motion model.

67 References: intromontecarlomachinelearning.pdf; nips-2015-monte-carlo-tutorial.pdf
