Sampling from complex probability distributions


Sampling from complex probability distributions
Louis J. M. Aslett
Department of Mathematical Sciences, Durham University
UTOPIAE Training School II, 4 July

Motivation

Sampling from probability distributions: why?

Monte Carlo essentially avoids the quandary of choosing an accurate but intractable model versus a simple but computable one.

May want to answer:
- Probabilistic questions: simulate physical random processes, concerned with some corresponding random outcome; the randomness may be inherent or perceived, e.g. simulation of a shuttle launch.
- Deterministic questions: most often this boils down to computation of high-dimensional integrals; use experimental methods to answer a theoretical question.

Bayesian inference (recall Georgios Karagiannis' talk)

Data: $t = \{t_1, \dots, t_n\}$.
Model: $t$ is the realisation of a random vector $T$ having probability density $\pi_{T \mid \Psi}(\cdot \mid \psi)$, where $\psi$ is an unknown parameter; $\pi_{T \mid \Psi}(t \mid \cdot)$ is the likelihood.
Prior: all knowledge about $\psi$ which is not contained in $t$ is expressed via the prior density $\pi_\Psi(\psi)$.
Posterior: Bayes' Theorem enables us to rationally update the prior to our posterior belief in light of the new evidence (data):
$$\pi_{\Psi \mid T}(\psi \mid t) = \frac{\pi_{T \mid \Psi}(t \mid \psi)\,\pi_\Psi(\psi)}{\int_\Omega \pi_{T \mid \Psi}(t \mid \psi)\,\pi_\Psi(\mathrm{d}\psi)} \propto \pi_{T \mid \Psi}(t \mid \psi)\,\pi_\Psi(\psi)$$

Everything is about expectations

Recall, for a random variable $X \in \Omega$ having probability density $\pi(x)$,
$$E[f(X)] := \int_\Omega f(x)\,\pi(\mathrm{d}x) \equiv \mu$$
Pretty much all statements of probability can be phrased in terms of expectations:
$$X \in \mathbb{R}: \quad P(X < a) = \int_{-\infty}^{a} \pi(\mathrm{d}x) = \int_\Omega I_{(-\infty,a)}(x)\,\pi(\mathrm{d}x) = E[I_{(-\infty,a)}(X)]$$
$$X \in \Omega,\ A \subseteq \Omega: \quad P(X \in A) = \int_A \pi(\mathrm{d}x) = \int_\Omega I_A(x)\,\pi(\mathrm{d}x) = E[I_A(X)]$$
i.e. for statements of probability, define $f(x) := I_A(x)$.

Bayesian inference again

May want samples directly from the posterior; marginal kernel density estimates; posterior predictive simulation; etc.

Or may want to answer a question about a probability under the posterior:
$$P(\psi \in A \mid t) = \int_A \pi(\mathrm{d}\psi \mid t) = \int_\Omega I_A(\psi)\,\pi(\mathrm{d}\psi \mid t) = \frac{\int_\Omega I_A(\psi)\,\pi(t \mid \psi)\,\pi(\mathrm{d}\psi)}{\int_\Omega \pi(t \mid \psi)\,\pi(\mathrm{d}\psi)}$$

So just do numerical integration?

Midpoint Riemann integration in one dimension using $n$ evaluations:
$$\int_a^b f(x)\,\pi(\mathrm{d}x) = \int_a^b g(x)\,\mathrm{d}x \approx \frac{b-a}{n}\sum_{i=1}^n g(x_i), \quad \text{where } x_i := a + \frac{b-a}{n}\left(i - \frac{1}{2}\right)$$
It is easy to show the error is:
$$\left|\int_a^b g(x)\,\mathrm{d}x - \frac{b-a}{n}\sum_{i=1}^n g(x_i)\right| \le \frac{(b-a)^3}{24 n^2}\max_{a \le z \le b}|g''(z)|$$
Clearly $\frac{(b-a)^3}{24}\max_{a\le z\le b}|g''(z)|$ is fixed by the problem, so we achieve the desired accuracy by controlling $n^{-2}$.

So just do numerical integration?

Error in the midpoint Riemann integral in 1 dimension: $O(n^{-2})$.
But error in the midpoint Riemann integral in $d$ dimensions: $O(n^{-2/d})$, the so-called curse of dimensionality.
Error in Monte Carlo integration: $O(n^{-1/2})$, i.e. independent of dimension.

Simpson's rule improves this to $O(n^{-4/d})$, but in general Bakhvalov's Theorem bounds all possible quadrature methods by $O(n^{-r/d})$: quadrature can't beat the curse of dimensionality.

Order of error

[Figure: order of error against $n$ for Monte Carlo (any dimension), 2-dim Riemann and 6-dim Riemann integration. Note: this is the order of error, not absolute error!]

Monte Carlo to the rescue?

Monte Carlo integration in $d$ dimensions using $n$ evaluations:
$$\mu \equiv \int f(x)\,\pi(\mathrm{d}x) \approx \frac{1}{n}\sum_{i=1}^n f(x_i) \equiv \hat\mu, \quad \text{where } x_i \sim \pi(\cdot)$$
The root mean square error is:
$$\sqrt{E\left[\left(\int f(x)\,\pi(\mathrm{d}x) - \frac{1}{n}\sum_{i=1}^n f(x_i)\right)^2\right]} = \frac{\sigma}{\sqrt{n}}$$
where $\sigma^2 = \mathrm{Var}_\pi(f(X))$. Again, $\sigma$ is (mostly) inherent to the problem, so we achieve the desired accuracy by controlling $n^{-1/2}$.

Recall we can set $f(x) := I_A(x)$ to compute probabilities.
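
As a concrete illustration (my example, not from the slides), a minimal R sketch of this estimator, taking $\pi(\cdot)$ to be a standard Normal and $f(x) = x^2$ so the true answer is 1:

```r
# Plain Monte Carlo: estimate mu = E[f(X)] with X ~ N(0,1) and f(x) = x^2,
# plus the sigma/sqrt(n) standard error estimated from the same draws.
set.seed(1)
n <- 1e5
f <- function(x) x^2
x <- rnorm(n)                   # x_i ~ pi(.)
mu.hat <- mean(f(x))            # (1/n) * sum_i f(x_i)
se.hat <- sd(f(x)) / sqrt(n)    # estimated sigma / sqrt(n)
c(estimate = mu.hat, std.error = se.hat)
```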

Monte Carlo: the practicality

Remarkably:
- No dependence on $d$.
- No dependence on smoothness of the integrand.
- $\sigma$ can itself be directly estimated from the $n$ samples drawn.

But the crux of the last slide was: $x_i \sim \pi(\cdot)$. Methodological research in Monte Carlo is largely preoccupied with how to achieve this for complex probability distributions.

Monte Carlo: achieving desired accuracy

A simple application of Chebyshev's inequality allows us to bound how certain we are, in a fully quantified way:
$$P(|\hat\mu - \mu| \ge \varepsilon) \le \frac{E[(\hat\mu - \mu)^2]}{\varepsilon^2} = \frac{\sigma^2}{n\varepsilon^2}$$
Or indeed, invoking the iid central limit theorem, we can asymptotically state
$$P\left(\frac{\hat\mu - \mu}{\sigma n^{-1/2}} \le z\right) \xrightarrow{n \to \infty} \Phi(z)$$
and so form confidence intervals for $\mu$ based on large-$n$ samples.

Simple MC

The setting

Almost all Monte Carlo procedures start from the assumption that we have available an unlimited stream of independent uniformly distributed values, typically on the interval $[0,1]$. We now want to study how to convert a stream $u_i \sim \mathrm{Unif}(0,1)$ into a stream $x_j \sim \pi(\cdot)$, where $x_j$ is generated by some algorithm depending on one or more $u_i$. In more advanced methods (see MCMC), $x_j$ may also depend on $x_{j-1}$ or even $x_1, \dots, x_{j-1}$.

Inverse sampling

Let $F(x) := P(X \le x)$ be the cumulative distribution function for our target probability density function $\pi(\cdot)$.

Inverse Sampling
1. Sample $U \sim \mathrm{Unif}(0,1)$.
2. Set $X = F^{-1}(U)$.

Why is $X \sim \pi(\cdot)$?
$$P(X \le x) = P(F^{-1}(U) \le x) = P(F(F^{-1}(U)) \le F(x)) = P(U \le F(x)) = F(x)$$
To avoid problems with discrete distributions, we must define
$$F^{-1}(u) = \inf\{x : F(x) \ge u\}, \quad u \in [0,1]$$
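
A minimal R sketch of the idea (assuming a target with closed-form inverse CDF, here Exponential(rate = 2), where $F^{-1}(u) = -\log(1-u)/2$):

```r
# Inverse sampling: draw from Exponential(rate = 2) via U ~ Unif(0,1).
set.seed(1)
n <- 1e5
u <- runif(n)
x <- -log(1 - u) / 2   # X = F^{-1}(U)
mean(x)                # sanity check against the known mean 1/2
```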

Rejection sampling

Seek a density $\tilde\pi(\cdot)$ we can sample from and such that $\pi(x) \le c\,\tilde\pi(x)\ \forall x$, where $c < \infty$. $\pi(\cdot)$ and $\tilde\pi(\cdot)$ need not be normalised.

Rejection Sampling
1. Sample $Y \sim \tilde\pi(\cdot)$ and $U \sim \mathrm{Unif}(0,1)$.
2. If $U \le \frac{\pi(Y)}{c\,\tilde\pi(Y)}$, return $Y$; else return to 1.

This is not perfectly efficient, as we must iterate steps 1 & 2 a random number of times until acceptance, with $P(\text{accept}) = \frac{1}{c}$.
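
An illustrative R sketch (my example, not from the slides: target Beta(2,2), proposal Unif(0,1), and $c = 1.5$ since the Beta(2,2) density is bounded by 1.5):

```r
# Rejection sampling: draw from Beta(2,2) using a Unif(0,1) proposal, c = 1.5.
rejection.sample <- function(n) {
  out <- numeric(n)
  for (i in 1:n) {
    repeat {
      y <- runif(1)                                 # Y ~ proposal
      u <- runif(1)                                 # U ~ Unif(0,1)
      if (u <= dbeta(y, 2, 2) / (1.5 * dunif(y))) { # accept w.p. pi(y)/(c*ptilde(y))
        out[i] <- y
        break
      }
    }
  }
  out
}
x <- rejection.sample(1e4)   # expected acceptance rate is 1/c = 2/3
```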

Rejection sampling: caution, a low-dimensional method

Consider a multivariate Normal distribution centred at 0,
$$\pi(x) = \det(2\pi\Sigma)^{-1/2}\exp\left(-\frac{1}{2}x^T\Sigma^{-1}x\right)$$
Say we want to produce samples for a target where $\Sigma = Q^T\Lambda Q$, using a proposal $\tilde\pi(\cdot)$ with covariance $\tilde\Sigma = \tilde\sigma I$. If $\tilde\sigma < \max\{\lambda_i\}$ then $c = \infty$; $c$ is minimal for $\tilde\sigma = \max\{\lambda_i\}$.

[Figure: acceptance probability against dimension on a log scale (1e-01 down to 1e-04 and beyond), showing the acceptance probability decaying rapidly as dimension grows.]

Rejection sampling demo

Shiny demo for rejection sampling with
$$\pi(x) \propto (x-5)^2\cos(x - 1/4) \quad \text{and} \quad \tilde\pi(x) = N(\mu = 4, \sigma = 3)$$
Thus $c \approx 4.9$, and so $P(\text{accept}) = 1/c \approx 0.2$.

Importance sampling (I)

If we have a probability density $\tilde\pi(\cdot)$ which is close to $\pi(\cdot)$, then we can produce a weighted set of samples.

Importance Sampling
1. Sample $X \sim \tilde\pi(\cdot)$.
2. The sample is $x_i = X$, with weight $w_i = \frac{\pi(x_i)}{\tilde\pi(x_i)}$.

The Monte Carlo estimator is slightly modified to account for the weights:
$$\mu \equiv \int f(x)\,\pi(\mathrm{d}x) \approx \frac{1}{n}\sum_{i=1}^n w_i f(x_i) \equiv \hat\mu$$

Standard Importance Sampling Properties
$$E[\hat\mu] = \mu, \qquad \mathrm{Var}(\hat\mu) = \frac{\sigma^2_{\tilde\pi}}{n}, \quad \text{where } \sigma^2_{\tilde\pi} = \int \frac{\left(f(x)\pi(x) - \mu\,\tilde\pi(x)\right)^2}{\tilde\pi(x)}\,\mathrm{d}x$$
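
A minimal R sketch (my example: target $N(0,1)$, heavier-tailed Student-$t_5$ proposal, $f(x) = x^2$ so the true answer is 1):

```r
# Importance sampling: estimate E[X^2] under N(0,1) using t_5 proposal draws.
set.seed(1)
n <- 1e5
x <- rt(n, df = 5)               # x_i ~ proposal
w <- dnorm(x) / dt(x, df = 5)    # w_i = pi(x_i) / ptilde(x_i)
mu.hat <- mean(w * x^2)          # (1/n) * sum_i w_i f(x_i)
mu.hat
```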

Importance sampling (II)

Consequently, one can show the optimal proposal for importance sampling is:
$$\tilde\pi(x)_{\text{opt}} = \frac{|f(x)|\,\pi(x)}{\int_\Omega |f(x)|\,\pi(\mathrm{d}x)}$$
Hence importance sampling shows how to beat naïve Monte Carlo when estimating expectations of non-identity functionals. In practice we can never compute the optimal $\tilde\pi(\cdot)$, but it can still provide a nice guide.

Importance sampling: unnormalised π(·)

We can still perform importance sampling if $\pi(\cdot)$ and $\tilde\pi(\cdot)$ are only known up to a normalising constant. The algorithm for sampling is unchanged, but the self-normalised importance sampling estimate becomes:
$$\int f(x)\,\pi(\mathrm{d}x) \approx \frac{\sum_{i=1}^n f(x_i)\,w_i}{\sum_{i=1}^n w_i}$$

Self-normalised Importance Sampling Properties
$$E[\hat\mu] = \mu + \frac{\mu\,\mathrm{Var}(W)}{n} - \frac{\mathrm{Cov}(W, W f(X))}{n} + O(n^{-2})$$
$$\mathrm{Var}(\hat\mu) \approx \sum_{i=1}^n w_i^2\,(f(x_i) - \hat\mu)^2 \quad \text{and} \quad \tilde\pi(x)_{\text{opt}} \propto |f(x) - \mu|\,\pi(x)$$
(with the weights here normalised to sum to 1)

Importance sampling: a simple diagnostic

Equate the variance of the importance sampling estimate to the plain Monte Carlo variance for a fixed sample size $n_e$:
$$\mathrm{Var}\left(\frac{\sum_{i=1}^n f(x_i) w_i}{\sum_{i=1}^n w_i}\right) = \frac{\sigma^2}{n_e} \;\Rightarrow\; \frac{\mathrm{Var}\left(\sum_{i=1}^n f(x_i) w_i\right)}{\left(\sum_{i=1}^n w_i\right)^2} = \frac{\sigma^2}{n_e} \;\Rightarrow\; \frac{\sigma^2\sum_{i=1}^n w_i^2}{\left(\sum_{i=1}^n w_i\right)^2} = \frac{\sigma^2}{n_e} \;\Rightarrow\; n_e = \frac{n\,\bar w^2}{\overline{w^2}}$$
Balanced weights are desirable:
- small $n_e$: diagnose a problem with IS;
- large $n_e$: all is OK with IS.
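
This effective sample size is one line of R (a sketch; `w` is the weight vector from the importance sampler above):

```r
# Effective sample size: n * (mean weight)^2 / mean(squared weight).
ess <- function(w) length(w) * mean(w)^2 / mean(w^2)
ess(w)   # close to n means balanced weights; much smaller signals a problem
```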

MCMC

Markov Chain Monte Carlo

Standard Monte Carlo methods indeed have the nice $O(n^{-1/2})$ convergence rate with no dependence on dimension $d$, but the constant in the error term still depends on dimension: no completely free lunch. There are, however, methods which control the error term better than standard Monte Carlo. MCMC, introduced in 1953, constructs a Markov chain whose stationary distribution is the target distribution of interest, $\pi(\cdot)$.

Markov Chains

We saw Markov chains in an imprecise probability context yesterday morning (Gert de Cooman's talk). Recall, a process $(X_1, X_2, \dots)$ is a continuous state space, discrete time Markov chain if
$$P(X_t \in A \mid X_1 = x_1, \dots, X_{t-1} = x_{t-1}) \equiv P(X_t \in A \mid X_{t-1} = x_{t-1})$$
The transition probabilities from a current state are defined by a kernel function $K(x, \cdot)$, such that
$$P(X_t \in A \mid X_{t-1} = x_{t-1}) = \int_A K(x_{t-1}, \mathrm{d}y) \equiv K(x_{t-1}, A)$$
Under certain conditions, these chains will have a stationary distribution. We are interested in constructing Markov chains with the stationary distribution we want to target, i.e.
$$\int_\Omega \pi(\mathrm{d}x)\,K(x, y) = \pi(y)$$

Diving straight in

There is a rich and interesting theory of Markov chains, but we'll fast-forward to the action for today.

Metropolis-Hastings is a method to algorithmically construct $K(x, \cdot)$ such that $\pi(\cdot)$ will be the stationary distribution.

Metropolis-Hastings
Specify a target, $\pi(\cdot)$, a proposal, $q(\cdot \mid x)$, and a starting point $x_1$. To sample the Markov chain, repeat:
1. Sample $x' \sim q(\cdot \mid x_{t-1})$.
2. Compute
$$\alpha(x' \mid x_{t-1}) = \min\left\{1, \frac{\pi(x')\,q(x_{t-1} \mid x')}{\pi(x_{t-1})\,q(x' \mid x_{t-1})}\right\}$$
3. Sample $u \sim \mathrm{Unif}(0,1)$. Set
$$x_t = \begin{cases} x' & \text{if } u \le \alpha(x' \mid x_{t-1}) \\ x_{t-1} & \text{otherwise} \end{cases}$$
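
A minimal R sketch of this algorithm (my illustrative assumptions: unnormalised target $\pi(x) \propto e^{-x^2/2}$ and a symmetric $N(x_{t-1}, 2^2)$ proposal, so the $q$ terms cancel in the ratio):

```r
# Metropolis-Hastings with a symmetric Normal proposal, on the log scale.
set.seed(1)
log.pi <- function(x) -x^2 / 2           # unnormalised log target
n <- 1e4
x <- numeric(n)
x[1] <- 0                                # starting point x_1
for (t in 2:n) {
  xp <- rnorm(1, mean = x[t - 1], sd = 2)        # x' ~ q(. | x_{t-1})
  log.alpha <- log.pi(xp) - log.pi(x[t - 1])     # q terms cancel (symmetric)
  x[t] <- if (log(runif(1)) < log.alpha) xp else x[t - 1]
}
c(mean(x), var(x))   # should be near 0 and 1 for the N(0,1) target
```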

Metropolis-Hastings: common proposals

Random-walk MH: choose some spherically symmetric distribution $g(\cdot)$ and propose $x' = x + \varepsilon$, where $\varepsilon \sim g(\cdot)$.
- The spherical symmetry means the acceptance probability simplifies:
$$\alpha(x' \mid x_{t-1}) = \min\left\{1, \frac{\pi(x')}{\pi(x_{t-1})}\right\}$$
- Often $g(\cdot)$ is a zero-mean multivariate Normal.

Independent MH: any choice $q(x' \mid x) = g(x')$, where the proposal does not depend on the current state.
- Generally not a good choice: it is easy to construct non-ergodic chains.

Convergence results

We use the same estimator as standard Monte Carlo,
$$\mu \equiv \int f(x)\,\pi(\mathrm{d}x) \approx \frac{1}{n}\sum_{i=1}^n f(x_i) \equiv \hat\mu$$
where now the $x_i$ are MCMC draws.

However, we no longer have iid samples from $\pi(\cdot)$, so the standard Monte Carlo convergence results do not apply. Under some mild assumptions, we can state similar results for MCMC:
$$\sqrt{n}(\hat\mu - \mu) \xrightarrow{n \to \infty} N(0, \sigma^2)$$
where
$$\sigma^2 = \mathrm{Var}(f(X_1)) + 2\sum_{i=2}^{\infty}\mathrm{Cov}(f(X_1), f(X_i))$$

Estimating the variance

It is hard to estimate $\sigma^2$ in the MCMC setting, but essential in order to quantify the accuracy of estimates.
Simple option: always examine autocorrelation plots. These will alert you to situations where the infinite sum is contributing substantially to the variance of your estimate.
Better option: use methods such as batch means to estimate $\sigma^2$ from the Markov chain output. See the mcmcse R package for easy functions.
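
For the chain `x` simulated above, both checks are short (a sketch, assuming the mcmcse package is installed):

```r
# Diagnostics for the Markov chain 'x' from the sampler above.
acf(x)          # slowly decaying autocorrelation warns of inflated variance
library(mcmcse)
mcse(x)         # batch-means estimate of the mean and its Monte Carlo std error
```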

Choosing a proposal

Counter-intuitively, high acceptance rates in MCMC are a bad thing! Strongly correlated draws reduce the efficiency of the estimator by inflating the variance. But we need to move enough to explore the target: long-range jumps which reduce correlation have very low acceptance rates.

Need to balance these concerns: a famous result shows that in the limit as $d \to \infty$, the optimal acceptance rate for a symmetric product-form target density is 0.234. Empirically this works well in lower dimensions and for other targets, though for very small $d$ it should be increased (e.g. 0.44 in 1D).

Demo

Enough talk!
1. Example Metropolis-Hastings sampler in R (MCMC.R)
2. MCMC convergence Shiny demo (Shiny/MCMC.R)

Adaptive MCMC

Clearly there is an issue: we may get terrible results by making a poor choice of proposal. We can learn from the samples we have already seen to automatically improve our proposal distribution:
$$q_t(\cdot \mid x_{t-1}) = q(\cdot \mid x_{t-1}, \{x_1, \dots, x_{t-1}\})$$
Warning: this breaks the Markov property!

Adaptive MCMC Conditions
- Stationarity: $\pi(\cdot)$ must be stationary for $q_t(\cdot \mid x_{t-1})\ \forall t$.
- Diminishing adaptation: $\lim_{t \to \infty}\sup_{x \in \Omega}\left\|K_t(\cdot \mid x) - K_{t+1}(\cdot \mid x)\right\| = 0$.
- Containment: the time to stationarity from any point in the chain with the adapted kernel is bounded in probability.

Adaptive MCMC: Haario et al. / Roberts & Rosenthal

Idea?
$$q_t(\cdot \mid x_{t-1}) \sim N\left(x_{t-1}, \frac{2.38^2}{d}\hat\Sigma_t\right)$$
This doesn't quite work; to satisfy the adaptive conditions, use
$$q_t(\cdot \mid x_{t-1}) = (1-\beta)\,N\left(x_{t-1}, \frac{2.38^2}{d}\hat\Sigma_t\right) + \beta\,N\left(x_{t-1}, \frac{0.1^2}{d}I\right)$$
$2.38^2/d$ is the optimal scaling in certain theoretical circumstances. Alternatively, scale to target an acceptance rate:
$$q_t(\cdot \mid x_{t-1}) = (1-\beta)\,N\left(x_{t-1}, e^{\gamma_b}\hat\Sigma_t\right) + \beta\,N\left(x_{t-1}, \frac{0.1^2}{d}I\right)$$
where we split into batches $b$ of size 50, say, with
$$\gamma_b = \gamma_{b-1} + (-1)^{I(\alpha_{b-1} < 0.44)}\min\{0.01, n^{-1/2}\}$$
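
A toy one-dimensional sketch of the batch acceptance-rate adaptation (my simplification of the scheme above, not the full covariance-adapting algorithm):

```r
# Adapt the log proposal scale gamma in batches of 50, targeting a 0.44
# acceptance rate in 1D, for an unnormalised N(0,1) target.
set.seed(1)
log.pi <- function(x) -x^2 / 2
n <- 1e4; batch <- 50
x <- numeric(n); gamma <- 0; acc <- 0
for (t in 2:n) {
  xp <- rnorm(1, x[t - 1], sd = exp(gamma))      # random-walk proposal
  if (log(runif(1)) < log.pi(xp) - log.pi(x[t - 1])) {
    x[t] <- xp; acc <- acc + 1
  } else x[t] <- x[t - 1]
  if (t %% batch == 0) {                         # end of a batch
    gamma <- gamma + ifelse(acc / batch < 0.44, -1, 1) * min(0.01, 1 / sqrt(t))
    acc <- 0
  }
}
exp(gamma)   # adapted proposal standard deviation
```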

In Practice

Software

- mcmc R package: metrop for the kind of MCMC shown today; temper to handle multi-modality.
- Stan: Hamiltonian Monte Carlo; available from several languages.
- Birch: Sequential Monte Carlo; a brand new and particularly exciting probabilistic programming language.
