Introduction to Markov Chain Monte Carlo & Gibbs Sampling


1 Introduction to Markov Chain Monte Carlo & Gibbs Sampling. Prof. Nicholas Zabaras, Sibley School of Mechanical and Aerospace Engineering, 101 Frank H. T. Rhodes Hall, Ithaca, NY. URL: August 19,

2 Contents. Incremental strategies for sampling, iterative sampling; introduction to MCMC, autoregressive model; the Gibbs sampler, systematic scan, random scan; Gibbs sampler examples; block and Metropolized Gibbs; application to variable/model selection in linear regression.

Following closely: Monte Carlo Statistical Methods, C. P. Robert and G. Casella, Chapter 3 (google books, slides, video). Other references:
1. D. MacKay, Introduction to MC Methods, reprint.
2. R. Neal, Probabilistic Inference Using MCMC Methods.
3. C. Andrieu et al., An Introduction to MCMC for Machine Learning, Machine Learning, 50, 5-43, 2003.
4. S. Brooks, MCMC Methods and Its Applications, Journal of the Royal Statistical Society, Series D (The Statistician), Vol. 47, No. 1 (1998).
5. G. Casella and E. I. George, Explaining the Gibbs Sampler, The American Statistician, Vol. 46, 1992.
6. S. Chib and E. Greenberg, Understanding the MH Algorithm, The American Statistician, Vol. 49, No. 4 (Nov. 1995).

3 Using Incremental Strategies for Sampling. We have seen that both rejection sampling (RS) and importance sampling (IS) are limited to problems of moderate dimension. The problem with these algorithms is that they try to sample all the components of a high-dimensional parameter simultaneously. We next study incremental strategies:
- Iterative methods: Markov chain Monte Carlo.
- Sequential methods: sequential Monte Carlo.

4 Motivating Example. Multiple failures in a nuclear plant. Model: failures of the i-th pump follow a Poisson process with parameter λ_i, 1 ≤ i ≤ 10. For an observation time t_i, the number of failures p_i is a Poisson P(λ_i t_i) random variable. The unknowns consist of θ := (λ_1, λ_2, ..., λ_10, β), where β is a parameter in the hierarchical model introduced next. [Statistical Computing and MC Methods, A. Doucet, Lecture 10.]

5 Motivating Example: Nuclear Pump Data. Hierarchical model, with α = 1.8, γ = 0.01, δ = 1:

λ_i ~ Ga(α, β) i.i.d. for i = 1, ..., 10, and β ~ Ga(γ, δ).

The posterior distribution (see here the Ga distribution) is

π(λ_1, ..., λ_10, β | t, p) ∝ ∏_{i=1}^{10} [ (λ_i t_i)^{p_i} e^{-λ_i t_i} λ_i^{α-1} e^{-β λ_i} ] β^{10α+γ-1} e^{-δβ}.

It is not obvious how the inverse CDF method, the accept/reject method, or importance sampling could be used for this multidimensional distribution.

6 Conditional Distributions. The conditionals can be obtained by direct observation:

λ_i | (β, t_i, p_i) ~ Ga(p_i + α, t_i + β) for 1 ≤ i ≤ 10,
β | (λ_1, ..., λ_10) ~ Ga(γ + 10α, δ + Σ_{i=1}^{10} λ_i).

Instead of directly sampling the vector θ = (λ_1, ..., λ_10, β) at once, one could suggest sampling it iteratively: start with the λ_i's for a given guess of β, followed by an update of β given the new samples (λ_1, ..., λ_10).

7 Iterative Sampling. Given a sample θ^t = (λ_1^t, ..., λ_10^t, β^t) at iteration t, one could proceed as follows at iteration t+1:

Step 1: λ_i^{t+1} | (β^t, t_i, p_i) ~ Ga(p_i + α, t_i + β^t) for i = 1, ..., 10.
Step 2: β^{t+1} | (λ_1^{t+1}, ..., λ_10^{t+1}) ~ Ga(γ + 10α, δ + Σ_{i=1}^{10} λ_i^{t+1}).

Note that instead of directly sampling in a space of dimension 11, one samples 11 times in spaces of dimension 1!
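The two-step update above can be sketched in Python (the lecture's linked implementations are in MatLab/C++). The failure counts p_i and observation times t_i below are hypothetical stand-ins, not the dataset used in the lecture; note that NumPy's gamma sampler is parameterized by scale = 1/rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical failure counts p_i and observation times t_i (illustrative only)
p = np.array([5, 1, 5, 14, 3, 19, 1, 1, 4, 22], dtype=float)
t = np.array([94.3, 15.7, 62.9, 125.8, 5.2, 31.4, 1.1, 1.1, 2.1, 10.5])
alpha, gamma_, delta = 1.8, 0.01, 1.0

n_iter = 5000
beta = 1.0                      # initial guess for beta
lam_chain = np.empty((n_iter, 10))
beta_chain = np.empty(n_iter)

for it in range(n_iter):
    # Step 1: lambda_i | beta ~ Ga(p_i + alpha, t_i + beta)   (shape, rate)
    lam = rng.gamma(shape=p + alpha, scale=1.0 / (t + beta))
    # Step 2: beta | lambda ~ Ga(gamma + 10*alpha, delta + sum_i lambda_i)
    beta = rng.gamma(shape=gamma_ + 10 * alpha, scale=1.0 / (delta + lam.sum()))
    lam_chain[it] = lam
    beta_chain[it] = beta
```

Each sweep draws 11 one-dimensional Gamma variates instead of one 11-dimensional sample, exactly as the two steps above describe.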

8 Iterative Sampling. With this iterative procedure: Are we sampling from the desired joint distribution of the 11 variables? If yes, how many times should the iteration above be repeated? The validity of the approach described here derives from the fact that the sequence {θ^t} := {λ_1^t, λ_2^t, ..., λ_10^t, β^t} is a Markov chain.

9 Introduction to Markov Chain Monte Carlo. Markov chain: a sequence of random variables {X_n, n ≥ 0} defined on (X, B(X)) which satisfies, for any A ∈ B(X), the probability condition

P(X_n ∈ A | X_0, ..., X_{n-1}) = P(X_n ∈ A | X_{n-1}),

and we write the transition kernel as P(x, A) = P(X_n ∈ A | X_{n-1} = x). As we have seen in an earlier lecture, given a target π, we need to design a transition kernel P such that asymptotically

(1/N) Σ_{n=1}^{N} f(X_n) → ∫_X f(x) π(x) dx  and/or  X_n ~ π.

It is easy to simulate the Markov chain even if π is complex.

10 Autoregression Model. Consider the autoregressive model, for |α| < 1,

X_n = α X_{n-1} + V_n, where V_n ~ N(0, σ²).

The limiting distribution is π(x) = N(x; 0, σ²/(1 - α²)). To sample from π, we just simulate the Markov chain and we know that asymptotically X_n ~ π. Of course this problem only demonstrates the main idea, since here we can sample directly from π!
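A quick sketch of this experiment, using one long chain rather than the 100 parallel chains of the next slide (run length and seed are arbitrary choices); the time average and empirical variance are compared with the limiting N(0, σ²/(1 - α²)):

```python
import numpy as np

rng = np.random.default_rng(1)
a, sigma = 0.4, 5.0
n = 200_000

# Simulate one long AR(1) chain: X_n = a*X_{n-1} + V_n, V_n ~ N(0, sigma^2)
x = np.empty(n)
x[0] = 10.0                          # deliberately far from the stationary mean
for k in range(1, n):
    x[k] = a * x[k - 1] + sigma * rng.standard_normal()

burn = x[1000:]                      # drop a short burn-in
target_var = sigma**2 / (1 - a**2)   # variance of the limiting distribution
```

Despite the chain starting far from equilibrium, the time average of the trajectory settles near 0 and its empirical variance near σ²/(1 - α²), illustrating the ergodic behavior discussed on the following slides.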

11 Autoregression Model. Consider 100 independent Markov chains run in parallel. We assume that the initial distribution of these Markov chains is U[0, 20], so initially the Markov chain samples are not distributed according to π. In the following example, we choose α = 0.4, σ = 5 (see here for a MatLab implementation).

12 Example. A Markov chain with a normal distribution as target distribution. [Figure: sample evolution at the initial distribution and at steps 1, 2, 3, 4, and 100.]

13 Example. Histograms of 100 independent Markov chains with a normal distribution as target distribution. [Figure: histograms at the initial distribution and at steps 1, 2, 3, 4, and 100.]

14 Example. The target normal distribution seems to attract the distribution of the samples and even to be a fixed point of the algorithm. We have produced 100 independent samples from the normal distribution. We will see that it is not necessary to run 100 Markov chains in parallel in order to obtain 100 samples: one can instead consider a unique Markov chain and build the histogram from one trajectory of this single chain.

15 Markov Chain Monte Carlo. The estimate of the target distribution, through the series of histograms, improves with the number of iterations. Assume that we have stored {X_n, n = 1, ..., N} for N large and wish to estimate ∫_X f(x) π(x) dx. We suggest the estimator

(1/N) Σ_{n=1}^{N} f(X_n),

which is the estimator we used before when {X_n, n = 1, ..., N} were independent. Under relatively mild conditions, such an estimator is consistent despite the fact that the samples are not independent. Under additional conditions, a CLT also holds with a rate of convergence 1/√N.

16 Markov Chain Monte Carlo. We are interested in Markov chains with a transition kernel P which has the following three important properties, observed in the autoregressive example:

A. The desired distribution π is an invariant distribution of the Markov chain, i.e. ∫_X π(x) P(x, y) dx = π(y).
B. The successive distributions of the Markov chain converge towards π regardless of the starting point.
C. The estimator (1/N) Σ_{n=1}^{N} f(X_n) converges towards E_π(f(X)), and asymptotically X_n ~ π (stronger requirement).
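For the autoregressive example, property A can be checked by a one-line variance computation: if X_{n-1} ~ π, then

```latex
X_{n-1}\sim\mathcal N\!\left(0,\tfrac{\sigma^2}{1-\alpha^2}\right),\;
V_n\sim\mathcal N(0,\sigma^2)
\;\Longrightarrow\;
\operatorname{Var}(\alpha X_{n-1}+V_n)
=\alpha^2\,\frac{\sigma^2}{1-\alpha^2}+\sigma^2
=\frac{\sigma^2}{1-\alpha^2},
```

and since X_n = αX_{n-1} + V_n is a zero-mean Gaussian, X_n ~ π as well: π is invariant for this kernel.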

17 Markov Chain Monte Carlo. Since there are infinitely many kernels P(x, y) which admit π(x) as their invariant distribution, the main task in MCMC is coming up with good ones. Convergence is ensured under very weak assumptions: irreducibility and aperiodicity. It is usually easy to establish that an MCMC sampler converges towards π(x), but difficult to obtain rates of convergence.

18 The Gibbs Sampler. The Gibbs sampler is a generic method to sample from a high-dimensional distribution. It generates a Markov chain which converges to the target distribution under weak assumptions: irreducibility and aperiodicity.

19 The Two-Component Gibbs Sampler. Consider the target distribution π(θ) with θ = (θ_1, θ_2). The two-component Gibbs sampler proceeds as follows.

Initialization: select deterministically or randomly θ^(0) = (θ_1^(0), θ_2^(0)).
Iteration i, i ≥ 1:
Sample θ_1^(i) ~ π(θ_1 | θ_2^(i-1)).
Sample θ_2^(i) ~ π(θ_2 | θ_1^(i)).

Sampling from the conditionals is often feasible even when sampling from the joint is impossible (e.g. in the nuclear pump data).

20 Invariant Distribution. Clearly {(θ_1^(i), θ_2^(i))} is a Markov chain. Its transition kernel is

P((θ_1, θ_2), (θ_1', θ_2')) = π(θ_1' | θ_2) π(θ_2' | θ_1').

The invariance (global balance) equation is satisfied:

∫∫ π(θ_1, θ_2) P((θ_1, θ_2), (θ_1', θ_2')) dθ_1 dθ_2
= ∫∫ π(θ_1, θ_2) π(θ_1' | θ_2) π(θ_2' | θ_1') dθ_1 dθ_2
= π(θ_2' | θ_1') ∫ π(θ_2) π(θ_1' | θ_2) dθ_2
= π(θ_2' | θ_1') π(θ_1')
= π(θ_1', θ_2').

21 Irreducibility. Invariance alone does not ensure that the Gibbs sampler converges towards the invariant distribution. Additionally, irreducibility is required: the Markov chain can move to any set A such that π(A) > 0 from (almost) any starting point. This ensures that

(1/N) Σ_{n=1}^{N} f(θ_1^n, θ_2^n) → ∫ f(θ_1, θ_2) π(θ_1, θ_2) dθ_1 dθ_2,

but not that asymptotically (θ_1^n, θ_2^n) ~ π.

22 Irreducibility. A distribution that leads to a reducible Gibbs sampler is shown here.

23 Irreducibility. Consider an example with X = {1, 2} and transition probabilities P(1, 2) = P(2, 1) = 1. The invariant distribution is clearly given by π(1) = π(2) = 1/2. However, we know that if the chain starts at X_0 = 1, then X_{2n} = 1 and X_{2n+1} = 2 for any n. We have

(1/N) Σ_{n=1}^{N} f(X_n) → ∫ f(x) π(x) dx,

but clearly X_n is not distributed according to π. You need to make sure that you do not explore the space in a periodic way to ensure that X_n ~ π asymptotically.
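This periodic chain is easy to simulate; a minimal sketch (the chain is deterministic, so no randomness is involved) showing that the time average converges to the right value while X_n itself alternates and never settles into π:

```python
import numpy as np

# Deterministic two-state chain: P(1, 2) = P(2, 1) = 1, started at X_0 = 1.
n = 1000
x = np.empty(n, dtype=int)
x[0] = 1
for k in range(1, n):
    x[k] = 2 if x[k - 1] == 1 else 1

time_avg = x.mean()   # ergodic average of f(x) = x; the target value is 1.5
```

The time average equals (1 + 2)/2 = 1.5 exactly, yet X_n at even steps is always 1 and at odd steps always 2: ergodic averages converge even though the marginal distribution of X_n never does.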

24 Gibbs Sampler. If θ = (θ_1, θ_2, ..., θ_p) where p > 2, the Gibbs sampler still applies.

Initialization: select deterministically or randomly θ^(0) = (θ_1^(0), θ_2^(0), ..., θ_p^(0)).
Iteration i, i ≥ 1: for k = 1 : p, sample θ_k^(i) ~ π(θ_k | θ_{-k}^(i)), where θ_{-k}^(i) = (θ_1^(i), ..., θ_{k-1}^(i), θ_{k+1}^(i-1), ..., θ_p^(i-1)).

25 Systematic-Scan Gibbs Sampler. Systematic scan Gibbs: let θ^(i) = (θ_1^(i), θ_2^(i), ..., θ_p^(i)).

Update θ_1^(i) from π(· | θ_2^(i-1), ..., θ_p^(i-1)).
Update θ_2^(i) from π(· | θ_1^(i), θ_3^(i-1), ..., θ_p^(i-1)).
...
Update θ_p^(i) from π(· | θ_1^(i), θ_2^(i), ..., θ_{p-1}^(i)).

26 Random-Scan Gibbs Sampler. Consider again θ = (θ_1, θ_2, ..., θ_p) where p > 2. We consider the following random-scan Gibbs sampler.

Initialization: select deterministically or randomly θ^(0) = (θ_1^(0), θ_2^(0), ..., θ_p^(0)).
Iteration i, i ≥ 1:
Sample K ~ U{1, ..., p}.
Set θ_{-K}^(i) = θ_{-K}^(i-1).
Sample θ_K^(i) ~ π(θ_K | θ_{-K}^(i)), where θ_{-K}^(i) = (θ_1^(i), ..., θ_{K-1}^(i), θ_{K+1}^(i), ..., θ_p^(i)).

27 Random-Scan Gibbs Sampler. Random scan Gibbs: let θ^(i) = (θ_1^(i), θ_2^(i), ..., θ_p^(i)) at step (iteration) i. Draw j from 1 to p with probability w_j = 1/p. Draw the new coordinate j, θ_j^(i) ~ π(· | θ_{-j}^(i-1)), and leave the remaining components unchanged; that is, let θ_{-j}^(i) = θ_{-j}^(i-1).

28 Gibbs Sampler: Example. Consider the following target distribution:

π(x_1, x_2) = N( (0, 0), [[1, ρ], [ρ, 1]] ) ∝ exp( -(x_1² - 2ρ x_1 x_2 + x_2²) / (2(1 - ρ²)) ).

The marginal distribution is given as π(x_1) ∝ exp(-x_1²/2). A systematic-scan Gibbs sampler is generated with the following conditionals:

x_1^{t+1} | x_2^t ~ N(ρ x_2^t, 1 - ρ²),
x_2^{t+1} | x_1^{t+1} ~ N(ρ x_1^{t+1}, 1 - ρ²).
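A minimal Python sketch of this systematic-scan sampler (the lecture links C++ programs for this example; the run length and seed here are arbitrary choices, and the initial state matches the next slide):

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.5
n_iter = 100_000

x1, x2 = -3.0, -3.0              # initial state, as on the next slide
s = np.sqrt(1 - rho**2)          # conditional standard deviation
chain = np.empty((n_iter, 2))

for it in range(n_iter):
    x1 = rho * x2 + s * rng.standard_normal()   # x1 | x2 ~ N(rho*x2, 1-rho^2)
    x2 = rho * x1 + s * rng.standard_normal()   # x2 | x1 ~ N(rho*x1, 1-rho^2)
    chain[it] = x1, x2

corr = np.corrcoef(chain[:, 0], chain[:, 1])[0, 1]   # should approach rho
```

The x_1 samples match the standard Gaussian marginal, and the empirical correlation between the two coordinates recovers ρ.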

29 Gibbs Sampler: Example. Set ρ = 0.5, number of iterations 10000, and initial state (x_1, x_2) = (-3, -3). [Figures: histogram of x_1, the exact pdf of which is the standard Gaussian, and the x_1-x_2 plot.] C++ programs are given here.

30 Gibbs Sampler: Example. Set ρ = 0.999, number of iterations 10000, and initial state (x_1, x_2) = (-3, -3). We can see that the sampling process in this case of highly correlated variables is inaccurate. [Figures: histogram of x_1, the exact pdf of which is the standard Gaussian, and the x_1-x_2 plot.]

31 Convergence of the Gibbs Sampler. Even when irreducibility and aperiodicity are ensured, the Gibbs sampler can still converge very slowly. Consider the target bivariate Gaussian distribution N(0, [[a, b], [b, a]]). A systematic-scan Gibbs sampler is generated as

x_1^{t+1} | x_2^t ~ N( (b/a) x_2^t, a - b²/a ),
x_2^{t+1} | x_1^{t+1} ~ N( (b/a) x_1^{t+1}, a - b²/a ).
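The slow convergence can be quantified by the lag-1 autocorrelation of the x_1-chain, which for this sampler grows as the off-diagonal term b approaches a. A small Python sketch (the values a = 1 and b = 0.3, 0.95 are illustrative assumptions, not the lecture's settings):

```python
import numpy as np

def gibbs_acf1(a, b, n_iter=50_000, seed=3):
    """Run the systematic-scan Gibbs sampler for N(0, [[a, b], [b, a]])
    and return the lag-1 autocorrelation of the x1-chain."""
    rng = np.random.default_rng(seed)
    s = np.sqrt(a - b**2 / a)        # conditional standard deviation
    x1 = x2 = 0.0
    xs = np.empty(n_iter)
    for it in range(n_iter):
        x1 = (b / a) * x2 + s * rng.standard_normal()
        x2 = (b / a) * x1 + s * rng.standard_normal()
        xs[it] = x1
    x = xs - xs.mean()
    return (x[:-1] * x[1:]).sum() / (x * x).sum()

acf_weak = gibbs_acf1(1.0, 0.3)      # weakly correlated components
acf_strong = gibbs_acf1(1.0, 0.95)   # strongly correlated components
```

For this chain, x_1^{t+1} = (b/a)² x_1^t + noise, so the lag-1 autocorrelation is about (b/a)²: roughly 0.09 for b = 0.3 and 0.90 for b = 0.95, confirming that strong correlation between components slows mixing dramatically.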

32 Convergence of the Gibbs Sampler. The Gibbs sampling path and equiprobability curves are plotted below. A C++ implementation can be found here.

33 Gibbs Sampler for Mixture of Gaussians. A MatLab implementation can be found here.

34 Gibbs Sampler: Example. Consider the following target distribution:

π(x_1, x_2) ∝ C(n, x_1) x_2^{x_1 + α - 1} (1 - x_2)^{n - x_1 + β - 1}.

The two conditional distributions for the Gibbs sampler are

x_1 | x_2 ~ Binom(n, x_2),
x_2 | x_1 ~ Be(x_1 + α, n - x_1 + β).

We set n = 20, α = β = 0.5, initial state (0, 0), and 10,000 iterations. See here for a C++ implementation. [Figure: histogram of x_2, the exact pdf of which is a Beta distribution.]
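A sketch of this beta-binomial Gibbs sampler in Python (the lecture links a C++ implementation; the iteration count here is raised to 50,000 to tighten the Monte Carlo estimates):

```python
import numpy as np

rng = np.random.default_rng(4)
n, alpha, beta_ = 20, 0.5, 0.5
n_iter = 50_000

x1, x2 = 0, 0.0                  # initial state (0, 0), as in the slide
x2_chain = np.empty(n_iter)
for it in range(n_iter):
    x1 = rng.binomial(n, x2)                      # x1 | x2 ~ Binom(n, x2)
    x2 = rng.beta(x1 + alpha, n - x1 + beta_)     # x2 | x1 ~ Be(x1+a, n-x1+b)
    x2_chain[it] = x2

x2_mean, x2_var = x2_chain.mean(), x2_chain.var()
```

The marginal of x_2 is Be(α, β) = Be(0.5, 0.5), so the chain's mean should approach 1/2 and its variance αβ/((α+β)²(α+β+1)) = 1/8, matching the histogram on the slide.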

35 Gibbs Sampler: Example. Consider a likelihood defined with the Cauchy distribution C(μ, 1) and two measurements:

π(μ | D) ∝ ∏_{i=1}^{2} f(x_i | μ) π(μ), with f(x_i | μ) = 1 / [π (1 + (x_i - μ)²)].

We take as prior a normal distribution μ ~ N(0, 10). This leads to a posterior of the form

π(μ | D) ∝ e^{-μ²/20} / [ (1 + (x_1 - μ)²)(1 + (x_2 - μ)²) ].

How do we use the Gibbs sampler to sample from this univariate distribution?

36 Gibbs Sampler: Example. We can use the Gibbs sampler by noticing that

1 / (1 + (x_i - μ)²) = ∫_0^∞ e^{-ω_i (1 + (x_i - μ)²)} dω_i,

so that

π(μ | D) ∝ e^{-μ²/20} ∏_{i=1}^{2} ∫_0^∞ e^{-ω_i (1 + (x_i - μ)²)} dω_i.

We can then think of π(μ | D) as the marginal of

π(μ, ω_1, ω_2 | D) ∝ e^{-μ²/20} ∏_{i=1}^{2} e^{-ω_i (1 + (x_i - μ)²)}.

The Gibbs sampler is based on the following two steps:
Generate μ^(t) ~ π(μ | ω^(t-1), D).
Generate ω^(t) ~ π(ω | μ^(t), D).

37 Gibbs Sampler: Example. The step μ^(t) ~ π(μ | ω^(t-1), D) is straightforward since

π(μ | ω, D) = N( Σ_i ω_i x_i / (Σ_i ω_i + 1/20), 1 / (2 Σ_i ω_i + 1/10) ).

The step ω^(t) ~ π(ω | μ^(t), D) is also straightforward:

π(ω_i | μ^(t), D) = Exp( 1 + (x_i - μ^(t))² ).

A MatLab implementation can be found here.
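These two steps can be sketched as follows; the measurement values x_1, x_2 are hypothetical, and note that NumPy's exponential sampler is parameterized by scale = 1/rate:

```python
import numpy as np

rng = np.random.default_rng(5)
x_data = np.array([1.3, -0.5])   # hypothetical measurements x1, x2
n_iter = 20_000

omega = np.ones(2)               # initial latent variables
mu_chain = np.empty(n_iter)
for it in range(n_iter):
    # mu | omega, D ~ N( sum(w_i x_i)/(sum(w_i) + 1/20), 1/(2*sum(w_i) + 1/10) )
    prec = 2.0 * omega.sum() + 1.0 / 10.0
    mean = omega @ x_data / (omega.sum() + 1.0 / 20.0)
    mu = mean + rng.standard_normal() / np.sqrt(prec)
    # omega_i | mu, D ~ Exp(rate = 1 + (x_i - mu)^2); NumPy uses scale = 1/rate
    omega = rng.exponential(scale=1.0 / (1.0 + (x_data - mu) ** 2))
    mu_chain[it] = mu

```

Alternating these two exact conditional draws yields a chain whose μ-marginal targets the Cauchy-likelihood posterior of the previous slide.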

38 Gibbs Sampler: Example. On the left, the last 100 iterations of the chain (μ^(t)); on the right, the histogram of the chain (μ^(t)) and comparison with the target density for 10,000 iterations. A MatLab implementation can be found here.

39 Block and Metropolized Gibbs. Instead of updating single coordinates x_j, one can update blocks x_A. This is more efficient but requires knowing the block conditionals π(x_A | x_{-A}) and being able to sample from them. Combinations of Gibbs and Metropolis-Hastings (an introduction was provided in the introduction to Markov chains lecture) are popular. In Metropolized Gibbs, for example, some coordinates are updated from their conditionals and others using arbitrary proposals, as in Metropolis-Hastings.

40 Gibbs Sampling. Are the component-wise MH algorithms π-invariant, irreducible, or aperiodic? Each transition kernel in Gibbs (which updates a single coordinate) is neither irreducible nor aperiodic. However, their combination (random or systematic scan) might be!

41 Gibbs Sampling. Consider a target π(x_1, x_2) (e.g. a uniform distribution) with disconnected support as in the figure. Conditioning on x_1 < 0, the distribution of x_2 cannot produce a value in [0, 1]. You can make this type of problem work by introducing a proper coordinate transformation:

y_1 = x_1 + x_2, y_2 = x_1 - x_2.

Conditioning now on y_1 produces a uniform distribution on the union of a negative and a positive interval. Therefore, one iteration of the Gibbs sampler is sufficient to jump from one disk to the other.

42 Gibbs Sampler: Recommendations. Have as few blocks as possible. Put the most correlated variables in the same block. If necessary, reparametrize the model to achieve this. Integrate analytically over as many variables as possible. There is no general strategy that works for all problems.

43 Bayesian Variable Selection in Regression. We select the following regression model:

Y = Σ_{k=1}^{p} β_k X_k + σV, where V ~ N(0, 1),

where we assume as priors σ² ~ IG(ν/2, γ/2) and, for α << 1,

β_k ~ (1/2) N(0, α δ² σ²) + (1/2) N(0, δ² σ²).

We introduce a latent variable γ_k ∈ {0, 1} such that Pr(γ_k = 0) = Pr(γ_k = 1) = 1/2 and

β_k | γ_k = 0 ~ N(0, α δ² σ²), β_k | γ_k = 1 ~ N(0, δ² σ²).

[A. Doucet, Statistical Computing and MC Methods, Lecture 10.]

44 A Bad Gibbs Sampler. We have parameters (β_{1:p}, γ_{1:p}, σ²) and observe D = {x_i, y_i}_{i=1}^n. A potential Gibbs sampler consists of sampling iteratively from

p(β_{1:p} | D, γ_{1:p}, σ²) (Gaussian), p(σ² | D, γ_{1:p}, β_{1:p}) (inverse Gamma), and p(γ_{1:p} | D, β_{1:p}, σ²).

In particular, p(γ_{1:p} | D, β_{1:p}, σ²) = ∏_{k=1}^{p} p(γ_k | β_k, σ²) and

p(γ_k = 1 | β_k, σ²) = [ (2π δ² σ²)^{-1/2} exp(-β_k²/(2 δ² σ²)) ] / [ (2π δ² σ²)^{-1/2} exp(-β_k²/(2 δ² σ²)) + (2π α δ² σ²)^{-1/2} exp(-β_k²/(2 α δ² σ²)) ].

The Gibbs sampler becomes reducible as α goes to zero.
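The degeneracy can be seen numerically: at β_k = 0 the probability above reduces to √α/(1 + √α), which vanishes as α → 0, so a coordinate stuck near zero essentially never switches back to γ_k = 1. A small sketch (δ² = σ² = 1 are illustrative values):

```python
import numpy as np

def p_gamma_one(beta_k, alpha, delta2=1.0, sigma2=1.0):
    """P(gamma_k = 1 | beta_k, sigma^2) for the two-component normal prior."""
    slab = np.exp(-beta_k**2 / (2 * delta2 * sigma2)) / np.sqrt(delta2 * sigma2)
    spike = (np.exp(-beta_k**2 / (2 * alpha * delta2 * sigma2))
             / np.sqrt(alpha * delta2 * sigma2))
    return slab / (slab + spike)

# At beta_k = 0 the spike component dominates as alpha -> 0: the sampler can
# no longer flip gamma_k back to 1, and the chain becomes reducible in the limit.
p_moderate = p_gamma_one(0.0, alpha=0.1)    # 1/(1 + sqrt(10)), approx 0.24
p_tiny = p_gamma_one(0.0, alpha=1e-8)       # essentially zero
```

This is the numerical face of the reducibility noted above, and it motivates the reformulation on the next slide.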

45 Bayes Variable Selection. This is the result of bad modeling. As we have already seen, we let α → 0 and write

Y = Σ_{k=1}^{p} γ_k β_k X_k + σV, where V ~ N(0, 1),

where γ_k = 1 if X_k is included and γ_k = 0 otherwise. However, this suggests that β_k is defined even when γ_k = 0. A neater way to write such models is

Y = Σ_{k: γ_k = 1} β_k X_k + σV = β_γ^T X_γ + σV, where V ~ N(0, 1),

where, for a vector γ = (γ_1, ..., γ_p), β_γ = {β_k : γ_k = 1}, X_γ = {X_k : γ_k = 1}, and n_γ = Σ_{k=1}^{p} γ_k.

Prior distributions: π(β_γ | σ², γ) = N(β_γ; 0, δ² σ² I_{n_γ}), σ² ~ IG(ν_0/2, γ_0/2), and π(γ) = ∏_{k=1}^{p} π(γ_k) with π(γ_k = 1) = 1/2.

46 A Better Gibbs Sampler. We are interested in sampling from the trans-dimensional distribution π(β_γ, γ, σ² | D). However, we know that

π(β_γ, γ, σ² | D) = π(γ | D) π(β_γ, σ² | D, γ),

where π(γ | D) ∝ π(D | γ) π(γ) and (see result from an earlier lecture)

π(D | γ) = ∫∫ π(D, β_γ, σ² | γ) dβ_γ dσ² ∝ Γ((ν_0 + n)/2) |Σ_γ|^{1/2} δ^{-n_γ} ( (γ_0 + Σ_{i=1}^{n} y_i² - μ_γ^T Σ_γ^{-1} μ_γ)/2 )^{-(ν_0 + n)/2},

with μ_γ = Σ_γ Σ_{i=1}^{n} y_i x_{γ,i}, Σ_γ^{-1} = δ^{-2} I_{n_γ} + Σ_{i=1}^{n} x_{γ,i} x_{γ,i}^T.

47 A Better Gibbs Sampler. The full conditional distribution π(β_γ, σ² | D, γ) is

π(β_γ, σ² | D, γ) = N(β_γ; μ_γ, σ² Σ_γ) IG(σ²; (ν_0 + n)/2, (γ_0 + Σ_{i=1}^{n} y_i² - μ_γ^T Σ_γ^{-1} μ_γ)/2),

where μ_γ = Σ_γ Σ_{i=1}^{n} y_i x_{γ,i} and Σ_γ^{-1} = δ^{-2} I_{n_γ} + Σ_{i=1}^{n} x_{γ,i} x_{γ,i}^T. The derivation of the above conditional was already given in an earlier lecture.

48 A Better Gibbs Sampler. Popular alternative prior models for γ_i include

γ_i ~ B(λ), where λ ~ U[0, 1],
γ_i ~ B(λ), where λ ~ Be(α, β),

and the g-prior (Zellner)

β_γ | σ², γ ~ N(β_γ; 0, δ² σ² (X_γ^T X_γ)^{-1}),

where here for robustness we additionally use δ² ~ IG(a_0/2, b_0/2). Such variations are very important and can modify dramatically the performance of the Bayesian model.

49 Bayesian Variable Selection Example. π(γ | D) is a discrete probability distribution with 2^p potential values. We assume δ² is known here. We can use the Gibbs sampler to sample from it.

Initialization: select deterministically or randomly γ^(0) = (γ_1^(0), ..., γ_p^(0)).
Iteration i, i ≥ 1: for k = 1 : p, sample γ_k^(i) ~ π(γ_k | D, γ_{-k}^(i)), where γ_{-k}^(i) = (γ_1^(i), ..., γ_{k-1}^(i), γ_{k+1}^(i-1), ..., γ_p^(i-1)).
Optional step: sample (β_γ^(i), σ^{2(i)}) ~ π(β_γ, σ² | D, γ^(i)).
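A sketch of this sampler on synthetic data, computing π(γ_k | D, γ_{-k}) through the marginal likelihood π(D | γ) with (β_γ, σ²) integrated out; the data, hyperparameter values, and run length are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

# --- hypothetical synthetic data: only variable 0 truly matters ---
n, p = 50, 3
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] + 0.5 * rng.standard_normal(n)

delta2, nu0, gamma0 = 100.0, 1.0, 1.0    # assumed hyperparameter values
XtX, Xty, yty = X.T @ X, X.T @ y, y @ y

def log_marginal(gam):
    """log pi(D | gamma) up to an additive constant (beta, sigma^2 integrated out)."""
    idx = np.flatnonzero(gam)
    if idx.size == 0:
        return -0.5 * (nu0 + n) * np.log(gamma0 + yty)
    Sinv = XtX[np.ix_(idx, idx)] + np.eye(idx.size) / delta2   # Sigma_gamma^{-1}
    mu = np.linalg.solve(Sinv, Xty[idx])                       # mu_gamma
    _, logdet_Sinv = np.linalg.slogdet(Sinv)
    return (-0.5 * logdet_Sinv - 0.5 * idx.size * np.log(delta2)
            - 0.5 * (nu0 + n) * np.log(gamma0 + yty - Xty[idx] @ mu))

gam = np.zeros(p, dtype=int)
counts = np.zeros(p)
n_iter, burn = 2000, 200
for it in range(n_iter):
    for k in range(p):
        lm = np.empty(2)
        for v in (0, 1):                 # evaluate both states of gamma_k
            gam[k] = v
            lm[v] = log_marginal(gam)
        # uniform prior on gamma: conditional inclusion probability is a softmax
        prob1 = 1.0 / (1.0 + np.exp(np.clip(lm[0] - lm[1], -700.0, 700.0)))
        gam[k] = int(rng.random() < prob1)
    if it >= burn:
        counts += gam
incl_prob = counts / (n_iter - burn)     # posterior inclusion probabilities
```

With a strong signal on X_0, the chain should give variable 0 an inclusion probability near 1 and the spurious variables much smaller ones, illustrating why marginalizing (β_γ, σ²) yields a well-behaved sampler over γ alone.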

50 Bayesian Variable Selection Example. Consider the case where δ² is unknown.

Initialization: select deterministically or randomly (γ^(0), β_γ^(0), σ^{2(0)}, δ^{2(0)}).
Iteration i, i ≥ 1:
For k = 1 : p, sample γ_k^(i) ~ π(γ_k | D, γ_{-k}^(i), δ^{2(i-1)}), where γ_{-k}^(i) = (γ_1^(i), ..., γ_{k-1}^(i), γ_{k+1}^(i-1), ..., γ_p^(i-1)).
Sample (β_γ^(i), σ^{2(i)}) ~ π(β_γ, σ² | D, γ^(i), δ^{2(i-1)}).
Sample δ^{2(i)} ~ π(δ² | β_γ^(i), γ^(i)).

51 Bayesian Variable Selection Example. This very simple sampler is much more efficient than those where γ is sampled conditional upon (β, σ²). However, it mixes very slowly because the components are updated one at a time. Updating correlated components together would significantly increase the convergence speed of the algorithm, at the cost of increased complexity. We will revisit linear regression models in more detail and provide an implementation of the variable selection caterpillar example in the following lecture.

52 Pine Processionary Caterpillars. Caterpillar dataset: a 1973 study to assess the influence of some forest settlement characteristics on the development of caterpillar colonies. The response variable is the log of the average number of nests of caterpillars per tree in an area of 500 m². We have n = 33 data points and 10 explanatory variables. Following closely: Bayesian Core, J.-M. Marin and C. P. Robert, Chapter 3 (available online for Cornell students).

53 Regression. Linear regression is one of the most widespread statistical tools for modeling the (linear) influence of some variables on others. The variable of primary interest, y, is called the response variable, e.g. here the number of pine processionary caterpillar colonies. The variables x = (x_1, ..., x_k) are the explanatory variables; these variables can be continuous or discrete. Our objective is to uncover explanatory and predictive patterns.

54 Caterpillar Regression Problem. The pine processionary caterpillar colony size is influenced by:
x_1, the altitude (in meters);
x_2, the slope (in degrees);
x_3, the number of pines in the square;
x_4, the height (in meters) of the tree sampled at the center of the square;
x_5, the diameter of the tree sampled at the center of the square;
x_6, the index of the settlement density;
x_7, the orientation of the square (from 1 if southbound to 2 otherwise);
x_8, the height (in meters) of the dominant tree;
x_9, the number of vegetation strata;
x_10, the mix settlement index (from 1 if not mixed to 2 if mixed).

55 Caterpillar Regression Problem. [Figure: semilog-y plots of the data (x_i, y), i = 1, ..., 9, one panel per explanatory variable x_1 through x_9.]

56 Bayesian Variable Selection Example. Top five most likely models for the selection models discussed. Results from: Statistical Computing and MC Methods, A. Doucet.


The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

More information

Sampling Methods (11/30/04)

Sampling Methods (11/30/04) CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Learning the hyper-parameters. Luca Martino

Learning the hyper-parameters. Luca Martino Learning the hyper-parameters Luca Martino 2017 2017 1 / 28 Parameters and hyper-parameters 1. All the described methods depend on some choice of hyper-parameters... 2. For instance, do you recall λ (bandwidth

More information

MCMC Methods: Gibbs and Metropolis

MCMC Methods: Gibbs and Metropolis MCMC Methods: Gibbs and Metropolis Patrick Breheny February 28 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/30 Introduction As we have seen, the ability to sample from the posterior distribution

More information

Hierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31

Hierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31 Hierarchical models Dr. Jarad Niemi Iowa State University August 31, 2017 Jarad Niemi (Iowa State) Hierarchical models August 31, 2017 1 / 31 Normal hierarchical model Let Y ig N(θ g, σ 2 ) for i = 1,...,

More information

Approximate Bayesian Computation: a simulation based approach to inference

Approximate Bayesian Computation: a simulation based approach to inference Approximate Bayesian Computation: a simulation based approach to inference Richard Wilkinson Simon Tavaré 2 Department of Probability and Statistics University of Sheffield 2 Department of Applied Mathematics

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Caterpillar Regression Example: Conjugate Priors, Conditional & Marginal Posteriors, Predictive Distribution, Variable Selection

Caterpillar Regression Example: Conjugate Priors, Conditional & Marginal Posteriors, Predictive Distribution, Variable Selection Caterpillar Regression Example: Conjugate Priors, Conditional & Marginal Posteriors, Predictive Distribution, Variable Selection Prof. Nicholas Zabaras University of Notre Dame Notre Dame, IN, USA Email:

More information

Session 3A: Markov chain Monte Carlo (MCMC)

Session 3A: Markov chain Monte Carlo (MCMC) Session 3A: Markov chain Monte Carlo (MCMC) John Geweke Bayesian Econometrics and its Applications August 15, 2012 ohn Geweke Bayesian Econometrics and its Session Applications 3A: Markov () chain Monte

More information

Computer intensive statistical methods

Computer intensive statistical methods Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of

More information

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced

More information

BAYESIAN MODEL CRITICISM

BAYESIAN MODEL CRITICISM Monte via Chib s BAYESIAN MODEL CRITICM Hedibert Freitas Lopes The University of Chicago Booth School of Business 5807 South Woodlawn Avenue, Chicago, IL 60637 http://faculty.chicagobooth.edu/hedibert.lopes

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

16 : Approximate Inference: Markov Chain Monte Carlo

16 : Approximate Inference: Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution

More information

Bayesian Nonparametric Regression for Diabetes Deaths

Bayesian Nonparametric Regression for Diabetes Deaths Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,

More information

10. Exchangeability and hierarchical models Objective. Recommended reading

10. Exchangeability and hierarchical models Objective. Recommended reading 10. Exchangeability and hierarchical models Objective Introduce exchangeability and its relation to Bayesian hierarchical models. Show how to fit such models using fully and empirical Bayesian methods.

More information

Approximate Inference using MCMC

Approximate Inference using MCMC Approximate Inference using MCMC 9.520 Class 22 Ruslan Salakhutdinov BCS and CSAIL, MIT 1 Plan 1. Introduction/Notation. 2. Examples of successful Bayesian models. 3. Basic Sampling Algorithms. 4. Markov

More information

Robert Collins CSE586, PSU Intro to Sampling Methods

Robert Collins CSE586, PSU Intro to Sampling Methods Robert Collins Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Robert Collins A Brief Overview of Sampling Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo 1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior

More information

What is the most likely year in which the change occurred? Did the rate of disasters increase or decrease after the change-point?

What is the most likely year in which the change occurred? Did the rate of disasters increase or decrease after the change-point? Chapter 11 Markov Chain Monte Carlo Methods 11.1 Introduction In many applications of statistical modeling, the data analyst would like to use a more complex model for a data set, but is forced to resort

More information

An introduction to Sequential Monte Carlo

An introduction to Sequential Monte Carlo An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Markov chain Monte Carlo Lecture 9

Markov chain Monte Carlo Lecture 9 Markov chain Monte Carlo Lecture 9 David Sontag New York University Slides adapted from Eric Xing and Qirong Ho (CMU) Limitations of Monte Carlo Direct (unconditional) sampling Hard to get rare events

More information

Bayesian Prediction of Code Output. ASA Albuquerque Chapter Short Course October 2014

Bayesian Prediction of Code Output. ASA Albuquerque Chapter Short Course October 2014 Bayesian Prediction of Code Output ASA Albuquerque Chapter Short Course October 2014 Abstract This presentation summarizes Bayesian prediction methodology for the Gaussian process (GP) surrogate representation

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

Lecture 8: The Metropolis-Hastings Algorithm

Lecture 8: The Metropolis-Hastings Algorithm 30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:

More information

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE

More information

Monte Carlo methods for sampling-based Stochastic Optimization

Monte Carlo methods for sampling-based Stochastic Optimization Monte Carlo methods for sampling-based Stochastic Optimization Gersende FORT LTCI CNRS & Telecom ParisTech Paris, France Joint works with B. Jourdain, T. Lelièvre, G. Stoltz from ENPC and E. Kuhn from

More information

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006

Data Analysis I. Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK. 10 lectures, beginning October 2006 Astronomical p( y x, I) p( x, I) p ( x y, I) = p( y, I) Data Analysis I Dr Martin Hendry, Dept of Physics and Astronomy University of Glasgow, UK 10 lectures, beginning October 2006 4. Monte Carlo Methods

More information

Lecture 6: Markov Chain Monte Carlo

Lecture 6: Markov Chain Monte Carlo Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline

More information

The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland),

The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland), The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland), Geoff Nicholls (Statistics, Oxford) fox@math.auckland.ac.nz

More information

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

SAMPLING ALGORITHMS. In general. Inference in Bayesian models SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be

More information

Control Variates for Markov Chain Monte Carlo

Control Variates for Markov Chain Monte Carlo Control Variates for Markov Chain Monte Carlo Dellaportas, P., Kontoyiannis, I., and Tsourti, Z. Dept of Statistics, AUEB Dept of Informatics, AUEB 1st Greek Stochastics Meeting Monte Carlo: Probability

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information

Monte Carlo Methods in Bayesian Inference: Theory, Methods and Applications

Monte Carlo Methods in Bayesian Inference: Theory, Methods and Applications University of Arkansas, Fayetteville ScholarWorks@UARK Theses and Dissertations 1-016 Monte Carlo Methods in Bayesian Inference: Theory, Methods and Applications Huarui Zhang University of Arkansas, Fayetteville

More information

On Markov chain Monte Carlo methods for tall data

On Markov chain Monte Carlo methods for tall data On Markov chain Monte Carlo methods for tall data Remi Bardenet, Arnaud Doucet, Chris Holmes Paper review by: David Carlson October 29, 2016 Introduction Many data sets in machine learning and computational

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

Brief introduction to Markov Chain Monte Carlo

Brief introduction to Markov Chain Monte Carlo Brief introduction to Department of Probability and Mathematical Statistics seminar Stochastic modeling in economics and finance November 7, 2011 Brief introduction to Content 1 and motivation Classical

More information

Surveying the Characteristics of Population Monte Carlo

Surveying the Characteristics of Population Monte Carlo International Research Journal of Applied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 7 (9): 522-527 Science Explorer Publications Surveying the Characteristics of

More information

Advanced Statistical Modelling

Advanced Statistical Modelling Markov chain Monte Carlo (MCMC) Methods and Their Applications in Bayesian Statistics School of Technology and Business Studies/Statistics Dalarna University Borlänge, Sweden. Feb. 05, 2014. Outlines 1

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 4 Problem: Density Estimation We have observed data, y 1,..., y n, drawn independently from some unknown

More information

MCMC Sampling for Bayesian Inference using L1-type Priors

MCMC Sampling for Bayesian Inference using L1-type Priors MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling

More information

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Andrew Gordon Wilson www.cs.cmu.edu/~andrewgw Carnegie Mellon University March 18, 2015 1 / 45 Resources and Attribution Image credits,

More information

Computer Practical: Metropolis-Hastings-based MCMC

Computer Practical: Metropolis-Hastings-based MCMC Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Bagging During Markov Chain Monte Carlo for Smoother Predictions

Bagging During Markov Chain Monte Carlo for Smoother Predictions Bagging During Markov Chain Monte Carlo for Smoother Predictions Herbert K. H. Lee University of California, Santa Cruz Abstract: Making good predictions from noisy data is a challenging problem. Methods

More information

Bayesian data analysis in practice: Three simple examples

Bayesian data analysis in practice: Three simple examples Bayesian data analysis in practice: Three simple examples Martin P. Tingley Introduction These notes cover three examples I presented at Climatea on 5 October 0. Matlab code is available by request to

More information

Results: MCMC Dancers, q=10, n=500

Results: MCMC Dancers, q=10, n=500 Motivation Sampling Methods for Bayesian Inference How to track many INTERACTING targets? A Tutorial Frank Dellaert Results: MCMC Dancers, q=10, n=500 1 Probabilistic Topological Maps Results Real-Time

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Sequential Monte Carlo Methods for Bayesian Computation

Sequential Monte Carlo Methods for Bayesian Computation Sequential Monte Carlo Methods for Bayesian Computation A. Doucet Kyoto Sept. 2012 A. Doucet (MLSS Sept. 2012) Sept. 2012 1 / 136 Motivating Example 1: Generic Bayesian Model Let X be a vector parameter

More information

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016 Spatial Statistics Spatial Examples More Spatial Statistics with Image Analysis Johan Lindström 1 1 Mathematical Statistics Centre for Mathematical Sciences Lund University Lund October 6, 2016 Johan Lindström

More information

Sequential Monte Carlo Methods

Sequential Monte Carlo Methods University of Pennsylvania Bradley Visitor Lectures October 23, 2017 Introduction Unfortunately, standard MCMC can be inaccurate, especially in medium and large-scale DSGE models: disentangling importance

More information

Calibration of Stochastic Volatility Models using Particle Markov Chain Monte Carlo Methods

Calibration of Stochastic Volatility Models using Particle Markov Chain Monte Carlo Methods Calibration of Stochastic Volatility Models using Particle Markov Chain Monte Carlo Methods Jonas Hallgren 1 1 Department of Mathematics KTH Royal Institute of Technology Stockholm, Sweden BFS 2012 June

More information

Introduction to Bayesian methods in inverse problems

Introduction to Bayesian methods in inverse problems Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J.

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J. Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox fox@physics.otago.ac.nz Richard A. Norton, J. Andrés Christen Topics... Backstory (?) Sampling in linear-gaussian hierarchical

More information