Bayesian Inference and Decision Theory

1 Bayesian Inference and Decision Theory
Instructor: Kathryn Blackmond Laskey
Room 2214 ENGR (703)
Office Hours: Tuesday and Thursday 4:30-5:30 PM, or by appointment
Spring 2018
Unit 6: Gibbs Sampling

2 Learning Objectives for Unit 6
- Describe how Gibbs sampling works
- Implement a simple Gibbs sampler
- Use JAGS to perform Gibbs sampling
- Estimate posterior quantities from the output of a Gibbs sampler for the posterior distribution
- Describe some MCMC diagnostics
- Apply diagnostics to assess adequacy of MCMC sampler output

3 Review: Steps in Bayesian Data Analysis
1. Determine the question: We are concerned with understanding the process by which a data set x = x1, ..., xn was generated
2. Specify the likelihood: f(x | θ) expresses the probability distribution of the data conditional on the parameter
3. Specify the prior distribution: g(θ) represents beliefs about the parameter prior to seeing the observations x
4. Find the (exact or approximate) posterior distribution: For a Bayesian, the posterior distribution is everything needed to draw conclusions about θ. Once we have specified the likelihood and the prior, the posterior distribution is completely determined. Approximation is needed when the posterior distribution is intractable.
5. Summarize the posterior distribution and draw conclusions: We report posterior summaries such as the mean, credible intervals, or predictive probabilities. Summaries are chosen to address the original question. We also do analyses to check model adequacy.

4 Step 4: Find / Approximate the Posterior Distribution
- When the prior and likelihood form a conjugate pair, we have a closed-form expression for the posterior distribution and many posterior quantities
- There is no closed-form expression for some posterior quantities
  - Example: difference in defect rates for two plants, where the defect rates are independent Gamma random variables
  - Sometimes we can estimate these quantities using direct Monte Carlo
- For many interesting problems no exact posterior distribution can be found
  - We cannot use direct Monte Carlo
  - We need another way to approximate the posterior distribution
- Markov chain Monte Carlo (MCMC) is a class of methods for taking correlated (not iid) draws from the posterior distribution
  - MCMC can be applied to many problems for which direct Monte Carlo cannot be used
  - Gibbs sampling is the simplest MCMC method

5 Example: Normal Random Variable with Independent Mean and Precision
- Problem: infer the mean and precision of normal data
- In Unit 5 we used the normal-gamma conjugate prior
  - Prior knowledge about the mean Θ and precision Ρ are dependent: the greater the precision of an observation, the more sure we are about the prior mean
  - This might not be a faithful representation of our prior information
- Consider a prior distribution in which Θ and Ρ are independent a priori and
  - Ρ has a gamma distribution with shape α and scale β
  - Θ has a normal distribution with mean µ and standard deviation τ
- This is not a conjugate distribution: there is no closed-form expression for the posterior distribution
[Figure: graphical models for the normal-gamma conjugate prior and the independent normal and gamma priors, with sufficient statistics X̄ = (1/n) Σᵢ xᵢ and Y = Σᵢ (xᵢ − x̄)²]

6 A Semi-Conjugate Prior Distribution
- A prior distribution for two (or more) parameters is semi-conjugate if the prior distribution for each parameter given the others is conjugate
- The independent normal and gamma prior distribution is semi-conjugate
- Observations: X1, ..., Xn | Θ, Ρ ~ Normal(Θ, Ρ^(-1/2))
- Distribution for Θ given Ρ = ρ and X1:n:
  - Prior distribution: Θ ~ Normal(µ, τ) is independent of Ρ
  - Posterior distribution: Θ | Ρ=ρ, X1:n ~ Normal(µ*, τ*)  (see Unit 5)
    µ* = (µ/τ² + ρ Σᵢ xᵢ) / (1/τ² + nρ)
    τ* = (1/τ² + nρ)^(-1/2)
- Distribution for Ρ given Θ = θ and X1:n:
  - Prior distribution: Ρ ~ Gamma(α, β) is independent of Θ
  - Posterior distribution: Ρ | Θ=θ, X1:n ~ Gamma(α*, β*)  (see next page for derivation)
    α* = α + n/2
    β* = (β⁻¹ + ½ Σᵢ (xᵢ − θ)²)⁻¹

7 Semi-Conjugate Prior Distribution: Details
- The distribution for Θ given Ρ = ρ and X1:n is just the case of known standard deviation from Unit 5:
  Θ | Ρ=ρ, X1:n ~ Normal(µ*, τ*), with
  µ* = (µ/τ² + ρ Σᵢ xᵢ) / (1/τ² + nρ)
  τ* = (1/τ² + nρ)^(-1/2)
- We find the distribution for Ρ given Θ = θ and X1:n by considering the limiting case of the normal-gamma distribution as the precision multiplier k tends to infinity (i.e., the mean has infinite precision a priori)
- If X1:n are iid Normal(Θ, Ρ^(-1/2)) and the prior distribution for (Θ, Ρ) is Normal-Gamma(µ, k, α, β), then the posterior distribution for Ρ is Gamma(α*, β*) with
  α* = α + n/2
  β* = ( β⁻¹ + ½ Σᵢ (xᵢ − x̄)² + (kn / (2(k+n))) (x̄ − µ)² )⁻¹
     → ( β⁻¹ + ½ Σᵢ (xᵢ − x̄)² + (n/2) (x̄ − µ)² )⁻¹  as k → ∞
     = ( β⁻¹ + ½ Σᵢ (xᵢ − µ)² )⁻¹
     = ( β⁻¹ + ½ Σᵢ (xᵢ − θ)² )⁻¹
- Conditioning on Θ = θ and X1:n means assuming that Θ = θ is known with infinite precision (k → ∞) to be equal to the prior mean µ

8 Approximating the Posterior Distribution for a Semi-Conjugate Prior
- We have no closed-form expression for the posterior distribution of (Θ, Ρ) given X1:n
  - We cannot approximate it with direct Monte Carlo
- We can approximate it using Gibbs sampling (a minimal R sketch follows this slide):
  - INITIALIZE: Choose arbitrary initial parameter values θ(0), ρ(0)
  - SAMPLE: For k = 1, ..., M
    - Sample θ(k) from g(θ | ρ(k-1), x)
    - Sample ρ(k) from g(ρ | θ(k), x)
- Facts about Gibbs sampling:
  - Successive draws are correlated: (θ(k), ρ(k)) depends on (θ(k-1), ρ(k-1))
  - The sequence (θ(1), ρ(1)), (θ(2), ρ(2)), ... is a Markov chain
  - This Markov chain has a unique stationary distribution equal to the posterior distribution of (Θ, Ρ) given X1:n
  - We can use the samples (θ(1), ρ(1)), (θ(2), ρ(2)), ... to approximate posterior quantities of interest
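As a concrete illustration, here is a minimal R sketch of this Gibbs sampler for the semi-conjugate normal model, using the full conditionals from slide 6. The function name, the starting values, and the hyperparameter arguments (mu0, tau0 for the normal prior on the mean; alpha0, beta0 for the gamma prior on the precision, with beta0 a scale parameter) are illustrative choices, not the course's code.

# Gibbs sampler for the semi-conjugate normal model (illustrative sketch).
gibbs_normal <- function(x, mu0, tau0, alpha0, beta0, M = 10000) {
  n <- length(x)
  theta <- numeric(M); rho <- numeric(M)
  th <- mean(x); rh <- 1 / var(x)          # arbitrary starting values
  for (k in 1:M) {
    # Full conditional for the mean (slide 6): Normal(mu.star, tau.star)
    prec.star <- 1 / tau0^2 + n * rh
    mu.star   <- (mu0 / tau0^2 + rh * sum(x)) / prec.star
    th <- rnorm(1, mean = mu.star, sd = 1 / sqrt(prec.star))
    # Full conditional for the precision (slide 6): Gamma(alpha.star, scale = beta.star)
    alpha.star <- alpha0 + n / 2
    beta.star  <- 1 / (1 / beta0 + 0.5 * sum((x - th)^2))
    rh <- rgamma(1, shape = alpha.star, scale = beta.star)
    theta[k] <- th; rho[k] <- rh
  }
  data.frame(theta = theta, rho = rho, sigma = 1 / sqrt(rho))
}

Calling it on the reaction-time data, e.g. out <- gibbs_normal(x, mu0 = 0, tau0 = 10, alpha0 = 1, beta0 = 1), gives draws whose quantiles, e.g. quantile(out$theta, c(0.025, 0.975)), approximate posterior credible intervals; the numerical results depend on the prior hyperparameters chosen.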

9 Review: Markov Chain
- A Markov chain (of order 1) is a sequence of random variables X1, X2, ... such that Xi is independent of all lower-numbered Xj (j < i−1) given Xi−1
  - The Xi can be univariate or multivariate
  - Pr(Xi | X1, X2, ..., Xi−1) = Pr(Xi | Xi−1)
- In an order-k Markov chain, Xi is independent of all lower-numbered Xj (j < i−k) given Xi−1, ..., Xi−k
- Under fairly general conditions a Markov chain has a unique stationary distribution π(x)
  - If Xi has distribution π(x) then so does Xi+1
[Figure: chain graph X1 → X2 → X3 → X4]

10 Example of a Markov Chain
- States: Cold, Exposed, Healthy
- Allowable transitions:
  - Cold → Cold (p=0.12); Cold → Healthy (p=0.88)
  - Exposed → Cold (p=0.75); Exposed → Healthy (p=0.25)
  - Healthy → Exposed (p=0.12); Healthy → Healthy (p=0.88)
- Unique stationary distribution: Pst(Cold), Pst(Exposed), Pst(Healthy) (values computed in the sketch below)
- All initial distributions evolve to the stationary distribution
[Figure: bar charts of the Cold/Exposed/Healthy distribution at time steps 1-5, starting from three different initial distributions]
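The stationary distribution can be computed numerically from the transition probabilities listed above; the R sketch below does this by evolving an arbitrary initial distribution forward. The matrix name, the state ordering, and the rounded values in the final comment come from this illustration, not from the original slide.

# Stationary distribution of the Cold/Exposed/Healthy chain (illustrative sketch).
# Rows of P are the current state, columns the next state; order is (Cold, Exposed, Healthy).
P <- matrix(c(0.12, 0.00, 0.88,
              0.75, 0.00, 0.25,
              0.00, 0.12, 0.88),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("Cold", "Exposed", "Healthy"),
                            c("Cold", "Exposed", "Healthy")))

pi0 <- c(1, 0, 0)                   # start everyone in the Cold state
for (t in 1:200) pi0 <- pi0 %*% P   # evolve the distribution forward in time
round(drop(pi0), 3)                 # approximately (0.084, 0.098, 0.818)

Starting from (0, 0, 1) or any other initial distribution gives the same limit, which is the point of the slide.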

11 Markov Chain of Gibbs Samples for (θ, ρ)
- INITIALIZE: Choose arbitrary initial parameter values θ(0), ρ(0)
- SAMPLE: For k = 1, ..., M
  - Sample θ(k) from g(θ | ρ(k-1), X=x), the posterior distribution of Θ given ρ(k-1) and X=x
    - Normal with mean (µ/τ² + ρ(k-1) Σ xᵢ) / (1/τ² + nρ(k-1)) and precision (1/τ² + nρ(k-1))
  - Sample ρ(k) from g(ρ | θ(k), X=x), the posterior distribution of Ρ given θ(k) and X=x
    - Gamma with shape α + n/2 and scale (β⁻¹ + ½ Σ (xᵢ − θ(k))²)⁻¹
- This process gives a Markov chain with states (θ(k), ρ(k))
  - (θ(k), ρ(k)) is independent of the past given (θ(k-1), ρ(k-1))
[Figure: dependency diagram linking (θ(0), ρ(0)) → (θ(1), ρ(1)) → (θ(2), ρ(2)) → (θ(3), ρ(3)), with θ(k) drawn from g(θ | ρ(k-1), X=x) and ρ(k) drawn from g(ρ | θ(k), X=x)]

12 Reaction Time Example
- We analyzed a data set of reaction times in Unit 5 using a noninformative conjugate normal-gamma distribution
  - g(θ, ρ) ∝ ρ⁻¹, i.e., Normal-Gamma(µ, k, α, β) with µ = 0, k = 0, α = −1/2, β = ∞
- Although we can find the posterior distribution exactly, we will use Gibbs sampling to illustrate the method
- Posterior distribution of Θ given Ρ=ρ and x1, ..., xn is normal with
  - Mean µ* = x̄ = 5.73 and precision ρ* = nρ
  - (posterior distribution from the normal conjugate prior with known variance and an uninformative prior on the mean, i.e., k = 0)
- Posterior distribution of Ρ given Θ=θ and x1, ..., xn is gamma with
  - Shape α* = n/2 − ½
  - Scale β* = (½ Σᵢ (xᵢ − θ)²)⁻¹
  - (known-mean formula from page 6 with β = ∞)
- We sample repeatedly from these distributions to find the Gibbs sampling estimate of the posterior distribution

13 Results: Gibbs Sampling for Reaction Times with Semi-Conjugate Prior
- 10,000 samples were drawn from the Gibbs sampler for the posterior distribution given the 30 reaction time observations:
  - 95% credible interval for Θ: [5.68, 5.78]
  - 95% credible interval for Σ: [0.102, 0.171]
[Figure: kernel density plots for the marginal posterior densities g(θ | x) and g(σ | x)]
- R code is available on Blackboard

14 Scatterplots for Gibbs Sampler Output
- Scatterplots for 10,000 samples from the Gibbs sampler for the posterior distribution given 30 observations on the first non-schizophrenic subject
  - Left: joint distribution of mean and precision
  - Right: joint distribution of mean and standard deviation
[Figure: two scatterplots of the Gibbs sampler output]

15 Comparison: Exact, Direct MC and Gibbs for Normal Model with Conjugate Prior
- The Unit 5 example used a normal-gamma (µ=0, k=0, α=−0.5, β=∞) conjugate prior distribution for (Θ, Ρ)
- The exact posterior distribution for (Θ, Ρ) given X is a normal-gamma (µ*=5.73, k*=30, α*=14.5, β*=4.30) distribution
  - Marginal distribution of Ρ is Gamma(α*, β*)
  - Marginal density for Σ can be found with a bit of calculus (see next page)
  - Marginal distribution of Θ is nonstandard t with center µ* and spread (k*α*β*)^(-1/2)
- We can approximate this distribution by simulating iid normal-gamma (µ*, k*, α*, β*) observations:
  - Simulate ρm from a Gamma(α*, β*) distribution, and simulate θm from a Normal(µ*, (k*ρm)^(-1/2)) distribution
- In this unit we approximated this distribution by Gibbs sampling:
  - Sample θm from a normal distribution with mean µ* and standard deviation (k*ρm-1)^(-1/2)
  - Sample ρm from a gamma distribution with shape α* and scale (β⁻¹ + ½ Σᵢ (xᵢ − θm)²)⁻¹
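As a sketch of the direct Monte Carlo approach described on this slide, the R code below draws iid (θ, ρ) pairs from the normal-gamma posterior using the hyperparameter values given above; β* is treated as a scale parameter, matching the course's Gamma parameterization, and the variable names are illustrative.

# Direct Monte Carlo from the normal-gamma posterior (illustrative sketch).
mu.star <- 5.73; k.star <- 30; alpha.star <- 14.5; beta.star <- 4.30
M <- 10000
rho.mc   <- rgamma(M, shape = alpha.star, scale = beta.star)           # precision draws
theta.mc <- rnorm(M, mean = mu.star, sd = 1 / sqrt(k.star * rho.mc))   # mean draws
sigma.mc <- 1 / sqrt(rho.mc)                                           # standard deviation
quantile(theta.mc, c(0.025, 0.975))   # approximate 95% interval for the mean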

16 Comparison of Posterior Density Estimates
- Plot compares exact and approximate posterior density functions for the mean of the reaction time distribution
  - Dashed green line shows the kernel density estimate from 10,000 direct Monte Carlo samples
  - Dotted blue line shows the kernel density estimate from 10,000 Gibbs samples
  - Solid red line shows the posterior t density with center µ* = 5.732, spread 1/(k*α*β*)^(1/2), and degrees of freedom 2α* = 29
- R code is available on Blackboard
[Figure: overlaid posterior density estimates for Θ — direct MC kernel density, Gibbs kernel density, and theoretical t density]

17 Gibbs Sampling in General
- Suppose we wish to estimate g(y | x) = g(y1, y2, ..., yp | x)
- Sometimes we cannot sample directly from g(y | x), but we can sample from each of the full conditional distributions g(yi | y1, ..., yi-1, yi+1, ..., yp, x)
- In such a case, we can apply Gibbs sampling as follows (see the generic sketch after this slide):
  - INITIALIZE: Choose initial parameter values y1(0), y2(0), ..., yp(0)
  - SAMPLE: For m = 1, ..., M
    - Sample y1(m) from g(y1 | y2(m-1), ..., yp(m-1), x)
    - Sample y2(m) from g(y2 | y1(m), y3(m-1), ..., yp(m-1), x)
    - ...
    - Sample yi(m) from g(yi | y1(m), ..., yi-1(m), yi+1(m-1), ..., yp(m-1), x)
    - ...
    - Sample yp(m) from g(yp | y1(m), ..., yp-1(m), x)
- This sampling process is a Markov chain, because the distribution of y1(m), ..., yp(m) is independent of the past given y1(m-1), ..., yp(m-1)
- Under fairly general conditions g(y | x) is the unique stationary distribution
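The cycle above can be written once as a generic routine that is handed the full-conditional samplers. The sketch below is illustrative only: 'conditionals' is assumed to be a list of p functions, each returning a single draw of its component given the current values of the others and the data.

# Generic Gibbs sampler skeleton (illustrative).
# conditionals[[i]](y, x) must return one draw of component i from its full
# conditional, given the current vector y and the data x.
gibbs <- function(conditionals, y0, x, M = 1000) {
  p <- length(y0)
  out <- matrix(NA_real_, nrow = M, ncol = p)
  y <- y0
  for (m in 1:M) {
    for (i in 1:p) {
      y[i] <- conditionals[[i]](y, x)   # update component i given all the others
    }
    out[m, ] <- y
  }
  out
}

For the semi-conjugate normal model, the list would contain the normal and gamma full-conditional samplers from slide 6.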

18 Markov Chain Monte Carlo
- General-purpose class of Monte Carlo algorithms originating in statistical physics
- Often applied to problems for which exact computation of the posterior distribution is intractable
- Goal: estimate a target distribution by Monte Carlo sampling
- Method:
  - Construct a Markov chain with a unique stationary distribution equal to the target distribution P(X)
  - Sample from this Markov chain
  - Estimate P(X) by the frequency of X in the sample (often discarding a burn-in period)
- Remarks:
  - MCMC takes correlated draws from a distribution constructed to have the target distribution as its stationary distribution
  - We use MCMC when we cannot take iid draws from the target distribution
  - The most common MCMC samplers are the Gibbs sampler (this unit) and the Metropolis-Hastings sampler (a generalization of the Gibbs sampler we will study later)

19 MCMC Computation
- MCMC is an active area of research and is widely used in applications
- Software for doing MCMC is available for free use
  - R packages
  - MATLAB tools
  - Python tools
  - Bayesian Inference Using Gibbs Sampling (BUGS): WinBUGS, JAGS
  - Stan
- Many people write custom code for specific applications

20 BUGS: Bayesian Inference Using Gibbs Sampling
- BUGS is:
  - A high-level language for defining Bayesian models
  - A library of sampling routines
  - An interface for running the sampler
  - An output processor for processing and interpreting results
- BUGS is intended to free the modeler to focus on the problem without worrying about details of inference implementation
- Incarnations of BUGS:
  - Classic BUGS: developed 1995, cross-platform, not maintained
  - WinBUGS: Windows-only GUI, creates coda files for input to R, latest news on blog is dated 2012
  - JAGS: cross-platform, interfaces directly to R with rjags and R2jags
- We will focus on JAGS because it is cross-platform, can be called from R, and is currently being maintained

21 Quick Guide to Installing and Running JAGS
- JAGS runs on Linux, Mac, and Windows and interfaces with R through the rjags and R2jags packages
- To install JAGS and set it up to be used from R:
  - If necessary, download and install R, and potentially a user interface to R like RStudio (see here for tips on getting started with R)
  - Download and install JAGS as per operating system requirements
  - Install additional R packages, e.g.:
    - rjags to interface with JAGS
    - R2jags to call JAGS from R (depends on rjags)
    - coda to process MCMC output
    - superdiag for MCMC convergence diagnostics
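The R-side setup amounts to a single install.packages call; note that the JAGS program itself must still be installed separately with its own installer.

# Install the R packages used in this unit (JAGS itself is installed separately).
install.packages(c("rjags", "R2jags", "coda", "superdiag"))
library(rjags)   # loading rjags reports an error if JAGS itself is not installed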

22 JAGS Example: Reaction Times (1 of 2)
1. Specify the model in the BUGS language and save it as a .jags file
2. Run the model from R
[Figure: the model file and the R code that runs it]
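The course's model file and R script are not reproduced in this transcript. The following is a minimal sketch of what steps 1 and 2 could look like with rjags, assuming a data vector x of log reaction times; the vague priors and variable names are illustrative choices, not the course's model.

# Minimal rjags sketch for the reaction-time model (illustrative, not the course file).
library(rjags)

model_string <- "
model {
  for (i in 1:n) {
    x[i] ~ dnorm(theta, rho)        # JAGS parameterizes the normal by precision
  }
  theta ~ dnorm(0, 1.0E-6)          # vague normal prior on the mean
  rho   ~ dgamma(0.001, 0.001)      # vague gamma prior on the precision
  sigma <- 1 / sqrt(rho)            # report the standard deviation as well
}"

jm <- jags.model(textConnection(model_string),
                 data = list(x = x, n = length(x)), n.chains = 3)
update(jm, 1000)                                      # burn-in
samp <- coda.samples(jm, c("theta", "sigma"), n.iter = 10000)
summary(samp)

Writing the model text to a .jags file and passing the file name to jags.model works the same way as the textConnection used here.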

23 JAGS Example: Reaction Times (2 of 2)
3. Analyze the output using the coda package
- It is common to thin the chain by keeping only every kth observation. This reduces the serial correlation.
- Summary table: Deviance = −2 log p(y | θ, ρ) is a measure of how well the observations fit the model
[Figure: coda summary table for the fitted model]
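Continuing the sketch above, thinning and summarizing with coda might look like this; the thinning interval of 10 is an arbitrary choice.

# Post-process the coda output 'samp' from the sketch on the previous slide.
library(coda)
thinned <- window(samp, thin = 10)   # keep every 10th draw to reduce serial correlation
summary(thinned)                     # posterior means, standard deviations, quantiles
effectiveSize(samp)                  # effective sample size of the unthinned chains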

24 Traceplots
[Figure: traceplots of the deviance and the model parameters]
- Deviance: D(θ, ρ) = −2 log P(x | θ, ρ)
  - Measures how well the model fits
  - Used to compare models

25 Kernel Density Plots
[Figure: kernel density plots of the posterior samples]

26 MCMC Diagnostics
- Direct Monte Carlo generates iid samples from the distribution of interest
- MCMC generates a Markov chain in which successive realizations are correlated
- When the target distribution is multimodal, an MCMC sampler can get stuck in regions near a local mode, yielding a poor approximation to the target distribution
- MCMC diagnostics help us to identify problems with the sampler and to assess whether we have collected enough samples to get a good approximation to the target distribution
- Some MCMC diagnostics:
  - Traceplot: plots a parameter against iteration number to help diagnose whether the sampler is getting stuck in a local region
  - Sample autocorrelation function (acf): evaluates correlation between elements of the sequence as a function of the time separation
  - Effective sample size: uses the acf to estimate the number of independent MC draws needed to achieve the same precision as the MCMC samples
  - Convergence diagnostics: run parallel MCMC chains and evaluate convergence using within- and between-chain variance

27 Example: Weaver Ants
- Body lengths of weaver ant workers show a bimodal distribution
  - Minor workers are a little more than half the size of major workers
  - There is very little overlap in the size distributions
- A mixture of two normal distributions provides a good model for the body length data
  - Minor workers (36%): mean 4.8 mm, std dev 0.36 mm
  - Major workers (64%): mean 7.7 mm, std dev 0.61 mm
[Figure: mixture density and sample frequencies versus body length (mm)]
- Weber, N.A. (1946). Dimorphism in the African Oecophylla worker and an anomaly (Hym.: Formicidae). Annals of the Entomological Society of America 39.

28 R Code for Ant Mixture Density Plot
[Figure: graphical model Z (ant type) → X (ant length)]

# Ants example
# Data from Weber, 1946, "Dimorphism in the African Oecophylla Worker and an Anomaly"
lengthcounts <- c(8,41,52,7,6,11,32,56,59,23,5)
lengths <- c(4.0,4.5,5.0,5.5,6.0,6.5,7.0,7.5,8.0,8.5,9.0)

# Mixture model
mu1 <- 4.8
sd1 <- 0.36
mu2 <- 7.7
sd2 <- 0.61
pr1 <- 0.36
pr2 <- (1-pr1)

xvals <- 350:1100/100
mixdens <- pr1*dnorm(xvals,mu1,sd1)+pr2*dnorm(xvals,mu2,sd2)

# Sample Frequencies and Mixture Density Plot
plot(xvals,mixdens,col="red",type="l",main="",ylab="",xlab="Body Length (mm)")
lines(lengths,lengthcounts/(300*.5),col="darkcyan",type="h",lwd=3)
legend(8.2,0.42,c("Mixture Density","Sample Frequency"),col=c("red","darkcyan"),lty=c(1,1),lwd=c(1,3))

29 Direct Monte Carlo for Ant Lengths
- The distribution for ant lengths can be simulated directly:
  - Simulate z = 1 with probability 0.36 and z = 2 with probability 0.64
  - If z = 1, simulate the length from Normal(4.8, 0.36)
  - If z = 2, simulate the length from Normal(7.7, 0.61)
- This produces an iid sample of ant lengths
[Figure: histogram and trace plot of 1000 direct MC simulated body lengths (mm)]

30 R Code for Direct MC

# Direct Monte Carlo
numSim <- 1000           # number of simulated ant lengths
xd <- NULL
zd <- NULL
for (i in 1:numSim) {
  zval <- rbinom(1,1,pr1)
  if (zval==1) zd[i] <- 1 else zd[i] <- 2
  if (zval==1) xd[i] <- rnorm(1,mu1,sd1) else xd[i] <- rnorm(1,mu2,sd2)
}

# Trace plot and Histogram
library(lattice)         # provides histogram()
plot(1:numSim,xd,main="",ylab="Simulated Body Length (mm)",xlab="Iteration")
histogram(xd,xlab="Direct MC Simulated Body Length (mm)",ylab="Frequency")

31 Gibbs Sampling for Ant Lengths
- This example illustrates problems that can occur with MCMC (you do not want to do this!)
- Gibbs sampling to simulate the ant length distribution:
  - Initialize length x0
  - For each k:
    - Calculate L1 = f(xk-1 | 4.8, 0.36) (normal density with mean 4.8, sd 0.36)
    - Calculate L2 = f(xk-1 | 7.7, 0.61) (normal density with mean 7.7, sd 0.61)
    - Calculate p1 = 0.36 L1 / (0.36 L1 + 0.64 L2)
    - Simulate zk = 1 or 2, with probabilities p1 and 1−p1 respectively
    - If zk = 1, simulate xk from Normal(4.8, 0.36)
    - If zk = 2, simulate xk from Normal(7.7, 0.61)
- This produces a sample of ant lengths
  - Consecutive observations are correlated
  - This is a Markov chain with stationary distribution equal to the target mixture distribution
  - Due to correlation between successive samples, this is a very inefficient way to simulate samples from the target distribution

32 Gibbs Sampling Results (1000 Samples)
[Figure: histogram and trace plot of 1000 Gibbs samples of simulated body length (mm)]

33 Gibbs Sampling Results (10,000 Samples)
[Figure: histogram and trace plot of 10,000 Gibbs samples of simulated body length (mm)]

34 Comparison: Gibbs Sampling for Parameters of Reaction Time Distribution
[Figure: histogram and trace plot of 10,000 Gibbs samples of the mean log reaction time]

35 R Code for Gibbs Sampling

# Gibbs sampling
xval <- mu1              # starting value
xg <- NULL
zg <- NULL
for (i in 1:numSim) {
  likz1 <- pr1*dnorm(xval,mu1,sd1)
  likz2 <- pr2*dnorm(xval,mu2,sd2)
  prz1 <- likz1/(likz1+likz2)
  zval <- rbinom(1,1,prz1)
  if (zval==1) zg[i] <- 1 else zg[i] <- 2
  if (zval==1) xval <- rnorm(1,mu1,sd1) else xval <- rnorm(1,mu2,sd2)
  xg[i] <- xval
}

# Trace plot and Histogram
plot(1:numSim,xg,main="",ylab="Simulated Body Length (mm)",xlab="Iteration")
histogram(xg,xlab="Gibbs Simulated Body Length (mm)",ylab="Frequency")

36 Autocorrelation Function
- The lag-k autocorrelation function (acf) estimates the correlation between observations k steps apart
- The lag-1 autocorrelation is far higher for the 10,000 Gibbs simulations than for the 1,000 direct MC simulations (compare the plots)
- R command: acf(x)
[Figure: ACF for 10,000 Gibbs samples and ACF for 1000 direct MC samples, plotted against lag]
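A small illustration using the ant-length vectors from the earlier scripts (xg from slide 35, xd from slide 30); the lag-1 value sits in the second element of the acf object because the first element is lag 0.

# Lag-1 autocorrelation of the Gibbs and direct MC ant-length samples.
acf_gibbs  <- acf(xg, plot = FALSE)
acf_direct <- acf(xd, plot = FALSE)
acf_gibbs$acf[2]    # lag-1 autocorrelation of the Gibbs chain (large)
acf_direct$acf[2]   # lag-1 autocorrelation of the iid direct MC draws (near zero)
acf(xg)             # the usual ACF plot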

37 Effective Sample Size
- We can use the autocorrelation function to estimate the number of independent draws we would need to give the same precision as our MCMC sample
- The effectiveSize function in R calculates such an estimate
  - To use this function, you must load the coda package
  - This package provides output analysis and diagnostics for MCMC samples
- Effective sample sizes for the reaction time and ant length simulations:
  - For 10,000 Gibbs samples of the reaction time mean and standard deviation, the effective size was 9650 for the mean and 9711 for the standard deviation
  - For 1000 direct MC samples of ant lengths, the effective size was 1000
  - For 10,000 Gibbs samples of ant lengths, the effective size was 28.5
- For the ant length simulation, we clearly prefer direct MC to Gibbs sampling
- For many problems, MCMC is necessary because we cannot sample directly from the target distribution of interest
  - We must be careful to draw enough samples for reliable inference
  - We must be on the lookout for problems such as multimodality
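A short illustration of the call, again on the ant-length vectors from the earlier scripts:

# Effective sample size of the ant-length simulations.
library(coda)
effectiveSize(mcmc(xg))   # Gibbs chain: far smaller than the 10,000 raw draws
effectiveSize(mcmc(xd))   # direct MC: close to the nominal number of iid draws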

38 Other MCMC Diagnostics
- Potential scale reduction (Gelman and Rubin, 1992)
  - Compares within and between variance components for multiple MCMC chains
  - Chains are started at overdispersed starting points; convergence occurs when the output of all chains is indistinguishable
  - Large values (above 1.1 or 1.2) suggest the chain has not converged
- Geweke (1992) z-score
  - Test for equality of means of the first and last part of a Markov chain
  - The burn-in period may be discarded, but more than half the chain should be retained
- Heidelberger and Welch (1983) diagnostic
  - Uses the Cramer-von Mises statistic to test the null hypothesis that the sampled values come from the stationary distribution
- Raftery and Lewis (1992) diagnostic
  - Use on a short pilot run of the chain
  - Provides information on the sample size for a chain with no correlation between successive samples
- These diagnostics are available as part of the coda package in R; see the package documentation
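The corresponding coda functions can be applied directly to multi-chain output such as the mcmc.list samp produced by the earlier rjags sketch:

# Convergence diagnostics from the coda package, applied to the mcmc.list 'samp'.
library(coda)
gelman.diag(samp)    # potential scale reduction factor (requires 2 or more chains)
geweke.diag(samp)    # z-scores comparing the early and late portions of each chain
heidel.diag(samp)    # stationarity and half-width tests
raftery.diag(samp)   # run-length recommendations based on a pilot run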

39 Caveat
- In hard problems it can be very difficult to assess whether the realizations of an MCMC sampler provide an adequate approximation to the target distribution
- Although MCMC diagnostics are helpful, it is possible for a chain to be stuck near a local optimum without this being detected by the diagnostics
- For highly multimodal problems, it may be that the best we can do is find a good local mode of the posterior distribution

40 References
- JAGS
- MCMC Diagnostics
  - Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7.
  - Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In Bayesian Statistics 4, Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M. (eds.), Oxford: Oxford University Press.
  - Heidelberger, P. and Welch, P. D. (1983). Simulation run length control in the presence of an initial transient. Operations Research, 31.
  - Raftery, A. E. and Lewis, S. M. (1992). [Practical Markov Chain Monte Carlo]: Comment: One long run with diagnostics: Implementation strategies for Markov chain Monte Carlo. Statistical Science, 7(4).

41 Summary and Synthesis
- Gibbs sampling is a Markov chain Monte Carlo (MCMC) approximation method that can be applied to problems in which it is possible to sample from the full conditional distribution of each target variable given all the others
- Gibbs sampling can be applied in cases for which direct Monte Carlo is infeasible
- Like all MCMC methods, Gibbs sampling yields a correlated sequence of draws
- A number of MCMC diagnostic tools can help assess the severity of autocorrelation in the chain and evaluate whether enough samples have been collected
  - Although these diagnostics are useful, they can be deceptive
  - For very hard problems, it may be infeasible to obtain an accurate estimate of the posterior distribution, and good local optima are the best that can be done
- Gibbs sampling and other MCMC algorithms have proven very useful for otherwise intractable estimation problems
- Tools are available for running MCMC samplers from R
