Monte Carlo integration

Example of a Monte Carlo sampler in 2D: imagine a circle of radius L/2 within a square of L×L. If points are randomly generated over the square, what is the probability of hitting within the circle? By algebra: π(L/2)²/L² = π/4. By simulation:

P(S) ≈ (1/K) Σ_{k=1}^{K} 1{(x_k, y_k) ∈ S}

This also provides a Monte Carlo approximation of π.
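A minimal sketch of this in R (the unit square L = 1 and K = 100000 draws are choices made here, not given in the slides):

K <- 100000                              # number of random points
x <- runif(K); y <- runif(K)             # uniform points in the unit square (L = 1)
inside <- (x-0.5)^2 + (y-0.5)^2 < 0.25   # hit within the inscribed circle of radius 1/2?
mean(inside)                             # approximates pi/4
4*mean(inside)                           # Monte Carlo approximation of pi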
Monte Carlo integration

Wanted: e.g. the posterior mean E(θ|X) = ∫ θ p(θ|X) dθ. But assume we do not have conjugate priors, so no closed-form solution. Could try numerical integration methods. Or Monte Carlo: draw random i.i.d. samples θ_k from the distribution, k = 1,…,K, with large K:

E(θ|X) ≈ (1/K) Σ_{k=1}^{K} θ_k,   θ_k ~ p(θ|X)
Monte Carlo integration

Even if we had solved the density, it can be difficult to evaluate E(g(θ)|X). Note also: P(θ ∈ S|X) = E(1{θ ∈ S}|X), so that we can approximate probabilities by

P(θ ∈ S|X) ≈ (1/K) Σ_{k=1}^{K} 1{θ_k ∈ S}

And likewise any quantiles.
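For illustration, a sketch in R, using draws from a Beta(3,7) density as a stand-in for posterior samples θ_k (the target density, sample size, and threshold are all assumptions made here):

theta <- rbeta(10000, 3, 7)       # pretend these are posterior samples theta_k
mean(theta)                       # approximates E(theta | X)
mean(theta > 0.5)                 # approximates P(theta > 0.5 | X)
quantile(theta, c(0.025, 0.975))  # approximate 95% posterior interval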
Monte Carlo integration

What remains is to generate samples from the correct distribution. This is easy for known distributions: just read the manual of your software. But N-dimensional distributions? Nonstandard distributions? We need other sampling methods than directly drawing i.i.d. samples.
Markov chain Monte Carlo

Innovation: construct a sampler that works as a Markov chain, for which a stationary distribution exists, and this stationary distribution is the same as our target distribution.
Gibbs sampler

Gibbs sampling in 2D. Example: the uniform distribution in a triangle (figure: the triangle in the (X, Y) plane):

f(x,y) = 2 · 1{x + y < 1, x > 0, y > 0}
Gibbs sampler

Gibbs sampling in 2D. Remember the product rule: p(x,y) = p(x|y)p(y) = p(y|x)p(x). Solve the marginal density p(x):

p(x) = ∫ p(x,y) dy = ∫_0^{1-x} 2 dy = 2(1-x),   0 < x < 1

Then: p(y|x) = p(x,y)/p(x).
Gibbs sampler

Gibbs sampling in 2D. Solve the conditional density:

p(y|x) = p(x,y)/p(x) = 2 · 1{x + y < 1, y > 0} / (2(1-x)) = 1{0 < y < 1-x}/(1-x),   i.e.  y|x ~ U(0, 1-x)

Note: above it would suffice to recognize p(y|x) up to a constant term, so that solving p(x) is not necessary. Similarly, get p(x|y) = U(0, 1-y).
Gibbs sampler

Starting from the joint density p(x,y), we have obtained two important conditional densities: p(x|y) and p(y|x). The Gibbs algorithm is then:
1. Start from (x_0, y_0). Set k = 1.
2. Sample x_k from p(x|y_{k-1}).
3. Sample y_k from p(y|x_k). Set k = k+1.
4. Go to step 2 until the sample is sufficiently large.
These samples are no longer i.i.d.
Gibbs sampler start Y end 0 X 0
Gibbs sampler

In R, you could:
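(A minimal sketch, using the conditionals derived above; the chain length and starting point are choices made here.)

x <- numeric(1000); y <- numeric(1000)
x[1] <- 0.1; y[1] <- 0.1                 # initial values inside the triangle
for(i in 2:1000){
  x[i] <- runif(1, 0, 1-y[i-1])          # x | y ~ U(0, 1-y)
  y[i] <- runif(1, 0, 1-x[i])            # y | x ~ U(0, 1-x)
}
plot(x, y)                               # points should fill the triangle uniformly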
Gibbs sampler

Jumping around? Possible problems?
Gibbs sampler

Consider again the binomial model, conditional on N. The joint distribution p(p, X | N) can be expressed either as p(X|p,N) p(p|N) or p(p|X,N) p(X|N). From the first, we recognize p(X|p,N) = Bin(N, p), with e.g. a uniform prior p(p|N) = p(p). Then we would know p(p|X,N) = Beta(X+1, N-X+1). This gives p(p|X,N) and p(X|p,N) for Gibbs.
Gibbs sampler

Consider again the binomial model, conditional on N. Gibbs sampling (X, p) gives the joint distribution of X and p. [We know both conditional densities, but it would also be possible to obtain p(p|X) by Monte Carlo sampling from the joint p(p, X), and then accepting only those (p, X)-pairs for which X takes a given value. This idea is used in Approximate Bayesian Computation (ABC).]
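A sketch of the bracketed rejection idea, with hypothetical values N = 10 and observed X = 3 (both assumed here for illustration):

N <- 10; x_obs <- 3                      # hypothetical model size and observed count
p_prop <- runif(100000)                  # draw p from the uniform prior
X_sim <- rbinom(100000, N, p_prop)       # simulate X | p, i.e. sample the joint p(p, X)
p_post <- p_prop[X_sim == x_obs]         # keep only pairs where X equals the observation
hist(p_post, freq=FALSE)                 # approximates p(p | X = 3)
curve(dbeta(x, x_obs+1, N-x_obs+1), add=TRUE)   # exact Beta(X+1, N-X+1) posterior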
Gibbs sampler

Binomial model, conditional on N, in R:

n <- 10                                   # binomial N (example value)
p <- numeric(1000); x <- numeric(1000)
p[1] <- 0.5; x[1] <- 0                    # initial values
for(i in 2:1000){
  p[i] <- rbeta(1, x[i-1]+1, n-x[i-1]+1)  # p | x ~ Beta(x+1, n-x+1)
  x[i] <- rbinom(1, n, p[i])              # x | p ~ Bin(n, p)
}
plot(x, p)
Gibbs and normal density

2D normal density:

(X, Y) ~ N( (0, 0), Σ ),   Σ = [ 1  ρ ; ρ  1 ]

p(x,y) ∝ exp( -(x² - 2ρxy + y²) / (2(1-ρ²)) )

The marginal densities p(x) and p(y) are both N(0,1). The conditional density p(x|y) = p(x,y)/p(y) is N(ρy, 1-ρ²).
Gibbs and normal density

Gibbs would then be sampling from:

p(x|y) = N(ρy, 1-ρ²)
p(y|x) = N(ρx, 1-ρ²)

This can mix slowly if X and Y are heavily correlated.

Recall the posterior p(μ, σ² | X_1,…,X_n). This is a 2D problem. Assume the improper prior p(μ, σ²) ∝ 1/σ². Then we can solve p(μ | σ², X) = N( ΣX_i/n, σ²/n ), and, for the precision τ = 1/σ², p(τ | μ, X) = Gamma( n/2, 0.5 Σ(X_i - μ)² ). This makes a Gibbs sampler! (Try this with R.)
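A minimal sketch of the bivariate normal Gibbs sampler above (ρ = 0.95 is chosen here just to make the slow mixing visible):

rho <- 0.95                              # assumed correlation; try values near 1
x <- numeric(1000); y <- numeric(1000)
x[1] <- 0; y[1] <- 0                     # initial values
for(i in 2:1000){
  x[i] <- rnorm(1, rho*y[i-1], sqrt(1-rho^2))   # x | y ~ N(rho*y, 1-rho^2)
  y[i] <- rnorm(1, rho*x[i],   sqrt(1-rho^2))   # y | x ~ N(rho*x, 1-rho^2)
}
plot(x, type="l")                        # trace shows strong autocorrelation when rho is near 1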
Next time you estimate (μ, σ) from a sample X_1,…,X_n, assuming a normal model, try sampling the posterior distribution:

X <- rnorm(40, 0, 1)                       # generate example dataset, n=40, mean=0, sd=1
m <- numeric(1000); t <- numeric(1000)
m[1] <- mean(X); t[1] <- 1/(sd(X)*sd(X))   # initial values
for(i in 2:1000){                          # Gibbs sampling
  m[i] <- rnorm(1, mean(X), sqrt(1/t[i-1]/40))
  t[i] <- rgamma(1, 40/2, 0.5*sum((X[1:40]-m[i])^2))
}
Metropolis-Hastings

This is a very general-purpose sampler. The core is: a proposal distribution and an acceptance probability. At each iteration, a random draw is obtained from the proposal density Q(θ* | θ_{i-1}), which can depend on the previous iteration. Simply, it could be U(θ_{i-1} - L/2, θ_{i-1} + L/2).
Metropolis-Hastings

At each iteration, the proposal is accepted with probability

r = min( 1, [ p(θ*|data) Q(θ_{i-1}|θ*) ] / [ p(θ_{i-1}|data) Q(θ*|θ_{i-1}) ] )

Note how little we need to know about p(θ|data)! The normalizing constant cancels out from the ratio; it is enough to be able to evaluate the prior and likelihood terms.
- Proposals too far: accepted rarely → slow sampler.
- Proposals too near: small moves → slow sampler.
- Acceptance probability ideally about 20%-40%.
The Gibbs sampler is a special case of the MH sampler: in Gibbs, the acceptance probability is 1. Block sampling is also possible.
Metropolis-Hastings

Sampling from N(0,1), using the MH algorithm:
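(A sketch of one possible implementation, assuming a uniform random-walk proposal of width L = 1; since this proposal is symmetric, the Q terms cancel from the acceptance ratio.)

L <- 1                                            # proposal width (assumed)
th <- numeric(10000); th[1] <- 0                  # chain and initial value
for(i in 2:10000){
  prop <- runif(1, th[i-1]-L/2, th[i-1]+L/2)      # symmetric proposal: Q ratio cancels
  r <- min(1, dnorm(prop)/dnorm(th[i-1]))         # acceptance probability
  if(runif(1) < r) th[i] <- prop else th[i] <- th[i-1]
}
hist(th, freq=FALSE); curve(dnorm(x), add=TRUE)   # compare to the target N(0,1)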
MCMC convergence

Remember to monitor convergence! The chain is only approaching the target density when iterated a long time, k → ∞. Convergence can be very slow in some cases. Autocorrelations between iterations are then large → it makes sense to take a thinned sample. Systematic patterns, trends, or sticking indicate problems. Pay attention to starting values! Try different values in different MCMC chains, and discard the burn-in period.
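For instance, continuing with the chain th from the MH sketch above (the burn-in length and thinning interval are illustrative choices):

plot(th, type="l")                       # trace plot: look for trends or sticking
acf(th)                                  # autocorrelation between iterations
th_thin <- th[seq(1000, 10000, by=10)]   # discard burn-in, then thin by 10
acf(th_thin)                             # autocorrelation much reduced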
MCMC convergence

Can only diagnose poor convergence, but cannot fully prove a good one! E.g. multimodal densities. (figure: history plot of p[1], chains 1:4, over 40000 iterations, next to the target density)
MCMC in BUGS

There are many different samplers, some of them implemented in WinBUGS/OpenBUGS. Next, we leave the sampling to BUGS, and only consider building the models which define a posterior distribution, running the sampling in BUGS, and assessing the results.