Molecular Epidemiology Workshop: Bayesian Data Analysis


1 Molecular Epidemiology Workshop: Bayesian Data Analysis. Jay Taylor and Ananias Escalante, School of Mathematical and Statistical Sciences and Center for Evolutionary Medicine and Informatics, Arizona State University.

2 Outline: 1. Probability and Uncertainty; 2. Bayesian Analysis; 3. MCMC.

3 Probability and Uncertainty. Probability: Interpretations and Basic Principles.
Frequentist interpretation: The probability of an event is equal to its limiting frequency in an infinite series of independent, identical trials.
Bayesian interpretations:
- Logical: The probability of a proposition is equal to the strength of the evidence in favor of the proposition.
- Subjective: The probability of a proposition is equal to the strength of an individual's belief in the proposition.

4 Probability and Uncertainty. Although probabilities can be interpreted in different ways, these interpretations are usually based on the same mathematical rules. To describe these, we will use P(E) to denote the probability of an event or proposition E.
Probability Axioms:
1. If S is certain to be true, then P(S) = 1.
2. 0 ≤ P(E) ≤ 1 for any proposition E.
3. If E and F are mutually exclusive propositions, then the probability that either E or F is true is equal to the sum of the probabilities of E and of F: P(E or F) = P(E) + P(F).

5 Probability and Uncertainty. The probability assigned to a proposition depends on the information or evidence available to us. This can be made explicit through conditional probability.
Conditional Probability: Suppose that E and F are propositions and that P(E) > 0. If we know E to be true, then the conditional probability of F given E is equal to
P(F|E) = P(E and F) / P(E),
where P(E and F) is the probability that both E and F are true.
[Figure: Venn diagram with P(E) = 0.46, P(F) = 0.33, and P(E and F) = 0.21, so that P(F|E) = 0.21/0.46 ≈ 0.46.]

6 Probability and Uncertainty. Joint probabilities can often be calculated by conditioning on one of the propositions.
Product Rule: P(E and F) = P(E) P(F|E).
Example: Suppose that two balls are sampled without replacement from an urn containing five red balls and five blue balls. If we let E be the event that the first ball sampled is red and F be the event that the second ball sampled is red, then the probability that both balls sampled are red is
P(E, F) = P(E) P(F|E) = (5/10)(4/9) = 2/9.
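A quick way to sanity-check this kind of calculation is by simulation. The following minimal Python sketch (not part of the original slides; the function name and trial count are illustrative choices) estimates the probability that both sampled balls are red by repeated sampling without replacement:

    import random

    def both_red(n_trials=100_000, seed=1):
        """Estimate P(both balls red) for an urn with 5 red and 5 blue balls."""
        rng = random.Random(seed)
        urn = ["red"] * 5 + ["blue"] * 5
        hits = 0
        for _ in range(n_trials):
            first, second = rng.sample(urn, 2)   # two draws without replacement
            hits += (first == "red" and second == "red")
        return hits / n_trials

    print(both_red())   # should be close to 2/9 = 0.222...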

7 Probability and Uncertainty. Because we can condition on either E or F, the joint probability of E and F can be decomposed in two different ways using the product rule:
P(E and F) = P(E) P(F|E)   (conditioning on E)
P(E and F) = P(F) P(E|F)   (conditioning on F).
It follows that the two expressions on the right-hand sides are equal, i.e., P(E) P(F|E) = P(F) P(E|F), and if we then divide both sides by P(E), we arrive at one of the most important formulas in probability theory:
Bayes Formula: P(F|E) = P(F) P(E|F) / P(E).

8 Probability and Uncertainty. Example: Reversed Sexual Size Dimorphism in Spotted Owls. Like many raptors, adult female Spotted Owls (Strix occidentalis) are larger, on average, than their male counterparts. For example, a study of a California population found that the wing chord distribution (in mm) is approximately N(329, 6) in females and N(320, 6) in males, where the second parameter is the standard deviation (Blakesley et al., 1990, J. Field Ornithology).
[Figure: wing chord densities for male and female Spotted Owls.]

9 Probability and Uncertainty. Problem: Suppose that an adult bird with a wing chord of 329 mm is randomly sampled from a population with a 1:1 adult sex ratio. What is the probability that this is a female?
Solution: Let F (M) be the event that the bird is female (male) and let W be the event that the wing chord is 329 mm. Then
P(F) = 0.5
p(W|F) = (1/(6 sqrt(2π))) e^(-(329 - 329)^2/72)
p(W) = P(F) p(W|F) + P(M) p(W|M)
     = 0.5 (1/(6 sqrt(2π))) e^(-(329 - 329)^2/72) + 0.5 (1/(6 sqrt(2π))) e^(-(329 - 320)^2/72),
and upon substituting these quantities into Bayes formula we find that
P(F|W) = P(F) p(W|F) / p(W) ≈ 0.755.
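The arithmetic above is easy to reproduce. Here is a minimal Python sketch (not part of the original slides; the helper name normal_pdf is an illustrative choice) that evaluates Bayes formula for this example:

    from math import exp, pi, sqrt

    def normal_pdf(x, mu, sigma):
        """Density of a normal distribution with mean mu and standard deviation sigma."""
        return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

    w = 329.0                        # observed wing chord (mm)
    p_female = 0.5                   # prior: 1:1 adult sex ratio
    like_f = normal_pdf(w, 329, 6)   # p(W | female)
    like_m = normal_pdf(w, 320, 6)   # p(W | male)
    p_w = p_female * like_f + (1 - p_female) * like_m
    print(p_female * like_f / p_w)   # posterior P(female | W), about 0.755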

10 Bayesian Analysis. Bayesian Data Analysis: Overview. Bayesian data analysis is based on the following two principles:
1. Probability is interpreted as a measure of uncertainty, whatever the source. Thus, in a Bayesian analysis, it is standard practice to assign probability distributions not only to unseen data, but also to parameters, models, and hypotheses.
2. Uncertainty is quantified both before and after the collection of data, and Bayes formula is used to update our beliefs in light of the new data.

11 Bayesian Analysis. Suppose that our objective is to use some newly acquired data D to estimate the value of an unknown parameter Θ. A Bayesian treatment of this problem would proceed as follows:
1. We first need to formulate a statistical model which determines the conditional distribution of the data under each possible value of Θ. This is specified by the likelihood function, p(D | Θ = θ).
2. We then choose a prior distribution, p(Θ = θ), for the unknown parameter, which quantifies how strongly we believe that the true value is θ before we examine the new data.
3. We then collect and examine the new data D.
4. In light of this new data, we use Bayes formula to revise our beliefs concerning the value of the parameter Θ:
p(Θ = θ | D) = p(Θ = θ) p(D | Θ = θ) / p(D)
p(Θ = θ | D) is said to be the posterior distribution of the parameter Θ given the data D.

12 Bayesian Analysis. Example: Bayesian Estimation of Sex Ratios. Suppose that our objective is to estimate the birth sex ratio of a newly described species. To this end, we will count the numbers of male and female offspring in each of b broods. For the sake of the example, we will make the following assumptions:
- The probability of being born female does not vary within or between broods. In particular, there is no environmental or heritable variation in sex ratio.
- The sexes of the different members of a brood are determined independently of one another.
If we let θ denote the unknown probability that an individual is born female, then the sex ratio at birth will be θ/(1 - θ). We will let m and f denote the total numbers of male and female offspring contained in the b broods.

13 Bayesian Analysis. Likelihood Function: Under our assumptions about sex determination, the likelihood function depends only on the sex ratio and the total numbers of males and females in the broods. Conditional on Θ = θ, the total number of female offspring among the f + m offspring is binomially distributed with parameters f + m and θ:
P(f, m | Θ = θ) = (f + m choose f) θ^f (1 - θ)^m
The likelihood is maximized at θ_MLE = f/(f + m).
[Figure: likelihood function for f = 7, m = 3, with θ_MLE = 0.7.]

14 Bayesian Analysis. Prior Distribution: The prior distribution on the unknown parameter θ should reflect what we have previously learned about the birth sex ratio in this species. For example, this might be determined by prior observations of this species or by information about the sex ratio of closely-related species. Here I will consider prior distributions for three different scenarios:
- Uniform, Beta(1,1) (no prior information): p(Θ = θ) = 1
- Beta(5,5) (even sex ratio): p(Θ = θ) = 630 θ^4 (1 - θ)^4
- Beta(2,4) (male-biased): p(Θ = θ) = 20 θ (1 - θ)^3
[Figure: the three prior densities.]

15 Bayesian Analysis. We are now ready to use Bayes formula to calculate the posterior distribution of θ conditional on having observed f female offspring out of a total of n = f + m offspring. When the prior distribution is of type Beta(a, b), we have
p(Θ = θ | f, m) = p(Θ = θ) P(f, m | θ) / P(f, m)   (Bayes formula)
                = [1/β(a, b)] θ^(a-1) (1 - θ)^(b-1) (f + m choose f) θ^f (1 - θ)^m / P(f, m)
                = [1/β(f + a, m + b)] θ^(f+a-1) (1 - θ)^(m+b-1),
which shows that the posterior distribution is of type Beta(f + a, m + b).
Remark: Because the prior and the posterior distribution belong to the same family of distributions, we say that the beta distribution is a conjugate prior for the binomial likelihood function.
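Because the posterior has a closed form, it is easy to explore numerically. The following minimal Python sketch (not part of the original slides; it assumes scipy is available) computes the Beta posterior and a central 95% credible interval for the f = 7, m = 3 data under each of the three priors considered above:

    from scipy import stats

    f, m = 7, 3                        # observed numbers of female and male offspring
    priors = {"uniform": (1, 1), "even": (5, 5), "male-biased": (2, 4)}

    for name, (a, b) in priors.items():
        post = stats.beta(f + a, m + b)        # conjugate update: Beta(f + a, m + b)
        lo, hi = post.ppf([0.025, 0.975])      # central 95% credible interval
        print(f"{name:12s} mean = {post.mean():.3f}  95% CI = ({lo:.3f}, {hi:.3f})")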

16 Bayesian Analysis. The figures show the posterior distributions corresponding to each of these three priors for two different data sets: either f = 7, m = 3 (left plot) or f = 70, m = 30 (right plot).
[Figure: posterior densities under the Beta(1,1), Beta(5,5), and Beta(2,4) priors for each data set.]

17 Bayesian Analysis. Summaries of the Posterior Distribution. Although the posterior distribution p(Θ = θ | D) comprehensively describes the state of knowledge concerning an unknown parameter, it is common to summarize this distribution by a number or an interval.
- Point estimators of Θ include the mean and the median of the posterior distribution.
- If it exists, the mode of the posterior distribution can also be used to estimate Θ, in which case it is called the maximum a posteriori (MAP) estimate.
- A credible region is a region that contains a specified proportion of the probability mass of the posterior distribution, e.g., a 95% credible region will contain 95% of this mass. These can be chosen in several ways, including quantile-based intervals and highest probability density (HPD) regions.

18 Bayesian Analysis. Posterior Summary Statistics.
[Figure: credible intervals for two example distributions. For N(0,1): MAP = mean = median = 0, and the 95% HPD and 95% quantile-based intervals coincide at (-1.96, 1.96). For Exp(1): MAP = 0, median = 0.693, mean = 1; the 95% quantile-based interval is (0.025, 3.689), while the 95% HPD interval is (0, 2.996).]
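For a posterior that is available in closed form, these summaries are straightforward to compute. A minimal Python sketch (not part of the original slides; assumes numpy and scipy), using the Beta(8, 4) posterior from the sex ratio example and a simple grid search for the narrowest (HPD-style) 95% interval:

    import numpy as np
    from scipy import stats

    post = stats.beta(8, 4)                   # posterior for f = 7, m = 3 under a uniform prior
    print("mean   :", post.mean())
    print("median :", post.median())
    print("MAP    :", (8 - 1) / (8 + 4 - 2))  # mode of Beta(a, b) is (a - 1)/(a + b - 2)
    print("95% quantile interval:", post.ppf([0.025, 0.975]))

    # Narrowest interval containing 95% of the mass (the HPD interval for a unimodal density)
    lows = np.linspace(0.0, 0.05, 2001)
    widths = post.ppf(lows + 0.95) - post.ppf(lows)
    i = np.argmin(widths)
    print("95% HPD interval     :", (post.ppf(lows[i]), post.ppf(lows[i] + 0.95)))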

19 Bayesian Analysis. The table lists various summary statistics for the posterior distributions calculated in the sex ratio example.
[Table: posterior mean, median, standard deviation, and 2.5% and 97.5% quantiles for each combination of data set (f = 7, m = 3 or f = 70, m = 30) and prior (uniform, even, male-biased).]
Interpretation: Whereas the small data set does not provide strong evidence against the hypothesis that the sex ratio is 1:1, all three analyses of the large data set suggest that the true value of θ is greater than 0.5.

20 Bayesian Analysis. Acquiring New Data. One of the strengths of Bayesian statistics is that it can readily handle sequentially acquired data. This is done by using the posterior distribution from the most recent experiment as the prior distribution for the next experiment. Starting from the prior p_0(θ):
p_1(θ) = p_0(θ) p(D_1 | θ) / p(D_1)   (posterior after D_1; prior for D_2)
p_2(θ) = p_1(θ) p(D_2 | θ) / p(D_2)   (posterior after D_2; prior for D_3)
and so on.

21 Bayesian Analysis. Example: Sex Ratio Estimation, continued. Suppose that our prior distribution on Θ was uniform and that we initially collected three broods totaling 7 females and 3 males. We then used Bayes formula to determine that the posterior distribution of Θ is Beta(8, 4). Since this distribution is broad, we decide to collect additional data to refine our estimate of Θ. If the second data set contains 15 females and 5 males, then using Beta(8, 4) as the new prior distribution, we find that the new posterior distribution of Θ is
p(Θ = θ | f = 15, m = 5) = [θ^7 (1 - θ)^3 / β(8, 4)] (20 choose 15) θ^15 (1 - θ)^5 / P(f = 15, m = 5)
                         = θ^22 (1 - θ)^8 / β(23, 9),
i.e., Beta(23, 9).
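The sequential update is just the conjugate calculation applied twice. A minimal Python sketch (not part of the original slides; the helper name update_beta is illustrative) confirming that updating Beta(1, 1) with (f, m) = (7, 3) and then with (f, m) = (15, 5) gives Beta(23, 9):

    def update_beta(a, b, f, m):
        """Conjugate update of a Beta(a, b) prior after observing f females and m males."""
        return a + f, b + m

    a, b = 1, 1                       # uniform prior, Beta(1, 1)
    a, b = update_beta(a, b, 7, 3)    # first data set  -> Beta(8, 4)
    a, b = update_beta(a, b, 15, 5)   # second data set -> Beta(23, 9)
    print(a, b)                       # 23 9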

22 Bayesian Analysis. Choosing the Prior Distribution: Some Guidelines. Choosing a prior distribution is both useful and sometimes difficult, because it requires a careful assessment of our knowledge or beliefs before we perform an experiment.
- As a rule, the prior should be chosen independently of the new data.
- Cromwell's rule: the prior distribution should assign positive probability to any proposition that is not logically false, no matter how unlikely.
- It is sometimes useful to carry out multiple Bayesian analyses using different prior distributions to explore the sensitivity of the posterior distribution to different prior assumptions.
- When there is very little prior information, it may be appropriate to choose an uninformative prior. Examples include maximum entropy distributions and Jeffreys priors.

23 Bayesian Analysis. Bayesian Phylogenetics. Bayesian methods have proven to be especially useful in the analysis of genetic sequence data. In these problems, the unknown parameters can often be divided into three categories:
- the unknown tree, T
- the parameters of the demographic model, Θ_dem
- the parameters of the substitution model, Θ_subst.
Then, given a sequence alignment D, the analytical problem is to calculate the posterior distribution of all of the unknowns.
Bayes formula for phylogenetic inference, general case:
p(T, Θ_dem, Θ_subst | D) = p(T, Θ_dem, Θ_subst) p(D | T, Θ_dem, Θ_subst) / p(D)

24 Bayesian Analysis. If the substitution process is assumed to be neutral, then the prior distribution and the likelihood function can be simplified.
Bayes formula for phylogenetic inference with neutral data:
p(T, Θ_subst, Θ_dem | D) = p(Θ_subst) p(Θ_dem) p(T | Θ_dem) p(D | T, Θ_subst) / p(D)
This is a consequence of the following assumptions:
- Under the prior distribution, the parameters of the substitution process Θ_subst are independent of the tree T and the demographic parameters Θ_dem.
- The demographic parameters Θ_dem typically determine the conditional distribution of the genealogy T, e.g., through a coalescent model.
- Conditional on T and Θ_subst, the sequence data D is independent of Θ_dem.

25 Bayesian Analysis. Example: Bayesian Inference of Effective Population Size. Suppose that our data D consist of n randomly sampled individuals that have been sequenced at a neutral locus and that our objective is to estimate the effective population size N_e. For simplicity, we will use the Jukes-Cantor model for the substitution process with a strict molecular clock, and we will assume that the demography can be described by the constant population size coalescent. To carry out a Bayesian analysis, we need to specify p(μ), p(N_e) and p(T | N_e).
- p(μ) should be chosen to reflect what we know about the mutation rate; e.g., we could use a lognormal distribution with mean u and variance σ^2.
- Since N_e is a scale parameter in the coalescent, it is common practice to use the Jeffreys prior p(N_e) ∝ 1/N_e.
- p(T | N_e) is then determined by Kingman's coalescent.

26 Bayesian Analysis. Assuming that we are only interested in N_e, then μ and T are nuisance parameters, and so we need to calculate the marginal posterior distribution of N_e by integrating over μ and T:
p(N_e | D) = ∫∫ p(T, N_e, μ | D) dμ dT
           = [p(N_e)/p(D)] ∫∫ p(D | T, μ) p(μ) p(T | N_e) dμ dT.
However, unless n is quite small, integration over T is not feasible. For example, if our sample contains 20 sequences, then there are more than 10^20 possible tree topologies to be considered. Even with the fastest computers available, this is an impossible calculation.

27 MCMC. Markov Chain Monte Carlo Methods (MCMC). Until recently, Bayesian methods were regarded as impractical for many problems because of the computational difficulty of evaluating the posterior distribution. In particular, to use Bayes formula,
p(θ | D) = p(θ) p(D | θ) / p(D),
we need to evaluate the marginal probability of the data,
p(D) = ∫ p(D | θ) p(θ) dθ,
which requires integration of the likelihood function over the parameter space. Except in special cases, this integration must be performed numerically, and sometimes even this is very difficult.

28 MCMC. An alternative is to use Monte Carlo methods to sample from the posterior distribution. Here the idea is to generate a random sample from the distribution and then use the empirical distribution of that sample to approximate p(θ | D):
p(θ | D) ≈ (1/N) Σ_{i=1}^{N} δ_{Θ_i},   where Θ_1, ..., Θ_N ~ p(θ | D)
(the posterior on the left, the empirical distribution of the sample on the right).
[Figure: two histogram estimators of the Beta(8, 4) density generated using either 100 (left) or 1000 (right) independent samples.]

29 MCMC. In particular, the empirical distribution can be used to estimate probabilities and expectations under p(θ | D):
P(θ ∈ A | D) ≈ (1/N) Σ_{i=1}^{N} 1_A(Θ_i)
E[f(θ) | D] ≈ (1/N) Σ_{i=1}^{N} f(Θ_i).
What makes this approach difficult is the need to generate random samples from distributions that are only known up to a constant of proportionality (e.g., up to p(D)). This is where Markov chain Monte Carlo methods come in.
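To illustrate these Monte Carlo estimates, here is a minimal Python sketch (not part of the original slides; assumes numpy and scipy) that draws independent samples directly from the Beta(8, 4) posterior and approximates a posterior probability and a posterior expectation:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    theta = stats.beta(8, 4).rvs(size=10_000, random_state=rng)      # posterior samples

    print("P(theta > 0.5 | D)       ~", np.mean(theta > 0.5))        # average of indicators
    print("E[theta/(1 - theta) | D] ~", np.mean(theta / (1 - theta)))  # posterior mean sex ratio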

30 MCMC. Markov Chains. A discrete-time Markov chain is a stochastic process X_0, X_1, ... with the property that the future behavior of the process depends only on its current state: given the present, the future is independent of the past. More precisely, this means that for every set A and all t, s ≥ 0,
P(X_{t+s} ∈ A | X_t, X_{t-1}, ..., X_0) = P(X_{t+s} ∈ A | X_t).
Markov chains have numerous applications in biology. Some familiar examples include:
- random walks
- branching processes
- the Wright-Fisher model
- chain-binomial models (Reed-Frost).
[Figure: simulated allele frequency trajectories of the Wright-Fisher process with N = 500, μ = 0.001.]

31 MCMC. One consequence of the Markov property is that many Markov chains have a tendency to forget their initial state as time progresses. More precisely:
Asymptotic behavior of Markov chains. Many Markov chains have the property that there is a probability distribution π, called the stationary distribution of the chain, such that for large t the distribution of X_t approaches π, i.e., for all states x and sets A,
lim_{t → ∞} P(X_t ∈ A | X_0 = x) = π(A).
[Figure: stationary behavior of the Wright-Fisher process (N = 100, μ = 0.02): starting from p = 0.01, 0.5, or 0.9, the distribution of the allele frequency approaches the same stationary density by t = 100.]

32 MCMC. Some Markov chains satisfy an even stronger property, called ergodicity.
Ergodicity. A Markov chain with stationary distribution π is said to be ergodic if for every initial state X_0 = x and every set A, we have
lim_{T → ∞} (1/T) Σ_{t=1}^{T} 1_A(X_t) = π(A).
In other words, if we run the chain for a long time, then the proportion of time spent visiting the set A is approximately π(A).
[Figure: ergodic behavior of the neutral Wright-Fisher process (N = 100, μ = 0.02): a single allele frequency trajectory followed over tens of thousands of generations.]
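To see ergodicity in action, the following minimal Python sketch (not part of the original slides; a simplified haploid Wright-Fisher model with symmetric mutation, with illustrative parameter values) checks that the long-run fraction of time the allele frequency spends above 0.5 is close to 0.5, as the symmetry of the stationary distribution implies:

    import numpy as np

    def wright_fisher(N=100, mu=0.02, generations=200_000, p0=0.9, seed=0):
        """Simulate allele frequencies under a haploid Wright-Fisher model with symmetric mutation."""
        rng = np.random.default_rng(seed)
        p, freqs = p0, np.empty(generations)
        for t in range(generations):
            p_mut = p * (1 - mu) + (1 - p) * mu    # mutation step
            p = rng.binomial(N, p_mut) / N         # binomial resampling of N gene copies
            freqs[t] = p
        return freqs

    freqs = wright_fisher()
    print("time-average fraction with p > 0.5:", np.mean(freqs > 0.5))   # close to 0.5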

33 MCMC. Markov Chain Monte Carlo: General Approach. The central idea in Markov chain Monte Carlo is to use an ergodic Markov chain to generate a random sample from the target distribution π.
1. The first step is to select an ergodic Markov chain which has π as its stationary distribution. There are different methods for doing this, including the Metropolis-Hastings algorithm and the Gibbs sampler.
2. We then need to simulate the Markov chain until its distribution is close to the target distribution. This initial period, of length B generations say, is often called the burn-in period.
3. We continue simulating the chain, but because successive values are highly correlated, it is common practice to only collect a sample every T generations, so as to reduce the correlations between the sampled states (thinning).
4. We can then use these samples to approximate the target distribution:
(1/N) Σ_{n=1}^{N} δ_{X_{B+nT}} ≈ π.

34 MCMC. The Metropolis-Hastings Algorithm. Given a probability distribution π, the Metropolis-Hastings algorithm can be used to explicitly construct a Markov chain that has π as its stationary distribution. Implementation requires the following elements:
- the target distribution π, known up to a constant of proportionality;
- a family of proposal distributions Q(y | x), and a way to efficiently sample from these distributions.
The MH algorithm is based on a more general idea known as rejection sampling: instead of sampling directly from π, we propose values using a distribution Q(y | x) that we can easily sample from, but then we reject values that are unlikely under π.

35 MCMC. The Metropolis-Hastings algorithm consists of repeated application of the following three steps. Suppose that X_n = x is the current state of the Markov chain. Then the next state is chosen as follows:
Step 1: We first propose a new value for the chain by sampling y with probability Q(y | x).
Step 2: We then calculate the acceptance probability of the new state:
α(x; y) = min{ π(y) Q(x | y) / (π(x) Q(y | x)), 1 }
Step 3: With probability α(x; y), set X_{n+1} = y. Otherwise, set X_{n+1} = x.
Remark: Because π enters into α only through the ratio π(y)/π(x), we only need to know π up to a constant of proportionality. This is why the MH algorithm is so well suited to Bayesian analysis.
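Here is a minimal Python sketch of the algorithm (not part of the original slides), targeting the unnormalized Beta(8, 4) posterior θ^7 (1 - θ)^3 from the sex ratio example with a symmetric random-walk proposal, so that the proposal terms cancel in α; the step size, chain length, burn-in and thinning interval are illustrative choices:

    import numpy as np

    def log_target(theta):
        """Unnormalized log posterior for f = 7, m = 3 under a uniform prior."""
        if theta <= 0.0 or theta >= 1.0:
            return -np.inf
        return 7 * np.log(theta) + 3 * np.log(1 - theta)

    def metropolis(n_iter=60_000, step=0.15, x0=0.5, seed=0):
        rng = np.random.default_rng(seed)
        x, chain = x0, np.empty(n_iter)
        for i in range(n_iter):
            y = x + rng.normal(0.0, step)              # symmetric proposal: Q(x|y) = Q(y|x)
            if np.log(rng.uniform()) < log_target(y) - log_target(x):
                x = y                                  # accept the proposed state
            chain[i] = x                               # otherwise the chain stays where it is
        return chain

    chain = metropolis()
    samples = chain[10_000::50]                        # discard burn-in, then thin every 50 states
    print("posterior mean ~", samples.mean())          # close to 8/12 = 0.667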

36 MCMC. The Proposal Distribution. The choice of the proposal distribution Q can have a profound impact on the performance of the MH algorithm. While there is no universal procedure for selecting a good proposal distribution, the following considerations are important:
- Q should be chosen so that the chain rapidly converges to its stationary distribution.
- Q should also be chosen so that it is easy to sample from.
There is usually a tradeoff between these two conditions, and sometimes it is necessary to try out different proposal distributions to identify one with good properties. Many implementations of MH (e.g., BEAST, MIGRATE) offer the user some control over the proposal distribution.

37 MCMC. MCMC: Convergence and Mixing. One of the most challenging issues in MCMC is knowing how long to run the chain. There are two related considerations:
1. We need to run the chain until its distribution is sufficiently close to the target distribution (convergence).
2. We then need to collect a large enough number of samples that we can estimate any quantities of interest (e.g., the mean TMRCA) sufficiently accurately (mixing).
Unfortunately, there is no universally valid, fool-proof way to guarantee that either one of these conditions is satisfied. However, there are a number of convergence diagnostics that can indicate when there are problems.

38 MCMC. Convergence Diagnostics: Trace Plots. Trace plots show how the value of a parameter changes over the course of a simulation. In general, what we want to see is that the mean and the variance of the parameter are fairly constant over the duration of the trace plot, as in the two examples shown below.
[Figure: two well-behaved trace plots (P_vivax_gsr_geo1.log).]

39 MCMC. Problems with convergence or mixing may be revealed by trends or sudden changes in the behavior of the trace plot.
[Figure: two problematic trace plots (P_vivax_gsr_geo1.log).]
In the first plot, the increasing trend indicates that the chain has not yet converged. In the second, the sudden changes in the mean indicate that the chain is mixing poorly.

40 MCMC. Trace Plots: Some Guidelines.
1. You should examine the trace plot of every parameter of interest, including the likelihood and the posterior probability. If any of the trace plots looks problematic, then all of the results are suspect.
2. The fact that a trace plot appears to have converged is not conclusive proof that it has. Especially in high-dimensional problems, a chain that appears to be stationary for the first 500 million generations may well show a sudden change in behavior in the next.
3. The program Tracer can be used to display and analyze trace plots generated by BEAST, MrBayes and LAMARC.

41 MCMC. Convergence Diagnostics: Effective Sample Size. Because successive states visited by a Markov chain are correlated, an estimate derived using N values generated by such a chain will usually be less precise than an estimate derived using N independent samples. This motivates the following definition.
Effective Sample Size (ESS). The effective sample size of a sample of N correlated random variables is equal to the number of independent samples that would estimate the mean with the same variance. For a stationary Markov chain with autocorrelation coefficients ρ_k, the ESS of N successive samples is equal to
ESS = N / (1 + 2 Σ_{k=1}^{∞} ρ_k).
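The following minimal Python sketch (not part of the original slides) estimates the ESS directly from this formula using the sample autocorrelations, truncating the sum at the first non-positive term, which is one common convention; it is demonstrated on an AR(1) series whose true ESS is known:

    import numpy as np

    def effective_sample_size(x):
        """ESS = N / (1 + 2 * sum of positive-lag autocorrelations), truncated at the first non-positive term."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        x = x - x.mean()
        acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
        rho_sum = 0.0
        for rho in acf[1:]:
            if rho <= 0:                 # stop once the estimated autocorrelation is no longer positive
                break
            rho_sum += rho
        return n / (1 + 2 * rho_sum)

    # Demo: an AR(1) series with coefficient 0.9 has ESS of roughly N * (1 - 0.9)/(1 + 0.9) = N/19
    rng = np.random.default_rng(1)
    x = np.zeros(10_000)
    for t in range(1, len(x)):
        x[t] = 0.9 * x[t - 1] + rng.normal()
    print(effective_sample_size(x))      # roughly 10_000 / 19, i.e. a few hundred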

42 MCMC. Effective Sample Size: Guidelines.
1. Each parameter has its own ESS, and these can differ between parameters by more than an order of magnitude.
2. Parameters with small ESSs indicate that the chain either has not converged or is mixing slowly. As a rule of thumb, the ESS of every parameter should exceed 1000, and larger values are even better.
3. Thinning by itself will not increase the ESS. However, we can increase the ESS by simultaneously thinning and increasing the duration of the chain; e.g., collecting 1000 samples from a longer run (with a correspondingly larger thinning interval) is better than collecting 1000 samples from a shorter run.
4. The ESS of a parameter can usually only be estimated from its trace. For this reason, large ESSs do not guarantee that the chain has converged.

43 MCMC. Formal Tests of Convergence. There are several formal tests of stationarity that can be applied to MCMC.
- The Geweke diagnostic compares the mean of a parameter estimated from two non-overlapping parts of the chain and tests whether these are significantly different.
- The Raftery-Lewis diagnostic uses a pilot chain to estimate the burn-in and chain length required to estimate the q-th quantile of a parameter to within some tolerance.
Both of these methods suffer from the defect that "you've only seen where you've been" (Robert & Casella, 2004). In other words, these methods cannot detect that the chain has failed to visit part of the support of the target distribution. These and other diagnostic tests are implemented in the R package coda.

44 MCMC. Convergence Diagnostics: Multiple Chains. Another approach to testing the convergence of an MCMC analysis is to run multiple independent chains and compare the results.
- Large differences between the posterior distributions estimated by the different chains indicate problems with convergence or mixing.
- It is often useful to start the different chains from different, randomly-chosen initial conditions.
- If the different chains give similar results, then their traces can be combined using a program such as LogCombiner.
Best Practice: It is always a good idea to run at least two independent chains in any MCMC analysis.

45 MCMC. Metropolis-Coupled Markov Chain Monte Carlo (MC^3). MC^3 is a generalization of MCMC which attempts to improve mixing by running m chains in parallel while randomly exchanging their states.
- The first chain (the cold chain) is constructed so that the target distribution π is its stationary distribution.
- The i-th chain is constructed so that its stationary distribution π_i satisfies π_i(x) ∝ π(x)^(1/T_i), where T_i is said to be the temperature of the chain. As T_i increases, the distribution π_i(x) becomes flatter, which makes it easier for this chain to converge.
- The MH algorithm is used to swap the states occupied by different chains in such a way that the stationary distribution of each chain is maintained.
- The output from the cold chain is then used to approximate the target distribution.
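A minimal Python sketch of the idea (not part of the original slides), running a cold chain and a single heated chain on the same unnormalized target from the sex ratio example and occasionally proposing a state swap that is accepted with the usual MH ratio; the temperatures, step size and swap rate are illustrative choices:

    import numpy as np

    def log_target(theta):
        """Unnormalized log posterior for f = 7, m = 3 under a uniform prior (as above)."""
        return 7 * np.log(theta) + 3 * np.log(1 - theta) if 0.0 < theta < 1.0 else -np.inf

    def mc3(n_iter=50_000, temps=(1.0, 4.0), step=0.15, seed=0):
        rng = np.random.default_rng(seed)
        betas = [1.0 / t for t in temps]          # inverse temperatures; chain 0 is the cold chain
        x = [0.5 for _ in temps]
        cold = np.empty(n_iter)
        for i in range(n_iter):
            for c, b in enumerate(betas):         # one MH update per chain, targeting pi(x)^(1/T_c)
                y = x[c] + rng.normal(0.0, step)
                if np.log(rng.uniform()) < b * (log_target(y) - log_target(x[c])):
                    x[c] = y
            if rng.uniform() < 0.1:               # occasionally propose swapping the two chain states
                log_r = (betas[0] - betas[1]) * (log_target(x[1]) - log_target(x[0]))
                if np.log(rng.uniform()) < log_r:
                    x[0], x[1] = x[1], x[0]
            cold[i] = x[0]                        # only the cold chain is used to approximate pi
        return cold

    cold = mc3()
    print("cold-chain mean ~", cold[5_000:].mean())   # close to 8/12 = 0.667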

46 MCMC. References.
- M. A. Beaumont and B. Rannala, The Bayesian revolution in genetics, Nat. Rev. Genetics 5 (2004).
- A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian Data Analysis, second ed., Chapman & Hall/CRC.
- P. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, Cambridge University Press.
- P. Lemey, M. Salemi, and A.-M. Vandamme (eds.), The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing, second ed., Cambridge University Press.
- S. P. Otto and T. Day, A Biologist's Guide to Mathematical Modeling in Ecology and Evolution, Princeton University Press.
- C. P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer-Verlag, 2004.
