Bayesian Phylogenetics
Bret Larget
Departments of Botany and of Statistics
University of Wisconsin–Madison
October 6, 2011

Who was Bayes?

The Reverend Thomas Bayes was born in London in 1702. He was the son of one of the first Nonconformist ministers to be ordained in England. He became a Presbyterian minister in the late 1720s, but was well known for his studies of mathematics. He was elected a Fellow of the Royal Society of London in 1742. He died in 1761, before his works were published.

What is Bayes Theorem?

Bayes Theorem explains how to calculate inverse probabilities. For example, suppose that Box B_1 contains four balls, three of which are black and one of which is white. Box B_2 has four balls, two of which are black and two of which are white. Box B_3 has four balls, one of which is black and three of which are white. [Diagram: the three boxes and their balls.]

If a ball is chosen uniformly at random from Box B_1, there is a 3/4 chance that it is black. But if a black ball is drawn, how likely is it that it came from Box B_1?

To answer this question, we need to have prespecified probabilities of which box we pick to draw the ball from. The answer will be different if we believe a priori that Box B_1 is 10% likely to be the chosen box than if we believe that all three boxes are equally likely. Do the problem with a probability tree.
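To make the answer concrete, here is a minimal sketch in Python of the equal-priors case (the 1/3 priors are one of the two prior choices the slide mentions; the variable names are mine):

```python
# Posterior probability that the drawn black ball came from each box,
# assuming all three boxes are a priori equally likely.
priors = {"B1": 1/3, "B2": 1/3, "B3": 1/3}
black = {"B1": 3/4, "B2": 2/4, "B3": 1/4}   # P(black | box)

# Bayes Theorem: posterior is proportional to likelihood times prior.
unnormalized = {b: black[b] * priors[b] for b in priors}
p_black = sum(unnormalized.values())         # P(black) = 1/2 here
posterior = {b: w / p_black for b, w in unnormalized.items()}

print(posterior)   # B1: 1/2, B2: 1/3, B3: 1/6
```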

Bayes Theorem

Bayes Theorem states that if a complete list of mutually exclusive events B_1, B_2, ... have prior probabilities P(B_1), P(B_2), ..., and if the likelihood of the event A given event B_i is P(A | B_i) for each i, then

    P(B_i \mid A) = \frac{P(A \mid B_i)\, P(B_i)}{\sum_j P(A \mid B_j)\, P(B_j)}

The posterior probability of B_i given A, written P(B_i | A), is proportional to the product of the likelihood P(A | B_i) and the prior probability P(B_i), where the normalizing constant P(A) = \sum_j P(A \mid B_j)\, P(B_j) is the prior probability of A.

Connection to Phylogeny

In a Bayesian approach to phylogenetics, the boxes are like different tree topologies, only one of which is right. The colored balls are like site patterns, except that there are many more than two varieties and we are able to observe multiple independent draws from each box. Things are further complicated in that additional parameters, such as branch lengths and likelihood model parameters, affect the likelihood but are also unknown.

Prior and Posterior Distributions

A prior distribution is a probability distribution on parameters before any data are observed. A posterior distribution is a probability distribution on parameters after data are observed.

Bayesian Methods vs. Maximum Likelihood

                         Maximum likelihood                 Bayesian
    Probability          Only defined in the context        Describes everything
                         of long-run relative frequencies   that is uncertain
    Parameters           Fixed and unknown                  Random
    Nuisance parameters  Optimize them                      Average over them
    Testing              p-values                           Bayes factors
    Model                Likelihood                         Likelihood and prior
                                                            distribution

Bayesian Phylogenetic Methods

Let's say we want to find the posterior probability of a clade. Then we are interested in computing

    P(\text{clade} \mid \text{data}) = \sum_{\text{trees with clade}} P(\text{tree} \mid \text{data}) = \sum_{\text{trees with clade}} \frac{P(\text{data} \mid \text{tree})\, P(\text{tree})}{P(\text{data})}

But we need to know the parameters, including branch lengths (params), to compute the likelihood:

    \sum P(\text{data} \mid \text{tree})\, P(\text{tree}) = \sum \int P(\text{data}, \text{params} \mid \text{tree})\, P(\text{tree})\, d\text{params} = \sum P(\text{tree}) \int P(\text{data} \mid \text{params}, \text{tree})\, P(\text{params} \mid \text{tree})\, d\text{params}

So, we need to compute:

    P(\text{clade} \mid \text{data}) = \frac{\sum_{\text{tree}} P(\text{tree}) \int P(\text{data} \mid \text{params}, \text{tree})\, P(\text{params} \mid \text{tree})\, d\text{params}}{P(\text{data})}

However, P(data) is generally not computable. Solution? Markov chain Monte Carlo.

Metropolis-Hastings Example

Assume a Jukes-Cantor likelihood model for two species where we observe 50 sites, 9 of which differ. The likelihood for the distance d is

    L(d) = \left(\tfrac{1}{4}\right)^{50} \left(\tfrac{1}{4} - \tfrac{1}{4} e^{-4d/3}\right)^{9} \left(\tfrac{1}{4} + \tfrac{3}{4} e^{-4d/3}\right)^{41}

Assume a prior for d with the form

    p(d) = \frac{\lambda}{(1 + \lambda d)^2}, \quad d > 0

where λ > 0 is a parameter. This density is what you get if you take the ratio of two independent exponential random variables, one with parameter λ and one with parameter 1. The median is 1/λ, but the mean is +∞.

An exact expression for the posterior density of d is

    p(d \mid x) = \frac{\frac{\lambda}{(1+\lambda d)^2} \left(\tfrac{1}{4}\right)^{50} \left(\tfrac{1}{4} - \tfrac{1}{4} e^{-4d/3}\right)^{9} \left(\tfrac{1}{4} + \tfrac{3}{4} e^{-4d/3}\right)^{41}}{\int_0^\infty \frac{\lambda}{(1+\lambda d)^2} \left(\tfrac{1}{4}\right)^{50} \left(\tfrac{1}{4} - \tfrac{1}{4} e^{-4d/3}\right)^{9} \left(\tfrac{1}{4} + \tfrac{3}{4} e^{-4d/3}\right)^{41} dd}
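A minimal numerical sketch of this example (the function names are mine, and NumPy/SciPy are assumed; the constant (1/4)^50 is dropped since it cancels between numerator and denominator):

```python
import numpy as np
from scipy.integrate import quad

N, X, LAM = 50, 9, 10.0   # sites, differing sites, prior parameter lambda

def jc_likelihood(d):
    """Jukes-Cantor likelihood for two sequences (N sites, X differences).
    The constant (1/4)^N factor is dropped: it cancels in the posterior."""
    e = np.exp(-4.0 * d / 3.0)
    return (0.25 - 0.25 * e)**X * (0.25 + 0.75 * e)**(N - X)

def prior(d):
    """Prior density lambda / (1 + lambda*d)^2 for d > 0."""
    return LAM / (1.0 + LAM * d)**2

def unnorm_posterior(d):
    return jc_likelihood(d) * prior(d)

# In one dimension the normalizing constant is easy to get by quadrature;
# it is the high-dimensional analogue of this integral that forces MCMC.
Z, _ = quad(unnorm_posterior, 0.0, np.inf)
posterior_density = lambda d: unnorm_posterior(d) / Z
```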

Graph

[Figure: prior and posterior densities for λ = 10 and λ = 1, together with the likelihood, for the Jukes-Cantor example with n = 50, x = 9; the MLE is d = 0.206. Horizontal axis: d from 0.0 to 1.0; vertical axis: density from 0 to 10.]

What is Markov Chain Monte Carlo?

Markov chain Monte Carlo (MCMC) is a method to take (dependent) samples from a distribution. The distribution need only be known up to a constant of proportionality. MCMC is especially useful for computation of Bayesian posterior probabilities: simple summary statistics from the sample converge to posterior probabilities. Metropolis-Hastings is a form of MCMC that works using any Markov chain to propose the next item to sample, but rejecting proposals with a specified probability.

An MCMC Algorithm

1. Start at x_0; set i = 0.
2. Propose x* from the current x_i.
3. Calculate the acceptance probability.
4. Generate a random number.
5. If accepted, set x_{i+1} = x*; if rejected, set x_{i+1} = x_i.
6. Increment i to i + 1.
7. Repeat steps 2 through 6 many times.

Example

We have a function h(θ) from which we want to sample. We only need to know h up to a normalizing constant. [Figure: the target distribution.]
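A minimal sketch of the algorithm above for a one-dimensional target known only up to a constant, using a symmetric normal random-walk proposal (the proposal choice, step size, and function names are my illustrative assumptions):

```python
import numpy as np

def metropolis(h, x0, n_steps, step_sd=0.1, rng=None):
    """Random-walk Metropolis sampler for a 1-d unnormalized density h.

    With a symmetric proposal the Hastings correction cancels, so the
    acceptance probability reduces to min(1, h(x*) / h(x_i)).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0                                        # step 1: start at x0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        x_star = x + rng.normal(0.0, step_sd)     # step 2: propose x*
        accept_prob = min(1.0, h(x_star) / h(x))  # step 3: acceptance probability
        if rng.uniform() < accept_prob:           # step 4: random number
            x = x_star                            # step 5: accepted, x_{i+1} = x*
        samples[i] = x                            # if rejected, x_{i+1} = x_i
    return samples

# Example usage with the Jukes-Cantor posterior sketched earlier
# (guarding against proposals outside d > 0):
# h = lambda d: unnorm_posterior(d) if d > 0 else 0.0
# draws = metropolis(h, x0=0.2, n_steps=10_000)
```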

Initial Point

We begin the Markov chain at a single point and evaluate the value of h at this point. [Figure: the initial point on the target density.]

Proposal Distribution

Given our current state, we have a proposal distribution for the next candidate state. [Figure: the proposal distribution centered at the current state.]

First Proposal

We propose a candidate new point: current state θ, proposed state θ*. This proposal is accepted, with acceptance probability 1. [Figure: θ and θ* on the target density.]

Second Proposal

The proposal was accepted, so the proposed state becomes the current state. We make another proposal: current state θ, proposed state θ*. This proposal is rejected; its acceptance probability was 0.153. [Figure: θ and θ* on the target density.]
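For reference, the general Metropolis-Hastings acceptance probability for target h and proposal density q is a standard identity not spelled out on these slides:

    \alpha(\theta \to \theta^*) = \min\!\left(1,\; \frac{h(\theta^*)\, q(\theta \mid \theta^*)}{h(\theta)\, q(\theta^* \mid \theta)}\right)

With a symmetric proposal the q terms cancel, so a move uphill in h is accepted with probability 1 (as in the first proposal), while a downhill move is accepted with probability h(θ*)/h(θ); presumably that ratio is the source of the 0.153 here and the 0.536 on the next slide.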

Third Proposal

The previous proposal was rejected, so the current state is sampled again and remains current. We make another proposal: current state θ, proposed state θ*. This proposal is accepted, with acceptance probability 0.536. [Figure: θ and θ* on the target density.]

Beginning of Sample

The first four sample points. Vertical position is random, to separate points that land at the same location. [Figure: the sample so far.]

Larger Sample

Repeat this for 10,000 proposals and show the sample. [Figure: the large sample.]

Comparison to Target

[Figure: the sample compared with the target distribution.]

Things to Note

The resulting sample mimics the target distribution very well. The shape of the proposal distribution did not depend on the target distribution at all: almost any type of proposal method would have worked. There is a lot of autocorrelation: MCMC produces dependent samples. The acceptance probabilities depend on the proposal distributions and the relative values of the target. Summaries of the sample are good estimates of the corresponding target quantities:

- The sample mean converges to the mean of the target.
- The sample median converges to the median of the target.
- The sample tail area above 1.0 converges to the relative area above 1.0 in the target.

MCMC for Phylogenetics

The model parameters for a Bayesian phylogenetics analysis typically include a tree (topology and branch lengths) and substitution process parameters. Most often, multiple MCMC methods are used in combination. For example, methods may:

- Adjust the stationary distribution, leaving other things fixed;
- Adjust the rates, leaving the tree fixed;
- Adjust some branch lengths, leaving the topology and Q fixed;
- Adjust the tree in a small region, leaving the rest of the tree fixed;

and so on.

Cautions

It is important to discard an initial portion of the sample as burn-in. The MCMC sampler must be run for a long time after reaching stationarity. It is good practice to make several independent runs to assess agreement; chains can get stuck in local regions, leading to inaccurate inferences. Problems with many taxa or very long sequences are more likely to have computational problems.
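A sketch of the post-processing these cautions describe, reusing metropolis() and unnorm_posterior() from the earlier sketches (the burn-in length, seeds, and 0.3 tail threshold are arbitrary illustrative choices):

```python
import numpy as np

# Target: the Jukes-Cantor posterior, up to a constant, guarded for d > 0.
h = lambda d: unnorm_posterior(d) if d > 0 else 0.0

# Several independent runs from different seeds, as the slide advises.
chains = [metropolis(h, x0=0.2, n_steps=10_000,
                     rng=np.random.default_rng(seed))
          for seed in (1, 2, 3)]

burnin = 1_000                       # discard an initial portion as burn-in
kept = [c[burnin:] for c in chains]

for i, c in enumerate(kept, start=1):
    # Simple summaries of the retained sample estimate the corresponding
    # posterior quantities: mean, median, and tail areas.
    print(f"run {i}: mean={c.mean():.3f}  median={np.median(c):.3f}  "
          f"P(d > 0.3 | data) ~ {np.mean(c > 0.3):.3f}")

# Close agreement across the runs is reassuring; disagreement suggests
# chains stuck in local regions or runs that are too short.
```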