Bayes factors, marginal likelihoods, and how to choose a population model

Bayes factors, marginal likelihoods, and how to choose a population model. Peter Beerli, Department of Scientific Computing, Florida State University, Tallahassee. © 2009 Peter Beerli

Overview
1. Location versus Population
2. Bayes factors: what they are and how to calculate them
3. Marginal likelihoods: what they are and how to calculate them
4. Examples: simulated and real data
5. Resources: replicated runs, cluster computing

Location versus Population [series of figure slides: maps contrasting sampling locations with populations, ending with the question: Location ?= Population]

Model comparison. Several tests exist that establish whether two locations belong to the same population. The test by Hudson and Kaplan (1995) seemed particularly powerful even with a single locus. These days researchers mostly use the program STRUCTURE to establish the number of populations. A procedure that can handle not only panmixia but also all other gene flow models would help.

Model comparison. For example, we want to compare some of these models: 4-parameter, 3-parameter, 2-parameter, 1-parameter. [Figure: candidate gene flow models]

Model comparison. With a criterion such as likelihood we can compare nested models. Commonly we use a likelihood ratio test (LRT) or Akaike's information criterion (AIC) to establish whether phylogenetic trees are statistically different, whether mutation models have an effect on the outcome, etc. Kass and Raftery (1995) popularized the Bayes factor as a Bayesian alternative to the LRT.

Bayes factor. In a Bayesian context we could look at the posterior odds ratio or, equivalently, the Bayes factor:

$$p(M_1 \mid X) = \frac{p(M_1)\, p(X \mid M_1)}{p(X)}$$

$$\mathrm{BF} = \frac{p(X \mid M_1)}{p(X \mid M_2)}, \qquad \frac{p(M_1 \mid X)}{p(M_2 \mid X)} = \frac{p(M_1)}{p(M_2)} \cdot \frac{p(X \mid M_1)}{p(X \mid M_2)}$$

$$\mathrm{LBF} = 2 \ln \mathrm{BF} = 2 \ln \left( \frac{p(X \mid M_1)}{p(X \mid M_2)} \right)$$

The magnitude of BF gives us evidence against hypothesis M_2. For $z = 2 \ln \mathrm{BF}$:

0 < z < 2: no real difference
2 < z < 6: positive
6 < z < 10: strong
z > 10: very strong
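As a minimal sketch (the function names are ours; the thresholds are Kass and Raftery's), this translates directly into code. The example values are log marginal likelihoods of the magnitude reported in the whale table at the end of the talk:

```python
def lbf(log_ml_m1: float, log_ml_m2: float) -> float:
    """LBF = 2 ln BF = 2 (ln p(X|M1) - ln p(X|M2))."""
    return 2.0 * (log_ml_m1 - log_ml_m2)

def evidence_against_m2(z: float) -> str:
    """Kass and Raftery (1995) scale for z = 2 ln BF."""
    if z < 2:
        return "no real difference"
    if z < 6:
        return "positive"
    if z < 10:
        return "strong"
    return "very strong"

# Two hypothetical models with log marginal likelihoods -1793 and -1856:
z = lbf(-1793.0, -1856.0)
print(z, evidence_against_m2(z))  # 126.0 -> very strong evidence against M2
```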

Marginal likelihood. So why are we not all running BF analyses instead of the AIC, BIC, or LRT? Typically, it is rather difficult to calculate the marginal likelihood with good accuracy, because most often we only approximate the posterior distribution using Markov chain Monte Carlo (MCMC). In MCMC we only need ratios of posterior densities (differences on the log scale), and therefore we never need to calculate the denominator of the posterior distribution $p(\Theta \mid X, M)$:

$$p(\Theta \mid X, M) = \frac{p(\Theta)\, p(X \mid \Theta)}{p(X \mid M)} = \frac{p(\Theta)\, p(X \mid \Theta)}{\int_\Theta p(\Theta)\, p(X \mid \Theta)\, d\Theta}$$

where $p(X \mid M)$ is the marginal likelihood.
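The integral in the denominator is the hard part. As a toy illustration only (a conjugate beta-binomial model, not the coalescent genealogy model of this talk), the sketch below computes $\ln p(X \mid M)$ once from its closed form and once by brute-force quadrature over the single parameter; with genealogies G there is no such grid, which is why the MCMC-based estimators on the following slides are needed:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.special import betaln, gammaln

# Toy model (assumption: stands in for the real coalescent likelihood):
# X = k successes in n Bernoulli trials, theta ~ Beta(a, b) prior.
k, n, a, b = 7, 10, 1.0, 1.0

def log_binom(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

def log_marginal_exact():
    # ln p(X|M) = ln C(n,k) + ln B(k+a, n-k+b) - ln B(a, b)
    return log_binom(n, k) + betaln(k + a, n - k + b) - betaln(a, b)

def log_marginal_quadrature(grid=100_001):
    theta = np.linspace(1e-9, 1.0 - 1e-9, grid)
    log_prior = (a - 1) * np.log(theta) + (b - 1) * np.log(1 - theta) - betaln(a, b)
    log_lik = log_binom(n, k) + k * np.log(theta) + (n - k) * np.log(1 - theta)
    # integrate p(theta) p(X|theta) d theta, stabilized on the log scale
    integrand = log_prior + log_lik
    m = integrand.max()
    return m + np.log(trapezoid(np.exp(integrand - m), theta))

print(log_marginal_exact(), log_marginal_quadrature())  # both about -2.398
```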

Harmonic mean estimator (marginal likelihood). [Common approximation, used in programs such as MrBayes and Beast.] The harmonic mean estimator applied to our specific problem can be described using an importance sampling approach:

$$p(X \mid M) = \frac{\int_G p(X \mid G, M)\, p(G)\, dG}{\int_G p(G)\, dG}$$

which, after some shuffling with expectations, is approximated by

$$p(X \mid M) \approx \left( \frac{1}{n} \sum_{j=1}^{n} \frac{1}{p(X \mid G_j, M)} \right)^{-1}, \qquad G_j \sim p(G \mid X, M), \qquad l_H = \ln p(X \mid M).$$

"The Harmonic Mean of the Likelihood: Worst Monte Carlo Method Ever" (Radford Neal, 2008)
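A sketch of the estimator on a toy conjugate model where the exact answer is known (the names and the beta-binomial stand-in are our assumptions, not the talk's coalescent machinery). Rerunning with different seeds or sample sizes shows the instability Neal's quip refers to:

```python
import numpy as np
from scipy.special import betaln, gammaln

rng = np.random.default_rng(1)
k, n, a, b = 7, 10, 1.0, 1.0  # toy beta-binomial stand-in for p(X|G, M)

def log_lik(theta):
    return (gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
            + k * np.log(theta) + (n - k) * np.log(1 - theta))

# Exact log marginal likelihood, for reference
exact = (gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
         + betaln(k + a, n - k + b) - betaln(a, b))

def harmonic_mean(n_samples):
    # G_j ~ p(G|X, M): here the posterior is Beta(k+a, n-k+b) in closed form
    theta = rng.beta(k + a, n - k + b, size=n_samples)
    neg_ll = -log_lik(theta)  # ln(1 / p(X|G_j, M))
    m = neg_ll.max()          # log-sum-exp for numerical stability
    return -(m + np.log(np.mean(np.exp(neg_ll - m))))

for size in (100, 10_000, 1_000_000):
    print(f"{size:>9d} samples: l_H = {harmonic_mean(size):.4f}  (exact {exact:.4f})")
```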

Thermodynamic integration (marginal likelihood).

$$l_T = \ln p(X \mid M_i) = \int_0^1 E\left[\ln p_t(X \mid M_i)\right] dt$$

which we approximate using the trapezoidal rule on $t_0 = 0 < t_1 < \ldots < t_n = 1$, with

$$E\left[\ln p_t(X \mid M_i)\right] \approx \frac{1}{m} \sum_{j=1}^{m} \ln p_t(X \mid G_j, M_i), \qquad G_j \sim p_t(G \mid X, M_i).$$

Path sampling: Gelman and Meng (1998), Friel and Pettitt (2007, 2009). Phylogeny: Lartillot and Philippe (2006), Wu et al. (2011), Xie et al. (2011) [Paul Lewis]. Population genetics: Beerli and Palczewski (2010).
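Under the same toy assumptions as before (a conjugate beta-binomial in place of the coalescent, so the power posterior $p_t \propto p(\theta)\, p(X \mid \theta)^t$ can be sampled directly rather than with heated MCMC chains), a minimal sketch of the thermodynamic estimate:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.special import betaln, gammaln

rng = np.random.default_rng(2)
k, n, a, b = 7, 10, 1.0, 1.0  # toy beta-binomial stand-in

def log_lik(theta):
    return (gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
            + k * np.log(theta) + (n - k) * np.log(1 - theta))

def expected_log_lik(t, m=100_000):
    # Power posterior p_t(theta|X) ~ p(theta) p(X|theta)^t; for this conjugate
    # toy that is Beta(t*k + a, t*(n-k) + b), so no heated chain is needed.
    theta = rng.beta(t * k + a, t * (n - k) + b, size=m)
    theta = np.clip(theta, 1e-12, 1.0 - 1e-12)  # guard the log at t = 0
    return log_lik(theta).mean()

# Trapezoidal rule over inverse temperatures 0 = t_0 < t_1 < ... < t_n = 1
ts = np.linspace(0.0, 1.0, 33)
l_T = trapezoid([expected_log_lik(t) for t in ts], ts)

exact = (gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
         + betaln(k + a, n - k + b) - betaln(a, b))
print(f"l_T = {l_T:.4f}  (exact {exact:.4f})")
```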

Thermodynamic integration (marginal likelihood). [Series of figure slides: the expected log likelihood ln L (roughly -2800 to -1800) plotted against the inverse temperature t from 0.0 to 1.0; the marginal likelihood $l_T$ is the area under this curve.]

Simulation results: Bayes factor. [Series of figure slides: distributions of $\mathrm{LBF} = 2 \ln \mathrm{BF} = 2 \ln\left(p(X \mid M_1)/p(X \mid M_2)\right)$ for the simulated data sets.]

Bayes factor: simulation results.

$$\mathrm{LBF} = 2 \ln\left(\frac{p(X \mid M_1)}{p(X \mid M_2)}\right)$$

Percent of models rejected/accepted:

Param.       4     3     3     2     1     3     2     2
Model      xxxx  xmmx  mxxm  mmmm    x   x0xx  m0xm  mx0m
Rejected    100   100   100   100    97    71    46    29
Accepted      0     0     0     0     3    29    54    71

Data: 20 sequences with length of 1000 bp. Parameters used to generate the data: $\Theta_i = 4 N_e^{(i)} \mu$; $M_{ji} = m_{ji}/\mu$; $\Theta_1 = 0.005$; $\Theta_2 = 0.01$; $Nm = \Theta M / 4$; $M_{12} = 100$; $M_{21} = 0$.

Bayes factor: influence of run length. [Figure slides, harmonic mean estimator: $\mathrm{LBF} = 2 \ln(p(X \mid M_1)/p(X \mid M_2))$ against relative run length (1 to 256) for 4, 16, and 32 heated chains; the estimates scatter over roughly -50 to 50. Run length $\rho \cdot 2^x$ with $\rho = 10^4 + 2 \cdot 10^4$; run time 17 to 350 sec.]

Bayes factor: influence of run length. [Figure slides, thermodynamic integration: LBF against relative run length (1 to 256) for 4 heated chains, 4 heated chains + Bézier approximation, 16 heated chains, and 32 heated chains; the estimates stay within about -5 to 5. Run length $\rho \cdot 2^x$ with $\rho = 10^4 + 2 \cdot 10^4$.]

Bayes factor: influence of run length, comparison. [Two-panel figure: (A) harmonic mean estimator, LBF axis -50 to 50; (B) thermodynamic integration, LBF axis -5 to 5, with 4 heated chains, 4 heated chains + Bézier, 16 heated chains, 32 heated chains; both plotted against relative run length 1 to 256.]

Humpback whales in the South Atlantic: real data.

Whale example. Using marginal likelihoods $\hat{\ell}_{M_i} = \ln p(X \mid M_i)$ to rank models $M_i$ (nine candidate models; the model labels were lost in transcription):

Replicate^1     ln p(X | M_i) for the nine candidate models
1 (10)        -1988  -1958  -1984  -2009  -2054  -1935  -2070  -1793  -2015
1 (10)^2      -1988  -1958  -1984  -2009  -2054  -1936  -2070  -1793  -2002
2 (10)        -2034  -2005  -2030  -2056  -2099  -1985  -2134  -1856  -2071
3 (30)        -3669  -3519  -3630  -3735  -3983  -3454  -3689  -2725  -3028
Rank              5      3      4      6      8      2      9      1      7

^1 Number of samples per population in parentheses. ^2 Same data as in replicate 1, but different start values of the MCMC run.

Whale example, continued. The same marginal likelihood table, shown with a revised ranking of the nine models:

Rank              7      5      6      8      9      3      4      1      2