Bayesian inference and model selection for stochastic epidemics and other coupled hidden Markov models

Similar documents
Efficient model comparison techniques for models requiring large scale data augmentation

Approximate Bayesian Computation: a simulation based approach to inference

Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models.

Statistical Inference for Stochastic Epidemic Models

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

Metropolis-Hastings Algorithm

Bayesian Methods for Machine Learning

Markov Chain Monte Carlo

Computer intensive statistical methods

Computational statistics

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

Introduction to Machine Learning CMU-10701

Markov Chain Monte Carlo methods

Monte Carlo in Bayesian Statistics

Bayesian Inference for Contact Networks Given Epidemic Data

MCMC for non-linear state space models using ensembles of latent sequences

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Approximate Bayesian Computation and Particle Filters

Markov Chain Monte Carlo in Practice

State Space and Hidden Markov Models

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration

MCMC and Gibbs Sampling. Kayhan Batmanghelich

Fast Likelihood-Free Inference via Bayesian Optimization

ST 740: Markov Chain Monte Carlo

A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation

The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland),

Basic math for biology

Hmms with variable dimension structures and extensions

Reducing The Computational Cost of Bayesian Indoor Positioning Systems

Lecture 6: Markov Chain Monte Carlo

MH I. Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution

Bayesian Model Comparison:

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17

An introduction to Sequential Monte Carlo

Bayesian inference for stochastic multitype epidemics in structured populations using sample data

an introduction to bayesian inference

ComputationalToolsforComparing AsymmetricGARCHModelsviaBayes Factors. RicardoS.Ehlers

Reminder of some Markov Chain properties:

MCMC Sampling for Bayesian Inference using L1-type Priors

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J.

Bayesian Inference for Discretely Sampled Diffusion Processes: A New MCMC Based Approach to Inference

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

Dynamic models. Dependent data The AR(p) model The MA(q) model Hidden Markov models. 6 Dynamic models

Downloaded from:

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL

Inference for partially observed stochastic dynamic systems: A new algorithm, its theory and applications

Bayesian model selection in graphs by using BDgraph package

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

CPSC 540: Machine Learning

Contents. Part I: Fundamentals of Bayesian Inference 1

A generalization of the Multiple-try Metropolis algorithm for Bayesian estimation and model selection

Computer intensive statistical methods

A Review of Pseudo-Marginal Markov Chain Monte Carlo

An introduction to Approximate Bayesian Computation methods

arxiv: v1 [stat.co] 13 Oct 2017

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

Learning the hyper-parameters. Luca Martino

SAMPLING ALGORITHMS. In general. Inference in Bayesian models

STA 4273H: Statistical Machine Learning

Online appendix to On the stability of the excess sensitivity of aggregate consumption growth in the US

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

Pseudo-marginal MCMC methods for inference in latent variable models

Lecture 4: Dynamic models

Multistate Modeling and Applications

Likelihood-free inference and approximate Bayesian computation for stochastic modelling

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Lecture 8: Bayesian Estimation of Parameters in State Space Models

A Generic Multivariate Distribution for Counting Data

Modeling Ultra-High-Frequency Multivariate Financial Data by Monte Carlo Simulation Methods

Theory of Stochastic Processes 8. Markov chain Monte Carlo

The Origin of Deep Learning. Lili Mou Jan, 2015

Notes on pseudo-marginal methods, variational Bayes and ABC

Probabilistic Graphical Networks: Definitions and Basic Results

POMP inference via iterated filtering

Scaling up Bayesian Inference

Learning Energy-Based Models of High-Dimensional Data

Bayesian phylogenetics. the one true tree? Bayesian phylogenetics

Estimating marginal likelihoods from the posterior draws through a geometric identity

Principles of Bayesian Inference

An ABC interpretation of the multiple auxiliary variable method

Markov Chain Monte Carlo (MCMC)

Bayesian model selection for computer model validation via mixture model estimation

Paul Karapanagiotidis ECO4060

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Markov Chain Monte Carlo

Probabilistic Graphical Models

High-dimensional Problems in Finance and Economics. Thomas M. Mertens

Tutorial on ABC Algorithms

The Ising model and Markov chain Monte Carlo

Bayesian Estimation with Sparse Grids

Introduction to Restricted Boltzmann Machines

Bayesian Inference and MCMC

PREDICTIVE MODELING OF CHOLERA OUTBREAKS IN BANGLADESH

Sampling Rejection Sampling Importance Sampling Markov Chain Monte Carlo. Sampling Methods. Oliver Schulte - CMPT 419/726. Bishop PRML Ch.

Kernel Sequential Monte Carlo

Likelihood-free MCMC

Transcription:

Bayesian inference and model selection for stochastic epidemics and other coupled hidden Markov models (with special attention to epidemics of Escherichia coli O157:H7 in cattle) Simon Spencer 3rd May 2016

Acknowledgements Bärbel Finkenstädt Rand Peter Neal TJ McKinley Nigel French, Tom Besser and Rowland Cobbold Panayiota Touloupou

Outline 1. Introduction 2. Bayesian inference for epidemics 3. Model selection for epidemics 4. Scalable inference for epidemics 5. Conclusion

Introduction

Introduction A typical epidemic model: Susceptible Exposed Infected Removed Infections occur according to an inhomogeneous Poisson process with rate S(t)I (t).

A simulation 0 20 40 60 80 100 Susceptible Exposed Infected Removed 0 20 40 60 80 100 time

Comments Statistical inference for epidemic models is hard. Intractable likelihood need to know infection times. Usual solution: large scale data augmentation MCMC. What are the observed data?

Epidemic data Historically: final size (single number). Final size in many sub-populations, e.g. households. Markov models: removal times. Who is removed is not needed / recorded.

Epidemic data Historically: final size (single number). Final size in many sub-populations, e.g. households. Markov models: removal times. Who is removed is not needed / recorded. Individual level diagnostic test results. To be realistic, tests are imperfect. Temporal resolution of 1 day.

Epidemic data Historically: final size (single number). Final size in many sub-populations, e.g. households. Markov models: removal times. Who is removed is not needed / recorded. Individual level diagnostic test results. To be realistic, tests are imperfect. Temporal resolution of 1 day. View epidemic as hidden Markov model

Motivating example: Escherichia coli O157 E. coli O157 is a highly pathogenic form of Escherichia coli. It can cause severe gastroentestinal illness, haemorrhagic diarrhoea and even death. Outbreaks and endemic cases are associated with food, water or direct contact with infected animals. Cattle are the main reservoir. Additional economic burden due to impacts on trade.

Study design Natural colonization and faecal excretion of E. coli O157 in commercial feedlot. 20 pens containing 8 calves were sampled 27 times over a 99 day period. Each sampling event included a faecal pat sample and a recto-anal mucosal swab (RAMS). Tests were assumed to have perfect specificity but imperfect sensitivity.

Patterns of infection Positive Tests, Pen 5 (South) Animal 1 2 3 4 5 6 7 8 RAMS Faecal Negative 0 20 40 60 80 100 Time (days)

Patterns of infection Positive Tests, Pen 7 (North) Animal 1 2 3 4 5 6 7 8 RAMS Faecal Negative 0 20 40 60 80 100 Time (days)

Bayesian inference for epidemics

Bayesian inference for epidemics Intractable likelihood: π(y θ). Need to impute infection status of individuals x for augmented likelihood π(y x, θ). Missing data x typically very high dimensional.

Updating the infection status Standard method by O Neill and Roberts (1999) involves 3 steps: 1 Add a period of infection 2 Remove a period of infection 3 Move an end-point of a period of infection This method was designed for SIR models (where individuals can t be infected twice). Easily adapted to discrete time models.

Add a period of infection Current: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Propose: 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 Choose a block of zeros at random. 2 Propose changing zeros to ones. 3 Accept or reject based on ratio of posteriors.

Remove a period of infection Current: 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 Propose: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 Choose a complete block of ones. 2 Propose changing ones to zeros. 3 Accept or reject based on ratio of posteriors.

Move an endpoint Current: 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 Propose: 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 1 Choose an endpoint of a block of ones. 2 Propose a new location for that endpoint. 3 Accept or reject based on ratio of posteriors.

Some pros and cons! Considerably fast! Can handle non-markov models % Most of the hidden states are not updated % High degree of autocorrelation Slow mixing of the chain and long run length % Tuning of the maximum block length required.

Alternative approach: FFBS Discrete time epidemic is a hidden Markov model. Gibbs step: sample from the full condition distribution of the hidden states. Use Forward Filtering Backward Sampling algorithm (Carter and Kohn, 1994).

Some pros and cons! Very good mixing of the MCMC chains! No tuning required % Computationally intensive At each timepoint we need to calculate N C summations O(TN 2C ) % High memory requirements All T forward variables must be stored The transition matrix is of dimension N C N C N = number of infection states (e.g. 2) C = number of cows (e.g. 8) T = number of timepoints (e.g. 99)

Example: SIS model Stochastic SIS (Susceptible-Infected-Susceptible) transmission model in discrete time. 1 X p,i,t infection status for animal i in pen p on day t. X p,i,t = 1 infected/colonized. X p,i,t = 0 uninfected/susceptible. We treat X p,i,t as missing data and infer it using MCMC. Epidemic model parameters updated via Metropolis-Hastings and test sensitivities updated using Gibbs. 1 Spencer et al. (2015) Super or just above average? Supershedders and the transmission of Escherichia coli O157:H7 among feedlot cattle. Interface 12, 20150446.

Colonization probability: P X p,i,t+1 = 1 X p,i,t = 0 = 1 exp α β 8 j=1 X p,j,t ρ I(S p,j,t > τ) Susceptible X p,i,t = 0 Colonized X p,i,t = 1 Colonization duration: NegativeBinomial(r, μ) Pens: p = 1 20 Animals: i = 1 8 Time: t = 1 99 days

Example: Posterior infection probabilities Posterior colonization probability 0.0 0.5 1.0 RAMS Faecal Pen 5 Animal 5 0 20 40 60 80 100 We can calculate the posterior infection probability for every day of the study. Time (days) Pen 5 Pen 8 Animal 1 2 3 4 5 6 7 8 Animal 1 2 3 4 5 6 7 8 0 20 40 60 80 100 0 20 40 60 80 100 Time (days) Time (days)

Model selection for epidemics

Model selection for epidemics A lot of epidemiologically interesting questions take the form of model selection questions. What is the transmission mechanism of this disease? Do infected individuals really exhibit an exposed period? Do water troughs spread E. coli O157?

Posterior probabilities and marginal likelihoods Would like the posterior probability in favour of model i. P(M i y) = π(y M i)p(m i ) j π(y M j)p(m j )

Posterior probabilities and marginal likelihoods Would like the posterior probability in favour of model i. P(M i y) = π(y M i)p(m i ) j π(y M j)p(m j ) Equivalently, the Bayes factor comparing models i and j. B ij = π(y M i) π(y M j )

Posterior probabilities and marginal likelihoods Would like the posterior probability in favour of model i. P(M i y) = π(y M i)p(m i ) j π(y M j)p(m j ) Equivalently, the Bayes factor comparing models i and j. B ij = π(y M i) π(y M j ) All we need is the marginal likelihood, π(y M i ) = π(y θ, M i )π(θ M i ) dθ but how can we calculate it?

Marginal likelihood estimation Many existing approaches: Chib s method Power posteriors Harmonic mean Bridge sampling Most direct approach: importance sampling. Use asymptotic normality of the posterior to find efficient proposal. Dr Peter Neal But how to deal with the missing data?

Marginal likelihood estimation using importance sampling 1 Run MCMC as usual. 2 Fit normal distribution to posterior samples 2 q(θ). 3 Draw N samples from q(θ). π(y) = π(y θ)π(θ) dθ. 2 To avoid problems, make q overdispersed relative to the posterior.

Marginal likelihood estimation using importance sampling 1 Run MCMC as usual. 2 Fit normal distribution to posterior samples 2 q(θ). 3 Draw N samples from q(θ). π(y) N i=1 π(y θ i )π(θ i ). q(θ i ) 2 To avoid problems, make q overdispersed relative to the posterior.

Marginal likelihood estimation with missing data 1 Run MCMC as usual. 2 Fit normal distribution to posterior samples q(θ). 3 Draw N samples from q(θ). 4 For each sampled θ i draw missing data x i from the full conditional using FFBS. π(y) N i=1 π(y x i, θ i ) π(x i θ i ) π(θ i ). π(x i y, θ i ) q(θ i )

Simulation study: pneumococcol carriage Panayiota performed a thorough simulation study 3 based on Melegaro at al. (2004). Household based longitudinal study on carriage of Streptococcus Pneumoniae. Data consist of repeated diagnostic tests. Multi-type model with 11 parameters, 2600 observed data and 6500 missing data. 3 Touloupou et al. (2016) Model comparison with missing data using MCMC and importance sampling. arxiv 1512.04743

Results: marginal likelihood estimation -1238-1237 -1236-1235 -1234-1233 -1232-1231 -1230-1237.5-1237.25-1237 IS N1 IS N2 IS N3 IS N4 IS mix IS t4 IS t6 IS t8 IS t10 Chib PP HM -931-929 -927-925 -923-921 -919 Log marginal likelihood

Results: Bayes factor estimation Do adults and children acquire infection at the same rate? M 1 : k A k C M 2 : k A = k C 2 3 4 5-4 -3-2 -1 0 IS mix Chib RJcor RJ IS mix Chib RJcor RJ IS mix PP HM IS mix PP HM 2 5 8 11 15 19 23 Log B 12 (a) Data simulated from model M 1-22 -16-10 -4 0 4 8 Log B 12 (b) Data simulated from model M 2

Results: Evolution of the log Bayes factor Initial MCMC run for IS and Chib RJ pilot + RJcor burn in Log B12-2 0 2 4 6 8 10 12 HM PP RJcor Chib IS 0 30 60 90 120 150 180 210 240 Time (minutes)

Application 1: E. coli O157 in feedlot cattle Do animals develop immunity over time? We compare two models for infection period: Geometric: lack of memory. Negative Binomial: probability of recovery depends on duration of infection. The Negative Binomial is a generalisation of the Geometric: Setting Negative Binomial dispersion parameter κ = 1 leads to Geometric.

Application 1: Results Log B NG 0.0 0.5 1.0 1.5 IS RJMCMC 0 30 60 90 120 150 180 Time (in minutes) RJMCMC and IS agree on the estimate of the Bayes factor IS estimator: faster convergence Bayes factor supports the Negative Binomial model The longer the colonization, the greater the probability of clearance may indicate an immune response in the host

Application 2: Role of pen area/location 1 Pen 14 Pen 15 North Pen 6 Pen Set Pen 7 North = small Pen Size Pen 8 6m 17m Pen 9 Pen 10 Pen 16 Pen 17 Pen 18 Pen 19 Pen 20 South = big Supplement and Premix Storage Catch pens Scale from scale Pen 1 House house Pen 2 Pen 11 South Pen Pen 12 Set Pen 3 Pen 13 Pen Size Pen 4 6m 37m Pen 5

Application 2: Role of pen area/location Do north and south pens have different risk of infection? Allow different external (α s, α n ) and/or within-pen (β s, β n ) transmission rates. Candidate models: External Within-pen Model North South North South 1 α n α s β n β s 2 α α β n β s 3 α n α s β β 4 α α β β

Application 2: Posterior probabilities Posterior Probability 0.8 0.6 0.4 0.2 0.0 1 2 3 4 Model Method IS RJMCMC RJMCMC and IS provide identical conclusions. Evidence to support different within-pen transmission rates. Animals in smaller pens more at risk of within-pen infection

Application 3: Investigating transmission between pens Additional dataset: pens adjacent in a 12 2 rectangular grid. No direct contact across feed buck. Shared waterers between pairs of adjacent pens. Pen 24 Pen 23 Pen 22 Pen 21 Pen 20 Pen 19 Pen 18 Pen 17 Pen 16 Pen 15 Pen 14 Pen 13 Pen 1 Pen 2 Pen 3 Pen 4 Pen 5 Pen 6 Pen 7 Pen 8 Pen 9 Pen 10 Pen 11 Pen 12

Application 3: Investigating transmission between pens Do waterers spread infection? (a) Model 1: No contacts between pens (b) Model 2: Transmission via a waterer (c) Model 3: Transmission via any boundary

Application 3: Posterior probabilities Posterior Probability 0.6 0.5 0.4 0.3 0.2 RJMCMC: hard to design jump mechanism Using IS results still possible. Evidence for transmission between pens sharing a waterer rather than another boundary. 1 2 3 Model

Scalable inference for epidemics

Scalable inference for epidemics Thus far we have been doing inference for small populations. Households Pens The FFBS algorithm scales very badly with population size. We would like an inference method that scales better with population size.

Graphical representation Diagram of the Markovian epidemic model. Circles are hidden states and rectangles are observed data. Arrows represent dependencies. x [1] t 2 x [1] t 1 x [1] t x [1] t+1 x [1] t+2 x [1] t+3 y [1] t 2 y [1] t y [1] t+3 x [2] t 2 x [2] t 1 x [2] t x [2] t+1 x [2] t+2 x [2] t+3 y [2] t 2 y [2] t y [2] t+3 x [3] t 2 x [3] t 1 x [3] t x [3] t+1 x [3] t+2 x [3] t+3 y [3] t 2 y [3] t y [3] t+3

A new approach the iffbs algorithm Reformulate graph: Sample x [1] t 1 x [1] t x [1] t+1 Update one individual at a time by sampling from the full conditional: P(x [c] 1:T y [1:C] 1:T, x [ c] 1:T, θ). y [1] t x [2] t 1 x [2] t x [2] t+1 View as coupled hidden Markov model x [3] t 1 x [3] t x [3] t+1 y [2] t y [3] t

A new approach the iffbs algorithm Reformulate graph: Sample x [1] t 1 x [1] t x [1] t+1 Update one individual at a time by sampling from the full conditional: P(x [c] 1:T y [1:C] 1:T, x [ c] 1:T, θ). y [1] t x [2] t 1 x [2] t x [2] t+1 View as coupled hidden Markov model x [3] t 1 x [3] t x [3] t+1 y [2] t y [3] t Computational complexity reduced from O(TN 2C ) to O(TCN 2 ). N = number of infection states (e.g. 2) C = number of cows (e.g. 8) T = number of timepoints (e.g. 99)

Comparison of methods Time (in seconds) 0 0.5 1 1.5 2 2.5 3 3.5 Time (in seconds) 0 0.1 0.2 0.3 Spencer's Dong's fullffbs iffbs 3 4 5 6 7 8 9 Animals in pen ACF per iteration 0 0.2 0.4 0.6 0.8 1 3 4 5 6 7 8 9 10 11 Animals in pen 0 5 10 15 20 25 30 Lag

Larger populations Relative speed 0 25 50 75 100 125 150 175 Spencer's Dong's iffbs 100 200 300 400 500 600 700 800 900 1000 Animals in pen

Conclusion

Conclusion FFBS algorithm generates better mixing MCMC for parameter inference. Unlocks direct approach to marginal likelihood estimation. Allows important epidemiological questions to be answered via model selection. iffbs can perform inference in large populations exploits dependence structure in epidemic data.

What I didn t say All of this work (and much more!) has been done by Panayiota. FFBS and iffbs can also be used as a Metropolis-Hastings proposal to fit non-markovian epidemic models. Can we do model selection with iffbs? Power of iffbs allows more complex models to be fitted, e.g. multi-strain epidemic models.

Current work Pen 3 - Animal 1 1.0 0.8 0.6 0.4 0.2 - - - - - - - - - - - -- +C + + - - - - - -- - - - - - - - - - - - - - - -- ++ + C - - - -+ -- - - - Pen 3 - Animal 4 1.0 0.8 0.6 0.4 0.2 O - ++ - - - - - - - +M+ - - - - - - - - -- - - - ++ -+ - + - - - - - + - M- - - - - - - - -- - - - 0.0 1 15 29 43 57 71 85 99 Day 0.0 1 15 29 43 57 71 85 99 Day Pen 3 - Animal 6 1.0 0.8 0.6 0.4 0.2 + - ++ +++++++ + - - - - -+++++ +++++ -++- - -+ - - +O + - + - + -++- ++ + - - - M Pen 3 - Animal 8 1.0 0.8 0.6 0.4 0.2 - - - - - - - - - - - -- - - - ++ +C - - U - + - - - - - - - - - - - - - -- - - - ++ + - - - -- - - - 0.0 1 15 29 43 57 71 85 99 Day 0.0 1 15 29 43 57 71 85 99 Day Serotype A C G M O P T U -