Camacho, A; Kucharski, AJ; Funk, S; Breman, J; Piot, P; Edmunds, WJ (2014) Potential for large outbreaks of Ebola virus disease. Epidemics, 9. pp. 70-8. ISSN 1755-4365. DOI: https://doi.org/10.1016/j.epidem.2014.09.003
Downloaded from: http://researchonline.lshtm.ac.uk/2030883/
Usage Guidelines: Please refer to usage guidelines at http://researchonline.lshtm.ac.uk/policies.html or alternatively contact researchonline@lshtm.ac.uk.
Available under license: http://creativecommons.org/licenses/by/2.5/

Potential for large outbreaks of Ebola virus disease
Electronic supplementary material

A. Camacho (a), A. J. Kucharski (a), S. Funk (a), J. Breman (b), P. Piot (c), W. J. Edmunds (a)

(a) Centre for the Mathematical Modelling of Infectious Diseases, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
(b) Fogarty International Center, National Institutes of Health, United States
(c) London School of Hygiene and Tropical Medicine, London, United Kingdom

Corresponding authors: A. Camacho (anton.camacho@lshtm.ac.uk) and A. J. Kucharski (adam.kucharski@lshtm.ac.uk). These authors contributed equally to the work.

Preprint submitted to Epidemics, September 9, 2014

Contents
S1 MCMC inference
S2 Simulation study
S3 Basic reproduction number
S4 Model comparison

List of Figures
S1 Posterior distribution of parameters
S2 Simulated data
S3 Model fit to simulated data
S4 Posterior distribution of parameters (simulation study)
S5 Course of R(t) (simulation study)
S6 Joint posterior of $R_{0i}$ and $R_{0d}$

List of Tables
S1 Summary of case data
S2 Estimates from simulated data
S3 DIC of tested models

S1. MCMC inference

Model simulation and fitting were performed using the SSM library (Dureau et al., 2013), freely available at https://github.com/jdureau/ssm. This library implements an adaptive Metropolis-Hastings MCMC algorithm with a multivariate normal proposal distribution. The adaptation of the proposal kernel operates in two steps. First, the size of the covariance matrix is adapted at each iteration to achieve an optimal acceptance rate of around 23% (Roberts et al., 1997). Second, its shape is adapted by using the empirical covariance matrix, computed from the accepted samples and updated at each iteration, thus leading to an optimal proposal distribution (Roberts and Rosenthal, 2009).

The SSM library (Dureau et al., 2013) also implements a simplex algorithm, which was used to maximise the (non-normalised) posterior distribution and thus initialise the MCMC close to the mode of the target. Since the simplex algorithm only guarantees convergence to a local maximum, we ran 1000 independent simplex runs initialised from parameter sets sampled from the prior distribution. We selected the run that converged to the highest posterior density value, and used the resulting parameter set to initialise 5 independent MCMC chains of 200,000 iterations. We visually checked that the 5 chains converged to the same stationary distribution and combined them after appropriate burn-in and thinning.

The posterior distribution of the model parameters (Figure S1) and the model fit (Figure 3) were plotted using the R (R Core Team, 2014) package fitr, which is freely available at https://github.com/sbfnk/fitr. Median and 95% CI for all parameters can be found in Table 2.

S2. Simulation study

We checked the inference procedure and assessed potential bias in our estimates by performing the same analysis as in the main paper on a simulated dataset. We set a true parameter vector θ as the maximum a posteriori probability estimate (MAP) obtained from the main analysis. Since our Bayesian inference procedure integrates strong priors for several parameters, we fixed those parameters at the mode of their respective priors (we note however that the posterior and prior distributions almost match for those parameters). As such, we would expect θ to coincide with the MAP of the distribution obtained with the inference procedure.
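As a rough illustration of the two-step adaptation described in Section S1, the following Python sketch adapts first the size and then the shape of a multivariate normal proposal. It is not the SSM library's code: the toy target `log_target`, the function names, the decaying gain `it ** -0.6` and the choice to update the covariance from every chain state (rather than only from accepted proposals) are all assumptions made for the example.

```python
import math
import random


def log_target(x):
    # Hypothetical target (not the Ebola posterior): bivariate normal with
    # unit variances and correlation 0.8, known up to a constant.
    rho = 0.8
    quad = x[0] ** 2 - 2 * rho * x[0] * x[1] + x[1] ** 2
    return -quad / (2 * (1 - rho ** 2))


def cholesky(a):
    # Lower-triangular factor L with L L^T = a (a must be positive definite).
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def adaptive_mh(log_post, x0, n_iter, target_rate=0.23, seed=1):
    rng = random.Random(seed)
    d = len(x0)
    x, lp = list(x0), log_post(x0)
    log_size = 0.0      # step 1: log of the proposal size
    mean = list(x0)     # step 2: running mean and covariance of the chain
    cov = [[float(i == j) for j in range(d)] for i in range(d)]
    chain, n_accepted = [], 0
    for it in range(1, n_iter + 1):
        low = cholesky(cov)
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        size = math.exp(log_size)
        prop = [x[i] + size * sum(low[i][k] * z[k] for k in range(i + 1))
                for i in range(d)]
        lp_prop = log_post(prop)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        accepted = rng.random() < math.exp(min(0.0, lp_prop - lp))
        if accepted:
            x, lp = prop, lp_prop
            n_accepted += 1
        chain.append(list(x))
        # Step 1: grow the size after an acceptance, shrink it after a
        # rejection, with a gain that decays over the iterations.
        log_size += it ** -0.6 * (float(accepted) - target_rate)
        # Step 2: recursive update of the empirical mean and covariance.
        w = 1.0 / (it + 1)
        delta = [x[i] - mean[i] for i in range(d)]
        mean = [mean[i] + w * delta[i] for i in range(d)]
        cov = [[(1 - w) * cov[i][j] + w * delta[i] * delta[j]
                for j in range(d)] for i in range(d)]
    return chain, n_accepted / n_iter


chain, rate = adaptive_mh(log_target, [0.0, 0.0], 20000)
print(f"acceptance rate: {rate:.3f}")
```

In this sketch the size update pushes the long-run acceptance rate towards the 23% target, while the covariance update lets the proposal follow the correlation structure of the target. Burn-in, thinning and the simplex initialisation are left out.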

We simulated an outbreak under θ using the deterministic model described by Equation (3). We then generated the four random daily incidence time series using the same Poisson observation processes as described in the method section of the main paper. We selected a random dataset which was as representative as possible of the expected dataset under θ; that is, the dataset with every observation corresponding to the expected observation. Because Poisson expectations do not need to be integers, we achieved this by picking a random dataset whose cumulative incidence was close to that of the expected dataset (we used an overall tolerance of 8 incidence counts). Both the dataset used and the expected dataset are shown in Figure S2.

As shown in Figure S3, the model was able to fit the simulated data correctly. The comparison between θ and the MAP is reported in Table S2, whereas the posterior estimates of the parameters are shown in Figure S4. Although we would not expect all true parameter values to coincide with the MAP (as a result of stochasticity during data generation) nor with the mode of the univariate posterior distributions (due to correlation between parameters), most parameter estimates are close to the true value, even some of those with flat priors. In addition, we note that all true parameter values lay within the 95% CI of our posterior estimates. That said, the posterior distributions also reveal that it is difficult to separate the community ($\beta_i$) and funeral ($\beta_d$) transmission rates due to correlations between those two parameters (the data do not separate these two routes of transmission). We also note the flat posterior distributions of the shapes of the change in hospital-seeking ($\alpha_h$) and contact ($\alpha_{pp}$) behaviours, which suggests that the simulated data were not very informative for these parameters (note however that the shape of the sigmoid does not change significantly for values above 1).
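The selection of a representative simulated dataset can be sketched as a simple rejection loop. The Python example below is only an illustration: the bell-shaped expected incidence curves, the four scale factors and the function names are hypothetical, and the way the "overall tolerance of 8 incidence counts" is applied (summing, over the four series, the absolute gap between simulated and expected cumulative incidence) is one possible reading, not a rule confirmed by the text.

```python
import math
import random


def poisson(rng, lam):
    # Knuth's multiplication method; adequate for the small means used here.
    threshold, k, p = math.exp(-lam), 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def expected_incidence(scale, n_days=60):
    # Hypothetical expected daily incidence, standing in for the output of
    # the deterministic model multiplied by a reporting rate.
    return [scale * 4.0 * math.exp(-((t - 30) / 10.0) ** 2)
            for t in range(n_days)]


def representative_dataset(expected, tolerance=8, max_tries=100000, seed=1):
    rng = random.Random(seed)
    expected_totals = [sum(series) for series in expected]
    for attempt in range(1, max_tries + 1):
        # One Poisson draw per day and per series.
        data = [[poisson(rng, mu) for mu in series] for series in expected]
        # Assumed criterion: total absolute gap in cumulative incidence.
        gap = sum(abs(sum(obs) - tot)
                  for obs, tot in zip(data, expected_totals))
        if gap <= tolerance:
            return data, gap, attempt
    raise RuntimeError("no dataset within tolerance; increase max_tries")


expected = [expected_incidence(s) for s in (1.0, 0.7, 0.9, 0.3)]
data, gap, attempts = representative_dataset(expected)
print(f"accepted after {attempts} draws, total gap = {gap:.1f}")
```

Under this reading most random datasets are rejected, and the first one whose cumulative incidence falls within the tolerance is kept.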
Despite this, Figure S5 shows that our inference procedure is able to accurately reconstruct the change in hospital-seeking and contact behaviours. Although this simulation study reveals some limitations of the data, it indicates that our inference procedure does not generate any substantial bias in the estimates.

S3. Basic reproduction number

It can be shown (Legrand et al., 2007) that the basic reproduction number ($R_0$) for the model of Equation (3) can be split into hospital ($R_{0h}$) and person-to-person ($R_{0pp}$) components; the latter can be further split into community ($R_{0i}$) and funeral ($R_{0d}$) components:

$$
R_0 = \underbrace{\frac{\beta_h(0)\,\gamma_h\,\kappa_i(0)}{\Gamma_i(0)\,\bigl(\nu_d \phi_h + \nu_r (1 - \phi_h)\bigr)}}_{R_{0h}}
+ \underbrace{\overbrace{\frac{\beta_i(0)}{\Gamma_i(0)}}^{R_{0i}} + \overbrace{\frac{\beta_d(0)\,\phi}{\mu_b}}^{R_{0d}}}_{R_{0pp}}
\qquad \text{(S1)}
$$

Posterior estimates for $R_{0h}$ and $R_{0pp}$ can be found in Table 3 of the main paper. Our choice of grouping both community ($R_{0i}$) and funeral ($R_{0d}$) components into a single person-to-person component $R_{0pp}$ is justified by the correlation between these two quantities, as revealed by the simulation study of Section S2 and shown in Figure S6. Indeed, although our posterior density estimate suggests $R_{0i} > R_{0d}$, the 95% CI reveals that alternative scenarios cannot be excluded (e.g. a larger reproduction number for dead cases than for alive cases).

The course of the effective reproduction number during the outbreak, as shown in Figure 4 of the main paper, can be obtained similarly by accounting for the time dependence of the parameters as well as the proportion of susceptible individuals in the population (note however that the depletion of susceptibles was negligible in the 1976 Yambuku outbreak):

$$
R(t) = \left( \frac{\beta_h(t)\,\gamma_h\,\kappa_i(t)}{\Gamma_i(t)\,\bigl(\nu_d \phi_h + \nu_r (1 - \phi_h)\bigr)} + \frac{\beta_i(t)}{\Gamma_i(t)} + \frac{\beta_d(t)\,\phi}{\mu_b} \right) \frac{S(t)}{N}
\qquad \text{(S2)}
$$

S4. Model comparison

In order to test the role of person-to-person and hospital transmission, we constructed, fitted and compared four different models:

1. The first model is the one presented in the main paper and includes hospital closure, change in hospital-seeking behaviour and change in person-to-person contact behaviour. We refer to the main paper for a more detailed description of this complete model.
2. The second model assumes that person-to-person contact behaviour did not change over time.
3. In the third model we only included hospital closure (i.e. no change of behaviour over time).
4. The fourth model is similar to the complete model but assumes that hospital transmission was density-independent. The force of infection for this model is $\lambda_h(t) = \beta_h(t)\,\mathbb{1}_{H \geq 1}$ instead of $\lambda_h(t) = \beta_h(t)\,H$ in the complete model.
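Returning to Equations (S1) and (S2) of Section S3, the decomposition of the reproduction number can be written as a small Python function. This is a minimal sketch: the function and argument names are hypothetical, the symbols $\Gamma_i$, $\nu_d$, $\nu_r$, $\phi_h$ and $\kappa_i$ are defined in the main paper rather than in this supplement, and the numerical values passed below are arbitrary placeholders, not estimates from the study.

```python
def reproduction_numbers(beta_h, beta_i, beta_d, gamma_h, kappa_i, Gamma_i,
                         nu_d, nu_r, phi_h, phi, mu_b,
                         susceptible_fraction=1.0):
    """Components of the reproduction number as in Equations (S1)-(S2).

    Pass parameter values at time 0 with susceptible_fraction=1 for R_0,
    or values at time t with susceptible_fraction=S(t)/N for R(t).
    """
    hospital = (beta_h * gamma_h * kappa_i
                / (Gamma_i * (nu_d * phi_h + nu_r * (1.0 - phi_h))))
    community = beta_i / Gamma_i
    funeral = beta_d * phi / mu_b
    s = susceptible_fraction
    return {
        "hospital": s * hospital,
        "community": s * community,
        "funeral": s * funeral,
        "person_to_person": s * (community + funeral),
        "total": s * (hospital + community + funeral),
    }


# Arbitrary placeholder values, for illustration only.
params = dict(beta_h=3.0, beta_i=0.1, beta_d=0.7, gamma_h=0.3, kappa_i=0.2,
              Gamma_i=0.15, nu_d=0.25, nu_r=0.15, phi_h=0.9, phi=0.9,
              mu_b=1.0)
r0 = reproduction_numbers(**params)
rt = reproduction_numbers(**params, susceptible_fraction=0.5)
print({name: round(value, 2) for name, value in r0.items()})
```

Returning the components separately mirrors the grouping used above, where the community and funeral terms are reported together as the person-to-person component.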

Models were compared using the Deviance Information Criterion (DIC), which measures how well a model fits the data while adjusting for model complexity (Spiegelhalter et al., 2002). The model with the smallest DIC is the model that would best predict a replicate dataset with the same structure as that currently observed. Given the differences ΔDIC > 10, we can confidently rule out models 2, 3 and 4 in favour of the complete model 1 (Spiegelhalter et al., 2002).
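For readers unfamiliar with the criterion, the DIC of Spiegelhalter et al. (2002) is the posterior mean deviance plus the effective number of parameters $p_D$, where $p_D$ is the posterior mean deviance minus the deviance at the posterior mean. The Python sketch below computes it for a toy Poisson model with a conjugate Gamma prior. The data, the prior and the function names are hypothetical and unrelated to the Ebola models compared here.

```python
import math
import random


def deviance(lam, counts):
    # D(lambda) = -2 x log-likelihood of independent Poisson counts.
    loglik = sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)
    return -2.0 * loglik


def dic(posterior_samples, counts):
    n = len(posterior_samples)
    mean_deviance = sum(deviance(lam, counts) for lam in posterior_samples) / n
    posterior_mean = sum(posterior_samples) / n
    # Effective number of parameters.
    p_d = mean_deviance - deviance(posterior_mean, counts)
    return mean_deviance + p_d, p_d


rng = random.Random(1)
counts = [3, 5, 4, 6, 2, 5, 7, 4]  # hypothetical daily counts
# A Gamma(1, 1) prior gives a Gamma(1 + sum(y), rate 1 + n) posterior;
# random.gammavariate takes a shape and a scale (= 1 / rate).
shape, rate = 1 + sum(counts), 1 + len(counts)
samples = [rng.gammavariate(shape, 1.0 / rate) for _ in range(5000)]
dic_value, p_d = dic(samples, counts)
print(f"DIC = {dic_value:.1f}, effective number of parameters = {p_d:.2f}")
```

For this one-parameter model $p_D$ comes out close to 1, which is the sense in which the DIC penalises complexity. Differences in DIC between models fitted to the same data are then compared as in Table S3.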

References

Dureau, J., Ballesteros, S., Bogich, T., 2013. SSM: Inference for time series analysis with State Space Models. arXiv:1307.5626v4.
Legrand, J., Grais, R.F., Boelle, P.Y., Valleron, A.J., Flahault, A., 2007. Understanding the dynamics of Ebola epidemics. Epidemiol Infect 135, 610-21.
R Core Team, 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria.
Roberts, G.O., Gelman, A., Gilks, W.R., 1997. Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability 7, 110-120.
Roberts, G.O., Rosenthal, J.S., 2009. Examples of adaptive MCMC. Journal of Computational and Graphical Statistics 18, 349-367.
Spiegelhalter, D.J., Best, N.G., Carlin, B.P., Van Der Linde, A., 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64, 583-639.

Figures

Figure S1: Posterior density of model parameters (red histogram) together with their prior density distribution (blue area), as specified in Table 2. As discussed in Section S3, there were correlations between $R_{0i}$ and $R_{0d}$, which explain the somewhat wide marginal posterior distributions of $\beta_i$ and $\beta_d$.

Figure S2: Simulated data (blue line) were chosen to be close to the expected observed time-series (red line) under the "true" parameter values θ.

Figure S3: Comparison of our fitted model and simulated dataset (black dots). The mean and median fits are represented by solid and dashed red lines respectively. The dark and light red shaded areas correspond to the 50% and 95% credible intervals.

Figure S4: Posterior density of model parameters for the simulation study, where true values are mapped as vertical black lines. See legend of Figure S1 for more details.

Figure S5: Drop in the effective reproduction number (R(t)) owing to changes in community contact behaviour and in outpatient visits to the hospital, compared with the true drop (black solid line). The overall R (lower panel) can be split into a hospital (upper panel) and a person-to-person (middle panel) component. The dashed line indicates the epidemic threshold (R = 1) and the dotted line corresponds to the hospital closure (30th September). Solid and dashed red lines and red shaded areas as in Figure S3.

Figure S6: Joint estimate of the contribution of alive and dead (but not yet buried) cases to the person-to-person reproduction number. Several high posterior density (HPD) regions are indicated.

Tables

Source of infection | Cases | With date of onset | With date of outcome
Person to person | 142 | 140 | 141
Syringe | 90 | 87 | 88
Both | 8 | 8 | 8
Other | 10 | 10 | 10
Unknown | 12 | 12 | 12
All | 262 | 257 | 259

Table S1: Summary of case data.

Parameter | Description | θ | MAP
$T_0$ | Date of introduction of index case to H compartment | Aug 24 | Aug 24
$\rho_{onset}$ | Proportion of onsets reported | 0.71 | 0.72
$\rho_d$ | Proportion of deaths reported | 0.89 | 0.89
$\rho_r$ | Proportion of recoveries reported | 0.29 | 0.26
$\kappa$ | Proportion of cases hospitalised until hospital closure | 0.17 | 0.19
$\phi$ | Case-fatality ratio | 0.90 | 0.88
$1/\epsilon$ | Incubation period (days) | 6.00 | 6.00
$1/\gamma_h$ | Mean time from onset to hospitalisation (days) | 3.00 | 3.05
$1/\gamma_d$ | Mean time from onset to death (days) | 7.50 | 7.53
$1/\gamma_r$ | Mean time from onset to recovery (days) | 10.00 | 9.95
$1/\mu_b$ | Mean time from death to burial (days) | 1.00 | 1.07
$\beta_i$ | Transmission rate in the community at the onset of the epidemic | 0.09 | 0.07
$\beta_d$ | Transmission rate during traditional burial at the onset of the epidemic | 0.73 | 0.66
$\alpha_{pp}$ | Shape of the change of person-to-person contact behaviour in community and during traditional burial | 0.27 | 3.45
$\tau_{pp}$ | Midpoint date for the change of person-to-person contact behaviour | Sep 28 | Sep 29
$\delta_{pp}$ | Reduction of the person-to-person transmission rate following change of contact behaviour (%) | 99 | 95
$\beta_h$ | Transmission rate in hospital at the onset of the epidemic | 3.67 | 3.49
$\alpha_h$ | Shape of the change of hospital-seeking behaviour from outpatients | 1.66 | 1.45
$\tau_h$ | Midpoint date for the change of hospital-seeking behaviour | Sep 17 | Sep 17

Table S2: Comparison between the true parameter values (θ) and the MAP obtained in the simulation study.

Model | Description | DIC | ΔDIC
1 | Complete model | 711 | 0
2 | No change of person-to-person contact behaviour | 827 | 117
4 | Density-independent hospital transmission | 970 | 259
3 | No change of hospital-seeking or person-to-person contact behaviours | 1191 | 480

Table S3: Deviance Information Criterion for the four models tested.