Estimating the Exponential Growth Rate and $R_0$


Junling Ma Department of Mathematics and Statistics, University of Victoria May 23, 2012

Introduction
Daily pneumonia and influenza (P&I) deaths of the 1918 pandemic influenza in Philadelphia. [Figure: daily P&I deaths (linear scale) against week number in 1918.] How do we estimate $R_0$?

Re-examine 1918 Daily Philadelphia P&I Deaths
[Figure: two panels of daily P&I deaths against week number in 1918, one on a linear scale and one on a log-linear scale.] The log-linear panel shows an exponential growth phase. Given the infectious period and the latent period, this rate implies $R_0$.

What We Will Learn
Estimate the exponential growth rate. Fit (phenomenological or mechanistic) models to data. Estimate $R_0$ from the exponential growth rate.

The Exponential Growth Phase
The 1918 pandemic epidemic curve, like most others, shows an initial exponential growth phase. That is, during the initial growth phase the epidemic curve can be modeled as $X(t) = X(0)e^{\lambda t}$, where $\lambda$ is the exponential growth rate and $X(0)$ is the initial condition. So $\ln X(t)$ and the time $t$ have a linear relationship during the initial growth phase: $\ln X(t) = \ln X(0) + \lambda t$. The exponential growth rate measures how fast the disease spreads.

Example: 1665 Great Plague Deaths in London
[Figure: two panels of plague deaths against week number, one on a linear scale and one on a log-linear scale.] Does the exponential growth rate decrease around week 25?

Example: 1665 Great Plague All-cause Deaths in London
[Figure: two panels of all-cause deaths against week number, one on a linear scale and one on a log-linear scale.] The decrease of the exponential growth rate in plague deaths may be caused by under-reporting.

Theoretical Exponential Growth Rate: SIR Model
$$S' = -\frac{\beta}{N} S I, \qquad I' = \frac{\beta}{N} S I - \gamma I.$$
Near the disease-free equilibrium (DFE) $(N, 0)$, $I' = (\beta - \gamma) I$. This is a linear ODE with the exponential solution $I(t) = I(0)e^{(\beta - \gamma)t}$. So the exponential growth rate is $\lambda = \beta - \gamma$. What is the growth rate of the incidence curve $X(t) = \frac{\beta}{N} S I$?
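The claim $\lambda = \beta - \gamma$ can be checked numerically. The sketch below is only an illustration: it integrates the SIR equations with a forward Euler scheme, and the parameter values (`beta`, `gamma`, `n`, `i0`, the step size) are hypothetical, not taken from the slides. It compares the log-slope of $I(t)$ over the early phase with $\beta - \gamma$.

```python
import math

def simulate_sir(beta, gamma, n, i0, dt, steps):
    """Integrate the SIR model with forward Euler; return the list of I values."""
    s, i = n - i0, i0
    infected = [i]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        infected.append(i)
    return infected

# Hypothetical parameter values, chosen only for illustration.
beta, gamma = 0.5, 0.25        # rates per day
n, i0 = 1_000_000, 1.0         # population size and initial infected
dt, steps = 0.001, 20_000      # 20 days in total

infected = simulate_sir(beta, gamma, n, i0, dt, steps)
t_end = dt * steps
# Average slope of ln I(t) over the simulated window.
growth_rate = math.log(infected[-1] / infected[0]) / t_end
print(growth_rate, beta - gamma)
```

The two printed numbers agree closely only while the susceptible pool is barely depleted; running the simulation toward the peak makes the log-slope fall below $\beta - \gamma$.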

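Theoretical Growth Rate: SEIR Model
$$S' = -\frac{\beta}{N} S I, \qquad E' = \frac{\beta}{N} S I - \sigma E, \qquad I' = \sigma E - \gamma I.$$
Near the disease-free equilibrium (DFE) $(N, 0, 0)$,
$$\frac{d}{dt}\begin{bmatrix} E \\ I \end{bmatrix} = \begin{bmatrix} -\sigma & \beta \\ \sigma & -\gamma \end{bmatrix} \begin{bmatrix} E \\ I \end{bmatrix} = J \begin{bmatrix} E \\ I \end{bmatrix}.$$
The exponential growth rate is the largest eigenvalue of $J$:
$$\lambda = \rho(J) = \frac{1}{2}\left( -(\sigma + \gamma) + \sqrt{(\sigma - \gamma)^2 + 4\beta\sigma} \right).$$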
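As a small check of the SEIR growth rate, the sketch below evaluates the largest eigenvalue of the $2 \times 2$ matrix $J = \begin{bmatrix} -\sigma & \beta \\ \sigma & -\gamma \end{bmatrix}$ in closed form and confirms that it satisfies the characteristic equation $(\lambda + \sigma)(\lambda + \gamma) = \beta\sigma$. The parameter values are hypothetical.

```python
import math

def seir_growth_rate(beta, sigma, gamma):
    """Largest eigenvalue of J = [[-sigma, beta], [sigma, -gamma]]."""
    discriminant = (sigma - gamma) ** 2 + 4.0 * beta * sigma
    return 0.5 * (-(sigma + gamma) + math.sqrt(discriminant))

# Hypothetical parameter values, for illustration only.
beta, sigma, gamma = 0.6, 0.5, 0.25
lam = seir_growth_rate(beta, sigma, gamma)

# An eigenvalue of J must satisfy det(J - lam * I) = 0, that is
# (lam + sigma) * (lam + gamma) - beta * sigma = 0.
residual = (lam + sigma) * (lam + gamma) - beta * sigma
print(lam, residual)
```

When $\beta = \gamma$ the function returns zero, which matches the threshold $R_0 = \beta/\gamma = 1$.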

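Theoretical Growth Rate: General Case
Assume that a disease can be modeled with susceptible classes $S \in \mathbb{R}^m$, infected classes $I \in \mathbb{R}^n$ and parameters $\theta \in \mathbb{R}^p$:
$$S' = f(S, I; \theta), \qquad I' = g(S, I; \theta).$$
Assume a disease-free equilibrium (DFE) $(S = S_0, I = 0)$, so that $g(S_0, 0; \theta) = 0$. Linearize about the DFE $(S_0, 0)$:
$$I' = \frac{\partial g}{\partial I}(S_0, 0; \theta)\, I.$$
The exponential growth rate is
$$\lambda = \rho\left( \frac{\partial g}{\partial I}(S_0, 0; \theta) \right).$$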

Fitting an Exponential Curve
Model: $x(t) \approx x(0)e^{\lambda t}$. Naive methods that have been widely used: least squares and linear regression; Poisson regression; negative binomial regression. These methods assume a mean that can be described by a deterministic model, consider only observation errors around the deterministic model, and ignore process errors completely.

Fitting an Exponential Curve: Point Estimates and Confidence Intervals
The best estimate for $(\lambda, x_0)$ is called a point estimate. A 95% confidence interval (CI) $(a, b)$ for $\lambda$ is an interval estimate that satisfies $\mathrm{Prob}\{\lambda \in (a, b)\} = 95\%$. Here 95% is called the confidence level; other levels are also used, e.g., a 99% CI. There are infinitely many CIs with the same confidence level (95%). A wider CI means the true parameter value may differ more widely from the point estimate. E.g., the 1918 pandemic influenza (fall wave) has $R_0 = 1.86$ with 95% CI (1.82, 1.90) (Chowell et al., Proc B 2008).

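Fitting an Exponential Curve: Linear Regression
Taking logarithms of $x(t) = x(0)e^{\lambda t}$ gives $\ln x(t) = \ln x(0) + \lambda t$. The least squares method is commonly used: for a data set $(t_i, x_i)$, minimize
$$F(\lambda, x_0) = \sum_{i=1}^{n} \left( \ln x(t_i) - \ln x_i \right)^2.$$
Confidence intervals: assume that the $\ln x_i$ are normally distributed, i.e., the $x_i$ are log-normally distributed. Then the estimates of $(\lambda, \ln x_0)$ are jointly normal, with covariance matrix $2\sigma^2 (D^2 F)^{-1}$, where $D^2 F$ is the Hessian of $F$ with respect to $(\lambda, \ln x_0)$ and $\sigma^2$ is the variance of $\ln x_i$. If $x_i$ is not log-normal, this is not an easy problem.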
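The least squares fit can be sketched in a few lines. The function below is a plain ordinary least squares regression of $\ln x_i$ on $t_i$; the data are made up for illustration, and the interval uses the normal quantile 1.96 as a simplifying assumption (a Student $t$ quantile would be more appropriate for so few points).

```python
import math

def fit_log_linear(times, counts):
    """Ordinary least squares of ln(count) on time.

    Returns (lambda_hat, ln_x0_hat, standard error of lambda_hat).
    """
    n = len(times)
    y = [math.log(c) for c in counts]
    t_bar = sum(times) / n
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    lam = sxy / sxx
    ln_x0 = y_bar - lam * t_bar
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((v - ln_x0 - lam * t) ** 2 for t, v in zip(times, y))
    sigma2 = rss / (n - 2)
    return lam, ln_x0, math.sqrt(sigma2 / sxx)

# Hypothetical counts, for illustration only.
times = list(range(10))
counts = [5, 7, 8, 12, 15, 19, 27, 33, 45, 58]

lam, ln_x0, se = fit_log_linear(times, counts)
ci = (lam - 1.96 * se, lam + 1.96 * se)   # approximate 95% CI
print(lam, ci)
```

Zero counts cannot be log-transformed, which is one practical reason to prefer the count-based regressions discussed next.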

Fitting an Exponential Curve: Poisson Regression
For an epidemic curve $(t_i, x_i)$, the $x_i$ are usually not log-normal. If infection events have exponentially distributed waiting times, the $x_i$ are Poisson distributed. Poisson regression, which is a maximum likelihood method, is used for this type of data. A likelihood function is the probability of observing the data given a set of parameters:
$$L(\{x_i\}_{i=1}^{n} \mid \lambda, x_0) = \prod_{i=1}^{n} \mathrm{Prob}(x_i \mid \lambda, x_0) = \prod_{i=1}^{n} \frac{E[x_i]^{x_i} e^{-E[x_i]}}{x_i!} = \prod_{i=1}^{n} \frac{x(t_i)^{x_i} e^{-x(t_i)}}{x_i!} = \prod_{i=1}^{n} \frac{x_0^{x_i} e^{\lambda t_i x_i} \exp(-x_0 e^{\lambda t_i})}{x_i!}.$$
Find the parameters $\lambda, x_0$ that maximize $L$.

Fitting an Exponential Curve: Poisson Regression, Maximizing the Log-likelihood
Because $L$ is a product, it is convenient to maximize $\ln L$, called the log-likelihood:
$$\ln L(\lambda, x_0) = \sum_{i=1}^{n} \left[ x_i \ln x_0 + \lambda t_i x_i - x_0 e^{\lambda t_i} - \ln(x_i!) \right].$$
Because the $x_i$ are constants, drop $\ln(x_i!)$ and maximize
$$\ln L(\lambda, x_0) = \sum_{i=1}^{n} \left[ x_i \ln x_0 + \lambda t_i x_i - x_0 e^{\lambda t_i} \right].$$
This can only be maximized numerically. The covariance matrix of the parameters is
$$\mathrm{Var}[\lambda, x_0] = \left( -D^2 \ln L \right)^{-1}.$$
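One way to carry out the numerical maximization is Newton's method. The sketch below is an illustration under stated assumptions, not the method used in the slides: it works with $a = \ln x_0$ instead of $x_0$ (so the standard error it returns for $\lambda$ comes from the inverse of $-D^2 \ln L$ taken with respect to $(a, \lambda)$), it adds step halving for robustness, and the data are made up.

```python
import math

def loglik(a, lam, times, counts):
    """Poisson log-likelihood with ln(x_i!) dropped; a = ln(x0)."""
    return sum(x * (a + lam * t) - math.exp(a + lam * t)
               for t, x in zip(times, counts))

def information(a, lam, times):
    """Model means and the entries of -D^2 ln L with respect to (a, lam)."""
    mu = [math.exp(a + lam * t) for t in times]
    h_aa = sum(mu)
    h_al = sum(t * m for t, m in zip(times, mu))
    h_ll = sum(t * t * m for t, m in zip(times, mu))
    return mu, h_aa, h_al, h_ll

def fit_poisson(times, counts, max_iter=100, tol=1e-10):
    """Return (lambda_hat, x0_hat, standard error of lambda_hat)."""
    a, lam = math.log(sum(counts) / len(counts)), 0.0
    for _ in range(max_iter):
        mu, h_aa, h_al, h_ll = information(a, lam, times)
        # Gradient of the log-likelihood.
        g_a = sum(x - m for x, m in zip(counts, mu))
        g_l = sum(t * (x - m) for t, x, m in zip(times, counts, mu))
        # Newton direction: solve the 2x2 system (-D^2 ln L) d = gradient.
        det = h_aa * h_ll - h_al ** 2
        d_a = (h_ll * g_a - h_al * g_l) / det
        d_l = (h_aa * g_l - h_al * g_a) / det
        # Halve the step while it would decrease the log-likelihood.
        step, base = 1.0, loglik(a, lam, times, counts)
        while step > 1e-8 and loglik(a + step * d_a, lam + step * d_l,
                                     times, counts) < base:
            step /= 2.0
        a, lam = a + step * d_a, lam + step * d_l
        if abs(step * d_a) + abs(step * d_l) < tol:
            break
    _, h_aa, h_al, h_ll = information(a, lam, times)
    det = h_aa * h_ll - h_al ** 2
    se_lam = math.sqrt(h_aa / det)   # lambda entry of the inverse information
    return lam, math.exp(a), se_lam

# Hypothetical counts, for illustration only.
times = list(range(10))
counts = [5, 7, 8, 12, 15, 19, 27, 33, 45, 58]
lam_hat, x0_hat, se_lam = fit_poisson(times, counts)
print(lam_hat, x0_hat, se_lam)
```

At the maximum the fitted means sum to the observed total, which gives a quick sanity check on convergence.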

Fitting an Exponential Curve: Confidence Intervals by the Likelihood Ratio Test
To estimate the CI for $\lambda$, we use the likelihood ratio test. Construct a likelihood profile for $\lambda$: step $\lambda$ in both directions from the point estimate $\hat\lambda$, and at each step $k = \pm 1, \pm 2, \ldots$ find $L_k = \max_{x_0} L(x_0 \mid \lambda_k)$. Compute the likelihood ratio statistic
$$D(\lambda_k) = 2 \ln \frac{L_0}{L_k} = 2 \ln L_0 - 2 \ln L_k,$$
where $L_0$ is the likelihood at the point estimate. A good choice of step size is based on the standard deviation from the covariance matrix. Approximately, $D_k \sim \chi^2_1$. Find the values $\lambda_a < \hat\lambda < \lambda_b$ at which $D(\lambda)$ reaches the 95% quantile of $\chi^2_1$ (about 3.84). The 95% CI for $\lambda$ is $(\lambda_a, \lambda_b)$.
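The profile construction can be sketched for the Poisson model. Two choices below are assumptions of this illustration rather than the procedure in the slides: for fixed $\lambda$ the maximizing $x_0$ is available in closed form ($x_0 = \sum x_i / \sum e^{\lambda t_i}$), and the interval endpoints are located by bisection on $D(\lambda)$ instead of stepping $\lambda$ on a grid. The search bracket `[-1, 1]` and the data are hypothetical.

```python
import math

CHI2_1_95 = 3.841458820694124  # 95% quantile of chi-square with 1 d.o.f.

def profile_loglik(lam, times, counts):
    """Poisson log-likelihood maximized over x0 for fixed lam (ln(x_i!) dropped)."""
    total = sum(counts)
    x0 = total / sum(math.exp(lam * t) for t in times)
    # At the profiled x0 the fitted means sum to the observed total.
    return sum(x * (math.log(x0) + lam * t) for t, x in zip(times, counts)) - total

def maximize(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profile_ci(times, counts, lo=-1.0, hi=1.0):
    """Return (lambda_hat, lambda_a, lambda_b) for a 95% profile-likelihood CI."""
    def ll(lam):
        return profile_loglik(lam, times, counts)
    lam_hat = maximize(ll, lo, hi)
    ll_hat = ll(lam_hat)
    def excess(lam):
        # D(lam) minus the chi-square cutoff; zero at the CI endpoints.
        return 2.0 * (ll_hat - ll(lam)) - CHI2_1_95
    return lam_hat, bisect(excess, lo, lam_hat), bisect(excess, lam_hat, hi)

# Hypothetical counts, for illustration only.
times = list(range(10))
counts = [5, 7, 8, 12, 15, 19, 27, 33, 45, 58]
lam_hat, lam_a, lam_b = profile_ci(times, counts)
print(lam_hat, (lam_a, lam_b))
```

The resulting interval need not be symmetric about $\hat\lambda$, unlike an interval built from the covariance matrix.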

Fitting an Exponential Curve: Negative Binomial Regression
Poisson regression assumes $E[x_i] = \mathrm{Var}[x_i]$. Over-dispersion, $\mathrm{Var}[x_i] > E[x_i]$, arises because of observation errors and non-exponentially distributed waiting times. Solution: assume that $x_i$ is negative binomial with parameters $r$ and $0 < p < 1$,
$$\mathrm{Prob}(x_i \mid r, p) = \frac{\Gamma(x_i + r)}{x_i!\, \Gamma(r)}\, p^r (1 - p)^{x_i}.$$
Assume the same $r$ for all $x_i$. Since $E[x_i] = r(1 - p)/p$,
$$p = \frac{r}{r + E[x_i]} = \frac{r}{r + x_0 e^{\lambda t_i}}.$$
The parameters are $\lambda$, $x_0$ and $r$. As $r \to \infty$, the negative binomial approaches the Poisson.
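The negative binomial likelihood in this mean parametrization is short to write down. The sketch below is an illustration: the function names are hypothetical, and it only evaluates the log-likelihood (a numerical optimizer over $\lambda$, $x_0$ and $r$ would still be needed to fit data). It also checks the Poisson limit for large $r$.

```python
import math

def nb_log_pmf(x, mean, r):
    """Log pmf of the negative binomial with p = r / (r + mean)."""
    p = r / (r + mean)
    return (math.lgamma(x + r) - math.lgamma(x + 1) - math.lgamma(r)
            + r * math.log(p) + x * math.log(1.0 - p))

def poisson_log_pmf(x, mean):
    """Log pmf of the Poisson distribution."""
    return x * math.log(mean) - mean - math.lgamma(x + 1)

def nb_loglik(lam, x0, r, times, counts):
    """Log-likelihood of counts with mean x0 * exp(lam * t) and common r."""
    return sum(nb_log_pmf(x, x0 * math.exp(lam * t), r)
               for t, x in zip(times, counts))

# The pmf sums to one and has the requested mean (hypothetical values).
total = sum(math.exp(nb_log_pmf(k, 4.0, 2.5)) for k in range(2000))
mean = sum(k * math.exp(nb_log_pmf(k, 4.0, 2.5)) for k in range(2000))

# For large r the negative binomial is close to the Poisson.
gap = abs(nb_log_pmf(3, 4.0, 1e6) - poisson_log_pmf(3, 4.0))
print(total, mean, gap)
```

The variance in this parametrization is $E[x_i] + E[x_i]^2 / r$, so smaller $r$ means stronger over-dispersion.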

Fitting an Exponential Curve: Application to Simulated Epidemics
[Figure: the trend of the estimated exponential growth rate as more data points toward the peak of the epidemic are used. Red: theoretical rate; black: estimate; blue: 95% CI; grey: epidemic curve.]

Baseline: 1918 Pandemic Influenza in Philadelphia
[Figure: the trend of the estimated exponential growth rate as more data points toward the peak of the epidemic are used. Black: estimate; blue: 95% CI; grey: epidemic curve.]

Baseline
The deaths in the early flat phase are non-flu deaths; such deaths are called the baseline P&I deaths. In a pandemic, most P&I deaths are flu deaths, so we can ignore the variation in the baseline. We can therefore use a new model for the mean P&I deaths,
$$x(t) = b + x_0 e^{\lambda t},$$
where $b$ is the baseline. We use Poisson regression.

Baseline: 1918 Pandemic Influenza in Philadelphia with Baseline
[Figure: the trend of the estimated exponential growth rate as more data points toward the peak of the epidemic are used. Black: estimate; blue: 95% CI; grey: epidemic curve.]

Decreasing Growth Rate: Taking Account of a Decreasing Growth Rate
The exponential growth rate decreases because of the depletion of susceptibles. There are three ways to account for this. Use the exponential model, and find the best fitting window by testing goodness of fit. Use more sophisticated phenomenological models: the logistic model for cumulative cases, or the Richards model for cumulative cases. Use a mechanistic model, e.g., SIR, SEIR, ...

Decreasing Growth Rate: Single Epidemic Phenomenological Models
Logistic model: the cumulative cases $C(t)$ initially grow exponentially, then approach the final size, which is the same shape as the logistic curve,
$$C'(t) = \lambda C \left[ 1 - \frac{C}{K} \right].$$
This model introduces one more parameter, $K$. But we should not directly fit the cumulative case data $c_k = \sum_{i=0}^{k} x_i$ to this model, because the $c_k$ are not independent. Instead, we compute the interval cases $x(t) = C(t + 1) - C(t)$, and fit $x(t)$ to the data $x_i$.
Richards model: the cumulative cases have a mean that satisfies
$$C'(t) = \lambda C \left[ 1 - \left( \frac{C}{K} \right)^{\alpha} \right].$$
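For the logistic model the interval cases can be computed from the closed-form solution $C(t) = K / \left(1 + (K/C(0) - 1)e^{-\lambda t}\right)$. The sketch below illustrates that step only; the function names and the parameter values are hypothetical, and fitting the resulting $x(t)$ to data would reuse a count likelihood such as the Poisson one.

```python
import math

def logistic_cumulative(t, lam, k, c0):
    """Solution of C' = lam * C * (1 - C / K) with C(0) = c0."""
    return k / (1.0 + (k / c0 - 1.0) * math.exp(-lam * t))

def logistic_interval_cases(times, lam, k, c0):
    """Expected new cases in each unit interval [t, t + 1)."""
    return [logistic_cumulative(t + 1, lam, k, c0)
            - logistic_cumulative(t, lam, k, c0)
            for t in times]

# Hypothetical parameter values, for illustration only.
lam, k, c0 = 0.3, 10_000.0, 10.0
times = list(range(200))
cases = logistic_interval_cases(times, lam, k, c0)

# Early on, successive interval counts grow by a factor close to exp(lam).
early_ratio = cases[1] / cases[0]
print(early_ratio, math.exp(lam), sum(cases))
```

The interval counts sum to $K - C(0)$ over a long enough window, so $K$ plays the role of the final size.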

Decreasing Growth Rate: Fit Logistic Model to Simulated Epidemics
The logistic model allows the use of more data points. [Figure: red: theoretical rate; black: estimate; blue: 95% confidence interval; grey: epidemic curve.]

Decreasing Growth Rate: Philadelphia 1918 Pandemic with Baseline + Logistic Model
[Figure: the trend of the estimated exponential growth rate as more data points toward the peak of the epidemic are used. Black: estimate; blue: 95% confidence interval; grey: epidemic curve.]

Decreasing Growth Rate: Process Errors
The same disease parameters may produce different epidemic curves. [Figure: simulated epidemic curves, cases (log scale) against time.]
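Process error can be illustrated with a stochastic SIR simulation. The sketch below uses the Gillespie algorithm, which is an assumption of this illustration (the slides do not say how their curves were simulated), with hypothetical parameter values. It runs the same parameters with different random seeds and records new infections per unit of time.

```python
import random

def gillespie_sir(beta, gamma, n, i0, t_max, rng):
    """One realization of the stochastic SIR model.

    Returns the number of new infections in each unit time interval.
    """
    s, i, t = n - i0, i0, 0.0
    incidence = [0] * t_max
    while i > 0:
        rate_infection = beta * s * i / n
        rate_recovery = gamma * i
        total_rate = rate_infection + rate_recovery
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(total_rate)
        if t >= t_max:
            break
        if rng.random() * total_rate < rate_infection:
            s -= 1
            i += 1
            incidence[int(t)] += 1
        else:
            i -= 1
    return incidence

# Hypothetical parameter values, for illustration only.
beta, gamma, n, i0, t_max = 0.5, 0.25, 2000, 5, 100
curves = [gillespie_sir(beta, gamma, n, i0, t_max, random.Random(seed))
          for seed in range(5)]
for curve in curves:
    print(sum(curve), max(curve))
```

Each run prints a different total and peak even though $\beta$ and $\gamma$ never change, which is the variability that a purely deterministic mean model ignores.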

Decreasing Growth Rate: Coverage Probability
The coverage probability of a CI is the probability that the CI contains the true parameter value. A 95% CI should have 95% coverage probability. Because we ignored process errors, this method generally produces confidence intervals that are too narrow. Simulations can verify that the coverage probability for incidence data is poor. Larger observation errors, for example from small reporting rates, improve coverage. Methods that can handle process errors include one-step-ahead prediction, particle filters, MCMC, ...

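Estimate $R_0$: SIR Model
First, as an example, we look at an SIR model:
$$S' = -\frac{\beta}{N} S I, \qquad I' = \frac{\beta}{N} S I - \gamma I.$$
Recall that $\lambda = \beta - \gamma$, so $\beta = \lambda + \gamma$ and
$$R_0 = \frac{\beta}{\gamma} = \frac{\lambda + \gamma}{\gamma} = 1 + \frac{\lambda}{\gamma}.$$
What if $\lambda$ is the exponential growth rate of the incidence curve $X(t) = \frac{\beta}{N} S I$?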

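Estimate $R_0$: SEIR Model
$$S' = -\frac{\beta}{N} S I, \qquad E' = \frac{\beta}{N} S I - \sigma E, \qquad I' = \sigma E - \gamma I.$$
Recall that
$$\lambda = \rho(J) = \frac{1}{2}\left( -(\sigma + \gamma) + \sqrt{(\sigma - \gamma)^2 + 4\beta\sigma} \right).$$
Isolating $\beta$ gives
$$\beta = \frac{(\sigma + \lambda)(\lambda + \gamma)}{\sigma},$$
so
$$R_0 = \frac{\beta}{\gamma} = 1 + \lambda \left( \frac{\lambda}{\sigma\gamma} + \frac{1}{\gamma} + \frac{1}{\sigma} \right).$$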
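The two conversions from growth rate to $R_0$ are simple enough to check by a round trip: pick $\beta$, compute $\lambda$ from the SEIR eigenvalue formula, then recover $R_0 = \beta/\gamma$ from $\lambda$. The function names and the parameter values below are hypothetical.

```python
import math

def r0_sir(lam, gamma):
    """R0 implied by growth rate lam under the SIR model."""
    return 1.0 + lam / gamma

def r0_seir(lam, sigma, gamma):
    """R0 implied by growth rate lam under the SEIR model."""
    return (lam + sigma) * (lam + gamma) / (sigma * gamma)

def seir_growth_rate(beta, sigma, gamma):
    """Largest eigenvalue of J = [[-sigma, beta], [sigma, -gamma]]."""
    return 0.5 * (-(sigma + gamma)
                  + math.sqrt((sigma - gamma) ** 2 + 4.0 * beta * sigma))

# Hypothetical parameter values, for illustration only.
beta, sigma, gamma = 0.6, 0.5, 0.25
lam = seir_growth_rate(beta, sigma, gamma)
print(r0_seir(lam, sigma, gamma), beta / gamma)

# For the same observed growth rate, ignoring the latent period (SIR)
# gives a smaller R0 than the SEIR formula.
print(r0_sir(lam, gamma), r0_seir(lam, sigma, gamma))
```

As the latent period shrinks ($\sigma \to \infty$) the SEIR expression reduces to the SIR one.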

Estimate $R_0$ with a Model: in General
$$S' = f(S, I; \theta), \qquad I' = g(S, I; \theta),$$
where $S \in \mathbb{R}^m$, $I \in \mathbb{R}^n$, and $\theta$ is the vector of parameters. Recall that the exponential growth rate is the largest eigenvalue of $\frac{\partial g}{\partial I}(S_0, 0; \theta)$. This relationship usually gives us an estimate of the transmission rate given all the other disease parameters. $R_0$ can then be computed using the inferred transmission rate and all other given parameter values.

Estimate $R_0$ using the Generation Interval
See Wallinga and Lipsitch (Proc B 2007, 274:599-604). Generation interval (serial interval): the waiting time from being infected to causing secondary infections. The generation interval distribution $w(\tau)$ can be estimated (e.g., from contact tracing) without a mechanistic model. Let $n(\tau)$ be the transmission rate at age of infection $\tau$. Then
$$R_0 = \int_0^\infty n(\tau)\, d\tau, \qquad w(\tau) = \frac{n(\tau)}{R_0}.$$
The incidence curve satisfies the convolution
$$x(t) = \int_0^\infty x(t - \tau)\, n(\tau)\, d\tau.$$
Assuming $x(t) = x(0)e^{\lambda t}$,
$$R_0 = \frac{1}{\int_0^\infty e^{-\lambda\tau} w(\tau)\, d\tau}.$$
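The generation interval formula can be evaluated by numerical integration. The sketch below uses a midpoint rule truncated at a finite `tau_max`, both of which are choices of this illustration. As a check it uses an exponential generation interval $w(\tau) = \gamma e^{-\gamma\tau}$, for which the formula should reproduce the SIR result $R_0 = 1 + \lambda/\gamma$; the parameter values are hypothetical.

```python
import math

def r0_from_generation_interval(lam, w, tau_max, steps=100_000):
    """R0 = 1 / integral_0^tau_max exp(-lam * tau) * w(tau) dtau (midpoint rule)."""
    h = tau_max / steps
    integral = h * sum(
        math.exp(-lam * (k + 0.5) * h) * w((k + 0.5) * h)
        for k in range(steps)
    )
    return 1.0 / integral

# Hypothetical values: exponential generation interval with rate gamma.
gamma, lam = 0.25, 0.25

def w_exponential(tau):
    return gamma * math.exp(-gamma * tau)

r0 = r0_from_generation_interval(lam, w_exponential, tau_max=200.0)
print(r0, 1.0 + lam / gamma)
```

Any other density, including an empirical one tabulated from contact tracing, can be passed as `w` as long as it integrates to one over the truncated range.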

The Influenza Generation Interval Distribution Taken from N.M. Ferguson, et al., (Nature 2005, 437:209-214)

The Basic Reproduction Number of 1918 Pandemic Influenza in Philadelphia
Given that we have estimated the exponential growth rate to be $\lambda = 0.288$ with 95% confidence interval (0.286, 0.290), and with the above generation interval distribution, we can compute that $R_0 = 2.16$ with 95% confidence interval (2.15, 2.17). This is consistent with other estimates such as Mills et al. (Nature 2004) and Goldstein et al. (PNAS 2009).

Basic Reproduction Number Estimated by Goldstein et al (2009)

Some Considerations
Deaths vs. incidence; temporal aggregation (e.g., weekly incidence); spatial aggregation (e.g., overall Canada vs. city-level curves).

Some References
O. N. Bjørnstad, B. F. Finkenstädt and B. T. Grenfell (2002), Ecological Monographs, 72:169-184
B. M. Bolker (2008), Ecological Models and Data in R, Princeton University Press
M. C. J. Bootsma and N. M. Ferguson (2007), PNAS, 104:7588-7593
G. Chowell, P. W. Fenimore, M. A. Castillo-Garsow and C. Castillo-Chavez (2003), JTB, 224:1-8
G. Chowell, H. Nishiura and L. M. Bettencourt (2007), J. R. Soc. Interface, 4:155-166
U. C. de Silva, J. W. S. Waicharoen and M. Chittaganpitch (2009), Euro Surveillance, 14

Some References, Continued
E. L. Ionides, C. Bretó and A. A. King (2006), PNAS, 103:18438-18443
D. He, J. Dushoff, T. Day, J. Ma and D. J. D. Earn (2011), Theoretical Ecology, 4:283-288
Y.-H. Hsieh, D. N. Fisman and J. Wu (2010), BMC Research Notes, 3:283
F. J. Richards (1959), J. Experimental Botany, 10