Bayesian estimation of the basic reproduction number in stochastic epidemic models

Size: px

Start display at page:

Download "Bayesian estimation of the basic reproduction number in stochastic epidemic models"

Caitlin Pearson
5 years ago
Views:

1 Bayesian estimation of the basic reproduction number in stochastic epidemic models Damian Clancy Department of Mathematical Sciences, University of Liverpool Philip D. O Neill School of Mathematical Sciences, University of Nottingham Abstract. In recent years there has been considerable activity in the development and application of Bayesian inferential methods for infectious disease data using stochastic epidemic models. Most of this activity has employed computationally intensive approaches such as Markov chain Monte Carlo methods. In contrast, here we address fundamental questions for Bayesian inference in the setting of the standard SIR (Susceptible-Infective-Removed) epidemic model via simple methods. Our main focus is on the basic reproduction number, a quantity of central importance in mathematical epidemic theory, whose value essentially dictates whether or not a large epidemic outbreak can occur. We specifically consider two SIR models routinely employed in the literature, namely the model with exponentially distributed infectious periods, and the model with fixed length infectious periods. It is assumed that an epidemic outbreak is observed through time. Given complete observation of the epidemic, we derive explicit expressions for the posterior densities of the model parameters and the basic reproduction number. For partial observation of the epidemic, when the entire infection process is unobserved, we derive conservative bounds for quantities such as the mean of the basic reproduction number and the probability that a major epidemic outbreak will occur. If the time at which the epidemic started is observed, then linear programming methods can be used to derive suitable bounds for the mean of the basic reproduction number and similar quantities. Numerical examples are used to illustrate the practical consequences of our findings. In addition, we also examine the implications of commonly-used prior distributions on the basic model parameters as regards inference for the basic reproduction number. Keywords: epidemics; basic reproduction number; Bayesian inference; linear programming; stochastic epidemic models 1

2 1 Introduction In recent years there has been considerable activity in both the methodological development and application of methods for Bayesian data analysis of infectious disease outbreak data using stochastic epidemic models. Almost all of this literature employs Markov chain Monte Carlo (MCMC) methodology, which offers enormous power and flexibility compared to other approaches (see e.g. Gibson and Renshaw, 1998; O Neill and Roberts, 1999; O Neill et al., 2000; Streftaris and Gibson, 2004; Neal and Roberts, 2005). The methods have been applied to many different human, animal and plant pathogens and diseases, examples of which include pneumococcal carriage (Auranen et al., 2000), measles (Li et al., 2002), swine fever (Höhle et al., 2005), influenza (Cauchemez et al., 2004; Demiris and O Neill, 2005), norovirus (O Neill and Marks, 2005) and nosocomial infections (McBryde et al., 2006). Despite the advances mentioned above, one notable absence in the current literature is any treatment of Bayesian inference for one of the most basic stochastic epidemic models, namely the SIR (Susceptible-Infective-Removed) model, without recourse to MCMC methods. This situation is in stark contrast to the classical case, where estimation methods for the SIR model, at least in the case where the model is Markov, are long-established (see e.g. Becker, 1989, Chapter 7; Andersson and Britton, 2000, Chapters 9 and 10). It is therefore of interest to see what can be achieved without computationally intensive methods such as MCMC. An additional and important motivation for our study is that Bayesian analysis of the SIR model provides insights that are useful in the analysis of more complex and realistic models. SIR models themselves are frequently used as components of more complex epidemic models, for example those featuring populations divided into households (Demiris and O Neill, 2005) or epidemics on networks (Andersson and Britton, 2000, Chapter 7). An example of such an insight, described in detail below, is the extent to which apparently natural (and commonly used) prior distributions for model parameters can affect the resulting inference. In this paper, our attention generally focuses on the basic reproduction number R 0, informally defined as the average number of secondary cases caused by a single infective individual in a large susceptible population. This quantity is of enormous importance within epidemic modelling because, roughly speaking, if R 0 1 then an epidemic is highly unlikely to occur. Estimation of R 0, or equivalent parameters in more complex models, can usually be achieved via MCMC methods. Among other things, we derive a closed-form expression for the posterior density of R 0 given suitably complete data, and bounds on various quantites (e.g. the mean) for other data scenarios. The paper is organised as follows. Section 2 describes the standard SIR epidemic model and recalls some results required in the sequel. Sections 3 and 4 consider two different special cases of the SIR model, corresponding to exponential and constant infectious periods, respectively. Some brief concluding comments are given in Section 5. 2

3 2 Preliminaries 2.1 The standard SIR model We begin by recalling the definition of the standard SIR (Susceptible-Infective-Removed) stochastic epidemic model (see e.g. Andersson and Britton, p.11). Consider a population of N individuals, assumed to mix together homogeneously. At each time t 0, every individual in the population is either susceptible, infective or removed, with the numbers in each category denoted S(t), I(t) and R(t), respectively, so that S(t) + I(t) + R(t) = N. At time t = 0, the population contains only infectives and susceptibles, so that S(0) 1, I(0) 1 and R(0) = 0. Each infective individual remains so for a period of time (called the infectious period) having an arbitrary but specified distribution T I, before becoming removed. Removed individuals play no further part in the epidemic. The infectious periods of different individuals are assumed to be mutually independent. During its infectious period, an infective individual has infectious contacts with each susceptible individual at times given by the points of a homogeneous Poisson process of rate β/n, with these processes being mutually independent. Each such contact results in the susceptible immediately becoming infective. Since the number of susceptible-infective pairs at time t 0 is S(t)I(t), it follows that the overall rate of infection at time t is βs(t)i(t)/n. The epidemic ends as soon as there are no more infectives left in the population. 2.2 The basic reproduction number A quantity of major importance within mathematical epidemic theory is the basic reproduction number R 0, heuristically defined as the average number of new infections caused by a single infective in a large susceptible population (see e.g. Dietz, 1993). This quantity is important because roughly speaking, in a large population, a large epidemic outbreak can occur if and only if R 0 > 1. When R 0 > 1 the epidemic is said to be above threshold. Knowledge of the value of R 0 makes it possible to calculate the proportion of a population that should be vaccinated in order to prevent an epidemic from occurring. Both the definition and threshold interpretation of R 0 can be made mathematically precise by allowing the population size to tend to infinity, so that R 0 essentially becomes the mean offspring size of a branching process of new infections (see e.g. Andersson and Britton, 1999, p. 22.) For the standard SIR model defined above, R 0 = βe[t I ]. 2.3 Data and notation In the following, we focus on an epidemic outbreak that results in a total of n removals, where 1 n N. Our main interest will be in making inference about the parameters of the epidemic, and in particular R 0, given observation of the removal process alone. However, to begin with it is fruitful to consider inference given complete observation of the epidemic process, i.e. observing both infections and removals. In some settings such a detailed level of observation is not that unrealistic, e.g. animal experiments in which the animals are regularly tested for exposure to the pathogen in question. 3

4 We start with some notation. Suppose that the epidemic begins with a single infection at time i 1, so that (S(i 1 ), I(i 1 ), R(i 1 )) = (N 1, 1, 0). Subsequent infections occur at times i 2 i 3... i n, where i 2 i 1, and removals occur at times r 1 r 2... r n. We suppose that the period of observation is [i 1, r n ], so that it is assumed that the entire epidemic is observed, and define τ := r n. We write r = (r 1 r 2... r n ) and i = (i 2 i 3... i n ). It is important to note that the infection and removal times must satisfy the inequalities i k+1 r k for k = 1, 2,..., n 1. In particular, this constraint ensures that the number of infectives does not reach zero until the time of the last removal, r n. For a given r define Er to be the set of all infection times (i 1, i) satisfying i k i k+1 r k for k = 1, 2,..., n 1. Thus Er contains all possible configurations of infection times for a given set of ordered removal times r. 2.4 Ratio of independent Gamma distributions In the sequel we will make some use of the following facts (see e.g. Bhoj and Schiefermayr, 2001). Denoting by Γ(a, b) a Gamma random variable with mean and variance a/b and a/b 2, respectively, let X Γ(a, b) and Y Γ(c, d) be independent, and define W = X/Y. Then W has probability density function given by ( ) a b Γ(a + c) w a 1 f W (w) =, w 0, d Γ(a)Γ(c) + 1)a+c and E[W k ] = ( bw d ( ) k d Γ(a + k)γ(c k), k = 1, 2,... [c]. b Γ(a)Γ(c) It is straightforward to show that W has mode 0 d(a 1)/b(c + 1), where denotes maximum. Finally, a little algebra shows that the distribution function of W is given by ( ) a b Γ(a + c) w a F W (w) = d Γ(a)Γ(c) a 2 F 1 (a + c, a; a + 1; bw/d), w 0, (1) where p F q (n 1,..., n p ; m 1,..., m q ; x) denotes the hypergeometric function defined by pf q (n 1,..., n p ; m 1,..., m q ; x) = k=0 x k (n 1 ) k (n 2 ) k... (n p ) k, k! (m 1 ) k (m 2 ) k... (m q ) k where (x) 0 = 1 and for k = 1, 2,..., (x) k = (x)(x + 1)... (x + k 1). 3 Exponential infectious period In this section we suppose that the infectious period distribution is exponential with mean E[T I ] = γ 1. This model is often known as the general stochastic epidemic, and is the most widely-studied SIR stochastic epidemic model. This model is also the natural analogue of the deterministic SIR epidemic model, defined in terms of differential equations (see e.g. Bailey, 1975, p. 82), which is itself a component of many deterministic epidemic models studied in the literature. 4

5 3.1 Likelihood The likelihood of the infection and removal times given the model parameters β, γ and i 1 is ( n ) ( n ) π(i, r β, γ, i 1 ) = βn 1 S(i j )I(i j ) γi(r j ) j=2 j=1 ( τ ) exp βn 1 S(t)I(t) + γi(t) dt 1 {(i1,i) Er}, (2) i 1 where S(t ) = lim s t S(s), etc., see for example O Neill and Roberts (1999). 3.2 Parameter prior distributions Suppose that β and γ are, a priori, independent and respectively distributed as Γ(m β, λ β ) and Γ(m γ, λ γ ). As recalled below, this choice of prior distributions is convenient in terms of Bayesian inference due to conjugacy (O Neill and Roberts, 1999). Furthermore, the flexibility of the Gamma distribution means that it is frequently used in practice as a prior distribution for rate parameters in epidemic models (see e.g. Auranen et al., 2000; Cauchemez et al., 2004; Streftaris and Gibson, 2004). However, the induced prior for R 0 is rarely mentioned in the literature, and we now consider this for the current model. Since R 0 = β/γ, applying the results in Section 2.4 yields that R 0 has prior density ( ) mβ λβ Γ(m β + m γ ) R m β 1 0 f(r 0 ) = λ γ Γ(m β )Γ(m γ ) ( λ βr 0, R 0 0, λ γ + 1) m β+m γ and prior mean, mode and variance given, when they exist, by E[R 0 ] = m β λ γ, mode(r 0 ) = λ γ(m β 1) (m γ 1)λ β λ β (m γ + 1) 0, var[r 0 ] = m β(m β + m γ 1) (m γ 1) 2 (m γ 2) In practice, it is often the case that uninformative Γ(m, λ) prior distributions are assigned, popular choices being (m, λ) = (1, ɛ) or (m, λ) = (ɛ, ɛ) where ɛ is a small positive number, or zero. In the present case, we note that if m γ 1 then R 0 has infinite mean a priori, and if m γ 2 then R 0 has infinite prior variance. In other words, such vague priors on β and γ yield a vague prior for R 0. As recalled in Section 2.2, the question of whether or not R 0 > 1 is often of interest. In the Bayesian setting it is natural to consider the posterior mean of R 0 as a suitable summary measure. Suppose now that β and γ are assigned the same prior distribution, so that m = m β = m γ, λ = λ β = λ γ ; a typical case in practice would be m = 1, λ some small positive number. However, it then follows that E(R 0 ) > 1. This suggests some need for caution in using the mean as the sole means of assessing whether or not an epidemic is above threshold, a point that we shall return to in the sequel. However, here it is also the case that P (R 0 > 1) = 0.5, so that a priori the epidemic is equally likely to be above or below threshold. 5 ( λγ λ β ) 2.

6 3.3 Parameter posterior distributions By Bayes Theorem, the joint posterior density of β and γ given i, r and i 1 is defined by π(β, γ i, r, i 1 ) π(i, r β, γ, i 1 )π(β)π(γ). It follows from (2) that where π(β i, r, i 1 ) Γ ( n + m β 1, λ β + N 1 ξ SI ), π(γ i, r, i 1 ) Γ (n + m γ, λ γ + ξ I ) ξ I = τ i 1 I(t) dt, ξ SI = τ i 1 S(t)I(t) dt, see O Neill and Roberts (1999). Moreover, the posterior densities of β and γ are independent, and thus the distribution of R 0 given i, r and i 1 is a ratio of two independent Gamma random variables. It follows that ( λβ + N 1 ξ SI π(r 0 i, r, i 1 ) = λ γ + ξ I Γ(2n + m β + m γ 1) Γ(n + m β 1)Γ(n + m γ ) and, for n + m γ > 2, E[R 0 i, r, i 1 ] = mode[r 0 i, r, i 1 ] = ) n+mβ 1 R n+m β 2 0 (( λβ +N 1 ξ SI λ γ+ξ I ) R var[r 0 i, r, i 1 ] = (2n + m β + m γ 2)(n + m β 1) (n + m γ 1) 2 (n + m γ 2) ) 2n+mβ +m γ 1, R 0 0, (3) ( ) ( ) n + mβ 1 λγ + ξ I, (4) n + m γ 1 λ β + N 1 ξ SI ( ) ( ) n + mβ 2 λγ + ξ I 0, (5) n + m γ + 1 λ β + N 1 ξ SI ( λγ + ξ I λ β + N 1 ξ SI ) 2. (6) Note from (3) that the posterior density of R 0 is only dependent on the infection and removal times via the quantity (λ γ + ξ I )/(λ β + N 1 ξ SI ). This is in accord with the estimator of R 0 given by the ratio of the maximum likelihood estimators of β and γ given i 1, i and r, namely ˆR 0 = N(n 1)ξ I /nξ SI (Andersson and Britton, 2000, p. 93). Now since S(t) N 1 for i 1 < t τ, it follows that ξ SI (N 1)ξ I < Nξ I (the final inequality requires i 1 < τ, which we shall assume true). It follows from (4) that if the prior distributions of β and γ are identical then E[R 0 i, r, i 1 ] > 1. Such a conclusion seems to be an artefact of the choice of gamma distributions as priors for β and γ, along with the induced prior distribution on R 0 as discussed in Section 3.2, and reinforces the need for caution in interpreting the posterior mean of R 0. Arguing similarly for the posterior mode of R 0 defined in (5) yields that mode[r 0 i, r, i 1 ] > (n + m 2)/(n + m + 1), where m = m β = m γ, while the classical estimator ˆR 0 mentioned above satisfies ˆR 0 > (n 1)/n. 6

7 3.4 Bounding the posterior mean of R 0 In practice, infection times are rarely observed, and so we now turn our attention to summary measures of R 0 given data on removal times alone. We assume intially that the start time of the epidemic, i 1, is also known, but this assumption is relaxed later. Without loss of generality, we set i 1 = 0. We now show how to compute bounds on the posterior mean of R 0. First note that min i E[R 0 i, r, i 1 ] E[R 0 r, i 1 ] max E[R 0 i, r, i 1 ], i and thus we can obtain bounds by minimising or maximising (4) over all possible infection times i. This is equivalent to minimising or maximising the function h (i) defined (for given r, and i 1 = 0) by h (i) = λ γ + ξ I λ β + N 1 ξ SI. (7) In fact, maximising h (i) is relatively straightforward in some cases, as follows. For i 1 t τ we have S(t) N n, so that ξ SI (N n)ξ I, and hence h (i) λ γ + ξ I λ β + ((N n)/n)ξ I. (8) It is straightforward to show that the right hand side of (8) is non-decreasing in ξ I if and only if (λ β /λ γ ) (N n)/n. Now ξ I is maximised when all infections occur at time i 1 = 0, in which case ξ I = n r k. It follows that, for (λ β /λ γ ) (N n)/n, h (i) λ γ + n r k λ β + ((N n)/n) n r, (9) k and moreover this bound is attained in the soon as possible scenario i 2 = i 3 = = i n = 0. It is tempting to conjecture that the late as possible scenario, in which i k+1 = r k for k = 1, 2,..., n 1, will provide the minimal value of h(i), at least when λ β = λ γ = 0 corresponding to vague priors. That this is not the case is demonstrated by the following explicit counterexample. N = 11; n = 7; r = (3, 4, 5, 6, 7, 8, 11); λ β = λ γ = 0. (10) In this case the late as possible process has infection times i = (3, 4, 5, 6, 7, 8), giving h(i) = 11/7. If instead we take i = (1, 2, 5, 6, 7, 8) then we find that h(i) = 165/106 < 11/7. Note that this example also shows that the minimal infection-timevector depends on the entire course of the epidemic, since the late as possible process does minimise the ratio of interest for 0 t 3. To find the minimal value of h (i), it is helpful first to express the integrals ξ I and ξ SI in terms of the removal times and infection times. Clearly, ξ I = (r k i k ). 7

8 Defining i n+1 = i n+2 = = i N =, then it can be shown (Neal and Roberts, 2005) that ξ SI = N (r k i j i k i j ), (11) j=1 where we use to denote minimum. Recalling that i k i k+1 r k for k = 1, 2,..., n 1 and using the fact that i k = for k n + 1, we can re-write (11) as ξ SI = = = = j=1 r k i j + (N n)r k n 1 k+1 i j + i j + j=1 j=1 i j i 1 + j=1 n 2 j=k+2 (n j + 1)i j + j=1 (n j + 1)i j j=1 (N n)r k + j=1 r k i j + n 2 (N k)i k (k + 1 N)i k + k i j N j=k+1 (N n)r k j=k+2 n 2 r k i j + j=k+2 i k j=1 k=j i j (N n)r k r k i j, (N k)i k since i 1 = 0. Thus in the definition (7) of h(i), the numerator is an affine function of the infection times i, while the denominator is an affine function of i together with the set of variables {r k i j : k = 1, 2,..., n 2, j = k + 2, k + 3,..., n}. In order to minimise h(i), define = h(i, a) λ β + (1 (n/n)) n r k + (1/N) λ γ + n r k n ( n i k (k + 1 N)i k + n 2 n j=k+2 a kj ),(12) where a = {a kj : k = 1, 2,..., n 2, j = k + 2, k + 3,..., n}. Consider the following linear fractional program. [LFP]: Minimise h(i, a) subject to i k i k+1, k = 1, 2,..., n 1, i k+1 r k, k = 1, 2,..., n 1, a kj r k, k = 1, 2,..., n 2, j = k + 2,..., n, a kj i j, k = 1, 2,..., n 2, j = k + 2,..., n. Suppose i, a satisfy the constraints (13) and are such that a kj < r k i j for some k, j. Then from the form of the right hand side of (12) it is clear that we can reduce the value of h without violating any of the constraints (13) by increasing a kj up to r k i j while leaving i unchanged. Hence the minimum in [LFP] must be attained for some 8 (13)

9 i, a satisfying a kj = r k i j for all k, j, and therefore provides a minimum of h (i) defined by (7). Before solving [LFP], we shall show that the minimal value of h(i) is attained for some i with i k A = {i 1, r 1, r 2,..., r n 1 } for k = 2, 3,..., n. First, note that a linear fractional program is known to attain its minimal value at a vertex of the feasible region (Martos, 1965). Now consider a set of infection times i with i k / A for some k {2, 3,..., n}, and take a kj = r k i j for all k, j. Take m = min {k : i k / A}, and l = max {k : i k = i m }. Set q = i m 1 max {r k : r k < i m } and p = i l+1 min {r k : i m < r k } (with the convention that i n+1 = ). Define sets of infection times i, i as follows: i k = i k for 2 k m 1; i k = q for m k l; i k = i k for l + 1 k n, i k = i k for 2 k m 1; i k = p for m k l; i k = i k for l + 1 k n. We thus have i i and i = λi + (1 λ)i for some λ such that 0 < λ < 1 (in fact, λ = (p i m ) /(p q)). Further, setting a kj = r k i j and a kj = r k i j we also have a = λa + (1 λ)a. Hence any set of infection times with i k / A for some k (together with associated a kj values) cannot lie at a vertex of the feasible region, and so the minimum of [LFP] must be attained with i k A for all k, as claimed. As an aside, it is also possible to show that for any values of λ β, λ γ the maximum value of h(i) is attained with i k A for all k using related methods (as opposed to the direct argument we used previously under the condition that (λ β /λ γ ) (N n)/n). Specifically, h(i) can be shown to be quasi-convex, from which it follows that its maximum is attained at a vertex of the feasible region (see Martos, 1965). Various methods exist for solving linear fractional programs (see Ibaraki, 1981). We follow Charnes and Cooper (1962) in transforming our problem into a linear programming problem, which can then be efficiently solved using the simplex method, software for which is widely available. Thus we consider the problem [LP]: Minimise g(c, b, t) = (λ γ + n r k) t n c k subject to ( ) ( n 2 λ β + (1 (n/n)) r k t + (1/N) (k + 1 N)c k + j=k+2 b kj ) c 1 = 0, c k c k+1, k = 1, 2,..., n 1, c k+1 r k t, k = 1, 2,..., n 1, b kj r k t, k = 1, 2,..., n 2, j = k + 2,..., n, b kj c j, k = 1, 2,..., n 2, j = k + 2,..., n, t 0, b kj 0, k = 1, 2,..., n 2, j = k + 2,..., n. = 1, Denoting by c min, b min, t min the (not necessarily unique) values of c, b, t for which the minimum of [LP] is attained, then the minimal value of h(i) is attained at i = c min /t min and is given by g ( c min, b min, t min) /t min. 3.5 Numerical examples Returning to our numerical example (10), and solving [LP] for these data using the linprog command within the Optimization Toolbox of Matlab, we find that the 9

10 minimum is achieved for i = (0, 4, 5, 6, 7, 8) with h(i) = 154/ The maximal value, given by (9) and achieved for i = (0, 0, 0, 0, 0, 0), is h(i) = 11/4 = Taking m β = m γ = 1, then from equation (4) we have thus shown that the posterior mean of R 0 for these data satisfies E [R 0 r, i 1 ] For a more realistic example, we now consider a data set obtained from a smallpox outbreak in a closed community of N = 120 individuals in Abakaliki, Nigeria (see Bailey, 1975, p125). The use of the general stochastic epidemic to model these data is not entirely appropriate, since smallpox has an appreciable latent period; however, we follow O Neill and Roberts (1999) in using these data to illustrate our methods. O Neill and Roberts (1999) used a Markov chain Monte Carlo approach to evaluate the joint posterior distribution of (β, γ), treating the unknown infection times i and i 1 as additional unknown parameters. The data consist of the following 29 inter-removal times, measured in days: 13, 7, 2, 3, 0, 0, 1, 4, 5, 3, 2, 0, 2, 0, 5, 3, 1, 4, 0, 1, 1, 1, 2, 0, 1, 5, 0, 5, 5. Note that for these data, the initial infection time i 1 is not observed. Since O Neill and Roberts (1999) found the posterior mean of γ with non-informative priors to be 0.098, so that the mean infectious period is 1/ days, we shall assume for purposes of illustration that the first removal occurs 10 days after the first infection. With this assumption, the set of removal times become r = (10, 23, 30, 32, 35, 35, 35, 36, 40, 45, 48, 50, 50, 52, 52, 57, 60, 61, 65, 65, 66, 67, 68, 70, 70, 71, 76, 76, 81, 86). We take non-informative priors, with λ β = λ γ = 0 and m β = m γ = 1. Using the Matlab linprog command to solve [LP] for the minimum, and reading off the maximum from (9), we find that for the smallpox data, E [R 0 r, i 1 ] 1.333, the maximal value being achieved with i 2 = i 3 = = i 30 = 0, the minimal value with i = (0, 0, 0, 10, 30, 35, 35, 36, 40, 45, 48, 50, 50, 52, 52, 57, 60, 61, 65, 65, 66, 67, 68, 70, 70, 71, 76, 76, 81). 3.6 Bounds when the initial infection time is unobserved So far, we have assumed that the initial infection time i 1 is known, and fixed a time origin by taking i 1 = 0. In practice it is likely that the initial infection event will not be observed, as seen for the smallpox data above. Thus we now allow i 1 to take any value, and seek bounds on h(i 1, i) under this relaxation of our assumptions. It should be noted that in most applications, plausible bounds on i 1 will be available, and so the bounds derived below are conservative. Note also that our analysis takes no account of the influence of a prior density on i 1, which from (2) is required to 10

11 estimate E[R 0 r] itself (see O Neill and Roberts, 1999). However, as described below, the bounds we derive are still surprisingly good. The upper bound (9) remains valid, except that each removal time r k must now be replaced by r k i 1. If (N n)λ γ Nλ β then (9) implies that h(i 1, i) N/(N n), and this bound is attained in the limit as i 1 with i 2 = i 3 = = i n = i 1. For the lower bound, note that S(t) N 1 on the interval (i 1, τ), so that ξ SI (N 1)ξ I and hence h (i 1, i) = λ γ + ξ I λ β + ((N 1)/N)ξ I N N 1 + λ γ (N/(N 1))λ β. λ β + ((N 1)/N)ξ I For (N 1)λ γ Nλ β this implies that h(i 1, i) N/(N 1), and this lower bound is attained in the limit as i 1 with i 2, i 3,..., i n held fixed. Note that the bounds of this section do not depend upon the observed removal times. The upper bound depends only upon the final size n of the epidemic, while the lower bound does not depend on the progress of the epidemic at all but only upon the total population size N. In the case of non-informative priors λ β = λ γ = 0, m β = m γ = 1, we have N N 1 E [R 0 r] N N n. Note in particular that we always have E[R 0 r] > 1, and that if only one removal is observed, so n = 1, then E[R 0 r] = N/(N 1). 3.7 Distributional bounds From equation (6), it is clear that bounding the integral ratio h(i) allows us to bound not only the posterior mean of R 0, but also the posterior variance. Furthermore, we now show that we can bound the whole posterior distribution of R 0 in the sense of likelihood ratio ordering of distributions. We have observed previously that the posterior density of R 0 is dependent on the infection and removal times only via the quantity h defined by equation (7). Thus we can write the posterior density (3) in the form π(r 0 h) = Γ(2n + m β + m γ 1) h 1 n m n+m β R β 2 0 Γ(n + m β 1)Γ(n + m γ ) (h 1 R 0 + 1) 2n+m β+m, R γ For R 01, R 02 > 0, with h fixed, we have the likelihood ratio π (R 01 h) π (R 02 h) = (R 01 /R 02 ) n+m β 2 [(R 01 + h) / (R 02 + h)] 2n+m β+m γ 1, 11

12 so that for h 1, h 2 > 0, π (R 01 h 1 ) π (R 02 h 1 ) / π (R01 h 2 ) π (R 02 h 2 ) = ( ) 2n+mβ +m (R01 + h 2 ) (R 02 + h 1 ) γ 1. (R 01 + h 1 ) (R 02 + h 2 ) If R 01 > R 02 and h 1 > h 2, then (R 01 + h 1 ) (R 02 + h 2 ) < (R 01 + h 2 ) (R 02 + h 1 ), and so π (R 01 h 1 ) π (R 02 h 1 ) > π (R 01 h 2 ) π (R 02 h 2 ). That is, R 0 h 2 LR R 0 h 1, where LR denotes likelihood ratio ordering of distributions. Likelihood ratio ordering implies the standard stochastic ordering between distributions (see Kijima and Seneta, 1991), so that for any r 0 > 0 and h 1 > h 2 we have P (R 0 r 0 h 1 ) P (R 0 r 0 h 2 ), and for any non-decreasing real-valued function θ (R 0 ), E [θ (R 0 ) h 1 ] E [θ (R 0 ) h 2 ]. In particular, if h min, h max are the minimal and maximal possible values, respectively, of h given the observed data, then the probability that the epidemic is below threshold may be bounded by P (R 0 1 h max ) P (R 0 1 r) P ( R 0 1 h min). From (1), taking λ β = λ γ = 0 and m β = m γ = 1 to give non-informative priors, we have P (R 0 1 h) = Γ(2n + 1) 2F 1 (2n + 1, n; n + 1; h 1 ). Γ(n)Γ(n + 1) nh n When the removal times r are observed, but the initial infection time i 1 is not, then with these non-informative priors we have seen that h min = N/(N 1) and h max = N/(N n). For the Abakaliki data, with N = 120, n = 30, this yields P (R 0 1 r) If we suppose, as in Section 3.5, that the initial infection time i 1 is observed and that r 1 i 1 = 10 days, then we obtain the slightly tighter bounds P (R 0 1 r, r 1 i 1 = 10) Constant infectious period In this section we briefly consider the standard SIR model in which the infectious period is simply a constant, so that T I = c almost surely. Such a choice of infectious 12

13 period is often more realistic than the exponential infectious period considered in the previous section. In this case, the basic reproduction number is R 0 = βc. For this model, and a given set of removal times r 1 r 2... r n, we have i k = r k c, k = 1,..., n, which automatically implies the required ordering i k i k+1, k = 1,..., n 1. However, a necessary and sufficient condition for (i 1, i) Er (in other words, for the epidemic to contain at least one infective during (i 1, τ)) is that c r := max 1 k n 1 (r k+1 r k ). To see this, note that the condition implies that i k+1 = r k+1 c r k for k = 1,..., n 1, so that (i 1, i) Er. Conversely if c < r k+1 r k for some 1 k n 1 then we have i k+1 > r k. Thus the likelihood is given by ( n ) ( τ ) π(i, r β, c, i 1 ) = βn 1 S(i j )I(i j ) exp βn 1 S(t)I(t) dt i 1 j=2 1 { r c} 1 {ik =r k c,,...,n}. It follows that, if r, i and i 1 are all observed, then inference for c is trivial. Specifically, c is either a point mass at the value dictated by r, i and i 1, or else the observations have zero probability density and the model is inappropriate. If just the removals are observed, then β Γ(m β, λ β ) a priori yields that, for c r, π(β r, c) Γ ( n + m β 1, λ β + N 1 ξ SI ), where here ξ SI is dependent on the value of c. It follows that for c r, π(r 0 r, c) Γ ( n + m β 1, (λ β + N 1 ξ SI )c 1). (14) The posterior density π(r 0 r) can in principle be obtained from (14) by integrating c out of the product π(r 0 r, c)π(c), where π(c) is the prior density for c. In general this must be done numerically: analytical expressions are hard to obtain because of the way that ξ SI depends on c. However, it is possible to show that R 0 is stochastically non-decreasing in c, as follows. Recall that for j = 1,..., n we have i j = r j c, while if j n + 1 then i j = r j =. Substitution into (11) and a few lines of algebra yields that, for c r, ξ SI (c) = Nnc + {[r k (r j c)] (r k r j )}, (15) whence j=1 c ξ SI(c) = Nnc j=1 c1 {rk >r j c} (16) (except at the finite set of values c = r j r k for 1 j, k n, where the derivative is undefined). Next, for c r define ψ(c) = c/[λ β + N 1 ξ SI (c)], and observe that if ψ(c) is nondecreasing in c, then R 0 r, c will be stochastically non-decreasing in c. Now for any λ β 0, ψ (c) 0 if ξ SI (c) c ξ SI (c). By (15) and (16), this last inequality holds if and only if {(r k r j ) [r k (r j c)]}. j=1 c1 {rk >r j c} j=1 13

14 Now for 1 j, k n, r k r j c implies that (r k r j ) [r k (r j c)] = r k r k = 0, and thus ξ SI (c) c ξ SI (c) if and only if c1 {rk >r j c} {(r k r j ) [r k (r j c)]} 1 {rk >r j c}. j=1 j=1 However, for 1 j, k n, r k > r j c implies that 0 (r k r j ) [r k (r j c)] c, and the result follows. Now if c > r n r 1, we have that r k > r j c for all 1 j, k n. It follows from (15) that ξ SI (c) = (N n)nc + [r j (r k r j )], j=1 and in particular ξ SI (c)/c n(n n) as c. Therefore, ψ(c) N/[n(N n)] as c, which along with the fact that ψ(c) ψ( r) yields distributional bounds on R 0 r, c. For example, whatever the prior density for c, we have r(n + m β 1) λ β + N 1 ξ SI ( r) E[R 0 r] N(n + m β 1), (17) n(n n) P (R 0 1 r, c = ) P (R 0 1 r) P (R 0 1 r, c = r). (18) Note that the upper and lower bounds in (18) are straightforward to evaluate numerically via equation (14). We finish with numerical illustrations for the smallpox data described above. For these data we have c r = 13. Note that this value itself is considerably larger than typical estimates of the mean infectious period in the exponential infectious period model. Assuming m β = 1 and λ β = 0, the values of E[R 0 r, c] for c = 13, 14, 15 are, respectively, , and From (17) we have the bounds E[R 0 r] Note that this range of values lies within the bounds E[R 0 r] obtained above for the exponential infectious period case, suggesting that inference for the mean of R 0 is unlikely to be very different between the two different infectious period models. However, such a conclusion does not apply to the probability that the epidemic is below threshold. Specifically, for the constant infectious period model we obtain P (R 0 1 r) , which contrasts sharply with the range P (R 0 1 r) obtained previously for the exponential infectious period. Such a marked difference can be explained in two ways. First, the posterior distribution of R 0 has a larger variance for the exponential infectious period model, which itself is unsurprising since the model contains more inherent variability. Second, recall that the extinction probability of an SIR epidemic model is defined, for R 0 > 1, as the smallest root of the equation f(s) = s in [0, 1], where f(s) = E[s R ] is the probability generating function of R, the number of new infections that an infective gives rise to among infinitely many susceptibles (see e.g. Andersson and Britton, Theorem 3.1). In the present case, if the two models have the same R 0 > 1 value, then it is straightforward to show that the extinction probability of the exponential infectious period model exceeds that of the constant infectious period model. For example, with R 0 = 4/3, the two extinction probabilities are 0.75 and , respectively. Such findings reinforce the need for caution when using R 0 as a summary measure of an epidemic. 14

15 5 Conclusions In this paper we have considered Bayesian inference for the standard SIR model, focussing particularly on the basic reproduction number in the case where the infectious period is either exponentially distributed or non-random. These two choices of infectious period distribution are of some practical interest. The exponential case is commonly used by modellers, partly for mathematical convenience, and partly because it provides a natural analogue to deterministic differential equation models, in which a constant rate of removal corresponds to an exponentially distributed infectious period. The constant infectious period case is of interest because, for many diseases, it gives a good approximation to reality. For both models, we have provided either exact expressions or bounds for the posterior distribution and summary statistics of R 0, depending on the assumed data. It is natural to consider other choices of infectious period distribution, although in general the approaches described in this paper will be harder to adopt. There are two reasons for this. The main difficulty is that the posterior distribution for R 0 given complete observation will not, in general, have a closed form. For example, assuming a Γ(α, δ) infectious period distribution gives R 0 = βα/δ, which even in the case of complete observation does not yield a tractable posterior distribution. A second drawback is that in general it is more natural to work not with a set of ordered infection times i 1 i 2... i n, but instead define infection time i k to be that of the individual removed at time r k for k = 1, 2,..., n. In this way, the part of the likelihood that corresponds to the removal process is straightforward to write down as a product of terms such as f(r k i k ), where f is the probability density function of the infectious period. However, in this setting the constraints on the set of possible infection times that ensure there is always at least one infective present become more complicated, since they now rely on the order statistics of the infection times. The linear programming approach we have taken to bound certain posterior summary statistics can be applied to related problems. For example, bounds on the posterior mean of the infection rate parameter β could be derived via similar methods. References Andersson, H. and Britton, T. (2000), Stochastic Epidemic Models and their Statistical Analysis, Lecture Notes in Statistics 151, New York: Springer-Verlag. Auranen, K., Arjas, E., Leino, T. and Takala, A. K. (2000), Transmission of Pneumococcal carriage in families: a latent Markov process model for binary longitudinal data, Journal of the American Statistical Association, 95, Bailey, N.T.J. (1975), The Mathematical Theory of Infectious Diseases and its Applications (2nd ed), London: Griffin. Becker, N. G. (1989), Analysis of Infectious Disease Data, London: Chapman and Hall. Bhoj, D. S. and Schiefermayr, K. (2001), Approximations to the distribution of weighted combination of independent probabilities, Journal of Statistical Computation and Simulation, 68,

16 Cauchemez, S., Carrat, F., Viboud, C., Valleron, A. J. and Böelle, P. Y. (2004), A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data, Statistics in Medicine, 23, Charnes, A. and Cooper, W.W. (1962), Programming with linear fractional functionals, Naval Research Logistics Quarterly, 9, Demiris, N. and O Neill, P. D. (2005), Bayesian inference for stochastic multitype epidemics in structured populations via random graphs, Journal of the Royal Statistical Society, Ser. B, 67, Dietz, K. (1993), The estimation of the basic reproduction number for infectious diseases, Statistical methods in medical research, 2, Gibson, G. J. and Renshaw, E. (1998), Estimating parameters in stochastic compartmental models using Markov Chain methods, IMA Journal of Mathematics Applied in Medicine and Biology, 15, Höhle, M., Jørgensen, E. and O Neill, P. D. (2005), Inference in disease transmission experiments by using stochastic epidemic models, Applied Statistics, 54, Ibaraki, T. (1981), Solving mathematical programming problems with fractional objective functions, In Generalized Concavity in Optimization and Economics, eds. S. Schaible and W.T. Ziemba, New York: Academic Press, pp Kijima, M. and Seneta, E. (1991), Some results for quasistationary distributions of birth-death processes, Journal of Applied Probability, 28, Li, N., Qian, G., and Huggins, R. (2002), Analysis of between-household heterogeneity in disease transmission from data on outbreak sizes, Australia and New Zealand Journal of Statistics 44, McBryde, E.S., Gibson, G., Pettitt, A.N., Zhang, Y., Zhao, B. and McElwain, D.L.S. (2006), Bayesian modelling of the severe acute respiratory syndrome epidemic, Bulletin of Mathematical Biology, 68, Martos, B. (1965), The direct power of adjacent vertex programming methods, Management Science, 12, Neal, P. and Roberts, G. O. (2005), A case study in non-centering for data augmentation: stochastic epidemics, Statistics and Computing, 15, O Neill, P. D., Balding, D. J., Becker, N. G., Eerola, M. and Mollison, D. (2000), Analyses of infectious disease data from household outbreaks by Markov Chain Monte Carlo methods, Applied Statistics, 49, O Neill, P. D. and Marks, P. J. (2005), Bayesian model choice and infection route modelling in an outbreak of Norovirus, Statistics in Medicine, 24, O Neill, P. D. and Roberts, G. O. (1999), Bayesian inference for partially observed stochastic epidemics, Journal of the Royal Statistical Society, Ser. A, 162, Streftaris, G. and Gibson, G.J. (2004), Bayesian inference for stochastic epidemics in closed populations, Statistical Modelling, 4,

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation