Bayesian estimation of the basic reproduction number in stochastic epidemic models

Size: px
Start display at page:

Download "Bayesian estimation of the basic reproduction number in stochastic epidemic models"

Transcription

1 Bayesian estimation of the basic reproduction number in stochastic epidemic models Damian Clancy Department of Mathematical Sciences, University of Liverpool Philip D. O Neill School of Mathematical Sciences, University of Nottingham Abstract. In recent years there has been considerable activity in the development and application of Bayesian inferential methods for infectious disease data using stochastic epidemic models. Most of this activity has employed computationally intensive approaches such as Markov chain Monte Carlo methods. In contrast, here we address fundamental questions for Bayesian inference in the setting of the standard SIR (Susceptible-Infective-Removed) epidemic model via simple methods. Our main focus is on the basic reproduction number, a quantity of central importance in mathematical epidemic theory, whose value essentially dictates whether or not a large epidemic outbreak can occur. We specifically consider two SIR models routinely employed in the literature, namely the model with exponentially distributed infectious periods, and the model with fixed length infectious periods. It is assumed that an epidemic outbreak is observed through time. Given complete observation of the epidemic, we derive explicit expressions for the posterior densities of the model parameters and the basic reproduction number. For partial observation of the epidemic, when the entire infection process is unobserved, we derive conservative bounds for quantities such as the mean of the basic reproduction number and the probability that a major epidemic outbreak will occur. If the time at which the epidemic started is observed, then linear programming methods can be used to derive suitable bounds for the mean of the basic reproduction number and similar quantities. Numerical examples are used to illustrate the practical consequences of our findings. In addition, we also examine the implications of commonly-used prior distributions on the basic model parameters as regards inference for the basic reproduction number. Keywords: epidemics; basic reproduction number; Bayesian inference; linear programming; stochastic epidemic models 1

2 1 Introduction In recent years there has been considerable activity in both the methodological development and application of methods for Bayesian data analysis of infectious disease outbreak data using stochastic epidemic models. Almost all of this literature employs Markov chain Monte Carlo (MCMC) methodology, which offers enormous power and flexibility compared to other approaches (see e.g. Gibson and Renshaw, 1998; O Neill and Roberts, 1999; O Neill et al., 2000; Streftaris and Gibson, 2004; Neal and Roberts, 2005). The methods have been applied to many different human, animal and plant pathogens and diseases, examples of which include pneumococcal carriage (Auranen et al., 2000), measles (Li et al., 2002), swine fever (Höhle et al., 2005), influenza (Cauchemez et al., 2004; Demiris and O Neill, 2005), norovirus (O Neill and Marks, 2005) and nosocomial infections (McBryde et al., 2006). Despite the advances mentioned above, one notable absence in the current literature is any treatment of Bayesian inference for one of the most basic stochastic epidemic models, namely the SIR (Susceptible-Infective-Removed) model, without recourse to MCMC methods. This situation is in stark contrast to the classical case, where estimation methods for the SIR model, at least in the case where the model is Markov, are long-established (see e.g. Becker, 1989, Chapter 7; Andersson and Britton, 2000, Chapters 9 and 10). It is therefore of interest to see what can be achieved without computationally intensive methods such as MCMC. An additional and important motivation for our study is that Bayesian analysis of the SIR model provides insights that are useful in the analysis of more complex and realistic models. SIR models themselves are frequently used as components of more complex epidemic models, for example those featuring populations divided into households (Demiris and O Neill, 2005) or epidemics on networks (Andersson and Britton, 2000, Chapter 7). An example of such an insight, described in detail below, is the extent to which apparently natural (and commonly used) prior distributions for model parameters can affect the resulting inference. In this paper, our attention generally focuses on the basic reproduction number R 0, informally defined as the average number of secondary cases caused by a single infective individual in a large susceptible population. This quantity is of enormous importance within epidemic modelling because, roughly speaking, if R 0 1 then an epidemic is highly unlikely to occur. Estimation of R 0, or equivalent parameters in more complex models, can usually be achieved via MCMC methods. Among other things, we derive a closed-form expression for the posterior density of R 0 given suitably complete data, and bounds on various quantites (e.g. the mean) for other data scenarios. The paper is organised as follows. Section 2 describes the standard SIR epidemic model and recalls some results required in the sequel. Sections 3 and 4 consider two different special cases of the SIR model, corresponding to exponential and constant infectious periods, respectively. Some brief concluding comments are given in Section 5. 2

3 2 Preliminaries 2.1 The standard SIR model We begin by recalling the definition of the standard SIR (Susceptible-Infective-Removed) stochastic epidemic model (see e.g. Andersson and Britton, p.11). Consider a population of N individuals, assumed to mix together homogeneously. At each time t 0, every individual in the population is either susceptible, infective or removed, with the numbers in each category denoted S(t), I(t) and R(t), respectively, so that S(t) + I(t) + R(t) = N. At time t = 0, the population contains only infectives and susceptibles, so that S(0) 1, I(0) 1 and R(0) = 0. Each infective individual remains so for a period of time (called the infectious period) having an arbitrary but specified distribution T I, before becoming removed. Removed individuals play no further part in the epidemic. The infectious periods of different individuals are assumed to be mutually independent. During its infectious period, an infective individual has infectious contacts with each susceptible individual at times given by the points of a homogeneous Poisson process of rate β/n, with these processes being mutually independent. Each such contact results in the susceptible immediately becoming infective. Since the number of susceptible-infective pairs at time t 0 is S(t)I(t), it follows that the overall rate of infection at time t is βs(t)i(t)/n. The epidemic ends as soon as there are no more infectives left in the population. 2.2 The basic reproduction number A quantity of major importance within mathematical epidemic theory is the basic reproduction number R 0, heuristically defined as the average number of new infections caused by a single infective in a large susceptible population (see e.g. Dietz, 1993). This quantity is important because roughly speaking, in a large population, a large epidemic outbreak can occur if and only if R 0 > 1. When R 0 > 1 the epidemic is said to be above threshold. Knowledge of the value of R 0 makes it possible to calculate the proportion of a population that should be vaccinated in order to prevent an epidemic from occurring. Both the definition and threshold interpretation of R 0 can be made mathematically precise by allowing the population size to tend to infinity, so that R 0 essentially becomes the mean offspring size of a branching process of new infections (see e.g. Andersson and Britton, 1999, p. 22.) For the standard SIR model defined above, R 0 = βe[t I ]. 2.3 Data and notation In the following, we focus on an epidemic outbreak that results in a total of n removals, where 1 n N. Our main interest will be in making inference about the parameters of the epidemic, and in particular R 0, given observation of the removal process alone. However, to begin with it is fruitful to consider inference given complete observation of the epidemic process, i.e. observing both infections and removals. In some settings such a detailed level of observation is not that unrealistic, e.g. animal experiments in which the animals are regularly tested for exposure to the pathogen in question. 3

4 We start with some notation. Suppose that the epidemic begins with a single infection at time i 1, so that (S(i 1 ), I(i 1 ), R(i 1 )) = (N 1, 1, 0). Subsequent infections occur at times i 2 i 3... i n, where i 2 i 1, and removals occur at times r 1 r 2... r n. We suppose that the period of observation is [i 1, r n ], so that it is assumed that the entire epidemic is observed, and define τ := r n. We write r = (r 1 r 2... r n ) and i = (i 2 i 3... i n ). It is important to note that the infection and removal times must satisfy the inequalities i k+1 r k for k = 1, 2,..., n 1. In particular, this constraint ensures that the number of infectives does not reach zero until the time of the last removal, r n. For a given r define Er to be the set of all infection times (i 1, i) satisfying i k i k+1 r k for k = 1, 2,..., n 1. Thus Er contains all possible configurations of infection times for a given set of ordered removal times r. 2.4 Ratio of independent Gamma distributions In the sequel we will make some use of the following facts (see e.g. Bhoj and Schiefermayr, 2001). Denoting by Γ(a, b) a Gamma random variable with mean and variance a/b and a/b 2, respectively, let X Γ(a, b) and Y Γ(c, d) be independent, and define W = X/Y. Then W has probability density function given by ( ) a b Γ(a + c) w a 1 f W (w) =, w 0, d Γ(a)Γ(c) + 1)a+c and E[W k ] = ( bw d ( ) k d Γ(a + k)γ(c k), k = 1, 2,... [c]. b Γ(a)Γ(c) It is straightforward to show that W has mode 0 d(a 1)/b(c + 1), where denotes maximum. Finally, a little algebra shows that the distribution function of W is given by ( ) a b Γ(a + c) w a F W (w) = d Γ(a)Γ(c) a 2 F 1 (a + c, a; a + 1; bw/d), w 0, (1) where p F q (n 1,..., n p ; m 1,..., m q ; x) denotes the hypergeometric function defined by pf q (n 1,..., n p ; m 1,..., m q ; x) = k=0 x k (n 1 ) k (n 2 ) k... (n p ) k, k! (m 1 ) k (m 2 ) k... (m q ) k where (x) 0 = 1 and for k = 1, 2,..., (x) k = (x)(x + 1)... (x + k 1). 3 Exponential infectious period In this section we suppose that the infectious period distribution is exponential with mean E[T I ] = γ 1. This model is often known as the general stochastic epidemic, and is the most widely-studied SIR stochastic epidemic model. This model is also the natural analogue of the deterministic SIR epidemic model, defined in terms of differential equations (see e.g. Bailey, 1975, p. 82), which is itself a component of many deterministic epidemic models studied in the literature. 4

5 3.1 Likelihood The likelihood of the infection and removal times given the model parameters β, γ and i 1 is ( n ) ( n ) π(i, r β, γ, i 1 ) = βn 1 S(i j )I(i j ) γi(r j ) j=2 j=1 ( τ ) exp βn 1 S(t)I(t) + γi(t) dt 1 {(i1,i) Er}, (2) i 1 where S(t ) = lim s t S(s), etc., see for example O Neill and Roberts (1999). 3.2 Parameter prior distributions Suppose that β and γ are, a priori, independent and respectively distributed as Γ(m β, λ β ) and Γ(m γ, λ γ ). As recalled below, this choice of prior distributions is convenient in terms of Bayesian inference due to conjugacy (O Neill and Roberts, 1999). Furthermore, the flexibility of the Gamma distribution means that it is frequently used in practice as a prior distribution for rate parameters in epidemic models (see e.g. Auranen et al., 2000; Cauchemez et al., 2004; Streftaris and Gibson, 2004). However, the induced prior for R 0 is rarely mentioned in the literature, and we now consider this for the current model. Since R 0 = β/γ, applying the results in Section 2.4 yields that R 0 has prior density ( ) mβ λβ Γ(m β + m γ ) R m β 1 0 f(r 0 ) = λ γ Γ(m β )Γ(m γ ) ( λ βr 0, R 0 0, λ γ + 1) m β+m γ and prior mean, mode and variance given, when they exist, by E[R 0 ] = m β λ γ, mode(r 0 ) = λ γ(m β 1) (m γ 1)λ β λ β (m γ + 1) 0, var[r 0 ] = m β(m β + m γ 1) (m γ 1) 2 (m γ 2) In practice, it is often the case that uninformative Γ(m, λ) prior distributions are assigned, popular choices being (m, λ) = (1, ɛ) or (m, λ) = (ɛ, ɛ) where ɛ is a small positive number, or zero. In the present case, we note that if m γ 1 then R 0 has infinite mean a priori, and if m γ 2 then R 0 has infinite prior variance. In other words, such vague priors on β and γ yield a vague prior for R 0. As recalled in Section 2.2, the question of whether or not R 0 > 1 is often of interest. In the Bayesian setting it is natural to consider the posterior mean of R 0 as a suitable summary measure. Suppose now that β and γ are assigned the same prior distribution, so that m = m β = m γ, λ = λ β = λ γ ; a typical case in practice would be m = 1, λ some small positive number. However, it then follows that E(R 0 ) > 1. This suggests some need for caution in using the mean as the sole means of assessing whether or not an epidemic is above threshold, a point that we shall return to in the sequel. However, here it is also the case that P (R 0 > 1) = 0.5, so that a priori the epidemic is equally likely to be above or below threshold. 5 ( λγ λ β ) 2.

6 3.3 Parameter posterior distributions By Bayes Theorem, the joint posterior density of β and γ given i, r and i 1 is defined by π(β, γ i, r, i 1 ) π(i, r β, γ, i 1 )π(β)π(γ). It follows from (2) that where π(β i, r, i 1 ) Γ ( n + m β 1, λ β + N 1 ξ SI ), π(γ i, r, i 1 ) Γ (n + m γ, λ γ + ξ I ) ξ I = τ i 1 I(t) dt, ξ SI = τ i 1 S(t)I(t) dt, see O Neill and Roberts (1999). Moreover, the posterior densities of β and γ are independent, and thus the distribution of R 0 given i, r and i 1 is a ratio of two independent Gamma random variables. It follows that ( λβ + N 1 ξ SI π(r 0 i, r, i 1 ) = λ γ + ξ I Γ(2n + m β + m γ 1) Γ(n + m β 1)Γ(n + m γ ) and, for n + m γ > 2, E[R 0 i, r, i 1 ] = mode[r 0 i, r, i 1 ] = ) n+mβ 1 R n+m β 2 0 (( λβ +N 1 ξ SI λ γ+ξ I ) R var[r 0 i, r, i 1 ] = (2n + m β + m γ 2)(n + m β 1) (n + m γ 1) 2 (n + m γ 2) ) 2n+mβ +m γ 1, R 0 0, (3) ( ) ( ) n + mβ 1 λγ + ξ I, (4) n + m γ 1 λ β + N 1 ξ SI ( ) ( ) n + mβ 2 λγ + ξ I 0, (5) n + m γ + 1 λ β + N 1 ξ SI ( λγ + ξ I λ β + N 1 ξ SI ) 2. (6) Note from (3) that the posterior density of R 0 is only dependent on the infection and removal times via the quantity (λ γ + ξ I )/(λ β + N 1 ξ SI ). This is in accord with the estimator of R 0 given by the ratio of the maximum likelihood estimators of β and γ given i 1, i and r, namely ˆR 0 = N(n 1)ξ I /nξ SI (Andersson and Britton, 2000, p. 93). Now since S(t) N 1 for i 1 < t τ, it follows that ξ SI (N 1)ξ I < Nξ I (the final inequality requires i 1 < τ, which we shall assume true). It follows from (4) that if the prior distributions of β and γ are identical then E[R 0 i, r, i 1 ] > 1. Such a conclusion seems to be an artefact of the choice of gamma distributions as priors for β and γ, along with the induced prior distribution on R 0 as discussed in Section 3.2, and reinforces the need for caution in interpreting the posterior mean of R 0. Arguing similarly for the posterior mode of R 0 defined in (5) yields that mode[r 0 i, r, i 1 ] > (n + m 2)/(n + m + 1), where m = m β = m γ, while the classical estimator ˆR 0 mentioned above satisfies ˆR 0 > (n 1)/n. 6

7 3.4 Bounding the posterior mean of R 0 In practice, infection times are rarely observed, and so we now turn our attention to summary measures of R 0 given data on removal times alone. We assume intially that the start time of the epidemic, i 1, is also known, but this assumption is relaxed later. Without loss of generality, we set i 1 = 0. We now show how to compute bounds on the posterior mean of R 0. First note that min i E[R 0 i, r, i 1 ] E[R 0 r, i 1 ] max E[R 0 i, r, i 1 ], i and thus we can obtain bounds by minimising or maximising (4) over all possible infection times i. This is equivalent to minimising or maximising the function h (i) defined (for given r, and i 1 = 0) by h (i) = λ γ + ξ I λ β + N 1 ξ SI. (7) In fact, maximising h (i) is relatively straightforward in some cases, as follows. For i 1 t τ we have S(t) N n, so that ξ SI (N n)ξ I, and hence h (i) λ γ + ξ I λ β + ((N n)/n)ξ I. (8) It is straightforward to show that the right hand side of (8) is non-decreasing in ξ I if and only if (λ β /λ γ ) (N n)/n. Now ξ I is maximised when all infections occur at time i 1 = 0, in which case ξ I = n r k. It follows that, for (λ β /λ γ ) (N n)/n, h (i) λ γ + n r k λ β + ((N n)/n) n r, (9) k and moreover this bound is attained in the soon as possible scenario i 2 = i 3 = = i n = 0. It is tempting to conjecture that the late as possible scenario, in which i k+1 = r k for k = 1, 2,..., n 1, will provide the minimal value of h(i), at least when λ β = λ γ = 0 corresponding to vague priors. That this is not the case is demonstrated by the following explicit counterexample. N = 11; n = 7; r = (3, 4, 5, 6, 7, 8, 11); λ β = λ γ = 0. (10) In this case the late as possible process has infection times i = (3, 4, 5, 6, 7, 8), giving h(i) = 11/7. If instead we take i = (1, 2, 5, 6, 7, 8) then we find that h(i) = 165/106 < 11/7. Note that this example also shows that the minimal infection-timevector depends on the entire course of the epidemic, since the late as possible process does minimise the ratio of interest for 0 t 3. To find the minimal value of h (i), it is helpful first to express the integrals ξ I and ξ SI in terms of the removal times and infection times. Clearly, ξ I = (r k i k ). 7

8 Defining i n+1 = i n+2 = = i N =, then it can be shown (Neal and Roberts, 2005) that ξ SI = N (r k i j i k i j ), (11) j=1 where we use to denote minimum. Recalling that i k i k+1 r k for k = 1, 2,..., n 1 and using the fact that i k = for k n + 1, we can re-write (11) as ξ SI = = = = j=1 r k i j + (N n)r k n 1 k+1 i j + i j + j=1 j=1 i j i 1 + j=1 n 2 j=k+2 (n j + 1)i j + j=1 (n j + 1)i j j=1 (N n)r k + j=1 r k i j + n 2 (N k)i k (k + 1 N)i k + k i j N j=k+1 (N n)r k j=k+2 n 2 r k i j + j=k+2 i k j=1 k=j i j (N n)r k r k i j, (N k)i k since i 1 = 0. Thus in the definition (7) of h(i), the numerator is an affine function of the infection times i, while the denominator is an affine function of i together with the set of variables {r k i j : k = 1, 2,..., n 2, j = k + 2, k + 3,..., n}. In order to minimise h(i), define = h(i, a) λ β + (1 (n/n)) n r k + (1/N) λ γ + n r k n ( n i k (k + 1 N)i k + n 2 n j=k+2 a kj ),(12) where a = {a kj : k = 1, 2,..., n 2, j = k + 2, k + 3,..., n}. Consider the following linear fractional program. [LFP]: Minimise h(i, a) subject to i k i k+1, k = 1, 2,..., n 1, i k+1 r k, k = 1, 2,..., n 1, a kj r k, k = 1, 2,..., n 2, j = k + 2,..., n, a kj i j, k = 1, 2,..., n 2, j = k + 2,..., n. Suppose i, a satisfy the constraints (13) and are such that a kj < r k i j for some k, j. Then from the form of the right hand side of (12) it is clear that we can reduce the value of h without violating any of the constraints (13) by increasing a kj up to r k i j while leaving i unchanged. Hence the minimum in [LFP] must be attained for some 8 (13)

9 i, a satisfying a kj = r k i j for all k, j, and therefore provides a minimum of h (i) defined by (7). Before solving [LFP], we shall show that the minimal value of h(i) is attained for some i with i k A = {i 1, r 1, r 2,..., r n 1 } for k = 2, 3,..., n. First, note that a linear fractional program is known to attain its minimal value at a vertex of the feasible region (Martos, 1965). Now consider a set of infection times i with i k / A for some k {2, 3,..., n}, and take a kj = r k i j for all k, j. Take m = min {k : i k / A}, and l = max {k : i k = i m }. Set q = i m 1 max {r k : r k < i m } and p = i l+1 min {r k : i m < r k } (with the convention that i n+1 = ). Define sets of infection times i, i as follows: i k = i k for 2 k m 1; i k = q for m k l; i k = i k for l + 1 k n, i k = i k for 2 k m 1; i k = p for m k l; i k = i k for l + 1 k n. We thus have i i and i = λi + (1 λ)i for some λ such that 0 < λ < 1 (in fact, λ = (p i m ) /(p q)). Further, setting a kj = r k i j and a kj = r k i j we also have a = λa + (1 λ)a. Hence any set of infection times with i k / A for some k (together with associated a kj values) cannot lie at a vertex of the feasible region, and so the minimum of [LFP] must be attained with i k A for all k, as claimed. As an aside, it is also possible to show that for any values of λ β, λ γ the maximum value of h(i) is attained with i k A for all k using related methods (as opposed to the direct argument we used previously under the condition that (λ β /λ γ ) (N n)/n). Specifically, h(i) can be shown to be quasi-convex, from which it follows that its maximum is attained at a vertex of the feasible region (see Martos, 1965). Various methods exist for solving linear fractional programs (see Ibaraki, 1981). We follow Charnes and Cooper (1962) in transforming our problem into a linear programming problem, which can then be efficiently solved using the simplex method, software for which is widely available. Thus we consider the problem [LP]: Minimise g(c, b, t) = (λ γ + n r k) t n c k subject to ( ) ( n 2 λ β + (1 (n/n)) r k t + (1/N) (k + 1 N)c k + j=k+2 b kj ) c 1 = 0, c k c k+1, k = 1, 2,..., n 1, c k+1 r k t, k = 1, 2,..., n 1, b kj r k t, k = 1, 2,..., n 2, j = k + 2,..., n, b kj c j, k = 1, 2,..., n 2, j = k + 2,..., n, t 0, b kj 0, k = 1, 2,..., n 2, j = k + 2,..., n. = 1, Denoting by c min, b min, t min the (not necessarily unique) values of c, b, t for which the minimum of [LP] is attained, then the minimal value of h(i) is attained at i = c min /t min and is given by g ( c min, b min, t min) /t min. 3.5 Numerical examples Returning to our numerical example (10), and solving [LP] for these data using the linprog command within the Optimization Toolbox of Matlab, we find that the 9

10 minimum is achieved for i = (0, 4, 5, 6, 7, 8) with h(i) = 154/ The maximal value, given by (9) and achieved for i = (0, 0, 0, 0, 0, 0), is h(i) = 11/4 = Taking m β = m γ = 1, then from equation (4) we have thus shown that the posterior mean of R 0 for these data satisfies E [R 0 r, i 1 ] For a more realistic example, we now consider a data set obtained from a smallpox outbreak in a closed community of N = 120 individuals in Abakaliki, Nigeria (see Bailey, 1975, p125). The use of the general stochastic epidemic to model these data is not entirely appropriate, since smallpox has an appreciable latent period; however, we follow O Neill and Roberts (1999) in using these data to illustrate our methods. O Neill and Roberts (1999) used a Markov chain Monte Carlo approach to evaluate the joint posterior distribution of (β, γ), treating the unknown infection times i and i 1 as additional unknown parameters. The data consist of the following 29 inter-removal times, measured in days: 13, 7, 2, 3, 0, 0, 1, 4, 5, 3, 2, 0, 2, 0, 5, 3, 1, 4, 0, 1, 1, 1, 2, 0, 1, 5, 0, 5, 5. Note that for these data, the initial infection time i 1 is not observed. Since O Neill and Roberts (1999) found the posterior mean of γ with non-informative priors to be 0.098, so that the mean infectious period is 1/ days, we shall assume for purposes of illustration that the first removal occurs 10 days after the first infection. With this assumption, the set of removal times become r = (10, 23, 30, 32, 35, 35, 35, 36, 40, 45, 48, 50, 50, 52, 52, 57, 60, 61, 65, 65, 66, 67, 68, 70, 70, 71, 76, 76, 81, 86). We take non-informative priors, with λ β = λ γ = 0 and m β = m γ = 1. Using the Matlab linprog command to solve [LP] for the minimum, and reading off the maximum from (9), we find that for the smallpox data, E [R 0 r, i 1 ] 1.333, the maximal value being achieved with i 2 = i 3 = = i 30 = 0, the minimal value with i = (0, 0, 0, 10, 30, 35, 35, 36, 40, 45, 48, 50, 50, 52, 52, 57, 60, 61, 65, 65, 66, 67, 68, 70, 70, 71, 76, 76, 81). 3.6 Bounds when the initial infection time is unobserved So far, we have assumed that the initial infection time i 1 is known, and fixed a time origin by taking i 1 = 0. In practice it is likely that the initial infection event will not be observed, as seen for the smallpox data above. Thus we now allow i 1 to take any value, and seek bounds on h(i 1, i) under this relaxation of our assumptions. It should be noted that in most applications, plausible bounds on i 1 will be available, and so the bounds derived below are conservative. Note also that our analysis takes no account of the influence of a prior density on i 1, which from (2) is required to 10

11 estimate E[R 0 r] itself (see O Neill and Roberts, 1999). However, as described below, the bounds we derive are still surprisingly good. The upper bound (9) remains valid, except that each removal time r k must now be replaced by r k i 1. If (N n)λ γ Nλ β then (9) implies that h(i 1, i) N/(N n), and this bound is attained in the limit as i 1 with i 2 = i 3 = = i n = i 1. For the lower bound, note that S(t) N 1 on the interval (i 1, τ), so that ξ SI (N 1)ξ I and hence h (i 1, i) = λ γ + ξ I λ β + ((N 1)/N)ξ I N N 1 + λ γ (N/(N 1))λ β. λ β + ((N 1)/N)ξ I For (N 1)λ γ Nλ β this implies that h(i 1, i) N/(N 1), and this lower bound is attained in the limit as i 1 with i 2, i 3,..., i n held fixed. Note that the bounds of this section do not depend upon the observed removal times. The upper bound depends only upon the final size n of the epidemic, while the lower bound does not depend on the progress of the epidemic at all but only upon the total population size N. In the case of non-informative priors λ β = λ γ = 0, m β = m γ = 1, we have N N 1 E [R 0 r] N N n. Note in particular that we always have E[R 0 r] > 1, and that if only one removal is observed, so n = 1, then E[R 0 r] = N/(N 1). 3.7 Distributional bounds From equation (6), it is clear that bounding the integral ratio h(i) allows us to bound not only the posterior mean of R 0, but also the posterior variance. Furthermore, we now show that we can bound the whole posterior distribution of R 0 in the sense of likelihood ratio ordering of distributions. We have observed previously that the posterior density of R 0 is dependent on the infection and removal times only via the quantity h defined by equation (7). Thus we can write the posterior density (3) in the form π(r 0 h) = Γ(2n + m β + m γ 1) h 1 n m n+m β R β 2 0 Γ(n + m β 1)Γ(n + m γ ) (h 1 R 0 + 1) 2n+m β+m, R γ For R 01, R 02 > 0, with h fixed, we have the likelihood ratio π (R 01 h) π (R 02 h) = (R 01 /R 02 ) n+m β 2 [(R 01 + h) / (R 02 + h)] 2n+m β+m γ 1, 11

12 so that for h 1, h 2 > 0, π (R 01 h 1 ) π (R 02 h 1 ) / π (R01 h 2 ) π (R 02 h 2 ) = ( ) 2n+mβ +m (R01 + h 2 ) (R 02 + h 1 ) γ 1. (R 01 + h 1 ) (R 02 + h 2 ) If R 01 > R 02 and h 1 > h 2, then (R 01 + h 1 ) (R 02 + h 2 ) < (R 01 + h 2 ) (R 02 + h 1 ), and so π (R 01 h 1 ) π (R 02 h 1 ) > π (R 01 h 2 ) π (R 02 h 2 ). That is, R 0 h 2 LR R 0 h 1, where LR denotes likelihood ratio ordering of distributions. Likelihood ratio ordering implies the standard stochastic ordering between distributions (see Kijima and Seneta, 1991), so that for any r 0 > 0 and h 1 > h 2 we have P (R 0 r 0 h 1 ) P (R 0 r 0 h 2 ), and for any non-decreasing real-valued function θ (R 0 ), E [θ (R 0 ) h 1 ] E [θ (R 0 ) h 2 ]. In particular, if h min, h max are the minimal and maximal possible values, respectively, of h given the observed data, then the probability that the epidemic is below threshold may be bounded by P (R 0 1 h max ) P (R 0 1 r) P ( R 0 1 h min). From (1), taking λ β = λ γ = 0 and m β = m γ = 1 to give non-informative priors, we have P (R 0 1 h) = Γ(2n + 1) 2F 1 (2n + 1, n; n + 1; h 1 ). Γ(n)Γ(n + 1) nh n When the removal times r are observed, but the initial infection time i 1 is not, then with these non-informative priors we have seen that h min = N/(N 1) and h max = N/(N n). For the Abakaliki data, with N = 120, n = 30, this yields P (R 0 1 r) If we suppose, as in Section 3.5, that the initial infection time i 1 is observed and that r 1 i 1 = 10 days, then we obtain the slightly tighter bounds P (R 0 1 r, r 1 i 1 = 10) Constant infectious period In this section we briefly consider the standard SIR model in which the infectious period is simply a constant, so that T I = c almost surely. Such a choice of infectious 12

13 period is often more realistic than the exponential infectious period considered in the previous section. In this case, the basic reproduction number is R 0 = βc. For this model, and a given set of removal times r 1 r 2... r n, we have i k = r k c, k = 1,..., n, which automatically implies the required ordering i k i k+1, k = 1,..., n 1. However, a necessary and sufficient condition for (i 1, i) Er (in other words, for the epidemic to contain at least one infective during (i 1, τ)) is that c r := max 1 k n 1 (r k+1 r k ). To see this, note that the condition implies that i k+1 = r k+1 c r k for k = 1,..., n 1, so that (i 1, i) Er. Conversely if c < r k+1 r k for some 1 k n 1 then we have i k+1 > r k. Thus the likelihood is given by ( n ) ( τ ) π(i, r β, c, i 1 ) = βn 1 S(i j )I(i j ) exp βn 1 S(t)I(t) dt i 1 j=2 1 { r c} 1 {ik =r k c,,...,n}. It follows that, if r, i and i 1 are all observed, then inference for c is trivial. Specifically, c is either a point mass at the value dictated by r, i and i 1, or else the observations have zero probability density and the model is inappropriate. If just the removals are observed, then β Γ(m β, λ β ) a priori yields that, for c r, π(β r, c) Γ ( n + m β 1, λ β + N 1 ξ SI ), where here ξ SI is dependent on the value of c. It follows that for c r, π(r 0 r, c) Γ ( n + m β 1, (λ β + N 1 ξ SI )c 1). (14) The posterior density π(r 0 r) can in principle be obtained from (14) by integrating c out of the product π(r 0 r, c)π(c), where π(c) is the prior density for c. In general this must be done numerically: analytical expressions are hard to obtain because of the way that ξ SI depends on c. However, it is possible to show that R 0 is stochastically non-decreasing in c, as follows. Recall that for j = 1,..., n we have i j = r j c, while if j n + 1 then i j = r j =. Substitution into (11) and a few lines of algebra yields that, for c r, ξ SI (c) = Nnc + {[r k (r j c)] (r k r j )}, (15) whence j=1 c ξ SI(c) = Nnc j=1 c1 {rk >r j c} (16) (except at the finite set of values c = r j r k for 1 j, k n, where the derivative is undefined). Next, for c r define ψ(c) = c/[λ β + N 1 ξ SI (c)], and observe that if ψ(c) is nondecreasing in c, then R 0 r, c will be stochastically non-decreasing in c. Now for any λ β 0, ψ (c) 0 if ξ SI (c) c ξ SI (c). By (15) and (16), this last inequality holds if and only if {(r k r j ) [r k (r j c)]}. j=1 c1 {rk >r j c} j=1 13

14 Now for 1 j, k n, r k r j c implies that (r k r j ) [r k (r j c)] = r k r k = 0, and thus ξ SI (c) c ξ SI (c) if and only if c1 {rk >r j c} {(r k r j ) [r k (r j c)]} 1 {rk >r j c}. j=1 j=1 However, for 1 j, k n, r k > r j c implies that 0 (r k r j ) [r k (r j c)] c, and the result follows. Now if c > r n r 1, we have that r k > r j c for all 1 j, k n. It follows from (15) that ξ SI (c) = (N n)nc + [r j (r k r j )], j=1 and in particular ξ SI (c)/c n(n n) as c. Therefore, ψ(c) N/[n(N n)] as c, which along with the fact that ψ(c) ψ( r) yields distributional bounds on R 0 r, c. For example, whatever the prior density for c, we have r(n + m β 1) λ β + N 1 ξ SI ( r) E[R 0 r] N(n + m β 1), (17) n(n n) P (R 0 1 r, c = ) P (R 0 1 r) P (R 0 1 r, c = r). (18) Note that the upper and lower bounds in (18) are straightforward to evaluate numerically via equation (14). We finish with numerical illustrations for the smallpox data described above. For these data we have c r = 13. Note that this value itself is considerably larger than typical estimates of the mean infectious period in the exponential infectious period model. Assuming m β = 1 and λ β = 0, the values of E[R 0 r, c] for c = 13, 14, 15 are, respectively, , and From (17) we have the bounds E[R 0 r] Note that this range of values lies within the bounds E[R 0 r] obtained above for the exponential infectious period case, suggesting that inference for the mean of R 0 is unlikely to be very different between the two different infectious period models. However, such a conclusion does not apply to the probability that the epidemic is below threshold. Specifically, for the constant infectious period model we obtain P (R 0 1 r) , which contrasts sharply with the range P (R 0 1 r) obtained previously for the exponential infectious period. Such a marked difference can be explained in two ways. First, the posterior distribution of R 0 has a larger variance for the exponential infectious period model, which itself is unsurprising since the model contains more inherent variability. Second, recall that the extinction probability of an SIR epidemic model is defined, for R 0 > 1, as the smallest root of the equation f(s) = s in [0, 1], where f(s) = E[s R ] is the probability generating function of R, the number of new infections that an infective gives rise to among infinitely many susceptibles (see e.g. Andersson and Britton, Theorem 3.1). In the present case, if the two models have the same R 0 > 1 value, then it is straightforward to show that the extinction probability of the exponential infectious period model exceeds that of the constant infectious period model. For example, with R 0 = 4/3, the two extinction probabilities are 0.75 and , respectively. Such findings reinforce the need for caution when using R 0 as a summary measure of an epidemic. 14

15 5 Conclusions In this paper we have considered Bayesian inference for the standard SIR model, focussing particularly on the basic reproduction number in the case where the infectious period is either exponentially distributed or non-random. These two choices of infectious period distribution are of some practical interest. The exponential case is commonly used by modellers, partly for mathematical convenience, and partly because it provides a natural analogue to deterministic differential equation models, in which a constant rate of removal corresponds to an exponentially distributed infectious period. The constant infectious period case is of interest because, for many diseases, it gives a good approximation to reality. For both models, we have provided either exact expressions or bounds for the posterior distribution and summary statistics of R 0, depending on the assumed data. It is natural to consider other choices of infectious period distribution, although in general the approaches described in this paper will be harder to adopt. There are two reasons for this. The main difficulty is that the posterior distribution for R 0 given complete observation will not, in general, have a closed form. For example, assuming a Γ(α, δ) infectious period distribution gives R 0 = βα/δ, which even in the case of complete observation does not yield a tractable posterior distribution. A second drawback is that in general it is more natural to work not with a set of ordered infection times i 1 i 2... i n, but instead define infection time i k to be that of the individual removed at time r k for k = 1, 2,..., n. In this way, the part of the likelihood that corresponds to the removal process is straightforward to write down as a product of terms such as f(r k i k ), where f is the probability density function of the infectious period. However, in this setting the constraints on the set of possible infection times that ensure there is always at least one infective present become more complicated, since they now rely on the order statistics of the infection times. The linear programming approach we have taken to bound certain posterior summary statistics can be applied to related problems. For example, bounds on the posterior mean of the infection rate parameter β could be derived via similar methods. References Andersson, H. and Britton, T. (2000), Stochastic Epidemic Models and their Statistical Analysis, Lecture Notes in Statistics 151, New York: Springer-Verlag. Auranen, K., Arjas, E., Leino, T. and Takala, A. K. (2000), Transmission of Pneumococcal carriage in families: a latent Markov process model for binary longitudinal data, Journal of the American Statistical Association, 95, Bailey, N.T.J. (1975), The Mathematical Theory of Infectious Diseases and its Applications (2nd ed), London: Griffin. Becker, N. G. (1989), Analysis of Infectious Disease Data, London: Chapman and Hall. Bhoj, D. S. and Schiefermayr, K. (2001), Approximations to the distribution of weighted combination of independent probabilities, Journal of Statistical Computation and Simulation, 68,

16 Cauchemez, S., Carrat, F., Viboud, C., Valleron, A. J. and Böelle, P. Y. (2004), A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data, Statistics in Medicine, 23, Charnes, A. and Cooper, W.W. (1962), Programming with linear fractional functionals, Naval Research Logistics Quarterly, 9, Demiris, N. and O Neill, P. D. (2005), Bayesian inference for stochastic multitype epidemics in structured populations via random graphs, Journal of the Royal Statistical Society, Ser. B, 67, Dietz, K. (1993), The estimation of the basic reproduction number for infectious diseases, Statistical methods in medical research, 2, Gibson, G. J. and Renshaw, E. (1998), Estimating parameters in stochastic compartmental models using Markov Chain methods, IMA Journal of Mathematics Applied in Medicine and Biology, 15, Höhle, M., Jørgensen, E. and O Neill, P. D. (2005), Inference in disease transmission experiments by using stochastic epidemic models, Applied Statistics, 54, Ibaraki, T. (1981), Solving mathematical programming problems with fractional objective functions, In Generalized Concavity in Optimization and Economics, eds. S. Schaible and W.T. Ziemba, New York: Academic Press, pp Kijima, M. and Seneta, E. (1991), Some results for quasistationary distributions of birth-death processes, Journal of Applied Probability, 28, Li, N., Qian, G., and Huggins, R. (2002), Analysis of between-household heterogeneity in disease transmission from data on outbreak sizes, Australia and New Zealand Journal of Statistics 44, McBryde, E.S., Gibson, G., Pettitt, A.N., Zhang, Y., Zhao, B. and McElwain, D.L.S. (2006), Bayesian modelling of the severe acute respiratory syndrome epidemic, Bulletin of Mathematical Biology, 68, Martos, B. (1965), The direct power of adjacent vertex programming methods, Management Science, 12, Neal, P. and Roberts, G. O. (2005), A case study in non-centering for data augmentation: stochastic epidemics, Statistics and Computing, 15, O Neill, P. D., Balding, D. J., Becker, N. G., Eerola, M. and Mollison, D. (2000), Analyses of infectious disease data from household outbreaks by Markov Chain Monte Carlo methods, Applied Statistics, 49, O Neill, P. D. and Marks, P. J. (2005), Bayesian model choice and infection route modelling in an outbreak of Norovirus, Statistics in Medicine, 24, O Neill, P. D. and Roberts, G. O. (1999), Bayesian inference for partially observed stochastic epidemics, Journal of the Royal Statistical Society, Ser. A, 162, Streftaris, G. and Gibson, G.J. (2004), Bayesian inference for stochastic epidemics in closed populations, Statistical Modelling, 4,

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

More information

Statistical Inference for Stochastic Epidemic Models

Statistical Inference for Stochastic Epidemic Models Statistical Inference for Stochastic Epidemic Models George Streftaris 1 and Gavin J. Gibson 1 1 Department of Actuarial Mathematics & Statistics, Heriot-Watt University, Riccarton, Edinburgh EH14 4AS,

More information

Bayesian inference for stochastic multitype epidemics in structured populations using sample data

Bayesian inference for stochastic multitype epidemics in structured populations using sample data Bayesian inference for stochastic multitype epidemics in structured populations using sample data PHILIP D. O NEILL School of Mathematical Sciences, University of Nottingham, Nottingham, UK. SUMMARY This

More information

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 2 Coding and output Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. General (Markov) epidemic model 2. Non-Markov epidemic model 3. Debugging

More information

arxiv: v1 [stat.co] 13 Oct 2017

arxiv: v1 [stat.co] 13 Oct 2017 Bayes factors for partially observed stochastic epidemic models arxiv:1710.04977v1 [stat.co] 13 Oct 2017 Muteb Alharthi Theodore Kypraios Philip D. O Neill June 27, 2018 Abstract We consider the problem

More information

Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models.

Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models. Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models. An Application to the 2001 UK Foot-and-Mouth Outbreak Theodore Kypraios @ University of Nottingham Gareth O. Roberts @ Lancaster

More information

Threshold Parameter for a Random Graph Epidemic Model

Threshold Parameter for a Random Graph Epidemic Model Advances in Applied Mathematical Biosciences. ISSN 2248-9983 Volume 5, Number 1 (2014), pp. 35-41 International Research Publication House http://www.irphouse.com Threshold Parameter for a Random Graph

More information

Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination

Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination arxiv:1410.4469v [q-bio.pe] 10 Dec 015 Frank Ball 1, Lorenzo Pellis

More information

Bayesian inference for stochastic multitype epidemics in structured populations via random graphs

Bayesian inference for stochastic multitype epidemics in structured populations via random graphs Bayesian inference for stochastic multitype epidemics in structured populations via random graphs Nikolaos Demiris Medical Research Council Biostatistics Unit, Cambridge, UK and Philip D. O Neill 1 University

More information

Markov-modulated interactions in SIR epidemics

Markov-modulated interactions in SIR epidemics Markov-modulated interactions in SIR epidemics E. Almaraz 1, A. Gómez-Corral 2 (1)Departamento de Estadística e Investigación Operativa, Facultad de Ciencias Matemáticas (UCM), (2)Instituto de Ciencias

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Centre for Research on Inner City Health St Michael s Hospital Toronto On leave from Department of Mathematics University of Manitoba Julien Arino@umanitoba.ca

More information

Bayesian Inference for Contact Networks Given Epidemic Data

Bayesian Inference for Contact Networks Given Epidemic Data Bayesian Inference for Contact Networks Given Epidemic Data Chris Groendyke, David Welch, Shweta Bansal, David Hunter Departments of Statistics and Biology Pennsylvania State University SAMSI, April 17,

More information

A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation

A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation Theodore Kypraios 1, Peter Neal 2, Dennis Prangle 3 June 15, 2016 1 University of Nottingham,

More information

Partially Observed Stochastic Epidemics

Partially Observed Stochastic Epidemics Institut de Mathématiques Ecole Polytechnique Fédérale de Lausanne victor.panaretos@epfl.ch http://smat.epfl.ch Stochastic Epidemics Stochastic Epidemic = stochastic model for evolution of epidemic { stochastic

More information

Introduction to SEIR Models

Introduction to SEIR Models Department of Epidemiology and Public Health Health Systems Research and Dynamical Modelling Unit Introduction to SEIR Models Nakul Chitnis Workshop on Mathematical Models of Climate Variability, Environmental

More information

Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems

Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems Branch-and-cut Approaches for Chance-constrained Formulations of Reliable Network Design Problems Yongjia Song James R. Luedtke August 9, 2012 Abstract We study solution approaches for the design of reliably

More information

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015 Markov Chains Arnoldo Frigessi Bernd Heidergott November 4, 2015 1 Introduction Markov chains are stochastic models which play an important role in many applications in areas as diverse as biology, finance,

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Downloaded from:

Downloaded from: Camacho, A; Kucharski, AJ; Funk, S; Breman, J; Piot, P; Edmunds, WJ (2014) Potential for large outbreaks of Ebola virus disease. Epidemics, 9. pp. 70-8. ISSN 1755-4365 DOI: https://doi.org/10.1016/j.epidem.2014.09.003

More information

Stochastic modelling of epidemic spread

Stochastic modelling of epidemic spread Stochastic modelling of epidemic spread Julien Arino Department of Mathematics University of Manitoba Winnipeg Julien Arino@umanitoba.ca 19 May 2012 1 Introduction 2 Stochastic processes 3 The SIS model

More information

MATHEMATICAL MODELS Vol. III - Mathematical Models in Epidemiology - M. G. Roberts, J. A. P. Heesterbeek

MATHEMATICAL MODELS Vol. III - Mathematical Models in Epidemiology - M. G. Roberts, J. A. P. Heesterbeek MATHEMATICAL MODELS I EPIDEMIOLOGY M. G. Roberts Institute of Information and Mathematical Sciences, Massey University, Auckland, ew Zealand J. A. P. Heesterbeek Faculty of Veterinary Medicine, Utrecht

More information

Transmission in finite populations

Transmission in finite populations Transmission in finite populations Juliet Pulliam, PhD Department of Biology and Emerging Pathogens Institute University of Florida and RAPIDD Program, DIEPS Fogarty International Center US National Institutes

More information

Epidemics in Networks Part 2 Compartmental Disease Models

Epidemics in Networks Part 2 Compartmental Disease Models Epidemics in Networks Part 2 Compartmental Disease Models Joel C. Miller & Tom Hladish 18 20 July 2018 1 / 35 Introduction to Compartmental Models Dynamics R 0 Epidemic Probability Epidemic size Review

More information

The SIS and SIR stochastic epidemic models revisited

The SIS and SIR stochastic epidemic models revisited The SIS and SIR stochastic epidemic models revisited Jesús Artalejo Faculty of Mathematics, University Complutense of Madrid Madrid, Spain jesus_artalejomat.ucm.es BCAM Talk, June 16, 2011 Talk Schedule

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

Models of Infectious Disease Formal Demography Stanford Summer Short Course James Holland Jones, Instructor. August 15, 2005

Models of Infectious Disease Formal Demography Stanford Summer Short Course James Holland Jones, Instructor. August 15, 2005 Models of Infectious Disease Formal Demography Stanford Summer Short Course James Holland Jones, Instructor August 15, 2005 1 Outline 1. Compartmental Thinking 2. Simple Epidemic (a) Epidemic Curve 1:

More information

LAW OF LARGE NUMBERS FOR THE SIRS EPIDEMIC

LAW OF LARGE NUMBERS FOR THE SIRS EPIDEMIC LAW OF LARGE NUMBERS FOR THE SIRS EPIDEMIC R. G. DOLGOARSHINNYKH Abstract. We establish law of large numbers for SIRS stochastic epidemic processes: as the population size increases the paths of SIRS epidemic

More information

On the errors introduced by the naive Bayes independence assumption

On the errors introduced by the naive Bayes independence assumption On the errors introduced by the naive Bayes independence assumption Author Matthijs de Wachter 3671100 Utrecht University Master Thesis Artificial Intelligence Supervisor Dr. Silja Renooij Department of

More information

EPIDEMICS WITH TWO LEVELS OF MIXING

EPIDEMICS WITH TWO LEVELS OF MIXING The Annals of Applied Probability 1997, Vol. 7, No. 1, 46 89 EPIDEMICS WITH TWO LEVELS OF MIXING By Frank Ball, Denis Mollison and Gianpaolo Scalia-Tomba University of Nottingham, Heriot-Watt University

More information

Example: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected

Example: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected 4. Markov Chains A discrete time process {X n,n = 0,1,2,...} with discrete state space X n {0,1,2,...} is a Markov chain if it has the Markov property: P[X n+1 =j X n =i,x n 1 =i n 1,...,X 0 =i 0 ] = P[X

More information

Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0

Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0 Mathematical Statistics Stockholm University Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0 Lorenzo Pellis Frank Ball Pieter Trapman

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Preliminaries. Probabilities. Maximum Likelihood. Bayesian

More information

Endemic persistence or disease extinction: the effect of separation into subcommunities

Endemic persistence or disease extinction: the effect of separation into subcommunities Mathematical Statistics Stockholm University Endemic persistence or disease extinction: the effect of separation into subcommunities Mathias Lindholm Tom Britton Research Report 2006:6 ISSN 1650-0377 Postal

More information

A Generic Multivariate Distribution for Counting Data

A Generic Multivariate Distribution for Counting Data arxiv:1103.4866v1 [stat.ap] 24 Mar 2011 A Generic Multivariate Distribution for Counting Data Marcos Capistrán and J. Andrés Christen Centro de Investigación en Matemáticas, A. C. (CIMAT) Guanajuato, MEXICO.

More information

Mathematical Epidemiology Lecture 1. Matylda Jabłońska-Sabuka

Mathematical Epidemiology Lecture 1. Matylda Jabłońska-Sabuka Lecture 1 Lappeenranta University of Technology Wrocław, Fall 2013 What is? Basic terminology Epidemiology is the subject that studies the spread of diseases in populations, and primarily the human populations.

More information

6. Age structure. for a, t IR +, subject to the boundary condition. (6.3) p(0; t) = and to the initial condition

6. Age structure. for a, t IR +, subject to the boundary condition. (6.3) p(0; t) = and to the initial condition 6. Age structure In this section we introduce a dependence of the force of infection upon the chronological age of individuals participating in the epidemic. Age has been recognized as an important factor

More information

PARAMETER ESTIMATION IN EPIDEMIC MODELS: SIMPLIFIED FORMULAS

PARAMETER ESTIMATION IN EPIDEMIC MODELS: SIMPLIFIED FORMULAS CANADIAN APPLIED MATHEMATICS QUARTERLY Volume 19, Number 4, Winter 211 PARAMETER ESTIMATION IN EPIDEMIC MODELS: SIMPLIFIED FORMULAS Dedicated to Herb Freedman on the occasion of his seventieth birthday

More information

Sequential Monte Carlo Methods for Bayesian Computation

Sequential Monte Carlo Methods for Bayesian Computation Sequential Monte Carlo Methods for Bayesian Computation A. Doucet Kyoto Sept. 2012 A. Doucet (MLSS Sept. 2012) Sept. 2012 1 / 136 Motivating Example 1: Generic Bayesian Model Let X be a vector parameter

More information

Numerical solution of stochastic epidemiological models

Numerical solution of stochastic epidemiological models Numerical solution of stochastic epidemiological models John M. Drake & Pejman Rohani 1 Introduction He we expand our modeling toolkit to include methods for studying stochastic versions of the compartmental

More information

Three Disguises of 1 x = e λx

Three Disguises of 1 x = e λx Three Disguises of 1 x = e λx Chathuri Karunarathna Mudiyanselage Rabi K.C. Winfried Just Department of Mathematics, Ohio University Mathematical Biology and Dynamical Systems Seminar Ohio University November

More information

Design of Optimal Bayesian Reliability Test Plans for a Series System

Design of Optimal Bayesian Reliability Test Plans for a Series System Volume 109 No 9 2016, 125 133 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://wwwijpameu ijpameu Design of Optimal Bayesian Reliability Test Plans for a Series System P

More information

Simulating stochastic epidemics

Simulating stochastic epidemics Simulating stochastic epidemics John M. Drake & Pejman Rohani 1 Introduction This course will use the R language programming environment for computer modeling. The purpose of this exercise is to introduce

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters

Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters Journal of Modern Applied Statistical Methods Volume 13 Issue 1 Article 26 5-1-2014 Comparison of Three Calculation Methods for a Bayesian Inference of Two Poisson Parameters Yohei Kawasaki Tokyo University

More information

Essential Ideas of Mathematical Modeling in Population Dynamics

Essential Ideas of Mathematical Modeling in Population Dynamics Essential Ideas of Mathematical Modeling in Population Dynamics Toward the application for the disease transmission dynamics Hiromi SENO Research Center for Pure and Applied Mathematics Department of Computer

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Weighted Sums of Orthogonal Polynomials Related to Birth-Death Processes with Killing

Weighted Sums of Orthogonal Polynomials Related to Birth-Death Processes with Killing Advances in Dynamical Systems and Applications ISSN 0973-5321, Volume 8, Number 2, pp. 401 412 (2013) http://campus.mst.edu/adsa Weighted Sums of Orthogonal Polynomials Related to Birth-Death Processes

More information

HETEROGENEOUS MIXING IN EPIDEMIC MODELS

HETEROGENEOUS MIXING IN EPIDEMIC MODELS CANADIAN APPLIED MATHEMATICS QUARTERLY Volume 2, Number 1, Spring 212 HETEROGENEOUS MIXING IN EPIDEMIC MODELS FRED BRAUER ABSTRACT. We extend the relation between the basic reproduction number and the

More information

Inference in disease transmission experiments by using stochastic epidemic models

Inference in disease transmission experiments by using stochastic epidemic models Appl. Statist. (2005) 54, Part 2, pp. 349 366 Inference in disease transmission experiments by using stochastic epidemic models Michael Höhle, University of Munich, Germany Erik Jørgensen Danish Institute

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

Markov Networks.

Markov Networks. Markov Networks www.biostat.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts Markov network syntax Markov network semantics Potential functions Partition function

More information

MIP reformulations of some chance-constrained mathematical programs

MIP reformulations of some chance-constrained mathematical programs MIP reformulations of some chance-constrained mathematical programs Ricardo Fukasawa Department of Combinatorics & Optimization University of Waterloo December 4th, 2012 FIELDS Industrial Optimization

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

ESTIMATING LIKELIHOODS FOR SPATIO-TEMPORAL MODELS USING IMPORTANCE SAMPLING

ESTIMATING LIKELIHOODS FOR SPATIO-TEMPORAL MODELS USING IMPORTANCE SAMPLING To appear in Statistics and Computing ESTIMATING LIKELIHOODS FOR SPATIO-TEMPORAL MODELS USING IMPORTANCE SAMPLING Glenn Marion Biomathematics & Statistics Scotland, James Clerk Maxwell Building, The King

More information

Random Effects Models for Network Data

Random Effects Models for Network Data Random Effects Models for Network Data Peter D. Hoff 1 Working Paper no. 28 Center for Statistics and the Social Sciences University of Washington Seattle, WA 98195-4320 January 14, 2003 1 Department of

More information

Modelling spatio-temporal patterns of disease

Modelling spatio-temporal patterns of disease Modelling spatio-temporal patterns of disease Peter J Diggle CHICAS combining health information, computation and statistics References AEGISS Brix, A. and Diggle, P.J. (2001). Spatio-temporal prediction

More information

Understanding the contribution of space on the spread of Influenza using an Individual-based model approach

Understanding the contribution of space on the spread of Influenza using an Individual-based model approach Understanding the contribution of space on the spread of Influenza using an Individual-based model approach Shrupa Shah Joint PhD Candidate School of Mathematics and Statistics School of Population and

More information

Approximate Bayesian Computation

Approximate Bayesian Computation Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University 1st December 2015 Content Two parts: 1. The basics of approximate

More information

AARMS Homework Exercises

AARMS Homework Exercises 1 For the gamma distribution, AARMS Homework Exercises (a) Show that the mgf is M(t) = (1 βt) α for t < 1/β (b) Use the mgf to find the mean and variance of the gamma distribution 2 A well-known inequality

More information

Inference for discretely observed stochastic kinetic networks with applications to epidemic modeling

Inference for discretely observed stochastic kinetic networks with applications to epidemic modeling Biostatistics (2012), 13, 1, pp. 153 165 doi:10.1093/biostatistics/kxr019 Advance Access publication on August 10, 2011 Inference for discretely observed stochastic kinetic networks with applications to

More information

Estimating the Exponential Growth Rate and R 0

Estimating the Exponential Growth Rate and R 0 Junling Ma Department of Mathematics and Statistics, University of Victoria May 23, 2012 Introduction Daily pneumonia and influenza (P&I) deaths of 1918 pandemic influenza in Philadelphia. 900 800 700

More information

Mathematical Modeling and Analysis of Infectious Disease Dynamics

Mathematical Modeling and Analysis of Infectious Disease Dynamics Mathematical Modeling and Analysis of Infectious Disease Dynamics V. A. Bokil Department of Mathematics Oregon State University Corvallis, OR MTH 323: Mathematical Modeling May 22, 2017 V. A. Bokil (OSU-Math)

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

Bayesian inference and model selection for stochastic epidemics and other coupled hidden Markov models

Bayesian inference and model selection for stochastic epidemics and other coupled hidden Markov models Bayesian inference and model selection for stochastic epidemics and other coupled hidden Markov models (with special attention to epidemics of Escherichia coli O157:H7 in cattle) Simon Spencer 3rd May

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

STABILITY ANALYSIS OF A GENERAL SIR EPIDEMIC MODEL

STABILITY ANALYSIS OF A GENERAL SIR EPIDEMIC MODEL VFAST Transactions on Mathematics http://vfast.org/index.php/vtm@ 2013 ISSN: 2309-0022 Volume 1, Number 1, May-June, 2013 pp. 16 20 STABILITY ANALYSIS OF A GENERAL SIR EPIDEMIC MODEL Roman Ullah 1, Gul

More information

Thursday. Threshold and Sensitivity Analysis

Thursday. Threshold and Sensitivity Analysis Thursday Threshold and Sensitivity Analysis SIR Model without Demography ds dt di dt dr dt = βsi (2.1) = βsi γi (2.2) = γi (2.3) With initial conditions S(0) > 0, I(0) > 0, and R(0) = 0. This model can

More information

MAS1302 Computational Probability and Statistics

MAS1302 Computational Probability and Statistics MAS1302 Computational Probability and Statistics April 23, 2008 3. Simulating continuous random behaviour 3.1 The Continuous Uniform U(0,1) Distribution We have already used this random variable a great

More information

Approximation of epidemic models by diffusion processes and their statistical inferencedes

Approximation of epidemic models by diffusion processes and their statistical inferencedes Approximation of epidemic models by diffusion processes and their statistical inferencedes Catherine Larédo 1,2 1 UR 341, MaIAGE, INRA, Jouy-en-Josas 2 UMR 7599, LPMA, Université Paris Diderot December

More information

\ fwf The Institute for Integrating Statistics in Decision Sciences

\ fwf The Institute for Integrating Statistics in Decision Sciences # \ fwf The Institute for Integrating Statistics in Decision Sciences Technical Report TR-2007-8 May 22, 2007 Advances in Bayesian Software Reliability Modelling Fabrizio Ruggeri CNR IMATI Milano, Italy

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Mathematical modelling and controlling the dynamics of infectious diseases

Mathematical modelling and controlling the dynamics of infectious diseases Mathematical modelling and controlling the dynamics of infectious diseases Musa Mammadov Centre for Informatics and Applied Optimisation Federation University Australia 25 August 2017, School of Science,

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

POMP inference via iterated filtering

POMP inference via iterated filtering POMP inference via iterated filtering Edward Ionides University of Michigan, Department of Statistics Lecture 3 at Wharton Statistics Department Thursday 27th April, 2017 Slides are online at http://dept.stat.lsa.umich.edu/~ionides/talks/upenn

More information

Mathematical Analysis of Epidemiological Models: Introduction

Mathematical Analysis of Epidemiological Models: Introduction Mathematical Analysis of Epidemiological Models: Introduction Jan Medlock Clemson University Department of Mathematical Sciences 8 February 2010 1. Introduction. The effectiveness of improved sanitation,

More information

UNIVERSITY OF NOTTINGHAM. Discussion Papers in Economics CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY

UNIVERSITY OF NOTTINGHAM. Discussion Papers in Economics CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY UNIVERSITY OF NOTTINGHAM Discussion Papers in Economics Discussion Paper No. 0/06 CONSISTENT FIRM CHOICE AND THE THEORY OF SUPPLY by Indraneel Dasgupta July 00 DP 0/06 ISSN 1360-438 UNIVERSITY OF NOTTINGHAM

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo 1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior

More information

Inference for partially observed stochastic dynamic systems: A new algorithm, its theory and applications

Inference for partially observed stochastic dynamic systems: A new algorithm, its theory and applications Inference for partially observed stochastic dynamic systems: A new algorithm, its theory and applications Edward Ionides Department of Statistics, University of Michigan ionides@umich.edu Statistics Department

More information

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data

Petr Volf. Model for Difference of Two Series of Poisson-like Count Data Petr Volf Institute of Information Theory and Automation Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 8 Praha 8 e-mail: volf@utia.cas.cz Model for Difference of Two Series of Poisson-like

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information

A note on network reliability

A note on network reliability A note on network reliability Noga Alon Institute for Advanced Study, Princeton, NJ 08540 and Department of Mathematics Tel Aviv University, Tel Aviv, Israel Let G = (V, E) be a loopless undirected multigraph,

More information

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes

More information

Hmms with variable dimension structures and extensions

Hmms with variable dimension structures and extensions Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating

More information

On the Fisher Bingham Distribution

On the Fisher Bingham Distribution On the Fisher Bingham Distribution BY A. Kume and S.G Walker Institute of Mathematics, Statistics and Actuarial Science, University of Kent Canterbury, CT2 7NF,UK A.Kume@kent.ac.uk and S.G.Walker@kent.ac.uk

More information

Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods

Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods M. González, R. Martínez, I. del Puerto, A. Ramos Department of Mathematics. University of Extremadura Spanish

More information

University of Leeds. Project in Statistics. Math5004M. Epidemic Modelling. Supervisor: Dr Andrew J Baczkowski. Student: Ross Breckon

University of Leeds. Project in Statistics. Math5004M. Epidemic Modelling. Supervisor: Dr Andrew J Baczkowski. Student: Ross Breckon University of Leeds Project in Statistics Math5004M Epidemic Modelling Student: Ross Breckon 200602800 Supervisor: Dr Andrew J Baczkowski May 6, 2015 Abstract The outbreak of an infectious disease can

More information

Age-dependent branching processes with incubation

Age-dependent branching processes with incubation Age-dependent branching processes with incubation I. RAHIMOV Department of Mathematical Sciences, KFUPM, Box. 1339, Dhahran, 3161, Saudi Arabia e-mail: rahimov @kfupm.edu.sa We study a modification of

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

arxiv: v1 [stat.me] 30 Sep 2009

arxiv: v1 [stat.me] 30 Sep 2009 Model choice versus model criticism arxiv:0909.5673v1 [stat.me] 30 Sep 2009 Christian P. Robert 1,2, Kerrie Mengersen 3, and Carla Chen 3 1 Université Paris Dauphine, 2 CREST-INSEE, Paris, France, and

More information

Efficient MCMC Samplers for Network Tomography

Efficient MCMC Samplers for Network Tomography Efficient MCMC Samplers for Network Tomography Martin Hazelton 1 Institute of Fundamental Sciences Massey University 7 December 2015 1 Email: m.hazelton@massey.ac.nz AUT Mathematical Sciences Symposium

More information

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:

More information

Convex relaxations of chance constrained optimization problems

Convex relaxations of chance constrained optimization problems Convex relaxations of chance constrained optimization problems Shabbir Ahmed School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive, Atlanta, GA 30332. May 12, 2011

More information

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham NC 778-5 - Revised April,

More information

4452 Mathematical Modeling Lecture 16: Markov Processes

4452 Mathematical Modeling Lecture 16: Markov Processes Math Modeling Lecture 16: Markov Processes Page 1 4452 Mathematical Modeling Lecture 16: Markov Processes Introduction A stochastic model is one in which random effects are incorporated into the model.

More information

Variational Learning : From exponential families to multilinear systems

Variational Learning : From exponential families to multilinear systems Variational Learning : From exponential families to multilinear systems Ananth Ranganathan th February 005 Abstract This note aims to give a general overview of variational inference on graphical models.

More information

Temporal point processes: the conditional intensity function

Temporal point processes: the conditional intensity function Temporal point processes: the conditional intensity function Jakob Gulddahl Rasmussen December 21, 2009 Contents 1 Introduction 2 2 Evolutionary point processes 2 2.1 Evolutionarity..............................

More information

Bayes: All uncertainty is described using probability.

Bayes: All uncertainty is described using probability. Bayes: All uncertainty is described using probability. Let w be the data and θ be any unknown quantities. Likelihood. The probability model π(w θ) has θ fixed and w varying. The likelihood L(θ; w) is π(w

More information

Discussion of Maximization by Parts in Likelihood Inference

Discussion of Maximization by Parts in Likelihood Inference Discussion of Maximization by Parts in Likelihood Inference David Ruppert School of Operations Research & Industrial Engineering, 225 Rhodes Hall, Cornell University, Ithaca, NY 4853 email: dr24@cornell.edu

More information

Bayesian Graphical Models

Bayesian Graphical Models Graphical Models and Inference, Lecture 16, Michaelmas Term 2009 December 4, 2009 Parameter θ, data X = x, likelihood L(θ x) p(x θ). Express knowledge about θ through prior distribution π on θ. Inference

More information