Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models.

1 Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models. An Application to the 2001 UK Foot-and-Mouth Outbreak. Theodore Kypraios (University of Nottingham) and Gareth O. Roberts (Lancaster University). November, Cambridge

2 Introduction to Epidemic Models

3 Background Understanding the spread of an infectious disease is essential for preventing major outbreaks. In general, inference problems for disease outbreak data are complicated by two facts: the data are inherently dependent, and they are usually incomplete in the sense that the actual process of infection is not observed. However, it is often possible to formulate simple stochastic models which describe the key features of epidemic spread.

4 The SIR Model Principles Consider a closed population (no births/deaths) of N individuals, with α individuals initially infected. At any given time point, each individual i is in one of three states: Susceptible (S), Infected (I) or Removed (R).

5 The Stochastic Process Each infected individual remains infectious for a period distributed according to a random variable D. While infectious, an individual makes contacts with each of the N individuals in the population at times given by the points of a Poisson process of rate β > 0. Any such contact with a susceptible individual immediately turns the susceptible into an infective. At the end of its infectious period an infective no longer makes any contacts and is said to be removed.

6 Infectious Period The infectious periods of different infectives are assumed to be independent and identically distributed according to the distribution of a random variable D, which can have any arbitrary but specified distribution. The special (Markovian) case where the infectious period follows an Exponential distribution is known as the General Stochastic Epidemic (GSE).

7 The GSE Model The GSE model has received a lot of attention in the probabilistic literature. Although assuming an Exponential infectious period is mathematically convenient (limit results etc.), it is not biologically motivated. The GSE has been generalised in a large number of ways, for example: Gamma or Weibull distributed infectious periods (O'Neill and Becker, 2001; Streftaris and Gibson, 2004); incorporating latent periods (e.g. SEIR models).

8 Modelling the 2001 UK Foot-and-Mouth Disease (Motivation)

9 Introduction Livestock diseases are becoming increasingly important because of their welfare and economic consequences. Foot-and-Mouth disease (FMD) is considered the most important since it can spread rapidly between livestock species. In general, FMD is rarely lethal to adult livestock, but it causes blisters on the mouth and feet which often lead to a significant drop in milk production in dairy cattle, and it causes very slow weight gain in other livestock. The economic effects of infection within a country are dramatic: prevention of exports of meat and milk to other countries eliminates a vital source of revenue.

10 Effects of the FMD to the country The UK Foot-and-Mouth epidemic of 2001 is thought to have entered the UK in early February of that year. The epidemic resulted in recorded known infections and culls at Infected Premises (IPs) across the UK. Over 10,000 other farms with unknown infection status, but determined to be at high risk, had animals slaughtered. A non-IP slaughter policy was also in place, covering Contiguous Premises (CPs) and Dangerous Contacts (DCs). In total over 4,000,000 animals were slaughtered as part of the control programme to eradicate the disease.

11 Effects of the FMD to the country (cont.) The estimated costs to the UK economy of the 2001 FMD epidemic consisted of: US $4.7 billion to the food and farming industries; US $4.5 billion to the leisure and tourism industry; as well as US $3 billion in indirect costs. Source: Thompson et al. (2002)

12 Population of Farms in the UK

13 FMD in Cumbria The resulting dataset consists of N = 5378 (initially susceptible) farms in total. By the end of the outbreak, n_I = 1021 of them had been infected. The data set directly gives the cull date of each farm on which animals were culled, but the infection dates are not observed. Unfortunately, information on which farms were culled without their infection status being known (e.g. dangerous contacts) is unavailable to us.

14 Modelling Issues The GSE does not seem appropriate, since the following assumptions are questionable: a homogeneously mixing population; an Exponential infectious period. Therefore: associate the infection rate with the available risk factors, such as the distance between two farms and the numbers of different species on each of them; and assume a more plausible distribution for the infectious period (e.g. Gamma).

15 Heterogeneously Mixing Stochastic Epidemic Models (A General Framework)

16 A Heterogeneously Mixing Stochastic Epidemic Model Model Specification. Infection rate: β_ij := β_0 h(i, j) (1). Infectious period: D_i ~ Ga(α, γ) (2). The parameters of interest are β_0, γ > 0 (α is assumed known). The (deterministic) function h(i, j) encodes the individuals' characteristics and may involve further parameters.

17 Inference GOAL: Draw inference for the parameters (β_0, γ). Likelihood: f(I, R | β_0, γ) ∝ ∏_{i=1, i≠k}^{n_I} ( Σ_{j ∈ Y_i} β_ji ) × exp{ − ∫_{I_k}^{T} Σ_{i ∈ I_t} Σ_{j ∈ S_t} β_ij dt } × γ^{α n_R} ∏_{i=1}^{n_R} (R_i − I_i)^{α−1} exp{ − γ (R_i − I_i) }, where Y_i = {j : I_j < I_i < R_j}, k denotes the initial infective, I_t and S_t denote the sets of infectives and susceptibles just prior to time t, I = (I_1, ..., I_{n_I}) and R = (R_1, ..., R_{n_R}).
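
A minimal numerical sketch of this augmented-data log-likelihood is given below, written in Python/NumPy; the function and argument names are illustrative rather than taken from the slides. It assumes I and R are arrays with I_j set to infinity for individuals who are never infected, h(i, j) is supplied as a callable, and the time integral is evaluated with the double-sum identity given on slide 21.

```python
import numpy as np

def log_likelihood(beta0, gamma, alpha, I, R, h):
    """Log-likelihood sketch for the heterogeneously mixing SIR model (slide 17)."""
    n = len(I)
    infected = np.where(np.isfinite(I))[0]
    k = infected[np.argmin(I[infected])]              # initial infective

    # Product over "who could have infected i": sum_{j in Y_i} beta_ji
    log_prod = 0.0
    for i in infected:
        if i == k:
            continue
        Y_i = [j for j in infected if I[j] < I[i] < R[j]]
        pressure = sum(beta0 * h(j, i) for j in Y_i)
        if pressure <= 0.0:
            return -np.inf                            # impossible configuration
        log_prod += np.log(pressure)

    # Integral term via the double sum of slide 21:
    # sum_i sum_j beta_ij * (min(R_i, I_j) - min(I_i, I_j))
    integral = sum(beta0 * h(i, j) * (min(R[i], I[j]) - min(I[i], I[j]))
                   for i in infected for j in range(n))

    # Gamma(alpha, gamma) infectious-period terms
    D = R[infected] - I[infected]
    log_gamma_terms = (alpha * len(infected) * np.log(gamma)
                       + (alpha - 1.0) * np.log(D).sum() - gamma * D.sum())
    return log_prod - integral + log_gamma_terms
```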

18 Inference for Partially Observed Epidemics However, the infection times are rarely observed, and the inference becomes more complicated since the likelihood of the observed data (R) given the parameters (β_0, γ) is intractable. In a Bayesian framework, we need to impute the missing data I. Multiplying the prior of the parameters by the augmented likelihood gives the joint posterior distribution of the parameters and the infection times given the removal times, π(β_0, γ, I | R).

19 Full Conditional Distributions of the Parameters Model parameters: β_0 | γ, I, R ~ Ga( λ_β0 + n_I − 1, ν_β0 + ∫_{I_k}^{T} Σ_{i ∈ I_t} Σ_{j ∈ S_t} h(i, j) dt ); γ | β_0, I, R ~ Ga( λ_γ + α n_R, ν_γ + Σ_{i=1}^{n_R} (R_i − I_i) ). Missing data: π(I | R, β_0, γ) ∝ ∏_{i=1, i≠k}^{n_I} ( Σ_{j ∈ Y_i} β_ji ) × exp{ − ∫_{I_k}^{T} Σ_{i ∈ I_t} Σ_{j ∈ S_t} β_ij dt } × ∏_{i=1}^{n_R} (R_i − I_i)^{α−1} exp{ − γ (R_i − I_i) } (3)
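
A sketch of the two Gibbs draws implied by these Gamma full conditionals, continuing the conventions of the previous sketch; the hyperparameter arguments (λ, ν) and their default values are illustrative.

```python
import numpy as np

def gibbs_update_parameters(rng, I, R, h, alpha,
                            lam_beta0=1.0, nu_beta0=1.0,
                            lam_gamma=1.0, nu_gamma=1.0):
    """Draw beta_0 and gamma from their Gamma full conditionals (slide 19)."""
    infected = np.where(np.isfinite(I))[0]
    n_I = n_R = len(infected)

    # Integral of sum_i sum_j h(i, j) over the time i is infectious and j susceptible
    h_integral = sum(h(i, j) * (min(R[i], I[j]) - min(I[i], I[j]))
                     for i in infected for j in range(len(I)))
    beta0 = rng.gamma(lam_beta0 + n_I - 1, 1.0 / (nu_beta0 + h_integral))

    D_sum = (R[infected] - I[infected]).sum()
    gamma = rng.gamma(lam_gamma + alpha * n_R, 1.0 / (nu_gamma + D_sum))
    return beta0, gamma
```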

20 MCMC Algorithms for Stochastic Epidemics Gibson and Renshaw (1998) and O'Neill and Roberts (1999) were the first to apply MCMC methods to stochastic epidemics. Outline of the (centered) MCMC algorithm: 1. Initialise the algorithm; 2. Choose uniformly one (or more) infection time(s) I_j, j = 1, ..., n_I, and update it (them) individually using a Metropolis-Hastings step by proposing a replacement infection time; 3. Update the parameters β and γ via a Gibbs step.

21 MCMC Implementation Terminology: Random Scan (RS): at each MCMC iteration we update only one of the infection times (chosen at random). δ% Deterministic Scan (δ% DS): at each MCMC iteration, we choose (at random) δ% of the infection times to update. Computation: the integral involved in the likelihood can be rewritten as ∫_{I_k}^{T} Σ_{i ∈ I_t} Σ_{j ∈ S_t} β_ij dt = Σ_{i=1}^{n_I} Σ_{j=1}^{N} β_ij ( min(R_i, I_j) − min(I_i, I_j) ). This reduces the computational cost of evaluating the likelihood significantly.
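
Putting the pieces together, a skeleton of the centered algorithm with a δ% deterministic scan could look as follows. This is only a sketch: initialise_infection_times and mh_update_infection_time are assumed helpers (the latter implementing one of the proposals listed on slide 23), and gibbs_update_parameters is the sketch given after slide 19.

```python
import numpy as np

def centered_mcmc(rng, R, h, alpha, n_iter, delta=0.10):
    """Centered data-augmentation MCMC (slide 20) with a delta% deterministic scan."""
    I = initialise_infection_times(rng, R, alpha)          # step 1 (assumed helper)
    beta0, gamma = 1.0, 1.0                                # arbitrary starting values
    samples = []
    for _ in range(n_iter):
        # Step 2: update delta% of the infection times via Metropolis-Hastings
        infected = np.where(np.isfinite(I))[0]
        n_update = max(1, int(delta * len(infected)))
        for j in rng.choice(infected, size=n_update, replace=False):
            I = mh_update_infection_time(rng, j, I, R, beta0, gamma, alpha, h)
        # Step 3: update the model parameters from their Gamma full conditionals
        beta0, gamma = gibbs_update_parameters(rng, I, R, h, alpha)
        samples.append((beta0, gamma))
    return samples
```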

22 How to propose a new infection time?

23 Updating the Infection Times The Metropolis step can be implemented by choosing one of the following proposals: Independent Sampler I, O'Neill & Roberts (1999): I_j ~ U(I_k, R_k); Independent Sampler II, O'Neill & Becker (2001): R_j − I_j ~ Ga(α, γ); Random Walk Metropolis, Neal and Roberts (2003): I_j ~ N(I_j, σ²).
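
As a concrete example, one possible implementation of the single-site update using the Ga(α, γ) independence proposal of O'Neill & Becker (2001) is sketched below. It reuses the log_likelihood sketch from slide 17; for an independence proposal the acceptance ratio also involves the proposal density evaluated at the current and proposed infectious periods.

```python
import numpy as np
from scipy.stats import gamma as gamma_dist

def mh_update_infection_time(rng, j, I, R, beta0, gam, alpha, h):
    """Propose R_j - I_j* ~ Ga(alpha, gam) and accept/reject with the MH ratio."""
    d_new = rng.gamma(alpha, 1.0 / gam)                   # proposed infectious period
    I_new = I.copy()
    I_new[j] = R[j] - d_new

    log_accept = (log_likelihood(beta0, gam, alpha, I_new, R, h)
                  + gamma_dist.logpdf(R[j] - I[j], a=alpha, scale=1.0 / gam)
                  - log_likelihood(beta0, gam, alpha, I, R, h)
                  - gamma_dist.logpdf(d_new, a=alpha, scale=1.0 / gam))
    return I_new if np.log(rng.uniform()) < log_accept else I
```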

24 Remarks (Efficient) block updating of the infection times is very hard, especially when the size of the data (n_I, N) is large. A Gibbs step for the infection times is possible, but difficult to construct. The independence sampler R_i − I_i ~ Ga(α, γ) seems to be the most efficient of the proposals mentioned above.

25 Random Scan vs Deterministic Scan

26 Updating the infection times I. [C] RS: ACF plot for I against lag.

27 Updating the infection times I (cont.) [C] 10% DS: ACF plot for I against lag.

28 Updating the infection times I (cont.) [C] 50% DS: ACF plot for I against lag.

29 Updating the infection times I (cont.) [C] 100% DS: ACF plot for I against lag.

30 Results Ignoring the CPU time needed to run an algorithm, the 100% deterministic scan performs better than any other algorithm considered (i.e. any δ < 1). In general, increasing δ improves the mixing of the MCMC algorithm, but it is more computationally costly (slower in CPU time). It is of interest to quantify the optimal percentage of deterministic scan, taking into account both mixing and CPU time. The optimal δ will depend on the size of the epidemic (n_I, N). This is easy to explore computationally but hard to derive as a theoretical result.

31 Alternative target distributions

32 Alternative Target Distributions The full conditional posterior distributions of β_0 and γ are of standard form (Gamma) and the two parameters are conditionally independent, so either or both can be integrated out of the target distribution. We could therefore construct MCMC algorithms (similar to the previous ones) with any of the following target distributions: π(β_0, I | R) = ∫_γ π(β_0, γ, I | R) dγ; π(γ, I | R) = ∫_{β_0} π(β_0, γ, I | R) dβ_0; π(I | R) = ∫_{β_0} ∫_γ π(β_0, γ, I | R) dβ_0 dγ.

33 Which one is the best? It turns out that if δ is chosen to be relatively small, then fairly similar results (in terms of efficiency) are obtained. However, if we increase δ, and in particular choose a 100% DS, integrating both model parameters out (i.e. β_0 and γ) and taking π(I | R) as the target distribution seems to increase (slightly) the algorithm's efficiency compared with the standard target distribution π(I, β_0, γ | R).

34 Remarks When the parameter γ is integrated out, for example in π(β_0, I | R) or π(I | R), the previously mentioned independence sampler R_i − I_i ~ Ga(α, γ) needs to be modified. A fixed value could be assigned to γ (obtained via a pilot study). Otherwise, a pseudo-MLE of γ can be used: γ = α n_R / Σ_{i=1}^{n_R} (R_i − I_i^c), where I^c denotes the vector of the current infection times.
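
For concreteness, the pseudo-MLE plug-in is a one-liner (a sketch; the names are illustrative and follow the earlier conventions, with infinite entries of I marking never-infected individuals).

```python
import numpy as np

def gamma_pseudo_mle(alpha, I_current, R):
    """Pseudo-MLE of gamma given the currently imputed infection times (slide 34)."""
    infected = np.isfinite(I_current)
    return alpha * infected.sum() / (R[infected] - I_current[infected]).sum()
```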

35 Mixing of the MCMC Algorithms for different levels of variability of the infectious period

36 Model's Equations A Working Example β_ij = β_0 exp{ −ρ(i, j) }, R_i − I_i ~ Gamma(α, γ), N = 501 (1 initial infective), where ρ(i, j) denotes the distance between the two individuals. Table: simulated datasets D1, D2 and D3, with the true values of β, α and γ and the final sizes n_I = n_R.

37 A Working Example (cont.) Plot: densities f(x) of the infectious period distributions for datasets D1, D2 and D3.

38 Mixing of the MCMC Algorithm (α, γ) = (0.5, 0.25): ACF plot for γ (Dataset 1).

39 Mixing of the MCMC Algorithms (cont.) (α, γ) = (2, 1): ACF plot for γ (Dataset 2).

40 Mixing of the MCMC Algorithms (cont.) (α, γ) = (5, 2.5): ACF plot for γ (Dataset 3).

41 Reasons for Poor Mixing If we look carefully at the part of the likelihood which describes the infectious period of each individual, R_i − I_i ~ Ga(α, γ), we see that a priori γ and I are dependent, and this causes problems for the MCMC mixing: R_i − I_i ~ Ga(α, γ) for i = 1, ..., n_I implies Σ_{i=1}^{n_I} (R_i − I_i) ~ Ga(α n_I, γ).

42 Reasons for Poor Mixing (cont.) Thus, for large n_I or α, the parameter γ and the sum of the infectious periods Σ_{i=1}^{n_I} (R_i − I_i) are a priori heavily dependent. If these two were the parameters of interest, this a priori correlation would cause mixing problems for a two-state Gibbs sampler [see Amit (1991), Roberts and Sahu (1997)]. Things are more complicated in practice, since the MCMC schemes used so far involve deterministic-scan updates of each of the infection times I_i, i = 1, ..., n_I.

43 Correlation between Infection Times and γ Scatter plots of γ against I for Dataset 1, Dataset 2 and Dataset 3.

44 Aim: Construct more efficient algorithms

45 Bayesian Hierarchical Models In Bayesian hierarchical models, when the missing data contain much more information about the parameter of interest than the observed data, standard data-augmentation techniques perform poorly. In epidemic models, β_0 and γ are conditionally independent given the infection times I. Nevertheless, each of them is strongly dependent on I. Therefore we are interested in a parameterisation under which γ and I are a priori independent.

46 Non Centered Parameterisations (NCP) The non-centered methodology presented in Papaspiliopoulos et al. (2003) suggests that appropriate non-centered parameterisations outperform the existing centered algorithms over a range of examples. Their findings can be summarised as follows: when the observed (missing) data are much more informative about the parameter of interest than the missing (observed) data, it is better to choose a centered (non-centered) parameterisation.

47 NCP for Stochastic Epidemics Neal and Roberts (2005) have applied this methodology to the General Stochastic Epidemic model. We introduce a non-centered parameterisation: U_i = γ(R_i − I_i), i = 1, ..., n_I. It is easy to see that, since U_i ~ Ga(α, 1), i = 1, ..., n_I, the random variables U = (U_1, ..., U_{n_I})^T and γ are a priori independent. Change of variables: (I, β_0, γ, R) → (U, β_0, γ, R).

48 NC Algorithm's Implementation An outline of an implementation of a non-centered algorithm is the following: 1. Get a sample of (γ, I) via a centered algorithm; 2. Transform the I's to U's via U_i = γ(R_i − I_i); 3. Update γ using a Metropolis-Hastings step with target π(γ | U, β_0, R).
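
A sketch of one such non-centered γ update, using a multiplicative random walk proposal (one of the options mentioned on slide 51). It is only an illustration of the transform, update, back-transform idea: a flat prior on γ is assumed, the log_likelihood sketch from slide 17 is reused, and the change-of-variables Jacobian γ^{-n_I} and the multiplicative random-walk correction are included explicitly.

```python
import numpy as np

def nc_update_gamma(rng, gamma, I, R, beta0, alpha, h, sigma_g=0.1):
    """Non-centered update of gamma (slides 47-48) with U_i = gamma * (R_i - I_i)."""
    infected = np.isfinite(I)
    U = gamma * (R[infected] - I[infected])            # a priori Ga(alpha, 1), independent of gamma

    gamma_star = gamma * np.exp(sigma_g * rng.normal())
    I_star = I.copy()
    I_star[infected] = R[infected] - U / gamma_star    # back-transform under gamma*

    log_ratio = (log_likelihood(beta0, gamma_star, alpha, I_star, R, h)
                 - log_likelihood(beta0, gamma, alpha, I, R, h)
                 - infected.sum() * (np.log(gamma_star) - np.log(gamma))   # Jacobian gamma^{-n_I}
                 + (np.log(gamma_star) - np.log(gamma)))                   # multiplicative RW term
    if np.log(rng.uniform()) < log_ratio:
        return gamma_star, I_star
    return gamma, I
```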

49 The Existing Approach Neal and Roberts (2005) draw samples of (γ, I) via MCMC on π(γ, I | R) using the standard algorithm. They update γ via a Random Walk Metropolis step, i.e. γ* ~ N(γ, σ²_γ).

50 An Alternative Approach We propose to draw samples of (γ, I) via the most efficient centered algorithm, having as target distribution one of the following: π(β_0, I | R), π(γ, I | R), π(I | R). Note: a modification of the previously used proposal [R_i − I_i ~ Ga(α, γ)] is needed when γ is integrated out of the target distribution π(·).

51 Updating γ Apart from a multiplicative random walk, we also propose to use one of 3 different proposals. Independent Sampler I ('Pseudo-Gibbs'): q_1(γ) ∝ π(γ | I^c, R). In more detail: given the current values of the infection times I_i^c, i = 1, ..., n_I, propose a new value for γ by sampling from γ_new | I^c, R ~ Ga( λ_γ + α n_R, ν_γ + Σ_{i=1}^{n_R} (R_i − I_i^c) ).
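
A sketch of this pseudo-Gibbs draw (the hyperparameter names λ_γ, ν_γ follow slide 19; their default values here are illustrative). Because this is used as an independence proposal within Metropolis-Hastings, rather than as an exact Gibbs step for the non-centered target, the usual acceptance ratio involving this Gamma proposal density must still be applied.

```python
import numpy as np

def propose_gamma_pseudo_gibbs(rng, alpha, I_current, R, lam_gamma=1.0, nu_gamma=1.0):
    """Draw gamma* from the centered full conditional evaluated at the current I's (slide 51)."""
    infected = np.isfinite(I_current)
    n_R = infected.sum()
    D_sum = (R[infected] - I_current[infected]).sum()
    return rng.gamma(lam_gamma + alpha * n_R, 1.0 / (nu_gamma + D_sum))
```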

52 Updating γ (cont.) Independent Sampler II (Normal Approximation): q_1(γ) ∝ N(γ_MLE, ε σ²_MLE), for some ε > 1. In more detail: if the infection times were known, inference for γ would be straightforward, with γ_MLE = α n_R / Σ_{i=1}^{n_R} (R_i − I_i); the corresponding variance estimate σ²_MLE can be derived very easily; and ε is chosen so as to allow the MH algorithm to make big jumps.

53 Updating γ (cont.) Independent Sampler III (Adaptive Sampling): q_1(γ) ∝ Ga(a, b). In more detail: run the centered algorithm for an adequate number of iterations (pilot study); approximate the resulting (marginal) posterior with an appropriate well-known distribution; and estimate the parameters of that distribution (e.g. by the method of moments).

54 Partially Non Centered Parameterisations The set of infected individuals in the epidemic is partitioned into two groups, C and U. Each individual is assigned to one of the two groups (independently of the others) with probability μ. For those individuals in U, let U_i = γ(R_i − I_i), i ∈ U. We propose a change of variables from (I_C, I_U, β_0, γ, R) to (I_C, U_U, β_0, γ, R).

55 Results - Dataset D1: ACF plots against lag for the centered [C] and partially non-centered [PNC] algorithms.

56 Results - Dataset D2: ACF plots against lag for the centered [C] and partially non-centered [PNC] algorithms.

57 Results - Dataset D3: ACF plots against lag for the centered [C] and partially non-centered [PNC] algorithms.

58 Conclusions Although the implementation of the standard (centered) MCMC algorithms for SIR-type models is quite straightforward, their efficiency deteriorates as the number of infected individuals (n_I) increases. Non-centered parameterisations often offer much more robust algorithms, and there is relatively little extra cost in implementing them.

59 Modelling the 2001 UK Foot-and-Mouth Disease (A Bayesian Analysis)

60 Population of Farms in Cumbria

61 A Recap... The resulting dataset consists of N = 5378 (initially susceptible) farms in total. By the end of the outbreak, n_I = 1021 of them had been infected. The data set directly gives the cull date of each farm on which animals were culled, but the infection dates are not observed. Unfortunately, information on which farms were culled without their infection status being known (e.g. dangerous contacts) is unavailable to us.

62 Modelling the Infection Rate The Model: β_ij = β_0 × K(ρ(i, j); δ) [infection kernel] × ( ε (nc_i)^ζ + (ns_i)^ζ ) [farm i's infectivity] × ( ξ (nc_j)^ζ + (ns_j)^ζ ) [farm j's susceptibility], where nc and ns denote the numbers of cattle and sheep on a farm, ε and ξ represent the relative infectivity and susceptibility of cattle compared with sheep, the parameter ζ allows for a non-linear effect, and ρ(i, j) denotes a measure of the distance between farms i and j.

63 Infectious Kernel The environmental spread is modelled by a Cauchy kernel K(ρ(i, j); δ), associated with parameter δ: K(ρ(i, j); δ) = δ / (ρ²_ij + δ²). Other kernels, such as Exponential or Geometric, could also be used; the Cauchy kernel is chosen so as to allow for long-range infections.
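
A sketch of the resulting pairwise rate for the Cumbria model (slides 62-63). The argument names are illustrative: nc and ns are per-farm cattle and sheep counts, and rho is a distance function.

```python
def fmd_rate(i, j, beta0, delta, eps, xi, zeta, nc, ns, rho):
    """beta_ij = beta_0 x Cauchy kernel x infectivity of farm i x susceptibility of farm j."""
    kernel = delta / (rho(i, j) ** 2 + delta ** 2)          # Cauchy spatial kernel (slide 63)
    infectivity = eps * nc[i] ** zeta + ns[i] ** zeta       # farm i's infectivity (slide 62)
    susceptibility = xi * nc[j] ** zeta + ns[j] ** zeta     # farm j's susceptibility
    return beta0 * kernel * infectivity * susceptibility
```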

64 Infectious Kernel (cont.) The Euclidean distance is used as the measure of distance between farms. Other metrics, such as minimum walking distance, could be used; it is debatable which is the most appropriate (see Savill et al., 2006).

65 Infectious Period The infectious period is assumed to be Gamma distributed: R_i − I_i ~ Ga(α, γ). We also assume that α = 4. Note that the infection times I_i are unknown and will be imputed (estimated) within our framework.

66 Results - Kernel: plot of the spatial kernel K(i, j) against distance (in km).

67 Results - Infectivity: posterior density of ε.

68 Results - Susceptibility: posterior density of ξ.

69 Results - Non Linearity: posterior density of ζ.

70 Summary A Heterogeneously Mixing Stochastic Epidemic (HMSE) model was applied to describe the dynamics of the 2001 UK FMD outbreak in Cumbria. The effect of the spatial kernel is very small when the distance between two farms is greater than 3 km. Cattle were found to be both more infective and more susceptible than sheep. There is strong evidence that the effect of the number of animals on the infection rate is non-linear.

71 Work in Progress Incorporate time-dependent infection rates. Develop efficient NC parameterisations for more complex models (e.g. with latent periods). Provide a real-time risk assessment tool for a potential Avian Influenza (bird flu) outbreak in the UK poultry industry (joint work with Chris Jewell). Perform a comprehensive analysis of the 2001 UK FMD outbreak taking into account the total population of farms in the UK (extra information is needed).

72 Modelling Highly Pathogenic Avian Influenza (A Bayesian Approach) Joint work with Chris Jewell and Gareth O. Roberts, Lancaster University

73 Statistical Analysis A comprehensive statistical analysis must address: the dynamics of the infection (transmission and removal rates); imputation of unobserved data (infection times); and, if used during an epidemic, a method to impute occult infections.

74 The Model Compartmental diagram S → I → N → R, with the infectivity of a farm varying over time.

75 Infectivity and Time to Notification Functions Infectivity function: we assume a time-dependent transmission rate β_j(t) = β_0 + Σ_{i=1}^{n_I} β_ij h(t − I_i), where h(t) = e^{νt} / (μ + e^{νt}), μ > 0, ν > 0. Time to notification: the time between infection and notification, D_i = N_i − I_i, satisfies P(D > d) = exp{ −a (e^{bd} − 1) }, where a > 0, b > 0. [Note: parameters μ, ν, a and b are assumed known.]
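
These two functions in a short sketch. The negative sign inside the notification survival function is an assumption (made so that P(D > d) is decreasing in d, as a survival function must be); the parameter names follow the slide.

```python
import numpy as np

def infectivity_profile(t, mu, nu):
    """h(t) = exp(nu*t) / (mu + exp(nu*t)): infectivity rising towards 1 after infection."""
    return np.exp(nu * t) / (mu + np.exp(nu * t))

def notification_survival(d, a, b):
    """P(D > d) = exp(-a * (exp(b*d) - 1)) for the infection-to-notification time D."""
    return np.exp(-a * (np.exp(b * d) - 1.0))
```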

76 The Model (cont.) Transmission rate: β_ij = β_sp,j [ β_1 C_FM + β_2 C_SH + β_3 C_CP + β_4 e^{−δρ[i,j]} ] for i ∈ I, j ∈ S, and β_ij = β_sp,j β_5 e^{−δρ[i,j]} for i ∈ N, j ∈ S. Parameters: β_0, background spread (e.g. wild bird population); β_1, β_2, β_3, network; β_4, β_5, δ, spatial; β_sp,j, species susceptibility. C_FM, C_SH, C_CP denote the contact matrices and ρ[i, j] denotes the Euclidean distance between farms i and j.
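
A sketch of this piecewise rate (illustrative names). The negative exponent in the spatial term is an assumption (a distance-decaying kernel); C_FM, C_SH and C_CP are the contact matrices and beta_sp gives the per-farm species-susceptibility factor.

```python
import numpy as np

def ai_rate(i, j, notified_i, b, delta, C_FM, C_SH, C_CP, rho, beta_sp):
    """beta_ij for the avian influenza model (slide 76): network + spatial terms while
    farm i is infected but unnotified (i in I); a single spatial term once notified (i in N)."""
    b1, b2, b3, b4, b5 = b
    spatial = np.exp(-delta * rho[i, j])
    if not notified_i:                                   # i in I, j in S
        rate = b1 * C_FM[i, j] + b2 * C_SH[i, j] + b3 * C_CP[i, j] + b4 * spatial
    else:                                                # i in N, j in S
        rate = b5 * spatial
    return beta_sp[j] * rate
```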

77 Toolbox Prior information for θ: expert opinion; previous disease outbreak data. Likelihood: the probability of observing what we have observed. Diagram: prior and likelihood are combined via MCMC to give the posterior.

78 Markov Chain Monte Carlo algorithm Repeat the following steps: 1. Simulate the model parameters θ 2. Simulate the missing data (infection times) 3. Simulate occult infections NB. Simulation of occult infections changes the dimension of the parameter space.

79 Posteriors Rigorous probability statements for quantifying: the effectiveness of protection zones (β_5 / β_4); the relative importance of network and environmental spread; and the contribution of the network components (feed mills, slaughter houses, company contacts). Transmission rate (as on slide 76): β_ij = β_sp,j [ β_1 C_FM + β_2 C_SH + β_3 C_CP + β_4 e^{−δρ[i,j]} ] for i ∈ I, j ∈ S, and β_ij = β_sp,j β_5 e^{−δρ[i,j]} for i ∈ N, j ∈ S.

80 Risk Analysis Target Provide real-time risk analysis during a potential outbreak, taking into account uncertainty in our parameter estimates. Challenges Denote by R i, the expected number of farms each farm i might infect were it to be the initial infective in an otherwise susceptible population. How does R i for each single farm evolve during the outbreak? How does the risk to specific areas of the country change over the course of the epidemic? How does the parameter uncertainty change with the amount of available data?

81 Farm-specific R_0: R_i, the expected number of farms each farm i might infect were it to be the initial infective in an otherwise susceptible population. Figure: risk analyses at three stages of the epidemic: (a) Day 7, (b) Day 14, (c) End.

82 Risk Maps The probability of farms becoming infected given the current situation. Figure: risk analyses at three stages of the epidemic: (a) Day 7, (b) Day 14, (c) End.

83 Conclusions We have constructed a robust, flexible MCMC-based Bayesian approach for real-time parameter estimation. In conjunction with forward simulation, this provides a powerful risk assessment tool for use during a disease epidemic in the UK. Such an approach can easily be adapted to other model structures as well. Parallel computing is necessary when dealing with large datasets.
