Bayesian inference for stochastic multitype epidemics in structured populations via random graphs

Nikolaos Demiris
Medical Research Council Biostatistics Unit, Cambridge, UK

and Philip D. O'Neill [1]
University of Nottingham, UK

Summary. This paper is concerned with new methodology for statistical inference for final-outcome infectious disease data using certain structured-population stochastic epidemic models. A major obstacle to inference for such models is that the likelihood is both analytically and numerically intractable. The approach taken here is to impute missing information in the form of a random graph that describes the potential infectious contacts between individuals. This level of imputation overcomes various constraints of existing methodologies, and yields more detailed information about disease spread. The methods are illustrated with both real and test data.

Keywords: Bayesian inference, epidemics, Markov chain Monte Carlo methods, Metropolis-Hastings algorithm, random graphs, stochastic epidemic models

1 Introduction

This paper is concerned with the problem of inferring information about disease spread given data on the final outcome of an epidemic in a structured population. Before outlining our approach, we begin by briefly recalling relevant background material.

Stochastic epidemic models that incorporate structured populations have become a subject of considerable research activity in recent years. Examples include independent-household models (e.g. Longini and Koopman, 1982; Becker and Dietz, 1995; Becker and Hall, 1996), models with two levels of mixing (e.g. Ball et al., 1997; Ball and Lyne, 2001; Demiris and O'Neill, 2005), random network models (e.g. Andersson, 1999; Britton and O'Neill, 2002), and social cluster models (e.g. Schinazi, 2002).
The basic motivation for such work is that, in contrast to epidemic models that assume a homogeneously mixing population of individuals, most human populations contain inherent structure because individuals usually spend their time in various groups such as dwelling places, work places, childcare facilities etc.

[1] Address for correspondence: School of Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD, UK. pdo@maths.nott.ac.uk

Our focus here is on so-called two-level-mixing models, defined formally below. These models, introduced in Ball et al. (1997), describe a population partitioned into groups in which infectious contacts can occur both locally within a group, and globally between groups. The basic inference problem is then to estimate the local and global infection rates, given knowledge of the underlying structure, and data that indicate which individuals in the population ever became infected during an epidemic.

This problem is complicated because of the model dependence structures. Specifically, global infections are explicitly described in the model, and so the number infected in a given group is not independent of the numbers infected in other groups. This in turn means that the likelihood cannot be expressed as a simple product over group outcome, and in fact is intractable in any case of practical concern.

This problem can be partly overcome by simply assuming independence between households, which is a reasonable assumption in a large population, and moreover is asymptotically the case as the number of groups tends to infinity. In particular, it is then possible to approximate the two-level mixing model with a simpler independent-groups model possessing a tractable likelihood. In such models, local mixing occurs as before, but global mixing is replaced by the assumption that each individual independently avoids infection from outside its group with some fixed probability. Inference for these models is possible in a variety of ways, see e.g. Addy et al. (1991), Becker and Dietz (1995) and Li et al. (2002). The use of an independent-households model as an approximation for a two-level-mixing model underlies the statistical analyses in Ball et al. (1997), Britton and Becker (2000), Ball and Lyne (2004), and Demiris and O'Neill (2005).
An attractive aspect of our methods is that they dispense with the need for such approximation.

Another difficulty with performing inference for epidemics (and other models with threshold behaviour, e.g. branching and contact processes) arises due to the bimodal nature of realisations. Typically, either an epidemic dies out quickly, or else it infects a fraction of the population which, in a large population, is approximately Gaussian (see e.g. Andersson and Britton, 2000, Chapter 4). Many statistical analyses require the assumption that the epidemic has taken off, this being expressed by requiring that a threshold parameter $R$ exceeds unity (e.g. Becker, 1989, Chapter 8; Rida, 1991; Ball and Lyne, 2004; Demiris and O'Neill, 2005). Such an assumption (i) often leads to underestimation of the variability of model parameters or the threshold parameter, and (ii) is clearly not desirable when attempting to infer control strategies which require that $R < 1$. The methods that we present here are free from this restriction.

For intractable likelihood problems, such as that which concerns us, one solution is to augment the parameter space, adding missing data or other quantities that then yield a tractable likelihood. Although such methods are widely used, in the current context it is far from obvious what should be imputed. Demiris and O'Neill (2005) describe an approach based on imputation of the so-called final severity of the epidemic, which leads to an approximate analysis involving an independent-groups model, as mentioned above. The key idea in the present paper is instead to impute much more detailed information about the epidemic, namely the set of susceptible individuals that each infected individual would infect if no other infections were permitted. This information is conveniently described by a random digraph in which links correspond to potential infections.

Three points about our approach should be noted. First, it is apparently ambitious, since a great deal of extra information is imputed. However, as described later, it is typically the case that only relatively few digraphs are likely to be compatible with the data, and thus the imputation is practicable. Second, some problems involving temporal data, such as weekly case incidence counts, have been approached by imputing missing information in the form of infection pathways (Haydon et al., 2003; Wallinga and Teunis, 2004). Although superficially related to our approach, in fact there are fundamental differences, namely (i) our data are not temporal; (ii) we have to deal with an intractable likelihood; and (iii) we do not impute the infection pathway itself. Third, although we focus here on two-level-mixing models, in fact our approach has very wide applicability, as will be outlined later.

The paper is structured as follows. Sections 2 and 3 contain, respectively, the epidemic model of interest and the associated random digraph. Section 4 describes the data and augmented likelihood obtained when the imputed information is employed.
An MCMC algorithm is described in Section 5, and illustrated in Section 6, while Section 7 contains some concluding remarks.

2 Multitype epidemic models with two levels of mixing

In this section we describe the epidemic model of interest, and an associated threshold parameter that will be of importance in the sequel.

2.1 Multitype two-level mixing model

The following model of a continuous-time epidemic process is defined in Ball and Lyne (2001). Consider a closed population of $N$ individuals, labelled $1, \ldots, N$, that is partitioned into groups (e.g. households, farms) of varying sizes. Suppose that the population contains $m_j$ groups of size $j$, and let $m = \sum_{j \ge 1} m_j$ be the total number of groups. Thus $N = \sum_{j \ge 1} j m_j$. In addition, each individual in the population is assumed to be one of a possible $k$ types, these typically representing categorical covariates (e.g. age, vaccination status, previous infection history). For $i = 1, \ldots, k$, let $N_i$ denote the total number of individuals in the population of type $i$. For convenience, suppose that the possible types are labelled $1, \ldots, k$, and for $j = 1, \ldots, N$ denote by $\tau(j)$ the type of individual $j$.

Each individual in the population can, at any time $t \ge 0$, be in one of three states, namely susceptible, infective, or removed. A susceptible individual is healthy and may contract the disease in question. An infective individual has become infected, and moreover can transmit the disease to others. A removed individual is one who is no longer infectious, and plays no part in further disease spread. In practice this could occur either because of actual immunity (induced by antibodies), or by isolation following the appearance of symptoms. If a type $i$ individual, $j$ say, becomes infected, then they remain so for a random time $I(j)$ whose distribution is the same as some specified non-negative random variable $I_i$. The random variables describing the infectious periods $I(j)$, $j = 1, 2, \ldots$, of different individuals are assumed to be mutually independent.

The epidemic is initiated at time $t = 0$ by a (typically small) number of individuals becoming infectious.
During its infectious period, an individual of type $i$ makes infectious contacts with each type $j$ susceptible, independently of other individuals, at times given by the points of a Poisson process of rate $\lambda^G_{ij}/N_j$. In addition, and independently, the infectious individual also makes infectious contacts with each type $j$ susceptible in its own group according to a Poisson process of rate $\lambda^L_{ij}$. Once contacted by an infective, a susceptible individual immediately becomes infectious. At the end of its infectious period, an infective individual becomes removed, and plays no further part in the epidemic. The epidemic ceases as soon as there are no infectives present in the population.

Two points concerning realism should be noted. First, the model does not include a latent period, i.e. a time period between infection and infectiousness of an individual. However, as described below, the distribution of the final numbers infected is invariant to the inclusion of any sensible latent period, so the omission does not affect the final-outcome-data scenario under consideration here. Second, the stipulation that the global and local infection rates have different scalings is common practice in two-level mixing models. It corresponds to the assumption that an individual would, on average, make more local contacts if their group size increased, but would not make more global contacts if the entire population increased in size.

2.2 Threshold parameter

Threshold parameters, known in some contexts as basic reproduction numbers, are of fundamental importance both in stochastic epidemic theory (Andersson and Britton, 2000, p.6), and epidemiology (Farrington et al., 2001). Typically, for a stochastic epidemic model, there is a parameter $R$ such that epidemics in an infinite population of susceptibles are almost surely finite if and only if $R \le 1$. Such results essentially originate in branching process theory, in that the early stages of an epidemic are approximately identical to a suitable branching process. In practical terms, the goal of most disease control measures is to ensure that the value of $R$ is reduced to below unity.

A threshold parameter for the multitype two-level mixing model can be obtained by allowing the population to become large globally, i.e. by allowing the number of groups, $m$, to tend to infinity. The details are given in Ball and Lyne (2001), and are essentially as follows. Define the $k \times k$ matrix $M := (m_{ij})$, where $m_{ij}$ is the average number of global contacts to type $j$ individuals made by a group in which the first infected individual is of type $i$. The threshold parameter, $R$, is then defined as the maximal eigenvalue of $M$. It follows that, in a large population, epidemics are extremely unlikely to occur if $R \le 1$. Calculation of $R$ in practice involves computing $M$; explicit details of how to do this are given in Ball et al. (2004).

2.3 Distribution of final outcome

We shall be interested in the final outcome of the above model, i.e. the numbers of initially susceptible individuals of each type in each group who ever become infected during the epidemic.
As described in Ball and Lyne (2001), it is possible in principle to write down a triangular set of linear equations, the solution of which furnishes us with the required joint probability mass function. However, in practice this system of equations can be numerically intractable, even for small population sizes. Such problems are well-known in epidemic modelling (e.g. Andersson and Britton, 2000, p.18), and arise because the final outcome probabilities that are typically of interest are derived recursively using probabilities that are often very close to zero, and in particular which may be outside the range of normal machine accuracy. An extra complication is that for non-homogeneous population models, such as that of interest here, the number of equations themselves can be enormous.
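To make the numerical issue concrete, the following is a minimal sketch of a triangular system of this kind in the simplest setting only: a single-type, homogeneously mixing epidemic, for which the recursion is the standard one found in texts such as Andersson and Britton (2000). It is not the multitype two-level system of Ball and Lyne (2001). The function name `final_size_pmf`, its argument names, and the convention that the per-pair contact rate is `lam / n` (with `n` initial susceptibles and `m` initial infectives) are assumptions made for illustration.

```python
import math


def final_size_pmf(n, m, lam, phi):
    """Final-size distribution of a single-type homogeneous SIR epidemic.

    n   : number of initial susceptibles
    m   : number of initial infectives
    lam : infection rate; an infective contacts a given susceptible at rate lam / n
    phi : Laplace transform of the infectious period, phi(theta) = E[exp(-theta * I)]

    Solves the triangular system
        sum_{k=0}^{l} C(n-k, l-k) P_k / phi(lam (n-l) / n)^(k+m) = C(n, l),  l = 0..n,
    by forward substitution and returns [P_0, ..., P_n].
    """
    P = []
    for l in range(n + 1):
        q = phi(lam * (n - l) / n)
        partial = sum(math.comb(n - k, l - k) * P[k] / q ** (k + m) for k in range(l))
        # The coefficient of P_l is C(n-l, 0) / q^(l+m) = q^-(l+m).
        P.append((math.comb(n, l) - partial) * q ** (l + m))
    return P


# Constant infectious period of length one: phi(theta) = exp(-theta).
pmf = final_size_pmf(n=10, m=1, lam=1.5, phi=lambda theta: math.exp(-theta))
print(sum(pmf))  # should be close to 1 for a population this small
```

Each probability is obtained by subtracting and rescaling earlier terms, some of which are tiny, so rounding error can accumulate as the population grows. That is one way to see the instability described above, although this sketch makes no claim about where it sets in.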

In the present context, these numerical problems mean that the most natural likelihood, namely that obtained by calculating the probability of the observed data given a parameter set, is intractable. To overcome this, we shall adopt a form of data imputation using a random graph, to which we now turn our attention.

3 Random digraphs

We now describe a representation of the final outcome of the epidemic model in terms of an associated random graph. Such representations have been considered by a number of authors (e.g. Ludwig, 1975; Barbour and Mollison, 1990; Islam et al., 1996; Andersson and Britton, 2000, Chapter 7), although to our knowledge such graphs have never been directly used for the purposes of statistical inference.

Consider the following random directed graph (digraph) on $N$ vertices labelled $1, \ldots, N$. Vertex $j$ has type $\tau(j)$, where $1 \le \tau(j) \le k$, and there are $N_i$ vertices of type $i$ in total. Associate with each vertex $j$ of type $i$ a random lifetime $I(j)$ which is a realisation of a non-negative random variable $I_i$. The lifetimes of different vertices are assumed independent. The set of vertices is partitioned into groups, not necessarily of equal size. For $j = 1, \ldots, N$ and $l = 1, \ldots, k$, insert a global edge from vertex $j$ to each vertex of type $l$ with probability $1 - \exp(-I(j)\lambda^G_{\tau(j)l}/N_l)$, and insert a local edge to each vertex of type $l$ in the same group as $j$ with probability $1 - \exp(-I(j)\lambda^L_{\tau(j)l})$. The insertion of each possible edge is independent of the presence or absence of all other edges. Note that no edge can be drawn from a vertex to itself. Finally, say that a vertex $j$ is directionally connected to a vertex $l$ if and only if there exists a path from $j$ to $l$, and for each vertex $j$ define $C_j$ as the set of vertices to which $j$ is directionally connected. For later convenience we adopt the convention that $j \in C_j$.

The relationship between the random digraph and the epidemic model is as follows (see also Andersson and Britton, 2000, Chapter 7).
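Before that relationship is spelled out, the construction just defined can be sketched in code. This is an illustrative sketch, not the authors' implementation: the function names `simulate_digraph` and `connected_set`, the 0-based type labels, the list-of-sets edge representation, and the example rates and exponential lifetimes are all assumptions. Every type is assumed to be present in the population.

```python
import math
import random


def simulate_digraph(types, groups, lam_L, lam_G, sample_lifetime, rng):
    """Draw one realisation of the random digraph.

    types[v], groups[v] : type (0-based) and group label of vertex v
    lam_L, lam_G        : k x k nested lists of local and global infection rates
    sample_lifetime     : function (type, rng) -> non-negative lifetime I(v)
    Returns (local_out, global_out), each a list of sets of target vertices.
    """
    n_vertices = len(types)
    k = len(lam_G)
    n_of_type = [types.count(t) for t in range(k)]
    local_out = [set() for _ in range(n_vertices)]
    global_out = [set() for _ in range(n_vertices)]
    for j in range(n_vertices):
        life = sample_lifetime(types[j], rng)
        for v in range(n_vertices):
            if v == j:
                continue  # no edge from a vertex to itself
            t = types[v]
            # 1 - exp(-x) is computed as -expm1(-x) for accuracy when x is small.
            p_global = -math.expm1(-life * lam_G[types[j]][t] / n_of_type[t])
            if rng.random() < p_global:
                global_out[j].add(v)
            if groups[v] == groups[j]:
                p_local = -math.expm1(-life * lam_L[types[j]][t])
                if rng.random() < p_local:
                    local_out[j].add(v)
    return local_out, global_out


def connected_set(local_out, global_out, j):
    """C_j: vertices reachable from j along edges of either kind, including j."""
    seen = {j}
    stack = [j]
    while stack:
        u = stack.pop()
        for w in local_out[u] | global_out[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


rng = random.Random(1)
types = [0, 0, 1, 0, 1, 1]    # two types
groups = [0, 0, 0, 1, 1, 2]   # three groups, of sizes 3, 2 and 1
lam_L = [[0.8, 0.4], [0.4, 0.2]]
lam_G = [[0.6, 0.3], [0.3, 0.1]]
local_out, global_out = simulate_digraph(
    types, groups, lam_L, lam_G, lambda t, r: r.expovariate(1.0), rng)
print(sorted(connected_set(local_out, global_out, 0)))
```

Local and global edges are stored separately because the two kinds are counted separately later on, even though only their union matters for directional connectivity.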
Each vertex corresponds in the obvious way to an individual, and the lifetime random variables correspond to infectious periods. The edges emanating from vertices do not correspond directly to infections, although they can be thought of as corresponding to potential infections. For example, the probability of a local edge from individual $j$ to individual $l$ in the same group is $1 - \exp(-I(j)\lambda^L_{\tau(j)\tau(l)})$, which is the probability that $j$ ever infects $l$ in the epidemic model, provided that $j$ ever becomes infected and that $l$ remains susceptible until contacted by $j$. An equivalent view is that the edges describe all contacts between individuals, and that these only result in infection when the vertices concerned form an infective-susceptible pair. Without loss of generality, suppose that individuals $1, \ldots, a$ are initially infective in the epidemic model. Then the random set of individuals who are ultimately infected in the epidemic has the same distribution as the random set of vertices $\bigcup_{j=1}^{a} C_j$.

Note that the random digraph does not contain any temporal information about the epidemic, but instead is defined in terms of infection probabilities. In particular this means that inclusion of a latent period in the epidemic model, provided it is almost surely finite, makes no change to the digraph, and hence makes no change to the final outcome distribution. Similarly, the infectious period in the original model need not be continuous, but could be modelled as separate disjoint parts in real time (e.g. corresponding to daytime-only contact). Nor does the digraph explicitly represent the actual route of infections: a vertex corresponding to an individual who does become infected might have more than one potential infector in the graph. However, given a realisation of the digraph it is straightforward to calculate the probability of any possible infection pathway, if desired.

4 Data and augmented likelihood

4.1 Final outcome data

We consider data of the form $n = \{n(s_1, \ldots, s_k; i_1, \ldots, i_k)\}$, where $n(s_1, \ldots, s_k; i_1, \ldots, i_k)$ denotes the number of groups containing $s_j$ initially susceptible individuals of type $j$, of whom $i_j$ ever become infected, where $j = 1, \ldots, k$. Our focus is on Bayesian statistical inference for the two infection rate matrices $\Lambda^L := (\lambda^L_{ij})$ and $\Lambda^G := (\lambda^G_{ij})$, given $n$. By Bayes' Theorem, the posterior density of interest, $\pi(\Lambda^L, \Lambda^G \mid n)$, satisfies
\[
\pi(\Lambda^L, \Lambda^G \mid n) \propto \pi(n \mid \Lambda^L, \Lambda^G)\,\pi(\Lambda^L, \Lambda^G),
\]
where $\pi(n \mid \Lambda^L, \Lambda^G)$ denotes the likelihood and $\pi(\Lambda^L, \Lambda^G)$ denotes the joint prior density of $(\Lambda^L, \Lambda^G)$. However, as described in Section 2.3, the likelihood is analytically and numerically intractable in any case of interest, and so something extra is needed.

Before addressing this we make two observations. First, final outcome data contain no temporal information. Specifically, there is no information regarding the mean length of the infectious period.
Consequently, we fix the infectious period distribution in advance of any data analysis. This implicitly creates a time-scale, with respect to which the values of $\Lambda^G$ and $\Lambda^L$ (but not $R$) should be interpreted. Second, provided the data are not of single type, we would expect that the model parameters are not all identifiable. With sufficiently many different compositions of groups in the data, all of the $\Lambda^L$ parameters are identifiable. However, the only information available about global infection is, essentially, the numbers of each type infected ($k$ data points), which is insufficient for the $k^2$ parameters of $\Lambda^G$ (cf. Britton, 1998). Although in principle the MCMC algorithm we describe below can function without regard to this problem, in practice it is usually pragmatic to consider a reduced-parameter model that involves extra constraints on $\Lambda^G$. Examples of this are considered later.

4.2 Augmented likelihood

In order to surmount the difficulty of an intractable likelihood, we consider augmenting the parameter space by including a digraph describing potential infectious contacts. Note that it is only necessary to consider this digraph on vertices corresponding to individuals who ever become infected in the epidemic, according to the data. This is because (i) none of these individuals can have contacted any individual who escapes infection, so there is no need to impute such potential edges, and (ii) the data contain no information about potential (but never realised) edges from individuals escaping infection.

To be precise, suppose now that the total number of individuals in the population who ever become infected is $n$, labelled $1, \ldots, n$, and define $G$ as the digraph on these $n$ vertices. For $j = 1, \ldots, n$ let $I(j)$ denote the lifetime random variable corresponding to vertex $j$, distributed according to $I_{\tau(j)}$, and $I = (I(1), \ldots, I(n))$. As before, let $N$ denote the total number of individuals who are initially susceptible. It is necessary to assume that at least one individual is initially infective: we assume henceforth that there is exactly one individual, whose label is $\kappa$, although an arbitrary number of initial infectives is easily catered for. The augmented posterior density is
\[
\pi(\Lambda^L, \Lambda^G, G, I, \kappa \mid n) \propto \pi(n \mid \Lambda^L, \Lambda^G, G, I, \kappa)\,\pi(G \mid \Lambda^L, \Lambda^G, I, \kappa)\,\pi(I)\,\pi(\kappa)\,\pi(\Lambda^L, \Lambda^G), \tag{4.1}
\]
where the last three terms on the right-hand side are prior densities. Note that $\pi(I)$ is simply a product of the individual densities of the $I(j)$ terms, i.e. it is specified by the model assumptions. It remains to evaluate the other two right-hand-side terms in (4.1), which effectively correspond to the augmented likelihood.

The $\pi(n \mid \Lambda^L, \Lambda^G, G, I, \kappa)$ term is the probability of there being no edges from the $n$ vertices in $G$ to the remaining $N - n$ outside $G$, provided that $\kappa$ is directionally connected to every other vertex in $G$, i.e. $C_\kappa = \{1, \ldots, n\}$. If the latter does not hold then $\pi(n \mid \Lambda^L, \Lambda^G, G, I, \kappa) = 0$, since $G$ is then incompatible with the observed data. The term $\pi(G \mid \Lambda^L, \Lambda^G, I, \kappa)$ is simply the probability of the edges in $G$ and is straightforward to write down.

For $i = 1, \ldots, n$ and $j = 1, \ldots, k$, define the following quantities. Let $\nu^L_{ij}$ denote the number of local edges from vertex $i$ to type $j$ vertices, and define $\nu^G_{ij}$ similarly for global edges. Let $N^L_{ij}$ denote the number of individuals in $i$'s group of type $j$. Note that $N^L_{ij}$ includes individuals who are never infected during the epidemic, and will include individual $i$ if $\tau(i) = j$. Finally, define $\delta(\kappa, G; n) = 1_{\{C_\kappa = \{1, \ldots, n\}\}}$ and $\epsilon_{ij} = 1_{\{\tau(i) = j\}}$, where $1_A$ denotes the indicator function of the set $A$. Thus $\delta(\kappa, G; n)$ is simply the indicator function of the event that, given the initial infective $\kappa$, $G$ is compatible with the observed data $n$. Then
\[
\begin{aligned}
L(\Lambda^L, \Lambda^G, G, I, \kappa) &:= \pi(n \mid \Lambda^L, \Lambda^G, G, I, \kappa)\,\pi(G \mid \Lambda^L, \Lambda^G, I, \kappa) \\
&= \delta(\kappa, G; n) \prod_{i=1}^{n} \prod_{j=1}^{k} \left\{1 - \exp\left(-I(i)\lambda^L_{\tau(i)j}\right)\right\}^{\nu^L_{ij}} \exp\left(-I(i)\lambda^L_{\tau(i)j}\left(N^L_{ij} - \epsilon_{ij} - \nu^L_{ij}\right)\right) \\
&\qquad \times \left\{1 - \exp\left(-I(i)\lambda^G_{\tau(i)j}/N_j\right)\right\}^{\nu^G_{ij}} \exp\left(-I(i)\lambda^G_{\tau(i)j}\left(N_j - \epsilon_{ij} - \nu^G_{ij}\right)/N_j\right).
\end{aligned} \tag{4.2}
\]
Computation of (4.2) is straightforward in practice, the only minor challenge being the $\delta(\kappa, G; n)$ term.

Note that in the above formulation, the lifetimes $I(i)$, $i = 1, \ldots, n$, are included as extra model parameters. It is possible to integrate these out of (4.2) by multiplying out the $\{1 - \exp(\cdot)\}$ terms and taking expectations, exploiting the independence of the $I(i)$ terms. However, the resulting expressions consist of alternating sums and possess poor numerical stability, so in most scenarios it is expeditious to retain the lifetimes as parameters.

5 Markov chain Monte Carlo algorithm

In order to obtain samples from the posterior density defined at (4.1), we now define a Metropolis-Hastings algorithm (see e.g. Gilks et al., 1996) in which the parameters are updated in blocks in the following manner. Updates for parameters other than the digraph $G$ are largely routine so we only give brief details. It is assumed that the initial configuration of the parameters has positive probability, and in particular that $\delta(\kappa, G; n) = 1$.
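Every update in this section requires the augmented likelihood (4.2), which is most safely evaluated on the log scale. The following is a minimal sketch of that evaluation, assuming 0-based type labels, nested lists for all arrays, and a compatibility flag computed elsewhere; the function and argument names are hypothetical, and every type is assumed to be present in the population.

```python
import math


def log_aug_likelihood(tau, life, lam_L, lam_G, nu_L, nu_G, N_L, N_type, compatible):
    """Logarithm of the augmented likelihood (4.2).

    tau[i]       : type (0-based) of infected vertex i
    life[i]      : lifetime I(i)
    lam_L, lam_G : k x k nested lists of infection rates
    nu_L[i][j]   : number of local edges from i to type-j vertices (nu_G: global)
    N_L[i][j]    : number of type-j individuals in i's group, infected or not
    N_type[j]    : total number of type-j individuals
    compatible   : value of delta(kappa, G; n), supplied by the caller
    """
    if not compatible:
        return -math.inf
    total = 0.0
    for i in range(len(tau)):
        for j in range(len(N_type)):
            own = 1 if tau[i] == j else 0  # indicator that i itself has type j
            terms = (
                (life[i] * lam_L[tau[i]][j], nu_L[i][j], N_L[i][j] - own),
                (life[i] * lam_G[tau[i]][j] / N_type[j], nu_G[i][j], N_type[j] - own),
            )
            for x, nu, targets in terms:
                if nu > 0:
                    if x <= 0:
                        return -math.inf  # an edge that has probability zero
                    total += nu * math.log(-math.expm1(-x))  # log{1 - exp(-x)}
                total -= x * (targets - nu)  # absent edges to possible targets
    return total


# One type; two infected individuals sharing a group of size two, and one
# further individual elsewhere who is never infected (N_type = 3).
value = log_aug_likelihood(
    tau=[0, 0], life=[1.0, 1.0], lam_L=[[0.5]], lam_G=[[0.3]],
    nu_L=[[1], [0]], nu_G=[[0], [0]], N_L=[[2], [2]], N_type=[3], compatible=True)
print(value)
```

Working with logarithms avoids underflow when the product in (4.2) runs over many vertices, and acceptance ratios then become differences of two such values.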
The infection rate parameters $\lambda^L_{ij}$, $\lambda^G_{ij}$, $1 \le i, j \le k$, are each updated individually using Gaussian proposal distributions centred on the current value, with either fixed variances or an adaptive scheme in which the variances can change as the algorithm proceeds. The latter is especially useful in those cases where the data are relatively uninformative about a particular infection rate.

The lifetime random variables $I(i)$, $i = 1, \ldots, n$, can be updated naturally by using the prior distributions as proposals. The proposed new lifetimes, $I'$ say, are accepted with probability
\[
\frac{L(\Lambda^L, \Lambda^G, G, I', \kappa)}{L(\Lambda^L, \Lambda^G, G, I, \kappa)} \wedge 1.
\]
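The lifetime update can be sketched as follows, working with log-likelihoods. The function name `update_lifetime_block`, the callables `sample_prior` and `log_lik`, and the toy log-likelihood in the usage lines are hypothetical stand-ins; in a real sampler `log_lik` would evaluate the logarithm of (4.2) with everything except the lifetimes held fixed.

```python
import math
import random


def update_lifetime_block(life, block, sample_prior, log_lik, rng):
    """One Metropolis-Hastings update of the lifetimes indexed by `block`.

    New values are drawn from the prior, so the prior and proposal densities
    cancel and the acceptance probability is min(1, L(proposed) / L(current)).
    The current state is assumed to have positive likelihood.
    Returns (lifetimes, accepted).
    """
    proposed = list(life)
    for i in block:
        proposed[i] = sample_prior(i, rng)
    log_ratio = log_lik(proposed) - log_lik(life)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposed, True
    return life, False


rng = random.Random(0)
toy_log_lik = lambda v: -0.2 * sum(v)  # stand-in for the log of (4.2)
life = [1.0, 1.0, 1.0, 1.0]
life, accepted = update_lifetime_block(
    life, [0, 1], lambda i, r: r.expovariate(1.0), toy_log_lik, rng)
print(life, accepted)
```

Passing a short `block` rather than every index keeps each proposal close to the current state, which is the usual reason for updating a few lifetimes at a time.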

Note that directional connectivity is unaffected by this update, which simplifies the computation. It is usually best to update the lifetimes in blocks rather than all at once, to prevent low acceptance rates.

The label of the initial infective, $\kappa$, can be updated using a Gibbs step. Specifically, let $A(G) = \{1 \le j \le n : \delta(j, G; n) = 1\}$ denote the possible values of $\kappa$, for a given $G$. Under the assumptions of the model each of these values is equally likely, conditional upon the current $G$. Thus $\kappa$ has full conditional distribution given by
\[
\pi(\kappa \mid G, n) = \frac{\pi(\kappa)\,\delta(\kappa, G; n)}{\sum_{\kappa' \in A(G)} \pi(\kappa')}.
\]
Alternative updating methods for $\kappa$, that also ensure $G$ remains directionally connected, include proposing a new value $\kappa'$ from among those vertices to which $\kappa$ is connected, and then swapping the direction of the edge from $\kappa$ to $\kappa'$. By necessity, such moves involve some updating of $G$.

Updating $G$ can be achieved by simply adding and deleting edges at random, as described below. We use the terminology non-edge to refer to the absence of an edge, i.e. there is a non-edge from vertex $i$ to vertex $j$ if and only if there is not an edge from $i$ to $j$. For $i = 1, \ldots, n$ and $j = 1, \ldots, k$, denote by $n^L_{ij}$ the number of individuals in $i$'s group of type $j$ who ever become infected. Thus $n^L_{ij}$ is the number of vertices in $G$ in $i$'s group of type $j$, and $\sum_{j=1}^{k} (n^L_{ij} - \epsilon_{ij})$, with $\epsilon_{ij} = 1_{\{\tau(i) = j\}}$, is the maximum possible number of local edges emanating from $i$.

First, choose to try to add an edge with probability $p_a$, otherwise try to delete an edge. In both cases, choose to act on the local edges with probability $p_L$, otherwise act on the global edges. For local addition, first select, uniformly at random, an edge to add from among the entire set of local non-edges. If this set is empty, then stop at this point. Otherwise, suppose the edge is from vertex $s$ to vertex $t$, these vertices being in the same group.
To calculate the acceptance probability, note that the likelihood ratio of proposed to existing graph is simply $\{1 - \exp(-I(s)\lambda^L_{\tau(s)\tau(t)})\}/\exp(-I(s)\lambda^L_{\tau(s)\tau(t)})$. Combining this with the proposal mechanism, and that for deletion described below, yields the acceptance probability
\[
\frac{\left\{\exp\left(I(s)\lambda^L_{\tau(s)\tau(t)}\right) - 1\right\}(1 - p_a)\sum_{i=1}^{n}\sum_{j=1}^{k}\left(n^L_{ij} - \epsilon_{ij} - \nu^L_{ij}\right)}{p_a\left(1 + \sum_{i=1}^{n}\sum_{j=1}^{k}\nu^L_{ij}\right)} \wedge 1.
\]
Note that there is no need to check directional connectivity. The addition of global edges occurs in the same way, mutatis mutandis.

For local deletion, an edge is picked at random from among the $\sum_{i=1}^{n}\sum_{j=1}^{k}\nu^L_{ij}$ available, and then deleted with probability
\[
\frac{\left\{\exp\left(I(s)\lambda^L_{\tau(s)\tau(t)}\right) - 1\right\}^{-1}\,\delta(\kappa, G'; n)\,p_a\sum_{i=1}^{n}\sum_{j=1}^{k}\nu^L_{ij}}{(1 - p_a)\left(1 + \sum_{i=1}^{n}\sum_{j=1}^{k}\left(n^L_{ij} - \epsilon_{ij} - \nu^L_{ij}\right)\right)} \wedge 1,
\]
where the proposed deletion is an edge from vertex $s$ to vertex $t$ and $G'$ denotes the graph with that edge removed. Note that evaluation of the acceptance probability requires checking directional connectivity. Global deletion is similar.

6 Application to data

We now consider the performance of our methods in a variety of examples. Our aim is not to perform thorough data analyses, but to use suitable data sets to illustrate the feasibility of our approach, and its scope for providing new kinds of information not available via existing methods. All results are based on samples of size 10,000 from MCMC chain output. Algorithm convergence was checked by inspection of the resulting chain output. Unless otherwise indicated, parameters on $(0, \infty)$ are assigned exponential prior densities with rate [value missing in source]. In all cases such priors allow the data to dominate the posterior distribution.

If a uniform prior mass function for $\kappa$ is assumed then, in the examples below, inference for $\Lambda^L$ and $\Lambda^G$ was found to be indistinguishable from the case where $\kappa$ is simply fixed. Indeed for the single-type case, it can be shown that $\kappa$ has no bearing on inference for the local and global infection rate parameters. This is essentially a consequence of the fact that the probability that a given individual $i$ infects a given individual $j$ is the same as the probability that $j$ infects $i$. For the multitype case this is no longer true, but $\kappa$ only becomes important in data on small populations.

6.1 Example 1: Single type, homogeneous mixing model

We start by exploring the performance of the algorithm in the special case of a homogeneously mixing single-type population. In the terminology of the general model this corresponds to a single type ($k = 1$), all groups being of size one, and $\Lambda^L$ being redundant since no local infection occurs. Thus there is just one infection rate parameter, $\lambda^G_{11} = \lambda$, say.
In some sense this setting provides the most challenging inverse problem, since the data only comprise two numbers (initial number susceptible $N$, final number ever infected $n$), from which we shall try to infer information about both the infection rate and, implicitly, the random digraph. Note that $R$, usually called $R_0$ in this setting, equals $\lambda E[I]$, where $I$ is the infectious period.

Suppose that $N = 100$, with one initial infective, and consider the three data points $n = 25$, $50$ and $75$. We also consider three possible infectious period distributions, each with mean one, namely constant, exponential, and Gamma with variance 10. It should be noted that the numerical problems outlined in Section 2.3 apply in these cases, so that direct likelihood calculation via the standard triangular equations for final size would typically exceed machine accuracy. Our focus in the following is on $R_0$; the next example illustrates how information about $G$ can be easily obtained.

Some posterior summary statistics are given in Table 1, and Figure 1 shows the posterior density estimates of $R_0$ for the case $n = 25$, under the three infectious period distributions. The algorithm ran successfully, with typical run times of a few hours, and with no apparent difficulties in terms of mixing. Various starting values for both $\lambda$ and $G$ were explored and in all cases the Markov chain quickly moved to a high posterior density region. In particular, this means that more exotic updates for $G$ do not appear necessary in this case.

Table 1 near here

Figure 1 near here

We highlight three aspects of our results. First, as Figure 1 illustrates, in all cases the posterior density of $R_0$ was found to be roughly symmetric, but with a discernible right tail. This tail became more pronounced as the variance of $I$ increased, although the modal value for a given $n$ was found to be very similar as $I$ varied. Second, the key effect of the different infectious period distributions was to alter the posterior variance of $R_0$, which increased with the variance of $I$. Such findings are intuitively reasonable. Note that the posterior mean also increases with the variance of $I$, although this is essentially a consequence of the increased skewness. Third, the posterior probability that $R_0 < 1$ was found to be approximately 0.25 for $n = 25$, regardless of the distribution, and between 0.01 and 0.06 for $n = 50$, increasing with the variance of $I$. We mention this to emphasize the fact that our methods do not require any assumption that $R_0 > 1$, and moreover they provide information about how reliable such an assumption would be.
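For this homogeneous single-type setting the whole sampler is small enough to sketch in one piece. The code below is an illustration under stated assumptions, not the authors' implementation: the infectious period is constant with length one (so $R_0 = \lambda$ and no lifetime updates are needed), $\kappa$ is fixed at vertex 0, `pop` counts every individual including the initial infective, `n` counts every individual ever infected including $\kappa$, the exponential prior rate `prior_rate` is a hypothetical value, and one rate update is paired with one edge update per iteration. All names are hypothetical, and `n` must be at least 2.

```python
import math
import random


def all_reachable(out, kappa):
    """True if every vertex can be reached from kappa along directed edges."""
    seen = {kappa}
    stack = [kappa]
    while stack:
        v = stack.pop()
        for w in out[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(out)


def log_lik(lam, n_edges, n, pop):
    """Log of (4.2) for k = 1, groups of size one, I = 1 and a compatible G."""
    x = lam / pop
    return n_edges * math.log(-math.expm1(-x)) - x * (n * (pop - 1) - n_edges)


def run_chain(n, pop, iters, seed=0, p_add=0.5, prior_rate=0.001, step=0.3):
    rng = random.Random(seed)
    kappa = 0
    out = [set() for _ in range(n)]
    edges = []
    for v in range(n - 1):  # initial graph: the path 0 -> 1 -> ... -> n-1
        out[v].add(v + 1)
        edges.append((v, v + 1))
    lam = 1.0
    max_edges = n * (n - 1)
    lam_samples, edge_samples = [], []
    for _ in range(iters):
        # Gaussian random-walk update of lam under an exponential prior.
        prop = lam + rng.gauss(0.0, step)
        if prop > 0:
            log_r = (log_lik(prop, len(edges), n, pop) - prior_rate * prop
                     - log_lik(lam, len(edges), n, pop) + prior_rate * lam)
            if rng.random() < math.exp(min(0.0, log_r)):
                lam = prop
        # Add or delete one (global) edge.
        odds = math.expm1(lam / pop)  # exp(lam / pop) - 1
        e = len(edges)
        if rng.random() < p_add:
            if e < max_edges:
                while True:  # uniform draw from the set of non-edges
                    s, t = rng.randrange(n), rng.randrange(n)
                    if s != t and t not in out[s]:
                        break
                acc = odds * (1 - p_add) * (max_edges - e) / (p_add * (1 + e))
                if rng.random() < min(1.0, acc):
                    out[s].add(t)
                    edges.append((s, t))
        else:
            idx = rng.randrange(e)  # e >= n - 1 >= 1 while G stays compatible
            s, t = edges[idx]
            out[s].discard(t)
            acc = p_add * e / (odds * (1 - p_add) * (1 + max_edges - e))
            if all_reachable(out, kappa) and rng.random() < min(1.0, acc):
                edges[idx] = edges[-1]
                edges.pop()
            else:
                out[s].add(t)  # restore the edge
        lam_samples.append(lam)
        edge_samples.append(len(edges))
    return lam_samples, edge_samples, out


lam_samples, edge_samples, out = run_chain(n=25, pop=100, iters=20000, seed=0)
burn = 5000
print(sum(lam_samples[burn:]) / len(lam_samples[burn:]))  # rough posterior mean of R0
```

In this special case the likelihood depends on the graph only through its number of edges and its compatibility with the data, which is why `log_lik` takes an edge count rather than the graph itself. Connectivity is checked only on deletion, since adding an edge cannot disconnect any vertex from $\kappa$.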
6.2 Example 2: Single type, two-level-mixing model

We now turn to analyses based on two-level mixing models. In the sequel we consider data sets taken from detailed studies on outbreaks of influenza A(H3N2) in Tecumseh, Michigan. The data are in the form that we require in that they consist of final numbers infected in a population that has been divided into households. Many aspects of these data have been previously explored, see for example Monto et al. (1985), Longini et al. (1988), Addy et al. (1991), and references therein. More recent analyses based on two-level mixing models, all of which use approximations of one kind or another, can be found in Ball et al. (1997), Britton and Becker (2000), Ball and Lyne (2004), and Demiris and O'Neill (2005).

Table 2 near here

We begin with a single-type analysis for an outbreak in [year missing in source]. The data are given in Table 2 and show the numbers infected in households containing up to seven initially susceptible individuals. Previous analyses of the Tecumseh data have often only used households up to size five, this being due to numerical problems of the kind described in Section 2.3 above (e.g. Longini et al., 1988; Addy et al., 1991). Our methods have no such restriction. We define $\lambda_L = \lambda^L_{11}$ and $\lambda_G = \lambda^G_{11}$. In keeping with previous studies, we assume that the infectious periods are distributed according to a Gamma random variable with shape parameter 2 and rate parameter $1/2.05$, i.e. with mean 4.1 days.

Table 3 gives posterior summary information for $\lambda_L$, $\lambda_G$, the threshold parameter $R$, and the total numbers of local and global edges in $G$, denoted $\eta_L = \sum_{i,j} \nu^L_{ij}$ and $\eta_G = \sum_{i,j} \nu^G_{ij}$, respectively. All of the marginal posterior density estimates of these parameters were unimodal and approximately symmetric. Estimation for $\lambda_L$ and $\lambda_G$ is reasonably precise in that the posterior credible intervals are relatively small. The 95% posterior credible interval for $R$ includes unity, and moreover $P(R \le 1 \mid n) \approx 0.085$, highlighting the fact that assuming $R > 1$ is not entirely satisfactory for these data.

Table 3 near here

The results for $\eta_L$ and $\eta_G$ can be interpreted in a variety of ways. First, they provide summary information about $G$, and in particular the standard deviation and credible intervals give some indication of how accurately we can infer $G$ from the data.
For example, since there are 82 infected households and 128 infected individuals, it follows that $81 \le \eta_G \le$ […]. We might expect $\eta_G$ to be concentrated towards the lower end of this range, since larger values would be incompatible with the large number of individuals avoiding infection, but even so the posterior information reveals that $\eta_G$ can be inferred with considerable accuracy. Moreover, it would appear that the graph is fairly tree-like in structure, since $\eta_G + \eta_L$ is typically not far in excess of the total number of infected individuals.

The $\eta_L$ and $\eta_G$ parameters are also informative about the actual typical number of potential infections, and thus they give an alternative to $R$ itself. For example, dividing both by the number of vertices in $G$, 128, we find that the mean numbers of local and global links emanating from a vertex are 0.39 and 0.77, respectively. The sum of these is close to the posterior mean of $R$, while the individual values give some idea of the relative importance of local and global infections during the outbreak. More sophisticated variants are possible, such as considering the ratio of actual to potential edges realised, or using the local structure to obtain more detailed descriptions of local spread (the point being that the distribution of the number of local contacts depends on an individual's group size).

Thus far we have assumed that the infectious periods have a Gamma distribution with mean 4.1 days. Although the algorithm computation times are reasonable, typically several hours, these can be reduced considerably (e.g. by a factor of 3 to 5) by using a simpler model in which the infectious periods have fixed length 4.1 days. This change does not make much difference to the results: for example, the posterior means of $\lambda_L$ and $\lambda_G$ are similar, but the posterior variances are slightly smaller than before. Such similarities for these models are not new, see e.g. Ball et al. (1997), O'Neill et al. (2000), but the point here is that the simpler model can be analysed rather more quickly.

6.3 Example 3: Two-type, two-level-mixing

We now consider a two-type data set described in Longini et al. (1988). These data, also from the Tecumseh study, divide the at-risk population into two strata according to antibody titre level, the strata being termed low or higher. The data are actually combined from two separate influenza outbreaks, but this is immaterial for the purposes of illustrating our methods. The data set is given in Table 4 of Longini et al. (1988), and comprises 567 households containing between one and five initially susceptible individuals. In thirteen of the households the exact outcome is not presented, and for simplicity we exclude these from our present analysis. Of those households included, $T_1 = 163$ out of $N_1 = 742$ low-titre (type 1) individuals became infected, compared with $T_2 = 53$ out of $N_2 = 562$ higher-titre (type 2) individuals.

The analysis described in Longini et al. (1988) employs an independent-households model with fixed-length infectious periods, in which individuals can differ in their susceptibility, but not infectivity. For the two-level mixing model, a natural way of making the latter assumption is to set $\lambda^L_{ij} = \lambda^L_{lj}$ and $\lambda^G_{ij} = \lambda^G_{lj}$ for $j, l = 1, 2$. We refer to the resulting model as the LGS (Local-Global-Susceptibility) model. Since our methods do not require any structural restrictions on $\Lambda^L$, we also consider for illustration the model with the sole constraint that $\lambda^G_{ij} = \lambda^G_{lj}$ for $j, l = 1, 2$, and refer to this as the GS (Global-Susceptibility) model. In keeping with Example 2, and Longini et al. (1988), the infectious periods were all set to be of fixed length 4.1 days.

Starting with the LGS model, we first indicate that results comparable to those presented in Longini et al. (1988) can be easily obtained via our methods. For example, the independent-households model used in that paper is defined in terms of parameters $Q_i$ and $B_i$, respectively representing the probability that a type $i$ individual avoids infection from a single same-household infective, and from the community at large. These parameters are of direct interest because they are used to define the so-called secondary attack rate (viz., $(1 - Q_i) \times 100\%$) and the community probability of infection, $1 - B_i$. In our model we have $Q_i = \exp(-4.1\,\lambda^L_{1i})$, and a simple approximation to $B_i$ is $\exp[-4.1\,\lambda^G_{1i}(T_1 N_1^{-1} + T_2 N_2^{-1})]$. The maximum likelihood estimates of $Q_i$ and $B_i$, $i = 1, 2$, in Longini et al. (1988) were found to be very similar to our corresponding posterior mean and median values.

Table 4 near here

Turning now to a comparison of the GS and LGS models, Table 4 gives some posterior summary statistics. As expected, the LGS model in some sense averages out differences in $\lambda^L_{1j}$ and $\lambda^L_{2j}$ found in the GS model. In the GS model, for $j = 1, 2$ the posterior mean of $\lambda^L_{1j}$ is somewhat larger than that of $\lambda^L_{2j}$, suggesting that higher-titre individuals are less infectious. However, the posterior standard deviation of $\lambda^L_{21}$ is relatively large, so any difference between $\lambda^L_{21}$ and $\lambda^L_{11}$ is not clear-cut. The posterior uncertainty arises because estimation of $\lambda^L_{21}$ requires infected higher-titre individuals who then infect low-titre individuals, but the data contain only a few households in which individuals of both types became infected. For $\Lambda^G$, the two models give roughly similar posterior distributions, the differences essentially arising as compensation for the corresponding $\Lambda^L$ differences.
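
The correspondence between the transmission rates and the parameters $Q_i$ and $B_i$ described earlier in this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the rates passed in are placeholders rather than values from Table 4, and the form of the approximation to $B_i$ follows the expression given in the text. Only the 4.1-day infectious period and the counts $T_1, N_1, T_2, N_2$ are taken from the paper.

```python
import math

# Fixed-length infectious period (days) used in Example 3.
INFECTIOUS_PERIOD_DAYS = 4.1

# Final-outcome counts quoted in the text for the two titre strata.
T = (163, 53)    # numbers infected: type 1 (low titre), type 2 (higher titre)
N = (742, 562)   # numbers initially susceptible, same ordering


def escape_local(lambda_local):
    """Q_i: probability of avoiding infection from one same-household infective."""
    return math.exp(-INFECTIOUS_PERIOD_DAYS * lambda_local)


def secondary_attack_rate(lambda_local):
    """Secondary attack rate, (1 - Q_i) x 100%."""
    return (1.0 - escape_local(lambda_local)) * 100.0


def escape_global(lambda_global):
    """Approximate B_i: probability of avoiding infection from the community.

    Uses the approximation exp[-4.1 * lambda_G * (T_1/N_1 + T_2/N_2)].
    """
    attack = sum(t / n for t, n in zip(T, N))
    return math.exp(-INFECTIOUS_PERIOD_DAYS * lambda_global * attack)
```

Applying these functions to each draw of an MCMC sample of the rates, rather than to a single point estimate, would give posterior summaries of $Q_i$ and $B_i$ that can be set against the maximum likelihood estimates of Longini et al. (1988).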
The threshold parameter $R$ was found to have posterior mean 1.21 for the LGS model and 1.23 for the GS model, and in both cases the posterior standard deviation was […]. Finally, although the data sets are certainly not strictly comparable, it is notable that the single-type model of Example 2 attributes more of the epidemic spread to global infections than the present example, insofar as the posterior mean of $\lambda_G$ exceeds all of the posterior $\Lambda^G$ entries. In particular, the extra detail of the multitype data appears to indicate that local spread between low-titre individuals is of key importance, suggesting that control measures should be targeted towards this.

7 Discussion

In this paper we have described new methodology for performing Bayesian inference for two-level mixing epidemic models, given final outcome data. Implementation, although not a trivial matter, is not especially complicated. The methods work well in practice, although for data sets with very large numbers of infectives (thousands as opposed to hundreds) the algorithm takes days rather than hours to run. However, data on such large outbreaks are not common and so this is not a serious restriction.

The methods have several appealing features. First, they generate information regarding the actual propagation of the epidemic via the random digraph. Although not explicitly temporal, this information could be loosely regarded as such, for example by supposing that the real-time delay between generations of infection is roughly constant. The random digraph can then be used to infer information about the duration of the epidemic. Second, the methods do not require approximations, such as those introducing independence between groups, or the assumption that the epidemic is above threshold. Third, the methods are clearly very flexible, and have scope for application to other structured-population epidemic models. Examples of the latter include spatial models, models with three or more levels of mixing, and models with overlapping subgroups (for example, with households, schools, and workplaces all explicitly described). Although the form of the structure is required to be known, in practice different plausible scenarios could be explored if the exact contact structure were not available.

An extension of practical interest is to the case where the observed data form only a fraction of the total population. The main impact of this setting, compared with the one we have studied, is on the posterior variances of the quantities of interest; estimates of posterior mean behaviour will be largely unaffected. Our methods can still be applied, but now require additional assumptions regarding the (unknown) number of individuals infected from outside the observed fraction. Each such infective individual would then give rise to a connected digraph of its own on some subset of susceptibles within the observed fraction.
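
The requirement implied here, that every infected individual be reachable along directed edges from some initial or externally infected individual, can be checked with a breadth-first search. The sketch below is illustrative only: the adjacency-dictionary representation and the function names are assumptions, not the authors' implementation. The generation numbers it returns are also the quantity behind the earlier remark that a roughly constant delay between generations would allow the digraph to inform the duration of the epidemic.

```python
from collections import deque


def infection_generations(edges, roots):
    """Multi-source breadth-first search over a directed graph.

    edges: dict mapping each vertex to an iterable of vertices it points to.
    roots: vertices infected from outside (generation 0).
    Returns a dict {vertex: generation} for every vertex reachable from a root.
    """
    generation = {r: 0 for r in roots}
    queue = deque(roots)
    while queue:
        u = queue.popleft()
        for v in edges.get(u, ()):
            if v not in generation:
                generation[v] = generation[u] + 1
                queue.append(v)
    return generation


def all_infected_reachable(edges, roots, infected):
    """True if every infected individual is reachable from at least one root."""
    reached = infection_generations(edges, roots)
    return all(v in reached for v in infected)


# Hypothetical toy digraph with two externally infected individuals, "a" and "x".
toy_edges = {"a": ["b", "c"], "b": ["d"], "x": ["y"]}
toy_generations = infection_generations(toy_edges, ["a", "x"])
```

In this toy graph the deepest vertex, "d", is two generations from its root, so under a constant generation interval the outbreak would span roughly two such intervals.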
One way to generate the unknown infectives is to use approximation methods involving the final severity along the lines discussed in Demiris and O'Neill (2005). Roughly speaking, each individual in the observed fraction would have a fixed probability of being infected from outside, this probability itself being calculated by an approximation to the final severity of the epidemic in the entire population. A drawback with this approach is that it requires the undesirable $R > 1$ assumption discussed previously. An exact alternative is simply to impute the entire digraph in the unobserved population as well, which would be feasible in small population settings, but time-consuming for larger populations.

Finally, an important extension of our methodology is towards Bayesian model choice. In principle it is possible to implement trans-dimensional MCMC methods in the multitype epidemic model setting, the key (non-trivial) challenge being to move efficiently between different models. This is a subject of current investigation.

Acknowledgments

We thank Owen Lyne for helpful discussions. The first author was partly supported by EPSRC grant GR/M86323/01, and computing facilities were partly funded by EPSRC JREI grant GR/R08292/01.

References

Addy, C. L., Longini, I. M. and Haber, M. (1991) A generalized stochastic model for the analysis of infectious disease final size data. Biometrics 47.
Andersson, H. (1999) Epidemic models and social networks. Math. Sci. 24.
Andersson, H. and Britton, T. (2000) Stochastic Epidemic Models and Their Statistical Analysis. Lecture Notes in Statistics 151, Springer, New York.
Ball, F. G., Britton, T. and Lyne, O. D. (2004) Stochastic multitype epidemics in a community of households: Estimation of threshold parameter R and secure vaccination coverage. Biometrika 91.
Ball, F. G. and Lyne, O. D. (2001) Stochastic multitype SIR epidemics among a population partitioned into households. Adv. in Appl. Probab. 33.
Ball, F. G. and Lyne, O. D. (2004) Private communication.
Ball, F. G., Mollison, D. and Scalia-Tomba, G. (1997) Epidemics with two levels of mixing. Ann. Appl. Probab. 7.
Barbour, A. D. and Mollison, D. (1990) Epidemics and random graphs. In Stochastic Processes in Epidemic Theory, eds. Gabriel, J. P. and Lefèvre, C., Lecture Notes in Biomathematics 86.
Becker, N. G. (1989) Analysis of Infectious Disease Data. Chapman and Hall, London.
Becker, N. G. and Dietz, K. (1995) The effect of the household distribution on transmission and control of highly infectious diseases. Math. Biosci. 127.
Becker, N. G. and Hall, R. (1996) Immunization levels for preventing epidemics in a community of households made up of individuals of various types. Math. Biosci. 132.
Britton, T. (1998) Estimation in multitype epidemics. J. R. Statist. Soc. B 60.
Britton, T. and Becker, N. G. (2000) Estimating the immunity coverage required to prevent epidemics in a community of households. Biostatistics 1.
Britton, T. and O'Neill, P. D. (2002) Bayesian inference for stochastic epidemics in populations with random social structure. Scand. J. Statist. 29.
Demiris, N. and O'Neill, P. D. (2005) Bayesian inference for epidemic models with two levels of mixing. To appear, Scand. J. Statist.
Farrington, C. P., Kanaan, M. N. and Gay, N. J. (2001) Estimation of the basic reproduction number for infectious diseases from age-stratified serological survey data, with discussion. J. R. Statist. Soc. C 50.
Gilks, W., Richardson, S. and Spiegelhalter, D. (1996) Markov Chain Monte Carlo in Practice. Chapman and Hall, London.
Haydon, D. T., Chase-Topping, M., Shaw, D. J., Matthews, L., Friar, J. K., Wilesmith, J. and Woolhouse, M. E. (2003) The construction and analysis of epidemic trees with reference to the 2001 UK foot-and-mouth outbreak. Proc. R. Soc. Lond. B 270.
Islam, M. N., O'Shaughnessy, C. D. and Smith, B. (1996) A random graph model for the final-size distribution of household infections. Stat. in Med. 15.
Li, N., Qian, G. and Huggins, R. (2002) Analysis of between-household heterogeneity in disease transmission from data on outbreak sizes. Aust. N. Z. J. Stat. 44.
Longini, I. M. and Koopman, J. S. (1982) Household and community transmission parameters from final distributions of infections in households. Biometrics 38.
Longini, I. M., Koopman, J. S., Haber, M. and Cotsonis, G. A. (1988) Statistical inference for infectious diseases: risk-specific household and community transmission parameters. Am. J. Epid. 128.
Ludwig, D. (1975) Final size distributions for epidemics. Math. Biosci. 23.
Monto, A. S., Koopman, J. S. and Longini, I. M. (1985) Tecumseh study of illness. XIII. Influenza infection and disease. American Journal of Epidemiology 121.
O'Neill, P. D., Balding, D. J., Becker, N. G., Eerola, M. and Mollison, D. (2000) Analyses of infectious disease data from household outbreaks by Markov Chain Monte Carlo methods. J. R. Statist. Soc. C 49.
Rida, W. (1991) Asymptotic properties of some estimators for the infection rate in the general stochastic epidemic. J. R. Statist. Soc. B 53.
Schinazi, R. (2002) On the role of social clusters in the transmission of infectious diseases. Theoretical Population Biology 61.
Wallinga, J. and Teunis, P. (2004) Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology 160.

                   Constant    Exponential    Gamma
n = 25   Mean         …            …            …
         S. Dev       …            …            …
n = 50   Mean         …            …            …
         S. Dev       …            …            …
n = 75   Mean         …            …            …
         S. Dev       …            …            …

Table 1: Posterior means and standard deviations for $R_0$ under the assumption of three different infectious period distributions, each with mean 1, and with variance 0 (Constant), 1 (Exponential) and 10 (Gamma). The population size is N = 100.

Columns: Susceptibles per household. Rows: No. infected, Total.

Table 2: Final numbers infected in households during influenza outbreak in Tecumseh, Michigan.

Parameter    $\lambda_L$      $\lambda_G$    $R$            $\eta_L$    $\eta_G$
Mean         …                …              …              …           …
Median       …                …              …              …           …
S. dev       …                …              …              …           …
95% C. I.    (0.032,0.072)    (0.15,0.25)    (0.91,1.65)    (38,61)     (90,109)

Table 3: Posterior parameter summaries for the infection rates, threshold parameter, and numbers of local and global edges in $G$, Tecumseh data set.

Model    Row      $\Lambda^L$, j = 1    $\Lambda^L$, j = 2    $\Lambda^G$, j = 1    $\Lambda^G$, j = 2
GS       i = 1    … (0.015)             … (0.013)             0.146 (0.015)         … (0.0090)
GS       i = 2    … (0.032)             … (0.011)             0.146 (0.015)         … (0.0090)
LGS      i = 1    … (0.013)             … (0.0073)            0.143 (0.014)         … (0.0090)
LGS      i = 2    … (0.013)             … (0.0073)            0.143 (0.014)         … (0.0090)

Table 4: Posterior mean (standard deviation) for $\Lambda^L$ and $\Lambda^G$ for the Global-Susceptibility and Local-Global-Susceptibility models.

[Figure: posterior density $\pi(R_0)$ plotted against $R_0$, with curves labelled Constant, Exponential and Gamma.]

Fig. 1. Posterior density plots for $R_0$ under the assumption of three different infectious period distributions, each with mean 1, and with variance 0 (Constant), 1 (Exponential) and 10 (Gamma). The data are n = 25 cases in a population size of N = […].

Bayesian inference for stochastic multitype epidemics in structured populations using sample data

Bayesian inference for stochastic multitype epidemics in structured populations using sample data Bayesian inference for stochastic multitype epidemics in structured populations using sample data PHILIP D. O NEILL School of Mathematical Sciences, University of Nottingham, Nottingham, UK. SUMMARY This

More information

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 2 Coding and output Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. General (Markov) epidemic model 2. Non-Markov epidemic model 3. Debugging

More information

Threshold Parameter for a Random Graph Epidemic Model

Threshold Parameter for a Random Graph Epidemic Model Advances in Applied Mathematical Biosciences. ISSN 2248-9983 Volume 5, Number 1 (2014), pp. 35-41 International Research Publication House http://www.irphouse.com Threshold Parameter for a Random Graph

More information

Statistical Inference for Stochastic Epidemic Models

Statistical Inference for Stochastic Epidemic Models Statistical Inference for Stochastic Epidemic Models George Streftaris 1 and Gavin J. Gibson 1 1 Department of Actuarial Mathematics & Statistics, Heriot-Watt University, Riccarton, Edinburgh EH14 4AS,

More information

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

More information

Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0

Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0 Mathematical Statistics Stockholm University Reproduction numbers for epidemic models with households and other social structures I: Definition and calculation of R 0 Lorenzo Pellis Frank Ball Pieter Trapman

More information

Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination

Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination arxiv:1410.4469v [q-bio.pe] 10 Dec 015 Frank Ball 1, Lorenzo Pellis

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Partially Observed Stochastic Epidemics

Partially Observed Stochastic Epidemics Institut de Mathématiques Ecole Polytechnique Fédérale de Lausanne victor.panaretos@epfl.ch http://smat.epfl.ch Stochastic Epidemics Stochastic Epidemic = stochastic model for evolution of epidemic { stochastic

More information

Bayesian Inference for Contact Networks Given Epidemic Data

Bayesian Inference for Contact Networks Given Epidemic Data Bayesian Inference for Contact Networks Given Epidemic Data Chris Groendyke, David Welch, Shweta Bansal, David Hunter Departments of Statistics and Biology Pennsylvania State University SAMSI, April 17,

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models.

Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models. Robust MCMC Algorithms for Bayesian Inference in Stochastic Epidemic Models. An Application to the 2001 UK Foot-and-Mouth Outbreak Theodore Kypraios @ University of Nottingham Gareth O. Roberts @ Lancaster

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

EPIDEMICS WITH TWO LEVELS OF MIXING

EPIDEMICS WITH TWO LEVELS OF MIXING The Annals of Applied Probability 1997, Vol. 7, No. 1, 46 89 EPIDEMICS WITH TWO LEVELS OF MIXING By Frank Ball, Denis Mollison and Gianpaolo Scalia-Tomba University of Nottingham, Heriot-Watt University

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo 1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior

More information

Bayesian estimation of the basic reproduction number in stochastic epidemic models

Bayesian estimation of the basic reproduction number in stochastic epidemic models Bayesian estimation of the basic reproduction number in stochastic epidemic models Damian Clancy Department of Mathematical Sciences, University of Liverpool Philip D. O Neill School of Mathematical Sciences,

More information

Thursday. Threshold and Sensitivity Analysis

Thursday. Threshold and Sensitivity Analysis Thursday Threshold and Sensitivity Analysis SIR Model without Demography ds dt di dt dr dt = βsi (2.1) = βsi γi (2.2) = γi (2.3) With initial conditions S(0) > 0, I(0) > 0, and R(0) = 0. This model can

More information

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area

More information

A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation

A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation A tutorial introduction to Bayesian inference for stochastic epidemic models using Approximate Bayesian Computation Theodore Kypraios 1, Peter Neal 2, Dennis Prangle 3 June 15, 2016 1 University of Nottingham,

More information

Markov-modulated interactions in SIR epidemics

Markov-modulated interactions in SIR epidemics Markov-modulated interactions in SIR epidemics E. Almaraz 1, A. Gómez-Corral 2 (1)Departamento de Estadística e Investigación Operativa, Facultad de Ciencias Matemáticas (UCM), (2)Instituto de Ciencias

More information

Downloaded from:

Downloaded from: Camacho, A; Kucharski, AJ; Funk, S; Breman, J; Piot, P; Edmunds, WJ (2014) Potential for large outbreaks of Ebola virus disease. Epidemics, 9. pp. 70-8. ISSN 1755-4365 DOI: https://doi.org/10.1016/j.epidem.2014.09.003

More information

1 Using standard errors when comparing estimated values

1 Using standard errors when comparing estimated values MLPR Assignment Part : General comments Below are comments on some recurring issues I came across when marking the second part of the assignment, which I thought it would help to explain in more detail

More information

eqr094: Hierarchical MCMC for Bayesian System Reliability

eqr094: Hierarchical MCMC for Bayesian System Reliability eqr094: Hierarchical MCMC for Bayesian System Reliability Alyson G. Wilson Statistical Sciences Group, Los Alamos National Laboratory P.O. Box 1663, MS F600 Los Alamos, NM 87545 USA Phone: 505-667-9167

More information

Assessing system reliability through binary decision diagrams using bayesian techniques.

Assessing system reliability through binary decision diagrams using bayesian techniques. Loughborough University Institutional Repository Assessing system reliability through binary decision diagrams using bayesian techniques. This item was submitted to Loughborough University's Institutional

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

A Bayesian Approach to Phylogenetics

A Bayesian Approach to Phylogenetics A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte

More information

Approximate Bayesian Computation

Approximate Bayesian Computation Approximate Bayesian Computation Michael Gutmann https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University 1st December 2015 Content Two parts: 1. The basics of approximate

More information

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1 Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Prof. Tapio Elomaa tapio.elomaa@tut.fi Course Basics A new 4 credit unit course Part of Theoretical Computer Science courses at the Department of Mathematics There will be 4 hours

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

LIMIT THEOREMS FOR A RANDOM GRAPH EPIDEMIC MODEL. By Håkan Andersson Stockholm University

LIMIT THEOREMS FOR A RANDOM GRAPH EPIDEMIC MODEL. By Håkan Andersson Stockholm University The Annals of Applied Probability 1998, Vol. 8, No. 4, 1331 1349 LIMIT THEOREMS FOR A RANDOM GRAPH EPIDEMIC MODEL By Håkan Andersson Stockholm University We consider a simple stochastic discrete-time epidemic

More information

Understanding the contribution of space on the spread of Influenza using an Individual-based model approach

Understanding the contribution of space on the spread of Influenza using an Individual-based model approach Understanding the contribution of space on the spread of Influenza using an Individual-based model approach Shrupa Shah Joint PhD Candidate School of Mathematics and Statistics School of Population and

More information

3 Undirected Graphical Models

3 Undirected Graphical Models Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 3 Undirected Graphical Models In this lecture, we discuss undirected

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008
