Emulating a gravity model to infer the spatiotemporal dynamics of an infectious disease


arXiv: v3 [stat.ME] 14 Feb 2013

Roman Jandarov, Department of Statistics, The Pennsylvania State University, raj153@psu.edu
Murali Haran, Department of Statistics, The Pennsylvania State University, mharan@stat.psu.edu
Ottar Bjørnstad, Departments of Entomology and Biology, The Pennsylvania State University, onb1@psu.edu
Bryan Grenfell, Departments of Ecology and Evolutionary Biology, Princeton University, grenfell@princeton.edu

Draft: February 18, 2013

Abstract

Probabilistic models for infectious disease dynamics are useful for understanding the mechanism underlying the spread of infection. When the likelihood function for these models is expensive to evaluate, traditional likelihood-based inference may be computationally intractable. Furthermore, traditional inference may lead to poor parameter estimates and the fitted model may not capture important biological characteristics of the observed data. We propose a novel approach for resolving these issues that is inspired by recent work in emulation and calibration for complex computer models. Our motivating example is the gravity time series susceptible-infected-recovered (TSIR) model. Our approach focuses on the characteristics of the process that are of scientific interest. We find a Gaussian process approximation to the gravity model using key summary statistics obtained from model simulations. We demonstrate via simulated examples that the new approach is computationally expedient, provides accurate parameter inference, and results in a good model fit. We apply our method to analyze measles outbreaks in England and Wales in two periods, the pre-vaccination period and the vaccination period. Based on our results, we are able to obtain important scientific insights about the transmission of measles. In general, our method is applicable to problems where traditional likelihood-based inference is computationally intractable or produces a poor model fit. It is also an alternative to approximate Bayesian computation (ABC) when simulations from the model are expensive.

1 Introduction

Infectious disease dynamics are of interest to modelers from a range of disciplines. The theory of disease dynamics provides a tractable system for investigating key questions in population and evolutionary biology. Understanding disease dynamics helps with disease management and with pressing issues such as disease emergence and epidemic control strategies. Probabilistic models for disease dynamics are important because they increase our understanding of the mechanism underlying the spread of infection while also accounting for its inherent stochasticity. Observations on reported cases of disease, especially in the form of space-time data, are becoming increasingly available, allowing for statistical inference for the unknown parameters of these models. However, traditional likelihood-based inference for many disease dynamics models is often challenging because the likelihood function may be expensive to evaluate, making such inference computationally intractable.
Furthermore, traditional inference may lead to poor parameter estimates and the fitted model may not capture important biological characteristics of the observed data. Hence, an approach that simultaneously addresses the computational challenges as well as the inferential issues would be very useful for a number of interesting and important probabilistic models for dynamics of diseases. Inspired by work in the field of emulation and calibration for complex computer models (cf. Bayarri, Berger, Cafeo, Garcia-Donato, Liu, Palomo, Parthasarathy, Paulo, Sacks, and Walsh, 2007; Craig, Goldstein, Rougier, and Seheult, 2001; Kennedy and O'Hagan, 2001; Sacks, Welch, Mitchell, and Wynn, 1989), we develop a novel approach for inference for such models. Our approach uses a Gaussian process approximation to the disease dynamics model using key biologically relevant summary statistics obtained from simulations of the model at differing parameter values. As we will demonstrate, this approach results in reliable parameter estimates and a good model fit, and is also computationally efficient.

The motivating example for our approach is the gravity time series susceptible-infected-recovered (TSIR) model for measles dynamics. The spatiotemporal dynamics of measles have received a lot of attention, in part due to the importance of the disease and the highly nonlinear outbreak dynamics, and also because of the availability of rich data sets. Important aspects of the local dynamics of measles are well studied. These include key issues like seasonality in transmission of the infection (Bjørnstad, Finkenstädt, and Grenfell, 2002; Dietz, 1976), effects of host demography on outbreak frequency (Finkenstädt, Keeling, and Grenfell, 1998; McLean and Anderson, 1988), and causes of local persistence and extinctions (Bartlett, 1956; Grenfell, Bjørnstad, and Kappey, 2001; Grenfell and Harwood, 1997). During the course of outbreaks in well-mixed local populations, the epidemic trajectory of measles is virtually unaffected by infection that may enter from neighboring locations. However, spatial coupling is fundamental to the dynamics and management of measles for smaller communities where the infection may become locally extinct (Bartlett, 1956; Grenfell and Harwood, 1997). Hence, ecologists have also studied the spatial spread of the disease using so-called metapopulation models (Earn, Rohani, and Grenfell, 1998; Grenfell and Harwood, 1997; Swinton and Grenfell, 1998).

In this paper, we investigate inference for a model first proposed by Xia, Bjørnstad, and Grenfell (2004). The model combines the TSIR model (Bjørnstad et al., 2002; Grenfell, Bjørnstad, and Finkenstädt, 2002) with a term that allows for spatial transmission between different host communities, modeled as a gravity process. Xia et al. (2004) demonstrate how this model captures scientifically important properties of measles dynamics.
Since each likelihood evaluation is computationally very expensive, however, Xia et al. (2004) obtain only point estimates of the parameters by minimizing ad hoc objective functions instead of using a likelihood-based approach. Here, we develop a more statistically rigorous approach to inferring model parameters, characterizing associated uncertainties and carefully studying parameter identifiability issues.

First, in order to explain the issues that arise in inferring these parameters via a likelihood-based approach, we propose a partial discretization of the parameter space that allows us to perform Bayesian inference for the parameters using a fast MCMC algorithm. Using this approach we are able to study uncertainties about the parameter values. The method allows us to investigate parameter identifiability issues, showing which gravity model parameters can or cannot be inferred from a given data set. However, this approach to resolving the computational challenges of traditional likelihood-based inference is problematic, as is revealed by our simulated data examples. We find that the parameter estimates are poor and that forward simulations of the model at these parameter settings do not reproduce epidemiological features of the data deemed key in Xia et al. (2004).

In order to address the above issues, we propose a new approach that directly focuses on the aspects of the underlying process that are of scientific interest. We develop a Gaussian process approximation to the gravity model based on key summary statistics obtained from simulations of the model at different parameter values. These statistics are chosen by domain experts to capture the biologically important characteristics of the dynamics of the disease. The Gaussian process model emulator is then used to develop a probability model for the observations, thereby permitting an efficient MCMC approach to Bayesian inference for the parameters. We demonstrate that the new method recovers the true parameters and that the resulting fitted model captures biologically relevant features of the data.

When applied to the gravity TSIR model, our approach allows us to investigate several scientific questions that are of interest for the dynamics of measles. We study changes in dynamics between school holiday periods and non-holidays in the pre-vaccination era. This is particularly interesting because the local, age-structured transmission rate of the disease changes from holidays to non-holidays (Bjørnstad et al., 2002; Dietz, 1976). Since our approach allows us to construct confidence regions easily, we also infer the amounts of exported and imported infected individuals for different cities during different time periods and reveal that movement patterns of the infection do not seem to change significantly between the pre-vaccination and vaccination eras. Based on the parameter estimates obtained using our method, we are able to display the inflow and outflow networks of the infection between cities. Along with histograms of the degree distributions of the networks, these graphs help to identify the cities that are important hubs in measles transmission.
More generally, the methodology we develop here may be useful for models where the likelihood is expensive to evaluate or in situations where the likelihood is unable to capture characteristics of the model that are of scientific interest. We note that the computational cost of forward simulations for our model makes approaches based on approximate Bayesian computation (ABC) (cf. Beaumont, Zhang, and Balding, 2002; Marjoram, Molitor, Plagnol, and Tavaré, 2003; Pritchard, Seielstad, Perez-Lezaun, and Feldman, 1999) infeasible. Hence our approach is computationally efficient, while ABC is not a viable option here.

The rest of the paper is organized as follows. Section 2 describes in detail the gravity TSIR model, which acts as our motivating example. Section 3 describes the inferential and computational challenges posed by the model and the large space-time data set. Section 4 describes our new emulation-based approach, which is an alternative to traditional likelihood-based inference. Section 5 describes computational details and the application of our method to the gravity TSIR model in simulated data examples. Section 6 describes the application of our method to the England-Wales measles data sets. Finally, in Section 7, we summarize our results and discuss our statistical approach and scientific conclusions.

2 A gravity model for disease dynamics

A general goal of fitting metapopulation disease dynamics models is to describe spatiotemporal patterns of epidemics at the local scale and to understand how these patterns are affected by the network of spatial spread of the disease (Cliff, Haggett, and Smallman-Raynor, 1993; Keeling, Bjørnstad, and Grenfell, 2004). The gravity model we study is an extension of a discrete time-series susceptible-infected-recovered model (Bjørnstad et al., 2002; Grenfell et al., 2002) for local disease dynamics which includes an explicit formulation for the spatial transmission between different host cities (Xia et al., 2004). The common theoretical framework used to describe the dynamics of infectious diseases is based on the division of the human host population into groups containing susceptible, infected (infectious) and recovered individuals. Let $I_{kt}$ and $S_{kt}$ denote the number of infected and susceptible individuals, respectively, in disease generation $t$ in city $k$, and let $L_{kt}$ be the number of infected people commuting to city $k$ at time $t$. The commuting assumption reflects that movement of infection is mostly through transient movement of individuals. Denote the size and birth rate of city $k$ at time $t$ by $N_{kt}$ and $B_{kt}$, and let $d_{kj}$ represent the distance between cities $k$ and $j$. The model can then be described as follows. First, the model for the number of incidences of measles is

$$I_{k(t+1)} \sim \mathrm{Poisson}(\lambda_{k,t+1}), \quad \text{where} \quad \lambda_{k,t+1} = \beta_t S_{kt} (I_{kt} + L_{kt})^{\alpha}, \tag{1}$$

with $t = 1, \ldots, T$ and $k = 1, \ldots, K$, where $K$ is the number of cities in our data and $T$ is the total number of time steps. The time step is taken to be 2 weeks, roughly corresponding to the generation length (serial interval) of measles.
The so-called transmission coefficient, $\beta := \{\beta_t\}$, is a parameter that represents the attack rate of measles at time $t$, and $\alpha$ is a positive real number correcting for the discrete-time approximation to the underlying continuous-time epidemic process (Glass, Xia, and Grenfell, 2003). Since these parameters only affect the local dynamics of measles, henceforth we refer to them as local dynamics parameters. The indexing by $t$ for $\beta_t$ reflects how this parameter is taken to be piecewise constant, taking 26 different values to accommodate seasonal variability of the transmission rate that is repeated every year (Bjørnstad et al., 2002; Fine and Clarkson, 1982; Finkenstädt and Grenfell, 2000; Grenfell et al., 2002). From Equation (1), it can be seen that $I_{k(t+1)}$ increases with the number of susceptibles and the number of moving infections arriving in city $k$ at the previous time step. Note that we use the Poisson distribution whereas Xia et al. (2004) use the Negative Binomial distribution; this is due to the greater computational stability of the Poisson distribution for small values of $\lambda$. Our approach would proceed in the same way under either the Negative Binomial or the Poisson assumption. In addition, our exploratory analysis shows that a model fit using the Poisson distribution is similar to a model fit obtained with the Negative Binomial distribution, and the final inference about the parameters of interest is not affected by changing the distributional assumption.

The susceptibles are modeled as follows:

$$S_{k(t+1)} = S_{kt} + B_{kt} - I_{k(t+1)}, \tag{2}$$

reflecting how susceptibles are replenished by births and depleted by infection. Since case fatality from measles was very low for the period of time in this study and the mean age of infection was small, mortalities are not included in this balance equation. We note that here and in the following, after vaccinations are available, the birth rates ($B_{kt}$) are deflated by the corresponding percentage of vaccinated newborns ($V_{kt}$), since those cannot be infected.

Finally, the gravity model describes the number of moving infected individuals by

$$L_{kt} \sim \mathrm{Gamma}(m_{kt}, 1), \quad \text{where} \quad m_{kt} = \theta N_{kt}^{\tau_1} \sum_{j=1, j \neq k}^{K} \frac{I_{jt}^{\tau_2}}{d_{kj}^{\rho}}, \tag{3}$$

where $\mathrm{Gamma}(a, b)$ represents the Gamma distribution with shape and scale parameters $a$ and $b$ respectively. Here, $b$ is chosen to be equal to unity based on exploratory analysis of the fitted model (Xia et al., 2004). The reason for modeling immigrant infection as a continuous random variable lies in the assumption that the transient infectives do not remain for a full epidemic generation. The local dynamics parameters in Equation (1) have been estimated previously (Bjørnstad et al., 2002; Finkenstädt, Bjørnstad, and Grenfell, 2002; Grenfell et al., 2002).
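To make the structure of Equations (1)-(3) concrete, the following is a minimal sketch of a single forward step of the gravity TSIR model in Python with NumPy. It takes the equations exactly as written above. The function name `gravity_tsir_step`, the three-city distance matrix and all numerical inputs are hypothetical toy values chosen only to keep the arithmetic small; they are not the values used in the paper, and `theta` here is the raw coupling parameter rather than the reparametrized one.

```python
import numpy as np

def gravity_tsir_step(I, S, B, N, dist, beta_t, alpha, theta, tau1, tau2, rho, rng):
    """Advance all K cities by one time step, following Equations (1)-(3) as written."""
    I = np.asarray(I, dtype=float)
    # Pairwise weights d_kj^(-rho), with the j = k term excluded from the sum.
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, 1.0)            # placeholder so that 0 ** (-rho) is never computed
    w = d ** (-rho)
    np.fill_diagonal(w, 0.0)
    # Equation (3): Gamma-distributed imported infection with shape m_kt and scale 1.
    m = theta * np.asarray(N, dtype=float) ** tau1 * (w @ I ** tau2)
    L = np.zeros_like(m)
    pos = m > 0                         # a zero shape means no imported infection
    L[pos] = rng.gamma(shape=m[pos], scale=1.0)
    # Equation (1): Poisson incidence in the next generation.
    lam = beta_t * np.asarray(S, dtype=float) * (I + L) ** alpha
    I_next = rng.poisson(lam)
    # Equation (2): susceptible balance (births in, infections out).
    S_next = np.asarray(S) + np.asarray(B) - I_next
    return I_next, S_next, L

# Toy usage with three hypothetical cities (all inputs are illustrative assumptions).
rng = np.random.default_rng(0)
dist = np.array([[0.0, 50.0, 120.0],
                 [50.0, 0.0, 80.0],
                 [120.0, 80.0, 0.0]])
I1, S1, L0 = gravity_tsir_step(
    I=[3, 0, 1], S=[400, 150, 60], B=[5, 2, 1], N=[5000, 2000, 800], dist=dist,
    beta_t=0.01, alpha=0.97, theta=1e-3, tau1=0.3, tau2=0.7, rho=1.0, rng=rng)
print(I1, S1, L0)
```

Iterating this step over $t = 1, \ldots, T$ would give one forward simulation of the model; the sketch makes no attempt to guard against the susceptible pool becoming negative over long runs.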
In this study, we are interested in learning about the parameters $\theta$, $\tau_1$, $\tau_2$ and $\rho$ in Equation (3), as these parameters control the spatial spread and regional behavior of the disease. Note, however, that for convenience and numerical stability, we use a reparametrization of $\theta$, namely $\log_{10}(\theta)/5$, and write $\theta$ for the reparametrized quantity throughout the paper.

3 Parameter inference for the gravity model

Reliable estimates of the local dynamics parameters $\alpha$ and $\beta$ are available for measles dynamics (Bjørnstad et al., 2002; Finkenstädt et al., 2002; Grenfell et al., 2002; Xia et al., 2004). Therefore, since we are only interested in the spatial dynamics of the disease, we assume that these parameters are known and use the estimates obtained from previous work (cf. Xia et al., 2004) as the true values. In particular, the local seasonal transmission parameters for biweeks 1 through 26, $\beta_t$, are taken to be equal to

$$\beta_t = (1.24, 1.14, 1.16, 1.31, 1.24, 1.12, 1.06, 1.02, 0.94, 0.98, 1.06, 1.08, 0.96, 0.92, 0.92, 0.86, 0.76, 0.63, 0.62, 0.83, 1.13, 1.20, 1.11, 1.02, 1.04, 1.08),$$

and $\alpha$ is fixed at a value slightly less than unity. Here, the differences in the values of $\beta_t$ are primarily related to the fact that attack rates of measles differ depending on the season of the year, since it is known that schools are major hubs of transmission of the disease. It is also known that the true transmission process is continuous. Since we are considering a discretized model with a step equal to two weeks, it is therefore expected that the true attack rates of the disease could be higher. This explains the value of $\alpha$, which is slightly less than unity.

In principle, it may be possible to reduce the dimensionality of $\beta_t$ while still preserving the seasonality of attack rates of the infection. With a lower-dimensional $\beta_t$, one could assume strong priors for the local dynamics parameters and try to infer them jointly with the remaining unknown parameters. However, trying to infer these parameter values simultaneously still significantly increases the identifiability issues and further complicates computation. Crucially, we note that assuming the local dynamics parameters are known does not have an undue effect on the model fit, as has already been shown in the literature (cf. Xia et al., 2004). Assuming the local dynamics parameters are known leaves us with four unknown parameters, $\theta$, $\tau_1$, $\tau_2$ and $\rho$, that we call the gravity model parameters (in our Gaussian process based approach in Section 4 we will also introduce several other parameters).
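As a small illustration of how the 26 seasonal values repeat every year, the lookup below maps a biweekly time step to its transmission coefficient. This is only a sketch: the helper name `beta_at` is hypothetical, and it assumes that time step $t = 1$ coincides with biweek 1 of a calendar year, which the text does not state explicitly.

```python
# Seasonal transmission coefficients for biweeks 1..26, as listed in the text.
BETA = (1.24, 1.14, 1.16, 1.31, 1.24, 1.12, 1.06, 1.02, 0.94, 0.98, 1.06, 1.08, 0.96,
        0.92, 0.92, 0.86, 0.76, 0.63, 0.62, 0.83, 1.13, 1.20, 1.11, 1.02, 1.04, 1.08)

def beta_at(t):
    """Return beta_t for time step t = 1, 2, ..., cycling through the 26 biweeks.

    Assumes (hypothetically) that t = 1 falls on biweek 1 of a year.
    """
    if t < 1:
        raise ValueError("time steps are numbered from 1")
    return BETA[(t - 1) % len(BETA)]

# The schedule repeats after one year of 26 biweeks.
print(beta_at(1), beta_at(27), beta_at(26))
```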
In this paper our focus is on investigating the gravity model parameters and, when possible, obtaining the best estimates of them with relevant descriptions of their variability. As suggested by our domain experts, feasible values for the gravity parameters lie in the interval [0, 2] (see also Xia et al., 2004). Therefore, we use uniform priors on this interval for $(\theta, \tau_1, \tau_2, \rho)$ in all the inferential approaches that follow. The data are spatiotemporal and high-dimensional, both for the England-Wales measles data from the pre-vaccination era and for the later time period. To study whether our fitted model captures epidemiologically relevant features of the data, we focus on two important biological characteristics of the process as suggested by domain experts. These are:

1. Maximum number of incidences, which we denote by $M = (M_1, \ldots, M_K)$, where $M_i$ is the maximum number of incidences for the $i$-th city.
2. Proportions of biweeks without any cases of infection, denoted by $P = (P_1, \ldots, P_K)$, where $P_i$ is the proportion of incidence-free biweeks for the $i$-th city.

An important goal of our work is to find parameter settings (along with associated uncertainties and dependencies among them) that yield a model that produces disease dynamics that are as close as possible to the data in terms of capturing these key properties.

3.1 A gridded MCMC approach and simulated examples

It is easy to see why each evaluation of the likelihood for the gravity model is expensive. As in many population dynamic models, the major difficulty is in integrating over high-dimensional unobserved variables. For our model, $\{L_{kt}\}$ and $\{S_{kt}\}$ are of $K \times T$ dimensions each, which translates to $2 \times 519{,}792$ in the case of the measles data set for the pre-vaccination era considered in Section 6. Details of the likelihood function are given in Web Appendix A.

In this section, using an MCMC algorithm based on the discretization of a subspace of the parameter space, we describe some issues that arise from a traditional likelihood-based or Bayesian approach to inference for the gravity model. Because likelihood-based inference for the gravity model is computationally intractable, our gridded MCMC algorithm requires certain simplifying assumptions and data imputation for the unobservable susceptibles $\{S_{kt}\}$. These assumptions and details of constructing our gridded MCMC algorithm for parameter inference are explained in Web Appendix B. We note, however, that our inferential approach based on a Gaussian process, described in Section 4, requires neither the simplifying assumptions nor data imputation.

We note that all simulated data sets we consider in this work are generated from the full gravity model described in Section 2 with initial points equal to the actual observations at $t = 1$. In these examples, the number of locations, their coordinates, demographic variables, and the number of time steps are the same as those in the measles data described in Section 6.1.
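The two summary statistics $M$ and $P$ introduced at the start of this section are simple to compute from a $K \times T$ matrix of case counts. The sketch below assumes such a matrix with one row per city and one column per biweek; the function name `summary_statistics` and the toy counts are hypothetical.

```python
import numpy as np

def summary_statistics(cases):
    """Return (M, P) for a K x T array of case counts (rows: cities, columns: biweeks).

    M[i] is the maximum number of incidences in city i.
    P[i] is the proportion of biweeks in city i with zero reported cases.
    """
    cases = np.asarray(cases)
    M = cases.max(axis=1)
    P = (cases == 0).mean(axis=1)
    return M, P

# Toy example with two cities and four biweeks (illustrative numbers only).
toy_cases = np.array([[0, 3, 0, 5],
                      [1, 1, 2, 1]])
M, P = summary_statistics(toy_cases)
print(M, P)
```

For the pre-vaccination data each of these vectors has one entry per city, which is why the summaries are far lower-dimensional than the latent variables discussed above.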
In our first example, we simulate a data set using values for the gravity parameters $\theta = 0.71$, $\tau_1 = 0.3$, $\tau_2 = 0.7$ and $\rho = 1$. This parameter setting results in realistic data that resemble the observations. Figure 1 shows conditional and unconditional posterior likelihood surface plots for $\theta$ and $\rho$ obtained by using the above gridded MCMC approach. From these plots, we can easily see that inference for $\theta$ and $\rho$ is not possible because of the apparent issue with identifiability (Figure 1 (a)). In Figure 1 (b) we see that the identifiability problem is reduced, but persists, when we fix one of the parameters, say $\tau_1$, at its known true value. In Figure 1 (c), we fix both $\tau_1$ and $\tau_2$ at their true values and see that the obtained ridge contains the true values for $\theta$ and $\rho$. Figure 1 (d) demonstrates that the ridge moves when the values of $\tau_1$ and $\tau_2$ are changed away from their true values.

Figure 1: Inferred posterior 2D likelihood surface obtained for data with known parameters ($\theta = 0.71$, $\tau_1 = 0.3$, $\tau_2 = 0.7$ and $\rho = 1$): (a) Marginal 2D likelihood surface for $(\theta, \rho)$; (b) Marginal 2D likelihood surface for $(\theta, \rho)$ assuming $\tau_1 = 0.3$ (true); (c) 2D likelihood surface for $(\theta, \rho)$ assuming $\tau_1 = 0.3$ (true) and $\tau_2 = 0.7$ (true); (d) 2D likelihood surface for $(\theta, \rho)$ assuming $\tau_1 = 0.5$ (any value) and $\tau_2 = 1$ (any value).

In our second example, we simulate a data set using values for the gravity parameters $\theta = 0.71$, $\tau_1 = 0.5$, $\tau_2 = 1$ and $\rho = 1$. Figure 2 is a plot of the two-dimensional likelihood in $(\theta, \rho)$ space obtained by fixing $\tau_1$ and $\tau_2$ at their true values 0.5 and 1 respectively. We can see here that the true values of the parameters of interest are not in the region where the likelihood is maximized. Unfortunately, repeating the above with other simulated data sets, generated at different true values of the gravity parameters, reveals that the ridge analogous to the one in Figure 1 (c) does not always contain the true values for $\theta$ and $\rho$. From our study of multiple simulated data sets, we also find that the likelihood ridge can have the same slope as, but a different intercept from, the ridge that one would intuitively regard as the true ridge. This difference in intercepts creates a shift, thereby resulting in poor parameter inference. Unfortunately, the magnitude and direction of the shift depend on the true parameter values, so no simple bias correction is available.

Figure 2: Inferred posterior 2D likelihood surface obtained for data with known parameters ($\theta = 0.71$, $\tau_1 = 0.5$, $\tau_2 = 1$ and $\rho = 1$): Posterior 2D likelihood surface for $(\theta, \rho)$ assuming $\tau_1 = 0.5$ (true) and $\tau_2 = 1$ (true) has a shift and does not contain the true $(\theta, \rho)$ at its highest probability area.

At first, one may think that the discretization of the parameters $\tau_2$ and $\rho$ may be causing some of these issues. We verify that this is not the case by simply computing the values of the true likelihood function at the top of the ridges obtained with the discretized likelihood. We are able to see that the likelihood surface using the discretization is similar to the true likelihood surface. The poor inference from our traditional Bayes approach is therefore clearly not a result of the discretization.

By generating additional simulations using a simpler model where we fix all the latent variables at their means, we also find that the full gravity model does not substantially differ from the simpler one in terms of capturing interesting biological characteristics of the underlying dynamics of the disease. In order to study the effect of this fixing on the likelihood surface, we save the true latent variables while simulating data and use them in place of the expectations used in our gridded MCMC algorithm. The results show that using the true values of the latent variables does not change the traditional Bayes inference. This also confirms that the shifts that we observe in the traditional Bayes approach are not due to simplifying the model in the gridded MCMC algorithm (see Web Appendix B for details about these assumptions), but rather due to inherent problems with the likelihood function.

We note that our main interest is to examine whether the parameter estimates result in a

model fit that is capable of reproducing important characteristics of the observations. In order to study the model fit from the gridded MCMC, we simulate a data set using the full gravity model with estimated values of the parameters, where here and throughout the paper we use modes of the corresponding posterior density functions as estimates of the parameters. These estimates for the measles data described in Section 6.1 are $(\theta, \tau_1, \tau_2, \rho) = (0.71, 0.5, 1, 1.48)$. For the simulated data set, we calculate the two 952-dimensional vectors of summary characteristics (952 being the number of cities in the data) and plot them against the summary vectors for the observed measles data (Figure 3). We can see that the simulated data do not seem to match the actual data in terms of the maximums $M$ and the proportions of zeros $P$ (Figure 3 (a)-(b)). In Section 5.2, we compare the model fit obtained via the gridded MCMC to the model fit we obtain via our Gaussian process-based approach described in Section 4.

We summarize below our conclusions based on the gridded MCMC approach:

1. The confidence regions for the parameters are very wide, suggesting that there may be relatively little information even with a fairly rich data set. Hence we assume that $\tau_1 = 1$, $\tau_2 = 1$ as estimated in Xia et al. (2004) and study the joint distribution of $\theta$ and $\rho$, which becomes well informed by the data.
2. The fitted gravity model, using the above inference about its parameters, does not capture important biological features of the data.
3. We find that the parameter estimates from the traditional Bayes approach are shifted and that the direction of the shift varies, as shown in Figure 2. For example, for a simulated data set using the parameter values $(\theta, \tau_1, \tau_2, \rho) = (0.71, 0.5, 1, 1)$, our attempt to infer $\rho$ assuming the other parameters are known results in an estimate $\hat{\rho} = 1.5$ with a confidence region that does not contain the truth.
Figure 3: Characteristics of simulated data at the parameters obtained via the traditional Bayes approach: (a) Simulated $M$ vs $M$ from the data; (b) Simulated $P$ vs $P$ from the data.

4 Gaussian processes for emulation-based inference

Since a traditional Bayes approach suffers from the above shortcomings, we develop an alternative method that is directly linked to the characteristics of the infectious disease dynamics that are of most interest to biologists. This method is based on using a Gaussian process to emulate the gravity model. A short review of Gaussian process basics is provided in Web Appendix C.

We describe a new two-stage approach for inferring the gravity parameters. In the first stage, we simulate the gravity model at several parameter settings. For each forward simulation of the model we can calculate the vector of summary statistics based on the simulated data set. This vector is high-dimensional: 952 (354) dimensions in the case of the measles data for the pre-vaccination (later) period. Since Gaussian process-based emulation in high dimensions poses serious computational challenges, we emulate the model by fitting a Gaussian process to the Euclidean distances between the summary statistics of the simulated data at the chosen parameter settings and the summary statistics for the real data. In the second stage, we perform Bayesian inference for the observations using the GP emulator from the first stage. We also allow for additional sources of uncertainty such as observational error and model-data discrepancy, as described below. We note that such two-stage approaches to parameter inference in complex models have been used to reduce computational challenges and alleviate identifiability issues (cf. Bhat, Haran, Olson, and Keller, 2012; Liu, Bayarri, and Berger, 2009).

We begin with some notation. Let $Z$ denote the vector of summary statistics of interest (e.g. proportions of zeros) calculated using the observed space-time data set. Let $\Theta$ be the gravity parameters and let $Y(\Theta)$ denote the vector of summary statistics obtained using a simulation from the gravity model with the parameter setting $\Theta$. Let $\Omega = (\Theta_1, \ldots, \Theta_p)$ be a grid on the parameter space. Our first goal is then to model $D = (D_1, \ldots, D_p)$, where $D_i$ is the Euclidean distance between $Y(\Theta_i)$ and $Z$ for $i = 1, \ldots, p$. This is done in the first stage

of our approach where we assume

$$D \mid \Omega, \beta_G, \xi_G \sim N(X\beta_G, \Sigma(\xi_G)). \tag{4}$$

Here, $\xi_G = (\sigma_G^2, \tau_G^2, \phi_G)$ is a vector of parameters that specify the covariance matrix, and $\beta_G$ is a vector of regression coefficients. The matrix $X$ is a design matrix of dimension $p \times 5$ with $i$-th row equal to $(1, \Theta_i^T)$. In other words, the columns of $X$ are an intercept and the values of the gravity parameters, $(\theta, \tau_1, \tau_2, \rho)$, on the selected grid. We use a Gaussian covariance matrix, $\Sigma(\xi_G)$, the elements of which are given by

$$(\Sigma(\xi_G))_{ij} = \mathrm{cov}(D_i, D_j) = \begin{cases} \sigma_G^2 \exp(-\phi_G^2 \|\Theta_i - \Theta_j\|^2), & \text{if } i \neq j, \\ \sigma_G^2 + \tau_G^2, & \text{otherwise.} \end{cases}$$

Here, $\|a - b\| := d(a, b)$, where throughout the paper the function $d(\cdot, \cdot)$ returns the Euclidean distance between the argument vectors. Then, if we let the maximum likelihood estimate of $(\beta_G, \xi_G)$ be $(\hat{\beta}_G, \hat{\xi}_G)$, using standard multivariate normal theory (cf. Anderson, 1984), the normal predictive distribution for the simulated distance $D$ at a new $\Theta$ can be obtained by substituting $(\hat{\beta}_G, \hat{\xi}_G)$ in place of $(\beta_G, \xi_G)$ and conditioning on $D$. We denote this predictive distribution by $\eta(D; \Theta)$. Details of the construction of this predictive distribution (the emulator) are given in Web Appendix D.

Consider a new space-time data set, and let the vector of summary statistics for these data be $Y^*$. Let the distance between $Y^*$ and $Z$ be $D^*$. The predictive distribution from the first stage provides a model for $D^*$, $\eta(D^*; \Theta^*)$, connecting it to some unknown parameter vector $\Theta^*$. Following Bayarri, Berger, Paulo, Sacks, Cafeo, Cavendish, Lin, and Tu (2007), we model the discrepancy between the gravity model and the real data. Failing to account for data-model discrepancy can lead to poor inference, as pointed out in Bayarri et al. (2007) and Bhat, Haran, and Goes (2010). We account for this by setting $D^* = D^*_\delta := \delta$, where $\delta > 0$ is the discrepancy term.
It is positive since it represents a Euclidean distance, which is nonnegative (in the unrealistic case that there is an exact match between the model for the data and the model used to fit the data, $\delta$ would be identically equal to 0). We then infer the gravity parameters using $\eta(D^*_\delta; \Theta^*)$, considering $\delta$ to be another unknown parameter in the MCMC algorithm. In other words, the likelihood function we use for our MCMC algorithm is the function $f(\delta, \Theta^*) := \eta(D^*_\delta; \Theta^*)$. We note that including a model discrepancy term results in more reliable parameter inference with narrower confidence regions, since it adjusts for the fact that even the best model fit is not going to reduce the distance between the simulated and

14 observed summary statistics to zero. In our simulated examples, where data are generated from the gravity model, the discrepancy term can be thought of as an adjustment parameter for the fact that two data sets simulated at the same parameter settings will always have small differences due to stochasticity. In these examples, as it is expected, estimate of the discrepancy is very small compared to the discrepancy term inferred from the original data. We also note that using negative values for δ would mean an extrapolation in our emulator beyond the grid of the parameter space that may lead to unreliable inference. In many situations, having a well-defined discrepancy term with an informative prior helps to reduce problems with identifiability of the parameters as well (cf. Craig et al., 2001). We can now summarize our inferential approach as follows: 1. Emulating the gravity model: (a) Select a grid (Θ 1,, Θ p ) on the range of possible values for Θ. (b) Calculate Y (Θ i ) using a simulation from the gravity model with Θ i for all i. (c) Calculate D = (D 1,, D p ), distances from Y i to Z for all i. (d) Find the maximum likelihood estimates of (β G, ξ G ), the parameters of the Gaussian process in Equation (4). Obtain the predictive distribution η(d; Θ). 2. Bayesian inference for δ and Θ given the observations Z: (a) Using the predictive distribution with a discrepancy term, η(dδ ; Θ ), perform Bayesian inference for the parameters (Θ, δ) from the posterior distribution via MCMC. 5 Emulation-based inference for the gravity TSIR model In this section we describe details of the application of the inferential approach described in Section 4 to the gravity TSIR model. By using simulated data examples, we show that the approach resolves the problems posed by traditional approaches. 
In order to contrast our approach to a traditional likelihood-based approach (carried out by gridded MCMC as described in Section 3.1), we also provide computational details from the application of both methods. 14
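The two-stage procedure summarized in Section 4 can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the authors' implementation: `toy_simulator` is a hypothetical two-parameter stand-in for the gravity model, the Gaussian process uses a squared-exponential covariance with a plug-in constant mean and variance, the length-scale is chosen by likelihood over a few candidate values rather than by a full maximum likelihood fit, and `log_f` assumes the likelihood is the emulator's predictive density for the distance evaluated at δ.

```python
import numpy as np

def toy_simulator(theta, rng):
    # Hypothetical stand-in for one gravity-model run: two noisy summary statistics.
    t1, t2 = theta
    return np.array([t1 + t2, t1 * t2]) + 0.01 * rng.standard_normal(2)

def kernel(a, b, length, var):
    # Squared-exponential covariance between the rows of a and the rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def fit_gp(X, y, lengths=(0.25, 0.5, 1.0, 2.0), nugget=1e-4):
    # Plug in a constant mean and variance, then pick the length-scale that
    # maximises the Gaussian log-likelihood among a few candidates (assumption).
    mu, var = y.mean(), y.var()
    best = None
    for length in lengths:
        K = kernel(X, X, length, var) + nugget * var * np.eye(len(X))
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - mu))
        loglik = -0.5 * (y - mu) @ alpha - np.log(np.diag(L)).sum()
        if best is None or loglik > best["loglik"]:
            best = dict(loglik=loglik, length=length, L=L, alpha=alpha)
    return dict(X=X, mu=mu, var=var, **best)

def predict(gp, theta):
    # Predictive mean and variance of the distance at a new parameter value.
    k = kernel(np.atleast_2d(theta), gp["X"], gp["length"], gp["var"])[0]
    mean = gp["mu"] + k @ gp["alpha"]
    v = np.linalg.solve(gp["L"], k)
    return mean, max(gp["var"] - v @ v, 1e-10)

def log_f(delta, theta, gp):
    # Log-likelihood of (delta, Theta): predictive density evaluated at delta.
    if delta <= 0:
        return -np.inf
    mean, var = predict(gp, theta)
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (delta - mean) ** 2 / var

rng = np.random.default_rng(0)
axis = np.linspace(0.0, 2.0, 8)
grid = np.array([(a, b) for a in axis for b in axis])       # step 1(a)
true_theta = np.array([1.0, 1.5])
z = toy_simulator(true_theta, rng)                           # stands in for Z
D = np.array([np.linalg.norm(toy_simulator(t, rng) - z)      # steps 1(b)-(c)
              for t in grid])
gp = fit_gp(grid, D)                                         # step 1(d)
near = log_f(0.05, true_theta, gp)                           # input to step 2(a)
far = log_f(0.05, np.array([0.2, 0.2]), gp)
```

In a full analysis, `log_f` would be combined with the priors on Θ and δ and passed to an MCMC sampler. Here it is only evaluated at two parameter values to show that settings far from the data-generating one receive much lower likelihood.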

5.1 Computational details of gridded MCMC and emulation-based approaches

Inference for both the traditional Bayes and emulator-based approaches relies on sampling from the corresponding posterior distributions via MCMC. In both methods, we use univariate sequential slice sampling updates for the continuous parameters (Agarwal and Gelfand, 2005; Neal, 2003). Parameters that are on the grid are updated via an analog of a simple random walk for discrete variables. In all the MCMC algorithms used for the discretized MCMC approach, the chain is run until we obtain 200,000 samples. This takes about 3 days on an Intel Xeon E5472 Quad-Core 3.0 GHz processor. In all the MCMC algorithms for the Gaussian process-based method, all updates are carried out using slice sampling, since all the parameters here are continuous. Chain lengths are again 200,000, and the chains take about 10 hours to generate. The chain lengths in both methods are adequate for producing posterior estimates with small Monte Carlo standard errors (Flegal, Haran, and Jones, 2008; Jones, Haran, Caffo, and Neath, 2006).

We emulate the gravity model with a Gaussian process using proportions of zeros as the summary statistic of interest. Our selection of proportions of zeros as the primary summary statistic of the analysis is based on suggestions by domain experts and on the intuition that these summary statistics are the most informative regarding the parameters of interest. It could be argued that big cities do not have bi-weeks without incidences of measles, making the proportions of zeros for these cities equal to 0. However, during the course of outbreaks in these cities, the epidemic trajectory of measles is nearly unaffected by infection that may enter from neighboring locations. This means that data on the number of measles cases in big cities may carry little information about the gravity parameters, which govern the movement of the infection between cities.
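As a small illustration of the summary statistics discussed here, the sketch below computes the per-city proportion of zeros and the per-city maximum from a matrix of case counts, along with the Euclidean distance between simulated and observed summaries. The function names and the toy data are hypothetical; rows are assumed to index bi-weeks and columns to index cities.

```python
import numpy as np

def proportion_of_zeros(cases):
    # Fraction of time steps with zero cases, one value per city (column).
    return (np.asarray(cases) == 0).mean(axis=0)

def maximums(cases):
    # Largest count observed in each city (column).
    return np.asarray(cases).max(axis=0)

def summary_distance(sim, obs, statistic=proportion_of_zeros):
    # Euclidean distance between simulated and observed summary vectors.
    return float(np.linalg.norm(statistic(sim) - statistic(obs)))

# Toy data: column 0 is a small city with frequent fade-outs,
# column 1 is a large city where cases never drop to zero in the observed data.
obs = np.array([[0, 12], [0, 30], [3, 25], [0, 18]])
sim = np.array([[0, 10], [2, 28], [0, 0], [0, 20]])

p_obs = proportion_of_zeros(obs)   # [0.75, 0.0]
p_sim = proportion_of_zeros(sim)   # [0.75, 0.25]
d = summary_distance(sim, obs)     # 0.25
```

Swapping `statistic=maximums` into `summary_distance` gives the corresponding distance based on the maximums.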
In our data, more than 90% of the cities may be considered small cities. Spatial transmission is very important to the dynamics of measles for these smaller cities, where the infection may become locally extinct. For small cities, infection re-entering from other cities is the only possible way to start a new outbreak.

Using different summary statistics may, of course, lead to different inference. Inference based on the maximums, however, was identical to what is obtained here, and we therefore do not include details of that analysis or the corresponding results. It is also possible to develop an emulator using these two summary statistics at the same time; this is computationally more demanding and, based on our exploratory data analysis, will not impact our conclusions. In general, the most informative summary statistics are not trivial to judge and depend on the disease and the available data. The choice of summary statistics is closely linked to the particular inference questions addressed and can be limited by the availability of informative statistics for any particular model parameters. In cases when there are no well-established summary statistics and/or scientifically important aspects of the disease dynamics that need to be captured, our emulation-based approach can be used with summary statistics constructed or selected via algorithms borrowed from the approximate Bayesian computation literature (cf. Blum and François, 2010; Fearnhead and Prangle, 2012; Nunes and Balding, 2010; Sisson and Fan, 2010). A possible approach to the lack of informative summary statistics is to increase the number of summary statistics, in the hope of increasing the amount of information regarding the unknown parameters (Sousa, Fritz, Beaumont, and Chikhi, 2009). This approach could, however, make our inferential methods more computationally expensive. Another method for selecting summary statistics is based on ordering them according to whether their inclusion in the analysis substantially improves the quality of inference as defined by different criteria (Joyce and Marjoram, 2008; Nunes and Balding, 2010). Finally, one may construct informative summary statistics using different dimension reduction techniques (Blum and François, 2010; Fearnhead and Prangle, 2012; Wegmann, Leuenberger, and Excoffier, 2009) or by transforming the existing summary statistics (Blum, 2010).

We use the priors for the gravity model parameters that are described in Section 3. Since the discrepancy term, δ, is always positive, we use an exponential(1) distribution as its prior. We use a uniform grid in the four-dimensional cube, each side of which is the interval [0, 2]. For each parameter, we use 20 different values on each axis of the cube; this grid size permits computationally expedient inference. Our analysis of simulated data sets also shows that 20 values per axis is sufficient for accurate inference.
In addition, for each point on the grid, the average distance over multiple forward simulations can be used instead of the distance calculated from a single simulation. This may be important when model realizations are highly variable. For the parameters of the gravity model, however, our inference was insensitive to the number of repetitions, because multiple realizations from the probability model varied very little for a given parameter setting. Therefore, it was much more important to use our computational resources for emulation across more parameter settings than to obtain repeated realizations at the same setting. Hence, we used one simulated time series at each location for each set of parameters in the four-dimensional cube.
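The grid design and the optional averaging over repeated simulations can be sketched as follows. With 20 values per axis on [0, 2] and four parameters, the grid has 20^4 = 160,000 points. The helper names and the `simulate_stat` callback are hypothetical placeholders for a forward simulation of the gravity model followed by computation of its summary statistics.

```python
import itertools
import numpy as np

def build_grid(n_per_axis=20, low=0.0, high=2.0, n_params=4):
    # Uniform grid on the cube [low, high]^n_params, one row per parameter setting.
    axis = np.linspace(low, high, n_per_axis)
    return np.array(list(itertools.product(axis, repeat=n_params)))

def averaged_distance(theta, observed_stat, simulate_stat, n_rep, rng):
    # Mean Euclidean distance to the observed summaries over n_rep simulations;
    # n_rep = 1 corresponds to using a single simulation per grid point.
    draws = [np.linalg.norm(simulate_stat(theta, rng) - observed_stat)
             for _ in range(n_rep)]
    return float(np.mean(draws))

def toy_stat(theta, rng):
    # Hypothetical simulator: the parameters themselves plus a little noise.
    return np.asarray(theta) + 0.05 * rng.standard_normal(len(theta))

grid = build_grid()                      # shape (160000, 4)
rng = np.random.default_rng(1)
observed = np.array([1.0, 0.6, 1.0, 1.0])
d_single = averaged_distance(grid[0], observed, toy_stat, 1, rng)
d_mean = averaged_distance(grid[0], observed, toy_stat, 5, rng)
```

When realizations vary little at a fixed parameter setting, as reported for the gravity model, `d_single` and `d_mean` are close, and the simulation budget is better spent on more grid points.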

5.2 Application to simulated data

In the simulated examples that follow, our goal is to compare inference based on the GP approach to inference from the traditional Bayes approach. In Figure 4, we show a simulated example where both the GP and traditional Bayes approaches yield the same inference, and another simulated example where the two approaches yield different answers. In both cases, the emulation-based approach provides inference that captures the true parameter values. In the first simulated data set, the true parameters are θ = 1, τ_1 = 0.6, τ_2 = 1 and ρ = 1. In Figure 4 (a), we overlay the 95% confidence regions obtained using the two methods. Both of these regions are found by assuming τ_1 = 0.6 and τ_2 = 1. We can see that for this example, both the solid (traditional Bayes) and dashed (GP emulator-based) regions contain the true values of θ and ρ. This shows that inference based on the GP emulator is as good as inference based on the traditional Bayes method.

To demonstrate that the new approach is better than the traditional Bayes approach, we choose a second set of values for the gravity parameters (θ = 0.71, τ_1 = 0.62, τ_2 = 1 and ρ = 1.5) for which we know inference based on the traditional Bayes approach to be poor (as in Figure 2). Figure 4 (b) shows how the 95% confidence region from the traditional Bayes method (outlined with a solid line) is shifted and does not contain the truth. The permissible region obtained using the GP emulator (outlined with a dashed line) has corrected the shift and contains the true values of the parameters.

We analyze the ability of the fitted gravity model to reproduce the key characteristics of the process at these new parameter estimates. Using estimates obtained via the GP emulator-based approach, (θ, τ_1, τ_2, ρ) = (0.71, 0.5, 0.5, 1.48), we generate a data set to obtain plots similar to the ones in Figure 3. The plots in Figure 5 (a)-(b) show that the model can now fit the maximums M and the proportions of zeros P very well. Comparing the plots in Figures 3 and 5, we can say that the new emulation-based approach improves the model fit substantially, while the traditional Bayes parameter estimates from the gridded MCMC fail to provide a model that captures the key epidemiological features of the data.

In order to study the effect of the discrepancy term in our approach, we also tried to infer the gravity parameters using the emulation-based model with δ = 0 (no discrepancy). The resulting 95% confidence regions were much wider and contained incorrect parameter settings, supporting the points made in Bayarri et al. (2007) about the importance of adding a discrepancy term to approximate models. We note, however, that these new confidence regions still contained the true parameter values in simulated examples and did not have the kinds of shifts seen in parameter inference using grid-based MCMC as in Section 3.1. This means that the problem when the true parameters of the model are

Figure 4: 95% C.I.s for (θ, ρ) obtained via different methods (assuming that τ_1 and τ_2 are known): Solid line shows the 95% region obtained using the traditional Bayes method. Dashed line outlines the 95% region obtained via GP emulator: (a) Both regions contain the true parameter values; (b) Region obtained by the GP emulator contains the true values of the parameters, while the traditional Bayes region does not.

Figure 5: Characteristics of simulated data at the parameters chosen to minimize the discrepancy between the data and the simulation: (a) Simulated M vs M from the data; (b) Simulated P vs P from the data.

not recovered by a likelihood-based approach is not related to the issue of accounting for model-data discrepancy. Continuing to explore the effect of the discrepancy term, we also tried a few different priors for δ; the exponential(1) prior for the discrepancy term worked very well, as was clear from the results. The posterior median of the discrepancy term was found to be around 2, which was close to the minimal distance between the simulated and the true vectors of summary statistics taken over all the points on the grid.

6 Results from application to measles data

We apply our emulation-based approach to inference for the gravity TSIR model to a well-known measles data set from the U.K. The purpose of this is twofold: to demonstrate the applicability of our approach to a real data set and to provide some insights into measles dynamics in the pre-vaccination era.

6.1 Description of measles data set

The following description of the data closely follows Xia et al. (2004). We analyze weekly case reports of measles for cities in England and Wales. The data are available for K = 952 locations in the pre-vaccination era from 1944 to 1965 and for K = 354 locations from 1966 to 1994 with information on vaccine coverage. The data represent an interesting case study of spatiotemporal epidemic dynamics (Grenfell et al., 2002) with a well-understood under-reporting rate of 40%-55% (Bjørnstad et al., 2002). Besides the under-reporting, the data are complete and reveal inter-annual outbreaks of infection. A critical feature of this data set is that, except for a few large cities, infection frequently goes locally extinct, so that overall persistence hinges on episodic reintroduction and spatial coupling. Before further analysis, we correct the reported data by a factor of 1/0.52, with 52% being the average reporting rate taken from previous analyses (Bjørnstad et al., 2002; Clarkson and Fine, 1985; Finkenstädt and Grenfell, 2000).
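The data preparation described here amounts to simple arithmetic, sketched below. The correction divides reported counts by the average reporting rate of 0.52. The aggregation of weekly reports into two-week steps by summing consecutive pairs of weeks is an assumption made for illustration, as are the function names; the source does not spell out how the aggregation was done.

```python
import numpy as np

REPORTING_RATE = 0.52  # average reporting rate quoted in the text

def correct_underreporting(reported, rate=REPORTING_RATE):
    # Scale reported counts by 1 / rate to approximate the true case counts.
    return np.asarray(reported, dtype=float) / rate

def to_biweekly(weekly):
    # Sum consecutive pairs of weeks (assumed aggregation); a trailing odd week is dropped.
    weekly = np.asarray(weekly, dtype=float)
    n = (weekly.shape[0] // 2) * 2
    return weekly[:n].reshape(n // 2, 2, *weekly.shape[1:]).sum(axis=1)

reported = np.array([0, 26, 52, 104])
corrected = correct_underreporting(reported)   # approximately [0, 50, 100, 200]
biweekly = to_biweekly(corrected)              # approximately [50, 300]
```

Because both steps are linear, the order in which they are applied does not matter.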
In addition, as in previous work, we use a timescale that represents the exposed and infectious period, which is known to be about 2 weeks for measles (Black, 1989). In the analysis of the data for the pre-vaccination era, following a standard assumption in the literature (see, for instance, Bjørnstad et al., 2002; Grenfell et al., 2002; Xia et al., 2004, and the references therein), the population sizes and per capita birth rates for all locations in this work are assumed to be approximately constant throughout the time period. These variables are taken as those in 1960 for each of the areas. This is a rough approximation, since most communities grew during the period we analyze. The force of infection is, therefore, on average slightly underestimated (overestimated) during the early (late) part of the study. In the analysis of the newer data for , the population sizes and per capita birth rates are allowed to vary as specified in the gravity model. We note that these assumptions are made for consistency with previous analyses and do not affect our inference or conclusions.

6.2 Some implications for measles dynamics

Important biological questions we want to answer based on these data are: (i) Do the gravity model parameters (and hence disease transmission) change for school holiday periods versus non-holiday periods? Do they change for different time periods (before and after vaccines against measles were available)? (ii) Do movement rates of infected people change in different time periods? In order to answer these questions, using our emulator-based approach, we first fit the model to the parts of the data corresponding to holiday and non-holiday periods. As demonstrated in our simulated examples in Sections 3.1 and 5.2, it is not possible to infer all the gravity model parameters at once. Hence, we set the parameters τ_1 and τ_2 equal to 1 and study the remaining key gravity model parameters θ and ρ. The resulting 95% confidence regions for θ and ρ are provided in Figure 6 (a). As can be seen from this figure, the two regions are almost identical, indicating that any change in the number of cases of measles between holidays and non-holidays is not due to a change in the way the infection spreads between cities of the metapopulation during these periods. Since the matrix M = {m_kj}, where $m_{kj} = \theta N_{kt}^{\tau_1} \sum_{t=1}^{T} (I_{jt})^{\tau_2} / d_{kj}^{\rho}$, is interpreted as a matrix of the amount of movement, the sum of the k-th row of M represents the amount of infected individuals leaving city k, while the sum of the k-th column is the number of infected people coming to city k.
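The row and column sums of the movement matrix can be computed directly once the parameters are fixed. The sketch below assumes the matrix has the form m_kj = θ N_k^τ1 Σ_t (I_jt)^τ2 / d_kj^ρ with population sizes held constant in time, and it sets the diagonal entries to zero because d_kk = 0. Both choices, as well as the function name and the toy inputs, are assumptions made for illustration.

```python
import numpy as np

def movement_matrix(theta, tau1, tau2, rho, N, I, d):
    # N: (K,) population sizes; I: (T, K) infected counts; d: (K, K) distances.
    N = np.asarray(N, dtype=float)
    I = np.asarray(I, dtype=float)
    d = np.asarray(d, dtype=float)
    infect = (I ** tau2).sum(axis=0)            # sum over time for each city j
    with np.errstate(divide="ignore"):
        decay = np.where(d > 0, d ** (-rho), 0.0)   # zero on the diagonal
    return theta * (N ** tau1)[:, None] * infect[None, :] * decay

N = [100, 10, 1]
I = np.ones((2, 3))
d = np.array([[0, 1, 2],
              [1, 0, 1],
              [2, 1, 0]])

M = movement_matrix(theta=1.0, tau1=1.0, tau2=1.0, rho=1.0, N=N, I=I, d=d)
row_sums = M.sum(axis=1)   # k-th entry: sum of the k-th row  -> [300, 40, 3]
col_sums = M.sum(axis=0)   # k-th entry: sum of the k-th column -> [21, 202, 120]
```

Evaluating `movement_matrix` at each posterior draw of (θ, ρ) and summarizing the resulting row or column sums by their median and 2.5% and 97.5% quantiles would give flux estimates of the kind reported in the tables.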
Using samples for θ and ρ, we easily obtain a sample for the spatial flux of infection for selected cities. In Table 1, we report our estimates with the corresponding credible regions based on this analysis. We use the posterior median as the point estimate. For example, we estimate the average number of emigrating infections during the holiday periods each week to be 31.1 for London. Below the estimate, we report its 95% credible interval, which is (4.4, 479.1). Based on these estimates, the mobility of the infection appears to be lower during holiday periods. Figure 6 (b) shows confidence regions obtained by the GP emulator-based approach by fitting the model to the data from and separately. From this figure,

Figure 6: 95% C.I. for (θ, ρ) obtained via fitting GP emulator to a part of the data: (a) Solid line outlines the confidence region for parameters when data from only holiday periods are used; Dashed line outlines the confidence region for parameters when data for only non-holiday periods are used; (b) Solid line outlines the confidence region for parameters when data for years from are used; Dashed line outlines the confidence region for parameters when data for years from are used.

Table 1: Estimated amount of average movement in two weeks

City          From                              To
              Holiday        Non-Holiday        Holiday        Non-Holiday
London        (4.4, 479.1)   (6.6, 744.7)       (4.6, 564.9)   (6.9, 823.9)
Birmingham    (1.2, 72.9)    (1.8, 110.6)       (1.2, 74.7)    (1.9, 115.8)
Manchester    (1.0, 151.4)   (1.4, 180.9)       (1.2, 162.9)   (1.5, 189.1)
Blackpool     (0.1, 6.7)     (0.2, 8.8)         (0.1, 5.2)     (0.1, 6.1)

Table 2: Estimated amount of average movement in two weeks

City          From                              To
London        (7.9, 488.4)   (4.4, 623.2)       (7.0, 591.5)   (4.9, 739.8)
Birmingham    (1.9, 75.1)    (1.2, 112.8)       (2.8, 93.7)    (1.2, 121.3)
Manchester    (2.1, 128.4)   (0.9, 176.7)       (1.9, 163.6)   (1.4, 193.1)
Blackpool     (0.3, 8.1)     (0.2, 9.7)         (0.1, 7.4)     (0.1, 7.1)

we conclude that the change in parameter values is statistically insignificant for these two different time periods. The important scientific implication of this result is that the introduction of vaccination in England and Wales in 1966 did not change the movement patterns of the infection between cities. This also means that any observed change in incidence rates of measles is due only to the effects of vaccination, not to a change in movement patterns in the vaccination era. Table 2 shows estimates of the average amount of transit infections each bi-week for years and . We see here that the infection appears to move less during the later years. We note that none of the differences are statistically significant. As a visual summary of this table for the time period with vaccination, in Figure 7, we plot histograms of the log-transformed estimated amount of average movement in two weeks for . From these plots, we can conclude that both the outgoing (Figure 7 (a)) and incoming (Figure 7 (b)) numbers of infections for most of the cities are very small.

Figure 8 displays graphs of networks of the movement of measles between cities in our data. These graphs are obtained using the movement matrix M and estimates of the gravity parameters from data for via the GP emulator-based approach. In Figure 8 (a), we plot the network of outgoing infections. In Figure 8 (b), we plot the network of incoming infections for cities of the metapopulation. Figure 8 (a) illustrates the importance of big cities in the dynamics of measles for smaller communities where the infection may become locally extinct.
From this figure, we see that the edges radiating from the populated cities reach the small cities, causing a re-introduction of the infection in these communities. This link between big and small cities does not seem to depend on distances between the cities. On the other hand, in Figure 8 (b), we see that the amount of incoming infections is mostly dependent on distances between cities, since edges connecting different cities in this graph are shorter relative to the edges of the graph in Figure 8 (a). This means that big cities are the only important factors in starting an outbreak in smaller cities, excluding the possibility of re-introduction of the disease from neighboring cities with small population sizes.

Figure 7: Histogram of estimated amount of average movement in two weeks for : (a) outgoing infections; (b) incoming infections.

7 Discussion

Complex models are very useful for representing physical phenomena, whether the phenomenon is the spread of an infectious disease or the change in sea surface temperatures in the Atlantic. As is well known, it is not always possible for every aspect of such complicated phenomena to be modeled accurately; certain key characteristics of the process necessarily have to be focal points of the modeling effort. However, these key characteristics are not typically the focus of a statistical inferential procedure that uses a traditional likelihood-based approach. The approach we have developed in this paper addresses this point by providing a flexible inferential method that directly takes into account the characteristics of the process that are most important to scientists. Even though focusing on different summary statistics can lead to different estimates, parameter inference based on our approach produces an improved model fit to the biologically interesting features of the infectious disease dynamics. In addition to the flexibility this provides, we find that our approach is also computationally tractable in

Figure 8: Movement networks of the infection: (a) network of outgoing infections; (b) network of incoming infections.


More information

Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach. Radford M. Neal, 28 February 2005

Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach. Radford M. Neal, 28 February 2005 Creating Non-Gaussian Processes from Gaussian Processes by the Log-Sum-Exp Approach Radford M. Neal, 28 February 2005 A Very Brief Review of Gaussian Processes A Gaussian process is a distribution over

More information

A short introduction to INLA and R-INLA

A short introduction to INLA and R-INLA A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

ABC methods for phase-type distributions with applications in insurance risk problems

ABC methods for phase-type distributions with applications in insurance risk problems ABC methods for phase-type with applications problems Concepcion Ausin, Department of Statistics, Universidad Carlos III de Madrid Joint work with: Pedro Galeano, Universidad Carlos III de Madrid Simon

More information

Bayesian model selection for computer model validation via mixture model estimation

Bayesian model selection for computer model validation via mixture model estimation Bayesian model selection for computer model validation via mixture model estimation Kaniav Kamary ATER, CNAM Joint work with É. Parent, P. Barbillon, M. Keller and N. Bousquet Outline Computer model validation

More information

Bayesian Econometrics

Bayesian Econometrics Bayesian Econometrics Christopher A. Sims Princeton University sims@princeton.edu September 20, 2016 Outline I. The difference between Bayesian and non-bayesian inference. II. Confidence sets and confidence

More information

Statistical challenges in Disease Ecology

Statistical challenges in Disease Ecology Statistical challenges in Disease Ecology Jennifer Hoeting Department of Statistics Colorado State University February 2018 Statistics rocks! Get thee to graduate school Colorado State University, Department

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia

More information

Tracking Measles Infection through Non-linear State Space Models

Tracking Measles Infection through Non-linear State Space Models Tracking Measles Infection through Non-linear State Space Models Shi Chen Department of Entomology, The Pennsylvania State University, University Park, PA 16802, USA. John Fricks Department of Statistics,

More information

New Insights into History Matching via Sequential Monte Carlo

New Insights into History Matching via Sequential Monte Carlo New Insights into History Matching via Sequential Monte Carlo Associate Professor Chris Drovandi School of Mathematical Sciences ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)

More information

Markov chain Monte Carlo methods for visual tracking

Markov chain Monte Carlo methods for visual tracking Markov chain Monte Carlo methods for visual tracking Ray Luo rluo@cory.eecs.berkeley.edu Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley, CA 94720

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Linear Dynamical Systems

Linear Dynamical Systems Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations

More information

Approximating Bayesian Posterior Means Using Multivariate Gaussian Quadrature

Approximating Bayesian Posterior Means Using Multivariate Gaussian Quadrature Approximating Bayesian Posterior Means Using Multivariate Gaussian Quadrature John A.L. Cranfield Paul V. Preckel Songquan Liu Presented at Western Agricultural Economics Association 1997 Annual Meeting

More information

STA 414/2104: Machine Learning

STA 414/2104: Machine Learning STA 414/2104: Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistics! rsalakhu@cs.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 9 Sequential Data So far

More information

Gaussian Processes for Computer Experiments

Gaussian Processes for Computer Experiments Gaussian Processes for Computer Experiments Jeremy Oakley School of Mathematics and Statistics, University of Sheffield www.jeremy-oakley.staff.shef.ac.uk 1 / 43 Computer models Computer model represented

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 2 Coding and output. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 2 Coding and output Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. General (Markov) epidemic model 2. Non-Markov epidemic model 3. Debugging

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Estimation of Operational Risk Capital Charge under Parameter Uncertainty Estimation of Operational Risk Capital Charge under Parameter Uncertainty Pavel V. Shevchenko Principal Research Scientist, CSIRO Mathematical and Information Sciences, Sydney, Locked Bag 17, North Ryde,

More information

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016

Log Gaussian Cox Processes. Chi Group Meeting February 23, 2016 Log Gaussian Cox Processes Chi Group Meeting February 23, 2016 Outline Typical motivating application Introduction to LGCP model Brief overview of inference Applications in my work just getting started

More information

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial William R. Gillespie Pharsight Corporation Cary, North Carolina, USA PAGE 2003 Verona,

More information

Dynamic System Identification using HDMR-Bayesian Technique

Dynamic System Identification using HDMR-Bayesian Technique Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Advanced Statistical Methods. Lecture 6

Advanced Statistical Methods. Lecture 6 Advanced Statistical Methods Lecture 6 Convergence distribution of M.-H. MCMC We denote the PDF estimated by the MCMC as. It has the property Convergence distribution After some time, the distribution

More information

Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment

Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment Ben Shaby SAMSI August 3, 2010 Ben Shaby (SAMSI) OFS adjustment August 3, 2010 1 / 29 Outline 1 Introduction 2 Spatial

More information

A Bayesian perspective on GMM and IV

A Bayesian perspective on GMM and IV A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University sims@princeton.edu November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all

More information

Fundamentals and Recent Developments in Approximate Bayesian Computation

Fundamentals and Recent Developments in Approximate Bayesian Computation Syst. Biol. 66(1):e66 e82, 2017 The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. This is an Open Access article distributed under the terms of

More information

PREDICTIVE MODELING OF CHOLERA OUTBREAKS IN BANGLADESH

PREDICTIVE MODELING OF CHOLERA OUTBREAKS IN BANGLADESH Submitted to the Annals of Applied Statistics arxiv: arxiv:1402.0536 PREDICTIVE MODELING OF CHOLERA OUTBREAKS IN BANGLADESH By Amanda A. Koepke, Ira M. Longini, Jr., M. Elizabeth Halloran, Jon Wakefield

More information

Optimization Tools in an Uncertain Environment

Optimization Tools in an Uncertain Environment Optimization Tools in an Uncertain Environment Michael C. Ferris University of Wisconsin, Madison Uncertainty Workshop, Chicago: July 21, 2008 Michael Ferris (University of Wisconsin) Stochastic optimization

More information

Approximate Bayesian computation for the parameters of PRISM programs

Approximate Bayesian computation for the parameters of PRISM programs Approximate Bayesian computation for the parameters of PRISM programs James Cussens Department of Computer Science & York Centre for Complex Systems Analysis University of York Heslington, York, YO10 5DD,

More information

Bayesian decision theory. Nuno Vasconcelos ECE Department, UCSD

Bayesian decision theory. Nuno Vasconcelos ECE Department, UCSD Bayesian decision theory Nuno Vasconcelos ECE Department, UCSD Notation the notation in DHS is quite sloppy e.g. show that ( error = ( error z ( z dz really not clear what this means we will use the following

More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1 Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data

More information

Introduction to SEIR Models

Introduction to SEIR Models Department of Epidemiology and Public Health Health Systems Research and Dynamical Modelling Unit Introduction to SEIR Models Nakul Chitnis Workshop on Mathematical Models of Climate Variability, Environmental

More information

1. Gaussian process emulator for principal components

1. Gaussian process emulator for principal components Supplement of Geosci. Model Dev., 7, 1933 1943, 2014 http://www.geosci-model-dev.net/7/1933/2014/ doi:10.5194/gmd-7-1933-2014-supplement Author(s) 2014. CC Attribution 3.0 License. Supplement of Probabilistic

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

The effect of emigration and immigration on the dynamics of a discrete-generation population

The effect of emigration and immigration on the dynamics of a discrete-generation population J. Biosci., Vol. 20. Number 3, June 1995, pp 397 407. Printed in India. The effect of emigration and immigration on the dynamics of a discrete-generation population G D RUXTON Biomathematics and Statistics

More information

ABC random forest for parameter estimation. Jean-Michel Marin

ABC random forest for parameter estimation. Jean-Michel Marin ABC random forest for parameter estimation Jean-Michel Marin Université de Montpellier Institut Montpelliérain Alexander Grothendieck (IMAG) Institut de Biologie Computationnelle (IBC) Labex Numev! joint

More information

A Spatial Model for Chronic Wasting Disease in Rocky Mountain Mule Deer

A Spatial Model for Chronic Wasting Disease in Rocky Mountain Mule Deer A Spatial Model for Chronic Wasting Disease in Rocky Mountain Mule Deer Christopher H. Mehl Craig J. Johns June 3, 23 Abstract Chronic wasting disease (CWD) causes damage to portions of the brain and nervous

More information

Divergence Based priors for the problem of hypothesis testing

Divergence Based priors for the problem of hypothesis testing Divergence Based priors for the problem of hypothesis testing gonzalo garcía-donato and susie Bayarri May 22, 2009 gonzalo garcía-donato and susie Bayarri () DB priors May 22, 2009 1 / 46 Jeffreys and

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, )

SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, ) Econometrica Supplementary Material SUPPLEMENT TO MARKET ENTRY COSTS, PRODUCER HETEROGENEITY, AND EXPORT DYNAMICS (Econometrica, Vol. 75, No. 3, May 2007, 653 710) BY SANGHAMITRA DAS, MARK ROBERTS, AND

More information

Machine Learning Techniques for Computer Vision

Machine Learning Techniques for Computer Vision Machine Learning Techniques for Computer Vision Part 2: Unsupervised Learning Microsoft Research Cambridge x 3 1 0.5 0.2 0 0.5 0.3 0 0.5 1 ECCV 2004, Prague x 2 x 1 Overview of Part 2 Mixture models EM

More information

Learning Gaussian Process Models from Uncertain Data

Learning Gaussian Process Models from Uncertain Data Learning Gaussian Process Models from Uncertain Data Patrick Dallaire, Camille Besse, and Brahim Chaib-draa DAMAS Laboratory, Computer Science & Software Engineering Department, Laval University, Canada

More information

Computer Emulation With Density Estimation

Computer Emulation With Density Estimation Computer Emulation With Density Estimation Jake Coleman, Robert Wolpert May 8, 2017 Jake Coleman, Robert Wolpert Emulation and Density Estimation May 8, 2017 1 / 17 Computer Emulation Motivation Expensive

More information

Point process with spatio-temporal heterogeneity

Point process with spatio-temporal heterogeneity Point process with spatio-temporal heterogeneity Jony Arrais Pinto Jr Universidade Federal Fluminense Universidade Federal do Rio de Janeiro PASI June 24, 2014 * - Joint work with Dani Gamerman and Marina

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle Statistical Analysis of Spatio-temporal Point Process Data Peter J Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University School of Public Health

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Basic Sampling Methods

Basic Sampling Methods Basic Sampling Methods Sargur Srihari srihari@cedar.buffalo.edu 1 1. Motivation Topics Intractability in ML How sampling can help 2. Ancestral Sampling Using BNs 3. Transforming a Uniform Distribution

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

McGill University. Department of Epidemiology and Biostatistics. Bayesian Analysis for the Health Sciences. Course EPIB-675.

McGill University. Department of Epidemiology and Biostatistics. Bayesian Analysis for the Health Sciences. Course EPIB-675. McGill University Department of Epidemiology and Biostatistics Bayesian Analysis for the Health Sciences Course EPIB-675 Lawrence Joseph Bayesian Analysis for the Health Sciences EPIB-675 3 credits Instructor:

More information

Spatial Heterogeneity in Epidemic Models

Spatial Heterogeneity in Epidemic Models J. theor. Biol. (1996) 179, 1 11 Spatial Heterogeneity in Epidemic Models ALUN L. LLOYD AND ROBERT M. MAY University of Oxford, Department of Zoology, South Parks Road, Oxford OX1 3PS, U.K. (Received on

More information

The Jackknife-Like Method for Assessing Uncertainty of Point Estimates for Bayesian Estimation in a Finite Gaussian Mixture Model

The Jackknife-Like Method for Assessing Uncertainty of Point Estimates for Bayesian Estimation in a Finite Gaussian Mixture Model Thai Journal of Mathematics : 45 58 Special Issue: Annual Meeting in Mathematics 207 http://thaijmath.in.cmu.ac.th ISSN 686-0209 The Jackknife-Like Method for Assessing Uncertainty of Point Estimates for

More information

MCMC Sampling for Bayesian Inference using L1-type Priors

MCMC Sampling for Bayesian Inference using L1-type Priors MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling

More information

Large Scale Bayesian Inference

Large Scale Bayesian Inference Large Scale Bayesian I in Cosmology Jens Jasche Garching, 11 September 2012 Introduction Cosmography 3D density and velocity fields Power-spectra, bi-spectra Dark Energy, Dark Matter, Gravity Cosmological

More information

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Environmentrics 00, 1 12 DOI: 10.1002/env.XXXX Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Regina Wu a and Cari G. Kaufman a Summary: Fitting a Bayesian model to spatial

More information