MCMC 2: Lecture 3 SIR models - more topics. Phil O'Neill and Theo Kypraios, School of Mathematical Sciences, University of Nottingham


Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

1. What can be estimated? As with any attempt to fit a model to data, it is always important to think about how informative the data are about the model parameters which one is trying to estimate. In some settings it is obvious what can or cannot be estimated; in other settings it can be much less obvious.

1. What can be estimated? Example: Latent periods Consider the Markov SIR model with latent periods, say of fixed but unknown length c. Thus when an individual is infected, they must wait c days until they become infectious. Latent individuals are called exposed.

1. What can be estimated? Example: Latent periods Thus we have an SEIR model in which the exposed period lasts exactly c time units: S → E → I → R. An individual enters E when infected, becomes infectious on entering I after c time units, and is finally removed (R).

1. What can be estimated? Example: Latent periods Introduce exposure times (i.e. the times at which individuals become infected) e_1, e_2, ..., e_n, where e_k = i_k − c, and define e = (e_2, ..., e_n). As before, r = (r_1, r_2, ..., r_n) is observed. Define i = (i_1, i_2, ..., i_n).

1. What can be estimated? Example: Latent periods

π(i, r, e | β, γ, c, e_1) = { ∏_{k=2}^{n} (β/N) I_{e_k−} } γ^n exp( −∫_{e_1}^{r_n} [ (β/N) S_t I_t + γ I_t ] dt ) × 1{ i_k = e_k + c, k = 1, 2, ..., n }

1. What can be estimated? Example: Latent periods It would be straightforward to adapt the standard MCMC algorithm to include c as an extra parameter - e.g. using M-H updates for c. However such an algorithm would be uninformative about c given removal data alone.

1. What can be estimated? Example: Latent periods Roughly speaking, for any value of c, the infection rate β would be estimated accordingly - large c means large β and small c means small β. So although the MCMC algorithm is correct, the output would need to be carefully interpreted. Here we would see high posterior correlation between c and β.

1. What can be estimated? Example: Latent periods In practice, a better strategy would be to fix c at certain (biologically reasonable) values and then perform estimation for β and γ.

1. What can be estimated? Example: Gamma infectious periods A common generalisation of the Markov SIR model is to have Gamma-distributed infectious periods. Thus each infective remains so for a period of time T_I, where T_I ~ Γ(c, d), say (c = shape, d = rate). Note E(T_I) = c/d.

1. What can be estimated? Example: Gamma infectious periods As seen in lecture 2, the likelihood is

π(i, r | β, c, d, a, i_a) = { ∏_{j≠a} (β/N) I_{i_j−} } exp( −∫_{i_a}^{r_b} (β/N) S_t I_t dt ) ∏_{j=1}^{n} f(r_j − i_j | c, d),

where a labels the initial infective, r_b is the final removal time, and f(x | c, d) = x^{c−1} d^c exp(−dx) / Γ(c) is the p.d.f. of T_I.

1. What can be estimated? Example: Gamma infectious periods The two parameters c, d can be updated in an MCMC algorithm. It is not immediately obvious whether it is possible to estimate both parameters separately from removal data. One might expect E(T_I) = c/d to be estimated with reasonable precision.
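For concreteness, one such update might look as follows in Python. This is a hedged sketch rather than code from the course: it assumes the augmented infectious periods x_k = r_k − i_k are available from the current state of the chain, takes independent Exp(1) priors on c and d, and uses a Gaussian random walk on (log c, log d); the name update_cd and the step size are illustrative.

```python
import numpy as np
from scipy.stats import gamma

def update_cd(x, c, d, rng, step=0.1):
    """One M-H update of (c, d) given the current infectious periods
    x_k = r_k - i_k, using a random walk on (log c, log d).
    Priors c, d ~ Exp(1) are an assumption of this sketch."""
    def log_target(c_, d_):
        # sum_k log f(x_k | c_, d_) + log prior (up to a constant)
        return gamma.logpdf(x, a=c_, scale=1.0 / d_).sum() - c_ - d_

    c_new = c * np.exp(step * rng.standard_normal())
    d_new = d * np.exp(step * rng.standard_normal())
    # log acceptance probability; log(c_new d_new / (c d)) is the
    # Jacobian term for proposing on the log scale
    log_alpha = (log_target(c_new, d_new) - log_target(c, d)
                 + np.log((c_new * d_new) / (c * d)))
    if np.log(rng.uniform()) < log_alpha:
        return c_new, d_new
    return c, d
```

Tracking c/d along the chain then gives posterior samples of E(T_I), which is typically far better identified by removal data than c and d individually.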

1. What can be estimated? Example: Data for Markov SIR model Another important aspect of estimation is the detail of the data. For example, suppose we have observations (= removal times) in a population of N=100 susceptibles, of whom n become infected. Clearly if n=0, no inference can be drawn. But what if n=1? n=10? n=100?

Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

2. Reparameterisation Reminder: MCMC theory and practice A common cause of poor MCMC mixing is correlation, i.e. when two (or more) parameters in the target density are highly correlated. This problem usually means it is hard to make good moves by updating the correlated parameters separately.

2. Reparameterisation Example: Gamma infectious periods As described above, consider the SIR model with Γ(c, d) infectious periods. Since the mean infectious period is c/d, c and d will be positively correlated in the posterior. It therefore makes sense to consider a reparameterisation to (m, v), where m = mean and v = variance of T_I.
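For reference, the map between the two parameterisations follows directly from the mean and variance of the Γ(c, d) distribution:

```latex
m = \frac{c}{d}, \qquad v = \frac{c}{d^{2}}
\quad\Longrightarrow\quad
d = \frac{m}{v}, \qquad c = \frac{m^{2}}{v}.
```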

2. Reparameterisation Example: Gamma infectious periods Note that this makes the corresponding part of the likelihood more complex:

∏_{j=1}^{n} f(r_j − i_j | c, d) = ∏_{j=1}^{n} d^c (r_j − i_j)^{c−1} exp(−d(r_j − i_j)) / Γ(c)

becomes

∏_{j=1}^{n} f(r_j − i_j | m, v) = ∏_{j=1}^{n} (m/v)^{m²/v} (r_j − i_j)^{(m²/v)−1} exp(−(m/v)(r_j − i_j)) / Γ(m²/v)
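In practice the reparameterised density need not be coded by hand; for example, in Python (values illustrative):

```python
from scipy.stats import gamma

# Gamma density with mean m and variance v:
# shape c = m**2 / v, rate d = m / v, i.e. scale = v / m
x, m, v = 4.0, 5.0, 2.5
logf = gamma.logpdf(x, a=m**2 / v, scale=v / m)
```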

2. Reparameterisation Example: Gamma infectious periods In practice it is often the case that there is a trade-off between elegance and efficiency of MCMC algorithms.

2. Reparameterisation Example: Non-centering in Markov SIR model For the Markov SIR model with n (= number infected) large, it can be hard to move quickly around the space of all possible infection times. A key difficulty is that the infectious period rate parameter, γ, affects the likelihood of proposed new infection times.

2. Reparameterisation Example: Non-centering in Markov SIR model e.g. if γ is large, so that the mean infectious period is small, proposed updates that give large infectious periods will often be rejected. One way around this is to try and update the infection times and γ simultaneously (i.e. block updating). An alternative is to reparameterise to break the correlation: this is non-centering.

2. Reparameterisation Example: Non-centering in Markov SIR model Aside: the term non-centering arises via the following graphical model representation: γ → i → r. Here, (γ, i) is a centered parameterisation.

2. Reparameterisation Example: Non-centering in Markov SIR model If the data r are relatively uninformative about i, then the dependence between γ and i (in the graph γ → i → r) can give poor MCMC mixing.

2. Reparameterisation Example: Non-centering in Markov SIR model A natural alternative is to find a non-centered parameterisation u such that γ and u are a priori independent: γ → r ← u.

2. Reparameterisation Example: Non-centering in Markov SIR model Specifically: observe that if X ~ Exp(a) then we can write X as X = (1/a) Y, where Y ~ Exp(1).

2. Reparameterisation Example: Non-centering in Markov SIR model Proof: P((1/a)Y > t) = P(Y > at) = exp(−at) = P(X > t), since X ~ Exp(a).
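A quick Monte Carlo check of this scaling property (values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.5
y = rng.exponential(1.0, size=200_000)   # Y ~ Exp(1)
x = y / a                                # claim: X = Y / a ~ Exp(a)
print(x.mean(), 1 / a)                   # both approximately 0.4
print((x > 1.0).mean(), np.exp(-a))      # empirical vs exact P(X > 1)
```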

2. Reparameterisation Example: Non-centering in Markov SIR model Now in the model, the infectious period of individual k is Exp(γ). As for the non-Markov case, adopt the labelling where i_k corresponds to r_k. Thus r_k − i_k ~ Exp(γ) = (1/γ) Exp(1) = (1/γ) U_k, say.

2. Reparameterisation Example: Non-centering in Markov SIR model This suggests a reparameterisation in which the infection times i_1, i_2, ..., i_n are replaced by U_1, U_2, ..., U_n. Note that i_k can be recovered from U_k and γ: r_k − i_k = (1/γ) U_k, so i_k = r_k − (1/γ) U_k.

2. Reparameterisation Example: Non-centering in Markov SIR model Note now that if γ is updated, then all of the infection times are simultaneously updated. This allows for faster mixing around the space of possible infection times.
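A small illustration: holding u fixed, changing γ rescales every infectious period and hence moves every infection time at once (the removal times below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
u = rng.exponential(1.0, size=n)          # U_k ~ Exp(1), held fixed
r = np.sort(rng.uniform(5.0, 15.0, n))    # illustrative removal times
for gam in (0.5, 1.0, 2.0):
    i = r - u / gam                       # i_k = r_k - (1/gamma) U_k
    print(gam, np.round(i[:3], 2))        # all infection times shift together
```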

2. Reparameterisation Example: Non-centering in Markov SIR model Consider now the likelihood in the new parameterisation. We are making the change of variables u_k = γ(r_k − i_k), k = 1, ..., n.

2. Reparameterisation Example: Non-centering in Markov SIR model To evaluate the likelihood we simply write i_k as a function of u_k in the current likelihood, and multiply by the Jacobian of the transformation, |J| = |∂(i_1, i_2, ..., i_n) / ∂(u_1, u_2, ..., u_n)| = γ^{−n} (since ∂i_k/∂u_k = −1/γ for each k).

2. Reparameterisation Example: Non-centering in Markov SIR model When we do this, the infection part is essentially unchanged:

π ∝ { ∏_{j≠a} (β/N) I_{i_j−} } exp( −∫_{i_a}^{r_b} (β/N) S_t I_t dt ) × ∏_{k=1}^{n} f(r_k − i_k)

The first two factors (the infection part) are unaffected by the reparameterisation, beyond substituting i_k = r_k − γ^{−1} u_k.

2. Reparameterisation Example: Non-centering in Markov SIR model Conversely, the remaining product term, ∏_{k=1}^{n} f(r_k − i_k), simplifies.

2. Reparameterisation Example: Non-centering in Markov SIR model Specifically, f(r_k − i_k) = γ exp(−γ(r_k − i_k)) = γ exp(−γ(r_k − (r_k − γ^{−1} u_k))) = γ exp(−u_k), and including the Jacobian γ^{−n} means that the whole product becomes exp(−∑_{k=1}^{n} u_k).
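A numerical sanity check of this cancellation (values illustrative): the product of the n exponential densities, multiplied by the Jacobian γ^{−n}, equals exp(−∑ u_k).

```python
import numpy as np

rng = np.random.default_rng(2)
n, gam = 5, 0.7
u = rng.exponential(1.0, size=n)
x = u / gam                                         # infectious periods r_k - i_k
lhs = np.prod(gam * np.exp(-gam * x)) * gam**(-n)   # densities times Jacobian
rhs = np.exp(-u.sum())
print(np.isclose(lhs, rhs))                         # True
```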

2. Reparameterisation Example: Non-centering in Markov SIR model MCMC algorithm (a sketch follows below):
- Update β using its Gamma full conditional distribution, as before.
- Update γ using a Metropolis-Hastings step (e.g. Gaussian random walk). Note that when γ is updated, so are the infection times.
- Update the u_k's using M-H steps (e.g. propose u_k from an Exp(1) distribution).
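Below is a minimal runnable sketch of this algorithm in Python, for removal data r (assumed distinct) from a Markov SIR model in a population of size N. It is an illustration rather than the authors' code: the Γ(1,1) = Exp(1) priors on β and γ, the proposal scale sigma, the starting values, and the function names infection_loglik and mcmc_noncentered are all assumptions, and the initial infection time is simply conditioned on throughout.

```python
import numpy as np

def infection_loglik(i, r, N, beta):
    """Infection part of the Markov SIR likelihood: the sum over non-initial
    infections j of log{(beta/N) I_{i_j-}}, minus (beta/N) * int S_t I_t dt.
    Returns (log-likelihood, integral of S_t I_t)."""
    n, a = len(i), int(np.argmin(i))
    loglik = 0.0
    for j in range(n):
        if j == a:
            continue
        I_minus = int(np.sum((i < i[j]) & (r >= i[j])))  # infectives at i_j-
        if I_minus == 0:
            return -np.inf, 0.0          # proposed configuration is impossible
        loglik += np.log(beta / N * I_minus)
    # integral of S_t I_t, accumulated between successive event times
    events = sorted([(t, 0) for t in i] + [(t, 1) for t in r])
    S, I, integral, t_prev = N, 0, 0.0, events[0][0]
    for t, kind in events:
        integral += S * I * (t - t_prev)
        t_prev = t
        if kind == 0:
            S -= 1; I += 1               # infection
        else:
            I -= 1                       # removal
    return loglik - (beta / N) * integral, integral

def mcmc_noncentered(r, N, n_iter, rng, sigma=0.1):
    """Non-centered MCMC sketch; Gamma(1,1) priors on beta and gamma assumed."""
    n = len(r)
    beta = 1.0
    u = np.ones(n)                       # U_k; start from a valid state by
    gam = 1.0 / (r.max() - r.min() + 1)  # making infectious periods long
    out = []
    for _ in range(n_iter):
        i = r - u / gam
        # 1. beta: Gamma full conditional, shape 1 + (n-1), rate 1 + int S I / N
        _, SI = infection_loglik(i, r, N, beta)
        beta = rng.gamma(shape=1.0 + (n - 1), scale=1.0 / (1.0 + SI / N))
        # 2. gamma: Gaussian random walk; note this moves every infection time
        gam_new = gam + sigma * rng.standard_normal()
        if gam_new > 0:
            ll_new, _ = infection_loglik(r - u / gam_new, r, N, beta)
            ll_old, _ = infection_loglik(i, r, N, beta)
            # -(gam_new - gam) is the Exp(1) prior ratio for gamma
            if np.log(rng.uniform()) < ll_new - ll_old - (gam_new - gam):
                gam, i = gam_new, r - u / gam_new
        # 3. u_k's: independence proposals from Exp(1); the exp(-u_k) factors
        # in target and proposal cancel, leaving the likelihood ratio
        for k in range(n):
            u_prop = u.copy()
            u_prop[k] = rng.exponential(1.0)
            ll_new, _ = infection_loglik(r - u_prop / gam, r, N, beta)
            ll_old, _ = infection_loglik(i, r, N, beta)
            if np.log(rng.uniform()) < ll_new - ll_old:
                u, i = u_prop, r - u_prop / gam
        out.append((beta, gam))
    return np.array(out)
```

Posterior samples of the infection times can be read off as i = r − u/γ at each stored iteration; recomputing ll_old at every step is wasteful but keeps the sketch short.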

Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

3. Marginalisation Suppose we have an MCMC algorithm with target density π(θ_1, θ_2, ..., θ_n). The idea of marginalisation is to instead run MCMC on a marginal target density, e.g. π(θ_1, θ_2). This is achieved by integrating out the other parameters.

3. Marginalisation The obvious reason to do this is that the resulting MCMC chain has fewer parameters to update. Note that samples from the integrated-out parameters can be recovered from the output of the reduced chain.

3. Marginalisation Example: Markov SIR model Recall that

π(i, r | β, γ, i_1) = { ∏_{j=2}^{n} (β/N) I_{i_j−} } γ^n exp( −∫_{i_1}^{r_n} [ (β/N) S_t I_t + γ I_t ] dt )

The standard MCMC algorithm updates β, γ, and the infection times i.

3. Marginalisation Example: Markov SIR model However, it is possible to integrate out β and γ, as follows:

π(i, r | i_1) = ∫∫ π(i, r | β, γ, i_1) π(β, γ) dβ dγ,

where π(β, γ) is the joint prior density of (β, γ). The standard choice is π(β, γ) = π(β) π(γ).

3. Marginalisation Example: Markov SIR model Specifically, suppose β ~ Γ(m, λ) a priori. Then

∫_{(0,∞)} π(i, r | β, γ, i_1) π(β) dβ ∝ ∫_{(0,∞)} β^{(n−1)+m−1} exp(−βA) dβ,

where A = λ + ∫ S_t I_t / N dt.

3. Marginalisation Example: Markov SIR model However, recalling that ∫_{(0,∞)} y^{n−1} exp(−by) dy = Γ(n) b^{−n}, we have

∫_{(0,∞)} β^{(n−1)+m−1} exp(−βA) dβ = Γ(n+m−1) A^{−(n+m−1)} ∝ A^{−(n+m−1)}
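A quick numerical check of this Gamma-integral identity, with illustrative values n = 5, b = 2.3:

```python
import numpy as np
from math import gamma as Gamma
from scipy.integrate import quad

n, b = 5, 2.3
val, _ = quad(lambda y: y**(n - 1) * np.exp(-b * y), 0, np.inf)
print(val, Gamma(n) * b**(-n))   # both approximately 0.3729
```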

3. Marginalisation Example: Markov SIR model The parameter γ can be integrated out in a similar fashion. We can thus run an MCMC algorithm on the target density π(i | r, i_1) ∝ π(i, r | i_1). Note that only updates of i are required.

3. Marginalisation Example: Markov SIR model To recover samples from the marginal posterior distribution of β, simply note that β | i, r, γ, i_1 ~ Γ(m+n−1, A). Each sample of i yields a sample of A, from which a sample of β can be obtained.
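In code this recovery is one draw per stored iteration. A sketch, assuming a Γ(m, λ) prior for β; in practice the values of A = λ + ∫ S_t I_t / N dt would be computed from the stored samples of i, so the A_samples below are placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 1.0, 20                                 # prior shape, number infected
A_samples = rng.uniform(8.0, 12.0, size=1000)  # placeholders for the real A's
# beta | i, r, gamma, i_1 ~ Gamma(m + n - 1, rate A): one draw per stored A
beta_samples = rng.gamma(shape=m + n - 1, scale=1.0 / A_samples)
```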

References Neal, P. and Roberts, G.O. (2005) A case study in non-centering for data augmentation: stochastic epidemics. Statistics and Computing 15, 315-327.