Stochastic modelling of epidemic spread

Julien Arino
Centre for Research on Inner City Health, St Michael's Hospital, Toronto
On leave from Department of Mathematics, University of Manitoba
Julien Arino@umanitoba.ca
5 March 2014

1 Introduction
2 Stochastic processes
3 The SIS model used as an example
4 DTMC
5 CTMC
6 ODE to CTMC

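Deterministic SIR

Use the standard KMK (Kermack-McKendrick) model
\[
\begin{aligned}
S' &= -\beta S I \\
I' &= \beta S I - \alpha I \\
R' &= \alpha I
\end{aligned}
\]
Basic reproduction number
\[
\mathcal{R}_0 = \frac{\beta}{\alpha} S_0
\]
- If $\mathcal{R}_0 < 1$, no outbreak
- If $\mathcal{R}_0 > 1$, outbreak

Use $\alpha = 1/4$, $N = S + I + R = 100$, $I(0) = 1$, $\mathcal{R}_0 \in \{0.5, 1.5\}$, and $\beta$ chosen so that $\mathcal{R}_0$ takes these values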

[Figure: solutions of the deterministic SIR model for $\mathcal{R}_0 = 0.5$ and $\mathcal{R}_0 = 1.5$. Panels show S, I and R against time (days), and the number infectious against time (days).]

Stochastic DTMC SIR

Consider $S(t)$ and $I(t)$ (with $R(t) = N - S(t) - I(t)$). Define transition probabilities
\[
p_{(s',i'),(s,i)}(\Delta t) = P\{(S(t+\Delta t), I(t+\Delta t)) = (s+s', i+i') \mid (S(t), I(t)) = (s,i)\}
\]
by
\[
p_{(s',i'),(s,i)}(\Delta t) =
\begin{cases}
\beta s i \,\Delta t & (s',i') = (-1, 1) \\
\alpha i \,\Delta t & (s',i') = (0, -1) \\
1 - [\beta s i + \alpha i]\,\Delta t & (s',i') = (0, 0) \\
0 & \text{otherwise}
\end{cases}
\]
Parameters as previously and $\Delta t = 0.01$
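A minimal sketch of how one realization of this DTMC could be simulated, assuming the parameter values quoted above ($\alpha = 1/4$, $N = 100$, $I(0) = 1$, $\Delta t = 0.01$) and $\beta = \mathcal{R}_0 \alpha / S(0)$. The function names, the 150-day horizon and the seed are illustrative choices, not taken from the slides.

```python
import random

def simulate_dtmc_sir(R0, alpha=0.25, N=100, I0=1, dt=0.01, t_end=150.0, rng=None):
    """Return the values of I along one realization, one entry per time step."""
    rng = rng or random.Random()
    S, I = N - I0, I0
    beta = R0 * alpha / S              # assumption: R0 = beta * S(0) / alpha
    path = [I]
    for _ in range(int(round(t_end / dt))):
        p_inf = beta * S * I * dt      # (s', i') = (-1, +1): new infection
        p_rec = alpha * I * dt         # (s', i') = (0, -1): recovery
        u = rng.random()
        if u < p_inf:
            S, I = S - 1, I + 1
        elif u < p_inf + p_rec:
            I -= 1
        # otherwise nothing happens, with probability 1 - (p_inf + p_rec)
        path.append(I)
    return path

def had_outbreak(path):
    """Outbreak criterion used in the slides: I(t) > I(0) for some t."""
    return max(path) > path[0]

if __name__ == "__main__":
    rng = random.Random(2014)          # arbitrary seed, for reproducibility only
    for R0 in (0.5, 1.5):
        n_out = sum(had_outbreak(simulate_dtmc_sir(R0, rng=rng)) for _ in range(50))
        print(f"R0 = {R0}: {n_out}/50 realizations with an outbreak")
```

With these values both event probabilities stay well below 1 at every step, which is what makes the "at most one event per step" approximation usable.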

[Figure: simulations of the DTMC SIR model for $\mathcal{R}_0 = 0.5$ and $\mathcal{R}_0 = 1.5$. Axes: number infectious against time (days).]

Number of outbreaks

[Figure: number of simulations with an outbreak (out of 1000) against the value of $I_0$ (from 0 to 100), for $\mathcal{R}_0 = 0.5$ and $\mathcal{R}_0 = 1.5$.]

Outbreak: $\exists t$ s.t. $I(t) > I(0)$

Why stochastic models?

Important to choose the right type of model for a given application

Deterministic models
- Mathematically easier
- Perfect reproducibility
- For a given parameter set, one simulation suffices

Stochastic models
- Often harder
- Each realization is different
- Need many simulations

In the early stage of an epidemic, there are very few infective individuals:
- Deterministic systems: behaviour entirely governed by parameters
- Stochastic systems: allow chance to play a role

In the context of the SIR model

Deterministic
- $\mathcal{R}_0$ is a strict threshold
- $\mathcal{R}_0 < 1 \implies I(t) \to 0$ monotonically
- $\mathcal{R}_0 > 1 \implies I(t) \to 0$ after a bump (outbreak)

Stochastic
- $\mathcal{R}_0$ is a threshold for the mean
- $\mathcal{R}_0 < 1 \implies I(t) \to 0$ monotonically (roughly) on average, but some realizations have outbreaks
- $\mathcal{R}_0 > 1 \implies I(t) \to 0$ after a bump on average (outbreak), but some realizations have no outbreaks

2 Stochastic processes

Theoretical setting of stochastic processes

- Stochastic processes form a well defined area of probability, a subset of measure theory, itself a well established area of mathematics
- Strong theoretical content
- Very briefly presented here for completeness; will not be used in what follows
- Used here: A. Klenke, Probability Theory, Springer 2008

DON'T PANIC!!!

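σ-algebras

Definition 1 (σ-algebra)
Let $\Omega$ be a set and $\mathcal{A} \subset 2^\Omega$, the set of all subsets of $\Omega$, be a class of subsets of $\Omega$. $\mathcal{A}$ is called a σ-algebra if:
1. $\Omega \in \mathcal{A}$
2. $\mathcal{A}$ is closed under complements, i.e., $A^c := \Omega \setminus A \in \mathcal{A}$ for any $A \in \mathcal{A}$
3. $\mathcal{A}$ is closed under countable unions, i.e., $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}$ for any choice of countably many sets $A_1, A_2, \ldots \in \mathcal{A}$

$\Omega$ is the space of all elementary events and $\mathcal{A}$ the system of observable events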

Probability space

Definition 2
1. A pair $(\Omega, \mathcal{A})$, with $\Omega$ a nonempty set and $\mathcal{A} \subset 2^\Omega$ a σ-algebra, is a measurable space, with sets $A \in \mathcal{A}$ the measurable sets. If $\Omega$ is at most countably infinite and if $\mathcal{A} = 2^\Omega$, then the measurable space $(\Omega, 2^\Omega)$ is discrete
2. A triple $(\Omega, \mathcal{A}, \mu)$ is a measure space if $(\Omega, \mathcal{A})$ is a measurable space and $\mu$ is a measure on $\mathcal{A}$
3. If, in addition, $\mu(\Omega) = 1$, then $(\Omega, \mathcal{A}, \mu)$ is a probability space and the sets $A \in \mathcal{A}$ are events. $\mu$ is then usually denoted $P$ (or Prob)

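Topological spaces and Borel σ-algebras

Definition 3 (Topology)
Let $\Omega$ be an arbitrary set. A class of sets $\tau \subset 2^\Omega$ is a topology on $\Omega$ if:
1. $\emptyset, \Omega \in \tau$
2. $A \cap B \in \tau$ for any $A, B \in \tau$
3. $\left(\bigcup_{A \in \mathcal{F}} A\right) \in \tau$ for any $\mathcal{F} \subset \tau$

$(\Omega, \tau)$ is a topological space, sets $A \in \tau$ are open and sets $A \subset \Omega$ with $A^c \in \tau$ are closed

Definition 4 (Borel σ-algebra)
Let $(\Omega, \tau)$ be a topological space. The σ-algebra $\mathcal{B}(\Omega, \tau)$ generated by the open sets is the Borel σ-algebra on $\Omega$, with elements $A \in \mathcal{B}(\Omega, \tau)$ the Borel sets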

Polish spaces

Definition 5
A separable topological space whose topology is induced by a complete metric is a Polish space

$\mathbb{R}^d$, $\mathbb{Z}^d$, $\mathbb{R}^{\mathbb{N}}$ and $(C([0,1]), \|\cdot\|_\infty)$ are Polish spaces

Stochastic process

Definition 6 (Stochastic process)
Let $(\Omega, \mathcal{F}, P)$ be a probability space, $(E, \tau)$ be a Polish space with Borel σ-algebra $\mathcal{E}$ and $T \subset \mathbb{R}$ be arbitrary (typically, $T = \mathbb{N}$, $T = \mathbb{Z}$, $T = \mathbb{R}_+$ or $T$ an interval). A family of random variables $X = (X_t, t \in T)$ on $(\Omega, \mathcal{F}, P)$ with values in $(E, \mathcal{E})$ is called a stochastic process with index set (or time set) $T$ and range $E$.

Types of processes

So a stochastic process is a collection of random variables (r.v.)
\[
\{X(t, \omega),\ t \in T,\ \omega \in \Omega\}
\]
The index set $T$ and the r.v. can be discrete or continuous
- A discrete time Markov chain (DTMC) has $T$ and $X$ discrete (e.g., $T = \mathbb{N}$ and $X(t) \in \mathbb{N}$)
- A continuous time Markov chain (CTMC) has $T$ continuous and $X$ discrete (e.g., $T = \mathbb{R}_+$ and $X(t) \in \mathbb{N}$)

In this course, we focus on these two. There also exist stochastic differential equations (SDE), with both $T$ and $X(t)$ continuous

Markov property

Definition 7 (Markov property)
Let $E$ be a Polish space with Borel σ-algebra $\mathcal{B}(E)$, $T \subset \mathbb{R}$ and $(X_t)_{t \in T}$ an $E$-valued stochastic process. Assume that $(\mathcal{F}_t)_{t \in T} = \sigma(X)$ is the filtration generated by $X$. $X$ has the Markov property if for every $A \in \mathcal{B}(E)$ and all $s, t \in T$ with $s \leq t$,
\[
P[X_t \in A \mid \mathcal{F}_s] = P[X_t \in A \mid X_s]
\]

Theorem 8 (The usable one)
If $E$ is countable then $X$ has the Markov property if for all $n \in \mathbb{N}$, all $s_1 < \cdots < s_n < t$ and all $i_1, \ldots, i_n, i \in E$ with $P[X_{s_1} = i_1, \ldots, X_{s_n} = i_n] > 0$, there holds
\[
P[X_t = i \mid X_{s_1} = i_1, \ldots, X_{s_n} = i_n] = P[X_t = i \mid X_{s_n} = i_n]
\]

3 The SIS model used as an example

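SIS model

We consider an SIS model with demography. As an ODE,
\[
\begin{aligned}
S' &= b - dS - \beta S I + \gamma I \\
I' &= \beta S I - (\gamma + d) I
\end{aligned}
\]
The total population satisfies $N' = b - dN$ and is asymptotically constant ($N(t) \to b/d =: N^*$ as $t \to \infty$). We can work with the asymptotically autonomous system where $N = N^*$ (and $S = N^* - I$)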

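\[
\begin{aligned}
S' &= b - \beta S I - dS + \gamma I \\
I' &= \beta S I - (\gamma + d) I
\end{aligned}
\]
- DFE: $(S, I) = (b/d, 0)$
- EEP: $(S, I) = \left(\dfrac{\gamma + d}{\beta},\ \dfrac{b}{d} - \dfrac{\gamma + d}{\beta}\right)$

VdD&W (van den Driessche and Watmough), with $N^* = b/d$:
\[
F = D_I[\beta S I]_{DFE} = \beta N^*, \qquad V = D_I[(\gamma + d) I]_{DFE} = \gamma + d
\]
\[
\mathcal{R}_0 = \rho(F V^{-1}) = \frac{\beta}{\gamma + d} N^*
\]
- If $\mathcal{R}_0 < 1$, $\lim_{t \to \infty} (S(t), I(t)) = (N^*, 0)$
- If $\mathcal{R}_0 > 1$, $\lim_{t \to \infty} (S(t), I(t)) = \left(\dfrac{N^*}{\mathcal{R}_0},\ N^* - \dfrac{N^*}{\mathcal{R}_0}\right)$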

Reduction to one equation

As $N$ is asymptotically constant, we can write $S = N - I$ (with $N = b/d$) and so only need
\[
I' = \beta (N - I) I - (\gamma + d) I
\]
We can then reconstruct the dynamics of $S$ from that of $I$
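As an illustration of the reduced equation, here is a short sketch that integrates $I' = \beta(N - I)I - (\gamma + d)I$ with a classical fourth-order Runge-Kutta scheme. The parameter values ($N = 100$, $\gamma = 0.25$, $d = 0.01$) and the function names are assumptions made for this example only. For $\mathcal{R}_0 > 1$ the solution should approach $N - N/\mathcal{R}_0$, and $S$ is recovered as $N - I$.

```python
def sis_rhs(I, beta, gamma, d, N):
    """Right-hand side of the reduced SIS equation."""
    return beta * (N - I) * I - (gamma + d) * I

def integrate_sis(I0, beta, gamma, d, N, dt=0.01, t_end=200.0):
    """Integrate the reduced SIS equation with classical RK4; return I(t_end)."""
    I = I0
    for _ in range(int(round(t_end / dt))):
        k1 = sis_rhs(I, beta, gamma, d, N)
        k2 = sis_rhs(I + 0.5 * dt * k1, beta, gamma, d, N)
        k3 = sis_rhs(I + 0.5 * dt * k2, beta, gamma, d, N)
        k4 = sis_rhs(I + dt * k3, beta, gamma, d, N)
        I += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return I

if __name__ == "__main__":
    N, gamma, d = 100, 0.25, 0.01          # hypothetical values
    for R0 in (0.5, 1.5):
        beta = R0 * (gamma + d) / N        # so that R0 = beta * N / (gamma + d)
        I_end = integrate_sis(1.0, beta, gamma, d, N)
        print(f"R0 = {R0}: I(200) = {I_end:.4f}, S(200) = {N - I_end:.4f}")
```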

4 DTMC

Discrete time Markov chain

Definition 9 (DTMC, simple definition)
An experiment with a finite number of possible outcomes $S_1, \ldots, S_N$ is repeated. The sequence of outcomes is a discrete time Markov chain if there is a set of $N^2$ numbers $\{p_{ji}\}$ such that the conditional probability of outcome $S_j$ on any experiment given outcome $S_i$ on the previous experiment is $p_{ji}$, i.e., for $1 \leq i, j \leq N$, $t = 1, \ldots$,
\[
p_{ji} = \Pr(S_j \text{ on experiment } t+1 \mid S_i \text{ on experiment } t).
\]
The outcomes $S_1, \ldots, S_N$ are the states, and the $p_{ji}$ are the transition probabilities. The matrix $P = [p_{ji}]$ is the transition matrix.

Homogeneity

One major distinction:
- If $p_{ij}$ does not depend on $t$, the chain is homogeneous
- If $p_{ij} = p_{ij}(t)$, i.e., transition probabilities depend on $t$, the chain is nonhomogeneous

Markov chains operate on two levels:
- Description of the probability that the system is in a given state
- Description of individual realizations of the process

Because we assume the Markov property, we need only describe how the system switches (transitions) from one state to the next

We say that the system has no memory: the next state depends only on the current one

States

The states are here the different values of $I$. $I$ can take integer values from 0 to $N$

The chain links these states

[Diagram: chain of states I=0, I=1, I=2, I=3, ..., I=N-1, I=N. Arrows to the right are labelled "Infection", arrows to the left are labelled "Recovery or Death".]

Probability vector

Let $p_i(t) = P(I(t) = i)$; the vector $p(t)$ of these values is the probability distribution. We must have $\sum_i p_i(t) = 1$
\[
p(t) = \begin{pmatrix} p_0(t) \\ p_1(t) \\ \vdots \\ p_N(t) \end{pmatrix}
\]
Initial condition at time $t = 0$: $p(0)$

Transition probabilities

We then need a description of the probability of transition

In general, suppose that $S_i$ is the current state; then one of $S_1, \ldots, S_N$ must be the next state. If
\[
p_{ji} = P\{X(t+1) = S_j \mid X(t) = S_i\}
\]
then we must have
\[
p_{1i} + p_{2i} + \cdots + p_{Ni} = 1, \quad 1 \leq i \leq N
\]
Some of the $p_{ji}$ can be zero

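Transition matrix

The matrix
\[
P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1N} \\
p_{21} & p_{22} & \cdots & p_{2N} \\
\vdots & & & \vdots \\
p_{N1} & p_{N2} & \cdots & p_{NN}
\end{pmatrix}
\]
has
- nonnegative entries, $p_{ji} \geq 0$
- entries at most 1, $p_{ji} \leq 1$
- column sums equal to 1 (since $p_{ji}$ is the probability of moving from $S_i$ to $S_j$), which we write
\[
\sum_{j=1}^{N} p_{ji} = 1, \quad i = 1, \ldots, N
\]
or, using the notation $\mathbb{1} = (1, \ldots, 1)^T$,
\[
\mathbb{1}^T P = \mathbb{1}^T
\]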

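In matrix form
\[
p(t+1) = P p(t), \quad t = 1, 2, 3, \ldots
\]
where $p(t) = (p_1(t), p_2(t), \ldots, p_N(t))^T$ is a (column) probability vector and $P = (p_{ij})$ is an $N \times N$ transition matrix,
\[
P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1N} \\
p_{21} & p_{22} & \cdots & p_{2N} \\
\vdots & & & \vdots \\
p_{N1} & p_{N2} & \cdots & p_{NN}
\end{pmatrix}
\]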

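Stochastic matrices

Definition 10 (Stochastic matrix)
The nonnegative $N \times N$ matrix $M = (m_{ij})$ is stochastic if $\sum_{i=1}^{N} m_{ij} = 1$ for all $j = 1, 2, \ldots, N$. In other words, $\mathbb{1}^T M = \mathbb{1}^T$

Theorem 11
Let $M$ be a stochastic matrix. Then the spectral radius $\rho(M) \leq 1$, i.e., all eigenvalues $\lambda$ of $M$ are such that $|\lambda| \leq 1$. Furthermore, $\lambda = 1$ is an eigenvalue of $M$

Why $\lambda = 1$ is an eigenvalue:
- $M$ has all column sums 1 $\iff \mathbb{1}^T M = \mathbb{1}^T$
- $\lambda$ is an eigenvalue of $M$ with associated left eigenvector $v$ $\iff v^T M = \lambda v^T$
- So $\mathbb{1}^T M = \mathbb{1}^T$ means: left eigenvector $\mathbb{1}^T$ and eigenvalue 1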

Long time behavior

Let $p(0)$ be the initial distribution (column) vector. Then
\[
\begin{aligned}
p(1) &= P p(0) \\
p(2) &= P p(1) = P(P p(0)) = P^2 p(0)
\end{aligned}
\]
Iterating, we get that for any $t \in \mathbb{N}$,
\[
p(t) = P^t p(0)
\]
Therefore,
\[
\lim_{t \to +\infty} p(t) = \lim_{t \to +\infty} P^t p(0) = \left(\lim_{t \to +\infty} P^t\right) p(0)
\]
if this limit exists (it does if $P$ is constant and the chain is regular)
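The iteration $p(t) = P^t p(0)$ can be sketched in a few lines. The 3-state matrix below is a made-up example (not from the slides), written in the convention $p(t+1) = Pp(t)$, so each of its columns sums to 1; it is regular, so two different initial distributions should end up at the same limit.

```python
def mat_vec(P, p):
    """Compute the product P p for a square matrix stored as a list of rows."""
    n = len(p)
    return [sum(P[j][i] * p[i] for i in range(n)) for j in range(n)]

def distribution_at(P, p0, t):
    """Return p(t) = P^t p(0) by repeated multiplication."""
    p = list(p0)
    for _ in range(t):
        p = mat_vec(P, p)
    return p

# Hypothetical transition matrix; entry P[j][i] is the probability of i -> j.
P = [[0.5, 0.2, 0.0],
     [0.5, 0.5, 0.3],
     [0.0, 0.3, 0.7]]

if __name__ == "__main__":
    for p0 in ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]):
        print(p0, "->", [round(x, 6) for x in distribution_at(P, p0, 200)])
```

Solving $Pw = w$ by hand for this matrix gives $w = (1/6, 5/12, 5/12)^T$, which is the vector both runs approach.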

Regular Markov chain

Regular Markov chains are not covered here, as most chains in the demographic context are absorbing. For information, though:

Definition 12 (Regular Markov chain)
A regular Markov chain is one in which $P^k$ is positive for some integer $k > 0$, i.e., $P^k$ has only positive entries, no zero entries.

Theorem 13
If $P$ is the transition matrix of a regular Markov chain, then
1. the powers $P^t$ approach a stochastic matrix $W$,
2. each column of $W$ is the same (column) vector $w = (w_1, \ldots, w_N)^T$,
3. the components of $w$ are positive.

So if the Markov chain is regular,
\[
\lim_{t \to +\infty} p(t) = W p(0)
\]

Interesting properties of matrices

Definition 14 (Irreducibility)
The square nonnegative matrix $M$ is reducible if there exists a permutation matrix $P$ such that $P^T M P$ is block triangular. If $M$ is not reducible, then it is irreducible

Definition 15 (Primitivity)
A nonnegative matrix $M$ is primitive if and only if there is an integer $k > 0$ such that $M^k$ is positive

Directed graphs (digraphs)

Definition 16 (Digraph)
A directed graph (or digraph) $G = (V, A)$ consists of a set $V$ of vertices and a set $A$ of arcs linking these vertices

[Figure: example digraph on vertices 1 to 5.]

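Connecting graphs and matrices

The adjacency matrix $M$ has $m_{ij} = 1$ if there is an arc from vertex $j$ to vertex $i$, 0 otherwise

[Figure: example digraph on vertices 1 to 5.]
\[
M = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]
Column of zeros (except maybe the diagonal entry): the vertex is a sink (nothing leaves)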

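Paths of length exactly 2

Given the adjacency matrix $M$, $M^2$ has the number of paths of length exactly two between vertices

[Figure: example digraph on vertices 1 to 5.]
\[
M^2 = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 2 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]
Write $m^k_{ij}$ for the $(i, j)$ entry in $M^k$

$m^2_{32} = 2$ since $2 \to 1 \to 3$ and $2 \to 4 \to 3$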
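The path count can be checked by squaring the adjacency matrix directly. This is a small sketch using the 5-vertex matrix $M$ of the previous slide (with $m_{ij} = 1$ for an arc from $j$ to $i$); the helper name `mat_mul` is an illustrative choice.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Adjacency matrix from the slides: M[i][j] = 1 if there is an arc j+1 -> i+1.
M = [[0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 1, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 0, 0, 0]]

M2 = mat_mul(M, M)

if __name__ == "__main__":
    for row in M2:
        print(row)
    # Entry (3, 2) in the 1-based indexing of the slides is M2[2][1] here.
    print("paths of length 2 from vertex 2 to vertex 3:", M2[2][1])
```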

Connecting matrices, graphs and regular chains

Theorem 17
A matrix $M$ is irreducible iff the associated connection graph is strongly connected, i.e., there is a path between any pair $(i, j)$ of vertices

Irreducibility: $\forall i, j$, $\exists k$, $m^k_{ij} > 0$

Theorem 18
A matrix $M$ is primitive if it is irreducible and there is at least one positive entry on the diagonal of $M$

Primitivity: $\exists k$, $\forall i, j$, $m^k_{ij} > 0$

Theorem 19
Markov chain regular $\iff$ transition matrix $P$ primitive

Strong connectedness (1)

[Figure: digraph on vertices 1 to 8.] Not strongly connected

Strong connectedness (2)

[Figure: digraph on vertices 1 to 8.] Strongly connected

Strong components

[Figure: digraph on vertices 1 to 8, split into strong component 1 and strong component 2.]

Absorbing states, absorbing chains

Definition 20
A state $S_i$ in a Markov chain is absorbing if whenever it occurs on the $n$th generation of the experiment, it then occurs on every subsequent step. In other words, $S_i$ is absorbing if $p_{ii} = 1$, so that the probability of moving from $S_i$ to any other state $S_j$, $j \neq i$, is 0

Definition 21
A Markov chain is said to be absorbing if it has at least one absorbing state, and if from every state it is possible to go to an absorbing state

In an absorbing Markov chain, a state that is not absorbing is called transient

Some questions on absorbing chains

1. Does the process eventually reach an absorbing state?
2. Average number of times spent in a transient state, if starting in a transient state?
3. Average number of steps before entering an absorbing state?
4. Probability of being absorbed by a given absorbing state, when there is more than one, when starting in a given transient state?

Reaching an absorbing state Answer to question 1: Theorem 22 In an absorbing Markov chain, the probability of reaching an absorbing state is 1

Standard form of the transition matrix

For an absorbing chain with $k$ absorbing states and $N - k$ transient states, the transition matrix can be written as
\[
P = \begin{pmatrix} I_k & 0 \\ R & Q \end{pmatrix}
\]
with the following meaning:

                    Absorbing states    Transient states
Absorbing states    $I_k$               $0$
Transient states    $R$                 $Q$

with $I_k$ the $k \times k$ identity matrix, $0$ a $k \times (N-k)$ matrix of zeros, $R$ an $(N-k) \times k$ matrix and $Q$ an $(N-k) \times (N-k)$ matrix. In this form, rows index the current state and columns the next state.

The matrix $I_{N-k} - Q$ is invertible. Let
- $F = (I_{N-k} - Q)^{-1}$ be the fundamental matrix of the Markov chain
- $T_i$ be the sum of the entries on row $i$ of $F$
- $B = FR$

Answers to our remaining questions:
2. $F_{ij}$ is the average number of times the process is in the $j$th transient state if it starts in the $i$th transient state
3. $T_i$ is the average number of steps before the process enters an absorbing state if it starts in the $i$th transient state
4. $B_{ij}$ is the probability of eventually entering the $j$th absorbing state if the process starts in the $i$th transient state
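To make $F$, $T$ and $B$ concrete, here is a sketch on a made-up example that is not in the slides: a symmetric random walk on $\{0, 1, 2, 3, 4\}$ absorbed at 0 and 4, so the transient states are 1, 2, 3. It uses the standard form above (rows index the current state). The small Gauss-Jordan routine is only there to keep the example dependency-free; any linear algebra library would do.

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan inverse of a small square matrix, in exact rational arithmetic."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

half = Fraction(1, 2)
# Hypothetical chain: transient states 1, 2, 3; absorbing states 0 and 4.
Q = [[0, half, 0], [half, 0, half], [0, half, 0]]   # transient -> transient
R = [[half, 0], [0, 0], [0, half]]                  # transient -> absorbing

n = len(Q)
I_minus_Q = [[(1 if i == j else 0) - Q[i][j] for j in range(n)] for i in range(n)]
F = inverse(I_minus_Q)          # fundamental matrix
T = [sum(row) for row in F]     # mean number of steps before absorption
B = mat_mul(F, R)               # absorption probabilities

if __name__ == "__main__":
    print("F =", F)
    print("T =", T)
    print("B =", B)
```

Starting from the middle state, the walk takes 4 steps on average before absorption and ends in either absorbing state with probability 1/2, as symmetry suggests.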

Back to the SIS model

ODE:
\[
I' = \beta (N - I) I - (\gamma + d) I
\]
This equation has 3 components:
1. $\beta (N - I) I$: new infection
2. $\gamma I$: recovery
3. $dI$: death

In the DTMC, we want to know what transitions mean and what their probabilities are

[Diagram: chain of states I=0, I=1, I=2, I=3, ..., I=N-1, I=N. Arrows to the right are labelled "Infection", arrows to the left are labelled "Recovery or Death".]

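Consider a small amount of time $\Delta t$, small enough that only one event occurs: new infection, loss of an infectious individual, or nothing. Let
\[
B(i) := \beta (N - i) i, \qquad D(i) := (\gamma + d) i
\]
be the rates of birth and death of infectious individuals. Then
\[
p_{ji}(\Delta t) = P\{I(t + \Delta t) = j \mid I(t) = i\} =
\begin{cases}
B(i)\,\Delta t & j = i + 1 \\
D(i)\,\Delta t & j = i - 1 \\
1 - [B(i) + D(i)]\,\Delta t & j = i \\
0 & \text{otherwise}
\end{cases}
\]

[Diagram: chain of states I=0, I=1, ..., I=N. Infection arrows (to the right) are labelled $\beta(N-1)\Delta t$, $\beta(N-2)2\Delta t$, ..., $\beta(N-1)\Delta t$. Recovery or death arrows (to the left) are labelled $(\gamma+d)\Delta t$, $(\gamma+d)2\Delta t$, $(\gamma+d)3\Delta t$, ..., $(\gamma+d)N\Delta t$.]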

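Transition matrix
\[
P(\Delta t) = \begin{pmatrix}
1 & D(1)\Delta t & 0 & \cdots & 0 \\
0 & 1 - (B(1) + D(1))\Delta t & D(2)\Delta t & & \vdots \\
0 & B(1)\Delta t & 1 - (B(2) + D(2))\Delta t & \ddots & 0 \\
\vdots & & \ddots & \ddots & D(N)\Delta t \\
0 & 0 & \cdots & B(N-1)\Delta t & 1 - D(N)\Delta t
\end{pmatrix}
\]
Note that indices in the matrix range from 0 to $N$ here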
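A sketch of how this matrix could be assembled and used, with rows and columns indexed from 0 to $N$ and `P[j][i]` holding the probability of going from $I = i$ to $I = j$. The parameter values ($N = 20$, $\gamma = 0.25$, $d = 0.01$, $\mathcal{R}_0 = 1.5$, $\Delta t = 0.01$) and the function names are assumptions chosen for illustration.

```python
def sis_dtmc_matrix(N, beta, gamma, d, dt):
    """Build the (N+1) x (N+1) transition matrix; P[j][i] = Pr(i -> j) in one step."""
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        b = beta * (N - i) * i * dt        # B(i) dt: new infection
        m = (gamma + d) * i * dt           # D(i) dt: recovery or death
        if b + m > 1.0:
            raise ValueError("dt too large: probabilities exceed 1")
        if i < N:
            P[i + 1][i] = b
        if i > 0:
            P[i - 1][i] = m
        P[i][i] = 1.0 - (b + m)
    return P

def step(P, p):
    """One step p(t + dt) = P p(t)."""
    n = len(p)
    return [sum(P[j][i] * p[i] for i in range(n)) for j in range(n)]

if __name__ == "__main__":
    N, gamma, d, dt = 20, 0.25, 0.01, 0.01     # hypothetical values
    beta = 1.5 * (gamma + d) / N               # so that R0 = 1.5
    P = sis_dtmc_matrix(N, beta, gamma, d, dt)
    p = [0.0] * (N + 1)
    p[1] = 1.0                                 # start with one infectious individual
    for _ in range(1000):                      # 10 time units
        p = step(P, p)
    print("probability of extinction by t = 10:", round(p[0], 4))
```

The guard on `b + m` reflects the requirement that $\Delta t$ be small enough for every column to be a valid probability distribution.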

Properties

- State 0 (with probability $p_0$) is an absorbing state
- For any initial distribution $p(0) = (p_0(0), p_1(0), \ldots, p_N(0))$,
\[
\lim_{t \to \infty} p(t) = (1, 0, \ldots, 0)^T \iff \lim_{t \to \infty} p_0(t) = 1
\]
- There exists a quasi-stationary distribution conditioned on non-extinction,
\[
q_i(t) = \frac{p_i(t)}{1 - p_0(t)}, \quad i = 1, \ldots, N
\]

Consequence of absorption

We have
\[
\lim_{t \to \infty} p(t) = (1, 0, \ldots, 0)^T \iff \lim_{t \to \infty} p_0(t) = 1
\]
but the time to absorption can be exponentially long

5 CTMC

Several ways to formulate CTMCs

A continuous time Markov chain can be formulated in terms of
- infinitesimal transition probabilities
- branching process
- time to next event

Here, time is in $\mathbb{R}_+$

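For small $\Delta t$,
\[
p_{ji}(\Delta t) = P\{I(t + \Delta t) = j \mid I(t) = i\} =
\begin{cases}
B(i)\,\Delta t + o(\Delta t) & j = i + 1 \\
D(i)\,\Delta t + o(\Delta t) & j = i - 1 \\
1 - [B(i) + D(i)]\,\Delta t + o(\Delta t) & j = i \\
o(\Delta t) & \text{otherwise}
\end{cases}
\]
with $o(\Delta t)/\Delta t \to 0$ as $\Delta t \to 0$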

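Forward Kolmogorov equations

Assume we know $I(0) = k$. Then
\[
p_i(t + \Delta t) = p_{i-1}(t) B(i-1) \Delta t + p_{i+1}(t) D(i+1) \Delta t + p_i(t)\left[1 - (B(i) + D(i)) \Delta t\right] + o(\Delta t)
\]
Compute $(p_i(t + \Delta t) - p_i(t))/\Delta t$ and take $\lim_{\Delta t \to 0}$, giving
\[
\begin{aligned}
\frac{d}{dt} p_0 &= p_1 D(1) \\
\frac{d}{dt} p_i &= p_{i-1} B(i-1) + p_{i+1} D(i+1) - p_i [B(i) + D(i)], \quad i = 1, \ldots, N
\end{aligned}
\]
(with the convention $p_{N+1} \equiv 0$). These are the forward Kolmogorov equations associated to the CTMC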

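In vector form

Write the previous system as
\[
p' = Qp
\]
with
\[
Q = \begin{pmatrix}
0 & D(1) & 0 & \cdots & 0 \\
0 & -(B(1) + D(1)) & D(2) & \cdots & 0 \\
0 & B(1) & -(B(2) + D(2)) & \cdots & 0 \\
\vdots & & & \ddots & D(N) \\
0 & 0 & \cdots & B(N-1) & -D(N)
\end{pmatrix}
\]
$Q$ is the generator matrix. Of course,
\[
p(t) = e^{Qt} p(0)
\]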
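As a rough numerical illustration, the system $p' = Qp$ can be integrated with an explicit Euler scheme instead of evaluating $e^{Qt}$. Each Euler step with step size $\Delta t$ is the update $p \leftarrow (I + Q\Delta t)p$. The parameter values and function names below are assumptions for this sketch; a stiff or large system would call for a better integrator or a matrix exponential routine.

```python
def sis_generator(N, beta, gamma, d):
    """Generator matrix Q of the SIS CTMC; Q[j][i] is the rate of going from i to j."""
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        B = beta * (N - i) * i             # infection rate in state i
        D = (gamma + d) * i                # recovery or death rate in state i
        if i < N:
            Q[i + 1][i] = B
        if i > 0:
            Q[i - 1][i] = D
        Q[i][i] = -(B + D)
    return Q

def solve_kolmogorov(Q, p0, dt, t_end):
    """Explicit Euler integration of p' = Q p from t = 0 to t = t_end."""
    n = len(p0)
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        p = [p[j] + dt * sum(Q[j][i] * p[i] for i in range(n)) for j in range(n)]
    return p

if __name__ == "__main__":
    N, gamma, d = 20, 0.25, 0.01               # hypothetical values
    beta = 1.5 * (gamma + d) / N               # so that R0 = 1.5
    Q = sis_generator(N, beta, gamma, d)
    p0 = [0.0] * (N + 1)
    p0[1] = 1.0                                # I(0) = 1
    p = solve_kolmogorov(Q, p0, dt=0.01, t_end=10.0)
    print("p_0(10) =", round(p[0], 4), " total mass =", round(sum(p), 6))
```

Each column of $Q$ sums to zero, which is why the total probability mass is conserved along the solution.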

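Linking DTMC and CTMC for small $\Delta t$

DTMC:
\[
p(t + \Delta t) = P(\Delta t) p(t)
\]
for transition matrix $P(\Delta t)$. Let $\Delta t \to 0$ to obtain the Kolmogorov equations for the CTMC
\[
\frac{d}{dt} p = Qp
\]
where
\[
Q = \lim_{\Delta t \to 0} \frac{P(\Delta t) - I}{\Delta t} = P'(0)
\]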

6 ODE to CTMC

Converting ODE to CTMC Extremely easy to convert compartmental ODE model to CTMC!!

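The general setting: flow diagram

[Flow diagram: births enter S at rate $b$; infection moves individuals from S to I at rate $\beta S I$; recovery moves individuals from I to S at rate $\gamma I$; deaths leave S at rate $dS$ and I at rate $dI$.]

ODE approach

Focus on compartments

[Same flow diagram, with the S and I compartments shown separately, each with its inflows and outflows.]

CTMC approach

Focus on arrows (transitions)

[Same flow diagram, with each arrow ($b$, $dS$, $\beta S I$, $\gamma I$, $dI$) shown separately.]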

Focus on transitions

Rate            Transition                        Event
$b$             $S \to S + 1$                     birth
$dS$            $S \to S - 1$                     death of susceptible
$\beta S I$     $S \to S - 1$, $I \to I + 1$      new infection
$\gamma I$      $S \to S + 1$, $I \to I - 1$      recovery
$dI$            $I \to I - 1$                     death of infectious

Will use $S = N - I$ and omit $S$ dynamics

Gillespie's algorithm (SIS model with only I equation)

Set $t \leftarrow t_0$ and $I(t) \leftarrow I(t_0)$
while $t \leq t_f$ do
    $\xi_t \leftarrow \beta (N - i) i + \gamma i + d i$, where $i = I(t)$
    Draw $\tau_t$ from $T \sim \mathcal{E}(\xi_t)$
    $v \leftarrow [\beta (N - i) i / \xi_t,\ \beta (N - i) i / \xi_t + \gamma i / \xi_t,\ 1]$
    Draw $\zeta_t$ from $\mathcal{U}([0, 1])$
    Find the smallest pos such that $\zeta_t \leq v(\mathrm{pos})$
    switch (pos)
        case 1: New infection, $I(t + \tau_t) = I(t) + 1$
        case 2: End of infectious period, $I(t + \tau_t) = I(t) - 1$
        case 3: Death, $I(t + \tau_t) = I(t) - 1$
    end switch
    $t \leftarrow t + \tau_t$
end while
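The pseudocode above might be written in Python along the following lines. This is a sketch, not the author's implementation: the function name, the returned lists and the parameter values in the usage example are illustrative. It also adds one guard that the pseudocode leaves implicit, namely stopping when the total rate is zero (which happens once $I = 0$).

```python
import random

def gillespie_sis(I0, N, beta, gamma, d, t0=0.0, tf=100.0, rng=None):
    """Simulate one realization; return the event times and the values of I."""
    rng = rng or random.Random()
    t, I = t0, I0
    times, states = [t], [I]
    while t <= tf:
        rate_inf = beta * (N - I) * I
        rate_rec = gamma * I
        rate_death = d * I
        xi = rate_inf + rate_rec + rate_death
        if xi == 0.0:
            break                          # I = 0 is absorbing (guard added here)
        tau = rng.expovariate(xi)          # time to next event, exponential with rate xi
        u = rng.random() * xi              # pick the event in proportion to its rate
        if u < rate_inf:
            I += 1                         # case 1: new infection
        elif u < rate_inf + rate_rec:
            I -= 1                         # case 2: end of infectious period
        else:
            I -= 1                         # case 3: death
        t += tau
        times.append(t)
        states.append(I)
    return times, states

if __name__ == "__main__":
    N, gamma, d = 100, 0.25, 0.01          # hypothetical values
    beta = 1.5 * (gamma + d) / N           # so that R0 = 1.5
    times, states = gillespie_sis(1, N, beta, gamma, d, rng=random.Random(2014))
    print(f"{len(times) - 1} events, final I = {states[-1]} at t = {times[-1]:.2f}")
```

As in the pseudocode, the last recorded event may fall slightly after $t_f$. A strict comparison is used when selecting the event so that an event with zero rate can never be chosen.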

IMPORTANT: in Gillespie's algorithm, we do not consider $p_{ii}$, the event "nothing happens"

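Drawing at random from an exponential distribution

Want $\tau_t$ from $T \sim \mathcal{E}(\xi_t)$, i.e., $T$ has probability density function
\[
f(x, \xi_t) = \xi_t e^{-\xi_t x} \mathbb{1}_{x \geq 0}
\]
Use the cumulative distribution function $F(x, \xi_t) = \int_{-\infty}^{x} f(s, \xi_t)\, ds$,
\[
F(x, \xi_t) = (1 - e^{-\xi_t x}) \mathbb{1}_{x \geq 0}
\]
which has values in $[0, 1]$. So draw $\zeta$ from $\mathcal{U}([0, 1])$ and solve $F(x, \xi_t) = \zeta$ for $x$:
\[
\begin{aligned}
F(x, \xi_t) = \zeta &\iff 1 - e^{-\xi_t x} = \zeta \\
&\iff e^{-\xi_t x} = 1 - \zeta \\
&\iff \xi_t x = -\ln(1 - \zeta) \\
&\iff x = \frac{-\ln(1 - \zeta)}{\xi_t}
\end{aligned}
\]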
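The inversion formula translates directly into code. The sketch below (function name and sample size are illustrative) draws exponential variates this way and compares the sample mean with the theoretical mean $1/\xi_t$. Python's standard library also provides `random.expovariate` for the same purpose.

```python
import math
import random

def draw_exponential(xi, rng=random):
    """Draw from the exponential distribution with rate xi by inverting the CDF."""
    zeta = rng.random()                    # uniform on [0, 1), so 1 - zeta is in (0, 1]
    return -math.log(1.0 - zeta) / xi

if __name__ == "__main__":
    rng = random.Random(0)                 # arbitrary seed, for reproducibility only
    xi = 2.0
    sample = [draw_exponential(xi, rng) for _ in range(100000)]
    print("sample mean:", sum(sample) / len(sample), " expected:", 1.0 / xi)
```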