Markov Chains and Pandemics


Caleb Dedmore and Brad Smith
December 8, 2016

Abstract

Markov Chain Theory is a powerful tool used in statistical analysis to make predictions about future events based solely upon a model of a system in its present state. Throughout this paper we will describe ways in which this theory is used for tracking the growth and possible outcomes of diseases as they arise in human populations. We will establish a simple example and add elements to refine our hypothetical model over time, so that ultimately we will have a model that mirrors actual prediction models used for diseases in the real world.

1. Introduction

Markov chain theory describes one of the most widely used stochastic processes for making predictions about future events from mathematical abstractions and models of the processes being predicted. It is the root of the Leslie Matrix in conservation, of tracking weather patterns, and of many other models. The theory was developed by Andrey Markov to make predictions about future states of a system based solely upon the current state of the system. What this ultimately means is that if we have a system with several possible conditions (e.g. a coin about to be tossed), we can predict the likelihood of each state the system could be in (either heads or tails) after a set number of trials (coin flips).

A system must meet two requirements in order to be modeled through a Markov Chain. First, each trial must lead to a finite set of outcomes. Without a limited set of outcomes it is impossible to determine the odds of any one outcome relative to all other outcomes, which is essential for Markov Chain predictions because they are based on the likelihood of each outcome occurring. Second, the outcome of any trial depends at most on the outcome of the immediately preceding trial. This stipulation is required because if a trial five steps back dictated the outcome of the present trial, its prediction could contradict what the present state of the system implies about the next state, and in such a case it would be impossible to know which prediction was correct. With these two simple stipulations we can form powerful predictive models of a wide range of real-world processes.

Over the course of this paper we will demonstrate what elements make up Markov Chains, how these elements interact to make future predictions, and how they can be applied specifically to the spread of diseases in populations.

Figure 1: A basic transition diagram (three states, labeled 1, 2, and 3)

1.1. Transition Diagrams

A common visualization for Markov Chains is the transition diagram shown in Figure 1. This diagram shows a three-state probability chain, in which each state has a fixed probability of moving to another state or staying in place. The probabilities on the arrows leaving any one state, including the arrow that returns to the same state, must add up to one hundred percent, as it is a closed system of chance. Another way to look at a Markov process is in matrix form, but first we'll need to define some notation.

1.2. Notation Setup

With the requirements mentioned previously, one can set up what is called a transition matrix to express trials of a Markov Chain. Each row of a transition matrix consists of fractional values that add up to one, since a Markov process deals with probabilities as ratios of a whole. The transition matrix never changes, and is reused for each trial. For the values of our transition matrix we'll use the notation $T_{ij}$; because the matrix is square, $i$ and $j$ run over the same set of states. An example of a first trial through a transition matrix is shown below:

\[
(x^{(0)}_1, x^{(0)}_2, \ldots, x^{(0)}_i)
\begin{pmatrix}
T_{11} & T_{12} & \cdots & T_{1j} \\
T_{21} & T_{22} & \cdots & T_{2j} \\
\vdots & \vdots & \ddots & \vdots \\
T_{i1} & T_{i2} & \cdots & T_{ij}
\end{pmatrix}
= (x^{(1)}_1, x^{(1)}_2, \ldots, x^{(1)}_i)
\tag{1}
\]

The vector $x^{(0)}$ is known as the Initial Condition Vector. One trial, or one run of the initial condition vector through the transition matrix, gives us the values for $x^{(1)}$. The written-out calculations for multiplying the vector through the matrix are shown below:

\begin{align*}
T_{11} x^{(0)}_1 + T_{21} x^{(0)}_2 + \cdots + T_{i1} x^{(0)}_i &= x^{(1)}_1 \\
T_{12} x^{(0)}_1 + T_{22} x^{(0)}_2 + \cdots + T_{i2} x^{(0)}_i &= x^{(1)}_2 \\
&\;\;\vdots \\
T_{1j} x^{(0)}_1 + T_{2j} x^{(0)}_2 + \cdots + T_{ij} x^{(0)}_i &= x^{(1)}_i
\end{align*}

After one trial of the transition matrix, we get a new condition vector for the next trial:

\[
x^{(1)} = (x^{(1)}_1, x^{(1)}_2, \ldots, x^{(1)}_i)
\]

To calculate the next trial, $x^{(1)}$ is simply run through the transition matrix again, providing us with $x^{(2)}$.
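The vector-times-matrix step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `step` and the two-state matrix `T_example` are hypothetical and do not come from the paper.

```python
def step(x, T):
    """Run the row vector x through the transition matrix T once (one trial)."""
    # Component j of the new vector sums T[i][j] * x[i] over every current state i.
    return [sum(x[i] * T[i][j] for i in range(len(x))) for j in range(len(T[0]))]


# Hypothetical two-state chain, used only to show the mechanics.
T_example = [[0.9, 0.1],
             [0.5, 0.5]]

x0 = [1.0, 0.0]            # initial condition vector: everything starts in state 1
x1 = step(x0, T_example)   # first trial
x2 = step(x1, T_example)   # second trial reuses the same, unchanged matrix
print(x1, x2)
```

Because every row of the matrix adds up to one, each new condition vector also adds up to one.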

Based on this, for any $x^{(k)}$ we can use the notation:

\[
x^{(k)} = (x^{(k-1)}_1, x^{(k-1)}_2, \ldots, x^{(k-1)}_i)
\begin{pmatrix}
T_{11} & T_{12} & \cdots & T_{1j} \\
T_{21} & T_{22} & \cdots & T_{2j} \\
\vdots & \vdots & \ddots & \vdots \\
T_{i1} & T_{i2} & \cdots & T_{ij}
\end{pmatrix}
\]

which can be expanded into the following expressions:

\begin{align*}
x^{(k)}_1 &= T_{11} x^{(k-1)}_1 + T_{21} x^{(k-1)}_2 + \cdots + T_{i1} x^{(k-1)}_i \\
x^{(k)}_2 &= T_{12} x^{(k-1)}_1 + T_{22} x^{(k-1)}_2 + \cdots + T_{i2} x^{(k-1)}_i \\
&\;\;\vdots \\
x^{(k)}_i &= T_{1j} x^{(k-1)}_1 + T_{2j} x^{(k-1)}_2 + \cdots + T_{ij} x^{(k-1)}_i
\end{align*}

Any condition vector in a Markov Chain is calculated from the immediately preceding condition vector. Now that we understand the process of calculating Markov Chains via matrices, we can continue to our scenario below.

Requirements for Modeling

To be able to model a disease scenario with Markov Chains, we must ensure the requirements of Markov Chains are being satisfied. These requirements are that each trial must lead to a finite set of outcomes, and that the outcomes of any trial are influenced at most by the immediately preceding trial. The latter stipulation is simple to ensure: it isn't something we design into our model, but a design principle we use to successfully construct our model. In modeling a disease scenario in any population, we are fortunate because the rates of susceptibility don't change as the populations shift between the various states, which ensures that we are influenced at most by the immediately preceding trial and that we have a finite set of outcomes for each trial.

Model Setup

A common model for tracking the progression of epidemics is known as the SIR model. The SIR model consists of three states, which we will call susceptible, infected, and recovered, and we will label each state by its letter: S, I, and R. Its transition matrix has three columns, one for each state, and three rows labeled with the same letters as the columns. As stated previously in the notation section, each row of this matrix holds ratios of a whole and must add up to one (i.e. one hundred percent). Together the rows and columns of the matrix describe the likelihood of the population transitioning from each state to the others. The transition diagram in Figure 2 below can be used to help us visualize what will be occurring in the described matrix.

As can be seen, not every state leads to all of the others. This is a common scenario in Markov Chains, especially in a disease scenario. In the case of some diseases under the SIR model, an individual cannot recover unless they get sick, and once recovered cannot become susceptible again. What this means is that once inside the recovered state, an individual will never leave it. Such a state is referred to as absorptive, so in our SIR model R is an absorptive state. This information is critical as it helps us determine what values the R row of our matrix must contain.

\[
\begin{array}{c|ccc}
  & S & I & R \\ \hline
S & T_{11} & T_{12} & T_{13} \\
I & T_{21} & T_{22} & T_{23} \\
R & 0 & 0 & 1
\end{array}
\tag{2}
\]

While the values needed for our susceptible and infected rows have yet to be determined, because the recovered state is absorptive we know that one hundred percent of recovered people will remain recovered, and so the only nonzero entry in the recovered row is in the recovered column. This is shown in matrix (2) above.
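To make the idea of an absorptive state concrete, the following minimal Python sketch looks for rows that keep all of their probability on the state itself. The function name `absorbing_states` and the numbers in the S and I rows are placeholders invented for illustration; only the R row (0, 0, 1) is taken from the text.

```python
def absorbing_states(T, labels):
    """Return the labels of states that, once entered, are never left."""
    # A state is absorptive when its row puts probability 1 on itself.
    return [labels[i] for i, row in enumerate(T) if row[i] == 1.0]


# Placeholder S and I rows (hypothetical values); the R row follows matrix (2).
T = [[0.7, 0.3, 0.0],
     [0.0, 0.4, 0.6],
     [0.0, 0.0, 1.0]]

found = absorbing_states(T, ["S", "I", "R"])
print(found)  # prints ['R']
```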

Figure 2: A basic SIR transition diagram (states S, I, and R)

All that remains to have a complete transition matrix is to apply the determined rates of transition from state to state to the appropriate positions. What this means is that we need the odds of an uninfected person becoming infected and the odds of an infected person recovering. With these values, we can then determine the likelihood of an uninfected individual remaining in that state, or of an infected person remaining infected. We can do this because the percentages in each row are always out of one hundred percent. So if we have the odds of an uninfected person getting infected, we can subtract that percentage from the total, and the result will be the percentage of uninfected individuals remaining in that state.

With our SIR transition matrix understood and only in need of actual values to be complete, we can now devise our initial condition vector, $x^{(0)}$. We can use the same guiding principle for this row vector as for the rows of the transition matrix we have already begun organizing: all terms in the row must add up to one, because they are all ratios of the total population. Using this principle, and understanding that our vector must have compatible dimensions to be multiplied into the matrix, we know $x^{(0)}$ must look something like this:

\[
x^{(0)} = (x^{(0)}_S,\; x^{(0)}_I,\; 1 - x^{(0)}_S - x^{(0)}_I) = (x^{(0)}_S,\; x^{(0)}_I,\; x^{(0)}_R)
\]

As can be seen in the above equation, our initial condition vector will have three terms representing the percentages of the population in the different states of our model. Also visible above is the relationship between all three terms as parts of the entire population.

Assigning Values

Let's say, for example, that before disease X became a concern to monitor, 90 percent of the population was susceptible, 7 percent were infected, and 3 percent were recovered. These numbers become our initial condition vector:

\[
x^{(0)} = (.90,\; .07,\; .03)
\tag{3}
\]

Now we will set up our transition matrix for disease X using our SIR model from Figure 2 above. This system is much easier to manipulate and use when in matrix form. We will let the values for this SIR model be $T_{11} = .85$, $T_{12} = .15$, $T_{13} = 0$, $T_{21} = 0$, $T_{22} = .12$, $T_{23} = .88$, $T_{31} = 0$, $T_{32} = 0$, and $T_{33} = 1.0$. We can now place these values onto our SIR transition diagram:

[Transition diagram for disease X: states S, I, and R, with each arrow labeled by the corresponding probability above; R returns to itself with probability 1.0]
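The nine values above follow from just two rates, the weekly chance of infection (.15) and the weekly chance of recovery (.88), with the remaining entries filled in by subtracting from one. A minimal Python sketch of that bookkeeping is given below; the names `sir_matrix`, `p_infect`, `p_recover`, and `rows_sum_to_one` are hypothetical and chosen only for this illustration.

```python
def sir_matrix(p_infect, p_recover):
    """Build an SIR transition matrix (rows and columns ordered S, I, R)."""
    return [
        [1.0 - p_infect, p_infect, 0.0],    # S: stay susceptible or become infected
        [0.0, 1.0 - p_recover, p_recover],  # I: stay infected or recover
        [0.0, 0.0, 1.0],                    # R: absorptive, always stays recovered
    ]


def rows_sum_to_one(T, tol=1e-12):
    """Check that every row is a probability vector (adds up to one)."""
    return all(abs(sum(row) - 1.0) <= tol for row in T)


T = sir_matrix(p_infect=0.15, p_recover=0.88)
for label, row in zip("SIR", T):
    print(label, [round(v, 2) for v in row])
print("rows sum to one:", rows_sum_to_one(T))
```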

Setting up our scenario in matrix form, however, makes for easier manipulation and calculations.

\[
\begin{array}{c|ccc}
  & S & I & R \\ \hline
S & .85 & .15 & 0 \\
I & 0 & .12 & .88 \\
R & 0 & 0 & 1.0
\end{array}
\tag{4}
\]

Each application of the SIR transition matrix represents a span of one week, referred to as one trial. As can be seen in matrix (4) above, each row adds up to one. This is typical of any probability calculation of a whole. In row S, 85 percent of people who are susceptible remain in that state after one week. The remaining 15 percent become infected. For row I, 12 percent of those infected remain infected, while the remaining 88 percent recover. Once recovered, people have a 100 percent probability of remaining recovered. Now that our values are in place, we can calculate a trial within the population as the disease spreads:

\[
x^{(1)} = (.90,\; .07,\; .03)
\begin{pmatrix}
.85 & .15 & 0 \\
0 & .12 & .88 \\
0 & 0 & 1.0
\end{pmatrix}
\]

Our new condition vector, $x^{(1)}$, is

\begin{align*}
0.85(.90) + 0(.07) + 0(.03) &= 0.765 \\
0.15(.90) + 0.12(.07) + 0(.03) &= 0.143 \\
0(.90) + 0.88(.07) + 1.0(.03) &= 0.092
\end{align*}

\[
x^{(1)} = (0.765,\; 0.143,\; 0.092)
\]

This shows that after one week, 76.5 percent of the population is predicted to be susceptible, 14.3 percent is predicted to be infected, and 9.2 percent is predicted to be recovered.

To calculate for week two, we can now plug $x^{(1)}$ into our SIR matrix using the same method of calculation:

\[
x^{(2)} = (0.765,\; 0.143,\; 0.092)
\begin{pmatrix}
.85 & .15 & 0 \\
0 & .12 & .88 \\
0 & 0 & 1.0
\end{pmatrix}
= (0.650,\; 0.132,\; 0.218)
\]

This is the predicted outcome two weeks from the initial condition. Since a Markov chain depends only on the immediately preceding conditions, the transition matrix remains the same every week. Thus, raising the transition matrix to a power provides a way to calculate several weeks at once. The formula looks like this:

\[
(\text{Initial Condition Vector}) \, (\text{Transition matrix})^{t}
\]

where $t$ is the number of trials. With this formula we can calculate more distant probabilities without having to repeatedly multiply by the SIR matrix. In order for this to be possible, the rows of the matrix must be probability vectors (each adding up to one), and the matrix must be square. Lastly, all elements of the matrix must be non-negative. Thus, to find the predicted probabilities five weeks from now, we can simply plug in 5 for $t$:

\[
(.90,\; .07,\; .03)
\begin{pmatrix}
.85 & .15 & 0 \\
0 & .12 & .88 \\
0 & 0 & 1.0
\end{pmatrix}^{5}
\approx (.399,\; .082,\; .519)
\]

Next, if we plug in $t = 10$ for ten weeks, we can see what direction the disease seems to be heading over time:

\[
(.90,\; .07,\; .03)
\begin{pmatrix}
.85 & .15 & 0 \\
0 & .12 & .88 \\
0 & 0 & 1.0
\end{pmatrix}^{10}
\approx (.177,\; .036,\; .786)
\]

As can be seen, the further out in time we predict, the more the population ratio tilts toward recovered. This is due to the fact that the recovered state is absorptive.
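The matrix-power formula above can be checked numerically. The following is a minimal sketch in plain Python, using the transition values and initial condition vector given in the text; the helper names `mat_mul`, `mat_pow`, and `vec_mat` are hypothetical and written out by hand so the example has no dependencies.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(T, t):
    """Raise the square matrix T to the non-negative integer power t."""
    n = len(T)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity
    for _ in range(t):
        result = mat_mul(result, T)
    return result


def vec_mat(x, T):
    """Multiply the row vector x by the matrix T."""
    return [sum(x[i] * T[i][j] for i in range(len(x))) for j in range(len(T[0]))]


T = [[0.85, 0.15, 0.0],
     [0.0, 0.12, 0.88],
     [0.0, 0.0, 1.0]]
x0 = [0.90, 0.07, 0.03]

# Condition vectors after 1, 2, 5, and 10 weeks.
results = {t: vec_mat(x0, mat_pow(T, t)) for t in (1, 2, 5, 10)}
for t, x in results.items():
    print(t, [round(v, 3) for v in x])
```

Repeated multiplication is enough for a handful of weeks; for large $t$ a library routine for matrix powers would be the more usual choice.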

3.1. Steady State Vector

Another common goal in using a Markov transition matrix is to find when a scenario will become fixed in place. One could look at this as the equation:

\[
(\text{Initial Condition Vector}) \, (\text{Markov matrix}) = (\text{Initial Condition Vector})
\]

This can be simpler to imagine with the initial condition vector $p$ in column form. Thus, we'll transpose our transition matrix, call the result $M$, and have the equation:

\[
Mp = p
\]

We can now solve for $p$ to see if the scenario will ever remain fixed in a condition. Essentially, we will find an eigenvector of $M$ whose eigenvalue is one. This eigenvector is referred to as the steady state vector.

\begin{align*}
Mp &= p \\
Mp - p &= 0 \\
(M - I)p &= 0
\end{align*}

\[
M - I =
\begin{pmatrix}
-.15 & 0 & 0 \\
.15 & -.88 & 0 \\
0 & .88 & 0
\end{pmatrix}
\]

After row reducing, we get the eigenvector $p = (0, 0, 1)$.

Thus, the only steady state for this Markov process is reached once the entire population has been absorbed into the recovered state. Since the transposed SIR matrix is lower triangular, its eigenvalues are its diagonal entries, so it can be seen that there are also two other eigenvalues, and thus two other potential eigenvectors for the steady state vector. However, the eigenvalues 0.85 and 0.12 both result in eigenvectors with one or more negative values, as shown below:

\begin{align*}
0.85 &: \; (0.633,\; 0.130,\; -0.763) \\
0.12 &: \; (0,\; 0.707,\; -0.707)
\end{align*}

It stands to reason that we can't have a negative portion of the population in any state, because the portions have to add up to the whole population. Therefore, the absorptive row R provides the only usable eigenvector for the steady state vector, $p = (0, 0, 1)$, which makes sense seeing that any of the population that becomes recovered stays recovered. Eventually, all of the population will be within that state.

Another way we can check this is by taking a limit as the number of trials (weeks) goes to infinity. This should give us the end result of the Markov process, which should match our only steady state vector.

\[
\lim_{t \to \infty} (.90,\; .07,\; .03)
\begin{pmatrix}
.85 & .15 & 0 \\
0 & .12 & .88 \\
0 & 0 & 1.0
\end{pmatrix}^{t}
= (0,\; 0,\; 1)
\]

As can be seen, as $t$ approaches infinity, the condition vector matches our steady state vector, $(0, 0, 1)$, staying consistent with our concept of an absorptive state.
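Both checks described above, the fixed-point property and the long-run limit, can be reproduced with a short Python sketch. It assumes the transition matrix and initial vector from the text; the names `step` and `is_eigen_row` are hypothetical, the 200-week horizon is an arbitrary stand-in for "infinity", and the eigenvectors are written as row vectors with one particular overall sign (the sign of an eigenvector is arbitrary).

```python
def step(x, T):
    """Run the row vector x through the transition matrix T once."""
    return [sum(x[i] * T[i][j] for i in range(len(x))) for j in range(len(T[0]))]


def is_eigen_row(v, lam, T, tol=1e-3):
    """Check v*T = lam*v, which is the same as M*v = lam*v for M = transpose of T."""
    return all(abs(a - lam * b) <= tol for a, b in zip(step(v, T), v))


T = [[0.85, 0.15, 0.0],
     [0.0, 0.12, 0.88],
     [0.0, 0.0, 1.0]]

steady = [0.0, 0.0, 1.0]
print("fixed point:", step(steady, T) == steady)

# The other two eigenvectors each contain a negative entry (values rounded).
print(is_eigen_row([0.633, 0.130, -0.763], 0.85, T))
print(is_eigen_row([0.0, 0.707, -0.707], 0.12, T))

# Approximate the limit by iterating for many weeks.
x = [0.90, 0.07, 0.03]
for _ in range(200):
    x = step(x, T)
print([round(v, 6) for v in x])
```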

3.2. Additional Absorptive States

For our last example of a Markov matrix manipulation, we must add an additional row and column to represent a new state. Let's say that in our SIR model there is also now a D, for deceased. This will be a second absorptive row, and we'll shift some of the numbers for our example.

\[
\begin{array}{c|cccc}
  & S & I & R & D \\ \hline
S & .85 & .15 & 0 & 0 \\
I & 0 & .20 & .70 & .10 \\
R & 0 & 0 & 1 & 0 \\
D & 0 & 0 & 0 & 1
\end{array}
\]

For this SIR-D model, the disease is a bit more aggressive, and 10 percent of people that get infected die. Now, rather than calculate trials by matrix power as before, we will introduce what is known as the IODA form. By manipulating the matrix and taking advantage of absorptive states, we can rearrange the SIR-D matrix into a more useful form. An IODA form matrix consists of categorized parts, as shown below:

\[
\begin{pmatrix}
I & O \\
D & A
\end{pmatrix}
\]

The I section will be an identity matrix for the absorptive states, O will be a null matrix, D consists of the probabilities of transitions from non-absorptive to absorptive states, and A consists of the probabilities of transitions between non-absorptive states. By reordering rows and columns, we can transform our SIR-D matrix into IODA form. We can place rows three and four as one and two respectively, and do the same with our columns.

\[
\begin{array}{c|cccc}
  & R & D & S & I \\ \hline
R & 1 & 0 & 0 & 0 \\
D & 0 & 1 & 0 & 0 \\
S & 0 & 0 & .85 & .15 \\
I & .70 & .10 & 0 & .20
\end{array}
\]
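The reordering into IODA form can be sketched in Python as follows. The numeric entries of the SIR-D matrix here are assumptions chosen to be consistent with the figures quoted in the text (10 percent of infected individuals die each week, and the S row is kept from the earlier example); the function name `to_ioda` and the block variable names are likewise hypothetical.

```python
def to_ioda(T, labels):
    """Reorder states so absorptive ones come first; return matrix, labels, count."""
    n = len(T)
    absorbing = [i for i in range(n) if T[i][i] == 1.0]
    transient = [i for i in range(n) if T[i][i] != 1.0]
    order = absorbing + transient
    # Apply the same reordering to both the rows and the columns.
    reordered = [[T[r][c] for c in order] for r in order]
    return reordered, [labels[i] for i in order], len(absorbing)


# Assumed SIR-D values (rows and columns ordered S, I, R, D).
T = [[0.85, 0.15, 0.0, 0.0],
     [0.0, 0.20, 0.70, 0.10],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

P, order_labels, k = to_ioda(T, ["S", "I", "R", "D"])

# Slice out the four blocks of the IODA form.
I_block = [row[:k] for row in P[:k]]
O_block = [row[k:] for row in P[:k]]
D_block = [row[:k] for row in P[k:]]   # non-absorptive -> absorptive
A_block = [row[k:] for row in P[k:]]   # non-absorptive -> non-absorptive

print(order_labels)
print("D =", D_block)
print("A =", A_block)
```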

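Partitioned into IODA sections, our matrix looks like this:

\[
\left(
\begin{array}{cc|cc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\ \hline
0 & 0 & .85 & .15 \\
.70 & .10 & 0 & .20
\end{array}
\right)
\]

We can now consider our IODA model to be four separate matrices: I, O, D, and A. Now, to further manipulate our findings, we will solve $N = (I - A)^{-1}$:

\[
N = \left(
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
-
\begin{pmatrix} .85 & .15 \\ 0 & .20 \end{pmatrix}
\right)^{-1}
=
\begin{pmatrix} .15 & -.15 \\ 0 & .80 \end{pmatrix}^{-1}
=
(1/.12)
\begin{pmatrix} .80 & .15 \\ 0 & .15 \end{pmatrix}
=
\begin{pmatrix} 6.67 & 1.25 \\ 0 & 1.25 \end{pmatrix}
\]

We can call this new matrix N. The significance of N is that its entries give the average number of trials a member of the population spends in each non-absorptive state before being absorbed. To interpret our matrix, we can divide by 1.25, the average number of trials spent infected:

\[
\begin{pmatrix} 5.33 & 1 \\ 0 & 1 \end{pmatrix}
\]

This shows that a susceptible member of the population spends on average 5.33 times as many trials susceptible as infected. In total, a susceptible member goes through an average of 6.67 + 1.25 = 7.92 trials before being absorbed into either the recovered or the deceased state, while a member who is already infected is absorbed after an average of 1.25 trials.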
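The calculation of N can be sketched in Python with a hand-written 2x2 inverse. The values of the A block are the same assumed values as in the partitioned matrix (S and I rows restricted to the S and I columns), and the names `inverse_2x2` and `expected_trials` are hypothetical. Under the standard reading of this matrix, each row sum of N is the expected number of trials before absorption when starting from that row's state.

```python
def inverse_2x2(M):
    """Invert a 2x2 matrix using the adjugate divided by the determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]


# Assumed A block: transitions among the non-absorptive states S and I.
A = [[0.85, 0.15],
     [0.0, 0.20]]

I_minus_A = [[1.0 - A[0][0], -A[0][1]],
             [-A[1][0], 1.0 - A[1][1]]]
N = inverse_2x2(I_minus_A)

# Row sums: expected number of trials before absorption, starting from S or from I.
expected_trials = [sum(row) for row in N]

# Adding 0.0 turns a negative zero into 0.0 for tidier printing.
print([[round(v, 2) + 0.0 for v in row] for row in N])
print([round(v, 2) for v in expected_trials])
```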

4. Conclusion

Through our disease scenario we have shown a few of the many manipulations and calculations that can be made through Markov Chain processes. Markov Chains can be used for a wide range of purposes and have proven useful in many fields when it comes to predicting future outcomes. By simply plugging in probabilities, one can apply this discrete stochastic process to many different topics. Steady states, long- and short-term predictions, and averages are just a few of the many manipulations that are available when dealing with Markov Chain Theory.
