
1 APM 541: Stochastic Modelling in Biology. Discrete-time Markov Chains. Jay Taylor (ASU), Fall 2013.

2 Outline: 1 Motivation; 2 Markov Processes; 3 Markov Chains: Basic Properties; 4 Asymptotics: Class Structure; 5 Asymptotics: Hitting Times and Probabilities; 6 Asymptotics: Stationary Distributions; 7 Time Reversal; 8 Hidden Markov Models.

3 Motivation: Stochastic Processes. Definition: A stochastic process is simply a collection of random variables {X_t : t ∈ T} indexed by some set T, where all of the variables are defined on the same underlying probability space. We say that the process is E-valued if all of the variables take values in the set E. We say that the process is a discrete-time stochastic process if the index set T is a discrete subset of R, e.g., T ⊆ Z, or a continuous-time stochastic process if the index set is an interval T = (a, b) ⊆ R. Interpretation: Usually, T ⊆ R is a subset of the real line and we think of X_t as specifying the state of some random system at time t ∈ T. Thus the stochastic process as a whole describes the noisy or random evolution of the system over time.

4 Motivation. To motivate our consideration of Markov processes, let us compare two models of density-dependent population regulation in a finite population of an asexual (or hermaphroditic) species. In both models, we will assume that reproduction is seasonal and we will let X_t denote the number of individuals alive at the beginning of year t. Model 1 will make the following assumptions: Individuals become reproductively mature in the year following their birth, i.e., an individual born in year t will begin to reproduce in year t + 1. Conditional on the current population size X_t, the number of surviving offspring born to an individual alive at time t is Poisson-distributed with mean r exp(-γ_b X_t). Furthermore, the numbers of offspring born to different individuals are independent of one another. Following reproduction, each individual dies, independently of the rest, with probability 1 - (1 - µ) exp(-γ_d X_t).

5 Motivation. Assume that the individuals alive at the beginning of year t are labeled i = 1, ..., X_t and let η_{i,t} denote the number of offspring born to the i-th individual in that season. Since η_{1,t}, ..., η_{X_t,t} are independent Poisson-distributed random variables, it follows that if we condition on X_t = n, then the total number of offspring born at time t is also Poisson-distributed, with mean n r e^{-γ_b n}: η_t ≡ Σ_{i=1}^n η_{i,t} ~ Poisson(n r e^{-γ_b n}). Furthermore, under this same condition, the number of surviving adults S_t is binomially-distributed with parameters n and (1 - µ)e^{-γ_d n}: S_t ~ Binomial(n, (1 - µ)e^{-γ_d n}). Combining these two observations, it follows that the number of individuals alive at the beginning of year t + 1 is X_{t+1} = S_t + η_t.

6 Motivation. This process can be simulated using the following procedure: 1 Ask the user to input values for the parameters r, µ, γ_b, γ_d as well as the initial number of individuals X_0 alive at time t = 0. 2 Given that X_t = n, generate independent random variables η_t ~ Poisson(n r e^{-γ_b n}) and S_t ~ Binomial(n, (1 - µ)e^{-γ_d n}). 3 Set X_{t+1} = S_t + η_t. 4 Increase t to t + 1 and return to step 2. Several features of this model contribute to the ease with which the process can be simulated. One is that we are able to directly sample the total number of individuals born in each season, rather than having to sample each of the individual offspring numbers separately. Equally importantly, to determine the number of individuals alive at time t + 1, we only need to know the number alive at time t. A simulation sketch is given below.
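
The following Python sketch (added here; it is not part of the original notes, and the function name simulate_model1 and the parameter values in the usage line are illustrative choices) implements the four-step procedure above using numpy.

```python
import numpy as np

def simulate_model1(r, mu, gamma_b, gamma_d, X0, T, rng=None):
    """Simulate Model 1 for T seasons; returns the trajectory X_0, ..., X_T."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.zeros(T + 1, dtype=int)
    X[0] = X0
    for t in range(T):
        n = X[t]
        # Total offspring this season: Poisson with mean n * r * exp(-gamma_b * n)
        eta = rng.poisson(n * r * np.exp(-gamma_b * n))
        # Surviving adults: Binomial(n, (1 - mu) * exp(-gamma_d * n))
        S = rng.binomial(n, (1.0 - mu) * np.exp(-gamma_d * n))
        X[t + 1] = S + eta
    return X

# Example usage with illustrative parameter values
traj = simulate_model1(r=2.0, mu=0.4, gamma_b=0.01, gamma_d=0.005, X0=10, T=50)
print(traj)
```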

7 Motivation. In the second model, we will assume that the density of the population at time t affects not only the survival and reproduction of adults alive at that time, but also the development of any juveniles that are born in that year. Specifically, we will assume that density has a lasting negative effect on the future reproduction and survival of such juveniles, as described below: Suppose that an individual is τ years old. Conditional on the population density X_t in the present and on the population density X_{t-τ} in the year when that individual was born, the number of surviving offspring born to that individual at time t is Poisson-distributed with mean r e^{-γ_b X_t} e^{-λ_b X_{t-τ}}. Following reproduction, the probability that this individual dies is 1 - µ e^{-γ_d X_t} e^{-λ_d X_{t-τ}}. Apart from these modifications, we will assume that Model 2 coincides with Model 1.

8 Motivation. As innocuous as these modifications may seem, they make the task of simulating Model 2 much more computationally demanding than Model 1. Specifically, we must deal with the following complications: Because reproduction and survival now depend on the density of the population in each individual's natal year, we must keep track of the age of each individual in the population so that we can determine when it was born. This also necessitates sampling Poisson and binomially-distributed random variables for every age class rather than for the population as a whole. More seriously, we must know the population density not only at the present time t, but potentially at every time 0, 1, 2, ..., t - 1 in the past. In other words, the entire history of the population is needed to propagate the process forward from time t to t + 1. The history-dependence exhibited by Model 2 is a major obstacle to both analytical and Monte Carlo investigations. Although this may sometimes be unavoidable, it is usually preferable to begin with a model that can be propagated forward using only the present state of the system.

9 Markov Processes. Stochastic models that evolve in a manner that depends only on the current state are known as Markov processes. The formal definition is given below. Definition: A stochastic process X = (X_n : n ≥ 0) with values in a set E is said to be a discrete-time Markov process if for every n ≥ 0 and every set of values x_0, x_1, ..., x_n ∈ E, we have P(X_{n+1} ∈ A | X_0 = x_0, X_1 = x_1, ..., X_n = x_n) = P(X_{n+1} ∈ A | X_n = x_n), whenever A is a subset of E such that {X_{n+1} ∈ A} is an event. In this case, the functions defined by p_n(x, A) = P(X_{n+1} ∈ A | X_n = x) are called the one-step transition probabilities of X. If the functions p_n(x, A) do not depend on n, i.e., if there is a function p such that p(x, A) = P(X_{n+1} ∈ A | X_n = x) for every n ≥ 0, then we say that X is a time-homogeneous Markov process with transition function p. Otherwise, X is said to be time-inhomogeneous.

10 Markov Processes. Remarks: Markov processes are sometimes said to lack memory in the sense that if we know the current state of the process X_t = x, then the values assumed by the process at future times, say X_{t+s} = y, do not depend on how the process arrived at state x at time t. In other words, the process is only aware of its current location and forgets how it arrived there. Equivalently, it can be shown that a Markov process has the following property, which is known as the Markov property: if we condition on the event {X_n = x}, then the variables (X_{n+k} : k ≥ 1) are independent of the variables (X_{n-k} : k ≥ 1), i.e., the future is conditionally independent of the past given the present. Although time-inhomogeneous processes have useful applications, here we will assume that all of our processes are time-homogeneous. Indeed, in some cases, we can convert a time-inhomogeneous process to one that is time-homogeneous by explicitly modeling the variables that are responsible for the changing transition probabilities.

11 Markov Processes. Example: Discrete-time Random Walks. Let ξ_0, ξ_1, ... be an i.i.d. sequence of R^d-valued random variables with density f and define the process X = (X_n : n ≥ 0) by setting X_0 = 0 and X_{n+1} = X_n + ξ_n. X is said to be a d-dimensional discrete-time random walk, and a simple calculation shows that X is a time-homogeneous Markov process with transition function P(X_{n+1} ∈ A | X_0 = x_0, ..., X_n = x_n) = P(X_{n+1} ∈ A | X_n = x_n) = P(x_n + ξ_n ∈ A | X_n = x_n) = P(ξ_n ∈ A - x_n) = ∫_A f(y - x_n) dy. Interpretation: We think of X_n as the location at time n of an individual animal or particle that moves by taking random jumps at fixed time intervals.
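
A short simulation sketch of such a walk (added; the choice of standard Gaussian jumps is purely illustrative, any density f could be substituted):

```python
import numpy as np

def random_walk(n_steps, d=2, rng=None):
    """Simulate a d-dimensional random walk with i.i.d. standard Gaussian jumps."""
    rng = np.random.default_rng() if rng is None else rng
    steps = rng.standard_normal((n_steps, d))                   # xi_0, ..., xi_{n-1}
    path = np.vstack([np.zeros(d), np.cumsum(steps, axis=0)])   # X_0 = 0, X_{n+1} = X_n + xi_n
    return path

path = random_walk(1000)
print(path[-1])   # location after 1000 jumps
```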

12 Markov Chains: Basic Properties. Markov Chains. Markov processes that take values in a discrete state space E are important enough to merit their own theory and terminology. In particular, it will be convenient for us to adopt the following two conventions: Without loss of generality, we will assume that the state space is either E = {1, ..., m} when E is finite or E = {1, 2, ...} when E is countably infinite. This amounts to assigning integer labels to the original objects that were in the state space. Probability distributions on E will be represented by row vectors, e.g., we will use the vector ν = (ν_1, ν_2, ...) to represent the distribution that assigns probability ν_i to state i ∈ E. If E is infinite, then we will regard ν as an infinite-dimensional vector.

13 Markov Chains: Basic Properties. With these conventions in force, we have the following definition. Definition: A stochastic process X = (X_n : n ≥ 0) with values in the countable set E = {1, 2, ...} is said to be a time-homogeneous discrete-time Markov chain with initial distribution ν and transition matrix P = (p_ij) if (1) for every i ∈ E, P(X_0 = i) = ν_i; and (2) for every n ≥ 0 and every set of values x_0, ..., x_{n+1} ∈ E, we have P(X_{n+1} = x_{n+1} | X_0 = x_0, X_1 = x_1, ..., X_n = x_n) = P(X_{n+1} = x_{n+1} | X_n = x_n) = p_{x_n x_{n+1}}. Caveat: Some authors (in particular, demographers) define the transition matrix of a Markov chain to be the transpose of the matrix P that we have introduced above. In this case, p_{ji} is the probability that the process moves from state i to state j in a single step.

14 Markov Chains: Basic Properties. Simulation of Discrete-time Markov Chains. In principle, a discrete-time Markov chain X = (X_n : n ≥ 0) with initial distribution ν and transition matrix P can be simulated using the following algorithm: 1 First, choose the initial state of the Markov chain X_0 = x_0 by sampling the value x_0 from the distribution ν on E. 2 If X_t = i ∈ E, choose the next state of the Markov chain by sampling a value j ∈ E from the probability distribution (p_{i1}, p_{i2}, ...) on E. Set X_{t+1} = j. 3 Increase t to t + 1 and return to step 2. In practice, the ease with which we can simulate a Markov chain depends on how difficult it is to generate random samples from the initial distribution ν and from the transition probabilities specified by the rows of the transition matrix P. A sketch of this algorithm for a finite state space appears below.
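
A minimal Python sketch of the algorithm above (added; states are labeled 0, ..., m-1 here rather than 1, ..., m as in the notes, and the function name simulate_chain is illustrative):

```python
import numpy as np

def simulate_chain(P, nu, n_steps, rng=None):
    """Simulate a finite-state DTMC with transition matrix P (a numpy array whose
    rows sum to 1) and initial distribution nu; states are labeled 0, ..., m-1."""
    rng = np.random.default_rng() if rng is None else rng
    m = len(nu)
    X = np.empty(n_steps + 1, dtype=int)
    X[0] = rng.choice(m, p=nu)               # sample X_0 from nu
    for t in range(n_steps):
        X[t + 1] = rng.choice(m, p=P[X[t]])  # sample X_{t+1} from row X_t of P
    return X
```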

15 Markov Chains: Basic Properties. Example: Seasonal Fecundity in Birds. Etterson et al. (2009) propose a simple Markov chain model to describe the reproductive activities of a single female bird in a single breeding season. In these models, a female can occupy one of four states: state 1: she may be actively nesting; state 2: she may have successfully fledged a brood; state 3: she may have failed to fledge a brood; state 4: she may have completed all nesting activities for the season. Thus, the state space can be represented as E = {1, 2, 3, 4}, where each number corresponds to one of the four states described above, and the variable X_n ∈ E describes the state of the female following the n-th change in state.

16 Markov Chains: Basic Properties. To complete the description of the Markov chain X = (X_t : t ≥ 0), we need to specify both the initial distribution ν = (1, 0, 0, 0) of the female's state at the beginning of the season (i.e., she will initiate breeding) as well as the transition matrix P = (p_ij) determining how this state changes over time. Etterson and colleagues assume that the transition matrix can be written in the following form:
P =
( 0        s^a   1 - s^a   0   )
( 1 - q_s  0     0         q_s )
( 1 - q_f  0     0         q_f )
( 0        0     0         1   )
where s is the daily nest survival probability, a is the average time from first egg to fledging, q_s is the probability that a female quits breeding following a successful breeding attempt, and q_f is the probability that a female quits breeding following a failed breeding attempt. Numerical estimates of these parameters were obtained from field studies of a population of Eastern Meadowlarks in Illinois.

17 Markov Chains: Basic Properties. The Eastern Meadowlark (Sturnella magna) is a member of the blackbird family (Icteridae). It is medium-sized, with a streaked brown back and yellow underparts with a black V on the breast. It is most common in grasslands, prairies, and agricultural settings, and eats mainly insects, as well as some seeds.

18 Markov Chains: Basic Properties. The matrices that occur as transition matrices of Markov chains are sufficiently important in their own right to warrant the following definition. Definition: A matrix P = (p_ij), indexed by a countable set E, is said to be a stochastic matrix if all of the entries are non-negative, p_ij ≥ 0, and all of the row sums are equal to one, i.e., Σ_{j ∈ E} p_ij = 1 for every i. Every transition matrix is a stochastic matrix, and it can be shown that every stochastic matrix is the transition matrix of some Markov chain. Furthermore, stochastic matrices satisfy a number of useful properties that are stated in the next lemma.

19 Markov Chains: Basic Properties. Lemma: (1) The product of any two stochastic matrices, P and Q, of the same dimension is itself a stochastic matrix. (2) Every stochastic matrix P has λ = 1 as an eigenvalue, with corresponding right eigenvector v = (1, 1, ...)^T: Pv = v. (3) Every eigenvalue λ of a stochastic matrix lies in the unit disk: |λ| ≤ 1. Proof: (1) Suppose that P = (p_ij) and Q = (q_ij) are stochastic matrices. Since all of the elements p_ij and q_ij are non-negative, it is evident that every element of PQ, (PQ)_ij = Σ_{k ∈ E} p_ik q_kj ≥ 0, is non-negative. Furthermore, every row sum of PQ is equal to 1 since Σ_{j ∈ E} (PQ)_ij = Σ_{j ∈ E} Σ_{k ∈ E} p_ik q_kj = Σ_{k ∈ E} p_ik ( Σ_{j ∈ E} q_kj ) = Σ_{k ∈ E} p_ik = 1.

20 Markov Chains: Basic Properties. To establish (2), we simply need to show that v = (1, 1, ...)^T is a right eigenvector for P corresponding to eigenvalue 1. However, using the fact that each of the row sums of P is equal to 1, we have (Pv)_i = Σ_{k ∈ E} p_ik v_k = Σ_{k ∈ E} p_ik = 1, which shows that Pv = v. Lastly, to establish (3), we can either invoke Gershgorin's disk theorem (look it up!) or observe that if v = (v_1, v_2, ...) is any column vector with finite sup norm ||v|| ≡ sup_{i ∈ E} |v_i| < ∞, then |(Pv)_i| = |Σ_{k ∈ E} p_ik v_k| ≤ Σ_{k ∈ E} p_ik |v_k| ≤ Σ_{k ∈ E} p_ik ||v|| = ||v||. This shows that ||Pv|| ≤ ||v||, and so there can be no vector v with Pv = λv when |λ| > 1.
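
As a quick numerical illustration (added, not part of the original notes), properties (2) and (3) can be checked for a small, arbitrarily chosen stochastic matrix:

```python
import numpy as np

P = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])              # rows sum to 1

eigvals = np.linalg.eigvals(P)
print(np.allclose(P @ np.ones(3), np.ones(3)))   # Pv = v for v = (1,1,1)^T: True
print(np.max(np.abs(eigvals)) <= 1 + 1e-12)      # all eigenvalues lie in the unit disk: True
```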

21 Markov Chains: Basic Properties. One consequence of the Markov property is that the joint distribution of the values assumed by a Markov chain from time 0 through time n takes a particularly simple form. Theorem: Let X be a time-homogeneous discrete-time Markov chain with transition matrix P = (p_ij) and initial distribution ν on E. Then P(X_0 = x_0, X_1 = x_1, ..., X_n = x_n) = ν(x_0) Π_{i=0}^{n-1} p_{x_i, x_{i+1}}.

22 Markov Chains: Basic Properties. Proof: By repeated conditioning and use of the Markov property, we have P(X_0 = x_0, X_1 = x_1, ..., X_n = x_n) = P(X_0 = x_0, ..., X_{n-1} = x_{n-1}) P(X_n = x_n | X_0 = x_0, ..., X_{n-1} = x_{n-1}) = P(X_0 = x_0, ..., X_{n-1} = x_{n-1}) P(X_n = x_n | X_{n-1} = x_{n-1}) = P(X_0 = x_0, ..., X_{n-1} = x_{n-1}) p_{x_{n-1}, x_n} = ... = P(X_0 = x_0, X_1 = x_1) p_{x_1, x_2} ··· p_{x_{n-1}, x_n} = P(X_0 = x_0) P(X_1 = x_1 | X_0 = x_0) p_{x_1, x_2} ··· p_{x_{n-1}, x_n} = ν(x_0) Π_{i=0}^{n-1} p_{x_i, x_{i+1}}, where ν(x_0) is the probability of x_0 under the initial distribution ν.

23 Markov Chains: Basic Properties. Example: Recall that Etterson et al. (2009) estimated a transition matrix of the form given above for the breeding cycle of Eastern Meadowlarks. Here, state 1 corresponds to active breeding, state 2 corresponds to fledging, state 3 corresponds to nesting failure, and state 4 corresponds to cessation of breeding activities in that season. If we assume that all females begin the season with an active breeding attempt (X_0 = 1), then the probability that a female successfully rears her first brood, fails to fledge her second brood, and then successfully rears a third brood before stopping in the season is the probability of the path 1 → 2 → 1 → 3 → 1 → 2 → 4, namely P(1, 2, 1, 3, 1, 2, 4) = ν_1 p_12 p_21 p_13 p_31 p_12 p_24.

24 Markov Chains: Basic Properties. Multi-step Transition Probabilities. While the transition matrix specifies how a Markov chain will evolve from one time period to the next, for some applications we need to be able to calculate the transition probabilities over multiple time steps. As the name suggests, the n-step transition probabilities p^(n)_ij of a DTMC X are defined for any n ≥ 1 by p^(n)_ij = P(X_n = j | X_0 = i). In fact, our next theorem will show that the n-step transition probabilities are independent of time whenever X is time-homogeneous, i.e., for every t ≥ 0, p^(n)_ij = P(X_{t+n} = j | X_t = i), which means that p^(n)_ij is just the probability that the chain moves from state i to state j in n time steps. The next theorem expresses an important relationship that holds between the n-step transition probabilities of a DTMC X and its r-step and (n - r)-step transition probabilities.

25 Markov Chains: Basic Properties. Theorem (Chapman-Kolmogorov Equations): Assume that X is a time-homogeneous discrete-time Markov chain with n-step transition probabilities p^(n)_ij. Then, for any non-negative integer r < n, the identities p^(n)_ij = Σ_{k ∈ E} p^(r)_ik p^(n-r)_kj hold for all i, j ∈ E. Proof: By using first the law of total probability and then the Markov property, we have p^(n)_ij = P(X_n = j | X_0 = i) = Σ_{k ∈ E} P(X_n = j, X_r = k | X_0 = i) = Σ_{k ∈ E} P(X_n = j | X_r = k, X_0 = i) P(X_r = k | X_0 = i) = Σ_{k ∈ E} P(X_n = j | X_r = k) P(X_r = k | X_0 = i) = Σ_{k ∈ E} p^(r)_ik p^(n-r)_kj.

26 Markov Chains: Basic Properties. Many computations involving Markov chains can be carried out using techniques from linear algebra. To understand why this is so, let P^(n) = (p^(n)_ij) be the matrix containing the n-step transition probabilities and observe that the Chapman-Kolmogorov equations can be expressed as the following matrix equation: P^(n) = P^(r) P^(n-r). In particular, if we take n = 2 and r = 1, then since P^(1) = P, we see that P^(2) = PP = P^2. This, in turn, implies that P^(3) = P P^(2) = P P^2 = P^3, and continuing in this fashion we find that P^(n) = P^n. In other words, the matrix of n-step transition probabilities is equal to the n-th power of the one-step transition matrix. This leads to the following formula for the marginal distribution of a Markov chain at time n.
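
A short numerical illustration (added here; the matrix and initial distribution are arbitrary examples): the n-step transition probabilities can be computed with a matrix power.

```python
import numpy as np

P = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])

P6 = np.linalg.matrix_power(P, 6)   # 6-step transition probabilities P^(6) = P^6
nu = np.array([1.0, 0.0, 0.0])      # initial distribution concentrated on the first state
print(nu @ P6)                      # distribution of X_6
```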

27 Markov Chains: Basic Properties. Theorem: Suppose that X is a time-homogeneous discrete-time Markov chain with transition matrix P and initial distribution ν. Then the distribution of X_n is given by the row vector νP^n. Proof: The result follows from the law of total probability: P(X_n = j) = Σ_{i ∈ E} P(X_n = j | X_0 = i) P(X_0 = i) = Σ_{i ∈ E} ν_i p^(n)_ij = (νP^n)_j.

28 Markov Chains: Basic Properties. Example: To find the probability that a female meadowlark initiates at least 4 breeding attempts in a single season, we need to calculate the probability of the event X_6 = 1. Since the initial distribution is ν = (1, 0, 0, 0), the distribution of X_6 is given by νP^6 = (1, 0, 0, 0) P^6 = (0.116, 0, 0, 0.884). Thus, the probability that a female initiates breeding at least four times is equal to 0.116, while the probability that she initiates breeding three or fewer times is 0.884.

29 Asymptotics: Class Structure. Long-term Behavior of Markov Chains. When studying stochastic models of biological processes, we are often interested in the long-term behavior of the model. Such considerations are important, for example, in the following questions: What is the probability that a new mutation will spread in a population? What is the probability that a tumor cell will escape immune surveillance and grow to form a tumor? What is the probability that an infectious disease will persist in a population? What is the steady-state distribution of transcription factors and other proteins within a single cell? In the following sections, we will be concerned with two kinds of asymptotic behavior: absorption into a single state and long-term fluctuations in an equilibrium process.

30 Asymptotics: Class Structure. We begin by introducing some useful terminology. Definition: Let X be a discrete-time Markov chain on E with transition matrix P. (1) We say that i leads to j, written i → j, if for some integer n ≥ 0, p^(n)_ij = P_i(X_n = j) > 0. In other words, i → j if the process X beginning at X_0 = i has some positive probability of eventually arriving at j. (2) We say that i communicates with j, written i ↔ j, if i leads to j and j leads to i.

31 Asymptotics: Class Structure. Notice that the relation i ↔ j is an equivalence relation on E: (1) each element communicates with itself: i ↔ i; (2) i communicates with j if and only if j communicates with i; (3) if i communicates with j and j communicates with k, then i communicates with k. In other words, a Markov chain with values in a state space E induces a partition of E into disjoint subsets E_1, E_2, ..., with E = E_1 ∪ E_2 ∪ ..., where all of the elements contained in the subset E_i communicate with each other and no two elements contained in different subsets, say E_i and E_j, communicate with each other. This structure is formalized in the following definition.

32 Asymptotics: Class Structure. Definition: Let X be a DTMC on E with transition matrix P. (1) A nonempty subset C ⊆ E is said to be a communicating class if it is an equivalence class under the relation i ↔ j. In other words, each pair of elements in C communicates, and whenever i ∈ C and j ∈ E communicate, j ∈ C. (2) A communicating class C is said to be closed if whenever i ∈ C and i → j, we also have j ∈ C. If C is a closed communicating class for a Markov chain X, then once X enters C, it never leaves C. (3) A state i is said to be absorbing if {i} is a closed class, i.e., once the process enters state i, it is trapped there forever. (4) A Markov chain is said to be irreducible if the entire state space E is a communicating class.

33 Asymptotics: Class Structure. Example: The transition matrix for the Markov chain studied by Etterson and colleagues is:
P =
( 0        s^a   1 - s^a   0   )
( 1 - q_s  0     0         q_s )
( 1 - q_f  0     0         q_f )
( 0        0     0         1   )
Provided that s ∈ (0, 1), a > 0, and q_s, q_f ∈ (0, 1), the state space E = {1, 2, 3, 4} can be decomposed into two communicating classes, C_1 = {1, 2, 3} and C_2 = {4}. This shows that the Markov chain is not irreducible. Furthermore, C_1 is not closed, since states 2 and 3 lead out of C_1 into C_2. On the other hand, C_2 is closed and 4 is an absorbing state.

34 Asymptotics: Hitting Times and Probabilities. Suppose that X is a discrete-time Markov chain and let C ⊆ E be a closed communicating class for X. In this section we will consider the following two problems: (1) Given that X_0 = i, what is the probability that X is eventually absorbed by C? (2) Given that X_0 = i, what is the average time to absorption in C? Since we will be conditioning on the initial state of the Markov chain, it will be convenient to introduce the following notation. We will use P_i to denote the conditional distribution of the chain given X_0 = i, P_i(A) = P(A | X_0 = i), where A is any event defined in terms of the chain. Similarly, we will use E_i to denote conditional expectations given X_0 = i, E_i[Y] = E[Y | X_0 = i], where Y is any random variable defined in terms of the chain.

35 Asymptotics: Hitting Times and Probabilities. The main quantities of interest are defined below. Definition: Let X be a discrete-time Markov chain on E with transition matrix P and let C ⊆ E be a closed communicating class for X. (1) The absorption time of C is the random variable T^C ∈ {0, 1, ...} ∪ {∞} defined by T^C = min{n ≥ 0 : X_n ∈ C} if X_n ∈ C for some n ≥ 0, and T^C = ∞ if X_n ∉ C for all n. (2) The absorption probability of C starting from i is the probability h^C_i = P_i(T^C < ∞). (3) The mean absorption time of C starting from i is the expectation τ^C_i = E_i[T^C].

36 Asymptotics: Hitting Times and Probabilities. The following theorem asserts that the absorption probabilities h^C_i can be calculated by solving a certain system of linear equations. Theorem: The vector of absorption probabilities h^C = (h^C_1, h^C_2, ...) is the unique minimal non-negative solution of the system of linear equations h^C_i = 1 if i ∈ C, and h^C_i = Σ_{j ∈ E} p_ij h^C_j if i ∉ C. To say that h^C is a minimal non-negative solution means that there is no other non-negative solution with at least one component smaller than that of h^C. In fact, it can be shown that the second set of equations is satisfied even when i ∈ C, which means that the vector h^C is a right eigenvector for P corresponding to the eigenvalue λ = 1: h^C = P h^C.

37 Asymptotics: Hitting Times and Probabilities. Proof: I will show that h^C is a solution to this system of equations. For a proof of minimality, see the text Markov Chains by Norris (1997). A strategy that is often successful when working with Markov chains is to condition on what the chain does in a single step. Here we will condition on the state of the chain at time t = 1. However, we first note that h^C_i = 1 if i ∈ C, since the chain will clearly absorb in C if it starts there. On the other hand, if i ∉ C, then the law of total probability and the Markov property imply that h^C_i = P_i(X_n ∈ C for some n < ∞) = Σ_{j ∈ E} P_i(X_1 = j and X_n ∈ C for some n < ∞) = Σ_{j ∈ E} P_i(X_n ∈ C for some n < ∞ | X_1 = j) P_i(X_1 = j) = Σ_{j ∈ E} P_j(X_n ∈ C for some n < ∞) p_ij = Σ_{j ∈ E} p_ij h^C_j.

38 Asymptotics: Hitting Times and Probabilities. Example: Suppose that X is a Markov chain on the state space E = {1, 2, 3, 4} with the transition matrix
P =
( 1      0      0      0 )
( a      0      1 - a  0 )
( 0      1 - b  0      b )
( 0      0      0      1 )
where a, b ∈ (0, 1). Then both 1 and 4 are absorbing states and the chain can absorb in either one of these two states when it is started in state 2 or 3. Let h_i be the probability that it eventually absorbs in state 4 when started in state i: h_i = P_i(X_n = 4 for some n < ∞). From the last theorem, we know that h = (h_1, h_2, h_3, h_4) is the minimal non-negative solution to the system of equations h_i = Σ_{j ∈ E} p_ij h_j with h_4 = 1.

39 Asymptotics: Hitting Times and Probabilities. Written explicitly, this system of equations is: h_1 = h_1, h_2 = a h_1 + (1 - a) h_3, h_3 = (1 - b) h_2 + b, which does not have a unique solution (since h_1 is unconstrained). However, if we insist on a minimal non-negative solution, then h_1 = 0, which in turn uniquely determines h_2 and h_3: h_2 = (1 - a)b / (a + b - ab) and h_3 = b / (a + b - ab).
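
As a numerical sanity check (added, not from the original slides), one can solve the reduced linear system for the non-absorbing states directly and compare with the closed-form answer; the values a = 0.3 and b = 0.6 below are arbitrary illustrative choices.

```python
import numpy as np

a, b = 0.3, 0.6

# Unknowns h_2, h_3 with h_1 = 0 and h_4 = 1:
#   h_2 - (1 - a) h_3 = 0,    -(1 - b) h_2 + h_3 = b
A = np.array([[1.0, -(1.0 - a)],
              [-(1.0 - b), 1.0]])
rhs = np.array([0.0, b])
h2, h3 = np.linalg.solve(A, rhs)

print(h2, (1 - a) * b / (a + b - a * b))   # should agree
print(h3, b / (a + b - a * b))             # should agree
```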

40 Asymptotics: Hitting Times and Probabilities. The mean absorption times τ^C_i can also be calculated by solving a system of linear equations. Theorem: Suppose that C is a closed communicating class and that the absorption probability h^C_i = 1 for every i ∈ E, i.e., the chain is certain to absorb in C no matter where it is started. Then the vector of mean absorption times τ^C = (τ^C_1, τ^C_2, ...) is the minimal non-negative solution of the system of linear equations τ^C_i = 0 if i ∈ C, and τ^C_i = 1 + Σ_{j ∈ E} p_ij τ^C_j if i ∉ C. In this case, the system of equations is inhomogeneous, i.e., there are constant terms in the equations. Furthermore, the condition h^C_i = 1 is important, since otherwise some of the mean absorption times will be infinite (since we set T^C = ∞ if absorption never occurs).
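
For a finite chain this theorem reduces to a small linear solve over the states outside C. The following sketch (added; the helper name mean_absorption_times is hypothetical) sets up that system for a given transition matrix P and closed class C, assuming absorption in C is certain.

```python
import numpy as np

def mean_absorption_times(P, C):
    """Mean absorption times into the closed class C (a collection of state indices),
    assuming absorption in C is certain from every state."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    outside = [i for i in range(n) if i not in C]
    # Since tau_j = 0 for j in C, the equations tau_i = 1 + sum_j p_ij tau_j
    # restricted to the states outside C read (I - Q) tau = 1, with Q = P restricted.
    Q = P[np.ix_(outside, outside)]
    tau_out = np.linalg.solve(np.eye(len(outside)) - Q, np.ones(len(outside)))
    tau = np.zeros(n)
    tau[outside] = tau_out
    return tau
```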

41 Asymptotics: Hitting Times and Probabilities. Example: We can use this last result to calculate the mean number of breeding attempts by a female Eastern Meadowlark in the Etterson model. In this model, state 4 (cessation of breeding activities) is the unique absorbing state and the probability of absorption there is equal to 1. Using the transition matrix estimated by these authors, we obtain the following system of equations: τ_1 = 1 + p_12 τ_2 + p_13 τ_3, τ_2 = 1 + p_21 τ_1 + p_24 τ_4, τ_3 = 1 + p_31 τ_1 + p_34 τ_4, τ_4 = 0, where the p_ij are the estimated transition probabilities. This system has a unique solution for τ_1, τ_2, τ_3, which are the mean numbers of transitions that occur prior to absorption. To determine the mean number of breeding attempts, we take X_0 = 1 and observe that every breeding attempt passes through two events (active breeding followed by fledging or failure). Thus the mean number of breeding attempts is τ_1/2.

42 Asymptotics: Hitting Times and Probabilities. Example: Birth-Death Processes. A discrete-time Markov chain X = (X_n : n ≥ 0) is said to be a birth-death process if X takes values in a set E = {0, 1, ..., N} with a tridiagonal transition matrix P whose only nonzero entries are p_{0,0} = 1 - b_0 and p_{0,1} = b_0; p_{k,k-1} = d_k, p_{k,k} = 1 - (b_k + d_k), and p_{k,k+1} = b_k for 1 ≤ k ≤ N - 1; and p_{N,N-1} = d_N and p_{N,N} = 1 - d_N. Such processes are often used to model the dynamics of populations in which individuals reproduce or die one at a time. However, a more general characterization of a birth-death process is that it is a Markov chain on a linearly ordered set in which transitions can only take place from a state to itself or to an immediately adjacent state. For example, birth-death processes can be used to model the movement of a motor protein such as dynein along a microtubule.

43 Asymptotics: Hitting Times and Probabilities. Suppose that a birth-death process X is used to model a population of individuals that are reproducing and dying. In this case, X_n will denote the number of individuals present in the population at time n. If X_n = k with 1 ≤ k ≤ N - 1, then there are three possibilities for X_{n+1}. With probability b_k, one of the k individuals reproduces and the population size increases to k + 1. Here we are assuming that each reproduction results in the birth of exactly one offspring. With probability d_k, one of the k individuals dies and the population size decreases to k - 1. With probability 1 - b_k - d_k, no individual reproduces or dies during that time step and the population size remains at k. Alternatively, we could stipulate that this is the probability that exactly one individual gives birth and one individual dies, in which case there is still no change in the population size.

44 Asymptotics: Hitting Times and Probabilities. The boundary states 0 and N need special consideration. If b_0 = 0, then 0 is an absorbing state corresponding to extinction of the population. If b_0 > 0, then 0 is no longer absorbing and we can think of this parameter as the probability (per unit time) that an extinct population is rescued by a migrant arriving from outside. If N is finite, then b_N = 0 and we are tacitly assuming that density dependence is so strong that no individual reproduces when the population is this large. However, we can also take N = ∞, in which case the population may grow without bound. Birth-death processes have been used extensively in the biological literature in part because they have many nice analytical properties that allow one to do explicit computations without having to resort to Monte Carlo simulations.

45 Asymptotics: Hitting Times and Probabilities. Suppose that N is finite, that b_0 = b_N = 0, and that the remaining probabilities b_1, ..., b_{N-1} and d_1, ..., d_N are all positive. In this case 0 is the unique absorbing state for the Markov chain and it can be shown that the population is certain to go extinct at some finite time, which we will denote T_extinct. Our goal is to calculate the expected time to extinction, τ_k = E_k[T_extinct], as a function of the initial number of individuals k = 0, ..., N in the population. By the preceding theorem, we know that the τ_k satisfy the following system of equations: τ_0 = 0; τ_k = 1 + d_k τ_{k-1} + (1 - b_k - d_k) τ_k + b_k τ_{k+1} for 1 ≤ k ≤ N - 1; and τ_N = 1 + d_N τ_{N-1} + (1 - d_N) τ_N.

46 Asymptotics: Hitting Times and Probabilities. By bringing all of the τ_k over to the left-hand side, we can rewrite this system in the following more concise form: d_k(τ_{k-1} - τ_k) + b_k(τ_{k+1} - τ_k) = -1 for 1 ≤ k ≤ N - 1, and d_N(τ_{N-1} - τ_N) = -1. Notice that this is a two-term recursion (i.e., τ_{k+2} depends only on τ_{k+1} and τ_k), which is analogous to a second-order ODE and can be solved analogously, effectively by integrating twice. For our first integration, we will define a new variable Δ_k = τ_{k+1} - τ_k for k = 0, ..., N - 1. Substituting this variable into the two equations above gives: b_k Δ_k - d_k Δ_{k-1} = -1 for 1 ≤ k ≤ N - 1, and d_N Δ_{N-1} = 1.

47 Asymptotics: Hitting Times and Probabilities. This latter system can be solved by starting at the upper boundary (k = N - 1) and working down. Indeed, we have: Δ_{N-1} = 1/d_N and Δ_{k-1} = (b_k/d_k) Δ_k + 1/d_k for 1 ≤ k ≤ N - 1. This shows that Δ_{N-2} = 1/d_{N-1} + b_{N-1}/(d_{N-1} d_N), Δ_{N-3} = 1/d_{N-2} + b_{N-2}/(d_{N-2} d_{N-1}) + b_{N-2} b_{N-1}/(d_{N-2} d_{N-1} d_N), and so on, with the general expression Δ_k = 1/d_{k+1} + Σ_{j=k+2}^{N} (b_{k+1} ··· b_{j-1})/(d_{k+1} ··· d_j) for k = 0, ..., N - 1.

48 Asymptotics: Hitting Times and Probabilities. For our second integration, we will start at the lower boundary k = 0 and work up. Since τ_0 = 0, we have τ_1 = τ_1 - τ_0 = Δ_0. Also, τ_2 = τ_1 + Δ_1 = Δ_0 + Δ_1, τ_3 = τ_2 + Δ_2 = Δ_0 + Δ_1 + Δ_2, and in general τ_k = Σ_{i=0}^{k-1} Δ_i for k = 1, ..., N. Using our result for Δ_i, it follows that the mean time to extinction when there are k individuals is τ_k = Σ_{i=0}^{k-1} [ 1/d_{i+1} + Σ_{j=i+2}^{N} (b_{i+1} ··· b_{j-1})/(d_{i+1} ··· d_j) ].
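
The sketch below (added; the parameter values are arbitrary illustrative choices) evaluates this formula numerically and checks it against a direct linear solve of the system τ_k = 1 + d_k τ_{k-1} + (1 - b_k - d_k) τ_k + b_k τ_{k+1}.

```python
import numpy as np

N = 10
b = np.array([0.0] + [0.2] * (N - 1) + [0.0])   # b_0, ..., b_N with b_0 = b_N = 0
d = np.array([0.0] + [0.3] * N)                  # d_1, ..., d_N (d[0] is unused)

def tau_formula(k):
    """tau_k = sum_{i=0}^{k-1} [ 1/d_{i+1} + sum_{j=i+2}^{N} (b_{i+1}...b_{j-1})/(d_{i+1}...d_j) ]"""
    total = 0.0
    for i in range(k):
        delta = 1.0 / d[i + 1]
        for j in range(i + 2, N + 1):
            delta += np.prod(b[i + 1:j]) / np.prod(d[i + 1:j + 1])
        total += delta
    return total

# Direct solve of the linear system for tau_1, ..., tau_N (with tau_0 = 0)
A = np.zeros((N, N))
for k in range(1, N + 1):
    A[k - 1, k - 1] = b[k] + d[k] if k < N else d[N]
    if k > 1:
        A[k - 1, k - 2] = -d[k]
    if k < N:
        A[k - 1, k] = -b[k]
tau_solve = np.linalg.solve(A, np.ones(N))

print(tau_formula(3), tau_solve[2])   # the two values should agree
```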

49 Asymptotics: Hitting Times and Probabilities. Remarks: The results described in the previous two theorems are most useful when the state space is small or the transition matrix is sparse, i.e., most of the transition probabilities p_ij are equal to 0. Indeed, sparseness is part of the reason that birth-death processes are so amenable to analytical calculations. If these conditions are not satisfied, then it may be impractical to solve these systems of linear equations. In such cases, absorption probabilities and mean absorption times can be estimated by conducting stochastic simulations of the model and observing the proportion of simulations that end in absorption in a closed class C, as well as the times required to arrive in C.
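
A Monte Carlo sketch of that idea (added; the function name estimate_absorption and the replicate counts are illustrative, and the estimate of the mean time is conditional on runs that absorb within max_steps steps):

```python
import numpy as np

def estimate_absorption(P, start, C, n_reps=10000, max_steps=10000, rng=None):
    """Monte Carlo estimates of the probability of absorption in the closed set C
    within max_steps steps, and of the mean absorption time among absorbed runs."""
    rng = np.random.default_rng() if rng is None else rng
    m = P.shape[0]
    hits, times = 0, []
    for _ in range(n_reps):
        x = start
        for t in range(max_steps):
            if x in C:
                hits += 1
                times.append(t)
                break
            x = rng.choice(m, p=P[x])
    prob = hits / n_reps
    mean_time = np.mean(times) if times else np.nan
    return prob, mean_time
```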

50 Asymptotics: Stationary Distributions. Stationary Distributions. Recall that we say that a Markov chain is irreducible if the entire state space is a communicating class, i.e., if every pair of states communicates. In particular, because irreducible chains lack absorbing states, we usually cannot predict the states that will be occupied by the chain at distant times in the future, especially if the initial distribution is unknown. On the other hand, sometimes we can determine the distribution of the chain at such future times even when we don't know the initial distribution. The following definition is key. Definition: Let X be a discrete-time Markov chain with values in E and transition matrix P. A distribution π on E is said to be a stationary distribution for X if πP = π. In other words, π is a left eigenvector for P with corresponding eigenvalue 1.

51 Asymptotics: Stationary Distributions. The probabilistic meaning of stationary distributions is revealed by the following theorem. Theorem: Suppose that π is a stationary distribution for a Markov chain X with transition matrix P. If π is the distribution of X_0, then π is also the distribution of X_n for every n ≥ 0. Proof: Since X has initial distribution π and transition matrix P, we know that the distribution of X_n is πP^n = (πP)P^{n-1} = πP^{n-1} = ··· = π. In other words, any stationary distribution of a Markov chain is time invariant: if ever the process has π as its distribution, then it will retain this distribution at all future times.

52 Asymptotics: Stationary Distributions. Example: Suppose that X is a Markov chain on the state space E = {1, 2} with the transition matrix P = ( a  1 - a ; 1 - b  b ), where a, b ∈ (0, 1). Since π = (p, 1 - p) is a stationary distribution for X if and only if p satisfies the system of equations ap + (1 - b)(1 - p) = p and (1 - a)p + b(1 - p) = 1 - p, it follows that π = ( (1 - b)/(2 - a - b), (1 - a)/(2 - a - b) ) is the unique stationary distribution for X.
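
Numerically, the stationary distribution can be obtained as the normalized left eigenvector of P for the eigenvalue 1; the sketch below (added, with arbitrary illustrative values of a and b) checks this against the closed form from the slide.

```python
import numpy as np

a, b = 0.7, 0.4
P = np.array([[a, 1 - a],
              [1 - b, b]])

# Left eigenvector of P for eigenvalue 1 (i.e. a right eigenvector of P.T), normalized
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

print(pi)
print([(1 - b) / (2 - a - b), (1 - a) / (2 - a - b)])   # closed form from the slide
```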

53 Asymptotics: Stationary Distributions. As the following examples illustrate, neither existence nor uniqueness of stationary distributions is assured. Example: Let Z_1, Z_2, ... be a sequence of i.i.d. Bernoulli random variables with success probability p > 0 and let X = (X_n : n ≥ 0) be the random walk defined by setting X_0 = 0 and X_n = Z_1 + ··· + Z_n for n ≥ 1. By the strong law of large numbers, we know that P(lim_{n→∞} X_n/n = p) = 1. However, this implies that X_n tends to infinity almost surely and so X has no stationary distribution on the integers.

54 Asymptotics: Stationary Distributions. Example: Suppose that X is a Markov chain on the state space E = {1, 2, 3, 4} with the transition matrix
P =
( 1      0      0      0 )
( a      0      1 - a  0 )
( 0      1 - b  0      b )
( 0      0      0      1 )
where a, b ∈ (0, 1). As we saw earlier, 1 and 4 are both absorbing states. Furthermore, any distribution of the form (p, 0, 0, 1 - p) with p ∈ [0, 1] is a stationary distribution of the chain, and so there are uncountably many stationary distributions. Indeed, any chain with more than one absorbing state will have infinitely many stationary distributions.

55 Asymptotics: Stationary Distributions. Under certain conditions, we can guarantee not only that a Markov chain will have a unique stationary distribution π, but also that the distribution of the chain at time n will tend to π as n → ∞ no matter what the initial distribution may be. For this to be true, we need to exclude certain types of oscillatory behavior, hence the following definition. Definition: A discrete-time Markov chain with values in E and transition matrix P is said to be aperiodic if for every state i ∈ E we have p^(n)_ii > 0 for all sufficiently large n. The need for aperiodicity is illustrated by the following example. If X is the Markov chain with transition matrix P = ( 0  1 ; 1  0 ), then π = (1/2, 1/2) is the unique stationary distribution. However, the distribution of the chain will be equal to the initial distribution at all future times n with n even. Indeed, this chain is periodic with period 2.

56 Asymptotics: Stationary Distributions. The following theorem provides sufficient conditions for a chain to converge to a unique stationary distribution from any initial distribution. Theorem: Suppose that X is an irreducible, aperiodic Markov chain with values in a finite set E and transition matrix P. Then X has a unique stationary distribution π and, for any initial distribution µ, we have lim_{n→∞} P(X_n = j) = π_j for every j ∈ E. In particular, lim_{n→∞} p^(n)_ij = π_j for all i, j ∈ E. Remark: Although existence of a stationary distribution is not guaranteed if we allow E to be infinite, the uniqueness and convergence parts of this theorem remain true for arbitrary state spaces whenever a stationary distribution can be shown to exist.

57 Asymptotics: Stationary Distributions. Example: If X is a Markov chain on the state space E = {1, 2} with transition matrix P = ( a  1 - a ; 1 - b  b ), where a, b ∈ (0, 1), then X is aperiodic and irreducible and
P^n = 1/(2 - a - b) ( 1 - b  1 - a ; 1 - b  1 - a ) + (a + b - 1)^n/(2 - a - b) ( 1 - a  -(1 - a) ; -(1 - b)  1 - b )
for every n ≥ 0. Furthermore, since |a + b - 1| < 1, it is clear that lim_{n→∞} P^n = ( (1 - b)/(2 - a - b)  (1 - a)/(2 - a - b) ; (1 - b)/(2 - a - b)  (1 - a)/(2 - a - b) ), where each row of the limiting matrix is the stationary distribution π that we found by direct calculation. In particular, we see that the rate of convergence to the stationary distribution is geometric and depends on the magnitude of a + b - 1.
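
A quick numerical check of this spectral decomposition (added; the values of a, b, and n are arbitrary illustrative choices):

```python
import numpy as np

a, b = 0.7, 0.4
P = np.array([[a, 1 - a], [1 - b, b]])
n = 5

M1 = np.array([[1 - b, 1 - a], [1 - b, 1 - a]]) / (2 - a - b)
M2 = np.array([[1 - a, -(1 - a)], [-(1 - b), 1 - b]]) / (2 - a - b)
print(np.allclose(np.linalg.matrix_power(P, n), M1 + (a + b - 1) ** n * M2))   # True
```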

58 Asymptotics: Stationary Distributions. Convergence to Stationarity. It is often (although not always) the case that convergence to the stationary distribution occurs at a geometric rate. For example, let P be the transition matrix of a Markov chain on a finite space E = {1, ..., n} with stationary distribution π and suppose that P has n distinct eigenvalues λ_1, ..., λ_n with 1 = λ_1 ≥ |λ_2| ≥ ··· ≥ |λ_n|. In this case, P is diagonalizable and there is a basis of left eigenvectors, say π^(1), ..., π^(n), with π^(1) = π, such that π^(i) P = λ_i π^(i). In particular, any distribution ν on E can be expanded as a linear combination of left eigenvectors, say ν = c_1 π^(1) + ··· + c_n π^(n).

59 Asymptotics: Stationary Distributions. Suppose that ν is the initial distribution of the chain and that |λ_2| < 1. Then the distribution of X_t is given by νP^t = (c_1 π^(1) + ··· + c_n π^(n)) P^t = c_1 λ_1^t π^(1) + c_2 λ_2^t π^(2) + ··· + c_n λ_n^t π^(n) = c_1 π + O(|λ_2|^t) → π as t → ∞. This shows that c_1 = 1 and that νP^t converges to π geometrically, at a rate which depends on the leading non-unit eigenvalue λ_2.

60 Asymptotics: Stationary Distributions. Example: Suppose that X is a birth-death process with N < ∞ and that the probabilities b_0, ..., b_{N-1} and d_1, ..., d_N all lie in the interval (0, 1), i.e., X is a birth-death process with immigration into extinct populations. In this case, X is also an irreducible, aperiodic Markov chain on a finite state space and so X has a unique stationary distribution π which satisfies the following system of equations: (1 - b_0)π_0 + d_1 π_1 = π_0; b_{k-1} π_{k-1} + (1 - b_k - d_k) π_k + d_{k+1} π_{k+1} = π_k for k = 1, ..., N - 1; and b_{N-1} π_{N-1} + (1 - d_N) π_N = π_N. By subtracting the probabilities on the right-hand side from both sides, this system can be rewritten in the following form: -b_0 π_0 + d_1 π_1 = 0; b_{k-1} π_{k-1} - (b_k + d_k) π_k + d_{k+1} π_{k+1} = 0 for k = 1, ..., N - 1; and b_{N-1} π_{N-1} - d_N π_N = 0.

61 Asymptotics: Stationary Distributions. Because each equation depends only on two or three successive probabilities, it is possible to solve this system recursively. The first equation can be rewritten as d_1 π_1 = b_0 π_0, which implies that π_1 = (b_0/d_1) π_0. Taking k = 1, we have b_0 π_0 - d_1 π_1 - b_1 π_1 + d_2 π_2 = 0. However, since the first two terms cancel, this reduces to -b_1 π_1 + d_2 π_2 = 0, which shows that π_2 = (b_1/d_2) π_1 = (b_1 b_0)/(d_2 d_1) π_0.

62 Asymptotics: Stationary Distributions. Continuing in this way, we find that π_k = (b_{k-1}/d_k) π_{k-1} = (b_{k-1} ··· b_0)/(d_k ··· d_1) π_0 for k = 1, ..., N. All that remains to be determined is π_0. However, since π is a probability distribution on E, the probabilities must sum to one. Consequently, 1 = Σ_{k=0}^{N} π_k = π_0 ( 1 + Σ_{k=1}^{N} (b_{k-1} ··· b_0)/(d_k ··· d_1) ), which shows that π_0 = ( 1 + Σ_{k=1}^{N} (b_{k-1} ··· b_0)/(d_k ··· d_1) )^{-1}.

63 Asymptotics: Stationary Distributions. Combining these results, we see that the stationary distribution for this birth-death chain is π_k = (b_{k-1} ··· b_0)/(d_k ··· d_1) ( 1 + Σ_{j=1}^{N} (b_{j-1} ··· b_0)/(d_j ··· d_1) )^{-1} for k = 1, ..., N, with π_0 as above. For example, if b_0 = m, b_1 = ··· = b_{N-1} = b, and d_1 = ··· = d_N = d, then π_0 = ( 1 + (m/d)(1 - γ^N)/(1 - γ) )^{-1} and π_k = (m/d) γ^{k-1} π_0 for k = 1, ..., N, provided γ ≡ b/d ≠ 1.
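
A small sketch (added; the helper name bd_stationary and the parameter values are illustrative) that computes π directly from the product formula and checks the geometric special case above.

```python
import numpy as np

def bd_stationary(b, d):
    """Stationary distribution of a birth-death chain on {0, ..., N}.
    b[k] = birth probability in state k (k = 0, ..., N-1),
    d[k-1] = death probability d_k in state k (k = 1, ..., N)."""
    N = len(d)
    w = np.ones(N + 1)                            # unnormalized weights, w[0] = 1
    for k in range(1, N + 1):
        w[k] = w[k - 1] * b[k - 1] / d[k - 1]     # pi_k / pi_0 = (b_{k-1}...b_0)/(d_k...d_1)
    return w / w.sum()

# Geometric special case: b_0 = m, b_1 = ... = b_{N-1} = b, d_k = d
N, m, bb, dd = 8, 0.1, 0.2, 0.3
pi = bd_stationary([m] + [bb] * (N - 1), [dd] * N)
gamma = bb / dd
pi0 = 1.0 / (1.0 + (m / dd) * (1 - gamma ** N) / (1 - gamma))
print(np.allclose(pi[0], pi0))   # True
```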

64 Time Reversal. Time Reversal of Markov Chains. Suppose that X = (X_t : t ∈ N) is a discrete-time Markov chain, fix a time horizon T > 0, and for each 0 ≤ t ≤ T let X̃_t = X_{T-t}. Then X̃ = (X̃_t : t ≥ 0) is itself a Markov process on E, called the time reversal of X, and the transition probabilities of this Markov chain are given by p̃_xy(t) ≡ P(X̃_{t+1} = y | X̃_t = x) = P(X_{T-(t+1)} = y | X_{T-t} = x) = P(X_{T-t} = x | X_{T-t-1} = y) P(X_{T-t-1} = y)/P(X_{T-t} = x) = p_yx P(X_{T-t-1} = y)/P(X_{T-t} = x). As is clear from the last line, the transition probabilities of the time reversal of a Markov chain are usually time-inhomogeneous even if the original chain is time-homogeneous.

65 Time Reversal. One setting in which the time reversal of a Markov chain is certain to be time-homogeneous is when the original chain is stationary, say with distribution π. In this case, the distribution of X_t is π for all t and so the transition probabilities of the time-reversed process are p̃_xy = p_yx P(X_{T-t-1} = y)/P(X_{T-t} = x) = p_yx π_y/π_x. Furthermore, if we let Π be the diagonal matrix with the entries of π along the diagonal (and zeros everywhere else), then the transition matrix P̃ of the time-reversed chain is P̃ = Π^{-1} P^T Π, where P^T denotes the transpose of P.

66 Time Reversal. It is usually the case that a stationary Markov chain and its time reversal have different transition matrices and therefore are fundamentally different processes. The one exception occurs when the condition described in the following definition is satisfied. Definition: Suppose that X is a discrete-time Markov chain on E with transition matrix P and let π be a stationary distribution for X. The chain is said to satisfy the detailed balance conditions with respect to π if the identities π_x p_xy = π_y p_yx are satisfied for every pair of states x, y ∈ E. If the detailed balance conditions are satisfied, then p̃_xy = p_yx π_y/π_x = p_xy π_x/π_x = p_xy, which shows that P̃ = P, i.e., the chain and its time reversal have the same transition matrix and so the process looks the same run forwards or backwards in time.

67 Time Reversal. Example: Let X be the birth-death process on E = {0, 1, ..., N} with stationary distribution π_0 = ( 1 + Σ_{k=1}^{N} (b_{k-1} ··· b_0)/(d_k ··· d_1) )^{-1} and π_k = (b_{k-1} ··· b_0)/(d_k ··· d_1) π_0 for k = 1, ..., N. Then X satisfies the detailed balance conditions with respect to π. Indeed, since most of the transition probabilities are zero, we need only check the following cases: π_0 p_01 = π_0 b_0 = π_1 d_1 = π_1 p_10, and π_k p_{k,k+1} = π_k b_k = π_{k+1} d_{k+1} = π_{k+1} p_{k+1,k} for k = 1, ..., N - 1. This shows that the time reversal of a stationary birth-death process is also a birth-death process with the same birth and death probabilities.

68 Hidden Markov Models. Models and Observations. Suppose that we are interested in the spread of a communicable infectious disease through a population and that we decide to investigate this using the stochastic SIR model introduced previously: S_{t+1} = S_t - w_t, I_{t+1} = I_t + w_t - r_t, R_{t+1} = R_t + r_t, where w_t ~ Binomial(S_t, 1 - (1 - γ + γ(1 - β))^{I_t}) and r_t ~ Binomial(I_t, ρ). Here, w_t and r_t are the numbers of new infections and new recoveries between day t and day t + 1, respectively, while γ is the probability (per day) of a contact between two individuals, β is the probability of disease transmission given that there is a contact between a susceptible and an infected individual, and ρ is the probability (per day) of recovery.
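
A simulation sketch of this SIR chain (added here; the function name simulate_sir and the parameter values in the usage lines are illustrative assumptions, not estimates from the notes):

```python
import numpy as np

def simulate_sir(S0, I0, R0, gamma, beta, rho, T, rng=None):
    """Simulate the stochastic SIR chain for T days; returns arrays S, I, R."""
    rng = np.random.default_rng() if rng is None else rng
    S = np.zeros(T + 1, dtype=int)
    I = np.zeros(T + 1, dtype=int)
    R = np.zeros(T + 1, dtype=int)
    S[0], I[0], R[0] = S0, I0, R0
    for t in range(T):
        p_inf = 1.0 - (1.0 - gamma + gamma * (1.0 - beta)) ** I[t]  # per-susceptible infection probability
        w = rng.binomial(S[t], p_inf)    # new infections on day t
        r = rng.binomial(I[t], rho)      # new recoveries on day t
        S[t + 1] = S[t] - w
        I[t + 1] = I[t] + w - r
        R[t + 1] = R[t] + r
    return S, I, R

S, I, R = simulate_sir(S0=990, I0=10, R0=0, gamma=0.01, beta=0.5, rho=0.2, T=100)
print(I.max(), R[-1])   # peak prevalence and final epidemic size
```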

69 Hidden Markov Models. Since S_t + I_t + R_t = N is constant for all t ≥ 0, the variables {(S_t, I_t) : t ≥ 0} form a discrete-time Markov chain with values in the state space E = {(s, i) : i, s ≥ 0, i + s ≤ N} and transition probabilities p_{(s,i),(s',i')} = P((S_{t+1}, I_{t+1}) = (s', i') | (S_t, I_t) = (s, i)) = C(s, s - s') π_i^{s-s'} (1 - π_i)^{s'} C(i, i + s - s' - i') ρ^{i+s-s'-i'} (1 - ρ)^{i'+s'-s}, where C(n, k) denotes the binomial coefficient, π_i = 1 - (1 - γ + γ(1 - β))^i, and we require that 0 ≤ s' ≤ s and 0 ≤ i + s - s' - i' ≤ i. While the transition matrix is complicated, at the very least we can conduct Monte Carlo simulations of this process to investigate quantities such as the duration of the epidemic and the total number of infections.

70 Hidden Markov Models. However, what if the values of the parameters γ, β, and ρ are not known in advance, but instead must be estimated from data? There are two scenarios. Scenario I: Suppose that we have access to time series data consisting of the numbers of susceptible and infected individuals at each time in some period [0, T], i.e., we have data D = {(s_t, i_t) : 0 ≤ t ≤ T}, and we believe this data to be accurate. In this case, we can use the Markov property to calculate the probability of the data for each possible set of parameter values, and we will call the resulting quantity the likelihood of the parameter values Θ = (γ, β, ρ): L(Θ | D) ≡ P((S_t, I_t) = (s_t, i_t), 0 ≤ t ≤ T | Θ) = Π_{t=0}^{T-1} p_{(s_t,i_t),(s_{t+1},i_{t+1})}(Θ).
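
Because this likelihood is a product of one-step binomial transition probabilities, it can be evaluated directly; the sketch below (added, with the hypothetical function name sir_loglik, using scipy's binomial log-pmf) computes the log-likelihood of an observed trajectory, returning negative infinity for trajectories the model cannot produce.

```python
import numpy as np
from scipy.stats import binom

def sir_loglik(theta, s_obs, i_obs):
    """Log-likelihood of an observed (s_t, i_t) trajectory under the SIR chain."""
    gamma, beta, rho = theta
    ll = 0.0
    for t in range(len(s_obs) - 1):
        s, i = s_obs[t], i_obs[t]
        w = s - s_obs[t + 1]        # new infections on day t
        r = i + w - i_obs[t + 1]    # new recoveries on day t
        if w < 0 or w > s or r < 0 or r > i:
            return -np.inf          # impossible transition under the model
        p_inf = 1.0 - (1.0 - gamma + gamma * (1.0 - beta)) ** i
        ll += binom.logpmf(w, s, p_inf) + binom.logpmf(r, i, rho)
    return ll
```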

71 Hidden Markov Models. The true values of the parameters can then be estimated either by finding the value Θ̂_ML that maximizes the likelihood function, Θ̂_ML ≡ argmax_Θ L(Θ | D), or by choosing a prior distribution for the parameters, say p(Θ), and then using Bayes' formula in conjunction with the likelihood function to calculate the posterior distribution of the parameters given the data D: p(Θ | D) = p(Θ) L(Θ | D) / P(D), where P(D) = ∫ p(Θ) L(Θ | D) dΘ = ∫ p(Θ) P(D | Θ) dΘ is the so-called prior predictive probability of the data. The first approach is called maximum likelihood estimation, while the second is Bayesian inference, which we will discuss in some depth later in the semester.


More information

Markov Chains. Sarah Filippi Department of Statistics TA: Luke Kelly

Markov Chains. Sarah Filippi Department of Statistics  TA: Luke Kelly Markov Chains Sarah Filippi Department of Statistics http://www.stats.ox.ac.uk/~filippi TA: Luke Kelly With grateful acknowledgements to Prof. Yee Whye Teh's slides from 2013 14. Schedule 09:30-10:30 Lecture:

More information

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015

Markov Chains. Arnoldo Frigessi Bernd Heidergott November 4, 2015 Markov Chains Arnoldo Frigessi Bernd Heidergott November 4, 2015 1 Introduction Markov chains are stochastic models which play an important role in many applications in areas as diverse as biology, finance,

More information

2. Transience and Recurrence

2. Transience and Recurrence Virtual Laboratories > 15. Markov Chains > 1 2 3 4 5 6 7 8 9 10 11 12 2. Transience and Recurrence The study of Markov chains, particularly the limiting behavior, depends critically on the random times

More information

http://www.math.uah.edu/stat/markov/.xhtml 1 of 9 7/16/2009 7:20 AM Virtual Laboratories > 16. Markov Chains > 1 2 3 4 5 6 7 8 9 10 11 12 1. A Markov process is a random process in which the future is

More information

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains

8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8. Statistical Equilibrium and Classification of States: Discrete Time Markov Chains 8.1 Review 8.2 Statistical Equilibrium 8.3 Two-State Markov Chain 8.4 Existence of P ( ) 8.5 Classification of States

More information

Stochastic Processes (Week 6)

Stochastic Processes (Week 6) Stochastic Processes (Week 6) October 30th, 2014 1 Discrete-time Finite Markov Chains 2 Countable Markov Chains 3 Continuous-Time Markov Chains 3.1 Poisson Process 3.2 Finite State Space 3.2.1 Kolmogrov

More information

1 Types of stochastic models

1 Types of stochastic models 1 Types of stochastic models Models so far discussed are all deterministic, meaning that, if the present state were perfectly known, it would be possible to predict exactly all future states. We have seen

More information

LIMITING PROBABILITY TRANSITION MATRIX OF A CONDENSED FIBONACCI TREE

LIMITING PROBABILITY TRANSITION MATRIX OF A CONDENSED FIBONACCI TREE International Journal of Applied Mathematics Volume 31 No. 18, 41-49 ISSN: 1311-178 (printed version); ISSN: 1314-86 (on-line version) doi: http://dx.doi.org/1.173/ijam.v31i.6 LIMITING PROBABILITY TRANSITION

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

A matrix over a field F is a rectangular array of elements from F. The symbol

A matrix over a field F is a rectangular array of elements from F. The symbol Chapter MATRICES Matrix arithmetic A matrix over a field F is a rectangular array of elements from F The symbol M m n (F ) denotes the collection of all m n matrices over F Matrices will usually be denoted

More information

Lecture 5: Random Walks and Markov Chain

Lecture 5: Random Walks and Markov Chain Spectral Graph Theory and Applications WS 20/202 Lecture 5: Random Walks and Markov Chain Lecturer: Thomas Sauerwald & He Sun Introduction to Markov Chains Definition 5.. A sequence of random variables

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Stochastic Processes

Stochastic Processes Stochastic Processes 8.445 MIT, fall 20 Mid Term Exam Solutions October 27, 20 Your Name: Alberto De Sole Exercise Max Grade Grade 5 5 2 5 5 3 5 5 4 5 5 5 5 5 6 5 5 Total 30 30 Problem :. True / False

More information

STAT STOCHASTIC PROCESSES. Contents

STAT STOCHASTIC PROCESSES. Contents STAT 3911 - STOCHASTIC PROCESSES ANDREW TULLOCH Contents 1. Stochastic Processes 2 2. Classification of states 2 3. Limit theorems for Markov chains 4 4. First step analysis 5 5. Branching processes 5

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

IEOR 6711: Professor Whitt. Introduction to Markov Chains

IEOR 6711: Professor Whitt. Introduction to Markov Chains IEOR 6711: Professor Whitt Introduction to Markov Chains 1. Markov Mouse: The Closed Maze We start by considering how to model a mouse moving around in a maze. The maze is a closed space containing nine

More information

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks

Recap. Probability, stochastic processes, Markov chains. ELEC-C7210 Modeling and analysis of communication networks Recap Probability, stochastic processes, Markov chains ELEC-C7210 Modeling and analysis of communication networks 1 Recap: Probability theory important distributions Discrete distributions Geometric distribution

More information

18.175: Lecture 30 Markov chains

18.175: Lecture 30 Markov chains 18.175: Lecture 30 Markov chains Scott Sheffield MIT Outline Review what you know about finite state Markov chains Finite state ergodicity and stationarity More general setup Outline Review what you know

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Markov chain Monte Carlo

Markov chain Monte Carlo 1 / 26 Markov chain Monte Carlo Timothy Hanson 1 and Alejandro Jara 2 1 Division of Biostatistics, University of Minnesota, USA 2 Department of Statistics, Universidad de Concepción, Chile IAP-Workshop

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

MATH 56A: STOCHASTIC PROCESSES CHAPTER 2

MATH 56A: STOCHASTIC PROCESSES CHAPTER 2 MATH 56A: STOCHASTIC PROCESSES CHAPTER 2 2. Countable Markov Chains I started Chapter 2 which talks about Markov chains with a countably infinite number of states. I did my favorite example which is on

More information

At the boundary states, we take the same rules except we forbid leaving the state space, so,.

At the boundary states, we take the same rules except we forbid leaving the state space, so,. Birth-death chains Monday, October 19, 2015 2:22 PM Example: Birth-Death Chain State space From any state we allow the following transitions: with probability (birth) with probability (death) with probability

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Ch5. Markov Chain Monte Carlo

Ch5. Markov Chain Monte Carlo ST4231, Semester I, 2003-2004 Ch5. Markov Chain Monte Carlo In general, it is very difficult to simulate the value of a random vector X whose component random variables are dependent. In this chapter we

More information

ISM206 Lecture, May 12, 2005 Markov Chain

ISM206 Lecture, May 12, 2005 Markov Chain ISM206 Lecture, May 12, 2005 Markov Chain Instructor: Kevin Ross Scribe: Pritam Roy May 26, 2005 1 Outline of topics for the 10 AM lecture The topics are: Discrete Time Markov Chain Examples Chapman-Kolmogorov

More information

Some Results on the Ergodicity of Adaptive MCMC Algorithms

Some Results on the Ergodicity of Adaptive MCMC Algorithms Some Results on the Ergodicity of Adaptive MCMC Algorithms Omar Khalil Supervisor: Jeffrey Rosenthal September 2, 2011 1 Contents 1 Andrieu-Moulines 4 2 Roberts-Rosenthal 7 3 Atchadé and Fort 8 4 Relationship

More information

The SIS and SIR stochastic epidemic models revisited

The SIS and SIR stochastic epidemic models revisited The SIS and SIR stochastic epidemic models revisited Jesús Artalejo Faculty of Mathematics, University Complutense of Madrid Madrid, Spain jesus_artalejomat.ucm.es BCAM Talk, June 16, 2011 Talk Schedule

More information

Markov Chains and Hidden Markov Models

Markov Chains and Hidden Markov Models Chapter 1 Markov Chains and Hidden Markov Models In this chapter, we will introduce the concept of Markov chains, and show how Markov chains can be used to model signals using structures such as hidden

More information

[POLS 8500] Review of Linear Algebra, Probability and Information Theory

[POLS 8500] Review of Linear Algebra, Probability and Information Theory [POLS 8500] Review of Linear Algebra, Probability and Information Theory Professor Jason Anastasopoulos ljanastas@uga.edu January 12, 2017 For today... Basic linear algebra. Basic probability. Programming

More information

Chapter 16 focused on decision making in the face of uncertainty about one future

Chapter 16 focused on decision making in the face of uncertainty about one future 9 C H A P T E R Markov Chains Chapter 6 focused on decision making in the face of uncertainty about one future event (learning the true state of nature). However, some decisions need to take into account

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors LECTURE 3 Eigenvalues and Eigenvectors Definition 3.. Let A be an n n matrix. The eigenvalue-eigenvector problem for A is the problem of finding numbers λ and vectors v R 3 such that Av = λv. If λ, v are

More information

88 CONTINUOUS MARKOV CHAINS

88 CONTINUOUS MARKOV CHAINS 88 CONTINUOUS MARKOV CHAINS 3.4. birth-death. Continuous birth-death Markov chains are very similar to countable Markov chains. One new concept is explosion which means that an infinite number of state

More information

Markov Chains. As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006.

Markov Chains. As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006. Markov Chains As part of Interdisciplinary Mathematical Modeling, By Warren Weckesser Copyright c 2006 1 Introduction A (finite) Markov chain is a process with a finite number of states (or outcomes, or

More information

A REVIEW AND APPLICATION OF HIDDEN MARKOV MODELS AND DOUBLE CHAIN MARKOV MODELS

A REVIEW AND APPLICATION OF HIDDEN MARKOV MODELS AND DOUBLE CHAIN MARKOV MODELS A REVIEW AND APPLICATION OF HIDDEN MARKOV MODELS AND DOUBLE CHAIN MARKOV MODELS Michael Ryan Hoff A Dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

INTRODUCTION TO MARKOV CHAINS AND MARKOV CHAIN MIXING

INTRODUCTION TO MARKOV CHAINS AND MARKOV CHAIN MIXING INTRODUCTION TO MARKOV CHAINS AND MARKOV CHAIN MIXING ERIC SHANG Abstract. This paper provides an introduction to Markov chains and their basic classifications and interesting properties. After establishing

More information

CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions

CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions CS145: Probability & Computing Lecture 18: Discrete Markov Chains, Equilibrium Distributions Instructor: Erik Sudderth Brown University Computer Science April 14, 215 Review: Discrete Markov Chains Some

More information

Markov Chains and Pandemics

Markov Chains and Pandemics Markov Chains and Pandemics Caleb Dedmore and Brad Smith December 8, 2016 Page 1 of 16 Abstract Markov Chain Theory is a powerful tool used in statistical analysis to make predictions about future events

More information

Linear Programming Redux

Linear Programming Redux Linear Programming Redux Jim Bremer May 12, 2008 The purpose of these notes is to review the basics of linear programming and the simplex method in a clear, concise, and comprehensive way. The book contains

More information

Stochastic Models: Markov Chains and their Generalizations

Stochastic Models: Markov Chains and their Generalizations Scuola di Dottorato in Scienza ed Alta Tecnologia Dottorato in Informatica Universita di Torino Stochastic Models: Markov Chains and their Generalizations Gianfranco Balbo e Andras Horvath Outline Introduction

More information

Bayesian Methods with Monte Carlo Markov Chains II

Bayesian Methods with Monte Carlo Markov Chains II Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3

More information

6 Evolution of Networks

6 Evolution of Networks last revised: March 2008 WARNING for Soc 376 students: This draft adopts the demography convention for transition matrices (i.e., transitions from column to row). 6 Evolution of Networks 6. Strategic network

More information

Modelling of Dependent Credit Rating Transitions

Modelling of Dependent Credit Rating Transitions ling of (Joint work with Uwe Schmock) Financial and Actuarial Mathematics Vienna University of Technology Wien, 15.07.2010 Introduction Motivation: Volcano on Iceland erupted and caused that most of the

More information

Chapter 2: Markov Chains and Queues in Discrete Time

Chapter 2: Markov Chains and Queues in Discrete Time Chapter 2: Markov Chains and Queues in Discrete Time L. Breuer University of Kent 1 Definition Let X n with n N 0 denote random variables on a discrete space E. The sequence X = (X n : n N 0 ) is called

More information

n α 1 α 2... α m 1 α m σ , A =

n α 1 α 2... α m 1 α m σ , A = The Leslie Matrix The Leslie matrix is a generalization of the above. It is a matrix which describes the increases in numbers in various age categories of a population year-on-year. As above we write p

More information

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Course 495: Advanced Statistical Machine Learning/Pattern Recognition Course 495: Advanced Statistical Machine Learning/Pattern Recognition Lecturer: Stefanos Zafeiriou Goal (Lectures): To present discrete and continuous valued probabilistic linear dynamical systems (HMMs

More information

Convex Optimization CMU-10725

Convex Optimization CMU-10725 Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state

More information

12 Markov chains The Markov property

12 Markov chains The Markov property 12 Markov chains Summary. The chapter begins with an introduction to discrete-time Markov chains, and to the use of matrix products and linear algebra in their study. The concepts of recurrence and transience

More information

Mean-field dual of cooperative reproduction

Mean-field dual of cooperative reproduction The mean-field dual of systems with cooperative reproduction joint with Tibor Mach (Prague) A. Sturm (Göttingen) Friday, July 6th, 2018 Poisson construction of Markov processes Let (X t ) t 0 be a continuous-time

More information

Stochastic processes and

Stochastic processes and Stochastic processes and Markov chains (part II) Wessel van Wieringen w.n.van.wieringen@vu.nl wieringen@vu nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University

More information

Probability, Random Processes and Inference

Probability, Random Processes and Inference INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACION EN COMPUTACION Laboratorio de Ciberseguridad Probability, Random Processes and Inference Dr. Ponciano Jorge Escamilla Ambrosio pescamilla@cic.ipn.mx

More information

Chapter 5. Continuous-Time Markov Chains. Prof. Shun-Ren Yang Department of Computer Science, National Tsing Hua University, Taiwan

Chapter 5. Continuous-Time Markov Chains. Prof. Shun-Ren Yang Department of Computer Science, National Tsing Hua University, Taiwan Chapter 5. Continuous-Time Markov Chains Prof. Shun-Ren Yang Department of Computer Science, National Tsing Hua University, Taiwan Continuous-Time Markov Chains Consider a continuous-time stochastic process

More information

Markov Processes Hamid R. Rabiee

Markov Processes Hamid R. Rabiee Markov Processes Hamid R. Rabiee Overview Markov Property Markov Chains Definition Stationary Property Paths in Markov Chains Classification of States Steady States in MCs. 2 Markov Property A discrete

More information

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas

CS839: Probabilistic Graphical Models. Lecture 7: Learning Fully Observed BNs. Theo Rekatsinas CS839: Probabilistic Graphical Models Lecture 7: Learning Fully Observed BNs Theo Rekatsinas 1 Exponential family: a basic building block For a numeric random variable X p(x ) =h(x)exp T T (x) A( ) = 1

More information

MCMC notes by Mark Holder

MCMC notes by Mark Holder MCMC notes by Mark Holder Bayesian inference Ultimately, we want to make probability statements about true values of parameters, given our data. For example P(α 0 < α 1 X). According to Bayes theorem:

More information

Any live cell with less than 2 live neighbours dies. Any live cell with 2 or 3 live neighbours lives on to the next step.

Any live cell with less than 2 live neighbours dies. Any live cell with 2 or 3 live neighbours lives on to the next step. 2. Cellular automata, and the SIRS model In this Section we consider an important set of models used in computer simulations, which are called cellular automata (these are very similar to the so-called

More information

1 Continuous-time chains, finite state space

1 Continuous-time chains, finite state space Université Paris Diderot 208 Markov chains Exercises 3 Continuous-time chains, finite state space Exercise Consider a continuous-time taking values in {, 2, 3}, with generator 2 2. 2 2 0. Draw the diagramm

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors MAT 67L, Laboratory III Contents Instructions (1) Read this document. (2) The questions labeled Experiments are not graded, and should not be turned in. They are designed for

More information

LECTURE 15 Markov chain Monte Carlo

LECTURE 15 Markov chain Monte Carlo LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte

More information

CDA6530: Performance Models of Computers and Networks. Chapter 3: Review of Practical Stochastic Processes

CDA6530: Performance Models of Computers and Networks. Chapter 3: Review of Practical Stochastic Processes CDA6530: Performance Models of Computers and Networks Chapter 3: Review of Practical Stochastic Processes Definition Stochastic process X = {X(t), t2 T} is a collection of random variables (rvs); one rv

More information

LECTURE 3. Last time:

LECTURE 3. Last time: LECTURE 3 Last time: Mutual Information. Convexity and concavity Jensen s inequality Information Inequality Data processing theorem Fano s Inequality Lecture outline Stochastic processes, Entropy rate

More information

18.600: Lecture 32 Markov Chains

18.600: Lecture 32 Markov Chains 18.600: Lecture 32 Markov Chains Scott Sheffield MIT Outline Markov chains Examples Ergodicity and stationarity Outline Markov chains Examples Ergodicity and stationarity Markov chains Consider a sequence

More information

Applied Stochastic Processes

Applied Stochastic Processes Applied Stochastic Processes Jochen Geiger last update: July 18, 2007) Contents 1 Discrete Markov chains........................................ 1 1.1 Basic properties and examples................................

More information

The Leslie Matrix. The Leslie Matrix (/2)

The Leslie Matrix. The Leslie Matrix (/2) The Leslie Matrix The Leslie matrix is a generalization of the above. It describes annual increases in various age categories of a population. As above we write p n+1 = Ap n where p n, A are given by:

More information

Lecture 9 Classification of States

Lecture 9 Classification of States Lecture 9: Classification of States of 27 Course: M32K Intro to Stochastic Processes Term: Fall 204 Instructor: Gordan Zitkovic Lecture 9 Classification of States There will be a lot of definitions and

More information

INTRODUCTION TO MARKOV CHAIN MONTE CARLO

INTRODUCTION TO MARKOV CHAIN MONTE CARLO INTRODUCTION TO MARKOV CHAIN MONTE CARLO 1. Introduction: MCMC In its simplest incarnation, the Monte Carlo method is nothing more than a computerbased exploitation of the Law of Large Numbers to estimate

More information

Bayesian RL Seminar. Chris Mansley September 9, 2008

Bayesian RL Seminar. Chris Mansley September 9, 2008 Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in

More information