Lectures on Probability and Statistical Models


Phil Pollett, Professor of Mathematics, The University of Queensland. © These materials can be used for any educational purpose provided they are not altered.

Imprecise (intuitive) definition. A Markov process is a random process that forgets its past, in the following sense: Pr(Future = y | Present = x and Past = z) = Pr(Future = y | Present = x). Thus, given the past and the present state of the process, only the present state is of use in predicting the future.

Equivalently, Pr(Future = y and Past = z | Present = x) = Pr(Future = y | Present = x) Pr(Past = z | Present = x), so that, given the present state of the process, its past and its future are independent. If the set of states S is discrete, then the process is called a Markov chain. Remark. At first sight this definition might appear to cover only trivial examples, but note that the current state could be complicated and could include a record of the recent past.

Andrei Andreyevich Markov (Born: 14/06/1856, Ryazan, Russia; Died: 20/07/1922, St Petersburg, Russia). Markov is famous for his pioneering work on Markov chains, which launched the theory of stochastic processes. His early work was in number theory, analysis, continued fractions, limits of integrals, approximation theory and convergence of series.

Example. There are two rooms, labelled A and B. There is a spider, initially in Room A, hunting a fly that is initially in Room B. They move from room to room independently: every minute each changes rooms (with probability p for the spider and q for the fly) or stays put, with the complementary probabilities. Once in the same room, the spider eats the fly and the hunt ceases. The hunt can be represented as a Markov chain with three states: (0) the spider and the fly are in the same room (the hunt has ended), (1) the spider is in Room A and the fly is in Room B, and, (2) the spider is in Room B and the fly is in Room A.

Eventually we will be able to answer questions like "What is the probability that the hunt lasts more than two minutes?" Let X_n be the state of the process at time n (that is, after n minutes). Then X_n ∈ S = {0, 1, 2}. The set S is called the state space. The initial state is X_0 = 1. State 0 is called an absorbing state, because the process remains there once it is reached.

Definition. A sequence {X_n, n = 0, 1, ...} of random variables is called a discrete-time stochastic process; X_n usually represents the state of the process at time n. If {X_n} takes values in a discrete state space S, then it is called a Markov chain if

Pr(X_{m+1} = j | X_m = i, X_{m-1} = i_{m-1}, ..., X_0 = i_0) = Pr(X_{m+1} = j | X_m = i)   (1)

for all time points m and all states i_0, ..., i_{m-1}, i, j ∈ S. If the right-hand side of (1) is the same for all m, then the Markov chain is said to be time homogeneous.

We will consider only time-homogeneous chains, and we shall write

p^{(n)}_{ij} = Pr(X_{m+n} = j | X_m = i) = Pr(X_n = j | X_0 = i)

for the n-step transition probabilities, and

p_{ij} := p^{(1)}_{ij} = Pr(X_{m+1} = j | X_m = i) = Pr(X_1 = j | X_0 = i)

for the 1-step transition probabilities (or simply transition probabilities).

By the law of total probability, we have that Σ_{j∈S} p^{(n)}_{ij} = Σ_{j∈S} Pr(X_n = j | X_0 = i) = 1, and in particular that Σ_{j∈S} p_{ij} = 1. The matrix P^{(n)} = (p^{(n)}_{ij}, i, j ∈ S) is called the n-step transition matrix and P = (p_{ij}, i, j ∈ S) is called the 1-step transition matrix (or simply transition matrix).

Remarks. (1) Matrices like this (with non-negative entries and all row sums equal to 1) are called stochastic matrices. Writing 1 = (1, 1, ...)^T (where T denotes transpose), we see that P1 = 1. Hence P (and indeed any stochastic matrix) has an eigenvector 1 corresponding to the eigenvalue λ = 1. (2) We may usefully set P^{(0)} = I, where, as usual, I denotes the identity matrix: p^{(0)}_{ij} = δ_{ij} := 1 if i = j, and 0 if i ≠ j.

Example. Returning to the hunt, the three states were: (0) the spider and the fly are in the same room, (1) the spider is in Room A and the fly is in Room B, and (2) the spider is in Room B and the fly is in Room A. Since the spider changes rooms with probability p and the fly changes rooms with probability q,

P = [ 1   0            0
      r   (1-p)(1-q)   pq
      r   pq           (1-p)(1-q) ],

where r = p(1-q) + q(1-p) = p + q - 2pq = 1 - [(1-p)(1-q) + pq].

For example, if p = 1/4 and q = 1/2, then

P = [ 1     0     0
      1/2   3/8   1/8
      1/2   1/8   3/8 ].

What is the chance that the hunt is over by n minutes? Can we calculate the chance of being in each of the various states after n minutes?

By the law of total probability, we have

p^{(n+m)}_{ij} = Pr(X_{n+m} = j | X_0 = i) = Σ_{k∈S} Pr(X_{n+m} = j | X_n = k, X_0 = i) Pr(X_n = k | X_0 = i).

But

Pr(X_{n+m} = j | X_n = k, X_0 = i) = Pr(X_{n+m} = j | X_n = k)   (Markov property)
                                   = Pr(X_m = j | X_0 = k)        (time homogeneity)

and so, for all m, n ≥ 1,

p^{(n+m)}_{ij} = Σ_{k∈S} p^{(n)}_{ik} p^{(m)}_{kj},   i, j ∈ S,

or, equivalently, in terms of transition matrices, P^{(n+m)} = P^{(n)} P^{(m)}. Thus, in particular, we have P^{(n)} = P^{(n-1)} P (remembering that P := P^{(1)}). Therefore, P^{(n)} = P^n for n ≥ 1. Note that since P^{(0)} = I = P^0, this expression is valid for all n ≥ 0.

Example. Returning to the hunt, if the spider and the fly change rooms with probability p = 1/4 and q = 1/2, respectively, then

P = [ 1     0     0
      1/2   3/8   1/8
      1/2   1/8   3/8 ].

A simple calculation gives

P^2 = [ 1     0      0
        3/4   5/32   3/32
        3/4   3/32   5/32 ],

P^3 = [ 1     0       0
        7/8   9/128   7/128
        7/8   7/128   9/128 ],

et cetera, and, to four decimal places,

P^15 = [ 1        0        0
         1.0000   0.0000   0.0000
         1.0000   0.0000   0.0000 ].

Recall that X_0 = 1, so p^{(n)}_{10} is the probability that the hunt ends by n minutes. What, then, is the probability that the hunt lasts more than two minutes? Answer: 1 - 3/4 = 1/4.
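The notes do these calculations by hand; as a quick numerical check, here is a minimal sketch (assuming NumPy is available, which the notes do not mention) that reproduces P^2, P^3, P^15 and the 1/4 answer by raising P to integer powers.

```python
# Minimal sketch: n-step transition probabilities for the hunt chain
# with p = 1/4, q = 1/2, computed as matrix powers of P.
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [1/2, 3/8, 1/8],
              [1/2, 1/8, 3/8]])

for n in (2, 3, 15):
    print(f"P^{n} =\n{np.linalg.matrix_power(P, n)}\n")

# Starting in state 1, the hunt lasts more than two minutes with
# probability 1 - p_10^(2) = 1 - 3/4 = 1/4.
P2 = np.linalg.matrix_power(P, 2)
print("Pr(hunt lasts more than 2 minutes) =", 1 - P2[1, 0])
```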

Arbitrary initial conditions. What if we are unsure about where the process starts? Let π^{(n)}_j = Pr(X_n = j) and define a row vector π^{(n)} = (π^{(n)}_j, j ∈ S), being the distribution of the chain at time n. Suppose that we know the initial distribution π^{(0)}, that is, the distribution of X_0 (in the previous example we had π^{(0)} = (0, 1, 0)).

By the law of total probability, we have

π^{(n)}_j = Pr(X_n = j) = Σ_{i∈S} Pr(X_n = j | X_0 = i) Pr(X_0 = i) = Σ_{i∈S} π^{(0)}_i p^{(n)}_{ij},

and so π^{(n)} = π^{(0)} P^n, n ≥ 0. Definition. If π^{(n)} = π is the same for all n, then π is called a stationary distribution. If lim_{n→∞} π^{(n)} exists and equals π, then π is called a limiting distribution.

Example. Returning to the hunt with p = 1/4 and q = 1/2, suppose that, at the beginning of the hunt, each creature is equally likely to be in either room, so that π^{(0)} = (1/2, 1/4, 1/4). Then,

π^{(n)} = π^{(0)} P^n = (1/2, 1/4, 1/4) [ 1     0     0
                                          1/2   3/8   1/8
                                          1/2   1/8   3/8 ]^n.

For example,

π^{(3)} = (1/2, 1/4, 1/4) [ 1     0     0
                            1/2   3/8   1/8
                            1/2   1/8   3/8 ]^3
        = (1/2, 1/4, 1/4) [ 1     0       0
                            7/8   9/128   7/128
                            7/8   7/128   9/128 ]
        = (15/16, 1/32, 1/32).

So, if, initially, each creature is equally likely to be in either room, then the probability that the hunt ends within 3 minutes is 15/16.
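The same calculation can be checked numerically; a minimal sketch (again assuming NumPy, not part of the original notes):

```python
# Distribution at time n: pi^(n) = pi^(0) P^n, with pi^(0) = (1/2, 1/4, 1/4).
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [1/2, 3/8, 1/8],
              [1/2, 1/8, 3/8]])
pi0 = np.array([1/2, 1/4, 1/4])

pi3 = pi0 @ np.linalg.matrix_power(P, 3)
print(pi3)   # [0.9375  0.03125 0.03125] = (15/16, 1/32, 1/32)
```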

The two-state chain. Let S = {0, 1} and let

P = [ 1-p   p
      q     1-q ],

where p, q ∈ (0, 1). It can be shown that P = V D V^{-1}, where

V = [ 1    p
      1   -q ],

D = [ 1   0
      0   r ],

V^{-1} = (1/(p+q)) [ q    p
                     1   -1 ],

and r = 1 - p - q. Check it! (The procedure is called diagonalization.)

This is good news because P^2 = (V D V^{-1})(V D V^{-1}) = V D (V^{-1} V) D V^{-1} = V (D I D) V^{-1} = V D^2 V^{-1}. Similarly, P^n = V D^n V^{-1} for all n ≥ 1. Hence,

P^{(n)} = (1/(p+q)) [ 1    p    [ 1    0      [ q    p
                      1   -q ]    0    r^n ]    1   -1 ]

        = (1/(p+q)) [ q + p r^n    p - p r^n
                      q - q r^n    p + q r^n ].

Thus we have an explicit expression for the n-step transition probabilities. Remark. The above procedure generalizes to any Markov chain with a finite state space.
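As a sanity check on the diagonalization, the sketch below (assuming NumPy; the values p = 0.3, q = 0.6 and n = 7 are illustrative choices, not from the notes) compares the explicit formula for P^{(n)} with a direct matrix power.

```python
# Check the explicit n-step formula for the two-state chain against P^n.
import numpy as np

p, q, n = 0.3, 0.6, 7
r = 1 - p - q

P = np.array([[1 - p, p],
              [q, 1 - q]])

explicit = (1 / (p + q)) * np.array([[q + p * r**n, p - p * r**n],
                                     [q - q * r**n, p + q * r**n]])

print(np.allclose(np.linalg.matrix_power(P, n), explicit))   # True
```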

If the initial distribution is π^{(0)} = (a, b), then, since π^{(n)} = π^{(0)} P^n,

Pr(X_n = 0) = [q + (ap - bq) r^n] / (p + q),
Pr(X_n = 1) = [p - (ap - bq) r^n] / (p + q).

(You should check this for n = 0 and n = 1.) Notice that when ap = bq, we have Pr(X_n = 0) = 1 - Pr(X_n = 1) = q/(p+q) for all n ≥ 0, so that π = (q/(p+q), p/(p+q)) is a stationary distribution.

Notice also that |r| < 1, since p, q ∈ (0, 1). Therefore, π is also a limiting distribution because

lim_{n→∞} Pr(X_n = 0) = q/(p+q),   lim_{n→∞} Pr(X_n = 1) = p/(p+q).

Remark. If, for a general Markov chain, a limiting distribution π exists, then it is a stationary distribution, that is, πP = π (π is a left eigenvector corresponding to the eigenvalue 1). For details (and the converse), you will need a more advanced course on Stochastic Processes.

Example. Max (a dog) is subjected to a series of trials, in each of which he is given a choice of going to a dish to his left, containing tasty food, or a dish to his right, containing food with an unpleasant taste. Suppose that if, on any given occasion, Max goes to the left, then he will return there on the next occasion with probability 0.99, while if he goes to the right, he will do so on the next occasion with probability 0.1 (Max is smart, but he is not infallible).

[Photograph: Poppy and Max]

Let X_n be 0 or 1 according as Max chooses the dish to the left or the dish to the right on trial n. Then {X_n} is a two-state Markov chain with p = 0.01 and q = 0.9, and hence r = 0.09. Therefore, if the first dish is chosen at random (at time n = 1), then Max chooses the tasty food on the n-th trial with probability 90/91 - (89/182)(0.09)^{n-1}, the long-term probability being 90/91.
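For the curious, a small numerical check of this formula against the general machinery (assuming NumPy; this snippet is not part of the original notes):

```python
# Max's chain with p = 0.01, q = 0.9 and a uniform choice on trial 1.
# The distribution on trial n is pi^(1) P^(n-1); compare with the formula.
import numpy as np

p, q = 0.01, 0.9
P = np.array([[1 - p, p],
              [q, 1 - q]])
pi1 = np.array([0.5, 0.5])            # distribution on trial 1

for n in (1, 2, 5, 20):
    prob = (pi1 @ np.linalg.matrix_power(P, n - 1))[0]
    formula = 90/91 - (89/182) * 0.09**(n - 1)
    print(n, prob, formula)
```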

Birth-death chains. Their state space S is either the integers, the non-negative integers, or {0, 1, ..., N}, and jumps of size greater than 1 are not permitted; their transition probabilities are therefore of the form p_{i,i+1} = a_i, p_{i,i-1} = b_i and p_{ii} = 1 - a_i - b_i, with p_{ij} = 0 otherwise. The birth probabilities (a_i) and the death probabilities (b_i) are strictly positive and satisfy a_i + b_i ≤ 1, except perhaps at the boundaries of S, where they could be 0. If a_i = a and b_i = b, the chain is called a random walk.

Gambler's ruin. A gambler successively wagers a single unit in an even-money game. X_n is his capital after n bets and S = {0, 1, ..., N}. If his capital reaches N he stops and leaves happy, while state 0 corresponds to going bust. Here a_i = b_i = 1/2, except at the boundaries (0 and N are absorbing states). It is easy to show that the player goes bust with probability 1 - i/N if his initial capital is i.
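The claimed ruin probability can be checked numerically. The sketch below (assuming NumPy) sets up the first-step equations u_i = (u_{i-1} + u_{i+1})/2 with u_0 = 1 and u_N = 0, which are not derived in these notes, and compares the solution with 1 - i/N.

```python
# Ruin probabilities u_i for fair gambler's ruin on {0, ..., N}, solved
# as a linear system and compared with the closed form 1 - i/N.
import numpy as np

N = 10
A = np.zeros((N - 1, N - 1))   # unknowns u_1, ..., u_{N-1}
b = np.zeros(N - 1)
for i in range(1, N):
    A[i - 1, i - 1] = 1.0
    if i >= 2:
        A[i - 1, i - 2] = -0.5
    else:
        b[i - 1] += 0.5        # boundary term from u_0 = 1
    if i <= N - 2:
        A[i - 1, i] = -0.5     # u_N = 0 contributes nothing

u = np.linalg.solve(A, b)
print(np.allclose(u, 1 - np.arange(1, N) / N))   # True
```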

The Ehrenfest diffusion model. N particles are allowed to pass through a small aperture between two chambers A and B. We assume that at each time epoch n, a single particle, chosen uniformly at random from the N, passes through the aperture. Let X_n be the number in chamber A at time n. Then S = {0, 1, ..., N} and, for i ∈ S, a_i = 1 - i/N and b_i = i/N. In this model, 0 and N are reflecting barriers. It is easy to show that the stationary distribution is binomial B(N, 1/2).
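A minimal numerical check of the binomial stationary distribution (assuming NumPy; N = 6 is an arbitrary illustrative choice):

```python
# Build the Ehrenfest chain with a_i = 1 - i/N, b_i = i/N and check that
# the Binomial(N, 1/2) probabilities pi satisfy pi P = pi.
import numpy as np
from math import comb

N = 6
P = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    a, b = 1 - i / N, i / N
    if i < N:
        P[i, i + 1] = a
    if i > 0:
        P[i, i - 1] = b
    P[i, i] = 1 - a - b          # zero for this model, kept for generality

pi = np.array([comb(N, i) for i in range(N + 1)]) / 2**N
print(np.allclose(pi @ P, pi))   # True
```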

Population models. Here X_n is the size of the population at time n (for example, at the end of the n-th breeding cycle, or at the time of the n-th census). S = {0, 1, ...}, or S = {0, 1, ..., N} when there is an upper limit N on the population size (frequently interpreted as the carrying capacity). Usually 0 is an absorbing state, corresponding to population extinction, and N is reflecting.

Example. Take S = {0, 1, ...} with a_0 = 0 and, for i ≥ 1, a_i = a > 0 and b_i = b > 0, where a + b = 1. It can be shown that extinction occurs with probability 1 when a ≤ b, and with probability (b/a)^i when a > b, where i is the initial population size. This is a good simple model for a population of cells: a = λ/(λ + µ) and b = µ/(λ + µ), where µ and λ are, respectively, the death and the cell division rates.
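A rough Monte Carlo check of the extinction probability (assuming NumPy; the ceiling, run count and the value a = 0.6 are arbitrary choices, and treating walks that reach the ceiling as surviving is an approximation):

```python
# Estimate the extinction probability of the cell-population chain with
# a_i = a, b_i = b for i >= 1, and compare it with (b/a)^i when a > b.
import numpy as np

rng = np.random.default_rng(0)
a, i0, ceiling, runs = 0.6, 3, 100, 10_000
b = 1 - a

extinct = 0
for _ in range(runs):
    x = i0
    while 0 < x < ceiling:
        x += 1 if rng.random() < a else -1
    extinct += (x == 0)

print(extinct / runs, (b / a) ** i0)   # estimate vs (b/a)^i ≈ 0.296
```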

The logistic model. This has S = {0, ..., N}, with 0 absorbing and N reflecting, and, for i = 1, ..., N-1,

a_i = λ(1 - i/N) / [µ + λ(1 - i/N)],   b_i = µ / [µ + λ(1 - i/N)].

Here λ and µ are birth and death rates. Notice that the birth and the death probabilities depend on i only through i/N, a quantity which is proportional to the population density: i/N = (i/area)/(N/area). Models with this property are called density dependent.

Telecommunications. (1) A communications link in a telephone network has N circuits. One circuit is held by each call for its duration. Calls arrive at rate λ > 0 and are completed at rate µ > 0. Let X_n be the number of calls in progress at the n-th time epoch (when an arrival or a departure occurs). Then S = {0, ..., N}, with 0 and N both reflecting barriers, and, for i = 1, ..., N-1,

a_i = λ/(λ + iµ),   b_i = iµ/(λ + iµ).

(2) At a node in a packet-switching network, data packets are stored in a buffer of size N. They arrive at rate λ > 0 and are transmitted one at a time (in the order in which they arrive) at rate µ > 0. Let X_n be the number of packets yet to be transmitted just after the n-th time epoch (an arrival or a departure). Then S = {0, ..., N}, with 0 and N both reflecting barriers, and, for i = 1, ..., N-1,

a_i = λ/(λ + µ),   b_i = µ/(λ + µ).

Genetic models. The simplest of these is the Wright-Fisher model. There are N individuals, each of one of two genetic types, A-type and a-type. Mutation (if any) occurs at birth. We assume that A-types are selectively superior, in that the relative survival rate of A-type over a-type individuals in successive generations is γ > 1. Let X_n be the number of A-type individuals, so that N - X_n is the number of a-type individuals.

Wright and Fisher postulated that the composition of the next generation is determined by N Bernoulli trials, where the probability p_i of producing an A-type offspring is given by

p_i = γ[i(1-α) + (N-i)β] / {γ[i(1-α) + (N-i)β] + [iα + (N-i)(1-β)]},

where α and β are the respective mutation probabilities. We have S = {0, ..., N} and

p_{ij} = (N choose j) p_i^j (1 - p_i)^{N-j},   i, j ∈ S.
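A minimal sketch (assuming NumPy; the parameter values and the helper name p_success are illustrative choices, not from the notes) that builds the Wright-Fisher transition matrix from these formulas and checks that it is a stochastic matrix:

```python
# Wright-Fisher transition matrix: p_ij = C(N, j) p_i^j (1 - p_i)^(N - j),
# with p_i given by the selection/mutation expression above.
import numpy as np
from math import comb

N, gamma, alpha, beta = 20, 1.2, 0.01, 0.005   # illustrative values

def p_success(i):
    num = gamma * (i * (1 - alpha) + (N - i) * beta)
    den = num + (i * alpha + (N - i) * (1 - beta))
    return num / den

P = np.zeros((N + 1, N + 1))
for i in range(N + 1):
    pi_ = p_success(i)
    for j in range(N + 1):
        P[i, j] = comb(N, j) * pi_**j * (1 - pi_)**(N - j)

print(np.allclose(P.sum(axis=1), 1.0))   # rows sum to 1: a stochastic matrix
```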