
Markov Chains

INDER K. RANA
Department of Mathematics
Indian Institute of Technology Bombay
Powai, Mumbai 400076, India
email: ikrana@iitb.ac.in

Abstract: These notes were originally prepared for a College Teacher's Refresher Course at the University of Mumbai. The current revised version is for the participants of the Summer School on Probability Theory at the Kerala School of Mathematics, 2010.

Contents

Prologue: Basic Probability Theory
  0.1 Probability space
  0.2 Conditional probability

Chapter 1. Basics
  1.1 Introduction
  1.2 Random walks
  1.3 Queuing chains
  1.4 Ehrenfest chain
  1.5 Some consequences of the Markov property
  Review Exercises

Chapter 2. Calculation of higher order probabilities
  2.1 Distribution of X_n and other joint distributions
  2.2 Kolmogorov-Chapman equation
  Exercises

Chapter 3. Classification of states
  3.1 Closed subsets and irreducible subsets
  3.2 Periodic and aperiodic chains
  3.3 Visiting a state: transient and recurrent states
  3.4 Absorption probability
  3.5 More on recurrence/transience

Chapter 4. Stationary distribution for a Markov chain
  4.1 Introduction
  4.2 Stopping times and strong Markov property
  4.3 Existence and uniqueness
  4.4 Asymptotic behavior

Appendix: Diagonalization of matrices
References
Index


Prologue: Basic Probability Theory

0.1 Probability space

A mathematical model for analyzing statistical experiments is given by a probability space. A probability space is a triple $(\Omega, \mathcal{S}, P)$, where:

- $\Omega$ is a set representing the set of all possible outcomes of the experiment.
- $\mathcal{S}$ is a $\sigma$-algebra of subsets of $\Omega$. Subsets of $\Omega$ are called events of the experiment; the elements of $\mathcal{S}$ represent the collection of events of interest in that experiment.
- For every $E \in \mathcal{S}$, the nonnegative number $P(E)$ is the probability that the event $E$ occurs. The map $E \mapsto P(E)$, called a probability, is $P : \mathcal{S} \to [0,1]$ with the following properties:
  (i) $P(\emptyset) = 0$ and $P(\Omega) = 1$.
  (ii) $P$ is countably additive, i.e., for any sequence $A_1, A_2, \ldots$ in $\mathcal{S}$ which is pairwise disjoint ($A_i \cap A_j = \emptyset$ for $i \neq j$),
  $$P\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} P(A_n).$$

0.2 Conditional probability

Let $(\Omega, \mathcal{S}, P)$ be a probability space. If $B$ is an event with $P(B) > 0$, then for every $A \in \mathcal{S}$, the conditional probability of $A$ given $B$, denoted by $P(A \mid B)$, is defined by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$
Intuitively, $P_B(A) := P(A \mid B)$ expresses how likely the event $A$ is to occur, given the knowledge that $B$ has occurred. Some properties of conditional probability are:

(i) For any sequence $A_1, A_2, \ldots$ in $\mathcal{S}$ which is pairwise disjoint,
$$P_B\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} P_B(A_n).$$

(ii) Chain rule: $P(A \cap B) = P(A \mid B)\,P(B)$. In general, for $A_1, A_2, \ldots, A_n \in \mathcal{S}$,
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1 \mid A_2 \cap \cdots \cap A_n)\, P(A_2 \mid A_3 \cap \cdots \cap A_n) \cdots P(A_{n-1} \mid A_n)\, P(A_n),$$
and for $B \in \mathcal{S}$,
$$P(A_1 \cap \cdots \cap A_n \mid B) = P(A_1 \mid A_2 \cap \cdots \cap A_n \cap B)\, P(A_2 \mid A_3 \cap \cdots \cap A_n \cap B) \cdots P(A_n \mid B).$$

(iii) Bayes' formula: If $A_1, A_2, \ldots$ in $\mathcal{S}$ are pairwise disjoint and $\Omega = \bigcup_{n=1}^{\infty} A_n$, then for $B \in \mathcal{S}$ with $P(B) > 0$,
$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j} P(B \mid A_j)\,P(A_j)}.$$

(iv) Conditional independence: If $A_1, A_2, \ldots$ in $\mathcal{S}$ are pairwise disjoint and $P(A \mid A_i) = P(A \mid A_j) =: p$ for every $i, j$, then
$$P\Big(A \,\Big|\, \bigcup_{i=1}^{\infty} A_i\Big) = p.$$

(v) If $A_1, A_2, \ldots$ in $\mathcal{S}$ are pairwise disjoint and $\Omega = \bigcup_{n=1}^{\infty} A_n$, then for $B, C \in \mathcal{S}$,
$$P(C \mid B) = \sum_{i=1}^{\infty} P(A_i \mid B)\, P(C \mid A_i \cap B).$$

Chapter 1: Basics

1.1 Introduction

The aim of our lectures is to analyze the following situation. Consider an experiment/system under observation, and let $s_1, s_2, \ldots$ be the possible states in which the system can be. Suppose that the system is observed at every unit of time $n = 0, 1, 2, \ldots$, and let $X_n$ denote the observation at time $n \geq 0$. Thus each $X_n$ takes one of the values $s_1, s_2, \ldots$. We further assume that the observations $X_n$ are not deterministic, i.e., $X_n$ takes the value $s_i$ with some probability. In other words, each $X_n$ is a random variable on some probability space $(\Omega, \mathcal{A}, P)$. In case the observations $X_0, X_1, \ldots$ are independent, we know how to compute the probabilities of various events. The situation we are going to look at is slightly more general. Let us look at some examples.

1.1.1 Example: Consider observing the working of a particular machine in a factory. On any day, the machine is either broken or working, so our system can be in one of two states: broken, represented by 0, or working, represented by 1. Let $X_n$ be the observation about the machine on the $n$th day. Clearly, there is no reason to assume that $X_n$ will be independent of $X_{n-1}, \ldots, X_0$.

1.1.2 Example: Consider a gambler making bets in a gambling house. He starts with some amount, say $A$ rupees, and makes a series of one-rupee bets against the house. Let $X_n$, $n \geq 0$, denote the gambler's capital at time $n$, say after $n$ bets. Then the states of the system, the possible values each $X_n$ can take, are $0, 1, 2, \ldots$. Clearly, the value of $X_n$ depends upon the value of $X_{n-1}$.

1.1.3 Example: Consider a bill collection office where people come to pay their bills. People arrive at the paying counter at various time points and are eventually served. Suppose we measure time in minutes, count the persons arriving during a minute as arriving at that minute, and assume at most one person is served per minute. Let $\xi_n$ denote the number of persons that arrive at the $n$th minute. Let $X_0$ denote the number of persons waiting initially (i.e., when the office opened), and for $n \geq 1$, let $X_n$ denote the number

of customers at the $n$th minute. Thus, for all $n \geq 0$,
$$X_{n+1} = \begin{cases} \xi_{n+1} & \text{if } X_n = 0,\\ X_n + \xi_{n+1} - 1 & \text{if } X_n \neq 0, \end{cases}$$
because one person will be served in that minute. The states of the system are $0, 1, 2, \ldots$, and clearly $X_{n+1}$ depends upon $X_n$.

Thus, we are going to look at a sequence of random variables $\{X_n\}_{n \geq 0}$ defined on a probability space $(\Omega, \mathcal{A}, P)$, such that each $X_n$ takes at most countably many values. As mentioned in the beginning, if the $X_n$ are independent, then one knows how to analyze the system. If the $X_n$ are not independent, what kind of relation can they have? For example, consider the system of Example 1.1.1, observing the working of a machine each day. Whether the machine is in order on a particular day depends only upon whether it was working or out of order on the previous day. Similarly, in Example 1.1.2, the gambler's capital on the $n$th day depends only upon his capital on the $(n-1)$th day. This motivates the following assumption about our system.

1.1.4 Definition: Let $\{X_n\}_{n \geq 0}$ be a sequence of random variables taking values in a set $S$, called the state space, which is at most countable. We say that $\{X_n\}_{n \geq 0}$ has the Markov property if for every $n \geq 0$ and $i_0, i_1, \ldots, i_{n+1} \in S$,
$$P\{X_{n+1} = i_{n+1} \mid X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\} = P\{X_{n+1} = i_{n+1} \mid X_n = i_n\}.$$
That is, the observation/outcome at the $(n+1)$th stage of the experiment depends only on the outcome in the immediate past. Thus, for $n \geq 0$ and $i, j \in S$, the numbers
$$P(i, j, n) := P\{X_{n+1} = j \mid X_n = i\}$$
are going to be important for the system. This is the probability that the system will be in state $j$ at stage $n+1$ given that it was in state $i$ at stage $n$. Note that saying that a sequence $\{X_n\}_{n \geq 0}$ has the Markov property means that given $X_{n-1}$, the random variable $X_n$ is conditionally independent of $X_{n-2}, \ldots, X_1, X_0$: where the sequence goes at the next step depends only upon where the system is now, not on where it has been in the past.

1.1.5 Definition: Let $\{X_n\}_{n \geq 0}$ be a Markov system with state space $S$.
(i) For $n \geq 0$ and $i, j \in S$, the number $P(i, j, n)$ is called the one-step transition probability for the system at stage $n$ to go from state $i$ to state $j$ at the next stage.
(ii) The system is said to have the stationary (or homogeneous) property if $P(i, j, n)$ is independent of $n$, i.e., $P(i, j, n+1) = P(i, j, n)$ for every $i, j \in S$ and $n \geq 0$. That is, the probability that the system will be in state $j$ at stage $n+1$ given that it is in state $i$ at stage $n$ is independent of $n$. Thus, the probability of the system in

going from state $i$ to $j$ does not depend upon the time at which this happens.
(iii) A Markov system $\{X_n\}_{n \geq 0}$ is called a Markov chain if it is stationary.

1.1.6 Definition: Given a Markov chain $\{X_n\}_{n \geq 0}$, $\Pi_0(i) := P\{X_0 = i\}$, $i \in S$, is called the initial distribution vector, or the distribution of $X_0$.

1.1.7 Graphical representation: A pictorial way to represent a Markov chain is by its transition graph. It consists of nodes representing the states of the chain and arrows between the nodes representing the transition probabilities. The transition graph of the Markov chain in Example 1.1.1 has transition probabilities
$$p(0,0) = 1-p, \quad p(0,1) = p, \quad p(1,0) = q, \quad p(1,1) = 1-q.$$

1.1.8 Theorem: Let $\{X_n\}_{n \geq 0}$ be a Markov chain with state space $S$, transition probabilities $p(i,j)$, and initial distribution vector $\Pi_0$. Let $P = [p(i,j)]_{i,j \in S}$. Then the following hold:
(i) $0 \leq p(i,j) \leq 1$ and $0 \leq \Pi_0(i) \leq 1$.
(ii) For every $i \in S$, $\sum_{j \in S} p(i,j) = 1$.
(iii) $\sum_{i \in S} \Pi_0(i) = 1$.

1.1.9 Definition: The matrix $P = [p(i,j)]_{i,j \in S}$ is called the transition matrix of the Markov chain. It has the property that each entry is a nonnegative number between 0 and 1, and the sum of each row is 1.

Let us look at some examples.

1.1.10 Example: Consider Example 1.1.1, observing the working of a machine. Here $S = \{0, 1\}$. Let
$$P\{X_{n+1} = 1 \mid X_n = 0\} =: p(0,1) = p, \qquad P\{X_{n+1} = 0 \mid X_n = 1\} =: p(1,0) = q.$$
Then $P\{X_{n+1} = 0 \mid X_n = 0\} = 1-p$ and $P\{X_{n+1} = 1 \mid X_n = 1\} = 1-q$. Thus, the transition matrix is
$$P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}.$$
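As a concrete companion to Example 1.1.10 (not part of the original notes), here is a minimal Python sketch that builds the two-state transition matrix for illustrative values of $p$ and $q$ and checks the properties of Theorem 1.1.8; the values $p = 0.3$, $q = 0.6$ are arbitrary.

```python
import numpy as np

# Two-state machine chain of Example 1.1.10: states 0 (broken), 1 (working).
p, q = 0.3, 0.6   # illustrative values: P{0 -> 1} = p, P{1 -> 0} = q
P = np.array([[1 - p, p],
              [q, 1 - q]])

# Theorem 1.1.8: entries lie in [0, 1] and each row sums to 1.
assert np.all((0 <= P) & (P <= 1))
assert np.allclose(P.sum(axis=1), 1.0)
print(P)
```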

Another way of describing a Markov chain is given by the following theorem.

1.1.11 Theorem: A sequence of random variables $\{X_n\}_{n \geq 0}$ is a Markov chain with initial vector $\Pi_0$ and transition matrix $P$ if and only if for every $n \geq 1$ and $i_0, i_1, \ldots, i_n \in S$,
$$P\{X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\} = \Pi_0(i_0)\, p(i_0, i_1)\, p(i_1, i_2) \cdots p(i_{n-1}, i_n). \tag{1.1}$$

Proof: First suppose that $\{X_n\}_{n \geq 0}$ is a Markov chain with initial vector $\Pi_0$ and transition matrix $P$. Then, using the chain rule for conditional probability and the Markov property,
$$\begin{aligned}
P\{X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\}
&= P\{X_0 = i_0\}\, P\{X_1 = i_1 \mid X_0 = i_0\} \cdots P\{X_n = i_n \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}\}\\
&= \Pi_0(i_0)\, p(i_0, i_1)\, p(i_1, i_2) \cdots p(i_{n-1}, i_n).
\end{aligned}$$

Conversely, if equation (1.1) holds, then summing both sides over $i_n \in S$ and using $\sum_{i_n \in S} p(i_{n-1}, i_n) = 1$,
$$P\{X_0 = i_0, X_1 = i_1, \ldots, X_{n-1} = i_{n-1}\} = \sum_{i_n \in S} P\{X_0 = i_0, \ldots, X_n = i_n\} = \Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_{n-2}, i_{n-1}).$$
Proceeding similarly, we have for every $k = 0, 1, \ldots, n$ and $i_0, \ldots, i_k \in S$,
$$P\{X_0 = i_0, X_1 = i_1, \ldots, X_k = i_k\} = \Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_{k-1}, i_k).$$
In particular, for $k = 0$ we have $P\{X_0 = i_0\} = \Pi_0(i_0)$, and
$$P\{X_{n+1} = i_{n+1} \mid X_0 = i_0, \ldots, X_n = i_n\}
= \frac{P\{X_0 = i_0, \ldots, X_n = i_n, X_{n+1} = i_{n+1}\}}{P\{X_0 = i_0, \ldots, X_n = i_n\}}
= \frac{\Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_n, i_{n+1})}{\Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_{n-1}, i_n)}
= p(i_n, i_{n+1}).$$
Hence, $\{X_n\}_{n \geq 0}$ is a Markov chain with initial vector $\Pi_0$ and transition probabilities $p(i,j)$, $i, j \in S$.
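Equation (1.1) says that a path probability factors into the initial probability times one-step transition probabilities. The following sketch (an added illustration; the chain and the path are arbitrary choices, not from the notes) computes such a path probability and checks it against a brute-force simulation.

```python
import numpy as np

rng = np.random.default_rng(0)
Pi0 = np.array([0.5, 0.5])                  # initial distribution
P = np.array([[0.7, 0.3], [0.4, 0.6]])      # transition matrix

def path_probability(path, Pi0, P):
    """P{X_0 = i_0, ..., X_n = i_n} via equation (1.1)."""
    prob = Pi0[path[0]]
    for i, j in zip(path, path[1:]):
        prob *= P[i, j]
    return prob

def sample_path(n, Pi0, P):
    x = rng.choice(2, p=Pi0)
    path = [x]
    for _ in range(n):
        x = rng.choice(2, p=P[x])
        path.append(x)
    return tuple(path)

target = (0, 1, 1, 0)
exact = path_probability(target, Pi0, P)    # 0.5 * 0.3 * 0.6 * 0.4 = 0.036
trials = 100_000
hits = sum(sample_path(3, Pi0, P) == target for _ in range(trials))
print(f"formula: {exact:.4f}, simulation: {hits / trials:.4f}")
```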

1.2 Random walks

1.2.1 Example (Unrestricted random walk on the line): Consider a particle on the line which moves one unit to the right with probability $p$ and one unit to the left with probability $1-p$. This is called the unrestricted random walk on the line. Let $X_n$ denote the position of the particle at time $n$. Then $S = \{0, \pm 1, \pm 2, \ldots\}$, and the Markov chain has transition probabilities
$$p(i, i+1) = p, \qquad p(i, i-1) = 1-p, \qquad p(i,j) = 0 \text{ for } |i-j| \neq 1;$$
the (doubly infinite) transition matrix has $p$ on the superdiagonal, $1-p$ on the subdiagonal, and zeros elsewhere.

1.2.2 Random walk on the line with absorbing barriers: We can also consider the random walk on the line with state space $S = \{0, 1, 2, 3, \ldots, r\}$ and the condition that the walk ends if the particle reaches $0$ or $r$. The states $0$ and $r$ are called absorbing states: a particle that reaches such a state is absorbed in it and cannot leave it. The transition probability matrix for this walk is given by

$$P = \begin{pmatrix}
1 & 0 & 0 & \cdots & & 0\\
1-p & 0 & p & 0 & \cdots & 0\\
0 & 1-p & 0 & p & \cdots & 0\\
\vdots & & \ddots & \ddots & \ddots & \vdots\\
0 & \cdots & 0 & 1-p & 0 & p\\
0 & \cdots & & & 0 & 1
\end{pmatrix}$$
on the states $0, 1, \ldots, r$. A typical illustration of this situation is two players A and B gambling with a total capital of $r$ rupees, $X_n$ being the capital of A at the $n$th stage. The game ends when A loses all the money (stage 0) or when B loses all the money (stage $r$ for A).
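As a numerical companion to the absorbing-barrier walk (my addition, with arbitrary parameters), the sketch below simulates the gambling interpretation and estimates the probability that A is ruined (absorbed at 0). For $p \neq 1/2$ the estimate can be compared with the classical gambler's ruin formula $\big((q/p)^a - (q/p)^r\big)/\big(1 - (q/p)^r\big)$, where $a$ is the starting capital and $q = 1-p$.

```python
import random

def ruin_probability_mc(a, r, p, trials=100_000, seed=1):
    """Estimate P{walk started at a is absorbed at 0 before r} by simulation."""
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        x = a
        while 0 < x < r:
            x += 1 if rng.random() < p else -1
        ruined += (x == 0)
    return ruined / trials

a, r, p = 3, 10, 0.45      # illustrative values
q = 1 - p
exact = ((q / p) ** a - (q / p) ** r) / (1 - (q / p) ** r)
print(f"simulated: {ruin_probability_mc(a, r, p):.4f}, formula: {exact:.4f}")
```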

1.2.3 Random walk on the line with reflecting barriers: Another variation of the previous example is the situation where two friends are gambling with a view to playing longer, so they put the condition that every time a player loses his last rupee, the opponent returns it to him. Let $X_n$ denote the capital of player A at the $n$th stage. If the total money both players have is $r+1$ rupees, then the state space for the system is $S = \{1, 2, 3, \ldots, r\}$. To find the transition matrix, note that in the first row,
$$p(1,1) = P\{X_{n+1} = 1 \mid X_n = 1\} = P\{\text{A has his last rupee and loses; it is returned}\} = 1-p,$$
$$p(1,2) = P\{\text{A's capital becomes 2, given it is 1 now}\} = P\{\text{A wins}\} = p,$$
and $p(1,j) = 0$ for $j \geq 3$. For the $i$th row, $1 < i < r$ and $1 \leq j \leq r$,
$$p(i,j) = P\{X_{n+1} = j \mid X_n = i\} = \begin{cases} p & \text{if } j = i+1,\\ 1-p & \text{if } j = i-1,\\ 0 & \text{otherwise,} \end{cases}$$
while in the last row, symmetrically, $p(r,r) = p$ and $p(r, r-1) = 1-p$. Thus, the transition matrix is given by
$$P = \begin{pmatrix}
1-p & p & 0 & \cdots & 0\\
1-p & 0 & p & \cdots & 0\\
0 & 1-p & 0 & p & \cdots\\
\vdots & & \ddots & \ddots & \ddots\\
0 & \cdots & 0 & 1-p & p
\end{pmatrix}.$$

1.2.4 Birth and death chain: Let $X_n$ denote the population of a living system at time $n \geq 0$. The state space for the system $\{X_n\}_{n \geq 0}$ is $\{0, 1, 2, \ldots\}$. We assume that at any given stage $n$, if $X_n = x$, then the population increases to $x+1$ with probability $p_x$, decreases to $x-1$ with probability $q_x$, or remains the same with probability $r_x$. Then
$$p(x,y) = \begin{cases} p_x & \text{if } y = x+1,\\ q_x & \text{if } y = x-1,\\ r_x & \text{if } y = x,\\ 0 & \text{otherwise.} \end{cases}$$
Clearly, this is a Markov chain, called the birth and death chain; it is a random-walk-type chain with state-dependent transition probabilities.

1.3 Queuing chains

Consider a counter where customers are served at every unit of time. Let $X_0$ be the number of customers in the queue when the counter opens, and let $\xi_n$ be the number of customers who arrive during the $n$th unit of time. Then $X_{n+1}$, the number of customers waiting to be served at the beginning of the $(n+1)$th time unit, is
$$X_{n+1} = \begin{cases} \xi_n & \text{if } X_n = 0,\\ X_n + \xi_n - 1 & \text{if } X_n \geq 1. \end{cases}$$
The state space for the system is $S = \{0, 1, 2, \ldots\}$. If the $\{\xi_n\}_{n \geq 1}$ are independent random variables taking only nonnegative integer values, then $\{X_n\}_{n \geq 0}$ is a Markov chain. In case $\{\xi_n\}_{n \geq 1}$ is also identically distributed with distribution function $f$, we

can calculate the transition probabilities: for $x, y \in S$,
$$p(x,y) = P\{X_{n+1} = y \mid X_n = x\}
= \begin{cases} P\{\xi_n = y\} & \text{if } x = 0,\\ P\{\xi_n = y - x + 1\} & \text{if } x \geq 1 \end{cases}
= \begin{cases} f(y) & \text{if } x = 0,\\ f(y - x + 1) & \text{if } x \geq 1. \end{cases}$$

1.4 Ehrenfest chain

Consider two isolated containers, labeled body A and body B, containing two different fluids. Let the total number of molecules of the two fluids, distributed between the containers A and B, be $d$, labeled $1, 2, \ldots, d$. The observation made is the number of molecules in A. To start with, A has some number of molecules and B has the rest. At each stage, a number $1 \leq r \leq d$ is chosen at random, and the molecule labeled $r$ is removed from the body in which it is and placed in the other body. This gives the observation at the next stage, and so on. Clearly $X_n$, the number of molecules in A at stage $n$, takes values in $\{0, 1, 2, \ldots, d\}$; thus the state space is $S = \{0, 1, 2, \ldots, d\}$.

Let us find the transition probabilities $p(i,j)$, $0 \leq i, j \leq d$, of the system. When $i = 0$, A has no molecules at stage $n$, so the chosen molecule is necessarily in B and moves to A; hence $j$ can only be 1 at stage $n+1$:
$$p(0,j) = \begin{cases} 1 & \text{if } j = 1,\\ 0 & \text{if } j \neq 1. \end{cases}$$
Similarly, when $i = d$, A has all the molecules, so the chosen molecule is necessarily in A and moves to B:
$$p(d,j) = \begin{cases} 1 & \text{if } j = d-1,\\ 0 & \text{otherwise.} \end{cases}$$
For fixed $i$ with $0 < i < d$, consider $p(i,j)$ for $0 \leq j \leq d$: the probability that A will have $j$ molecules at the next stage, given that it has $i$ now. Since exactly one molecule moves at each stage, the only possibilities for $j$ are $i-1$ and $i+1$; thus $p(i,j) = 0$ if $j \neq i+1, i-1$. If $j = i+1$, one of the $d-i$ molecules in B must be chosen, which happens with probability $(d-i)/d$. Thus
$$p(i, i+1) = \frac{d-i}{d} = 1 - \frac{i}{d}, \qquad p(i, i-1) = \frac{i}{d},$$
and the transition matrix for this Markov chain is given by
$$P = \begin{pmatrix}
0 & 1 & 0 & \cdots & & 0\\
1/d & 0 & 1 - 1/d & 0 & \cdots & 0\\
0 & 2/d & 0 & 1 - 2/d & \cdots & 0\\
\vdots & & \ddots & \ddots & \ddots & \vdots\\
0 & \cdots & & 0 & 1 & 0
\end{pmatrix}$$
on the states $0, 1, \ldots, d$. This model is called the Ehrenfest diffusion model.
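The Ehrenfest transition probabilities just derived translate directly into code. A small sketch (my own illustration, not part of the notes) that assembles the matrix for a given $d$:

```python
import numpy as np

def ehrenfest_matrix(d):
    """Transition matrix of the Ehrenfest chain on states 0, 1, ..., d."""
    P = np.zeros((d + 1, d + 1))
    for i in range(d + 1):
        if i > 0:
            P[i, i - 1] = i / d        # a molecule of A is chosen, moves to B
        if i < d:
            P[i, i + 1] = 1 - i / d    # a molecule of B is chosen, moves to A
    return P

P = ehrenfest_matrix(4)
assert np.allclose(P.sum(axis=1), 1.0)   # each row sums to 1
print(P)
```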

1.5 Some consequences of the Markov property

Let $\{X_n\}_{n \geq 0}$ be a Markov chain with state space $S$ and transition probabilities $p(i,j)$, $i, j \in S$.

1.5.1 Proposition: Let $S_0, S_1, \ldots, S_{n-2}$ be subsets of $S$. Then for any $n \geq 1$,
$$P\{X_n = j \mid X_{n-1} = i, X_{n-2} \in S_{n-2}, \ldots, X_0 \in S_0\} = p(i,j).$$

Proof: The required property holds for elementary events $S_k = \{i_k\}$, $i_k \in S$, by the Markov property:
$$P\{X_n = j \mid X_{n-1} = i, X_{n-2} = i_{n-2}, \ldots, X_0 = i_0\} = P\{X_n = j \mid X_{n-1} = i\}.$$
Since any subset of $S$ is a countable disjoint union of elementary events, the required property follows from property (iv) of conditional probability in the Prologue.

1.5.2 Example: Let us compute $P\{X_3 = j \mid X_1 = i, X_0 = k\}$, $j, i, k \in S$. Using Proposition 1.5.1 and the Markov property, we have
$$\begin{aligned}
P\{X_3 = j \mid X_1 = i, X_0 = k\}
&= \sum_{r \in S} P\{X_3 = j \mid X_2 = r, X_1 = i, X_0 = k\}\, P\{X_2 = r \mid X_1 = i, X_0 = k\}\\
&= \sum_{r \in S} P\{X_3 = j \mid X_2 = r, X_1 = i\}\, P\{X_2 = r \mid X_1 = i\}\\
&= P\{X_3 = j \mid X_1 = i\}.
\end{aligned}$$

In fact, the above example can be extended to the following.

1.5.3 Theorem: For $n > n_s > n_{s-1} > \cdots > n_1 \geq 0$,
$$P\{X_n = j \mid X_{n_s} = i, X_{n_{s-1}} = i_{s-1}, \ldots, X_{n_1} = i_1\} = P\{X_n = j \mid X_{n_s} = i\}.$$
Thus, for a Markov chain, the probability at time $n$ given the past at times $n_s > n_{s-1} > \cdots > n_1$ depends only on the most recent past, i.e., on $n_s$.

Thus, to every Markov chain we can associate a vector, the distribution of the initial stage, and a stochastic matrix whose entries give the probabilities of moving from one state to another at the next stage. Here is the converse:

1.5.4 Theorem: Given a stochastic matrix $P$ and a probability vector $\Pi_0$, there exists a Markov chain $\{X_n\}_{n \geq 0}$ with $\Pi_0$ as initial distribution and $P$ as transition probability matrix.

The interested reader may refer to Theorem 8.1 of Billingsley [4].

1.5.5 Exercise: Show that
$$P\{X_0 = i_0 \mid X_1 = i_1, \ldots, X_n = i_n\} = P\{X_0 = i_0 \mid X_1 = i_1\}.$$

Review Exercises

(1.1) Mark the following statements as True/False:
(i) A Markov system can be in several states at one time.
(ii) The (1,3) entry in the transition matrix is the probability of going from state 1 to state 3 in two steps.
(iii) The (6,5) entry in the transition matrix is the probability of going from state 6 to state 5 in one step.
(iv) The entries in each row of the transition matrix add to zero.
(v) Let $\{X_n\}_{n \geq 0}$ be a sequence of independent identically distributed discrete random variables. Then it is a Markov chain.
(vi) If the state space is $S = \{s_1, s_2, \ldots, s_n\}$, then the transition matrix has order $n$.

(1.2) Let $\{\xi_n\}_{n \geq 0}$ be a sequence of independent identically distributed discrete random variables. Define
$$X_n = \begin{cases} \xi_0 & \text{if } n = 0,\\ \xi_1 + \xi_2 + \cdots + \xi_n & \text{for } n \geq 1. \end{cases}$$
Show that $\{X_n\}_{n \geq 0}$ is a Markov chain. Sketch its transition graph and compute the transition probabilities.

(1.3) Consider a person moving on a $4 \times 4$ grid. He can move only to the intersection points to the right or down, each with probability 1/2. He starts his walk from the top left corner, and $X_n$, $n \geq 1$, denotes his position after $n$ steps. Show that $\{X_n\}_{n \geq 0}$ is a Markov chain. Sketch its transition graph and compute the transition probability matrix. Also find the initial distribution vector.

(1.4) Web surfing: Consider a person surfing the Internet; each time he encounters a web page, he selects one of its hyperlinks at random (uniformly). Let $X_n$ denote the page where the person is after $n$ selections (clicks). What do you think is the state space? Find the transition probability matrix.

(1.5) Let $\{X_n\}_{n \geq 0}$ be a Markov chain with state space, initial probability distribution and transition matrix given by
$$S = \{1, 2, 3\}, \qquad \Pi_0 = (1/3, 1/3, 1/3), \qquad P = \begin{pmatrix} 1/3 & 1/3 & 1/3\\ 1/3 & 1/3 & 1/3\\ 1/3 & 1/3 & 1/3 \end{pmatrix}.$$
Define
$$Y_n = \begin{cases} 0 & \text{if } X_n = 1,\\ 1 & \text{otherwise.} \end{cases}$$
Show that $\{Y_n\}_{n \geq 0}$ is not a Markov chain. Thus, a function of a Markov chain need not be a Markov chain.

(1.6) Let $\{X_n\}_{n \geq 0}$ be a Markov chain with transition matrix $P$. Define $Y_n = X_{2n}$ for every $n \geq 0$. Show that $\{Y_n\}_{n \geq 0}$ is a Markov chain with transition matrix $P^2$. What happens if $Y_n$ is defined as $Y_n = X_{nk}$ for every $n \geq 0$?

Chapter 2: Calculation of higher order probabilities

2.1 Distribution of X_n and other joint distributions

Consider a Markov chain $\{X_n\}_{n \geq 0}$ with initial vector $\Pi_0$ and transition probability matrix $P = [p(i,j)]_{i,j \in S}$. We want to find the probability that after $n$ steps the system will be in a given state, say $j \in S$. For a matrix $A$, its $n$-fold product with itself will be denoted by $A^n$.

2.1.1 Theorem:
(i) The joint distribution of $X_0, X_1, X_2, \ldots, X_n$ is given by
$$P\{X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\} = \Pi_0(i_0)\, p(i_0, i_1)\, p(i_1, i_2) \cdots p(i_{n-1}, i_n).$$
(ii) The distribution of $X_n$, i.e., $P\{X_n = j\}$, is given by the $j$th component of the vector $\Pi_0 P^n$.
(iii) For every $n, m \geq 0$,
$$P\{X_n = j \mid X_0 = i\} = P\{X_{n+m} = j \mid X_m = i\} = p^n(i,j),$$
where $p^n(i,j)$ is the $(i,j)$th entry of the matrix $P^n$.

Proof: (i) Using the chain rule for conditional probability and the Markov property,
$$\begin{aligned}
P\{X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\}
&= P\{X_n = i_n \mid X_{n-1} = i_{n-1}, \ldots, X_0 = i_0\} \cdots P\{X_1 = i_1 \mid X_0 = i_0\}\, P\{X_0 = i_0\}\\
&= P\{X_n = i_n \mid X_{n-1} = i_{n-1}\} \cdots P\{X_1 = i_1 \mid X_0 = i_0\}\, P\{X_0 = i_0\}\\
&= \Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_{n-1}, i_n).
\end{aligned}$$

(ii) Summing the joint distribution in (i) over all intermediate states,
$$\begin{aligned}
P\{X_n = j\}
&= \sum_{i_0 \in S} \sum_{i_1 \in S} \cdots \sum_{i_{n-1} \in S} P\{X_0 = i_0, X_1 = i_1, \ldots, X_{n-1} = i_{n-1}, X_n = j\}\\
&= \sum_{i_0 \in S} \sum_{i_1 \in S} \cdots \sum_{i_{n-1} \in S} \Pi_0(i_0)\, p(i_0, i_1) \cdots p(i_{n-1}, j)\\
&= j\text{th component of the vector } \Pi_0 P^n.
\end{aligned}$$
(iii) Once again, using the Markov property and the chain rule for conditional probability,
$$\begin{aligned}
P\{X_{n+m} = j \mid X_m = i\}
&= \frac{P\{X_{n+m} = j, X_m = i\}}{P\{X_m = i\}}\\
&= \sum_{i_{m+1} \in S} \cdots \sum_{i_{m+n-1} \in S} \frac{P\{X_m = i, X_{m+1} = i_{m+1}, \ldots, X_{m+n-1} = i_{m+n-1}, X_{n+m} = j\}}{P\{X_m = i\}}\\
&= \sum_{i_{m+1} \in S} \cdots \sum_{i_{m+n-1} \in S} p(i, i_{m+1}) \cdots p(i_{m+n-1}, j)\\
&= p^n(i,j).
\end{aligned}$$
In particular, the right-hand side does not depend on $m$, so $P\{X_n = j \mid X_0 = i\} = P\{X_{n+m} = j \mid X_m = i\} = p^n(i,j)$.
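Theorem 2.1.1(ii) translates directly into code: the distribution of $X_n$ is the row vector $\Pi_0 P^n$. A minimal sketch (illustrative chain, not from the notes):

```python
import numpy as np

Pi0 = np.array([1.0, 0.0])                  # start in state 0
P = np.array([[0.7, 0.3], [0.4, 0.6]])

def distribution(n, Pi0, P):
    """Distribution of X_n as the row vector Pi_0 @ P^n (Theorem 2.1.1(ii))."""
    return Pi0 @ np.linalg.matrix_power(P, n)

for n in range(5):
    print(n, distribution(n, Pi0, P))
```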

2.1.2 Definition: Let $\{X_n\}_{n \geq 0}$ be a Markov chain with initial vector $\Pi_0$ and transition probability matrix $P = [p(i,j)]$, $i, j \in S$.
(i) For $n \geq 1$ and $j \in S$, $\Pi_n(j) := P\{X_n = j\}$ is called the distribution of $X_n$.
(ii) For $n \geq 1$, the $p^n(i,j)$ are called the $n$th stage transition probabilities.

The above theorem gives us the probability that the system is in a given state at the $n$th stage, and the probability that the system moves in $n$ stages from a state $i$ to a state $j$. These can be computed if we know the initial distribution and the powers of the transition matrix. Thus, it is important to compute the matrix $P^n$, $P$ being the transition matrix. For large $n$ this is difficult to compute in general. Let us look at some examples.

2.1.3 Exercise: Show that the joint distribution of $X_m, X_{m+1}, \ldots, X_{m+n}$ is given by
$$P\{X_m = i_m\}\, p(i_m, i_{m+1})\, p(i_{m+1}, i_{m+2}) \cdots p(i_{m+n-1}, i_{m+n}).$$
Also write the joint distribution of any finite $X_{n_1}, X_{n_2}, \ldots, X_{n_r}$, for $n_1 < n_2 < \cdots < n_r$.

2.1.4 Example: Consider a Markov chain $\{X_n\}_{n \geq 0}$ in the special situation where all the $X_n$ are independent. Let us compute $P^n$, where $P$ is the transition probability matrix. Because the $X_n$ are independent,
$$p(i,j) = P\{X_{n+1} = j \mid X_n = i\} = P\{X_{n+1} = j\} \quad \text{for all } i, j \text{ and all } n,$$
so each row of $P$ is identical. By Theorem 2.1.1(iii), for all $i$,
$$p^n(i,j) = P\{X_{n+m} = j \mid X_m = i\} = P\{X_n = j \mid X_0 = i\} = P\{X_n = j\} = p(i,j).$$
Therefore $p^n(i,j) = p(i,j)$, i.e., $P^n = P$.

2.1.5 Example: Consider the Markov chain with two states $S = \{0, 1\}$ and transition matrix
$$P = \begin{pmatrix} 1-p & p\\ q & 1-q \end{pmatrix},$$
with initial distribution $(\Pi_0(0), \Pi_0(1))$. The knowledge of $P$ and $\Pi_0$ helps us answer various questions. For example, to compute the distribution of $X_n$, we condition on $X_n$: for every $n \geq 0$,
$$\begin{aligned}
P\{X_{n+1} = 0\}
&= P\{X_{n+1} = 0 \mid X_n = 0\}\, P\{X_n = 0\} + P\{X_{n+1} = 0 \mid X_n = 1\}\, P\{X_n = 1\}\\
&= (1-p)\, P\{X_n = 0\} + q\, P\{X_n = 1\}\\
&= (1-p)\, P\{X_n = 0\} + q\,(1 - P\{X_n = 0\})\\
&= (1-p-q)\, P\{X_n = 0\} + q.
\end{aligned}$$
Iterating this recursion,
$$P\{X_1 = 0\} = (1-p-q)\,\Pi_0(0) + q,$$
$$P\{X_2 = 0\} = (1-p-q)\,P\{X_1 = 0\} + q = (1-p-q)^2\,\Pi_0(0) + q(1-p-q) + q,$$
and in general
$$P\{X_n = 0\} = (1-p-q)^n\,\Pi_0(0) + q \sum_{j=0}^{n-1} (1-p-q)^j.$$

Summing the geometric series, for $p + q > 0$,
$$P\{X_n = 0\} = \frac{q}{p+q} + (1-p-q)^n \Big[\Pi_0(0) - \frac{q}{p+q}\Big].$$
Then, using the fact that $p^n(0,0) = P\{X_n = 0 \mid X_0 = 0\}$ is obtained by taking $\Pi_0(0) = 1$,
$$p^n(0,0) = \frac{q}{p+q} + (1-p-q)^n\,\frac{p}{p+q},$$
and hence
$$p^n(0,1) = 1 - p^n(0,0) = \frac{p}{p+q} - (1-p-q)^n\,\frac{p}{p+q}.$$
Similarly, taking $\Pi_0(0) = 0$,
$$p^n(1,0) = \frac{q}{p+q} - (1-p-q)^n\,\frac{q}{p+q}, \qquad p^n(1,1) = \frac{p}{p+q} + (1-p-q)^n\,\frac{q}{p+q}.$$
Therefore,
$$P^n = \frac{1}{p+q}\begin{pmatrix} q & p\\ q & p \end{pmatrix} + \frac{(1-p-q)^n}{p+q}\begin{pmatrix} p & -p\\ -q & q \end{pmatrix}.$$

2.1.6 Exercise: Consider the two-state Markov chain of Example 1.1.10.
(i) If $p = q = 0$, what can be said about the machine?
(ii) If $p, q > 0$, show that
$$P\{X_n = 0\} = \frac{q}{p+q} + (1-p-q)^n\Big[\Pi_0(0) - \frac{q}{p+q}\Big]$$
and
$$P\{X_n = 1\} = \frac{p}{p+q} + (1-p-q)^n\Big[\Pi_0(1) - \frac{p}{p+q}\Big].$$
(iii) Find conditions on $\Pi_0(0)$ and $\Pi_0(1)$ such that the distribution of $X_n$ is independent of $n$.
(iv) Compute $P\{X_0 = 0, X_1 = 1, X_2 = 0\}$.
(v) Can one compute the joint distribution of $X_n, X_{n+1}, X_{n+2}$?
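The closed form for $P^n$ in Example 2.1.5 is easy to sanity-check numerically. A sketch (my addition, with arbitrary values of $p$ and $q$) comparing it against repeated matrix multiplication:

```python
import numpy as np

p, q = 0.3, 0.6   # arbitrary illustrative values with p + q > 0
P = np.array([[1 - p, p], [q, 1 - q]])
s = p + q

for n in range(6):
    closed_form = (np.array([[q, p], [q, p]])
                   + (1 - s) ** n * np.array([[p, -p], [-q, q]])) / s
    assert np.allclose(closed_form, np.linalg.matrix_power(P, n))
print("closed form agrees with matrix powers")
```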

2.1.7 Note (the case when P is diagonalizable): As we observed earlier, it is not easy to compute $P^n$ for a general matrix $P$, even a finite one. However, in case $P$ is diagonalizable (see the Appendix for more details), it is easy: if there exists an invertible matrix $U$ such that $P = U D U^{-1}$, where $D$ is a diagonal matrix, then $P^n = U D^n U^{-1}$, and $D^n$ is easy to compute. In this case we can compute the entries of $P^n$ as follows. Let the state space have $M$ elements and let $P$ be diagonalizable, the diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_M$ of $D$ being the eigenvalues of $P$. To find $p^n(i,j)$:
(i) Compute the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_M$ of $P$ by solving the characteristic equation.
(ii) If all the eigenvalues are distinct, then for all $n$, $p^n_{ij}$ has the form
$$p^n_{ij} = a_1 \lambda_1^n + \cdots + a_M \lambda_M^n,$$
for some constants $a_1, \ldots, a_M$ depending upon $i$ and $j$. These can be found by solving a system of linear equations.

2.1.8 Example: Let the transition matrix of a Markov chain be
$$P = \begin{pmatrix} 0 & 1 & 0\\ 0 & 1/2 & 1/2\\ 1/2 & 0 & 1/2 \end{pmatrix},$$
and let us find a general formula for $p^n_{11}$. We first compute the eigenvalues of $P$ by solving $\det(P - \lambda I) = 0$:
$$\begin{vmatrix} -\lambda & 1 & 0\\ 0 & 1/2 - \lambda & 1/2\\ 1/2 & 0 & 1/2 - \lambda \end{vmatrix} = 0.$$
This gives the (complex) eigenvalues $1, \pm i/2$. Thus, for some invertible matrix $U$,
$$P = U \begin{pmatrix} 1 & 0 & 0\\ 0 & i/2 & 0\\ 0 & 0 & -i/2 \end{pmatrix} U^{-1}, \quad\text{and hence}\quad P^n = U \begin{pmatrix} 1 & 0 & 0\\ 0 & (i/2)^n & 0\\ 0 & 0 & (-i/2)^n \end{pmatrix} U^{-1}.$$
In fact, $U$ can be written explicitly in terms of the eigenvectors. Alternatively, the above implies that for some scalars $a, b, c$,
$$p^n_{11} = a + b\,(i/2)^n + c\,(-i/2)^n.$$
Since $(\pm i/2)^n = (1/2)^n e^{\pm i n\pi/2}$, taking real combinations we may write, for all $n \geq 0$,
$$p^n_{11} = a + (1/2)^n\big[\,b' \cos(n\pi/2) + c' \sin(n\pi/2)\,\big].$$
In particular, for $n = 0, 1, 2$ we have
$$1 = p^0_{11} = a + b', \qquad 0 = p^1_{11} = a + \tfrac{1}{2}\,c', \qquad 0 = p^2_{11} = a - \tfrac{1}{4}\,b'.$$
The solution of this system is $a = 1/5$, $b' = 4/5$, $c' = -2/5$, and hence
$$p^n_{11} = \frac{1}{5} + \Big(\frac{1}{2}\Big)^n \Big[\frac{4}{5}\cos(n\pi/2) - \frac{2}{5}\sin(n\pi/2)\Big].$$
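Following Note 2.1.7, one can let the computer do the diagonalization. The sketch below (my illustration) diagonalizes the matrix of Example 2.1.8 with `numpy.linalg.eig` and reconstructs $p^n_{11}$, comparing it with the closed formula just derived.

```python
import numpy as np

P = np.array([[0, 1, 0],
              [0, 0.5, 0.5],
              [0.5, 0, 0.5]])

# Diagonalize: P = U D U^{-1}, so P^n = U D^n U^{-1}.
eigvals, U = np.linalg.eig(P)           # eigenvalues 1, i/2, -i/2
Uinv = np.linalg.inv(U)

def p11(n):
    Dn = np.diag(eigvals ** n)
    return (U @ Dn @ Uinv)[0, 0].real   # (1,1) entry in the notes' numbering

for n in range(6):
    formula = 0.2 + 0.5 ** n * (0.8 * np.cos(n * np.pi / 2)
                                - 0.4 * np.sin(n * np.pi / 2))
    print(n, round(p11(n), 6), round(formula, 6))
```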

2.2 Kolmogorov-Chapman equation

We saw that given a Markov chain $\{X_n\}_{n \geq 0}$ with state space $S$, initial distribution $\Pi_0$ and transition matrix $P$, we can calculate the distribution of $X_n$ and other joint distributions. If we write $\Pi_n$ for the distribution of $X_n$, i.e., $\Pi_n(j) = P\{X_n = j\}$, then
$$\Pi_n(j) = \sum_{k \in S} \Pi_0(k)\, p^n(k,j), \qquad \text{or symbolically,} \qquad \Pi_n = \Pi_0 P^n.$$
We can also write joint distributions along any stretch of time: for instance,
$$P\{X_{m+t} = i_t,\ 0 \leq t \leq n\} = \Pi_m(i_0)\, p(i_0, i_1) \cdots p(i_{n-1}, i_n).$$
The entries of $P^n$ are called the $n$th step transition probabilities. Thus, the knowledge about the Markov chain is contained in $\Pi_0$ and the matrix $P$. As noted earlier, $P$ is a (possibly infinite) matrix such that the sum of each row is 1, i.e., a stochastic matrix. For consistency, we define $P^0 = \mathrm{Id}$. The following is easy to show:

2.2.1 Theorem: For $n, m \geq 0$ and $i, j \in S$,
$$p^{n+m}(i,j) = \sum_{r \in S} p^n(i,r)\, p^m(r,j).$$
In matrix form this is just $P^{n+m} = P^n P^m$. This is called the Kolmogorov-Chapman equation.

Proof: Using property (v) of conditional probability,
$$\begin{aligned}
p^{n+m}(i,j) = P\{X_{n+m} = j \mid X_0 = i\}
&= \sum_{r \in S} P\{X_n = r \mid X_0 = i\}\, P\{X_{n+m} = j \mid X_n = r, X_0 = i\}\\
&= \sum_{r \in S} p^n(i,r)\, P\{X_{n+m} = j \mid X_n = r\}\\
&= \sum_{r \in S} p^n(i,r)\, p^m(r,j).
\end{aligned}$$
The last equality follows from the fact that $P\{X_{n+m} = j \mid X_n = r, X_0 = i\} = P\{X_{n+m} = j \mid X_n = r\} = p^m(r,j)$, as observed in Theorem 1.5.3.

2.2.2 Example: Consider the unrestricted random walk on the line, as in Example 1.2.1, with probability $p$ of moving forward and $1-p$ of coming back. Then
$$p^{2n+1}(0,0) = 0,$$

as the walk can return to its starting point only in an even number of steps. And
$$p^{2n}(0,0) = \binom{2n}{n}\, p^n (1-p)^n,$$
as there must be $n$ moves to the right and $n$ moves back. Thus, with $q = 1-p$,
$$p^{2n}(0,0) = \binom{2n}{n} (pq)^n.$$
In fact, the same holds for every diagonal entry; the other entries are more difficult to compute. Note that
$$\sum_{n=0}^{\infty} p^n(0,0) = \sum_{n=0}^{\infty} \binom{2n}{n} (pq)^n.$$
Using Stirling's approximation, $n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}$, we have
$$p^{2n}(0,0) \sim \frac{(4pq)^n}{\sqrt{n\pi}},$$
which is summable if $4pq < 1$ and not summable otherwise. Thus, in the language of Chapter 3, the state 0 is transient if $p \neq q$ and recurrent if $p = q = 1/2$.

2.2.3 Example: Consider the Markov chain of Exercise (1.3), with state space $S = \{1, 2, 3, 4\}$, initial distribution $\Pi_0 = (1, 0, 0, 0)$, and transition matrix
$$P = \begin{pmatrix} 0 & 1/2 & 0 & 1/2\\ 1/2 & 0 & 1/2 & 0\\ 0 & 1/2 & 0 & 1/2\\ 1/2 & 0 & 1/2 & 0 \end{pmatrix}.$$
Then
$$P^2 = \begin{pmatrix} 1/2 & 0 & 1/2 & 0\\ 0 & 1/2 & 0 & 1/2\\ 1/2 & 0 & 1/2 & 0\\ 0 & 1/2 & 0 & 1/2 \end{pmatrix}, \qquad \Pi_0 P^2 = (1/2,\ 0,\ 1/2,\ 0).$$
Thus, if we want to find the probability that the walker will be in state 3 in two steps, it is $\Pi_2(3) = (\Pi_0 P^2)(3) = 1/2$.

Exercises

(2.1) Consider the Markov chain of Example 2.2.3. Show that
$$\Pi_n = \begin{cases} (0, 1/2, 0, 1/2) & \text{for } n = 1, 3, 5, \ldots,\\ (1/2, 0, 1/2, 0) & \text{for } n = 2, 4, 6, \ldots \end{cases}$$
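Exercise (2.1) can be checked numerically. A sketch (my addition) iterating $\Pi_{n} = \Pi_{n-1} P$ for the chain of Example 2.2.3:

```python
import numpy as np

P = np.array([[0, 0.5, 0, 0.5],
              [0.5, 0, 0.5, 0],
              [0, 0.5, 0, 0.5],
              [0.5, 0, 0.5, 0]])
Pi = np.array([1.0, 0, 0, 0])    # Pi_0

for n in range(1, 7):
    Pi = Pi @ P                  # Pi_n = Pi_{n-1} P
    print(n, Pi)                 # alternates (0,1/2,0,1/2) and (1/2,0,1/2,0)
```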

(2.2) Let $\{X_n\}_{n \geq 0}$ be a Markov chain with state space, initial probability distribution and transition matrix given by
$$S = \{1, 2\}, \qquad \Pi_0 = (1, 0), \qquad P = \begin{pmatrix} 3/4 & 1/4\\ 1/4 & 3/4 \end{pmatrix}.$$
Show that
$$\Pi_n = \Big(\tfrac{1}{2}\big(1 + 2^{-n}\big),\ \tfrac{1}{2}\big(1 - 2^{-n}\big)\Big) \quad \text{for every } n.$$

(2.3) Consider the two-state Markov chain $\{X_n\}_{n \geq 0}$ with $\Pi_0 = (1, 0)$ and transition matrix
$$P = \begin{pmatrix} 1-p & p\\ q & 1-q \end{pmatrix}.$$
Using the facts that $P$ is stochastic and the relation $P^{n+1} = P^n P$, deduce that
$$p^{n+1}(1,1) = p^n(1,2)\,q + p^n(1,1)(1-p), \qquad p^n(1,1) + p^n(1,2) = 1,$$
and hence, for all $n \geq 0$,
$$p^{n+1}(1,1) = p^n(1,1)(1-p-q) + q.$$
Show that this has the unique solution
$$p^n(1,1) = \frac{q}{p+q} + \frac{p}{p+q}(1-p-q)^n \quad \text{for } p+q > 0, \qquad p^n(1,1) = 1 \quad \text{for } p+q = 0.$$

Chapter 3: Classification of states

Let $\{X_n\}_{n \geq 0}$ be a Markov chain with state space $S$, initial distribution $\Pi_0$ and transition probability matrix $P$. We will also write $p^n_{ij}$ for $p^n(i,j)$, the $(i,j)$th entry of $P^n$. We start by looking at the possibility of moving from one state to another.

3.1 Closed subsets and irreducible subsets

3.1.1 Definition:
(i) We say a state $j$ is reachable from a state $i$ (or $i$ leads to $j$, or $j$ is accessible from $i$) if there exists some $n \geq 0$ such that $p^n_{ij} > 0$. We denote this by $i \to j$. In other words, $i$ leads to $j$ in some number of steps with positive probability.
(ii) A subset $C$ of the state space is said to be closed if no state in $C$ leads to a state outside $C$. Thus, $C$ is closed means exactly that for every $i \in C$ and $j \notin C$,
$$p^n_{ij} = 0 \quad \text{for all } n \geq 0.$$
This means that once the chain enters the set $C$, it never leaves it.
(iii) A state $j$ is called an absorbing state if the singleton set $\{j\}$ is a closed set.

3.1.2 Proposition:
(i) If $i \to j$ and $j \to k$, then $i \to k$.
(ii) A state $j$ is reachable from a state $i$ iff $p_{i i_1}\, p_{i_1 i_2} \cdots p_{i_{n-1} j} > 0$ for some $i_1, i_2, \ldots, i_{n-1} \in S$ and some $n$.
(iii) $C \subseteq S$ is closed iff for every $i \in C$ and $j \notin C$, $p_{ij} = 0$.
(iv) The state space $S$ is closed, and for $i \in S$, the set $\{i\}$ is closed iff $p_{ii} = 1$.

Proof: (i) Follows from the fact that if $p^n_{ij} > 0$ and $p^m_{jk} > 0$ for some $n, m \geq 0$, then
$$p^{n+m}_{ik} = \sum_{r \in S} p^n_{ir}\, p^m_{rk} \geq p^n_{ij}\, p^m_{jk} > 0.$$
(ii) Follows from the equality
$$p^n_{ij} = \sum_{i_1, \ldots, i_{n-1}} p_{i i_1}\, p_{i_1 i_2} \cdots p_{i_{n-1} j}.$$

(iii) Clearly, $p^n_{ij} = 0$ for all $n$ implies $p_{ij} = 0$. Conversely, suppose $p_{ij} = 0$ for all $i \in C$, $j \notin C$. Then for all $r \in C$ and $k \notin C$,
$$p^2_{rk} = \sum_{l \in S} p_{rl}\, p_{lk} = \sum_{l \in C} p_{rl}\, p_{lk} = 0,$$
since $p_{rl} = 0$ for $l \notin C$, and $p_{lk} = 0$ for $l \in C$, $k \notin C$. Proceeding similarly, $p^n_{rk} = 0$ for all $n \geq 1$.
(iv) is obvious.

3.1.3 Definition: A subset $C$ of $S$ is called irreducible if any two states in $C$ lead to one another.

Let us look at some examples.

3.1.4 Example: Consider a Markov chain on the states $0, 1, \ldots, 5$ with transition matrix
$$P = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
1/4 & 1/2 & 1/4 & 0 & 0 & 0\\
0 & 1/5 & 2/5 & 1/5 & 0 & 1/5\\
0 & 0 & 0 & 1/6 & 1/3 & 1/2\\
0 & 0 & 0 & 1/2 & 0 & 1/2\\
0 & 0 & 0 & 1/4 & 0 & 3/4
\end{pmatrix}.$$
We first look at which state leads to which state. Whenever $i \to j$, we put a $*$ in the $(i,j)$ entry of a reachability matrix. Note that $p_{ij} > 0$ gives a $*$ at the $(i,j)$ entry, but $p_{ij} = 0$ need not give a blank entry: for example, $p_{13} = 0$, but $1 \to 2 \to 3$, so the $(1,3)$ entry is also a $*$. For the above matrix we get (rows and columns indexed $0, \ldots, 5$):
$$\begin{pmatrix}
* & \cdot & \cdot & \cdot & \cdot & \cdot\\
* & * & * & * & * & *\\
* & * & * & * & * & *\\
\cdot & \cdot & \cdot & * & * & *\\
\cdot & \cdot & \cdot & * & * & *\\
\cdot & \cdot & \cdot & * & * & *
\end{pmatrix}$$
Clearly, a single state $i$ is a closed set iff $p_{ii} = 1$; in our case $\{0\}$ is closed. The set $S = \{0, 1, 2, 3, 4, 5\}$ is closed by definition, for there is no state outside $S$. A look at the reachability matrix tells us that the set $\{3, 4, 5\}$ is closed, because none of $3, 4, 5$ leads to $0, 1$ or $2$. The set $\{1\}$ is not closed because $1 \to 2$. In fact, the only other closed set is the union $\{0\} \cup \{3, 4, 5\}$. The set $\{3, 4, 5\}$ is also irreducible.
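The reachability matrix of Example 3.1.4 is just the boolean transitive closure of the relation $p_{ij} > 0$ (together with $p^0_{ii} = 1$), so a computer can produce it. A sketch under these assumptions, using Warshall's algorithm on the matrix of Example 3.1.4:

```python
import numpy as np

P = np.array([
    [1,   0,   0,   0,   0,   0  ],
    [1/4, 1/2, 1/4, 0,   0,   0  ],
    [0,   1/5, 2/5, 1/5, 0,   1/5],
    [0,   0,   0,   1/6, 1/3, 1/2],
    [0,   0,   0,   1/2, 0,   1/2],
    [0,   0,   0,   1/4, 0,   3/4],
])

# Reachability i -> j: transitive closure of the edge relation p_ij > 0,
# with every state reaching itself in 0 steps (p^0_ii = 1).
R = (P > 0) | np.eye(len(P), dtype=bool)
for k in range(len(P)):                       # Warshall's algorithm
    R |= R[:, [k]] & R[[k], :]

for i in range(len(P)):
    print(f"states reachable from {i}:",
          sorted(j for j in range(len(P)) if R[i, j]))
```

Running this reproduces the star matrix: state 0 reaches only itself, states 1 and 2 reach everything, and states 3, 4, 5 reach exactly $\{3, 4, 5\}$.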

3.1.5 Note (importance of closed irreducible sets): Why should one bother about closed subsets of the state space? To find the answer, let us look at the above example again. Take a proper closed set, say $C = \{3, 4, 5\}$. If we remove the rows and columns corresponding to the states $0, 1, 2$ from the transition matrix, we get the sub-matrix
$$P_C = \begin{pmatrix} 1/6 & 1/3 & 1/2\\ 1/2 & 0 & 1/2\\ 1/4 & 0 & 3/4 \end{pmatrix},$$
which has the property that each row sums to 1. In fact, if we take $P^2$ and delete the rows and columns not in $C$, writing the result as $(P^2)_C$, it is easy to check that $(P^2)_C$ is nothing but $(P_C)^2$. For in $P^2$, note that for $i \in C$, $p^2_{ij} = 0$ if $j \notin C$; therefore
$$1 = \sum_{j \in S} p^2_{ij} = \sum_{j \in C} p^2_{ij},$$
so $(P_C)^2$ is a stochastic matrix. Also, for $i, j \in C$,
$$p^2_{ij} = \sum_{r \in S} p_{ir}\, p_{rj} = \sum_{r \in C} p_{ir}\, p_{rj} = (i,j)\text{th entry of } (P_C)^2,$$
because $C$ is closed, so $p_{ir} = 0$ for $r \notin C$. In general, $(P^n)_C = (P_C)^n$. Hence one can consider the chain with state space $C$ and analyze it separately; this reduces the number of states.

3.1.6 Definition: Two states $i$ and $j$ are said to communicate if each is accessible from the other, i.e., $p^n_{ij} > 0$ and $p^m_{ji} > 0$ for some $m, n \geq 0$. In this case we write $i \leftrightarrow j$.

3.1.7 Proposition:
(i) For $i, j \in S$, say $i \sim j$ iff $i \leftrightarrow j$. Then $\sim$ is an equivalence relation on $S$.
(ii) Each equivalence class, called a communicating class, has no proper closed subsets.

Proof: (i) That $i \sim i$ follows from the fact that $P^0 = \mathrm{Id}$, hence $p^0_{ii} = 1$. Symmetry is obvious, and transitivity follows from Proposition 3.1.2(i).
(ii) Let $C$ be an equivalence class and let $A$ be a nonempty proper subset of $C$. Choose $j \in C \setminus A$ and $i \in A$. Then $i \leftrightarrow j$, so $j \notin A$ is accessible from $i \in A$. Hence $A$ is not closed.

3.1.8 Note: A communicating class need not be closed: it may be possible to start in one communicating class and enter another with positive probability. For example, consider a Markov chain on states $1, \ldots, 6$ with transition matrix
$$P = \begin{pmatrix}
1/2 & 1/2 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
1/3 & 0 & 0 & 1/3 & 1/3 & 0\\
0 & 0 & 0 & 1/2 & 1/2 & 0\\
0 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

The communicating classes are $\{1, 2, 3\}$, $\{4\}$ and $\{5, 6\}$. Clearly, $3 \to 4$ but $4 \not\to 3$. Only $\{5, 6\}$ is a closed subset.

3.1.9 Example: Consider a Markov chain with five states $\{1, 2, 3, 4, 5\}$ and transition matrix
$$P = \begin{pmatrix}
1/2 & 1/2 & 0 & 0 & 0\\
1/4 & 3/4 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 1/2 & 0 & 1/2\\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$
States 1 and 2 communicate with each other and with no other state. Similarly, states 3, 4, 5 communicate among themselves only. Thus, the state space splits into two closed irreducible sets $\{1, 2\}$ and $\{3, 4, 5\}$. For all practical purposes, analyzing the given Markov chain is the same as analyzing two smaller chains with smaller state spaces and transition matrices
$$P_1 = \begin{pmatrix} 1/2 & 1/2\\ 1/4 & 3/4 \end{pmatrix}, \qquad P_2 = \begin{pmatrix} 0 & 1 & 0\\ 1/2 & 0 & 1/2\\ 0 & 1 & 0 \end{pmatrix}.$$

3.1.10 Theorem: A closed set $C \subseteq S$ is irreducible if and only if $C$ contains no nonempty proper closed subset.

Proof: Suppose $C$ is closed and contains no nonempty proper closed subset. Fix $j \in C$ and define
$$C_j = \{i \in C : p^n_{ij} = 0 \text{ for all } n \geq 0\},$$
the set of states in $C$ that do not lead to $j$. We claim that $C_j$ is a closed set. Let $i \in C_j$ and let $k$ be a state with $p_{ik} > 0$; since $C$ is closed, $k \in C$. If $k \notin C_j$, there exists $m$ with $p^m_{kj} > 0$, and then
$$p^{m+1}_{ij} = \sum_{l \in S} p_{il}\, p^m_{lj} \geq p_{ik}\, p^m_{kj} > 0,$$
which is impossible for $i \in C_j$. Thus $p_{ik} = 0$ whenever $i \in C_j$ and $k \notin C_j$, i.e., $C_j$ is closed. Since $p^0_{jj} = 1$, we have $j \notin C_j$, so $C_j$ is a proper closed subset of $C$, and hence $C_j = \emptyset$: every state of $C$ leads to $j$. As $j \in C$ was arbitrary, any two states of $C$ lead to one another, i.e., $C$ is irreducible.

Conversely, let $C$ be irreducible and let $A \subseteq C$ be a nonempty closed set. If $A \neq C$, pick $i \in A$ and $j \in C \setminus A$; then $i \to j$ with $j \notin A$, contradicting the closedness of $A$. Hence $A = C$, i.e., $C$ contains no nonempty proper closed subset.

In view of Note 3.1.5, one would like to partition the state space into irreducible subsets.

Exercises

(3.1) Let the transition matrix of a Markov chain be given by
$$P = \begin{pmatrix}
1/2 & 0 & 0 & 1/2 & 0\\
1/2 & 0 & 1/3 & 0 & 1/6\\
0 & 0 & 1 & 0 & 0\\
1 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0
\end{pmatrix}.$$

Write the transition graph and find all the disjoint closed subsets of the state space $S = \{1, 2, 3, 4, 5\}$.

(3.2) Consider the Markov chain of Example 1.2.2, the random walk with absorbing barriers. Show that the state space splits into three irreducible sets. Is it possible to go from one set to another?

(3.3) For the queuing Markov chain of Section 1.3, write the transition matrix, and deduce that if $f(k) > 0$ for every $k$, then $S$ itself is irreducible.

(3.4) Let a Markov chain have transition matrix
$$P = \begin{pmatrix} 0 & 1 & 0\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{pmatrix}.$$
Show that it is an irreducible chain.

3.2 Periodic and aperiodic chains

Throughout this section, $\{X_n\}_{n \geq 0}$ is a Markov chain with state space $S$, initial probability $\Pi_0$ and transition matrix $P$.

3.2.1 Definition: A state $j$ is said to have period $d$ if $p^n_{jj} > 0$ implies that $d$ divides $n$, and $d$ is the largest such integer. In other words, the period of $j$ is the greatest common divisor of the numbers $\{n \geq 1 : p^n_{jj} > 0\}$. A state $j$ has period $d$ means that $p^n_{jj} = 0$ unless $n = md$ for some $m \geq 1$, and $d$ is the greatest positive integer with this property. Thus, the chain may come back to $j$ only at time points $md$; note, however, that it may never come back to the state $j$ at all.

3.2.2 Example: Consider a Markov chain on states $1, 2, 3, 4$ with transition matrix
$$P = \begin{pmatrix}
0 & 1 & 0 & 0\\
1/2 & 0 & 1/2 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0
\end{pmatrix}.$$
Now $p_{jj} = 0$ for every $j$, so the period of each state is greater than 1. In fact, each state has period 2, since $p^2_{jj} > 0$ while $p^n_{jj} = 0$ for odd $n$. But $\{3, 4\}$ forms a closed set, and once the particle enters $\{3, 4\}$ (say from state 2), it never comes out and returns to 2.

3.2.3 Definition: A state $j$ is called aperiodic if it has period 1. The chain is called aperiodic if every state of the chain has period 1. In an aperiodic chain, if $i \to j$, then $p^n_{ij} > 0$ for all sufficiently large $n$; i.e., the chain can reach any given state at all sufficiently late times.
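The period of a state can be computed as the gcd of the return times $n$ with $p^n_{jj} > 0$; for a finite chain, scanning $n$ up to a moderate bound suffices in practice. A sketch (my illustration) applied to the matrix of Example 3.2.2:

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, j, N=50):
    """gcd of {n <= N : p^n_jj > 0}; equals the period for moderate N."""
    Pn = np.eye(len(P))
    returns = []
    for n in range(1, N + 1):
        Pn = Pn @ P
        if Pn[j, j] > 1e-12:
            returns.append(n)
    return reduce(gcd, returns) if returns else 0

P = np.array([[0, 1, 0, 0],
              [0.5, 0, 0.5, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
print([period(P, j) for j in range(4)])   # each state has period 2
```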

3.2.4 Example: Consider a Markov chain whose transition graph (figure omitted in this transcription) is such that, starting in state 1, the state can be revisited only at stages $4, 6, 8, 10, \ldots$. Thus the state 1 has period 2, even though the chain can never return to 1 in exactly 2 steps.

3.2.5 Example (Birth and death chain): Consider a Markov chain on $S = \{0, 1, 2, \ldots\}$: starting at $i$, the chain can stay at $i$ or move to $i-1$ or $i+1$, with probabilities
$$p(i,j) = \begin{cases} q_i & \text{if } j = i-1,\\ r_i & \text{if } j = i,\\ p_i & \text{if } j = i+1,\\ 0 & \text{otherwise.} \end{cases}$$
Saying that it is an irreducible chain is the same as saying that $p_i > 0$ for all $i \geq 0$ and $q_i > 0$ for all $i > 0$. It will be aperiodic if $r_i > 0$ for some $i$; see Exercise (3.5) below. If $r_i = 0$ for all $i$, then the chain can return to $i$ only after an even number of steps, so the period of each state can only be a multiple of 2. Since $p^2_{00} = p_0 q_1 > 0$, every state then has period 2.

3.2.6 Theorem: If two states communicate with each other, then they have the same period.

Proof: Let $d_i$ be the period of $i$ and $d_j$ the period of $j$. It is enough to show that $d_i$ divides every $r$ with $p^r_{jj} > 0$; then $d_i \mid d_j$, and by symmetry $d_j \mid d_i$, so $d_i = d_j$. Since $i \leftrightarrow j$, there exist $n, m$ such that $p^m_{ij} > 0$ and $p^n_{ji} > 0$. By the Kolmogorov-Chapman equations, for every $r \geq 0$ with $p^r_{jj} > 0$,
$$p^{m+r+n}_{ii} \geq p^m_{ij}\, p^r_{jj}\, p^n_{ji} > 0.$$
This implies that $d_i$ divides $m+r+n$ for every such $r$. In particular, taking $r = 0$ (as $p^0_{jj} = 1$), $d_i$ divides $m+n$, and hence $d_i$ divides $r = (m+r+n) - (m+n)$.

Exercises

(3.5) Show that if a Markov chain is irreducible and $p_{ii} > 0$ for some state $i$, then it is aperiodic.

(3.6) Show that the queuing chain of Section 1.3 is aperiodic.

3.3 Visiting a state: transient and recurrent states

Let $i, j \in S$ be fixed, and consider the probability of the event that for some $n \geq 1$ the system visits the state $j$, given that it starts in the state $i$. Let
$$f^n_{ij} := P\{X_n = j,\ X_k \neq j \text{ for } 1 \leq k \leq n-1 \mid X_0 = i\}, \qquad n \geq 1;$$
i.e., $f^n_{ij}$ is the probability that the first visit to state $j$, starting at $i$, occurs at the $n$th step. We are interested in computing
$$f_{ij} := \sum_{n \geq 1} f^n_{ij}$$
in terms of the transition probabilities: $f_{ij}$ is the probability of an eventual visit to state $j$ starting from state $i$, i.e., that the system visits $j$ in some finite time. We define $f^0_{ii} = 0$ for all $i$, and note that $f^1_{ii} = p_{ii}$.

3.3.1 Proposition:
(i) $f^1_{ij} = p_{ij}$.
(ii) $f^{n+1}_{ij} = \sum_{r \neq j} p_{ir}\, f^n_{rj}$.
(iii) $p^n_{ij} = \sum_{k=1}^{n} f^k_{ij}\, p^{n-k}_{jj}$.
(iv) $p^n_{ii} = \sum_{k=1}^{n} f^k_{ii}\, p^{n-k}_{ii}$.
(v) $P\{\text{the system visits state } j \text{ at least 2 times} \mid X_0 = i\} = f_{ij}\, f_{jj}$. More generally,
$$P\{\text{the system visits state } j \text{ at least } m \text{ times} \mid X_0 = i\} = f_{ij}\, f_{jj}^{\,m-1}.$$

Proof: (i) Obvious.
(ii) Conditioning on the first step,
$$f^{n+1}_{ij} = \sum_{r \neq j} P\{\text{from } i \text{ to } r \text{ in one step}\}\; P\{\text{first visit to } j \text{ from } r \text{ in } n \text{ steps}\} = \sum_{r \neq j} p_{ir}\, f^n_{rj}.$$
(iii) Decomposing according to the time of the first visit to $j$,
$$p^n_{ij} = \sum_{m=1}^{n} P\{\text{first visit to } j \text{ at the } m\text{th step} \mid X_0 = i\}\; P\{X_n = j \mid X_m = j\} = \sum_{m=1}^{n} f^m_{ij}\, p^{n-m}_{jj}.$$
(iv) Follows from (iii) with $i = j$.

(v) Decomposing according to the time of the first visit and using the Markov property,
$$\begin{aligned}
P\{\text{visits } j \text{ at least 2 times} \mid X_0 = i\}
&= \sum_{k} \sum_{n} P\{\text{first visit to } j \text{ at time } k \mid X_0 = i\}\; P\{\text{first return to } j \text{ at time } k+n \mid X_k = j\}\\
&= \Big(\sum_k f^k_{ij}\Big)\Big(\sum_n f^n_{jj}\Big) = f_{ij}\, f_{jj}.
\end{aligned}$$
The general case follows similarly:
$$P\{\text{visits } j \text{ at least } m \text{ times} \mid X_0 = i\} = f_{ij}\, f_{jj}^{\,m-1}.$$

3.3.2 Definition:
(i) A state $i$ is called recurrent if $f_{ii} = 1$, i.e., with probability 1 the system comes back to $i$.
(ii) A state $i$ is called transient if $f_{ii} < 1$. Thus, the probability that the system starting at $i$ never comes back to $i$, namely $1 - f_{ii}$, is positive.

3.3.3 Theorem:
(i) The following statements are equivalent for a state $j$:
  (a) The state $j$ is transient.
  (b) $P\{\text{the system visits } j \text{ an infinite number of times} \mid X_0 = i\} = 0$.
  (c) $\sum_n p^n_{jj} < \infty$.
(ii) The following statements are equivalent for a state $j$:
  (a) The state $j$ is recurrent.
  (b) $P\{\text{the system visits } j \text{ an infinite number of times} \mid X_0 = j\} = 1$.
  (c) $\sum_n p^n_{jj} = \infty$.

Proof: (i) Using (v) of Proposition 3.3.1, we have
$$\begin{aligned}
P\{\text{visits } j \text{ infinitely often} \mid X_0 = i\}
&= \lim_{m \to \infty} P\{\text{at least } m \text{ visits to } j \mid X_0 = i\}\\
&= \lim_{m \to \infty} f_{ij}\, f_{jj}^{\,m-1} = f_{ij} \lim_{m \to \infty} f_{jj}^{\,m-1}.
\end{aligned}$$
Hence, $P\{\text{visits } j \text{ infinitely often} \mid X_0 = i\} = 0$ iff $f_{jj} < 1$. This shows that (b) holds iff (a) holds. Next suppose (c) holds, i.e., $\sum_n p^n_{jj} < \infty$. Then, by the Borel-Cantelli lemma, (b) holds.

Conversely, suppose (a) holds, i.e., $f_{jj} < 1$. We show that (c) holds. Using Proposition 3.3.1(iv), we have
$$\sum_{t=1}^{n} p^t_{jj} = \sum_{t=1}^{n} \sum_{s=1}^{t} f^s_{jj}\, p^{t-s}_{jj} = \sum_{s=1}^{n} f^s_{jj} \sum_{u=0}^{n-s} p^u_{jj} \leq f_{jj}\Big(1 + \sum_{t=1}^{n} p^t_{jj}\Big),$$
so that $(1 - f_{jj}) \sum_{t=1}^{n} p^t_{jj} \leq f_{jj}$. Thus, for every $n \geq 1$,
$$\sum_{t=1}^{n} p^t_{jj} \leq \frac{f_{jj}}{1 - f_{jj}},$$
implying (c), since $f_{jj} < 1$. This completely proves (i). The proof of (ii) follows from (i).

3.3.4 Example: Consider the unrestricted random walk on the integers, with probability $p$ of moving right and $q$ of moving left, $p + q = 1$. It is clearly an irreducible chain. Starting at 0, the walk can come back to 0 only in an even number of steps. Thus $p^{2n+1}_{00} = 0$, and $p^{2n}_{00} = P\{X_{2n} = 0 \mid X_0 = 0\}$. Starting from 0, to come back to 0 in $2n$ steps the walk must make $n$ moves to the left and $n$ to the right. Thus
$$p^{2n}_{00} = \binom{2n}{n} p^n q^n, \qquad \text{so} \qquad \sum_{n=0}^{\infty} p^n_{00} = \sum_{n=0}^{\infty} \binom{2n}{n} (pq)^n.$$
To decide whether the state 0 is transient or not, one has to know whether this series converges. Note that $\binom{2n}{n} = \frac{(2n)!}{n!\,n!}$, and by Stirling's formula, $n! \sim \sqrt{2\pi}\, n^{n+1/2} e^{-n}$, we have
$$\binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{n\pi}}, \qquad \text{hence} \qquad p^{2n}_{00} \sim \frac{(4pq)^n}{\sqrt{n\pi}}.$$
Since $pq = p(1-p) \leq 1/4$, with equality iff $p = q = 1/2$, we have, for $\theta = 4pq$,
$$\sum_{n \geq 0} p^{2n}_{00} \sim \sum_{n \geq 0} \frac{\theta^n}{\sqrt{n\pi}}, \qquad \text{with } \theta < 1 \text{ if } p \neq q \text{ and } \theta = 1 \text{ if } p = q = 1/2.$$

One knows that $\sum_n \theta^n/\sqrt{n}$ is finite for $\theta < 1$ and divergent for $\theta = 1$. Thus, 0 is a recurrent state iff $p = q = 1/2$. In fact, the same holds for every state $j$. If $p \neq q$, then intuitively the particle drifts to $-\infty$ or $+\infty$, and every state is transient.

3.3.5 Theorem: Let $i \to j$ and let $i$ be recurrent. Then,
(i) $f_{ji} = 1$, $j \to i$, and $j$ is recurrent;
(ii) $f_{ij} = 1$.

Proof: (i) Since $i \to j$, there exists $n \geq 1$ such that $p^n_{ij} > 0$. Let $n_0$ be the smallest positive integer such that $p^{n_0}_{ij} > 0$; then $p^m_{ij} = 0$ for $1 \leq m < n_0$. Since $p^{n_0}_{ij} > 0$, there exist states $i_1, i_2, \ldots, i_{n_0-1}$, none equal to $j$, such that
$$P\{X_{n_0} = j, X_{n_0-1} = i_{n_0-1}, \ldots, X_1 = i_1 \mid X_0 = i\} > 0. \tag{3.1}$$
By the minimality of $n_0$, none of the $i_k$ equals $i$ either, for otherwise $i$ would lead to $j$ in fewer than $n_0$ steps. Suppose $f_{ji} < 1$. Then $1 - f_{ji} > 0$, i.e.,
$$P\{\text{the system starts at } j \text{ and never visits } i\} > 0. \tag{3.2}$$
Therefore, by the Markov property,
$$\begin{aligned}
\alpha &:= P\{X_1 = i_1, \ldots, X_{n_0-1} = i_{n_0-1}, X_{n_0} = j, X_n \neq i \text{ for } n > n_0 \mid X_0 = i\}\\
&= P\{X_n \neq i \text{ for } n \geq n_0 + 1 \mid X_{n_0} = j\}\; P\{X_{n_0} = j, X_{n_0-1} = i_{n_0-1}, \ldots, X_1 = i_1 \mid X_0 = i\} > 0,
\end{aligned}$$
using equations (3.1) and (3.2). Thus
$$P\{X_n \neq i \text{ for every } n \geq 1 \mid X_0 = i\} \geq \alpha > 0,$$
i.e., starting at $i$, the system never comes back to $i$ with positive probability; so $i$ cannot be a recurrent state. Hence, if $i$ is recurrent, the assumption $f_{ji} < 1$ is untenable: $i$ recurrent implies $f_{ji} = 1$. But then $f_{ji} = \sum_{m \geq 1} f^m_{ji} = 1$, and hence for some $m$, $f^m_{ji} > 0$, i.e., with positive probability there is a first visit to $i$ starting from $j$. Hence $p^m_{ji} \geq f^m_{ji} > 0$, i.e., $j \to i$. Thus we have shown that $i \to j$ and $i$ recurrent imply $f_{ji} = 1$, and hence $j \to i$. Further,
$$p^{m+n+n_0}_{jj} = \sum_{r,k} p^m_{jr}\, p^n_{rk}\, p^{n_0}_{kj} \geq p^m_{ji}\, p^n_{ii}\, p^{n_0}_{ij}.$$

Using this,
$$\sum_{n \geq 1} p^n_{jj} \geq \sum_{n \geq m+1+n_0} p^n_{jj} = \sum_{n \geq 1} p^{m+n+n_0}_{jj} \geq p^m_{ji} \Big(\sum_{n \geq 1} p^n_{ii}\Big) p^{n_0}_{ij} = \infty,$$
because $p^m_{ji} > 0$, $p^{n_0}_{ij} > 0$, and $\sum_n p^n_{ii} = +\infty$. Thus $j$ is recurrent, proving (i).
(ii) Apply (i) with the roles of $i$ and $j$ interchanged.

3.3.6 Corollary: If $i \to j$ and $j \to i$, then either both states are transient or both are recurrent.

Proof: If $i$ is recurrent and $i \to j$, then $j$ is recurrent by the above theorem. If $i$ is transient and $j$ were recurrent, then since $j \to i$, the above theorem would make $i$ recurrent, which is not possible. Hence $i$ transient implies $j$ transient.

3.3.7 Corollary: Let $C \subseteq S$ be an irreducible set. Then either all states in $C$ are recurrent or all are transient. Further, if $C$ is a communicating class and all its states are recurrent, then $C$ is closed.

Proof: Since all states in $C$ communicate with each other, by Corollary 3.3.6 all states in $C$ are either transient or recurrent. Next, suppose $C$ is a communicating class with all states recurrent, and suppose $j \notin C$ with $i \to j$ for some $i \in C$. Then, by the theorem above, $j \to i$, and hence $j \in C$, which is not true. Hence $C$ is closed.

Hence we know how to characterize irreducible Markov chains.

3.3.8 Exercise: Show that if a state $j$ is transient, then $\sum_{n=1}^{\infty} p^n_{ij} < \infty$ for all $i$.

3.3.9 Theorem: Let $\{X_n\}_{n \geq 0}$ be an irreducible Markov chain with state space $S$ and transition probability matrix $P$. Then either
(i) all states are transient, in which case $\sum_{n \geq 0} p^n_{ij} < +\infty$ for all $i, j$ and
$$P\{X_n = j \text{ for infinitely many } n \mid X_0 = i\} = 0;$$
or
(ii) all states are recurrent, in which case $\sum_{n \geq 0} p^n_{ii} = +\infty$ for all $i$.
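Returning to Example 3.3.4, Theorem 3.3.3's series criterion can be probed numerically. The sketch below (my addition) computes partial sums of $p^{2n}_{00} = \binom{2n}{n}(pq)^n$ iteratively, to avoid overflowing on huge binomial coefficients; the partial sums keep growing for $p = 1/2$ but level off for $p \neq 1/2$.

```python
def partial_sum(p, N):
    """Partial sum of p^{2n}(0,0) = C(2n, n) (pq)^n for the walk on Z."""
    x = p * (1 - p)
    term, total = 1.0, 1.0                       # n = 0 term
    for n in range(N):
        term *= 2 * (2 * n + 1) / (n + 1) * x    # ratio of consecutive terms
        total += term
    return total

for p in (0.5, 0.45):
    print(p, [round(partial_sum(p, N), 2) for N in (10, 100, 1000)])
```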

3.3.10 Corollary: If $S$ is finite, then the chain has at least one recurrent state.

Proof: Suppose all states are transient. Then $\sum_{n \geq 0} p^n_{ij} < +\infty$ for all $i, j$, so in particular $\lim_{n \to \infty} p^n_{ij} = 0$. Hence, as $S$ is finite and each $P^n$ is a stochastic matrix,
$$0 = \lim_{n \to \infty} \sum_{j \in S} p^n_{ij} = 1,$$
a contradiction.

3.3.11 Corollary: In a finite irreducible chain, all states are recurrent.

3.3.12 Example: The two-state Markov chain with transition matrix
$$\begin{pmatrix} 1-p & p\\ q & 1-q \end{pmatrix}$$
is, for $p, q > 0$, irreducible and finite, and hence all its states are recurrent.

3.3.13 Example: Consider the chain discussed in Example 3.1.4, with transition matrix
$$P = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
1/4 & 1/2 & 1/4 & 0 & 0 & 0\\
0 & 1/5 & 2/5 & 1/5 & 0 & 1/5\\
0 & 0 & 0 & 1/6 & 1/3 & 1/2\\
0 & 0 & 0 & 1/2 & 0 & 1/2\\
0 & 0 & 0 & 1/4 & 0 & 3/4
\end{pmatrix}$$
on the states $0, 1, \ldots, 5$. Let us find its transient and recurrent states.
(i) 0 is an absorbing state, as $p_{00} = 1$, and hence is recurrent.
(ii) As observed earlier, $\{3, 4, 5\}$ is a finite, closed, irreducible set; hence, by Corollary 3.3.11, all of its states are recurrent.
(iii) If 2 were a recurrent state, then since $2 \to 0$, Theorem 3.3.5 would give $0 \to 2$, which is not true. Hence 2 is not recurrent and must be transient. Similarly, 1 is transient.
Thus we can write the state space as
$$S = \{1, 2\} \cup \{0\} \cup \{3, 4, 5\},$$
where the first set consists of transient states and the other two are irreducible sets of recurrent states.
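For a finite chain, the classification in Example 3.3.13 can be automated: find the communicating classes, and mark a class recurrent iff it is closed (Corollaries 3.3.7 and 3.3.11). A sketch under these assumptions, reusing the reachability idea from Section 3.1:

```python
import numpy as np

P = np.array([
    [1,   0,   0,   0,   0,   0  ],
    [1/4, 1/2, 1/4, 0,   0,   0  ],
    [0,   1/5, 2/5, 1/5, 0,   1/5],
    [0,   0,   0,   1/6, 1/3, 1/2],
    [0,   0,   0,   1/2, 0,   1/2],
    [0,   0,   0,   1/4, 0,   3/4],
])
n = len(P)
R = (P > 0) | np.eye(n, dtype=bool)
for k in range(n):                    # transitive closure (Warshall)
    R |= R[:, [k]] & R[[k], :]

# Communicating classes: i ~ j iff i -> j and j -> i.
classes = {frozenset(j for j in range(n) if R[i, j] and R[j, i])
           for i in range(n)}

# For a finite chain, a communicating class is recurrent iff it is closed,
# i.e., nothing outside the class is reachable from it.
for C in classes:
    closed = all(R[i, j] <= (j in C) for i in C for j in range(n))
    print(sorted(C), "recurrent" if closed else "transient")
```

Running this prints `[0] recurrent`, `[1, 2] transient`, and `[3, 4, 5] recurrent`, matching Example 3.3.13.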

3.3.14 Example: Let us find the transient/recurrent states for the chains with transition matrices
$$P = \begin{pmatrix} 0 & 1/2 & 1/2\\ 1/2 & 0 & 1/2\\ 1/2 & 1/2 & 0 \end{pmatrix}, \qquad
Q = \begin{pmatrix} 0 & 0 & 1/2 & 1/2\\ 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 0 \end{pmatrix}, \qquad
R = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 & 0\\ 1/2 & 1/2 & 0 & 0 & 0\\ 0 & 0 & 1/2 & 1/2 & 0\\ 0 & 0 & 1/2 & 1/2 & 0\\ 1/4 & 1/4 & 0 & 0 & 1/2 \end{pmatrix}.$$
The chain with transition matrix $P$ is finite and irreducible, and thus all its states are recurrent. The chain with transition matrix $Q$ is also finite and irreducible, hence recurrent. For the chain with transition matrix $R$ (on states $1, \ldots, 5$), $\{1, 2\}$ and $\{3, 4\}$ are closed irreducible sets and hence consist of recurrent states. Since $5 \to 1$ but $1 \not\to 5$, the state 5 cannot be recurrent; therefore 5 is transient. Once again we have the decomposition
$$S = \{5\} \cup \{1, 2\} \cup \{3, 4\},$$
where the first set consists of a transient state and the second and third are irreducible sets of recurrent states.

We saw in the above examples that the state space $S$ could be written as $S = S_T \cup C_1 \cup C_2 \cup \cdots$, where $S_T$ consists of all transient states and $C_1, C_2, \ldots$ are closed irreducible sets consisting of recurrent states. We show this is possible in general.

3.3.15 Proposition: For every recurrent state $i$ there exists a subset $C(i) \subseteq S$ such that the following hold:
(i) Each $C(i)$ is closed and irreducible.
(ii) Either $C(i_1) \cap C(i_2) = \emptyset$ or $C(i_1) = C(i_2)$.
(iii) $\bigcup_i C(i) = S_R$, the set of all recurrent states.

Proof: For $i \in S_R$, define $C(i) = \{j \in S : i \to j\}$. We prove that the sets $C(i)$ have the required properties.
(i) $i \in C(i)$, since $p^0_{ii} = 1$; hence $C(i) \neq \emptyset$. If $j \in C(i)$, then by Theorem 3.3.5, $j$ is recurrent and $j \to i$; hence $i \leftrightarrow j$. Thus any two states in $C(i)$ communicate with each other (through $i$), i.e., $C(i)$ is irreducible. If $k \notin C(i)$, then $i \not\to k$; so for $j \in C(i)$ we must have $j \not\to k$, for if $j \to k$, then $i \to j \to k$, i.e., $i \to k$. Therefore $C(i)$ is closed.
(ii) If $j \in C(i_1) \cap C(i_2)$, then for every $k \in C(i_1)$ we have $k \leftrightarrow i_1 \leftrightarrow j \leftrightarrow i_2$, hence $k \in C(i_2)$; thus $C(i_1) \subseteq C(i_2)$. Similarly, $C(i_2) \subseteq C(i_1)$.
(iii) is obvious, since every recurrent state $i$ lies in $C(i)$ and, by Theorem 3.3.5, every $j \in C(i)$ is recurrent.