Continuous-time Markov Chains Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ October 23, 2017 Introduction to Random Processes Continuous-time Markov Chains 1
Exponential random variables Exponential random variables Counting processes and definition of Poisson processes Properties of Poisson processes Introduction to Random Processes Continuous-time Markov Chains 2
Exponential distribution Exponential RVs often model times at which events occur Or time elapsed between occurrence of random events RV T exp(λ) is exponential with parameter λ if its pdf is f T (t) = λe λt, for all t 0 Cdf, integral of the pdf, is F T (t) = P (T t) = 1 e λt Complementary (c)cdf is P(T t) = 1 F T (t) = e λt 1 pdf (λ = 1) cdf (λ = 1) 1 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 0 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 Introduction to Random Processes Continuous-time Markov Chains 3
Expected value Expected value of time T exp(λ) is E [T ] = 0 tλe λt dt = te λt + 0 Integrated by parts with u = t, dv = λe λt dt 0 e λt dt = 0 + 1 λ Mean time is inverse of parameter λ λ is rate/frequency of events happening at intervals T Interpret: Average of λt events by time t Bigger λ smaller expected times, larger frequency of events T 1 T 2 S 1 S 2 t = 0 T 3 S 3 T 4 S 4 T 5 T 6 S 5 S 6 t = 5/λ T 7 T 8 S 7 S 8 T 9 T 10 S 9 S 10 t t = 10/λ Introduction to Random Processes Continuous-time Markov Chains 4
Second moment and variance For second moment also integrate by parts (u = t 2, dv = λe λt dt) E [ T 2] = t 2 λe λt dt = t 2 e λt + 0 0 0 First term is 0, second is (2/λ)E [T ] 2te λt dt E [ T 2] = 2 λ 0 tλe λt = 2 λ 2 The variance is computed from the mean and second moment var [T ] = E [ T 2] E 2 [T ] = 2 λ 2 1 λ 2 = 1 λ 2 Parameter λ controls mean and variance of exponential RV Introduction to Random Processes Continuous-time Markov Chains 5
Memoryless random times Def: Consider random time T. We say time T is memoryless if P ( T > s + t T > t ) = P (T > s) Probability of waiting s extra units of time (e.g., seconds) given that we waited t seconds, is just the probability of waiting s seconds System does not remember it has already waited t seconds Same probability irrespectively of time already elapsed Ex: Chemical reaction A + B AB occurs when molecules A and B collide. A, B move around randomly. Time T until reaction Introduction to Random Processes Continuous-time Markov Chains 6
Exponential RVs are memoryless Write memoryless property in terms of joint pdf P ( T > s + t T > t ) = P (T > s + t, T > t) P (T > t) = P (T > s) Notice event {T > s + t, T > t} is equivalent to {T > s + t} Replace P (T > s + t, T > t) = P (T > s + t) and reorder P (T > s + t) = P (T > t)p (T > s) If T exp(λ), ccdf is P (T > t) = e λt so that P (T > s + t) = e λ(s+t) = e λt e λs = P (T > t) P (T > s) If random time T is exponential T is memoryless Introduction to Random Processes Continuous-time Markov Chains 7
Continuous memoryless RVs are exponential Consider a function g(t) with the property g(t + s) = g(t)g(s) Q: Functional form of g(t)? Take logarithms log g(t + s) = log g(t) + log g(s) Only holds for all t and s if log g(t) = ct for some constant c Which in turn, can only hold if g(t) = e ct for some constant c Compare observation with statement of memoryless property P (T > s + t) = P (T > t) P (T > s) It must be P (T > t) = e ct for some constant c T continuous: only true for exponential T exp( c) T discrete: only geometric P (T > t) = (1 p) t with (1 p) = e c If continuous random time T is memoryless T is exponential Introduction to Random Processes Continuous-time Markov Chains 8
Main memoryless property result Theorem A continuous random variable T is memoryless if and only if it is exponentially distributed. That is P ( T > s + t T > t ) = P (T > s) if and only if f T (t) = λe λt for some λ > 0 Exponential RVs are memoryless. Do not remember elapsed time Only type of continuous memoryless RVs Discrete RV T is memoryless if and only of it is geometric Geometrics are discrete approximations of exponentials Exponentials are continuous limits of geometrics Exponential = time until success Geometric = nr. trials until success Introduction to Random Processes Continuous-time Markov Chains 9
Exponential times example First customer s arrival to a store takes T exp(1/10) minutes Suppose 5 minutes have passed without an arrival Q: How likely is it that the customer arrives within the next 3 mins.? Use memoryless property of exponential T P ( T 8 T > 5 ) = 1 P ( T > 8 T > 5 ) = 1 P (T > 3) = 1 e 3λ = 1 e 0.3 Q: How likely is it that the customer arrives after time T = 9? P ( T > 9 T > 5 ) = P (T > 4) = e 4λ = e 0.4 Q: What is the expected additional time until the first arrival? Additional time is T 5, and P ( T 5 > t T > 5 ) = P (T > t) E [ T 5 T > 5 ] = E [T ] = 1/λ = 10 Introduction to Random Processes Continuous-time Markov Chains 10
Time to first event Independent exponential RVs T 1, T 2 with parameters λ 1, λ 2 Q: Prob. distribution of time to first event, i.e., T := min(t 1, T 2 )? To have T > t we need both T 1 > t and T 2 > t Using independence of T 1 and T 2 we can write P (T > t) = P (T 1 > t, T 2 > t) = P (T 1 > t) P (T 2 > t) Substituting expressions of exponential ccdfs P (T > t) = e λ1t e λ2t = e (λ1+λ2)t T := min(t 1, T 2 ) is exponentially distributed with parameter λ 1 + λ 2 In general, for n independent RVs T i exp(λ i ) define T := min i T i T is exponentially distributed with parameter n i=1 λ i Introduction to Random Processes Continuous-time Markov Chains 11
First event to happen Q: Prob. P (T 1 < T 2 ) of T 1 exp(λ 1 ) happening before T 2 exp(λ 2 )? Condition on T 2 = t, integrate over the pdf of T 2, independence P (T 1 < T 2 ) = 0 P ( T 1 < t T 2 = t ) f T2 (t) dt = Substitute expressions for exponential pdf and cdf P (T 1 < T 2 ) = 0 (1 e λ1t )λ 2 e λ2t dt = λ 1 λ 1 + λ 2 Either T 1 comes before T 2 or vice versa, hence P (T 2 < T 1 ) = 1 P (T 1 < T 2 ) = λ 2 λ 1 + λ 2 Probabilities are relative values of rates (parameters) Larger rate smaller average higher prob. happening first 0 F T1 (t)f T2 (t) dt Introduction to Random Processes Continuous-time Markov Chains 12
Additional properties of exponential RVs Consider n independent RVs T i exp(λ i ). T i time to i-th event Probability of j-th event happening first ( ) λ P T j = min T i = j n i i=1 λ, j = 1,..., n i Time to first event and rank ordering of events are independent ( ) ( ) P min T i t, T i1 <... < T in = P min T i t P (T i1 <... < T in ) i i Suppose T exp(λ), independent of non-negative RV X Strong memoryless property asserts P ( T > X + t T > X ) = P (T > t) Also forgets random but independent elapsed times Introduction to Random Processes Continuous-time Markov Chains 13
Strong memoryless property example Independent customer arrival times T i exp(λ i ), i = 1,..., 3 Suppose customer 3 arrives first, i.e., min(t 1, T 2 ) > T 3 Q: Probability that time between first and second arrival exceeds t? We want to compute P ( min(t 1, T 2 ) T 3 > t ) min(t1, T 2 ) > T 3 Denote T i2 := min(t 1, T 2 ) the time to second arrival Recall T i2 exp(λ 1 + λ 2 ), T i2 independent of T i1 = T 3 Apply the strong memoryless property P ( T i2 T 3 > t ) T i2 > T 3 = P (Ti2 > t) = e (λ1+λ2)t Introduction to Random Processes Continuous-time Markov Chains 14
Probability of event in infinitesimal time Q: Probability of an event happening in infinitesimal time h? Want P (T < h) for small h Equivalent to P (T < h) = P (T < t) t h 0 λe λt dt λh = λ t=0 Sometimes also write P (T < h) = λh + o(h) o(h) o(h) implies lim = 0 h 0 h Read as negligible with respect to h Q: Two independent events in infinitesimal time h? P (T 1 h, T 2 h) (λ 1 h)(λ 2 h) = λ 1 λ 2 h 2 = o(h) Introduction to Random Processes Continuous-time Markov Chains 15
Counting and Poisson processes Exponential random variables Counting processes and definition of Poisson processes Properties of Poisson processes Introduction to Random Processes Continuous-time Markov Chains 16
Counting processes Random process N(t) in continuous time t R + Def: Counting process N(t) counts number of events by time t Nonnegative integer valued: N(0) = 0, N(t) {0, 1, 2,...} Nondecreasing: N(s) N(t) for s < t Event counter: N(t) N(s) = number of events in interval (s, t] N(t) continuous from the right N(S4) N(S 2) = 2, while N(S 4) N(S 2 ɛ) = 3 for small ɛ > 0 N(t) Ex.1: # text messages (SMS) typed 6 since beginning of class 5 4 Ex.2: # economic crises since 1900 3 2 Ex.3: # customers at Wegmans since 1 8 am this morning t S 1 S 2 S 3 S 4 S 5 S 6 Introduction to Random Processes Continuous-time Markov Chains 17
Independent increments Consider times s 1 < t 1 < s 2 < t 2 and intervals (s 1, t 1 ] and (s 2, t 2 ] N(t 1 ) N(s 1 ) events occur in (s 1, t 1 ] N(t 2 ) N(s 2 ) events occur in (s 2, t 2 ] Def: Independent increments implies latter two are independent P (N(t 1 ) N(s 1 ) = k, N(t 2 ) N(s 2 ) = l) = P (N(t 1 ) N(s 1 ) = k) P (N(t 2 ) N(s 2 ) = l) Number of events in disjoint time intervals are independent Ex.1: Likely true for SMS, except for have to send messages Ex.2: Most likely not true for economic crises (business cycle) Ex.3: Likely true for Wegmans, except for unforeseen events (storms) Does not mean N(t) independent of N(s), say for t > s These events are clearly dependent, since N(t) is at least N(s) Introduction to Random Processes Continuous-time Markov Chains 18
Stationary increments Consider time intervals (0, t] and (s, s + t] N(t) events occur in (0, t] N(s + t) N(s) events in (s, s + t] Def: Stationary increments implies latter two have same prob. dist. P (N(s + t) N(s) = k) = P (N(t) = k) Prob. dist. of number of events depends on length of interval only Ex.1: Likely true if lecture is good and you keep interest in the class Ex.2: Maybe true if you do not believe we become better at preventing crises Ex.3: Most likely not true because of, e.g., rush hours and slow days Introduction to Random Processes Continuous-time Markov Chains 19
Poisson process Def: A counting process N(t) is a Poisson process if (a) The process has stationary and independent increments (b) The number of events in (0, t] has Poisson distribution with mean λt P (N(t) = n) = e λt (λt) n An equivalent definition is the following (i) The process has stationary and independent increments (ii) Single event in infinitesimal time P (N(h) = 1) = λh + o(h) (iii) Multiple events in infinitesimal time P (N(h) > 1) = o(h) A more intuitive definition (even hard to grasp now) Conditions (i) and (a) are the same That (b) implies (ii) and (iii) is obvious Substitute small h in Poisson pmf s expression for P (N(t) = n) To see that (ii) and (iii) imply (b) requires some work n! Introduction to Random Processes Continuous-time Markov Chains 20
Explanation of model (i)-(iii) Consider time T and divide interval (0, T ] in n subintervals Subintervals are of duration h = T /n, h vanishes as n increases The m-th subinterval spans ( (m 1)h, mh ] Define A m as the number of events that occur in m-th subinterval A m = N ( mh ) N ( (m 1)h ) The total number of events in (0, T ] is the sum of A m, m = 1,..., n N(T ) = n A m= m=1 n N ( mh ) N ( (m 1)h ) In figure, N(T ) = 5, A 1, A 2, A 4, A 7, A 8 are 1 and A 3, A 5, A 6 are 0 m=1 S1 S2 S3 S4 S5 t h h h h h h h h t = 0 t = T Introduction to Random Processes Continuous-time Markov Chains 21
Probability distribution of A m (intuitive arg.) Note first that since increments are stationary as per (i), it holds P (A m = k) = P ( N ( mh ) N ( (m 1)h ) = k ) = P (N(h) = k) In particular, using (ii) and (iii) P (A m = 1) = P (N(h) = 1) = λh + o(h) P (A m > 1) = P (N(h) > 1) = o(h) Set aside o(h) probabilities They re negligible with respect to λh P (A m = 1) = λh P (A m = 0) = 1 λh A m is Bernoulli with parameter λh S1 S2 S3 S4 S5 t h h h h h h h h t = 0 t = T Introduction to Random Processes Continuous-time Markov Chains 22
Probability distribution of N(T ) (intuitive arg.) Since increments are also independent as per (i), A m are independent N(T ) is sum of n independent Bernoulli RVs with parameter λh N(T ) is binomial with parameters (n, λh) = (n, λt /n) As interval length h 0, number of intervals n The product n λh = λt stays constant N(T ) is Poisson with parameter λt (Law of rare events) Then (ii)-(iii) imply (b) and definitions are equivalent Not a proof because we neglected o(h) terms But explains what a Poisson process is S1 S2 S3 S4 S5 t h h h h h h h h t = 0 t = T Introduction to Random Processes Continuous-time Markov Chains 23
What is a Poisson process? Fundamental defining properties of a Poisson process Events happen in small interval h with probability λh proportional to h Whether event happens in an interval has no effect on other intervals Modeling questions Q1: Expect probability of event proportional to length of interval? Q2: Expect subsequent intervals to behave independently? If positive, then a Poisson process model is appropriate Typically arise in a large population of agents acting independently Larger interval, larger chance an agent takes an action Action of one agent has no effect on action of other agents Has therefore negligible effect on action of group Introduction to Random Processes Continuous-time Markov Chains 24
Examples of Poisson processes Ex.1: Number of people arriving at subway station. Number of cars arriving at a highway entrance. Number of customers entering a store... Large number of agents (people, drivers, customers) acting independently Ex.2: SMS generated by all students in the class. Once you send an SMS you are likely to stay silent for a while. But in a large population this has a minimal effect in the probability of someone generating a SMS Ex.3: Count of molecule reactions. Molecules are removed from pool of reactants once they react. But effect is negligible in large population. Eventually reactants are depleted, but in small time scale process is approximately Poisson Introduction to Random Processes Continuous-time Markov Chains 25
Formal argument to show equivalence Define A max := max (A m), maximum nr. of events in one interval m=1,...,n If A max 1 all intervals have 0 or 1 events. Consider probability P ( N(T ) = k Amax 1 ) For given h, N(T ) conditioned on A max 1 is binomial Parameters are n = T /h and p = λh + o(h) Interval length h 0 Parameter p 0, nr. of intervals n Product np lim h 0 np = lim h 0 (T /h)(λh + o(h)) = λt N(T ) conditioned on A max 1 is Poisson with parameter λt P ( N(T ) = k Amax 1 ) λt (λt )k = e k! Introduction to Random Processes Continuous-time Markov Chains 26
Formal argument to show equivalence (continued) Separate study in A max 1 and A max > 1. That is, condition P (N(T ) = k) = P ( N(T ) = k A max 1 ) P (A max 1) + P ( N(T ) = k A max > 1 ) P (A max > 1) Property (iii) implies that P (A max > 1) vanishes as h 0 P (A max > 1) n m=1 P (A m > 1) = no(h) = T o(h) h 0 Thus, as h 0, P (A max > 1) 0 and P (A max 1) 1. Then lim P (N(T ) = k) = lim P ( N(T ) = k Amax 1 ) h 0 h 0 Right-hand side is Poisson N(T ) Poisson with parameter λt Introduction to Random Processes Continuous-time Markov Chains 27
Properties of Poisson processes Exponential random variables Counting processes and definition of Poisson processes Properties of Poisson processes Introduction to Random Processes Continuous-time Markov Chains 28
Arrival times and interarrival times N(t) 6 5 4 3 2 1 T 1 T 2 T 3 T 4 T 5 T 6 S 1 S 2 S 3 S 4 S 5 S 6 t Let S 1, S 2,... be the sequence of absolute times of events (arrivals) Def: S i is known as the i-th arrival time Notice that S i = min t (N(t) i) Let T 1, T 2,... be the sequence of times between events Def: T i is known as the i-th interarrival time Useful identities: S i = ik=1 T k and T i = S i S i 1, where S 0 = 0 Introduction to Random Processes Continuous-time Markov Chains 29
Interarrival times are i.i.d. exponential RVs Ccdf of T 1 P (T 1 > t) = P (N(t) = 0) = e λt T 1 has exponential distribution with parameter λ Since increments are stationary and independent, likely T i are i.i.d. Theorem Interarrival times T i of a Poisson process are independent identically distributed exponential random variables with parameter λ, i.e., P (T i > t) = e λt Have already proved for T 1. Let us see the rest Introduction to Random Processes Continuous-time Markov Chains 30
Interarrival times are i.i.d. exponential RVs Proof. Recall S i is i-th arrival time. Condition on S i P (T i+1 > t) = P ( T i+1 > t Si = s ) f Si (s) ds To have T i+1 > t given that S i = s it must be N(s + t) = N(s) P ( T i+1 > t S i = s ) = P ( N(t + s) N(s) = 0 N(s) = i ) Since increments are independent, drop conditioning on N(s) = i P ( T i+1 > t Si = s ) = P (N(t + s) N(s) = 0) Since increments are also stationary and N(t) is Poisson, then P ( T i+1 > t Si = s ) = P (N(t) = 0) = e λt Substituting into integral yields P (T i+1 > t) = e λt Introduction to Random Processes Continuous-time Markov Chains 31
Interarrival times example Let N 1 (t) and N 2 (t) be Poisson processes with rates λ 1 and λ 2 Suppose N 1 (t) and N 2 (t) are independent Q: What is the expected time till the first arrival from either process? Denote as S (i) 1 the first arrival time from process i = 1, 2 [ ( )] We are looking for E min S (1) 1, S (2) 1 Note that S (1) 1 = T (1) 1 and S (2) 1 = T (2) 1 S (1) 1 exp(λ 1 ) and S (2) 1 exp(λ 2 ) Also, S (1) 1 and S (2) 1 are independent ( ) Recall that min S (1) 1, S (2) 1 exp(λ 1 + λ 2 ), then [ ( )] E min S (1) 1, S (2) 1 1 = λ 1 + λ 2 Introduction to Random Processes Continuous-time Markov Chains 32
Alternative definition of Poisson process Start with sequence of independent random times T 1, T 2,... Times T i exp(λ) have exponential distribution with parameter λ Define i-th arrival time S i S i = T 1 + T 2 +... + T i Define counting process of events occurred by time t N(t) = max(s i t) i N(t) is a Poisson process N(t) T 1 T 2 T 3 T 4 T 5 T 6 6 5 4 3 2 1 S 1 S 2 S 3 S 4 S 5 S 6 t If N(t) is a Poisson process interarrival times T i are i.i.d. exponential To show that definition is equivalent have to show the converse If interarrival times are i.i.d. exponential, process is Poisson Introduction to Random Processes Continuous-time Markov Chains 33
Alternative definition of Poisson process (cont.) Exponential i.i.d. interarrival times Q: Poisson process? Show that implies definition (i)-(iii) Stationarity true because all T i have same distribution Independent increments true because Interarrival times are independent Exponential RVs are memoryless Can have more than one event in (0, h] only if T 1 < h and T 2 < h P (N(h) > 1) P (T 1 h) P (T 2 h) We have no event in (0, h] if T 1 > h = (1 e λh ) 2 = (λh) 2 + o(h 2 ) = o(h) P (N(h) = 0) = P (T 1 h) = e λh = 1 λh + o(h) The remaining case is N(h) = 1, whose probability is P (N(h) = 1) = 1 P (N(h) = 0) P (N(h) > 1) = λh + o(h) Introduction to Random Processes Continuous-time Markov Chains 34
Three definitions of Poisson processes Def. 1: Prob. of event proportional to interval width. Intervals independent Physical model definition Can a phenomenon be reasonably modeled as a Poisson process? The other two definitions are used for analysis and/or simulation Def. 2: Prob. distribution of events in (0, t] is Poisson Event centric definition. Nr. of events in given time intervals Allows analysis and simulation Used when information about nr. of events in given time is desired Def. 3: Prob. distribution of interarrival times is exponential Time centric definition. Times at which events happen Allows analysis and simulation Used when information about event times is of interest Introduction to Random Processes Continuous-time Markov Chains 35
Number of visitors to a web page example Ex: Count number of visits to a webpage between 6:00pm to 6:10pm Def 1: Q: Poisson process? Yes, seems reasonable to have Probability of visit proportional to time interval duration Independent visits over disjoint time intervals Model as Poisson process with rate λ visits/second (v/s) Def 2: Arrivals in (s, s + t] are Poisson with parameter λt Prob. of exactly 5 visits in 1 sec? P (N(1) = 5) = e λ λ 5 /5! Expected nr. of visits in 10 minutes? E [N(600)] = 600λ On average, data shows N visits in 10 minutes. Estimate ˆλ = N/600 Def 3: Interarrival times are i.i.d. T i exp(λ) Expected time between visits? E [T i ] = 1/λ Expected arrival time S n of n-th visitor? Recall S n = n i=1 T i, hence E [S n ] = n i=1 E [T i] = n/λ Introduction to Random Processes Continuous-time Markov Chains 36
Superposition of Poisson processes Let N 1 (t), N 2 (t) be Poisson processes with rates λ 1 and λ 2 Suppose N 1 (t) and N 2 (t) are independent N 1 (t) 2 1 N 2 (t) 3 2 1 t S 1 S 1 S 2 S 3 S 2 t Then N(t) := N 1 (t) + N 2 (t) is a Poisson process with rate λ 1 + λ 2 N(t) 5 4 3 2 1 S 1 S 2 S 3 S 4 S 5 t Introduction to Random Processes Continuous-time Markov Chains 37
Thinning of a Poisson process Let B N = B 1, B 2,... be an i.i.d. sequence of Bernoulli(p) RVs Let N(t) be a Poisson process with rate λ, independent of B N Then M(t) := N(t) i=1 B i is a Poisson process with rate λp B i : 0 1 1 0 1 N(t) M(t) 5 4 3 2 1 S 1 S 2 S 3 S 4 S 5 t 3 2 1 S 1 S 2 S 3 t Introduction to Random Processes Continuous-time Markov Chains 38
Splitting of a Poisson process Let Z N = Z 1, Z 2,... be an i.i.d. sequence of RVs, Z i {1,..., m} Let N(t) be a Poisson process with rate λ, independent of Z N Define N k (t) = N(t) i=1 I {Z i = k}, for each k = 1,..., m Then each N k (t) is a Poisson process with rate λp (Z 1 = k) N 1 (t) Z i : 1 2 2 3 2 1 t N(t) 5 4 3 2 1 S 1 S 2 S 3 S 4 S 5 t N 2 (t) 3 2 1 N 3 (t) 1 S 1 S 1 S 2 S 3 S 1 t t Introduction to Random Processes Continuous-time Markov Chains 39
Glossary Random times Exponential distribution Memoryless random times Time to first event First event to happen Strong memoryless property Event in infinitesimal interval Continuous-time process Counting process Right-continuous function Poisson process Independent increments Stationary increments Equivalent definitions Arrival times Interarrival times Event and time centric Superposition of Poisson processes Thinning of a Poisson process Splitting of a Poisson process Introduction to Random Processes Continuous-time Markov Chains 40