DS-GA 1002 Lecture Notes 5, Fall 2016: Random processes

1 Introduction

Random processes, also known as stochastic processes, allow us to model quantities that evolve in time (or space) in an uncertain way: the trajectory of a particle, the price of oil, the temperature in New York, the national debt of the United States, etc. In these notes we introduce a mathematical framework that allows us to reason probabilistically about such quantities.

2 Definition

We denote random processes using a tilde over an upper case letter, $\tilde{X}$. This is not standard notation, but we want to emphasize the difference with random variables and random vectors. Formally, a random process $\tilde{X}$ is a function that maps elements in a sample space $\Omega$ to real-valued functions.

Definition 2.1 (Random process). Given a probability space $(\Omega, \mathcal{F}, \mathrm{P})$, a random process $\tilde{X}$ is a function that maps each element $\omega$ in the sample space $\Omega$ to a function $\tilde{X}(\omega, \cdot) : T \to \mathbb{R}$, where $T$ is a discrete or continuous set.

There are two possible interpretations for $\tilde{X}(\omega, t)$:

If we fix $\omega$, then $\tilde{X}(\omega, t)$ is a deterministic function of $t$ known as a realization of the random process.

If we fix $t$, then $\tilde{X}(\omega, t)$ is a random variable, which we usually just denote by $\tilde{X}(t)$. We can consequently interpret $\tilde{X}$ as an infinite collection of random variables indexed by $t$.

The set of possible values that the random variable $\tilde{X}(t)$ can take for fixed $t$ is called the state space of the random process. Random processes can be classified according to their indexing variable or to their state space.

If the indexing variable $t$ is defined on $\mathbb{R}$, or on a semi-infinite interval $(t_0, \infty)$ for some $t_0 \in \mathbb{R}$, then $\tilde{X}$ is a continuous-time random process.

If the indexing variable $t$ is defined on a discrete set, usually the integers or the natural numbers, then $\tilde{X}$ is a discrete-time random process. In such cases we often use a different letter from $t$, such as $i$, as an indexing variable.

If $\tilde{X}(t)$ is a discrete random variable for all $t$, then $\tilde{X}$ is a discrete-state random process. If the discrete random variable takes a finite number of values that is the same for all $t$, then $\tilde{X}$ is a finite-state random process.

If $\tilde{X}(t)$ is a continuous random variable for all $t$, then $\tilde{X}$ is a continuous-state random process.

Note that there are continuous-state discrete-time random processes and discrete-state continuous-time random processes. Any combination is possible.

The underlying probability space $(\Omega, \mathcal{F}, \mathrm{P})$ mentioned in the definition completely determines the stochastic behavior of the random process. In principle we can specify random processes by defining the probability space $(\Omega, \mathcal{F}, \mathrm{P})$ and the mapping from elements in $\Omega$ to continuous or discrete functions, as illustrated in the following example. As we will discuss later on, this way of specifying random processes is only tractable for very simple cases.

Example 2.2 (Puddle). Bob asks Mary to model a puddle probabilistically. When the puddle is formed, it contains an amount of water that is distributed uniformly between 0 and 1 gallon. As time passes, the water evaporates. After a time interval $t$ the water that is left is $1 + t$ times less than the initial quantity.

Mary models the water in the puddle as a continuous-state continuous-time random process $\tilde{C}$. The underlying sample space is $(0, 1)$, the $\sigma$-algebra is the corresponding Borel $\sigma$-algebra (generated by all possible countable unions of intervals in $(0, 1)$) and the probability measure is the uniform probability measure on $(0, 1)$. For a particular element in the sample space $\omega \in (0, 1)$,

$\tilde{C}(\omega, t) := \frac{\omega}{1 + t}, \quad t \in [0, \infty),$   (1)

where the unit of $t$ is days in this example. Figure 1 shows different realizations of the random process. Each realization is a deterministic function on $[0, \infty)$.

Bob points out that he only cares what the state of the puddle is each day, as opposed to at any time $t$. Mary decides to simplify the model by using a continuous-state discrete-time random process $\tilde{D}$. The underlying probability space is exactly the same as before, but the time index is now discrete. For a particular element in the sample space $\omega \in (0, 1)$,

$\tilde{D}(\omega, i) := \frac{\omega}{1 + i}, \quad i = 0, 1, 2, \ldots$   (2)

Figure 1 shows different realizations of this random process. Note that each realization is just a deterministic discrete sequence.
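Because Example 2.2 specifies the probability space explicitly, it can be simulated directly: draw $\omega$ uniformly at random and evaluate the deterministic maps (1) and (2). A minimal Python sketch (the seed and variable names are our own choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

t = np.linspace(0, 10, 200)   # continuous time grid (days)
i = np.arange(0, 11)          # discrete time index

for _ in range(3):
    omega = rng.uniform(0, 1)   # draw one element of the sample space
    C = omega / (1 + t)         # realization of the continuous-time process
    D = omega / (1 + i)         # realization of the discrete-time process
    print(f"omega = {omega:.2f}: D(omega, i) for i = 0..3 -> {np.round(D[:4], 3)}")
```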

Figure 1: Realizations of the continuous-time (left) and discrete-time (right) random process defined in Example 2.2.

When working with random processes in a probabilistic model, we are often interested in the joint distribution of the process sampled at several fixed times. This is given by the nth-order distribution of the random process.

Definition 2.3 (nth-order distribution). The nth-order distribution of a random process $\tilde{X}$ is the joint distribution of the random variables $\tilde{X}(t_1), \tilde{X}(t_2), \ldots, \tilde{X}(t_n)$ for any $n$ samples $\{t_1, t_2, \ldots, t_n\}$ of the time index $t$.

Example 2.4 (Puddle (continued)). The first-order cdf of $\tilde{C}(t)$ in Example 2.2 is

$F_{\tilde{C}(t)}(x) := \mathrm{P}(\tilde{C}(t) \le x)$   (3)
$= \mathrm{P}(\omega \le (1 + t)\,x)$   (4)
$= \begin{cases} 1 & \text{if } x > \frac{1}{1+t}, \\ \int_{u=0}^{(1+t)x} du = (1 + t)\,x & \text{if } 0 \le x \le \frac{1}{1+t}, \\ 0 & \text{if } x < 0. \end{cases}$   (5)

We obtain the first-order pdf by differentiating:

$f_{\tilde{C}(t)}(x) = \begin{cases} 1 + t & \text{if } 0 \le x \le \frac{1}{1+t}, \\ 0 & \text{otherwise.} \end{cases}$   (6)
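As a sanity check, (5) says that for fixed $t$ the random variable $\tilde{C}(t)$ is uniform on $[0, 1/(1+t)]$. A short Monte Carlo comparison of the empirical cdf with the formula, under the same model assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
t = 2.0   # fix a time; C(t) should be uniform on [0, 1/(1+t)] = [0, 1/3]
samples = rng.uniform(0, 1, size=100_000) / (1 + t)

# Empirical cdf at a few points vs the formula F(x) = (1 + t) x
for x in [0.05, 0.15, 0.25, 0.35]:
    empirical = np.mean(samples <= x)
    exact = min(max((1 + t) * x, 0.0), 1.0)
    print(f"x = {x:.2f}: empirical {empirical:.3f} vs exact {exact:.3f}")
```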

If the nth-order distribution of a random process is shift-invariant, then the process is said to be strictly or strongly stationary.

Definition 2.5 (Strictly/strongly stationary process). A process is stationary in a strict or strong sense if for any $n$, if we select $n$ samples $t_1, t_2, \ldots, t_n$ and any displacement $\tau$, the random variables $\tilde{X}(t_1), \tilde{X}(t_2), \ldots, \tilde{X}(t_n)$ have the same joint distribution as $\tilde{X}(t_1 + \tau), \tilde{X}(t_2 + \tau), \ldots, \tilde{X}(t_n + \tau)$.

The random processes in Example 2.2 are clearly not strictly stationary because their first-order distributions are not the same at every time. An important example of strictly stationary processes are independent identically-distributed sequences, presented in Section 4.1.

As in the case of random variables and random vectors, defining the underlying probability space in order to specify a random process is usually not very practical, except for very simple cases like the one in Example 2.2. The reason is that it is challenging to come up with a probability space that gives rise to a given n-th order distribution of interest. Fortunately, we can also specify a random process by directly specifying its n-th order distribution for all values of $n = 1, 2, \ldots$ This completely characterizes the random process. Most of the random processes described in Section 4, e.g. independent identically-distributed sequences, Markov chains, Poisson processes and Gaussian processes, are specified in this way.

Finally, random processes can also be specified by expressing them as functions of other random processes. A function $\tilde{Y} := g(\tilde{X})$ of a random process $\tilde{X}$ is also a random process, as it maps any element $\omega$ in the sample space $\Omega$ to a function $\tilde{Y}(\omega, \cdot) := g(\tilde{X}(\omega, \cdot))$. In Section 4.4 we define random walks in this way.

3 Mean and autocovariance functions

The expectation operator allows us to derive quantities that summarize the behavior of the random process through weighted averaging. The mean of the random process is the mean of $\tilde{X}(t)$ at any fixed time $t$.

Definition 3.1 (Mean). The mean of a random process is the function

$\mu_{\tilde{X}}(t) := \mathrm{E}[\tilde{X}(t)].$   (7)

Note that the mean is a deterministic function of $t$. The autocovariance of a random process is another deterministic function that is equal to the covariance of $\tilde{X}(t_1)$ and $\tilde{X}(t_2)$ for any two points $t_1$ and $t_2$. If we set $t_1 := t_2$, then the autocovariance equals the variance at $t_1$.

Defnton. (Autocovarance. The autocovarance of a random process s the functon ( R X (t, t := Cov X (t, X (t. (8 In partcular, R X (t, t := Var X (t. (9 Intutvely, the autocovarance quantfes the correlaton between the process at two dfferent tme ponts. If ths correlaton only depends on the separaton between the two ponts, then the process s sad to be wde-sense statonary. Defnton. (Wde-sense/weakly statonary process. A process s statonary n a wde or weak sense f ts mean s constant and ts autocovarance functon s shft nvarant,.e. µ X (t := µ ( R X (t, t := R X (t + τ, t + τ ( for any t and t and any shft τ. For weakly statonary processes, the autocovarance s usually expressed as a functon of the dfference between the two tme ponts, R X (s := R X (t, t + s for any t. ( Note that any strctly statonary process s necessarly weakly statonary because ts frst and second-order dstrbutons are shft nvarant. Fgure shows several statonary random processes wth dfferent autocovarance functons. If the autocovarance functon s only nonzero at the orgn, then the values of the random processes at dfferent ponts are uncorrelated. Ths results n erratc fluctuatons. When the autocovarance at neghborng tmes s hgh, the trajectory random process becomes smoother. The autocorrelaton can also nduce more structured behavor, as n the rght column of the fgure. In that example X ( s negatvely correlated wth ts two neghbors X ( and X ( +, but postvely correlated wth X ( and X ( +. Ths results n rapd perodc fluctuatons. 4 Important random processes In ths secton we descrbe some mportant examples of random processes. 5

Figure 2: Realizations (bottom three rows) of Gaussian processes with zero mean and the autocovariance functions shown on the top row.
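Realizations like those in Figure 2 can be generated by sampling a Gaussian vector whose covariance matrix is filled in from the autocovariance function; this anticipates the Gaussian processes of Section 4.2. A sketch assuming a squared-exponential autocovariance (our arbitrary choice, not necessarily the one used in the figure):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n = 50
idx = np.arange(n)
# Stationary autocovariance (assumed form): R(s) = exp(-s^2 / 10)
Sigma = np.exp(-((idx[:, None] - idx[None, :]) ** 2) / 10.0)

# Three zero-mean Gaussian realizations with this autocovariance
X = rng.multivariate_normal(mean=np.zeros(n), cov=Sigma, size=3)
print(X.shape)  # (3, 50): each row is one realization
```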

Figure 3: Realizations of an iid uniform sequence in $(0, 1)$ (first row) and an iid geometric sequence with parameter $p = 0.4$ (second row).

4.1 Independent identically-distributed sequences

An independent identically-distributed (iid) sequence $\tilde{X}$ is a discrete-time random process where $\tilde{X}(i)$ has the same distribution for any fixed $i$, and $\tilde{X}(i_1), \tilde{X}(i_2), \ldots, \tilde{X}(i_n)$ are mutually independent for any $n$ fixed indices and any $n$.

If $\tilde{X}(i)$ is a discrete random variable (or, equivalently, the state space of the random process is discrete), then we denote the pmf associated to the distribution of each entry by $p_X$. This pmf completely characterizes the random process, since for any $n$ indices $i_1, i_2, \ldots, i_n$ and any $n$:

$p_{\tilde{X}(i_1), \tilde{X}(i_2), \ldots, \tilde{X}(i_n)}(x_1, x_2, \ldots, x_n) = \prod_{j=1}^{n} p_X(x_j).$   (13)

Note that the distribution does not vary if we shift every index by the same amount, so the process is strictly stationary.

Similarly, if $\tilde{X}(i)$ is a continuous random variable, then we denote the pdf associated to the distribution of each entry by $f_X$. For any $n$ indices $i_1, i_2, \ldots, i_n$ and any $n$ we have

$f_{\tilde{X}(i_1), \tilde{X}(i_2), \ldots, \tilde{X}(i_n)}(x_1, x_2, \ldots, x_n) = \prod_{j=1}^{n} f_X(x_j).$   (14)

Figure 3 shows several realizations from iid sequences which follow a uniform and a geometric distribution.

The mean of an iid random sequence is constant and equal to the mean of its associated distribution, which we denote by $\mu$:

$\mu_{\tilde{X}}(i) := \mathrm{E}[\tilde{X}(i)]$   (15)
$= \mu.$   (16)

Let us denote the variance of the distribution associated to the iid sequence by $\sigma^2$. The autocovariance function is given by

$R_{\tilde{X}}(i, j) := \mathrm{E}[\tilde{X}(i)\tilde{X}(j)] - \mathrm{E}[\tilde{X}(i)]\,\mathrm{E}[\tilde{X}(j)]$   (17)
$= \begin{cases} \sigma^2 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$   (18)

This is not surprising: $\tilde{X}(i)$ and $\tilde{X}(j)$ are independent for all $i \neq j$, so they are also uncorrelated.
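As a quick numerical illustration of (15)-(18), we can estimate the mean and autocovariance of an iid sequence by averaging over many realizations. A sketch using the geometric distribution with $p = 0.4$ from Figure 3:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n, trials, p = 15, 200_000, 0.4

# Each row is one realization of an iid geometric sequence
X = rng.geometric(p, size=(trials, n))

mean_est = X.mean(axis=0)              # close to 1/p = 2.5 at every index
X_c = X - mean_est                     # center the realizations
R_0_0 = np.mean(X_c[:, 0] * X_c[:, 0])  # R(0, 0) ~ sigma^2 = (1-p)/p^2 = 3.75
R_0_1 = np.mean(X_c[:, 0] * X_c[:, 1])  # R(0, 1) ~ 0
print(np.round(mean_est[:3], 3), round(R_0_0, 3), round(R_0_1, 3))
```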

4.2 Gaussian process

A random process $\tilde{X}$ is Gaussian if the joint distribution of the random variables $\tilde{X}(t_1), \tilde{X}(t_2), \ldots, \tilde{X}(t_n)$ is Gaussian for all $t_1, t_2, \ldots, t_n$ and any $n$. An interesting feature of Gaussian processes is that they are fully characterized by their mean and autocovariance function. Figure 2 shows realizations of several discrete Gaussian processes with different autocovariances.

4.3 Poisson process

In Lecture Notes 2 we motivated the definition of the Poisson random variable by deriving the distribution of the number of events that occur in a fixed time interval under the following conditions:

1. Each event occurs independently from every other event.
2. Events occur uniformly.
3. Events occur at a rate of $\lambda$ events per time interval.

We now assume that these conditions hold in the semi-infinite interval $[0, \infty)$ and define a random process $\tilde{N}$ that counts the events. To be clear, $\tilde{N}(t)$ is the number of events that happen between $0$ and $t$.

By the same reasoning as in Example 2.7 of Lecture Notes 2, the distribution of the random variable $\tilde{N}(t_2) - \tilde{N}(t_1)$, which represents the number of events that occur between $t_1$ and $t_2$, is a Poisson random variable with parameter $\lambda(t_2 - t_1)$. This holds for any $t_1$ and $t_2$. In addition, the random variables $\tilde{N}(t_2) - \tilde{N}(t_1)$ and $\tilde{N}(t_4) - \tilde{N}(t_3)$ are independent as long as the intervals $[t_1, t_2]$ and $(t_3, t_4]$ do not overlap, by Condition 1. A Poisson process is a discrete-state continuous-time random process that satisfies these two properties.

Poisson processes are often used to model events such as earthquakes, telephone calls, decay of radioactive particles, neural spikes, etc. In Lecture Notes 2, Figure 5 shows an example of a real scenario where the number of calls received at a call center is well approximated as a Poisson process (as long as we only consider a few hours). Note that here we are using the word event to mean something that happens, such as the arrival of an email, instead of a set within a sample space, which is the meaning that it usually has elsewhere in these notes.

Definition 4.1 (Poisson process). A Poisson process with parameter $\lambda$ is a discrete-state continuous-time random process $\tilde{N}$ such that

1. $\tilde{N}(0) = 0$.
2. For any $t_1 < t_2$, $\tilde{N}(t_2) - \tilde{N}(t_1)$ is a Poisson random variable with parameter $\lambda(t_2 - t_1)$.
3. For any $t_1 < t_2 \le t_3 < t_4$, the random variables $\tilde{N}(t_2) - \tilde{N}(t_1)$ and $\tilde{N}(t_4) - \tilde{N}(t_3)$ are independent.

We now check that the random process is well defined, by proving that we can derive the joint pmf of $\tilde{N}$ at any $n$ points $t_1 < t_2 < \ldots < t_n$ for any $n$. To alleviate notation let $p(\lambda, x)$ be the value of the pmf of a Poisson random variable with parameter $\lambda$ at $x$, i.e.

$p(\lambda, x) := \frac{\lambda^x e^{-\lambda}}{x!}.$   (19)

Figure 4: Events corresponding to the realizations of a Poisson process $\tilde{N}$ for different values of the parameter $\lambda$. $\tilde{N}(t)$ equals the number of events up to time $t$.

We have

$p_{\tilde{N}(t_1), \ldots, \tilde{N}(t_n)}(x_1, \ldots, x_n)$
$= \mathrm{P}(\tilde{N}(t_1) = x_1, \ldots, \tilde{N}(t_n) = x_n)$   (20)
$= \mathrm{P}(\tilde{N}(t_1) = x_1, \tilde{N}(t_2) - \tilde{N}(t_1) = x_2 - x_1, \ldots, \tilde{N}(t_n) - \tilde{N}(t_{n-1}) = x_n - x_{n-1})$   (21)
$= \mathrm{P}(\tilde{N}(t_1) = x_1)\, \mathrm{P}(\tilde{N}(t_2) - \tilde{N}(t_1) = x_2 - x_1) \cdots \mathrm{P}(\tilde{N}(t_n) - \tilde{N}(t_{n-1}) = x_n - x_{n-1})$   (22)
$= p(\lambda t_1, x_1)\, p(\lambda (t_2 - t_1), x_2 - x_1) \cdots p(\lambda (t_n - t_{n-1}), x_n - x_{n-1}).$   (23)

In words, we have expressed the event that $\tilde{N}(t_i) = x_i$ for $1 \le i \le n$ in terms of the random variables $\tilde{N}(t_1)$ and $\tilde{N}(t_i) - \tilde{N}(t_{i-1})$, $2 \le i \le n$, which are independent Poisson random variables with parameters $\lambda t_1$ and $\lambda (t_i - t_{i-1})$ respectively.

Figure 4 shows several sequences of events corresponding to the realizations of a Poisson process $\tilde{N}$ for different values of the parameter $\lambda$ ($\tilde{N}(t)$ equals the number of events up to time $t$). Interestingly, the interarrival time of the events, i.e. the time between contiguous events, always has the same distribution: it is an exponential random variable. This allows us to simulate Poisson processes by sampling from an exponential distribution. Figure 4 was generated in this way.

Lemma 4.2 (Interarrival times of a Poisson process are exponential). Let $T$ denote the time between two contiguous events in a Poisson process with parameter $\lambda$. $T$ is an exponential random variable with parameter $\lambda$.

The proof is in Section A of the appendix. Figure 8 in Lecture Notes 2 shows that the interarrival times of telephone calls at a call center are indeed well modeled as exponential.

The following lemma, which derives the mean and autocovariance functions of a Poisson process, is proved in Section B.

Lemma 4.3 (Mean and autocovariance of a Poisson process). The mean and autocovariance of a Poisson process equal

$\mathrm{E}[\tilde{N}(t)] = \lambda t,$   (24)
$R_{\tilde{N}}(t_1, t_2) = \lambda \min\{t_1, t_2\}.$   (25)

The mean of the Poisson process is not constant and its autocovariance is not shift-invariant, so the process is neither strictly nor wide-sense stationary.
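Lemma 4.2 yields the simulation recipe mentioned above: draw exponential interarrival times and accumulate them until the time horizon is exceeded. A minimal sketch (the function name and seed are our own):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def poisson_process_events(lam, t_max):
    """Event times of a Poisson process with rate lam on [0, t_max]."""
    times = []
    t = rng.exponential(1 / lam)        # first arrival
    while t <= t_max:
        times.append(t)
        t += rng.exponential(1 / lam)   # add an exponential interarrival time
    return np.array(times)

events = poisson_process_events(lam=2.0, t_max=10.0)
print(len(events))   # N(10): Poisson with parameter 2.0 * 10 = 20 on average
```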

Example 4.4 (Earthquakes). The number of earthquakes above a certain intensity on the Richter scale occurring in the San Francisco peninsula is modeled using a Poisson process with parameter 0.3 earthquakes/year. What is the probability that there are no earthquakes in the next ten years and then at least one earthquake over the following twenty years?

We define a Poisson process $\tilde{X}$ with parameter 0.3 to model the problem. The number of earthquakes in the next ten years, i.e. $\tilde{X}(10)$, is a Poisson random variable with parameter $0.3 \cdot 10 = 3$. The number of earthquakes in the following twenty years, $\tilde{X}(30) - \tilde{X}(10)$, is Poisson with parameter $0.3 \cdot 20 = 6$. The two random variables are independent because the intervals do not overlap.

$\mathrm{P}(\tilde{X}(10) = 0, \tilde{X}(30) \ge 1)$
$= \mathrm{P}(\tilde{X}(10) = 0, \tilde{X}(30) - \tilde{X}(10) \ge 1)$   (26)
$= \mathrm{P}(\tilde{X}(10) = 0)\, \mathrm{P}(\tilde{X}(30) - \tilde{X}(10) \ge 1)$   (27)
$= \mathrm{P}(\tilde{X}(10) = 0)\, (1 - \mathrm{P}(\tilde{X}(30) - \tilde{X}(10) = 0))$   (28)
$= e^{-3}(1 - e^{-6}) \approx 0.0497.$   (29)

The probability is 4.97%.
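The arithmetic in (26)-(29) is straightforward to verify numerically; a short check:

```python
import math

lam = 0.3
p_none_10y = math.exp(-lam * 10)              # P(X(10) = 0) = e^{-3}
p_at_least_one_20y = 1 - math.exp(-lam * 20)  # P(X(30) - X(10) >= 1) = 1 - e^{-6}

# Product is justified by the independence of counts on disjoint intervals
answer = p_none_10y * p_at_least_one_20y
print(f"{answer:.4f}")   # 0.0497
```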

4.4 Random walk

A random walk is a discrete-time random process that is used to model a sequence that evolves by taking steps in random directions. To define a random walk formally, we first define an iid sequence of steps $\tilde{S}$ such that

$\tilde{S}(i) = \begin{cases} +1 & \text{with probability } \frac{1}{2}, \\ -1 & \text{with probability } \frac{1}{2}. \end{cases}$   (30)

We define a random walk $\tilde{X}$ as the discrete-state discrete-time random process

$\tilde{X}(i) := \begin{cases} 0 & \text{for } i = 0, \\ \sum_{j=1}^{i} \tilde{S}(j) & \text{for } i = 1, 2, \ldots \end{cases}$   (31)

We have specified $\tilde{X}$ as a function of an iid sequence, so it is well defined. Figure 5 shows several realizations of the random walk.

Figure 5: Realizations of the random walk defined in Section 4.4.

$\tilde{X}$ is symmetric (there is the same probability of taking a positive step and a negative step) and begins at the origin. It is easy to define variations where the walk is non-symmetric and begins at another point. Generalizations to higher-dimensional spaces, for instance to model random processes on a 2D surface, are also possible.

We derive the first-order pmf of the random walk in the following lemma, proved in Section C of the appendix.

Lemma 4.5 (First-order pmf of a random walk). The first-order pmf of the random walk $\tilde{X}$ is

$p_{\tilde{X}(i)}(x) = \begin{cases} \binom{i}{\frac{i+x}{2}} \frac{1}{2^i} & \text{if } i + x \text{ is even and } -i \le x \le i, \\ 0 & \text{otherwise.} \end{cases}$   (32)

The first-order distribution of the random walk is clearly time-dependent, so the random process is not strictly stationary. By the following lemma, the mean of the random walk is constant (it equals zero). The autocovariance, however, is not shift-invariant, so the process is not weakly stationary either.

Lemma 4.6 (Mean and autocovariance of a random walk). The mean and autocovariance of the random walk $\tilde{X}$ are

$\mu_{\tilde{X}}(i) = 0,$   (33)
$R_{\tilde{X}}(i, j) = \min\{i, j\}.$   (34)

Proof.

$\mu_{\tilde{X}}(i) := \mathrm{E}[\tilde{X}(i)]$   (35)
$= \mathrm{E}\Big[\sum_{j=1}^{i} \tilde{S}(j)\Big]$   (36)
$= \sum_{j=1}^{i} \mathrm{E}[\tilde{S}(j)] \quad \text{by linearity of expectation}$   (37)
$= 0.$   (38)

$R_{\tilde{X}}(i, j) := \mathrm{E}[\tilde{X}(i)\tilde{X}(j)] - \mathrm{E}[\tilde{X}(i)]\,\mathrm{E}[\tilde{X}(j)]$   (39)
$= \mathrm{E}\Big[\sum_{k=1}^{i} \tilde{S}(k) \sum_{l=1}^{j} \tilde{S}(l)\Big]$   (40)
$= \mathrm{E}\Big[\sum_{k=1}^{\min\{i,j\}} \tilde{S}(k)^2 + \sum_{k \neq l} \tilde{S}(k)\tilde{S}(l)\Big]$   (41)
$= \sum_{k=1}^{\min\{i,j\}} \mathrm{E}[\tilde{S}(k)^2] + \sum_{k \neq l} \mathrm{E}[\tilde{S}(k)]\,\mathrm{E}[\tilde{S}(l)]$   (42)
$= \min\{i, j\},$   (43)

where (42) follows from linearity of expectation and independence, and (43) holds because $\tilde{S}(k)^2 = 1$ and $\mathrm{E}[\tilde{S}(k)] = 0$.

The variance of $\tilde{X}$ at $i$ equals $R_{\tilde{X}}(i, i) = i$, which means that the standard deviation of the random walk scales as $\sqrt{i}$.

Example 4.7 (Gambler). A gambler is playing the following game. A fair coin is flipped sequentially. Every time the result is heads the gambler wins a dollar; every time it lands on tails she loses a dollar. We can model the amount of money earned (or lost) by the gambler as a random walk, as long as the flips are independent. This allows us to estimate that the expected gain equals zero, or that the probability that the gambler is up 6 dollars or more after the first 10 flips is

$\mathrm{P}(\text{gambler is up \$6 or more}) = p_{\tilde{X}(10)}(6) + p_{\tilde{X}(10)}(8) + p_{\tilde{X}(10)}(10)$   (44)
$= \binom{10}{8}\frac{1}{2^{10}} + \binom{10}{9}\frac{1}{2^{10}} + \binom{10}{10}\frac{1}{2^{10}}$   (45)
$= \frac{56}{1024} \approx 5.47\%.$   (46)
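Both the first-order pmf (32) and the 5.47% in (46) can be confirmed by simulating many ten-step walks; a small sketch:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(seed=0)
n, trials = 10, 500_000

# Simulate the random walk after n steps: sum of n iid +/-1 steps
steps = rng.choice([-1, 1], size=(trials, n))
X_n = steps.sum(axis=1)

empirical = np.mean(X_n >= 6)
exact = (comb(10, 8) + comb(10, 9) + comb(10, 10)) / 2**10
print(f"empirical {empirical:.4f} vs exact {exact:.4f}")   # both ~0.0547
```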

4.5 Markov chains

We begin by defining the Markov property, which is satisfied by any random process for which the future is conditionally independent from the past given the present.

Definition 4.8 (Markov property). A random process satisfies the Markov property if $\tilde{X}(t_{i+1})$ is conditionally independent of $\tilde{X}(t_1), \ldots, \tilde{X}(t_{i-1})$ given $\tilde{X}(t_i)$ for any $t_1 < t_2 < \ldots < t_i < t_{i+1}$. If the state space of the random process is discrete, then for any $x_1, x_2, \ldots, x_{i+1}$

$p_{\tilde{X}(t_{i+1}) \mid \tilde{X}(t_1), \ldots, \tilde{X}(t_i)}(x_{i+1} \mid x_1, \ldots, x_i) = p_{\tilde{X}(t_{i+1}) \mid \tilde{X}(t_i)}(x_{i+1} \mid x_i).$   (47)

If the state space of the random process is continuous (and the distribution has a joint pdf),

$f_{\tilde{X}(t_{i+1}) \mid \tilde{X}(t_1), \ldots, \tilde{X}(t_i)}(x_{i+1} \mid x_1, \ldots, x_i) = f_{\tilde{X}(t_{i+1}) \mid \tilde{X}(t_i)}(x_{i+1} \mid x_i).$   (48)

Any iid sequence satisfies the Markov property, since all conditional pmfs or pdfs are just equal to the marginals. The random walk also satisfies the property, since once we fix where the walk is at a certain time, the path that it took before has no influence on its next steps.

Lemma 4.9. The random walk satisfies the Markov property.

Proof. Let $\tilde{X}$ denote the random walk defined in Section 4.4. Conditioned on $\tilde{X}(j) = x_j$ for $0 \le j \le i$, $\tilde{X}(i+1)$ equals $x_i + \tilde{S}(i+1)$. This does not depend on $x_0, \ldots, x_{i-1}$, which implies (47).

A Markov chain is a random process that satisfies the Markov property. In these notes we will consider discrete-time Markov chains with a finite state space, which means that the process can only take a finite number of values at any given time point. To specify such a Markov chain, we only need to define the pmf of the random process at its starting point (which we will assume is at $i = 0$) and its transition probabilities. This follows from the Markov property, since for any $n$

$p_{\tilde{X}(0), \tilde{X}(1), \ldots, \tilde{X}(n)}(x_0, x_1, \ldots, x_n) = p_{\tilde{X}(0)}(x_0) \prod_{i=1}^{n} p_{\tilde{X}(i) \mid \tilde{X}(0), \ldots, \tilde{X}(i-1)}(x_i \mid x_0, \ldots, x_{i-1})$   (49)
$= p_{\tilde{X}(0)}(x_0) \prod_{i=1}^{n} p_{\tilde{X}(i) \mid \tilde{X}(i-1)}(x_i \mid x_{i-1}).$   (50)

If these transition probabilities are the same at every time step (i.e. they are constant and do not depend on $i$), then the Markov chain is said to be time homogeneous. In this case, we can store the probability of each possible transition in an $s \times s$ matrix $T_{\tilde{X}}$, where $s$ is the number of states:

$(T_{\tilde{X}})_{jk} := p_{\tilde{X}(i+1) \mid \tilde{X}(i)}(x_j \mid x_k).$   (51)

In the rest of this section we will focus on time-homogeneous finite-state Markov chains. The transition probabilities of these chains can be visualized using a state diagram, which shows each state and the probability of every possible transition. See Figure 6 below for an example.

To simplify notation we define an $s$-dimensional vector $p_{\tilde{X}(i)}$, called the state vector, which contains the marginal pmf of the Markov chain at time $i$:

$p_{\tilde{X}(i)} := \begin{pmatrix} p_{\tilde{X}(i)}(x_1) \\ p_{\tilde{X}(i)}(x_2) \\ \vdots \\ p_{\tilde{X}(i)}(x_s) \end{pmatrix}.$   (52)

Each entry in the state vector contains the probability that the Markov chain is in that particular state at time $i$. It is not the value of the Markov chain, which is a random variable.

The initial state vector $p_{\tilde{X}(0)}$ and the transition matrix $T_{\tilde{X}}$ suffice to completely specify a time-homogeneous finite-state Markov chain. Indeed, we can compute the joint distribution of the chain at any $n$ time points $i_1, i_2, \ldots, i_n$ for any $n$ from $p_{\tilde{X}(0)}$ and $T_{\tilde{X}}$ by applying (50) and marginalizing over any times that we are not interested in. We illustrate this in the following example.

Example 4.10 (Car rental). A car-rental company hires you to model the location of their cars. The company operates in Los Angeles, San Francisco and San Jose. Customers regularly take a car in a city and drop it off in another. It would be very useful for the company to be able to compute how likely it is for a car to end up in a given city. You decide to model the location of the car as a Markov chain, where each time step corresponds to a new customer taking the car. The company allocates new cars evenly between the three cities. The transition probabilities, obtained from past data, are given in the table below.

Figure 6: State diagram of the Markov chain described in Example 4.10 (top). Each arrow shows the probability of a transition between the two states. Below we show three realizations of the Markov chain.

                     San Francisco   Los Angeles   San Jose
    San Francisco    0.6             0.1           0.3
    Los Angeles      0.2             0.8           0.3
    San Jose         0.2             0.1           0.4

To be clear, the probability that a customer moves the car from San Francisco to LA is 0.2, the probability that the car stays in San Francisco is 0.6, and so on (each column corresponds to the city where the car is picked up, each row to the city where it is dropped off).

The initial state vector and the transition matrix of the Markov chain are

$p_{\tilde{X}(0)} := \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix}, \qquad T_{\tilde{X}} := \begin{pmatrix} 0.6 & 0.1 & 0.3 \\ 0.2 & 0.8 & 0.3 \\ 0.2 & 0.1 & 0.4 \end{pmatrix}.$   (53)

State 1 is assigned to San Francisco, state 2 to Los Angeles and state 3 to San Jose. Figure 6 shows a state diagram of the Markov chain, together with some realizations.

The company now wishes to estimate the probability that the car starts in San Francisco and is in San Jose after the second customer. This is given by

$p_{\tilde{X}(0), \tilde{X}(2)}(x_1, x_3) = \sum_{j=1}^{3} p_{\tilde{X}(0), \tilde{X}(1), \tilde{X}(2)}(x_1, x_j, x_3)$   (54)
$= \sum_{j=1}^{3} p_{\tilde{X}(0)}(x_1)\, p_{\tilde{X}(1) \mid \tilde{X}(0)}(x_j \mid x_1)\, p_{\tilde{X}(2) \mid \tilde{X}(1)}(x_3 \mid x_j)$   (55)
$= \frac{1}{3} \sum_{j=1}^{3} (T_{\tilde{X}})_{j1} (T_{\tilde{X}})_{3j}$   (56)
$= \frac{1}{3}(0.6 \cdot 0.2 + 0.2 \cdot 0.1 + 0.2 \cdot 0.4) \approx 0.073.$   (57)

The probability is 7.33%.

The following lemma provides a simple expression for the state vector at time $i$ in terms of $T_{\tilde{X}}$ and the previous state vector.

Lemma 4.11 (State vector and transition matrix). For a Markov chain $\tilde{X}$ with transition matrix $T_{\tilde{X}}$,

$p_{\tilde{X}(i)} = T_{\tilde{X}}\, p_{\tilde{X}(i-1)}.$   (58)

If the Markov chain starts at time 0, then

$p_{\tilde{X}(i)} = T_{\tilde{X}}^{\,i}\, p_{\tilde{X}(0)},$   (59)

where $T_{\tilde{X}}^{\,i}$ denotes multiplying $i$ times by the matrix $T_{\tilde{X}}$.

Proof. The proof follows directly from the definitions:

$p_{\tilde{X}(i)}(x_j) = \sum_{k=1}^{s} p_{\tilde{X}(i-1)}(x_k)\, p_{\tilde{X}(i) \mid \tilde{X}(i-1)}(x_j \mid x_k)$   (60)
$= \big(T_{\tilde{X}}\, p_{\tilde{X}(i-1)}\big)_j \quad \text{for } 1 \le j \le s.$   (61)

Equation (59) is obtained by applying (58) $i$ times and taking into account the Markov property.

Example 4.12 (Car rental (continued)). The company wants to estimate the distribution of locations right after the 5th customer has used a car. Applying Lemma 4.11 we obtain

$p_{\tilde{X}(5)} = T_{\tilde{X}}^{\,5}\, p_{\tilde{X}(0)}$   (62)
$= \begin{pmatrix} 0.281 \\ 0.534 \\ 0.185 \end{pmatrix}.$   (63)

The model estimates that after 5 customers more than half of the cars are in Los Angeles.
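Lemma 4.11 turns Example 4.12 into a matrix power. A quick check of (63), and of the two-step probability (57), in Python:

```python
import numpy as np

T = np.array([[0.6, 0.1, 0.3],    # columns: current city (SF, LA, SJ)
              [0.2, 0.8, 0.3],    # rows: next city
              [0.2, 0.1, 0.4]])
p0 = np.full(3, 1 / 3)            # new cars are allocated evenly

p5 = np.linalg.matrix_power(T, 5) @ p0
print(np.round(p5, 3))            # [0.281 0.534 0.185], as in (63)

# Example 4.10: start in SF (state 1), in SJ (state 3) after 2 customers
print(round(p0[0] * np.linalg.matrix_power(T, 2)[2, 0], 4))   # ~0.0733
```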

The states of a Markov chain can be classified depending on whether the Markov chain may eventually stop visiting them or not.

Definition 4.13 (Recurrent and transient states). Let $\tilde{X}$ be a time-homogeneous finite-state Markov chain. We consider a particular state $x$. If

$\mathrm{P}(\tilde{X}(j) = x \text{ for some } j > i \mid \tilde{X}(i) = x) = 1$   (64)

then the state is recurrent. In words, given that the Markov chain is at $x$, the probability that it returns to $x$ is one. In contrast, if

$\mathrm{P}(\tilde{X}(j) \neq x \text{ for all } j > i \mid \tilde{X}(i) = x) > 0$   (65)

the state is transient. Given that the Markov chain is at $x$, there is a nonzero probability that it will never return.

The following example illustrates the difference between recurrent and transient states.

Example 4.14 (Employment dynamics). A researcher is interested in modeling the employment dynamics of young people using a Markov chain. She determines that at age 18 a person is either a student with probability 0.9 or an intern with probability 0.1. After that she estimates the following transition probabilities:

                  Student   Intern   Employed   Unemployed
    Student       0.8       0.5      0          0
    Intern        0.1       0.5      0          0
    Employed      0.1       0        0.9        0.4
    Unemployed    0         0        0.1        0.6

The Markov assumption is obviously not completely precise (someone who has been a student for longer is probably less likely to remain a student), but such Markov models are easier to fit (we only need to estimate the transition probabilities) and often yield useful insights.

The initial state vector and the transition matrix of the Markov chain are

$p_{\tilde{X}(0)} := \begin{pmatrix} 0.9 \\ 0.1 \\ 0 \\ 0 \end{pmatrix}, \qquad T_{\tilde{X}} := \begin{pmatrix} 0.8 & 0.5 & 0 & 0 \\ 0.1 & 0.5 & 0 & 0 \\ 0.1 & 0 & 0.9 & 0.4 \\ 0 & 0 & 0.1 & 0.6 \end{pmatrix}.$   (66)

Figure 7 shows the state diagram and some realizations of the Markov chain.

Figure 7: State diagram of the Markov chain described in Example 4.14 (top). Below we show three realizations of the Markov chain.
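Realizations like those in Figure 7 can be simulated by repeatedly sampling the next state from the column of $T_{\tilde{X}}$ that corresponds to the current state, per (51). A sketch using the matrices in (66), with the seed and a one-step-per-year horizon as our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
states = ["Student", "Intern", "Employed", "Unemployed"]
p0 = np.array([0.9, 0.1, 0.0, 0.0])
T = np.array([[0.8, 0.5, 0.0, 0.0],
              [0.1, 0.5, 0.0, 0.0],
              [0.1, 0.0, 0.9, 0.4],
              [0.0, 0.0, 0.1, 0.6]])   # column k holds p(next state | state k)

x = rng.choice(4, p=p0)                 # state at age 18
path = [x]
for _ in range(15):                     # one realization, one step per year
    x = rng.choice(4, p=T[:, x])        # sample next state from current column
    path.append(x)
print([states[j] for j in path])
```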

States 1 (student) and 2 (intern) are transient states. Note that the probability that the Markov chain returns to those states after visiting state 3 (employed) is zero, so

$\mathrm{P}(\tilde{X}(j) \neq 1 \text{ for all } j > i \mid \tilde{X}(i) = 1) \ge \mathrm{P}(\tilde{X}(i+1) = 3 \mid \tilde{X}(i) = 1)$   (67)
$= 0.1 > 0,$   (68)
$\mathrm{P}(\tilde{X}(j) \neq 2 \text{ for all } j > i \mid \tilde{X}(i) = 2) \ge \mathrm{P}(\tilde{X}(i+1) = 1, \tilde{X}(i+2) = 3 \mid \tilde{X}(i) = 2)$   (69)
$= 0.5 \cdot 0.1 > 0.$   (70)

In contrast, states 3 (employed) and 4 (unemployed) are recurrent. We prove this for state 3 (the argument for state 4 is exactly the same). From state 3 the chain can only move to state 4 or stay at 3, and from state 4 it can only move to state 3 or stay at 4, so never returning to 3 means remaining at 4 forever:

$\mathrm{P}(\tilde{X}(j) \neq 3 \text{ for all } j > i \mid \tilde{X}(i) = 3)$   (71)
$= \mathrm{P}(\tilde{X}(j) = 4 \text{ for all } j > i \mid \tilde{X}(i) = 3)$   (72)
$= \lim_{k \to \infty} \mathrm{P}(\tilde{X}(i+1) = 4 \mid \tilde{X}(i) = 3) \prod_{j=1}^{k} \mathrm{P}(\tilde{X}(i+j+1) = 4 \mid \tilde{X}(i+j) = 4)$   (73)
$= \lim_{k \to \infty} 0.1 \cdot 0.6^{k}$   (74)
$= 0.$   (75)

In this example, it is not possible to reach the states student and intern from the states employed or unemployed. Markov chains for which there is a possible transition between any two states (even if it is not direct) are called irreducible.

Definition 4.15 (Irreducible Markov chain). A time-homogeneous finite-state Markov chain is irreducible if, for any state $x$, the probability of reaching every other state $y \neq x$ in a finite number of steps is nonzero, i.e. there exists $m$ such that

$\mathrm{P}(\tilde{X}(i+m) = y \mid \tilde{X}(i) = x) > 0.$   (76)

One can easily check that the Markov chain in Example 4.10 is irreducible, whereas the one in Example 4.14 is not. An important result is that all states in an irreducible Markov chain are recurrent.

Theorem 4.16 (Irreducible Markov chains). All states in an irreducible Markov chain are recurrent.

The result is proved in Section D of the appendix. We end this section by defining the period of a state.

Figure 8: State diagram of a Markov chain in which the states have period two.

Definition 4.17 (Period of a state). Let $\tilde{X}$ be a time-homogeneous finite-state Markov chain and $x$ a state of the Markov chain. The period $m$ of $x$ is the smallest integer such that it is only possible to return to $x$ in a number of steps that is a multiple of $m$, i.e. $km$ for some positive integer $k$, with nonzero probability.

Figure 8 shows a Markov chain where the states have a period equal to two. Aperiodic Markov chains do not contain states with periods greater than one.

Definition 4.18 (Aperiodic Markov chain). A time-homogeneous finite-state Markov chain $\tilde{X}$ is aperiodic if all states have period equal to one.

The Markov chains in Examples 4.10 and 4.14 are both aperiodic.

A Proof of Lemma 4.2

We begin by deriving the cdf of $T$:

$F_T(t) := \mathrm{P}(T \le t)$   (77)
$= 1 - \mathrm{P}(T > t)$   (78)
$= 1 - \mathrm{P}(\text{no events in an interval of length } t)$   (79)
$= 1 - e^{-\lambda t},$   (80)

because the number of points in an interval of length $t$ follows a Poisson distribution with parameter $\lambda t$. Differentiating, we conclude that

$f_T(t) = \lambda e^{-\lambda t}.$   (81)

B Proof of Lemma 4.3

By definition, the number of events between 0 and $t$ is distributed as a Poisson random variable with parameter $\lambda t$, and hence its mean is equal to $\lambda t$.

The autocovariance equals

$R_{\tilde{N}}(t_1, t_2) := \mathrm{E}[\tilde{N}(t_1)\tilde{N}(t_2)] - \mathrm{E}[\tilde{N}(t_1)]\,\mathrm{E}[\tilde{N}(t_2)]$   (82)
$= \mathrm{E}[\tilde{N}(t_1)\tilde{N}(t_2)] - \lambda^2 t_1 t_2.$   (83)

Assume, without loss of generality, that $t_1 \le t_2$. By assumption $\tilde{N}(t_1)$ and $\tilde{N}(t_2) - \tilde{N}(t_1)$ are independent, so that

$\mathrm{E}[\tilde{N}(t_1)\tilde{N}(t_2)] = \mathrm{E}\big[\tilde{N}(t_1)\big(\tilde{N}(t_2) - \tilde{N}(t_1)\big) + \tilde{N}(t_1)^2\big]$   (84)
$= \mathrm{E}[\tilde{N}(t_1)]\, \mathrm{E}[\tilde{N}(t_2) - \tilde{N}(t_1)] + \mathrm{E}[\tilde{N}(t_1)^2]$   (85)
$= \lambda t_1 \cdot \lambda (t_2 - t_1) + \lambda t_1 + \lambda^2 t_1^2$   (86)
$= \lambda^2 t_1 t_2 + \lambda t_1,$   (87)

where we have used $\mathrm{E}[\tilde{N}(t_1)^2] = \mathrm{Var}[\tilde{N}(t_1)] + \mathrm{E}[\tilde{N}(t_1)]^2 = \lambda t_1 + \lambda^2 t_1^2$. Substituting into (83) yields $R_{\tilde{N}}(t_1, t_2) = \lambda t_1 = \lambda \min\{t_1, t_2\}$.

C Proof of Lemma 4.5

Let us define the number of positive steps $S_+$ that the random walk takes up to time $i$. Given the assumptions on $\tilde{S}$, this is a binomial random variable with parameters $i$ and $1/2$. The number of negative steps is $S_- := i - S_+$. In order for $\tilde{X}(i)$ to equal $x$ we need the net number of steps to equal $x$, which implies

$x = S_+ - S_-$   (88)
$= 2 S_+ - i.$   (89)

This means that $S_+$ must equal $\frac{i+x}{2}$. We conclude that

$p_{\tilde{X}(i)}(x) = \mathrm{P}\Big(S_+ = \frac{i + x}{2}\Big)$   (90)
$= \binom{i}{\frac{i+x}{2}} \frac{1}{2^i} \quad \text{if } \frac{i+x}{2} \text{ is an integer between } 0 \text{ and } i.$   (91)

wll now prove that another arbtrary state y must also be recurrent. To allevate notaton let p x,x := P X (j = x for some j > X ( = x, (9 p x,y := P X (j = y for some j > X ( = x. (9 The chan s rreducble so there s a nonzero probablty p m > of reachng y from x n at most m steps for some m >. The probablty that the chan goes from x to y and never goes back to x s consequently at least p m ( p y,x. However, x s recurrent, so ths probablty must be zero! Snce p m > ths mples p y,x =. Consder the followng event:. X goes from y to x.. X does not return to y n m steps after reachng x.. X eventually reaches x agan at a tme m > m. The probablty of ths event s equal to p y,x ( p m p x,x = p m (recall that x s recurrent so p x,x =. Now magne that steps and repeat k tmes,.e. that X fals to go from x to y n m steps k tmes. The probablty of ths event s p y,x ( p m k p k x,x = ( p m k. Takng k we have that the probablty that X does not eventually return to x must be zero. 5