MAS275 Probability Modelling: Introduction and Markov chains

Chapter 1: Stochastic processes and Markov chains

School of Mathematics and Statistics, University of Sheffield, Spring Semester 2018

Stochastic processes

We study stochastic processes, which are families of random variables describing the evolution of a quantity with time. In some situations we can treat time as discrete: we consider the non-negative integers $\mathbb{N}_0$, and for each $n \in \mathbb{N}_0$ we have a random variable $X_n$ giving the value of the quantity at time $n$. For example, when describing the evolution of a population, $X_n$ could be the number of individuals in the $n$th generation.

Continuous time

Alternatively, it often makes more sense to treat time as continuous: we represent time by the positive real numbers $\mathbb{R}_+$, and for each $t \in \mathbb{R}_+$ we have a random variable $X_t$ giving the value of the quantity of interest at time $t$. For example, $X_t$ could be the price of a financial asset at time $t$. For most of this course we will concentrate on discrete time, which is easier, but we will introduce some continuous time ideas towards the end of the course.

Markov property

A process evolving over a period of time is said to have the Markov property if, given the present, the future is independent of the past, in the sense that if at any time we know the current state of the process, then any information we are given about what happened in the past will not affect the probability of any future event.

Markov property II

In mathematical terms, the Markov property may be written

$P(C \mid A \cap B) = P(C \mid B)$

whenever $A$ represents a past event, $B$ represents a statement of the present state of the process and $C$ represents a future event. The Markov property is a natural assumption in many situations, provided that the current state of the process can be appropriately described.

Markov chain

A Markov chain is a process with the Markov property where the set $S$ of possible states at each time point is finite or countably infinite. The most obvious example is the integers or some subset of them, but in some contexts $S$ is most naturally thought of as non-numeric. $S$ is known as the state space. We will study Markov chains in discrete time only, although it is possible to develop a similar theory in continuous time.

Random variables

Let $X_n$ and $X_{n+1}$ be random variables, taking values in our state space $S$, representing the states of the process at times $n$ and $n+1$ respectively. To describe the behaviour of a Markov chain, we specify probabilities of the form

$p_{ij} = P(X_{n+1} = j \mid X_n = i)$ for $i, j \in S$.

Here $p_{ij}$ is the probability, from state $i$, of moving to state $j$ at the next time point, and is known as the transition probability from state $i$ to state $j$.
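To make the definition concrete, here is a minimal sketch in Python/NumPy. The two-state "weather" chain anticipates the wet and dry days example listed below; the particular transition probabilities are invented for illustration, not taken from the notes.

```python
import numpy as np

# Illustrative two-state chain on S = {"wet", "dry"}; these particular
# transition probabilities are made up for the example.
states = ["wet", "dry"]
P = np.array([[0.6, 0.4],   # row "wet": p(wet -> wet), p(wet -> dry)
              [0.3, 0.7]])  # row "dry": p(dry -> wet), p(dry -> dry)

rng = np.random.default_rng(seed=0)

def step(i, P, rng):
    """Sample X_{n+1} given X_n = i: row i of P is the conditional
    distribution P(X_{n+1} = j | X_n = i)."""
    return rng.choice(len(P), p=P[i])

x = 0                        # start in state "wet"
path = [x]
for _ in range(10):
    x = step(x, P, rng)
    path.append(x)
print([states[i] for i in path])
```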

Assumptions

In this course, we will assume that $p_{ij}$ does not depend upon $n$: we say that the transition probabilities are time-homogeneous. Note also that, because of the Markov property, we may write

$P(X_{n+1} = j \mid X_n = i, \text{previous history before time } n) = p_{ij}$.

Transition matrix

For each ordered pair of states $(i, j)$ there is a corresponding transition probability $p_{ij}$, so we have a square two-dimensional array of numbers, known as the (one-step) transition matrix of the Markov chain. It is often helpful to label the rows and columns of the matrix, especially if the state space is non-numerical.

Finite and infinite matrices

Where $S$ is a finite set, the transition matrix is an $|S| \times |S|$ matrix. We will also consider some examples where $S$ is an infinite set, in which case the transition matrix has to be thought of as a matrix of infinite extent.

Examples

- Wet and dry days
- Gambler's ruin
- Gambler's ruin with no target
- Ehrenfest model for diffusion
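As a sketch of writing down a transition matrix in code, here is the gambler's ruin chain from the list above, with an assumed win probability $p$ and target $N$: the chain moves up by 1 with probability $p$, down by 1 otherwise, and the states $0$ and $N$ are absorbing. The values $N = 5$, $p = 0.4$ are illustrative.

```python
import numpy as np

def gamblers_ruin_P(N, p):
    """One-step transition matrix for gambler's ruin on S = {0, 1, ..., N}:
    from 1 <= i <= N-1 move to i+1 with probability p and to i-1 with
    probability 1-p; states 0 and N are absorbing."""
    P = np.zeros((N + 1, N + 1))
    P[0, 0] = 1.0            # ruined: stay at 0 forever
    P[N, N] = 1.0            # target reached: stay at N forever
    for i in range(1, N):
        P[i, i + 1] = p
        P[i, i - 1] = 1 - p
    return P

print(gamblers_ruin_P(N=5, p=0.4))
```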

Properties of the transition matrix

Starting in state $i$, the process moves to exactly one state at the next time point. So a transition matrix must have the property that each of its rows adds up to 1:

$\sum_{j \in S} p_{ij} = 1$ for each $i \in S$.

Another way of saying this is that each row of the transition matrix represents a probability distribution, namely the conditional distribution of the next state, given that the present state is $i$. Also, its elements are probabilities and therefore non-negative.

Stochastic matrices

Any square matrix with these properties is called a stochastic matrix.

Eigenvector

The sum of each row being 1 implies that, if $\mathbf{1}$ is the column vector whose entries are all 1, then $P\mathbf{1} = \mathbf{1}$. In other words, a stochastic matrix always has 1 as an eigenvalue, with (right) eigenvector $\mathbf{1}$.

Initial distribution

By their definition, transition probabilities are conditional probabilities. In order to describe exactly how a process is behaving we need to specify some unconditional or absolute probabilities. We can do this by specifying the initial distribution of the chain, probabilities of the form

$\pi^{(0)}_i = P(X_0 = i)$ for $i \in S$,

where these numbers add up to 1 because they form a probability distribution.
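Both properties are easy to check numerically; a quick sketch reusing the illustrative two-state matrix from above:

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.3, 0.7]])

# Each row of a stochastic matrix sums to 1 ...
assert np.allclose(P.sum(axis=1), 1.0)

# ... equivalently P @ 1 = 1, so 1 is an eigenvalue of P whose right
# eigenvector is the all-ones vector.
ones = np.ones(len(P))
print(P @ ones)                                      # [1. 1.]
print(np.isclose(np.linalg.eigvals(P), 1.0).any())   # True
```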

Row vector

Note that we can write all these probabilities in a vector: let $\pi^{(0)}$ be the vector (in $\mathbb{R}^{|S|}$) with entry $i$ being $\pi^{(0)}_i$, for $i \in S$. The usual convention in probability theory, fitting with the way we defined the transition matrix, is to treat this as a row vector.

Starting in a known state

Sometimes we will simply say that the chain starts at time 0 in a particular state $i_0$. To represent this, let $\pi^{(0)}$ be the vector with entries given by

$\pi^{(0)}_i = \begin{cases} 1 & i = i_0; \\ 0 & i \neq i_0, \end{cases}$

which describes the chain being in $i_0$ with probability 1 at time 0.

Later states

To calculate probabilities involving later states, we then need to use the rules of conditional probability. For example,

$P(X_0 = i, X_1 = j) = P(X_0 = i)\,P(X_1 = j \mid X_0 = i) = \pi^{(0)}_i p_{ij}$.

Markov property

To extend this to a further step, the key is to use the Markov property to simplify $P(X_2 = k \mid X_0 = i, X_1 = j)$: (on board)
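A numerical sketch of the rule $P(X_0 = i, X_1 = j) = \pi^{(0)}_i p_{ij}$, with an assumed uniform initial distribution (illustrative, not from the notes):

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.3, 0.7]])
pi0 = np.array([0.5, 0.5])    # assumed initial distribution, for illustration

# P(X_0 = i, X_1 = j) = pi0_i * p_ij, computed for all pairs at once:
joint = pi0[:, None] * P
print(joint)                  # joint[i, j] = P(X_0 = i, X_1 = j)
print(joint.sum())            # 1.0, so this is a genuine joint distribution
```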

General case

More generally, for $i_0, i_1, \ldots, i_n \in S$ we may write

$P(X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n) = \pi^{(0)}_{i_0} p_{i_0 i_1} p_{i_1 i_2} \cdots p_{i_{n-1} i_n}$.

Row vectors

As above for time zero, we can write the distribution of $X_n$, the state of the chain at time $n$, as a row vector: let $\pi^{(n)}$ be the vector with entry $i$ being $\pi^{(n)}_i$, for $i \in S$. Then $\pi^{(1)} = \pi^{(0)} P$ (calculation on board).

n-step transition probabilities

For a positive integer $n$, the n-step transition probabilities of a Markov chain are defined in the obvious way,

$p^{(n)}_{ij} = P(X_{m+n} = j \mid X_m = i)$,

so that the transition probabilities which we have encountered are just the special case $n = 1$.

n-step transition matrix

As with the $n = 1$ case, we can gather these n-step transition probabilities into the form of a matrix, called the n-step transition matrix. For now we denote it by $P^{(n)}$, but we will now see that it is related to the one-step transition matrix $P$ in a simple way, by a set of equations known as the Chapman-Kolmogorov equations.
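The general product formula translates directly into code; a sketch, where the helper name path_probability is ours and the chain is the illustrative one from above:

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.3, 0.7]])
pi0 = np.array([1.0, 0.0])    # chain started in state 0 with probability 1

def path_probability(path, pi0, P):
    """P(X_0 = i_0, ..., X_n = i_n) = pi0[i_0] * p_{i_0 i_1} * ... * p_{i_{n-1} i_n}."""
    prob = pi0[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    return prob

print(path_probability([0, 1, 1, 0], pi0, P))   # 1.0 * 0.4 * 0.7 * 0.3 = 0.084
```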

Chapman-Kolmogorov equations

Theorem (Chapman-Kolmogorov equations).
(a) For all positive integers $m, n$, we have $P^{(m+n)} = P^{(m)} P^{(n)}$.
(b) For all $n = 1, 2, 3, \ldots$, $P^{(n)} = P^n$.

Corollary

We can now obtain a relationship between the row vectors representing the distribution of the state of the chain at different times.

Corollary. For all non-negative integers $m$ and $n$, we have $\pi^{(m+n)} = \pi^{(m)} P^n$.

Graphs

A graph here refers to a network consisting of a set of vertices, some pairs of which are linked by edges. We assume for now that each edge can be traversed in either direction, and that there are no loops (that is, no edges where both ends are the same vertex). We construct a Markov chain by letting the state space $S$ be the set of vertices, and assuming that we have a particle moving from vertex to vertex. At each step, the particle chooses one of the possible edges from its current vertex, each with equal probability, and travels along that edge. This is called a (symmetric) random walk on a graph.
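A sketch combining the two ideas on this slide: build the transition matrix of a symmetric random walk from an adjacency matrix (so $p_{ij} = A_{ij}/\deg(i)$), then check the Chapman-Kolmogorov equations numerically. The graph, a 4-cycle with the chord $\{1, 3\}$, is invented for the demonstration.

```python
import numpy as np

# Adjacency matrix of a small illustrative graph: a 4-cycle 0-1-2-3-0
# together with the chord {1, 3}.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])

# Symmetric random walk: from vertex i, choose one of its deg(i) edges
# uniformly at random, so p_ij = A_ij / deg(i).
deg = A.sum(axis=1)
P = A / deg[:, None]

# Chapman-Kolmogorov: P^(m+n) = P^(m) P^(n), with P^(n) = P^n.
m, n = 2, 3
lhs = np.linalg.matrix_power(P, m + n)
rhs = np.linalg.matrix_power(P, m) @ np.linalg.matrix_power(P, n)
assert np.allclose(lhs, rhs)
print(lhs)
```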

Example of a random walk on a graph

(On board.)

Diagonalisation

For n-step transition probabilities for large $n$, one approach we could consider is to try to diagonalise the transition matrix, namely to try to find an expression for $P$ in the form

$P = CDC^{-1}$,

where $D$ is a matrix consisting of the eigenvalues of $P$ down the main diagonal and zeroes everywhere else, and $C$ is a matrix whose columns are the corresponding right eigenvectors of $P$.

Diagonalisation II

If we can do this, then

$P^n = CDC^{-1} \cdot CDC^{-1} \cdots CDC^{-1} = CD^nC^{-1}$,

where $D^n$ is easy to write down explicitly.

Diagonalisation III

Diagonalisation of larger matrices is usually not easy, but it can be helped by the fact that a stochastic matrix always has 1 as an eigenvalue, with $\mathbf{1}$ as the corresponding right eigenvector.
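In practice the diagonalisation can be done numerically; a sketch using np.linalg.eig on the illustrative two-state matrix, checking that $CD^nC^{-1}$ agrees with direct matrix powering:

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.3, 0.7]])

# Diagonalise P = C D C^{-1}: columns of C are right eigenvectors of P,
# and D is diagonal with the corresponding eigenvalues.
eigvals, C = np.linalg.eig(P)

n = 10
Pn_via_diag = C @ np.diag(eigvals**n) @ np.linalg.inv(C)
assert np.allclose(Pn_via_diag, np.linalg.matrix_power(P, n))
print(Pn_via_diag)
```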

Left eigenvectors

A left eigenvector of $P$ with eigenvalue $\lambda$ is a row vector $x$ such that $xP = \lambda x$. By thinking about the eigenvalues of a matrix and its transpose, it can be seen that the set of eigenvalues of a matrix is the same regardless of whether left or right eigenvectors are considered. So a finite stochastic matrix always has a left eigenvector $\pi$ with eigenvalue 1, i.e. $\pi P = \pi$. This can then be iterated to give $\pi P^n = \pi$ for any $n = 1, 2, 3, \ldots$

Equilibrium and stationary distributions

This means that if we choose our initial distribution $\pi^{(0)} = \pi$ (which requires that the entries in $\pi$ sum to 1 and that they are non-negative), then for any $n$

$\pi^{(n)} = \pi P^n = \pi$,

and so the (unconditional) distribution of $X_n$ is the same as that of $X_0$, for any $n$. If this is the case, we say that the Markov chain is in equilibrium. A distribution which, when chosen as initial distribution, causes a Markov chain to be in equilibrium is called a stationary distribution.

Finding stationary distributions

To find stationary distributions, we need to solve the (left) eigenvector equations

$\pi_j = \sum_{i \in S} \pi_i p_{ij}$ for all $j \in S$,

which are called the equilibrium equations. Each equilibrium equation corresponds to a column of the transition matrix. In the finite state space case, because we know that 1 is an eigenvalue, the equations are not linearly independent, and we may always discard one of them without losing any information.
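As a numerical sketch, the equilibrium equations can be solved by replacing the discarded (redundant) equation with the normalisation condition $\sum_j \pi_j = 1$ introduced on the next slide; the two-state matrix is the illustrative one from earlier.

```python
import numpy as np

P = np.array([[0.6, 0.4],
              [0.3, 0.7]])
S = len(P)

# Equilibrium equations pi P = pi, i.e. (P^T - I) pi^T = 0. One equation
# is redundant, so replace it by the normalisation sum_j pi_j = 1.
M = P.T - np.eye(S)
M[-1, :] = 1.0
b = np.zeros(S)
b[-1] = 1.0
pi = np.linalg.solve(M, b)

print(pi)                       # [3/7, 4/7] = [0.4285..., 0.5714...]
assert np.allclose(pi @ P, pi)  # pi is indeed stationary
```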

Finding stationary distributions II

The eigenvector is only defined up to a constant multiple, but as we are looking for a probability distribution we have the condition

$\sum_{j \in S} \pi_j = 1$,

which is sufficient to guarantee a unique solution in the case where the eigenvalue 1 has multiplicity 1.

Short cuts

In practice, the solution of these equations is often helped by the fact that (in many examples) there are a lot of zeros in the transition matrix. When we know that the solution is unique (which, as we will see later in the course, is very often the case) it is also often valuable to appeal to symmetry to argue that some of the $\pi_j$ are equal to each other, thereby enabling us to eliminate some variables from the equations.

Infinite state space

In the infinite state space case, the same equations apply, but our vectors and matrices are now infinite in extent, so we have an infinite family of equations, each of which may have an infinite sum on the right-hand side. Such families of equations may not have a solution, and even if one does exist it may not be easy to find. However, there are some examples of infinite state Markov chains with stationary distributions which can be found by exploiting the structure of the equations.

Stationary distribution for random walk on a graph
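The notes leave this example to the board. As a numerical sketch, solving the equilibrium equations for the random walk on the small illustrative graph used earlier gives a stationary distribution proportional to the vertex degrees, which is the standard answer for a symmetric random walk on a finite connected graph (a fact not derived in this excerpt):

```python
import numpy as np

# Symmetric random walk on the illustrative graph from before
# (a 4-cycle with the chord {1, 3}).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0]])
deg = A.sum(axis=1)
P = A / deg[:, None]
S = len(P)

# Solve the equilibrium equations with the normalisation, as before.
M = P.T - np.eye(S)
M[-1, :] = 1.0
b = np.zeros(S)
b[-1] = 1.0
pi = np.linalg.solve(M, b)

print(pi)               # [0.2, 0.3, 0.2, 0.3]
print(deg / deg.sum())  # identical: pi_i is proportional to deg(i)
```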