Lecture 5: Introduction to Markov Chains


Lecture 5: Introduction to Markov Chains
Winfried Just, Department of Mathematics, Ohio University
January 24–26, 2018

weather.com light

The weather is a stochastic process. For now we can assume that this just means it changes over time and cannot be predicted with certainty. Suppose you make the following observations:

If day t is a sunny day, then day t+1 will be sunny with probability p_11 and will be rainy with probability p_12.
If day t is a rainy day, then day t+1 will be sunny with probability p_21 and will be rainy with probability p_22.

These observations can be organized into a matrix P as follows:

P = [ p_11  p_12
      p_21  p_22 ]

What can you say about the properties of the matrix P?

What are probabilities, anyway?

Probabilities are numbers p such that 0 <= p <= 1. A probability of p = 0 for an event or outcome signifies that the event will never occur. A probability of p = 1 for an event signifies that the event will occur with certainty. A probability of p = 0.5 for an event signifies that there is a fifty-fifty chance that the event will occur. For example, if we flip a fair coin, then the probability of the event H that it comes up heads is p = 0.5. Similarly, if we roll a fair die, then the probability of the event e_3 that it comes up 3 is p = 1/6.

Probability distributions

Suppose we can cut up all aspects of the future that are of interest for a particular application into events, also called elementary outcomes, e_1, ..., e_n, so that exactly one of these outcomes will occur. Then the probability distribution is the vector [p_1, ..., p_n] of probabilities of these outcomes, and we must have:

p_1 + ... + p_n = sum_{i=1}^n p_i = 1.

When we flip a fair coin once, we might consider the outcomes [H, T] = [e_1, e_2], and then [p_1, p_2] = [0.5, 0.5]. If we roll a fair die once, we might consider [e_1, ..., e_6], and then [p_1, p_2, p_3, p_4, p_5, p_6] = [1/6, 1/6, 1/6, 1/6, 1/6, 1/6]. When we flip a fair coin twice and are only interested in the number of times heads comes up, we might consider the outcomes e_0, e_1, e_2 (zero, one, or two heads), and then [p_0, p_1, p_2] = [0.25, 0.5, 0.25].
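The two-coin distribution above can be checked by enumerating the four equally likely flip sequences; a minimal sketch in Python:

```python
from itertools import product
from collections import Counter

# Enumerate all equally likely outcomes of two fair coin flips:
# HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many of the four sequences have 0, 1, or 2 heads.
counts = Counter(seq.count("H") for seq in outcomes)

# Probability distribution over the number of heads.
dist = {k: counts[k] / len(outcomes) for k in sorted(counts)}
print(dist)  # {0: 0.25, 1: 0.5, 2: 0.25}
```

The probabilities sum to 1, as every distribution must.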

weather.com light, continued

P = [ p_11  p_12
      p_21  p_22 ]

What can you say about the properties of the matrix P? Nothing much, actually. How about dry but overcast days? Or days when it rains in the morning and is sunny in the afternoon?

OK, let us assume that you can somehow unambiguously classify each day as either sunny or rainy, but not both. Then you have two possible states of the weather on day t:

State 1: sunny day.
State 2: rainy day.

Then P is the matrix of transition probabilities between the states on day t and day t+1.

weather.com light, continued

Under these assumptions, we can say something definite about P:

P = [ p_11  p_12
      p_21  p_22 ],  where  p_11 + p_12 = 1  and  p_21 + p_22 = 1.

P must be a stochastic matrix, that is, its elements must be probabilities, and each row of P must add up to 1. Consider the following examples (rows separated by semicolons):

P_1 = [ 0.4  0.6 ; 0.6  0.4 ]
P_2 = [ 0.4  0.3 ; 0.6  0.4 ]
P_3 = [ 0.4  0.6 ; 0.7  0.3 ]

Here P_1 and P_3 are legitimate options for P, while P_2 is not. In P_1, each column also adds up to 1; this matrix is doubly stochastic. The latter property is not required, though.
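The stochastic-matrix conditions are easy to verify mechanically. A minimal NumPy sketch (the helper name is ours):

```python
import numpy as np

def is_stochastic(P, tol=1e-9):
    """Check that every entry is a probability and every row sums to 1."""
    P = np.asarray(P, dtype=float)
    entries_ok = np.all((P >= 0) & (P <= 1))
    rows_ok = np.allclose(P.sum(axis=1), 1.0, atol=tol)
    return bool(entries_ok and rows_ok)

P1 = [[0.4, 0.6], [0.6, 0.4]]
P2 = [[0.4, 0.3], [0.6, 0.4]]
P3 = [[0.4, 0.6], [0.7, 0.3]]

print(is_stochastic(P1), is_stochastic(P2), is_stochastic(P3))  # True False True

# P1 is also doubly stochastic: its transpose is stochastic as well.
print(is_stochastic(np.transpose(P1)))  # True
```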

weather.com for perfectionists

There are some stochastic matrices

P = [ p_11  p_12
      p_21  p_22 ]

that would allow you to predict the weather with perfect accuracy on any day in the future.

Homework 13: (a) Which of the following matrices are stochastic and have this property? (b) What would they tell you about the weather pattern?

P_1 = [ 1  0 ; 0  1 ]        P_2 = [ 1  1 ; 1  1 ]
P_3 = [ 1  1 ; 0  0 ]        P_4 = [ 0.5  0.5 ; 0.5  0.5 ]
P_5 = [ 1  0 ; 1  0 ]        P_6 = [ 0  1 ; 1  0 ]
P_7 = [ 0  0 ; 0  0 ]        P_8 = [ 0  1 ; 0  1 ]

weather.com, two-day forecast

Let
q_11 be the probability that if day t is a sunny day, then day t+2 will also be sunny;
q_12 be the probability that if day t is a sunny day, then day t+2 will be rainy;
q_21 be the probability that if day t is a rainy day, then day t+2 will be sunny;
q_22 be the probability that if day t is a rainy day, then day t+2 will also be rainy.

How can we express the transition probability matrix

Q = [ q_11  q_12
      q_21  q_22 ]

from day t to day t+2 in terms of the next-day transition probabilities p_11, p_12, p_21, p_22?

weather.com, working out a two-day forecast

q_11 is the probability that if day t is a sunny day, then day t+2 will also be sunny. If day t is sunny, day t+1 could be sunny (with probability p_11) or rainy (with probability p_12); in the first case day t+2 is sunny with probability p_11, in the second with probability p_21:

q_11 = p_11 p_11 + p_12 p_21

q_12 is the probability that if day t is a sunny day, then day t+2 will be rainy:

q_12 = p_11 p_12 + p_12 p_22

q_21 is the probability that if day t is a rainy day, then day t+2 will be sunny:

q_21 = p_21 p_11 + p_22 p_21

q_22 is the probability that if day t is a rainy day, then day t+2 will also be rainy:

q_22 = p_21 p_12 + p_22 p_22

weather.com, posting a two-day forecast

The transition probability matrix from day t to day t+2 is

Q = [ q_11  q_12
      q_21  q_22 ]

  = [ p_11 p_11 + p_12 p_21    p_11 p_12 + p_12 p_22
      p_21 p_11 + p_22 p_21    p_21 p_12 + p_22 p_22 ]

Note that these entries are exactly those of the matrix product of P with itself: Q = P^2. Done!!! Anyone up for working out a seven-day forecast?

Homework 14: Find the two-day transition probability matrix Q if the one-day transition probability matrix is

P = [ 0.4  0.6
      0.7  0.3 ]
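The entrywise formulas above are exactly those of the matrix product of P with itself, so a computer can produce multi-day forecasts by taking matrix powers. A minimal NumPy sketch, using the matrix P_1 from an earlier slide (deliberately not the Homework 14 matrix):

```python
import numpy as np

# One-day transition matrix P_1 from an earlier slide (not the homework matrix).
P = np.array([[0.4, 0.6],
              [0.6, 0.4]])

# Two-day forecast: q_ij = sum over k of p_ik p_kj, i.e. matrix multiplication.
Q = P @ P
print(Q)  # [[0.52 0.48]
          #  [0.48 0.52]]

# Seven-day forecast: just take the seventh matrix power.
Q7 = np.linalg.matrix_power(P, 7)
print(Q7)
```

Every power of a stochastic matrix is again stochastic: each row of Q and Q7 still sums to 1.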

weather.com continued: The Markov property

Let q_11 be the probability that if day t is a sunny day, then day t+2 will also be sunny. Recall that the calculation of q_11 used the following argument: If day t is sunny, day t+1 could be sunny (with probability p_11) or rainy (with probability p_12). In the first case, the probability that day t+2 will be sunny is p_11; in the second case this probability is p_21. This gives:

q_11 = p_11 p_11 + p_12 p_21

Note that we are making a tacit assumption here: The state of the process at time t+1 depends only on the state at time t, but not on prior history (whether day t-1 was also sunny or rainy). This is called the Markov property, and a (discrete-time) stochastic process with finitely many states that has this property is called a Markov chain.

weather.com for real: Does the weather work this way? Does the weather really have the Markov property? Probably not. A long stretch of sunny days may be more likely to be followed by one more sunny day than a single sunny day in an otherwise rainy period. But perhaps the real weather pattern is close enough to a Markov chain so that we can make somewhat accurate predictions by modeling it as one. All models involve simplifying assumptions. In this course we will only consider examples of stochastic processes that are assumed to be Markov chains.

Where is Waldo?

Recall the friendship matrix for six students in our class that we discussed earlier. Waldo is another student in this class. He is highly gregarious and motivated and spends all of his evenings working with those six students on his MATH 3200 homework. At 7 p.m. he visits a randomly chosen student i among those six, and then operates as follows: He starts working with i. After 10 minutes, he flips a fair coin. If the coin comes up heads, he continues working with i for another 10 minutes before flipping the coin again. If the coin comes up tails, he moves to the room of a randomly chosen friend of i and repeats the procedure. He never tires of these efforts until 1 a.m.

Where should we go looking for Waldo at midnight?

A few observations about Waldo

We cannot know for certain where Waldo is at midnight. But we can try to identify the room where he will be with the highest probability. Waldo's itinerary can be modeled as a Markov chain with states i = 1, 2, ..., 6, where one time step lasts 10 minutes. State i simply means that Waldo is in i's room. We want to find the transition probability matrix for this Markov chain.

The transition probability matrix P for Waldo

We can find P from the friendship matrix

A = [ 0 0 0 1 0 1
      0 0 0 1 0 1
      0 0 0 1 1 0
      1 1 1 0 0 1
      0 0 1 0 0 0
      1 1 0 1 0 0 ]  =  [a_ij]_{6x6}

P = [ 1/2  0    0    1/4  0    1/4
      0    1/2  0    1/4  0    1/4
      0    0    1/2  1/4  1/4  0
      1/8  1/8  1/8  1/2  0    1/8
      0    0    1/2  0    1/2  0
      1/6  1/6  0    1/6  0    1/2 ]  =  [p_ij]_{6x6}
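This construction, and the midnight question from the previous slide, can be sketched in NumPy. The rule is the one described above: with probability 1/2 Waldo stays, and with probability 1/2 he moves to a uniformly chosen friend of his current host. Midnight is 5 hours, i.e. 30 ten-minute steps, after the uniformly random 7 p.m. start:

```python
import numpy as np

# Friendship matrix A from the slide: a_ij = 1 iff students i and j are friends.
A = np.array([
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
])

# Stay with probability 1/2; otherwise move to a uniformly chosen friend.
deg = A.sum(axis=1)                      # number of friends of each student
P = 0.5 * np.eye(6) + 0.5 * A / deg[:, None]

# 7 p.m. start: a uniformly random host. Each 10-minute step applies P,
# so midnight is 30 steps later.
v0 = np.full(6, 1 / 6)
v_midnight = v0 @ np.linalg.matrix_power(P, 30)
print(np.round(v_midnight, 3))
```

By midnight the distribution is essentially stationary, and the most likely room is that of student 4, the student with the most friends.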

If Waldo weren't such great company...

The story is halfway believable only if you assume that Waldo is extremely likable. What would happen if our six students were not all that fond of him? In this case you can think of i flipping the coin and deciding whether to kick out Waldo, sending him to a randomly chosen student who is not i's friend. This alternative story defines another Markov chain with the same states and the same interpretation of a single time step.

Homework 15: Find the transition probability matrix for this new Markov chain.

Waldo surfs the web

At 1 a.m., Waldo goes back to his own room and surfs the web. He opens his home page and follows a randomly chosen link. At each page that he visits:

If the page has no link to another page, he teleports to a randomly chosen URL.
If the page has links, he rolls a fair die. If 6 comes up, he teleports to a randomly chosen URL. If any other number comes up, he follows a randomly chosen link from the current page.

Note that teleporting here means that Waldo somehow reaches a web page without using any links. This surfing pattern, together with what he learned in MATH 3200, made Waldo rich and famous.

Who is Waldo?
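Waldo's surfing rule also defines a Markov chain whose states are web pages. A sketch of its transition matrix on a tiny link graph that we made up for illustration (four pages, with page 3 having no outgoing links):

```python
import numpy as np

# A made-up link graph: links[i] lists the pages that page i links to.
# Page 3 is a "dangling" page with no outgoing links.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
n = len(links)

# Build the transition matrix from the surfing rule above:
# - a dangling page: teleport to a uniformly random page;
# - otherwise: with probability 1/6 (die shows 6) teleport,
#   with probability 5/6 follow a uniformly random link.
P = np.zeros((n, n))
for i, out in links.items():
    if not out:
        P[i] = 1 / n
    else:
        P[i] = (1 / 6) * (1 / n)
        for j in out:
            P[i, j] += (5 / 6) / len(out)

# P is stochastic, so repeated surfing is a Markov chain; its long-run
# distribution can be used to rank the pages.
v = np.full(n, 1 / n)
for _ in range(100):
    v = v @ P
print(np.round(v, 3))
```

The long-run distribution computed here is exactly the kind of page ranking that the last question on the slide alludes to.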