Introduction to Stochastic Processes

Stat 251/551 (Spring 2017): Stochastic Processes
Lecture 1: Introduction to Stochastic Processes
Lecturer: Sahand Negahban    Scribe: Sahand Negahban

1 Organization Issues

We will use Canvas as the course webpage. Homework, solutions, and lecture notes will be linked from Canvas. The course book, written by Prof. Chang, is posted on ClassesV2.

1.1 Assignments

There will be roughly eight homework assignments; the first one has been posted online. Homework assignments will typically be assigned on Monday and due the following Wednesday; exceptions will be noted. Students are also expected to scribe a small number of lectures, preparing a LaTeX version of the lecture notes. I will post the template on ClassesV2.

There will be three midterm exams for undergraduates and two for graduate students. Graduate students will complete a final project in lieu of the final midterm. Undergraduates with sufficiently high grades may also request to do a project. Students may complete projects in groups of at most three. The breakdown of how grades will be assigned can be found on the course syllabus.

2 Overview

The course will introduce basic ideas in stochastic processes. We will follow the textbook developed by Prof. Joe Chang, which is posted on ClassesV2. Stochastic processes are simply collections of random objects. There are many examples and applications of stochastic processes:

Images: We can think of an image as a collection of pixels. For simplicity, consider a grey-scale image, so each pixel records an intensity of brightness. Furthermore, pixels are related to each other: pixels that are close together should usually have similar colors. If we consider the image in Figure 1, which is 5 pixels by 5 pixels, then we can think of each circle as a pixel organized on a grid.

Stock prices: We can plot the price of an equity as a function of time, as in Figure 2.

Protein folding: In bioinformatics we are frequently interested in understanding the dynamics of a specific protein. This might be because imaging technology is not sufficiently advanced, or because we wish to develop a new protein and understand how it behaves without needing to create it in the lab. Often the dynamics of these systems are treated as a Markov chain, which we will discuss in more detail in the upcoming lectures.

Figure 1: Cartoon of a five-by-five pixel image of the letter Y

Electromagnetic signals: In electrical engineering applications we often wish to estimate some electromagnetic signal, for example a cell phone, GPS, WiFi, or radio signal. These are frequently modeled as stochastic processes because certain factors are unknown to the receiver. For example, the receiver does not know what the transmitter aims to send, and the transmitter's signal can be corrupted by noise.

Speech recognition: One simple and successful model for speech recognition is known as a Hidden Markov Model, or HMM. We can think of spoken language as a sequence (x_1, x_2, x_3, ...) of idealized phonemes, where we can only hear the speaker's interpretation of what each phoneme should sound like. For the speech recognition application we wish to infer the most likely collection of phonemes.

3 Prerequisites

Before continuing with our discussion of stochastic processes, we lay down some notation from first-year probability that you are all expected to understand.

3.1 Probability

We refer to an experiment as any event whose result is unknown in advance. The result of that experiment is known as the outcome. We denote by Ω the sample space, the set of all possible outcomes.

Example (Flipping two coins). Flipping a coin (usually) has one of two possible outcomes: heads (H) or tails (T). Two coin flips then have four possible outcomes: HH, HT, TH, and TT. Therefore, Ω = {HH, HT, TH, TT}. (Note that we have said nothing about probabilities.)

We use probability to help us understand uncertainty, and probability theory has been a very successful model for it. Probability allows us to formally capture that uncertainty: "Probability theory is nothing but common sense reduced to calculation." Probability allows us to quantify the uncertainty of events. An event is any subset of the sample space Ω. For example, if we flip two coins, we can consider the event that the two coin flips are the same; formally, we let A = {HH, TT}. Clearly A is a subset of the sample space Ω from above. We can also consider another event B = {HH, HT}, which is the event that the first coin flip is heads.
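As a concrete illustration (a minimal Python sketch; the variable names are chosen here for illustration only), we can enumerate this sample space and treat events as ordinary sets:

    from itertools import product

    # Sample space for two coin flips: all ordered pairs of H and T.
    omega = {"".join(flips) for flips in product("HT", repeat=2)}
    print(omega)            # {'HH', 'HT', 'TH', 'TT'}

    # Events are subsets of the sample space.
    A = {"HH", "TT"}        # both flips agree
    B = {"HH", "HT"}        # first flip is heads

    # Since events are sets, the standard set operations apply.
    print(A & B)            # intersection: {'HH'}
    print(A | B)            # union: {'HH', 'HT', 'TT'}
    print(omega - A)        # complement of A: {'HT', 'TH'}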

Figure 2: Daily stock prices for Yahoo! (price against date, 1996 to 2012)

Since A and B are both sets, we can perform the standard operations on them, such as intersections, set complements, and so on.

Assigning probabilities: Probability is a way of quantifying the level of uncertainty of an event. We use a function P(A) that assigns a number between 0 and 1 to any (technically, measurable) set or event. Furthermore, we assume that for any countable collection of disjoint subsets A_1, A_2, A_3, ... we have

P\left( \bigcup_{i=1}^{\infty} A_i \right) = \sum_{i=1}^{\infty} P(A_i).

Recall that two events A and B are independent if

P(A \cap B) = P(A) P(B).    (1)

For example, we usually assume that the probability that a fair coin flip comes up heads is one half, that is, P({H}) = 0.5. Furthermore, we assume that all flips of the same coin are independent. Note that we distinguish between an ordinary letter P and the probability function P.

3.2 Conditional Probabilities

A conditional probability is, intuitively, the probability that a certain event happens given that we know another one has happened. More concretely, given two events A and B, we define the conditional probability of A given B as

P(A \mid B) = \frac{P(A \cap B)}{P(B)}.

Note that if two events A and B are independent, then P(A | B) = P(A). This is intuitively clear, since independence should mean that B carries no information about A.
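To make this concrete (a minimal Python sketch assuming a fair coin, so each outcome in Ω gets probability 1/4), we can check the definition of conditional probability on the events A and B from above, and observe that they happen to be independent:

    from fractions import Fraction as F

    # Fair, independent flips: the uniform distribution on Omega.
    prob = {"HH": F(1, 4), "HT": F(1, 4), "TH": F(1, 4), "TT": F(1, 4)}

    def P(event):
        # Additivity: P(E) is the sum of the probabilities of the outcomes in E.
        return sum(prob[w] for w in event)

    A = {"HH", "TT"}        # both flips agree
    B = {"HH", "HT"}        # first flip is heads

    # Conditional probability: P(A | B) = P(A and B) / P(B).
    cond = P(A & B) / P(B)
    print(P(A), P(B), P(A & B), cond)
    # Here P(A and B) = 1/4 = P(A) P(B) and P(A | B) = 1/2 = P(A),
    # so A and B are independent, matching equation (1).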

An alternative way to view conditional probabilities is

P(A \cap B) = P(A \mid B) P(B).

This formulation is a bit more intuitive: it says that the probability of both A and B happening equals the probability that B happens times the probability that A happens given that B has already happened. From conditional probabilities we obtain Bayes' rule:

P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}.

We also have the law of total probability, which states that

P(A) = P(A \mid B) P(B) + P(A \mid B^c) P(B^c)

(recall that B^c is the set complement of B). Using Bayes' rule and the law of total probability, we have

P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B \mid A) P(A) + P(B \mid A^c) P(A^c)}.

3.3 Random Variables and Expectations

For the technically inclined, recall that a random variable X is a measurable real-valued function that maps outcomes ω ∈ Ω to the reals R. An example is a coin flip: we might take the random variable X to be the function that maps an outcome of heads to 1 and an outcome of tails to 0. This is a bit abstract, so we will often just talk about the probability that a random variable takes on a certain value.

3.3.1 Discrete Random Variables

Discrete random variables can take on a finite or countably infinite number of values. For each possible value x of the random variable X we assign some positive probability P(X = x) > 0.

Example (Binomial distribution). Suppose that a basketball player takes n free throws, that each free throw succeeds with probability p, and that all free throws are independent of each other. Let the random variable X_n count the total number of free throws that go in. Then

P(X_n = k) = \binom{n}{k} p^k (1 - p)^{n - k}   for k ∈ {0, 1, ..., n}.

The distribution of X_n is known as the binomial distribution with parameters n and p. Clearly, in n attempts the maximum number of successes is n and the minimum is 0, so X_n can take n + 1 different values; this is an example of a discrete random variable that takes on a finite number of values. Next we consider a discrete random variable that can take on an infinite number of values.

Example (Poisson distribution). A random variable X follows the Poisson distribution with parameter λ if

P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}

for any integer k ≥ 0. Here, X can take on a countably infinite number of values. Poisson random variables are frequently used to model the number of photons that hit a silicon sensor in extreme low-light situations (like the ones in your digital camera).

Often we will associate a probability mass function p_X(k), or PMF, with the discrete random variable X, such that p_X(k) = P(X = k). We will frequently drop the subscript X when the context is clear.
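To tie the binomial formula back to the free-throw story (a minimal Python sketch using only the standard library; the parameters n = 10, p = 0.7, and k = 7 are made up), we can compare the exact PMF to a Monte Carlo estimate, and also evaluate a Poisson PMF:

    import math
    import random

    n, p, k = 10, 0.7, 7    # made-up parameters: 10 free throws, 70% success rate

    # Exact binomial PMF: C(n, k) p^k (1 - p)^(n - k).
    exact = math.comb(n, k) * p**k * (1 - p)**(n - k)

    # Monte Carlo estimate: simulate many rounds of n independent free throws.
    random.seed(0)
    trials = 100_000
    hits = sum(sum(random.random() < p for _ in range(n)) == k
               for _ in range(trials))
    print(exact, hits / trials)       # the two numbers should be close

    # Poisson PMF with a made-up rate: P(X = k) = exp(-lam) lam^k / k!.
    lam = 3.0
    print(math.exp(-lam) * lam**k / math.factorial(k))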

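Before moving on to continuous random variables, here is a quick numerical check of Bayes' rule and the law of total probability from Section 3.2 (a minimal Python sketch; the three probabilities below are made up purely for illustration):

    from fractions import Fraction as F

    # Hypothetical numbers, chosen only to exercise the formulas above.
    p_A    = F(1, 100)     # P(A)
    p_B_A  = F(95, 100)    # P(B | A)
    p_B_Ac = F(5, 100)     # P(B | A^c)

    # Law of total probability: P(B) = P(B|A) P(A) + P(B|A^c) P(A^c).
    p_B = p_B_A * p_A + p_B_Ac * (1 - p_A)

    # Bayes' rule: P(A | B) = P(B|A) P(A) / P(B).
    p_A_B = p_B_A * p_A / p_B
    print(p_B, p_A_B, float(p_A_B))   # P(A | B) = 19/118, roughly 0.16
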
3.3.2 Continuous Random Variables

Next we consider continuous random variables. These random variables can take on any value in the real line; for example, the total time it takes you to finish your homework. A continuous random variable X has an associated probability density function, or PDF, f_X(t) if for all a ≤ b

P(a ≤ X ≤ b) = \int_a^b f_X(t) \, dt.

We often drop the subscript and write f(x), with the lower-case letter x, to make clear that f is the PDF of X. We require that f(x) ≥ 0 and \int f(x) \, dx = 1. Note that in the continuous case it is not true that P(X = x) = f_X(x).

3.3.3 Joint Distributions

Getting closer to stochastic processes, if we have a collection of random variables X_1, X_2, ..., X_n, we wish to discuss their joint randomness. When the random variables are discrete we can simply write

p(x_1, x_2, ..., x_n) = P(X_1 = x_1, ..., X_n = x_n)

and assign probabilities to those joint outcomes. That gives us the joint PMF. For continuous random variables we specify a joint PDF f(x_1, ..., x_n) such that

P((X_1, ..., X_n) ∈ A) = \int_{(x_1, x_2, ..., x_n) ∈ A} f(x_1, x_2, ..., x_n) \, dx_1 \, dx_2 \cdots dx_n.

From the joint distribution we can always recover the marginal distributions. In the discrete case we have

p_X(x) = P(X = x) = \sum_y p_{X,Y}(x, y),

and in the continuous case we have

f_X(x) = \int f_{X,Y}(x, y) \, dy.

Throughout the course, when we write a sum as \sum_x (or an integral as \int_y), we mean that the sum (respectively, the integral) should be taken over all possible values of the index.

3.3.4 Independence of Random Variables

We say that two random variables X and Y are independent if

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

Note that if two random variables X and Y are independent, then for any two functions f and g the random variables f(X) and g(Y) are also independent.
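As an illustration of recovering marginals from a joint PMF and testing independence (a minimal Python sketch; the joint PMF below is made up), consider two random variables taking values in {0, 1}:

    from fractions import Fraction as F

    # Made-up joint PMF p_{X,Y}(x, y) on {0, 1} x {0, 1}.
    joint = {(0, 0): F(1, 2), (0, 1): F(1, 4),
             (1, 0): F(1, 8), (1, 1): F(1, 8)}

    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})

    # Marginals: p_X(x) = sum over y of p_{X,Y}(x, y), and similarly for Y.
    p_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

    # X and Y are independent iff the joint factors into the marginals everywhere.
    independent = all(joint[(x, y)] == p_X[x] * p_Y[y] for x in xs for y in ys)
    print(p_X, p_Y, independent)      # here the joint does not factor, so False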

3.3.5 Expected Values

We denote the expected value of a random variable X by E(X). For a discrete-valued random variable we let

E(X) = \sum_x x \, P(X = x),

and for a continuous-valued random variable we let

E(X) = \int x \, f_X(x) \, dx.

We define the variance of a random variable to be

var(X) = E((X - E(X))^2),

and the mean is simply E(X). Recall that for two random variables X and Y,

E(X + Y) = E(X) + E(Y).
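As a final sanity check (a minimal Python sketch; the two PMFs are made up), we can compute expectations and variances from a PMF and confirm linearity of expectation by simulation:

    import random

    # Made-up PMFs for two independent discrete random variables X and Y.
    pmf_X = {0: 0.2, 1: 0.5, 2: 0.3}
    pmf_Y = {1: 0.6, 3: 0.4}

    def expectation(pmf):
        # E(X) = sum over x of x * P(X = x)
        return sum(x * p for x, p in pmf.items())

    def variance(pmf):
        # var(X) = E((X - E(X))^2)
        m = expectation(pmf)
        return sum((x - m) ** 2 * p for x, p in pmf.items())

    EX, EY = expectation(pmf_X), expectation(pmf_Y)
    print(EX, EY, variance(pmf_X))    # 1.1, 1.8, 0.49

    # Linearity: E(X + Y) = E(X) + E(Y), estimated by Monte Carlo.
    random.seed(0)
    def sample(pmf):
        return random.choices(list(pmf), weights=list(pmf.values()))[0]
    trials = 100_000
    estimate = sum(sample(pmf_X) + sample(pmf_Y) for _ in range(trials)) / trials
    print(estimate, EX + EY)          # the estimate should be close to 2.9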