Recap of Basic Probability Theory

02407 Stochastic Processes: Recap of Basic Probability Theory. Uffe Høgsbro Thygesen, Informatics and Mathematical Modelling, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark. Email: uht@imm.dtu.dk

Overview: stochastic experiments and the probability triple (Ω, F, P): Ω is the sample space (ω ∈ Ω); F is the set of events (A ∈ F ⟹ A ⊆ Ω); P is the probability measure (A ∈ F ⟹ P(A) ∈ [0, 1]). Then: random variables, distribution functions, conditioning.

Stochastic processes is applied probability. A firm understanding of probability (as taught in e.g. 02405) will get you far, but we need a more solid basis than most students develop in e.g. 02405. What to recap? The concepts are the most important: what is a stochastic variable, what is conditioning, etc. Also specific models and formulas: that a binomial distribution appears as the sum of Bernoulli variates, etc.

Stochastic experiments. We perform a stochastic experiment and use ω to denote the outcome. The sample space Ω is the set of all possible outcomes. (Diagram: the outcome ω as a point in the sample space Ω.)

The sample space Ω. Ω can be a very simple set, e.g. {H, T} (tossing a coin, a.k.a. a Bernoulli experiment), {1, 2, 3, 4, 5, 6} (throwing a die once), N (typical for single discrete stochastic variables), or R^d (typical for multivariate continuous stochastic variables); or a more complicated set, e.g. the set of all functions R → R^d with some regularity properties. Often we will not need to specify what Ω is.

Events. Events are sets of outcomes, i.e. subsets of Ω. Events correspond to statements about the outcome. For a die thrown once, the event A = {1, 2, 3} corresponds to the statement "the die showed no more than three". (Diagram: an event A as a region of Ω containing the outcome ω.)

Probability. A probability is a set measure of an event: if A is an event, then P(A) is the probability that the event A occurs in the stochastic experiment, a number between 0 and 1. (What exactly does this mean? Cf. Grimmett & Stirzaker, p. 5, and Appendix III.) Regardless of interpretation, we can pose simple conditions for mathematical consistency.

Logical operators as set operators. An important question: which events are measurable, i.e. have a probability assigned to them? We want our usual logical reasoning to work! So: if A and B are legal statements, represented by measurable subsets of Ω, then so are "not A", i.e. A^c = Ω \ A, and "A or B", i.e. A ∪ B.

Parallels between statements and sets.

Set                | Statement
A                  | The event A occurred (ω ∈ A)
A^c                | Not A
A ∩ B              | A and B
A ∪ B              | A or B
(A ∪ B) \ (A ∩ B)  | A exclusive-or B

See also Table 1.1 in Grimmett & Stirzaker, page 3.

An infinite, but countable, number of statements. For a sequence of Bernoulli experiments, we need statements like "at least one experiment shows heads" or "in the long run, every other experiment shows heads". So: if A_i are events for i ∈ N, then so is ∪_{i∈N} A_i.

All events considered form a σ-field F. Definition: 1. The empty set is an event, ∅ ∈ F. 2. Given a countable set of events A_1, A_2, ..., their union is also an event, ∪_{i∈N} A_i ∈ F. 3. If A is an event, then so is the complementary set A^c.

(Trivial) examples of σ-fields. 1. F = {∅, Ω}. This is the deterministic case: all statements are either true (∀ω) or false (∀ω). 2. F = {∅, A, A^c, Ω}. This corresponds to the Bernoulli experiment or tossing a coin: the event A corresponds to heads. 3. F = 2^Ω, the set of all subsets of Ω. When Ω is finite or countable, we can actually work with 2^Ω; otherwise not.

Define P(A) for all events A. 1. P(∅) = 0, P(Ω) = 1. 2. If A_1, A_2, ... are mutually exclusive events (i.e. A_i ∩ A_j = ∅ for i ≠ j), then P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i). A P : F → [0, 1] satisfying these is called a probability measure. The triple (Ω, F, P) is called a probability space.
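These axioms are easy to verify concretely on a finite sample space. Below is a minimal Python sketch (my own illustration, not part of the slides), representing Ω as a set, P by its point masses in a dict, and an event as a subset:

    # A probability measure on a finite sample space: one throw of a fair die.
    omega = {1, 2, 3, 4, 5, 6}
    p = {w: 1 / 6 for w in omega}   # P({w}) = 1/6 for every outcome w

    def prob(event):
        """P(A) for an event A, represented as a subset of omega."""
        return sum(p[w] for w in event)

    # Checking the axioms directly on this finite space:
    assert prob(set()) == 0.0                                # P(empty) = 0
    assert abs(prob(omega) - 1.0) < 1e-12                    # P(Omega) = 1
    A, B = {1, 2, 3}, {4, 5}                                 # disjoint events
    assert abs(prob(A | B) - (prob(A) + prob(B))) < 1e-12    # additivity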

Conditioning. In stochastic processes, we want to know what to expect from the future, conditional on our past observations. P(A | B) = P(A ∩ B) / P(B). (Diagram: a Venn diagram of the events A, B, and A ∩ B inside Ω.)

Be careful when conditioning! If you are not careful about specifying the events involved, you can easily obtain wrong conclusions. Example: a family has two children. Each child is a boy with probability 1/2, independently of the other. Given that at least one is a boy, what is the probability that they are both boys? The meaningless answer: P(two boys | at least one boy) = P(other child is a boy) = 1/2. The right answer: P(two boys | at least one boy) = P(two boys) / P(at least one boy) = (1/4) / (3/4) = 1/3.
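The right answer is easy to confirm by simulation. A short Monte Carlo sketch (my own illustration, not from the slides):

    import random

    random.seed(1)
    both = at_least_one = 0
    for _ in range(10**6):
        boys = [random.random() < 0.5, random.random() < 0.5]  # two children
        if any(boys):
            at_least_one += 1
            both += all(boys)

    # Prints roughly 0.333, i.e. 1/3 rather than 1/2.
    print(both / at_least_one)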

Lemma: the law of total probability. Let B_1, ..., B_n be a partition of Ω (i.e., mutually disjoint with ∪_{i=1}^n B_i = Ω). Then P(A) = Σ_{i=1}^n P(A | B_i) P(B_i).
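As a toy numeric illustration (my own example numbers, not from the slides): pick one of two coins at random (the partition B_1, B_2), then compute the overall probability of heads (the event A):

    P_B = [0.5, 0.5]          # P(B_i): choose the fair or the biased coin
    P_A_given_B = [0.5, 0.9]  # P(A | B_i): probability of heads per coin

    P_A = sum(pa_b * pb for pa_b, pb in zip(P_A_given_B, P_B))
    print(P_A)                # 0.5*0.5 + 0.9*0.5 = 0.7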

Independence. Events A and B are called independent if P(A ∩ B) = P(A)P(B). When 0 < P(B) < 1, this is the same as P(A | B) = P(A) = P(A | B^c). A family {A_i : i ∈ I} of events is called independent if P(∩_{i∈J} A_i) = ∏_{i∈J} P(A_i) for any finite subset J of I.

Random variables. Informally: a quantity which is assigned by a stochastic experiment. Formally: a mapping X : Ω → R. A technical comment: we want P(X ≤ x) to be well defined, so we require ∀x ∈ R : {ω : X(ω) ≤ x} ∈ F.

Examples of stochastic variables. Indicator functions: X(ω) = I_A(ω) = 1 when ω ∈ A, 0 else. Bernoulli variables: Ω = {H, T}, X(H) = 1, X(T) = 0.

Distribution functions. Properties of F(x) = P(X ≤ x): 1. lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1. 2. x < y ⟹ F(x) ≤ F(y). 3. F is right-continuous, i.e. F(x + h) → F(x) as h ↓ 0.

Discrete and continuous variables. Discrete variables: Im X is a countable set, so F_X is a step function. Typically Im X ⊆ Z. Continuous variables: X has a probability density function (pdf) f, i.e. F(x) = ∫_{−∞}^x f(u) du, so F is differentiable.

A variable which is neither continuous nor discrete. Let X ~ U(0, 1), i.e. uniform on [0, 1], so that F_X(x) = x for 0 ≤ x ≤ 1. Toss a fair coin: if heads, then set Y = X; if tails, then set Y = 0. Then F_Y(y) = 1/2 + y/2 for 0 ≤ y ≤ 1. We say that Y has an atom at 0.
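A quick simulation sketch (not from the slides) of the construction, comparing the empirical distribution function of Y with F_Y(y) = 1/2 + y/2:

    import random

    random.seed(2)
    n = 10**6
    # Heads (prob 1/2): Y = X ~ U(0,1).  Tails: Y = 0 (the atom).
    ys = [random.random() if random.random() < 0.5 else 0.0 for _ in range(n)]

    for y in (0.0, 0.25, 0.5, 1.0):
        empirical = sum(s <= y for s in ys) / n
        print(y, empirical, 0.5 + 0.5 * y)   # empirical vs. theoretical F_Y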

Mean and variance. The mean of a stochastic variable is EX = Σ_{i∈Z} i P(X = i) in the discrete case, and EX = ∫_{−∞}^{+∞} x f(x) dx in the continuous case. In both cases we assume that the sum/integral converges absolutely. The variance of X is VX = E(X − EX)^2 = EX^2 − (EX)^2.
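For example, the definitions give the familiar values for one throw of a fair die; a small sketch (my own example):

    pmf = {i: 1 / 6 for i in range(1, 7)}   # fair six-sided die

    EX = sum(i * p for i, p in pmf.items())       # mean
    EX2 = sum(i**2 * p for i, p in pmf.items())   # second moment
    VX = EX2 - EX**2                              # variance

    print(EX, VX)   # 3.5 and 35/12 ≈ 2.917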

Conditional expectation E(Y | X). The conditional expectation is the mean in the conditional distribution: E(Y | X = x) = Σ_y y f_{Y|X}(y | x). It can be seen as a stochastic variable: let ψ(x) = E(Y | X = x); then ψ(X) is the conditional expectation of Y given X. We have ψ(X) = E(Y | X) and E(E(Y | X)) = EY.

Conditional variance V(Y | X). The conditional variance is the variance in the conditional distribution: V(Y | X = x) = Σ_y (y − ψ(x))^2 f_{Y|X}(y | x). This can also be written as V(Y | X) = E(Y^2 | X) − (E(Y | X))^2, and can be manipulated into (try it!) VY = E V(Y | X) + V E(Y | X), which partitions the variance of Y.
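The decomposition can be checked on a small discrete model. The following sketch uses my own toy example (not from the slides): X is a fair coin flip, and given X the variable Y is uniform on {1, ..., 6} or on {1, ..., 4}:

    def moments(support):
        """Mean and variance of a uniform distribution on the given support."""
        vals = list(support)
        m1 = sum(vals) / len(vals)
        m2 = sum(y * y for y in vals) / len(vals)
        return m1, m2 - m1 * m1

    (m0, v0), (m1, v1) = moments(range(1, 7)), moments(range(1, 5))

    # Left-hand side: VY computed from the joint distribution of (X, Y).
    joint = ([(y, 0.5 / 6) for y in range(1, 7)]
             + [(y, 0.5 / 4) for y in range(1, 5)])
    EY = sum(y * p for y, p in joint)
    VY = sum(y * y * p for y, p in joint) - EY**2

    # Right-hand side: E V(Y|X) + V E(Y|X).
    EVar = 0.5 * v0 + 0.5 * v1
    VarE = 0.5 * m0**2 + 0.5 * m1**2 - (0.5 * m0 + 0.5 * m1) ** 2

    print(VY, EVar + VarE)   # both sides equal 7/3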

Random vectors. When a single stochastic experiment defines the value of several stochastic variables. Example: throw a dart and record both vertical and horizontal distance to the center. X = (X_1, ..., X_n) : Ω → R^n. Random vectors, too, are characterised by the distribution function F : R^n → [0, 1]: F(x) = P(X_1 ≤ x_1, ..., X_n ≤ x_n), where x = (x_1, ..., x_n).

Example. In one experiment, we toss two fair coins and assign the results to V and X. In another experiment, we toss one fair coin and assign the result to both Y and Z. V, X, Y and Z are all identically distributed, but (V, X) and (Y, Z) are not identically distributed: e.g. P(V = X = heads) = 1/4, while P(Y = Z = heads) = 1/2.

The Bernoulli process: plan. Start by working out one single Bernoulli experiment. Then consider a finite number of Bernoulli experiments: the binomial distribution. Next, a sequence of Bernoulli experiments: the Bernoulli process. Finally, waiting times in the Bernoulli process: the negative binomial distribution.

The Bernoulli experiment. A Bernoulli experiment models e.g. tossing a coin. The sample space is Ω = {H, T}. The events are F = {∅, {H}, {T}, Ω} = 2^Ω = {0, 1}^Ω. The probability P : F → [0, 1] is defined by (!) P({H}) = p. The stochastic variable X : Ω → R with X(H) = 1, X(T) = 0 is Bernoulli distributed with parameter p.

A finite number of Bernoulli variables. We toss a coin n times. The sample space is Ω = {H, T}^n; for the case n = 2, this is {TT, TH, HT, HH}. The events are F = 2^Ω = {0, 1}^Ω. How many events are there? |F| = 2^|Ω| = 2^(2^n) (a lot). Introduce A_i for the event "the i-th toss showed heads".

Probabilities for n tosses. The probability P : F → [0, 1] is defined by P(A_i) = p and by requiring that the events {A_i : i = 1, ..., n} are independent. From this we derive P({ω}) = p^k (1 − p)^(n−k) if ω has k heads and n − k tails, and from that the probability of any event.

The binomial distribution. Define the stochastic variable X as the number of heads, X = Σ_{i=1}^n 1(A_i). To find its probability mass function, consider the probabilities P(X = x), which is shorthand for P({ω : X(ω) = x}). The event {X = x} has (n choose x) elements, and each ω ∈ {X = x} has probability P({ω}) = p^x (1 − p)^(n−x), so P(X = x) = (n choose x) p^x (1 − p)^(n−x).

Properties of B(n, p). Probability mass function: f_X(x) = P(X = x) = (n choose x) p^x (1 − p)^(n−x). Cumulative distribution function: F_X(x) = P(X ≤ x) = Σ_{i=0}^x f_X(i). Mean value: EX = E Σ_{i=1}^n 1(A_i) = Σ_{i=1}^n P(A_i) = np. Variance: VX = Σ_{i=1}^n V 1(A_i) = np(1 − p), because {A_i : i = 1, ..., n} are (pairwise) independent.
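A simulation sketch (not from the slides): build X as a sum of n indicators and compare the empirical mean and variance with np and np(1 − p):

    import random

    random.seed(3)
    n, p, reps = 10, 0.3, 10**5
    xs = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]

    mean = sum(xs) / reps
    var = sum(x * x for x in xs) / reps - mean**2
    print(mean, n * p)            # empirical vs. theoretical mean, np = 3.0
    print(var, n * p * (1 - p))   # empirical vs. theoretical variance, 2.1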

Problem 3.11.8. Let X ~ B(n, p) and Y ~ B(m, p) be independent. Show that Z = X + Y ~ B(n + m, p). Solution: consider n + m independent Bernoulli trials, each with success probability p. Set X = Σ_{i=1}^n 1(A_i) and Y = Σ_{i=n+1}^{n+m} 1(A_i). Then X and Y are as in the problem, and Z = Σ_{i=1}^{n+m} 1(A_i) ~ B(n + m, p).
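The identity can also be verified numerically: convolving the pmfs of X and Y must reproduce the B(n + m, p) pmf. A sketch with arbitrary example parameters:

    from math import comb

    def binom_pmf(n, p, k):
        """pmf of B(n, p) at k."""
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    n, m, p = 4, 6, 0.3
    for z in range(n + m + 1):
        conv = sum(binom_pmf(n, p, x) * binom_pmf(m, p, z - x)
                   for x in range(max(0, z - m), min(n, z) + 1))
        assert abs(conv - binom_pmf(n + m, p, z)) < 1e-12
    print("convolution matches B(n + m, p)")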

The Bernoulli process. A sequence of Bernoulli experiments. The sample space Ω is the set of functions N → {0, 1}. Introduce events A_i for "the i-th toss showed heads"; strictly, A_i = {ω : ω(i) = 1}. Let F be the smallest σ-field that contains all A_i.

Probabilities in the Bernoulli process. Define (!) P : F → [0, 1] by P(A_i) = p and by requiring that {A_i : i ∈ N} are independent.

Waiting times in the Bernoulli process. Let W_r be the waiting time for the r-th success: W_r = min{i : Σ_{j=1}^i 1(A_j) = r}. To find the probability mass function of W_r, note that {W_r = k} is the same event as {Σ_{i=1}^{k−1} 1(A_i) = r − 1} ∩ A_k. Since the two events involved here are independent, we get f_W(k) = P(W_r = k) = (k−1 choose r−1) p^r (1 − p)^(k−r), the negative binomial distribution.
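A simulation sketch (my own illustration, not from the slides): generate waiting times for the r-th success and compare the empirical frequencies with the pmf above:

    import random
    from math import comb

    random.seed(4)
    p, r, reps = 0.4, 3, 10**5

    def waiting_time():
        """Number of trials until the r-th success in a Bernoulli process."""
        successes = k = 0
        while successes < r:
            k += 1
            successes += random.random() < p
        return k

    counts = {}
    for _ in range(reps):
        k = waiting_time()
        counts[k] = counts.get(k, 0) + 1

    for k in range(r, r + 5):
        theory = comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)
        print(k, counts.get(k, 0) / reps, theory)   # empirical vs. pmf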

The geometric distribution. The waiting time W to the first success has P(W = k) = (1 − p)^(k−1) p (first k − 1 failures and then one success). The survival function is G_W(k) = P(W > k) = (1 − p)^k, since W > k exactly when the first k trials all fail.
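As a quick sanity check (a sketch with arbitrary parameters), summing the pmf over k' > k reproduces the survival function:

    p, k = 0.3, 5
    # P(W > k) as a (truncated) tail sum of the geometric pmf.
    tail = sum((1 - p) ** (j - 1) * p for j in range(k + 1, k + 400))
    print(tail, (1 - p) ** k)   # both ≈ 0.16807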

Summary. We need to be precise in our use of probability theory, at least until we have developed intuition. When in doubt, ask: what is the stochastic experiment? What is the probability triple? Which event am I considering? Venn diagrams are very useful. This holds particularly for conditioning, which is central to stochastic processes. Indicator functions are powerful tools, once mastered. You need to know the distributions that can be derived from the Bernoulli process: the binomial, geometric, and negative binomial distributions.