MATH 2030 3.00MW Elementary Probability Course Notes Part I: Models and Counting

MATH 2030 3.00MW Elementary Probability Course Notes Part I: Models and Counting Tom Salisbury salt@yorku.ca York University Winter 2010

Introduction [Jan 5] Probability: the mathematics used for Statistics. The two are related but different. Probability question: assume a coin is fair and flip it 50 times. What is the probability of getting 28 heads? Statistics question: flip a coin 50 times and observe 28 heads. Do you believe it is fair? Probability is also fundamental in: mathematical modelling, modern physics (statistical mechanics), electrical engineering (signal noise), computer science (simulation), finance (option pricing), etc. Frequentist interpretation: repeat an experiment a large number of times; the probability of an event is the relative frequency with which it occurs, in the long run. Other interpretations: level of uncertainty, or subjective belief. First task: build a mathematical model that captures our empirical sense of probabilities.

Model [Jan 7] A model has three ingredients:
a non-empty set Ω;
a collection F of subsets of Ω, satisfying: Ω ∈ F; A ∈ F ⇒ A^c ∈ F; A_1, A_2, … ∈ F ⇒ ∪_i A_i ∈ F;
a function P : F → R satisfying: P(A) ≥ 0 for all A ∈ F; P(Ω) = 1; A_1, A_2, … ∈ F disjoint ⇒ P(∪_i A_i) = Σ_i P(A_i).
Remarks: Here A^c = Ω \ A, where B \ A = {ω ∈ B | ω ∉ A}. The model is (Ω, F, P). F as above is called a σ-field, and P a probability measure. The sequence A_1, A_2, … above can be finite or countable, and the last property is called countable additivity.

Interpretation [Jan 7] Elements ω ∈ Ω are called outcomes (or sample points, or states of nature). Ω is the sample space, ie. the set of all possible outcomes. We think of the experiment as choosing ω at random from Ω. So knowing what ω is chosen determines everything else: ω should code within it all information we care about, and answers to all questions we will ask. Events A are elements of F, so A ⊆ Ω. If ω is the outcome picked, then we say A occurs if ω ∈ A. P(A) is the probability that the event A occurs. F is the set of events to which we can assign probabilities. Random variables are measurements whose values depend on ω. Formally, they are functions X : Ω → R such that {ω | X(ω) ≤ x} ∈ F for every x ∈ R.

Example [Jan 7] Flip a fair coin twice: eg. if the first flip is heads and the second is tails we represent this by ω = (H, T). So Ω = {(H, H), (H, T), (T, H), (T, T)} has 4 elements. We'll assign prob's to all subsets of Ω, so F consists of all 16 = 2^4 subsets of Ω. eg. the event that the flips agree is {(H, H), (T, T)}, and contains 2 outcomes. Fair means all 4 outcomes are equally likely. So let P(A) = #(A)/4. eg. the prob of 2 heads is P({(H, H)}) = 1/4. eg. the prob that the flips agree is P({(H, H), (T, T)}) = 1/2.
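The two-flip model is small enough to write out directly. A minimal Python sketch (illustrative, not part of the notes) of the equally likely rule P(A) = #(A)/4:

```python
from itertools import product
from fractions import Fraction

# Sample space for two fair coin flips: all ordered pairs of H/T.
omega = list(product("HT", repeat=2))  # [('H','H'), ('H','T'), ('T','H'), ('T','T')]

def prob(event):
    """Equally likely outcomes: P(A) = #(A) / #(Omega)."""
    return Fraction(len(event), len(omega))

two_heads = [w for w in omega if w == ("H", "H")]
flips_agree = [w for w in omega if w[0] == w[1]]

print(prob(two_heads))    # 1/4
print(prob(flips_agree))  # 1/2
```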

Example (continued) [Jan 7] Let the random variable X be the number of heads. X((H, H)) = 2, X((H, T)) = 1 = X((T, H)), X((T, T)) = 0. So P({ω | X(ω) = 2}) = 1/4, P({ω | X(ω) = 1}) = 1/2. Remark: We abbreviate these as P(X = 2) and P(X = 1). More generally, for any real number x we write P(X = x) as an abbreviation for P({ω | X(ω) = x}). There are 16 events in F: 1 with 0 outcomes, namely ∅; 4 with 1 outcome, eg {(H, H)} (both flips are heads); 6 with 2 outcomes, eg {(H, H), (H, T)} (first flip is a head); 4 with 3 outcomes, eg {(H, H), (H, T), (T, H)} (at least one is a head); 1 with 4 outcomes, namely Ω itself.

Example (continued) [Jan 7] In fact, to write it all out:
F = { ∅, {(H, H)}, {(H, T)}, {(T, H)}, {(T, T)}, {(H, H), (H, T)}, {(H, H), (T, H)}, {(H, H), (T, T)}, {(H, T), (T, H)}, {(H, T), (T, T)}, {(T, H), (T, T)}, {(H, H), (H, T), (T, H)}, {(H, H), (H, T), (T, T)}, {(H, H), (T, H), (T, T)}, {(H, T), (T, H), (T, T)}, {(H, H), (H, T), (T, H), (T, T)} }.
To model flipping unfair (or linked) coins, keep the same F but change P.

Interpretation [Jan 10] A^c is the complementary event to A: A^c occurs ⇔ A doesn't. ∅ never occurs: a null event. Ω always occurs: a sure event. A ∪ B occurs ⇔ A occurs or B occurs (ie if at least one does). A ∩ B occurs ⇔ both A and B occur. A and B are disjoint ⇔ mutually exclusive ⇔ A ∩ B = ∅. A ⊆ B means that ω ∈ A ⇒ ω ∈ B; that is, if A occurs, then B occurs too. For discrete Ω (ie finite or countable), we can take F to consist of ALL subsets of Ω, and ALL functions X : Ω → R as r.v. This won't work in general, but in this course we will behave as though it is always the case; ie we will mostly ignore F from now on.

Models [Jan 10] Whether we can use simple models depends on the questions we'll ask, since ω has to code all the answers. Eg to model the stock market for a month we would take each ω to be a list of ALL prices at ALL times. Eg a full statistical mechanics model takes ω to code the positions and momenta of ALL particles. Often we need to know that a model exists, but don't want to actually write it down. We have a choice of many ways to model the same experiment. In fact (as soon as we have enough foundation) we will try to do higher-level calculations without actually specifying the details of the model we're using.

Properties / Consequences of the axioms [Jan 10]
P(A^c) = 1 − P(A) (as A ∪ A^c = Ω disjointly).
0 ≤ P(A) ≤ 1 (as P(A^c) ≥ 0).
A ⊆ B ⇒ P(A) ≤ P(B) (as B = A ∪ (B \ A) disjointly).
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) (pf: break A ∪ B up disjointly as (A \ B) ∪ (A ∩ B) ∪ (B \ A), then use additivity).
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C). This is called inclusion/exclusion. The proof can be given as above, or using Venn diagrams.
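These consequences are easy to check mechanically on a small example. The Python sketch below (illustrative; the sets A, B, C are arbitrary choices, not from the notes) verifies inclusion/exclusion and the complement rule on an equally likely sample space:

```python
from fractions import Fraction

# Equally likely sample space Omega = {0, ..., 11}.
omega = set(range(12))

def P(E):
    return Fraction(len(E), len(omega))

# Arbitrary example events (an assumption for illustration).
A, B, C = {0, 1, 2, 3, 4}, {3, 4, 5, 6}, {0, 4, 6, 7, 8}

# Inclusion/exclusion for three events.
lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))
assert lhs == rhs

# Complement rule: P(A^c) = 1 - P(A).
assert P(omega - A) == 1 - P(A)
```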

Discrete Models [Jan 12] A model is discrete if Ω = {ω_1, ω_2, …} is finite or countable. In that case, P is determined by fixing numbers p_1, p_2, … such that 0 ≤ p_i ≤ 1 for all i, and Σ_i p_i = 1. Just take P(A) = Σ_{i s.t. ω_i ∈ A} p_i. Special case: equally likely outcome models: Ω finite, all p_i equal. Then p_i = 1/#(Ω), and P(A) = #(A)/#(Ω). In this case we will calculate by counting.

More complicated models [Jan 12] Not all models are discrete. Example: Model for a uniform r.v.: Ω = [0, 1), P(A) = |A| (the length of A). (F is complicated: it is the Borel sets. This is a case where F CAN'T be all subsets, only reasonable ones.) Take Y(ω) = ω ∈ R. eg. P(Y ≤ x) = |[0, x]| = x, for 0 ≤ x < 1. We'll study such r.v. later (using calculus). Example: Model for many coin flips. Ω = [0, 1), P(A) = |A| (as before). Write ω = 0.ω_1ω_2ω_3… in binary (eg 1/2 = .1, 3/4 = .11, 1/4 = .01). If 0 represents T and 1 represents H, then the ith coin flip r.v. is X_i(ω) = ω_i. eg. P(X_1 = 0) = |[0, 1/2)| = 1/2, P(X_2 = 0) = |[0, 1/4) ∪ [1/2, 3/4)| = 1/2, P(X_1 = 0 and X_2 = 1) = |[1/4, 1/2)| = 1/4.

Discrete r.v. [Jan 12] A r.v. has a discrete distribution if there are only finitely or countably many values it can take. In that case, the distribution is given by a table of values x and probabilities P(X = x).
Eg. Number of H in 2 coin flips:
x:        0    1    2
P(X=x):  1/4  1/2  1/4
Eg. Roll 2 dice, X = sum of the numbers showing. Take an equally likely outcome model, Ω = {(i, j) | 1 ≤ i, j ≤ 6} = {(1, 1), …, (6, 6)}. So {ω | X(ω) = 2} = {(1, 1)} has probability 1/36, and {ω | X(ω) = 3} = {(1, 2), (2, 1)} has probability 2/36.
x:        2     3     4     5     6     7     8     9     10    11    12
P(X=x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
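The dice-sum table can be reproduced by enumerating the 36 equally likely outcomes. A small Python sketch (illustrative, not from the notes):

```python
from itertools import product
from fractions import Fraction

# Roll two fair dice; X = sum of the numbers showing.
# Equally likely model: Omega = {(i, j) : 1 <= i, j <= 6}, 36 outcomes.
omega = list(product(range(1, 7), repeat=2))

pmf = {}
for i, j in omega:
    x = i + j
    pmf[x] = pmf.get(x, 0) + Fraction(1, 36)

print(pmf[2])   # 1/36
print(pmf[7])   # 6/36, i.e. 1/6
assert sum(pmf.values()) == 1  # probabilities sum to 1
```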

Counting [Jan 14] We want to count #(A), the number of elements in a set A. Do this by finding a list of k properties that each element has. We need to do this so the following hold: each property can be specified some fixed number of ways, regardless of how the other properties were specified (let n_i be the number of choices for the ith property); and there is a one-to-one correspondence between ways of specifying all these properties and elements of A. Then #(A) = n_1 · n_2 · · · n_k. [Basic counting principle] Eg. Dice: We just saw that there were 36 ways to roll two dice in sequence. In this framework, there are two properties in our list: the first roll (6 possibilities) and the second roll (6 possibilities); 36 = 6 · 6.

Counting [Jan 14] Subsets: If A has n elements, how many subsets B ⊆ A are there? Here there are n properties in our list: Is the 1st element in or out? Is the 2nd element in or out? etc. There are two possibilities for each, so there are 2 · 2 · · · 2 = 2^n subsets. Eg. Cards: Each card has two properties: a value and a suit. There are 13 ways of specifying the value (Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King) and 4 ways of specifying the suit (clubs, diamonds, hearts, spades). Specifying a card is the same as specifying its value and suit, so #(cards) = 13 · 4 = 52. [In future, spades and clubs will be called black suits, and diamonds and hearts will be called red suits.]

Eg. 3-card hands [Jan 14] Deal out 3 cards in sequence, without replacement. ie a card hand is (a, b, c) where a is the 1st card dealt, b is the 2nd card dealt, and c is the 3rd card dealt. We want to count the number of:
3-card hands: Imagine ranking the 52 cards in some way. The properties are: the first card (52 ways); the ranking of the 2nd card among the remaining cards (51 ways); the ranking of the 3rd card among the remaining cards (50 ways). So #(hands) = 52 · 51 · 50.
3-card hands with the same value: 52 · 3 · 2.
3-card hands with the same suit: 52 · 12 · 11.
3-card hands with 1 pair (ie 2 cards of equal value, the other different). Properties: value for the pair, 1st pair suit, 2nd pair suit, other card, position in the deal for the other card: 13 · 4 · 3 · 48 · 3.
[Bad properties: 1st card (52 ways), 2nd card (51 ways), 3rd card (48 or 6 ways, depending on the 1st and 2nd choices). So we would have to break the count up as 52 · 3 · 48 + 52 · 48 · 6.]
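All of these counts can be checked by brute-force enumeration of the ordered 3-card hands. A Python sketch (the value/suit encoding is an arbitrary choice for illustration):

```python
from itertools import permutations

# Deck: 13 values x 4 suits. Enumerate all ordered 3-card hands and
# check the counts given by the basic counting principle.
values = list(range(13))
suits = "CDHS"
deck = [(v, s) for v in values for s in suits]

hands = list(permutations(deck, 3))
assert len(hands) == 52 * 51 * 50        # all ordered 3-card hands

same_value = [h for h in hands if h[0][0] == h[1][0] == h[2][0]]
assert len(same_value) == 52 * 3 * 2     # same value: 312

same_suit = [h for h in hands if h[0][1] == h[1][1] == h[2][1]]
assert len(same_suit) == 52 * 12 * 11    # same suit: 6864

def n_pairs(hand):
    """0 if all values distinct, 1 for exactly one pair, 2 for three of a kind."""
    vals = [card[0] for card in hand]
    return len(vals) - len(set(vals))

one_pair = [h for h in hands if n_pairs(h) == 1]
assert len(one_pair) == 13 * 4 * 3 * 48 * 3  # exactly one pair: 22464
```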

Permutations [Jan 14] Choose k distinct objects from n objects, in order. In other words, let #(A) = n, and take
(n)_k = #{(a_1, a_2, …, a_k) | each a_i ∈ A, all a_i distinct},
the number of permutations of k objects from n objects. So
(n)_k = n(n − 1)(n − 2) · · · (n − k + 1) = n!/(n − k)!,
where n! = n(n − 1) · · · (3)(2)(1). [Sometimes people write nPk for (n)_k.]
Eg: The number of ways of ordering n objects is (n)_n = n!.
Eg: The number of 3-card hands (dealt in order) is (52)_3 = 52 · 51 · 50.
Convention: 0! = 1 and (n)_0 = 1.

Combinations [Jan 17] Choose k distinct objects from n objects, without order. In other words, let #(A) = n, and take
C(n, k) = #{B | B is a k-element subset of A},
the number of combinations of k objects from n objects. This is read "n choose k" [and sometimes people write nCk for C(n, k)]. It is also called a binomial coefficient. If we count permutations by counting first the subset, and then the way it is ordered, we get (n)_k = C(n, k) · k!, that is,
C(n, k) = n!/(k!(n − k)!).
So C(n, n) = 1 = C(n, 0), C(n, 1) = n, and C(n, k) = C(n, n − k).
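The identity (n)_k = C(n, k) · k! and the symmetry C(n, k) = C(n, n − k) can be checked numerically. A sketch using Python's math module (illustrative, not from the notes):

```python
import math

def falling(n, k):
    """(n)_k = n(n-1)...(n-k+1): permutations of k objects chosen from n."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

n, k = 52, 3
# (n)_k = n!/(n-k)!
assert falling(n, k) == math.factorial(n) // math.factorial(n - k)
assert falling(52, 3) == 52 * 51 * 50

# Counting a subset and then its orderings: (n)_k = C(n, k) * k!
assert falling(n, k) == math.comb(n, k) * math.factorial(k)

# Symmetry and edge cases.
assert math.comb(n, k) == math.comb(n, n - k)
assert math.comb(n, 0) == math.comb(n, n) == 1
```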

Poker hands without order [Jan 17] Deal out 5 cards without replacement, and without distinguishing on the basis of the sequence they're dealt in. ie a card hand is a 5-card subset of the set of 52 cards. We want to count the number of:
hands: C(52, 5) = 2,598,960.
flushes: C(4, 1) · C(13, 5) = 5,148. [Really we should subtract the number of straight flushes.]
3 of a kind: C(13, 1) · C(12, 2) · C(4, 3) · C(4, 1) · C(4, 1). [Choose the value for the triple, the other values, the 3 suits, the suit for the lowest other value, the suit for the highest other value. Note that we are excluding a full house or 4 of a kind.]
pairs: C(13, 1) · C(12, 3) · C(4, 2) · C(4, 1) · C(4, 1) · C(4, 1).
In a fair deal, all outcomes are equally likely. We can work out probabilities either with order (permutations) or without (combinations). The counting is different, but the probabilities should be the same.
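These hand counts are straightforward to evaluate with binomial coefficients. A Python sketch (illustrative, not from the notes):

```python
from math import comb

# Poker-hand counts, computed with binomial coefficients.
hands = comb(52, 5)
assert hands == 2_598_960

flushes = comb(4, 1) * comb(13, 5)  # note: still includes straight flushes
assert flushes == 5148

three_of_a_kind = comb(13, 1) * comb(12, 2) * comb(4, 3) * comb(4, 1) * comb(4, 1)
pairs = comb(13, 1) * comb(12, 3) * comb(4, 2) * comb(4, 1) * comb(4, 1) * comb(4, 1)

# Probabilities in a fair deal (equally likely 5-card subsets).
print(flushes / hands)  # about 0.00198
print(pairs / hands)    # about 0.4226
```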

Binomial theorem [Jan 17]
(a + b)^n = Σ_{k=0}^{n} C(n, k) a^k b^{n−k}.
Expand (a + b)^n = (a + b)(a + b) · · · (a + b). We get a sum of terms c_1 c_2 · · · c_n where each c_i is a or b. How many are a^k b^{n−k}? Choose the k locations corresponding to a's, so C(n, k).
Eg. (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4.
Recurrence: C(n, k) = C(n − 1, k − 1) + C(n − 1, k). [k-element subsets of {1, 2, …, n} either include n or they don't: C(n − 1, k − 1) do, and C(n − 1, k) don't.] This gives Pascal's triangle (see text): an inefficient way of finding one binomial coefficient, but a good way of finding a whole row.
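The recurrence is exactly how one generates a whole row of Pascal's triangle cheaply. A small sketch (illustrative, not from the notes):

```python
def pascal_row(n):
    """Row n of Pascal's triangle, built via C(n,k) = C(n-1,k-1) + C(n-1,k)."""
    row = [1]  # row 0
    for _ in range(n):
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

print(pascal_row(4))  # [1, 4, 6, 4, 1] -- the coefficients of (a + b)^4

# Binomial theorem check at a = b = 1: the row sums to 2^n.
assert sum(pascal_row(10)) == 2 ** 10
```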

Binomial distribution [Jan 17] Let 0 ≤ p ≤ 1. Say that a random variable X has a binomial distribution with parameters n and p if
P(X = k) = C(n, k) p^k (1 − p)^{n−k} for k = 0, 1, …, n.
This makes sense, because all these are ≥ 0, and by the binomial theorem they sum to (p + (1 − p))^n = 1^n = 1. We'll study this distribution later, where it will be important for things like counting the number of successes in independent trials.
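A quick numerical sketch (illustrative, not from the notes) checking that these probabilities are nonnegative and sum to 1, and answering the probability question from the introduction (28 heads in 50 fair flips):

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) = C(n, k) p^k (1-p)^(n-k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 50, 0.5
probs = [binom_pmf(n, p, k) for k in range(n + 1)]

# Nonnegative, and sums to (p + (1-p))^n = 1 (up to floating point).
assert all(q >= 0 for q in probs)
assert abs(sum(probs) - 1) < 1e-12

# 28 heads in 50 flips of a fair coin.
print(binom_pmf(50, 0.5, 28))  # about 0.0788
```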

Sampling [Jan 19] With replacement: An urn has 12 red balls and 8 green balls. Pick 5, with replacement. P(3R, 2G) = ? Take Ω = all sequences (a_1, a_2, …, a_5) chosen from {R_1, R_2, …, R_12, G_1, …, G_8}. #(Ω) = 20^5. The event A has #(A) = C(5, 3) · 12^3 · 8^2. So
P(A) = C(5, 3) (12/20)^3 (8/20)^2 = 0.3456.
[In fact, the number of reds has a binomial distribution.]
Without replacement: Now Ω = all 5-element subsets of {R_1, R_2, …, R_12, G_1, …, G_8}. #(Ω) = C(20, 5) = 15,504 and #(A) = C(12, 3) · C(8, 2) = 6,160, so
P(A) = C(12, 3) C(8, 2) / C(20, 5) ≈ 0.3973.
[Now the number of reds has what is called a hypergeometric distribution.]
Either case could be done with or without order, but these choices are easiest.
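Both sampling schemes can be computed exactly. A Python sketch (illustrative, not from the notes; fractions keep the arithmetic exact):

```python
from math import comb
from fractions import Fraction

# Urn: 12 red, 8 green; draw 5. P(3 red, 2 green) under both schemes.

# With replacement: binomial count of reds, success probability 12/20.
p_with = comb(5, 3) * Fraction(12, 20) ** 3 * Fraction(8, 20) ** 2
print(float(p_with))     # 0.3456

# Without replacement: hypergeometric.
p_without = Fraction(comb(12, 3) * comb(8, 2), comb(20, 5))
print(float(p_without))  # about 0.3973
```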