Lecture 2: Discrete Probability Distributions


Lecture 2: Discrete Probability Distributions
IB Paper 7: Probability and Statistics
Carl Edward Rasmussen, Department of Engineering, University of Cambridge
February 1st, 2011
Rasmussen (CUED), Lecture 2: Discrete Probability Distributions, 16 slides

Two useful rules for computing discrete probabilities

When a set of outcomes are all equally likely, the probability of an event is the number of outcomes consistent with the event divided by the total number of possible outcomes.

Example: What is the probability that someone has their birthday on Dec 31st? It's 1/365. Example: What is the probability that someone has their birthday in April? Roughly 1/12.

The probability that two independent events happen simultaneously is given by the product of their probabilities.

Example: What is the probability that two people both have their birthday in January? It's 1/12 × 1/12 = 1/144. Example: What is the probability that two people both have their birthday in the same month? It's roughly 1/12. There are several ways to think about this.

Computing the probability of more complicated events

What is the smallest number of people you need to ensure that the probability that 2 people have identical birthdays is at least a half?

First, work out the probability q_n that with n people, none share birthdays. Arbitrarily, order the people. The birthday of the first person is irrelevant. The second person must have a birthday different from the first, p_2 = 364/365. The third person must have a birthday different from the first and the second, p_3 = 363/365. Generally, for the i-th person, p_i = (366 - i)/365. Because of independence,

q_n = \prod_{i=1}^{n} p_i = \frac{365 \cdot 364 \cdots (366 - n)}{365^n} = \frac{365!}{365^n \, (365 - n)!}.

We're interested in the smallest value of n for which q_n < 1/2. A (tedious) calculation shows n = 23, with q_23 = 0.493.
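
The "tedious calculation" is easy to automate; as a sketch (not part of the lecture), compute q_n directly and search for the smallest n with q_n < 1/2:

```python
# Sketch: brute-force check of the birthday problem result n = 23.

def prob_all_distinct(n):
    """q_n = product over i = 1..n of (366 - i) / 365."""
    q = 1.0
    for i in range(1, n + 1):
        q *= (366 - i) / 365
    return q

n = 1
while prob_all_distinct(n) >= 0.5:
    n += 1
print(n, round(prob_all_distinct(n), 3))  # 23 0.493
```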

The Bernoulli Distribution

The simplest distribution is the Bernoulli distribution: X ~ Ber(p), where 0 ≤ p ≤ 1. The random variable X is binary, taking value 1 with probability p(X = 1) = p and value 0 with probability p(X = 0) = 1 - p. The probability is sometimes written concisely as

p(X = x) = p^x (1 - p)^{1 - x}.

Example: In a simple climate model, you might have the random variable X indicating whether it rains on a particular day. You could then have X ~ Ber(p), where different values of p would be used for different areas.

Warning: on this page the symbol p (intentionally) has two different meanings: a probability or a parameter. Underline all the occurrences where p is a parameter.
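
A minimal sketch of the concise pmf form; the rain probability 0.3 is a made-up value for the climate example, not from the lecture:

```python
# Sketch: the Bernoulli pmf p(X = x) = p^x (1 - p)^(1 - x) for x in {0, 1}.

def bernoulli_pmf(x, p):
    return p**x * (1 - p)**(1 - x)

p_rain = 0.3  # hypothetical parameter for one area
print(bernoulli_pmf(1, p_rain))  # p(X = 1) = p
print(bernoulli_pmf(0, p_rain))  # p(X = 0) = 1 - p
```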

Characterising the Bernoulli Distribution

The mean or expectation of the Bernoulli is

E[X] = \sum_{x \in X} x \, p(x) = 0 \cdot (1 - p) + 1 \cdot p = p,

i.e., the mean of a Bernoulli with parameter p is equal to p itself.

The variance of a distribution measures how much the outcomes spread. The variance is defined as

V[X] = \mathrm{Var}[X] = E[(X - E[X])^2] = \sum_{x \in X} (x - E[X])^2 \, p(x).

For the Bernoulli random variable:

V[X] = \sum_{x \in X} (x - p)^2 \, p(x) = p^2 (1 - p) + (1 - p)^2 p = ((1 - p) + p) \, p (1 - p) = p(1 - p).

Thus, the maximum variance of the Bernoulli is 1/4, attained when p = 1/2. Does this seem reasonable?

The entropy of the Bernoulli, -E[\log_2 p(X)] = -p \log_2(p) - (1 - p) \log_2(1 - p), is maximally 1 bit, attained when p = 1/2.
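
The claim that both the variance and the entropy peak at p = 1/2 can be checked numerically; a sketch (entropy in bits, using base-2 logs):

```python
import math

# Sketch: Bernoulli variance p(1-p) and entropy both peak at p = 1/2.

def bern_var(p):
    return p * (1 - p)

def bern_entropy(p):
    # -p log2(p) - (1-p) log2(1-p), with H(0) = H(1) = 0 by convention
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

ps = [i / 100 for i in range(101)]
assert max(ps, key=bern_var) == 0.5      # variance maximised at p = 1/2
assert max(ps, key=bern_entropy) == 0.5  # entropy maximised at p = 1/2
print(bern_var(0.5), bern_entropy(0.5))  # 0.25 1.0
```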

Variance and Moments

The variance is always non-negative. It is sometimes useful to write the variance as

V[X] = \sum_{x \in X} (x - E[X])^2 \, p(x) = \sum_{x \in X} \big( x^2 + E[X]^2 - 2 x E[X] \big) \, p(x)
     = E[X^2] + E[X]^2 - 2 E[X]^2 = E[X^2] - E[X]^2.

Here, E[X^2] is called the second moment. Similarly, E[X] is called the first moment. The variance is also called the second central moment.

Example: Verify the above rule for the Bernoulli. The second moment is 0^2 (1 - p) + 1^2 p = p. The square of the first moment is p^2. The variance is thus p - p^2 = p(1 - p).
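
A numerical version of the Bernoulli verification, as a sketch with an arbitrarily chosen p:

```python
# Sketch: verify V[X] = E[X^2] - E[X]^2 on a Bernoulli with p = 0.3.

p = 0.3
pmf = {0: 1 - p, 1: p}
ex = sum(x * q for x, q in pmf.items())               # first moment E[X]
ex2 = sum(x**2 * q for x, q in pmf.items())           # second moment E[X^2]
var_direct = sum((x - ex)**2 * q for x, q in pmf.items())
assert abs(var_direct - (ex2 - ex**2)) < 1e-12
print(round(ex2 - ex**2, 6))  # p(1 - p) = 0.21
```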

The Binomial Distribution

The Binomial is the number of successes r in n independent, identical Bernoulli trials, each with probability of success p. We write X ~ B(n, p), where n = 0, 1, 2, ..., and 0 ≤ p ≤ 1. The value of the probability function is

p(X = r) = {}_n C_r \, p^r (1 - p)^{n - r}.

The notation {}_n C_r denotes the combinations "n choose r", the number of ways one can choose r out of n when the order doesn't matter:

{}_n C_r = \binom{n}{r} = \frac{n!}{(n - r)! \, r!}.

The Bernoulli is a special case of the Binomial: Ber(p) = B(1, p).
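
A sketch of the pmf, using Python's `math.comb` for n choose r:

```python
from math import comb

# Sketch: Binomial pmf p(X = r) = C(n, r) p^r (1 - p)^(n - r).

def binomial_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Ber(p) is the special case B(1, p):
print(binomial_pmf(1, 1, 0.3))  # equals p
# A pmf must sum to one over r = 0..n:
print(round(sum(binomial_pmf(r, 10, 0.4) for r in range(11)), 10))  # 1.0
```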

Refresher on Permutations and Combinations

If you don't remember this, refer to the Part 1A maths notes for more details.

The number of ways in which you can permute r elements from a total of n (when order is important) is

{}_n P_r = \frac{n!}{(n - r)!} = n (n - 1)(n - 2) \cdots (n - r + 1),

since for the first item you can choose between n, for the second n - 1, and so on.

The number of ways in which you can choose r elements from a total of n (when order is not important) is

{}_n C_r = \frac{{}_n P_r}{r!} = \frac{n!}{(n - r)! \, r!},

as there are r! orderings of the permutations which give rise to the same choice.

Warning: n! is difficult to compute with for large n. Stirling's approximation: n! \approx \sqrt{2 \pi n} \, n^n \exp(-n).
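
As a sketch, Stirling's approximation can be compared against the exact factorial for a moderate n:

```python
import math

# Sketch: Stirling's approximation n! ~ sqrt(2*pi*n) * n^n * exp(-n).

def stirling(n):
    return math.sqrt(2 * math.pi * n) * n**n * math.exp(-n)

n = 20
ratio = stirling(n) / math.factorial(n)
print(round(ratio, 4))  # just below 1; the relative error is roughly 1/(12n)
```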

Binomial Example

A semiconductor factory produces CPUs on silicon wafers, 100 CPUs on each wafer. Some of the CPUs don't work. Cutting the chips from the wafer is expensive, so wafers with many failing units are discarded. Assume that each CPU fails independently with probability p = 0.05. What is the probability p(X ≥ 5) that 5 or more CPUs fail?

Answer: Let the random variable X be the number of failures on a wafer:

p(X \geq 5) = 1 - p(X \leq 4) = 1 - \sum_{i=0}^{4} {}_{100} C_i \, p^i (1 - p)^{100 - i}
            \approx 1 - 0.006 - 0.031 - 0.081 - 0.140 - 0.178 = 0.564.

This shows that just over half the wafers will have 5 failures or more.
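
A sketch reproducing the wafer calculation by summing the pmf terms directly:

```python
from math import comb

# Sketch: P(X >= 5) for X ~ B(100, 0.05), via the complement P(X <= 4).

def binomial_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

p_lt5 = sum(binomial_pmf(r, 100, 0.05) for r in range(5))
p_ge5 = 1 - p_lt5
print(round(p_ge5, 3))  # 0.564
```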

Some Binomial Distributions

Below is an illustration of the Binomial probability distribution for various settings of the parameters n and p.

[Figure: Binomial pmfs for (n = 20, p = 0.5), (n = 20, p = 0.7) and (n = 40, p = 0.5), plotted as probability against number of successes.]

Characterising the Binomial Distribution

The expectation of the Binomial is

E[X] = \sum_{r=0}^{n} r \, p(X = r) = \sum_{r=0}^{n} r \, {}_n C_r \, p^r (1 - p)^{n - r} = \sum_{r=1}^{n} \frac{r \, n!}{(n - r)! \, r!} \, p^r (1 - p)^{n - r}
     = np \sum_{r=1}^{n} \frac{(n - 1)!}{(n - r)! \, (r - 1)!} \, p^{r - 1} (1 - p)^{n - r}
     = np \sum_{\tilde r = 0}^{\tilde n} \frac{\tilde n!}{(\tilde n - \tilde r)! \, \tilde r!} \, p^{\tilde r} (1 - p)^{\tilde n - \tilde r} = np,

where \tilde n = n - 1 and \tilde r = r - 1, and using the fact that the Binomial normalises to one.

In fact, the result is not surprising, since the Binomial gives the number of successes in n independent Bernoulli trials, and the Bernoulli has expectation p.

Variance of the Binomial

We can compute the variance in a way analogous to the expectation, by doing the substitution twice. The result is

V[X] = np(1 - p).

These are instances of two general rules which we will see later: when you add independent random variables, the means add up, and the variances add up.
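
Both results can be sanity-checked by summing over the pmf directly; a sketch with arbitrarily chosen n and p:

```python
from math import comb

# Sketch: check E[X] = np and V[X] = np(1 - p) for the Binomial numerically.

def binomial_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 40, 0.3
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var = sum((r - mean)**2 * binomial_pmf(r, n, p) for r in range(n + 1))
print(round(mean, 6), round(var, 6))  # np = 12.0 and np(1-p) = 8.4
```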

Independent Event Arrivals

Imagine phone calls arriving randomly and independently at a telephone exchange, with an average rate (or intensity) λ.

[Figure: four histograms of the same 25 arrivals over the interval t = 0 to 1, at successively finer time resolutions.]

What is the probability distribution over the number of arrivals per second?

Note: as the histogram bins get smaller, the probability of multiple arrivals in a bin decreases, and each bin tends to a Bernoulli event.

Examining a Sequence of Bernoulli Events

Therefore, we look at getting r arrivals in a Binomial B(n, p), where the number of bins n grows. As the bins get smaller, the probability of an arrival within a bin falls in proportion to the bin size, p = λ/n. So we want p(X = r) = B(n, λ/n) as n → ∞:

p(X = r) = \lim_{n \to \infty} B(n, \lambda/n)
         = \lim_{n \to \infty} \frac{n!}{(n - r)! \, r!} \Big( \frac{\lambda}{n} \Big)^r \Big( 1 - \frac{\lambda}{n} \Big)^{n - r}
         = \lim_{n \to \infty} \underbrace{\frac{n}{n} \, \frac{n - 1}{n} \cdots \frac{n - r + 1}{n}}_{\to 1} \; \frac{\lambda^r}{r!} \; \underbrace{\Big( 1 - \frac{\lambda}{n} \Big)^n}_{\to \exp(-\lambda)} \; \underbrace{\Big( 1 - \frac{\lambda}{n} \Big)^{-r}}_{\to 1}
         = \frac{\lambda^r \exp(-\lambda)}{r!}.

This is called the Poisson distribution, with intensity λ.
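
The limit can be watched numerically; a sketch evaluating B(n, λ/n) at one value of r as n grows:

```python
from math import comb, exp, factorial

# Sketch: B(n, lambda/n) approaches the Poisson pmf as n grows.

lam, r = 4.0, 2
poisson = lam**r * exp(-lam) / factorial(r)
for n in [10, 100, 1000, 10000]:
    p = lam / n
    binom = comb(n, r) * p**r * (1 - p)**(n - r)
    print(n, round(binom, 6))
print("Poisson limit:", round(poisson, 6))
```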

The Poisson Distribution

If events happen randomly at independent times, with an average intensity (number of events per unit time) λ, then the number of events r within a unit time interval is Poisson distributed, X ~ Po(λ):

p(X = r) = \frac{\lambda^r \exp(-\lambda)}{r!}, \quad \text{where } \lambda \geq 0.

Examples: radioactive decay, network traffic, customer arrivals, etc.

The expectation and variance of the Poisson are both E[X] = V[X] = λ, which can be derived from the limiting Binomial.
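
A sketch of the pmf, with a numerical check of E[X] = V[X] = λ by truncating the infinite sums:

```python
from math import exp, factorial

# Sketch: Poisson pmf, and a check that mean and variance both equal lambda.

def poisson_pmf(r, lam):
    return lam**r * exp(-lam) / factorial(r)

lam = 3.0
# Truncate the sums at r = 100; the tail beyond that is negligible for lam = 3.
mean = sum(r * poisson_pmf(r, lam) for r in range(100))
var = sum((r - mean)**2 * poisson_pmf(r, lam) for r in range(100))
print(round(mean, 6), round(var, 6))  # both 3.0
```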

Some Poisson Distributions

Below are some Poisson distributions with different intensities.

[Figure: Poisson pmfs for λ = 1, λ = 4 and λ = 10, plotted as probability against number of events.]

Because Binomial distributions with large n are tedious to compute with, one can approximate the Binomial with the Poisson with the same mean, i.e. B(n, p) ≈ Po(np). The approximation is good when n is large (say n > 50) and p is small (say p < 0.1).
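
The quality of the approximation in the recommended regime can be sketched by measuring the largest pointwise pmf difference between B(n, p) and Po(np):

```python
from math import comb, exp, factorial

# Sketch: compare B(n, p) with its Poisson approximation Po(np)
# in the regime the slide recommends (n large, p small).

n, p = 100, 0.05  # np = 5
max_err = max(
    abs(comb(n, r) * p**r * (1 - p)**(n - r)
        - (n * p)**r * exp(-n * p) / factorial(r))
    for r in range(n + 1)
)
print(round(max_err, 4))  # largest pointwise pmf difference; small
```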