ST 371 (V): Families of Discrete Distributions

Certain experiments and associated random variables can be grouped into families, where all random variables in the family share a common structure and a particular member of the family is described by one or more parameters. For discrete random variables, we will look at the following families: (1) Binomial (special case: Bernoulli); (2) Poisson; (3) Geometric; (4) Negative Binomial; (5) Hypergeometric.

1 Bernoulli/Binomial Probability Distribution

Bernoulli Random Variable: Suppose that a trial (experiment) whose outcome can be classified as either a success or a failure is performed. Let the random variable X equal 1 when the outcome is a success and 0 when the outcome is a failure. Then the pmf of X is

p(0) = P{X = 0} = 1 - p,    p(1) = P{X = 1} = p,

where p is the probability that the trial is a success. A random variable X that has the above pmf for 0 < p < 1 is said to be a Bernoulli(p) random variable; p is the parameter of the distribution.

Binomial Random Variables: Suppose that n independent trials are performed, each of which results in a success with probability p and a failure with probability 1 - p. If X represents the number of successes that occur in the n trials, then X is said to be a binomial random variable with parameters (n, p). Thus, a Bernoulli random variable is just a binomial random variable with parameters (1, p). The pmf of a binomial random variable with parameters (n, p) is given by

(1.1)    p(i) = \binom{n}{i} p^i (1 - p)^{n-i},    i = 0, 1, ..., n.

The validity of (1.1) is verified by first noting that the probability of any particular sequence of n outcomes containing i successes and n - i failures is, by the assumed independence of trials, p^i (1 - p)^{n-i}. Equation (1.1) follows because there are \binom{n}{i} different sequences of the n outcomes leading to i successes and n - i failures (we are choosing which i of the n possible slots hold the successes).

Examples of binomial random variables:

Example 1 Samuel Pepys was a great diarist of the English language and a friend of Isaac Newton's. Pepys was a gambler and wrote to Newton to ask which of three events is the most likely: (a) at least one six comes up when 6 fair dice are rolled; (b) at least two sixes come up when 12 dice are rolled; (c) at least three sixes come up when 18 dice are rolled. What is the answer? Note that the number of sixes in n independent rolls of a fair die is Binomial(n, 1/6).
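Since each count of sixes is Binomial(n, 1/6), the three probabilities can be evaluated with pbinom; a quick check in R:

# P(at least one six in 6 rolls), P(at least two in 12), P(at least three in 18)
1 - pbinom(0, size = 6, prob = 1/6)    # approx 0.665
1 - pbinom(1, size = 12, prob = 1/6)   # approx 0.619
1 - pbinom(2, size = 18, prob = 1/6)   # approx 0.597

So (a) is the most likely of the three events, which is the answer Newton gave Pepys.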

Example 2 Each day the price of a new computer stock moves up one point or down one point with probabilities 0.75 and 0.25, respectively. What is the probability that after six days the stock will have returned to its original quotation? Assume that the daily price fluctuations are independent events. Note that the number of days on which the stock moves up is Binomial(6, 0.75).

Example 3 On her way to work, a commuter encounters four traffic signals. The distance between successive signals is sufficiently great that the probability of getting a green light at any intersection is independent of what happened at any prior intersection. If each light is green for 40 seconds of every minute, what is the probability that the driver has to stop at least three times? Note that the number of times that the driver has to stop is Binomial(4, 1/3).
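For Example 2 the stock returns to its original quotation exactly when it moves up on 3 of the 6 days, and for Example 3 we want P(X >= 3) with X ~ Binomial(4, 1/3). A minimal R sketch:

dbinom(3, size = 6, prob = 0.75)       # Example 2: exactly 3 up-days, approx 0.132
1 - pbinom(2, size = 4, prob = 1/3)    # Example 3: at least 3 stops, equals 1/9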

Properties of a Binomial Random Variable: If X ~ Binomial(n, p), then E(X) = np, Var(X) = np(1 - p), and σ_X = \sqrt{np(1 - p)} (the proof is not required).

Example 4 On a multiple choice test that has 100 questions with five possible answers for each question, we would expect to get 20 questions right if we were just guessing: E(X) = np = 100(1/5) = 20. The standard deviation of the number of questions we would get right is SD(X) = \sqrt{Var(X)} = \sqrt{np(1 - p)} = \sqrt{100(1/5)(4/5)} = 4.

Binomial Random Variables in R: The command rbinom(m,size,prob) simulates m binomial (n=size, p=prob) random variables, e.g.,

> rbinom(2,10,.5)
[1] 6 4

simulates the number of heads in 10 fair coin tosses two times. The command dbinom(x,size,prob) calculates the probability that a binomial (n=size, p=prob) random variable equals x, e.g.,

> dbinom(4,10,.5)
[1] 0.2050781

is the probability of getting 4 heads in 10 fair coin tosses. The command pbinom(q,size,prob) calculates the probability that a binomial (n=size, p=prob) random variable is less than or equal to q, e.g.,

> pbinom(4,10,.5)
[1] 0.3769531

is the probability of getting 4 or fewer heads in 10 fair coin tosses.
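A small simulation can be used to check the formulas in Example 4 empirically; the sample mean and sample standard deviation of many simulated scores should be close to 20 and 4 (the replicate count 100000 below is an arbitrary choice).

guesses <- rbinom(100000, size = 100, prob = 1/5)  # simulated numbers of correct guesses
mean(guesses)   # should be close to E(X) = 20
sd(guesses)     # should be close to SD(X) = 4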

2 Poisson Probability Distribution

2.1 Poisson Random Variable

A random variable X taking on one of the values 0, 1, 2, ... is said to be a Poisson random variable with parameter λ if for some λ > 0,

p(i) = P{X = i} = e^{-λ} \frac{λ^i}{i!},    i = 0, 1, 2, ....

This is a pmf because

\sum_{i=0}^{\infty} p(i) = e^{-λ} \sum_{i=0}^{\infty} \frac{λ^i}{i!} = e^{-λ} e^{λ} = 1.

Applications of Poisson random variables: The Poisson family of random variables provides a good model for the number of successes in an experiment consisting of a large number of independent trials with a small probability of success for each trial (since the number of successes is a binomial random variable with n large and p small). Examples of random phenomena that are accurately modeled as Poisson random variables include:

- The number of misprints on a page (or a group of pages) of a book
- The number of people in a community living to 100 years of age
- The number of wrong telephone numbers that are dialed in a day
- The number of people arriving at a bus stop within an hour

Example 5 Let X be the number of drivers who travel between a particular origin and destination during a given time period. Suppose X has a Poisson distribution with parameter λ = 20. What is the probability that the number of drivers will be at most 10? exceed 20? be between 10 and 20 inclusive?
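The three probabilities asked for in Example 5 are P(X <= 10), P(X > 20), and P(10 <= X <= 20); each can be read off from the Poisson cdf. A short R sketch:

ppois(10, lambda = 20)                            # P(X <= 10)
1 - ppois(20, lambda = 20)                        # P(X > 20)
ppois(20, lambda = 20) - ppois(9, lambda = 20)    # P(10 <= X <= 20)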

2.2 The Poisson Distribution as a Limit

Poisson random variables provide an approximation to a binomial random variable when n is large and p is small enough so that np = λ is of moderate size. Let X be a binomial random variable with parameters (n, p) and let λ = np. Then

P(X = i) = \frac{n!}{(n-i)! \, i!} p^i (1 - p)^{n-i}
         = \frac{n!}{(n-i)! \, i!} \left(\frac{λ}{n}\right)^i \left(1 - \frac{λ}{n}\right)^{n-i}
         = \frac{n(n-1) \cdots (n-i+1)}{n^i} \cdot \frac{λ^i}{i!} \cdot \frac{(1 - λ/n)^n}{(1 - λ/n)^i}.

For n large, λ moderate and i considerably smaller than n we have

\left(1 - \frac{λ}{n}\right)^n \approx e^{-λ},    \frac{n(n-1) \cdots (n-i+1)}{n^i} \approx 1,    \left(1 - \frac{λ}{n}\right)^i \approx 1,

so that P(X = i) \approx e^{-λ} \frac{λ^i}{i!}.

The following tables give an idea of the accuracy of the Poisson approximation to the binomial. For n = 100, p = 1/100, the Poisson approximation is remarkably good.

Binomial and Poisson probabilities for n = 5, p = 1/5 and np = 1

  x    B(5, 0.2)    Poi(1)
  0    0.328        0.368
  1    0.410        0.368
  2    0.205        0.184
  3    0.051        0.061
  4    0.006        0.015
  5    0.000        0.003
  6    0.000        0.001

Binomial and Poisson probabilities for n = 100, p = 0.01 and np = 1

  x    B(100, 0.01)    Poi(1)
  0    0.366032        0.367879
  1    0.369730        0.367879
  2    0.184865        0.183940
  3    0.060999        0.061313
  4    0.014942        0.015328
  5    0.002898        0.003066
  6    0.000463        0.000511
  7    0.000063        0.000073
  8    0.000007        0.000009
  9    0.000001        0.000001
  10   0.000000        0.000000
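Tables like these are easy to reproduce with dbinom and dpois; for instance, the second comparison can be regenerated (up to rounding) as follows.

x <- 0:10
round(cbind(x, binomial = dbinom(x, size = 100, prob = 0.01), poisson = dpois(x, lambda = 1)), 6)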

Example 6 A chromosome mutation believed to be linked with color blindness is known to occur, on the average, once in every 10,000 births. If 20,000 babies are born this year in a certain city, what is the probability that at least one will develop color blindness?

An advantage of the Poisson approximation to the binomial is that we do not need to know the precise number of trials and the precise value of the probability of success; it is enough to know what the product of these two values is.

Example 7 Suppose that in a certain country commercial airplane crashes occur at the rate of 2.5 per year. Find the probability that four or more crashes will occur next year.
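For Example 6 the number of affected babies is approximately Poisson with λ = np = 20000 × (1/10000) = 2, and for Example 7 the yearly number of crashes is modeled as Poisson(2.5). A brief R sketch of both calculations:

1 - dpois(0, lambda = 2)     # Example 6: P(at least one case) = 1 - e^(-2), approx 0.865
1 - ppois(3, lambda = 2.5)   # Example 7: P(four or more crashes), approx 0.242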

2.3 The Properties of the Poisson Distribution

Let X be a Poisson(λ) random variable. Then

E(X) = \sum_{i=0}^{\infty} i e^{-λ} \frac{λ^i}{i!}
     = λ \sum_{i=1}^{\infty} e^{-λ} \frac{λ^{i-1}}{(i-1)!}
     = λ e^{-λ} \sum_{j=0}^{\infty} \frac{λ^j}{j!}    (by letting j = i - 1)
     = λ e^{-λ} e^{λ} = λ

and

E(X^2) = \sum_{i=0}^{\infty} i^2 e^{-λ} \frac{λ^i}{i!}
       = λ \sum_{i=1}^{\infty} i e^{-λ} \frac{λ^{i-1}}{(i-1)!}
       = λ \sum_{j=0}^{\infty} (j + 1) e^{-λ} \frac{λ^j}{j!}    (by letting j = i - 1)
       = λ \left[ \sum_{j=0}^{\infty} j e^{-λ} \frac{λ^j}{j!} + \sum_{j=0}^{\infty} e^{-λ} \frac{λ^j}{j!} \right]
       = λ(λ + 1),

where the final equality follows since the first sum is the expected value of a Poisson random variable with parameter λ and the second is the sum of the probabilities of this random variable. Therefore,

Var(X) = E(X^2) - (E(X))^2 = λ(λ + 1) - λ^2 = λ.
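The identities E(X) = Var(X) = λ are easy to check by simulation; a minimal sketch (the choice of λ = 3 and 100000 replicates below is arbitrary):

x <- rpois(100000, lambda = 3)
mean(x)   # should be close to 3
var(x)    # should also be close to 3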

R Commands for the Poisson distribution: The command rpois(n,lambda) simulates n Poisson random variables with parameter lambda. The command dpois(x,lambda) computes the probability that a Poisson random variable with parameter lambda equals x. The command ppois(x,lambda) computes the probability that a Poisson random variable with parameter lambda is less than or equal to x.

2.4 The Poisson Process: Poisson random variables as the number of events occurring in a time period

Another use of the Poisson probability distribution, besides approximating the binomial for large n and small p, is to model the number of events occurring in a certain period of time, e.g.,

- the number of earthquakes occurring during some fixed time span
- the number of wars per year
- the number of electrons emitted from a radioactive source during a given period of time
- the number of freak accidents, such as falls in the shower, for a large population during a given period of time (used by insurance companies)
- the number of vehicles that pass a marker on a roadway during a given period of time.

Let X denote the number of events occurring in a certain period of time. Suppose that for a positive constant λ the following assumptions hold true:

1. The probability that exactly 1 event occurs in a given interval of length h is equal to λh + o(h), where o(h) stands for any function f(h) such that \lim_{h \to 0} f(h)/h = 0 [for instance, f(h) = h^2 is o(h) whereas f(h) = h is not].

2. The probability that 2 or more events occur in an interval of length h is equal to o(h).

3. For any integers n, j_1, ..., j_n and any set of n nonoverlapping intervals, if we define E_i to be the event that exactly j_i of the events under consideration occur in the ith of these intervals, then the events E_1, ..., E_n are independent.

Under Assumptions 1-3, the number of events occurring in any interval of length t is a Poisson random variable with parameter λt.

Real-life examples:

1. In the 432 years from 1500 to 1931, war broke out somewhere in the world a total of 299 times (by definition, a military action was a war if it either was legally declared, involved over 50,000 troops, or resulted in significant boundary realignments). The following table gives the distribution of the number of years in which x wars broke out and the expected frequencies for a Poisson (λ = 0.69) random variable.

  # of wars in a given year    Observed frequency    Expected frequency
  0                            223                   217
  1                            142                   149
  2                            48                    52
  3                            15                    12
  4                            4                     2
  Total                        432                   432

2. During World War II, London was heavily bombed by V-2 guided ballistic rockets. These rockets, luckily, were not particularly accurate at hitting targets. The number of direct hits in the southern section of London has been analyzed by splitting the area up into 576 sectors measuring one quarter of a square kilometer each. The average number of direct hits per sector was 0.9323. The fit of a Poisson distribution with λ = 0.9323 to the observed frequencies is excellent:

  Hits       Actual frequency    Expected frequency
  0          229                 226.74
  1          211                 211.39
  2          93                  98.54
  3          35                  30.62
  4          7                   7.14
  5 or more  1                   1.57

Example 8 Bacteria are distributed throughout a volume of liquid according to the three assumptions with an intensity of θ = 0.6 organisms per mm^3. A measuring device counts the number of bacteria in a 10 mm^3 volume of the liquid. What is the probability that more than two bacteria are in this measured volume?
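In Example 8 the count in the 10 mm^3 sample is Poisson with parameter θt = 0.6 × 10 = 6, so the requested probability is P(X > 2). A quick R check:

1 - ppois(2, lambda = 6)   # P(more than two bacteria), approx 0.938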

Example 9 Suppose that earthquakes occur in the western portion of the United States in accordance with assumptions 1, 2 and 3, with λ = 2 and with 1 week as the unit of time (that is, earthquakes occur in accordance with the three assumptions at a rate of 2 per week). Find the probability that at least 3 earthquakes occur during the next 2 weeks. Find the probability distribution of the time, starting from now, until the next earthquake.
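For the first part, the number of earthquakes in 2 weeks is Poisson with parameter λt = 2 × 2 = 4. For the second part, the waiting time T until the next earthquake satisfies P(T > t) = P(no events in (0, t]) = e^{-2t}, so T has an exponential distribution with rate 2. A short R sketch of both:

1 - ppois(2, lambda = 4)   # P(at least 3 earthquakes in 2 weeks), approx 0.762
1 - pexp(1, rate = 2)      # e.g. P(T > 1 week) = e^(-2), approx 0.135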

2.5 Poisson Distribution: Putting It Together

The Poisson distribution arises in two settings: (1) it provides an approximation to the binomial distribution when n is large, p is small and λ = np is moderate; (2) it models the number of events that occur in a time period t when

(a) the probability of exactly one event occurring in a given small time period of length h is approximately λh;
(b) the probability of two or more events occurring in a given small time period of length h is much smaller than λh;
(c) the numbers of events occurring in two non-overlapping time periods are independent.

When (a), (b) and (c) are satisfied, the number of events occurring in a time period t has a Poisson(λt) distribution. The parameter λ is called the rate of the Poisson distribution; λ is the mean number of events that occur in a time period of length 1. The mean number of events that occur in a time period of length t is λt, and the variance of the number of events is also λt.

Sketch of proof for the Poisson distribution under (a)-(c): For a large value of n, we can divide the time period t into n nonoverlapping intervals of length t/n. The number of events occurring in time period t is then approximately Binomial(n, λt/n). Using the Poisson approximation to the binomial, the number of events occurring in time period t is approximately Poisson(n · λt/n) = Poisson(λt). Taking the limit as n → ∞ yields the result.

The Poisson distribution also applies to the number of events occurring in space. Instead of intervals of length t, we have domains of area or volume t. Assumptions (a)-(c) become:

(a') the probability of an event occurring in a given small region of area or volume h is approximately λh;
(b') the probability of two or more events occurring in a given small region of area or volume h is much smaller than λh;

(c') the numbers of events occurring in two non-overlapping regions are independent.

The parameter λ of a Poisson distribution for the number of events occurring in space is called the intensity.

3 Other Discrete Probability Distributions

3.1 Geometric Distribution

Suppose that independent trials, each having probability p, 0 < p < 1, of being a success, are performed until a success occurs. Let X be the random variable that denotes the number of trials required. The probability mass function of X is

(3.2)    P(X = n) = (1 - p)^{n-1} p,    n = 1, 2, ....

The pmf follows because in order for X to equal n, it is necessary and sufficient that the first n - 1 trials are failures and the nth trial is a success. A random variable that has the pmf (3.2) is called a geometric random variable with parameter p. The expected value and variance of a Geometric(p) random variable are

E(X) = \frac{1}{p},    Var(X) = \frac{1 - p}{p^2}.

Example 10 A fair die is tossed. What is the probability that the first six occurs on the fourth roll? What is the expected number of tosses needed to toss the first six?
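In Example 10 the number of rolls until the first six is Geometric(1/6), so the probability that the first six occurs on the fourth roll is (5/6)^3 (1/6) and the expected number of rolls is 1/(1/6) = 6. Note that R's dgeom counts the number of failures before the first success rather than the number of trials, so the fourth-roll probability corresponds to 3 failures:

dgeom(3, prob = 1/6)   # (5/6)^3 * (1/6), approx 0.096
(5/6)^3 * (1/6)        # same value, computed directly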

3.2 Negative Binomial Distribution

Suppose that independent trials, each having probability p, 0 < p < 1, of being a success, are performed until r successes occur. Let X be the random variable that denotes the number of trials required. The probability mass function of X is

(3.3)    P{X = n} = \binom{n-1}{r-1} p^r (1 - p)^{n-r},    n = r, r + 1, ....

A random variable whose pmf is given by (3.3) is called a negative binomial random variable with parameters (r, p). Note that a geometric random variable is a negative binomial random variable with parameters (1, p). The expected value and variance of a negative binomial random variable are

E(X) = \frac{r}{p},    Var(X) = \frac{r(1 - p)}{p^2}.

Example 11 Suppose that an underground military installation is fortified to the extent that it can withstand up to four direct hits from air-to-surface missiles and still function. Enemy aircraft can score direct hits with these particular missiles with probability 0.7. Assume all firings are independent. What is the probability that a plane will require fewer than 8 shots to destroy the installation? What is the expected number of shots required to destroy the installation?
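Reading "destroy" in Example 11 as scoring a fifth direct hit, the number of shots required is negative binomial with r = 5 and p = 0.7, so we want P(X <= 7) and E(X) = 5/0.7. In R, pnbinom is parameterized by the number of failures (misses) rather than the number of trials, so at most 7 shots means at most 2 misses before the fifth hit:

pnbinom(2, size = 5, prob = 0.7)   # P(fewer than 8 shots), approx 0.647
5 / 0.7                            # expected number of shots, approx 7.14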

3.3 Hypergeometric Distribution

Suppose that a sample of size n is to be chosen randomly (without replacement) from an urn containing N balls, of which m are white and N - m are black. If we let X be the random variable that denotes the number of white balls selected, then

(3.4)    P(X = i) = \frac{\binom{m}{i} \binom{N-m}{n-i}}{\binom{N}{n}},    i = 0, 1, ..., n.

A random variable X whose pmf is given by (3.4) is said to be a hypergeometric random variable with parameters (n, N, m). The expected value and variance of a hypergeometric random variable with parameters (n, N, m) are

E(X) = \frac{nm}{N},    Var(X) = np(1 - p)\left(1 - \frac{n - 1}{N - 1}\right),    where p = m/N.

Example 12 A Scrabble set consists of 54 consonants and 44 vowels. What is the probability that your initial draw (of seven letters) will be all consonants? six consonants and one vowel? five consonants and two vowels?
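Treating the consonants as the "white" balls, the number of consonants in the draw is hypergeometric with n = 7, N = 98, m = 54, and the three probabilities follow from (3.4). In R, dhyper(x, m, n, k) takes the number of white balls m, the number of black balls n, and the sample size k:

dhyper(7, m = 54, n = 44, k = 7)   # all seven letters are consonants
dhyper(6, m = 54, n = 44, k = 7)   # six consonants and one vowel
dhyper(5, m = 54, n = 44, k = 7)   # five consonants and two vowels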