MATH 382 The Geometric Distribution Dr. Neal, WKU

Suppose we have a fixed probability $p$ of having a success on any single attempt, where $p > 0$. We continue to make independent attempts until we succeed. Then the geometric random variable, denoted by $X \sim \mathrm{geo}(p)$, counts the total number of attempts needed to obtain the first success. Its range is the set of natural numbers $\{1, 2, 3, \ldots\}$.

Probability Distribution Function

The probability of having a success on the first attempt is $p$. In general, to succeed for the first time on the $k$th attempt, there would have to be $k - 1$ failures beforehand. By independence, any sequence having $k - 1$ failures followed by a success occurs with probability $q^{k-1} p$, where $q = 1 - p$ is the probability of failure on any attempt. Thus, the probability of succeeding for the first time on the $k$th attempt is given by

$$P(X = k) = q^{k-1} p, \quad \text{for } k \ge 1.$$

We can use the geometric series formula to verify that the probability distribution sums to 1. Because $p > 0$, we have $0 \le q < 1$, and thus by geometric series we have

$$\sum_{k=1}^{\infty} P(X = k) = \sum_{k=1}^{\infty} q^{k-1} p = p \sum_{k=0}^{\infty} q^{k} = p \cdot \frac{1}{1 - q} = \frac{p}{p} = 1.$$

Mode

Because the pdf is a decreasing function of $k$, the mode is 1. That is, the most likely number of attempts needed to succeed is 1, and $P(X = 1) = p$. (Because $p > 0$, then $q < 1$. So for $k \ge 2$, $q^{k-1} p < p$; thus, $P(X = k) < P(X = 1)$.) So if you are going to succeed, then you are most likely to succeed on the first try, which is better known as beginner's luck.

Cumulative Distribution Function

There is a useful closed-form formula for the cumulative distribution function (cdf) $P(X \le k)$. The event $\{X \le k\}$ denotes that the first success occurs within $k$ attempts. Its complement is the event that there are no successes in any of the first $k$ attempts, which has probability $q^k$. So the cdf of the geometric distribution is given by

$$P(X \le k) = 1 - q^{k}, \quad \text{for } k \ge 1.$$

Example 1. Roll a fair six-sided die over and over until you roll a 3. Count the number of rolls needed.
What are the probabilities that it takes (a) exactly 6 rolls (b) at most 5 rolls (c) at least 4 rolls (d) from 4 to 8 rolls (e) at least 9 rolls given that it takes at least 4 rolls (f) an odd number of rolls?
Solution. The probability of rolling a 3 on any roll is $p = 1/6 > 0$. Let $X \sim \mathrm{geo}(1/6)$ count the number of rolls needed to roll the first 3. Letting $q = 5/6$, we have

(a) $P(X = 6) = \left(\frac{5}{6}\right)^{5} \cdot \frac{1}{6} \approx 0.06698$

(b) $P(X \le 5) = 1 - \left(\frac{5}{6}\right)^{5} \approx 0.598$

(c) To take at least 4 rolls to succeed means we must have failed on the first three tries; so $P(X \ge 4) = \left(\frac{5}{6}\right)^{3} \approx 0.5787$. In general, we have

$$P(X \ge k) = 1 - P(X \le k - 1) = 1 - \left(1 - q^{k-1}\right) = q^{k-1}.$$

(d) Using the cdf of $X$, we have

$$P(4 \le X \le 8) = P(X \le 8) - P(X \le 3) = \left(1 - \left(\tfrac{5}{6}\right)^{8}\right) - \left(1 - \left(\tfrac{5}{6}\right)^{3}\right) = \left(\tfrac{5}{6}\right)^{3} - \left(\tfrac{5}{6}\right)^{8} \approx 0.346.$$

(e) Using the properties of conditional probability and the general result from (c), we have

$$P(X \ge 9 \mid X \ge 4) = \frac{P(X \ge 9 \,\cap\, X \ge 4)}{P(X \ge 4)} = \frac{P(X \ge 9)}{P(X \ge 4)} = \frac{(5/6)^{8}}{(5/6)^{3}} = (5/6)^{5} \approx 0.402.$$

In other words, given that we have already failed 3 times in a row so that $X \ge 4$, what is the probability that we fail 8 times in a row so that $X \ge 9$? It is equivalent to starting all over and simply failing 5 times in a row.

(f) Finally, the probability that it takes an odd number of rolls to roll a 3 is

$$P(X \in \{1, 3, 5, 7, \ldots\}) = p + q^{2} p + q^{4} p + q^{6} p + \cdots = p\left((q^{2})^{0} + (q^{2})^{1} + (q^{2})^{2} + (q^{2})^{3} + \cdots\right)$$

$$= p \sum_{k=0}^{\infty} (q^{2})^{k} = \frac{p}{1 - q^{2}} = \frac{p}{(1 - q)(1 + q)} = \frac{1}{1 + q} = \frac{1}{1 + 5/6} = \frac{6}{11}.$$
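The six answers in Example 1 can be checked numerically. The following is a minimal sketch in Python; the helper names `geo_pmf`, `geo_cdf`, and `geo_tail` are illustrative choices, not part of the notes, and they simply encode the formulas $P(X = k) = q^{k-1}p$, $P(X \le k) = 1 - q^k$, and $P(X \ge k) = q^{k-1}$.

```python
def geo_pmf(k, p):
    """P(X = k) = q^(k-1) * p for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


def geo_cdf(k, p):
    """P(X <= k) = 1 - q^k for k = 0, 1, 2, ..."""
    return 1 - (1 - p) ** k


def geo_tail(k, p):
    """P(X >= k) = q^(k-1) for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1)


p = 1 / 6          # probability of rolling a 3
q = 1 - p          # probability of failure on one roll

part_a = geo_pmf(6, p)                      # exactly 6 rolls
part_b = geo_cdf(5, p)                      # at most 5 rolls
part_c = geo_tail(4, p)                     # at least 4 rolls
part_d = geo_cdf(8, p) - geo_cdf(3, p)      # from 4 to 8 rolls
part_e = geo_tail(9, p) / geo_tail(4, p)    # at least 9 given at least 4
part_f_closed = 1 / (1 + q)                 # odd number of rolls, closed form
# Truncated series over odd k as an independent check of the closed form.
part_f_series = sum(geo_pmf(k, p) for k in range(1, 400, 2))

for label, value in [("a", part_a), ("b", part_b), ("c", part_c),
                     ("d", part_d), ("e", part_e), ("f", part_f_closed)]:
    print(f"({label}) {value:.5f}")
```

The truncated series in `part_f_series` stops at $k = 399$; the omitted tail is of order $q^{400}$, which is negligible here.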
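Other Geometric Sums

The geometric series formula gives us

$$\sum_{k=m}^{\infty} x^{k} = \frac{x^{m}}{1 - x}, \quad \text{for } -1 < x < 1.$$

Taking the derivative of both sides with respect to $x$, we obtain

$$\sum_{k=m}^{\infty} k\, x^{k-1} = \frac{m x^{m-1} - m x^{m} + x^{m}}{(1 - x)^{2}}. \tag{1}$$

In particular, for $m = 1$, we have

$$\sum_{k=1}^{\infty} k\, x^{k-1} = \frac{1}{(1 - x)^{2}}. \tag{2}$$

Multiplying by $x$ in Equation (2), we have for $-1 < x < 1$

$$\sum_{k=1}^{\infty} k\, x^{k} = \frac{x}{(1 - x)^{2}}. \tag{3}$$

Taking the derivative of both sides of (3) with respect to $x$, we obtain for $-1 < x < 1$

$$\sum_{k=1}^{\infty} k^{2} x^{k-1} = \frac{1 + x}{(1 - x)^{3}}. \tag{4}$$

Mean and Standard Deviation of the Geometric Distribution

Using these variations of the geometric series, we can derive the expected value and variance of the geometric random variable $X \sim \mathrm{geo}(p)$. From (2), the expected value is given by

$$E[X] = \sum_{k=1}^{\infty} k\, P(X = k) = p \sum_{k=1}^{\infty} k\, q^{k-1} = \frac{p}{(1 - q)^{2}} = \frac{p}{p^{2}} = \frac{1}{p}.$$

Next, from (4) we have

$$E[X^{2}] = \sum_{k=1}^{\infty} k^{2} P(X = k) = p \sum_{k=1}^{\infty} k^{2} q^{k-1} = p \cdot \frac{1 + q}{(1 - q)^{3}} = \frac{1 + q}{p^{2}};$$

thus,

$$\mathrm{Var}(X) = E[X^{2}] - (E[X])^{2} = \frac{1 + q}{p^{2}} - \frac{1}{p^{2}} = \frac{q}{p^{2}} = \frac{1 - p}{p^{2}}, \quad \text{and} \quad \sigma_X = \frac{\sqrt{1 - p}}{p}.$$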
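As a sanity check on the closed forms $E[X] = 1/p$ and $\mathrm{Var}(X) = (1 - p)/p^2$, the defining sums can be truncated and evaluated directly. This is a minimal sketch; the function names and the cutoff of 2000 terms are assumptions made for illustration, and the truncation is accurate only when $q^{2000}$ is negligible (that is, when $p$ is not extremely small).

```python
def geo_moments_numeric(p, terms=2000):
    """Approximate E[X] and Var(X) by truncating the defining series."""
    q = 1 - p
    mean = sum(k * q ** (k - 1) * p for k in range(1, terms + 1))
    second = sum(k * k * q ** (k - 1) * p for k in range(1, terms + 1))
    return mean, second - mean ** 2


def geo_moments_closed(p):
    """Closed forms: E[X] = 1/p and Var(X) = (1 - p) / p^2."""
    return 1 / p, (1 - p) / p ** 2


for p in (0.2, 1 / 6):
    num_mean, num_var = geo_moments_numeric(p)
    cl_mean, cl_var = geo_moments_closed(p)
    print(f"p = {p:.4f}: mean {num_mean:.6f} vs {cl_mean:.6f}, "
          f"variance {num_var:.6f} vs {cl_var:.6f}")
```

The truncated sums and the closed forms should agree to many decimal places for these values of $p$.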
The Law of Try Try Again

The probability of succeeding within $k$ attempts is $P(X \le k) = 1 - q^{k}$. Because $0 \le q < 1$, this probability increases to 1 as $k$ increases to $\infty$. So by choosing $k$ large enough, we can make our probability of succeeding within $k$ attempts as high as we like.

Proposition. Let $0 < p < 1$ be the fixed probability of success on any single independent attempt. Let $0 < r < 1$. In order to have probability of at least $r$ of succeeding within $k$ attempts, the minimum number of attempts needed is

$$k = \left\lceil \frac{\ln(1 - r)}{\ln(1 - p)} \right\rceil.$$

If $p = 1$, then we only need 1 attempt.

Note: The expression $\lceil x \rceil$ means to round $x$ up to the nearest integer.

Proof. The result is obvious for $p = 1$. For $0 < p < 1$, we simply solve for $k$ in the inequality $r \le P(X \le k)$. Note that because $0 < q < 1$ when $0 < p < 1$, we have $\ln q < 0$, so dividing by $\ln q$ reverses an inequality. Thus we have

$$r \le P(X \le k) \iff r \le 1 - q^{k} \iff q^{k} \le 1 - r \iff k \ln q \le \ln(1 - r) \iff k \ge \frac{\ln(1 - r)}{\ln q}.$$

Since $q = 1 - p$, the smallest integer $k$ satisfying the last inequality is $k = \left\lceil \dfrac{\ln(1 - r)}{\ln(1 - p)} \right\rceil$.

Example 2. Qualified applicants have a 20% chance of being accepted into graduate school in top-notch programs. Assume a student's evaluations at differing schools are independent of each other.
(a) On average, how many applications does it take for a qualified applicant to be accepted?
(b) What is the probability that the number of required applications is within a standard deviation of average?
(c) You want your chance of acceptance to be at least 98%. How many applications should you send out?

Solution. Let $X \sim \mathrm{geo}(0.2)$ model the number of applications needed for an acceptance.

(a) On average, it takes $E[X] = 1/p = 1/0.2 = 5$ applications for an acceptance.

(b) With $\sigma_X = \sqrt{1 - p}\,/\,p = \sqrt{0.8}\,/\,0.2 \approx 4.47$, we have
$$P(\mu - \sigma \le X \le \mu + \sigma) = P(5 - 4.47 \le X \le 5 + 4.47) = P(1 \le X \le 9) = P(X \le 9) = 1 - 0.8^{9} \approx 0.86578.$$

(c) To make $P(X \le k) \ge 0.98$, the minimum number of applications needed is

$$k = \left\lceil \frac{\ln(0.02)}{\ln(0.80)} \right\rceil = \lceil 17.53 \rceil = 18.$$

So we need 18 applications.

Exercises

1. Draw cards from a shuffled deck one at a time, each time with replacement and reshuffling. Count the number of draws needed to draw a Heart.
(i) What are the probabilities that it takes (a) exactly 4 draws (b) at most 3 draws (c) at least 5 draws (d) from 2 to 6 draws (e) at least 7 draws given that it takes at least 3 draws (f) an odd number of draws?
(ii) What are the average number of draws needed and the standard deviation in the number of draws needed?

2. When drawing cards with replacement and reshuffling, you bet someone that you can draw an Ace within k draws. You want your chance of winning this bet to be at least 52%. What is the minimum value of k needed?
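The minimum-attempts formula from the Law of Try Try Again, together with the calculations of Example 2, can be sketched in Python as follows. The function name `min_attempts` and the variable names are hypothetical, chosen for illustration; the code uses only `math.ceil`, `math.floor`, `math.log`, and `math.sqrt` from the standard library.

```python
import math


def min_attempts(p, r):
    """Smallest k with P(X <= k) = 1 - (1 - p)**k >= r."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if not 0 < r < 1:
        raise ValueError("r must satisfy 0 < r < 1")
    if p == 1:
        return 1  # success is certain on the first attempt
    return math.ceil(math.log(1 - r) / math.log(1 - p))


# Example 2 with p = 0.2
p = 0.2
mean = 1 / p                      # part (a)
sd = math.sqrt(1 - p) / p         # standard deviation

# Part (b): integer values of X within one standard deviation of the mean.
low = max(1, math.ceil(mean - sd))
high = math.floor(mean + sd)
within_one_sd = (1 - (1 - p) ** high) - (1 - (1 - p) ** (low - 1))

# Part (c): applications needed for at least a 98% chance of acceptance.
k_needed = min_attempts(p, 0.98)

print(mean, round(sd, 2), low, high, round(within_one_sd, 5), k_needed)
```

One caveat on this sketch: when $\ln(1 - r)/\ln(1 - p)$ is exactly an integer, floating-point rounding can push the quotient slightly above that integer, so it is worth confirming the returned $k$ by checking $1 - q^{k} \ge r$ and $1 - q^{k-1} < r$ directly.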