CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 19

Some Important Distributions

Recall our basic probabilistic experiment of tossing a biased coin $n$ times. This is a very simple model, yet surprisingly powerful. Many important probability distributions that are widely used to model real-world phenomena can be derived from looking at this basic coin-tossing model. The first example, which we have seen in Note 16, is the binomial distribution $\mathrm{Bin}(n, p)$. This is the distribution of the number of Heads, $S_n$, in $n$ tosses of a biased coin with probability $p$ of Heads. Recall that the distribution of $S_n$ is

$$\Pr[S_n = k] = \binom{n}{k} p^k (1-p)^{n-k} \quad \text{for } k \in \{0, 1, \dots, n\}.$$

The expected value is $E(S_n) = np$ and the variance is $\mathrm{Var}(S_n) = np(1-p)$. The binomial distribution frequently appears as a model for the number of successes in a repeated experiment.

Geometric Distribution

Consider tossing a biased coin with Heads probability $p$ repeatedly. Let $X$ denote the number of tosses until the first Head appears. Then $X$ is a random variable that takes values in the set of positive integers $\{1, 2, 3, \dots\}$. The event that $X = i$ is equal to the event of observing Tails for the first $i-1$ tosses and getting Heads in the $i$-th toss, which occurs with probability $(1-p)^{i-1} p$. Such a random variable is called a geometric random variable.

The geometric distribution frequently occurs in applications because we are often interested in how long we have to wait before a certain event happens: how many runs before the system fails, how many shots before one is on target, how many poll samples before we find a Democrat, how many retransmissions of a packet before it successfully reaches the destination, etc.

Definition 19.1 (Geometric distribution). A random variable $X$ for which

$$\Pr[X = i] = (1-p)^{i-1} p \quad \text{for } i = 1, 2, 3, \dots$$

is said to have the geometric distribution with parameter $p$. This is abbreviated as $X \sim \mathrm{Geom}(p)$.

As a sanity check, we can verify that the total probability of $X$ is equal to 1:

$$\sum_{i=1}^{\infty} \Pr[X = i] = \sum_{i=1}^{\infty} (1-p)^{i-1} p = p \sum_{i=1}^{\infty} (1-p)^{i-1} = p \cdot \frac{1}{1-(1-p)} = 1,$$

where in the second-to-last step we have used the formula for geometric series.

If we plot the distribution of $X$ (i.e., the values $\Pr[X = i]$ against $i$) we get a curve that decreases monotonically by a factor of $1-p$ at each step, as shown in Figure 1.

Figure 1: The Geometric distribution.
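The geometric pmf is easy to explore empirically. The following short Python sketch (an illustration added here, not part of the original note; the helper sample_geometric is our own) simulates the coin-tossing experiment of Definition 19.1 and compares observed frequencies with $(1-p)^{i-1} p$:

```python
import random

def sample_geometric(p):
    """Toss a p-biased coin until the first Head; return the number of tosses."""
    tosses = 1
    while random.random() >= p:  # each loop iteration is a Tail
        tosses += 1
    return tosses

p, trials = 0.3, 100_000
counts = {}
for _ in range(trials):
    x = sample_geometric(p)
    counts[x] = counts.get(x, 0) + 1

for i in range(1, 6):
    empirical = counts.get(i, 0) / trials
    exact = (1 - p) ** (i - 1) * p
    print(f"Pr[X = {i}]: empirical {empirical:.4f}, exact {exact:.4f}")
```

Each successive probability should shrink by roughly a factor of $1-p = 0.7$, matching the monotonically decreasing curve in Figure 1.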

Let us now compute the expectation $E(X)$. Applying the definition of expected value directly gives us:

$$E(X) = \sum_{i=1}^{\infty} i \Pr[X = i] = p \sum_{i=1}^{\infty} i (1-p)^{i-1}.$$

However, the final summation is difficult to evaluate and requires some calculus trick. Instead, we will use the following alternative formula for expectation.

Theorem 19.1. Let $X$ be a random variable that takes values in $\{0, 1, 2, \dots\}$. Then

$$E(X) = \sum_{i=1}^{\infty} \Pr[X \ge i].$$

Proof. For notational convenience, let's write $p_i = \Pr[X = i]$, for $i = 0, 1, 2, \dots$ From the definition of expectation, we have

$$\begin{aligned}
E(X) &= (0 \cdot p_0) + (1 \cdot p_1) + (2 \cdot p_2) + (3 \cdot p_3) + (4 \cdot p_4) + \cdots \\
&= p_1 + (p_2 + p_2) + (p_3 + p_3 + p_3) + (p_4 + p_4 + p_4 + p_4) + \cdots \\
&= (p_1 + p_2 + p_3 + p_4 + \cdots) + (p_2 + p_3 + p_4 + \cdots) + (p_3 + p_4 + \cdots) + (p_4 + \cdots) + \cdots \\
&= \Pr[X \ge 1] + \Pr[X \ge 2] + \Pr[X \ge 3] + \Pr[X \ge 4] + \cdots.
\end{aligned}$$

In the third line, we have regrouped the terms into convenient infinite sums, and each infinite sum is exactly the probability that $X \ge i$ for each $i$. You should check that you understand how the fourth line follows from the third.

Let us repeat the proof more formally, this time using more compact mathematical notation:

$$E(X) = \sum_{j=1}^{\infty} j \Pr[X = j] = \sum_{j=1}^{\infty} \sum_{i=1}^{j} \Pr[X = j] = \sum_{i=1}^{\infty} \sum_{j=i}^{\infty} \Pr[X = j] = \sum_{i=1}^{\infty} \Pr[X \ge i].$$

We can now use Theorem 19.1 to compute $E(X)$ more easily.

Theorem 19.2. For a geometric random variable $X \sim \mathrm{Geom}(p)$, we have $E(X) = \frac{1}{p}$.

Proof. The key observation is that for a geometric random variable $X$,

$$\Pr[X \ge i] = (1-p)^{i-1} \quad \text{for } i = 1, 2, \dots \tag{1}$$

We can obtain this simply by summing $\Pr[X = j]$ for $j \ge i$. Another way of seeing this is to note that the event $X \ge i$ means at least $i$ tosses are required. This is equivalent to saying that the first $i-1$ tosses are all Tails, and the probability of this event is precisely $(1-p)^{i-1}$.
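Before applying Theorem 19.1, we can convince ourselves numerically that the two formulas for $E(X)$ agree. This added sketch truncates both infinite sums at a large cutoff (a safe approximation, since the terms decay geometrically):

```python
p, cutoff = 0.3, 10_000  # truncation point; the omitted tail is negligible

# Direct definition: E(X) = sum of i * Pr[X = i]
direct = sum(i * (1 - p) ** (i - 1) * p for i in range(1, cutoff))

# Tail-sum formula (Theorem 19.1): E(X) = sum of Pr[X >= i] = sum of (1-p)^(i-1)
tail = sum((1 - p) ** (i - 1) for i in range(1, cutoff))

print(direct, tail, 1 / p)  # all three print ~3.3333
```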

Now, plugging equation (1) into Theorem 19.1, we get

$$E(X) = \sum_{i=1}^{\infty} \Pr[X \ge i] = \sum_{i=1}^{\infty} (1-p)^{i-1} = \frac{1}{1-(1-p)} = \frac{1}{p},$$

where we have used the formula for geometric series.

So, the expected number of tosses of a biased coin until the first Head appears is $\frac{1}{p}$. Intuitively, if in each coin toss we expect to get $p$ Heads, then we need to toss the coin $\frac{1}{p}$ times to get 1 Head. So for a fair coin, the expected number of tosses is 2, but remember that the actual number of coin tosses that we need can be any positive integer.

Remark: Another way of deriving $E(X) = \frac{1}{p}$ is to use the interpretation of a geometric random variable $X$ as the number of coin tosses until we get a Head. Consider what happens in the first coin toss: If the first toss comes up Heads, then $X = 1$. Otherwise, we have used one toss, and we repeat the coin tossing process again; the number of coin tosses after the first toss is again a geometric random variable with parameter $p$. Therefore, we can calculate:

$$E(X) = \underbrace{p \cdot 1}_{\text{first toss is H}} + \underbrace{(1-p) \cdot (1 + E(X))}_{\text{first toss is T, then toss again}}.$$

Solving for $E(X)$ yields $E(X) = \frac{1}{p}$, as claimed.

Let us now compute the variance of $X$.

Theorem 19.3. For a geometric random variable $X \sim \mathrm{Geom}(p)$, we have $\mathrm{Var}(X) = \frac{1-p}{p^2}$.

Proof. We will show that $E(X(X-1)) = \frac{2(1-p)}{p^2}$. Since we already know $E(X) = \frac{1}{p}$, this will imply the desired result:

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = E(X(X-1)) + E(X) - E(X)^2 = \frac{2(1-p)}{p^2} + \frac{1}{p} - \frac{1}{p^2} = \frac{2(1-p) + p - 1}{p^2} = \frac{1-p}{p^2}.$$

Now to show $E(X(X-1)) = \frac{2(1-p)}{p^2}$, we begin with the following identity of geometric series:

$$\sum_{i=0}^{\infty} (1-p)^i = \frac{1}{p}.$$

Differentiating the identity above with respect to $p$ yields (the $i = 0$ term is equal to 0 so we omit it):

$$-\sum_{i=1}^{\infty} i (1-p)^{i-1} = -\frac{1}{p^2}.$$

Differentiating both sides with respect to $p$ again gives us (the $i = 1$ term is equal to 0 so we omit it):

$$\sum_{i=2}^{\infty} i(i-1)(1-p)^{i-2} = \frac{2}{p^3}. \tag{2}$$

Now using the geometric distribution of $X$ and identity (2), we can calculate:

$$\begin{aligned}
E(X(X-1)) &= \sum_{i=1}^{\infty} i(i-1) \Pr[X = i] \\
&= \sum_{i=2}^{\infty} i(i-1)(1-p)^{i-1} p \quad \text{(the $i = 1$ term is equal to 0 so we omit it)} \\
&= p(1-p) \sum_{i=2}^{\infty} i(i-1)(1-p)^{i-2} \\
&= p(1-p) \cdot \frac{2}{p^3} \quad \text{(using identity (2))} \\
&= \frac{2(1-p)}{p^2},
\end{aligned}$$

as desired.
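To double-check Theorems 19.2 and 19.3, we can estimate the mean and variance of $\mathrm{Geom}(p)$ from samples. This added sketch assumes NumPy is available; its geometric sampler counts trials up to and including the first success, the same convention as Definition 19.1:

```python
import numpy as np

rng = np.random.default_rng(0)
p, trials = 0.25, 200_000
xs = rng.geometric(p, size=trials)  # samples of X ~ Geom(p), values in {1, 2, ...}

print(f"mean: {xs.mean():.3f} (theory 1/p = {1/p:.3f})")
print(f"variance: {xs.var():.3f} (theory (1-p)/p^2 = {(1-p)/p**2:.3f})")
```

For $p = 0.25$ the printed values should be close to 4 and 12, respectively.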

Application: The Coupon Collector's Problem

Suppose we are trying to collect a set of $n$ different baseball cards. We get the cards by buying boxes of cereal: each box contains exactly one card, and it is equally likely to be any of the $n$ cards. How many boxes do we need to buy until we have collected at least one copy of every card?

Let $X$ denote the number of boxes we need to buy in order to collect all $n$ cards. The distribution of $X$ is difficult to compute directly (try it for $n = 3$). But if we are only interested in its expected value $E(X)$, then we can evaluate it easily using linearity of expectation and what we have just learned about the geometric distribution.

As usual, we start by writing

$$X = X_1 + X_2 + \cdots + X_n \tag{3}$$

for suitable simple random variables $X_i$. What should the $X_i$ be? Naturally, $X_i$ is the number of boxes we buy while trying to get the $i$-th new card (starting immediately after we've got the $(i-1)$-st new card). With this definition, make sure you believe equation (3) before proceeding.

What does the distribution of $X_i$ look like? Well, $X_1$ is trivial: no matter what happens, we always get a new card in the first box (since we have none to start with). So $\Pr[X_1 = 1] = 1$, and thus $E(X_1) = 1$.

How about $X_2$? Each time we buy a box, we'll get the same old card with probability $\frac{1}{n}$, and a new card with probability $\frac{n-1}{n}$. So we can think of buying boxes as flipping a biased coin with Heads probability $p = \frac{n-1}{n}$; then $X_2$ is just the number of tosses until the first Head appears. So $X_2$ has the geometric distribution with parameter $p = \frac{n-1}{n}$, and $E(X_2) = \frac{n}{n-1}$.

How about $X_3$? This is very similar to $X_2$ except that now we only get a new card with probability $\frac{n-2}{n}$ (since there are now two old ones). So $X_3$ has the geometric distribution with parameter $p = \frac{n-2}{n}$, and $E(X_3) = \frac{n}{n-2}$.

Arguing in the same way, we see that, for $i = 1, 2, \dots, n$, $X_i$ has the geometric distribution with parameter $p = \frac{n-i+1}{n}$, and hence that $E(X_i) = \frac{n}{n-i+1}$.

Finally, applying linearity of expectation to equation (3), we get

$$E(X) = \sum_{i=1}^{n} E(X_i) = \frac{n}{n} + \frac{n}{n-1} + \cdots + \frac{n}{2} + \frac{n}{1} = n \sum_{i=1}^{n} \frac{1}{i}. \tag{4}$$

This is an exact expression for $E(X)$. We can obtain a tidier form by noting that the sum in it actually has a very good approximation,¹ namely:

$$\sum_{i=1}^{n} \frac{1}{i} \approx \ln n + \gamma,$$

where $\gamma \approx 0.5772$ is known as Euler's constant.

Thus, the expected number of cereal boxes needed to collect $n$ cards is about $n(\ln n + \gamma)$. This is an excellent approximation to the exact formula (4) even for quite small values of $n$. So for example, for $n = 100$, we expect to buy about 518 boxes.

¹This is another of the little tricks you might like to carry around in your toolbox.
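A short simulation makes the approximation concrete. The sketch below (added for illustration; boxes_to_collect_all is our own helper) runs the cereal-box experiment many times for $n = 100$ and compares the average with $n(\ln n + \gamma) \approx 518$:

```python
import math
import random

def boxes_to_collect_all(n):
    """Buy boxes of cereal until all n distinct cards have been collected."""
    seen, boxes = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))  # each box holds a uniformly random card
        boxes += 1
    return boxes

n, runs = 100, 2_000
avg = sum(boxes_to_collect_all(n) for _ in range(runs)) / runs
approx = n * (math.log(n) + 0.5772)  # n(ln n + gamma)
print(f"simulated E(X): {avg:.1f}, approximation n(ln n + gamma): {approx:.1f}")
```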

Poisson Distribution

Consider the number of clicks of a Geiger counter, which measures radioactive emissions. The average number of such clicks per unit time, $\lambda$, is a measure of radioactivity, but the actual number of clicks fluctuates according to a certain distribution called the Poisson distribution. What is remarkable is that the average value, $\lambda$, completely determines the probability distribution on the number of clicks $X$.

Definition 19.2 (Poisson distribution). A random variable $X$ for which

$$\Pr[X = i] = \frac{\lambda^i}{i!} e^{-\lambda} \quad \text{for } i = 0, 1, 2, \dots \tag{5}$$

is said to have the Poisson distribution with parameter $\lambda$. This is abbreviated as $X \sim \mathrm{Poiss}(\lambda)$.

To make sure this is a valid definition, let us check that (5) is in fact a distribution, i.e., that the probabilities sum to 1. We have

$$\sum_{i=0}^{\infty} \frac{\lambda^i}{i!} e^{-\lambda} = e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = e^{-\lambda} \cdot e^{\lambda} = 1.$$

In the second-to-last step, we used the Taylor series expansion $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$.

The Poisson distribution is also a very widely accepted model for so-called "rare events," such as misconnected phone calls, radioactive emissions, crossovers in chromosomes, the number of cases of disease, the number of births per hour, etc. This model is appropriate whenever the occurrences can be assumed to happen randomly with some constant density in a continuous region (of time or space), such that occurrences in disjoint subregions are independent. One can then show that the number of occurrences in a region of unit size should obey the Poisson distribution with parameter $\lambda$.

Example: Suppose when we write an article, we make an average of 1 typo per page. We can model this with a Poisson random variable $X$ with $\lambda = 1$. So the probability that a page has 5 typos is

$$\Pr[X = 5] = \frac{1^5}{5!} e^{-1} = \frac{1}{120e} \approx \frac{1}{326}.$$

Now suppose the article has 200 pages. If we assume the number of typos in each page is independent, then the probability that there is at least one page with exactly 5 typos is

$$\begin{aligned}
\Pr[\exists \text{ a page with exactly 5 typos}] &= 1 - \Pr[\text{every page has} \ne 5 \text{ typos}] \\
&= 1 - \prod_{k=1}^{200} \Pr[\text{page } k \text{ has} \ne 5 \text{ typos}] \\
&= 1 - \prod_{k=1}^{200} \left(1 - \Pr[\text{page } k \text{ has exactly 5 typos}]\right) \\
&= 1 - \left(1 - \frac{1}{120e}\right)^{200},
\end{aligned}$$

where in the last step we have used our earlier calculation for $\Pr[X = 5]$.

Let us now calculate the expectation and variance of a Poisson random variable. As we noted before, the expected value is simply $\lambda$. Here we see that the variance is also equal to $\lambda$.

Theorem 19.4. For a Poisson random variable $X \sim \mathrm{Poiss}(\lambda)$, we have $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$.

Proof. We can calculate $E(X)$ directly from the definition of expectation:

$$\begin{aligned}
E(X) &= \sum_{i=0}^{\infty} i \Pr[X = i] \\
&= \sum_{i=1}^{\infty} i \, \frac{\lambda^i}{i!} e^{-\lambda} \quad \text{(the $i = 0$ term is equal to 0 so we omit it)} \\
&= \lambda e^{-\lambda} \sum_{i=1}^{\infty} \frac{\lambda^{i-1}}{(i-1)!} \\
&= \lambda e^{-\lambda} e^{\lambda} \quad \text{(since $e^{\lambda} = \sum_{j=0}^{\infty} \frac{\lambda^j}{j!}$ with $j = i-1$)} \\
&= \lambda.
\end{aligned}$$

Similarly, we can calculate $E(X(X-1))$ as follows:

$$\begin{aligned}
E(X(X-1)) &= \sum_{i=0}^{\infty} i(i-1) \Pr[X = i] \\
&= \sum_{i=2}^{\infty} i(i-1) \, \frac{\lambda^i}{i!} e^{-\lambda} \quad \text{(the $i = 0$ and $i = 1$ terms are equal to 0 so we omit them)} \\
&= \lambda^2 e^{-\lambda} \sum_{i=2}^{\infty} \frac{\lambda^{i-2}}{(i-2)!} \\
&= \lambda^2 e^{-\lambda} e^{\lambda} \quad \text{(since $e^{\lambda} = \sum_{j=0}^{\infty} \frac{\lambda^j}{j!}$ with $j = i-2$)} \\
&= \lambda^2.
\end{aligned}$$

Therefore,

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = E(X(X-1)) + E(X) - E(X)^2 = \lambda^2 + \lambda - \lambda^2 = \lambda,$$

as desired.
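As a numeric check of Definition 19.2 and Theorem 19.4 (an added sketch; the infinite sums are truncated at a generous cutoff), we can verify that the pmf sums to 1 and that both the mean and variance equal $\lambda$:

```python
import math

def poisson_pmf(lam, i):
    return lam**i / math.factorial(i) * math.exp(-lam)

lam, cutoff = 5.0, 200  # terms beyond the cutoff are vanishingly small
total = sum(poisson_pmf(lam, i) for i in range(cutoff))
mean = sum(i * poisson_pmf(lam, i) for i in range(cutoff))
var = sum(i**2 * poisson_pmf(lam, i) for i in range(cutoff)) - mean**2

print(f"total probability: {total:.6f}")         # 1.000000
print(f"mean: {mean:.4f}, variance: {var:.4f}")  # both ~5.0 = lambda
```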

A plot of the Poisson distribution reveals a curve that rises monotonically to a single peak and then decreases monotonically. The peak is as close as possible to the expected value, i.e., at $i = \lfloor \lambda \rfloor$. Figure 2 shows an example for $\lambda = 5$.

Figure 2: The Poisson distribution with $\lambda = 5$.

Poisson and Coin Flips

To see a concrete example of how the Poisson distribution arises, suppose we want to model the number of cell phone users initiating calls in a network during a time period, of duration (say) 1 minute. There are many customers in the network, and all of them can potentially make a call during this time period. However, only a very small fraction of them actually will. Under this scenario, it seems reasonable to make two assumptions:

- The probability of having more than 1 customer initiating a call in any small time interval is negligible.
- The initiation of calls in disjoint time intervals are independent events.

Then if we divide the one-minute time period into $n$ disjoint intervals, the number of calls $X$ in that time period can be modeled as a binomial random variable with parameter $n$ and probability of success $p$, i.e., $p$ is the probability of having a call initiated in a time interval of length $1/n$. But what should $p$ be in terms of the relevant parameters of the problem? If calls are initiated at an average rate of $\lambda$ calls per minute, then $E(X) = \lambda$ and so $np = \lambda$, i.e., $p = \lambda/n$. So $X \sim \mathrm{Bin}(n, \frac{\lambda}{n})$. As we shall see below, as we let $n$ tend to infinity, this distribution tends to the Poisson distribution with parameter $\lambda$. We can also see why the Poisson distribution is a model for rare events: we are thinking of it as a sequence of a large number, $n$, of coin flips, where we expect only a finite number $\lambda$ of Heads.

Now we will prove that the Poisson distribution $\mathrm{Poiss}(\lambda)$ is the limit of the binomial distribution $\mathrm{Bin}(n, \frac{\lambda}{n})$, as $n$ tends to infinity.

Theorem 19.5. Let $X \sim \mathrm{Bin}(n, \frac{\lambda}{n})$ where $\lambda > 0$ is a fixed constant. Then for every $i = 0, 1, 2, \dots$,

$$\Pr[X = i] \to \frac{\lambda^i}{i!} e^{-\lambda} \quad \text{as } n \to \infty.$$

That is, the probability distribution of $X$ converges to the Poisson distribution with parameter $\lambda$.

Proof. Fix $i \in \{0, 1, 2, \dots\}$, and assume $n \ge i$ (because we will let $n \to \infty$). Then, because $X$ has binomial distribution with parameter $n$ and $p = \frac{\lambda}{n}$,

$$\Pr[X = i] = \binom{n}{i} p^i (1-p)^{n-i} = \frac{n!}{i!(n-i)!} \left(\frac{\lambda}{n}\right)^i \left(1 - \frac{\lambda}{n}\right)^{n-i}.$$

Let us collect the factors into

$$\Pr[X = i] = \frac{\lambda^i}{i!} \left(\frac{n!}{(n-i)!} \cdot \frac{1}{n^i}\right) \left(1 - \frac{\lambda}{n}\right)^n \left(1 - \frac{\lambda}{n}\right)^{-i}. \tag{6}$$

The first parenthesis above becomes, as $n \to \infty$,

$$\frac{n!}{(n-i)!} \cdot \frac{1}{n^i} = \frac{n(n-1)\cdots(n-i+1)}{n^i} = \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{n-i+1}{n} \to 1.$$

From calculus, the second parenthesis in (6) becomes, as $n \to \infty$,

$$\left(1 - \frac{\lambda}{n}\right)^n \to e^{-\lambda}.$$

And since $i$ is fixed, the third parenthesis in (6) becomes, as $n \to \infty$,

$$\left(1 - \frac{\lambda}{n}\right)^{-i} \to (1-0)^{-i} = 1.$$

Substituting these results back into (6) gives us

$$\Pr[X = i] \to \frac{\lambda^i}{i!} \cdot 1 \cdot e^{-\lambda} \cdot 1 = \frac{\lambda^i}{i!} e^{-\lambda},$$

as desired.
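To watch the convergence in Theorem 19.5 happen numerically, this added sketch evaluates the $\mathrm{Bin}(n, \lambda/n)$ pmf at a fixed $i$ for increasing $n$ and compares it with the $\mathrm{Poiss}(\lambda)$ limit:

```python
import math

def binomial_pmf(n, p, i):
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)

def poisson_pmf(lam, i):
    return lam**i / math.factorial(i) * math.exp(-lam)

lam, i = 5.0, 3
for n in (10, 100, 1_000, 10_000):
    print(f"n = {n:6d}: Bin(n, lam/n) pmf at {i} is {binomial_pmf(n, lam/n, i):.6f}")
print(f"limit:      Poiss(lam) pmf at {i} is {poisson_pmf(lam, i):.6f}")
```

The binomial values approach $\frac{\lambda^i}{i!} e^{-\lambda} \approx 0.1404$ as $n$ grows.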

Proof. Fix i {0,1,2,...}, ad assume i (because we will let ). The, because X has biomial distributio with parameter ad p λ, Pr[X i] Let us collect the factors ito Pr[X i] λ i The first parethesis above becomes, as, ( ) ( ) p i (1 p) i! λ i ( 1 λ ) i. i i!( i)!! ( i)! 1 ( 1) ( i + 1) ( i)! i ( i)! i! (! ( i)! 1 ) ( i 1 λ ) ( 1 λ ) i. (6) 1 i From calculus, the secod parethesis i (6) becomes, as, ( 1 λ ) e λ. Ad sice i is fixed, the third parethesis i (6) becomes, as, Substitutig these results back to (6) gives us ( 1 λ ) i (1 0) i 1. ( 1) ( i + 1) 1. as desired. Pr[X i] λ i i! 1 e λ 1 λ i i! e λ, CS 70, Sprig 2016, Note 19 8