Discrete Random Variables and Probability Distributions. Random Variables. Discrete Models

UCLA STAT 35
Applied Computational and Interactive Probability
Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
Teaching Assistant: Chris Barr
University of California, Los Angeles, Winter 2006
http://www.stat.ucla.edu/~dinov/

Discrete Models: Discrete Random Variables and Probability Distributions

Random Variables

Random Variable
For a given sample space S of some experiment, a random variable is any rule that associates a number with each outcome in S.

Bernoulli Random Variable
Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
RedBlackGame

Types of Random Variables
A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence. A random variable is continuous if its set of possible values consists of an entire interval on a number line.

Probability Distributions for Discrete Random Variables

Probability Distribution
The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by
$p(x) = P(\text{all } s \in S : X(s) = x)$

Parameter of a Probability Distribution
Suppose that p(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter of the distribution. The collection of all distributions for all different parameters is called a family of distributions.

Cumulative Distribution Function
The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by
$F(x) = P(X \le x) = \sum_{y : y \le x} p(y)$
For any number x, F(x) is the probability that the observed value of X will be at most x.

Proposition
For any two numbers a and b with a ≤ b,
$P(a \le X \le b) = F(b) - F(a^-)$
where a^- represents the largest possible X value that is strictly less than a.
Note: if X takes only integer values and a and b are integers, $P(a \le X \le b) = F(b) - F(a - 1)$.

Probability Distribution for the Random Variable X
A probability distribution for a random variable X:

x         -8    -3    -1    0     1     4     6
P(X = x)  0.13  0.15  0.17  0.20  0.15  0.11  0.09

Find
a. $P(X \le 0)$ = 0.65
b. $P(-3 \le X \le 1)$ = 0.67
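The cdf and the F(b) - F(a^-) proposition can be checked numerically with a minimal Python sketch. The x values and probabilities below are an assumed reading of the slide's table (they sum to 1 and reproduce both stated answers), and the helper names `cdf` and `prob_between` are hypothetical, introduced only for this illustration.

```python
# Assumed reading of the slide's table: value -> P(X = value).
pmf = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}


def cdf(pmf, x):
    """F(x) = P(X <= x): add up p(y) over all support points y <= x."""
    return sum(p for y, p in pmf.items() if y <= x)


def prob_between(pmf, a, b):
    """P(a <= X <= b) = F(b) - F(a-), with a- the largest support point below a."""
    below_a = [y for y in pmf if y < a]
    f_a_minus = cdf(pmf, max(below_a)) if below_a else 0.0
    return cdf(pmf, b) - f_a_minus


if __name__ == "__main__":
    print(round(cdf(pmf, 0), 2))               # part a: P(X <= 0)
    print(round(prob_between(pmf, -3, 1), 2))  # part b: P(-3 <= X <= 1)
```

Storing the pmf as a dictionary keeps the code close to the table; for a = -3 the value a^- is -8, so part b is F(1) - F(-8).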

Expected Values of Discrete Random Variables

The Expected Value of X
Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted E(X) or $\mu_X$, is
$E(X) = \mu_X = \sum_{x \in D} x \, p(x)$

Example
In the "at least one of each or at most 3 children" example, where X = {number of girls}, we have:

x      0    1    2    3
pr(x)  1/8  5/8  1/8  1/8

$E(X) = \sum_x x \, P(x) = 0 \cdot \tfrac{1}{8} + 1 \cdot \tfrac{5}{8} + 2 \cdot \tfrac{1}{8} + 3 \cdot \tfrac{1}{8} = 1.25$

Ex. Use the data below to find the expected number of credit cards that a student will possess.

x = # credit cards  0     1     2     3     4     5     6
P(X = x)            0.08  0.28  0.38  0.16  0.06  0.03  0.01

$E(X) = x_1 p_1 + x_2 p_2 + \dots + x_n p_n$
= 0(.08) + 1(.28) + 2(.38) + 3(.16) + 4(.06) + 5(.03) + 6(.01) = 1.97
About 2 credit cards.

The Expected Value of a Function
If the rv X has the set of possible values D and pmf p(x), then the expected value of any function h(X), denoted E[h(X)] or $\mu_{h(X)}$, is
$E[h(X)] = \sum_{x \in D} h(x) \, p(x)$

Rules of the Expected Value
$E(aX + b) = a \, E(X) + b$
This leads to the following:
1. For any constant a, E(aX) = a E(X).
2. For any constant b, E(X + b) = E(X) + b.
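The two worked examples and the linearity rule can be reproduced with a short Python sketch. The function name `expected_value` and the constants a and b are hypothetical choices made for illustration; the pmf tables are the ones from the examples above.

```python
def expected_value(pmf, h=lambda x: x):
    """E[h(X)] = sum of h(x) * p(x) over the support; h defaults to the identity."""
    return sum(h(x) * p for x, p in pmf.items())


# Number-of-girls example: pr(x) = 1/8, 5/8, 1/8, 1/8 for x = 0, 1, 2, 3.
girls = {0: 1 / 8, 1: 5 / 8, 2: 1 / 8, 3: 1 / 8}

# Credit-card example.
cards = {0: 0.08, 1: 0.28, 2: 0.38, 3: 0.16, 4: 0.06, 5: 0.03, 6: 0.01}

mean_girls = expected_value(girls)
mean_cards = expected_value(cards)

# Linearity: E(aX + b) computed term by term should equal a*E(X) + b.
a, b = 3.0, -2.0  # arbitrary constants chosen for the check
lhs = expected_value(cards, lambda x: a * x + b)
rhs = a * mean_cards + b

if __name__ == "__main__":
    print(mean_girls, round(mean_cards, 2), abs(lhs - rhs) < 1e-12)
```

Passing h as an argument lets the same function cover both E(X) and E[h(X)].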

The Variance and Standard Deviation
Let X have pmf p(x) and expected value μ. Then the variance of X, denoted V(X) (or $\sigma_X^2$ or $\sigma^2$), is
$V(X) = \sum_{x \in D} (x - \mu)^2 \, p(x) = E[(X - \mu)^2]$
The standard deviation (SD) of X is
$\sigma_X = \sqrt{\sigma_X^2}$

Ex. The quiz scores for a particular student are given below:
22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18
Find the variance and standard deviation.

Value        12   18   20   22   24   25
Frequency    1    2    4    1    2    3
Probability  .08  .15  .31  .08  .15  .23

μ = 21
$V(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \dots + p_n (x_n - \mu)^2$, $\sigma = \sqrt{V(X)}$
$V(X) = .08(12 - 21)^2 + .15(18 - 21)^2 + .31(20 - 21)^2 + .08(22 - 21)^2 + .15(24 - 21)^2 + .23(25 - 21)^2$
V(X) = 13.25
$\sigma = \sqrt{V(X)} = \sqrt{13.25} \approx 3.64$

Shortcut Formula for Variance
$V(X) = \sigma^2 = \left[ \sum_{x \in D} x^2 \, p(x) \right] - \mu^2 = E(X^2) - [E(X)]^2$

Rules of Variance
$V(aX + b) = \sigma_{aX+b}^2 = a^2 \sigma_X^2$ and $\sigma_{aX+b} = |a| \, \sigma_X$
This leads to the following:
1. $\sigma_{aX}^2 = a^2 \sigma_X^2$, $\sigma_{aX} = |a| \, \sigma_X$
2. $\sigma_{X+b}^2 = \sigma_X^2$

Linear Scaling (affine transformations) aX + b
For any constants a and b, the expectation of the RV aX + b is equal to the sum of the product of a and the expectation of the RV X and the constant b.
E(aX + b) = a E(X) + b
And similarly for the standard deviation (b, an additive factor, does not affect the SD).
SD(aX + b) = |a| SD(X)
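The quiz-score example is a convenient place to compare the definition of the variance with the shortcut formula. The sketch below assumes the 13 scores read as 22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18 (an assumption, though it matches the frequency table), and uses exact fractions so the two formulas can be compared without rounding error. All variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from math import sqrt

# Assumed reading of the 13 quiz scores from the example.
scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]

# Exact pmf: relative frequency of each distinct score.
counts = Counter(scores)
pmf = {x: Fraction(c, len(scores)) for x, c in counts.items()}

mu = sum(x * p for x, p in pmf.items())                       # E(X)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mu ** 2
sd_exact = sqrt(var_definition)

# Same sum with probabilities rounded to two decimals, as in the slide's table.
rounded = {12: 0.08, 18: 0.15, 20: 0.31, 22: 0.08, 24: 0.15, 25: 0.23}
var_rounded = sum(p * (x - 21) ** 2 for x, p in rounded.items())
sd_rounded = sqrt(var_rounded)

if __name__ == "__main__":
    print(mu, var_definition, var_shortcut)
    print(round(var_rounded, 2), round(sd_rounded, 2))
```

With exact relative frequencies the variance comes out as 170/13 (about 13.08); the value 13.25 arises when the probabilities are first rounded to two decimals, as in the table.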

Linear Scaling (affine transformations) aX + b
Why is that so?
E(aX + b) = a E(X) + b, SD(aX + b) = |a| SD(X)

$E(aX + b) = \sum_{x=0}^{n} (a x + b) \, P(X = x)$
$= \sum_{x=0}^{n} a x \, P(X = x) + \sum_{x=0}^{n} b \, P(X = x)$
$= a \sum_{x=0}^{n} x \, P(X = x) + b \sum_{x=0}^{n} P(X = x)$
$= a E(X) + b \times 1 = a E(X) + b$

Example: E(aX + b) = a E(X) + b, SD(aX + b) = |a| SD(X)
1. X = {-1, 2, 0, 3, 4, 0, -2, 1}; P(X = x) = 1/8, for each x.
2. Y = 2X - 5 = {-7, -1, -5, 1, 3, -5, -9, -3}
3. E(X) =
4. E(Y) =
5. Does E(Y) = 2 E(X) - 5?
6. Compute SD(X), SD(Y). Does SD(Y) = 2 SD(X)?

Linear Scaling (affine transformations) aX + b
And why do we care?
E(aX + b) = a E(X) + b, SD(aX + b) = |a| SD(X)
- A completely general strategy for computing the mean and SD of RVs which are obtained from other RVs with known distribution. E.g., if X ~ N(0, 1) and Y = aX + b, then we need not calculate the mean and the SD of Y. We know from the above formulas that E(Y) = b and SD(Y) = |a|.
- These formulas hold for all distributions, not only for Binomial and Normal.
- E.g., say the rules for the game of chance we saw before change and the new pay-off is as follows: {$0, $1.50, $3}, with probabilities of {0.6, 0.3, 0.1}, as before. What is the newly expected return of the game? Remember the old expectation was equal to the entrance fee of $1.50, and the game was fair!
  Y = 3(X - 1)/2 maps {$1, $2, $3} → {$0, $1.50, $3}
  E(Y) = 3/2 E(X) - 3/2 = 3/4 = $0.75
  And the game became clearly biased. Note how easy it is to compute E(Y).

Means and Variances for (in)dependent Variables!
Means, independent or dependent variables {X1, X2, X3, ..., X10}:
E(X1 + X2 + X3 + ... + X10) = E(X1) + E(X2) + E(X3) + ... + E(X10)
Variances, independent variables {X1, X2, X3, ..., X10}: variances add up
Var(X1 + X2 + X3 + ... + X10) = Var(X1) + Var(X2) + Var(X3) + ... + Var(X10)
Variances, dependent variables {X1, X2}: the variance is contingent on the variable dependences. E.g., if X2 = 2X1 + 5,
Var(X1 + X2) = Var(X1 + 2X1 + 5) = Var(3X1 + 5) = Var(3X1) = 9 Var(X1)

The Binomial Probability Distribution
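The exercise with Y = 2X - 5, the game-of-chance calculation, and the dependent-variable variance can all be checked in a few lines of Python. This is only a sketch: it treats the eight listed values of X as equally likely (probability 1/8 each, as stated), uses the population forms `pstdev` and `pvariance` from the standard `statistics` module, and the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

xs = [-1, 2, 0, 3, 4, 0, -2, 1]   # equally likely values of X (probability 1/8 each)
ys = [2 * x - 5 for x in xs]      # Y = 2X - 5

e_x, e_y = mean(xs), mean(ys)
sd_x, sd_y = pstdev(xs), pstdev(ys)

# Dependent pair: X2 = 2*X1 + 5, so X1 + X2 = 3*X1 + 5 and Var(X1 + X2) = 9*Var(X1).
sums = [x + (2 * x + 5) for x in xs]
var_x = pvariance(xs)
var_sum = pvariance(sums)

# Game of chance: old pay-off X in {1, 2, 3} dollars, new pay-off Y = 3(X - 1)/2.
payoff_probs = {1: 0.6, 2: 0.3, 3: 0.1}
e_old = sum(x * p for x, p in payoff_probs.items())
e_new = sum(3 * (x - 1) / 2 * p for x, p in payoff_probs.items())

if __name__ == "__main__":
    print(e_x, e_y, 2 * e_x - 5)        # E(Y) should equal 2 E(X) - 5
    print(sd_y, 2 * sd_x)               # SD(Y) should equal 2 SD(X)
    print(var_sum, 9 * var_x)           # variances of dependent variables do not simply add
    print(round(e_old, 2), round(e_new, 2))
```

Note that Var(X1) + Var(X2) would give 5 Var(X1) here, not 9 Var(X1), which is why the additivity rule needs independence.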

Binomial Experiment
An experiment for which the following four conditions are satisfied is called a binomial experiment.
1. The experiment consists of a sequence of n trials, where n is fixed in advance of the experiment.
2. The trials are identical, and each trial can result in one of the same two possible outcomes, which are denoted by success (S) or failure (F).
3. The trials are independent.
4. The probability of success is constant from trial to trial: denoted by p.

Binomial Experiment
Suppose each trial of an experiment can result in S or F, but the sampling is without replacement from a population of size N. If the sample size n is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.

Binomial Random Variable
Given a binomial experiment consisting of n trials, the binomial random variable X associated with this experiment is defined as
X = the number of S's among n trials

Notation for the pmf of a Binomial rv
Because the pmf of a binomial rv X depends on the two parameters n and p, we denote the pmf by b(x; n, p).

Computation of a Binomial pmf
$b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$ for x = 0, 1, 2, ..., n, and 0 otherwise

BinomialCoinExperiment
UrnExperiment
Ball and Urn Experiment
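The binomial pmf translates almost directly into code. The following is a minimal sketch using `math.comb` from the Python standard library; the function name `binom_pmf` and the parameters n = 4, p = 1/4 are hypothetical choices for illustration.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x) for x = 0..n, and 0 otherwise."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical illustration: n = 4 trials with success probability p = 1/4.
n, p = 4, 0.25
table = {x: binom_pmf(x, n, p) for x in range(n + 1)}

if __name__ == "__main__":
    for x, prob in table.items():
        print(x, round(prob, 4))
    print("total:", sum(table.values()))
```

The explicit range check mirrors the "0 otherwise" branch of the definition, so values outside 0..n return zero instead of raising an error.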

Ex. A card is drawn from a standard 52-card deck. If drawing a club is considered a success, find the probability of
a. exactly one success in 4 draws (with replacement).
p = 1/4; q = 1 - 1/4 = 3/4
$\binom{4}{1} \left(\tfrac{1}{4}\right)^1 \left(\tfrac{3}{4}\right)^3 \approx 0.422$
b. no successes in 5 draws (with replacement).
$\binom{5}{0} \left(\tfrac{1}{4}\right)^0 \left(\tfrac{3}{4}\right)^5 \approx 0.237$

Notation for cdf
For X ~ Bin(n, p), the cdf will be denoted by
$P(X \le x) = B(x; n, p) = \sum_{y=0}^{x} b(y; n, p)$, x = 0, 1, 2, ..., n

Mean and Variance
For X ~ Bin(n, p), E(X) = np, V(X) = np(1 - p) = npq, $\sigma_X = \sqrt{npq}$ (where q = 1 - p).

Ex. 5 cards are drawn, with replacement, from a standard 52-card deck. If drawing a club is considered a success, find the mean, variance, and standard deviation of X (where X is the number of successes).
p = 1/4; q = 1 - 1/4 = 3/4
$\mu = np = 5 \cdot \tfrac{1}{4} = 1.25$
$V(X) = npq = 5 \cdot \tfrac{1}{4} \cdot \tfrac{3}{4} = 0.9375$
$\sigma_X = \sqrt{npq} = \sqrt{0.9375} \approx 0.968$

Ex. If the probability of a student successfully passing this course (C or better) is 0.82, find the probability that given 8 students
a. all 8 pass.
$\binom{8}{8} (0.82)^8 (0.18)^0 \approx 0.2044$
b. none pass.
$\binom{8}{0} (0.82)^0 (0.18)^8 \approx 0.0000011$
c. at least 6 pass.
$\binom{8}{6} (0.82)^6 (0.18)^2 + \binom{8}{7} (0.82)^7 (0.18)^1 + \binom{8}{8} (0.82)^8 (0.18)^0$
≈ 0.2758 + 0.3590 + 0.2044 = 0.8392

Hypergeometric and Negative Binomial Distributions
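The cdf notation and the mean and variance formulas can be exercised on the two examples above. This sketch assumes the pass probability in the last example reads 0.82 (an assumption, though it is the value that reproduces the three terms 0.2758, 0.3590 and 0.2044); the function names are hypothetical.

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """b(x; n, p) for an integer x in 0..n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """B(x; n, p) = sum of b(y; n, p) for y = 0..x."""
    return sum(binom_pmf(y, n, p) for y in range(0, x + 1))


# Card example: 5 draws with replacement, success = club, p = 1/4.
n_cards, p_club = 5, 0.25
mean = n_cards * p_club                      # np
variance = n_cards * p_club * (1 - p_club)   # npq
sd = sqrt(variance)

# Pass/fail example, assuming p = 0.82 and n = 8 students.
n_students, p_pass = 8, 0.82
all_pass = binom_pmf(8, n_students, p_pass)
none_pass = binom_pmf(0, n_students, p_pass)
at_least_6 = 1 - binom_cdf(5, n_students, p_pass)  # P(X >= 6) = 1 - B(5; 8, 0.82)

if __name__ == "__main__":
    print(mean, variance, round(sd, 3))
    print(round(all_pass, 4), none_pass, round(at_least_6, 4))
```

Computing "at least 6" as 1 - B(5; n, p) uses the cdf notation directly and gives the same result as adding the three upper-tail terms.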

The Hypergeometric Distribution
The three assumptions that lead to a hypergeometric distribution:
1. The population or set to be sampled consists of N individuals, objects, or elements (a finite population).
2. Each individual can be characterized as a success (S) or failure (F), and there are M successes in the population.
3. A sample of n individuals is selected without replacement in such a way that each subset of size n is equally likely to be chosen.

Hypergeometric Distribution
If X is the number of S's in a completely random sample of size n drawn from a population consisting of M S's and (N - M) F's, then the probability distribution of X, called the hypergeometric distribution, is given by
$P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x} \binom{N - M}{n - x}}{\binom{N}{n}}$, $\max(0, n - N + M) \le x \le \min(n, M)$

Hypergeometric Mean and Variance
$E(X) = n \cdot \frac{M}{N}$, $V(X) = \left( \frac{N - n}{N - 1} \right) \cdot n \cdot \frac{M}{N} \cdot \left( 1 - \frac{M}{N} \right)$

Ball_and_Urn_Experiment
HyperGeometric Distribution & Binomial Approximation to HyperGeometric
http://socr.stat.ucla.edu/htmls/socr_experiments.html

The Negative Binomial Distribution
The negative binomial rv and distribution are based on an experiment satisfying the following four conditions:
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in a success (S) or a failure (F).
3. The probability of success is constant from trial to trial, so P(S on trial i) = p for i = 1, 2, 3, ...
4. The experiment continues until a total of r successes have been observed, where r is a specified positive integer.
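The hypergeometric pmf and its mean and variance formulas can be verified exactly with rational arithmetic. The sketch below uses a hypothetical population (N = 20, M = 8, n = 5) chosen only for illustration; `hypergeom_pmf` is an illustrative name, not a library function.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N) = C(M, x) C(N - M, n - x) / C(N, n) on its support, else 0."""
    if x < max(0, n - N + M) or x > min(n, M):
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


# Hypothetical population: N = 20 items, M = 8 successes, sample of n = 5.
N, M, n = 20, 8, 5
support = range(max(0, n - N + M), min(n, M) + 1)
pmf = {x: hypergeom_pmf(x, n, M, N) for x in support}

# Mean and variance computed directly from the pmf.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Closed forms: E(X) = n M/N and V(X) = ((N - n)/(N - 1)) n (M/N)(1 - M/N).
mean_formula = Fraction(n * M, N)
variance_formula = Fraction(N - n, N - 1) * n * Fraction(M, N) * (1 - Fraction(M, N))

if __name__ == "__main__":
    print(mean, mean_formula)
    print(variance, variance_formula)
```

The factor (N - n)/(N - 1) is what separates this variance from the binomial value np(1 - p); it approaches 1 when the sample is a small fraction of the population.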

pmf of a Negative Binomial
The pmf of the negative binomial rv X with parameters r = number of S's and p = P(S) is
$nb(x; r, p) = \binom{x + r - 1}{r - 1} p^r (1 - p)^x$, x = 0, 1, 2, ...

Negative Binomial Mean and Variance
$E(X) = \frac{r(1 - p)}{p}$, $V(X) = \frac{r(1 - p)}{p^2}$

NegativeBinomialExperiment

Hypergeometric Distribution & Binomial
Binomial approximation to Hypergeometric: if n/N is small (usually < 0.1), then
$\text{HyperGeom}(x; N, n, M) \approx \text{Bin}(x; n, p)$, with p = M/N
Ex: 4,000 out of 10,000 residents are against a new tax. 15 residents are selected at random. P(at most 7 favor the new tax) = ?
http://socr.stat.ucla.edu/applets.dir/normal_t_chi_f_tables.htm

Geometric, Hypergeometric, Negative Binomial
Negative binomial pmf [X ~ NegBin(r, p); if r = 1, X ~ Geometric(p)]:
$P(X = x) = (1 - p)^x p$ (the geometric case, r = 1)
X counts the failures (x) before the r-th success, so x + r trials are needed in total ("negative" since the number of successes r is fixed and the number of trials is random):
$P(X = x) = \binom{x + r - 1}{r - 1} p^r (1 - p)^x$
$E(X) = \frac{r(1 - p)}{p}$; $Var(X) = \frac{r(1 - p)}{p^2}$

The Poisson Probability Distribution

Poisson Distribution
A random variable X is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf of X is
$p(x; \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0, 1, 2, ...
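A numerical check of the negative binomial pmf, written in the failures-before-the-r-th-success form, may help tie the formulas together. This is a sketch under stated assumptions: the parameters r = 3 and p = 0.4 are hypothetical, the infinite sums are truncated at 400 terms (the remaining tail is negligible for these values), and `negbin_pmf` is an illustrative name.

```python
from math import comb


def negbin_pmf(x, r, p):
    """nb(x; r, p) = C(x + r - 1, r - 1) p^r (1 - p)^x, with x = number of failures."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x


r, p = 3, 0.4          # hypothetical parameters for the check
terms = range(0, 400)  # truncation point for the infinite sums

total = sum(negbin_pmf(x, r, p) for x in terms)
mean = sum(x * negbin_pmf(x, r, p) for x in terms)
variance = sum((x - mean) ** 2 * negbin_pmf(x, r, p) for x in terms)

mean_formula = r * (1 - p) / p          # r(1 - p)/p
variance_formula = r * (1 - p) / p**2   # r(1 - p)/p^2

# With r = 1 the pmf reduces to the geometric form (1 - p)^x * p.
geometric_ok = all(
    abs(negbin_pmf(x, 1, p) - (1 - p) ** x * p) < 1e-15 for x in range(50)
)

if __name__ == "__main__":
    print(round(total, 6), round(mean, 6), round(variance, 6))
    print(mean_formula, variance_formula, geometric_ok)
```

If X is instead defined as the total number of trials, the support starts at r and the mean becomes r/p; the variance is unchanged.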

The Poisson Distribution as a Limit
Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in such a way that np approaches a value λ > 0. Then b(x; n, p) → p(x; λ).

Poisson Distribution Mean and Variance
If X has a Poisson distribution with parameter λ, then
E(X) = V(X) = λ

PoissonExperiment
Poisson2DExperiment

Poisson Process
3 Assumptions:
1. There exists a parameter α > 0 such that for any short time interval of length Δt, the probability that exactly one event is received is αΔt + o(Δt).
2. The probability of more than one event during Δt is o(Δt).
3. The number of events during the time interval Δt is independent of the number that occurred prior to this time interval.

Poisson Distribution
$P_k(t) = e^{-\alpha t} (\alpha t)^k / k!$, so that the number of pulses (events) during a time interval of length t is a Poisson rv with parameter λ = αt. The expected number of pulses (events) during any such time interval is αt, so the expected number during a unit time interval is α.
http://socr.stat.ucla.edu/htmls/socr_experiments.html
PoissonExperiment

Poisson Distribution Definition
Used to model counts: the number of arrivals (k) on a given interval.
The Poisson distribution is also sometimes referred to as the distribution of rare events. Examples of Poisson distributed variables are number of accidents per person, number of sweepstakes won per person, or the number of catastrophic defects found in a production process.
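The limit statement can be watched numerically: hold np fixed at λ and let n grow. The sketch below is only an illustration, with a hypothetical λ = 3 and a comparison restricted to x = 0..10; it also evaluates the Poisson-process formula for a hypothetical rate α = 2 and window length t = 1.5.

```python
from math import comb, exp, factorial


def binom_pmf(x, n, p):
    """b(x; n, p) for an integer x in 0..n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """p(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


lam = 3.0  # hypothetical target value of n * p


def max_gap(n):
    """Largest |b(x; n, lam/n) - p(x; lam)| over x = 0..10."""
    return max(abs(binom_pmf(x, n, lam / n) - poisson_pmf(x, lam)) for x in range(11))


gaps = [max_gap(n) for n in (10, 100, 1000, 10000)]

# Poisson process: with rate alpha per unit time, the count in a window of
# length t is Poisson with parameter alpha * t.
alpha, t = 2.0, 1.5   # hypothetical rate and window length
p_no_events = poisson_pmf(0, alpha * t)   # P_0(t) = exp(-alpha * t)

if __name__ == "__main__":
    for n, gap in zip((10, 100, 1000, 10000), gaps):
        print(n, gap)
    print(p_no_events)
```

The gap shrinks roughly in proportion to 1/n, which is consistent with treating the Poisson pmf as the limit of binomial pmfs with np held at λ.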

Functional Brain Imaging - Positron Emission Tomography (PET)
http://www.nucmed.buffalo.edu

Isotope  Energy (MeV)  Range (mm)  1/2-life  Appl.
11C      0.96          1.1         20 min    receptors
15O      1.7           1.5         2 min     stroke/activation
18F      0.6           1.0         110 min   neurology
124I     ~2.0          1.6         4.5 days  oncology

Poisson Distribution - Mean
Used to model counts: the number of arrivals (k) on a given interval.
Y ~ Poisson(λ), then $P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, k = 0, 1, 2, ...
Mean of Y: $\mu_Y = \lambda$, since
$E(Y) = \sum_{k=0}^{\infty} k \, \frac{\lambda^k e^{-\lambda}}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{k \lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!}$
$= \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$
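The same index shift used for the mean also yields the second factorial moment, and from it the variance. A sketch of that step, for Y ~ Poisson(λ) and using the standard identity Var(Y) = E(Y^2) - [E(Y)]^2:

```latex
\begin{align*}
E[Y(Y-1)] &= \sum_{k=0}^{\infty} k(k-1)\,\frac{\lambda^{k} e^{-\lambda}}{k!}
           = \lambda^{2} e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!}
           = \lambda^{2} e^{-\lambda} e^{\lambda} = \lambda^{2}, \\
E(Y^{2})  &= E[Y(Y-1)] + E(Y) = \lambda^{2} + \lambda, \\
\mathrm{Var}(Y) &= E(Y^{2}) - [E(Y)]^{2} = \lambda^{2} + \lambda - \lambda^{2} = \lambda .
\end{align*}
```

The terms with k = 0 and k = 1 vanish in the first sum, which is why it can start at k = 2.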

Poisson Distribution - Variance
Y ~ Poisson(λ), then $P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}$, k = 0, 1, 2, ...
Variance of Y: $\sigma_Y = \lambda^{1/2}$, since
$\sigma_Y^2 = Var(Y) = \sum_{k=0}^{\infty} (k - \lambda)^2 \, \frac{\lambda^k e^{-\lambda}}{k!} = \dots = \lambda$

Poisson Distribution - Example
For example, suppose that Y denotes the number of blocked shots (arrivals) in a randomly sampled game for the UCLA Bruins men's basketball team. Then a Poisson distribution with mean = 4 may be used to model Y.
[Figure: Poisson distribution with mean = 4]

Poisson as an approximation to Binomial
Suppose we have a sequence of Binomial(n, p_n) models, with n p_n → λ as n → infinity.
For each 0 <= y <= n, if Y_n ~ Binomial(n, p_n), then
$P(Y_n = y) = \binom{n}{y} p_n^y (1 - p_n)^{n - y}$
But this converges to (WHY?):
$\binom{n}{y} p_n^y (1 - p_n)^{n - y} \to \frac{\lambda^y e^{-\lambda}}{y!}$ as $n \to \infty$ with $n p_n \to \lambda$
Thus, Binomial(n, p_n) → Poisson(λ)

Poisson as an approximation to Binomial
Rule of thumb is that approximation is good if:
n >= 100
p <= 0.01
λ = n p <= 20
Then, Binomial(n, p_n) → Poisson(λ)

Example using Poisson approx to Binomial
Suppose P(defective chip) = 0.0001 = 10^-4. Find the probability that a lot of 25,000 chips has > 2 defective!
Y ~ Binomial(25,000, 0.0001), find P(Y > 2). Note that Z ~ Poisson(λ = n p = 25,000 × 0.0001 = 2.5)
$P(Z > 2) = 1 - P(Z \le 2) = 1 - \sum_{z=0}^{2} \frac{2.5^z}{z!} e^{-2.5} = 1 - \left( \frac{2.5^0}{0!} e^{-2.5} + \frac{2.5^1}{1!} e^{-2.5} + \frac{2.5^2}{2!} e^{-2.5} \right) \approx 0.456$
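The defective-chip calculation can be compared with the exact binomial tail to see how close the approximation is. The sketch below assumes the lot size reads 25,000 and the defect probability 0.0001 (so λ = 2.5); the function names are illustrative.

```python
from math import comb, exp, factorial

n, p = 25_000, 0.0001   # assumed reading of the lot size and defect probability
lam = n * p             # Poisson parameter, lambda = n * p


def poisson_pmf(z, lam):
    """P(Z = z) for Z ~ Poisson(lam)."""
    return exp(-lam) * lam**z / factorial(z)


def binom_pmf(y, n, p):
    """P(Y = y) for Y ~ Binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


# P(more than 2 defective) = 1 - P(at most 2 defective), computed both ways.
tail_poisson = 1 - sum(poisson_pmf(z, lam) for z in range(3))
tail_binomial = 1 - sum(binom_pmf(y, n, p) for y in range(3))

if __name__ == "__main__":
    print(round(tail_poisson, 4), round(tail_binomial, 4))
```

These parameters satisfy the rule of thumb (n >= 100, p <= 0.01, np <= 20), and the two tails should agree to about three decimal places.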