Probability and Statistics
Prof. Zheng Zheng

Random Variable

A finite single-valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers $R$ is a r.v. if the set $\{\xi \mid X(\xi) \le x\}$ is an event ($\in F$) for every $x$ in $R$, and the probabilities of the events $\{X = +\infty\}$ and $\{X = -\infty\}$ are 0.

$X$ is a r.v. if $X^{-1}(B) \in F$, where $B$ represents semi-infinite intervals of the form $\{-\infty < x \le a\}$ and all other sets that can be constructed from these sets by performing the set operations of union, intersection and negation any number of times.

[Figure: the mapping $X$, with labels $A$, $B$ and $R$]
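
If $X$ is a r.v., then $\{\xi \mid a < X(\xi) \le b\}$ is also an event. Since $\{X \le a\}$ and $\{X \le b\}$ are events, the complement $\{X \le a\}^c = \{X > a\}$ is an event. Thus
$$\{X > a\} \cap \{X \le b\} = \{a < X \le b\}$$
is an event. Consequently $\{a - \tfrac{1}{n} < X \le a\}$ is an event for every $n$, and
$$\bigcap_{n=1}^{\infty} \left\{a - \tfrac{1}{n} < X \le a\right\} = \{X = a\}$$
is also an event.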
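
Probability Distribution Function (PDF)

Denote
$$P\{\xi \mid X(\xi) \le x\} = F_X(x) \ge 0.$$
$F_X(x)$ is said to be the Probability Distribution Function (PDF) associated with the r.v. $X$. The subscript $X$ identifies the r.v.

If $g(x)$ is a PDF, then it is nondecreasing and right-continuous, that is:
(i) $g(+\infty) = 1$, $g(-\infty) = 0$,
(ii) if $x_1 < x_2$, then $g(x_1) \le g(x_2)$,
(iii) $g(x^+) = g(x)$ for all $x$.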
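
From the earlier definition of $F_X(x)$, we have

(i) $F_X(+\infty) = P\{\xi \mid X(\xi) \le +\infty\} = P(\Omega) = 1$ and $F_X(-\infty) = P\{\xi \mid X(\xi) \le -\infty\} = P(\phi) = 0$.

(ii) If $x_1 < x_2$, then the subset $(-\infty, x_1) \subset (-\infty, x_2)$. Consequently the event $\{\xi \mid X(\xi) \le x_1\} \subset \{\xi \mid X(\xi) \le x_2\}$, since $X(\xi) \le x_1$ implies $X(\xi) \le x_2$. As a result
$$F_X(x_1) = P(X(\xi) \le x_1) \le P(X(\xi) \le x_2) = F_X(x_2),$$
implying that the probability distribution function is nonnegative and monotone nondecreasing.

(iii) Let $x < x_n < x_{n-1} < \cdots < x_2 < x_1$, and consider the event $A_k = \{\xi \mid x < X(\xi) \le x_k\}$. Since
$$\{x < X(\xi) \le x_k\} \cup \{X(\xi) \le x\} = \{X(\xi) \le x_k\},$$
using the mutually exclusive property of events we get
$$P(A_k) = P(x < X(\xi) \le x_k) = F_X(x_k) - F_X(x).$$
But $\cdots \subset A_{k+1} \subset A_k \subset A_{k-1} \subset \cdots$, and hence
$$\lim_{k \to \infty} A_k = \bigcap_{k=1}^{\infty} A_k = \phi \quad \text{and hence} \quad \lim_{k \to \infty} P(A_k) = 0.$$
Thus
$$\lim_{k \to \infty} P(A_k) = \lim_{k \to \infty} F_X(x_k) - F_X(x) = 0.$$
But $\lim_{k \to \infty} x_k = x^+$, the right limit of $x$, and hence
$$F_X(x^+) = F_X(x),$$
i.e., $F_X(x)$ is right-continuous, justifying all properties of a distribution function.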
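
Additional Properties of a PDF

(iv) If $F_X(x_0) = 0$ for some $x_0$, then $F_X(x) = 0$ for $x \le x_0$. This follows since, for any $x \le x_0$, the event $\{X(\xi) \le x\}$ is a subset of $\{X(\xi) \le x_0\}$, whose probability $F_X(x_0)$ is zero.

(v) $P\{X(\xi) > x\} = 1 - F_X(x)$. We have $\{X(\xi) \le x\} \cup \{X(\xi) > x\} = \Omega$, and since the two events are mutually exclusive, the above follows.

(vi) $P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1)$, $x_2 > x_1$. The events $\{X(\xi) \le x_1\}$ and $\{x_1 < X(\xi) \le x_2\}$ are mutually exclusive and their union represents the event $\{X(\xi) \le x_2\}$.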
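
(vii) $P\{X(\xi) = x\} = F_X(x) - F_X(x^-)$.

Let $x_1 = x - \varepsilon$, $\varepsilon > 0$, and $x_2 = x$. From (vi),
$$\lim_{\varepsilon \to 0} P\{x - \varepsilon < X(\xi) \le x\} = F_X(x) - \lim_{\varepsilon \to 0} F_X(x - \varepsilon),$$
or
$$P\{X(\xi) = x\} = F_X(x) - F_X(x^-).$$
$F_X(x_0^+)$, the limit of $F_X(x)$ as $x \to x_0$ from the right, always exists and equals $F_X(x_0)$. However the left limit value $F_X(x_0^-)$ need not equal $F_X(x_0)$. Thus $F_X(x)$ need not be continuous from the left. At a discontinuity point of the distribution, the left and right limits are different, and from above
$$P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) > 0.$$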
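
Thus the only discontinuities of a distribution function $F_X(x)$ are of the jump type, and occur at points $x_0$ where $P\{X(\xi) = x_0\} > 0$. These points can always be enumerated as a sequence, and moreover they are at most countable in number.

Example 1: $X$ is a r.v. such that $X(\xi) = c$, $\xi \in \Omega$. Find $F_X(x)$.

Solution: For $x < c$, $\{X(\xi) \le x\} = \phi$, so that $F_X(x) = 0$, and for $x \ge c$, $\{X(\xi) \le x\} = \Omega$, so that $F_X(x) = 1$.

[Figure: $F_X(x)$ for Example 1, a unit step at $x = c$]

Example 2: Toss a coin, $\Omega = \{H, T\}$. Suppose the r.v. $X$ is such that $X(T) = 0$, $X(H) = 1$. Find $F_X(x)$.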
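
Solution: For $x < 0$, $\{X(\xi) \le x\} = \phi$, so that $F_X(x) = 0$. For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $F_X(x) = P\{T\} = 1 - p = q$. For $x \ge 1$, $\{X(\xi) \le x\} = \{H, T\} = \Omega$, so that $F_X(x) = 1$ (Fig. 3).

[Figure: $F_X(x)$ for Example 2, with levels $q$ and $1$]

$X$ is said to be a continuous-type r.v. if its distribution function $F_X(x)$ is continuous. In that case $F_X(x^-) = F_X(x)$ for all $x$, and we get $P\{X = x\} = 0$.

If $F_X(x)$ is constant except for a finite number of jump discontinuities (piecewise constant; step-type), then $X$ is said to be a discrete-type r.v. If $x_i$ is such a discontinuity point, then
$$p_i = P\{X = x_i\} = F_X(x_i) - F_X(x_i^-).$$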
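
From the figure for Example 1, at the point of discontinuity we get
$$P\{X = c\} = F_X(c) - F_X(c^-) = 1 - 0 = 1,$$
and from the figure for Example 2, with $q = P\{T\}$,
$$P\{X = 0\} = F_X(0) - F_X(0^-) = q - 0 = q.$$

Example 3: A fair coin is tossed twice, and let the r.v. $X$ represent the number of heads. Find $F_X(x)$.

Solution: In this case $\Omega = \{HH, HT, TH, TT\}$, and
$$X(HH) = 2, \quad X(HT) = 1, \quad X(TH) = 1, \quad X(TT) = 0.$$
For $x < 0$: $\{X(\xi) \le x\} = \phi$, so $F_X(x) = 0$.
For $0 \le x < 1$: $\{X(\xi) \le x\} = \{TT\}$, so $F_X(x) = P\{TT\} = P(T)P(T) = 1/4$.
For $1 \le x < 2$: $\{X(\xi) \le x\} = \{TT, HT, TH\}$, so $F_X(x) = P\{TT, HT, TH\} = 3/4$.
For $x \ge 2$: $\{X(\xi) \le x\} = \Omega$, so $F_X(x) = 1$.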
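
The step-shaped distribution function of Example 3 can be tabulated directly from the four outcomes. The following is only an illustrative sketch: the names `outcomes`, `cdf` and `mass_at` are introduced here for the illustration and do not come from the notes. Exact fractions are used so the values 1/4, 3/4 and 1 appear without rounding.

```python
from fractions import Fraction

# Number of heads X(xi) for each outcome xi of two tosses (Example 3).
outcomes = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}
prob_each = Fraction(1, 4)  # fair coin: the four outcomes are equally likely


def cdf(x):
    """F_X(x) = P{X <= x}: total probability of outcomes with X(xi) <= x."""
    return sum((prob_each for heads in outcomes.values() if heads <= x), Fraction(0))


def mass_at(x0):
    """P{X = x0}: total probability of outcomes with X(xi) == x0."""
    return sum((prob_each for heads in outcomes.values() if heads == x0), Fraction(0))


# Evaluate F_X on both sides of each jump point 0, 1, 2.
for x in (-0.5, 0, 0.5, 1, 1.5, 2, 3):
    print(f"F_X({x}) = {cdf(x)}")
```

The size of each jump of `cdf` equals the point mass returned by `mass_at`, in line with property (vii).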

From Fig. 3, $P\{X = 1\} = F_X(1) - F_X(1^-) = 3/4 - 1/4 = 1/2$.

Probability density function (p.d.f.)

The derivative of the distribution function $F_X(x)$ is called the probability density function $f_X(x)$ of the r.v. $X$. Thus
$$f_X(x) = \frac{dF_X(x)}{dx}.$$
Since
$$\frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \ge 0$$
from the monotone-nondecreasing nature of $F_X(x)$, it follows that $f_X(x) \ge 0$ for all $x$. $f_X(x)$ will be a continuous function if $X$ is a continuous-type r.v. However, if $X$ is a discrete-type r.v. as in the examples above, then its p.d.f. has the general form (Fig. 5)
$$f_X(x) = \sum_i p_i \, \delta(x - x_i),$$
where $x_i$ represent the jump-discontinuity points in $F_X(x)$. As Fig. 5 shows, $f_X(x)$ represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f.) in the discrete case.

[Fig. 5: discrete masses $p_i$ located at the points $x_i$]

We also obtain by integration
$$F_X(x) = \int_{-\infty}^{x} f_X(u) \, du.$$
Since $F_X(+\infty) = 1$, it yields
$$\int_{-\infty}^{+\infty} f_X(x) \, dx = 1,$$
which justifies its name as the density function. Further, we also get (Fig. 6b)
$$P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x) \, dx.$$
Thus the area under $f_X(x)$ in the interval $(x_1, x_2)$ represents the probability $P\{x_1 < X(\xi) \le x_2\}$.

[Fig. 6: (a) $F_X(x)$, (b) $f_X(x)$]

Often, r.v.s are referred to by their specific density functions, both in the continuous and discrete cases, and in what follows we shall list a number of them in each category.
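
Continuous-type random variables

1. Normal (Gaussian): $X$ is said to be a normal or Gaussian r.v. if
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x - \mu)^2 / (2\sigma^2)}.$$
This is a bell-shaped curve (Fig. 7), symmetric around the parameter $\mu$, and its distribution function is given by
$$F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(y - \mu)^2 / (2\sigma^2)} \, dy = G\!\left(\frac{x - \mu}{\sigma}\right),$$
where
$$G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy$$
is often tabulated. Since $f_X(x)$ depends on two parameters $\mu$ and $\sigma^2$, the notation $X \sim N(\mu, \sigma^2)$ is often used.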
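
2. Uniform: $X \sim U(a, b)$, $a < b$, if (Fig. 8)
$$f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases}$$

3. Exponential: $X \sim \varepsilon(\lambda)$ if (Fig. 9)
$$f_X(x) = \begin{cases} \dfrac{1}{\lambda} \, e^{-x/\lambda}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$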
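
4. Gamma: $X \sim G(\alpha, \beta)$ if ($\alpha > 0$, $\beta > 0$) (Fig. 10)
$$f_X(x) = \begin{cases} \dfrac{x^{\alpha - 1}}{\Gamma(\alpha) \, \beta^{\alpha}} \, e^{-x/\beta}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$
If $\alpha = n$ is an integer, $\Gamma(n) = (n - 1)!$.

5. Beta: $X \sim \beta(a, b)$ if ($a > 0$, $b > 0$) (Fig. 11)
$$f_X(x) = \begin{cases} \dfrac{1}{\beta(a, b)} \, x^{a - 1} (1 - x)^{b - 1}, & 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases}$$
where the Beta function $\beta(a, b)$ is defined as
$$\beta(a, b) = \int_0^1 u^{a - 1} (1 - u)^{b - 1} \, du.$$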
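
6. Chi-Square: $X \sim \chi^2(n)$ if (Fig. 12)
$$f_X(x) = \begin{cases} \dfrac{1}{2^{n/2} \, \Gamma(n/2)} \, x^{n/2 - 1} e^{-x/2}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$
Note that $\chi^2(n)$ is the same as Gamma$(n/2, 2)$.

7. Rayleigh: $X \sim R(\sigma^2)$ if (Fig. 13)
$$f_X(x) = \begin{cases} \dfrac{x}{\sigma^2} \, e^{-x^2 / (2\sigma^2)}, & x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$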
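
A quick numerical check of some of the densities listed above: each should integrate to 1, and the area between two points should equal the difference of the distribution function. This is a minimal sketch only; the function names, the default parameter values and the midpoint-rule helper `integrate` are assumptions made for the illustration, not part of the notes.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


def exponential_pdf(x, lam=1.0):
    # Parameterized as in the notes: (1/lam) * exp(-x/lam) for x >= 0.
    return math.exp(-x / lam) / lam if x >= 0 else 0.0


def gamma_pdf(x, alpha=2.0, beta=1.0):
    if x < 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)


def chi_square_pdf(x, n=4):
    if x < 0:
        return 0.0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))


def rayleigh_pdf(x, sigma=1.0):
    return x / sigma ** 2 * math.exp(-x ** 2 / (2 * sigma ** 2)) if x >= 0 else 0.0


def integrate(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over (a, b)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Total area under each density (finite limits stand in for infinity).
print("normal     :", integrate(normal_pdf, -10, 10))
print("exponential:", integrate(exponential_pdf, 0, 60))
print("gamma      :", integrate(gamma_pdf, 0, 60))
print("rayleigh   :", integrate(rayleigh_pdf, 0, 20))

# P{1 < X <= 2} for the exponential case: area versus F_X(2) - F_X(1).
print(integrate(exponential_pdf, 1, 2), math.exp(-1) - math.exp(-2))
```

The last line uses $F_X(x) = 1 - e^{-x/\lambda}$ for the exponential density with $\lambda = 1$.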
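
Discrete-type random variables

1. Bernoulli: $X$ takes the values $(0, 1)$, and
$$P(X = 0) = q, \qquad P(X = 1) = p.$$

2. Binomial: $X \sim B(n, p)$ if (Fig. 17)
$$P(X = k) = \binom{n}{k} p^k q^{n - k}, \qquad k = 0, 1, 2, \ldots, n.$$

3. Poisson: $X \sim P(\lambda)$ if (Fig. 18)
$$P(X = k) = e^{-\lambda} \, \frac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots$$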
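
4. Hypergeometric:
$$P(X = k) = \frac{\dbinom{m}{k} \dbinom{N - m}{n - k}}{\dbinom{N}{n}}, \qquad \max(0, m + n - N) \le k \le \min(m, n).$$

5. Geometric: $X \sim g(p)$ if
$$P(X = k) = p \, q^k, \qquad k = 0, 1, 2, \ldots, \qquad q = 1 - p.$$

6. Negative Binomial: $X \sim NB(r, p)$ if
$$P(X = k) = \binom{k - 1}{r - 1} p^r q^{k - r}, \qquad k = r, r + 1, \ldots$$

7. Discrete-Uniform:
$$P(X = k) = \frac{1}{N}, \qquad k = 1, 2, \ldots, N.$$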
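
The probability mass functions listed above translate almost line by line into code. The sketch below is illustrative only: the function names and argument order are choices made here, and `math.comb` requires Python 3.8 or later. Summing each p.m.f. over its range is a simple way to confirm that the formulas were transcribed consistently.

```python
import math


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def hypergeometric_pmf(k, N, m, n):
    # Valid for max(0, m + n - N) <= k <= min(m, n).
    return math.comb(m, k) * math.comb(N - m, n - k) / math.comb(N, n)


def geometric_pmf(k, p):
    # k = 0, 1, 2, ... with q = 1 - p.
    return p * (1 - p) ** k


def negative_binomial_pmf(k, r, p):
    # k = r, r + 1, ...
    return math.comb(k - 1, r - 1) * p ** r * (1 - p) ** (k - r)


# Each p.m.f. should sum to (approximately) 1 over its range;
# the infinite ranges are truncated where the tail is negligible.
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
print(sum(poisson_pmf(k, 3.0) for k in range(100)))
print(sum(hypergeometric_pmf(k, 20, 7, 5) for k in range(0, 6)))
print(sum(geometric_pmf(k, 0.3) for k in range(500)))
print(sum(negative_binomial_pmf(k, 3, 0.4) for k in range(3, 400)))
```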
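
Polya's distribution

Includes both binomial and hypergeometric as special cases.

A box contains $a$ white balls and $b$ black balls. A ball is drawn at random, and it is replaced along with $c$ balls of the same color. If $X$ represents the number of white balls drawn in $n$ such draws, $X = 0, 1, 2, \ldots, n$, find the probability mass function of $X$.

Solution: Consider the specific sequence of draws where $k$ white balls are first drawn, followed by $n - k$ black balls. The probability of drawing $k$ successive white balls is given by
$$p_w = \frac{a}{a + b} \cdot \frac{a + c}{a + b + c} \cdot \frac{a + 2c}{a + b + 2c} \cdots \frac{a + (k - 1)c}{a + b + (k - 1)c}.$$
Similarly the probability of drawing $k$ white balls followed by $n - k$ black balls is given by
$$p_k = p_w \cdot \frac{b}{a + b + kc} \cdot \frac{b + c}{a + b + (k + 1)c} \cdots \frac{b + (n - k - 1)c}{a + b + (n - 1)c} = \prod_{i=0}^{k-1} \frac{a + ic}{a + b + ic} \; \prod_{j=0}^{n-k-1} \frac{b + jc}{a + b + (j + k)c}.$$
Interestingly, $p_k$ above also represents the probability of drawing $k$ white balls and $(n - k)$ black balls in any other specific order (i.e., the same set of numerator and denominator terms contribute to all other sequences as well). But there are $\binom{n}{k}$ such distinct mutually exclusive sequences, and summing over all of them, we obtain the Polya distribution (probability of getting $k$ white balls in $n$ draws) to be
$$P(X = k) = \binom{n}{k} p_k = \binom{n}{k} \prod_{i=0}^{k-1} \frac{a + ic}{a + b + ic} \; \prod_{j=0}^{n-k-1} \frac{b + jc}{a + b + (j + k)c}, \qquad k = 0, 1, 2, \ldots, n.$$

Both the binomial distribution and the hypergeometric distribution are its special cases. For example, if draws are done with replacement, then $c = 0$ and it simplifies to the binomial distribution
$$P(X = k) = \binom{n}{k} p^k q^{n - k}, \qquad k = 0, 1, 2, \ldots, n,$$
where
$$p = \frac{a}{a + b}, \qquad q = \frac{b}{a + b} = 1 - p.$$
Similarly, if the draws are conducted without replacement, then $c = -1$, and it gives
$$P(X = k) = \binom{n}{k} \frac{a(a - 1)(a - 2) \cdots (a - k + 1) \; b(b - 1) \cdots (b - n + k + 1)}{(a + b)(a + b - 1) \cdots (a + b - k + 1) \, (a + b - k) \cdots (a + b - n + 1)}$$
$$= \frac{n!}{k! \, (n - k)!} \cdot \frac{a! \, (a + b - k)!}{(a - k)! \, (a + b)!} \cdot \frac{b! \, (a + b - n)!}{(b - n + k)! \, (a + b - k)!} = \frac{\dbinom{a}{k} \dbinom{b}{n - k}}{\dbinom{a + b}{n}},$$
which represents the hypergeometric distribution. Finally, $c = +1$ gives (replacements are doubled)
$$P(X = k) = \binom{n}{k} \frac{(a + k - 1)! \, (a + b - 1)!}{(a - 1)! \, (a + b + k - 1)!} \cdot \frac{(b + n - k - 1)! \, (a + b + k - 1)!}{(b - 1)! \, (a + b + n - 1)!} = \frac{\dbinom{a + k - 1}{k} \dbinom{b + n - k - 1}{n - k}}{\dbinom{a + b + n - 1}{n}}.$$
We shall refer to it as Polya's +1 distribution. The general Polya distribution has been used to study the spread of contagious diseases (epidemic modeling).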
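
The product formula for the Polya distribution and its three special cases can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the function name `polya_pmf` and the test values $a = 5$, $b = 7$, $n = 4$ are chosen here and do not come from the notes. For $c = -1$ the number of draws must not exceed $a + b$, otherwise a denominator becomes zero.

```python
from fractions import Fraction
from math import comb


def polya_pmf(k, n, a, b, c):
    """P(X = k): k white balls in n draws, starting from a white and b black
    balls, with c extra balls of the drawn color added after each draw."""
    p_k = Fraction(1)
    for i in range(k):            # k white balls drawn first
        p_k *= Fraction(a + i * c, a + b + i * c)
    for j in range(n - k):        # followed by n - k black balls
        p_k *= Fraction(b + j * c, a + b + (j + k) * c)
    return comb(n, k) * p_k       # all orderings are equally likely


a, b, n = 5, 7, 4                 # illustrative values only
for k in range(n + 1):
    binomial = comb(n, k) * Fraction(a, a + b) ** k * Fraction(b, a + b) ** (n - k)
    hypergeometric = Fraction(comb(a, k) * comb(b, n - k), comb(a + b, n))
    print(k,
          polya_pmf(k, n, a, b, 0) == binomial,         # c = 0: binomial
          polya_pmf(k, n, a, b, -1) == hypergeometric,  # c = -1: hypergeometric
          polya_pmf(k, n, a, b, 1))                     # c = +1: Polya's +1
```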
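
Binomial Random Variable Approximations

Let $X$ represent a Binomial r.v. Then
$$P(k_1 \le X \le k_2) = \sum_{k = k_1}^{k_2} \binom{n}{k} p^k q^{n - k}.$$
Since the binomial coefficient $\binom{n}{k} = \dfrac{n!}{(n - k)! \, k!}$ grows quite rapidly with $n$, it is difficult to compute this sum for large $n$. In this context, two approximations are extremely useful.

The Normal Approximation (DeMoivre-Laplace Theorem)

Suppose $n \to \infty$ with $p$ held fixed. Then for $k$ in the $\sqrt{npq}$ neighborhood of $np$, we can approximate
$$\binom{n}{k} p^k q^{n - k} \approx \frac{1}{\sqrt{2\pi npq}} \, e^{-(k - np)^2 / (2npq)}.$$
Thus if $k_1$ and $k_2$ are within or around the neighborhood of the interval $(np - \sqrt{npq}, \; np + \sqrt{npq})$, we can approximate the summation by an integration. In that case it reduces to
$$P(k_1 \le X \le k_2) \approx \int_{k_1}^{k_2} \frac{1}{\sqrt{2\pi npq}} \, e^{-(x - np)^2 / (2npq)} \, dx = \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy,$$
where
$$x_1 = \frac{k_1 - np}{\sqrt{npq}}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}}.$$
We can express this in terms of the normalized integral
$$\operatorname{erf}(x) = \frac{1}{\sqrt{2\pi}} \int_0^x e^{-y^2/2} \, dy = -\operatorname{erf}(-x),$$
which has been tabulated extensively (see the table of erf values).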
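
For example, if $x_1$ and $x_2$ are both positive, we obtain
$$P(k_1 \le X \le k_2) = \operatorname{erf}(x_2) - \operatorname{erf}(x_1).$$

Example 1: A fair coin is tossed 5,000 times. Find the probability that the number of heads is between 2,475 and 2,525.

Solution: We need $P(2{,}475 \le X \le 2{,}525)$. Here $n$ is large so that we can use the normal approximation. In this case $p = 1/2$, so that $np = 2{,}500$ and $\sqrt{npq} \approx 35$. Since $np - \sqrt{npq} \approx 2{,}465$ and $np + \sqrt{npq} \approx 2{,}535$, the approximation is valid for $k_1 = 2{,}475$ and $k_2 = 2{,}525$. Thus
$$P(k_1 \le X \le k_2) \approx \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy.$$
Here
$$x_1 = \frac{k_1 - np}{\sqrt{npq}} = -\frac{5}{7}, \qquad x_2 = \frac{k_2 - np}{\sqrt{npq}} = \frac{5}{7}.$$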

Table: $\operatorname{erf}(x) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_0^x e^{-y^2/2} \, dy = G(x) - \dfrac{1}{2}$

  x     erf(x)      x     erf(x)      x     erf(x)      x     erf(x)
 0.05   0.01994    0.80   0.28814    1.55   0.43943    2.30   0.48928
 0.10   0.03983    0.85   0.30234    1.60   0.44520    2.35   0.49061
 0.15   0.05962    0.90   0.31594    1.65   0.45053    2.40   0.49180
 0.20   0.07926    0.95   0.32894    1.70   0.45543    2.45   0.49286
 0.25   0.09871    1.00   0.34134    1.75   0.45994    2.50   0.49379
 0.30   0.11791    1.05   0.35314    1.80   0.46407    2.55   0.49461
 0.35   0.13683    1.10   0.36433    1.85   0.46784    2.60   0.49534
 0.40   0.15542    1.15   0.37493    1.90   0.47128    2.65   0.49597
 0.45   0.17364    1.20   0.38493    1.95   0.47441    2.70   0.49653
 0.50   0.19146    1.25   0.39435    2.00   0.47726    2.75   0.49702
 0.55   0.20884    1.30   0.40320    2.05   0.47982    2.80   0.49744
 0.60   0.22575    1.35   0.41149    2.10   0.48214    2.85   0.49781
 0.65   0.24215    1.40   0.41924    2.15   0.48422    2.90   0.49813
 0.70   0.25804    1.45   0.42647    2.20   0.48610    2.95   0.49841
 0.75   0.27337    1.50   0.43319    2.25   0.48778    3.00   0.49865
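
The tabulated function is not the standard error function found in most libraries: with the definition used in these notes, $\operatorname{erf}(x) = G(x) - 1/2$, which equals one half of the standard error function evaluated at $x/\sqrt{2}$. The sketch below, with the hypothetical names `erf_table`, `normal_approx` and `binomial_exact` introduced only for this illustration, reproduces table entries with Python's `math.erf` and compares the normal approximation with the exact binomial sum for the coin-tossing example ($n = 5{,}000$, $p = 1/2$). The exact sum is evaluated through logarithms to avoid overflow, which is an implementation choice made here.

```python
import math


def erf_table(x):
    """(1/sqrt(2*pi)) * integral from 0 to x of exp(-y^2/2) dy, as tabulated."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def normal_approx(n, p, k1, k2):
    """DeMoivre-Laplace approximation of P(k1 <= X <= k2) for X ~ B(n, p)."""
    s = math.sqrt(n * p * (1 - p))
    x1 = (k1 - n * p) / s
    x2 = (k2 - n * p) / s
    return erf_table(x2) - erf_table(x1)


def binomial_exact(n, p, k1, k2):
    """Exact binomial sum, with each term computed via log-gamma."""
    total = 0.0
    for k in range(k1, k2 + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_term)
    return total


print(erf_table(0.70), erf_table(1.00), erf_table(3.00))  # compare with the table
print(normal_approx(5000, 0.5, 2475, 2525))
print(binomial_exact(5000, 0.5, 2475, 2525))
```

The approximation here uses $\sqrt{npq} = \sqrt{1250}$ rather than the rounded value 35, so its result differs slightly from a hand calculation based on $5/7$.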
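
Since $x_1 < 0$, from panel (b) of the figure below, the probability in Example 1 is given by
$$P(2{,}475 \le X \le 2{,}525) = \operatorname{erf}(x_2) - \operatorname{erf}(x_1) = \operatorname{erf}(x_2) + \operatorname{erf}(|x_1|) = 2 \operatorname{erf}\!\left(\tfrac{5}{7}\right) \approx 0.516,$$
where we have used the table value $\operatorname{erf}(0.7) = 0.258$.

[Figure: the density $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ for (a) $x_1 > 0$, $x_2 > 0$ and (b) $x_1 < 0$, $x_2 > 0$]

The Poisson Approximation

As we have mentioned earlier, for large $n$, the Gaussian approximation of a binomial r.v. is valid only if $p$ is fixed, i.e., only if $np \gg 1$ and $npq \gg 1$. What if $np$ is small, or if it does not increase with $n$?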

Obviously that is the case if, for example, $p \to 0$ as $n \to \infty$, such that $np = \lambda$ is a fixed number.

Many random phenomena in nature in fact follow this pattern. The total number of calls on a telephone line, claims in an insurance company, etc. tend to follow this type of behavior. Consider random arrivals such as telephone calls over a line. Let $n$ represent the total number of calls in the interval $(0, T)$. From our experience, as $T \to \infty$ we have $n \to \infty$, so that we may assume $n = \mu T$. Consider a small interval of duration $\Delta$ as in the figure below. If there is only a single call coming in, the probability $p$ of that single call occurring in that interval must depend on its relative size with respect to $T$.

[Figure: an interval of duration $\Delta$ inside $(0, T)$]
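
Hence we may assume $p = \Delta / T$. Note that $p \to 0$ as $T \to \infty$. However in this case
$$np = \mu T \cdot \frac{\Delta}{T} = \mu \Delta = \lambda$$
is a constant, and the normal approximation is invalid here.

Suppose the interval $\Delta$ in the figure is of interest to us. A call inside that interval is a "success" (H), whereas one outside is a "failure" (T). This is equivalent to the coin tossing situation, and hence the probability $P_n(k)$ of obtaining $k$ calls (in any order) in an interval of duration $\Delta$ is given by the binomial p.m.f. Thus
$$P_n(k) = \frac{n!}{(n - k)! \, k!} \, p^k (1 - p)^{n - k},$$
and here as $n \to \infty$, $p \to 0$ such that $np = \lambda$. It is easy to obtain an excellent approximation in that situation. To see this, rewrite $P_n(k)$ as
$$P_n(k) = \frac{n(n - 1) \cdots (n - k + 1)}{n^k} \, \frac{(np)^k}{k!} \left(1 - \frac{np}{n}\right)^{n - k} = \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k - 1}{n}\right) \frac{\lambda^k}{k!} \, \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^k}.$$
Thus
$$\lim_{n \to \infty, \; p \to 0, \; np = \lambda} P_n(k) = \frac{\lambda^k}{k!} \, e^{-\lambda},$$
since the finite product $\left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k - 1}{n}\right)$ as well as $\left(1 - \frac{\lambda}{n}\right)^k$ tend to unity as $n \to \infty$, and
$$\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}.$$
The right side of the limit represents the Poisson p.m.f., and the Poisson approximation to the binomial r.v. is valid in situations where the binomial r.v. parameters $n$ and $p$ diverge to two extremes ($n \to \infty$, $p \to 0$) such that their product $np$ is a constant.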
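
The limit above can be observed numerically by holding $np = \lambda$ fixed and letting $n$ grow. The following is a small sketch; the values $\lambda = 2$ and $k = 3$ and the function names are chosen here purely for illustration.

```python
import math


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam, k = 2.0, 3  # illustrative values only
for n in (10, 100, 1_000, 10_000):
    p = lam / n  # keep the product n * p equal to lam
    b = binomial_pmf(k, n, p)
    print(f"n = {n:6d}  binomial = {b:.6f}  poisson = {poisson_pmf(k, lam):.6f}")
```

The gap between the two columns shrinks as $n$ increases, which is the behavior the limiting argument predicts.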
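
Example 2: Winning a Lottery: Suppose two million lottery tickets are issued with 100 winning tickets among them. (a) If a person purchases 100 tickets, what is the probability of winning? (b) How many tickets should one buy to be 95% confident of having a winning ticket?

Solution: The probability of buying a winning ticket is
$$p = \frac{\text{No. of winning tickets}}{\text{Total no. of tickets}} = \frac{100}{2 \times 10^6} = 5 \times 10^{-5}.$$
Here $n = 100$, and the number of winning tickets $X$ in the $n$ purchased tickets has an approximate Poisson distribution with parameter $\lambda = np = 100 \times 5 \times 10^{-5} = 0.005$. Thus
$$P(X = k) = e^{-\lambda} \, \frac{\lambda^k}{k!},$$
and

(a) Probability of winning $= P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-\lambda} \approx 0.005$.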
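
(b) In this case we need $P(X \ge 1) \ge 0.95$.
$$P(X \ge 1) = 1 - e^{-\lambda} \ge 0.95 \quad \text{implies} \quad \lambda \ge \ln 20 \approx 3.$$
But $\lambda = np = n \times 5 \times 10^{-5} \ge 3$, or $n \ge 60{,}000$. Thus one needs to buy about 60,000 tickets to be 95% confident of having a winning ticket!

Example 3: A spacecraft has $10^5$ components. The probability of any one component being defective is $2 \times 10^{-5}$ ($p \to 0$). The mission will be in danger if five or more components become defective. Find the probability of such an event.

Solution: Here $n$ is large and $p$ is small, and hence the Poisson approximation is valid. Thus $np = \lambda = 100{,}000 \times 2 \times 10^{-5} = 2$, and the desired probability is given by
$$P(X \ge 5) = 1 - P(X \le 4) = 1 - \sum_{k=0}^{4} e^{-\lambda} \, \frac{\lambda^k}{k!} = 1 - e^{-2} \left(1 + 2 + 2 + \frac{4}{3} + \frac{2}{3}\right) \approx 0.052.$$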
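
Both examples reduce to a few lines of arithmetic with the Poisson p.m.f. The sketch below is illustrative; the function names are introduced here and are not part of the notes.

```python
import math


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def lottery_win_probability(n_tickets, p_ticket):
    """P(X >= 1) = 1 - exp(-lam) with lam = n * p (Poisson approximation)."""
    return 1.0 - math.exp(-n_tickets * p_ticket)


def tickets_for_confidence(confidence, p_ticket):
    """Smallest n with 1 - exp(-n * p) >= confidence."""
    return math.ceil(-math.log(1.0 - confidence) / p_ticket)


def poisson_tail(k_min, lam):
    """P(X >= k_min) = 1 - sum of the p.m.f. over k = 0 .. k_min - 1."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(k_min))


p = 100 / 2_000_000                        # chance that one ticket wins
print(lottery_win_probability(100, p))     # part (a)
print(tickets_for_confidence(0.95, p))     # part (b)
print(poisson_tail(5, 100_000 * 2e-5))     # spacecraft example, lam = 2
```

Part (b) evaluates to slightly under 60,000 because the code uses $\ln 20 \approx 2.996$ rather than the rounded value 3 used in the hand calculation.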