CH5. Discrete Probability Distributions

Random Variables

A random variable is a function or rule that assigns a numerical value to each outcome in the sample space of a random experiment.

Nomenclature:
- Capital letters: random variables (e.g., X, Y).
- Lower case letters: values of the random variable (e.g., x, y).

A discrete random variable has a countable number of distinct values.

Probability Distributions

A discrete probability distribution assigns a probability to each value of a discrete random variable X. To be a valid probability, each probability must be between 0 and 1:

$$0 \le P(x_i) \le 1$$

The sum of all the probabilities for the values of X must be equal to unity:

$$\sum_{i=1}^{n} P(x_i) = 1 \quad \text{or} \quad \sum_{\text{all } x} P(x) = 1$$

Ex) Flip a Coin Three Times

If X is the number of heads, then X is a random variable whose probability distribution is as follows:

Possible Events     x    P(x)
TTT                 0    1/8
HTT, THT, TTH       1    3/8
HHT, HTH, THH       2    3/8
HHH                 3    1/8
Total                    1

What is a PDF (PMF) & CDF?

A probability distribution (mass) function (PDF/PMF) is a mathematical function that shows the probability of each x-value:

$$P(x) = P(X = x)$$

A cumulative distribution function (CDF) is a mathematical function that shows the cumulative sum of probabilities, adding from the smallest to the largest x-value, gradually approaching unity.
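The coin example can be reproduced by brute-force enumeration. The following is a minimal sketch, not part of the original notes: the helper names `heads_pmf` and `cdf_from_pmf` are hypothetical, and exact fractions are used so the results can be compared directly with the table.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_flips=3):
    """PMF of X = number of heads in n_flips fair coin flips, by enumeration."""
    outcomes = list(product("HT", repeat=n_flips))  # all equally likely sequences
    counts = Counter(seq.count("H") for seq in outcomes)
    return {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}


def cdf_from_pmf(pmf):
    """CDF F(x) = P(X <= x), built as a running sum from the smallest x-value."""
    running = Fraction(0)
    cdf = {}
    for x in sorted(pmf):
        running += pmf[x]
        cdf[x] = running
    return cdf


pmf = heads_pmf(3)
cdf = cdf_from_pmf(pmf)

for x in pmf:
    print(x, pmf[x], cdf[x])
```

The last CDF value equals 1, which is the "gradually approaching unity" behavior described above.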

The cumulative distribution function, $F(x_0)$, of a random variable X expresses the probability that X does not exceed the value $x_0$, as a function of $x_0$:

$$F(x_0) = P(X \le x_0)$$

Let X be a discrete random variable with probability function P(x) and cumulative probability function $F(x_0)$. Then it can be shown that

$$F(x_0) = \sum_{x \le x_0} P(x)$$

where the notation implies that summation is over all possible values of x that are less than or equal to $x_0$.

Let X be a discrete random variable with a cumulative probability function, $F(x_0)$. Then we can show that
1) $0 \le F(x_0) \le 1$ for every number $x_0$
2) If $x_0$ and $x_1$ are two numbers with $x_0 < x_1$, then $F(x_0) \le F(x_1)$

Ex) Roll of a fair single die (X: the number of dots on the roll of a die)

x       1     2     3     4     5     6
P(x)   1/6   1/6   1/6   1/6   1/6   1/6

<Probability mass function (pmf) plot>

Ex) Let X be a random variable defined on the sample space resulting from flipping a fair coin.

outcome    x    P(x)    F(x)
Tail       0    0.5     0.5
Head       1    0.5     1

Expected Value

The expected value E(X) of a discrete random variable is the sum of all x-values weighted by their respective probabilities.

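If there are n distinct values of X,

$$E(X) = \mu = \sum_{i=1}^{n} x_i P(x_i) = \sum_{\text{all } x} x P(x)$$

E(X) is a measure of central tendency.

Let X be a discrete random variable with probability function P(x) and let g(X) be some function of X. Then the expected value, E[g(X)], of that function is defined as

$$E[g(X)] = \sum_{x} g(x) P(x)$$

Variance

If there are n distinct values of X, then the variance of a discrete random variable is:

$$V(X) = \sigma^2 = \sum_{i=1}^{n} [x_i - \mu]^2 P(x_i)$$

The standard deviation is the square root of the variance and is denoted σ:

$$\sigma = \sqrt{\sigma^2} = \sqrt{V(X)}$$

The variance of a discrete random variable can be expressed as

$$\sigma^2 = E(X^2) - \mu^2 = \sum_{x} x^2 P(x) - \mu^2$$

Proof

$$E[(X-\mu)^2] = \sum_{\text{all } x} (x-\mu)^2 P(x) = \sum_{\text{all } x} (x^2 - 2\mu x + \mu^2) P(x) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - [E(X)]^2$$

1) For a constant a: $E(a) = a$ and $Var(a) = 0$. If a random variable always takes the value a, it will have a mean a and a variance 0.
2) For a constant b and a random variable X: $E(bX) = b\mu_X$ and $Var(bX) = b^2\sigma_X^2$.

Let X be a random variable with mean $\mu_X$ and variance $\sigma_X^2$, and let a and b be constants. Define the random variable Y = a + bX. Then the mean and variance of Y are

$$\mu_Y = E(a + bX) = a + b\mu_X$$

$$\sigma_Y^2 = Var(a + bX) = b^2\sigma_X^2$$

so that the standard deviation of Y is

$$\sigma_Y = |b|\,\sigma_X$$

Expected Value: Function of Jointly Distributed Random Variables

Let X and Y be a pair of discrete random variables with joint probability function P(x, y). The expectation of a function g(X, Y) of these random variables is defined as:

$$E[g(X, Y)] = \sum_{x}\sum_{y} g(x, y) P(x, y)$$

Covariance

Let X be a random variable with mean $\mu_X$, and let Y be a random variable with mean $\mu_Y$. The expected value of $(X - \mu_X)(Y - \mu_Y)$ is called the covariance between X and Y, denoted Cov(X, Y). For discrete random variables

$$Cov(X, Y) = E[(X-\mu_X)(Y-\mu_Y)] = \sum_{x}\sum_{y} (x-\mu_X)(y-\mu_Y) P(x, y)$$

An equivalent expression is

$$Cov(X, Y) = E(XY) - \mu_X\mu_Y = \sum_{x}\sum_{y} xy\,P(x, y) - \mu_X\mu_Y$$

Proof

$$Cov(X, Y) = \sum_{x}\sum_{y} (x-\mu_X)(y-\mu_Y) P(x, y) = \sum_{x}\sum_{y} (xy - \mu_X y - \mu_Y x + \mu_X\mu_Y) P(x, y)$$

$$= \sum_{x}\sum_{y} xy\,P(x, y) - \mu_X \sum_{x}\sum_{y} y\,P(x, y) - \mu_Y \sum_{x}\sum_{y} x\,P(x, y) + \mu_X\mu_Y \sum_{x}\sum_{y} P(x, y)$$

$$= \sum_{x}\sum_{y} xy\,P(x, y) - \mu_X\mu_Y = E(XY) - \mu_X\mu_Y$$

using

$$\sum_{x}\sum_{y} x\,P(x, y) = \sum_{x} x\,P(x) = \mu_X, \qquad \sum_{x}\sum_{y} y\,P(x, y) = \sum_{y} y\,P(y) = \mu_Y, \qquad \sum_{x}\sum_{y} P(x, y) = 1$$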
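The identities above can be checked numerically on the fair die. This is a minimal sketch under stated assumptions: the helpers `expect`, `variance`, and `covariance` are hypothetical names, and the constants a = 3, b = -2 are chosen only for illustration. Exact fractions avoid rounding noise.

```python
from fractions import Fraction


def expect(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(x) over all x."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    """V(X) from the definition: sum of (x - mu)^2 * P(x)."""
    mu = expect(pmf)
    return expect(pmf, lambda x: (x - mu) ** 2)


def covariance(joint):
    """Cov(X, Y) = E(XY) - mu_X * mu_Y for a joint PMF {(x, y): probability}."""
    mu_x = sum(x * p for (x, _), p in joint.items())
    mu_y = sum(y * p for (_, y), p in joint.items())
    e_xy = sum(x * y * p for (x, y), p in joint.items())
    return e_xy - mu_x * mu_y


die = {x: Fraction(1, 6) for x in range(1, 7)}

mu = expect(die)
var_definition = variance(die)
var_shortcut = expect(die, lambda x: x * x) - mu ** 2  # E(X^2) - mu^2

# Illustrative constants (an assumption, not from the notes): Y = a + bX
a, b = 3, -2
y_pmf = {a + b * x: p for x, p in die.items()}
joint_xy = {(x, a + b * x): p for x, p in die.items()}

print(mu, var_definition, var_shortcut)
print(expect(y_pmf), variance(y_pmf), covariance(joint_xy))
```

Both variance routes agree, the mean and variance of Y match a + bμ and b²σ², and because Y is a linear function of X the covariance reduces to b·V(X).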

Correlation

Let X and Y be jointly distributed random variables. The correlation between X and Y is:

$$\rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} \qquad (-1 \le \rho \le 1)$$

- $\rho = 1$: X and Y have a perfect positive linear relationship.
- $\rho = -1$: X and Y have a perfect negative linear relationship.
- $\rho = 0$: X and Y have no linear relationship.

<Three plots, each labeled "correlation = ?">

It is possible for there to be a strong relationship between two variables and still have 0 correlation.
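The three cases can be illustrated with a small joint PMF. This is a sketch only: the function `correlation` and the three toy distributions are assumptions made for illustration, with X uniform on {-1, 0, 1}.

```python
import math


def correlation(joint):
    """Corr(X, Y) for a joint PMF given as {(x, y): probability}."""
    mu_x = sum(x * p for (x, _), p in joint.items())
    mu_y = sum(y * p for (_, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    sd_x = math.sqrt(sum((x - mu_x) ** 2 * p for (x, _), p in joint.items()))
    sd_y = math.sqrt(sum((y - mu_y) ** 2 * p for (_, y), p in joint.items()))
    return cov / (sd_x * sd_y)


xs = [-1, 0, 1]
p = 1 / 3

perfect_positive = {(x, 2 * x + 1): p for x in xs}  # Y = 2X + 1
perfect_negative = {(x, -x): p for x in xs}         # Y = -X
quadratic = {(x, x * x): p for x in xs}             # Y = X^2, fully determined by X

print(correlation(perfect_positive))
print(correlation(perfect_negative))
print(correlation(quadratic))
```

In the quadratic case Y is an exact function of X, yet the correlation is zero because the relationship is not linear.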

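Question) Corr(aX + b, cY + d) = ?

$$Cov(aX+b,\, cY+d) = \sum_{x}\sum_{y} \big((ax+b) - (a\mu_X+b)\big)\big((cy+d) - (c\mu_Y+d)\big) P(x, y) = ac \sum_{x}\sum_{y} (x-\mu_X)(y-\mu_Y) P(x, y) = ac\,Cov(X, Y)$$

$$V(aX+b) = a^2 V(X), \qquad V(cY+d) = c^2 V(Y)$$

$$Corr(aX+b,\, cY+d) = \frac{ac\,Cov(X, Y)}{\sqrt{a^2 V(X)\, c^2 V(Y)}} = \frac{ac}{|ac|}\,Corr(X, Y)$$

For nonzero a and c, the correlation keeps its magnitude and changes sign only when a and c have opposite signs.

Covariance & Independence

If two random variables are statistically independent, the covariance between them is 0 (uncorrelated). However, the converse is not necessarily true.

Independence: $P(x, y) = P(x)P(y)$
Uncorrelated: $E(XY) = E(X)E(Y)$

Independence ⇒ Uncorrelated:

$$E(XY) = \sum_{x}\sum_{y} xy\,P(x, y) = \sum_{x}\sum_{y} xy\,P(x)P(y) = \sum_{x} x\,P(x) \sum_{y} y\,P(y) = E(X)E(Y)$$

Uncorrelated, but dependent case

Let X take the values -2, -1, 1, 2 with probability 1/4 each, and let $Y = X^2$, so that Y is either 1 or 4.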

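$P(Y = 1 \text{ and } X = -1) = 1/4$, $P(Y = 1) = 1/2$, $P(X = -1) = 1/4$, so $P(Y = 1 \text{ and } X = -1) \ne P(Y = 1)P(X = -1) = 1/8$ and X and Y are dependent. Yet $E(X) = 0$, $E(XY) = 0$, and $E(X)E(Y) = 0$, so X and Y are uncorrelated.

Portfolio Analysis

The random variable X is the price for stock A and the random variable Y is the price for stock B. The market value, W, for the portfolio is given by the linear function

$$W = aX + bY$$

where a is the number of shares of stock A and b is the number of shares of stock B.

The mean value for W is

$$\mu_W = E[W] = E[aX + bY] = a\mu_X + b\mu_Y$$

Proof

$$E(aX + bY) = \sum_{x}\sum_{y} ax\,P(x, y) + \sum_{x}\sum_{y} by\,P(x, y) = a\sum_{x} x \sum_{y} P(x, y) + b\sum_{y} y \sum_{x} P(x, y) = a\sum_{x} x\,P(x) + b\sum_{y} y\,P(y) = a\mu_X + b\mu_Y$$

The variance for W is

$$\sigma_W^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,Cov(X, Y)$$

or, using the correlation,

$$\sigma_W^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,Corr(X, Y)\,\sigma_X\sigma_Y$$

Proof

$$\sigma_W^2 = V(aX + bY) = E\big[\big((aX + bY) - (a\mu_X + b\mu_Y)\big)^2\big]$$

$$= E\big[(aX + bY)^2 - 2(aX + bY)(a\mu_X + b\mu_Y) + (a\mu_X + b\mu_Y)^2\big]$$

$$= a^2\{E(X^2) - \mu_X^2\} + b^2\{E(Y^2) - \mu_Y^2\} + 2ab\{E(XY) - \mu_X\mu_Y\}$$

$$= a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,Cov(X, Y)$$

Ex) Consider the prices of two stocks A and B (X and Y are the prices of stocks A and B, respectively). The joint probabilities P(x, y) are:

            Y = $1    Y = $2    Y = $3
X = $1       .10       .05       .02
X = $2       .10       .35       .05
X = $3       .03       .10       .20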
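The mean and variance formulas for W can be checked against a direct computation over the joint probabilities of the two stock prices. This is a minimal sketch with two loudly stated assumptions: the joint table below is a reconstruction of the example's values (the extracted source is partly illegible), and the holdings a = 5, b = 4 are illustrative share counts.

```python
from fractions import Fraction as F

# ASSUMPTION: joint PMF {(x, y): P(x, y)} as reconstructed from the example table
joint = {
    (1, 1): F("0.10"), (1, 2): F("0.05"), (1, 3): F("0.02"),
    (2, 1): F("0.10"), (2, 2): F("0.35"), (2, 3): F("0.05"),
    (3, 1): F("0.03"), (3, 2): F("0.10"), (3, 3): F("0.20"),
}

a, b = 5, 4  # illustrative numbers of shares of A and B


def E(g):
    """E[g(X, Y)] = sum of g(x, y) * P(x, y) over the joint table."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


mu_x = E(lambda x, y: x)
mu_y = E(lambda x, y: y)
var_x = E(lambda x, y: x * x) - mu_x ** 2
var_y = E(lambda x, y: y * y) - mu_y ** 2
cov_xy = E(lambda x, y: x * y) - mu_x * mu_y

# Formula route
mean_formula = a * mu_x + b * mu_y
var_formula = a * a * var_x + b * b * var_y + 2 * a * b * cov_xy

# Direct route: treat W = aX + bY as a function of (X, Y)
mean_direct = E(lambda x, y: a * x + b * y)
var_direct = E(lambda x, y: (a * x + b * y - mean_direct) ** 2)

print(float(mean_formula), float(var_formula))
print(float(mean_direct), float(var_direct))
```

Both routes give the same mean and variance, which is what the two proofs above establish in general.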

(1) If you buy 5 shares of A and 4 shares of B, what is the expected value of this portfolio?

$$E(5X + 4Y) = \sum_{x=1}^{3}\sum_{y=1}^{3} (5x + 4y) P(x, y) = (5 \cdot 1 + 4 \cdot 1)(.10) + (5 \cdot 1 + 4 \cdot 2)(.05) + \cdots + (5 \cdot 3 + 4 \cdot 3)(.20) = 18.96$$

(2) Calculate the covariance between the two stocks.

$$Cov(X, Y) = E(XY) - \mu_X\mu_Y$$

$$\mu_X = \sum_{x=1}^{3} x\,P(x) = 1(.17) + 2(.50) + 3(.33) = 2.16$$

$$\mu_Y = \sum_{y=1}^{3} y\,P(y) = 1(.23) + 2(.50) + 3(.27) = 2.04$$

$$E(XY) = \sum_{x=1}^{3}\sum_{y=1}^{3} xy\,P(x, y) = (1)(1)(.10) + (1)(2)(.05) + \cdots + 3(3)(.20) = 4.65$$

so $Cov(X, Y) = 4.65 - (2.16)(2.04) = 0.2436$.

(3) Calculate the correlation between the two stocks.

$$\rho = \frac{Cov(X, Y)}{\sigma_X\sigma_Y}, \qquad \sigma_X^2 = E(X^2) - \mu_X^2, \qquad \sigma_Y^2 = E(Y^2) - \mu_Y^2$$

$$E(X^2) = \sum_{x=1}^{3} x^2 P(x) = 1^2(.17) + 2^2(.50) + 3^2(.33) = 5.14$$

$$E(Y^2) = \sum_{y=1}^{3} y^2 P(y) = 1^2(.23) + 2^2(.50) + 3^2(.27) = 4.66$$

so $\sigma_X^2 = 0.4744$, $\sigma_Y^2 = 0.4984$, and $\rho \approx 0.50$.

Uniform Distribution

The uniform distribution describes a random variable with a finite number of integer values from a to b (the only two parameters).

$$P(x) = \frac{1}{b - a + 1}, \qquad x = a, a+1, \ldots, b$$

Each value of the random variable is equally likely to occur.

Ex) The number of dots on the roll of a die forms a uniform random variable with six equally likely integer values: 1, 2, 3, 4, 5, 6

Questions: Range of X? PMF? Parameters? E(X)? V(X)?

Bernoulli Distribution

A random experiment with only 2 outcomes is a Bernoulli experiment. One outcome is arbitrarily labeled a success (X = 1) and the other a failure (X = 0).

$$P(\text{success}) = \pi, \qquad P(\text{fail}) = 1 - \pi$$

$$P(x) = \pi^x (1-\pi)^{1-x}, \qquad x = 0, 1$$

Note that $P(0) + P(1) = 1$ and $0 \le \pi \le 1$.

For example, toss a die, then give 1 if the number of dots on the roll of the die is odd. Otherwise give 0.

The expected value (mean) of a Bernoulli experiment is calculated as:

$$E(X) = \sum_{i=1}^{2} x_i P(x_i) = (0)(1-\pi) + (1)(\pi) = \pi$$

The variance of a Bernoulli experiment is calculated as:

$$V(X) = \sum_{i=1}^{2} [x_i - E(X)]^2 P(x_i) = (0-\pi)^2(1-\pi) + (1-\pi)^2(\pi) = \pi(1-\pi)$$

Binomial Distribution

The binomial distribution arises when a Bernoulli experiment is repeated n times. Each Bernoulli trial is independent, so the probability of success π remains constant on each trial. In a binomial experiment, we are interested in X = number of successes in n (a fixed number of) trials. So, $X = X_1 + X_2 + \cdots + X_n$.

$$P(x) = \frac{n!}{x!(n-x)!}\,\pi^x (1-\pi)^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$

Ex 1) Build the distribution of the number of heads in flipping a coin three times.

x       0     1     2     3
P(x)   1/8   3/8   3/8   1/8

Ex 2) What is the probability that exactly 2 of the next n = 10 cars serviced are late (P(X = 2))?

P(car is late) = .1, P(car not late) = .9

$$P(X = 2) = \frac{10!}{2!(10-2)!}(.1)^2(.9)^{8} = .1937$$

Ex 3) On average, 20% of the emergency room patients at Greenwood General Hospital lack health insurance. In a random sample of 4 patients, answer the following questions.

X = number of uninsured patients ("success")
P(uninsured) = .2, P(insured) = .8, n = 4 patients
The range is X = 0, 1, 2, 3, 4 patients.

(1) The probability that the sample of 4 patients will contain at least 2 uninsured patients is $P(X \ge 2) = P(2) + P(3) + P(4)$
(2) The probability that the sample of 4 patients will contain at most 1 uninsured patient is $P(X \le 1) = P(0) + P(1)$
(3) What is the probability that fewer than 2 patients are uninsured? $P(X < 2) = P(0) + P(1)$
(4) What is the probability that no more than 2 patients are uninsured? $P(X \le 2) = P(0) + P(1) + P(2)$

Sequences of x Successes in n Trials

The number of sequences with x successes in n independent trials is:

$$C_x^n = \frac{n!}{x!(n-x)!}$$

where $n! = n \cdot (n-1) \cdot (n-2) \cdots 1$ and $0! = 1$.

Sum of Random Sample

Random variables $X_1, X_2, \ldots, X_n$ are called a Random Sample if $X_1, X_2, \ldots, X_n$ are mutually independent random variables and the PMF (PDF) of each $X_i$ is the same distribution with the same parameter values (identical).

Let $X_1, X_2, \ldots, X_n$ be a random sample from a population. Then

$$E\Big(\sum_{i=1}^{n} X_i\Big) = \sum_{i=1}^{n} E(X_i) \quad \text{and} \quad V\Big(\sum_{i=1}^{n} X_i\Big) = \sum_{i=1}^{n} V(X_i)$$

Mean and variance

The mean of a binomial distribution is found by adding the means for each of the n independent Bernoulli events:

$$\pi + \pi + \cdots + \pi = n\pi$$

The variance of a binomial distribution is found by adding the variances for each of the n independent Bernoulli events:

$$\pi(1-\pi) + \pi(1-\pi) + \cdots + \pi(1-\pi) = n\pi(1-\pi)$$

The standard deviation is

$$\sqrt{n\pi(1-\pi)}$$

Binomial shape

A binomial distribution is skewed right if π < .5, skewed left if π > .5, and symmetric if π = .5. Skewness decreases as n increases, regardless of the value of π.

Hypergeometric Distribution

The hypergeometric distribution is similar to the binomial distribution. However, unlike the binomial, sampling is without replacement from a finite population of N items.

Parameters:
N = number of items in the population
n = sample size
s = number of successes in the population

Range: $x = \max(0,\, n - N + s), \ldots, \min(s, n)$

The hypergeometric PMF uses the formula for combinations:

$$P(x) = \frac{\binom{s}{x}\binom{N-s}{n-x}}{\binom{N}{n}}$$

Ex 1) Suppose that an urn contains s red chips and w white chips (s + w = N). If n chips are drawn out at random, without replacement, and X denotes the total number of red chips selected, then X is said to have a Hypergeometric distribution: X ~ Hypergeometric(n, s, N)
n = sample size
s = number of successes in the population
N = population size

Ex 2) A committee of size 5 is to be selected at random from 3 probabilists and 5 statisticians. Define the random variable X to be the number of probabilists on the committee. What is the probability distribution for X?

X ~ Hypergeometric(n, s, N), where n = 5, s = 3, N = 8.

x        0       1       2       3
p(x)    1/56   15/56   30/56   10/56

Ex 3) Damaged iPods. In a shipment of 10 iPods, 2 were damaged and 8 are good. The receiving department at Best Buy tests a sample of 3 iPods at random to see if they are defective. Let the random variable X be the number of damaged iPods in the sample. Now describe the problem:

N = 10 (number of iPods in the shipment)
n = 3 (sample size drawn from the shipment)
s = 2 (number of damaged iPods in the shipment ("successes" in population))
N - s = 8 (number of non-damaged iPods in the shipment)
x = number of damaged iPods in the sample ("successes" in sample)
n - x = number of non-damaged iPods in the sample

This is not a binomial problem because π is not constant.

What is the probability of getting a damaged iPod on the first draw from the sample? $\pi_1 = 2/10$

Now, what is the probability of getting a damaged iPod on the second draw? $\pi_2 = 1/9$ (if the first iPod was damaged) or $\pi_2 = 2/9$ (if the first iPod was undamaged)

What about on the third draw? $\pi_3 = 0/8$ or $1/8$ or $2/8$, depending on what happened in the first two draws.

Poisson Distribution

The Poisson distribution describes the number of occurrences within a randomly chosen unit of time or space. Called the model of arrivals, most Poisson applications model arrivals per unit of time. The events occur randomly and independently over a continuum of time or space.

Let X = the number of events per unit of time.

$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \qquad x = 0, 1, 2, \ldots$$

where λ is the mean number of events per unit of time.

Ex) Suppose the number of radioactive particles passing through a counter in a given time follows a Poisson distribution. The average number of radioactive particles passing through a counter during 1 millisecond in a laboratory experiment is four.
(1) What is the probability that the number of particles that will enter the counter is at most 5 in a given millisecond?
(2) What is the probability that the number of particles that will enter the counter is at most 5 and greater than __ in a given millisecond?
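The binomial, hypergeometric, and Poisson examples in this chapter can be checked with a few lines of standard-library Python. This is a sketch rather than a reference implementation: the function names are hypothetical, `math.comb` requires Python 3.8 or later, and the inputs are the values stated in the examples (late cars, the committee, and λ = 4 particles per millisecond).

```python
import math


def binomial_pmf(x, n, pi):
    """P(X = x) for n independent trials with success probability pi."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)


def hypergeometric_pmf(x, n, s, N):
    """P(X = x) when n items are drawn without replacement from N, s of them successes."""
    return math.comb(s, x) * math.comb(N - s, n - x) / math.comb(N, n)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


# Binomial: exactly 2 of the next 10 cars are late, P(late) = .1
p_two_late = binomial_pmf(2, 10, 0.1)

# Binomial: at least 2 of 4 patients uninsured, P(uninsured) = .2
p_at_least_two = sum(binomial_pmf(x, 4, 0.2) for x in range(2, 5))

# Hypergeometric: probabilists on a committee of 5 drawn from 3 + 5 people
committee = [hypergeometric_pmf(x, 5, 3, 8) for x in range(0, 4)]

# Poisson: at most 5 particles in a millisecond when the mean is 4
p_at_most_five = sum(poisson_pmf(x, 4) for x in range(0, 6))

print(round(p_two_late, 4), round(p_at_least_two, 4))
print([round(p, 4) for p in committee])
print(round(p_at_most_five, 4))
```

Cumulative questions such as "at most 5" are answered by summing the PMF over the relevant x-values, exactly as in the P(0) + P(1) + ... sums written out for the emergency room example.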