Discrete Random Variables Class 4, Jeremy Orloff and Jonathan Bloom

1 Learning Goals

1. Know the definition of a discrete random variable.
2. Know the Bernoulli, binomial, and geometric distributions and examples of what they model.
3. Be able to describe the probability mass function and cumulative distribution function using tables and formulas.
4. Be able to construct new random variables from old ones.
5. Know how to compute expected value (mean).

2 Random Variables

This topic is largely about introducing some useful terminology, building on the notions of sample space and probability function. The key words are

1. Random variable
2. Probability mass function (pmf)
3. Cumulative distribution function (cdf)

2.1 Recap

A discrete sample space \(\Omega\) is a finite or listable set of outcomes \(\{\omega_1, \omega_2, \ldots\}\). The probability of an outcome \(\omega\) is denoted \(P(\omega)\).

An event E is a subset of \(\Omega\). The probability of an event E is
\[ P(E) = \sum_{\omega \in E} P(\omega). \]

2.2 Random variables as payoff functions

Example 1. A game with 2 dice.
Roll a die twice and record the outcomes as (i, j), where i is the result of the first roll and j the result of the second. We can take the sample space to be
\[ \Omega = \{(1,1), (1,2), (1,3), \ldots, (6,6)\} = \{(i,j) \mid i, j = 1, \ldots, 6\}. \]
The probability function is \(P(i,j) = 1/36\).

18.05 class 4, Discrete Random Variables, Spring 2017

In this game, you win $500 if the sum is 7 and lose $100 otherwise. We give this payoff function the name X and describe it formally by
\[ X(i,j) = \begin{cases} 500 & \text{if } i + j = 7 \\ -100 & \text{if } i + j \neq 7. \end{cases} \]

Example 2. We can change the game by using a different payoff function. For example
\[ Y(i,j) = ij - 10. \]
In this example if you roll (6, 2) then you win $2. If you roll (2, 3) then you win -$4 (i.e., lose $4).

Question: Which game is the better bet?
answer: We will come back to this once we learn about expectation.

These payoff functions are examples of random variables. A random variable assigns a number to each outcome in a sample space. More formally:

Definition: Let \(\Omega\) be a sample space. A discrete random variable is a function
\[ X : \Omega \to \mathbf{R} \]
that takes a discrete set of values. (Recall that \(\mathbf{R}\) stands for the real numbers.)

Why is X called a random variable? It's 'random' because its value depends on a random outcome of an experiment. And we treat X like we would a usual variable: we can add it to other random variables, square it, and so on.

2.3 Events and random variables

For any value a we write X = a to mean the event consisting of all outcomes \(\omega\) with \(X(\omega) = a\).

Example 3. In Example 1 we rolled two dice and X was the random variable
\[ X(i,j) = \begin{cases} 500 & \text{if } i + j = 7 \\ -100 & \text{if } i + j \neq 7. \end{cases} \]
The event X = 500 is the set {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}, i.e. the set of all outcomes that sum to 7. So \(P(X = 500) = 1/6\).

We allow a to be any value, even values that X never takes. In Example 1, we could look at the event X = 1000. Since X never equals 1000 this is just the empty event (or empty set)
\[ \text{'}X = 1000\text{'} = \{\} = \emptyset \quad\Rightarrow\quad P(X = 1000) = 0. \]
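The payoff function of Example 1 and the events of Example 3 can be checked by listing the sample space directly. The following is a minimal R sketch, not part of the original notes; the names `omega`, `prob` and `p_X` are our own illustrative choices.

```r
# Sample space for two dice: all 36 pairs (i, j), each with probability 1/36.
omega <- expand.grid(i = 1:6, j = 1:6)
prob  <- rep(1/36, nrow(omega))

# Payoff X from Example 1: win $500 if the sum is 7, lose $100 otherwise.
X <- ifelse(omega$i + omega$j == 7, 500, -100)

# P(X = a) is the total probability of the outcomes in the event 'X = a'.
p_X <- function(a) sum(prob[X == a])

p_X(500)    # the six outcomes that sum to 7
p_X(1000)   # X never equals 1000, so the event is empty
```

Because `p_X` sums over whichever outcomes satisfy `X == a`, it accepts any value of a, including values that X never takes.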

2.4 Probability mass function and cumulative distribution function

It gets tiring and hard to read and write \(P(X = a)\) for the probability that X = a. When we know we're talking about X we will simply write \(p(a)\). If we want to make X explicit we will write \(p_X(a)\). We spell this out in a definition.

Definition: The probability mass function (pmf) of a discrete random variable is the function \(p(a) = P(X = a)\).

Note:
1. We always have \(0 \le p(a) \le 1\).
2. We allow a to be any number. If a is a value that X never takes, then \(p(a) = 0\).

Example 4. Let \(\Omega\) be our earlier sample space for rolling 2 dice. Define the random variable M to be the maximum value of the two dice:
\[ M(i,j) = \max(i,j). \]
For example, the roll (3,5) has maximum 5, i.e. \(M(3,5) = 5\).

We can describe a random variable by listing its possible values and the probabilities associated to these values. For the above example we have:

value a:     1      2      3      4      5      6
pmf p(a):    1/36   3/36   5/36   7/36   9/36   11/36

For example, \(p(2) = 3/36\).

Question: What is p(8)?
answer: p(8) = 0.

Think: What is the pmf for \(Z(i,j) = i + j\)? Does it look familiar?

2.5 Events and inequalities

Inequalities with random variables describe events. For example \(X \le a\) is the set of all outcomes \(\omega\) such that \(X(\omega) \le a\).

Example 5. If our sample space is the set of all pairs (i, j) coming from rolling two dice and \(Z(i,j) = i + j\) is the sum of the dice then
\[ Z \le 4 = \{(1,1), (1,2), (1,3), (2,1), (2,2), (3,1)\}. \]

2.6 The cumulative distribution function (cdf)

Definition: The cumulative distribution function (cdf) of a random variable X is the function F given by \(F(a) = P(X \le a)\). We will often shorten this to distribution function.

Note well that the definition of \(F(a)\) uses the symbol less than or equal. This will be important for getting your calculations exactly right.

Example. Continuing with the example M, we have

value a:     1      2      3      4       5       6
pmf p(a):    1/36   3/36   5/36   7/36    9/36    11/36
cdf F(a):    1/36   4/36   9/36   16/36   25/36   36/36
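The pmf and cdf rows in the table for M can be reproduced by counting outcomes. This is a small R sketch under the same setup as Example 4; the names `pmf_M`, `cdf_M` and `F_M` are ours, and `F_M` is written so that it accepts any real number a, matching the definition \(F(a) = P(M \le a)\).

```r
# All 36 equally likely outcomes (i, j) of two dice.
omega <- expand.grid(i = 1:6, j = 1:6)

# M(i, j) = max(i, j), computed elementwise over the sample space.
M <- pmax(omega$i, omega$j)

# pmf: count the outcomes with M = a and divide by 36, for a = 1, ..., 6.
pmf_M <- sapply(1:6, function(a) sum(M == a)) / 36

# cdf at the values 1, ..., 6: running total of the pmf.
cdf_M <- cumsum(pmf_M)

# cdf as a function of any real a: add p(b) over all values b <= a.
F_M <- function(a) sum(pmf_M[(1:6) <= a])

pmf_M * 36   # numerators of the pmf row
cdf_M * 36   # numerators of the cdf row
F_M(2.5)     # same as F(2), since M takes no value in (2, 2.5]
```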

\(F(a)\) is called the cumulative distribution function because \(F(a)\) gives the total probability that accumulates by adding up the probabilities \(p(b)\) as b runs from \(-\infty\) to a. For example, in the table above, the entry 16/36 in column 4 for the cdf is the sum of the values of the pmf from column 1 to column 4. In notation:

As events: \(\text{'}M \le 4\text{'} = \{1, 2, 3, 4\}\); \(F(4) = P(M \le 4) = 1/36 + 3/36 + 5/36 + 7/36 = 16/36\).

Just like the probability mass function, \(F(a)\) is defined for all values a. In the above example, \(F(8) = 1\), \(F(-2) = 0\), \(F(2.5) = 4/36\), and \(F(\pi) = 9/36\).

2.7 Graphs of p(a) and F(a)

We can visualize the pmf and cdf with graphs. For example, let X be the number of heads in 3 tosses of a fair coin:

value a:     0     1     2     3
pmf p(a):    1/8   3/8   3/8   1/8
cdf F(a):    1/8   4/8   7/8   1

The colored graphs show how the cumulative distribution function is built by accumulating probability as a increases. The black and white graphs are the more standard presentations.

[Figure: Probability mass function for X]

[Figure: Cumulative distribution function for X]

[Figure: pmf and cdf for the maximum of two dice (Example 4)]

Histograms: Later we will see another way to visualize the pmf using histograms. These require some care to do right, so we will wait until we need them.

2.8 Properties of the cdf F

The cdf F of a random variable satisfies several properties:

1. F is non-decreasing. That is, its graph never goes down, or symbolically if \(a \le b\) then \(F(a) \le F(b)\).
2. \(0 \le F(a) \le 1\).
3. \(\lim_{a \to \infty} F(a) = 1\), \(\lim_{a \to -\infty} F(a) = 0\).

In words, (1) says the cumulative probability \(F(a)\) increases or remains constant as a increases, but never decreases; (2) says the accumulated probability is always between 0 and 1; (3) says that as a gets very large, it becomes more and more certain that \(X \le a\) and as a gets very negative it becomes more and more certain that \(X > a\).

Think: Why does a cdf satisfy each of these properties?

3 Specific Distributions

3.1 Bernoulli Distributions

Model: The Bernoulli distribution models one trial in an experiment that can result in either success or failure. This is the most important distribution and is also the simplest. A random variable X has a Bernoulli distribution with parameter p if:

1. X takes the values 0 and 1.
2. \(P(X = 1) = p\) and \(P(X = 0) = 1 - p\).

We will write \(X \sim \text{Bernoulli}(p)\) or Ber(p), which is read "X follows a Bernoulli distribution with parameter p" or "X is drawn from a Bernoulli distribution with parameter p".

A simple model for the Bernoulli distribution is to flip a coin with probability p of heads, with X = 1 on heads and X = 0 on tails. The general terminology is to say X is 1 on success and 0 on failure, with success and failure defined by the context.

Many decisions can be modeled as a binary choice, such as votes for or against a proposal. If p is the proportion of the voting population that favors the proposal, then the vote of a random individual is modeled by a Bernoulli(p).

Here are the table and graphs of the pmf and cdf for the Bernoulli(1/2) distribution and below that for the general Bernoulli(p) distribution.

value a:     0     1
pmf p(a):    1/2   1/2
cdf F(a):    1/2   1

[Figure: Table, pmf and cdf for the Bernoulli(1/2) distribution]

values a:    0       1
pmf p(a):    1 - p   p
cdf F(a):    1 - p   1

[Figure: Table, pmf and cdf for the Bernoulli(p) distribution]

3.2 Binomial Distributions

The binomial distribution Binomial(n,p), or Bin(n,p), models the number of successes in n independent Bernoulli(p) trials.

There is a hierarchy here. A single Bernoulli trial is, say, one toss of a coin. A single binomial trial consists of n Bernoulli trials. For coin flips the sample space for a Bernoulli trial is {H, T}. The sample space for a binomial trial is all sequences of heads and tails of length n. Likewise a Bernoulli random variable takes values 0 and 1 and a binomial random variable takes values 0, 1, 2, ..., n.

Example 6. Binomial(1,p) is the same as Bernoulli(p).

Example 7. The number of heads in n flips of a coin with probability p of heads follows a Binomial(n, p) distribution.

We describe \(X \sim \text{Binomial}(n, p)\) by giving its values and probabilities. For notation we will use k to mean an arbitrary number between 0 and n.

We remind you that 'n choose k' \(= \binom{n}{k} = {}_nC_k\) is the number of ways to choose k things out of a collection of n things and it has the formula
\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!}. \qquad (1) \]
(It is also called a binomial coefficient.) Here is a table for the pmf of a Binomial(n, p) random variable. We will explain how the binomial coefficients enter the pmf for the binomial distribution after a simple example.

\[
\begin{array}{l|ccccccc}
\text{values } a: & 0 & 1 & 2 & \cdots & k & \cdots & n \\
\hline
\text{pmf } p(a): & (1-p)^n & \binom{n}{1} p (1-p)^{n-1} & \binom{n}{2} p^2 (1-p)^{n-2} & \cdots & \binom{n}{k} p^k (1-p)^{n-k} & \cdots & p^n
\end{array}
\]

Example 8. What is the probability of 3 or more heads in 5 tosses of a fair coin?
answer: The binomial coefficients associated with n = 5 are
\[ \binom{5}{0} = 1, \quad \binom{5}{1} = \frac{5!}{1!\,4!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{4 \cdot 3 \cdot 2 \cdot 1} = 5, \quad \binom{5}{2} = \frac{5!}{2!\,3!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1 \cdot 3 \cdot 2 \cdot 1} = \frac{5 \cdot 4}{2} = 10, \]
and similarly
\[ \binom{5}{3} = 10, \quad \binom{5}{4} = 5, \quad \binom{5}{5} = 1. \]
Using these values we get the following table for \(X \sim \text{Binomial}(5, p)\).

\[
\begin{array}{l|cccccc}
\text{values } a: & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
\text{pmf } p(a): & (1-p)^5 & 5p(1-p)^4 & 10p^2(1-p)^3 & 10p^3(1-p)^2 & 5p^4(1-p) & p^5
\end{array}
\]

We were told p = 1/2 so
\[ P(X \ge 3) = 10 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^2 + 5 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^1 + \left(\frac{1}{2}\right)^5 = \frac{16}{32} = \frac{1}{2}. \]

Think: Why is the value of 1/2 not surprising?

3.3 Explanation of the binomial probabilities

For concreteness, let n = 5 and k = 2 (the argument for arbitrary n and k is identical). So \(X \sim \text{binomial}(5, p)\) and we want to compute p(2). The long way to compute p(2) is to list all the ways to get exactly 2 heads in 5 coin flips and add up their probabilities. The list has 10 entries:

HHTTT, HTHTT, HTTHT, HTTTH, THHTT, THTHT, THTTH, TTHHT, TTHTH, TTTHH

Each entry has the same probability of occurring, namely
\[ p^2 (1-p)^3. \]
This is because each of the two heads has probability p and each of the 3 tails has probability 1 - p. Because the individual tosses are independent we can multiply probabilities. Therefore, the total probability of exactly 2 heads is the sum of 10 identical probabilities, i.e. \(p(2) = 10 p^2 (1-p)^3\), as shown in the table.

This guides us to the shorter way to do the computation. We have to count the number of sequences with exactly 2 heads. To do this we need to choose 2 of the tosses to be heads and the remaining 3 to be tails. The number of such sequences is the number of ways to choose 2 out of 5 things, that is \(\binom{5}{2}\). Since each such sequence has the same probability, \(p^2(1-p)^3\), we get the probability of exactly 2 heads \(p(2) = \binom{5}{2} p^2 (1-p)^3\).

Here are some binomial probability mass functions (here, frequency is the same as probability).

[Figure: binomial probability mass functions]

3.4 Geometric Distributions

A geometric distribution models the number of tails before the first head in a sequence of coin flips (Bernoulli trials).

Example 9.
(a) Flip a coin repeatedly. Let X be the number of tails before the first heads. So, X can equal 0, i.e. the first flip is heads, 1, 2, .... In principle it can take any nonnegative integer value.
(b) Give a flip of tails the value 0, and heads the value 1. In this case, X is the number of 0's before the first 1.

(c) Give a flip of tails the value 1, and heads the value 0. In this case, X is the number of 1's before the first 0.
(d) Call a flip of tails a success and heads a failure. So, X is the number of successes before the first failure.
(e) Call a flip of tails a failure and heads a success. So, X is the number of failures before the first success.

You can see this models many different scenarios of this type. The most neutral language is the number of tails before the first head.

Formal definition. The random variable X follows a geometric distribution with parameter p if
- X takes the values 0, 1, 2, 3, ...
- its pmf is given by \(p(k) = P(X = k) = (1-p)^k p\).

We denote this by \(X \sim \text{geometric}(p)\) or geo(p). In table form we have:

\[
\begin{array}{l|ccccccc}
\text{value } a: & 0 & 1 & 2 & 3 & \cdots & k & \cdots \\
\hline
\text{pmf } p(a): & p & (1-p)p & (1-p)^2 p & (1-p)^3 p & \cdots & (1-p)^k p & \cdots
\end{array}
\]
Table: \(X \sim \text{geometric}(p)\): X = the number of 0s before the first 1.

We will show how this table was computed in an example below.

The geometric distribution is an example of a discrete distribution that takes an infinite number of possible values. Things can get confusing when we work with successes and failures since we might want to model the number of successes before the first failure or we might want the number of failures before the first success. To keep things straight you can translate to the neutral language of the number of tails before the first heads.

[Figure: pmf and cdf for the geometric(1/3) distribution]

Example 10. Computing geometric probabilities. Suppose that the inhabitants of an island plan their families by having babies until the first girl is born. Assume the probability of having a girl with each pregnancy is 0.5 independent of other pregnancies, that all babies survive and there are no multiple births. What is the probability that a family has k boys?

answer: In neutral language we can think of boys as tails and girls as heads. Then the number of boys in a family is the number of tails before the first heads.

Let's practice using standard notation to present this. So, let X be the number of boys in a (randomly-chosen) family. So, X is a geometric random variable. We are asked to find \(p(k) = P(X = k)\). A family has k boys if the sequence of children in the family from oldest to youngest is

BBB...BG

with the first k children being boys. The probability of this sequence is just the product of the probability for each child, i.e. \((1/2)^k \cdot (1/2) = (1/2)^{k+1}\). (Note: The assumptions of equal probability and independence are simplifications of reality.)

Think: What is the ratio of boys to girls on the island?

More geometric confusion. Another common definition for the geometric distribution is the number of tosses until the first heads. In this case X can take the values 1, i.e. the first flip is heads, 2, 3, .... This is just our geometric random variable plus 1. The methods of computing with it are just like the ones we used above.

3.5 Uniform Distribution

The uniform distribution models any situation where all the outcomes are equally likely.
\[ X \sim \text{uniform}(N). \]
X takes values 1, 2, 3, ..., N, each with probability 1/N. We have already seen this distribution many times when modeling fair coins (N = 2), dice (N = 6), birthdays (N = 365), and poker hands (\(N = \binom{52}{5}\)).

3.6 Discrete Distributions Applet

The applet at http://mathlets.org/mathlets/probability-distributions/ gives a dynamic view of some discrete distributions. The graphs will change smoothly as you move the various sliders. Try playing with the different distributions and parameters.

This applet is carefully color-coded. Two things with the same color represent the same or closely related notions. By understanding the color-coding and other details of the applet, you will acquire a stronger intuition for the distributions shown.

3.7 Other Distributions

There are a million other named distributions arising in various contexts.
We don't expect you to memorize them (we certainly have not!), but you should be comfortable using a resource like Wikipedia to look up a pmf. For example, take a look at the info box at the top right of http://en.wikipedia.org/wiki/Hypergeometric_distribution. The info box lists many (surely unfamiliar) properties in addition to the pmf.
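The pmf formulas for the binomial and geometric distributions in Sections 3.2 and 3.4 are easy to check numerically. The following R sketch is an illustration rather than part of the original notes: the helper names `binom_pmf` and `geom_pmf` are our own, while `choose`, `dbinom` and `dgeom` are base R functions. R's `dgeom` counts failures before the first success, which matches the "tails before the first heads" convention used here.

```r
# Binomial(n, p) pmf written out from the formula in Section 3.2.
binom_pmf <- function(k, n, p) choose(n, k) * p^k * (1 - p)^(n - k)

# Example 8: probability of 3 or more heads in 5 tosses of a fair coin.
p_3_or_more <- sum(binom_pmf(3:5, n = 5, p = 1/2))
p_3_or_more

# Geometric(p) pmf from Section 3.4: k tails, then one heads.
geom_pmf <- function(k, p) (1 - p)^k * p

# Compare the hand-written formulas with R's built-in pmfs.
all.equal(binom_pmf(0:5, 5, 1/2), dbinom(0:5, size = 5, prob = 1/2))
all.equal(geom_pmf(0:10, 1/3), dgeom(0:10, prob = 1/3))
```

Writing the pmf out by hand once and then comparing with the built-in function is a cheap way to confirm which convention a library uses, which matters for the geometric distribution in particular.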

4 Arithmetic with Random Variables

We can do arithmetic with random variables. For example, we can add, subtract, multiply or square them.

There is a simple, but extremely important idea for counting. It says that if we have a sequence of numbers that are either 0 or 1 then the sum of the sequence is the number of 1s.

Example 11. Consider the sequence with five 1s

1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0.

It is easy to see that the sum of this sequence is 5, the number of 1s.

We illustrate this idea by counting the number of heads in n tosses of a coin.

Example 12. Toss a fair coin n times. Let \(X_j\) be 1 if the jth toss is heads and 0 if it's tails. So, \(X_j\) is a Bernoulli(1/2) random variable. Let X be the total number of heads in the n tosses. Assuming the tosses are independent we know \(X \sim \text{binomial}(n, 1/2)\). We can also write
\[ X = X_1 + X_2 + X_3 + \ldots + X_n. \]
Again, this is because the terms in the sum on the right are all either 0 or 1. So, the sum is exactly the number of \(X_j\) that are 1, i.e. the number of heads.

The important thing to see in the example above is that we've written the more complicated binomial random variable X as the sum of extremely simple random variables \(X_j\). This will allow us to manipulate X algebraically.

Think: Suppose X and Y are independent and \(X \sim \text{binomial}(n, 1/2)\) and \(Y \sim \text{binomial}(m, 1/2)\). What kind of distribution does X + Y follow? (Answer: binomial(n + m, 1/2). Why?)

Example 13. Suppose X and Y are independent random variables with the following tables.

Values of X   x:    1      2      3      4
pmf      p_X(x):    1/10   2/10   3/10   4/10

Values of Y   y:    1      2      3      4      5
pmf      p_Y(y):    1/15   2/15   3/15   4/15   5/15

Check that the total probability for each random variable is 1. Make a table for the random variable X + Y.

answer: The first thing to do is make a two-dimensional table for the product sample space consisting of pairs (x, y), where x is a possible value of X and y one of Y. To help do the computation, the probabilities for the X values are put in the far right column and those for Y are in the bottom row.
Because X and Y are independent the probability for an (x, y) pair is just the product of the individual probabilities.

                          Y values
              1        2        3        4        5      | p_X(x)
X values  1   1/150    2/150    3/150    4/150    5/150  | 1/10
          2   2/150    4/150    6/150    8/150    10/150 | 2/10
          3   3/150    6/150    9/150    12/150   15/150 | 3/10
          4   4/150    8/150    12/150   16/150   20/150 | 4/10
  p_Y(y)      1/15     2/15     3/15     4/15     5/15

The diagonal stripes show sets of squares where X + Y is the same (cells with the same value of x + y lie on a common diagonal). All we have to do to compute the probability table for X + Y is sum the probabilities for each stripe.

X + Y values:   2       3       4        5        6        7        8        9
pmf:            1/150   4/150   10/150   20/150   30/150   34/150   31/150   20/150

When the tables are too big to write down we'll need to use purely algebraic techniques to compute the probabilities of a sum. We will learn how to do this in due course.

5 Expected Value

In the R reading questions for this lecture, you simulated the average value of rolling a die many times. You should have gotten a value close to the exact answer of 3.5. To motivate the formal definition of the average, or expected value, we first consider some examples.

Example 14. Suppose we have a six-sided die marked with five 3's and one 6. (This was the red one from our non-transitive dice.) What would you expect the average of 6000 rolls to be?

answer: If we knew the value of each roll, we could compute the average by summing the 6000 values and dividing by 6000. Without knowing the values, we can compute the expected average as follows.

Since there are five 3's and one six we expect roughly 5/6 of the rolls will give 3 and 1/6 will give 6. Assuming this to be exactly true, we have the following table of values and counts:

value:             3      6
expected counts:   5000   1000

The average of these 6000 values is then
\[ \frac{5000 \cdot 3 + 1000 \cdot 6}{6000} = \frac{5}{6} \cdot 3 + \frac{1}{6} \cdot 6 = 3.5 \]
We consider this the expected average in the sense that we 'expect' each of the possible values to occur with the given frequencies.

Example 15. We roll two standard 6-sided dice. You win $1000 if the sum is 2 and lose $100 otherwise. How much do you expect to win on average per trial?

answer: The probability of a 2 is 1/36. If you play N times, you can 'expect' \(\frac{1}{36} \cdot N\) of the trials to give a 2 and \(\frac{35}{36} \cdot N\) of the trials to give something else. Thus your total expected winnings are
\[ 1000 \cdot \frac{N}{36} - 100 \cdot \frac{35N}{36}. \]
To get the expected average per trial we divide the total by N:
\[ \text{expected average} = 1000 \cdot \frac{1}{36} - 100 \cdot \frac{35}{36} = -69.44. \]

Think: Would you be willing to play this game one time? Multiple times?

Notice that in both examples the sum for the expected average consists of terms which are a value of the random variable times its probability. This leads to the following definition.

Definition: Suppose X is a discrete random variable that takes values \(x_1, x_2, \ldots, x_n\) with probabilities \(p(x_1), p(x_2), \ldots, p(x_n)\). The expected value of X is denoted E(X) and defined by
\[ E(X) = \sum_{j=1}^{n} p(x_j)\, x_j = p(x_1) x_1 + p(x_2) x_2 + \ldots + p(x_n) x_n. \]

Notes:
1. The expected value is also called the mean or average of X and often denoted by \(\mu\) ("mu").
2. As seen in the above examples, the expected value need not be a possible value of the random variable. Rather it is a weighted average of the possible values.
3. Expected value is a summary statistic, providing a measure of the location or central tendency of a random variable.
4. If all the values are equally probable then the expected value is just the usual average of the values.

Example 16. Find E(X) for the random variable X with table:

values of X:   1     3     5
pmf:           1/6   1/6   2/3

answer: \(E(X) = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 3 + \frac{2}{3} \cdot 5 = \frac{24}{6} = 4\)
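The definition of E(X) translates directly into a one-line computation. The sketch below is our own illustration in R (the function name `expected_value` is hypothetical, not from the notes); it recomputes the answers to Examples 14, 15 and 16 from their tables of values and probabilities.

```r
# E(X) = sum of p(x_j) * x_j over the values x_j of X.
# The check on sum(probs) guards against an incomplete pmf table.
expected_value <- function(values, probs) {
  stopifnot(isTRUE(all.equal(sum(probs), 1)))
  sum(values * probs)
}

# Example 14: die with five 3's and one 6.
expected_value(c(3, 6), c(5/6, 1/6))

# Example 15: win $1000 with probability 1/36, lose $100 otherwise.
expected_value(c(1000, -100), c(1/36, 35/36))

# Example 16: values 1, 3, 5 with probabilities 1/6, 1/6, 2/3.
expected_value(c(1, 3, 5), c(1/6, 1/6, 2/3))
```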

Example 17. Let X be a Bernoulli(p) random variable. Find E(X).
answer: X takes values 1 and 0 with probabilities p and 1 - p, so
\[ E(X) = p \cdot 1 + (1 - p) \cdot 0 = p. \]
Important: This is an important example. Be sure to remember that the expected value of a Bernoulli(p) random variable is p.

Think: What is the expected value of the sum of two dice?

5.1 Mean and center of mass

You may have wondered why we use the name 'probability mass function'. Here's the reason: if we place an object of mass \(p(x_j)\) at position \(x_j\) for each j, then E(X) is the position of the center of mass. Let's recall the latter notion via an example.

Example 18. Suppose we have two masses along the x-axis, mass \(m_1 = 500\) at position \(x_1 = 3\) and mass \(m_2 = 100\) at position \(x_2 = 6\). Where is the center of mass?

answer: Intuitively we know that the center of mass is closer to the larger mass.

[Figure: masses m_1 and m_2 at positions 3 and 6 on the x-axis]

From physics we know the center of mass is
\[ \bar{x} = \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2} = \frac{500 \cdot 3 + 100 \cdot 6}{600} = 3.5. \]
We call this formula a 'weighted' average of \(x_1\) and \(x_2\). Here \(x_1\) is weighted more heavily because it has more mass.

Now look at the definition of expected value E(X). It is a weighted average of the values of X with the weights being probabilities \(p(x_i)\) rather than masses! We might say that "The expected value is the point at which the distribution would balance". Note the similarity between the physics example and Example 14.

5.2 Algebraic properties of E(X)

When we add, scale or shift random variables the expected values do the same. The shorthand mathematical way of saying this is that E(X) is linear.

1. If X and Y are random variables on a sample space \(\Omega\) then
\[ E(X + Y) = E(X) + E(Y) \]
2. If a and b are constants then
\[ E(aX + b) = aE(X) + b. \]

We will think of aX + b as scaling X by a and shifting it by b.
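Both linearity properties can be checked numerically on a small sample space. The R sketch below is our own illustration: it uses the two-dice sample space, takes X and Y to be the values of the first and second die, and picks the constants a = 2 and b = 5 arbitrarily. The helper name `expect` is hypothetical.

```r
# Two dice: 36 equally likely outcomes (i, j).
omega <- expand.grid(i = 1:6, j = 1:6)
prob  <- rep(1/36, nrow(omega))

# Expected value of a random variable given by its value on each outcome.
expect <- function(values) sum(values * prob)

X <- omega$i   # value of the first die
Y <- omega$j   # value of the second die

# Property 1: E(X + Y) = E(X) + E(Y).
# X + Y adds the values of X and Y outcome by outcome.
lhs1 <- expect(X + Y)
rhs1 <- expect(X) + expect(Y)

# Property 2 with illustrative constants a = 2, b = 5 (chosen arbitrarily).
a <- 2
b <- 5
lhs2 <- expect(a * X + b)
rhs2 <- a * expect(X) + b

c(lhs1, rhs1, lhs2, rhs2)
```

A numerical check like this is not a proof, but it makes concrete what adding two random variables on the same sample space means.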

Before proving these properties, let's consider a few examples.

Example 19. Roll two dice and let X be the sum. Find E(X).
answer: Let \(X_1\) be the value on the first die and let \(X_2\) be the value on the second die. Since \(X = X_1 + X_2\) we have \(E(X) = E(X_1) + E(X_2)\). Earlier we computed that \(E(X_1) = E(X_2) = 3.5\), therefore E(X) = 7.

Example 20. Let \(X \sim \text{binomial}(n, p)\). Find E(X).
answer: Recall that X models the number of successes in n Bernoulli(p) random variables, which we'll call \(X_1, \ldots, X_n\). The key fact, which we highlighted in the previous reading for this class, is that
\[ X = \sum_{j=1}^{n} X_j. \]
Now we can use the Algebraic Property (1) to make the calculation simple.
\[ X = \sum_{j=1}^{n} X_j \quad\Rightarrow\quad E(X) = \sum_{j} E(X_j) = \sum_{j} p = np. \]
We could have computed E(X) directly as
\[ E(X) = \sum_{k=0}^{n} k\, p(k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}. \]
It is possible to show that the sum of this series is indeed np. We think you'll agree that the method using Property (1) is much easier.

Example 21. (For infinite random variables the mean does not always exist.) Suppose X has an infinite number of values according to the following table

\[
\begin{array}{l|cccccc}
\text{values } x: & 2 & 2^2 & 2^3 & \cdots & 2^k & \cdots \\
\hline
\text{pmf } p(x): & 1/2 & 1/2^2 & 1/2^3 & \cdots & 1/2^k & \cdots
\end{array}
\]
Try to compute the mean.

answer: The mean is
\[ E(X) = \sum_{k=1}^{\infty} 2^k \cdot \frac{1}{2^k} = \sum_{k=1}^{\infty} 1 = \infty. \]
The mean does not exist! This can happen with infinite series.

5.3 Proofs of the algebraic properties of E(X)

The proof of Property (1) is simple, but there is some subtlety in even understanding what it means to add two random variables. Recall that the value of a random variable is a number determined by the outcome of an experiment. To add X and Y means to add the values of X and Y for the same outcome. In table form this looks like:

\[
\begin{array}{l|ccccc}
\text{outcome } \omega: & \omega_1 & \omega_2 & \omega_3 & \cdots & \omega_n \\
\hline
\text{value of } X: & x_1 & x_2 & x_3 & \cdots & x_n \\
\text{value of } Y: & y_1 & y_2 & y_3 & \cdots & y_n \\
\text{value of } X + Y: & x_1 + y_1 & x_2 + y_2 & x_3 + y_3 & \cdots & x_n + y_n \\
\text{prob. } P(\omega): & P(\omega_1) & P(\omega_2) & P(\omega_3) & \cdots & P(\omega_n)
\end{array}
\]

The proof of (1) follows immediately:
\[ E(X + Y) = \sum (x_i + y_i) P(\omega_i) = \sum x_i P(\omega_i) + \sum y_i P(\omega_i) = E(X) + E(Y). \]
The proof of Property (2) only takes one line.
\[ E(aX + b) = \sum p(x_i)(a x_i + b) = a \sum p(x_i) x_i + b \sum p(x_i) = aE(X) + b. \]
The b term in the last expression follows because \(\sum p(x_i) = 1\).

Example 22. Mean of a geometric distribution
Let \(X \sim \text{geo}(p)\). Recall this means X takes values k = 0, 1, 2, ... with probabilities \(p(k) = (1-p)^k p\). (X models the number of tails before the first heads in a sequence of Bernoulli trials.) The mean is given by
\[ E(X) = \frac{1-p}{p}. \]
To see this requires a clever trick. Mathematicians love this sort of thing and we hope you are able to follow the logic. In this class we will not ask you to come up with something like this on an exam.

To compute E(X) we have to sum the infinite series
\[ E(X) = \sum_{k=0}^{\infty} k (1-p)^k p. \]
Here is the trick. We know the sum of the geometric series:
\[ \sum_{k=0}^{\infty} x^k = \frac{1}{1-x}. \]
Differentiate both sides:
\[ \sum_{k=0}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}. \]
Multiply by x:
\[ \sum_{k=0}^{\infty} k x^k = \frac{x}{(1-x)^2}. \]
Replace x by 1 - p:
\[ \sum_{k=0}^{\infty} k (1-p)^k = \frac{1-p}{p^2}. \]
Multiply by p:
\[ \sum_{k=0}^{\infty} k (1-p)^k p = \frac{1-p}{p}. \]
This last expression is the mean:
\[ E(X) = \frac{1-p}{p}. \]

Example 23. Flip a fair coin until you get heads for the first time. What is the expected number of times you flipped tails?
answer: The number of tails before the first head is modeled by \(X \sim \text{geo}(1/2)\). From the previous example \(E(X) = \frac{1/2}{1/2} = 1\). This is a surprisingly small number.
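The formula \(E(X) = (1-p)/p\) from Example 22 can be sanity-checked by summing the series numerically. The R sketch below is our own illustration, not part of the notes: it truncates the infinite sum at an arbitrary cutoff `kmax`, which is an approximation, though the neglected tail is negligible for the values of p tried here. The function names are hypothetical.

```r
# pmf of geometric(p): p(k) = (1 - p)^k * p for k = 0, 1, 2, ...
geom_pmf <- function(k, p) (1 - p)^k * p

# Approximate E(X) = sum over k of k * p(k) by truncating at kmax.
# kmax = 1000 is an arbitrary cutoff, not a value from the notes.
geom_mean_truncated <- function(p, kmax = 1000) {
  k <- 0:kmax
  sum(k * geom_pmf(k, p))
}

geom_mean_truncated(1/2)   # compare with (1 - p)/p = 1 from Example 23
geom_mean_truncated(1/3)   # compare with (1 - p)/p = 2
```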

Example 24. Michael Jordan, the greatest basketball player ever, made 80% of his free throws. In a game what is the expected number he would make before his first miss?
answer: Here is an example where we want the number of successes before the first failure. Using the neutral language of heads and tails: success is tails (probability 1 - p) and failure is heads (probability = p). Therefore p = .2 and the number of tails (made free throws) before the first heads (missed free throw) is modeled by \(X \sim \text{geo}(.2)\). We saw in Example 22 that this is
\[ E(X) = \frac{1-p}{p} = \frac{.8}{.2} = 4. \]

5.4 Expected values of functions of a random variable
(The change of variables formula.)

If X is a discrete random variable taking values \(x_1, x_2, \ldots\) and h is a function then h(X) is a new random variable. Its expected value is
\[ E(h(X)) = \sum_{j} h(x_j)\, p(x_j). \]
We illustrate this with several examples.

Example 25. Let X be the value of a roll of one die and let \(Y = X^2\). Find E(Y).
answer: Since there are a small number of values we can make a table.

X:      1     2     3     4     5     6
Y:      1     4     9     16    25    36
prob:   1/6   1/6   1/6   1/6   1/6   1/6

Notice the probability for each Y value is the same as that of the corresponding X value. So,
\[ E(Y) = E(X^2) = 1^2 \cdot \frac{1}{6} + 2^2 \cdot \frac{1}{6} + \ldots + 6^2 \cdot \frac{1}{6} = 15.167. \]

Example 26. Roll two dice and let X be the sum. Suppose the payoff function is given by \(Y = X^2 - 6X + 1\). Is this a good bet?
answer: We have
\[ E(Y) = \sum_{j=2}^{12} (j^2 - 6j + 1)\, p(j), \quad \text{where } p(j) = P(X = j). \]
We show the table, but really we'll use R to do the calculation.

X:      2      3      4      5      6      7      8      9      10     11     12
Y:      -7     -8     -7     -4     1      8      17     28     41     56     73
prob:   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

Here's the R code I used to compute E(Y) = 13.833.

    x = 2:12                                      # possible sums of two dice
    y = x^2 - 6*x + 1                             # payoff for each sum
    p = c(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)/36     # pmf of the sum
    ave = sum(p*y)                                # E(Y) = sum of p(j) * y(j)

It gave ave = 13.833.

To answer the question above: since the expected payoff is positive it looks like a bet worth taking.

Quiz: If Y = h(X) does E(Y) = h(E(X))?
answer: NO!!! This is not true in general!

Think: Is it true in the previous example?

Quiz: If Y = 3X + 77 does E(Y) = 3E(X) + 77?
answer: Yes. By property (2), scaling and shifting does behave like this.
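Both quiz answers can be seen numerically with a single die. The following R sketch is our own illustration (variable names are hypothetical): it takes \(h(x) = x^2\) as in Example 25 to show that E(h(X)) and h(E(X)) differ, and then checks the linear case Y = 3X + 77.

```r
# One fair die: values 1..6, each with probability 1/6.
x <- 1:6
p <- rep(1/6, 6)

E_X <- sum(p * x)              # E(X)

# Nonlinear h: E(h(X)) uses the change of variables formula.
h <- function(x) x^2
E_hX <- sum(p * h(x))          # E(X^2)
h_EX <- h(E_X)                 # (E(X))^2
c(E_hX, h_EX)                  # these two numbers are different

# Linear case: E(3X + 77) agrees with 3 E(X) + 77.
E_lin <- sum(p * (3 * x + 77))
c(E_lin, 3 * E_X + 77)
```

The same comparison applied to Example 26, with the sum of two dice and \(h(x) = x^2 - 6x + 1\), answers the "Think" question above.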