Massachusetts Institute of Technology


Solutions to Quiz 1: Spring 2006

Problem 1: Each of the following statements is either True or False. There will be no partial credit given for the True/False questions, so any explanations will not be graded. Please clearly indicate True or False in your quiz booklet; ambiguous marks will receive zero credit.

Consider a probabilistic model with a sample space Ω, a collection of events that are subsets of Ω, and a probability law P(·) defined on the collection of events, all exactly as usual. Let A, B and C be events.

(a) If P(A) ≤ P(B), then A ⊂ B. True False

False. As a counterexample, consider the sample space associated with a biased coin which comes up heads with probability 1/3. Then P(heads) ≤ P(tails), but the event heads is clearly not a subset of the event tails. Note that the converse statement is true, i.e. if A ⊂ B, then P(A) ≤ P(B).

(b) Assuming P(B) > 0, P(A | B) is at least as large as P(A). True False

False. As a counterexample, consider the case where A = B^c and 0 < P(B) < 1. Clearly P(A | B) = 0, but P(A) = 1 − P(B) > 0. So, in this case P(A | B) < P(A).

Now let X and Y be random variables defined on the same probability space Ω as above.

(c) If E[X] > E[Y], then E[X²] ≥ E[Y²]. True False

False. Let X take on the values 0 and 3 with equal probability. Let Y take on the values −1000 and 1000 with equal probability. Then E[X] = 1.5 and E[Y] = 0, so E[X] > E[Y]. However, E[X²] = 4.5, while E[Y²] = 1,000,000. Thus E[X²] < E[Y²] in this case. An additional counterexample using degenerate random variables is the trivial case of constants X and Y with X > Y but |X| < |Y| (e.g., X = 1 and Y = −2).

(d) Suppose P(A) > 0. Then E[X] = E[X | A] + E[X | A^c]. True False

False. This resembles the total expectation theorem, but the P(A) and P(A^c) terms are missing. As an explicit counterexample, say X is independent of A, with E[X] = 1. Then the left-hand side is 1, while the right-hand side of the equation is 1 + 1 = 2.

(e) If X and Y are independent and P(C) > 0, then p_{X,Y|C}(x, y) = p_{X|C}(x) p_{Y|C}(y). True False

False. If X and Y are independent, we can conclude p_{X,Y}(x, y) = p_X(x) p_Y(y). The statement asks whether X and Y are conditionally independent given C. As we have seen in class, independence does NOT imply conditional independence. For example, let X take on the values 0 and 1 with equal probability. Let Y also take on the values 0 and 1 with equal probability, independently of X. Let C be the event X + Y = 1. Then, clearly, X and Y are not independent conditioned on C. To see this, note that the joint PMF of X and Y conditioned on C puts probability 1/2 on each of the outcomes (0, 1) and (1, 0). Thus, conditioned on C, telling you the value of X determines the value of Y exactly. On the other hand, conditioned on C, if you don't know the value of X, Y is still equally likely to be 0 or 1.
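The counterexample for part (e) is small enough to check by exhaustive enumeration. The following sketch (an addition, not part of the original solutions) rebuilds the joint PMF of two independent fair bits, conditions on C = {X + Y = 1}, and confirms that the conditional joint PMF does not factor:

```python
from fractions import Fraction
from itertools import product

# X, Y independent fair bits; C = {X + Y = 1}.
outcomes = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# Unconditional independence: p(x, y) == p_X(x) * p_Y(y) for all (x, y).
pX = {x: sum(p for (a, _), p in outcomes.items() if a == x) for x in (0, 1)}
pY = {y: sum(p for (_, b), p in outcomes.items() if b == y) for y in (0, 1)}
assert all(outcomes[x, y] == pX[x] * pY[y] for x, y in outcomes)

# Condition on C: keep only {(0, 1), (1, 0)} and renormalize.
pC = sum(p for (x, y), p in outcomes.items() if x + y == 1)
cond = {(x, y): (p / pC if x + y == 1 else Fraction(0))
        for (x, y), p in outcomes.items()}
pX_C = {x: sum(p for (a, _), p in cond.items() if a == x) for x in (0, 1)}
pY_C = {y: sum(p for (_, b), p in cond.items() if b == y) for y in (0, 1)}

# Conditional independence fails: p(0, 0 | C) = 0, yet the product of the
# conditional marginals at (0, 0) is 1/4.
assert cond[0, 0] == 0
assert pX_C[0] * pY_C[0] == Fraction(1, 4)
```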

(f) If for some constant c we have P({X > c}) = 1/2, then E[X] ≥ c/2. True False

False. Let X take on the values −2 and 1 with equal probability. Then P({X > 0}) = .5, and E[X] = −1/2 < 0. Incidentally, if we restrict X to be nonnegative, the statement is true. We will study this more carefully later in the term. The interested reader can look up the Markov inequality in Section 7.1 of the textbook.

In a simple game involving flips of a fair coin, you win a dollar every time you get a head. Suppose that the maximum number of flips is 10; however, the game terminates as soon as you get a tail.

(g) The expected gain from this game is 1. True False

False. Consider an alternative game where you continue flipping until you see a tail, i.e. the game does not terminate at a maximum of 10 flips. We'll refer to this as the unlimited game. Let G be your gain, and X be the total number of flips (heads and tails). Realize that X is a geometric random variable, and G = X − 1 is a shifted geometric random variable. Also E[G] = E[X] − 1 = $1, which is the value of the unlimited game. The truncated game effectively scales down all payoffs of the unlimited game which are > $10. Thus the truncated game must have a lower expected value than the unlimited game. The expected gain of the truncated game is 1023/1024 ≈ 0.9990 < 1.

Let X be a uniformly distributed continuous random variable over some interval [a, b].

(h) We can uniquely describe f_X(x) from its mean and variance. True False

True. A uniformly distributed continuous random variable is completely specified by its range, i.e. by a and b. We have E[X] = (a + b)/2 and var(X) = (b − a)²/12, thus given E[X] and var(X) one can solve for a and b.

Let X be an exponentially distributed random variable with probability density function f_X(x) = e^{−x}.

(i) Then P({0 ≤ X ≤ 2} ∪ {2 ≤ X ≤ 4}) = 1 − e^{−4}. True False

True. Note {0 ≤ X ≤ 2} ∪ {2 ≤ X ≤ 4} = {0 ≤ X ≤ 4}, so we have P({0 ≤ X ≤ 4}) = F_X(4) = 1 − e^{−4}.

Let X be a normal random variable with mean 1 and variance 4. Let Y be a normal random variable with mean 1 and variance 1.

(j) P(X < 0) < P(Y < 0). True False

False. Since X ~ N(1, 4), (X − 1)/2 ~ N(0, 1). Similarly, Y − 1 ~ N(0, 1). So P(X < 0) = Φ((0 − 1)/2) = Φ(−1/2), and P(Y < 0) = Φ(0 − 1) = Φ(−1). We know that any CDF is monotonically nondecreasing, so Φ(−1) ≤ Φ(−1/2). This shows that the statement is false.

An alternative solution:

False. Since X ~ N(1, 4), (X − 1)/2 ~ N(0, 1). Similarly, Y − 1 ~ N(0, 1). Let Z be a standard normal random variable. Then P(X < 0) = P(Z < (0 − 1)/2) = P(Z < −1/2), and P(Y < 0) = P(Z < 0 − 1) = P(Z < −1). Now, the event {Z < −1} is a subset of the event {Z < −1/2}, and hence P(Z < −1) ≤ P(Z < −1/2), which implies that the statement is false.
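The truncated-game value in part (g) can be confirmed exactly. This short sketch (an addition, not from the original quiz) sums the PMF of the gain G directly:

```python
from fractions import Fraction

# Truncated game: flip a fair coin up to 10 times, win $1 per head,
# stop at the first tail.
# P(G = k) = (1/2)^(k+1) for k = 0..9 (k heads, then a tail),
# P(G = 10) = (1/2)^10 (all ten flips come up heads).
half = Fraction(1, 2)
expected = sum(k * half ** (k + 1) for k in range(10)) + 10 * half ** 10
assert expected == Fraction(1023, 1024)  # just below the $1 unlimited value
```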

Problem 2: (40 points) Borders Book store has been in business for 10 years, and over that period the store has collected transaction data on all of its customers. Various marketing teams have been busy using the data to classify customers in hopes of better understanding customer spending habits.

Marketing Team A has determined that out of their customers, 1/4 are low frequency buyers (i.e., they don't come to the store very often). They have also found that out of the low frequency buyers, 1/3 are high spenders (i.e., they spend a significant amount of money in the store), whereas out of the high frequency buyers only 1/10 are high spenders. Assume each customer is either a low or high frequency buyer.

(a) Compute the probability that a randomly chosen customer is a high spender.

We use the abbreviations HF, LF, HS, and LS to refer to high frequency, low frequency, high spender, and low spender. Using the total probability theorem,

P(HS) = P(LF)P(HS | LF) + P(HF)P(HS | HF) = (1/4)(1/3) + (3/4)(1/10) = 19/120 ≈ 0.1583

(b) Compute the probability that a randomly chosen customer is a high frequency buyer given that he/she is a low spender.

Using Bayes' rule,

P(HF | LS) = P(HF)P(LS | HF) / [P(HF)P(LS | HF) + P(LF)P(LS | LF)] = (3/4)(9/10) / [(3/4)(9/10) + (1/4)(2/3)] = 81/101 ≈ 0.8020

You are told that the only products Borders sells are books, CDs, and DVDs. You are introduced to Marketing Team B, which has identified 3 customer groupings. These groups are collectively exhaustive and mutually exclusive. They have also determined that each customer is equally likely to be in any group, customers are i.i.d., and each customer buys only one item per day. They refer to the groupings as C1, C2, and C3, and have determined the following conditional probabilities:

P(purchases a book | customer in C1) = 1/2
P(purchases a CD | customer in C1) = 1/4
P(purchases a DVD | customer in C1) = 1/4
P(purchases a book | customer in C2) = 1/2
P(purchases a CD | customer in C2) = 0
P(purchases a DVD | customer in C2) = 1/2
P(purchases a book | customer in C3) = 1/3
P(purchases a CD | customer in C3) = 1/3
P(purchases a DVD | customer in C3) = 1/3
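As a quick sanity check of parts (a) and (b), the arithmetic can be carried out in exact rational form. This is an illustrative addition (the fractions 1/4, 1/3, and 1/10 are the figures given for Team A above):

```python
from fractions import Fraction

F = Fraction
P_LF, P_HF = F(1, 4), F(3, 4)          # low / high frequency buyers
P_HS_LF, P_HS_HF = F(1, 3), F(1, 10)   # P(high spender | frequency class)

# (a) Total probability theorem.
P_HS = P_LF * P_HS_LF + P_HF * P_HS_HF
assert P_HS == F(19, 120)  # ≈ 0.1583

# (b) Bayes' rule for P(HF | LS), with P(LS | ·) = 1 − P(HS | ·).
P_LS_HF, P_LS_LF = 1 - P_HS_HF, 1 - P_HS_LF
P_HF_LS = P_HF * P_LS_HF / (P_HF * P_LS_HF + P_LF * P_LS_LF)
assert P_HF_LS == F(81, 101)  # ≈ 0.8020
```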

(c) Compute the probability that a customer purchases a book or a CD.

We use the abbreviations B, C, D for buying a book, CD, or DVD. P(B ∪ C) = P(B) + P(C), because a customer can only buy 1 item, and hence B and C are disjoint. Applying the total probability theorem,

P(B) = P(C1)P(B | C1) + P(C2)P(B | C2) + P(C3)P(B | C3) = (1/3)(1/2) + (1/3)(1/2) + (1/3)(1/3) = 4/9 ≈ 0.4444

Similarly,

P(C) = P(C1)P(C | C1) + P(C2)P(C | C2) + P(C3)P(C | C3) = (1/3)(1/4) + (1/3)(0) + (1/3)(1/3) = 7/36 ≈ 0.1944

So, P(B ∪ C) = 4/9 + 7/36 = 23/36 ≈ 0.6389.

(Note: Alternatively, we could have used the fact that B and C are conditionally disjoint given C1, C2, or C3. Then, we could just add together the conditional probabilities. This would give the formula P(B ∪ C) = (1/3)[(1/2 + 1/4) + (1/2 + 0) + (1/3 + 1/3)].)

(d) Compute the probability that a customer is in group C1 or C3 given that he/she purchased a book.

We use Bayes' rule.

P(C1 ∪ C3 | B) = P(B ∩ (C1 ∪ C3)) / P(B) = [P(B | C1) + P(B | C3)] / [P(B | C1) + P(B | C2) + P(B | C3)]

To get the second line, we used the fact that C1, C2, and C3 are mutually disjoint and collectively exhaustive, so that the common prior factor P(Ci) = 1/3 cancels. Now, substituting the given numbers,

P(C1 ∪ C3 | B) = (1/2 + 1/3) / (1/2 + 1/2 + 1/3) = 5/8 = 0.625

Now in addition to the data from Marketing Team B, you are told that each book costs $15, each CD costs $10, and each DVD costs $15.

(e) Compute the PMF, expected value and variance of the revenue (in dollars) Borders collects from a single item purchase of one customer.

Let R be the revenue from one customer. Since books and DVDs have the same price, R can take on only the values 10 and 15. p_R(10) = P(C) = 7/36; we calculated P(C) in part (c). p_R(15) = 1 − p_R(10) = 29/36, because the PMF must sum to 1. So,

p_R(r) = 7/36 if r = 10; 29/36 if r = 15; 0 otherwise.

Once we have the PMF, calculating the mean and variance is simply a matter of plugging into the definitions. E[R] = Σ_r r p_R(r) = 10(7/36) + 15(29/36) = 505/36 ≈ 14.03. Since we know the mean, we can calculate the variance from var(R) = E[R²] − (E[R])² = 100(7/36) + 225(29/36) − (505/36)² = 5075/1296 ≈ 3.9159.

(f) Suppose that n customers shop on a given day. Compute the expected value and variance of the revenue Borders makes from these customers.

Let R1, R2, ..., Rn be random variables such that Ri is the revenue from customer i. The total revenue is R = Σ_i Ri. Recall that the customers are i.i.d. Thus, by linearity of expectation, E[R] = Σ_i E[Ri] = n E[R1] = 505n/36, using the result from (e). Because the Ri are assumed to be independent, var(R) = Σ_i var(Ri) = n var(R1) = 5075n/1296, using the result from (e).

The following questions are required for 6.431 students (6.041 students may attempt them for Extra Credit).

Skipper is very abnormal; not that there's anything wrong with that. He doesn't fit into any of the marketing teams' models. Every day Skipper wakes up and walks to Borders Bookstore. There he flips a fair coin repeatedly until he flips his second tail. He then goes to the counter and buys 1 DVD for each head he flipped. Let R be the revenue Borders makes from Skipper each day.

(g) What's the daily expected revenue from Skipper?

The crucial observation is that Skipper's revenue can be broken down as the sum of two independent shifted geometric random variables. Let X1 and X2 be independent geometric random variables with parameter p = 1/2. Let R be the total daily revenue from Skipper. We find R = 15(X1 − 1) + 15(X2 − 1), and thus E[R] = 15(E[X1] + E[X2] − 2) = 15 · 2 = $30, where we've used E[Xi] = 1/p = 2. Likewise we find var(R) = 225(var(X1) + var(X2)) = 225 · 4 = $900, where we've used var(Xi) = (1 − p)/p² = 2.
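Parts (c) through (g) can likewise be verified with exact arithmetic. A sketch (an addition, using the group priors, purchase probabilities, and prices stated above):

```python
from fractions import Fraction

F = Fraction
priors = [F(1, 3)] * 3                  # P(C1), P(C2), P(C3)
book = [F(1, 2), F(1, 2), F(1, 3)]      # P(book | Ci)
cd = [F(1, 4), F(0), F(1, 3)]           # P(CD | Ci)

# (c) Total probability theorem; book and CD purchases are disjoint.
P_B = sum(p * b for p, b in zip(priors, book))
P_C = sum(p * c for p, c in zip(priors, cd))
assert (P_B, P_C) == (F(4, 9), F(7, 36))
assert P_B + P_C == F(23, 36)

# (d) With equal priors, the 1/3 factors cancel in Bayes' rule.
assert (book[0] + book[2]) / sum(book) == F(5, 8)

# (e) Books and DVDs both cost $15, CDs $10, so R takes only 10 and 15.
pmf = {10: P_C, 15: 1 - P_C}
ER = sum(r * p for r, p in pmf.items())
varR = sum(r * r * p for r, p in pmf.items()) - ER ** 2
assert ER == F(505, 36) and varR == F(5075, 1296)

# (g) Skipper: R = 15(X1 - 1) + 15(X2 - 1) with Xi ~ Geometric(1/2),
# so E[Xi] = 2 and var(Xi) = 2.
assert 15 * (2 + 2 - 2) == 30       # expected daily revenue, dollars
assert 15 ** 2 * (2 + 2) == 900     # variance of daily revenue
```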

Problem 3: (30 points) We have s urns and n balls, where n ≥ s. Consider an experiment where each ball is placed in an urn at random (i.e., each ball has equal probability of being placed in any of the urns). Assume each ball placement is independent of other placements, and each urn can fit any number of balls. Define the following random variables:

For each i = 1, 2, ..., s, let Xi be the number of balls in urn i.
For each k = 0, 1, ..., n, let Yk be the number of urns that have exactly k balls.

Note: Be sure to include ranges of variables where appropriate.

(a) Are the Xi's independent? Answer: Yes / No

No. Beforehand, X1 can take on any value from 0, 1, ..., n. Say we are told X2 = X3 = ... = Xs = 0. Then, conditioned on this information, X1 = n with probability 1. Thus, the Xi's cannot be independent.

(b) Find the PMF, mean, and variance of Xi.

The crucial observation is that Xi has a binomial PMF. Consider each ball. It chooses an urn independently and uniformly. Thus, the probability the ball lands in urn i is 1/s. There are n balls, so Xi is distributed like a binomial random variable with parameters n and 1/s. Thus, we obtain (writing C(n, k) for the binomial coefficient "n choose k")

p_Xi(k) = C(n, k) (1/s)^k (1 − 1/s)^(n−k), k = 0, 1, ..., n
E[Xi] = n/s
var(Xi) = n (1/s)(1 − 1/s)

(c) For this question let n = 10 and s = 3. Find the probability that the first urn has 3 balls, the second has 2, and the third has 5, i.e. compute P(X1 = 3, X2 = 2, X3 = 5).

We can calculate the required probability using counting. As our sample space, we will use sequences of length 10, where the i-th element in the sequence is the number of the urn that the i-th ball is in. The total number of elements in the sample space is s^n = 3^10. The sequences that have X1 = 3, X2 = 2, and X3 = 5 are those sequences that have three 1s, two 2s, and five 3s. The number of such sequences is given by the partition formula, C(10; 3, 2, 5) = 10!/(3! 2! 5!) = 2520. So,

P(X1 = 3, X2 = 2, X3 = 5) = 2520/3^10 = 280/6561 ≈ 0.0427.

(d) Compute E[Yk].

This problem is very similar to the hat problem discussed in lecture. The trick is to define indicator random variables I1, I2, ..., Is, where Ii is 1 if urn i has exactly k balls, and 0 otherwise. With this definition, Yk = Σ_{i=1}^{s} Ii. By linearity of expectation, we see that E[Yk] = Σ_{i=1}^{s} E[Ii].

To calculate E[Ii], note that E[Ii] = P(Xi = k) = p_Xi(k). From (b), we know that p_Xi(k) = C(n, k)(1/s)^k (1 − 1/s)^(n−k). This means

E[Yk] = s C(n, k)(1/s)^k (1 − 1/s)^(n−k).

(e) Compute var(Yk). You may assume k ≤ n/2.

var(Yk) = E[Yk²] − (E[Yk])². From (d), we know E[Yk], so we only need to find E[Yk²] = E[(Σ_{i=1}^{s} Ii)²]. Using linearity of expectation, and noting that Ii² = Ii since Ii is always 0 or 1, we obtain E[Yk²] = Σ_{i=1}^{s} E[Ii] + Σ_{i=1}^{s} Σ_{j≠i} E[Ii Ij]. From (d), we know E[Ii], which takes care of the first term. To calculate the second term, note E[Ii Ij] = P(Xi = Xj = k). This event only has nonzero probability if 2k ≤ n (hence the assumption in the problem statement). Now, P(Xi = Xj = k) can be computed using the partition formula. The probability that the first k balls land in urn i, the next k balls land in urn j, and the remaining n − 2k balls land in urns not equal to i or j is given by (1/s)^k (1/s)^k (1 − 2/s)^(n−2k). The number of ways of partitioning the n balls into groups of size k, k, and n − 2k is C(n; k, k, n−2k) = n!/(k! k! (n−2k)!). This gives

P(Xi = Xj = k) = C(n; k, k, n−2k) (1/s)^(2k) (1 − 2/s)^(n−2k).

Finally, substituting our results into the original equation, we get

var(Yk) = s C(n, k)(1/s)^k (1 − 1/s)^(n−k) + s(s − 1) C(n; k, k, n−2k)(1/s)^(2k)(1 − 2/s)^(n−2k) − [s C(n, k)(1/s)^k (1 − 1/s)^(n−k)]²

(f) This problem is required for 6.431 students (6.041 students may attempt it for Extra Credit). What is the probability that no urn is empty? I.e., compute P(X1 > 0, X2 > 0, ..., Xs > 0).

To determine the desired probability, we will compute P(some urn is empty). Define events Ai, i = 1, 2, ..., s, such that Ai is the event {Xi = 0}, i.e. that urn i is empty. Then P(some urn is empty) = P(A1 ∪ A2 ∪ ... ∪ As). We will use the inclusion-exclusion principle to calculate the probability of the union of the Ai (see chapter 1, problem 9 for a detailed discussion of the inclusion-exclusion principle). The inclusion-exclusion formula states

P(A1 ∪ ... ∪ As) = Σ_i P(Ai) − Σ_{i<j} P(Ai ∩ Aj) + Σ_{i<j<k} P(Ai ∩ Aj ∩ Ak) − ... + (−1)^(s+1) P(A1 ∩ ... ∩ As)

Let us calculate P(A1 ∩ A2 ∩ ... ∩ Ak) for any k ≤ s.
This intersection represents the event that the first k urns are all empty. The probability that the first k urns are empty is simply the probability that every ball misses these k urns, which is (1 − k/s)^n. By symmetry, this formula works for any fixed set of k urns. Plugging this into the inclusion-exclusion formula, we get

P(A1 ∪ A2 ∪ ... ∪ As) = Σ_i (1 − 1/s)^n − Σ_{i<j} (1 − 2/s)^n + Σ_{i<j<k} (1 − 3/s)^n − ... + (−1)^(s+1) (1 − s/s)^n.

To simplify this expression, consider the first sum. There are C(s, 1) = s terms in the sum, so the first sum is just s(1 − 1/s)^n. In the next sum, there are C(s, 2) terms. In general, the k-th sum has C(s, k) terms. Thus, we get

P(A1 ∪ A2 ∪ ... ∪ As) = C(s, 1)(1 − 1/s)^n − C(s, 2)(1 − 2/s)^n + ... + (−1)^(s+1) C(s, s)(1 − s/s)^n.

We subtract this probability from 1 to get the final answer,

P(X1 > 0, X2 > 0, ..., Xs > 0) = 1 − [C(s, 1)(1 − 1/s)^n − C(s, 2)(1 − 2/s)^n + ... + (−1)^(s+1) C(s, s)(1 − s/s)^n]
 = Σ_{k=0}^{s−1} (−1)^k C(s, k)(1 − k/s)^n.

(We can leave out the k = s term since that term is always 0.)
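The inclusion-exclusion answer to (f), and the multinomial computation in (c), can be cross-checked against brute-force enumeration for small n and s. A sketch (an addition, not from the original solutions):

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

F = Fraction

def p_no_empty(n, s):
    # Part (f): inclusion-exclusion, sum_{k=0}^{s-1} (-1)^k C(s, k) (1 - k/s)^n.
    return sum((-1) ** k * comb(s, k) * (1 - F(k, s)) ** n for k in range(s))

def brute_no_empty(n, s):
    # Enumerate all s^n equally likely placements and count those in which
    # every urn receives at least one ball.
    hits = sum(1 for w in product(range(s), repeat=n)
               if set(w) == set(range(s)))
    return F(hits, s ** n)

for n, s in [(3, 2), (4, 3), (6, 3), (5, 4)]:
    assert p_no_empty(n, s) == brute_no_empty(n, s)

# Part (c): P(X1 = 3, X2 = 2, X3 = 5) for n = 10, s = 3.
n, s = 10, 3
p = F(factorial(n) // (factorial(3) * factorial(2) * factorial(5)), s ** n)
assert p == F(280, 6561)  # ≈ 0.0427
count = sum(1 for w in product(range(s), repeat=n)
            if (w.count(0), w.count(1), w.count(2)) == (3, 2, 5))
assert F(count, s ** n) == p
```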