STAT 285 Fall 2014 Assignment 1 Solutions

1. An environmental agency sets a standard of 200 ppb for the concentration of cadmium in a lake. The concentration of cadmium in one lake is measured 17 times. The measurements average 211 parts per billion with an SD of 15 parts per billion. Could the real concentration of cadmium be below the standard of 200 ppb? Your answer will be 4 or 5 sentences long. I want no formulas.

Solution

The short answer is that while it is possible that the true concentration is at or below 200 ppb, this is very unlikely. However, the statistician giving this advice must first discover whether the standard statistical method is appropriate.

Your answer should have two parts. In the first part you need to say that, in order to answer the question, you need to be confident that the 17 measurements can be thought of as a random sample from a population whose mean is the real concentration of cadmium. The 17 samples need to be spread throughout the lake, with sites chosen at random, and the measuring method needs to be free of bias.

Now I will answer at some length. I am putting more words here than you need. Statisticians approach this question as a hypothesis testing problem. It calls for a yes or no answer, but we give an answer which is more like "probably not".

First we ask how the data were collected so we know what was measured. We hope that the samples were collected in such a way that they can be treated as a simple random sample from a population whose mean value $\mu$ is the concentration of cadmium in the lake. We leave it up to the subject area experts to make sure this is true, but when consulting we ask questions about it.

If so, we have a sample of $n = 17$ measurements from a population with mean $\mu$. We observe $\bar{x} = 211$ and $s = 15$. The question is: is $\mu$ above 200 or not? That means the problem is one sided and we have to have either $H_0: \mu \le 200$ or $H_0: \mu \ge 200$.
If we test the former and reject the null hypothesis, we would conclude "No, the real concentration is (probably) not below 200." If we test the hypothesis $\mu \ge 200$ and accept the null, our conclusion is "we have little evidence against the assertion that $\mu \ge 200$", which is far from providing a definitive answer.

To test the hypotheses we compute
$$t = \frac{211 - 200}{15/\sqrt{17}} = 3.02.$$
For the null hypothesis $H_0: \mu \le 200$ we get a P-value from t tables with 16 degrees of freedom. I get 0.004, which is very strong evidence against this null, and conclude "No, almost certainly not." If you test the other way the P-value is 0.996 and the conclusion is far weaker: "I see very little evidence against the assertion that $\mu \ge 200$." Statisticians have a duty to do their best to answer the question asked, so the former answer is far better.

I said no formulas, so your answer did not need to contain the formula for the t statistic. Please focus on the practical issues: it is the statistician's job to make sure the techniques we use are relevant to the problem at hand. Assumptions matter.

2. Chapter 2 page 89 number 94 in text.

Solution

Always start by defining notation. I will let $R_i$ be the event that relay $i$ sends a 1. I will let $T$ be the event that a 1 is sent by the transmitter. A prime denotes the complement of an event. The receiver at the end gets a 1 if $R_3$ occurs. We are told
$$P(R_1 \mid T) = 0.8 = P(R_1' \mid T'), \qquad P(R_1' \mid T) = 0.2 = P(R_1 \mid T'),$$
$$P(R_2 \mid R_1) = 0.8 = P(R_2' \mid R_1'), \qquad P(R_2' \mid R_1) = 0.2 = P(R_2 \mid R_1'),$$
$$P(R_3 \mid R_2) = 0.8 = P(R_3' \mid R_2'), \qquad P(R_3' \mid R_2) = 0.2 = P(R_3 \mid R_2').$$
(a) We are asked about the event $R_1 R_2 R_3$ assuming $T$ happens. So we have
$$
\begin{aligned}
P(R_1 R_2 R_3 \mid T) &= \frac{P(R_1 R_2 R_3 T)}{P(T)} \\
&= \frac{P(R_3 \mid T R_1 R_2)\,P(T R_1 R_2)}{P(T)} \\
&= \frac{P(R_3 \mid T R_1 R_2)\,P(R_2 \mid T R_1)\,P(T R_1)}{P(T)} \\
&= P(R_3 \mid T R_1 R_2)\,P(R_2 \mid T R_1)\,P(R_1 \mid T).
\end{aligned}
$$
These equations are all just applications of the definition of conditional probability. Now think about $P(R_3 \mid T R_1 R_2)$. The fact that the relays operate independently of one another means that the probability that relay 3 sends a 1 depends only on what it received from relay 2. So
$$P(R_3 \mid T R_1 R_2) = P(R_3 \mid R_2) = 0.8.$$
Similarly
$$P(R_2 \mid T R_1) = P(R_2 \mid R_1) = 0.8.$$
This gives
$$P(R_1 R_2 R_3 \mid T) = 0.8^3.$$
I am quite ok with students who simply said that each time there is a chance of 0.8 that the relay retransmits a 1 if it receives one, and all three of those have to happen, so the answer is $0.8^3$.

(b) You are told to condition on $T$ and compute the probability of $R_3$. To get from $T$ to $R_3$ you must have one of 4 intervening results for the relays: (1,1,1), (1,0,1), (0,1,1), or (0,0,1). Each of these results has to have a probability calculated the way I did part (a). This leads to the probability
$$0.8^3 + 0.8 \cdot 0.2 \cdot 0.2 + 0.2 \cdot 0.2 \cdot 0.8 + 0.2 \cdot 0.8 \cdot 0.2.$$
The final answer is $0.8^3 + 3(0.8)(0.2)^2 = 0.608$.
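The path enumeration in parts (a) and (b) can be checked numerically. The following is a minimal sketch, assuming each relay passes on the bit it received with probability 0.8, independently of everything earlier; the names `P_CORRECT` and `prob_final_bit` are made up for this illustration and are not from the text.

```python
from itertools import product

P_CORRECT = 0.8  # assumed chance that a relay passes on the bit it received


def prob_final_bit(sent, final, n_relays=3, p=P_CORRECT):
    """P(last relay sends `final` | transmitter sent `sent`).

    Enumerates every sequence of relay outputs and adds up the
    probabilities of the sequences that end in `final`.
    """
    total = 0.0
    for path in product((0, 1), repeat=n_relays):
        if path[-1] != final:
            continue
        prob = 1.0
        previous = sent
        for bit in path:
            # A relay agrees with its input with probability p, else flips it.
            prob *= p if bit == previous else 1 - p
            previous = bit
        total += prob
    return total


p_all_ones = P_CORRECT ** 3          # part (a): every relay sends a 1
p_r3_given_t = prob_final_bit(1, 1)  # part (b): only the last relay must send a 1
print(round(p_all_ones, 3), round(p_r3_given_t, 3))
```

Calling `prob_final_bit(0, 1)` gives the probability that the last relay sends a 1 when a 0 was transmitted, which is the other conditional probability the model determines.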
(c) Now we want $P(T \mid R_3)$ when $P(T) = 0.7$. We just computed $P(R_3 \mid T)$, so this is a case where we need to reverse the conditioning: Bayes' Theorem.
$$P(T \mid R_3) = \frac{P(R_3 \mid T)P(T)}{P(R_3 \mid T)P(T) + P(R_3 \mid T')P(T')} = \frac{0.608 \cdot 0.7}{0.608 \cdot 0.7 + P(R_3 \mid T')(1 - 0.7)}.$$
The remaining term has to be done the way you did it in (b) BUT, by symmetry, $P(R_3' \mid T') = 0.608$. So
$$P(R_3 \mid T') = 1 - 0.608 = 0.392.$$
The answer is
$$P(T \mid R_3) = \frac{0.4256}{0.4256 + 0.1176} = 0.784.$$
I don't care about more digits than that.

3. Chapter 2 page 89 number 96 in text.

(a) Plug in $c = c^*$ to get
$$P_d(c^*) = \frac{(c^*/c^*)^\beta}{1 + (c^*/c^*)^\beta} = \frac{1}{2}.$$

(b) When $\beta = 4$ we have
$$P_d(2c^*) = \frac{(2c^*/c^*)^\beta}{1 + (2c^*/c^*)^\beta} = \frac{2^\beta}{1 + 2^\beta} = \frac{16}{17}.$$

(c) If $A$ is the event the first crack is detected and $B$ is the event the second crack is detected, we want $P(AB' \cup A'B)$. The union is of mutually exclusive events so
$$P(\text{exactly one detected}) = P(AB') + P(A'B) = P(A)P(B') + P(A')P(B) = \frac{1}{2} \cdot \frac{1}{17} + \frac{1}{2} \cdot \frac{16}{17} = \frac{1}{2}.$$
(d) As $\beta \to \infty$ it converges to the function
$$P_d(c) = \begin{cases} 0 & c < c^* \\ \tfrac{1}{2} & c = c^* \\ 1 & c > c^* \end{cases}$$

4. Chapter 2 page 90 number 100 in text.

I will let Pos be the event of a positive test, Neg be the event of a negative test, and $C$ be the event the selected individual is a carrier. Then $\text{Pos}_1$ will be a positive result the first time the test is done, and so on. We know $P(C) = 0.01$. Given $C$ the results of the two tests are independent so
$$P(\text{Pos}_1 \text{Pos}_2 \mid C) = 0.9 \cdot 0.9 = 0.81.$$
Also
$$P(\text{Neg}_1 \text{Neg}_2 \mid C) = 0.1 \cdot 0.1 = 0.01.$$
Next
$$P(\text{Pos}_1 \text{Pos}_2 \mid C') = 0.05 \cdot 0.05 = 0.0025$$
and
$$P(\text{Neg}_1 \text{Neg}_2 \mid C') = 0.95 \cdot 0.95 = 0.9025.$$
To compute the probability of two results being the same we have to add together the first two multiplied by $P(C)$ and the second two multiplied by $P(C')$:
$$P(\text{Two results the same}) = (0.81 + 0.01) \cdot 0.01 + (0.0025 + 0.9025) \cdot 0.99 = 0.90415.$$
Usually the two results are the same.

For part (b) we want
$$P(C \mid \text{Pos}_1 \text{Pos}_2) = \frac{P(\text{Pos}_1 \text{Pos}_2 \mid C)P(C)}{P(\text{Pos}_1 \text{Pos}_2 \mid C)P(C) + P(\text{Pos}_1 \text{Pos}_2 \mid C')P(C')}.$$
We worked out the pieces above so we get
$$\frac{0.81 \cdot 0.01}{0.81 \cdot 0.01 + 0.0025 \cdot 0.99} = 0.7659574.$$
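The two calculations for the carrier test can be reproduced in a few lines. This is only a sketch under the same assumptions as the solution: $P(C) = 0.01$, a positive rate of 0.9 in carriers and 0.05 in non-carriers, and two tests that are independent given carrier status. The variable and function names are invented for the illustration.

```python
P_CARRIER = 0.01               # P(C)
P_POS_GIVEN_CARRIER = 0.90     # P(Pos | C)
P_POS_GIVEN_NONCARRIER = 0.05  # P(Pos | C')


def two_test_probs(p_pos):
    """Return (P(both positive), P(both negative)) for two tests
    that are independent given carrier status."""
    return p_pos ** 2, (1 - p_pos) ** 2


pp_c, nn_c = two_test_probs(P_POS_GIVEN_CARRIER)
pp_n, nn_n = two_test_probs(P_POS_GIVEN_NONCARRIER)

# Part (a): law of total probability over carrier / non-carrier.
p_same = (pp_c + nn_c) * P_CARRIER + (pp_n + nn_n) * (1 - P_CARRIER)

# Part (b): Bayes' theorem for P(C | both tests positive).
numerator = pp_c * P_CARRIER
p_carrier_given_two_pos = numerator / (numerator + pp_n * (1 - P_CARRIER))

print(round(p_same, 5), round(p_carrier_given_two_pos, 7))
```

Changing `P_CARRIER` shows how strongly the answer to part (b) depends on how rare carriers are.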
I would prefer to see the answer to part (b) reported as 0.766 or 0.77 for practical purposes! Note that in practice the test results are not really independent. The test typically works less well in some carriers than in others, so doing the test twice is less informative than this question suggests.

5. In the game of craps the shooter rolls a pair of fair 6 sided dice, each labelled with 1 to 6 spots. When she rolls, the numbers of spots showing on the two dice are added; that total is her roll. She wins immediately if she rolls a 7 or an 11 and loses immediately if she rolls 2, 3 or 12. If she rolls any other number she has to keep rolling until she rolls either that same number or a 7. If she rolls her number before rolling a 7 she wins, while if 7 comes up before her number comes up again she loses.

Solution

I am going to give just quick answers in this problem.

(a) What is the probability she loses on the first roll?
$$P(\text{roll a 2 on first roll}) + P(\text{roll a 3 on first roll}) + P(\text{roll a 12 on first roll}) = \frac{1}{36} + \frac{2}{36} + \frac{1}{36} = \frac{4}{36} = \frac{1}{9}.$$

(b) What is the probability she wins on the first roll?
$$P(\text{roll a 7 on first roll}) + P(\text{roll an 11 on first roll}) = \frac{6}{36} + \frac{2}{36} = \frac{8}{36} = \frac{2}{9}.$$

(c) What is the probability she rolls a 4 on her first roll?
$$\frac{3}{36} = \frac{1}{12}.$$

(d) Given that she rolls a 4 on her first roll, what is the probability that she wins (rolls a 4) on her next (second) roll?
$$\frac{1}{12}.$$
(e) Under the same condition what is the probability she loses on her next roll?
$$\frac{6}{36} = \frac{1}{6}.$$

(f) Given that she rolls a 4 on her first roll, what is the probability that she wins (rolls a 4) on her third roll? (She has to get to the third roll and then roll a 4.)
$$(1 - 1/6 - 1/12) \cdot 1/12.$$

(g) Do the same for the fourth and fifth rolls, figure out the pattern and compute the conditional probability that she wins given that her first roll is a 4.
$$(1 - 1/6 - 1/12)(1 - 1/6 - 1/12)(1/12)$$
and
$$(1 - 1/6 - 1/12)(1 - 1/6 - 1/12)(1 - 1/6 - 1/12)(1/12).$$
The pattern is that
$$P(\text{win on roll } k \mid \text{first roll is a 4}) = (1 - 1/6 - 1/12)^{k-2} \cdot 1/12.$$
We get the desired probability by adding these up from 2 to infinity:
$$\frac{1}{12}\left(1 + (3/4) + (3/4)^2 + (3/4)^3 + \cdots\right) = \frac{1}{12} \cdot \frac{1}{1 - 3/4} = \frac{1}{3}.$$
Notice:
$$P(\text{roll 4} \mid \text{roll 4 or roll 7}) = \frac{1/12}{1/12 + 1/6} = \frac{1}{3}.$$
In other words, think about the roll where the game ends because she rolls either a 4 or a 7. On that roll about 1/3 of the time she rolls a 4 and 2/3 of the time she rolls a 7.
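The geometric series in part (g) and the shortcut that follows it can be compared with exact arithmetic. Here is a small sketch using Python's `fractions` module; the names `P_FOUR`, `P_SEVEN`, `P_CONTINUE` and `win_on_roll` are chosen for this illustration only.

```python
from fractions import Fraction

P_FOUR = Fraction(3, 36)            # chance of rolling a 4 on any one roll
P_SEVEN = Fraction(6, 36)           # chance of rolling a 7 on any one roll
P_CONTINUE = 1 - P_FOUR - P_SEVEN   # neither a 4 nor a 7: the game goes on


def win_on_roll(k):
    """P(win on roll k | first roll is a 4), for k >= 2."""
    return P_CONTINUE ** (k - 2) * P_FOUR


# Partial sum of the series over rolls 2, 3, ..., 199.
partial_sum = sum(win_on_roll(k) for k in range(2, 200))

# Shortcut: condition on the roll that ends the game (a 4 or a 7).
closed_form = P_FOUR / (P_FOUR + P_SEVEN)

print(closed_form, float(partial_sum))
```

Replacing `P_FOUR` with the chance of another point (for example `Fraction(5, 36)` for an 8) gives the corresponding conditional win probability by the same argument.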
6. Two players, I and II, take turns tossing a biased coin. Player I goes first. First person to toss Heads wins. Each time a player plays s/he has chance $p$ of getting Heads. What is the chance that Player I wins and how does this compare to the answer for Player II?

Solution

I am going to let $I$ be the event Player I wins and $I'$ be the event Player II wins. I am going to assume we can prove someone has to win eventually. I am going to let $A$ be the event that Player I wins right away on his/her first toss. We have $P(A) = p$ and I claim that $P(I' \mid A') = P(I)$. The idea is that if Player I does not win right away then we are now playing the very same game but Player II is starting! So
$$
\begin{aligned}
P(I) &= P(I \mid A)P(A) + P(I \mid A')P(A') \\
&= 1 \cdot p + (1 - P(I' \mid A'))(1 - p) \\
&= p + (1 - P(I))(1 - p).
\end{aligned}
$$
Now it's just algebra. Get all the terms with $P(I)$ together:
$$P(I)(1 + (1 - p)) = p + 1 - p = 1$$
so
$$P(I) = \frac{1}{2 - p}.$$
Player II therefore wins with probability $1 - P(I) = (1 - p)/(2 - p)$, which is smaller than $P(I)$ by a factor of $1 - p$. Notice that if $p = 1$ then of course Player I always wins immediately, while if $p$ is really small then $P(I)$ is very close to 1/2 (but the game takes a very long time!).

DUE: Thursday, 11 September.
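The closed form for Problem 6 can be cross-checked against a direct sum. The sketch below is an independent check rather than the method used in the solution: it assumes Player I wins on his/her $(k+1)$-th toss exactly when the first $2k$ tosses are all Tails and the next one is Heads, which has probability $(1-p)^{2k} p$. The function names are hypothetical.

```python
def p_first_player_wins(p):
    """Closed form from the solution: P(I) = 1 / (2 - p)."""
    return 1 / (2 - p)


def p_first_player_wins_series(p, max_rounds=10_000):
    """Truncated sum of (1 - p)^(2k) * p over k = 0, 1, ..., max_rounds - 1.

    Term k is the chance that both players miss k times each and
    Player I then tosses Heads.
    """
    q = 1 - p
    return sum(q ** (2 * k) * p for k in range(max_rounds))


for p in (0.1, 0.5, 0.9):
    closed = p_first_player_wins(p)
    series = p_first_player_wins_series(p)
    # Player II's chance is the complement, (1 - p) / (2 - p).
    print(p, closed, series, 1 - closed)
```

For very small `p` the truncated sum needs a larger `max_rounds` to get close to the closed form, which mirrors the remark that the game then takes a very long time.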