Class 27: Review Problems for Final Exam, 18.05 Spring 2017

This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry out computations.

Problem 1. (Counting)
(a) There are several ways to think about this. Here is one. The letters are p, r, o, b, b, a, i, i, l, t, y. We use the following steps to create a sequence of these letters.
Step 1: Choose a position for the letter p: 11 ways to do this.
Step 2: Choose a position for the letter r: 10 ways to do this.
Step 3: Choose a position for the letter o: 9 ways to do this.
Step 4: Choose two positions for the two b's: (8 choose 2) ways to do this.
Step 5: Choose a position for the letter a: 6 ways to do this.
Step 6: Choose two positions for the two i's: (5 choose 2) ways to do this.
Step 7: Choose a position for the letter l: 3 ways to do this.
Step 8: Choose a position for the letter t: 2 ways to do this.
Step 9: Choose a position for the letter y: 1 way to do this.
Multiplying these all together we get:
11 · 10 · 9 · (8 choose 2) · 6 · (5 choose 2) · 3 · 2 · 1 = 11!/(2! 2!)

(b) Here are two ways to do this problem.
Method 1. Since every arrangement has an equal probability of being chosen, we simply have to count the number that start with the letter b. After putting a b in position 1 there are 10 letters: p, r, o, b, a, i, i, l, t, y, to place in the last 10 positions. We count this in the same manner as part (a). That is:
Choose the position for p: 10 ways.
Choose the positions for r, o, b, a: 9 · 8 · 7 · 6 ways.
Choose two positions for the two i's: (5 choose 2) ways.
Choose the position for l: 3 ways.
Choose the position for t: 2 ways.
Choose the position for y: 1 way.
Multiplying these together we get 10 · 9 · 8 · 7 · 6 · (5 choose 2) · 3 · 2 · 1 = 10!/2! arrangements that start with the letter b. Therefore the probability a random arrangement starts with b is (10!/2!) / (11!/(2! 2!)) = 2/11.
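Method 1's step-by-step count, and the probability from part (b), can be sanity-checked with exact integer arithmetic. This is a quick illustration in Python, not part of the original solution:

```python
from fractions import Fraction
from math import factorial, comb

# Step-by-step count of arrangements of "probability" from part (a).
total = 11 * 10 * 9 * comb(8, 2) * 6 * comb(5, 2) * 3 * 2 * 1
assert total == factorial(11) // (factorial(2) * factorial(2))  # 11!/(2! 2!)

# Arrangements starting with b (part (b), Method 1): 10!/2!
start_b = 10 * 9 * 8 * 7 * 6 * comb(5, 2) * 3 * 2 * 1
prob = Fraction(start_b, total)
print(total, start_b, prob)  # 9979200 1814400 2/11
```

The exact fraction 2/11 agrees with Method 2's direct argument.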
Method 2. Suppose we build the arrangement by picking a letter for the first position, then the second position, etc. Since there are 11 letters, two of which are b's, we have a 2/11 chance of picking a b for the first letter.

Problem 2. (Probability) We are given P(E ∪ F) = 3/4.
E^c ∩ F^c = (E ∪ F)^c ⟹ P(E^c ∩ F^c) = 1 − P(E ∪ F) = 1/4.

Problem 3. (Counting) Let H_i be the event that the i-th hand has exactly one king. We have the conditional probabilities

P(H_1) = (4 choose 1)(48 choose 12) / (52 choose 13);
P(H_2 | H_1) = (3 choose 1)(36 choose 12) / (39 choose 13);
P(H_3 | H_1 ∩ H_2) = (2 choose 1)(24 choose 12) / (26 choose 13);
P(H_4 | H_1 ∩ H_2 ∩ H_3) = 1.

P(H_1 ∩ H_2 ∩ H_3 ∩ H_4) = P(H_4 | H_1 ∩ H_2 ∩ H_3) P(H_3 | H_1 ∩ H_2) P(H_2 | H_1) P(H_1)
= [(2 choose 1)(24 choose 12) (3 choose 1)(36 choose 12) (4 choose 1)(48 choose 12)] / [(26 choose 13)(39 choose 13)(52 choose 13)].

Problem 4. (Conditional probability)
(a) Sample space = Ω = {(1,1), (1,2), (1,3), ..., (6,6)} = {(i,j) | i, j = 1, 2, 3, 4, 5, 6}. (Each outcome is equally likely, with probability 1/36.)
A = {(1,3), (2,2), (3,1)},
B = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (1,3), (2,3), (4,3), (5,3), (6,3)},
A ∩ B = {(1,3), (3,1)}.
P(A | B) = P(A ∩ B)/P(B) = (2/36)/(11/36) = 2/11.
(b) P(A) = 3/36 ≠ P(A | B), so they are not independent.

Problem 5. (Bayes' formula) For a given problem let C be the event the student gets the problem correct and K the event the student knows the answer. The question asks for P(K | C). We'll compute this using Bayes' rule:
P(K | C) = P(C | K) P(K) / P(C).
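The arithmetic behind this Bayes'-rule computation is simple enough to mirror in a few lines of Python. The inputs P(C|K) = 1, P(K) = 0.6, P(C|K^c) = 1/4 are the values given in the problem; the code is only an illustrative check:

```python
from fractions import Fraction

# Given: a student who knows the answer is always correct, 60% of
# students know the answer, and a blind guess is correct 1/4 of the time.
p_C_given_K = Fraction(1)
p_K = Fraction(3, 5)
p_C_given_Kc = Fraction(1, 4)

# Law of total probability, then Bayes' rule.
p_C = p_C_given_K * p_K + p_C_given_Kc * (1 - p_K)
p_K_given_C = p_C_given_K * p_K / p_C
print(p_C, p_K_given_C)  # 7/10 6/7
```

Note 6/7 ≈ 0.857, matching the decimal answer below.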
We're given: P(C | K) = 1, P(K) = 0.6, P(C | K^c) = 0.25.
Law of total probability: P(C) = P(C | K) P(K) + P(C | K^c) P(K^c) = 1 · 0.6 + 0.25 · 0.4 = 0.7.
Therefore P(K | C) = 0.6/0.7 = 0.857 = 85.7%.

Problem 6. (Bayes' formula) Here is the game tree; R1 means red on the first draw, etc. The branch probabilities (consistent with an urn of 7 red and 3 blue balls, where a red ball drawn is set aside and a blue ball drawn is returned to the urn) are:

First draw:   R1: 7/10, B1: 3/10
After R1:     R2: 6/9,  B2: 3/9     After B1:    R2: 7/10, B2: 3/10
After R1 R2:  R3: 5/8,  B3: 3/8     After R1 B2: R3: 6/9,  B3: 3/9
After B1 R2:  R3: 6/9,  B3: 3/9     After B1 B2: R3: 7/10, B3: 3/10

Summing the probability along all paths ending at a B3 node we get
P(B3) = (7/10)(6/9)(3/8) + (7/10)(3/9)(3/9) + (3/10)(7/10)(3/9) + (3/10)(3/10)(3/10) ≈ 0.350.

Problem 7. (Expected value and variance)

X:     −2     −1     0      1      2
p(x):  1/15   2/15   3/15   4/15   5/15

We compute
E[X] = (−2)(1/15) + (−1)(2/15) + 0(3/15) + 1(4/15) + 2(5/15) = 2/3.
Thus
Var(X) = E((X − 2/3)²) = 14/9.

Problem 8. (Expected value and variance) We first compute
E[X] = ∫₀¹ x · 2x dx = 2/3
E[X²] = ∫₀¹ x² · 2x dx = 1/2
E[X⁴] = ∫₀¹ x⁴ · 2x dx = 1/3.
Thus,
Var(X) = E[X²] − (E[X])² = 1/2 − 4/9 = 1/18
and
Var(X²) = E[X⁴] − (E[X²])² = 1/3 − 1/4 = 1/12.

Problem 9. (Expected value and variance) Use Var(X) = E(X²) − E(X)²:
3 = E(X²) − 4 ⟹ E(X²) = 7.

Problem 10. (Expected value and variance) answer: Make a table

X:     0       1
prob:  (1−p)   p
X²:    0       1

From the table, E(X) = 0 · (1−p) + 1 · p = p.
Since X and X² have the same table, E(X²) = E(X) = p.
Therefore, Var(X) = p − p² = p(1 − p).

Problem 11. (Expected value) Let X be the number of people who get their own hat. Following the hint: let X_j represent whether person j gets their own hat. That is, X_j = 1 if person j gets their hat and 0 if not.
We have X = Σ_{j=1}^{100} X_j, so E(X) = Σ_{j=1}^{100} E(X_j).
Since person j is equally likely to get any hat, we have P(X_j = 1) = 1/100. Thus, X_j ∼ Bernoulli(1/100) ⟹ E(X_j) = 1/100 ⟹ E(X) = 1.

Problem 12. (Expected value and variance) (a) There are a number of ways to present this.
X/3 ∼ binomial(25, 1/6), so
P(X = 3k) = (25 choose k) (1/6)^k (5/6)^(25−k), for k = 0, 1, 2, ..., 25.
(b) X/3 ∼ binomial(25, 1/6).
Recall that the mean and variance of binomial(n, p) are np and np(1−p). So,
E(X) = 3np = 3 · 25/6 = 75/6, and Var(X) = 9np(1−p) = 9 · 25 · (1/6)(5/6) = 125/4.
(c) E(X + Y) = E(X) + E(Y) = 150/6 = 25, E(2X) = 2E(X) = 150/6 = 25.
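The mean and variance in (b) can be verified by summing the pmf directly, taking X to be 3 times a binomial(25, 1/6) count as in the solution. A quick exact check in Python (illustrative, not part of the original solution):

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)
# X = 3 * (binomial(25, 1/6) count), so P(X = 3k) = C(25, k) p^k (1-p)^(25-k).
pmf = {3 * k: comb(25, k) * p**k * (1 - p)**(25 - k) for k in range(26)}
assert sum(pmf.values()) == 1

EX = sum(x * q for x, q in pmf.items())
EX2 = sum(x**2 * q for x, q in pmf.items())
VarX = EX2 - EX**2
print(EX, VarX)  # 25/2 125/4
```

Note 25/2 = 75/6 and 125/4 match the closed-form values 3np and 9np(1−p).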
Var(X + Y) = Var(X) + Var(Y) = 250/4. Var(2X) = 4 Var(X) = 500/4.
The means of X + Y and 2X are the same, but Var(2X) > Var(X + Y). This makes sense because in X + Y sometimes X and Y will be on opposite sides of the mean, so the distances to the mean will tend to cancel. However, in 2X the distance to the mean is always doubled.

Problem 13. (Continuous random variables) First we find the value of a:
∫₀¹ f(x) dx = 1 = ∫₀¹ (x + ax²) dx = 1/2 + a/3 ⟹ a = 3/2.
The CDF is F_X(b) = P(X ≤ b). We break this into cases:
(i) b < 0 ⟹ F_X(b) = 0.
(ii) 0 ≤ b ≤ 1 ⟹ F_X(b) = ∫₀^b (x + (3/2)x²) dx = b²/2 + b³/2.
(iii) 1 < b ⟹ F_X(b) = 1.
Using F_X we get
P(0.5 < X < 1) = F_X(1) − F_X(0.5) = 1 − (0.5² + 0.5³)/2 = 13/16.

Problem 14. (Exponential distribution)
(a) We compute
P(X ≥ 5) = 1 − P(X < 5) = 1 − ∫₀⁵ λe^(−λx) dx = 1 − (1 − e^(−5λ)) = e^(−5λ).
(b) We want P(X ≥ 15 | X ≥ 10). First observe that P(X ≥ 15, X ≥ 10) = P(X ≥ 15). From computations similar to (a), we know
P(X ≥ 15) = e^(−15λ) and P(X ≥ 10) = e^(−10λ).
From the definition of conditional probability,
P(X ≥ 15 | X ≥ 10) = P(X ≥ 15, X ≥ 10)/P(X ≥ 10) = P(X ≥ 15)/P(X ≥ 10) = e^(−5λ).
Note: This is an illustration of the memorylessness property of the exponential distribution.

Problem 15. (a) Note, Y follows what is called a log-normal distribution.
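The memorylessness identity in Problem 14(b) can also be seen numerically. Here is a small Monte Carlo sketch in Python; the rate λ = 0.3 is an arbitrary choice for the check, not a value from the problem:

```python
import math
import random

lam = 0.3  # arbitrary rate, chosen only for this check
random.seed(1)
xs = [random.expovariate(lam) for _ in range(200_000)]

# Memorylessness: P(X >= 15 | X >= 10) should match P(X >= 5) = e^(-5 lam).
tail10 = [x for x in xs if x >= 10]
cond = sum(x >= 15 for x in tail10) / len(tail10)
print(round(math.exp(-5 * lam), 3))  # 0.223
assert abs(cond - math.exp(-5 * lam)) < 0.02
```

The conditional frequency among samples that already exceed 10 agrees with the unconditional tail probability e^(−5λ), as the algebra predicts.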
F_Y(a) = P(Y ≤ a) = P(e^Z ≤ a) = P(Z ≤ ln(a)) = Φ(ln(a)).
Differentiating using the chain rule:
f_Y(a) = d/da F_Y(a) = d/da Φ(ln(a)) = (1/a) φ(ln(a)) = (1/(√(2π) a)) e^(−(ln(a))²/2).
(b) (i) We want to find q_0.33 such that P(Z ≤ q_0.33) = 0.33. That is, we want
Φ(q_0.33) = 0.33 ⟹ q_0.33 = Φ⁻¹(0.33).
(ii) We want q_0.9 such that
F_Y(q_0.9) = 0.9 ⟹ Φ(ln(q_0.9)) = 0.9 ⟹ q_0.9 = e^(Φ⁻¹(0.9)).
(iii) As in (ii), q_0.5 = e^(Φ⁻¹(0.5)) = e⁰ = 1.

Problem 16. (a) We did this in class. Let φ(z) and Φ(z) be the PDF and CDF of Z.
F_Y(y) = P(Y ≤ y) = P(aZ + b ≤ y) = P(Z ≤ (y − b)/a) = Φ((y − b)/a).
Differentiating:
f_Y(y) = d/dy F_Y(y) = d/dy Φ((y − b)/a) = (1/a) φ((y − b)/a) = (1/(a√(2π))) e^(−(y−b)²/(2a²)).
Since this is the density for N(b, a²) we have shown Y ∼ N(b, a²).
(b) By part (a), Y ∼ N(μ, σ²) ⟹ Y = σZ + μ. But this implies (Y − μ)/σ = Z ∼ N(0, 1). QED

Problem 17. (Quantiles) The density for this distribution is f(x) = λe^(−λx). We know (or can compute) that the distribution function is F(a) = 1 − e^(−λa). The median is the value of a such that F(a) = 0.5. Thus,
1 − e^(−λa) = 0.5 ⟹ 0.5 = e^(−λa) ⟹ log(0.5) = −λa ⟹ a = log(2)/λ.

Problem 18. (Correlation) As usual let X_i = the number of heads on the i-th flip, i.e. 0 or 1.
Let X = X_1 + X_2 + X_3 be the sum of the first 3 flips and Y = X_3 + X_4 + X_5 the sum of the last 3. Using the algebraic properties of covariance we have
Cov(X, Y) = Cov(X_1 + X_2 + X_3, X_3 + X_4 + X_5)
= Cov(X_1, X_3) + Cov(X_1, X_4) + Cov(X_1, X_5) + Cov(X_2, X_3) + Cov(X_2, X_4) + Cov(X_2, X_5) + Cov(X_3, X_3) + Cov(X_3, X_4) + Cov(X_3, X_5)
Because the X_i are independent, the only non-zero term in the above sum is
Cov(X_3, X_3) = Var(X_3) = 1/4.
Therefore, Cov(X, Y) = 1/4.
We get the correlation by dividing by the standard deviations. Since X is the sum of 3 independent Bernoulli(0.5) variables, we have σ_X = √(3/4), and likewise σ_Y = √(3/4). Thus
Cor(X, Y) = Cov(X, Y)/(σ_X σ_Y) = (1/4)/(3/4) = 1/3.

Problem 19. (Joint distributions) (a) Here we have two continuous random variables X and Y with joint probability density function
f(x, y) = (12/5) xy(1 + y) for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1,
and f(x, y) = 0 otherwise. So
P(1/4 ≤ X ≤ 1/2, 1/3 ≤ Y ≤ 2/3) = ∫_{1/4}^{1/2} ∫_{1/3}^{2/3} f(x, y) dy dx = 41/720.
(b) F(a, b) = ∫₀^a ∫₀^b f(x, y) dy dx = (3/5) a²b² + (2/5) a²b³ for 0 ≤ a ≤ 1 and 0 ≤ b ≤ 1.
(c) We find the marginal CDF F_X(a) by setting b in F(a, b) to the top of its range, i.e. b = 1. So F_X(a) = F(a, 1) = a².
(d) For 0 ≤ x ≤ 1, we have
f_X(x) = ∫₀¹ f(x, y) dy = 2x.
This is consistent with (c) because d/da (a²) = 2a.
(e) We first compute f_Y(y) for 0 ≤ y ≤ 1 as
f_Y(y) = ∫₀¹ f(x, y) dx = (6/5) y(y + 1).
Since f(x, y) = f_X(x) f_Y(y), we conclude that X and Y are independent.

Problem 20. (Joint distributions) (a) The marginal probability P_Y(1) = 1/2 ⟹ P(X = 1, Y = 1) = 0 and P(X = 3, Y = 1) = 0. Now each column has one empty entry. This can be computed by making the column add up to the given marginal probability.
Y\X     1      2      3      P_Y
−1      1/6    1/6    1/6    1/2
1       0      1/2    0      1/2
P_X     1/6    2/3    1/6    1

(b) No, X and Y are not independent. For example,
P(X = 2, Y = 1) = 1/2 ≠ P(X = 2) P(Y = 1) = (2/3)(1/2) = 1/3.

Problem 21. Standardize (here n = 100, μ = 1/5, σ = 1/3 are the values given in the problem):
P(Σ_i X_i < 30) = P( (X̄ − μ)/(σ/√n) < (30/n − μ)/(σ/√n) )
≈ P( Z < (3/10 − 1/5)/((1/3)/10) )  (by the central limit theorem)
= P(Z < 3) = 1 − 0.0013 = 0.9987 (from the table).

Problem 22. (Central limit theorem) Let X_j be the IQ of a randomly selected person. We are given E(X_j) = 100 and σ_{X_j} = 15. Let X̄ be the average of the IQs of 100 randomly selected people. We have E(X̄) = 100 and σ_X̄ = 15/√100 = 1.5.
The problem asks for P(X̄ > 115). Standardizing we get P(X̄ > 115) = P(Z > 10). This is effectively 0.

Problem 23. (NHST: chi-square) We will use a chi-square test for homogeneity. Remember we need to use all the data! For hypotheses we have:
H_0: the re-offense rate is the same for both groups.
H_A: the rates are different.
Here is the table of counts. The computation of the expected counts is explained below.

                  Control group         Experimental group
                  observed  expected    observed  expected
Re-offend            70        50          30        50
Don't re-offend     130       150         170       150
Total               200                   200

The expected counts are computed as follows. Under H_0 the re-offense rates are the same, say θ. To find the expected counts we find the MLE of θ using the combined data:
θ̂ = (total re-offend)/(total subjects) = 100/400 = 1/4.
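Taking the observed counts to be 70 and 130 in the control group and 30 and 170 in the experimental group, the pooled estimate, expected counts, and chi-square statistic can be computed mechanically. A short Python check (illustrative only):

```python
# Observed counts: rows = (re-offend, don't re-offend),
# columns = (control, experimental).
obs = [[70, 30], [130, 170]]

col_tot = [200, 200]
theta_hat = (70 + 30) / 400  # pooled re-offense rate = 1/4

# Expected counts under H0 (same rate theta_hat in both groups).
exp_counts = [[theta_hat * n for n in col_tot],
              [(1 - theta_hat) * n for n in col_tot]]
X2 = sum((o - e) ** 2 / e
         for orow, erow in zip(obs, exp_counts)
         for o, e in zip(orow, erow))
print(exp_counts, round(X2, 2))  # [[50.0, 50.0], [150.0, 150.0]] 21.33
```

This reproduces the expected counts and the statistic discussed next.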
Then, for example, the expected number of re-offenders in the control group is θ̂ · 200 = 50. The other expected counts are computed in the same way. The chi-square test statistic is
X² = Σ (observed − expected)²/expected = 20²/50 + 20²/150 + 20²/50 + 20²/150 = 8 + 2.67 + 8 + 2.67 = 21.33.
Finally, we need the degrees of freedom: df = 1, because this is a two-by-two table and (2 − 1)(2 − 1) = 1. (Or because we can freely fill in the count in only one cell and still be consistent with the marginal counts 100, 300, 200, 200 used to compute the expected counts.)
From the χ² table: p = P(X² > 21.33 | df = 1) < 0.001.
Conclusion: we reject H_0 in favor of H_A. The experimental intervention appears to be effective.

Problem 24. (Confidence intervals) We compute the data mean and variance from the n = 10 data points: x̄ = 65, s² = 35.778. The number of degrees of freedom is 9. We look up the critical value t_{9,0.025} = 2.262 in the t-table. The 95% confidence interval is
[x̄ − t_{9,0.025} s/√n, x̄ + t_{9,0.025} s/√n] = [65 − 2.262 √3.5778, 65 + 2.262 √3.5778] = [60.72, 69.28].

Problem 25. (Confidence intervals) Suppose we have taken data x_1, ..., x_n with mean x̄. The 95% confidence interval for the mean is x̄ ± z_{0.025} σ/√n. This has width 2 z_{0.025} σ/√n. Setting the width equal to 1 and substituting the values z_{0.025} = 1.96 and σ = 5 we get
2 · 1.96 · 5/√n = 1 ⟹ √n = 19.6.
So, n = (19.6)² = 384.16.
If we use our rule of thumb that z_{0.025} = 2, we have √n = 2 · 2 · 5 = 20 ⟹ n = 400.

Problem 26. (Confidence intervals) The 90% confidence interval is x̄ ± z_{0.05}/(2√n), using the conservative bound σ ≤ 1/2 for a 0–1 valued poll response. Since z_{0.05} = 1.64 and n = 400, our confidence interval is
x̄ ± 1.64/(2√400) = x̄ ± 0.041.
If this is entirely above 0.5 we have x̄ − 0.041 > 0.5, so x̄ > 0.541 ≈ 0.54. Let T be the number out of 400 who prefer A. We have x̄ = T/400 > 0.54, so T > 216.

Problem 27. (Confidence intervals) A 95% confidence level means about 5% = 1/20 will be wrong. You'd expect about 2 to be wrong.
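As an aside on Problems 25 and 26: the z critical values and the sample-size arithmetic can be reproduced with the standard normal CDF from the Python standard library (an illustrative sketch, not part of the original solution):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal
z025 = z.inv_cdf(0.975)  # two-sided 95% critical value, ~1.96
z05 = z.inv_cdf(0.95)    # two-sided 90% critical value, ~1.64

# Problem 25: the 95% CI has width 2 * z025 * sigma / sqrt(n); set it to 1.
sigma = 5.0
n = (2 * z025 * sigma) ** 2  # sqrt(n) = 2 * 1.96 * 5 = 19.6
print(round(z025, 2), round(z05, 2), round(n))  # 1.96 1.64 384
```

Rounding n up to a whole sample gives the same order of magnitude as the rule-of-thumb answer n = 400.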
With a probability p = 0.05 of being wrong, the number wrong follows a Binomial(40, 0.05) distribution. This has expected value 2 and standard deviation √(40(0.05)(0.95)) = 1.38. Being wrong 10 times is (10 − 2)/1.38 = 5.8 standard deviations from the mean. This would be surprising.

Problem 28. (Confidence intervals) We have n = 27 and s = 5.86. If we fix a hypothesis for σ², we know
(n − 1)s²/σ² ∼ χ²_{n−1}.
We used R to find the critical values. (Or use the χ² table.)
c025 = qchisq(0.975, 26) = 41.923
c975 = qchisq(0.025, 26) = 13.844
The 95% confidence interval for σ² is
[(n − 1)s²/c025, (n − 1)s²/c975] = [26 · 5.86²/41.923, 26 · 5.86²/13.844] = [21.297, 64.492].
We can take square roots to find the 95% confidence interval for σ: [4.615, 8.031].

Problem 29. (a) Step 1. We have the point estimate of p: p̂ = 0.333.
Step 2. Use the computer to generate many (say 200) resamples of the data, each the same size as the original sample. (These are called the bootstrap samples.)
Step 3. For each bootstrap sample compute p∗ = 1/x̄∗ and δ∗ = p∗ − p̂.
Step 4. Sort the δ∗ and find the critical values δ_0.95 and δ_0.05. (Remember δ_0.95 is the 5th percentile, etc.)
Step 5. The 90% bootstrap confidence interval for p is [p̂ − δ_0.05, p̂ − δ_0.95].
(b) It's tricky to keep the sides straight here. We work slowly and carefully:
The 5th and 95th percentiles for x̄∗ are the 10th and 190th entries:
2.89, 3.72.
(Here again there is some ambiguity about which entries to use. We will accept using the 11th or the 191st entries, or some interpolation between these entries.)
So the 5th and 95th percentiles for p∗ are 1/3.72 = 0.2688 and 1/2.89 = 0.346.
So the 5th and 95th percentiles for δ∗ = p∗ − p̂ are 0.2688 − 0.333 = −0.064 and 0.346 − 0.333 = 0.013.
These are also the 0.95 and 0.05 critical values. So the 90% CI for p is
[0.333 − 0.013, 0.333 + 0.064] = [0.320, 0.397].

Problem 30. (a) The steps are the same as in the previous problem, except the bootstrap samples are used in a different way.
Step 1. We have the point estimate of the median: q̂_0.5 = 3.3.
Step 2. Use the computer to generate many (say 200) same-size resamples of the original data.
Step 3. For each resample compute the median q∗_0.5 and δ∗ = q∗_0.5 − q̂_0.5.
Step 4. Sort the δ∗ and find the critical values δ_0.95 and δ_0.05. (Remember δ_0.95 is the 5th percentile, etc.)
Step 5. The 90% bootstrap confidence interval for q_0.5 is [q̂_0.5 − δ_0.05, q̂_0.5 − δ_0.95].
(b) This is very similar to the previous problem. We proceed slowly and carefully to get terms on the correct side of the inequalities.
The 5th and 95th percentiles for q∗_0.5 are 2.89, 3.7. So the 5th and 95th percentiles for δ∗ = q∗_0.5 − q̂_0.5 are
[2.89 − 3.3, 3.7 − 3.3] = [−0.41, 0.4].
These are also the 0.95 and 0.05 critical values. So the 90% CI for q_0.5 is
[3.3 − 0.4, 3.3 + 0.41] = [2.9, 3.71].

Problem 31. The model is y_i = a + bx_i + ε_i, where ε_i is random error. We assume the errors are independent with mean 0 and the same variance for each i (homoscedastic).
The total squared error is
E = Σ (y_i − a − bx_i)² = (2 − a − b)² + (1 − a − 2b)² + (3 − a − 3b)².
The least squares fit is given by the values of a and b which minimize E. We solve for them by setting the partial derivatives of E with respect to a and b to 0. In R we found that a = 1, b = 0.5.
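The solution above used R's least squares fit; the same numbers fall out of the standard closed-form formulas. Here is a hand computation in Python, with the data points (1, 2), (2, 1), (3, 3) read off the squared-error sum (an illustrative check, not part of the original solution):

```python
# Data points read off the squared-error sum: (1, 2), (2, 1), (3, 3).
xs = [1, 2, 3]
ys = [2, 1, 3]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
# Least squares slope and intercept for the model y = a + b x.
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar
print(a, b)  # 1.0 0.5
```

Setting ∂E/∂a = ∂E/∂b = 0 and solving the resulting two linear equations gives the same slope b = Σ(x_i − x̄)(y_i − ȳ)/Σ(x_i − x̄)² and intercept a = ȳ − b x̄ used here.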