Class 27: Review Problems for Final Exam, solutions
18.05 Spring 2017

This does not cover everything on the final. Look at the posted practice problems for other topics. To save time in class: set up, but do not carry out computations.

Problem 1. (Counting)
(a) There are several ways to think about this. Here is one. The letters are p, r, o, b, b, a, i, i, l, t, y: 11 letters in all. We use the following steps to create a sequence of these letters.
Step 1: Choose a position for the letter p: 11 ways to do this.
Step 2: Choose a position for the letter r: 10 ways to do this.
Step 3: Choose a position for the letter o: 9 ways to do this.
Step 4: Choose two positions for the two b's: C(8,2) ways to do this.
Step 5: Choose a position for the letter a: 6 ways to do this.
Step 6: Choose two positions for the two i's: C(5,2) ways to do this.
Step 7: Choose a position for the letter l: 3 ways to do this.
Step 8: Choose a position for the letter t: 2 ways to do this.
Step 9: Choose a position for the letter y: 1 way to do this.
Multiplying these all together we get

11 · 10 · 9 · C(8,2) · 6 · C(5,2) · 3 · 2 · 1 = 11!/(2! 2!).

(b) Here are two ways to do this problem.
Method 1. Since every arrangement has equal probability of being chosen, we simply have to count the number that start with the letter b. After putting a b in position 1 there are 10 letters, p, r, o, b, a, i, i, l, t, y, to place in the last 10 positions. We count this in the same manner as part (a). That is:
Choose the position for p: 10 ways.
Choose the positions for r, o, b, a: 9 · 8 · 7 · 6 ways.
Choose two positions for the two i's: C(5,2) ways.
Choose the position for l: 3 ways.
Choose the position for t: 2 ways.
Choose the position for y: 1 way.
Multiplying these together we get 10 · 9 · 8 · 7 · 6 · C(5,2) · 3 · 2 · 1 = 10!/2! arrangements that start with the letter b. Therefore the probability a random arrangement starts with b is

(10!/2!) / (11!/(2! 2!)) = 2/11.

Method 2. Suppose we build the arrangement by picking a letter for the first position, then the second position, etc. Since there are 11 letters, two of which are b's, we have a 2/11 chance of picking a b for the first letter.
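A quick R check of part (b), added here (it is not part of the original solution): compute the exact count ratio and confirm it with a small simulation.

    # exact: arrangements starting with b, over all arrangements
    (factorial(10)/factorial(2)) / (factorial(11)/(factorial(2)*factorial(2)))  # 2/11 = 0.1818...
    # simulation: shuffle the letters of "probability" many times
    letters_vec <- strsplit("probability", "")[[1]]
    mean(replicate(1e5, sample(letters_vec)[1] == "b"))  # approximately 0.18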

Problem 2. (Probability) We are given P(E ∪ F) = 3/4. Since E^c ∩ F^c = (E ∪ F)^c, we have P(E^c ∩ F^c) = 1 − P(E ∪ F) = 1/4.

Problem 3. (Counting) Let H_i be the event that the i-th hand has exactly one king. We have the conditional probabilities

P(H1) = C(4,1) C(48,12) / C(52,13);   P(H2 | H1) = C(3,1) C(36,12) / C(39,13);
P(H3 | H1 ∩ H2) = C(2,1) C(24,12) / C(26,13);   P(H4 | H1 ∩ H2 ∩ H3) = 1.

Multiplying these together,

P(H1 ∩ H2 ∩ H3 ∩ H4) = P(H4 | H1 ∩ H2 ∩ H3) P(H3 | H1 ∩ H2) P(H2 | H1) P(H1)
= [C(2,1) C(24,12) · C(3,1) C(36,12) · C(4,1) C(48,12)] / [C(26,13) C(39,13) C(52,13)].

Problem 4. (Conditional probability)
(a) The sample space is Ω = {(1,1), (1,2), (1,3), ..., (6,6)} = {(i,j) | i, j = 1, 2, ..., 6}. (Each outcome is equally likely, with probability 1/36.) We have
A = {(1,3), (2,2), (3,1)},
B = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (1,3), (2,3), (4,3), (5,3), (6,3)}.
So P(A | B) = P(A ∩ B)/P(B) = (2/36)/(11/36) = 2/11.
(b) P(A) = 3/36 = 1/12 ≠ P(A | B), so they are not independent.

Problem 5. (Bayes' formula) For a given problem let C be the event the student gets the problem correct and K the event the student knows the answer. The question asks for P(K | C). We'll compute this using Bayes' rule:

P(K | C) = P(C | K) P(K) / P(C).

We're given P(C | K) = 1, P(C | K^c) = 0.25, P(K) = 0.6. By the law of total probability,

P(C) = P(C | K) P(K) + P(C | K^c) P(K^c) = 1 · 0.6 + 0.25 · 0.4 = 0.7.

Therefore P(K | C) = 0.6/0.7 ≈ 0.857 = 85.7%.
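As a sanity check on Problem 3 (added; not in the original), the product of these terms can be evaluated directly in R with choose():

    p <- (choose(4,1)*choose(48,12)/choose(52,13)) *
         (choose(3,1)*choose(36,12)/choose(39,13)) *
         (choose(2,1)*choose(24,12)/choose(26,13))
    p  # approximately 0.105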

Problem 6. (Bayes' formula) Here is the game tree; R1 means red on the first draw, etc. The branch probabilities are:

First draw: R1 with probability 7/10, B1 with probability 3/10.
Second draw: after R1, R2 with 6/9 and B2 with 3/9; after B1, R2 with 7/10 and B2 with 3/10.
Third draw: after R1 R2, R3 with 5/8 and B3 with 3/8; after R1 B2, 6/9 and 3/9; after B1 R2, 6/9 and 3/9; after B1 B2, 7/10 and 3/10.

Summing the probabilities to all the B3 nodes we get

P(B3) = (7/10)(6/9)(3/8) + (7/10)(3/9)(3/9) + (3/10)(7/10)(3/9) + (3/10)(3/10)(3/10) ≈ 0.350.

Problem 7. (Expected value and variance)

X:      −2     −1     0      1      2
p(x):  1/15   2/15   3/15   4/15   5/15

We compute

E[X] = −2 · (1/15) − 1 · (2/15) + 0 · (3/15) + 1 · (4/15) + 2 · (5/15) = 2/3.

Thus Var(X) = E((X − 2/3)²) = 14/9.

Problem 8. (Expected value and variance) We first compute

E[X] = ∫₀¹ x · 2x dx = 2/3,
E[X²] = ∫₀¹ x² · 2x dx = 1/2,
E[X⁴] = ∫₀¹ x⁴ · 2x dx = 1/3.

Thus

Var(X) = E[X²] − (E[X])² = 1/2 − 4/9 = 1/18

and

Var(X²) = E[X⁴] − (E[X²])² = 1/3 − 1/4 = 1/12.
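A short R verification of Problems 7 and 8 (added here; not in the original):

    x <- c(-2, -1, 0, 1, 2); p <- (1:5)/15
    m <- sum(x*p); m                   # 2/3
    sum((x - m)^2 * p)                 # 14/9 = 1.5556
    integrate(function(t) t^2 * 2*t, 0, 1)$value - (2/3)^2   # Var(X) = 1/18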

Problem 9. (Expected value and variance) Use Var(X) = E(X²) − E(X)²: since Var(X) = 3 and E(X) = 2, we get 3 = E(X²) − 4, so E(X²) = 7.

Problem 10. (Expected value and variance) Make a table:

X:      0       1
prob: (1−p)     p
X²:     0       1

From the table, E(X) = 0 · (1−p) + 1 · p = p. Since X and X² have the same table, E(X²) = E(X) = p. Therefore, Var(X) = p − p² = p(1−p).

Problem 11. (Expected value) Let X be the number of people who get their own hat. Following the hint: let X_j represent whether person j gets their own hat; that is, X_j = 1 if person j gets their hat and 0 if not. We have X = Σ_{j=1}^{100} X_j, so E(X) = Σ_{j=1}^{100} E(X_j). Since person j is equally likely to get any hat, we have P(X_j = 1) = 1/100. Thus X_j ~ Bernoulli(1/100), so E(X_j) = 1/100 and E(X) = 1.

Problem 12. (Expected value and variance)
(a) There are a number of ways to present this. X/3 ~ binomial(25, 1/6), so

P(X = 3k) = C(25,k) (1/6)^k (5/6)^(25−k), for k = 0, 1, 2, ..., 25.

(b) Again, X/3 ~ binomial(25, 1/6). Recall that the mean and variance of binomial(n, p) are np and np(1−p). So

E(X) = 3np = 3 · 25/6 = 75/6 = 12.5, and Var(X) = 9np(1−p) = 9 · 25 · (1/6)(5/6) = 125/4 = 31.25.

(c) E(X + Y) = E(X) + E(Y) = 150/6 = 25, and E(2X) = 2E(X) = 150/6 = 25.
Var(X + Y) = Var(X) + Var(Y) = 250/4 = 62.5, while Var(2X) = 4 Var(X) = 500/4 = 125.
The means of X + Y and 2X are the same, but Var(2X) > Var(X + Y). This makes sense: in X + Y, sometimes X and Y will be on opposite sides of the mean, so the distances to the mean tend to cancel; in 2X the distance to the mean is always doubled.
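A numerical check of Problem 12 in R (added; not part of the original), computing E(X) and Var(X) straight from the pmf:

    k <- 0:25
    EX <- sum(3*k * dbinom(k, 25, 1/6)); EX      # 12.5
    sum((3*k - EX)^2 * dbinom(k, 25, 1/6))       # 31.25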

Problem 13. (Continuous random variables) First we find the value of a:

∫₀¹ f(x) dx = 1 = ∫₀¹ (x + ax²) dx = 1/2 + a/3, so a = 3/2.

The CDF is F_X(b) = P(X ≤ b). We break this into cases:
(i) b < 0: F_X(b) = 0.
(ii) 0 ≤ b ≤ 1: F_X(b) = ∫₀ᵇ (x + (3/2)x²) dx = b²/2 + b³/2.
(iii) 1 < b: F_X(b) = 1.
Using F_X we get

P(0.5 < X < 1) = F_X(1) − F_X(0.5) = 1 − (0.5² + 0.5³)/2 = 13/16.

Problem 14. (Exponential distribution)
(a) We compute

P(X ≥ 5) = 1 − P(X < 5) = 1 − ∫₀⁵ λe^(−λx) dx = 1 − (1 − e^(−5λ)) = e^(−5λ).

(b) We want P(X ≥ 15 | X ≥ 10). First observe that P(X ≥ 15, X ≥ 10) = P(X ≥ 15). From computations similar to those in (a), we know

P(X ≥ 15) = e^(−15λ) and P(X ≥ 10) = e^(−10λ).

From the definition of conditional probability,

P(X ≥ 15 | X ≥ 10) = P(X ≥ 15, X ≥ 10)/P(X ≥ 10) = P(X ≥ 15)/P(X ≥ 10) = e^(−5λ).

Note: this is an illustration of the memorylessness property of the exponential distribution.

Problem 15. (a) Note: Y follows what is called a log-normal distribution.

F_Y(a) = P(Y ≤ a) = P(e^Z ≤ a) = P(Z ≤ ln(a)) = Φ(ln(a)).

Differentiating using the chain rule:

f_Y(a) = d/da F_Y(a) = d/da Φ(ln(a)) = (1/a) φ(ln(a)) = 1/(a √(2π)) · e^(−(ln a)²/2).

(b) (i) We want to find q_0.33 such that P(Z ≤ q_0.33) = 0.33. That is, we want Φ(q_0.33) = 0.33, so q_0.33 = Φ⁻¹(0.33).
(ii) We want q_0.9 such that F_Y(q_0.9) = 0.9. That is, Φ(ln(q_0.9)) = 0.9, so q_0.9 = e^(Φ⁻¹(0.9)).
(iii) As in (ii), q_0.5 = e^(Φ⁻¹(0.5)) = e⁰ = 1.
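A quick R check of Problems 13 and 15(iii) (added; not in the original):

    f <- function(x) x + 1.5*x^2
    integrate(f, 0, 1)$value      # 1, so f is a valid density
    integrate(f, 0.5, 1)$value    # 13/16 = 0.8125
    exp(qnorm(0.5))               # 1, the median of Y = e^Z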

Problem 16. (a) We did this in class. Let φ(z) and Φ(z) be the PDF and CDF of Z.

F_Y(y) = P(Y ≤ y) = P(aZ + b ≤ y) = P(Z ≤ (y−b)/a) = Φ((y−b)/a).

Differentiating:

f_Y(y) = d/dy F_Y(y) = d/dy Φ((y−b)/a) = (1/a) φ((y−b)/a) = 1/(a √(2π)) · e^(−(y−b)²/(2a²)).

Since this is the density for N(b, a²), we have shown Y ~ N(b, a²).
(b) By part (a), Y ~ N(µ, σ²) means Y = σZ + µ. But this implies (Y − µ)/σ = Z ~ N(0, 1). QED

Problem 17. (Quantiles) The density for this distribution is f(x) = λe^(−λx). We know (or can compute) that the distribution function is F(a) = 1 − e^(−λa). The median is the value of a such that F(a) = 0.5. Thus,

1 − e^(−λa) = 0.5  ⟹  0.5 = e^(−λa)  ⟹  log(0.5) = −λa  ⟹  a = log(2)/λ.

Problem 18. (Correlation) As usual let X_i = the number of heads on the i-th flip, i.e. 0 or 1. Let X = X1 + X2 + X3, the sum of the first 3 flips, and Y = X3 + X4 + X5, the sum of the last 3. Using the algebraic properties of covariance we have

Cov(X, Y) = Cov(X1 + X2 + X3, X3 + X4 + X5)
= Cov(X1,X3) + Cov(X1,X4) + Cov(X1,X5) + Cov(X2,X3) + Cov(X2,X4) + Cov(X2,X5) + Cov(X3,X3) + Cov(X3,X4) + Cov(X3,X5).

Because the X_i are independent, the only non-zero term in the above sum is Cov(X3, X3) = Var(X3) = 1/4. Therefore, Cov(X, Y) = 1/4.
We get the correlation by dividing by the standard deviations. Since X is the sum of 3 independent Bernoulli(0.5) random variables, σ_X = √(3/4), and likewise σ_Y = √(3/4). So

Cor(X, Y) = Cov(X, Y)/(σ_X σ_Y) = (1/4)/(3/4) = 1/3.
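The median in Problem 17 and the correlation in Problem 18 are easy to spot-check in R (added; not in the original). The value λ = 2 below is an arbitrary choice for illustration:

    qexp(0.5, rate = 2); log(2)/2      # both 0.3466: the median equals log(2)/lambda
    flips <- matrix(rbinom(5e5, 1, 0.5), ncol = 5)
    X <- rowSums(flips[, 1:3]); Y <- rowSums(flips[, 3:5])
    cor(X, Y)                          # approximately 1/3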

Problem 19. (Joint distributions)
(a) Here we have two continuous random variables X and Y with joint probability density function

f(x, y) = (12/5) xy(1 + y) for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1,

and f(x, y) = 0 otherwise. So

P(1/4 ≤ X ≤ 1/2, 1/3 ≤ Y ≤ 2/3) = ∫ from 1/4 to 1/2 ∫ from 1/3 to 2/3 f(x, y) dy dx = 41/720.

(b) F(a, b) = ∫₀ᵃ ∫₀ᵇ f(x, y) dy dx = (3/5) a²b² + (2/5) a²b³ for 0 ≤ a ≤ 1 and 0 ≤ b ≤ 1.
(c) We find the marginal CDF F_X(a) by setting b in F(a, b) to the top of its range, i.e. b = 1. So F_X(a) = F(a, 1) = a².
(d) For 0 ≤ x ≤ 1, we have f_X(x) = ∫₀¹ f(x, y) dy = 2x. This is consistent with (c) because d/dx (x²) = 2x.
(e) We first compute f_Y(y) for 0 ≤ y ≤ 1 as f_Y(y) = ∫₀¹ f(x, y) dx = (6/5) y(y + 1).
Since f(x, y) = f_X(x) f_Y(y), we conclude that X and Y are independent.

Problem 20. (Joint distributions)
(a) The marginal probability P_Y(1) = 1/2 forces P(X = 0, Y = 1) = 0 and P(X = 2, Y = 1) = 0. Now each column has one empty entry, which can be computed by making the column add up to the given marginal probability. The completed table:

Y\X      0     1     2    P_Y
−1      1/6   1/6   1/6   1/2
 1       0    1/2    0    1/2
P_X     1/6   2/3   1/6

(b) No, X and Y are not independent. For example, P(X = 0, Y = 1) = 0 ≠ 1/12 = P(X = 0) P(Y = 1).
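The double integral in Problem 19(a) can be evaluated numerically in R (added; not in the original) by nesting integrate():

    f <- function(x, y) (12/5) * x * y * (1 + y)
    inner <- function(xs) sapply(xs, function(x)
      integrate(function(y) f(x, y), 1/3, 2/3)$value)
    integrate(inner, 1/4, 1/2)$value   # 41/720 = 0.05694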

Problem 21. (Central limit theorem) Standardize:

P(Σ X_i < 30) = P( ((1/n)Σ X_i − µ)/(σ/√n) < (30/n − µ)/(σ/√n) )
≈ P( Z < (3/10 − 1/5)/(1/30) )   (by the central limit theorem)
= P(Z < 3)
= 1 − 0.0013 = 0.9987   (from the table).

Problem 22. (Central limit theorem) Let X_j be the IQ of a randomly selected person. We are given E(X_j) = 100 and σ_Xj = 15. Let X̄ be the average of the IQs of 100 randomly selected people. Then E(X̄) = 100 and σ_X̄ = 15/√100 = 1.5.
The problem asks for P(X̄ > 115). Standardizing we get P(X̄ > 115) ≈ P(Z > 10). This is effectively 0.
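The normal-table values in Problems 21 and 22 can be reproduced in R (added; not in the original):

    (3/10 - 1/5)/(1/30)   # 3, the standardized value in Problem 21
    pnorm(3)              # 0.9987
    1 - pnorm(10)         # Problem 22: effectively 0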

Problem 23. (NHST: chi-square) We will use a chi-square test for homogeneity. Remember we need to use all the data! For hypotheses we have:
H0: the re-offense rate is the same for both groups.
HA: the rates are different.
Here is the table of counts; the computation of the expected counts is explained below.

                    Control group        Experimental group   Total
                    observed  expected   observed  expected
Re-offend              70        50         30        50       100
Don't re-offend       130       150        170       150       300
Total                 200                  200                  400

The expected counts are computed as follows. Under H0 the re-offense rates are the same, say θ. To find the expected counts we find the MLE of θ using the combined data:

θ̂ = (total re-offend)/(total subjects) = 100/400 = 1/4.

Then, for example, the expected number of re-offenders in the control group is 200 · θ̂ = 50. The other expected counts are computed in the same way. The chi-square test statistic is

X² = Σ (observed − expected)²/expected = 20²/50 + 20²/150 + 20²/50 + 20²/150 = 8 + 2.67 + 8 + 2.67 ≈ 21.33.

Finally, we need the degrees of freedom: df = 1, because this is a two-by-two table and (2 − 1)(2 − 1) = 1. (Or because we can freely fill in the count in only one cell and still be consistent with the marginal counts 100, 200, 200, 300, 400 used to compute the expected counts.)
From the χ² table: p = P(X² > 21.33 | df = 1) < 0.01.
Conclusion: we reject H0 in favor of HA. The experimental intervention appears to be effective.

Problem 24. (Confidence intervals) We compute the data mean and variance: x̄ = 65, s² = 35.778. The number of degrees of freedom is 9. We look up the critical value t_{9,0.025} = 2.262 in the t-table. The 95% confidence interval is

[ x̄ − t_{9,0.025} s/√n, x̄ + t_{9,0.025} s/√n ] = [ 65 − 2.262 √3.5778, 65 + 2.262 √3.5778 ] = [60.721, 69.279].

Problem 25. (Confidence intervals) Suppose we have taken data x_1, ..., x_n with mean x̄. The 95% confidence interval for the mean is x̄ ± z_{0.025} σ/√n. This has width 2 z_{0.025} σ/√n. Setting the width equal to 1 and substituting the values z_{0.025} = 1.96 and σ = 5, we get

2 · 1.96 · 5/√n = 1  ⟹  √n = 19.6.

So n = (19.6)² = 384.16; rounding up, n = 385.
If we use our rule of thumb that z_{0.025} ≈ 2, we have √n = 20, so n = 400.

Problem 26. (Confidence intervals) The 90% confidence interval is x̄ ± z_{0.05}/(2√n). Since z_{0.05} = 1.64 and n = 400, our confidence interval is

x̄ ± 1.64/(2√400) = x̄ ± 0.041.

If this is entirely above 0.5 we need x̄ − 0.041 > 0.5, i.e. x̄ > 0.541. Let T be the number out of 400 who prefer A. We have x̄ = T/400 > 0.541, so T > 216.4, i.e. T ≥ 217.

Problem 27. (Confidence intervals) A 95% confidence level means about 5% = 1/20 of the intervals will be wrong, so out of 40 you'd expect about 2 to be wrong.
With a probability p = 0.05 of being wrong, the number wrong follows a Binomial(40, p) distribution. This has expected value 2 and standard deviation √(40(0.05)(0.95)) = 1.38. Being wrong 10 times is (10 − 2)/1.38 ≈ 5.8 standard deviations from the mean. This would be surprising.
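Both the test statistic in Problem 23 and the critical value in Problem 24 can be checked in R (added; not in the original):

    obs <- matrix(c(70, 130, 30, 170), nrow = 2)
    chisq.test(obs, correct = FALSE)   # X-squared = 21.33, df = 1, p-value well below 0.01
    qt(0.975, 9)                       # 2.262, the t critical value in Problem 24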

Problem 28. (Confidence intervals) We have n = 27 and s = 5.86. If we fix a hypothesis for σ², we know

(n − 1)s²/σ² ~ χ²_{n−1}.

We used R to find the critical values. (Or use the χ² table.)

c025 = qchisq(0.975, 26) = 41.923
c975 = qchisq(0.025, 26) = 13.844

The 95% confidence interval for σ² is

[ (n−1)s²/c025, (n−1)s²/c975 ] = [ 26 · 5.86²/41.923, 26 · 5.86²/13.844 ] ≈ [21.297, 64.492].

We can take square roots to find the 95% confidence interval for σ: approximately [4.61, 8.03].

Problem 29. (a) Step 1. We have the point estimate of p: p̂ = 0.333.
Step 2. Use the computer to generate many (say 200) samples, each the same size as the original data. (These are called the bootstrap samples.)
Step 3. For each sample compute p* = 1/x̄* and δ* = p* − p̂.
Step 4. Sort the δ* and find the critical values δ_0.95 and δ_0.05. (Remember δ_0.95 is the 5th percentile, etc.)
Step 5. The 90% bootstrap confidence interval for p is [p̂ − δ_0.05, p̂ − δ_0.95].
(b) It's tricky to keep the sides straight here. We work slowly and carefully.
The 5th and 95th percentiles for x̄* are the 10th and 190th entries: 2.89 and 3.72. (Here again there is some ambiguity on which entries to use. We will accept using the 11th or the 191st entries, or some interpolation between these entries.)
So the 5th and 95th percentiles for p* = 1/x̄* are 1/3.72 = 0.2688 and 1/2.89 = 0.346.
So the 5th and 95th percentiles for δ* = p* − p̂ are 0.2688 − 0.333 = −0.064 and 0.346 − 0.333 = 0.013.
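A runnable sketch of a bootstrap interval of this type (added; not in the original). Since the original data set and its generation method are not reproduced here, the vector xdata below is hypothetical stand-in data and the resampling is the empirical version:

    set.seed(1)
    xdata <- rexp(100, rate = 1/3)      # hypothetical data with mean near 3
    phat  <- 1/mean(xdata)
    pstar <- replicate(200, 1/mean(sample(xdata, replace = TRUE)))
    delta <- sort(pstar - phat)
    c(phat - delta[190], phat - delta[10])   # 90% bootstrap CI for p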

These are also the δ_0.95 and δ_0.05 critical values. So the 90% CI for p is

[0.333 − 0.013, 0.333 + 0.064] = [0.320, 0.397].

Problem 30. (a) The steps are the same as in the previous problem, except the bootstrap samples are generated in a different way: by resampling the original data.
Step 1. We have the point estimate q̂_0.5 = 3.3.
Step 2. Use the computer to generate many (say 200) resamples of the original data, each the same size as the data.
Step 3. For each sample compute the median q*_0.5 and δ* = q*_0.5 − q̂_0.5.
Step 4. Sort the δ* and find the critical values δ_0.95 and δ_0.05. (Remember δ_0.95 is the 5th percentile, etc.)
Step 5. The 90% bootstrap confidence interval for q_0.5 is [q̂_0.5 − δ_0.05, q̂_0.5 − δ_0.95].
(b) This is very similar to the previous problem. We proceed slowly and carefully to get terms on the correct side of the inequalities.
The 5th and 95th percentiles for q*_0.5 are 2.89 and 3.72. So the 5th and 95th percentiles for δ* = q*_0.5 − q̂_0.5 are

[2.89 − 3.3, 3.72 − 3.3] = [−0.41, 0.42].

These are also the δ_0.95 and δ_0.05 critical values. So the 90% CI for q_0.5 is

[3.3 − 0.42, 3.3 + 0.41] = [2.88, 3.71].

Problem 31. The model is y_i = a + b x_i + ε_i, where ε_i is random error. We assume the errors are independent with mean 0 and the same variance for each i (homoscedastic). The total squared error is

E = Σ (y_i − a − b x_i)² = (2 − a − b)² + (1 − a − 2b)² + (3 − a − 3b)².

The least squares fit is given by the values of a and b which minimize E. We solve for them by setting the partial derivatives of E with respect to a and b to 0. In R we found that a = 1, b = 0.5.
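The least squares answer in Problem 31 is quick to confirm with R's lm() (added; not in the original):

    x <- c(1, 2, 3); y <- c(2, 1, 3)
    coef(lm(y ~ x))   # intercept 1.0, slope 0.5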