Statistics 20: Final Exam Solutions Summer Session 2007

1. (20 points) Testing for Diabetes.

(a) (3 points) Give estimates for the sensitivity of Test I and of Test II.

Solution: 156 of the 223 patients tested positive by Test I. Hence the estimated sensitivity of Test I is $156/223 = 0.6996$. Similarly, the estimated sensitivity of Test II is $200/223 = 0.8969$.

(b) (10 points) Are the results of the two tests independent? Use a $\chi^2$-test to test for independence.

Solution: We want to test $H_0$: Test I and Test II are independent vs. $H_a$: the two tests are not independent. The expected counts for the four cells under the null hypothesis are as follows:

                     Test II positive    Test II negative
Test I positive      139.9103            16.0897
Test I negative      60.0897             6.9103

The value of the $\chi^2$ test statistic for the above hypothesis is

$$X^2 = \frac{(142 - 139.9103)^2}{139.9103} + \frac{(14 - 16.0897)^2}{16.0897} + \frac{(58 - 60.0897)^2}{60.0897} + \frac{(9 - 6.9103)^2}{6.9103} = 1.0072,$$

and the degrees of freedom are $(2 - 1)(2 - 1) = 1$. From the $\chi^2$ table, the P-value for this test is more than 0.25. Hence we fail to reject the null hypothesis that the two tests are independent at any reasonable level.

(c) (12 points) We assume that Test I has sensitivity 72% and specificity 80%. We also assume that 5% of our population has diabetes.

i. (4 points) What is the probability that a randomly selected individual from the population will test positive by Test I?

Solution: By the law of total probability,

$$P(\text{positive}) = P(\text{positive} \mid \text{diabetes})\,P(\text{diabetes}) + P(\text{positive} \mid \text{no diabetes})\,P(\text{no diabetes})$$

$$= \frac{72}{100} \cdot \frac{5}{100} + \left(1 - \frac{80}{100}\right) \cdot \frac{95}{100} = \frac{72 \cdot 5 + 20 \cdot 95}{10000} = 0.226.$$
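The expected counts and the test statistic in Problem 1(b) can be checked numerically. The following is a minimal sketch, not part of the original solution: the observed counts (142, 14, 58, 9) are read off the numerators of the test statistic above, the function name `chi_square_independence` is hypothetical, and the P-value line relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal.

```python
import math

# Observed counts taken from the solution to 1(b):
# rows = Test I (positive, negative), columns = Test II (positive, negative).
observed = [[142, 14],
            [58, 9]]


def chi_square_independence(table):
    """Return (expected counts, X^2 statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof


expected, x2, dof = chi_square_independence(observed)

# For 1 degree of freedom only: P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2.0))

print([[round(e, 4) for e in row] for row in expected])
print(round(x2, 4), dof, round(p_value, 4))
```

Up to rounding, this reproduces the expected counts and the statistic 1.0072 quoted above, and the computed P-value is consistent with the table lookup "more than 0.25".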

ii. (4 points) What is the probability that a randomly selected individual from the population has diabetes given that his/her Test I result is positive?

Solution:

$$P(\text{diabetes and positive}) = P(\text{positive} \mid \text{diabetes})\,P(\text{diabetes}) = \frac{72}{100} \cdot \frac{5}{100} = 0.036.$$

Hence, the conditional probability that a randomly selected individual from the population has diabetes given that his/her Test I result is positive is

$$\frac{0.036}{0.226} = 0.1593.$$

iii. (4 points) What is the probability that a randomly selected individual from the population will test positive by Test I given that he/she has diabetes?

Solution: This conditional probability is, by definition, the sensitivity of Test I, which is 0.72.

2. (10 points)

(a) (5 points) Find $\alpha$ = the probability of a Type I error, that is, the probability that $H_0$ is rejected when $H_0$ is actually true.

Solution: $\alpha = P(\bar{x} > 0 \mid \mu = 0) = 0.5$, since $\bar{x}$ follows a normal distribution with mean $\mu$ and variance $\sigma^2/n$, which is symmetric about 0 when $\mu = 0$.

(b) (5 points) Find the power of this test when $\mu = 0.2$.

Solution: The power of this test when $\mu = 0.2$ is

$$P(\bar{x} > 0 \mid \mu = 0.2) = P\left(\frac{\bar{x} - 0.2}{\sigma/\sqrt{n}} > \frac{-0.2}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{-0.2}{1/\sqrt{16}}\right) = P(Z > -0.8) = 0.7881.$$

3. (20 points) Voting.

(a) (5 points) Let $p$ be the probability that the answer of a surveyed voter is "yes". What is the expression of $p$ in terms of $q$?

Solution:

$$p = P(\text{answer is yes}) = P(\text{yes} \mid \text{voted})\,P(\text{voted}) + P(\text{yes} \mid \text{did not vote})\,P(\text{did not vote})$$

$$= 1 \times 0.66 + q \times 0.34 = 0.66 + 0.34q.$$
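The conditional probability in Problem 1(c) and the error probabilities in Problem 2 can be verified with a few lines of standard-library Python. This is only a sketch under stated assumptions: $\sigma = 1$ and $n = 16$ are inferred from the arithmetic in the solution to 2(b), the rejection rule "reject when the sample mean exceeds 0" is taken from the solution to 2(a), and the helper names `normal_cdf` and `rejection_probability` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Problem 1(c): Bayes' rule with the stated sensitivity, specificity, prevalence.
sensitivity, specificity, prevalence = 0.72, 0.80, 0.05
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_diabetes_given_positive = sensitivity * prevalence / p_positive


# Problem 2: probability of rejecting H0 (sample mean > cutoff) when the true mean is mu.
# sigma = 1 and n = 16 are assumptions read off the solution's arithmetic.
def rejection_probability(mu, sigma=1.0, n=16, cutoff=0.0):
    standard_error = sigma / math.sqrt(n)
    return 1.0 - normal_cdf((cutoff - mu) / standard_error)


alpha = rejection_probability(mu=0.0)   # Type I error probability
power = rejection_probability(mu=0.2)   # power at mu = 0.2

print(round(p_positive, 3), round(p_diabetes_given_positive, 4))
print(round(alpha, 4), round(power, 4))
```

Evaluating the same function at $\mu = 0$ and at $\mu = 0.2$ makes explicit that $\alpha$ and the power are the same rejection probability computed under different true means.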

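(b) (5 points)

Solution: We know that for $\hat{p} = X/n$, $E(\hat{p}) = p$ and $\mathrm{Var}(\hat{p}) = p(1 - p)/n$. Solving $p = 0.66 + 0.34q$ for $q$ gives $q = -1.94 + 2.94p$, so we take $\hat{q} = -1.94 + 2.94\hat{p}$. Hence

$$E(\hat{q}) = E(-1.94 + 2.94\hat{p}) = -1.94 + 2.94\,E(\hat{p}) = -1.94 + 2.94p = q$$

and

$$\mathrm{Std}(\hat{q}) = \mathrm{Std}(-1.94 + 2.94\hat{p}) = 2.94\,\mathrm{Std}(\hat{p}) = 2.94\sqrt{\frac{p(1 - p)}{n}}.$$

Now using the fact that $q = -1.94 + 2.94p$, we have $p = \dfrac{q + 1.94}{2.94}$. Hence,

$$2.94\sqrt{\frac{p(1 - p)}{n}} = 2.94\sqrt{\frac{(q + 1.94)(2.94 - q - 1.94)}{(2.94)^2\, n}} = \sqrt{\frac{(1.94 + q)(1 - q)}{n}}.$$

Hence the standard deviation of $\hat{q}$ is $\sqrt{(1.94 + q)(1 - q)/n}$.

(c) (5 points) Suppose $X = 80$ was observed with $n = 100$. What is your estimate for $q$? Give an approximate 99% confidence interval for $q$.

Solution: Here $\hat{p} = X/n = 80/100 = 0.8$. So the estimate for $q$ is

$$\hat{q} = -1.94 + 2.94\hat{p} = -1.94 + 2.94 \times 0.8 = 0.412.$$

The estimated standard deviation of $\hat{q}$ is

$$SE_{\hat{q}} = 2.94\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 2.94\sqrt{\frac{0.8 \times 0.2}{100}} = 0.1176.$$

Now $(\hat{q} - q)/SE_{\hat{q}}$ approximately follows the $N(0, 1)$ distribution. Hence a 99% confidence interval for $q$ is $\hat{q} \pm z^* SE_{\hat{q}}$, where $z^* = 2.576$ is the 99% normal cutoff point. So the 99% C.I. is

$$0.412 \pm 2.576 \times 0.1176 = 0.412 \pm 0.303 = (0.109,\ 0.715).$$

(d) (5 points)

Solution: The margin of error in a 99% confidence interval for $q$ is

$$2.576 \times SE_{\hat{q}} = 2.576 \times 2.94\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$

Since $p = 0.66 + 0.34q \ge 0.66$ and $p(1 - p)$ is decreasing for $p \ge 0.5$, the margin of error is at most

$$2.576 \times 2.94\sqrt{\frac{0.66 \times 0.34}{n}} = \sqrt{\frac{12.871}{n}}.$$

Hence

$$\sqrt{\frac{12.871}{n}} \le 0.05 \implies n \ge \frac{12.871}{(0.05)^2} = 5148.4 \implies n \ge 5149.$$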
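The arithmetic in Problem 3(c) and 3(d) can be reproduced as follows. This is a sketch rather than the official solution: it uses the rounded coefficients $-1.94$ and $2.94$ and the cutoff 2.576 exactly as they appear above, treats 0.05 as the target margin of error (as the solution to (d) does), and the function names are hypothetical.

```python
import math

Z_99 = 2.576                     # 99% normal cutoff used in the solution
INTERCEPT, SLOPE = -1.94, 2.94   # rounded coefficients from q = -1.94 + 2.94 p


def estimate_q(successes, n):
    """Return (q_hat, standard error of q_hat) from X "yes" answers out of n."""
    p_hat = successes / n
    q_hat = INTERCEPT + SLOPE * p_hat
    se_q = SLOPE * math.sqrt(p_hat * (1 - p_hat) / n)
    return q_hat, se_q


def confidence_interval(q_hat, se_q, z=Z_99):
    margin = z * se_q
    return q_hat - margin, q_hat + margin


def required_sample_size(target_margin, worst_case_p=0.66, z=Z_99):
    """Smallest n whose worst-case margin of error is at most target_margin."""
    bound = (z * SLOPE) ** 2 * worst_case_p * (1 - worst_case_p)
    return math.ceil(bound / target_margin ** 2)


q_hat, se_q = estimate_q(80, 100)
lower, upper = confidence_interval(q_hat, se_q)
n_needed = required_sample_size(0.05)

print(round(q_hat, 3), round(se_q, 4))
print(round(lower, 3), round(upper, 3))
print(n_needed)
```

The worst case is taken at $p = 0.66$ because that is the smallest value $p$ can take, and $p(1 - p)$ shrinks as $p$ moves further above 0.5.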

4. (15 points) Plates for Glass.

(a) (6 points)

Solution: An unbiased estimate for $\mu_X - \mu_Y$ is $\bar{X} - \bar{Y} = 469 - 463 = 6$. Here our assumptions are:

The X and Y samples are independent.
The population variances are equal.

So an unbiased estimate for the common variance of X and Y is the pooled variance

$$s_p^2 = \frac{(5 - 1)\,s_x^2 + (5 - 1)\,s_y^2}{5 + 5 - 2} = \frac{4 \times 839.5 + 4 \times 916.5}{8} = 878,$$

i.e. $s_p = 29.63$. So the estimated standard error of $\bar{X} - \bar{Y}$ is

$$SE_{\bar{X} - \bar{Y}} = s_p\sqrt{\frac{1}{5} + \frac{1}{5}} = 29.63\sqrt{0.4} = 18.740.$$

Now $\dfrac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{SE_{\bar{X} - \bar{Y}}}$ follows a t-distribution with $5 + 5 - 2 = 8$ degrees of freedom. The 95% cutoff point for the t-distribution with 8 d.f. is $t^* = 2.307$. Hence a 95% confidence interval for $\mu_X - \mu_Y$ is

$$(\bar{X} - \bar{Y}) \pm t^*\, SE_{\bar{X} - \bar{Y}} = 6 \pm 2.307 \times 18.74 = (-37.233,\ 49.233).$$

Since 0 is in the 95% confidence interval for $\mu_X - \mu_Y$, we fail to reject the null hypothesis that the averages for the two processes are the same at the 5% significance level.

(b) (6 points)

Solution: Here an unbiased estimate for $\mu_X - \mu_Y$ is again $\bar{X} - \bar{Y} = 469 - 463 = 6$. But the assumptions are:

The X and Y samples are paired.
The pairs $(X_i, Y_i)$ are independent.

So the estimated standard error of $\bar{X} - \bar{Y}$ is

$$SE_{\bar{X} - \bar{Y}} = \frac{s_{\text{difference}}}{\sqrt{5}} = \sqrt{\frac{21.5}{5}} = 2.074.$$

Now $\dfrac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{SE_{\bar{X} - \bar{Y}}}$ follows a t-distribution with $5 - 1 = 4$ degrees of freedom. The 95% cutoff point for the t-distribution with 4 d.f. is $t^* = 2.777$. Hence a 95% confidence interval for $\mu_X - \mu_Y$ is

$$(\bar{X} - \bar{Y}) \pm t^*\, SE_{\bar{X} - \bar{Y}} = 6 \pm 2.777 \times 2.074 = (0.241,\ 11.759).$$

Since 0 is not in the 95% confidence interval for $\mu_X - \mu_Y$, we reject the null hypothesis that the averages for the two processes are the same at the 5% significance level.
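The two intervals in Problem 4(a) and 4(b) differ only in how the standard error and degrees of freedom are obtained, which a short sketch makes concrete. The summary statistics (sample variances 839.5 and 916.5, variance of the differences 21.5, $n = 5$) and the t cutoffs 2.307 and 2.777 are copied from the solution above rather than recomputed, and the function names `pooled_ci` and `paired_ci` are hypothetical.

```python
import math


def pooled_ci(diff, var_x, var_y, n_x, n_y, t_cutoff):
    """CI for mu_X - mu_Y assuming independent samples with equal variances."""
    pooled_var = ((n_x - 1) * var_x + (n_y - 1) * var_y) / (n_x + n_y - 2)
    se = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return diff - t_cutoff * se, diff + t_cutoff * se


def paired_ci(diff, var_diff, n_pairs, t_cutoff):
    """CI for mu_X - mu_Y from paired data, using the variance of the differences."""
    se = math.sqrt(var_diff / n_pairs)
    return diff - t_cutoff * se, diff + t_cutoff * se


mean_diff = 469 - 463

# Cutoffs taken from the solution text (8 d.f. and 4 d.f. respectively).
independent = pooled_ci(mean_diff, 839.5, 916.5, 5, 5, t_cutoff=2.307)
paired = paired_ci(mean_diff, 21.5, 5, t_cutoff=2.777)

for name, (lower, upper) in [("pooled", independent), ("paired", paired)]:
    contains_zero = lower <= 0 <= upper
    print(name, round(lower, 3), round(upper, 3), "contains 0:", contains_zero)
```

The pooled interval contains 0 while the paired interval does not, matching the opposite test conclusions reached in (a) and (b); small differences in the third decimal place come from the solution rounding the standard error before multiplying.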

(c) (3 points)

Solution: (Less variance due to positive correlation between the pairs. Removal of the effect of lurking variables.)

5. (20 points) Prediction of ozone level.

(a) (3 points) Write down the multiple regression equation.

Solution: The estimated regression equation is

$$\widehat{\text{OZONE}} = 388.4121 - 0.1957033\ \text{YEAR} + 0.0342877\ \text{RAIN}.$$

(b) (7 points)

Solution: The missing value for the t-statistic for RAIN is

$$\frac{b_{\text{RAIN}}}{se_{b_{\text{RAIN}}}} = \frac{0.0342877}{0.0096548} = 3.5514.$$

The error degrees of freedom are $n - p - 1 = 13 - 2 - 1 = 10$. So $\dfrac{b_{\text{RAIN}} - \beta_{\text{RAIN}}}{se_{b_{\text{RAIN}}}}$ follows a t-distribution with 10 degrees of freedom. The 95% cutoff point for the t-distribution with 10 d.f. is $t^* = 2.229$. Hence a 95% confidence interval for the regression parameter for rain ($\beta_{\text{RAIN}}$) is

$$b_{\text{RAIN}} \pm t^*\, se_{b_{\text{RAIN}}} = 0.0342877 \pm 2.229 \times 0.0096548 = (0.012767,\ 0.055808).$$

(c) (4 points)

Solution: For the regression model the degrees of freedom are $p = 2$ and the regression mean square is MSR $= 10.3680841/2 = 5.18404205$. The error degrees of freedom are $12 - 2 = 10$ and the mean square error is MSE $= 1.03960755/10 = 0.103960755$.

(d) (6 points)

Solution: The value of the F-statistic used for this test is

$$F = \frac{\text{MSR}}{\text{MSE}} = \frac{5.18404205}{0.103960755} = 49.8654.$$

The degrees of freedom for the F-statistic are $(p,\ n - 1 - p) = (2, 10)$.

6. (10 points) For the items below, select True or False.

(a) (2 points) If the correlation between two random variables X and Y is negative, then $\mathrm{Var}(X + Y) < \mathrm{Var}(X - Y)$.

Solution: TRUE. Note that $\mathrm{Var}(X \pm Y) = \sigma_X^2 + \sigma_Y^2 \pm 2\rho\sigma_X\sigma_Y$. Hence if the correlation $\rho$ between X and Y is negative, we have $\mathrm{Var}(X + Y) - \mathrm{Var}(X - Y) = 4\rho\sigma_X\sigma_Y < 0$.
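The regression quantities in Problem 5(b) through 5(d) all follow from a handful of numbers in the regression output, so they are easy to recompute. The sketch below assumes only the values quoted in the solution (coefficient and standard error for RAIN, the two sums of squares, $n = 13$, $p = 2$) and reuses the solution's t cutoff of 2.229; the variable names are hypothetical.

```python
# Values quoted in the solution to Problem 5.
b_rain, se_rain = 0.0342877, 0.0096548
ss_regression, ss_error = 10.3680841, 1.03960755
n, p = 13, 2
t_cutoff = 2.229  # 95% cutoff for 10 d.f., as used in the solution

# (b) t-statistic and confidence interval for the RAIN coefficient.
t_statistic = b_rain / se_rain
df_error = n - p - 1
ci_rain = (b_rain - t_cutoff * se_rain, b_rain + t_cutoff * se_rain)

# (c) mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / p
ms_error = ss_error / df_error

# (d) overall F-statistic with (p, n - 1 - p) degrees of freedom.
f_statistic = ms_regression / ms_error

print(round(t_statistic, 4), df_error)
print(tuple(round(x, 6) for x in ci_rain))
print(round(ms_regression, 8), round(ms_error, 9))
print(round(f_statistic, 4), (p, df_error))
```

The interval for the RAIN coefficient excludes 0, which agrees with the t-statistic of about 3.55 exceeding the cutoff.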

(b) (2 points) For two random variables X and Y, if $E(X - Y) = E(X + Y)$, then $E(Y)$ must be equal to 0.

Solution: TRUE. $E(X - Y) = E(X + Y)$ implies $E(X) - E(Y) = E(X) + E(Y)$, so $E(Y) = 0$.

(c) (2 points) If we fail to reject a null hypothesis $H_0$ at the 0.05 significance level, then there is a 95% probability that $H_0$ is true.

Solution: FALSE. $H_0$ is either true or false, so the probability of $H_0$ being true is 0 or 1.

(d) (2 points) For a specified sample size, the margin of error for a confidence interval for a population mean $\mu$ increases as the confidence level increases.

Solution: TRUE. Note that the margin of error is $z^*\sigma/\sqrt{n}$ for known variance and $t^* s/\sqrt{n}$ for unknown variance, and the cutoff $z^*$ or $t^*$ increases as the confidence level increases.

(e) (2 points) In order to calculate a P-value, you must know the distribution of the test statistic under the alternative hypothesis $H_a$.

Solution: FALSE. We must know the distribution of the test statistic under the null hypothesis $H_0$.