Smoking Habits. Moderate Smokers Heavy Smokers Total. Hypertension No Hypertension Total

Similar documents
Review. December 4 th, Review

7.1 Basic Properties of Confidence Intervals

CHAPTER 9, 10. Similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between one of two possibilities:

This does not cover everything on the final. Look at the posted practice problems for other topics.

Math 494: Mathematical Statistics

Problem 1 (20) Log-normal. f(x) Cauchy

STAT 285: Fall Semester Final Examination Solutions

Probability Density Functions

STAT100 Elementary Statistics and Probability

CHAPTER 8. Test Procedures is a rule, based on sample data, for deciding whether to reject H 0 and contains:

Definition: A random variable X is a real valued function that maps a sample space S into the space of real numbers R. X : S R

STAT100 Elementary Statistics and Probability

GEOMETRIC -discrete A discrete random variable R counts number of times needed before an event occurs

Semester , Example Exam 1

Hypothesis for Means and Proportions

Mathematical statistics

Tables Table A Table B Table C Table D Table E 675

Lecture 15: Inference Based on Two Samples

This paper is not to be removed from the Examination Halls

MAT 2377C FINAL EXAM PRACTICE

CBA4 is live in practice mode this week exam mode from Saturday!

An interval estimator of a parameter θ is of the form θl < θ < θu at a

The Components of a Statistical Hypothesis Testing Problem

EXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Math 151. Rumbos Fall Solutions to Review Problems for Final Exam

MTH U481 : SPRING 2009: PRACTICE PROBLEMS FOR FINAL

Problem Set 4 - Solutions

Applied Statistics I

Hypothesis testing: theory and methods

BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings

Chapter 8 - Statistical intervals for a single sample

Math 494: Mathematical Statistics

Hypothesis Testing Problem. TMS-062: Lecture 5 Hypotheses Testing. Alternative Hypotheses. Test Statistic

, 0 x < 2. a. Find the probability that the text is checked out for more than half an hour but less than an hour. = (1/2)2

Introduction to Statistics

S n = x + X 1 + X X n.

Sample Problems for the Final Exam

6 The normal distribution, the central limit theorem and random samples

Chapter 9. Hypothesis testing. 9.1 Introduction

3 Modeling Process Quality

Math 50: Final. 1. [13 points] It was found that 35 out of 300 famous people have the star sign Sagittarius.

Chapter 12: Inference about One Population

Statistics Ph.D. Qualifying Exam: Part II November 3, 2001

1 Statistical inference for a population mean

1 Binomial Probability [15 points]

Chapter 8 of Devore , H 1 :

MTMS Mathematical Statistics

SMAM 314 Practice Final Examination Winter 2003

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Exam 2 (KEY) July 20, 2009

Math 152. Rumbos Fall Solutions to Exam #2

Statistics 135 Fall 2008 Final Exam

HW on Ch Let X be a discrete random variable with V (X) = 8.6, then V (3X+5.6) is. V (3X + 5.6) = 3 2 V (X) = 9(8.6) = 77.4.

CONTINUOUS RANDOM VARIABLES

9-6. Testing the difference between proportions /20

Probability theory and inference statistics! Dr. Paola Grosso! SNE research group!! (preferred!)!!

Contents 1. Contents

MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1. MAT 2379, Introduction to Biostatistics

Chapter 9 Inferences from Two Samples

Solution: First note that the power function of the test is given as follows,

Confidence intervals and Hypothesis testing

OHSU OGI Class ECE-580-DOE :Statistical Process Control and Design of Experiments Steve Brainerd Basic Statistics Sample size?

STA 2101/442 Assignment 2 1

Summary of Chapters 7-9

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013

Point Estimation and Confidence Interval

Confidence Intervals, Testing and ANOVA Summary

1; (f) H 0 : = 55 db, H 1 : < 55.

Hypothesis testing for µ:

Inferences About Two Population Proportions

Final Examination December 17, 2012 MATH 323 P (Y = 5) =

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

10.4 Hypothesis Testing: Two Independent Samples Proportion

MATH20802: STATISTICAL METHODS EXAMPLES

An inferential procedure to use sample data to understand a population Procedures

Evaluating Hypotheses

MAT 2377 Practice Exam

ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12

Chapter 3 Discrete Random Variables

# of 6s # of times Test the null hypthesis that the dice are fair at α =.01 significance

MATH 240. Chapter 8 Outlines of Hypothesis Tests

The t-statistic. Student s t Test

Chapter 7: Statistical Inference (Two Samples)

STAT Exam Jam Solutions. Contents

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

LECTURE 12 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING

MATH 360. Probablity Final Examination December 21, 2011 (2:00 pm - 5:00 pm)

ST Introduction to Statistics for Engineers. Solutions to Sample Midterm for 2002

MATH 728 Homework 3. Oleksandr Pavlenko

BEST TESTS. Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized.

CHAPTER 14 THEORETICAL DISTRIBUTIONS

i=1 X i/n i=1 (X i X) 2 /(n 1). Find the constant c so that the statistic c(x X n+1 )/S has a t-distribution. If n = 8, determine k such that

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

Math 151. Rumbos Fall Solutions to Review Problems for Final Exam

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

Chapter 7 Comparison of two independent samples

Reference: Chapter 7 of Devore (8e)

Statistics 251: Statistical Methods

Tests for Population Proportion(s)

Transcription:

Math 3070. Treibergs Final Exam Name: December 7, 00. In an experiment to see how hypertension is related to smoking habits, the following data was taken on individuals. Test the hypothesis that the proportions of hypertension in the two groups are not equal. Use a.0 level of significance. State the null hypothesis, the test statistic, why your statistic is appropriate, the rejection region, your computation and conclusion. Smoking Habits Moderate Smokers Heavy Smokers Total Hypertension 36 30 66 No Hypertension 6 9 4 Total 6 49 Let X be the number of hypertensive individuals among n moderate smokers and X the number of hypertensives among n heavy smokers. Thus X i Bin(n i, p i ), where p i is the population proportion of hypertensives in the i-th group. The null and alternative hypotheses are H 0 : p p ; H a : p p. Because the proportions are equal in the null hypothesis, the appropriate statistic has pooled estimate for p i. Since we have large samples: n p 36 > n q 6 > 0 and n p 30 > n q 9 > 0, by rule of thumb, the z-test is appropriate. ˆp ˆp Z ( ) where p i p q n + n X i n i and p X + X n + n. The null hypothesis is rejected if Z z α/. For α.0, z.0.960. We find p 66 36, Z 6 30 49 66 4 ( 6 + 49 ).337. Since Z < z α/, we cannot reject the null hypothesis. The evidence is not strong that the proportion of hypertensives among moderate smokers is different than the proportion of hypertensives among heavy smokers.

. The Wanship Widgets Company is interested in evaluating its current inspection procedure on shipments of identical widgets. The procedure is to take a sample of from the and pass the shipment if no more than two are found to be defective. What is the probability that the shipment will be accepted even though actually 0% of the widgets in the shipment are defective? Let X be the number of defective widgets in the sample of n taken from N without replacement. Suppose the shipment has M defective widgets. Thus X is a hypergeometric variable with X Hypergeometric(n, N, M pn), where p is the proportion of defectives in the shipment. If p. then M 0. The probability of accepting the shipment is β P (X ) hyp(0,,, 0) + hyp(,,, 0) + hyp(,,, 0) )( 40 ) )( 40 ) )( 40 ) 4 3 ( 0 0 ( ) + ( 0 ( ) + ( 0 ( ).9. 3. Extensive records of Wanship Widgets Co. indicates the number of days to fill orders for their widget controller were 3 8 0 9 8 7 4 7 0 8 7 3 9 Does the data strongly indicate that such orders take longer than days to fill? State the null and alternative hypothesis. State the test statistic and rejection region. What assumptions are you making about the data and how well are they satisfied? Perform the computation and state your conclusion. [Hint: X 3.9 and S 3.34.] Let X be the number of days to fill the controller order. Let µ be the population mean of the number of days to fill the order. The null and alternative hypotheses are H 0 : µ ; H a : µ >. The test statistic is T X S/ n.

Since we have a small sample (n 40) and the population standard deviation σ is unknown, we use the t-test. The assumption we make in this problem is that the sample comes from a normal distribution. The normal PP-plot shows that the points line up pretty well so there is no strong indication that the distribution is not normal, although there is an S shape indicating heavy tails. (In fact the data was generated by a random t- distribution!) The null hypothesis is rejected at the α level of significance if t t ν,α where ν n 9 is degrees of freedom. We compute using the hint t 3.9 3.34/ 0.33. From table A.8, for ν 9 d.f., P (T >.).0 and P (T >.6).009. Thus, interpolating, P P (T >.33).00. From table A., t 9,.0.093 and t 9,.0.39 so.0 < P <.0. Thus at the α.0 significance level, we reject the null hypothesis: there is strong evidence that it takes longer than days to fill the controller order. 4. Let X denote the number of years that a Wanship Widget operates before it ceases to function. Assume that the probability density function (pdf) is { f(x) 3 e x/3, if x 0; 0, if x < 0. Find the cumulative density function (cdf) for X. Find the probability that the widget operates at least four years. Find the probability that the widget continues to operate at least another four years given that it has been functioning for two years. Observe that X is an exponential variable X Exponential(λ 3 ). The cdf is by definition F (x) x { x f(x) dx 0 3 e x/3 dx e x/3, if x 0; 0, if x < 0. The probability that the widget lasts at least four years is P (X 4) P (X < 4) F (4) ( e 4/3 ).64. The probability that the widget continues to operate at least another four years given that it has been functioning for two years is by the conditional probability formula P (X 6 X ) P ({X 6} {X }) P (X ) P (X 6) P (X < 6) P (X ) P (X < ) F (6) F () ( e 6/3 ) ( e /3 ) e 4/3.64. Equality in (b.) and (c.) is known as the memoryless property of exponential variables.. A random sample of Dan s customers shows that of 30 shoppers, of them regularly used cents-off coupons. Construct a 99% one-sided upper confidence interval for the proportion of customers who use coupons. Finish the following statement that describes your confidence interval There is a 99% chance that... Let X be the number of coupon users. X Binomial(n, p), where p is the population proportion of coupon users. Since we have a small sample, nˆq 9 < 0, we cannot use the traditional CI given by equation 7.. Instead we use the unrestricted one-sided upper confidence bound ˆp + z α ˆpˆq p < n + z α n + z α 4n + z α n 3

Since z.0.36, ˆp /30.7, ˆq.3, n 30 we find p <.7 +.36 60 +.36 (.7)(.3) +.36 30 30 +.36 3600.8. The following statement describes the interval. There is a 99 % chance that our interval has captured p. The statement There is a 99% chance that p <.8. is vague and imprecise. On the one hand it can be interpreted as my statement, but it can also be interpreted as.99 P (p <.8) which is false because although in this statement we intend that.8 be the random variable, it is a fixed number. 6. Suppose that 0% of all shafts used in widget construction are nonconforming. A group of 00 shafts is randomly selected. What is the exact probability that not more than 30 are nonconforming? Give the formula (as a sum) but do not evaluate. Compute the probability using the best approximation. Why is your approximation valid? Let X denote the number of nonconforming shafts in the sample os n 00. X Binomial(n 00, p.0), where p is the population proportion of nonconforming shafts. Then the probability that not more than 30 are nonconforming is P P (X 30) Binomial(30, 00,.) 30 i0 ( ) 00 (.) i (.9) 00 i. i The rule of thumb for Poisson approximation is not satisfied, although the rule for normal approximation is: nq > np 00 (.) 0 > 0. The best approximation includes the continuity correction ( ) x +. np P (X x) Φ npq Thus we find ( ) 30 +. 0 P (X 30) Φ Φ(.47). 00(.)(.9) Since Φ(.47).993 and Φ(.48).9934, by interpolation, Φ(.47).9933. 7. To compare two different manufacturers of widget shafts, two samples of sizes n and n were ordered from the first and second manufacturers. Suppose X of the first sample were nonconforming and X of the second sample were nonconforming. Show that ˆθ X X is an unbiased estimator for θ p p, where p i is the population proportion n n of nonconforming shafts from manufacturer i. Derive the formula for standard error of ˆθ. What assumptions are you making in your derivation? The statistic ˆθ is unbiased if E(ˆθ) θ. We are given that X i Binomial(n i, p i ) are binomial variables. Hence E(X i ) p i n i and V (X i ) n i p i q i. Using the linearity of expectations ( X E(ˆθ) E X ) E(X ) E(X ) n p n p p p. n n n n n n 4

The standard error is σ ˆθ V (ˆθ). Let us assume that X and X are independent. Then the formula for variance of a linear combination yields ( X σˆθ V X ) V (X ) n n n + V (X ) n p q n n + n p q p q n + p q. n n 8. In the study Vitamin C Retention in Reconstituted Frozen Orange Juice, (VPI Department of Human Nutrition and Foods, 97), levels of vitamin C were measured for twelve samples at two different time periods (3,7 days) between when OJ concentrate was blended and when it was tested. Response is mg/l ascorbic acid. Test whether the data shows that there is a difference in mean vitamin levels at different times. State the assumptions. State the null and alternative hypotheses, test statistic, rejection region and your conclusion. Sample 3 days 7 days D ------ ------ ------ ----- 49.4 4.7 6.7 49. 48.8 0.4 3 4.8 40.4.4 4 3. 47.6.6 48.8 49. -0.4 6 44.0 44.0 0.0 7 44.0 4.0.0 8 4.4 43. -0.8 9 48.0 48. -0. 0 47.0 43.3 3.7 48. 4. 3.0 49.6 47.6.0 Let X i be the acid level for the i-th sample measured at three days and Y i be the acid level of the same sample measured at seven days. X i and Y i are therefore not independent, and we must use the paired test. We have recorded the random variable D i X i Y i in the table. The sample is small (n 40) and σ D is unknown so we use a paired t-test. We assume that D i is a random sample taken from a normal distribution D i N(µ D, σ D ). The null and alternative hypotheses are The test statistic is H 0 : µ D 0; H a : µ D 0. T D S D / n. The null hypothesis is rejected at the α level of significance if t t ν,α/ since it is a two-tailed test where ν n is the degrees of freedom. The calculator gives D.008333333 and S D.440364478. Hence Thus, interpolating Table A.8, t.008.440/.8. P P (T >.8).008. Thus the P -value is.06. Similarly we could have used Table A.: t,.00.78 and t,.00 3.06 so interpolating, t,.008.8. Thus the mean ascorbic acid level at the two times is significantly different.

9. The Wanship Widgets Company wishes to test whether the average tensile strength of steel wire exceeds the tensile strength of iron wire by at least 0 kilograms. To test the claim, pieces of each type of wire are tested under similar conditions. The steel wire had an average tensile strength of 88.9 kilograms with a standard deviation of 6.8 kilograms, while the iron wire had an average strength of 76.9 kilograms with a standard deviation of.6 kilograms. State the null and alternative hypotheses, the test statistic, and the rejection region. Test at the.0 level of significance and state your conclusion. Why is your statistic valid? Compute the probability that the test will reject the null hypothesis assuming that the mean tensile strength of the steel wire is actually 4 kilograms higher than that of the iron. Assume the same standard deviations. Let X i be be the tensile strength of the steel wire and Y j the tensile strength of the iron wire. We are assuming that all the X i s and Y j s are independent, that X i comes from a distribution with population mean µ and population standard deviation σ and Y j comes from a distribution with population mean µ and population standard deviation σ. Since both populations are large, n n > 40, we may use the two sample z-test, even though the σ i s are unknown. We also cannot assume that σ σ so we must not pool the variance computation. The null and alternative hypotheses are The test statistic is H 0 : µ µ 0 0; H a : µ µ > 0. Z X Ȳ 0. S n + S n The null hypothesis is rejected at the α level of significance if z z α since it is a one-tailed test. We have for α.0, z.0.64. Computing, 88.9 76.9 0.0 6.8 + 76.9.679. Thus.679 >.64 so the null hypothesis is rejected: there is strong evidence that the mean tensile strength of steel wire exceeds the mean tensile strength of iron wire by more than 0 kilograms. If the actual difference is 4 kilograms, then the probability of accepting the null hypothesis is β( ) P (Z < z α µ µ ) ( P X Ȳ S 0 + z α n + S n ) µ µ P Z X Ȳ 0 + z α S n + S S n n + S µ µ n Φ 0 4 +.64 Φ(.74).043. 6.8 + 76.9 We have interpolated Table A.: Φ(.7).0436 and Φ(.7).047. 6