
CHAPTER 9, 10

Hypothesis Testing

Hypothesis testing is similar to a courtroom trial. In trying a person for a crime, the jury needs to decide between two possibilities:

- The person is guilty.
- The person is innocent.

To begin with, the person is assumed innocent. The prosecutor presents evidence, trying to convince the jury to reject the original assumption of innocence and conclude that the person is guilty.

Parts of a Statistical Test

- The null hypothesis, $H_0$
- The alternative hypothesis, $H_a$
- The test statistic and its p-value
- The rejection region
- The conclusion

The two competing hypotheses are the alternative hypothesis $H_a$, generally the hypothesis that the researcher wishes to support, and the null hypothesis $H_0$, a contradiction of the alternative hypothesis.

The researcher uses the sample data to decide whether to:

- Reject $H_0$ and conclude that $H_a$ is true.
- Accept (do not reject) $H_0$ as true.

Test statistic: A single number calculated from the sample data.

p-value: A probability calculated using the test statistic.

Rejection region: The set of values of the test statistic that support the alternative hypothesis and lead to rejecting $H_0$.

Acceptance region: The set of values of the test statistic that support the null hypothesis.

Critical values: The values that separate the acceptance and rejection regions.

A Type I error for a statistical test is the error of rejecting the null hypothesis when it is true. The level of significance (significance level) $\alpha$ for a statistical test of hypothesis is

$\alpha = P(\text{Type I error}) = P(\text{falsely rejecting } H_0) = P(\text{rejecting } H_0 \text{ when it is true})$

A Type II error for a statistical test is the error of accepting the null hypothesis when it is false.

$\beta = P(\text{Type II error}) = P(\text{falsely accepting } H_0) = P(\text{accepting } H_0 \text{ when it is false})$

The power of a statistical test, given as

$1 - \beta = P(\text{reject } H_0 \text{ when } H_a \text{ is true})$,

measures the ability of the test to perform as required.

Large-Sample Statistical Test for $\mu$

1. Null hypothesis: $H_0: \mu = \mu_0$
2. Alternative hypothesis:
   One-Tailed Test: $H_a: \mu > \mu_0$ (or $H_a: \mu < \mu_0$)
   Two-Tailed Test: $H_a: \mu \neq \mu_0$
3. Test statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, estimated as $z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $z > z_{\alpha}$ (or $z < -z_{\alpha}$ when the alternative hypothesis is $H_a: \mu < \mu_0$)

   Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$

Assumptions: The $n$ observations in the sample are randomly selected from the population and $n$ is large ($n \geq 30$).

p-value: The p-value or observed significance level of a statistical test is the smallest value of $\alpha$ for which $H_0$ can be rejected. It is the actual risk of committing a Type I error, if $H_0$ is rejected based on the observed value of the test statistic.

The p-value measures the strength of the evidence against $H_0$. If the p-value is less than or equal to a preassigned significance level $\alpha$, then the null hypothesis can be rejected, and you can report that the results are statistically significant at level $\alpha$.

Small-Sample Hypothesis Test for $\mu$

1. Null hypothesis: $H_0: \mu = \mu_0$
2. Alternative hypothesis:
   One-Tailed Test: $H_a: \mu > \mu_0$ (or $H_a: \mu < \mu_0$)
   Two-Tailed Test: $H_a: \mu \neq \mu_0$
3. Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $t > t_{\alpha}$ (or $t < -t_{\alpha}$ when the alternative hypothesis is $H_a: \mu < \mu_0$)
   Two-Tailed Test: $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
   or when p-value $< \alpha$
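The one-sample statistic above has the same form in the large-sample and small-sample tests, so it can be sketched once in Python. This is a minimal illustration only: the function names and the summary numbers are hypothetical and do not come from these notes. The p-value here uses the standard normal distribution, so it applies to the large-sample case; in the small-sample case the same statistic would instead be compared with a tabled $t$ critical value.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_statistic(xbar, s, n, mu0):
    """Return (xbar - mu0) / (s / sqrt(n)).

    Read the result as z when n is large (n >= 30) and as t otherwise.
    """
    return (xbar - mu0) / (s / sqrt(n))


def z_p_value(z, tail):
    """p-value of a z statistic; tail is 'upper', 'lower', or 'two'."""
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if tail == "upper":        # Ha: mu > mu0
        return 1 - std_normal.cdf(z)
    if tail == "lower":        # Ha: mu < mu0
        return std_normal.cdf(z)
    if tail == "two":          # Ha: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Hypothetical inputs (not from the notes): H0: mu = 50 versus Ha: mu > 50
z = one_sample_statistic(xbar=52, s=6, n=36, mu0=50)
p = z_p_value(z, "upper")
alpha = 0.05
reject = p <= alpha  # reject H0 when the p-value is at most alpha
print(f"z = {z:.2f}, p-value = {p:.4f}, reject H0: {reject}")
```

Comparing the p-value with $\alpha$ and comparing $z$ with $z_{\alpha}$ lead to the same decision; the p-value form is used here because it does not require a table lookup.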

The critical values of $t$ are based on $(n - 1)$ degrees of freedom.

Assumptions: The sample is randomly selected from a normally distributed population.

Large-Sample Statistical Test for $p$

1. Null hypothesis: $H_0: p = p_0$
2. Alternative hypothesis:
   One-Tailed Test: $H_a: p > p_0$ (or $H_a: p < p_0$)
   Two-Tailed Test: $H_a: p \neq p_0$
3. Test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$ with $\hat{p} = \dfrac{x}{n}$
4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $z > z_{\alpha}$ (or $z < -z_{\alpha}$ when the alternative hypothesis is $H_a: p < p_0$)
   Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
   or when p-value $< \alpha$

Assumptions: The sampling satisfies the assumptions of a binomial experiment and $n$ is large enough so that the sampling distribution of $\hat{p}$ can be approximated by a normal distribution ($np_0 > 5$ and $nq_0 > 5$).

Examples:

1. Suppose a scheduled flight must average at least 60% occupancy in order to be profitable, and an examination of the occupancy rate for 120 flights from Atlanta to Dallas showed a mean occupancy per flight of 58% and a standard deviation of 11%.
   a. If $\mu$ is the mean occupancy per flight and if the company wishes to determine whether or

not this scheduled flight is unprofitable, give the alternative and the null hypotheses for the test.
   b. Does the alternative hypothesis in part a imply a one-tailed or a two-tailed test?
   c. Do the occupancy data for the 120 flights suggest that this scheduled flight is unprofitable?

2. A random sample of 120 observations was selected from a binomial population, and 72 successes were observed. Do the data provide sufficient evidence to indicate that $p$ is greater than 0.5?

3. The following $n = 10$ observations are a sample from a normal population:
   7.4, 7.1, 6.5, 7.5, 7.6, 6.3, 6.9, 7.7, 6.5, 7.0
   a. Find a 99% upper one-sided confidence bound for the population mean $\mu$.
   b. Test $H_0: \mu = 7.5$ versus $H_a: \mu < 7.5$. Use $\alpha = 0.01$.
   c. Do the results of part a support your conclusion in part b?

Large-Sample Statistical Test for $(\mu_1 - \mu_2)$

1. Null hypothesis: $H_0: (\mu_1 - \mu_2) = D_0$, where $D_0$ is some specific difference that you wish to test.
2. Alternative hypothesis:
   One-Tailed Test: $H_a: (\mu_1 - \mu_2) > D_0$ or $H_a: (\mu_1 - \mu_2) < D_0$
   Two-Tailed Test: $H_a: (\mu_1 - \mu_2) \neq D_0$
3. Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{SE} = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $z > z_{\alpha}$ (or $z < -z_{\alpha}$ when the alternative hypothesis is $H_a: (\mu_1 - \mu_2) < D_0$)
   Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
   or when p-value $< \alpha$
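The large-sample two-mean statistic above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: the function name `two_sample_z` and the summary statistics are hypothetical and are not taken from these notes. The standard library's `statistics.NormalDist` supplies the normal probabilities and the critical value $z_{\alpha/2}$.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, s1, n1, xbar2, s2, n2, d0=0.0):
    """Large-sample z statistic for H0: (mu1 - mu2) = d0."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of xbar1 - xbar2
    return ((xbar1 - xbar2) - d0) / se


# Hypothetical summary statistics (not from the notes), two-tailed test
z2 = two_sample_z(xbar1=10.0, s1=2.0, n1=32, xbar2=9.0, s2=2.0, n2=32)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
p_two = 2 * (1 - NormalDist().cdf(abs(z2)))     # two-tailed p-value

# Reject H0 when z > z_{alpha/2} or z < -z_{alpha/2}
reject_two = abs(z2) > z_crit
print(f"z = {z2:.2f}, critical value = {z_crit:.3f}, "
      f"p-value = {p_two:.4f}, reject H0: {reject_two}")
```

For a one-tailed alternative, the critical value would be `NormalDist().inv_cdf(1 - alpha)` and only the matching tail of the distribution would be used.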

Assumptions: The samples are randomly and independently selected from the two populations and $n_1 \geq 30$ and $n_2 \geq 30$.

Test of Hypothesis Concerning the Difference Between Two Means: Independent Random Small Samples

1. Null hypothesis: $H_0: (\mu_1 - \mu_2) = D_0$, where $D_0$ is some specific difference that you wish to test.
2. Alternative hypothesis:
   One-Tailed Test: $H_a: (\mu_1 - \mu_2) > D_0$ or $H_a: (\mu_1 - \mu_2) < D_0$
   Two-Tailed Test: $H_a: (\mu_1 - \mu_2) \neq D_0$
3. Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $t > t_{\alpha}$ (or $t < -t_{\alpha}$ when the alternative hypothesis is $H_a: (\mu_1 - \mu_2) < D_0$)
   Two-Tailed Test: $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
   or when p-value $< \alpha$

The critical values of $t$ are based on $(n_1 + n_2 - 2)$ df.

Assumptions: The samples are randomly and independently selected from normally distributed populations. The variances of the populations, $\sigma_1^2$ and $\sigma_2^2$, are equal.

Examples:

1. Random samples of 50 recent college graduates in each major were selected and the following information was obtained:

   Major    Education    Social science
   Mean     40554        38348
   SD       2225         2375

   a. Do the data provide sufficient evidence to indicate a difference in average starting salaries for college graduates who majored in education and the social sciences? Test using $\alpha = 0.05$.
   b. Find a 95% confidence interval for the difference between the means for the two groups in the general population. Compare your result with part a.

2. A geologist collected the titanium contents of the samples, found using two different methods:
   Method 1: 0.011, 0.013, 0.013, 0.015, 0.014, 0.013, 0.010, 0.013, 0.011, 0.012
   Method 2: 0.011, 0.016, 0.013, 0.012, 0.015, 0.012, 0.017, 0.013, 0.014, 0.015
   a. Use an appropriate method to test for a significant difference in the average titanium contents using the two different methods.
   b. Determine a 95% confidence interval estimate for $(\mu_1 - \mu_2)$. Does your interval estimate support your conclusion in part a?

Large-Sample Statistical Test for $(p_1 - p_2)$

1. Null hypothesis: $H_0: (p_1 - p_2) = 0$, or alternatively $H_0: p_1 = p_2$.
2. Alternative hypothesis:
   One-Tailed Test: $H_a: (p_1 - p_2) > 0$ or $H_a: (p_1 - p_2) < 0$
   Two-Tailed Test: $H_a: (p_1 - p_2) \neq 0$
3. Test statistic:

   $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{SE} = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}} = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{pq}{n_1} + \dfrac{pq}{n_2}}}$

   where $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$. Since the common value of $p_1 = p_2 = p$ (used in the standard error) is unknown, it is estimated by

   $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$

   and the test statistic is

   $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}\hat{q}}{n_1} + \dfrac{\hat{p}\hat{q}}{n_2}}}$ or $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

4. Rejection region: Reject $H_0$ when
   One-Tailed Test: $z > z_{\alpha}$ (or $z < -z_{\alpha}$ when the alternative hypothesis is $H_a: (p_1 - p_2) < 0$)
   Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
   or when p-value $< \alpha$

Assumptions: Samples are selected in a random and independent manner from two binomial populations, and $n_1$ and $n_2$ are large enough, that is, $n_1\hat{p}_1$, $n_1\hat{q}_1$, $n_2\hat{p}_2$ and $n_2\hat{q}_2$ should all be greater than 5.

Example:

Independent random samples of 280 and 350 observations were selected from binomial populations 1 and 2 respectively. Sample 1 had 132 successes, and sample 2 had 178 successes. Do the data present sufficient evidence to indicate that the proportion of successes in population 1 is smaller than the proportion in population 2?

Suggested Exercises: 9.7, 9.11, 9.15, 9.17, 9.21, 9.23, 9.27, 9.31, 9.33, 9.35, 9.37, 9.41, 9.45, 9.51, 9.57, 9.61, 9.69, 9.75, 10.7, 10.11, 10.15, 10.21, 10.23, 10.27, 10.31, 10.33
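As one possible way to set up the two-proportion example above, the pooled $z$ statistic can be sketched in Python. The counts (132 of 280 and 178 of 350) come from the example; the function name `two_proportion_z` is hypothetical, and the significance level $\alpha = 0.05$ is an assumption, since the example does not state one. The alternative is taken to be $H_a: (p_1 - p_2) < 0$, so the p-value is the lower-tail area.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: (p1 - p2) = 0."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # estimate of the common p under H0
    q_pool = 1 - p_pool
    se = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Counts from the example above
x1, n1, x2, n2 = 132, 280, 178, 350

# Sample-size check: n1*p1_hat, n1*q1_hat, n2*p2_hat, n2*q2_hat all > 5
large_enough = all(c > 5 for c in (x1, n1 - x1, x2, n2 - x2))

z_prop = two_proportion_z(x1, n1, x2, n2)
p_lower = NormalDist().cdf(z_prop)  # lower-tail p-value for Ha: p1 < p2

alpha = 0.05  # assumed level; not given in the example
reject_h0 = p_lower <= alpha
print(f"z = {z_prop:.2f}, p-value = {p_lower:.3f}, reject H0: {reject_h0}")
```

With these counts $z$ comes out at about $-0.93$, which is not below $-z_{0.05}$, so under the assumed $\alpha$ the data do not give sufficient evidence that the proportion in population 1 is smaller.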