MATH411: Applied Statistics
Dr. YU, Chi Wai

Chapter 5: HYPOTHESIS TESTING

1 WHAT IS HYPOTHESIS TESTING?

As its name indicates, hypothesis testing is about testing a hypothesis. More precisely, we first translate our question of interest into a hypothesis about an unknown parameter such as $\mu_X$ or $\sigma_X$, and then test it. In Chapter 4 we used a sample of data to get a point-valued or an interval-valued estimate of the unknown parameter. Here we instead start from a hypothesized value of the parameter, which is assumed to be true, and then use data to see whether that assumption should be rejected or not rejected.

2 STATISTICAL HYPOTHESES: $H_0$ AND $H_1$

In hypothesis testing, we first need to study the following key terms:

The null hypothesis $H_0$: the hypothesis that is assumed to be true and is then formally tested, to be rejected or not rejected. It always contains an equality sign (i.e. $=$, $\le$ or $\ge$).

The alternative hypothesis $H_1$: the hypothesis that typically represents the underlying research question of the investigator. It is the complement of $H_0$, i.e. it contains the values of the parameter we accept if we reject $H_0$. It never contains an equality sign, EXCEPT in a simple test.

In this chapter, both hypotheses are stated with respect to the parameter $\mu_X$ or $\sigma_X$.

Test statistic: an estimator used for the parameter in a test. Throughout our course, the test statistics used for testing hypotheses about $\mu_X$ and $\sigma_X$ are $\bar{X}$ and $S_{n-1}$, respectively.
EXAMPLE

An agronomist may want to decide on the basis of experiments whether or not a new fertilizer would produce a higher yield of soybeans than an old one whose mean is known to be 10. In this case the agronomist has to test $\mu_X > 10$, where $\mu_X$ is the mean of the random variable of the yield of soybean by the new fertilizer, assuming a normal population. Then, we have
    $H_0: \mu_X = 10$   against   $H_1: \mu_X > 10$.

A manufacturer of pharmaceutical products may decide on the basis of samples whether or not 90% of all patients given a new medication will recover from a certain disease. In this case we might say that the manufacturer has to decide whether or not the parameter $p$ of a binomial population equals 0.90. We have
    $H_0: \mu_X = p = 0.9$   against   $H_1: \mu_X \ne 0.9$.

3 TYPE OF HYPOTHESIS TESTING

According to the form of the alternative hypothesis, we can have the following four types of tests:

I) SIMPLE TEST
    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X = \mu_1$

II) ONE-SIDED RIGHT TEST
    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X > \mu_0$

III) ONE-SIDED LEFT TEST
    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X < \mu_0$

IV) TWO-SIDED TEST
    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X \ne \mu_0$
In this course, we will NOT study any test with an inequality null hypothesis.

Note that the hypotheses should be stated BEFORE looking at the data. Unless we have enough information to do a simple test or a one-sided test, in practice we would opt for the default, which is a two-sided test.

4 MAIN CONCEPT OF HYPOTHESIS TESTING

The basic idea of hypothesis testing is an argument by CONTRADICTION, with the following three steps:

Step 1: Determine $H_0$ and $H_1$.

Step 2: Under $H_0$, define a rare event, i.e. an event which happens with a very small probability in one experiment of getting $n$ data.

Step 3: Collect data. If the data contradict $H_0$ (the rare event occurs), then we can say that $H_0$ is false and reject $H_0$. If the data do not contradict $H_0$, then we cannot say that $H_0$ is true and accept it; we can only say that we do NOT reject $H_0$.
Example: We want to know whether or not a coin is fair. Consider a random experiment of flipping the coin, say, 10 times.

Step 1: ($H_0$) The coin is fair, i.e. $P(\{H\}) = P(\{T\}) = 1/2$, and ($H_1$) the coin is NOT fair.

Step 2: Under $H_0$, i.e. when the coin is assumed to be fair, the probability of getting 10 tails in ONE experiment is $(1/2)^{10} \approx 0.00098$. So, we can define the event of getting 10 tails to be the rare event under $H_0$.

Step 3: Perform the experiment to collect data, i.e. we now flip the coin 10 times. If we finally get 10 tails, then the collected data tell us that getting 10 tails is NOT a rare event, i.e. they contradict $H_0$. Therefore, we would say that we have evidence to doubt $H_0$ and thus reject $H_0$ (= accept $H_1$).

In hypothesis testing, we only use the data to see if there is enough evidence to reject $H_0$. If we have enough evidence to reject $H_0$, we can have great confidence that $H_0$ is false and $H_1$ is true. However, if we do not have enough evidence to reject $H_0$, it does not mean that we have great confidence in the truth of $H_0$. In this case, we should say "do not reject $H_0$" instead of "accept $H_0$".
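As a small check of Step 2, the probability of the rare event can be computed from the binomial formula. The sketch below is only an illustration; the function name `binomial_pmf` is a hypothetical helper, not something defined in the course.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Under H0 the coin is fair, so P(tail) = 1/2 on each of the 10 flips.
p_ten_tails = binomial_pmf(10, 10, 0.5)
print(f"P(10 tails in 10 flips | fair coin) = {p_ten_tails:.5f}")
```

The value agrees with $(1/2)^{10} = 1/1024$, which is why observing 10 tails in a single run of the experiment is treated as evidence against $H_0$.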
5 TEST ERRORS AND ERROR PROBABILITIES

We use a test statistic (a point-valued estimator) to formulate a so-called statistical test statement (hereafter, a test statement) of the hypothesis $H_0$: a procedure/statement/condition, based upon the observed values of the random variable of our interest, that leads to the rejection or non-rejection of $H_0$.

Note that there is no perfect test statement. Every test statement is subject to the following two kinds of errors.

                        Not reject $H_0$      Reject $H_0$
    If $H_0$ is true    No error              TYPE I ERROR
    If $H_0$ is false   TYPE II ERROR         No error

TYPE I ERROR: THE ERROR OF REJECTING $H_0$ WHEN IT IS IN FACT TRUE.
TYPE II ERROR: THE ERROR OF NOT REJECTING $H_0$ WHEN IT IS IN FACT FALSE.

Correspondingly, we have

    $\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \text{ if } H_0 \text{ is true})$
    It is the probability of making a wrong decision to reject $H_0$.

    $\beta = P(\text{Type II error}) = P(\text{not reject } H_0 \text{ if } H_0 \text{ is false})$
    It is the probability of making a wrong decision not to reject $H_0$.

Ideally, we want to formulate a test statement such that both error probabilities are minimized. However, in general, we cannot control both error probabilities simultaneously (when the sample size is fixed). The following example illustrates this problem.
EXAMPLE

Suppose we knew that the light bulbs produced from a standard manufacturing process have lifetimes distributed as normal with a standard deviation $\sigma_X = 300$ hours. However, we did not know the mean lifetime $\mu_X$. For simplicity, assume that we were sure that the mean lifetime should be either 100 or 140. Then we may set up the following simple test:

    $H_0: \mu_X = 100$   against   $H_1: \mu_X = 140$

Suppose that we draw a sample of 100 light bulbs and measure their lifetimes. The sample mean $\bar{X}$ is used to estimate the true population mean $\mu_X$. Since the hypothesized value in $H_1$ is larger than the hypothesized value in $H_0$, intuitively a large value of $\bar{x}$ will lead to the rejection of $H_0$, so we can set the test statement

    Reject $H_0$ if $\bar{x} > c$.

We will discuss later how to determine the constant $c$, which is often called a critical value in practice. For now we simply use the above statement to get

    $\alpha = P(\text{reject } H_0 \text{ if } H_0 \text{ is true}) = P(\bar{X} > c \text{ if } \mu_X = 100)$

and

    $\beta = P(\text{not reject } H_0 \text{ if } H_0 \text{ is false}) = P(\bar{X} \le c \text{ if } \mu_X = 140)$.

Note that (Step 2) under $H_0$ (i.e. when $\mu_X = 100$), the event $\{\bar{X} > c\}$ occurs with a very small probability $\alpha$. Thus, $\{\bar{X} > c\}$ is a rare event under $H_0$. So, in ONE experiment of getting $n$ data, we should NOT get $\bar{x} > c$ IF $H_0$ is TRUE. (Step 3) In other words, getting $\bar{x} > c$ in one experiment would contradict $H_0$, and then we would reject $H_0$.
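The two error probabilities above can be evaluated numerically for any candidate $c$. The following is a minimal sketch for the light bulb example, assuming $\bar{X}$ is normal with standard deviation $\sigma_X/\sqrt{n} = 300/\sqrt{100} = 30$; the candidate values of $c$ and the function names are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

MU_0, MU_1 = 100.0, 140.0   # hypothesized means under H0 and H1
SIGMA, N = 300.0, 100       # known standard deviation and sample size
SE = SIGMA / sqrt(N)        # standard deviation of the sample mean


def alpha(c: float) -> float:
    """P(X-bar > c) when mu_X = MU_0, i.e. the Type I error probability."""
    return 1 - NormalDist(MU_0, SE).cdf(c)


def beta(c: float) -> float:
    """P(X-bar <= c) when mu_X = MU_1, i.e. the Type II error probability."""
    return NormalDist(MU_1, SE).cdf(c)


# Illustrative candidate critical values (not from the notes).
for c in (120.0, 140.0, 160.0):
    print(f"c = {c:5.1f}: alpha = {alpha(c):.4f}, beta = {beta(c):.4f}")
```

Moving $c$ to the right lowers $\alpha$ but raises $\beta$, which is the trade-off between the two errors for a fixed sample size.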
The following picture illustrates why we cannot minimize these two error probabilities simultaneously. (More details will be discussed in lecture!)

There is a trade-off between the two types of error: making $\alpha$ smaller leads to a larger $\beta$, and vice versa. So, in designing a test statement, we can only control one of the errors: normally we guarantee that $\alpha$ is at a desired low value, and then find a test statement with $\beta$ as small as possible.

DETERMINATION OF A CRITICAL VALUE

Recall that when we design a test of hypothesis, in general we cannot control the two error probabilities at the same time. What we can do is to fix the Type I error probability $\alpha$ at a desired low level (often 0.01, 0.05 or 0.1), and then reduce the Type II error probability $\beta$ as much as we can.

How do we design a test statement under this restriction on $\alpha$? With reference to the previous example, we now want to test

    $H_0: \mu_X = 100$   against   $H_1: \mu_X = 140$

at $\alpha = 0.05$.
Again, in this example, a large value of the sample mean will lead to the rejection of the null hypothesis $H_0$. So, we consider

    Reject $H_0$ if $\bar{x} > c$.

Now, we can expect that the critical value $c$ should be determined by the fixed Type I error probability. From now on, we call $\alpha$ a significance level.

After some technical steps (discussed in lecture), we have $c = 149.35$ and thus the complete test statement:

    Reject $H_0$ at a significance level 0.05 if $\bar{x} > 149.35$.

Suppose that the observed value of the sample mean is $\bar{x} = 137$. Then we conclude that we DO NOT HAVE ENOUGH EVIDENCE TO REJECT $H_0$ at a level $\alpha = 0.05$.
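The critical value quoted above can be reproduced by solving $P(\bar{X} > c \text{ if } \mu_X = 100) = 0.05$, which gives $c = \mu_0 + z_{0.05}\,\sigma_X/\sqrt{n}$ with $z_{0.05} \approx 1.645$. The sketch below is one way to carry out that arithmetic; the function name is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_right(mu_0: float, sigma: float, n: int, alpha: float) -> float:
    """Critical value c such that P(X-bar > c) = alpha when mu_X = mu_0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha quantile of N(0, 1)
    return mu_0 + z_alpha * sigma / sqrt(n)


c = critical_value_right(mu_0=100.0, sigma=300.0, n=100, alpha=0.05)
x_bar = 137.0            # observed sample mean from the example
reject = x_bar > c       # test statement: reject H0 if x-bar > c

print(f"c = {c:.2f}, reject H0: {reject}")
```

With these inputs $c$ comes out at about 149.35, and since 137 does not exceed it, $H_0$ is not rejected.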
6 POWER OF A TEST STATEMENT

The power of a test statement is defined as $1 - \beta$, i.e. the probability of rejecting $H_0$ if $H_0$ is false. It is often used to assess the goodness of the test statement.

To compare two different test statements, we first need both test statements to have a common $\alpha$; the test statement with the higher power is then said to be better.

Question: How can we increase the power of a given test statement when the value of $\alpha$ remains unchanged?

Answer:

QUESTION

Find the power of the test statement

    Reject $H_0$ at a significance level 0.05 if $\bar{x} > 149.35$

in the previous example.
What would be the test statement and its power if the sample size is changed from 100 to 400?
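One way to explore the two questions above numerically is to compute the critical value and the power $1 - \beta = P(\bar{X} > c \text{ if } \mu_X = 140)$ for each sample size. This is a sketch under the assumptions of the light bulb example ($\mu_0 = 100$, $\mu_1 = 140$, $\sigma_X = 300$, $\alpha = 0.05$); the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def right_test_power(mu_0: float, mu_1: float, sigma: float, n: int, alpha: float):
    """Return (critical value, power) of the test 'reject H0 if x-bar > c'."""
    se = sigma / sqrt(n)                              # std. dev. of the sample mean
    c = mu_0 + NormalDist().inv_cdf(1 - alpha) * se   # keeps P(Type I error) = alpha
    power = 1 - NormalDist(mu_1, se).cdf(c)           # P(X-bar > c) when mu_X = mu_1
    return c, power


c_100, power_100 = right_test_power(100.0, 140.0, 300.0, 100, 0.05)
c_400, power_400 = right_test_power(100.0, 140.0, 300.0, 400, 0.05)

print(f"n = 100: c = {c_100:.2f}, power = {power_100:.4f}")
print(f"n = 400: c = {c_400:.2f}, power = {power_400:.4f}")
```

With $\alpha$ held at 0.05, the larger sample gives a smaller critical value and a noticeably higher power.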
7 FORMULATION OF A TEST STATEMENT ABOUT $\mu_X$ (NORMAL CASE)

In the following, we ONLY consider the formulation of the test statement for one-sided tests and the two-sided test in the case that $X$ follows a Normal distribution, i.e. $X \sim N(\mu_X, \sigma_X^2)$.

Recall that when $X \sim N(\mu_X, \sigma_X^2)$, we have the result that

    $\bar{X} \sim N\left(\mu_X, \frac{\sigma_X^2}{n}\right)$.

Thus, when $\sigma_X$ is known, we have the following results:

1. One-sided right test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X > \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} > c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\bar{x} > \mu_0 + z_\alpha \frac{\sigma_X}{\sqrt{n}}$ (when $\sigma_X$ is KNOWN).
2. One-sided left test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X < \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} < c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\bar{x} < \mu_0 - z_\alpha \frac{\sigma_X}{\sqrt{n}}$ (when $\sigma_X$ is KNOWN).

3. Two-sided test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X \ne \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} < a$ OR $\bar{x} > b$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\left| \frac{\bar{x} - \mu_0}{\sigma_X / \sqrt{n}} \right| > z_{\alpha/2}$ (when $\sigma_X$ is KNOWN).
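The three test statements for known $\sigma_X$ can be collected into one small routine. The sketch below standardizes $\bar{x}$ first, which is equivalent to the statements above; the function name `z_test` and the labels "right", "left" and "two-sided" are illustrative choices, not notation from the course.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar: float, mu_0: float, sigma: float, n: int,
           alpha: float, alternative: str) -> bool:
    """Return True if H0: mu_X = mu_0 is rejected at level alpha (sigma known)."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))   # standardized sample mean
    std_normal = NormalDist()
    if alternative == "right":               # H1: mu_X > mu_0
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "left":                # H1: mu_X < mu_0
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":           # H1: mu_X != mu_0
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'right', 'left' or 'two-sided'")


# Light bulb example: x-bar = 137, mu_0 = 100, sigma = 300, n = 100.
print(z_test(137.0, 100.0, 300.0, 100, 0.05, "right"))
```

For the light bulb data the standardized value is about 1.23, below $z_{0.05} \approx 1.645$, so the one-sided right test does not reject $H_0$.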
Previously in Chapter 4, when $\sigma_X$ is unknown and we used $s_{n-1}$ to replace $\sigma_X$, we used a $t$ distribution to derive the formulas of the random interval and the confidence interval for $\mu_X$. Similarly, in the following we also use the $t$ distribution to get the test statement of the hypothesis about $\mu_X$ when $\sigma_X$ is unknown.

Recall that if $X \sim N(\mu_X, \sigma_X^2)$, then

    $\frac{\bar{X} - \mu_X}{S_{n-1} / \sqrt{n}} \sim t_{n-1}$,

which means that the random variable on the left follows a $t$ distribution with $n - 1$ degrees of freedom.

Thus, when $\sigma_X$ is UNKNOWN, we have the following results:

1. One-sided right test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X > \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} > c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\bar{x} > \mu_0 + t_{n-1,\alpha} \frac{s_{n-1}}{\sqrt{n}}$ (when $\sigma_X$ is UNKNOWN).
2. One-sided left test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X < \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} < c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\bar{x} < \mu_0 - t_{n-1,\alpha} \frac{s_{n-1}}{\sqrt{n}}$ (when $\sigma_X$ is UNKNOWN).

3. Two-sided test: Consider

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X \ne \mu_0$.

Intuitively, we reject $H_0$ if $\bar{x} < a$ OR $\bar{x} > b$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\left| \frac{\bar{x} - \mu_0}{s_{n-1} / \sqrt{n}} \right| > t_{n-1,\alpha/2}$ (when $\sigma_X$ is UNKNOWN).
Remark that for the above three tests when $\sigma_X$ is UNKNOWN, we often call the term

    $\frac{\bar{x} - \mu_0}{s_{n-1} / \sqrt{n}}$

a $t$ value. In R (https://www.r-project.org/), we can use the function t.test to get the $t$ value in the case of UNKNOWN $\sigma_X$, when all collected data are given.

EXAMPLE

Frequencies, in hertz (Hz), of 12 elephant calls:

    14, 16, 17, 17, 4, 0, 3, 18, 9, 31, 15, 35

Assume that the population of possible elephant call frequencies ($X$) follows a normal distribution. Now a scientist is interested in the expected frequency $\mu_X$ of $X$. Do a two-sided test with $H_0: \mu_X = 10$ at a 0.05 level of significance.

Note that R provides us with the $t$ value only. When we do a two-sided test, we need to find the absolute value of the $t$ value by ourselves.

How to find $t_{n-1,\alpha/2} = t_{11,0.025}$ in R?
QUESTION

The average length of time for students to register for classes at a certain college has been 46 minutes. A new registration procedure using modern computing machines is being tried. Suppose a random sample of 12 students had an average registration time of 42 minutes with a standard deviation of 11.9 minutes under the new system. Test the hypothesis that the population mean length of time under the new system is now less than 46. Use a 0.05 level of significance with the assumption that the data are from a normal distribution.
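A possible way to set up this one-sided left test from summary statistics is sketched below. It assumes a sample of $n = 12$ students with $\bar{x} = 42$ and $s_{n-1} = 11.9$, and it takes the critical value $t_{11,0.05} \approx 1.796$ from a standard $t$ table (in R, `qt(0.95, df = 11)` returns this quantile). The function name is a hypothetical helper.

```python
from math import sqrt


def t_value(x_bar: float, mu_0: float, s: float, n: int) -> float:
    """t value (x-bar - mu_0) / (s / sqrt(n)) for a test about mu_X."""
    return (x_bar - mu_0) / (s / sqrt(n))


# Assumed summary statistics for the registration question.
n, x_bar, s, mu_0 = 12, 42.0, 11.9, 46.0

# Table value of t_{11, 0.05}; an input to this sketch, not computed here.
T_11_005 = 1.796

t = t_value(x_bar, mu_0, s, n)
reject = t < -T_11_005        # one-sided left test: reject H0 if t < -t_{n-1, alpha}

print(f"t value = {t:.3f}, reject H0: {reject}")
```

Under these assumptions the $t$ value is about $-1.164$, which is not below $-1.796$, so there is not enough evidence at the 0.05 level to conclude that the mean registration time is less than 46 minutes.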
8 FORMULATION OF A TEST STATEMENT ABOUT $\sigma_X$ (NORMAL CASE)

$S_{n-1}$ is the test statistic we use to formulate the test statement about $\sigma_X$. Similar to our procedure of using $S_{n-1}$ to construct a random interval for $\sigma_X$ and then get a confidence interval in Chapter 4, we use the following theoretical result to get the test statement:

If $X \sim N(\mu_X, \sigma_X^2)$, then

    $\frac{(n-1) S_{n-1}^2}{\sigma_X^2} \sim \chi^2_{n-1}$,

which means that the random variable on the left follows a $\chi^2$ distribution with $n - 1$ degrees of freedom.

Therefore, we can write down the following general results:

1. One-sided right test: Consider

    $H_0: \sigma_X = \sigma_0$   against   $H_1: \sigma_X > \sigma_0$.

Intuitively, we reject $H_0$ if $s_{n-1} > c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\frac{(n-1) s_{n-1}^2}{\sigma_0^2} > \chi^2_{n-1,\alpha}$.
2. One-sided left test: Consider

    $H_0: \sigma_X = \sigma_0$   against   $H_1: \sigma_X < \sigma_0$.

Intuitively, we reject $H_0$ if $s_{n-1} < c$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\frac{(n-1) s_{n-1}^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha}$.

3. Two-sided test: Consider

    $H_0: \sigma_X = \sigma_0$   against   $H_1: \sigma_X \ne \sigma_0$.

Intuitively, we reject $H_0$ if $s_{n-1} < a$ OR $s_{n-1} > b$. Consequently, we would

    Reject $H_0$ at a significance level $\alpha$ if $\frac{(n-1) s_{n-1}^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha/2}$ OR $\frac{(n-1) s_{n-1}^2}{\sigma_0^2} > \chi^2_{n-1,\alpha/2}$.
QUESTION

A manufacturer of car batteries claims that the life of his batteries is normally distributed with a standard deviation equal to 0.9 year. If a random sample of 10 of these batteries has a standard deviation of 1.2 years, do you think that $\sigma > 0.9$ year? Use a 0.05 level of significance to draw a conclusion.

How to find $\chi^2_{n-1,\alpha} = \chi^2_{9,0.05}$ in R?
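The one-sided right test for this question can be sketched from the summary statistics alone. The code below assumes $n = 10$, a sample standard deviation of 1.2 years and $\sigma_0 = 0.9$, and it takes $\chi^2_{9,0.05} \approx 16.919$ from a standard chi-square table (in R, `qchisq(0.95, df = 9)` returns this quantile). The function name is a hypothetical helper.

```python
def chi_square_statistic(s: float, sigma_0: float, n: int) -> float:
    """Test statistic (n - 1) * s^2 / sigma_0^2 for a test about sigma_X."""
    return (n - 1) * s**2 / sigma_0**2


# Assumed summary statistics for the battery question.
n, s, sigma_0 = 10, 1.2, 0.9

# Table value of chi^2_{9, 0.05}; an input to this sketch, not computed here.
CHI2_9_005 = 16.919

stat = chi_square_statistic(s, sigma_0, n)
reject = stat > CHI2_9_005    # one-sided right test: reject H0 if stat > chi^2_{n-1, alpha}

print(f"test statistic = {stat:.3f}, reject H0: {reject}")
```

Under these assumptions the statistic equals 16, slightly below 16.919, so $H_0: \sigma_X = 0.9$ is not rejected at the 0.05 level.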
9 TWO-SIDED HYPOTHESIS TESTING VS CONFIDENCE INTERVAL (NORMAL CASE)

Recall that when we do a two-sided test for $\mu_X$ with unknown $\sigma_X$, i.e. test

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X \ne \mu_0$

at a significance level $\alpha$, we would reject $H_0$ at a significance level $\alpha$ if

    $\left| \frac{\bar{x} - \mu_0}{s_{n-1} / \sqrt{n}} \right| > t_{n-1,\alpha/2}$.

Or equivalently, we would NOT REJECT $H_0$ at a significance level $\alpha$ if

    $\left| \frac{\bar{x} - \mu_0}{s_{n-1} / \sqrt{n}} \right| \le t_{n-1,\alpha/2}$.

This is exactly saying that

    $\mu_0 \in \bar{x} \pm t_{n-1,\alpha/2} \frac{s_{n-1}}{\sqrt{n}}$.
Thus, we have the following approach to use a C.I. to draw a conclusion for two-sided testing of a hypothesis:

Given that $[a, b]$ is the $100(1-\alpha)\%$ C.I. for $\mu_X$, to test

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X \ne \mu_0$

at a significance level $\alpha$, it suffices to check if $\mu_0$ is inside OR outside $[a, b]$.

    Inside:   Do not reject $H_0$ at the level of significance $\alpha$.
    Outside:  Reject $H_0$ at the level of significance $\alpha$.

Remark: When using the confidence interval to do the two-sided test, we have to ensure that the significance level + the confidence level = 1.

Similarly, this confidence-interval based approach can also be used for any other two-sided tests, like the two-sided tests of $\mu_X$ (when $\sigma_X$ is KNOWN) and of $\sigma_X$.
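The equivalence between the interval check and the two-sided test can be illustrated numerically. The sketch below uses the KNOWN $\sigma_X$ case, so that $z_{\alpha/2}$ replaces $t_{n-1,\alpha/2}$, and reuses the light bulb numbers ($\bar{x} = 137$, $\sigma_X = 300$, $n = 100$); the candidate values of $\mu_0$ and the function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar: float, sigma: float, n: int, alpha: float):
    """100(1 - alpha)% confidence interval for mu_X when sigma is known."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


def reject_two_sided(x_bar: float, mu_0: float, sigma: float, n: int, alpha: float) -> bool:
    """Two-sided z test: reject H0 if |z| > z_{alpha/2}."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


a, b = z_confidence_interval(137.0, 300.0, 100, 0.05)
print(f"95% C.I. = [{a:.2f}, {b:.2f}]")

for mu_0 in (50.0, 100.0, 200.0):
    outside = not (a <= mu_0 <= b)
    print(mu_0, outside, reject_two_sided(137.0, mu_0, 300.0, 100, 0.05))
```

For each hypothesized $\mu_0$, "outside the interval" and "reject $H_0$" give the same answer, as the argument above predicts.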
10 ONE-SIDED HYPOTHESIS TESTING VS CONFIDENCE INTERVAL (NORMAL CASE)

Given that $[a, b]$ is the $100(1-\alpha)\%$ C.I. for $\mu_X$. If $\mu_0 < a$, then we can reject $H_0$ in the one-sided right test

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X > \mu_0$

at a significance level $\alpha/2$.

Similarly, this confidence-interval based approach can also be used for other one-sided right tests, like the tests of $\mu_X$ (when $\sigma_X$ is KNOWN) and of $\sigma_X$.
Given that $[a, b]$ is the $100(1-\alpha)\%$ C.I. for $\mu_X$. If $\mu_0 > b$, then we can reject $H_0$ in the one-sided left test

    $H_0: \mu_X = \mu_0$   against   $H_1: \mu_X < \mu_0$

at a significance level $\alpha/2$.

Similarly, this confidence-interval based approach can also be used for other one-sided left tests, like the tests of $\mu_X$ (when $\sigma_X$ is KNOWN) and of $\sigma_X$.
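The link between a $100(1-\alpha)\%$ C.I. and the one-sided right and left tests at level $\alpha/2$ can be checked in the same way. The sketch below again uses the KNOWN $\sigma_X$ case with $\mu_0 = 100$, $\sigma_X = 300$ and $n = 100$; the sample means 160 and 40 are made-up illustrative values, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_versus_one_sided(x_bar: float, mu_0: float, sigma: float, n: int, alpha: float) -> dict:
    """Compare the 100(1 - alpha)% C.I. with one-sided z tests at level alpha / 2."""
    se = sigma / sqrt(n)
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    a, b = x_bar - z_half * se, x_bar + z_half * se
    z = (x_bar - mu_0) / se
    return {
        "ci": (a, b),
        "mu0_below_a": mu_0 < a,      # C.I. condition for the right test
        "reject_right": z > z_half,   # right test at level alpha / 2
        "mu0_above_b": mu_0 > b,      # C.I. condition for the left test
        "reject_left": z < -z_half,   # left test at level alpha / 2
    }


# 90% C.I., so each one-sided test is at level 0.05.
high = ci_versus_one_sided(160.0, 100.0, 300.0, 100, 0.10)
low = ci_versus_one_sided(40.0, 100.0, 300.0, 100, 0.10)
print(high)
print(low)
```

In both cases the C.I. condition and the corresponding one-sided decision agree; note that a 90% interval is paired with one-sided tests at level 0.05, not 0.10.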