MA 1125 Lecture 33 - The Sign Test
Monday, December 4, 2017

Objectives: Introduce an example of a non-parametric test.

For the last topic of the semester, we'll look at an example of a non-parametric test. I'm not sure why the term non-parametric was chosen, but it means that no specific distribution is assumed for the variable. All of our tables give us probabilities based on the assumption that the underlying population is normally distributed. If our samples are large enough, the central limit theorem tells us that sample means will be approximately normally distributed, but if our underlying population is significantly non-normal, then we can get bad results if we're not careful.

1. The basic idea

Suppose we have a test, and we know the median score is M = 70. Suppose also that we have a teaching program that we feel makes our students do better on this test. In other words, our students come from a population with a higher median. How do we test this statistically? How can we compute probabilities if we do not assume a normal distribution? Furthermore, we're talking about the median, not the mean, and that's a little different too. The sign test is one simple way of finding a probability. Here are our results.

(1) 54 55 55 56 62 68 68 71 72 74 74 74 75 76 79 80 80 81 82 89

Note that the sample median is m = 74, which is better than the population median. What is the probability of this happening?

Here's what we'll do. Half the population is above the median, and half is below. If we take a sample from the population, then we can interpret it as a binomial experiment. Getting something above the median is like a head, and something below is like a tail. We can compute probabilities for that.

In this example, we have 13 out of 20 above the population median, M = 70. What's the probability of this happening if these numbers actually are coming from the population? Using the normal approximation for $P(x \ge 13)$, we want $P(x \ge 12.5)$. Converting to a z-score with n = 20 and p = q = 1/2, we need $\mu = np = 10$ and $\sigma = \sqrt{npq} = 2.236$, and we have

(2) $z = \frac{12.5 - 10}{2.236} = 1.12,$

and so

(3) $P(x \ge 12.5) = P(z \ge 1.12) = 0.5000 - 0.3686 = 0.1314.$

We're looking at a 13% tail here, so having 13 scores above the median would not quite be statistically significant at α = 0.10.

Note that we're using the normal distribution, but we're not assuming that the population is normal. The only assumption we're making is that half of the population is above the median and half is below. That is always true.

2. The Sign Test

In the sign test, we're going to take what we just did and formulate a test statistic that we can compare to a critical value in a table. We look at our sample and assign a − sign to each number below the median, a + sign to each number above the median, and 0 to numbers equal to the median. Our test statistic is

(4) x = { number of +'s or number of −'s, whichever is smaller }.

We'll let n be the total number of +'s and −'s (we toss out the 0's). In the sign test table, we have the largest value of x that lies in the α-tail. Smaller is further out in the tail.

Let's set α = 0.05, and use the sample given above, compared to M = 70. We get

(5) − − − − − − − + + + + + + + + + + + + +

There are 7 −'s and 13 +'s. Therefore,

(6) x = 7.

This is our test statistic. There are

(7) n = 7 + 13 = 20

+'s and −'s altogether, so our critical value is x = 5. Since our test statistic was x = 7, our sample is not statistically significant (x = 5, 4, 3, 2, 1, 0 would have been statistically significant).
3. Another example

I heard recently that the median price of a house in the U.S. just went over $200K. For simplicity, let's say that M = 200K. If we randomly chose houses in Pittsburgh, we could test whether house prices in Pittsburgh are different from national prices. Let's say we got 15 house prices as follows, and we'll use α = 0.05.

(8) 195K 145K 210K 183K 311K 812K 172K 134K 116K 198K 200K 189K 162K 188K 102K

We're comparing to the national median of M = 200K, so we get the following signs.

(9) − − + − + + − − − − 0 − − − −

There are 11 −'s and 3 +'s, so x = 3. The total number of + and − signs is n = 11 + 3 = 14 (we got rid of the one 0). The critical value from the table is x = 3. The numbers in the table and anything smaller are statistically significant, so our sample is statistically significant.

4. Quiz 33

Using the sample from the first example (Table (1)) and a population median of M = 80, use the sign test to determine if the sample is statistically significant at α = 0.05.

1. Find the test value for x.
2. Find n.
3. Find the critical value for x.
4. Is the sample statistically significant?

5. Homework 33

For problems 1-4, the sample below is supposedly drawn from a population with median M = 55. Is the sample statistically significant at α = 0.05?

(10) 55 83 45 74 91 12 45 65 75 45 48 41 73 54 55 55

1. Find the test value for x.
2. Find n.
3. Find the critical value, x.
4. Is the sample statistically significant?

For problems 5-8, the sample below is supposedly drawn from a population with median M = 90. Is the sample statistically significant at α = 0.01?

(11) 55 83 45 74 91 12 45 65 75 45 48 41 73 54 55 55

5. Find the test value for x.
6. Find n.
7. Find the critical value, x.
8. Is the sample statistically significant?

For problems 9-12, the sample below is supposedly drawn from a population with median M = 3. Is the sample statistically significant at α = 0.05?

(12) 5 3 5 8 1 9 4 6 7 4 7 2 5 7 5 4 6 4 4 1 1 1

9. Find the test value for x.
10. Find n.
11. Find the critical value, x.
12. Is the sample statistically significant?

For problems 13-16, the sample below is supposedly drawn from a population with median M = 7. Is the sample statistically significant at α = 0.05?

(13) 5 3 5 8 1 9 4 6 7 4 7 2 5 7 5 4 6 4 4 1 1 1

13. Find the test value for x.
14. Find n.
15. Find the critical value, x.
16. Is the sample statistically significant?

For problems 17-20, the sample below is supposedly drawn from a population with median M = 85. Is the sample statistically significant at α = 0.05?

(14) 32 55 85 19 90 40 63 72 41 78

17. Find the test value for x.
18. Find n.
19. Find the critical value, x.
20. Is the sample statistically significant?

Answers:
1) x = 6. 2) n = 13. 3) x = 3. 4) Not significant.
5) x = 1. 6) n = 16. 7) x = 2. 8) Yes, statistically significant.
9) x = 5. 10) n = 21. 11) x = 6. 12) Yes, statistically significant.
13) x = 2. 14) n = 19. 15) x = 5. 16) Yes, statistically significant.
17) x = 1. 18) n = 9. 19) x = 1. 20) Yes, statistically significant.