HYPOTHESIS TESTING: SINGLE MEAN, NORMAL DISTRIBUTION (Z-TEST)


Matthew Rich
5 years ago

In Binomial Hypothesis Testing researchers generally ignore the actual numbers that are obtained on their measure. The Binomial Test for whether UofW students have above average IQs in the last chapter, for example, simply compared the observed IQ for each of the 9 students to 100 and used the binomial outcome of "above 100" or "not above 100" as the data to be analyzed. The actual IQs of the 9 students were not analyzed further.

A superior statistical test would use the numerical IQs of the 9 students to test whether the IQs of UofW students are above average; that is, to test whether $\mu = 100$. One obvious possibility would be to use $\bar{X}$ (i.e., the sample mean) to test whether it is likely that the observed sample came from a population with a mean of 100. In order to carry out such a test, however, it would be necessary to have a probability distribution for $\bar{X}$, so that we could decide whether the observed outcome was unlikely enough for us to reject the hypothesis that $\mu = 100$. Recall that for the binomial test it was possible to determine the probability associated with different numbers of successes using the binomial theorem, the binomial table, or the normal approximation.

THE CENTRAL LIMIT THEOREM

Fortunately much is known about the probability distribution of sample means. Specifically, the Central Limit Theorem (CLT) states that for many (all possible) samples of size n drawn from a population, the probability distribution of the means:
(a) will tend towards a normal distribution,
(b) will have a mean equal to $\mu$ (i.e., equal to the mean of the population from which the samples were drawn), and
(c) will have a standard deviation equal to $\sigma/\sqrt{n}$ (i.e., equal to the standard deviation of the original population divided by the square root of n, the sample size).
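Let's start with a small population of just three scores (i.e., x = 1, 2, and 3) from which we select all possible samples of two observations (i.e., n = 2). There are 3 x 3 = 9 possible samples when we sample with replacement; the samples are: 11, 12, 13, 21, 22, 23, 31, 32, 33. The 9 sample means are: $\bar{X}$ = 1.0, 1.5, 2.0, 1.5, 2.0, 2.5, 2.0, 2.5, and 3.0.

Now let us see how the statistics for the two populations compare to one another. According to the CLT, $\mu_{\bar{X}} = \mu_X$ and $\sigma_{\bar{X}} = \sigma_X/\sqrt{n}$. The relevant calculations are shown below; note that because we are calculating variances for populations, we use the formulas for population statistics. That is, we divide SS by n, not by n - 1 as the sample formulas would.

Population of Xs
  x      p(x)
  1      .3333
  2      .3333
  3      .3333

$\mu_X = 1 \times .3333 + 2 \times .3333 + 3 \times .3333 = 2.0$
$\sigma_X = \sqrt{(1-2)^2 \times .3333 + (2-2)^2 \times .3333 + (3-2)^2 \times .3333} = \sqrt{.6667} = .8165$

Population of $\bar{X}$s
  mean   p(mean)
  1.0    .1111
  1.5    .2222
  2.0    .3333
  2.5    .2222
  3.0    .1111

$\mu_{\bar{X}} = 1 \times .1111 + 1.5 \times .2222 + 2 \times .3333 + 2.5 \times .2222 + 3 \times .1111 = 2.0$
$\sigma_{\bar{X}} = \sqrt{(1-2)^2 \times .1111 + (1.5-2)^2 \times .2222 + (2-2)^2 \times .3333 + (2.5-2)^2 \times .2222 + (3-2)^2 \times .1111} = \sqrt{.3333} = .5774 = .8165/\sqrt{2}$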
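The three-score example can be checked by brute force. The following is a minimal Python sketch, not part of the original handout; the helper names `pop_mean` and `pop_sd` are made up for illustration. It enumerates all 9 samples of size 2 drawn with replacement and compares the mean and standard deviation of the sample means with the values the CLT predicts.

```python
from itertools import product
from math import sqrt

population = [1, 2, 3]
n = 2  # sample size

def pop_mean(values):
    """Mean of a complete population of values."""
    return sum(values) / len(values)

def pop_sd(values):
    """Population standard deviation: divide SS by N, not N - 1."""
    m = pop_mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))

# All possible samples of size n, sampling with replacement: (1, 1), (1, 2), ...
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mu_x = pop_mean(population)
sigma_x = pop_sd(population)
mu_xbar = pop_mean(sample_means)
sigma_xbar = pop_sd(sample_means)

print("number of samples:", len(samples))
print("mean of Xs:", mu_x, " mean of sample means:", mu_xbar)
print("SD of Xs:", round(sigma_x, 4))
print("SD of sample means:", round(sigma_xbar, 4))
print("SD of Xs / sqrt(n):", round(sigma_x / sqrt(n), 4))
```

The last two printed values should agree, which is statement (c) of the CLT for this tiny population.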


Consider now an initial population with $\mu = 100$ and $\sigma = 15$ from which we select n = 9 observations and calculate $\bar{X}$. That gives us one mean and one standard deviation. Now we repeatedly select samples of 9 observations and calculate $\bar{X}$ for each sample. We repeat this until we have a very large number of samples, each with an $\bar{X}$. What would the distribution of $\bar{X}$ be like? The CLT states that the means will have $\mu_{\bar{X}} = 100 = \mu_X$ and $\sigma_{\bar{X}} = 15/\sqrt{9} = 15/3 = 5.0 = \sigma_X/\sqrt{n}$.

We can use SPSS to demonstrate this more concretely and to confirm the validity of the CLT. SPSS was used to generate 50,000 samples of 9 observations and calculate $\bar{X}$ for each. The first 10 samples and $\bar{X}$s are shown below.

[Table of the first 10 samples: columns X1 to X9 and Mean ($\bar{X}$); values not reproduced here.]

Imagine another 49,990 rows like these, giving a column of 50,000 sample means. What would the probability distribution of those means look like? According to the CLT, it will be approximately normal, with a mean of 100 and a standard deviation of 5.

The following histogram shows the frequency distribution (i.e., probability distribution) of the 50,000 means. Note that the mean of the 50,000 means is very close to 100, as predicted by the CLT, and that the standard deviation of the 50,000 means is very close to 5.0, again as predicted by the CLT. Also observe that the distribution is quite normal in shape. The dark line along the borders of the distribution is in fact a fit of the normal curve to the observed data. It is a very good fit.

If sample means are normally distributed and we know the mean and standard deviation of the $\bar{X}$s, then it is possible to use the table for the normal distribution (i.e., the distribution of z) to calculate the probability that a certain range of sample means would occur if the hypothesized value for $\mu$ were correct. This is just what is needed to test hypotheses about a population mean. But in using the normal distribution to test hypotheses about a mean, it is important to remember that it is the probability distribution of the sample means that is used.
Probability distributions for sample statistics are also known as sampling distributions.
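The 50,000-sample demonstration can be approximated without SPSS. The sketch below is a Python stand-in, not the original SPSS run: it assumes the individual scores are normally distributed with mean 100 and standard deviation 15 (the text gives only the mean and standard deviation), and the function name `simulate_sample_means` and the seed are arbitrary choices made here.

```python
import random
from statistics import fmean, pstdev

def simulate_sample_means(mu=100.0, sigma=15.0, n=9, reps=50_000, seed=1):
    """Draw `reps` samples of size n and return the list of sample means.

    Assumption: individual scores follow a normal distribution.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(fmean(sample))
    return means

means = simulate_sample_means()

# The CLT predicts a mean near 100 and a standard deviation near 15/3 = 5.
print("mean of sample means:", round(fmean(means), 2))
print("SD of sample means:  ", round(pstdev(means), 2))
```

With this many samples, both printed values should land close to 100 and 5, though the exact digits depend on the seed.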

USING Z TO TEST HYPOTHESES ABOUT A POPULATION MEAN

Now that we know what the probability distribution (or sampling distribution) for sample means is like, we can use that information to determine when we should reject or not reject some hypothesis about a population mean. The procedure can be illustrated using the question of UofW students' IQs.

We start with our null and alternative hypotheses. The null hypothesis is that student IQs have a population mean of 100; that is, UofW students do not differ from the general population. If our research hypothesis is that UofW students have a higher IQ than the general population, then our alternative hypothesis would be that shown below (i.e., the population mean is greater than 100).

H0: $\mu = 100$
Ha: $\mu > 100$

If H0 is true, then it is possible to determine the sampling distribution of $\bar{X}$ using the CLT. The population of individual Xs from which our samples are drawn has $\mu_X = 100$ (if H0 is true) and $\sigma_X = 15$ (because that is the published SD for an IQ test). I have put X as a subscript to make clear that these are the mean and standard deviation of the individual Xs in the population from which we are sampling. According to the CLT, our sample means for n = 9 will be normally distributed, with $\mu_{\bar{X}} = \mu_X = 100$ and $\sigma_{\bar{X}} = \sigma_X/\sqrt{n} = 15/3 = 5$. Given this information, there are several equivalent approaches that can be used to test the null hypothesis.

As illustrated in the figure to the right, the Rejection Region is in the upper end of the z distribution; this is because the alternative hypothesis states that the researchers expect $\mu > 100$. The shaded area is the probability that we reject H0 if H0 is true, that is, the probability of a Type I error. This probability, denoted $\alpha$, is generally set at .05, although other values are possible. If we look for .05 in column C of our table for the normal distribution, we will find that it falls exactly half-way between z = 1.64 and z = 1.65. Hence, we use z = 1.645, as shown in the above figure.

Now we can calculate how large our sample mean must be before H0 will be rejected. The value is given by:

Critical $\bar{X} = 100 + 1.645 \times (15/\sqrt{9}) = 100 + 1.645 \times 5 = 108.225$

H0 will be rejected and Ha accepted for sample means greater than or equal to 108.225, and H0 will not be rejected for sample means less than 108.225. The 9 scores sum to 980 (the individual scores are not reproduced here), which gives the sample mean:

$\Sigma X = 980$
$\bar{X} = \Sigma X / n = 980/9 = 108.89$
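The critical-mean calculation above can be sketched in Python using the standard library's `statistics.NormalDist` in place of the printed z table. This is an illustrative sketch only; the variable names are chosen here. Because the inverse normal returns 1.6449 rather than the rounded table value 1.645, the critical mean differs from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100      # population mean under H0
sigma = 15     # known population SD (published SD for an IQ test)
n = 9          # sample size
alpha = 0.05   # probability of a Type I error, one-tailed (upper)

standard_error = sigma / sqrt(n)             # SD of the sample means = 5
z_critical = NormalDist().inv_cdf(1 - alpha)  # about 1.645
critical_mean = mu0 + z_critical * standard_error

observed_mean = 980 / 9                       # sum of the 9 scores / n

print("z critical:    ", round(z_critical, 3))
print("critical mean: ", round(critical_mean, 3))
print("observed mean: ", round(observed_mean, 2))
print("reject H0:     ", observed_mean >= critical_mean)
```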

The observed value for the sample mean (108.89) is greater than the critical value of 108.225 (i.e., the value separating the "Do not reject" and "Reject" regions), so H0 is rejected and Ha is accepted. We conclude that the IQ of UofW students is greater than 100.

There is a second approach to testing this hypothesis, equivalent to the first but somewhat easier to implement in practice. Instead of calculating a critical value for the mean, it is possible to use z = 1.645 as the critical value, calculate a z score for our observed mean, and compare the two to determine whether z Observed is greater than z Critical = 1.645.

z Observed = $(108.89 - 100) / (15/\sqrt{9}) = 8.89/5 = 1.78$

Since 1.78 is greater than the critical value of 1.645, H0 is rejected and Ha accepted. This is identical to our previous conclusion, since the two approaches are logically equivalent.

A third approach would be to find C for z = 1.78 and see if $p(z \geq 1.78)$ is less than the alpha of .05. Finding z = 1.78 in our table of the normal distribution gives a value of C = .0375, which is less than .05, our chosen value for alpha.

In summary, the three ways to determine whether to reject H0 or not given $\alpha = .05$ are:
1. z Critical = 1.645; z Observed = 1.78 > 1.645; Reject H0, Accept Ha
2. z Observed = 1.78; $p(z \geq 1.78) = .0375 < .05$; Reject H0, Accept Ha
3. Critical $\bar{X} = 100 + 1.645 \times 5 = 108.225$; Observed $\bar{X} = 108.89 > 108.225$; Reject H0, Accept Ha

FORMULA FOR Z-TEST

H0: $\mu = \mu_0$
Ha: $\mu > \mu_0$ OR Ha: $\mu < \mu_0$ OR Ha: $\mu \neq \mu_0$
z Observed = $(\bar{X} - \mu_0) / (\sigma/\sqrt{n})$
Note that $\sigma$ must be known to use this z-test.

PROBLEM # 10

Directional
H0: $\mu = 50$
Ha: $\mu > 50$
Sampling distribution of $\bar{X}$: $\mu_{\bar{X}} = 50$, $\sigma_{\bar{X}} = 10/\sqrt{10} = 3.162$
$\bar{X} = 56.7$
z Observed = $(56.7 - 50) / (10/\sqrt{10}) = 6.7/3.162 = 2.12$
$\alpha = .05$, z Critical = 1.645
1. z Observed = 2.12 > 1.645; Reject H0, Accept Ha
2. $p(z \geq 2.12) = .0170 < .05$; Reject H0, Accept Ha
3. Critical $\bar{X} = 50 + 1.645 \times 3.162 = 55.20$; Observed $\bar{X} = 56.7 > 55.20$; Reject H0, Accept Ha

Nondirectional
H0: $\mu = 50$
Ha: $\mu \neq 50$
z Observed = 2.12
$\alpha/2 = .025$, z Critical = $\pm 1.96$
1. z Observed = 2.12 > 1.96; Reject H0, Accept Ha
2. $p(z \geq 2.12 \text{ or } z \leq -2.12) = 2 \times .0170 = .0340 < .05$; Reject H0, Accept Ha
3. Critical $\bar{X} = 50 \pm 1.96 \times 3.162 = 43.80$ and $56.20$; Observed $\bar{X} = 56.7 > 56.20$; Reject H0, Accept Ha
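The three equivalent decision rules can be collected into one helper. The sketch below is a hypothetical Python function written for this illustration (the name `z_test`, its parameters, and the returned dictionary keys are not from the handout), and it uses `statistics.NormalDist` rather than the printed table, so p-values may differ from table look-ups in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="greater"):
    """Single-mean z-test with known population SD.

    alternative: "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0).
    """
    std_normal = NormalDist()
    se = sigma / sqrt(n)         # standard error of the mean
    z_obs = (xbar - mu0) / se

    if alternative == "greater":
        z_crit = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z_obs)
        reject = z_obs >= z_crit
        critical_means = (mu0 + z_crit * se,)
    elif alternative == "less":
        z_crit = std_normal.inv_cdf(alpha)       # negative value
        p_value = std_normal.cdf(z_obs)
        reject = z_obs <= z_crit
        critical_means = (mu0 + z_crit * se,)
    elif alternative == "two-sided":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z_obs)))
        reject = abs(z_obs) >= z_crit
        critical_means = (mu0 - z_crit * se, mu0 + z_crit * se)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

    return {
        "z_observed": z_obs,
        "z_critical": z_crit,
        "p_value": p_value,
        "critical_means": critical_means,
        "reject_h0": reject,
    }

# Problem # 10: mean of 56.7 for n = 10, mu0 = 50, sigma = 10
directional = z_test(56.7, 50, 10, 10, alpha=0.05, alternative="greater")
two_sided = z_test(56.7, 50, 10, 10, alpha=0.05, alternative="two-sided")
print(directional)
print(two_sided)
```

Comparing z to a critical z, comparing p to alpha, and comparing the sample mean to a critical mean always lead to the same decision, so `reject_h0` is computed only once, from the z comparison.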

HYPOTHESIS TESTING PROBLEMS: SINGLE-SAMPLE, NORMAL DISTRIBUTION

1. The word "the" appears an average of 15.6 times per paragraph in the writings of Shakespeare, with a standard deviation of 4.3. A sample of 32 paragraphs from an unidentified manuscript contained an average of 14.3 "the's." Can we conclude with alpha = .06 that the unidentified manuscript contains fewer "the's" than works by Shakespeare?

2. Fiction writers spend an average of 6.2 hours per day working, with a standard deviation of 2.5. A sample of 35 poets found that they spent an average of 7.1 hours per day. Can we conclude with alpha = .06 that poets work longer hours than fiction writers?

3. In a certain sampling distribution of the mean of X based on samples of size 44 each, a sample mean of 438 is known to correspond to a z value of -1.75. If the standard error of the mean is 9, what are the population mean and standard deviation for X? [tricky question]

4. Standard diet programs are known to result in a mean weight loss over six weeks of 10.8 pounds with a standard deviation of 3.4 pounds. The designers of a new program report that 3 people had lost an average of 13.9 pounds after six weeks. Would you recommend the new program over standard diet programs?

5. The mean weight of parcels sent by Courier Express is 12.4 kg with a standard deviation of 4.8 kg. A special rate is given for customers sending over 500 kg. What is the probability that a company sending 40 parcels qualifies for the special rate? [tricky question]

6. The amount of money spent on Hallowe'en costumes is a normally distributed variable with a mean of 15.34 dollars and a standard deviation of 3.7. What is the probability that the mean value of 80 costumes is more than 16.25 dollars?

7. A researcher observes that the mean number of trials that unreinforced rats require to learn a maze is 7.5 with a standard deviation of 2.25. A sample of 4 rats rewarded with food pellets learned in an average of 3.2 trials. Does reinforcement improve learning?

8. Archeologists excavating at a Viking site in Newfoundland have determined that the age of artifacts is normally distributed with a mean age of 1000 years and a SD of 50 years. A new site is located that is thought to be older than those previously studied. A random sample of 24 artifacts had a mean age of 1024 years. What conclusions appear warranted about the new site?

9. The average score ($\mu$) on The Achievement Motivation Test for the general population is 95 with a standard deviation of 8. Social psychologists have developed a program designed to raise achievement motivation. A sample of 38 students participated in a pilot study and obtained a mean achievement motivation score of 98.4 after 7 weeks. Do these findings support the conclusion that the program is effective at alpha = .02?

10. The average for the MMPI-2 Psychopathic Deviant Scale is 50, with a standard deviation of 10. Scores for 10 inmates averaged 56.7. What conclusions are warranted about the PD scores of inmates?

SINGLE-SAMPLE, NORMAL DISTRIBUTION - SOLUTIONS

1. H0: $\mu = 15.6$; Ha: $\mu < 15.6$
z = $(14.3 - 15.6) / (4.3/\sqrt{32}) = -1.3/.760 = -1.71$
a. $\alpha = .06$, z = -1.555, -1.71 < -1.555, Rej H0 & Acc Ha
b. z = -1.71, $p(z \leq -1.71) = .0436 < .06$, Rej H0
c. M = $15.6 - 1.555 \times .760 = 14.418$, 14.3 < 14.418, Rej H0
Yes, manuscript contains fewer "the's"

2. H0: $\mu = 6.2$; Ha: $\mu > 6.2$
z = $(7.1 - 6.2) / (2.5/\sqrt{35}) = +.90/.423 = 2.13$
a. $\alpha = .06$, z = 1.555, 2.13 > 1.555, Rej H0 & Acc Ha
b. z = 2.13, $p(z \geq 2.13) = .0166 < .06$, Rej H0
c. M = $6.2 + 1.555 \times .423 = 6.858$, 7.1 > 6.858, Rej H0
Yes, poets work longer hours than fiction writers

3. n = 44, M = 438, $z_{438} = -1.75$, $\sigma_M = 9$
$-1.75 = (438 - \mu)/9$, so $\mu = 438 + 1.75 \times 9 = 453.75$
$9 = \sigma/\sqrt{44}$, so $\sigma = 9 \times \sqrt{44} = 59.70$

4. H0: $\mu = 10.8$; Ha: $\mu > 10.8$
z = $(13.9 - 10.8) / (3.4/\sqrt{3}) = 3.1/1.963 = 1.579$
a. $\alpha = .05$, z = 1.645, 1.579 < 1.645, Do Not Rej H0
b. $p(z \geq 1.58) = .0571 > .05$, Do Not Rej H0
c. M = $10.8 + 1.645 \times 1.963 = 14.029$, 13.9 < 14.029, DNR H0
Insufficient evidence to conclude new program more effective

5. $\mu = 12.4$, $\sigma = 4.8$, M = 500/40 = 12.5
z = $(12.5 - 12.4) / (4.8/\sqrt{40}) = .1/.759 = .13$
$p(z \geq .13) = .4483$
Tricky part is to appreciate that 500 = $\Sigma X$

6. $\mu = 15.34$, $\sigma = 3.7$
z = $(16.25 - 15.34) / (3.7/\sqrt{80}) = .91/.4137 = 2.20$
$p(z \geq 2.20) = .0139$

7. H0: $\mu = 7.5$; Ha: $\mu < 7.5$
z = $(3.2 - 7.5) / (2.25/\sqrt{4}) = -4.3/1.125 = -3.82$
a. $\alpha = .05$, z = -1.645, -3.82 < -1.645, Rej H0 & Acc Ha
b. $p(z \leq -3.82) < .0001$, Rej H0
c. M = $7.5 - 1.645 \times 1.125 = 5.649$, 3.2 < 5.649, Rej H0
Reinforcement does appear to improve learning

8. H0: $\mu = 1000$; Ha: $\mu > 1000$
z = $(1024 - 1000) / (50/\sqrt{24}) = 24/10.206 = 2.35$
a. $\alpha = .05$, z = 1.645, 2.35 > 1.645, Rej H0 & Acc Ha
b. $p(z \geq 2.35) = .0094 < .05$, Rej H0
c. M = $1000 + 1.645 \times 10.206 = 1016.79$, 1024 > 1016.79, Rej H0
The new site appears to be older than other sites

9. H0: $\mu = 95$; Ha: $\mu > 95$
z = $(98.4 - 95) / (8/\sqrt{38}) = 3.4/1.298 = 2.62$
a. $\alpha = .02$, z = 2.05, 2.62 > 2.05, Rej H0 & Acc Ha
b. $p(z \geq 2.62) = .0044 < .02$, Rej H0
c. M = $95 + 2.05 \times 1.298 = 97.66$, 98.4 > 97.66, Rej H0
Yes, the program appears effective at increasing motivation

10. Directional (One-tailed), if we assume inmates are higher on Psychopathic Deviance
H0: $\mu = 50$; Ha: $\mu > 50$
z = $(56.7 - 50) / (10/\sqrt{10}) = 6.7/3.162 = 2.12$
a. $\alpha = .05$, z = 1.645, 2.12 > 1.645, Rej H0 & Acc Ha
b. $p(z \geq 2.12) = .0170 < .05$, Rej H0
c. M = $50 + 1.645 \times 3.162 = 55.20$, 56.7 > 55.20, Rej H0
Inmates have higher PD scores than general population

OR Nondirectional (Two-tailed), if we assume inmates could be higher or lower on PD
H0: $\mu = 50$; Ha: $\mu \neq 50$
z = $(56.7 - 50) / (10/\sqrt{10}) = 6.7/3.162 = 2.12$ (same as above)
a. $\alpha = .05$, $\alpha/2 = .025$, z = $\pm 1.96$, 2.12 > +1.96, Rej H0 & Acc Ha
b. $p(z < -2.12 \text{ or } z > +2.12) = .0170 + .0170 = .034 < .05$, Rej H0
c. M = $50 \pm 1.96 \times 3.162 = 43.80, 56.20$; 56.7 > 56.20, Rej H0
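The two problems marked as tricky (3 and 5) do not follow the usual test recipe, so a short check may help. The Python sketch below is an illustration added here, with variable names chosen for this example: Problem 3 rearranges the z formula to solve for the population mean and standard deviation, and Problem 5 converts the 500 kg total into a mean per parcel before finding the probability. It uses `statistics.NormalDist` instead of the printed table, so the probability for Problem 5 is computed from the unrounded z and differs slightly from the table value of .4483.

```python
from math import sqrt
from statistics import NormalDist

# Problem 3: work backwards from z = (M - mu) / SE and SE = sigma / sqrt(n)
n3, mean3, z3, se3 = 44, 438, -1.75, 9
mu3 = mean3 - z3 * se3      # population mean
sigma3 = se3 * sqrt(n3)     # population standard deviation

# Problem 5: a total above 500 kg for 40 parcels means a mean above 12.5 kg
mu5, sigma5, n5, total5 = 12.4, 4.8, 40, 500
mean5 = total5 / n5
z5 = (mean5 - mu5) / (sigma5 / sqrt(n5))
p5 = 1 - NormalDist().cdf(z5)   # upper-tail probability

print("Problem 3: mu =", mu3, " sigma =", round(sigma3, 2))
print("Problem 5: mean =", mean5, " z =", round(z5, 2), " p =", round(p5, 3))
```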

T-TEST FOR HYPOTHESIS ABOUT $\mu$

Previously we tested the hypothesis that a sample comes from a population with a specified $\mu$ using the normal distribution and a z-test. But the z-test required the population