Hypothesis Testing

For the next few lectures, we're going to look at various test statistics that are formulated to allow us to test hypotheses in a variety of contexts. In all cases, the hypothesis testing approach uses the same multi-step process:

1. State the null hypothesis (H0)
2. State the alternative hypothesis (HA)
3. Choose α, our significance level
4. Select a statistical test, and calculate the test statistic
5. Determine the critical value at which H0 will be rejected
6. Compare the test statistic with the critical value

What differs is the formulation of the test statistic (which distribution / which formula).
Hypothesis Testing - Tests

4. Select a statistical test, and calculate the test statistic

To test the hypothesis, we must construct a test statistic, which frequently takes the form:

    test statistic = (θ - θ0) / (std. error)

For example, using the normal distribution, the basic z-test is formulated as:

    z = (x̄ - µ) / σ_x̄

where σ_x̄ = σ/√n when σ is known, or σ_x̄ ≈ s/√n when we have to estimate the standard deviation from the sample data.
Hypothesis Testing - One-Sample Z-test

The example we looked at in the last lecture used the one-sample z-test, which is formulated as:

    Z_test = (x̄ - µ) / (σ/√n)    (difference between means, over the standard error)

We use this test statistic:

1. To compare a sample mean to the population mean
2. When the size of the sample is reasonably large, i.e. n > 30
3. When the population standard deviation is known (although we can estimate it from the sample standard deviation), so that we can use this value to calculate the standard error in the denominator
Hypothesis Testing - One-Sample Z-test Example

Data: Acidity data has been collected for a population of ~6000 lakes in Ontario, with a mean pH of µ = 6.69 and a standard deviation of σ = 0.83. A group of 50 lakes in a particular region of Ontario with acidic conditions is sampled and is found to have a mean pH of x̄ = 6.16 and s = 0.60.

Research question: Are the lakes in that particular region more acidic than the lakes throughout Ontario?

1. H0: x̄ = µ (no significant difference in acidity)
2. HA: x̄ < µ (sample lakes are significantly more acidic)
3. Select α = 0.05, one-tailed because of how the alternative hypothesis is formulated
Hypothesis Testing - One-Sample Z-test Example

4. The test statistic is formulated as:

    Z_test = (x̄ - µ) / (σ/√n) = (6.16 - 6.69) / (0.83/√50) = -0.53 / 0.1174 = -4.52

5. For α = 0.05 and a one-tailed (lower-tail) test, Z_crit = -1.645
6. Z_test < Z_crit (equivalently, |Z_test| > 1.645), therefore we reject H0 and accept HA, finding that there is a significant difference between the sample mean and the population mean: the 50 sampled lakes are significantly more acidic than the population of ~6000 lakes from throughout Ontario
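As a check on the arithmetic above, the whole one-sample z-test can be sketched in a few lines of Python (a minimal sketch, assuming scipy is available; scipy.stats.norm.ppf supplies the critical value we would otherwise read from a table):

```python
from math import sqrt
from scipy.stats import norm

# One-sample z-test for the Ontario lakes example (values from the slide)
mu, sigma = 6.69, 0.83       # population mean pH and standard deviation
x_bar, n = 6.16, 50          # sample mean pH and sample size

se = sigma / sqrt(n)         # standard error of the mean
z_test = (x_bar - mu) / se

alpha = 0.05
z_crit = norm.ppf(alpha)     # lower-tail critical value for a one-tailed test

print(round(z_test, 2))      # -4.52
print(round(z_crit, 3))      # -1.645
print(z_test < z_crit)       # True -> reject H0
```

Note the sign convention: for a lower-tail test we compare the (negative) test statistic against the negative critical value, which is equivalent to comparing magnitudes as the slide does.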
Hypothesis Testing - One-Sample Z-test for Proportions

We can also set up this test to see whether a proportion is different from some hypothesized value:

    Z_test = (p̂ - p0) / √(p0(1 - p0)/n)

Basically the same test as before. We use this test statistic:

1. To compare a sample proportion to the population's
2. When the size of the sample is reasonably large, i.e. n > 30
3. Here the standard error in the denominator is calculated directly from the hypothesized proportion p0, so no separate estimate of the population standard deviation is needed
Hypothesis Testing - One-Sample Z-test for Proportions Example

Data: A citywide survey finds that the proportion of households that own cars is p0 = 0.2. We survey 50 households and find that 16 of them own a car (p̂ = 16/50 = 0.32).

Research question: Is the proportion of households in our survey that own a car different from the proportion found in the citywide survey?

1. H0: p̂ = p0 (no significant difference in car ownership)
2. HA: p̂ ≠ p0 (a significant difference in car ownership between our small sample and the citywide figure)
3. Select α = 0.05, two-tailed because of how the alternative hypothesis is formulated
Hypothesis Testing - One-Sample Z-test for Proportions Example

4. The test statistic is formulated as:

    Z_test = (p̂ - p0) / √(p0(1 - p0)/n) = (0.32 - 0.2) / √((0.2)(0.8)/50) = 0.12 / 0.0566 = 2.12

5. For α = 0.05 and a two-tailed test, Z_crit = 1.96
6. Z_test > Z_crit, therefore we reject H0 and accept HA, finding that there is a significant difference between the proportion of households that own a car in the neighborhood that we surveyed and the proportion of car-owning households found throughout the city
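The proportions version of the test follows the same pattern; a minimal sketch in Python (again assuming scipy for the critical value):

```python
from math import sqrt
from scipy.stats import norm

# One-sample z-test for a proportion (car-ownership example from the slide)
p0 = 0.20                   # citywide proportion under H0
p_hat, n = 16 / 50, 50      # sample proportion and sample size

se = sqrt(p0 * (1 - p0) / n)       # standard error computed under H0
z_test = (p_hat - p0) / se

alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)   # two-tailed critical value

print(round(se, 4))         # 0.0566
print(round(z_test, 2))     # 2.12
print(abs(z_test) > z_crit) # True -> reject H0
```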
Hypothesis Testing - Smaller Samples

The z-test methods that we have looked at so far are applicable in the rare instances where we have both:

1. A large sample (n > 30) which we wish to compare to the population from which it was drawn, AND
2. Complete knowledge of that population, including its parameters (i.e. its mean and standard deviation)

Far more frequently, we find ourselves working from samples smaller than n = 30, and we very seldom have much information about the parameters of the population from which a sample was drawn (otherwise, we might not have bothered sampling at all).
Hypothesis Testing - Smaller Samples

With smaller samples, and lacking the population parameters, we make use of t-tests rather than z-tests; these in turn draw their critical values from the family of t-distributions, which are closely related to the normal distribution.

Given our small sample, the sampling distribution of the mean for samples of this size is no longer normally distributed (even though the population from which they are drawn should be normally distributed). Instead, we select critical values from a t-distribution whose shape is appropriate to our sample size, as selected by using a t-distribution with the appropriate degrees of freedom.
Hypothesis Testing - Degrees of Freedom

You can think of degrees of freedom as the number of observations (i.e. the sample size) minus the number of quantities being estimated (usually one, but not always). Given a sample of size n = 34, if we wish to estimate the mean, then the degrees of freedom are calculated as df = (n - 1) = (34 - 1) = 33.

We have n observations and we use up one degree of freedom when we estimate the mean. We have n - 1 degrees of freedom in this example because, if I tell you every observation value except one of them, and I tell you the mean, you can tell me the value of the missing observation.
Hypothesis Testing - t-distributions

T-distributions are very similar in shape to the normal distribution, but the tails of the distribution contain a larger proportion of the total area than those of a normal distribution, and the peak of the distribution is slightly flatter in t-distributions than in the normal case.

This allows t-distributions to account for the higher variability we expect to find when using smaller sample sizes (i.e. it reflects the uncertainty introduced by estimating σ from s when n is small; as n grows large, the t-distribution converges to the normal distribution).
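The heavier tails are easy to see by comparing critical values: for the same two-tailed α = 0.05, the t critical value is noticeably larger than the normal one at small df and shrinks toward 1.96 as df grows. A quick sketch (assuming scipy is available):

```python
from scipy.stats import norm, t

# Two-tailed 5% critical values: the t-distribution's heavier tails push
# the critical value outward for small df, converging to the normal value
z_crit = norm.ppf(0.975)
t_crits = {df: t.ppf(0.975, df) for df in (5, 10, 30, 100)}

print(round(z_crit, 3))            # 1.96
for df, tc in t_crits.items():
    print(df, round(tc, 3))        # e.g. df=5 gives ~2.571
```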
Hypothesis Testing - t-distributions

[Figure: comparison of the normal distribution and the t-distribution. Source: Earickson, RJ, and Harlin, JM. 1994. Geographic Measurement and Quantitative Analysis. USA: Macmillan College Publishing Co., p. 1066.]
Hypothesis Testing - t-tests

We can formulate t-tests to address a range of different situations:

1. The basic one-sample t-test is used in much the same way as the basic z-test, except in instances where the sample size is less than or equal to 30
2. We can use a properly formulated t-test to compare the mean statistics derived from a pair of samples, to assess whether the samples are drawn from the same population and/or whether they are significantly different from one another, using a two-sample t-test for the mean
Hypothesis Testing - t-tests

3. In a case where we are looking at paired observations (i.e. two samples collected in such a way that each observation in one sample matches up with its counterpart in the other sample), there is a reduction in the independence of the observations, and this needs to be taken into account in the formulation of the t-test, which in this case is known as a paired comparison or matched pairs t-test (e.g. measuring soil moisture at a set of 10 locations before and after a rainfall event)
Hypothesis Testing - One-Sample t-test

The one-sample t-test is formulated very much like the one-sample Z-test we looked at earlier:

    t_test = (x̄ - µ) / (s/√n)    (difference between means, over the standard error)

We use this test statistic:

1. To compare a sample mean to the population mean
2. When the size of the sample is somewhat small, i.e. n ≤ 30
3. We do not need to know the population standard deviation to calculate the standard error, although we still need to know the population mean for purposes of comparison with the sample mean
Hypothesis Testing - One-Sample t-test Example

Data: Suppose we sampled only 25 lakes in that region of Ontario rather than 50, and found a mean pH of x̄ = 6.27 and s = 0.75 (the population parameters of the ~6000 lakes in Ontario remain µ = 6.69, σ = 0.83). We can still make the same sort of comparison as before, only using a t-test.

Research question: Are the lakes in that particular region more acidic than the lakes throughout Ontario?

1. H0: x̄ = µ (no significant difference in acidity)
2. HA: x̄ < µ (sample lakes are significantly more acidic)
3. Select α = 0.05, one-tailed because of how the alternative hypothesis is formulated
Hypothesis Testing - One-Sample t-test Example

4. The test statistic is formulated as:

    t_test = (x̄ - µ) / (s/√n) = (6.27 - 6.69) / (0.75/√25) = -0.42 / 0.15 = -2.8

5. We now need to find the critical t-score, which is a slightly more involved procedure, because we need to take the sample size / degrees of freedom into account: df = (n - 1) = (25 - 1) = 24. We can now look up the t_crit value in the appropriate table (A.3, p. 215 in the Rogerson text) for our α and df.
Hypothesis Testing - One-Sample t-test Example

The t-distribution table works a little differently from the standard normal table: find the appropriate degrees of freedom on the left side, then move over until you find the column for the selected α level of significance (in this case df = 24 and α = 0.05, giving t_crit = 1.711).
Hypothesis Testing - One-Sample t-test Example

5. Cont. - The table of t-scores provided in Rogerson gives t-scores for one-sided tests, so to find the t-value for a two-sided test, simply halve the α level and use the value from the appropriate column (i.e. supposing we had formulated a two-sided alternative hypothesis here, still with α = 0.05 and 24 degrees of freedom, we would have used the t-score value in the α = 0.025 column [t = 2.064], since that would be the proportion in each tail)

6. |t_test| = 2.8 > t_crit = 1.711, therefore we reject H0 and accept HA, finding that there is a significant difference between the sample mean and the population mean: the sample of 25 lakes is also significantly more acidic than the population of ~6000 lakes from throughout Ontario
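The same lookup can be done in code rather than from Table A.3; a minimal sketch of the 25-lake test (assuming scipy is available):

```python
from math import sqrt
from scipy.stats import t

# One-sample t-test for the 25-lake example (values from the slide)
mu = 6.69                     # population mean pH
x_bar, s, n = 6.27, 0.75, 25  # sample mean, sample sd, sample size

se = s / sqrt(n)
t_test = (x_bar - mu) / se

df = n - 1
alpha = 0.05
t_crit = t.ppf(alpha, df)     # lower-tail critical value (one-tailed test)

print(round(t_test, 2))       # -2.8
print(round(t_crit, 3))       # -1.711
print(t_test < t_crit)        # True -> reject H0
```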
Hypothesis Testing - Two-Sample t-tests

Two-sample t-tests are used to compare one sample mean with another sample mean, rather than with a population parameter. The form of the two-sample t-test that is appropriate depends on whether or not we can treat the variances of the two samples as being equal.

If the variances can be assumed to be equal (a condition called homoscedasticity), the t-statistic is:

    t_test = (x̄1 - x̄2) / (s_p √(1/n1 + 1/n2))

where s_p is the pooled estimate of the standard deviation:

    s_p = √[ ((n1 - 1)s1² + (n2 - 1)s2²) / (n1 + n2 - 2) ]
Hypothesis Testing - Two-Sample t-tests

Two-sample t-tests that use the equal-variance assumption have degrees of freedom equal to the sum of the number of observations in the two samples, less two, since we are estimating the values of two means here: df = (n1 + n2 - 2).

If we cannot assume that the two samples have equal variances, the appropriate t-statistic takes a slightly different form, since we cannot produce a pooled estimate for the standard error portion of the statistic:

    t_test = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2)
Hypothesis Testing - Two-Sample t-tests

Unfortunately, in the heteroscedastic case (where the variances are unequal), calculating the degrees of freedom appropriate for the critical t-score uses a somewhat involved formula (equation 3.17 on p. 50). As an alternative, Rogerson suggests using the lesser of n1 - 1 and n2 - 1:

    df = min[(n1 - 1), (n2 - 1)]

on the grounds that this value will always be lower than that produced by the involved calculation, and thus will produce a higher t_crit score at the selected α; this is a conservative choice because it makes it harder to mistakenly reject the null hypothesis and commit a Type I error.
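To see how the shortcut compares with the involved calculation, here is a sketch that computes both, assuming equation 3.17 is the standard Welch-Satterthwaite approximation (the usual df formula for the unequal-variance case):

```python
# Welch-Satterthwaite degrees of freedom versus the conservative
# min(n1 - 1, n2 - 1) shortcut; the shortcut is never larger
def welch_df(s1, n1, s2, n2):
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# illustrative values (the swimming survey figures from later in the lecture)
s1, n1 = 19.88, 8
s2, n2 = 12.66, 8

df_welch = welch_df(s1, n1, s2, n2)
df_min = min(n1 - 1, n2 - 1)

print(round(df_welch, 1))   # falls between df_min and n1 + n2 - 2
print(df_min)               # 7
print(df_min <= df_welch)   # True: the shortcut is conservative
```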
Hypothesis Testing - F-test

In order to decide whether the variances of two samples are similar enough, or different enough, to warrant the use of one form of the two-sample t-test or the other, we have a further statistical test that we use to compare the variances.

The F-test, a.k.a. the variance ratio test, assesses whether or not the variances are equal by computing a test statistic of the form:

    F_test = s1² / s2²

Critical values are taken from the F-distribution, which has a two-dimensional array of degrees of freedom (i.e. n1 - 1 df in the numerator, n2 - 1 df in the denominator).
Hypothesis Testing - F-test

Table A.5 on pp. 218-220 gives F-distributions for 3 α levels (0.10, 0.05, and 0.01). E.g. selecting α = 0.05, given samples of n1 = 11 and n2 = 15, we would use df1 = 10, df2 = 14.
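The same lookup can be done without the printed table; a minimal sketch using scipy's F-distribution (assuming scipy is available):

```python
from scipy.stats import f

# Looking up an F critical value in place of Table A.5:
# numerator df = n1 - 1, denominator df = n2 - 1
alpha = 0.05
n1, n2 = 11, 15
f_crit = f.ppf(1 - alpha, n1 - 1, n2 - 1)   # upper-tail critical value

print(round(f_crit, 2))   # F_crit for df = (10, 14), roughly 2.6
```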
Hypothesis Testing - Two-Sample t-test Example

Data: Suppose we are interested in comparing the annual swimming frequencies of people who live in the central city versus the suburban part of a city, so we do a survey and sample 8 neighborhoods in each area (x̄1 = 48.63, s1 = 19.88, x̄2 = 63.63, s2 = 12.66).

Research question: Do people in one of the two parts of the city swim more often than people in the other part?

1. H0: x̄1 = x̄2 (no significant difference in swimming frequency)
2. HA: x̄1 ≠ x̄2 (people in one part of the city swim more)
3. Select α = 0.05, two-tailed because of how the alternative hypothesis is formulated (we have no prior expectation of one area swimming more than the other)
Hypothesis Testing - Two-Sample t-test Example

4. Before we can formulate our t-test statistic, we need to check whether we can make the equal-variances assumption, using the F-test:

    F_test = s1² / s2² = 19.88² / 12.66² = 395.21 / 160.28 = 2.47

Looking in Table A.5 for α = 0.05 and df = (7, 7), we find that F_crit = 3.79. Because F_test < F_crit, we do not reject the null hypothesis of equal variances, allowing us to use the homoscedastic form of the two-sample t-test:

    t_test = (x̄1 - x̄2) / (s_p √(1/n1 + 1/n2))
Hypothesis Testing - Two-Sample t-test Example

4. Cont. - First, we must calculate the pooled estimate of the standard deviation (s_p):

    s_p = √[ ((n1 - 1)s1² + (n2 - 1)s2²) / (n1 + n2 - 2) ]
        = √[ ((8 - 1)(19.88²) + (8 - 1)(12.66²)) / (8 + 8 - 2) ]
        = √[ ((7)(395.21) + (7)(160.28)) / 14 ]
        = 16.67
Hypothesis Testing - Two-Sample t-test Example

4. Cont. - Now, we calculate the test statistic:

    t_test = (x̄1 - x̄2) / (s_p √(1/n1 + 1/n2)) = (48.63 - 63.63) / (16.67 √(1/8 + 1/8)) = -15 / 8.33 = -1.8

5. We now need to find the critical t-score, first calculating the degrees of freedom: df = (n1 + n2 - 2) = (8 + 8 - 2) = 14. We can now look up the t_crit value for our α (0.025 in each tail) and df = 14: t_crit = 2.145
Hypothesis Testing - Two-Sample t-test Example

6. |t_test| = 1.8 < t_crit = 2.145, therefore we fail to reject H0, finding that there is no significant difference between the frequency of swimming in the two parts of the city
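The full two-sample procedure, F-test first and then the pooled t-test, can be sketched from the summary statistics alone (assuming scipy is available; scipy.stats.ttest_ind_from_stats accepts means, standard deviations, and sample sizes directly):

```python
from scipy.stats import f, ttest_ind_from_stats

# Two-sample t-test for the swimming example, from summary statistics
x1, s1, n1 = 48.63, 19.88, 8   # central city
x2, s2, n2 = 63.63, 12.66, 8   # suburbs

# Step 1: F-test for equal variances (larger variance in the numerator)
f_test = s1**2 / s2**2
f_crit = f.ppf(0.95, n1 - 1, n2 - 1)
equal_var = f_test < f_crit    # True here -> pooled (homoscedastic) form

# Step 2: two-sample t-test using the appropriate form
t_test, p_value = ttest_ind_from_stats(x1, s1, n1, x2, s2, n2,
                                       equal_var=equal_var)

print(round(f_test, 2))    # 2.47
print(round(t_test, 2))    # -1.8
print(p_value > 0.05)      # True -> fail to reject H0
```

Working with the p-value (p > α means fail to reject) is equivalent to comparing |t_test| against the tabled t_crit = 2.145 at df = 14.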