Sociology 301
Hypothesis Testing + t-test for Comparing Means
Liying Luo
04.14

Hypothesis Testing: Steps
1. State the hypotheses
2. Choose an alpha level and determine the critical value
3. Compute a test statistic
4. Make a decision by comparing the test statistic to the critical value
5. State a technical decision and a substantive conclusion

Hypothesis Testing
A random sample of 100 UD students has a mean GPA of 3.2.
Example 1: If the true mean GPA of all UD students is below 3.0, how likely would we be to observe a mean of 3.2 in a random sample?
Example 2: If the true mean GPA of all UD students is 3.0, how likely would we be to observe a mean of 3.2 in a random sample?
Hypothesis Testing: 1. State the hypotheses
A random sample of 100 UD students has a mean GPA of 3.2.
Example 1:
H0: UDel mean GPA is below 3.00 (H0: μ < 3.00)
H1: UDel mean GPA is 3.00 or higher (H1: μ ≥ 3.00)
Example 2:
H0: UDel mean GPA is 3.00 (H0: μ = 3.00)
H1: UDel mean GPA is not equal to 3.00 (H1: μ ≠ 3.00)

Hypothesis Testing: 2. Choose an alpha level and determine the critical value
[Figure: one-tail hypothesis vs. two-tail hypothesis]
Critical value: the minimum value of a distribution necessary to designate an alpha area, i.e., how large the test statistic must be in order to reject the null hypothesis at the given α level.
The critical value depends on your choice of alpha level and the type of hypothesis (one-tail vs. two-tail).
Z_α denotes the critical value for a one-tail test.
Z_{α/2} denotes the critical value for a two-tail test.
Choose an alpha level and identify the critical value for the following tests:
Example 1: H0: μ < 3.00; H1: μ ≥ 3.00
Example 2: H0: μ = 3.00; H1: μ ≠ 3.00
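The critical values for the two GPA examples can be looked up in a Z table or computed. The sketch below is one way to do it with Python's standard library. The alpha level of 0.05 is an assumption made for illustration (the slide leaves the choice to you), and the helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha: float, two_tail: bool) -> float:
    """Return the positive Z critical value for a given alpha level.

    A two-tail test splits alpha across both tails (Z_{alpha/2});
    a one-tail test puts all of alpha in one tail (Z_alpha).
    """
    tail_area = alpha / 2 if two_tail else alpha
    return NormalDist().inv_cdf(1 - tail_area)

alpha = 0.05  # assumed alpha level, chosen only for illustration

z_one_tail = critical_value(alpha, two_tail=False)  # Example 1 (one-tail)
z_two_tail = critical_value(alpha, two_tail=True)   # Example 2 (two-tail)

print(f"one-tail Z_alpha     = {z_one_tail:.3f}")   # 1.645
print(f"two-tail Z_(alpha/2) = {z_two_tail:.3f}")   # 1.960
```

At the same alpha level, the two-tail critical value is larger because only half of alpha sits in each tail.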
Hypothesis Testing: 3. Compute a test statistic
When the population standard deviation σ is known, use the Z test:
Z = (Ȳ − μ0) / (σ / √n)
where Ȳ is the sample mean, μ0 is the population mean stated in H0, and n is the sample size.

Suppose that the standard deviation for the population is σ = 0.5. What is the Z-score of a sample mean of 3.2 with sample size n = 100?
Example 1: H0: μ < 3.00; H1: μ ≥ 3.00
Example 2: H0: μ = 3.00; H1: μ ≠ 3.00

Hypothesis Testing: 4. Make a decision by comparing the test statistic to the critical value
Rule: reject H0 if the absolute value of your test statistic is larger than the critical value:
|Z| > Z_α (one-tail) or |Z| > Z_{α/2} (two-tail)
[Figure: one-tail vs. two-tail]
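Steps 3 and 4 for the GPA example can be sketched as follows. The numbers (sample mean 3.2, μ0 = 3.00, σ = 0.5, n = 100) come from the slides. The alpha level of 0.05, with critical values 1.645 and 1.96, is an assumption for illustration, and the function name `z_statistic` is hypothetical.

```python
import math

def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Z test statistic for one mean when the population sigma is known."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# Values from the GPA example on the slides.
z = z_statistic(sample_mean=3.2, mu0=3.0, sigma=0.5, n=100)
print(f"Z = {z:.2f}")  # Z = 4.00

# Assumed alpha = 0.05: Z_alpha = 1.645 (one-tail), Z_(alpha/2) = 1.96 (two-tail).
reject_example_1 = abs(z) > 1.645  # one-tail test
reject_example_2 = abs(z) > 1.96   # two-tail test
print("Example 1: reject H0" if reject_example_1 else "Example 1: fail to reject H0")
print("Example 2: reject H0" if reject_example_2 else "Example 2: fail to reject H0")
```

The standard error is 0.5 / √100 = 0.05, so a sample mean of 3.2 lies four standard errors above 3.00, well beyond either critical value.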
Applying the rule to the GPA examples: suppose that the standard deviation for the population is σ = 0.5. What is the Z-score of a sample mean of 3.2 with sample size n = 100, and is it larger than the critical value?
Example 1: H0: μ < 3.00; H1: μ ≥ 3.00
Example 2: H0: μ = 3.00; H1: μ ≠ 3.00

Hypothesis Testing: 5. State your conclusion
Technically: At that alpha level, we reject or fail to reject H0.
Substantively: At that alpha level, we do or do not have sufficient evidence to conclude that the research hypothesis (H1) holds.
State both conclusions for Example 1 and Example 2.
Example 3
In 2007 the U.S. National Transportation Safety Board set a 5-year goal of having more than 95% of all American drivers use their seatbelts. To see whether they are on target for meeting that goal, they randomly sampled 1,000 American drivers in 2012. They found that 962, or 96.2%, of the 1,000 drivers they sampled use their seatbelts.
Apply the five steps; in step 5, state a technical decision and a substantive conclusion.

Worksheet
A veterinarian claims that 6% of cats have FIDS (Feline Immune Deficiency Syndrome). To evaluate this claim, researchers randomly sampled 320 cats. They found that 26, or 8.1%, of the 320 cats have FIDS. Is this evidence sufficient to confidently conclude that the population proportion of cats who have FIDS is different from 0.06?
Apply the five steps; in step 5, state a technical decision and a substantive conclusion.

Hypothesis Testing
When the population standard deviation σ is unknown, use the t test.
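Looking back at Example 3 and the worksheet, both test a proportion rather than a mean. A minimal sketch is given below. It assumes the usual large-sample Z test for a proportion, with standard error √(p0(1 − p0)/n) computed under H0. That formula and the alpha level of 0.05 are assumptions, since neither is stated on the slides, and `proportion_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def proportion_z(p_hat: float, p0: float, n: int) -> float:
    """Large-sample Z statistic for one proportion (assumed formula).

    The standard error uses p0, the proportion stated in H0.
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

alpha = 0.05  # assumed alpha level, for illustration only
z_one_tail = NormalDist().inv_cdf(1 - alpha)       # about 1.645
z_two_tail = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.960

# Example 3: is the seatbelt proportion above 0.95? (one-tail)
z_seatbelt = proportion_z(p_hat=962 / 1000, p0=0.95, n=1000)
reject_seatbelt = z_seatbelt > z_one_tail

# Worksheet: is the FIDS proportion different from 0.06? (two-tail)
z_cats = proportion_z(p_hat=26 / 320, p0=0.06, n=320)
reject_cats = abs(z_cats) > z_two_tail

print(f"seatbelts: Z = {z_seatbelt:.2f}, reject H0: {reject_seatbelt}")
print(f"cats:      Z = {z_cats:.2f}, reject H0: {reject_cats}")
```

Under these assumptions, the seatbelt statistic exceeds the one-tail critical value while the FIDS statistic falls short of the two-tail critical value. A different alpha level could change either decision.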
Example 4
A 2001 census of penguins found that there were 12.1 penguins per square mile in Antarctica. Now researchers are interested in seeing how the size of the population of penguins in Antarctica has changed. They randomly sampled 200 square-mile sections in Antarctica and observed the number of penguins that lived on each section. In their sample they observed a mean of 11.7 penguins per square mile, with s_Y = 2.3 penguins. Is this evidence sufficient to confidently conclude that the size of the penguin population has changed?
Apply the five steps; in step 5, state your decision.

t-test for Comparing Means
One-sample hypothesis test: make inferences about one population.
Two-sample hypothesis test: make inferences about two populations.
The t-test for comparing means is very similar to the one-sample t-test, but it uses different formulas for standard errors.

t-test for Comparing Means
Basic question: Are the two (or more) groups different?
Examples:
beer preference between east coasters and west coasters
gender wage gap
difference in sleep time across ethnic groups
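For Example 4 (the penguin census), the one-sample t statistic can be sketched as below. The sample values come from the slide. The alpha level of 0.05 is an assumption. Because the standard library has no t distribution, the critical value is taken from the normal distribution as a large-sample approximation (n = 200). The name `one_sample_t` is hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_t(sample_mean: float, mu0: float, s: float, n: int) -> float:
    """t statistic for one mean when sigma is unknown and s estimates it."""
    estimated_standard_error = s / math.sqrt(n)
    return (sample_mean - mu0) / estimated_standard_error

# Values from Example 4 on the slide.
t = one_sample_t(sample_mean=11.7, mu0=12.1, s=2.3, n=200)

alpha = 0.05  # assumed alpha level, for illustration only
# "Has changed" is a two-tail hypothesis; with n = 200 the normal
# critical value is used as an approximation to the t critical value.
critical = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(t) > critical
print(f"t = {t:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

The negative sign of t only indicates that the sample mean is below 12.1. The two-tail rule compares its absolute value with the critical value.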
t-test for Comparing Means
Basic question: Are the two (or more) groups different in their means?
The difference between two independent sample means, Ȳ1 − Ȳ2, is a sample statistic, and the Central Limit Theorem applies. This means that Ȳ1 − Ȳ2 is normally distributed with mean μ1 − μ2 and standard error √(σ1²/n1 + σ2²/n2).

t-test for Comparing Means
Basic question: How likely are we to observe such a difference if the true population means do not differ? That is, is there a statistically significant difference?

t-test for Comparing Means
Research question: Do women have higher vocabulary scores than men?
For the vocabulary test in the General Social Survey (GSS): is the difference statistically significant?
t-test for Comparing Means: Steps
1. State the null and research hypotheses
2. Decide the alpha level and critical value
3. Compute the test statistic
4. Compare the test statistic to the critical value
5. State a technical decision and a substantive conclusion

Comparing Means: 1. State the Null and Research Hypotheses
Basic question: Are the two (or more) groups different?
One-tail test:
H0: μ1 − μ2 ≤ 0 or μ1 − μ2 ≥ 0
H1: μ1 − μ2 > 0 or μ1 − μ2 < 0
Two-tail test:
H0: μ1 − μ2 = 0
H1: μ1 − μ2 ≠ 0

Comparing Means: 1. State the Null and Research Hypotheses
Do women have higher vocabulary scores than men? One-tail or two-tail test?
This is a one-tail test:
H0: Women have equal or lower vocabulary scores than men. (H0: μw − μm ≤ 0)
H1: Women have higher vocabulary scores than men. (H1: μw − μm > 0)

Comparing Means: 2. Decide the Alpha Level and Critical Value
Alpha level: the probability of making a Type I error. Let's use α = 0.05.
The critical value depends on your choice of alpha level and the type of hypothesis (one-tail vs. two-tail).
For large samples (n > 50), such as the GSS vocabulary example, the critical value can be taken from the Z distribution.
Comparing Means: 2. Decide the Alpha Level and Critical Value
Do women have higher vocabulary scores than men?
α = 0.05 and Z_α = 1.65

Comparing Means: 3. Compute the Test Statistic
According to the Central Limit Theorem, when the population standard deviation σ is known, the test statistic is:
Z = (Ȳ1 − Ȳ2) / (standard error of Ȳ1 − Ȳ2)
When the population standard deviation σ is unknown, the standard error is estimated from the sample standard deviations, and the test statistic is:
t = (Ȳ1 − Ȳ2) / (estimated standard error of Ȳ1 − Ȳ2)
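One way to compute the two-sample statistic is sketched below. It assumes the unpooled standard error √(s1²/n1 + s2²/n2), which may differ from the formula on the original slide (a pooled-variance version also exists). The group means, standard deviations, and sample sizes in the usage lines are hypothetical placeholders, not GSS results, and `two_sample_t` is a hypothetical name.

```python
import math

def two_sample_t(mean1: float, s1: float, n1: int,
                 mean2: float, s2: float, n2: int) -> float:
    """t statistic for the difference between two independent means.

    Assumes the unpooled standard error sqrt(s1^2/n1 + s2^2/n2);
    H0 states that the population means do not differ.
    """
    estimated_standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / estimated_standard_error

# HYPOTHETICAL numbers for illustration only (not taken from the GSS).
t = two_sample_t(mean1=6.2, s1=2.0, n1=800,   # group 1, e.g. women
                 mean2=6.0, s2=2.1, n2=700)   # group 2, e.g. men

critical = 1.65  # one-tail Z_alpha at alpha = 0.05, as given on the slide
print(f"t = {t:.2f}; reject H0: {t > critical}")
```

Because the research hypothesis is directional (μw − μm > 0), the women's mean is entered as group 1, so a positive t supports H1.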
Comparing Means: 3. Compute the Test Statistic
For the vocabulary test in the GSS

Comparing Means: 4. Compare the Test Statistic to the Critical Value
Rule: reject H0 if the absolute value of your test statistic is larger than the critical value.
For the vocabulary test in the GSS, compare t to 1.65.

Comparing Means: 5. State the Conclusion
For the vocabulary test in the GSS:
At the 0.05 level, we reject the null hypothesis.
At the 0.05 level, we have sufficient evidence to conclude that women have better vocabularies than men.