Chapter 3. Comparing two populations

Contents
- Hypothesis for the difference between two population means: matched pairs
- Hypothesis for the difference between two population means: independent samples
  - Two normal populations with equal (unknown) variances
  - Two normal populations with known variances
  - Two nonnormal populations with unknown variances and large samples
  - Two Bernoulli populations
- Hypothesis for the ratio of two population variances: independent samples

Learning goals
At the end of this chapter you should be able to:
- Perform a test of hypothesis for the difference between two population means and for the ratio of two population variances
- Construct confidence intervals for the difference/ratio
- Distinguish situations where a test based on matched pairs is suitable from those where a test based on independent samples is
- Calculate the power of a test and the probability of Type II Error

References
- Newbold, P. Statistics for Business and Economics. Chapter 9 (Sections 9.6-9.9)
- Ross, S. Chapter 10

Introduction
In this chapter, we examine the case where, instead of one random sample, two random samples are available from two populations, and the quantities of interest are:
- the difference between two population means
  - case of matched pairs
  - case of independent samples
- the ratio between two population variances
  - case of independent samples
We will draw on our experience from Chapters 1 and 2 to construct confidence intervals and perform tests of hypothesis for the abovementioned differences/ratios of population parameters.

Tests for the difference between two means: matched pairs
Example: In a study aimed at assessing the relationship between a subject's brain activity while watching a TV commercial and the subject's subsequent ability to recall the contents of the commercial, subjects were shown commercials for two brands of each of ten products. For each commercial, the ability to recall 24 hours later was measured, and each member of a pair of commercials was then designated high-recall or low-recall. The table below shows an index of the total amount of brain activity of subjects while watching these commercials.

| product $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| high-recall $x_i$ | 137 | 135 | 83 | 125 | 47 | 46 | 114 | 157 | 57 | 144 |
| low-recall $y_i$ | 53 | 114 | 81 | 86 | 34 | 66 | 89 | 113 | 88 | 111 |
| diff. $d_i = x_i - y_i$ | 84 | 21 | 2 | 39 | 13 | -20 | 25 | 44 | -31 | 33 |

Tests for the difference between two means: matched pairs
Let $X$ be a population with mean $\mu_X$ and $Y$ be a population with mean $\mu_Y$. Suppose we have a random sample of $n$ matched pairs of observations from these two populations, and let $d_1 = x_1 - y_1$, $d_2 = x_2 - y_2$, ..., $d_n = x_n - y_n$ represent the $n$ differences, with mean $\bar d$ and quasi-standard deviation $s_d$. Let us assume that the population of differences is normal.
In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
The test statistic is
$$T = \frac{\bar D - D_0}{s_D/\sqrt{n}} \overset{H_0}{\sim} t_{n-1}$$
The rejection region is (at significance level $\alpha$):
$$RR_\alpha = \{t : t < -t_{n-1;\alpha/2} \text{ or } t > t_{n-1;\alpha/2}\}$$

Tests for the difference between two means: matched pairs
Example (cont.)
Population: $D$ = difference between high- and low-recall, $D \sim N(\mu_X - \mu_Y, \sigma_D^2)$
SRS: $n = 10$
Sample: $\bar d = \frac{210}{10} = 21$, $s_d^2 = \frac{14202 - 10(21)^2}{10 - 1} = 1088$
Objective: test $H_0: \mu_X - \mu_Y \le 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test), so $D_0 = 0$.
Test statistic:
$$T = \frac{\bar D - D_0}{s_D/\sqrt{n}} \sim t_{n-1}$$
Observed test statistic: with $D_0 = 0$, $n = 10$, $\bar d = 21$ and $s_d = \sqrt{1088} = 32.98$,
$$t = \frac{\bar d - D_0}{s_d/\sqrt{n}} = \frac{21}{32.98/\sqrt{10}} = 2.014$$

Tests for the difference between two means: matched pairs
Example (cont.)
p-value $= P(T \ge 2.014) \in (0.025, 0.05)$, because $t_{9;0.05} = 1.833 < 2.014 < 2.262 = t_{9;0.025}$.
Hence, given that p-value $< \alpha = 0.05$, we reject the null hypothesis at this level.
[Figure: $t_{n-1}$ density; the p-value is the area to the right of $t = 2.014$, which lies between 1.833 and 2.262.]
Conclusion: The sample data gave enough evidence to support the claim that, on average, brain activity is higher for the high-recall than for the low-recall group. If in fact the mean brain activity were the same for these two groups, then the probability of finding a sample result as extreme as or more extreme than that actually obtained would be between 0.025 and 0.05 (which is rather low).
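The arithmetic of the matched-pairs example can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the two critical values are copied from the t-table values quoted in the example rather than computed.

```python
import math
import statistics

# Brain-activity index from the table in the example
x = [137, 135, 83, 125, 47, 46, 114, 157, 57, 144]  # high-recall
y = [53, 114, 81, 86, 34, 66, 89, 113, 88, 111]     # low-recall

d = [xi - yi for xi, yi in zip(x, y)]   # paired differences d_i = x_i - y_i
n = len(d)
d_bar = statistics.mean(d)              # sample mean of the differences
s_d = statistics.stdev(d)               # quasi-standard deviation (divisor n - 1)

D0 = 0                                  # hypothesised difference under H0
t_obs = (d_bar - D0) / (s_d / math.sqrt(n))

# Table values quoted in the example (assumption: taken from a t-table, not computed)
T9_005, T9_0025 = 1.833, 2.262

print("mean diff:", d_bar, " s_d:", round(s_d, 2), " t:", round(t_obs, 3))
print("reject H0 at the 5% level (upper tail):", t_obs > T9_005)
print("p-value between 0.025 and 0.05:", T9_005 < t_obs < T9_0025)
```

Working from the raw pairs rather than the summary figures also makes the sign of each difference explicit, which matters for products 6 and 9.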

Tests for the difference between two means: matched pairs
Example (cont.) in Excel: Go to menu Data, submenu Data Analysis, and choose the function "t-Test: Paired Two Sample for Means". In the worksheet, columns A and B hold the data, and the observed test statistic and p-value are highlighted in yellow.

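Two-tail test for the difference between two means via CI: matched pairs
Example (cont.) Construct a 95% confidence interval for $\mu_X - \mu_Y$.
$$CI_{0.95}(\mu_X - \mu_Y) = \left(\bar d - t_{n-1;0.025}\frac{s_d}{\sqrt{n}},\; \bar d + t_{n-1;0.025}\frac{s_d}{\sqrt{n}}\right) = \left(21 - 2.262\,\frac{32.98}{\sqrt{10}},\; 21 + 2.262\,\frac{32.98}{\sqrt{10}}\right) = (-2.59, 44.59)$$
Since the value 0 belongs to this interval, we cannot reject the null hypothesis of the equality of the two population means at the $\alpha = 0.05$ significance level. This does not contradict the upper-tail test above: the two-tail test splits $\alpha$ between both tails, so its critical value (2.262) is larger than the one-tail value (1.833).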
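The interval above can be reproduced from the summary figures. The sketch below assumes the table value $t_{9;0.025} = 2.262$ quoted in the example; the variable names are illustrative.

```python
import math

# Summary statistics of the paired differences from the example
n, d_bar, s_d = 10, 21.0, 32.98
t_crit = 2.262   # t_{9;0.025}, copied from the t-table (not computed here)

half_width = t_crit * s_d / math.sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

# The two-tail test at alpha = 0.05 rejects H0: mu_X - mu_Y = 0
# exactly when 0 lies outside this interval.
contains_zero = ci[0] < 0 < ci[1]

print("95% CI:", tuple(round(v, 2) for v in ci))
print("0 inside the interval:", contains_zero)
```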

Tests for the difference between two means: independent normal samples, population variances equal
Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$, both normally distributed with unknown, but equal, population variances $\sigma^2 = \sigma_X^2 = \sigma_Y^2$. Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$.
In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
The test statistic is
$$T = \frac{\bar X - \bar Y - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$$
where the estimator of the common population variance is
$$s_p^2 = \frac{(n_1 - 1)s_X^2 + (n_2 - 1)s_Y^2}{n_1 + n_2 - 2}$$
Note: the number of degrees of freedom is $n_1 + n_2 - 2$, the total number of observations from both samples minus two (two degrees of freedom are lost to estimate $\mu_X$ and $\mu_Y$).
The rejection region is (at significance level $\alpha$):
$$RR_\alpha = \{t : t < -t_{n_1+n_2-2;\alpha/2} \text{ or } t > t_{n_1+n_2-2;\alpha/2}\}$$

Tests for the difference between two means: independent normal samples, population variances equal
Example 9.8 (Newbold): A study attempted to assess the effect of the presence of a moderator on the number of ideas generated by a group. Groups of four members, with or without a moderator, were observed. For a random sample of four groups with a moderator, the mean number of ideas generated per group was 78.0, and the sample quasi-standard deviation was 24.4. For an independent sample of four groups without a moderator, the mean number of ideas generated was 63.5, and the sample quasi-standard deviation was 20.2. Assuming that the population distributions are normal with equal variances, test the null hypothesis ($\alpha = 0.1$) that the population means are equal against the alternative that the true mean is higher for groups with a moderator.
Population 1: $X$ = number of ideas in groups with a moderator, $X \sim N(\mu_X, \sigma_X^2)$
SRS: $n_1 = 4$
Sample: $\bar x = 78.0$, $s_x = 24.4$
Population 2: $Y$ = number of ideas in groups without a moderator, $Y \sim N(\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 4$
Sample: $\bar y = 63.5$, $s_y = 20.2$
Assume independent normal samples and $\sigma_X^2 = \sigma_Y^2 = \sigma^2$.

Tests for the difference between two means: independent normal samples, population variances equal
Example 9.8 (Newbold, cont.)
Objective: test $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test), so $D_0 = 0$.
Test statistic:
$$T = \frac{\bar X - \bar Y}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$$
Observed test statistic: with $D_0 = 0$, $n_1 = 4$, $n_2 = 4$, $\bar x = 78.0$, $s_x = 24.4$, $\bar y = 63.5$, $s_y = 20.2$,
$$s_p^2 = \frac{(n_1 - 1)s_x^2 + (n_2 - 1)s_y^2}{n_1 + n_2 - 2} = \frac{(4 - 1)24.4^2 + (4 - 1)20.2^2}{4 + 4 - 2} = 501.7, \qquad s_p = \sqrt{501.7} = 22.4$$
$$t = \frac{\bar x - \bar y}{s_p\sqrt{1/n_1 + 1/n_2}} = \frac{78.0 - 63.5}{22.4\sqrt{1/4 + 1/4}} = 0.915$$
Rejection region: $RR_{0.1} = \{t : t > t_{6;0.1}\}$ with $t_{6;0.1} = 1.440$.
Since $t = 0.915 \notin RR_{0.1}$, we cannot reject the null hypothesis at the 10% level.
Conclusion: The sample data did not contain strong evidence suggesting that, on average, more ideas will be generated by groups with moderators. However, for such small sample sizes we cannot expect great power in the test, so quite large differences in the population means would be needed to reject the null hypothesis at low significance levels.
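The pooled-variance calculation in Example 9.8 can be sketched as a small helper. The function name `pooled_t` and its parameters are hypothetical, introduced only for illustration; the critical value is the table value $t_{6;0.1} = 1.440$ quoted in the example.

```python
import math

def pooled_t(x_bar, s_x, n1, y_bar, s_y, n2, d0=0.0):
    """Return (pooled variance, t statistic) for two independent normal
    samples assumed to share a common population variance."""
    sp2 = ((n1 - 1) * s_x**2 + (n2 - 1) * s_y**2) / (n1 + n2 - 2)
    t = (x_bar - y_bar - d0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t

# Summary statistics from Example 9.8
sp2, t_obs = pooled_t(78.0, 24.4, 4, 63.5, 20.2, 4)

t_crit = 1.440   # t_{6;0.1}, copied from the t-table (not computed here)

print("pooled variance:", round(sp2, 1))
print("t statistic:", round(t_obs, 3))
print("reject H0 at the 10% level (upper tail):", t_obs > t_crit)
```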

Two-tail test for the difference between two means via CI: independent normal samples, population variances equal
Example 9.8 (Newbold, cont.) Construct a 99% confidence interval for $\mu_X - \mu_Y$.
$$CI_{0.99}(\mu_X - \mu_Y) = \left(\bar x - \bar y \mp t_{n_1+n_2-2;0.005}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) = \left(78.0 - 63.5 \mp 3.707 \cdot 22.4\sqrt{\frac{1}{4} + \frac{1}{4}}\right) = (-44.22, 73.22)$$
Since the value 0 belongs to this interval, we cannot reject the null hypothesis of the equality of the two population means at the $\alpha = 0.01$ significance level.

Tests for the difference between two means: independent large samples or two normal populations with known variances
Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$. Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$, and:
- either both $n_1$ and $n_2$ are large and $\sigma_X^2$ and $\sigma_Y^2$ are unknown,
- or $X$ and $Y$ are normally distributed and $\sigma_X^2$ and $\sigma_Y^2$ are known.
In a two-tail test $H_0: \mu_X - \mu_Y = D_0$ against $H_1: \mu_X - \mu_Y \neq D_0$:
The test statistic is either
$$Z = \frac{\bar X - \bar Y - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$$
or
$$Z = \frac{\bar X - \bar Y - D_0}{\sqrt{\frac{\sigma_X^2}{n_1} + \frac{\sigma_Y^2}{n_2}}} \overset{H_0}{\sim} N(0, 1)$$
The rejection region is (at significance level $\alpha$):
$$RR_\alpha = \{z : z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}\}$$

Tests for the difference between two means: independent large samples or two normal populations with known variances
Example 9.7 (Newbold): A survey of practicing certified public accountants on attitudes to women in the profession was carried out. Survey respondents were asked to react on a scale from one (strongly disagree) to five (strongly agree) to the statement: "Women in public accounting are given the same job assignments as men." For a sample of 186 male accountants, the mean response was 4.059 and the sample quasi-standard deviation was 0.839. For an independent random sample of 172 female accountants, the mean response was 3.680 and the sample quasi-standard deviation was 0.966. Test the null hypothesis ($\alpha = 0.0001$) that the two population means are equal against the alternative that the true mean is higher for male accountants.
Population 1: $X$ = response of a male accountant, $X \sim (\mu_X, \sigma_X^2)$
SRS: $n_1 = 186$
Sample: $\bar x = 4.059$, $s_x = 0.839$
Population 2: $Y$ = response of a female accountant, $Y \sim (\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 172$
Sample: $\bar y = 3.680$, $s_y = 0.966$

Tests for the difference between two means: independent large samples or two normal populations with known variances
Example 9.7 (Newbold, cont.)
Objective: test $H_0: \mu_X - \mu_Y = 0$ against $H_1: \mu_X - \mu_Y > 0$ (upper-tail test), so $D_0 = 0$.
Test statistic:
$$Z = \frac{\bar X - \bar Y}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$$
Observed test statistic: with $D_0 = 0$, $n_1 = 186$, $n_2 = 172$, $\bar x = 4.059$, $s_x = 0.839$, $\bar y = 3.680$, $s_y = 0.966$,
$$z = \frac{\bar x - \bar y}{\sqrt{s_x^2/n_1 + s_y^2/n_2}} = \frac{4.059 - 3.680}{\sqrt{0.839^2/186 + 0.966^2/172}} = 3.95$$
Rejection region: $RR_{0.0001} = \{z : z > z_{0.0001}\}$ with $z_{0.0001} \approx 3.75$.
Since $z = 3.95 \in RR_{0.0001}$, we reject the null hypothesis at the 0.01% level.
Conclusion: The data contain very strong evidence suggesting that the population mean response is higher for males than for females; that is, on average, males feel more strongly than females in the profession that women are given the same job assignments as men.
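Because the reference distribution here is the standard normal, the observed statistic and an upper-tail p-value can both be obtained with the standard library alone. The sketch below uses `statistics.NormalDist` (Python 3.8+); the p-value route is an equivalent alternative to the rejection-region comparison used in the example, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from Example 9.7
x_bar, s_x, n1 = 4.059, 0.839, 186   # male accountants
y_bar, s_y, n2 = 3.680, 0.966, 172   # female accountants
alpha = 0.0001
D0 = 0.0

# Large-sample standard error with estimated variances
se = math.sqrt(s_x**2 / n1 + s_y**2 / n2)
z_obs = (x_bar - y_bar - D0) / se

# Upper-tail p-value under the approximate N(0, 1) null distribution
p_value = 1 - NormalDist().cdf(z_obs)
reject = p_value < alpha

print("z statistic:", round(z_obs, 2))
print("p-value:", p_value)
print("reject H0 at alpha = 0.0001:", reject)
```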

Tests for the difference between two means: independent large samples or two normal populations with known variances
Example 9.7 (Newbold, cont.) Construct a 95% confidence interval for $\mu_X - \mu_Y$.
$$CI_{0.95}(\mu_X - \mu_Y) = \left(\bar x - \bar y \mp z_{0.025}\sqrt{\frac{s_x^2}{n_1} + \frac{s_y^2}{n_2}}\right) = \left(4.059 - 3.680 \mp 1.96\sqrt{0.839^2/186 + 0.966^2/172}\right) = (0.19, 0.57)$$
Since the value 0 does not belong to this interval, we can reject the null hypothesis of the equality of the two population means at the $\alpha = 0.05$ significance level.

Tests for the difference between two proportions: independent large samples
Let $X \sim \text{Bernoulli}(p_X)$ and let $Y \sim \text{Bernoulli}(p_Y)$, where $p_X$ and $p_Y$ are two population proportions of individuals with a characteristic of interest. Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$, and that both $n_1$ and $n_2$ are large.
In a two-tail test $H_0: p_X = p_Y\ (= p_0)$ against $H_1: p_X \neq p_Y$:
The test statistic is
$$Z = \frac{\hat p_X - \hat p_Y}{\sqrt{\hat p_0(1 - \hat p_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1), \qquad \text{where } \hat p_0 = \frac{n_1\hat p_X + n_2\hat p_Y}{n_1 + n_2}$$
The rejection region is (at significance level $\alpha$):
$$RR_\alpha = \{z : z < -z_{\alpha/2} \text{ or } z > z_{\alpha/2}\}$$

Tests for the difference between two proportions: independent large samples
Example 9.9 (Newbold): In market research, when populations of individuals or households are surveyed by mail questionnaires, it is important to achieve as high a response rate as possible. One way to improve response might be to include in the questionnaire an initial inducement question, intended to increase the respondent's interest in completing the questionnaire. Questionnaires containing an inducement question on the importance of recreation facilities in a city were sent to a sample of 250 households, yielding 101 responses. Otherwise identical questionnaires, but without the inducement question, were sent to an independent random sample of 250 households, producing 75 responses. Test the null hypothesis that the two population proportions of responses would be the same against the alternative that the response rate would be higher when the inducement question is included.
Population 1: $X = 1$ if a person completes the questionnaire with the inducement question, and 0 otherwise; $X \sim \text{Bernoulli}(p_X)$
SRS: $n_1 = 250$
Sample: $\hat p_x = \frac{101}{250} = 0.404$
Population 2: $Y = 1$ if a person completes the questionnaire without the inducement question, and 0 otherwise; $Y \sim \text{Bernoulli}(p_Y)$
SRS: $n_2 = 250$
Sample: $\hat p_y = \frac{75}{250} = 0.300$

Tests for the difference between two proportions: independent large samples
Example 9.9 (Newbold, cont.)
Objective: test $H_0: p_X = p_Y$ against $H_1: p_X > p_Y$ (upper-tail test).
Test statistic:
$$Z = \frac{\hat p_X - \hat p_Y}{\sqrt{\hat p_0(1 - \hat p_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0,\ \text{approx.}}{\sim} N(0, 1)$$
Observed test statistic: with $n_1 = 250$, $n_2 = 250$, $\hat p_x = 0.404$, $\hat p_y = 0.300$,
$$\hat p_0 = \frac{n_1\hat p_x + n_2\hat p_y}{n_1 + n_2} = \frac{250(0.404) + 250(0.300)}{250 + 250} = 0.352$$
$$z = \frac{\hat p_x - \hat p_y}{\sqrt{\hat p_0(1 - \hat p_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.404 - 0.300}{\sqrt{0.352(1 - 0.352)\left(\frac{1}{250} + \frac{1}{250}\right)}} = 2.43$$
p-value $= P(Z \ge z) = P(Z \ge 2.43) = 0.0075$
Since the p-value is very small, the null hypothesis can be rejected at any significance level bigger than 0.0075.
Conclusion: The sample data did contain very strong evidence suggesting that a higher response rate will be achieved when an inducement question is included than when it is not.
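The pooled two-proportion test in Example 9.9 can be sketched directly from the response counts. The helper name `two_proportion_z` is hypothetical; the normal tail probability comes from `statistics.NormalDist` in the standard library.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes1, n1, successes2, n2):
    """Return (pooled proportion, z statistic) for H0: p_X = p_Y."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    p0 = (successes1 + successes2) / (n1 + n2)          # pooled estimate under H0
    se = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))   # standard error under H0
    return p0, (p1 - p2) / se

# 101 of 250 responses with the inducement question, 75 of 250 without
p0, z_obs = two_proportion_z(101, 250, 75, 250)

# Upper-tail p-value for H1: p_X > p_Y
p_value = 1 - NormalDist().cdf(z_obs)

print("pooled proportion:", p0)
print("z statistic:", round(z_obs, 2))
print("p-value:", round(p_value, 4))
```

The p-value computed from the unrounded statistic can differ slightly in the fourth decimal from the table lookup at $z = 2.43$.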

Tests for the difference between two proportions: independent large samples
Example 9.9 (Newbold, cont.) Construct a 95% confidence interval for $p_X - p_Y$.
$$CI_{0.95}(p_X - p_Y) = \left(\hat p_x - \hat p_y \mp z_{0.025}\sqrt{\hat p_0(1 - \hat p_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\right) = \left(0.404 - 0.300 \mp 1.96\sqrt{0.352(1 - 0.352)\left(\frac{1}{250} + \frac{1}{250}\right)}\right) = (0.0203, 0.1877)$$
Since the value 0 does not belong to this interval, we can reject the null hypothesis of the equality of the two population proportions at the $\alpha = 0.05$ significance level.
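As a check on the endpoints, the interval can be recomputed with the same pooled standard error used in the example. In this sketch the quantile $z_{0.025}$ is obtained from `statistics.NormalDist().inv_cdf` instead of the rounded table value 1.96, so the last digit may differ marginally; variable names are illustrative.

```python
import math
from statistics import NormalDist

# Figures from Example 9.9
n1, n2 = 250, 250
p_x, p_y = 101 / n1, 75 / n2
p0 = (n1 * p_x + n2 * p_y) / (n1 + n2)   # pooled proportion, as in the example

z_crit = NormalDist().inv_cdf(0.975)     # z_{0.025}, approximately 1.96
se = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))

diff = p_x - p_y
lower, upper = diff - z_crit * se, diff + z_crit * se

print("95% CI:", (round(lower, 4), round(upper, 4)))
print("0 outside the interval:", not (lower <= 0 <= upper))
```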

Tests for the ratio of variances: normal samples
Let $X$ be a population with mean $\mu_X$ and variance $\sigma_X^2$ and $Y$ be a population with mean $\mu_Y$ and variance $\sigma_Y^2$, both normally distributed. Suppose we have a random sample of $n_1$ observations from $X$ and an independent random sample of $n_2$ observations from $Y$.
In a two-tail test $H_0: \sigma_X^2 = \sigma_Y^2\ (= \sigma^2)$ against $H_1: \sigma_X^2 \neq \sigma_Y^2$:
The test statistic is
$$F = \frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,\,n_2-1}$$
The rejection region is (at significance level $\alpha$):
$$RR_\alpha = \{f : f < F_{n_1-1,\,n_2-1;\,1-\alpha/2} \text{ or } f > F_{n_1-1,\,n_2-1;\,\alpha/2}\}$$

F distribution
Recall that if $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_m$ denote independent random variables, all following an $N(0, 1)$ distribution, then the random variable
$$F = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i^2}{\frac{1}{m}\sum_{i=1}^{m} Y_i^2}$$
follows an $F_{n,m}$ distribution with $n$ and $m$ degrees of freedom. We can view it as a ratio of two normalized chi-square random variables.
This is where the result from the previous page comes from: under $H_0$,
$$\frac{s_X^2}{s_Y^2} = \frac{\frac{1}{n_1-1}\cdot\frac{(n_1-1)s_X^2}{\sigma^2}}{\frac{1}{n_2-1}\cdot\frac{(n_2-1)s_Y^2}{\sigma^2}} \sim F_{n_1-1,\,n_2-1},$$
since $\frac{(n_1-1)s_X^2}{\sigma^2} \sim \chi^2_{n_1-1}$ and $\frac{(n_2-1)s_Y^2}{\sigma^2} \sim \chi^2_{n_2-1}$.
[Figure: F densities for (df1 = 30, df2 = 30), (df1 = 10, df2 = 15), (df1 = 8, df2 = 8) and (df1 = 5, df2 = 3).]

Tests for the ratio of variances: normal samples
Example 9.10 (Newbold): For a random sample of 17 newly issued AAA-rated industrial bonds, the quasi-variance of maturities (in years squared) was 123.35. For an independent random sample of 11 issued CCC-rated industrial bonds, the quasi-variance of maturities was 8.02. If the respective population variances are denoted $\sigma_X^2$ and $\sigma_Y^2$, perform a two-sided test at a 10% level.
Population 1: $X$ = maturity of AAA-rated bonds (in years), $X \sim N(\mu_X, \sigma_X^2)$
SRS: $n_1 = 17$
Sample: $s_x^2 = 123.35$
Population 2: $Y$ = maturity of CCC-rated bonds (in years), $Y \sim N(\mu_Y, \sigma_Y^2)$
SRS: $n_2 = 11$
Sample: $s_y^2 = 8.02$

Tests for the ratio of variances: normal samples
Example 9.10 (Newbold, cont.)
Objective: test $H_0: \sigma_X^2 = \sigma_Y^2$ against $H_1: \sigma_X^2 \neq \sigma_Y^2$ (two-tail test).
Test statistic:
$$F = \frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,\,n_2-1}$$
Observed test statistic: with $n_1 = 17$, $n_2 = 11$, $s_x^2 = 123.35$, $s_y^2 = 8.02$,
$$f = \frac{123.35}{8.02} = 15.38$$
Rejection region:
$$RR_{0.10} = \{f : f < F_{16,10;\,1-0.05}\} \cup \{f : f > F_{16,10;\,0.05}\}, \qquad F_{16,10;\,1-0.05} = 0.402, \quad F_{16,10;\,0.05} = 2.83$$
Note: the quantile $F_{16,10;\,0.05} = 2.83$ is directly available from the F-table, but the other one is not. We can get it, however, using the following property of the F-distribution:
$$F_{n,m;\,\alpha} = \frac{1}{F_{m,n;\,1-\alpha}}$$
Hence
$$F_{16,10;\,1-0.05} = \frac{1}{F_{10,16;\,0.05}} = \frac{1}{2.49} = 0.402$$
We see that $f = 15.38 \in RR_{0.10}$.
Conclusion: There is very strong evidence that the population variances are different.
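The decision rule for the variance-ratio test, including the reciprocal trick for the lower quantile, can be sketched as follows. The two F-table values are copied from the example (they are assumptions of the script, not computed), and the variable names are illustrative.

```python
# Summary statistics from Example 9.10
s2_x, n1 = 123.35, 17   # AAA-rated bonds
s2_y, n2 = 8.02, 11     # CCC-rated bonds

f_obs = s2_x / s2_y     # observed F statistic, df = (n1 - 1, n2 - 1) = (16, 10)

# Upper 5% points copied from the F-table quoted in the example
F_16_10_005 = 2.83      # F_{16,10;0.05}
F_10_16_005 = 2.49      # F_{10,16;0.05}

# Lower critical value via the reciprocal property:
# F_{16,10;0.95} = 1 / F_{10,16;0.05}
lower_crit = 1 / F_10_16_005
upper_crit = F_16_10_005

reject = f_obs < lower_crit or f_obs > upper_crit

print("f statistic:", round(f_obs, 2))
print("rejection region: f <", round(lower_crit, 3), "or f >", upper_crit)
print("reject H0 at the 10% level:", reject)
```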

Two-tail test for the ratio of variances via confidence interval
Example 9.10 (Newbold, cont.) Construct a 90% confidence interval for the ratio of the variances.
$$CI_{0.90}\left(\frac{\sigma_X^2}{\sigma_Y^2}\right) = \left(\frac{s_x^2}{s_y^2}\cdot\frac{1}{F_{n_1-1,\,n_2-1;\,0.05}},\; \frac{s_x^2}{s_y^2}\cdot\frac{1}{F_{n_1-1,\,n_2-1;\,1-0.05}}\right) = \left(\frac{123.35}{8.02}\cdot\frac{1}{2.83},\; \frac{123.35}{8.02}\cdot\frac{1}{0.402}\right) = (5.43, 38.26)$$
As we expected, the value 1 does not belong to this interval, so we can reject the null hypothesis of the equality of the two population variances at the $\alpha = 0.1$ significance level.
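The same interval can be reproduced numerically. This sketch uses the two quantiles exactly as rounded in the example (2.83 and 0.402), which are table-based assumptions; with less rounding of the lower quantile the upper endpoint would shift slightly.

```python
# Summary statistics from Example 9.10
s2_x, s2_y = 123.35, 8.02
ratio = s2_x / s2_y

# Quantiles as used in the example (table values, not computed here)
F_upper = 2.83    # F_{16,10;0.05}
F_lower = 0.402   # F_{16,10;0.95} = 1 / F_{10,16;0.05}, rounded

ci = (ratio / F_upper, ratio / F_lower)

# H0: sigma_X^2 / sigma_Y^2 = 1 is rejected at alpha = 0.10
# exactly when 1 lies outside the 90% interval.
contains_one = ci[0] <= 1 <= ci[1]

print("90% CI for the variance ratio:", tuple(round(v, 2) for v in ci))
print("1 inside the interval:", contains_one)
```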

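Test statistics

| Parameter | Assumptions | Test statistic |
|---|---|---|
| $\mu_X - \mu_Y = D_0$ | Matched pairs; normal differences | $\frac{\bar D - D_0}{s_D/\sqrt{n}} \overset{H_0}{\sim} t_{n-1}$ |
| | Normal pops.; equal (common) variance | $\frac{\bar X - \bar Y - D_0}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \overset{H_0}{\sim} t_{n_1+n_2-2}$ |
| | Normal pops.; known variances | $\frac{\bar X - \bar Y - D_0}{\sqrt{\frac{\sigma_X^2}{n_1} + \frac{\sigma_Y^2}{n_2}}} \overset{H_0}{\sim} N(0, 1)$ |
| | Nonnormal pops.; unknown variances; large samples | $\frac{\bar X - \bar Y - D_0}{\sqrt{\frac{s_X^2}{n_1} + \frac{s_Y^2}{n_2}}} \overset{H_0}{\sim}$ approx. $N(0, 1)$ |
| $p_X - p_Y = 0$ | Bernoulli pops.; large samples | $\frac{\hat p_X - \hat p_Y}{\sqrt{\hat p_0(1 - \hat p_0)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0}{\sim}$ approx. $N(0, 1)$ |
| $\sigma_X^2/\sigma_Y^2 = 1$ | Normal pops. | $\frac{s_X^2}{s_Y^2} \overset{H_0}{\sim} F_{n_1-1,\,n_2-1}$ |

Question: How would you define $RR_\alpha$ in upper- and lower-tail tests?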