One- and Two-Sample Tests of Hypotheses

Similar documents
(8 One- and Two-Sample Test Of Hypothesis)

CHAPTER EIGHT TESTS OF HYPOTHESES

Chapter 9. Hypothesis testing. 9.1 Introduction

Math 101: Elementary Statistics Tests of Hypothesis

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

Hypothesis testing for µ:

Ch 8: Inference for two samples

Hypothesis for Means and Proportions

Hypothesis Testing. ECE 3530 Spring Antonio Paiva

The Purpose of Hypothesis Testing

Chapter 5: HYPOTHESIS TESTING

[ z = 1.48 ; accept H 0 ]

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

EXAM 3 Math 1342 Elementary Statistics 6-7

INTERVAL ESTIMATION AND HYPOTHESES TESTING

LECTURE 12 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING

SL - Binomial Questions

280 CHAPTER 9 TESTS OF HYPOTHESES FOR A SINGLE SAMPLE Tests of Statistical Hypotheses

HYPOTHESIS TESTING. Hypothesis Testing

Introductory Econometrics. Review of statistics (Part II: Inference)

The point value of each problem is in the left-hand margin. You must show your work to receive any credit, except on problems 1 & 2. Work neatly.

Econ 325: Introduction to Empirical Economics

Chapter 7: Hypothesis Testing - Solutions

POLI 443 Applied Political Research

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.

Hypotheses Test Procedures. Is the claim wrong?

Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing

Example. χ 2 = Continued on the next page. All cells

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

(8 One-and Two-Sample Test Of Hypothesis)

15: CHI SQUARED TESTS

their contents. If the sample mean is 15.2 oz. and the sample standard deviation is 0.50 oz., find the 95% confidence interval of the true mean.

MEI STRUCTURED MATHEMATICS STATISTICS 2, S2. Practice Paper S2-B

Hypothesis Testing: One Sample

Chapter Eight: Assessment of Relationships 1/42

MAT 2379, Introduction to Biostatistics, Sample Calculator Questions 1. MAT 2379, Introduction to Biostatistics

Unit II. Page 1 of 12

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

The Components of a Statistical Hypothesis Testing Problem

Math 2000 Practice Final Exam: Homework problems to review. Problem numbers

Chapter 8 of Devore , H 1 :

Chapter 10: Analysis of variance (ANOVA)

20 Hypothesis Testing, Part I

2. A music library has 200 songs. How many 5 song playlists can be constructed in which the order of the songs matters?

hypotheses. P-value Test for a 2 Sample z-test (Large Independent Samples) n > 30 P-value Test for a 2 Sample t-test (Small Samples) n < 30 Identify α

Chapter Six: Two Independent Samples Methods 1/51

LECTURE NOTES. INTSTA2 Introductory Statistics 2. Francis Joseph H. Campeña, De La Salle University Manila

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. describes the.

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience

CHAPTER 8. Test Procedures is a rule, based on sample data, for deciding whether to reject H 0 and contains:

Chapter. Hypothesis Testing with Two Samples. Copyright 2015, 2012, and 2009 Pearson Education, Inc. 1

(ii) at least once? Given that two red balls are obtained, find the conditional probability that a 1 or 6 was rolled on the die.

COVENANT UNIVERSITY NIGERIA TUTORIAL KIT OMEGA SEMESTER PROGRAMME: ECONOMICS

Chi-Squared Tests. Semester 1. Chi-Squared Tests

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

11-2 Multinomial Experiment

11 CHI-SQUARED Introduction. Objectives. How random are your numbers? After studying this chapter you should

Chapter 12: Inference about One Population

CHAPTER 9: HYPOTHESIS TESTING

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

Chapter 9. Hypothesis Testing. 9.1 Introduction

This gives us an upper and lower bound that capture our population mean.

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

MAT 2377 Practice Exam

KDF2C QUANTITATIVE TECHNIQUES FOR BUSINESSDECISION. Unit : I - V

PhysicsAndMathsTutor.com

Statistical Inference. Section 9.1 Significance Tests: The Basics. Significance Test. The Reasoning of Significance Tests.

Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats

Inferential statistics

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Mathematical Statistics

1; (f) H 0 : = 55 db, H 1 : < 55.

SMAM 314 Exam 42 Name

1036: Probability & Statistics

ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12

Inferences About Two Proportions

Chapter 7: Hypothesis Testing

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49

Hypothesis testing. Data to decisions

Inferences Based on Two Samples

Lab #12: Exam 3 Review Key

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

Single Sample Means. SOCY601 Alan Neustadtl

Statistical Inference. Hypothesis Testing

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017

9-7: THE POWER OF A TEST

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Tribhuvan University Institute of Science and Technology 2065

NONPARAMETRIC TESTS. LALMOHAN BHAR Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-12

41.2. Tests Concerning a Single Sample. Introduction. Prerequisites. Learning Outcomes

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Problem Set 4 - Solutions

Non-parametric Statistics

Multiple Regression Analysis: Inference ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Transcription:

One- and Two-Sample Tests of Hypotheses 1- Introduction and Definitions Often, the problem confronting the scientist or engineer is producing a conclusion about some scientific system. For example, a medical researcher may decide on the basis of experimental evidence whether coffee drinking increases the risk of cancer in humans; or a sociologist might wish to collect appropriate data to enable him or her to decide whether a person s blood type and eye color are independent variables. In addition, each must make use of experimental data and make a decision based on the data. In each case, the conjecture can be put in the form of a statistical hypothesis. Definition: A statistical hypothesis is an assertion or conjecture concerning one or more populations. The truth or falsity of a statistical hypothesis is never known with absolute certainty unless we examine the entire population. This, of course, would be impractical in most situations. Instead, we take a random sample from the population of interest and use the data contained in this sample to provide evidence that either supports or does not support the hypothesis. Evidence from the sample which is inconsistent with the stated hypothesis leads to reject the hypothesis. Null Hypothesis: Denoted by H 0, refers to any hypothesis we wish to test. Alternative Hypothesis: The rejection of H 0 leads to the acceptance of an alternative hypothesis, denoted by H 1. Conclusions: Reject H 0 in favor of H 1 because of sufficient evidence in the data or fail to reject H 0 because of insufficient evidence in the data. Note that the conclusions do not involve a formal and literal accept H 0. Example: A certain type of cold vaccine is known to be only 25% effective after a period of 2 years. To determine if a new vaccine is superior in providing protection against the same virus for a longer period of time, suppose that 20 people are chosen at random and inoculated. If more than 8 of those receiving the new vaccine surpass the 2-year period without contracting the virus, the new vaccine will be considered superior to the one presently in use. H 0 : p = 0.25 H 1 : p > 0.25 2- The Test Statistic The test statistic on which we base our decision is X, the number of individuals in our test group who receive protection from the new vaccine for a period of at least 2 years. The 1

possible values of X, from 0 to 20, are divided into two groups: those numbers less than or equal to 8 and those greater than 8. All possible scores greater than 8, constitute the critical region. The last number that we observe in passing into the critical region is called the critical value. In our illustration, the critical value is the number 8. Therefore, if x > 8, we reject H 0 in favor of the alternative hypothesis H 1. If x 8, we fail to reject H 0. This decision criterion is illustrated in the figure. The decision procedure could lead to either of two wrong conclusions. For instance, the new vaccine may be no better than the one now in use (H 0 true) and yet, in this particular randomly selected group of individuals, more than 8 surpass the 2-year period without contracting the virus. We would be committing an error by rejecting H 0 in favor of H 1 when, in fact, H 0 is true. Such an error is called a type I error. Definition: Rejection of the null hypothesis when it is true is called a type I error. A second kind of error is committed if 8 or fewer of the group surpass the 2-year period successfully and we are unable to conclude that the vaccine is better when it actually is better (H 1 true). Thus, in this case, we fail to reject H 0 when in fact H 0 is false. This is called a type II error. Definition: Non-rejection of the null hypothesis when it is false is called a type II error. The probability of type I error (α): We say that the null hypothesis, p = 0.25, is being tested at the α =0.0409 level of significance. The Probability of a Type II Error (β): The probability of committing a type II error, denoted by β, is impossible to compute unless we have a specific alternative hypothesis. If we test the null hypothesis that p = 2

0.25 against the alternative hypothesis that p = 0.5, then we are able to compute the probability of not rejecting H 0 when it is false. In this case, Definition: The power of a test is the probability of rejecting H 0 given that a specific alternative is true. The power of a test can be computed as 1 β. 3- One- and Two-Tailed Tests A test of any statistical hypothesis is called a one-tailed test where the alternative is one sided, such as: H 0 : θ = θ 0, H 1 : θ >θ 0 or perhaps H 1 : θ < θ 0, A test of any statistical hypothesis is called a two-tailed test where the alternative is two sided, such as: H 0 : θ = θ 0, H 1 : θ θ 0, Example: A manufacturer of a certain brand of rice cereal claims that the average saturated fat content does not exceed 1.5 grams per serving. State the null and alternative hypotheses. H 0 : μ = 1.5, H 1 : μ > 1.5 Example: A real estate agent claims that 60% of all private residences being built today are 3-bedroom homes. To test this claim, a large sample of new residences is inspected; the proportion of these homes with 3 bedrooms is recorded and used as the test statistic. State the null and alternative. H 0 : p = 0.6, H 1 : p 0.6, 3

4- The Use of P-Values for Decision Making in Testing Hypotheses In testing hypotheses in which the test statistic is discrete, the critical region may be chosen arbitrarily and its size determined. Over a number of generations of statistical analysis, it had become customary to choose an α of 0.05 or 0.01 and select the critical region accordingly. One-tailed test: If P-value < α then H 0 will be rejected. If P-value > α then H 0 will not be rejected. Two-tailed test: If P-value < α/2 then H 0 will be rejected. If P-value > α/2 then H 0 will not be rejected. Approach to Hypothesis Testing with Fixed Probability of Type I Error 1. State the null and alternative hypotheses. 2. Choose a fixed significance level α. 3. Choose an appropriate test statistic and establish the critical region based on α. 4. Reject H 0 if the computed test statistic is in the critical region. Otherwise, do not reject. 5. Draw scientific or engineering conclusions. Significance Testing (P-Value Approach) 1. State null and alternative hypotheses. 2. Choose an appropriate test statistic. 3. Compute the P-value based on the computed value of the test statistic. 4. Use judgment based on the P-value and knowledge of the scientific system. 5- Single Sample: Tests Concerning a Single Mean (Variance Known) In this section, we formally consider tests of hypotheses on a single population mean. zz = xx μμ 0 σσ If z α/2 < z < z α/2, do not reject H 0. Rejection of H 0, implies acceptance of the nn alternative hypothesis μ μ 0. 4

Tests of one-sided hypotheses: H 0 : μ = μ 0 H 0 : μ = μ 0 H 1 : μ > μ 0 H 1 : μ < μ 0 If z > z α, reject H 0 If z < -z α, reject H 0 If α > P-value, reject H 0 If α > P-value, reject H 0 Example: A random sample of 100 recorded deaths in the United States during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance. 1. H 0 : μ = 70 years. 2. H 1 : μ > 70 years. 3. α = 0.05. 4. Critical region: z >1.645, where zz = xx μμ 0 σσ. nn 5. Computations: xx = 71.8 years, σ = 8.9 years, and hence zz = 71.8 70 8.9 100 = 2.02. 6. Decision: Reject H 0 and conclude that the mean life span today is greater than 70 years. The P-value corresponding to z = 2.02 is given by the area of the shaded region in the figure. Using Table A.3, we have P = P (Z > 2.02) = 1 0.9783 = 0.0217. As a result, the evidence in favor of H 1 is even stronger than that suggested by a 0.05 level of significance. Example: A manufacturer of sports equipment has developed a new synthetic fishing line that the company claims has a mean breaking strength of 8 kilograms with a standard deviation of 0.5 kilogram. Test the hypothesis that μ = 8 kilograms against the alternative 5

that μ 8 kilograms if a random sample of 50 lines is tested and found to have a mean breaking strength of 7.8 kilograms. Use a 0.01 level of significance. 1. H 0 : μ = 8 kilograms. 2. H 1 : μ 8 kilograms. 3. α = 0.01. 4. Critical region: z < 2.575 and z > 2.575, where zz = xx μμ 0 σσ. nn 5. Computations: xx =7.8 kilograms, n = 50 and hence zz = 7.8 8 0.5 50 = 2.83. 6. Decision: Reject H 0 and conclude that the average breaking strength is not equal to 8 but is, less than 8 kilograms. Since the test in this example is two tailed, the desired P-value is twice the area of the shaded region in the figure to the left of z = 2.83. Therefore, using Table A.3, we have P = P( Z > 2.83) = 2P(Z < 2.83) = 0.0046, which allows us to reject the null hypothesis that μ = 8 kilograms at a level of significance smaller than 0.01. 6- The t-statistic for a Test on a Single Mean (Variance Unknown) For the two-sided hypothesis H 0 : μ = μ 0, H 1 : μ μ 0, we reject H 0 at significance level α when the computed t-statistic tt = xx μμ 0 ss nn exceeds t α/2,n 1 or is less than t α/2,n 1. Example: The Edison Electric Institute has published figures on the number of kilowatt hours used annually by various home appliances. It is claimed that a vacuum cleaner uses an average of 46 kilowatt hours per year. If a random sample of 12 homes included in a planned study indicates that vacuum cleaners use an average of 42 kilowatt hours per year 6

with a standard deviation of 11.9 kilowatt hours, does this suggest at the 0.05 level of significance that vacuum cleaners use, on average, less than 46 kilowatt hours annually? Assume the population of kilowatt hours to be normal. 1. H 0 : μ = 46 kilowatt hours. 2. H 1 : μ < 46 kilowatt hours. 3. α = 0.05. 4. Critical region: t < 1.796, where tt = xx μμ 0 ss nn with 11 degrees of freedom. 5. Computations: xx = 42 kilowatt hours, s = 11.9 kilowatt hours, and n = 12. Hence, tt = 42 46 11.9 12 = 1.16 P = P (T < 1.16) 0.135. 6. Decision: Do not reject H 0 and conclude that the average number of kilowatt hours used annually by home vacuum cleaners is not significantly less than 46. 7- Two Samples: Tests on Two Means with Known Variances Two independent random samples of sizes n 1 and n 2, respectively, are drawn from two populations with means μ 1 and μ 2 and variances σ 1 2 and σ 2 2. zz = (xx 1 xx 2) (μμ 1 μμ 2 ) σσ 1 2 nn + σσ 2 2 1 nn 2 H 0 : μ 1 μ 2 = d 0 H 1 : μ 1 μ 2 d 0 zz = (xx 1 xx 2) dd 0 σσ 1 2 nn + σσ 2 2 1 8- Tests on Two Means with Unknown But Equal Variances H 0 : μ 1 μ 2 = d 0 H 1 : μ 1 μ 2 d 0 nn 2 7

tt = (xx 1 xx 2) dd 0 ss pp 1 nn + 1 1 nn 2 ss pp 2 = ss 1 2 (nn 1 1) + ss 2 2 (nn 2 1) nn 1 + nn 2 2 Example: An experiment was performed to compare the abrasive wear of two different laminated materials. 12 pieces of material 1 were tested by exposing each piece to a machine measuring wear. 10 pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the populations to be approximately normal with equal variances. 1. H 0 : μ 1 μ 2 = 2. 2. H 1 : μ 1 μ 2 > 2. 3. α = 0.05. 4. Critical region: t > 1.725, where tt = (xx 1 xx 2 ) dd 0 with ν = 20 degrees of freedom. ss pp 1 nn1 + 1 nn2 5. Computations: xx 1 = 85, s 1 = 4, n 1 = 12, xx 2 = 81, s 2 = 5, n 2 = 10 Hence P = P (T > 1.04) 0.16. (See Table A.4.) 16(11) + 25(9) ss pp = 12 + 10 2 = 4.478 (85 81) 2 tt = 4.478 1 12 + 1 = 1.04 10 6. Decision: Do not reject H 0. We are unable to conclude that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units. 9- Tests on Two Means with Unknown But Unequal Variances 8

The test procedure is to not reject H0 when ttαα 2,νν < tt < ttαα 2,νν 10- Paired Observations Testing of two means can be accomplished when data are in the form of paired observations for μ 1 μ 2 and tt = dd dd 0 ss dd nn Example: In a study conducted in the Forestry and Wildlife Department at Virginia Tech, J. A. Wesson examined the influence of the drug succinylcholine on the circulation levels of androgens in the blood. Blood samples were taken from wild, free-ranging deer immediately after they had received an intramuscular injection of succinylcholine administered using darts and a capture gun. A second blood sample was obtained from each deer 30 minutes after the first sample, after which the deer was released. The levels of androgens at time of capture and 30 minutes later, measured in nanograms per milliliter (ng/ml), for 15 deer are given in the table. Assuming that the populations of androgen levels at time of injection and 30 minutes later are normally distributed, test at the 0.05 level of significance whether the androgen concentrations are altered after 30 minutes. 1. H 0 : μ 1 = μ 2 or μ D = μ 1 μ 2 = 0. 2. H 1 : μ 1 μ 2 or μ D = μ 1 μ 2 0. 9

3. α = 0.05. 4. Critical region: t < 2.145 and t > 2.145, where tt = dd dd 0 ss dd nn with ν = 14. 5. Computations: dd = 9.848 and s d = 18.474. Therefore, tt = 9.848 0 18.474 15 = 2.06 6. Though the t-statistic is not significant at the 0.05 level, from Table A.4, P = P( T > 2.06) 0.06. As a result, there is some evidence that there is a difference in mean circulating levels of androgen. 11- Choice of Sample Size for Testing Means In most practical circumstances, the experiment should be planned, with a choice of sample size made prior to the data-taking process if possible. The sample size is usually determined to achieve good power for a fixed α and fixed specific alternative. This fixed alternative may be in the form of μ μ 0 in the case of a hypothesis involving a single mean or μ 1 μ 2 in the case of a problem involving two means. Specific cases will provide illustrations. Suppose that we wish to test the hypothesis H0: μ = μ 0, H1: μ >μ 0, with a significance level α, when the variance σ 2 is known. For a specific alternative, say μ = μ 0 + δ, the power of our test is shown in figure below to be 1 β = P (X > a when μ = μ 0 + δ). Thus, β = P (X < a when μ = μ 0 + δ) 10

from which we conclude that z β = z α δ n and hence choice of sample size: n = zz αα+zz ββ 2 σσ 2 and n 2 zzαα+zz ββ σσ 2 2 for two-tailed test when δ = μ μ δδ 2 0 Example: Suppose that we wish to test the hypothesis H 0 : μ = 68 kilograms, H 1 : μ > 68 kilograms σ δδ 2 for one-tailed test when δ = μ μ 0 for the weights of male students at a certain college, using an α = 0.05 level of significance, when it is known that σ = 5. Find the sample size required if the power of our test is to be 0.95 when the true mean is 69 kilograms. Since α = β = 0.05, we have z α = z β = 1.645. For the alternative μ = 69, we take δ = 1 and then n = zz αα+zz ββ 2 σσ 2 = (1.645+1.645)2 5 2 = 270. δδ 2 1 2 Therefore, 271 observations are required if the test is to reject the null hypothesis 95% of the time when, in fact, μ is as large as 69 kilograms. 12- One Sample: Test on a Single Proportion zz = xx nnpp 0 = pp pp 0 nnpp 0 qq 0 pp 0qq 0 nn For a two-tailed test the critical region is z < z α/2 or z > z α/2. For the one-sided alternative p < p 0, the critical region is z < z α, and for the alternative p > p 0, the critical region is z > z α. Example: A commonly prescribed drug for relieving nervous tension is believed to be only 60% effective. Experimental results with a new drug administered to a random sample of 100 adults who were suffering from nervous tension show that 70 received relief. Is this sufficient evidence to conclude that the new drug is superior to the one commonly prescribed? Use a 0.05 level of significance. 1. H 0 : p = 0.6. 11

2. H 1 : p >0.6. 3. α = 0.05. 4. Critical region: z >1.645. 5. Computations: x = 70, n = 100, pp = 70/100 = 0.7, and zz = 0.7 0.6 2.04) < 0.0207. 6. Decision: Reject H 0 and conclude that the new drug is superior. 13- Two Samples: Tests on Two Proportions zz = pp 1 pp 2 pp qq 1 nn1 + 1 nn2 pp = xx 1 + xx 2 nn 1 + nn 2 (0.6)(0.4) 100 For the alternative p 1 p 2, the critical region is z < z α/2 or z > z α/2. = 2.04, P = P(Z > For a test of p 1 <p 2, the critical region is z < z α, and for p 1 > p 2, the critical region is z > z α. Example: A vote is to be taken among the residents of a town and the surrounding county to determine whether a proposed chemical plant should be constructed. The construction site is within the town limits, and for this reason many voters in the county believe that the proposal will pass because of the large proportion of town voters who favor the construction. To determine if there is a significant difference in the proportions of town voters and county voters favoring the proposal, a poll is taken. If 120 of 200 town voters favor the proposal and 240 of 500 county residents favor it, would you agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters? Use α = 0.05 level of significance. 1. H 0 : p 1 = p 2. 2. H 1 : p 1 >p 2. 3. α = 0.05. 4. Critical region: z > 1.645. 5. Computations: pp 1 = 120 = 0.6, 200 pp 2 = 240 120+240 = 0.48 and pp = = 0.51 500 200+500 Therefore, zz = 0.6 0.48 (0.51)(0.49) 1 200 + 1 500 = 2.9, 12

P = P (Z > 2.9) = 0.0019. 6. Decision: Reject H 0 and agree that the proportion of town voters favoring the proposal is higher than the proportion of county voters. 14- One-Sample Tests Concerning Variances (nn χχ 2 1)ss2 = where n is the sample size, s 2 is the sample variance, and σ 0 2 is the value of σ 2 given by the null hypothesis. If H 0 is true, χ 2 is a value of the chi-squared distribution with ν = n 1 degrees of freedom. For a two-tailed test, the critical region is χ 2 < χ 2 1 α/2 or χ 2 > χ 2 α/2. For the one-sided alternative σ 2 < σ 0 2, the critical region is χ 2 < χ 2 1 α, and for the one-sided alternative σ 2 > σ 0 2, the critical region is χ 2 > χ 2 α. Example: A manufacturer of car batteries claims that the life of the company s batteries is approximately normally distributed with a standard deviation equal to 0.9 year. If a random sample of 10 of these batteries has a standard deviation of 1.2 years, do you think that σ > 0.9 year? Use a 0.05 level of significance. 1. H 0 : σ 2 = 0.81. 2. H 1 : σ 2 > 0.81. 3. α = 0.05. 4. Critical region: From figure below, we see that the null hypothesis is rejected when χ 2 > 16.919, where χχ 2 = (nn 1)ss2 σσ2, with ν = 9 degrees of freedom. 0 σσ 0 2 13

5. Computations: s 2 =1.44, n = 10, and χχ 2 = (9)(1.44) = 16.0, P 0.07. 6. Decision: The χ 2 -statistic is not significant at the 0.05 level. However, based on the P- value 0.07, there is evidence that σ > 0.9. 0.81 15- Two-Sample Tests Concerning Variances Test the equality of the variances σ 1 2 and σ 2 2 of two populations. H 0 : σ 1 2 = σ 2 2 H 1 : σ 1 2 < σ 2 2, σ 1 2 > σ 2 2 or σ 1 2 σ 2 2. For independent random samples of sizes n 1 and n 2, respectively, from the two populations, the f-value for testing σ 1 2 = σ 2 2 is the ratio ff = ss 1 2 ss 2 2 F-distribution with ν 1 = n 1 1 and ν 2 = n 2 1 degrees of freedom. The critical regions of the one-sided alternatives σ 1 2 < σ 2 2, σ 1 2 > σ 2 2 are, respectively, f<f 1 α(ν1,ν2) and f > f α(v1,v2). For the two-sided alternative σ 1 2 σ 2 2, the critical region is f<f 1 α/2(v1,v2) or f > f α/2(v1,v2). Example: An experiment was performed to compare the abrasive wear of two different laminated materials. 12 pieces of material 1 were tested by exposing each piece to a machine measuring wear. 10 pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Check whether or not that the two unknown population variances were equal. Use a 0.10 level of significance. 1. H 0 : σ 1 2 = σ 2 2. 2. H 1 : σ 1 2 σ 2 2. 3. α = 0.10. 4. Critical region: From the figure below, we see that f 0.05(11,9) = 3.11, and, by using we 1 find f 0.95(11,9) = = 0.34. ff0.05(9,11) 14

Therefore, the null hypothesis is rejected when f < 0.34 or f > 3.11, where ff = ss 1 2 ss 2 2 with ν 1 = 11 and ν 2 = 9 degrees of freedom. 5. Computations: s 1 2 = 16, s 2 2 = 25, and henceff = 16 25 = 0.64. 6. Decision: Do not reject H 0. Conclude that there is insufficient evidence that the variances differ. 16- Goodness-of-Fit Test The test is based on how good a fit we have between the frequency of occurrence of observations in an observed sample and the expected frequencies obtained from the hypothesized distribution. kk χχ 2 = (oo ii ee ii ) 2 ii=1 where χ 2 is a value of the chi-squared distribution with ν = k 1 degrees of freedom. o i : Observed frequencies, for the i th cell. e i : Expected frequencies, for the i th cell. H 0 : A good fit H 1 : A poor fit If χχ 2 > χχ αα 2, H 0 will be rejected. Example: to illustrate the tossing of a die, we hypothesize that the die is honest, which is equivalent to testing the hypothesis that the distribution of outcomes is the discrete uniform distribution ff(xx) = 1, xx = 1,2,,6. 6 ee ii 15

Suppose that the die is tossed 120 times and each outcome is recorded. Theoretically, if the die is balanced, we would expect each face to occur 20 times. The results are given as the following: Using Table A.5, we find χ 2 0.05 = 11.070 for ν = 5 degrees of freedom. Since 1.7 is less than the critical value, we fail to reject H 0. We conclude that there is insufficient evidence that the die is not balanced. 17- Test for Independence (Categorical Data) The chi-squared test procedure discussed in the previous part can also be used to test the hypothesis of independence of two variables of classification. H 0 : Two variable are independent H 1 : Two variable are dependent expected frequency = kk χχ 2 = (oo ii ee ii ) 2 ii=1 ee ii (column total) (row total) grand total If χ 2 > χ 2 α with ν = (r 1)(c 1) degrees of freedom, reject the null hypothesis of independence at the α-level of significance; otherwise, fail to reject the null hypothesis. Example: Suppose that we wish to determine whether the opinions of the voting residents of the state of Illinois concerning a new tax reform are independent of their levels of income. Members of a random sample of 1000 registered voters from the state of Illinois are classified as to whether they are in a low, medium, or high income bracket and whether or not they favor the tax reform. The observed frequencies are presented in Table below, which is known as a contingency table. 16

ee FFFFFF/LLLLLL = (336)(598) 1000 = 200.9 From Table A.5 we find that χ 2 0.05 = 5.991 for ν = (2 1)(3 1) = 2 degrees of freedom. The null hypothesis is rejected and we conclude that a voter s opinion concerning the tax reform and his or her level of income are not independent. 17