CHAPTER 9: HYPOTHESIS TESTING

Similar documents
One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

STAT Chapter 8: Hypothesis Tests

Hypothesis testing. Data to decisions

HYPOTHESIS TESTING. Hypothesis Testing

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

Chapter 9 Inferences from Two Samples

STAT 515 fa 2016 Lec Statistical inference - hypothesis testing

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

Section 6.2 Hypothesis Testing

Example. χ 2 = Continued on the next page. All cells

Introduction to Business Statistics QM 220 Chapter 12

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Categorical Data Analysis. The data are often just counts of how many things each category has.

Multiple Regression Analysis

Inferential statistics

Two Sample Hypothesis Tests

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

Two sided, two sample t-tests. a) IQ = 100 b) Average height for men = c) Average number of white blood cells per cubic millimeter is 7,000.

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

CHAPTER 8. Test Procedures is a rule, based on sample data, for deciding whether to reject H 0 and contains:

CENTRAL LIMIT THEOREM (CLT)

Mathematical Notation Math Introduction to Applied Statistics

Chapter 7: Hypothesis Testing - Solutions

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Lecture Slides. Elementary Statistics Eleventh Edition. by Mario F. Triola. and the Triola Statistics Series 9.1-1

hypothesis a claim about the value of some parameter (like p)

9-6. Testing the difference between proportions /20

Chapter 10: Comparing Two Populations or Groups

23. MORE HYPOTHESIS TESTING

Statistics for IT Managers

Hypothesis for Means and Proportions

Chapter 9. Hypothesis testing. 9.1 Introduction

MAT2377. Rafa l Kulik. Version 2015/November/23. Rafa l Kulik

First we look at some terms to be used in this section.

Chapter 26: Comparing Counts (Chi Square)

The Difference in Proportions Test

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

MONT 105Q Mathematical Journeys Lecture Notes on Statistical Inference and Hypothesis Testing March-April, 2016

DETERMINE whether the conditions for performing inference are met. CONSTRUCT and INTERPRET a confidence interval to compare two proportions.

Basic Statistics. 1. Gross error analyst makes a gross mistake (misread balance or entered wrong value into calculation).

Hypothesis Testing The basic ingredients of a hypothesis test are

Multiple Testing. Gary W. Oehlert. January 28, School of Statistics University of Minnesota

Study Ch. 9.3, #47 53 (45 51), 55 61, (55 59)

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

LECTURE 12 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING

Quantitative Analysis and Empirical Methods

The point value of each problem is in the left-hand margin. You must show your work to receive any credit, except on problems 1 & 2. Work neatly.

Two-Sample Inferential Statistics

Single Sample Means. SOCY601 Alan Neustadtl

Introductory Econometrics. Review of statistics (Part II: Inference)

Tests about a population mean

We need to define some concepts that are used in experiments.

An inferential procedure to use sample data to understand a population Procedures

Inferences Based on Two Samples

INTERVAL ESTIMATION AND HYPOTHESES TESTING

LAB 2. HYPOTHESIS TESTING IN THE BIOLOGICAL SCIENCES- Part 2

Hypotheses Test Procedures. Is the claim wrong?

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Harvard University. Rigorous Research in Engineering Education

Student s t-distribution. The t-distribution, t-tests, & Measures of Effect Size

Chapter 7 Comparison of two independent samples

ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12

10.4 Hypothesis Testing: Two Independent Samples Proportion

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

χ test statistics of 2.5? χ we see that: χ indicate agreement between the two sets of frequencies.

hypotheses. P-value Test for a 2 Sample z-test (Large Independent Samples) n > 30 P-value Test for a 2 Sample t-test (Small Samples) n < 30 Identify α

Problem of the Day. 7.3 Hypothesis Testing for Mean (Small Samples n<30) Objective(s): Find critical values in a t-distribution

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series

8.1-4 Test of Hypotheses Based on a Single Sample

Rejection regions for the bivariate case

CHAPTER 10 Comparing Two Populations or Groups

CHAPTER 10 Comparing Two Populations or Groups

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Sign test. Josemari Sarasola - Gizapedia. Statistics for Business. Josemari Sarasola - Gizapedia Sign test 1 / 13

Hypothesis Testing with Z and T

Data Analysis and Statistical Methods Statistics 651

Chapter 7 Class Notes Comparison of Two Independent Samples

1 Independent Practice: Hypothesis tests for one parameter:

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Statistics 251: Statistical Methods

Difference between means - t-test /25

Psych 10 / Stats 60, Practice Problem Set 5 (Week 5 Material) Part 1: Power (and building blocks of power)

Inference in Regression Model

280 CHAPTER 9 TESTS OF HYPOTHESES FOR A SINGLE SAMPLE Tests of Statistical Hypotheses

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

Lecture 7: Hypothesis Testing and ANOVA

Mathematical statistics

CHAPTER 10 HYPOTHESIS TESTING WITH TWO SAMPLES

Section 9.1 (Part 2) (pp ) Type I and Type II Errors

CptS 570 Machine Learning School of EECS Washington State University. CptS Machine Learning 1

Applied Statistics for the Behavioral Sciences

Introduction to Statistics

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.

In any hypothesis testing problem, there are two contradictory hypotheses under consideration.

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Transcription:

CHAPTER 9: HYPOTHESIS TESTING THE SECOND LAST EXAMPLE CLEARLY ILLUSTRATES THAT THERE IS ONE IMPORTANT ISSUE WE NEED TO EXPLORE: IS THERE (IN OUR TWO SAMPLES) SUFFICIENT STATISTICAL EVIDENCE TO CONCLUDE THAT (THE POPULATIONS OF) BOYS AND GIRLS REACT DIFFERENTLY TO VACCINATION (THE FACT THAT THE TWO SAMPLE PROPORTIONS ARE DIFFERENT DOES NOT MEAN MUCH BY ITSELF, SINCE RANDOM DIFFERENCES WILL PRACTICALLY ALWAYS MAKE THEM SO). A SIMILAR ISSUE ARISES IN JUST ABOUT EVERY CASE WE STUDIED SO FAR IN THE CONTEXT OF CONFIDENCE INTERVALS. LET US GO OVER THEM AGAIN, ONE BY ON. < THE POPULATION MEAN : (LARGE SAMPLES). 1

QUITE OFTEN, THERE IS A SPECIAL VALUE OF :, E.G. APPLE TREES IN A CERTAIN ORCHARD BEAR APPLES OF THE AVERAGE WEIGHT OF 500g (CLAIMS THE OWNER), AND WE WOULD LIKE TO TEST THIS HYPOTHESIS. WE WILL CALL IT THE NULL HYPOTHESIS, AND USE THE FOLLOWING NOTATION: H 0 : : = 500g NEXT, WE HAVE TO DECIDE WHAT ARE WE TESTING IT AGAINST (THE ALTERNATE HYPOTHESIS). THERE ARE SOME CASES WHEN WE NEED TO VERIFY THE EXACT AMOUNT (AN ASPIRIN TABLET CONTAINING 350 mg OF ASA) - IN THAT SITUATION THE ALTERNATE HYPOTHESIS WOULD BE TWO-TAILED, I.E. H 1 : : 350mg. THERE ARE OTHER CASES WHEN WE GLADLY ACCEPT MORE (E.G. BIGGER APPLES), BUT ANYTHING SMALLER WILL BE CONSIDERED A MISREPRESENTATION. IN THAT CASE, THE ALTERNATE HYPOTHESIS IS LEFT-TAILED, I.E. H 1 : : < 500g. 2

WE MAY ALSO HAVE THE OPPOSITE CASE: THE WATER PURIFYING PLANT CLAIMS REDUCING POLLUTANTS TO A 18ppm LEVEL (OUR NULL HYPOTHESIS), BUT WE GLADLY ACCEPT LESS. THE ALTERNATE HYPOTHESIS WILL THEN BE RIGHT-TAILED: H 1 : : > 18ppm. THE TEST ITSELF (DECIDING WHETHER WE CAN BELIEVE H 0 OR H 1 ) IS BASED ON COMPUTING A VALUE OF A RANDOM VARIABLE WHOSE DISTRIBUTION (UNDER H 0 ) IS KNOWN. IN THE CURRENT CASE (TESTING THE VALUE OF :), THIS WILL BE x µ 0 (CALLED TEST STATISTIC), WHERE s n : 0 IS THE VALUE CLAIMED BY THE NULL HYPOTHESIS (500g, 150mg, 18ppm, ETC.). WE KNOW ALREADY THAT THE DISTRIBUTION OF THIS RANDOM VARIABLE IS STANDARD NORMAL WHEN THE EXACT VALUE OF F IS KNOWN (NOT TOO COMMON), WE USE IT IN THE PREVIOUS FORMULA INSTEAD OF s. 3

THEN, WE SELECT THE LEVEL OF SIGNIFICANCE " (THE OPPOSITE OF THE CONFIDENCE LEVEL), WITH 0.05, 0.01, 0.10 BEING THE MOST COMMON CHOICES, AND IN TABLE 5(c) WE LOOK UP THE CORRESPONDING CRITICAL VALUE z c (SAME AS WHEN CONSTRUCTION CONFIDENCE INTERVALS, EXCEPT: IN ONE-TAILED TESTS, THE FULL 0.05 PROBABILITY IS ASSIGNED TO A SINGLE TAIL). WE REJECT THE NULL HYPOTHESIS (IN FAVOR OF H 1 ) WHENEVER THE TEST STATISTIC ENTERS THE CRITICAL REGION (THE ONE OR TWO TAILS BEYOND z c ). EXAMPLE: TO CHECK THE CLAIM THAT THE AVERAGE WEIGHT OF APPLES IS 500g, WE SELECT A RANDOM INDEPENDENT SAMPLE OF SIZE 75, RESULTING IN A SAMPLE MEAN OF 489g AND A SAMPLE STANDARD DEVIATION OF 83g. WE WANT TO PERFORM THIS TEST USING THE USUAL 5% SIGNIFICANCE LEVEL. CLEARLY: H 0 : : = 500g, H 1 : : < 500g AND z c = -1.645 (THE LAST LINE OF TABLE 6, UNDER "N ). THE VALUE OF THE TEST STATISTIC EQUALS: 4

489 500 83 75 = -1.148 CONCLUSION: WE DO NOT HAVE ENOUGH STATISTICAL EVIDENCE TO REJECT THE NULL HYPOTHESIS (THUS, WE MUST ACCEPT IT). THIS OF COURSE DOES NOT PROVE THE VALIDITY OF H 0 - WE MAY YET BE ABLE REJECT IT, BY LARGER SAMPLING. NOTE THAT FOR THIS (LEFT-TAILED) TEST, A SAMPLE MEAN OF 500g OR BIGGER WOULD AUTOMATICALLY YIELD ACCEPTING H 0. ALSO NOTE THAT THE TWO HYPOTHESES ARE TREATED VERY DIFFERENTLY: THE NULL HYPOTHESIS ALWAYS GETS THE BENEFIT OF THE DOUBT, WHEN OUR OBSERVATIONS FALL IN THE NOT-SO- CLEAR REGION. THE ALTERNATE HYPOTHESIS, ON THE OTHER HAND, MUST PROVE ITS CASE BEYOND A REASONABLE DOUBT. THIS IS BECAUSE THE CONSEQUENCES OF REJECTING H 0 ARE USUALLY MORE SERIOUS THAN THOSE OF ACCEPTING (IT S BETTER TO SAY: FAILING TO REJECT ) IT. 5

A FEW MORE TERMS TO REMEMBER: REJECTING H 0 WHEN IT S TRUE (I.E. INCORRECTLY) IS MAKING TYPE I ERROR. WE KNOW THAT THE PROBABILITY OF THIS IS SET TO " (AND IT THUS, IN A SENSE, UNDER CONTROL ). FAILING TO REJECT H 0 WHEN IT S FALSE IS CALLED TYPE II ERROR. ITS PROBABILITY (DENOTED $) DEPENDS ON THE ON THE ACTUAL VALUE OF : (THE MORE : DEVIATES FROM THE NULL-HYPOTHESIS CLAIM, THE SMALLER THE CHANCES OF MAKING A TYPE II MISTAKE). THE PROBABILITY OF REJECTING H 1 WHEN IT S FALSE (THE CORRECT COURSE OF ACTION) IS THUS EQUAL TO 1 - $ AND IT S CALLED THE POWER OF THE TEST (WE WOULD LIKE IT TO BE LARGE). 6

NOTE THAT: AS : APPROACHES : 0 (THE NULL-HYPOTHESIS CLAIM), THE POWER OF THE TEST REACHES THE VALUE OF " (FOR THE APPLE EXAMPLE, IT IS GOING TO BE DIFFICULT TO REJECT H 0 WHEN : = 499g). THERE IS A DIFFERENT (BUT, IN THE FINAL ANALYSIS, EQUIVALENT) WAY OF DECIDING WHETHER TO REJECT H 0 OR NOT, BASED ON THE SO CALLED P VALUE. THIS REQUIRES LOOKING UP (IN TABLE 5) THE TAIL AREA (OF THE STANDARD NORMAL DISTRIBUTION) CORRESPONDING TO THE µ VALUE OF 0 (THE TEST STATISTIC). FOR x s n TWO-TAILED TESTS, THIS AREA MUST FURTHER BE MULTIPLIED BY 2. THE ACTUAL DECISION (WHETHER TO REJECT H 0, OR NOT) IS THEN MADE DEPENDING ON THIS P BEING LESS THAN " (REJECT) OR BIGGER THAN " (DON T). 7

THIS HAS THE ADVANTAGE OF KNOWING EXACTLY HOW STRONG OUR REJECTION IS. USING THE PREVIOUS EXAMPLE, THE P VALUE CORRESPONDING TO 1.148 IS 1.0000-0.8749 = 12.51%. SINCE THIS IS BIGGER THAN 5% (OR EVEN 10%), WE CANNOT REJECT H 0 : : = 500g (SAME CONCLUSION AS BEFORE). NOW, WE MODIFY THE PROCEDURE TO ACCOMMODATE THE OTHER POSSIBILITIES OF THE LAST CHAPTER. FIRSTLY, WE EXTEND THE PREVIOUS CASE (TESTING WHETHER A SPECIFIC VALUE OF : IS CORRECT OR NOT) TO COVER THE SITUATION OF A SMALL SAMPLE SIZE. SIMILARLY TO CONSTRUCTING CONFIDENCE INTERVALS, WE NOW HAVE TO ASSUME THAT THE POPULATION ITSELF IS NORMAL. EVERYTHING ELSE REMAINS THE SAME, EXCEPT THE CRITICAL VALUE(S) WILL BE LOOKED UP IN THE TABLE OF THE t - DISTRIBUTION, RATHER THAN z. 8

EXAMPLE: TO CHECK THE CLAIM THAT THE AVERAGE WAGE IN A CERTAIN INDUSTRY IS $40,000 ANNUALLY, WE SELECT A RANDOM SAMPLE OF 22 OF EMPLOYEES AND ESTABLISH THEIR SAMPLE MEAN TO BE $36,274, WITH THE CORRESPONDING SAMPLE STANDARD DEVIATION OF $8,409. DOES THIS CONSTITUTE A STATISTICALLY SIGNIFICANT EVIDENCE THAT THE ORIGINAL CLAIM IS INCORRECT (EITHER WAY)? H 0 : : = $40,000, H 1 : : $40,000 36274 40000 THE VALUE OF THE TEST STATISTIC IS = -2.078 8409 22 THE CORRESPONDING CRITICAL VALUES ARE ±2.080 (FROM TABLE 6 (ALSO IN INSERT), USING d.f. OF 21 AND "O = 0.050). CONCLUSION: NOT ENOUGH STATISTICAL EVIDENCE TO DISPROVE THE VALUE OF $40,000. < TESTING THE VALUE OF p (PROPORTION OR PROBABILITY OF SUCCESS ). WE AGAIN HAVE TO ASSUME THAT n IS LARGE (IN THE pn > 5, qn > 5 SENSE). THE TEST STATISTIC IS 9

$p p pq 0 0 0 WHERE p 0 IS THE VALUE OF THE NULL HYPOTHESIS (NOTE THAT p 0, NOT, IS NOW USED IN THE DENOMINATOR). THIS TEST STATISTIC HAS, UNDER H 0, THE STANDARD NORMAL DISTRIBUTION. n $p EXAMPLE: WE SUSPECT THAT OUR OPPONENT IS USING A CROOKED DIE (PROBABILITY OF A SIX IS BIGGER THAN 1/6). WE BORROW THE DIE AND ROLL IT 1000 TIME, GETTING 213 SIXES. IS THERE ENOUGH STATISTICAL EVIDENCE TO ACCUSE HIM OF BEING DISHONEST? WE HAVE: H 0 : p = 1/6, H 1 : p > 1/6 0213. 1 THE VALUE OF THE TEST STATISTIC IS 6 = 3.932 1 5 1 6 6 1000 THE CRITICAL REGION IS ANYTHING BEYOND 2.58 (THIS TIME WE DECIDED TO USE A VERY SMALL 0.5% LEVEL OF SIGNIFICANCE). WE OBTAINED A HIGHLY SIGNIFICANT EVIDENCE THAT HS DIE IS CROOKED! 10

< TESTING FOR DIFFERENCE BETWEEN TWO POPULATION MEANS, HAVING LARGE, INDEPENDENT SAMPLES. TYPICALLY, THE NULL HYPOTHESIS WOULD CLAIM THAT THE TWO MEANS HAVE THE SAME VALUE (H 0 : µ 1 = µ 2 ). THE TEST STATISTIC IS x s n x 1 2 2 1 1 s + n 2 2 2 (USING THE USUAL NOTATION). UNDER THE NULL HYPOTHESIS, THE TEST STATISTIC HAS THE STANDARD NORMAL DISTRIBUTION. EXAMPLE: TO TEST (USING " = 1%) WHETHER MEN HAVE, ON THE AVERAGE, HIGHER BLOOD PRESSURE THAN WOMEN, WE COLLECT TWO INDEPENDENT SAMPLES, EACH OF SIZE 100. THE TWO SAMPLE MEANS TURN OUT TO BE 137 mm AND 126 mm, WITH STANDARD DEVIATIONS OF 32 mm AND 28 mm. 11

THE HYPOTHESES ARE: H 0 : µ 1 = µ 2 AND H 1 : µ 1 > µ 2 137 126 THE TEST STATISTIC EVALUATES TO = 2.587 2 2 32 28 + 100 100 IN TABLE 6 (UNDER 4 d.f.), WE FIND THAT THE CORRESPONDING CRITICAL VALUE IS 2.326. THERE IS SUFFICIENT STATISTICAL EVIDENCE TO CONCLUDE MEN HAVE, ON THE AVERAGE, HIGHER BLOOD PRESSURE THAT WOMEN. DISCLAIMER: ALL OUR EXAMPLES ARE HYPOTHETICAL, WITH MADE UP DATA - THE CONCLUSIONS ARE NOT TO BE TAKEN SERIOUSLY! TWO NOTES: WHEN THE EXACT VALUES OF F 1 AND F 2 ARE KNOWN AND GIVEN, THESE ARE TO BE SUBSTITUTED FOR s 1 AND s 2. WHEN THE NULL HYPOTHESIS CLAIMS SOMETHING LIKE THIS: H 0 : : 1 = : 2 + 10, THE NUMERATOR OF THE TEST STATISTIC MUST BE MODIFIED CORRESPONDINGLY, I.E. CHANGED TO x x. 1 2 10 12

< SMALL-SAMPLE MODIFICATIONS: WE NEED BOTH POPULATIONS TO BE NORMAL, AND THE TWO (POPULATION) F S TO EQUAL TO EACH OTHER. THE TEST STATISTIC IS THEN x x 1 2 1 1 ( n 1) s + ( n 1) s 1 1 + n n n + n 2 1 2 2 1 2 2 2 2 WHICH HAS, WHEN H 0 IS TRUE, THE t DISTRIBUTION WITH n 1 + n 2-2 DEGREES OF FREEDOM. EXAMPLE: TO TEST WHETHER TWO CAR MAKES HAVE THE SAME FUEL CONSUMPTION (OR NOT), WE HAVE DRIVEN 10 CARS OF EACH MAKE OVER THE SAME TRACK, OBTAINING THE FOLLOWING RESULTS: THE TWO SAMPLE MEANS WERE 23.82L AND 22.41L, THE CORRESPONDING SAMPLE STANDARD DEVIATIONS 0.38L AND 0.43L. WE WANT TO USE 10% LEVEL OF SIGNIFICANCE. 13

H 0 : : 1 = : 2, H 1 : : 1 : 2. THE TEST STATISTIC EVALUATES TO 02. 2382. 22. 41 9 038. + 9 043. 10 + 10 2 2 2 = 7.77 THE CRITICAL VALUES (TABLE 6, d.f. = 18, "O = 0.100) ARE ±1.734. CONCLUSION: IN THIS CASE, THE TWO CONSUMPTION ARE CLEARLY DIFFERENT (AT JUST ABOUT ANY SIGNIFICANCE LEVEL). < DIFFERENCE IN TWO PROPORTIONS (PROBABILITIES OF A SUCCESS ). THE TWO SAMPLES MUST BE INDEPENDENT, BOTH MUST BE LARGE. THE TEST STATISTIC IS p$ p$ 1 2 1 1 + pq $ $ n n 1 2 14

r n + r + n 1 2 WHERE $p / IS THE SO CALLED 1 2 POOLED ESTIMATE OF THE PROBABILITY OF SUCCESS, AND q$ 1 p$. ITS DISTRIBUTION IS, TO A GOOD APPROXIMATION, STANDARD NORMAL. EXAMPLE: TEST, USING 5% LEVEL OF SIGNIFICANCE, WHETHER THE PROPORTION OF PEOPLE WITH BLOOD TYPE O IS THE SAME IN US AND JAPAN. H 0 : p 1 = p 2, H 1 : p 1 p 2. SUPPOSE THAT OUR SAMPLING RESULTED IN 137 SUCH PEOPLE OUT OF 400 IN US, AND 98 OUT OF 300 IN JAPAN. THE VALUE OF THE TEST STATISTIC IS THUS 137 / 400 98/ 300 1/ 300+ 1/ 400 235/ 700 465/ 700 = 0.439 THE CRITICAL VALUES ARE ±1.96. CONCLUSION: WE DON T HAVE STATISTICALLY SIGNIFICANT EVIDENCE TO REJECT THE HYPOTHESIS THAT THE TWO PROPORTIONS ARE IDENTICAL. 15

< TESTS INVOLVING PAIRED DIFFERENCES THIS IS THE ONLY SITUATION WHICH WE DID NOT LEARN TO CONSTRUCT CONFIDENCE INTERVALS FOR. TYPICALLY, IT INVOLVES AN EXPERIMENT IN WHICH TWO OBSERVATIONS ARE TAKEN FOR EACH SUBJECT (E.G. BEFORE AND AFTER A MEDICATION IS TAKEN). WE ARE USUALLY GIVEN TWO SETS OF (PAIRED) DATA, AND MUST FIRST REALIZE THAT THESE ARE NOT TWO INDEPENDENT SAMPLES. THE NEXT THING IS TO COMPUTE THE DIFFERENCES OF EACH PAIR OF VALUES (SOME MAY TURN OUT POSITIVE, SOME NEGATIVE), AND PRETTY MUCH FORGET THE ORIGINAL DATA. THE NULL HYPOTHESIS USUALLY STATES THAT THE (POPULATION) MEAN DIFFERENCE IS ZERO (: d = 0). 16

THE TEST STATISTIC IS: s d d n WHERE d IS THE SAMPLE MEAN OF THE DIFFERENCES, AND s d IS THE CORRESPONDING SAMPLE STANDARD DEVIATION. ASSUMING THE DIFFERENCE DISTRIBUTION TO BE NORMAL, THE TEST STATISTIC HAS THE t DISTRIBUTION WITH n - 1 DEGREES OF FREEDOM (THE STANDARD NORMAL DISTRIBUTION WHEN n > 30). EXAMPLE: A CERTAIN BLOOD-PRESSURE MEDICATION IS BEING TESTED ON 12 RANDOMLY SELECTED INDIVIDUALS. THEIR BLOOD PRESSURE IS RECORDED BEFORE AND AFTER THEY TAKE THIS MEDICATION; THESE ARE THE RESULTS: B: 143 128 160 148 139 172 144 150 138 153 180 163 A: 128 132 144 139 137 140 125 138 139 139 161 129 17

FIRST, WE NEED TO COMPUTE THE DIFFERENCES: 15, -4, 16, 9, 167 2, 32, 19, 12, -1, 14, 19, 34, THEIR MEAN: d = = 13.92 AND 12 s d = 3825 167 12 11 STANDARD DEVIATION: 11.68 THE RESULTING VALUE OF THE TEST STATISTIC IS: 1392. 1168. 12 = 4.128 2 = THE HYPOTHESES ARE: H 0 : : d = 0 AND H 1 : : d > 0 (ONE-TAILED TEST). USING 1% LEVEL OF SIGNIFICANCE (AND 11 DEGREES OF FREEDOM), WE FIND (TABLE 6) THAT THE CRITICAL VALUE IS 2.718. CONCLUSION: WE HAVE A HIGHLY SIGNIFICANT PROOF THAT THE MEDICATION IS EFFECTIVE. NOTE: SHOULD THE NULL HYPOTHESIS CLAIM THAT : d = 20 (NOT VERY COMMON) WE WOULD HAVE TO MODIFY OUR TEST STATISTIC TO d 20 s d n WOULD BE THE SAME). (EVERYTHING ELSE 18