Comparing Means from Two-Sample

Kwonsang Lee, University of Pennsylvania (kwonlee@wharton.upenn.edu)
STAT111, April 3, 2015

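Inference from One-Sample

We have two options for making an inference about the population mean $\mu$ from one sample of size $n$: 1) a 100(1-C)% confidence interval and 2) a hypothesis test at level $\alpha$.

1) 100(1-C)% Confidence Interval

We need to consider whether $\sigma$ is known or unknown. Here $Z^*$ and $t^*_{n-1}$ are the critical values of the standard normal distribution and of the t-distribution with $n-1$ degrees of freedom.

a. Known $\sigma$: $\left( \bar{X} - Z^* \frac{\sigma}{\sqrt{n}},\ \bar{X} + Z^* \frac{\sigma}{\sqrt{n}} \right)$
b. Unknown $\sigma$: $\left( \bar{X} - t^*_{n-1} \frac{s}{\sqrt{n}},\ \bar{X} + t^*_{n-1} \frac{s}{\sqrt{n}} \right)$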
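The two intervals above can be sketched in a few lines of Python. This is only an illustration: the function names `z_interval` and `t_interval` and the sample numbers are made up, and the t critical value has to be supplied by the reader (for example from a t-table), because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    # Z* cuts off an upper-tail area of (1 - confidence) / 2.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def t_interval(xbar, s, n, t_star):
    """Confidence interval for the mean when sigma is unknown.

    t_star is the critical value t*_{n-1}, looked up by the caller.
    """
    margin = t_star * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical numbers, not from the slides.
low, high = z_interval(xbar=50.0, sigma=10.0, n=25)
print(round(low, 2), round(high, 2))
```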

Inference from One-Sample

2) Hypothesis test at level $\alpha$

a. State the null and alternative hypotheses. (Here, a two-sided example.) $H_0: \mu = \mu_0$ and $H_a: \mu \ne \mu_0$.
b. Calculate a test statistic $Z_0$ (known $\sigma$) or $T_0$ (unknown $\sigma$):
   $Z_0 = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$ or $T_0 = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$
c. Calculate the P-value:
   P-value $= 2\,P(Z \ge |Z_0|)$ if $\sigma$ is known, and P-value $= 2\,P(T \ge |T_0|)$ if $\sigma$ is unknown.
d. Compare the P-value to the significance level $\alpha$.
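For the known-$\sigma$ case, steps b to d can be sketched as follows. The function name `one_sample_z_test` and the input numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n):
    """Two-sided one-sample Z test with known sigma."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    # Two-sided P-value: 2 * P(Z >= |Z0|).
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value


# Hypothetical numbers, not from the slides.
z0, p_value = one_sample_z_test(xbar=52.0, mu0=50.0, sigma=10.0, n=100)
alpha = 0.05
print(round(z0, 3), round(p_value, 4), p_value < alpha)
```

Here `p_value < alpha` plays the role of step d: the null is rejected when the comparison is true.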

Supplement of t-test (Two-sided test)

Because the t-table doesn't give the P-value, we can modify our t-test. Instead of computing the P-value, we find the critical value $t^*_{n-1}$ such that $P(T > t^*_{n-1}) = \frac{\alpha}{2}$. Then:

We reject the null if $|T_0| \ge t^*_{n-1}$.
We don't reject the null if $|T_0| < t^*_{n-1}$.

Note: If the one-sided alternative hypothesis is $H_a: \mu > \mu_0$, we need to find the value $t^*_{n-1}$ such that $P(T > t^*_{n-1}) = \alpha$. We reject the null if $T_0 > t^*_{n-1}$. Also, if $H_a: \mu < \mu_0$, we need to use $t^*_{n-1}$ such that $P(T < t^*_{n-1}) = \alpha$ (this value is negative). We reject the null if $T_0 < t^*_{n-1}$. Draw the t-distribution and think about it!
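The decision rule can be written as a small function. This is a sketch under one assumption that differs slightly from the slide: the caller always passes the positive upper-tail critical value, and the lower-tail value for $H_a: \mu < \mu_0$ is obtained by symmetry of the t-distribution. The name `reject_null` is hypothetical.

```python
def reject_null(t0, t_star, alternative="two-sided"):
    """Critical-value decision rule for a t-test.

    t_star is the positive upper-tail critical value from a t-table:
    upper-tail area alpha/2 for a two-sided test, alpha for a one-sided test.
    """
    if alternative == "two-sided":
        return abs(t0) >= t_star
    if alternative == "greater":  # H_a: mu > mu_0
        return t0 > t_star
    if alternative == "less":  # H_a: mu < mu_0, lower critical value is -t_star
        return t0 < -t_star
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# Hypothetical statistic with 20 degrees of freedom; 2.086 is the
# table value with upper-tail area 0.025.
print(reject_null(2.5, 2.086))
print(reject_null(-1.0, 2.086))
```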

New Terminology: Standard Error

$X$ is from a population with mean $\mu$ and SD $\sigma$. We take a sample of size $n$ from the population, say $X_1, \ldots, X_n$.

What we learned:
A sample $(X_1, \ldots, X_n)$ has the sample mean $\bar{X}$ and the sample SD $s$.
The sample mean $\bar{X}$ has a distribution with mean $\mu$ and SD $\frac{\sigma}{\sqrt{n}}$.

New terminology: the standard error of $\bar{X}$ is $\frac{s}{\sqrt{n}}$, i.e. $SE(\bar{X}) = \frac{s}{\sqrt{n}}$.
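As a small illustration, the standard error can be computed directly from a sample. The data below are invented for the example; `statistics.stdev` uses the $n-1$ divisor, which matches the sample SD $s$.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, not from the slides.
sample = [12.0, 15.0, 11.0, 14.0, 13.0]

xbar = mean(sample)          # sample mean
s = stdev(sample)            # sample SD (n - 1 divisor)
se = s / sqrt(len(sample))   # standard error of the sample mean

print(xbar, round(s, 4), round(se, 4))
```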

Two-Sample Example

Let's assume that we want to study household incomes in Philadelphia and New York.

Philadelphia income distribution: mean $\mu_p$ and SD $\sigma_p$
New York income distribution: mean $\mu_n$ and SD $\sigma_n$

Then, we take a Philadelphia sample of size $n_p$ and a New York sample of size $n_n$.

Philadelphia sample: sample mean $\bar{x}_p$ and sample SD $s_p$
New York sample: sample mean $\bar{x}_n$ and sample SD $s_n$

What to do? We want to compare $\mu_p$ with $\mu_n$.
1) Hypothesis test of $\mu_p = \mu_n$
2) Confidence interval of $\mu_p - \mu_n$

Inference from Two-Sample: Intro

We don't know the values of $\mu_1$ and $\mu_2$. We want to make inferences from Sample 1 and Sample 2. We can conduct a hypothesis test or construct a confidence interval for $\mu_1 - \mu_2$. Also, we need to consider whether $\sigma_1$ and $\sigma_2$ are known or unknown.

1) Hypothesis test of $\mu_1 = \mu_2$, or equivalently $\mu_1 - \mu_2 = 0$.
2) Confidence interval of $\mu_1 - \mu_2$.

Two-Sample Hypothesis Test: Known $\sigma_1$ and $\sigma_2$

Since $\sigma_1$ and $\sigma_2$ are known, we can use the Z test.

a. $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 \ne 0$.
b. The test statistic $Z_0$ is
   $Z_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
c. The P-value is
   $P(Z \le -|Z_0|) + P(Z \ge |Z_0|) = 2\,P(Z \ge |Z_0|)$
d. Compare the P-value with the level $\alpha$.
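A minimal sketch of the two-sample Z test, assuming made-up summary numbers. The function name `two_sample_z_test` and the parameter `delta0` (the value of $\mu_1 - \mu_2$ under the null, zero here) are introduced only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Two-sided two-sample Z test with known sigma1 and sigma2."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z0 = ((xbar1 - xbar2) - delta0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value


# Hypothetical numbers, not from the slides.
z0, p_value = two_sample_z_test(
    xbar1=60.0, xbar2=55.0, sigma1=12.0, sigma2=16.0, n1=36, n2=64
)
print(round(z0, 3), round(p_value, 4))
```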

Two-Sample Hypothesis Test: Unknown $\sigma_1$ and $\sigma_2$

Since $\sigma_1$ and $\sigma_2$ are unknown, we use $s_1$ and $s_2$ instead and use the t-test at level $\alpha$.

a. $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 \ne 0$.
b. The test statistic $T_0$ is
   $T_0 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
c. (Modified version) We can find the critical value $t^*_k$ such that $P(T > t^*_k) = \frac{\alpha}{2}$, where $k = \min(n_1 - 1, n_2 - 1)$.
d. Compare $|T_0|$ with the value $t^*_k$. ($|T_0| > t^*_k \Rightarrow$ reject the null.)

Example 1

There is a product A that is advertised as helping students to learn Statistics more effectively. We want to test if there is any positive effect of product A. Among 44 participants, we randomly select 21 people to use the product (21 treated and 23 control). After one month, all participants take a statistics test, and the scores are recorded. The following is the summary:

           n     x̄      s
Treated    21    51.5   11
Control    23    41.5   17

Q: How can we conduct a hypothesis test? We need to do a two-sample t-test!

Example 1

Two-sample t-test at level $\alpha = 0.05$:

a. $H_0: \mu_t - \mu_c = 0$ and $H_a: \mu_t - \mu_c \ne 0$, where $t$ stands for treated and $c$ for control.
b. The test statistic $T_0$ is given by
   $T_0 = \frac{(\bar{X}_t - \bar{X}_c) - (\mu_t - \mu_c)}{\sqrt{\frac{s_t^2}{n_t} + \frac{s_c^2}{n_c}}} = \frac{(51.5 - 41.5) - 0}{\sqrt{\frac{11^2}{21} + \frac{17^2}{23}}} = 2.336$
c. Conservatively, the degrees of freedom are $k = \min(n_t - 1, n_c - 1) = 20$. The critical value $t^*_{20}$ is 2.086, since $P(T > t^*_{20}) = \frac{\alpha}{2} = 0.025$.
d. Since $|T_0| = 2.336 > 2.086 = t^*_{20}$, we reject the null hypothesis.

t-table: http://bcs.whfreeman.com/ips6e/content/cat_050/ips6e_table-d.pdf
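The arithmetic in Example 1 can be checked with a short script. The function name `two_sample_t` is hypothetical; the summary statistics and the critical value 2.086 are the ones given in the example.

```python
from math import sqrt


def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Two-sample t statistic with the conservative degrees of freedom."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    t0 = ((xbar1 - xbar2) - delta0) / se
    k = min(n1 - 1, n2 - 1)  # conservative choice of degrees of freedom
    return t0, k


# Summary statistics from Example 1 (treated vs. control).
t0, k = two_sample_t(xbar1=51.5, s1=11, n1=21, xbar2=41.5, s2=17, n2=23)

T_STAR_20 = 2.086  # t-table value with upper-tail area 0.025 and 20 df
print(round(t0, 3), k, abs(t0) > T_STAR_20)
```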

Two-Sample t-test in JMP

Here are the references for the t-test.

One-sample t-test: http://www.chem.sc.edu/faculty/morgan/resources/statistics/jmp_one_sample_t-test.pdf
Two-sample t-test: http://www.chem.sc.edu/faculty/morgan/resources/statistics/jmp_two_sample_t-test.pdf

Steps for the two-sample t-test:
1. Open the data file.
2. Go to Analyze > Fit Y by X. For example, Y is a score variable and X is an indicator of either treated or control.
3. Click the red triangle next to "Oneway Analysis of..." and choose t-test.

Using JMP

We can find the descriptive statistics of each sample. We also find a confidence interval of $\mu_1 - \mu_2$ and the results of the two-sample t-test. Using JMP, we can compute the P-value of our test statistic $T_0$ from Example 1. The P-value is 0.0264, which is less than 0.05, so we reject the null hypothesis.

Here is another reference related to Example 1: http://web.utk.edu/~cwiek/201tutorials/twosamplettest/

Confidence Interval from Two-Sample

We can consider two cases: 1) known $\sigma_1$ and $\sigma_2$, and 2) unknown $\sigma_1$ and $\sigma_2$.

Confidence interval of $\mu_1 - \mu_2$ with known $\sigma_1$ and $\sigma_2$:
$(\bar{X}_1 - \bar{X}_2) \pm Z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Confidence interval of $\mu_1 - \mu_2$ with unknown $\sigma_1$ and $\sigma_2$:
$(\bar{X}_1 - \bar{X}_2) \pm t^*_k \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where $k = \min(n_1 - 1, n_2 - 1)$.
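As an illustration of the unknown-$\sigma$ interval, the sketch below applies the formula to the summary statistics of Example 1 with the conservative critical value $t^*_{20} = 2.086$ from the t-table. The slides do not work out this interval; the function name `two_sample_t_interval` is hypothetical.

```python
from math import sqrt


def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 with unknown sigma1 and sigma2.

    t_star is the critical value t*_k with k = min(n1 - 1, n2 - 1),
    looked up by the caller.
    """
    diff = xbar1 - xbar2
    margin = t_star * sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin


# Example 1 summary statistics; 2.086 gives a 95% interval with 20 df.
low, high = two_sample_t_interval(51.5, 11, 21, 41.5, 17, 23, t_star=2.086)
print(round(low, 2), round(high, 2))
```

The interval does not contain 0, which agrees with rejecting the null hypothesis at level 0.05 in the two-sided test.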

Special Case: Matched Pairs

Sometimes the two samples that are being compared are matched pairs. For example, suppose there is a drug A that may lower blood pressure. Each subject's blood pressure is measured before taking the drug and again after intake, so one subject has two values of the outcome. We then want to test if there is any difference between blood pressure before intake and blood pressure after intake.

          Subject 1   Subject 2   ...   Subject n
Before    130         128         ...   126
After     116         110         ...   108

We want to test if blood pressure before = blood pressure after.

Matched Pairs

In this case, we can compute the difference $D = X_1 - X_2$. Here, Diff = Before - After.

          Subject 1   Subject 2   ...   Subject n
Before    130         128         ...   126
After     116         110         ...   108
Diff      14          18          ...   18

Then, we can test a null hypothesis like $H_0$: mean of Diff = 0. This is a one-sample t-test.

Matched Pairs Test

From a matched pairs design, we have $X_1$ and $X_2$ for $n$ subjects. We compute the new variable $D = X_1 - X_2$ and then the sample mean and the sample SD of $D$: $\bar{X}_d$ and $s_d$.

Then, we can state the null hypothesis $H_0: \mu_d = 0$ and the alternative $H_a: \mu_d \ne 0$. We calculate the test statistic $T_0$:

$T_0 = \frac{\bar{X}_d - \mu_d}{s_d / \sqrt{n}}$

Then, we can find the critical value $t^*_{n-1}$ and compare it with the test statistic $T_0$.
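Starting from raw paired measurements, the test statistic can be sketched as below. The first three pairs are the ones shown in the blood pressure table; the last two pairs are invented so that the example is complete, and the function name `paired_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x1, x2, mu_d0=0.0):
    """Matched pairs t statistic: a one-sample t-test on the differences."""
    if len(x1) != len(x2):
        raise ValueError("matched pairs need equal-length samples")
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)  # sample SD of the differences (n - 1 divisor)
    t0 = (d_bar - mu_d0) / (s_d / sqrt(n))
    return t0, n - 1


# First three pairs from the slide; the last two pairs are made up.
before = [130, 128, 126, 120, 118]
after = [116, 110, 108, 119, 117]

t0, df = paired_t(before, after)
print(round(t0, 3), df)
```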

Example 2

We consider the blood pressure drug example. The summary is:

Subject   Before   After   D
1         130      116     14
2         128      110     18
...       ...      ...     ...
10        126      108     18

$\bar{x}_{before} = 122.2$, $s_{before} = 6.3$
$\bar{x}_{after} = 113$, $s_{after} = 9.1$

However, in a matched pairs design, what we need is the new variable D = Before - After. We have $\bar{x}_d = 9.2$, $s_d = 9.8$.

Example 2: Under the Independence Assumption

It is clear that Before and After are not independent, because the two measurements come from the same subject. If we nevertheless treat Before and After as independent, what would we conclude? We want to do a hypothesis test at level $\alpha = 0.02$.

a. $H_0: \mu_{before} - \mu_{after} = 0$ and $H_a: \mu_{before} - \mu_{after} \ne 0$.
b. The test statistic $T_0$ is
   $T_0 = \frac{(\bar{x}_{before} - \bar{x}_{after}) - (\mu_{before} - \mu_{after})}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{122.2 - 113}{\sqrt{\frac{6.3^2}{10} + \frac{9.1^2}{10}}} = 2.629$
c. $k = n - 1 = 9$ and $t^*_9 = 2.821$, such that $P(T > t^*_9) = \alpha/2 = 0.01$.
d. $|T_0| = 2.629 < 2.821 = t^*_9$. So, we don't reject the null. It means that there is not enough evidence that the drug has an effect on lowering blood pressure.

Example 2: Matched Pairs

Before and After in the original data are dependent, so a two-sample t-test that assumes independence is not correct. Here is the correct matched pairs t-test, at level $\alpha = 0.02$.

a. $H_0: \mu_d = 0$ and $H_a: \mu_d \ne 0$.
b. The test statistic $T_0$ is
   $T_0 = \frac{\bar{x}_d - \mu_d}{s_d / \sqrt{n}} = \frac{9.2}{9.8 / \sqrt{10}} = 2.969$
c. $k = n - 1 = 9$ and $t^*_9 = 2.821$, such that $P(T > t^*_9) = \alpha/2 = 0.01$.
d. $|T_0| = 2.969 > 2.821 = t^*_9$. So, we reject the null. It means that there is enough evidence that the drug has an effect on lowering blood pressure.

Note: It is important to use the right approach!
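To see the two analyses of Example 2 side by side, the following sketch recomputes both test statistics from the summary statistics in the slides and compares each with the critical value $t^*_9 = 2.821$. The variable names are chosen only for this illustration.

```python
from math import sqrt

n = 10
T_STAR_9 = 2.821  # t-table value with upper-tail area 0.01 and 9 df

# (Incorrect here) two-sample statistic that treats Before and After
# as independent samples.
xbar_before, s_before = 122.2, 6.3
xbar_after, s_after = 113.0, 9.1
t_independent = (xbar_before - xbar_after) / sqrt(
    s_before**2 / n + s_after**2 / n
)

# (Correct) matched pairs statistic based on D = Before - After.
xbar_d, s_d = 9.2, 9.8
t_paired = xbar_d / (s_d / sqrt(n))

print(round(t_independent, 3), abs(t_independent) > T_STAR_9)
print(round(t_paired, 3), abs(t_paired) > T_STAR_9)
```

The numerator is the same in both statistics; only the standard error changes, and that change is what flips the conclusion.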

Summary

We have learned confidence intervals (CI) and hypothesis tests for many situations. Here is a guide to choosing the right analysis.

1. Understand the data: is it one-sample, two-sample, or a matched pairs design?
2. Do we know the population SD $\sigma$?
3. Is our goal to construct a CI or to do a hypothesis test?
4. If we need to do a hypothesis test, what are the null and alternative hypotheses? (One-sided or two-sided?)

Next Week

We have been talking about inference for the population mean $\mu$. Next week, we're going to talk about CIs and hypothesis tests for the population proportion $p$.