STA 2023 Module 10: Comparing Two Proportions

Transcription:

STA 2023 Module 10: Comparing Two Proportions

Learning Objectives
Upon completing this module, you should be able to:
1. Perform large-sample inferences (hypothesis tests and confidence intervals) to compare two population proportions.
2. Describe the relationship between the sample sizes, confidence level, and margin of error for a confidence interval for the difference between two population proportions.
3. Determine the sample size required for a specified confidence level and margin of error for the estimate of the difference between two population proportions.

Population Proportions
In this module, we are going to learn how to compare two population proportions. Remember, a population proportion, p, is simply the percentage of a population that has a specified attribute.

Quick Review on Population Proportion and Sample Proportion
In short, a sample proportion is obtained by dividing the number of members sampled that have the specified attribute (x) by the total number of members sampled (n). Sometimes, we refer to x as the number of successes and n - x as the number of failures.

Quick Review on One-Proportion z-Interval
When the conditions are met, we are ready to find the confidence interval for the population proportion, p. The confidence interval is
$\hat{p} \pm z^* \times SE(\hat{p})$
where
$SE(\hat{p}) = \sqrt{\frac{\hat{p}\hat{q}}{n}}$ and $\hat{q} = 1 - \hat{p}$.
The critical value, z*, depends on the particular confidence level, C, that you specify.

Comparing Two Proportions
Comparisons between two percentages are much more common than questions about isolated percentages. And they are more interesting. We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year's results are better than last year's.

Another Ruler
In order to examine the difference between two proportions, we need another ruler: the standard deviation of the sampling distribution model for the difference between two proportions. Recall that standard deviations don't add, but variances do. In fact, the variance of the sum or difference of two independent random variables is the sum of their individual variances.

The Standard Deviation of the Difference Between Two Proportions
Proportions observed in independent random samples are independent. Thus, we can add their variances. So the standard deviation of the difference between two sample proportions is
$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
Because the population proportions are unknown, we replace them with the sample proportions. Thus, the standard error is
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$

Assumptions and Conditions
Independence Assumptions:
Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
Independent Groups Assumption: The two groups we're comparing must be independent of each other.

Assumptions and Conditions (cont.)
Sample Size Condition: Each of the groups must be big enough.
Success/Failure Condition: Both groups are big enough that at least 5 successes and at least 5 failures have been observed in each.

The Sampling Distribution
We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution. The same is true of their difference.

The Sampling Distribution (cont.)
Provided that the sampled values are independent, the samples are independent, and the sample sizes are large enough, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with
Mean: $\mu = p_1 - p_2$
Standard deviation: $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
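The Normal model above can be checked with a small simulation. The sketch below is only an illustration: the population proportions, sample sizes, number of repetitions, and the helper name `simulate_differences` are all assumptions chosen for the example, not values from the slides. It draws many pairs of independent samples and compares the simulated mean and standard deviation of $\hat{p}_1 - \hat{p}_2$ with the formulas.

```python
import math
import random
import statistics


def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Simulate `reps` values of p1_hat - p2_hat from two independent samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Count successes in each sample by drawing n Bernoulli trials.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical populations and sample sizes (assumptions for illustration).
p1, n1 = 0.40, 200
p2, n2 = 0.30, 300

diffs = simulate_differences(p1, n1, p2, n2, reps=4000)

# Values predicted by the Normal model: variances add, then take the square root.
theory_mean = p1 - p2
theory_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

sim_mean = statistics.fmean(diffs)
sim_sd = statistics.stdev(diffs)

print(f"mean: theory {theory_mean:.4f}, simulated {sim_mean:.4f}")
print(f"SD:   theory {theory_sd:.4f}, simulated {sim_sd:.4f}")
```

The simulated values should land close to the theoretical ones, though they will not match exactly because the simulation itself is random.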

Two-Proportion z-Interval
When the conditions are met, we are ready to find the confidence interval for the difference of two proportions. The confidence interval is
$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$
where
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
The critical value z* depends on the particular confidence level, C, that you specify.

Pool or Not Pool?
The typical hypothesis test for the difference in two proportions is the one of no difference. In symbols, $H_0: p_1 - p_2 = 0$. Since we are hypothesizing that there is no difference between the two proportions, both groups share a single common proportion under the null hypothesis, so the standard deviation of each sample proportion depends on that same value. Since this is the case, we combine (pool) the counts to get one overall proportion.

What is the Pooled Proportion?
The pooled proportion is
$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$
where $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$.
If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)
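As a small illustration of the pooling step and the rounding rule, the sketch below recovers the success counts from reported sample proportions and pools them. The function name `pooled_proportion` and the numbers are hypothetical, chosen only so that the products are not whole numbers.

```python
def pooled_proportion(n1, p1_hat, n2, p2_hat):
    """Pool two samples into one overall proportion.

    The success counts are recovered as n * p_hat and rounded to whole
    numbers first, as the slide instructs.
    """
    success1 = round(n1 * p1_hat)
    success2 = round(n2 * p2_hat)
    return (success1 + success2) / (n1 + n2)


# Hypothetical summary data (assumed for illustration only):
# 248 * 0.57 = 141.36 rounds to 141, and 319 * 0.70 = 223.3 rounds to 223.
p_pooled = pooled_proportion(n1=248, p1_hat=0.57, n2=319, p2_hat=0.70)
print(f"pooled proportion = {p_pooled:.4f}")
```

When the raw counts are available, it is simpler to add them directly and skip the rounding step.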

What is the Pooled Proportion? (cont.)
We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$

Compared to What?
We'll reject our null hypothesis if we see a large enough difference in the two proportions. How can we decide whether the difference we see is large? Just compare it with its standard deviation. Unlike previous hypothesis testing situations, the null hypothesis doesn't provide a standard deviation, so we'll use a standard error (here, pooled).

Two-Proportion z-Test
The conditions for the two-proportion z-test are the same as for the two-proportion z-interval. We are testing the hypothesis $H_0: p_1 = p_2$. Because we hypothesize that the proportions are equal, we pool them to find
$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$

Two-Proportion z-Test (cont.)
We use the pooled value to estimate the standard error:
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$
Now we find the test statistic:
$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$
When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.

Quick Review
Let's look at the following one more time:
How to find a one-proportion z-interval?
How to perform a one-proportion z-test?
What is the sampling distribution of the difference between two sample proportions?
How to perform a two-proportion z-test?
How to find a two-proportion z-interval?
Let's review.

How to Construct a One-Proportion z-Interval?
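The one-proportion z-interval from the quick review earlier in the module, $\hat{p} \pm z^* \times SE(\hat{p})$, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `one_prop_z_interval` and the counts are hypothetical, and the critical value is taken from the standard library's `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence=0.95):
    """Return (low, high) for a one-proportion z-interval.

    x is the number of successes and n the sample size. The conditions
    for the interval are assumed to have been checked already.
    """
    p_hat = x / n
    q_hat = 1 - p_hat
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * q_hat / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin


# Hypothetical data: 120 successes in a sample of 400.
low, high = one_prop_z_interval(x=120, n=400, confidence=0.95)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

Raising the confidence level increases z* and therefore widens the interval, while a larger n shrinks the standard error and narrows it.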

A Quick Review on How to Perform a One-Proportion z-Test?

How to Perform a One-Proportion z-Test? (cont.)

The Sampling Distribution of the Difference Between Two Sample Proportions
What does it mean? For large independent samples, the possible differences between two sample proportions have approximately a normal distribution with mean $p_1 - p_2$ and standard deviation as above.

How to Perform a Two-Proportion z-Test?

How to Perform a Two-Proportion z-Test? (cont.)

How to Find a Two-Proportion z-Interval?
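The two procedures named in these headings can be sketched together, following the formulas given earlier in the module: the test uses the pooled standard error, and the interval uses the unpooled one. The function names, the choice of a two-sided alternative, and the counts are assumptions made for illustration; the slides do not fix any of them.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided test of H0: p1 = p2. Returns (z, p_value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pool the counts because H0 says both groups share one proportion.
    p_pool = (x1 + x2) / (n1 + n2)
    q_pool = 1 - p_pool
    se_pool = math.sqrt(p_pool * q_pool / n1 + p_pool * q_pool / n2)
    z = ((p1_hat - p2_hat) - 0) / se_pool
    # Two-sided P-value from the standard Normal model (an assumption;
    # a one-sided alternative would use a single tail).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Hypothetical counts: 120 of 200 successes in group 1, 125 of 250 in group 2.
z, p_value = two_prop_z_test(120, 200, 125, 250)
ci_low, ci_high = two_prop_z_interval(120, 200, 125, 250, confidence=0.95)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({ci_low:.4f}, {ci_high:.4f})")
```

The two functions deliberately use different standard errors: pooling is justified only under the null hypothesis of equal proportions, which the interval does not assume.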

What is the Margin of Error for the Estimate of $p_1 - p_2$?
What does it mean? The margin of error equals half the length of the confidence interval. It represents the precision with which the difference between the sample proportions estimates the difference between the population proportions at the specified confidence level.

How to Find the Sample Size for Estimating $p_1 - p_2$?

What Can Go Wrong?
Don't Misstate What the Interval Means:
Don't suggest that the parameter varies.
Don't claim that other samples will agree with yours.
Don't be certain about the parameter.
Don't forget: It's the parameter (not the statistic).
Don't claim to know too much.
Do take responsibility (for the uncertainty).
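Returning to the sample-size question raised above: the slide's own formula did not survive in this transcript, so the sketch below is an assumption, not the module's stated method. It takes equal sample sizes, $n_1 = n_2 = n$, sets the margin of error $E = z^* \sqrt{(p_1 q_1 + p_2 q_2)/n}$, and solves for $n$. Using 0.5 as the guess for both proportions gives the most conservative (largest) answer. The function name `sample_size_two_props` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_props(margin, confidence=0.95, p1_guess=0.5, p2_guess=0.5):
    """Per-group sample size so the margin of error for p1 - p2 is at most `margin`.

    Assumes equal group sizes. The default guesses of 0.5 give the
    worst-case (largest) required sample size.
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = p1_guess * (1 - p1_guess) + p2_guess * (1 - p2_guess)
    # Solve E = z* * sqrt(variance_sum / n) for n, then round up.
    return math.ceil(variance_sum * (z_star / margin) ** 2)


# Hypothetical target: margin of error 0.03 at 95% confidence.
n_conservative = sample_size_two_props(margin=0.03, confidence=0.95)
n_with_guesses = sample_size_two_props(0.03, 0.95, p1_guess=0.6, p2_guess=0.5)
print(f"per-group n (no prior guess): {n_conservative}")
print(f"per-group n (guesses 0.6 and 0.5): {n_with_guesses}")
```

Halving the margin of error roughly quadruples the required sample size, because the margin enters the formula squared.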

What Can Go Wrong? (cont.)
Margin of Error Too Large to Be Useful:
We can't be exact, but how precise do we need to be?
One way to make the margin of error smaller is to reduce your level of confidence. (That may not be a useful solution.)
You need to think about your margin of error when you design your study.
To get a narrower interval without giving up confidence, you need to have less variability. You can do this with a larger sample.

What Can Go Wrong? (cont.)
Violations of Assumptions:
Watch out for biased sampling.
Think about independence.

What Can Go Wrong? (cont.)
Don't base your null hypothesis on what you see in the data. Think about the situation you are investigating and develop your null hypothesis appropriately.
Don't base your alternative hypothesis on the data, either. Again, you need to think about the situation.

What Can Go Wrong? (cont.)
Don't use two-sample proportion methods when the samples aren't independent. These methods give wrong answers when the independence assumption is violated.
Don't apply inference methods when there was no randomization. Our data must come from representative random samples or from a properly randomized experiment.
Don't interpret a significant difference in proportions causally. Be careful not to jump to conclusions about causality.

What have we learned?
We have learned to:
1. Perform large-sample inferences (hypothesis tests and confidence intervals) to compare two population proportions.
2. Describe the relationship between the sample sizes, confidence level, and margin of error for a confidence interval for the difference between two population proportions.
3. Determine the sample size required for a specified confidence level and margin of error for the estimate of the difference between two population proportions.

Credit
Some of these slides have been adapted/modified in part/whole from the slides of the following textbooks.
Weiss, Neil A., Introductory Statistics, 8th Edition
Bock, David E., Stats: Data and Models, 3rd Edition