Topic 15 - Two Sample Inference I: Comparing Two Populations, Comparing Two Means


STAT 511, Professor Bruce Craig

Background Reading
Devore: Section 9.1-9.

Comparing Two Populations
Research often involves the comparison of two or more samples from different populations.
Graphical summaries provide a visual comparison of populations:
  Boxplots
  Histograms
  Stem-and-Leaf Plots
Distributions may differ in terms of center or spread.

Comparing Two Pop Means
First consider the comparison of centers (means). Consider the following notation:

                  Population 1    Population 2
  Pop Size        $N_1$           $N_2$
  Mean            $\mu_1$         $\mu_2$
  Std. Dev.       $\sigma_1$      $\sigma_2$
  Sample Size     $m$             $n$
  Sample Mean     $\bar{x}$       $\bar{y}$
  Sample StDev    $s_1$           $s_2$

Interested in estimating $\mu_1 - \mu_2$.
Will use $\bar{x} - \bar{y}$ as the estimate.
Need the standard error of this estimate.

Need to construct the sampling distribution of $\bar{X} - \bar{Y}$.
Recall that if $X$ is Normal, then $\bar{X}$ is Normal with mean $\mu_1$ and standard deviation $\sigma_1/\sqrt{m}$.
And if $Y$ is Normal, then $\bar{Y}$ is Normal with mean $\mu_2$ and standard deviation $\sigma_2/\sqrt{n}$.
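The recalled fact about the sampling distribution of a sample mean can be checked by simulation. The sketch below is only an illustration: the population values (mean 10, standard deviation 2, sample size 25), the function name `simulate_sample_means`, and the seed are all assumptions chosen for the example, not values from the slides.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, m, reps, seed=0):
    """Draw `reps` samples of size m from Normal(mu, sigma); return the sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(m))
        for _ in range(reps)
    ]


# Hypothetical population values, chosen only for illustration.
mu1, sigma1, m = 10.0, 2.0, 25
means = simulate_sample_means(mu1, sigma1, m, reps=4000)

theoretical_sd = sigma1 / math.sqrt(m)   # sigma_1 / sqrt(m)
empirical_mean = statistics.fmean(means)
empirical_sd = statistics.stdev(means)

print(f"theory:     mean = {mu1:.3f}, sd = {theoretical_sd:.3f}")
print(f"simulation: mean = {empirical_mean:.3f}, sd = {empirical_sd:.3f}")
```

The simulated mean and standard deviation of the sample means should land close to $\mu_1$ and $\sigma_1/\sqrt{m}$, with the gap shrinking as `reps` grows.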

A linear combination of Normal RVs is also Normal, and this applies to the difference (page 45).
Given that the X's and Y's are independent, $\bar{X} - \bar{Y}$ has
  Mean: $\mu_1 - \mu_2$
  Standard deviation: $\sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$
Will use this sampling distribution for the construction of confidence intervals and hypothesis tests.

Difference of Two Means
$(1-\alpha)100\%$ Confidence Interval of $\mu_1 - \mu_2$:
$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$
What do we conclude if 0 is in the interval?

Hypothesis Testing
Similar procedure as the one population case.
  $H_0: \mu_1 - \mu_2 = \Delta_0$
  $H_a: \mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 < \Delta_0$, or $\mu_1 - \mu_2 \ne \Delta_0$
Will use $\bar{x} - \bar{y}$ for the test.
Will assess incompatibility using standardization:
$$z_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}}$$
Given $z_s$, the P-value is computed in the usual way.

Consumer Reports decided to compare several different brands of batteries. Suppose that 100 Duracell Alkaline AA batteries had an average lifetime of 4.1 hours and 100 Energizer Alkaline AA batteries had an average lifetime of 4.5 hours. If it were assumed that the two distributions were Normally distributed with standard deviations 1.20 and 1.35 hours respectively, is there evidence that the two types of batteries have a different average lifetime?
  $H_0: \mu_1 - \mu_2 = 0$
  $H_a: \mu_1 - \mu_2 \ne 0$
$$z = \frac{(4.1 - 4.5) - 0}{\sqrt{\frac{1.20^2}{100} + \frac{1.35^2}{100}}} = \frac{-0.4}{0.1806} = -2.21$$
The P-value is $P(|Z| > 2.21) = 2(0.0136) = 0.0272$. For $\alpha > 0.0272$, we would reject $H_0$ and conclude there is a difference. It appears the Energizer lasts longer.
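The battery comparison can be reproduced numerically. This is a minimal sketch, assuming standard deviations of 1.20 and 1.35 hours (the values that give the standard error of about 0.18 used in the slide's arithmetic); the function name `two_sample_z` and the 95% level for the interval are illustrative choices, not part of the slides.

```python
import math
from statistics import NormalDist


def two_sample_z(xbar, ybar, sigma1, sigma2, m, n, delta0=0.0, conf=0.95):
    """Two-sample z test (known sigmas): two-sided P-value and CI for mu1 - mu2."""
    std_normal = NormalDist()
    se = math.sqrt(sigma1**2 / m + sigma2**2 / n)    # SD of Xbar - Ybar
    z_s = ((xbar - ybar) - delta0) / se               # standardized difference
    p_value = 2.0 * std_normal.cdf(-abs(z_s))         # two-sided P-value
    z_crit = std_normal.inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    half_width = z_crit * se
    ci = ((xbar - ybar) - half_width, (xbar - ybar) + half_width)
    return se, z_s, p_value, ci


# Battery example; sigma1 = 1.20 and sigma2 = 1.35 are assumed readings.
se, z_s, p_value, ci = two_sample_z(4.1, 4.5, 1.20, 1.35, 100, 100)

print(f"standard error = {se:.4f}")
print(f"z_s = {z_s:.2f}, two-sided P-value = {p_value:.4f}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the code carries the unrounded $z_s$ into the P-value, its result can differ in the third decimal from a table lookup at a rounded $z$. If 0 falls outside the interval, the two-sided test at the matching $\alpha$ rejects $H_0$.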

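Type II Error / Power
The following formulae result for $\beta$:
  $H_a: \mu_1 - \mu_2 > \Delta_0$:    $\Phi\left(z_{\alpha} + \frac{\Delta_0 - \Delta}{\sigma}\right)$
  $H_a: \mu_1 - \mu_2 < \Delta_0$:    $1 - \Phi\left(-z_{\alpha} + \frac{\Delta_0 - \Delta}{\sigma}\right)$
  $H_a: \mu_1 - \mu_2 \ne \Delta_0$:  $\Phi\left(z_{\alpha/2} + \frac{\Delta_0 - \Delta}{\sigma}\right) - \Phi\left(-z_{\alpha/2} + \frac{\Delta_0 - \Delta}{\sigma}\right)$
where $\Phi(z_c) = P(Z < z_c)$, $\Delta = \mu_1 - \mu_2$ under $H_a$, and $\sigma$ is the standard error of $\bar{X} - \bar{Y}$.

Considering the previous comparison of battery brands, what is the power of detecting the difference when the true difference is at least 30 minutes (0.5 hours) and we use $\alpha = .01$?
Since $H_a: \mu_1 - \mu_2 \ne 0$, the formula ($\alpha = .01$, $z_{\alpha/2} = 2.576$) gives
$$\beta = \Phi\left(2.576 - \frac{0.5}{0.18}\right) - \Phi\left(-2.576 - \frac{0.5}{0.18}\right) = \Phi(-0.20) - \Phi(-5.35) \approx 0.4207$$
Thus the power is $1 - .4207 = .5793$, or 57.93%. Would the power increase or decrease if $\alpha = .05$?
NOTE: Formulas for sample size follow the same general rules. When $m = n$ the formula is given on page 366.

Difference of Two Means ($\sigma$'s unknown)
With large samples:
  Central Limit Thm: $\bar{X}$ is approximately Normal
  Estimates $s_1$ and $s_2$ can be used for the variances
$(1-\alpha)100\%$ Confidence Interval of $\mu_1 - \mu_2$:
$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$
Hypothesis test statistic:
$$z_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}}$$

Difference of Two Means ($\sigma$'s unknown)
Will use the t dist instead of the standard Normal.
Must estimate the standard error. Two possible estimates:
1. The Unpooled Method (Variances Unequal)
$$\hat{\sigma}_{\bar{X}-\bar{Y}} = \sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$
2. The Pooled Method (Variances Equal)
$$s_p^2 = \frac{(m-1)s_1^2 + (n-1)s_2^2}{m + n - 2}, \qquad \hat{\sigma}_{\bar{X}-\bar{Y}} = \sqrt{\frac{s_p^2}{m} + \frac{s_p^2}{n}}$$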
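The Type II error formula for the two-sided alternative given earlier in this section can be evaluated directly. The sketch below is an illustration under stated assumptions: it uses the rounded standard error 0.18 and a true difference of 0.5 hours from the battery example, and the function name `beta_two_sided` is hypothetical.

```python
from statistics import NormalDist


def beta_two_sided(delta, se, alpha, delta0=0.0):
    """Type II error probability for H_a: mu1 - mu2 != delta0 when the true difference is delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    shift = (delta0 - delta) / se                 # (Delta_0 - Delta) / sigma
    return std_normal.cdf(z_crit + shift) - std_normal.cdf(-z_crit + shift)


se = 0.18      # standard error of Xbar - Ybar, rounded as in the example
delta = 0.5    # true difference of 30 minutes, in hours

beta_01 = beta_two_sided(delta, se, alpha=0.01)
beta_05 = beta_two_sided(delta, se, alpha=0.05)

print(f"alpha = .01: beta = {beta_01:.4f}, power = {1 - beta_01:.4f}")
print(f"alpha = .05: beta = {beta_05:.4f}, power = {1 - beta_05:.4f}")
```

Running both levels side by side shows how the choice of $\alpha$ trades Type I error against power: a larger $\alpha$ shrinks the acceptance region and so lowers $\beta$.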

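Examples

                  Sample 1         Sample 2
  Sample Mean     $\bar{x} = 5$    $\bar{y} = 7$
  Sample StDev    $s_1 = 0.20$     $s_2 = 0.28$
  Sample Size     $n_1 = 20$       $n_2 = 20$

When $n_1 = n_2$ both methods give an identical answer.
  Unpooled: $\sqrt{.2^2/20 + .28^2/20} = .077$
  $s_{pooled}^2 = ((20-1)(.2^2) + (20-1)(.28^2))/(20 + 20 - 2) = .0592$
  Pooled: $\sqrt{.0592(1/20 + 1/20)} = .077$

                  Sample 1         Sample 2
  Sample Mean     $\bar{x} = 5$    $\bar{y} = 7$
  Sample StDev    $s_1 = 0.20$     $s_2 = 0.28$
  Sample Size     $n_1 = 10$       $n_2 = 20$

When $n_1 \ne n_2$, the choice is based on the assumption $\sigma_1 = \sigma_2$.
  Unpooled: $\sqrt{.2^2/10 + .28^2/20} = .089$
  $s_{pooled}^2 = ((10-1)(.2^2) + (20-1)(.28^2))/(10 + 20 - 2) = .066$
  Pooled: $\sqrt{.066(1/10 + 1/20)} = .099$

$\mu_1 - \mu_2$
Recall that if $Y$ is Normal but $\sigma$ is unknown, the CI is
$$\bar{y} \pm t_{\alpha/2}\, s/\sqrt{n}, \qquad df = n - 1$$
If $X$, $Y$ are Normal with $\sigma_1 = \sigma_2$, the CI is
$$(\bar{x} - \bar{y}) \pm t_{\alpha/2}\, s_p\sqrt{\frac{1}{m} + \frac{1}{n}}, \qquad df = m + n - 2$$

$\mu_1 - \mu_2$
If you do not assume $\sigma_1 = \sigma_2$, the CI is
$$(\bar{x} - \bar{y}) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$
where
$$df = \frac{(SE_1^2 + SE_2^2)^2}{SE_1^4/(m-1) + SE_2^4/(n-1)}, \qquad SE_1 = s_1/\sqrt{m}, \quad SE_2 = s_2/\sqrt{n}$$

Want to compare serum iron levels (µmol/l) for the population of healthy children and the population of children with cystic fibrosis. A sample of $m = 9$ healthy children resulted in $\bar{x} = 18.9$ and $s_1 = 5.9$, and a sample of $n = 13$ children with cystic fibrosis resulted in $\bar{y} = 11.9$ and $s_2 = 6.3$. What is the 90% CI for the difference in mean levels?
Will assume the variances are equal because $s_1$ and $s_2$ are not very different. The pooled variance is
$$s_{pooled}^2 = \frac{(9-1)(5.9^2) + (13-1)(6.3^2)}{9 + 13 - 2} = 37.74$$
The standard error is $\sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$. For $df = 9 + 13 - 2 = 20$ degrees of freedom, the confidence coefficient is 1.725. The confidence interval is
$$(18.9 - 11.9) \pm 1.725(2.66) = (2.4, 11.6)$$
Recall that since zero is not in this interval, the two means are statistically different at the $\alpha = .1$ significance level. Looking at the sample means, the children in the healthy population on average have higher serum iron levels than those with cystic fibrosis.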
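The pooled and unpooled standard errors and the pooled confidence interval can be computed from summary statistics alone. The following is a minimal sketch: the helper names are hypothetical, the serum iron inputs are read as $m = 9$, $s_1 = 5.9$, $n = 13$, $s_2 = 6.3$, and the critical value 1.725 (upper 5% point of $t$ with 20 df) is taken from a t table rather than computed.

```python
import math


def unpooled_se(s1, s2, m, n):
    """Standard error of xbar - ybar without assuming equal variances."""
    return math.sqrt(s1**2 / m + s2**2 / n)


def pooled_variance(s1, s2, m, n):
    """Pooled variance: sample variances weighted by their degrees of freedom."""
    return ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)


def pooled_se(s1, s2, m, n):
    """Standard error of xbar - ybar assuming sigma_1 = sigma_2."""
    return math.sqrt(pooled_variance(s1, s2, m, n) * (1 / m + 1 / n))


# Equal sample sizes: the two standard errors coincide.
print(unpooled_se(0.20, 0.28, 20, 20), pooled_se(0.20, 0.28, 20, 20))
# Unequal sample sizes: they differ.
print(unpooled_se(0.20, 0.28, 10, 20), pooled_se(0.20, 0.28, 10, 20))

# Serum iron example: 90% pooled CI for mu1 - mu2.
m, n = 9, 13
xbar, ybar = 18.9, 11.9
s1, s2 = 5.9, 6.3
t_crit = 1.725   # t_{0.05, 20}, read from a t table (not computed here)

sp2 = pooled_variance(s1, s2, m, n)
se = pooled_se(s1, s2, m, n)
ci = ((xbar - ybar) - t_crit * se, (xbar - ybar) + t_crit * se)
print(f"pooled variance = {sp2:.2f}, SE = {se:.2f}, 90% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```

Taking the critical value as an input keeps the sketch free of third-party libraries; a statistics package with a t quantile function could supply it instead.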

Hypothesis Testing
Similar procedure as the one population case.
  $H_0: \mu_1 - \mu_2 = \Delta_0$
  $H_a: \mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 < \Delta_0$, or $\mu_1 - \mu_2 \ne \Delta_0$
$$t_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$$
Compare $t_s$ to the t dist with $m + n - 2$ df.
If no pooling, must alter the std error and df.

Assumptions
Both CIs and hypothesis tests assume:
1. Samples are independent
   Very important!!
   Often accomplished through random sampling
   Beware of subsampling/hierarchical sampling
2. Populations are Normally distributed
   Less critical
   Central Limit Thm allows for approximation
3. Variances are equal
   Statistical tests are available to compare variances
   Only necessary when using the pooled SE
   If $n_1 = n_2$, works well if $1/2 < s_1/s_2 < 2$

A new tapeworm medicine has been developed. To test its effectiveness, an experiment was conducted to compare the mean number of tapeworms in sheep treated with this medicine against the mean number in those that were not (control). A sample of 10 infected sheep were randomly divided into two groups. After four months, the sheep were slaughtered and the following counts were recorded. Is this drug effective at the .05 level?

  Treated      13  15  20  15  17
  Untreated    17  21  19  20  22

The sample statistics are $\bar{x} = 16$, $s_1^2 = 7$, $\bar{y} = 19.8$, and $s_2^2 = 3.7$.
In order to use the t distribution, we must assume that the two populations are Normal, that the observations within each sample are independent, and that the two samples are independent of each other. We may also decide to assume the variances are the same. With such a small data set, it is difficult to test for Normality, so we'll assume it is true. As for independence, we'll assume the sheep were kept in separate locations so the counts are independent. Since the sample variances don't differ by more than a factor of 2, we pool the variances.
  $H_0: \mu_1 - \mu_2 = 0$ (Treated - Control)
  $H_A: \mu_1 - \mu_2 < 0$
The pooled variance and test statistic are
$$s_{pooled}^2 = \frac{7 + 3.7}{2} = 5.35, \qquad t_s = \frac{16 - 19.8}{\sqrt{5.35\left(\frac{1}{5} + \frac{1}{5}\right)}} = -2.60$$
Since the df = 8, the P-value is between .01 and .025 (1 sided). Therefore, we reject the Null hypothesis. There is evidence that this drug reduces the number of tapeworms.
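The pooled t statistic for the tapeworm example can be computed from the raw counts. This sketch rests on an assumed reading of the data: treated counts 13, 15, 20, 15, 17 and untreated counts 17, 21, 19, 20, 22, which reproduce the reported means and variances. The function name `pooled_t` is hypothetical, and the critical value 1.860 (upper 5% point of $t$ with 8 df) is taken from a t table.

```python
import math
import statistics

# Assumed reading of the counts; it matches the reported summary statistics.
treated = [13, 15, 20, 15, 17]
untreated = [17, 21, 19, 20, 22]


def pooled_t(x, y, delta0=0.0):
    """Pooled two-sample t statistic, its degrees of freedom, and the pooled variance."""
    m, n = len(x), len(y)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)   # sample variances
    sp2 = ((m - 1) * v1 + (n - 1) * v2) / (m + n - 2)
    t_s = ((xbar - ybar) - delta0) / math.sqrt(sp2 * (1 / m + 1 / n))
    return t_s, m + n - 2, sp2


# Rule of thumb from the assumptions list: compare the spread of the two samples.
sd_ratio = statistics.stdev(treated) / statistics.stdev(untreated)

t_s, df, sp2 = pooled_t(treated, untreated)

T_CRIT_05_DF8 = 1.860              # t_{0.05, 8} from a t table
reject_h0 = t_s < -T_CRIT_05_DF8   # lower-tailed test, H_A: mu1 - mu2 < 0

print(f"s1/s2 = {sd_ratio:.2f}")
print(f"pooled variance = {sp2:.2f}, t_s = {t_s:.2f}, df = {df}")
print(f"reject H0 at the .05 level: {reject_h0}")
```

The decision uses a table critical value rather than an exact P-value, mirroring how the slide brackets the P-value between two table entries.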
NOTE: If we didn't want to assume the variances are equal, the test statistic is the same (because the two sample sizes are equal) but the degrees of freedom are now 7 (use the formula). The conclusions do not change.
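The degrees-of-freedom formula for the unpooled method can be applied to the tapeworm summary statistics ($s_1^2 = 7$, $s_2^2 = 3.7$, five sheep per group). This is a sketch only: the function names are hypothetical, and the critical value 1.895 (upper 5% point of $t$ with 7 df) is read from a t table.

```python
import math


def welch_df(var1, var2, m, n):
    """Approximate df for the unpooled method: (SE1^2 + SE2^2)^2 / (SE1^4/(m-1) + SE2^4/(n-1))."""
    se1_sq, se2_sq = var1 / m, var2 / n
    return (se1_sq + se2_sq) ** 2 / (se1_sq**2 / (m - 1) + se2_sq**2 / (n - 1))


def unpooled_t(xbar, ybar, var1, var2, m, n, delta0=0.0):
    """t statistic using the unpooled standard error."""
    return ((xbar - ybar) - delta0) / math.sqrt(var1 / m + var2 / n)


xbar, ybar = 16.0, 19.8
var1, var2 = 7.0, 3.7
m = n = 5

df_exact = welch_df(var1, var2, m, n)
df_used = math.floor(df_exact)     # round down to be conservative
t_s = unpooled_t(xbar, ybar, var1, var2, m, n)

T_CRIT_05_DF7 = 1.895              # t_{0.05, 7} from a t table
reject_h0 = t_s < -T_CRIT_05_DF7   # lower-tailed test

print(f"df = {df_exact:.2f} (use {df_used}), t_s = {t_s:.2f}, reject H0: {reject_h0}")
```

With equal sample sizes the unpooled standard error equals the pooled one, so only the reference distribution changes between the two methods.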