To make comparisons for two populations, consider whether the samples are independent or dependent.


Sociology 54
Testing for differences between two sample means

Conceptually, comparing means from two different samples is the same as what we've done in one-sample tests, except that now the hypotheses focus on the parameters of two populations.

To make comparisons for two populations, consider whether the samples are independent or dependent.

Independent samples: Selection of members of one sample has no influence on the selection of members of the other sample.

Dependent samples: Selection of members for one sample determines the characteristics of the members for the other sample.

When we compare two groups, we still state two competing hypotheses. With two samples, we are now dealing with a sampling distribution of differences between sample means. The outcome of interest is the difference between two sample statistics (e.g. the difference in mean hours spent on housework between men and women). Just as the sampling distribution of sample means is approximately normal for large samples, so is the sampling distribution of differences between sample means (a corollary of the Central Limit Theorem).

Conducting a hypothesis test comparing two samples:

Calculating a test statistic to determine whether the difference between two sample means is real or simply due to chance variation is conceptually the same as what we've already reviewed for a single sample. To compare two groups on a quantitative characteristic, we make inferences about their population means $\mu_1$ and $\mu_2$ and the difference between them.

Test statistic = (observed difference - expected difference) / amount of variability in sampling distribution of differences

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}} $$

Note that identification of which group is called 1 and which is called 2 is arbitrary.

What is the standard error of the sampling distribution of the estimated difference between the two sample means? That is, the degree to which the difference would vary if we repeatedly took samples of size $n_1$ and $n_2$.

Comparing sample means (interval measures) of two independent LARGE samples with unequal variances (>00 for each sample)

$$ \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

Where
$s_1$ = standard deviation of sample 1
$s_2$ = standard deviation of sample 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2

When two estimates are formed from independent samples, the sampling distribution of their difference has variance equal to the sum of the variances of the sampling distributions of the separate estimates. The corresponding test statistic is:

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}} $$

Recall that once you've calculated the test statistic for your two samples, you can find the appropriate p-value that corresponds to that statistic. That is, what is the likelihood that you would draw two samples with a difference at least as large as that observed, if in fact there really were no difference between the two groups? If the probability is small, we may reject the null hypothesis and tentatively accept the research hypothesis.

Guidelines to test for statistically significant differences between means of two groups:
1. Choose a significance (α) level.
2. State hypotheses.
3. Calculate the appropriate test statistic.
4. Determine the p-value associated with the test statistic.
5. Draw conclusions.
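The large-sample formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original handout: the function name `two_sample_z` and the summary statistics passed to it are hypothetical, and the two-tailed p-value is taken from the standard normal distribution via `math.erfc`.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Large-sample z test for the difference between two independent means."""
    # Standard error: square root of the sum of the two sampling variances.
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # (observed difference - expected difference) / standard error
    z = ((mean1 - mean2) - null_diff) / se
    # Two-tailed p-value under the standard normal distribution.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_tailed

# Hypothetical summary statistics, chosen only for illustration.
se, z, p = two_sample_z(mean1=15.0, sd1=10.0, n1=400,
                        mean2=13.0, sd2=8.0, n2=400)
print(f"SE = {se:.4f}, z = {z:.3f}, two-tailed p = {p:.4f}")
```

If the printed p-value falls below the chosen α, the null hypothesis of no difference would be rejected, following steps 1 to 5 above.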

Two independent LARGE samples

A sample of 000 students drawn from a public university finds that students work . hours per week, while a sample of 900 students drawn from a private university finds that their students work an average of 9. hours per week. The sample standard deviations are 0.8 hours for the public university and 9.6 hours for the private university. Is there a significant difference in the number of hours worked in public versus private universities?

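Comparing sample means for two independent SMALL samples (interval measures)

You must make the assumption that the population variances are equal to use this formula (Kurtz p. 83-86).

First, obtain a pooled estimate of the standard deviation for the two groups:

$$ \hat{\sigma} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

Although we assume that the population variances are equal, the two sample variances will generally differ. To obtain the best possible estimate of the population variance, we take a weighted average of the two sample variances rather than arbitrarily choose one of them as the estimate.

Then obtain the estimated standard error of the sampling distribution of differences using the pooled estimate of the standard deviation:

$$ \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$

This is equivalent to:

$$ \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{n_1 + n_2}{n_1 n_2}} $$

The appropriate test statistic is:

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}} $$

with degrees of freedom = $n_1 + n_2 - 2$.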
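As a rough sketch of the pooled-variance calculation, the steps can be written out in Python. The function name `pooled_t` and the input values are hypothetical and are not the salary data from the example in these notes. The sketch stops at the t statistic and degrees of freedom, assuming the p-value is then looked up in a t table or obtained from statistical software.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) t test for two independent small samples."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference, using the pooled estimate.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Under the null hypothesis the expected difference is zero.
    t = (mean1 - mean2) / se
    return math.sqrt(pooled_var), se, t, df

# Hypothetical summary statistics, chosen only for illustration.
pooled_sd, se, t, df = pooled_t(mean1=20.0, sd1=4.0, n1=10,
                                mean2=17.0, sd2=3.0, n2=8)
print(f"pooled SD = {pooled_sd:.4f}, SE = {se:.4f}, t = {t:.3f}, df = {df}")
```

The absolute value of t is then compared with the critical value for `df` degrees of freedom at the chosen significance level.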

Two independent small samples (assume population variances are equal)

The following two samples indicate salaries for male and female professors (in 1969!). Could these differences in salary arise just by chance?

Sample of male professors: N = 0, mean = 6 (salary figure in 1000's), s = 3.5
Sample of female professors: N = 5, mean = , s = .83

Comparing proportions for two independent large samples (Kurtz p.9-95)

Let's say that you have two independent samples with dichotomous measures. Is that dichotomous variable distributed similarly in the two populations? We are now comparing samples with a qualitative response variable. This test requires that both samples have 30 or more members, and the resulting statistic is a z score. The test statistic is the ratio of the difference between the two sample proportions to the standard error of that difference. The test statistic is estimated with the following equation:

Test statistic:

$$ z = \frac{p_1 - p_2}{\sqrt{p_c (1 - p_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$

where

$$ p_c = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$

$p_c$ is a weighted average of the two proportions, which adjusts for the two samples being of unequal size.
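A short Python sketch of the two-proportion test may help make the pooled proportion concrete. It assumes the standard pooled form of the standard error; the function name `two_proportion_z` and the input proportions and sample sizes are hypothetical, not the figures from the example in these notes.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z test for the difference between two independent sample proportions."""
    # Pooled proportion: average of p1 and p2 weighted by sample size.
    p_c = (n1 * p1 + n2 * p2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis p1 = p2.
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value under the standard normal distribution.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return p_c, se, z, p_two_tailed

# Hypothetical proportions and sample sizes, chosen only for illustration.
p_c, se, z, p = two_proportion_z(p1=0.60, n1=200, p2=0.50, n2=300)
print(f"pooled p = {p_c:.3f}, SE = {se:.4f}, z = {z:.3f}, p = {p:.4f}")
```

Because the pooled proportion weights each sample by its size, the larger sample pulls `p_c` toward its own proportion.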

Two independent large samples (comparing proportions)

A study found the following results: Of 3 male students, 53% worked more than 8 hours per week. Of 79 female students, 48% worked more than 8 hours per week. Is there a significant difference between male and female students in the number of hours worked?

SPSS One-Sample t Test

A newly created random number generator is supposed to generate a sequence of digits such that each digit is equally likely to be any of 0, 1, 2, ..., 9. The first 0 numbers generated are:

7 7 3 0 5 6 3 6 0 9 9 4 0 8 5 0 6

As a check of whether the process works correctly, test whether the mean differs significantly from the value expected. Report the p-value and interpret. You should also do this by hand and confirm that you get the same results as those reported by SPSS.

SPSS Commands
Analyze - Compare Means - One-Sample T Test
Select Test Variable (change) and Test Value ( )
Okay

One-Sample Statistics
            N    Mean    Std. Deviation    Std. Error Mean
NUMBERS     0    4.5     3.08              .69

One-Sample Test (Test Value = 4.5)
            t        df    Sig. (2-tailed)    Mean Difference    95% Confidence Interval of the Difference
                                                                 Lower     Upper
NUMBERS     -.508    9     .67                -.35               -.79      .09
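The "by hand" check can be sketched in Python using only the standard library. The digit list below is hypothetical and is not the sequence from the exercise. The test value 4.5 is the expected mean when each digit from 0 to 9 is equally likely. The function name `one_sample_t` is an assumption made for this illustration.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """One-sample t statistic, computed step by step."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 in the denominator)
    se = sd / math.sqrt(n)             # standard error of the mean
    t = (mean - test_value) / se
    return n, mean, sd, se, t, n - 1   # last value is degrees of freedom

# Hypothetical digits, chosen only for illustration.
digits = [3, 8, 0, 5, 9, 4, 6, 7, 5, 3]
n, mean, sd, se, t, df = one_sample_t(digits, test_value=4.5)
print(f"N = {n}, mean = {mean}, SD = {sd:.3f}, SE = {se:.3f}, "
      f"t = {t:.3f}, df = {df}")
```

These quantities correspond to the N, Mean, Std. Deviation, Std. Error Mean, t, and df columns in the SPSS output.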

Two-Sample t Test

Using the GSS98.SAV file, determine whether there is a significant difference between men and women in their response to the following question:

ABANY: "Please tell me whether or not you think it should be possible for a pregnant woman to obtain a legal abortion if... the woman wants it for any reason?"
SEX: Respondent's Sex

Remember to use a syntax file: set printback on.

Conduct test for differences:
Analyze - Compare Means - Independent Samples T test
Select test (ABANY) and group variables (SEX)
Okay

Another Two-Sample t-test

Is there a significant difference by age (people older than and younger than 40) in the response to the question above? Recode age into two categories (ages 0-39 and 40 and older) and conduct a t-test.

AGE: Age of Respondent

ADDITIONAL NOTES FOR SELF STUDY AND FUTURE REFERENCE

Comparing two dependent samples with interval measures (Kurtz p. 95-98)

Dependent samples occur when each observation in sample 1 matches with an observation from sample 2 (often called matched-pairs data). This most commonly occurs when each sample has the same subjects, which is an example of repeated measurement data.

Student's T test for Paired Comparisons

A special case of the one-group t test using difference scores (e.g. difference in SAT scores from time 1 to time 2) from each pair of dependent subjects. For matched-pairs data, the difference between the means of the two groups equals the mean of the difference scores.

$$ t = \frac{\bar{\delta}}{s_\delta / \sqrt{N}} $$

$\bar{\delta}$ = mean difference score
$s_\delta$ = standard deviation of the difference scores
$N$ = sample size (number of pairs)

Note: This formula is equivalent to that presented on page 96 of Kurtz. See also notes on the example of SAT scores and the effect of a prep course discussed in an earlier handout.

Some advantages to analysis with dependent samples:
1. Known sources of potential bias are controlled. Using the same subjects in each sample, for example, keeps many possible confounding factors fixed.
2. The standard error of the difference may be smaller with dependent samples.

Assumptions:
1. Random and independent sampling.
2. Normality assumption: the normality assumption applies to the population of difference scores. The dependent groups t test is generally considered robust against violation of this assumption if N > 30.
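The paired-comparison formula can be illustrated with a small Python sketch that treats the test as a one-sample t test on difference scores. The before and after scores are hypothetical and are not the SAT data set used in the course. The function name `paired_t` is an assumption made for this illustration.

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic computed from difference scores (after - before)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)   # mean difference score
    sd_diff = statistics.stdev(diffs)    # SD of the difference scores
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t, n - 1  # last value is degrees of freedom

# Hypothetical matched scores for six subjects, chosen only for illustration.
before = [480, 510, 530, 495, 560, 505]
after = [500, 505, 560, 520, 555, 540]
mean_diff, sd_diff, t, df = paired_t(before, after)

# The mean of the difference scores equals the difference between the means.
print(mean_diff, statistics.mean(after) - statistics.mean(before))
print(f"t = {t:.3f}, df = {df}")
```

Reversing the order of subtraction changes only the sign of the mean difference and of t, not their magnitude.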

SPSS One-Sample t Test

Use ch7.sav in soc 54 work directory. Difference scores for students who've taken an SAT prep course.

Analyze - Compare Means - One-Sample T Test
Select Test Variable (change) and Test Value (0)
Okay

One-Sample Statistics
           N    Mean      Std. Deviation    Std. Error Mean
CHANGE     0    .0000     35.839            .3333

One-Sample Test (Test Value = 0)
           t       df    Sig. (2-tailed)    Mean Difference    95% Confidence Interval of the Difference
                                                               Lower       Upper
CHANGE     .059    9     .37                .0000              -3.6378     37.6378

You could also compare these two groups using a paired sample t-test (these are dependent samples).

Paired Sample t-test

Using the dataset containing 0 observations on pre and post SAT scores, use the paired sample t-test to determine whether there is a significant difference between the two scores.

Commands:
Analyze - Compare Means - Paired Sample t-test
Select paired variables (variable 1 and variable 2)
Okay

Syntax:
T-TEST PAIRS= origscre WITH newscore (PAIRED)
  /CRITERIA=CIN(.95)
  /MISSING=ANALYSIS.

Paired Samples Test
                               Paired Differences
                               Mean      Std. Deviation    Std. Error Mean    95% Confidence Interval of the Difference    t        df    Sig. (2-tailed)
                                                                              Lower        Upper
Pair ORIGSCRE - NEWSCORE       -.0000    35.8395           .33333             -37.6378     3.6378                          -.059    9     .37

Note that these two tests yield exactly the same result, apart from the sign of the difference, which reflects the order of subtraction.