Two-sample Categorical data: Testing

Similar documents
Two-sample Categorical data: Testing

Two-sample inference: Continuous data

Two-sample inference: Continuous data

Two-sample inference: Continuous Data

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

Chapter 18. Sampling Distribution Models. Copyright 2010, 2007, 2004 Pearson Education, Inc.

One-sample categorical data: approximate inference

Statistics: revision

Correlation. Patrick Breheny. November 15. Descriptive statistics Inference Summary

Multiple samples: Modeling and ANOVA

Descriptive statistics

Chapter 15. Probability Rules! Copyright 2012, 2008, 2005 Pearson Education, Inc.

An introduction to biostatistics: part 1

Probability and Probability Distributions. Dr. Mohammed Alahmed

STA Why Sampling? Module 6 The Sampling Distributions. Module Objectives

Probability Year 9. Terminology

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Sampling Distribution Models. Chapter 17

First we look at some terms to be used in this section.

Poisson regression: Further topics

THE SAMPLING DISTRIBUTION OF THE MEAN

Probability Year 10. Terminology

HYPOTHESIS TESTING. Hypothesis Testing

ACMS Statistics for Life Sciences. Chapter 13: Sampling Distributions

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

2 Chapter 2: Conditional Probability

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

Probability. Patrick Breheny. September 1. Events and probability Working with probabilities Additional theorems/rules Summary

Do students sleep the recommended 8 hours a night on average?

Mathematical Notation Math Introduction to Applied Statistics

The Central Limit Theorem

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Data Analysis and Statistical Methods Statistics 651

1 A simple example. A short introduction to Bayesian statistics, part I Math 217 Probability and Statistics Prof. D.

Chapter 26: Comparing Counts (Chi Square)

Unit 9: Inferences for Proportions and Count Data

Probability II. Patrick Breheny. February 16. Basic rules (cont d) Advanced rules Summary

The t-test Pivots Summary. Pivots and t-tests. Patrick Breheny. October 15. Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/18

Unit 9: Inferences for Proportions and Count Data

Contingency Tables Part One 1

STAT 135 Lab 11 Tests for Categorical Data (Fisher s Exact test, χ 2 tests for Homogeneity and Independence) and Linear Regression

Ling 289 Contingency Table Statistics

Probability and Statistics

Grades 7 & 8, Math Circles 24/25/26 October, Probability

Categorical Data Analysis 1

Study and research skills 2009 Duncan Golicher. and Adrian Newton. Last draft 11/24/2008

Nicole Dalzell. July 3, 2014

Hypothesis testing. Data to decisions

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Business Statistics. Lecture 5: Confidence Intervals

Goodness of Fit Tests

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

Probability and Independence Terri Bittner, Ph.D.

Statistics 3858 : Contingency Tables

The Central Limit Theorem

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Topic -2. Probability. Larson & Farber, Elementary Statistics: Picturing the World, 3e 1

7.1 What is it and why should we care?

Statistics in medicine

STAT 705: Analysis of Contingency Tables

Math 243 Section 3.1 Introduction to Probability Lab

Statistic: a that can be from a sample without making use of any unknown. In practice we will use to establish unknown parameters.

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

One-Way Tables and Goodness of Fit

I - Probability. What is Probability? the chance of an event occuring. 1classical probability. 2empirical probability. 3subjective probability

Categorical Data Analysis. The data are often just counts of how many things each category has.

Chapter 18. Sampling Distribution Models /51

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras. Lecture 11 t- Tests

MATH MW Elementary Probability Course Notes Part I: Models and Counting

Statistics Primer. A Brief Overview of Basic Statistical and Probability Principles. Essential Statistics for Data Analysts Using Excel

STAT Chapter 13: Categorical Data. Recall we have studied binomial data, in which each trial falls into one of 2 categories (success/failure).

Chapter 2: Describing Contingency Tables - I

1 Normal Distribution.

Review of One-way Tables and SAS

Many natural processes can be fit to a Poisson distribution

Probability and Discrete Distributions

Fin285a:Computer Simulations and Risk Assessment Section 2.3.2:Hypothesis testing, and Confidence Intervals

Testing Independence

Example. If 4 tickets are drawn with replacement from ,

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

Inferential statistics

Chapter 6: Probability The Study of Randomness

Multiple Regression Analysis

MAT Mathematics in Today's World

Data Analysis and Statistical Methods Statistics 651

( )( b + c) = ab + ac, but it can also be ( )( a) = ba + ca. Let s use the distributive property on a couple of

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.

Chapter 3. Estimation of p. 3.1 Point and Interval Estimates of p

16.400/453J Human Factors Engineering. Design of Experiments II

Chi Square Analysis M&M Statistics. Name Period Date

Discrete Probability and State Estimation

CONTINUOUS RANDOM VARIABLES

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

DISTRIBUTIONS USED IN STATISTICAL WORK

Statistical Experiment A statistical experiment is any process by which measurements are obtained.

Transcription:

Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28

Separate vs. paired samples Despite the fact that paired samples usually offer a more powerful design, separate samples are much more common, for a variety of reasons: It is often impossible to pair samples: for example, in the Salk vaccine trial, a child must either be vaccinated or unvaccinated there is no way to receive both Even if theoretically possible, it is often impractical: for example, in studies of chronic diseases, we have to wait years to observe whether or not a person suffers a heart attack or develops breast cancer Furthermore, in observational studies, we have no control over which subjects end up in which group Patrick Breheny Introduction to Biostatistics (171:161) 2/28

Lister s experiment In the 1860s, Joseph Lister conducted a landmark experiment to investigate the benefits of sterile technique in surgery At the time, it was not customary for surgeons to wash their hands or instruments prior to operating on patients Lister developed a new operating procedure in which surgeons were required to wash their hands, wear clean gloves, and disinfect surgical instruments with carbolic acid This new procedure was compared to the old, non-sterile procedure and Lister recorded the number of patients in each group that lived or died Patrick Breheny Introduction to Biostatistics (171:161) 3/28

Contingency tables When the outcome of a two-sample study is categorical, the results can be summarized in a 2x2 table that lists the number of subjects in each sample that fell into each category Putting Lister s results in this form, we have: Survived Yes No Sterile 34 6 Control 19 16 This kind of table is called a contingency table, or sometimes a cross-classification table Patrick Breheny Introduction to Biostatistics (171:161) 4/28

Contingency tables (cont d) Customarily, the rows of a contingency table represent the treatment/exposure groups, while the columns represent the outcomes All rows and columns must represent mutually exclusive categories; thus, each subject is located in one and only one cell of the table Patrick Breheny Introduction to Biostatistics (171:161) 5/28

Lister s results Introduction On the surface, Lister s experiment seems encouraging: 46% of patients who received conventional treatment died, compared with only 15% of the patients who were operated on using the new sterile technique However, if we calculate (separate, exact) confidence intervals for the proportion who die from each type of surgery, they overlap: Sterile: (6%,30%) Control: (29%,63%) Patrick Breheny Introduction to Biostatistics (171:161) 6/28

Differences between groups It s nice to know about the actual separate proportions that died for each type of surgery, but what we really want to know in this experiment is whether there is a difference between the two treatments So, rather than ask two separate questions, each using part of the data and addressing part of the question of interest, a more powerful approach is to focus all of the analysis on the one primary question of interest Patrick Breheny Introduction to Biostatistics (171:161) 7/28

Setting up a hypothesis test Let s think about what a p-value is: the probability of seeing results as extreme or more extreme than what we saw, if the null hypothesis were true The null hypothesis here is that sterilization has no impact on the probability that a patient lives or dies i.e., that it doesn t make any difference which type of surgery the patients received If that were true, then it is as if we only really had one group, and everyone lived or died with equal probability regardless of the sterility of their surgery Patrick Breheny Introduction to Biostatistics (171:161) 8/28

Setting up a hypothesis test Thus, consider putting all the patients outcomes into a single urn without considering the type of surgery they received (i.e., this urn would contain 22 balls with died written on them and 53 balls with survived written on them) Our sample of 40 patients who received sterile surgery would be like randomly drawing 40 balls out of the urn How often would we see something as extreme or more extreme than only 6 out of 40 patients dying? Patrick Breheny Introduction to Biostatistics (171:161) 9/28

Performing the experiment Making these draws 10,000 times, I got the following results: 2000 1500 Frequency 1000 500 0 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Patrick Breheny Introduction to Biostatistics (171:161) 10/28

Calculating a p-value from the experiment When I drew 40 balls from the combined urn, I only drew 6 balls 27 times (out of 10,000) The only results as extreme or more extreme than 6 were: Number of died balls: 4 5 6 18 19 Number of times drawn: 1 4 27 11 1 So I obtained a result as extreme or more extreme than the observed value a total of 44 times out of 10,000 From this experiment, then, I would calculate a p-value of 44/10000 =.0044 Patrick Breheny Introduction to Biostatistics (171:161) 11/28

Fisher s exact test This approach to testing association in a 2x2 table is called Fisher s exact test, after R.A. Fisher, probably the most influential statistician of the 20th century Although we did the experiment using a simulation, Fisher worked out the exact probabilities that would result from this experiment For Lister s data, the exact probability (in lab, we will use a computer to get this result) is 0.0050; this is generally in agreement with the simulation, although by chance we ended up with slightly fewer extreme tables in this particular simulation Patrick Breheny Introduction to Biostatistics (171:161) 12/28

Introduction The χ 2 curve As you might imagine, Fisher s exact test is too complicated to realistically carry out by hand in most situations Looking at the bar plot of the simulation results, however, it would seem possible to obtain an approximate answer using the normal distribution Indeed, even before Fisher, another famous statistician (Karl Pearson) invented an approximate test for categorical data Pearson s invention, the χ 2 -test, is one of the earliest (1900) and most widely used statistical tests Patrick Breheny Introduction to Biostatistics (171:161) 13/28

The χ 2 curve Introduction The χ 2 curve involves a curve we haven t seen yet called the χ 2 curve Before we get to the test, let s take a quick look at where the χ 2 curve comes from Suppose that we generated a lot of random observations from the normal distribution; the histogram of these observations would look like the normal curve Now suppose that we took those observations, squared them, and then made a histogram Patrick Breheny Introduction to Biostatistics (171:161) 14/28

The χ 2 curve The χ 2 curve (with 1 degree of freedom) 6 5 4 Density 3 2 1 0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 z 2 Patrick Breheny Introduction to Biostatistics (171:161) 15/28

The χ 2 curve The χ 2 curve and hypothesis testing What are the implications for hypothesis testing? Well, suppose we were performing a z-test: normally, we would calculate a test statistic z and find the area under the normal curve outside ±z But what if we squared z instead? z 2 = ( x µ 0) 2 SE 2 Patrick Breheny Introduction to Biostatistics (171:161) 16/28

The χ 2 curve The χ 2 curve and hypothesis testing (cont d) Now, the area under the normal curve above +z will lie above +z 2 and so will the area under the normal curve below z The area to the right of z 2 now contains both tails of the original normal curve To summarize: regardless of whether x is far above µ 0 or far below µ 0, z 2 will be large and we will naturally get a two-sided test So, an alternative way to calculate p-values for z-tests is to find the area to the right of z 2 on the χ 2 curve and don t double it Patrick Breheny Introduction to Biostatistics (171:161) 17/28

Graphical representation The χ 2 curve Density 0.0 0.1 0.2 0.3 0.4 0.5 Density 0.0 0.1 0.2 0.3 0.4 0.5 4 2 0 2 4 z 0 1 2 3 4 5 6 z 2 Patrick Breheny Introduction to Biostatistics (171:161) 18/28

The motivation behind a χ 2 -test The χ 2 curve So essentially, the χ 2 test is simply the squared version of the z-test The fact that this test statistic is naturally two-sided makes it easy to compare the observed number of times each category occurs with the number of times it would be expected to occur under the null hypothesis, and then sum up these results over each of the cells in the table The more disagreement there is between observed and expected results, the further we will be in the right-hand tail of the χ 2 curve, and the lower our p-value will be Patrick Breheny Introduction to Biostatistics (171:161) 19/28

The χ 2 -statistic Introduction The χ 2 curve Specifically, this is done by calculating the χ 2 -statistic: letting the subscript i denote the possible categories, χ 2 = i (O i E i ) 2 where O i and E i are the observed and expected number of times category i occurs/should occur The numerator should look familiar: it s the difference between the observed value of something and its expected value under the null The denominator, on the other hand, looks weird However, it just so happens that when you re counting occurrences of a category, SD E, so SD 2 is approximately equal to the expected value E i Patrick Breheny Introduction to Biostatistics (171:161) 20/28

The χ 2 -test procedure Introduction The χ 2 curve The procedure for performing a χ 2 -test is as follows: #1 Create a table of expected counts based on the null hypothesis #2 Calculate the χ 2 -statistic: χ 2 = i (O i E i ) 2 E i #3 Determine the area to the right of χ 2 on the χ 2 curve Patrick Breheny Introduction to Biostatistics (171:161) 21/28

The χ 2 -test: Lister s experiment The χ 2 curve Let s use the χ 2 -test to determine how unlikely Lister s results would have been if sterile technique had no impact on fatal complications from surgery #1 Create a table of expected counts based on the null hypothesis In the experiment, ignoring group affiliation, 22 out of 75 patients died Thus, under the null, we would expect 22/75 = 29.3% of the patients in each group to die: Survived Yes No Sterile 28.3 11.7 Control 24.7 10.3 Patrick Breheny Introduction to Biostatistics (171:161) 22/28

The χ 2 curve The χ 2 -test: Lister s experiment (cont d) #2 Calculate the χ 2 -statistic: χ 2 (34 28.3)2 (6 11.7)2 = + 28.3 11.7 (19 24.7)2 (16 10.3)2 + + 24.7 10.3 = 8.50 #3 The area to the right of 8.50 is 1.996 =.004 There is only a 0.4% probability of seeing such a large association by chance alone; this is compelling evidence that sterile surgical technique saves lives Patrick Breheny Introduction to Biostatistics (171:161) 23/28

Fisher s exact test and the χ 2 -test So far, everything we ve talked about is testing the same null hypothesis with the same data, so one would expect similar results Indeed, the p-values are very similar: Experiment: p =.004 : p =.005 χ 2 -test: p =.004 Patrick Breheny Introduction to Biostatistics (171:161) 24/28

Fisher s exact test vs. the χ 2 -test This is often the case for 2x2 tables: the results from Fisher s exact test and the approximate χ 2 -test are typically in close agreement However, when there are many cells with small E i numbers, as can happen in larger tables, the two can yield very different results For example, the next slide presents data from an occupational health study comparing the frequency of wearing gloves outside the lab vs. education level Patrick Breheny Introduction to Biostatistics (171:161) 25/28

Study: Frequency of wearing gloves outside the lab Some 4-year Master s Other Prof. College Degree Degree Ph.D. Degree Sometimes 1 0 0 0 0 Rarely 1 3 5 15 0 Never 1 17 13 38 3 : p =.12 χ 2 -test: p =.00003 Patrick Breheny Introduction to Biostatistics (171:161) 26/28

Fisher s exact test vs. the χ 2 -test How should you decide to use one versus the other? As in the case of one-sample data, with modern computers there is little reason to settle for the approximate answer when the exact answer can be calculated in a fraction of a second Nevertheless, χ 2 -tests are still widely used, largely due to inertia and tradition, but also because the two generally provide very similar results, especially for 2x2 tables It is important to be aware, however, that the χ 2 -test can be wildly incorrect when some cells have small E i values as a rule of thumb, this starts to become a problem when E i < 5, but becomes extreme when E i < 1 Patrick Breheny Introduction to Biostatistics (171:161) 27/28

Introduction Know what a contingency table is and how to construct one Know how to carry out a χ 2 -test, and in particular, how to construct a table of expected counts under the null hypothesis The χ 2 -test is based on an approximation; there is an exact alternative () that produces exact p-values Complex to perform by hand Trivial with modern computers Patrick Breheny Introduction to Biostatistics (171:161) 28/28