One-sample categorical data: approximate inference


Patrick Breheny, Biostatistical Methods I (BIOS 5710)
October 6

Introduction

It is relatively easy to think about the distribution of data such as heights, weights, or blood pressures: we can see these numbers, summarize them, plot them, and so on.

It is much harder to think about things like the distribution of the sample mean, because in reality the experiment is conducted only once and we see only one mean.

The distribution of the mean is more of a hypothetical concept, describing what would happen if we were to repeat the experiment over and over.

Sampling distributions

Consider a study to determine the average cholesterol level in a certain population; if we were to repeat this study many times, we would get a different estimate each time, depending on the random sample we drew.

To reflect the fact that its distribution depends on the random sample, the distribution of an estimate (such as the sample mean) is called a sampling distribution.

Sampling distributions are of fundamental importance to the long-run frequency approach to statistical inference, and are essential for carrying out hypothesis tests and constructing confidence intervals.

In a broader sense, we study sampling distributions to understand how reproducible a study's findings are and, in turn, how accurate its generalizations are likely to be.
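As a quick illustration of this repeated-experiment idea, here is a minimal simulation sketch in Python (numpy assumed available), using the cholesterol numbers from the example that follows: drawing many samples of size 25 and computing each sample's mean traces out the sampling distribution of the mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: serum cholesterol with mean 211 and SD 46 mg/dl
# (the values used in the cholesterol example below).
mu, sigma, n, reps = 211, 46, 25, 10_000

# "Repeat the experiment" reps times; each row is one study of size n,
# and each row's mean is one draw from the sampling distribution.
sample_means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(sample_means.mean())  # close to 211, the population mean
print(sample_means.std())   # close to 46/sqrt(25) = 9.2, the standard error
```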

Sampling distributions (cont'd)

For independent one-zero outcomes, the sampling distribution was simple enough that we could derive it exactly and describe it with a simple formula.

For most other outcomes, however, this is not possible, and we often rely instead on the central limit theorem to provide the sampling distribution; as we've seen, this is not exact, but it is usually a very good approximation.

Applying the central limit theorem

To get a sense of how useful the central limit theorem is, let's return to our hypothetical study to determine an average cholesterol level.

According to the National Center for Health Statistics, the distribution of serum cholesterol levels for 20- to 74-year-old males living in the United States has mean 211 mg/dl and standard deviation 46 mg/dl (these are estimates, of course, but for the sake of this example we will take them to be the true population parameters).

We collect a sample of size 25; what is the probability that our sample average will be above 230?

We collect a sample of size 25; 95% of our sample averages will fall between what two numbers?

How large does the sample size need to be in order to ensure a 95% probability that the sample average will be within 5 mg/dl of the population mean?
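A sketch of all three answers using scipy's normal-distribution routines (scipy assumed available); everything follows from $\mathrm{SE} = 46/\sqrt{25} = 9.2$:

```python
import numpy as np
from scipy.stats import norm

mu, sigma, n = 211, 46, 25
se = sigma / np.sqrt(n)  # standard error of the mean: 46/5 = 9.2

# 1) P(sample average > 230)
print(1 - norm.cdf(230, loc=mu, scale=se))          # about 0.02

# 2) the middle 95% of sample averages
print(norm.ppf([0.025, 0.975], loc=mu, scale=se))   # about [193, 229]

# 3) smallest n with 95% probability that the mean is within 5 mg/dl:
#    solve 1.96 * sigma / sqrt(n) <= 5 for n
n_needed = int(np.ceil((norm.ppf(0.975) * sigma / 5) ** 2))
print(n_needed)                                     # 326
```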

Introduction

We can use this same line of thinking to develop hypothesis tests and confidence intervals.

We'll begin by revisiting one-sample categorical data because:
It's the simplest scenario.
We can compare our new simple-yet-approximate results to the exact hypothesis tests and confidence intervals that we obtained earlier based on the binomial distribution.

One-zero (Bernoulli) distribution: mean and variance

To use the central limit theorem, we need the population mean and variance.

For a single one-zero outcome (known as the Bernoulli distribution), the mean is $\pi$, as we showed in the previous lecture.

Theorem: For a Bernoulli random variable $X$, $\operatorname{Var}(X) = \pi(1-\pi)$.
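The proof is one line once we note that $X^2 = X$ for a one-zero variable:

```latex
\[
\operatorname{Var}(X) = E(X^2) - \{E(X)\}^2
                      = \pi - \pi^2   % X^2 = X for a 0/1 variable, so E(X^2) = \pi
                      = \pi(1 - \pi).
\]
```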

Hypothesis testing

Now we're ready to carry out a hypothesis test based on the central limit theorem.

Consider our cystic fibrosis experiment, in which 11 out of 14 people did better on the drug than the placebo; expressing this as an average, $\hat\pi = 11/14 \approx 0.79$ (i.e., 79% of the subjects did better on drug than placebo).

Under the null hypothesis, the sampling distribution of the percentage who did better on one therapy than the other will (approximately) follow a normal distribution with mean $\pi_0 = 0.5$.

The notation $\pi_0$ refers to the hypothesized value of the parameter $\pi$ under the null.

The standard error

What about the standard error (i.e., the standard deviation of $\hat\pi$)?

Recall that $\mathrm{SE} = \mathrm{SD}/\sqrt{n}$, so for a Bernoulli random variable,
$$\mathrm{SE} = \sqrt{\frac{\pi_0(1-\pi_0)}{n}} = \frac{1}{2\sqrt{n}}.$$

For the cystic fibrosis experiment, under the null, $\mathrm{SE} = 0.5/\sqrt{14} = 0.134$.

Approximate test for the cystic fibrosis experiment

To calculate a p-value, we need the probability that $\hat\pi$ is more extreme than 11/14, given that the true probability is $\pi_0 = 0.5$.

By the central limit theorem, under the null,
$$\frac{\hat\pi - \pi_0}{\mathrm{SE}} \sim N(0, 1)$$
approximately. Thus,
$$z = \frac{0.786 - 0.5}{0.134} = 2.14,$$
and the p-value of this test is therefore $2(1 - \Phi(2.14)) = 0.032$.

In other words, if the null hypothesis were true, there would be only about a 3% chance of seeing the drug do this much better than the placebo.
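The same calculation in a few lines of Python (scipy assumed), reproducing $z = 2.14$ and $p \approx 0.032$:

```python
import numpy as np
from scipy.stats import norm

x, n, pi0 = 11, 14, 0.5

pi_hat = x / n                     # 0.786
se = np.sqrt(pi0 * (1 - pi0) / n)  # 0.134, the null standard error

z = (pi_hat - pi0) / se            # 2.14
p_value = 2 * (1 - norm.cdf(z))    # two-sided p-value: about 0.032
print(z, p_value)
```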

Terminology

Hypothesis tests revolve around calculating some statistic (known as a test statistic) from the data whose distribution under the null hypothesis is known.

In this case, our test statistic is z: we can calculate it from the data, and under the null hypothesis, it follows a normal distribution.

Tests are often named after their test statistics: the testing procedure we just described is called a z-test.

Accuracy of the approximation

So the z-test indicates moderate evidence against the null; recall, however, that we calculated a p-value of 6% from the (exact) binomial test, which is more in the borderline-evidence region.

With a sample size of just 14, the distribution of the sample average is still fairly discrete, and this throws off the normal approximation a bit:

[Figure: the discrete null sampling distribution of $\hat\pi$ (on the scale 0 to 1), with its normal approximation overlaid; the y-axis is density.]
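For comparison, the exact binomial p-value quoted above can be reproduced with scipy's binomtest (available in scipy 1.7 and later; that this is how you'd compute it is my suggestion, not part of the lecture):

```python
from scipy.stats import binomtest

# Exact two-sided binomial test of pi0 = 0.5 for 11 successes in 14 trials;
# compare its p-value (about 0.057) with the z-test's 0.032.
print(binomtest(11, 14, p=0.5).pvalue)
```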

Introduction: confidence intervals

Now let's turn our attention to confidence intervals.

As usual, this is a harder problem: hypothesis testing was straightforward because under the null, we knew $\pi_0$ and therefore we knew the standard error.

This is not true when trying to determine a confidence interval: the SE depends on $\pi$, which we don't know.

There are two common approaches to dealing with this problem, known as the Wald interval and the score interval; we will discuss both.

Wald approach: Main idea

In the Wald approach, we use $\hat\pi$ to estimate the SE.

The idea behind this approach is that it uses our best guess about $\pi$ to obtain a best guess for the SE.

Otherwise, however, this approach does not directly account for the fact that the SE depends on $\pi$.

Wald approach for CF study

For the CF study,
$$\mathrm{SE} = \sqrt{\frac{\hat\pi(1-\hat\pi)}{n}} = \sqrt{\frac{0.786(1-0.786)}{14}} = 0.110.$$

Now, by the central limit theorem,
$$\frac{\hat\pi - \pi}{0.110} \sim N(0, 1)$$
approximately, and we can solve for $\pi$ to obtain a confidence interval.

Wald approach for CF study (cont'd)

For the standard normal distribution,
$$\Phi^{-1}(0.975) = 1.96, \qquad \Phi^{-1}(0.025) = -1.96.$$

Thus,
$$0.95 = P(-1.96 < Z < 1.96) \approx P\left(-1.96 < \frac{\hat\pi - \pi}{0.110} < 1.96\right),$$
and
$$[\hat\pi - 1.96(0.110),\ \hat\pi + 1.96(0.110)] = [57.1\%, 100.0\%]$$
is an approximate 95% confidence interval for $\pi$.

Wald formula

Let $z_\alpha$ denote the value that contains the middle $100(1-\alpha)$ percent of the standard normal distribution.

We can summarize the Wald interval with the formula
$$\hat\pi \pm z_\alpha \cdot \mathrm{SE}, \qquad \text{where } \mathrm{SE} = \sqrt{\hat\pi(1-\hat\pi)/n}.$$

As we will see, this is actually a very common form for confidence intervals (estimate plus/minus a multiple of the standard error), although the multiplier and standard error formulas change depending on what we are estimating.
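A small sketch of the Wald interval as a Python function (scipy assumed; the name wald_ci is mine). Note how the upper limit for the CF study exceeds 1, which is why the slide above truncates it to 100%:

```python
import numpy as np
from scipy.stats import norm

def wald_ci(x, n, alpha=0.05):
    """Wald interval: pi_hat +/- z * sqrt(pi_hat * (1 - pi_hat) / n)."""
    pi_hat = x / n
    z = norm.ppf(1 - alpha / 2)
    se = np.sqrt(pi_hat * (1 - pi_hat) / n)
    return pi_hat - z * se, pi_hat + z * se

print(wald_ci(11, 14))  # about (0.571, 1.001); truncated to [57.1%, 100.0%]
```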

Score approach: Main idea

The score approach also uses the central limit theorem to create approximate confidence intervals, but does so in a different manner than the Wald approach.

The score approach works very similarly to the Clopper-Pearson interval, except that instead of inverting the binomial test, we invert the CLT-based test from earlier.

This amounts to solving the equation
$$\frac{|\hat\pi - \pi|}{\sqrt{\pi(1-\pi)/n}} = z_\alpha$$
for $\pi$; squaring both sides turns this into a quadratic in $\pi$.

Score approach: Formula

In other words, the endpoints of the score interval are given by
$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a},$$
where $a = 1 + z_\alpha^2/n$, $b = -(2\hat\pi + z_\alpha^2/n)$, and $c = \hat\pi^2$ (although I certainly don't expect you to remember this formula).

For the cystic fibrosis study, the 95% CI is [52.4%, 92.4%].

The score approach lies somewhere in between the Wald and Clopper-Pearson approaches: it is still based on a CLT approximation to the true sampling distribution, but it accounts for the fact that the SE varies with $\pi$.
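The corresponding sketch for the score interval, solving the quadratic above (again, the function name score_ci is mine, not from the lecture):

```python
import numpy as np
from scipy.stats import norm

def score_ci(x, n, alpha=0.05):
    """Score (Wilson) interval: roots of the quadratic obtained by
    inverting the CLT-based test."""
    pi_hat = x / n
    z2 = norm.ppf(1 - alpha / 2) ** 2
    a = 1 + z2 / n
    b = -(2 * pi_hat + z2 / n)
    c = pi_hat ** 2
    root = np.sqrt(b ** 2 - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)

print(score_ci(11, 14))  # about (0.524, 0.924)
```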

Cystic fibrosis study

Let's take a look at how the three confidence intervals (binomial, Wald, score) compare for the three studies we've discussed previously.

For the cystic fibrosis study (x = 11, n = 14), we have:
Binomial: [49.2, 95.3]
Wald: [57.1, 100.0]
Score: [52.4, 92.4]

The score interval isn't too bad, but the Wald interval is pretty far off.

Infant survival, 25 weeks

Sometimes the agreement is much better; for the infant survival data at 25 weeks (x = 31, n = 39), we have:
Binomial: [63.6, 90.7]
Wald: [66.8, 92.2]
Score: [64.5, 89.2]

Here all three intervals are reasonably close, although the score interval is again closer to the binomial interval.

Infant survival, 22 weeks

And sometimes the Wald interval fails completely; for the infant survival data at 22 weeks (x = 0, n = 29), we have:
Binomial: [0, 11.9]
Wald: [0, 0]
Score: [0, 11.7]

The Wald interval is clearly useless in this scenario.
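All three intervals for all three studies can also be reproduced with statsmodels' proportion_confint, assuming that package is installed (this is a convenience the lecture doesn't mention): method 'beta' is Clopper-Pearson, 'normal' is Wald, and 'wilson' is the score interval.

```python
from statsmodels.stats.proportion import proportion_confint

studies = {"CF (11/14)": (11, 14),
           "25 weeks (31/39)": (31, 39),
           "22 weeks (0/29)": (0, 29)}

# 'beta' = Clopper-Pearson (exact), 'normal' = Wald, 'wilson' = score
for label, (x, n) in studies.items():
    for method in ("beta", "normal", "wilson"):
        lo, hi = proportion_confint(x, n, alpha=0.05, method=method)
        print(f"{label:18s} {method:7s} [{100*lo:5.1f}, {100*hi:5.1f}]")
```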

Accuracy of the normal approximation

The real sampling distribution is binomial, but when n is reasonably big and p isn't close to 0 or 1, the binomial distribution looks a lot like the normal distribution, so the normal approximation works pretty well.

When n is small and/or p is close to 0 or 1, the normal approximation doesn't work very well:

[Figure: two binomial probability histograms; the left panel (n = 39, p = 0.8) is roughly bell-shaped, while the right panel (n = 15, p = 0.95) is heavily skewed.]

Exact vs. approximate intervals

When n is large and p isn't close to 0 or 1, it doesn't really matter whether you choose the approximate or the exact approach.

The approximate approaches are easy to do by hand, although in the computer era this is often not important in real life.

Keep in mind, however, that the Clopper-Pearson interval is "exact" in the sense that it is based on the exact sampling distribution; as we saw in lab, it does not produce exact $1-\alpha$ coverage.
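A small simulation sketch of that coverage point (Python with statsmodels assumed; the setup mirrors the CF study): at n = 14 and $\pi = 0.5$, the "exact" interval over-covers while the approximate intervals fall somewhat short of 95%.

```python
import numpy as np
from statsmodels.stats.proportion import proportion_confint

rng = np.random.default_rng(1)
n, pi, reps = 14, 0.5, 10_000

# Simulate many studies and see how often each nominal 95% interval
# actually covers the true pi.
xs = rng.binomial(n, pi, size=reps)
for method in ("beta", "normal", "wilson"):
    covered = 0
    for x in xs:
        lo, hi = proportion_confint(int(x), n, alpha=0.05, method=method)
        covered += (lo <= pi <= hi)
    # 'beta' (Clopper-Pearson) over-covers (about 0.99 here);
    # 'normal' (Wald) and 'wilson' (score) land near 0.94.
    print(method, covered / reps)
```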

Summary

A sampling distribution is the distribution of an estimate based on a sample from a population.

Know how to use the CLT to approximate sampling distributions.

Know how to use the CLT to carry out approximate tests for one-sample categorical data.

Wald CI: $\hat\pi \pm z_\alpha \cdot \mathrm{SE}$, where $\mathrm{SE} = \sqrt{\hat\pi(1-\hat\pi)/n}$; this approximation can be very poor at times.

Score CI: based on inverting the CLT-based test; still approximate, but better than Wald.