Confidence intervals summary. Conservative and approximate confidence intervals for a binomial p. Examples. MATH1005 Statistics, Lecture 24. M. Stewart.


MATH1005 Statistics, Lecture 24. M. Stewart, School of Mathematics and Statistics, University of Sydney.

Outline: Confidence intervals summary; Conservative and approximate confidence intervals for a binomial p (the naïve interval and problems with it, the conservative interval, examples).

Pivots and Confidence Intervals. For a given statistical model, a pivot is a function of the data and the parameters which always has the same distribution (whatever the values of the parameters). For the Z-test model (data modelled as a random sample of size n from a population with unknown mean µ, known variance σ², and population normal and/or sample size large), then with X̄ the sample average, the pivot (X̄ − µ)/(σ/√n) ∼ N(0, 1) (whatever µ is). For the (one-sample) t-test model (random sample of size n from a normal population with unknown mean µ and variance), with X̄ and S the sample average and sd (resp.), the pivot (X̄ − µ)/(S/√n) ∼ t_{n−1} (whatever µ is).

For the two-sample t-test model (two random samples, of sizes n_x and n_y, from normal populations with unknown means µ_x, µ_y and unknown but equal variances), then with X̄, Ȳ the sample averages, S_X, S_Y the sample sds, and

S_p = √( ((n_x − 1)S_X² + (n_y − 1)S_Y²) / (n_x + n_y − 2) )

the pooled sample sd, the pivot

( X̄ − Ȳ − (µ_x − µ_y) ) / ( S_p √(1/n_x + 1/n_y) ) ∼ t_{n_x + n_y − 2}

(whatever the values of µ_x, µ_y). In all cases the pivot is of the form (EST − PARAM)/SE, where the numerator is the difference between a parameter and an estimate of it, the so-called estimation error, and SE is the sd (or an estimate thereof) of the estimation error in the numerator, regarded as a random variable.

Suppose we can find a c so that for the pivot in question,

P( −c ≤ (EST − PARAM)/SE ≤ c ) = 0.95.

Then we can say that the random interval of the form EST ± c·SE contains PARAM with probability 0.95, that is

P( EST − c·SE ≤ PARAM ≤ EST + c·SE ) = 0.95.

The observed value of this random interval is called a 95% confidence interval. Different confidence levels (e.g. 90%, 99%) can be obtained by choosing c differently, such that the right-hand side above is 0.90, 0.99, etc.

Thus for the Z-test model, if x̄ is the observed value of X̄, we need upper percentage points from N(0, 1) (available on the bottom row of a t-table). For a 100(1 − α)% confidence interval we need c such that P(Z ≥ c) = α/2, since then we also have P(Z ≤ −c) = α/2 and so

P(−c ≤ Z ≤ c) = 1 − P(Z ≤ −c) − P(Z ≥ c) = 1 − α/2 − α/2 = 1 − α.

The 95% confidence interval is x̄ ± 1.96 σ/√n; the 90% confidence interval is x̄ ± 1.645 σ/√n; the 99% confidence interval is x̄ ± 2.576 σ/√n.
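These multipliers can be reproduced with qnorm(); a minimal R sketch (the values of xbar, sigma and n below are placeholders chosen only for illustration, not from the lecture):

# z-interval for a population mean with known sigma
xbar  <- 10.3    # observed sample average (illustrative)
sigma <- 2.5     # known population sd (illustrative)
n     <- 25      # sample size (illustrative)
alpha <- 0.05    # 1 minus the confidence level
zc    <- qnorm(1 - alpha/2)              # 1.96 for 95%, 1.645 for 90%, 2.576 for 99%
xbar + c(-1, 1) * zc * sigma / sqrt(n)   # the 95% confidence interval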

Suppose we have a (one-sample) t-test model with 16 observations, and that x̄ and s are the observed values of the sample average X̄ and sd S. The pivot (X̄ − µ)/(S/√16) ∼ t_{15}, so we consult that row of the t-table: For 95% (corresp. to α = 0.05), since P(t_{15} > 2.131) = 0.025 (i.e. α/2), we use x̄ ± 2.131(s/4). For 90% (corresp. to α = 0.1), since P(t_{15} > 1.753) = 0.05 (i.e. α/2), we use x̄ ± 1.753(s/4). For 99% (corresp. to α = 0.01), since P(t_{15} > 2.947) = 0.005 (i.e. α/2), we use x̄ ± 2.947(s/4).
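The t_{15} multipliers can be checked with qt(); a sketch, with xbar and s as illustrative placeholder values:

# t-interval with n = 16 observations (15 degrees of freedom)
xbar <- 4.2    # observed sample average (illustrative)
s    <- 1.1    # observed sample sd (illustrative)
n    <- 16
qt(0.975, df = n - 1)    # about 2.131, the 95% multiplier
qt(0.95,  df = n - 1)    # about 1.753, the 90% multiplier
qt(0.995, df = n - 1)    # about 2.947, the 99% multiplier
xbar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)   # 95% interval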

Suppose we have a two-sample t-test model with sample sizes 11 and 15. Let x̄ and ȳ denote the observed sample averages, s_x and s_y the observed sample sds, and s_p = √((10 s_x² + 14 s_y²)/24) the observed pooled sample sd. The pivot here has a t_{24} distribution. For 95% confidence (corresp. to α = 0.05), the multiplier we need is 2.064 (since P(t_{24} > 2.064) = 0.025, i.e. α/2). A 95% confidence interval for the population mean difference µ_x − µ_y is therefore given by

(x̄ − ȳ) ± 2.064 · s_p √(1/11 + 1/15).

For 90% confidence (α = 0.1), since P(t_{24} > 1.711) = 0.05 (i.e. α/2), use 1.711. For 99% confidence (α = 0.01), since P(t_{24} > 2.797) = 0.005 (i.e. α/2), use 2.797.
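As a sketch, the whole calculation in R; the sample sizes 11 and 15 are from the example, while the summary statistics are made-up placeholders:

# pooled two-sample t-interval, sample sizes 11 and 15 (24 df)
nx <- 11; ny <- 15
xbar <- 5.8; ybar <- 5.1    # illustrative sample averages
sx <- 1.3;  sy <- 1.5       # illustrative sample sds
sp <- sqrt(((nx - 1) * sx^2 + (ny - 1) * sy^2) / (nx + ny - 2))   # pooled sd
tc <- qt(0.975, df = nx + ny - 2)                                 # about 2.064
(xbar - ybar) + c(-1, 1) * tc * sp * sqrt(1/nx + 1/ny)            # 95% interval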

Interpretation. Many have difficulty in properly interpreting a confidence interval. The confidence level is a property of the procedure you have used: it says how often it covers the target in the long run. This is thus a property only realised after many repetitions. If we just compute a single confidence interval in practice, then we may or may not have covered the target. We don't know, and we possibly never will know exactly. However, we know that if we repeated this procedure many times, then 95% (or whatever the confidence level is) of the time the confidence interval would include the unknown parameter value.
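This long-run interpretation can be illustrated by simulation; a minimal sketch for the known-σ z-interval, where µ = 0, σ = 1 and n = 25 are assumed values chosen purely for illustration:

# long-run coverage of the 95% z-interval, by simulation
set.seed(1)
mu <- 0; sigma <- 1; n <- 25; nsim <- 10000
xbar  <- replicate(nsim, mean(rnorm(n, mu, sigma)))   # one sample average per repetition
lower <- xbar - 1.96 * sigma / sqrt(n)
upper <- xbar + 1.96 * sigma / sqrt(n)
mean(lower <= mu & upper >= mu)    # proportion of intervals covering mu; close to 0.95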

Conservative and approximate confidence intervals. Our ability to construct exact confidence intervals for the 3 models considered depended crucially on the fact that we had a pivot with a known distribution of the form (EST − PARAM)/SE. In that case we can define a random interval with the property that P(interval includes parameter) = 0.95. In some models the form of the (approximate) pivots and/or SEs is not so convenient.

Fallback options are:
- a conservative 95% confidence interval, obtained by defining a random interval such that P(interval includes parameter) ≥ 0.95, so that the interval is possibly wider than it really needs to be, but at least still has the nominal coverage probability;
- an approximate 95% confidence interval, whereby P(interval includes parameter) ≈ 0.95. Such intervals should be used with caution. Strictly speaking, intervals in the Z-test model where we are using a Central-Limit-Theorem-approximately-normal argument are of this type, although in those cases the approximation is often quite accurate.
We shall examine such things in one particular example: confidence intervals for a binomial p-parameter.

Confidence intervals for a binomial p. Suppose we model a count X as a B(n, p) for some known n but unknown p. Example: in a clinical trial, of 100 patients suffering from a certain condition, 68 obtain relief. Modelling this count as a B(100, p) random variable, provide a 95% confidence interval for p. The estimate is just p̂ = 68/100 = 0.68, the observed proportion obtaining relief. A first guess would be to work out the standard error of the estimate and then, since X, and thus p̂, are approximately normal, use p̂ ± 1.96 SE(p̂). What is the standard error of the estimate p̂ in general?

The random variable p̂ = X/n has

Var(p̂) = Var(X/n) = (1/n)² Var(X) = (1/n)² · n p(1 − p) = p(1 − p)/n.

Thus the standard deviation of the estimator p̂ = X/n is √(p(1 − p)/n). However, this depends on the unknown p, so a computable version (i.e. the standard error) is obtained by plugging the estimate into this expression. Thus

SE(p̂) = √(p̂(1 − p̂)/n).

So in our example, the estimate is p̂ = 0.68 with standard error √(p̂(1 − p̂)/100) ≈ 0.047. Can we use an interval of the form p̂ ± c·SE(p̂)?
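(A quick numerical check in R of the estimate and standard error just computed for the clinical-trial counts:)

# point estimate and standard error for the 68-out-of-100 example
x <- 68; n <- 100
phat <- x / n                        # 0.68
se   <- sqrt(phat * (1 - phat) / n)  # about 0.047
c(phat = phat, se = se)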

To do this, we need

(p̂ − p)/SE(p̂) = (p̂ − p)/√(p̂(1 − p̂)/n)

to be a pivot, that is, to have a known distribution not depending on p. Is this a pivot? Is it approximately? If so, with what distribution? If n is large enough so that the Central Limit Theorem applies, we do have an (approximate) pivot here, but not the ratio above, rather a version of it with the true p put back into the denominator in place of p̂ (i.e. the exact SD instead of its approximation, the SE):

(p̂ − p)/SD(p̂) = (p̂ − p)/√(p(1 − p)/n) ∼ N(0, 1) approximately

(note the difference between these two ratios, in particular the denominators!)

Unfortunately the approximate N(0, 1) pivot has the unknown p appearing in the denominator; it can't be used to construct confidence intervals directly; the ±-factor is not computable. Even more unfortunately, if we revert back to our first guess and plug in p̂ for p in the denominator, the resultant ratio is in general not at all an approximate N(0, 1) pivot: the distribution of

(p̂ − p)/SE(p̂) = (p̂ − p)/√(p̂(1 − p̂)/n)

changes significantly for different p's (particularly for small-to-moderate n, say n ≤ 50). Even more unfortunately than that, this is still recommended in many textbooks as a good idea. As we illustrate below, this interval can have a serious problem. More precisely, for certain unlucky choices of n and p, the coverage probability of the interval p̂ ± 1.96 SE(p̂) is notably below 0.95.

We illustrate this phenomenon with a particularly unlucky pair, n = 32, p = 0.2:

> x=rbinom(10000,32,.2)
> phat=x/32
> se=sqrt(phat*(1-phat)/32)
> lower=phat-1.96*se
> upper=phat+1.96*se
> # count how many simulated intervals cover the true value of 0.2
> sum((lower<=.2)*(upper>=.2))
[1] 8889

This is significantly less than the expected 9500 (the P-value of a one-sided test of H_0: p = 0.95 versus H_1: p < 0.95 is pretty small!):

> pbinom(8889,10000,.95)
[1] 5.488768e-131

FIG. 1 (from Brown et al., Annals of Statistics 2002): coverage probability of the standard interval for p = 0.5 and n = 10 to 100. The figure shows P(p ∈ p̂ ± 1.96√(p̂(1 − p̂)/n)) for p = 0.5 and n ranging from 10 to 100.
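For p = 0.5 the quantity plotted in the figure can be computed exactly, without simulation, by summing binomial probabilities over the values of X for which the interval covers p. A sketch of one way to do this (the function name coverage is ours, chosen for illustration):

# exact coverage probability of the "standard" interval phat +/- 1.96*SE
coverage <- function(n, p) {
  x      <- 0:n
  phat   <- x / n
  se     <- sqrt(phat * (1 - phat) / n)
  covers <- (phat - 1.96 * se <= p) & (phat + 1.96 * se >= p)
  sum(dbinom(x[covers], n, p))    # P(interval covers p) under B(n, p)
}
ns <- 10:100
plot(ns, sapply(ns, coverage, p = 0.5), type = "l",
     xlab = "n", ylab = "coverage probability")
abline(h = 0.95, lty = 2)   # nominal 95% level for reference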

This graph seems to suggest that, at least for p = 0.5, the situation improves as n gets bigger (as we expect it should, because then the SE should be almost perfect at estimating SD(p̂), and so the ratio should be like an N(0, 1) then). Let us consider n = 100, n = 1000 and n = 2000 with p = 0.2 again. Firstly, n = 100, p = 0.2:

> x=rbinom(10000,100,.2)
> phat=x/100
> se=sqrt(phat*(1-phat)/100)
> lower=phat-1.96*se
> upper=phat+1.96*se
> sum((lower<=.2)*(upper>=.2))
[1] 9343
> pbinom(9343,10000,.95)
[1] 3.154453e-12

Again, the coverage probability is clearly less than 0.95.

Next, n = 1000, p = 0.2:

> x=rbinom(10000,1000,.2)
> phat=x/1000
> se=sqrt(phat*(1-phat)/1000)
> lower=phat-1.96*se
> upper=phat+1.96*se
> sum((lower<=.2)*(upper>=.2))
[1] 9429
> pbinom(9429,10000,.95)
[1] 0.0007532143

Even here with n = 1000, the number of intervals that work is significantly less than the 9500 that one would expect if the confidence level really was 95%.

Finally, n = 2000, p = 0.2:

> x=rbinom(10000,2000,.2)
> phat=x/2000
> se=sqrt(phat*(1-phat)/2000)
> lower=phat-1.96*se
> upper=phat+1.96*se
> sum((lower<=.2)*(upper>=.2))
[1] 9485
> pbinom(9485,10000,.95)
[1] 0.251702

Here, although less than 9500, it is not significantly less, and so we would be happy believing that the actual confidence level is 95% here. So only use this interval for massive sample sizes (well over 1000).
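The simulations above all follow the same pattern, differing only in n; as a convenience they can be wrapped in a single function (sim_coverage is our own name, not part of the lecture code):

# number of simulated intervals (out of nsim) covering the true p
sim_coverage <- function(n, p, nsim = 10000) {
  x    <- rbinom(nsim, n, p)
  phat <- x / n
  se   <- sqrt(phat * (1 - phat) / n)
  sum(phat - 1.96 * se <= p & phat + 1.96 * se >= p)
}
set.seed(1)
sapply(c(32, 100, 1000, 2000), sim_coverage, p = 0.2)   # counts out of 10000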

Two sources of error in the approximate confidence interval for p. There are two approximations at work with the so-called approximate interval p̂ ± 1.96√(p̂(1 − p̂)/n), where p̂ = X/n and X ∼ B(n, p):
- approximating SD(p̂) = √(p(1 − p)/n) with the SE = √(p̂(1 − p̂)/n);
- approximating the binomial distribution of X with a normal.
The main source of error is the first one; so long as np and n(1 − p) are both bigger than 5, we are happy that the normal approximation to the binomial is pretty good.

That is to say, we are reasonably happy that

(p̂ − p)/√(p(1 − p)/n) ∼ N(0, 1) approximately

(note: the true p appears in the denominator here, not p̂), and so that

P( p ∈ p̂ ± 1.96 √(p(1 − p)/n) ) ≈ 0.95

is a pretty accurate approximation. The problem here is that this interval, while having a close-to-95% coverage probability, cannot be computed!

The poor performance of the so-called approximate confidence interval is because of the difficulty in accurately approximating the quantity √(p(1 − p)/n). Another approach is to determine an upper bound for this. For 0 ≤ p ≤ 1, p(1 − p) is maximised at p = 0.5, where it equals 0.25. Thus, because √(p(1 − p)) ≤ 1/2 for all p, the uncomputable interval p̂ ± 1.96 √(p(1 − p)/n) (which has coverage probability ≈ 95%) is always included in the conservative interval p̂ ± 1.96 · 1/(2√n).
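(The claim that p(1 − p) peaks at p = 0.5 with value 0.25 can be checked numerically:)

# numerical check: p(1 - p) is maximised at p = 0.5 with value 0.25
optimize(function(p) p * (1 - p), interval = c(0, 1), maximum = TRUE)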

Conservative confidence interval for a binomial p. Thus we have that

P( p̂ − 1.96/(2√n) ≤ p ≤ p̂ + 1.96/(2√n) ) ≥ 0.95 (approximately).

(Note: it is still technically only approximately conservative, since we are using a normal approximation to a binomial distribution.) We refer to the observed value of the random interval p̂ ± 1.96 · 1/(2√n) as a conservative 95% confidence interval for p. It always has the maximum width of any corresponding approximate interval for that value of n.

Summary. Thus, we have the following two options for providing a 95%¹ confidence interval for the binomial p parameter, based on a single observation x modelled as the observed value of a random variable X ∼ B(n, p) for n known but p unknown:
1. The approximate 95% C.I. for p: p̂ ± 1.96 √(p̂(1 − p̂)/n), which should only be used for massive n (n > 1000).
2. The conservative 95% C.I. for p: p̂ ± 1.96 · 1/(2√n), which can be (needlessly) wide, but is (at least approximately) valid.
¹ Different confidence levels are obtained by replacing 1.96 with the appropriate value from the N(0, 1) table: 1.645 for 90%, 2.326 for 98%, 2.576 for 99%, etc.
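A hypothetical helper combining both options (binom_ci is our own illustrative name, not a standard R function), applied here to the 68-out-of-100 example; the approximate call is shown only to illustrate the formula, since n = 100 is smaller than the lecture recommends for that interval:

# both 95% intervals for a binomial p, given x successes out of n
binom_ci <- function(x, n, conf = 0.95, conservative = TRUE) {
  phat <- x / n
  zc   <- qnorm(1 - (1 - conf) / 2)
  half <- if (conservative) zc / (2 * sqrt(n)) else zc * sqrt(phat * (1 - phat) / n)
  c(lower = phat - half, upper = phat + half)
}
binom_ci(68, 100)                        # conservative: about (0.582, 0.778)
binom_ci(68, 100, conservative = FALSE)  # approximate:  about (0.589, 0.771)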

Examples. Continuing our earlier example with 100 patients, 68 of whom experience relief, our point estimate of p is p̂ = 68/100 = 0.68, and the standard error is √(p̂(1 − p̂)/100) ≈ 0.047. The sample size is too small to use the approximate interval. The conservative 95% interval is thus p̂ ± 1.96 · 1/(2√100) = 0.68 ± 0.098 = [0.582, 0.778].

Left-handedness. A random sample of 1500 people from a certain population was found to contain 129 left-handed people. Provide a 95% confidence interval for the true proportion of left-handers in the population. Our point estimate is p̂ = 129/1500 ≈ 0.086 and its standard error is √(0.086 × 0.914/1500) ≈ 0.00724. Since our n here is in the thousands, we can perhaps use the approximate interval. It yields p̂ ± 1.96 SE ≈ 0.086 ± (1.96 × 0.00724) ≈ [0.072, 0.100]. It is of interest to compare this to the conservative interval: 0.086 ± 1.96 · 1/(2√1500) ≈ [0.061, 0.111]. The conservative interval is considerably wider, which will of course happen whenever p̂ is far from 0.5, as it is here.

In light of the last example, we can do a little simulation to see how reliable that approximate 95% confidence interval is: we simulate from B(1500, 0.08) many times and see how often the interval covers 0.08:

> x=rbinom(10000,1500,.08)
> phat=x/1500
> l1=phat-1.96*sqrt(phat*(1-phat)/1500)
> u1=phat+1.96*sqrt(phat*(1-phat)/1500)
> sum((l1<=.08)*(u1>=.08))
[1] 9494

This is clearly not significantly different from the ideal 9500! This makes us feel good about the approximate interval here. How about the conservative 95% interval?

> l2=phat-1.96*sqrt(1/(4*1500))
> u2=phat+1.96*sqrt(1/(4*1500))
> sum((l2<=.08)*(u2>=.08))
[1] 9997

Wow! In all but 3 of the 10000 simulations the conservative interval covered 0.08. So although very wide, it will cover the true p at least 95% of the time.

Goodness-of-fit tests. Our last topic relates to discrete data, e.g. counts or frequencies. Sometimes it is desired to compare a set of observed frequencies to either 1. a given set of expected probabilities/proportions, or 2. a family of such sets, to see if the set of probabilities (or one member of the family of such sets) can well explain what is observed.