Confidence Intervals with σ unknown


STAT 141 Confidence Intervals and Hypothesis Testing 10/26/04

Today (Chapter 7):
  CI with σ unknown, t-distribution
  CI for proportions
  Two sample CI with σ known or unknown
  Hypothesis Testing, z-test

Confidence Intervals with σ unknown

Last Time: Confidence interval when σ is known. A level C, or $100(1-\alpha)\%$, confidence interval for $\mu$ is

$$\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$

But to return to reality, we don't know σ. Thus we must estimate the standard deviation of $\bar{X}$ with:

$$SE_{\bar{X}} = \frac{s}{\sqrt{n}}$$

But s is just a function of our $X_i$'s and thus is a random variable too: it has a sampling distribution too.

Before, if we knew σ, we could say

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$$

which after algebra gave the confidence interval. [Remember: for any s, $z_s$ is defined so that $1 - 2s$ of the area falls in $(-z_s, z_s)$. So $z_s$ = qnorm(1 - s) = -qnorm(s) = the $(1-s)$ quantile; i.e., $z_s$ is the positive side.]

Now we want a similar setup, so that:

$$P\left(?? < \frac{\bar{X} - \mu}{SE_{\bar{X}}} < ??\right) = 1 - \alpha$$

We need to know the probability distribution of $T = \frac{\bar{X} - \mu}{SE_{\bar{X}}}$. T has the Student's t-distribution with $n-1$ degrees of freedom. We write this as $T \sim t_{n-1}$. The degrees of freedom, $\nu$, is the only parameter of this distribution. [The book uses $t_s$ for $T$.]
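As an aside (not in the original handout), the resulting t-based interval is easy to compute in R. Here is a minimal sketch; the function name t_interval and the simulated data are purely illustrative. The built-in t.test(x)$conf.int returns the same interval.

t_interval <- function(x, alpha = 0.05) {
  n  <- length(x)
  se <- sd(x) / sqrt(n)              # SE of the sample mean, s/sqrt(n)
  tq <- qt(1 - alpha/2, df = n - 1)  # t_{n-1, alpha/2}
  mean(x) + c(-1, 1) * tq * se       # X-bar -/+ t * SE
}
t_interval(rnorm(20, mean = 5, sd = 2))  # example call on simulated data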

[Figure: densities of the t distribution with df = 1, 5, 10, and 50, each overlaid on the N(0,1) density. The t curves have heavier tails and approach the normal curve as df grows.]

RCode:
> par(mfrow=c(2,2)) #tdist1.pdf
> plot(seq(-6,6,length=10000), dnorm(seq(-6,6,length=10000)),
    type="l", lty=3, ylab="", xlab="", main="t dist w/ df=1")
> lines(seq(-6,6,length=10000), dt(seq(-6,6,length=10000), df=1),
    type="l", ylab="", xlab="")
> legend(x=2, y=.4, lty=c(1,3), legend=c("t dist, df=1","N(0,1)"))
...

Thus the t-distribution approaches the normal as $\nu$ increases, but for small n it gives wider intervals.
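One can also check this convergence numerically (a quick illustrative snippet, not in the original handout):

# 97.5% quantiles of t_nu approach the N(0,1) quantile as nu grows
sapply(c(1, 5, 10, 50, 100), function(nu) qt(0.975, df = nu))
# 12.706  2.571  2.228  2.009  1.984
qnorm(0.975)
# 1.960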

Why degrees of freedom?? Let $y_i = x_i - \bar{x}$. We have

$$s^2 = \frac{1}{n-1}\sum_{i=1}^n y_i^2 \quad \text{and} \quad \sum y_i = 0 \quad (*)$$

Now (*) <=> 1 constraint on the n numbers, hence the phrase "n - 1 degrees of freedom".

Now that we know the distribution, we can find the ?? from above: these are just the $\alpha/2$ and $1-\alpha/2$ quantiles of the t-distribution. Let $t_{n-1,s}$ be defined similarly to $z_s$; it is equal to qt(1 - s, df = n - 1) = -qt(s, df = n - 1). We then have:

$$P\left(-t_{n-1,\alpha/2} < \frac{\bar{X} - \mu}{SE_{\bar{X}}} < t_{n-1,\alpha/2}\right) = 1 - \alpha$$

This gives us a confidence interval like before, only we use the quantiles of the t-distribution rather than the normal distribution.

Example. Taken from the original paper on the t-test by W.S. Gosset, 1908. [Gosset was employed by Guinness Breweries, Dublin. A chemist turned statistician. Guinness, fearing the results to be of commercial importance, forbade Gosset to publish under his own name. He chose the pseudonym "Student" out of modesty.]

Two drugs to induce sleep: A = dextro, B = laevo. Each of ten patients receives both drugs (presumably in random order). Issue: Is drug B better than drug A?

Student's sleep data:

data(sleep)
   extra group
1    0.7     1
2   -1.6     1
3   -0.2     1
4   -1.2     1
5   -0.1     1
6    3.4     1
7    3.7     1
8    0.8     1
9    0.0     1
10   2.0     1
11   1.9     2
12   0.8     2
13   1.1     2
14   0.1     2
15  -0.1     2
16   4.4     2
17   5.5     2
18   1.6     2
19   4.6     2
20   3.4     2

extra1=sleep[sleep[,2]==1,]
extra2=sleep[sleep[,2]==2,]
extradiff=extra2[,1]-extra1[,1]
> extradiff
 [1] 1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4
> mean(extradiff)
[1] 1.58
> sqrt(var(extradiff))
[1] 1.229995
> sqrt(var(extradiff)/10)
[1] 0.3889587
> 1.58/0.38896
[1] 4.062114
> qt(.975,9)
[1] 2.262157
> qt(.995,9)
[1] 3.249836
> qnorm(0.975)
[1] 1.959964
> qnorm(0.995)
[1] 2.575829

A level C confidence interval with σ unknown is exact if X is Normal, and otherwise approximately correct for large n. The margin of error M in E ± M is

$$M = t_{n-1,\alpha/2}\,\frac{s}{\sqrt{n}} = t_{n-1,\alpha/2}\, SE_{\bar{X}}$$

Remark: The large value, 4.6, is a possible outlier, so there is some doubt about the normality assumptions here.

What's different?? Since we don't know σ, we pay a penalty with a (slightly) wider interval (e.g., t = 2.262 vs. z = 1.96 for 95% confidence). For large sample sizes we can just use the normal distribution quantiles $z_{\alpha/2}$, since the t-distribution quickly looks like the normal distribution.
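The whole calculation can be reproduced with a single call to t.test (shown as a sketch; the numbers agree with the hand computation above):

> t.test(extradiff)  # one-sample t on the differences
# equivalently, as a paired two-sample test:
> t.test(extra2[,1], extra1[,1], paired = TRUE)
# both give t = 4.0621 on 9 df, and a 95% CI of
# 1.58 -/+ 2.262 * 0.389, i.e. (0.70, 2.46)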

Proportions

We saw last time that $\hat{p}$ is approximately distributed as $N\left(p, \frac{p(1-p)}{n}\right)$. If we want a confidence interval for p, we can use this normality to get an approximate confidence interval:

$$M = z_{\alpha/2}\, SE_{\hat{p}} = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

The book offers a correction to this, using (with y the number of successes)

$$\tilde{p} = \frac{y + 0.5\, z^2_{\alpha/2}}{n + z^2_{\alpha/2}} \quad \text{and} \quad SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n + z^2_{\alpha/2}}}$$
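Both intervals take only a few lines in R (a sketch, not from the original notes; the values of y and n are made up for illustration):

y <- 42; n <- 100; alpha <- 0.05   # illustrative counts
z <- qnorm(1 - alpha/2)
phat <- y/n
phat + c(-1, 1) * z * sqrt(phat*(1-phat)/n)            # plain interval
ptil <- (y + 0.5*z^2) / (n + z^2)                      # corrected estimate
ptil + c(-1, 1) * z * sqrt(ptil*(1-ptil)/(n + z^2))    # corrected interval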

Two-samples

One of the most common statistical procedures. Is there a difference? Is it real?? Because of the preparatory work with one-sample problems, this should seem rather familiar, a case of déjà vu, but with slightly more complex formulas.

What do we mean by two samples? Two groups; distinct populations [treatment/control, ..., male/female, ...]. Grouping variable: a categorical variable with 2 levels. Data is independent between groups.

Example: (Dalgaard p. 87) Energy expenditure: two groups of women, lean and obese. Twenty-four hour energy expenditure in MJ.

data(energy)
lean <- energy[energy$stature=="lean",1]
obese <- energy[energy$stature=="obese",1]
> obese
[1]  9.21 11.51 12.79 11.85  9.97  8.79  9.69  9.68  9.19
> lean
 [1]  7.53  7.48  8.08  8.09 10.15  8.40 10.88  6.13  7.90  7.05  7.48  7.58  8.11
plot(expend~stature, data=energy)

Beware: Some data sets that may look like two-sample problems are really better treated as paired data. Example: the sleep drug data from above: 10 patients, drugs A and B. Since each patient received both A and B, the samples are not really independent (there is a common component of variation due to the patient); it is better to look at differences, and the analysis becomes a one-sample problem. (We will discuss pairing/blocking more later.)

Notation:

Population     Variable   Mean    SD
Population 1   $X_1$      $\mu_1$  $\sigma_1$
Population 2   $X_2$      $\mu_2$  $\sigma_2$

SRS from Each Population   Sample Size   Sample Mean   Sample SD
Sample 1                   $n_1$         $\bar{X}_1$   $s_1$
Sample 2                   $n_2$         $\bar{X}_2$   $s_2$

Distribution of $\bar{X}_1 - \bar{X}_2$

Sample mean difference: $\bar{X}_1 - \bar{X}_2$. Everything depends on the variability and distribution of this difference!!

Recall in general that if $E(V) = \mu$ and $E(W) = \nu$, then

$$E(V - W) = \mu - \nu$$

and if V and W are independent then

$$\text{var}(V - W) = \text{var}(V) + \text{var}(W)$$

So if $\bar{X}_1 \sim (\mu_1, \frac{\sigma_1^2}{n_1})$ and $\bar{X}_2 \sim (\mu_2, \frac{\sigma_2^2}{n_2})$, we will have, for independent r.v.s $X_1$ and $X_2$:

$$\mu_{\bar{X}_1 - \bar{X}_2} = E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$

$$\sigma^2_{\bar{X}_1 - \bar{X}_2} = \sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

We need estimates for $\mu_1 - \mu_2$ and $\sigma^2_{\bar{X}_1 - \bar{X}_2}$. Clearly $\bar{X}_1 - \bar{X}_2$ is the estimate for $\mu_1 - \mu_2$. Once we have an estimate for $\sigma_{\bar{X}_1 - \bar{X}_2}$, we can use a similar method as in the 1-sample case to get a confidence interval.

1. Unequal variances: if $\sigma_1^2 \neq \sigma_2^2$, use

$$SE^2_{\bar{X}_1 - \bar{X}_2} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$

2. Equal variances: if $\sigma_1^2 = \sigma_2^2 = \sigma^2$ is unknown but assumed to be equal, we can use a pooled estimate of the variance:

$$s^2_{pooled} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

i.e., an average with weights equal to the respective degrees of freedom. Then our estimate of $\sigma_{\bar{X}_1 - \bar{X}_2}$ is given by

$$SE^2_{pooled} = s^2_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

This is a good method if the two SDs are close; but if the sample sizes are also moderate to large, there won't be much difference from the unequal-variances method (below). If the two SDs are different, it is better to use the unequal-variances method. (We will use this pooled estimate again when we study Analysis of Variance.)
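As a small sketch (the helper names se_unpooled and se_pooled are ours, not from the notes), both standard errors are easy to compute for the energy data:

se_unpooled <- function(x, y) sqrt(var(x)/length(x) + var(y)/length(y))
se_pooled <- function(x, y) {
  n1 <- length(x); n2 <- length(y)
  s2p <- ((n1-1)*var(x) + (n2-1)*var(y)) / (n1 + n2 - 2)  # pooled variance
  sqrt(s2p * (1/n1 + 1/n2))
}
se_unpooled(obese, lean)  # SE for the unequal-variance (Welch) approach
se_pooled(obese, lean)    # SE under the equal-variance assumption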

As above, we need the distribution of T. If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$, then

$$T = \frac{\bar{X}_1 - \bar{X}_2 - \mu_{\bar{X}_1 - \bar{X}_2}}{SE \text{ of } \bar{X}_1 - \bar{X}_2}$$

Equal variances: If we have equal variances in the two populations, then SE of $\bar{X}_1 - \bar{X}_2$ is $SE_{pooled}$ and $T \sim t_\nu$ with $\nu = n_1 + n_2 - 2$.

Unequal variances: Then SE of $\bar{X}_1 - \bar{X}_2$ is $SE_{\bar{X}_1 - \bar{X}_2}$ and T is approximately distributed as $t_\nu$. We use one of two values for $\nu$:

1. $\nu = \min(n_1 - 1, n_2 - 1)$

2. $$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$

This is known as Welch's formula, which gives fractional degrees of freedom. It is the more accurate formula (generally used by packages, and only on computers!). You can use either approximation, but say which! Note that one generally cannot go too far wrong, since one can show by algebra that

$$\min(n_1 - 1, n_2 - 1) \le \nu \le n_1 + n_2 - 2$$

Summary: Two-sample confidence intervals for $\mu_1 - \mu_2$ at the $100(1-\alpha)\%$ level are E ± M, with $E = \bar{X}_1 - \bar{X}_2$ and $M = (z_{\alpha/2} \text{ or } t_{\alpha/2,\nu}) \times (\text{appropriate SE})$:

σ known:           $M = z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
large sample:      $M = z_{\alpha/2}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
unknown, unequal:  $M = t_{\alpha/2,\nu}\sqrt{s_1^2/n_1 + s_2^2/n_2}$, with $\nu = \min(n_1-1, n_2-1)$ or Welch's $\nu$
unknown, equal:    $M = t_{\alpha/2,\nu}\, s_{pooled}\sqrt{1/n_1 + 1/n_2}$, with $\nu = n_1 + n_2 - 2$

where $z_{\alpha/2}$ and $t_{\alpha/2,\nu}$ are the same notation as in the one-sample case.

In the energy data above, we can construct a 95% confidence interval for the difference in the true means between obese and lean: $n_1 = 9$, $n_2 = 13$ and $\bar{X}_1 - \bar{X}_2 = 2.23$. We'll use the conservative estimate $\nu = \min(9-1, 13-1) = 8$, and $SE_{\bar{X}_1 - \bar{X}_2} = 0.58$. So our $M = 2.31 \times 0.58 = 1.33$. Then a (conservative) 95% confidence interval is [0.90, 3.57]. Computer output for Welch's formula gives [1.00, 3.46].

> mean(obese)-mean(lean)
[1] 2.231624
> qt(.975,df=8)
[1] 2.306004
> sqrt(var(obese)/length(obese)+var(lean)/length(lean))
[1] 0.5788152
> t.test(obese,lean, conf.level=.95)

        Welch Two Sample t-test

data:  obese and lean
t = 3.8555, df = 15.919, p-value = 0.001411
alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:
 1.004081 3.459167
sample estimates:
mean of x mean of y
10.297778  8.066154

Hypothesis Tests

We will generally have some hypotheses about certain parameters of the population (or populations) from which our data arose, and we will be interested in using our data to see whether these hypotheses are consistent with what we have observed. We have already calculated confidence intervals for such parameters; now we will conduct hypothesis tests about the population parameters of interest. Both statistical procedures are built on the idea that if some theory about the population parameters is true, the observed data should follow admittedly random, but generally predictable, patterns. Thus, if the data do not fall within the likely outcomes under our supposed ideas about the population, we will tend to disbelieve these ideas, as the data do not strongly support them.

We will initially be interested in using our data to make inferences about µ, the population mean. To do this, we will use our estimate of location from the data, namely the sample mean (average), since it is mathematically nicer than the median. We will do this in the framework of several different data structures, starting with the most basic, the one-sample situation.

How can we decide if a given set of data, and in particular its sample mean, is close enough to a hypothesized value for µ for us to believe that the data are consistent with this value? In order to answer such a question, we need to know how a statistic like the sample average behaves, i.e., its distribution. We have already studied the distribution of the sample average and the sample proportion: when the sample size is large enough, they follow Normal distributions, centered at the expected value and with a spread of the order of the relevant SE.

INFERENCE FOR A SINGLE SAMPLE: Z-DISTRIBUTION

Standard Error of the Sample Mean (σ known)

Example: Testing whether the birthweights of the Secher babies have above-average mean. The standard deviation of the original population is known: σ = 700. We would like to test whether µ = 2500, versus the alternative µ > 2500. We have a sample of n = 107 observations; mean(bwt) gives $\bar{X} = 2739$, and we would like to use this data to test µ > 2500. With a sample of size 107, we know that $\bar{X}$ will be normal with variance

$$\sigma^2_{\bar{X}} = \frac{\sigma^2}{n} = \frac{700^2}{107} = \frac{490000}{107}$$

If it is true that µ = 2500 (this is called the null hypothesis), then under the central limit theorem,

$$\bar{X} \sim N(2500,\ 490000/107) = N(2500,\ 67.7^2)$$

and under the null hypothesis

$$P(\bar{X} \ge 2739) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \ge \frac{2739 - 2500}{67.7}\right) = P(Z \ge 3.53)$$

What is the probability that a standard normal Z score is as big as 3.53?

$$P(Z > 3.53) = 1 - P(Z \le 3.53) = 1 - \Phi(3.53) = 0.000207$$

using the R command pnorm(3.53), which returns
[1] 0.9997922
This probability is indeed very small: too small for the null hypothesis to be plausible. We reject the null hypothesis.

Let $X_1, \ldots, X_n$ be a sample of n i.i.d. random variables from a distribution having unknown mean µ and known standard deviation σ. Assume n is large, say n > 30. Suppose interest centers on testing the hypothesis $H_0: \mu = \mu_0$, where $\mu_0$ is some fixed, pre-specified value. This will be our null hypothesis; notice that it is a simple one, i.e., it postulates a single hypothesized value for µ. The hypothesis against which the null hypothesis is to be compared, the alternative hypothesis, can take one of three basic forms:

1. $H_A: \mu \neq \mu_0$
2. $H_A: \mu > \mu_0$
3. $H_A: \mu < \mu_0$

The idea, as we have said, is to assess whether the data support the null hypothesis ($H_0$) or whether they suggest the relevant alternative ($H_A$). To begin, we assert that the null hypothesis is true (i.e., that the true value of µ is actually $\mu_0$). Under this assumption, the Central Limit Theorem implies that the test statistic

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

has a standard normal (N(0,1)) distribution (notice that the test statistic is just the standardized version of $\bar{X}$ under the assumption that the true mean is actually equal to $\mu_0$). The usual convention applies: if σ is unknown and n is large, the sample standard deviation, s, is used in place of σ in forming the test statistic.

The null hypothesis is supported if the observed value of the test statistic is small (i.e., $\bar{X}$ is close enough to $\mu_0$, the hypothesized value, that I would believe the true mean is $\mu_0$). On the other hand, if I observe a large value of the test statistic, this suggests that $\bar{X}$ is far from $\mu_0$, which tends to discredit the null hypothesis in favor of the alternative hypothesis $H_A: \mu \neq \mu_0$.

The real issue is: how large is large? (Or how small is small?) For example, if I observe a Z value of 1, say, can we conclude in favor of $H_0$ over $H_A$, or should we prefer $H_A$ over $H_0$? What about a Z value of 2? The answer to these questions lies in considering what the test statistic actually measures. In words, the observed value of Z is just the number of standard errors the observed sample mean is from the hypothesized population mean; i.e.,

$$Z_{obs} = \text{number of standard errors } \bar{X} \text{ is away from } \mu_0$$

What counts as extreme is determined by how rare a "rare event" should be to make us think something other than $H_0$ is going on. This determines what we call the significance level α; most often α is taken to be 5%, sometimes 10%, and sometimes even 0.1% (1/1000). We compute the P-value, which is the probability of observing a value as extreme as the one observed. The P-value computation takes either $P(|Z| > |Z_{obs}|)$, $P(Z > Z_{obs})$, or $P(Z < Z_{obs})$, depending on what the alternative $H_A$ was.
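To close, here is a minimal sketch of this one-sample z-test in R, applied to the Secher birthweight example (the helper name z_test is ours; base R has no built-in z-test):

# Sketch of a one-sample z-test with sigma known; helper name illustrative.
z_test <- function(xbar, mu0, sigma, n,
                   alternative = c("greater", "less", "two.sided")) {
  z <- (xbar - mu0) / (sigma / sqrt(n))   # number of SEs from mu0
  p <- switch(match.arg(alternative),
              greater   = 1 - pnorm(z),
              less      = pnorm(z),
              two.sided = 2 * (1 - pnorm(abs(z))))
  c(z = z, p.value = p)
}
z_test(2739, 2500, 700, 107, "greater")   # z = 3.53, p = 0.0002, as above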