We have also learned that, thanks to the Central Limit Theorem and the Law of Large Numbers,

Similar documents
A string of not-so-obvious statements about correlation in the data. (This refers to the mechanical calculation of correlation in the data.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Data Analysis and Statistical Methods Statistics 651

Contents Two Sample t Tests Two Sample t Tests

Hypothesis tests and confidence intervals

Exam II Covers. STA 291 Lecture 19. Exam II Next Tuesday 5-7pm Memorial Hall (Same place as exam I) Makeup Exam 7:15pm 9:15pm Location CB 234

ECE 901 Lecture 4: Estimation of Lipschitz smooth functions

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

2 1. The r.s., of size n2, from population 2 will be. 2 and 2. 2) The two populations are independent. This implies that all of the n1 n2

Bertrand s postulate Chapter 2

Lecture 5: Parametric Hypothesis Testing: Comparing Means. GENOME 560, Spring 2016 Doug Fowler, GS

Define a Markov chain on {1,..., 6} with transition probability matrix P =

October 25, 2018 BIM 105 Probability and Statistics for Biomedical Engineers 1

APPLIED MULTIVARIATE ANALYSIS

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Lecture 19. Curve fitting I. 1 Introduction. 2 Fitting a constant to measured data

Module 1 Fundamentals in statistics

Sample size calculations. $ available $ per sample

Chapter 23: Inferences About Means

Two sample test (def 8.1) vs one sample test : Hypotesis testing: Two samples (Chapter 8) Example 8.2. Matched pairs (Example 8.6)

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

Sampling Error. Chapter 6 Student Lecture Notes 6-1. Business Statistics: A Decision-Making Approach, 6e. Chapter Goals

Expectation and Variance of a random variable

Statistics 511 Additional Materials

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

ST 305: Exam 3 ( ) = P(A)P(B A) ( ) = P(A) + P(B) ( ) = 1 P( A) ( ) = P(A) P(B) σ X 2 = σ a+bx. σ ˆp. σ X +Y. σ X Y. σ X. σ Y. σ n.

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

Topic 9: Sampling Distributions of Estimators

Integrals of Functions of Several Variables

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

Statistics for Applications Fall Problem Set 7

Common Large/Small Sample Tests 1/55

Chapter 5: Hypothesis testing

MATH/STAT 352: Lecture 15

Agreement of CI and HT. Lecture 13 - Tests of Proportions. Example - Waiting Times

Math 140 Introductory Statistics

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

Chapter 8: Estimating with Confidence

Agenda: Recap. Lecture. Chapter 12. Homework. Chapt 12 #1, 2, 3 SAS Problems 3 & 4 by hand. Marquette University MATH 4740/MSCS 5740

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Stat 421-SP2012 Interval Estimation Section

Summer MA Lesson 13 Section 1.6, Section 1.7 (part 1)

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Stat 200 -Testing Summary Page 1

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

Lecture 2: Monte Carlo Simulation

Properties and Hypothesis Testing

Comparing your lab results with the others by one-way ANOVA

Introduction to Econometrics (3 rd Updated Edition) Solutions to Odd- Numbered End- of- Chapter Exercises: Chapter 3

AN EFFICIENT ESTIMATION METHOD FOR THE PARETO DISTRIBUTION

Chapter 2. Asymptotic Notation

Chapter 18 Summary Sampling Distribution Models

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

This is an introductory course in Analysis of Variance and Design of Experiments.

Perturbation Theory, Zeeman Effect, Stark Effect

MBACATÓLICA. Quantitative Methods. Faculdade de Ciências Económicas e Empresariais UNIVERSIDADE CATÓLICA PORTUGUESA 9. SAMPLING DISTRIBUTIONS

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Interval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),

Output Analysis and Run-Length Control

Chapter 20. Comparing Two Proportions. BPS - 5th Ed. Chapter 20 1

Statistics 20: Final Exam Solutions Summer Session 2007

Topic 9: Sampling Distributions of Estimators

M1 for method for S xy. M1 for method for at least one of S xx or S yy. A1 for at least one of S xy, S xx, S yy correct. M1 for structure of r

Data Analysis and Statistical Methods Statistics 651

Lecture 8: Non-parametric Comparison of Location. GENOME 560, Spring 2016 Doug Fowler, GS

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

Confidence Interval for Standard Deviation of Normal Distribution with Known Coefficients of Variation

STAC51: Categorical data Analysis

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

HYPOTHESIS TESTS FOR ONE POPULATION MEAN WORKSHEET MTH 1210, FALL 2018

Topic 9: Sampling Distributions of Estimators

Output Analysis (2, Chapters 10 &11 Law)

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis

Read through these prior to coming to the test and follow them when you take your test.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

April 18, 2017 CONFIDENCE INTERVALS AND HYPOTHESIS TESTING, UNDERGRADUATE MATH 526 STYLE

BIOSTATISTICS. Lecture 5 Interval Estimations for Mean and Proportion. dr. Petr Nazarov

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

MA238 Assignment 4 Solutions (part a)

Question 1: Exercise 8.2

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

Birth-Death Processes. Outline. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Relationship Among Stochastic Processes.

Introductory statistics

STAT 350 Handout 19 Sampling Distribution, Central Limit Theorem (6.6)

Lecture 7: Non-parametric Comparison of Location. GENOME 560 Doug Fowler, GS

Chapter 6 Sampling Distributions

(7 One- and Two-Sample Estimation Problem )

Confidence Level We want to estimate the true mean of a random variable X economically and with confidence.

ESTIMATION OF MOMENT PARAMETER IN ELLIPTICAL DISTRIBUTIONS

BIOSTATS 640 Intermediate Biostatistics Frequently Asked Questions Topic 1 FAQ 1 Review of BIOSTATS 540 Introductory Biostatistics

Transcription:

Cofidece Itervals III What we kow so far: We have see how to set cofidece itervals for the ea, or expected value, of a oral probability distributio, both whe the variace is kow (usig the stadard oral, or Z, table), ad whe it is ot (usig "tudet's t" table). The two itervals are Z / ad t / We have also leared that, thaks to the Cetral Liit Theore ad the Law of Large Nubers, Z / is a approxiate cofidece iterval for the expected value, E[], whe the saple size () is large, eve if the observatios are coig fro a distributio other tha the oral. Authors: Blue, Greevy Bios 3 Lecture Notes Page of 9

Cofidece Itervals III Most cooly, the secod iterval (with the coefficiet fro the t table, ( t / ) is used oly whe the distributio of the idividual observatios is believed to be early oral. Otherwise the third iterval is used ad is called a large saple cofidece iterval or a approxiate cofidece iterval to ephasize that the coverage probability is oly approxiate. Thus, Z / is a approxiate (-)00% CI. This all works because E[ ] approx ~ N (0,) for `large. Authors: Blue, Greevy Bios 3 Lecture Notes Page of 9

Cofidece Itervals III Cofidece Itervals for the differece betwee two eas I order to lear about how oral cotraceptive use affects blood pressure, we fid soe woe who use oral cotraceptives ad soe who do't, observe their systolic blood pressures, ad see what we see. We coceptualize the two groups as differet populatios, fro which we draw a saple: Group : (OC users),, are i.i.d. with E[]= x ad Var[]= A saple of =8 yields x 3. 86 ad s 5. 34 x Group : (o-users),, are i.i.d. with E[]= ad Var[]= A saple of = yields y 7. 44 ad s 8. 3 y Authors: Blue, Greevy Bios 3 Lecture Notes Page 3 of 9

Cofidece Itervals III To aalyze these observatios we ight use a probability odel that says that blood pressures are orally distributed ad we ight ot. We ll see later why this becoes iportat. The quatity we are tryig to estiate is E[]-E[]= x - ad our estiator is siply. Because ad are rado variables, so is. Hece we ca stadardize it like so: Z E Var Now we kow that: E E E ) ad ) Var Var Var Var Var Authors: Blue, Greevy Bios 3 Lecture Notes Page 4 of 9

Cofidece Itervals III Ad because the CLT ad LLN work o averages of rado variables we have that Z approx ~ N 0, as gets large * Fially because P( -Z / < Z < Z / )=-, A approxiate* (-)00% cofidece iterval for is give by Z / (*Which is exact if the uderlyig distributios of the s ad s are both oral.) Authors: Blue, Greevy Bios 3 Lecture Notes Page 5 of 9

Cofidece Itervals III But the variace is ukow. If we estiate with proble because:, we ru ito a T ~??? The distributio of T is ukow -- is ot oral or studets-t. We have o way of calculatig a exact or eve approxiate cofidece iterval. (Why?) Fortuately, we ca always use the CLT to save us i large saples because: Z approx ~ N 0, as getslarge Authors: Blue, Greevy Bios 3 Lecture Notes Page 6 of 9

Cofidece Itervals III Now, i large saples, a approxiate (-)00% cofidece iterval for give by Z / For our case this yields the followig 95% CI : 3.86 7.44.96 5.34 8 8.3 or 5.4 3.8 7.76,8.60 At the 95% level the data suggest that the differece betwee the two populatios eas is at least -7.76 but o ore tha 8.60. Our observatios are evidece that OC use causes a icrease i blood pressure. ( Our best estiate is that it icreases ea blood pressure by 5.4.) But the evidece is ot very strog (because = 0 eas that there is o icrease, ad the 95% CI icludes this value). Authors: Blue, Greevy Bios 3 Lecture Notes Page 7 of 9

Cofidece Itervals III Is our saple size large eough to ivoke the CLT? Here we have =8 ad =, so probably ot. o what ca we do for sall saples? Uless we are willig to ake soe additioal assuptios the othig ore ca be doe. o, if we assue that the uderlyig distributios of ad are approxiately oral (syetrical ad ot too skewed) the whe the saple size is sall (either or or both) the there are several available ethods. The price we pay is that our procedure is o loger `robust. This is because all of our future calculatios will deped o the fact that s ad s are orally distributed ad if i fact they have soe other distributio our calculatios will be wrog. (This is why the large saple iterval discussed earlier is used so ofte, ad also why us statisticias bug you doctors about gettig a large saple size.) Authors: Blue, Greevy Bios 3 Lecture Notes Page 8 of 9

Cofidece Itervals III The geeral proble is to coe up with a estiate of Var, call it Vˆ, so that Z Vˆ ~ Q where the distributio Q is kow. However there is ore tha oe "atural" way to estiate the variace of. Ufortuately, oly oe of these ways leads to a tidy solutio (aother tudet's t iterval), ad it is ofte iappropriate. Authors: Blue, Greevy Bios 3 Lecture Notes Page 9 of 9

Cofidece Itervals III Ukow Variace Method (The Case of Equal Variaces) Assuig that both the s ad s are orally distributed, there is a eat, exact solutio oly for the case whe the variaces, although ukow, are assued to be equal ( = = ). I this case Var is estiated with Vˆ p where p is the `pooled variace estiate. It is a weighted average of the two saple variaces, ad, with the oe that is based o ore observatios gettig ore weight. p Authors: Blue, Greevy Bios 3 Lecture Notes Page 0 of 9

Cofidece Itervals III Now the stadardized differece p ~ t has exactly a t-distributio with +- degrees of freedo. Thus the (-)00% cofidece iterval for give by is t / p Assuig that the variaces i the two populatios are equal ad the uderlyig distributios are approxiately oral. However this iterval is fairly robust to oorality (That is, it cotiues to have approxiately the correct coverage probability whe the distributios are ot oral). Authors: Blue, Greevy Bios 3 Lecture Notes Page of 9

Cofidece Itervals III I our exaple of how oral cotraceptive use affects blood pressure, the pooled variace estiate is 7 (5.34 ) + 0 (8.3 ) s p = = 307.803 8 + so for a cofidece coefficiet of 0.95 we fid fro Table A., t7 =.05, ad the 95% cofidece iterval for the ea blood pressure differece betwee OC users ad o-users is 3.86 7.44.05 307.803 8 + or 5.4.05 (7.8), or 5.4 4.94. Or (-9.5, 0.36 ). However the variaces are rarely, if ever, equal. o what ca we do if we assue that the s ad s are orally distributed, but the variaces are uequal. Authors: Blue, Greevy Bios 3 Lecture Notes Page of 9

Cofidece Itervals III Ukow Variace Method (The Case of Uequal Variaces) Assuig that both the s ad s are orally distributed, ad that ( ), there is o eat solutio (it reais usolved today!) I this case we estiate with ru ito a proble because:, but T ~??? (Reeber that we are assuig that the saple sizes are sall eough that T would ot be approxiately oral, by the CLT) Authors: Blue, Greevy Bios 3 Lecture Notes Page 3 of 9

Cofidece Itervals III We ca use a approxiatio such as T approx * ~ t where t * is approx a t-dist with DF / / / / rouded dow called atterthwaite s correctio for df. (ost coputer progras do this but too uch of a pai to do by had) Alteratively there is a coservative approxiatio T approx ~ ti, That is, just use the df for the sallest populatio. Why is this coservative? Authors: Blue, Greevy Bios 3 Lecture Notes Page 4 of 9

Cofidece Itervals III Authors: Blue, Greevy Bios 3 Lecture Notes Page 5 of 9 Each approach suggests the iterval t * / Where the degree of freedo is calculated either as DF = Mi[-,-] or DF= / / / / Both itervals are approxiately correct uder the assuptio of s ad s orally distributed with uequal variaces. But either is the exact solutio.

Cofidece Itervals III Take hoe essage: Whe we have idepedet saples fro two oral distributios with ukow ad uequal variaces, we caot fid sesible exact cofidece itervals for the differece betwee the eas (i the sall saple case). Fortuately the Cetral Liit Theore ad the Law of Large Nubers still apply, ad they provide the basis for approxiate CI's whe both saple sizes, ad, are large. These two results (CLT ad LLN) ca be used to prove that Z / is a approxiate (-)00% for - (As we saw earlier). That is, the probability that this rado iterval will iclude - approaches 0.95 as the saple sizes grow. Authors: Blue, Greevy Bios 3 Lecture Notes Page 6 of 9

Cofidece Itervals III If the 's ad s are oral ad the two variaces are equal, the the tudet's t CI, t / p, is exact for all saple sizes, large or sall. If the variaces are very uequal, the coverage probability of this iterval ight ot be eve approxiately correct, eve whe the 's are orally distributed ad the saples are large. The coverage probability of this iterval ca be seriously wrog if the two variaces ad are ot equal, because the pooled variace estiate, p estiates E p + = E ( ) +( ) + + ( = ) +( ) + + Authors: Blue, Greevy Bios 3 Lecture Notes Page 7 of 9

Cofidece Itervals III Var ot the correct quatity I fact the two are the sae oly whe the variaces are equal, i.e., whe =. Iterestigly eough, the ai source of the proble with the tudet's t cofidece iterval that is caused by uequal variaces disappears whe the two saple sizes are equal (=)! I that special case, the pooled variace is estiatig the right quatity after all, because the two variace estiates are idetical: p + ( ) + ( ) = + + ( = )( + ( ) ) = +. Authors: Blue, Greevy Bios 3 Lecture Notes Page 8 of 9

Cofidece Itervals III The distributio is still ot exactly tudet's t, so the coverage probability wo't be exactly the value show i the t-table. But because the variace estiate is estiatig the right thig, the possibility of serious discrepacy betwee the table value ad the actual coverage probability of the iterval is avoided whe the two saple sizes are roughly equal. uary Iterval Whe Z / both ad are large t / p For, sall ad s & s oral; variaces equal or = i[, ] t / For, sall ad s & s oral; variaces uequal; (ca also use atterthwaite s df) Authors: Blue, Greevy Bios 3 Lecture Notes Page 9 of 9