UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences


Sample Size Calculations & Confidence Intervals for Proportions

Instructor: Ivo Dinov, Asst. Prof. of Statistics and Neurology
Teaching Assistants: Brandi Shanata & Tiffany Head
University of California, Los Angeles, Fall 2007
http://www.stat.ucla.edu/~dinov/courses_students.html

Planning a Study to Estimate μ

Before you begin collecting data, it is important to consider whether the estimates will be sufficiently precise. Two factors to consider: the population variability of Y, and the sample size n.

First: In certain situations the variability of Y should not be controlled for (for example, the response to treatment in a medical study). However, in most studies it is important to reduce the variability of Y by holding extraneous conditions as constant as possible. For example, a study of breast cancer might want to examine only women.

Second: Once the experiment is planned to reduce the variability of Y as much as possible, we consider the sample size. For example, how many women should we sample to achieve the desired precision for our estimate?

RECALL: $SE_{\bar{y}} = \dfrac{s}{\sqrt{n}}$

To decide on a proper value of n, we must specify what value of SE is desirable and have a guess of s. For SE, we need to ask what value we would tolerate. For s, we could use information from a pilot study or previous research.

$$\text{Desired } SE = \frac{\text{Guessed } s}{\sqrt{n}}$$

Planning a Study to Estimate μ

Example: Reindeer (cont'd). $\bar{y} = 54.78$, $s = 8.83$, $SE = 0.874$.

Suppose we would like to estimate the sample size necessary for next year's round-up to keep SE < 0.6:

$$\frac{8.83}{\sqrt{n}} \le 0.60 \;\Rightarrow\; \sqrt{n} \ge 14.72 \;\Rightarrow\; n \ge 216.58 \;\Rightarrow\; 217 \text{ reindeer}$$

We can't have 0.6 of a reindeer, so we round up to 217 reindeer (ALWAYS round up on sample size calculations).

What happens to n as the desired precision gets smaller?

Example: Reindeer (cont'd). Suppose we would like to estimate the sample size necessary for next year's round-up to keep SE < 0.3:

$$\frac{8.83}{\sqrt{n}} \le 0.30 \;\Rightarrow\; \sqrt{n} \ge 29.43 \;\Rightarrow\; n \ge 866.3 \;\Rightarrow\; 867 \text{ reindeer}$$

When we double the precision (i.e. cut SE in half), it requires 4 times as many reindeer. This is the result of the $\sqrt{n}$ in the denominator of the SE.

Decisions About SE

How do we decide what SE we will tolerate in the estimation of μ?

RECALL: $\bar{y} \pm t(df)_{0.025} \dfrac{s}{\sqrt{n}}$

The ± part is called the margin of error and is equal to $t(df)_{0.025} \times SE$ for a 95% confidence interval. If we scan the 0.025 (or 95%) column of the t table, the t multipliers are roughly equal to 2.

So, for example, maybe we reason that we want our estimate to be within ± 1.2 of μ with 95% confidence. Using the logic above and thinking of the span of the CI, suppose a total span of 2.4, or ± 1.2, is desired. Then SE would need to be < 0.60:

$$t(df)_{0.025} \times SE \approx 2 \times SE = 1.2 \;\Rightarrow\; SE = 0.6$$

Conditions for Validity of Estimation Methods

We have to be careful when making estimations. Computers make the calculations easy, but the interpretations are valid only under certain conditions.

Conditions of validity of the SE formula

For $\bar{y}$ to be an estimate of μ, we must have sampled randomly from the population. If not, the inference is questionable/biased.

The validity of the SE also requires:

The population is large when compared to the sample size. It is rare that this is a problem: the sample size can be as much as 5% of the population without seriously inflating the SE.

Observations must be independent of each other. We want the n observations to give n independent pieces of information about the population.
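The reindeer arithmetic above can be sketched in a few lines of Python. This is only an illustration under the slide's setup: the function names `required_n_for_mean` and `se_from_margin` are made up for this sketch, and the multiplier of 2 is the rough 95% t multiplier quoted above.

```python
import math


def required_n_for_mean(guessed_s: float, desired_se: float) -> int:
    """Smallest n with guessed_s / sqrt(n) <= desired_se (always round up)."""
    return math.ceil((guessed_s / desired_se) ** 2)


def se_from_margin(margin: float, multiplier: float = 2.0) -> float:
    """Convert a desired margin of error into a desired SE.

    The default multiplier of 2 is the rough t multiplier for 95% confidence.
    """
    return margin / multiplier


# Reindeer example: guessed s = 8.83
n_06 = required_n_for_mean(8.83, 0.60)
n_03 = required_n_for_mean(8.83, 0.30)
print(n_06, n_03)  # 217 867

# A margin of +/- 1.2 at roughly 95% confidence corresponds to SE = 0.6
print(required_n_for_mean(8.83, se_from_margin(1.2)))
```

Halving the desired SE roughly quadruples n because n grows with the square of s / SE.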

Conditions of validity of the SE formula

Definition: A hierarchical structure exists when observations are nested within the sampling units. This is a common problem in the sciences.

Example: Measure the pulse of 10 patients 3 times each. We don't have 30 pieces of independent information. One possible naïve solution: we could use each person's average.

Conditions of validity of a CI for μ

Data must be from a random sample, and observations must be independent of each other. If the data are biased, the sampling distribution concepts on which the CI method is based do not hold: knowing the average of a biased sample does not provide information about μ.

We also need to consider the shape of the data for Student's t distribution:

If Y is normally distributed, then Student's t is exactly valid.
If Y is approximately normal, then Student's t is approximately valid.
If Y is not normal, then Student's t is approximately valid only if n is large (CLT).

How large? It really depends on the severity of the non-normality; however, our rule of thumb is n > 30. The text has a nice summary of these conditions.

NOTE: If the sampling distribution cannot be considered normal, Student's t will not hold.

Verification of Conditions

In practice these conditions are often assumptions, but it is important to check that they are reasonable.

Scrutinize the study design for: random sampling, possible bias, and non-independent observations.

Is the population normal? Consider previous experience with other similar data, and look at a histogram or normal probability plot. If normality is doubtful, increase the sample size, or try a transformation and analyze on the transformed scale.

CI for a Population Proportion

So far we have discussed a confidence interval using quantitative data. There is also a CI for a dichotomous categorical variable, when the parameter of interest is a population proportion.

$\hat{p}$ is the sample proportion; p is the population proportion.

When the sample size is large, the sampling distribution of $\hat{p}$ is approximately normal (this is related to the CLT). When the sample size is small, the normal approximation may be inadequate. To accommodate this we will modify $\hat{p}$ slightly.
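The large-sample claim about the sample proportion can be checked informally by simulation. The sketch below is not from the slides: the values p = 0.3, n = 200, the number of repetitions, and the helper name `simulate_p_hat` are all assumptions chosen for illustration. It compares the simulated sample proportions with what a normal approximation centered at p with standard deviation sqrt(p(1 - p)/n) would predict.

```python
import math
import random
from statistics import mean, stdev


def simulate_p_hat(p: float, n: int, reps: int, seed: int = 0) -> list:
    """Draw `reps` samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


# Hypothetical population proportion and sample size (assumptions)
p, n = 0.3, 200
p_hats = simulate_p_hat(p, n, reps=2000)

theoretical_sd = math.sqrt(p * (1 - p) / n)
# Share of simulated proportions within 1.96 SDs of p; near 0.95 if normal-like
within = mean(abs(ph - p) <= 1.96 * theoretical_sd for ph in p_hats)

print(f"mean of p-hat: {mean(p_hats):.4f} (p = {p})")
print(f"SD of p-hat:   {stdev(p_hats):.4f} (theory {theoretical_sd:.4f})")
print(f"share within 1.96 SD: {within:.3f}")
```

Rerunning with a small n (say 10) and a p close to 0 or 1 shows where the approximation starts to break down, which is the motivation for adjusting the estimate.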

CI for a Population Proportion

The adjustment we are going to make to $\hat{p}$ is to use $\tilde{p}$ instead:

$$\tilde{p} = \frac{y + 0.5\, z_{\alpha/2}^2}{n + z_{\alpha/2}^2}$$

Relax and remember that the formula for $\hat{p}$ was $\hat{p} = \dfrac{y}{n}$.

So what is the $z_{\alpha/2}$ bit?

[Figure: standard normal curve with central area 0.95 and area 0.025 in each tail, cut points at $-z_{0.025}$ and $z_{0.025}$]

RECALL: In chapter 4, $z_{\alpha}$ was the cut point of the upper part of the standard normal distribution for a given α. Now we want $z_{\alpha/2}$ because we are calculating a confidence interval and need to account for both sides of the distribution. So in the distribution above α would be 0.05, which corresponds to a 95% confidence interval.

The standard error of $\tilde{p}$ also needs a slight modification:

$$SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + z_{\alpha/2}^2}}$$

A sample value $\tilde{p}$ is typically within $\pm SE_{\tilde{p}}$ of p.

Before we define the formula for a CI for p, let's remember the formula for a CI(μ).

RECALL: $\bar{y} \pm t(df)_{\alpha/2} \dfrac{s}{\sqrt{n}}$, where 100(1 - α) is the desired confidence.

If we pick this apart, we are really saying that a CI(μ) is: the estimate of μ ± (an appropriate multiplier) × (SE).

Incorporate that logic and we get:

$$\tilde{p} \pm z_{\alpha/2}\,(SE_{\tilde{p}})$$

where 100(1 - α) is the desired confidence. This time we will use a z multiplier instead of a t multiplier.

Application to Data

Example: Suppose a researcher is interested in studying the effect of aspirin in reducing heart attacks. He randomly recruits 500 subjects with evidence of early heart disease and has them take one aspirin daily for two years. At the end of the two years he finds that during the study only 17 subjects had a heart attack. Calculate a 95% confidence interval for the true proportion of subjects with early heart disease that have a heart attack while taking aspirin daily.
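The z multiplier described above is usually read from a table, but it can also be computed. A minimal sketch, assuming Python 3.8+ for the standard library's `statistics.NormalDist`; the function name `z_multiplier` is hypothetical and not part of the slides.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Two-sided cut point z_{alpha/2} of the standard normal distribution."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in the upper tail, so ask for the (1 - alpha/2) quantile
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_multiplier(level):.3f}")
```

For 95% confidence this gives approximately 1.96, the value a z table shows for an upper-tail area of 0.025.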

Application to Data

Example: Heart Attacks (cont'd)

First, we need to find $z_{\alpha/2}$. Because this is a 95% CI, α will be 0.05 and $z_{\alpha/2}$ will be $z_{0.025}$. In this case $z_{\alpha/2} = 1.96$.

[Figure: standard normal curve with central area 0.95 and area 0.025 in each tail, cut points at $-z_{0.025}$ and $z_{0.025}$]

Next, solve for $\tilde{p}$:

$$\tilde{p} = \frac{y + 0.5\, z_{0.025}^2}{n + z_{0.025}^2} = \frac{y + 0.5(1.96^2)}{n + 1.96^2} = \frac{y + 1.92}{n + 3.84}$$

The Text rounds this to $\dfrac{y + 2}{n + 4}$.

That's just the formula for $\tilde{p}$; now we actually have to find $\tilde{p}$:

$$\tilde{p} = \frac{17 + 1.92}{500 + 3.84} = 0.038$$

Next, solve for $SE_{\tilde{p}}$:

$$SE_{\tilde{p}} = \sqrt{\frac{(0.038)(0.962)}{500 + 3.84}} = 0.0085$$

Finally, the 95% CI for p:

$$\tilde{p} \pm z_{0.025}(SE_{\tilde{p}}) = 0.038 \pm 1.96(0.0085) = 0.038 \pm 0.0167 = (0.0213,\ 0.0547)$$

What is our interpretation of this interval?

CONCLUSION: We are highly confident, at the 0.05 level (95% confidence), that the true proportion of subjects with early heart disease who have a heart attack after taking aspirin daily is between 0.0213 and 0.0547.

Is this meaningful?

Practice

Calculate $\tilde{p}$ and $SE_{\tilde{p}}$ for a 99% confidence interval. Here $z_{0.005}$ is 2.58.

[Figure: standard normal curve with central area 0.99 and area 0.005 in each tail, cut points at $-z_{0.005}$ and $z_{0.005}$]

$$\tilde{p} = \frac{y + 0.5\, z_{0.005}^2}{n + z_{0.005}^2} = \frac{y + 0.5(2.58^2)}{n + 2.58^2} = \frac{y + 3.33}{n + 6.66}$$

$$SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + z_{0.005}^2}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 2.58^2}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 6.66}}$$

This is a lot of work! Consider the following shortcuts.

The value of $z_{\alpha/2}$ can be carried through all three formulas:

$$\tilde{p} = \frac{y + 0.5\, z_{\alpha/2}^2}{n + z_{\alpha/2}^2}, \qquad SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + z_{\alpha/2}^2}}, \qquad \tilde{p} \pm z_{\alpha/2}(SE_{\tilde{p}})$$

Just don't forget to square it in $\tilde{p}$ and $SE_{\tilde{p}}$.

RECALL: The t distribution approaches a z distribution as df → ∞. This means that at the bottom of the t table there are several t multipliers that can be substituted for z (use the df = ∞ row).

CAUTION: this will only work for certain levels of α. If the level is not found on the t table, you must go back and solve with the z table!
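The three formulas can be carried through in one small function. The sketch below is an illustration, not the course's own code: the name `adjusted_proportion_ci` is hypothetical, and it uses the unrounded z from `statistics.NormalDist` (Python 3.8+) rather than the table value 1.96.

```python
import math
from statistics import NormalDist


def adjusted_proportion_ci(y: int, n: int, confidence: float = 0.95):
    """Return (p_tilde, se, (low, high)) for y successes in n trials."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    p_tilde = (y + 0.5 * z**2) / (n + z**2)             # adjusted estimate
    se = math.sqrt(p_tilde * (1 - p_tilde) / (n + z**2))
    return p_tilde, se, (p_tilde - z * se, p_tilde + z * se)


# Heart attack example: 17 of 500 subjects
p_tilde, se, (low, high) = adjusted_proportion_ci(17, 500, confidence=0.95)
print(f"p~ = {p_tilde:.4f}, SE = {se:.4f}, 95% CI = ({low:.4f}, {high:.4f})")

# Practice slide: the same data at 99% confidence
_, _, (low99, high99) = adjusted_proportion_ci(17, 500, confidence=0.99)
print(f"99% CI = ({low99:.4f}, {high99:.4f})")
```

Without intermediate rounding the 95% interval comes out to roughly (0.021, 0.054). The hand calculation differs slightly in the third decimal place because it rounds the adjusted proportion to 0.038 before computing the margin.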

Planning a Study to Estimate p

We talked about finding the sample size necessary to ensure a desired SE for quantitative data. This method depended on:

$$\text{Desired } SE = \frac{\text{Guessed } s}{\sqrt{n}}$$

For proportions we use a similar idea:

$$\text{Desired } SE = \sqrt{\frac{(\text{Guessed } \tilde{p})(1 - \text{Guessed } \tilde{p})}{n + z_{\alpha/2}^2}}$$

where a guess for $\tilde{p}$ can be based on previous research or made in ignorance.

Example: Heart Attacks (cont'd)

How many subjects are needed if researchers want SE < 0.005 for a 95% CI, and have a guess based on previous research that $\tilde{p}$ would be 0.04?

$$\sqrt{\frac{(0.04)(0.96)}{n + 1.96^2}} \le 0.005 \;\Rightarrow\; n + 3.84 \ge \frac{(0.04)(0.96)}{0.005^2} = 1536 \;\Rightarrow\; n \ge 1536 - 3.84 = 1532.16 \;\Rightarrow\; 1533 \text{ subjects}$$

Example 6.1

StatisticalBarChartDemo: http://socr.ucla.edu/htmls/socr_charts.html

6.1. Six healthy three-year-old female Suffolk sheep were injected with the antibiotic Gentamicin, at a dosage of 10 mg/kg body weight. Their blood serum concentrations (µg/ml) of Gentamicin 1.5 hours after injection were as follows: 33, 26, 34, 31, 23, 25. For these data, the mean is 28.7 and the standard deviation is 4.6.

(a) Construct a 95% confidence interval for the population mean μ.

There are five degrees of freedom, so the interval is 28.7 ± 2.571 × 4.6/sqrt(6), or (23.9, 33.5). In more detail: $\bar{y} = 28.7$; $s = 4.5898$; $SE = 4.5898/\sqrt{6} = 1.8738$, or approximately 1.9 µg/ml. Then 28.7 ± (2.571)(1.8738) = (23.9, 33.5), or 23.9 < μ < 33.5.

(b) Define in words the population mean.

The population mean μ is the mean blood serum concentration (in µg/ml) of Gentamicin 1.5 hours after injection at a dosage of 10 mg/kg body weight in healthy three-year-old female Suffolk sheep. The value of μ is unknown; however, it does exist.

(c) The 95% confidence interval for μ contains nearly all the observations. Will this be generally true?

No. The fact that, in this case, the 95% confidence interval for μ contains nearly all the observations is mainly due to the small sample size. For much larger samples, confidence in the location of μ is much more concentrated and the interval will be much tighter.
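Both calculations on these last slides are short enough to sketch in code. The function names below are hypothetical, z = 1.96 is the table value for 95% confidence, and the t multiplier 2.571 (5 degrees of freedom) is passed in by hand because Python's standard library has no t quantile function.

```python
import math
from statistics import mean, stdev


def required_n_for_proportion(guessed_p: float, desired_se: float,
                              z: float = 1.96) -> int:
    """Smallest n with sqrt(p(1-p) / (n + z^2)) <= desired_se (round up)."""
    return math.ceil(guessed_p * (1 - guessed_p) / desired_se**2 - z**2)


def t_interval(data, t_multiplier: float):
    """CI for the mean: y_bar +/- t * s / sqrt(n), with t taken from a table."""
    y_bar = mean(data)
    se = stdev(data) / math.sqrt(len(data))
    return y_bar - t_multiplier * se, y_bar + t_multiplier * se


# Heart attack planning example: guessed p~ = 0.04, desired SE = 0.005
# n >= 1536 - 3.84, about 1532.2, which rounds up to 1533
print(required_n_for_proportion(0.04, 0.005))

# Example 6.1: Gentamicin concentrations (ug/ml) in six sheep, t(5) = 2.571
sheep = [33, 26, 34, 31, 23, 25]
low, high = t_interval(sheep, t_multiplier=2.571)
print(f"95% CI for mu: ({low:.1f}, {high:.1f})")
```

With only six observations the SE is large relative to the spread of the data, which is why the interval covers most of the individual values.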