MIT : Quantitative Reasoning and Statistical Methods for Planning I


MIT 11.220 Spring 06, Recitation 4, March 16, 2006

MIT 11.220: Quantitative Reasoning and Statistical Methods for Planning I
Recitation #4: Spring 2006
Confidence Intervals and Hypothesis Testing

I. Confidence Interval

1. Meaning of Confidence Interval: A 95% confidence interval has two parts:
a. an interval calculated from the data
b. a confidence level (such as 95%, 99%, etc.), which states the probability that the interval will contain the true population parameter value if we take repeated samples.

2. The theory behind confidence intervals: The sample mean and variance are given by:

Mean: \[ \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} \]

Variance: \[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]

The statistic \[ T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \] has a t-distribution with (n - 1) degrees of freedom. For a 95% confidence interval, \( P(-t < T < +t) = 0.95 \). Substituting, we get

\[ P\left[ \left(\bar{X} - t \cdot s/\sqrt{n}\right) < \mu < \left(\bar{X} + t \cdot s/\sqrt{n}\right) \right] = 0.95 \]

Hence, the confidence interval is given by: \( \bar{X} \pm t \cdot (\text{standard error}) \), where the standard error is \( s/\sqrt{n} \).

3. Confidence interval for population proportions:

The procedure is similar to that for population means. The confidence interval for a population proportion p is given by:

Confidence interval: \( \hat{p} \pm t \cdot \text{s.e.} \)

where \( \hat{p} \) is the sample proportion and s.e. is the standard error of the proportion. The standard error of a proportion has the same formula as the standard error of the mean:

\[ \text{s.e.} = \frac{s}{\sqrt{n}} \]

In the case of proportions, the standard deviation s is given by \( s = \sqrt{p(1 - p)} \), where p is the proportion.

Example: (from M, B&B) The village of Whitefish Bay uses a private insurance carrier to cover its automobile fleet. The carrier contends that it pays 90% of all claims within 30 days after the claims are filed. The department wants to check this out without going through a complete audit. Analyst Susan Medford takes a sample of 100 claims from last year and finds that 82 were paid within 30 days. Present a statistical evaluation of these results.

Answer:

Step 1: The best estimate of the population proportion is the sample proportion. Here, the sample proportion is 82/100 = 0.82.

Step 2: Estimate the population standard deviation. In this case, the hypothetical population proportion is 0.90 (= p). Hence, we use this proportion (not the sample proportion) to estimate the standard deviation:
\[ \sigma = \sqrt{p(1 - p)} = \sqrt{0.90 \times 0.10} = 0.3 \]

Step 3: Estimate the standard error of the proportion:
\[ \text{s.e.} = \sigma/\sqrt{n} = 0.3/\sqrt{100} = 0.03 \]

Step 4: The question is: what is the probability that a sample of 100 would result in a proportion estimate of 0.82 or less if the true population proportion were 0.90? Convert 0.82 to a t-score:
\[ t = \frac{\hat{p} - p}{\text{s.e.}} = \frac{0.82 - 0.90}{0.03} = -2.67 \]

From the t-table, we find a probability of 0.0044 (we can also use a normal distribution, as the degrees of freedom exceed 30). This means that the probability is less than 0.005 of obtaining a sample of 100 with a proportion of 0.82 or less if the true proportion were 0.90.

II. Determining Sample Size

1. Sample size for problems involving population means: What is the sample size required to be 95% certain that the estimate of a population mean is within a margin of error E? Recall the confidence interval formula:

\[ \left(\bar{X} - t \cdot s/\sqrt{n}\right) < \mu < \left(\bar{X} + t \cdot s/\sqrt{n}\right) \]

Or,

\[ (\bar{X} - E) < \mu < (\bar{X} + E) \]

The margin of error is given by:

\[ E = t \cdot \text{s.e.} = t \cdot \frac{s}{\sqrt{n}} \]

Solving for n, we get:

\[ n = \left[\frac{t \cdot s}{E}\right]^2 \]

In the above formula, t is the t-score associated with the desired confidence level, s is the estimated standard deviation, and E is the amount of error that can be tolerated.

Example: (from M, B&B) It is contract negotiation time, and the Louisiana teachers' union wants to argue that its salaries are the lowest in the region. Because the union has 20,000 members, it must rely on a survey. If the union wants to estimate its members' mean salary and be 95% sure that the estimate is within $200 of the real mean, how large a sample should the union use? Assume that a preliminary survey estimates the mean as $15,000, with a $1,000 standard deviation.

Answer: Here, s = 1,000, E = 200, and t = 1.96 (the t-score associated with the 95% confidence level as n becomes large). From the formula for sample size, we have

\[ n = \left[\frac{t \cdot s}{E}\right]^2 = \left[\frac{1.96 \times 1000}{200}\right]^2 = 9.8^2 = 96.04 \]

or about 96 members (rounding up to 97 guarantees that the margin of error does not exceed $200).

2. Sample size for problems involving population proportions: The formula in this case is identical to the formula for determining sample size for problems involving population means:

\[ n = \left[\frac{t \cdot s}{E}\right]^2 \]
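The sample-size formula can be sketched as a small Python function and checked against the teachers' union example. This is a minimal illustration, not part of the recitation: the function names `raw_sample_size` and `sample_size` are hypothetical, and rounding up to the next whole observation is an assumption made here (the notes report the unrounded value as 96).

```python
import math


def raw_sample_size(t_score: float, std_dev: float, margin: float) -> float:
    """Return n = (t * s / E)^2 exactly as the formula gives it (may be fractional)."""
    return (t_score * std_dev / margin) ** 2


def sample_size(t_score: float, std_dev: float, margin: float) -> int:
    """Round the raw value up so the margin of error is not exceeded.

    Rounding up is an assumption of this sketch; round(..., 9) only guards
    against floating-point noise before taking the ceiling.
    """
    return math.ceil(round(raw_sample_size(t_score, std_dev, margin), 9))


# Teachers' union example: s = 1000, E = 200, t = 1.96 (95% confidence, large n)
print(raw_sample_size(1.96, 1000, 200))  # unrounded value from the formula
print(sample_size(1.96, 1000, 200))      # rounded up to a whole number of members
```

For a proportion, the same function applies with `std_dev` set to the square root of p(1 - p).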

If we do not know the population proportion when estimating a sample size, we may assume a population proportion of 0.5, as it is the most conservative value to use. This is because the standard deviation \( \sqrt{p(1 - p)} \) is highest at a proportion of 0.5 (this can be shown by calculus). Hence, any other proportion would require a smaller sample size for the same confidence level and margin of error.

Example: (from M, B&B) The personnel department of a large government agency needs to know the percentage of employees who will retire this year. This information is essential to agency recruitment personnel. The agency determines this information with a random sample. If the agency wants to be 90% sure that its estimate of the retirement percentage is within 2%, how large a sample should it take?

Answer:

III. Hypothesis Testing

1. Null Hypothesis (\( H_0 \)): A hypothesis expressed as a negative, i.e., nothing happened. It is a statement of no effect. In statistical inference, it is easier to work with null hypotheses.

2. Alternative Hypothesis (\( H_1 \)): The opposite of the null hypothesis, expressed in the positive. It is a statement of what we hope or suspect to be true instead of the null hypothesis. It is also called the research hypothesis.

3. Steps in Hypothesis Testing:

Step 1: Formulate the null and alternative hypotheses.

Step 2: Collect data relevant to the hypotheses.

Step 3: Evaluate the hypotheses in the light of the data. Do the data support the null hypothesis or the alternative hypothesis?

Step 4: Based on the evaluation in Step 3, reject or do not reject the null hypothesis.
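As a rough illustration of these four steps, the sketch below re-runs the Whitefish Bay claims example from Section I as a one-sided test. Several things here are assumptions rather than part of the recitation: the function name `lower_tail_proportion_test` is hypothetical, the 0.05 significance level is chosen only for illustration, and the tail probability uses the normal approximation (which the notes allow when the degrees of freedom exceed 30), so it differs slightly from the t-table value.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_proportion_test(successes: int, n: int, p_null: float,
                               alpha: float = 0.05) -> dict:
    """Test H0: p is not less than p_null against H1: p is less than p_null."""
    # Step 2: collect data and summarise it as a sample proportion.
    p_hat = successes / n
    # Step 3: evaluate. The standard error uses the hypothesised proportion,
    # as in the notes, and the score is referred to a standard normal.
    se = sqrt(p_null * (1 - p_null) / n)
    score = (p_hat - p_null) / se
    p_value = NormalDist().cdf(score)
    # Step 4: reject or do not reject H0 (alpha is an illustrative assumption).
    return {"p_hat": p_hat, "se": se, "score": score,
            "p_value": p_value, "reject_null": p_value < alpha}


# Step 1: H0: the carrier pays at least 90% of claims within 30 days; H1: fewer.
result = lower_tail_proportion_test(successes=82, n=100, p_null=0.90)
for name, value in result.items():
    print(name, value)
```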

In most situations, we conduct hypothesis testing with sample data, as population parameters are not available.

4. Type I and Type II errors: A Type I error arises when we reject the null hypothesis even though it is true. The second type of error, called a Type II error, arises when we fail to reject the null hypothesis when in fact it is not true.

Example: (from M, B&B) The Bureau of Administration is concerned with high levels of employee absenteeism. Last year, the average employee missed 12.8 workdays. This year, there is an experimental program in which the agency pays employees for each sick day or personal day that they do not use. A preliminary survey of 20 persons reveals a mean of 8.7 days missed and a standard deviation of 4.6. Present a hypothesis, a null hypothesis, and evaluate them. State a conclusion in plain English.

Answer:

Step 1: Formulate the hypotheses:
\( H_0 \): Mean employee absenteeism is NOT less than 12.8 workdays.
\( H_1 \): Mean employee absenteeism IS less than 12.8 workdays.

(Hint: How did we formulate these hypotheses? The purpose of the experimental program is to know whether employee absenteeism has fallen since the program was instituted. Evidence of this effect would mean that the mean number of days missed AFTER the program came into effect is lower than the mean before the program. If the program had no effect, we would expect that the mean number of days did not change.)

Step 2: Collect data:
a. Estimate the population mean after the program was instituted. As the best estimate of the population mean is the sample mean, the estimated mean is 8.7.
b. Estimate the population standard deviation after the program was instituted. Again, the sample standard deviation is the best estimate of the population standard deviation. Hence, s is 4.6.
c. Calculate the standard error of the mean. We use our familiar formula:
\[ \text{s.e.} = \frac{s}{\sqrt{n}} = \frac{4.6}{\sqrt{20}} \approx 1.0286 \]

Step 3: Evaluate the hypotheses:

This basically means answering the following question: what is the probability of drawing a sample of 20 with a mean of 8.7 or less if the population mean is 12.8? To find this probability, we convert 8.7 into a t-score and use the t-table:

\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{8.7 - 12.8}{1.0286} \approx -3.986 \]

Hence, with 19 degrees of freedom, the probability is about 0.0004.

Step 4: Reject or do not reject the null hypothesis: As the probability is very low, we reject the null hypothesis. It is very unlikely that a sample of 20 with a mean of 8.7 could have been drawn from a population with mean 12.8. Thus it is likely that mean employee absenteeism is now less than 12.8 days.
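The absenteeism calculation can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the method used in the recitation, which reads the probability from a t-table. The Python standard library has no t-distribution, so the sketch integrates the t density numerically with Simpson's rule. The function names `t_pdf`, `t_cdf`, and `one_sample_t` are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t), found by Simpson's rule on [0, t] (steps must be even).

    The step h is negative when t is negative, so the signed area is
    subtracted from 0.5 automatically.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def one_sample_t(sample_mean: float, mu0: float, s: float, n: int):
    """Return (standard error, t-score, lower-tail probability) with n - 1 df."""
    se = s / math.sqrt(n)
    t = (sample_mean - mu0) / se
    return se, t, t_cdf(t, n - 1)


# Absenteeism example: sample mean 8.7, hypothesised mean 12.8, s = 4.6, n = 20
se, t, p = one_sample_t(8.7, 12.8, 4.6, 20)
print(f"s.e. = {se:.4f}, t = {t:.3f}, one-sided probability = {p:.4f}")
```

A statistics package with a built-in t-distribution would normally replace the hand-written integration; it is spelled out here only to keep the example self-contained.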