Chapter 23: Inferences About Means

Enough Proportions!

We've spent the last two units working with proportions (or qualitative variables, at least); now it's time to turn our attention to quantitative variables. For qualitative variables, the parameter (when there was one) was the population proportion. Now, there are two parameters: the population mean and the population variance. Inference for variance is beyond the scope of this course, so we'll only concern ourselves with the mean. The statistic that estimates the population mean is the sample mean. Naturally, this is a random variable and as such, we need to know something about the sampling distribution of this statistic.

Some Theory About the Sample Mean

A Reminder

Recall from a previous chapter that the sampling distribution of $\bar{x}$ has mean $\mu_{\bar{x}} = \mu$, variance $\sigma_{\bar{x}}^2 = \sigma^2 / n$, and an approximately normal shape under certain circumstances. We now need to revisit this idea with an eye towards reality.

The key for the mean is that the equation is true; the equals sign was correct. It actually doesn't affect our (upcoming) calculations to not know the value of $\mu$.

The variance equation was only valid if the sample was small relative to the population (less than 10%). That continues to be true, and also continues to be the least of our worries. The bigger issue is that we need this value for our upcoming calculations. How often are we going to know the variance of the population? Never!

A Complication

We're certainly not going to stop and cry about this! There is a way around it; in fact, we've encountered the problem before. A few chapters ago, we let go of the parameter $p$ and started using the statistic $\hat{p}$ in its place.

Logically, then, now that we don't have the value of $\sigma^2$ (or $\sigma$), what should we use in its place? The statistic, of course! Let's replace $\sigma$ with $s$. Just like before, we'll start calling this thing the standard error: $SE(\bar{x}) = \frac{s}{\sqrt{n}}$.

Previously, switching $p$ for $\hat{p}$ had no effect on the shape of the distribution because $\hat{p}$ was an unbiased estimator of $p$. Wouldn't it be cool if $s$ was an unbiased estimator of $\sigma$?
HOLLOMAN'S AP STATISTICS BVD CHAPTER 23, PAGE 1 OF 7

Alas, it isn't. The sample variance $s^2$ is an unbiased estimator of $\sigma^2$, but $s$ itself is not an unbiased estimator of $\sigma$. More importantly, $s$ changes from sample to sample, and it bounces around the most when the sample is small; it only settles down as the sample size increases, right up until the point that the sample is the population and $s$ becomes $\sigma$. What you should take from that is that $s$ is typically not equal to $\sigma$.

One direct result of this is that $SE(\bar{x})$ will typically not equal $\sigma_{\bar{x}}$. The other direct result concerns the Central Limit Theorem. It said that the sampling distribution of $\bar{x}$ approached normal with standard deviation $\sigma / \sqrt{n}$. We are now replacing that fixed standard deviation with one that varies from sample to sample, which means that the shape of the distribution will be different!

Another Complication

The Central Limit Theorem says that $\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ has an approximately standard normal shape; what kind of shape does $\frac{\bar{x} - \mu}{s / \sqrt{n}}$ have?

Fortunately, a very smart guy figured this out a long time ago. He derived a new distribution, which he called Student's t (be sure to read your textbook for the full story). This distribution looks a lot like the standard normal, but with fatter tails, and the shape changes as the sample size increases! We saw this before in our study of Chi Square, and we saw how the idea was handled in terms of the graph: degrees of freedom. It turns out that the degrees of freedom for the t distribution are (for now) $n - 1$.

Finding Probabilities

You need to be able to find probabilities for a t distribution. Happily, this skill is identical to finding probabilities for a Chi Square distribution!

When using the chart, first find the degrees of freedom down the left hand side. Next, find the spot where the given statistic value ought to lie. Then, look up to find the right hand area. Finally, make sure that you actually answer the question that was asked (this may involve symmetry and the complement).

When using the calculator, know that tcdf() works identically to chisqcdf()!

Examples

[1.] Find $P(t > 2)$ if $df = 15$.

I get an exact answer of 0.03197, and a chart answer between 0.025 and 0.05.
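For anyone working in software rather than with the chart or a calculator, Example 1 can be checked with a short Python sketch. None of this code comes from the notes: the helper name `t_cdf` is made up for illustration, and the area is approximated by integrating the t density numerically with Simpson's rule (a statistics library would normally do this job).

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t with df degrees of freedom.

    Hypothetical helper (not from the notes): integrates the t density
    from 0 to |t| with Simpson's rule, then uses symmetry about 0.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

# Example 1: right-hand area beyond t = 2 with 15 degrees of freedom
p_right = 1 - t_cdf(2, 15)
print(round(p_right, 5))
```

The printed value should agree with the exact answer quoted in Example 1 and fall inside the chart's range of 0.025 to 0.05. Left-hand areas and negative t values follow from the same function through the complement and symmetry.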

Figure 1 - T Table Excerpt for Example 1

[2.] Find $P(t < 2.5)$ if $df = 20$.

I get an exact answer of 0.9894, and a chart answer between 0.975 and 0.99. You must use the complement for this one, since the question asked for left hand area but the chart only gives right hand area.

[3.] Find $P(t < -1.75)$ if $df = 5$.

I get an exact answer of 0.0703, and a chart answer between 0.05 and 0.1. You must use symmetry for this one, since the chart only uses positive values of t.

[4.] Find $P(t > -3)$ if $df = 6$.

I get an exact answer of 0.988 and a chart answer between 0.975 and 0.99. You must use symmetry and the complement for this one.

A Confidence Interval for the Mean

So how does this affect our procedures for confidence intervals?

The Formula

$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$ with $df = n - 1$.

The Conditions

The t procedures require that the sample was obtained randomly, that the sample is small enough (the 10% condition), and that the variable has a normal distribution in the population.

Yet Another Complication

As before, we will often assume that the sample was obtained randomly. Also, I'll keep the second condition in mind, but I'll rarely mention it. The last condition will almost certainly fail!

Fortunately, the t procedures are what we call robust. That means that they still give reasonably accurate results even when the conditions are violated. This does not mean that we will simply plow ahead and forget about the condition; rather, it means that what we need to check is going to be slightly different from one problem to another.

If the population is normal, then we are good to go. If the population is approximately normal, or not very non-normal, then the robust nature of the t procedures will allow us to continue. If the population is clearly (or incredibly) non-normal, then we'll only be able to continue if the sample size is large (because as the sample size increases, the closer we get to a situation where the Central Limit Theorem kicks in).

...but how can we know anything about the population? How can we determine if it is OK to continue if we don't have the whole population to look at? Think, think! What have we done in the past cases? We replaced the population information with sample information. Thus, if we can't look at the shape of the population, we should instead look at the shape of the sample!

The Solution

For small samples (say, less than 15), we need a fairly normal population or, in terms of the sample, there cannot be any clear indication of skewness.

For slightly larger samples (say 15 through 40), we need a population that isn't too skew, so we need to see a sample that isn't too terribly skew. No, you can't make that any more precise! Deal with it.

For larger samples, we almost don't care what the population looks like, so we could have almost any amount of skew in the sample.

How are you going to decide if there is skew? If you have data, then you'll need to graph the data, and that means that you'll have to draw the graph as part of your answer. If you don't have the data, then you're going to have to make an assumption about the population. Make sure that you don't assume too much! Only assume as much as is needed in order to move forward with the procedure.

In all cases, outliers are an issue in reality. As far as AP is concerned, outliers should not stop you from performing the procedure.

The Conditions (AP Exam Version)

Alas, what has appeared in the scoring rubrics on the AP Exam isn't exactly what I've described, specifically with regards to the shape requirement.

Here's what is typically expected: We need the sample to be a random sample from the population, and either a normally distributed population or a large sample size.

All of the scoring rubrics result in either a very small sample size (in which case you should check to see if the sample shows any sign of skew, or assume something about the population) or a quite large sample size (in which case the requirement is met). I haven't seen any with a medium-sized sample where you must either see a not-too-skew sample or assume a not-too-skew population.

So I'm going to go ahead and work examples the way I'd like to see them in class. Be aware that the same answers on the AP Exam might not receive full credit.

Example

[5.] During a study on car safety, the braking distance (feet) was measured for a car traveling at several different speeds. The data are as follows:

Table 1 - Braking Distances for Example 5

 2  10   4  22  16  10  18  26  34  17
28  14  20  24  28  26  34  26  36  60
80  20  26  54  32  40  32  40  50  42
56  76  84  36  32  48  52  56  64  66
54  70  92  93 120  85  46  68  46  34

Construct a 95% confidence interval for the population mean braking distance for these cars.

This calls for a one sample t interval for the true mean. This requires that the sample was obtained randomly and that the population variable is normally distributed.

I'll have to assume that the sample was obtained randomly. I don't know anything about the population distribution, but with a sample size of 50 I'll almost certainly be able to continue regardless of how the sample distribution looks. I'll take a look anyway.

[Figure 2 - Histogram for Example 5: frequency versus braking distance (ft)]

Nothing in the sample indicates a problem; I should be able to continue.

With 95% confidence and 49 degrees of freedom, $t^* = 2.01$.

The interval is $\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 42.98 \pm 2.01 \cdot \frac{25.77}{\sqrt{50}} = (35.656, 50.304)$.

I am 95% confident that the population mean braking distance is between 35.656 feet and 50.304 feet.

A Hypothesis Test for the Mean

This will follow the same pattern as the other tests we've learned.

The Hypotheses

We'll assume that the parameter ($\mu$) has some specific value ($\mu_0$). The alternative will be one of the three inequalities. Be sure to explicitly define the parameter!

$H_0: \mu = \mu_0$
$H_a: \mu \;?\; \mu_0$
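Looking back at Example 5, the interval can be reproduced from the raw data with a small Python sketch. This is an illustration only, under a few assumptions: the helper names `t_cdf` and `t_critical` are hypothetical, the t distribution is handled by numerical integration plus bisection instead of a table, and the data list is typed in from Table 1.

```python
import math
import statistics

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t (Simpson's rule; hypothetical helper)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(conf, df):
    """Find t* with central area `conf` by bisection on t_cdf (hypothetical helper)."""
    target = 1 - (1 - conf) / 2        # cumulative area to the left of t*
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Braking distances (ft) from Table 1
braking = [2, 10, 4, 22, 16, 10, 18, 26, 34, 17,
           28, 14, 20, 24, 28, 26, 34, 26, 36, 60,
           80, 20, 26, 54, 32, 40, 32, 40, 50, 42,
           56, 76, 84, 36, 32, 48, 52, 56, 64, 66,
           54, 70, 92, 93, 120, 85, 46, 68, 46, 34]

n = len(braking)
xbar = statistics.mean(braking)
s = statistics.stdev(braking)          # sample standard deviation (divides by n - 1)
t_star = t_critical(0.95, n - 1)
margin = t_star * s / math.sqrt(n)     # t* times the standard error
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.2f}, t* = {t_star:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The worked solution rounds $t^*$ to 2.01 for display; the sketch keeps more digits, and its endpoints should land on the interval reported in Example 5 to about three decimal places.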

The Conditions

The conditions here are the same as for the interval: the sample was obtained randomly, the sample is small enough (the 10% condition), and the variable has a normal distribution in the population.

As was the case before, that last condition will fail. You'll need to graph the data or make an appropriate assumption in order to continue. Be sure to read my earlier comments about how what I'm writing might not be exactly what is expected on the AP Exam!

The Mechanics

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ with $df = n - 1$.

Then calculate the p-value using the t-distribution, much as you did for one sample proportion tests. Be sure to explicitly state the level of significance that you will be using.

The Conclusion

The conclusion is much the same as it was in previous procedures!

If [null hypothesis] then I can expect to find [probability statement] in [p-value] of repeated samples. Since [$p < \alpha$ / $p > \alpha$], this occurs [too rarely / often enough] to attribute to chance at the [$\alpha$] level; it is [significant / not significant], and I [reject / fail to reject] the null hypothesis. [conclusion in context; make a statement about the alternate hypothesis].

Example

[6.] The girth (diameter; measured in inches) of 31 black cherry trees was measured. The data are as follows:

Table 2 - Cherry Tree Data for Example 6

 8.3   8.6   8.8  10.5  10.7  10.8  11.0  11.0
11.1  11.2  11.3  14.2  14.5  16.0  16.3  17.3
17.5  17.9  18.0  18.0  20.6  12.9  13.3  13.7
13.8  11.4  11.4  11.7  12.0  14.0  12.9

Do these data provide evidence that the population mean girth is different from 12 inches?

I'll let $\mu$ represent the population mean girth of a cherry tree.

$H_0: \mu = 12$ (the population mean girth is 12 inches)
$H_a: \mu \neq 12$ (the population mean girth is not 12 inches)

This calls for a one sample t test for the population mean. This requires that the sample was obtained randomly and that the population variable is distributed normally.

I'll have to assume that the sample was obtained randomly. I don't know how the population is distributed, but with a sample size of 31 I should be able to continue in almost any case. I'll go ahead and look at the data anyway.

[Figure 3 - Histogram for Example 6: frequency versus girth (in)]

The skew here is OK; I can continue.

I'll use $\alpha = 0.05$.

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{13.248 - 12}{3.138 / \sqrt{31}} = 2.215$. With $df = 30$, $2P(t > 2.215) = 0.0345$.

If the population mean girth is 12 inches, then I can expect to find a sample with a mean girth less than 10.75 inches or greater than 13.248 inches in about 3.45% of samples. Since $p < \alpha$, this occurs too rarely to attribute to chance at the 5% level. This is significant; I reject the null hypothesis. The data do provide evidence that the population mean girth is different from 12 inches.
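The mechanics of Example 6 can also be sketched in Python as a cross-check. As before, this is an illustration under stated assumptions and not part of the original notes: `t_cdf` is a hypothetical helper that approximates the t distribution by numerical integration, and the data list is typed in from Table 2.

```python
import math
import statistics

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t (Simpson's rule; hypothetical helper)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Girths (in) from Table 2
girth = [8.3, 8.6, 8.8, 10.5, 10.7, 10.8, 11.0, 11.0,
         11.1, 11.2, 11.3, 14.2, 14.5, 16.0, 16.3, 17.3,
         17.5, 17.9, 18.0, 18.0, 20.6, 12.9, 13.3, 13.7,
         13.8, 11.4, 11.4, 11.7, 12.0, 14.0, 12.9]

mu0 = 12                               # hypothesized population mean
alpha = 0.05                           # significance level used in Example 6

n = len(girth)
xbar = statistics.mean(girth)
s = statistics.stdev(girth)            # sample standard deviation
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Two-sided alternative: double the area beyond |t|
p_value = 2 * (1 - t_cdf(abs(t_stat), n - 1))

print(f"mean = {xbar:.3f}, s = {s:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

For a one-sided alternative, the doubling would be dropped and the tail chosen to match the direction of the alternative hypothesis.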