Chapter 6. Estimates and Sample Sizes


Lesson 6-1/6-2, Part 1 Estimating a Population Proportion

This chapter presents the beginning of inferential statistics. The two major applications of inferential statistics involve the use of sample data to: 1. Estimate the value of a population parameter (proportions, means, and variances). 2. Test some claim (or hypothesis) about a population.

Overview Introduce methods for estimating values of these important population parameters: proportions, means, and variances. Present methods for determining the sample sizes necessary to estimate those parameters.

Assumptions Randomization condition: Were the data sampled at random or generated from a properly randomized experiment? 10% condition (\(N \ge 10n\)): Samples are almost always drawn without replacement. If the sample exceeds 10% of the population, the probability of a success changes so much during the sampling that our Normal model may no longer be appropriate.

Assumptions Normal Approximation: The model that we use for inference is based on the Central Limit Theorem. The sample must be large enough to make the sampling model for the sample proportions approximately Normal: \(n\hat{p} \ge 5\) and \(n\hat{q} \ge 5\).
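The two numeric conditions above can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function name and its arguments are hypothetical, and the randomization condition still has to be judged from how the data were collected.

```python
def proportion_conditions_ok(n, p_hat, population_size=None, minimum=5):
    """Check the large-sample and 10% conditions for a one-proportion z-interval.

    n               -- sample size
    p_hat           -- sample proportion of successes
    population_size -- N, if known (hypothetical optional argument)
    minimum         -- required count of successes and failures (5 on the slides)
    """
    successes = n * p_hat
    failures = n * (1 - p_hat)
    large_enough = successes >= minimum and failures >= minimum
    # The 10% condition is skipped when the population size is not supplied.
    ten_percent_ok = population_size is None or population_size >= 10 * n
    return large_enough and ten_percent_ok


print(proportion_conditions_ok(491, 0.65))        # large sample, population size not given
print(proportion_conditions_ok(12, 0.10))         # only 1.2 expected successes
print(proportion_conditions_ok(491, 0.65, 3000))  # population smaller than 10n
```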

Notations for Proportions \(p\) = population proportion. \(\hat{p} = x/n\) = sample proportion (read "p hat") of \(x\) successes in a sample of size \(n\). \(\hat{q} = 1 - \hat{p}\) = sample proportion of failures in a sample of size \(n\).

Point Estimate A point estimate is a single value (or point) used to approximate a population parameter. The sample proportion \(\hat{p}\) is the best point estimate of the population proportion \(p\).

Confidence Interval or Interval Estimate A confidence interval (or interval estimate) is a range (or an interval) of values used to estimate the true value of a population parameter. A confidence interval is sometimes abbreviated as CI.

Confidence Level A confidence level is the probability \(1 - \alpha\) (often expressed as the equivalent percentage value) that is the proportion of times that the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times. This is usually 90% (\(\alpha\) = 10%), 95% (\(\alpha\) = 5%), or 99% (\(\alpha\) = 1%). The confidence level is also called the degree of confidence, or the confidence coefficient.

Interpreting the Confidence Level The statement "95% confident" means that, in repeated sampling, 95 percent of the intervals produced using this method will contain the proportion of adult Minnesotans who would respond "no" to the question about photo-cop legislation. If 1000 samples of size 89 were taken, about 1000(0.95) = 950 of the intervals would contain the parameter p and about 50 would not.
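The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the "true" proportion p = 0.51 is chosen for illustration (in practice p is unknown), the sample size 89 and the 1000 repetitions are taken from the slide, and the function name and seed are hypothetical.

```python
import random


def coverage(p=0.51, n=89, z=1.96, repetitions=1000, seed=1):
    """Fraction of simulated 95% intervals that contain the assumed true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / repetitions


print(coverage())  # typically close to 0.95, but not exactly 0.95
```

The observed fraction varies from run to run and from seed to seed, which is the point: the 95% describes the long-run behavior of the method, not any single interval.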

What can we really say about p? "51% of all Minnesotans are opposed to photo-cop legislation." It would be nice to be able to make absolute statements about population values with certainty, but we just don't have enough information to do that. There's no way to be sure that the population proportion is the same as the sample proportion; in fact, it almost certainly isn't. Observations vary. Another sample would yield a different sample proportion.

What can we really say about p? "It is probably true that 51% of all Minnesotans are opposed to photo-cop legislation." No. In fact we can be pretty sure that whatever the true proportion is, it's not exactly 51%. So the statement is not true.

What can we really say about p? "We don't know exactly what proportion of Minnesotans are opposed to photo-cop legislation, but we know that it's within the interval from 48% to 54%." This is getting closer, but we still can't be certain. We can't know for sure that the true proportion is in this range or any particular range.

What can we really say about p? "We don't know exactly what proportion of Minnesotans are opposed to photo-cop legislation, but the interval from 48% to 54% probably contains the true value." We've now fudged twice: first by giving an interval and second by admitting that we only think the interval probably contains the true value. This statement is true.

What can we really say about p? The last statement may be true, but it's a bit wishy-washy. We can tighten it up a bit by quantifying what we mean by "probably": We are 95% confident that between 48% and 54% of Minnesotans oppose photo-cop legislation.

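Critical Value A critical value is the number on the borderline separating sample proportions that are likely to occur from those that are unlikely to occur. [Figure: standard normal curve with central area equal to the confidence level \(1 - \alpha\), an area of \(\alpha/2\) in each tail, and the critical values \(-z_{\alpha/2}\) and \(z_{\alpha/2}\) on either side of \(z = 0\).]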

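Example Page 31, # Find the critical value \(z_{\alpha/2}\) that corresponds to the given confidence level of 90%. \(1 - \alpha = 0.90\), so \(\alpha = 0.10\) and \(\alpha/2 = 0.05\). \(z_{\alpha/2} = z_{0.05}\) = invNorm(1 − 0.05, 0, 1) = 1.645. The critical values are −1.645 and 1.645, with an area of 0.05 in each tail.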

The most common critical values are:

Confidence Level    α       Critical Value, z_{α/2}
90%                 0.10    1.645
95%                 0.05    1.96
99%                 0.01    2.575
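The table values can be reproduced with the inverse normal function. The sketch below uses Python's standard-library `statistics.NormalDist` in place of the calculator's invNorm; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # Area to the left of the positive critical value is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

For 99% the computed value is about 2.5758, which rounds to 2.576; printed tables often list 2.575 instead, and both appear in these slides.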

Margin of Error When data from a simple random sample are used to estimate a population proportion p, the margin of error, denoted by E, is the maximum likely (with probability \(1 - \alpha\)) difference between the observed proportion \(\hat{p}\) and the true value of the population proportion p.

Margin of Error \(E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}\)

Page 31, #14 Assume that a sample is used to estimate the population proportion p. Find the margin of error E that corresponds to n = 1200, x = 400, 99% confidence. \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(z_{\alpha/2} = z_{0.005}\) = invNorm(1 − 0.005, 0, 1) = 2.576. \(\hat{p} = x/n = 400/1200 \approx 0.33\). \(E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 2.576\sqrt{(0.33)(0.67)/1200} = 2.576(0.01357) = 0.0350\).

Confidence Interval for the Population Proportion \(\hat{p} - E < p < \hat{p} + E\), also written \(\hat{p} \pm E\) or \((\hat{p} - E,\ \hat{p} + E)\).

Find the Point Estimate and Margin of Error From a Confidence Interval Point Estimate: \(\hat{p} = (\text{UCL} + \text{LCL})/2\). Margin of Error: \(E = (\text{UCL} - \text{LCL})/2\). UCL = Upper Confidence Limit, LCL = Lower Confidence Limit.

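Example Page 31, #6 Express the confidence interval 0.456 < p < 0.496 in the form \(\hat{p} \pm E\). \(\hat{p} = (\text{UCL} + \text{LCL})/2 = (0.496 + 0.456)/2 = 0.476\). \(E = (\text{UCL} - \text{LCL})/2 = (0.496 - 0.456)/2 = 0.020\). So \(p = 0.476 \pm 0.020\).

Example Page 31, #10 Interpreting Confidence Interval Limits: Use the given confidence interval limits to find the point estimate \(\hat{p}\) and the margin of error E. 0.278 < p < 0.338. \(\hat{p} = (\text{UCL} + \text{LCL})/2 = (0.338 + 0.278)/2 = 0.308\). \(E = (\text{UCL} - \text{LCL})/2 = (0.338 - 0.278)/2 = 0.030\).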
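Recovering the point estimate and margin of error from the interval limits is simple arithmetic, sketched here with a hypothetical helper name. The limits used are those of the interval 0.456 < p < 0.496.

```python
def estimate_and_margin(lcl, ucl):
    """Return (point estimate, margin of error) from the lower and upper limits."""
    p_hat = (ucl + lcl) / 2   # midpoint of the interval
    margin = (ucl - lcl) / 2  # half the width of the interval
    return p_hat, margin


p_hat, margin = estimate_and_margin(0.456, 0.496)
print(round(p_hat, 3), round(margin, 3))
```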

Example Page 31, #0 Use the sample data and confidence level to construct the confidence interval estimate of the population proportion p. n = 2001, x = 1776, 90% confidence. Check assumptions: \(n\hat{p} = 1776 \ge 5\) and \(n\hat{q} = 225 \ge 5\). \(\hat{p} = x/n = 1776/2001 = 0.8876\).

Example Page 31, #0 90% CI: \(1 - \alpha = 0.90\), \(\alpha = 0.10\), so there is an area of \(\alpha/2 = 0.05\) in each tail. \(z_{0.05}\) = invNorm(1 − 0.05, 0, 1) = 1.645, with \(\hat{p} = 0.8876\) and \(n = 2001\).

Example Page 31, #0 \(\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 0.8876 \pm 1.645\sqrt{0.8876(0.1124)/2001} = 0.8876 \pm 0.0116\), giving [0.876, 0.899], that is, \(0.876 < p < 0.899\).

Example Page 31, #0 Using the TI: STAT / TESTS / A:1-PropZInt gives \(0.876 < p < 0.899\) with \(\hat{p} = 0.8876\).
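For readers without a TI calculator, the same one-proportion z-interval can be sketched in Python. This is an illustrative stand-in for 1-PropZInt, not the calculator's implementation; the function name is hypothetical, and the inputs x = 1776, n = 2001, 90% confidence are the values that reproduce \(\hat{p} = 0.8876\) in the example.

```python
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence):
    """Confidence interval p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(1776, 2001, 0.90)
print(round(low, 3), round(high, 3))
```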

Lesson 6-1/6-2, Part 2 Estimating a Population Proportion

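Sample Size for Estimating Proportion p When an estimate \(\hat{p}\) is known: \(n = [z_{\alpha/2}]^2\,\hat{p}\hat{q}/E^2\). When \(\hat{p}\) is unknown: \(n = [z_{\alpha/2}]^2\,(0.25)/E^2\).

Example Page 31, # Use the given data to find the minimum sample size required to estimate a population proportion or percentage. Margin of error: 0.038; confidence level: 95%; \(\hat{p}\) and \(\hat{q}\) unknown. \(1 - \alpha = 0.95\), \(\alpha = 0.05\), \(z_{\alpha/2} = z_{0.025}\) = invNorm(1 − 0.025, 0, 1) = 1.96. \(n = [z_{\alpha/2}]^2(0.25)/E^2 = (1.96)^2(0.25)/(0.038)^2 = 665.10\), which rounds up to 666.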
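The sample-size formulas translate directly into code. The sketch below is an illustration with hypothetical names: it uses 0.25 for \(\hat{p}\hat{q}\) when no estimate is available, always rounds up, and lets the caller pass a table value of z explicitly.

```python
import math
from statistics import NormalDist


def sample_size_proportion(margin, confidence=None, p_hat=None, z=None):
    """Minimum n for estimating a proportion to within the given margin of error.

    Supply either a confidence level (z is then computed) or a z value from a table.
    """
    if z is None:
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # With no prior estimate, p_hat * q_hat is replaced by its maximum, 0.25.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return math.ceil(z ** 2 * pq / margin ** 2)


print(sample_size_proportion(0.038, confidence=0.95))
```

Because n is rounded up, a table value such as 2.575 and the more precise 2.5758 can give answers that differ by one, so it helps to state which critical value was used.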

Example Page 313, #6 In 1920 only 35% of U.S. households had telephones, but that rate is now much higher. A recent survey of 4276 randomly selected households showed that 4019 of them had telephones (based on data from the U.S. Census Bureau). Using those survey results and a 99% confidence level, the TI-83 Plus calculator display is as shown. A. Write a statement that interprets the confidence level. We are 99% confident that the interval from 93.053% to 94.926% contains the true percentage of U.S. households having telephones.

Example Page 313, #6 B. Based on the preceding results, should pollsters be concerned about results from surveys conducted by phone? Yes. Based on the results from part (A), about 5% to 7% of the population does not have a telephone, so those people are missed.

Procedure for Constructing a Confidence Interval for p Identify the population of interest and the parameter you want to draw conclusions about. Choose the appropriate inference procedure. Verify the conditions for using the selected procedure. If the conditions are met, carry out the inference. Interpret your results in the context of the problem.

Example Page 313, #8 Death Penalty Survey: In a Gallup Poll, 491 randomly selected adults were asked whether they are in favor of the death penalty for a person convicted of murder, and 65% of them said that they were in favor. A. Find the point estimate of the percentage of adults who are in favor of this death penalty. 65% is the point estimate

Example Page 313, #8 B. Find a 95% confidence interval estimate of the percentage of adults who are in favor of this death penalty. Step 1 Identify the population of interest and parameter you want to draw conclusion about. p = proportion of adults who are in favor of the death penalty for a person convicted of murder

Example Page 313, #8 Step 2 Choose the appropriate inference procedure. Verify conditions for using the selected procedure. Use a one-proportion z-interval. Random sample: stated in the question. Population is at least 10(491) = 4910 adults. Sampling distribution is approximately normal: \(n\hat{p} = (491)(0.65) \approx 320 \ge 5\) and \(n\hat{q} = (491)(0.35) \approx 172 \ge 5\).

Example Page 313, #8 Step 3 Carry out the inference procedure. For 95% confidence there is an area of 0.025 in each tail, so \(z_{\alpha/2} = 1.96\). \(\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 0.65 \pm 1.96\sqrt{0.65(0.35)/491} = 0.65 \pm 0.042\).

Example Page 313, #8 Step 4 Interpret your results in the context of the problem. We are 95% confident that the proportion of adults who are in favor of the death penalty for a person convicted of murder is between 61% and 69%.

Example Page 313, #8 Using the TI: \(\hat{p} = x/n = 0.65\), so \(x = 491(0.65) = 319.15 \approx 320\). Result: \(0.61 < p < 0.69\), that is, 61% < p < 69%.

Example Page 313, #8 C. Can we safely conclude that the majority of adults are in favor of this death penalty? Explain. Yes, since the interval in which we have 95% confidence is entirely above 50%.

Example Page 314, #34 Sample Size for Left-Handed Golfers. As a manufacturer of golf equipment, the Spalding Corporation wants to estimate the proportion of golfers who are left-handed. (The company can use this information in planning for the number of right-handed and left-handed sets of golf clubs to make.) How many golfers must be surveyed if we want 99% confidence that the sample proportion has a margin of error of 0.025? A) Assume that there is no available information that could be used as an estimate of \(\hat{p}\). Use \(\hat{p} = 0.50\) and \(\hat{q} = 0.50\). \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(z_{\alpha/2} = z_{0.005} = 2.575\). \(n = [z_{\alpha/2}]^2(0.25)/E^2 = (2.575)^2(0.25)/(0.025)^2 = 2652.25\), which rounds up to 2653.

Example Page 314, #34 B) Assume that we have an estimate of \(\hat{p}\) found from a previous study that suggests that 15% of golfers are left-handed (based on a USA Today report). \(\hat{p} = 0.15\), \(\hat{q} = 0.85\), \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(z_{\alpha/2} = z_{0.005} = 2.575\). \(n = [z_{\alpha/2}]^2\,\hat{p}\hat{q}/E^2 = (2.575)^2(0.15)(0.85)/(0.025)^2 = 1352.65\), which rounds up to 1353.

Example Page 314, #34 C) Assume that instead of using randomly selected golfers, the sample data are obtained by asking TV viewers of the golfing channel to call an 800 phone number to report whether they are left-handed or right-handed. How are the results affected? Self selected samples are not valid. It is not appropriate to assume that those who respond will be representative of the general population.

Lesson 6-3 Estimating a Population Mean: σ Known

Assumptions The sample is a simple random sample. The value of the population standard deviation σ is known. The population is normally distributed or n > 30.

Example Page 37, #6 Verify the assumptions. Determine whether the given conditions justify using the margin of error \(E = z_{\alpha/2}\,\sigma/\sqrt{n}\) when finding a confidence interval estimate of the population mean μ. The sample size is n = 5 and σ is not known. No: n is not greater than 30 and the standard deviation is not known.

Example Page 37, #8 Verify the assumptions. Determine whether the given conditions justify using the margin of error \(E = z_{\alpha/2}\,\sigma/\sqrt{n}\) when finding a confidence interval estimate of the population mean μ. The sample size is n = 9, σ is not known, and the original population is normally distributed. No, because σ is not known.

Definitions An estimator is a formula or process for using sample data to estimate a population parameter. An estimate is a specific value or range of values used to approximate a population parameter. A point estimate is a single value (or point) used to approximate a population parameter. The sample mean \(\bar{x}\) is the best point estimate of the population mean μ.

Confidence Interval As we saw in Section 6-2, a confidence interval is a range (or an interval) of values used to estimate the true value of the population parameter. The confidence level gives us the success rate of the procedure used to construct the confidence interval.

Level of Confidence As described in Section 6-2, the confidence level is often expressed as the probability \(1 - \alpha\), where α is the complement of the confidence level. For a 0.95 (95%) confidence level, α = 0.05. For a 0.99 (99%) confidence level, α = 0.01.

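Margin of Error The margin of error is the maximum likely difference observed between the sample mean \(\bar{x}\) and the population mean μ, and is denoted by E. \(E = z_{\alpha/2}\,\sigma/\sqrt{n}\)

Confidence Interval Estimate of the Population Mean μ \(\bar{x} - E < \mu < \bar{x} + E\), also written \(\bar{x} \pm E\) or \((\bar{x} - E,\ \bar{x} + E)\).

[Figure: Distribution of Sample Means with Known σ. Normal curve centered at μ (z = 0), with the margin of error E marked on each side of the center, an area of \(\alpha/2\) in each tail, and the critical value \(z_{\alpha/2}\).]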

Example Page 38, #10 Use the given confidence level and sample data to find the margin of error and confidence interval for estimating the population mean μ. Ages of drivers occupying the passing lane while driving 5 mi/h with the left signal flashing: 99% confidence; n = 50, \(\bar{x} = 80.5\) years, and σ is known to be 4.6 years. Given: \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(n = 50\), \(\bar{x} = 80.5\), \(\sigma = 4.6\).

Example Page 38, #10 \(z_{\alpha/2} = z_{0.005} = 2.575\), since invNorm(0.005, 0, 1) = −2.575. Find the margin of error: \(E = z_{\alpha/2}\,\sigma/\sqrt{n} = 2.575(4.6)/\sqrt{50} = 1.675 \approx 1.68\) years.

Example Page 38, #10 Find the confidence interval: \(\bar{x} - E < \mu < \bar{x} + E\), so \(80.5 - 1.675 < \mu < 80.5 + 1.675\), that is, 78.8 yr < μ < 82.2 yr.

Example Page 38, #10 Find the confidence interval using the TI: STAT / TESTS / 7:ZInterval with σ = 4.6, \(\bar{x} = 80.5\), n = 50, and confidence level 0.99.
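As a cross-check on the calculator, a z-interval for the mean with known σ can be sketched in a few lines of Python. The function name is hypothetical, and the critical value is computed rather than read from a table, so results can differ from the slides in the last decimal place.

```python
from statistics import NormalDist


def z_interval_mean(x_bar, sigma, n, confidence):
    """Return (lower limit, upper limit, margin of error) for mu, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin, margin


# Driver-age example: n = 50, sample mean 80.5 years, sigma = 4.6 years, 99% confidence.
low, high, margin = z_interval_mean(80.5, 4.6, 50, 0.99)
print(round(margin, 2), round(low, 1), round(high, 1))
```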

Sample Size for Estimating Mean μ \(n = \left[z_{\alpha/2}\,\sigma/E\right]^2\). When finding the sample size n, if the use of the formula does not result in a whole number, always increase the value of n to the next larger whole number.

Example Page 38, #16 Use the given margin of error, confidence level, and population standard deviation σ to find the minimum sample size required to estimate an unknown population mean μ. Margin of error: $500; confidence level: 94%; σ = $9877. \(1 - \alpha = 0.94\), \(\alpha = 0.06\), \(z_{\alpha/2} = z_{0.03} = 1.88\), since invNorm(0.03, 0, 1) = −1.8807. \(n = \left[z_{\alpha/2}\,\sigma/E\right]^2 = \left[1.88(9877)/500\right]^2 = 1379.20\), which rounds up to 1380.
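The sample-size formula for a mean can be sketched the same way. The helper name is hypothetical; the critical value is passed in explicitly so that the table value 1.88 used in the example can be reproduced.

```python
import math


def sample_size_mean(margin, sigma, z):
    """Minimum n so that z * sigma / sqrt(n) does not exceed the margin of error."""
    return math.ceil((z * sigma / margin) ** 2)


# Margin of error $500, sigma = $9877, z_{0.03} = 1.88 (94% confidence).
print(sample_size_mean(500, 9877, 1.88))
```

Using the unrounded critical value 1.8808 instead of 1.88 pushes the result up by one, another reminder to state which z value was used.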

Procedure for Constructing a Confidence Interval for μ, when σ is known Identify the population of interest and the parameter you want to draw conclusion about. Choose the appropriate inference procedure. Verify the conditions for using the selected procedure. Carry out the inference. Interpret your results in the context of the problem.

Example Page 38, # The health of the bear population in Yellowstone National Park is monitored by periodic measurements taken from anesthetized bears. A sample of 54 bears has a mean weight of 182.9 lb. Assuming that σ is known to be 121.8 lb, find a 99% confidence interval estimate of the mean of the population of all such bear weights. What aspect of this problem is not realistic? It is unrealistic to know σ. Step 1 Identify the population of interest and the parameter you want to draw conclusions about. µ = mean weight of bears in Yellowstone National Park. Given: n = 54, \(\bar{x} = 182.9\), σ = 121.8, confidence level 0.99.

Example Page 38, # Step 2 Choose the appropriate inference procedure. Verify conditions for using the selected procedure. We will use a one-sample z-interval. We are assuming that the sample was random. The standard deviation of the population is known: σ = 121.8. Large sample: since n = 54 > 30, the CLT tells us that the sampling distribution is approximately normal.

Example Page 38, # Step 3 Carry out the inference procedure. For 99% confidence there is an area of 0.005 in each tail, so \(z_{\alpha/2} = 2.575\). \(\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n} = 182.9 \pm 2.575(121.8)/\sqrt{54}\), giving 140.2 lb < μ < 225.6 lb.

Example Page 38, # Step 4 Interpret your results in the context of the problem. We are 99% confident that the mean weight of bears in Yellowstone National Park is between 140.2 lb and 225.6 lb.

Lesson 6-4, Part 1 Estimating a Population mean: σ Not Known

Assumptions The sample is a simple random sample. The value of the population standard deviation σ is unknown. The population is normally distributed or n > 30.

Student t Distribution If the distribution of a population is essentially normal, then the distribution of \(t = (\bar{x} - \mu)/(s/\sqrt{n})\) is essentially a Student t distribution for all samples of size n, and is used to find critical values denoted by \(t_{\alpha/2}\).

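Student t Distribution The t statistic is interpreted the same way as a z score: it represents the number of standard errors \(\bar{x}\) is from the population mean μ. The shape of the t distribution depends on the sample size n. \(z = (\bar{x} - \mu)/(\sigma/\sqrt{n})\): normally distributed. \(z = (\bar{x} - \mu)/(s/\sqrt{n})\): not normally distributed. \(t = (\bar{x} - \mu)/(s/\sqrt{n})\): Student t distributed when the population is normally distributed.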

Student t distribution for n = 3 and n = 1 The t distribution is different for different sample sizes.

Important Properties of the Student t Distribution The Student t distribution has the same general symmetric bell shape as the normal distribution, but it reflects the greater variability (with wider distributions) that is expected with small samples. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0). The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has a σ = 1). As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.

Degrees of Freedom (df) The number of degrees of freedom (df) corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values. \(df = n - 1\)

Margin of Error E for Estimate of μ Based on an unknown σ and a small simple random sample from a normally distributed population: \(E = t_{\alpha/2}\,s/\sqrt{n}\), where \(t_{\alpha/2}\) has n − 1 degrees of freedom.

Confidence Interval Estimate of the Population Mean μ with σ unknown \(\bar{x} - E < \mu < \bar{x} + E\), also written \(\bar{x} \pm E\) or \((\bar{x} - E,\ \bar{x} + E)\).

Example Page 343, # (A) Find the critical value \(z_{\alpha/2}\), (B) find the critical value \(t_{\alpha/2}\), or (C) state that neither the normal nor the t distribution applies. 95%; n = 10; σ is unknown; population appears to be normally distributed. \(1 - \alpha = 0.95\), \(\alpha = 0.05\), df = n − 1 = 10 − 1 = 9. \(t_{\alpha/2} = t_{9,\,0.025} = 2.262\) (use Table A-3, with an area of 0.025 in each tail).

Example Page 343, # Using the TI: 2nd VARS

Example Page 343, #8 (A) Find the critical value \(z_{\alpha/2}\), (B) find the critical value \(t_{\alpha/2}\), or (C) state that neither the normal nor the t distribution applies. 98%; n = 37; σ is unknown; population appears to be normally distributed. \(1 - \alpha = 0.98\), \(\alpha = 0.02\), df = n − 1 = 37 − 1 = 36. \(t_{\alpha/2} = t_{36,\,0.01} = 2.434\) (use Table A-3, with an area of 0.01 in each tail).

Example Page 343, #10 Use the given confidence level and sample data to find (a) the margin of error and (b) the confidence interval for the population mean μ. Assume that the population has a normal distribution. Elbow to fingertip length of men: 99% confidence level, n = 32, \(\bar{x} = 14.50\), s = 0.70. \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(t_{\alpha/2} = t_{31,\,0.005} = 2.744\). \(E = t_{\alpha/2}\,s/\sqrt{n} = 2.744(0.70)/\sqrt{32} = 0.34\). \(\bar{x} - E < \mu < \bar{x} + E\): \(14.50 - 0.34 < \mu < 14.50 + 0.34\), so \(14.16 < \mu < 14.84\).

Example Page 343, #10 Find the confidence interval using the TI: STAT / TESTS / 8:TInterval with confidence level 0.99, n = 32, \(\bar{x} = 14.50\), s = 0.70.
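A t-interval has the same shape as the z-interval, with s in place of σ and a t critical value. Python's standard library has no inverse t function, so in this sketch the critical value is supplied by the caller, for example from Table A-3 or the calculator. The function name is hypothetical; the inputs are n = 32, sample mean 14.50, s = 0.70, and the table value \(t_{31,\,0.005} = 2.744\).

```python
def t_interval_mean(x_bar, s, n, t_crit):
    """Return (lower limit, upper limit, margin of error) for mu, sigma unknown.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, looked up by the caller.
    """
    margin = t_crit * s / n ** 0.5
    return x_bar - margin, x_bar + margin, margin


low, high, margin = t_interval_mean(14.50, 0.70, 32, 2.744)
print(round(margin, 2), round(low, 2), round(high, 2))
```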

Lesson 6-4, Part 2 Estimating a Population mean: σ Not Known

Procedure for Constructing a Confidence Interval for μ, when σ is Unknown Identify the population of interest and the parameter you want to draw conclusion about. Choose the appropriate inference procedure. Verify the conditions for using the selected procedure. Carry out the inference. Interpret your results in the context of the problem.

Example Page 344, #14 A study was conducted to estimate hospital costs for accident victims who wore seat belts. Twenty randomly selected cases have a distribution that appears to be bell-shaped with a mean of $9004 and a standard deviation of $5629. A) Construct the 99% confidence interval for the mean of all such costs. Step 1 Identify the population of interest and the parameter you want to draw conclusions about. µ = mean hospital cost for accident victims who wore seat belts.

Example Page 344, #14 Step 2 Choose the appropriate inference procedure. Verify conditions for using the selected procedure. We will use a one-sample t-interval for the mean. Random sample: stated in the question. The value of σ is unknown. The question states that the distribution appears to be approximately normal.

Example Page 344, #14 Step 3 Carry out the inference procedure. n = 20, df = 19, \(\bar{x} = 9004\), s = 5629, \(t_{\alpha/2} = t_{19,\,0.005} = 2.861\). \(\bar{x} \pm t_{\alpha/2}\,s/\sqrt{n} = 9004 \pm 2.861(5629)/\sqrt{20}\), giving ($5403, $12,605).

Example Page 344, #14 Step 4 Interpret your results in the context of the problem. We are 99% confident that the mean hospital cost for all accident victims who wear seat belts is between $5403 and $12,605.

Example Page 344, #14 B) If you are a manager for an insurance company that provides lower rates for drivers who wear seat belts, and you want a conservative estimate for a worst-case scenario, what amount should you use as the possible hospital cost for an accident victim who wears seat belts? $12,605 is the high-end estimate for the long-run average hospital cost of such accident victims.

Example Page 344, #18 Listed below are measured amounts of lead (in micrograms per cubic meter) in the air. The Environmental Protection Agency has established an air quality standard for lead: 1.5 μg/m³. The measurements shown below were recorded at Building 5 of the World Trade Center site on different days immediately following the destruction caused by the terrorist attacks of September 11, 2001. After the collapse of the two World Trade Center buildings, there was considerable concern about the quality of the air. Use the given values to construct a 95% confidence interval estimate of the mean amount of lead in the air. Is there anything about this data set suggesting that the confidence interval might not be very good? Explain. 5.40 1.10 0.42 0.73 0.48 1.10

Example Page 344, #18 Step 1 Identify the population of interest and the parameter you want to draw conclusions about. µ = mean amount of lead in the air at the World Trade Center.

Example Page 344, #18 Step 2 Choose the appropriate inference procedure. Verify conditions for using the selected procedure. Use a one-sample t-interval. The measurements were not randomly selected, but they are treated as a representative sample. The value of σ is unknown. The sampling distribution does not appear to be approximately normal, since the box plot is skewed right with an outlier (see graph).

[Figure: box plot of the lead measurements (Mean_Amt_of_Lead_at_the_World_Trade_Center), horizontal axis from 0 to 6.]

Example Page 344, #18 Step 3 Carry out the inference procedure. n = 6, df = 5, \(\bar{x} = 1.538\), s = 1.914, \(t_{\alpha/2} = t_{5,\,0.025} = 2.571\). \(\bar{x} \pm t_{\alpha/2}\,s/\sqrt{n} = 1.538 \pm 2.571(1.914)/\sqrt{6}\), giving −0.471 < µ < 3.547 (micrograms per cubic meter).

Example Page 344, #18 Step 4 Interpret your results in the context of the problem. We are 95% confident that the mean amount of lead in the air at the World Trade Center is between −0.471 and 3.547 micrograms per cubic meter. Yes: 5 of the 6 sample values are below \(\bar{x}\), which raises a question about whether the data meet the requirement that the underlying population distribution is normal.
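The summary statistics and interval for the lead data can be recomputed from the raw measurements. This is a sketch with hypothetical names. Two assumptions are labeled in the code: the third measurement is taken as 0.42, the value consistent with the reported mean of 1.538, and the critical value 2.571 is the table value for 5 degrees of freedom rather than something computed here.

```python
from statistics import mean, stdev

# Lead measurements in micrograms per cubic meter.
# Assumption: the third value is 0.42, which matches the reported mean of 1.538.
lead = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]


def t_interval_from_data(data, t_crit):
    """Return (mean, sample std dev, lower limit, upper limit) for a t-interval."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_crit * s / n ** 0.5
    return x_bar, s, x_bar - margin, x_bar + margin


# Assumption: t_{5, 0.025} = 2.571 is supplied from a t table.
x_bar, s, low, high = t_interval_from_data(lead, 2.571)
print(round(x_bar, 3), round(s, 3), round(low, 3), round(high, 3))

# Count how many measurements fall below the mean, a quick check for skewness.
print(sum(value < x_bar for value in lead), "of", len(lead), "values are below the mean")
```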

Lesson 6-5 Estimating the Population Variance σ²

What is variance? Variance is built from deviations. A deviation is the difference between an observation and the mean. Since the mean represents the center of gravity, the sum of all deviations about the mean must equal zero.

Population Variance The population variance (σ²) of a variable is the sum of the squared deviations about the population mean divided by the number of observations in the population (N): \(\sigma^2 = \sum (x_i - \mu)^2 / N\). Population Standard Deviation: \(\sigma = \sqrt{\sum (x_i - \mu)^2 / N}\).

Assumptions The sample is a simple random sample. The population must have normally distributed values (even if the sample is large).

Chi-Square Distribution \(\chi^2 = (n - 1)s^2/\sigma^2\), where n = sample size, s² = sample variance, and σ² = population variance.
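A tiny numeric illustration of these definitions follows. The data set is hypothetical, chosen only so that the arithmetic comes out in whole numbers; `pvariance` and `pstdev` are the standard-library functions that divide by N.

```python
from statistics import mean, pvariance, pstdev

# Hypothetical population of N = 8 values (not from the slides).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(values)
deviations = [x - mu for x in values]

print(sum(deviations))    # deviations about the mean always sum to zero
print(pvariance(values))  # sum of squared deviations divided by N
print(pstdev(values))     # square root of the population variance
```

Because the raw deviations cancel, they are squared before averaging, which is why the variance is expressed in squared units and the standard deviation takes the square root.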

Properties of the Distribution of the Chi-Square Statistics The chi-square distribution is not symmetric, unlike the normal and Student t distribution. As the number of degrees of freedom increases, the distribution becomes more symmetric.

Properties of the Distribution of the Chi-Square Statistics The values of chi-square can be zero or positive, but they cannot be negative. The chi-square distribution is different for each number of degrees of freedom, which is df = n − 1 in this section. As the number increases, the chi-square distribution approaches a normal distribution. In Table A-4, each critical value of χ² corresponds to an area given in the top row of the table, and that area represents the total region located to the right of the critical value.

[Figure: Chi-Square Distribution with Critical Values (use Table A-4). The left critical value \(\chi^2_L\) and the right critical value \(\chi^2_R\) enclose a central area of \(1 - \alpha\), with \(\alpha/2\) in each tail.]

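Example Page 355, # Find the critical values \(\chi^2_L\) and \(\chi^2_R\) that correspond to the given confidence level and sample size. 95%; n = 51. \(1 - \alpha = 0.95\), \(\alpha = 0.05\), \(\alpha/2 = 0.025\), df = 50. Right critical value: the area to the right is 0.025, so \(\chi^2_R = 71.420\). Left critical value: the area to the right is 1 − 0.025 = 0.975, so \(\chi^2_L = 32.357\).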

Estimators of σ² The sample variance s² is the best point estimate of the population variance σ².

Confidence Interval for the Population Variance σ² \((n-1)s^2/\chi^2_R < \sigma^2 < (n-1)s^2/\chi^2_L\). For the population standard deviation σ, take square roots: \(\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}\).

Example Page 355, #6 Find the confidence interval. Use the given confidence level and sample data to find a confidence interval for the population standard deviation. In each case assume that a simple random sample has been selected from a population that has a normal distribution. Ages of drivers occupying the passing lane while driving 5 mi/h with the left signal flashing: 99% confidence; n = 27, \(\bar{x} = 80.5\) years, s = 4.6 years. \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(\alpha/2 = 0.005\). \(\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}\)

Example Page 355, #6 With df = 26: \(\chi^2_R = \chi^2_{0.005} = 48.290\) and \(\chi^2_L = \chi^2_{0.995} = 11.160\). \(\sqrt{(27-1)(4.6)^2/48.290} < \sigma < \sqrt{(27-1)(4.6)^2/11.160}\), so 3.4 years < σ < 7.0 years.
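The interval for σ can be sketched as below. The chi-square critical values are not computed here; they are passed in from a table, and the function name is hypothetical. The inputs assume a sample of size n = 27 (26 degrees of freedom), for which the 99% table values are 48.290 and 11.160, with s = 4.6 years.

```python
def sigma_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for the population standard deviation.

    chi2_right and chi2_left are the right and left chi-square critical
    values for n - 1 degrees of freedom, looked up by the caller.
    """
    df = n - 1
    low = (df * s ** 2 / chi2_right) ** 0.5   # larger critical value gives the lower limit
    high = (df * s ** 2 / chi2_left) ** 0.5   # smaller critical value gives the upper limit
    return low, high


low, high = sigma_interval(27, 4.6, 48.290, 11.160)
print(round(low, 1), round(high, 1))
```

The interval is not symmetric about s, because the chi-square distribution itself is not symmetric.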

Procedure for Constructing a Confidence Interval for σ Identify the population of interest and the parameter you want to draw conclusion about. Choose the appropriate inference procedure. Verify the conditions for using the selected procedure. Carry out the inference. Interpret your results in the context of the problem.

Example Page 356, #14 A container of car antifreeze is supposed to hold 3785 ml of the liquid. Realizing that fluctuations are inevitable, the quality-control manager wants to be quite sure that the standard deviation is less than 30 ml. Otherwise, some containers would overflow while others would not have enough of the coolant. She selects a simple random sample, with the results given here. Use these sample results to construct the 99% confidence interval for the true value of σ. Does this confidence interval suggest that the fluctuations are at an acceptable level? 3761 3861 3769 3772 3675 3861 3888 3819 3788 3800 3720 3748 3753 3821 3811 3740 3740 3839. n = 18, \(\bar{x} = 3787.0\), s = 55.4

Example Page 356, #14 Step 1 Identify the population of interest and the parameter you want to draw conclusions about. σ = standard deviation of the amount of antifreeze per container. Step 2 Choose the appropriate inference procedure. Verify conditions for using the selected procedure. Use a chi-square interval. Conditions: The question states that the sample is an SRS. The histogram of the data is approximately normal.


Example Page 356, #14 Step 3 Carry out the inference procedure. n = 18, \(\bar{x} = 3787.0\), s = 55.4, confidence level 99%. \(1 - \alpha = 0.99\), \(\alpha = 0.01\), \(\alpha/2 = 0.005\), df = 17: \(\chi^2_R = \chi^2_{0.005} = 35.718\) and \(\chi^2_L = \chi^2_{0.995} = 5.697\). \(\sqrt{(18-1)(55.4)^2/35.718} < \sigma < \sqrt{(18-1)(55.4)^2/5.697}\), so 38.2 mL < σ < 95.7 mL.

Example Page 356, #14 Step 4 Interpret your results in the context of the problem. We are 99% confident that the standard deviation of the amount of antifreeze per container is between 38.2 ml and 95.7 ml. No, the interval indicates 99% confidence that σ > 30 ml (the fluctuations appear to be too high).