Statistical Intervals (One sample) (Chs )

Similar documents
Confidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean

Introduction to Statistical Inference and Confidence Intervals

Statistical Inference with Small Samples

Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc.

Applied Statistics I

7.1 Basic Properties of Confidence Intervals

Ch. 7 Statistical Intervals Based on a Single Sample

6 Central Limit Theorem. (Chs 6.4, 6.5)

Lecture 26: Chapter 10, Section 2 Inference for Quantitative Variable Confidence Interval with t

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

Ch18 links / ch18 pdf links Ch18 image t-dist table

Business Statistics. Lecture 10: Course Review

Confidence intervals

The Chi-Square Distributions

Ch. 7: Estimates and Sample Sizes

Confidence intervals CE 311S

Lecture 12: Small Sample Intervals Based on a Normal Population Distribution

Section B. The Theoretical Sampling Distribution of the Sample Mean and Its Estimate Based on a Single Sample

Continuous random variables

Tables Table A Table B Table C Table D Table E 675

appstats27.notebook April 06, 2017

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean

Stat 427/527: Advanced Data Analysis I

Confidence Intervals. - simply, an interval for which we have a certain confidence.

Exploring Operations Involving Complex Numbers. (3 + 4x) (2 x) = 6 + ( 3x) + +

Chapter 27 Summary Inferences for Regression

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

7 Estimation. 7.1 Population and Sample (P.91-92)

Week 11 Sample Means, CLT, Correlation

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Advanced Herd Management Probabilities and distributions

CH.8 Statistical Intervals for a Single Sample

Chapter 8: Confidence Intervals

df=degrees of freedom = n - 1

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

The Chi-Square Distributions

Continuous Probability Distributions

IV. The Normal Distribution

CENTRAL LIMIT THEOREM (CLT)

Content by Week Week of October 14 27

Practice Problems Section Problems

Ch. 7. One sample hypothesis tests for µ and σ

Solutions to Practice Test 2 Math 4753 Summer 2005

Harvard University. Rigorous Research in Engineering Education

Chapter 5. Means and Variances

Chapter 23: Inferences About Means

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

Chapter 8: Estimating with Confidence

Statistical Inference

Chapter 18. Sampling Distribution Models. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Everything is not normal

IV. The Normal Distribution

Review of Statistics 101

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

EXPERIMENT: REACTION TIME

Week 1 Quantitative Analysis of Financial Markets Distributions A

Chapter 23. Inference About Means

Bayesian Inference for Normal Mean

CS 5014: Research Methods in Computer Science. Bernoulli Distribution. Binomial Distribution. Poisson Distribution. Clifford A. Shaffer.

Estimating a Population Mean

Introduction to Statistical Data Analysis Lecture 5: Confidence Intervals

Lecture 22/Chapter 19 Part 4. Statistical Inference Ch. 19 Diversity of Sample Proportions

Probability Distributions for Continuous Variables. Probability Distributions for Continuous Variables

Inference for Single Proportions and Means T.Scofield

Describing Distributions with Numbers

Simple linear regression

Mathematical statistics

Supporting Australian Mathematics Project. A guide for teachers Years 11 and 12. Probability and statistics: Module 25. Inference for means

Chapter 8 - Statistical intervals for a single sample

Introduction to Statistical Inference

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series

6. CONFIDENCE INTERVALS. Training is everything cauliflower is nothing but cabbage with a college education.

Business Statistics. Lecture 5: Confidence Intervals

Chapter 6. Estimates and Sample Sizes

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

Statistics for Managers Using Microsoft Excel 7 th Edition

Unit 4 Probability. Dr Mahmoud Alhussami

Nonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I

Introduction to Survey Analysis!

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

Chapter 18. Sampling Distribution Models /51

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Sampling Distributions of Statistics Corresponds to Chapter 5 of Tamhane and Dunlop

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Chapter 6 Continuous Probability Distributions

QUIZ 4 (CHAPTER 7) - SOLUTIONS MATH 119 SPRING 2013 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100%

Math Review Sheet, Fall 2008

INTRODUCTION TO ANALYSIS OF VARIANCE

Statistics and Data Analysis in Geology

Introduction to Computational Complexity

Lecture 8 Sampling Theory

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

1/11/2011. Chapter 4: Variability. Overview

GEOMETRIC -discrete A discrete random variable R counts number of times needed before an event occurs

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Ordinary Least Squares Regression Explained: Vartanian

Chapter 10: Inferences based on two samples

One-Sample and Two-Sample Means Tests

Transcription:

7 Statistical Intervals (One sample) (Chs 8.1-8.3)

Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to normally distributed with expected value µ and standard deviation Standardizing X by first subtracting its expected value and then dividing by its standard deviation yields the standard normal variable How big does our sample need to be if the underlying population is normally distributed? 2

Confidence Intervals Because the area under the standard normal curve between 1.96 and 1.96 is.95, we know: This is equivalent to: 3

Confidence Intervals The interval Is called the 95% confidence interval for the mean. This interval varies from sample to sample, as the sample mean varies. So, the interval itself is a random interval. 4

Confidence Intervals The CI interval is centered at the sample mean X and extends 1.96 to each side of X. The interval s width is 2 (1.96), which is not random;; only the location of the interval (its midpoint X) is random. 5

Confidence Intervals As we showed, for a given sample, the CI can be expressed as A concise expression for the interval is x ± 1.96 / p n where the left endpoint is the lower limit and the right endpoint is the upper limit. 6

Interpreting a Confidence Level We are 95% confident that the true parameter is in this interval What does that mean?? 7

Interpreting a Confidence Level A correct interpretation of 95% confidence relies on the long-run relative frequency interpretation of probability. In repeated sampling, 95% of the confidence intervals obtained from all samples will actually contain µ. The other 5% of the intervals will not. The confidence level is not a statement about any particular interval instead it pertains to what would happen if a very large number of like intervals were to be constructed using the same CI formula. 8

Interpreting a Confidence Level Demonstration through simulations 9

Other Levels of Confidence Probability of 1 a is achieved by using z a /2 in place of z.025 = 1.96 P ( z /2 apple Z apple z /2 )=1 where Z = X µ / p n 10

Other Levels of Confidence A 100(1 a )% confidence interval for the mean µ when the value of s is known is given by or, equivalently, by 11

Example A sample of 40 units is selected and diameter measured for each one. The sample mean diameter is 5.426 mm, and the standard deviation of measurements is 0.1mm. Let s calculate a confidence interval for true average hole diameter using a confidence level of 90%. What about the 99% confidence interval? What are the advantages and disadvantages to a wider confidence interval? 12

Sample size computation For a desired confidence level and interval width, we can determine the necessary sample size. Example: A response time is normally distributed with standard deviation of 25 milliseconds. A new system has been installed, and we wish to estimate the true average response time µ for the new environment. Assuming that response times are still normally distributed with s = 25, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10? 13

Unknown variance A difficulty in using our previous equation for confidence intervals is that it uses the value of s, which will rarely be known. 14

Unknown variance A difficulty in using our previous equation for confidence intervals is that it uses the value of s, which will rarely be known. In this instance, we need to work with the sample standard deviation s. Remember from our first lesson that the standard deviation is calculated as:! = #! $ = # (& ' &) $ * 1 15

Unknown variance A difficulty in using our previous equation for confidence intervals is that it uses the value of s, which will rarely be known. In this instance, we need to work with the sample standard deviation s. Remember from our first lesson that the standard deviation is calculated as:! = #! $ = # (& ' &) $ * 1 With this, we instead work with the standardized random variable: 16

Unknown mean and variance Previously, there was randomness only in the numerator of Z by virtue of, the estimator. In the new standardized variable, both value from one sample to another. and s vary in When n is large, the substitution of s for s adds little extra variability, so nothing needs to change. When n is smaller, the distribution of this new variable should be wider than the normal to reflect the extra uncertainty. (We talk more about this in a bit.) 17

A Large-Sample Interval for µ Proposition If n is sufficiently large (n>=30), the standardized random variable has approximately a standard normal distribution. This implies that is a large-sample confidence interval for µ with confidence level approximately 100(1 a )%. This formula is valid regardless of the population distribution for sufficiently large n. 18

n >= 30 n < 30 Underlying normal distribution Underlying non-normal distribution 19

n >= 30 n < 30 Underlying normal distribution Underlying non-normal distribution 20

n >= 30 n < 30 Underlying normal distribution Underlying non-normal distribution 21

n >= 30 n < 30 Underlying normal distribution Underlying non-normal distribution 22

A Small-Sample Interval for µ The CLT cannot be invoked when n is small, and we need to do something else when n < 30. When n < 30 and the underlying distribution is normal, we have a solution! 23

t Distribution The results on which large sample inferences are based introduces a new family of probability distributions called t distributions. When is the mean of a random sample of size n from a normal distribution with mean µ, the random variable has a probability distribution called a t Distribution with n 1 degrees of freedom (df). 24

Properties of t Distributions Figure below illustrates some members of the t-family 25

Properties of t Distributions Properties of t Distributions Let t n denote the t distribution with n df. 1. Each t n curve is bell-shaped and centered at 0. 2. Each t n curve is more spread out than the standard normal (z) curve. 3. As n increases, the spread of the corresponding t n curve decreases. 4. As n, the sequence of t n curves approaches the standard normal curve (so the z curve is the t curve with df = ). 26

Properties of t Distributions Let t a,n = the number on the measurement axis for which the area under the t curve with n df to the right of t a,n is a ;; t a,n is called a t critical value. For example, t.05,6 is the t critical value that captures an upper-tail area of.05 under the t curve with 6 df 27

Tables of t Distributions The probabilities of t curves are found in a similar way as the normal curve. Example: obtain t.05,15 28

The t Confidence Interval Let and s be the sample mean and sample standard deviation computed from the results of a random sample from a normal population with mean µ. Then a 100(1 a )% t-confidence interval for the mean µ is or, more compactly: 29

Example cont d GPA measurements for 23 students have a histogram that looks like this: GPAs Frequency 0 1 2 3 4 5 6 7 2.6 2.8 3.0 3.2 3.4 3.6 3.8 4.0 The sample mean is 3.146. The sample standard deviation is 0.308. Calculate a 90% CI for the mean GPA. GPA 30

Confidence Intervals for µ n >= 30 n < 30 Underlying normal distribution Underlying non-normal distribution 31

Confidence Intervals for µ n >= 30 n < 30 Underlying normal distribution Special Cases Underlying non-normal distribution 32

When the t-distribution doesn t apply When n < 30 and the underlying distribution is unknown, we have to: Make a specific assumption about the form of the population distribution and derive a CI based on that assumption. Use other methods (such as bootstrapping) to make reasonable confidence intervals. 33

A Confidence Interval for a Population Proportion Let p denote the proportion of successes in a population (e.g., individuals who graduated from college, computers that do not need warranty service, etc.). A random sample of n individuals is selected, and X is the number of successes in the sample. Then, X can be regarded as a Binomial rv with mean np and 34

A Confidence Interval for p Let p denote the proportion of successes in a population (e.g., individuals who graduated from college, computers that do not need warranty service, etc.). A random sample of n individuals is selected, and X is the number of successes in the sample. Then, X can be regarded as a Binomial rv with mean np and If both np ³ 10 and n(1-p) ³ 10, X has approximately a normal distribution. 35

A Confidence Interval for p The estimator of p is = X / n (the fraction of successes). has approximately a normal distribution, and Standardizing by subtracting p and dividing by then implies that And the CI is 36

A Confidence Interval for p The EPA considers indoor radon levels above 4 picocuries per liter (pci/l) of air to be high enough to warrant amelioration efforts. Tests in a sample of 200 homes found 127 (63.5%) of these sampled households to have indoor radon levels above 4 pci/l. Calculate the 99% confidence interval for the proportional of homes with indoor radon levels above 4 pci/l. 37

CIs for the Variance Let X 1, X 2,, X n be a random sample from a normal distribution with parameters µ and s 2. Then the r.v. 2 has a chi-squared ( ) probability distribution with n 1 df. (In this class, we don t consider the case where the data is not normally distributed.) 38

The Chi-Squared Distribution Definition Let v be a positive integer. The random variable X has a chi-squared distribution with parameter v if the pdf of X The parameter is called the number of degrees of freedom (df) of X. The symbol c 2 is often used in place of chi-squared. 39

CIs for the Variance The graphs of several Chi-square probability density functions are 40

CIs for the Variance Let X 1, X 2,, X n be a random sample from a normal distribution with parameters µ and s 2. Then has a chi-squared (χ 2 ) probability distribution with n 1 df. 41

CIs for the Variance The chi-squared distribution is not symmetric, so these tables contain values of both for a near 0 and 1 42

CIs for the Variance As a consequence Or equivalently Thus we have a confidence interval for the variance s 2. Taking square roots gives a CI for the standard deviation s. 43

Example The data on breakdown voltage of electrically stressed circuits are: breakdown voltage is approximately normally distributed. s 2 = 137,324.3 n = 17 44

Confidence Intervals in R 45