Statistical Inference

Similar documents
Notes 3: Statistical Inference: Sampling, Sampling Distributions Confidence Intervals, and Hypothesis Testing

1 Binomial Probability [15 points]

STA Module 10 Comparing Two Proportions

Estimating a Population Mean

Confidence intervals

Statistical inference provides methods for drawing conclusions about a population from sample data.

CHAPTER 10 Comparing Two Populations or Groups

Confidence Intervals for Population Mean

Chapter 22. Comparing Two Proportions 1 /29

Harvard University. Rigorous Research in Engineering Education

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean

Statistic: a that can be from a sample without making use of any unknown. In practice we will use to establish unknown parameters.

MAT Mathematics in Today's World

One-sample categorical data: approximate inference

Comparing Means from Two-Sample

Sampling Distribution Models. Chapter 17

Central Limit Theorem Confidence Intervals Worked example #6. July 24, 2017

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

CHAPTER 7 THE SAMPLING DISTRIBUTION OF THE MEAN. 7.1 Sampling Error; The need for Sampling Distributions

Probability and Statistics

8.2 Margin of Error, Sample Size (old 8.3)

Business Statistics. Lecture 5: Confidence Intervals

Do students sleep the recommended 8 hours a night on average?

CSE 103 Homework 8: Solutions November 30, var(x) = np(1 p) = P r( X ) 0.95 P r( X ) 0.

Statistical Intervals (One sample) (Chs )

Sampling Distributions

Chapter 22. Comparing Two Proportions 1 /30

Last few slides from last time

Chapter 8: Estimating with Confidence

Chapter 23: Inferences About Means

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Two-sample inference: Continuous data

Chapter 18. Sampling Distribution Models /51


CENTRAL LIMIT THEOREM (CLT)

Ch. 7 Statistical Intervals Based on a Single Sample

Statistics for IT Managers

Hypothesis Testing and Confidence Intervals (Part 2): Cohen s d, Logic of Testing, and Confidence Intervals

Matrices, Row Reduction of Matrices

Lecture 22/Chapter 19 Part 4. Statistical Inference Ch. 19 Diversity of Sample Proportions

The Mean Value Theorem Rolle s Theorem

Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc.

Inferential Statistics. Chapter 5

Interpret Standard Deviation. Outlier Rule. Describe the Distribution OR Compare the Distributions. Linear Transformations SOCS. Interpret a z score

Confidence intervals CE 311S

Inferential Statistics

Confidence Intervals. - simply, an interval for which we have a certain confidence.

Statistics and Quantitative Analysis U4320. Segment 5: Sampling and inference Prof. Sharyn O Halloran

SECTION 3.5: DIFFERENTIALS and LINEARIZATION OF FUNCTIONS

5/15/2017. Sample Size and Power. Henry Glick May 12, Goals of Sample Size and Power Analysis

AP Statistics Ch 12 Inference for Proportions

Ch Inference for Linear Regression

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.

Sampling Distributions

Ch. 7: Estimates and Sample Sizes

Two-sample inference: Continuous data

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning

Boyle s Law and Charles Law Activity

F = ma W = mg v = D t

2. If the values for f(x) can be made as close as we like to L by choosing arbitrarily large. lim

Chapter 12: Inference about One Population

INTERVAL ESTIMATION AND HYPOTHESES TESTING

211 Real Analysis. f (x) = x2 1. x 1. x 2 1

Introduction to Design of Experiments

Chapter 16. Simple Linear Regression and Correlation

Business Statistics. Lecture 10: Course Review

10.1. Comparing Two Proportions. Section 10.1

CH 14 MORE DIVISION, SIGNED NUMBERS, & EQUATIONS

1 Matched pair comparison(p430-)

Confidence Intervals with σ unknown

Business Statistics. Lecture 3: Random Variables and the Normal Distribution

7 Estimation. 7.1 Population and Sample (P.91-92)

AP Statistics Review Ch. 7

Chapter 24. Comparing Means

Northwood High School Algebra 2/Honors Algebra 2 Summer Review Packet

NOTES: EXPONENT RULES

3.3 Estimator quality, confidence sets and bootstrapping

Assessment Schedule 2011 Statistics and Modelling: Calculate confidence intervals for population parameters (90642)

Chapter 5 Simplifying Formulas and Solving Equations

Lecture 8 Hypothesis Testing

Experiment 2 Random Error and Basic Statistics

Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing

Week 11 Sample Means, CLT, Correlation

Linear Regression with one Regressor

Ch 7: Dummy (binary, indicator) variables

Examine characteristics of a sample and make inferences about the population

Chapter 5 Confidence Intervals

Linear Classifiers and the Perceptron

Chapter 1 Review of Equations and Inequalities

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY

Ch18 links / ch18 pdf links Ch18 image t-dist table

Supporting Australian Mathematics Project. A guide for teachers Years 11 and 12. Probability and statistics: Module 25. Inference for means

Physics 225 Relativity and Math Applications. Fall Unit 7 The 4-vectors of Dynamics

Lecture 8 Sampling Theory

SECTION 3.5: DIFFERENTIALS and LINEARIZATION OF FUNCTIONS

Basic Math Problems Unit 1

QUADRATICS 3.2 Breaking Symmetry: Factoring

9. Linear Regression and Correlation

Chapter 16. Simple Linear Regression and dcorrelation

Transcription:

Chapter 14 Confidence Intervals: The Basic Statistical Inference Situation: We are interested in estimating some parameter (population mean, μ) that is unknown. We take a random sample from this population. Goal: Draw inference about the population parameter from the sample data. Example: Beetle cars Confidence Intervals Suppose EPA wants to estimate μ, the average CO 2 emitted by all Beetle cars. 49 Beetle cars are randomly selected at the VW plant in Detroit and tested for CO 2 emissions. The test results show these 49 cars have an average of 1.5 grams CO 2 emission per mile. What is our best guess about the unknown parameter μ? But we need to be careful in making a conclusion about μ because We know any other random sample of 49 cars would give a different value of We can t say how much confident we are with our estimate. So instead of just giving one number (the value of x ) as our estimate of μ, it seems more desirable to give an of values that may contain μ with some degree of. Let s try to find an interval of values within which we can say that average CO 2 emission (μ) of Beetle cars lies with some high degree of confidence, say 95%. For this, let s recall from Chapter 11 how x behaves in repeated sampling. 1

In the Beetle cars example, suppose we repeatedly sample 49 cars and for each sample note their mean CO 2 emission. We know that the distribution of sample mean, x is then Assume we know that the standard deviation (σ) of CO 2 emissions of Beetle cars is 0.84 (it is unrealistic to assume σ to be known, we would get rid of this assumption in Chapter 18!). Then, Using the 68-95-99.7 rule, we know that x will fall within 2 standard deviations σ 2 = of the population mean μ with probability n This means that in 95% of all samples, the observed mean score x will be within points of the population mean μ. (Note: μ is unknown and fixed - it doesn t vary from sample to sample). To say that x is within 0.24 points of μ is equivalent to saying that μ is within 0.24 points of observed x. This happens in 95% of all samples. 2

Combining these facts we can say, In 95% of all samples (of size n = 49) the true but unknown mean μ lies in the interval. We can rewrite this interval as Recall that mean of our sample was 1.5 grams. So we say that We are 95% confident that This interval we just calculated is a 95% confidence interval for the unknown average CO 2 emission (μ) of all Beetle cars. In general, confidence intervals for any parameter consists of two parts: 1) An interval calculated from the data of the form The margin of error conveys how accurate we believe our guess of the true parameter value is, based on the variability of the estimate. 2) A confidence level, C, gives the probability that the random interval captures the true parameter value in repeated samples. (Note that it is NOT the probability that any one specific interval calculated from a random sample captures the true parameter.) Many types of confidence intervals exist for various kinds of parameters. Ch. 14: Confidence Intervals for mean μ (s.d. σ is known). Ch. 18: Confidence Intervals for mean μ (s.d. σ is unknown). Ch. 19: Confidence Intervals for difference between two population means. Ch. 20: Confidence Intervals for population proportion. Ch. 21: Confidence Intervals for comparing two proportions. We will concentrate on confidence intervals for the mean μ of a population. 3

Confidence Intervals for a Population Mean (standard deviation is known) Confidence Intervals for a Population Mean Choose an SRS of size n from a population having unknown mean μ and known standard deviation σ. A level C confidence interval for μ is Here, z* is the value on the standard normal curve with area C between -z* and z*. The interval is exact when the original population distribution is normal and is approximately correct for large n otherwise. The most commonly used confidence levels are C 90% 95% 99% z* z* for other confidence levels can be found similarly from Table A or more conveniently from Table C. 4

Ex: Beetle Cars Recall that the CO 2 emissions of Beetle cars have a mean μ and standard deviation σ = 0.84. Our sample mean was x = 1.5 and n = 49. We had calculated the 95% confidence interval for the mean CO 2 emission as a. Find a 90% confidence interval for the mean CO 2 emission. b. Find a 99% confidence interval for the mean CO 2 emission. c. Find an 80% confidence interval for the mean CO 2 emission. Note: We don t know if any of the above confidence intervals contain μ or not! Then what do we mean by confidence? 5

The meaning of Confidence : When we say 95% confident, we mean that Our confidence is in the procedure, not in any one specific interval. So it is completely wrong to say: (because any one interval either contains the parameter or not, there is no randomness in it!) Remember that probability (chance) is associated only with a random phenomenon. After you have constructed a CI from a random sample, there is no randomness left it. Hence it doesn t make sense to attach any probability statement to a specific (numerical) CI. 6

Behavior of Confidence Intervals What happens to the margin of error if we increase the confidence level C? Does it increase, decrease or stay the same? (Hint: what happens to the value of z*?) How does this affect the width of the resulting confidence interval? Note the tradeoff: We would like to have a smaller margin of error (narrower interval) as well as high confidence but Q: What could we do to get a narrower interval (smaller margin of error) without lowering confidence? E.g. in the Beetle cars example the 99% confidence interval for μ was found to be with a sample of size 49. If instead we had taken a sample of size 100 cars and suppose their mean CO 2 emission was 1.5 grams, then How does the size of σ affect the margin of error? Thus we have 3 ways of reducing the width of the confidence interval: 1) 2) 3) 7

Choosing the Sample Size We saw that we can have high degree of confidence as well as small margin of error by Usually researchers will have a desired confidence level and margin of error they want to attain. So one aspect of designing any study is to decide the number of observations needed. Let m represent the desired margin of error. Recall the formula of margin of error: Solving for n we get: *****Always round your answer up!!***** Ex: Suppose PGSA (Poor Graduate Students Association) at the Texas state wants to estimate the mean monthly income of SMU graduate students within $100 with 95% confidence. How many students should PGSA sample? Assume that the standard deviation of incomes of SMU graduate students is $421. 8