STA 291 Lecture 16. Normal distributions: (mean and SD) use table or web page. The sampling distributions of $\hat{p}$ and $\bar{X}$ are both (approximately) normal


Sampling Distributions

Sampling distribution of $\hat{p}$
Sampling distribution of $\bar{X}$

Central limit theorem: no matter what the population looks like, as long as we use a simple random sample (SRS) and the sample size n is large, the above two sampling distributions are (very close to) normal.

$\hat{p}$ is approximately normal with mean $= p$ and SD $= \sqrt{p(1-p)/n}$.

$\bar{X}$ is approximately normal with mean $= \mu$ and SD $= \sigma/\sqrt{n}$.
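The statement above can be checked with a small simulation. The sketch below is only an illustration, not part of the lecture: it assumes an exponential population (mean 1, SD 1, strongly right-skewed), and the helper names `sample_means` and `skewness` are made up for this example. As n grows, the skewness of the sample means should shrink toward 0 and their SD should approach 1/sqrt(n).

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` random samples of size n from an exponential population
    (mean 1, SD 1, right-skewed) and return the list of sample means."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

def skewness(values):
    """Skewness coefficient: near 0 for a symmetric, bell-shaped distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(291)  # arbitrary seed so the run is repeatable
for n in (1, 4, 30, 100):
    means = sample_means(n, 5000, rng)
    # columns: n, mean of the sample means, SD of the sample means, skewness
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(skewness(means), 3))
```

The population here is an assumption chosen because it is far from normal; any other skewed population could be substituted.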

Central Limit Theorem

For random sampling, as the sample size n grows, the sampling distribution of the sample mean $\bar{X}$ approaches a normal distribution. So does that of the sample proportion $\hat{p}$.

Amazing: this is the case even if the population distribution is discrete or highly skewed.

The Central Limit Theorem can be proved mathematically. We will verify it experimentally in the lab sessions.

Central Limit Theorem

Online applet 1: http://www.stat.sc.edu/~west/javahtml/clt.html
Online applet 2: http://bcs.whfreeman.com/scc/content/cat_040/spt/CLT-SampleMean.html

Population distribution vs. sampling distribution

Population distribution = distribution of a sample of size one from the population.

In a simple random sample of size 4, $X_1, X_2, X_3, X_4$, each one has the distribution of the population. But the average of the 4 has a different distribution: the sampling distribution of the mean when n = 4.

$\bar{X} = \dfrac{X_1 + X_2 + X_3 + X_4}{4}$

has a distribution different from the population distribution: (1) the shape is more normal; (2) the mean remains the same; (3) the SD is smaller (only half of the population SD, since $\sigma/\sqrt{4} = \sigma/2$).

Population Distribution

The distribution from which we select the sample. It is unknown; we want to make inferences about its parameters: Mean = ? Standard Deviation = ?

Sample Statistic

From the sample $X_1, \ldots, X_n$ we compute descriptive statistics:
Sample Mean $= \bar{X}$
Sample Standard Deviation $= s$
Sample Proportion $= \hat{p}$
They all can be computed given a sample.
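The claim above, that the average of four observations keeps the population mean but has only half the population SD, can be illustrated numerically. This is a minimal sketch under an assumed population (rolls of a fair six-sided die, which is discrete and not normal); the variable names are chosen for this example only.

```python
import random
import statistics

rng = random.Random(16)  # arbitrary seed so the run is repeatable

faces = [1, 2, 3, 4, 5, 6]                  # assumed population: a fair die
population_mean = statistics.fmean(faces)   # 3.5
population_sd = statistics.pstdev(faces)    # about 1.71

# Repeatedly take a sample of size 4 and record its mean.
means_of_4 = [statistics.fmean([rng.choice(faces) for _ in range(4)])
              for _ in range(20000)]

mean_of_means = statistics.fmean(means_of_4)
sd_of_means = statistics.stdev(means_of_4)

# First line: population mean vs. mean of the sample means (should agree).
print(round(population_mean, 2), round(mean_of_means, 2))
# Second line: half the population SD vs. SD of the sample means (should agree).
print(round(population_sd / 2, 3), round(sd_of_means, 3))
```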

Sampling Distribution of a sample statistic

The probability distribution of a statistic (for example, the sample mean).

It describes the pattern that would occur if we could repeatedly take random samples and calculate the statistic as often as we wanted.

It is used to determine the probability that a statistic falls within a certain distance of the population parameter.

The mean of the sampling distribution of $\bar{X}$ is $\mu$.
The SD of $\bar{X}$, also called the standard error, is $\sigma/\sqrt{n}$.

The 3 features of the sampling distribution of the sample mean also apply to the sample proportion: (1) it approaches normal; (2) it is centered at p; (3) its SD gets smaller and smaller as n grows.

This sampling distribution tells us how far apart or how close together p and $\hat{p}$ are. One quantity we can compute; the other we want to know.

Central Limit Theorem

For example: if the sample size is n = 100, then the sampling distribution of $\hat{p}$ has mean p and SD (or standard error)

$\sqrt{\dfrac{p(1-p)}{100}} = \dfrac{\sqrt{p(1-p)}}{10}$
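The standard error formula for the sample proportion is a special case of the formula for the sample mean. A short derivation, assuming each observation $X_i$ is coded 1 for YES (with probability p) and 0 otherwise, and that the observations are independent:

```latex
% Assumption: X_i = 1 with probability p, X_i = 0 with probability 1 - p.
\begin{align*}
\hat{p} &= \frac{X_1 + \cdots + X_n}{n} = \bar{X}, \\
\mu &= E(X_i) = 1 \cdot p + 0 \cdot (1 - p) = p, \\
\sigma^2 &= E(X_i^2) - \mu^2 = p - p^2 = p(1 - p), \\
\mathrm{SD}(\hat{p}) &= \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{p(1 - p)}{n}}.
\end{align*}
```

With n = 100 the last line gives $\sqrt{p(1-p)}/10$, matching the example above.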

Preview of estimation of p

Estimation with error bound: suppose we counted 57 YES in 100 interviews (SRS).

We know just how far $\hat{p}$ is from p (that is given by the sampling distribution): 95% of the time, $\hat{p}$ is going to fall within two SD of p.

SD = ?

$\hat{p} = 57/100 = 0.57$

Since p is unknown, we use $\hat{p}$ in its place to estimate the SD:

$\sqrt{0.57(1-0.57)} = 0.495$

$\dfrac{\sqrt{\hat{p}(1-\hat{p})}}{10} = 0.495/10 = 0.0495$

Finally, with 95% probability, the difference between p and $\hat{p}$ is within 2 SD, or 2(0.0495) = 0.099.
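The arithmetic in this preview can be reproduced in a few lines. The function name `proportion_error_bound` and its parameters are hypothetical, introduced only for this sketch; it uses the sample proportion in place of the unknown p, as the slides do.

```python
import math

def proportion_error_bound(yes, n, multiplier=2.0):
    """Return (p_hat, estimated SD, error bound) for `yes` successes in n trials.

    The SD is estimated by plugging p_hat into sqrt(p(1-p)/n), and the
    error bound is `multiplier` times that SD (2 SD for roughly 95%).
    """
    p_hat = yes / n
    sd = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, sd, multiplier * sd

p_hat, sd, bound = proportion_error_bound(57, 100)
print(round(p_hat, 2), round(sd, 4), round(bound, 3))   # 0.57 0.0495 0.099
print(round(p_hat - bound, 3), round(p_hat + bound, 3))  # 0.471 0.669
```

Read this way, the survey suggests that p lies somewhere between about 0.47 and 0.67.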

Multiple choice question

The standard error of a statistic describes
1. The standard deviation of the sampling distribution of that statistic
2. The variability in the values of the statistic for repeated random samples of size n

Both are true.

Multiple Choice Question

The Central Limit Theorem implies that
1. All variables have approximately bell-shaped sample distributions if a random sample contains at least 30 observations
2. Population distributions are normal whenever the population size is large
3. For large random samples, the sampling distribution of the sample mean (X-bar) is approximately normal, regardless of the shape of the population distribution
4. The sampling distribution looks more like the population distribution as the sample size increases

In the previous question, option 3 is correct.

Chapter 10 Statistical Inference: Estimation of p

Inferential statistical methods provide predictions about characteristics of a population, based on information in a sample from that population.
For quantitative variables, we usually estimate the population mean (for example, mean household income).
For qualitative variables, we usually estimate population proportions (for example, the proportion of people voting for candidate A).

Two Types of Estimators

Point Estimate
A single number that is the best guess for the parameter. For example, the sample mean is usually a good guess for the population mean.

Interval Estimate (harder) = point estimator with error bound
A range of numbers around the point estimate, to give an idea about the precision of the estimator. For example, the proportion of people voting for A is between 67% and 73%.

Point Estimator

A point estimator of a parameter is a sample statistic that predicts the value of that parameter.

A good estimator is
Unbiased: centered around the true parameter
Consistent: gets closer to the true parameter as the sample size gets larger
Efficient: has a standard error that is as small as possible (makes use of all available information)

Efficiency

An estimator is efficient if its standard error is small compared to other estimators. Such an estimator has high precision.
A good estimator has small standard error and small bias (or no bias at all).
The following pictures represent different estimators with different bias and efficiency. Assume that the true population parameter is the point (0,0) in the middle of the picture.

Bias and Efficiency

Note that even an unbiased and efficient estimator does not always hit the population parameter exactly. But in the long run, it is the best estimator.

The sample proportion is an unbiased estimator of the population proportion. It is consistent and efficient.

Example: Estimators

Suppose we want to estimate the proportion of UK students voting for candidate A. We take a random sample of size n = 400. The sample is denoted $X_1, X_2, \ldots, X_n$, where $X_i = 1$ if the ith student in the sample votes for A, and $X_i = 0$ otherwise.

Example: Estimators

Estimator1 = the sample proportion
Estimator2 = the answer from the first student in the sample ($X_1$)
Estimator3 = 0.3

Which estimator is unbiased? Which estimator is consistent? Which estimator has high precision (small standard error)?

Attendance Survey Question

On a 4 x 6 index card, please write down your name and section number.
Today's question: Table or web page for the Normal distribution? Which one do you like better?
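Returning to the three estimators in the example above, one way to explore the questions is to simulate many samples and look at the center and spread of each estimator. This is a sketch only: the true proportion 0.4 is a made-up value (in practice it is unknown), and the list names are chosen for this illustration.

```python
import random
import statistics

rng = random.Random(400)  # arbitrary seed so the run is repeatable
true_p = 0.4              # hypothetical true proportion, for illustration only
n = 400

est1, est2, est3 = [], [], []
for _ in range(5000):
    # X_i = 1 if the ith sampled student votes for A, 0 otherwise
    sample = [1 if rng.random() < true_p else 0 for _ in range(n)]
    est1.append(sum(sample) / n)   # Estimator1: the sample proportion
    est2.append(sample[0])         # Estimator2: the first student's answer
    est3.append(0.3)               # Estimator3: always 0.3

for name, values in (("Estimator1", est1),
                     ("Estimator2", est2),
                     ("Estimator3", est3)):
    # columns: name, average value (center), SD across samples (standard error)
    print(name,
          round(statistics.fmean(values), 3),
          round(statistics.pstdev(values), 3))
```

Under these assumptions, Estimator1 and Estimator2 are both centered at the true proportion, but only Estimator1 has a small standard error; Estimator2 does not improve as n grows. Estimator3 has no variability at all but is off target unless the true proportion happens to be 0.3.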

Central Limit Theorem

Usually, the sampling distribution of $\bar{X}$ is approximately normal for n = 30 or above.

In addition, we know that the parameters of the sampling distribution are $\mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

For example: if the sample size is n = 49, then the sampling distribution of $\bar{X}$ has mean $\mu$ and SD (or standard error) $\sigma/\sqrt{49} = \sigma/7$.

Cont.

Using the empirical rule: with 95% probability, $\bar{X}$ will fall within 2 SD of its center, $\mu$. (Since the sampling distribution is approximately normal, the empirical rule applies. In fact, 2 SD should be refined to 1.96 SD.)

With 95% probability, $\bar{X}$ falls between

$\mu - 1.96\dfrac{\sigma}{\sqrt{n}} = \mu - \dfrac{1.96}{7}\sigma = \mu - 0.28\sigma$

and

$\mu + 1.96\dfrac{\sigma}{\sqrt{n}} = \mu + \dfrac{1.96}{7}\sigma = \mu + 0.28\sigma$

($\mu$ = population mean, $\sigma$ = population standard deviation)

Unbiased

An estimator is unbiased if its sampling distribution is centered around the true parameter. For example, we know that the mean of the sampling distribution of $\bar{X}$ equals $\mu$, which is the true population mean. So $\bar{X}$ is an unbiased estimator of $\mu$.
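Both points above, the 95% range for n = 49 and the fact that the sampling distribution of the sample mean is centered at the population mean, can be checked by simulation. The population values below (normal with mean 100 and SD 15) are assumptions picked for this sketch, not values from the lecture.

```python
import math
import random
import statistics

rng = random.Random(49)           # arbitrary seed so the run is repeatable
mu, sigma, n = 100.0, 15.0, 49    # hypothetical population mean, SD, sample size

half_width = 1.96 * sigma / math.sqrt(n)   # 1.96 * sigma / 7 = 0.28 * sigma

reps = 10000
x_bars = [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
          for _ in range(reps)]

# Fraction of sample means that land within mu +/- 0.28 * sigma
coverage = sum(1 for x in x_bars if abs(x - mu) <= half_width) / reps

print(round(half_width / sigma, 2))          # the 0.28 multiplier
print(round(coverage, 3))                    # should be close to 0.95
print(round(statistics.fmean(x_bars), 2))    # should be close to mu
```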

Unbiased

However, for any particular sample, the sample mean $\bar{X}$ may be smaller or greater than the population mean. Unbiased means that there is no systematic under- or overestimation.

Biased

A biased estimator systematically under- or overestimates the population parameter.
The definitions of the sample variance and sample standard deviation use n-1 instead of n in the denominator, because this makes the variance estimator unbiased. With n in the denominator, it would systematically underestimate the variance.

Point Estimators of the Mean and Standard Deviation

The sample mean is unbiased, consistent, and (often) relatively efficient for estimating $\mu$.
The sample standard deviation is almost unbiased for estimating the population SD (no easy unbiased estimator exists).
Both are consistent.
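The effect of dividing by n versus n-1 shows up clearly for small samples. The following sketch assumes a normal population with mean 0 and variance 4 and a sample size of 5, all chosen for illustration; it relies on Python's `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n-1).

```python
import random
import statistics

rng = random.Random(32)   # arbitrary seed so the run is repeatable
pop_sd = 2.0              # hypothetical population SD, so the variance is 4
n = 5                     # a small sample makes the bias easy to see

var_n, var_n_minus_1, sds = [], [], []
for _ in range(20000):
    sample = [rng.gauss(0.0, pop_sd) for _ in range(n)]
    var_n.append(statistics.pvariance(sample))          # denominator n
    var_n_minus_1.append(statistics.variance(sample))   # denominator n - 1
    sds.append(statistics.stdev(sample))                # sqrt of the n - 1 version

mean_var_n = statistics.fmean(var_n)
mean_var_n_minus_1 = statistics.fmean(var_n_minus_1)
mean_sd = statistics.fmean(sds)

print(round(mean_var_n, 2))           # near (n-1)/n * 4 = 3.2: underestimates
print(round(mean_var_n_minus_1, 2))   # near 4: unbiased for the variance
print(round(mean_sd, 2))              # a little below 2: "almost unbiased"
```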