STAT 515 fa 2016 Lec Sampling distribution of the mean, part 2 (central limit theorem)


STAT 515 fa 2016 Lec 15-16
Karl B. Gregory
Monday, Sep 26th

Contents
1 The central limit theorem
1.1 The most important theorem in statistics
1.2 More adjectives for probability distributions
1.3 Central limit theorem for the sample proportion
1.4 Diagrams for $\bar X$ and $\hat p$
1.5 Further examples of Normal approximation

1 The central limit theorem

1.1 The most important theorem in statistics

Probably the most important theorem in statistics is the central limit theorem. This theorem tells us that the sample mean behaves like a Normal random variable if the sample size is large enough, even if the population itself is not Normal!

Theorem 1 (Central Limit Theorem). If $X$ has mean $\mu$ and variance $\sigma^2 < \infty$, then for a random sample $X_1, \dots, X_n$ of $X$ values, the sample mean
$$\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i$$
behaves more and more like a $\mathrm{Normal}(\mu, \sigma^2/n)$ random variable for larger and larger sample sizes $n$.

Example 1. Let $X$ be the marathon time of a randomly selected runner of the next Columbia marathon. The distribution is skewed to the right and has mean 4.5 hours and standard deviation 2 hours. Suppose you take a random sample of 30 finishers.

Question: What is the probability that the mean of the 30 times is less than 4.25 hours?

Answer: Even though the marathon times are not Normally distributed, the sample mean $\bar X$ should behave like a random variable with a $\mathrm{Normal}(\mu, \sigma^2/n)$, i.e. a $\mathrm{Normal}(4.5, 4/30)$, distribution. Thus we can get $P(\bar X < 4.25)$ using the Normal distribution:
$$Z = \frac{4.25 - 4.5}{\sqrt{4/30}} = -.68, \quad \text{and} \quad P(Z < -.68) = .2483.$$
So the answer is $P(\bar X < 4.25) = .2483$.

Example 2. Let $X$ be the number of ships which come through a set of locks in an afternoon, and suppose the mean and standard deviation of $X$ are 6 and 1, respectively. Suppose you observe on five randomly selected afternoons and compute $\bar X$, the mean of the numbers of ships you counted on the 5 afternoons.

Question: What is $P(\bar X > 7)$?

Answer: We cannot compute it, because the sample size is small, and the central limit theorem holds only for large sample sizes.

1.2 More adjectives for probability distributions

The Normal distribution has a bell-shaped probability density function. We often describe distributions by the way their probability density functions differ in shape from that of the Normal distribution: a left-skewed distribution produces more observations to the far left of the mean than the Normal distribution, a heavy-tailed distribution produces more extreme values far away from the mean in both directions, and a right-skewed distribution produces more observations to the far right of the mean than the Normal distribution.

The plots below show probability density functions for a left-skewed, a heavy-tailed, and a right-skewed distribution (solid lines) with the Normal probability density function (dashed line) overlaid. Below these plots are histograms from a sample of size $n = 500$ drawn from the respective distributions. In the bottom row of the figure, QQ plots are given comparing the quantiles of the sample to the quantiles of the Normal distribution.
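As a numerical check on Example 1, whose population is right-skewed in the sense just described, the sketch below computes the Normal approximation directly and compares it with a small simulation. The notes do not say which right-skewed distribution the marathon times follow, so the gamma population used here (with shape and scale chosen to give mean 4.5 and standard deviation 2) is purely an assumption for illustration, as are the function names.

```python
import math
import random
from statistics import NormalDist

MU, SIGMA, N = 4.5, 2.0, 30  # mean, SD, and sample size given in Example 1


def clt_probability(cutoff, mu=MU, sigma=SIGMA, n=N):
    """P(sample mean < cutoff) under the Normal(mu, sigma^2/n) approximation."""
    z = (cutoff - mu) / (sigma / math.sqrt(n))
    return NormalDist().cdf(z)


def simulated_probability(cutoff, reps=10000, seed=1):
    """Monte Carlo estimate of P(sample mean < cutoff).

    ASSUMPTION: the population is gamma-distributed (right-skewed).
    The notes only give its mean and SD, not its family.
    """
    rng = random.Random(seed)
    shape = (MU / SIGMA) ** 2   # gamma shape, so that shape * scale = 4.5
    scale = SIGMA ** 2 / MU     # gamma scale, so that shape * scale^2 = 4
    hits = 0
    for _ in range(reps):
        mean = sum(rng.gammavariate(shape, scale) for _ in range(N)) / N
        hits += mean < cutoff
    return hits / reps


print(round(clt_probability(4.25), 4))  # Normal approximation, unrounded z
print(simulated_probability(4.25))      # simulation under the assumed gamma
```

The first number comes out near .247 rather than .2483 because the code does not round $Z$ to two decimals before looking it up. The simulated value should land close to it, though the exact figure depends on the assumed population and the seed.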

[Figure: three columns of panels titled "left skewed", "heavy tailed", and "right skewed"; the QQ plots in the bottom row have axes "Normal quantiles" and "Sample quantiles".]

The central limit theorem says that even when the population has a distribution which is left-skewed, heavy-tailed, right-skewed, or even which differs from the Normal distribution in some other way, the mean of a large enough sample may be treated as a Normal random variable. This is taken advantage of all the time in statistical practice.

1.3 Central limit theorem for the sample proportion

We can express the sample proportion $\hat p$ as a mean and use the central limit theorem to treat it as a random variable having a Normal distribution. Suppose we encode the outcome of a Bernoulli trial in the random variable $Y$ such that
$$Y = \begin{cases} 1 & \text{if the outcome is a success} \\ 0 & \text{if the outcome is a failure.} \end{cases}$$
If the Bernoulli trial has success probability $p$, then we have $P(Y = 1) = p$ and

$P(Y = 0) = 1 - p$. We can compute $\mu = EY = p$ and $\sigma^2 = \operatorname{Var} Y = p(1 - p)$. Suppose we ran the Bernoulli trial $n$ times independently and got $Y_1, \dots, Y_n$. Then
$$\bar Y = \frac{1}{n} \sum_{i=1}^{n} Y_i = \frac{\#\{\text{successes}\}}{n}$$
is the sample proportion $\hat p$ of successes. We can apply the central limit theorem to $\bar Y$, that is, to $\hat p$. The central limit theorem says that $\bar Y$ should behave approximately like a
$$\mathrm{Normal}\left(p, \frac{p(1 - p)}{n}\right)$$
random variable when $n$ is large. From here we can compute probabilities about $\hat p$ using the Normal distribution.

Remark 1. How large should $n$ be before we can invoke the central limit theorem for the sample proportion $\hat p$? A rule of thumb is that we can treat $\hat p$ as Normal if $np \geq 5$ and $n(1 - p) \geq 5$.

Example 3. Suppose you take a random sample of 15 USC undergraduates and you ask each one if they are registered to vote. Let $\hat p$ be the proportion in your sample who are registered to vote.

Question: Supposing that the true proportion of USC undergraduates who are registered to vote is .6, what is the probability that $\hat p$ of your sample is greater than .7?

Answer #1: For a sample of size 15 and with the population proportion equal to $p = .6$, $\hat p$ should behave approximately like a $\mathrm{Normal}(p, p(1 - p)/n)$, i.e. a $\mathrm{Normal}(.60, .6(1 - .6)/15)$, random variable, since $15(.6) \geq 5$ and $15(.4) \geq 5$. Now
$$Z = \frac{\hat p - p}{\sqrt{p(1 - p)/n}} \quad \text{gives} \quad \frac{.7 - .6}{\sqrt{.6(1 - .6)/15}} = 0.79.$$
We get from the $Z$ table that $P(Z > .79) = .2148$. So the answer is $P(\hat p > .7) \approx .2148$.

Answer #2: We could also use the Binomial distribution to get the exact answer. The event $\hat p > .7$ corresponds to observing 11 or more successes out of the 15

Bernoulli trials. So if $X$ is the number of successes,
$$P(\hat p > .7) = P(X \geq 11) = 1 - P(X \leq 10).$$
We can compute $P(X \leq 10)$ in R using the command

    pbinom(q=10,size=15,prob=.6)

We get $P(X \leq 10) = .7827$, so the answer is $P(\hat p > .7) = P(X \geq 11) = 1 - .7827 = .2173$. It is close to answer #1, which is approximate.

1.4 Diagrams for $\bar X$ and $\hat p$

The diagram below summarizes the distribution of the sample mean $\bar X$:

    X Normal, i.e. X ~ Normal(µ, σ²):   X̄ ~ Normal(µ, σ²/n)
    X non-Normal and n ≥ 30:            X̄ approx Normal(µ, σ²/n)
    X non-Normal and n < 30:            no Normal approximation for X̄ is given

The next diagram summarizes the distribution of the sample proportion $\hat p$:

    min{np, n(1 − p)} < 5:   n p̂ ~ Binomial(n, p)
    min{np, n(1 − p)} ≥ 5:   p̂ approx Normal(p, p(1 − p)/n)   and   n p̂ ~ Binomial(n, p)

Recall that $n \hat p = X$, the number of successes in $n$ Bernoulli trials, so saying that $n \hat p \sim \mathrm{Binomial}(n, p)$ is nothing new, and it is always true, no matter what $n$ is.
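The two answers in Example 3 and the rule of thumb in Remark 1 can be checked numerically. The sketch below uses only the Python standard library in place of R's pbinom; the function names are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist


def rule_of_thumb_ok(n, p):
    """Rule of thumb from Remark 1: min{np, n(1-p)} >= 5."""
    return min(n * p, n * (1 - p)) >= 5


def normal_approx_upper(cutoff, p, n):
    """P(p-hat > cutoff), treating p-hat as Normal(p, p(1-p)/n)."""
    z = (cutoff - p) / math.sqrt(p * (1 - p) / n)
    return 1 - NormalDist().cdf(z)


def binomial_upper(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


n, p = 15, 0.6
print(rule_of_thumb_ok(n, p))                    # 15(.6) = 9 and 15(.4) = 6
print(round(normal_approx_upper(0.7, p, n), 4))  # answer #1, unrounded z
print(round(binomial_upper(11, n, p), 4))        # answer #2, exact
```

The Normal approximation differs slightly from the .2148 in answer #1 because the code does not round $Z$ to 0.79 before evaluating the Normal probability. The exact Binomial value agrees with the .2173 obtained from pbinom.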

1.5 Further examples of Normal approximation

Example 4. Suppose $X$ is the time, in hours, between phone calls to a customer service call center, and suppose it follows the exponential distribution with mean equal to 1/20. Suppose we observe the next 30 time intervals between calls and record them as $X_1, \dots, X_{30}$. Let $\bar X$ be the mean length of the 30 time intervals.

Question: What is $P(\bar X > .075)$?

Answer: For the exponential distribution, we have $\mu = 1/\lambda$ and $\sigma^2 = 1/\lambda^2$. According to the central limit theorem, $\bar X$ should behave approximately like a
$$\mathrm{Normal}\left(1/\lambda, \frac{1/\lambda^2}{30}\right), \quad \text{i.e. a} \quad \mathrm{Normal}\left(.05, \frac{.0025}{30}\right)$$
distribution. So we get
$$Z = \frac{\bar X - \mu}{\sqrt{\sigma^2/n}} = \frac{.075 - .05}{\sqrt{.0025/30}} = 2.74.$$
We get $P(Z > 2.74) = .0031$.

Example 5. Suppose $X$ is the number of phone calls to a customer service call center every hour, and suppose it follows the Poisson distribution with $\lambda = 20$. Suppose we observe the call center during 25 randomly selected hours and let $X_1, \dots, X_{25}$ be the numbers of calls we observed and $\bar X$ the mean number of calls.

Question: What is $P(\bar X < 18)$?

Answer: For the Poisson distribution, we have $\mu = \lambda$ and $\sigma^2 = \lambda$. According to the central limit theorem, $\bar X$ should behave approximately like a
$$\mathrm{Normal}\left(\lambda, \frac{\lambda}{25}\right) = \mathrm{Normal}\left(20, \frac{20}{25}\right)$$
distribution. So we get
$$Z = \frac{\bar X - \mu}{\sqrt{\sigma^2/n}} = \frac{18 - 20}{\sqrt{20/25}} = -2.24.$$
We get $P(Z < -2.24) = .0125$.
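The arithmetic in Examples 4 and 5 follows one pattern: standardize the sample mean, then read a Normal probability. A minimal sketch of that pattern, using the Python standard library and a hypothetical helper name, is:

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal, mean 0 and SD 1


def z_for_mean(x_bar, mu, var, n):
    """Standardize a sample mean: (x_bar - mu) / sqrt(var / n)."""
    return (x_bar - mu) / math.sqrt(var / n)


# Example 4: exponential with mean 1/20, so lambda = 20, and n = 30
lam = 20
z4 = z_for_mean(0.075, 1 / lam, 1 / lam**2, 30)
p4 = 1 - std_normal.cdf(z4)   # upper tail, P(X-bar > .075)

# Example 5: Poisson with lambda = 20, so mu = var = 20, and n = 25
z5 = z_for_mean(18, 20, 20, 25)
p5 = std_normal.cdf(z5)       # lower tail, P(X-bar < 18)

print(round(z4, 2), round(p4, 4))
print(round(z5, 2), round(p5, 4))
```

The $Z$ values match the 2.74 and -2.24 in the notes. The probabilities can differ from the table values in the fourth decimal place, since the code evaluates the Normal probability at the unrounded $Z$.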