Eco411 Lab: Central Limit Theorem, Normal Distribution, and Journey to Girl State


1. Some students may wonder why the magic number 1.96 or 2 (called a critical value) is so important in statistics. Where does it come from? Ok, this is a long, long story. Hopefully I can make it fun and easy to understand.

2. Simply put, the central limit theorem (CLT) implies that the normal distribution plays a key role in statistics, and 1.96 is the 97.5th percentile of the standard normal distribution. Warning! You should read this sentence at least two more times, and think about it, before you go ahead.

3. I want to use simulation (a Monte Carlo experiment) to show the main idea of the CLT, which states that the sample average (mean) approaches the normal distribution as the sample size rises, regardless of the distribution of the original data. In terms of math: no matter which distribution the original y follows, the sample mean ȳ becomes more and more like a normal random variable. This is a strong statement. It is like saying that no matter who you are, in the end you will like Dr. Li's teaching...

4. To get the Monte Carlo started, I first generate a population of 100,000 observations of a zero-one dummy variable. So by construction the original variable y does not follow a normal distribution. In fact it follows a Bernoulli distribution given as

P(y = 1) = p = 0.8,  P(y = 0) = 1 − p = 0.2   (1)

where I made up the number 0.8. The expected value and variance are given by

E(y) = p = 0.8,  var(y) = p(1 − p) = 0.16   (2)

For the quantitative guys: can you prove the above results?

5. To help understanding, you can think of a fictitious country called Girl State, which appears in the famous Chinese novel Journey to the West. Girl State has a population of 100,000. This country is special because 80 percent of the population are female (y = 1), while only 20 percent are male (y = 0).¹ This kind of zero-one variable definitely is not a normal random variable.

¹ Males are still needed, otherwise the country would go extinct in the long run.
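The population in step 4 can be sketched in a few lines of Python (the handout's own code, shown later, is Stata; Python's random number generator will not reproduce Stata's exact draws, so the numbers below are only close, not identical):

```python
import numpy as np

# Build the Girl State population: 100,000 zero-one observations
# with P(y = 1) = p = 0.8, mirroring "gen y = (uniform()>0.2)" in Stata.
rng = np.random.default_rng(12345)
p = 0.8
y = (rng.uniform(size=100_000) > 1 - p).astype(int)

# The population moments should sit close to the theoretical values
# E(y) = p = 0.8 and var(y) = p(1 - p) = 0.16 from equation (2).
print(y.mean())  # close to 0.8
print(y.var())   # close to 0.16
```

With 100,000 draws, the sample moments land within a fraction of a percent of 0.8 and 0.16, which is why the simulated population is a good stand-in for the theoretical Bernoulli distribution.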

6. To visualize the non-normality, we draw the histogram of the original data y:

[Histogram of y: density on the vertical axis, y on the horizontal axis, with bars only at 0 and 1]

This clearly does not look like a bell, or a normal distribution.

7. Next we want to draw 1000 samples, and each sample is small, with only 10 observations. For each sample we can compute a sample mean ȳ, or in this case, the sample proportion of y = 1. Again let me use language everyone understands. It is like the king asking you to visit 1000 small villages, where there are only 10 residents in each village. The king wants to know the gender break-up of each village.

8. Because each village is small, you expect big variation in ȳ. For a village that has no male, ȳ = 1; for a village that has no female, ȳ = 0. The latter case is unlikely, but still possible. In general, ȳ varies across villages. So ȳ is a random variable,² whose distribution is called the sampling distribution. Statistics is largely concerned with using ȳ to make inference about p.³

9. The graph below is the histogram (distribution) of the 1000 ȳ. Remember, we get one ȳ for each village:

² You do not know ȳ before you go to a specific village, so it is random.
³ In reality p is generally unknown, unless we do a simulation.
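The 1000 village means of step 7 can be sketched like this in Python (a stand-in for the Stata loop shown later; the reshape trick replaces the explicit forvalues loop):

```python
import numpy as np

# Regenerate the population, then cut the first 10,000 observations
# into 1000 "villages" of 10 residents each and compute each village's
# share of females (y = 1), i.e. its sample mean ybar.
rng = np.random.default_rng(12345)
y = (rng.uniform(size=100_000) > 0.2).astype(int)

ybar10 = y[:10_000].reshape(1000, 10).mean(axis=1)  # one mean per village

# Each village mean can only be 0, 0.1, 0.2, ..., 1.0 -- the lattice of
# values behind the bars in the histogram of remark (a).
print(sorted(set(ybar10.round(1))))
```

The printed set confirms remark (a): even though y itself takes only the values 0 and 1, the village means spread over the eleven points 0, 0.1, ..., 1.0, and their overall average sits near p = 0.8.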

[Histogram of ybar10: density on the vertical axis, ȳ from 0.2 to 1 on the horizontal axis]

I have several remarks:

(a) Even though y can only take the two values 0 and 1, ȳ can take the values 0, 0.1, 0.2, ..., 1. We see more than two bars in the histogram.

(b) Because 80 percent of the population are female, we have a substantial number of villages that have no male (ȳ = 1, the rightmost bar). On the other hand, a no-female village (ȳ = 0) is hard to find. This can be seen from the heights of the bars in the histogram.

(c) The most likely ȳ (the highest bar) is 0.8, which is equal to the population mean p = 0.8.

10. Notice that this second histogram is more symmetric than the first histogram. We can almost see a bell, even though an asymmetric one. The central limit theorem is kicking in now!

11. Also note that the horizontal distance between bars gets smaller. In the limit when the sample size is infinity, the bars become immediately adjacent, meaning that the normal random variable is continuous (but a Bernoulli variable is discrete).

12. The CLT is an example of asymptotic theory, which describes what happens when the sample gets larger and larger. So next I will let the sample size increase.

13. It is like the king now asking you to check cities instead of villages. So we will increase the sample size from 10 to 100. We visit 1000 cities, and for each city we compute the sample mean ȳ. The histogram of the 1000 city averages is below.

[Histogram of ybar100: density on the vertical axis, ȳ from 0.65 to 0.9 on the horizontal axis]

If we ignore the gaps in the graph, we almost see a symmetric bell! Put differently, compared to the village average, the city average is more like a normal random variable. Yes, this is what the central limit theorem is about. Also notice that the dispersion of the third histogram is less than that of the second histogram. Mathematically we can show

E(ȳ) = p = 0.8,  var(ȳ) = var(y)/n = p(1 − p)/n = 0.16/n   (3)

The first equation above implies that on average, ȳ is an accurate estimate of p (it is called an unbiased estimator). The second equation indicates that a bigger sample (larger n) leads to smaller variation in ȳ, or equivalently, to a more precise estimate. That is why we prefer a big sample over a small one.

14. Note var(ȳ) = 0 when n = ∞. So in the limit the sample mean equals the constant p. This is called the Law of Large Numbers. Loosely speaking, a sufficiently large sample can give you a sample mean as close as possible to the actual population mean. The intuition is: when the sample gets larger, it converges to the population, and no wonder the sample mean converges to the population mean.

15. Most importantly, the central limit theorem says

ȳ ~ N(p, p(1 − p)/n)  as n → ∞   (4)

where N(·) represents the normal distribution. Pay attention here. It is ȳ, not y, that converges to the normal distribution. In this case our y always remains unchanged as a

Bernoulli variable. The CLT is about ȳ.

16. In theory the sample size should be infinitely large for the CLT to hold. In practice ȳ can get very close to normality for n as small as 20.

17. In order to get a standard normal distribution, with mean zero and variance one, we apply the process of standardizing (obtaining the z-score). That is, we subtract p, which is E(ȳ), and divide by the square root of var(ȳ), which is called the standard error:

(ȳ − p) / sqrt(p(1 − p)/n) ~ N(0, 1)  as n → ∞   (5)

18. The histogram of the standardized ȳ is below.

[Histogram of zybar100: density on the vertical axis, z-scores from −4 to 4 on the horizontal axis]

Now we see that the value 0 is in the center (0.8 was in the center before standardization).

19. The normal distribution can appear in unexpected but natural ways. Imagine you are looking at the satellite image of a parking lot outside a shopping mall. You will see most cars parked directly in front of the entrance, and the number of cars decreasing gradually away from the entrance, just like a normal distribution.

20. Now I can show you where the magic number 1.96 or 2 comes from. After we sort the standardized ȳ in ascending order, 1.96 (or 2) is approximately the 975th observation in the sorted series of 1000 standardized ȳ! In other words

P(standard normal < 1.96) = 0.975   (6)

P(−1.96 < standard normal < 1.96) = 0.95   (7)
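Steps 17-20 can be replayed in Python (again a stand-in for the Stata code; with only 1000 draws the 975th sorted value lands near 1.96 but not exactly on it):

```python
import numpy as np

# Regenerate the population and the 1000 city means (n = 100 each).
rng = np.random.default_rng(12345)
y = (rng.uniform(size=100_000) > 0.2).astype(int)
ybar100 = y.reshape(1_000, 100).mean(axis=1)

# Standardize: subtract E(ybar) = p = 0.8, divide by the standard
# error sqrt(p(1-p)/n) = sqrt(0.16/100), as in equation (5).
z = (ybar100 - 0.8) / np.sqrt(0.16 / 100)

# Sort ascending; the 975th observation approximates the 97.5th
# percentile of the standard normal, i.e. roughly 1.96.
z_sorted = np.sort(z)
print(z_sorted[974])  # index 974 is the 975th observation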

The last inequality is the basis for the confidence interval.

21. A confidence interval allows us to say something almost certain (with 0.95 probability) about something totally random. In other words, the standard normal random variable is random, but it is NOT that random: its most likely values are between −1.96 and 1.96.

22. For a general normal random variable, the most likely values are between the mean ± 1.96 times the standard deviation. Mathematically,

95 percent confidence interval = [µ − 1.96σ, µ + 1.96σ]   (8)

23. The Stata code for doing this Monte Carlo is below:

clear
set more off
set obs 100000
set seed 12345
capture drop y ybar10 ybar100
gen y = (uniform()>0.2)

* draw 1000 samples; each sample contains 10 obs; draw histogram of the sample means
gen ybar10 = .
forvalues i = 1(1)1000 {
    local n0 = (`i'-1)*10+1
    local n1 = `n0' + 9
    qui sum y in `n0'/`n1'
    qui replace ybar10 = r(mean) in `i'
}
histogram ybar10 in 1/1000
sum ybar10 in 1/1000, detail

* draw 1000 samples; each sample contains 100 obs; draw histogram of the sample means
gen ybar100 = .
forvalues i = 1(1)1000 {
    local n0 = (`i'-1)*100+1
    local n1 = `n0' + 99

    qui sum y in `n0'/`n1'
    qui replace ybar100 = r(mean) in `i'
}
histogram ybar100 in 1/1000
sum ybar100 in 1/1000, detail

* Where does 1.96 or 2 come from?
* Answer: Standardize ybar100, sort it, and 1.96 (or 2) is the 97.5th percentile
gen zybar100 = (ybar100-0.8)/sqrt(0.16/100)
histogram zybar100 in 1/1000
sort zybar100
list zybar100 in 975

24. Last comment. The funny-looking standardized ȳ has a popular name: it is called the t statistic (t value, t ratio...). I called it a panda in class!

t value ≡ (ȳ − p) / sqrt(p(1 − p)/n)
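To tie steps 22 and 24 together, here is a Python sketch that computes the t value and the 95 percent interval for a single city of n = 100 residents (the specific draws and variable names are mine, not from the handout; in the simulation p = 0.8 is known, which is what lets us use the exact standard error):

```python
import numpy as np

# One "city" sample of 100 zero-one observations with P(y = 1) = 0.8.
rng = np.random.default_rng(12345)
sample = (rng.uniform(size=100) > 0.2).astype(int)

n = sample.size
ybar = sample.mean()
p = 0.8                              # known here only because we simulate
se = np.sqrt(p * (1 - p) / n)        # standard error of ybar = sqrt(0.0016)

t = (ybar - p) / se                  # the standardized ybar -- the "panda"
ci = (ybar - 1.96 * se, ybar + 1.96 * se)  # interval of equation (8)
print(t, ci)
```

The interval is centered at ȳ with half-width 1.96 × 0.04 ≈ 0.078, and across repeated samples it covers the true p = 0.8 about 95 percent of the time, which is exactly the "almost certain statement about something random" of step 21.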