Economics Spring 2015


Economics 400 -- Spring 2015, 2/17/2015, pp. 30-38; Ch. 7.1.4-7.

New Stata Assignment and new MyStatlab assignment, both due Feb 24th.
Midterm Exam Thursday Feb 26th: Chapters 1-7 of the Groebner text and all relevant lectures, handouts, and computer & book exercises. Bring a calculator & pencil with a functioning eraser!
Review Session 7:30 pm on Monday, Feb 23rd, Gardner 105.

Sampling distribution of the sample mean, normal population, sigma unknown (the Student's t distribution)
Using the t-distribution when the population is not normal
Sampling Distribution of the Sample Proportion
End of Midterm 1 Material

7 Sampling Distribution of $\bar{x}$, Normal Population, $\sigma$ Unknown

We can immediately use our estimator of the sample variance to help us out of a problem that occurs when we use the sample mean to estimate the mean of a normal population whose standard deviation is unknown. It turns out that $s^2$ is a good estimator of the population variance: one can prove that the expected value of $s^2$ is equal to the population variance. In fact, $E(s^2) = \sigma^2$. To do so we use the rules of expectations that we developed earlier. Here's a handout that shows you how to do this proof: {Next Slide}

Proof that the Sample Variance is an Unbiased Estimator of the Population Variance

Early in the course I claimed that the "best" estimator of the population variance, sigma-squared, is s-squared, defined as:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1},$$

even though it would seem that a better estimator would be:

$$s'^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}.$$

So, let $x_1, x_2, \ldots, x_n$ be a random sample with $E(x_i) = \mu$ and $V(x_i) = \sigma^2$. Show that $s'^2$ is a biased estimator for $\sigma^2$ and $s^2$ is an unbiased estimator for $\sigma^2$.

First, with some basic algebra (which I'll leave to you) we can demonstrate that:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2.$$

Then, we can write the expected value of this sum of squared differences as:

$$E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = E\left(\sum_{i=1}^{n} x_i^2\right) - nE(\bar{x}^2) = \sum_{i=1}^{n} E(x_i^2) - nE(\bar{x}^2).$$

Notice that $E(x_i^2)$ is the same for i = 1, 2, ..., n. We use this and the fact that the variance of a random variable is given by $V(x) = E(x^2) - [E(x)]^2$ to conclude that $E(x_i^2) = V(x_i) + [E(x_i)]^2 = \sigma^2 + \mu^2$, that $E(\bar{x}^2) = V(\bar{x}) + [E(\bar{x})]^2 = \sigma^2/n + \mu^2$, and that

$$E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = \sum_{i=1}^{n} (\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right) = (n-1)\sigma^2.$$

It follows that

$$E(s'^2) = \frac{1}{n} E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = \frac{1}{n}(n-1)\sigma^2 = \sigma^2\left(\frac{n-1}{n}\right)$$

and that $s'^2$ is biased because $E(s'^2) \neq \sigma^2$. However,

$$E(s^2) = \frac{1}{n-1} E\left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = \frac{1}{n-1}(n-1)\sigma^2 = \sigma^2,$$

so we see that $s^2$ is an unbiased estimator for $\sigma^2$. The proof relies on two rules: the expected value of a difference is the difference of the expected values, and the expected value of a sum is the sum of the expected values.

SampleDist.lwp Lecture on Sampling Distributions Page 28 of 37

Now, we know that when we're estimating the mean of a normal population with a known variance, our estimator's distribution is exactly normal. So, if we want to calculate probabilities for the normally distributed sample mean we can convert to z-scores:

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$

What happens when we substitute the sample standard deviation for sigma?

$$\frac{\bar{x} - \mu}{s_{\bar{x}}}, \qquad s_{\bar{x}} = \frac{s}{\sqrt{n}}.$$

Intuitively you can guess that the mean of the standardized variable is still zero, since the numerator has not been affected by the substitution. In terms of the variance, we should expect the variance of $(\bar{x} - \mu)/s_{\bar{x}}$ to be larger than the variance of $(\bar{x} - \mu)/\sigma_{\bar{x}}$, since one more element of uncertainty has been added to the ratio. Finally, we should expect the ratio to be symmetrical, since there is no reason to believe that substituting s for sigma will make this distribution skewed either positively or negatively.

Note also that the variability of the distribution depends upon the size of n, for the sample size affects the reliability with which s estimates $\sigma$. When n is large, s will be a good approximation to $\sigma$; but when n is small, s may not be very close to $\sigma$. Hence, the distribution of $(\bar{x} - \mu)/s_{\bar{x}}$ is a family of distributions whose variability depends upon n.

So, I hope that it's clear from this discussion that the distribution of $(\bar{x} - \mu)/s_{\bar{x}}$ is not normal, but is more spread out than normal. The distribution of this statistic is called the t-distribution and its random variable is denoted as

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}.$$

This is the famous distribution that was discovered by a statistician named W. S. Gosset, an Englishman who worked for the Guinness Brewery in Dublin. The brewery wouldn't let him publish his research, so he published anonymously under the name of Student. In honor of Gosset's research, published in 1908, the t-distribution is often called Student's t-distribution.

The t-distribution is a fairly complex function, and I won't present it here. Let me list its characteristics:

The t-distribution depends upon the size of the sample. It is customary to describe the characteristics of the t-distribution in terms of the sample size minus one, or (n-1), as this quantity has special significance. The value of (n-1) is called the number of degrees of freedom (abbreviated d.f.), and represents a measure of the number of observations in the sample that can be used to estimate the standard deviation of the parent population. For example, when n=1, there is no way to estimate the population standard deviation; hence there are no degrees of freedom (n-1=0). There is one degree of freedom in a sample of n=2, since one observation is now free to vary away from the other, and the amount it varies determines our estimate of the population standard deviation. Each additional observation adds one more degree of freedom, so that, in a sample of size n, there are (n-1) observations free to vary, and hence (n-1) degrees of freedom. The Greek letter $\nu$, or nu, is often used to denote degrees of freedom.

When sample sizes are small, the t-distribution is seen to be considerably more spread out than the standard normal distribution. That is, its tails are thicker: {Next Slide}

[Figure: Comparing the t- and normal distributions. Density of the t distribution (df = 3) and of the standardized normal distribution, plotted against z- and t-values from -3 to 3.]

Here we compare a t-distribution with degrees of freedom = 3 to the standard normal distribution. You can see that the t-distribution has considerably more area under its tails outside of 2 standard deviations; however, as degrees of freedom get large, the t-distribution approaches the normal distribution.

Because the t-distribution is really a family of distributions, it would be very difficult to carry around tables for all possible t-distributions. Instead, tables are usually published that contain critical values for certain tail probabilities. Here's the t-table out of your textbook: {Next Slide}

The table contains the values of t that leave a certain amount of area under the curve to the right. So, with degrees of freedom = 1, t must equal 31.821 to have one percent of the total area to the right. {Next Slide} On the other hand, t needs only be 3.747 if the degrees of freedom are equal to 4. {Next Slide} At degrees of freedom = 29 the t-value at one percent is 2.462, which is very close to the z-value 2.326 of the normal distribution for a 1-percent right tail probability.

d.f.    t.100    t.050    t.025    t.010    t.005
1       3.078    6.314   12.706   31.821   63.657
2       1.886    2.920    4.303    6.965    9.925
3       1.638    2.353    3.182    4.541    5.841
4       1.533    2.132    2.776    3.747    4.604
5       1.476    2.015    2.571    3.365    4.032
6       1.440    1.943    2.447    3.143    3.707
7       1.415    1.895    2.365    2.998    3.499
8       1.397    1.860    2.306    2.896    3.355
9       1.383    1.833    2.262    2.821    3.250
10      1.372    1.812    2.228    2.764    3.169
11      1.363    1.796    2.201    2.718    3.106
12      1.356    1.782    2.179    2.681    3.055
13      1.350    1.771    2.160    2.650    3.012
14      1.345    1.761    2.145    2.624    2.977
15      1.341    1.753    2.131    2.602    2.947
16      1.337    1.746    2.120    2.583    2.921
17      1.333    1.740    2.110    2.567    2.898
18      1.330    1.734    2.101    2.552    2.878
19      1.328    1.729    2.093    2.539    2.861
20      1.325    1.725    2.086    2.528    2.845
21      1.323    1.721    2.080    2.518    2.831
22      1.321    1.717    2.074    2.508    2.819
23      1.319    1.714    2.069    2.500    2.807
24      1.318    1.711    2.064    2.492    2.797
25      1.316    1.708    2.060    2.485    2.787
26      1.315    1.706    2.056    2.479    2.779
27      1.314    1.703    2.052    2.473    2.771
28      1.313    1.701    2.048    2.467    2.763
29      1.311    1.699    2.045    2.462    2.756
inf.    1.282    1.645    1.960    2.326    2.576

7.1. Example Using the t-distribution {Next Slide}

To see how to use the t-distribution, let's examine the widely publicized claims of a well-known neighboring university that its students have I.Q.'s which are normally distributed with a mean $\mu = 130$. Suppose that I were able to obtain, through methods that I cannot reveal, a random sample of the I.Q.'s of 25 students. This random sample has a mean of 126.8 and a standard deviation of s=6. {Next Slide} What is the probability of receiving a sample mean of 126.8, or lower, if $\mu = 130$? {Next Slide}

First, convert the sample mean to a t-score assuming that the population mean really is equal to 130: {Next Slide}

$$P(\bar{x} \le 126.8) = P\left(\frac{\bar{x} - \mu}{s/\sqrt{n}} \le \frac{126.8 - 130}{6/\sqrt{25}}\right) = P\left(t \le \frac{-3.2}{1.2}\right) = P(t \le -2.667).$$

Now, since the t-distribution is symmetrical, the probability that $t \le -2.667$ is equal to the probability that $t \ge 2.667$. The degrees of freedom for this sample is {Next Slide}: df = 25 - 1 = 24. So, looking at the table under df=24 we don't find a direct match, but we do see that 2.667 lies between 2.492 and 2.797. So, the probability that we would get a sample mean of 126.8 or less, if the true mean were 130, is only between one-half and one percent! So the sample casts serious doubt on the claim that the true mean I.Q. of this unnamed university's students is 130. {Next Slide}

The next slide shows that the area to the left of -2.667 equals the area to the right of +2.667, and that this area is somewhere between 0.005 and 0.01. {Next Slide - clicks}

[Figure: Student's t-distribution of the sample mean, 24 degrees of freedom. Density plotted against t-value from -4 to 4, with the tail below -2.667 and the tail above +2.667 marked "These are the same areas".]

Oh, by the way, we've just done some statistical inference: we asked the question, what's the probability of drawing a random sample with mean 126.8 or lower if the true population mean is 130? The answer was: quite low.
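The table lookup in the I.Q. example can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `t_pdf` and `t_upper_tail` are made up for this illustration, and the tail area is obtained by Simpson's-rule integration of the t density rather than by any method used in the lecture.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: one half minus the area between 0 and t.

    The area is approximated with Simpson's rule (steps must be even).
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3

# Numbers from the I.Q. example: n = 25, sample mean 126.8, s = 6, claimed mean 130.
n, xbar, s, mu0 = 25, 126.8, 6.0, 130.0
t_stat = (xbar - mu0) / (s / math.sqrt(n))   # about -2.667
df = n - 1                                   # 24 degrees of freedom

# By symmetry, P(t <= -2.667) equals P(t >= +2.667).
p_left = t_upper_tail(abs(t_stat), df)

print(f"t = {t_stat:.3f}, df = {df}, P(t <= {t_stat:.3f}) = {p_left:.4f}")
```

The printed probability should fall between the 0.005 and 0.010 columns of the table at df = 24, which is the same bracket the table lookup gives.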

7.2. Using the t-distribution when the population is not Normal

Now, let me emphasize again that the t-distribution assumes that samples are drawn from a parent population that is normally distributed. In practical problems involving this distribution, the question is: just how critical is this assumption of normality in the parent population? Often, we can't determine the distribution of the parent population, so it becomes difficult to know if using the t-distribution is appropriate. Fortunately, the assumption of normality can be relaxed without significantly changing the sampling distribution of the t statistic. Because of this, the t-distribution is said to be quite robust, implying that its usefulness holds up well under conditions that do not exactly conform to the original normality assumption.

So, let's again emphasize several important aspects of the sampling distribution of $\bar{x}$ when n is large: {Next Slide}

- When n is large (n > 30), $\bar{x}$ will at a minimum be approximately normally distributed.
- When n is large, s will usually be a good approximation to sigma.
- In that case, the distribution of $t = (\bar{x} - \mu)/(s/\sqrt{n})$ and that of $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ will be approximately the same.
- So, for large samples we can use the standard normal distribution to approximate the t-distribution.

8 Sampling Distribution of the Sample Proportion {Next Slide}

Let's say that we're doing a political poll about the intentions of a randomly selected set of voters to vote for the current president at the next election. The respondents to the poll will respond "yes" or "no" and we want to estimate the probability that the average voter will vote for the president. We can approximate this unknown probability, p, with the sample proportion, {Next Slide}

$$\hat{p} = \frac{x}{n}$$

where $\hat{p}$ is the estimated probability, x is the number of "yes" answers in the sample and n is the size of the sample. That is, we estimate the underlying probability with the sample proportion. Since each distinct value of x results in a distinct value of $\hat{p} = x/n$, the probabilities associated with $\hat{p}$ are equal to the probabilities associated with the corresponding values of x. Hence, the sampling distribution of $\hat{p}$ will be the same shape as the binomial probability distribution for x. Like the binomial probability distribution, it can be approximated by a normal probability distribution when the sample size is large.

Now, the expected value of the sample proportion is: {Next Slide}

$$E(\hat{p}) = \mu_{\hat{p}} = E\left(\frac{x}{n}\right) = \frac{1}{n}E(x) = \frac{1}{n}\,np = p$$

and the standard error of the sample proportion, $\sigma_{\hat{p}}$, follows from its variance: {Next Slide}

$$V(\hat{p}) = \sigma_{\hat{p}}^2 = V\left(\frac{x}{n}\right) = \frac{1}{n^2}V(x) = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$$

and, {Next Slide}

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}.$$
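The expected value and standard error of the sample proportion can be illustrated by simulation. This is a rough sketch, not part of the lecture: the true proportion `p_true`, the poll size `n`, and the number of repetitions `reps` are arbitrary values chosen for illustration.

```python
import math
import random
import statistics

random.seed(0)          # fixed seed so the run is repeatable

p_true = 0.4            # hypothetical share of "yes" voters (assumption)
n = 100                 # hypothetical poll size (assumption)
reps = 4000             # number of simulated polls (assumption)

# Each simulated poll counts the "yes" answers x and records p_hat = x / n.
p_hats = []
for _ in range(reps):
    x = sum(1 for _voter in range(n) if random.random() < p_true)
    p_hats.append(x / n)

sim_mean = statistics.mean(p_hats)      # should be close to p
sim_sd = statistics.pstdev(p_hats)      # should be close to sqrt(p(1-p)/n)
theory_sd = math.sqrt(p_true * (1 - p_true) / n)

print(f"mean of p_hat: {sim_mean:.4f} (theory {p_true})")
print(f"sd of p_hat:   {sim_sd:.4f} (theory {theory_sd:.4f})")
```

With these settings the simulated mean and standard deviation of the sample proportion should sit close to p and to the standard error formula, and the agreement tightens as `reps` grows.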

8.1. Example

It's not widely known, but a substantial proportion of supermarket scanning machines make mistakes when items are scanned in. The North Carolina Division of Weights and Measures tests a store's scanners by randomly selecting 300 register tapes and verifying whether or not there's an error on the tape. Stores are fined if the error rate is more than 2 percent. Suppose 8 tapes show errors; what's the probability of getting 8 or more errors if the true error rate is, in fact, 2 percent (or 6 errors)? Let's approximate the binomial distribution with a normal distribution with: {Next Slide}

$$\hat{p} = \frac{x}{n} = \frac{8}{300} = 0.02667$$

and, {Next Slide}

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.00930.$$

Then, we calculate the z-value under the assumption that the true probability of error is {Next Slide} p = 0.02 and we get {Next Slide}

$$z = \frac{\hat{p} - p}{\sqrt{\hat{p}(1-\hat{p})/n}} = \frac{0.02667 - 0.02}{0.00930} = 0.7170$$

so, looking at the normal table we see that the probability of getting an error rate of 2.7% or more is (0.5 - 0.263 = 0.237) 23.7 percent, even if the true probability of error is only 2.0%: {Next Slide}

[Figure: Standardized normal distribution with the area to the right of z = 0.7170 marked, 0.5 - 0.263 = 0.237.]

So, it's not too unlikely that we'd get an error rate of 2.7% even if the machines are operating within regulatory specs at 2.0%.
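The scanner calculation can be reproduced in a few lines. This is a sketch using Python's standard library (`statistics.NormalDist` and `math.comb`, Python 3.8+). The first tail probability follows the lecture's approach of putting the sample proportion in the standard error. The other two are added here for comparison and are not part of the lecture: a standard error based on the hypothesized p = 0.02, and the exact binomial tail.

```python
import math
from statistics import NormalDist

n, x, p0 = 300, 8, 0.02
p_hat = x / n                                   # 0.02667

# Lecture's approach: standard error built from the sample proportion.
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)     # about 0.00930
z_hat = (p_hat - p0) / se_hat                   # about 0.717
tail_hat = 1 - NormalDist().cdf(z_hat)          # about 0.237

# Comparison (not in the lecture): standard error under the hypothesized rate p0.
se_null = math.sqrt(p0 * (1 - p0) / n)
z_null = (p_hat - p0) / se_null
tail_null = 1 - NormalDist().cdf(z_null)

# Comparison (not in the lecture): exact binomial P(X >= 8) when p = 0.02.
tail_exact = 1 - sum(
    math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x)
)

print(f"lecture's normal approximation: z = {z_hat:.3f}, tail = {tail_hat:.3f}")
print(f"null-based standard error:      z = {z_null:.3f}, tail = {tail_null:.3f}")
print(f"exact binomial tail:            {tail_exact:.3f}")
```

All three versions put the probability somewhere between roughly one in five and one in four, so the conclusion that 8 errors in 300 tapes is not unusual at a 2 percent error rate does not hinge on which standard error is used.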