Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

In lecture 6 we introduced the Classical Linear Regression (CLR) model, that is, the random experiment of which the data Y_i, X_i, i = 1, …, n, are the outcomes. The CLR model specifies that the relation between the dependent variable Y and the independent variable X is an exact linear relation, i.e.

Y_i = α + β X_i + u_i,  i = 1, …, n,

and that X_i, u_i satisfy the assumptions

Assumption 1: X_i, i = 1, …, n, are deterministic, i.e. non-random, constants.

Assumption 2: u_i, i = 1, …, n, are random variables with E(u_i) = 0.

Assumption 3 (Homoskedasticity): All u_i's have the same variance, i.e. for i = 1, …, n, E(u_i²) = Var(u_i) = σ².

Assumption 4 (No serial correlation): The random errors u_i and u_j are not correlated, i.e. for all i ≠ j, i, j = 1, …, n, Cov(u_i, u_j) = E(u_i u_j) = 0.

If these assumptions hold, the OLS estimators

β̂ = Σ (X_i − X̄)(Y_i − Ȳ) / Σ (X_i − X̄)²,  α̂ = Ȳ − β̂ X̄

of the regression coefficients α, β have the following properties.

1. The OLS estimators α̂, β̂ are unbiased, i.e. for the mean of the sampling distribution it holds that

E(α̂) = α,  E(β̂) = β.

For this only assumptions 1 and 2 are needed.

2. The variance of the sampling distribution is

Var(α̂) = σ² Σ X_i² / (n Σ (X_i − X̄)²)

Var(β̂) = σ² / Σ (X_i − X̄)²,  Cov(α̂, β̂) = −σ² X̄ / Σ (X_i − X̄)².

3. The OLS estimators α̂, β̂ are consistent, i.e. in large samples the sampling distribution is concentrated on α, β.

The sampling variance informs us how precise the estimates are. It can be shown that under assumptions 1–4 the OLS estimators are the best estimators, i.e. have the smallest sampling variance, among the unbiased estimators that can be expressed as a linear expression in Y_1, …, Y_n, i.e. as a weighted average of the Y_i. In jargon: the OLS estimators are Best Linear Unbiased Estimators (BLUE).
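As an illustration (not part of the lecture), the closed-form OLS formulas above can be computed directly; the data values below are hypothetical.

```python
# Sketch of the OLS formulas on a small made-up data set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical regressor values X_i
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical outcomes Y_i
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n

sxx = sum((x - xbar) ** 2 for x in xs)                      # Σ (X_i - X̄)²
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))  # Σ (X_i - X̄)(Y_i - Ȳ)

beta_hat = sxy / sxx                 # slope estimate β̂
alpha_hat = ybar - beta_hat * xbar   # intercept estimate α̂ = Ȳ - β̂ X̄

print(alpha_hat, beta_hat)
```

For this data set the slope works out to 1.99 and the intercept to 0.05.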

For confidence intervals and hypothesis tests, the mean and variance of the sampling distribution are not enough: we need the sampling distribution to belong to a class of distributions that we can work with, e.g. the normal distribution. This will be the case if the following assumption holds.

Assumption 5: The random error terms u_i, i = 1, …, n, are random variables with a normal distribution.

Note that

- This normal distribution has mean 0 (assumption 2) and variance σ² (assumption 3) for all i = 1, …, n. Hence the u_i's are random variables with identical (normal) distributions.
- The error terms u_i and u_j, i ≠ j, are uncorrelated (assumption 4) and, because they are normal, also stochastically independent.

Why is the normal distribution an obvious choice as a distribution for the error term u_i?

What is the sampling distribution of the OLS estimators if assumption 5 holds? Consider a CLR model with α = 0. We have seen that the OLS estimator β̂ can be expressed as

β̂ = β + Σ W_i u_i.

The right-hand side is the sum of a constant (β) and a weighted average of normal random variables, Σ W_i u_i. It can be shown that such a linear combination is also a random variable with a normal distribution.

Conclusion: If assumption 5 holds, the sampling distribution of β̂ is normal with mean β and variance σ² / Σ X_i². If assumption 5 does not hold, then we cannot conclude that the sampling distribution of β̂ is normal. Why will it in general still be close to normal?

If the CLR model has an intercept we have the same result: if assumption 5 holds, the sampling distribution of the OLS estimators α̂, β̂ is also normal, with means α, β and variances

Var(α̂) = σ² Σ X_i² / (n Σ (X_i − X̄)²),  Var(β̂) = σ² / Σ (X_i − X̄)²,

respectively.

Now we can find a confidence interval for β. As in the coin tossing example we standardize the estimator:

Z = (β̂ − β) / √(σ² / Σ (X_i − X̄)²).

Now Z has a normal distribution with mean 0 and variance 1.

We find a 95% confidence interval for β by considering the probability

Pr(−1.96 < Z < 1.96) = Pr(−1.96 < (β̂ − β)/√(σ²/Σ(X_i − X̄)²) < 1.96) = .95.

Hence with probability .95 we have

−1.96 √(σ²/Σ(X_i − X̄)²) < β̂ − β < 1.96 √(σ²/Σ(X_i − X̄)²),

or equivalently, written as an interval for β,

β̂ − 1.96 √(σ²/Σ(X_i − X̄)²) < β < β̂ + 1.96 √(σ²/Σ(X_i − X̄)²).

With probability .95 the unknown β is in this interval. This refers to repeated samples: in 95% of these samples β is in this interval.

The interval is a 95% confidence interval for β. What changes if we want a 90% confidence interval? Use the table of the normal distribution! What problem do we have with this interval? Can it be computed from the data?

Remember that σ² is the variance of the random error term u_i. We do not have the u_i's, because

u_i = Y_i − α − β X_i

and we do not know α, β. After we estimate α, β, we can compute the OLS residuals

e_i = Y_i − α̂ − β̂ X_i.

These are not the same as the u_i's, because we use estimators for α, β. Remember that the sample mean of the e_i, i = 1, …, n, is 0.

Hence, the sample variance is

σ̂² = (1/n) Σ e_i².

This is an estimator of σ². The following estimator is preferred:

s² = (1/(n − 2)) Σ e_i².

Note that we have estimated 2 parameters, α and β, to compute the e_i's, and this is the reason we use n − 2. It can be shown that this is an unbiased and consistent estimator of σ².
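A sketch of the residuals and the two variance estimators, again on hypothetical data (the OLS estimates are computed with the same closed-form formulas as before):

```python
# Residuals e_i and the variance estimators σ̂² (divide by n) and
# s² (divide by n - 2, the preferred estimator). Hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
beta_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
alpha_hat = ybar - beta_hat * xbar

resid = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]  # e_i
mean_resid = sum(resid) / n                  # 0 up to rounding error
sigma2_hat = sum(e ** 2 for e in resid) / n  # divides by n
s2 = sum(e ** 2 for e in resid) / (n - 2)    # preferred: divides by n - 2
```

Note that s² is always larger than σ̂², and that the residuals average to zero exactly as the slides state.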

Now we can derive a computable confidence interval. Start from

T = (β̂ − β) / √(s² / Σ (X_i − X̄)²),

i.e. we substitute an (unbiased, consistent) estimator for σ². Because the denominator is estimated (and hence varies in repeated samples), T does not have a standard normal distribution, but a Student t distribution. This distribution has a larger variance. See Table D.1 for a comparison with the standard normal distribution (final row). As the standard normal, the t distribution is symmetric around 0 and has mean 0.

The t distribution depends on the number of observations. If n is large it is close to the standard normal, but if n is small it is much more dispersed. If we have n observations, then we have a t distribution with parameter n − 2. In jargon this is the number of degrees of freedom of the t distribution. Note that if n − 2 > 60 the error that one makes in using the standard normal instead of the t distribution is small.

The starting point for finding a 95% confidence interval is now

Pr(−c < T < c) = Pr(−c < (β̂ − β)/√(s²/Σ(X_i − X̄)²) < c) = .95.

If we have 30 observations, then c = 2.048.

What is c if we have 6 observations, or if we want a 90% confidence interval?

For 30 observations the 95% confidence interval for β is

β̂ − 2.048 √(s²/Σ(X_i − X̄)²) < β < β̂ + 2.048 √(s²/Σ(X_i − X̄)²).

This can be computed from the data.
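The computable interval can be sketched in Python. The data here are hypothetical with n = 5, so the t distribution has 3 degrees of freedom and the critical value is t_{0.975,3} ≈ 3.182 rather than the 2.048 used for 30 observations:

```python
import math

# 95% confidence interval β̂ ± t·√(s²/Σ(X_i - X̄)²) on hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
beta_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
alpha_hat = ybar - beta_hat * xbar
s2 = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)

se_beta = math.sqrt(s2 / sxx)   # estimated standard deviation of β̂
c = 3.182                       # t critical value, n - 2 = 3 df
lower, upper = beta_hat - c * se_beta, beta_hat + c * se_beta
```

The boundaries are the OLS estimate plus/minus a multiple of the standard error, exactly as in the formula above.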

The estimated standard deviation of the OLS estimator β̂ is

s.e.(β̂) = √(s² / Σ (X_i − X̄)²).

This is called the standard error of β̂ (of course we can also compute the standard error of α̂). It is customary to report both the OLS estimate and its standard error. The reason is clear from the formula for the confidence interval: the boundaries of the confidence interval are the OLS estimate plus/minus a multiple of the standard error. The multiple is close to 2 for a 95% interval (exactly 2 when the degrees of freedom are 60).

In the graphs the sampling distribution of T is given for the linear regression model of lecture 6 with slope β = 3, where X_i, i = 1, …, n, are uniform [0,1] random numbers and u_i has a standard normal distribution or a uniform distribution with the same mean and variance. The graphs are based on 10000 samples.

If n is small, the distribution is more dispersed than the standard normal. Even if the error has a uniform distribution we can use the t and the standard normal distribution (n large) as an approximation to the sampling distribution of T.

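A simulation of this kind can be sketched as follows. The slope β = 3 follows the slides; the intercept α = 1, σ = 1, the seed, and the number of replications are illustrative choices, not the lecture's. The code checks that |T| ≤ 2.048 in roughly 95% of repeated samples:

```python
import math
import random

# Monte Carlo sketch of the sampling distribution of T
# (β = 3 as in the slides; α = 1 and σ = 1 are assumed values).
random.seed(0)
n, reps = 30, 5000
alpha_true, beta_true, sigma = 1.0, 3.0, 1.0
c = 2.048                                   # t critical value, 28 df

xs = [random.random() for _ in range(n)]    # uniform [0,1] regressors, held fixed
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)

covered = 0
for _ in range(reps):
    ys = [alpha_true + beta_true * x + random.gauss(0.0, sigma) for x in xs]
    ybar = sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    a = ybar - b * xbar
    s2 = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    t_stat = (b - beta_true) / math.sqrt(s2 / sxx)
    covered += abs(t_stat) <= c

print(covered / reps)   # should be close to 0.95
```

Replacing `random.gauss` by a uniform error with the same mean and variance gives nearly the same coverage for n = 30, which is the point of the graphs.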

This explains why we do not worry much about the validity of assumption 5.

Next we consider testing hypotheses on the regression parameters. We use the same setup as in the coin tossing example:

- Null and alternative hypothesis
- Decision rule that chooses between these hypotheses

Null hypothesis: H0: β = β0.

Alternative hypothesis:

H1: β ≠ β0 (two-sided alternative), or
H1: β > β0 (one-sided alternative), or
H1: β < β0 (one-sided alternative).

For instance, if β is a price elasticity we may test H0: β = 0 against H1: β < 0. H0: β = 0 is the null hypothesis that X has no effect on Y.

Decision rule. Consider

T = (β̂ − β0) / √(s²/Σ(X_i − X̄)²).

If H0 is true, then T has a t distribution with n − 2 degrees of freedom. This distribution is symmetric around 0, and most of the time (in repeated samples) we get a value not too far from 0. If H0 is not true, then the distribution of T will shift to the right (if β > β0) or to the left (if β < β0).

The sampling distribution of T for β0 = 3 (the correct value) and for β0 = 2 (an incorrect value) is plotted (n = 30 and 10000 samples).

[Graphs: sampling distributions of T for β0 = 3 and β0 = 2, n = 30]

We choose H1 (and reject H0) if we obtain an unexpectedly large positive or large negative value for T; otherwise we accept H0. The decision rule or test is:

Reject H0 if |T| = |β̂ − β0| / √(s²/Σ(X_i − X̄)²) > c.

How do we choose c?

Remember we can make two errors in the decision:

- False rejection of H0, or Type I error
- False acceptance of H0, or Type II error

As in the coin tossing example, we cannot make both small. The usual approach: fix the probability of a Type I error at a small value. This gives c. Now if H0 is correct then T has a t distribution with n − 2 degrees of freedom. Then

α = Pr(Type I error) = Pr(|T| > c) = 2 Pr(T < −c).

From the confidence interval we know that for α = .05 and n = 30, we have c = 2.048. Hence we reject H0: β = β0 if and only if

|T| = |β̂ − β0| / √(s²/Σ(X_i − X̄)²) > 2.048.

Note that if β0 = 0 we just look at the ratio of the OLS estimate and its standard error. If this is greater than about 2 in absolute value, we reject H0: β = 0.
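The two-sided decision rule can be sketched on the same hypothetical data as before (n = 5, so 3 degrees of freedom and critical value t_{0.975,3} ≈ 3.182):

```python
import math

# Two-sided t test of H0: β = β0 against H1: β ≠ β0 (hypothetical data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
beta_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
alpha_hat = ybar - beta_hat * xbar
s2 = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)

beta0 = 0.0                                  # H0: X has no effect on Y
t_stat = (beta_hat - beta0) / math.sqrt(s2 / sxx)   # estimate / standard error
reject = abs(t_stat) > 3.182                 # two-sided decision rule
```

With β0 = 0 the statistic is just the ratio of the OLS estimate to its standard error, and here it is far above the critical value, so H0 is rejected.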

If the alternative is H1: β > β0, we may use a different decision rule. In that case we reject only if we get an unexpectedly large positive value for T. We reject H0: β = β0 if and only if

T = (β̂ − β0) / √(s²/Σ(X_i − X̄)²) > c.

Now c = 1.70 for a 5% probability of a Type I error. This is a one-sided test.

Prediction. After we have computed the OLS estimates α̂, β̂ using the data Y_i, X_i, i = 1, …, n, we can use the estimated regression model to predict Y for values of X that have not been observed.

Example: prediction of the sales price of a house that is new on the market. If that house has living area X, then we predict

Ŷ = α̂ + β̂ X.

How close is this to Y?

Prediction error:

Y − Ŷ = (α − α̂) + (β − β̂) X + u

The first two terms are the estimation error; u is the unknown future/omitted variables. The estimation error is small if we have e.g. many observations (n large). We cannot reduce u.

We can obtain a prediction interval for Y. This is like a confidence interval. A 95% interval, if n = 30, is

Ŷ − 2.048 s_f < Y < Ŷ + 2.048 s_f

(note we use the t distribution with n − 2 degrees of freedom), with

s_f² = s² (1 + 1/n + (X − X̄)² / Σ (X_i − X̄)²).

Note that if n is large this is close to s². Hence a rule of thumb for the prediction interval is

Ŷ − 2s < Y < Ŷ + 2s.
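The prediction interval formula can be sketched as follows; the data and the new regressor value x_new are hypothetical, and with n = 5 the critical value is t_{0.975,3} ≈ 3.182 rather than 2.048:

```python
import math

# 95% prediction interval Ŷ ± t·s_f for a new, unobserved X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
beta_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
alpha_hat = ybar - beta_hat * xbar
s2 = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)

x_new = 3.5                              # hypothetical unobserved X
y_pred = alpha_hat + beta_hat * x_new    # point prediction Ŷ
s_f = math.sqrt(s2 * (1 + 1 / n + (x_new - xbar) ** 2 / sxx))
c = 3.182                                # t critical value, 3 df
lower, upper = y_pred - c * s_f, y_pred + c * s_f
```

Because s_f² = s²(1 + 1/n + (X − X̄)²/Σ(X_i − X̄)²) is always larger than s², the prediction interval is wider than the rule of thumb Ŷ ± 2s suggests when n is small or X is far from X̄.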
