Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012

Econometric Methods Prediction / Violation of A-Assumptions Burcu Erdogan Universität Trier WS 2011/2012 (Universität Trier) Econometric Methods 30.11.2011 1 / 42

Moving on to...

1 Prediction
2 Violation of Assumption A1
3 Model Selection
   Adjusted coefficient of determination
   Other Criteria: AIC, SC and PC
   F-Test
   t-test
   Non-nested F-Test
   J-Test

Prediction: Point prediction

The aim of point prediction is to forecast y-values from known x-values. In a multiple regression with two regressors, for given values $x_{10}$ and $x_{20}$ we have

$$\hat{y}_0 = \hat{\alpha} + \hat{\beta}_1 x_{10} + \hat{\beta}_2 x_{20}.$$

The estimated model for barley production is

$$\hat{y}_t = 0{,}95432 + 0{,}59652\, x_{1t} + 0{,}26255\, x_{2t}.$$

For a parcel with 29 kg/ha phosphate ($x_{10} = \ln 29 = 3{,}36730$) and 120 kg/ha nitrogen ($x_{20} = \ln 120 = 4{,}78749$), the model predicts $\hat{y}_0 = 4{,}21994$, which corresponds to a barley output of about $e^{4{,}21994} \approx 68$ dt/ha.
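The point prediction above can be reproduced with a few lines of arithmetic. The following is a minimal sketch that uses the coefficients from the slide, written with decimal points instead of decimal commas; the variable names are illustrative, and the back-transformation assumes that the endogenous variable is the natural log of output.

```python
import math

# Coefficients of the estimated barley model, copied from the slide.
alpha_hat, beta1_hat, beta2_hat = 0.95432, 0.59652, 0.26255

x10 = math.log(29)    # log of 29 kg/ha phosphate
x20 = math.log(120)   # log of 120 kg/ha nitrogen

# Point prediction on the log scale.
y0_hat = alpha_hat + beta1_hat * x10 + beta2_hat * x20

# Back-transformation to dt/ha (assumes y is the natural log of output).
output_dt_per_ha = math.exp(y0_hat)

print(round(y0_hat, 5), round(output_dt_per_ha, 1))
```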

Prediction: Point prediction and prediction error

In the case of multiple regression, for the vector $x_0 = [1 \;\; x_{10} \;\; \dots \;\; x_{K0}]'$ we predict $\hat{y}_0 = x_0'\hat{\beta}$. The prediction error is thus

$$\hat{y}_0 - y_0 = x_0'\hat{\beta} - x_0'\beta - u_0 = x_0'(\hat{\beta} - \beta) - u_0.$$

The expected value of the prediction error is

$$E(\hat{y}_0 - y_0) = x_0' E(\hat{\beta} - \beta) - E(u_0) = 0.$$

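Prediction: Variance of the prediction error

The variance of the prediction error is obtained from

$$\begin{aligned}
\operatorname{var}(\hat{y}_0 - y_0) &= E\big[\big((\hat{y}_0 - y_0) - E(\hat{y}_0 - y_0)\big)^2\big] = E\big[(\hat{y}_0 - y_0)(\hat{y}_0 - y_0)\big] \\
&= E\big[\big(x_0'(\hat{\beta} - \beta) - u_0\big)\big(x_0'(\hat{\beta} - \beta) - u_0\big)\big] \\
&= E\big[\big(x_0'(\hat{\beta} - \beta)\big)^2 - 2\, x_0'(\hat{\beta} - \beta)\, u_0 + u_0^2\big] \\
&= E\big[x_0'(\hat{\beta} - \beta)(\hat{\beta} - \beta)' x_0\big] - 2\, x_0' E\big[(\hat{\beta} - \beta)\, u_0\big] + E(u_0^2).
\end{aligned}$$

In the last step, note that $x_0'(\hat{\beta} - \beta) = (\hat{\beta} - \beta)' x_0$. Since the elements of $\hat{\beta}$ depend solely on the observations $t = 1, 2, \dots, T$, and $u_0$ is the disturbance of the additional observation, $\hat{\beta}$ and $u_0$ are independent, and therefore so are $(\hat{\beta} - \beta)$ and $u_0$: $\operatorname{cov}\big((\hat{\beta} - \beta), u_0\big) = 0$.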

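Prediction: Variance of the prediction error

In addition, since $E(u_0) = 0$ and $E(\hat{\beta} - \beta) = o$,

$$E\big[(\hat{\beta} - \beta)\, u_0\big] = E\big[\big((\hat{\beta} - \beta) - E(\hat{\beta} - \beta)\big)\big(u_0 - E(u_0)\big)\big] = \operatorname{cov}\big((\hat{\beta} - \beta), u_0\big) = 0.$$

Moreover,

$$E(u_0^2) = E\big[\big(u_0 - E(u_0)\big)^2\big] = \operatorname{var}(u_0) = \sigma^2.$$

Hence

$$\begin{aligned}
\operatorname{var}(\hat{y}_0 - y_0) &= x_0' E\big[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'\big] x_0 + \sigma^2 \\
&= x_0' C(\hat{\beta})\, x_0 + \sigma^2 \\
&= x_0' \sigma^2 (X'X)^{-1} x_0 + \sigma^2 \\
&= \sigma^2 \big[1 + x_0'(X'X)^{-1} x_0\big].
\end{aligned}$$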
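The closed form $\sigma^2[1 + x_0'(X'X)^{-1}x_0]$ is easy to evaluate numerically. The sketch below does so with NumPy on a made-up design matrix; the data, the value of `sigma2` and the vector `x0` are assumptions chosen only for illustration and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 30
# Hypothetical design matrix: a constant and two regressors.
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
sigma2 = 0.5                       # assumed disturbance variance
x0 = np.array([1.0, 0.3, -1.2])    # assumed regressor values for the prediction

XtX_inv = np.linalg.inv(X.T @ X)

# Variance of the prediction error: sigma^2 * [1 + x0'(X'X)^{-1} x0].
var_pred_error = sigma2 * (1.0 + x0 @ XtX_inv @ x0)

# Same quantity written as x0' C(beta_hat) x0 + sigma^2.
C_beta_hat = sigma2 * XtX_inv
var_two_parts = x0 @ C_beta_hat @ x0 + sigma2

print(var_pred_error, var_two_parts)
```

The prediction-error variance always exceeds $\sigma^2$, because the estimation uncertainty in $\hat{\beta}$ is added to the variance of the new disturbance $u_0$.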

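Prediction: Prediction interval

Standardize the prediction error as follows:

$$t = \frac{(\hat{y}_0 - y_0) - E(\hat{y}_0 - y_0)}{\widehat{se}(\hat{y}_0 - y_0)} = \frac{\hat{y}_0 - y_0}{\widehat{se}(\hat{y}_0 - y_0)}, \qquad \text{where } \widehat{se}(\hat{y}_0 - y_0) = \sqrt{\widehat{\operatorname{var}}(\hat{y}_0 - y_0)}.$$

We know that $t \sim t_{(T-3)}$ in the case of two regressors ($T - K - 1$ degrees of freedom in general), so

$$\Pr\left\{-t_{a/2} \le \frac{\hat{y}_0 - y_0}{\widehat{se}(\hat{y}_0 - y_0)} \le t_{a/2}\right\} = 1 - a.$$

Solving for $y_0$, we get the prediction interval

$$\big[\hat{y}_0 - t_{a/2} \cdot \widehat{se}(\hat{y}_0 - y_0)\,;\;\; \hat{y}_0 + t_{a/2} \cdot \widehat{se}(\hat{y}_0 - y_0)\big].$$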

Prediction: Prediction interval

For barley production, the estimated variance of the prediction error is $\widehat{\operatorname{var}}(\hat{y}_0 - y_0) = 0{,}00502$, hence $\widehat{se}(\hat{y}_0 - y_0) = \sqrt{0{,}00502} \approx 0{,}07088$. If $a = 0{,}05$, the critical t-value with 27 degrees of freedom is 2,0518. The prediction interval is then

$$[4{,}21994 - 2{,}0518 \cdot 0{,}07088\,;\;\; 4{,}21994 + 2{,}0518 \cdot 0{,}07088] = [4{,}0745\,;\; 4{,}3654].$$
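As a quick check of the interval arithmetic, the following sketch plugs in the three numbers reported on the slide (written with decimal points). The conversion of the bounds to dt/ha is an addition for illustration and assumes that the endogenous variable is the natural log of output.

```python
import math

y0_hat = 4.21994   # point prediction (log scale), from the slide
se_pred = 0.07088  # estimated standard error of the prediction error, from the slide
t_crit = 2.0518    # critical t-value, 27 degrees of freedom, a = 0.05, from the slide

lower = y0_hat - t_crit * se_pred
upper = y0_hat + t_crit * se_pred

# Bounds on the original scale (assumes y is the natural log of output).
lower_dt, upper_dt = math.exp(lower), math.exp(upper)

print(round(lower, 4), round(upper, 4))
```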


Violation of Assumption A1

Consider the following multiple regression model:

$$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \dots + \beta_K x_{Kt} + u_t,$$

or, in matrix notation, $y = X\beta + u$.

Assumption A1: The econometric model does not lack any relevant exogenous variables, and the exogenous variables used are not irrelevant.

A correct variable selection is achieved when the econometric model
1 does not exclude any relevant exogenous variables and
2 does not contain any irrelevant exogenous variables.

Violation of Assumption A1

Example: The factors that affect wages ($y_t$) in a company are to be analyzed. Duration of job training ($x_{1t}$), age ($x_{2t}$) and duration of employment in the company ($x_{3t}$) are considered as possible exogenous variables.

  t    y_t   x_1t   x_2t   x_3t  |    t    y_t   x_1t   x_2t   x_3t
  1   1250     1     28     12   |   11   1350     1     30     13
 ...   ...    ...    ...    ...  |   ...   ...    ...    ...    ...
 10   2000     4     58     30   |   20   1550     2     41      6

Violation of Assumption A1

Three competing models, written for observation t:

I   (Underspecified model)   $y_t = \alpha + \beta_1 x_{1t} + u_t$
II  (True model)             $y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t$
III (Overspecified model)    $y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t$

where $y_t$ represents the wage, $x_{1t}$ the duration of job training, $x_{2t}$ age and $x_{3t}$ the duration of employment in the company.

The same models in matrix notation:

I   (Underspecified model)   $y = X_1\beta_1 + \tilde{u}$
II  (True model)             $y = X_1\beta_1 + X_2\beta_2 + u$
III (Overspecified model)    $y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + \nu$

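Violation of Assumption A1: True model

The model $y = X\beta + u$ in partitioned notation:

$$y = X_1\beta_1 + X_2\beta_2 + u,$$

where $\beta = (\beta_1' \;\; \beta_2')'$ and

$$X_1 = \begin{bmatrix} 1 & x_{11} & \cdots & x_{K_1 1} \\ 1 & x_{12} & \cdots & x_{K_1 2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1T} & \cdots & x_{K_1 T} \end{bmatrix}, \qquad
X_2 = \begin{bmatrix} x_{K_1+1,\,1} & x_{K_1+2,\,1} & \cdots & x_{K1} \\ x_{K_1+1,\,2} & x_{K_1+2,\,2} & \cdots & x_{K2} \\ \vdots & \vdots & & \vdots \\ x_{K_1+1,\,T} & x_{K_1+2,\,T} & \cdots & x_{KT} \end{bmatrix}.$$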

Violation of Assumption A1: Estimation results

Table: Overview of estimation results.

Model  Variable            Coeff.   ŝe(.)   t-stat.   p-value
I      Constant            1354,7    94,2    14,377   < 0,001
       Training              89,3    19,8     4,505   < 0,001
II     Constant            1027,8   164,5     6,249   < 0,001
       Training              62,6    21,2     2,953     0,009
       Age                   10,6     4,6     2,317     0,033
III    Constant            1000,5   225,7     4,432   < 0,001
       Training              62,4    21,8     2,859     0,011
       Age                   12,4    10,7     1,159     0,263
       Duration of Emp.      -2,6    14,3    -0,183     0,857

Violation of Assumption A1

[Figure: Interdependencies in the true model.]

Violation of Assumption A1: Omitted variable

Impact on the expected value of the disturbances

The disturbances of the misspecified model ($\tilde{u}$) are $\tilde{u} = X_2\beta_2 + u$, and this results in a violation of Assumption B1:

$$E(\tilde{u}) = E(X_2\beta_2 + u) = X_2\beta_2 + E(u) = X_2\beta_2 + o \neq o.$$

Consequences for the point estimator

The OLS estimator of the incomplete model is denoted by $\tilde{\beta}_1$ and is computed as usual:

$$\begin{aligned}
\tilde{\beta}_1 &= (X_1'X_1)^{-1} X_1' y \\
&= (X_1'X_1)^{-1} X_1' (X_1\beta_1 + X_2\beta_2 + u) \\
&= (X_1'X_1)^{-1} X_1'X_1\beta_1 + (X_1'X_1)^{-1} X_1'X_2\beta_2 + (X_1'X_1)^{-1} X_1'u \\
&= \beta_1 + (X_1'X_1)^{-1} X_1'X_2\beta_2 + (X_1'X_1)^{-1} X_1'u.
\end{aligned}$$

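Violation of Assumption A1: Omitted variable

Take the expected value of both sides ($E(\beta_1) = \beta_1$ and $E(u) = o$):

$$E(\tilde{\beta}_1) = \beta_1 + (X_1'X_1)^{-1} X_1'X_2\beta_2 \neq \beta_1 \quad \text{whenever } X_1'X_2\beta_2 \neq o.$$

Consequences for the interval estimator

$$\big[\tilde{\beta}_1 - t_{a/2}\, \widehat{se}(\tilde{\beta}_1)\,;\;\; \tilde{\beta}_1 + t_{a/2}\, \widehat{se}(\tilde{\beta}_1)\big]$$

$\operatorname{var}(\tilde{\beta}_1)$ must be estimated without bias, because $\widehat{se}(\tilde{\beta}_1) = \sqrt{\widehat{\operatorname{var}}(\tilde{\beta}_1)}$. Here $\widehat{\operatorname{var}}(\tilde{\beta}_1) = c_{22}$, where $c_{22}$ is the corresponding element of the estimated variance-covariance matrix of the estimators, $\hat{C}(\tilde{\beta})$. We therefore need an unbiased estimate of $\sigma^2$. The estimated variance based on the incomplete model is

$$\tilde{\sigma}^2 = \frac{\tilde{u}'\tilde{u}}{T - K_1 - 1}.$$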
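The bias term $(X_1'X_1)^{-1}X_1'X_2\beta_2$ can be made visible with a noise-free example: if $u = o$, the estimator of the incomplete model equals its expected value exactly. The sketch below uses synthetic data; all numbers and variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
x1 = rng.normal(size=T)
x2 = 0.8 * x1 + rng.normal(size=T)     # omitted regressor, correlated with x1

X1 = np.column_stack([np.ones(T), x1]) # regressors kept in the incomplete model
X2 = x2.reshape(-1, 1)                 # regressor that is wrongly omitted
beta1 = np.array([1.0, 2.0])           # assumed true coefficients on X1
beta2 = np.array([0.5])                # assumed true coefficient on X2

# Noise-free y, so the OLS result equals the expected value of the estimator.
y_noiseless = X1 @ beta1 + X2 @ beta2

beta1_tilde = np.linalg.solve(X1.T @ X1, X1.T @ y_noiseless)
bias = np.linalg.solve(X1.T @ X1, X1.T @ X2 @ beta2)

print("estimate in incomplete model:", beta1_tilde)
print("beta1 + bias term:           ", beta1 + bias)
```

Because `x2` is built to be correlated with `x1`, the slope estimate absorbs part of the effect of the omitted variable; with uncorrelated regressors the bias on the slope would shrink towards zero.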

Violation of Assumption A1: Omitted variable, bias of the variance

$$\begin{aligned}
\tilde{\sigma}^2 = E(\tilde{u}'\tilde{u}) &= E\big[(X_2\beta_2 + u)'(X_2\beta_2 + u)\big] = E\big[X_2\beta_2 + u\big]^2 \\
&= E\big[(X_2\beta_2)'(X_2\beta_2) + (X_2\beta_2)'u + u'X_2\beta_2 + u'u\big].
\end{aligned}$$

$(X_2\beta_2)'u$ is a scalar, so $(X_2\beta_2)'u = u'X_2\beta_2$ and

$$E(\tilde{u}'\tilde{u}) = E\big[(X_2\beta_2)'(X_2\beta_2) + 2\,(X_2\beta_2)'u + u'u\big].$$

Violation of Assumption A1: Omitted variable, bias of the variance

$$\begin{aligned}
E(\tilde{u}'\tilde{u}) &= E\big[(X_2\beta_2)'(X_2\beta_2)\big] + 2\,E\big[(X_2\beta_2)'u\big] + E\big[u'u\big] \\
&= E\big[(X_2\beta_2)^2\big] + \sigma^2 > \sigma^2, \quad \text{as } E\big[(X_2\beta_2)^2\big] > 0.
\end{aligned}$$

The middle term vanishes because $E(u) = o$. The underlying true variance of the incomplete model is larger than that of the true model. Consequently, the estimated variance is biased upwards, and $\widehat{\operatorname{var}}(\tilde{\beta}_1)$ is biased as well.

Violation of Assumption A1: Excursion, variance in the true model

$$\begin{aligned}
\hat{u} &= y - \hat{y}, \qquad \hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y \\
\hat{u} &= y - X(X'X)^{-1}X'y \\
&= \big[I_T - X(X'X)^{-1}X'\big]\, y = My.
\end{aligned}$$

Violation of Assumption A1: Excursion, variance in the true model

$$\begin{aligned}
\hat{u} &= M(X\beta + u) = MX\beta + Mu \\
&= \big[I_T X - X(X'X)^{-1}X'X\big]\beta + Mu \\
&= \big[I_T X - X I_{K+1}\big]\beta + Mu = Mu.
\end{aligned}$$

Consequently, the sum of squared residuals $\hat{u}'\hat{u}$ $(= S_{\hat{u}\hat{u}})$ is

$$\hat{u}'\hat{u} = u'M'Mu = u'Mu.$$

Matrix $M$ is symmetric and idempotent, hence $M'M = M$. Since $u'Mu$ is a scalar,

$$\hat{u}'\hat{u} = \operatorname{tr}(u'Mu) = \operatorname{tr}(uu'M).$$

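Violation of Assumption A1: Excursion, variance in the true model

$$\begin{aligned}
E[\hat{u}'\hat{u}] &= \operatorname{tr}\big[E[uu']\,M\big] = \operatorname{tr}\big[V(u)\,M\big] = \operatorname{tr}\big[\sigma^2 I_T M\big] = \sigma^2 \operatorname{tr}[M] \\
&= \sigma^2 \operatorname{tr}\big[I_T - X(X'X)^{-1}X'\big] \\
&= \sigma^2 \big[\operatorname{tr}[I_T] - \operatorname{tr}[X(X'X)^{-1}X']\big] \\
&= \sigma^2 \big[\operatorname{tr}[I_T] - \operatorname{tr}[X'X(X'X)^{-1}]\big] \\
&= \sigma^2 \big[\operatorname{tr}[I_T] - \operatorname{tr}[I_{K+1}]\big] = \sigma^2\,[T - K - 1].
\end{aligned}$$

$$E[\hat{\sigma}^2] = E\left[\frac{\hat{u}'\hat{u}}{T - K - 1}\right] = \frac{\sigma^2\,[T - K - 1]}{T - K - 1} = \sigma^2.$$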
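The properties of $M$ used in this derivation (symmetry, idempotency, $MX = O$ and $\operatorname{tr}[M] = T - K - 1$) can be checked numerically. The following sketch builds $M$ for a randomly generated design matrix; the dimensions and data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, K = 20, 3
# Hypothetical design matrix with a constant and K regressors.
X = np.column_stack([np.ones(T), rng.normal(size=(T, K))])

# Residual-generating matrix M = I_T - X (X'X)^{-1} X'.
M = np.eye(T) - X @ np.linalg.inv(X.T @ X) @ X.T

print("symmetric: ", np.allclose(M, M.T))
print("idempotent:", np.allclose(M @ M, M))
print("M X = O:   ", np.allclose(M @ X, 0.0))
print("trace(M):  ", np.trace(M), "vs T - K - 1 =", T - K - 1)
```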

Violation of Assumption A1: Omitted variable, bias of the estimated sigma

$$\tilde{\sigma}^2 = \frac{\tilde{u}'\tilde{u}}{T - K_1 - 1}.$$

$$\begin{aligned}
\tilde{u} = y - \tilde{y} &= y - X_1\tilde{\beta}_1 \\
&= y - X_1(X_1'X_1)^{-1}X_1'y = M_1 y, \qquad M_1 = I_T - X_1(X_1'X_1)^{-1}X_1'.
\end{aligned}$$

Matrix $M_1$ is usually referred to as the residual-generating matrix. $M_1$ is also symmetric and idempotent, so $M_1'M_1 = M_1$.

Violation of Assumption A1: Omitted variable, bias of the estimated sigma

$$\begin{aligned}
\tilde{u} &= M_1 y = M_1(X_1\beta_1 + X_2\beta_2 + u) \\
&= \big[I_T X_1 - X_1(X_1'X_1)^{-1}X_1'X_1\big]\beta_1 + M_1X_2\beta_2 + M_1u \\
&= M_1X_2\beta_2 + M_1u.
\end{aligned}$$

Consequently, the following holds for the sum of squared residuals $\tilde{u}'\tilde{u}$:

$$\begin{aligned}
\tilde{u}'\tilde{u} &= \big[\beta_2'X_2'M_1' + u'M_1'\big]\big[M_1X_2\beta_2 + M_1u\big] \\
&= \beta_2'X_2'M_1'M_1X_2\beta_2 + \beta_2'X_2'M_1'M_1u + u'M_1'M_1X_2\beta_2 + u'M_1'M_1u \\
&= \beta_2'X_2'M_1X_2\beta_2 + \beta_2'X_2'M_1u + u'M_1X_2\beta_2 + u'M_1u,
\end{aligned}$$

where $u'M_1X_2\beta_2$ and $\beta_2'X_2'M_1u$ are scalars and therefore equal. Hence

$$\tilde{u}'\tilde{u} = \beta_2'X_2'M_1X_2\beta_2 + 2\,\beta_2'X_2'M_1u + u'M_1u.$$

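Violation of Assumption A1: Omitted variable, bias of the estimated sigma

As $u'M_1u$ is a scalar, $u'M_1u = \operatorname{tr}[u'M_1u]$ and $\operatorname{tr}[u'M_1u] = \operatorname{tr}[uu'M_1]$, so

$$E[\tilde{u}'\tilde{u}] = \beta_2'X_2'M_1X_2\beta_2 + 2\,\beta_2'X_2'M_1E(u) + E\big[\operatorname{tr}[uu'M_1]\big].$$

With $E(u) = o$ and $E\big[\operatorname{tr}[uu'M_1]\big] = \sigma^2\,[T - K_1 - 1]$,

$$E[\tilde{u}'\tilde{u}] = \beta_2'X_2'M_1X_2\beta_2 + \sigma^2\,[T - K_1 - 1],$$

$$E(\tilde{\sigma}^2) = \frac{E[\tilde{u}'\tilde{u}]}{T - K_1 - 1} = \sigma^2 + \frac{\beta_2'X_2'M_1X_2\beta_2}{T - K_1 - 1}.$$

The second term is a quadratic form in the symmetric and idempotent (hence positive semidefinite) matrix $M_1$. It is therefore non-negative, and strictly positive whenever $M_1X_2\beta_2 \neq o$ (compare von Auer, p. 303). The estimated residual variance is thus biased upwards.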
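The size of the bias term $\beta_2'X_2'M_1X_2\beta_2/(T - K_1 - 1)$ depends on how much of $X_2\beta_2$ is left after projecting out $X_1$. The sketch below evaluates it for synthetic data; the regressors, `beta2` and `sigma2` are assumptions chosen only to illustrate the formula.

```python
import numpy as np

rng = np.random.default_rng(4)
T, K1 = 20, 1
sigma2 = 1.0                          # assumed true disturbance variance
x1 = rng.normal(size=T)
x2 = 0.6 * x1 + rng.normal(size=T)    # omitted regressor, correlated with x1

X1 = np.column_stack([np.ones(T), x1])
X2 = x2.reshape(-1, 1)
beta2 = np.array([0.5])               # assumed true coefficient on the omitted regressor

# Residual-generating matrix of the incomplete model.
M1 = np.eye(T) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T

# Bias term of the variance estimator and the resulting expected value.
bias_term = beta2 @ X2.T @ M1 @ X2 @ beta2 / (T - K1 - 1)
expected_sigma2_tilde = sigma2 + bias_term

print("bias term:        ", bias_term)
print("E(sigma_tilde^2): ", expected_sigma2_tilde)
```

Because $M_1$ is idempotent, the numerator equals the squared length of $M_1X_2\beta_2$, which is why the term cannot be negative.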

Violation of Assumption A1: Omitted variable

Conclusion: Omitting relevant variables leads to
- biased point estimators,
- biased interval estimators,
- invalid hypothesis tests.

Violation of Assumption A1: Using irrelevant variables

First, suppose that we estimate Model III:

$$y = X_1\beta_1 + X_2\beta_2 + X_3\beta_3 + \nu.$$

This can be represented in a simpler form as $y = X\beta + \nu$, where

$$X = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \beta_2 \\ o \end{bmatrix}.$$

Since Model III is wrong, $\beta_3 = o$.

Violation of Assumption A1: Using irrelevant variables

Consequences for the point estimator

$$\hat{\beta} = (X'X)^{-1}X'y, \qquad y = X\beta + \nu = X_1\beta_1 + X_2\beta_2 + X_3\, o + \nu,$$

$$\hat{\beta} = \beta + (X'X)^{-1}X'\nu,$$

$$E(\hat{\beta}) = E\big(\beta + (X'X)^{-1}X'\nu\big) = E(\beta) + E\big((X'X)^{-1}X'\nu\big) = \beta.$$

Conclusion: Since the elements of the vector $\beta_3$ are zero, the disturbance $\nu$ coincides with $u$ and has expected value zero, so the estimator $\hat{\beta}$ is not systematically biased. There is no problem with the expected value of $\hat{\beta}$.

Violation of Assumption A1: Using irrelevant variables

Consequences for the interval estimator

$$\big[\hat{\beta}_1 - t_{a/2}\, \widehat{se}(\hat{\beta}_1)\,;\;\; \hat{\beta}_1 + t_{a/2}\, \widehat{se}(\hat{\beta}_1)\big]$$

It is crucial that the variance is not biased.

$$u = \nu + X_3\beta_3 \quad\Rightarrow\quad \nu = u - X_3\beta_3 = u - o = u.$$

It is thus clear that the expected value is $E(\nu) = o$. Moreover, the estimated variance is unbiased. The only problem is that the point estimator is not efficient: the estimator of the true model has a smaller variance. (In other words, the matrix $X$ now contains additional columns, and therefore the variance-covariance matrix of the parameter estimates is larger, see von Auer p. 304.)
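The loss of efficiency can be seen directly in the diagonal of $(X'X)^{-1}$, which scales the coefficient variances $\sigma^2 (X'X)^{-1}$. The sketch below compares a three-column design with the same design plus one irrelevant but correlated column; the data are synthetic and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 20
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
x3 = 0.9 * x2 + 0.3 * rng.normal(size=T)   # irrelevant regressor, correlated with x2

X_true = np.column_stack([np.ones(T), x1, x2])
X_over = np.column_stack([X_true, x3])

# Diagonal elements of (X'X)^{-1} for the coefficients shared by both models.
d_true = np.diag(np.linalg.inv(X_true.T @ X_true))
d_over = np.diag(np.linalg.inv(X_over.T @ X_over))[:3]

# Ratio >= 1 means the variance is at least as large in the overspecified model.
ratio = d_over / d_true
print(ratio)
```

The inflation is strongest for the coefficient of `x2`, the regressor that the irrelevant variable is correlated with. This mirrors the wage example, where the standard error of the age coefficient rises from 4,6 in Model II to 10,7 in Model III.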

Violation of Assumption A1: Using irrelevant variables

Conclusion: Using irrelevant variables leads to
- unbiased, but inefficient point estimates,
- unbiased, but inefficient interval estimates,
- valid, but not powerful hypothesis tests.


Adjusted coefficient of determination

The coefficient of determination

$$R^2 = \frac{S_{yy} - S_{\hat{u}\hat{u}}}{S_{yy}} = 1 - \frac{S_{\hat{u}\hat{u}}}{S_{yy}}$$

is a meaningful criterion for comparing models if the following conditions are met:
1 the endogenous variables of the models are identical, i.e. their numerical values match,
2 the number of exogenous variables is identical in the models,
3 the models have a level parameter $\alpha$.

Adjusted coefficient of determination

Econometric software provides the following coefficients of determination and estimated variances of $\hat{\alpha}$ and $\hat{\beta}_1$ for the three wage models:

Table: Coefficients of determination and estimated coefficient variances of the three competing models.

Model   R² in %   var(α̂)    var(β̂1)
I        52,99     8.877    392,824
II       64,27    27.052    449,043
III      64,34    50.953    477,745

Adjusted coefficient of determination

The adjusted coefficient of determination:

$$\bar{R}^2 = 1 - \frac{S_{\hat{u}\hat{u}} / (T - K - 1)}{S_{yy} / (T - 1)} = 1 - \frac{S_{\hat{u}\hat{u}}\,(T - 1)}{S_{yy}\,(T - K - 1)} = 1 - (1 - R^2)\,\frac{T - 1}{T - K - 1}.$$

Comparison of the three wage models:

Model I:   $\bar{R}^2 = 1 - (1 - 0{,}5299)\,\dfrac{20 - 1}{20 - 2} = 0{,}5038$
Model II:  $\bar{R}^2 = 1 - (1 - 0{,}6427)\,\dfrac{20 - 1}{20 - 3} = 0{,}6007$
Model III: $\bar{R}^2 = 1 - (1 - 0{,}6434)\,\dfrac{20 - 1}{20 - 4} = 0{,}5766$
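The three adjusted values can be recomputed from the reported $R^2$ figures. The helper `adjusted_r2` below is a hypothetical name introduced for this sketch; the inputs are the rounded $R^2$ values from the slides (with decimal points), so the last digit may differ slightly from the slide.

```python
def adjusted_r2(r2, T, K):
    """Adjusted R^2 = 1 - (1 - R^2) * (T - 1) / (T - K - 1)."""
    return 1.0 - (1.0 - r2) * (T - 1) / (T - K - 1)

T = 20
# (R^2, number of exogenous variables K) for the three wage models.
models = {"I": (0.5299, 1), "II": (0.6427, 2), "III": (0.6434, 3)}

adj = {name: adjusted_r2(r2, T, K) for name, (r2, K) in models.items()}
best = max(adj, key=adj.get)   # model with the largest adjusted R^2

for name, value in adj.items():
    print(name, round(value, 4))
print("preferred model:", best)
```

Although Model III has the highest unadjusted $R^2$, the penalty for the additional regressor makes Model II the preferred model under this criterion.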

Other Criteria: AIC, SC and PC

Akaike information criterion:

$$AIC = \ln\left(\frac{S_{\hat{u}\hat{u}}}{T}\right) + \frac{2(K + 1)}{T}$$

Schwarz criterion:

$$SC = \ln\left(\frac{S_{\hat{u}\hat{u}}}{T}\right) + \frac{(K + 1)\ln T}{T}$$

Prediction criterion:

$$PC = \frac{S_{\hat{u}\hat{u}}\,[1 + (K + 1)/T]}{T - K - 1}$$

Other Criteria: AIC, SC and PC

We obtain the following $S_{\hat{u}\hat{u}}$ values for the three competing wage models I, II and III: 1.260.028, 957.698 and 955.692. The following results arise:

Table: AIC, SC and PC of the three models to compare.

Model   AIC      SC       PC
I       11,251   11,350   77.001,715
II      11,077   11,226   64.785,45
III     11,174   11,374   71.676,877

All three criteria are smallest for Model II.
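The three criteria can be evaluated from the sums of squared residuals given above. The function `criteria` is a hypothetical helper written for this sketch; it implements the formulas for AIC, SC and PC as stated on the previous slide and uses the rounded $S_{\hat{u}\hat{u}}$ values, so small deviations in the last digits are possible.

```python
import math

def criteria(ssr, T, K):
    """Return (AIC, SC, PC) for a model with K exogenous variables."""
    aic = math.log(ssr / T) + 2 * (K + 1) / T
    sc = math.log(ssr / T) + (K + 1) * math.log(T) / T
    pc = ssr * (1 + (K + 1) / T) / (T - K - 1)
    return aic, sc, pc

T = 20
# (sum of squared residuals, K) for the three wage models.
models = {"I": (1260028, 1), "II": (957698, 2), "III": (955692, 3)}
results = {name: criteria(ssr, T, K) for name, (ssr, K) in models.items()}

for name, (aic, sc, pc) in results.items():
    print(name, round(aic, 3), round(sc, 3), round(pc, 1))

# For each criterion, the model with the smallest value is preferred.
preferred = [min(results, key=lambda m: results[m][i]) for i in range(3)]
print("preferred by AIC, SC, PC:", preferred)
```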

F-Test

One could perform an F-test to compare Model I and Model III, with $H_0: \beta_2 = \beta_3 = 0$. The F-statistic is calculated as

$$F = \frac{(S^0_{\hat{u}\hat{u}} - S_{\hat{u}\hat{u}})/L}{S_{\hat{u}\hat{u}}/(T - K - 1)}.$$

For our example, we have

$$F = \frac{(1.260.028 - 955.692)/2}{955.692/(20 - 4)} = 2{,}548.$$

The critical F-value at a significance level of $a = 0{,}05$ for 2 and 16 degrees of freedom is $F_{0{,}05} = 3{,}634$. Hence $H_0$ cannot be rejected: the two variables are not jointly significant.
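The F-statistic follows directly from the two sums of squared residuals. The sketch below repeats the calculation with the numbers from the slide; the critical value is taken from the slide rather than computed.

```python
ssr_restricted = 1260028    # Model I (H0 imposed)
ssr_unrestricted = 955692   # Model III
L = 2                       # number of restrictions under H0
T, K = 20, 3                # observations and exogenous variables in Model III

F = ((ssr_restricted - ssr_unrestricted) / L) / (ssr_unrestricted / (T - K - 1))

F_crit = 3.634              # critical value for (2, 16) df at a = 0.05, from the slide
reject_h0 = F > F_crit

print(round(F, 3), reject_h0)
```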

t-test

To compare Models II and III, we could test $H_0: \beta_3 = 0$:

$$t = \frac{\hat{\beta}_3 - q}{\widehat{se}(\hat{\beta}_3)} = \frac{\hat{\beta}_3 - 0}{\widehat{se}(\hat{\beta}_3)}.$$

Normally, the respective t-statistics should be determined for all exogenous variables of the competing models. The critical value $t_{a/2}$ at a significance level of $a = 0{,}05$ is 2,1098 for Model II (17 degrees of freedom) and 2,1199 for Model III (16 degrees of freedom). The estimation results in the table of Models I to III point to Model II as the true model overall: in Model II both training and age are significant (t = 2,953 and t = 2,317), whereas in Model III neither age (t = 1,159) nor duration of employment (t = -0,183) is significant.

Non-nested F-Test

Let Model IV be an alternative candidate to Model II:

$$y_t = \alpha + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t.$$

Table: Estimation results for Model IV.

Variable            Coeff.   ŝe(.)   t-stat.   p-value
Constant            921,4    267,2    3,449     0,003
Age                  20,7     12,2    1,691     0,109
Duration of emp.      4,1     17,0    0,241     0,812

At a significance level of 5%, neither age nor duration of employment is significant.

Non-nested F-Test

The non-nested F-test proceeds in two steps:
1 From the two models, a mega-model is formed: $y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + u_t$ [corresponds to Model III].
2 For this model, two F-tests are performed. First, the joint significance of the betas that appear in the first model but not in the alternative is tested. Second, the joint significance of the betas that appear in the alternative model but not in the first model is tested.

In our example, the null hypotheses $H_0: \beta_1 = 0$ and $H_0: \beta_3 = 0$ are tested successively with an F-test (t-tests would also be acceptable here). For $H_0: \beta_1 = 0$, $F = 8{,}174$, and for $H_0: \beta_3 = 0$, $F = 0{,}033$. The critical F-value is $F_{0{,}05} = 4{,}494$. Since $\beta_1$ is significant and $\beta_3$ is not, Model II is preferable to the alternative model.
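The remark that t-tests would also be acceptable rests on a standard result: for a single linear restriction, the F-statistic equals the square of the corresponding t-statistic. The short check below applies this to the t-statistics reported for Model III in the table of estimation results (2,859 for training and -0,183 for duration of employment); small differences are due to rounding in the reported values.

```python
# t-statistics of training and duration of employment in Model III (mega-model).
t_training = 2.859
t_duration = -0.183

# For one restriction, F = t^2.
F_beta1 = t_training ** 2
F_beta3 = t_duration ** 2

F_crit = 4.494  # critical value for (1, 16) df at a = 0.05, from the slide

print(round(F_beta1, 3), F_beta1 > F_crit)
print(round(F_beta3, 3), F_beta3 > F_crit)
```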

J-Test

Models II and IV are to be compared again. The J-test proceeds in the following steps:
1 The estimated values of the endogenous variable, $\hat{y}_t$, are calculated for Model II.
2 This variable is used as an additional regressor in Model IV: $y_t = \alpha + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 \hat{y}_t + u_t$. (Model IV*)
3 A t-test for the null hypothesis $H_0: \beta_4 = 0$ is run.
4 The same test is also carried out in reverse: Model IV is estimated and the corresponding values of $\hat{y}_t$ are included as an additional regressor in Model II: $y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_4 \hat{y}_t + u_t$. (Model II*) A t-test is conducted for the null hypothesis $H_0: \beta_4 = 0$.

J-Test

Econometric software delivers the estimated values $\hat{y}_t$ for Model II. Based on these values, an OLS estimation of Model IV* can be run. This delivers $\hat{\beta}_4 = 1{,}0$ and (for $H_0: \beta_4 = 0$) $t = 2{,}859$. The critical t-value is $t_{a/2} = 2{,}1199$ (16 degrees of freedom, at a significance level of $a = 0{,}05$). The null hypothesis is rejected.

Likewise, econometric software delivers the estimated values $\hat{y}_t$ for Model IV. OLS estimation of Model II* delivers $\hat{\beta}_4 = 0{,}637$ and (for $H_0: \beta_4 = 0$) $t = 0{,}183$. The critical t-value is again $t_{a/2} = 2{,}1199$ (16 degrees of freedom, at a significance level of $a = 0{,}05$). The null hypothesis $H_0: \beta_4 = 0$ cannot be rejected.

The J-test therefore prefers Model II to Model IV: the fitted values of Model II add significant information to Model IV, while the fitted values of Model IV add nothing significant to Model II.
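The J-test steps can be sketched in a few lines of NumPy. The wage data of the lecture are not fully reproduced in the slides, so the example below uses synthetic data in which a model of the form of Model II is true by construction; the helper `ols`, the sample size and all coefficients are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, t-statistics and fitted values."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, beta / se, X @ beta

rng = np.random.default_rng(3)
T = 200
x1, x2 = rng.normal(size=T), rng.normal(size=T)
x3 = 0.7 * x2 + rng.normal(size=T)
# Synthetic data: y depends on x1 and x2 only (the form of Model II).
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=T)

ones = np.ones(T)
X_II = np.column_stack([ones, x1, x2])   # Model II
X_IV = np.column_stack([ones, x2, x3])   # Model IV

# Steps 1 to 3: fitted values of Model II as extra regressor in Model IV.
_, _, yhat_II = ols(X_II, y)
_, t_IV_star, _ = ols(np.column_stack([X_IV, yhat_II]), y)
t_forward = t_IV_star[-1]

# Step 4: fitted values of Model IV as extra regressor in Model II.
_, _, yhat_IV = ols(X_IV, y)
_, t_II_star, _ = ols(np.column_stack([X_II, yhat_IV]), y)
t_reverse = t_II_star[-1]

# t-statistics in the mega-model (Model III), for comparison.
_, t_mega, _ = ols(np.column_stack([ones, x1, x2, x3]), y)

print("t for y_hat(II) in Model IV*:", t_forward)
print("t for y_hat(IV) in Model II*:", t_reverse)
```

In this setup the two augmented models span the same regressors as the mega-model, so the absolute t-statistics of the added fitted values coincide with those of $x_1$ and $x_3$ in the mega-model. The same pattern appears in the lecture's numbers, where 2,859 and 0,183 match the absolute t-statistics of training and duration of employment in Model III.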