Lecture 10 (090420)
Chapter 13 Econometric Modeling: Model Specification and Diagnostic Testing

Types of specification errors

Consider the following models:

$$
\begin{aligned}
Y_i &= \beta_1 + \beta_2 X_i + \beta_3 X_i^2 + \beta_4 X_i^3 + u_i && \text{(Model A)} \\
Y_i &= \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + v_i && \text{(Model B)} \\
Y_i &= \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + \alpha_4 X_i^3 + \alpha_5 X_i^4 + v_i && \text{(Model C)} \\
\ln Y_i &= \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + \alpha_4 X_i^3 + v_i && \text{(Model D)} \\
Y_i &= \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + \alpha_4 X_i^3 + v_i && \text{(Model E)}
\end{aligned}
$$

Assume Model A is the true model.

1. If Model B is fitted, one commits the error of underfitting the model by omitting a relevant variable.
2. If Model C is fitted, one commits the error of overfitting the model by including an irrelevant variable.
3. If Model D is fitted, one commits the error of using the wrong functional form.
4. There can be errors of measurement in the variables, i.e. instead of observing $Y_i$ and $X_i$, one observes $Y_i^* = Y_i + \varepsilon_i$ and $X_i^* = X_i + w_i$. Fitting Model E with the mismeasured variables then causes errors-of-measurement bias.
5. Another type of specification error can arise from incorrect specification of the error term. For example, compare the following two models:

$$Y_i = \beta X_i u_i \qquad (1)$$
$$Y_i = \beta X_i + u_i \qquad (2)$$

In (1) the error term enters multiplicatively and in (2) it enters additively.

Consequences of specification errors: Underfitting

- Biased parameter estimators, i.e. $E(\hat\beta) \neq \beta$.
- Inconsistent parameter estimators, i.e. the bias does not disappear as the sample size gets larger.
- Biased variance estimators, i.e. $E\left[\widehat{\mathrm{Var}}(\hat\beta)\right] \neq \mathrm{Var}(\hat\beta)$.
- $\sigma^2$ is incorrectly estimated.

Therefore there is a substantial risk that conclusions based on the usual confidence intervals and hypothesis tests are wrong. The same caution applies to forecast intervals.

Consequences of specification errors: Underfitting, an example

If the true model is

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \qquad (A)$$

but instead

$$Y_i = \alpha_1 + \alpha_2 X_{2i} + u_i \qquad (B)$$

is mistakenly assumed, it can be shown that

$$E(\hat\alpha_2) = \beta_2 + \beta_3 b_{32}$$

where $b_{32}$ is the slope coefficient in the regression of $X_3$ on $X_2$. That is, $\hat\alpha_2$ is biased unless $b_{32} = 0$ (in which case $X_2$ and $X_3$ are uncorrelated, which is uncommon in economic data).
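
The bias formula has an exact in-sample counterpart: the fitted slope of the short regression equals $\hat\beta_2 + \hat\beta_3 b_{32}$ computed from the long regression. The following is a minimal sketch of that identity on simulated data. The parameter values (2, 3 and 0.6), the sample size and the helper names `dev` and `sp` are illustrative assumptions, not taken from the slides.

```python
import random

random.seed(0)
n = 500
# Simulated data (assumed values): X3 is correlated with X2, true beta2 = 2, beta3 = 3.
X2 = [random.gauss(0, 1) for _ in range(n)]
X3 = [0.6 * a + random.gauss(0, 1) for a in X2]
Y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 1) for a, b in zip(X2, X3)]


def dev(v):
    """Return the deviations of v from its mean."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def sp(a, b):
    """Return the sum of products of two equally long sequences."""
    return sum(p * q for p, q in zip(a, b))


y, x2, x3 = dev(Y), dev(X2), dev(X3)
s22, s33, s23 = sp(x2, x2), sp(x3, x3), sp(x2, x3)
sy2, sy3 = sp(y, x2), sp(y, x3)

# Long regression (model A): closed-form OLS slopes with two regressors.
det = s22 * s33 - s23 ** 2
beta2_hat = (sy2 * s33 - sy3 * s23) / det
beta3_hat = (sy3 * s22 - sy2 * s23) / det

# Short regression (model B) and auxiliary regression of X3 on X2.
alpha2_hat = sy2 / s22
b32 = s23 / s22

print(alpha2_hat, beta2_hat + beta3_hat * b32)  # the two numbers coincide
```

With these assumed values the short-regression slope lands near $2 + 3 \cdot 0.6 = 3.8$ rather than near the true $\beta_2 = 2$.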

The variances of $\hat\beta_2$ and $\hat\alpha_2$ are given by

$$\mathrm{Var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_{2i}^2 \,(1 - r_{23}^2)}$$

$$\mathrm{Var}(\hat\alpha_2) = \frac{\sigma^2}{\sum x_{2i}^2}$$

where $x_{2i}$ denotes the deviation of $X_{2i}$ from its sample mean and $r_{23}$ is the sample correlation between $X_2$ and $X_3$. It can be seen that $\mathrm{Var}(\hat\alpha_2) \le \mathrm{Var}(\hat\beta_2)$. Although $\hat\alpha_2$ is biased, it has a smaller variance than $\hat\beta_2$; there is a tradeoff between bias and variance in this case.

Generally $\sigma^2$ is unknown and has to be estimated by $\hat\sigma^2$. Thus, the variances of $\hat\beta_2$ and $\hat\alpha_2$ are estimated by

$$\widehat{\mathrm{Var}}(\hat\beta_2) = \frac{\hat\sigma_A^2}{\sum x_{2i}^2 \,(1 - r_{23}^2)} = \frac{RSS_A / df_A}{\sum x_{2i}^2 \,(1 - r_{23}^2)}$$

$$\widehat{\mathrm{Var}}(\hat\alpha_2) = \frac{\hat\sigma_B^2}{\sum x_{2i}^2} = \frac{RSS_B / df_B}{\sum x_{2i}^2}$$

It can be seen that $\widehat{\mathrm{Var}}(\hat\alpha_2) \neq \widehat{\mathrm{Var}}(\hat\beta_2)$.

Now, is $\widehat{\mathrm{Var}}(\hat\alpha_2) > \widehat{\mathrm{Var}}(\hat\beta_2)$ or is $\widehat{\mathrm{Var}}(\hat\alpha_2) < \widehat{\mathrm{Var}}(\hat\beta_2)$? It depends!
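
One way to see what it depends on is to write the ratio of the two estimated variances. This is a sketch that follows directly from the two formulas above, assuming $df_A = n - 3$ and $df_B = n - 2$ for models (A) and (B):

$$\frac{\widehat{\mathrm{Var}}(\hat\alpha_2)}{\widehat{\mathrm{Var}}(\hat\beta_2)} = \frac{RSS_B / (n - 2)}{RSS_A / (n - 3)} \,(1 - r_{23}^2)$$

Since $RSS_B \ge RSS_A$, the first factor tends to be larger than one when $X_3$ explains a lot of $Y$, while the second factor is at most one and shrinks as $X_2$ and $X_3$ become more correlated. Which effect dominates depends on the data.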

Consequences of specification errors: Underfitting, in summary

Once a model is formulated on the basis of relevant theory, it is generally not a good idea to drop a variable from such a model.

Consequences of specification errors: Overfitting

- Inefficient parameter estimators, i.e. their variances are generally larger than those of the true model.
- Risk of introducing multicollinearity.
- Loss of degrees of freedom because an irrelevant parameter is estimated.

Consequences of specification errors: Overfitting, an example

Now reconsider the two models A and B:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i \qquad (A)$$
$$Y_i = \alpha_1 + \alpha_2 X_{2i} + u_i \qquad (B)$$

Assume that model B is in fact the true model and model A is mistakenly fitted. We have already seen that $\mathrm{Var}(\hat\alpha_2) \le \mathrm{Var}(\hat\beta_2)$, i.e. the penalty for the mistake of adding the variable $X_3$ is an unnecessarily large variance of the parameter estimator.

Consequences of specification errors: Errors of measurement in Y

Suppose Y is measured with error, such that $Y_i^* = Y_i + \varepsilon_i$. Ideally, Y could be correctly measured and the following model estimated:

$$Y_i = \alpha + \beta X_i + u_i \qquad (3)$$

but because of the measurement errors what is estimated is in fact

$$
\begin{aligned}
Y_i^* &= \alpha + \beta X_i + u_i \qquad (4) \\
Y_i + \varepsilon_i &= \alpha + \beta X_i + u_i \\
Y_i &= \alpha + \beta X_i + u_i - \varepsilon_i
\end{aligned}
$$

so the error term now combines $u_i$ with the measurement error $\varepsilon_i$.

Given some assumptions (*) about the error terms $u_i$ and measurement errors $\varepsilon_i$ it can be shown that

- $\hat\beta$ is still unbiased
- but the variance of $\hat\beta$ is larger in the presence of measurement errors in Y, i.e. $\mathrm{Var}(\hat\beta)_{\text{Model (4)}} > \mathrm{Var}(\hat\beta)_{\text{Model (3)}}$

(*) In addition to the usual assumptions about $u_i$, these are $E(\varepsilon_i) = 0$, $\mathrm{cov}(X_i, \varepsilon_i) = 0$ and $\mathrm{cov}(u_i, \varepsilon_i) = 0$.
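
The size of the variance increase can be made explicit. The following is a sketch under the assumptions (*), with the additional assumption that $\varepsilon_i$ has constant variance; the symbols $\sigma_u^2$ and $\sigma_\varepsilon^2$ for the two variances are introduced here and do not appear in the slides. Because $u_i$ and $\varepsilon_i$ are uncorrelated, the composite error has variance $\sigma_u^2 + \sigma_\varepsilon^2$, so

$$\mathrm{Var}(\hat\beta)_{\text{Model (3)}} = \frac{\sigma_u^2}{\sum x_i^2}, \qquad \mathrm{Var}(\hat\beta)_{\text{Model (4)}} = \frac{\sigma_u^2 + \sigma_\varepsilon^2}{\sum x_i^2}$$

where $x_i$ is the deviation of $X_i$ from its sample mean.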

Consequences of specification errors: Errors of measurement in X

Suppose X is measured with error, such that $X_i^* = X_i + w_i$. Ideally, X could be correctly measured and model (3) estimated, but because of the measurement errors what is estimated is in fact

$$
\begin{aligned}
Y_i &= \alpha + \beta X_i^* + u_i \\
Y_i &= \alpha + \beta (X_i + w_i) + u_i \\
Y_i &= \alpha + \beta X_i + (\beta w_i + u_i) \\
Y_i &= \alpha + \beta X_i + z_i \qquad (5)
\end{aligned}
$$

One can no longer assume that the error term $z_i$ is independent of the explanatory variable X, which is a violation of Assumption 2: fixed X values or X values independent of the error term.

Violations of this assumption cause the OLS parameter estimators to be

- biased
- inconsistent
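
A small simulation can illustrate the inconsistency. The sketch below uses its own explicit convention: the true model is $Y_i = \alpha + \beta X_i + u_i$, and Y is regressed on the observed $X_i^* = X_i + w_i$. Written in terms of the observed regressor, the error term contains $w_i$, which is why it is correlated with the regressor. For classical measurement error the slope converges to $\beta \, \sigma_X^2 / (\sigma_X^2 + \sigma_w^2)$, a standard result that is not stated in the slides. All numerical values and the helper `slope` are illustrative assumptions.

```python
import random

random.seed(1)
n = 20000
alpha, beta = 1.0, 2.0   # assumed true parameters

# True regressor, observed regressor with measurement error w ~ N(0, 1), and response.
X_true = [random.gauss(0, 1) for _ in range(n)]
X_obs = [x + random.gauss(0, 1) for x in X_true]
Y = [alpha + beta * x + random.gauss(0, 1) for x in X_true]


def slope(x, y):
    """OLS slope of y on x (with intercept), using deviations from the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


slope_true = slope(X_true, Y)   # close to beta = 2
slope_obs = slope(X_obs, Y)     # close to beta * 1 / (1 + 1) = 1
print(slope_true, slope_obs)
```

Increasing `n` does not move `slope_obs` back toward 2, which is what inconsistency means here.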

Consequences of specification errors: Incorrect specification of the error term

The consequence is biased parameter estimators. For example, recall

$$Y_i = \beta X_i u_i \qquad (1)$$
$$Y_i = \beta X_i + u_i \qquad (2)$$

Suppose model (1) is true but the error term is incorrectly assumed to be additive as in (2). Assuming $\ln u_i$ is normally distributed with mean zero and variance $\sigma^2$, it can be shown that

$$E(\hat\beta) = \beta e^{\sigma^2/2} \neq \beta$$
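
The result can be checked numerically. The sketch below assumes that $\ln u_i \sim N(0, \sigma^2)$ and that model (2) is fitted by OLS through the origin, $\hat\beta = \sum X_i Y_i / \sum X_i^2$. The values of `beta`, `sigma` and the range of X are arbitrary choices for illustration.

```python
import math
import random

random.seed(2)
n = 20000
beta, sigma = 2.0, 0.5   # assumed true slope and std. dev. of ln(u)

# Data generated from the multiplicative model (1): Y = beta * X * u.
X = [random.uniform(1.0, 2.0) for _ in range(n)]
Y = [beta * x * math.exp(random.gauss(0.0, sigma)) for x in X]

# OLS through the origin, as if the additive model (2) were true.
beta_hat = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)

# Theoretical expectation from the slide: beta * exp(sigma^2 / 2).
expected = beta * math.exp(sigma ** 2 / 2)
print(beta_hat, expected)
```

Both numbers come out clearly above the true value of 2, in line with the upward bias factor $e^{\sigma^2/2}$.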

Tests of specification errors

Detecting the presence of irrelevant variables in the model can be accomplished with the usual t and/or sequential F tests.

However, one should be careful of "data mining". An example is using these tests to build a model iteratively by stepwise expansion (starting with $X_2$ in the model and keeping it if $\hat\beta_2$ is statistically significant, then adding $X_3$ if $\hat\beta_3$ turns out to be statistically significant, etc.) and testing a lot of different models without any a priori theoretical justification.

The reason for this is that the stated significance level then cannot be trusted. If there are c candidate explanatory variables out of which k are finally selected, the following approximate relationship is suggested:

$$\alpha^* \approx (c/k)\,\alpha$$

where $\alpha$ is the nominal significance level and $\alpha^*$ is the true significance level.
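
A quick calculation shows how fast the true level grows. The sketch below evaluates the approximation above together with the expression $1 - (1 - \alpha)^{c/k}$, which is the form usually credited to Lovell and from which the approximation is derived; that expression, the function name and the example values c = 15, k = 5 are not in the slides.

```python
def true_significance_level(alpha, c, k):
    """Return (approximate, exact) true significance level after selecting
    k regressors out of c candidates at nominal level alpha."""
    approx = (c / k) * alpha
    exact = 1.0 - (1.0 - alpha) ** (c / k)
    return approx, exact


# Hypothetical example: 15 candidates, 5 selected, nominal level 5%.
approx, exact = true_significance_level(0.05, c=15, k=5)
print(approx, exact)  # roughly 0.15 and 0.143
```

In this example a test reported at the 5% level is really operating at about 14 to 15%.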

Examination of residual plots can be useful for detecting specification errors, such as underfitting or assuming the wrong functional form. The residuals should not exhibit any patterns.

The Durbin-Watson d statistic can be used to test whether an alternative specification of the model is superior, for example whether a quadratic term $X^2$ should be added to the model. If the estimated d value is significant, the conclusion would be to add the candidate variable, e.g. $X^2$.
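
The slides do not write out the statistic; the usual definition is $d = \sum_{t=2}^{n} (\hat u_t - \hat u_{t-1})^2 / \sum_{t=1}^{n} \hat u_t^2$. The sketch below computes d for the residuals of a straight line fitted to exactly quadratic data. It assumes the residuals are ordered by increasing values of the candidate variable, which is how this use of the test is usually set up. The data and function names are illustrative, and judging significance still requires the Durbin-Watson tables.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def linear_fit_residuals(x, y):
    """Residuals from an OLS regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    a0 = my - b * mx
    return [c - (a0 + b * a) for a, c in zip(x, y)]


# Hypothetical data: y is exactly quadratic in x, but only a line is fitted.
x = list(range(20))            # already sorted by increasing x
y = [float(v * v) for v in x]
d = durbin_watson(linear_fit_residuals(x, y))
print(d)  # far below 2: the residuals follow a smooth U-shaped pattern
```

A value of d near 2 would indicate no systematic pattern in the ordered residuals; the very low value here reflects the omitted quadratic term.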

Tests of specification errors: The Lagrange Multiplier (LM) test

The LM test is a general test for comparing different model specifications. Say, for example, that we are interested in comparing

$$Y_i = \beta_1 + \beta_2 X_i + \beta_3 X_i^2 + \beta_4 X_i^3 + u_i$$

to

$$Y_i = \beta_1 + \beta_2 X_i + u_i$$

where the latter is a restricted version of the first.

To test the restricted regression (or equivalently to test $\beta_3 = \beta_4 = 0$) proceed as follows:

1. Estimate the restricted regression to obtain the estimated residuals $\hat u_i$.
2. Let the estimated residuals be the dependent variable in a new regression including the same explanatory variables as the full model, here $\hat u_i = \alpha_1 + \alpha_2 X_i + \alpha_3 X_i^2 + \alpha_4 X_i^3 + v_i$, and obtain $R^2$ in this regression.
3. For large sample sizes, $nR^2$ is approximately $\chi^2$ distributed with df equal to the number of restrictions.

Therefore, if $nR^2$ exceeds the critical value of the $\chi^2$ distribution with df equal to the number of restrictions, the restricted regression is rejected. In the present example it is rejected if $nR^2 > \chi^2_{(2)}$ (or in other words, the hypothesis $\beta_3 = \beta_4 = 0$ is rejected).
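
The three steps above can be sketched in plain Python as follows. The OLS routine solves the normal equations by Gauss-Jordan elimination, which is adequate for a small illustration but not a numerically robust choice. The simulated data, the function names and the 5% critical value 5.991 for $\chi^2$ with 2 df are assumptions made for the example. With 2 df the p-value has the closed form $e^{-LM/2}$.

```python
import math
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def residuals(X, y, b):
    """Residuals y - Xb."""
    return [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(X, y)]


def lm_statistic(x, y):
    """n * R^2 from regressing the restricted residuals on the full regressor set."""
    n = len(y)
    restricted = [[1.0, xi] for xi in x]
    full = [[1.0, xi, xi ** 2, xi ** 3] for xi in x]
    u = residuals(restricted, y, ols(restricted, y))   # step 1
    v = residuals(full, u, ols(full, u))               # step 2
    u_bar = sum(u) / n
    r2 = 1.0 - sum(e * e for e in v) / sum((e - u_bar) ** 2 for e in u)
    return n * r2                                      # step 3


# Hypothetical data in which the quadratic term matters (beta3 = 1.5, beta4 = 0).
random.seed(3)
n = 200
x = [random.uniform(-2, 2) for _ in range(n)]
y = [1 + 2 * xi + 1.5 * xi ** 2 + random.gauss(0, 1) for xi in x]

lm = lm_statistic(x, y)
p_value = math.exp(-lm / 2)   # chi-square survival function for 2 df
reject = lm > 5.991           # 5% critical value of chi-square with 2 df
print(lm, p_value, reject)
```

Here the restricted (linear) regression is rejected, as expected given how the data were generated.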

Model Discrimination

How can we compare two different models like

$$Y_i = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + u_i$$

and

$$Y_i = \beta_1 + \beta_2 Z_{2i} + \beta_3 Z_{3i} + u_i \;?$$

Some commonly used criteria are described below.

Model Discrimination: Adjusted $R^2$

$$\bar R^2 = 1 - \frac{RSS/(n - k)}{TSS/(n - 1)}$$

When comparing two or more $\bar R^2$ values, the dependent variable must be the same. However, a good "in-sample goodness of fit" as measured by $\bar R^2$ does not guarantee good forecasts for "out-of-sample" observations.

Model Discrimination: Akaike Information Criterion (AIC)

The AIC is defined as

$$AIC = e^{2k/n}\,\frac{RSS}{n}$$

which can be rewritten as

$$\ln AIC = \frac{2k}{n} + \ln\left(\frac{RSS}{n}\right)$$

where k here is the number of estimated parameters. In effect the AIC imposes a penalty (the term 2k/n) for adding explanatory variables to the model. The model which has the lowest AIC is preferred.

Model Discrimination: Schwarz Information Criterion (SIC)

Similar in spirit to AIC, the SIC is defined as

$$SIC = n^{k/n}\,\frac{RSS}{n}$$

which can be rewritten as

$$\ln SIC = \frac{k}{n}\ln n + \ln\left(\frac{RSS}{n}\right)$$

where the penalty term is $(k/n)\ln n$, i.e. SIC imposes a harsher penalty than AIC (whenever $\ln n > 2$, that is $n \ge 8$). As with AIC, the model with the lowest SIC is preferred.
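
The criteria are easy to compute once RSS, TSS, n and k are known. The sketch below implements the adjusted $R^2$ and the log forms of AIC and SIC exactly as defined above. The two candidate models and their RSS values are made-up numbers for illustration.

```python
import math


def adjusted_r2(rss, tss, n, k):
    """Adjusted R^2 = 1 - [RSS/(n-k)] / [TSS/(n-1)]."""
    return 1.0 - (rss / (n - k)) / (tss / (n - 1))


def ln_aic(rss, n, k):
    """ln AIC = 2k/n + ln(RSS/n)."""
    return 2.0 * k / n + math.log(rss / n)


def ln_sic(rss, n, k):
    """ln SIC = (k/n) ln n + ln(RSS/n)."""
    return (k / n) * math.log(n) + math.log(rss / n)


# Hypothetical comparison: same dependent variable, n = 50, TSS = 200.
n, tss = 50, 200.0
model_1 = {"k": 3, "rss": 100.0}
model_2 = {"k": 4, "rss": 98.0}   # one more parameter, slightly lower RSS

for name, m in (("model 1", model_1), ("model 2", model_2)):
    print(name,
          adjusted_r2(m["rss"], tss, n, m["k"]),
          ln_aic(m["rss"], n, m["k"]),
          ln_sic(m["rss"], n, m["k"]))
```

In this made-up case the small drop in RSS does not pay for the extra parameter: model 1 has the higher adjusted $R^2$ and the lower ln AIC and ln SIC.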

Model Discrimination: Mallows' $C_p$ Criterion

The starting point is a model with k regressors including the intercept. Suppose p of these are selected ($p \le k$) and let $RSS_p$ denote the residual sum of squares in this model. Then define

$$C_p = \frac{RSS_p}{\hat\sigma^2} - (n - 2p)$$

If the model with p regressors is adequate, it can be shown that $E(RSS_p) = (n - p)\sigma^2$. We already know that $E(\hat\sigma^2) = \sigma^2$. Therefore, approximately,

$$E(C_p) \approx \frac{(n - p)\sigma^2}{\sigma^2} - (n - 2p) = p$$

The goal is to find a model with a low $C_p$ value, about equal to p. When comparing models, the model with the lowest $C_p$ is preferred.
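
A short numerical sketch of the criterion follows. It assumes, as is common, that $\hat\sigma^2$ is taken from the full model with k regressors, $\hat\sigma^2 = RSS_k/(n - k)$; the slides do not say which estimate is used. The RSS values and the function name are hypothetical.

```python
def mallows_cp(rss_p, sigma2_hat, n, p):
    """Mallows' C_p = RSS_p / sigma2_hat - (n - 2p)."""
    return rss_p / sigma2_hat - (n - 2 * p)


# Hypothetical numbers: full model with k = 5 regressors, n = 50 observations.
n, k = 50, 5
rss_full = 90.0
sigma2_hat = rss_full / (n - k)   # assumed estimate of sigma^2 from the full model

cp_sub = mallows_cp(94.0, sigma2_hat, n, p=3)       # candidate submodel, p = 3
cp_full = mallows_cp(rss_full, sigma2_hat, n, p=k)  # full model
print(cp_sub, cp_full)
```

The submodel gives $C_p$ equal to p = 3, which is what an adequate model is expected to show on average. With this choice of $\hat\sigma^2$ the full model gives $C_p = k$ by construction.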