Model Mis-specification
Carlo Favero
Model Mis-specification

Each specification can be interpreted as the result of a reduction process. What happens if the reduction process that has generated $E(y \mid X)$ omits some relevant information? We shall consider two general cases of mis-specification:

- mis-specification related to the choice of the variables included in the regression;
- mis-specification related to wrong assumptions on the properties of the error terms.
Choice of variables

- under-parameterization: the estimated model omits variables included in the DGP;
- over-parameterization: the estimated model includes more variables than the DGP.
Under-parameterization

Given the DGP:
$$y = X_1\beta_1 + X_2\beta_2 + \epsilon, \quad (1)$$
for which the standard OLS hypotheses hold, the following model is estimated:
$$y = X_1\beta_1 + \nu. \quad (2)$$
The OLS estimates are given by the following expression:
$$\hat{\beta}_1^{up} = \left(X_1'X_1\right)^{-1}X_1'y. \quad (3)$$
Under-parameterization

while the OLS estimates obtained by estimating the DGP are:
$$\hat{\beta}_1 = \left(X_1'M_2X_1\right)^{-1}X_1'M_2y, \quad (4)$$
where $M_2 = I - X_2\left(X_2'X_2\right)^{-1}X_2'$. The estimates in (4) are best linear unbiased (BLUE) by construction, while the estimates in (3) are biased unless $X_1$ and $X_2$ are uncorrelated. To show this, consider:
$$\hat{\beta}_1 = \left(X_1'X_1\right)^{-1}\left(X_1'y - X_1'X_2\hat{\beta}_2\right) \quad (5)$$
$$= \hat{\beta}_1^{up} - \hat{D}\hat{\beta}_2, \quad (6)$$
where $\hat{D} = \left(X_1'X_1\right)^{-1}X_1'X_2$ collects the coefficients in the regression of $X_2$ on $X_1$ and $\hat{\beta}_2$ is the OLS estimator obtained by fitting the DGP.
Illustration

Given the DGP:
$$y = 0.5\,X_1 + 0.5\,X_2 + \epsilon_1, \quad (7)$$
$$X_2 = 0.8\,X_1 + \epsilon_2, \quad (8)$$
the following model is estimated by OLS:
$$y = X_1\beta_1 + \nu. \quad (9)$$

(a) The OLS estimate of $\beta_1$ will be 0.5
(b) The OLS estimate of $\beta_1$ will be 0
(c) The OLS estimate of $\beta_1$ will be 0.9
(d) The OLS estimate of $\beta_1$ will have a mean of 0.9
Under-parameterization

Note that if $E(y \mid X_1, X_2) = X_1\beta_1 + X_2\beta_2$ and $E(X_2 \mid X_1) = X_1D$, then
$$E(y \mid X_1) = X_1\beta_1 + X_1D\beta_2 = X_1\alpha.$$
Therefore the OLS estimator in the under-parameterized model is a biased estimator of $\beta_1$, but an unbiased estimator of $\alpha$. Hence, if the objective of the model is forecasting and $X_1$ is more easily observed than $X_2$, the under-parameterized model can be safely used. On the other hand, if the objective of the model is to test specific predictions on the parameters, the use of the under-parameterized model delivers biased results.
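A minimal Monte Carlo sketch of this result, assuming the DGP of the illustration above with standard normal errors (the error distributions are not specified in the slides): the under-parameterized estimate centres on $\alpha = \beta_1 + D\beta_2 = 0.5 + 0.8 \times 0.5 = 0.9$, i.e. answer (d).

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_sim = 200, 2000
estimates = np.empty(n_sim)

for s in range(n_sim):
    x1 = rng.normal(size=T)
    x2 = 0.8 * x1 + rng.normal(size=T)            # so D = 0.8 by construction
    y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=T)  # the DGP of the illustration
    estimates[s] = (x1 @ y) / (x1 @ x1)           # OLS of y on x1 alone

print(estimates.mean())  # approximately 0.9 = beta1 + D * beta2
```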
Over-parameterization

Given the DGP:
$$y = X_1\beta_1 + \epsilon, \quad (10)$$
for which the standard OLS hypotheses hold, the following model is estimated:
$$y = X_1\beta_1 + X_2\beta_2 + v. \quad (11)$$
The OLS estimator of the over-parameterized model is
$$\hat{\beta}_1^{op} = \left(X_1'M_2X_1\right)^{-1}X_1'M_2y, \quad (12)$$
Over-parameterization

while, by estimating the DGP, we obtain:
$$\hat{\beta}_1 = \left(X_1'X_1\right)^{-1}X_1'y. \quad (13)$$
By substituting $y$ from the DGP, one finds that both estimators are unbiased, so the difference now lies in the variances. In fact we have:
$$\text{var}\left(\hat{\beta}_1^{op} \mid X_1, X_2\right) = \sigma^2\left(X_1'M_2X_1\right)^{-1}, \quad (14)$$
$$\text{var}\left(\hat{\beta}_1 \mid X_1, X_2\right) = \sigma^2\left(X_1'X_1\right)^{-1}. \quad (15)$$
Over-parameterization

Remember that if two matrices A and B are positive definite and $A - B$ is positive semidefinite, then the matrix $B^{-1} - A^{-1}$ is also positive semidefinite. We therefore have to show that $X_1'X_1 - X_1'M_2X_1$ is a positive semidefinite matrix. The result is almost immediate:
$$X_1'X_1 - X_1'M_2X_1 = X_1'\left(I - M_2\right)X_1 = X_1'Q_2X_1 = X_1'Q_2'Q_2X_1,$$
where $Q_2 = X_2\left(X_2'X_2\right)^{-1}X_2'$ is symmetric and idempotent. We conclude that over-parameterization affects the efficiency of the estimators and hence the power of hypothesis tests.
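A short simulation sketch of this efficiency loss, under assumed normal regressors and errors (hypothetical choices, not from the slides): both estimators of $\beta_1$ are centred on the true value, but the over-parameterized one has the larger variance when $X_1$ and $X_2$ are correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_sim = 100, 5000
b_dgp, b_over = np.empty(n_sim), np.empty(n_sim)

for s in range(n_sim):
    x1 = rng.normal(size=T)
    x2 = 0.8 * x1 + rng.normal(size=T)            # X2 correlated with X1
    y = 0.5 * x1 + rng.normal(size=T)             # DGP: beta2 = 0
    b_dgp[s] = (x1 @ y) / (x1 @ x1)               # estimate the DGP itself
    X = np.column_stack([x1, x2])                 # over-parameterized model
    b_over[s] = np.linalg.solve(X.T @ X, X.T @ y)[0]

print(b_dgp.mean(), b_over.mean())  # both close to 0.5: unbiased
print(b_dgp.var(), b_over.var())    # larger variance for the over-parameterized model
```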
Heteroscedasticity, Autocorrelation, and the GLS estimator

Let us reconsider the single-equation model and generalize it to the case in which the hypotheses of diagonality and constancy of the conditional variance-covariance matrix of the residuals do not hold:
$$y = X\beta + \epsilon, \quad (16)$$
$$\epsilon \sim \text{n.d.}\left(0, \sigma^2\Omega\right),$$
where $\Omega$ is a $(T \times T)$ symmetric and positive definite matrix. When the OLS method is applied to model (16), it delivers estimators which are consistent but not efficient; moreover, the traditional formula for the variance-covariance matrix of the OLS estimator, $\sigma^2\left(X'X\right)^{-1}$, is wrong and leads to incorrect inference.
Heteroscedasticity, Autocorrelation, and the GLS estimator

Using standard algebra, it can be shown that the correct formula for the variance-covariance matrix of the OLS estimator is:
$$\sigma^2\left(X'X\right)^{-1}X'\Omega X\left(X'X\right)^{-1}.$$
To find a general solution to this problem, remember that the inverse of a symmetric positive definite matrix is also symmetric and positive definite, and that for a given symmetric positive definite matrix $\Omega$ there always exists a $(T \times T)$ non-singular matrix K such that $K'K = \Omega^{-1}$ and $K\Omega K' = I_T$.
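As a concrete sketch, one such K is the inverse of the Cholesky factor of $\Omega$: if $\Omega = LL'$, then $K = L^{-1}$ gives $K'K = \Omega^{-1}$ and $K\Omega K' = I_T$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 5))
Omega = A @ A.T + 5 * np.eye(5)     # an arbitrary symmetric positive definite matrix

L = np.linalg.cholesky(Omega)       # Omega = L L'
K = np.linalg.inv(L)                # the transformation matrix

print(np.allclose(K.T @ K, np.linalg.inv(Omega)))  # K'K = Omega^{-1}
print(np.allclose(K @ Omega @ K.T, np.eye(5)))     # K Omega K' = I_T
```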
Heteroscedasticity, Autocorrelation, and the GLS estimator

Consider the regression model obtained by pre-multiplying both sides of (16) by K:
$$Ky = KX\beta + K\epsilon, \quad (17)$$
$$K\epsilon \sim \text{n.d.}\left(0, \sigma^2 I_T\right).$$
The OLS estimator of the parameters of the transformed model (17) satisfies all the conditions for the application of the Gauss-Markov theorem; therefore, the estimator
$$\hat{\beta}_{GLS} = \left(X'K'KX\right)^{-1}X'K'Ky = \left(X'\Omega^{-1}X\right)^{-1}X'\Omega^{-1}y,$$
known as the generalised least squares (GLS) estimator, is BLUE. The variance of the GLS estimator, conditional upon X, is
$$\text{Var}\left(\hat{\beta}_{GLS} \mid X\right) = \Sigma_{\hat{\beta}} = \sigma^2\left(X'\Omega^{-1}X\right)^{-1}.$$
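A minimal sketch of the GLS estimator, assuming $\Omega$ is known: the closed-form expression and OLS on the K-transformed model return the same vector.

```python
import numpy as np

def gls(y, X, Omega):
    """GLS estimator: (X' Omega^-1 X)^-1 X' Omega^-1 y."""
    Oinv = np.linalg.inv(Omega)
    return np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)

def gls_via_K(y, X, Omega):
    """OLS on the transformed model Ky = KX beta + K eps, with K from the Cholesky factor."""
    K = np.linalg.inv(np.linalg.cholesky(Omega))
    Ky, KX = K @ y, K @ X
    return np.linalg.solve(KX.T @ KX, KX.T @ Ky)
```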
Heteroscedasticity, Autocorrelation, and the GLS estimator

Consider, for example, the variances of the OLS and the GLS estimators. Using the fact that if A and B are positive definite and $A - B$ is positive semidefinite, then $B^{-1} - A^{-1}$ is also positive semidefinite, we have:
$$X'\Omega^{-1}X - \left(X'X\right)\left(X'\Omega X\right)^{-1}\left(X'X\right)$$
$$= X'K'KX - \left(X'X\right)\left(X'K^{-1}\left(K'\right)^{-1}X\right)^{-1}\left(X'X\right)$$
$$= X'K'\left(I - \left(K'\right)^{-1}X\left(X'K^{-1}\left(K'\right)^{-1}X\right)^{-1}X'K^{-1}\right)KX$$
$$= X'K'M_W M_W KX,$$
where
$$M_W = I - W\left(W'W\right)^{-1}W', \quad (18)$$
$$W = \left(K'\right)^{-1}X. \quad (19)$$
Heteroscedasticity, Autocorrelation, and the GLS estimator

The applicability of the GLS estimator requires an empirical specification for the matrix K. We consider here three specific applications where the appropriate choice of this matrix solves the problems of the OLS estimator generated, respectively, by:
- the presence of first-order serial correlation in the residuals;
- the presence of heteroscedasticity in the residuals;
- the presence of both.
Correction for Serial Correlation (Cochrane-Orcutt)

Consider first the case of first-order serial correlation in the residuals. We have the following model:
$$y_t = x_t'\beta + u_t,$$
$$u_t = \rho u_{t-1} + \epsilon_t, \quad \epsilon_t \sim \text{n.i.d.}\left(0, \sigma^2_\epsilon\right),$$
which, using our general notation, can be re-written as:
$$y = X\beta + \epsilon, \quad (20)$$
$$\epsilon \sim \text{n.d.}\left(0, \sigma^2\Omega\right), \quad \sigma^2 = \frac{\sigma^2_\epsilon}{1 - \rho^2}, \quad (21)$$
Correction for Serial Correlation (Cochrane-Orcutt)

$$\Omega = \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \vdots \\ \vdots & & & \ddots & \rho \\ \rho^{T-1} & \rho^{T-2} & \cdots & \rho & 1 \end{pmatrix}$$

In this case, knowledge of the parameter ρ allows the empirical implementation of the GLS estimator. An intuitive procedure to implement it is the following:
1. estimate the vector β by OLS and save the vector of residuals $\hat{u}_t$;
2. regress $\hat{u}_t$ on $\hat{u}_{t-1}$ to obtain an estimate $\hat{\rho}$ of ρ;
3. construct the transformed model and regress $\left(y_t - \hat{\rho}y_{t-1}\right)$ on $\left(x_t - \hat{\rho}x_{t-1}\right)$ to obtain the GLS estimator of the vector of parameters of interest.
The above procedure, known as the Cochrane-Orcutt procedure, can be repeated until convergence; a sketch is given below.
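A minimal sketch of the iterated Cochrane-Orcutt procedure, following the three steps above (the first observation is dropped in the quasi-differenced regression, one common convention):

```python
import numpy as np

def cochrane_orcutt(y, X, tol=1e-8, max_iter=100):
    """Iterated Cochrane-Orcutt for a linear model with AR(1) residuals."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)            # step 1: OLS
    rho = 0.0
    for _ in range(max_iter):
        u = y - X @ beta                                # residuals
        rho_new = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])  # step 2: regress u_t on u_{t-1}
        y_star = y[1:] - rho_new * y[:-1]               # step 3: quasi-differencing
        X_star = X[1:] - rho_new * X[:-1]
        beta = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)
        if abs(rho_new - rho) < tol:                    # repeat until convergence
            break
        rho = rho_new
    return beta, rho
```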
Correction for Heteroscedasticity (White)

In the case of heteroscedasticity, our general model becomes
$$y = X\beta + \epsilon, \quad \epsilon \sim \text{n.d.}\left(0, \Omega\right),$$
$$\Omega = \begin{pmatrix} \sigma^2_1 & 0 & \cdots & 0 \\ 0 & \sigma^2_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \sigma^2_T \end{pmatrix}.$$
In this case, to construct the GLS estimator we need to model the heteroscedasticity by choosing the K matrix appropriately.
Correction for Heteroscedasticity (White)

White (1980) proposes a specification based on the consideration that, in the case of heteroscedasticity, the variance-covariance matrix of the OLS estimator takes the form:
$$\left(X'X\right)^{-1}X'\Omega X\left(X'X\right)^{-1},$$
which can be used for inference once an estimator of $\Omega$ is available. The following estimator is proposed:
$$\hat{\Omega} = \begin{pmatrix} \hat{u}^2_1 & 0 & \cdots & 0 \\ 0 & \hat{u}^2_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \hat{u}^2_T \end{pmatrix}.$$
Correction for Heteroscedasticity (White)

This choice of $\hat{\Omega}$ leads to the following degrees-of-freedom-corrected heteroscedasticity-consistent estimator of the parameters' covariance matrix:
$$\hat{\Sigma}^W_{\hat{\beta}} = \frac{T}{T-k}\left(X'X\right)^{-1}\left(\sum_{t=1}^{T}\hat{u}^2_t x_t x_t'\right)\left(X'X\right)^{-1}.$$
This estimator corrects the OLS covariance matrix for the presence of heteroscedasticity in the residuals without modelling it.
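A sketch of the White covariance estimator, following the formula above with the $T/(T-k)$ correction:

```python
import numpy as np

def white_cov(y, X):
    """Degrees-of-freedom-corrected White covariance matrix for OLS."""
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    u = y - X @ (XtX_inv @ X.T @ y)               # OLS residuals
    meat = (X * (u ** 2)[:, None]).T @ X          # sum_t u_t^2 x_t x_t'
    return T / (T - k) * XtX_inv @ meat @ XtX_inv
```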
Correction for heteroscedasticity and serial correlation (Newey-West)

The White covariance matrix assumes serially uncorrelated residuals. Newey and West (1987) have proposed a more general covariance estimator that is robust to heteroscedasticity and autocorrelation of the residuals of unknown form. This HAC (heteroscedasticity and autocorrelation consistent) coefficient covariance estimator is given by:
$$\hat{\Sigma}^{NW}_{\hat{\beta}} = \left(X'X\right)^{-1}T\hat{\Omega}\left(X'X\right)^{-1},$$
where $\hat{\Omega}$ is a long-run covariance estimator.
Correction for heteroscedasticity and serial correlation (Newey-West)

$$\hat{\Omega} = \hat{\Gamma}(0) + \sum_{j=1}^{p}\left[1 - \frac{j}{p+1}\right]\left[\hat{\Gamma}(j) + \hat{\Gamma}(j)'\right], \quad (22)$$
$$\hat{\Gamma}(j) = \frac{1}{T}\sum_{t=j+1}^{T}\hat{u}_t\hat{u}_{t-j}x_t x_{t-j}'.$$
Note that in the absence of serial correlation $\hat{\Omega} = \hat{\Gamma}(0)$ and we are back to the White estimator. Implementation of the estimator requires a choice of p, the maximum lag at which correlation is still present. The weighting scheme adopted guarantees a positive definite estimated covariance matrix, since the sequence of the $\hat{\Gamma}(j)$'s is multiplied by a sequence of weights that decreases as j increases.
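A sketch of the Newey-West estimator with the Bartlett weights $1 - j/(p+1)$ used above:

```python
import numpy as np

def newey_west_cov(y, X, p):
    """HAC covariance matrix for OLS with truncation lag p."""
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    u = y - X @ (XtX_inv @ X.T @ y)               # OLS residuals
    Xu = X * u[:, None]                           # row t holds u_t x_t'
    Omega = (Xu.T @ Xu) / T                       # Gamma(0)
    for j in range(1, p + 1):
        Gamma_j = (Xu[j:].T @ Xu[:-j]) / T        # Gamma(j)
        Omega += (1 - j / (p + 1)) * (Gamma_j + Gamma_j.T)
    return T * XtX_inv @ Omega @ XtX_inv
```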
Testing the CAPM

- The time-series performance of the CAPM is questioned by the significance of the alpha coefficients (consider, for example, the case of the 25 Fama-French portfolios).
- We also know that other factors besides the market are significant in explaining excess returns.
- Fama and MacBeth propose to exploit cross-sectional information to produce a different test of the CAPM.
Fama-MacBeth (1973)

Consider the 25 portfolios and run for each of them the CAPM regression over the sample 1962:1-2014:6. These regressions deliver 25 betas:
$$r^i_t - r^f_t = \alpha_i + \beta_i\left(r^m_t - r^f_t\right) + u^i_t.$$
Now run a second-step regression in which the cross-section of the average (over the sample 1962:1-2014:6) monthly returns on the 25 portfolios is projected on the 25 betas:
$$\bar{r}^i = \gamma_0 + \gamma_1\hat{\beta}_i + u_i.$$
Under the null of the CAPM: i) the residuals should be randomly distributed around the regression line; ii) $\gamma_0 = E\left(r^f\right)$, $\gamma_1 = E\left(r^m - r^f\right)$.
Fama-MacBeth (1973)

[Figure: cross-section of average portfolio returns against the estimated betas]
Fama-MacBeth (1973)

The cross-sectional regression strongly rejects the CAPM. Note, however, that this regression is affected by an inference problem caused by the correlation of residuals in the cross-section. Fama and MacBeth (1973) address this problem by estimating month-by-month cross-section regressions of monthly returns on the betas obtained on the full sample. The time-series means of the monthly slopes and intercepts, along with the standard errors of those means, are then used to run the appropriate tests; a sketch of the two-pass procedure follows.
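A schematic sketch of the two-pass procedure. The input names (`ret`, a T × N matrix of portfolio returns; `rf` and `mkt`, length-T series of the risk-free and market returns) are hypothetical placeholders.

```python
import numpy as np

def fama_macbeth(ret, rf, mkt):
    """Two-pass Fama-MacBeth: full-sample betas, then monthly cross-sections."""
    T, N = ret.shape
    ex_ret = ret - rf[:, None]                     # portfolio excess returns
    Z = np.column_stack([np.ones(T), mkt - rf])
    # first pass: one time-series CAPM regression per portfolio
    betas = np.linalg.lstsq(Z, ex_ret, rcond=None)[0][1]        # length-N vector
    # second pass: month-by-month cross-section regressions on the betas
    C = np.column_stack([np.ones(N), betas])
    gammas = np.array([np.linalg.lstsq(C, ret[t], rcond=None)[0]
                       for t in range(T)])                      # T x 2: (gamma0, gamma1)
    means = gammas.mean(axis=0)                    # time-series means of the estimates
    t_stats = means / (gammas.std(axis=0, ddof=1) / np.sqrt(T))
    return means, t_stats
```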
Fama-MacBeth (1973)

The application of the Fama-MacBeth procedure on the sample 1962:1-2014:6 delivers the following results:

           γ0       γ1
Mean       2.07     -0.807
St.Dev     9.56     10.06
Obs        630      630
t-stat     5.437    -2.01

An alternative route would be to construct heteroscedasticity and autocorrelation consistent (HAC) estimators.
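The t-statistics in the table are the ratio of the time-series mean of the monthly estimates to its standard error; for instance, for $\gamma_1$:
$$t_{\hat{\gamma}_1} = \frac{\bar{\hat{\gamma}}_1}{s_{\hat{\gamma}_1}/\sqrt{T}} = \frac{-0.807}{10.06/\sqrt{630}} \approx -2.01.$$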
A Multifactor model

The testing procedure can easily be extended to multifactor models, in which the first-stage regression in general can be written as
$$r^i_t - r^f_t = \alpha_i + \beta_{1,i}\left(r^m_t - r^f_t\right) + \sum_{j=2}^{n}\beta_{j,i}F_{j,t} + u^i_t,$$
and the second-stage regression as:
$$\bar{r}^i = \gamma_0 + \gamma_1\hat{\beta}_{1,i} + \sum_{j=2}^{n}\gamma_j\hat{\beta}_{j,i} + u_i.$$
Under the null of the model: i) the residuals should be randomly distributed around the regression line; ii) $\gamma_0 = E\left(r^f\right)$, $\gamma_1 = E\left(r^m - r^f\right)$, $\gamma_j = E\left(F_j\right)$.