Matematické Metody v Ekonometrii
7. Multicollinearity
Blanka Šedivá, KMA
winter semester 2016/2017
One of the assumptions of the classical and normal regression models is that the columns of X are linearly independent (i.e. X is a matrix with full rank). So-called multicollinearity is a strong linear dependency among the columns of X: the matrix $X^T X$ is then almost singular, and consequently it is problematic to compute its inverse. Multicollinearity can be caused by adding polynomial terms or other regressors derived from already existing regressors. Other causes might be including too many variables in the model, when some of them measure the same conceptual variable, or a flawed data collection procedure.
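To see the numerical effect, here is a minimal sketch (NumPy assumed; the simulated data are purely illustrative) of how a nearly duplicated column drives up the condition number of $X^T X$, which is exactly what makes its inverse problematic:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)      # almost an exact copy of x1
X = np.column_stack([np.ones(n), x1, x2])

# a huge condition number signals that (X^T X)^{-1} is numerically unstable
print(np.linalg.cond(X.T @ X))
```

Loosely speaking, a condition number of order $10^{8}$ means that roughly half of the available double-precision digits are lost when the matrix is inverted.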
Gauss-Markov theorem for normally distributed disturbances

Model $Y \sim N(X\beta, \sigma^2 I)$:
(i) $b \sim N\left(\beta,\ \sigma^2 (X^T X)^{-1}\right)$;
(ii) $\dfrac{\mathrm{RSE}}{\sigma^2} = \dfrac{s^2 (n-p)}{\sigma^2} \sim \chi^2$ with $\nu = n-p$ degrees of freedom (df);
(iii) $b$ and RSE are independent;
(iv) $E(a^T b) = a^T \beta$, where $a = (a_0, a_1, \ldots, a_k)^T \neq 0$;
(v) $\mathrm{Var}(a^T b) = \sigma^2\, a^T (X^T X)^{-1} a$;
(vi) $T = \dfrac{a^T b - a^T \beta}{\sqrt{s^2\, a^T (X^T X)^{-1} a}}$ has a $t$-distribution with $\nu = n-p$ df.
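A short sketch of how point (vi) is applied in practice (NumPy and SciPy assumed; the data are simulated and the contrast vector $a$ is a hypothetical choice):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                       # OLS estimate
s2 = (y - X @ b) @ (y - X @ b) / (n - p)    # residual variance

a = np.array([0.0, 1.0, -1.0])              # contrast between two coefficients
T = (a @ b - a @ beta) / np.sqrt(s2 * a @ XtX_inv @ a)
print(T, 2 * stats.t.sf(abs(T), df=n - p))  # statistic and two-sided p-value
```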
The mean squared error (MSE) or mean squared deviation (MSD) of an estimator

The MSE of an estimator $\hat\theta$ of an unknown parameter $\theta$ is defined as
$$\mathrm{MSE}(\hat\theta) = E\left(\hat\theta - \theta\right)^2 = \mathrm{Var}\,\hat\theta + \left(E\,\hat\theta - \theta\right)^2 = \mathrm{Var}\,\hat\theta + \mathrm{Bias}^2(\hat\theta, \theta).$$
For the model $Y \sim (X\beta, \sigma^2 I)$ we have
$$E\left(\hat Y^T \hat Y\right) = \beta^T X^T X \beta + \sigma^2\, \mathrm{rank}(X),$$
and if X has linearly independent columns,
$$E\left(b^T b\right) = \beta^T \beta + \sigma^2\, \mathrm{tr}\left((X^T X)^{-1}\right).$$
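The second identity can be checked by simulation; a minimal sketch (NumPy assumed; the design, coefficients, and replication count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))               # fixed design with independent columns
beta = np.array([1.0, -2.0, 0.5])
sigma = 1.5

XtX_inv = np.linalg.inv(X.T @ X)
theory = beta @ beta + sigma**2 * np.trace(XtX_inv)

reps = 20000
vals = np.empty(reps)
for r in range(reps):
    y = X @ beta + sigma * rng.normal(size=n)
    b = XtX_inv @ X.T @ y                 # OLS estimate
    vals[r] = b @ b

print(theory, vals.mean())                # the two numbers should be close
```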
Consequences of multicollinearity

- Small changes in X result in large changes in the estimates of $\beta$, i.e. the OLS procedure is ill-conditioned (see the sketch below).
- The standard errors of the estimated coefficients tend to be large, and therefore the coefficients often appear statistically insignificant despite a high value of $R^2$ and high significance of the whole model.
- The estimated coefficients can have the wrong sign or unexpected values that do not correspond with the economic interpretation of the model.

Note: An important sign of multicollinearity is also the fact that increasing the number of observations neither reduces the standard errors of the estimates nor helps to eliminate the other problems mentioned above.
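A sketch of the ill-conditioning in the first point (NumPy assumed, data simulated): with two almost collinear regressors, a tiny perturbation of one column visibly moves the fitted coefficients:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)        # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]

Xp = X.copy()
Xp[:, 2] += 0.005 * rng.normal(size=n)     # tiny perturbation of one column
bp = np.linalg.lstsq(Xp, y, rcond=None)[0]

print(b)                                   # the coefficients on x1 and x2
print(bp)                                  # change drastically between fits
```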
Detection of multicollinearity - pairwise correlation coefficients

The basic approach is based on pairwise correlation coefficients between columns of X. Compute $\mu_{ij} = \mathrm{cor}(X_i, X_j)$, $i, j = 1, 2, \ldots, k$, for all pairs of columns. We can use several rules of thumb to test for multicollinearity. We say that multicollinearity is present in our model if:

- there exists $|\mu_{ij}| > 0.75$ (some literature suggests 0.8 or even 0.9);
- there exists $|\mu_{ij}| > R^2$, where $R^2$ is the coefficient of determination of the regression model.

This method is not very effective when the dependency is generated by three or more columns.
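A minimal helper implementing the first rule of thumb (NumPy assumed; the 0.75 threshold follows the slide):

```python
import numpy as np

def flag_high_correlations(X, threshold=0.75):
    """Return (i, j, r) for column pairs whose |correlation| exceeds the threshold.

    X: n x k matrix of regressors (without the intercept column).
    """
    R = np.corrcoef(X, rowvar=False)
    k = R.shape[0]
    return [(i, j, R[i, j])
            for i in range(k) for j in range(i + 1, k)
            if abs(R[i, j]) > threshold]
```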
Detection of multicollinearity - auxiliary regressions

Perform the auxiliary regression of the column $X_j$ on the remaining columns:
$$X_j = X_1\alpha_1 + \ldots + X_{j-1}\alpha_{j-1} + X_{j+1}\alpha_{j+1} + \ldots + X_k\alpha_k + \varepsilon_j.$$
$R_j^2$ is the coefficient of determination of the corresponding auxiliary regression. Again, we can use several rules of thumb to test for multicollinearity. We say that multicollinearity is present in our model if:

- there exists $R_j^2 > R^2$;
- there exists $\mathrm{VIF}_j = \frac{1}{1 - R_j^2} > 10$. We call $\mathrm{VIF}_j$ the variance inflation factor (of the regressor $j$); it quantifies the severity of the multicollinearity's influence on the standard error of the coefficient $b_j$. The $\mathrm{VIF}_j$ are the diagonal elements of the inverse of the correlation matrix: $\mathrm{diag}\left(\mathrm{cor}(X)^{-1}\right) = \mathrm{VIF}$;
- the test statistic $F_j = \frac{R_j^2}{1 - R_j^2} \cdot \frac{n - p}{p - 1} \sim F_{\nu_1, \nu_2}$ exceeds its critical value.
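The identity $\mathrm{diag}(\mathrm{cor}(X)^{-1}) = \mathrm{VIF}$ makes the computation a one-liner (NumPy assumed):

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# flag regressors with VIF_j > 10 as affected by multicollinearity, e.g.:
# print(np.where(vif(X) > 10)[0])
```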
Solutions to multicollinearity

We have several options for what to do when we detect multicollinearity:
(i) Normalise or transform the columns of X so that the multicollinearity is eliminated.
(ii) Select a submodel such that the regressors which cause multicollinearity are omitted.
(iii) Use transformed regressors which are linear combinations of the original regressors (the so-called principal component regression, PCR); a minimal sketch follows below.
(iv) Use the so-called ridge regression.
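For option (iii), a minimal principal component regression sketch (NumPy assumed; the centring, the number of components, and the helper name are illustrative choices, not a fixed recipe):

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """PCR sketch: regress y on the first n_components principal components
    of the centred X, then map the coefficients back to the regressor space."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T                         # component scores
    gamma = np.linalg.lstsq(Z, y - y.mean(), rcond=None)[0]
    return Vt[:n_components].T @ gamma                   # slopes for centred data
```

Dropping the trailing components removes exactly the directions in which X is nearly singular, which is what stabilises the estimate.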
Ridge regression

Consider a linear regression model $Y \sim (X\beta, \sigma^2 I)$ with diagnosed multicollinearity. Because multicollinearity causes the matrix $X^T X$ to be ill-conditioned, we use the ridge regression estimator
$$b_\delta = \left(X^T X + \delta I\right)^{-1} X^T Y, \qquad \delta \geq 0.$$
The relation between $b = \left(X^T X\right)^{-1} X^T Y$ and $b_\delta$ can be expressed as
$$b_\delta = \left(X^T X + \delta I\right)^{-1} X^T X \left(X^T X\right)^{-1} X^T y = \left(X^T X + \delta I\right)^{-1} X^T X\, b = \left[I + \delta \left(X^T X\right)^{-1}\right]^{-1} b.$$
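The estimator itself is one line (NumPy assumed; solving the linear system is preferred over forming the inverse explicitly):

```python
import numpy as np

def ridge(X, y, delta):
    """Ridge estimator b_delta = (X'X + delta*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + delta * np.eye(p), X.T @ y)
```

For $\delta = 0$ this reduces to the ordinary OLS estimator $b$.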
Ridge regression - statistical properties of $b_\delta$

Obviously, the ridge regression estimator $b_\delta$ is biased, which is not a desired property. However, it can be shown that under certain conditions the ridge regression estimator is somewhat favourable: for
$$0 < \delta < \frac{2\sigma^2}{\|\beta\|^2}$$
it holds that $\mathrm{MSE}(b_\delta) \leq \mathrm{MSE}(b)$.

In practice, however, we do not know the real values of $\beta$ and $\sigma^2$, so we are not able to determine the value of $\delta$ directly. Hence, the usual choice is
$$\delta_1 = \frac{k\, s^2}{b^T b} = \frac{k\, s^2}{\sum_{j=1}^{k} b_j^2},$$
or we can evaluate $b_\delta$ over values $\delta \in (0, \delta_{\max})$ and plot them to create the so-called ridge trace. The desired value of $\delta$ is where $b_\delta$ stabilises.
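A ridge-trace sketch reusing the ridge() helper above (Matplotlib assumed; X, y and the delta grid are placeholders for the data and $\delta_{\max}$ at hand):

```python
import numpy as np
import matplotlib.pyplot as plt

# X, y: design matrix and response from the model at hand
deltas = np.linspace(0.0, 5.0, 200)              # illustrative grid (0, delta_max)
traces = np.array([ridge(X, y, d) for d in deltas])

plt.plot(deltas, traces)                          # one curve per coefficient
plt.xlabel("delta")
plt.ylabel("ridge coefficients")
plt.title("Ridge trace: pick delta where the coefficients stabilise")
plt.show()
```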
Choice of the submodel

Overspecification and underspecification of the model:

- Including irrelevant variables in a regression model (overspecification): the true model is $y = X\beta + \varepsilon$, $\mathrm{rank}(X) = k$, and the falsely selected model is $y = X\beta + X_2\beta_2 + \varepsilon_1$, $\mathrm{rank}([X\ X_2]) = k^* > k$. The estimates $b$ are unbiased, but the variances of these OLS estimators are higher; there is also a risk of multicollinearity.

- Misspecification (underspecification) of the model: the true model is $y = X\beta + \varepsilon$, $\mathrm{rank}(X) = k$, with $y = X_1\beta_1 + X_2\beta_2 + \varepsilon$, and the falsely selected model is $y = X_1\beta_1 + \varepsilon_1$, $\mathrm{rank}(X_1) = k_1 < k$. The estimates $b_1$ are biased:
$$E(b_1) = \beta_1 + \left(X_1^T X_1\right)^{-1} X_1^T X_2\, \beta_2.$$
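The omitted-variable bias formula can be verified by simulation; a minimal sketch (NumPy assumed; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X1 = rng.normal(size=(n, 1))
X2 = 0.6 * X1 + rng.normal(scale=0.8, size=(n, 1))   # correlated with X1
beta1, beta2 = 2.0, 1.5

b1_draws = []
for _ in range(5000):
    y = X1[:, 0] * beta1 + X2[:, 0] * beta2 + rng.normal(size=n)
    b1_draws.append(np.linalg.lstsq(X1, y, rcond=None)[0][0])  # X2 omitted

# theoretical bias term (X1'X1)^{-1} X1'X2 * beta2
bias = (np.linalg.inv(X1.T @ X1) @ X1.T @ X2 * beta2).item()
print(np.mean(b1_draws), beta1 + bias)               # should roughly agree
```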
Model and submodel

Consider the relationship between the model $Y \sim N(X\beta, \sigma^2 I)$, $\mathrm{rank}(X) = k$, and a submodel with $\mathrm{rank}(X_0) = k_0$. Denoting the parameters estimated from the submodel by $b_0$, $s_0^2$, $e_0 = \hat y_0 - y$ and $\mathrm{RSE}_0 = e_0^T e_0$, it holds that
$$F_0 = \frac{\left(\mathrm{RSE}_0 - \mathrm{RSE}\right) / \left(k - k_0\right)}{\mathrm{RSE} / \left(n - k\right)} \sim F_{\nu_1, \nu_2},$$
where $\nu_1 = k - k_0$ and $\nu_2 = n - k$.
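A direct implementation of this submodel F-test (NumPy and SciPy assumed; X0 is expected to consist of columns of the full design X):

```python
import numpy as np
from scipy import stats

def submodel_f_test(X, X0, y):
    """F-test of a submodel (design X0) against the full model (design X)."""
    n, k = X.shape
    k0 = X0.shape[1]
    rse = lambda A: np.sum((y - A @ np.linalg.lstsq(A, y, rcond=None)[0]) ** 2)
    RSE, RSE0 = rse(X), rse(X0)
    F0 = ((RSE0 - RSE) / (k - k0)) / (RSE / (n - k))
    pval = stats.f.sf(F0, k - k0, n - k)     # reject the submodel if small
    return F0, pval
```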
Choice of the model - criteria

There are many different criteria derived for the choice of the model:

- Minimization of the residual sum of squares $\mathrm{RSE} = e^T e$.
- Maximization of the coefficient of determination $R^2 = 1 - \frac{e^T e}{y^T y}$.
- Maximization of the adjusted coefficient of determination $R^2_{\mathrm{adj}} = 1 - \frac{(e^T e)/(n-k)}{(y^T y)/(n-1)}$.
- Minimization of the residual variance $s^2 = \frac{e^T e}{n-k}$.
- Minimization of Mallows' $C_k$: $C_k = \frac{\mathrm{RSS}_0}{s^2} + 2k_0 - n$.
- Minimization of an information criterion such as
  $\mathrm{AIC} = \ln s^2 + \frac{2k}{n}$ (Akaike),
  $A = s^2\left(1 + \frac{k}{n^{1/4}}\right)$ (Anděl et al.),
  $\mathrm{SR} = \ln s^2 + \frac{k \ln n}{n}$ (Schwarz, Rissanen),
  $\mathrm{HQ} = \ln s^2 + \frac{2 c k \ln(\ln n)}{n}$, $c = 2$ or $3$ (Hannan, Quinn).
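A sketch that evaluates two of these criteria over all submodels (NumPy assumed; exhaustive enumeration is feasible only for a modest number of regressors):

```python
import numpy as np
from itertools import combinations

def info_criteria(X, y):
    """AIC and Schwarz-Rissanen criteria for every non-empty submodel of X."""
    n, k_full = X.shape
    results = {}
    for k in range(1, k_full + 1):
        for cols in combinations(range(k_full), k):
            Xs = X[:, cols]
            e = y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]
            s2 = e @ e / (n - k)                       # residual variance
            results[cols] = (np.log(s2) + 2 * k / n,          # AIC
                             np.log(s2) + k * np.log(n) / n)  # SR
    return results

# the preferred submodel minimises the chosen criterion, e.g.:
# best = min(info_criteria(X, y).items(), key=lambda kv: kv[1][0])
```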
Stepwise regression

- Forward selection involves starting with no variables in the model, testing the addition of each variable using a chosen model comparison criterion, adding the variable (if any) that improves the model the most, and repeating this process until no variable improves the model.
- Backward elimination involves starting with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) whose removal improves the model the most, and repeating this process until no further improvement is possible.
- Bidirectional elimination is a combination of the above, testing at each step for variables to be included or excluded.
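A minimal forward-selection sketch using the AIC form from the previous slide as the comparison criterion (NumPy assumed; the helper name and stopping rule are illustrative):

```python
import numpy as np

def forward_selection(X, y, max_vars=None):
    """Greedy forward selection minimising ln(s^2) + 2k/n."""
    n, k_full = X.shape
    chosen, best_crit = [], np.inf
    while len(chosen) < (max_vars or k_full):
        scores = {}
        for j in set(range(k_full)) - set(chosen):
            Xs = X[:, chosen + [j]]
            e = y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]
            k = len(chosen) + 1
            scores[j] = np.log(e @ e / (n - k)) + 2 * k / n
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_crit:       # no improvement: stop
            break
        chosen.append(j_best)
        best_crit = scores[j_best]
    return chosen
```

Backward elimination is the mirror image: start from all columns and drop the variable whose removal lowers the criterion the most.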