Journal of Modern Applied Statistical Methods

Volume 12, Issue 2, Article 7

11-1-2013

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

Ghadban Khalaf, King Khalid University, Saudi Arabia

Follow this and additional works at: http://digitalcommons.wayne.edu/jmasm

Part of the Applied Statistics Commons, Social and Behavioral Sciences Commons, and the Statistical Theory Commons

Recommended Citation
Khalaf, Ghadban (2013). "A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression," Journal of Modern Applied Statistical Methods: Vol. 12: Iss. 2, Article 7. DOI: 10.22237/jmasm/1383279360
Available at: http://digitalcommons.wayne.edu/jmasm/vol12/iss2/7

This Regular Article is brought to you for free and open access by the Open Access Journals at DigitalCommons@WayneState. It has been accepted for inclusion in Journal of Modern Applied Statistical Methods by an authorized editor of DigitalCommons@WayneState.
Journal of Modern Applied Statistical Methods
November 2013, Vol. 12, No. 2, 293-303.
Copyright © 2013 JMASM, Inc.
ISSN 1538-9472

A Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression

Ghadban Khalaf
King Khalid University
Saudi Arabia

During the past years, different kinds of estimators have been proposed as alternatives to the Ordinary Least Squares (OLS) estimator for the estimation of the regression coefficients in the presence of multicollinearity. In the general linear regression model, Y = Xβ + e, it is known that multicollinearity makes statistical inference difficult and may even seriously distort the inference. Ridge regression, as viewed here, defines a class of estimators of β indexed by a scalar parameter k. Two methods of specifying k are proposed and evaluated in terms of Mean Square Error (MSE) by simulation techniques. A comparison is made with other ridge-type estimators evaluated elsewhere. The estimated MSEs of the suggested estimators are lower than those of the other ridge parameters and of the OLS estimator.

Keywords: OLS estimator, linear regression, multicollinearity, ridge regression, Monte Carlo simulation.

Ghadban Khalaf is an Associate Professor in the Department of Mathematics.

293

Introduction

Consider the multiple linear regression model

$$Y = X\beta + e \qquad (1)$$

where Y is an (n × 1) response vector, X is a fixed (n × p) matrix of independent variables of rank p, β is the unknown (p × 1) parameter vector of regression coefficients and, finally, e is an (n × 1) vector of uncorrelated errors with mean zero and common unknown variance σ². If X′X is nonsingular, the OLS estimator for β is given by
$$\hat{\beta} = (X'X)^{-1}X'Y \qquad (2)$$

For orthogonal data, the OLS estimator in the linear regression model is strongly efficient. But in the presence of multicollinearity, the OLS efficiency can be reduced, and hence an improvement upon it would be necessary and desirable. The term multicollinearity is used to denote the presence of linear relationships, or near linear relationships, among explanatory variables. If the explanatory variables are perfectly linearly correlated, that is, if the correlation coefficient for these variables is equal to unity, then the parameters become indeterminate; i.e., it is impossible to obtain numerical values for each parameter separately, and the method of least squares breaks down. Conversely, if the correlation coefficient for the explanatory variables is equal to zero, then the variables are called orthogonal and there are no problems concerning the estimates of the coefficients.

When multicollinearity occurs, the least squares estimates are still unbiased and efficient, but the problem is that the estimated standard error S(β̂ᵢ) of the coefficient β̂ᵢ becomes very large; i.e., the standard error tends to be larger than it would be in the absence of multicollinearity, and when S(β̂ᵢ) is larger than it should be, the t-value for testing the significance of βᵢ is smaller than it should be. Thus one is likely to conclude that a variable Xᵢ is not important in the relationship when it really is.

There is no single solution that will eliminate multicollinearity altogether. One common procedure is to select the independent variable most seriously involved in the multicollinearity and remove it from the model. This procedure often improves the standard errors of the remaining coefficients and may make formerly insignificant variables significant, since the elimination of a variable reduces any multicollinearity caused by it.
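The variance inflation described above is easy to exhibit numerically. The following is a small NumPy sketch of our own (not from the paper; all names and data are illustrative): it computes the OLS coefficient variances σ²(X′X)⁻¹ for a nearly orthogonal design and for a highly collinear one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
z = rng.normal(size=(n, 3))

# A nearly orthogonal design versus a highly collinear one (the second
# column of X_coll is almost a copy of the first).
X_ortho = z[:, :2]
X_coll = np.column_stack([z[:, 0], z[:, 0] + 0.05 * z[:, 1]])

def coef_variances(X, sigma2=1.0):
    """Diagonal of sigma^2 * (X'X)^{-1}: the OLS coefficient variances."""
    return sigma2 * np.diag(np.linalg.inv(X.T @ X))

var_ortho = coef_variances(X_ortho)
var_coll = coef_variances(X_coll)
assert var_coll.max() > var_ortho.max()  # collinearity inflates the variances
```

With these data the collinear design inflates the largest coefficient variance by two orders of magnitude, which is exactly the mechanism by which the t-values shrink.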
The difficulty with this approach is that the model may then no longer correctly represent the population relationship, and the estimated coefficients would contain a specification bias. Increasing the sample size is sometimes recommended as another procedure for dealing with multicollinearity. In fact, this method improves the precision of an estimator and hence reduces the adverse effects of multicollinearity.

Hoerl and Kennard (1970) suggested a new technique to overcome the problem of multicollinearity. This technique is called ridge regression. Ridge
regression is a variant of ordinary multiple linear regression whose goal is to circumvent the collinearity of the predictors. It gives up least squares as a method for estimating the parameters of the model and focuses instead on the X′X matrix. This matrix is artificially modified so as to make its determinant appreciably different from zero. This is accomplished by adding a small positive quantity, say k (k > 0), to each of the diagonal elements of the matrix X′X before inverting it for least squares estimation. The resulting estimator is given by

$$\hat{\beta}(k) = (X'X + kI)^{-1}X'Y, \quad k > 0 \qquad (3)$$

which coincides with the OLS estimator, defined by (2), when k = 0. The resulting estimator is biased, but has smaller variance than β̂. This is precisely what the ridge regression estimator studied here can accomplish.

The plan of this paper is as follows. Section 2 presents the proposed estimators included in the study; a novel feature is our proposed ridge estimator which, as we shall see presently, has lower MSE. Section 3 describes the simulation technique adopted to evaluate the performance of the new values of the ridge parameter we suggest. The results of the simulation study, which appear in the tables, are presented in Section 4. Finally, Section 5 contains a summary and conclusions.

The Proposed Estimators

With the ridge estimation method there arises the problem of determining an optimal value of k. With a good choice of k, one might hope to improve on the OLS estimator for every coefficient. Hoerl, Kennard and Baldwin (1975) showed, through simulation, that the use of the ridge estimator with the biasing parameter

$$\hat{k} = \frac{p\,\hat{\sigma}^2}{\sum_{i=1}^{p}\hat{\beta}_i^2} \qquad (4)$$

implies that MSE(β̂(k)) < MSE(β̂), where p denotes the number of parameters (excluding the intercept) and σ̂² is the usual unbiased estimate of σ², defined by
$$\hat{\sigma}^2 = \sum_{i=1}^{n} e_i^2 \,/\, (n - p).$$

They showed that the probability of obtaining a smaller MSE using (4) increases with the number of parameters. We will use the acronym HKB for the estimator (4).

Khalaf and Shukur (2005) suggested a modification of Hoerl and Kennard (1970) given by

$$\hat{k} = \frac{t_{\max}\,\hat{\sigma}^2}{(n - p)\hat{\sigma}^2 + t_{\max}\hat{\beta}_{\max}^2} \qquad (5)$$

which guarantees a lower MSE, where t_max is the maximum eigenvalue of the X′X matrix. For this estimator we will use the acronym KS.

Starting from the estimators (4) and (5), we suggest modifications of HKB and KS obtained by multiplying each of them by the amount

$$\frac{t_{\max} + t_{\min}}{\sum_{i=1}^{p}\hat{\beta}_i^2},$$

where t_min is the minimum eigenvalue of the matrix X′X. This leads to the following estimators:

$$\hat{k}_1 = \frac{t_{\max} + t_{\min}}{\sum_{i=1}^{p}\hat{\beta}_i^2}\cdot\frac{p\,\hat{\sigma}^2}{\sum_{i=1}^{p}\hat{\beta}_i^2} \qquad (6)$$

$$\hat{k}_2 = \frac{t_{\max} + t_{\min}}{\sum_{i=1}^{p}\hat{\beta}_i^2}\cdot\frac{t_{\max}\,\hat{\sigma}^2}{(n - p)\hat{\sigma}^2 + t_{\max}\hat{\beta}_{\max}^2} \qquad (7)$$

For our two suggested estimators, defined by (6) and (7), we use the acronyms K1 and K2, respectively.
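The estimators above translate directly into code. The following NumPy sketch is our own illustration (function and variable names are not from the paper): it implements the ridge estimator (3) and returns the four ridge parameters HKB, KS, K1 and K2 of (4)-(7).

```python
import numpy as np

def ols(X, y):
    """OLS estimator (2): (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    """Ridge estimator (3): (X'X + kI)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def ridge_parameters(X, y):
    """Return the HKB, KS, K1 and K2 parameters of (4)-(7)."""
    n, p = X.shape
    XtX = X.T @ X
    beta_hat = ols(X, y)
    resid = y - X @ beta_hat
    sigma2 = resid @ resid / (n - p)        # unbiased estimate of sigma^2
    eig = np.linalg.eigvalsh(XtX)           # eigenvalues of X'X, ascending
    t_min, t_max = eig[0], eig[-1]
    ssb = beta_hat @ beta_hat               # sum of beta_hat_i^2
    k_hkb = p * sigma2 / ssb                                                  # (4)
    k_ks = t_max * sigma2 / ((n - p) * sigma2 + t_max * np.max(beta_hat**2))  # (5)
    factor = (t_max + t_min) / ssb          # common multiplier for (6), (7)
    return k_hkb, k_ks, factor * k_hkb, factor * k_ks
```

Calling `ridge(X, y, 0.0)` reproduces the OLS fit, consistent with the remark after (3); all four returned parameters are strictly positive by construction.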
The Simulation Study

A simulation study was conducted in order to draw conclusions about the performance of our suggested estimators relative to HKB, KS and the OLS estimator. To achieve different degrees of collinearity, following Kibria (2003), the explanatory variables are generated using the following equation:

$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho\, z_{i,p+1}, \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, p,$$

where the z_ij are independent standard normal pseudo-random numbers, p is the number of explanatory variables and ρ is specified so that the correlation between any two explanatory variables is given by ρ². Three different levels of correlation are considered, corresponding to ρ = 0.85, 0.95 and 0.99. The n observations for the dependent variable are determined by the following equation:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_p x_{ip} + e_i, \quad i = 1, 2, \ldots, n,$$

where the e_i are i.i.d. pseudo-random numbers. In this study, β₀ is taken to be zero and the term e_i is generated from each of the following distributions: N(0, 1), T(3), T(7) and F(3, ). The parameter values are chosen so that

$$\sum_{i=1}^{p} \beta_i^2 = 1,$$

which is a common restriction in simulation studies (see Muniz and Kibria (2009)). The other factors we chose to vary are the sample size and the number of regressors. We generated models consisting of 25, 50, 100 and 150 observations and with 2 and 4 explanatory variables. It is noted from the results of previous simulation studies (see Khalaf and Shukur (2005), Alkhamisi and Shukur (2008) and Khalaf (2011)) that increasing the number of regressors and using non-normal pseudo-random numbers to generate e_i lead to a higher estimated MSE, while increasing the sample size leads to a lower estimated MSE.

The criterion proposed here for measuring the goodness of an estimator is the MSE, computed using the following formula:

$$MSE(\hat{\beta}_r) = \frac{1}{5000}\sum(\hat{\beta}_r - \beta)'(\hat{\beta}_r - \beta), \qquad (8)$$
where β̂_r is the estimate of β obtained from the OLS estimator or from the ridge estimator, for the different estimated values of k considered for comparison, and 5000 is the number of replicates used in the Monte Carlo simulation.

The Simulation Results

Tables 1-6 below present the output from the Monte Carlo experiment concerning the properties of the different methods used to choose the ridge parameter k. The results show that the estimated MSE is affected by all the factors we chose to vary in the design of the experiment. It is also noted that the higher the degree of correlation, the higher the estimated MSE, but this increase is much greater for the OLS than for the ridge regression estimators. The distribution of the error term and the number of explanatory variables have a different impact on the estimators.

Table 1. Estimated MSE when ρ = 0.85 and p = 2.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     0.38    0.8     0.90    0.43    0.8
            50     0.      0.093   0.097   0.55    0.76
           100     0.057   0.05    0.053   0.66    0.78
           150     0.034   0.03    0.03    0.8     0.84
T(3)        25     .59     0.957   .35     0.506   0.497
            50     .9      0.60    0.896   0.5     0.364
           100     0.53    0.3     0.445   0.588   0.39
           150     0.473   0.6     0.44    0.63    0.35
T(7)        25     0.350   0.48    0.66    0.304   0.8
            50     0.69    0.35    0.43    0.34    0.
           100     0.076   0.067   0.069   0.35    0.0
           150     0.053   0.048   0.049   0.367   0.4
F(3, )      25     0.853   0.50    0.584   0.383   0.89
            50     0.39    0.6     0.309   0.49    0.36
           100     0.78    0.39    0.57    0.48    0.76
           150     0.6     0.04    0.5     0.50    0.90
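The simulation design described above (Kibria-style collinear regressors, β scaled so that Σβᵢ² = 1, and the empirical MSE of (8)) can be sketched as follows. This is our own illustrative code, with the replicate count reduced from the paper's 5000 for speed; all names are assumptions, not the authors' code.

```python
import numpy as np

def make_X(n, p, rho, rng):
    """Kibria-style regressors: x_ij = sqrt(1 - rho^2) z_ij + rho z_{i,p+1}."""
    z = rng.normal(size=(n, p + 1))
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

def empirical_mse_ols(n=50, p=2, rho=0.95, reps=200, seed=0):
    """Empirical MSE (8) of the OLS estimator over `reps` replicates."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)         # enforces sum(beta_i^2) = 1
    X = make_X(n, p, rho, rng)             # fixed design across replicates
    XtX = X.T @ X
    total = 0.0
    for _ in range(reps):
        y = X @ beta + rng.normal(size=n)  # N(0, 1) errors; beta_0 = 0
        b = np.linalg.solve(XtX, X.T @ y)
        total += (b - beta) @ (b - beta)
    return total / reps
```

Replacing `rng.normal(size=n)` with, for example, `rng.standard_t(3, size=n)` gives the T(3) error case, and applying the same loop to a ridge fit with each estimated k yields the columns compared in the tables.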
Table 2. Estimated MSE when ρ = 0.85 and p = 4.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     0.796   0.549   0.57    0.87    0.59
            50     0.334   0.55    0.68    0.6     0.6
           100     0.56    0.3     0.37    0.57    0.8
           150     0.03    0.090   0.093   0.84    0.96
T(3)        25     7.387   3.78    4.330   .049    .05
            50     6.      .96     4.79    .073    .6
           100     .685    0.969   .38     0.509   0.409
           150     .40     0.754   .08     0.55    0.33
T(7)        25     .59     0.730   0.776   0.      0.78
            50     0.504   0.36    0.389   0.55    0.84
           100     0.35    0.88    0.00    0.33    0.8
           150     0.5     0.7     0.35    0.353   0.3
F(3, )      25     .667    .446    .60     0.338   0.38
            50     .30     0.699   0.805   0.36    0.6
           100     0.578   0.40    0.468   0.45    0.63
           150     0.36    0.7     0.3     0.467   0.8

Table 3. Estimated MSE when ρ = 0.95 and p = 2.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     0.705   0.47    0.450   0.93    0.47
            50     0.353   0.50    0.65    0.93    0.34
           100     0.68    0.33    0.40    0.0     0.45
           150     0.4     0.095   0.00    0.3     0.49
T(3)        25     7.899   3.00    3.580   .50     .75
            50     5.575   .37     .969    0.988   .56
           100     .703    0.789   .5      0.486   0.75
           150     .83     0.655   0.959   0.54    0.99
T(7)        25     .74     0.670   0.78    0.4     0.86
            50     0.58    0.340   0.37    0.49    0.64
           100     0.50    0.85    0.00    0.87    0.77
           150     0.6     0.7     0.37    0.3     0.87
F(3, )      25     .556    .3      .37     0.40    0.343
            50     .67     0.63    0.738   0.336   0.5
           100     0.566   0.346   0.49    0.397   0.4
           150     0.378   0.5     0.300   0.444   0.44
Table 4. Estimated MSE when ρ = 0.95 and p = 4.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     .356    .308    .35     0.54    0.45
            50     .057    0.673   0.705   0.5     0.099
           100     0.359   0.5     0.66    0.0     0.088
           150     0.33    0.45    0.58    0.94    0.38
T(3)        25     8.886   8.484   9.06    .45     .99
            50     4.573   7.74    8.8     .939    .35
           100     4.      .079    .57     0.84    0.94
           150     3.390   .85     .66     0.4     0.306
T(7)        25     3.76    .996    .068    0.8     0.
            50     .50     0.89    0.948   0.46    0.4
           100     0.745   0.495   0.534   0.99    0.39
           150     0.478   0.34    0.368   0.39    0.6
F(3, )      25     8.0     4.48    4.334   0.776   0.796
            50     3.578   .88     .064    0.06    0.7
           100     .755    .034    .70     0.48    0.64
           150     .80     0.74    0.849   0.309   0.95

Table 5. Estimated MSE when ρ = 0.99 and p = 2.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     4.050   .850    .905    0.349   0.33
            50     .776    0.884   0.93    0.33    0.099
           100     0.93    0.533   0.568   0.38    0.09
           150     0.57    0.358   0.385   0.55    0.099
T(3)        25     43.786  5.68    6.407   .046    .5
            50     .736    7.673   8.50    3.55    3.377
           100     8.794   3.60    4.7     0.48    0.399
           150     7.046   .46     3.74    0.36    0.3
T(7)        25     6.08    .657    .745    0.56    0.544
            50     .63     .9      .74     0.7     0.4
           100     .370    0.73    0.797   0.78    0.
           150     0.865   0.50    0.55    0.04    0.3
F(3, )      25     .863    4.8     5.037   .4      .438
            50     6.40    .550    .779    0.343   0.96
           100     3.39    .508    .75     0.46    0.5
           150     .90     0.899   .060    0.79    0.5
Table 6. Estimated MSE when ρ = 0.99 and p = 4.

             n     OLS     HKB     KS      K1      K2
N(0, 1)     25     3.39    6.484   6.547   0.97    0.98
            50     6.095   3.078   3.4     0.09    0.07
           100     .708    .466    .50     0.06    0.050
           150     .70     0.990   .036    0.076   0.057
T(3)        25     69.385  7.38    73.397  58.48   59.44
            50     65.70   33.98   34.73   5.466   5.685
           100     30.93   5.077   5.93    .38     .448
           150     9.9     8.885   9.738   0.505   0.556
T(7)        25     9.789   9.337   9.473   .739    .756
            50     8.78    4.34    4.44    0.30    0.9
           100     4.068   .5      .40     0.077   0.063
           150     .550    .390    .467    0.086   0.06
F(3, )      25     44.4    0.834   .06     7.00    7.089
            50     .347    9.485   9.785   .073    .3
           100     9.7     4.498   4.730   0.45    0.3
           150     6.78    3.077   3.303   0.4     0.085

A non-normal error term in combination with ρ = 0.95 or ρ = 0.99 leads to a larger estimated MSE for the OLS estimator and for the ridge parameters, especially when n is small, but when the sample size increases the estimated MSE of the suggested ridge parameters, namely K1 and K2, decreases substantially. The performance of K1 and K2 is good in all cases when the error term is normally distributed, and also when n is greater than 25 and the error term is non-normal. When n is greater than 25, the modified ridge parameters defined by (6) and (7) perform much better than the estimators HKB, KS and the OLS, and K2 has a low estimated MSE when the number of regressors equals 4.

Summary and Conclusions

In multiple linear regression, the effect of non-orthogonality of the explanatory variables is to pull the least squares estimates of the regression coefficients away from the true coefficients, β, that one is trying to estimate. The coefficients can
be both too large in absolute value and incorrect with respect to sign. Furthermore, the variances and covariances of the OLS estimates tend to become too large, and a slight movement in the data can give completely different estimates of the coefficients. A remedy is accomplished by adding a small positive quantity, k, to each of the diagonal elements of the matrix X′X. The resulting estimator is called the ridge estimator, suggested by Hoerl and Kennard (1970) and given by (3).

Several procedures for constructing ridge estimators have been proposed in the literature. These procedures aim at a rule (or algorithm) for selecting the constant k in equation (3). In fact, the best method of estimating k is an unsolved problem, and there is no constant value of k that is certain to yield an estimator which is uniformly better (in terms of MSE) than the OLS in all cases. By means of Monte Carlo simulations, two suggested ridge parameters were evaluated and the results were compared with ridge parameters evaluated by Hoerl et al. (1975) and Khalaf and Shukur (2005).

The estimator HKB performed well in this study. It appears to outperform KS when ρ is small and the sample size is greater than 25. The suggested estimators K1 and K2 perform well in our simulation. They appear to offer an opportunity for a large reduction in MSE when p = 2 and the error term is normally distributed. For a non-normal error term, the suggested versions of the ridge parameter have a lower estimated MSE when the sample size is greater than 25. K2 always minimizes the estimated MSE when the error term is normally distributed.

References

Alkhamisi, M., & Shukur, G. (2008). Developing ridge parameters for SUR model. Communications in Statistics - Theory and Methods, 37, 544-564.

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12, 55-67.

Hoerl, A. E., Kennard, R. W., & Baldwin, K. F. (1975). Ridge regression: Some simulations.
Communications in Statistics - Theory and Methods, 4, 105-124.

Khalaf, G. (2011). Suggested ridge regression estimators under multicollinearity. Journal of Natural and Applied Sciences, University of Aden, Vol. 5, 70-93.
Khalaf, G., & Shukur, G. (2005). Choosing ridge parameters for regression problems. Communications in Statistics - Theory and Methods, 34, 1177-1182.

Kibria, B. M. G. (2003). Performance of some new ridge regression estimators. Communications in Statistics - Simulation and Computation, 32, 419-435.

Muniz, G., & Kibria, B. M. G. (2009). On some ridge regression estimators: An empirical comparison. Communications in Statistics - Simulation and Computation, 38, 621-630.