COMBINING THE LIU-TYPE ESTIMATOR AND THE PRINCIPAL COMPONENT REGRESSION ESTIMATOR


Deniz Inan
Marmara University, Department of Statistics, Istanbul, Turkey
E-mail: denizlukuslu@marmara.edu.tr

Received: date / Accepted: date

Abstract In this study a new two-parameter estimator which includes the ordinary least squares (OLS) estimator, the principal component regression (PCR) estimator and the Liu-type estimator as special cases is proposed. Conditions for the superiority of this new estimator over the PCR, r-k class and Liu-type estimators are derived. Furthermore, the performance of this estimator is compared with that of the other estimators under different conditions through simulation studies.

Keywords Liu-type estimator; r-k class estimator; principal component regression estimator; ridge estimator; multicollinearity.

Mathematics Subject Classification (2010) 62J07

1 Introduction

Multicollinearity is a problem for many empirical researchers because it affects how the coefficients of a regression model can be interpreted. It is well known that the ordinary least squares (OLS) estimator in particular can be affected seriously by the presence of multicollinearity, so that the parameter estimates have large sampling variances. To deal with this problem, different methods have been proposed, such as the ordinary ridge regression (ORR) estimator (Hoerl and Kennard, 1970), the principal component regression (PCR) estimator (Draper and Smith, 1981), the least absolute shrinkage and selection operator (LASSO) estimator (Tibshirani, 1996), the least angle regression (LARS) estimator (Efron et al., 2004) and the Liu-type estimator (Liu, 2003). Combinations of these estimators have also been proposed, with the expectation that the combination of two different estimators might inherit the advantages of both.

Baye and Parker (1984) combined the ORR estimator with the PCR estimator and proposed the r-k class estimator, which provides an alternative method of dealing with multicollinearity. The r-k class estimator was compared with the ORR and OLS estimators according to the scalar mean square error (MSE) criterion in Nomura and Ohkubo (1985). Sarkar (1996) studied the properties of the r-k class estimator in the sense of matrix mean square error. Kaciranlar and Sakallioglu (2001), in turn, proposed the r-d class estimator by combining the Liu and PCR estimators and compared this new estimator with the OLS, PCR and Liu estimators according to the MSE criterion. Conditions for the superiority of the r-d class estimator over these estimators and over the r-k class estimator are derived in Ozkale and Kaciranlar (2007). The aim of this paper is to introduce a new class of estimators which includes the OLS, PCR and Liu-type estimators and can be used as an alternative method for combating multicollinearity, and to compare this estimator with the OLS, ORR, Liu-type, PCR and r-k class estimators in the sense of the MSE criterion.

2 A new class of estimators

Consider the linear regression model

$Y = X\beta + \epsilon$  (2.1)

where $X$ is a known $n \times p$ matrix of standardized regressors with $\mathrm{rank}(X) = p$, $Y$ is an $n \times 1$ response vector, $\beta$ is a $p \times 1$ vector of parameters and $\epsilon$ is an $n \times 1$ vector of errors with $E(\epsilon) = 0$ and $\mathrm{Cov}(\epsilon) = \sigma^2 I$. Let us rewrite model (2.1) in the canonical form

$Y = Z\alpha + \epsilon$

where $Z = X\theta$, $\alpha = \theta'\beta$ and $\theta$ is the $p \times p$ orthogonal matrix whose columns constitute the eigenvectors of $X'X$. Then $Z'Z = \theta'X'X\theta = \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_p)$, where $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_p > 0$ are the ordered eigenvalues of $X'X$. The columns of $Z = [Z_1, Z_2, \dots, Z_p]$, which define a new set of orthogonal regressors, are referred to as principal components, and the OLS estimator of $\alpha$ is $\hat{\alpha} = (Z'Z)^{-1}Z'y = \Lambda^{-1}Z'y$. The PCR approach combats multicollinearity by using less than the full set of principal components in the model. Suppose that the last $p - r$ of the eigenvalues are approximately zero. In PCR the principal components corresponding to the near-zero eigenvalues are removed from the analysis and OLS is applied to the remaining components. That is,

$\hat{\alpha}_{PC} = B\hat{\alpha}$

where $B$ is a diagonal matrix whose diagonal elements are $b_{1,1} = b_{2,2} = \dots = b_{r,r} = 1$ and $b_{r+1,r+1} = b_{r+2,r+2} = \dots = b_{p,p} = 0$ (Montgomery et al., 2001).
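To make the canonical form concrete, the following minimal sketch computes $\hat{\alpha}$ and the PCR estimator. This code is illustrative only (it is not from the paper); it assumes numpy and a standardized design matrix X, and the function name pcr_estimator is ours.

```python
import numpy as np

def pcr_estimator(X, y, r):
    """PCR: keep the r principal components with the largest eigenvalues
    of X'X, apply OLS to them, then map back to the original regressors."""
    eigvals, theta = np.linalg.eigh(X.T @ X)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder so lambda_1 >= ... >= lambda_p
    lam, theta = eigvals[order], theta[:, order]
    Z = X @ theta                              # principal components Z = X theta
    alpha_hat = (Z.T @ y) / lam                # canonical OLS: Lambda^{-1} Z'y
    alpha_pc = alpha_hat.copy()
    alpha_pc[r:] = 0.0                         # multiply by B: drop the last p - r components
    return theta @ alpha_pc                    # beta_PC = theta alpha_PC
```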

Likewise, in terms of the original regressors, the PCR estimator can be given as

$\hat{\beta}_{PC} = \theta \hat{\alpha}_{PC}$  (2.2)

Let $\theta_r$ be the matrix of eigenvectors of $X'X$ that remain after deleting the $p - r$ columns of $\theta$ which correspond to the near-zero eigenvalues. Thus $\theta_r'X'X\theta_r = \Lambda_r = \mathrm{diag}(\lambda_1, \dots, \lambda_r)$, and the PCR estimator given in (2.2) can also be written in the form

$\hat{\beta}_{PC} = \theta_r(\theta_r'X'X\theta_r)^{-1}\theta_r'X'y$

and the r-k class estimator is

$\hat{\beta}_r(k) = \theta_r(\theta_r'X'X\theta_r + kI)^{-1}\theta_r'X'y$

where $r \leq p$ and $k \geq 0$. A new class of estimators which includes the OLS, PCR and Liu-type estimators as special cases is defined as follows.

Definition 1. The r-(k,d) class estimator of $\beta$ in (2.1) is

$\hat{\beta}_r(k,d) = \theta_r(\theta_r'X'X\theta_r + kI)^{-1}(\theta_r'X'X\theta_r - dI_r)\theta_r'\hat{\beta}_{PC}$

where $k \geq 0$ and $-\infty < d < \infty$ are shrinkage and biasing parameters that must be chosen by the researcher, and $0 < r \leq p$.

Special cases of the r-(k,d) class estimator are:

$\hat{\beta}_p(0,0) = \hat{\beta}_{LS} = (X'X)^{-1}X'Y$, the OLS estimator,
$\hat{\beta}_r(k,-k) = \hat{\beta}_{PC} = \theta_r(\theta_r'X'X\theta_r)^{-1}\theta_r'X'y$, the PCR estimator,
$\hat{\beta}_r(k,0) = \hat{\beta}_r(k) = \theta_r(\theta_r'X'X\theta_r + kI)^{-1}\theta_r'X'y$, the r-k class estimator,
$\hat{\beta}_p(k,d) = \hat{\beta}(k,d) = (X'X + kI)^{-1}(X'X - dI)\hat{\beta}$, the Liu-type estimator.

The MSEs of $\hat{\beta}_r(k,d)$, $\hat{\beta}_{LS}$, $\hat{\beta}_{PC}$, $\hat{\beta}_r(k)$ and $\hat{\beta}(k,d)$ are, respectively,

$\mathrm{MSE}(\hat{\beta}_r(k,d)) = \sigma^2 \sum_{i=1}^{r} \frac{(\lambda_i - d)^2}{\lambda_i(\lambda_i + k)^2} + \sum_{i=1}^{r} \left(\frac{d+k}{\lambda_i + k}\right)^2 \alpha_i^2 + \sum_{i=r+1}^{p} \alpha_i^2$  (2.3)

$\mathrm{MSE}(\hat{\beta}_{LS}) = \sigma^2 \sum_{i=1}^{p} \frac{1}{\lambda_i}$

$\mathrm{MSE}(\hat{\beta}_{PC}) = \sigma^2 \sum_{i=1}^{r} \frac{1}{\lambda_i} + \sum_{i=r+1}^{p} \alpha_i^2$

$\mathrm{MSE}(\hat{\beta}_r(k)) = \sigma^2 \sum_{i=1}^{r} \frac{\lambda_i}{(\lambda_i + k)^2} + \sum_{i=1}^{r} \left(\frac{k}{\lambda_i + k}\right)^2 \alpha_i^2 + \sum_{i=r+1}^{p} \alpha_i^2$

$\mathrm{MSE}(\hat{\beta}(k,d)) = \sigma^2 \sum_{i=1}^{p} \frac{(\lambda_i - d)^2}{\lambda_i(\lambda_i + k)^2} + \sum_{i=1}^{p} \left(\frac{d+k}{\lambda_i + k}\right)^2 \alpha_i^2$
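Continuing the sketch above, in canonical coordinates the r-(k,d) class estimator of Definition 1 amounts to scaling each retained component by $(\lambda_i - d)/(\lambda_i + k)$ and discarding the rest. The following illustrative function (again ours, not the paper's) implements it; choosing $(r,k,d) = (p,0,0)$, $d = -k$, $d = 0$ or $r = p$ recovers the four special cases listed above.

```python
import numpy as np

def rkd_estimator(X, y, r, k, d):
    """r-(k,d) class estimator of Definition 1 in canonical form:
    component i <= r is scaled by (lambda_i - d)/(lambda_i + k), the rest dropped."""
    eigvals, theta = np.linalg.eigh(X.T @ X)
    order = np.argsort(eigvals)[::-1]
    lam, theta = eigvals[order], theta[:, order]
    alpha_hat = (theta.T @ (X.T @ y)) / lam        # canonical OLS estimate
    alpha_rkd = np.zeros_like(alpha_hat)
    alpha_rkd[:r] = (lam[:r] - d) / (lam[:r] + k) * alpha_hat[:r]
    return theta @ alpha_rkd                       # back-transform to beta
```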

Theorem 1. For any $k > 0$ and $0 < r \leq p$, if the biasing parameter is chosen as

$d = \left[\sum_{i=1}^{r} \frac{\sigma^2 - k\alpha_i^2}{(\lambda_i + k)^2}\right] \Big/ \left[\sum_{i=1}^{r} \frac{\sigma^2 + \lambda_i \alpha_i^2}{\lambda_i(\lambda_i + k)^2}\right]$

then $\mathrm{MSE}(\hat{\beta}_r(k,d)) \leq \mathrm{MSE}(\hat{\beta}_{PC})$ and $\mathrm{MSE}(\hat{\beta}_r(k,d)) \leq \mathrm{MSE}(\hat{\beta}_r(k))$.

Proof. From (2.3) it is easy to see that, for fixed $k$, $\mathrm{MSE}(\hat{\beta}_r(k,d))$ is a quadratic function of $d$. Setting the first derivative of this function with respect to $d$ equal to zero and solving for $d$, the optimum value of $d$ which minimizes $\mathrm{MSE}(\hat{\beta}_r(k,d))$ is obtained as

$d_{opt} = \left[\sum_{i=1}^{r} \frac{\sigma^2 - k\alpha_i^2}{(\lambda_i + k)^2}\right] \Big/ \left[\sum_{i=1}^{r} \frac{\sigma^2 + \lambda_i \alpha_i^2}{\lambda_i(\lambda_i + k)^2}\right]$  (2.4)

In view of $\mathrm{MSE}(\hat{\beta}_r(k,-k)) = \mathrm{MSE}(\hat{\beta}_{PC})$, we have $\mathrm{MSE}(\hat{\beta}_r(k,d_{opt})) \leq \mathrm{MSE}(\hat{\beta}_{PC})$, with equality when $d_{opt} = -k$. Similarly, in view of $\mathrm{MSE}(\hat{\beta}_r(k,0)) = \mathrm{MSE}(\hat{\beta}_r(k))$, we have $\mathrm{MSE}(\hat{\beta}_r(k,d_{opt})) \leq \mathrm{MSE}(\hat{\beta}_r(k))$, with equality when $d_{opt} = 0$.

Theorem 2. For any $k > 0$, $0 < r \leq p$ and $i = 1, \dots, p$, when

$-\lambda_1 < d < \min_i \left[\frac{(\sigma^2 - \alpha_i^2(2k + \lambda_i))\lambda_i}{\sigma^2 + \alpha_i^2 \lambda_i}\right]$

we have $\mathrm{MSE}(\hat{\beta}_r(k,d)) \leq \mathrm{MSE}(\hat{\beta}(k,d))$.

Proof. For a given $r$, the solution of $\mathrm{MSE}(\hat{\beta}_r(k,d)) - \mathrm{MSE}(\hat{\beta}_{r-1}(k,d)) > 0$ for $d$ can be obtained as

$-\lambda_r < d < \frac{(\sigma^2 - \alpha_r^2(2k + \lambda_r))\lambda_r}{\sigma^2 + \alpha_r^2 \lambda_r}$

Generalizing this solution, if we choose $d$ such that

$-\lambda_1 < d < \min_i \left[\frac{(\sigma^2 - \alpha_i^2(2k + \lambda_i))\lambda_i}{\sigma^2 + \alpha_i^2 \lambda_i}\right]$

we get $\mathrm{MSE}(\hat{\beta}_r(k,d)) - \mathrm{MSE}(\hat{\beta}_{r-1}(k,d)) > 0$ for all $r = 1, \dots, p$. As a result,

$\mathrm{MSE}(\hat{\beta}_p(k,d)) > \mathrm{MSE}(\hat{\beta}_{p-1}(k,d)) > \dots > \mathrm{MSE}(\hat{\beta}_r(k,d))$

and since $\hat{\beta}_p(k,d) = \hat{\beta}(k,d)$, it follows that $\mathrm{MSE}(\hat{\beta}_r(k,d)) < \mathrm{MSE}(\hat{\beta}(k,d))$. This completes the proof of Theorem 2 (see the Appendix).
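The quantities in Theorem 1 are simple to evaluate numerically. The sketch below (our illustration, with made-up eigenvalues and coefficients) implements the MSE expression (2.3) and the optimum (2.4), and checks that the MSE at $d_{opt}$ does not exceed the MSE of the PCR estimator ($d = -k$) or of the r-k class estimator ($d = 0$).

```python
import numpy as np

def mse_rkd(lam, alpha, sigma2, k, d, r):
    """Scalar MSE of the r-(k,d) class estimator, equation (2.3)."""
    lr, ar = lam[:r], alpha[:r]
    var = sigma2 * np.sum((lr - d) ** 2 / (lr * (lr + k) ** 2))
    bias2 = np.sum(((d + k) / (lr + k)) ** 2 * ar ** 2) + np.sum(alpha[r:] ** 2)
    return var + bias2

def d_opt(lam, alpha, sigma2, k, r):
    """Optimal biasing parameter of equation (2.4) for fixed k."""
    lr, ar = lam[:r], alpha[:r]
    num = np.sum((sigma2 - k * ar ** 2) / (lr + k) ** 2)
    den = np.sum((sigma2 + lr * ar ** 2) / (lr * (lr + k) ** 2))
    return num / den

# Theorem 1 numerically: the MSE at d_opt is no larger than at d = -k (PCR)
# or at d = 0 (r-k class). The parameter values below are arbitrary.
lam = np.array([5.0, 2.0, 0.5, 0.01])
alpha = np.array([1.0, -0.5, 0.8, 0.3])
sigma2, k, r = 1.0, 0.4, 3
d = d_opt(lam, alpha, sigma2, k, r)
assert mse_rkd(lam, alpha, sigma2, k, d, r) <= mse_rkd(lam, alpha, sigma2, k, -k, r)
assert mse_rkd(lam, alpha, sigma2, k, d, r) <= mse_rkd(lam, alpha, sigma2, k, 0.0, r)
```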

3 Simulation Results

Monte Carlo simulations were used to illustrate the behaviour of the proposed estimator. The simulations were carried out under different degrees of multicollinearity. In the first simulation study the number of regressors was chosen to be six and the parameters were set to $(22, 1.4, 1.2, -1.5, -0.2, 0.7, 1.7)$. The sample size $n$ was set to 30, 50 and 70. The regressors were generated from

$x_{i,1} = (1 - \varphi^2)^{1/2} w_{i,1} + \varphi w_{i,5} + 20$
$x_{i,2} = (1 - \varphi^2)^{1/2} w_{i,2} + \varphi w_{i,5} + 45$
$x_{i,3} = (1 - \rho^2)^{1/2} w_{i,3} + \rho w_{i,6} - 24$
$x_{i,4} = (1 - \rho^2)^{1/2} w_{i,4} + \rho w_{i,6} + 7$
$x_{i,5} = (1 - \rho^2)^{1/2} w_{i,3} + \rho w_{i,2} + 19$
$x_{i,6} = w_{i,2} - 14$

where the $w_{i,j}$ are generated independently from $N(0,2)$, so that the correlations between $x_{i,1}$ and $x_{i,2}$ and between $x_{i,3}$ and $x_{i,4}$ are $\varphi^2$ and $\rho^2$, respectively (Gibbons, 1981). Three different sets of correlations were considered, corresponding to $\varphi = 0.95$ and $\rho = 0.95$, $\varphi = 0.98$ and $\rho = 0.95$, and $\varphi = 0.95$ and $\rho = 0.98$. Simulations were repeated [...] times. In each experiment the inhomogeneous (intercept-included) model was fitted to the data, the error variance estimate of each estimator was computed, and the following estimators were evaluated:

1. OLS: ordinary least squares estimator.
2. ORR: ordinary ridge regression estimator, $k = \hat{\sigma}_{LS}^2 / \hat{\beta}_{LS}'\hat{\beta}_{LS}$ (Schafer et al., 1984).
3. Liu: Liu-type estimator, $k = \hat{\sigma}_{LS}^2 / \hat{\beta}_{LS}'\hat{\beta}_{LS}$ and $\hat{d} = \left[\sum_{i=1}^{p} (\hat{\sigma}^2 - k\hat{\alpha}_i^2)/(\lambda_i + k)^2\right] \big/ \left[\sum_{i=1}^{p} (\hat{\sigma}^2 + \lambda_i\hat{\alpha}_i^2)/(\lambda_i(\lambda_i + k)^2)\right]$ (Liu, 2003), with $\hat{\beta} = \hat{\beta}_{LS}$.
4. PCR: principal component regression estimator.
5. r-k: r-k class estimator, $k = \hat{\sigma}_{LS}^2 / \hat{\beta}_{LS}'\hat{\beta}_{LS}$.
6. r-(k,d): r-(k,d) class estimator, $k = \hat{\sigma}_{LS}^2 / \hat{\beta}_{LS}'\hat{\beta}_{LS}$ and $\hat{d} = \left[\sum_{i=1}^{r} (\hat{\sigma}^2 - k\hat{\alpha}_i^2)/(\lambda_i + k)^2\right] \big/ \left[\sum_{i=1}^{r} (\hat{\sigma}^2 + \lambda_i\hat{\alpha}_i^2)/(\lambda_i(\lambda_i + k)^2)\right]$.

Unfortunately, the $d_{opt}$ values given in the theorems depend on unknown model parameters, so for practical purposes these unknown parameters must be replaced with suitable estimates. If we substitute $\alpha$ and $\sigma$ with their ORR estimates ($\hat{\alpha}_R$ and $\hat{\sigma}_R$) in (2.4), we obtain

$\hat{d}_{opt} = \left[\sum_{i=1}^{r} \frac{\hat{\sigma}_R^2 - k\hat{\alpha}_{R(i)}^2}{(\lambda_i + k)^2}\right] \Big/ \left[\sum_{i=1}^{r} \frac{\hat{\sigma}_R^2 + \lambda_i\hat{\alpha}_{R(i)}^2}{\lambda_i(\lambda_i + k)^2}\right]$

as an operational estimator for $d_{opt}$. In the simulation studies this estimator was used for the r-(k,d) class estimator and, following the proposal of Liu (2003),

$\hat{d}_{opt} = \left[\sum_{i=1}^{p} \frac{\hat{\sigma}_R^2 - k\hat{\alpha}_{R(i)}^2}{(\lambda_i + k)^2}\right] \Big/ \left[\sum_{i=1}^{p} \frac{\hat{\sigma}_R^2 + \lambda_i\hat{\alpha}_{R(i)}^2}{\lambda_i(\lambda_i + k)^2}\right]$

was used for the Liu-type estimator.
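A sketch of the regressor-generation step follows. It is our reconstruction of the design above, not code from the paper; the signs of the location shifts and the reading of $N(0,2)$ as a normal with variance 2 are assumptions, since the source is partially illegible.

```python
import numpy as np

def make_regressors(n, phi, rho, rng):
    """Six collinear regressors of the first simulation study (Gibbons, 1981 style)."""
    w = rng.normal(0.0, np.sqrt(2.0), size=(n, 6))   # w_ij ~ N(0, 2), variance 2 assumed
    x = np.empty((n, 6))
    x[:, 0] = np.sqrt(1 - phi**2) * w[:, 0] + phi * w[:, 4] + 20
    x[:, 1] = np.sqrt(1 - phi**2) * w[:, 1] + phi * w[:, 4] + 45
    x[:, 2] = np.sqrt(1 - rho**2) * w[:, 2] + rho * w[:, 5] - 24
    x[:, 3] = np.sqrt(1 - rho**2) * w[:, 3] + rho * w[:, 5] + 7
    x[:, 4] = np.sqrt(1 - rho**2) * w[:, 2] + rho * w[:, 1] + 19
    x[:, 5] = w[:, 1] - 14
    return x

# Example: one draw of the design under the strongest collinearity setting.
X = make_regressors(50, 0.98, 0.95, np.random.default_rng(1))
```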

Table 1 EMSE values for the six estimators (continuous regressors); columns: $\varphi$, $\rho$, $n$, OLS, ORR, Liu, PCR, r-k, r-(k,d). [Table body not recoverable from the source.]

Table 2 Estimated error variances for the six estimators (continuous regressors); columns: $\varphi$, $\rho$, $n$, OLS, ORR, Liu, PCR, r-k, r-(k,d). [Table body not recoverable from the source.]

The estimated MSEs (EMSEs) and the means of the estimated error variances were computed for each of the six estimators above. The results are presented in Table 1 and Table 2. As expected, the proposed estimator has smaller EMSEs than the other estimators. Although the EMSEs of all estimators increase with the degree of multicollinearity, the increase in EMSE for the proposed estimator is not as large as for the others. It can be concluded that the effect of multicollinearity on the other estimators is more pronounced than on the proposed estimator; the EMSE reduction is greatest when the multicollinearity is high. There is no significant difference among the estimators in terms of the estimated error variances.
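For reference, an EMSE of the kind reported in Table 1 can be approximated as the average squared estimation error over replications. Below is a minimal sketch (ours; the paper's replication count is not legible in the source, and intercept handling is omitted for brevity). Each of the six estimators can be passed in as a closure, e.g. lambda X, y: rkd_estimator(X, y, r, k, d).

```python
import numpy as np

def emse(estimator, X, beta, sigma, n_rep, rng):
    """Estimated MSE: mean of ||beta_hat - beta||^2 over n_rep simulated samples
    drawn from the fixed design X with i.i.d. N(0, sigma^2) errors."""
    total = 0.0
    for _ in range(n_rep):
        y = X @ beta + rng.normal(0.0, sigma, size=X.shape[0])
        total += np.sum((estimator(X, y) - beta) ** 2)
    return total / n_rep
```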

Table 3 EMSE values for the six estimators (both continuous and ordinal regressors); columns: $\varphi$, $\rho$, $n$, OLS, ORR, Liu, PCR, r-k, r-(k,d). [Table body not recoverable from the source.]

The case in which the model has both continuous and ordinal regressors was also considered. For this case six regressors were generated as in the first simulation study, the parameters were set to $(25, 2.3, 4.1, -1.5, -0.9, 0.9, 1.7)$ and the sample size $n$ was set to 30, 50 and 70. Two different sets of correlations were considered, corresponding to $\varphi = 0.95$ and $\rho = 0.95$, and $\varphi = 0.95$ and $\rho = 0.98$. After the six continuous regressors were generated, the first two of them were discretized in order to make them ordinal; in this way the impact of collinearity was slightly reduced. Simulations were repeated [...] times. The results are presented in Table 3. As seen from Table 3, in all cases the r-(k,d) class estimator has a smaller EMSE than the other estimators.

4 Numerical Example

In this section the acetylene data, described and analysed in detail by Marquardt (1980), are employed. This data set is a typical set of response surface data for which a full quadratic model in three regressors is often considered to be an appropriate candidate model. The standardized full quadratic model for the data is

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{1,2} x_1 x_2 + \beta_{1,3} x_1 x_3 + \beta_{2,3} x_2 x_3 + \beta_{1,1} x_1^2 + \beta_{2,2} x_2^2 + \beta_{3,3} x_3^2 + \epsilon$

where $y$ is the percentage of conversion, $x_1$ = (temperature $-$ [...])/80.623, $x_2$ = (H2-to-n-heptane ratio $-$ 12.44)/5.662 and $x_3$ = (contact time $-$ [...])/[...]. Each of the six estimators was applied to these data; $k$, $r$ and the two values of $\hat{d}_{opt}$ were determined as [...], 6, [...] and [...], respectively. The parameter estimates are given in Table 4. Under this model the condition number is

$\kappa = \lambda_{max}/\lambda_{min} = 42048$
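The condition number quoted above is computed from the eigenvalues of $X'X$. A short sketch (ours, not the paper's code):

```python
import numpy as np

def condition_number(X):
    """kappa = lambda_max / lambda_min of X'X; values in the tens of thousands,
    as for the acetylene model, signal severe multicollinearity."""
    lam = np.linalg.eigvalsh(X.T @ X)
    return lam.max() / lam.min()
```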

Table 4 Parameter estimates for the acetylene data; columns: $\hat{\beta}_0, \hat{\beta}_1, \dots, \hat{\beta}_9$; rows: OLS, ORR, Liu, PCR, r-k, r-(k,d). [Table body not recoverable from the source.]

Table 5 EMSEs, estimated parameter variances and estimated parameter biases for the acetylene data; rows: OLS, ORR, Liu, PCR, r-k, r-(k,d). [Table body not recoverable from the source.]

A condition number of this magnitude indicates severe multicollinearity. In this case the OLS estimator does not provide good estimates: although it achieves an optimum fit to the estimation data, it does not provide good predictions for new data. Therefore, the OLS estimator cannot be trusted under this model. The bootstrap sampling method was used on these data to estimate the MSE values of the relevant estimators and to illustrate the distributional properties of the parameter estimates. For this purpose [...] bootstrap samples were generated, and the parameter estimates of the six estimators were computed for each of these samples. The mean of the OLS estimates was taken as $\beta$, and the EMSEs, estimated parameter variances and estimated biases were computed as given in Table 5.

Because of the severe multicollinearity, small changes in the data cause large differences in the parameter estimates. To illustrate this situation, histograms of $\hat{\beta}_9$ are given for each estimator in Figure 1 as an example. As seen in Figure 1, the OLS, ORR and Liu-type estimates vary over a large interval. Compared to the OLS estimates, the distributions of the ORR and Liu-type estimates have stronger peaks and more rapid decay, but they still have heavy tails; therefore these estimates cannot be called stable. The PCR, r-k and r-(k,d) class estimates are distributed within a narrower interval than the OLS, ORR and Liu-type estimates. In addition, the distribution of the r-(k,d) class estimates has a stronger peak and more rapid decay than the distributions of the PCR and r-k class estimates. Considering all of these histograms, we can say that among the six estimators the r-(k,d) class estimator gives the most stable estimates, indicating that this estimator provides better predictions for new data.
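The bootstrap computation described above can be sketched as follows. This is our illustration, not the paper's code: the resampling scheme and the number of bootstrap samples are not legible in the source, so case resampling is assumed. The mean of the bootstrap OLS estimates plays the role of $\beta$, and the squared bias is obtained from the decomposition $\mathrm{EMSE} = \sum_j \mathrm{Var}(\hat{\beta}_j) + \sum_j \mathrm{bias}_j^2$.

```python
import numpy as np

def bootstrap_summary(estimator, X, y, n_boot, rng):
    """EMSE, total variance and total squared bias of an estimator over bootstrap
    samples, measured against the mean of the bootstrap OLS estimates."""
    n = X.shape[0]
    ests, ols = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                 # resample cases with replacement
        ests.append(estimator(X[idx], y[idx]))
        ols.append(np.linalg.lstsq(X[idx], y[idx], rcond=None)[0])
    ests, beta_ref = np.asarray(ests), np.mean(ols, axis=0)
    emse = np.mean(np.sum((ests - beta_ref) ** 2, axis=1))
    total_var = float(np.sum(ests.var(axis=0)))
    return emse, total_var, emse - total_var             # squared bias by difference
```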

Fig. 1 Histograms of $\hat{\beta}_9$; panels: OLS, ORR, Liu, PCR, r-k, r-(k,d). [Figure not reproduced; only the panel labels and caption survive in the source.]

5 Summary

In this paper, conditions have been obtained for the superiority of the r-(k,d) class estimator over the PCR, r-k class and Liu-type estimators. The conditions obtained in Theorem 1 and Theorem 2 depend on unknown parameters; for practical purposes we have therefore suggested replacing them with their ORR estimates. Simulation studies were constructed to examine whether the relevant conditions are satisfied, and the simulation results obtained using the ORR estimates support our theoretical findings. Finally, considering these simulation studies, it is concluded that, especially in the case of severe multicollinearity and small sample sizes, the r-(k,d) class estimator performs better than the other estimators in the sense of MSE.

6 Appendix

$\mathrm{MSE}(\hat{\beta}_r(k,d)) - \mathrm{MSE}(\hat{\beta}_{r-1}(k,d)) > 0$

$\Leftrightarrow \sigma^2\left(\sum_{i=1}^{r} \frac{(\lambda_i - d)^2}{\lambda_i(\lambda_i + k)^2} - \sum_{i=1}^{r-1} \frac{(\lambda_i - d)^2}{\lambda_i(\lambda_i + k)^2}\right) + \sum_{i=1}^{r} \left(\frac{d+k}{\lambda_i + k}\right)^2\alpha_i^2 - \sum_{i=1}^{r-1} \left(\frac{d+k}{\lambda_i + k}\right)^2\alpha_i^2 + \sum_{i=r+1}^{p} \alpha_i^2 - \sum_{i=r}^{p} \alpha_i^2 > 0$

$\Leftrightarrow \sigma^2 \frac{(\lambda_r - d)^2}{\lambda_r(\lambda_r + k)^2} + \left(\frac{d+k}{\lambda_r + k}\right)^2\alpha_r^2 - \alpha_r^2 > 0$

$\Leftrightarrow -\lambda_r < d < \frac{(\sigma^2 - \alpha_r^2(2k + \lambda_r))\lambda_r}{\sigma^2 + \alpha_r^2\lambda_r}$

Acknowledgements The author was supported by the Marmara University Scientific Research Project Unit (BAPKO, project number FEN-D-[...]).
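The inequality in the Appendix can be spot-checked numerically. The sketch below (ours, with arbitrary parameter values) evaluates the MSE difference, in which only the $i = r$ terms survive, and asserts that it is positive on the interval above; note that the minus signs in the bound are our reconstruction, since they do not survive in the source.

```python
import numpy as np

def mse_diff(lam_r, alpha_r, sigma2, k, d):
    """MSE(beta_r(k,d)) - MSE(beta_{r-1}(k,d)): only the i = r terms survive."""
    return (sigma2 * (lam_r - d) ** 2 / (lam_r * (lam_r + k) ** 2)
            + ((d + k) / (lam_r + k)) ** 2 * alpha_r ** 2 - alpha_r ** 2)

lam_r, alpha_r, sigma2, k = 0.05, 0.3, 1.0, 0.5           # arbitrary test values
upper = (sigma2 - alpha_r ** 2 * (2 * k + lam_r)) * lam_r / (sigma2 + alpha_r ** 2 * lam_r)
for d in np.linspace(-lam_r + 1e-6, upper - 1e-6, 25):
    assert mse_diff(lam_r, alpha_r, sigma2, k, d) > 0     # positive on the whole interval
```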

References

1. Baye, M.R., Parker, D.F. (1984). Combining Ridge and Principal Component Regression: A Money Demand Illustration. Communications in Statistics - Theory and Methods 13.
2. Draper, N.R., Smith, H. (1981). Applied Regression Analysis. John Wiley & Sons.
3. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. (2004). Least Angle Regression. The Annals of Statistics 32.
4. Gibbons, D.G. (1981). A Simulation Study of Some Ridge Estimators. Journal of the American Statistical Association 76.
5. Hoerl, A.E., Kennard, R.W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 12.
6. Kaciranlar, S., Sakallioglu, S. (2001). Combining the Liu Estimator and the Principal Component Regression Estimator. Communications in Statistics - Theory and Methods 30.
7. Liu, K. (2003). Using Liu-Type Estimator to Combat Collinearity. Communications in Statistics - Theory and Methods 32.
8. Marquardt, D.W. (1980). A Critique of Some Ridge Regression Methods. Journal of the American Statistical Association 75.
9. Montgomery, D.C., Peck, E.A., Vining, G.G. (2001). Introduction to Linear Regression Analysis. John Wiley & Sons.
10. Nomura, M., Ohkubo, T. (1985). A Note on Combining Ridge and Principal Component Regression. Communications in Statistics - Theory and Methods 14.
11. Ozkale, M.R., Kaciranlar, S. (2007). Superiority of the r-d Class Estimator over Some Estimators by the Mean Square Error Matrix Criterion. Statistics and Probability Letters 77.
12. Sarkar, N. (1996). Mean Square Error Matrix Comparison of Some Estimators in Linear Regression with Multicollinearity. Statistics and Probability Letters 30.
13. Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B 58.
