Generalized Ridge Regression Estimator in Semiparametric Regression Models


JIRSS (2015) Vol. 14, No. 1, pp

M. Roozbeh 1, M. Arashi 2, B. M. Golam Kibria 3

1 Department of Statistics, Faculty of Mathematics, Statistics and Computer Sciences, Semnan University, Iran.
2 Department of Statistics, School of Mathematical Sciences, University of Shahrood, Iran.
3 Department of Mathematics and Statistics, Florida International University, Florida, USA.

M. Roozbeh (mahdi.roozbeh@profs.semnan.ac.ir), M. Arashi (m_arashi_stat@yahoo.com), B. M. Golam Kibria (kibriag@fiu.edu)
Received: July 2014; Accepted: June 2015

Abstract. In the context of ridge regression, the estimation of the ridge (shrinkage) parameter plays an important role in analyzing data. Many efforts have been made to develop skills and methods for computing shrinkage estimators for different full-parametric ridge regression approaches, using eigenvalues. However, the estimation of the shrinkage parameter has been neglected for semiparametric regression models. The main focus of this paper is to develop the necessary tools for computing the risk function of the regression coefficient estimator based on the eigenvalues of the design matrix in a semiparametric regression model, making use of differencing methodology. In this respect, some new estimators for the shrinkage parameter are also proposed. It is shown that one of these estimators, which is constructed from the well-known harmonic mean, performs better for large values of the signal-to-noise ratio. For our proposal, Monte Carlo simulation studies and a real application related to housing attributes are conducted to illustrate the efficiency of the shrinkage estimators based on the minimum risk criterion.

Keywords. Differencing methodology, generalized ridge estimator, kernel smoothing, multicollinearity, semiparametric regression model.

MSC: Primary: 62J07; Secondary: 62J99.

1 Introduction

Semiparametric regression models (SRMs) have received considerable attention in statistics and econometrics because of their flexibility in modeling real events. A SRM has the form
$$y_i = x_i'\beta + f(t_i) + \epsilon_i, \quad i = 1,\dots,n, \tag{1.1}$$
where $x_i = (x_{i1}, x_{i2},\dots,x_{ip})'$ is a vector of explanatory variables, $\beta = (\beta_1,\beta_2,\dots,\beta_p)'$ is an unknown $p$-dimensional parameter vector, the $t_i$ are known (non-stochastic) points in some bounded domain $D \subset \mathbb{R}$, $f(t_i)$ is an unknown smooth function and the $\epsilon_i$ are independent and identically distributed random errors with mean 0 and variance $\sigma^2$.

Most approaches for SRMs are based on different nonparametric regression procedures, and several methods exist to estimate $\beta$ and $f(\cdot)$. An extensive study of estimation and application of the SRM (1.1) can be found in the monograph of Härdle et al. (2000).

An alternative to nonparametric procedures is the differencing methodology. This approach uses differences to remove the trend in the data that arises from the function $f(\cdot)$ and does not require an estimator of $f(\cdot)$. It is often called the difference-based procedure. Provided that $f(\cdot)$ is differentiable and the $t_i$ are closely spaced, it is possible to remove the effect of $f(\cdot)$ by differencing the data appropriately. In model (1.1), Yatchew (1997) concentrated on estimation of the linear component and used differencing to eliminate the bias induced by the presence of the nonparametric component. The difference-based estimation procedure is optimal in the sense that the estimator of the linear component is asymptotically efficient and the estimator of the nonparametric component is asymptotically minimax (Wang et al. 2011). Thus, differencing allows one to perform inference on $\beta$ as if there were no nonparametric component $f(\cdot)$ in the SRM (1.1).
Once $\beta$ is estimated, a variety of nonparametric techniques can be applied to estimate $f(\cdot)$ as if $\beta$ were known. Wang et al. (2011) used higher-order differences for optimal efficiency in estimating the linear part, by using a special class of difference sequences.

Now, consider a SRM in the presence of multicollinearity. For the purpose of this study, we only employ the ridge regression concept due to

Hoerl and Kennard (1970), to combat multicollinearity. There is a large body of work adopting ridge regression methodology to overcome the multicollinearity problem. To mention a few recent studies in full-parametric regression, see Saleh and Kibria (1993), Saleh (2006), Ozkale and Kaciranlar (2007), Muniz and Kibria (2009), Ozkale (2009), Hassanzadeh Bashtian et al. (2011a, b), Kaciranlar et al. (2011), Kibria and Saleh (2004, 2011), Roozbeh et al. (2011, 2012), Akdeniz Duran et al. (2012) and Kibria (2012). Akdeniz and Tabakan (2009) and Roozbeh and Arashi (2013) employed this methodology for SRMs.

The main focus of this paper is to develop the necessary tools for computing the risk function of the regression coefficient estimator in a SRM, incorporating the eigenvalues of the design matrix. To this end, the differencing methodology will be applied. We also seek some new estimators of the shrinkage parameter.

The organization of the paper is as follows. Section 2 contains a nutshell of the difference-based methodology. In Section 3, the difference-based generalized ridge estimator of the linear part is introduced. Section 4 reviews some methods usually used for estimating the ridge parameter in the full-parametric regression model and proposes some new ones. Properties of the proposed estimator are derived exactly in Section 5. Section 6 is devoted to a Monte Carlo simulation study and an application to housing attributes. Finally, we conclude the paper with some remarks in Section 7.

2 Differencing Approach

In this section, we use a difference-based technique to estimate the linear regression coefficient vector $\beta$. This technique has been used to remove the nonparametric component in the SRM by various authors (e.g. Yatchew, 1997 and 2003; Klipple and Eubank, 2007; and Brown and Levine, 2007).
In full matrix notation, model (1.1) can be represented as
$$y = X\beta + f(t) + \epsilon, \tag{2.1}$$
where $y = (y_1,\dots,y_n)'$, $X = (x_1,\dots,x_n)'$ is the $n \times p$ non-stochastic design matrix of full column rank, $f(t) = (f(t_1),\dots,f(t_n))'$, and $\epsilon = (\epsilon_1,\dots,\epsilon_n)'$. In general, we assume that $\epsilon$ is a vector of disturbances (errors) with $E(\epsilon) = 0$ and $E(\epsilon\epsilon') = \sigma^2 I_n$, where $I_n$ is the identity matrix of order $n$ and $\sigma^2$ is an unknown parameter. Yatchew (1997) suggested estimating $\beta$ on the basis of the $m$th-order

differencing equation
$$\sum_{j=0}^{m} d_j y_{i-j} = \Big(\sum_{j=0}^{m} d_j x_{i-j}\Big)'\beta + \sum_{j=0}^{m} d_j f(t_{i-j}) + \sum_{j=0}^{m} d_j \epsilon_{i-j}, \tag{2.2}$$
where $d_0, d_1,\dots,d_m$ are differencing weights. In what follows, we adopt an explanation from Akdeniz et al. (2013) to show how the approximation works.

How does the approximation work? Suppose the $t_i$ are equally spaced on the unit interval and $|f'(\cdot)| \le L$. By the mean value theorem, for some $t_i^* \in [t_{i-1}, t_i]$ we have
$$|f(t_i) - f(t_{i-1})| = |f'(t_i^*)|\,(t_i - t_{i-1}) \le \frac{L}{n}.$$
Note that with $m = p = 1$, from (2.2) we have
$$y_i - y_{i-1} = (x_i - x_{i-1})\beta + f(t_i) - f(t_{i-1}) + \epsilon_i - \epsilon_{i-1} = (x_i - x_{i-1})\beta + O\Big(\frac{1}{n}\Big) + \epsilon_i - \epsilon_{i-1} \cong (x_i - x_{i-1})\beta + \epsilon_i - \epsilon_{i-1}.$$
We then estimate the linear regression coefficient $\beta$ by ordinary least squares applied to the differences, which gives
$$\hat\beta_{\text{diff}} = \frac{\sum_{i=2}^{n} (x_i - x_{i-1})(y_i - y_{i-1})}{\sum_{i=2}^{n} (x_i - x_{i-1})^2}.$$
Now, let $d = (d_0,\dots,d_m)'$ be an $(m+1)$-vector, where $m$ is the order of differencing and $d_0, d_1,\dots,d_m$ are differencing weights minimizing the variance of linear estimators, i.e., satisfying
$$\min_{d_0,\dots,d_m} \sum_{l=1}^{m} \Big(\sum_{j=0}^{m} d_j d_{l+j}\Big)^2, \qquad \sum_{j=0}^{m} d_j = 0, \qquad \sum_{j=0}^{m} d_j^2 = 1, \tag{2.3}$$
with the convention $d_{l+j} = 0$ for $l + j > m$. The role of the constraints in (2.3) is now evident. The first ensures that, as the $t_i$ become close, the nonparametric effect is removed; the second ensures that the variance of the sum of weighted residuals remains equal to $\sigma^2$ in (2.2).
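The first-difference argument above can be checked numerically. The following is a minimal sketch, assuming hypothetical noise-free data (the regressor `x`, the smooth function `sin(2*pi*t)` and the value `beta = 2` are illustrative choices, not taken from the paper). It computes the least-squares slope from first differences and verifies that the first-order weights $(1/\sqrt{2}, -1/\sqrt{2})$ satisfy the two constraints in (2.3).

```python
import math

def first_difference_estimate(x, y):
    """Least-squares slope computed from first differences (m = p = 1)."""
    dx = [x[i] - x[i - 1] for i in range(1, len(x))]
    dy = [y[i] - y[i - 1] for i in range(1, len(y))]
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)

# Hypothetical noise-free data: y_i = beta * x_i + f(t_i) with a smooth f.
n, beta = 200, 2.0
t = [i / n for i in range(1, n + 1)]              # equally spaced design points
x = [math.sin(7.0 * i) for i in range(1, n + 1)]  # an irregular regressor
y = [beta * xi + math.sin(2.0 * math.pi * ti) for xi, ti in zip(x, t)]

# Differencing removes most of f, so the slope is close to beta.
beta_diff = first_difference_estimate(x, y)

# First-order weights: they sum to zero and have unit sum of squares.
d = (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0))
sum_d = sum(d)
sum_d_squared = sum(w * w for w in d)
```

Because consecutive values of the smooth function differ by at most $O(1/n)$, the differenced trend contributes little to the slope, which is why `beta_diff` stays near the true coefficient even though $f$ is never estimated.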

Now, we define the $(n-m) \times n$ differencing matrix $D$, whose elements satisfy (2.3), as
$$D = \begin{pmatrix} d_0 & d_1 & \cdots & d_m & 0 & \cdots & 0 \\ 0 & d_0 & d_1 & \cdots & d_m & \cdots & 0 \\ \vdots & & \ddots & \ddots & & \ddots & \vdots \\ 0 & \cdots & 0 & d_0 & d_1 & \cdots & d_m \end{pmatrix}.$$
This and related matrices are given, for example, in Yatchew (2003). Incorporating the differencing matrix in model (2.1) permits direct estimation of the parametric effect. As a result of developments in Speckman (1988), it is known that the parameter vector $\beta$ in (2.1) can be estimated with parametric efficiency. We now show that difference-based estimators can be used for this purpose. Inasmuch as the data have been ordered so that the values of the nonparametric variable(s) are close, applying the differencing matrix $D$ in model (2.1) removes the nonparametric effect in large samples. If $f(\cdot)$ is an unknown function that is an inferential object and has a bounded first derivative, then $Df(t)$ is close to 0, so that by applying the differencing matrix we may rewrite (2.1) as
$$Dy \doteq DX\beta + D\epsilon,$$
or
$$y_D \doteq X_D\beta + \epsilon_D, \tag{2.4}$$
where $y_D = Dy$, $X_D = DX$ and $\epsilon_D = D\epsilon$.

For arbitrary differencing coefficients satisfying (2.3), the difference-based ordinary least squares estimator (DOLSE) of the regression coefficient $\beta$ in the SRM is (see Yatchew, 1997)
$$\hat\beta_D = S_D^{-1} X_D' y_D, \qquad S_D = X_D' X_D. \tag{2.5}$$
Thus, differencing allows one to perform inference on $\beta$ as if there were no nonparametric component $f(\cdot)$ in model (2.1) (Yatchew 2003). Once $\beta$ is estimated, a variety of nonparametric techniques can be applied to estimate $f(\cdot)$ as if $\beta$ were known.

To account for the estimation of $\beta$ in (2.4), we introduce the modified estimator of $\sigma^2$ as
$$\hat\sigma_D^2 = \frac{y_D'(I - P)y_D}{\operatorname{tr}\big(D'(I - P)D\big)}, \tag{2.6}$$
where $\operatorname{tr}(\cdot)$ is the trace of a square matrix and $P$ is the projection matrix $P = X_D (X_D' X_D)^{-1} X_D'$.
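To make the structure of $D$ concrete, here is a small sketch that builds the banded $(n-m) \times n$ matrix from a weight vector and applies it to a smooth function evaluated on an equally spaced grid. The helper names `differencing_matrix` and `matvec`, the first-order weights and the test function `sin(2*pi*t)` are assumptions made for illustration only.

```python
import math

def differencing_matrix(weights, n):
    """Return the (n - m) x n banded differencing matrix as a list of rows."""
    m = len(weights) - 1
    rows = []
    for r in range(n - m):
        row = [0.0] * n
        for j, w in enumerate(weights):
            row[r + j] = w  # d_0, ..., d_m shifted one column per row
        rows.append(row)
    return rows

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

n = 100
d = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]  # first-order weights, m = 1
D = differencing_matrix(d, n)

t = [i / n for i in range(1, n + 1)]
f = [math.sin(2.0 * math.pi * ti) for ti in t]     # hypothetical smooth f
Df = matvec(D, f)
max_abs_Df = max(abs(v) for v in Df)               # small relative to max |f| = 1
```

With these choices every entry of `Df` is bounded by $(1/\sqrt{2})(2\pi/n)$, which illustrates why $Df(t)$ can be treated as negligible in (2.4) when the design points are closely spaced.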

3 Difference-based Generalized Ridge Estimator

In this section, we discuss a biased estimation technique for the SRM under multicollinearity. The covariance matrix of $\hat\beta_D$ is equal to $\sigma^2 S_D^{-1}$. As can be seen, both the difference-based ordinary least squares estimate (DOLSE) and its covariance matrix depend heavily on the characteristics of the matrix $S_D$. If $S_D$ is ill-conditioned, the DOLS estimators are sensitive to a number of errors. For example, some of the regression coefficients may be statistically insignificant or have wrong signs, and they may result in wide confidence intervals for individual parameters (such estimators are called unstable). With these errors, it is difficult to make valid statistical inference.

The problem of multicollinearity can be addressed by collecting additional data, re-parameterizing the model or reselecting variables. There are two well-known mathematical methods to overcome multicollinearity: principal components regression and ridge regression. In this article, we discuss the ridge regression method. A brief review of the literature reveals an abundance of work related to ridge regression. Hoerl and Kennard (1970) first proposed this method to solve the multicollinearity problem. They suggested adding a small positive number $k$ to the diagonal elements of the matrix $S_D$; the resulting estimator has the form
$$\hat\beta_D(k) = S_D^{-1}(k) X_D' y_D, \qquad S_D(k) = S_D + k I_p, \tag{3.1}$$
which is known as the difference-based ridge estimator (DRE). For a suitable positive value of $k$, this estimator provides a smaller mean squared error (MSE) than the LSE. The constant $k$ ($k \ge 0$) is called the ridge or shrinkage parameter, and it must be estimated from the data.

Although the ridge estimator is the most popular method for dealing with multicollinearity, it has some drawbacks. Dependency on the ridge parameter tends to result in either instability or bias.
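The effect of $k$ in (3.1) can be illustrated in two dimensions, where $(S_D + kI_2)^{-1}$ has a closed form. This is only a sketch: the matrix `S_D`, the vector `xty` (standing in for $X_D'y_D$) and the grid of $k$ values are hypothetical and chosen so that the unpenalized solution is exactly $(1, 1)$.

```python
def ridge_2x2(S, xty, k):
    """Solve (S + k I) b = xty for a symmetric 2 x 2 matrix S."""
    a, b, c = S[0][0] + k, S[0][1], S[1][1] + k
    det = a * c - b * b
    return [(c * xty[0] - b * xty[1]) / det,
            (a * xty[1] - b * xty[0]) / det]

# Hypothetical ill-conditioned cross-product matrix (correlation 0.99).
S_D = [[1.0, 0.99], [0.99, 1.0]]
xty = [1.99, 1.99]  # equals S_D times (1, 1), so k = 0 returns (1, 1)

# Ridge solutions along a grid of shrinkage parameters.
path = {k: ridge_2x2(S_D, xty, k) for k in (0.0, 0.01, 1.0, 1e6)}
```

In this example the coefficients move from the least-squares solution at $k = 0$ toward zero as $k$ grows, which is the trade-off between stability and bias that makes the choice of $k$ important.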
As $k \to \infty$, $\hat\beta_D(k) \to 0$ and one obtains a stable but biased estimator of $\beta$. As $k \to 0$, $\hat\beta_D(k) \to \hat\beta_D$ and one obtains an unbiased but unstable estimator of $\beta$. The expected distance between $\hat\beta_D(k)$ and $\beta$ must decrease as $k$ increases from the origin. The value of $k$ that produces the best estimator, however, is not clear, since the estimator $\hat\beta_D(k)$ is a complicated function of $k$.

For the positive semi-definite matrix $S_D$, there exists an orthogonal matrix $\Gamma$ such that $S_D = \Gamma\Lambda\Gamma'$, where $\Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_p)$

contains the eigenvalues of the matrix $S_D$. Therefore, the orthogonal (canonical) version of model (2.4) is
$$y_D = X_D^*\alpha + \epsilon_D, \tag{3.2}$$
where $X_D^* = X_D\Gamma$ and $\alpha = \Gamma'\beta$. When the matrix $S_D$ is ill-conditioned (in the sense that there is a near linear dependency among the columns of the matrix), the DOLSE of $\beta$ has a large variance, and multicollinearity is said to be present. If multicollinearity is present, then $\lambda_i \doteq 0$ for at least one eigenvalue. The closer the smallest eigenvalue is to the origin, the stronger the multicollinearity. To make the behavior of the matrix $S_D$ more like the canonical form, we need to increase the eigenvalues. Ridge regression replaces $S_D$ with $S_D(k) = S_D + kI_p$ ($k > 0$), which is the same as replacing $\lambda_i$ by $\lambda_i + k$. This replacement counters the damaging effect of the smallest eigenvalue.

The canonical difference-based generalized ridge estimator (CDGRE) then has the form
$$\hat\alpha_D(K) = (S_D^* + K)^{-1} X_D^{*\prime} y_D = T_D^*(K)\hat\alpha_D, \qquad T_D^*(K) = \big(K S_D^{*-1} + I_p\big)^{-1},$$
where $S_D^* = X_D^{*\prime} X_D^*$, $K = \operatorname{diag}(k_1, k_2,\dots,k_p)$, $k_i \ge 0$, and $\hat\alpha_D = \Lambda^{-1} X_D^{*\prime} y_D$ is the canonical difference-based ordinary least squares (CDOLS) estimate of $\alpha$. Hoerl and Kennard (1970) showed that for the known optimal values
$$k_i = \frac{\sigma^2}{\alpha_i^2}, \quad i = 1,\dots,p, \tag{3.3}$$
the generalized ridge regression estimator is superior to all other estimators within the class of biased estimators, where $\sigma^2$ is the error variance of model (2.4) and $\alpha_i$ is the $i$th element of $\alpha$. However, the optimal value of $k_i$ depends fully on the unknown $\sigma^2$ and $\alpha_i$, which must be estimated from the observed data. Hoerl and Kennard (1970) suggested replacing $\sigma^2$ and $\alpha_i$ by their corresponding unbiased estimators, that is,
$$\hat k_i = \frac{\hat\sigma_D^2}{\hat\alpha_i^2}, \quad i = 1,\dots,p, \tag{3.4}$$
where $\hat\sigma_D^2$ is an unbiased and efficient estimator of $\sigma^2$ and $\hat\alpha_i$ is the $i$th element of $\hat\alpha_D$, which is an unbiased estimator of $\alpha$.
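In canonical coordinates the matrices involved are diagonal, so the generalized ridge estimator reduces to a componentwise shrinkage factor $\lambda_i/(\lambda_i + k_i)$. The sketch below assumes hypothetical eigenvalues and canonical estimates, and the function names are illustrative.

```python
def canonical_generalized_ridge(alpha_ols, eigenvalues, k_values):
    """Apply alpha_i(K) = lambda_i / (lambda_i + k_i) * alpha_i componentwise."""
    return [lam / (lam + k) * a
            for a, lam, k in zip(alpha_ols, eigenvalues, k_values)]

def optimal_k(sigma2, alpha):
    """k_i = sigma^2 / alpha_i^2, the form given in (3.3)."""
    return [sigma2 / (a * a) for a in alpha]

# Hypothetical inputs: one large and one near-zero eigenvalue.
eigenvalues = [10.0, 0.01]
alpha_ols = [1.0, 1.0]

# The same k shrinks the weakly identified direction far more strongly.
shrunk = canonical_generalized_ridge(alpha_ols, eigenvalues, [0.1, 0.1])
k_opt = optimal_k(0.25, [1.0, 0.5])
```

Here the component with eigenvalue 10 is barely changed, while the component with eigenvalue 0.01 is shrunk by more than 90 percent, which shows how ridge shrinkage targets the directions affected by multicollinearity.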

4 Ridge Parameter Estimators

In this section, we review existing methods and propose some new methods of estimating $k$ based on (3.4), as follows:

1. Hoerl and Kennard (1970) suggested the following estimator for $k$:
$$\hat k_{HK} = \frac{\hat\sigma_D^2}{\hat\alpha_{\max}^2}, \tag{4.1}$$
where $\hat\alpha_{\max}$ is the maximum element of $\hat\alpha_D$. If $\sigma^2$ and $\alpha$ are known, then $\hat\alpha_D(\hat k_{HK})$ will give smaller risk than $\hat\alpha_D$ (Hoerl and Kennard, 1970).

2. Hoerl et al. (1975) suggested the following estimator for $k$:
$$\hat k_{HKB} = \frac{p\hat\sigma_D^2}{\sum_{i=1}^{p}\hat\alpha_i^2} = \frac{p\hat\sigma_D^2}{\hat\alpha_D'\hat\alpha_D} = \frac{p\hat\sigma_D^2}{\hat\beta_D'\hat\beta_D}. \tag{4.2}$$

3. Kibria (2003) proposed the following estimators, using the arithmetic mean, geometric mean and median of $\hat k_i$ in (3.4):
$$\hat k_{AM} = \frac{1}{p}\sum_{i=1}^{p}\frac{\hat\sigma_D^2}{\hat\alpha_i^2}, \tag{4.3}$$
$$\hat k_{GM} = \frac{\hat\sigma_D^2}{\big(\prod_{i=1}^{p}\hat\alpha_i^2\big)^{1/p}}, \tag{4.4}$$
$$\hat k_{Med} = \operatorname{Median}\Big\{\frac{\hat\sigma_D^2}{\hat\alpha_i^2}\Big\}, \quad i = 1, 2,\dots,p. \tag{4.5}$$

4. New estimator: We propose to estimate $k$ by the harmonic mean of $\hat k_i$ in (3.4), which produces the following new estimator:
$$\hat k_{HM} = \frac{p\hat\sigma_D^2}{\sum_{i=1}^{p}\hat\alpha_i^2}. \tag{4.6}$$

5. Khalaf and Shukur (2005) suggested a new method to estimate the ridge parameter as a modification of $\hat k_{HK}$, which has the form
$$\hat k_{KS} = \frac{\lambda_{\max}\hat\sigma_D^2}{(n-m-p)\hat\sigma_D^2 + \lambda_{\max}\hat\alpha_{\max}^2}, \tag{4.7}$$
where $\lambda_{\max}$ is the maximum eigenvalue of $S_D$.

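6. Alkhamisi et al. (2006) proposed the following estimators:
$$\hat k_{AKS1} = \frac{1}{p}\sum_{i=1}^{p}\Big(\frac{\lambda_i\hat\sigma_D^2}{(n-m-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2}\Big), \tag{4.8}$$
$$\hat k_{AKS2} = \max_{i=1,\dots,p}\Big\{\frac{\lambda_i\hat\sigma_D^2}{(n-m-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2}\Big\}, \tag{4.9}$$
$$\hat k_{AKS3} = \operatorname{Median}\Big\{\frac{\lambda_i\hat\sigma_D^2}{(n-m-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2}\Big\}. \tag{4.10}$$

7. Muniz and Kibria (2009), using the geometric mean of $\lambda_i\hat\sigma_D^2 / \big((n-m-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2\big)$, proposed the following estimator:
$$\hat k_{MK1} = \Big(\prod_{i=1}^{p}\frac{\lambda_i\hat\sigma_D^2}{(n-m-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2}\Big)^{1/p}. \tag{4.11}$$

8. Muniz and Kibria (2009), following Alkhamisi and Shukur (2008), proposed the following estimators based on $m_i = \sqrt{\hat\sigma_D^2/\hat\alpha_i^2}$:
$$\hat k_{MK2} = \max_{i=1,\dots,p}\Big\{\frac{1}{m_i}\Big\}, \quad \hat k_{MK3} = \max_{i=1,\dots,p}\{m_i\}, \quad \hat k_{MK4} = \Big(\prod_{i=1}^{p}\frac{1}{m_i}\Big)^{1/p},$$
$$\hat k_{MK5} = \Big(\prod_{i=1}^{p} m_i\Big)^{1/p}, \quad \hat k_{MK6} = \operatorname{Median}\Big\{\frac{1}{m_i}\Big\}, \quad \hat k_{MK7} = \operatorname{Median}\{m_i\}. \tag{4.12}$$

9. New estimators: We propose to estimate $k$ by using the harmonic mean of $\lambda_i\hat\sigma_D^2 / \big((n-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2\big)$, $1/m_i$ and $m_i$, which produce the following new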

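estimators:
$$\hat k_{RAK1} = \frac{p\hat\sigma_D^2}{\sum_{i=1}^{p}\big((n-p)\hat\sigma_D^2 + \lambda_i\hat\alpha_i^2\big)}, \tag{4.13}$$
$$\hat k_{RAK2} = HM\Big\{\frac{1}{m_i}\Big\} = \frac{p}{\hat\sigma_D\sum_{i=1}^{p}\frac{1}{|\hat\alpha_i|}}, \tag{4.14}$$
$$\hat k_{RAK3} = HM\{m_i\} = \frac{p\hat\sigma_D}{\sum_{i=1}^{p}|\hat\alpha_i|}. \tag{4.15}$$

5 Computing Risk Function

For any particular estimator $\hat\beta$ of $\beta$, the risk function associated with the squared error loss is
$$R(\hat\beta, \beta) = E\big[(\hat\beta - \beta)'(\hat\beta - \beta)\big].$$

Lemma 5.1. (Roozbeh et al., 2011) The bias, covariance matrix and risk function of the difference-based generalized ridge estimator (DGRE) are
$$b\big(\hat\beta_D(K)\big) = E\big(\hat\beta_D(K) - \beta\big) = -K S_D^{-1}(K)\beta, \tag{5.1}$$
$$\operatorname{Cov}\big(\hat\beta_D(K)\big) = \sigma^2 S_D^{-1}(K) S_D S_D^{-1}(K), \tag{5.2}$$
$$R\big(\hat\beta_D(K), \beta\big) = \sigma^2\operatorname{tr}\big(S_D^{-1}(K) S_D S_D^{-1}(K)\big) + \beta' S_D^{-1}(K) K^2 S_D^{-1}(K)\beta, \tag{5.3}$$
where $S_D(K) = S_D + K$.

The properties of $\hat\beta_D$ (DOLSE) are then obtained by letting $K = 0$ in the above lemma:
$$b\big(\hat\beta_D\big) = 0, \tag{5.4}$$
$$\operatorname{Cov}\big(\hat\beta_D\big) = \sigma^2 S_D^{-1}, \tag{5.5}$$
$$R\big(\hat\beta_D, \beta\big) = \sigma^2\operatorname{tr}\big(S_D^{-1}\big). \tag{5.6}$$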
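The mean-based ridge parameter estimators (4.1) to (4.6) of Section 4 only need $\hat\sigma_D^2$ and the canonical estimates $\hat\alpha_i$, so they can be sketched with Python's standard `statistics` module. The function name, the dictionary keys and the numerical inputs are hypothetical, and $\hat\alpha_{\max}^2$ is taken here as the largest squared element, which is an assumption about how the maximum is meant.

```python
import statistics

def ridge_parameter_estimates(sigma2_hat, alpha_hat):
    """Mean-based estimators of k built from k_i = sigma^2 / alpha_i^2 in (3.4)."""
    p = len(alpha_hat)
    a2 = [a * a for a in alpha_hat]
    k_i = [sigma2_hat / v for v in a2]
    return {
        "HK": sigma2_hat / max(a2),              # (4.1), largest squared element
        "HKB": p * sigma2_hat / sum(a2),         # (4.2)
        "AM": statistics.mean(k_i),              # (4.3)
        "GM": statistics.geometric_mean(k_i),    # (4.4)
        "Med": statistics.median(k_i),           # (4.5)
        "HM": statistics.harmonic_mean(k_i),     # (4.6)
    }

# Hypothetical inputs, for illustration only.
est = ridge_parameter_estimates(1.0, [2.0, 1.0, 0.5])
```

With these inputs the individual values are $\hat k_i = 0.25, 1, 4$, so the arithmetic mean is 1.75, the geometric mean and median are 1, and the harmonic mean is $3/5.25$. The remaining estimators of Section 4 could be added in the same way once the eigenvalues $\lambda_i$ and the degrees of freedom are supplied.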

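Theorem 5.1. The risk function of the DGRE is given by
$$R\big(\hat\beta_D(K), \beta\big) = \sigma^2\sum_{i=1}^{p}\frac{\lambda_i}{(\lambda_i + k_i)^2} + \sum_{i=1}^{p}\frac{k_i^2\alpha_i^2}{(\lambda_i + k_i)^2}. \tag{5.7}$$

Proof. Using $S_D^{-1}(K) = \Gamma(\Lambda + K)^{-1}\Gamma'$, we have
$$\operatorname{tr}\Big(\operatorname{Cov}\big(\hat\beta_D(K)\big)\Big) = \sigma^2\operatorname{tr}\big[S_D^{-1}(K) S_D S_D^{-1}(K)\big] = \sigma^2\operatorname{tr}\big[\Gamma(\Lambda + K)^{-1}\Gamma'\,\Gamma\Lambda\Gamma'\,\Gamma(\Lambda + K)^{-1}\Gamma'\big] = \sigma^2\operatorname{tr}\big[\Lambda(\Lambda + K)^{-2}\big] = \sigma^2\sum_{i=1}^{p}\frac{\lambda_i}{(\lambda_i + k_i)^2}. \tag{5.8}$$
Also,
$$QB\big(\hat\beta_D(K)\big) = \beta' S_D^{-1}(K) K^2 S_D^{-1}(K)\beta = \alpha'\Gamma'\Gamma(\Lambda + K)^{-1}\Gamma' K^2\Gamma(\Lambda + K)^{-1}\Gamma'\Gamma\alpha = \alpha'(\Lambda + K)^{-1}\operatorname{diag}(k_1^2, k_2^2,\dots,k_p^2)(\Lambda + K)^{-1}\alpha$$
$$= \alpha'\operatorname{diag}\big(k_1^2(\lambda_1 + k_1)^{-2}, k_2^2(\lambda_2 + k_2)^{-2},\dots,k_p^2(\lambda_p + k_p)^{-2}\big)\alpha = \sum_{i=1}^{p}\frac{k_i^2\alpha_i^2}{(\lambda_i + k_i)^2}, \tag{5.9}$$
where $QB(\cdot)$ is the quadratic bias of an estimator. Adding (5.8) and (5.9), we have
$$R\big(\hat\beta_D(K), \beta\big) = \operatorname{tr}\Big(\operatorname{Cov}\big(\hat\beta_D(K)\big)\Big) + QB\big(\hat\beta_D(K)\big) = \sigma^2\sum_{i=1}^{p}\frac{\lambda_i}{(\lambda_i + k_i)^2} + \sum_{i=1}^{p}\frac{k_i^2\alpha_i^2}{(\lambda_i + k_i)^2}. \tag{5.10}$$
The proof is complete.

6 Numerical Results

In this section, we conduct some numerical computations to support our assertions. First, we consider a Monte Carlo simulation study to evaluate the performance of the newly proposed harmonic mean ridge estimator.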
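The eigenvalue form of the risk in Theorem 5.1 is easy to evaluate directly. The following sketch computes (5.7) for hypothetical values of $\sigma^2$, $\lambda_i$ and $\alpha_i$, and compares the DOLSE risk ($K = 0$, which reproduces (5.6) as $\sigma^2\sum 1/\lambda_i$) with the risk at $k_i = \sigma^2/\alpha_i^2$ from (3.3). The function and variable names are illustrative.

```python
def dgre_risk(sigma2, eigenvalues, alpha, k_values):
    """Risk (5.7): variance term plus squared-bias term, in eigenvalue form."""
    variance = sigma2 * sum(lam / (lam + k) ** 2
                            for lam, k in zip(eigenvalues, k_values))
    squared_bias = sum(k * k * a * a / (lam + k) ** 2
                       for lam, a, k in zip(eigenvalues, alpha, k_values))
    return variance + squared_bias

# Hypothetical values with one near-zero eigenvalue (strong multicollinearity).
sigma2 = 1.0
eigenvalues = [10.0, 0.01]
alpha = [1.0, 1.0]

risk_ols = dgre_risk(sigma2, eigenvalues, alpha, [0.0, 0.0])  # K = 0
k_opt = [sigma2 / (a * a) for a in alpha]                     # k_i from (3.3)
risk_opt = dgre_risk(sigma2, eigenvalues, alpha, k_opt)
delta = risk_ols - risk_opt                                   # risk reduction
```

In this example the DOLSE risk is dominated by the term $\sigma^2/\lambda_2 = 100$, whereas with $k_i = \sigma^2/\alpha_i^2$ each term of (5.7) simplifies to $\sigma^2/(\lambda_i + k_i)$, so the total risk drops to about 1.08.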

6.1 Monte Carlo Simulation

In this section, we compare the risk performance of the proposed estimators. To achieve different degrees of collinearity, following McDonald and Galarneau (1975) and Gibbons (1981), the explanatory variables were generated using the following device for $n = 500$:
$$x_{ij} = (1 - \gamma^2)^{1/2} z_{ij} + \gamma z_{ip}, \quad i = 1, 2,\dots,n, \; j = 1, 2,\dots,p, \tag{6.1}$$
where the $z_{ij}$ are independent standard normal pseudo-random numbers, and $\gamma$ is specified so that the correlation between any two explanatory variables is given by $\gamma^2$. These variables are then standardized so that $X'X$ and $X'Y$ are in correlation form. Three different sets of correlation corresponding to $\gamma = 0.80$, $0.90$ and $0.99$ are considered. Then, $n$ observations for the dependent variable are generated according to
$$y_i = \sum_{j=1}^{6} x_{ij}\beta_j + f(t_i) + \epsilon_i, \quad i = 1,\dots,n, \tag{6.2}$$
where $\beta = (3, 1, 3, 2, 5, 4)'$ and
$$f(t) = \frac{1}{3}\big[\phi(t; -3, 0.81) + \phi(t; 0, 0.36) + \phi(t; 3, 0.81)\big]$$
is a mixture of normal densities for $t \in [-5, 5]$, with $\phi(x; \mu, \sigma^2)$ the normal density function with mean $\mu$ and variance $\sigma^2$. The main reason for selecting such a structure for the nonlinear part is to check the efficiency of nonparametric estimation for wavy functions. Moreover, $\epsilon \sim N(0, \sigma^2 I_n)$. Four values of $\sigma^2$ are investigated in this study: 0.01, 0.25, 1 and 4. We consider the following relationship between $\sigma^2$ and the signal-to-noise ratio:
$$\rho^2 = \frac{\beta'\beta}{\sigma^2}. \tag{6.3}$$
Since $\beta'\beta = 64$, the values of $\rho^2$ corresponding to these values of $\sigma^2$ are 6400, 256, 64 and 16, respectively.

In model (6.2), the parametric effect $\beta$ is estimated by a differencing procedure. Optimal differencing weights do not have analytic expressions but may be calculated easily using an optimization routine. Hall

et al. (1990) present weights to order $m = 10$. These contain some minor errors. We use fourth-order differencing coefficients, $d_0 = $ , $d_1 = $ , $d_2 = $ , $d_3 = $ , and $d_4 = $ , in which $m = 4$. All computations were conducted using the statistical package R, and the results are presented in Tables 1 to 12. We numerically estimated the trace of the covariance matrix and the risk function of the estimators, which depend heavily on $k$, $\gamma$ and $\rho^2$. For estimating the nonlinear part, we simulated responses from model (6.2) for $n = 1000$ observations. The estimated risk $\hat R(\cdot)$ and the difference $\delta = \hat R(\hat\beta_D, \beta) - \hat R(\hat\beta_D(k), \beta)$ are plotted in Figures 1 and 2, respectively. In Figure 3, the nonparametric part of model (6.2) is plotted. This function is difficult to estimate and provides a good test case for the nonparametric regression method. The same figure also shows the function fitted by kernel smoothing after estimating the linear part of the model by $\hat\beta_D(\hat k_{RAK1})$, that is, fitted to $y - X\hat\beta_D(\hat k_{RAK1})$, for $\gamma = 0.99$ and different values of $\rho^2$ (results for other values of $\gamma$ did not differ significantly).

6.2 Real Data Example

To motivate the problem of linearly constrained estimation in a SRM, we consider the hedonic prices of housing attributes. Housing prices are very much affected by lot size. The SRM that follows was estimated by Ho (1995) using semiparametric least squares. The data consist of 92 detached homes in the Ottawa area that were sold during . The variables are defined as follows: the dependent variable $y$ is the sale price (SP); the independent variables include lot size (lot area = LT), square footage of housing (SFH), average neighborhood income (), distance to highway (DHW), presence of garage (GAR) and fireplace (FP). We first consider the pure parametric model:
$$(SP)_i = \beta_0 + \beta_1 (LT)_i + \beta_2 (SFH)_i + \beta_3 (FP)_i + \beta_4 (DHW)_i + \beta_5 (GAR)_i + \beta_6 (\,)_i + \epsilon_i. \tag{6.4}$$
Estimation details of the above model are summarized in Table 15. According to this table, model (6.4) does not fit the given data adequately. An appropriate approach is to replace the pure parametric model with a semiparametric model, treating one of the explanatory variables as a nonparametric component. To this end, we plotted the dependent variable (SP) versus the explanatory variables (except for the binary ones) to find the type of relation (linear or nonlinear)

Table 1: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 6400$, $\gamma = 0.80$ and $\lambda_1/\lambda_6 = $

Table 2: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 6400$, $\gamma = 0.90$ and $\lambda_1/\lambda_6 = $

Table 3: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 6400$, $\gamma = 0.99$ and $\lambda_1/\lambda_6 = $

Table 4: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 256$, $\gamma = 0.80$ and $\lambda_1/\lambda_6 = $

Table 5: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 256$, $\gamma = 0.90$ and $\lambda_1/\lambda_6 = $

Table 6: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 256$, $\gamma = 0.99$ and $\lambda_1/\lambda_6 = $

Table 7: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 64$, $\gamma = 0.80$ and $\lambda_1/\lambda_6 = $

Table 8: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 64$, $\gamma = 0.90$ and $\lambda_1/\lambda_6 = $

Table 9: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 64$, $\gamma = 0.99$ and $\lambda_1/\lambda_6 = $

Table 10: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 16$, $\gamma = 0.80$ and $\lambda_1/\lambda_6 = $

Table 11: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 16$, $\gamma = 0.90$ and $\lambda_1/\lambda_6 = $

Table 12: Evaluation of DGRE at different $k$ estimators in model (6.2) with $\rho^2 = 16$, $\gamma = 0.99$ and $\lambda_1/\lambda_6 = $

[Tables 1 to 12: numerical entries not recovered. Each table has one column per estimator ($\hat k_{HK}$, $\hat k_{HKB}$, $\hat k_{AM}$, $\hat k_{GM}$, $\hat k_{Med}$, $\hat k_{HM}$, $\hat k_{KS}$, $\hat k_{AKS1}$, $\hat k_{AKS2}$, $\hat k_{AKS3}$, $\hat k_{MK1}$ to $\hat k_{MK7}$, $\hat k_{RAK1}$, $\hat k_{RAK2}$, $\hat k_{RAK3}$) and rows for the six coefficient estimates $\hat\beta$ and the estimated risk $\hat R$.]

[Figure 1: twelve panels titled "Estimation of R(.)", one for each combination of $\rho^2 = 6400, 256, 64, 16$ and $\gamma$.]

Figure 1: The diagrams of $\hat R(\cdot)$ versus $k$ for different values of $\rho^2$ and $\gamma$.

[Figure 2: twelve panels with vertical axis $\delta$, one for each combination of $\rho^2 = 6400, 256, 64, 16$ and $\gamma$.]

Figure 2: The diagrams of $\delta$ versus $k$ for different values of $\rho^2$ and $\gamma$.

[Figure 3: one panel titled "Mixture of Normal Densities" and four panels titled "Estimation of f(.)" for $\rho^2 = 6400, 256, 64, 16$ with $\gamma = 0.99$; horizontal axis $t$.]

Figure 3: The function under study (mixture of normal densities) and its estimation by the kernel approach for $n = 1000$ and $\gamma = 0.99$.

Table 13: Results of the simulation study: the ridge estimator leading to minimum risk (+) and the ridge estimator leading to maximum risk ($-$), by $\rho^2$ and $\gamma$.

[Table 13: entries not recovered. Columns are indexed by $\rho^2$ and $\gamma$; rows are the estimators of Section 4, $\hat k_{HK}$ through $\hat k_{RAK3}$.]

between them in Figure 4. By Yatchew (2003), the test statistic for the null hypothesis that the regression function has the parametric form, i.e., $H_0: f(t) = h(t; \beta)$ for a known function $h(\cdot)$, against the nonparametric alternative $f(t)$, when one uses optimal differencing weights, is
$$Z_0 = \sqrt{nm}\,\frac{\hat\sigma^2 - \hat\sigma_D^2}{\hat\sigma_D^2} \xrightarrow{D} N(0, 1), \tag{6.5}$$
where
$$\hat\sigma^2 = \frac{1}{n-p}\sum_{i=1}^{n}\big(y_i - h(t_i; \hat\beta)\big)^2, \qquad \hat\sigma_D^2 = \frac{y_D'(I - P)y_D}{\operatorname{tr}\big(D'(I - P)D\big)}, \qquad P = X_D (X_D' X_D)^{-1} X_D'.$$
We used fourth-order optimal differencing coefficients. We take the average neighborhood income as the nonparametric part because it has the largest value of the nonparametric significance test statistic among the independent variables. The test statistics for linearity of $h(\cdot)$ for all explanatory variables can be found in Table 14. The nonparametric significance test of the average neighborhood income effect using (6.5) yields the value 3.93, which indicates that the nonlinear relation between average neighborhood income and the dependent variable is significant.

Table 14: The values of test statistics (6.5)

Variable     Z_0
intercept    -
LT           3.15
SFH          0.61
FP           3.74
DHW          3.84
GAR

Consequently, the underlying SRM is specified as
$$(SP)_i = \beta_0 + \beta_1 (LT)_i + \beta_2 (SFH)_i + \beta_3 (FP)_i + \beta_4 (DHW)_i + \beta_5 (GAR)_i + f(\,)_i + \epsilon_i. \tag{6.6}$$
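The specification test in (6.5) compares the residual variance of the parametric fit with the difference-based variance estimate. The sketch below assumes the statistic has the form $\sqrt{nm}\,(\hat\sigma^2 - \hat\sigma_D^2)/\hat\sigma_D^2$ described by Yatchew (2003); the function name and the two variance values are hypothetical and are not results from the housing data.

```python
import math

def specification_test_statistic(n, m, sigma2_param, sigma2_diff):
    """Z0 = sqrt(n m) (sigma2_param - sigma2_diff) / sigma2_diff (assumed form)."""
    return math.sqrt(n * m) * (sigma2_param - sigma2_diff) / sigma2_diff

# Hypothetical variance estimates with n = 92 homes and differencing order m = 4.
z0 = specification_test_statistic(n=92, m=4, sigma2_param=1.2, sigma2_diff=1.0)

# Comparison with the one-sided 5% standard normal quantile (assumed rejection rule).
exceeds_critical_value = z0 > 1.645
```

A large positive value indicates that the parametric residual variance is noticeably larger than the difference-based estimate, which is evidence against the parametric form for that variable.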
