LINEAR REGRESSION ANALYSIS
MODULE IX, Lecture 30: Multicollinearity
Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur
Remedies for multicollinearity

Various techniques have been proposed to deal with the problems resulting from the presence of multicollinearity in the data.

1. Obtain more data

Harmful multicollinearity arises essentially because the rank of X'X falls below k, and then |X'X| = 0, which clearly suggests the presence of linear dependencies in the columns of X. It is the case when |X'X| is close to zero which needs attention. Additional data may help in reducing the sampling variance of the estimates. The data need to be collected such that they help in breaking up the multicollinearity.

It is not always possible to collect additional data, for various reasons:
- The experiment and process may have finished and no longer be available.
- Economic constraints may not allow the collection of additional data.
- The additional data may not match the earlier collected data and may be unusual.
- If the data form a time series, then a longer series may force us to take data that are too far in the past.
- If the multicollinearity is due to an identity or exact relationship, then increasing the sample size will not help.
- Sometimes it is not advisable to use the data even if they are available. For example, if data on consumption patterns are available for the years 1950-2010, one may not like to use them, as the consumption pattern usually does not remain the same over such a long period.
2. Drop some variables that are collinear

If possible, identify the variables which seem to be causing the multicollinearity. These collinear variables can be dropped so as to meet the condition of full rank of the X-matrix. The process of omitting variables may be carried out on the basis of some kind of ordering of the explanatory variables, e.g., the variables with smaller t-ratios can be deleted first. In another example, suppose the experimenter is not interested in all the parameters. In such cases, one can obtain estimators of the parameters of interest which have smaller mean squared errors than the variance of the OLSE of the full vector by dropping some variables. If some variables are eliminated, this may reduce the predictive power of the model. Sometimes there is also no assurance of how much less multicollinearity the resulting model will exhibit.

3. Use some relevant prior information

One may search for some relevant prior information about the regression coefficients. This may lead to the specification of estimates of some coefficients. More general situations include the specification of exact linear restrictions and stochastic linear restrictions. Procedures like restricted regression and mixed regression can be used for this purpose. The relevance and correctness of the information play an important role in such an analysis, but they are difficult to ensure in practice. For example, estimates derived in the U.K. may not be valid in India.
4. Employ generalized inverse

If rank(X'X) < k, then a generalized inverse (X'X)^- can be used in place of the inverse of X'X. Then β can be estimated by

β̂ = (X'X)^- X'y.

In this case the estimates will not be unique, except when the Moore-Penrose inverse of (X'X) is used. Different methods of finding a generalized inverse may give different results, so applied workers may get different results. Moreover, it is also not known which method of finding the generalized inverse is optimal.

5. Use of principal component regression

Principal component regression is based on the technique of principal component analysis. The explanatory variables are transformed into a new set of orthogonal variables called principal components. Usually this technique is used for reducing the dimensionality of the data by retaining some level of the variability of the explanatory variables, which is expressed by the variability in the study variable. The principal components involve the determination of a set of linear combinations of the explanatory variables such that they retain the total variability of the system and are mutually independent of each other. The principal components so obtained are ranked in order of their importance, the importance being judged in terms of the variability explained by a principal component relative to the total variability in the system. The procedure then involves eliminating some of the principal components which contribute relatively little to explaining the variation. After elimination of the least important principal components, the multiple regression setup is used with the explanatory variables replaced by the principal components.
Then the study variable is regressed against the set of selected principal components using the ordinary least squares method. Since all the principal components are orthogonal, they are mutually independent, and so OLS can be used without any problem. Once the estimates of the regression coefficients for the reduced set of orthogonal variables (principal components) have been obtained, they are mathematically transformed into a new set of estimated regression coefficients that correspond to the original correlated set of variables. These new estimated coefficients are the principal component estimators of the regression coefficients.

Suppose there are k explanatory variables X1, X2, ..., Xk. Consider linear functions of X1, X2, ..., Xk like

Z1 = a1 X1 + a2 X2 + ... + ak Xk
Z2 = b1 X1 + b2 X2 + ... + bk Xk, etc.

The constants a1, a2, ..., ak are determined such that the variance of Z1 is maximized subject to the normalizing condition that Σi ai^2 = 1. The constants b1, b2, ..., bk are determined such that the variance of Z2 is maximized subject to the normalizing condition that Σi bi^2 = 1 and Z2 is independent of the first principal component Z1.
We continue this process and obtain k such linear combinations, each orthogonal to the preceding linear combinations and satisfying the normalizing condition. We then obtain their variances. Suppose these linear combinations are Z1, Z2, ..., Zk with

Var(Z1) > Var(Z2) > ... > Var(Zk).

The linear combination having the largest variance is the first principal component; the linear combination having the second largest variance is the second principal component, and so on. These principal components have the property that

Σi Var(Zi) = Σi Var(Xi).

Also, X1, X2, ..., Xk are correlated, but Z1, Z2, ..., Zk are orthogonal or uncorrelated. So there will be zero multicollinearity among Z1, Z2, ..., Zk.

The problem of multicollinearity arises because X1, X2, ..., Xk are not independent. Since the principal components based on X1, X2, ..., Xk are mutually independent, they can be used as explanatory variables, and such a regression will combat the multicollinearity.

Let λ1, λ2, ..., λk be the eigenvalues of X'X, let Λ = diag(λ1, λ2, ..., λk) be the diagonal matrix of eigenvalues, and let V be a k × k orthogonal matrix whose columns are the eigenvectors associated with λ1, λ2, ..., λk. Consider the canonical form of the linear model

y = Xβ + ε = XVV'β + ε = Zα + ε

where Z = XV, α = V'β, and V'X'XV = Z'Z = Λ. The columns of Z = (Z1, Z2, ..., Zk) define a new set of explanatory variables which are called the principal components.
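The canonical form above can be illustrated numerically. The following is a minimal sketch with NumPy; the data, sample size, and seed are hypothetical, chosen so that two columns of X are nearly collinear:

```python
import numpy as np

# Hypothetical near-collinear data: column 2 is almost a copy of column 1.
rng = np.random.default_rng(1)
n, k = 50, 3
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=n), rng.normal(size=n)])

# Eigen-decomposition of X'X: columns of V are eigenvectors, Lambda = diag(eigenvalues).
eigvals, V = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]          # rank components by decreasing eigenvalue
eigvals, V = eigvals[order], V[:, order]

Z = X @ V                                   # principal components, Z = XV
# Z'Z = Lambda: the new variables are mutually orthogonal.
print(np.round(Z.T @ Z, 6))
```

The off-diagonal entries of Z'Z come out (numerically) zero, confirming that the principal components carry no multicollinearity.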
The OLSE of α is

α̂ = (Z'Z)^(-1) Z'y = Λ^(-1) Z'y

and its covariance matrix is

V(α̂) = σ^2 (Z'Z)^(-1) = σ^2 Λ^(-1) = σ^2 diag(1/λ1, 1/λ2, ..., 1/λk).

Note that λj is the variance of the jth principal component and Z'Z = Λ. A small eigenvalue of X'X means that a near-linear relationship among the original explanatory variables exists, and the variance of the corresponding orthogonal regression coefficient is large, which indicates that multicollinearity exists. If one or more of the λj are small, it indicates that multicollinearity is present.
Retention of principal components

The new set of variables, i.e., the principal components, are orthogonal, and they retain the same magnitude of variance as the original set. If multicollinearity is severe, then there will be at least one small eigenvalue. The elimination of one or more principal components associated with the smallest eigenvalues will reduce the total variance in the model. Moreover, the principal components responsible for creating the multicollinearity will be removed, and the resulting model will be appreciably improved.

The principal component matrix Z = [Z1, Z2, ..., Zk] contains exactly the same information as the original data in X, in the sense that the total variability in X and Z is the same. The difference between them is that the original data are rearranged into a set of new variables which are uncorrelated with each other and can be ranked with respect to the magnitude of their eigenvalues. The column vector Zj corresponding to the largest λj accounts for the largest proportion of the variation in the original data. Thus the Zj are indexed so that λ1 > λ2 > ... > λk > 0, and λj is the variance of Zj.

A strategy for the elimination of principal components is to begin by discarding the component associated with the smallest eigenvalue. The idea behind doing this is that the principal component with the smallest eigenvalue contributes the least variance and so is the least informative.
Using this procedure, principal components are eliminated until the remaining components explain some preselected proportion of the total variance, in terms of percentage. For example, if 90% of the total variance should be retained, and r principal components are eliminated, which means that (k - r) principal components contribute 90% of the total variation, then r is selected to satisfy

Σ_{j=1}^{k-r} λj / Σ_{j=1}^{k} λj > 0.90.

Various strategies to choose the required number of principal components are also available in the literature.

Suppose that after using such a rule, r principal components are eliminated. Now only (k - r) components will be used for regression, so the Z matrix is partitioned as

Z = (Z_{k-r}  Z_r) = X(V_{k-r}  V_r)

where the submatrix Z_r is of order n × r and contains the principal components to be eliminated, and the submatrix Z_{k-r} is of order n × (k - r) and contains the principal components to be retained. The reduced model obtained after the elimination of the r principal components can be expressed as

y = Z_{k-r} α_{k-r} + ε*.

The random error component is written as ε* just to distinguish it from ε. Here α_{k-r} contains the coefficients associated with the retained Zj's. So

Z_{k-r} = (Z1, Z2, ..., Z_{k-r}),  α_{k-r} = (α1, α2, ..., α_{k-r})',  V_{k-r} = (V1, V2, ..., V_{k-r}).
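The 90% retention rule above can be written as a small helper. The function name `components_to_retain` and the example eigenvalues are hypothetical; the threshold is the illustrative 90% from the text:

```python
import numpy as np

def components_to_retain(eigvals, threshold=0.90):
    """Smallest number of leading components whose eigenvalues account
    for at least `threshold` of the total variance."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # lambda_1 >= ... >= lambda_k
    frac = np.cumsum(lam) / lam.sum()                       # cumulative variance proportion
    return int(np.searchsorted(frac, threshold) + 1)

# Hypothetical eigenvalues with total 10: the first three explain 9.5/10 = 95%.
lam = [5.0, 3.0, 1.5, 0.4, 0.1]
print(components_to_retain(lam))   # -> 3, so r = 2 components are eliminated
```

Here (k - r) = 3 components are retained and r = 2 are eliminated, matching the rule Σ_{j=1}^{k-r} λj / Σ_j λj > 0.90.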
Using OLS on the model with the retained principal components, the OLSE of α_{k-r} is

α̂_{k-r} = (Z'_{k-r} Z_{k-r})^(-1) Z'_{k-r} y.

It is then transformed back to the original explanatory variables: since α = V'β,

β̂_pc = V_{k-r} α̂_{k-r},

which is the principal component regression estimator of β. This method improves the efficiency as well as combats the multicollinearity.
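Putting the steps together, the following is a minimal end-to-end sketch of the principal component regression estimator on hypothetical near-collinear data (the sample size, true coefficients, noise level, and seed are all assumptions for the example):

```python
import numpy as np

# Hypothetical data: columns 1 and 2 are nearly collinear.
rng = np.random.default_rng(2)
n, k = 100, 3
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Principal components via the eigen-decomposition of X'X.
eigvals, V = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

r = 1                                   # eliminate the component with the smallest eigenvalue
V_keep = V[:, :k - r]                   # V_{k-r}
Z_keep = X @ V_keep                     # Z_{k-r}

# OLS on the retained components, then transform back: beta_pc = V_{k-r} alpha_hat_{k-r}.
alpha_hat = np.linalg.solve(Z_keep.T @ Z_keep, Z_keep.T @ y)
beta_pc = V_keep @ alpha_hat
print(beta_pc)
```

Because the discarded direction is the near-collinear contrast between the first two columns, the PCR estimates of those two coefficients come out nearly equal, which stabilizes the fit at the cost of a small bias.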