Dr. Samira Muhammad Salh


International Journal of Scientific & Engineering Research, Volume 5, Issue 10, October-2014, p. 992, ISSN 2229-5518

Using Ridge Regression model to solving Multicollinearity problem

Dr. Samira Muhammad Salh
Department of Statistics, University of Sulaimani (samia9690@yahoo.com)

Abstract: In this research paper, we introduce two different methods for dealing with the multicollinearity problem: ordinary least squares (OLS) and ordinary ridge regression (ORR). We use simulated data to compare the methods for three different sample sizes (25, 50, 100). According to the results, ridge regression (ORR) is better than the OLS method when multicollinearity exists. We also conclude that the sample size affects the estimated values: as the sample size increases, the results of the estimation methods become more stable.

Keywords: multiple regression, multicollinearity problem, ordinary least squares, ordinary ridge regression.

1. Introduction:

Multiple regression is a statistical technique that allows us to predict someone's score on one variable on the basis of their scores on several other variables. An example might help. Suppose we were interested in predicting how much an individual enjoys their job. Variables such as salary, extent of academic qualifications, age, sex, number of years in full-time employment and socioeconomic status might all contribute towards job satisfaction. If we collected data on all of these variables, perhaps by surveying a few hundred members of the public, we would be able to see how many and which of these variables gave rise to the most accurate prediction of job satisfaction. We might find that job satisfaction is most accurately predicted by type of occupation, salary and years in full-time employment, with the other variables not helping us to predict job satisfaction.

When using multiple regression in psychology, many researchers use the term independent variables to identify those variables that they think will influence some other dependent variable. We prefer to use the term predictor variables for those variables that may be useful in predicting the scores on another variable that we call the criterion variable. Thus, in our example above, type of occupation, salary and years in full-time employment would emerge as significant predictor variables, which allow us to estimate the criterion variable: how satisfied someone is likely to be with their job. As we have pointed out before, human behavior is inherently noisy and therefore it is not possible to produce totally accurate predictions, but multiple regression allows us to identify a set of predictor variables which together provide a useful estimate of a participant's likely score on a criterion variable.

If two variables are correlated, then knowing the score on one variable will allow you to predict the score on the other variable. The stronger the correlation, the closer the scores will fall to the regression line and therefore the more accurate the prediction. Multiple regression is simply an extension of this principle, where we predict one variable on the basis of several other variables. Having more than one predictor variable is useful when predicting human behavior, as our actions, thoughts and emotions are all likely to be influenced by some combination of several factors. Using multiple regression we can test theories (or models) about precisely which set of variables is influencing our behavior.

Human behavior is rather variable and therefore difficult to predict. What we are doing in both ANOVA and multiple regression is seeking to account for the variance in the scores we observe. Thus, in the example above, people might vary greatly in their levels of job satisfaction. Some of this variance will be accounted for by the variables we have identified. For example, we might be able to say that salary accounts for a fairly large percentage of the variance in job satisfaction, and hence it is very useful to know someone's salary when trying to predict their job satisfaction. You might now be able to see that the ideas here are rather similar to those underlying ANOVA. In ANOVA we are trying to determine how much of the variance is accounted for by our manipulation of the independent variables (relative to the percentage of the variance we cannot account for). In multiple regression we do not directly manipulate the IVs but instead just measure the naturally occurring levels of the variables and see if this helps us predict the score on the dependent variable (or criterion variable). Thus, ANOVA is actually a rather specific and restricted example of the general approach adopted in multiple regression. To put this another way, in ANOVA we can directly manipulate the factors and measure the resulting change in the dependent variable. In multiple regression we simply measure the naturally occurring scores on a number of predictor variables and try to establish which set of the observed variables gives rise to the best prediction of the criterion variable.

A current trend in statistics is to emphasize the similarity between multiple regression and ANOVA, and between correlation and the t-test. All of these statistical techniques are basically seeking to do the same thing: explain the variance in the level of one variable on the basis of the level of one or more other variables. These other variables might be manipulated directly in the case of controlled experiments, or be observed in the case of surveys or observational studies, but

the underlying principle is the same. Thus, although we have given separate chapters to each of these procedures, they are fundamentally all the same procedure.

The usage of multiple regression:

1. You can use this statistical technique when exploring linear relationships between the predictor and criterion variables, that is, when the relationship follows a straight line. (To examine non-linear relationships, special techniques can be used.)

2. The criterion variable that you are seeking to predict should be measured on a continuous scale (such as an interval or ratio scale). There is a separate regression method called logistic regression that can be used for dichotomous dependent variables (not covered here).

3. The predictor variables that you select should be measured on a ratio, interval, or ordinal scale. A nominal predictor variable is legitimate but only if it is dichotomous, i.e. there are no more than two categories. For example, sex is acceptable, but a variable with three categories (masculine, feminine and androgynous) could not be coded as a single variable. Instead, you would create three different variables, each with two categories (masculine/not masculine; feminine/not feminine; androgynous/not androgynous). The term dummy variable is used to describe this type of dichotomous variable.

4. Multiple regression requires a large number of observations. The number of cases (participants) must substantially exceed the number of predictor variables you are using in your regression. The absolute minimum is that you have five times as many participants as predictor variables. A more acceptable ratio is 10:1, but some people argue that this should be as high as 40:1 for some statistical selection methods.

1.1 TERMINOLOGY:

There are certain terms we need to clarify to allow you to understand the results of this statistical technique:

R, R Square, Adjusted R Square:

R is a measure of the correlation between the observed value and the predicted value of the criterion variable. In our example this would be the correlation between the levels of job satisfaction reported by our participants and the levels predicted for them by our predictor variables. R Square (R²) is the square of this measure of correlation and indicates the proportion of the variance in the criterion variable which is accounted for by our model; in our example, the proportion of the variance in the job satisfaction scores accounted for by our set of predictor variables (salary, etc.). In essence, this is a measure of how good a prediction of the criterion variable we can make by knowing the predictor variables. However, R Square tends to somewhat over-estimate the success of the model when applied to the real world, so an Adjusted R Square value is calculated which takes into account the number of variables in the model and the number of observations (participants) our model is based on. This Adjusted R Square value gives the most useful measure of the success of our model. If, for example, we have an Adjusted R Square value of 0.75, we can say that our model has accounted for 75% of the variance in the criterion variable.

1.2 MULTICOLLINEARITY PROBLEM

Multicollinearity occurs where two or more explanatory random variables are linked by a strong linear relation, so that it is not easy to separate the effect of each of them on the (response) dependent variable. With such strong pairwise correlations, high multiple correlations among the explanatory variables are almost unavoidable. Multicollinearity arises if the observations of one of the explanatory variables (Xi's) are all equal, or if one explanatory variable is a function of the others. In order to apply the OLS method of estimation, the assumption related to multicollinearity can be summarized as follows: there is no exact or semi-linear relationship among the explanatory (independent) variables; moreover, the number of parameters to be estimated must be less than the sample size (n) under consideration.

1.3 THE (OLS) ASSUMPTIONS IN (GLM):

The general linear model formula is given by:

$$Y = X\beta + U, \qquad U \sim N(0, \sigma_u^2) \qquad (1)$$

1. $Var(U_i) = \sigma_u^2$ means that the variance of the r.v. U is constant for all terms, for all independent r.v's. Then $U \sim N(0, \sigma_u^2)$.

2. $Cov(U_i, U_j) = E(U_i U_j) = 0$. That means the various values of the r.v. (U) are uncorrelated with each other.

3. $E(U_i X_i) = 0$, i.e. the values of the (Ui's) are uncorrelated with any one of the explanatory variables (Xi's). This assumption is achieved by fixing the values of (Xi) at the same values in other samples.

4. It is required that the explanatory variables (Xi's) are uncorrelated, so that the fitted multiple regression model does not involve multicollinearity.

5. The linear relation to be estimated by a multiple regression model is required to be identified, i.e. the economic model under consideration has a distinct form and does not contain the same variables that exist in another relation in the same problem under consideration.

1.4 SOME PROPERTIES OF DEPENDENT RANDOM VARIABLE IN (GLM):

The properties of the dependent random variable (Yi) are:

1. $E(U_i) = 0$, then $E(Y) = X\beta$. (Yi) is normally distributed with mean given by:

$$E(Y_i) = \beta_0 + \beta_1 X_i \qquad (2)$$

2. The variance of (Yi) is given by:

$$Var(Y_i) = E\left[(Y_i - E(Y_i))^2\right] = \sigma_u^2 \qquad (3)$$

These properties can be formulated as follows:

$$Y_i \sim N(\beta_0 + \beta_1 X_i, \; \sigma_u^2)$$

After illustrating the main assumptions for (GLM), one can now conclude the effect of multicollinearity, which concerns assumption (5). That is, if the explanatory r.v's are exactly correlated, then estimation of the parameters of the linear model is impossible, because the inverse of the Fisher information matrix (X'X) does not exist. This follows from the fact that the determinant of (X'X) vanishes (is zero); moreover, the variance of the estimators tends to infinity (∞). On the other hand, if the explanatory r.v's are highly correlated, it is possible to estimate the parameters, but the value of the determinant of (X'X) is very small, which makes the variance of the estimators very large and leads to unexpected results about the model. See the following variance formula:

$$Var(b) = S_e^2 \cdot \frac{adj(X'X)}{|X'X|} \qquad (4)$$

From the above, the student can conclude that either $|X'X| = 0$ or it has a small value. If multicollinearity arises for the above reasons, it may lead to the wrong conclusion that some of the explanatory variables (Xi's) are not important according to the (t) test for each of these parameters, so the fitted model is not able to isolate and explain the separate effect of each explanatory variable, owing to the high correlation among them.

Multicollinearity refers to a situation in which two or more predictor variables in a multiple regression model are highly correlated. If multicollinearity is perfect (exact), the regression coefficients are indeterminate and their standard errors are infinite. If it is less than perfect, the regression coefficients, although determinate, possess large standard errors, which means that the coefficients cannot be estimated with great accuracy (Gujarati, 1995).

We can define multicollinearity through the concept of orthogonality. When the predictors are orthogonal or uncorrelated, all eigenvalues of the design matrix are equal to one and the design matrix is of full rank. If at least one eigenvalue is different from one, especially when it is equal to zero or near zero, then non-orthogonality exists, meaning that multicollinearity is present (Vinod and Ullah, 1981).

There are many methods used to detect multicollinearity, among them:

i) Compute the correlation matrix of the predictor variables. A high value for the correlation between two variables may indicate that the variables are collinear. This method is easy, but it cannot produce a clear estimate of the rate (degree) of multicollinearity.

ii) Eigenstructure of X'X: let $\lambda_1, \lambda_2, \ldots, \lambda_p$ be the eigenvalues of X'X (in correlation form). When at least one eigenvalue is close to zero, multicollinearity exists (Greene, 1993; Walker, 1999).

iii) Condition number: there are several methods to compute the condition number (φ), which indicates the degree of multicollinearity. Vinod and Ullah (1981) suggested that the condition number is given by:

$$\varphi_1 = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}} \qquad (5)$$

Montgomery and Peck (1992) defined the condition number as the ratio of $\lambda_{max}$ and $\lambda_{min}$:

$$\varphi_2 = \frac{\lambda_{max}}{\lambda_{min}} \qquad (6)$$

where $\lambda_{max}$ is the largest eigenvalue and $\lambda_{min}$ is the smallest eigenvalue. If $\lambda_{min} = 0$, then φ is infinite, which means perfect multicollinearity between the predictor variables. If $\lambda_{max}$ and $\lambda_{min}$ are equal, then φ = 1 and the predictors are said to be orthogonal. Pagel and Lunneborg (1985) suggested that the condition number is:

$$\varphi_3 = \sum_{i=1}^{p} \frac{1}{\lambda_i} \qquad (7)$$

A common convention is that if φ lies between 5 and 30, multicollinearity is considered to range from moderate to strong.

iv) The variance inflation factor (VIF) can be computed as follows:

$$VIF_j = \frac{1}{1 - R_j^2} \qquad (8)$$

where $R_j^2$ is the coefficient of determination in the regression of the explanatory variable $X_j$ on the remaining explanatory variables of the model. Generally, when VIF > 10, we assume that high multicollinearity exists (Sana and Eyup, 2008).

v) Checking the relationship between the F and t tests might provide some indication of the presence of multicollinearity. If the overall significance of the model is good according to the F-test, but individually the coefficients are not significant according to the t-test, then the model might suffer from multicollinearity.

Multicollinearity has several effects, described as follows:

- High variance of the coefficients may reduce the precision of estimation.
- Multicollinearity can result in coefficients appearing to have the wrong sign.
- Estimates of the coefficients may be sensitive to particular sets of sample data.
- Some variables may be dropped from the model although they are important in the population.
- The coefficients are sensitive to the presence of a small number of inaccurate data values (more details in Judge, 1988; Gujarati, 1995).

Multicollinearity is a serious problem when we need to make inferences or are looking for predictive models, so it is very important to find a better method to deal with it. Therefore, the main objective of this paper is to introduce different models of ridge regression to solve the multicollinearity problem and to compare these ridge regression models with the ordinary least squares method.

1.5 THE ORDINARY RIDGE REGRESSION (ORR)

Consider the standard model for multiple linear regression:

$$Y = X\beta + E \qquad (9)$$

where Y is an (n × 1) vector of the dependent variable values, X is an (n × p) matrix containing the values of p predictor variables and is of full rank (rank p), β is a (p × 1) vector of unknown coefficients, and E is an (n × 1) vector of normally distributed random errors with zero mean and common variance $\sigma^2 I$. Note that both the X's and Y have been standardized. The OLS estimate $\hat\beta$ of β is obtained by minimizing the residual sum of squares $(Y - X\hat\beta)'(Y - X\hat\beta)$, and is given by:

$$\hat\beta = (X'X)^{-1} X'Y, \qquad Var(\hat\beta) = \hat\sigma^2 (X'X)^{-1},$$

and

$$MSE(\hat\beta) = \hat\sigma^2 \, trace\left[(X'X)^{-1}\right] = \hat\sigma^2 \sum_{i=1}^{p} \frac{1}{\lambda_i} \qquad (10)$$

where $\hat\sigma^2$ is the mean squared error. The estimator $\hat\beta$ is unbiased and has minimum variance. However, if X'X is ill-conditioned (nearly singular), the OLS estimates tend to become too large and some of the coefficients have the wrong sign (Wetherill, 1986). In order to prevent these difficulties of OLS, Hoerl and Kennard (1970) suggested ridge regression as an alternative procedure to the OLS method in regression analysis, especially when multicollinearity exists. The ridge technique is based on adding a biasing constant K to the diagonal of the X'X matrix before computing $\hat\beta$, using the method of Hoerl and Kennard (2000). Therefore, the ridge solution is given by:

$$\hat\beta(K) = (X'X + KI)^{-1} X'Y, \qquad K \ge 0 \qquad (11)$$

where K is the ridge parameter and I is the identity matrix. Note that if K = 0, the ridge estimator becomes the OLS estimator. If all K's are the same, the resulting estimators are called the ordinary ridge estimators (John, 1998).

1.6 PROPERTIES OF ORDINARY RIDGE REGRESSION ESTIMATOR

The ridge regression estimator has several properties, which can be summarized as follows:

- From equation (11), by taking expectations on both sides,

$$E(\hat\beta(K)) = A_K \beta, \qquad \text{where } A_K = \left[I + K(X'X)^{-1}\right]^{-1},$$

and

$$Var(\hat\beta(K)) = \hat\sigma^2 A_K (X'X)^{-1} A_K'.$$

So $\hat\beta(K)$ is a biased estimator, but it reduces the variance of the estimate.

- $\hat\beta(K)$ is the coefficient vector with minimum length; this means that a K > 0 always exists for which the squared length of $\hat\beta(K)$ is less than the squared length of $\hat\beta$.

- Because $\hat\beta(K) = \left[I + K(X'X)^{-1}\right]^{-1} \hat\beta$, the ridge estimator is a linear transformation of the OLS estimator.

- The sum of the squared residuals is an increasing function of K.

- The mean squared error of $\hat\beta(K)$ is given by:

$$MSE(\hat\beta(K)) = E\left[(\hat\beta(K) - \beta)'(\hat\beta(K) - \beta)\right] = \hat\sigma^2 \, trace\left[A_K (X'X)^{-1} A_K'\right] + \beta'(A_K - I)'(A_K - I)\beta$$

$$= \hat\sigma^2 \sum_{i=1}^{p} \frac{\lambda_i}{(\lambda_i + K)^2} + K^2 \beta'(X'X + KI)^{-2}\beta \qquad (12)$$

Note that the first term on the right-hand side of equation (12) is the trace of the dispersion matrix of $\hat\beta(K)$ and the second term is the squared length of the bias vector. From this property we find that, for K > 0, the variance term is a monotone decreasing function of K and the squared bias is a monotone increasing function of K. Therefore a suitable choice of K is determined by striking a balance between the two terms, so we select a K for which the reduction in variance is larger than the increase in bias.

- There always exists a K > 0 such that $\hat\beta(K)$ has smaller MSE than $\hat\beta$, that is, $MSE(\hat\beta(K)) < MSE(\hat\beta)$. (For more details see Judge, 1988; Gujarati, 1995; Gruber, 1998; Pasha and Shah, 2004.)

Testing of Multicollinearity:

The most common test for detecting the multicollinearity problem is the Farrar-Glauber test, which uses the chi-square statistic (χ²) to test the following hypothesis:

1. H0: the (Xj) are orthogonal vs. H1: the (Xj) are not orthogonal.

2. The test statistic is given by:

$$\chi^2 = -\left[n - 1 - \frac{2k + 5}{6}\right] \ln|D| \qquad (13)$$

such that:
n: number of observations
k: number of explanatory variables (Xi's)
ln|D|: the natural logarithm of the absolute value of the determinant of the correlation matrix among the explanatory variables.

The correlation matrix is given by:

$$D = \begin{pmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1k} \\ r_{21} & 1 & r_{23} & \cdots & r_{2k} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{k1} & r_{k2} & r_{k3} & \cdots & 1 \end{pmatrix}$$

3. Compare the chi-square value computed from (13) with the theoretical chi-square value from statistical tables for a specific significance level (5% or 1%) and [k(k-1)/2] degrees of freedom (d.f.).

4. Reject H0 if χ²cal > χ²table, which means that multicollinearity exists among the explanatory r.v's. There is no reason to reject H0 if χ²cal < χ²table. (Repeat the test if χ²cal = χ²table.)

Note: It is important to explain what (orthogonal) means in mathematical terms. For orthogonal vectors, let us introduce the expression:

$$\cos(\theta) = \frac{x'y}{(x'x)^{1/2} (y'y)^{1/2}} \qquad (14)$$

such that θ is the angle between the two vectors X and Y. The two vectors are orthogonal if cos(θ) is equal to zero. If, in addition, their lengths are normalized to unity, the two vectors are said to be orthonormal.

After detecting multicollinearity with the test above, the next step is to determine which of the explanatory variables causes it. This can be done with another test, the Fisher test or, more commonly, the F test. This test is carried out as follows:

1. Let X1, X2, X3, X4 be a set of explanatory variables in the (GLM). Take X1 first.

2. Calculate the coefficient of determination R² (multiple correlation) for X1 as the dependent variable, with the remaining variables (X2, X3, X4) as independents, using the following formula:

$$R^2_{1.(234)} = 1 - (1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23}) \qquad (15)$$

such that $r_{13.2}$ and $r_{14.23}$ represent partial correlation coefficients.

3. Let Xj be the explanatory variable under test. The hypotheses are:

H0: $R^2_{j.(234 \ldots k)} = 0$ (Xj does not cause multicollinearity) vs. H1: $R^2_{j.(234 \ldots k)} \ne 0$ (Xj causes multicollinearity)

4. The test statistic is given by:

$$F_{cal} = \frac{R^2_{j.(234 \ldots k)} / (k - 1)}{\left(1 - R^2_{j.(234 \ldots k)}\right) / (n - k)} \qquad (16)$$

5. The rejection region: if Fcal > Ftable with (k-1), (n-k) d.f. at significance level α, then the decision is to reject H0, i.e. Xj does indeed cause multicollinearity.

Note: In order to compute the partial correlation coefficients, the following formulas can be used.

Second-order partial correlation coefficient:

$$\rho_{ij.hc} = \frac{\rho_{ij.c} - \rho_{ih.c} \, \rho_{jh.c}}{\sqrt{(1 - \rho^2_{ih.c})(1 - \rho^2_{jh.c})}} \qquad (17)$$

First-order partial correlation coefficient:

$$\rho_{ij.h} = \frac{\rho_{ij} - \rho_{ih} \, \rho_{jh}}{\sqrt{(1 - \rho^2_{ih})(1 - \rho^2_{jh})}}$$

where $\rho_{ij}$, $\rho_{ih}$ and $\rho_{jh}$ are simple correlation coefficients from the matrix (D).

After determining the explanatory r.v. (Xj) that causes multicollinearity, the next step is to determine with which of the remaining variables Xj is collinear. This step can be carried out for all the remaining variables one after another. The following statistical hypothesis, which concerns the partial correlation coefficients, is then tested:

H0: $\rho_{ij.(1,2,3,\ldots,k)} = 0$ vs. H1: $\rho_{ij.(1,2,3,\ldots,k)} \ne 0$

The test statistic is given by:

$$t_{cal} = \frac{\rho_{ij.(1 \ldots k)} \sqrt{n - k}}{\sqrt{1 - \rho^2_{ij.(1 \ldots k)}}} \qquad (18)$$

Compare this value with ttable for (n-k) d.f. at level (α); we reject H0 if tcal > ttable. Once all these tests have been made for all explanatory r.v's, all the variables causing multicollinearity are determined.

APPLICATIVE SIDE:

In this research, we simulate three sets of data using the SAS package, of large, medium and small size (100, 50 and 25 respectively), where the correlation coefficients between the predictor variables (X's) are large (the number of

predictor variables in this study is six). Tables (1, 2, 3) show the correlation matrices based on the simulated data sets.

Table (1) Correlation matrix for sample size (100)

      X1     X2     X3     X4     X5     X6
X1    1      0.93   0.94   0.973  0.964  0.975
X2    0.93   1      0.934  0.96   0.893  0.90
X3    0.94   0.934  1      0.945  0.938  0.889
X4    0.973  0.96   0.945  1      0.904  0.95
X5    0.964  0.893  0.938  0.904  1      0.93
X6    0.975  0.90   0.889  0.95   0.93   1

Table (2) Correlation matrix for sample size (50)

      X1     X2     X3     X4     X5     X6
X1    1      0.943  0.956  0.96   0.970  0.890
X2    0.943  1      0.934  0.95   0.98   0.9
X3    0.956  0.934  1      0.9    0.943  0.93
X4    0.96   0.95   0.9    1      0.885  0.97
X5    0.970  0.98   0.943  0.885  1      0.94
X6    0.890  0.9    0.93   0.97   0.94   1

Table (3) Correlation matrix for sample size (25)

      X1     X2     X3     X4     X5     X6
X1    1      0.890  0.856  0.943  0.974  0.904
X2    0.890  1      0.9    0.873  0.98   0.894
X3    0.856  0.9    1      0.94   0.94   0.956
X4    0.943  0.873  0.94   1      0.887  0.900
X5    0.974  0.98   0.94   0.887  1      0.96
X6    0.904  0.894  0.956  0.900  0.96   1

The variance inflation factors (VIF) for all variables (X's) are as follows:

VIF   N50    N100
1)    .3     7.9
2)    .95    6.39
3)    4.6    4.6
4)    8.93   .85
5)    .88    7.98
6)    3.55   .99

The condition numbers (φ1, φ2 and φ3) are as follows:

C. numbers   N50     N100
φ1           79.37   4
φ2           630     08
φ3           6       6

We can then find the eigenvalues of the predictor correlation matrix, as follows:

Eigenvalues   N25               N50      N100
λ1            5.63              0.0009   0.07
λ2            -0.0388           0.004    0.047
λ3            0.060             0.0469   0.075
λ4            0.655             0.55     0.9
λ5            0.70 + 0.0300i    0.363    0.89
λ6            0.70 - 0.0300i    5.6700   5.66

We find that the sample size affects the results. For the sample of size n = 25, two of the eigenvalues (which may not all be distinct) are complex numbers, i.e. there are two additional non-real roots (λ5, λ6), so we ignore this sample. As the sample size increases, the results of the estimation methods become more stable.

From the previous indicators, it is obvious that there is a serious multicollinearity problem, because one of the eigenvalues (λ1) for n = 50 is close to zero, all VIF values are more than 10, and the different types of condition number are more than 5. The regression coefficients and the standard deviations of these coefficients are summarized in Tables (4 & 5). Using both the OLS and ORR methods to analyze the simulated data, we get the following results.

N100
OLS               ORR
coef     stdev    coef     stdev
.890     0.99     0.375    0.07
-0.6     0.5      -0.45    0.7
-0.3     0.       -0.08    0.9
0.70     0.7      0.553    0.50
-0.694   0.379    .76      0.089
0.94     0.3      0.675    0.98

Table (4) Regression coefficients and standard deviations
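The OLS and ORR columns compared in Tables (4) and (5) follow from the estimators in equations (10) and (11) and the variance expression of Section 1.6. The sketch below illustrates that computation in Python with NumPy. It is not the SAS simulation used in the paper: the function names `unit_length` and `ols_and_ridge`, the data-generating scheme, the random seed, the true coefficients and the value K = 0.05 are all assumptions chosen only for illustration, and the error variance is estimated from the OLS residuals.

```python
import numpy as np

def unit_length(a):
    """Center each column and scale it to unit length, so that X'X is a correlation matrix."""
    a = np.asarray(a, dtype=float)
    a = a - a.mean(axis=0)
    return a / np.sqrt((a ** 2).sum(axis=0))

def ols_and_ridge(X, y, K):
    """Return (b_ols, sd_ols, b_orr, sd_orr) computed on standardized X and y."""
    Xs, ys = unit_length(X), unit_length(y)
    n, p = Xs.shape
    XtX = Xs.T @ Xs
    XtX_inv = np.linalg.inv(XtX)

    # OLS: beta = (X'X)^-1 X'y, Var = sigma^2 (X'X)^-1
    b_ols = XtX_inv @ Xs.T @ ys
    resid = ys - Xs @ b_ols
    sigma2 = resid @ resid / (n - p - 1)      # assumed estimate of the error variance
    sd_ols = np.sqrt(sigma2 * np.diag(XtX_inv))

    # ORR: beta(K) = (X'X + K I)^-1 X'y,
    # Var = sigma^2 (X'X + K I)^-1 X'X (X'X + K I)^-1, which equals sigma^2 A (X'X)^-1 A'
    ridge_inv = np.linalg.inv(XtX + K * np.eye(p))
    b_orr = ridge_inv @ Xs.T @ ys
    sd_orr = np.sqrt(sigma2 * np.diag(ridge_inv @ XtX @ ridge_inv))
    return b_ols, sd_ols, b_orr, sd_orr

# Hypothetical data: six predictors sharing a common component, hence highly correlated.
rng = np.random.default_rng(0)
n, p = 50, 6
common = rng.normal(size=(n, 1))
X = common + 0.3 * rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(size=n)

b_ols, sd_ols, b_orr, sd_orr = ols_and_ridge(X, y, K=0.05)
for j in range(p):
    print(f"X{j + 1}: OLS {b_ols[j]: .3f} ({sd_ols[j]:.3f})   ORR {b_orr[j]: .3f} ({sd_orr[j]:.3f})")
```

For any K > 0 the ridge standard deviations computed this way are smaller than the OLS ones and the ridge coefficient vector is shorter, which matches the properties listed in Section 1.6. How K should be chosen is left open here.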

N50
OLS               ORR
coef     stdev    coef     stdev
.0       0.50     .89      0.30
-0.503   0.33     -0.433   0.89
-0.4     0.0      -0.03    0.85
0.643    0.304    0.80     0.0
-0.57    0.34     -0.506   0.9
0.8      0.38     0.05     0.94

Table (5) Regression coefficients and standard deviations

Finally, we find the mean squared error and the coefficient of determination (R²) for the OLS and ORR methods respectively. The results are summarized in Table (6).

Table (6) MSE and R-Square for each sample

          MSE              R-Square
sample    OLS     ORR      OLS     ORR
n100      0.43    0.37     0.793   0.965
n50       0.40    0.34     0.70    0.93

From the previous results, it is obvious that:

- all ORR coefficients have smaller standard deviations than the OLS coefficients;
- ORR has a smaller MSE of the regression coefficients than OLS, while the RR model has a larger (R²) than OLS.

Consequently, the ORR method is better than OLS when the multicollinearity problem exists in the data.

Conclusions:

In this research, we addressed the multicollinearity problem, the methods of detecting it, and its effect on the results of the multiple regression model. We also introduced the ridge regression model to solve this problem and compared the ORR and OLS methods on the basis of standard deviation, mean squared error and coefficient of determination. We note that the ORR coefficients have smaller standard deviations than the OLS coefficients when the multicollinearity problem exists, and that ORR has a smaller MSE of the estimators, a smaller standard deviation for most estimators and a larger coefficient of determination. We conclude that the sample size affects the estimated values: as the sample size increases, the results of the estimation methods become more stable.

REFERENCES:

1. W.H. Greene, Econometric Analysis, Macmillan, New York, 1993.
2. H. Wetherill, Evaluation of Ordinary Ridge Regression, Bulletin of Mathematical Statistics, 8 (1986), -35.
3. D.K. Guilkey and J.L. Murphy, Directed Ridge Regression Techniques in Cases of Multicollinearity, Journal of the American Statistical Association, 70 (1975), 767-775.
4. D.N. Gujarati, Basic Econometrics, McGraw-Hill, New York, 1995.
5. G.G. Judge, Introduction to the Theory and Practice of Econometrics, John Wiley and Sons, New York, 1988.
6. D.C. Montgomery and E.A. Peck, Introduction to Linear Regression Analysis, John Wiley and Sons, New York, 1992.
7. R.H. Myers, Classical and Modern Regression with Applications, PWS-KENT Publishing Co., Boston, 1990.
8. M.D. Pagel and C.E. Lunneborg, Empirical Evaluation of Ridge Regression, Psychological Bulletin, 97 (1985), 342-355.
9. G.R. Pasha and M.A. Shah, Application of Ridge Regression to Multicollinear Data, J. Res. Sci., 15 (2004), 97-106.
10. H.D. Vinod and A. Ullah, Recent Advances in Regression Models, Marcel Dekker, New York, 1981.
11. E. Walker, Detection of Collinearity-Influential Observations, Communications in Statistics, Theory and Methods, 18 (1989), 1675-1690.
12. M. Sana and C. Eyup, Efficient Choice of Biasing Constant for Ridge Regression, Int. J. Contemp. Math. Sciences, 3 (2008), 527-536.
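The detection indicators used in the paper (the eigenvalues of the predictor correlation matrix, the condition numbers φ1, φ2 and φ3, the VIF values and the Farrar-Glauber χ² statistic) can all be obtained from the correlation matrix alone. The following Python/NumPy sketch shows one way to compute them. The function name `collinearity_diagnostics` and the simulated data are assumptions made for illustration and do not reproduce the paper's SAS data. The VIF values are taken as the diagonal of the inverse correlation matrix, which is equivalent to 1/(1 - R_j²).

```python
import numpy as np

def collinearity_diagnostics(X):
    """Eigenvalues, condition numbers, VIFs and Farrar-Glauber chi-square for predictors X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    R = np.corrcoef(X, rowvar=False)           # k x k correlation matrix D

    # eigvalsh is meant for symmetric matrices and returns real eigenvalues in ascending order
    eig = np.linalg.eigvalsh(R)
    lam_min, lam_max = eig[0], eig[-1]

    # ln|D| via slogdet, which is more stable than log(det(R)) when R is nearly singular
    _, log_det = np.linalg.slogdet(R)

    return {
        "eigenvalues": eig,
        "phi1": np.sqrt(lam_max / lam_min),     # Vinod and Ullah form
        "phi2": lam_max / lam_min,              # Montgomery and Peck form
        "phi3": np.sum(1.0 / eig),              # Pagel and Lunneborg form
        "vif": np.diag(np.linalg.inv(R)),       # VIF_j = 1 / (1 - R_j^2)
        "chi2": -(n - 1 - (2 * k + 5) / 6) * log_det,
        "df": k * (k - 1) // 2,
    }

# Hypothetical data: six predictors sharing a common component, hence highly correlated.
rng = np.random.default_rng(1)
n, k = 50, 6
X = rng.normal(size=(n, 1)) + 0.3 * rng.normal(size=(n, k))

diag = collinearity_diagnostics(X)
print("eigenvalues:", np.round(diag["eigenvalues"], 4))
print("phi1, phi2, phi3:", diag["phi1"], diag["phi2"], diag["phi3"])
print("VIF:", np.round(diag["vif"], 2))
print("Farrar-Glauber chi2:", diag["chi2"], "with", diag["df"], "d.f.")
```

Two identities serve as sanity checks on such a computation: φ1² equals φ2, and φ3 equals the sum of the VIF values, since both are the trace of the inverse correlation matrix. The χ² value would still have to be compared with a tabulated critical value for the stated degrees of freedom.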