Multicollinearity


The term multicollinearity is due to Ragnar Frisch (1934). Originally it meant the existence of a "perfect," or exact, linear relationship among some or all explanatory variables of a regression model. For the K-variable regression involving the explanatory variables x_1, x_2, ..., x_K (where x_1 = 1 for all observations, the intercept term), an exact linear relationship is said to exist if

    λ_1 x_1 + λ_2 x_2 + ... + λ_K x_K = 0,    (10.1.1)

where λ_1, λ_2, ..., λ_K are constants such that not all of them are zero simultaneously. Today the term is used in a broader sense that also covers imperfect intercorrelation, where the x variables are related but not exactly so:

    λ_1 x_1 + λ_2 x_2 + ... + λ_K x_K + v_i = 0,    (10.1.2)

where v_i is a stochastic error term. Near collinearity arises easily in practice, for example when the regressors in a time-series model share a common trend.

For what follows it is convenient to write

    X = [x_2  x_3  ...  x_K]    (3.30)

for the matrix of regressors in centered and scaled form.

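The difference between exact and near collinearity is easy to see numerically. Below is a minimal sketch with made-up data (all names and values are hypothetical): an exact dependence 2x_2 − x_3 = 0 makes the regressor matrix rank-deficient, while adding a small error term v_i leaves it full rank but ill-conditioned.

```python
import numpy as np

# Hypothetical illustration of exact vs. near collinearity.
rng = np.random.default_rng(0)
n = 50
x2 = rng.normal(size=n)
x3_exact = 2.0 * x2                                    # exact: 2*x2 - x3 = 0
x3_near = 2.0 * x2 + rng.normal(scale=0.01, size=n)    # near: + small error v_i

X_exact = np.column_stack([np.ones(n), x2, x3_exact])
X_near = np.column_stack([np.ones(n), x2, x3_near])

print(np.linalg.matrix_rank(X_exact))   # 2 -- rank deficient, X'X singular
print(np.linalg.matrix_rank(X_near))    # 3 -- full rank, but ill-conditioned
```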
Specifically, center and scale each regressor to unit length:

    x*_{ij} = (x_{ij} − x̄_j) / S_j,    i = 1, ..., n;  j = 2, ..., K,

with S_j = [Σ_i (x_{ij} − x̄_j)²]^{1/2}. After this centering and scaling, X'X is a correlation matrix. Writing the model as

    y = 1 β_1 + X β + ε,    β' = [β_2  β_3  ...  β_K],    (3.31)

X is the n × (K−1) matrix of centered and scaled regressor variables, and X'X is the (K−1) × (K−1) correlation matrix, with 1s on the diagonal and the pairwise correlations r_23, r_24, ..., r_2K, r_34, ..., r_3K, ..., r_{K−1,K} off the diagonal.

What is multicollinearity? It is a near linear dependence among the columns of X: there exist constants c_2, ..., c_K, not all zero, such that

    Σ_{j=2}^{K} c_j x_j ≈ 0.    (3.32)

If the relation holds exactly, the collinearity is exact and X'X is singular.

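The claim that X'X becomes a correlation matrix after this scaling can be checked directly; a small numpy sketch with hypothetical data:

```python
import numpy as np

# Center each regressor and scale it to unit length; then X'X should
# reproduce the sample correlation matrix of the raw regressors.
rng = np.random.default_rng(1)
raw = rng.normal(size=(100, 3))               # three raw regressors
centered = raw - raw.mean(axis=0)
S = np.sqrt((centered ** 2).sum(axis=0))      # root sums of squared deviations
Xs = centered / S                             # unit-length columns

XtX = Xs.T @ Xs
print(np.allclose(XtX, np.corrcoef(raw, rowvar=False)))   # True
```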
Use of eigenvalues and eigenvectors to explain multicollinearity. Let V = [v_2  v_3  ...  v_K] be the orthogonal matrix whose columns are the normalized eigenvectors of X'X. Then

    V' (X'X) V = Λ = diag(λ_2, λ_3, ..., λ_K).    (3.33)

If some eigenvalue λ_l is zero or near zero, then

    v_l' (X'X) v_l = λ_l ≈ 0,

that is, ||X v_l||² ≈ 0 and hence Σ_j v_{lj} x_j ≈ 0: the elements of v_l are the coefficients of the (near) linear dependence among the columns of X.

OLS and perfect multicollinearity. The OLS estimator and its covariance matrix,

    b = (X'X)^{-1} X'y,    var(b) = σ² (X'X)^{-1},

do not exist when X'X is singular. To see the problem concretely, consider the two-regressor model in deviation form. Then

    b_2 = [Σ(y_i − ȳ)(x_{2i} − x̄_2) Σ(x_{3i} − x̄_3)² − Σ(y_i − ȳ)(x_{3i} − x̄_3) Σ(x_{2i} − x̄_2)(x_{3i} − x̄_3)]
          / [Σ(x_{2i} − x̄_2)² Σ(x_{3i} − x̄_3)² − (Σ(x_{2i} − x̄_2)(x_{3i} − x̄_3))²]

and

    var(b_2) = σ² Σ(x_{3i} − x̄_3)² / [Σ(x_{2i} − x̄_2)² Σ(x_{3i} − x̄_3)² − (Σ(x_{2i} − x̄_2)(x_{3i} − x̄_3))²]
             = σ² / [Σ(x_{2i} − x̄_2)² (1 − r_{23}²)].

Suppose x_{3i} = λ x_{2i} for some nonzero constant λ (perfect multicollinearity).

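The eigen-analysis above can be illustrated numerically. In this sketch (hypothetical data, with x_4 built as approximately x_2 + x_3), the smallest eigenvalue of X'X is near zero and its eigenvector gives the coefficients of the near dependence.

```python
import numpy as np

# Construct a near linear dependence: x4 ~ x2 + x3 plus small noise.
rng = np.random.default_rng(2)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = x2 + x3 + rng.normal(scale=0.05, size=n)

raw = np.column_stack([x2, x3, x4])
centered = raw - raw.mean(axis=0)
Xs = centered / np.sqrt((centered ** 2).sum(axis=0))   # correlation form

lam, V = np.linalg.eigh(Xs.T @ Xs)    # eigenvalues in ascending order
v = V[:, 0]                           # eigenvector of the smallest eigenvalue
print(lam[0])                         # near zero
print(np.linalg.norm(Xs @ v) ** 2)    # equals lam[0]: X v_l is almost null
```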
Substituting x_{3i} = λ x_{2i} into the formula for b_2 (so that r_{23}² = 1) makes both the numerator and the denominator vanish: b_2 = 0/0 is indeterminate, and var(b_2) is likewise undefined because 1 − r_{23}² = 0.

What can be estimated? In deviation form,

    y_i = b_2 x_{2i} + b_3 x_{3i} + e_i = (b_2 + λ b_3) x_{2i} + e_i = a x_{2i} + e_i,

where a = b_2 + λ b_3. Applying OLS gives

    â = Σ(y_i − ȳ)(x_{2i} − x̄_2) / Σ(x_{2i} − x̄_2)² = b_2 + λ b_3,

so the linear combination α = β_2 + λ β_3 (an estimable function) can be estimated uniquely, but β_2 and β_3 cannot be estimated separately.

3. High but imperfect multicollinearity

When multicollinearity is near rather than exact, the OLS estimators still exist and are still BLUE (best linear unbiased estimators). In case of near or high multicollinearity, however, one is likely to encounter the following consequences:

1. Although BLUE, the OLS estimators have large variances and covariances, making precise estimation difficult.
2. Due to consequence 1, the confidence intervals tend to be much wider, leading to the acceptance of the "zero null hypothesis" more readily.
3. Also due to consequence 1, the t ratio of one or more coefficients tends to be statistically
insignificant (i.e., becomes smaller).
4. Although the t ratio of one or more coefficients is statistically insignificant, R², the overall measure of goodness of fit, can be very high. Indeed, this is one of the signals of multicollinearity: insignificant t values but a high overall R² (and a significant F value)!
5. The OLS estimators and their standard errors can be sensitive to small changes in the data.

4. Detection

1. High R² but few significant t ratios: the F test rejects the hypothesis that all slope coefficients are zero, yet the individual t tests do not.
2. High pair-wise correlations among regressors.
3. Examination of partial correlations, suggested by Farrar and Glauber. In the regression of y on x_2, x_3 and x_4, a high R²_{1.234} combined with low partial correlations r²_{12.34}, r²_{13.24} and r²_{14.23} suggests that the regressors are highly intercorrelated.
4. Auxiliary regressions. Since multicollinearity is an exact or approximate linear relationship among the regressors, regress each x_i on the remaining regressors, obtain the coefficient of determination R²_{x_i·x_2 x_3 ... x_K}, and compute

    F_i = [R²_{x_i·x_2...x_K} / (K − 2)] / [(1 − R²_{x_i·x_2...x_K}) / (n − K + 1)],    (10.7.3)

which follows the F distribution with K − 2 and n − K + 1 degrees of freedom. If the computed F_i exceeds the critical value, the particular x_i is collinear with the other regressors. Klein's rule of thumb: multicollinearity is a troublesome problem only if the R² obtained from an auxiliary regression is greater than the overall R² of the regression of y on all the regressors.

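Detection device 4 can be sketched in a few lines. The data below are made up, and the helper aux_r2 is illustrative (not from any particular package); it fits each auxiliary regression by least squares and applies eq. (10.7.3).

```python
import numpy as np

# Auxiliary regressions: regress each x_i on the remaining regressors and
# compute F_i = (R^2/(K-2)) / ((1-R^2)/(n-K+1)) as in eq. (10.7.3).
rng = np.random.default_rng(4)
n = 120
regs = {
    "x2": rng.normal(size=n),
    "x3": rng.normal(size=n),
}
regs["x4"] = regs["x2"] + regs["x3"] + rng.normal(scale=0.2, size=n)
K = len(regs) + 1                     # total variables, counting the intercept

def aux_r2(target, others):
    """R^2 from the least-squares regression of target on the other x's."""
    Z = np.column_stack([np.ones(n)] + others)
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    return 1.0 - resid.var() / target.var()

for name, x in regs.items():
    others = [v for k, v in regs.items() if k != name]
    r2 = aux_r2(x, others)
    F = (r2 / (K - 2)) / ((1 - r2) / (n - K + 1))
    print(name, round(r2, 3), round(F, 1))   # x4's R^2 and F are very large
```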
5. Eigenvalues and condition index. The SAS package uses eigenvalues and the condition index to diagnose multicollinearity. From the eigenvalues of X'X we can derive what is known as the condition number k, defined as

    k = maximum eigenvalue / minimum eigenvalue,

and the condition index (CI), defined as

    CI = sqrt(maximum eigenvalue / minimum eigenvalue) = sqrt(k).

Then we have this rule of thumb: if k is between 100 and 1000 there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity. Alternatively, if the CI is between 10 and 30 there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity.

6. Tolerance and variance inflation factor. For the two-regressor model,

    var(b_2) = σ² / [Σ(x_{2i} − x̄_2)² (1 − r_{23}²)],    (7.4.12)
    var(b_3) = σ² / [Σ(x_{3i} − x̄_3)² (1 − r_{23}²)],    (7.4.15)
    cov(b_2, b_3) = −r_{23} σ² / [(1 − r_{23}²) √Σ(x_{2i} − x̄_2)² √Σ(x_{3i} − x̄_3)²].    (7.4.17)

The factor 1/(1 − r_{23}²) is the variance-inflating factor,

    VIF = 1 / (1 − r_{23}²).    (10.5.1)

More generally (see Myers), for the model with K − 1 regressors,

    var(b_j) = [σ² / Σ(x_{ji} − x̄_j)²] · [1 / (1 − R_j²)],    (7.5.6)

where b_j is the partial regression coefficient of x_j and R_j² is the R² in the regression of x_j on the remaining regressors.

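Diagnostics 5 and 6 can be computed with plain numpy. A minimal sketch on hypothetical near-collinear data; the vif helper below is illustrative, not a library routine.

```python
import numpy as np

# Condition number k, condition index CI, and VIF_j / TOL_j.
rng = np.random.default_rng(5)
n = 150
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = x2 + x3 + rng.normal(scale=0.1, size=n)    # near linear dependence
raw = np.column_stack([x2, x3, x4])

# k and CI from the eigenvalues of X'X in correlation form.
centered = raw - raw.mean(axis=0)
Xs = centered / np.sqrt((centered ** 2).sum(axis=0))
lam = np.linalg.eigvalsh(Xs.T @ Xs)
k = lam.max() / lam.min()
CI = np.sqrt(k)

# VIF_j = 1/(1 - R_j^2) from the auxiliary regression of x_j on the rest.
def vif(j):
    target = raw[:, j]
    Z = np.column_stack([np.ones(n), np.delete(raw, j, axis=1)])
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    r2 = 1.0 - (target - Z @ coef).var() / target.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(j) for j in range(raw.shape[1])]
tols = [1.0 / v for v in vifs]
print(round(k, 1), round(CI, 1))     # large k, CI above 10
print([round(v, 1) for v in vifs])   # VIF of x4 well above 10
```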
In general,

    VIF_j = 1 / (1 − R_j²),    j = 2, 3, ..., K.

The inverse of the VIF is called tolerance (TOL):

    TOL_j = 1 − R_j² = 1 / VIF_j.    (10.5.4)

Some authors use the VIF as an indicator of multicollinearity. As a rule of thumb, if the VIF of a variable exceeds 10, which will happen if R_j² exceeds 0.90, that variable is said to be highly collinear.

5. Remedies

Rule-of-thumb procedures:

1. A priori information.
2. Combining cross-sectional and time-series data (pooling the data).
3. Dropping a variable (or variables), at the risk of specification bias. In dropping a variable from the model we may be committing a specification bias or specification error, which arises from incorrect specification of the model used in the analysis. If the true model is

    y_i = β_1 + β_2 x_{2i} + β_3 x_{3i} + ε_i

but we mistakenly fit the model

    y_i = b_1 + b_2 x_{2i} + e_i

then it can be shown that

    E(b_2) = β_2 + β_3 b_{32},

where b_{32} is the slope coefficient in the regression of x_3 on x_2. It is obvious that b_2 will be a biased estimator of β_2 as long as b_{32} is different from zero. If b_{32} does not approach zero as the sample size is increased indefinitely, then b_2 will be not only biased but also inconsistent. Of course, if b_{32} is zero, we have no multicollinearity problem to begin with.

4. Transformation of variables. For

    y_i = β_1 + β_2 x_{2i} + β_3 x_{3i} + ε_i,

a ratio transformation gives

    y_i / x_{3i} = β_1 (1 / x_{3i}) + β_2 (x_{2i} / x_{3i}) + β_3 + ε_i / x_{3i}.

But the first-difference or ratio transformations are not without problems: the first-difference error v_t = ε_t − ε_{t−1} may be serially correlated, and the ratio error ε_i / x_{3i} may be heteroskedastic.

5. Additional or new data.

6. Other methods of remedying multicollinearity. Multivariate statistical techniques such as factor analysis and principal components, or techniques such as stepwise regression and ridge regression, are often employed to solve the problem of multicollinearity.
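Of the techniques listed under remedy 6, ridge regression is the easiest to sketch: it replaces (X'X)^{-1} X'y by (X'X + kI)^{-1} X'y, which exists even when X'X is nearly singular, at the cost of some bias. A minimal illustration on hypothetical data:

```python
import numpy as np

# Ridge estimator b(k) = (X'X + k I)^{-1} X'y on a near-collinear pair.
rng = np.random.default_rng(7)
n = 100
x2 = rng.normal(size=n)
x3 = x2 + rng.normal(scale=0.05, size=n)     # x3 almost equal to x2
X = np.column_stack([x2, x3])
X = X - X.mean(axis=0)                       # deviation form
y = x2 + x3 + rng.normal(size=n)
y = y - y.mean()

def ridge(k):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

b_ols = ridge(0.0)      # ordinary least squares: unstable here
b_ridge = ridge(10.0)   # shrunk toward zero, much better conditioned
print(b_ols, b_ridge)
```

The ridge constant k trades bias for variance; in practice it is chosen by inspecting the ridge trace or by cross-validation.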