
The Efficiency of Taking First Differences in Regression Analysis: A Note

J. A. TILLMAN

IN a recent article, Geary [1972] discussed the merit of taking first differences to deal with the problems that trends in data present in regression analysis. Geary gave examples of situations where this procedure leads to highly inefficient estimates of the regression coefficients. The first difference transformation has also been suggested as an appropriate way of dealing with multicollinearity among the independent variables, for example Kane [1968, p. 28]. This note generalises Geary's results and shows that transforming the data by taking first differences for such a purpose cannot improve the efficiency of the regression estimates but will in general cause a reduction in efficiency. Although the usual least squares formulae are not appropriate for calculating the variance of regression estimates obtained from the transformed data, they are often used for this purpose. Their relation to the variance of the regression estimates obtained from the untransformed data is examined. For completeness this note concludes with a brief discussion of first differencing to deal with serially correlated disturbances in the regression model.

Trend and Multicollinearity

Let the regression model be

    y = Xβ + u                                  (1)

where y is an n×1 vector, X an n×k matrix of rank k, and β a vector of k parameters to be estimated; u is an unobserved n×1 vector of disturbances such that E(u) = 0, E(uu') = σ²I. For such a regression model the best linear unbiased estimator is the familiar least squares estimator

    b = (X'X)⁻¹X'y
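As a concrete illustration (not part of the original note), the least squares estimator b = (X'X)⁻¹X'y can be sketched in a few lines of NumPy; the data below are simulated purely for illustration:

```python
import numpy as np

# Simulated instance of model (1): y = X beta + u (illustrative data only)
rng = np.random.default_rng(1)
n, k = 50, 2
X = rng.standard_normal((n, k))
beta = np.array([2.0, -1.0])
y = X @ beta + 0.1 * rng.standard_normal(n)

# The familiar least squares estimator b = (X'X)^(-1) X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # close to (2, -1) since the disturbances are small
```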

with covariance

    Cov(b) = σ²(X'X)⁻¹ = σ²Σ_b

However, trends in the data may give rise to implausibly large R² values. Another problem frequently met with is that of a high degree of multicollinearity between the independent variables. This may result in the estimated coefficients having insignificant t ratios. Taking first differences is sometimes employed as a way out of these difficulties. We will show that this cannot lead to more efficient estimates if the underlying regression is given by (1). Premultiplying (1) by the (n−1)×n matrix T transforms the regression to first differences, thus

    Ty = TXβ + Tu                               (2)

where

    T = [ −1   1   0  …   0 ]
        [  0  −1   1  …   0 ]
        [  …   …   …  …   … ]
        [  0   …   0  −1   1 ]

If an intercept term is included in the regression, first differencing reduces it to zero. To avoid this, both the dependent and independent variables are expressed as deviations about their means. This eliminates the need for an explicit intercept term in (1). Let y* = Ty, X* = TX and u* = Tu. Hence y*, X* and u* have n−1 rows and E(u*u*') = σ²TT'. Therefore (2) may be re-written as

    y* = X*β + u*                               (3)

where β is now to be estimated from (3). But, to quote Geary, "Invariably, however, when the regression problem is deltaised the assumption is made that the error term 'u*' is regular, which assumption amounts to a wrong specification if the basic model is (1)." The least squares estimator using the deltaised data y*, X* is

    b* = (X*'X*)⁻¹X*'y*

with covariance

    Cov(b*) = σ²(X*'X*)⁻¹X*'TT'X*(X*'X*)⁻¹ = σ²Σ_b*

whereas the best linear unbiased estimator is now

    β̂* = (X*'(TT')⁻¹X*)⁻¹X*'(TT')⁻¹y*
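A minimal numerical sketch of the transformation (2) (an editorial illustration, assuming NumPy; not part of the original note): T applied to a vector produces its first differences, and TT' is the tridiagonal matrix that governs the covariance of u*:

```python
import numpy as np

n = 6
# The (n-1) x n first-difference matrix T: each row is (..., -1, 1, ...)
T = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

y = np.array([1.0, 3.0, 6.0, 10.0, 15.0, 21.0])
print(T @ y)     # the first differences of y, identical to np.diff(y)

# E(u* u*') = sigma^2 T T': tridiagonal, 2 on the diagonal, -1 beside it
print(T @ T.T)
```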

with covariance

    Cov(β̂*) = σ²(X*'(TT')⁻¹X*)⁻¹ = σ²Σ_β̂*

The question to be resolved is that of the relative efficiency among the three estimators b, b* and β̂*. Successively applying the Generalised Gauss-Markov theorem yields the inequality¹

    Cov(b) ≤ Cov(β̂*) ≤ Cov(b*)

Thus b* is the least efficient of the three estimates. This loss of efficiency is due both to incorrectly assuming that E(u*u*') = σ²I and to "losing" one observation in taking first differences. If T were an n×n matrix of full rank, β̂* would be identical to b, but b* would still in general differ from b. If, after taking first differences to remove trend or multicollinearity, the u* are regarded as being regular, then not only will b* be used instead of β̂* but the covariance of b* is estimated (incorrectly) by

    Ĉov(b*) = σ²(X*'X*)⁻¹

First differencing would be judged as having successfully dealt with multicollinearity if Ĉov(b*) is smaller than Cov(b). It is therefore of some interest to investigate the extent to which such a reduction is possible. We shall obtain bounds for the ratio of the generalised covariances, |Ĉov(b*)| / |Cov(b)|. The matrix of regression vectors X may be written

    X = PK

where P'P = I and K is non-singular. Hence

    |X'X| = |K'K| = |K|²

and

    |X*'X*| = |X'T'TX| = |K'P'T'TPK| = |P'T'TP| |K|² = |X'X| ∏_{i=1}^{k} Ch_i(P'T'TP)

1. This is discussed further in Appendix 1.
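The ordering Cov(b) ≤ Cov(β̂*) ≤ Cov(b*) can be checked numerically for any particular X. The sketch below (an editorial illustration assuming NumPy, not part of the original note; X is simulated and, for a strict first inequality, not expressed in deviations) verifies that each difference of the scaled covariance matrices is positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 30, 2
X = rng.standard_normal((n, k))          # simulated regressors, no intercept

T = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)
Xs = T @ X                               # X* = TX
V = T @ T.T                              # E(u*u*') / sigma^2

cov_b = np.linalg.inv(X.T @ X)                                   # Cov(b)/sigma^2
A = np.linalg.inv(Xs.T @ Xs)
cov_bstar = A @ Xs.T @ V @ Xs @ A                                # Cov(b*)/sigma^2
cov_betastar = np.linalg.inv(Xs.T @ np.linalg.inv(V) @ Xs)       # Cov(beta*)/sigma^2

# Each difference should be positive semi-definite (smallest eigenvalue >= 0)
for big, small in [(cov_betastar, cov_b), (cov_bstar, cov_betastar)]:
    print(np.linalg.eigvalsh(big - small).min() >= -1e-8)
```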

where we have made use of the fact that the determinant of a symmetric matrix is equal to the product of its characteristic roots. The ith largest characteristic root of a symmetric matrix A may be denoted by Ch_i(A). Although the characteristic roots of P'T'TP depend on X, Cauchy's inequality enables them to be bounded by the characteristic roots of T'T.

Cauchy's Inequality: If A is an n×n symmetric matrix and R an n×k matrix such that R'R = I,

    Ch_{i+n−k}(A) ≤ Ch_i(R'AR) ≤ Ch_i(A),   i = 1, …, k

The next step is to evaluate the characteristic roots of T'T. But

    T'T = [  1  −1   0  …   0 ]
          [ −1   2  −1  …   0 ]
          [  …   …   …  …   … ]
          [  0   …  −1   2  −1 ]
          [  0   …   0  −1   1 ]

It is known, for example see Anderson [1971], that

    Ch_i(T'T) = 2(1 − cos((n−i)π/n)),   i = 1, …, n

Therefore

    4 > Ch_1(T'T) > … > Ch_n(T'T) = 0

The characteristic vector of T'T corresponding to Ch_n(T'T) is a vector of identical elements, orthogonal to the columns of X. Therefore Cauchy's inequality contracts to give

    ∏_{i=1}^{k} Ch_{i+n−k−1}(T'T) ≤ ∏_{i=1}^{k} Ch_i(P'T'TP) ≤ ∏_{i=1}^{k} Ch_i(T'T)   (4)

The upper bound is attained if the columns of P correspond to the characteristic vectors of T'T giving the k largest roots. These vectors may be written as P_2. The lower bound is attained if the columns of P correspond to vectors that yield the k smallest roots excluding Ch_n(T'T). These may be written as P_1. After some simplification we obtain bounds for the ratio of the generalised covariances,

    1 / ∏_{i=1}^{k} Ch_i(T'T)  ≤  |Ĉov(b*)| / |Cov(b)|  ≤  1 / ∏_{i=1}^{k} Ch_{i+n−k−1}(T'T)
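The closed form for the characteristic roots of T'T quoted above can be confirmed numerically (an editorial sketch assuming NumPy; not part of the original note): the roots are 2(1 − cos(jπ/n)) for j = 0, …, n−1, so the smallest is zero and all lie below 4:

```python
import numpy as np

n = 8
T = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)
roots = np.sort(np.linalg.eigvalsh(T.T @ T))

# Closed form: 2(1 - cos(j*pi/n)), j = 0, ..., n-1
theory = np.sort(2.0 * (1.0 - np.cos(np.arange(n) * np.pi / n)))
print(np.allclose(roots, theory))   # True
print(roots[0], roots[-1])          # smallest root ~0, largest below 4
```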

The upper bound is attained if X = P_1 K, the lower if X = P_2 K. The bounds are tabulated for various values of k and n in Table I. The lower bound approaches 1/4^k as n increases but the upper bound has no limit as n increases.

The conditions for the ratio of the generalised covariances to attain its maximum value have an interesting economic interpretation. In many economic applications, particularly those using time series data, the regression vectors are slowly changing; that is, the change from one period to the next is small in relation to the total change over n periods. In such a case the matrix of regression vectors X is approximately equal to P_1 K. It might be conjectured that for small divergences from X = P_1 K the relation of Ĉov(b*) to Cov(b) is given approximately by the upper bound in Table I. Therefore in many economic applications the incorrect estimate Ĉov(b*) will be larger than the covariance of the original b. Thus even if the original estimator does not suffer from multicollinearity the incorrect estimate b* will appear to do so.

TABLE I: Upper and lower bounds of |Ĉov(b*)| / |Cov(b)| for k = 1, 2, 3, 4 and various n. [Numerical entries illegible in the source.] k = number of independent variables; k does not include the constant term. For n = 30 the figures for the upper bound have been rounded.

First Order Serial Correlation

For completeness we conclude by briefly describing another use for which the first difference transformation has been proposed. In many applications of the regression model, particularly those using time series data, it is suspected that the disturbance term u follows a first order autoregressive process. That is

    u_t = ρu_{t−1} + ε_t

where the ε_t are independent identically distributed error terms.

Premultiplying equation (1) by the (n−1)×n matrix Q corresponds to the generalised first difference transformation and regularises the error term. Thus

    Qy = QXβ + Qu                               (6)

where

    Q = [ −ρ   1   0  …   0 ]
        [  0  −ρ   1  …   0 ]
        [  …   …   …  …   … ]
        [  0   …   0  −ρ   1 ]

It may easily be shown that E(Quu'Q') = σ²I_{n−1}. Unlike the problem of trend or multicollinearity, the purpose of the transformation is now to deal with the irregular error term. Kadiyala [1968] discussed the efficiency of the least squares estimator obtained using the transformed data Qy and QX relative to the ordinary least squares estimator. For the special case of X being a column of ones the estimator obtained from (6) was always less efficient than the least squares estimator, and as ρ approached one the efficiency dropped to zero. It is for values of ρ close to one that the first difference transformation has been recommended for dealing with first order serial correlation. A better procedure is to use a transformation Q*, where

    Q* = [ q' ]
         [ Q  ]

and q' = (√(1−ρ²), 0, …, 0). If ρ is unknown it can be estimated from the ordinary least squares residuals, for example Johnston [1972].

Conclusion

The first difference transformation is not an appropriate way of dealing with trend or multicollinearity in regression analysis. Transforming the data in this way cannot increase the efficiency of the regression estimates and will in general reduce the efficiency. For first order serial correlation the transformation Q* is superior to the generalised first difference transformation Q.

I would like to thank T. Muench for his helpful comments on an earlier draft of this paper.

University of Massachusetts, Boston.
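The transformation Q* above is the familiar Prais-Winsten device: prepending the row q' = (√(1−ρ²), 0, …, 0) retains the first observation and fully whitens an AR(1) disturbance. A numerical sketch (an editorial illustration assuming NumPy; not part of the original note):

```python
import numpy as np

rho, n = 0.9, 5

# Generalised first-difference matrix Q: rows (-rho, 1, 0, ..., 0)
Q = np.eye(n - 1, n, k=1) - rho * np.eye(n - 1, n)

# Q* prepends q' = (sqrt(1 - rho^2), 0, ..., 0), so no observation is lost
q = np.zeros(n)
q[0] = np.sqrt(1.0 - rho**2)
Qstar = np.vstack([q, Q])

# AR(1) correlation matrix: R[i, j] = rho^|i-j|  (Cov(u) = sigma_u^2 R)
R = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))

# Q R Q' and Q* R Q*' are both scalar multiples of the identity:
# the transformed disturbances are regular
print(np.allclose(Q @ R @ Q.T, (1.0 - rho**2) * np.eye(n - 1)))      # True
print(np.allclose(Qstar @ R @ Qstar.T, (1.0 - rho**2) * np.eye(n)))  # True
```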

REFERENCES

[1] Anderson, T. W. (1971), The Statistical Analysis of Time Series, J. Wiley, New York.
[2] Geary, R. C. (1972), "Two Exercises in Simple Regression", Economic and Social Review, Vol. 3, No. 4, July, 1972.
[3] Johnston, J. (1972), Econometric Methods, 2nd edition, McGraw-Hill, New York.
[4] Kadiyala, K. R. (1968), "A Transformation Used to Circumvent the Problem of Autocorrelation", Econometrica, 36, 93-96.
[5] Kane, E. J. (1968), Economic Statistics and Econometrics, Harper and Row.

APPENDIX 1

The k×k covariance matrices of b, β̂* and b* are given by σ²Σ_b, σ²Σ_β̂* and σ²Σ_b* respectively. The inequality between the covariances is to be understood in the following way. If A and B are symmetric positive definite matrices of the same order then A ≥ B if and only if A − B is positive semi-definite. The Generalised Gauss-Markov theorem states that for the regression model (1), b has minimum variance among all unbiased linear estimators of β. Since β̂* is another unbiased linear estimator of β it follows that Cov(b) ≤ Cov(β̂*). Similarly, for model (3), β̂* has minimum variance among all unbiased linear estimators of β where the data is of the form y*, X*. But b* is also an unbiased linear estimator of β, hence Cov(β̂*) ≤ Cov(b*).