Additional material for Chapter 7 Regression diagnostic IV: model specification errors


Table 7.14 OLS results of the regression of PCE on income.

Dependent Variable: PCE
Method: Least Squares
Date: 07/31/10 Time: 10:00
Sample: 1960 2009
Included observations: 50

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            31.88846      18.22720     1.749498      0.0866
INCOME       0.819232      0.003190     256.7871      0.0000

R-squared            0.999273     Mean dependent var      3522.160
Adjusted R-squared   0.999257     S.D. dependent var      3077.678
S.E. of regression   83.86681     Akaike info criterion   11.73551
Sum squared resid    337614.8     Schwarz criterion       11.81200
Log likelihood       -291.3879    Hannan-Quinn criter.    11.76464
F-statistic          65939.59     Durbin-Watson stat      0.568044
Prob(F-statistic)    0.000000

[...] ILS is that it takes into account directly the simultaneity problem, whereas OLS simply ignores it.

We have considered a very simple example of simultaneous equation models. In models involving several equations, it is not easy to determine whether all the equations in the system are identified. The method of ILS is too clumsy to identify each equation. But there are other methods of identification, such as the order condition of identification and the rank condition of identification. We will not discuss them here, for that will take us away from the main theme of this chapter, which is to discuss the major sources of specification errors. But a brief discussion of the order condition of identification is given in Chapter 19. An extended discussion of this topic can be found in the references.[23]

7.10 Dynamic regression models

Economic theory is often stated in static or equilibrium form. For example, elementary economics teaches us that the equilibrium price of a commodity (or service) is determined by the intersection of the relevant demand and supply curves. However, the equilibrium price is not determined instantaneously but by a process of trial and error, which takes time. This leads us to a discussion of dynamic regression models. Therefore, if we neglect to take into account the dynamic (i.e. time) aspect of a problem, we will be committing a specification error.
To motivate the discussion, we consider the celebrated permanent income hypothesis of Milton Friedman.[24] In simple terms, it states that the current consumption (expenditure) of an individual is a function of his or her permanent (i.e. life-long) income. But how does one measure permanent income? Based on quarterly data, Friedman estimated permanent income as a weighted average of quarterly income going back about 16 quarters. Letting Y represent consumption expenditure and X income, Friedman estimated the following type of model:

$$Y_t = A + B_0 X_t + B_1 X_{t-1} + B_2 X_{t-2} + \cdots + B_{16} X_{t-16} + u_t \tag{7.14}$$

where $X_t$ is income in the current period (quarter), $X_{t-1}$ is income lagged one quarter, $X_{t-2}$ is income lagged two quarters, and so on. The B coefficients are the weights attached to the income in the various quarters. We assume that the model (7.14) satisfies the usual OLS assumptions. For discussion purposes, we will call (7.14) the consumption function.

In the literature, model (7.14) is known as a distributed lag model (DLM) because the current value of the dependent variable Y is affected by the current and lagged values of the explanatory variable X. This is not difficult to see. Suppose you get an increase in your salary this year. Assuming this increase is maintained, you will not necessarily rush to spend the increase in your income immediately. Rather, you are likely to spread it over a period of time.

Before we turn to the estimation of the DLM, it may be useful to interpret the model in (7.14). The coefficient $B_0$ is known as the short-run or impact multiplier, for it gives the change in the mean value of Y following a unit change in X in the same time period. If the change in X is kept at the same level thereafter, $(B_0 + B_1)$ gives the change in mean Y in the next period, $(B_0 + B_1 + B_2)$ in the following period, etc. These partial sums are called interim or intermediate multipliers. After k periods (if that is the maximum lag length under consideration), we obtain:

$$\sum_{i=0}^{k} B_i = B_0 + B_1 + \cdots + B_k \tag{7.15}$$

which is known as the long-run or total multiplier. It gives the ultimate change in mean consumption expenditure following a (sustained) unit increase in income. Thus, in the following hypothetical consumption function,

$$Y_t = \text{constant} + 0.4 X_t + 0.2 X_{t-1} + 0.15 X_{t-2} + 0.1 X_{t-3}$$

the impact multiplier will be 0.4, the interim multipliers will be 0.6 and 0.75, and the total, or long-run, multiplier will be 0.85.

[24] Milton Friedman, A Theory of the Consumption Function, Princeton University Press, New Jersey, 1957.
If, for example, income increases by $1000 in year t, and assuming this increase is maintained, consumption will increase by $400 in the first year, by another $200 in the second year, by another $150 in the third year, and by another $100 in the fourth year, with the final total increase being $850. Presumably, the consumer will save $150.

Returning to the model (7.14), we can estimate it by the usual OLS method.[25] But this may not be practical for several reasons. First, how do we decide how many lagged terms to use? Second, if we use several lagged terms, we will have fewer degrees of freedom to do meaningful statistical analyses, especially if the sample size is small. Third, in time series data successive values of the lagged term are likely to be highly correlated, which may lead to the problem of multicollinearity. As we noted in the chapter on multicollinearity, this leads to imprecise estimation of the regression coefficients.

[25] Provided that the regressors (current and lagged) are weakly exogenous, that is, they are uncorrelated with the error term. In some cases a stronger assumption is needed in that the regressors are strictly exogenous, that is, they are independent of the past, current and future values of the error term.
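As a rough check on the multiplier arithmetic, the following Python sketch accumulates the lag weights of the hypothetical consumption function given above. The names `lag_weights` and `cumulative_multipliers` are illustrative choices made here; only the four coefficients and the $1000 income rise come from the text.

```python
from itertools import accumulate

# Lag weights of the hypothetical consumption function in the text:
# Y_t = constant + 0.4 X_t + 0.2 X_{t-1} + 0.15 X_{t-2} + 0.1 X_{t-3}
lag_weights = [0.4, 0.2, 0.15, 0.1]


def cumulative_multipliers(weights):
    """Return the partial sums B_0, B_0 + B_1, ... of the lag weights."""
    return list(accumulate(weights))


multipliers = cumulative_multipliers(lag_weights)
impact = multipliers[0]        # short-run (impact) multiplier
interim = multipliers[1:-1]    # interim (intermediate) multipliers
long_run = multipliers[-1]     # long-run (total) multiplier

# Cumulative rise in mean consumption after a sustained $1000 rise in income.
income_rise = 1000
for period, m in enumerate(multipliers, start=1):
    print(f"after period {period}: cumulative rise = ${m * income_rise:.0f}")
```

The last partial sum is the long-run multiplier, so the cumulative response stops growing once all four lags have taken effect.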

To overcome some of these drawbacks of the DLM, some alternatives have been suggested in the literature. We will discuss only one of these alternatives, namely the Koyck distributed lag model.[26]

The Koyck distributed lag model[27]

To understand this model, let us express (7.14) in a more general form:

$$Y_t = A + B_0 X_t + B_1 X_{t-1} + B_2 X_{t-2} + \cdots + u_t \tag{7.16}$$

This is called an infinite DLM because we have not defined the length of the lag; that is, we have not specified how far back in time we want to travel. By contrast, the model in (7.14) is a finite DLM, for we have specified the length of the lag: 16 lagged terms. The infinite DLM in (7.16) is for mathematical convenience, as we will show.

To estimate the parameters of (7.16), Koyck used the geometric probability distribution. Assuming that all the B coefficients in (7.16) have the same sign, which makes sense in our consumption function, Koyck assumed that they decline geometrically as follows:

$$B_k = B_0 \lambda^k, \quad k = 0, 1, \ldots; \quad 0 < \lambda < 1 \tag{7.17}$$

where $\lambda$ is known as the rate of decline or decay and where $(1 - \lambda)$ is known as the speed of adjustment, that is, how fast consumption expenditure adjusts to the new income level.

Apart from $B_0$, the value of each $B_k$ depends on the value of $\lambda$: a value of $\lambda$ close to 1 would suggest that $B_k$ declines slowly, that is, X values in the distant past will have some impact on the current value of Y. On the other hand, a value of $\lambda$ close to zero would suggest that X in the distant past will have little impact on the current Y.

What Koyck is assuming is that each successive B coefficient is numerically smaller than each preceding B (which follows from the assumption that $\lambda$ is less than 1), suggesting that as we go back into the distant past, the effect of that lag on Y becomes progressively smaller. In the consumption function of (7.14) this makes good sense, for a person's consumption expenditure today is less likely to be affected by distant past income than by recent past income.

How does this help us in estimating the infinite DLM?
[26] For details, see Gujarati/Porter, Ch. 17. For an advanced discussion, see James H. Stock and Mark W. Watson (2011), Introduction to Econometrics, 3rd edn, Addison-Wesley, Boston, Ch. 15.

[27] L. M. Koyck (1954), Distributed Lags and Investment Analysis, North Holland Publishing Company, Amsterdam.

To see how, let us express (7.16) as

$$Y_t = A + B_0 X_t + B_0 \lambda X_{t-1} + B_0 \lambda^2 X_{t-2} + B_0 \lambda^3 X_{t-3} + \cdots + u_t \tag{7.18}$$

where use is made of (7.17). However, (7.18) is not easy to estimate, for we still have to estimate an infinite number of coefficients and the adjustment coefficient $\lambda$ enters highly nonlinearly. But Koyck uses a clever trick to get around this problem. He lags (7.18) by one period to obtain:

$$Y_{t-1} = A + B_0 X_{t-1} + B_0 \lambda X_{t-2} + B_0 \lambda^2 X_{t-3} + \cdots + u_{t-1} \tag{7.19}$$

He then multiplies (7.19) by $\lambda$ to obtain:

$$\lambda Y_{t-1} = \lambda A + \lambda B_0 X_{t-1} + \lambda^2 B_0 X_{t-2} + \lambda^3 B_0 X_{t-3} + \cdots + \lambda u_{t-1} \tag{7.20}$$

Subtracting (7.20) from (7.18), he obtains:

$$Y_t - \lambda Y_{t-1} = A(1 - \lambda) + B_0 X_t + (u_t - \lambda u_{t-1}) \tag{7.21}$$

Rearranging (7.21), he finally obtains:

$$Y_t = A(1 - \lambda) + B_0 X_t + \lambda Y_{t-1} + v_t \tag{7.22}$$

where $v_t = u_t - \lambda u_{t-1}$.

It is interesting to note that the lagged value of the dependent variable appears as a regressor in this model. Such models are called autoregressive models, for they involve the regression of the dependent variable upon its lagged value(s) among other independent explanatory variable(s).

A great advantage of the Koyck transformation is that instead of estimating an infinite number of parameters, as in (7.16), we now have to estimate only three parameters in model (7.22), a tremendous simplification of the original model.

Are there any problems in estimating (7.22)? Before we answer that question, it is interesting to note that the short-run and long-run impacts of a unit change in X on the mean value of Y can be readily computed from (7.22). The short-run impact is given by the coefficient of X, $B_0$, and the long-run impact of a sustained unit change in X is given by $B_0/(1 - \lambda)$.[28] Since $\lambda$ lies between 0 and 1, the long-run impact will be greater than the short-run impact, which makes sense because it takes time to adjust to the changed income.

The estimation of (7.22) poses formidable challenges.

First, even if the error term $u_t$ satisfies the classical assumptions (i.e. zero mean value, constant variance and no serial correlation), the composite error term $v_t$ in (7.22) may not satisfy the classical assumptions. As a matter of fact, it can be shown that the error term $v_t$ is serially correlated.

Second, the lagged value of the dependent variable Y appears as an explanatory variable in (7.22). Since $Y_t$ is a stochastic variable, so is $Y_{t-1}$. Since the classical OLS model assumes that the explanatory variables must either be nonstochastic or, if stochastic, be distributed independently of the error term, we must find out if the latter is the case. In (7.22) it can be shown that $Y_{t-1}$ and $v_t$ are correlated.[29] In this situation, the OLS estimators are not even consistent.

Third, as noted in the chapter on autocorrelation, we cannot use the Durbin-Watson d statistic to check for autocorrelation in $v_t$ if a lagged dependent variable appears as an explanatory variable in the model, as in (7.22), although Durbin himself has developed a test, the Durbin h test, to test for serial correlation in this situation.

For these reasons, the Koyck model, although elegant, poses formidable estimation problems. What then are the solutions?

First, autocorrelation in the error term $v_t$, taken by itself, makes the standard errors of the OLS estimators unreliable, even though the OLS estimators would still be consistent. We can resolve this problem by using the HAC standard errors discussed in the chapter on autocorrelation.

The more serious problem, however, is the correlation between the lagged Y and the error term $v_t$. As we know from previous discussion, in this situation the OLS estimators are not even consistent. One solution to this problem is to find a proxy for the lagged dependent variable, $Y_{t-1}$, such that it is highly correlated with $Y_{t-1}$ and yet uncorrelated with the error term $v_t$. Such a proxy is known as an instrumental variable (IV), but it is not always easy to find IVs.[30] In the example discussed below we will show how we can find a proxy for the lagged consumption expenditure in our consumption example.

An illustrative example

To illustrate the model (7.22), we use data on personal consumption expenditure (PCE) and disposable (i.e. after-tax) income (DPI) for the USA for the period 1960 to 2009 (all data in 2005 dollars). (See the data appendix on p. 149.) For our example, using OLS we obtain the results in Table 7.15.

Because of the problems with the OLS standard errors in the presence of autocorrelation, we obtained robust standard errors (i.e. Newey-West standard errors) for our consumption function, which yielded the results in Table 7.16.

Although the estimated regression coefficients in the two tables are the same (as they should be under the HAC procedure), the estimated standard errors are somewhat higher under HAC. Even then, the two estimated slope coefficients remain statistically highly significant, as reflected in the low p values of the estimated t values. This probably suggests that the problem of autocorrelation may not be very serious in the present case.

Accepting the results for the time being (we still have to resolve the possibility of correlation between the lagged PCE and the error term), it seems that the short-run marginal propensity to consume (MPC) out of disposable income is about 0.43, but the long-run MPC is about 0.98.[31] That is, when consumers have had time to adjust to a dollar's increase in DPI, they will increase their mean consumption expenditure by almost a dollar in the long run, but in the short run, consumption increases by only about 43 cents.

Table 7.15 OLS results of regression (7.22).

Dependent Variable: PCE
Method: Least Squares
Date: 07/07/11 Time: 16:40
Sample (adjusted): 1961 2009
Included observations: 49 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            485.8849      197.5245     2.459872      0.0177
DPI          0.432575      0.081641     5.298529      0.0000
PCE(-1)      0.559023      0.084317     6.630052      0.0000

R-squared            0.998251     Mean dependent var      19602.16
Adjusted R-squared   0.998175     S.D. dependent var      6299.838
S.E. of regression   269.1558     Akaike info criterion   14.08773
Sum squared resid    3332462.     Schwarz criterion       14.20355
Log likelihood       -342.1493    Hannan-Quinn criter.    14.13167
F-statistic          13125.09     Durbin-Watson stat      0.708175
Prob(F-statistic)    0.000000

[28] This is because in the long run $Y^* = Y_t = Y_{t-1}$, so transferring $Y_{t-1}$ to the left-hand side of (7.22) and simplifying, we obtain the long-run impact, as shown.

[29] For a proof of this and the preceding statement, see Gujarati/Porter, 5th edn, p. 635.

[30] Chapter 19 is devoted to a discussion of the method of instrumental variable estimation.

[31] This is obtained as $0.4325/(1 - \lambda) = 0.4325/0.441$, the value of $\lambda$ being about 0.5590.
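The short-run and long-run MPC quoted for the Koyck model can be reproduced with a few lines of Python. This is only a sketch: the two point estimates are copied from Table 7.15, while the function names `koyck_weights` and `koyck_long_run` and the choice of 50 lags for the numerical check are assumptions made here for illustration.

```python
# Point estimates copied from Table 7.15 (regression of PCE on DPI and PCE(-1)).
b0_hat = 0.432575    # coefficient on DPI: short-run MPC (B_0)
lam_hat = 0.559023   # coefficient on PCE(-1): decay rate (lambda)


def koyck_weights(b0, lam, n_lags):
    """Geometric lag weights B_k = B_0 * lambda**k for k = 0, ..., n_lags - 1."""
    return [b0 * lam ** k for k in range(n_lags)]


def koyck_long_run(b0, lam):
    """Long-run multiplier B_0 / (1 - lambda); requires 0 < lambda < 1."""
    if not 0 < lam < 1:
        raise ValueError("lambda must lie strictly between 0 and 1")
    return b0 / (1 - lam)


long_run_mpc = koyck_long_run(b0_hat, lam_hat)

# Summing many geometric weights should approach the closed-form long-run value.
partial_sum = sum(koyck_weights(b0_hat, lam_hat, 50))

print(f"short-run MPC: {b0_hat:.4f}")
print(f"long-run MPC:  {long_run_mpc:.4f}")
print(f"sum of first 50 lag weights: {partial_sum:.4f}")
```

The comparison of `partial_sum` with `long_run_mpc` illustrates why the infinite DLM and the three-parameter autoregressive form imply the same total multiplier.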

Table 7.16 Results of regression with robust standard errors.

Dependent Variable: PCE
Method: Least Squares
Date: 07/07/11 Time: 16:46
Sample (adjusted): 1961 2009
Included observations: 49 after adjustments
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 4.0000)

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            485.8849      267.7614     1.814619      0.0761
DPI          0.432575      0.098339     4.398823      0.0001
PCE(-1)      0.559023      0.102057     5.477587      0.0000

R-squared            0.998251     Mean dependent var      19602.16
Adjusted R-squared   0.998175     S.D. dependent var      6299.838
S.E. of regression   269.1558     Akaike info criterion   14.08773
Sum squared resid    3332462.     Schwarz criterion       14.20355
Log likelihood       -342.1493    Hannan-Quinn criter.    14.13167
F-statistic          13125.09     Durbin-Watson stat      0.708175
Prob(F-statistic)    0.000000

The estimated $\lambda$ of about 0.56 lies between 0 and 1, as expected. Thus the speed of adjustment of PCE to a change in DPI is neither very slow nor very fast. To see how quickly PCE adjusts to an increase in DPI, we can compute the so-called median and mean lag times. The median lag time is the time in which the first half, or 50%, of the total change in PCE follows a unit sustained change in DPI. The mean lag is the weighted average of all the lags involved, with the respective B coefficients serving as the weights. For the Koyck model, it can be shown that these lags are as follows:

$$\text{Median lag} = -\frac{\log 2}{\log \lambda}$$

and

$$\text{Mean lag} = \frac{\lambda}{1 - \lambda}$$

The reader can verify that for the present example the median and mean lags are about 1.19 and 1.27, respectively, noting that $\lambda$ is about 0.56. In the former case, about 50% of the total change in mean PCE is obtained in about 1.2 years and in the latter case the average lag is about 1.3 years.

As noted, the lagged PCE and the error term in (7.22) are likely to be correlated, which would render the results in Table 7.16 suspect, for in this situation the OLS estimators are not even consistent. Can we find a proxy for the lagged PCE such that the proxy is highly correlated with it, but is uncorrelated with the error term in (7.22)?

Since lagged PCE and lagged DPI are likely to be highly correlated, and since the latter by assumption is (weakly) exogenous, we can use lagged DPI as a proxy for lagged PCE.[32]

[32] Calculations will show that the correlation coefficient between the two is about 0.998.
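The median and mean lag formulas for the Koyck model can be checked numerically. The sketch below uses the estimate of the decay rate reported in Table 7.16; the function names `koyck_median_lag` and `koyck_mean_lag` are hypothetical helpers introduced here, not something taken from the text.

```python
import math


def koyck_median_lag(lam):
    """Median lag of the Koyck model: -log(2) / log(lambda)."""
    if not 0 < lam < 1:
        raise ValueError("lambda must lie strictly between 0 and 1")
    return -math.log(2) / math.log(lam)


def koyck_mean_lag(lam):
    """Mean lag of the Koyck model: lambda / (1 - lambda)."""
    if not 0 < lam < 1:
        raise ValueError("lambda must lie strictly between 0 and 1")
    return lam / (1 - lam)


lam_hat = 0.559023  # coefficient on PCE(-1) in Table 7.16

median_lag = koyck_median_lag(lam_hat)
mean_lag = koyck_mean_lag(lam_hat)

print(f"median lag: {median_lag:.2f} years")
print(f"mean lag:   {mean_lag:.2f} years")
```

Because the data are annual, both lags are measured in years.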

Therefore, instead of estimating (7.22), we can estimate

$$PCE_t = A + B_1 DPI_t + B_2 DPI_{t-1} + u_t \tag{7.23}$$

which is a finite-order DLM. The results of this regression, with HAC standard errors, are given in Table 7.17.

Table 7.17 The results of regression (7.23) using HAC standard errors.

Dependent Variable: PCE
Method: Least Squares
Date: 07/08/11 Time: 08:51
Sample (adjusted): 1961 2009
Included observations: 49 after adjustments
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 4.0000)

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            1425.511      372.3686     3.828224      0.0004
DPI          0.934361      0.175986     5.309287      0.0000
DPI(-1)      0.038213      0.177358     0.215455      0.8304

R-squared            0.996583     Mean dependent var      19602.16
Adjusted R-squared   0.996434     S.D. dependent var      6299.838
S.E. of regression   376.1941     Akaike info criterion   14.75736
Sum squared resid    6510013.     Schwarz criterion       14.87318
Log likelihood       -358.5553    Hannan-Quinn criter.    14.80130
F-statistic          6707.481     Durbin-Watson stat      0.351356
Prob(F-statistic)    0.000000

The lagged DPI coefficient in this regression is not statistically significant, which may be due to the fact that current and lagged DPI are so highly correlated. If you add the coefficients of current and lagged DPI, the sum is about 0.9725, which gives the long-run MPC.

It should be noted that the proxy we have chosen may not be the right one.[33] But as noted previously, and as will be discussed more fully in Chapter 19, finding appropriate proxies is not always easy.

[33] If we had data on consumers' wealth (W), we could use lagged W for the lagged DPI, for they are likely to be highly correlated. However, it is not easy to find data on consumer wealth.

Autoregressive distributed lag (ARDL) models

So far we have considered autoregressive and distributed lag models. But we can combine the features of these models in a more general dynamic regression model, known as the autoregressive distributed lag (ARDL) model.

To keep the discussion simple, we consider one dependent variable, or regressand, Y and one explanatory variable, or regressor, X, although the discussion can be extended to models that contain more than one regressor and more than one dependent variable, a topic explored more fully in Chapters 13 and 16.

Now consider the following model:

$$Y_t = A_0 + A_1 Y_{t-1} + A_2 Y_{t-2} + \cdots + A_p Y_{t-p} + B_0 X_t + B_1 X_{t-1} + B_2 X_{t-2} + \cdots + B_q X_{t-q} + u_t \tag{7.24}$$

This equation can be written more compactly as:

$$Y_t = A_0 + \sum_{i=1}^{p} A_i Y_{t-i} + \sum_{i=0}^{q} B_i X_{t-i} + u_t \tag{7.25}$$

In this model the lagged Ys constitute the autoregressive part and the lagged Xs constitute the distributed lag part of the ARDL(p, q) model, for there are p autoregressive terms and q distributed lag terms.

An advantage of such an ARDL model is that it captures not only the dynamic effects of the lagged Ys but also those of the lagged Xs. If a sufficient number of lags of both variables are included in the model, we can eliminate autocorrelation in the error term, the choice of the number of lags being determined by the Akaike or a similar information criterion. Such models are often used for forecasting and also for estimating the multiplier effects of the regressors in the model.

Before we consider the estimation and interpretation of this model, as well as the nature of the regressand, regressors and the error term, it may be useful to know why such models can be useful in empirical work.[34] One classic example is the celebrated Phillips curve. Based on historical data, Phillips found an inverse relationship between inflation and unemployment, although the initial Phillips curve has been modified in several ways.[35] Since current inflation is likely to be influenced by lagged inflation (because of inertia) as well as by current and past unemployment rates, it is appropriate to develop an ARDL model for forecasting and policy purposes.[36]

For another example, consider the sale of a product in relation to advertising expenditure on that product. The sale of a product in the current time period is likely to depend on the sale of that product in previous time periods as well as on the expenditure on advertising in the current and previous time periods.

In our consumption function example we can also argue that current consumption expenditure depends on past consumption expenditures as well as on current and past levels of income, the number of lags being determined empirically using a suitable information criterion, such as the Akaike information criterion.

To minimize the algebra, let us consider an ARDL(1,1) model for our consumption function:

$$Y_t = A_0 + A_1 Y_{t-1} + B_0 X_t + B_1 X_{t-1} + u_t, \quad |A_1| < 1 \tag{7.26}$$

where Y = PCE and X = DPI.[37]

[34] For a detailed but advanced discussion see David F. Hendry (1995), Dynamic Econometrics, Oxford University Press.

[35] For a chronology of the various forms of the Phillips curve, see Gordon, R. J. (2008), The history of the Phillips curve: an American perspective, a keynote address delivered at the Australasian Meetings of the Econometric Society. See http://www.nzae.org.nz/conference/2008/090708/nr1217302437.pdf.

[36] For a concrete example, see R. Carter Hill, William E. Griffiths and Guay C. Lim (2011), Principles of Econometrics, 3rd edn, Wiley, New York, pp. 367-369.

[37] If the condition $|A_1| < 1$ is violated, Y will exhibit explosive behavior.

That is, personal consumption expenditure in the current period is related to personal consumption expenditure in the previous time period as well as to current and one-period lagged disposable income.

An important feature of the model (7.26) is that it enables us to find the dynamic effects of a change in DPI on current and future values of PCE. The immediate effect, called the impact multiplier, of a unit change in DPI is given by the coefficient $B_0$. If the unit change in DPI is sustained, it can be shown that the long-run multiplier is given by

$$\text{long-run multiplier} = \frac{B_0 + B_1}{1 - A_1} \tag{7.27}$$

So if DPI increases by a unit (say, a dollar) and the increase is maintained, the expected cumulative increase in PCE is given by (7.27).[38] In other words, if the unit increase in DPI is maintained, Equation (7.27) gives the long-run permanent increase in PCE.

To illustrate the ARDL(1,1) model for our consumption example, we have to make certain assumptions. First, the variables Y and X are stationary.[39] Secondly, given the values of the regressors in Eq. (7.26), or more generally in Eq. (7.24), the expected mean value of the error term $u_t$ is zero. Thirdly, if the error term in Eq. (7.24) is serially uncorrelated, then the coefficients of the model (7.24), or in the present case (7.26), estimated by OLS will be consistent (in the statistical sense). However, if the error term is autocorrelated, the lagged Y term in Eq. (7.26), or generally in Eq. (7.24), will also be correlated with the error term, in which case the OLS estimators will be inconsistent. So we need to find out if the error term is autocorrelated by any of the methods discussed in the chapter on autocorrelation. Finally, it is assumed that the X variables are exogenous, at least weakly so. That is, they are uncorrelated with the error term.

Now let us return to our illustrative example. The results of model (7.26) are given in Table 7.18.

Assuming the validity of the model for the time being, the results show that the impact multiplier of a unit change in DPI on PCE is about 0.82. If this unit change is maintained, then the long-run multiplier, following Eq. (7.27), is about 0.9846.[40] As expected, the long-run multiplier is greater than the short-run multiplier. Thus a sustained one dollar increase in DPI will eventually increase mean PCE by about 98 cents.

To allow for the possibility of serial correlation in the error term, we re-estimated the model in Table 7.18 using the HAC procedure. The results are given in Table 7.19.

The HAC procedure does not change the estimated standard errors substantially, perhaps suggesting that the serial correlation problem in our example may not be serious.

We leave it to the reader to try different lagged values for p and q in the ARDL(p, q) model for our data and compare the results with the ARDL(1,1) model.

[38] For a derivation of this result, see Marno Verbeek (2008), A Guide to Modern Econometrics, 3rd edn, Wiley and Sons, Chichester, pp. 324-325.

[39] Broadly speaking, a time series is stationary if its mean and variance are constant over time and the value of the covariance between two time periods depends only on the distance between the two time periods and not on the actual time at which the covariance is computed. This topic is discussed more thoroughly in Chapter 13.

[40] Long-run multiplier = $(B_0 + B_1)/(1 - A_1) = (0.8245 - 0.6329)/(1 - 0.8053) = 0.9846$ (approx.)
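To see how the ARDL(1,1) response builds up towards the long-run multiplier in (7.27), the following Python sketch traces the effect on Y of a sustained unit increase in X. It is an illustration under stated assumptions: the three coefficients are the estimates used in the text's long-run calculation (with the DPI(-1) coefficient entered as negative, as that calculation implies), and the function names and the 200-period horizon are choices made here.

```python
# Estimates used in the text's long-run multiplier calculation for model (7.26).
a1_hat = 0.805356    # coefficient on PCE(-1), A_1
b0_hat = 0.824591    # coefficient on DPI, B_0 (impact multiplier)
b1_hat = -0.632942   # coefficient on DPI(-1), B_1 (negative sign assumed from the text's calculation)


def ardl11_long_run(a1, b0, b1):
    """Long-run multiplier (B_0 + B_1) / (1 - A_1); requires |A_1| < 1."""
    if abs(a1) >= 1:
        raise ValueError("|A_1| must be below 1, otherwise Y is explosive")
    return (b0 + b1) / (1 - a1)


def ardl11_response_path(a1, b0, b1, horizon):
    """Change in Y at periods 0..horizon after X rises by one unit at period 0 and stays there."""
    path = []
    prev_response = 0.0
    for h in range(horizon + 1):
        x_lagged = 1.0 if h >= 1 else 0.0  # lagged X has risen only from period 1 onward
        response = a1 * prev_response + b0 * 1.0 + b1 * x_lagged
        path.append(response)
        prev_response = response
    return path


long_run = ardl11_long_run(a1_hat, b0_hat, b1_hat)
path = ardl11_response_path(a1_hat, b0_hat, b1_hat, horizon=200)

print(f"impact multiplier:   {path[0]:.4f}")
print(f"long-run multiplier: {long_run:.4f}")
print(f"response after 200 periods: {path[-1]:.4f}")
```

The recursion simply applies (7.26) to the deviations from the initial steady state, so its limit coincides with the closed-form expression whenever the stability condition holds.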

144 Criical evaluaion of he classical linear regression model Table 7.18 OLS esimaes of model (7.26). Dependen Variable: PCE Mehod: Leas Squares Dae: 08/14/11 Time: 13:35 Sample (adjused): 1961 2009 Included observaions: 49 afer adjusmens Variable Coefficien Sd. Error -Saisic Prob. C 281.2019 161.0712 1.745823 0.0877 DPI 0.824591 0.097977 8.416208 0.0000 PCE( 1) 0.805356 0.081229 9.914632 0.0000 DPI( 1) 0.632942 0.118864 5.324935 0.0000 R-squared 0.998927 Mean dependen var 19602.16 Adjused R-squared 0.998855 S.D. dependen var 6299.838 S.E. of regression 213.1415 Akaike info crierion 13.63990 Sum squared resid 2044318. Schwarz crierion 13.79433 Log likelihood 330.1775 Hannan Quinn crier. 13.69849 F-saisic 13962.93 Durbin Wason sa 1.841939 Prob(F-saisic) 0.000000 Table 7.19 OLS esimaes of model (7.26) wih HAC sandard errors. Dependen Variable: PCE Mehod: Leas Squares Dae: 08/14/11 Time: 13:41 Sample (adjused): 1961 2009 Included observaions: 49 afer adjusmens HAC sandard errors & covariance (Barle kernel, Newey Wes fixed bandwidh = 4.0000) Variable Coefficien Sd. Error -Saisic Prob. C 281.2019 117.3088 2.397107 0.0207 PCE( 1) 0.805356 0.071968 11.19044 0.0000 DPI 0.824591 0.114989 7.171026 0.0000 DPI( 1) 0.632942 0.119717 5.286977 0.0000 R-squared 0.998927 Mean dependen var 19602.16 Adjused R-squared 0.998855 S.D. dependen var 6299.838 S.E. of regression 213.1415 Akaike info crierion 13.63990 Sum squared resid 2044318. Schwarz crierion 13.79433 Log likelihood 330.1775 Hannan Quinn crier. 13.69849 F-saisic 13962.93 Durbin Wason sa 1.841939 Prob(F-saisic) 0.000000 Forecasing How do we use he model (7.26) for forecasing? Suppose we wan o forecas PCE for 1961, ha is, one-period ahead of 1960 (our sample daa ends in 1960). Tha is, we wan o esimae PCE 1961. We can move he model one period ahead as follows: PCE1961 A0 A1 Y1960 B0X1961 B1X1960 u1961 (7.28) Here we know he values of Y 1960 and X 1960. Bu we do no know he values of X 1961 and u 1961. 
We can guess-estimate $X_{1961}$ or obtain its value from any forecasting method discussed in Chapter 16 on economic forecasting. We can put the value of $u_{1961}$ at zero. Then, using the estimated values of the parameters from Table 7.19, we can obtain the estimated value of $PCE_{1961}$. A similar procedure can be used for multi-period ahead forecasts of PCE. But we leave it to the reader to find the numerical values of PCE for one-period-ahead and multi-period-ahead forecasts.

Concluding comments

In this section we have discussed three dynamic regression models: autoregressive, distributed lag, and autoregressive and distributed lag models. We first considered an infinite-order distributed lag model (DLM), but because it involves estimating an infinite number of parameters we converted it into an autoregressive model via the Koyck transformation. With a numerical example involving real personal consumption expenditure and real disposable income in the US for the period 1960-2009, we showed how these models are estimated, noting the assumptions underlying these models and some of the estimation problems.

We also discussed a simple autoregressive distributed lag model, ARDL(1,1), which combines the features of both autoregressive and distributed lag models, and showed how we can compute the short-run and long-run multipliers following a permanent unit increase in the value of a regressor. We discussed the assumptions underlying this model and some of the estimation procedures, and briefly showed how forecasts for future periods can be made based on the ARDL models.

The topic of dynamic regression models is vast and mathematically complex. In this section we have just touched on the essential features of such models. For further study of these models the reader is advised to consult the references.

7.11 Summary and conclusions

We have covered a lot of ground in this chapter on a variety of practical topics in econometric modeling.

If we omit a relevant variable(s) from a regression model, the estimated coefficients and standard errors of OLS estimators in the reduced model are biased as well as inconsistent.
We considered the RESET and Lagrange Multiplier tests to detect bias from the omission of relevant variables.

If we add unnecessary variables to a model, the OLS estimators of the expanded model are still BLUE. The only penalty we pay is the loss of efficiency (i.e. increased standard errors) of the estimated coefficients.

The appropriate functional form of a regression model is a commonly encountered question in practice. In particular, we often face a choice between a linear and a log-linear model. We showed how we can compare the two models in making the choice, using the Cobb-Douglas production function data for the 50 states in the USA and Washington, DC, as an example.

Errors of measurement are a common problem in empirical work, especially if we depend on secondary data. We showed that the consequences of such errors can be very serious if they exist in explanatory variables, for in that case the OLS estimators are not even consistent. Errors of measurement do not pose a serious problem if they are in the dependent variable. In practice, however, it is not always easy to spot the errors of measurement. The method of instrumental variables, discussed in Chapter 19, is often suggested as a remedy for this problem.

Generally we use the sample data to draw inferences about the relevant population. But if there are unusual observations or outliers in the sample data, inferences based on such data may be misleading. Therefore we need to pay special attention to outlying observations. Before we throw out the outlying observations, we must be very careful to find out why the outliers are present in the data. Sometimes they may result from human errors in recording or transcribing the data. We illustrated the problem of outliers with data on cigarette smoking and deaths from lung cancer in a sample of 42 states, in addition to Washington, DC.

One of the assumptions of the classical normal linear regression model is that the error term included in the regression model follows the normal distribution. This assumption cannot always be maintained in practice. We showed that as long as the assumptions of the classical linear regression model (CLRM) hold, and if the sample size is large, we can still use the t and F tests of significance even if the error term is not normally distributed.

Finally, we discussed the problem of simultaneity bias, which arises if we estimate an equation that is embedded in a system of simultaneous equations by the usual OLS. If we blindly apply OLS in this situation, the OLS estimators are biased as well as inconsistent. There are alternative methods of estimating simultaneous equations, such as the methods of indirect least squares (ILS) or two-stage least squares (2SLS). In this chapter we showed how ILS can be used to estimate the consumption expenditure function in the simple Keynesian model of determining aggregate income.

Exercises

7.1 For the wage determination model discussed in the text, how would you find out if there are any outliers in the wage data? If you do find them, how would you decide if the outliers are influential points? And how would you handle them? Show the necessary details.
7.2 In the various wage determination models discussed in this chapter, how would you find out if the error variance is heteroscedastic? If your finding is in the affirmative, how would you resolve the problem?

7.3 In the chapter on heteroscedasticity we discussed robust standard errors or White's heteroscedasticity-corrected standard errors. For the wage determination models, present the robust standard errors and compare them with the usual OLS standard errors.

7.4 What other variables do you think should be included in the wage determination model? How would that change the models discussed in the text?

7.5 Use the data given in Table 7.8 to find out the impact of cigarette smoking on bladder, kidney, and leukemia cancers. Specify the functional form you use and present your results. How would you find out if the impact of smoking depends on the type of cancer? What may the reason for the difference be, if any?

7.6 Continue with Exercise 7.5. Are there any outliers in the cancer data? If there are, identify them.

7.7 In the cancer data we have 43 observations for each type of cancer, giving a total of 172 observations for all the cancer types. Suppose you now estimate the following regression model:

$$C_i = B_1 + B_2 \text{Cig}_i + B_3 \text{Lung}_i + B_4 \text{Kidney}_i + B_5 \text{Leukemia}_i + u_i$$

where C = number of deaths from cancer, Cig = number of cigarettes smoked, Lung = a dummy taking a value of 1 if the cancer type is lung, 0 otherwise, Kidney = a dummy taking a value of 1 if the cancer type is kidney, 0 otherwise, and Leukemia = 1 if the cancer type is leukemia, 0 otherwise. Treat deaths from bladder cancer as a reference group.
(a) Estimate this model, obtaining the usual regression output.
(b) How do you interpret the various dummy coefficients?
(c) What is the interpretation of the intercept $B_1$ in this model?
(d) What is the advantage of the dummy variable regression model over estimating deaths from each type of cancer in relation to the number of cigarettes smoked separately?
Note: Stack the deaths from various cancers one on top of the other to generate 172 observations on the dependent variable. Similarly, stack the number of cigarettes smoked to generate 172 observations on the regressor.

7.8 The error term in the log of wages regression in Table 7.7 was found to be non-normally distributed. However, the log of wages itself was found to be normally distributed. Are these findings in conflict? If so, what may be the reason for the difference in these findings?

7.9 Consider the following simultaneous equation model:

$$Y_1 = A_1 + A_2 Y_2 + A_3 X_1 + u_1 \qquad (1)$$
$$Y_2 = B_1 + B_2 Y_1 + B_3 X_2 + u_2 \qquad (2)$$

In this model the Ys are the endogenous variables, the Xs are the exogenous variables, and the us are stochastic error terms.
(a) Obtain the reduced form regressions.
(b) Which of the above equations is identified?
(c) For the identified equation, which method will you use to obtain the structural coefficients?
(d) Suppose it is known a priori that $A_3$ is zero. Will this change your answer to the preceding questions? Why?

7.10 For the ARDL(1,1) model, the long-run multiplier is given in Eq. (7.27). Suppose for the illustrative example you estimate the following simple regression model:

$$PCE = C_1 + C_2 DPI + u$$

Estimate this regression and show that $C_2$ is equal to the long-run multiplier given in Eq. (7.27). Can you guess why this is so? Can you establish this formally?

Appendix

Inconsistency of the OLS estimators of the consumption function

The OLS estimator of the marginal propensity to consume is given by the usual OLS formula:

$$b_2 = \frac{\sum c y}{\sum y^2} = \frac{\sum C y}{\sum y^2} \qquad (1)$$

where c and y are deviations from their mean values, e.g. $c = C - \bar{C}$. Now substitute Eq. (7.8) into Eq. (1) to obtain:

$$b_2 = \frac{\sum (B_1 + B_2 Y + u) y}{\sum y^2} = B_2 + \frac{\sum y u}{\sum y^2} \qquad (2)$$

where use is made of the fact that $\sum y = 0$ and $\sum Y y / \sum y^2 = 1$.

Taking the expectation of Eq. (2), we obtain:

$$E(b_2) = B_2 + E\left[\frac{\sum y u}{\sum y^2}\right] \qquad (3)$$

Since E, the expectations operator, is a linear operator, we cannot take the expectation of the nonlinear second term in this equation. Unless the last term is zero, $b_2$ is a biased estimator. Does the bias disappear as the sample increases indefinitely? In other words, is the OLS estimator consistent? Recall that an estimator is said to be consistent if its probability limit (plim) is equal to its true population value. To find this out, we can take the probability limit (plim) of Eq. (3):

$$\text{plim}(b_2) = \text{plim}(B_2) + \text{plim}\left[\frac{\sum y u / n}{\sum y^2 / n}\right] = B_2 + \frac{\text{plim}(\sum y u / n)}{\text{plim}(\sum y^2 / n)} \qquad (4)$$

where use is made of the properties of the plim operator that the plim of a constant (such as $B_2$) is that constant itself and the plim of the ratio of two entities is the ratio of the plim of those entities.

As the sample size n increases indefinitely, it can be shown that

$$\text{plim}(b_2) = B_2 + \frac{1}{1 - B_2}\left(\frac{\sigma_u^2}{\sigma_Y^2}\right) \qquad (5)$$

where $\sigma_u^2$ and $\sigma_Y^2$ are the (population) variances of u and Y, respectively.

Since $B_2$ (MPC) lies between 0 and 1, and since the two variances are positive, it is obvious that plim($b_2$) will always be greater than $B_2$, that is, $b_2$ will overestimate $B_2$, no matter how large the sample is. In other words, not only is $b_2$ biased, but it is inconsistent as well.

Data appendix

obs     PCE         DPI
1960    9871.000    10865.00
1961    9911.000    11052.00
1962    10243.00    11413.00
1963    10512.00    11672.00
1964    10985.00    12342.00
1965    11535.00    12939.00
1966    12050.00    13465.00
1967    12276.00    13904.00
1968    12856.00    14392.00
1969    13206.00    14706.00
1970    13361.00    15158.00
1971    13696.00    15644.00
1972    14384.00    16228.00
1973    14953.00    17166.00
1974    14693.00    16878.00
1975    14881.00    17091.00
1976    15558.00    17600.00
1977    16051.00    18025.00
1978    16583.00    18670.00
1979    16790.00    18897.00
1980    16538.00    18863.00
1981    16623.00    19173.00
1982    16694.00    19406.00
1983    17489.00    19868.00
1984    18256.00    21105.00
1985    19037.00    21571.00
1986    19630.00    22083.00
1987    20055.00    22246.00
1988    20675.00    22997.00
1989    21060.00    23385.00
1990    21249.00    23568.00
1991    21000.00    23453.00
1992    21430.00    23958.00
1993    21904.00    24044.00
1994    22466.00    24517.00
1995    22803.00    24951.00
1996    23325.00    25475.00
1997    23899.00    26061.00
1998    24861.00    27299.00
1999    25923.00    27805.00
2000    26939.00    28899.00
2001    27385.00    29299.00
2002    27841.00    29976.00
2003    28357.00    30442.00
2004    29072.00    31193.00
2005    29771.00    31318.00
2006    30341.00    32271.00
2007    30838.00    32648.00
2008    30479.00    32514.00
2009    30042.00    32637.00

Note: The data in this table are in 2005 chained dollars.
Source: US Department of Commerce. The data can also be found on the website of the Federal Reserve Bank of St Louis, USA.
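For readers who want to work with these data outside a dedicated econometrics package, the following is a minimal sketch of estimating the ARDL(1,1) model, with PCE regressed on a constant, current DPI, lagged PCE and lagged DPI, by ordinary least squares. It assumes NumPy is available and that the figures below have been copied correctly from the data appendix; the variable names are illustrative. If the data match those used in the chapter, the estimates should be close to those reported in Table 7.18, but that is something to verify rather than take for granted. The sketch does not compute standard errors.

```python
import numpy as np

# PCE and DPI for 1960-2009, copied from the data appendix (2005 chained dollars).
pce = np.array([
    9871, 9911, 10243, 10512, 10985, 11535, 12050, 12276, 12856, 13206,
    13361, 13696, 14384, 14953, 14693, 14881, 15558, 16051, 16583, 16790,
    16538, 16623, 16694, 17489, 18256, 19037, 19630, 20055, 20675, 21060,
    21249, 21000, 21430, 21904, 22466, 22803, 23325, 23899, 24861, 25923,
    26939, 27385, 27841, 28357, 29072, 29771, 30341, 30838, 30479, 30042,
], dtype=float)
dpi = np.array([
    10865, 11052, 11413, 11672, 12342, 12939, 13465, 13904, 14392, 14706,
    15158, 15644, 16228, 17166, 16878, 17091, 17600, 18025, 18670, 18897,
    18863, 19173, 19406, 19868, 21105, 21571, 22083, 22246, 22997, 23385,
    23568, 23453, 23958, 24044, 24517, 24951, 25475, 26061, 27299, 27805,
    28899, 29299, 29976, 30442, 31193, 31318, 32271, 32648, 32514, 32637,
], dtype=float)

# One observation is lost to the lag, so the estimation sample is 1961-2009.
y = pce[1:]
X = np.column_stack([
    np.ones(len(y)),  # intercept
    dpi[1:],          # DPI in the current period
    pce[:-1],         # PCE lagged one period
    dpi[:-1],         # DPI lagged one period
])

# OLS coefficients via least squares.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
a0, b0, a1, b1 = beta

resid = y - X @ beta
r_squared = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# Long-run multiplier of the ARDL(1,1) model.
long_run = (b0 + b1) / (1 - a1)

for name, value in zip(["C", "DPI", "PCE(-1)", "DPI(-1)"], beta):
    print(f"{name:8s} {value: .6f}")
print(f"R-squared           {r_squared:.6f}")
print(f"long-run multiplier {long_run:.4f}")
```

The regressors are built by slicing the two series rather than by calling a lag function, which keeps the alignment of current and lagged values explicit.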

Additional material for Appendix 2: Statistical Appendix

$$\frac{(n-1) S_x^2}{\sigma_x^2} \sim \chi^2_{(n-1)}$$

Suppose a random sample of 30 observations is chosen from a normal population with $\sigma_x^2 = 10$ and gives a sample variance of $S_x^2 = 15$. What is the probability of obtaining such a sample variance (or greater)? (Hint: Use statistical tables.)

Exponential and logarithmic functions

In Chapter 2 we considered several functional forms of regression models, one of them being the logarithmic model, either double-log or semi-log. Since logarithmic functional forms appear frequently in empirical work, it is important that we study some of the important properties of the logarithms and their inverse, the exponentials.

Consider the numbers 8 and 64. As you can see

$$64 = 8^2 \qquad (1)$$

Written this way, the exponent 2 is the logarithm of 64 to the base 8. Formally, the logarithm of a number (e.g. 64) to a given base (e.g. 8) is the power (2) to which the base (8) must be raised to obtain the given number (64).

In general, if

$$Y = b^X \quad (b > 0) \qquad (2)$$

then

$$\log_b Y = X \qquad (3)$$

In mathematics function (2) is called the exponential function and (3) is called the logarithmic function. It is clear from these equations that one function is the inverse of the other function.

Although any positive base can be used in practice, the two commonly used bases are 10 and the mathematical number e = 2.71828...

Logarithms to base 10 are called common logarithms. For example,

$$\log_{10} 64 \approx 1.81; \quad \log_{10} 30 \approx 1.48$$

In the first case $64 \approx 10^{1.81}$ and in the second case $30 \approx 10^{1.48}$.

Logarithms to base e are called natural logarithms. Thus,

$$\log_e 64 \approx 4.16 \quad \text{and} \quad \log_e 30 \approx 3.4$$

By convention, logarithms to base 10 are denoted by log and to base e by ln. In the preceding case we can write log 64 or log 30 or ln 64 and ln 30.

There is a fixed relationship between common and natural logs, which is

$$\ln X = 2.3026 \log X \qquad (4)$$

That is, the natural log of the (positive) number X is equal to 2.3026 times the log of X to base 10. Thus, ln 30 = 2.3026 log 30 = 2.3026(1.48) ≈ 3.4, as before. In mathematics the base that is usually used is e.
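The numerical examples above are easy to reproduce. The short sketch below uses only Python's standard `math` module; the helper name `logs` and the constant `LN10` are illustrative choices, not notation from the text.

```python
import math

# Conversion factor between natural and common logs in equation (4): ln 10.
LN10 = math.log(10)

def logs(x):
    """Return (common log, natural log) of a positive number x."""
    return math.log10(x), math.log(x)

# Logarithm of 64 to the base 8, as in equation (1).
print(f"log base 8 of 64 = {math.log(64, 8):.4f}")

for x in (64, 30):
    common, natural = logs(x)
    # The last column rebuilds the natural log from the common log via (4).
    print(f"x = {x}: log = {common:.4f}, ln = {natural:.4f}, "
          f"ln 10 * log = {LN10 * common:.4f}")
```

Because floating-point logarithms are approximate, the base-8 result is formatted to four decimals rather than compared with 2 exactly.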

It is important to keep in mind that logarithms of negative numbers are not defined.

Some of the important properties of logarithms are as follows: let A and B be some positive numbers. It can be shown that the following properties hold:

1. $\ln(AB) = \ln A + \ln B$ (5)
That is, the log of the product of two positive numbers A and B is equal to the sum of their logs. This property can be extended to the product of three or more positive numbers.

2. $\ln(A/B) = \ln A - \ln B$ (6)
That is, the log of the ratio of A to B is equal to the difference in the logs of A and B.

3. $\ln(A \pm B) \neq \ln A \pm \ln B$ (7)
That is, the log of the sum or difference of A and B is not equal to the sum or difference of their logs.

4. $\ln(A^k) = k \ln A$ (8)
That is, the log of A raised to power k is k times the log of A.

5. $\ln e = 1$ (9)
That is, the log of e to itself as a base is 1 (as is the log of 10 to base 10).

6. $\ln 1 = 0$
That is, the natural log of the number 1 is zero; so is the common log of the number 1.

7. If $Y = \ln X$, then
$$\frac{dY}{dX} = \frac{d(\ln X)}{dX} = \frac{1}{X} \qquad (10)$$
That is, the derivative or rate of change of Y with respect to X is 1 over X. However, if you take the second derivative of this function, which gives the rate of change of the rate of change, you will obtain:
$$\frac{d^2 Y}{dX^2} = -\frac{1}{X^2} \qquad (11)$$
That is, although the rate of change of the log of a (positive) number is positive, the rate of change of the rate of change is negative. In other words, a larger positive number will have a larger logarithmic value, but it increases at a decreasing rate. Thus, $\ln(10) \approx 2.3026$ but $\ln(20) \approx 2.9957$. That is why the logarithmic transformation is called a nonlinear transformation. All this can be seen clearly from Figure A2.2.

8. Although the number whose log is taken is always positive, its logarithm can be positive as well as negative. It can be easily verified that
if $0 < Y < 1$, $\ln Y < 0$
if $Y = 1$, $\ln Y = 0$
if $Y > 1$, $\ln Y > 0$
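The properties listed above can be spot-checked numerically. The sketch below does so for one arbitrary pair of positive numbers (A = 8, B = 4, k = 3, all chosen purely for illustration) and approximates the two derivatives of ln X by finite differences. A numerical check of this kind illustrates the properties but does not prove them.

```python
import math

A, B, k = 8.0, 4.0, 3.0  # arbitrary positive numbers for illustration

checks = {
    "(5) ln(AB) = ln A + ln B": math.isclose(math.log(A * B), math.log(A) + math.log(B)),
    "(6) ln(A/B) = ln A - ln B": math.isclose(math.log(A / B), math.log(A) - math.log(B)),
    "(7) ln(A+B) != ln A + ln B": not math.isclose(math.log(A + B), math.log(A) + math.log(B)),
    "(8) ln(A^k) = k ln A": math.isclose(math.log(A ** k), k * math.log(A)),
    "(9) ln e = 1": math.isclose(math.log(math.e), 1.0),
    "ln 1 = 0": math.log(1.0) == 0.0,
    "ln Y < 0 for 0 < Y < 1": math.log(0.5) < 0,
    "ln Y > 0 for Y > 1": math.log(2.0) > 0,
}

# Finite-difference approximations to the derivatives of ln X at X = 10.
X = 10.0
h1, h2 = 1e-5, 1e-3
first = (math.log(X + h1) - math.log(X - h1)) / (2 * h1)
second = (math.log(X + h2) - 2 * math.log(X) + math.log(X - h2)) / h2 ** 2
checks["(10) d ln X / dX = 1/X"] = math.isclose(first, 1 / X, rel_tol=1e-6)
checks["(11) second derivative = -1/X^2"] = math.isclose(second, -1 / X ** 2, rel_tol=1e-4)

for label, ok in checks.items():
    print(f"{label}: {ok}")
```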

[Figure A2.2 Twenty positive numbers and their logs. Top panel: Y = 7.5X plotted against X. Bottom panel: ln Y plotted against Y.]

Logarithms and percentage changes

Economists are often interested in the percentage change of a variable, such as the percentage change in GDP, wages, money supply, and the like. Logarithms can be very useful in computing percentage changes. To see this, we can write (10) above as:

$$d(\ln X) = \frac{dX}{X}$$

Therefore, for a very small (technically, infinitesimal) change in X, the change in ln X is equal to the relative or proportional change in X. If you multiply this relative change by 100, you get the percentage change.
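To get a feel for how well the change in ln X tracks the relative change in X when the change is not infinitesimal, the sketch below compares the two for a few hypothetical growth rates. The function names and the starting value of 100 are illustrative assumptions.

```python
import math

def pct_change(x0, x1):
    """Exact percentage change from x0 to x1."""
    return 100 * (x1 - x0) / x0

def log_pct_change(x0, x1):
    """Percentage change approximated by 100 times the change in ln X."""
    return 100 * (math.log(x1) - math.log(x0))

x0 = 100.0
for growth in (0.01, 0.05, 0.10, 0.50):
    x1 = x0 * (1 + growth)
    exact = pct_change(x0, x1)
    approx = log_pct_change(x0, x1)
    print(f"exact: {exact:6.2f}%   log difference: {approx:6.2f}%   "
          f"gap: {exact - approx:5.2f}")
```

The gap between the two measures grows with the size of the change, so the log difference is best read as a percentage change only when changes are small.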

In practice, if the change in X (= dX) is reasonably small, we can approximate the change in ln X as a relative change in X; that is, for small changes in X, we can write

$$(\ln X_t - \ln X_{t-1}) \approx \frac{X_t - X_{t-1}}{X_{t-1}} = \text{relative change in } X,$$

or percentage change if multiplied by 100.

Some useful applications of logarithms

Doubling times and the rule of 70

Suppose the GDP in a country is growing at the rate of 3% per annum. How long will it take for its GDP to double? Let r = percentage rate of growth in GDP and let n = number of years it takes for GDP to double. Then n is given by the following formula:

$$n = \frac{70}{r} \qquad (12)$$

Thus, it will take about 23 years to double the GDP if the rate of growth of GDP is 3% per annum. If r = 8%, it will take about 8.75 years for the GDP to double. Where does the number 70 come from?

To find this, let GDP(t + n) and GDP(t) be the values of GDP at time (t + n) and at time t (it is immaterial where t starts). Using the continuous compound interest formula of finance, it can be shown that

$$GDP(t + n) = GDP(t) e^{rn} \qquad (13)$$

where r is expressed in decimals and n is expressed in years or any convenient time unit.

Now we have to find n and r such that

$$e^{rn} = \frac{GDP(t + n)}{GDP(t)} = 2 \qquad (14)$$

Taking the natural logarithm of each side, we obtain

$$rn = \ln 2 \qquad (15)$$

Note: There is no need to worry about the middle term in (14), for the initial level of GDP (or any economic variable) does not affect the number of years it takes to double its value.

Since

$$\ln(2) = 0.6931 \approx 0.70 \qquad (16)$$

we obtain from (15)

$$n = \frac{0.70}{r} \qquad (17)$$
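The quality of the approximation in (16) can be seen by comparing the exact doubling time implied by (15), n = ln 2 / r, with the rule-of-70 value in (12). The sketch below assumes continuous compounding, as in (13); the function names are illustrative, and r is entered in percent.

```python
import math

def doubling_time_exact(r_percent):
    """Exact doubling time under continuous compounding, n = ln 2 / r, from (15)."""
    return math.log(2) / (r_percent / 100)

def doubling_time_rule70(r_percent):
    """Rule-of-70 approximation, n = 70 / r, with r in percent, from (12)."""
    return 70 / r_percent

for r in (1, 3, 8):
    exact = doubling_time_exact(r)
    rule = doubling_time_rule70(r)
    print(f"r = {r}%: exact = {exact:.2f} years, rule of 70 = {rule:.2f} years")
```

Because 0.70 is slightly larger than ln 2, the rule of 70 overstates the doubling time by about 1% at every growth rate.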

Multiplying the numerator and denominator of the right-hand side of (17) by 100, we obtain the rule of 70. As you can see from this formula, the higher the value of r, the shorter the time it will take for the GDP to double.

Some growth rate formulas

Logarithmic transformations are very useful in computing growth rates in variables that are functions of time-dependent variables. To show this, let the variable W be a function of time, $W = f(t)$, where t denotes time. Then the instantaneous (i.e. at a point in time) rate of growth of W, denoted as $g_W$, is defined as:

$$g_W = \frac{dW/dt}{W} = \frac{1}{W}\frac{dW}{dt} \qquad (18)$$

For example, let

$$W = X \cdot Z \qquad (19)$$

where W = nominal GDP, X = real GDP, and Z is the GDP price deflator. All these variables vary over time. Taking the natural log of the variables in (19), we obtain:

$$\ln W = \ln X + \ln Z \qquad (20)$$

Differentiating this equation with respect to t (time), we obtain:

$$\frac{1}{W}\frac{dW}{dt} = \frac{1}{X}\frac{dX}{dt} + \frac{1}{Z}\frac{dZ}{dt} \qquad (21)$$

Or,

$$g_W = g_X + g_Z \qquad (22)$$

In words, the instantaneous rate of growth of W is equal to the sum of the instantaneous rates of growth of X and Z. In the present instance, the instantaneous rate of growth of nominal GDP is the sum of the instantaneous rates of growth of real GDP and the GDP price deflator, a finding that should be familiar to students of economics.

In general, the instantaneous rate of growth of a product of two or more variables is the sum of the instantaneous rates of growth of its components. Similarly, it can be shown that if we have

$$W = \frac{X}{Z} \qquad (23)$$

then

$$g_W = g_X - g_Z \qquad (24)$$

Thus, if W = per capita income (measured by GDP), X = GDP, and Z = total population, then the instantaneous rate of growth of per capita income is equal to the instantaneous rate of growth of GDP minus the instantaneous rate of growth of the total population, a proposition well known to students of economic growth.
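Results (22) and (24) can be illustrated numerically. In the sketch below the time paths for real GDP, the price deflator and population are hypothetical exponential paths with made-up growth rates of 3%, 2% and 1%; the instantaneous growth rate is approximated by a central difference of the log of each path, which relies on the fact that (1/W) dW/dt is the time derivative of ln W.

```python
import math

def growth_rate(path, t, h=1e-6):
    """Approximate (1/W) dW/dt at time t by a central difference of ln W."""
    return (math.log(path(t + h)) - math.log(path(t - h))) / (2 * h)

# Hypothetical time paths (all parameters are illustrative assumptions).
def real_gdp(t):
    return 100.0 * math.exp(0.03 * t)

def deflator(t):
    return 1.0 * math.exp(0.02 * t)

def population(t):
    return 50.0 * math.exp(0.01 * t)

def nominal_gdp(t):
    return real_gdp(t) * deflator(t)      # W = X * Z, as in (19)

def per_capita_gdp(t):
    return real_gdp(t) / population(t)    # W = X / Z, as in (23)

t = 5.0
g_real = growth_rate(real_gdp, t)
g_deflator = growth_rate(deflator, t)
g_pop = growth_rate(population, t)
g_nominal = growth_rate(nominal_gdp, t)
g_per_capita = growth_rate(per_capita_gdp, t)

print(f"g_nominal    = {g_nominal:.4f}  vs  g_real + g_deflator = {g_real + g_deflator:.4f}")
print(f"g_per_capita = {g_per_capita:.4f}  vs  g_real - g_pop      = {g_real - g_pop:.4f}")
```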