Panel Data Exercises
Manuel Arellano

Exercise 1

Using panel data, a researcher considers the estimation of the following system:

$$
\begin{aligned}
y_{1t} &= \alpha_1 + \beta x_{1t} + v_{1t} \\
&\;\;\vdots \\
y_{Nt} &= \alpha_N + \beta x_{Nt} + v_{Nt}
\end{aligned}
\qquad (t = 1, \dots, T)
$$

where the equation for each unit is postulated to have a different intercept but a common slope parameter. The chosen method of estimation of $\beta$ is OLS including dummies for different units (i.e. within-groups), because lack of time series correlation between $x_{it}$ and $v_{it}$ for each unit is regarded as a reasonable assumption. Next, in order to obtain a confidence interval for $\beta$, a robust standard error is calculated allowing for contemporaneous correlation among the errors, but ruling out autocorrelation.

Another researcher considers the estimation of the system:

$$
\begin{aligned}
y_{i1} &= \alpha + \beta x_{i1} + u_{i1} \\
&\;\;\vdots \\
y_{iT} &= \alpha + \beta x_{iT} + u_{iT}
\end{aligned}
\qquad (i = 1, \dots, N)
$$

This researcher is worried that $x_{it}$ may be an endogenous explanatory variable. It is postulated that the correlation between $x_{it}$ and $u_{is}$, for any two periods $t$ and $s$, occurs only through an additive time-invariant component of $u_{is}$, so that $u_{is} = \eta_i + v_{is}$ and $\mathrm{Cov}(x_{it}, v_{is}) = 0$ for all $t$ and $s$. The chosen method of estimation is OLS in deviations from means (i.e. within-groups) on the grounds that the deviations eliminate the source of endogeneity, and give rise to a transformed error that remains orthogonal to the transformed explanatory variable. Next, in order to obtain a confidence interval for $\beta$, a robust standard error is calculated allowing for autocorrelation and heteroskedasticity in $v_{it}$, but ruling out cross-sectional dependence among the errors.

(i) Both researchers arrived at the same estimator starting from very different perspectives. Does the validity of the estimation method in each case rest on different assumptions? Discuss.

(ii) In spite of using the same estimator, the two researchers rely on different methods and assumptions for calculating a confidence interval. Compare the two methods and discuss their statistical properties.

(iii) Suppose there is a concern that the effect of $x$ on $y$ may differ across units. The first researcher addresses this concern by obtaining separate OLS estimates for different units $\hat{\beta}_1, \dots, \hat{\beta}_N$ and the average

$$
\tilde{\beta} = \frac{1}{N} \sum_{i=1}^{N} \hat{\beta}_i .
$$

The second researcher reckons that as long as the variation in the slopes is independent of $x_{it}$, within-groups can be interpreted as a consistent estimate of the average effect. Discuss this claim and compare the strategies followed by the two researchers.
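
The two estimators compared in part (iii) of Exercise 1 can be computed side by side on simulated data. The sketch below is illustrative only: the data-generating process (Gaussian draws, a common slope of 0.5, regressors correlated with the unit intercepts) and all function names are assumptions made for this example and are not part of the exercise.

```python
import random


def simulate_panel(n_units=200, n_periods=10, beta=0.5, seed=0):
    """Simulate y_it = alpha_i + beta * x_it + v_it (hypothetical design)."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha_i = rng.gauss(0.0, 1.0)
        # x_it is correlated with the intercept alpha_i but not with v_it.
        xs = [alpha_i + rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        ys = [alpha_i + beta * x + rng.gauss(0.0, 1.0) for x in xs]
        panel.append((xs, ys))
    return panel


def demeaned_moments(xs, ys):
    """Return (S_xx, S_xy) for one unit, in deviations from the unit means."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xx, s_xy


def within_groups(panel):
    """Pooled OLS slope on unit-demeaned data (within-groups)."""
    moments = [demeaned_moments(xs, ys) for xs, ys in panel]
    return sum(s_xy for _, s_xy in moments) / sum(s_xx for s_xx, _ in moments)


def unit_slopes(panel):
    """Separate OLS slope for each unit."""
    slopes = []
    for xs, ys in panel:
        s_xx, s_xy = demeaned_moments(xs, ys)
        slopes.append(s_xy / s_xx)
    return slopes


def mean_group(panel):
    """Unweighted average of the unit-by-unit OLS slopes."""
    slopes = unit_slopes(panel)
    return sum(slopes) / len(slopes)


if __name__ == "__main__":
    panel = simulate_panel()
    print("within-groups:", within_groups(panel))
    print("mean-group:   ", mean_group(panel))
```

With a scalar regressor, the within-groups slope is algebraically the average of the unit slopes weighted by each unit's within sum of squares of $x$, whereas $\tilde{\beta}$ weights all units equally. That difference in weighting is a useful starting point for the comparison asked for in (iii).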

Exercise 2

Consider a first-order autoregressive model with individual and time effects of the form

$$
y_{it} - \mu_i - \delta_t = \alpha \left( y_{i(t-1)} - \mu_i - \delta_{t-1} \right) + v_{it} \qquad (i = 1, \dots, N;\ t = 1, \dots, T)
$$

$$
E\left( v_{it} \mid y_{i0}, \dots, y_{i(t-1)}, \delta_0, \dots, \delta_t, \mu_i \right) = 0 .
$$

Suppose that in fact $T = 2$, so that for each individual we observe $y_{i0}, y_{i1}, y_{i2}$.

(a) Obtain the within-groups estimate of $\alpha$ and discuss its properties.

(b) Derive a consistent estimator of $\alpha$ for large $N$. How would your answer be modified if $T > 2$?

(c) Discuss the costs and benefits of assuming that the initial observations satisfy $E(y_{i0} \mid \mu_i, \delta_0) = \mu_i + \delta_0$. How would you test this assumption?
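
A small Monte Carlo can make the contrast behind parts (a) and (b) of Exercise 2 concrete. The sketch below assumes Gaussian errors, a stationary initial condition, and arbitrary values for the time effects; these choices and the function names are hypothetical. It uses the fact that with two observations per unit, within-groups with individual and time effects coincides with OLS in first differences including a constant, and it takes $y_{i0}$ as an instrument for $\Delta y_{i1}$ as one possible large-$N$ estimator.

```python
import random


def cov(a, b):
    """Sample covariance across individuals."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate(n=20000, alpha=0.5, deltas=(0.0, 0.3, -0.2), seed=1):
    """Draw (y_i0, y_i1, y_i2) from the AR(1) model with mu_i and delta_t."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        mu = rng.gauss(0.0, 1.0)
        # Assumption: y_i0 - mu_i - delta_0 is drawn from the stationary
        # distribution, with variance 1 / (1 - alpha^2).
        dev = rng.gauss(0.0, (1.0 / (1.0 - alpha ** 2)) ** 0.5)
        y = [mu + deltas[0] + dev]
        for t in (1, 2):
            dev = alpha * dev + rng.gauss(0.0, 1.0)
            y.append(mu + deltas[t] + dev)
        rows.append(y)
    return rows


def estimates(rows):
    """Return (within-groups, IV) estimates of alpha for T = 2."""
    y0 = [r[0] for r in rows]
    d1 = [r[1] - r[0] for r in rows]  # first difference at t = 1
    d2 = [r[2] - r[1] for r in rows]  # first difference at t = 2
    # Within-groups with T = 2: OLS of d2 on d1 and a constant.
    wg = cov(d1, d2) / cov(d1, d1)
    # IV in first differences with y_i0 (and a constant) as instruments.
    iv = cov(y0, d2) / cov(y0, d1)
    return wg, iv


if __name__ == "__main__":
    wg, iv = estimates(simulate())
    print("within-groups:", wg)
    print("IV with y_i0: ", iv)
```

Under this design the IV estimate should lie close to the true value of 0.5, while the within-groups estimate should be far from it even with a large cross-section.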

Exercise 3

Consider the following partial adjustment model with individual effects

$$
y_{it} = \alpha y_{i(t-1)} + \beta_0 x_{it} + \beta_1 x_{i(t-1)} + \eta_i + v_{it} \qquad (i = 1, \dots, N;\ t = 1, \dots, T).
$$

Discuss the identification and estimation of the parameters of a model of this type when $T$ is small and $N$ is large, under the assumptions listed below. Set out carefully any additional assumptions that you make in each case.

(a) $x_{it}$ is a strictly exogenous variable uncorrelated with $\eta_i$, and $v_{it}$ is a potentially serially correlated error.

(b) The variable $x_{it}$ is strictly exogenous but correlated with the individual effect $\eta_i$.

(c) $x_{it}$ is a predetermined variable correlated with $\eta_i$, and $v_{it}$ is a white noise error.

(d) Discuss the extent to which strict exogeneity is a testable as opposed to an identifying assumption in this model.

Exercise 4

Let $y_{it}$ be a 0-1 binary variable such that

$$
\Pr\left( y_{it} = 1 \mid y_{i1}, \dots, y_{i(t-1)}, x_{i1}, \dots, x_{it}, \eta_i \right) = F(x_{it}'\beta + \eta_i) \qquad (i = 1, \dots, N;\ t = 1, \dots, T)
$$

where $F$ is some known cumulative distribution function.

(a) Derive the log-likelihood function for this model and explain what difficulties one might encounter in estimating $\beta$ by maximizing the log-likelihood with respect to $(\beta, \eta_1, \dots, \eta_N)$.

(b) Obtain the plim as $N \to \infty$ of the MLE of $\beta$ for logit and probit when $T = 2$, $\beta$ is a scalar parameter, and $x_{i1} = 0$ and $x_{i2} = 1$ for all $i$.

(c) Describe a method of estimation of $\beta$ that is consistent as $N \to \infty$ when $T = 2$ and $F(r) = e^r / (1 + e^r)$, regardless of the form of the distribution of $\eta_i \mid x_{i1}, \dots, x_{iT}$.

(d) Explain how you would estimate $\beta$ in a probit model under the additional assumption that

$$
\eta_i \mid x_{i1}, \dots, x_{iT} \sim N\left( \psi + \lambda_1' x_{i1} + \dots + \lambda_T' x_{iT},\ \sigma_\eta^2 \right).
$$

(e) Discuss the relative merits of the methods considered in parts (b) and (c), and suggest a method for testing the correlation between $x_{it}$ and $\eta_i$.
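
For the logit case with $T = 2$ and regressor values 0 in the first period and 1 in the second, one well-known approach is to condition on $y_{i1} + y_{i2} = 1$. Among these "movers" the probability of the sequence $(0, 1)$ is $e^{\beta} / (1 + e^{\beta})$, which does not involve $\eta_i$, so the conditional estimate is the log ratio of the two mover counts. The sketch below illustrates this on simulated data; the normal distribution of $\eta_i$, the value $\beta = 1$, and the function names are assumptions made for the example.

```python
import math
import random


def logistic(r):
    """Logistic cdf F(r) = e^r / (1 + e^r)."""
    return 1.0 / (1.0 + math.exp(-r))


def simulate_movers(n=40000, beta=1.0, seed=2):
    """Count sequences (0, 1) and (1, 0) when x_i1 = 0 and x_i2 = 1."""
    rng = random.Random(seed)
    n01 = 0
    n10 = 0
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # hypothetical distribution of eta_i
        y1 = rng.random() < logistic(eta)         # period 1, x_i1 = 0
        y2 = rng.random() < logistic(beta + eta)  # period 2, x_i2 = 1
        if (not y1) and y2:
            n01 += 1
        elif y1 and (not y2):
            n10 += 1
    return n01, n10


def conditional_logit(n01, n10):
    """Conditional ML estimate of beta for this two-period design."""
    return math.log(n01 / n10)


if __name__ == "__main__":
    n01, n10 = simulate_movers()
    print("conditional logit estimate:", conditional_logit(n01, n10))
```

Units with $y_{i1} = y_{i2}$ do not enter the estimate, which is the price paid for leaving the distribution of $\eta_i$ unrestricted.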

Exercise 5

Suppose that the demand for money of each firm $i$ responds to output with an elasticity of unity ($\epsilon = 1$) and satisfies

$$
m_{it} = y_{it} + \eta_i + w_t + e_{it}
$$

where $m_{it}$ and $y_{it}$ denote, respectively, the logs of cash holdings and output of firm $i$ in period $t$, $\eta_i$ represents the financial technology of the firm, $w_t$ captures aggregate effects, and $e_{it}$ is a transitory effect. Moreover, output satisfies

$$
y_{it} = \mu_i + p_t + c_{it}
$$

where $\mu_i$ captures firm size, $p_t$ aggregate effects, and $c_{it}$ transitory fluctuations independent of $\mu_i$ and $\eta_i$, with $\mathrm{Corr}(\mu_i, \eta_i) = 0.5$ and $\mathrm{Var}(\mu_i) = \mathrm{Var}(\eta_i) = 4\,\mathrm{Var}(c_{it})$. In addition, $w_t$ and $p_t$ follow a VAR process:

$$
\begin{aligned}
w_t &= 0.6\,w_{t-1} + 0.2\,p_{t-1} + u_t \\
p_t &= 0.8\,p_{t-1} + \upsilon_t
\end{aligned}
\qquad \mathrm{Cov}(u_t, \upsilon_t) = 0 .
$$

(i) A researcher decides to use time series aggregate data $m_t$ and $y_t$ to estimate $\epsilon$. Here $m_t$ and $y_t$ are averages such that $e_{it}$ and $c_{it}$ cancel out, satisfying $m_t = \eta + y_t + w_t$ and $y_t = \mu + p_t$. The researcher is also interested in measuring adjustment costs, and because of this he estimates by OLS:

$$
m_t = \delta + \alpha m_{t-1} + \beta_0 y_t + \beta_1 y_{t-1} + v_t
$$

getting an estimate of $\alpha$ of around 0.6 and an estimate of the long-run elasticity of 1.5. Next, he tests and rejects the restriction $\beta_1 + \alpha \beta_0 = 0$, hence discarding the possibility that the equation dynamics are capturing autocorrelation in the errors. Finally, he erroneously concludes that adjustment costs are important and that $\epsilon$ is greater than unity. Explain why these results were obtained.

(ii) What would happen if, in order to take into account the endogeneity of $y_t$, the equation was reestimated by 2SLS using as instruments $m_{t-j}$ and $y_{t-j}$ $(j = 1, 2)$?

(iii) Explain the difficulties that another researcher would find in trying to use a cross-sectional sample of firms in order to estimate $\epsilon$.
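
As a starting point for part (i) of Exercise 5, the aggregate relations can be substituted into the VAR equation for $w_t$. The lines below are a sketch of that algebra using only the equations stated in the exercise; the interpretation of the result is left to the discussion.

```latex
\begin{aligned}
w_{t-1} &= m_{t-1} - \eta - y_{t-1}, \qquad p_{t-1} = y_{t-1} - \mu, \\
m_t &= \eta + y_t + w_t \\
    &= \eta + y_t + 0.6\,w_{t-1} + 0.2\,p_{t-1} + u_t \\
    &= (0.4\,\eta - 0.2\,\mu) + 0.6\,m_{t-1} + y_t - 0.4\,y_{t-1} + u_t .
\end{aligned}
```

Matching terms with the estimated equation gives $\alpha = 0.6$, $\beta_0 = 1$ and $\beta_1 = -0.4$, so that $(\beta_0 + \beta_1)/(1 - \alpha) = 1.5$ and $\beta_1 + \alpha \beta_0 = 0.2 \neq 0$.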

Exercise 6

Let us consider the following model for the consumption of a good from an addictive consumer, in the absence of uncertainty about future prices:

$$
c_t = \theta c_{t-1} + \beta \theta c_{t+1} + \gamma p_t + u_t \qquad 0 \le \theta < 1, \quad 0 \le \beta < 1
$$

where $c_t$ is consumption in $t$, $p_t$ price, and $u_t$ an unobservable term that captures changes in the consumer's preferences. The variables $p_t$ and $u_t$ are stationary processes such that $\mathrm{Cov}(u_t, p_s) = 0$ for all $t$ and $s$. The coefficient $\theta$ measures the degree of addiction and $\beta$ is a discount factor.

(i) Show that the OLS estimates of $(\theta, \beta\theta, \gamma)$ obtained as a regression of $c_t$ on $c_{t-1}$, $c_{t+1}$ and $p_t$ are generally inconsistent.

(ii) Propose a consistent estimation method for $\theta$, $\beta$ and $\gamma$. Under which conditions are $\theta$, $\beta$ and $\gamma$ identified?

(iii) Explain how to obtain asymptotic standard errors for the estimates suggested in (ii) and discuss the assumptions under which they are valid.

(iv) Explain the concept of Granger non-causality and find the conditions under which $c$ does not Granger cause $p$.