Quantile Regression for Dynamic Panel Data

Antonio Galvao
Department of Economics, University of Illinois
NASM Econometric Society 2008, June 22nd, 2008

Panel Data
Panel data make it possible to follow the same individuals over time, which allows the researcher to control for individual-specific effects.
Semiparametric panel data models have recently attracted considerable interest in both theory and applications because they leave the error distribution unspecified: Honoré and Lewbel (2002), Koenker (2004), Abrevaya and Dahl (2008).

Quantile Regression Panel Data
Koenker (2004), Quantile Regression for Panel Data.
Conditional quantile functions of the response $y_{it}$:
$$Q_{y_{it}}(\tau \mid \eta_i, x_{it}) = \eta_i + \beta(\tau) x_{it}, \qquad i = 1, \dots, n, \quad t = 1, \dots, T$$
In some applications it is of interest to explore a broad class of covariate effects (location and scale shifts) while still accounting for individual-specific effects.
Such models enable the investigator to explore various forms of heterogeneity associated with the covariates under less stringent distributional assumptions.

Quantile Regression Panel Data (Example)
Example: multiple observations on each individual over time.
$$Q_{y_{it}}(\tau \mid \eta_i, x_{it}) = \eta_i + \beta(\tau) x_{it}, \qquad i = 1, 2, 3, \quad t = 1, \dots, 50$$

Quantile Regression Panel Data (Example)
[Figure: plot of $y_{it}$ against $x_{it}$ for the example above.]

Quantile Regression for Dynamic Panel Data
It is important to be able to analyze longitudinal data allowing for both individual effects and lagged dependent variables within the quantile regression framework:
$$Q_{y_{it}}(\tau \mid \eta_i, y_{it-1}, x_{it}) = \eta_i(\tau) + \alpha(\tau) y_{it-1} + \beta(\tau) x_{it}$$
Researchers very often wish to use longitudinal data to estimate behavioral relationships that are dynamic in character, namely models containing lagged dependent variables.
The approach allows the investigator to explore a range of conditional quantile functions, thereby exposing a variety of forms of conditional dynamic heterogeneity, e.g. asymmetric adjustments.

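Bias in Dynamic Panel Data
Including a lagged dependent variable induces bias in the estimators of standard dynamic panel models:
$$y_{it} = \eta_i + \alpha y_{it-1} + u_{it}$$
The LSDV estimator has a bias of order $O(1/T)$: the explanatory variable and the residual are correlated in the transformed model
$$(y_{it} - \bar{y}_{i}) = \alpha (y_{it-1} - \bar{y}_{i,-1}) + (u_{it} - \bar{u}_{i}),$$
where $\bar{y}_{i} = \sum_{t=1}^{T} y_{it}/T$, and $\bar{y}_{i,-1}$ and $\bar{u}_{i}$ are the corresponding time averages of $y_{it-1}$ and $u_{it}$.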
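The $O(1/T)$ bias of the within-group (LSDV) estimator can be illustrated with a small simulation. The sketch below is not taken from the slides: the function names, the standard normal effects and errors, the burn-in length, and the sample sizes are all assumptions chosen for illustration.

```python
import random

def simulate_panel(n, t, alpha, rng, burn_in=50):
    """Simulate y_it = eta_i + alpha * y_it-1 + u_it.
    Assumed design: eta_i and u_it are independent N(0, 1)."""
    panel = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        y = eta / (1.0 - alpha)      # start at the individual's long-run mean
        series = []
        for s in range(burn_in + t + 1):
            y = eta + alpha * y + rng.gauss(0.0, 1.0)
            if s >= burn_in:
                series.append(y)     # keeps t + 1 values: y_i0, ..., y_iT
        panel.append(series)
    return panel

def within_group_alpha(panel):
    """Within-group estimate of alpha: regress demeaned y_it on demeaned y_it-1."""
    num = den = 0.0
    for series in panel:
        y_cur, y_lag = series[1:], series[:-1]
        m_cur = sum(y_cur) / len(y_cur)
        m_lag = sum(y_lag) / len(y_lag)
        for cur, lag in zip(y_cur, y_lag):
            num += (lag - m_lag) * (cur - m_cur)
            den += (lag - m_lag) ** 2
    return num / den

rng = random.Random(0)
true_alpha = 0.5
estimates = {}
for t in (5, 10, 25):
    estimates[t] = within_group_alpha(simulate_panel(2000, t, true_alpha, rng))
    print(f"T={t:2d}  WG estimate={estimates[t]:.3f}  bias={estimates[t] - true_alpha:+.3f}")
```

Even with a large cross-section, the estimate stays below the true value for short panels and moves toward it only as T grows, which is the pattern the transformed model predicts.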

Reducing Bias in Dynamic Panel Data
There is an extensive literature on the estimation of fixed effects dynamic panel data models in the linear regression setting.
Various instrumental variables (IV) and generalized method of moments (GMM) estimators have been proposed to overcome the inconsistency of the estimators when T is fixed: Anderson and Hsiao (1981, 1982), Arellano and Bond (1991).
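As a rough illustration of the IV idea in the linear case, the sketch below first-differences the model to remove $\eta_i$ and instruments the lagged difference with the level $y_{it-2}$, in the spirit of Anderson and Hsiao. The simulation design (standard normal effects and errors, burn-in, sample sizes) and all function names are assumptions made for this example, not details from the slides.

```python
import random

def simulate_panel(n, t, alpha, rng, burn_in=50):
    """Simulate y_it = eta_i + alpha * y_it-1 + u_it with assumed N(0, 1) draws."""
    panel = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        y = eta / (1.0 - alpha)
        series = []
        for s in range(burn_in + t + 1):
            y = eta + alpha * y + rng.gauss(0.0, 1.0)
            if s >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def anderson_hsiao_alpha(panel):
    """Simple IV estimate from the differenced model
    (y_it - y_it-1) = alpha * (y_it-1 - y_it-2) + (u_it - u_it-1),
    using the level y_it-2 as the instrument for the lagged difference."""
    num = den = 0.0
    for series in panel:
        for t in range(2, len(series)):
            dy = series[t] - series[t - 1]
            dy_lag = series[t - 1] - series[t - 2]
            z = series[t - 2]
            num += z * dy
            den += z * dy_lag
    return num / den

rng = random.Random(0)
true_alpha = 0.5
ah_estimate = anderson_hsiao_alpha(simulate_panel(5000, 5, true_alpha, rng))
print(f"Anderson-Hsiao type estimate with T=5: {ah_estimate:.3f}")
```

The instrument is valid because $y_{it-2}$ is determined before $u_{it-1}$ and $u_{it}$ are drawn, so consistency here does not rely on T being large.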

Main Contribution
The model includes a lagged dependent variable as well as fixed effects:
$$Q_{y_{it}}(\tau \mid \eta_i, y_{it-1}, x_{it}) = \eta_i(\tau) + \alpha(\tau) y_{it-1} + \beta(\tau) x_{it}$$
We propose an instrumental variables strategy to estimate quantile regression for dynamic panel data with fixed effects and to reduce the dynamic bias.
The estimator uses lagged dependent observations (or lagged differences) as instruments.
The estimator builds on the approach of Chernozhukov and Hansen (2005, 2006, 2008).
We show consistency and asymptotic normality of the estimator, provided $N^{a}/T \to 0$ for some $a > 0$.

Digression on IV for Quantile Regression
Consider the following quantile regression model,
$$Q_Y(\tau \mid X) = X\beta(\tau),$$
where $Y$ is the outcome variable and $X$ denotes the exogenous variables of interest.
Then note that
$$Y = X\beta(U), \qquad U \mid X \sim U(0, 1).$$
Koenker and Bassett (1978) show that
$$\hat\beta = \arg\min_{\beta} \rho_\tau(Y - X\beta), \qquad \text{where } \rho_\tau(u) = u(\tau - I(u < 0)).$$
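To make the check function concrete, the sketch below evaluates $\rho_\tau$ and verifies numerically that minimizing the summed check loss over a constant recovers a sample quantile. The helper names and the simulated standard normal sample are assumptions introduced for illustration.

```python
import random

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def quantile_by_check_loss(sample, tau):
    """Minimize sum_i rho_tau(y_i - b) over b.
    A minimizer is always attained at a data point, so searching the sample suffices."""
    return min(sample, key=lambda b: sum(check_loss(y - b, tau) for y in sample))

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(501)]
for tau in (0.1, 0.5, 0.9):
    print(f"tau={tau}: minimizer of the check loss = {quantile_by_check_loss(sample, tau):.3f}")
```

With an odd sample size the minimizer at $\tau = 0.5$ is exactly the middle order statistic, i.e. the sample median.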

Digression on IV for Quantile Regression (Cont.)
$\hat\beta(\tau)$ is based on the moment condition $\psi_\tau(R) \perp X$, where $R$ is the residual $Y - X\beta$ and $\psi_\tau(u) = \tau - I(u < 0)$.
Suppose that $\psi_\tau(R) \perp X$ does not hold, but we can find $W$ such that $\psi_\tau(R) \perp W$.
The model is still the same, and we want to estimate $\beta$:
$$Q_Y(\tau \mid X) = X\beta(\tau)$$
But now
$$Y = X\beta(U), \qquad U \mid W \sim U(0, 1).$$

Digression on IV for Quantile Regression (Cont.)
$W$ does not belong to the model. Thus, for fixed $\beta$, in the quantile regression of $(Y - X\beta)$ on $W$, $W$ should have coefficient zero.
The estimator is defined as follows. For fixed $\beta$,
$$\hat\gamma(\beta) = \arg\min_{\gamma} \rho_\tau(Y - X\beta - W\gamma),$$
$$\hat\beta = \arg\min_{\beta} \|\hat\gamma(\beta)\|.$$
The $\hat\beta(\tau)$ that makes $\hat\gamma(\tau) = 0$ is the instrumental variables estimator.
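The two-step procedure above can be sketched in a deliberately simple setting: a scalar endogenous regressor and a binary instrument. With a binary $W$ and an intercept, the quantile regression of $(Y - X\beta)$ on $W$ is saturated, so its coefficient on $W$ is just the difference between the two within-group quantiles and no numerical solver is needed. The data-generating process, grid, and function names below are assumptions for illustration only.

```python
import random

def sample_quantile(values, tau):
    """Order-statistic estimate of the tau-quantile (a minimizer of the check loss)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(tau * len(ordered)))]

def gamma_hat(beta, y, x, w, tau):
    """Coefficient on binary W in the quantile regression of (Y - X*beta)
    on an intercept and W: the difference of the two group quantiles."""
    resid0 = [yi - beta * xi for yi, xi, wi in zip(y, x, w) if wi == 0]
    resid1 = [yi - beta * xi for yi, xi, wi in zip(y, x, w) if wi == 1]
    return sample_quantile(resid1, tau) - sample_quantile(resid0, tau)

def iv_quantile_beta(y, x, w, tau, grid):
    """Grid search for the beta that drives |gamma_hat(beta)| closest to zero."""
    return min(grid, key=lambda b: abs(gamma_hat(b, y, x, w, tau)))

# Hypothetical data: X is endogenous because it shares the shock v with the
# error term; W shifts X but is independent of the error.
rng = random.Random(0)
n, true_beta = 4000, 1.0
w = [rng.randint(0, 1) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [wi + vi for wi, vi in zip(w, v)]
y = [true_beta * xi + 0.8 * vi + rng.gauss(0.0, 1.0) for xi, vi in zip(x, v)]

grid = [i / 100 for i in range(0, 201)]
beta_hat = iv_quantile_beta(y, x, w, tau=0.5, grid=grid)
print("IV quantile estimate of beta at tau = 0.5:", beta_hat)
```

The grid search mirrors the logic of the slide: at the true $\beta$ the instrument has no explanatory power for the quantiles of $Y - X\beta$, so the estimate is the grid point where the fitted coefficient on $W$ is nearest zero.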

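Dynamic Panel Quantile Regression Estimator
We now consider a finite-sample analog of the above procedure for dynamic panel data. The quantile regression instrumental variables (QRIV) estimator is defined as follows. Let
$$Q_{nT}(\tau, \eta, \alpha, \beta, \gamma) := \sum_{i=1}^{n} \sum_{t=1}^{T} \rho_\tau(y_{it} - \eta_i - \alpha y_{it-1} - \beta x_{it} - \gamma w_{it}).$$
For a given value of the structural parameter, say $\alpha$, one estimates the ordinary QR to obtain
$$(\hat\eta(\alpha, \tau), \hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau)) := \arg\min_{\eta, \beta, \gamma} Q_{nT}(\tau, \eta, \alpha, \beta, \gamma).$$
To find an estimate for $\alpha(\tau)$, we look for the value of $\alpha$ that makes the coefficient on the instrumental variable, $\hat\gamma(\alpha, \tau)$, as close to 0 as possible:
$$\hat\alpha(\tau) = \arg\min_{\alpha \in \mathcal{A}} \|\hat\gamma(\alpha, \tau)\|_{A}$$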

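Asymptotics for QRIV - Assumptions
A1 The $y_{it}$ are independent across individuals with conditional distribution functions $F_{it}$ and differentiable conditional densities, $0 < f_{it} < \infty$, with bounded derivatives $f'_{it}$ for $i = 1, \dots, N$ and $t = 1, \dots, T$;
A2 Define
$$\Pi(\eta, \alpha, \beta, \tau) \equiv E[(\tau - 1(Y < Z\eta + y_{-1}\alpha + X\beta))\check{X}(\tau)], \qquad \check{X}(\tau) \equiv [Z, W, X].$$
The matrix $\frac{\partial}{\partial(\eta, \alpha, \beta)} \Pi(\eta, \alpha, \beta, \tau)$ is continuous and has full rank at each $(\eta, \alpha, \beta)$ in $\Theta$;
A3 The image of $\Theta$ under the mapping $(\eta, \alpha, \beta) \mapsto \Pi(\eta, \alpha, \beta, \tau)$ is simply connected;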

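Asymptotics for QRIV - Assumptions (Cont.)
A4 For all $\tau$, $(\alpha(\tau), \beta(\tau)) \in \operatorname{int} \mathcal{A} \times \mathcal{B}$, and $\mathcal{A} \times \mathcal{B}$ is compact and convex;
A5 $\max_{it} \|y_{it}\| = O(\sqrt{NT})$; $\max_{it} \|x_{it}\| = O(\sqrt{NT})$; $\max_{it} \|w_{it}\| = O(\sqrt{NT})$;
A6 $N^{a}/T \to 0$, for some $a > 0$;
A7 Denote $\Phi(\tau) = \operatorname{diag}(f_{it}(\xi_{it}(\tau)))$, $M_Z = I - P_Z$ and $P_Z = Z(Z'\Phi(\tau)Z)^{-1}Z'\Phi(\tau)$. Let $\tilde{X} = [W, X]$. Then the following matrix is invertible:
$$J_\vartheta = (\tilde{X}' M_Z' \Phi(\tau) M_Z \tilde{X}).$$
Now define $[\bar{J}_\beta', \bar{J}_\gamma']'$ as a partition of $J_\vartheta^{-1}$, $J_\alpha = (\tilde{X}' M_Z' \Phi(\tau) M_Z y_{-1})$ and $H = \bar{J}_\gamma' A[\alpha(\tau)] \bar{J}_\gamma$. Then $J_\alpha' H J_\alpha$ is also invertible.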

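Asymptotics for QRIV - Results
Let $\theta(\tau) = (\alpha(\tau), \beta(\tau))$.
Theorem 1 Given assumptions A1-A6, $(\eta(\tau), \alpha(\tau), \beta(\tau))$ uniquely solves the equations
$$E[\psi_\tau(Y - Z\eta - y_{-1}\alpha - X\beta)\check{X}(\tau)] = 0$$
over $\Theta$, and $\theta(\tau) = (\alpha(\tau), \beta(\tau))$ is consistently estimable.
Theorem 2 (Asymptotic Normality) Under conditions A1-A7, for given $\tau$, $\hat\theta(\tau)$ converges to a Gaussian distribution as
$$\sqrt{NT}(\hat\theta(\tau) - \theta(\tau)) \xrightarrow{d} N(0, \Omega(\tau)), \qquad \text{where } \Omega(\tau) = (K', L')' S (K', L').$$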

Monte Carlo - Description
We evaluate the finite sample performance of the quantile regression instrumental variables estimator in terms of bias and RMSE.
Model:
$$y_{it} = \eta_i + \alpha y_{it-1} + \beta x_{it} + u_{it}$$
Two schemes are used to generate the disturbances $u_{it}$:
Scheme 1: $u_{it} \sim N(0, \sigma_u^2)$
Scheme 2: $u_{it} \sim t_3$

Monte Carlo - Description (Cont.)
The regressor $x_{it}$ is generated according to
$$x_{it} = \mu_i + \zeta_{it},$$
where $\zeta_{it}$ follows the ARMA(1, 1) process
$$(1 - \phi L)\zeta_{it} = \epsilon_{it} + \theta\epsilon_{it-1}$$
and $\epsilon_{it}$ follows the same distribution as $u_{it}$, that is, the normal distribution for Scheme 1 and the $t_3$ distribution for Scheme 2.

Monte Carlo - Description (Cont.)
The fixed effects, $\mu_i$ and $\eta_i$, are generated as
$$\mu_i = e_{1i} + T^{-1}\sum_{t=1}^{T} \epsilon_{it}, \qquad e_{1i} \sim N(0, \sigma_{e_1}^2),$$
$$\eta_i = e_{2i} + T^{-1}\sum_{t=1}^{T} x_{it}, \qquad e_{2i} \sim N(0, \sigma_{e_2}^2).$$
In the simulations, we experiment with $T = 5, 10, 25$ and $N = 50, 100$.
We consider the following values for the remaining parameters: $(\alpha, \beta) = (0.4, 0.6), (0.8, 0.2)$; $\phi = 0.6$, $\theta = 0.2$, $\sigma_u^2 = \sigma_{e_1}^2 = \sigma_{e_2}^2 = 1$.
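A minimal sketch of how one replication of this design might be generated is given below. The slides do not state the initial conditions, so the zero start values, the burn-in of 50 periods, and the choice to average over the retained T observations are assumptions, as are all function and parameter names.

```python
import math
import random

def draw_shock(rng, scheme):
    """Scheme 1: standard normal. Scheme 2: Student t with 3 degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    if scheme == 1:
        return z
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
    return z / math.sqrt(chi2 / 3.0)

def simulate_design(n, t, alpha, beta, rng, scheme=1, phi=0.6, theta=0.2, burn_in=50):
    """Return (ys, xs): two lists of n series, each holding t observations."""
    total = burn_in + t
    ys, xs = [], []
    for _ in range(n):
        # ARMA(1,1): zeta_t = phi * zeta_{t-1} + eps_t + theta * eps_{t-1}
        eps = [draw_shock(rng, scheme) for _ in range(total)]
        zeta, prev_zeta, prev_eps = [], 0.0, 0.0
        for e in eps:
            prev_zeta = phi * prev_zeta + e + theta * prev_eps
            prev_eps = e
            zeta.append(prev_zeta)
        # mu_i = e_1i + time average of eps (assumed: over the retained sample)
        mu = rng.gauss(0.0, 1.0) + sum(eps[burn_in:]) / t
        x = [mu + z for z in zeta]
        # eta_i = e_2i + time average of x (assumed: over the retained sample)
        eta = rng.gauss(0.0, 1.0) + sum(x[burn_in:]) / t
        y, prev_y = [], 0.0
        for s in range(total):
            prev_y = eta + alpha * prev_y + beta * x[s] + draw_shock(rng, scheme)
            y.append(prev_y)
        ys.append(y[burn_in:])
        xs.append(x[burn_in:])
    return ys, xs

rng = random.Random(0)
ys, xs = simulate_design(n=50, t=10, alpha=0.8, beta=0.2, rng=rng, scheme=2)
print(len(ys), "individuals with", len(ys[0]), "observations each")
```

Because $\eta_i$ is built from the time average of $x_{it}$, the individual effects are correlated with the regressor by construction, which is what makes a fixed effects treatment necessary in this design.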

Monte Carlo - Results

| | | WG | OLS-IV | PQR | QRIV |
|---|---|---|---|---|---|
| $\alpha = 0.8$ | Bias | 0.211 | 0.006 | 0.214 | 0.008 |
| | RMSE | 0.196 | 0.231 | 0.255 | 0.285 |
| $\alpha = 0.4$ | Bias | 0.102 | 0.004 | 0.104 | 0.005 |
| | RMSE | 0.161 | 0.187 | 0.218 | 0.236 |

Table: Location-Shift Model: Bias and RMSE of Estimators for Normal Distribution (T = 10 and N = 50)

Monte Carlo - Results (Cont.)

| | | WG | OLS-IV | PQR | QRIV |
|---|---|---|---|---|---|
| $\alpha = 0.8$ | Bias | 0.225 | 0.029 | 0.194 | 0.002 |
| | RMSE | 0.309 | 0.334 | 0.273 | 0.303 |
| $\alpha = 0.4$ | Bias | 0.115 | 0.018 | 0.094 | 0.008 |
| | RMSE | 0.286 | 0.305 | 0.253 | 0.269 |

Table: Location-Shift Model: Bias and RMSE of Estimators for $t_3$ Distribution (T = 10 and N = 50)

Application - Habit Formation
We test for the presence of habit formation using household data.
With habit formation, current utility depends not only on current expenditures but also on a habit stock formed by lagged expenditures.
Consumption services in period t are positively related to current expenditure and negatively related to lagged expenditure, Dynan (2000):
$$\tilde{c}_{i,t} = c_{i,t} - \alpha c_{i,t-1}$$

Application - Habit Formation
We estimate the following model:
$$Q_{C_{it}}(\tau \mid \mathcal{F}_{it-1}) = \eta_i(\tau) + \alpha(\tau) C_{it-1} + X_{it}\beta(\tau),$$
where $C_{it} = \Delta \ln c_{it}$.
Test $H_0 : \alpha(\tau) = 0$.
$X_{it}$ is a set of covariates.
$W_{it}$ is a set of instruments.

Application - Habit Formation
Data: Panel Study of Income Dynamics (PSID), 2132 households, each with 13 observations.
$C_{it}$: food expenditure growth, used as a proxy for consumption expenditures.
$X_{it}$: change in family size, age of the head of the household, age of the head of the household squared, and race.
$W_{it}$: $C_{it-2}$, $C_{it-3}$, and dummies for income growth.

Application - Habit Formation
$H_0 : \alpha(\tau) = 0$

| Quantile | $\hat\alpha$ | sd | $W_n$ |
|---|---|---|---|
| 0.1 | 0.052 | 0.024 | 4.808 |
| 0.2 | 0.046 | 0.026 | 3.003 |
| 0.3 | 0.033 | 0.026 | 1.587 |
| 0.4 | 0.018 | 0.028 | 0.419 |
| 0.5 | 0.025 | 0.029 | 0.726 |
| 0.6 | 0.036 | 0.029 | 1.514 |
| 0.7 | 0.035 | 0.027 | 1.643 |
| 0.8 | 0.016 | 0.032 | 0.247 |
| 0.9 | 0.004 | 0.043 | 0.009 |
| TSLS | 0.034 | 0.023 | 2.168 |

Table: QRIV and TSLS tests for Habit Formation in Consumption
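One way to read the $W_n$ column is as a Wald statistic for $H_0 : \alpha(\tau) = 0$ with a single restriction. The slides do not state its reference distribution; the sketch below assumes it is asymptotically chi-square with one degree of freedom under $H_0$, and the helper name is hypothetical. It also prints $(\hat\alpha/\text{sd})^2$ from the rounded table entries for comparison with the reported $W_n$.

```python
import math

def chi2_1_pvalue(w):
    """Upper-tail probability of a chi-square(1) variable: P(X > w) = erfc(sqrt(w / 2))."""
    return math.erfc(math.sqrt(w / 2.0))

# (label, alpha_hat, sd, W_n) copied from the table above
rows = [
    ("0.1", 0.052, 0.024, 4.808),
    ("0.2", 0.046, 0.026, 3.003),
    ("0.3", 0.033, 0.026, 1.587),
    ("0.4", 0.018, 0.028, 0.419),
    ("0.5", 0.025, 0.029, 0.726),
    ("0.6", 0.036, 0.029, 1.514),
    ("0.7", 0.035, 0.027, 1.643),
    ("0.8", 0.016, 0.032, 0.247),
    ("0.9", 0.004, 0.043, 0.009),
    ("TSLS", 0.034, 0.023, 2.168),
]

for label, a_hat, sd, w_n in rows:
    ratio_sq = (a_hat / sd) ** 2   # squared t-ratio from the rounded entries
    print(f"{label:>5}  (alpha_hat/sd)^2={ratio_sq:6.3f}  W_n={w_n:6.3f}  "
          f"p-value={chi2_1_pvalue(w_n):.3f}")
```

Under that assumption, a statistic above roughly 3.84 corresponds to rejection at the 5% level.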

Conclusions
We propose a quantile regression model for dynamic panel data with fixed effects.
Estimation is based on instrumental variables.
We show consistency and asymptotic normality of the estimator.
We apply the estimator and the associated test to habit formation in consumption.