SFB 823
The exact bias of s² in linear panel regressions with spatial autocorrelation
Discussion Paper Nr. 8/00
Christoph Hanck, Walter Krämer
The exact bias of s² in linear panel regressions with spatial autocorrelation

Christoph Hanck and Walter Krämer

May 2010

Abstract

We investigate the OLS-based estimator s² of the disturbance variance in an error component linear panel regression model when the disturbances are homoskedastic, but spatially correlated. Although consistent (Song and Lee, Econ. Lett. 2008), s² can be arbitrarily biased towards zero in finite samples.

Keywords: panel data, spatial error correlation, bias

1 Introduction and Model

We consider the following standard linear panel regression model

    y_it = X_it′β + u_it,   i = 1, …, N;  t = 1, …, T,    (1)

where i indexes units and t indexes time. The dependent variable y_it is affected by the (for simplicity fixed) K×1 regressor vector X_it through the unknown K×1 vector β and the error u_it. The errors follow the one-way error component structure

    u_it = μ_i + ε_it,   i = 1, …, N;  t = 1, …, T,

where the individual-specific effect μ_i is i.i.d. with mean 0 and variance σ²_μ. In matrix notation, (1) can be written as y = Xβ + u, where the observations are stacked such that y, u = μ ⊗ ι_T + ε and ε are NT×1 (ι_T is a T-vector of ones), while X is NT×K with rank K, μ = (μ_1, …, μ_N)′ is N×1 and ε collects the ε_it conformably. The idiosyncratic error ε_t = (ε_1t, …, ε_Nt)′ is generated by a first-order spatial autoregressive process

    ε_t = ρW ε_t + ν_t,    (2)

Corresponding author. Rijksuniversiteit Groningen, Department of Economics, Econometrics and Finance, Nettelbosje 2, 9747 AE Groningen, Netherlands. Tel. +31 50 363 3836, Fax +31 50 363 7337. c.h.hanck@rug.nl.
Universität Dortmund, Fakultät Statistik, Vogelpothsweg 78, 44227 Dortmund, Germany. Research supported by DFG under Sonderforschungsbereich 823. Tel. +49 231 755 35. walterk@statistik.uni-dortmund.de.
where ρ is the scalar spatial autoregressive coefficient and the elements of ν = (ν_1′, …, ν_T′)′ are i.i.d. (0, σ²_ν). The N×N matrix W has known nonnegative spatial weights with w_ii = 0 (i = 1, …, N). Such patterns of dependence are often entertained when the objects under study are positioned in some space, whether geographical or sociological, and account for spillovers from one unit to its neighbors, however neighborhood may be defined. They date back to Whittle [1954] and have become quite popular in econometrics recently; see Anselin [2001] for a survey of this literature.

The coefficient ρ in (2) measures the degree of correlation, which can be both positive and negative. Below we focus on the empirically more relevant case of positive disturbance correlation, where 0 ≤ ρ < 1/λ_max and where λ_max is the Frobenius root of W (i.e. the unique positive real eigenvalue such that λ_max ≥ |λ_i| for arbitrary eigenvalues λ_i).

Under (2), u satisfies u = (I_N ⊗ ι_T)μ + (B⁻¹ ⊗ I_T)ν, with I_M the identity matrix of dimension M and B = I_N − ρW. Furthermore, the variance-covariance matrix of u becomes

    E(uu′) =: Ω = σ²_μ [I_N ⊗ J_T] + σ²_ν [(B′B)⁻¹ ⊗ I_T],    (3)

where J_T is a T×T matrix of ones. The pooled OLS estimate for β is β̂ = (X′X)⁻¹X′y, and the OLS-based estimate for σ² := Var(u_it) is

    s² = (y − Xβ̂)′(y − Xβ̂)/(NT − K) = u′Mu/(NT − K),    (4)

where M = I − X(X′X)⁻¹X′. The present paper shows that, while consistent, s² can still be arbitrarily biased.

Of course, for our analysis to make sense, the main diagonal of Ω should be constant, i.e. Ω = σ²Σ, where Σ is the correlation matrix of u. It is therefore important to clarify that many, though not all, spatial autocorrelation schemes imply homoskedasticity of Ω. Consider for instance the following popular specification for W, known as "one ahead and one behind":

    W^(1) :=  0 1 0 ⋯ 0 1
              1 0 1 ⋯ 0 0
              0 1 0 ⋱ ⋮ ⋮
              ⋮ ⋱ ⋱ ⋱ 1 0
              0 0 ⋯ 1 0 1
              1 0 ⋯ 0 1 0

and renormalize the rows such that the row sums are 1, i.e. take W^(1)/2.
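This construction is easy to check numerically. The following sketch (parameter values are arbitrary illustrations, not taken from the paper; the function name is ours) builds the row-normalized circulant W^(1)/2 and verifies that the main diagonal of Ω in (3) is constant, i.e. that the scheme is homoskedastic:

```python
import numpy as np

def w_one_ahead_one_behind(n):
    """Row-normalized circulant 'one ahead and one behind' matrix W^(1)/2."""
    w = np.zeros((n, n))
    for i in range(n):
        w[i, (i + 1) % n] = 0.5  # one ahead
        w[i, (i - 1) % n] = 0.5  # one behind
    return w

# Arbitrary illustrative values
N, T, rho, sig_mu2, sig_nu2 = 6, 3, 0.4, 1.0, 1.5
W = w_one_ahead_one_behind(N)
B = np.eye(N) - rho * W
# Omega = sigma_mu^2 (I_N x J_T) + sigma_nu^2 ((B'B)^{-1} x I_T), cf. (3)
Omega = sig_mu2 * np.kron(np.eye(N), np.ones((T, T))) \
      + sig_nu2 * np.kron(np.linalg.inv(B.T @ B), np.eye(T))
print(np.allclose(np.diag(Omega), Omega[0, 0]))  # True: constant diagonal
```

Because W^(1) is circulant, (B′B)⁻¹ is circulant too, which is what forces the constant diagonal here.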
Then it is easily seen that E(u²_it) is independent of i and t, and analogous results hold for the more general "j ahead and j behind" weight matrix W^(j), which has non-zero elements in the j entries before and after the main diagonal, with the non-zero entries equal to 1/(2j). This specification has been considered by, for instance,
Kelejian and Prucha [1999] and Krämer and Donninger [1987]. The condition is also met if W is the equal-weight matrix (see, e.g., Kelejian and Prucha [2002], Lee [2004] or Case [1992]) W_EW, defined by

    W_EW = (w^EW_ij),  where  w^EW_ij = 1/(N − 1) for i ≠ j  and  w^EW_ij = 0 for i = j.

Our results therefore hold for both W = W_EW and W = W^(1)/2, among others. In the sequel, we work with W = W_EW for brevity, following Song and Lee [2008].

It has long been known that s² is in general (and contrary to β̂) biased for σ² whenever Ω is no longer a multiple of the identity matrix. Krämer [1991] and Krämer and Berghoff [1991] show that this problem disappears asymptotically for certain types of temporal correlation such as stationary AR(1) disturbances in a standard linear regression model, although it is clear from Kiviet and Krämer [1992] that the relative bias of s² might still be substantial for any finite sample size. Recently, Song and Lee [2008] proved consistency of s² for σ² also under the panel model (1), for W = W_EW. We extend their results to the finite-sample case by showing that s² can nevertheless be arbitrarily biased for σ².

2 The relative bias of s² in finite samples

We have

    E(s²/σ²) = (1/σ²) E[(NT − K)⁻¹ u′Mu] = tr(MΩ)/[σ²(NT − K)] = tr(MΣ)/(NT − K).    (5)

Watson [1955] and Sathe and Vinod [1974] derive the (attainable) bounds

    [mean of the NT − K smallest eigenvalues of Σ] ≤ E(s²/σ²) ≤ [mean of the NT − K largest eigenvalues of Σ],    (6)

which shows that the bias can be both positive and negative, depending on the regressor matrix X, whatever Σ may be. Finally, Dufour [1986] points out that the inequalities (6) amount to

    0 ≤ E(s²/σ²) ≤ NT/(NT − K)    (7)

when no restrictions are placed on X and Σ. Again, these bounds are sharp and show that underestimation of σ² is more likely than overestimation.
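The exact relative bias (5) and Dufour's bounds (7) can be evaluated directly. The sketch below (our own helper names; N, T, ρ and the random design are arbitrary assumptions for illustration) computes tr(MΣ)/(NT − K) for W = W_EW and confirms that the result lies inside the bounds:

```python
import numpy as np

def relative_bias(X, Sigma):
    """E(s^2/sigma^2) = tr(M Sigma)/(NT - K), cf. (5)."""
    NT, K = X.shape
    M = np.eye(NT) - X @ np.linalg.solve(X.T @ X, X.T)  # residual maker
    return np.trace(M @ Sigma) / (NT - K)

def sigma_w_ew(N, T, rho, sig_mu2=1.0, sig_nu2=1.0):
    """Correlation matrix of u under (3) with the equal-weight matrix W_EW."""
    W = (np.ones((N, N)) - np.eye(N)) / (N - 1)
    B = np.eye(N) - rho * W
    Omega = sig_mu2 * np.kron(np.eye(N), np.ones((T, T))) \
          + sig_nu2 * np.kron(np.linalg.inv(B.T @ B), np.eye(T))
    return Omega / Omega[0, 0]  # constant diagonal -> divide by sigma^2

# Arbitrary illustrative design
N, T, K, rho = 5, 2, 3, 0.5
rng = np.random.default_rng(0)
X = rng.standard_normal((N * T, K))
b = relative_bias(X, sigma_w_ew(N, T, rho))
print(0.0 <= b <= N * T / (N * T - K))  # True: Dufour's bounds (7) hold
```

With Σ = I the function returns exactly 1, as it should when the disturbances are spherical.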
The problem with Dufour's bounds is that they are unnecessarily wide when extra information on Ω is available. Here we assume Ω to satisfy (3) and show that the relative bias of s² depends crucially on the interplay between X and W. In particular, irrespective of N and of the weighting matrix W, there is always a regressor matrix X such that E(s²/σ²) becomes as close to zero as desired.

To see this, note that the above W are symmetric. Hence, we can write W = ∑_{i=1}^{N} λ_i ω_i ω_i′, the spectral decomposition of W, with the eigenvalues λ_i in increasing order and ω_i the corresponding orthonormal eigenvectors. We directly obtain from (5) that

    lim_{ρ→1/λ_N} E(s²/σ²) = 0    (8)

whenever lim_{ρ→1/λ_N} tr(MΣ) = 0. Since M is constant w.r.t. ρ and the trace is continuous, we need lim_{ρ→1/λ_N} Σ to investigate the limiting bias.

Lemma 1. Let ω̃_i = ω_j ⊗ e_k, with e_k an eigenvector of I_T, denote the eigenvectors of (B′B)⁻¹ ⊗ I_T, ordered such that the corresponding eigenvalues are increasing, and let ω̃²_i1 be the (1,1)-element of ω̃_i ω̃_i′ (under homoskedasticity, we could select any diagonal element). When ρ → 1/λ_N, Σ tends to

    Σ̄ := [∑_{i=T(N−1)+1}^{NT} ω̃_i ω̃_i′] / [∑_{i=T(N−1)+1}^{NT} ω̃²_i1].

Proof. Using symmetry of W, write

    Ω = σ²_μ [I_N ⊗ J_T] + σ²_ν [((I_N − ρW)′(I_N − ρW))⁻¹ ⊗ I_T].

From e.g. Lütkepohl [1996, Sec. 5.2.1], ((I_N − ρW)′(I_N − ρW))⁻¹ ⊗ I_T has eigenvalues (1 − ρλ_i)⁻² with multiplicity T each, as λ_j(I_T) = 1, j = 1, …, T, using standard results on eigenvalues of Kronecker products [e.g. Abadir and Magnus, 2005, Ch. 10]. Hence, we can write

    Σ = Ω/σ² = {σ²_μ [I_N ⊗ J_T] + σ²_ν ∑_{i=1}^{NT} (1 − ρλ_i)⁻² ω̃_i ω̃_i′} / {σ²_μ + σ²_ν ∑_{i=1}^{NT} (1 − ρλ_i)⁻² ω̃²_i1},

again using standard results on eigenvectors of Kronecker products [e.g. Abadir and Magnus, 2005, Ch. 10]. Note that the denominator is the constant diagonal element σ² = σ²_μ + σ²_ν ∑_{i=1}^{NT} (1 − ρλ_i)⁻² ω̃²_i1.
Multiplying numerator and denominator by (1 − ρλ_N)² yields

    Σ = {σ²_μ (1 − ρλ_N)² [I_N ⊗ J_T] + σ²_ν ∑_{i=1}^{NT} [(1 − ρλ_N)²/(1 − ρλ_i)²] ω̃_i ω̃_i′} / {σ²_μ (1 − ρλ_N)² + σ²_ν ∑_{i=1}^{NT} [(1 − ρλ_N)²/(1 − ρλ_i)²] ω̃²_i1}.

Now, as the T largest eigenvalues of (I_N − ρW)′(I_N − ρW))⁻¹ ⊗ I_T are equal to (1 − ρλ_N)⁻², we have lim_{ρ→1/λ_N} (1 − ρλ_N)²/(1 − ρλ_i)² = 1 for i = T(N − 1) + 1, …, NT and 0 else. Thus,

    lim_{ρ→1/λ_N} Σ = {0 + σ²_ν ∑_{i=T(N−1)+1}^{NT} ω̃_i ω̃_i′} / {0 + σ²_ν ∑_{i=T(N−1)+1}^{NT} ω̃²_i1} = Σ̄.  ∎
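The Lemma lends itself to a numerical check. In this sketch (small N and T and the variance parameters are arbitrary assumptions) we build Σ̄ from the eigenvectors of W_EW belonging to its Frobenius root and verify that Σ(ρ) approaches Σ̄ as ρ → 1/λ_N = 1:

```python
import numpy as np

N, T, sig_mu2, sig_nu2 = 4, 2, 1.0, 1.0
W = (np.ones((N, N)) - np.eye(N)) / (N - 1)   # W_EW, Frobenius root lambda_N = 1
lam, V = np.linalg.eigh(W)                     # eigenvalues in ascending order
omega_N = V[:, -1]                             # eigenvector for lambda_N
# Eigenvectors of (B'B)^{-1} x I_T for the T largest eigenvalues: omega_N x e_k
tilde = [np.kron(omega_N, e) for e in np.eye(T)]
num = sum(np.outer(w, w) for w in tilde)
Sigma_bar = num / num[0, 0]                    # Sigma_bar of the Lemma

def Sigma(rho):
    """Correlation matrix of u under (3) for given rho."""
    B = np.eye(N) - rho * W
    Om = sig_mu2 * np.kron(np.eye(N), np.ones((T, T))) \
       + sig_nu2 * np.kron(np.linalg.inv(B.T @ B), np.eye(T))
    return Om / Om[0, 0]

for rho in (0.9, 0.99, 0.999):
    print(rho, np.max(np.abs(Sigma(rho) - Sigma_bar)))  # shrinks towards 0
```

Since the Frobenius root of W_EW is simple, the sign ambiguity of the computed eigenvector is harmless: Σ̄ is built from outer products.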
Figure I: The relative bias of s² as a function of ρ and N, X = ι_N ⊗ I_2

Remark 1. The result, and by extension the Proposition below, also holds if σ²_μ = 0, i.e. if the individual-specific effects are homogeneous.

We now demonstrate that s² can be arbitrarily biased for σ².

Proposition 1. For any W such that σ² is constant over i and t, there always exists an X such that lim_{ρ→1/λ_N} E(s²/σ²) = 0.

Proof. Choose the columns of I_T as eigenvectors e_k of I_T. The largest eigenvalue λ_N of a row-normalized matrix such as W^(1)/2 or W_EW is 1. (This follows immediately from Theorem 8.1.22 of Horn and Johnson [1985].) Hence, ω_N = aι_N for some a ∈ ℝ \ {0}. ω_N then also is the eigenvector corresponding to the largest eigenvalue of (B′B)⁻¹, (1 − ρ)⁻². (This follows because, for e.g. W_EW,

    (B′B)⁻¹ = B⁻² = (δ₁ J_N + δ₂ I_N)² = (Nδ₁² + 2δ₁δ₂) J_N + δ₂² I_N,

where δ₁ = ρ/[(N − 1 + ρ)(1 − ρ)], δ₂ = (N − 1)/(N − 1 + ρ), is a matrix with constant on- and off-diagonal elements. Then, aB⁻²ι_N is a vector with identical elements a[Nδ₁² + 2δ₁δ₂ + δ₂²]. Some algebra shows this to be equal to a(1 − ρ)⁻².) Hence, (ω̃_i)_{i=T(N−1)+1,…,NT} = (aι_N) ⊗ I_T. Thus, the numerator of Σ̄,

    Ω̄ := ∑_{i=T(N−1)+1}^{NT} ω̃_i ω̃_i′ = a² (J_N ⊗ I_T),

can be written as a matrix in which the jth column, j = 1, T + 1, …, T(N − 1) + 1, is (a², 0′_{T−1}, a², 0′_{T−1}, …, a², 0′_{T−1})′. The adjacent t = 2, …, T columns start with t − 1 zeros before the sequence a², 0′_{T−1}, a², 0′_{T−1}, … sets in. E.g., for T = 2,

    Ω̄ =  a² 0  a² 0  ⋯  a² 0
          0  a² 0  a² ⋯  0  a²
          a² 0  a² 0  ⋯  a² 0
          0  a² 0  a² ⋯  0  a²
          ⋮  ⋮  ⋮  ⋮  ⋱  ⋮  ⋮
          a² 0  a² 0  ⋯  a² 0
          0  a² 0  a² ⋯  0  a²

Hence tr(Ω̄) = NTa². Now, take X = ι_N ⊗ I_T. Then,

    I_NT − M = (ι_N ⊗ I_T)[(ι_N ⊗ I_T)′(ι_N ⊗ I_T)]⁻¹(ι_N ⊗ I_T)′ = N⁻¹ (J_N ⊗ I_T).

We have diag(N⁻¹(J_N ⊗ I_T)Ω̄) = N⁻¹(Na², …, Na²)′ = (a², …, a²)′, whence tr(N⁻¹(J_N ⊗ I_T)Ω̄) = NTa². Thus, tr(MΩ̄) = 0 and hence tr(MΣ̄) = 0, which, by continuity of tr, completes the proof.  ∎
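The key trace identity in the proof can be confirmed numerically. This sketch (N and T are arbitrary assumptions) uses the fact derived above that, for row-normalized W, Σ̄ is proportional to J_N ⊗ I_T, and checks that the residual maker of X = ι_N ⊗ I_T annihilates it in trace:

```python
import numpy as np

N, T = 5, 3
# Limit correlation matrix for row-normalized W: Sigma_bar proportional to J_N x I_T
Sigma_bar = np.kron(np.ones((N, N)), np.eye(T))
# Residual maker of X = iota_N x I_T; equals I - N^{-1}(J_N x I_T)
X = np.kron(np.ones((N, 1)), np.eye(T))
M = np.eye(N * T) - X @ np.linalg.solve(X.T @ X, X.T)
print(abs(np.trace(M @ Sigma_bar)) < 1e-10)  # True: limiting relative bias is zero
```

The same computation with any other Σ̄ generally gives a non-zero trace, which is why the choice of X matters.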
Figure II: The relative bias of s² as a function of ρ and N, X = ι_NT

Figure I illustrates Proposition 1 for X = ι_N ⊗ I_2 for N = 5, 10, …, 50 and W = W_EW. We see that (8) holds for any given N. Also, pointwise in ρ, the relative bias vanishes as N → ∞, as one would expect. To highlight that not all X matrices produce a limiting relative bias of zero, Figure II reports the case X = ι_N ⊗ ι_T = ι_NT. Although the bias still is substantial, it is clearly bounded away from zero. (One can show that E(s²/σ²) → (NT − N)/(NT − K) in this case; with T = 2 and K = 1 this equals N/(2N − 1), so a limiting bias slightly above 1/2 obtains for N < ∞.)

References

Abadir, Karim M., and Jan R. Magnus, 2005, Matrix Algebra, Volume 1 of Econometric Exercises (Cambridge University Press, Cambridge).

Anselin, Luc, 2001, Rao's score test in spatial econometrics, Journal of Statistical Planning and Inference 97(1), 113–139.

Case, Ann, 1992, Neighborhood influence and technological change, Regional Science and Urban Economics 22(3), 491–508.

Dufour, Jean-Marie, 1986, Bias of s² in linear regression with dependent errors, The American Statistician 40(4), 284–285.

Horn, Roger A., and Charles R. Johnson, 1985, Matrix Analysis (Cambridge University Press, Cambridge).

Kelejian, Harry H., and Ingmar R. Prucha, 1999, A generalized moments estimator for the autoregressive parameter in a spatial model, International Economic Review 40(2), 509–533.

Kelejian, Harry H., and Ingmar R. Prucha, 2002, 2SLS and OLS in a spatial autoregressive model with equal spatial weights, Regional Science and Urban Economics 32(6), 691–707.

Kiviet, Jan, and Walter Krämer, 1992, Bias of s² in the linear regression model with autocorrelated errors, The Review of Economics and Statistics 74(2), 362–365.

Krämer, Walter, 1991, The asymptotic unbiasedness of s² in the linear regression model with AR(1)-disturbances, Statistical Papers 32(1), 71–72.
Krämer, Walter, and Christian Donninger, 1987, Spatial autocorrelation among errors and the relative efficiency of OLS in the linear regression model, Journal of the American Statistical Association 82(398), 577–579.
Krämer, Walter, and Sonja Berghoff, 1991, Consistency of s² in the linear regression model with correlated errors, Empirical Economics 16(3), 375–377.

Lee, Lung-Fei, 2004, Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models, Econometrica 72(6), 1899–1925.

Lütkepohl, Helmut, 1996, Handbook of Matrices (John Wiley & Sons, New York).

Sathe, S. T., and H. D. Vinod, 1974, Bounds on the variance of regression coefficients due to heteroscedastic or autoregressive errors, Econometrica 42(2), 333–340.

Song, Seuck Heun, and Jaejun Lee, 2008, A note on s² in a spatially correlated error components regression model for panel data, Economics Letters 101(1), 41–43.

Watson, Geoffrey S., 1955, Serial correlation in regression analysis I, Biometrika 42(3/4), 327–341.

Whittle, Peter, 1954, On stationary processes in the plane, Biometrika 41(3/4), 434–449.