Spatial Models in Econometrics: Section 13 [1]

Single Equation Models

1.1 An overview of basic elements

Space is important: Some Illustrations

(a) Gas tax issues
(b) Police expenditures
(c) Infrastructure productivity
(d) Cities and budgets
(e) Regulation issues
(f) Exchange market contagion
(h) Volatility of GDP
(i) Spatial spill-overs between governments relating to quality measures

Consider a cross sectional framework: $i = 1, \ldots, N$.

Concept of Neighbor: Neighboring units are units that interact in a meaningful way. This interaction could relate to spill-overs, externalities, copy-cat policies, geographic proximity issues, industrial structure, similarity of markets, sharing of infrastructure, welfare benefits, banking regulations, tax issues, re-election issues, etc.

Example of Geographic Neighbors: For the $i$-th unit, N denotes a close neighbor, NN a neighbor that is less close, etc.

Footnote 1: These notes are mostly based on work I have done with Ingmar Prucha, and parts of them were taken from his notes.

NN  NN  NN  NN  NN
NN  N   N   N   NN
NN  N   i   N   NN
NN  N   N   N   NN
NN  NN  NN  NN  NN

The pattern described by N is called a Queen; the pattern described by N and NN is called a double Queen.

A weighting matrix: A matrix that selects neighbors and indicates how important each neighbor is. For example, suppose we have $N$ observations on the dependent variable, $Y' = (Y_1, \ldots, Y_N)$. Suppose the neighbors corresponding to the $i$-th observation (the $i$-th cross sectional unit) are units 1, 2, and 3. Then the $i$-th row of the weighting matrix $W_{N \times N}$ will have the non-zero elements $w_{i1}$, $w_{i2}$, and $w_{i3}$. If $\sum_{j=1}^{N} w_{ij} = 1$ for all $i$, the weighting matrix is said to be row normalized. Because a unit is not viewed as its own neighbor, $w_{ii} = 0$, $i = 1, \ldots, N$.

Example of use: Let $W_{i.}$ be the $i$-th row of $W$ and let $X_i$ be a scalar. Then a model such as

$$Y_i = b_0 + b_1 X_i + b_2 W_{i.} X + \varepsilon_i, \qquad X' = (X_1, \ldots, X_N)$$

suggests that $Y_i$ depends upon $X_i$ (a within unit effect), as well as on $\sum_{j=1}^{N} w_{ij} X_j$, which is a weighted sum of the regressor in neighboring units. Typically, $W$ is specified to be row normalized so that $Y_i$ depends on $X_i$ and a weighted average of this regressor corresponding to neighboring units. Clearly, the simplest weighted average is the uniform one: e.g., if unit $i$ has 5 neighbors, then the non-zero weights in the $i$-th row of $W$ are all 1/5. Other weighting schemes will be considered below.

A Further Elaboration and Some Specifications of $w_{ij}$

Consider again the above relation in scalar terms:

$$Y_i = b_0 + b_1 X_i + b_2 \sum_{j=1}^{N} w_{ij} X_j + \varepsilon_i$$

$$Y_i = b_0 + b_1 X_i + b_2 \bar{X}_i + \varepsilon_i; \qquad \bar{X}_i = \sum_{j=1}^{N} w_{ij} X_j$$
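The row-normalized weighting matrix and the weighted average $\bar{X}_i$ can be sketched numerically. The following is a minimal illustration, not part of the original notes: the five-unit neighbor lists, the values of $X$, and the helper name `row_normalized_weights` are all made up for the example.

```python
import numpy as np

def row_normalized_weights(neighbors, n):
    """Build an n x n weighting matrix with uniform weights 1/n_i on unit i's neighbors."""
    W = np.zeros((n, n))
    for i, js in neighbors.items():
        for j in js:
            if j != i:                      # a unit is not its own neighbor: w_ii = 0
                W[i, j] = 1.0
    row_sums = W.sum(axis=1, keepdims=True)
    # divide each row by its number of neighbors; rows without neighbors stay zero
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

# hypothetical neighbor lists for 5 cross sectional units (illustrative only)
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2, 4], 4: [3]}
W = row_normalized_weights(neighbors, 5)

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
X_bar = W @ X        # i-th element is the average of X over the neighbors of unit i
```

With uniform weights, each element of `X_bar` is a simple average over the unit's neighbors, which is the regressor multiplying $b_2$ in the scalar model.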

In the scalar form above one can clearly see that $w_{ij}$ relates to the effect that $X_j$ has on $Y_i$. That is, the dependent variable depends on the within unit value of $X$ and on a weighted sum (which could be a weighted average) of the values of $X$ corresponding to neighboring units. If $i$ and $j$ are not neighbors, $w_{ij} = 0$. If they are neighbors, there are various ways researchers have specified $w_{ij}$.

(A) Let $n_i$ be the number of neighbors that unit $i$ has. Then, if $j$ is a neighbor of $i$, researchers often take $w_{ij} = 1/n_i$. In this case $W$ is a row normalized weighting matrix.

(B) Again suppose $j$ is a neighbor of $i$. Let $d_{ij}$ be a distance measure between $i$ and $j$. Then one wants $w_{ij} \to 0$ as $d_{ij} \to \infty$. Also, the closer $j$ is to $i$, the larger one might want $w_{ij}$ to be. Thus, researchers sometimes take $w_{ij} = 1/d_{ij}$. They may also specify a row normalized version,

$$w_{ij} = \frac{1/d_{ij}}{\sum_{r} (1/d_{ir})},$$

with the sum taken over the neighbors $r$ of unit $i$.

(C) Let $INC_r$ be income per capita in cross sectional unit $r$; then one specification of $w_{ij}$ that has been considered is

$$w_{ij} = |INC_i - INC_j|^{-1}.$$

This form has the disadvantage that $w_{ij}$ is not bounded which, as we will see, is important for certain statistical results. One possible improvement is therefore

$$w_{ij} = [\,|INC_i - INC_j| + 1\,]^{-1}.$$

Other variables which could signify distance between neighboring units $i$ and $j$ are
(a) the average level of education
(b) the proportion of housing units that are rental units
(c) ethnic group composition differences
(d) geographic distances
(e) trade shares

(D) A generalization of the above would be, for two neighboring units,

$$w_{ij} = \frac{1}{d_{ij} + 1}, \qquad d_{ij} = \left[(z_{i1} - z_{j1})^2 + \cdots + (z_{ir} - z_{jr})^2\right]^{1/2}$$

where $z_{iq}$ is the $q$-th relevant variable in cross sectional unit $i$, $q = 1, \ldots, r$. This measure depends upon the scale of the variables involved. Let

$$\bar{z}_{q,ij} = (z_{iq} + z_{jq})/2, \qquad q = 1, \ldots, r.$$

Then perhaps a better measure would be the scale-normalized weights

$$w_{ij} = \frac{1}{d_{ij} + 1}, \qquad d_{ij} = \left[\frac{(z_{i1} - z_{j1})^2}{\bar{z}_{1,ij}^2} + \cdots + \frac{(z_{ir} - z_{jr})^2}{\bar{z}_{r,ij}^2}\right]^{1/2}.$$

Cliff and Ord type models

Basic specifications and parameter space issues

Let $Y' = [Y_1, \ldots, Y_N]$, $X' = [X_{1.}', \ldots, X_{N.}']$ (a $K \times N$ matrix), and $u' = [u_1, \ldots, u_N]$. Then one spatial model for $Y_i$ is

$$Y_i = a + X_{i.} B_1 + \rho_1 (W_{i.} Y) + (W_{i.} X) B_2 + u_i \qquad (1)$$

$$u_i = \rho_2 W_{i.} u + \varepsilon_i; \qquad |\rho_1| < 1; \quad |\rho_2| < 1 \qquad (2)$$

where $X_{i.}$ is $1 \times K$, $B_1$ and $B_2$ are $K \times 1$, $W_{i.}$ is $1 \times N$, $Y$ is $N \times 1$, $X$ is $N \times K$, $\varepsilon_i$ is i.i.d. $(0, \sigma_\varepsilon^2)$, and $X$ relates to exogenous variables which we take as nonstochastic. Let

$$W = \begin{pmatrix} W_{1.} \\ \vdots \\ W_{N.} \end{pmatrix}_{N \times N}$$

be the weighting matrix, which we assume at this point is observable and exogenous. Note that (1) and (2) can be written as

$$Y = a e_N + X B_1 + \rho_1 (WY) + (WX) B_2 + u \qquad (1A)$$

$$u = \rho_2 (Wu) + \varepsilon \qquad (2A)$$

where $e_N$ is an $N \times 1$ vector of unit elements. Assuming the inverses exist at the true values of $\rho_1$ and $\rho_2$, we have

$$Y = (I - \rho_1 W)^{-1} [a e_N + X B_1 + W X B_2 + u] \qquad (3)$$

$$u = (I - \rho_2 W)^{-1} \varepsilon \qquad (4)$$
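To make the relation between the structural form (1A), (2A) and the solved form (3), (4) concrete, here is a small simulation sketch. It is an illustration under assumed values: the ring-shaped weighting matrix, the parameter values, and the random draws are all hypothetical choices, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 6, 2

# illustrative row-normalized W: each unit's neighbors are the two adjacent units on a ring
W = np.zeros((N, N))
for i in range(N):
    W[i, (i - 1) % N] = 0.5
    W[i, (i + 1) % N] = 0.5

a, rho1, rho2 = 1.0, 0.4, 0.3          # assumed parameter values
B1 = np.array([1.0, -0.5])
B2 = np.array([0.2, 0.1])
X = rng.normal(size=(N, K))
eps = rng.normal(size=N)
I = np.eye(N)
e = np.ones(N)

# equation (4): u = (I - rho2 W)^{-1} eps
u = np.linalg.solve(I - rho2 * W, eps)
# equation (3): Y = (I - rho1 W)^{-1} [a e + X B1 + W X B2 + u]
Y = np.linalg.solve(I - rho1 * W, a * e + X @ B1 + W @ X @ B2 + u)

# the solved values satisfy the structural equations (1A) and (2A)
structural_rhs = a * e + X @ B1 + rho1 * (W @ Y) + (W @ X) @ B2 + u
```

Solving the linear systems with `np.linalg.solve` avoids forming the inverses explicitly, which is usually preferable numerically.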

Since the elements of $(I - \rho_1 W)^{-1}$ and $(I - \rho_2 W)^{-1}$ will generally depend upon $N$, the vectors $Y$ and $u$ are really triangular arrays. For example, this means that the first element of $Y$ will be different when $N = 20$ than when $N = 25$. This implies that these elements and the vector $Y$ should be indexed with $N$:

$$Y_N' = (Y_{1N}, Y_{2N}, \ldots, Y_{NN}).$$

Thus, our sample on $Y$ would be

$$\begin{array}{llll} Y_{11} & Y_{12} & Y_{13} & \text{etc.} \\ & Y_{22} & Y_{23} & \\ & & Y_{33} & \end{array}$$

At this point we do not index all the variables, for simplicity of notation. Interestingly, the triangular nature of the variables involved, which leads to certain statistical problems, had not been recognized until recently in the (formal) literature.

Because many researchers estimate spatial models by ML procedures, it is typically assumed that $W$ is such that $(I - aW)^{-1}$ exists for all $|a| < 1$, which is taken to be the parameter space; see (2).

Note 1: If $W$ is row normalized, $(I - aW)$ is singular at $a = 1$.

Proof: Let $e_N' = (1, 1, \ldots, 1)_{1 \times N}$. Then

$$(I - W) e_N = e_N - W e_N = e_N - e_N = 0.$$

Note 2: If $W$ is row normalized, then $(I - aW)^{-1}$ exists for all $|a| < 1$.

Proof: We prove this by relying on a theorem by Gershgorin.

Gershgorin's Theorem: Let $A_{N \times N}$ have elements $a_{ij}$. Let

$$R_i = \sum_{j=1, j \neq i}^{N} |a_{ij}|; \qquad C_j = \sum_{i=1, i \neq j}^{N} |a_{ij}|.$$

Then each eigenvalue $\lambda$ of $A$ lies in at least one of the $N$ circles

$$|\lambda - a_{ii}| \leq R_i, \qquad i = 1, \ldots, N,$$

and hence in the union of these circles. Also, each root lies in at least one of the $N$ circles

$$|\lambda - a_{jj}| \leq C_j, \qquad j = 1, \ldots, N,$$

and hence in their union.

An important application to a weighting matrix: Consider again $W$, which has $w_{ii} = 0$. Let

$$r = \max_i \sum_j |w_{ij}| = \max_i R_i; \qquad c = \max_j \sum_i |w_{ij}| = \max_j C_j.$$

Then the roots of $W$ satisfy

$$|\lambda_i| \leq r, \qquad |\lambda_i| \leq c, \qquad i = 1, \ldots, N.$$

If $W$ is row normalized (with nonnegative elements), $r = 1$ and so $|\lambda_i| \leq 1$.

Given Gershgorin's theorem, note the following. Assume $w_{ii} = 0$ and $W$ is row normalized. Let $Q$ be the matrix that triangularizes $W$:

$$Q W Q^{-1} = D_\lambda, \qquad D_\lambda = \begin{pmatrix} \lambda_1 & & \lambda_{ij} \\ & \ddots & \\ 0 & & \lambda_N \end{pmatrix}.$$

Then $|D_\lambda| = \lambda_1 \cdots \lambda_N$ and

$$|I - aW| = |Q Q^{-1} (I - aW)| = |Q (I - aW) Q^{-1}| = |I - a D_\lambda| = (1 - a\lambda_1) \cdots (1 - a\lambda_N) \neq 0$$

if $|a| < 1$, since $|a \lambda_i| \leq |a| < 1$.

Note 3: The above indicates that if $W$ is row normalized, the parameter space specified in (2) is such that the inverses in (3) and (4) exist. If $W$ is

not row normalized, $(I - aW)$ will generally be singular for certain values of $|a| < 1$. In this case the following should be noted. Let $\alpha = \min(r, c)$, where $r$ and $c$ are defined above. Then, assuming that the elements of $W$ are nonnegative, $(I - aW)$ will be nonsingular for all $|a| < 1/\alpha$. This could be taken as the parameter space.

Proof: If $a = 0$, $(I - aW)$ is nonsingular. Now consider $a \neq 0$. In this case $|I - aW| = 0$ implies

$$\left|\tfrac{1}{a} I - W\right| = 0 \quad \text{or} \quad \left|W - \tfrac{1}{a} I\right| = 0.$$

So $I - aW$ is singular if $1/a$ is equal to a root of $W$. Now if the roots of $W$ are such that $|\lambda_i| \leq r$ and $|\lambda_i| \leq c$, then

$$|\lambda_i| \leq \min(r, c) = \alpha.$$

So $|I - aW| \neq 0$ if

$$\left|\tfrac{1}{a}\right| > \min(r, c) = \alpha \quad \text{or} \quad |a| < \tfrac{1}{\alpha}.$$

This is an important result because a model whose weighting matrix is not row normalized can always be normalized in such a way that the inverse needed to solve the model will exist in an easily established region. For example, suppose $W$ is not row normalized. Then the model [2]

$$Y = XB + \rho_1 W Y + \varepsilon \qquad (5)$$

$$\phantom{Y} = XB + (\rho_1 \alpha) \left(\frac{W}{\alpha}\right) Y + \varepsilon$$

Footnote 2: Note that $\alpha$ will depend on $N$, and hence so will $\rho_1^*$ and $W^*$ below. For ease of notation, we do not indicate this dependence.

or

$$Y = XB + \rho_1^* W^* Y + \varepsilon \qquad (6)$$

where $\rho_1^* = \rho_1 \alpha$, $W^* = W/\alpha$, and again

$$\alpha = \min(c, r): \qquad c = \max_j \sum_i |w_{ij}|; \quad r = \max_i \sum_j |w_{ij}|.$$

Note that $\alpha$ is easily determined. Also note that $|I - \rho_1^* W^*| \neq 0$ for

$$|\rho_1^*| < \frac{1}{\min\left(\frac{c}{\alpha}, \frac{r}{\alpha}\right)} = \frac{1}{\frac{1}{\alpha} \min(c, r)} = 1.$$

So if the model is renormalized as

$$Y = XB + \rho_1^* W^* Y + \varepsilon \qquad (7)$$

and $\rho_1^*$ is taken to be the parameter, the inverse exists for all $|\rho_1^*| < 1$. One would then estimate $\rho_1^*$ as a parameter and, since $\rho_1^* = \rho_1 \alpha$, one would estimate $\rho_1$ as

$$\hat{\rho}_1 = \hat{\rho}_1^* / \alpha. \qquad (8)$$

Note 4: We note one more point, which corresponds to a special case. Let $W$ again be a weighting matrix with $w_{ii} = 0$, $i = 1, \ldots, N$, and assume that all of the roots of $W$ are real. This is a strong assumption. Assume that $W$ is not row normalized. Let $\lambda_{\max}$ and $\lambda_{\min}$ be the largest and smallest roots of $W$. Assume, as will typically be the case if all of the roots are real, that $\lambda_{\max} > 0$ and $\lambda_{\min} < 0$. Then $(I - aW)$ is nonsingular for all

$$\lambda_{\min}^{-1} < a < \lambda_{\max}^{-1}. \qquad (9)$$

Proof: $I - aW$ is nonsingular for $a = 0$. If $a \neq 0$ we have, as before,

$$|I - aW| = |I - a D_\lambda| = (1 - a\lambda_1)(1 - a\lambda_2) \cdots (1 - a\lambda_N)$$

so $I - aW$ is nonsingular unless $a$ is equal to the inverse of a root, $\lambda_1^{-1}, \ldots, \lambda_N^{-1}$, or equivalently unless $a^{-1}$ is equal to a root $\lambda_1, \ldots, \lambda_N$. Thus $(I - aW)$ is nonsingular if

$$a^{-1} < \lambda_{\min} \quad \text{or if} \quad a^{-1} > \lambda_{\max}.$$

But, for $a > 0$, $a^{-1} > \lambda_{\max}$ is equivalent to $a < \lambda_{\max}^{-1}$, and, for $a < 0$, $a^{-1} < \lambda_{\min}$ is equivalent to $a > \lambda_{\min}^{-1}$.

Estimation: Again consider a variation on the model in (1A) and (2A):

$$Y = X B_1 + \rho_1 W Y + (WX) B_2 + u \qquad (1A)$$

$$u = \rho_2 (Wu) + \varepsilon, \qquad |\rho_1| < 1, \quad |\rho_2| < 1 \qquad (2A)$$

Assume $(I - aW)^{-1}$ exists for $|a| < 1$. Again, if $W$ is not row normalized, then the model can always be normalized as described above.

Case 1: $\rho_1 = 0 = \rho_2$

In this case the model reduces to

$$Y = X B_1 + (WX) B_2 + \varepsilon, \qquad (X \text{ is } N \times K) \qquad (10)$$

Assume $\text{rank}(X, WX) = 2K$. Note that if $X$ contains the constant term, then $\text{rank}(X, WX) < 2K$ if the weighting matrix is row normalized, since $W e_N = e_N$. Typically, we would not include $W e_N$ in our model even if the weighting matrix is not row normalized. To be a bit more precise, if $X$ contains the constant term, $X = (e_N, X_1)$, then our model might be

$$Y = b_0 e_N + X_1 b_1 + (W X_1) b_2 + \varepsilon \qquad (11)$$

and $\text{rank}(e_N, X_1, W X_1) = 2K - 1$.

Assume $\varepsilon \sim (0, \sigma^2 I)$, and that $X$ and $W$ are nonstochastic. Consider again the model (10) and let $Z = (X, WX)$ or, if the model is (11), let $Z = (e_N, X_1, W X_1)$. Assume that $Z$ has full column rank. Then estimate via OLS. If $\varepsilon' = (\varepsilon_1, \ldots, \varepsilon_N)$, where $\varepsilon_i$ is i.i.d. $(0, \sigma^2)$, then the usual large sample theory holds if

$$N^{-1} Z'Z \to Q_{zz},$$

where $Q_{zz}$ is a finite matrix and $Q_{zz}^{-1}$ exists.

Case 2: $\rho_1 = 0$, $\rho_2 \neq 0$, $|\rho_2| < 1$

Now we have

$$Y = ZB + u \qquad (12)$$

$$u = \rho_2 W u + \varepsilon \qquad (13)$$

or, if the model is (11),

$$Y = ZB + u, \qquad u = \rho_2 W u + \varepsilon \qquad (14)$$

$$Z = (e_N, X_1, W X_1), \qquad B' = (b_0, b_1', b_2').$$

Properties of $u$

If $\varepsilon \sim (0, \sigma_\varepsilon^2 I)$ then

$$u = (I - \rho_2 W)^{-1} \varepsilon.$$

So

$$E(u) = 0, \qquad E(uu') = \sigma_\varepsilon^2 (I - \rho_2 W)^{-1} (I - \rho_2 W')^{-1} = \sigma_\varepsilon^2 \Omega_u.$$

So the elements of $u$ are heteroskedastic, as well as spatially correlated.

GLS

$$\hat{B}_{GLS} = \left(Z' \Omega_u^{-1} Z\right)^{-1} Z' \Omega_u^{-1} Y \quad \text{(not feasible unless } \rho_2 \text{ is known)} \qquad (15)$$
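The claim that $u$ is both heteroskedastic and spatially correlated can be seen by computing $\Omega_u$ for a small example. The sketch below is illustrative only: the four-unit "line" of neighbors and the value of $\rho_2$ are assumptions chosen so that units have different numbers of neighbors.

```python
import numpy as np

# hypothetical row-normalized W for 4 units on a line:
# the end units have one neighbor, the interior units have two
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho2 = 0.5                                   # assumed value

P = np.eye(4) - rho2 * W                     # I - rho2 W
P_inv = np.linalg.inv(P)
Omega_u = P_inv @ P_inv.T                    # (I - rho2 W)^{-1} (I - rho2 W')^{-1}

variances = np.diag(Omega_u)                 # differ across units: heteroskedasticity
covariances = Omega_u - np.diag(variances)   # non-zero: spatial correlation
Omega_u_inv = P.T @ P                        # (I - rho2 W')(I - rho2 W), used in GLS
```

Even though $\varepsilon$ is homoskedastic, the end units and the interior units end up with different variances because their neighbor structures differ.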

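For the GLS estimator in (15),

$$E(\hat{B}_{GLS}) = B, \qquad VC_{GLS} = \sigma_\varepsilon^2 \left(Z' \Omega_u^{-1} Z\right)^{-1}.$$

Feasible Procedures

ML. Typically based on $\varepsilon \sim N(0, \sigma_\varepsilon^2 I)$. So

$$u \sim N(0, \sigma_\varepsilon^2 \Omega_u).$$

Thus

$$Y \sim N(ZB, \sigma_\varepsilon^2 \Omega_u),$$

so

$$L = \frac{\exp\left\{-\frac{1}{2\sigma_\varepsilon^2} [Y - ZB]' \Omega_u^{-1} [Y - ZB]\right\}}{(\sigma_\varepsilon^2)^{N/2} \, (2\pi)^{N/2} \, |I - \rho_2 W|_+^{-1}} \qquad (16)$$

where $|\cdot|_+$ denotes the absolute value of the determinant. Thus

$$\ln(L) = -\frac{1}{2\sigma_\varepsilon^2} [Y - ZB]' [I - \rho_2 W'] [I - \rho_2 W] [Y - ZB] - \frac{N}{2} \ln(\sigma_\varepsilon^2) + \ln |I - \rho_2 W|_+ - \frac{N}{2} \ln 2\pi \qquad (17)$$

$$= -\frac{1}{2\sigma_\varepsilon^2} \left([I - \rho_2 W][Y - ZB]\right)' [I - \rho_2 W][Y - ZB] - \frac{N}{2} \ln(\sigma_\varepsilon^2) + \ln |I - \rho_2 W|_+ - \frac{N}{2} \ln 2\pi$$

$$= -\frac{1}{2\sigma_\varepsilon^2} \left(Y(\rho_2) - Z(\rho_2) B\right)' \left(Y(\rho_2) - Z(\rho_2) B\right) - \frac{N}{2} \ln(\sigma_\varepsilon^2) + \ln |I - \rho_2 W|_+ - \frac{N}{2} \ln 2\pi$$

where

$$Y(\rho_2) = Y - \rho_2 W Y, \qquad Z(\rho_2) = Z - \rho_2 W Z.$$

Note that $Y(\rho_2)$ and $Z(\rho_2)$ are spatial counterparts to the Cochrane-Orcutt transformation.

At this point we note that a major problem in maximizing $\ln L$ relates to the term $\ln |I - \rho_2 W|_+$. For example, this term must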

(a) be evaluated repeatedly for each trial value of $\rho_2$. If $N$ is large this will indeed be tedious. As one example, cross sectional units could relate to counties, and there are over 3000 counties in the US. In other cross sectional studies the cross sectional units could be families, and so it could be the case that $N > 50{,}000$.

(b) Alternatively, as in the proof of Note 2 above, using Ord's (1975) suggestion,

$$\ln |I - \rho_2 W|_+ = \ln[(1 - \rho_2 \lambda_1) \cdots (1 - \rho_2 \lambda_N)] = \sum_{i=1}^{N} \ln(1 - \rho_2 \lambda_i).$$

Now if the roots can be evaluated, $\ln |I - \rho_2 W|_+$ can be evaluated in terms of the sum for each trial value of $\rho_2$. This will be far simpler than the method in (a). The problem is that if $N \geq 450$ both (a) and (b) will involve computational accuracy problems. For example, Kelejian and Prucha (1999) found that the calculation of roots for even a nonsymmetric matrix involved accuracy problems.

The form of the MLE for $B$ and $\sigma_\varepsilon^2$

Based on (17) we have

$$\frac{\partial \ln L}{\partial B} = -\frac{1}{2\sigma_\varepsilon^2} \left[-2 Z(\rho_2)' Y(\rho_2) + 2 Z(\rho_2)' Z(\rho_2) B\right] = 0 \qquad (18)$$

$$\frac{\partial \ln L}{\partial \sigma_\varepsilon^2} = \frac{1}{2\sigma_\varepsilon^4} [Y(\rho_2) - Z(\rho_2) B]' [Y(\rho_2) - Z(\rho_2) B] - \frac{N}{2\sigma_\varepsilon^2} \qquad (19)$$

It follows that

$$\hat{B}_{ML} = \left[Z(\hat{\rho}_2)' Z(\hat{\rho}_2)\right]^{-1} Z(\hat{\rho}_2)' Y(\hat{\rho}_2)$$

$$\hat{\sigma}_\varepsilon^2 = \frac{1}{N} \left[Y(\hat{\rho}_2) - Z(\hat{\rho}_2) \hat{B}\right]' \left[Y(\hat{\rho}_2) - Z(\hat{\rho}_2) \hat{B}\right]$$

Interpretation

Premultiplying (14) by $(I - \rho_2 W)$ yields

$$Y(\rho_2) = Z(\rho_2) B + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I) \qquad (20)$$
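The two ways of evaluating the log-determinant term discussed under (a) and (b) can be compared directly. The sketch below is only an illustration: the ring-shaped weighting matrix, the sample size, and the trial values of $\rho_2$ are assumptions, and the helper names are hypothetical. It uses NumPy's `slogdet` for the direct evaluation and `eigvals` for the roots.

```python
import numpy as np

def logdet_direct(W, rho):
    """ln|I - rho W|_+ computed from scratch for one trial value of rho (approach (a))."""
    _sign, logabsdet = np.linalg.slogdet(np.eye(W.shape[0]) - rho * W)
    return logabsdet

def logdet_from_roots(roots, rho):
    """Sum of ln|1 - rho * lambda_i| using precomputed roots of W (approach (b))."""
    # np.abs handles complex roots, which can arise for nonsymmetric W
    return float(np.sum(np.log(np.abs(1.0 - rho * roots))))

# illustrative row-normalized ring matrix
N = 50
idx = np.arange(N)
W = np.zeros((N, N))
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5

roots = np.linalg.eigvals(W)                 # computed once, reused for every trial rho
trial_rhos = (-0.5, 0.2, 0.8)
direct = [logdet_direct(W, rho) for rho in trial_rhos]
via_roots = [logdet_from_roots(roots, rho) for rho in trial_rhos]
```

For a small, well-behaved matrix such as this one the two evaluations agree closely; the accuracy concerns raised in the notes relate to much larger matrices.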

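The ML results should be clear from the form in (20).

Large Sample Results for ML

In a recent article, Lee (2004) gave a formal demonstration of conditions that ensure consistency and asymptotic normality of the ML estimators for the general spatial model considered. Although some of his assumptions are strong, to date this is the only paper in which a formal demonstration is given. In applied studies it is always assumed that the usual results hold, i.e.,

$$\sqrt{N} (\hat{P} - P) \xrightarrow{D} N(0, V), \qquad V^{-1} = -\lim E\left[N^{-1} \frac{\partial^2 \ln L}{\partial P \, \partial P'}\right], \qquad P = \begin{pmatrix} B \\ \rho_2 \\ \sigma_\varepsilon^2 \end{pmatrix}.$$

Small sample inference is typically based on the approximations

$$\hat{P} = \begin{pmatrix} \hat{B} \\ \hat{\rho}_2 \\ \hat{\sigma}_\varepsilon^2 \end{pmatrix}, \qquad \hat{P} \overset{\cdot}{\sim} N\left(P, \frac{\hat{V}}{N}\right), \qquad N \hat{V}^{-1} = -\left.\frac{\partial^2 \ln L}{\partial P \, \partial P'}\right|_{\hat{P}}.$$

An important note relating to the likelihood function

Consider the model

$$y = X\beta + \rho_1 W y + u \qquad (21)$$

$$u = \rho_2 M u + \varepsilon$$

where $X$ is exogenous, $\varepsilon \sim N(0, \sigma^2 I)$, and $W$ and $M$ are two weighting matrices. Now consider a special case of this model in which $\beta = 0$ and $W = M$. Assuming both inverses exist, in this case

$$y \sim N(0, \sigma^2 \Omega), \qquad \Omega = (I - \rho_1 W)^{-1} (I - \rho_2 W)^{-1} (I - \rho_2 W')^{-1} (I - \rho_1 W')^{-1}$$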

and so

$$\Omega^{-1} = (I - \rho_1 W')(I - \rho_2 W')(I - \rho_2 W)(I - \rho_1 W) = G G' \qquad (22)$$

$$G = [I - (\rho_1 + \rho_2) W' + \rho_1 \rho_2 W' W'].$$

It should be clear from (22) that the likelihood is perfectly symmetrical in $\rho_1$ and $\rho_2$, and so these two parameters are not identified under the stated conditions. This is a known result in the literature; see, e.g., Anselin (1985).

Note carefully what we have stated. If in (21) $\beta = 0$ and $W = M$, there is an identification problem concerning $\rho_1$ and $\rho_2$. In practice it is typically assumed that in a model such as (21) $W = M$. However, models in which $\beta = 0$ are typically not considered in practice. Results given below will imply that if, in a model such as (21), $\beta \neq 0$, there is no identification problem concerning the parameters of the model even if $W = M$.

This is important to note, and unfortunately has not been noted by all researchers. For instance, the model (21) with $\rho_1 = 0$ is often referred to as the spatial error model; the model (21) with $\rho_2 = 0$ is often referred to as the spatial lag model. There have been quite a number of studies in which researchers (still) try to determine whether the true model is a spatial error model or a spatial lag model, because it is assumed that the identification condition rules out consideration of the general model in (21) in which neither $\rho_1$ nor $\rho_2$ is zero. This is unfortunate because the spatial patterns implied by this more general model are so much richer than those implied by either the spatial error model or the spatial lag model.

(B) Feasible GLS and the GMM method

Again the model is

$$Y = ZB + u \qquad (23)$$

$$u = \rho_2 W u + \varepsilon$$

Essentially, we first get a consistent estimator of $B$; use it to obtain $\hat{u}$; use $\hat{u}$ to obtain $\hat{\rho}_2$; use $\hat{\rho}_2$ to transform the model by the spatial Cochrane-Orcutt procedure; and then estimate $B$ by OLS.

Preliminaries

1) We will say that the row and column sums of an $N \times N$ matrix $A$ are uniformly bounded in absolute value if

$$\max_i \sum_{j=1}^{N} |a_{ij}| \leq c_a, \qquad \max_j \sum_{i=1}^{N} |a_{ij}| \leq c_a$$

for all $N \geq 1$, where $c_a$ is a finite constant which does not depend on $N$. We will also abbreviate reference to such a matrix $A$ by just saying that it is "absolutely summable".

2) If $A$ and $B$ are $N \times N$ absolutely summable matrices, then so is $D = AB$.

Proof: Note that

$$d_{ij} = \sum_{r=1}^{N} a_{ir} b_{rj}$$

and consider

$$\sum_{j=1}^{N} |d_{ij}| = \sum_{j=1}^{N} \left|\sum_{r=1}^{N} a_{ir} b_{rj}\right| \leq \sum_{r=1}^{N} \sum_{j=1}^{N} |a_{ir}| |b_{rj}| = \sum_{r=1}^{N} |a_{ir}| \sum_{j=1}^{N} |b_{rj}| \leq c_a c_b.$$

A similar demonstration will reveal that

$$\sum_{i=1}^{N} |d_{ij}| \leq c_a c_b.$$

3) If $A$ is absolutely summable, its elements are bounded. Obvious!

4) If $A$ is absolutely summable, and $Z_{N \times K}$ has bounded elements, then the elements of $Z'AZ$ are $O(N)$.

Proof: Let $Z = (Z_{ij})$ and $|Z_{ij}| \leq c_z$. Now consider the $i,j$ element of $Z'AZ$, say $\delta_{ij}$:

$$\delta_{ij} = \sum_{r=1}^{N} \sum_{s=1}^{N} Z_{si} a_{sr} Z_{rj}$$

$$|\delta_{ij}| \leq \sum_{r=1}^{N} \sum_{s=1}^{N} |Z_{si}| |a_{sr}| |Z_{rj}| \leq c_z \sum_r |Z_{rj}| \sum_s |a_{sr}| \leq c_z c_a \sum_r |Z_{rj}| \leq c_z c_a c_z N.$$

Assumptions of model (23)

1. $\varepsilon_i$ is i.i.d. $(0, \sigma_\varepsilon^2)$, $E(\varepsilon_i^4) < \infty$
2. $|\rho_2| < 1$
3. $P = (I - \rho_2 W)$ is nonsingular at the true value of $\rho_2$
4. $w_{ii} = 0$, $i = 1, \ldots, N$
5. $W$ and $P^{-1}$ are absolutely summable
6. $Z$ is nonstochastic, has bounded elements, and $\text{rank}(Z_{N \times K}) = K$
7. $\lim N^{-1} Z'Z = Q_z$, where $Q_z$ is nonsingular
8. $\lim N^{-1} Z' \Omega_u Z = Q_1$, where $Q_1$ is nonsingular and where $\Omega_u$ is given below
9. $\lim N^{-1} Z' \Omega_u^{-1} Z = Q_2$, where $Q_2$ is nonsingular

Basic Results

(R1) $VC_u = \sigma_\varepsilon^2 \Omega_u$, where $\Omega_u = (I - \rho_2 W)^{-1} (I - \rho_2 W')^{-1}$

(R2) $\hat{B} = (Z'Z)^{-1} Z'Y$ is consistent

Proof: The result in (R1) is obvious. Consider (R2). Since $Y = ZB + u$,

$$\hat{B} = B + (Z'Z)^{-1} Z'u.$$

So

$$E(\hat{B}) = B$$

$$VC_{\hat{B}} = (Z'Z)^{-1} Z' \, VC_u \, Z (Z'Z)^{-1} = \sigma_\varepsilon^2 (Z'Z)^{-1} [Z' \Omega_u Z] (Z'Z)^{-1}$$

$$= N^{-1} \sigma_\varepsilon^2 \left[N (Z'Z)^{-1}\right] \left[N^{-1} Z' \Omega_u Z\right] \left[N (Z'Z)^{-1}\right],$$

where the three bracketed factors converge to $Q_z^{-1}$, $Q_1$, and $Q_z^{-1}$, respectively. So

$$VC_{\hat{B}} \to 0$$

and hence, by Chebyshev's inequality,

$$\hat{B} \xrightarrow{P} B.$$

We will need the following. Let

$$\hat{u} = Y - Z\hat{B} \qquad (24)$$

$$= Y - ZB + (ZB - Z\hat{B}) = u + Z(B - \hat{B}) = u + Z \Delta_N,$$

where $\Delta_N = B - \hat{B} \xrightarrow{P} 0$.

A consistent generalized moments estimator for $\rho_2$ [GME]

In this section we will suggest an estimator for $\rho_2$ and then give a high level proof that it is consistent. This high level proof is, although tedious, fairly straightforward. A more complex low level proof is given by Kelejian and Prucha (1999).

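Note from (23) that

$$u - \rho_2 W u = \varepsilon \qquad (25)$$

so that

$$W u - \rho_2 W^2 u = W \varepsilon. \qquad (26)$$

Let $\bar{u} = Wu$, $\bar{\bar{u}} = W^2 u$, $\bar{\varepsilon} = W\varepsilon$, and denote their $i$-th elements as $\bar{u}_i$, $\bar{\bar{u}}_i$, $\bar{\varepsilon}_i$. Then

$$u_i - \rho_2 \bar{u}_i = \varepsilon_i, \qquad i = 1, \ldots, N \qquad (27)$$

$$\bar{u}_i - \rho_2 \bar{\bar{u}}_i = \bar{\varepsilon}_i \qquad (28)$$

Square (27), sum, and divide by $N$ to get

$$\frac{\sum u_i^2}{N} + \rho_2^2 \frac{\sum \bar{u}_i^2}{N} - 2\rho_2 \frac{\sum u_i \bar{u}_i}{N} = \frac{\sum \varepsilon_i^2}{N} \qquad (29)$$

Square (28), sum, and divide by $N$:

$$\frac{\sum \bar{u}_i^2}{N} + \rho_2^2 \frac{\sum \bar{\bar{u}}_i^2}{N} - 2\rho_2 \frac{\sum \bar{u}_i \bar{\bar{u}}_i}{N} = \frac{\sum \bar{\varepsilon}_i^2}{N} \qquad (30)$$

Multiply (27) by (28), sum, and divide by $N$ to get

$$\frac{\sum u_i \bar{u}_i}{N} + \rho_2^2 \frac{\sum \bar{u}_i \bar{\bar{u}}_i}{N} - \rho_2 \frac{\left[\sum u_i \bar{\bar{u}}_i + \sum \bar{u}_i^2\right]}{N} = \frac{\sum \varepsilon_i \bar{\varepsilon}_i}{N} \qquad (31)$$

Note that since $\varepsilon_i$ is i.i.d. $(0, \sigma_\varepsilon^2)$ with $E(\varepsilon_i^4) < \infty$,

$$\frac{\sum \varepsilon_i^2}{N} \xrightarrow{P} \sigma_\varepsilon^2 \quad \text{[by Khintchine or by Chebyshev]} \qquad (32)$$

Note that

$$\frac{\sum \bar{\varepsilon}_i^2}{N} = \frac{\varepsilon' W'W \varepsilon}{N}$$

and so

$$E\left[\frac{\sum \bar{\varepsilon}_i^2}{N}\right] = \sigma_\varepsilon^2 \frac{Tr(W'W)}{N}. \qquad (33)$$

We will assume that

$$\lim \frac{Tr(W'W)}{N}$$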

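exists. Given this, we will demonstrate below that

$$p\lim \frac{\varepsilon' W'W \varepsilon}{N} = \sigma_\varepsilon^2 \lim \frac{Tr(W'W)}{N}. \qquad (34)$$

Finally, the RHS of (31) is

$$\frac{\sum \varepsilon_i \bar{\varepsilon}_i}{N} = \frac{\varepsilon' W' \varepsilon}{N} = \frac{\varepsilon' W \varepsilon}{N}.$$

We will demonstrate below that

$$\frac{\varepsilon' W \varepsilon}{N} \xrightarrow{P} 0. \qquad (35)$$

Given (32), (34) and (35), express

$$\frac{\sum \varepsilon_i^2}{N} = \sigma_\varepsilon^2 + \delta_1, \qquad \delta_1 \xrightarrow{P} 0 \qquad (36)$$

$$\frac{\sum \bar{\varepsilon}_i^2}{N} = \sigma_\varepsilon^2 \frac{Tr(W'W)}{N} + \delta_2, \qquad \delta_2 \xrightarrow{P} 0 \qquad (37)$$

$$\frac{\sum \varepsilon_i \bar{\varepsilon}_i}{N} = 0 + \delta_3, \qquad \delta_3 \xrightarrow{P} 0 \qquad (38)$$

Substitute (36)-(38) into (29)-(31). Let $\lambda' = [r, \rho_2, \sigma_\varepsilon^2]$ with $r = \rho_2^2$, let $\delta' = [\delta_1, \delta_2, \delta_3]$, and let

$$A_1 = \begin{pmatrix} \sum \bar{u}_i^2 / N & -2 \sum u_i \bar{u}_i / N & -1 \\ \sum \bar{\bar{u}}_i^2 / N & -2 \sum \bar{u}_i \bar{\bar{u}}_i / N & -Tr(WW')/N \\ \sum \bar{u}_i \bar{\bar{u}}_i / N & -\left[\sum u_i \bar{\bar{u}}_i + \sum \bar{u}_i^2\right] / N & 0 \end{pmatrix}$$

and

$$A_2' = -\left[\sum u_i^2 / N, \;\; \sum \bar{u}_i^2 / N, \;\; \sum u_i \bar{u}_i / N\right].$$

Given this notation, (29)-(31), in light of (36), (37) and (38), can be expressed as

$$A_1 \lambda = A_2 + \delta. \qquad (39)$$

We will demonstrate below that

$$\delta \xrightarrow{P} 0.$$

We will also obtain expressions for $p\lim A_1$ and $p\lim A_2$. These expressions will involve limits which we assume exist. Finally, we assume that the weighting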

matrix $W$ is such that $(p\lim A_1)$ is nonsingular. Given all this, we note from (39) that

$$(p\lim A_1) \lambda = p\lim A_2 + p\lim \delta = p\lim A_2 \qquad (40)$$

so

$$\lambda = (p\lim A_1)^{-1} \, p\lim A_2. \qquad (41)$$

Thus, if $u$ were observed, a consistent estimator of $\lambda$ would be the (over-parameterized) OLS estimator (which we may call the linear estimator)

$$\hat{\lambda} = A_1^{-1} A_2. \qquad (42)$$

If the information that $r = \rho_2^2$ is recognized, the NLLS estimator of $\rho_2$ and $\sigma_\varepsilon^2$ could be considered, namely

$$\min_{\rho_2, \sigma_\varepsilon^2} \delta'\delta = \min_{\rho_2, \sigma_\varepsilon^2} [A_1 \lambda - A_2]' [A_1 \lambda - A_2].$$

Feasible estimators

Clearly these estimators are not feasible because $u$ is not observed. Let $\hat{A}_1$ and $\hat{A}_2$ be identical to $A_1$ and $A_2$ except that $u$ is replaced by $\hat{u}$. Thus, e.g., we would replace

$$\frac{\sum \bar{u}_i^2}{N} \quad \text{by} \quad \frac{\sum \hat{\bar{u}}_i^2}{N},$$

where $\hat{\bar{u}}_i$ is the $i$-th element of $\hat{\bar{u}} = W\hat{u}$, etc. Then, we will show below that

$$\tilde{\lambda} = \hat{A}_1^{-1} \hat{A}_2 \xrightarrow{P} \lambda. \qquad (43)$$

Kelejian and Prucha (1999) show that the NLLS estimator based on

$$\min_{\rho_2, \sigma_\varepsilon^2} \left[\hat{A}_1 \lambda - \hat{A}_2\right]' \left[\hat{A}_1 \lambda - \hat{A}_2\right]$$

is also consistent. Furthermore, as expected, Monte Carlo results suggest that the NLLS estimators are more efficient than the OLS estimators described in (43).

Proof that $\tilde{\lambda} \xrightarrow{P} \lambda$

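Preliminary 1: Let $S$ be an $N \times N$ absolutely summable matrix. Let $\varepsilon' = (\varepsilon_1, \ldots, \varepsilon_N)$, where $\varepsilon_i$ is i.i.d. $(0, \sigma_\varepsilon^2)$ and $E(\varepsilon_i^4) = \mu_4 < \infty$. Then

$$\frac{\varepsilon' S \varepsilon}{N} - \sigma_\varepsilon^2 \frac{Tr(S)}{N} \xrightarrow{P} 0. \qquad (44)$$

Proof: Note that

$$E\left[\frac{\varepsilon' S \varepsilon}{N}\right] = \sigma_\varepsilon^2 \frac{Tr(S)}{N}.$$

Also

$$\frac{\varepsilon' S \varepsilon}{N} = \frac{\sum_i \varepsilon_i^2 s_{ii}}{N} + \frac{\sum_{i<j} \varepsilon_i \varepsilon_j [s_{ij} + s_{ji}]}{N}. \qquad (45)$$

Note that $E(\varepsilon_i^2 \varepsilon_r \varepsilon_s) = 0$ unless $r = s$. Therefore every term in the double sum in (45) is uncorrelated with every squared term. Also, all the squared terms are uncorrelated with each other, as are all of the cross product terms, since $E[\varepsilon_i \varepsilon_j \varepsilon_r \varepsilon_s] = 0$ unless $i = j$ and $r = s$, or $i = r$ and $j = s$, or $i = j = r = s$. All of these conditions are ruled out. Thus from (45) [noting that $E(\varepsilon_i \varepsilon_j)^2 = E(\varepsilon_i^2) E(\varepsilon_j^2) = \sigma_\varepsilon^4$ if $i \neq j$]

$$Var\left(\frac{\varepsilon' S \varepsilon}{N}\right) = \frac{1}{N^2} \left[\sum_{i=1}^{N} s_{ii}^2 \, Var(\varepsilon_i^2) + \sigma_\varepsilon^4 \sum_{i<j} [s_{ij} + s_{ji}]^2\right] \qquad (46)$$

$$\leq \frac{1}{N^2} \left[\sum_{i=1}^{N} s_{ii}^2 h + \sigma_\varepsilon^4 \sum_{i<j} [\,|s_{ij}| + |s_{ji}|\,]^2\right]$$

where $h = E(\varepsilon_i^4) - \sigma_\varepsilon^4$. Again, let $c_s$ be the bound on the elements of $S$, as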

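well as on its absolute row and column sums. It then follows from (46) that

$$Var\left(\frac{\varepsilon' S \varepsilon}{N}\right) \leq \frac{1}{N^2} \sum_{i=1}^{N} s_{ii}^2 h + \frac{\sigma_\varepsilon^4}{N^2} \sum_{i<j} \left[|s_{ij}|^2 + |s_{ji}|^2 + 2 |s_{ij}| |s_{ji}|\right] \qquad (47)$$

$$\leq \frac{1}{N} h c_s^2 + \frac{3 \sigma_\varepsilon^4 c_s}{N^2} \sum_{i<j} |s_{ij}| + \frac{\sigma_\varepsilon^4 c_s}{N^2} \sum_{i<j} |s_{ji}|$$

$$\leq \frac{1}{N} h c_s^2 + \frac{4 \sigma_\varepsilon^4 c_s}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} |s_{ij}|$$

$$\leq \frac{1}{N} h c_s^2 + \frac{4 \sigma_\varepsilon^4 c_s^2}{N} \to 0, \quad \text{as } N \to \infty.$$

Therefore, via (47),

$$Var\left(\frac{\varepsilon' S \varepsilon}{N}\right) \to 0 \quad \text{as } N \to \infty.$$

The result in (44) follows from Chebyshev's inequality.

Given this preliminary, note that every element of $A_1$ and $A_2$ above is of the form

$$\frac{u' S_1 u}{N} = \frac{\varepsilon' (I - \rho_2 W')^{-1} S_1 (I - \rho_2 W)^{-1} \varepsilon}{N} = \frac{\varepsilon' S_2 \varepsilon}{N}$$

where $S_1$ and $S_2$ are again absolutely summable matrices. Therefore the probability limit of each element of $A_1$ and $A_2$ can be determined via the preliminary. As a trivial application it follows that $\delta_i \xrightarrow{P} 0$, $i = 1, 2, 3$, as indicated in (36)-(38).

We now show that

$$p\lim(\hat{A}_1) = p\lim(A_1), \qquad p\lim(\hat{A}_2) = p\lim(A_2). \qquad (48)$$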
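Preliminary 1 in (44) can be illustrated with a small Monte Carlo experiment. This is a sketch under assumptions, not a proof: the matrix $S = W'W$ built from a ring-shaped $W$, the normal errors with $\sigma_\varepsilon^2 = 2$, the sample sizes, and the number of replications are all arbitrary choices made for the illustration.

```python
import numpy as np

def ring_weights(N):
    """Illustrative row-normalized W: neighbors are the two adjacent units on a ring."""
    idx = np.arange(N)
    W = np.zeros((N, N))
    W[idx, (idx - 1) % N] = 0.5
    W[idx, (idx + 1) % N] = 0.5
    return W

def mean_abs_gap(N, reps, rng, sigma2=2.0):
    """Average of |eps'S eps/N - sigma2 Tr(S)/N| over `reps` draws, with S = W'W."""
    W = ring_weights(N)
    S = W.T @ W                                   # absolutely summable since W is
    eps = rng.normal(scale=np.sqrt(sigma2), size=(reps, N))
    quad = np.sum((eps @ S) * eps, axis=1)        # eps' S eps for each draw
    return float(np.mean(np.abs(quad / N - sigma2 * np.trace(S) / N)))

rng = np.random.default_rng(1)
gap_small = mean_abs_gap(50, 200, rng)
gap_large = mean_abs_gap(800, 200, rng)
```

The average gap shrinks as $N$ grows, in line with the variance bound in (47), which is of order $1/N$.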

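To show (48), first note that each element of $\hat{A}_1$ and $\hat{A}_2$ is of the form $\hat{u}' S \hat{u} / N$, where $S$ is absolutely summable. Since

$$\hat{u} = u + Z \Delta_N, \qquad \Delta_N \xrightarrow{P} 0,$$

$$\frac{\hat{u}' S \hat{u}}{N} = \frac{(u + Z\Delta_N)' S (u + Z\Delta_N)}{N} = \frac{u' S u}{N} + \Delta_N' \frac{Z' S Z}{N} \Delta_N + \Delta_N' \frac{Z' (S + S') u}{N}.$$

Our result follows if

$$\Delta_N' \frac{Z' S Z}{N} \Delta_N \xrightarrow{P} 0 \quad \text{and} \quad \Delta_N' \frac{Z' S u}{N} \xrightarrow{P} 0,$$

with the same argument applying to the term in $S'$, which is also absolutely summable.

Consider

$$\Delta_N' \frac{Z' S Z}{N} \Delta_N.$$

By preliminary 4 above, the elements of $Z'SZ/N$ are $O(1)$. Thus $\Delta_N \xrightarrow{P} 0$ implies

$$\Delta_N' \frac{Z' S Z}{N} \Delta_N \xrightarrow{P} 0.$$

Consider

$$\Delta_N' \frac{Z' S u}{N}.$$

Let

$$\psi = \frac{Z' S u}{N}.$$

Then

$$E\psi = 0, \qquad VC(\psi) = \sigma_\varepsilon^2 N^{-2} Z' S \Omega_u S' Z.$$

We have assumed that $\Omega_u$ is absolutely summable, and hence so is $S \Omega_u S'$. It follows that $N^{-2} Z' S \Omega_u S' Z \to 0$, since $N^{-1} Z' S \Omega_u S' Z = O(1)$. It follows that $\psi \xrightarrow{P} 0$, and so $\Delta_N' Z'Su/N \xrightarrow{P} 0$. It follows that $\tilde{\lambda} \xrightarrow{P} \lambda$, since (48) holds.

Feasible GLS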

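Let $\hat{\rho}_2 \xrightarrow{P} \rho_2$ be any consistent estimator of $\rho_2$, and let $\hat{\Omega}_u = (I - \hat{\rho}_2 W)^{-1} (I - \hat{\rho}_2 W')^{-1}$. Then

$$\hat{B}_{FGLS} = \left(Z' \hat{\Omega}_u^{-1} Z\right)^{-1} Z' \hat{\Omega}_u^{-1} Y \qquad (49)$$

$$\hat{B}_{GLS} = \left(Z' \Omega_u^{-1} Z\right)^{-1} Z' \Omega_u^{-1} Y \qquad (50)$$

Theorem: Given the assumptions of the model,

(a) $\sqrt{N} \left(\hat{B}_{FGLS} - B\right) \xrightarrow{D} N\left(0, \; \sigma_\varepsilon^2 \, p\lim N \left(Z' \Omega_u^{-1} Z\right)^{-1}\right)$

(b) $\sqrt{N} \left(\hat{B}_{FGLS} - \hat{B}_{GLS}\right) \xrightarrow{P} 0$

Outline of Proof: Consider first part (b). Using the usual manipulations,

$$\sqrt{N} \left(\hat{B}_{FGLS} - B\right) = N \left(Z' \hat{\Omega}_u^{-1} Z\right)^{-1} N^{-1/2} Z' \hat{\Omega}_u^{-1} u \qquad (51)$$

$$\sqrt{N} \left(\hat{B}_{GLS} - B\right) = N \left(Z' \Omega_u^{-1} Z\right)^{-1} N^{-1/2} Z' \Omega_u^{-1} u \qquad (52)$$

Since $\Omega_u^{-1} = (I - \rho_2 W')(I - \rho_2 W) = I - \rho_2 (W + W') + \rho_2^2 W'W$, part (b) holds because

$$N^{-1} Z' \left(\hat{\Omega}_u^{-1} - \Omega_u^{-1}\right) Z = N^{-1} Z' \left[(\rho_2 - \hat{\rho}_2)(W + W') - (\rho_2^2 - \hat{\rho}_2^2) W'W\right] Z$$

$$= \underbrace{(\rho_2 - \hat{\rho}_2)}_{\to 0} \underbrace{N^{-1} Z'(W + W') Z}_{O(1)} - \underbrace{(\rho_2^2 - \hat{\rho}_2^2)}_{\to 0} \underbrace{N^{-1} Z' W'W Z}_{O(1)}$$

and

$$N^{-1/2} Z' \left(\hat{\Omega}_u^{-1} - \Omega_u^{-1}\right) u = N^{-1/2} Z' \left[(\rho_2 - \hat{\rho}_2)(W + W') - (\rho_2^2 - \hat{\rho}_2^2) W'W\right] u$$

$$= (\rho_2 - \hat{\rho}_2) N^{-1/2} Z'(W + W') u - (\rho_2^2 - \hat{\rho}_2^2) N^{-1/2} Z' W'W u \xrightarrow{P} 0$$

since

$$\rho_2 - \hat{\rho}_2 \xrightarrow{P} 0$$

and

$$N^{-1/2} Z'(W + W') u = O_p(1), \qquad N^{-1/2} Z' W'W u = O_p(1).$$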

To see this, note that

$$E\left(N^{-1/2} Z'(W + W') u\right) = 0, \qquad VC = \sigma_\varepsilon^2 N^{-1} Z' (W + W') \Omega_u (W' + W) Z = O(1),$$

because $(W + W') \Omega_u (W' + W)$ is absolutely summable, so that $Z'(W + W') \Omega_u (W' + W) Z$ is $O(N)$. A similar result holds for $N^{-1/2} Z' W'W u$.

Now consider part (a). Given part (b), we need only consider

$$\sqrt{N} \left(\hat{B}_{GLS} - B\right) = N \left(Z' \Omega_u^{-1} Z\right)^{-1} N^{-1/2} Z' \Omega_u^{-1} u. \qquad (53)$$

We will use the following CLT for triangular arrays, which is a variation on a problem given in Billingsley (1979, p. 319, problem 27.6).

A formal statement of the CLT

Let $\{v_{iN}, 1 \leq i \leq N, N \geq 1\}$ be a triangular array of random variables that are identically distributed and (jointly) independent for each $N$, with $E v_{iN} = 0$ and $E v_{iN}^2 = \sigma^2$, $0 < \sigma^2 < \infty$. Let $\{x_{ij,N}, 1 \leq i \leq N, N \geq 1\}$, $j = 1, \ldots, K$, be triangular arrays of real numbers that are bounded in absolute value, i.e., $c_x = \sup_N \sup_{i \leq N, j \leq K} |x_{ij,N}| < \infty$. Further, let $\{V_N : N \geq 1\}$ and $\{X_N : N \geq 1\}$, with $V_N = (v_{iN})_{i=1,\ldots,N}$ and $X_N = (x_{ij,N})_{i=1,\ldots,N; \, j=1,\ldots,K}$, denote the corresponding sequences of $N \times 1$ random vectors and $N \times K$ real matrices, respectively, and let $\lim_{N \to \infty} N^{-1} X_N' X_N = Q$ be finite and positive definite. Then

$$N^{-1/2} X_N' V_N \xrightarrow{D} N(0, \sigma^2 Q).$$

On an intuitive level, the CLT implies the following. Suppose $v_i$ is i.i.d. $(0, \sigma^2)$, $0 < \sigma^2 < \infty$, and $X_N = \{x_{ijN}\}$ is a sequence of $N \times K$ real nonstochastic matrices with bounded elements,

$$\sup_{N \geq 1} \; \sup_{i \leq N, \, j \leq K} |x_{ijN}| < \infty,$$

and $\lim_{N \to \infty} N^{-1} X_N' X_N = Q_x$, where $Q_x^{-1}$ exists. Let $V_N' = (v_1, \ldots, v_N)$. Then

$$N^{-1/2} X_N' V_N \xrightarrow{D} N(0, \sigma^2 Q_x). \qquad (54)$$

A point to note concerning the formal assumption concerning $v_{i,N}$: In a triangular array each element can change as the sample size increases. Therefore such an array does not rule out the possibility that $v_{3,10} = v_{7,25}$. If one were to only assume that $v_i$ is i.i.d., then one would be assuming away this characteristic of a triangular array. The way the formal assumption is stated, $v_{1,N}, v_{2,N}, \ldots, v_{N,N}$ are i.i.d. for each $N$.

Now consider (53). First note, via assumption 9 of model (23), that

$$N \left(Z' \Omega_u^{-1} Z\right)^{-1} \to Q_2^{-1}.$$

Now note that

$$N^{-1/2} Z' \Omega_u^{-1} u = N^{-1/2} Z' (I - \rho_2 W')(I - \rho_2 W) u = N^{-1/2} Z' (I - \rho_2 W') \varepsilon$$

and so

$$N^{-1/2} Z' \Omega_u^{-1} u \xrightarrow{D} N\left(0, \; \sigma_\varepsilon^2 \lim N^{-1} Z' \Omega_u^{-1} Z\right) \qquad (55)$$

since

(i) the elements of $\varepsilon$ are i.i.d. $(0, \sigma_\varepsilon^2)$;
(ii) the elements of $Z(\rho_2) = (I - \rho_2 W) Z$ are uniformly bounded;
(iii) $\lim N^{-1} Z(\rho_2)' Z(\rho_2) = Q_2 = \lim N^{-1} Z' \Omega_u^{-1} Z$.

Given the results in (53) and (55), the suggested small sample guidance is

$$\hat{B}_{FGLS} \overset{\cdot}{\sim} N\left(B, \; \hat{\sigma}_\varepsilon^2 \left[Z(\hat{\rho}_2)' Z(\hat{\rho}_2)\right]^{-1}\right) \qquad (56)$$

where

$$\hat{\sigma}_\varepsilon^2 = N^{-1} \left[Y(\hat{\rho}_2) - Z(\hat{\rho}_2) \hat{B}_{FGLS}\right]' \left[Y(\hat{\rho}_2) - Z(\hat{\rho}_2) \hat{B}_{FGLS}\right].$$

A point to note

The following procedure is sometimes suggested. Since

$$Y = ZB + u, \qquad u = \rho_2 W u + \varepsilon$$

we have

$$Y = \rho_2 (WY) + ZB - (WZ) \rho_2 B + \varepsilon. \qquad (57)$$

Since $W$ is observed, $WY$ and $WZ$ are observed. Note that (57) cannot be consistently estimated by OLS since

$$E(WY \varepsilon') = W (I - \rho_2 W)^{-1} E(\varepsilon \varepsilon') = W (I - \rho_2 W)^{-1} \sigma_\varepsilon^2 \neq 0. \qquad (58)$$

Thus an instrumental variable estimator is sometimes suggested. In one case people express (57) as

$$Y = \rho_2 (WY) + ZB + (WZ) \gamma + \varepsilon \qquad (59)$$

where the restriction $\gamma = -\rho_2 B$ is not used. Thus, (59) is over-parameterized. Then (59) is estimated by 2SLS using the non-redundant variables from the set $Z, WZ, W^2 Z$, etc. (typically $Z, WZ, W^2 Z$). This 2SLS procedure is not consistent. The reason for this is that $E(WY) = WZB$, and (59) already contains $Z$ and $WZ$ as regressor matrices. In brief, there are no instruments for $WY$ which are linearly independent of $Z$ and $WZ$. For example, the ideal instrument for $WY$ is $WZB$, but $WZ$ is already in the model.

You should be able to show the following. Let $D = (WY, Z, WZ)$ and $\lambda' = (\rho_2, B', \gamma')$, so that (59) is

$$Y = D\lambda + \varepsilon. \qquad (60)$$

Let $H$ be any $N \times p$ matrix of nonstochastic instruments such that $\lim N^{-1} H'H = Q_H$, where $Q_H^{-1}$ exists. Then

$$p\lim N^{-1} H'D = G \qquad (61)$$

where $G$ does not have full column rank. One implication of this is that (60) cannot be consistently estimated by 2SLS.

One would think that (57) can be consistently estimated by NL2SLS, using the instruments $Z, WZ, W^2 Z$, etc. This procedure is also not consistent. The reason for this is similar to that given above. To see the issue involved, re-write (57) as

$$Y_{N \times 1} = F_{N \times 1} + \varepsilon_{N \times 1} \qquad (62)$$

$$F_{N \times 1} = \rho_2 (WY) + ZB - (WZ) \rho_2 B.$$
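The rank deficiency behind (61) can be checked numerically on the mean of $D$. The sketch below is only an illustration of the argument: it replaces $WY$ by its expectation $WZB$, and the ring-shaped $W$, the random $Z$ without a constant column, the value of $B$, and the rank tolerance are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 200, 2

# illustrative row-normalized ring matrix
idx = np.arange(N)
W = np.zeros((N, N))
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5

Z = rng.normal(size=(N, K))        # no constant column, to keep the rank count simple
B = np.array([1.0, -2.0])          # assumed coefficient values
WZ = W @ Z

# mean of D = (WY, Z, WZ): under Y = ZB + u, E(WY) = W Z B
ED = np.column_stack([WZ @ B, Z, WZ])          # N x (2K + 1)
H = np.column_stack([Z, WZ, W @ WZ])           # instruments Z, WZ, W^2 Z

G = H.T @ ED / N                               # sample analogue of the limit in (61)
rank_G = np.linalg.matrix_rank(G, tol=1e-8)    # explicit tolerance (an assumption)
```

Here `G` has $2K + 1$ columns but rank $2K$, because the first column of `ED` is an exact linear combination of the columns of $WZ$, whatever instruments are used.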

Suppose we have an exogenous matrix of instruments, say $H_{N \times r}$, $r \geq K + 1$. Then a condition given by Amemiya (1985, pages 110 and 246) [3] for consistency is that

$$p\lim N^{-1} H' \frac{\partial F}{\partial (\rho_2, B')}$$

has full column rank. Now

$$\frac{\partial F}{\partial (\rho_2, B')} = [WY - WZB, \; Z - \rho_2 WZ] = [W(Y - ZB), \; Z - \rho_2 WZ] = [Wu, \; Z - \rho_2 WZ].$$

You should be able to demonstrate that the first column of

$$p\lim N^{-1} H' \frac{\partial F}{\partial (\rho_2, B')}$$

is a column of zeros if, as is typically assumed,

$$N^{-1} H'H \to Q_{HH},$$

where $Q_{HH}$ is a finite invertible matrix. It follows that Amemiya's condition will not hold.

On a simpler scale, to see that something is wrong, note that

$$E[F] = \rho_2 WZB + ZB - \rho_2 WZB = ZB, \qquad Z \text{ is } N \times K,$$

only involves $K$ variables which can be used as instruments. However, the model in (57) has $K + 1$ parameters.

Case 3: $\rho_1 \neq 0$, $\rho_2 \neq 0$

In this case the model is

$$Y = ZB + \rho_1 WY + u \qquad (1A)$$

$$u = \rho_2 W u + \varepsilon, \qquad |\rho_1| < 1, \quad |\rho_2| < 1 \qquad (2A)$$

Footnote 3: T. Amemiya (1985), Advanced Econometrics, Harvard University Press.

Note that $(WY)$ is endogenous because

$$E[(WY) u'] = \sigma_\varepsilon^2 W (I - \rho_1 W)^{-1} \Omega_u \neq 0.$$

Also note that

$$E(WY) = W (I - \rho_1 W)^{-1} ZB,$$

which is not collinear with $Z$:

$$\text{rank}\left[W (I - \rho_1 W)^{-1} Z, \; Z\right] > \text{rank}[Z]$$

for most reasonable weighting matrices. [4]

All of this suggests that, under the usual further assumptions, (1A) can be consistently estimated by 2SLS using the instruments $Z, WZ, W^2 Z$, etc. These instruments are suggested because

$$E[WY] = W (I - \rho_1 W)^{-1} ZB.$$

If the roots of $W$ are all less than or equal to 1 in absolute value, then the roots of $\rho_1 W$ are less than 1 in absolute value if $|\rho_1| < 1$. Thus

$$E[WY] = W \left[I + \rho_1 W + \rho_1^2 W^2 + \cdots\right] ZB = WZB + W^2 Z (\rho_1 B) + W^3 Z (\rho_1^2 B) + \cdots$$

Therefore $E(WY)$ is linear in $WZ, W^2 Z$, etc.

Estimating (1A) by 2SLS does not account for the spatial correlation problem. We now describe a procedure that was put forth by Kelejian and Prucha (1998) which does account for it.

The procedure

Step 1: Estimate (1A) by 2SLS using the linearly independent columns of $Z, WZ, W^2 Z$. Obtain $\tilde{B}$, $\tilde{\rho}_1$.

Step 2: Obtain $\tilde{u} = Y - Z\tilde{B} - \tilde{\rho}_1 WY$, $\bar{\tilde{u}} = W\tilde{u}$, $\bar{\bar{\tilde{u}}} = W^2 \tilde{u}$. Use these residual vectors to obtain the GM estimator of $\rho_2$, say $\tilde{\rho}_2$.

Footnote 4: As an illustration, suppose $Z' = (1, 2, 3)$ and $W$ is a given $3 \times 3$ weighting matrix; one then computes $WZ$ and checks the rank of $[Z, WZ]$. [The entries of $W$, the resulting $WZ$, and the stated rank are missing from the source.]

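Step 3: Obtain

$$Y(\tilde{\rho}_2) = Y - \tilde{\rho}_2 WY$$

$$Z(\tilde{\rho}_2) = Z - \tilde{\rho}_2 WZ$$

$$WY(\tilde{\rho}_2) = WY - \tilde{\rho}_2 W^2 Y$$

Note that (1A) and (2A) imply (based on the true value of $\rho_2$)

$$Y(\rho_2) = Z(\rho_2) B + \rho_1 WY(\rho_2) + \varepsilon,$$

which can be consistently estimated by 2SLS using the instruments $Z(\rho_2), WZ(\rho_2), W^2 Z(\rho_2)$.

Step 4: Obtain the feasible counterpart to the 2SLS estimator outlined above. Specifically, let

$$D(\tilde{\rho}_2) = [Z(\tilde{\rho}_2), \; WY(\tilde{\rho}_2)], \qquad \lambda' = (B', \rho_1)$$

$$\hat{D}(\tilde{\rho}_2) = H (H'H)^{-1} H' D(\tilde{\rho}_2), \qquad H = (Z, WZ, W^2 Z).$$

Then obtain

$$\hat{\lambda} = \left[\hat{D}(\tilde{\rho}_2)' \hat{D}(\tilde{\rho}_2)\right]^{-1} \hat{D}(\tilde{\rho}_2)' Y(\tilde{\rho}_2).$$

Kelejian and Prucha (1998) show

$$\sqrt{N} \left(\hat{\lambda} - \lambda\right) \xrightarrow{D} N\left(0, \; \sigma_\varepsilon^2 \, p\lim N \left[\hat{D}(\tilde{\rho}_2)' \hat{D}(\tilde{\rho}_2)\right]^{-1}\right) = N\left(0, \; \sigma_\varepsilon^2 \, p\lim N \left[\hat{D}(\rho_2)' \hat{D}(\rho_2)\right]^{-1}\right).$$

Let

$$\hat{\sigma}_\varepsilon^2 = \frac{\left[Y(\tilde{\rho}_2) - D(\tilde{\rho}_2) \hat{\lambda}\right]' \left[Y(\tilde{\rho}_2) - D(\tilde{\rho}_2) \hat{\lambda}\right]}{N}.$$

Then the suggested small sample inference would be based on

$$\hat{\lambda} \overset{\cdot}{\sim} N\left(\lambda, \; \hat{\sigma}_\varepsilon^2 \left[\hat{D}(\tilde{\rho}_2)' \hat{D}(\tilde{\rho}_2)\right]^{-1}\right).$$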
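The four steps can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' implementation: the ring-shaped $W$, the parameter values, and the helper names `tsls` and `gm_rho` are hypothetical, and Step 2 uses the simple linear (over-parameterized) GM estimator $\hat{A}_1^{-1} \hat{A}_2$ rather than the NLLS version.

```python
import numpy as np

def tsls(y, D, H):
    """2SLS: project D on the instruments H, then regress y on the projection."""
    D_hat = H @ np.linalg.lstsq(H, D, rcond=None)[0]
    return np.linalg.solve(D_hat.T @ D_hat, D_hat.T @ y)

def gm_rho(u, W):
    """Linear GM estimator: solve A1 [rho^2, rho, sigma^2]' = A2 and return rho."""
    N = len(u)
    ub, ubb = W @ u, W @ (W @ u)                    # W u and W^2 u
    A1 = np.array([
        [ub @ ub,   -2.0 * (u @ ub),      -N],
        [ubb @ ubb, -2.0 * (ub @ ubb),    -np.sum(W * W)],   # sum(W*W) = Tr(W'W)
        [ub @ ubb,  -(u @ ubb + ub @ ub),  0.0],
    ]) / N
    A2 = -np.array([u @ u, ub @ ub, u @ ub]) / N
    return np.linalg.solve(A1, A2)[1]

# simulate from (1A), (2A) with assumed values
rng = np.random.default_rng(0)
N = 1000
idx = np.arange(N)
W = np.zeros((N, N))
W[idx, (idx - 1) % N] = 0.5
W[idx, (idx + 1) % N] = 0.5
I = np.eye(N)
B, rho1, rho2 = np.array([1.0, 2.0]), 0.4, 0.5
Z = np.column_stack([np.ones(N), rng.normal(size=N)])
u = np.linalg.solve(I - rho2 * W, rng.normal(size=N))
Y = np.linalg.solve(I - rho1 * W, Z @ B + u)

# instruments: W e = e for row-normalized W, so only the non-constant column is lagged
z1 = Z[:, 1:]
H = np.column_stack([Z, W @ z1, W @ (W @ z1)])
D = np.column_stack([Z, W @ Y])

lam_step1 = tsls(Y, D, H)                           # Step 1: 2SLS on (1A)
u_tilde = Y - D @ lam_step1                         # Step 2: residuals ...
rho2_tilde = gm_rho(u_tilde, W)                     # ... and GM estimate of rho2
Y_s = Y - rho2_tilde * (W @ Y)                      # Step 3: spatial Cochrane-Orcutt
D_s = D - rho2_tilde * (W @ D)                      #   [Z(rho2), WY(rho2)]
lam = tsls(Y_s, D_s, H)                             # Step 4: 2SLS on transformed model
resid = Y_s - D_s @ lam
sigma2_hat = resid @ resid / N
```

In `lam`, the first two entries estimate $B$ and the last entry estimates $\rho_1$. The estimates should be close to the assumed values for a sample of this size, although any single simulated draw carries sampling error.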

1.2.3 Implications of the spatial model: Emanating and Own spillover effects

Consider the model

$$y = X\beta + \rho_1 W y + u \qquad (63)$$

where $X$ is exogenous, and $E(u) = 0$. In this section it does not make any difference whether or not the elements of $u$ are spatially correlated. The model in (63) is a structural model. Its reduced form, i.e., the solution of the model for $y$, is

$$y = (I - \rho_1 W)^{-1}[X\beta + u]$$

and so

$$E(y) = (I - \rho_1 W)^{-1} X\beta. \qquad (64)$$

Now consider interpretations that are based on (63) and (64). For ease of presentation, we suppose that $X$ is a vector, i.e., there is only one exogenous variable; we denote the $i$-th element of $X$ as $x_i$. The extension to the case in which $X$ is an $n \times k$ matrix will be evident.

If there were no spatial effects in the sense that $\rho_1 = 0$, the effect of a one unit change in $x_1$ on $E(y_1)$ would be $\beta$; see (63). This effect can also be thought of as the direct effect of $x_1$ on $E(y_1)$ in the sense that it does not account for spatial spill-overs which would take place if $\rho_1 \neq 0$. Clearly, if $\rho_1 = 0$ a change in $x_1$ would have no effect on the expected values of the other dependent variables.

Let $G = (I - \rho_1 W)^{-1}$ and consider final effects implied by the model. From (64) it should be clear that the expected effect of a change in $x_1$ on all of the elements of the vector $E(y)$ is (using evident notation)

$$\frac{\partial E(y_j)}{\partial x_1} = G_{j1}\beta, \quad j = 1, \dots, n. \qquad (65)$$

The effects described in (65) have been referred to in the literature as emanating effects. These effects describe how a change in a regressor relating to a given unit, in our illustrative case unit 1, fans out to all the units. Of course these emanating effects can also be described in terms of elasticities; again, using evident notation

$$\eta_{j1} = G_{j1}\beta\,\frac{x_1}{y_j}, \quad j = 1, \dots, n. \qquad (66)$$
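The emanating effects in (65) and the elasticities in (66) are straightforward to compute once $G$ is available. The sketch below assumes a hypothetical four-unit weighting matrix and illustrative parameter values; the function names are made up for this example, and the elasticities are evaluated at $E(y)$ rather than at observed $y$ purely for convenience.

```python
import numpy as np

def spatial_multiplier(W, rho1):
    """G = (I - rho1 W)^{-1}."""
    n = W.shape[0]
    return np.linalg.inv(np.eye(n) - rho1 * W)

def emanating_effects(W, rho1, beta, unit=0):
    """(65): dE(y_j)/dx_unit = G[j, unit] * beta, for every j."""
    return spatial_multiplier(W, rho1)[:, unit] * beta

def emanating_elasticities(W, rho1, beta, x, y, unit=0):
    """(66): eta_j = G[j, unit] * beta * x_unit / y_j."""
    return emanating_effects(W, rho1, beta, unit) * x[unit] / y

# Hypothetical example: four units on a line, row-normalized weights.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho1, beta = 0.5, 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])

Ey = spatial_multiplier(W, rho1) @ (x * beta)          # (64)
effects = emanating_effects(W, rho1, beta)             # change originates in unit 1 (index 0)
elasticities = emanating_elasticities(W, rho1, beta, x, Ey)

print("emanating effects:", effects)
print("emanating elasticities:", elasticities)
```

With $\rho_1 = 0$ the same function returns $\beta$ for the originating unit and zero for all others, which is the no-spill-over case described above.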

A closely related concept is that of own spill-over effects. These effects directly follow from (66). Specifically, as indicated above, in the absence of spill-overs, the effect of a change in $x_1$ on $E(y_1)$ is just $\beta$. In the presence of spill-overs, that effect is

$$\frac{\partial E(y_1)}{\partial x_1} = G_{11}\beta.$$

Therefore, one measure of the own spill-over effect is clearly

$$(G_{11} - 1)\beta.$$

Of course these effects can also be expressed in terms of elasticities.

In passing we note that the above material relates to a change in the regressor corresponding to the first unit. Obviously, one can also calculate the effects of a change in the regressor corresponding to each and every unit, or of a change in the regressors corresponding to a set of units, e.g., the first and second.

Emanating effects with respect to a uniform worsening of the exogenous (fundamental) variables in the originating country

Kelejian, Tavlas, and Hondroyiannis (2006) used a spatial model to study contagion problems in foreign exchange markets. That is, in a number of episodes, when the currency of a given country experiences a run and depreciates, the effects fan out to other related countries. In their study, Kelejian, Tavlas, and Hondroyiannis considered a variant of the emanating effects described above. In particular, instead of calculating the effect of a given variable in one country, say country 1, on the other countries, they considered the effect on the other countries of a uniform worsening of the exogenous variables of that originating country, in this case country 1. In somewhat more detail, write the conditional mean in (64) as, using evident notation,

$$E(y) = (I - \rho_1 W)^{-1} X\beta \qquad (67)$$
$$= G[X_1\beta_1 + \dots + X_k\beta_k].$$

Suppose now that high values of the dependent variable are associated with more severe exchange problems than are low values. In this case if the coefficient of a regressor is negative, a worsening of the corresponding variable in country 1 would relate to a decrease in that variable, i.e., the change in that

variable multiplied by the corresponding coefficient is positive. Similarly, if a coefficient of a regressor is positive, a worsening of that variable would be an increase in the value of the corresponding regressor. Let the first value of $X_j$ be $x_{j,1}$, $j = 1, \dots, k$, and in order to avoid unnecessary tediousness assume that the values of all of the regressors are positive. Then the implication of the above is that the response of $E(y_r)$ with respect to a worsening of all of the regressors of country 1 would be

$$\Delta E(y_r) = G_{r1}(\Delta x_{1,1}\beta_1) + \dots + G_{r1}(\Delta x_{k,1}\beta_k) \qquad (68)$$
$$= G_{r1}|\Delta x_{1,1}\beta_1| + \dots + G_{r1}|\Delta x_{k,1}\beta_k|$$

or, if a uniform percentage worsening is considered,

$$\Delta E(y_r) = G_{r1}\left|\frac{\Delta x_{1,1}}{x_{1,1}}\,x_{1,1}\beta_1\right| + \dots + G_{r1}\left|\frac{\Delta x_{k,1}}{x_{k,1}}\,x_{k,1}\beta_k\right| \qquad (69)$$
$$= G_{r1}|\beta_1|\,x_{1,1}\,\alpha + \dots + G_{r1}|\beta_k|\,x_{k,1}\,\alpha$$

where $\alpha > 0$ is the uniform percentage worsening. Or, one can calculate the emanating elasticity of $E(y_r)$ with respect to the uniform percentage worsening of all of the regressors in country 1 as

$$\frac{\Delta E(y_r)}{\alpha\,y_r} = G_{r1}\left[\frac{x_{1,1}|\beta_1|}{y_r} + \dots + \frac{x_{k,1}|\beta_k|}{y_r}\right]. \qquad (70)$$

Of course, an alternative to (70) would be to replace $y_r$ in the denominator of (70) by $E(y_r)$, which can be estimated from (67).

1.2.4 Prediction Issues

This section is based on Kelejian and Prucha (2006). As a preliminary for this section we note the following result for the convenience of the reader. Using evident notation, let the vectors $Z_1$ and $Z_2$ be jointly normally distributed as

$$(Z_1, Z_2) \sim N(\mu, V)$$

where $\mu' = (\mu_1', \mu_2')$; $V = \{V_{ij}\}$, $i, j = 1, 2$. Then the minimum mean squared error predictor of $Z_1$ based on $Z_2$, and the corresponding predictor variance-covariance matrix, are

$$E(Z_1 \mid Z_2 = z_2) = \mu_1 + V_{12}V_{22}^{-1}(z_2 - \mu_2) \qquad (71)$$
$$VC(Z_1 \mid Z_2 = z_2) = V_{11} - V_{12}V_{22}^{-1}V_{21},$$

see, e.g., Greene (2003, p. 872).

The discussion below concerning prediction is based on the assumption that the model parameters are known. Of course in practice they are not known and so predictors would be based on their estimated values. An analysis of predictor efficiency based on estimated parameter values would then have to consider a wide variety of sample sizes, regressor variations and co-variations. The reason is that these considerations have an effect on the precision of parameter estimators and these, in turn, have an effect on prediction efficiency. By basing our discussion of prediction efficiency on known parameter values we need not consider such issues, and our interpretation of the results is that they relate to limits of efficiency.

Now consider the model

$$y_n = \lambda W_n y_n + X_n\beta + u_n, \qquad (72)$$
$$u_n = \rho W_n u_n + \varepsilon_n,$$

where $W_n$ is an $n \times n$ nonstochastic weighting matrix, $X_n$ is an $n \times k$ nonstochastic matrix of observations on $k$ exogenous variables, and the remaining notation is evident. We assume that $\varepsilon_n \sim N(0, \sigma_\varepsilon^2 I_n)$.

The $i$-th unit in the model in (72) is determined as

$$y_{n,i} = x_{n,i.}\beta + \lambda w_{n,i.}y_n + u_{n,i} \qquad (73)$$
$$u_{n,i} = \rho w_{n,i.}u_n + \varepsilon_{n,i}$$

where $y_{n,i}$ is the $i$-th element of $y_n$, $x_{n,i.}$ is the $i$-th row of $X_n$, etc.

Predictors that might be considered for $y_{n,i}$

(a) The reduced form predictor. This is suggested by MSE issues:

$$y^{(1)}_{n,i} = E(y_{n,i} \mid X_n, W_n) \qquad (74)$$
$$= (I - \lambda W_n)^{-1}_{i.}\,X_n\beta$$

where, again, the notation should be evident, e.g., $(I - \lambda W_n)^{-1}_{i.}$ is the $i$-th row of $(I - \lambda W_n)^{-1}$.

(b) A larger information set:

$$y^{(2)}_{n,i} = E(y_{n,i} \mid X_n, W_n, w_{n,i.}y_n) \qquad (75)$$
$$= \lambda w_{n,i.}y_n + x_{n,i.}\beta + \frac{\mathrm{cov}(u_{n,i}, w_{n,i.}y_n)}{\mathrm{var}(w_{n,i.}y_n)}\left[w_{n,i.}y_n - E(w_{n,i.}y_n)\right]$$

Note that the terms involved in (75) are straightforward to determine. For example, consider $\mathrm{cov}(u_{n,i}, w_{n,i.}y_n)$. Since $u_n = (I_n - \rho W_n)^{-1}\varepsilon_n$, it follows that

$$u_{n,i} = (I_n - \rho W_n)^{-1}_{i.}\,\varepsilon_n \qquad (76)$$

and so $u_{n,i} \sim N\left(0,\ \sigma_\varepsilon^2 (I_n - \rho W_n)^{-1}_{i.}\,(I_n - \rho W_n)^{-1\prime}_{i.}\right)$. It is also clear that

$$w_{n,i.}y_n = w_{n,i.}(I_n - \lambda W_n)^{-1}X_n\beta + w_{n,i.}(I_n - \lambda W_n)^{-1}(I_n - \rho W_n)^{-1}\varepsilon_n \qquad (77)$$

so that from (76) and (77)

$$\mathrm{cov}(u_{n,i}, w_{n,i.}y_n) = E\left[u_{n,i}\,(w_{n,i.}y_n)'\right] \qquad (78)$$
$$= \sigma_\varepsilon^2 (I_n - \rho W_n)^{-1}_{i.}\,(I_n - \rho W_n')^{-1}(I_n - \lambda W_n')^{-1}w_{n,i.}'$$

The remaining covariances and variances can be calculated in a similar fashion.

(c) The efficient predictor:

$$y^{(3)}_{n,i} = E(y_{n,i} \mid X_n, W_n, y_{n,-i}) \qquad (79)$$
$$= \lambda w_{n,i.}y_n + x_{n,i.}\beta + \mathrm{cov}(u_{n,i}, y_{n,-i})\left[VC(y_{n,-i})\right]^{-1}\left[y_{n,-i} - E(y_{n,-i})\right]$$

where $y_{n,-i}$ is the same as $y_n$ except that $y_{n,i}$ is deleted. This efficient predictor would be especially applicable to the case in which one were to predict the value of a house, given the hedonic characteristics and prices of the houses in the area.

(d) The intuitive predictor:

$$y^{(4)}_{n,i} = x_{n,i.}\beta + \lambda w_{n,i.}y_n \qquad (80)$$

This predictor is simply based on the right hand side of the generating model in (73). It is biased because it ignores the correlation between $w_{n,i.}y_n$ and $u_{n,i}$.

Mean Squared Errors of the Predictors

There is a technical problem when comparing the mean squared errors of the four predictors outlined above. Specifically, the predictors $y^{(2)}_{n,i}$, $y^{(3)}_{n,i}$, and $y^{(4)}_{n,i}$ depend upon a particular realization of the dependent vector, $y_n$. In order to compare these predictors to each other, and to the reduced form predictor $y^{(1)}_{n,i}$, Kelejian and Prucha (2007) essentially average the mean squared errors of the last three predictors over all realizations of the dependent vector, namely they calculate

$$E\left[MSE(y^{(j)}_{n,i} \mid X, W)\right], \quad j = 1, 2, 3, 4. \qquad (81)$$

Theoretically, they show

$$MSE(y^{(1)}_{n,i}) \geq MSE(y^{(2)}_{n,i}) \geq MSE(y^{(3)}_{n,i}), \qquad (82)$$
$$MSE(y^{(4)}_{n,i}) \geq MSE(y^{(2)}_{n,i}).$$

The exact calculations are given in Kelejian and Prucha (2007); however, the inequalities in (82) are consistent with theoretical notions. The first line in (82) follows because of increasing information sets. The second line follows because $y^{(2)}_{n,i}$ is the unbiased version of $y^{(4)}_{n,i}$.

The expressions in (82) involve the model parameters. Kelejian and Prucha (2007) evaluated these expressions over a wide range of model parameter values. Their numerical results are more revealing. On average, the ratios of the MSEs to the efficient MSE, $MSE(y^{(3)}_{n,i})$, are

For $y^{(1)}_{n,i}$: 16.6
For $y^{(2)}_{n,i}$: 1.07
For $y^{(4)}_{n,i}$: 2.2

It is interesting to note that the predictor based on the reduced form, namely $y^{(1)}_{n,i}$, is by far the worst predictor. In addition to its high average value, it also had outliers for certain model parameter values.
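The four predictors can be computed directly from (74), (75), (79), and (80) when the parameters are known. The following sketch is only an illustration under assumed values: the ring-shaped weighting matrix, the parameter values, the simulated regressor, and the function name `predictors` are all hypothetical, and the small Monte Carlo loop is a rough stand-in for the exact MSE calculations in Kelejian and Prucha (2007).

```python
import numpy as np

def predictors(i, y, X, W, beta, lam, rho, sigma2):
    """Return the four predictors of y[i], with all parameters treated as known."""
    n = len(y)
    I = np.eye(n)
    A = np.linalg.inv(I - lam * W)            # (I - lambda W)^{-1}
    Bm = np.linalg.inv(I - rho * W)           # (I - rho W)^{-1}
    mu = A @ X @ beta                         # E(y)
    C_uy = sigma2 * Bm @ Bm.T @ A.T           # cov(u, y)
    V_y = A @ C_uy                            # VC(y)
    w_i = W[i]
    base = X[i] @ beta + lam * (w_i @ y)      # right hand side of (73) without u_i

    p1 = mu[i]                                                  # (74)
    cov_u_wy = C_uy[i] @ w_i                                    # (78)
    var_wy = w_i @ V_y @ w_i
    p2 = base + cov_u_wy / var_wy * (w_i @ y - w_i @ mu)        # (75)
    keep = np.arange(n) != i                                    # drop unit i
    p3 = base + C_uy[i, keep] @ np.linalg.solve(
        V_y[np.ix_(keep, keep)], y[keep] - mu[keep])            # (79)
    p4 = base                                                   # (80)
    return p1, p2, p3, p4

# Hypothetical design: 20 units on a circle, two neighbors each.
rng = np.random.default_rng(0)
n, i = 20, 0
idx = np.arange(n)
W = np.zeros((n, n))
W[idx, (idx - 1) % n] = 0.5
W[idx, (idx + 1) % n] = 0.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta, lam, rho, sigma2 = np.array([1.0, 1.0]), 0.4, 0.5, 1.0
A = np.linalg.inv(np.eye(n) - lam * W)
Bm = np.linalg.inv(np.eye(n) - rho * W)

# Average squared prediction errors over simulated realizations of y.
reps = 500
sq_err = np.zeros(4)
for _ in range(reps):
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    y = A @ (X @ beta + Bm @ eps)
    sq_err += (y[i] - np.array(predictors(i, y, X, W, beta, lam, rho, sigma2))) ** 2
mse = sq_err / reps
print("simulated MSEs for predictors 1 to 4:", mse)
```

Since $w_{ii} = 0$, the term $w_{n,i.}y_n$ depends only on the other units, so the third predictor coincides with the conditional mean of $y_{n,i}$ given $y_{n,-i}$ obtained by applying (71) to the joint distribution of $y_n$.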

1.2.5 Spatial Models with Uniform Weights

Spatial models whose weighting matrices have equal elements might be considered if units can reasonably be viewed as equally distant within a certain neighborhood. As one example, that neighborhood might be a school in a study of student achievement where it is suspected that each student's achievement is, at least in part, related to the achievements of others in that school. As another example, an equal weights matrix might be considered in a study of the agricultural productivity of farmers in a given region, e.g., farmers who live in a given village. In such a study it might be assumed that farmers learn from each other and hence their productivity is interrelated.

Consider the model

$$y_N = e_N\alpha + X_N\beta + \lambda W_N y_N + \varepsilon_N \qquad (83)$$
$$= Z_N\gamma + \varepsilon_N$$
$$Z_N = (e_N, X_N, W_N y_N), \quad \gamma' = (\alpha, \beta', \lambda)$$

where $y_N$ is the $N \times 1$ vector of observations on the dependent variable, $e_N$ is an $N \times 1$ vector of unit elements, and the remaining notation is evident except that we are assuming, in a manner specified below, that all of the non-diagonal elements of $W_N$ are equal. Since we are explicitly introducing an intercept, the regressor matrix $X_N$ does not contain the constant term. We note for future reference that the model in (83) contains both an intercept and a spatial lag of the dependent variable.

Suppose the researcher assumes, as would often be the case, that $E(\varepsilon_N) = 0$ and $E(\varepsilon_N\varepsilon_N') = \sigma^2 I_N$. Then, given that $I_N - \lambda W_N$ is non-singular, we have $y_N = (I_N - \lambda W_N)^{-1}[\alpha e_N + X_N\beta + \varepsilon_N]$ and so $W_N y_N = W_N(I_N - \lambda W_N)^{-1}[\alpha e_N + X_N\beta + \varepsilon_N]$. Therefore

$$E(W_N y_N\varepsilon_N') = \sigma^2 W_N(I_N - \lambda W_N)^{-1} \neq 0 \qquad (84)$$

i.e., in general the spatial lag $W_N y_N$ will be correlated with the disturbance vector $\varepsilon_N$. Given this endogeneity of $W_N y_N$ the researcher might attempt to estimate model (83) by the 2SLS procedure.

Suppose the model in (83) is indeed estimated by two stage least squares in terms of the full column rank $N \times (1 + k + r)$ matrix of instruments $H_N =$

$(e_N, X_N, G_N)$ where, of course, $G_N$ is an $N \times r$ matrix and $r \geq 1$. Given results in Kelejian and Prucha (1998), $G_N$ could be taken to be the linearly independent columns of $(W_N X_N, W_N^2 X_N, \dots, W_N^q X_N)$, where typically $q \leq 2$. Let $P_{H_N} = H_N(H_N'H_N)^{-1}H_N'$ and $\hat{Z}_N = P_{H_N}Z_N$. Then, assuming that $\hat{Z}_N$ has full column rank, the 2SLS estimator of $\gamma$ is

$$\hat{\gamma}_N = (\hat{\alpha}_N, \hat{\beta}_N', \hat{\lambda}_N)' = (\hat{Z}_N'\hat{Z}_N)^{-1}\hat{Z}_N' y_N. \qquad (85)$$

Our main result is given in Theorem 1. Its implications are given in the remarks that follow.

Theorem 1. Assume the model in (83). Let $\bar{y}_N = e_N' y_N / N$ denote the sample mean of $y_N$. If

$$W_N = a_N[e_N e_N' - I_N] = \begin{bmatrix} 0 & a_N & \cdots & a_N & a_N \\ a_N & 0 & \cdots & a_N & a_N \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_N & a_N & \cdots & 0 & a_N \\ a_N & a_N & \cdots & a_N & 0 \end{bmatrix} \qquad (86)$$

where $a_N$ is a constant whose value could depend upon the sample size, $N$, then

(a) $\hat{\gamma}_N = (\hat{\alpha}_N, \hat{\beta}_N', \hat{\lambda}_N)' = (N\bar{y}_N, 0, -1/a_N)'$,
(b) $\hat{\varepsilon}_N = y_N - Z_N\hat{\gamma}_N = 0$.

Proof of Theorem 1: First note that if $W_N$ is given by (86), then

$$W_N y_N = (N a_N\bar{y}_N)e_N - a_N y_N \qquad (87)$$

which is linear in the variable being explained, namely $y_N$. Given (87), the estimated residual vector $\hat{\varepsilon}_N = y_N - Z_N\hat{\gamma}_N$ can be written as

$$\hat{\varepsilon}_N = y_N - e_N\hat{\alpha}_N - X_N\hat{\beta}_N - \hat{\lambda}_N W_N y_N$$
$$= y_N(1 + \hat{\lambda}_N a_N) - e_N(\hat{\alpha}_N + N\hat{\lambda}_N a_N\bar{y}_N) - X_N\hat{\beta}_N.$$

Substituting the expressions for $\hat{\alpha}_N$, $\hat{\beta}_N$, and $\hat{\lambda}_N$ given in part (a) of the theorem, it is then readily seen that $\hat{\varepsilon}_N = 0$. The 2SLS objective function is given by

$$\hat{\varepsilon}_N' H_N(H_N'H_N)^{-1}H_N'\hat{\varepsilon}_N. \qquad (88)$$

Since $H_N(H_N'H_N)^{-1}H_N'$ is positive semi-definite, $\hat{\varepsilon}_N' H_N(H_N'H_N)^{-1}H_N'\hat{\varepsilon}_N \geq 0$. The 2SLS objective function is thus clearly minimized for $\hat{\varepsilon}_N = 0$. Since we have just shown that $\hat{\varepsilon}_N = 0$ for $\hat{\gamma}_N' = (N\bar{y}_N, 0, -1/a_N)$, it follows that $\hat{\gamma}_N$ is indeed the vector of 2SLS estimators.

Remark 1: Since the diagonal elements of $W_N$ are all zero, the non-diagonal elements are all equal, and the sample size is $N$, one would typically take $a_N = 1/(N - 1)$ in the above illustrative cases, e.g., villages.

Remark 2: Given the model in (83) and the weighting matrix in (86) it should be clear that any estimator that is defined as a minimizer of a positive semi-definite quadratic form of the disturbances, e.g., OLS, will be identical to the 2SLS estimators.

Remark 3: Part (a) of Theorem 1 implies that, in a single panel framework (e.g., data on just one village), the model in (83) with $W_N$ specified as in (86) is not a useful one and, indeed, should be avoided. This should also be clear from part (b), which implies that the usual estimator for $\sigma^2$ is given by $\hat{\sigma}_N^2 = N^{-1}\hat{\varepsilon}_N'\hat{\varepsilon}_N = 0$, and so typical test statistics are not defined because they require division by zero. The suggestion is that results relating to them obtained in practice will, most likely, be based on rounding errors.

Remark 4: Theorem 1 also has implications concerning 2SLS estimation of model (83) for situations where the weighting matrix is not observed, but instead is parameterized in terms of observable variables and then its parameters are estimated by a nonlinear 2SLS procedure along with the regression parameters. Unfortunately, for a wide variety of parameterizations the results of such an estimation procedure would not be consistent.
To see the issues involved, suppose for the moment that the $(i, j)$-th element of the weighting matrix is specified as

$$w_{ii,N}(c) = 0; \qquad w_{ij,N}(c) = \frac{1}{1 + d_{ij,N}^{\,c}}, \quad i \neq j \qquad (89)$$

where $d_{ij,N} \geq 0$ is an observable distance measure between the $i$-th and $j$-th units, and $c \geq 0$ is a parameter to be estimated. Let $W_N(c)$ be the $N \times N$ weighting matrix for this case, and let $Z_N(c) = (e_N, X_N, W_N(c)y_N)$ be the regressor matrix corresponding to this more general version of the model in (83). Let $\tilde{\varepsilon}_N(\tilde{c}_N) = y_N - Z_N(\tilde{c}_N)\tilde{\gamma}_N$ where $\tilde{\gamma}_N' = (\tilde{\alpha}_N, \tilde{\beta}_N', \tilde{\lambda}_N)$.

Then the non-linear two stage least squares estimator for this model would minimize

$$\tilde{\varepsilon}_N'(\tilde{c}_N)\,H_N(H_N'H_N)^{-1}H_N'\,\tilde{\varepsilon}_N(\tilde{c}_N) \qquad (90)$$

with respect to $\tilde{\alpha}_N$, $\tilde{\beta}_N$, $\tilde{\lambda}_N$ and $\tilde{c}_N$. Unfortunately, as should be clear from Theorem 1, the results of the minimization will lead to

$$\tilde{c}_N = 0, \qquad (\tilde{\alpha}_N, \tilde{\beta}_N', \tilde{\lambda}_N) = (N\bar{y}_N, 0, -2)$$

since $c = 0$ implies uniform weights (in this case, $a_N = 1/2$) and this, in turn, implies via part (b) of Theorem 1 that the minimized value in (90) is zero. We note that this negative result would not be altered for other specifications of $w_{ij,N}$, as long as there are admissible parameter values such that all non-diagonal weights are equal.

In a sense there is a corollary to Remark 4. Specifically, suppose that in a model such as (83) the weighting matrix is not known a priori and the researcher considers various observable specifications of it in terms of, say, various distance measures, e.g., trade shares, geographic distance, etc. Remark 4 suggests that if that researcher then selects the specification of the weighting matrix on the basis of the standard $R^2$ statistic, the results may be biased in the direction of the matrix with the most uniform weights. Clearly, the suggestion is that the $R^2$ measure of fit should not be used to determine the weighting matrix.

There is a subtle point concerning the remarks above relating to the case in which $w_{ij,N}(c)$ is given by (89). Specifically, taking $c = 0$ implies that $w_{ij,N}(c) = 1/2$. This in turn violates the assumption maintained in our large sample analysis that the weighting matrix is "summable", i.e., $c = 0$ should not be in the parameter space. On the other hand, if the weighting matrix were formed by first taking $w_{ij,N}(c)$ as in (89) and then row normalizing, we would still end up with uniform weights if $c = 0$ and with a matrix that is summable, and so there would still be problems.

Suppose now that the researcher had panel data on a model such as (83), e.g., data on more than one village.
The reader should be able to verify that the identification problem will still hold if the number of observations in each village were the same and if fixed effects were considered, i.e., each village had its own intercept. On the other hand, if fixed effects were not considered, or if the number of observations in each village were not the same, this identification problem would not arise and so, under usual assumptions, the 2SLS estimators would be consistent.
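Theorem 1 and the $c = 0$ case of (89) are purely algebraic results, so they can be checked numerically on arbitrary data. The sketch below is an illustration under assumptions: the data are simulated from a hypothetical design that is not model (83), `g` is a made-up extra instrument playing the role of $G_N$ (it is correlated with `y` so that $\hat{Z}_N$ has full column rank), and the function names `tsls` and `distance_weights` are invented for this example.

```python
import numpy as np

def tsls(y, Z, H):
    """2SLS of y on Z with instruments H: regress y on the projection of Z on H."""
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    return np.linalg.lstsq(Z_hat, y, rcond=None)[0]

def distance_weights(c, D):
    """(89): w_ii = 0 and w_ij = 1 / (1 + d_ij^c) for i != j."""
    W = 1.0 / (1.0 + D ** c)
    np.fill_diagonal(W, 0.0)
    return W

rng = np.random.default_rng(0)
N = 50
e = np.ones(N)
x = rng.normal(size=N)
g = rng.normal(size=N)                                   # hypothetical extra instrument
y = 1.0 + 2.0 * x + 3.0 * g + rng.normal(size=N)         # arbitrary data, for the algebra only
H = np.column_stack([e, x, g])

def fit(W):
    """Return the 2SLS coefficients (alpha, beta, lambda) and residuals y - Z gamma."""
    Z = np.column_stack([e, x, W @ y])
    gamma = tsls(y, Z, H)
    return gamma, y - Z @ gamma

# Uniform weights as in (86) with a_N = 1/(N - 1).
a = 1.0 / (N - 1)
gamma_u, resid_u = fit(a * (np.ones((N, N)) - np.eye(N)))
print("uniform weights:", gamma_u, "| Theorem 1 (a):", (N * y.mean(), 0.0, -1.0 / a))
print("largest absolute residual:", np.abs(resid_u).max())

# Distance-based weights (89) evaluated at c = 0, which gives a_N = 1/2.
pos = rng.uniform(1.0, 5.0, size=N)
D = np.abs(pos[:, None] - pos[None, :])
gamma_c, resid_c = fit(distance_weights(0.0, D))
print("c = 0 weights:", gamma_c, "| implied values:", (N * y.mean(), 0.0, -2.0))
```

In both cases the fitted residuals should vanish up to rounding error, which is the sense in which the objective function in (90) attains its lower bound of zero at $c = 0$ regardless of the data.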

1.2.6 Estimation issues relating to missing data for border units

Data shortcomings often arise in the analysis of spatial models containing spatial lags in either the dependent variable or in the exogenous variables. In many cases these shortcomings relate to the lack of data on relevant variables relating to certain units which are defined to be neighbors of at least some of the other units in the study which are observed. There are various ways researchers have confronted this problem. One is to ignore it. This approach leads to an omitted variable problem. Another approach is to construct a sample from the available data which is complete in the sense that observations are available for all units and their neighbors. This approach leads to an errors in variables problem.

ESTIMATION PROBLEMS IN MODELS WITH SPATIAL WEIGHTING MATRICES WHICH HAVE BLOCKS OF EQUAL ELEMENTS*

ESTIMATION PROBLEMS IN MODELS WITH SPATIAL WEIGHTING MATRICES WHICH HAVE BLOCKS OF EQUAL ELEMENTS* JOURNAL OF REGIONAL SCIENCE, VOL. 46, NO. 3, 2006, pp. 507 515 ESTIMATION PROBLEMS IN MODELS WITH SPATIAL WEIGHTING MATRICES WHICH HAVE BLOCKS OF EQUAL ELEMENTS* Harry H. Kelejian Department of Economics,

More information

A SPATIAL CLIFF-ORD-TYPE MODEL WITH HETEROSKEDASTIC INNOVATIONS: SMALL AND LARGE SAMPLE RESULTS

A SPATIAL CLIFF-ORD-TYPE MODEL WITH HETEROSKEDASTIC INNOVATIONS: SMALL AND LARGE SAMPLE RESULTS JOURNAL OF REGIONAL SCIENCE, VOL. 50, NO. 2, 2010, pp. 592 614 A SPATIAL CLIFF-ORD-TYPE MODEL WITH HETEROSKEDASTIC INNOVATIONS: SMALL AND LARGE SAMPLE RESULTS Irani Arraiz Inter-American Development Bank,

More information

1 Overview. 2 Data Files. 3 Estimation Programs

1 Overview. 2 Data Files. 3 Estimation Programs 1 Overview The programs made available on this web page are sample programs for the computation of estimators introduced in Kelejian and Prucha (1999). In particular we provide sample programs for the

More information

1 Introduction to Generalized Least Squares

1 Introduction to Generalized Least Squares ECONOMICS 7344, Spring 2017 Bent E. Sørensen April 12, 2017 1 Introduction to Generalized Least Squares Consider the model Y = Xβ + ɛ, where the N K matrix of regressors X is fixed, independent of the

More information

Finite sample properties of estimators of spatial autoregressive models with autoregressive disturbances

Finite sample properties of estimators of spatial autoregressive models with autoregressive disturbances Papers Reg. Sci. 82, 1 26 (23) c RSAI 23 Finite sample properties of estimators of spatial autoregressive models with autoregressive disturbances Debabrata Das 1, Harry H. Kelejian 2, Ingmar R. Prucha

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

Spatial Regression. 11. Spatial Two Stage Least Squares. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Spatial Regression. 11. Spatial Two Stage Least Squares. Luc Anselin.  Copyright 2017 by Luc Anselin, All Rights Reserved Spatial Regression 11. Spatial Two Stage Least Squares Luc Anselin http://spatial.uchicago.edu 1 endogeneity and instruments spatial 2SLS best and optimal estimators HAC standard errors 2 Endogeneity and

More information

Dealing With Endogeneity

Dealing With Endogeneity Dealing With Endogeneity Junhui Qian December 22, 2014 Outline Introduction Instrumental Variable Instrumental Variable Estimation Two-Stage Least Square Estimation Panel Data Endogeneity in Econometrics

More information

GMM Estimation of Spatial Error Autocorrelation with and without Heteroskedasticity

GMM Estimation of Spatial Error Autocorrelation with and without Heteroskedasticity GMM Estimation of Spatial Error Autocorrelation with and without Heteroskedasticity Luc Anselin July 14, 2011 1 Background This note documents the steps needed for an efficient GMM estimation of the regression

More information

A Spatial Cliff-Ord-type Model with Heteroskedastic Innovations: Small and Large Sample Results 1

A Spatial Cliff-Ord-type Model with Heteroskedastic Innovations: Small and Large Sample Results 1 A Spatial Cliff-Ord-type Model with Heteroskedastic Innovations: Small and Large Sample Results 1 Irani Arraiz 2,DavidM.Drukker 3, Harry H. Kelejian 4 and Ingmar R. Prucha 5 August 21, 2007 1 Our thanks

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

Chapter 5 Prediction of Random Variables

Chapter 5 Prediction of Random Variables Chapter 5 Prediction of Random Variables C R Henderson 1984 - Guelph We have discussed estimation of β, regarded as fixed Now we shall consider a rather different problem, prediction of random variables,

More information

Instrumental Variables, Simultaneous and Systems of Equations

Instrumental Variables, Simultaneous and Systems of Equations Chapter 6 Instrumental Variables, Simultaneous and Systems of Equations 61 Instrumental variables In the linear regression model y i = x iβ + ε i (61) we have been assuming that bf x i and ε i are uncorrelated

More information

Econometrics Master in Business and Quantitative Methods

Econometrics Master in Business and Quantitative Methods Econometrics Master in Business and Quantitative Methods Helena Veiga Universidad Carlos III de Madrid Models with discrete dependent variables and applications of panel data methods in all fields of economics

More information

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research Linear models Linear models are computationally convenient and remain widely used in applied econometric research Our main focus in these lectures will be on single equation linear models of the form y

More information

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley Panel Data Models James L. Powell Department of Economics University of California, Berkeley Overview Like Zellner s seemingly unrelated regression models, the dependent and explanatory variables for panel

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Spatial Econometrics

Spatial Econometrics Spatial Econometrics Lecture 5: Single-source model of spatial regression. Combining GIS and regional analysis (5) Spatial Econometrics 1 / 47 Outline 1 Linear model vs SAR/SLM (Spatial Lag) Linear model

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Introduction to Econometrics Final Examination Fall 2006 Answer Sheet

Introduction to Econometrics Final Examination Fall 2006 Answer Sheet Introduction to Econometrics Final Examination Fall 2006 Answer Sheet Please answer all of the questions and show your work. If you think a question is ambiguous, clearly state how you interpret it before

More information

GMM Estimation of the Spatial Autoregressive Model in a System of Interrelated Networks

GMM Estimation of the Spatial Autoregressive Model in a System of Interrelated Networks GMM Estimation of the Spatial Autoregressive Model in a System of Interrelated etworks Yan Bao May, 2010 1 Introduction In this paper, I extend the generalized method of moments framework based on linear

More information

Estimation of simultaneous systems of spatially interrelated cross sectional equations

Estimation of simultaneous systems of spatially interrelated cross sectional equations Journal of Econometrics 118 (2004) 27 50 www.elsevier.com/locate/econbase Estimation of simultaneous systems of spatially interrelated cross sectional equations Harry H. Kelejian, Ingmar R. Prucha Department

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation 1 Outline. 1. Motivation 2. SUR model 3. Simultaneous equations 4. Estimation 2 Motivation. In this chapter, we will study simultaneous systems of econometric equations. Systems of simultaneous equations

More information

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data Panel data Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data - possible to control for some unobserved heterogeneity - possible

More information
