Lecture 7 Autoregressive Processes in Space

Size: px

Start display at page:

Download "Lecture 7 Autoregressive Processes in Space"

Cuthbert Henderson
5 years ago
Views:

1 Lecture 7 Autoregressive Processes in Space Dennis Sun Stanford University Stats 253 July 8, 2015

2 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

3 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

4 AR Processes in Time Rather than model the covariance between errors explicitly, we assumed that the errors followed an AR(p) process: ɛ t = φ 1 ɛ t φ p ɛ t p + δ t. This induced a covariance structure between the errors. Estimation of φ is easy: Under the hack approach, you will have estimates ˆɛ t of the errors, and you can estimate φ by regressing ˆɛ on lagged versions of itself. If you follow the model-based approach, optimization over φ is not difficult because Σ 1 φ is banded.

5 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

6 Sudden Infant Death Syndrome (SIDS) Data library(spdep) example(nc.sids) gr.colors <- colorramppalette(c("gray", "red")) spplot(nc.sids, "SID74", col.regions=gr.colors(100)) AR processes have traditionally been used to model lattice data (or areal data), like this.

7 Generalizing AR Processes to Space There are two equivalent ways to specify a temporal AR process: By defining the variables in terms of each other: where δ t iid N(0, τ 2 ). ɛ t = p φ j ɛ t j + δ t, j=1 By specifying the conditional distribution: p(ɛ t ɛ t 1, ɛ t 2,...) exp 1 ( 2τ 2 ɛ t Both can be generalized naturally to space. p ) 2 φ j ɛ t j. j=1

8 Generalizing AR Processes to Space Simultaneous Autoregression (SAR): 1 ɛ s = φ N(s) s N(s) ɛ s + δ s, where N(s) denotes the neighbors of s and δ s iid N(0, τ 2 ). Conditional Autoregression (CAR): { p(ɛ s ɛ s ) exp 1 ( 1 2τ 2 ɛ s φ N(s) Are they the same? s N(s) ɛ s ) 2 }.

9 Simultaneous Autoregression (SAR) 1 ɛ s = φ N(s) s N(s) ɛ s + δ s. Let W be the matrix where W ij = 1/ N(i) if j N(i) and 0 otherwise. Then, we can write the SAR model as ɛ = φw ɛ + δ, where δ N(0, τ 2 I), or equivalently (I φw )ɛ = δ. Therefore, for SAR, ɛ N(0, τ 2 (I φw ) 1 (I φw ) T ).

10 Conditional Autoregression (CAR) { p(ɛ s ɛ s ) exp 1 ( 1 2τ 2 ɛ s φ N(s) CAR is a bit trickier. s N(s) ɛ s ) 2 }. For time series, we could order the data and obtain the joint distribution from the conditionals: p(ɛ 1,..., ɛ n ) = p(ɛ 1 ) p(ɛ 2 ɛ 1 ) p(ɛ 3 ɛ 1, ɛ 2 )... p(ɛ n ɛ 1,..., ɛ n 1 ). This trick doesn t work here because spatial data don t have a natural ordering.

11 Conditional Autoregression (CAR) The following result gives us the joint distribution in terms of the conditionals, up to a normalizing constant: Theorem (Brook s Lemma) Let p(ɛ) > 0 for all ɛ. Then, for any ɛ and ɛ : Proof. p(ɛ) n p(ɛ ) = p(ɛ i ɛ 1,..., ɛ i 1, ɛ i+1,..., ɛ n) p(ɛ i ɛ 1,..., ɛ i 1, ɛ i+1,..., ɛ n) i=1 p(ɛ) p(ɛ ) = p(ɛ 1 ɛ 2,..., ɛ n) p(ɛ 1 ɛ 2,..., ɛ n) p(ɛ 2,..., ɛ n ɛ 1 ) p(ɛ 2,..., ɛ n ɛ 1 ) = p(ɛ 1 ɛ 2,..., ɛ n) p(ɛ 1 ɛ 2,..., ɛ n) p(ɛ 2 ɛ 1, ɛ 3,..., ɛ n) p(ɛ 2 ɛ 1, ɛ 3,..., ɛ n) p(ɛ 3,..., ɛ n ɛ 1, ɛ 2 ) p(ɛ 3,..., ɛ n ɛ 1, ɛ 2 ) = p(ɛ 1 ɛ 2,..., ɛ n) p(ɛ 1 ɛ 3,..., ɛ n) p(ɛ 2 ɛ 1, ɛ 3,..., ɛ n) p(ɛ 2 ɛ 1, ɛ 3,..., ɛ n)...

12 Conditional Autoregression (CAR) Apply Brook s lemma to obtain p(ɛ)/p(0) for the CAR model: p(ɛ) n p(0) = p(ɛ i ɛ 1,..., ɛ i 1, 0 i+1,..., 0 n ) p(0 i=1 i ɛ 1,..., ɛ i 1, 0 i+1,..., 0 n ) { n exp 1 2τ (ɛ = 2 i φ j<i W ijɛ j φ ) 2 } j>i 0 j { i=1 exp 1 2τ (0 2 i φ j<i W ijɛ j φ ) 2 } j>i 0 j { = exp 1 n ( 2τ 2 ɛ i φ ) 2 1 n W ij ɛ j + 2τ 2 i=1 j<i i=1 { = exp 1 2τ 2 n ( ) } ɛ 2 i 2φɛ i W ij ɛ j i=1 j<i ( φ ) 2 } W ij ɛ j j<i If W is symmetric, then 2 i j<i ɛ iw ij ɛ j = i j ɛ iw ij ɛ j, so: { = exp 1 } 2τ 2 ɛt (I φw )ɛ, so ɛ N(0, τ 2 (I φw ) 1 ).

13 Comparison of SAR and CAR Simultaneous Autoregression (SAR): ɛ N(0, τ 2 (I φw ) 1 (I φw ) T ). Conditional Autoregression (CAR). W must be symmetric, ɛ N(0, τ 2 (I φw ) 1 ). Unlike with time series, the two specifications yield different models!

14 Extensions W can be any weight matrix in general. For example, we might... give immediate neighbors more weight than two-hop neighbors. weight pairs depending on the distance between them. Var[δ] does not have to be τ 2 I. It is common to assume that it is diagonal with different variances τi 2. This is important when analyzing data aggregated by county/state, since each data point is based on a different sample size n i. In this case, we typically assume τi 2 1 n i. If D = diag(τi 2 ), then the variance of SAR and CAR become (I W ) 1 D(I W ) T and (I W ) 1 D, respectively. For CAR, the requirement that W is symmetric needs to be changed accordingly.

15 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

16 Estimating φ Can we estimate φ by regressing ɛ s on its neighbors? No! First, each observation may have a different number of neighbors. Even if we had regularly-spaced data where every observation has the same number of neighbors, Whittle (1954) showed that this estimator is inconsistent.

17 Maximum Likelihood The log-likelihood for the CAR model is log det(τ 2 (I φw ) 1 ) 1 2τ 2 (y Xβ)T (I φw )(y Xβ). It is possible to reduce to this to a partial likelihood in just φ by substituting the optimal values for β and τ 2 : β(φ) = (X T (I φw )X) 1 X T (I φw )y τ 2 (φ) = 1 n (y Xβ(φ))T (I φw )(y Xβ(φ)). Optimizing over φ is a one-dimensional problem that can be solved by grid search. Note that φ < 1/λ 1 (W ) is necessary to ensure that the covariance matrix is positive-definite.

18 The log-partial likelihood is Maximum Likelihood log det(τ(φ) 2 (I φw ) 1 ) 1 2τ(φ) 2 (y Xβ(φ))T (I φw )(y Xβ(φ)). W is usually sparse. The second term can be evaluated with just a few matrix-vector multiplications involving (I φw ), which is easy to do. The real challenge is evaluating log det(i φw ). This matrix is no longer banded. But notice that log det(i φw ) = n log(1 φλ i (W )), i=1 so we do not need to evaluate the determinant for each φ we test. We just have to find the eigenvalues of W once.

19 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

20 Likelihood ratio test for testing H 0 : φ = 0.

21 1 Last Time 2 Autoregressive Processes in Space 3 Estimating Parameters 4 Testing for Spatial Autocorrelation 5 Application to the SIDS Data

22 Model with Number of Births Residuals

23 Call: spautolm(formula = SID74 ~ BIR74, data = nc.sids, listw = nb2listw(nccr85_nb), family = "SAR") Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) BIR <2e-16 Lambda: LR test value: p-value: Numerical Hessian standard error of lambda: Log likelihood: ML residual variance (sigma squared): , (sigma: ) Number of observations: 100 Number of parameters estimated: 4 AIC: Note that what they call Lambda is what we have called φ above.

24 Model with Numbers of Births and Nonwhite Births Residuals

25 Call: spautolm(formula = SID74 ~ BIR74 + NWBIR74, data = nc.sids, listw = nb2listw(nccr85_nb), family = "SAR") Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error z value Pr(> z ) (Intercept) BIR NWBIR e-10 Lambda: LR test value: p-value: Numerical Hessian standard error of lambda: Log likelihood: ML residual variance (sigma squared): , (sigma: ) Number of observations: 100 Number of parameters estimated: 5 AIC:

Lecture 18. Models for areal data. Colin Rundel 03/22/2017

Lecture 18. Models for areal data. Colin Rundel 03/22/2017 Lecture 18 Models for areal data Colin Rundel 03/22/2017 1 areal / lattice data 2 Example - NC SIDS SID79 3 EDA - Moran s I If we have observations at n spatial locations s 1,... s n ) I = n i=1 n n j=1