Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang

Outline: Description; Research Problems; Global Clustering and Local Clusters; Permutation Test; Spatial Scan Test; Spatial Autoregressive Models; Geographical Weighted Regression; An Example for Cluster Detection; Linkage between Point Process and Lattice Data; Consideration of Asymptotics

Description Suppose a study area in a map is partitioned into $m$ spatial units. Let $Y_1, \dots, Y_m$ be the response variables and $x_1, \dots, x_m$ be the explanatory variables. If $Y_i$ is continuous, then we have the model $Y_i = x_i\beta + \delta_i$ and $\delta = \rho W\delta + \epsilon$, where $\delta = (\delta_1, \dots, \delta_m)^T$ and $\epsilon \sim N(0, \sigma^2 I)$. This is also called the spatial autoregressive (SAR) model. If $Y_i$ is a count, then we assume $Y_i$ follows $\mathrm{Poisson}(n_i\theta_i)$, where $n_i$ is the at-risk population. We have the model $\log\theta_i = x_i\beta + \delta_i$. The term $\delta_i$ may be specified either through a conditional autoregressive (CAR) model or through a SAR model for counts.

Description The weight matrix $W$ is pre-specified. It is often defined by neighboring information, i.e., $w_{ij}$ equals one over the number of neighbors of the $i$th unit if units $i$ and $j$ are next to each other, and $w_{ij} = 0$ otherwise. Each row of $W$ then sums to one, and $W$ is generally not symmetric. Sometimes, we consider a spatial cluster model, which is $\log\theta_i = x_i\beta + \alpha_i$, where $\alpha_i = \alpha$ if $i \in C$ and $\alpha_i = 0$ if $i \notin C$. The set $C$ is often called a spatial cluster. The spatial structure is important.
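As a minimal sketch of the neighbor-based weight matrix described above, the following builds a row-standardized $W$ for a rectangular grid. The grid layout, the rook (shared-edge) definition of "next to each other", and the function name `rook_weights` are assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def rook_weights(nrow, ncol):
    """Row-standardized weights for an nrow x ncol grid (hypothetical layout).

    Units sharing an edge are treated as neighbors; w_ij is one over the
    number of neighbors of unit i, and zero for non-neighbors.
    """
    m = nrow * ncol
    W = np.zeros((m, m))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrow and 0 <= cc < ncol:
                    W[i, rr * ncol + cc] = 1.0
    # Divide each row by its number of neighbors.
    return W / W.sum(axis=1, keepdims=True)

W = rook_weights(3, 3)
print(W.sum(axis=1))        # each row sums to one
print(np.allclose(W, W.T))  # symmetry check
```

On this grid a corner unit has two neighbors and an edge unit has three, so the weight from corner to edge differs from the weight from edge to corner, which is why the matrix is not symmetric.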

Research Problems In the SAR model for continuous data, estimation of $\rho$ is important. The specification of $W$ is also an issue. In the CAR or SAR model for count data, Bayesian estimation is often used. This is also called the disease mapping problem. In the spatial cluster model, the detection of the spatial cluster $C$ is of interest. This is also called the cluster detection problem.

Global Clustering and Local Clusters The term clustering indicates the presence of global spatial effects, which are often described by SAR and CAR models. Disease mapping methods are often used for count data. The geographical weighted regression (GWR) approach is also popular. The term clusters indicates the presence of local clusters, which are often described by spatial cluster models. The spatial scan test is often used.

Permutation test The permutation test is popular in the detection of global clustering. The basic idea is based on a quadratic form
$$Q = \sum_{i=1}^{m} \sum_{j=1,\, j \ne i}^{m} w_{ij} z_i z_j,$$
where $z_1, \dots, z_m$ are derived from a statistical model. The permutation approach permutes $z_1, \dots, z_m$. It assumes that the distribution of $Q$ is invariant under permutations if there is no spatial clustering. In fact, one can treat the distribution of $Q$ as conditional on the order statistics of $\{z_1, \dots, z_m\}$. Well-known permutation test statistics are Moran's $I$, Getis's $G$, and Geary's $c$. If the test is significant, then one should also study the reason.
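The permutation idea can be sketched as follows: compute $Q$ on the observed values, recompute it on random permutations, and report the proportion of permuted values at least as large. The one-sided Monte Carlo p-value, the number of permutations, and the toy line-shaped study area are assumptions for illustration only.

```python
import numpy as np

def quadratic_form(W, z):
    """Q = sum over i != j of w_ij * z_i * z_j."""
    W_off = W - np.diag(np.diag(W))  # drop any diagonal weights
    return float(z @ W_off @ z)

def permutation_test(W, z, n_perm=999, seed=0):
    """One-sided Monte Carlo permutation p-value for large values of Q."""
    rng = np.random.default_rng(seed)
    q_obs = quadratic_form(W, z)
    q_perm = np.array([quadratic_form(W, rng.permutation(z))
                       for _ in range(n_perm)])
    p_value = (1 + np.sum(q_perm >= q_obs)) / (n_perm + 1)
    return q_obs, p_value

# Hypothetical example: 20 units on a line; adjacent units are neighbors.
m = 20
W = np.diag(np.ones(m - 1), 1)
W = W + W.T
z = np.r_[-np.ones(10), np.ones(10)]  # low values on the left, high on the right
z = z - z.mean()                      # center, as in Moran-type statistics
q_obs, p_value = permutation_test(W, z)
print(q_obs, p_value)
```

Here the values are strongly clustered by construction, so almost no permutation reaches the observed $Q$ and the p-value is small.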

Permutation test There is a critical issue in all permutation tests: the type I error probability may be inflated. For example, suppose $Y_i \sim \mathrm{Poisson}(\theta n_i)$. Then, one often chooses $z_i = Y_i/n_i$, so that $E(z_i) = \theta$ and $V(z_i) = \theta/n_i$. Because the variances differ across units, the $z_i$ are not exchangeable even when there is no clustering, and this can inflate the type I error probability. It is recommended to use the Pearson residual or the deviance residual for $z_i$. For example, the Pearson residual is
$$z_i = \frac{Y_i - \hat{Y}_i}{\sqrt{\hat{Y}_i}}, \qquad \hat{Y}_i = n_i \hat{\theta} = n_i \frac{\sum_{j=1}^{m} Y_j}{\sum_{j=1}^{m} n_j}.$$
We can also use other types of residuals.

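Spatial Scan Tests The aim of spatial scan tests is to detect the cluster $C$. The original version considers the model $Y_i \sim \mathrm{Poisson}(\theta_i n_i)$ with $\theta_i = \theta_c$ if $i \in C$ and $\theta_i = \theta_0$ if $i \notin C$, where $C$ is unknown. Assume $\theta_c \ge \theta_0$. Consider $H_0: \theta_c = \theta_0$. Then, the likelihood ratio given $C$ is
$$\Lambda_C = \left(\frac{Y_C/n_C}{Y/n}\right)^{Y_C} \left(\frac{Y_{\bar{C}}/n_{\bar{C}}}{Y/n}\right)^{Y_{\bar{C}}}$$
when $Y_C/n_C \ge Y_{\bar{C}}/n_{\bar{C}}$ (and $\Lambda_C = 1$ otherwise), where $Y_C = \sum_{i \in C} Y_i$, $n_C = \sum_{i \in C} n_i$, $Y_{\bar{C}} = \sum_{i \notin C} Y_i$, $n_{\bar{C}} = \sum_{i \notin C} n_i$, $Y = \sum_{i=1}^{m} Y_i$, and $n = \sum_{i=1}^{m} n_i$. Then, the spatial scan statistic is
$$\Lambda = \sup_{C \in \mathcal{C}} \Lambda_C,$$
where $\mathcal{C}$ is a collection of cluster candidates. The $p$-value is derived by the bootstrap method.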
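The scan statistic can be sketched on the log scale as below. Several choices here are assumptions rather than the slides' method: the candidates are contiguous windows on a line of units (instead of circles on a map), and the null replicates are drawn from a multinomial distribution conditional on the total count, which is one common way to carry out the resampling step.

```python
import numpy as np

def log_lambda(y, n, idx):
    """Log likelihood ratio for candidate cluster idx under theta_c >= theta_0."""
    Y, N = y.sum(), n.sum()
    y_in, n_in = y[idx].sum(), n[idx].sum()
    y_out, n_out = Y - y_in, N - n_in
    if y_in / n_in <= y_out / n_out:  # rate inside is not elevated
        return 0.0
    def term(a, b):
        return a * np.log(a / b) if a > 0 else 0.0
    return term(y_in, n_in) + term(y_out, n_out) - term(Y, N)

def scan(y, n, candidates):
    """Maximize the log likelihood ratio over the candidate collection."""
    values = [log_lambda(y, n, c) for c in candidates]
    best = int(np.argmax(values))
    return values[best], candidates[best]

def scan_pvalue(y, n, candidates, n_boot=199, seed=0):
    """Monte Carlo p-value: resample counts under H0 given the total count."""
    rng = np.random.default_rng(seed)
    obs, best = scan(y, n, candidates)
    total, probs = int(y.sum()), n / n.sum()
    null = [scan(rng.multinomial(total, probs), n, candidates)[0]
            for _ in range(n_boot)]
    p_value = (1 + sum(v >= obs for v in null)) / (n_boot + 1)
    return obs, best, p_value

# Hypothetical data: 30 units on a line, elevated counts in units 10..14.
m = 30
n = np.full(m, 1000.0)
y = np.full(m, 10.0)
y[10:15] = 30.0
candidates = [np.arange(s, s + k) for k in range(1, 9) for s in range(m - k + 1)]
obs, best, p_value = scan_pvalue(y, n, candidates)
print(obs, best, p_value)
```

Working with $\log \Lambda_C$ avoids overflow for large counts and does not change which candidate attains the maximum.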

Spatial Scan Tests It is important to select the collection $\mathcal{C}$ in the spatial scan test. Originally, one chose $\mathcal{C}$ as all circular or rectangular subregions. Later, elliptical regions were also considered. Several issues remain. The bootstrap method is slow. The maximization of $\Lambda_C$ over $C \in \mathcal{C}$ is computationally demanding. The Poisson assumption may not hold. Explanatory variables are not included.

Spatial Scan Tests A modification: Consider a GLM of the form $\log E(Y_i) = x_i\beta + \alpha_c I_i$, where $I_i = 1$ if $i \in C$ and $I_i = 0$ if $i \notin C$. Let $G^2$ be the residual deviance goodness-of-fit statistic. Then, we have the value $G_0^2$ in the model with $\alpha_c = 0$ and the value $G_{1,C}^2$ in the model without the constraint $\alpha_c = 0$. Let $\Lambda_C = G_0^2 - G_{1,C}^2$ be the likelihood ratio statistic. Then, we can consider
$$\Lambda = \sup_{C \in \mathcal{C}} \Lambda_C$$
as before. Other goodness-of-fit statistics, such as $X^2$ and $F^2$, can also be used.
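A minimal sketch of the deviance-based statistic $\Lambda_C = G_0^2 - G_{1,C}^2$ for a Poisson log-linear model follows. The hand-written IRLS fitter, the offset $\log n_i$ for the at-risk population, and the toy data are assumptions for illustration; in practice a GLM routine from a statistics library would normally be used.

```python
import numpy as np

def poisson_deviance(X, y, offset, n_iter=100, tol=1e-10):
    """Fit log E(Y) = offset + X beta by IRLS; return the residual deviance."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta + offset)
        z = eta + (y - mu) / mu  # working response
        XtW = X.T * mu           # IRLS weights equal mu for the log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(X @ beta + offset)
    ylogy = np.where(y > 0, y * np.log(np.where(y > 0, y, 1.0) / mu), 0.0)
    return 2.0 * np.sum(ylogy - (y - mu))

def deviance_lr(X, y, offset, idx):
    """Lambda_C = G0^2 - G1C^2 for the candidate cluster idx."""
    indicator = np.zeros(len(y))
    indicator[idx] = 1.0
    g0 = poisson_deviance(X, y, offset)
    g1 = poisson_deviance(np.column_stack([X, indicator]), y, offset)
    return g0 - g1

# Hypothetical data: one covariate, elevated counts in units 10..14.
m = 30
n = np.full(m, 1000.0)
X = np.column_stack([np.ones(m), np.linspace(-1.0, 1.0, m)])
y = np.full(m, 10.0)
y[10:15] = 30.0
lam_true = deviance_lr(X, y, np.log(n), np.arange(10, 15))
lam_wrong = deviance_lr(X, y, np.log(n), np.arange(0, 5))
print(lam_true, lam_wrong)
```

Taking the supremum of `deviance_lr` over a candidate collection gives the modified scan statistic, with the explanatory variables accounted for in both fits.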

Spatial Scan Tests If overdispersion is present, then we can derive the estimate of the dispersion parameter $\phi$ as
$$\hat{\phi} = \max\left(\frac{X_0^2}{m - 1},\, 1\right),$$
where $X_0^2$ is the $X^2$ statistic of the model with $\alpha_c = 0$. This estimate can be used to modify $\Lambda$. One can also consider a zero-inflated model in which one assumes $Y_i = \epsilon_i\, \mathrm{Poisson}(\theta_i n_i)$, where $\epsilon_i$ is a Bernoulli random variable. Overdispersion can also be included in this model. One can also develop a GLMM version.
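The dispersion estimate can be sketched as below for fitted values from the null model. The slides do not say how $\hat\phi$ modifies $\Lambda$; dividing the statistic by $\hat\phi$ is a common quasi-likelihood convention and is only an assumption here, as are the function names and the toy numbers.

```python
import numpy as np

def dispersion_estimate(y, y_hat):
    """phi_hat = max(X0^2 / (m - 1), 1), with X0^2 the Pearson statistic."""
    m = len(y)
    x2 = np.sum((y - y_hat) ** 2 / y_hat)
    return max(x2 / (m - 1), 1.0)

def adjust_statistic(lam, phi_hat):
    """Assumed adjustment: scale a deviance-difference statistic by phi_hat."""
    return lam / phi_hat

# Hypothetical counts with equal at-risk populations.
y = np.array([5.0, 20.0, 8.0, 40.0])
n = np.full(4, 1000.0)
y_hat = n * y.sum() / n.sum()  # fitted values under a common rate
phi_hat = dispersion_estimate(y, y_hat)
print(phi_hat, adjust_statistic(10.0, phi_hat))
```

The maximum with 1 prevents the adjustment from making the test more liberal when the data look underdispersed.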

Spatial Scan Tests There are a few issues to be considered. The assumption of $\alpha_i = \alpha_c$ versus $\alpha_i = 0$ is a problem, because it is only expected that $\alpha_i$ is large in $C$. How should the impact of the first cluster be adjusted for in the detection of a secondary cluster? What will happen if $C$ is mis-specified?

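Spatial Autoregressive Models Assume $y \in \mathbb{R}^n$ and $X$ is an $n \times p$ matrix. The SAR model is
$$y = X\beta + \delta, \qquad \delta = \rho W\delta + \epsilon, \qquad \epsilon \sim N(0, \sigma^2 I).$$
Then,
$$\delta \sim N\left(0,\, \sigma^2 (I - \rho W)^{-1} (I - \rho W^T)^{-1}\right).$$
Therefore, the loglikelihood function is
$$l(\rho, \beta, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 + \log|\det(I - \rho W)| - \frac{1}{2\sigma^2}[(I - \rho W)(y - X\beta)]^T[(I - \rho W)(y - X\beta)].$$
Then, $\rho$ and $\beta$ can be estimated using the profile likelihood approach.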
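A minimal sketch of the profile likelihood approach: for fixed $\rho$, the maximizing $\beta$ and $\sigma^2$ are available in closed form from least squares on the transformed data $(I - \rho W)y$ and $(I - \rho W)X$, so the loglikelihood reduces to a function of $\rho$ alone. The grid search over $\rho$, the square rook-neighbor lattice, and the simulated data are assumptions for illustration.

```python
import numpy as np

def rook_weights(k):
    """Row-standardized rook weights for a k x k grid (hypothetical layout)."""
    P = np.diag(np.ones(k - 1), 1)
    P = P + P.T
    A = np.kron(np.eye(k), P) + np.kron(P, np.eye(k))
    return A / A.sum(axis=1, keepdims=True)

def profile_loglik(rho, y, X, W):
    """SAR loglikelihood with beta and sigma^2 maximized out for fixed rho."""
    n = len(y)
    A = np.eye(n) - rho * W
    Ay, AX = A @ y, A @ X
    beta = np.linalg.lstsq(AX, Ay, rcond=None)[0]
    resid = Ay - AX @ beta
    sigma2 = resid @ resid / n
    _, logdet = np.linalg.slogdet(A)  # log|det(I - rho W)|
    return -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1.0) + logdet

# Simulate from the SAR model with assumed true values.
rng = np.random.default_rng(1)
k = 15
n = k * k
W = rook_weights(k)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, rho_true = np.array([1.0, 2.0]), 0.7
delta = np.linalg.solve(np.eye(n) - rho_true * W, rng.normal(size=n))
y = X @ beta_true + delta

grid = np.linspace(-0.95, 0.95, 39)
values = np.array([profile_loglik(r, y, X, W) for r in grid])
rho_hat = grid[np.argmax(values)]
print(rho_hat)
```

For a row-standardized $W$, restricting $\rho$ to $(-1, 1)$ keeps $I - \rho W$ nonsingular, which is why the grid stays inside that interval.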

Spatial Autoregressive Models There are a few issues. If $n$ is large, then computing $\det(I - \rho W)$ becomes prohibitively expensive. Some approximate methods have been proposed. The model can be used to predict the response variable based on its neighbors.

Geographical Weighted Regression (GWR) Suppose the regression model is $y(s) = x^T(s)\beta(s) + \epsilon(s)$, where $(y(s), x(s))$ is the vector of the observed response and independent variables at location $s$. Then, given observations at locations $s_1, \dots, s_m$, one can use a weighted least squares method to estimate $\hat{\beta}(s)$ by minimizing
$$Q_s(\beta(s)) = \sum_{i=1}^{m} w_i(s)\,[y(s_i) - x^T(s_i)\beta(s)]^2,$$
where $w_i(s)$ is the weight function. The weight function is often determined by a kernel function, which gives more weight to locations close to $s$.
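The local weighted least squares step can be sketched as follows. The Gaussian kernel, the fixed bandwidth, and the simulated data with a slope that increases from west to east are assumptions chosen for illustration.

```python
import numpy as np

def gwr_fit(s0, coords, X, y, bandwidth):
    """Estimate beta(s0) by kernel-weighted least squares."""
    d2 = np.sum((coords - s0) ** 2, axis=1)  # squared distances to s0
    w = np.exp(-0.5 * d2 / bandwidth ** 2)   # Gaussian kernel weights w_i(s0)
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

# Hypothetical data on the unit square with a spatially varying slope.
rng = np.random.default_rng(0)
m = 400
coords = rng.uniform(size=(m, 2))
x = rng.normal(size=m)
X = np.column_stack([np.ones(m), x])
slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + slope * x + rng.normal(scale=0.1, size=m)

beta_left = gwr_fit(np.array([0.1, 0.5]), coords, X, y, bandwidth=0.15)
beta_right = gwr_fit(np.array([0.9, 0.5]), coords, X, y, bandwidth=0.15)
print(beta_left, beta_right)
```

Each location requires only one small linear solve, and the bandwidth controls the trade-off between local detail and variance of the estimate.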

Geographical Weighted Regression The GWR has been extensively used in applications (e.g., housing prices). This method can be easily extended to generalized linear models with local smoothness parameters. The computation is extremely fast. It can also be used for prediction.

An Example for Cluster Detection We collected infant mortality counts ($y_i$) and total numbers of infant births at the county level in Jiangxi Province, China, in 2002. There were 99 counties ($m = 99$) with an average rate of 41.8 per 1,000. We then fit a Poisson cluster detection model. We also considered a quasi-Poisson cluster detection model.

An Example for Cluster Detection Figure: Infant Mortality Rate of Jiangxi Province in China, 2002

An Example for Cluster Detection Simulation: We inserted a cluster ($C_0$) at the center of the province such that $E(Y_i) = 0.001 n_i$ if $i \notin C_0$ and $E(Y_i) = 0.001(1 + \delta) n_i$ if $i \in C_0$, where $n_i$ is the true at-risk population. We increased $\delta$ from 0 to 2, so that $\delta = 0$ indicated that there were no clusters. We also chose a dispersion parameter $\phi$ from 1 to 2. If $\phi = 1$, then there was no overdispersion effect. We used the Gamma distribution to generate the dispersion effect. We can use the Poisson likelihood or the negative binomial likelihood, where $\phi$ must be estimated if the negative binomial distribution is used.
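One way to generate such counts is a Gamma-Poisson mixture, sketched below. The slides do not give the Gamma parameterization; the choice here (Gamma mean $\mu_i$ and variance $\mu_i(\phi - 1)$, so that $V(Y_i) = \phi\, E(Y_i)$) is an assumption, as are the population sizes and the cluster membership in the example.

```python
import numpy as np

def simulate_counts(n, in_cluster, delta, phi, rng, base_rate=0.001):
    """Simulate counts with mean base_rate * n (times 1 + delta in the cluster).

    For phi > 1 the Poisson mean is drawn from a Gamma distribution with
    mean mu and variance mu * (phi - 1), giving Var(Y) = phi * E(Y).
    """
    mu = base_rate * n * np.where(in_cluster, 1.0 + delta, 1.0)
    if phi <= 1.0:
        return rng.poisson(mu)
    lam = rng.gamma(shape=mu / (phi - 1.0), scale=phi - 1.0)
    return rng.poisson(lam)

# Hypothetical setup: 99 units, equal populations, first 5 units form C_0.
rng = np.random.default_rng(0)
m = 99
n = np.full(m, 50000.0)
in_cluster = np.zeros(m, dtype=bool)
in_cluster[:5] = True
y = simulate_counts(n, in_cluster, delta=1.0, phi=1.5, rng=rng)
print(y[:10])
```

Setting `delta=0.0` gives data under the null hypothesis for estimating the type I error probability, while `delta > 0` gives data for estimating power.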

An Example for Cluster Detection We considered only maximum likelihood estimation in the Poisson model. We considered both maximum likelihood estimation (MLE) and moment estimation (ME) in the negative binomial model. We computed the type I error probability and the power functions based on 1000 simulation replications. The $p$-value was computed by a bootstrap method.

An Example for Cluster Detection Figure: Simulated Power Functions as a Function of $\delta$ for Selected $\phi$. [Four panels: $\phi = 1$, $\phi = 1.25$, $\phi = 1.5$, $\phi = 2$. Horizontal axis: $\delta$ (0.0 to 1.6). Vertical axis: rejection rate (0.0 to 1.0). Curves: Poisson, MLE, ME.]

An Example for Cluster Detection Application: We used the observed $(y_i, n_i)$ and fitted both the Poisson and the negative binomial models. We assumed $E(y_i/n_i) = \theta_i$ in both models. There were many clusters in the Poisson model but only two in the negative binomial model.

An Example for Cluster Detection Figure: Clusters of Infant Mortality of Jiangxi Province in China, 2002

Linkage between Point Process and Lattice Data The problem is important in disease studies because counts are essentially aggregated over a spatial or spatiotemporal point process. Concepts of point processes (e.g., K-functions, stationarity) have not been used in lattice data. Tests for stationarity or proportionality may be modified for cluster detection. Asymptotics may also be derived.

Consideration of Asymptotics Asymptotic results for lattice data are rare. We may consider increasing domain asymptotics and fixed domain asymptotics. However, neither has been investigated.