On dealing with spatially correlated residuals in remote sensing and GIS

Similar documents
Spatial Data Mining. Regression and Classification Techniques

Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields

Influence of parameter estimation uncertainty in Kriging: Part 2 Test and case study applications

BAYESIAN MODEL FOR SPATIAL DEPENDANCE AND PREDICTION OF TUBERCULOSIS

Spatial analysis of a designed experiment. Uniformity trials. Blocking

Introduction. Semivariogram Cloud

Beta-Binomial Kriging: An Improved Model for Spatial Rates

Kriging Luc Anselin, All Rights Reserved

Chapter 4 - Fundamentals of spatial processes Lecture notes

Index. Geostatistics for Environmental Scientists, 2nd Edition R. Webster and M. A. Oliver 2007 John Wiley & Sons, Ltd. ISBN:

Covariance function estimation in Gaussian process regression

Optimal Interpolation

Economic modelling and forecasting

Spatial Backfitting of Roller Measurement Values from a Florida Test Bed

A robust statistically based approach to estimating the probability of contamination occurring between sampling locations

7 Geostatistics. Figure 7.1 Focus of geostatistics

Estimating variograms of soil properties by the method-ofmoments and maximum likelihood

Geostatistics: Kriging

Statistics: A review. Why statistics?

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

Mixed-Models. version 30 October 2011

Statistics 910, #5 1. Regression Methods

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

VADOSE ZONE RELATED variables often exhibit some

Ordinary Least Squares Regression

PRODUCING PROBABILITY MAPS TO ASSESS RISK OF EXCEEDING CRITICAL THRESHOLD VALUE OF SOIL EC USING GEOSTATISTICAL APPROACH

ENGRG Introduction to GIS

Chapter 4 - Fundamentals of spatial processes Lecture notes

1. The OLS Estimator. 1.1 Population model and notation

Lecture 14 Simple Linear Regression

Lecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011

Introduction to Geostatistics

Fluvial Variography: Characterizing Spatial Dependence on Stream Networks. Dale Zimmerman University of Iowa (joint work with Jay Ver Hoef, NOAA)

Propagation of Errors in Spatial Analysis

Point-Referenced Data Models

Lecture 5 Geostatistics

Non-Ergodic Probabilistic Seismic Hazard Analyses

Diagnostics and Remedial Measures

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

11/8/2018. Spatial Interpolation & Geostatistics. Kriging Step 1

Econometrics Summary Algebraic and Statistical Preliminaries

A Non-parametric bootstrap for multilevel models

Soil Moisture Modeling using Geostatistical Techniques at the O Neal Ecological Reserve, Idaho

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

Spatial Inference of Nitrate Concentrations in Groundwater

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 17th Class 7/1/10

Spatial Interpolation & Geostatistics

An anisotropic Matérn spatial covariance model: REML estimation and properties

Types of spatial data. The Nature of Geographic Data. Types of spatial data. Spatial Autocorrelation. Continuous spatial data: geostatistics

Inference Methods for the Conditional Logistic Regression Model with Longitudinal Data Arising from Animal Habitat Selection Studies

Bayesian Transgaussian Kriging

Transiogram: A spatial relationship measure for categorical data

Linear Regression and Its Applications

Multivariate Geostatistics

WEIGHTED LEAST SQUARES. Model Assumptions for Weighted Least Squares: Recall: We can fit least squares estimates just assuming a linear mean function.

The Model Building Process Part I: Checking Model Assumptions Best Practice

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

Introduction to Linear regression analysis. Part 2. Model comparisons

Quantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Applied Statistics and Econometrics

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

Geostatistics in Hydrology: Kriging interpolation

Econ 582 Fixed Effects Estimation of Panel Data

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models

Plausible Values for Latent Variables Using Mplus

A Short Introduction to Curve Fitting and Regression by Brad Morantz

Mapping Precipitation in Switzerland with Ordinary and Indicator Kriging

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13)

Best Linear Unbiased Prediction: an Illustration Based on, but Not Limited to, Shelf Life Estimation

Statistics for Managers using Microsoft Excel 6 th Edition

REML Estimation and Linear Mixed Models 4. Geostatistics and linear mixed models for spatial data

F9 F10: Autocorrelation

Types of Spatial Data

Spatial Regression. 3. Review - OLS and 2SLS. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

THE EFFECTS OF MULTICOLLINEARITY IN ORDINARY LEAST SQUARES (OLS) ESTIMATION

Chapter 11 Building the Regression Model II:

Statistics for analyzing and modeling precipitation isotope ratios in IsoMAP

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

Math 423/533: The Main Theoretical Topics

Some general observations.

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

Steps in Regression Analysis

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Chapter 7: Simple linear regression

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Spatial inference. Spatial inference. Accounting for spatial correlation. Multivariate normal distributions

Robust Inference for Regression with Spatially Correlated Errors

A test for improved forecasting performance at higher lead times

Chapter 12 REML and ML Estimation

Model Selection for Geostatistical Models

Heteroskedasticity. y i = β 0 + β 1 x 1i + β 2 x 2i β k x ki + e i. where E(e i. ) σ 2, non-constant variance.

Introduction to Simple Linear Regression

Introduction to General and Generalized Linear Models

USEFUL TRANSFORMATIONS

Transcription:

On dealing with spatially correlated residuals in remote sensing and GIS Nicholas A. S. Hamm 1, Peter M. Atkinson and Edward J. Milton 3 School of Geography University of Southampton Southampton SO17 3AT United Kingdom Tel.: +44 7887 578 44; Fax: +44 3 8059 395 nick@hamm.org 1 ; pma@soton.ac.uk ; ejm@soton.ac.uk 3 Abstract Key assumptions in standard regression models are that the residuals are independent and identically distributed. These assumptions are often not met in practice. In particular, the issue of spatial correlation amongst the residuals has had limited attention in the remote sensing and GIS literature. This paper discusses approaches that use familiar authorized models to specify the covariance amongst the residuals. The model parameters were estimated using a maximum likelihood approach (ML or REML). The accuracy of the approach was investigated using simulated data and found to give accurate estimates of the error variance (σ ), although estimates of and range (a) and nugget component (s) were less accurate. The approach was found to be robust to choice of covariance function (exponential or spherical). The approach was then extended to deal with heteroskedastic residuals by incorporating a weighting component analogous to that used in weighted least squares. This was also shown to yield accurate results for σ, a and s. Finally, possibilities for extending these approaches to prediction are considered. Keywords: correlated residuals, regression, GIS, remote sensing, variogram 1 Introduction Regression models are often used in remote sensing and geographic information systems (GIS). For example, we may have collected field data on biomass at a limited number of locations and want to predict this variable over a wider area by using remotely sensed data as a predictor variable. Another example might be the Empirical Line Method (ELM) for atmospheric correction, which is based on developing a regression relationship between fieldbased methods of reflectance and remotely sensed measurements of radiance. However, developing a regression model in this context is complex since the residuals are likely to be spatially correlated. This violates standard ordinary least squares (OLS) regression assumptions, which require that the residuals from regression models be uncorrelated (Neter et al., 1996). This can lead to inaccurate estimation of the regression parameters. In particular, the residual variance may be underestimated. This may have consequences for tests of the validity of regression parameters. It may also lead to underestimation of the uncertainty in predicting the variable at an unmeasured location. Furthermore, it does not provide a framework for predicting realisations of surfaces, since it does not allow for spatial autocorrelation. One potential way to deal with this issue is to use geostatistics. In particular, the technique of cokriging allows us to make predictions of a sparsely sampled variable (e.g., biomass) at all locations, given a densely (or exhaustively) sampled second variable. However, co-kriging is not always appropriate, since there may be insufficient data to model the spatial covariances or 603

cross-covariance functions or we may wish to predict at locations which lie outside the bounds of the area that has been sampled (i.e., extrapolation rather than interpolation) (Dungan 1998). Regression modelling does allow for incorporation of autocorrelation in the residuals, through specification of a covariance or correlation matrix (Lark and Cullis, 004). However, this covariance matrix is, typically, unknown and the challenge lies in developing a methodology for specification of this covariance matrix. Techniques have been developed based on maximum likelihood (ML) or residual maximum likelihood (REML) (Lark, 000; Lark and Cullis, 004) which incorporate estimation of the parameters of the variogram of the residuals into the modelling process. The variogram is used to specify the elements of the covariance matrix. The approach is iterative and the likelihood is maximised numerically. Lark and colleagues have published demonstrations of this technique and have tested the approach on simulated image data and on real image and field data, with favourable results. However, these demonstrations were restricted to a single land cover type and, hence, assumed that the variance of the residuals was constant across the image. Furthermore, they did not apply the model to predicting a new variable or a new surface. The objectives of this paper were (1) to test the variogram-based approach in a broader range of circumstances to those examined by Lark and colleagues; () to extend the research by examining a situation where the residual variance was not uniform across the image and (3) to use the model for prediction across the image. For () the variance was allowed to vary by applying different weights to different land-cover classes, where larger weights are applied to areas with lower residual variances. These three elements were examined using simulated data. Background The standard linear regression model can be expressed as follows y = β X + ε (1) Where y is a vector of response variables, X is a matrix of predictor variables, β is a vector of model coefficients (or parameters) and ε is a vector of model residuals. Importantly, in the standard case, the model residuals are assumed to Normal and independent and identically distributed with zero mean and variance σ, i.e.: iid ε i ~ N(0, σ ) () The parameters of the regression model (β and σ ) are usually estimated using ordinary least squares (OLS). However, the parameters can also be estimated using maximum likelihood (ML) or within the Bayesian paradigm. OLS, yield estimators the following estimators for the parameters: ˆ T β = ( X X ) X y (3) 1 ˆ ˆ T σ = ( y βx ) ( y ˆ βx ) (4) n p Cˆ ˆ ( β ) = σ ( X X ) (5) 604

Standard linear regression models are widely used in GIS and remote sensing (Dungan, 1998). However, careful consideration needs to be given to the viability of the assumptions made. The assumption of Normality of often viable and was maintained for the research conducted for this paper. Where the residuals are not identically distributed, this is manifested through heteroskedasticity. In order to relax the iid assumptions, equation () can be re-written as: where Σ is a symmetric covariance matrix. ε ~ MVN(0, Σ) (6) Where the residuals have been shown to be heteroskedastic, but autocorrelation is not a problem, Σ can be expressed as a diagnol matrix where diag( Σ ) = ( σ, σ, K, σ ) = σ (1/ w,1/ w, K,1/ w ) (7) 1 n 1 where the ws are known as the weights. These weights may have been established a priori or may be predicted from the residuals from OLS regression a procedure known as iteratively re-weighted least squares. Having obtained the weights, the parameters may be estimated using weighted least squares (WLS). Where the residuals cannot be assumed to be uncorrelated, it is necessry to establish the offdiagnol elements of Σ. This can be obtained by specifying an authorised covariance function and estimating its parameters within a maximum likelikhood framework (Cook and Pocock, 1983; Lark, 000; Lark and Cullis, 004). This proceeds by numerical maximisation of the log-likelihood function: 1 T ( z+ log Σ + ( y Xβ ) Σ 1 ( y Xβ )) (8) where z is a constant. Early approaches to this (Cook and Pockock, 1983; Lark, 000) proceeded by setting Σ = σ A. In this approach, A is a correlation matrix, with the off diagnol elements given given by a ij =f(a, h ij ) where h ij is the lag or distance between i and j and f( ) is an authorised covariance function. Typically the exponential or spherical covariance function is adopted, requiring the estimation of the range parameter, a. The regression parameters, for a given value of a, are given as: ˆ 1 β = ( X A X ) X A y (9) 1 σ = ( y ˆ βx ) n ˆ Cˆ ( Equation (8) then reduces to (see Lark, 000): ) ( A ( y ˆ βx ) ) n (10) 1 β = σ X A X (11) log A nlogσ (1) This approach was initially proposed in the mid-1980s by Cook and Pocock (1983) and Mardia and Marshall (1984), which have been widely cited in the statistical and epidemiological literature. However, there has only been limited uptake in remote sensing and other 605

environmental fields. Lark (000) demonstrated the above approach for simulated imagery and found that it gave more accurate estimates of σ than OLS, with OLS tending to underestimate σ. This has important consequences for uncertainty analysis and for tests on the regression parameters. However, Lark did not examine the accuracy of the estimates of a, which is important for prediction. Furthermore, this model does not specify a nugget component. The above approach was revised by Lark and Cullis (004), who specified: Σ = σ sf( h, a), i j ij = σ, i = j ij (13) where f(h ij, a) is a correlation function, h ij is the lag, or distance of separation, between two measurements and a is the range parameter. This parameter s is defined as: c s = c + c 0 (14) which allows incorporation of a nugget effect, since c is the partial sill and c 0 is the nugget component. For conveniance, θ denotes the vector of variance paramters, θ = (a, s, σ ). The point estimate and covariance function of β are then given as: ˆ 1 β = ( X Σ X ) X Σ y (15) Cˆ ( 1 β ) = ( X Σ X ) (16) The parameter vector, θ, can then be estimated by maximising the log-likelihood function given in Equation (8). Lark and Cullis noted that the ML estimates of θ may be biased, because they depend on β. They advocate use of residual (or restricted) maximum likelihood (REML) which, whereby the conditional likelihood is maximised: ( log Σ + log X Σ X y Σ ( I Q) y) 1 L( θ ˆ, β β ) = z (17) where z is a constant and Q=X(X T Σ -1 X) -1 X T Σ -1. However, it may be noted that, although REML is widely used in geostatistics, there is not a consensus that it should be preferred to ML (Diggle et al., 003). Lark and Cullis (004) demonstrated the use of the REML approach for data gathered in the Swiss Jura, but do not provide an empirical evaluation against simulated data in a remote sensing context. Furthermore, they do not provide an empirical comparison of ML vs REML. The research conducted for this paper began by evaluating the ML and REML approaches using simulated data. This follows the approach of Lark (000). This was then extended to deal with the case where σ is allowed to vary accross the image. The development of the above approach for prediction was then considered. 606

When σ is allowed to vary, we can write σ i = σ (1/w i ) (see Equation (7)). Hence: σ = ρ σ σ = ρ σ 1/ w 1/ w (18) ij ij i j ij i j where σ ij is the covariance between i and j, ρ ij is the correlation between i and j and σ ij and σ ij are the standard deviations at i and j respectively. Hence, Equation (13) can be re-written as: Σ = σ ij i j ij = σ / wi, i = 1/ w 1/ w sf( h, a), i j j (19) 3 Method The utility of the ML and REML approaches was investigated using simulated imagery, following the approach of Lark (000). The images were created on a uniform square grid of 49 rows by 49 columns. First, a random normal variable, X was created on the square grid. Next, an image of errors (ε) were created, with variance (σ ) of 1, partitioned into a nugget term of 0.5 and a partial sill of 0.5 (i.e., s = 0.5). Finally, the dependent variable, y, was generated where β = (1, 1.) T : y = 1 + 1. + ε (0) i X i For the research conducted in this paper, the errors were simulated using a spherical variogram. When estimating the model parameters (whether by ML or REML), it is necessary to choose an authorized model for f(h ij, a) (Equation (13)). The smooth nature of the exponential model makes it well suited to ML techniques (Diggle et. al., 003; Lark and Cullis, 004). However, the spherical model is so widely used in geostatistics that it is reasonable to examine it in this context. A choice between the spherical and exponential model can then be made on the basis of the maximized likelihood. It may be noted that, in any given application, the spherical or exponential model may not be reflect the underlying data. The question is, then, whether the model is robust to choice of variogram models. Six images of ε (and six corresponding images of y) were created for a = (1,,...,6). For each image two sub-samples of 100 data-points were obtained. The first sub-sample was gathered using simple random sampling (SRS) and second using structured sampling (SS) on a 10 10 grid. This was repeated 100 times, without replacement. These repeat samples allowed investigation of the accuracy of the parameter estimates. For the case where σ is allowed to vary accross the image, the grid is divided into three strips, each 83 columns wide. Each strip in the image had an error variance, σ i, of 1, and 4 respectively. Hence, σ i = σ /w where w=(1, 0.5, 0.5) T. 100 replicate samples were still obtained. However, this time the sample consisted of 300 data-points, with one 10 10 sample grid (or 100 random data points) lying in each strip. Having obtained the simulated images of y and X parameter estimation was performed on each sample using OLS, ML and REML. This was repeated 100 times to establish the accuracy of the parameter estimation. Where σ is allowed to vary accross the known weights were inserted using Equation (19) and WLS was applied rather than OLS. i 607

4 Results 4.1 Error variance constant Figure 1 shows the estimated value of σ plotted against the range of the error variable. These show the average estimate of σ (for the 100 samples), together with 95 % confidence intervals on the mean as well as the first and third quartiles. Figure 1 (a, c, d) shows the estimated value of σ obtained using random sampling. It is clear that, under simple random sampling ML or REML offer no advantage over OLS for providing unbiased estimates of σ. Under systematic sampling Figure 1 (b) shows the estimates of σ are negatively biased when estimated using OLS. This is particularly clear when the range of the underlying error variable was large (a = 4, 5 6), compared to the size of area sampled (10 10), where the estimates of σ were significantly biased. It should be noted that, when a is small many of the residuals will be uncorrelated hence the reduction in the number of degrees of freedom is small by comparison to the case where a is relatively large. The ML and REML models were then implemented by adopting a spherical model for Equation (13). The resulting estimates of σ are shown in Figure 1 (d and f). It is interesting to note that ML (d) does not seem to offer a substantial improvement over OLS. However, REML (f) does substantially reduce or eliminate the bias. When the spatial covariance is modelled using an exponential model Figure 1 (g and h) a similar pattern is observed to that found for the spherical model. This shows that the estimation of σ is robust to choice of variogram model. Boxplots showing the estimates of a, are shown in Figure 1. As expected, under random sampling the estimates were extremely variable and inaccurate (not shown). First the spherical model was considered (Figure (a and b). Under ML the estimates of â tend to increase as the true value of a increases, but were slightly biased at relatively high values of a. The REML estimates are simalar to the ML ones. The results for the exponential model are shown in Figure (c and d). Note that the range given is the effective range and should not be expected to correspond exactly with the range of the error variable for the underlying surface (which was modelled using a spherical variogram). For both ML and REML the estimates of the range approximate to the range of the underlying variogram. However, the variability of the estimated values is greater than where the spherical model is adopted. Examination of boxplots and histograms for ŝ (not shown) showed that neither the ML or REML was able to accurately estimate s and only an approximate value was found. This was exacerbated when the the exponential model was adopted. As discused in the previous section, a problem with assuming the regression residuals are uncorrleated is that it may lead to underestimation of σ. This has implications for estimation of the standard errors of β and for estimating confidence and prediction intervals. However, Equation (16) shows that these standard errors are also dependent on the off diagnol elements of Σ. Hence, both ML and REML led to an increase in the standard errors of β, even though adopting the ML model led to significant bias in estimates of σ. Overall the results presented above provide empirical evidence for the utility of the ML and REML models. However, the reader should interpret these results with caution. In particular there is a good deal of variability in the estimates of a and s, implying the data do not contain sufficient informaiton to make accurate estimates of these parameters. On a related note, the 608

likelihood surfaces appear to be very flat. This can lead to problems with maximising the likelihood function (Cook and Pocock, 1983) and increased uncertainty in the parameter estimates. In this respect experience showed that particular care needed to be taken when adopting the REML model. Hence, even though the above results imply that the REML model should be preferred over the ML model, caution should be exercised here. (a)random Sample, OLS (b)structured Sample, OLS 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 (c) Random Sample, ML (Spherical) (d) Structured Sample, ML (Spherical) 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 (e) Random Sample, REML (Spherical) (f) Structured Sample, REML (Spherical) 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 (g) Structured Sample, ML (Exponential) (h) Structured Sample, REML (Exponential 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 Figure 1 Estimated value of σ for a given range of the error variable. The points show the mean estimate of σ. Solid lines show the 95 % confidence intervals on the mean and dashed lines quartiles. 609

(a) Structured Sample, ML (Spherical) (b) Structured Sample, REML (Spherical) 0 4 6 8 10 1 0 4 6 8 10 1 1 3 4 5 6 1 3 4 5 6 (c) Structured Sample, ML (Exponential) (d) Structured Sample, REML (Exponential) 0 4 6 8 10 1 0 4 6 8 10 1 1 3 4 5 6 1 3 4 5 6 Figure Boxplots showing the estimated value of a for a given range of the error variable. For the exponential model, the range shown is the effective range. 4. Error variance varies In this case the error variance was allowed to vary accross the image. For this experiment, the known weights were applied, rather than using estimated weights. Using estimated weights would add an extra layer of uncertainty and exploration of that is deferred. Figure 3 shows the estimated value of σ for OLS, ML and REML when the spherical model is adopted. This showed that, under WLS, estimates of σ tended to be slightly biased. However, both ML and REML lead to unbiased estimators of σ. Figure 4 shows the estimated value of a. It is interesting to note that, under random sampling both ML and REML yield approximate estimates for a. This implies that the larger sample size provides a sufficiently large sample of nearby data-points to allow an approximate estimate of a. Importantly, this also shows that the model residuals were correlated and the fact that random sampling was employed does not mean that this assumption can be ignored. For ML and REML, the estimates of a were shown to be highly accurate. The estimates of s are not discussed in detail. However, as with a both ML and REML provide more accurate estimates of s than those provided in the previous section. The results presented above show incorporation of a weighting scheme permits accurate estimation of the parameters of Equation (13). Incorporation of the weights is, undoubtedly, important. However, it should also be realised that the sample size was much larger than for the experiment discussed in the previous section. Hence, the increased accuracy of the parameter estimates may also be due to the larger sample size. 610

Finally, it may be noted that adoption of ML or REML leads to an increase in the standard error of β 1 (the intercept in Equation (0)) but a decrease in the standard error of β (the gradient in Equation (0)). This runs counter to what is expected (Lark and Cullis, 005) and requireds further consideration. (a) Random Sample, WLS (b) Structured Sample, WLS 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 (c) Structured Sample, ML (Spherical) (d) Structured Sample, REML (Spherical) 0.8 0.9 1.0 1.1 1. 0.8 0.9 1.0 1.1 1. 1 3 4 5 6 1 3 4 5 6 Figure 3 Estimated value of σ for a given range of the error variable for the case where σ i =σ /w i. Solid lines show the 95 % confidence intervals on the mean and dashed lines the upper and lower quartile. 611

(a) Random Sample, ML (Spherical) (a) Structured Sample, ML (Spherical) 0 4 6 8 10 0 4 6 8 10 1 3 4 5 6 1 3 4 5 6 (c) Random Sample, REML (Spherical) (d) Structured Sample, REML (Spherical) 0 4 6 8 10 0 4 6 8 10 1 3 4 5 6 1 3 4 5 6 Figure 4 Boxplots showing the estimated value of a for a given range of the error variable. 5 Discussion and Conclusions The results presented above have demonstrated the utility of the ML and REML approaches for addressing the issue of spatially correlated residuals in regression. They were also shown to be applicable under conditions of heteroskedasticity. Furthermore, they were shown to be robust to choosing the exponential model rather than the spherical model. This suggests that a reasonable variogram model could be chosen for the residuals, where the correct one is not known (Lark, 000). The ML and REML approaches require more detailed examination under a wider range of conditions. In particular, future research should consider sample size, choice of variogram model and use of estimated rather than known weights. Another objective of this research was to compare the results from the ML and REML models. Theoretical considerations suggest that the REML model should be more appropriate (Lark and Cullis, 005), although this has not always been bourne out in practice (Diggle et al., 003). Experience of conducting this research found the REML approach to be more susceptable to problems with numerical optimisation. Finally, an objective of this research was to consider approaches for simulating a realisation of a prediction from the regression model. This has not been addressed empirically, but is discussed briefly here. Conventional regression techniques allow specification of prediction intervals for a single point (Neter et al., 1996). However, these do not translate well to multiple points (or a surface). Conditional simulation can be used to simulate an error surface within a geostatistical framework (Dungan, 1998) and this possibility will be explored in the future. However, it should be noted that conventional prediction intervals incorporate two components: error due to uncertainty in β (i.e., location of the regression line) and error due to σ (the spread around the regression line). Conditional simulation only allows for the latter, so attention will need to be given to incorporating uncertainty in the former. 61

References Cook, D. G. and Pocock, S. J., 1983, Multiple-Regression In Geographical Mortality Studies, With Allowance For Spatially Correlated Errors, Biometrics, 39(), pp. 361-371. Diggle, P. J., Ribeiro Jr., P. J. and Christensen, O. F., 003, An introduction to model--based geostatistics. In Spatial Statistics and Computational Methods. J. Moller (Ed.), pp. 43-86, London: Springer. Dungan, J., 1998, Spatial prediction of vegetation quantities using ground and image data. International Journal of Remote Sensing, 19(), pp.67-85. Lark, R. M., 000, Regression analysis with spatially autocorrelated error: simulation studies and application to mapping of soil organic matter. International Journal of Geographical Information Science, 14(3), pp. 47-64. Lark, R. M. and Cullis, B. R., 004, Model-based analysis using REML for inference from systematically sampled data on soil. European Journal Of Soil Science, 55(4), pp. 799-813. Mardia, K. V. & Marshall, R. J., 1984, Maximum-Likelihood Estimation Of Models For Residual Covariance In Spatial Regression, Biometrika, 71(1), pp. 135-146. Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W., 1996, Applied Linear Statistical Models (Fourth ed.). London: Irwin. 613