Spatio-Temporal Geostatistical Models, with an Application in Fish Stock


Ioannis Elmatzoglou

Submitted for the degree of Master in Statistics at Lancaster University, September 2006.

Abstract

Geostatistics is based on the adoption of a probabilistic framework and aims at describing the behavior of any kind of continuous and quantifiable spatial phenomenon. However, many such phenomena are characterized not only by spatial but also by temporal variability. Although geostatistics was initially developed for the needs of the mining industry, it is now used in many other areas, including hydrological, environmental and meteorological applications. Although spatio-temporal analysis is in principle a direct extension of the geostatistical philosophy of analysis in space, in practice there are many obstacles in the path to its full development.

Acknowledgements

Many thanks to my supervisor Paulo J. Ribeiro for his help, his guidance and his general contribution to my knowledge. Thanks to Peter Diggle for the time he spent and the great interest he showed in improving my project. The quality and the organization of the data analysis would be much different without his suggestions. It is in general a great honor for me to work with people like him.

I would also like to thank: Martin Schlather, for the help he provided with the RandomFields software. The Department of Statistics of Lancaster University, for the funding covering my tickets to Brazil and the studentship offered to me through the academic year. Loukia Meligkotsidou, for her willingness to help at a really very bad moment, even though she didn't. My father Antonios Elmatzoglou, for his support. And last, many thanks to George Tsiotas, Department of Economics, University of Crete, as without him I wouldn't be here doing anything.

Contents

1 An Introduction to Geostatistics
  From Classical Statistics to Geostatistics
  A Motivating Example
  Geostatistics
  Approaches in Geostatistical Analysis
  Geostatistics in Space and Time
  What follows
2 Geostatistical Analysis of Spatial Data
  Introduction
  Basics
  Special Characterizations of Random Fields
  Stationary Random Fields
  Non-Stationary Random Fields
  Anisotropy
  Gaussian Random Fields (GRF)
  Modeling the Dependency Structure
  Properties of Second-Order Covariance Functions
  Covariance Models
  Spectral Representation
  Nesting of Covariance Models
  Parameter Estimation and Predictions
  Estimation with Variograms
  Maximum Likelihood Estimation
  Predictions and Kriging
3 Models for Spatio-Temporal Geostatistical Data
  Introducing the New Dimension
  Different Approaches in the Spatio-Temporal Analysis
  Nesting of Space-Time Covariance Functions
  Separable Space-Time Models
  Some Examples of Separable Models
  Non-Separable Models
  Stationary Space-Time Models
  Anisotropy? Fully-Symmetry
  Not Fully-Symmetric Space-Time Covariance Models
  3.9 Simulations of Simple Spatio-Temporal Gaussian Random Fields with Different Dependency Features
  Simulating a Random Field with a Non-Separable and Fully Symmetric Covariance Function
  Simulating a Random Field with a Non-Separable and Not Fully-Symmetric Covariance Function
  Simulating two Random Fields with a Separable Dependency Structure
  Realism vs. Convenience
  The Cressie-Huang Approach
  Gneiting's Family of Non-Separable Models
4 Case Study: Spatio-Temporal Modeling of the Portuguese Fish Stocks
  Introduction
  Scientific Interest and Data Description
  The Need for Joint Space-Time Analysis
  Methods
  Exploratory Data Analysis and Assumptions
  Analysis
  Comparison between the Purely Spatial Models Assuming Constant and Non-Constant Properties Over Time
  Purely Temporal Analysis
  Building a Space-Time Covariance Model and Testing its Superiority over the Purely Spatial one
  Construction of a Gneiting's type Non-Separable Covariance Model and Testing its Appropriateness against the Separable one
  Time-Forward Kriging Assessment with the two Models
  Assessment and Conclusions
5 Concluding Remarks and Further Studies
6 Appendices
  Appendix A: Simulations from the Estimated Model and Variability of Estimations (see Appendix B for R codes)
  Appendix B: R Codes
  Simulating from the Estimated Model and Computation of the P.Likelihood of the Separability Parameter

September 27, 2006

Chapter 1 An Introduction to Geostatistics

1.1 From Classical Statistics to Geostatistics

A Motivating Example

Suppose that our interest lies in predicting the average temperature V at location A one day ahead. A statistician could treat all the average daily temperatures $V_t$ at A as random and assume that their values are closely related to each other. A solution to our problem can then be given by considering all the past daily realized values of V, $(v_{t_1}, v_{t_2}, \ldots, v_{t_n})$, as an observed realization of a stochastic process with a particular dependency structure. By exploring, through our sample, the way that these variables are related, we are able to express an opinion about the future, yet unrealized, value $V_{t+1}$.

Suppose now that we are interested in the amount of underground oil deposits O at location B, in a geographical area where many oil wells are installed. Again, thinking statistically, we can express our ignorance about $O_B$ by treating its real value as a realization from a certain probability distribution. Unfortunately, we cannot use the same methodology. First of all, the expense of making excavations is far higher than that of measuring the temperature. Second, the amount of underground oil is going to be practically the same over time unless it is extracted. Thus, even if we could cheaply check the amount of oil at B, repeating the measurement would have no practical point, as the amount is constant over time. So we have no information for this spatial location coming from different time instances. Given this lack of samples in time, one could look for other kinds of information, this time coming not from time but from space. More specifically, we can rely on the information provided by the existing nearby oil wells.
As before, we can expect the information coming from the closest wells to be more valuable than that from wells far apart. This suggests working in the same kind of framework as in the temperature example. That is, we think of the underground oil deposit at all locations of the area as an infinite collection of dependent random variables $O_{s_i}$, indexed by location, and of our sample as one particular realization $o_{s_1}, o_{s_2}, \ldots, o_{s_n}$ at some specific points. By exploring the way that they are linked, we are able to express an opinion about the amount of the yet unrealized oil deposit at B.

The methodology in the two examples was quite similar, the only difference being the source of the sample. In the first case it came from time, while in the second it came from space. While the former is an example of time-series analysis, the latter is a typical example of a geostatistical application.

Geostatistics

Geostatistics can be roughly thought of as the spatial version of time series analysis, and it is one of the three main branches of spatial statistics. It usually refers to the case where data consist of a finite sample of measured values relating to an underlying spatially continuous phenomenon (Diggle & Ribeiro 2006). Examples are the temperature in a particular area, the concentration of a particular pollutant in the soil of a large geographical region, or even the wind velocity in the same location. The first law of geography says that "everything is related to everything else, but near things are more related than distant things" (Tobler, 1970, p. 236). This corresponds closely to our temperature example, as the real and unknown value of the temperature is more likely to be similar to its measurements one or two days ago than to those of one or two weeks before. In a similar manner, the information we get from the nearby wells is likely to be more useful than that coming from wells further apart.

Although statistics has a well-established methodology for describing the relationships between random variables, geostatistics developed independently. The term geostatistics was first introduced by Georges Matheron (1962), as a means of designating his own methodology of ore reserve evaluation. He also coined the term regionalized variable to designate a numerical function z(x) depending on a continuous space index x and combining high irregularity of detail with spatial correlation. Building on these terms, Chiles and Delfiner (1999) defined geostatistics as the application of probabilistic methods to regionalized variables, which is different from the vague usage of the word in the sense of statistics in the geosciences. However, the range of applicability of this methodology has not been restricted by the Greek prefix geo (γεω = earth, ground, soil, land), which emphasizes the spatial aspect of the problem. Applications have taken place in a wider class of environments, such as the subsurface, land, atmosphere or oceans. This implies that more sciences are involved with geostatistics than just the geosciences.

At this point it should be noted that geostatistics is just a complementary and biased tool for performing a spatial analysis. This becomes clearer with the following example. Suppose that a bank is robbed in a particular region, say A. The fact that this happened in A and not in B can be interpreted in various ways by different people. The interpretations of an economist, a psychologist, a criminologist, a sociologist and a policeman are certain to differ considerably from each other.
The same holds in the case of the geosciences. The fact that region A is richer in mineral resources than B can be interpreted differently by a geologist or a physicist. Here it should be emphasized that geostatistics does not aim at interpreting what has been observed; it focuses mostly on description. It aims to solve particular problems by capturing the main structural features of the data. Its essence is to recognize the inherent variability of natural spatial phenomena and the fragmentary character of the data, and to incorporate these notions in a model of stochastic nature. That means that it does not attempt any physical or genetic interpretation of the data (Chiles and Delfiner 1999), and knowledge of the subject matter does not have much impact on the analysis. The data therefore play a unique role in the analysis. In our oil example, the absence of other nearby wells would turn the problem into a purely geological one.

1.2 Approaches in Geostatistical Analysis

Traditional and Model Based Geostatistics

We mentioned earlier that geostatistics developed independently and only later adopted many statistical features. One consequence is that it still uses various non-formal and ad hoc methods of inference, such as fitting lines to variograms or "fit by eye" methods. Its main characteristic is the focus on the covariance structure of a given process, which is usually assumed to have a Gaussian distribution. Diggle, Tawn & Moyeed (1998) coined the phrase model-based geostatistics to describe an approach to geostatistical problems based on the application of formal statistical methods under an explicitly assumed stochastic model. In this approach the covariance structure is a consequence of the assumed model. In this project we follow the traditional approach.
Convolution Representation

A third representation of a spatial process can be obtained in terms of convolutions. This is a completely different approach, in which the spatial process is assumed to be constructed by integrating an unobserved, weighted white noise process, usually referred to as the excitation field. Such a representation has several advantages, among them an alternative way of deriving valid covariance functions. However, we do not pursue this approach in this work.

1.3 Geostatistics in Space and Time

Although oil deposits remain practically constant over time, when the interest lies in other kinds of processes, such as nitrate nitrogen ($\mathrm{NO_3}$) contamination in groundwater or ammonium nitrogen ($\mathrm{NH_4}$) content in soil, the quality of the analysis can be much enhanced by making more complex assumptions. In particular, models that take into account the fact that the process evolves not only in space but also in time very often prove to be more useful. One main reason is that past information can contribute not only to improving the spatial interpolations at the present time, but also to our ability to perform time-forward predictions at different spatial locations.

The modeling of spatio-temporal distributions resulting from dynamic processes evolving in both space and time is critical and has been increasingly used in many scientific and engineering fields, including environmental sciences, climate prediction, meteorology, hydrology and reservoir engineering. Just as in the purely spatial case, geostatistical spatio-temporal models provide a probabilistic framework for data analysis and prediction that builds on the joint spatial and temporal dependence between the observations. The simultaneous analysis of space and time is based on exactly the same philosophy as the analysis in space alone: time can be treated as an extra dimension. However, the fact that space and time are two completely different notions does not allow us to treat them in an exactly equivalent manner. For this reason spatio-temporal analysis and modeling has its own peculiarities and difficulties, which make it dependent on a number of further assumptions.

1.4 What follows

The fact that joint space and time analysis needs to be treated in its own particular way is the reason that we decided to split the analysis into two parts.
As the title of the project implies, our main concern is the modeling of spatial processes at different time instances. We therefore begin with a brief description of the most necessary elements and main assumptions relating to the modeling of purely spatial processes (Chapter 2). These elements are useful in the next part, where we focus on the additional characteristics that space-time modeling has to address (Chapter 3). In the last part of the project we apply the theory introduced earlier to a real problem (Chapter 4). To summarize, the work can be divided into three parts: two of theoretical analysis and one of application.

Chapter 2 Geostatistical Analysis of Spatial Data

2.1 Introduction

The main objective of this chapter is to provide a very short review of the main characteristics governing random fields in the purely spatial domain. This gives us the necessary ingredients for understanding the more complex assumptions that characterize spatio-temporal processes. The chapter begins with some very simple definitions, such as the moments, variance and covariance structure of random fields, and continues by focusing on some very common and convenient assumptions, such as stationarity and isotropy. The rest of the chapter is concerned with different ways of modeling the dependency structure and different strategies for inference. A significant amount of the material presented below is inspired by the descriptions of Le & Zidek (2006), Schabenberger & Gotway (2005) and Journel & Kyriakidis (1999).

2.2 Basics

A stochastic process is a family or collection of random variables, the members of which can be identified or located (indexed) according to some metric. Consider for convenience the first example of the previous chapter, where the measured value of the temperature V at each time point $t_1, \ldots, t_q$ was treated as a realization from a stochastic process that was considered to exist in time. While this kind of process is usually referred to as a time-series process, a spatial process is defined to be a collection of random variables that exist exclusively in the space domain. These variables are indexed by some set $D \subset \mathbb{R}^d$ containing spatial coordinates $s = (s_1, \ldots, s_d)$. In our second example, regarding the prediction of the underground oil deposits at a particular location, we had two spatial coordinates, i.e. $s = (x_1, x_2)$ and so $d = 2$, where D denotes the particular geographical sub-region of interest. We could also have taken into account the depth of our measurements and thus worked in three dimensions ($d = 3$).
In the case where $d \ge 1$, the spatial process is usually referred to as a random field. Each of the random components of the field, $Z(s)$, is fully characterized by its cumulative distribution function (cdf):

$$F(s; z) = \mathrm{Prob}\{Z(s) \le z\}, \quad \forall z \text{ and } s \in \mathbb{R}^d$$

In other words, the previous expression gives the probability that the variable Z at the location s in space is not greater than any given threshold z. Consider now the discretization of the d-dimensional spatial domain D into a set N of n points ($N \subset D$). The joint uncertainty about this set of n random variables is characterized by the joint n-variate cdf:

$$F(s_1, \ldots, s_n; z_1, \ldots, z_n) = \mathrm{Prob}\{Z(s_1) \le z_1, \ldots, Z(s_n) \le z_n\}, \quad s_i \in \mathbb{R}^d \qquad (2.1)$$

The random field is characterized by all these sets of n-dimensional distributions of random variables, spatially defined by every possible discretized subset N. A Gaussian Random Field (GRF) is defined to be the case where all these joint distributions are multivariate Gaussian. This always implies that the marginal distribution is Gaussian as well, while the converse does not necessarily hold.

At this point, we should emphasize that in practice we observe only one (and partial) realization of the random field. The statistical analysis is therefore based on this single realization, which is somewhat at odds with what one is used to in classical applications of statistics. While there one usually has an i.i.d. sample of n observations, here we have a sample of size one, considered to be just a collection of n georeferenced observations $\{z_{s_1}, \ldots, z_{s_n}\}$. Taking n simulations from a univariate distribution is quite different from taking one simulation from a multivariate one. This of course makes the inferential process quite difficult, but this issue will be discussed in more detail after giving some simple but useful definitions.

Moments

The kth order moment of the random field $Z(s)$ at any location $s \in \mathbb{R}^d$ is defined as:

$$E[Z(s)^k] = \int x^k \, dF_s(x),$$

provided this integral exists. Here $dF_s(x)$ denotes the differential element of probability allocated to x by the distribution $F_s$. The kth order moment exists provided that $E[|Z(s)|^k] < \infty$. It is not always the case that all the moments of a random field exist.

Expectation

The expectation of a random field $Z(s)$ is defined to be its first order moment:

$$\mu(s) = E[Z(s)],$$

for any location s. The expectation in general is allowed to depend on s. In geostatistical applications $\mu(s)$ is often referred to as the trend and represents the large-scale changes of $Z(s)$.

Variance and Covariance

The variance of a random field $Z(s)$ is defined as the second-order moment about the expectation $\mu(s)$:

$$\mathrm{Var}[Z(s)] = E\big[(Z(s) - \mu(s))^2\big],$$

for any location s. As before, the variance generally depends on s. An important variant of the second-order moment, the covariance, is defined as:

$$C(s_i, s_j) = E\big[(Z(s_i) - \mu(s_i))(Z(s_j) - \mu(s_j))\big],$$

for any locations $s_i$ and $s_j$. The covariance generally depends on these locations. Note that when $i = j$ we have the particular case where the covariance equals the variance at $s_i$:

$$C(s_i, s_i) = \mathrm{Var}[Z(s_i)]$$

The covariance matrix of the vector $Z(s)$, with $s = \{s_1, \ldots, s_n\}$ and $s_i \in \mathbb{R}^d$, is defined to be the $n \times n$ matrix $\Sigma$ whose $ij$ element is $C(s_i, s_j)$.
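As an illustration of the covariance matrix just defined, the following Python sketch fills $\Sigma$ element by element from a covariance function. The exponential form of `exponential_covariance`, its parameters `sigma2` and `phi`, and the four locations are assumptions made only for this example; they are not taken from the text.

```python
import numpy as np


def exponential_covariance(si, sj, sigma2=1.0, phi=0.5):
    """Hypothetical covariance function: sigma2 * exp(-||s_i - s_j|| / phi)."""
    distance = np.linalg.norm(np.asarray(si) - np.asarray(sj))
    return sigma2 * np.exp(-distance / phi)


def covariance_matrix(locations, cov_fn):
    """Return the n x n matrix whose (i, j) element is cov_fn(s_i, s_j)."""
    n = len(locations)
    sigma = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            sigma[i, j] = cov_fn(locations[i], locations[j])
    return sigma


# Four illustrative locations in the plane (d = 2).
locations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Sigma = covariance_matrix(locations, exponential_covariance)

# The diagonal holds the variances C(s_i, s_i); the matrix is symmetric.
print(np.round(Sigma, 3))
```

The diagonal entries reproduce $C(s_i, s_i) = \mathrm{Var}[Z(s_i)]$, and symmetry follows directly from the definition of the covariance.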
The covariance structure of the random field represents its variability due to small and microscale stochastic sources.

Variogram and Semi-Variogram

The variogram (or theoretical variogram) between any two spatial locations $s_i$ and $s_j$ supporting a random field is defined as:

$$2\gamma(s_i, s_j) = \mathrm{Var}[Z(s_i) - Z(s_j)] = E\Big[\big((Z(s_i) - Z(s_j)) - (\mu(s_i) - \mu(s_j))\big)^2\Big], \qquad (2.2)$$

that is, the variance of the difference of the two spatial random variables defined at these locations. The variogram describes how this value changes as the separation distance between the points increases. That is why "variogram" is also used as the name of the graph of this function against the separation distance. $\gamma(s_i, s_j)$ is termed the semi-variogram, and it is closely related to the covariance of random fields. The semivariogram is the simplest way to relate uncertainty to distance from an observation, and it is probably one of the most traditional and useful tools of geostatistics. Just as in the case of the covariance, the semivariogram is unknown and in practice is estimated by means of the empirical variogram (§2.5.1).

Covariance and semi-variogram are two alternative ways of describing the second order properties of a random field. While statisticians are trained to express the dependency between random variables in terms of covariances, in geostatistical applications it is common to work with semivariograms. One of the main reasons is the difference in the statistical properties of their empirical estimators, and particularly some problems of bias that arise when working with covariances. But the most important reason is that the semivariogram does not only serve as a device that describes the spatial dependency structure. It is also a structural tool that conveys information about the behavior of a random field. One example is its behavior at the first lags of distance (slow increase, quadratic, etc.), which determines the smoothness of the process. Furthermore, the semivariogram is traditionally used in geostatistics as an inferential tool (see §2.5).

But why is it so important to know the spatial dependency of the random field? Unlike in other application areas of statistics, in geostatistics the specification of the covariance function is of greater importance than finding an appropriate mathematical expression for the trend. Of course this is not a rule, as the analysis always depends on its targets. However, it is quite often the case that we are interested in making interpolations over the area rather than in detecting the most significant covariates. Since the covariance structure reflects the strength of the relationship between random variables, it plays an important role in the spatial prediction problem.

A big problem that arises at this point is related to our previous discussion regarding inference based on a sample of size one. We need to specify the best possible covariance function of the process by relying on a single realization of this process. However, under certain conditions and simplifications the modeling of such a process can be satisfactory. The next section is entirely focused on the cases where things become simpler.

2.3 Special Characterizations of Random Fields

The mathematical modeling of the covariance function can, in general, be regarded as a complicated task.
Very often the random field exhibits quite different patterns over the various spatial subsets of its domain, which does not allow simple mathematical expressions to capture the key features of its dependency structure. However, the process sometimes appears to have a quite homogeneous structure, which implies that we can make a simple approximation of its spatial behavior with a smaller number of parameters. In this case we can say that the process replicates itself in the various subsets of its domain, so that one sample of observations can be treated as a collection of many sub-realizations of the same process, taking place at different spatial subsets. This results in better inferences and largely solves the problem of having only one sample. Often these homogeneous spatial patterns concern only certain characteristics of the process, most of them related to the dependency structure. We give a brief description of some popular simplifications, such as stationarity and isotropy, as well as of some common features of random processes, such as smoothness. Finally, we describe the advantages of working with a Gaussian random field.

Stationary Random Fields

Strict Stationarity

Strict stationarity (or first-order) is the case where the joint uncertainty of any spatially defined random vector $Z(s)$, $s = \{s_1, \ldots, s_n\}$, $s_i \in \mathbb{R}^d$, is the same as the joint uncertainty of $Z(s + h)$, for any $h \in \mathbb{R}^d$ and any n, or equivalently:

$$F(s_1, \ldots, s_n; z_1, \ldots, z_n) = F(s_1 + h, \ldots, s_n + h; z_1, \ldots, z_n), \quad \forall n \text{ and } h \in \mathbb{R}^d \qquad (2.3)$$

In other words, the random field is invariant under translation. This is a very strong requirement, which implies that all moments, provided that they exist, do not depend on the location. As this can be difficult to satisfy in practice, weaker forms of stationarity may be sufficient to provide a foundation for modeling and analysis.
Weak Stationarity

Weak stationarity (or second-order stationarity) is defined to be the case where:

$$E[Z(s)] = \mu \quad \text{and} \quad C(s + h, s) = C(s + h - s) = C(h)$$

The mean of a second-order stationary random field is constant, and the covariance between attributes at different locations is only a function of their spatial separation. Stationarity reflects the lack of importance of absolute coordinates. The last expression implies that for the particular case where $h = 0$ we have:

$$C(s, s) = C(0) = \mathrm{Var}[Z(s)], \quad \text{for every } s.$$

In other words, the variability of a second-order stationary random field is constant throughout its domain. Strict stationarity implies second-order stationarity (provided the second moments exist), while the reverse is not true. In the case of a second-order stationary random field the semi-variogram $\gamma(s, s + h)$ can be written as:

$$\tfrac{1}{2}\mathrm{Var}[Z(s) - Z(s+h)] = \tfrac{1}{2}\big(\mathrm{Var}[Z(s)] + \mathrm{Var}[Z(s+h)] - 2\,\mathrm{Cov}[Z(s), Z(s+h)]\big) = \tfrac{1}{2}\big(C(0) + C(0) - 2C(s, s+h)\big)$$

This allows the semivariogram of the random process to be expressed as:

$$\gamma(s, s + h) = C(0) - C(h) \qquad (2.4)$$

Intrinsic Stationarity

A weaker form of stationarity is intrinsic stationarity. This property defines the case where the increments $Z(s) - Z(s + h)$ are second-order stationary:

$$E[Z(s) - Z(s + h)] = 0 \quad \text{and} \quad \mathrm{Var}[Z(s) - Z(s + h)] = 2\gamma(h)$$

Although second-order stationarity implies intrinsic stationarity, the converse does not hold.

Second-order stationarity of a random field is obviously a very important assumption, without which there would be little hope of making progress in the statistical inference of geostatistical data. It implies that the random field replicates itself in different parts of the spatial domain, which makes it easier to draw conclusions about its second-order properties. The latter can be investigated by considering pairs of points that share the same separation distance, without regard to their absolute coordinates.

Non-Stationary Random Fields

If none of the above assumptions holds, then we have the more general case of a non-stationary random field. Non-stationarity is a common feature of many spatial processes, in particular those observed in the earth sciences (Schabenberger & Gotway 2005).
Sources of non-stationarity may be a non-constant mean, a non-constant variance or a spatially varying covariance function. Changes in the mean value can be accommodated in spatial models by parameterizing the mean function in terms of spatial coordinates and other regressor variables, while the variance can be stabilized by a transformation of the response variable. The last case, where the covariance function varies spatially, cannot be handled so easily. The convenience of inspecting the second-order structure by considering only the distances between the various points is now lost, and the simple covariogram or semivariogram models considered so far no longer apply. In such cases, more elaborate techniques such as spatial deformation or moving windows are very often used, as they allow for a reduction to a stationary covariance structure (Haslett & Raftery 1989; Sampson & Guttorp 1992).

Anisotropy

A random field is said to be anisotropic when its covariance function exhibits different behavior in different directions or, in other words, when it is direction dependent. On the other hand, when the strength of association within the field is the same in each direction, the random field is termed isotropic. Stationarity and isotropy are two completely different notions. Nevertheless, they can be seen as two different homogeneity features of a random field. While a stationary random field is always invariant under translation, an isotropic one is invariant under rotation. This distinction can be made more explicit with the following table, regarding the covariance between $Z(s)$ and $Z(s + h)$, $s, h \in \mathbb{R}^d$:

| A | B | Class |
|---|---|---|
| $C(s, s + h) =$ | $C(s, \lVert h \rVert)$ | Non-Stationary and Isotropic |
| | $C(h)$ | Stationary and Anisotropic |
| | None of them | Non-Stationary and Anisotropic |
| | Both, or $C(\lVert h \rVert)$ | Stationary and Isotropic |

Table 2.3: Identifying the homogeneous characteristics of a random field.

We can distinguish four different cases. By comparing the element of column A with the first two elements of column B, we are able to make a final classification of our process in terms of stationarity and isotropy. If it can be expressed only as the first one, then the random field is isotropic but not stationary. If it can be expressed only as the second element, then it is stationary but not isotropic. If neither of the two representations is equivalent, then the process is non-stationary and anisotropic. Finally, when both expressions are equivalent, $C(s, s + h) = C(\lVert h \rVert)$, which is the case of a stationary and isotropic random field. This can be regarded as the case of a homogeneous two-dimensional random field that replicates itself throughout its domain and in a similar manner in all directions.

Geometric Anisotropy

The fact that in many cases the covariance structure of the process is directionally dependent causes additional difficulties in our modeling and makes the adoption of further assumptions necessary. However, in some particular cases of anisotropy it is quite plausible to assume that the correlation between two spatially defined random variables is a function of their separation angle.
More specifically, we may assume that the rate of their correlation decay (scale) in a given direction can be represented by the radius of an elliptical shape, such as that in Figure 2.3.

Figure 2.3: Analysis of geometric anisotropy by elliptical shapes, the radius of which represents the rate of correlation decay in different directions.

The vectors $\alpha_1$ and $\alpha_2$ represent the scales in these particular directions, that is, the rate of decay in the correlation strength of two variables at this angle. In such cases the process can be converted into an isotropic one by a linear transformation of the coordinate system. The transformation shifts the points to such distances from each other that $C(s, s + h) = C(s, \lVert h \rVert)$. This particular case of anisotropy is known as geometric anisotropy, and the transformation can be performed by means of the following matrix:

$$A = \begin{pmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{pmatrix} \begin{pmatrix} \cos(\psi_A) & \sin(\psi_A) \\ -\sin(\psi_A) & \cos(\psi_A) \end{pmatrix}$$

This matrix is usually referred to as the anisotropy matrix. More specifically, in the general case where $Z(s)$ is an anisotropic process with $s \in R^d$ and $d \ge 2$, the anisotropy matrix $A$ is defined as the $(d \times d)$ matrix for which $Z(sA^{-1})$ has an isotropic covariance function. In terms of our example (figure 2.3), this means that all pairs of spatial locations with separation angle $\psi_A$ are transformed so that the corresponding spatial variables at these locations have a correlation decay represented by a scale equal to $\alpha_2$. As a result, the ellipse is converted into a circle with radius $\alpha_2$ and the process into an isotropic one. The convenience of this transformation is that it allows a geostatistical analysis to be carried out in the new coordinate system. This suggests that we can also make predictions in the transformed coordinate system and then transform them back into the original one (Christensen, Diggle & Ribeiro 2000).

Gaussian Random Fields (GRF)

Gaussian random fields are widely used in practice as models for geostatistical data. They serve as convenient empirical models that can capture a wide range of spatial behavior, according to the specification of the correlation structure (Schabenberger & Gotway 2005). One very good reason for concentrating on Gaussian models is that they are convenient and uniquely tractable as models for dependent data. The Gaussian distribution is fully characterized by its first and second moment structure. This means that by inferring the mean and the covariance (under second-order stationarity assumptions) we can make inferences about the whole joint distribution, which is not possible in general for other distributions.
Another consequence of this property is that second-order stationarity implies strict stationarity. The GRF holds a core position in the theory of spatial data analysis because, like the univariate Gaussian distribution, it is the key to many classical approaches to statistical inference. The statistical properties of estimators derived from Gaussian data are easy to examine, and test statistics usually have a known and simple distribution. As we will see in the next section (2.4), best linear kriging predictors are identical to conditional means in a GRF, which establishes their optimality beyond the class of linear predictors. The range of applicability of the Gaussian model can be extended by assuming that the model holds after a marginal transformation of the response variable. Box and Cox proposed the following parametric family of transformations (Box & Cox 1964):

$$Z^{*} = \begin{cases} \dfrac{Z^{\lambda} - 1}{\lambda} & : \lambda \neq 0 \\ \log(Z) & : \lambda = 0 \end{cases}$$

where a particular choice of $\lambda$ can lead to an empirical Gaussian approximation.

2.4 Modeling the Dependency Structure

The need for simplifications in the analysis has been emphasized many times. This need is most often a natural consequence of the fact that we base our conclusions on a single realization of the process. The modeling of spatial processes depends to a great degree on these assumptions, which not only lead to simpler and mathematically more convenient parametric forms for the dependency structure, but also allow better statistical inference, owing to the relatively small number of parameters they require. Most models of this kind are therefore based on these simplifications, and chiefly on second-order assumptions for the process. Unfortunately, these assumptions are quite often in total disagreement with the observed process. As explained earlier, analysis in these cases is still possible through alternative modeling strategies and some more elaborate methods.
In the present section we focus on the properties of the second-order structure of a weakly stationary and isotropic random field, and on some of the possible ways of describing it. The term isotropic here also includes transformed anisotropic random fields. We will see that, in general, there are two alternative ways of modeling the covariance structure: by operations in the spatial domain and by operations in the frequency domain. Each method has its own advantages and disadvantages.
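As a reminder of how a geometrically anisotropic field is brought into the isotropic setting used in this section, the following is a minimal sketch of the coordinate transformation described in section 2.3. The function names, the rotate-then-rescale convention, the angle convention and the numerical values are illustrative assumptions; they are not necessarily the exact form of the anisotropy matrix used above.

```python
import math

def to_isotropic(points, psi, ratio):
    """Map 2-D coordinates to a space where an elliptical correlation
    contour becomes a circle.

    points : list of (x, y) tuples
    psi    : angle (radians) of the direction of slowest correlation decay,
             measured counter-clockwise from the x-axis (assumed convention)
    ratio  : scale along psi divided by the scale perpendicular to it (>= 1)
    """
    c, s = math.cos(psi), math.sin(psi)
    out = []
    for x, y in points:
        # rotate so that the major axis of the ellipse lies on the x-axis
        u = c * x + s * y
        v = -s * x + c * y
        # shrink the major axis so that both directions share one scale
        out.append((u / ratio, v))
    return out

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hypothetical ellipse: major axis at 30 degrees, axis ratio 2.
psi, ratio = math.radians(30.0), 2.0
origin = (0.0, 0.0)
along = (2.0 * math.cos(psi), 2.0 * math.sin(psi))  # 2 units along the major axis
across = (-math.sin(psi), math.cos(psi))            # 1 unit along the minor axis

o, a, b = to_isotropic([origin, along, across], psi, ratio)
print(dist(o, a), dist(o, b))  # the two distances coincide after the transformation
```

In this sketch the two points lie on the same elliptical contour in the original coordinates, so after the transformation they are at the same distance from the origin and an isotropic covariance model can be applied to the transformed coordinates.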

Properties of Second-Order Covariance Functions

The covariance function $C(\cdot)$ of a second-order stationary random field must satisfy the following properties, for any $s \in R^d$:

- $C(0) \ge 0$
- $C(h) = C(-h)$, i.e. $C$ is an even function
- $C(0) \ge |C(h)|$
- $C(h) = Cov[Z(s), Z(s + h)] = Cov[Z(0), Z(h)]$
- $\sum_{j=1}^{k} b_j C_j(h)$, with $b_j \ge 0$ for $j = 1, \dots, k$, is a valid covariance function if the $C_j(h)$ are valid covariance functions for all $j$.
- $\prod_{j=1}^{k} b_j C_j(h)$, with $b_j \ge 0$ for $j = 1, \dots, k$, is a valid covariance function if the $C_j(h)$ are valid covariance functions for all $j$.
- If $C(h)$ is a valid covariance function in $R^d$, then it is also a valid covariance function in $R^p$ for $p < d$.

These restrictions make clear that not every mathematical function can serve as a covariance function for a spatial process. Moreover, even when a function satisfies all of them, the property that ensures its validity as a covariance function is the positive definite condition.

Positive Definite Condition

$$\sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j C(s_i - s_j) \ge 0, \quad s_i \in R^d \text{ and } i, j \le k \tag{2.5}$$

for any set of locations and real numbers $\alpha_1, \dots, \alpha_k$. This is an obvious requirement, as (2.5) is the variance of the linear combination $a^{\top}[Z(s_1), \dots, Z(s_k)]^{\top}$, where $a = (\alpha_1, \dots, \alpha_k)^{\top}$.

Covariance Models

In this paragraph we give the general form of some of the most popular parametric covariance functions for second-order stationary processes. Such functions are of particular interest because they contain several very widely used covariance models as special cases.

The Matérn Class of Covariance Functions

Based on the spectral representation (see §2.4.3) of isotropic covariance functions, Matérn (1986) constructed a very flexible class of covariance models. This allowed many previously proposed covariance functions to be expressed as particular cases of the following expression:

$$C(h) = \sigma^2 \frac{1}{\Gamma(\nu)} \left( \frac{\theta h}{2} \right)^{\nu} 2 K_{\nu}(\theta h), \quad \nu > 0, \ \theta > 0, \tag{2.6}$$

where $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu > 0$.
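As a numerical illustration of (2.6), the following is a minimal sketch that evaluates the Matérn covariance using only the Python standard library. The Bessel function is approximated from the integral representation $K_{\nu}(x) = \int_0^{\infty} \exp(-x \cosh t) \cosh(\nu t)\, dt$ with a trapezoidal rule. The function names, the integration settings and the parameter values are assumptions made for the example. For $\nu = 1/2$ the expression (2.6) reduces to $\sigma^2 \exp\{-\theta h\}$, which gives a convenient check.

```python
import math

def bessel_k(nu, x, t_max=20.0, n=4000):
    """Approximate K_nu(x) for x > 0 from
    K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt
    using the trapezoidal rule on [0, t_max] (a simple, unoptimised choice)."""
    step = t_max / n
    total = 0.0
    for i in range(n + 1):
        t = i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * step

def matern(h, sigma2, theta, nu):
    """Matern covariance in the parametrisation of equation (2.6)."""
    if h == 0.0:
        return sigma2  # limit of (2.6) as h -> 0
    x = theta * h
    return sigma2 / math.gamma(nu) * (x / 2.0) ** nu * 2.0 * bessel_k(nu, x)

sigma2, theta = 2.0, 1.5  # hypothetical parameter values
for h in (0.1, 0.5, 1.0, 2.0):
    closed_form = sigma2 * math.exp(-theta * h)  # nu = 1/2 special case
    print(h, matern(h, sigma2, theta, 0.5), closed_form)
```

The two printed columns should agree to many decimal places. Evaluating `matern` with other values of `nu` on a grid of distances gives a quick way to compare the behaviour of the covariance near the origin for different degrees of smoothness.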
The parameter $\theta$ governs the range of the spatial dependence, while different values of $\nu$ allow for the modeling of processes with different degrees of smoothness (see the examples below):

- $\nu = \frac{1}{2}$, Exponential model: $C(h) = \sigma^2 \exp\{-\theta h\}$
- $\nu = 1$, Whittle model: $C(h) = \sigma^2 \theta h K_1(\theta h)$
- $\nu \to \infty$, Gaussian model: $C(h) = \sigma^2 \exp\{-\theta h^2\}$

Spherical Family of Covariance Functions

Chiles & Delfiner (1999), based on the convolution representation of the spatial process (§1.2) and by choosing some particular kernel functions, generated the following family of covariance functions:

$$C(h) \propto \begin{cases} \int_{h/a}^{1} (1 - u^2)^{(d-1)/2} \, du & h \le a \\ 0 & \text{otherwise} \end{cases} \tag{2.7}$$

Particular cases of models that result from this family are the tent, the circular and the spherical models, for $d = 1$, 2 and 3 respectively.

Different covariance models can capture different degrees of smoothness of the process. To give some intuition about this, consider the realizations of the two one-dimensional spatial processes illustrated in figure 2.4a.

[Figure 2.4: panel (a) "Differentiability Example"; panel (b) "Semivariogram", semivariance against lag h]

Figure 2.4 a & b: Representation of different degrees of smoothness. Darker lines represent higher degrees of smoothness, which correspond to high values of the $\nu$ parameter in the Matérn family of models. A further loss of smoothness can be caused by the existence of a nugget effect (dashed lines).

The left figure illustrates two spatial processes with different degrees of smoothness, while the right one shows the theoretical variograms of processes produced by (2.6) for different values of $\nu$. Processes such as that represented by the dark line in figure 2.4a correspond to variogram (covariance) models similar to those given by the lower curves of the right figure. On the other hand, less smooth processes, such as the one represented by the dashed line in the left figure, correspond to variogram (covariance) models similar to those in the upper part of figure 2.4b, or to the dashed lines in the same figure, which are usually taken to represent processes with micro-scale variation (§2.4.4). Nevertheless, many correlation models are smoother than can be supported by a natural mechanism.
For example, the darkest line (at the bottom of figure 2.4b), which represents the case in the Matérn family where $\nu \to \infty$ (Gaussian model), corresponds to an infinitely differentiable process. However, even in this extreme case, such covariance functions have proved useful in certain application areas as a means of representing micro-structure effects, for example in meteorology for geopotential fields and in bathymetry in regions where the seafloor surface is smooth due to water flow, erosion and sedimentation (Herzfeld, 1989b).

Spectral Representation

An alternative way of describing the second-order properties of a random field is by means of a spectral representation. The idea comes from the fact that all deterministic functions can, under some regularity conditions, be expressed as a Fourier series. In a similar manner, a covariance function can be expressed as follows:

$$C(h) = \int \exp\{i \omega^{\top} h\} \, s(\omega) \, d\omega,$$

where $s(\omega)$ is termed the spectral density function. $C(h)$ and $s(\omega)$ form a Fourier pair, which implies that the latter can be expressed as a function of the former. This has the advantage of providing an alternative way of estimating the covariance structure from the data, namely by means of $\hat{s}(\omega)$, usually known as the periodogram. Although $C(h)$ and $s(\omega)$ are two alternative but equivalent representations of a particular process, the first emphasizes spatial dependency as a function of coordinate separation, while the latter emphasizes the association of components of variability with frequencies (Schabenberger & Gotway 2005). Bochner (1955) showed that every continuous non-negative definite function with finite $C(0)$ can be expressed in the previous form. Most importantly, he proved that $C(h)$ is positive definite if and only if it can be expressed in this way. This will be discussed further in the next chapter, where the restrictions imposed by the positive definite condition are more severe.

Nesting of Covariance Models

Very often it is plausible to assume that the observed process is composed of two or more other processes operating on different scales. For example, the spatial variation in the altitude of a particular kind of plant may depend on the general ground conditions of an area, but also on micro-scale conditions related to the quality of the soil around its exact location. Or, more simply, the elevation of the ground may depend on a wide range of environmental conditions plus some additional unpredictable features such as rocks or stones, which in this case can be regarded as examples of unstructured spatial processes.
So, any random field can be mathematically represented as follows:

$$Z(s) = \mu + \sum_{j=1}^{p} a_j U_j(s), \quad s \in R^d \tag{2.8}$$

where $U_1(s), \dots, U_p(s)$ are independent, zero-mean random variables, usually thought of as different sources of variation, and $p \ge 0$ ($p \in \mathbb{Z}$). The covariance between two spatially defined random variables $Z$ that are $h$ spatial units apart can be shown to be:

$$Cov[Z(s), Z(s + h)] = \sum_{j=1}^{p} \sum_{k=1}^{p} a_j a_k Cov[U_j(s), U_k(s + h)] = \sum_{j=1}^{p} a_j^2 Cov[U_j(s), U_j(s + h)] \tag{2.9}$$

where the second equality holds because the $U_j$ are independent, so that all cross terms with $j \ne k$ vanish. The last relation can be more conveniently expressed as:

$$C(h) = \sum_{j=1}^{p} a_j^2 C_j(h) \tag{2.10}$$

This last property is quite useful, as it permits the covariance function of a spatial process to be expressed as the sum of the covariance functions of other processes operating on different scales. The result is valid because of the property that linear combinations, with non-negative coefficients, of valid covariance functions are themselves valid covariance functions. Such a nesting of covariance models adds further flexibility to the modeling of the second-order structure of the random field, beyond that offered by single parametric covariance functions such as those introduced earlier. In the case of spatially unstructured processes, or of spatially independent measurement errors in our sampling, the previous relation can be written as:

$$C(h) = \sum_{j=1}^{\kappa} a_j^2 C_j(h) + \sum_{j=\kappa+1}^{p} a_j^2 \nu_j^2 \mathbf{1}_{\{h = 0\}} \tag{2.11}$$

For example, the covariance of the elevation of the ground at two locations that are $h$ spatial units apart, in the previous example, can be expressed as:

$$C(h) = C_1(h) + \nu^2 \mathbf{1}_{\{h = 0\}} \tag{2.12}$$

where $\nu^2$ is usually termed the nugget effect and represents either the variance of the measurement errors in the collection of our sample or the variance of an unstructured spatial process. The existence of the nugget can be detected from the data by means of the variogram. An empirical variogram that does not start from zero usually reflects the fact that one of the sources of variation in the process can be attributed to a nugget effect. This suggests a way of estimating the nugget, whose value equals the intercept of the variogram on the y-axis.

Similarly to what we did before, and although it may not be so useful in practice, we can assume that the process can be decomposed into a product of other processes operating on different spatial scales. That is:

$$Z(s) = \mu + \prod_{j=1}^{p} a_j U_j(s) \tag{2.13}$$

where $U_1(s), \dots, U_p(s)$ are independent, zero-mean random variables. After the same manipulations as before, we conclude that the covariance between two spatially defined random variables a distance $h$ apart can be expressed as:

$$Cov[Z(s), Z(s + h)] = \prod_{j=1}^{p} a_j^2 Cov[U_j(s), U_j(s + h)] \tag{2.14}$$

or, more simply:

$$C(h) = \prod_{j=1}^{p} a_j^2 C_j(h), \tag{2.15}$$

which suggests an alternative way of giving greater flexibility to our modeling.

2.5 Parameter Estimation and Predictions

Models such as those presented earlier are able to capture many features of a particular process. However, if we cannot make them representative of the real process, they are of no use. Adequate representation is most often the result of a good approximation of their unknown components, and for this reason many statistical methods aim at this best approximation. Nevertheless, not all of them are necessarily based on parametric model specifications such as those mentioned earlier.
Such non-parametric approaches usually result from alternative representations of random fields or of their second-order structure, involving for example the convolution representation of a spatial process and kernel smoothers. However, as explained in the introduction, traditional geostatistical approaches to inference were developed independently and are based mainly on estimation with variograms, apart from the mainstream statistical methods of inference adopted later on. In this section we briefly present some of the most popular parametric approaches to geostatistical inference and, at the same time, show how they are connected with the ideas of spatial prediction (kriging). These approaches can be broadly divided into those involving estimation with variograms and those based on likelihood methods.

Estimation with Variograms

An empirical estimate of the theoretical variogram introduced in §2.2 is the classical or Matheron estimator:

$$\hat{\gamma}(h) = \frac{1}{2 |N(h)|} \sum_{N(h)} \{Z(s_i) - Z(s_j)\}^2 \tag{2.16}$$

where $N(h)$ is the set of pairs of locations $(s_i, s_j)$ separated by $h$ and $|N(h)|$ is the number of such pairs. In other words, the empirical semivariogram is half the average of the squared differences between data a particular distance apart. This is illustrated in figures 2.5a and 2.5b. The second figure is the result of dividing the x-axis of figure 2.5a into a number of parts (bins) and averaging the squared differences of the values in each of them. The outcome is the 10 plotted points in the second figure, which are simply Matheron's estimator calculated for 10 different bins. The Matheron estimator gives an estimate of the semivariance of two points that are a distance $h$ apart in space.
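The binning procedure described above can be sketched as follows. This is a minimal illustration of the binned Matheron estimator using only the Python standard library; the function name, the equal-width bins and the toy data are assumptions made for the example and are not taken from the fish stock data.

```python
import math

def empirical_variogram(coords, values, bin_width, n_bins):
    """Binned Matheron estimator for 2-D locations.

    Pairs are assigned to bins [k * bin_width, (k + 1) * bin_width).
    Returns a list of (mean distance in bin, semivariance, number of pairs),
    skipping bins that contain no pairs.
    """
    sq_sums = [0.0] * n_bins   # sum of squared differences per bin
    d_sums = [0.0] * n_bins    # sum of pair distances per bin
    counts = [0] * n_bins      # |N(h)| per bin
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            k = int(d // bin_width)
            if k < n_bins:  # pairs beyond the last bin are ignored
                sq_sums[k] += (values[i] - values[j]) ** 2
                d_sums[k] += d
                counts[k] += 1
    return [(d_sums[k] / counts[k], sq_sums[k] / (2.0 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Toy data: four observations along a transect (hypothetical values).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

result = empirical_variogram(coords, values, bin_width=1.0, n_bins=3)
for mean_dist, gamma_hat, n_pairs in result:
    print(mean_dist, gamma_hat, n_pairs)
```

Each returned triple corresponds to one plotted point of an empirical variogram such as the one in figure 2.5b. In practice the bin width and the maximum distance are tuning choices, since bins with very few pairs give unstable estimates.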
