Spatio-Temporal Geostatistical Models, with an Application in Fish Stock


Ioannis Elmatzoglou

Submitted for the degree of Master in Statistics at Lancaster University, September 2006.

Abstract

Geostatistics is based on the adoption of a probabilistic framework, aiming at the description of the behavior of any kind of continuous and quantifiable spatial phenomenon. However, many such phenomena are characterized not only by spatial but also by temporal variability. Although geostatistics was initially developed for the needs of the mining industry, it is now used in many other application areas, including hydrological, environmental and meteorological applications. Although spatio-temporal analysis is in principle a direct extension of the geostatistical philosophy of analysis in space, in practice there are many obstacles in the path to its full development.

Acknowledgements

Many thanks to my supervisor Paulo J. Ribeiro for his help, his guidelines and his general contribution to my knowledge. Thanks to Peter Diggle for the time he spent and the great interest that he showed in making improvements to my project. The quality and the organization of the data analysis would be much different without his suggestions. It is in general a great honor for me to work with people like him.

I would also like to thank:

Martin Schlather, for the help he provided in aspects concerning the comprehension of the RandomFields software.

The Department of Statistics of Lancaster University, for the funding covering my tickets to Brazil and the studentship offered to me through the academic year.

Loukia Meligkotsidou, for her willingness to help at a really very bad moment, even though she didn't.

My father Antonios Elmatzoglou, for his support.

And last, many thanks to George Tsiotas, Department of Economics, University of Crete, as without him I wouldn't be here doing anything.

Contents

1 An Introduction to Geostatistics
  1.1 From Classical Statistics to Geostatistics
    A Motivating Example
    Geostatistics
  1.2 Approaches in Geostatistical Analysis
  1.3 Geostatistics in Space and Time
  1.4 What follows
2 Geostatistical Analysis of Spatial Data
  2.1 Introduction
  2.2 Basics
  2.3 Special Characterizations of Random Fields
    Stationary Random Fields
    Non-Stationary Random Fields
    Anisotropy
    Gaussian Random Fields (GRF)
  2.4 Modeling the Dependency Structure
    Properties of Second-Order Covariance functions
    Covariance Models
    Spectral Representation
    Nesting of Covariance Models
  2.5 Parameter Estimation and Predictions
    Estimation with Variograms
    Maximum Likelihood Estimation
    Predictions and Kriging
3 Models for spatio-temporal geostatistical data
  Introducing the New Dimension
  Different Approaches in the Spatio-Temporal Analysis
  Nesting of Space-Time Covariance Functions
  Separable Space-Time Models
  Some Examples of separable models
  Non-Separable Models
  Stationary Space-Time Models
  Anisotropy?
  Fully-Symmetry
  Not Fully-Symmetric Space-Time Covariance Models
  3.9 Simulations of Simple Spatio-Temporal Gaussian Random Fields with Different Dependency Features
  Simulating a Random Field with a Non-Separable and Fully Symmetric Covariance Function
  Simulating a Random Field with a Non-Separable and Not Fully-Symmetric Covariance Function
  Simulating two Random Fields with a Separable Dependency Structure
  Realism vs. Convenience
  The Cressie-Huang Approach
  Gneiting's Family of Non-Separable Models
4 Case Study: Spatio-Temporal Modeling of the Portuguese Fish Stocks
  Introduction
  Scientific Interest and Data Description
  The Need for Joint Space-Time Analysis
  Methods
  Exploratory Data Analysis and Assumptions
  Analysis
  Comparison between the Purely Spatial Models Assuming Constant and Non-Constant Properties Over Time
  Purely Temporal Analysis
  Building a Space-Time Covariance Model and Testing its Superiority over the Purely Spatial one
  Construction of a Gneiting's type Non-Separable Covariance Model and Testing its Appropriateness against the Separable one
  Time-Forward Kriging Assessment with the two Models
  Assessment and conclusions
5 Concluding Remarks and Further Studies
6 Appendices
  Appendix A Simulations from the Estimated Model and Variability of Estimations (see Appendix B for R codes)
  Appendix B R CODES
    Simulating from the Estimated Model and Computation of the P.Likelihood of the Separability Parameter

September 27, 2006

Chapter 1 An Introduction to Geostatistics

1.1 From Classical Statistics to Geostatistics

A Motivating Example

Suppose that our interest lies in predicting the average temperature $V$ one day ahead at location A. A statistician could treat all the average daily temperatures $V_t$ at A as random and assume that their values are closely related to each other. A solution to our problem can then be given by considering all the past daily realized values of $V$, $(v_{t_1}, v_{t_2}, \ldots, v_{t_n})$, as an observed realization of a stochastic process with a particular dependency structure. By exploring, through our sample, the way that these variables are related, we are able to express an opinion about the future, yet unrealized value of $V_{t+1}$.

Suppose that we are now interested in the amount of underground oil deposits $O$ that exists at location B, in a geographical area with many oil wells installed. Again, thinking statistically, we can express our ignorance about $O_B$ by treating its real value as a realization from a certain probability distribution. Unfortunately, we cannot use the same methodology. First of all, the expense of making excavations is far higher than that of measuring the temperature. Second, the amount of underground oil is going to be practically the same over time unless it is extracted. Thus, even if we could cheaply check the amount of oil in B, this would have no practical point, as it is going to be constant over time. So we do not have any information for this spatial location coming from different time instances.

Due to this lack of samples in time, one thing we could do is look for some other kind of information, this time coming not from time but from space. More specifically, we can rely on the information provided by the nearby existing oil wells.
Similarly to before, we can assume that the information coming from the closest wells is going to be more valuable than that coming from wells that are far apart. This suggests working in the same kind of framework as in the temperature example. That is, we think of the underground oil deposit at all the locations of the area as an infinite collection of dependent random variables $O_{s_i}$, indexed by location, and of our sample as one particular realization $o_{s_1}, o_{s_2}, \ldots, o_{s_n}$ at some specific points. By exploring the way that they are linked, we are able to express an opinion about the amount of the yet unrealized oil deposit in B.

The methodology in the two examples was quite similar, the only difference being the source of the sample. In the first case it came from time, while in the second it came from space. While the former is an example of time-series analysis, the latter is a typical example of a geostatistical application.

Geostatistics

Geostatistics can be roughly thought of as the spatial version of time series, and it is one of the three main branches of spatial statistics. It usually refers to the case where data consist of a finite sample of measured values relating to an underlying spatially continuous phenomenon (Diggle & Ribeiro 2006). Examples are the temperature in a particular area, the concentrations of a particular pollutant in the soil of a large geographical region, or even the wind's velocity in the same location.

The first law of geography says that "everything is related to everything else, but near things are more related than distant things" (Tobler, 1970, p. 236). This is in close correspondence with our temperature example, as the real and unknown value of the latter is more likely to be similar to its measurements one or two days ago than to those one or two weeks before. In a similar manner, the information we get from the nearby wells is likely to be more useful than that coming from wells further apart.

Although statistics has a well established methodology for describing the relationships between random variables, geostatistics is a science that developed independently. The term geostatistics was first introduced by Georges Matheron (1962), as a means of designating his own methodology of ore reserve evaluation. He also coined the term regionalised variable to designate a numerical function $z(x)$ depending on a continuous space index $x$, and combining high irregularity of detail with spatial correlation. Building on these words, Chiles and Delfiner (1999) defined geostatistics as the application of probabilistic methods to regionalized variables, which is different from the vague usage of the word in the sense of statistics in the geosciences. However, the range of applicability of this methodology has not been restricted by the concept attributed to the Greek prefix geo (γεω = earth, ground, soil, land), which emphasizes the spatial aspect of the problem. Applications have taken place in a wider class of environments, such as the subsurface, land, atmosphere or oceans. This of course implies that more sciences are involved with geostatistics than just the geosciences.

At this point it should be noted that geostatistics is just a complementary and biased tool for performing a spatial analysis. This can become clearer with the following example. Suppose that a bank is robbed in a particular region, say A. The fact that this happened in A and not in B can be interpreted in various ways by different people. The interpretations of an economist, a psychologist, a criminologist, a sociologist and a policeman are certain to be very different from each other.
The same holds in the case of the geosciences. The fact that region A is richer in mineral resources than B can be interpreted in a different way by a geologist or a physicist. Here it should be emphasized that geostatistics does not aim at interpreting what has been observed; it focuses mostly on describing it. It aims to solve particular problems by capturing the main structural features from the data. Its essence is to recognize the inherent variability of natural spatial phenomena and the fragmentary character of the data, and to incorporate these notions in a model of stochastic nature. That means that it does not attempt any physical or genetic interpretation of the data (Chiles and Delfiner 1999), and knowledge of the subject matter does not have much impact on the analysis. That, of course, means that the data play a unique role in the analysis. In our oil example, the non-existence of other nearby wells would turn the problem into a purely geological one.

1.2 Approaches in Geostatistical Analysis

Traditional and Model Based Geostatistics

We mentioned earlier that geostatistics developed as an independent science, which later adopted many statistical features. One consequence is that it still uses various informal and ad hoc methods of inference, such as fitting lines to variograms or fit-by-eye methods. Its main characteristic is the focus on the covariance structure of a given process, which is usually assumed to have a Gaussian distribution. Diggle, Tawn & Moyeed (1998) coined the phrase model-based geostatistics to describe an approach to geostatistical problems based on the application of formal statistical methods under an explicitly assumed stochastic model. In this approach the covariance structure is a consequence of the assumed model. In this project we follow the traditional approach.

Convolution Representation

A third, alternative representation of a spatial process can be obtained in terms of convolutions. This is a completely different approach, in which the spatial process is assumed to be constructed by integrating a weighted, unobserved white noise process, usually referred to as the excitation field. Such a representation has several advantages, among them an alternative way of deriving valid covariance functions. However, we are not going to pay much attention to this approach in this work.
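
As a rough illustration of the convolution idea, the following Python sketch builds a one-dimensional process by smoothing independent Gaussian white noise with a kernel. The Gaussian kernel, the bandwidth, the grid of knots and all function names are assumptions made for this sketch; they do not come from the analysis in this project. The integral is approximated by a finite sum over the knots.

```python
import math
import random


def gaussian_kernel(d, bandwidth):
    """Unnormalised Gaussian smoothing kernel evaluated at distance d (assumed form)."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)


def convolve_white_noise(sites, knots, noise, bandwidth):
    """Z(s) = sum_j k(s - u_j) * w_j, a finite-sum approximation of the convolution."""
    return [
        sum(gaussian_kernel(s - u, bandwidth) * w for u, w in zip(knots, noise))
        for s in sites
    ]


def implied_covariance(s1, s2, knots, bandwidth):
    """Cov[Z(s1), Z(s2)] when the noise terms are independent with unit variance."""
    return sum(
        gaussian_kernel(s1 - u, bandwidth) * gaussian_kernel(s2 - u, bandwidth)
        for u in knots
    )


rng = random.Random(0)                        # fixed seed so the sketch is reproducible
knots = [0.1 * j for j in range(-50, 151)]    # excitation-field locations on [-5, 15]
noise = [rng.gauss(0.0, 1.0) for _ in knots]  # unobserved white noise at the knots
sites = [0.5 * i for i in range(21)]          # locations 0, 0.5, ..., 10 where Z is evaluated
field = convolve_white_noise(sites, knots, noise, bandwidth=1.0)
```

Under these assumptions the covariance between two sites is a sum of kernel products, so it is determined by the kernel alone. This is the sense in which the convolution representation offers a route to valid covariance functions.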

1.3 Geostatistics in Space and Time

Although oil deposits remain practically constant over time, when the interest lies in other kinds of processes, such as nitrate nitrogen (NO3) contamination in groundwater or ammonium nitrogen (NH4) contents in soil, the quality of the analysis can be much enhanced by making more complex assumptions. In particular, models that take into account the fact that the process evolves not only in space but in time as well very often prove to be more useful. One main reason is that past information can contribute not only to the improvement of the spatial interpolations at the present time, but also to our ability to perform time-forward predictions at different spatial locations.

The modeling of spatiotemporal distributions resulting from dynamic processes evolving in both space and time is critical and has been increasingly used in many scientific and engineering fields, including environmental sciences, climate prediction, meteorology, hydrology and reservoir engineering. Just as in the purely spatial case, geostatistical spatiotemporal models provide a probabilistic framework for data analysis and predictions that builds on the joint spatial and temporal dependence between the observations.

The simultaneous analysis of space and time is based on exactly the same philosophy as the analysis in space only: time can be treated as just an extra dimension. However, the fact that space and time are two completely different notions does not allow us to treat them in an exactly equivalent manner. For this reason spatio-temporal analysis and modeling has its own peculiarities and difficulties, which make it dependent on a number of further assumptions.

1.4 What follows

The fact that joint space and time analysis needs to be treated in its own particular way is the reason that we decided to split the analysis into two parts.
As the title of the project implies, our main concern is the modeling of spatial processes at different time instances. We therefore begin with a brief description of the most necessary elements and main assumptions relating to the modeling of purely spatial processes (Chapter 2). These elements are going to be useful in the next part, where we focus on the additional characteristics that space-time modeling has to deal with (Chapter 3). In the last part of the project we apply the theory introduced earlier to a real problem (Chapter 4). So, to summarize, the work can be divided into three parts: two parts of theoretical analysis and one of application.

Chapter 2 Geostatistical Analysis of Spatial Data

2.1 Introduction

The main objective of this chapter is to provide a very short review of the main characteristics governing random fields in the purely spatial domain. This will give us all the necessary ingredients for understanding the more complex assumptions that characterize spatio-temporal processes. The chapter begins with some very simple definitions, such as the moments, variance and covariance structure of random fields, and continues by focusing on some very common and convenient assumptions, such as stationarity and isotropy. The rest of the chapter is concerned with different ways of modeling the dependency structure and different strategies for inference. A significant amount of the material presented below is inspired by the descriptions of Le & Zidek (2006), Schabenberger & Gotway (2005) and Journel & Kyriakidis (1999).

2.2 Basics

A stochastic process is a family or collection of random variables, the members of which can be identified or located (indexed) according to some metric. Consider for convenience the first example of the previous chapter, where the measured value of the temperature $P$ at each time point $t_1, \ldots, t_q$ was treated as a realization from a stochastic process that was considered to exist in time. While this kind of process is usually referred to as a time-series process, a spatial process is defined to be a collection of random variables that exist exclusively in the space domain. These variables are indexed by some set $D \subset \mathbb{R}^d$ containing spatial coordinates $s = (s_1, \ldots, s_d)$. In our second example, regarding the prediction of the underground oil deposits at a particular location, we had two spatial coordinates, i.e. $s = (x_1, x_2)$ and so $d = 2$, where $D$ denoted the particular geographical sub-region of interest. We could also have taken into account the depth of our measurements and thus worked in three dimensions ($d = 3$).
In the case where $d > 1$, the spatial process is usually referred to as a random field. Each of the random components of the field, $Z(s)$, is fully characterized by its cumulative distribution function (cdf):

\[ F(s; z) = \mathrm{Prob}\{Z(s) \le z\}, \quad \forall z \text{ and } s \in \mathbb{R}^d. \]

In other words, the previous expression gives the probability that the variable $Z$ at the location $s$ in space is not greater than any given threshold $z$. Consider now the discretization of the $d$-dimensional spatial domain $D$ into a set $N$ of $n$ points ($N \subset D$). The joint uncertainty about this set of $n$ random variables is characterized by the joint $n$-variate cdf:

\[ F(s_1, \ldots, s_n; z_1, \ldots, z_n) = \mathrm{Prob}\{Z(s_1) \le z_1, \ldots, Z(s_n) \le z_n\}, \quad s_i \in \mathbb{R}^d \tag{2.1} \]

The random field is characterized by all these sets of $n$-dimensional distributions of random variables, spatially defined by every possible discretized subset $N$. A Gaussian Random Field (GRF) is defined to be the case when all these joint distributions are multivariate Gaussian. This always implies that the marginal distributions are Gaussian as well, while the converse does not necessarily hold.

At this point, we should emphasize that in practice we observe only one (and partial) realization of the random field. The statistical analysis is therefore based on this single realization, which is somewhat at odds with what one is used to in the classical applications of statistics. While there we usually have an i.i.d. sample of $n$ observations, here we have a sample of size one, considered to be just a collection of $n$ georeferenced observations $\{z_{s_1}, \ldots, z_{s_n}\}$. Drawing $n$ simulations from a univariate distribution is quite different from drawing one simulation from a multivariate one. This of course makes the inferential process quite difficult, but this issue will be discussed in more detail after giving some simple but useful definitions.

Moments

The $k$th order moment of the random field $Z(s)$ at any location $s \in \mathbb{R}^d$ is defined as

\[ E[Z(s)^k] = \int x^k \, dF_s(x), \]

provided this integral exists. Here $dF_s(x)$ denotes the differential element of probability allocated to $x$ by the distribution $F_s$. The $k$th order moment exists provided that $E[|Z(s)|^k] < \infty$. It is not always the case that all the moments of a random field exist.

Expectation

The expectation of a random field $Z(s)$ is defined to be its first order moment:

\[ \mu(s) = E[Z(s)], \quad \text{for any location } s. \]

The expectation is in general allowed to depend on $s$. In geostatistical applications $\mu(s)$ is often referred to as the trend and represents the large-scale changes of $Z(s)$.

Variance and Covariance

The variance of a random field $Z(s)$ is defined as the second-order moment about the expectation $\mu(s)$:

\[ \mathrm{Var}[Z(s)] = E[(Z(s) - \mu(s))^2], \quad \text{for any location } s. \]

As before, the variance generally depends on $s$. An important variant of the second-order moment, the covariance, is defined as

\[ C(s_i, s_j) = E[(Z(s_i) - \mu(s_i))(Z(s_j) - \mu(s_j))], \quad \text{for any locations } s_i \text{ and } s_j. \]

The covariance generally depends on these locations. Note that when $i = j$ we have the particular case where the covariance equals the variance at $s_i$:

\[ C(s_i, s_i) = \mathrm{Var}[Z(s_i)]. \]

The covariance matrix of the vector $Z(s)$, with $s = \{s_1, \ldots, s_n\}$ and $s_i \in \mathbb{R}^d$, is defined to be the $n \times n$ matrix $\Sigma$ with $(i, j)$ element $C(s_i, s_j)$.
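
To make the definition of the covariance matrix concrete, the sketch below fills $\Sigma$ for three locations in the plane. The exponential form of $C(s_i, s_j)$, the parameter names `sigma2` and `phi` and their values are assumptions chosen only for illustration; any valid covariance function could be passed instead.

```python
import math


def exponential_cov(s_i, s_j, sigma2=1.0, phi=2.0):
    """Assumed covariance model: C(s_i, s_j) = sigma2 * exp(-||s_i - s_j|| / phi)."""
    return sigma2 * math.exp(-math.dist(s_i, s_j) / phi)


def covariance_matrix(sites, cov):
    """n x n matrix whose (i, j) element is cov(s_i, s_j)."""
    return [[cov(a, b) for b in sites] for a in sites]


# Three illustrative locations in the plane (d = 2).
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
Sigma = covariance_matrix(sites, exponential_cov)

for row in Sigma:
    print(["%.3f" % value for value in row])
```

The diagonal of `Sigma` holds the variances $C(s_i, s_i)$, and the matrix is symmetric because $C(s_i, s_j) = C(s_j, s_i)$.
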
The covariance structure of the random field represents its variability due to small-scale and microscale stochastic sources.

Variogram and Semi-Variogram

The variogram (or theoretical variogram) between any two spatial locations $s_i$ and $s_j$ supporting a random field is defined as

\[ 2\gamma(s_i, s_j) = \mathrm{Var}[Z(s_i) - Z(s_j)] = E\big[\big((Z(s_i) - Z(s_j)) - (\mu(s_i) - \mu(s_j))\big)^2\big], \tag{2.2} \]

that is, the variance of the difference of the two spatial random variables defined at these locations. The variogram describes how this value changes as the separation distance between the points increases. That is why variogram is also used as the name of the graph of this function against the separation distance. The function $\gamma(s_i, s_j)$ is termed the semi-variogram and it is closely related to the covariance of random fields. The semivariogram is the simplest way to relate uncertainty to distance from an observation, and it is probably one of the most traditional and useful tools of geostatistics. Just as in the case of the covariance, the semivariogram is unknown and in practice can be estimated by means of the empirical variogram (Section 2.5.1).

Covariance and semi-variogram are two alternative ways of describing the second-order properties of a random field. While statisticians are trained to express the dependency between random variables in terms of covariances, in geostatistical applications it is common to work with semivariograms. One of the main reasons is the difference in the statistical properties of their empirical estimators, and particularly some problems of bias that arise when working with covariances. But the most important reason is that the semivariogram does not only serve as a device which describes the spatial dependency structure. It is also a structural tool that conveys information about the behavior of a random field. One example is its behavior at the first lags of distance (slow increase, quadratic, etc.), which determines the smoothness of the process. Furthermore, the semivariogram is traditionally used in geostatistics as an inferential tool (see Section 2.5).

But why is it so important to know the spatial dependency of the random field? Unlike in other application areas of statistics, in geostatistics the specification of the covariance function is of greater importance than finding an appropriate mathematical expression for the trend. Of course this is not a rule, as the analysis always depends on its targets. However, it is quite often the case that we are more interested in making interpolations over the area than in detecting the most significant covariates. Since the covariance structure reflects the strength of the relationship between random variables, it plays an important role in the spatial prediction problem.

A big problem that arises at this point is related to one of our previous discussions, regarding the task of making inferences based on a sample of size one. We need to specify the best possible covariance function of the process by relying on a single realization of this process. However, under certain conditions and simplifications the modeling of such a process can be satisfactory. The next section is entirely focused on these cases where things become simpler.

2.3 Special Characterizations of Random Fields

The mathematical modeling of the covariance function can, in general, be regarded as a complicated task.
Very often the random field exhibits quite different patterns over the various spatial subsets of its domain, which does not allow simple mathematical expressions to capture the key features of its dependency structure. However, the process sometimes appears to have a quite homogeneous structure, which implies that we can make a simple approximation of its spatial behavior with a smaller number of parameters. In this last case, we can say that the process replicates itself in the various subsets of its domain, which often makes us willing to treat one sample of observations as a collection of many sub-realizations of the same process, taking place at different spatial subsets. This results in better inferences and resolves to a great degree the problem of having only one sample. Often these homogeneous spatial patterns refer only to certain characteristics of the process, most of which are related to the dependency structure. We give a brief description of some popular simplifications, such as stationarity and anisotropy, as well as some common features of random processes, such as smoothness. Finally, we describe the advantages of working with a Gaussian random field.

Stationary Random Fields

Strict Stationarity

Strict (or first-order) stationarity is the case when the joint uncertainty of any spatially defined random vector $Z(s)$, $s = \{s_1, \ldots, s_n\}$, $s_i \in \mathbb{R}^d$, is the same as the joint uncertainty of $Z(s + h)$, for any $h \in \mathbb{R}^d$ and $n$, or equivalently:

\[ F(s_1, \ldots, s_n; z_1, \ldots, z_n) = F(s_1 + h, \ldots, s_n + h; z_1, \ldots, z_n), \quad \forall n \text{ and } h \in \mathbb{R}^d \tag{2.3} \]

In other words, the random field is invariant under translation. This is a very strong requirement, which imposes that all moments, provided that they exist, do not depend on the location. As this can be difficult to satisfy in practice, weaker forms of stationarity may be sufficient to provide a foundation for modeling and analysis.

Weak Stationarity

Weak (or second-order) stationarity is defined to be the case where

\[ E[Z(s)] = \mu \quad \text{and} \quad C(s + h, s) = C(s + h - s) = C(h). \]

The mean of a second-order stationary random field is constant, and the covariance between attributes at different locations is only a function of their spatial separation. Stationarity reflects the lack of importance of absolute coordinates. The last expression implies that, for the particular case where $h = 0$,

\[ C(s, s) = C(0) = \mathrm{Var}[Z(s)], \quad \text{for every } s. \]

In other words, the variability of a second-order stationary random field is constant throughout its domain. Strict stationarity implies second-order stationarity, while the reverse is not true. In the case of a second-order stationary random field the semi-variogram $\gamma(s, s + h)$ can be written as

\[ \gamma(s, s + h) = \tfrac{1}{2}\mathrm{Var}[Z(s) - Z(s + h)] = \tfrac{1}{2}\big(\mathrm{Var}[Z(s)] + \mathrm{Var}[Z(s + h)] - 2\,\mathrm{Cov}[Z(s), Z(s + h)]\big) = \tfrac{1}{2}\big(C(0) + C(0) - 2C(s, s + h)\big). \]

This allows the semivariogram of the random process to be expressed as

\[ \gamma(s, s + h) = C(0) - C(h) \tag{2.4} \]

Intrinsic Stationarity

A weaker form of stationarity is intrinsic stationarity. This property defines the case when the increments $Z(s) - Z(s + h)$ are second-order stationary:

\[ E[Z(s) - Z(s + h)] = 0 \quad \text{and} \quad \mathrm{Var}[Z(s) - Z(s + h)] = 2\gamma(h). \]

Although second-order stationarity implies intrinsic stationarity, the converse does not hold.

Second-order stationarity of a random field is obviously a very important assumption, without which there would be little hope of making progress in statistical inference for geostatistical data. It implies that the random field replicates itself in different parts of the spatial domain, which makes it easier to draw conclusions about its second-order properties. The latter can be investigated by just considering pairs of points that share the same distance, without regard to their absolute coordinates.

Non-Stationary Random Fields

If none of the above assumptions holds, then we have the more general case of a non-stationary random field. Non-stationarity is a common feature of many spatial processes, in particular those observed in the earth sciences (Schabenberger & Gotway 2005).
Sources of non-stationarity may be a non-constant mean, a non-constant variance or a spatially varying covariance function. Changes in the mean value can be accommodated in spatial models by parameterizing the mean function in terms of spatial coordinates and other regressor variables, while the variance can be stabilized by a transformation of the response variable. The last case, when the covariance function varies spatially, cannot be handled so easily. The convenience of inspecting the second-order structure by considering only the distances between the various points is now lost, and the simple covariogram or semivariogram models considered so far no longer apply. In such cases, more involved techniques such as spatial deformation or moving windows are very often used, as they allow for a reduction to a stationary covariance structure (Haslett & Raftery 1989; Sampson & Guttorp 1992).

Anisotropy

A random field is said to be anisotropic when its covariance function exhibits different behavior in different directions or, in other words, when it is direction dependent. On the other hand, when the strength of association within the field is the same in each direction, the random field is termed isotropic. Stationarity and isotropy are two completely different notions. Nevertheless, they can be seen as two different homogeneity features of a random field. While a stationary random field is always invariant under translation, an isotropic one is invariant under rotation. This distinction can be made more explicit with the following table, regarding the covariance between $Z(s)$ and $Z(s + h)$, $s, h \in \mathbb{R}^d$:

A                  B                          Class
$C(s, s + h) =$    $C(s, \|h\|)$              Non-stationary and isotropic
                   $C(h)$                     Stationary and anisotropic
                   None of them               Non-stationary and anisotropic
                   Both, or $C(\|h\|)$        Stationary and isotropic

Table 2.3: Identifying the homogeneous characteristics of a random field.

We can distinguish four different cases. By comparing the element of column A with the first two elements of column B, we are able to make a final classification of our process in terms of stationarity and isotropy. If it can be expressed only as the first one, then the random field is isotropic but not stationary. If it can be expressed only as the second element, then it is stationary but not isotropic. If neither of the two representations is equivalent, then the process is non-stationary and anisotropic. Finally, when both expressions are equivalent, it means that $C(s, s + h) = C(\|h\|)$, which is the case of a stationary and isotropic random field. This can be regarded as the case of a homogeneous two-dimensional random field that replicates itself throughout its domain and in a similar manner in all directions.

Geometric Anisotropy

The fact that in many cases the covariance structure of the process is directionally dependent causes additional difficulties in our modeling and makes the adoption of further assumptions necessary. However, in some particular cases of anisotropy it is quite plausible to assume that the correlation between two spatially defined random variables is a function of their separation angle.
More specifically, the rate of their correlation decay (scale) in a given direction can be represented by the radius of an elliptical shape, such as that in Figure 2.3.

Figure 2.3: Analysis of geometric anisotropy by elliptical shapes, the radius of which represents the rate of correlation decay in different directions.

The vectors $\alpha_1$ and $\alpha_2$ represent the scales in these particular directions, that is, the rate of decay in the correlation strength of two variables at this angle. In such cases the process can be converted into an isotropic one by a linear transformation of the coordinate system. The transformation moves the points to such distances from each other that $C(s, s + h) = C(s, \|h\|)$. This particular case of anisotropy is known as geometric anisotropy, and the transformation can be performed by means of the following matrix:

$$A = \begin{bmatrix} \alpha_1 & 0 \\ 0 & \alpha_2 \end{bmatrix} \begin{bmatrix} \cos(\psi_A) & \sin(\psi_A) \\ -\sin(\psi_A) & \cos(\psi_A) \end{bmatrix}$$

This matrix is usually referred to as the anisotropy matrix. More specifically, in the general case where $Z(s)$ is an anisotropic process with $s \in \mathbb{R}^d$ and $d \ge 2$, the anisotropy matrix $A$ is defined as the $(d \times d)$ matrix for which $Z(sA^{-1})$ has an isotropic covariance function. In terms of our example (Figure 2.3), this means that all pairs of spatial locations with separation angle $\psi_A$ are transformed so that the corresponding spatial variables have a correlation decay represented by a scale equal to $\alpha_2$. As a result, the ellipse is converted into a circle with radius $\alpha_2$ and the process into an isotropic one. The convenience of this transformation is that it allows a geostatistical analysis to be performed in the new coordinate system. We can therefore also make predictions in the transformed coordinate system and then transform them back into the original one (Christensen, Diggle & Ribeiro 2000).

Gaussian Random Fields (GRF)

Gaussian Random Fields are widely used in practice as models for geostatistical data. They are convenient empirical models which can capture a wide range of spatial behavior, according to the specification of the correlation structure (Schabenberger & Gotway 2005). One very good reason for concentrating on Gaussian models is that they are convenient and uniquely tractable as models for dependent data. The Gaussian distribution is fully characterized by its first and second moment structure. This means that by inferring the mean and the covariance (under second-order stationarity assumptions) we can make inferences about the whole joint distribution, which is in general not possible for other distributions.
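Returning to the geometric anisotropy correction described above, the coordinate transformation can be sketched numerically. The following is a minimal illustration under stated assumptions, not the exact parameterization of the anisotropy matrix in the text: the function name `to_isotropic` is hypothetical, the axes are first rotated by the anisotropy angle `psi`, and the coordinate along the major axis is then rescaled by `a2 / a1` so that both principal directions end up with the same range `a2`.

```python
import math


def to_isotropic(h, psi, a1, a2):
    """Map a separation vector h = (hx, hy) to coordinates with equal ranges.

    Assumed convention (illustrative only): psi is the angle of the major
    axis, a1 the range along it, a2 the range perpendicular to it.
    """
    hx, hy = h
    # Rotate the axes by psi so the first coordinate runs along the major axis.
    u = math.cos(psi) * hx + math.sin(psi) * hy
    v = -math.sin(psi) * hx + math.cos(psi) * hy
    # Rescale the major-axis coordinate so that range a1 becomes a2.
    return (u * a2 / a1, v)


psi, a1, a2 = math.pi / 6, 4.0, 2.0
# Points at the edge of the correlation ellipse along each principal direction.
major = (a1 * math.cos(psi), a1 * math.sin(psi))
minor = (-a2 * math.sin(psi), a2 * math.cos(psi))
r_major = math.hypot(*to_isotropic(major, psi, a1, a2))
r_minor = math.hypot(*to_isotropic(minor, psi, a1, a2))
print(r_major, r_minor)
```

Both transformed distances equal `a2` up to rounding, which is the sense in which the ellipse becomes a circle of radius $\alpha_2$.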
Another consequence of this property is that second-order stationarity implies strict stationarity. The GRF holds a core position in the theory of spatial data analysis because, like the univariate Gaussian distribution, it is the key to many classical approaches to statistical inference. The statistical properties of estimators derived from Gaussian data are easy to examine, and test statistics usually have a known and simple distribution. As we will see in the next section 2.4, best linear kriging predictors are identical to conditional means in a GRF, establishing their optimality beyond the class of linear predictors. The range of applicability of the Gaussian model can be extended by assuming that the model holds after a marginal transformation of the response variable. Box and Cox proposed the following parametric family of transformations (Box & Cox 1964):

$$Z^{*} = \begin{cases} \dfrac{Z^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \log(Z), & \lambda = 0 \end{cases}$$

where a particular choice of $\lambda$ can lead to an empirical Gaussian approximation.

2.4 Modeling the Dependency Structure

The need for simplifications in the analysis has been emphasized many times. This need is usually a natural consequence of the fact that we base our conclusions on a single realization of the process. The modeling of spatial processes depends to a great degree on these assumptions, which are responsible not only for simpler and mathematically more convenient parametric forms of the dependency structure, but also for better statistical inference, due to the relatively small number of parameters that they require. Most models of this kind are therefore based on these simplifications, and essentially on second-order assumptions for the process. Unfortunately, it is quite often the case that these assumptions are in total disagreement with the observed process. For these cases, we explained that analysis is still possible by adopting alternative modeling strategies and special-purpose methods.
In the present section, we focus on the properties of the second-order structure of weakly stationary and isotropic random fields, and on some of the possible ways of describing it. The term isotropic here also includes transformed anisotropic random fields. We will see that, in general, there are two alternative ways of modeling the covariance structure: by operations in the spatial domain and by operations in the frequency domain. Each method has its own advantages and disadvantages.

Properties of Second-Order Covariance Functions

The covariance function $C(\cdot)$ of a second-order stationary random field must satisfy the following properties:

$C(0) = \mathrm{Var}[Z(s)] \ge 0$ for any $s \in \mathbb{R}^d$
$C(h) = C(-h)$, i.e. $C$ is an even function
$C(0) \ge |C(h)|$
$C(h) = \mathrm{Cov}[Z(s), Z(s + h)] = \mathrm{Cov}[Z(0), Z(h)]$
$\sum_{j=1}^{k} b_j C_j(h)$, with $b_j \ge 0$, is a valid covariance function if $C_j(h)$, $j = 1, \ldots, k$, are valid covariance functions
If $C(h)$ is a valid covariance function in $\mathbb{R}^d$, then it is also a valid covariance function in $\mathbb{R}^p$ for $p < d$

The above restrictions make clear that not every mathematical function can serve as the covariance function of a spatial process. But even when a function satisfies all of these restrictions, the property that ensures its validity as a covariance function is the positive definite condition.

Positive Definite Condition

$$\sum_{i=1}^{k} \sum_{j=1}^{k} \alpha_i \alpha_j C(s_i - s_j) \ge 0, \quad \forall\, s_i \in \mathbb{R}^d \text{ and } i, j \le k \tag{2.5}$$

for any set of locations and real numbers. This is an obvious requirement, as (2.5) is the variance of the linear combination $a'[Z(s_1), \ldots, Z(s_k)]'$, where $a = (\alpha_1, \ldots, \alpha_k)'$.

Covariance Models

In this subsection we give the general form of some of the most popular parametric covariance functions for second-order stationary processes. Such functions are of interest because they contain several very widely used covariance models as special cases.

The Matern Class of Covariance Functions

Based on the spectral representation (see 2.4.3) of isotropic covariance functions, Matern (1986) constructed a very flexible class of covariance models. This allowed many previously proposed covariance functions to be expressed as particular cases of the following expression:

$$C(h) = \sigma^2 \frac{1}{\Gamma(\nu)} \left( \frac{\theta h}{2} \right)^{\nu} 2 K_{\nu}(\theta h), \quad \nu > 0,\ \theta > 0, \tag{2.6}$$

where $K_{\nu}$ is the modified Bessel function of the second kind of order $\nu > 0$.
The parameter $\theta$ governs the range of the spatial dependence, while different values of $\nu$ allow for the modeling of processes with different degrees of smoothness (see the examples below):

$\nu = \tfrac{1}{2}$, Exponential model: $C(h) = \sigma^2 \exp\{-\theta h\}$
$\nu = 1$, Whittle model: $C(h) = \sigma^2 \theta h K_1(\theta h)$

$\nu \to \infty$, Gaussian model: $C(h) = \sigma^2 \exp\{-\theta h^2\}$

Spherical Family of Covariance Functions

Chiles & Delfiner (1999), based on the convolution representation of the spatial process (§1.2) and by choosing some particular kernel functions, generated the following family of covariance functions:

$$C(h) \propto \begin{cases} \displaystyle\int_{h/a}^{1} (1 - u^2)^{(d-1)/2}\, du, & h \le a \\ 0, & \text{otherwise} \end{cases} \tag{2.7}$$

Particular cases of models that result from this family are the tent, the circular and the spherical models, for $d = 1, 2$ and $3$ respectively.

Different covariance models can capture different degrees of smoothness of the process. To give some intuition about this, consider the realizations of the two one-dimensional spatial processes illustrated in Figure 2.4a.

[Figure 2.4a: "Differentiability Example". Figure 2.4b: "Semivariogram" (x-axis: lag h; y-axis: semivariance).]

Figure 2.4 a & b: Representation of different degrees of smoothness. Darker lines represent higher degrees of smoothness, which correspond to high values of the $\nu$ parameter in the Matern family of models. A further loss of smoothness can be caused by the existence of a nugget effect (dashed lines).

The left figure illustrates two spatial processes with different degrees of smoothness, while the right one shows the theoretical variograms produced by (2.6) for different values of $\nu$. Processes such as the one represented by the dark line in Figure 2.4a correspond to variogram (covariance) models similar to those given by the lower curves of the right figure. On the other hand, less smooth processes, such as the one represented by the dashed line of the left figure, correspond to variogram (covariance) models similar to those in the upper part of Figure 2.4b, or to the dashed lines in the same figure, which are usually taken to represent processes with micro-scale variation (§2.4.4). Nevertheless, many correlation models are smoother than can be supported by a natural mechanism.
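As a numerical check on the spherical family (2.7), the integral can be evaluated directly and compared with the familiar closed forms. This is only a sketch: the function name `spherical_family`, the midpoint-rule integration and the normalization to $C(0) = 1$ are choices made for this illustration rather than anything prescribed by the text.

```python
def spherical_family(h, a, d, n_steps=2000):
    """Covariance from (2.7), normalized so that the value at h = 0 is 1.

    h: lag distance, a: range, d: dimension of the generating kernel.
    """
    def integral(lower):
        # Midpoint rule for the integral of (1 - u^2)^((d - 1) / 2) on [lower, 1].
        width = (1.0 - lower) / n_steps
        total = 0.0
        for k in range(n_steps):
            u = lower + (k + 0.5) * width
            total += (1.0 - u * u) ** ((d - 1) / 2)
        return total * width

    if h > a:
        return 0.0
    return integral(h / a) / integral(0.0)


# d = 1 gives the tent model 1 - h/a.
tent = spherical_family(0.5, 1.0, 1)
# d = 3 gives the spherical model 1 - 1.5 (h/a) + 0.5 (h/a)^3.
sph = spherical_family(0.5, 1.0, 3)
print(tent, sph)
```

With $h/a = 0.5$ the two values agree with $1 - h/a = 0.5$ and $1 - 1.5(h/a) + 0.5(h/a)^3 = 0.3125$ up to the integration error, and every member of the family vanishes beyond the range $a$.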
For example, the darkest line (at the bottom of Figure 2.4b), which represents the case in the Matern family where $\nu \to \infty$ (Gaussian model), corresponds to an infinitely differentiable process. However, even in this extreme case, such covariance functions have proved useful in certain application areas as a means of representing micro-structure effects, for example in meteorology for geopotential fields, and in bathymetry in regions where the seafloor surface is smooth due to water flow, erosion and sedimentation (Herzfeld, 1989b).

Spectral Representation

An alternative way of describing the second-order properties of a random field is by means of a spectral representation. This idea comes from the fact that all deterministic functions under some

regularity conditions can be expressed as a Fourier series. In a similar manner, a covariance function can be expressed as follows:

$$C(h) = \int \exp\{i \omega' h\}\, s(\omega)\, d\omega,$$

where $s(\omega)$ is termed the spectral density function. $C(h)$ and $s(\omega)$ form a Fourier pair, which implies that the latter can be expressed as a function of the former. The advantage is that this provides an alternative way of estimating the covariance structure from the data, namely by means of $\hat{s}(\omega)$, usually known as the periodogram. Although $C(h)$ and $s(\omega)$ are two alternative but equivalent representations of a particular process, the first emphasizes spatial dependency as a function of coordinate separation, while the latter emphasizes the association of components of variability with frequencies (Schabenberger & Gotway 2005). Bochner (1955) showed that every continuous non-negative definite function with finite $C(0)$ can be expressed in the previous form. Most importantly, he proved that $C(h)$ is positive definite if and only if it can be expressed in this way. This will be discussed further in the next chapter, where the restrictions imposed by the positive definite condition are more severe.

Nesting of Covariance Models

Very often it is plausible to assume that the observed process is composed of two or more other processes, existing at different scales. For example, the spatial variation in the altitude of a particular kind of plant may depend on the general ground conditions of a particular area, but also on micro-scale conditions related to the quality of the soil around its exact location. Or, more simply, the elevation of the ground depends on a wide range of environmental conditions plus some extra unpredictable features such as rocks or stones, which, in this case, serve as examples of unstructured spatial processes.
So, any random field can be mathematically represented as follows:

$$Z(s) = \mu + \sum_{j=1}^{p} a_j U_j(s), \quad s \in \mathbb{R}^d \tag{2.8}$$

where $U_1(s), \ldots, U_p(s)$ are independent and zero-mean random variables, usually thought of as different sources of variation, and $p \ge 0$ ($p \in \mathbb{Z}$). It can be shown that the covariance between two spatially defined random variables $Z$ that are $h$ spatial units apart can be expressed as:

$$\mathrm{Cov}[Z(s), Z(s + h)] = \sum_{j=1}^{p} \sum_{k=1}^{p} a_j a_k \mathrm{Cov}[U_j(s), U_k(s + h)] = \sum_{j=1}^{p} a_j^2\, \mathrm{Cov}[U_j(s), U_j(s + h)] \tag{2.9}$$

where the cross terms with $j \neq k$ vanish because the $U_j$ are independent. The last relation can be more conveniently expressed as:

$$C(h) = \sum_{j=1}^{p} a_j^2 C_j(h) \tag{2.10}$$

This last property is quite useful, as it permits the covariance function of a spatial process to be expressed as the sum of the covariance functions of other processes operating on different scales. This is valid because linear combinations of valid covariance functions with non-negative coefficients are themselves valid covariance functions. Such a nesting of covariance models adds flexibility to the modeling of the second-order structure of the random field, beyond that offered by single parametric covariance functions such as those introduced earlier. In the case of spatially unstructured processes, or when assuming spatially independent measurement errors in our sampling, the previous relation can be written as:

$$C(h) = \sum_{j=1}^{\kappa} a_j^2 C_j(h) + \sum_{j=\kappa+1}^{p} a_j^2 \nu_j^2\, \mathbb{1}_{\{h = 0\}} \tag{2.11}$$

For example, the covariance of the elevation of the ground at two locations that are $h$ spatial units apart, in the previous example, can be expressed as:

$$C(h) = C_1(h) + \nu^2\, \mathbb{1}_{\{h = 0\}} \tag{2.12}$$

where $\nu^2$ is usually termed the nugget effect and represents either the variance of the measurement errors in the collection of our sample or the variance of an unstructured spatial process. The existence of the nugget can be detected from the data by means of the variogram. An empirical variogram that does not start from zero usually indicates that one of the sources of variation in the process can be attributed to a nugget effect. This suggests a way of estimating the nugget, whose value is equal to the intercept of the variogram on the y-axis.

Similarly to what we did before, and although it may not be so useful in practice, we can assume that the process can be decomposed into a product of other processes operating on different spatial scales. That is:

$$Z(s) = \mu + \prod_{j=1}^{p} a_j U_j(s) \tag{2.13}$$

where $U_1(s), \ldots, U_p(s)$ are independent and zero-mean random variables. After manipulations similar to those above, we conclude that the covariance between two spatially defined random variables a distance $h$ apart can be expressed as:

$$\mathrm{Cov}[Z(s), Z(s + h)] = \prod_{j=1}^{p} a_j^2\, \mathrm{Cov}[U_j(s), U_j(s + h)] \tag{2.14}$$

or, more simply:

$$C(h) = \prod_{j=1}^{p} a_j^2 C_j(h), \tag{2.15}$$

which suggests an alternative way of giving greater flexibility to our modeling.

2.5 Parameter Estimation and Predictions

Models such as those presented earlier are able to capture many features of a particular process. However, they are useless if we cannot make them representative of the real process. An adequate representation is usually the result of a good approximation of their unknown components, and many statistical methods aim at finding the best such approximation. Nevertheless, not all of them are necessarily based on parametric model specifications such as those mentioned earlier.
Non-parametric approaches of this kind usually result from alternative representations of random fields or of their second-order structure, involving for example the convolution representation of a spatial process and kernel smoothers. However, as explained in the introduction, traditional geostatistical approaches to inference were developed independently and are based mainly on estimation with variograms, with other mainstream statistical methods of inference adopted later on. In this section we briefly present some of the most popular parametric approaches to geostatistical inference and, at the same time, show how they are connected with the ideas of spatial prediction (kriging). These approaches can be broadly divided into those involving estimation with variograms and those based on likelihood methods.

Estimation with Variograms

An empirical estimate of the theoretical variogram introduced in §2.2 is the classical or Matheron estimator:

$$\hat{\gamma}(h) = \frac{1}{2 |N(h)|} \sum_{N(h)} \{Z(s_i) - Z(s_j)\}^2 \tag{2.16}$$

where $N(h)$ denotes the set of pairs of locations $(s_i, s_j)$ separated by $h$ and $|N(h)|$ is the number of such pairs. In other words, the empirical semivariogram averages the squared differences between data a particular distance apart. This is illustrated in Figures 2.5a and 2.5b. The second figure is the result of dividing the x-axis of Figure 2.5a into a certain number of parts (bins) and averaging the squared differences of the values in each of them. The outcome is the 10 plotted points in the second figure, which are simply Matheron's estimator calculated for 10 different bins. The Matheron estimator gives an estimate of the semivariance of two given points that are a distance $h$ apart in space.
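The binning procedure just described can be sketched in a few lines. This is a minimal illustration, not the code used for the figures: the function name `empirical_semivariogram`, the bin edges and the toy transect data are all hypothetical, and each unordered pair of locations is counted once.

```python
import math


def empirical_semivariogram(coords, values, bin_edges):
    """Matheron estimator (2.16) evaluated per distance bin.

    Returns (gamma, counts); gamma[b] is None when bin b holds no pairs.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            dist = math.dist(coords[i], coords[j])
            for b in range(n_bins):
                # A pair belongs to bin b if its distance lies in (lower, upper].
                if bin_edges[b] < dist <= bin_edges[b + 1]:
                    sums[b] += (values[i] - values[j]) ** 2
                    counts[b] += 1
                    break
    gamma = [s / (2 * c) if c else None for s, c in zip(sums, counts)]
    return gamma, counts


# Toy data: four equally spaced observations along a transect.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 3.0, 2.0, 5.0]
bin_edges = [0.0, 1.5, 2.5, 3.5]
gamma, counts = empirical_semivariogram(coords, values, bin_edges)
print(gamma, counts)
```

For the toy transect, the three bins contain 3, 2 and 1 pairs, so the estimate at the largest lag rests on a single squared difference. This illustrates why estimates at large lags are usually the least reliable.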


Non-gaussian spatiotemporal modeling

Non-gaussian spatiotemporal modeling Dec, 2008 1/ 37 Non-gaussian spatiotemporal modeling Thais C O da Fonseca Joint work with Prof Mark F J Steel Department of Statistics University of Warwick Dec, 2008 Dec, 2008 2/ 37 1 Introduction Motivation

More information

11/8/2018. Spatial Interpolation & Geostatistics. Kriging Step 1

11/8/2018. Spatial Interpolation & Geostatistics. Kriging Step 1 (Z i Z j ) 2 / 2 (Z i Zj) 2 / 2 Semivariance y 11/8/2018 Spatial Interpolation & Geostatistics Kriging Step 1 Describe spatial variation with Semivariogram Lag Distance between pairs of points Lag Mean

More information

POPULAR CARTOGRAPHIC AREAL INTERPOLATION METHODS VIEWED FROM A GEOSTATISTICAL PERSPECTIVE

POPULAR CARTOGRAPHIC AREAL INTERPOLATION METHODS VIEWED FROM A GEOSTATISTICAL PERSPECTIVE CO-282 POPULAR CARTOGRAPHIC AREAL INTERPOLATION METHODS VIEWED FROM A GEOSTATISTICAL PERSPECTIVE KYRIAKIDIS P. University of California Santa Barbara, MYTILENE, GREECE ABSTRACT Cartographic areal interpolation

More information

I don t have much to say here: data are often sampled this way but we more typically model them in continuous space, or on a graph

I don t have much to say here: data are often sampled this way but we more typically model them in continuous space, or on a graph Spatial analysis Huge topic! Key references Diggle (point patterns); Cressie (everything); Diggle and Ribeiro (geostatistics); Dormann et al (GLMMs for species presence/abundance); Haining; (Pinheiro and

More information

Bayesian Transgaussian Kriging

Bayesian Transgaussian Kriging 1 Bayesian Transgaussian Kriging Hannes Müller Institut für Statistik University of Klagenfurt 9020 Austria Keywords: Kriging, Bayesian statistics AMS: 62H11,60G90 Abstract In geostatistics a widely used

More information

Overview of Spatial Statistics with Applications to fmri

Overview of Spatial Statistics with Applications to fmri with Applications to fmri School of Mathematics & Statistics Newcastle University April 8 th, 2016 Outline Why spatial statistics? Basic results Nonstationary models Inference for large data sets An example

More information

A MultiGaussian Approach to Assess Block Grade Uncertainty

A MultiGaussian Approach to Assess Block Grade Uncertainty A MultiGaussian Approach to Assess Block Grade Uncertainty Julián M. Ortiz 1, Oy Leuangthong 2, and Clayton V. Deutsch 2 1 Department of Mining Engineering, University of Chile 2 Department of Civil &

More information

Spatial Data Mining. Regression and Classification Techniques

Spatial Data Mining. Regression and Classification Techniques Spatial Data Mining Regression and Classification Techniques 1 Spatial Regression and Classisfication Discrete class labels (left) vs. continues quantities (right) measured at locations (2D for geographic

More information

Stochastic Processes

Stochastic Processes qmc082.tex. Version of 30 September 2010. Lecture Notes on Quantum Mechanics No. 8 R. B. Griffiths References: Stochastic Processes CQT = R. B. Griffiths, Consistent Quantum Theory (Cambridge, 2002) DeGroot

More information

A robust statistically based approach to estimating the probability of contamination occurring between sampling locations

A robust statistically based approach to estimating the probability of contamination occurring between sampling locations A robust statistically based approach to estimating the probability of contamination occurring between sampling locations Peter Beck Principal Environmental Scientist Image placeholder Image placeholder

More information

Automatic Determination of Uncertainty versus Data Density

Automatic Determination of Uncertainty versus Data Density Automatic Determination of Uncertainty versus Data Density Brandon Wilde and Clayton V. Deutsch It is useful to know how various measures of uncertainty respond to changes in data density. Calculating

More information

Predicting AGI: What can we say when we know so little?

Predicting AGI: What can we say when we know so little? Predicting AGI: What can we say when we know so little? Fallenstein, Benja Mennen, Alex December 2, 2013 (Working Paper) 1 Time to taxi Our situation now looks fairly similar to our situation 20 years

More information

Empirical Bayesian Kriging

Empirical Bayesian Kriging Empirical Bayesian Kriging Implemented in ArcGIS Geostatistical Analyst By Konstantin Krivoruchko, Senior Research Associate, Software Development Team, Esri Obtaining reliable environmental measurements

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

On dealing with spatially correlated residuals in remote sensing and GIS

On dealing with spatially correlated residuals in remote sensing and GIS On dealing with spatially correlated residuals in remote sensing and GIS Nicholas A. S. Hamm 1, Peter M. Atkinson and Edward J. Milton 3 School of Geography University of Southampton Southampton SO17 3AT

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Spatial smoothing using Gaussian processes

Spatial smoothing using Gaussian processes Spatial smoothing using Gaussian processes Chris Paciorek paciorek@hsph.harvard.edu August 5, 2004 1 OUTLINE Spatial smoothing and Gaussian processes Covariance modelling Nonstationary covariance modelling

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Multitask Learning of Environmental Spatial Data

Multitask Learning of Environmental Spatial Data 9th International Congress on Environmental Modelling and Software Brigham Young University BYU ScholarsArchive 6th International Congress on Environmental Modelling and Software - Leipzig, Germany - July

More information

A008 THE PROBABILITY PERTURBATION METHOD AN ALTERNATIVE TO A TRADITIONAL BAYESIAN APPROACH FOR SOLVING INVERSE PROBLEMS

A008 THE PROBABILITY PERTURBATION METHOD AN ALTERNATIVE TO A TRADITIONAL BAYESIAN APPROACH FOR SOLVING INVERSE PROBLEMS A008 THE PROAILITY PERTURATION METHOD AN ALTERNATIVE TO A TRADITIONAL AYESIAN APPROAH FOR SOLVING INVERSE PROLEMS Jef AERS Stanford University, Petroleum Engineering, Stanford A 94305-2220 USA Abstract

More information

Geostatistics: Kriging

Geostatistics: Kriging Geostatistics: Kriging 8.10.2015 Konetekniikka 1, Otakaari 4, 150 10-12 Rangsima Sunila, D.Sc. Background What is Geostatitics Concepts Variogram: experimental, theoretical Anisotropy, Isotropy Lag, Sill,

More information

Adaptive Sampling of Clouds with a Fleet of UAVs: Improving Gaussian Process Regression by Including Prior Knowledge

Adaptive Sampling of Clouds with a Fleet of UAVs: Improving Gaussian Process Regression by Including Prior Knowledge Master s Thesis Presentation Adaptive Sampling of Clouds with a Fleet of UAVs: Improving Gaussian Process Regression by Including Prior Knowledge Diego Selle (RIS @ LAAS-CNRS, RT-TUM) Master s Thesis Presentation

More information

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions

SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions SYSM 6303: Quantitative Introduction to Risk and Uncertainty in Business Lecture 4: Fitting Data to Distributions M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu

More information

Chapter 9. Non-Parametric Density Function Estimation

Chapter 9. Non-Parametric Density Function Estimation 9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

Practicum : Spatial Regression

Practicum : Spatial Regression : Alexandra M. Schmidt Instituto de Matemática UFRJ - www.dme.ufrj.br/ alex 2014 Búzios, RJ, www.dme.ufrj.br Exploratory (Spatial) Data Analysis 1. Non-spatial summaries Numerical summaries: Mean, median,

More information

Basics in Geostatistics 2 Geostatistical interpolation/estimation: Kriging methods. Hans Wackernagel. MINES ParisTech.

Basics in Geostatistics 2 Geostatistical interpolation/estimation: Kriging methods. Hans Wackernagel. MINES ParisTech. Basics in Geostatistics 2 Geostatistical interpolation/estimation: Kriging methods Hans Wackernagel MINES ParisTech NERSC April 2013 http://hans.wackernagel.free.fr Basic concepts Geostatistics Hans Wackernagel

More information

Generating Spatial Correlated Binary Data Through a Copulas Method

Generating Spatial Correlated Binary Data Through a Copulas Method Science Research 2015; 3(4): 206-212 Published online July 23, 2015 (http://www.sciencepublishinggroup.com/j/sr) doi: 10.11648/j.sr.20150304.18 ISSN: 2329-0935 (Print); ISSN: 2329-0927 (Online) Generating

More information

Kriging Luc Anselin, All Rights Reserved

Kriging Luc Anselin, All Rights Reserved Kriging Luc Anselin Spatial Analysis Laboratory Dept. Agricultural and Consumer Economics University of Illinois, Urbana-Champaign http://sal.agecon.uiuc.edu Outline Principles Kriging Models Spatial Interpolation

More information

Uncertainty. Michael Peters December 27, 2013

Uncertainty. Michael Peters December 27, 2013 Uncertainty Michael Peters December 27, 20 Lotteries In many problems in economics, people are forced to make decisions without knowing exactly what the consequences will be. For example, when you buy

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

Nonlinear Kriging, potentialities and drawbacks

Nonlinear Kriging, potentialities and drawbacks Nonlinear Kriging, potentialities and drawbacks K. G. van den Boogaart TU Bergakademie Freiberg, Germany; boogaart@grad.tu-freiberg.de Motivation Kriging is known to be the best linear prediction to conclude

More information

A Covariance Conversion Approach of Gamma Random Field Simulation

A Covariance Conversion Approach of Gamma Random Field Simulation Proceedings of the 8th International Symposium on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences Shanghai, P. R. China, June 5-7, 008, pp. 4-45 A Covariance Conversion Approach

More information

METHODOLOGY WHICH APPLIES GEOSTATISTICS TECHNIQUES TO THE TOPOGRAPHICAL SURVEY

METHODOLOGY WHICH APPLIES GEOSTATISTICS TECHNIQUES TO THE TOPOGRAPHICAL SURVEY International Journal of Computer Science and Applications, 2008, Vol. 5, No. 3a, pp 67-79 Technomathematics Research Foundation METHODOLOGY WHICH APPLIES GEOSTATISTICS TECHNIQUES TO THE TOPOGRAPHICAL

More information

Investigation of Monthly Pan Evaporation in Turkey with Geostatistical Technique

Investigation of Monthly Pan Evaporation in Turkey with Geostatistical Technique Investigation of Monthly Pan Evaporation in Turkey with Geostatistical Technique Hatice Çitakoğlu 1, Murat Çobaner 1, Tefaruk Haktanir 1, 1 Department of Civil Engineering, Erciyes University, Kayseri,

More information

Advances in Locally Varying Anisotropy With MDS

Advances in Locally Varying Anisotropy With MDS Paper 102, CCG Annual Report 11, 2009 ( 2009) Advances in Locally Varying Anisotropy With MDS J.B. Boisvert and C. V. Deutsch Often, geology displays non-linear features such as veins, channels or folds/faults

More information

arxiv: v4 [stat.me] 14 Sep 2015

arxiv: v4 [stat.me] 14 Sep 2015 Does non-stationary spatial data always require non-stationary random fields? Geir-Arne Fuglstad 1, Daniel Simpson 1, Finn Lindgren 2, and Håvard Rue 1 1 Department of Mathematical Sciences, NTNU, Norway

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi

Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Communication Engineering Prof. Surendra Prasad Department of Electrical Engineering Indian Institute of Technology, Delhi Lecture - 41 Pulse Code Modulation (PCM) So, if you remember we have been talking

More information

Spatial Analysis II. Spatial data analysis Spatial analysis and inference

Spatial Analysis II. Spatial data analysis Spatial analysis and inference Spatial Analysis II Spatial data analysis Spatial analysis and inference Roadmap Spatial Analysis I Outline: What is spatial analysis? Spatial Joins Step 1: Analysis of attributes Step 2: Preparing for

More information

Transiogram: A spatial relationship measure for categorical data

Transiogram: A spatial relationship measure for categorical data International Journal of Geographical Information Science Vol. 20, No. 6, July 2006, 693 699 Technical Note Transiogram: A spatial relationship measure for categorical data WEIDONG LI* Department of Geography,

More information

Time Series 2. Robert Almgren. Sept. 21, 2009

Time Series 2. Robert Almgren. Sept. 21, 2009 Time Series 2 Robert Almgren Sept. 21, 2009 This week we will talk about linear time series models: AR, MA, ARMA, ARIMA, etc. First we will talk about theory and after we will talk about fitting the models

More information

An anisotropic Matérn spatial covariance model: REML estimation and properties

An anisotropic Matérn spatial covariance model: REML estimation and properties An anisotropic Matérn spatial covariance model: REML estimation and properties Kathryn Anne Haskard Doctor of Philosophy November 2007 Supervisors: Arūnas Verbyla and Brian Cullis THE UNIVERSITY OF ADELAIDE

More information

1. Stochastic Processes and Stationarity

1. Stochastic Processes and Stationarity Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture Note 1 - Introduction This course provides the basic tools needed to analyze data that is observed

More information

RESEARCH REPORT. Estimation of sample spacing in stochastic processes. Anders Rønn-Nielsen, Jon Sporring and Eva B.

RESEARCH REPORT. Estimation of sample spacing in stochastic processes.   Anders Rønn-Nielsen, Jon Sporring and Eva B. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING www.csgb.dk RESEARCH REPORT 6 Anders Rønn-Nielsen, Jon Sporring and Eva B. Vedel Jensen Estimation of sample spacing in stochastic processes No. 7,

More information

Spatial analysis. Spatial descriptive analysis. Spatial inferential analysis:

Spatial analysis. Spatial descriptive analysis. Spatial inferential analysis: Spatial analysis Spatial descriptive analysis Point pattern analysis (minimum bounding box, mean center, weighted mean center, standard distance, nearest neighbor analysis) Spatial clustering analysis

More information

The 1d Kalman Filter. 1 Understanding the forward model. Richard Turner

The 1d Kalman Filter. 1 Understanding the forward model. Richard Turner The d Kalman Filter Richard Turner This is a Jekyll and Hyde of a document and should really be split up. We start with Jekyll which contains a very short derivation for the d Kalman filter, the purpose

More information

Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University

Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University this presentation derived from that presented at the Pan-American Advanced

More information

Interpolation and 3D Visualization of Geodata

Interpolation and 3D Visualization of Geodata Marek KULCZYCKI and Marcin LIGAS, Poland Key words: interpolation, kriging, real estate market analysis, property price index ABSRAC Data derived from property markets have spatial character, no doubt

More information

Lecture 2: Univariate Time Series

Lecture 2: Univariate Time Series Lecture 2: Univariate Time Series Analysis: Conditional and Unconditional Densities, Stationarity, ARMA Processes Prof. Massimo Guidolin 20192 Financial Econometrics Spring/Winter 2017 Overview Motivation:

More information

The World According to Wolfram

The World According to Wolfram The World According to Wolfram Basic Summary of NKS- A New Kind of Science is Stephen Wolfram s attempt to revolutionize the theoretical and methodological underpinnings of the universe. Though this endeavor

More information

Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore

Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore Stochastic Structural Dynamics Prof. Dr. C. S. Manohar Department of Civil Engineering Indian Institute of Science, Bangalore Lecture No. # 33 Probabilistic methods in earthquake engineering-2 So, we have

More information