GEOSTATISTICS. Dr. Spyros Fountas


[Figure: map of Trent Field, Westover Farm, Andover, marking a disturbed area and panholes. Axes: Easting (m), 436950 to 437350; Northing (m), 140250 to 140550. (Blackmore, 2003)]

Trent field, Westover Farm

Yield mapping combine (image: Ian Britton)

Raw yield data points [Figure: map of the raw yield data points. Axes: Easting (m), 436900 to 437400; Northing (m), 140200 to 140600. (Blackmore, 2003)]

Univariate summary statistics: Trent field, Westover farm, 1997 (data: Blackmore, 2003)

Central tendency
  Min       0
  Q1        3.3
  Median    3.8
  Q3        4.7
  Max       7.6
  Average   3.9
  Mode      3.7
  Count     3917

Variability
  Variance              1.6
  Std dev               1.3
  Interquartile range   1.4
  Upper outliers        above 6.8
  Lower outliers        below 1.2

Measures of histogram shape
  Skewness                   -0.358
  Coefficient of variation    0.319
  Kurtosis                    1.09

[Figure: histogram of classed yield (t/ha), classes 1 to 9, with frequency (0 to 2000) and cumulative % (0% to 100%) axes. Frequency labels on the bars: 1598, 1004, 421, 457, 161, 115, 124, 19, 18.]
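The outlier limits on this slide are consistent with the usual 1.5 x IQR rule. The slide does not state which rule was used, so that rule is an assumption here. A minimal sketch using the rounded values from the slide:

```python
# Quartiles, standard deviation and average as reported on the slide (yield in t/ha).
q1, q3 = 3.3, 4.7
std_dev, mean = 1.3, 3.9

iqr = q3 - q1                     # interquartile range
lower_fence = q1 - 1.5 * iqr      # values below this are flagged as lower outliers
upper_fence = q3 + 1.5 * iqr      # values above this are flagged as upper outliers
cv = std_dev / mean               # coefficient of variation from the rounded inputs

print(round(iqr, 1), round(lower_fence, 1), round(upper_fence, 1))
print(round(cv, 2))
```

The fences reproduce the 1.2 and 6.8 limits on the slide. The coefficient of variation comes out near 0.33 rather than 0.319, presumably because the slide's figure was computed from unrounded values.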

Why Geostatistics? We want to know the soil properties at every point in a field, so that fertilizer is applied where it is needed and nowhere else, for example on a 20 m x 20 m grid. The same question arises for other spatial variables, e.g. soil salinity, pollution by heavy metals, arsenic in groundwater, rainfall and barometric pressure.

Why Geostatistics? The common thread: the environment is continuous, BUT we can afford to measure its properties at only a finite number of places, SO the best we can do is to estimate, or predict, in a spatial sense. This is the PRINCIPLE of Geostatistics.

The estimation procedure in Geostatistics: KRIGING

Spatial prediction [Figure: sampled values 14.2, 16.3, 16.0, 14.1, 15.3, 15.8, 13.8, 14.9 and 15.0 surrounding an unsampled location marked "?"]
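To make the prediction step concrete, here is a minimal sketch of ordinary kriging at one unsampled location. It is an illustration under stated assumptions, not the procedure used for the slide: the coordinates, the exponential variogram and its parameters are hypothetical, and only four of the slide's values are used.

```python
import math

def gamma(h, nugget=0.0, sill=1.0, rng=1.0):
    """Hypothetical exponential variogram model; gamma(0) is 0 by definition."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]       # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                        # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, values, target, variogram):
    """Return (prediction, kriging variance, weights) at the target location."""
    n = len(points)
    # Kriging system: variogram between samples, plus the unbiasedness constraint.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]                         # mu: Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance, weights

# Hypothetical layout: four samples on the corners of a unit square, target in the centre.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [14.2, 16.3, 13.8, 15.0]
prediction, variance, weights = ordinary_kriging(points, values, (0.5, 0.5), gamma)
print(prediction, variance, weights)
```

Because the four samples are equidistant from the target, the weights are equal and the prediction is the plain average. Moving the target towards one corner shifts weight to that sample, and the kriging variance shrinks as the target approaches a data point.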

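Spatial relationships [Figure: seven numbered sample points (1 to 7) with values 44, 40, 42, 43, 37, 37 and 37. Adjacent points are 100.0 apart; diagonal neighbours are 141.4 apart, from h² = 100.0² + 100.0².]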
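The separation distances in this figure follow from Pythagoras. A minimal sketch, assuming a hypothetical square grid with 100-unit spacing:

```python
import math

def lag_distance(p, q):
    """Euclidean separation (lag) h between two points given as (easting, northing)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Hypothetical grid nodes with 100-unit spacing.
a = (0.0, 0.0)
b = (100.0, 0.0)      # neighbour along the row
c = (100.0, 100.0)    # diagonal neighbour

print(round(lag_distance(a, b), 1))   # 100.0
print(round(lag_distance(a, c), 1))   # 141.4
```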

Why Geostatistics? But what if there is an ERROR? What if we underestimate and recommend less fertilizer than is needed, and the farmer loses yield and profit? What are you going to say to the farmer? Geostatistics gives the answer! It can never provide complete information, of course, but given the data, it enables you to estimate the probability that the true value exceeds a specified threshold.
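One common way to turn a prediction and its error variance into such a probability is to assume that the prediction error is Gaussian. That assumption, and the numbers below, are illustrative only and do not come from the slides.

```python
import math

def prob_exceeds(threshold, prediction, kriging_variance):
    """P(true value > threshold), assuming a Gaussian prediction error."""
    z = (threshold - prediction) / math.sqrt(kriging_variance)
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # upper tail of the standard normal

# Hypothetical numbers: predicted 14.5 with kriging variance 0.25, threshold 15.0.
p = prob_exceeds(threshold=15.0, prediction=14.5, kriging_variance=0.25)
print(round(p, 3))
```

Here the threshold lies one standard deviation above the prediction, so the probability of exceeding it is about 16%.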

Spatial data Spatial statistics deals with the environment: soil, water, air. Environmental variables are the result of the actions and interactions of many different processes and factors. Each process might itself operate on several scales simultaneously, in a non-linear way, and with local positive feedback.

Spatial data Spatial changes in the environment can be obvious, as when we see them on aerial photographs and satellite images, or more subtle, e.g. changes in temperature or chemistry. Measurements are taken on samples a few centimetres or metres across, which we call point samples. Point samples are positively autocorrelated: places close to one another tend to have similar values, while places further apart differ more. Environmentalists know that intuitively. Geostatisticians can quantify the spatial autocorrelation and minimize the prediction errors.
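A standard way to quantify this autocorrelation is the empirical semivariogram: half the average squared difference between all pairs of samples separated by roughly the same lag. The sketch below uses the classical (Matheron) estimator. The transect data and the lag width are hypothetical.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, lag_width):
    """Return {lag: semivariance}, binning each pair to the nearest multiple of lag_width."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])
            k = round(h / lag_width)                  # index of the nearest lag bin
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # Semivariance = sum of squared differences / (2 * number of pairs).
    return {k * lag_width: sums[k] / (2 * counts[k]) for k in sorted(sums)}

# Hypothetical transect: five samples 10 m apart with a steady trend.
points = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
sv = empirical_semivariogram(points, values, lag_width=10.0)
print(sv)
```

The semivariance grows with the lag, which is the numerical form of "places close together are similar, places further apart differ more".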

Geo vs Classical Statistics 1. Classical statistics is based on random sampling and on linear sums of the data, all of which carry the same weight. If there is spatial correlation, then by stratifying we can estimate more precisely or sample more efficiently. If the strata are of different sizes, we can vary the weights given to their data in proportion.
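As a small illustration of weighting strata in proportion to their size, here is a sketch of an area-weighted stratified mean. The strata, their areas and the sample values are hypothetical.

```python
def stratified_mean(strata):
    """strata: list of (area, sample_values). Each stratum mean is weighted by its share of the total area."""
    total_area = sum(area for area, _ in strata)
    return sum((area / total_area) * (sum(vals) / len(vals)) for area, vals in strata)

# Hypothetical field: a 3 ha stratum and a 1 ha stratum.
strata = [(3.0, [4.0, 4.2, 3.8]), (1.0, [2.0, 2.4])]
estimate = stratified_mean(strata)
pooled = sum(v for _, vals in strata for v in vals) / 5   # unweighted mean, for comparison
print(round(estimate, 2), round(pooled, 2))
```

The pooled mean treats all five samples alike, while the stratified estimate gives the larger stratum three quarters of the weight.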

Geo vs Classical Stats 2. Geostatistics relies on spatial models, while classical statistics does not. Classical estimation is based on the sampling design, which ensures unbiasedness and provides estimates of error, provided the design is suitable. It requires no assumptions about the nature of the variable itself. Geostatistics assumes that the variable is random and is the outcome of one or more random processes. The models on which predictions are based are models of these random processes.

Semi-variogram used for Kriging [Figure: semi-variogram describing the spatial relationship, with variance on the vertical axis. Labels: Sill; Nugget (micro error or sampling error); Range (distance of influence).]
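The three labelled parameters can be read off a fitted variogram model. The sketch below uses the spherical model, one widely used choice. The slide does not say which model it shows, so the model and the parameter values are assumptions. "Sill" here means the total sill, nugget included.

```python
def spherical_variogram(h, nugget, sill, rng):
    """Spherical model: rises from the nugget and levels off at the sill once h reaches the range."""
    if h == 0:
        return 0.0                       # by definition, gamma(0) = 0
    if h >= rng:
        return sill                      # beyond the range, samples are uncorrelated
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Hypothetical parameters: nugget 0.2, sill 1.0, range 100 m.
for h in (0.0, 1.0, 50.0, 100.0, 200.0):
    print(h, round(spherical_variogram(h, 0.2, 1.0, 100.0), 4))
```

Just above zero lag the semivariance jumps to about the nugget, it reaches the sill at the range, and it stays there for longer lags.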

Spatial models [Figure: variogram models, semivariance γ plotted against lag h]

Simulation [Figure: simulation workflow (adapted from Chainey and Stuart, 1998). Inputs: a variogram model (variance against lag, with nugget, sill and range), the sample data and a search window. Steps shown: convert the variogram parameters to a covariance model; place the search window around the unsampled location "?" among the samples (14.2, 16.3, 14.1, 13.8, 14.9, 15.3, 15.0, 16.0, 15.8); using the lag, find the covariance from the covariance model; build a Gaussian distribution around the local mean; randomly select a value from the distribution. Result in the example: simulated value = 14.5.]
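The workflow in this figure resembles one step of sequential Gaussian simulation. The sketch below is one possible reading of it, not the figure's exact method. It assumes simple kriging with a known global mean and an exponential covariance model, and the coordinates and parameters are hypothetical.

```python
import math
import random

def covariance(h, nugget=0.0, sill=1.0, rng=50.0):
    """Covariance from an exponential variogram: C(h) = sill - gamma(h)."""
    if h == 0:
        return sill
    return (sill - nugget) * math.exp(-h / rng)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def simulate_value(points, values, target, global_mean, cov, rnd):
    """Return (local mean, local variance, random draw) for one unsampled location."""
    n = len(points)
    # Covariances between the samples in the search window, and from each sample to the target.
    c = [[cov(math.dist(points[i], points[j])) for j in range(n)] for i in range(n)]
    c0 = [cov(math.dist(p, target)) for p in points]
    weights = solve(c, c0)                                  # simple kriging weights
    local_mean = global_mean + sum(w * (z - global_mean) for w, z in zip(weights, values))
    local_var = max(cov(0.0) - sum(w * k for w, k in zip(weights, c0)), 0.0)
    draw = rnd.gauss(local_mean, math.sqrt(local_var))      # randomly select from the Gaussian
    return local_mean, local_var, draw

# Hypothetical search window holding three of the slide's sample values.
points = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
values = [14.2, 16.3, 16.0]
rnd = random.Random(42)                                     # seeded for repeatability
local_mean, local_var, draw = simulate_value(points, values, (50.0, 50.0), 15.0, covariance, rnd)
print(local_mean, local_var, draw)
```

In a full simulation each drawn value would be added to the conditioning data before the next location is visited. That loop is omitted here to keep the sketch short.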