Spatial and Environmental Statistics
|
|
- Bernard Richard
- 5 years ago
- Views:
Transcription
1 Spatial and Environmental Statistics Dale Zimmerman Department of Statistics and Actuarial Science University of Iowa January 17, 2019 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
2 Overview 1 Spatial Datasets 2 What is Spatial (and Environmental) Statistics? 3 Three Important Types of Spatial Data Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
3 Spatial Datasets Wet deposition of SO 4 (g/m 2 ) in 1987 at National Acid Deposition Program sites. lat so4dep long Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
4 Spatial Datasets Coal ash samples from a mine in Pennsylvania. coal.ash$y coal.ash$x 9.1 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
5 Spatial Datasets Presence (black) or absence (white) of Atriplex hymenelytra on a grid of quadrats in Death Valley, CA. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
6 Spatial Datasets Population-adjusted mortality rates due to SIDS in counties of North Carolina, Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
7 Spatial Datasets Locations of Japanese pines, redwood saplings, biological cells, and scouring rushes in various study areas. Pines Redwoods Cells Rushes Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
8 What is Spatial Statistics? Basic ingredients: Observations on one or more response variables are taken at multiple, identifiable sites in some spatial domain. Locations of these sites are observed and are attached, as labels, to the observations. An analysis of the observations is performed, in which the spatial locations of sites are taken into account. Either the observations or the spatial locations (or both) are modelled as random variables, and inferences are made about these models and/or about additional unobserved variables. Thus, spatial statistics would include any investigation in which the data s spatial locations play a role in a probabilistic or statistical analysis (we will emphasize the statistical). Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
9 What is Spatial Statistics? Spatial statistics is a vast subject, in large part because spatial data are of so many different types. The response variable may be: univariate or multivariate categorical or continuous real-valued (numerical) or not real-valued (e.g. set-valued) observational or experimental The data locations may: be points, regions, or something else be regularly or irregularly spaced be regularly or irregularly shaped belong to a Euclidean or non-euclidean space (e.g., river network) Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
10 What is Spatial Statistics? The mechanism that generates the data locations may be: known or unknown random or non-random related or unrelated to the processes that govern the responses Related subjects: Time series analysis Reliability/survival analysis Longitudinal data analysis Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
11 Three Important Types of Spatial Data 1. Geostatistical data The response variable exists at every point in the study region; however, we observe the response at only a finite number of points, or at a finite number of subregions that are small relative to the spacing between them. Examples: (a) Annual acid rain deposition in U.S. (b) Richness of iron ore within an ore body Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
12 Three Important Types of Spatial Data 2. Areal (sometimes called lattice) data The response variable exists and is observed only on a finite set of (usually contiguous) subregions within the study region. Examples: (a) Presence or absence of a plant species in square quadrats over a study area (b) Numbers of deaths due to SIDS in the counties of North Carolina (c) Pixel values from remote sensing (satellites) Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
13 Three Important Types of Spatial Data 3. Spatial point patterns Data are the spatial locations of point events within the study region. No response variable is observed at the locations. Examples: (a) Locations of Equisetum arvense plants at a marsh edge evidence of environmental gradient? (b) Location of lunar craters meteor impacts or volcanism? (c) Locations of residences of individuals with lung cancer within 50 miles of a large incinerator does disease risk increase with proximity to the incinerator? A more general kind of spatial point pattern is a marked spatial point pattern, in which a nontrivial response variable (called the mark) is observed at each point. If the mark is discrete, we have a multivariate spatial point pattern. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
14 Three Important Types of Spatial Data The distinctions between these three types are not always clear-cut. In particular, areal data and geostatistical data have many similarities. In a sense, areal data are not as refined as geostatistical data or spatial point patterns since you can obtain areal data by various reductions (integration or counting) of the other two. In addition to indicating some prototypes of spatial data, the examples listed above indicate the breadth of disciplines in which scientific inquiry is concerned with spatial data. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
15 Spatial Statistics Why Bother? There are two main reasons why we bother with spatial statistics for spatial data (instead of just using classical statistics): 1 Characterizing the spatial structure of the data may be of direct interest. 2 The spatial structure may not be of direct interest, but modeling or otherwise accounting for it may improve other inferences. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
16 Spatial Statistics Why Bother? More details on the first reason: The observations are suspected of having a coherent spatial structure, the characterization of which may be important. The kinds of spatial structure that may occur vary across types, but there are some commonalities. It has been observed over and over again in practice that observations taken at sites close together tend to be more alike than observations taken at sites far apart. In the spatial context, this is sometimes called the First Law of Spatial Statistics. This law can manifest through either the large-scale (global) structure or the small-scale (local) structure, or both. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
17 Spatial Statistics Why Bother? Large-scale structure Mean function of geostatistical process Mean vector of areal process Intensity of spatial point process Small-scale structure Variogram, covariance function of geostatistical process Neighbor weights for areal process Ripley s K-function, second-order intensity, nearest-neighbor functions for spatial point process Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
18 Spatial Statistics Why Bother? Two important types of spatial structure are stationarity and isotropy. Formal definitions of these will be given later. For now, the following descriptions will suffice. (a) Stationarity the property whereby the behavior of the process is similar across all of the spatial domain under study. This implies: constant (no trend) large-scale structure small-scale structure that depends on the spatial locations only through their relative positions (displacement) (b) Isotropy the property whereby the process is stationary, plus the small-scale structure depends on the spatial locations only through the Euclidean distance between them. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
19 Spatial Statistics Why Bother? Characterization of the spatial structure is usually achieved by one or more of the following types of statistical inference: Testing for the existence of spatial structure Estimating spatial structural parameters Choosing between alternative structural models Prediction of unobserved variables using estimated structure (almost exclusively geostatistical, where it is known as kriging) Now we elaborate on the second reason why we bother with spatial statistics, i.e., taking account of the spatial structure to improve non-spatial inferences. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
20 Spatial Statistics Why Bother? Example 1 (from geostatistics). Prediction of an unobserved response at the from 6 nearby observations. If all of the observed responses are uncorrelated with each other and with Z(s 0 ), then Z (the average of the observed responses) is the best linear unbiased predictor. If, however, the responses are spatially correlated, then Z is inefficient. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
21 Spatial Statistics Why Bother? Example 2 (from spatial point pattern analysis). Estimation of the number, N, of trees in a forest of area A. One method for estimating N is based on measuring the distance, X i, to the nearest tree from each of m fixed points. X2 X4 X1 X3 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
22 Spatial Statistics Why Bother? If tree locations are completely spatially random (a random sample from the uniform distribution on A), then the MLE of N is ˆN = m A m i=1 πx i 2. If not, then ˆN can be badly biased. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
23 Spatial Statistics Why Bother? Example 3 (from areal data analysis). Variance of the sample mean. Consider 16 observations taken over square subregions in a 4 4 grid, indexed by rows and columns as Z(i, j): Z(4,1) Z(4,4) Z(1,1) Z(1,4) Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
24 Spatial Statistics Why Bother? Suppose that the observations have common mean µ and common variance 1, and corr[z(i, j), Z(k, l)] = 0.5 i k + j l. Suppose we wish to estimate µ by the sample mean, Z. It s tedious but mathematically easy to show that var( Z). = If there were no spatial correlation, then var( Z) = 1/16 = Thus if we obtain a 95% (say) confidence interval for µ by acting as though there is no correlation, our interval will actually be much narrower than it should be. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
25 Spatial Statistics Why Bother? Example 4 (from areal data analysis). Spatial experimental design. Consider a field-plot experiment with 50 units, laid out in 10 linear blocks of 5 plots each. Suppose there are 5 treatments and each is to occur once in each block. Consider two designs: Randomized block design (RBD) First-order nearest-neighbor balanced design (first-order NNBD) Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
26 Spatial Statistics Why Bother? It turns out that if treatment-adjusted responses are independent across blocks but positively spatially correlated within blocks, then the first-order NNBD is optimal in the sense of minimizing the average variance of treatment contrasts. It can be considerably superior to the RBD if the correlation is strong. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
27 Geostatistical Data and Model, Part I Basic form of geostatistical data is (x i, y i ) : i = 1,..., n, where x i is a spatial location in a spatial domain of interest A, and y i is a the measured value of a variable y at x i. Usually A lies in 2-D space and the x i s are points, in which case x i is a 2-D vector of spatial coordinates. Almost always, the x i s are distinct (i.e., no repeated measurements). The x i s are either deterministic (nonrandom) or, if they are randomly chosen, it is assumed (unless noted otherwise) that the mechanism for their selection is independent of the process governing the y i s. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
28 Geostatistical Data and Model, Part I As for y i, it is assumed to be a realization of a random variable Y i, which in turn is regarded as a function f ( ) of another random variable S(x i ): Y i = f (S(x i )). In fact, it is assumed that there is a random variable S(x) at each possible location in x A, not merely at x 1,..., x n. The collection S( ) {S(x) : x A} is a stochastic process, which in this context is called a geostatistical process or a random field. In general it is not observable, even at x 1,..., x n. In the wet sulfate deposition example: A is the continental U.S. x 1,..., x n are the locations of the buckets y i is the observed annual wet sulfate deposition Y i = S(x i ) + ɛ i, where ɛ i is measurement error Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
29 Geostatistical Data and Model, Part I It is assumed that Y 1,..., Y n S( ) are conditionally independent, but that S(x) and S(x ) generally are not independent. If S( ) is a Gaussian process, meaning that any finite subcollection has a multivariate normal distribution, then S(x) and S(x ) are independent iff corr(s(x), S(x )) = 0. Thus in this case at least, it is clear that the correlation structure of S( ) is important in modeling and inference for geostatistical data. Even if S( ) is not Gaussian, some types of inference are possible if we can characterize the first-order ( large-scale) and second-order ( small-scale) moment structure of the data. So we take that as our first major task. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
30 Means, Variances, and Correlations of Random Variables Suppose that Y is a random variable (defined on some probability space). If Y is discrete, it has a probability mass function f (y), where f (y) > 0 for y Y. If the sum u(y)f (y) y Y exists, then it (the sum) is called the expectation of u(y ), written as E(u(Y )). If Y is (absolutely) continuous, it has a probability density function f (y), where f (y) > 0 for y Y. If u(y)f (y) dy Y exists, then it (the integral) is also called the expectation of u(y ), also written as E(u(Y )). Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
31 Means, Variances, and Correlations of Random Variables It follows easily from the definition of expectation that when it exists, E(c 1 u 1 (Y ) + c 2 u 2 (Y )) = c 1 E(u 1 (Y )) + c 2 E(u 2 (Y )) where c 1 and c 2 are constants and u 1 ( ) and u 2 ( ) are functions. Some important special types of expectations are the mean and variance: Mean, E(Y ) ( µ) Variance, Var(Y ) = E[(Y µ) 2 ] = E[Y 2 2µY + µ 2 ] = E(Y 2 ) µ 2 ( σ 2 ) By the rule at the top of this page, E(aY + b) = aµ + b, and Var(aY + b) = a 2 σ 2. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
32 Means, Variances, and Correlations of Random Variables Now suppose that Y 1 and Y 2 are two random variables, either both discrete or both continuous, with joint pmf or pdf f (y 1, y 2 ). Then the expectation of u(y 1, Y 2 ), when it exists, is given by an expression similar to that given before. For example, in the discrete case, u(y 1, y 2 )f (y 1, y 2 ). (y 1,y 2 ) Y It turns out that the means µ 1 and µ 2, and variances σ 2 1 and σ2 2, of Y 1 and Y 2 (when they exist) may be obtained using either this approach or the previous approach (using marginal distributions). However, there is another important expectation in this bivariate case that requires the second approach only. The covariance of Y 1 and Y 2, when it exists, is Cov(Y 1, Y 2 ) = E[(Y 1 µ 1 )(Y 2 µ 2 )] σ 12 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
33 Means, Variances, and Correlations of Random Variables The covariance σ 12 : exists if and only if σ 2 1 and σ2 2 exist measures the sign and extent to which Y 1 and Y 2 co-vary, but its magnitude depends on the variances may be written alternatively as E(Y 1 Y 2 ) µ 1 µ 2 satisfies the rule Cov(a 1 Y 1 + b 1, a 2 Y 2 + b 2 ) = a 1 a 2 σ 12 A scaled version of the covariance is the correlation, written as corr(y 1, Y 2 ) or ρ 12 : ρ 12 = σ 12, σ1 2σ2 2 provided that σ 2 1 and σ2 2 are positive. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
34 Means, Variances, and Correlations of Random Variables Facts about ρ 12 : 1 ρ 12 1 ρ 12 measures the strength of the linear association between Y 1 and Y 2 If Y 1 and Y 2 are independent, then ρ 12 = 0; but the converse is false Another moment used in spatial statistics is the semivariance of Y 1 and Y 2, written as γ 12 : γ 12 = 1 2 Var(Y 1 Y 2 ) provided that this variance exists. It can be shown that: when σ 2 1 and σ2 2 exist, γ 12 = 1 2 (σ2 1 + σ2 2 2σ 12); when σ 2 1 = σ2 2 = σ2, say, γ 12 = σ 2 σ 12 = σ 2 (1 ρ 12 ). Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
35 Means, Variances, and Correlations of Random Variables We can extend all of these ideas to situations with any finite number n of random variables Y 1,..., Y n. If Y 1,..., Y n are random variables whose variances σ1 2,..., σ2 n exist, and we write µ 1,..., µ n for their means and σ ij for cov(y i, Y j ), then the following rules hold for the mean, variance, and covariance of linear transformations of the variables: ( n ) n E a i Y i = a i µ i ; i=1 ( n ) Var a i Y i i=1 = = i=1 n ai 2 σi i=1 n ai 2 σi i=1 n n i=1 j=i+1 n n i=1 j=i+1 a i a j σ ij a i a j σ i σ j ρ ij ; Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
36 Means, Variances, and Correlations of Random Variables n n Cov a i Y i, b j Y j = i=1 j=1 = n n a i b j σ ij i=1 j=1 n i=1 j=1 n a i b j ρ ij σ i σ j. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
37 Means, Variances, and Correlations of Random Variables For simplicity, it is helpful to organize all the means in a n-dimensional vector: µ 1 µ 2 µ =.. µ n Likewise, we can organize all the variance and covariances in an n n symmetric nonnegative definite matrix: σ 2 1 σ 12 σ 1n σ 21 σ2 2 σ 2n Σ =.... σ n1 σ n2 σn 2 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
38 Means, Variances, and Correlations of Random Variables Similarly, we can construct an n n symmetric nonnegative definite correlation matrix 1 ρ 12 ρ 1n ρ 21 1 ρ 2n ρ =... ρ n1 ρ n2 1 and an n n symmetric semivariance matrix 0 γ 12 γ 1n γ 21 0 γ 2n Γ =.... γ n1 γ n2 0 Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
39 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process Recall that a geostatistical process S( ) = {S(x) : x A} is an infinite collection of random variables, one located at each point in a spatial domain A. How can we represent the mean, variance, covariance (or correlation), and semivariance of all these random variables? We do it by regarding the moments as functions of the spatial location(s): Mean function µ(x) = E(S(x)); Covariance function σ(x, x ) = Cov(S(x), S(x )); Correlation function ρ(x, x ) = Corr(S(x), S(x )); Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
40 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process Semivariogram γ(x, x ) = Var(S(x) S(x )). Often, there is not enough data to estimate these functions nonparametrically, so in practice they are approximated by parsimonious parametric models. In due time we will consider such models, but for now, we merely note the simplifications that result from stationarity and isotropy. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
41 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process We consider three types of stationarity: 1. Strict stationarity requires that the joint probability distribution of {S(x) : x A} depends only on the relative positions of sites, i.e., F S(x1 +h),...,s(x m+h)(s 1,..., s m ) = F S(x1 ),...,S(x m)(s 1,..., s m ) for all m, all locations x 1,..., x m, all displacements h, and all s 1,..., s m. This implies, for example, that P[S(x 1 + h) s 1 ] = P[S(x 1 ) s 1 ] for all x 1, all h, and all s 1 (i.e., equality of marginal distributions). It also implies that bivariate distributions are identical for any pairs of variables displaced by the same vector. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
42 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process 2. Second-order stationarity requires that: the mean is constant over space; the way that two variables of S( ) co-vary is consistent for variables located at sites having the same relative positions. That is, the covariance between variables at two sites depends on only the sites relative positions. This can be expressed in either of the following two ways: σ(x, x ) = σ(x + h, x + h) for all h. σ(x, x ) = σ(x x ) for all x, x A. This implies that the variance is constant over space. If S( ) is second-order stationary, the correlation function satisfies the same property. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
43 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process 3. Intrinsic stationarity requires that: the mean is constant over space; the semivariogram depends on only the relative positions of sites. This can be expressed as Some important facts: γ(x, x ) = γ(x x ) for all x, x A. Strict stationarity is stronger than second-order stationarity, which in turn is stronger than intrinsic stationarity. If S( ) is second-order stationary with variance σ 2, then γ(x x ) = σ 2 σ(x x ). Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
44 Mean Function, Covariance Function, Correlation Function, and Semivariogram of a Geostatistical Process Isotropy adds a further requirement to each stationarity assumption. In particular, it requires that the condition defining the stationarity hold not only for variables displaced by the same h, but also for variables for which the orientation of h is different but the length of h is the same. Thus, isotropic second-order stationarity and isotropic intrinsic stationary require, respectively, that σ(x, x ) = σ( x x ) for all x, x A, γ(x, x ) = γ( x x ) for all x, x A. Isotropic second-order stationarity implies that ρ(x, x ) = ρ( x x ) for all x, x A. Dale Zimmerman (UIOWA) Spatial and Environmental Statistics January 17, / 44
Dale L. Zimmerman Department of Statistics and Actuarial Science, University of Iowa, USA
SPATIAL STATISTICS Dale L. Zimmerman Department of Statistics and Actuarial Science, University of Iowa, USA Keywords: Geostatistics, Isotropy, Kriging, Lattice Data, Spatial point patterns, Stationarity
More informationAn Introduction to Spatial Statistics. Chunfeng Huang Department of Statistics, Indiana University
An Introduction to Spatial Statistics Chunfeng Huang Department of Statistics, Indiana University Microwave Sounding Unit (MSU) Anomalies (Monthly): 1979-2006. Iron Ore (Cressie, 1986) Raw percent data
More informationIntroduction. Spatial Processes & Spatial Patterns
Introduction Spatial data: set of geo-referenced attribute measurements: each measurement is associated with a location (point) or an entity (area/region/object) in geographical (or other) space; the domain
More informationChapter 5: Joint Probability Distributions
Chapter 5: Joint Probability Distributions Seungchul Baek Department of Statistics, University of South Carolina STAT 509: Statistics for Engineers 1 / 19 Joint pmf Definition: The joint probability mass
More informationAreal data. Infant mortality, Auckland NZ districts. Number of plant species in 20cm x 20 cm patches of alpine tundra. Wheat yield
Areal data Reminder about types of data Geostatistical data: Z(s) exists everyhere, varies continuously Can accommodate sudden changes by a model for the mean E.g., soil ph, two soil types with different
More informationFor a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,
CHAPTER 2 FUNDAMENTAL CONCEPTS This chapter describes the fundamental concepts in the theory of time series models. In particular, we introduce the concepts of stochastic processes, mean and covariance
More informationIntroduction to Spatial Data and Models
Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics,
More informationIntroduction to Spatial Data and Models
Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry
More informationIntroduction to Geostatistics
Introduction to Geostatistics Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore,
More informationCBMS Lecture 1. Alan E. Gelfand Duke University
CBMS Lecture 1 Alan E. Gelfand Duke University Introduction to spatial data and models Researchers in diverse areas such as climatology, ecology, environmental exposure, public health, and real estate
More informationTypes of Spatial Data
Spatial Data Types of Spatial Data Point pattern Point referenced geostatistical Block referenced Raster / lattice / grid Vector / polygon Point Pattern Data Interested in the location of points, not their
More information01 Probability Theory and Statistics Review
NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement
More informationSTAT 520: Forecasting and Time Series. David B. Hitchcock University of South Carolina Department of Statistics
David B. University of South Carolina Department of Statistics What are Time Series Data? Time series data are collected sequentially over time. Some common examples include: 1. Meteorological data (temperatures,
More informationSpatial analysis is the quantitative study of phenomena that are located in space.
c HYON-JUNG KIM, 2016 1 Introduction Spatial analysis is the quantitative study of phenomena that are located in space. Spatial data analysis usually refers to an analysis of the observations in which
More information7 Geostatistics. Figure 7.1 Focus of geostatistics
7 Geostatistics 7.1 Introduction Geostatistics is the part of statistics that is concerned with geo-referenced data, i.e. data that are linked to spatial coordinates. To describe the spatial variation
More informationModels for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data
Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise
More informationHandbook of Spatial Statistics Chapter 2: Continuous Parameter Stochastic Process Theory by Gneiting and Guttorp
Handbook of Spatial Statistics Chapter 2: Continuous Parameter Stochastic Process Theory by Gneiting and Guttorp Marcela Alfaro Córdoba August 25, 2016 NCSU Department of Statistics Continuous Parameter
More informationBasics of Point-Referenced Data Models
Basics of Point-Referenced Data Models Basic tool is a spatial process, {Y (s), s D}, where D R r Chapter 2: Basics of Point-Referenced Data Models p. 1/45 Basics of Point-Referenced Data Models Basic
More informationMultiple Random Variables
Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An
More informationGauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA
JAPANESE BEETLE DATA 6 MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA Gauge Plots TuscaroraLisa Central Madsen Fairways, 996 January 9, 7 Grubs Adult Activity Grub Counts 6 8 Organic Matter
More informationFluvial Variography: Characterizing Spatial Dependence on Stream Networks. Dale Zimmerman University of Iowa (joint work with Jay Ver Hoef, NOAA)
Fluvial Variography: Characterizing Spatial Dependence on Stream Networks Dale Zimmerman University of Iowa (joint work with Jay Ver Hoef, NOAA) March 5, 2015 Stream network data Flow Legend o 4.40-5.80
More informationSpatial Data Mining. Regression and Classification Techniques
Spatial Data Mining Regression and Classification Techniques 1 Spatial Regression and Classisfication Discrete class labels (left) vs. continues quantities (right) measured at locations (2D for geographic
More informationPoint-Referenced Data Models
Point-Referenced Data Models Jamie Monogan University of Georgia Spring 2013 Jamie Monogan (UGA) Point-Referenced Data Models Spring 2013 1 / 19 Objectives By the end of these meetings, participants should
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationMultivariate Random Variable
Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate
More informationIntensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis
Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 5 Topic Overview 1) Introduction/Unvariate Statistics 2) Bootstrapping/Monte Carlo Simulation/Kernel
More informationStatistícal Methods for Spatial Data Analysis
Texts in Statistícal Science Statistícal Methods for Spatial Data Analysis V- Oliver Schabenberger Carol A. Gotway PCT CHAPMAN & K Contents Preface xv 1 Introduction 1 1.1 The Need for Spatial Analysis
More informationUsing Estimating Equations for Spatially Correlated A
Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship
More informationAreal Unit Data Regular or Irregular Grids or Lattices Large Point-referenced Datasets
Areal Unit Data Regular or Irregular Grids or Lattices Large Point-referenced Datasets Is there spatial pattern? Chapter 3: Basics of Areal Data Models p. 1/18 Areal Unit Data Regular or Irregular Grids
More informationProbability and Stochastic Processes
Probability and Stochastic Processes A Friendly Introduction Electrical and Computer Engineers Third Edition Roy D. Yates Rutgers, The State University of New Jersey David J. Goodman New York University
More informationSimulating Realistic Ecological Count Data
1 / 76 Simulating Realistic Ecological Count Data Lisa Madsen Dave Birkes Oregon State University Statistics Department Seminar May 2, 2011 2 / 76 Outline 1 Motivation Example: Weed Counts 2 Pearson Correlation
More informationCovariance. Lecture 20: Covariance / Correlation & General Bivariate Normal. Covariance, cont. Properties of Covariance
Covariance Lecture 0: Covariance / Correlation & General Bivariate Normal Sta30 / Mth 30 We have previously discussed Covariance in relation to the variance of the sum of two random variables Review Lecture
More informationStatistical signal processing
Statistical signal processing Short overview of the fundamentals Outline Random variables Random processes Stationarity Ergodicity Spectral analysis Random variable and processes Intuition: A random variable
More informationIntroduction to Probability and Stocastic Processes - Part I
Introduction to Probability and Stocastic Processes - Part I Lecture 2 Henrik Vie Christensen vie@control.auc.dk Department of Control Engineering Institute of Electronic Systems Aalborg University Denmark
More informationECE Lecture #10 Overview
ECE 450 - Lecture #0 Overview Introduction to Random Vectors CDF, PDF Mean Vector, Covariance Matrix Jointly Gaussian RV s: vector form of pdf Introduction to Random (or Stochastic) Processes Definitions
More informationExtreme Value Analysis and Spatial Extremes
Extreme Value Analysis and Department of Statistics Purdue University 11/07/2013 Outline Motivation 1 Motivation 2 Extreme Value Theorem and 3 Bayesian Hierarchical Models Copula Models Max-stable Models
More informationBayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling
Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation
More informationWhat s for today. Random Fields Autocovariance Stationarity, Isotropy. c Mikyoung Jun (Texas A&M) stat647 Lecture 2 August 30, / 13
What s for today Random Fields Autocovariance Stationarity, Isotropy c Mikyoung Jun (Texas A&M) stat647 Lecture 2 August 30, 2012 1 / 13 Stochastic Process and Random Fields A stochastic process is a family
More informationGeostatistics in Hydrology: Kriging interpolation
Chapter Geostatistics in Hydrology: Kriging interpolation Hydrologic properties, such as rainfall, aquifer characteristics (porosity, hydraulic conductivity, transmissivity, storage coefficient, etc.),
More information1. Stochastic Processes and Stationarity
Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture Note 1 - Introduction This course provides the basic tools needed to analyze data that is observed
More informationLecture Note 1: Probability Theory and Statistics
Univ. of Michigan - NAME 568/EECS 568/ROB 530 Winter 2018 Lecture Note 1: Probability Theory and Statistics Lecturer: Maani Ghaffari Jadidi Date: April 6, 2018 For this and all future notes, if you would
More informationBivariate distributions
Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient
More informationBivariate Distributions. Discrete Bivariate Distribution Example
Spring 7 Geog C: Phaedon C. Kyriakidis Bivariate Distributions Definition: class of multivariate probability distributions describing joint variation of outcomes of two random variables (discrete or continuous),
More informationWe introduce methods that are useful in:
Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more
More informationChapter 4 - Fundamentals of spatial processes Lecture notes
TK4150 - Intro 1 Chapter 4 - Fundamentals of spatial processes Lecture notes Odd Kolbjørnsen and Geir Storvik January 30, 2017 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationCourse information: Instructor: Tim Hanson, Leconte 219C, phone Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment.
Course information: Instructor: Tim Hanson, Leconte 219C, phone 777-3859. Office hours: Tuesday/Thursday 11-12, Wednesday 10-12, and by appointment. Text: Applied Linear Statistical Models (5th Edition),
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationProblem Solving. Correlation and Covariance. Yi Lu. Problem Solving. Yi Lu ECE 313 2/51
Yi Lu Correlation and Covariance Yi Lu ECE 313 2/51 Definition Let X and Y be random variables with finite second moments. the correlation: E[XY ] Yi Lu ECE 313 3/51 Definition Let X and Y be random variables
More informationPRODUCING PROBABILITY MAPS TO ASSESS RISK OF EXCEEDING CRITICAL THRESHOLD VALUE OF SOIL EC USING GEOSTATISTICAL APPROACH
PRODUCING PROBABILITY MAPS TO ASSESS RISK OF EXCEEDING CRITICAL THRESHOLD VALUE OF SOIL EC USING GEOSTATISTICAL APPROACH SURESH TRIPATHI Geostatistical Society of India Assumptions and Geostatistical Variogram
More informationBayesian Transgaussian Kriging
1 Bayesian Transgaussian Kriging Hannes Müller Institut für Statistik University of Klagenfurt 9020 Austria Keywords: Kriging, Bayesian statistics AMS: 62H11,60G90 Abstract In geostatistics a widely used
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More information11/8/2018. Spatial Interpolation & Geostatistics. Kriging Step 1
(Z i Z j ) 2 / 2 (Z i Zj) 2 / 2 Semivariance y 11/8/2018 Spatial Interpolation & Geostatistics Kriging Step 1 Describe spatial variation with Semivariogram Lag Distance between pairs of points Lag Mean
More informationENGRG Introduction to GIS
ENGRG 59910 Introduction to GIS Michael Piasecki October 13, 2017 Lecture 06: Spatial Analysis Outline Today Concepts What is spatial interpolation Why is necessary Sample of interpolation (size and pattern)
More informationKarhunen-Loeve Expansion and Optimal Low-Rank Model for Spatial Processes
TTU, October 26, 2012 p. 1/3 Karhunen-Loeve Expansion and Optimal Low-Rank Model for Spatial Processes Hao Zhang Department of Statistics Department of Forestry and Natural Resources Purdue University
More informationStatistics 910, #5 1. Regression Methods
Statistics 910, #5 1 Overview Regression Methods 1. Idea: effects of dependence 2. Examples of estimation (in R) 3. Review of regression 4. Comparisons and relative efficiencies Idea Decomposition Well-known
More informationA kernel indicator variogram and its application to groundwater pollution
Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS101) p.1514 A kernel indicator variogram and its application to groundwater pollution data Menezes, Raquel University
More informationAcceptable Ergodic Fluctuations and Simulation of Skewed Distributions
Acceptable Ergodic Fluctuations and Simulation of Skewed Distributions Oy Leuangthong, Jason McLennan and Clayton V. Deutsch Centre for Computational Geostatistics Department of Civil & Environmental Engineering
More informationMultivariate Time Series
Multivariate Time Series Notation: I do not use boldface (or anything else) to distinguish vectors from scalars. Tsay (and many other writers) do. I denote a multivariate stochastic process in the form
More informationSpatial Interpolation & Geostatistics
(Z i Z j ) 2 / 2 Spatial Interpolation & Geostatistics Lag Lag Mean Distance between pairs of points 1 y Kriging Step 1 Describe spatial variation with Semivariogram (Z i Z j ) 2 / 2 Point cloud Map 3
More informationAn Introduction to Pattern Statistics
An Introduction to Pattern Statistics Nearest Neighbors The CSR hypothesis Clark/Evans and modification Cuzick and Edwards and controls All events k function Weighted k function Comparative k functions
More informationReview (Probability & Linear Algebra)
Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint
More informationOverview of Statistical Analysis of Spatial Data
Overview of Statistical Analysis of Spatial Data Geog 2C Introduction to Spatial Data Analysis Phaedon C. Kyriakidis www.geog.ucsb.edu/ phaedon Department of Geography University of California Santa Barbara
More information2. This problem is very easy if you remember the exponential cdf; i.e., 0, y 0 F Y (y) = = ln(0.5) = ln = ln 2 = φ 0.5 = β ln 2.
FALL 8 FINAL EXAM SOLUTIONS. Use the basic counting rule. There are n = number of was to choose 5 white balls from 7 = n = number of was to choose ellow ball from 5 = ( ) 7 5 ( ) 5. there are n n = ( )(
More informationA full scale, non stationary approach for the kriging of large spatio(-temporal) datasets
A full scale, non stationary approach for the kriging of large spatio(-temporal) datasets Thomas Romary, Nicolas Desassis & Francky Fouedjio Mines ParisTech Centre de Géosciences, Equipe Géostatistique
More informationY i = η + ɛ i, i = 1,...,n.
Nonparametric tests If data do not come from a normal population (and if the sample is not large), we cannot use a t-test. One useful approach to creating test statistics is through the use of rank statistics.
More informationMatrix Approach to Simple Linear Regression: An Overview
Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix
More informationCorrelation analysis. Contents
Correlation analysis Contents 1 Correlation analysis 2 1.1 Distribution function and independence of random variables.......... 2 1.2 Measures of statistical links between two random variables...........
More informationBasics on 2-D 2 D Random Signal
Basics on -D D Random Signal Spring 06 Instructor: K. J. Ray Liu ECE Department, Univ. of Maryland, College Park Overview Last Time: Fourier Analysis for -D signals Image enhancement via spatial filtering
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationGaussian processes. Basic Properties VAG002-
Gaussian processes The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity
More informationGaussian, Markov and stationary processes
Gaussian, Markov and stationary processes Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ November
More informationFinal Exam. Economics 835: Econometrics. Fall 2010
Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each
More informationIntroduction to Stochastic processes
Università di Pavia Introduction to Stochastic processes Eduardo Rossi Stochastic Process Stochastic Process: A stochastic process is an ordered sequence of random variables defined on a probability space
More informationProbability Space. J. McNames Portland State University ECE 538/638 Stochastic Signals Ver
Stochastic Signals Overview Definitions Second order statistics Stationarity and ergodicity Random signal variability Power spectral density Linear systems with stationary inputs Random signal memory Correlation
More informationThis exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.
TEST #3 STA 536 December, 00 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. You will have access to a copy
More informationConfidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationA Probability Review
A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in
More informationPrediction. Prediction MIT Dr. Kempthorne. Spring MIT Prediction
MIT 18.655 Dr. Kempthorne Spring 2016 1 Problems Targets of Change in value of portfolio over fixed holding period. Long-term interest rate in 3 months Survival time of patients being treated for cancer
More informationECON 3150/4150, Spring term Lecture 6
ECON 3150/4150, Spring term 2013. Lecture 6 Review of theoretical statistics for econometric modelling (II) Ragnar Nymoen University of Oslo 31 January 2013 1 / 25 References to Lecture 3 and 6 Lecture
More informationJointly Distributed Random Variables
Jointly Distributed Random Variables CE 311S What if there is more than one random variable we are interested in? How should you invest the extra money from your summer internship? To simplify matters,
More informationUniversity of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout 2:. The Multivariate Gaussian & Decision Boundaries
University of Cambridge Engineering Part IIB Module 3F3: Signal and Pattern Processing Handout :. The Multivariate Gaussian & Decision Boundaries..15.1.5 1 8 6 6 8 1 Mark Gales mjfg@eng.cam.ac.uk Lent
More informationIntroduction. Semivariogram Cloud
Introduction Data: set of n attribute measurements {z(s i ), i = 1,, n}, available at n sample locations {s i, i = 1,, n} Objectives: Slide 1 quantify spatial auto-correlation, or attribute dissimilarity
More informationPracticum : Spatial Regression
: Alexandra M. Schmidt Instituto de Matemática UFRJ - www.dme.ufrj.br/ alex 2014 Búzios, RJ, www.dme.ufrj.br Exploratory (Spatial) Data Analysis 1. Non-spatial summaries Numerical summaries: Mean, median,
More informationOn block bootstrapping areal data Introduction
On block bootstrapping areal data Nicholas Nagle Department of Geography University of Colorado UCB 260 Boulder, CO 80309-0260 Telephone: 303-492-4794 Email: nicholas.nagle@colorado.edu Introduction Inference
More informationSpatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields
Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields 1 Introduction Jo Eidsvik Department of Mathematical Sciences, NTNU, Norway. (joeid@math.ntnu.no) February
More informationRecall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type.
Expectations of Sums of Random Variables STAT/MTHE 353: 4 - More on Expectations and Variances T. Linder Queen s University Winter 017 Recall that if X 1,...,X n are random variables with finite expectations,
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationStochastic Processes
Stochastic Processes Stochastic Process Non Formal Definition: Non formal: A stochastic process (random process) is the opposite of a deterministic process such as one defined by a differential equation.
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationHypothesis Testing hypothesis testing approach
Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we
More informationInteraction Analysis of Spatial Point Patterns
Interaction Analysis of Spatial Point Patterns Geog 2C Introduction to Spatial Data Analysis Phaedon C Kyriakidis wwwgeogucsbedu/ phaedon Department of Geography University of California Santa Barbara
More informationOptimal Interpolation ( 5.4) We now generalize the least squares method to obtain the OI equations for vectors of observations and background fields.
Optimal Interpolation ( 5.4) We now generalize the least squares method to obtain the OI equations for vectors of observations and background fields. Optimal Interpolation ( 5.4) We now generalize the
More informationIntensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis
Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 4 Spatial Point Patterns Definition Set of point locations with recorded events" within study
More informationAutoregressive Moving Average (ARMA) Models and their Practical Applications
Autoregressive Moving Average (ARMA) Models and their Practical Applications Massimo Guidolin February 2018 1 Essential Concepts in Time Series Analysis 1.1 Time Series and Their Properties Time series:
More informationSTATISTICAL MODELS FOR QUANTIFYING THE SPATIAL DISTRIBUTION OF SEASONALLY DERIVED OZONE STANDARDS
STATISTICAL MODELS FOR QUANTIFYING THE SPATIAL DISTRIBUTION OF SEASONALLY DERIVED OZONE STANDARDS Eric Gilleland Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research Supported
More informationSpatiotemporal Analysis of Environmental Radiation in Korea
WM 0 Conference, February 25 - March, 200, Tucson, AZ Spatiotemporal Analysis of Environmental Radiation in Korea J.Y. Kim, B.C. Lee FNC Technology Co., Ltd. Main Bldg. 56, Seoul National University Research
More informationProbability. Table of contents
Probability Table of contents 1. Important definitions 2. Distributions 3. Discrete distributions 4. Continuous distributions 5. The Normal distribution 6. Multivariate random variables 7. Other continuous
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More information