Hierarchical Modeling and Analysis for Spatial Data


Bradley P. Carlin, Sudipto Banerjee, and Alan E. Gelfand
brad@biostat.umn.edu, sudiptob@biostat.umn.edu, and alan@stat.duke.edu
University of Minnesota and Duke University

Introduction to spatial data and models
Researchers in diverse areas such as climatology, ecology, environmental health, and real estate marketing are increasingly faced with the task of analyzing data that are:
  highly multivariate, with many important predictors and response variables;
  geographically referenced, and often presented as maps; and
  temporally correlated, as in longitudinal or other time series structures.
This motivates hierarchical modeling and data analysis for complex spatial (and spatiotemporal) data sets.

Introduction (cont'd)
Example: In an epidemiological investigation, we might wish to analyze lung, breast, colorectal, and cervical cancer rates by county and year in a particular state, with smoking, mammography, and other important screening and staging information also available at some level.

Introduction (cont'd)
Public health professionals who collect such data are charged not only with surveillance, but also with statistical inference tasks, such as:
  modeling of trends and correlation structures;
  estimation of underlying model parameters;
  hypothesis testing (or comparison of competing models);
  prediction of observations at unobserved times or locations.
All of these are naturally accomplished through hierarchical modeling implemented via Markov chain Monte Carlo (MCMC) methods!

Existing spatial statistics books
  Cressie (1990, 1993): the legendary "bible" of spatial statistics, but at a rather high mathematical level, and lacking modern hierarchical modeling/computing
  Wackernagel (1998): terse; only geostatistics
  Chiles and Delfiner (1999): only geostatistics
  Stein (1999a): theoretical treatise on kriging
More descriptive presentations: Bailey and Gatrell (1995), Fotheringham and Rogerson (1994), or Haining (1990).
Our primary focus is on the issues of modeling, computing, and data analysis.

Types of spatial data
  point-referenced data, where $Y(s)$ is a random vector at a location $s \in \mathbb{R}^r$, where $s$ varies continuously over $D$, a fixed subset of $\mathbb{R}^r$ that contains an $r$-dimensional rectangle of positive volume;
  areal data, where $D$ is again a fixed subset (of regular or irregular shape), but now partitioned into a finite number of areal units with well-defined boundaries;
  point pattern data, where now $D$ is itself random; its index set gives the locations of random events that are the spatial point pattern. $Y(s)$ itself can simply equal 1 for all $s \in D$ (indicating occurrence of the event), or possibly give some additional covariate information (producing a marked point pattern process).

Point-level (geostatistical) data
Figure 1: Map of PM2.5 sampling sites; plotting color indicates range of average 2001 level (legend classes run from < 12.9 to > 19.9).

Areal (lattice) data
Figure 2: ArcView poverty map, regional survey units in Hennepin County, MN.

Notes on areal data
Figure 2 is an example of a choropleth map, which uses shades of color (or greyscale) to classify values into a few broad classes, like a histogram.
From the choropleth map we know which regions are adjacent to (touch) which other regions.
Thus the sites $s \in D$ in this case are actually the regions (or blocks) themselves, which we will denote not by $s_i$ but by $B_i$, $i = 1, \ldots, n$.
It may be helpful to think of the county centroids as forming the vertices of an irregular lattice, with two lattice points being connected if and only if the counties are neighbors in the spatial map.
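The lattice view of areal data can be made concrete with an adjacency matrix. The sketch below is only an illustration: the four units B1 to B4 and their neighbor lists are hypothetical, and the helper name adjacency_matrix is not taken from the slides.

```python
# Hypothetical neighbor lists for four areal units B1..B4 (illustration only).
neighbors = {
    "B1": ["B2", "B3"],
    "B2": ["B1", "B3"],
    "B3": ["B1", "B2", "B4"],
    "B4": ["B3"],
}


def adjacency_matrix(neighbors):
    """Return (labels, W) where W[i][j] = 1 if units i and j are neighbors."""
    labels = sorted(neighbors)
    index = {label: k for k, label in enumerate(labels)}
    size = len(labels)
    W = [[0] * size for _ in range(size)]
    for unit, adjacent in neighbors.items():
        for other in adjacent:
            # Adjacency is symmetric, so fill both entries.
            W[index[unit]][index[other]] = 1
            W[index[other]][index[unit]] = 1
    return labels, W


labels, W = adjacency_matrix(neighbors)
for label, row in zip(labels, W):
    print(label, row, "number of neighbors:", sum(row))
```

Each row sum counts the neighbors of one unit, which is the degree of the corresponding lattice vertex.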

Misaligned (point and areal) data
Figure 3: Atlanta zip codes and 8-hour maximum ozone levels (ppm) at 10 sites, July 15, 1995.

Spatial point process data
Exemplified by residences of persons suffering from a particular disease, or by locations of a certain species of tree in a forest.
The response $Y$ is often fixed (occurrence of the event), and only the locations $s_i$ are thought of as random.
Such data are often of interest in studies of event clustering, where the goal is to determine whether points tend to be spatially close to other points, or result merely from a random process operating independently and homogeneously over space.
In contrast to areal data, here (and with point-referenced data as well) precise locations are known, and so must often be protected to preserve the privacy of the persons in the data set.

Spatial point process data (cont'd)
"No clustering" is often described through a homogeneous Poisson process:
$$E[\text{number of occurrences in region } A] = \lambda |A|,$$
where $\lambda$ is the intensity parameter, and $|A|$ is area($A$).
Visual tests can be unreliable (the human eye tends to see clustering), so instead we might rely on Ripley's K function,
$$K(d) = \frac{1}{\lambda} E[\text{number of points within } d \text{ of an arbitrary point}],$$
where again $\lambda$ is the intensity of the process, i.e., the mean number of points per unit area.
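As a rough check on the "no clustering" model, a homogeneous Poisson process can be simulated on a rectangle and the average count compared with the intensity times the area. This is a minimal sketch using only the Python standard library; the rectangle, the intensity of 50 points per unit area, and the helper names are assumptions made for illustration.

```python
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) count by counting unit-rate exponential arrivals before time `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_csr(intensity, width, height, rng):
    """Simulate a homogeneous Poisson process on a width x height rectangle."""
    # The number of points is Poisson with mean intensity * area ...
    n = sample_poisson(intensity * width * height, rng)
    # ... and, given the count, locations are independent and uniform.
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


rng = random.Random(0)
intensity, width, height = 50.0, 2.0, 1.0  # assumed values for illustration
counts = [len(simulate_csr(intensity, width, height, rng)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
print("average count:", mean_count, "intensity * area:", intensity * width * height)
```

The average count over many replicates should sit close to the intensity times the area, here 100.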

Spatial point process data (cont'd)
The usual estimator for K is
$$\hat{K}(d) = n^{-2} |A| \sum_{i \neq j} p_{ij}^{-1} I_d(d_{ij}),$$
where $n$ is the number of points in $A$, $d_{ij}$ is the distance between points $i$ and $j$, $p_{ij}$ is the proportion of the circle with center $i$ and passing through $j$ that lies within $A$, and $I_d(d_{ij})$ equals 1 if $d_{ij} < d$, and 0 otherwise.
Compare this to, say, $K(d) = \pi d^2$, the theoretical value for nonspatial processes.
Clustered data would have larger K; uniformly spaced data would have a smaller K.
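The estimator can be sketched in a few lines for a rectangular region. This is not the S+SpatialStats or Splancs implementation: here the edge weight p_ij is approximated by checking equally spaced points on the circle, the region is assumed to be the rectangle [0, width] x [0, height], and all function names are hypothetical.

```python
import math
import random


def circle_fraction_inside(center, radius, width, height, n_angles=90):
    """Approximate p_ij: the share of the circle around `center` lying inside the rectangle."""
    cx, cy = center
    inside = 0
    for k in range(n_angles):
        angle = 2.0 * math.pi * k / n_angles
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        if 0.0 <= x <= width and 0.0 <= y <= height:
            inside += 1
    # Guard against a zero weight from the discretization.
    return max(inside, 1) / n_angles


def ripley_k(points, d, width, height):
    """Edge-corrected estimate of K(d) on the rectangle [0, width] x [0, height]."""
    n = len(points)
    area = width * height
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_ij = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if 0.0 < d_ij < d:
                total += 1.0 / circle_fraction_inside(points[i], d_ij, width, height)
    return area * total / n ** 2


rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(300)]  # uniform points, unit square
d = 0.1
k_hat = ripley_k(points, d, 1.0, 1.0)
print("estimated K:", k_hat, "pi * d^2:", math.pi * d ** 2)
```

For uniformly scattered points the estimate should land near pi d^2. A regular grid whose spacing exceeds d gives an estimate of zero, matching the remark that uniformly spaced data have a smaller K.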

Spatial point process summary
A popular spatial add-on to the S+ package, S+SpatialStats, allows computation of K for any data set, as well as approximate 95% intervals for it.
Full inference likely requires use of the Splancs software, or perhaps a fully Bayesian approach along the lines of Wakefield and Morris (2001).
We consider only a fixed index set $D$, i.e., random observations at either points $s_i$ or areal units $B_i$; see Diggle (2003), Lawson and Denison (2002), and Møller and Waagepetersen (2004) for recent treatments of spatial point processes and spatial cluster detection and modeling.

Fundamentals of Cartography
The earth is round! So (longitude, latitude) $\neq$ $(x, y)$!
A map projection is a systematic representation of all or part of the surface of the earth on a plane.
Theorem: The sphere cannot be flattened onto a plane without distortion.
Instead, use an intermediate surface that can be flattened. The sphere is first projected onto this developable surface, which is then laid out as a plane.
The three most commonly used surfaces are the cylinder, the cone, and the plane itself. Using different orientations of these surfaces leads to different classes of map projections...

Developable surfaces
Figure 4: Geometric constructions of projections

Sinusoidal projection
Writing (longitude, latitude) as $(\lambda, \theta)$, projections are $x = f(\lambda, \theta)$, $y = g(\lambda, \theta)$, where $f$ and $g$ are chosen based upon the properties our map must possess.
The sinusoidal projection preserves area.
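The slide does not state f and g for the sinusoidal projection. The sketch below assumes the standard form x = R λ cos θ, y = R θ (central meridian at λ = 0, angles in radians) and checks the area-preserving property numerically: the Jacobian determinant of the map should equal cos θ, the area element of the unit sphere in (λ, θ) coordinates. The function names are hypothetical.

```python
import math


def sinusoidal(lam, theta, radius=1.0):
    """Sinusoidal projection of longitude lam and latitude theta (radians)."""
    return radius * lam * math.cos(theta), radius * theta


def jacobian_det(projection, lam, theta, h=1e-5):
    """Central-difference determinant of d(x, y) / d(lam, theta)."""
    x_lp, y_lp = projection(lam + h, theta)
    x_lm, y_lm = projection(lam - h, theta)
    x_tp, y_tp = projection(lam, theta + h)
    x_tm, y_tm = projection(lam, theta - h)
    dx_dlam = (x_lp - x_lm) / (2 * h)
    dy_dlam = (y_lp - y_lm) / (2 * h)
    dx_dtheta = (x_tp - x_tm) / (2 * h)
    dy_dtheta = (y_tp - y_tm) / (2 * h)
    return dx_dlam * dy_dtheta - dx_dtheta * dy_dlam


for lon_deg, lat_deg in [(0, 0), (45, 30), (-120, 60), (10, 85)]:
    lam, theta = math.radians(lon_deg), math.radians(lat_deg)
    # On the unit sphere a small patch d_lam x d_theta has area cos(theta) d_lam d_theta.
    print(lon_deg, lat_deg, jacobian_det(sinusoidal, lam, theta), math.cos(theta))
```

The two printed columns agree at every test point, which is what equal-area means for this map.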

Mercator projection
While no projection preserves distance (Gauss's Theorema Egregium in differential geometry), this famous conformal (angle-preserving) projection distorts area badly near the poles.
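Both properties can be seen numerically. The sketch below assumes the standard Mercator formulas x = R λ, y = R ln tan(π/4 + θ/2), which the slide does not state, and compares the local stretch along a parallel with the stretch along a meridian. Equal stretches in both directions indicate conformality, and their common value 1/cos θ grows without bound toward the poles. The helper names are hypothetical.

```python
import math


def mercator(lam, theta, radius=1.0):
    """Mercator projection of longitude lam and latitude theta (radians)."""
    return radius * lam, radius * math.log(math.tan(math.pi / 4 + theta / 2))


def local_scales(lam, theta, h=1e-6):
    """Numerical stretch along the parallel and along the meridian on a unit sphere."""
    x_e, _ = mercator(lam + h, theta)
    x_w, _ = mercator(lam - h, theta)
    _, y_n = mercator(lam, theta + h)
    _, y_s = mercator(lam, theta - h)
    # A step d_lam along a parallel covers ground distance cos(theta) * d_lam.
    along_parallel = (x_e - x_w) / (2 * h * math.cos(theta))
    # A step d_theta along a meridian covers ground distance d_theta.
    along_meridian = (y_n - y_s) / (2 * h)
    return along_parallel, along_meridian


for lat_deg in [0, 30, 60, 80, 89]:
    theta = math.radians(lat_deg)
    k_par, k_mer = local_scales(0.5, theta)
    print(lat_deg, round(k_par, 3), round(k_mer, 3), round(1 / math.cos(theta), 3))
```

Because lengths are stretched by 1/cos θ in both directions, areas are inflated by roughly 1/cos² θ, which is the distortion near the poles that the slide refers to.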

Calculation of geodesic distance
Consider two points on the surface of the earth, $P_1 = (\theta_1, \lambda_1)$ and $P_2 = (\theta_2, \lambda_2)$, where $\theta$ = latitude and $\lambda$ = longitude.
The geodesic distance we seek is $D = R\phi$, where
  $R$ is the radius of the earth, and
  $\phi$ is the angle (in radians) subtended at the center of the earth by the arc connecting $P_1$ and $P_2$.

Calculation of geodesic distance (cont'd)
From elementary trig, we have
$$x = R\cos\theta\cos\lambda, \quad y = R\cos\theta\sin\lambda, \quad z = R\sin\theta.$$
Letting $u_1 = (x_1, y_1, z_1)$ and $u_2 = (x_2, y_2, z_2)$, we know
$$\cos\phi = \frac{\langle u_1, u_2 \rangle}{\|u_1\| \, \|u_2\|}.$$

Calculation of geodesic distance (cont'd)
We then compute $\langle u_1, u_2 \rangle$ as
$$R^2[\cos\theta_1\cos\lambda_1\cos\theta_2\cos\lambda_2 + \cos\theta_1\sin\lambda_1\cos\theta_2\sin\lambda_2 + \sin\theta_1\sin\theta_2] = R^2[\cos\theta_1\cos\theta_2\cos(\lambda_1 - \lambda_2) + \sin\theta_1\sin\theta_2].$$
But $\|u_1\| = \|u_2\| = R$, so our final answer is
$$D = R\phi = R\arccos[\cos\theta_1\cos\theta_2\cos(\lambda_1 - \lambda_2) + \sin\theta_1\sin\theta_2].$$
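The final formula translates directly into code. The sketch below computes D both from the closed form and from the dot product of the Cartesian vectors, so the two routes can be compared. The radius of 6371 km is an assumed mean value (the slides leave R unspecified), inputs are taken in degrees, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the slides leave R unspecified


def _clamp(value):
    """Keep rounding error from pushing a cosine outside [-1, 1]."""
    return max(-1.0, min(1.0, value))


def geodesic_distance(lat1_deg, lon1_deg, lat2_deg, lon2_deg, radius=EARTH_RADIUS_KM):
    """D = R arccos[cos t1 cos t2 cos(l1 - l2) + sin t1 sin t2], angles given in degrees."""
    t1, l1 = math.radians(lat1_deg), math.radians(lon1_deg)
    t2, l2 = math.radians(lat2_deg), math.radians(lon2_deg)
    cos_phi = math.cos(t1) * math.cos(t2) * math.cos(l1 - l2) + math.sin(t1) * math.sin(t2)
    return radius * math.acos(_clamp(cos_phi))


def to_cartesian(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """(x, y, z) = (R cos t cos l, R cos t sin l, R sin t)."""
    t, l = math.radians(lat_deg), math.radians(lon_deg)
    return (radius * math.cos(t) * math.cos(l),
            radius * math.cos(t) * math.sin(l),
            radius * math.sin(t))


def geodesic_distance_vectors(lat1_deg, lon1_deg, lat2_deg, lon2_deg, radius=EARTH_RADIUS_KM):
    """Same distance via cos(phi) = <u1, u2> / (|u1| |u2|), with |u1| = |u2| = R."""
    u1 = to_cartesian(lat1_deg, lon1_deg, radius)
    u2 = to_cartesian(lat2_deg, lon2_deg, radius)
    dot = sum(a * b for a, b in zip(u1, u2))
    return radius * math.acos(_clamp(dot / radius ** 2))


# Two illustrative coordinate pairs (latitude, longitude in degrees).
p1, p2 = (44.98, -93.27), (41.88, -87.63)
print(geodesic_distance(*p1, *p2), geodesic_distance_vectors(*p1, *p2))
```

The clamp matters for nearly coincident or nearly antipodal points, where rounding can push the cosine slightly outside [-1, 1]. For very short distances the arccos form loses precision, so treat this as a sketch of the slide's formula and not a general-purpose routine.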