Spatial Misalignment

Jamie Monogan
University of Georgia
Spring 2013

Objectives
By the end of today's meeting, participants should be able to:
- Predict values of a variable at unobserved points or blocks given either point-process or areal data.
- Explain the methods behind predicting values of areal data when the new level of measurement is nested within the original level.
- Realign nonnested block-level data.
- Resolve problems of misaligned data for regression modeling.

Goal: Spatial Regression
Terminology: points refer to point-level observations and blocks refer to areal-level summaries.
We cannot fit a regression if two spatially referenced variables are misaligned:
- X at point level, Y at other points. This is point-point misalignment, normally handled by kriging!
- X at point level, Y at block level. Block kriging: we use a block average.
- X at block level, Y at point level.
- X at block level, Y at a different block level.
Solution: Bring the X's to the scale of the Y's, then fit the model. With more than two variables, bring all of the variables to a common scale.

Block Kriging
The ideal estimator:
\[ Y(B) = \frac{1}{|B|} \int_B Y(s)\, ds \]
for $B$ a block within our space and $|B|$ the area of the block. This equation would give us the true average over continuous space.
In practice, we settle for this estimator:
\[ Y(B) \approx \frac{1}{L} \sum_{l=1}^{L} Y(s_l) \]
This is a Monte Carlo approximation where we have drawn random locations within the block: $s_l \in B$, $l = 1, \ldots, L$. We thus estimate a block average by averaging kriged values at several points within the block.
This is better than any ad hoc measure such as kriging the centroid for a block or averaging true point-level values within a block. Why?
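The Monte Carlo approximation to a block average can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block is taken to be a rectangle, and `toy_surface` is a hypothetical stand-in for a kriged prediction surface (neither comes from the slides).

```python
import random

def block_average(predict, block, n_draws=5000, seed=0):
    """Monte Carlo estimate of (1/|B|) * integral of predict(s) over a rectangle B."""
    xmin, xmax, ymin, ymax = block
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Draw a location s_l uniformly within the block B.
        s = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        total += predict(s)
    return total / n_draws

def toy_surface(s):
    """Hypothetical stand-in for a kriged predictor: Y(s) = x + y."""
    return s[0] + s[1]

# Over the unit square the exact block average of x + y is 1.
estimate = block_average(toy_surface, block=(0.0, 1.0, 0.0, 1.0))
print(estimate)
```

In practice `predict` would return a kriged value at `s`, so the block estimate borrows strength from every observed site rather than from the centroid alone.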

Block-to-Block Misalignment
Suppose a variable is observed at block level (source zones) with inference desired at other blocks (target zones).
- Nested block-level modeling: census tracts to census blocks.
- Nonnested block-level modeling: census tracts to cells of the exposure windrose (Fig. 6.9, p. 194).
Geography calls this the modifiable areal unit problem (MAUP).
Hierarchical modeling offers a more sensible alternative to areal allocation (i.e., simply allocating counts proportional to the area of the subregions formed by the intersection of the two grids). (Fig. 6.3, p. 185)
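For reference, the areal allocation baseline that hierarchical modeling is meant to improve on can be written out directly. The sketch below assumes all zones are axis-aligned rectangles given as `(xmin, xmax, ymin, ymax)`; the zones and counts are made up for illustration.

```python
def overlap_area(a, b):
    """Area of the intersection of two rectangles (xmin, xmax, ymin, ymax)."""
    width = min(a[1], b[1]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[2], b[2])
    return max(width, 0.0) * max(height, 0.0)

def areal_allocation(source_zones, source_counts, target_zones):
    """Split each source count across targets in proportion to intersection area."""
    allocated = [0.0] * len(target_zones)
    for zone, count in zip(source_zones, source_counts):
        zone_area = (zone[1] - zone[0]) * (zone[3] - zone[2])
        for t, target in enumerate(target_zones):
            allocated[t] += count * overlap_area(zone, target) / zone_area
    return allocated

# Hypothetical source zones (with counts) and nonnested target zones.
sources = [(0.0, 2.0, 0.0, 1.0), (2.0, 4.0, 0.0, 1.0)]
counts = [100.0, 40.0]
targets = [(0.0, 1.0, 0.0, 1.0), (1.0, 3.0, 0.0, 1.0), (3.0, 4.0, 0.0, 1.0)]
allocated = areal_allocation(sources, counts, targets)
print(allocated)
```

The allocation treats counts as uniformly spread within each source zone, which is exactly the assumption a hierarchical model with covariates can relax.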

Block-to-Point Misalignment
What if a variable is observed at the block level, but inference is sought at the point level?
Does such a projection make sense?
- Average rainfall over a block B: sensible to consider at the point level.
- Count of disease cases in B: silly to consider at the point level.

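Methodology for Point-Level Realignment
Gaussian process specification. Applied to all four misalignment problems.
Assumptions
- $Y_s^T = (Y(s_1), \ldots, Y(s_I))$ for $I$ sites observed.
- $Y_s \mid \beta, \theta \sim N(\mu_s(\beta), \sigma^2 H_s(\phi))$, where $\theta = (\sigma^2, \phi)^T$, $(\mu_s(\beta))_i = \mu(s_i; \beta)$, and $(H_s(\phi))_{ii'} = \rho(s_i - s_{i'}; \phi)$.
Point-to-Point Realignment / Kriging
\[ f\left( \begin{pmatrix} Y_s \\ Y_{s'} \end{pmatrix} \,\middle|\, \beta, \theta \right) = N\left( \begin{pmatrix} \mu_s(\beta) \\ \mu_{s'}(\beta) \end{pmatrix},\ \sigma^2 \begin{pmatrix} H_s(\phi) & H_{s,s'}(\phi) \\ H_{s,s'}^T(\phi) & H_{s'}(\phi) \end{pmatrix} \right) \]
\[ Y_{s'} \mid Y_s, \beta, \theta \sim N\left( \mu_{s'}(\beta) + H_{s,s'}^T(\phi) H_s^{-1}(\phi) (Y_s - \mu_s(\beta)),\ \sigma^2 \left[ H_{s'}(\phi) - H_{s,s'}^T(\phi) H_s^{-1}(\phi) H_{s,s'}(\phi) \right] \right) \]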
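The kriging conditional mean and variance can be computed for a single new site with nothing beyond a linear solve. The sketch below is a toy illustration, assuming an exponential correlation function for $\rho$ and known values of $\beta$, $\sigma^2$, and $\phi$; the helper names (`solve`, `rho`, `krige`) and the data are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][c] * x[c] for c in range(i + 1, n))) / m[i][i]
    return x

def rho(s, t, phi):
    """Exponential correlation, an assumed choice of rho(s - s'; phi)."""
    return math.exp(-phi * math.dist(s, t))

def krige(sites, y, mu, new_site, mu_new, sigma2, phi):
    """Conditional mean and variance of Y(s') given Y_s for one new site s'."""
    h_s = [[rho(si, sj, phi) for sj in sites] for si in sites]
    h_cross = [rho(si, new_site, phi) for si in sites]
    weights = solve(h_s, h_cross)  # H_s^{-1} H_{s,s'} (H_s is symmetric)
    mean = mu_new + sum(w * (yi - mi) for w, yi, mi in zip(weights, y, mu))
    var = sigma2 * (1.0 - sum(w * h for w, h in zip(weights, h_cross)))
    return mean, var

sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = [1.0, 2.5, 3.0]
mu = [2.0, 2.0, 2.0]
mean_new, var_new = krige(sites, y, mu, (0.5, 0.5), 2.0, sigma2=1.5, phi=1.0)
print(mean_new, var_new)
```

Predicting at an already observed site returns the observed value with essentially zero variance, which is a handy sanity check on any implementation.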

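Point-to-Block Realignment
Assume the observed point-level data and the extrapolated block-level averages have a joint Gaussian distribution:
\[ f\left( \begin{pmatrix} Y_s \\ Y_B \end{pmatrix} \,\middle|\, \beta, \phi \right) = N\left( \begin{pmatrix} \mu_s(\beta) \\ \mu_B(\beta) \end{pmatrix},\ \begin{pmatrix} H_s(\phi) & H_{s,B}(\phi) \\ H_{s,B}^T(\phi) & H_B(\phi) \end{pmatrix} \right) \]
where
\[ (\mu_B(\beta))_k = E(Y(B_k) \mid \beta) = |B_k|^{-1} \int_{B_k} \mu(s; \beta)\, ds \]
\[ (H_B(\phi))_{kk'} = |B_k|^{-1} |B_{k'}|^{-1} \int_{B_k} \int_{B_{k'}} \rho(s - s'; \phi)\, ds'\, ds \]
\[ (H_{s,B}(\phi))_{ik} = |B_k|^{-1} \int_{B_k} \rho(s_i - s'; \phi)\, ds' \]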

Point-to-Block: Monte Carlo Integration
By standard normal theory, the conditional distribution of our extrapolated block averages is:
\[ Y_B \mid Y_s, \beta, \phi \sim N\left( \mu_B(\beta) + H_{s,B}^T(\phi) H_s^{-1}(\phi) (Y_s - \mu_s(\beta)),\ H_B(\phi) - H_{s,B}^T(\phi) H_s^{-1}(\phi) H_{s,B}(\phi) \right) \]
Our quantities of interest can be estimated with Monte Carlo integration:
\[ (\hat{\mu}_B(\beta))_k = L_k^{-1} \sum_{l} \mu(s_{kl}; \beta) \]
\[ (\hat{H}_B(\phi))_{kk'} = L_k^{-1} L_{k'}^{-1} \sum_{l} \sum_{l'} \rho(s_{kl} - s_{k'l'}; \phi) \]
\[ (\hat{H}_{s,B}(\phi))_{ik} = L_k^{-1} \sum_{l} \rho(s_i - s_{kl}; \phi) \]
This allows us to forecast block scores with:
\[ \hat{\mu}_B(\beta) + \hat{H}_{s,B}^T(\phi) H_s^{-1}(\phi) (Y_s - \mu_s(\beta)) \]
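The Monte Carlo estimates of the block-level correlation pieces can be sketched as follows. The code assumes rectangular blocks, uniform draws within each block, and an exponential correlation function; all of these, along with the function names and the example sites, are illustrative assumptions rather than anything specified on the slides.

```python
import math
import random

def rho(s, t, phi):
    """Exponential correlation, an assumed choice of rho(s - s'; phi)."""
    return math.exp(-phi * math.dist(s, t))

def sample_block(block, n_draws, rng):
    """Draw n_draws uniform locations s_kl inside a rectangle (xmin, xmax, ymin, ymax)."""
    xmin, xmax, ymin, ymax = block
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n_draws)]

def mc_block_correlations(sites, blocks, phi, n_draws=200, seed=0):
    """Monte Carlo estimates of H_{s,B} (sites x blocks) and H_B (blocks x blocks)."""
    rng = random.Random(seed)
    draws = [sample_block(b, n_draws, rng) for b in blocks]
    # (H_{s,B})_{ik}: average correlation between site i and draws in block k.
    h_sb = [[sum(rho(s, t, phi) for t in pts) / len(pts) for pts in draws]
            for s in sites]
    # (H_B)_{kk'}: average correlation over all pairs of draws from blocks k and k'.
    h_b = [[sum(rho(s, t, phi) for s in pk for t in pk2) / (len(pk) * len(pk2))
            for pk2 in draws] for pk in draws]
    return h_sb, h_b

sites = [(0.5, 0.5), (5.0, 5.0)]
blocks = [(0.0, 1.0, 0.0, 1.0), (4.0, 6.0, 4.0, 6.0)]
h_sb, h_b = mc_block_correlations(sites, blocks, phi=1.0)
print(h_sb)
print(h_b)
```

These estimates plug directly into the conditional mean, with the exact point-level matrix $H_s(\phi)$ supplying the inverse.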

Block Inputs for the Gaussian Specification
Block-to-Point Realignment
- We know $f(Y_s, Y_B \mid \beta, \theta)$ (joint distribution) and $f(Y_B \mid \beta, \theta) = N(\mu_B(\beta), \sigma^2 H_B(\phi))$ (marginal distribution).
- Bayes' Rule provides the conditional distribution:
\[ f(Y_s \mid Y_B, \beta, \theta) = \frac{f(Y_s, Y_B \mid \beta, \theta)}{f(Y_B \mid \beta, \theta)} \]
- We still need to do Monte Carlo integration to obtain $\hat{f}(Y_s \mid Y_B, \beta, \theta)$. By sampling from this, we can krige point predictions from block data.
Predicting New Blocks
- Suppose we wanted to predict new blocks, $B'_1, \ldots, B'_K$.
- By the same math, we can obtain $\hat{f}(Y_{B'} \mid Y_B, \beta, \theta)$.
- This requires Monte Carlo integration over the $B'_k$'s and the $B_i$'s.

Nested Block-Level Modeling
Suppose we need smaller areal units. Borders at the more precise level do not cut across borders at the less precise level.
Example: census blocks within census tracts in Tompkins County.
How to project values at the census block level?

Model of Leukemia Case Counts
\[ Y_{ij} \mid m_{k(i,j)} \sim Po(E_{ij}\, m_{k(i,j)}); \quad i = 1, \ldots, I; \ j = 1, \ldots, J_i \]
Reframe the model for tract-level data:
\[ Y_{i\cdot} \mid m \sim Po(s_1 m_1 + s_2 m_2 + s_3 m_3 + s_4 m_4); \quad i = 1, \ldots, I \]
\[ s_k = \sum_{j:\, k(i,j) = k} E_{ij} \]
\[ \log(m_{k(i,j)}) = \theta_0 + \theta_1 u_{ij} + \theta_2 w_{ij} + \theta_3 u_{ij} w_{ij} \]
Forecasting at the Census Block Level
\[ E(Y_{ij} \mid y) = E[E(Y_{ij} \mid m, y)] \approx \frac{y_{i\cdot}}{G} \sum_{g=1}^{G} p_{ij}^{(g)} \]
where $g$ represents an MCMC iteration.
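The census-block forecast can be sketched for a single tract. The slide does not define $p_{ij}^{(g)}$; the code assumes it is the multinomial probability implied by conditioning independent Poisson counts on their tract total, $p_{ij} = E_{ij} m_{k(i,j)} / \sum_{j'} E_{ij'} m_{k(i,j')}$. The posterior draws of $m$ below are made up; in practice they would come from an MCMC sampler.

```python
def block_forecasts(y_tract, expected, group, m_draws):
    """Approximate E(Y_ij | y) for the census blocks j of one tract i.

    y_tract : observed tract total y_i.
    expected: E_ij for each block j
    group   : index k(i, j) of the rate m_k that applies to block j
    m_draws : one draw of (m_1, ..., m_4) per MCMC iteration g
    """
    n_blocks = len(expected)
    p_sum = [0.0] * n_blocks
    for m in m_draws:
        means = [expected[j] * m[group[j]] for j in range(n_blocks)]
        total = sum(means)
        for j in range(n_blocks):
            p_sum[j] += means[j] / total  # p_ij at iteration g (assumed form)
    return [y_tract * p / len(m_draws) for p in p_sum]

# Hypothetical inputs: three blocks in a tract with 10 observed cases.
expected = [1.0, 2.0, 1.0]
group = [0, 1, 3]
m_draws = [[1.0, 2.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
forecasts = block_forecasts(10, expected, group, m_draws)
print(forecasts)
```

Because the probabilities sum to one at every iteration, the block forecasts always add back up to the observed tract total.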

Nonnested Block-Level Modeling
Breaking it down: from blocks to atoms
- $B_i$ are blocks on the response grid; atoms in $B_i$ are $B_{ik}$.
- $C_j$ are blocks on the explanatory grid; atoms in $C_j$ are $C_{jl}$.

Nonnested Block-Level Modeling: Notation
Basic Components
- $Y$: response (measured on the response grid, $B_i$)
- $W$: covariates on the response grid ($B_i$)
- $X$: covariates on the explanatory grid ($C_j$)
- $\mu_i$: random effects (spatial association among $Y_i$'s)
- $\omega_j$: random effects (spatial association among $X_j$'s)
Assumptions
- $Y_i$'s are aggregated measurements
- $W_i$'s are aggregated measurements or inheritable
- $X_j$'s are aggregated measurements
- $\mu_i$'s are inherited by latent $Y_{ik}$
- $\omega_j$'s are inherited by latent $X_{jl}$

From Apples to Oranges (and Back)
For all non-edge atoms, $X_{jl}$ can also be labeled as $X_{ik}$.
What about edge atoms? Use neighboring nonedge atoms to determine the distribution of the edge atom. E.g., if $X$ is a count variable:
\[ X_{iE} \mid \omega_i^* \sim Po(e^{\omega_i^*} |B_{iE}|) \]
Note: $|B_{iE}|$ is the area of $B_{iE}$ and $\omega_i^*$ is a new set of random effects.
The same basic procedure can be used for $Y_{jE}$ as well.

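Modeling Latent Variables $X_{jl}$ and $Y_{ik}$
Assuming $X$ and $Y$ are counts:
\[ X_{jl} \mid \omega_j \sim Po(e^{\omega_j} |C_{jl}|) \]
\[ (X_{j1}, \ldots, X_{jL_j} \mid X_{j\cdot}, \omega_j) \sim Mult\left( X_{j\cdot};\ \frac{|C_{j1}|}{|C_j|}, \ldots, \frac{|C_{jL_j}|}{|C_j|} \right) \]
\[ (\omega_j, \omega_i^*) \sim CAR(\lambda_\omega) \]
\[ Y_{ik} \mid \mu_i, \theta_{ik} \sim Po\left( e^{\mu_i} |B_{ik}|\, h\left( \frac{X_{ik}}{|B_{ik}|}; \theta_{ik} \right) \right) \]
\[ \mu_i \overset{iid}{\sim} N\left( \eta_\mu, \frac{1}{\tau_\mu} \right) \]
$h(z; \theta_{ik})$ is a preselected function used for model specification.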

[Figure: Path Diagram of the Latent Variable Model]

FMPC Data Example
- What is the population in each cell of the windrose?
- What is the population in each cell of the windrose broken down into categories for age and sex?

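FMPC Example
Step 1: model the number of structures in each atom, $X_{jl}$
\[ X_{jl} \sim Po(e^{\omega_j} |C_{jl}|) \]
\[ (X_{j1}, \ldots, X_{jL_j} \mid X_{j\cdot}, \omega_j) \sim Mult\left( X_{j\cdot};\ \frac{|C_{j1}|}{|C_j|}, \ldots, \frac{|C_{jL_j}|}{|C_j|} \right) \]
\[ (\omega_j, \omega_i^*) \sim CAR(\lambda_\omega), \quad \lambda_\omega = 10 \]
Step 2: estimate the population in each atom
\[ Y_{ik} \sim Po\left( e^{\mu_i} \left( X_{ik} + \frac{\theta}{K_i} \right) \right) \]
\[ (Y_{i1}, \ldots, Y_{iK_i} \mid Y_{i\cdot}) \sim Mult\left( Y_{i\cdot};\ \frac{X_{i1} + \theta/K_i}{X_{i\cdot} + \theta}, \ldots, \frac{X_{iK_i} + \theta/K_i}{X_{i\cdot} + \theta} \right) \]
\[ \mu_i \overset{iid}{\sim} N\left( \eta_\mu, \frac{1}{\tau_\mu} \right), \quad \eta_\mu = 1.1, \quad \tau_\mu = 0.5, \quad \theta = 1 \]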
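The two multinomial allocation steps can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the same three atoms are treated as making up both one census block and one windrose cell, the totals (40 structures, 120 people) and atom areas are invented, and the helper names are hypothetical. Only $\theta = 1$ is taken from the slide.

```python
import random

def multinomial(total, probs, rng):
    """Draw multinomial counts by assigning `total` units to categories."""
    counts = [0] * len(probs)
    for idx in rng.choices(range(len(probs)), weights=probs, k=total):
        counts[idx] += 1
    return counts

def area_probs(atom_areas):
    """Step 1 probabilities: |C_jl| / |C_j| for each atom l in block j."""
    block_area = sum(atom_areas)
    return [a / block_area for a in atom_areas]

def structure_probs(atom_structures, theta=1.0):
    """Step 2 probabilities: (X_ik + theta / K_i) / (X_i. + theta)."""
    k_i = len(atom_structures)
    denom = sum(atom_structures) + theta
    return [(x + theta / k_i) / denom for x in atom_structures]

rng = random.Random(0)
atom_areas = [2.0, 1.0, 1.0]  # hypothetical atom areas
# Step 1: spread 40 structures over atoms in proportion to area.
structures = multinomial(40, area_probs(atom_areas), rng)
# Step 2: spread 120 people over atoms in proportion to structures (plus theta / K_i).
population = multinomial(120, structure_probs(structures), rng)
print(structures, population)
```

The $\theta / K_i$ term keeps every atom's probability positive, so an atom with no recorded structures can still receive population.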

FMPC Example
Step 3: aggregate the population totals for all the atoms in each cell of the windrose

FMPC Example
Step 4: model the population count in each cell by sex and age group
\[ Y_{ikga} \sim Po\left( \exp\left[ \mu_i + g\alpha + \sum_{a=1}^{17} \beta_a I_a \right] \left( X_{ik} + \frac{\theta}{K_i} \right) \right) \]
- $g = 0$ for males, 1 for females
- $I_a$ = indicator for age group
\[ \mu_i \overset{iid}{\sim} N\left( \eta_\mu, \frac{1}{\tau_\mu} \right), \quad \eta_\mu = -2.5 \approx \log(3/36), \quad \tau_\mu = 0.5, \quad \theta = 1 \]

Misaligned Regression Modeling
We have learned discrete methods for realigning each type and combination of spatially misaligned data.
How do we deal with misalignment directly within a regression model?
- First example (theory): land use in Madagascar.
- Second example (actual data): flowers in South Africa.

Madagascar Land Use Example
Data:
- Population: collected by town. $P_i$: population in the $i$th town.
- Land use: collected by pixels (4 km x 4 km). $L_{ij}$: $j$th pixel of the $i$th town.
- Elevation: pixel.
- Slope: pixel.
Spatial effects parameters:
- $\varphi_{ij}$: pixel-level spatial effects
- $\delta_i$: town-level spatial effects

Madagascar Example
The joint distribution of land use and population:
\[ p(L, P \mid E_{ij}, S_{ij}, \varphi_{ij}, \delta_i) \]
Rearranged to examine the effect of population on land use:
\[ \underbrace{p(P \mid E_{ij}, S_{ij}, \delta_i)}_{P_{ij} \mid P_i \,\sim\, Mult(P_i;\ \lambda_{ij}/\lambda_{i\cdot})} \ \underbrace{p(L \mid P, E_{ij}, S_{ij}, \varphi_{ij})}_{L_{ij} \,\sim\, Bin(16,\ q_{ij})} \]
where
\[ \log \lambda_{ij} = \beta_0 + \beta_1 E_{ij} + \beta_2 S_{ij} + \delta_i \]
\[ \log\left( \frac{q_{ij}}{1 - q_{ij}} \right) = \alpha_0 + \alpha_1 E_{ij} + \alpha_2 S_{ij} + \alpha_3 P_{ij} + \varphi_{ij} \]
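A forward simulation makes the two-stage structure concrete: town population is first allocated to pixels, and the latent pixel populations then enter the land-use logit. Every number below (coefficients, covariates, spatial effects) is invented for illustration, and `simulate_town` is a hypothetical helper, not part of the original analysis.

```python
import math
import random

def simulate_town(pop_total, elevation, slope, beta, alpha, delta, phi_pix, rng):
    """Simulate pixel populations P_ij and land-use counts L_ij for one town i."""
    # log lambda_ij = beta0 + beta1 * E_ij + beta2 * S_ij + delta_i
    lam = [math.exp(beta[0] + beta[1] * e + beta[2] * s + delta)
           for e, s in zip(elevation, slope)]
    lam_total = sum(lam)
    probs = [value / lam_total for value in lam]
    # P_ij | P_i ~ Mult(P_i; lambda_ij / lambda_i.)
    pops = [0] * len(lam)
    for idx in rng.choices(range(len(lam)), weights=probs, k=pop_total):
        pops[idx] += 1
    land = []
    for e, s, p, ph in zip(elevation, slope, pops, phi_pix):
        # logit(q_ij) = alpha0 + alpha1 * E_ij + alpha2 * S_ij + alpha3 * P_ij + phi_ij
        eta = alpha[0] + alpha[1] * e + alpha[2] * s + alpha[3] * p + ph
        q = 1.0 / (1.0 + math.exp(-eta))
        # L_ij ~ Bin(16, q_ij), drawn as 16 Bernoulli trials
        land.append(sum(rng.random() < q for _ in range(16)))
    return pops, land

rng = random.Random(1)
pops, land = simulate_town(
    pop_total=500,
    elevation=[0.2, 0.5, 1.0, 1.5],   # hypothetical covariates
    slope=[0.1, 0.3, 0.2, 0.6],
    beta=(0.0, -0.5, -1.0),           # hypothetical coefficients
    alpha=(0.5, -0.3, -0.4, 0.01),
    delta=0.1,
    phi_pix=[0.0, 0.1, -0.1, 0.05],
    rng=rng,
)
print(pops, land)
```

Fitting the model reverses this direction: the pixel populations are unobserved, so they are sampled alongside the regression coefficients.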

Madagascar Example
Model 1:
\[ \log\left( \frac{q_{ij}}{1 - q_{ij}} \right) = \alpha_0 + \alpha_1 E_{ij} + \alpha_2 S_{ij} + \alpha_3 P_{ij} \]
Model 2:
\[ \log\left( \frac{q_{ij}}{1 - q_{ij}} \right) = \alpha_0 + \alpha_1 E_{ij} + \alpha_2 S_{ij} + \alpha_3 P_{ij} + \varphi_{ij} \]
Conclusion: There is a relationship between population and land use.

Flowers in South Africa's Cape Floristic Region
- Observations were made at a number of locations regarding whether the Grand Protea flower was present at that location.
- Information on a number of environmental covariates is available for each of 476 one minute by one minute grid cells in the study region.
- The point-level response data are converted to grid level by modeling the number of times a flower is observed in a cell given the number of sampling locations in that cell.

South Africa Example
\[ Y_i \sim Bin(n_i, p_i) \]
\[ \log\left( \frac{p_i}{1 - p_i} \right) = w_i^T \beta + \mu_i + \rho_i \]
- $w_i$ is a vector of environmental covariates
- $\mu_i$ is the non-spatial random effect
- $\rho_i$ is the spatial random effect

April 10
- Read: Banerjee, Carlin, & Gelfand, Chapter 8.
- Read: Franzese & Hays. 2007. Spatial Econometric Models of Cross-Sectional Interdependence in Political Science Panel and Time-Series-Cross-Section Data. Political Analysis 15(2):140-164.
April 17
- Read: Banerjee, Carlin, & Gelfand, Chapter 7.
- Download the 1996 presidential advertisement data (pres1996.csv). Log the total number of ads in a media market (total). Suppose you wanted the relative concentration of media ads by state. What problems would averaging or adding the market data pose?
- Draw a map of the location of your data and a polygon for Wyoming. Assume longitude=(-111.05,-104.05) and latitude=(41,45).
- Krige total ads for 100 hypothetical media markets in Wyoming. Use this to estimate the mean number of ads in a Wyoming market. Do you believe this forecast? Why or why not?
April 24: Papers due!
- Read: Kelsall & Diggle. 1998. Spatial Variation in Risk of Disease: A Nonparametric Binary Regression Approach. Applied Statistics.
May 1, 3:30-6:30, Baldwin 302
- In-class presentations of student papers. 15 minutes each.