Hierarchical Modeling for non-gaussian Spatial Data

Similar documents
Hierarchical Modelling for non-gaussian Spatial Data

Hierarchical Modelling for non-gaussian Spatial Data

Hierarchical Modelling for Univariate and Multivariate Spatial Data

Hierarchical Modelling for Univariate Spatial Data

Hierarchical Modeling for Spatio-temporal Data

Hierarchical Modeling for Multivariate Spatial Data

Introduction to Spatial Data and Models

Hierarchical Modelling for Univariate Spatial Data

spbayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models

Introduction to Spatial Data and Models

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modelling for Multivariate Spatial Data

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Principles of Bayesian Inference

Hierarchical Modeling for Univariate Spatial Data

Bayesian Linear Models

Nearest Neighbor Gaussian Processes for Large Spatial Data

Bayesian Linear Models

Introduction to Spatial Data and Models

Introduction to Geostatistics

Model Assessment and Comparisons

Principles of Bayesian Inference

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geo-statistical Datasets

Modelling Multivariate Spatial Data

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Gaussian predictive process models for large spatial data sets.

Bayesian Linear Regression

Principles of Bayesian Inference

Introduction to Spatial Data and Models

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

Part 8: GLMs and Hierarchical LMs and GLMs

Bayesian Linear Models

Journal of Statistical Software

Generalized Linear Models. Kurt Hornik

Multivariate spatial modeling

spbayes: an R package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Bayesian linear regression

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Linear Regression. Data Model. β, σ 2. Process Model. ,V β. ,s 2. s 1. Parameter Model

Lecture 23. Spatio-temporal Models. Colin Rundel 04/17/2017

Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian data analysis in practice: Three simple examples

1 Using standard errors when comparing estimated values

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Gauge Plots. Gauge Plots JAPANESE BEETLE DATA MAXIMUM LIKELIHOOD FOR SPATIALLY CORRELATED DISCRETE DATA JAPANESE BEETLE DATA

On Gaussian Process Models for High-Dimensional Geostatistical Datasets

Relevance Vector Machines

Linear Regression Models P8111

Web Appendices: Hierarchical and Joint Site-Edge Methods for Medicare Hospice Service Region Boundary Analysis

eqr094: Hierarchical MCMC for Bayesian System Reliability

Lecture 16: Mixtures of Generalized Linear Models

Estimation of Optimally-Combined-Biomarker Accuracy in the Absence of a Gold-Standard Reference Test

A Note on the comparison of Nearest Neighbor Gaussian Process (NNGP) based models

Hierarchical Multiresolution Approaches for Dense Point-Level Breast Cancer Treatment Data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Multivariate Bayesian Linear Regression MLAI Lecture 11

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Parameter Estimation. Industrial AI Lab.

Dimensionality Reduction for Exponential Family Data

Areal data models. Spatial smoothers. Brook s Lemma and Gibbs distribution. CAR models Gaussian case Non-Gaussian case

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

Principles of Bayesian Inference

Analysing geoadditive regression data: a mixed model approach

Sparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference

STA 414/2104, Spring 2014, Practice Problem Set #1

Subject CS1 Actuarial Statistics 1 Core Principles

Analysis of Marked Point Patterns with Spatial and Non-spatial Covariate Information

Naïve Bayes classification

Special Topic: Bayesian Finite Population Survey Sampling

Chapter 4 - Fundamentals of spatial processes Lecture notes

Spatial Quantile Multiple Regression Using the Asymmetric Laplace Process

The joint posterior distribution of the unknown parameters and hidden variables, given the

An introduction to Bayesian statistics and model calibration and a host of related topics

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

HPD Intervals / Regions

Flexible Regression Modeling using Bayesian Nonparametric Mixtures

TECHNIQUE FOR RANKING POTENTIAL PREDICTOR LAYERS FOR USE IN REMOTE SENSING ANALYSIS. Andrew Lister, Mike Hoppus, and Rachel Riemam

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Metropolis-Hastings Algorithm

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

BAYESIAN MODEL FOR SPATIAL DEPENDANCE AND PREDICTION OF TUBERCULOSIS

Dynamically updated spatially varying parameterisations of hierarchical Bayesian models for spatially correlated data

STAT5044: Regression and Anova

Introduction to Machine Learning Midterm Exam

MCMC algorithms for fitting Bayesian models

Markov Chain Monte Carlo in Practice

Bayesian non-parametric model to longitudinally predict churn

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Learning (II)

INTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP

Parametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012

Hypothesis Testing. Econ 690. Purdue University. Justin L. Tobias (Purdue) Testing 1 / 33

Tail negative dependence and its applications for aggregate loss modeling

Transcription:

Hierarchical Modeling for non-gaussian Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. July 25, 2014 1

Spatial Generalized Linear Models Often data sets preclude Gaussian modeling: y(s) may not even be continuous Example: y(s) is a binary or count variable species presence or absence at location s species abundance from count at location s continuous forest variable is high or low at location s Replace Gaussian likelihood by exponential family member Spatial GLM, Diggle Tawn and Moyeed (1998) 2 Graduate Workshop on Environmental Data Analytics 2014

Spatial Generalized Linear Models First stage: y(s i ) are conditionally independent given β and w(s i ), so f(y(s i ) β, w(s i ), γ) equals h(y(s i ), γ) exp (γ[y(s i )η(s i ) ψ(η(s i ))]) where g(e(y(s i ))) = η(s i ) = x (s i )β + w(s i ) (canonical link function) and γ is a dispersion parameter. Second stage: Model w(s) as a Gaussian process: w N(0, σ 2 R(φ)) Third stage: Priors and hyperpriors. No process for y(s), only a valid joint distribution Not sensible to add a pure error term ɛ(s) 3 Graduate Workshop on Environmental Data Analytics 2014

Spatial Generalized Linear Models Comments We are modeling with spatial random effects Introducing these in the transformed mean encourages means of spatial variables at proximate locations to be close to each other Marginal spatial dependence is induced between, say, y(s) and y(s ), but observed y(s) and y(s ) need not be close to each other Second stage spatial modeling is attractive for spatial explanation in the mean First stage spatial modeling more appropriate to encourage proximate observations to be similar. 4 Graduate Workshop on Environmental Data Analytics 2014

Illustration from: Finley, A.O., S. Banerjee, and R.E. McRoberts. (2008) A Bayesian approach to quantifying uncertainty in multi-source forest area estimates. Environmental and Ecological Statistics, 15:241 258. 5 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We illustrate a non-gaussian model for point-referenced spatial data: Objective is to make pixel-level prediction of forest/non-forest across the domain. 6 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We illustrate a non-gaussian model for point-referenced spatial data: Objective is to make pixel-level prediction of forest/non-forest across the domain. Data: Observations are from 500 georeferenced USDA Forest Service Forest Inventory and Analysis (FIA) inventory plots within a 32 km radius circle in MN, USA. The response y(s) is a binary variable, with { 1 if inventory plot is forested y(s) = 0 if inventory plot is not forested 6 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We illustrate a non-gaussian model for point-referenced spatial data: Objective is to make pixel-level prediction of forest/non-forest across the domain. Data: Observations are from 500 georeferenced USDA Forest Service Forest Inventory and Analysis (FIA) inventory plots within a 32 km radius circle in MN, USA. The response y(s) is a binary variable, with { 1 if inventory plot is forested y(s) = 0 if inventory plot is not forested Observed covariates include the coinsiding pixel values for 3 dates of 30 30 m resolution Landsat imagery. 6 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We fit a generalized linear model where y(s i ) Bernoulli(p(s i )), logit(p(s i )) = x(s i ) β + w(s i ). 7 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We fit a generalized linear model where y(s i ) Bernoulli(p(s i )), logit(p(s i )) = x(s i ) β + w(s i ). Assume vague flat for β, a Uniform(3/32, 3/0.5) prior for φ, and an inverse-gamma(2, ) prior for σ 2. 7 Graduate Workshop on Environmental Data Analytics 2014

Binary spatial regression: forest/non-forest We fit a generalized linear model where y(s i ) Bernoulli(p(s i )), logit(p(s i )) = x(s i ) β + w(s i ). Assume vague flat for β, a Uniform(3/32, 3/0.5) prior for φ, and an inverse-gamma(2, ) prior for σ 2. Parameters updated with Metropolis algorithm using target log density: ln (p(ω y)) ( σ a + 1 + n ) ln ( σ 2) σ b 2 σ 1 1 ln ( R(φ) ) 2 2 2σ 2 w R(φ) 1 w n + y(s ( i) x(s i) β + w(s ) n i) ln ( 1 + exp(x(s i) β + w(s ) i) i=1 i=1 + ln(σ 2 ) + ln(φ φ a) + ln(φ b φ). 7 Graduate Workshop on Environmental Data Analytics 2014

Posterior parameter estimates Parameter estimates (posterior medians and upper and lower 2.5 percentiles): Parameters Estimates: 50% (2.5%, 97.5%) Intercept (θ 0 ) 82.39 (49.56, 120.46) AprilTC1 (θ 1 ) -0.27 (-0.45, -0.11) AprilTC2 (θ 2 ) 0.17 (0.07, 0.29) AprilTC3 (θ 3 ) -0.24 (-0.43, -0.08) JulyTC1 (θ 4 ) -0.04 (-0.25, 0.17) JulyTC2 (θ 5 ) 0.09 (-0.01, 0.19) JulyTC3 (θ 6 ) 0.01 (-0.15, 0.16) OctTC1 (θ 7 ) -0.43 (-0.68, -0.22) OctTC2 (θ 8 ) -0.03 (-0.19, 0.14) OctTC3 (θ 9 ) -0.26 (-0.46, -0.07) σ 2 1.358 (0.39, 2.42) φ 0.00182 (0.00065, 0.0032) log(0.05)/φ (meters) 1644.19 (932.33, 4606.7) Covariates and proximity to observed FIA plot will contribute to increase precision of prediction. 8 Graduate Workshop on Environmental Data Analytics 2014

Median of posterior predictive distributions 9 Graduate Workshop on Environmental Data Analytics 2014

97.5%-2.5% range of posterior predictive distributions 10 Graduate Workshop on Environmental Data Analytics 2014

CDF of holdout area s posterior predictive distributions F[P(Y(A) = 1)] 0.0 0.2 0.4 0.6 0.8 1.0 cut point 0.0 0.2 0.4 0.6 0.8 1.0 P(Y(A) = 1) Classification of 15 20 20 pixel areas (based on visual inspection of imagery) into non-forest ( ), moderately forest ( ), and forest (no marker). 11 Graduate Workshop on Environmental Data Analytics 2014