Spatio-temporal Statistical Modelling for Environmental Epidemiology

Similar documents
A Tale of Two Parasites

Spatio-temporal modeling of weekly malaria incidence in children under 5 for early epidemic detection in Mozambique

Statistical and epidemiological considerations in using remote sensing data for exposure estimation

Statistical Analysis of Spatio-temporal Point Process Data. Peter J Diggle

Predicting Long-term Exposures for Health Effect Studies

Measurement Error in Spatial Modeling of Environmental Exposures

The Use of Spatial Exposure Predictions in Health Effects Models: An Application to PM Epidemiology

Pumps, Maps and Pea Soup: Spatio-temporal methods in environmental epidemiology

Combining Incompatible Spatial Data

Comparison of data-fitting models for schistosomiasis: a case study in Xingzi, China

Dismantling the Mantel tests

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Physician Performance Assessment / Spatial Inference of Pollutant Concentrations

Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter

Bayesian Geostatistical Design

Brent Coull, Petros Koutrakis, Joel Schwartz, Itai Kloog, Antonella Zanobetti, Joseph Antonelli, Ander Wilson, Jeremiah Zhu.

PrevMap: An R Package for Prevalence Mapping

Journal of Statistical Software

Disease mapping with Gaussian processes

Modelling spatio-temporal patterns of disease

What is (certain) Spatio-Temporal Data?

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Linkage Methods for Environment and Health Analysis General Guidelines

Estimating the long-term health impact of air pollution using spatial ecological studies. Duncan Lee

Cluster Analysis using SaTScan

Environmental factors associated with the distribution of Loa loa vectors Chrysops spp. in Central and West Africa: seeing the forest for the trees

WHO lunchtime seminar Mapping child growth failure in Africa between 2000 and Professor Simon I. Hay March 12, 2018

Rejoinder. Peihua Qiu Department of Biostatistics, University of Florida 2004 Mowry Road, Gainesville, FL 32610

at least 50 and preferably 100 observations should be available to build a proper model

Statistics in medicine

Modelling trends in the ocean wave climate for dimensioning of ships

Space-time downscaling under error in computer model output

Bayesian Hierarchical Models

Extended Follow-Up and Spatial Analysis of the American Cancer Society Study Linking Particulate Air Pollution and Mortality

Community Health Needs Assessment through Spatial Regression Modeling

Site-specific Prediction of Mosquito Abundance using Spatio-Temporal Geostatistics

EPIDEMIOLOGY FOR URBAN MALARIA MAPPING

TELE-EPIDEMIOLOGY URBAN MALARIA MAPPING

Map Methodology Loa loa Estimated prevalence of Eye Worm:

Spatio-temporal precipitation modeling based on time-varying regressions

Flexible modelling of the cumulative effects of time-varying exposures

Analysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington

Advanced analysis and modelling tools for spatial environmental data. Case study: indoor radon data in Switzerland

Model-based Geostatistics

A Geostatistical Approach to Linking Geographically-Aggregated Data From Different Sources

Measurement error effects on bias and variance in two-stage regression, with application to air pollution epidemiology

GIS Application in Landslide Hazard Analysis An Example from the Shihmen Reservoir Catchment Area in Northern Taiwan

ICU Monitoring: An Engineering Perspective

UKPMC Funders Group Author Manuscript Adv Parasitol. Author manuscript; available in PMC 2011 February 12.

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Probabilistic Graphical Models

Probabilistic assessment of danger zones using a surrogate model of CFD simulations

Multivariate Count Time Series Modeling of Surveillance Data

Gaussian Process Approximations of Stochastic Differential Equations

Identification of hotspots of rat abundance and their effect on human risk of leptospirosis in a Brazilian slum community

Comparative effectiveness of dynamic treatment regimes

Chris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010

Niche Modeling. STAMPS - MBL Course Woods Hole, MA - August 9, 2016

Spatio-temporal modeling of avalanche frequencies in the French Alps

Lecture 1 January 18

Stochastic Hydrology. a) Data Mining for Evolution of Association Rules for Droughts and Floods in India using Climate Inputs

Proteomics and Variable Selection

Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London

Flexible Spatio-temporal smoothing with array methods

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links

Spatial data analysis

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

CLIMATE AND LAND USE DRIVERS OF MALARIA RISK IN THE PERUVIAN AMAZON,

On dealing with spatially correlated residuals in remote sensing and GIS

Lognormal Measurement Error in Air Pollution Health Effect Studies

Time Series. Anthony Davison. c

Lecture 01: Introduction

Spatial Inference of Nitrate Concentrations in Groundwater

Time Series Analysis -- An Introduction -- AMS 586

Dengue Forecasting Project

Spatio-temporal modelling of daily air temperature in Catalonia

An application of the GAM-PCA-VAR model to respiratory disease and air pollution data

USING CLUSTERING SOFTWARE FOR EXPLORING SPATIAL AND TEMPORAL PATTERNS IN NON-COMMUNICABLE DISEASES

EpiMAN-TB, a decision support system using spatial information for the management of tuberculosis in cattle and deer in New Zealand

Introduction to Bayesian Inference

Report on Kriging in Interpolation

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

Lecture 5: Poisson and logistic regression

Introduction to ecosystem modelling Stages of the modelling process

1 The problem of survival analysis

Statistical challenges in Disease Ecology

Everything is related to everything else, but near things are more related than distant things.

Applications of GIS in Health Research. West Nile virus

Characterisation of Near-Earth Magnetic Field Data for Space Weather Monitoring

STATISTICS-STAT (STAT)

Bayesian Areal Wombling for Geographic Boundary Analysis

Mathematical Modelling on the CDTI Prospects for Elimination of Onchocerciasis: A Deterministic Model Approach

Cluster Analysis using SaTScan. Patrick DeLuca, M.A. APHEO 2007 Conference, Ottawa October 16 th, 2007

Uncertainty quantification of world population growth: A self-similar PDF model

The STS Surgeon Composite Technical Appendix

Roger S. Bivand Edzer J. Pebesma Virgilio Gömez-Rubio. Applied Spatial Data Analysis with R. 4:1 Springer

LARGE NUMBERS OF EXPLANATORY VARIABLES. H.S. Battey. WHAO-PSI, St Louis, 9 September 2018

Transcription:

Spatio-temporal Statistical Modelling for Environmental Epidemiology Peter Diggle Department of Medicine, Lancaster University and Department of Biostatistics, Johns Hopkins University WHO Geneva, September 2007

Introduction environmental determinants of health vary: continuously; in space; and in time assigning a spatially and/or temporally averaged exposure to an individual at risk is at best a pragmatic approximation models of the association between exposure and health outcome should acknowledge the statistical uncertainty in exposure estimates model-based geostatistical methods can be used to estimate a spatially and temporally continuous exposure surface from spatially and/or temporally sparse data with accompanying estimates of precision

1. APOC Two case-studies goal: predict Loa loa prevalence from spatially sparse community-level surveys environmental covariates: elevation, green-ness (from satellite images) policy-relevant question: does prevalence exceed 20%? 2. PAMPER goal: estimate a spatio-temporal exposure surface of black smoke concentrations for UK city of Newcastle upon Tyne over a thirty-year period goal: relate exposure estimates to pregnancy outcomes spatially sparse network of monitors, operating intermittently over study-period

Geostatistics Data (x i, Y i ) : i = 1,..., n x i = location of ith measurement Y i = measured value at location x i plus relevant covariate information, context,... Model S(x) = spatial process of interest (exposure surface) Y i = noisy version of S(x i ) Spatio-temporal extension: Y i = S(x i, t i )+ noise

African Programme for Onchocerciasis Control river blindness an endemic disease in wet tropical regions donation programme of mass treatment with ivermectin approximately 30 million treatments to date serious adverse reactions experienced by some patients highly co-infected with Loa loa parasites precautionary measures put in place before mass treatment in areas of high Loa loa prevalence http://www.who.int/pbd/blindness/onchocerciasis/en/

Schematic representation of Loa loa model prevalence 0.0 0.2 0.4 0.6 0.8 1.0 local fluctuation sampling error environmental gradient 0.0 0.2 0.4 0.6 0.8 1.0 location

Modelling strategy use relationship between environmental variables and ground-truth prevalence to construct preliminary predictions via logistic regression use local deviations from regression model to estimate smooth residual spatial variation Bayesian paradigm for quantification of uncertainty in resulting model-based predictions

logit prevalence vs elevation logit prevalence 5 4 3 2 1 0 0 500 1000 1500 elevation

logit prevalence vs max NDVI logit prevalence 5 4 3 2 1 0 0.65 0.70 0.75 0.80 0.85 0.90 Max Greeness

Comparing non-spatial and spatial predictions in Cameroon Non-spatial 60 50 40 30 20 10 0 0 10 20 30

Spatial 60 50 40 30 20 10 0 0 10 20 30 40

Probabilistic prediction in Cameroon

RAPLOA have you ever had eye-worm? did it look like this photograph? did it go away within a week?

eplacemen RAPLOA calibration parasitology prevalence 0 20 40 60 80 100 parasitology logit 6 4 2 0 2 0 20 40 60 80 100 RAPLOA prevalence 6 4 2 0 2 RAPLOA logit Calibration relationship enables improved prediction of parasitological prevalence in areas of high uncertainty

PAMPER Goal: Construct predictions of black smoke levels, S(x, t), over thirty-year period Available data: monitored black smoke levels from spatially discrete monitoring network

monitors are only active intermittently

Modelling strategy Two-stage approach: 1. model temporal variation in spatially averaged black smoke levels 2. model residual spatio-temporal variation about temporal average

Model for temporal variation in spatially averaged black smoke Y t = spatially averaged black smoke at time t Model needs to take account of: long-term (decreasing) trend seasonal variation Classical regression model for Y t is log P t = α + βt + r {A k cos(kωt) + B k sin(kωt)} + Z t k=1 Case r = 1 gives pure sinusoid, r = 2, 3,... allows non-sinusoidal seasonal patterns

Fitted values, model 1 1200 1300 1400 1500 1600 Week 0 50 100 150 Lag Log average black smoke level Autocorrelation 0.0 0.2 0.4 0.6 0.8 1.0 1 2 3 4 5

Model for temporal variation in spatially averaged black smoke (continued) Classical model fails because seasonal pattern is stochastic. Dynamic model: log P t = α + βt + {A t cos(ωt) + B t sin(ωt)} + Z t A t B t = A t 1 + ǫ t = B t 1 + δ t Allows locations and magnitudes of seasonal peaks and troughs to vary between years

1200 1300 1400 1500 1600 Week Model (2) 0 50 100 150 Lag Autocorrelation Log average black smoke level 0.0 0.2 0.4 0.6 0.8 1.0 1 2 3 4 5

Model for spatio-temporal variation in residuals Y t (x) = log ˆP t + S(x, t) + Z t (x) S(x, t) = spatio-temporally correlated (?) random field Z t (x) = mutually independent measurement errors

Constructed covariates where does the spatio-temporal correlation come from? look for possible surrogate measures which: are available at all locations and times correlate well with measured black smoke concentrations at monitored locations

Monitored black smoke vs domestic chimney density Important interactions with: non-residential/residential land-use (solid/open circles) clean-air act (staggered implementation) Average residual from dynamic model 0.2 0.0 0.2 0.4 0.6 0 500 1000 1500 2000 2500 3000 Average chimney count within 500 metres

Discussion points 1. Reliance on area-level data to analyse individual-level risk-factors is tricky 2. APOC (a) target for prediction linked directly to policy-relevant question (b) spatially dense surrogate outcomes (RAPLOA) used in combination with spatially sparse primary outcomes (parasitology) to improve spatial prediction

3. PAMPER (a) temporal takes precedence over spatial (b) construction of spatially continuous explanatory variables assists prediction of spatio-temporally continuous exposure surface (c) and may eliminate residual spatio-temporal correlation

4. Epidemiological relevance of point exposure in space-time: (a) ambient vs indoor? (b) integration over space and/or time? (c) integration over tracked movements of individuals at risk? 5. Making proper allowance for imprecision in exposure estimates is important 6. Integrated analysis of exposure and health outcome data: (a) possible in principle (APOC Loa loa study) (b) but computationally challenging for large data-sets 7. Real-time spatial prediction feasible using spatio-temporal models in conjunction with Monte Carlo algorithms

References Diggle, P.J., Thomson, M.C., Christensen, O.F., Rowlingson, B., Obsomer, V., Gardon, J., Wanji, S., Takougang, I., Enyong, P., Kamgno, J., Remme, H., Boussinesq, M. and Molyneux, D.H. (2007). Spatial modelling and prediction of Loa loa risk: decision making under uncertainty. Annals of Tropical Medicine and Parasitology, 101, 499 509. Fanshawe, T.R., Diggle, P.J., Rushton, S., Sanderson, R., Lurz, P.W.W., Glinianaia, S.V., Pearce, M.S., Parker, L., Charlton, M. and Pless- Mulloli, T. (2007). Modelling spatio-temporal variation in exposure to particulate matter: a two-stage approach. Environmetrics (in press)