Challenges in modelling air pollution and understanding its impact on human health

Challenges in modelling air pollution and understanding its impact on human health
Alastair Rushworth
Joint Statistical Meeting, Seattle
Wednesday August 12th, 2015

Acknowledgements
Work in this talk is part of a larger collaboration between:
- University of Glasgow
- University of Southampton
- UK Met Office
Funded by the EPSRC.
Collaborators: Duncan Lee, Richard Mitchell, Sujit Sahu, Sabyasachi Mukhopadhyay, Paul Agnew, Christophe Sarran

Goals
- Construct a state-of-the-art spatio-temporal model for predicting air pollution at high spatial resolution across England.
  - NO2, O3, PM10 and PM2.5 monitored at 142 urban and rural locations.
  - Output from a computer model (AQUM) on a 1 km² grid.
  - Fusion of modelled and measured data sources achieved using an anisotropic Gaussian predictive process model, where AQUM is included as a regressor.
  - See the technical report for details: www.southampton.ac.uk/~sks/research/papers/anisotropy4.pdf
- Build better models for disease risk given an air pollution exposure, that can adequately represent the spatio-temporal pattern in disease risk.
- Utilising the new models above within a two-stage framework, estimate the health effects of air pollution across England.

England LHA respiratory admissions data
- Jan 2007 to Dec 2011 (60 months)
- 323 Local Health Authorities
- N = 19380
[Figure: map of the respiratory admissions data for Jan 2007, with Newcastle, Manchester, Birmingham and London labelled; colour scale from 0 to 2.8]

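Typical model for spatial health counts
$$Y_{kt} \mid E_{kt}, R_{kt} \sim \text{Poisson}(E_{kt} R_{kt})$$
$$\ln(R_{kt}) = \beta_0 + x_{kt}\beta + z_{kt}^\top \gamma + \phi_{kt}$$
for time points $t = 1, \dots, T$ and regions $k = 1, \dots, N$, where
- $Y_{kt}$: health counts
- $E_{kt}$: expected cases
- $R_{kt}$: health risk
- $\beta_0$: intercept
- $x_{kt}$: air pollution
- $\beta$: pollution effect
- $z_{kt}^\top \gamma$: other covariate effects
- $\phi_{kt}$: random effect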
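
As a rough illustration of the model on this slide, the sketch below simulates counts from a Poisson log-linear risk model with NumPy. All numerical values, the toy dimensions and the variable names (`beta0`, `beta`, `gamma`, `phi`) are assumptions made for the example; the random effects are drawn independently here purely for simplicity, which is not the structure proposed later in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 5, 4                       # regions k and time points t (toy sizes)
beta0, beta = -0.1, 0.05          # intercept and pollution effect (made-up values)
gamma = np.array([0.02, -0.01])   # effects of two other covariates (made-up values)

x = rng.normal(size=(N, T))                # pollution x_kt on a standardised scale
z = rng.normal(size=(N, T, 2))             # other covariates z_kt
phi = rng.normal(scale=0.1, size=(N, T))   # random effects (independent in this toy example)
E = rng.uniform(50, 150, size=(N, T))      # expected cases E_kt

# Log-linear model for the risk, then Poisson counts with mean E_kt * R_kt
log_R = beta0 + x * beta + z @ gamma + phi
R = np.exp(log_R)
Y = rng.poisson(E * R)

# Relative risk for a one-unit increase in x (one standard deviation
# if x has been scaled to unit standard deviation)
rr_per_unit = np.exp(beta)
```

On this scale a pollution effect of 0.05 corresponds to a relative risk of about 1.05, which is how risks per 1-standard deviation increase can be read.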

Statistical considerations

Unmeasured confounding: Air pollution and the other measured covariates do not account for all variation. Adding a set of spatio-temporal random effects, $\phi_{kt}$, can offer a solution. How should $\phi_{kt}$ be structured in space and time?

Misalignment: The air pollution model estimates the true exposure surface $Z(s_{kj}, t)$ by a set of predictive distributions at grid locations $\{s_{kj}\}$. Health counts are regional totals. How can we reconcile these quantities? Could we simply average the air pollution?

Uncertainty: The posterior density of $Z(s_{kj}, t)$ is available via MCMC samples, and therefore uncertainty in air pollution is quantified. How should this source of uncertainty be incorporated into the health model? What effect does this have on estimation?
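
One possible answer to the misalignment question is to average the gridded predictions within each region, separately for every posterior sample, so that the regional exposure keeps its uncertainty. The sketch below assumes an unweighted mean and a hypothetical cell-to-region lookup (`region_of_cell`); the posterior samples are random stand-ins, not output from the air pollution model.

```python
import numpy as np

rng = np.random.default_rng(1)

S, G = 200, 12   # posterior samples and grid cells (toy sizes)
# Hypothetical lookup: region index k for each grid cell s_kj
region_of_cell = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2])
n_regions = region_of_cell.max() + 1

# Stand-in for MCMC draws of Z(s_kj, t) at one time point, shape (samples, cells)
z_samples = rng.normal(loc=20.0, scale=3.0, size=(S, G))

# Averaging matrix A with A[k, j] = 1 / n_k if cell j lies in region k, else 0
A = np.zeros((n_regions, G))
A[region_of_cell, np.arange(G)] = 1.0
A /= A.sum(axis=1, keepdims=True)

# Regional exposure for every posterior sample, shape (samples, regions)
x_samples = z_samples @ A.T

x_mean = x_samples.mean(axis=0)       # posterior mean regional exposure
x_sd = x_samples.std(axis=0, ddof=1)  # posterior sd, i.e. exposure uncertainty
```

Replacing the rows of `A` with population weights would give a weighted average instead; which choice is appropriate is left open here, as it is on the slide.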

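Unmeasured confounding: An existing model for $\phi_{kt}$
Rushworth et al. (2014) propose the global model:
$$\ln(R_{kt}) = \beta_0 + x_{kt}\beta + z_{kt}^\top \gamma + \phi_{kt}$$
Letting $\phi_t = (\phi_{1t}, \dots, \phi_{Nt})$, where $t = 1, \dots, T$, then:
$$\phi_1 \sim N\left(0, \tau^2 Q(W, \rho)^{-1}\right)$$
$$\phi_t \mid \phi_{t-1} \sim N\left(\alpha \phi_{t-1}, \tau^2 Q(W, \rho)^{-1}\right) \quad \text{for } t \geq 2$$
$$Q(W, \rho) = \rho\,[\mathrm{diag}(W\mathbf{1}) - W] + (1 - \rho) I$$
$W$ = spatial (binary) neighbours matrix.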
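
A minimal sketch of this random effects structure, assuming a toy map of four regions in a line: it builds the precision matrix from a binary neighbours matrix and simulates the random effects forward in time with a first-order autoregression. The parameter names `rho`, `tau2` and `alpha` and their values are assumptions made for illustration.

```python
import numpy as np

def car_precision(W, rho):
    """Q(W, rho) = rho * [diag(W 1) - W] + (1 - rho) * I for a neighbours matrix W."""
    W = np.asarray(W, dtype=float)
    return rho * (np.diag(W.sum(axis=1)) - W) + (1.0 - rho) * np.eye(W.shape[0])

# Toy map: four regions in a line, each adjacent to its immediate neighbours
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

rho, tau2, alpha, T = 0.9, 0.5, 0.7, 6   # made-up parameter values
Q = car_precision(W, rho)
cov = tau2 * np.linalg.inv(Q)            # covariance of each spatial innovation

rng = np.random.default_rng(2)
n = W.shape[0]
phi = np.empty((T, n))
phi[0] = rng.multivariate_normal(np.zeros(n), cov)           # first time point
for t in range(1, T):
    phi[t] = rng.multivariate_normal(alpha * phi[t - 1], cov)  # AR(1) in time
```

With `rho = 0` the precision reduces to the identity (independent effects), and as `rho` approaches 1 it approaches the intrinsic CAR form, whose rows sum to zero.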

Unmeasured confounding: a more flexible model for $\phi_{kt}$
$Q(W, \rho)$ restricts the range of surfaces that can be fitted.
Solution: Treat the non-zero elements of $W$ as random variables $w^+_{ij} \in [0, 1]$.
Control model complexity using a normal prior on the transformed $w^+_{ij}$:
$$\ln\left(\frac{w^+_{ij}}{1 - w^+_{ij}}\right) \sim N(\mu, \sigma^2)$$
$\mu$ is chosen to be large and positive, reflecting a prior preference for spatial smoothness.
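
To see why a large positive prior mean expresses a preference for smoothness, the sketch below draws logit-scale values from a normal prior and maps them back to adjacency weights in [0, 1]. The values `mu = 15` and `sigma = 2` are illustrative assumptions, as is the name `sigma` for the prior standard deviation.

```python
import numpy as np

rng = np.random.default_rng(3)

def inv_logit(v):
    """Map a logit-scale value back to a weight in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

sigma = 2.0        # prior standard deviation on the logit scale (assumed value)
n_draws = 10_000

# Large positive prior mean: almost all weights sit next to 1,
# so neighbouring regions are smoothed towards each other a priori
w_smooth = inv_logit(rng.normal(15.0, sigma, size=n_draws))

# Prior mean of zero for comparison: weights spread across (0, 1)
w_diffuse = inv_logit(rng.normal(0.0, sigma, size=n_draws))

frac_smooth = np.mean(w_smooth > 0.99)
frac_diffuse = np.mean(w_diffuse > 0.99)
```

Under the first prior a weight only moves towards 0 (a step change between neighbours) when the data give strong evidence for it.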

English respiratory data: random effects
We will compare the random effects models

Model type      Random effects    Adjacency model
GLM             NA
Non-adaptive    $\phi_{kt}$       $w^+_{kt} = 1$
Adaptive        $\phi_{kt}$       $\text{logit}(w^+_{kt}) \sim N(\mu, \sigma^2)$

under the risk specification
$$\ln(R_{kt}) = \beta_0 + x_{kt}\beta + \text{jobseekers}_{kt}\,\gamma_1 + \text{houseprice}_{kt}\,\gamma_2 + \phi_{kt}$$

English respiratory data: random effects

Pollutant    No random effects (GLM)    Non-adaptive $\phi_{kt}$    Adaptive $\phi_{kt}$
NO2          1.151 (1.144, 1.158)       1.057 (1.045, 1.069)        1.048 (1.036, 1.060)
PM10         1.013 (1.007, 1.020)       1.007 (0.998, 1.015)        1.006 (0.995, 1.015)
PM2.5        1.013 (1.007, 1.019)       1.006 (0.997, 1.014)        1.006 (0.997, 1.016)
O3           0.981 (0.974, 0.987)       0.983 (0.972, 0.995)        0.980 (0.965, 0.993)

Table: Risks and 95% CIs for 1-standard deviation increases in pollutant

Simpler models have a tendency to overestimate air pollution effects.

[Figure: map of $\phi_{kt}$ estimates and adjacency model for the respiratory data, $P[w_{ij} < 0.5] > 0.99$, with Newcastle, Manchester, Birmingham and London labelled; colour scale from 0 to 2.7]

Air pollution - uncertainty
The 1st stage model yields predictive distributions for air pollution in space and time. This uncertainty should be passed through the 2nd stage health model so that the resulting health estimates represent all available information.
Some possible strategies:
(1) Treat posterior mean pollution concentrations as true values (no uncertainty)
(2) Directly feed samples from the posterior air pollution density through the health model
(3) Treat the posterior pollution densities as prior distributions in the health model (e.g. using a Gaussian approximation)

Exploring the English respiratory data: uncertainty
Compare approaches to incorporating pollution uncertainty:
(1) $x_{kt} = \bar{x}_{kt}$
(2) $x_{kt} \sim \text{DU}$ over posterior air pollution samples
(3) $x_{kt} \sim \text{MVN}$ estimated from posterior samples
Again, under the risk specification
$$\ln(R_{kt}) = \beta_0 + x_{kt}\beta + \text{jobseekers}_{kt}\,\gamma_1 + \text{houseprice}_{kt}\,\gamma_2 + \phi_{kt}$$
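
The three approaches can be sketched as three ways of producing an exposure vector from stored posterior samples. This is only an illustration under stated assumptions: the samples are random stand-ins, DU is read as a discrete uniform distribution over the stored samples, and the helper names `draw_du` and `draw_mvn` are hypothetical. In approach (3) the Gaussian would act as a prior inside the health model, so the exposure would be updated given the health counts; only the prior itself is constructed here.

```python
import numpy as np

rng = np.random.default_rng(4)

S, K = 500, 3   # posterior samples and regions (toy sizes)
# Stand-in for posterior samples of regional exposure, shape (samples, regions)
x_samples = np.array([20.0, 25.0, 30.0]) + rng.normal(scale=2.0, size=(S, K))

# (1) Plug-in: posterior mean treated as the true exposure, no uncertainty
x_plugin = x_samples.mean(axis=0)

# (2) Discrete uniform: pick one stored posterior sample at random,
#     e.g. once per iteration of the health model sampler
def draw_du(rng):
    return x_samples[rng.integers(S)]

# (3) Gaussian approximation: mean and covariance estimated from the samples
mvn_mean = x_samples.mean(axis=0)
mvn_cov = np.cov(x_samples, rowvar=False)

def draw_mvn(rng):
    return rng.multivariate_normal(mvn_mean, mvn_cov)

x_du = draw_du(rng)
x_mvn = draw_mvn(rng)
```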

Results: uncertainty

Pollutant    (1) $x_{kt} = \bar{x}_{kt}$    (2) $x_{kt} \sim \text{DU}$    (3) $x_{kt} \sim \text{MVN}$
NO2          1.048 (1.036, 1.060)           1.001 (0.999, 1.003)           1.035 (1.030, 1.041)
PM10         1.006 (0.995, 1.015)           1.000 (0.998, 1.003)           1.025 (0.999, 1.043)
PM2.5        1.006 (0.997, 1.016)           1.001 (0.997, 1.004)           1.008 (0.995, 1.062)
O3           0.980 (0.965, 0.993)           1.000 (0.999, 1.001)           0.996 (0.967, 1.000)

Table: Risks and 95% CIs for 1-standard deviation increases in pollutant

Conclusions
- Choices for handling spatio-temporal autocorrelation have important consequences for estimating the effects of air pollution.
- It is important to treat air pollution exposure as uncertain, as it is rarely realistic to assume exposure is observed (or predicted) without error.
Future work:
- Simulate to determine bias and coverage properties for
- Improve on current Gaussian approximation to air pollution posterior
- Multivariate disease responses

Thank you very much for listening!