Physician Performance Assessment / Spatial Inference of Pollutant Concentrations

Dawn Woodard
Operations Research & Information Engineering, Cornell University
Johns Hopkins Dept. of Biostatistics, April 2009

Outline
1. Physician Performance Assessment
   - Performance Metrics
2. Spatial Inference of Pollutant Concentrations
   - Statistical Approaches for Pollutant Estimation
   - Bayesian Moving-Average Models
   - Application to Nitrates Data
   - Conclusions and Future Work

Physician Performance Assessment
Joint work with A. Gelfand, B. Barlow, J. Elmore, and the Breast Cancer Surveillance Consortium.
- There is concern about large differences in false positive and false negative rates between radiologists in screening mammography.
- Database of 500,000+ mammograms:
  - Demographic characteristics of the patient
  - Outcome of the mammogram (false +, false -, true +, or true -)
- Radiologist surveys:
  - Demographic & practice characteristics
  - Level of concern about malpractice

Physician Performance Assessment
Goal: assess physician performance while accounting for:
1. Differences in patients (case mix)
2. Differences in sample size (e.g. few cancer cases for some radiologists)
3. Differences in radiologist attributes

Physician Performance Assessment
- Can adjust for case mix (e.g. Salem-Schatz et al. 1994)
- Can test whether a physician is significantly above or below average
- Tests invalid for small sample sizes
- Not clear how to compare one physician to another
- We build on Normand, Glickman, Gatsonis (1997): performance metrics for hospitals based on patient survival rate
- We extend to metrics for sensitivity and specificity of physicians
- We use a Bayesian hierarchical modeling approach to estimate and explain accuracy differences among radiologists

Modeling Accuracy
Logistic regression:
$$\operatorname{logit}(S_{ij}) = X_{ij}\beta + \tau_i, \qquad \tau_i = W_i\gamma + \phi_i, \qquad \phi_i \sim N(0, \psi)$$
- $S_{ij}$ = sensitivity or specificity on mammogram $i,j$
- $X_{ij}$ = risk factors of patient $i,j$
- $W_i$ = attributes of radiologist $i$
- $i = 1, \ldots, I$ radiologists
- $j = 1, \ldots, n_i$ mammograms of radiologist $i$ with cancer present
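To make the two-level structure concrete, here is a minimal simulation sketch of the model on this slide. It is an illustration only, not the authors' code: the function names, the toy covariates, and the reading of $\psi$ as a variance (rather than a standard deviation) are assumptions.

```python
import math
import random

def inv_logit(z):
    """Inverse logit, mapping a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def simulate_accuracy(X, W, beta, gamma, psi, rng):
    """Draw radiologist effects tau_i and per-mammogram accuracies S_ij.

    X[i][j] : covariate list for mammogram j of radiologist i
    W[i]    : attribute list for radiologist i
    psi     : variance of the radiologist-level residual phi_i (assumed)
    """
    tau, S = [], []
    for i, W_i in enumerate(W):
        phi_i = rng.gauss(0.0, math.sqrt(psi))                  # phi_i ~ N(0, psi)
        tau_i = sum(w * g for w, g in zip(W_i, gamma)) + phi_i  # tau_i = W_i gamma + phi_i
        tau.append(tau_i)
        # logit(S_ij) = X_ij beta + tau_i
        S.append([inv_logit(sum(x * b for x, b in zip(x_ij, beta)) + tau_i)
                  for x_ij in X[i]])
    return tau, S

# Hypothetical toy inputs: two radiologists, with two and one mammograms.
rng = random.Random(0)
X = [[[1.0, 0.2], [1.0, -0.5]], [[1.0, 1.1]]]
W = [[1.0], [0.0]]
tau, S = simulate_accuracy(X, W, beta=[1.5, 0.3], gamma=[0.4], psi=0.25, rng=rng)
```

Each radiologist shares one effect across all of her mammograms, which is what lets the model pool information for radiologists with few cancer cases.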

Performance on a Hypothetical Patient
- Predict the sensitivity and specificity of each radiologist for a typical patient
- Or a high-risk or low-risk patient
- For a hypothetical patient with attributes $X_0$, the measure is
$$S(X_0, \beta, \tau_i) = \operatorname{logit}^{-1}(X_0\beta + \tau_i)$$

Performance on a Hypothetical Patient
Sensitivity and specificity on a typical patient:
[Figure: two panels, sensitivity (%) and specificity (%), plotted by radiologist ID.]
These measures do not adjust for differences in radiologist attributes.

Performance Relative to a Standard
Alternatively, take the predicted average accuracy (sensitivity or specificity) of a particular radiologist on her patients:
$$\mu_i = \frac{1}{n_i}\sum_{j=1}^{n_i} S(X_{ij}, \beta, \tau_i)$$
Compare to that expected for a radiologist with the same attributes and case mix:
$$\tilde{\mu}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} \bar{S}(X_{ij}, \beta, W_i), \qquad \bar{S}(X_{ij}, \beta, W_i) = E_{\tau \mid W_i}\{S(X_{ij}, \beta, \tau)\}$$
Take the difference $\mu_i - \tilde{\mu}_i$.
Performance is evaluated while adjusting for radiologist attributes.
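The comparison to a standard can be sketched numerically as follows. This is a hedged illustration for a single fixed set of parameter values: the helper names (`dot`, `observed_mean`, `expected_mean`), the toy inputs, and the use of plain Monte Carlo for the expectation over $\tau$ given $W_i$ are all assumptions. In a full Bayesian analysis the same computation would presumably be repeated for each posterior draw of the parameters.

```python
import math
import random

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def observed_mean(X_i, beta, tau_i):
    """Average predicted accuracy of radiologist i on her own patients."""
    return sum(inv_logit(dot(x, beta) + tau_i) for x in X_i) / len(X_i)

def expected_mean(X_i, W_i, beta, gamma, psi, rng, n_draws=2000):
    """Standard for radiologist i: the same average, with tau integrated out.

    Approximates E over tau ~ N(W_i gamma, psi) by Monte Carlo (psi = variance).
    """
    center = dot(W_i, gamma)
    total = 0.0
    for _ in range(n_draws):
        tau = rng.gauss(center, math.sqrt(psi))
        total += observed_mean(X_i, beta, tau)
    return total / n_draws

# Hypothetical radiologist whose effect tau_i = 1.0 is above the value
# W_i gamma = 0.3 expected from her attributes.
rng = random.Random(1)
X_i = [[1.0, 0.2], [1.0, -0.4], [1.0, 0.9]]
W_i, beta, gamma = [1.0], [1.0, 0.5], [0.3]
mu = observed_mean(X_i, beta, tau_i=1.0)
mu_standard = expected_mean(X_i, W_i, beta, gamma, psi=0.25, rng=rng)
difference = mu - mu_standard   # positive: better than expected
```

Because both averages run over the same patients $X_{ij}$, the difference isolates the radiologist-specific part of performance from case mix and attributes.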

Performance Relative to a Standard
[Figure: two panels, sensitivity difference and specificity difference, plotted by radiologist index.]
Many radiologists had predicted specificity significantly above or below that expected; not so for sensitivity.

Conclusion
Bayesian modeling of patient-level sensitivity and specificity provides estimates of performance measures while fully accounting for uncertainty.

Spatial Inference of Pollutant Concentrations
Joint work with R. Wolpert and M. O'Connell.
Measurements of nitrates in groundwater have been obtained from wells in the Mid-Atlantic states (Ator 1998):
[Figure: map of well measurements; legend: > 8.3 mg/l, mid range, < 0.75 mg/l.]

Spatial Inference of Pollutant Concentrations
Desire geographic interpolation of nitrate levels.
Distinct regulatory goals require inference at distinct, non-nested geographic scales...
- fine-scale,
- regulatory units (e.g. counties),
- hydrologic units (e.g. watersheds).
...as well as distinct risk measures:
- average nitrate concentration,
- probability of exceeding a threshold, averaged by region,
- maximum nitrate concentration occurring in each region.

Spatial Inference of Pollutant Concentrations
- We utilize a nonparametric spatial statistical model for nitrate concentrations at all locations.
- Bayesian approach: the unknown nitrate concentration and its average over various regions are all treated as random variables...
- ...for which we can compute expected values (best overall estimates) and probabilities of exceeding specified thresholds.

Existing Approaches
- When inference is desired at a single spatial partition (e.g. counties) or nested partitions, lattice models can be used.
- Kriging allows smooth spatial interpolation: it models the pollutant concentration $\Lambda(x)$ at $x \in \mathcal{X}$ as
$$\log \Lambda(x) = \sum_{j=1}^{J} X_j(x)\beta_j + Z(x)$$
where $Z(x)$ is a mean-zero Gaussian process.

Existing Approaches
A kriged surface with only an intercept term $\beta_0$:
[Figure: map of the kriged surface; axes: longitude, latitude; measurement legend: > 8.3, mid range, < 0.75.]

Existing Approaches
The confidence intervals are very wide in many locations, even where there is much data:
[Figure: two maps, lower bound and upper bound of the interval; axes: longitude, latitude; measurement legend: > 8.3, mid range, < 0.75.]

Moving-Average Models
- Ickstadt and Wolpert (1997) and Wolpert and Ickstadt (1998) introduced methods for interpolating intensities of spatial point processes by modeling the intensity $\Lambda(x)$ as a moving average of an unobserved stochastic process.
- The approach has been used in non-point-process applications:
  - identifying proteins in mass spectroscopy (House, Clyde, and Wolpert 2006)
  - spatio-temporal inference of sulfur dioxide air pollution (Tu 2006)
- We apply this model to obtain inferences of multiple risk measures, at multiple spatial scales.

Moving-Average Models
The concentration $\Lambda(x)$ at location $x \in \mathcal{X}$ is modeled as
$$\Lambda(x) = \sum_{j=1}^{J} X_j(x)\beta_j + \sum_{m=1}^{M} k(x, s_m)\gamma_m$$
for $k(x, s)$ a kernel function on $\mathcal{X} \times S$.
- The parameters $s_m$ are taken to be the centers of the mixture components, so that $S = \mathcal{X}$.
- The number $M$, locations $s_m$, and magnitudes $\gamma_m > 0$ of the components are uncertain.
- The $i$th measurement $Y_i$ is assumed to follow
$$\log Y_i \sim N(\log \Lambda(x_i), \sigma^2)$$

Moving-Average Models
Interpretation of the spatial portion of the model, $\sum_m k(x, s_m)\gamma_m$, for pollutant level estimation:
- the pollutant surface is the sum of an unknown number of point sources with unknown locations and magnitudes...
- ...where the concentration decreases with distance from each source in a manner consistent with the kernel $k(\cdot, \cdot)$.

Moving-Average Models
The kernel form is specified as
$$k(x, s) = \exp\left\{ -\frac{1}{2d^2} \lVert x - s \rVert^2 \right\}$$
where $d > 0$ is a constant.
- Can be generalized to use unknown scale, eccentricity, and asymmetry.
- For the nitrates analysis we do not include covariates, so $\beta$ is not in the model.
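The pieces of the moving-average model (the Gaussian kernel, the sum over point sources, and the lognormal measurement model stated earlier) fit together as in the sketch below. It assumes two-dimensional coordinates and no covariates, as in the nitrates analysis; the function names and toy values are hypothetical, and the log-density is written on the $\log Y$ scale.

```python
import math

def gaussian_kernel(x, s, d):
    """k(x, s) = exp(-||x - s||^2 / (2 d^2)) for 2-D points x and s."""
    sq_dist = (x[0] - s[0]) ** 2 + (x[1] - s[1]) ** 2
    return math.exp(-sq_dist / (2.0 * d * d))

def surface(x, sources, magnitudes, d):
    """Lambda(x) = sum_m k(x, s_m) gamma_m (no covariate term)."""
    return sum(g * gaussian_kernel(x, s, d) for s, g in zip(sources, magnitudes))

def log_likelihood(locations, y, sources, magnitudes, d, sigma):
    """Sum over i of the N(log Lambda(x_i), sigma^2) log-density at log y_i.

    One kernel evaluation per (measurement, source) pair, so the cost is O(N * M).
    Assumes Lambda(x_i) > 0, i.e. every measurement is within reach of a source.
    """
    total = 0.0
    for x_i, y_i in zip(locations, y):
        resid = math.log(y_i) - math.log(surface(x_i, sources, magnitudes, d))
        total += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - resid ** 2 / (2.0 * sigma ** 2)
    return total

# Hypothetical configuration: two sources and three measurement locations.
sources, magnitudes, d = [(0.0, 0.0), (2.0, 1.0)], [2.0, 5.0], 1.0
locations = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
y = [surface(x, sources, magnitudes, d) for x in locations]   # noise-free data
ll = log_likelihood(locations, y, sources, magnitudes, d, sigma=0.5)
```

No $N \times N$ covariance matrix is formed or factorized here, which is the source of the computational advantage over kriging noted in the conclusions.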

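Prior Specification
The spatial term in the model can be rewritten
$$\sum_{m=1}^{M} k(x, s_m)\gamma_m = \int_S k(x, s)\,\Gamma(ds), \qquad \text{where } \Gamma(ds) = \sum_{m=1}^{M} \gamma_m \delta_{s_m}(ds)$$
is a discrete measure on $S$.
$\Gamma$ is given a Lévy random field prior:
- Parameterized by a measure $\nu(d\gamma, ds)$ on $\mathbb{R}_+ \times S$
- $M \sim \text{Pois}(\nu_+)$ where $\nu_+ = \nu(\mathbb{R}_+ \times S)$
- Conditional on $M$, $(\gamma_m, s_m) \overset{iid}{\sim} \nu(d\gamma, ds)/\nu_+$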

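Prior Specification
- We use the gamma random field on a bounded set $S \subset \mathbb{R}^2$.
- Its Lévy density is $\nu(\gamma, s) = \alpha\gamma^{-1}e^{-\rho\gamma}$ for $\alpha, \rho > 0$.
- For $A \subset S$ we have $\Gamma(A) \sim \text{Ga}(\alpha|A|, \rho)$.
- In order for $\nu_+ < \infty$, we must truncate $\nu(\gamma, s)$ by setting it to zero for $\gamma < \epsilon$, where $\epsilon > 0$.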
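The total mass of the truncated Lévy measure can be worked out directly, which shows why the truncation is needed. This is a short derivation assuming that $|S|$ denotes the area of $S$ and that $E_1(z) = \int_z^\infty u^{-1}e^{-u}\,du$ is the exponential integral, using the substitution $u = \rho\gamma$:

```latex
\nu_+ = \int_S \int_{\epsilon}^{\infty} \alpha\,\gamma^{-1} e^{-\rho\gamma}\, d\gamma\, ds
      = \alpha\,|S| \int_{\rho\epsilon}^{\infty} u^{-1} e^{-u}\, du
      = \alpha\,|S|\, E_1(\rho\epsilon)
```

Since $E_1(z)$ grows like $-\log z$ as $z \to 0$, the mass is infinite when $\epsilon = 0$; a cutoff $\epsilon > 0$ keeps the expected number of mixture components finite.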

Prior Specification
This prior implies that:
- The number of mixture components $M$ satisfies $M \sim \text{Pois}(\alpha|S|E_1(\rho\epsilon))$, where $E_1$ is the exponential integral function.
- Conditional on $M$, the locations $s_m$ are independently uniformly distributed on $S$...
- ...and the magnitudes $\gamma_m$ are independently distributed according to the density
$$f(\gamma) \propto \gamma^{-1}e^{-\rho\gamma}\,\mathbf{1}(\gamma > \epsilon)$$
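A draw from this prior can be simulated in three steps: the number of components, their locations, and their magnitudes. The sketch below uses only the Python standard library and makes several assumptions that are not in the talk: $S$ is taken to be a rectangle, $E_1$ is evaluated by its power series, the Poisson draw uses Knuth's multiplication method, and the magnitudes are drawn by rejection sampling from a shifted exponential proposal. All function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def exp1(z, terms=60):
    """Exponential integral E_1(z) by power series (suited to small or moderate z > 0)."""
    total = -EULER_GAMMA - math.log(z)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -z / k          # term = (-z)^k / k!
        total -= term / k
    return total

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for moderate means."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_magnitude(rho, eps, rng):
    """Draw from f(g) proportional to g^{-1} exp(-rho g) 1(g > eps) by rejection."""
    while True:
        g = eps + rng.expovariate(rho)   # proposal: exponential shifted to start at eps
        if rng.random() < eps / g:       # target/proposal ratio is proportional to 1/g
            return g

def sample_prior(alpha, rho, eps, bounds, rng):
    """One prior draw of (locations, magnitudes) on the rectangle bounds."""
    x0, x1, y0, y1 = bounds
    area = (x1 - x0) * (y1 - y0)
    M = sample_poisson(alpha * area * exp1(rho * eps), rng)
    locations = [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(M)]
    magnitudes = [sample_magnitude(rho, eps, rng) for _ in range(M)]
    return locations, magnitudes

rng = random.Random(42)
locations, magnitudes = sample_prior(alpha=0.5, rho=1.0, eps=0.1,
                                     bounds=(0.0, 4.0, 0.0, 3.0), rng=rng)
```

The rejection step becomes inefficient when $\rho\epsilon$ is very small, so a production sampler would likely use a different proposal; the sketch favors brevity.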

Prior Specification
These choices lead to prior surfaces $\Lambda(x)$ like this one:
[Figure: map of a surface drawn from the prior; axes: longitude, latitude.]

Prior Specification
The areas with high concentrations have random (unknown) locations a priori:
[Figure: map of a surface drawn from the prior; axes: longitude, latitude.]

Computation
- Computation is performed using reversible jump Markov chain Monte Carlo (Green 1995), wherein samples $\omega_t$ of the parameter vector $\omega$ are obtained approximately from the posterior distribution.
- Each iteration updates a single parameter, or adds, deletes, or updates a single mixture component.
- A posterior estimate can be obtained for any function $g(\omega)$ of the parameters $\omega$, since
$$E[g(\omega)] = \lim_{T \to \infty} \frac{1}{T} \sum_{t \le T} g(\omega_t)$$
- Ex: for the average of $\Lambda(x, \omega)$ over $x \in A$, sample $\{a_i\}_{i=1}^{K}$ uniformly in $A$ and use the estimate
$$\frac{1}{TK} \sum_{i,t} \Lambda(a_i, \omega_t)$$
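Given posterior samples, region-level summaries reduce to simple averages. The sketch below illustrates the regional-average estimate from this slide, plus the analogous estimate of a region-averaged exceedance probability. It does not implement reversible jump MCMC: the `samples` list stands in for sampler output, the region $A$ is assumed rectangular, and all names and values are hypothetical.

```python
import math
import random

def surface(x, sources, magnitudes, d):
    """Lambda(x) as a sum of Gaussian-kernel point sources (no covariates)."""
    return sum(g * math.exp(-((x[0] - s[0]) ** 2 + (x[1] - s[1]) ** 2) / (2.0 * d * d))
               for s, g in zip(sources, magnitudes))

def uniform_points(region, K, rng):
    """K points a_1..a_K drawn uniformly from the rectangle (x0, x1, y0, y1)."""
    x0, x1, y0, y1 = region
    return [(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(K)]

def regional_average(samples, region, d, K, rng):
    """(1 / (T K)) * sum over i, t of Lambda(a_i, omega_t)."""
    points = uniform_points(region, K, rng)
    total = sum(surface(a, src, mag, d) for src, mag in samples for a in points)
    return total / (len(samples) * K)

def regional_exceedance(samples, region, d, K, threshold, rng):
    """Posterior probability that Lambda exceeds threshold, averaged over the region."""
    points = uniform_points(region, K, rng)
    hits = sum(surface(a, src, mag, d) > threshold for src, mag in samples for a in points)
    return hits / (len(samples) * K)

# Stand-in for MCMC output: T = 3 draws, each a (sources, magnitudes) pair.
samples = [([(0.5, 0.5)], [3.0]), ([(0.2, 0.8)], [2.5]), ([(0.4, 0.4), (0.9, 0.1)], [1.0, 2.0])]
rng = random.Random(7)
avg = regional_average(samples, (0.0, 1.0, 0.0, 1.0), d=0.5, K=200, rng=rng)
prob = regional_exceedance(samples, (0.0, 1.0, 0.0, 1.0), d=0.5, K=200, threshold=2.0, rng=rng)
```

The same samples serve every region and every risk measure, which is how one model fit can support inference at several non-nested spatial scales.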

Nitrate Inferences
The posterior mean of the concentration:
[Figure: map of the posterior mean; axes: longitude, latitude; measurement legend: > 8.3, mid range, < 0.75.]

Nitrate Inferences
The posterior standard deviation of the concentration:
[Figure: map of the posterior standard deviation; axes: longitude, latitude; measurement legend: > 8.3, mid range, < 0.75.]
- This is a measure of estimation uncertainty.
- Most areas with numerous measurements have low uncertainty.

Nitrate Inferences
Average nitrate concentrations over counties:
[Figure: county map; axes: longitude, latitude; legend: > 5 mg/l, mid range, < 1 mg/l.]

Nitrate Inferences
The probability that the nitrate concentration exceeds the regulatory limit, averaged by county:
[Figure: county map; axes: longitude, latitude; legend: > 8 %, mid range, < 2 %.]

Conclusions
- The Bayesian moving-average model allows inference of a variety of risk measures at a variety of spatial scales.
- Uncertainty measures are available for all these estimates.
- The model is nonparametric.
- It has a desirable interpretation in the context of pollutant level estimation.

Conclusions
The moving-average model has a computational advantage over kriging for large data sets:
- Likelihood evaluation for the moving-average model is $O(NM)$, where $N$ is the number of data points and $M$ is the number of mixture components.
- Likelihood evaluation is $O(N^3)$ for kriging.

Future Work
- Covariates such as climatic, geologic, and land use factors could be added.
- The fixed kernels could be replaced with kernels that have priors on the scale, eccentricity, and asymmetry.
- This would allow the model to capture, e.g., pollutant point sources that have spread out more in one direction than another due to flow patterns.

Published in Woodard, Gelfand, Barlow, and Elmore (2007, Statistics in Medicine) and Woodard, Wolpert, and O'Connell (2009, JABES).
More details (references, this talk in .pdf, related work) available at www.orie.cornell.edu/woodard or on request from dbw59@cornell.edu