Spatial Statistics for environmental health data Montse Fuentes (NC State)
|
|
- Randell James
- 5 years ago
- Views:
Transcription
1 Spatial Statistics for environmental health data (NC State) Josh Warren (Yale), Amy Herring (UNC) and Peter Langlois (Texas SHD)
2 Why Congenital Anomalies? Congenital Anomalies: abnormalities, physiological or structural, which present at the time of birth Around 3% of all births result in defect Leading cause of infant mortality (>20% of all infant deaths) Cardiac defects: leading cause of mortality among defect related deaths (a ect around 1% of all births) Cause of most defects is unknown More than $2.5 billion a year in hospital costs alone Sources: March of Dimes Prematurity Campaign, 2010; Martin et al., 2010.; Rynn, Cragan, and Correa, 2008; Martin et al., NC State University 2 / 44
3 Scientific Motivation Birth Defects and Pollution Exposure: Vrijheid et al. (2010) literature review and meta-analysis Meta-analysis: Only few cardiac anomalies and pollutant combinations found to be significant NO 2,SO 2,andPM 10 Atrial septal defect, coarctation of the aorta, tetralogy of Fallot Recommendations: Exposure assessment improvement Other defect groups being analyzed Identify the more susceptible time windows during pregnancy NC State University 3 / 44
4 Texas Vital Records Data Full birth records from all births in Texas where 1 Mother resided at delivery in a region and time period covered by the birth defect registry; At least some demographic information available Geocoding Information: Each birth geocoded to the residence at delivery Texas ( ) study found that 68% of pregnant women did not move between date of conception and date of birth (Nuckols et al. (2004)) Of the women who did move, 49% stayed very close geographically (same water supply source) NC State University 4 / 44
5 Data Sources and Information Infant/Fetus and Pregnancy Information Included: Case/Control Status Sex Birth Weight Pregnancy Outcome Plurality of Pregnancy Gestational Age Estimates Number of Previous Live Births Mother/Father Information Included: Age Birthplace Race/Ethnicity Education NC State University 5 / 44
6 Texas Vital Records Data Figure: Birth Locations in Texas, NC State University 6 / 44
7 Analysis Data Case-control study design Two controls per case Matched on mother s age group, year of birth, and infant s gender Three cardiac defects of interest Atrial septal defects Pulmonary artery and valve defects Ventricular septal defects Groups chosen based on largest observed sample sizes and similar groupings in Gilboa (2005). NC State University 7 / 44
8 Analysis Data The Birth Defects Epidemiology and Surveillance Branch (BDES) in Texas was established in 1993 after cluster of birth defect cases seen in Brownsville, TX. Further investigation showed Hispanic mothers in Southern Texas were at higher risk of their babies developing birth defects (TDSHS, 2011). We focus on the southern part of Texas, TDSHS health service regions 6, 8, and 11, from Regions include Houston, San Antonio, Corpus Christi, and Brownsville Shares border with Mexico 5134 controls, 2567 cases in final dataset Births removed if gestational age < 20 weeks and if birth resulted in an induced termination of the pregnancy or unspecified fetal death, based on previous epidemiologic cardiac defect and air pollution studies (Dolk et al., 2010; Dadvand et al., 2010; Rankin et al.,2009) NC State University 8 / 44
9 Analysis Data (a) Health Regions. (b) Locations of births. NC State University 9 / 44
10 Analysis Data ASD PAVD VSD 5134 Figure: Cardiac defect breakdown of analyzed data. NC State University 10 / 44
11 Pollution Data Sources Air Quality System (AQS): Texas, Ozone- Maximum daily 8-hour average (parts per million (ppm)) PM Daily average (micrograms per cubic meter (ug/m 3 )) Nitrate (NO 3 )- Daily average (ug/m 3 ) Sulfate (SO 4 )- Daily average (ug/m 3 ) Elemental carbon (EC)- Daily average (ug/m 3 ) Organic carbon (OC)- Daily average (ug/m 3 ) Note: EC and OC available from in Texas NC State University 11 / 44
12 PM 2.5 Speciated Data Monitors Figure: Active Monitors, NC State University 12 / 44
13 Limitations of the traditional monitoring data Sparse spatial coverage Missing values Question? How to fill the space-time gaps for the monitoring data? What we need? More complete coverage and better understanding of the underlying physical system. Possible solutions Numerical model output data. NC State University 13 / 44
14 CMAQ (Community Multiscale Air Quality) Model CMAQ model Air quality simulations based on di erential equations that represent the underlying physical processes. Primary inputs: meteorological information, emission rates, initial and boundary conditions. Benefits from CMAQ models Air quality simulations for the entire US. High spatial and temporal resolution without missing values; NC State University 14 / 44
15 CMAQ (Community Multiscale Air Quality) Data 12km x 12km grid over Texas, Included pollutants (daily values): Ozone- Maximum daily 8-hour average (ppb) PM Daily average (ug/m 3 ) Nitrate (NO 3 )- Daily average (ug/m 3 ) Sulfate (SO 4 )- Daily average (ug/m 3 ) Elemental carbon (EC)- Daily average (ug/m 3 ) Organic carbon (OC)- Daily average (ug/m 3 ) NC State University 15 / 44
16 Previous Work: Congenital Anomalies Exposure to pollution during pregnancy typically handled through averages over suspected windows of importance; fit separately using multiple models for di erent exposure windows Separate models for di erent pollutant/defect combinations; Multivariate outcomes rarely considered The impact of PM 2.5 or speciated fine PM on birth defects has not been studied NC State University 16 / 44
17 Statistical Challenges Incorporate spatial analysis of environmental health data, typically not considered in classical birth outcome epidemiological studies Create statistical models to identify the specific critical windows during the pregnancy when high exposures to pollutants more negatively a ect the birth defects Model multivariate birth defects to include additional information rarely utilized in previous studies Incorporate nonstationary and non-gaussian spatial-temporal behavior to increase flexibility to model complex relationships between pollution exposure and birth defects Carry out appropriate inference to better characterize the uncertainty associated with the parameter estimates NC State University 17 / 44
18 New Methodology Cardiac Anomaly Model: Y ij is a binary outcome, taking the value 1 when pregnancy i ends on birth defect j. p ij is the probability of that outcome. We use a probit link. Y i =(Y i1,...,y ij ) T, i =1,...,N; where Y ij p ij ind Bern p ij 1 (p ij) =x T i j + z T i j z vector of multi-pollutants, j pollutant and defect specific, spatially and temporally-varying coe cients Represent the e ect of the concentration of air pollutants during the pregnancy and across space on the probability of developing anomaly j for woman i. NC State University 18 / 44
19 Stick-Breaking Process Prior Review To account for uncertainty in the parametric form of a distribution, G, we use a Bayesian nonparametric model and put a prior on the unknown distribution G. The stick-breaking prior for G is the possibly infinite mixture G = d P M k=1 p k k,wherep i are the mixture probabilities and k is the Dirac distribution with point mass at k. The mixture probabilities break the stick into M pieces, so the sum of the pieces is one, P M k=1 p k =1 p 1 = V 1 and the subsequent probabilities are p k = V k Qj<k (1 V j), P where 1 j 1 k=1 p k = Q j 1 k=1 (1 V k) is the probability not accounted by the first j 1 components. NC State University 19 / 44
20 Stick-Breaking Process Prior Review V i Beta (a, b) control the distribution of the probability mixture i G 0,whereG 0 is a known density Dirichlet Process prior: DP ( G 0 ) Ferguson, 1973 M = 1, V i iid Beta (1, ) SBP can be extended to spatial setting by incorporating spatial information in the p k or the k Generally finite M used in practice SBP is a.s. discrete, problematic for assumed continuous data. Instead Y i = i + i,where i is a continuous random error component and i has the SBP prior. NC State University 20 / 44
21 University of Michigan, February Stick breaking 1. Draw θ 1 from H 0 2. Draw V 1 from Beta(1,ν) 3. p 1 = V 1 4. Draw θ 2 from H 0 5. Draw V 2 from Beta(1,ν) p 2 = V 2(1 V 1)
22 Previous Extensions of Stick-Breaking Prior Gelfand et al., 2005: Spatial Dirichlet Process Prior 1X G ( ) (.) = k (.), k =( (s 1 ),..., (s n )) T k=1 p k iid k MVN (0, s ) Duan, Guindani, and Gelfand, 2007: Generalized Spatial Dirichlet Process Prior 1X 1X G ( ) (.) =... p i(s1 ),...,i(s n) (s 1 ) (.)... i(s1 ) (s n) i(sn) (.) i(s 1 )=1 i(s n)=1 Reich and Fuentes, 2007: Spatial Stick-Breaking Prior MX G ( (s)) (.) = p k (s) k (.) k=1 NC State University 21 / 44
23 Advances in Interdisciplinary Statistics, UNCG October 14, 2007 Figure 2: Example to illustrate the spatial stick-breaking prior. In this example, the spatial domain is the one-dimensional interval (0,1) and the model has Gaussian kernels with knots ψ = (0.5, 0.0, 1.0, 0.2, 0.8), bandwidths ϵ =(0.1, 0.2, 0.2, 0.2, 0.2), and V =(0.9, 0.7, 0.7, 0.9, 0.9). Panel (a) shows the masses p i (s) and Panel (b) shows the correlation between Ṽ (s) and Ṽ (s ) (this is a nonstationary covariance). probabilities p1(s) p2(s) p3(s) p4(s) p5(s) p6(s) s s s 23
24 New Methodology Cardiac Anomaly Model: Y ij is a binary outcome, taking the value 1 when pregnancy i ends on birth defect j. p ij is the probability of that outcome. We use a probit link. Y i =(Y i1,...,y ij ) T, i =1,...,N; where 1 (p ij) =x T i j + Y ij p ij QX ind D+l X q=1 d=1+l Bern p ij z q (t i (d), s i ) j (B (s i ), d, q) j (B (s i ), d, q) : pollutant and defect specific, spatially and temporally-varying coe cients Represent the e ect of the concentration of air pollutant q at pregnancy week d (corresponding to calendar week t i (d)) at location s i within region B (s i ) on the probability of developing anomaly j for woman i. NC State University 22 / 44
25 New Methodology E ects grouped across defects and pollutants such that (B (s i ), d) =( 1 (B (s i ), d, 1),..., 1 (B (s i ), d, Q),..., J (B (s i ), d, Q)) T G ( (B(si ),d)) (.) = (B (s i ), d) G ind G ( (B(si ),d)) G KSBP MX p k (B (s i ), d) (B(si ),d) k (.) k=1 Y p k (B (s i ), d) =w k (B (s i ), d) V k (1 w j (B (s i ), d) V j ), j<k V i iid Beta (a, b), and a, b > 0. NC State University 23 / 44
26 New Methodology iid k MVN (0, ), k =1,...,M, where T k = (s 1, 1) T k,..., (s L, D)T k = s t s : spatial correlation matrix t : temporal correlation matrix : unstructured covariance matrix describing cross-correlations between anomaly groups and pollutants Di erent kernel functions available for the w k (B (s i ), d) parameters. Di erent flexible covariance structures induced with di erent kernel functions. Conjugancy for full conditionals allows use of Gibbs Sampler. NC State University 24 / 44
27 Finite Mixture Model Representation For finite # of mixture components (M< 1), the model can be expressed as: (B (s i ), d) = (B (s i ), d) g(b(si ),d) g (B (s i ), d) ind Categorical (p 1 (B (s i ), d),...,p M (B (s i ), d)) p k (B (s i ), d) parameters similarly defined as before NC State University 25 / 44
28 Induced Covariance Structure Conditional on p: cov ( (B (s), d), (B (s 0 ), d 0 ) p) = MX x exp{ s kb (s) B (s 0 )k t d d 0 } p k (B (s), d) p k (B (s 0 ), d 0 ). Conditionally nonstationary Allowing M = 1 and unconditionally: cov ( (B (s), d), (B (s 0 ), d 0 )) = x exp{ s kb (s) B (s 0 )k t d d 0 } apple x ((B (s), d), (B (s 0 ), d 0 )) 2 a + b +1 1 ((B (s), d), (B (s 0 ), d 0 )) a +1 k=1 Unconditionally stationary if function is stationary 1 Increased flexibility to model nonstationary and non-gaussian situations 2 Covariance structure depends on chosen kernel function NC State University 26 / 44
29 Kernel Function Examples Table: =( 1, 2, 3 )=(s 1, s 2, d), s =(s 1, s 2 ) Name w k (s, d) Model for kj (s, d), s 0, d 0 Q Uniform 3j=1 I j kj < kj Q 2 kj 3j=1 j 0! + j j I (j6=1) 1 j I (j6=1) ( Q Uniform 3j=1 I j kj < kj P3 j 0 ) j 2 kj Expo j I (j6=1) exp j=1 j I (j6=1) Q < 2 = < Exponential 3j=1 j kj exp : kj ; kj 2 j I (j6=1) /2 1 exp 23/2 : 2 j Q < 2 = Exponential 3j=1 j kj 2 exp : kj ; kj IG 1.5, j I (j6=1) /2 1 Q 3j=1 j j P 3j=1 j j = ; I (j6=1) A 2 j I (j6=1) NC State University 27 / 44
30 Simulation Study Comparing proposed model to other two competiting models Model 1: Multivariate spatial-temporal KSBP prior (Nonseparable/Nonstationary/NonGaussian) Model 2: Multivariate Gaussian separable/stationary spatial-temporal process prior (M=1) cov (B (s), d), B s 0, d 0 = x exp s B (s) B s 0 t d d 0. Model 3: Multiple probit regression model, assuming independence in space and time cov (B (s), d), B s 0, d 0 = if B (s) =B (s 0 ) and d = d 0 0 otherwise. Analyzed MSE and coverage probabilities of the estimated risk parameters NC State University 28 / 44
31 Results MSE Method 1 Method 2 Method Setting Figure: Setting 1: Independence in space-time, Setting 2: Separable/Stationary in space-time, Setting 3: Nonstationary, Setting 4: Nonstationary/Non-Gaussian, Setting 5: Nonseparable/Nonstationary/Non-Gaussian. S.E. =0.6 NC State University 29 / 44
32 Results Table: MSE estimates and confidence intervals. Method Mean LCL UCL Table: Coverage probability estimates and confidence intervals Method Mean LCL UCL NC State University 30 / 44
33 Simulation Study Conclusions Newly introduced prior has flexibility to produce similar results when the prior covariance structure and distribution of the risk e ects agrees with alternative model Outperforms the other models in terms of MSE in the more complicated (nonstationary/non-gaussian) data settings Never significantly outperformed by the competitor models due to flexibility NC State University 31 / 44
34 Model Application Results DIC values very similar for Models 1 and 2, significantly larger for Model 3 Classic DIC may not be appropriate in mixture model setting (Celeux et al., 2003) Model adequacy checks needed in this situation Significant covariate e ects from Model 1: ASD: Black vs. White (negative), > One Fetus vs. One Fetus (positive), Summer vs. Winter (positive) PAVD: Other vs. White (negative), One vs. No Previous Births (negative), Summer vs. Winter (positive) VSD: Black vs. White (negative), > One Fetus vs. One Fetus (positive), Summer vs. Winter (positive) NC State University 32 / 44
35 Model Adequacy Pearson Residual Discrepancy Measure (Dey and Chen, 2000): Observation level: D ij (y ij ; j, ) = (y ij p ij ( j, )) 2 p ij ( j, )(1 p ij ( j, )), Total (since model adequacy is of concern): D(y;, ) = NX JX D ij (y ij, ; i=1 j=1 j, ), Compare f (D(y obs ;, ) y obs )withf (D(y new ;, ) y obs ) Should be very similar if model fits well NC State University 33 / 44
36 Model Adequacy Observed Model 1 New Observed Model 2 New Observed Model 3 New Pro- (a) Model 1. posed model. (b) Model 2. Separable Stationary Gaussian Model. (c) Model 3. Gaussian indepent in space time. (d) Model 1. Model. Proposed (e) Model 2.Separable Stationary Gaussian Model. (f) Model 3. Gaussian indepent in space time. NC State University 34 / 44
37 Model Adequacy P D(yobs ;, ) NJ p 2NJ > K y obs P D(ynew ;, ) NJ p 2NJ > K y obs Model # K=3 K=4 K=5 K=6 K=3 K=4 K=5 K= P D(yobs ;, ) NJ p 2NJ > K y obs P D(ynew ;, ) NJ p 2NJ > K y obs Model # K=1 K=2 K=3 K=4 K=1 K=2 K=3 K= P D(yobs ;, ) NJ p 2NJ > K y obs P D(ynew ;, ) NJ p 2NJ > K y obs Model # K=15 K=17 K=19 K=21 K=15 K=17 K=19 K= MC errors ranged from to with an average value of NC State University 35 / 44
38 Results Nitrate Nitrate Nitrate Effect Effect Effect Week Week Week (g) Model 1. Proposed Model. (h) Model 2. Separable Stationary Gaussian Model. (i) Model 3. Gaussian indepent in space time. Figure: Brownsville, PAVD, nitrate. NC State University 36 / 44
39 Results EC EC EC Effect Effect Effect Week Week Week (a) Model 1. Proposed Model (b) Model 2. Separable Stationary Gaussian Model. (c) Model 3. Gaussian indepent in space time. Figure: Site 3: San Antonio, VSD, elemental carbon. NC State University 37 / 44
40 Results Nitrate Nitrate Nitrate Effect Effect Effect Week Week Week (a) Model 1. Proposed Model (b) Model 2. Separable Stationary Gaussian Model. (c) Model 3. Gaussian indepent in space time. Figure: Site 3: San Antonio subgroup, ASD, nitrate. NC State University 38 / 44
41 Results Nitrate Nitrate Nitrate Effect Effect Effect Week Week Week (a) Model 1. Proposed Model. (b) Model 2. Separable Stationary Gaussian Model. (c) Model 3. Gaussian indepent in space time. Figure: Site 2: Brownsville subgroup, ASD, nitrate. NC State University 39 / 44
42 Results OC OC OC Effect Effect Effect Week Week Week (a) Model 1. Proposed Model. (b) Model 2. Separable Stationary Gaussian Model. (c) Model 3. Gaussian indepent in space time. Figure: Site 1: Houston subgroup, ASD, organic carbon. NC State University 40 / 44
43 Model Application Conclusions Model adequacy results suggest that Model 1 provides the best fit of the data Model 2 misses a majority of the significant weeks that Model 1 detects The improved fit of the data for Model 1 and the similarity to the empirical results (Model 3) suggests that while Model 2 does have smaller variances, it is not characterizing the distribution properly and should not be used to make inference Model 3 is rejected based on model adequacy checks Main signal seen for weeks 3, 7, and 8 for multiple defects and pollutants across most locations NC State University 41 / 44
44 Conclusion/Discussion Simultaneously characterized the e ect of exposure to multiple pollutants on cardiac defect outcomes in a continuous manner throughout the pregnancy Introduced spatial models to account for the changing risks across space Critical time periods identified during the pregnancy in which increased pollution exposure negatively impacts the resulting birth outcomes Results further build the evidence supporting the link between air pollution and birth outcomes while extending our knowledge for the common congenital anomaly outcomes NC State University 42 / 44
45 Acknowledgments National Science Foundation (Fuentes DMS , DMS ) Environmental Protection Agency (Fuentes, R833863) National Institutes of Health (Fuentes, 5R01ES ) NC State University 43 / 44
46 Thank You NC State University 44 / 44
Jun Tu. Department of Geography and Anthropology Kennesaw State University
Examining Spatially Varying Relationships between Preterm Births and Ambient Air Pollution in Georgia using Geographically Weighted Logistic Regression Jun Tu Department of Geography and Anthropology Kennesaw
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationNonparametric Bayesian modeling for dynamic ordinal regression relationships
Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria
More informationBayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives. 1(w i = h + 1)β h + ɛ i,
Bayesian Hypothesis Testing in GLMs: One-Sided and Ordered Alternatives Often interest may focus on comparing a null hypothesis of no difference between groups to an ordered restricted alternative. For
More informationNon-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources
th International Conference on Information Fusion Chicago, Illinois, USA, July -8, Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources Priyadip Ray Department of Electrical
More informationBayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes
Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley 1 and Sudipto Banerjee 2 1 Department of Forestry & Department of Geography, Michigan
More informationBayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes
Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Alan Gelfand 1 and Andrew O. Finley 2 1 Department of Statistical Science, Duke University, Durham, North
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationBayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units
Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional
More informationBayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes
Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota,
More informationBayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes
Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley Department of Forestry & Department of Geography, Michigan State University, Lansing
More informationFaculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics
Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial
More informationStatistical methods for evaluating air pollution and temperature effects on human health
Statistical methods for evaluating air pollution and temperature effects on human health Ho Kim School of Public Health, Seoul National University ISES-ISEE 2010 Workshop August 28, 2010, COEX, Seoul,
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationA Process over all Stationary Covariance Kernels
A Process over all Stationary Covariance Kernels Andrew Gordon Wilson June 9, 0 Abstract I define a process over all stationary covariance kernels. I show how one might be able to perform inference that
More informationBayesian spatial quantile regression
Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationA Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles
A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More information1Department of Demography and Organization Studies, University of Texas at San Antonio, One UTSA Circle, San Antonio, TX
Well, it depends on where you're born: A practical application of geographically weighted regression to the study of infant mortality in the U.S. P. Johnelle Sparks and Corey S. Sparks 1 Introduction Infant
More informationStatistical integration of disparate information for spatially-resolved PM exposure estimation
Statistical integration of disparate information for spatially-resolved PM exposure estimation Chris Paciorek Department of Biostatistics May 4, 2006 www.biostat.harvard.edu/~paciorek L Y X - FoilT E X
More informationLecture 16-17: Bayesian Nonparametrics I. STAT 6474 Instructor: Hongxiao Zhu
Lecture 16-17: Bayesian Nonparametrics I STAT 6474 Instructor: Hongxiao Zhu Plan for today Why Bayesian Nonparametrics? Dirichlet Distribution and Dirichlet Processes. 2 Parameter and Patterns Reference:
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationBayesian Nonparametric Regression for Diabetes Deaths
Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,
More informationA spatial causal analysis of wildfire-contributed PM 2.5 using numerical model output. Brian Reich, NC State
A spatial causal analysis of wildfire-contributed PM 2.5 using numerical model output Brian Reich, NC State Workshop on Causal Adjustment in the Presence of Spatial Dependence June 11-13, 2018 Brian Reich,
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationMultivariate spatial nonparametric modelling. via kernel processes mixing
Multivariate spatial nonparametric modelling via kernel processes mixing Montserrat Fuentes and Brian Reich SUMMARY In this paper we develop a nonparametric multivariate spatial model that avoids specifying
More informationPlausible Values for Latent Variables Using Mplus
Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationHierarchical Modeling for Multivariate Spatial Data
Hierarchical Modeling for Multivariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department
More informationBrent Coull, Petros Koutrakis, Joel Schwartz, Itai Kloog, Antonella Zanobetti, Joseph Antonelli, Ander Wilson, Jeremiah Zhu.
Harvard ACE Project 2: Air Pollutant Mixtures in Eastern Massachusetts: Spatial Multi-resolution Analysis of Trends, Effects of Modifiable Factors, Climate, and Particle-induced Mortality Brent Coull,
More informationOutline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution
Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model
More informationCausal Modeling in Environmental Epidemiology. Joel Schwartz Harvard University
Causal Modeling in Environmental Epidemiology Joel Schwartz Harvard University When I was Young What do I mean by Causal Modeling? What would have happened if the population had been exposed to a instead
More informationSpatio-Temporal Threshold Models for Relating UV Exposures and Skin Cancer in the Central United States
Spatio-Temporal Threshold Models for Relating UV Exposures and Skin Cancer in the Central United States Laura A. Hatfield and Bradley P. Carlin Division of Biostatistics School of Public Health University
More informationNonparametric Bayes Uncertainty Quantification
Nonparametric Bayes Uncertainty Quantification David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & ONR Review of Bayes Intro to Nonparametric Bayes
More informationDisease mapping with Gaussian processes
EUROHEIS2 Kuopio, Finland 17-18 August 2010 Aki Vehtari (former Helsinki University of Technology) Department of Biomedical Engineering and Computational Science (BECS) Acknowledgments Researchers - Jarno
More informationFusing space-time data under measurement error for computer model output
for computer model output (vjb2@stat.duke.edu) SAMSI joint work with Alan E. Gelfand and David M. Holland Introduction In many environmental disciplines data come from two sources: monitoring networks
More informationLecture 16: Mixtures of Generalized Linear Models
Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 2-Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More informationMultivariate spatial modeling
Multivariate spatial modeling Point-referenced spatial data often come as multivariate measurements at each location Chapter 7: Multivariate Spatial Modeling p. 1/21 Multivariate spatial modeling Point-referenced
More informationDoes low participation in cohort studies induce bias? Additional material
Does low participation in cohort studies induce bias? Additional material Content: Page 1: A heuristic proof of the formula for the asymptotic standard error Page 2-3: A description of the simulation study
More informationBayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London
Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline
More informationFlexible Regression Modeling using Bayesian Nonparametric Mixtures
Flexible Regression Modeling using Bayesian Nonparametric Mixtures Athanasios Kottas Department of Applied Mathematics and Statistics University of California, Santa Cruz Department of Statistics Brigham
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationPh.D. course: Regression models. Introduction. 19 April 2012
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable
More informationGibbs Sampling in Endogenous Variables Models
Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationBayesian Analysis of Latent Variable Models using Mplus
Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are
More informationCombining Incompatible Spatial Data
Combining Incompatible Spatial Data Carol A. Gotway Crawford Office of Workforce and Career Development Centers for Disease Control and Prevention Invited for Quantitative Methods in Defense and National
More informationSTAT Advanced Bayesian Inference
1 / 32 STAT 625 - Advanced Bayesian Inference Meng Li Department of Statistics Jan 23, 218 The Dirichlet distribution 2 / 32 θ Dirichlet(a 1,...,a k ) with density p(θ 1,θ 2,...,θ k ) = k j=1 Γ(a j) Γ(
More informationA Spatio-Temporal Downscaler for Output From Numerical Models
Supplementary materials for this article are available at 10.1007/s13253-009-0004-z. A Spatio-Temporal Downscaler for Output From Numerical Models Veronica J. BERROCAL,AlanE.GELFAND, and David M. HOLLAND
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationPh.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status
Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 25 April 2013 www.biostat.ku.dk/~pka/regrmodels13 Per Kragh Andersen Regression models The distribution of one outcome variable
More informationHierarchical Modelling for Multivariate Spatial Data
Hierarchical Modelling for Multivariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Point-referenced spatial data often come as
More informationComparison of multiple imputation methods for systematically and sporadically missing multilevel data
Comparison of multiple imputation methods for systematically and sporadically missing multilevel data V. Audigier, I. White, S. Jolani, T. Debray, M. Quartagno, J. Carpenter, S. van Buuren, M. Resche-Rigon
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationPredicting Long-term Exposures for Health Effect Studies
Predicting Long-term Exposures for Health Effect Studies Lianne Sheppard Adam A. Szpiro, Johan Lindström, Paul D. Sampson and the MESA Air team University of Washington CMAS Special Session, October 13,
More informationECONOMETRICS FIELD EXAM Michigan State University May 9, 2008
ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationNonparametric Bayes tensor factorizations for big data
Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional
More informationModels for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data
Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise
More informationExtended Follow-Up and Spatial Analysis of the American Cancer Society Study Linking Particulate Air Pollution and Mortality
Extended Follow-Up and Spatial Analysis of the American Cancer Society Study Linking Particulate Air Pollution and Mortality Daniel Krewski, Michael Jerrett, Richard T Burnett, Renjun Ma, Edward Hughes,
More informationBayesian Nonparametric Inference Methods for Mean Residual Life Functions
Bayesian Nonparametric Inference Methods for Mean Residual Life Functions Valerie Poynor Department of Applied Mathematics and Statistics, University of California, Santa Cruz April 28, 212 1/3 Outline
More informationObnoxious lateness humor
Obnoxious lateness humor 1 Using Bayesian Model Averaging For Addressing Model Uncertainty in Environmental Risk Assessment Louise Ryan and Melissa Whitney Department of Biostatistics Harvard School of
More informationPhysician Performance Assessment / Spatial Inference of Pollutant Concentrations
Physician Performance Assessment / Spatial Inference of Pollutant Concentrations Dawn Woodard Operations Research & Information Engineering Cornell University Johns Hopkins Dept. of Biostatistics, April
More informationmultilevel modeling: concepts, applications and interpretations
multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models
More informationBayesian data analysis in practice: Three simple examples
Bayesian data analysis in practice: Three simple examples Martin P. Tingley Introduction These notes cover three examples I presented at Climatea on 5 October 0. Matlab code is available by request to
More informationSpatial Normalized Gamma Process
Spatial Normalized Gamma Process Vinayak Rao Yee Whye Teh Presented at NIPS 2009 Discussion and Slides by Eric Wang June 23, 2010 Outline Introduction Motivation The Gamma Process Spatial Normalized Gamma
More informationof the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated
Spatial Trends and Spatial Extremes in South Korean Ozone Seokhoon Yun University of Suwon, Department of Applied Statistics Suwon, Kyonggi-do 445-74 South Korea syun@mail.suwon.ac.kr Richard L. Smith
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationLocal Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina
Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area
More informationStatistical Models for Monitoring and Regulating Ground-level Ozone. Abstract
Statistical Models for Monitoring and Regulating Ground-level Ozone Eric Gilleland 1 and Douglas Nychka 2 Abstract The application of statistical techniques to environmental problems often involves a tradeoff
More informationSpatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter
Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Chris Paciorek Department of Biostatistics Harvard School of Public Health application joint
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More informationDependence structures with applications to actuarial science
with applications to actuarial science Department of Statistics, ITAM, Mexico Recent Advances in Actuarial Mathematics, OAX, MEX October 26, 2015 Contents Order-1 process Application in survival analysis
More informationCentering Predictor and Mediator Variables in Multilevel and Time-Series Models
Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén Part 2 May 7, 2018 Tihomir Asparouhov and Bengt Muthén Part 2 Muthén & Muthén 1/ 42 Overview
More informationAnalysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities
More informationThe Joys of Geocoding (from a Spatial Statistician s Perspective)
The Joys of Geocoding (from a Spatial Statistician s Perspective) Dale L. Zimmerman University of Iowa October 21, 2010 2 Geocoding context Applications of spatial statistics to public health and social
More informationLecture 3: Statistical Decision Theory (Part II)
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical
More informationWhy experimenters should not randomize, and what they should do instead
Why experimenters should not randomize, and what they should do instead Maximilian Kasy Department of Economics, Harvard University Maximilian Kasy (Harvard) Experimental design 1 / 42 project STAR Introduction
More informationESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS
ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,
More informationBayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling
Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationNear/Far Matching. Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants
Near/Far Matching Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants Joint research: Mike Baiocchi, Dylan Small, Scott Lorch and Paul Rosenbaum What this talk
More informationNuoo-Ting (Jassy) Molitor, Nicky Best, Chris Jackson and Sylvia Richardson Imperial College UK. September 30, 2008
Using Bayesian graphical models to model biases in observational studies and to combine multiple data sources: Application to low birth-weight and water disinfection by-products Nuoo-Ting (Jassy) Molitor,
More informationBeyond MCMC in fitting complex Bayesian models: The INLA method
Beyond MCMC in fitting complex Bayesian models: The INLA method Valeska Andreozzi Centre of Statistics and Applications of Lisbon University (valeska.andreozzi at fc.ul.pt) European Congress of Epidemiology
More informationA general mixed model approach for spatio-temporal regression data
A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression
More informationIntroduction to Geostatistics
Introduction to Geostatistics Abhi Datta 1, Sudipto Banerjee 2 and Andrew O. Finley 3 July 31, 2017 1 Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore,
More informationBayesian Nonparametric Meta-Analysis Model George Karabatsos University of Illinois-Chicago (UIC)
Bayesian Nonparametric Meta-Analysis Model George Karabatsos University of Illinois-Chicago (UIC) Collaborators: Elizabeth Talbott, UIC. Stephen Walker, UT-Austin. August 9, 5, 4:5-4:45pm JSM 5 Meeting,
More informationBayesian Hierarchical Models
Bayesian Hierarchical Models Gavin Shaddick, Millie Green, Matthew Thomas University of Bath 6 th - 9 th December 2016 1/ 34 APPLICATIONS OF BAYESIAN HIERARCHICAL MODELS 2/ 34 OUTLINE Spatial epidemiology
More informationFusing point and areal level space-time data. data with application to wet deposition
Fusing point and areal level space-time data with application to wet deposition Alan Gelfand Duke University Joint work with Sujit Sahu and David Holland Chemical Deposition Combustion of fossil fuel produces
More informationComparing groups using predicted probabilities
Comparing groups using predicted probabilities J. Scott Long Indiana University May 9, 2006 MAPSS - May 9, 2006 - Page 1 The problem Allison (1999): Di erences in the estimated coe cients tell us nothing
More informationDirichlet process Bayesian clustering with the R package PReMiuM
Dirichlet process Bayesian clustering with the R package PReMiuM Dr Silvia Liverani Brunel University London July 2015 Silvia Liverani (Brunel University London) Profile Regression 1 / 18 Outline Motivation
More informationModern regression and Mortality
Modern regression and Mortality Doug Nychka National Center for Atmospheric Research www.cgd.ucar.edu/~nychka Statistical models GLM models Flexible regression Combining GLM and splines. Statistical tools
More informationARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2
ARIC Manuscript Proposal # 1186 PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 1.a. Full Title: Comparing Methods of Incorporating Spatial Correlation in
More informationOnline Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access
Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci
More information