Bayesian Methods for Highly Correlated Exposures: An Application to Disinfection By-Products and Spontaneous Abortion


Bayesian Methods for Highly Correlated Exposures: An Application to Disinfection By-Products and Spontaneous Abortion. November 8, 2007

Outline
1. Introduction
2. Models
3. Simulations
4. Future/Current Research

Spontaneous Abortion (SAB)
- Pregnancy loss prior to 20 weeks' gestation
- Very common (> 30% of all pregnancies)
- Relatively little is known about its causes: maternal age, smoking, prior pregnancy loss, occupational exposures, caffeine... disinfection by-products (DBPs)?

Disinfection By-Products
- A vast array of DBPs are formed in the disinfection process
- We focus on 2 main types:
  - Trihalomethanes (THMs): CHCl3, CHBr3, CHCl2Br, CHClBr2
  - Haloacetic acids (HAAs): ClAA, Cl2AA, Cl3AA, BrAA, Br2AA, Br3AA, BrClAA, Br2ClAA, BrCl2AA

DBPs and SABs
- Early studies noted an increased risk of SAB with increased tap-water consumption
- More recent studies found an increased risk of SAB with exposure to THMs; notably, CHBrCl2 in Waller et al. (1998), OR = 2.0 (1.2, 3.5)

Specific Aim
- To estimate the effect of each of the 13 constituent DBPs (4 THMs and 9 HAAs) on SAB
The Problem
- DBPs are very highly correlated, e.g., ρ = 0.91 between Cl2AA and Cl3AA

RFTS, briefly
- 2507 women enrolled in three metropolitan areas in the U.S., 2001-2004
- Recruitment: prenatal care practices (52%), health departments (32%), promotional mailings (3%), drug stores, referrals, etc. (13%)

Eligibility criteria
- Age 18 years or older
- Lived in an area served by one of the water utilities
- Not using assisted reproductive technology
- Positive pregnancy test
- Intended to carry to term
- Intended to remain in the area

Data Collection
- Baseline interview: demographic information, medical history, other confounders
- Pregnancy loss: self-report or chart abstraction
- DBP concentration: disinfecting utilities; weekly samples at two sites with high DBPs, every other week at a third site with low DBPs

Outline: Models
- Model P1 (semi-Bayes)
- Model P2 (fully Bayes)
- Dirichlet process prior model (SP1)
- DPP with selection component (SP2)

Preliminary Analysis
- Discrete-time hazard model including all 13 DBPs
- Time to event: gestational weeks until loss
- DBP concentrations were measured weekly and included as time-varying covariates
- Allow for non-linear relationships (crudely) by categorizing DBPs (now 32 coefficients)

Hazard Model

logit{Pr(T_i = j | T_i >= j, .)} = alpha_j + gamma_1 z_1i + ... + gamma_p z_pi + beta_1 x_1ij + ... + beta_32 x_32ij

where
- the alpha_j are week-specific intercepts (weeks 5...20)
- z_1i ... z_pi are confounders: smoking, alcohol use, ethnicity, maternal age
- x_kij is the concentration of the k-th DBP category for the i-th individual in the j-th week
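A discrete-time hazard model like the one above is typically fit as a logistic regression on "person-week" records: each pregnancy contributes one row per week at risk, with the outcome equal to 1 only in the week of loss. A minimal sketch of that expansion (the function name and the toy values are illustrative, not from the talk):

```python
import numpy as np

def person_period(last_week, event, first_week=5):
    """Expand one pregnancy into person-week records for a
    discrete-time hazard model: one row per week at risk, with
    outcome 1 only in the week the loss occurred."""
    weeks = np.arange(first_week, last_week + 1)
    y = np.zeros_like(weeks)
    if event:
        y[-1] = 1  # loss occurred in the final observed week
    return weeks, y

# a pregnancy lost in week 8 is at risk in weeks 5, 6, 7, 8
weeks, y = person_period(8, event=True)
print(weeks.tolist())  # [5, 6, 7, 8]
print(y.tolist())      # [0, 0, 0, 1]
```

Fitting a logistic regression to the stacked person-week rows, with week indicators for the alpha_j, recovers the model above; time-varying DBP concentrations simply enter as week-specific covariate values on each row.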

Results of frequentist analysis
- Several large but imprecise effects are seen
- 4 of 32 coefficients are statistically significant
- Imprecision makes us question the results
- Is there a better analytic approach?

Other common options
- Try all exposures in one model. Problem: unreliable estimates
- Combine variables into aggregate scores. Problem: difficult to interpret, can mask effects
- Analyze one variable at a time. Problem: uncontrolled confounding

Alternative Approaches
- Bayesian parametric models: semi-Bayes (model P1), fully Bayes (model P2)
- Bayesian semi-parametric models: Dirichlet process priors (model SP1), Dirichlet process models with a selection component (model SP2)

Model P1
- A simple two-level hierarchical model popularized by Greenland
- Has seen use in nutritional, genetic, occupational, and cancer epidemiology
- Despite the name, these are Bayesian models; the "semi-Bayes" name may refer to the asymptotic methods commonly used in fitting them

Model
- y_i ~ N(x_i' beta, sigma^2)
- beta_j ~ N(beta_0, phi^2)

Posterior: beta ~ N(E^, V^), where
- V^ = (X'X/sigma^2 + I/phi^2)^-1
- E^ = V^ (X'y/sigma^2 + I beta_0/phi^2)

y_i could be dichotomous, in which case we could use a data augmentation scheme to impute a normal continuous latent variable y_i*.
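The posterior above is available in closed form, so the semi-Bayes fit is a few lines of linear algebra. A sketch with simulated data (the toy design, with two deliberately collinear exposures, and the value phi^2 = 0.31 echo the talk's setting but are otherwise illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: two highly correlated exposures, as in the DBP problem
n, p = 200, 2
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=n)])
beta_true = np.array([0.5, 0.0])
sigma2 = 1.0
y = X @ beta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

# semi-Bayes posterior under the prior beta_j ~ N(0, phi2)
phi2 = 0.31
V = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / phi2)
E = V @ (X.T @ y / sigma2)  # prior mean beta_0 = 0 drops out

print(E)           # shrunken coefficient estimates
print(np.diag(V))  # posterior variances, smaller than OLS variances
```

With beta_0 = 0 this is exactly ridge regression with penalty sigma^2/phi^2, which is why the correlated coefficients are stabilized.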

Shrinkage
- Bayesian view: a natural consequence of combining the prior with the data
- Frequentist view: introduce bias to reduce MSE (biased but more precise)
- The amount of shrinkage depends on the prior variance

Shrinkage in model P1 (figure)

Problems with model P1: hypothesis testing
- Semi-Bayes models have been advocated as a way to reduce problems of multiple comparisons
- Unfortunately, the reduction in type I error rate is typically small

Type I error rate with model P1 (figure)

More problems with model P1
- Assumes the prior variance, phi^2, is known with certainty, giving constant shrinkage of all coefficients
- Sensitivity analyses address how results change with different prior variances
- But the data themselves contain information on the prior variance

Model P2
- Places a prior distribution on phi^2, reducing dependence on the prior variance

Model specification:
- y_i ~ N(x_i' beta, sigma^2)
- beta ~ N(beta_0, phi^2 I)
- phi^2 ~ IG(alpha_1, alpha_2)

We could place a prior on mu in some instances as well.

What's an inverse-gamma distribution? (figure)

Properties of model P2
- The prior distribution on phi^2 allows it to be updated by the data
- As the variability of estimates around the prior mean increases, so does phi^2
- As the variability of estimates around the prior mean decreases, so does phi^2
- Result: adaptive shrinkage of all coefficients

Model specification:
- y_i ~ N(x_i' beta, sigma^2)
- beta ~ N(beta_0, phi^2 I)
- phi^2 ~ IG(alpha_1, alpha_2)

Conditional posteriors:
- beta | y, phi^2 ~ N(E^, V^)
- phi^2 | beta ~ IG(alpha_1 + p/2, alpha_2 + (beta - beta_0)'(beta - beta_0)/2)

where V^ = (X'X/sigma^2 + I/phi^2)^-1 and E^ = V^ (X'y/sigma^2 + I beta_0/phi^2)
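Because both conditionals above are standard distributions, model P2 can be fit with a simple two-block Gibbs sampler. A sketch assuming beta_0 = 0 and known sigma^2 (the function name is hypothetical; the hyperparameter defaults mirror the talk's alpha_1 = 3.39, alpha_2 = 1.33):

```python
import numpy as np

def gibbs_p2(X, y, sigma2=1.0, a1=3.39, a2=1.33, iters=2000, seed=0):
    """Gibbs sampler sketch for model P2 with prior mean beta_0 = 0:
    alternate beta | phi2 (multivariate normal) and
    phi2 | beta (inverse-gamma)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    phi2 = 1.0  # arbitrary starting value
    draws = np.empty((iters, p))
    for t in range(iters):
        # beta | y, phi2 ~ N(E, V)
        V = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / phi2)
        E = V @ (X.T @ y / sigma2)
        beta = rng.multivariate_normal(E, V)
        # phi2 | beta ~ IG(a1 + p/2, a2 + beta'beta/2),
        # drawn as the reciprocal of a gamma variate
        phi2 = 1.0 / rng.gamma(a1 + p / 2, 1.0 / (a2 + beta @ beta / 2))
        draws[t] = beta
    return draws
```

The adaptive-shrinkage behavior described on the previous slide is visible in the chain: widely dispersed beta draws push phi^2 up, loosening the shrinkage on the next sweep, and vice versa.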

Adaptive shrinkage of model P2

Model | Prior variance phi^2 | Updated by data? | Shrinkage
P1    | Fixed                | No               | Constant
P2    | Random               | Yes              | Adaptive

The problem with model P2
- How sure are we of our parametric specification of the prior?
- Can we do better by grouping coefficients into clusters and then shrinking the cluster-specific coefficients separately?
- The amount of shrinkage would then vary by coefficient

Clustering coefficients (figure)

DPP: Model SP1
- A popular Bayesian non-parametric approach
- Rather than specifying beta_j ~ N(mu, phi^2), we specify beta_j ~ D
- D is an unknown distribution, so D needs a prior distribution: D ~ DP(lambda, D_0)
- D_0 is a base distribution, such as N(mu, phi^2)
- lambda is a precision parameter; as lambda gets large, D converges to D_0
- Formally: take a sample space Omega with the support of D_0 and chop Omega into disjoint Borel sets B_1, B_2, ..., B_J. Then specifying D ~ DP(lambda D_0) implies (D(B_1), D(B_2), ..., D(B_J)) ~ Dirichlet(lambda D_0(B_1), lambda D_0(B_2), ..., lambda D_0(B_J))
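One concrete way to see what a draw D from a DP looks like is the stick-breaking construction: D is discrete, with atoms drawn from D_0 and weights formed by breaking a unit stick with Beta(1, lambda) fractions. A truncated sketch, assuming D_0 = N(0, 1) for illustration:

```python
import numpy as np

def dp_sample(lam, n_atoms=500, seed=0):
    """Truncated stick-breaking draw from DP(lam, D0) with
    base distribution D0 = N(0, 1).  Returns atom locations
    and their weights."""
    rng = np.random.default_rng(seed)
    v = rng.beta(1.0, lam, size=n_atoms)  # stick-breaking fractions
    # w_k = v_k * prod_{i<k} (1 - v_i)
    w = v * np.concatenate([[1.0], np.cumprod(1 - v)[:-1]])
    atoms = rng.normal(0.0, 1.0, size=n_atoms)  # draws from D0
    return atoms, w
```

Small lambda puts most of the stick on a few atoms (D far from D_0, few clusters of coefficients); large lambda spreads the weight thinly, so D resembles D_0, matching the convergence property on the slide.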

Random sample from a DP (figure)

Dirichlet process priors

beta_j ~ D, D ~ DP(lambda, D_0), D_0 = N(mu, phi^2)

This prior implies:

beta_j | beta_(-j) ~ [lambda/(lambda + k - 1)] D_0 + [1/(lambda + k - 1)] sum_{i != j} delta_{beta_i}

- beta_j has positive probability of being clustered with any other coefficient
- Expected number of clusters (asymptotically): lambda log(1 + n/lambda)
- Larger values of lambda indicate more certainty about the distribution of beta_j (more clusters)
- Smaller values of lambda indicate less certainty about the distribution of beta_j (fewer clusters)
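The conditional prior above is exactly the Chinese-restaurant-process urn scheme, and the cluster-count formula can be checked by simulation. A sketch (function name and the values lambda = 2, n = 200 are illustrative):

```python
import math
import numpy as np

def crp_n_clusters(lam, n, seed=0):
    """Simulate cluster assignments under the Polya-urn scheme for
    DP(lam, D0) and count the occupied clusters."""
    rng = np.random.default_rng(seed)
    sizes = []  # current cluster sizes
    for i in range(n):
        # join cluster k with prob prop. to its size, or a new one
        # with prob proportional to lam
        probs = np.array(sizes + [lam], dtype=float)
        k = rng.choice(len(probs), p=probs / probs.sum())
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
    return len(sizes)

lam, n = 2.0, 200
sims = [crp_n_clusters(lam, n, seed=s) for s in range(200)]
approx = lam * math.log(1 + n / lam)  # asymptotic formula, about 9.2
print(np.mean(sims))
```

The simulated average cluster count sits close to lambda log(1 + n/lambda), illustrating how lambda controls how many distinct coefficient values the prior expects.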

Dirichlet process prior (figure)

Clustering
- Consider two parameters, beta_m and beta_n, that are equal with probability p_mn
- If beta_m = beta_n (p_mn = 1), we can use both x_im and x_in to estimate the common parameter beta_mn: we have twice as much data to estimate the parameter of interest
- More commonly p_mn < 1, so beta_m adds some information when estimating beta_n (and vice versa)
- Part of the reason this method performs so well is that deciding the probability of clustering is very cheap, but the payoff is potentially huge (think of degrees of freedom)

Posterior computation

y_i ~ N(alpha + x_i' beta, sigma^2); beta_j ~ D; D ~ DP(lambda D_0); D_0 = N(mu, phi^2)

Conditional posterior:

beta_j | beta_(-j), y ~ w_0j D_j0 + sum_{k != j} w_kj delta_{beta_k}

where
- w_0j prop. to lambda N(beta_j | mu, phi^2) N(y* | X_j beta_j, sigma^2); w_kj prop. to N(y* | X_j beta_k, sigma^2)
- D_j0 prop. to N(y* | X_j beta_j, sigma^2) N(beta_j | mu, phi^2)
- y*_i = y_i - alpha - x_i^(-j)' beta^(-j) (the partial residual with the j-th term removed)

Model SP2
- A minor modification to the Dirichlet process prior model
- We may desire a more parsimonious model
- If some DBPs have no effect, we would prefer to eliminate them from the model
- Forward/backward selection results in inappropriate confidence intervals

We incorporate a selection model in the Dirichlet process base distribution:

D_0 = pi delta_0 + (1 - pi) N(mu, phi^2)

- pi is the probability that a coefficient has no effect
- (1 - pi) is the probability that it is drawn from N(mu, phi^2)

- A coefficient is equal to zero (no effect) with probability pi
- A priori, we expect this to happen (pi x 100)% of the time
- We place a prior distribution on pi to allow the data to guide inference
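Drawing from the spike-and-slab base distribution is straightforward; a sketch (the function name is hypothetical, and pi = 0.5 and phi^2 = 0.31 mirror the prior choices used later in the talk):

```python
import numpy as np

def draw_base(n, pi=0.5, mu=0.0, phi2=0.31, seed=0):
    """Draw n coefficients from the SP2 base distribution
    D0 = pi * delta_0 + (1 - pi) * N(mu, phi2)."""
    rng = np.random.default_rng(seed)
    null = rng.random(n) < pi                      # spike at zero
    beta = rng.normal(mu, np.sqrt(phi2), size=n)   # slab draws
    beta[null] = 0.0
    return beta

beta = draw_base(10000)
print((beta == 0).mean())  # close to 0.5, the prior null probability
```

Because the spike places positive mass exactly at zero, coefficients can cluster at the null under the DP, which is what delivers the variable-selection behavior.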

DPP with variable selection (figure)

Simulations: Hierarchical Models and RFTS

Simulations
- Four hierarchical models: how do they compare?
- The increased complexity of these hierarchical models seems to make sense, but what does it gain us?
- Simulated datasets of size n = 500

MSE of hierarchical models (figure)

Model P1

logit{Pr(T_i = j | T_i >= j, .)} = alpha_j + beta_1 x_1i + ... + beta_k x_ki, with beta_j ~ N(mu, phi^2)

- Little prior evidence of effect: specify mu = 0
- Calculate phi^2 from the existing literature; the largest observed effect is OR = 3.0
- phi = (ln(3) - ln(1/3))/(2 x 1.96) = 0.5605, so phi^2 = 0.3142
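The prior variance is chosen so that the interval mu +/- 1.96 phi spans the most extreme effects seen in the literature (odds ratios from 1/3 to 3). The arithmetic checks out:

```python
import math

# width of the log-OR range (ln 3 to ln 1/3) divided by the width
# of a 95% normal interval in standard-deviation units (2 * 1.96)
phi = (math.log(3) - math.log(1 / 3)) / (2 * 1.96)
print(round(phi, 4))       # 0.5605
print(round(phi ** 2, 4))  # 0.3142
```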

Model P1: simulation results (figure)

Model P2

logit{Pr(T_i = j | T_i >= j, .)} = alpha_j + beta_1 x_1i + ... + beta_k x_ki, with beta_j ~ N(mu, phi^2), phi^2 ~ IG(alpha_1, alpha_2)

- mu = 0; phi^2 is random: choose alpha_1 = 3.39, alpha_2 = 1.33
- E(phi^2) = 0.31 (as in model P1), V(phi^2) = 0.07
- At the 95th percentile of phi^2, 95% of the betas fall between OR = 6 and OR = 1/6... the most extreme results we believe possible

Model P2: simulation results (figure)

Model SP1

logit{Pr(T_i = j | T_i >= j, .)} = alpha_j + beta_1 x_1i + ... + beta_k x_ki
beta_j ~ D, D ~ DP(lambda, D_0), D_0 = N(mu, phi^2), phi^2 ~ IG(alpha_1, alpha_2), lambda ~ G(nu_1, nu_2)

- mu = 0, alpha_1 = 3.39, alpha_2 = 1.33
- nu_1 = 1, nu_2 = 1, an uninformative prior for lambda

Model SP1: simulation results (figure)

Model SP2

logit{Pr(T_i = j | T_i >= j, .)} = alpha_j + beta_1 x_1i + ... + beta_k x_ki
beta_j ~ D, D ~ DP(lambda, D_0), D_0 = pi delta_0 + (1 - pi) N(mu, phi^2), phi^2 ~ IG(alpha_1, alpha_2), lambda ~ G(nu_1, nu_2), pi ~ Beta(omega_1, omega_2)

- mu = 0, alpha_1 = 3.39, alpha_2 = 1.33, nu_1 = 1, nu_2 = 1
- omega_1 = 1.5, omega_2 = 1.5, so E(pi) = 0.5 and 95% CI = (0.01, 0.99)

Model SP2: simulation results (figure)

Future/Current Research

Hierarchical models
- Semi-Bayes: assumes beta is random
- Fully Bayes: assumes phi^2 is random
- Dirichlet process: assumes the prior distribution is random
- Dirichlet process with selection component: assumes the prior distribution is random and allows coefficients to cluster at the null
- Performance (MSE) can improve with increasing complexity

Trade-offs
- There is a trade-off between performance and difficulty
- Semi- and fully Bayes models are easily implemented in WinBUGS (exact distributions)
- Dirichlet process priors require more programming skill, but often achieve better MSE

DBPs and SAB
- Model P1 provided the least shrinkage; the Dirichlet process models, the most
- These results are in contrast to previous research (sort of)
- Very little evidence of an effect of any constituent DBP on SAB

Dimension reduction in genomics
- A big problem in genotyping research: P >> N
- Typical frequentist models won't work in this situation
- Analyses generally rely on an FDR approach, trading off the per-comparison error rate against the family-wise error rate... but you still need a model to generate your p-values
- We cluster SNP effects to reduce dimension and use a double-exponential prior in our base distribution

DP for dimension reduction (figure)

Non-parametric functional data analysis
- Functional data are common in epidemiology
- Use a DP prior to fit flexible random curves to data
- Individual curves may be clustered over all or part of the curve to aid inference and prediction
- We use a hierarchical DP prior to allow global or local clustering of curves

Functional data analysis (figure)

The End