A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data
1 A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data
Alexina Mason, Sylvia Richardson and Nicky Best
Department of Epidemiology and Biostatistics, Imperial College London, UK
5th Annual Bayesian Biostatistics Conference, January 23-25,
2 Outline
- Motivation
- Simulations
  - general setup
  - non-hierarchical linear
  - hierarchical linear
  - v-shaped informative missingness
- Application
3 Missing covariate problems
The common problem of how to analyse datasets with incomplete covariates can arise
- directly
- indirectly: missing covariate problems in disguise
Motivating example: reframes the problem of unmeasured confounding as a non-standard missing data problem
- the primary data source lacks important confounders
- information on these unmeasured confounders is available from a supplementary data source
- the two data sources are matched
Combining datasets in this way can lead to extreme amounts of missing data
4 Motivating example: water disinfection by-products and risk of low birth weight
Objective: estimate the association between trihalomethane concentrations and the risk of full-term low birth weight (<2.5kg)
Primary data: 8969 birth records between 2000 and 2001 from the Hospital Episode Statistics (HES) database
- linked to estimated trihalomethane water concentrations
- data on mother's age, baby gender and an index of deprivation
- but no data on maternal smoking and ethnicity
Supplementary data: survey information from the Millennium Cohort Study (MCS)
- contains detailed information on smoking and ethnicity
- 824 cohort births matched to primary data
Over 90% of smoking and ethnicity data are missing
5 Imputation
One generally recommended approach to analysing data with incomplete covariates is to
1. create multiple completed datasets by imputing the missing values (requires an imputation model)
2. analyse the completed datasets (requires an analysis model)
However, this approach can take different forms, e.g.
- one-stage strategy: fit imputation and analysis models simultaneously
- two-stage strategy: create imputations first, then carry out analysis
Motivating question: how should we advise a practitioner?
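The two-stage strategy can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in the talk. It assumes one continuous covariate u with missing values (coded as NaN), a fully observed covariate x, a linear analysis model, and a normal linear imputation model for u given x and y. The function names `ols` and `two_stage_mi` are hypothetical. The results are pooled with Rubin's rules, a standard choice that the slides do not spell out. For brevity the imputation-model parameters are held at their estimates instead of being drawn from their posterior, so the sketch understates imputation uncertainty.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their estimated covariance matrix."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * xtx_inv

def two_stage_mi(x, u, y, n_imputations=20, seed=0):
    """Impute missing u (NaN), fit y ~ 1 + x + u on each completed dataset, pool."""
    rng = np.random.default_rng(seed)
    miss = np.isnan(u)

    # Stage 1: imputation model u ~ 1 + x + y, fitted to the observed cases.
    Z = np.column_stack([np.ones_like(x), x, y])
    gamma, _ = ols(Z[~miss], u[~miss])
    resid = u[~miss] - Z[~miss] @ gamma
    sd = np.sqrt(resid @ resid / (resid.size - Z.shape[1]))

    estimates, variances = [], []
    for _ in range(n_imputations):
        u_completed = u.copy()
        # Draw the missing values from the fitted predictive distribution
        # (ASSUMPTION: parameters fixed at their estimates, for brevity).
        u_completed[miss] = Z[miss] @ gamma + rng.normal(0.0, sd, miss.sum())

        # Stage 2: analysis model fitted to the completed dataset.
        X = np.column_stack([np.ones_like(x), x, u_completed])
        beta, cov = ols(X, y)
        estimates.append(beta)
        variances.append(np.diag(cov))

    estimates = np.array(estimates)
    variances = np.array(variances)

    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    pooled = estimates.mean(axis=0)
    within = variances.mean(axis=0)
    between = estimates.var(axis=0, ddof=1)
    total = within + (1.0 + 1.0 / n_imputations) * between
    return pooled, np.sqrt(total)
```

Note that information flows one way here: the analysis model never informs the imputations. A one-stage (fully Bayesian) fit would instead estimate both models jointly.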
6 Multiple Imputation (MI) spectrum
In order of increasing approximation to FBM:
1. Fully Bayesian Model (FBM)
2. Bayesian MI (feedforward only model): feedback from analysis model to imputation model cut
3. Standard MI with joint multivariate distribution: averages over small number of draws
4. Standard MI with chained equations (MICE): series of univariate conditional distributions
FBM and MICE can be considered the extremes of the spectrum
11 Pros and cons of FBM and MICE
FBM - pros
- theoretically sound
- coherent model estimation
- uncertainty fully propagated
- can add missingness model to explore informative missingness
FBM - cons
- implementation can be challenging
MICE - pros
- range of readily available packages
- speed
MICE - cons
- conditional distributions may not correspond to a joint distribution
- difficult to explore informative missingness
But how do FBM and MICE actually perform in practice? We investigate with some simulations
12 General setup of simulations
Generate 1000 simulated datasets with
- 2 correlated explanatory variables, x and u
- response, y, dependent on x and u
- missingness imposed on u, dependent on y
Each simulated dataset is analysed by a series of models
Performance of models assessed for
- coefficient for u, β_u (true value = -2)
- coefficient for x, β_x (true value = 1)
We report
- average estimate (average of the posterior means)
- bias (average estimate - true value)
- coverage rate (proportion of times the true value is contained in the 95% interval)
- interval width (average width of the 95% interval)
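The four reported summaries follow directly from their definitions above. The sketch below is an illustration with a hypothetical function name, `summarise_performance`. It assumes that each simulated dataset yields one posterior mean and one 95% interval (lower and upper limit) for the coefficient of interest.

```python
import numpy as np

def summarise_performance(post_means, lower, upper, true_value):
    """Summarise one model's performance for one coefficient across simulations."""
    post_means = np.asarray(post_means, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    average_estimate = post_means.mean()
    # An interval covers the truth when the true value lies within its limits.
    covered = (lower <= true_value) & (true_value <= upper)
    return {
        "average_estimate": average_estimate,
        "bias": average_estimate - true_value,
        "coverage_rate": covered.mean(),
        "interval_width": (upper - lower).mean(),
    }
```

For β_u the call would use `true_value=-2`, and for β_x `true_value=1`.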
13 Simulation setup: model descriptions
We run 5 types of models
- GOLD: correct analysis model run on complete datasets
- EXU: excludes u from analysis model
- CC: complete case analysis
- FBM: Fully Bayesian Model (analysis and imputation models)
- MICE: uses 20 imputations
GOLD provides performance targets
GOLD, EXU, CC, FBM all fitted using WinBUGS software
MICE fitted using functions from the mice package in R
FBM has several variants dependent on scenario
All models have the correct analysis model
15 Non-hierarchical linear simulation
Data generated
- all variables continuous
- no hierarchical structure
- 1000 individuals
- 90% missingness
Note: with a single covariate with missing values, the approximation of using chained equations disappears
Results
- EXU: extreme bias and 0 coverage (β_x only)
- CC: serious bias and very low coverage
- FBM and MICE: both correct most of the bias and achieve nominal coverage
Even with extreme levels of missingness, FBM and MICE have similar performance with non-complex data
16 Hierarchical linear simulation - description
Data generated
- with hierarchical structure (individuals within clusters)
- 10 clusters, each with 100 individuals
- 50% missingness
FBM models: 3 variants with different imputation models for u
- no hierarchical structure (no HS)
- random intercepts (HS: ri)
- random intercepts + random slopes on x (HS: ri+rs)
MICE: no hierarchical structure in imputation model
- in theory could run variants with hierarchical structure, but there are implementation difficulties
20 Hierarchical linear simulation - β_u results
[Table: average estimate, bias, coverage rate and interval width for GOLD, CC, FBM (no HS), FBM (HS: ri), FBM (HS: ri+rs) and MICE (no HS); numerical values not preserved in the transcription]
If hierarchical structure ignored in imputation model
- FBM: slight bias and poor coverage
- MICE: much worse (no feedback from structure in analysis model)
If hierarchical structure incorporated in imputation model
- bias corrected
- nominal coverage rate achieved
21 Hierarchical linear simulation - β_x results
[Table: average estimate, bias, coverage rate and interval width for GOLD, EXU, CC, FBM (no HS), FBM (HS: ri), FBM (HS: ri+rs) and MICE (no HS); numerical values not preserved in the transcription]
Pattern of bias and coverage results similar to β_u
22 V-shaped informative missingness - description
Data generated
- with no hierarchical structure
- 100 individuals
- missingness imposed on u depends on y and u
- 50% missingness
FBM models: 4 variants
- MAR: no model of covariate missingness
- MNAR: assumes linear shape (linear)
- MNAR: allows v-shape (v-shape)
- MNAR: allows v-shape + priors inform signs of slopes (v-shape+)
MICE: MAR, no model of covariate missingness
- most implementations do not readily extend to MNAR
- ad hoc sensitivity analysis to MNAR possible by inflating or deflating imputations (van Buuren and Groothuis-Oudshoorn, 2011)
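The ad hoc sensitivity analysis mentioned for MICE can be pictured as a post-processing step applied to each completed dataset before the analysis model is fitted. The sketch below is an illustration under assumptions, not the mice package's interface. The function name `adjust_imputations` and its `scale` and `shift` arguments are hypothetical, and the adjustment values have to be chosen by the analyst to represent plausible departures from MAR.

```python
import numpy as np

def adjust_imputations(u_completed, was_missing, scale=1.0, shift=0.0):
    """Inflate or deflate the imputed values only; observed values are untouched."""
    adjusted = np.array(u_completed, dtype=float, copy=True)
    mask = np.asarray(was_missing, dtype=bool)
    # ASSUMPTION: a simple linear adjustment, scale * value + shift.
    adjusted[mask] = scale * adjusted[mask] + shift
    return adjusted
```

Repeating the analysis over a grid of `scale` or `shift` values shows how sensitive the conclusions are to the MAR assumption. Unlike the FBM variants, this approach does not estimate the missingness mechanism from the data.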
27 V-shaped informative missingness - β_u results
[Table: average estimate, bias, coverage rate and interval width for GOLD, CC, MAR: FBM, MNAR: FBM (linear), MNAR: FBM (vshape), MNAR: FBM (vshape+) and MAR: MICE; numerical values not preserved in the transcription]
- MAR results in bias and slightly reduced coverage
- improvements if MNAR is allowed, even if the wrong form is assumed
- further improvements from the correct form
- and even better with informative priors
31 V-shaped informative missingness - β_x results
[Table: average estimate, bias, coverage rate and interval width for GOLD, EXU, CC, MAR: FBM, MNAR: FBM (linear), MNAR: FBM (vshape), MNAR: FBM (vshape+) and MAR: MICE; numerical values not preserved in the transcription]
- MAR results in modest bias (FBM and MICE)
- wrong MNAR (linear) slightly worse than MAR
- little gain in correct MNAR over MAR
32 Summary of simulation results
Hierarchical structure
- β_u and β_x: clear benefits from incorporating structure in FBM imputation model (unable to assess MICE)
Informative missingness
- β_u: benefits from correct MNAR
- β_x: no clear benefits from allowing MNAR
With FBM for informative missingness, 3 linked models are fitted simultaneously
- AM = Analysis Model
- CIM = Covariate Imputation Model
- MoCM = Model of Covariate Missingness
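One way to see how the three linked models fit together is to write the joint distribution as a product of three factors. The notation below is an assumption made for illustration and does not come from the slides: m is the missingness indicator for u, and β, φ and δ are the parameters of the AM, CIM and MoCM respectively.

$$
f(y, u, m \mid x, \beta, \phi, \delta) = \underbrace{f(y \mid x, u, \beta)}_{\text{AM}} \; \underbrace{f(u \mid x, \phi)}_{\text{CIM}} \; \underbrace{f(m \mid y, u, \delta)}_{\text{MoCM}}
$$

When the missingness model does not depend on the unobserved u (MAR) and δ is distinct from the other parameters, the MoCM factor does not affect inference about β and can be left out. This corresponds to the MAR variant of the FBM described earlier.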
33 Application results
[Figure: odds ratio for smoking during pregnancy (u) under CC (MCS only), FBM, Bayesian MI, Standard MI and MICE]
[Figure: odds ratio for trihalomethanes > 60µg/L (x) under EXU (HES only), CC (MCS only), FBM, Bayesian MI, Standard MI and MICE]
35 Practical advice and future work
Future work: extend simulations to
- multiple covariates
- non-linear (glm) analysis models
Practical advice
- small dataset and few covariates with missingness: FBM or MICE (simple), FBM (complex)
- large dataset and/or many covariates with missingness: MICE
36 Direction for missing data research
The MI spectrum, in order of increasing approximation to FBM:
1. Fully Bayesian Model (FBM)
2. Bayesian MI (feedforward only model): feedback from analysis model to imputation model cut
3. Standard MI with joint multivariate distribution: averages over small number of draws
4. Standard MI with chained equations (MICE): series of univariate conditional distributions
Where should our starting point be?
- FBM: improve computational efficiency, more case studies
- MICE: robust in simple setup, but more work needed for realistic situations
37 Further Information and Acknowledgements
See BIAS web site (
Funding by ESRC: the BIAS project (PI N Best), based at Imperial College, London, is a node of the Economic and Social Research Council's National Centre for Research Methods (NCRM)
Daniels, M. J. and Hogan, J. W. (2008). Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. Chapman & Hall.
Mason, A., Richardson, S., Plewis, I., and Best, N. (2012). Strategy for modelling non-random missing data mechanisms in observational studies using Bayesian methods. Journal of Official Statistics, to appear.
Mason, A. J. (2009). Bayesian methods for modelling non-random missing data mechanisms in longitudinal studies. PhD thesis, Imperial College London, available at
van Buuren, S. and Groothuis-Oudshoorn, K. (2011). mice: Multiple Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67.
38 Comparison of FBM and MICE
FBM: 1 stage procedure
- fit Imputation and Analysis Models simultaneously
- imputation model uses joint distribution of all missing variables
- uses full posterior distribution of missing values
MICE: 2 stage procedure
- 1. fit Imputation Model, 2. fit Analysis Model
- imputation model based on a set of univariate conditional distributions
- uses small number of draws of missing values from their predictive distribution
39 Non-hierarchical linear simulation - results
[Table: average estimate, bias, coverage rate and interval width for β_x (GOLD, EXU, CC, FBM, MICE) and β_u (GOLD, CC, FBM, MICE); numerical values not preserved in the transcription]
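40 Hierarchical linear simulation - equations
Generate full dataset as follows:

$$
\begin{pmatrix} x_c \\ u_c \\ \alpha_c \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \Sigma_c \right)
$$

$$
\begin{pmatrix} x_i \\ u_i \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} x_c \\ u_c \end{pmatrix}, \Sigma_i \right), \qquad y_i \sim N(\alpha_c + x_i - 2u_i, 1)
$$

c indicates cluster level data; i indicates individual level data. $\Sigma_c$ and $\Sigma_i$ stand for the cluster-level and individual-level covariance matrices, whose entries are not preserved in the transcription.
Impose missingness such that $u_i$ is missing with probability $p_i$:

$$
\mathrm{logit}(p_i) = y_i
$$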
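41 V-shaped informative missingness - equations
Generate full dataset as follows:

$$
\begin{pmatrix} x \\ u \end{pmatrix} \sim \mathrm{MVN}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right), \qquad y \sim N(1 + x - 2u, 4^2)
$$

The first row of the covariance matrix $\Sigma$ is (1, 0.5); its remaining entries are not preserved in the transcription.
Impose missingness such that u is missing with probability p:

$$
\mathrm{logit}(p) = u + 0.5y
$$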
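A data-generating step of this kind can be sketched as follows. Several parts are assumptions and not the authors' code. The variance of u is taken to be 1, so the covariance matrix is [[1, 0.5], [0.5, 1]]. The default missingness equation is used exactly as transcribed, and because that form is linear in u, the linear predictor is exposed as an argument so that a v-shaped alternative can be supplied. The function names are hypothetical.

```python
import numpy as np

def transcribed_logit(u, y):
    """Missingness linear predictor exactly as it appears in the transcription."""
    return u + 0.5 * y

def simulate_dataset(n=100, seed=0, missingness_logit=transcribed_logit):
    """Generate x, u and y, then set u to NaN with probability p."""
    rng = np.random.default_rng(seed)

    # ASSUMPTION: var(u) = 1; only the first row (1, 0.5) of the matrix is known.
    cov = np.array([[1.0, 0.5], [0.5, 1.0]])
    xu = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    x, u = xu[:, 0], xu[:, 1]

    # Analysis model: y ~ N(1 + x - 2u, 4^2), so the standard deviation is 4.
    y = rng.normal(1.0 + x - 2.0 * u, 4.0)

    # Missingness model: u is missing with probability p = expit(linear predictor).
    p = 1.0 / (1.0 + np.exp(-missingness_logit(u, y)))
    missing = rng.random(n) < p
    u_observed = np.where(missing, np.nan, u)
    return x, u_observed, y, missing
```

A hypothetical v-shaped mechanism could be tried by passing, for example, `missingness_logit=lambda u, y: np.abs(u) + 0.5 * y`. That particular form is illustrative only and is not taken from the slides.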
More informationBeyond MCMC in fitting complex Bayesian models: The INLA method
Beyond MCMC in fitting complex Bayesian models: The INLA method Valeska Andreozzi Centre of Statistics and Applications of Lisbon University (valeska.andreozzi at fc.ul.pt) European Congress of Epidemiology
More informationBayesian Multilevel Latent Class Models for the Multiple. Imputation of Nested Categorical Data
Bayesian Multilevel Latent Class Models for the Multiple Imputation of Nested Categorical Data Davide Vidotto Jeroen K. Vermunt Katrijn van Deun Department of Methodology and Statistics, Tilburg University
More informationARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2
ARIC Manuscript Proposal # 1186 PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 1.a. Full Title: Comparing Methods of Incorporating Spatial Correlation in
More informationA multivariate multilevel model for the analysis of TIMMS & PIRLS data
A multivariate multilevel model for the analysis of TIMMS & PIRLS data European Congress of Methodology July 23-25, 2014 - Utrecht Leonardo Grilli 1, Fulvia Pennoni 2, Carla Rampichini 1, Isabella Romeo
More informationLongitudinal analysis of ordinal data
Longitudinal analysis of ordinal data A report on the external research project with ULg Anne-Françoise Donneau, Murielle Mauer June 30 th 2009 Generalized Estimating Equations (Liang and Zeger, 1986)
More informationRevision: Chapter 1-6. Applied Multivariate Statistics Spring 2012
Revision: Chapter 1-6 Applied Multivariate Statistics Spring 2012 Overview Cov, Cor, Mahalanobis, MV normal distribution Visualization: Stars plot, mosaic plot with shading Outlier: chisq.plot Missing
More informationLatent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness
Journal of Modern Applied Statistical Methods Volume 15 Issue 2 Article 36 11-1-2016 Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness Li Qin Yale University,
More informationINTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011
INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA Belfast 9 th June to 10 th June, 2011 Dr James J Brown Southampton Statistical Sciences Research Institute (UoS) ADMIN Research Centre (IoE
More informationDavid Hughes. Flexible Discriminant Analysis Using. Multivariate Mixed Models. D. Hughes. Motivation MGLMM. Discriminant. Analysis.
Using Using David Hughes 2015 Outline Using 1. 2. Multivariate Generalized Linear Mixed () 3. Longitudinal 4. 5. Using Complex data. Using Complex data. Longitudinal Using Complex data. Longitudinal Multivariate
More informationImplications of Missing Data Imputation for Agricultural Household Surveys: An Application to Technology Adoption
Implications of Missing Data Imputation for Agricultural Household Surveys: An Application to Technology Adoption Haluk Gedikoglu Assistant Professor of Agricultural Economics Cooperative Research Programs
More informationCombining multiple observational data sources to estimate causal eects
Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,
More informationAnalysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities
More informationST 790, Homework 1 Spring 2017
ST 790, Homework 1 Spring 2017 1. In EXAMPLE 1 of Chapter 1 of the notes, it is shown at the bottom of page 22 that the complete case estimator for the mean µ of an outcome Y given in (1.18) under MNAR
More informationSensitivity Analysis with Several Unmeasured Confounders
Sensitivity Analysis with Several Unmeasured Confounders Lawrence McCandless lmccandl@sfu.ca Faculty of Health Sciences, Simon Fraser University, Vancouver Canada Spring 2015 Outline The problem of several
More informationUniversity of Pennsylvania and The Children s Hospital of Philadelphia
Submitted to the Annals of Applied Statistics arxiv: arxiv:0000.0000 ESTIMATION OF CAUSAL EFFECTS USING INSTRUMENTAL VARIABLES WITH NONIGNORABLE MISSING COVARIATES: APPLICATION TO EFFECT OF TYPE OF DELIVERY
More informationMissing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology
Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology Sheng Luo, PhD Associate Professor Department of Biostatistics & Bioinformatics Duke University Medical Center sheng.luo@duke.edu
More informationBayesian Analysis of Multivariate Normal Models when Dimensions are Absent
Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent Robert Zeithammer University of Chicago Peter Lenk University of Michigan http://webuser.bus.umich.edu/plenk/downloads.htm SBIES
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationmultilevel modeling: concepts, applications and interpretations
multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models
More informationFully Bayesian inference under ignorable missingness in the presence of auxiliary covariates
Biometrics 000, 000 000 DOI: 000 000 0000 Fully Bayesian inference under ignorable missingness in the presence of auxiliary covariates M.J. Daniels, C. Wang, B.H. Marcus 1 Division of Statistics & Scientific
More informationLecture Notes: Some Core Ideas of Imputation for Nonresponse in Surveys. Tom Rosenström University of Helsinki May 14, 2014
Lecture Notes: Some Core Ideas of Imputation for Nonresponse in Surveys Tom Rosenström University of Helsinki May 14, 2014 1 Contents 1 Preface 3 2 Definitions 3 3 Different ways to handle MAR data 4 4
More informationChapter 4 Multi-factor Treatment Designs with Multiple Error Terms 93
Contents Preface ix Chapter 1 Introduction 1 1.1 Types of Models That Produce Data 1 1.2 Statistical Models 2 1.3 Fixed and Random Effects 4 1.4 Mixed Models 6 1.5 Typical Studies and the Modeling Issues
More informationComparing Group Means When Nonresponse Rates Differ
UNF Digital Commons UNF Theses and Dissertations Student Scholarship 2015 Comparing Group Means When Nonresponse Rates Differ Gabriela M. Stegmann University of North Florida Suggested Citation Stegmann,
More informationPrerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3
University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.
More informationMultiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas
Multiple Imputation for Missing Data in epeated Measurements Using MCMC and Copulas Lily Ingsrisawang and Duangporn Potawee Abstract This paper presents two imputation methods: Marov Chain Monte Carlo
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationNon-iterative, regression-based estimation of haplotype associations
Non-iterative, regression-based estimation of haplotype associations Benjamin French, PhD Department of Biostatistics and Epidemiology University of Pennsylvania bcfrench@upenn.edu National Cancer Center
More informationEstimating the long-term health impact of air pollution using spatial ecological studies. Duncan Lee
Estimating the long-term health impact of air pollution using spatial ecological studies Duncan Lee EPSRC and RSS workshop 12th September 2014 Acknowledgements This is joint work with Alastair Rushworth
More informationIntroduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data
Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington
More informationMissing Data and Multiple Imputation
Maximum Likelihood Methods for the Social Sciences POLS 510 CSSS 510 Missing Data and Multiple Imputation Christopher Adolph Political Science and CSSS University of Washington, Seattle Vincent van Gogh
More informationTime-Invariant Predictors in Longitudinal Models
Time-Invariant Predictors in Longitudinal Models Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building strategies
More informationHelp! Statistics! Mediation Analysis
Help! Statistics! Lunch time lectures Help! Statistics! Mediation Analysis What? Frequently used statistical methods and questions in a manageable timeframe for all researchers at the UMCG. No knowledge
More informationA weighted simulation-based estimator for incomplete longitudinal data models
To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun
More informationAn Introduction to Path Analysis
An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving
More informationLogistic Regression. Fitting the Logistic Regression Model BAL040-A.A.-10-MAJ
Logistic Regression The goal of a logistic regression analysis is to find the best fitting and most parsimonious, yet biologically reasonable, model to describe the relationship between an outcome (dependent
More informationFlexible mediation analysis in the presence of non-linear relations: beyond the mediation formula.
FACULTY OF PSYCHOLOGY AND EDUCATIONAL SCIENCES Flexible mediation analysis in the presence of non-linear relations: beyond the mediation formula. Modern Modeling Methods (M 3 ) Conference Beatrijs Moerkerke
More information