Comparison of multiple imputation methods for systematically and sporadically missing multilevel data
|
|
- Brent Charles Sutton
- 5 years ago
- Views:
Transcription
1 Comparison of multiple imputation methods for systematically and sporadically missing multilevel data V. Audigier, I. White, S. Jolani, T. Debray, M. Quartagno, J. Carpenter, S. van Buuren, M. Resche-Rigon INSERM, UMR 1153, ECSTRA team, Saint-Louis Hospital, Paris MODAL Seminar, November 22th, Lille 1 / 29
2 Motivation: GREAT data (Great Network, 2013) Risk factors associated with short-term mortality in acute heart failure 28 observational cohorts, patients, 2 binary and 8 continuous variables (patient characteristics and potential risk factors) X 1, X 2, X 3,... X p, Y (LVEF ) sporadically and systematically missing data Index Aim: explain the relationship between biomarkers (BNP, AFIB,...) and the left ventricular ejection fraction (LVEF) y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N ( 0, σ 2) ) ˆβ and associated variability var ( β gender bmi age SBP DBP HR bnpl AFib LVEF 2 / 29
3 Methods to handle missing values Missing data are often assumed to be missing at random (MAR) Ad-hoc methods Complete-case analysis generally leads to biased estimates increases standard errors Single imputation leads to unbiased estimates standard errors are downwardly biased 3 / 29
4 Methods to handle missing values Relevant methods Likelihood approaches Frequentist framework: EM algorithm Bayesian framework: Data Augmentation leads to unbiased estimates specic to the analysis model not always feasible Multiple imputation leads to unbiased estimates can be used for several analysis models 4 / 29
5 Multiple imputation (Rubin, 1987) 1 Generate a set of M parameters θ m of an imputation model to generate M plausible imputed data sets ) P (X miss X obs, θ 1 ) P (X miss X obs, θ M (ˆFû )ij (ˆFû ) 1 ij +ε1 ij (ˆFû ) 2 ij +ε2 ij (ˆFû ) 3 ij +ε3 ij (ˆFû ) B ij +εb ij 2 Fit the analysis model on each imputed data set: ˆβ m, Var ( ) ˆβ m 3 Combine the results: ˆβ = 1 M ˆβ M m=1 m T = 1 M Var ( ) M m=1 ˆβm + ( ) ( M M M 1 m=1 ˆβm ˆβ Provide estimation of the parameters and of their variability ) 2 5 / 29
6 MI for multilevel data Two standard ways to perform MI Fully conditional specication (FCS, MICE): a conditional imputation model for each variable Joint modelling (JM): a joint imputation model for all variables The imputation model (joint or conditional) needs to be in line with the data need to account for the heterogeneity between clusters need to account for the types of data (continuous and binary) 6 / 29
7 MI for multilevel data Type and name Handles missing data: Coded in R Sporadic? Systematic? in binary variable? JM-Pan yes yes no yes, (Pan) JM-REALCOM yes yes yes no JM-jomo yes yes yes yes, (jomo) JM-Mplus yes yes yes no FCS-2lnorm yes no no yes, (mice) FCS-1stage yes using variant yes yes yes FCS-2stage yes yes yes using yes variant 7 / 29
8 Outline 1 Introduction GREAT data Multiple imputation MI for multilevel data 2 MI methods for multilevel data Continuous variables Univariate case Multivariate case Binary variables 3 Comparisons Simulations Application 4 Conclusion 8 / 29
9 Continuous variables Heteroscedastic random eect model as imputation model y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N (0, Σ k ) Multiple imputation under this model 1 generating M sets of parameters θ m = ( ) β m, Ψ m, (Σ m k ) 1 k K Bayesian formulation: draw θ m from its posterior distribution asymptotic method: estimate θ m, draw θ m from the asymptotic distribution of the estimator 2 imputing the data according each set θ m draw bk m yobs k, θ m draw y miss ik θ m, bk m 9 / 29
10 Continuous variables Heteroscedastic random eect model as imputation model y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N (0, Σ k ) Multiple imputation under this model 1 generating M sets of parameters θ m = 2 imputing the data according each set θ m draw bk m yobs k, θ m draw y miss ik θ m, bk m Specic issues 1 how to generate Σ k without y ik? (systematic) ( ) β m, Ψ m, (Σ m k ) 1 k K 2 how to draw b m k without y ik (systematic) or given y ik (sporadic)? 9 / 29
11 FCS-1stage (Jolani et al., 2015) Conditional imputation models y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N ( 0, σ 2) For each incomplete variable 1 generate θ m = ( ) β m, Ψ m, σ 2 m 1 m M prior: non-informative (Jereys) posterior distribution ( β)) β m N ( β, var W (K, b b ) Ψ 1 m requires REML estimate ( ) σ 2 m Inv-Γ n p 2, (n p) σ impute in each cluster k with systematically missing data draw b k N (0, Ψ m ) impute data according to the imputation model 10 / 29
12 FCS-1stage (Jolani et al., 2015) Conditional imputation models y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N ( 0, σ 2) For each incomplete variable 1 generate θ m = ( ) β m, Ψ m, σ 2 m 1 m M prior: non-informative (Jereys) posterior distribution ( β)) β m N ( β, var W (K, b b ) Ψ 1 m requires REML estimate ( ) σ 2 m Inv-Γ n p 2, (n p) σ impute in each cluster k with sporadically missing data draw b k N ( µ bk y k, Ψ bk y k ) impute data according to the imputation model 10 / 29
13 FCS-2stage (Resche-Rigon and White, 2016) Conditional imputation models y ik = x ik (β + b k ) + ε ik b k N (0, Ψ) ε ik N ( ) 0, σ 2 k the same imputation model, with heteroscedastic assumption 1 generate θ m = ( β m, Ψ m, ( σ1 2,..., ) ) σ2 K ) m estimate θ and var ( θ with a two-stage estimator draw θ m from the asymptotic distribution of the estimator with expectation θ ) and variance var ( θ 2 impute in each cluster k 11 / 29
14 FCS-2stage (Resche-Rigon and White, 2016) 1 generate θ m = ( β m, Ψ m, ( σ1 2,..., ) ) σ2 K m stage 1 t a linear model to each observed cluster ( ) 1 β k = X k X k X k y k stage 2 combine the estimates β k = (β + b k ) + ε k b k N (0, Ψ) ε k N ( )) 0, var ( βk Two estimators available: REML or method of moments Ψ, β and their associated (asymptotic) variances 12 / 29
15 FCS-2stage (Resche-Rigon and White, 2016) 1 generate θ m = ( β m, Ψ m, ( σ1 2,..., ) ) σ2 K m stage 1 t a linear model to each observed cluster ) 1 β k = (X k X k X k y k σ k = y k X k β k 2 stage 2 combine the estimates n k p 1 log σ k = (log σ + s k ) + ε k s k N (0, Ψ s) ε k N (0, var (log σ k )) ( )) β k = (β + b k ) + ε k b k N (0, Ψ) ε k N 0, var ( βk Two estimators available: REML or method of moments log σ, Ψ s and their associated (asymptotic) variances Ψ, β and their associated (asymptotic) variances 12 / 29
16 FCS-2stage (Resche-Rigon and White, 2016) 1 generate θ m = ( β m, Ψ m, ( σ1 2,..., ) ) σ2 K m stage 1 t a linear model to each observed cluster ) 1 β k = (X k X k X k y k σ k = y k X k β k 2 stage 2 combine the estimates n k p 1 log σ k = (log σ + s k ) + ε k s k N (0, Ψ s) ε k N (0, var (log σ k )) ( )) β k = (β + b k ) + ε k b k N (0, Ψ) ε k N 0, var ( βk Two estimators available: REML or method of moments log σ, Ψ s and their associated (asymptotic) variances Ψ, β and their associated (asymptotic) variances 2 impute in each cluster k with systematically missing data draw b k from their marginal distribution impute data according to the imputation model 12 / 29
17 FCS-2stage (Resche-Rigon and White, 2016) 1 generate θ m = ( β m, Ψ m, ( σ1 2,..., ) ) σ2 K m stage 1 t a linear model to each observed cluster ) 1 β k = (X k X k X k y k σ k = y k X k β k 2 stage 2 combine the estimates n k p 1 log σ k = (log σ + s k ) + ε k s k N (0, Ψ s) ε k N (0, var (log σ k )) ( )) β k = (β + b k ) + ε k b k N (0, Ψ) ε k N 0, var ( βk Two estimators available: REML or method of moments log σ, Ψ s and their associated (asymptotic) variances Ψ, β and their associated (asymptotic) variances 2 impute in each cluster k with sporadically missing data draw b k conditionally to β k impute data according to the imputation model 12 / 29
18 JM-jomo (Quartagno and Carpenter, 2016) y ik = x ik β + z ik b k + ε ik b k N (0, Ψ) ε ik N (0, Σ k ) 1 Bayesian formulation to generate θ m = (β m, Ψ m, Σ m ) 1 m M (informative) prior: β 1, Ψ 1 W (ν 1, Λ 1 ), Σ 1 k ν 2, Λ 2 W (ν 2, Λ 2 ) posterior: unknown explicitly but... most of conditional posterior distributions are known Gibbs sampler do not require REML estimate unknown conditional distributions can be simulated by MCMC 2 Imputation (given by step 1) 13 / 29
19 Binary variables FCS-1stage (Jolani et al., 2015) t a logistic model with mixed eect to all clusters sporadically missing values not handled FCS-2stage (Resche-Rigon and White, 2016) t a logistic model with xed eect to each cluster combine estimates using a meta-analysis large clusters are required JM-jomo (Quartagno and Carpenter, 2016) probit link: outcomes are latent normal variables, variance for errors are xed to 1 draw latent normal variables derive categories more time consuming 14 / 29
20 Summary method Bayesian / asymptotic prior for covariance matrices heteroscedasticity assumption for errors binary variables FCS-1stage Bayesian Jerey no probit link FCS-2tage asymptotic yes logistic link JM-jomo Bayesian Wishart yes logistic link 15 / 29
21 Simulation design Data generation: 500 incomplete data sets are independently simulated (n = 11685, K = 28, 18 n k 1834) y ik = β 0 + β 1 x (1) ik + β 2 x (2) ik with β = (.72,.11,.03), Ψ = (µ k, ν k, ξ k ) N 0, b0 k [ + b1 k x(1) ik + ε ik x (1) ik : N (2.9 + µ k,.36) ( ( )) x (2) ik : logit P x (2) ik = 1 = ν k x (3) ik : N (2.9 + ξ k,.36) ], σ =.15 add missing values on x (1), x (2) with π syst =.25 and π spor = / 29
22 Simulation design Methods JM-jomo, FCS-1stage, FCS-2stage Full, CC, FCS-x, FCS-noclust, JM-pan M = 5 imputed arrays ) Estimands: β and var ( β Criteria: bias, rmse, variance estimate, coverage 17 / 29
23 Base-case conguration β β true Full CC FCS fix FCS noclust JM pan FCS 1stage FCS 2stageMM FCS 2stageRE JM jomo β 1 β 2 18 / 29
24 Base-case conguration Method ) ) var ( β var ( β 95% Cover Time (min) β 1 β 2 β 1 β 2 β 1 β 2 Full CC FCS-x FCS-noclust JM-pan FCS-1stage FCS-2stagemm FCS-2stagere JM-jomo / 29
25 Base-case conguration Method ) ) var ( β var ( β 95% Cover Time (min) β 1 β 2 β 1 β 2 β 1 β 2 Full CC FCS-x FCS-noclust JM-pan FCS-1stage FCS-2stagemm FCS-2stagere JM-jomo / 29
26 Robustness to the cluster size: point estimate β 1 β 2 Relative bias JM FCS 1stage FCS 2stageRE FCS 2stageMM Relative bias n k n k 20 / 29
27 Robustness to the number of clusters: point estimate β 1 β 2 Relative bias JM FCS 1stage FCS 2stageRE FCS 2stageMM Relative bias K K 21 / 29
28 Robustness to π syst : point estimate β 1 β 2 Relative bias JM FCS 1stage FCS 2stageRE FCS 2stageMM Relative bias π syst π syst 22 / 29
29 Robustness to π syst : variance estimate SE β 1 Model SE JM Model SE FCS 1stage Model SE FCS 2stageRE Model SE FCS 2stageMM Emp SE SE β π syst π syst 23 / 29
30 Robustness to the type of imputed variables β β true Full FCS 1stage FCS 2stageMM FCS 2stageRE JM jomo β 1 β 2 Method ) ) var ( β var ( β 95% Cover Time (min) β 1 β 2 β 1 β 2 β 1 β 2 Full FCS-1stage FCS-2stagemm FCS-2stagere JM-jomo / 29
31 Other congurations Methods have similar performances when the missing data mechanism is MAR the outcome of the analysis model is binary the variance of random eects is higher or smaller binary variables are generated using a probit link 25 / 29
32 Appplication to GREAT data Explain the relationship between biomarkers easily measurable (BNP, AFIB) and the left ventricular ejection fraction y=lvef, X = BNP, AFib MI using M = 20 imputed arrays gender bmi age SBP DBP HR bnpl AFib LVEF Index CC JM FCS-1stage FCS- 2stagere FCS- 2stagemm β BNP Est ModelSE β AFIB Est ModelSE Time / 29
33 Appplication to GREAT data Explain the relationship between biomarkers easily measurable (BNP, AFIB) and the left ventricular ejection fraction y=lvef, X = BNP, AFib MI using M = 20 imputed arrays gender bmi age SBP DBP HR bnpl AFib LVEF Index CC JM FCS-1stage FCS- 2stagere FCS- 2stagemm β BNP Est ModelSE β AFIB Est ModelSE Time / 29
34 Appplication to GREAT data Explain the relationship between biomarkers easily measurable (BNP, AFIB) and the left ventricular ejection fraction y=lvef, X = BNP, AFib MI using M = 20 imputed arrays gender bmi age SBP DBP HR bnpl AFib LVEF Index CC JM FCS-1stage FCS- 2stagere FCS- 2stagemm β BNP Est ModelSE β AFIB Est ModelSE Time / 29
35 Appplication to GREAT data Explain the relationship between biomarkers easily measurable (BNP, AFIB) and the left ventricular ejection fraction y=lvef, X = BNP, AFib MI using M = 20 imputed arrays gender bmi age SBP DBP HR bnpl AFib LVEF Index CC JM FCS-1stage FCS- 2stagere FCS- 2stagemm β BNP Est ModelSE β AFIB Est ModelSE Time / 29
36 Appplication to GREAT data Explain the relationship between biomarkers easily measurable (BNP, AFIB) and the left ventricular ejection fraction y=lvef, X = BNP, AFib MI using M = 20 imputed arrays gender bmi age SBP DBP HR bnpl AFib LVEF Index CC JM FCS-1stage FCS- 2stagere FCS- 2stagemm β BNP Est ModelSE β AFIB Est ModelSE Time / 29
37 Conclusion An overview of MI methods for multilevel data FCS-1stage, FSC-2stage and JM-jomo all appear to perform well Outperform had-hoc methods FCS-2stage FCS-1stage JM-jomo MM version provides a quick way to obtain rst results for large clusters relevant with few systematically missing values time consuming with binary variables advised with a lot of incomplete categorical variables be careful with few clusters Methods are implemented in R mice package for FCS methods jomo package for JM-jomo 27 / 29
38 Limits and perspectives Limits congeniality y ik = β 0 + β 1 x (1) ik + β2 x (2) ik + b0 k + bk 1 x(1) ik + ε ik convergence of the FCS approaches computational time Perspectives correction for FCS-2stage with small clusters? handle logistic link for FCS-1stage? 28 / 29
39 References I Great Network. Managing acute heart failure in the ED - case studies from the acute heart failure academy, D. B. Rubin. Multiple Imputation for Non-Response in Survey. Wiley, New-York, S. van Buuren. Multiple imputation of multilevel data. In The Handbook of Advanced Multilevel Analysis. Routledge, Milton Park, UK, S. Jolani, T. P. A. Debray, H. Kojberg, S. van Buuren, and K. G. M. Moons. Imputation of systematically missing predictors in an individual participant data meta-analysis: a generalized approach using MICE. Statistics in Medicine, 34(11): , M. Resche-Rigon, I. R. White, J. Bartlett, S.A.E. Peters, S.G. Thompson, and on behalf of the PROG-IMT Study Group. Multiple imputation for handling systematically missing confounders in meta-analysis of individual participant data. Statistics in Medicine, 32(28): , ISSN doi: /sim M. Resche-Rigon and I. White. Multiple imputation by chained equations for systematically and sporadically missing multilevel data. smmr, J. L. Schafer. Imputation of missing covariates under a multivariate linear mixed model. Technical report, Dept. of Statistics, The Pennsylvania State University, M. Quartagno and J. R. Carpenter. Multiple imputation for IPD meta-analysis: allowing for heterogeneity and studies with missing covariates. Statistics in Medicine, 35(17): , ISSN / 29
arxiv: v2 [stat.me] 27 Nov 2017
arxiv:1702.00971v2 [stat.me] 27 Nov 2017 Multiple imputation for multilevel data with continuous and binary variables November 28, 2017 Vincent Audigier 1,2,3 *, Ian R. White 4,5, Shahab Jolani 6, Thomas
More informationStatistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23
1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing
More informationMULTILEVEL IMPUTATION 1
MULTILEVEL IMPUTATION 1 Supplement B: MCMC Sampling Steps and Distributions for Two-Level Imputation This document gives technical details of the full conditional distributions used to draw regression
More informationPlausible Values for Latent Variables Using Mplus
Plausible Values for Latent Variables Using Mplus Tihomir Asparouhov and Bengt Muthén August 21, 2010 1 1 Introduction Plausible values are imputed values for latent variables. All latent variables can
More informationMultilevel Multiple Imputation in presence of interactions, non-linearities and random slopes
Multilevel Multiple Imputation in presence of interactions, non-linearities and random slopes Imputazione Multipla Multilivello in presenza di interazioni, non-linearità e pendenze casuali Matteo Quartagno
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationDownloaded from:
Hossain, A; DiazOrdaz, K; Bartlett, JW (2017) Missing binary outcomes under covariate-dependent missingness in cluster randomised trials. Statistics in medicine. ISSN 0277-6715 DOI: https://doi.org/10.1002/sim.7334
More informationBayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London
Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline
More informationA comparison of fully Bayesian and two-stage imputation strategies for missing covariate data
A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data Alexina Mason, Sylvia Richardson and Nicky Best Department of Epidemiology and Biostatistics, Imperial College
More informationBayesian Analysis of Latent Variable Models using Mplus
Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are
More informationarxiv: v1 [stat.me] 27 Feb 2017
arxiv:1702.08148v1 [stat.me] 27 Feb 2017 A Copula-based Imputation Model for Missing Data of Mixed Type in Multilevel Data Sets Jiali Wang 1, Bronwyn Loong 1, Anton H. Westveld 1,2, and Alan H. Welsh 3
More informationComputationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models
Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationIncorporating published univariable associations in diagnostic and prognostic modeling
Incorporating published univariable associations in diagnostic and prognostic modeling Thomas Debray Julius Center for Health Sciences and Primary Care University Medical Center Utrecht The Netherlands
More informationBayesian Multilevel Latent Class Models for the Multiple. Imputation of Nested Categorical Data
Bayesian Multilevel Latent Class Models for the Multiple Imputation of Nested Categorical Data Davide Vidotto Jeroen K. Vermunt Katrijn van Deun Department of Methodology and Statistics, Tilburg University
More informationReconstruction of individual patient data for meta analysis via Bayesian approach
Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi
More informationCentering Predictor and Mediator Variables in Multilevel and Time-Series Models
Centering Predictor and Mediator Variables in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén Part 2 May 7, 2018 Tihomir Asparouhov and Bengt Muthén Part 2 Muthén & Muthén 1/ 42 Overview
More informationBasics of Modern Missing Data Analysis
Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing
More informationSTA 216, GLM, Lecture 16. October 29, 2007
STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural
More informationFractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling
Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /rssa.
Goldstein, H., Carpenter, J. R., & Browne, W. J. (2014). Fitting multilevel multivariate models with missing data in responses and covariates that may include interactions and non-linear terms. Journal
More informationThe STS Surgeon Composite Technical Appendix
The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic
More informationThree-Level Multiple Imputation: A Fully Conditional Specification Approach. Brian Tinnell Keller
Three-Level Multiple Imputation: A Fully Conditional Specification Approach by Brian Tinnell Keller A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Arts Approved
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationA note on multiple imputation for general purpose estimation
A note on multiple imputation for general purpose estimation Shu Yang Jae Kwang Kim SSC meeting June 16, 2015 Shu Yang, Jae Kwang Kim Multiple Imputation June 16, 2015 1 / 32 Introduction Basic Setup Assume
More informationUnbiased estimation of exposure odds ratios in complete records logistic regression
Unbiased estimation of exposure odds ratios in complete records logistic regression Jonathan Bartlett London School of Hygiene and Tropical Medicine www.missingdata.org.uk Centre for Statistical Methodology
More informationBayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units
Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationEstimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing
Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing Alessandra Mattei Dipartimento di Statistica G. Parenti Università
More informationLatent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models
Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models Tihomir Asparouhov and Bengt Muthén August 5, 2018 Abstract We discuss different methods for centering a predictor
More informationPooling multiple imputations when the sample happens to be the population.
Pooling multiple imputations when the sample happens to be the population. Gerko Vink 1,2, and Stef van Buuren 1,3 arxiv:1409.8542v1 [math.st] 30 Sep 2014 1 Department of Methodology and Statistics, Utrecht
More informationMissing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology
Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology Sheng Luo, PhD Associate Professor Department of Biostatistics & Bioinformatics Duke University Medical Center sheng.luo@duke.edu
More informationComparison between conditional and marginal maximum likelihood for a class of item response models
(1/24) Comparison between conditional and marginal maximum likelihood for a class of item response models Francesco Bartolucci, University of Perugia (IT) Silvia Bacci, University of Perugia (IT) Claudia
More informationEstimation of Optimally-Combined-Biomarker Accuracy in the Absence of a Gold-Standard Reference Test
Estimation of Optimally-Combined-Biomarker Accuracy in the Absence of a Gold-Standard Reference Test L. García Barrado 1 E. Coart 2 T. Burzykowski 1,2 1 Interuniversity Institute for Biostatistics and
More informationBayesian Analysis of Multivariate Normal Models when Dimensions are Absent
Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent Robert Zeithammer University of Chicago Peter Lenk University of Michigan http://webuser.bus.umich.edu/plenk/downloads.htm SBIES
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationShu Yang and Jae Kwang Kim. Harvard University and Iowa State University
Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND
More informationBiostat 2065 Analysis of Incomplete Data
Biostat 2065 Analysis of Incomplete Data Gong Tang Dept of Biostatistics University of Pittsburgh October 20, 2005 1. Large-sample inference based on ML Let θ is the MLE, then the large-sample theory implies
More informationCTDL-Positive Stable Frailty Model
CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland
More informationImplications of Missing Data Imputation for Agricultural Household Surveys: An Application to Technology Adoption
Implications of Missing Data Imputation for Agricultural Household Surveys: An Application to Technology Adoption Haluk Gedikoglu Assistant Professor of Agricultural Economics Cooperative Research Programs
More informationA weighted simulation-based estimator for incomplete longitudinal data models
To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun
More informationRichard D Riley was supported by funding from a multivariate meta-analysis grant from
Bayesian bivariate meta-analysis of correlated effects: impact of the prior distributions on the between-study correlation, borrowing of strength, and joint inferences Author affiliations Danielle L Burke
More informationInference for correlated effect sizes using multiple univariate meta-analyses
Europe PMC Funders Group Author Manuscript Published in final edited form as: Stat Med. 2016 April 30; 35(9): 1405 1422. doi:10.1002/sim.6789. Inference for correlated effect sizes using multiple univariate
More informationA Comparative Study of Imputation Methods for Estimation of Missing Values of Per Capita Expenditure in Central Java
IOP Conference Series: Earth and Environmental Science PAPER OPEN ACCESS A Comparative Study of Imputation Methods for Estimation of Missing Values of Per Capita Expenditure in Central Java To cite this
More informationSome methods for handling missing values in outcome variables. Roderick J. Little
Some methods for handling missing values in outcome variables Roderick J. Little Missing data principles Likelihood methods Outline ML, Bayes, Multiple Imputation (MI) Robust MAR methods Predictive mean
More informationLongitudinal analysis of ordinal data
Longitudinal analysis of ordinal data A report on the external research project with ULg Anne-Françoise Donneau, Murielle Mauer June 30 th 2009 Generalized Estimating Equations (Liang and Zeger, 1986)
More informationBayesian Multiple Imputation for Large-Scale Categorical Data with Structural Zeros
Bayesian Multiple Imputation for Large-Scale Categorical Data with Structural Zeros Daniel Manrique-Vallier and Jerome P. Reiter June 25, 2013 Abstract We propose an approach for multiple imputation of
More informationSTATISTICAL ANALYSIS WITH MISSING DATA
STATISTICAL ANALYSIS WITH MISSING DATA SECOND EDITION Roderick J.A. Little & Donald B. Rubin WILEY SERIES IN PROBABILITY AND STATISTICS Statistical Analysis with Missing Data Second Edition WILEY SERIES
More informationPrerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3
University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.
More informationBayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence
Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationDiscussing Effects of Different MAR-Settings
Discussing Effects of Different MAR-Settings Research Seminar, Department of Statistics, LMU Munich Munich, 11.07.2014 Matthias Speidel Jörg Drechsler Joseph Sakshaug Outline What we basically want to
More informationContinuous Time Survival in Latent Variable Models
Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract
More informationCausal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions
Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census
More informationStatistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes
Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes Kosuke Imai Department of Politics Princeton University July 31 2007 Kosuke Imai (Princeton University) Nonignorable
More informationAges of stellar populations from color-magnitude diagrams. Paul Baines. September 30, 2008
Ages of stellar populations from color-magnitude diagrams Paul Baines Department of Statistics Harvard University September 30, 2008 Context & Example Welcome! Today we will look at using hierarchical
More informationOutline. Clustering. Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models
Capturing Unobserved Heterogeneity in the Austrian Labor Market Using Finite Mixtures of Markov Chain Models Collaboration with Rudolf Winter-Ebmer, Department of Economics, Johannes Kepler University
More informationLecture 16: Mixtures of Generalized Linear Models
Lecture 16: Mixtures of Generalized Linear Models October 26, 2006 Setting Outline Often, a single GLM may be insufficiently flexible to characterize the data Setting Often, a single GLM may be insufficiently
More informationInferences on missing information under multiple imputation and two-stage multiple imputation
p. 1/4 Inferences on missing information under multiple imputation and two-stage multiple imputation Ofer Harel Department of Statistics University of Connecticut Prepared for the Missing Data Approaches
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationHandling Missing Data in R with MICE
Handling Missing Data in R with MICE Handling Missing Data in R with MICE Stef van Buuren 1,2 1 Methodology and Statistics, FSBS, Utrecht University 2 Netherlands Organization for Applied Scientific Research
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationBayesian Multivariate Logistic Regression
Bayesian Multivariate Logistic Regression Sean M. O Brien and David B. Dunson Biostatistics Branch National Institute of Environmental Health Sciences Research Triangle Park, NC 1 Goals Brief review of
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More informationBAYESIAN METHODS TO IMPUTE MISSING COVARIATES FOR CAUSAL INFERENCE AND MODEL SELECTION
BAYESIAN METHODS TO IMPUTE MISSING COVARIATES FOR CAUSAL INFERENCE AND MODEL SELECTION by Robin Mitra Department of Statistical Science Duke University Date: Approved: Dr. Jerome P. Reiter, Supervisor
More informationDon t be Fancy. Impute Your Dependent Variables!
Don t be Fancy. Impute Your Dependent Variables! Kyle M. Lang, Todd D. Little Institute for Measurement, Methodology, Analysis & Policy Texas Tech University Lubbock, TX May 24, 2016 Presented at the 6th
More informationFIT CRITERIA PERFORMANCE AND PARAMETER ESTIMATE BIAS IN LATENT GROWTH MODELS WITH SMALL SAMPLES
FIT CRITERIA PERFORMANCE AND PARAMETER ESTIMATE BIAS IN LATENT GROWTH MODELS WITH SMALL SAMPLES Daniel M. McNeish Measurement, Statistics, and Evaluation University of Maryland, College Park Background
More informationAN EMPIRICAL INVESTIGATION OF THE IMPACT OF DIFFERENT METHODS FOR SYNTHESISING EVIDENCE IN A NETWORK META- ANALYSIS
AN EMPIRICAL INVESTIGATION OF THE IMPACT OF DIFFERENT METHODS FOR SYNTHESISING EVIDENCE IN A NETWORK META- ANALYSIS Project team Amalia (Emily) Karahalios - School of Public Health and Preventive Medicine,
More informationOutline. Selection Bias in Multilevel Models. The selection problem. The selection problem. Scope of analysis. The selection problem
International Workshop on tatistical Latent Variable Models in Health ciences erugia, 6-8 eptember 6 Outline election Bias in Multilevel Models Leonardo Grilli Carla Rampichini grilli@ds.unifi.it carla@ds.unifi.it
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationMultiple Imputation for Complex Data Sets
Dissertation zur Erlangung des Doktorgrades Multiple Imputation for Complex Data Sets Daniel Salfrán Vaquero Hamburg, 2018 Erstgutachter: Zweitgutachter: Professor Dr. Martin Spieß Professor Dr. Matthias
More informationMissing values imputation for mixed data based on principal component methods
Missing values imputation for mixed data based on principal component methods Vincent Audigier, François Husson & Julie Josse Agrocampus Rennes Compstat' 2012, Limassol (Cyprus), 28-08-2012 1 / 21 A real
More informationGibbs Sampling in Endogenous Variables Models
Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take
More informationGeneralized Linear Models for Non-Normal Data
Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture
More informationDavid Hughes. Flexible Discriminant Analysis Using. Multivariate Mixed Models. D. Hughes. Motivation MGLMM. Discriminant. Analysis.
Using Using David Hughes 2015 Outline Using 1. 2. Multivariate Generalized Linear Mixed () 3. Longitudinal 4. 5. Using Complex data. Using Complex data. Longitudinal Using Complex data. Longitudinal Multivariate
More informationBayesian inference for factor scores
Bayesian inference for factor scores Murray Aitkin and Irit Aitkin School of Mathematics and Statistics University of Newcastle UK October, 3 Abstract Bayesian inference for the parameters of the factor
More informationAccounting for Complex Sample Designs via Mixture Models
Accounting for Complex Sample Designs via Finite Normal Mixture Models 1 1 University of Michigan School of Public Health August 2009 Talk Outline 1 2 Accommodating Sampling Weights in Mixture Models 3
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More informationUsing Bayesian Priors for More Flexible Latent Class Analysis
Using Bayesian Priors for More Flexible Latent Class Analysis Tihomir Asparouhov Bengt Muthén Abstract Latent class analysis is based on the assumption that within each class the observed class indicator
More informationDistributed multilevel matrix completion for medical databases
Distributed multilevel matrix completion for medical databases Julie Josse Ecole Polytechnique, INRIA Séminaire Parisien de Statistique, IHP January 15, 2018 1 / 38 Overview 1 Introduction 2 Single imputation
More informationBeyond GLM and likelihood
Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationSTAT 425: Introduction to Bayesian Analysis
STAT 425: Introduction to Bayesian Analysis Marina Vannucci Rice University, USA Fall 2017 Marina Vannucci (Rice University, USA) Bayesian Analysis (Part 2) Fall 2017 1 / 19 Part 2: Markov chain Monte
More informationCombining multiple observational data sources to estimate causal eects
Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,
More informationCausal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD
Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationWhether to use MMRM as primary estimand.
Whether to use MMRM as primary estimand. James Roger London School of Hygiene & Tropical Medicine, London. PSI/EFSPI European Statistical Meeting on Estimands. Stevenage, UK: 28 September 2015. 1 / 38
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationLocal Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina
Local Likelihood Bayesian Cluster Modeling for small area health data Andrew Lawson Arnold School of Public Health University of South Carolina Local Likelihood Bayesian Cluster Modelling for Small Area
More informationMulti-level Models: Idea
Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random
More informationMultiple Imputation for Missing Values Through Conditional Semiparametric Odds Ratio Models
Multiple Imputation for Missing Values Through Conditional Semiparametric Odds Ratio Models Hui Xie Assistant Professor Division of Epidemiology & Biostatistics UIC This is a joint work with Drs. Hua Yun
More informationHealth utilities' affect you are reported alongside underestimates of uncertainty
Dr. Kelvin Chan, Medical Oncologist, Associate Scientist, Odette Cancer Centre, Sunnybrook Health Sciences Centre and Dr. Eleanor Pullenayegum, Senior Scientist, Hospital for Sick Children Title: Underestimation
More informationCourse Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model
Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012
More informationSampling bias in logistic models
Sampling bias in logistic models Department of Statistics University of Chicago University of Wisconsin Oct 24, 2007 www.stat.uchicago.edu/~pmcc/reports/bias.pdf Outline Conventional regression models
More informationParametric fractional imputation for missing data analysis
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in
More informationmultilevel modeling: concepts, applications and interpretations
multilevel modeling: concepts, applications and interpretations lynne c. messer 27 october 2010 warning social and reproductive / perinatal epidemiologist concepts why context matters multilevel models
More informationPACKAGE LMest FOR LATENT MARKOV ANALYSIS
PACKAGE LMest FOR LATENT MARKOV ANALYSIS OF LONGITUDINAL CATEGORICAL DATA Francesco Bartolucci 1, Silvia Pandofi 1, and Fulvia Pennoni 2 1 Department of Economics, University of Perugia (e-mail: francesco.bartolucci@unipg.it,
More informationMarginal Specifications and a Gaussian Copula Estimation
Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required
More information