Don't be Fancy. Impute Your Dependent Variables!


Don't be Fancy. Impute Your Dependent Variables! Kyle M. Lang, Todd D. Little. Institute for Measurement, Methodology, Analysis & Policy, Texas Tech University, Lubbock, TX. May 24, 2016. Presented at the 6th Annual Modern Modeling Methods (M³) Conference, Storrs, CT.

Outline
1. Motivation and background
2. Present simulation study
3. Reiterate recommendations

Motivation
Pretty much everyone agrees that missing data should be treated with a principled analytic tool (i.e., FIML or MI). Regression modeling offers an interesting special case. The basic regression problem is a relatively simple task: we only need to work with a single conditional density, and the predictors are usually assumed fixed. This simplicity means that many of the familiar problems with ad hoc missing data treatments don't apply in certain regression modeling circumstances.

Special Case I
One familiar exception to the rule of always using a principled missing data treatment occurs when:
1. Missing data occur on the dependent variable of a linear regression model.
2. The missingness is strictly a function of the predictors in the regression equation.
In this circumstance, listwise deletion (LWD) will produce unbiased estimates of the regression slopes. The intercept will be biased to the extent that the missing data fall systematically closer to one tail of the DV's distribution. Power and generalizability still suffer from removing all cases that are subject to MAR missingness.
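
To make the special case concrete, here is a minimal R sketch (our illustration, not code from the presented study): missingness on Y is a strict function of the observed predictor X, so the listwise-deletion slope stays essentially unbiased even though a large share of cases is lost.

```r
## Special Case I in miniature: MAR missingness on Y driven entirely
## by the observed predictor X. All names and values are illustrative.
set.seed(1234)

n <- 1e5
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)

## Delete Y whenever X falls in its upper 30%: missingness is a
## strict function of the predictor, as Special Case I requires
y_obs <- ifelse(x > qnorm(0.7), NA, y)

coef(lm(y ~ x))      # complete data: slope near 0.5
coef(lm(y_obs ~ x))  # listwise deletion: slope still near 0.5
sum(is.na(y_obs))    # ~30,000 cases discarded, so power still suffers
```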

Complicating Special Case I
What if missing data occur on both the DV and the IVs? Again, when missingness is strictly a function of IVs in the model, listwise deletion will produce unbiased estimates of regression slopes. If missingness on the IVs is a function of the DV, listwise deletion will bias slope estimates; likewise when missingness is a function of unmeasured variables. When missingness occurs on both the DV and IVs, the general recommendation is to use MI to impute all missing data. Little (1992) showed that including the incomplete DV in the imputation model can improve the imputations of the IVs.

Special Case II
There is still debate about how to treat the cases with imputed DV values. Von Hippel (2007) introduced the Multiple Imputation then Deletion (MID) approach, claiming that cases with imputed DV values cannot provide any information to the regression equation. He suggested that such cases should be retained for imputation but excluded from the final inferential modeling. Von Hippel (2007) provided analytic and simulation-based arguments for the superiority of MID over traditional MI (wherein the imputed DVs are retained for the inferential analyses).

Rationale for MID
The MID approach rests on the following premises:
1. Observations with missing DVs cannot offer any information to the estimation of regression slopes.
2. Including these observations can only increase the between-imputation variability of the pooled estimates.
BUT, there are two big issues with this foundation:
1. Premise 1 is only true when the MAR predictors are fully represented among the IVs of the inferential regression model.
2. Premise 2 is nullified by taking a large enough number of imputations.

Crux of the Matter
This whole problem boils down to whether or not the MAR assumption is satisfied in the inferential model. Special Cases I and II amount to situations wherein the inferential regression model suffices to satisfy the MAR assumption. In general, neither LWD nor MID will satisfy the MAR assumption: when any portion of the (multivariate) MAR predictor is not contained in the set of IVs in the inferential model, both LWD and MID will produce biased estimates of regression slopes (given the minor caveat I'll discuss momentarily).

Graphical Representations
[Figure: path diagrams of example MAR mechanisms, in which the missingness indicator R_Y for the outcome Y depends on Z and/or X, alongside diagrams showing how omitting the MAR predictor transforms these mechanisms into MNAR.]

Methods

Simulation Parameters
Primary parameters:
1. Proportion of the (bivariate) MAR predictor that was represented among the analysis model's IVs: pMAR = {1.0, 0.75, 0.5, 0.25, 0.0}
2. Strength of the correlations among the predictors in the data-generating model: rXZ = {0.0, 0.1, 0.3, 0.5}
Secondary parameters:
1. Sample size: N = {100, 250, 500}
2. Proportion of missing data: PM = {0.1, 0.2, 0.4}
3. R² for the data-generating model: R² = {0.15, 0.30, 0.60}
Crossed conditions in the final design: 5 (pMAR) × 4 (rXZ) × 3 (N) × 3 (PM) × 3 (R²) = 540, with R = 500 replications within each condition.
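
A hedged sketch of how this crossed design could be laid out in R (the object and column names are our own shorthand, not the authors' code):

```r
## Fully crossed simulation design described above
conds <- expand.grid(
  pMAR = c(1.0, 0.75, 0.5, 0.25, 0.0),  # MAR-predictor coverage
  rXZ  = c(0.0, 0.1, 0.3, 0.5),         # predictor correlations
  N    = c(100, 250, 500),              # sample size
  PM   = c(0.1, 0.2, 0.4),              # proportion missing
  R2   = c(0.15, 0.30, 0.60)            # data-generating R^2
)
nrow(conds)  # 540 conditions, each run for R = 500 replications
```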

Data Generation
Data were generated according to the following model:

$$Y = 1.0 + 0.33X + 0.33Z_1 + 0.33Z_2 + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2),$$

where $\sigma^2$ was manipulated to achieve the desired R² level. The analysis model was:

$$\hat{Y} = \hat{\alpha} + \hat{\beta}_1 X + \hat{\beta}_2 Z_1.$$

Missing data were imposed on Y and X using the weighted sum of Z₁ and Z₂ as the MAR predictor, with the weighting manipulated to achieve the proportions of MAR in {pMAR}. Y values in the positive tail of the MAR predictor's distribution and X values in the negative tail of the MAR predictor's distribution were set to missing.
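
The description above can be rendered as a short R sketch. The exact weighting scheme, tail cutoffs, and seed below are our assumptions about one plausible implementation, not the authors' code.

```r
## Illustrative data generation for one design cell
## (N = 500, PM = 0.2, R^2 = 0.3, rXZ = 0.3).
library(MASS)
set.seed(235711)

n <- 500; pm <- 0.2; r2 <- 0.3

## Predictors X, Z1, Z2 with common correlation rXZ = 0.3
S <- matrix(0.3, 3, 3); diag(S) <- 1
P <- mvrnorm(n, mu = rep(0, 3), Sigma = S)
x <- P[, 1]; z1 <- P[, 2]; z2 <- P[, 3]

## Choose sigma^2 so that the model R^2 hits its target
eta <- 1.0 + 0.33 * x + 0.33 * z1 + 0.33 * z2
y   <- eta + rnorm(n, sd = sqrt(var(eta) * (1 - r2) / r2))

## MAR predictor: a weighted sum of Z1 and Z2 (the weights control
## pMAR); delete Y in its positive tail and X in its negative tail
w <- 0.75 * z1 + 0.25 * z2
y[w > quantile(w, 1 - pm)] <- NA
x[w < quantile(w, pm)]     <- NA
dat <- data.frame(y, x, z1, z2)
```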

Outcome Measures
The focal parameter was the slope coefficient associated with X in the analysis model (i.e., β₁). For this report, we focus on two outcome measures:

1. Percentage relative bias: $\mathrm{PRB} = 100 \times \dfrac{\hat{\beta}_1 - \beta_1}{\beta_1}$
2. Empirical power: $\mathrm{Power} = R^{-1} \sum_{r=1}^{R} I\left(p_{\beta_1,r} < 0.05\right)$

True values (i.e., β₁) were the average complete-data estimates.
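
In R, both outcome measures reduce to a few lines; the helper and argument names below are ours, for illustration.

```r
## Outcome measures computed over R replications (illustrative helpers)
## beta_hat:  vector of R estimated slopes for X
## p_vals:    vector of the matching p-values
## beta_true: average complete-data estimate of beta_1
prb <- function(beta_hat, beta_true) {
  100 * (mean(beta_hat) - beta_true) / beta_true
}
emp_power <- function(p_vals, alpha = 0.05) {
  mean(p_vals < alpha)  # proportion of replications with p < .05
}
```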

Computational Details
The simulation code was written in the R statistical programming language (R Core Team, 2014). Missing data were imputed using the mice package (van Buuren & Groothuis-Oudshoorn, 2011), with m = 100 imputations. Results were pooled using the mitools package (Lumley, 2014).
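
Put together, the imputation-and-pooling workflow named on this slide might look roughly as follows. This is a sketch using `dat` from the data-generation example above; the choice of `method = "norm"` is our assumption, not a detail reported on the slide.

```r
## Impute with mice, pool with mitools
library(mice)
library(mitools)

## z2 stays in the imputation model even though the analysis model
## omits it, so the MAR predictor is available during imputation
imp <- mice(dat, m = 100, method = "norm", printFlag = FALSE)

## Wrap the 100 completed datasets for mitools
imps <- imputationList(lapply(seq_len(imp$m), function(i) complete(imp, i)))

## Fit the analysis model in each dataset and pool with Rubin's rules
fits <- with(imps, lm(y ~ x + z1))
summary(MIcombine(fits))
```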

Hypotheses
1. Traditional MI will produce unbiased estimates of β₁ in all conditions.
2. When rXZ = 0.0 or pMAR = 1.0, MID and LWD will produce unbiased estimates of β₁.
3. When pMAR < 1.0 and rXZ > 0.0, MID and LWD will produce biased estimates of β₁, and the bias will increase as pMAR decreases and rXZ increases.
4. Traditional MI will maintain power levels at least as high as those of MID and LWD in all conditions.
5. LWD and MID will manifest disproportionately greater power loss than traditional MI.

Results

PRB: N = 500; R² = 0.3
[Figure: percentage relative bias for β̂₁ plotted against proportion MAR, with columns for rXZ ∈ {0, 0.1, 0.3, 0.5} and rows for PM ∈ {0.1, 0.2, 0.4}.]

PRB: N = 250; R² = 0.3
[Figure: same layout as above, for N = 250.]

PRB: N = 100; R² = 0.3
[Figure: same layout as above, for N = 100.]

Power: N = 100; R² = 0.3
[Figure: empirical power for β̂₁ plotted against proportion MAR, with the same column/row layout.]

Discussion

Bias-Related Findings
Traditional MI did not lead to bias in most conditions, although when N = 100 and PM = 0.4 all methods induced relatively large biases. Unless the set of MAR predictors is a subset of the IVs in the analysis model, MID and LWD will produce biased estimates of regression slopes; traditional MI requires only that the MAR predictors be available for use during the imputation process. Science prefers parsimonious models, so it seems likely that important MAR predictors are often not represented in the set of analyzed IVs.

Power-Related Findings
Traditional MI did not suffer greater power loss than MID or LWD. Taking sufficiently many imputations mitigates any inflation of variability due to between-imputation variance; arguments for MI's inflation of variability are all based on the use of a very small number of imputations. The commonly cited justification for few (i.e., m = 5) imputations was made in 1987 (i.e., Rubin, 1987). Both MID and LWD suffered substantial power loss with high proportions of missing data: no matter the mathematical justification, both MID and LWD entail throwing away large portions of your dataset.

Conclusion
In special circumstances, LWD and MID will produce unbiased estimates of regression slopes, but these conditions are not likely to occur outside of strictly controlled experimental settings, and the negative consequences of assuming these special conditions hold, when they do not, can be severe. Moreover, estimated intercepts, means, variances, and correlations will still be biased.

Conclusion
The only methodological argument against traditional MI in favor of MID assumes the use of a very small number of imputations (i.e., m < 10). Taking m to be large enough ensures that traditional MI will do no worse than MID. Traditional MI will perform well as long as the MAR predictors are available for imputation, without requiring them to be included in the analysis model.

Limitations
The models we employed were very simple; some may question the ecological validity of our results, but we purposefully focused on internal validity. All relationships were linear, so the findings presented here may not fully generalize to MAR mechanisms that manifest through nonlinear relations or to nonlinear regression models (e.g., moderated regression, polynomial regression, generalized linear models). These nonlinear situations are important areas for future work.

Take-Home Message
Don't be fancy. Impute your DVs! (And don't delete them afterward.)

References
Little, R. J. A. (1992). Regression with missing X's: A review. Journal of the American Statistical Association, 87(420), 1227–1237.
Lumley, T. (2014). mitools: Tools for multiple imputation of missing data [Computer software manual]. Retrieved from https://cran.r-project.org/package=mitools (R package version 2.3)
R Core Team. (2014). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from http://www.r-project.org/
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys (Vol. 519). New York, NY: John Wiley & Sons.
van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67.
Von Hippel, P. T. (2007). Regression with missing Ys: An improved strategy for analyzing multiply imputed data. Sociological Methodology, 37(1), 83–117.

Extras

[Supplementary figures: percentage relative bias (PRB) panels for N ∈ {100, 250, 500} at R² = 0.15 and R² = 0.60, and empirical power panels for N ∈ {100, 250, 500} at R² ∈ {0.15, 0.30, 0.60}. Each figure plots the outcome against proportion MAR, with columns for rXZ ∈ {0, 0.1, 0.3, 0.5} and rows for PM ∈ {0.1, 0.2, 0.4}.]

Influence of Von Hippel (2007)
The MID technique has become relatively popular. A Web of Science search for citations of Von Hippel (2007) returns 297 results; filtering those hits to psychology-related subjects yields 79 citations. Of these 79 papers, 60 (75.95%) employed the MID approach in empirical research.