Analyzing Pilot Studies with Missing Observations

Analyzing Pilot Studies with Missing Observations
Monnie McGee, Department of Statistical Science, Southern Methodist University, Dallas, Texas
Co-authored with N. Bergasa (SUNY Downstate Medical Center), I. Ginsburg and D. Engler (Columbia Presbyterian Medical Center)
University of Texas at Dallas, April 19, 2005

Outline
1. Motivation: Gabapentin Study
2. Analysis with a Mixed-Effects Model
3. Other Important Facts about the Data
4. Dealing with the Real Data
5. Conclusions and Future Explorations

Gabapentin Study
- Protocol called for 16 subjects in pre-post format
- Half randomized to receive Gabapentin
- Main outcomes: Hourly Scratching Activity (HSA) and Visual Analogue Score (VAS)
- Two quantitations: baseline and after 6 weeks
- Quantitations required a 48-hour stay in the hospital

Mixed Effects Model Analysis
$$y_{ijk} = \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$
- $y_{ijk}$ is the response for the $i$th group and the $j$th quantitation on the $k$th subject.
- $\alpha_i$, $i = 1, 2$, represents the effect of treatment group.
- $\beta_j$, $j = 1, 2$, is the effect of the $j$th quantitation.
- $\gamma_{ij}$ is the interaction effect between group and quantitation.
- $\epsilon_{ijk} \sim N(0, \sigma^2 I)$
- The random effect is due to different initial levels of response for each subject on each quantitation.

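LME Results for HSA
Table columns: Effect, Num DF, Den DF, F Value, Pr > F. Rows: Constant, Group, Quant, Group × Quant, followed by the log likelihood. The numeric entries did not survive transcription; the one surviving "<" belongs to the Pr > F entry for Constant.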

Show Me the Data!
- Excel spreadsheet of the data
- Graphical display of HSA and VAS

Issues with the Data
- Lots of NAs in spreadsheet!
- Entire pre and/or post assessments missing for 4 subjects
- A priori difference in gabapentin and placebo groups
- Very small sample size
- Disparate beginning times
- HSA and VAS normalization
- Detection limit for HSA; finite scale for VAS

Types of Missingness
- Missing Completely at Random (MCAR): the probability of an observation being missing does not depend on observed or unobserved measurements.
  $$\Pr(R \mid y_o, y_m) = \Pr(R)$$
- Missing at Random (MAR): the probability of an observation being missing, given the observed data, does not depend on the unobserved data.
  $$\Pr(R \mid y_o, y_m) = \Pr(R \mid y_o)$$

Types of Missingness (cont'd)
- Missing Not at Random (MNAR): the probability of an observation being missing depends on the value of the missing observation itself.
- "In most situations, the true mechanism is probably MNAR." (Carpenter & Kenward, 2005)
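
The three mechanisms can be contrasted with a small simulation. The sketch below is illustrative only: the data-generating model, the logistic missingness probabilities, and all function names are assumptions made for this example, not part of the talk. It deletes follow-up values under each mechanism and compares the mean of the values that remain.

```python
import math
import random


def logistic(x):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_missingness(n=2000, seed=0):
    """Generate (baseline, followup) pairs plus three missingness flags.

    Hypothetical setup: baseline is always observed (y_o); followup is the
    value that may be missing (y_m). A flag of True means "missing".
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)
        followup = baseline + rng.gauss(0.0, 1.0)
        u = rng.random()
        rows.append({
            "baseline": baseline,
            "followup": followup,
            # MCAR: constant probability, unrelated to any data value.
            "mcar": u < 0.3,
            # MAR: probability depends only on the observed baseline.
            "mar": u < logistic(baseline - 0.85),
            # MNAR: probability depends on the unobserved follow-up itself.
            "mnar": u < logistic(followup - 0.85),
        })
    return rows


def observed_mean(rows, mechanism):
    """Mean follow-up among rows that are not flagged missing."""
    kept = [r["followup"] for r in rows if not r[mechanism]]
    return sum(kept) / len(kept)


rows = simulate_missingness()
for mechanism in ("mcar", "mar", "mnar"):
    print(mechanism, round(observed_mean(rows, mechanism), 3))
```

The true follow-up mean in this setup is 0. Under MCAR the observed mean stays close to it, while under MAR and MNAR large values are preferentially deleted and the naive observed mean drifts downward. Under MAR the drift can in principle be corrected using the baseline; under MNAR it cannot be corrected from the observed data alone.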

Missingness in Gabapentin Data
- Due to the severity of missingness in hours of quantitation, only the first 24 hours of data were used.
- Two subjects' pre-treatment data are missing due to equipment malfunction.
- Intermittent data are missing due to eating, sleeping, showering, etc. during the hospital stay.
- Some data may be missing due to severity of scratching or severity of illness (two subjects with missing post-treatment measurements).
- Our mechanism is mostly MAR.

Now What?
Fill in the missing values and rerun the mixed model. Options:
- Mean-filled values
- Regression-mean imputation
- Last Observation Carried Forward (LOCF)
- Hot Deck (or Cold Deck) Imputation
- Likelihood-based imputation
- Time Series Approach (Pfeffermann and Nathan, 2002)
NB: Most results pertaining to inference are asymptotic results.
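
The two simplest fill-in rules in the list above can be sketched in a few lines. This is a minimal illustration on a made-up hourly series, with `None` marking a missing hour; the function names and data are hypothetical and are not the code used for the study.

```python
def mean_fill(series):
    """Replace each missing entry (None) with the mean of the observed entries."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("cannot mean-fill a series with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def locf_fill(series):
    """Carry the last observed value forward; gaps before the first
    observation have nothing to carry and stay None."""
    filled = []
    last = None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


hourly = [4.0, None, 6.0, None, None, 8.0]  # hypothetical hourly readings
print(mean_fill(hourly))  # [4.0, 6.0, 6.0, 6.0, 6.0, 8.0]
print(locf_fill(hourly))  # [4.0, 4.0, 6.0, 6.0, 6.0, 8.0]
```

Both rules return a single completed series with no record of which values were imputed, which is one reason the variance of downstream estimates tends to be understated.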

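Results: Mean-Filled Values
Table columns: Effect, Num DF, Den DF, F Value, Pr > F. Rows: Constant, Group, Quant, Group × Quant, followed by the log likelihood. The numeric entries did not survive transcription; the surviving "<" marks show Pr > F reported as below a threshold for Constant, Quant, and Group × Quant.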

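Results: LOCF-Filled Values
Table columns: Effect, Num DF, Den DF, F Value, Pr > F. Rows: Constant, Group, Quant, Group × Quant, followed by the log likelihood. The numeric entries did not survive transcription; the surviving "<" marks show Pr > F reported as below a threshold for Constant, Quant, and Group × Quant.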

Summary Thus Far
- Carpenter and Kenward (2005) call mean replacement and LOCF unprincipled methods.
- Both lead to biased estimates of parameters.
- Simple mean imputation tends to dilute associations.
- LOCF distorts mean and covariance structure, even for a single time point, even under MCAR.
- Regression mean imputation can generate unbiased estimates, but the variance is still typically underestimated.
- Can't replace entire quantitations with mean or LOCF.

Nearest Neighbor Hot Deck Imputation
Let $y_i = (y_{i1}, \ldots, y_{iK})^T$ be a $K \times 1$ complete data vector of outcomes.
Let $y_i = (y_{\mathrm{obs},i}, y_{\mathrm{mis},i})$, where $y_{\mathrm{obs},i}$ is the observed part and $y_{\mathrm{mis},i}$ is the missing part of $y_i$. Then
$$\hat{y}_{it} = y_{lt} + (\bar{y}_{\mathrm{obs},i} - \bar{y}_{\mathrm{obs},l}),$$
where $\bar{y}_{\mathrm{obs},i}$ is the mean of the observed values for subject $i$.
Subject $l$ is called the donor.
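
A minimal sketch of the donor adjustment above: the donor's value at the missing time point is shifted by the difference between the recipient's and the donor's observed means. One assumption is made here that the slide does not spell out: both means are taken over the time points where recipient and donor are both observed. The function name and data are hypothetical.

```python
from statistics import fmean


def hot_deck_impute(recipient, donor):
    """Fill the recipient's missing entries (None) from a donor series.

    Imputed value at time t: donor[t] + (mean_obs_recipient - mean_obs_donor).
    Assumption: both means use only time points observed in both series.
    """
    shared = [t for t, (r, d) in enumerate(zip(recipient, donor))
              if r is not None and d is not None]
    if not shared:
        raise ValueError("recipient and donor share no observed time points")
    shift = fmean(recipient[t] for t in shared) - fmean(donor[t] for t in shared)

    completed = []
    for r, d in zip(recipient, donor):
        if r is not None:
            completed.append(r)            # keep observed values untouched
        elif d is not None:
            completed.append(d + shift)    # donor value, level-adjusted
        else:
            completed.append(None)         # donor cannot help at this time
    return completed


recipient = [10.0, None, 14.0, None]   # hypothetical subject i
donor = [7.0, 8.0, 9.0, 12.0]          # hypothetical donor l
print(hot_deck_impute(recipient, donor))  # [10.0, 12.0, 14.0, 16.0]
```

The shift keeps the donor's time pattern while matching the recipient's overall level, which matters when subjects start from different baseline levels.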

Choosing a Donor
We want a donor that is close to the subject whose observations are missing.
Close is defined by a metric, e.g.
$$d(i,j) = \max_k |x_{ik} - x_{jk}|,$$
where $x_i = (x_{i1}, \ldots, x_{iK})^T$ are the values of $K$ appropriately scaled covariates for a unit $i$ at which $y_i$ is missing.

Donors for TS Data
Suppose subject $i$ is missing a value at time $t$. The closest donor is defined as
$$d_j(t) = \min_j \sum_{t=1}^{T} |x_{it} - x_{jt}|,$$
for all $j = 1, \ldots, n-1$.
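
Donor selection under the two closeness measures can be sketched as follows. The maximum-deviation metric follows the slide directly; the summed version assumes absolute differences over time, which is a reading of the time-series formula rather than something the transcript states unambiguously. All names and covariate values are hypothetical.

```python
def max_abs_distance(x_i, x_j):
    """Maximum-deviation metric: the largest coordinate-wise gap."""
    return max(abs(a - b) for a, b in zip(x_i, x_j))


def sum_abs_distance(x_i, x_j):
    """Time-series closeness: gaps accumulated over all time points
    (absolute differences are an assumption of this sketch)."""
    return sum(abs(a - b) for a, b in zip(x_i, x_j))


def closest_donor(target, candidates, distance):
    """Return the name of the candidate whose vector is nearest to target."""
    return min(candidates, key=lambda name: distance(target, candidates[name]))


target = [1.0, 2.0, 3.0]
candidates = {
    "A": [1.6, 2.6, 3.6],   # uniformly a little off
    "B": [1.0, 2.0, 4.0],   # exact except at one point
    "C": [5.0, 5.0, 5.0],   # far away
}
print(closest_donor(target, candidates, max_abs_distance))  # A
print(closest_donor(target, candidates, sum_abs_distance))  # B
```

The two measures can disagree, as they do here, so the choice of metric is part of the imputation model and not a neutral detail.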

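Hot Deck Results
Table columns: Effect, Num DF, Den DF, F Value, Pr > F. Rows: Constant, Group, Quant, Group × Quant, followed by the log likelihood. The numeric entries did not survive transcription; the surviving "<" marks show Pr > F reported as below a threshold for Constant, Quant, and Group × Quant.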

A Modification
- Hot Deck Imputation provides us with only one data set, which we take as the real data.
- Multiple Imputation (MI) provides us with multiple data sets, which we can use to estimate uncertainty about the correct nonresponse model.
- BUT: MI can be complicated.
- Estimate multiple data sets using nearest neighbor hot deck imputation (NNHDI) with additive noise.

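Modified NNHDI Results
Results for 3 imputations of NNHDI with additive N(0, 29) noise.
Table columns: Effect, Imputation, Num DF, Den DF, F Value, Pr > F. Rows: Group, Quant, and Group × Quant, each for imputations A, B, and C, followed by the log likelihoods for A, B, and C. The numeric entries did not survive transcription; the surviving "<" marks show Pr > F reported as below a threshold for Quant and Group × Quant in all three imputations.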

Nonresponse Uncertainty
Let $\hat{\theta}_d$ and $W_d$, $d = 1, \ldots, D$, be $D$ complete-data estimates and their associated variances for $\theta$. Then
$$\bar{\theta}_D = \frac{1}{D} \sum_{d=1}^{D} \hat{\theta}_d,$$
and the average within-imputation variance is
$$\bar{W}_D = \frac{1}{D} \sum_{d=1}^{D} W_d.$$

More Uncertainty
The between-imputation variance is
$$B_D = \frac{1}{D-1} \sum_{d=1}^{D} (\hat{\theta}_d - \bar{\theta}_D)^2.$$
Total variability is
$$T_D = \bar{W}_D + \frac{D+1}{D} B_D,$$
and $\hat{\gamma}_D = (1 + 1/D) B_D / T_D$ is an estimate of the fraction of information about $\theta$ missing due to nonresponse (Little and Rubin, pp. 86-87).
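
The combining rules above translate directly into code. The sketch below takes the D complete-data estimates and their variances and returns the pooled estimate, the within- and between-imputation variances, the total variance, and the estimated fraction of missing information. The function name and the example numbers are hypothetical.

```python
def combine_imputations(estimates, variances):
    """Pool D complete-data estimates and variances.

    Returns theta_bar, w_bar, b, t, and gamma_hat as defined in the text.
    """
    d = len(estimates)
    if d < 2 or len(variances) != d:
        raise ValueError("need at least two imputations with matching variances")

    theta_bar = sum(estimates) / d                      # pooled estimate
    w_bar = sum(variances) / d                          # within-imputation
    b = sum((e - theta_bar) ** 2 for e in estimates) / (d - 1)  # between
    t = w_bar + (d + 1) / d * b                         # total variability
    gamma_hat = (1 + 1 / d) * b / t                     # fraction missing info
    return {"theta_bar": theta_bar, "w_bar": w_bar, "b": b,
            "t": t, "gamma_hat": gamma_hat}


# Three hypothetical imputations of the same parameter.
pooled = combine_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled)
```

For these inputs the pooled estimate is 2.0, the between-imputation variance is 1.0, the total variance is 0.5 + (4/3)(1.0), and the estimated fraction of missing information is 8/11, about 0.73. A value that high would signal that the imputations disagree substantially relative to the within-imputation uncertainty.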

Uncertainty Calculations
For the Gabapentin data, the table has columns Effect, $\bar{\theta}_D$, $\bar{W}_D$, $B_D$, $T_D$, $\hat{\gamma}_D$ and rows Group, Quant, Interaction. The numeric entries did not survive transcription.

Power and Size
- Case 1: Pretest/posttest study with one normally distributed random variable ($\sigma^2 = 1$) and data MCAR.
- Case 2: Case 1 with chunks of missing data (wave nonresponse).
- Case 3: Wave nonresponse for longitudinal data with no correlation, analyzed with a mixed model.
- Case 4: Same as Case 3 with AR(1) structure in the data.
- Compared size and power for 10%, 30%, and 50% missing values.
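
A size and power check in the spirit of Case 1 can be sketched by Monte Carlo. The version below is a simplified stand-in, not the simulation from the talk: it assumes independent unit-variance pre and post scores, deletes 30% of pairs completely at random, and runs a complete-case paired t-test. The critical value 2.086 is the two-sided 5% point of Student's t with 20 degrees of freedom, which matches the 21 complete pairs left when N = 30.

```python
import math
import random

T_CRIT_DF20 = 2.086  # two-sided 5% critical value, Student's t, 20 df


def paired_t(diffs):
    """One-sample t statistic for the paired differences."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def rejection_rate(mu_d, n=30, n_missing=9, reps=2000, seed=1,
                   t_crit=T_CRIT_DF20):
    """Share of simulated studies that reject H0: mu_d = 0.

    t_crit must match n - n_missing - 1 degrees of freedom; the default
    corresponds to n = 30 with 9 pairs (30%) deleted.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        pre = [rng.gauss(0.0, 1.0) for _ in range(n)]
        post = [rng.gauss(mu_d, 1.0) for _ in range(n)]
        # MCAR: the pairs kept are chosen without looking at the data.
        keep = rng.sample(range(n), n - n_missing)
        diffs = [post[i] - pre[i] for i in keep]
        if abs(paired_t(diffs)) > t_crit:
            rejections += 1
    return rejections / reps


print("size  (mu_d = 0):", rejection_rate(0.0))
print("power (mu_d = 1):", rejection_rate(1.0))
```

With mu_d = 0 the rejection rate estimates the size of the test and should sit near the nominal 5%; with mu_d = 1 it estimates power. Changing `n_missing` requires supplying the matching critical value through `t_crit`.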

Case 1: A Simple Paired t-test
Table layout: columns for N = 10 and N = 30, each split by % missing (None, 10%, 30%); three rows, one per value of $\mu_d$. The numeric entries and the $\mu_d$ values did not survive transcription.

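Case 2: Paired t-test with Wave Nonresponse
Table layout: columns for N = 10 and N = 30, split by % missing (header as transcribed: 30%, 50%, 10%, 30%, 50%); two rows, one per value of $\mu_d$. The numeric entries and the $\mu_d$ values did not survive transcription.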

Case 3: Longitudinal WN Data
Table layout: columns for N = 10 and N = 30, each split by % missing (30%, 50%); rows for the Group and Quant effects under the scenarios $\mu_d = 0$ and $\mu_d = 2$. The numeric entries did not survive transcription.

Case 4: Longitudinal AR(1) Data
Table layout: columns for N = 10 and N = 30, each split by % missing (30%, 50%); rows for the Group and Quant effects under the scenarios $\phi_1 = \phi_2$ and $\phi_1 \neq \phi_2$. The numeric entries did not survive transcription.

The Real Issue
- How good are the parameter estimates under the above scenarios?
- Results about estimation in the literature are asymptotic.
- Literature suggests a transformation that makes normality more accurate for small samples.
- Searle (1970) gives information matrices for mixed effects models with unbalanced data.
- Large literature on efficiency for various experimental designs in the presence of missing observations.

Remaining Issues
- Automating choice of like individuals for replacement values
- Variance of random perturbation
- Generating data substitutions from models
- Calculate efficiencies, bias, and variance
- Detection limits, a priori differences in groups, normalization, etc.

References
1. Carpenter, James and Kenward, Mike (2005). Economic and Social Research Council Missing Data Website.
2. Little, Roderick J.A. and Rubin, Donald B. (2002). Statistical Analysis with Missing Data (2nd edition). New York: Wiley Interscience.
3. Pfeffermann, Danny and Nathan, Gad (2002). Imputation for Wave Nonresponse: Existing Methods and a Time Series Approach, in Survey Nonresponse (Robert M. Groves, Don A. Dillman, John L. Eltinge, and Roderick J.A. Little, eds.). New York: Wiley.
4. Prescott, P. and Mansson, R.A. (2002). Efficiency of Pair Wise Treatment Comparisons in Incomplete Block Experiments Subject to the Loss of a Block of Observations. Communications in Statistics: Theory and Methods, 31.
5. Searle, S. R. (1970). Large Sample Variances of Maximum Likelihood Estimators of Variance Components Using Unbalanced Data. Biometrics, 26.

A Priori Difference in Groups
- Reassign subjects to groups at random, regardless of true assignment
- Calculate two-sample t-tests for each assignment
- 1000 replications of assignments
- Results: percentage of p-values < 0.05
Table layout: columns Data, Min, Median, Max; rows Original, Mean Repl, LOCF Repl. The numeric entries did not survive transcription.
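
The reassignment check above can be sketched as follows. This is an illustration under stated assumptions, not the original analysis: the 16 baseline values are made up, groups are split 8 and 8, the t-test uses a pooled variance, and significance is judged against 2.145, the two-sided 5% critical value of Student's t with 14 degrees of freedom.

```python
import math
import random

T_CRIT_DF14 = 2.145  # two-sided 5% critical value, Student's t, 14 df


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))


def reassignment_rejection_rate(values, reps=1000, seed=0,
                                t_crit=T_CRIT_DF14):
    """Share of random half/half reassignments with |t| above t_crit."""
    rng = random.Random(seed)
    half = len(values) // 2
    hits = 0
    for _ in range(reps):
        shuffled = list(values)
        rng.shuffle(shuffled)  # ignore the true group labels
        if abs(two_sample_t(shuffled[:half], shuffled[half:])) > t_crit:
            hits += 1
    return hits / reps


# Hypothetical baseline measurements for 16 subjects.
baseline = [12.0, 15.5, 9.0, 20.0, 14.0, 11.5, 18.0, 16.5,
            13.0, 10.0, 17.0, 19.5, 8.5, 21.0, 12.5, 15.0]
print(reassignment_rejection_rate(baseline))
```

If group labels carry no information, roughly 5% of random reassignments should be significant at the 0.05 level. Running the same function on each hour or each completed data set gives the rejection percentages that the min, median, and max columns of the table summarize.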


More information

Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates

Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates Toutenburg, Fieger: Using diagnostic measures to detect non-mcar processes in linear regression models with missing covariates Sonderforschungsbereich 386, Paper 24 (2) Online unter: http://epub.ub.uni-muenchen.de/

More information

Analysis of Incomplete Non-Normal Longitudinal Lipid Data

Analysis of Incomplete Non-Normal Longitudinal Lipid Data Analysis of Incomplete Non-Normal Longitudinal Lipid Data Jiajun Liu*, Devan V. Mehrotra, Xiaoming Li, and Kaifeng Lu 2 Merck Research Laboratories, PA/NJ 2 Forrest Laboratories, NY *jiajun_liu@merck.com

More information

Miscellanea A note on multiple imputation under complex sampling

Miscellanea A note on multiple imputation under complex sampling Biometrika (2017), 104, 1,pp. 221 228 doi: 10.1093/biomet/asw058 Printed in Great Britain Advance Access publication 3 January 2017 Miscellanea A note on multiple imputation under complex sampling BY J.

More information

Imputation Algorithm Using Copulas

Imputation Algorithm Using Copulas Metodološki zvezki, Vol. 3, No. 1, 2006, 109-120 Imputation Algorithm Using Copulas Ene Käärik 1 Abstract In this paper the author demonstrates how the copulas approach can be used to find algorithms for

More information

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data Biometrics 000, 000 000 DOI: 000 000 0000 Discussion of Identifiability and Estimation of Causal Effects in Randomized Trials with Noncompliance and Completely Non-ignorable Missing Data Dylan S. Small

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Weighting Missing Data Coding and Data Preparation Wrap-up Preview of Next Time. Data Management

Weighting Missing Data Coding and Data Preparation Wrap-up Preview of Next Time. Data Management Data Management Department of Political Science and Government Aarhus University November 24, 2014 Data Management Weighting Handling missing data Categorizing missing data types Imputation Summary measures

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Linear Mixed Models for Longitudinal Data with Nonrandom Dropouts

Linear Mixed Models for Longitudinal Data with Nonrandom Dropouts Journal of Data Science 4(2006), 447-460 Linear Mixed Models for Longitudinal Data with Nonrandom Dropouts Ahmed M. Gad and Noha A. Youssif Cairo University Abstract: Longitudinal studies represent one

More information

Causal Inference Basics

Causal Inference Basics Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,

More information

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level

Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level A Monte Carlo Simulation to Test the Tenability of the SuperMatrix Approach Kyle M Lang Quantitative Psychology

More information

Downloaded from:

Downloaded from: Hossain, A; DiazOrdaz, K; Bartlett, JW (2017) Missing binary outcomes under covariate-dependent missingness in cluster randomised trials. Statistics in medicine. ISSN 0277-6715 DOI: https://doi.org/10.1002/sim.7334

More information

Nonresponse weighting adjustment using estimated response probability

Nonresponse weighting adjustment using estimated response probability Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy

More information

Known unknowns : using multiple imputation to fill in the blanks for missing data

Known unknowns : using multiple imputation to fill in the blanks for missing data Known unknowns : using multiple imputation to fill in the blanks for missing data James Stanley Department of Public Health University of Otago, Wellington james.stanley@otago.ac.nz Acknowledgments Cancer

More information

STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS

STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS STATISTICAL INFERENCE FOR SURVEY DATA ANALYSIS David A Binder and Georgia R Roberts Methodology Branch, Statistics Canada, Ottawa, ON, Canada K1A 0T6 KEY WORDS: Design-based properties, Informative sampling,

More information

A Significance Test for the Lasso

A Significance Test for the Lasso A Significance Test for the Lasso Lockhart R, Taylor J, Tibshirani R, and Tibshirani R Ashley Petersen May 14, 2013 1 Last time Problem: Many clinical covariates which are important to a certain medical

More information

Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent

Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent Bayesian Analysis of Multivariate Normal Models when Dimensions are Absent Robert Zeithammer University of Chicago Peter Lenk University of Michigan http://webuser.bus.umich.edu/plenk/downloads.htm SBIES

More information

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London Bayesian methods for missing data: part 1 Key Concepts Nicky Best and Alexina Mason Imperial College London BAYES 2013, May 21-23, Erasmus University Rotterdam Missing Data: Part 1 BAYES2013 1 / 68 Outline

More information

Topics and Papers for Spring 14 RIT

Topics and Papers for Spring 14 RIT Eric Slud Feb. 3, 204 Topics and Papers for Spring 4 RIT The general topic of the RIT is inference for parameters of interest, such as population means or nonlinearregression coefficients, in the presence

More information

Planned Missingness Designs and the American Community Survey (ACS)

Planned Missingness Designs and the American Community Survey (ACS) Planned Missingness Designs and the American Community Survey (ACS) Steven G. Heeringa Institute for Social Research University of Michigan Presentation to the National Academies of Sciences Workshop on

More information

Pooling multiple imputations when the sample happens to be the population.

Pooling multiple imputations when the sample happens to be the population. Pooling multiple imputations when the sample happens to be the population. Gerko Vink 1,2, and Stef van Buuren 1,3 arxiv:1409.8542v1 [math.st] 30 Sep 2014 1 Department of Methodology and Statistics, Utrecht

More information

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial William R. Gillespie Pharsight Corporation Cary, North Carolina, USA PAGE 2003 Verona,

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building strategies

More information

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse Nanhua Zhang Division of Biostatistics & Epidemiology Cincinnati Children s Hospital Medical Center (Joint work

More information

BIOSTATISTICAL METHODS

BIOSTATISTICAL METHODS BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects

More information

On the bias of the multiple-imputation variance estimator in survey sampling

On the bias of the multiple-imputation variance estimator in survey sampling J. R. Statist. Soc. B (2006) 68, Part 3, pp. 509 521 On the bias of the multiple-imputation variance estimator in survey sampling Jae Kwang Kim, Yonsei University, Seoul, Korea J. Michael Brick, Westat,

More information

Sampling and incomplete network data

Sampling and incomplete network data 1/58 Sampling and incomplete network data 567 Statistical analysis of social networks Peter Hoff Statistics, University of Washington 2/58 Network sampling methods It is sometimes difficult to obtain a

More information

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA

VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA Submitted to the Annals of Applied Statistics VARIANCE ESTIMATION FOR NEAREST NEIGHBOR IMPUTATION FOR U.S. CENSUS LONG FORM DATA By Jae Kwang Kim, Wayne A. Fuller and William R. Bell Iowa State University

More information

Graybill Conference Poster Session Introductions

Graybill Conference Poster Session Introductions Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Chapter 4. Parametric Approach. 4.1 Introduction

Chapter 4. Parametric Approach. 4.1 Introduction Chapter 4 Parametric Approach 4.1 Introduction The missing data problem is already a classical problem that has not been yet solved satisfactorily. This problem includes those situations where the dependent

More information

Biostat 2065 Analysis of Incomplete Data

Biostat 2065 Analysis of Incomplete Data Biostat 2065 Analysis of Incomplete Data Gong Tang Dept of Biostatistics University of Pittsburgh September 13 & 15, 2005 1. Complete-case analysis (I) Complete-case analysis refers to analysis based on

More information

Comparison of methods for repeated measures binary data with missing values. Farhood Mohammadi. A thesis submitted in partial fulfillment of the

Comparison of methods for repeated measures binary data with missing values. Farhood Mohammadi. A thesis submitted in partial fulfillment of the Comparison of methods for repeated measures binary data with missing values by Farhood Mohammadi A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Biostatistics

More information

A Bayesian Perspective on Residential Demand Response Using Smart Meter Data

A Bayesian Perspective on Residential Demand Response Using Smart Meter Data A Bayesian Perspective on Residential Demand Response Using Smart Meter Data Datong-Paul Zhou, Maximilian Balandat, and Claire Tomlin University of California, Berkeley [datong.zhou, balandat, tomlin]@eecs.berkeley.edu

More information

A Comparison of Multiple Imputation Methods for Missing Covariate Values in Recurrent Event Data

A Comparison of Multiple Imputation Methods for Missing Covariate Values in Recurrent Event Data A Comparison of Multiple Imputation Methods for Missing Covariate Values in Recurrent Event Data By Zhao Huo Department of Statistics Uppsala University Supervisor: Ronnie Pingel 2015 Abstract Multiple

More information

Recent Advances in the analysis of missing data with non-ignorable missingness

Recent Advances in the analysis of missing data with non-ignorable missingness Recent Advances in the analysis of missing data with non-ignorable missingness Jae-Kwang Kim Department of Statistics, Iowa State University July 4th, 2014 1 Introduction 2 Full likelihood-based ML estimation

More information

Time Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

Ordinary Least Squares Regression

Ordinary Least Squares Regression Ordinary Least Squares Regression Goals for this unit More on notation and terminology OLS scalar versus matrix derivation Some Preliminaries In this class we will be learning to analyze Cross Section

More information

Integrated approaches for analysis of cluster randomised trials

Integrated approaches for analysis of cluster randomised trials Integrated approaches for analysis of cluster randomised trials Invited Session 4.1 - Recent developments in CRTs Joint work with L. Turner, F. Li, J. Gallis and D. Murray Mélanie PRAGUE - SCT 2017 - Liverpool

More information

Sample Size and Power Considerations for Longitudinal Studies

Sample Size and Power Considerations for Longitudinal Studies Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design 1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary

More information

Survival models and health sequences

Survival models and health sequences Survival models and health sequences Walter Dempsey University of Michigan July 27, 2015 Survival Data Problem Description Survival data is commonplace in medical studies, consisting of failure time information

More information

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as ST 51, Summer, Dr. Jason A. Osborne Homework assignment # - Solutions 1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available

More information

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010 Strategy for modelling non-random missing data mechanisms in longitudinal studies using Bayesian methods: application to income data from the Millennium Cohort Study Alexina Mason Department of Epidemiology

More information

Comment on Tests of Certain Types of Ignorable Nonresponse in Surveys Subject to Item Nonresponse or Attrition

Comment on Tests of Certain Types of Ignorable Nonresponse in Surveys Subject to Item Nonresponse or Attrition Institute for Policy Research Northwestern University Working Paper Series WP-09-10 Comment on Tests of Certain Types of Ignorable Nonresponse in Surveys Subject to Item Nonresponse or Attrition Christopher

More information

Introduction An approximated EM algorithm Simulation studies Discussion

Introduction An approximated EM algorithm Simulation studies Discussion 1 / 33 An Approximated Expectation-Maximization Algorithm for Analysis of Data with Missing Values Gong Tang Department of Biostatistics, GSPH University of Pittsburgh NISS Workshop on Nonignorable Nonresponse

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Methods for Handling Missing Data

Methods for Handling Missing Data Methods for Handling Missing Data Joseph Hogan Brown University MDEpiNet Conference Workshop October 22, 2018 Hogan (MDEpiNet) Missing Data October 22, 2018 1 / 160 Course Overview I 1 Introduction and

More information

Measurement error as missing data: the case of epidemiologic assays. Roderick J. Little

Measurement error as missing data: the case of epidemiologic assays. Roderick J. Little Measurement error as missing data: the case of epidemiologic assays Roderick J. Little Outline Discuss two related calibration topics where classical methods are deficient (A) Limit of quantification methods

More information

Multiple Imputation For Missing Ordinal Data

Multiple Imputation For Missing Ordinal Data Journal of Modern Applied Statistical Methods Volume 4 Issue 1 Article 26 5-1-2005 Multiple Imputation For Missing Ordinal Data Ling Chen University of Arizona Marian Toma-Drane University of South Carolina

More information

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q) Supplementary information S7 Testing for association at imputed SPs puted SPs Score tests A Score Test needs calculations of the observed data score and information matrix only under the null hypothesis,

More information

University of Michigan School of Public Health

University of Michigan School of Public Health University of Michigan School of Public Health The University of Michigan Department of Biostatistics Working Paper Series Year 003 Paper Weighting Adustments for Unit Nonresponse with Multiple Outcome

More information

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census

More information

ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

More information

More about linear mixed models

More about linear mixed models Faculty of Health Sciences Contents More about linear mixed models Analysis of repeated measurements, NFA 2016 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Bios 6648: Design & conduct of clinical research

Bios 6648: Design & conduct of clinical research Bios 6648: Design & conduct of clinical research Section 2 - Formulating the scientific and statistical design designs 2.5(b) Binary 2.5(c) Skewed baseline (a) Time-to-event (revisited) (b) Binary (revisited)

More information

Combining Non-probability and Probability Survey Samples Through Mass Imputation

Combining Non-probability and Probability Survey Samples Through Mass Imputation Combining Non-probability and Probability Survey Samples Through Mass Imputation Jae-Kwang Kim 1 Iowa State University & KAIST October 27, 2018 1 Joint work with Seho Park, Yilin Chen, and Changbao Wu

More information