Inferences on missing information under multiple imputation and two-stage multiple imputation

Ofer Harel
Department of Statistics, University of Connecticut
Prepared for the Missing Data Approaches in the Health and Social Sciences: A Modern Survey meeting, October 2010.

Outline: Background; Methods; Multiple imputation (MI); Two-stage MI; Asymptotic results; Number of imputations; Discussion; Current/future work.

Background
The missing-data problem: Most statistical analysis and estimation procedures were not designed to handle missing values. Even a small amount of missing data can cause great difficulty. The missing-data aspect is a nuisance, not of primary interest.
The goal: To make statistically valid inferences about population parameters from an incomplete data set, not to estimate, predict, or recover the missing values themselves. Untestable assumptions are inevitable. Sensitivity analysis is helpful.

Background: Information
The information concept was introduced by Fisher in the 1930s. The use of information, information matrices, and other functions of the information concept has become very common. Examples include the inverse of the covariance matrix and information-based criteria such as AIC and BIC. The information concept has become important for both frequentists and Bayesians.

Background: Information
I am interested in using the information concept in an incomplete-data situation. Rubin introduced the rate of missing information after partitioning the complete information into observed and missing parts. Both Rubin and Schafer argue for the importance of this measure. I will develop this concept, show its importance, and touch on possible extensions. How do we define the rates of missing information?

Why do we care?
Can we use MI (or any other missing-data procedure) if we have 1%, 5%, 10%, 25%, 50%, 75%, or 90% missing values? What is the cut point? How confident are we in the results from incomplete data? When using MI, how do we determine the number of imputations? Can we evaluate the effect of a missing value or a group of missing values? Can we evaluate our missing-data model and/or its effect on our inference?

Multiple imputation
Multiple imputation (Rubin, 1987) is a simulation-based approach to missing data.
Imputation: Create m imputations of the missing data, $Y_{mis}^{(1)}, \ldots, Y_{mis}^{(m)}$, under a suitable model.
Analysis: Analyze each of the m completed data sets in the same way.
Combination: Combine the m sets of estimates and variances using Rubin's (1987) rules.

Multiple imputation
MI is separated into 3 stages: the imputation stage, the analysis stage, and the combining-results stage.
[Diagram: observed data with missing entries (marked "?") feeding imputations 1, 2, ..., m.]

Multiple imputation
Imputation: Under an ignorability assumption we draw m sets of imputations from $P(Y_{mis} \mid Y_{obs})$, while under non-ignorable models we draw the imputations from $P(Y_{mis} \mid Y_{obs}, M)$.
Analysis: Given the completed data sets, we can use any complete-data procedure to analyze the data.

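Multiple imputation
Combining rules: Calculate and store the estimates $\hat{Q}^{(j)}$ and variances $U^{(j)}$ for $j = 1, \ldots, m$, and combine:
\[ \bar{Q} = m^{-1} \sum_{j=1}^{m} \hat{Q}^{(j)}, \qquad \bar{U} = m^{-1} \sum_{j=1}^{m} U^{(j)}, \]
\[ B = (m-1)^{-1} \sum_{j=1}^{m} \left( \hat{Q}^{(j)} - \bar{Q} \right)^2, \qquad T = \bar{U} + (1 + m^{-1}) B. \]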

MI - Cont.
This leads to an approximate 95% interval for Q,
\[ \bar{Q} \pm t_{\nu} \sqrt{T}, \]
where the degrees of freedom are
\[ \nu = (m-1) \left[ \frac{(1+m^{-1}) B}{T} \right]^{-2}. \]
The rate of missing information is
\[ \lambda = \frac{\bar{U}^{-1} - (\nu+1)(\nu+3)^{-1} T^{-1}}{\bar{U}^{-1}}. \]
This is approximately $\lambda = B / (\bar{U} + B)$ when $\nu$ is large.
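The combining rules and the rate of missing information above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `rubin_pool` and the dictionary keys are hypothetical, and the inputs are assumed to be scalar estimates with their squared standard errors.

```python
import math
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool m completed-data estimates and variances (hypothetical helper)."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates and one variance per estimate")
    q_bar = mean(estimates)          # pooled point estimate
    u_bar = mean(variances)          # average within-imputation variance
    b = variance(estimates)          # between-imputation variance, divisor m - 1
    t = u_bar + (1 + 1 / m) * b      # total variance
    if b == 0:
        # No between-imputation spread: infinite df, no missing information.
        nu, lam, lam_approx = math.inf, 0.0, 0.0
    else:
        nu = (m - 1) * (t / ((1 + 1 / m) * b)) ** 2
        lam = 1 - (nu + 1) / (nu + 3) * u_bar / t   # rate of missing information
        lam_approx = b / (u_bar + b)                # large-nu approximation
    return {"q_bar": q_bar, "u_bar": u_bar, "b": b, "t": t,
            "nu": nu, "lam": lam, "lam_approx": lam_approx}


pooled = rubin_pool([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(pooled)
```

With only m = 3 imputations, the exact and approximate rates differ noticeably in this toy input, which is in line with the later remark that small m gives noisy estimates of the rate.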

Rates of missing information
The rate of missing information is the ratio of the missing information to the complete information. It governs the convergence rates of the EM algorithm and of data augmentation: the lower the rate of missing information, the faster the convergence. The rate of missing information is also important in assessing how missing information contributes to inferential uncertainty about the population quantity of interest.

Rates of missing information
The naive estimate of the rate of missing information is the rate of missing values. The naive estimate will be accurate in only very limited situations. The rate of missing information is a function not only of the amount of missing values but also of their importance. When using MI with a small number of imputations, the estimate of the rate of missing information can be very noisy.

Two types of missing values
In many situations, missing values may be of two qualitatively different types. Examples include: planned versus unplanned missingness; unit non-response versus item non-response; refusal versus "don't know"; latent variables versus missing manifest items; deaths versus other types of dropout; dropouts versus subjects who return; dropout for reasons that might be closely related to outcomes versus dropout for other reasons.

Two-stage multiple imputation
Extends Rubin's (1987) framework to two types of missing values.
Imputation: Create m imputations of $Y_{mis}^{A}$, the first level of the missing data: $Y_{mis}^{A(1)}, \ldots, Y_{mis}^{A(m)}$. Then, for each $Y_{mis}^{A(j)}$, generate n imputations of $Y_{mis}^{B}$ from a conditional predictive distribution given $Y_{mis}^{A}$.
Analysis: Analyze each of the mn completed data sets in the same way.
Combination: Combine the mn sets of estimates and variances by Shen's (2000) rules.

Two-stage MI
[Diagram: each of the m first-stage imputations of A branches into n second-stage imputations of B, labeled 1, ..., n.]

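Shen rules
From Shen (2000, unpublished Ph.D. thesis), calculate and store $\hat{Q}^{(j,k)}$, the estimate of Q, and $U^{(j,k)}$, its squared standard error, for $j = 1, \ldots, m$ and $k = 1, \ldots, n$. Then:
\[ \bar{Q}_{..} = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} \hat{Q}^{(j,k)}, \qquad \bar{Q}_{j.} = \frac{1}{n} \sum_{k=1}^{n} \hat{Q}^{(j,k)}, \qquad \bar{U}_{..} = \frac{1}{mn} \sum_{j=1}^{m} \sum_{k=1}^{n} U^{(j,k)}, \]
\[ B = \frac{1}{m-1} \sum_{j=1}^{m} \left( \bar{Q}_{j.} - \bar{Q}_{..} \right)^2, \qquad W = \frac{1}{m} \sum_{j=1}^{m} \frac{1}{n-1} \sum_{k=1}^{n} \left( \hat{Q}^{(j,k)} - \bar{Q}_{j.} \right)^2. \]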

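Shen rules - Cont.
The total variance is
\[ T = \bar{U}_{..} + \left(1 + \frac{1}{m}\right) B + \left(1 - \frac{1}{n}\right) W. \]
An approximate 95% interval for Q is $\bar{Q}_{..} \pm t_{\nu} \sqrt{T}$, where the degrees of freedom are given by
\[ \nu^{-1} = \frac{1}{m(n-1)} \left( \frac{(1 - \frac{1}{n}) W}{T} \right)^2 + \frac{1}{m-1} \left( \frac{(1 + \frac{1}{m}) B}{T} \right)^2. \]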

Shen rules - Cont.
The overall estimated rate of missing information is
\[ \hat{\lambda} = \frac{B + (1 - n^{-1}) W}{\bar{U}_{..} + B + (1 - n^{-1}) W}. \]
The estimated rate of missing information due to $Y_{mis}^{B}$ if $Y_{mis}^{A}$ were known is
\[ \hat{\lambda}^{B|A} = \frac{W}{\bar{U}_{..} + W}, \]
and the difference $\hat{\lambda}^{A} = \hat{\lambda} - \hat{\lambda}^{B|A}$ represents the decrease in the rate of missing information if $Y_{mis}^{A}$ becomes known.
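As a rough sketch of the nested combining rules just described, the following Python function pools an m-by-n grid of estimates. The name `shen_pool`, the list-of-lists input layout (one row per first-stage imputation), and the returned keys are assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def shen_pool(q, u):
    """Pool an m x n grid of estimates q[j][k] and variances u[j][k] (hypothetical helper)."""
    m, n = len(q), len(q[0])
    if m < 2 or n < 2:
        raise ValueError("need m >= 2 nests with n >= 2 imputations each")
    q_bar_j = [mean(row) for row in q]            # nest means
    q_bar = mean(q_bar_j)                         # overall estimate
    u_bar = mean([mean(row) for row in u])        # average completed-data variance
    b = sum((qj - q_bar) ** 2 for qj in q_bar_j) / (m - 1)   # between-nest variance
    w = mean([variance(row) for row in q])        # within-nest variance, divisor n - 1
    c = 1 - 1 / n
    t = u_bar + (1 + 1 / m) * b + c * w           # total variance
    inv_nu = (c * w / t) ** 2 / (m * (n - 1)) + ((1 + 1 / m) * b / t) ** 2 / (m - 1)
    nu = 1 / inv_nu if inv_nu > 0 else math.inf
    lam = (b + c * w) / (u_bar + b + c * w)       # overall rate
    lam_b_given_a = w / (u_bar + w)               # rate due to B if A were known
    return {"q_bar": q_bar, "t": t, "nu": nu, "lam": lam,
            "lam_b_given_a": lam_b_given_a, "lam_a": lam - lam_b_given_a}


result = shen_pool([[1.0, 3.0], [5.0, 7.0]], [[1.0, 1.0], [1.0, 1.0]])
print(result)
```

The toy grid uses m = n = 2 purely to keep the arithmetic easy to follow; real applications would use far more imputations at each stage.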

Rates of missing information
The overall estimated rate of missing information in two-stage MI is equivalent to the rate of missing information in conventional MI. The partition of the rates of missing information is based on the partition of the missing values, and there are many ways to partition the missing data. As in conventional MI, a small number of imputations may lead to a noisy estimate of the rates of missing information.

Asymptotic Results - Conventional MI
The random imputations, $Y_{mis}^{(j)} \sim P(Y_{mis} \mid Y_{obs})$, induce a probability distribution on B. Rubin argues that
\[ \frac{(m-1) B}{B_{\infty}} \sim \chi^2_{m-1}, \]
where $B_{\infty} = V(\hat{Q}(Y_{obs}, Y_{mis}) \mid Y_{obs})$. Assuming that the above is reasonable, the Central Limit Theorem indicates that
\[ \sqrt{m/2}\, \frac{B - B_{\infty}}{B_{\infty}} \to N(0, 1) \quad \text{as } m \to \infty, \]
where $E(B) = B_{\infty}$ and $V(B) = 2 B_{\infty}^2 / (m-1)$.

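Asymptotic Results - Conventional MI
Because $\hat{\lambda} = B / (B + \bar{U})$, we can use the delta method to derive
\[ \frac{\sqrt{m}\,(\hat{\lambda} - \lambda)}{\sqrt{2 B_{\infty}^2 \bar{U}_{\infty}^2 / (B_{\infty} + \bar{U}_{\infty})^4}} \to N(0, 1), \]
where $B_{\infty}$ and $\bar{U}_{\infty}$ are the limiting values of B and $\bar{U}$. This may also be written as
\[ \frac{\sqrt{m}\,(\hat{\lambda} - \lambda)}{\sqrt{2}\, \lambda (1 - \lambda)} \to N(0, 1). \]
Therefore, an approximate 95% confidence interval for the population rate of missing information is
\[ \hat{\lambda} \pm 1.96\, \frac{\hat{\lambda} (1 - \hat{\lambda})}{\sqrt{m/2}}. \]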
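The interval above is simple enough to compute directly. The sketch below assumes a scalar estimate of the rate and uses a hypothetical function name, `lambda_ci`; the multiplier 1.96 is the one stated on the slide.

```python
import math


def lambda_ci(lam_hat, m, z=1.96):
    """Approximate interval for the rate of missing information (hypothetical helper)."""
    if not 0 <= lam_hat <= 1 or m < 1:
        raise ValueError("lam_hat must lie in [0, 1] and m must be positive")
    se = lam_hat * (1 - lam_hat) / math.sqrt(m / 2)   # asymptotic standard error
    return lam_hat - z * se, lam_hat + z * se


low, high = lambda_ci(0.5, 200)
print(round(low, 3), round(high, 3))
```

Near 0 or 1 this untransformed interval can extend outside the unit interval, which is the motivation for working on the logit scale.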

Asymptotic Results - Conventional MI
The rate of missing information is a proportion. In situations where the rate of missing information is close to zero or one, it may be useful to apply a normal approximation on the logit scale, $\mathrm{logit}(p) = \log[p / (1 - p)]$. This results in
\[ \sqrt{m}\, \left( \mathrm{logit}(\hat{\lambda}) - \mathrm{logit}(\lambda) \right) \to N(0, 2). \]
The logit transformation is a first-order variance-stabilizing transformation for $\hat{\lambda}$.
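A logit-scale interval follows from the N(0, 2) limit by building the interval for logit(λ) and mapping the endpoints back. The following is a sketch under that reading; the function name `lambda_ci_logit` and the back-transformation step are illustrative choices rather than something shown in the slides.

```python
import math


def lambda_ci_logit(lam_hat, m, z=1.96):
    """Interval for the rate of missing information built on the logit scale (hypothetical helper)."""
    if not 0 < lam_hat < 1 or m < 1:
        raise ValueError("lam_hat must lie strictly between 0 and 1 and m must be positive")
    centre = math.log(lam_hat / (1 - lam_hat))   # logit of the estimate
    se = math.sqrt(2 / m)                        # does not depend on lambda
    expit = lambda x: 1 / (1 + math.exp(-x))     # inverse logit
    return expit(centre - z * se), expit(centre + z * se)


low, high = lambda_ci_logit(0.5, 200)
print(round(low, 4), round(high, 4))
```

Because the endpoints pass through the inverse logit, the interval always stays inside (0, 1).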

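Asymptotic Results - two-stage MI
In two-stage MI, the sources of variance can be separated into between-nest and within-nest components:
\[ \hat{Q}_{jk} - \bar{Q}_{..} = (\bar{Q}_{j.} - \bar{Q}_{..}) + (\hat{Q}_{jk} - \bar{Q}_{j.}). \]

Source     DF         SS                                                               MS     E(MS)
Between    $m-1$      $n \sum_{j=1}^{m} (\bar{Q}_{j.} - \bar{Q}_{..})^2$               $nB$   $\sigma_W^2 + n\sigma_B^2$
Within     $m(n-1)$   $\sum_{j=1}^{m} \sum_{k=1}^{n} (\hat{Q}_{jk} - \bar{Q}_{j.})^2$  $W$    $\sigma_W^2$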

Asymptotic Results - two-stage MI
From the ANOVA literature it is known that the scaled mean squares are independent chi-square random variables:
\[ \left( \frac{\sigma_W^2 + n\sigma_B^2}{m-1} \right)^{-1} nB \sim \chi^2_{m-1}, \qquad \left( \frac{\sigma_W^2}{m(n-1)} \right)^{-1} W \sim \chi^2_{m(n-1)}. \]

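Asymptotic Results - two-stage MI
A first-order approximation to the joint distribution of B and W is
\[ \sqrt{m} \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix}^{-1/2} \left[ \begin{pmatrix} B \\ W \end{pmatrix} - E \begin{pmatrix} B \\ W \end{pmatrix} \right] \to N(0, I), \]
where $V_1 = 2(\sigma_W^2 + n\sigma_B^2)^2 / n^2$ and $V_2 = 2(\sigma_W^2)^2 / (n-1)$. Using a multivariate delta method expansion, similar to the procedure used for conventional MI, we obtain the joint limiting distribution of the estimated rates of missing information.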

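Asymptotic Results - two-stage MI
\[ \sqrt{m}\, \Sigma_1^{-1/2} \left[ \begin{pmatrix} \hat{\lambda} \\ \hat{\lambda}^{B|A} \end{pmatrix} - \begin{pmatrix} \lambda \\ \lambda^{B|A} \end{pmatrix} \right] \to N(0, I), \]
where, writing $D = \bar{U}_{..} + B + (1 - n^{-1}) W$,
\[ \Sigma_1 = \begin{pmatrix} \dfrac{V_1 \bar{U}_{..}^2 + V_2 \bar{U}_{..}^2 (1 - n^{-1})^2}{D^4} & \dfrac{V_2 \bar{U}_{..}^2 (1 - n^{-1})}{D^2 (\bar{U}_{..} + W)^2} \\[2ex] \dfrac{V_2 \bar{U}_{..}^2 (1 - n^{-1})}{D^2 (\bar{U}_{..} + W)^2} & \dfrac{V_2 \bar{U}_{..}^2}{(\bar{U}_{..} + W)^4} \end{pmatrix}. \]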

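Asymptotic Results - two-stage MI
Rewriting the covariance matrix $\Sigma_1$ as a function of $\lambda$ and $\lambda^{B|A}$ gives
\[ \Sigma_1 = \begin{pmatrix} 2(1-\lambda)^4 \left\{ \left[ \dfrac{\lambda}{1-\lambda} - \dfrac{\lambda^{B|A}}{1-\lambda^{B|A}} \dfrac{n-1}{n} \right]^2 + \dfrac{n-1}{n^2} \left[ \dfrac{\lambda^{B|A}}{1-\lambda^{B|A}} \right]^2 \right\} & \dfrac{2(1-\lambda)^2 (\lambda^{B|A})^2}{n} \\[2ex] \dfrac{2(1-\lambda)^2 (\lambda^{B|A})^2}{n} & \dfrac{2(1-\lambda^{B|A})^2 (\lambda^{B|A})^2}{n-1} \end{pmatrix}. \]
When the rates of missing information are close to zero or one, we can calculate an approximation on the logit scale,
\[ \sqrt{m}\, \Sigma_2^{-1/2} \left[ \begin{pmatrix} \mathrm{logit}(\hat{\lambda}) \\ \mathrm{logit}(\hat{\lambda}^{B|A}) \end{pmatrix} - \begin{pmatrix} \mathrm{logit}(\lambda) \\ \mathrm{logit}(\lambda^{B|A}) \end{pmatrix} \right] \to N(0, I), \]
where
\[ \Sigma_2 = \begin{pmatrix} \dfrac{2(1-\lambda)^2}{\lambda^2} \left\{ \left[ \dfrac{\lambda}{1-\lambda} - \dfrac{\lambda^{B|A}}{1-\lambda^{B|A}} \dfrac{n-1}{n} \right]^2 + \dfrac{n-1}{n^2} \left[ \dfrac{\lambda^{B|A}}{1-\lambda^{B|A}} \right]^2 \right\} & \dfrac{2(1-\lambda) \lambda^{B|A}}{\lambda (1-\lambda^{B|A})\, n} \\[2ex] \dfrac{2(1-\lambda) \lambda^{B|A}}{\lambda (1-\lambda^{B|A})\, n} & \dfrac{2}{n-1} \end{pmatrix}. \]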

Asymptotic Results - simulations
Simulation studies were designed to evaluate the theoretical distributions and were performed for both conventional MI and two-stage MI. For conventional MI, different rates of missing information and numbers of imputations were simulated, and the estimates, variances, and coverage were tested. For two-stage MI, the rates of missing information and the number of imputations at each level were varied, and again the estimates, variances (and covariances), and coverage were tested. The simulated and theoretical results were very close.

Simulation results
Comparison of asymptotic and simulated behavior of the estimated rate of missing information. [Plot of $SE(\hat{\lambda}) \times 100$.]

Number of imputations
Rubin (1987) demonstrated that a limited number of imputations (5-20) is enough. This is true because the relative efficiency, on the variance scale, of a point estimate based on m imputations to one based on an infinite number of imputations is $(1 + \lambda/m)^{-1}$.
[Table of relative efficiency by m and $\lambda$; entries not recoverable.]
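To get a feel for this relative-efficiency formula, the following sketch evaluates it over a small grid. The grid values of m and λ are arbitrary choices for illustration and are not the values from the original slide; the function name is hypothetical.

```python
def relative_efficiency(lam, m):
    """Relative efficiency (variance scale) of m imputations versus infinitely many."""
    if m < 1 or not 0 <= lam <= 1:
        raise ValueError("m must be positive and lam must lie in [0, 1]")
    return 1 / (1 + lam / m)


# Illustrative grid only; these m and lambda values are assumptions.
for m in (3, 5, 10, 20):
    row = [round(relative_efficiency(lam, m), 3) for lam in (0.1, 0.3, 0.5, 0.7, 0.9)]
    print(m, row)
```

Even at high rates of missing information the efficiency climbs quickly with m, which is the argument for a modest number of imputations when only point estimates matter.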

Number of imputations - Cont.
From the standpoint of point estimation alone, there is little to gain from using more than a few imputations unless the rates of missing information are unusually large. When we are interested in the rates of missing information themselves, estimates of these rates can be very unstable (Schafer, 1997). I will determine the number of imputations necessary for obtaining reliable estimates of the rate of missing information.

Number of imputations - Cont.
Table 1: Recommendations for choosing m in conventional MI to estimate $\lambda$ with given accuracy. [Table indexed by $\hat{\lambda}$ and three target values of $SE(\hat{\lambda})$; entries not recoverable.]
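Although the table entries did not survive, the asymptotic standard error $SE(\hat{\lambda}) = \sqrt{2/m}\,\lambda(1-\lambda)$ can be inverted to suggest a number of imputations for a target accuracy. The sketch below does that inversion; the function name and the example targets are assumptions, and the output should not be read as a reproduction of Table 1.

```python
import math


def required_m(lam, target_se):
    """Smallest m with sqrt(2/m) * lam * (1 - lam) <= target_se (hypothetical helper)."""
    if not 0 <= lam <= 1 or target_se <= 0:
        raise ValueError("lam must lie in [0, 1] and target_se must be positive")
    m = 2 * (lam * (1 - lam)) ** 2 / target_se ** 2
    return max(2, math.ceil(m))   # at least two imputations are needed to compute B


print(required_m(0.5, 0.03))
print(required_m(0.3, 0.05))
```

The requirement peaks at λ = 0.5, where λ(1 - λ) is largest, and shrinks toward either extreme.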

Number of imputations - Cont.
For two-stage MI, one can invoke similar arguments for the relative efficiency $(1 + \lambda^{*}/m)^{-1}$, where
\[ \lambda^{*} = \lambda - \frac{\lambda^{B|A} (1 - \lambda)(1 - n^{-1})}{1 - \lambda^{B|A}}. \]
Because $0 \le \lambda^{B|A} \le \lambda$, the relative efficiency lies between $(1 + \lambda/m)^{-1}$ and $(1 + \lambda/(mn))^{-1}$.
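The two bounds can be checked numerically. In this sketch the function names and argument names are hypothetical; the code evaluates the adjusted rate at the two extremes, $\lambda^{B|A} = 0$ and $\lambda^{B|A} = \lambda$.

```python
def lambda_star(lam, lam_b_given_a, n):
    """Adjusted rate entering the two-stage relative efficiency (hypothetical helper)."""
    if not 0 <= lam_b_given_a <= lam < 1 or n < 1:
        raise ValueError("need 0 <= lam_b_given_a <= lam < 1 and n >= 1")
    return lam - lam_b_given_a * (1 - lam) * (1 - 1 / n) / (1 - lam_b_given_a)


def two_stage_efficiency(lam, lam_b_given_a, m, n):
    """Relative efficiency of m nests of n imputations versus infinitely many nests."""
    return 1 / (1 + lambda_star(lam, lam_b_given_a, n) / m)


# Extremes: no second-stage missing information, then all of it second-stage.
print(lambda_star(0.4, 0.0, 4))
print(lambda_star(0.4, 0.4, 4))
print(two_stage_efficiency(0.4, 0.2, 5, 4))
```

At the first extreme the adjusted rate equals λ, and at the second it equals λ/n, matching the stated bounds.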

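Number of imputations - Cont.
Recommendations for choosing m and n in two-stage MI for target standard errors $SE(\hat{\lambda}^{B|A})$ and $SE(\hat{\lambda})$ of 0.03. [Table indexed by $\hat{\lambda}^{B|A}$, n, and $\hat{\lambda}$; entries not recoverable.]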

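Number of imputations - Cont.
Recommendations for choosing m and n in two-stage MI for target standard errors $SE(\hat{\lambda}^{B|A})$ and $SE(\hat{\lambda})$ of 0.02. [Table indexed by $\hat{\lambda}^{B|A}$, n, and $\hat{\lambda}$; entries not recoverable.]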

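Number of imputations - Cont.
Recommendations for choosing m and n in two-stage MI for target standard errors $SE(\hat{\lambda}^{B|A})$ and $SE(\hat{\lambda})$ of 0.01. [Table indexed by $\hat{\lambda}^{B|A}$, n, and $\hat{\lambda}$; entries not recoverable.]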

Discussion
When the researcher's main interest is point estimates (and their variances), a modest number of imputations is sufficient. When the rates of missing information are of interest, more imputations are recommended. Simulations were run to test the accuracy of the asymptotic distribution. In most cases, producing more imputations is not a problem. The rates of missing information are an important model diagnostic tool.

Extensions
There are numerous measures that assess the effect of an observation, a group of observations, a variable, or a combination of variables and observations on regression estimates (outliers, influential points, etc.). Can we assess the effect of a missing observation, a group of missing observations, an incomplete variable, or any combination of these on the overall estimation? We want this measure to be applicable under any missing-data mechanism and to any data analysis problem.

Extensions
I call this measure "Outfluence". The outfluence is a relative measure comparing the effect of a missing value to the effect of the rest of the missing values. For example, consider a survey study in which we are interested in the effect of missing income on the estimation. The outfluence will measure the effect of the missing income compared to all other missing values. We have already defined the measure (Harel, 2008) and introduced its limiting distribution (Harel & Stratton, 2009). More work needs to be devoted to developing methods that evaluate incomplete-data methodology.

Outfluence
Consider the following setup:

Marijuana Example
A pilot study presented by Weil et al. (1968) and analyzed by Schafer (1997). Three treatments (low dose, high dose, and placebo) were given to each subject, and the change in heart rate was recorded. We will look at the mean for placebo at 90 minutes, since Schafer (1997) showed that missing data have a large effect on this estimate.

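Example Data
[Table: one row per subject, with columns Placebo, Low, and High at 15 minutes and Placebo, Low, and High at 90 minutes, five missing entries (NA), and a final row of means; numeric values not recoverable.]
Source: Weil et al. (1968)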

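Example Results
[Table: three estimated quantities, one carrying the superscript $B|A$, for the missing-value configurations (4,+), (4,4), (5,4), (+,4), and (9,+); numeric values not recoverable.]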

Extensions and Future work
Model diagnosis is an important concept in statistics. The rates of missing information can be used as a diagnostic tool (e.g., Harel & Miglioretti, 2007). Many measures have been developed to compare the performance of competing models with complete data. To date there is almost no research on the performance of methodology with incomplete data. How can we use measures such as AIC, BIC, and DIC to evaluate our missing-data modeling?

References
Harel, O. (2007). "Inferences on missing information under multiple imputation and two-stage multiple imputation." Statistical Methodology, 4.
Harel, O., Miglioretti, D. (2007). "Latent class analysis and the rate of missing information." Journal of Data Science, 5.
Harel, O. (2008). "Outfluence - The impact of missing values." Model Assisted Statistics and Applications, 3(2).
Harel, O., Stratton, J. (2009). "Inferences on the Outfluence - How do missing values impact your analysis?" Communications in Statistics - Theory & Methods, in press.
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. J. Wiley & Sons, New York.
Schafer, J.L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall, London.
Shen, Z. (2000). Nested Multiple Imputation. Ph.D. thesis, Harvard University, Department of Statistics.
Weil, A., Zinberg, N., Nelson, J. (1968). "Clinical and psychological effects of marihuana in man." Science, 162.


More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Don t be Fancy. Impute Your Dependent Variables!

Don t be Fancy. Impute Your Dependent Variables! Don t be Fancy. Impute Your Dependent Variables! Kyle M. Lang, Todd D. Little Institute for Measurement, Methodology, Analysis & Policy Texas Tech University Lubbock, TX May 24, 2016 Presented at the 6th

More information

Confidence Distribution

Confidence Distribution Confidence Distribution Xie and Singh (2013): Confidence distribution, the frequentist distribution estimator of a parameter: A Review Céline Cunen, 15/09/2014 Outline of Article Introduction The concept

More information

Discussing Effects of Different MAR-Settings

Discussing Effects of Different MAR-Settings Discussing Effects of Different MAR-Settings Research Seminar, Department of Statistics, LMU Munich Munich, 11.07.2014 Matthias Speidel Jörg Drechsler Joseph Sakshaug Outline What we basically want to

More information

Statistical Methods for Handling Missing Data

Statistical Methods for Handling Missing Data Statistical Methods for Handling Missing Data Jae-Kwang Kim Department of Statistics, Iowa State University July 5th, 2014 Outline Textbook : Statistical Methods for handling incomplete data by Kim and

More information

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007

Measurement Error and Linear Regression of Astronomical Data. Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Measurement Error and Linear Regression of Astronomical Data Brandon Kelly Penn State Summer School in Astrostatistics, June 2007 Classical Regression Model Collect n data points, denote i th pair as (η

More information

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables /4/04 Structural Equation Modeling and Confirmatory Factor Analysis Advanced Statistics for Researchers Session 3 Dr. Chris Rakes Website: http://csrakes.yolasite.com Email: Rakes@umbc.edu Twitter: @RakesChris

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information
