Comparing Change Scores with Lagged Dependent Variables in Models of the Effects of Parents' Actions to Modify Children's Problem Behavior


David R. Johnson, Department of Sociology, and Haskell Sie, Department of Educational Psychology & Department of Statistics
The Pennsylvania State University
May 22, 2013

The Problem
Non-experimental two-wave panel (prospective) data are used to estimate the effects of parental behavior control strategies on the trajectory of child problem behavior.
Two approaches have been used:
The Lagged Dependent Variable (LDV) method, also referred to as the residualized gain score method.
The Change Score (CS) method, also referred to as the simple gain score method.
In empirical studies, these often give different and sometimes contradictory findings (Larzelere et al., 2010).
The purpose of this study is to examine some of the assumptions of these models and test them in simulated data where the true effect is known.

This is an extension of a comparison of two-wave LDV and CS approaches in family research (Johnson, 2005).
The focus here is on the two-wave case, although extending both approaches to multiple waves is straightforward and will be briefly discussed.

The Substantive Model
How do parents' practices of controlling rule violation and problem behavior by their child affect change in the child's observed problem behavior over time?
Assume an observational or non-randomized experimental study rather than a randomized experiment.
For example, a measure of child problem behavior is assessed at two points in time, and parental reports on their child control practices are assessed either at time 1, between times 1 and 2, or at time 2.

Two-wave panel data: Assumptions
Measurements are obtained at two time points.
Y1: the child's problem behavior score at time 1.
Y2: the child's problem behavior score at time 2.
Both Y1 and Y2 are influenced by the same stable (non-time-varying) variable Y (the child's disposition for problem behaviors).
X is a parental intervention or control method: zero at time 1 and one at time 2 if the intervention occurs somewhere between the two time points.
X can also be specified as a stable measure of parental control behavior, consistent across time periods, measured as a continuous or a binary variable.
X can also be specified as a time-varying measure of parental control behavior that is measured at time 1 and time 2.
S is a time-invariant variable measuring stable traits and background influences of both the parent and the child.

LDV and CS models
Lagged Dependent Variable (LDV) method, also known as the residualized gain score or analysis of covariance (ANCOVA) method. Without any time-varying independent variables, the model is given by
$Y_2 = \alpha + \beta X + \gamma S + \delta Y_1 + \varepsilon$
Change Score (CS) method, also commonly referred to as the simple gain score method. From
$Y_2 = \alpha_2 + \beta X + \gamma S + \varepsilon_2$ and $Y_1 = \alpha_1 + \gamma S + \varepsilon_1$
we have
$Y_2 - Y_1 = \alpha + \beta X + \varepsilon$, where $\alpha = \alpha_2 - \alpha_1$ and $\varepsilon = \varepsilon_2 - \varepsilon_1$.
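
The two specifications above can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes NumPy, a randomly assigned binary X, and hypothetical coefficients of 0.5 for S and for the stable disposition (called `disposition` here, Y in the slides). Only the true effect of -0.5 for X is taken from the simulation design described later.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    design = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def fit_ldv(y1, y2, x, s):
    """LDV model: Y2 = a + b*X + g*S + d*Y1 + e. Returns the estimate of b."""
    return ols(y2, x, s, y1)[1]

def fit_cs(y1, y2, x):
    """CS model: Y2 - Y1 = a + b*X + e. Returns the estimate of b."""
    return ols(y2 - y1, x)[1]

# Hypothetical data: X is random, true effect of X on Y2 is -0.5.
rng = np.random.default_rng(0)
n = 2000
s = rng.standard_normal(n)
disposition = rng.standard_normal(n)  # stable propensity (Y in the slides)
x = (rng.standard_normal(n) > 0.67).astype(float)
y1 = 0.5 * s + 0.5 * disposition + rng.standard_normal(n)
y2 = -0.5 * x + 0.5 * s + 0.5 * disposition + rng.standard_normal(n)

b_ldv = fit_ldv(y1, y2, x, s)
b_cs = fit_cs(y1, y2, x)
print(f"LDV estimate of beta: {b_ldv:.3f}")
print(f"CS  estimate of beta: {b_cs:.3f}")
```

Under these assumptions X is unrelated to Y1, S and the disposition, so both estimates should land near -0.5; the two methods diverge only once X depends on something else in the model.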

Alternative Specifications of the CS model
The CS model can also be parameterized as a fixed-effects pooled time-series model (Allison, 2009).
Each wave is a separate record in the dataset, linked by an ID (long rather than wide format).
$Y_t = \beta X_t + \lambda T + \alpha + \varepsilon$, where t is the wave, X is a time-varying treatment, T is 0 for wave 1 and 1 for wave 2, α is a fixed effect for each individual in the dataset, and ε is a within-individual error term.
Only time-varying variables can be included in the model.
All variables are expressed as deviations from each individual's mean across waves.

Alternative Specifications of the CS model
When X is an intervention between wave 1 and wave 2, X is time-varying.
If X is a measure of parenting practice assessed at both time periods, it is also time-varying.
If X is a stable measure of parental practice measured only at time 1 (or earlier), it is not time-varying. However, it can still be included in the fixed-effects model when interacted with time T.
Estimates are the same as those from the CS model in two-wave data.
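
The equivalence between the CS model and the fixed-effects specification in two waves can be checked numerically. This is a sketch under assumed values (NumPy, 500 hypothetical individuals, a time effect of 0.3 and a treatment effect of -0.5); the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
alpha = rng.standard_normal(n)                      # individual fixed effects
x1 = np.zeros(n)                                    # nobody treated at wave 1
x2 = (rng.standard_normal(n) > 0.67).astype(float)  # treated between waves
y1 = alpha + rng.standard_normal(n)
y2 = alpha + 0.3 - 0.5 * x2 + rng.standard_normal(n)

# Change-score estimate: regress Y2 - Y1 on X2 - X1 with an intercept.
design_cs = np.column_stack([np.ones(n), x2 - x1])
b_cs = np.linalg.lstsq(design_cs, y2 - y1, rcond=None)[0][1]

# Fixed-effects estimate: stack the waves (long format) and express every
# variable as a deviation from the individual's mean across waves.
ids = np.concatenate([np.arange(n), np.arange(n)])
y_long = np.concatenate([y1, y2])
x_long = np.concatenate([x1, x2])
t_long = np.concatenate([np.zeros(n), np.ones(n)])

def demean(values, ids):
    """Subtract each individual's mean across waves."""
    means = np.bincount(ids, weights=values) / np.bincount(ids)
    return values - means[ids]

design_fe = np.column_stack([demean(x_long, ids), demean(t_long, ids)])
b_fe = np.linalg.lstsq(design_fe, demean(y_long, ids), rcond=None)[0][0]

print(f"CS estimate: {b_cs:.6f}")
print(f"FE estimate: {b_fe:.6f}")
```

The demeaned wave indicator plays the role of the intercept in the change-score regression, which is why the two coefficients on X coincide up to floating-point error.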

LDV versus CS
Both control for Y1 while examining the relationship between X and Y2 in an attempt to estimate the causal effect of X on Y (Allison, 1990); this is important if X is not randomly assigned.
The LDV method treats Y2 as the dependent variable, regressed on X, S and Y1 as independent variables.
The CS method treats Y2 - Y1 as the dependent variable, regressed only on X.
The LDV method has traditionally been considered superior to the CS method because:
Y2 - Y1 tends to have a lower reliability than Y1 or Y2 itself, and
a correlation between X and Y1 will lead to a spurious negative relationship between X and Y2 - Y1 due to regression to the mean (Allison, 1990).
The CS method is more advantageous in terms of:
controlling for all time-invariant variables, whether measured or unmeasured, as long as their effects are invariant over time, and
allowing Y1 to have measurement error.

LDV versus CS
Analyses of the same data using the two methods often lead to different conclusions.
The two methods test different null hypotheses.
The LDV model is preferable to the CS model if Y1 has a causal effect on Y2 or if X is influenced by period-specific components of Y1 (Allison, 1990).
The LDV method is inappropriate when X (e.g., parental control) is not randomized and large differences in Y1 scores exist between groups with different values of X.
It will also be biased when measurement error is present in Y1.

Simulation Study: Method
Measurement errors: no measurement error, or measurement errors on all four variables Y1, Y2, X, S.
Time effect on the child's problem behavior at time 2: no effect, or a moderate effect of 0.3.
Parental intervention effect on the child's problem behavior: a positive effect of 0.5, a negative effect of -0.5, or no effect.
Causal effect of the child's behavior at time 1 on his/her behavior at time 2: no effect, or a positive effect of 0.4.
Factors affecting the parental intervention: no factor affecting X, only S, only Y, only Y1, both S and Y, or both S and Y1.
Models compared: LDV model without control of S, LDV model with control of S, CS model.

Simulation Study: Method
To generate the variable X that represents parental intervention, a continuous variable $X_c$ was first generated. Then X = 1 if $X_c > 0.67$ and X = 0 otherwise.
In models with measurement errors, the variables $Y_1$, $Y_2$, $X_c$, $S$ are replaced by $y_1 = Y_1 + 0.5 r_1$, $y_2 = Y_2 + 0.5 r_2$, $x_c = X_c + 0.5 r_3$, and $s = S + 0.7 r_4$, respectively, where $r_1$, $r_2$, $r_3$, and $r_4$ are standard normal random variates.
100 replications were generated, with 2,000 subjects in each replication.
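
The generation steps above can be sketched as a single function. This is an illustration under stated assumptions rather than the authors' program: the slides do not report the path coefficients from S and Y to Y1 and Y2 or the disturbance variances, so the `loading` of 0.5 and the unit-variance disturbances are hypothetical, as is the choice to dichotomize the error-laden $x_c$ to obtain the observed x. The 0.67 threshold, the 0.5 and 0.7 error weights, and the sample size of 2,000 come from the slide.

```python
import numpy as np

def generate_replication(rng, n=2000, beta_x=-0.5, time_effect=0.0,
                         y1_to_y2=0.0, s_to_x=0.0, y_to_x=0.0, y1_to_x=0.0,
                         measurement_error=False, loading=0.5):
    """Generate one simulated two-wave data set as a dict of arrays.

    `loading` and the unit-variance disturbances are hypothetical values;
    the slides do not report them.
    """
    s = rng.standard_normal(n)   # S: stable background factors
    y = rng.standard_normal(n)   # Y: stable propensity for problem behavior
    y1 = loading * s + loading * y + rng.standard_normal(n)
    # Continuous X_c, optionally driven by S, Y or Y1, then dichotomized.
    xc = s_to_x * s + y_to_x * y + y1_to_x * y1 + rng.standard_normal(n)
    x = (xc > 0.67).astype(float)
    y2 = (time_effect + beta_x * x + loading * s + loading * y
          + y1_to_y2 * y1 + rng.standard_normal(n))
    if measurement_error:
        r1, r2, r3, r4 = rng.standard_normal((4, n))
        y1_obs = y1 + 0.5 * r1
        y2_obs = y2 + 0.5 * r2
        # Assumption: the observed x dichotomizes the error-laden x_c.
        x_obs = ((xc + 0.5 * r3) > 0.67).astype(float)
        s_obs = s + 0.7 * r4
    else:
        y1_obs, y2_obs, x_obs, s_obs = y1, y2, x, s
    return {"y1": y1_obs, "y2": y2_obs, "x": x_obs, "s": s_obs}

# One replication in which Y1 affects X, with measurement error.
rng = np.random.default_rng(2013)
data = generate_replication(rng, y1_to_x=1.0, measurement_error=True)
print({name: values.shape for name, values in data.items()})
print("share with X = 1:", data["x"].mean())
```

Calling the function 100 times with the same generator would mirror the replication count in the slide; the keyword arguments switch the individual causal paths on or off.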

Legend for the path diagrams of all models: S = stable personality and background factors; Y = stable propensity for problem behavior; X = parental control; T = developmental (time) effect; Y1, Y2 = problem behavior in waves 1 and 2. The diagrams also show the observed measures s, y1, y2 and x.

Model 1: X is random [path diagram]

Model 2: Causal effect from S to X [path diagram]

Model 3: Causal effect from Y to X [path diagram]

Model 4: Causal effect from Y1 to X [path diagram]

Model 5: Causal effects from both S and Y to X [path diagram]

Model 6: Causal effects from S and Y1 to X [path diagram]

Models 7-12: Causal effect from Y1 to Y2 added to Models 1-6 [path diagram]

Simulation Study: Assessment Criteria
In each model, the regression coefficient estimate representing the effect of parental intervention X was evaluated against its true effect on Y2 using
$\mathrm{Bias}(\hat{\beta}) = \frac{1}{100} \sum_{r=1}^{100} (\hat{\beta}_r - \beta)$ and $\mathrm{MSE}(\hat{\beta}) = \frac{1}{100} \sum_{r=1}^{100} (\hat{\beta}_r - \beta)^2$,
where $\hat{\beta}_r$ is the estimate from replication r and $\beta$ is the true effect.
Also compared was the proportion of models in which the regression coefficient estimate was significant in the direction of the true effect.
The focus here is only on the bias. MSE and the proportion significant are available on the handout for all models.
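
The two criteria are simple averages over replications and can be written directly. The sketch below uses plain Python; the four example estimates are made up for illustration and are not results from the study.

```python
def bias(estimates, true_beta):
    """Mean deviation of the replication estimates from the true effect."""
    return sum(b - true_beta for b in estimates) / len(estimates)

def mse(estimates, true_beta):
    """Mean squared deviation of the replication estimates from the true effect."""
    return sum((b - true_beta) ** 2 for b in estimates) / len(estimates)

# Hypothetical estimates from four replications with a true effect of -0.5.
example_estimates = [-0.45, -0.55, -0.60, -0.40]
print("bias:", bias(example_estimates, -0.5))
print("MSE: ", mse(example_estimates, -0.5))
```

In the study the average runs over 100 replications; the functions take the number of replications from the length of the list.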

Descriptive Information
The only pieces of information available to the estimation models without measurement error are Y1, Y2, X and S. When measurement error was assumed, y1, y2, x and s were available instead.
The means of the Y variables at each wave when X = 0 and X = 1 serve as an indicator of the pattern generated by the simulation.
Only situations where the effect of T is 0 are presented, because the estimate of the effect of X does not vary across values of T.

[Figure: Means of Y at Waves 1 and 2 for Model 1 (X is random), plotted separately for X = 0 and X = 1.]

[Figure: Comparison of means of Y at Waves 1 and 2 for Model 3 (Y -> X) and Model 4 (Y1 -> X), plotted separately for X = 0 and X = 1.]

[Figure: Models 3 (Y -> X) and 4 (Y1 -> X) with measurement error, means at Waves 1 and 2 plotted separately for X = 0 and X = 1.]

Summary of the Simulation Results
Full simulation results are available on the handout.
Results are presented for an effect of X of -.5, which assumes that parental control reduces problem behavior.
Models with no measurement error and with measurement error in all variables are presented.
Some models where X is assumed to have no effect (effect of X = 0) are also examined, to assess whether either model is likely to yield an effect when none is present.

[Figure: Bias in the estimated effect of X (true effect -.5) when no measurement error is present, CS versus LDV, for Models M1 to M12.]
Models shown: M1: X random; M2: S -> X; M3: Y -> X; M4: Y1 -> X; M5: Y, S -> X; M6: Y1, S -> X; M7: X random, Y1 -> Y2; M8: S -> X, Y1 -> Y2; M9: Y -> X, Y1 -> Y2; M10: Y1 -> X, Y1 -> Y2; M11: Y, S -> X, Y1 -> Y2; M12: Y1, S -> X, Y1 -> Y2.

[Figure: Bias in the effect of X with measurement error when the true effect of X is negative (-.5), CS versus LDV, for the same twelve models.]

[Figure: Bias in the effect of X when the true effect of X is 0, with measurement error in Y1, Y2, S and X, CS versus LDV, for the same twelve models.]

Major findings
Whether Y or Y1 has a causal effect on X makes a substantial difference in the estimates of the CS and LDV models.
When Y1 has a causal effect on X, CS overestimates the negative effect and LDV is unbiased.
When Y has a causal effect on X, LDV underestimates the effect and CS is unbiased.
When measurement error is present, the CS method outperforms the LDV method overall.
When Y1 has a causal effect on Y2, the LDV method is less biased except when measurement error is present.
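
The first two findings can be illustrated with a small simulation. This is a sketch, not a reproduction of the study: it assumes NumPy, no measurement error, 20 replications instead of 100, and hypothetical path coefficients (0.5 from S and Y to both outcomes, 1.0 from Y or Y1 to the continuous X). Only the direction of the bias is meant to mirror the findings above; the magnitudes depend on these assumed values.

```python
import numpy as np

def slope_on_x(outcome, x, *controls):
    """OLS coefficient on x, with an intercept and optional controls."""
    design = np.column_stack([np.ones(len(outcome)), x, *controls])
    return np.linalg.lstsq(design, outcome, rcond=None)[0][1]

def mean_bias(y_to_x, y1_to_x, reps=20, n=2000, beta=-0.5, seed=0):
    """Average bias of the CS and LDV estimates of beta over replications."""
    rng = np.random.default_rng(seed)
    cs_bias, ldv_bias = [], []
    for _ in range(reps):
        s = rng.standard_normal(n)
        y = rng.standard_normal(n)                   # stable propensity Y
        y1 = 0.5 * s + 0.5 * y + rng.standard_normal(n)
        xc = y_to_x * y + y1_to_x * y1 + rng.standard_normal(n)
        x = (xc > 0.67).astype(float)
        y2 = beta * x + 0.5 * s + 0.5 * y + rng.standard_normal(n)
        cs_bias.append(slope_on_x(y2 - y1, x) - beta)     # change score
        ldv_bias.append(slope_on_x(y2, x, s, y1) - beta)  # LDV with S
    return float(np.mean(cs_bias)), float(np.mean(ldv_bias))

cs_m3, ldv_m3 = mean_bias(y_to_x=1.0, y1_to_x=0.0)  # Y -> X, as in Model 3
cs_m4, ldv_m4 = mean_bias(y_to_x=0.0, y1_to_x=1.0)  # Y1 -> X, as in Model 4
print(f"Y  -> X: CS bias {cs_m3:+.3f}, LDV bias {ldv_m3:+.3f}")
print(f"Y1 -> X: CS bias {cs_m4:+.3f}, LDV bias {ldv_m4:+.3f}")
```

With Y -> X, the LDV estimate is pulled toward zero because Y1 is only an imperfect proxy for Y. With Y1 -> X, the change score picks up the wave-1 disturbance that selected children into X, making the estimate more negative than the true effect.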

Implications for choice of an analysis model
How and when the parental control method is measured is critical for the choice of method.
If it is measured shortly after an episode of child problem behavior, there is likely to be a causal effect of that episode on the parental report, favoring the LDV method.
If parental control methods tend to be stable, reflecting in general the child's propensity for problem behavior, then the CS method is likely to be preferred.
If measurement error is expected to be present, then CS would be preferred, or a method such as multiple indicators in SEM needs to be used to adjust for measurement error.
Computing both models is not of much help in deciding which is more appropriate.
Even if there is no effect, both models would, under certain conditions, show significant effects of parental control.

Alternative Approaches?
The economists Heckman and Robb (1985) examined models for a similar situation: the effect of training programs on earnings.
They concluded that panel models involve many assumptions that are likely to be violated, yielding biased estimates.
Methods based on repeated cross-sections also involve assumptions, but these were more reasonable and had a less serious impact on the estimates.
Two waves are likely not enough to deal with the complexities.
LDV models can readily extend to multiple waves.
CS within the fixed-effects framework also permits multiple waves.
Additional waves and multiple indicators of the main variables allow more degrees of freedom to identify more of the model assumptions.
Recent work has integrated the fixed-effects approach with SEM models, including latent growth curve models (Bollen & Brand, 2010; Teachman, 2012).

Thank You Contact Information: David R. Johnson drj10@psu.edu Haskell Sie hxs265@psu.edu

References
Allison, P. D. (1990). Change scores as dependent variables in regression analysis. Sociological Methodology, 20, 93-114.
Allison, P. D. (2009). Fixed effects regression models. Los Angeles: Sage.
Bollen, K. A., & Brand, J. E. (2010). A general panel model with random and fixed effects: A structural equations approach. Social Forces, 89, 1-34.
Cribbie, R. A., & Jamieson, J. (2004). Decreases in posttest variance and the measurement of change. Methods of Psychological Research Online, 9, 37-55.
Heckman, J. J., & Robb, R. (1985). Alternative methods for evaluating the impact of interventions: An overview. Journal of Econometrics, 30, 239-267.
Johnson, D. R. (1995). Alternative methods for the quantitative analysis of panel data in family research: Pooled time-series models. Journal of Marriage and the Family, 57, 1065-1077.
Johnson, D. R. (2005). Two-wave panel analysis: Comparing statistical methods for studying the effects of transitions. Journal of Marriage and Family, 67, 1061-1075.
Larzelere, R. E., Ferrer, E., Kuhn, B. R., & Danelia, K. (2010). Differences in causal estimates from longitudinal analyses of residualized versus simple gain scores: Contrasting controls for selection and regression artifacts. International Journal of Behavioral Development, 34, 180-189.
Teachman, J. (2012). Latent growth curve models with random and fixed effects. Paper presented at the 20th Annual Symposium on Family Issues, Pennsylvania State University.