Using Mixture Latent Markov Models for Analyzing Change in Longitudinal Data with the New Latent GOLD 5.0 GUI


Jay Magidson, Ph.D.
President, Statistical Innovations Inc., Belmont, MA, U.S.
statisticalinnovations.com
Presented at Modern Modeling Methods (M3) 2013, University of Connecticut

Abstract
The latent Markov GUI in version 5.0 of Latent GOLD was designed to make it easy to estimate an extended class of latent Markov models. The simplicity of modeling with a single dichotomous indicator (response) and 5 time points is maintained with even hundreds of time points, multiple indicators, indicators of differing scale types (nominal, ordinal, count, continuous), observation sequences of different lengths, covariates, and mixture components, including mover-stayer structures. New longitudinal bivariate residuals (L-BVRs) are available to diagnose whether the model is picking up the most important aspects of the data, in addition to standard tools such as chi-squared tests (with bootstrap) and AIC/BIC. Informative graphical displays are provided, and parameter estimation is very fast. The final goal is to obtain useful and correct answers to the research questions of interest.

Classification of Latent Class Models for Longitudinal Research
Latent Markov (LM) models are cluster models for longitudinal data in which persons can switch between clusters. In the corresponding growth models, persons stay in the same cluster. Clusters in LM models are called latent states, while clusters in growth models are called latent classes. In LM models, transition probability parameters spell out how switching between states occurs from time t to t+1.

Model name                     Transition   Unobserved      Measurement   Example
                               structure    heterogeneity   error
Mixture latent Markov (MLM)    yes          yes             yes           #2
Mixture Markov                 yes          yes             no
Latent (Hidden) Markov (LM)    yes          no              yes           #1
Mixture latent growth          no           yes             yes
Mixture growth                 no           yes             no
Standard latent class          no           no              yes

Source: Vermunt, Tran and Magidson (2008), Latent class models in longitudinal research, chapter 23 in Handbook of Longitudinal Research, S. Menard (Ed.), Academic Press.

Graphs for LM and MLM Models
X denotes a categorical latent variable with S categories (latent states).
Person i can be in a different state s = 1, 2, ..., S at different times t = 0, 1, 2, 3, 4.
X_t = latent state variable at time t.
LM model: 3 sets of probability parameters:
- Initial state probs (b_0): P(X_0 = s)
- Transition probs (b_t), t = 1, ..., 4
- Measurement errors (a), with measurement equivalence
The particular sequence of latent states (s_0, s_1, s_2, s_3, s_4) defines a change pattern for person i.
Extension to the mixture LM (MLM) model: the MLM model implies different change patterns for each latent class k = 1, 2, ..., K.
Example for K = 2 classes, where class 2 is a Stayer class: (1,1,1,1,1) or (2,2,2,2,2), i.e., no change.
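To make the idea of a change pattern concrete, the following minimal sketch computes the probability of each latent state sequence from an initial-state vector and a transition matrix. The parameter values are hypothetical illustrations (not estimates from this talk), transitions are assumed time-homogeneous for brevity, and states are indexed from 0 rather than 1.

```python
from itertools import product

# Hypothetical parameters for S = 2 latent states (illustrative only).
initial = [0.6, 0.4]            # b_0: P(X_0 = s)
transition = [[0.9, 0.1],       # P(X_t = r | X_{t-1} = s); rows are origin states s
              [0.2, 0.8]]


def pattern_probability(states, initial, transition):
    """Probability of one change pattern (s_0, ..., s_T) under a first-order Markov chain."""
    prob = initial[states[0]]
    for prev, curr in zip(states, states[1:]):
        prob *= transition[prev][curr]
    return prob


# Enumerate all 2^5 = 32 change patterns over t = 0..4.
patterns = list(product(range(2), repeat=5))
probs = {p: pattern_probability(p, initial, transition) for p in patterns}

# Probability mass on the two "no change" patterns.
no_change = probs[(0, 0, 0, 0, 0)] + probs[(1, 1, 1, 1, 1)]
print(f"P(no change) = {no_change:.4f}")
```

In a mover-stayer MLM model, the Stayer class would place all of its mass on these two constant patterns.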

Graphs for LM and MLM Models (continued)
Extension to multiple indicators is immediate!
[Path diagram: latent states X_0 -> X_1 -> X_2 -> X_3 -> X_4, with initial-state parameters b_0 and transition parameters b_1, ..., b_4. At each time t, X_t is measured by two indicators, Y_t and V_t, through measurement parameters a_Y and a_V.]

Latent GOLD Lets Users Customize Logit Models*
Here β, γ, and α denote the logit coefficients of the three sub-models; category 1 is the reference category.

Initial state probs, P(X_0 = s):
$$\log \frac{P(x_0 = s)}{P(x_0 = 1)} = \beta_{0s}$$
The logit model may include covariates Z (e.g., AGE, SEX).

Transition probs, P(X_t = r | X_{t-1} = s):
$$\log \frac{P(x_t = r \mid x_{t-1} = s)}{P(x_t = 1 \mid x_{t-1} = s)} = \gamma_{0r} + \gamma_{1rs} + \gamma_{2rt} + \gamma_{3rst}$$
The logit model may include time-varying and fixed predictors.

Measurement model probs, P(y_t = j | X_t = s):
One or more indicator (dependent) variables Y, of possibly different scale types (e.g., continuous, count, dichotomous, ordinal, nominal). For introductory purposes, the examples here are limited to analyses of a single categorical indicator:
$$\log \frac{P(y_{it} = j \mid x_t = s)}{P(y_{it} = 1 \mid x_t = s)} = \alpha_{0} + \alpha_{1s}$$

* Equations can be customized using the Latent GOLD 5.0 GUI and/or syntax.
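As a rough illustration of how such logit equations map to probabilities, the sketch below converts logits defined relative to a reference category into probabilities. The coefficient values and the helper name `logits_to_probs` are hypothetical; time effects are omitted, so the resulting transitions are time-homogeneous. This is not Latent GOLD's internal parameterization, only the generic multinomial-logit arithmetic.

```python
import math


def logits_to_probs(logits):
    """Turn logits relative to a reference category (whose logit is 0) into probabilities."""
    peak = max(logits)                       # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical coefficients for S = 2 states; index 0 plays the role of reference state 1.
initial_logits = [0.0, -0.4]                 # log P(x_0 = s) / P(x_0 = 1)
initial = logits_to_probs(initial_logits)

intercept = [0.0, -2.0]                      # effect of destination state r
origin_effect = [[0.0, 0.0],                 # origin_effect[r][s]; zero for reference categories
                 [0.0, 3.5]]

# transition[s][r] = P(x_t = r | x_{t-1} = s)
transition = [
    logits_to_probs([intercept[r] + origin_effect[r][s] for r in range(2)])
    for s in range(2)
]

print(initial)
print(transition)
```

Adding a covariate or time term would simply add another summand to the logit before the conversion.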

Examples
Example 1: LM model with time-homogeneous transition probabilities (loyalty data)
Example 2: MLM model (satisfaction data)
Example 3: MLM model with covariates (sparse panel data)
These introductory examples are limited to 5 and 23 time points, but even data with hundreds of time points are very easy to analyze with Latent GOLD.

Example 1: Loyalty Data in Long File Format
N = 631 respondents
T = 5 time points
Dichotomous Y: Choose Brand A? 1 = Yes, 0 = No
Y = (Y_1, Y_2, Y_3, Y_4, Y_5)
2^5 = 32 response patterns, id = 1, 2, ..., 32
freq is used as a case weight
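A long file stores one record per pattern per time point. The sketch below shows one way such a file could be built from a table of response patterns. The column names `id`, `time`, `y`, and `freq` follow the slide where it names them (`time` and `y` are assumed names), and the frequencies are placeholders of 1, since the actual counts are not reproduced in this transcript.

```python
from itertools import product


def to_long_format(pattern_freqs):
    """Expand (pattern, freq) pairs into one record per pattern per time point."""
    rows = []
    for case_id, (pattern, freq) in enumerate(pattern_freqs, start=1):
        for t, y in enumerate(pattern, start=1):
            rows.append({"id": case_id, "time": t, "y": y, "freq": freq})
    return rows


# All 2^5 = 32 response patterns for a dichotomous indicator at 5 time points.
patterns = list(product([0, 1], repeat=5))

# Placeholder weights: the real file would carry the observed frequency of each pattern.
pattern_freqs = [(p, 1) for p in patterns]

long_rows = to_long_format(pattern_freqs)
print(len(long_rows), "records;", long_rows[0])
```

Each pattern contributes 5 records, so the long file has 32 x 5 = 160 rows, with `freq` repeated on every record of a pattern.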

Loyalty Model with Time-homogeneous Transitions
Transition probabilities are equal over time.
[Output panels: Initial State Probability (b_0); Transition Probabilities (b); Measurement Model Probabilities (a)]

Example 1: Easy to Generate Future Predictions
The model predicts that market share for Brand A continues to increase.
Forecasts are computed directly from the model parameter estimates.
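With time-homogeneous transitions, a forecast amounts to pushing the latent state distribution forward one step at a time and then applying the measurement probabilities. The sketch below illustrates that calculation. All parameter values are hypothetical (the estimates shown on the slides are not reproduced here), and the function names are illustrative.

```python
def step(state_probs, transition):
    """Advance the latent state distribution by one time point."""
    n = len(state_probs)
    return [sum(state_probs[s] * transition[s][r] for s in range(n)) for r in range(n)]


def predicted_share(state_probs, p_yes_given_state):
    """Predicted P(Y = 1), averaging the measurement probabilities over latent states."""
    return sum(p * q for p, q in zip(state_probs, p_yes_given_state))


# Hypothetical parameters for a 2-state model (illustrative only).
initial = [0.7, 0.3]                 # P(X_0 = s)
transition = [[0.90, 0.10],          # P(X_t = r | X_{t-1} = s), constant over time
              [0.05, 0.95]]
p_yes = [0.2, 0.8]                   # P(Y = 1 | X = s)

shares = []
state = initial
for t in range(8):                   # 5 observed time points plus 3 forecast periods
    shares.append(predicted_share(state, p_yes))
    state = step(state, transition)

print([round(s, 4) for s in shares])
```

With these illustrative values the predicted share rises each period, because the state with the higher choice probability gains members at every step.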

Example 1: LM Model Fits Better than Latent Growth Model
The 2-state time-homogeneous latent Markov model fits well (p = .77), with small L-BVRs.
The 2-class latent growth model ("2-class Regression") is rejected (p = .0084).

L-BVR    Null      2-class Regression    LM
Time     0.0       0.00                  0.0785
Lag1     479.8*    8.74* (p = .003)      0.0115
Lag2     259.2*    0.07                  0.3992

The Lag1 BVR pinpoints the problem in the LC growth model as a failure to explain 1st-order autocorrelation.

Example 2: Life Satisfaction
N = 5,147 respondents
T = 5 time points
Dichotomous Y: Satisfied with life? 1 = No, 2 = Yes
Y = (Y_1, Y_2, Y_3, Y_4, Y_5)
2^5 = 32 response patterns, id = 1, 2, ..., 32
weight is used as a case weight
Models:
- Null
- Time-heterogeneous LM
- 2-class mixture LM
- Restricted (Mover-Stayer)

Example 2: Model Parameters, Time-heterogeneous Model
Estimated values for the 2-state LM model.
[Output panels: Initial State Probability (b_0); Transition Probabilities (b_t); Measurement Model Probabilities (a)]

Ex. 2: Mover-Stayer Time-heterogeneous Latent Markov Model
Both the unrestricted and the Mover-Stayer 2-class MLM models fit well (p = .95 and .71), with the BIC statistic preferring the Mover-Stayer model.
Again, the comparable LC growth model fails to explain 1st-order autocorrelation.

L-BVR    2-class Regression     2-class LM    2-class LM Mover-Stayer
Time     0.0                    0.02          0.03
Lag1     55.1 (p = 1.1e-13)     0.00          0.00
Lag2     1.7                    0.10          1.97

2-class MLM Model Suggests Mover-Stayer Class Structure
Estimated Values output for the 2-state time-heterogeneous MLM model with 2 classes.
[Output panels: Class Size; Initial State; Transition Probabilities; Measurement model]

2-class MLM Model with Mover-Stayer Structure for Classes
The Estimated Values output shows that 52.25% of respondents are in the Stayer class, who tend to be mostly Satisfied with their lives throughout this 5-year period: 67.85% are in state 1 (the 'Satisfied' state) initially and remain in that state. In contrast, among respondents whose life satisfaction changed during this 5-year period (the Mover class), fewer (54.82%) were in the Satisfied state during the initial year.
[Output panels: Class Size; Initial State; Transition Probabilities (note that the class 1 probability of staying in the same state has been restricted to 1); Measurement model]
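The figures quoted above can be combined into an overall initial-state probability, and the Stayer restriction can be written as an identity transition matrix. The sketch below does both. The class size and initial-state percentages are taken from the text; the Mover class size is derived as the complement (the model has 2 classes), and the variable names are illustrative.

```python
# Values quoted in the text above.
stayer_size = 0.5225                      # P(class = Stayer)
mover_size = 1.0 - stayer_size            # two classes, so the Mover share is the complement
p_satisfied_initial = {"stayer": 0.6785, "mover": 0.5482}   # P(X_0 = Satisfied state | class)

# Mover-Stayer restriction: a Stayer remains in the current state with probability 1.
stayer_transition = [[1.0, 0.0],
                     [0.0, 1.0]]

# Overall share in the Satisfied latent state at the initial time point.
overall_initial = (stayer_size * p_satisfied_initial["stayer"]
                   + mover_size * p_satisfied_initial["mover"])

# Under the identity matrix, the Stayer state distribution never changes.
state = [p_satisfied_initial["stayer"], 1.0 - p_satisfied_initial["stayer"]]
for _ in range(4):                        # four transitions across five time points
    state = [sum(state[s] * stayer_transition[s][r] for s in range(2)) for r in range(2)]

print(f"Overall P(Satisfied state at t=0) = {overall_initial:.4f}")
print("Stayer state distribution after 4 transitions:", state)
```

Note that these are latent state probabilities; the observed proportion answering "satisfied" also depends on the measurement model probabilities.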

Example 2: Longitudinal Profile Plot for the Mover-Stayer LM Model
[Plot: Stayer class showing 61.75% satisfied each year; Mover class showing changes over time]

Longitudinal Plot with Overall Predicted Probability Appended

Example 3: Latent GOLD Longitudinal Analysis of Sparse Data
N = 1,725 pupils who were of age 11-17 at the initial measurement occasion (in 1976).
The survey was conducted annually from 1976 to 1980 and at three-year intervals after 1980.
There are 23 time points (T+1 = 23), where t = 0 corresponds to age 11 and the last time point to age 33.
For each subject, data are observed for at most 9 time points (the average is 7.93), which means that responses for the other time points are treated as missing. (See Figure 2)
The dichotomous dependent variable drugs indicates whether the respondent used hard drugs during the past year (1 = yes; 0 = no).
Time-varying predictors are time (t) and time_2 (t^2); time-constant predictors are male and ethn4 (ethnicity).
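The sparse structure can be sketched by placing each subject's observed waves on the common age-based time axis. The wave years after 1980 (1983, 1986, 1989, 1992) are an assumption inferred from the three-year spacing and the maximum age of 33; the function names and the example responses are hypothetical.

```python
FIRST_AGE, LAST_AGE = 11, 33          # t = 0 at age 11, t = 22 at age 33

# Assumed wave years: annual 1976-1980, then every three years (inferred, not stated in full).
WAVE_YEARS = [1976, 1977, 1978, 1979, 1980, 1983, 1986, 1989, 1992]


def observed_rows(age_in_1976):
    """Time-varying predictors for the waves at which a subject can be observed."""
    rows = []
    for year in WAVE_YEARS:
        age = age_in_1976 + (year - WAVE_YEARS[0])
        t = age - FIRST_AGE
        rows.append({"age": age, "time": t, "time_2": t * t})
    return rows


def full_sequence(age_in_1976, responses):
    """Place responses (dict: age -> 0/1) on the 23-point grid; None marks missing."""
    seq = [None] * (LAST_AGE - FIRST_AGE + 1)
    for row in observed_rows(age_in_1976):
        seq[row["time"]] = responses.get(row["age"])
    return seq


# Hypothetical subject aged 14 in 1976 who answered only the first five waves.
seq = full_sequence(14, {14: 0, 15: 0, 16: 0, 17: 1, 18: 1})
print(seq)

# Across the seven starting ages, every one of the 23 time points is covered by some cohort.
covered = {row["time"] for age in range(11, 18) for row in observed_rows(age)}
print(len(covered))
```

Under this assumed schedule each subject has at most 9 observed time points, matching the maximum stated above.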

Example 3: Latent GOLD Longitudinal Analysis of Sparse Data
The plot on the left shows that the overall trend in drug usage during this period is nonlinear, with zero usage reported for 11-year-olds, increasing to a peak in the early 20s and then declining through age 33.
The plot on the right shows the results from a mixture latent Markov model, suggesting that the population consists of 2 distinct segments with different growth rates, with Class 2 consisting primarily of non-users.

Example 3: 2-class MLM Model
[Output panels: Class size; Initial state probabilities by class; Transition probabilities by class; Measurement model probabilities]


Example 3: Including Gender and Ethnicity as Covariates in Model
Adding gender and ethnicity improves the BIC.
Again, the 2-class LC growth model has a very large Lag1 BVR.

L-BVR    Null       2-class Regression    2-class MLM    2-class MLM w/ covariates
Time     1.2        4.8*                  1.6            1.6
Lag1     2282.1*    239.2*                0.1            0.1
Lag2     1196.1*    65.8*                 0.0            0.0


Example 3: Including Gender and Ethnicity as Covariates in Model
For concreteness, we focus on 18-year-olds.
18-year-olds who were in the lower usage state (State 1) at age 17 have a probability of .1876 of switching to the higher usage state (State 2) if they are in Class 2, compared to a probability of only .0211 of switching if they are in Class 1.
If they were in the higher usage state (State 2) at age 17, they have a probability of .9589 of remaining in that state, compared to only .3636 if they are in Class 1.
The more general pattern: Class 2 is more likely than Class 1 to move to and remain in a higher drug usage state.

Example 3: Including Gender and Ethnicity as Covariates in Model
Parameters output for model c, showing that age and ethnicity are significant.

Example with 3 Indicators
Model with 4 order-restricted latent states.
See Vermunt, J.K. (2013, in press). Latent class scaling models for longitudinal and multilevel data sets. In: G. R. Hancock and G. B. Macready (Eds.), Advances in latent class analysis: A Festschrift in honor of C. Mitchell Dayton. Charlotte, NC: Information Age Publishing, Inc.

Predicted vs. Observed for Each of the 3 Indicators

Summary
The latent Markov GUI in version 5.0 of Latent GOLD was designed to make it easy to estimate a very extended class of LM models. In this presentation we analyzed data based on a single dichotomous indicator. However, because of the program's structure and design, the simplicity of analysis and speedy estimation are maintained with even hundreds of time points, multiple indicators of different scale types, and time series of different lengths.
In addition, the LG Syntax is an open system that allows more extended models, such as:
- models with parameter restrictions
- more indicator scale types (censored, truncated, counts with exposure, beta, gamma)
- models with multiple state variables
- multilevel latent Markov models, including models with continuous random effects
- step-3 latent Markov modeling (new in LG 5.0)
- continuous-time latent Markov modeling (new in LG 5.0)

References and Additional Resources
Vermunt, J.K., Tran, B. and Magidson, J. (2008). Latent class models in longitudinal research. In: S. Menard (Ed.), Handbook of Longitudinal Research: Design, Measurement, and Analysis, pp. 373-385. Burlington, MA: Elsevier.
Vermunt, J.K. (2013, in press). Latent class scaling models for longitudinal and multilevel data sets. In: G. R. Hancock and G. B. Macready (Eds.), Advances in latent class analysis: A Festschrift in honor of C. Mitchell Dayton. Charlotte, NC: Information Age Publishing, Inc.
Latent GOLD 5.0 will be available, along with tutorials and demo data, on June 10. Check the website at statisticalinnovations.com

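Appendix: Equations
K latent states, T+1 equidistant time points.

Latent Markov (LM):
$$P(\mathbf{y}_i) = \sum_{x_0=1}^{K} \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(x_0, x_1, \ldots, x_T)\, P(\mathbf{y}_i \mid x_0, x_1, \ldots, x_T)$$

Initial latent state and transition sub-models:
$$P(x_0, x_1, \ldots, x_T) = P(x_0) \prod_{t=1}^{T} P(x_t \mid x_{t-1})$$

Measurement sub-model:
$$P(\mathbf{y}_i \mid x_0, x_1, \ldots, x_T) = \prod_{t=0}^{T} P(\mathbf{y}_{it} \mid x_t) = \prod_{t=0}^{T} \prod_{j=1}^{J} P(y_{itj} \mid x_t)$$

Latent Growth:
$$P(\mathbf{y}_i \mid \mathbf{z}_i) = \sum_{w=1}^{L} P(w \mid \mathbf{z}_i) \prod_{t=0}^{T} P(\mathbf{y}_{it} \mid w, \mathbf{z}_{it})$$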
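The LM likelihood above sums over K^(T+1) state sequences, which is rarely evaluated literally. As a minimal sketch for a single categorical indicator (J = 1) and no covariates, the code below computes P(y_i) both by brute-force enumeration and by the standard forward recursion, which gives the same value at much lower cost. Parameter values are hypothetical, transitions are assumed time-homogeneous, and this is not a description of Latent GOLD's estimation routine.

```python
from itertools import product


def lm_likelihood_bruteforce(y, initial, transition, emission):
    """Sum P(x_0..x_T) * P(y | x_0..x_T) over every latent state sequence."""
    n_states = len(initial)
    total = 0.0
    for states in product(range(n_states), repeat=len(y)):
        p = initial[states[0]]
        for prev, curr in zip(states, states[1:]):
            p *= transition[prev][curr]
        for x, obs in zip(states, y):
            p *= emission[x][obs]
        total += p
    return total


def lm_likelihood_forward(y, initial, transition, emission):
    """Same quantity via the forward recursion, linear in the number of time points."""
    n_states = len(initial)
    alpha = [initial[k] * emission[k][y[0]] for k in range(n_states)]
    for obs in y[1:]:
        alpha = [
            sum(alpha[s] * transition[s][r] for s in range(n_states)) * emission[r][obs]
            for r in range(n_states)
        ]
    return sum(alpha)


# Hypothetical parameters: K = 2 states, dichotomous indicator coded 0/1.
initial = [0.6, 0.4]
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.85, 0.15], [0.10, 0.90]]   # emission[x][y] = P(y | x)

y = (1, 0, 0, 1, 1)
print(lm_likelihood_bruteforce(y, initial, transition, emission))
print(lm_likelihood_forward(y, initial, transition, emission))
```

With 5 time points the brute-force sum has only 32 terms, but with hundreds of time points only the recursive form is practical.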

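Equations with Covariates

Latent Markov (LM):
$$P(\mathbf{y}_i \mid \mathbf{z}_i) = \sum_{x_0=1}^{K} \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(x_0, x_1, \ldots, x_T \mid \mathbf{z}_i)\, P(\mathbf{y}_i \mid x_0, x_1, \ldots, x_T, \mathbf{z}_i)$$

Initial latent state and transition sub-models:
$$P(x_0, x_1, \ldots, x_T \mid \mathbf{z}_i) = P(x_0 \mid \mathbf{z}_{i0}) \prod_{t=1}^{T} P(x_t \mid x_{t-1}, \mathbf{z}_{it})$$

Measurement sub-model:
$$P(\mathbf{y}_i \mid x_0, x_1, \ldots, x_T, \mathbf{z}_i) = \prod_{t=0}^{T} P(\mathbf{y}_{it} \mid x_t, \mathbf{z}_{it}) = \prod_{t=0}^{T} \prod_{j=1}^{J} P(y_{itj} \mid x_t, \mathbf{z}_{it})$$

Mixture Latent Markov (MLM):
$$P(\mathbf{y}_i \mid \mathbf{z}_i) = \sum_{w=1}^{L} \sum_{x_0=1}^{K} \sum_{x_1=1}^{K} \cdots \sum_{x_T=1}^{K} P(w, x_0, x_1, \ldots, x_T \mid \mathbf{z}_i)\, P(\mathbf{y}_i \mid w, x_0, x_1, \ldots, x_T, \mathbf{z}_i)$$

$$P(w, x_0, x_1, \ldots, x_T \mid \mathbf{z}_i) = P(w \mid \mathbf{z}_i)\, P(x_0 \mid w, \mathbf{z}_{i0}) \prod_{t=1}^{T} P(x_t \mid x_{t-1}, w, \mathbf{z}_{it})$$

$$P(\mathbf{y}_i \mid w, x_0, x_1, \ldots, x_T, \mathbf{z}_i) = \prod_{t=0}^{T} P(\mathbf{y}_{it} \mid x_t, w, \mathbf{z}_{it}) = \prod_{t=0}^{T} \prod_{j=1}^{J} P(y_{itj} \mid x_t, w, \mathbf{z}_{it})$$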
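The MLM likelihood is a class-weighted sum of LM likelihoods, one per latent class w. The sketch below illustrates this for a 2-class mover-stayer structure with a single dichotomous indicator. It makes simplifying assumptions not required by the equations: no covariates, time-homogeneous transitions, and measurement probabilities shared across classes. All numbers are hypothetical.

```python
from itertools import product


def forward_likelihood(y, initial, transition, emission):
    """P(y) for one latent Markov chain, computed with the forward recursion."""
    n_states = len(initial)
    alpha = [initial[k] * emission[k][y[0]] for k in range(n_states)]
    for obs in y[1:]:
        alpha = [
            sum(alpha[s] * transition[s][r] for s in range(n_states)) * emission[r][obs]
            for r in range(n_states)
        ]
    return sum(alpha)


def mlm_likelihood(y, class_sizes, initials, transitions, emission):
    """Mixture over classes w: sum of P(w) * P(y | w)."""
    return sum(
        p_w * forward_likelihood(y, initials[w], transitions[w], emission)
        for w, p_w in enumerate(class_sizes)
    )


# Hypothetical 2-class mover-stayer setup (class 0 = Stayer, class 1 = Mover).
class_sizes = [0.5, 0.5]
initials = [[0.7, 0.3], [0.55, 0.45]]
transitions = [
    [[1.0, 0.0], [0.0, 1.0]],          # Stayer: identity matrix, no state changes
    [[0.8, 0.2], [0.3, 0.7]],          # Mover: free transition probabilities
]
emission = [[0.9, 0.1], [0.15, 0.85]]  # emission[x][y] = P(y | x), shared across classes

y = (0, 0, 1, 1, 1)
print(mlm_likelihood(y, class_sizes, initials, transitions, emission))

# Sanity check: the likelihoods of all 32 response patterns sum to 1.
print(sum(mlm_likelihood(p, class_sizes, initials, transitions, emission)
          for p in product([0, 1], repeat=5)))
```

Class-specific measurement probabilities or covariate effects would enter by making `emission`, `initials`, and `transitions` depend on w and z in the same way the equations do.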

Longitudinal Bivariate Residuals (L-BVRs)
Longitudinal bivariate residuals quantify, for each response variable Y_k, how well the model predicts the overall trend as well as the first- and second-order autocorrelations:
BVR.Time = BVR_k(time, y_k)
BVR.Lag1 = BVR_k(y_k[t-1], y_k[t])
BVR.Lag2 = BVR_k(y_k[t-2], y_k[t])
The lag residuals measure the autocorrelation remaining unexplained by the model.
Lag1: