Latent class analysis for intensive longitudinal data, Hidden Markov processes, Regime switching models and Dynamic Structural Equations in Mplus


Tihomir Asparouhov, Bengt Muthén and Ellen Hamaker
Muthén & Muthén
May 24, 2016

Overview
- Motivation
- Dynamic Structural Equation Model (DSEM) framework and estimation
- New multilevel mixture models: these are needed as building blocks for the more advanced models
- Single-level models: HMM (Hidden Markov Models), MSAR (Markov Switching Auto-Regressive), MSKF (Markov Switching Kalman Filter)
- Two-level HMM, MSAR, MSKF

Motivation
- Merge time series, structural equation, multilevel, and mixture modeling concepts in a generalized modeling framework in Mplus V8.
- In this context two-level means single-level: the cluster is always the individual. Many observations are collected within subject and analyzed in long format.
- Most time-series models were developed for single-level data; however, most social science applications need two-level methods because we study many individuals across time, rather than the US economy across time.
- Mplus release timeline: V8 will have DSEM and probably single-level MSAR; V8.1 will have two-level MSAR.

Motivation continued
- Consider the following hypothetical example. A group of depression patients answer a brief daily survey to evaluate their current state. Based on current observations, past history, most recent history, and similar behavior from other patients, we classify the patient into one of 3 states:
  S1: OK
  S2: Stretch of poor outcomes, needs doctor visit/evaluation
  S3: At risk for suicide, needs hospitalization
- Future of health care? Cheaper, smarter, and more effective?
- The models we describe in this talk can be used to model the data from this hypothetical example: they combine mixture, multilevel, time-series, latent variable, and structural models.

Motivation continued
- The goal of these models is to provide continuous-time monitoring of unobserved categorical and continuous latent constructs. This is needed to study latent variable development across time and to be able to detect problems earlier and prevent/react.
- Modeling two distinct sources of correlation: within-individual correlation can be due to a subject-specific effect (two-level modeling) or due to the proximity of observations, i.e., autocorrelation (time-series modeling). The two types of correlation are easy to parse out from the data in sufficiently long longitudinal series.
- Various mobile devices are now utilized for data collection: ecological momentary assessment (EMA) data and experience sampling methods (ESM). Increased need for time-intensive methods.
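The two correlation sources can be seen in a toy simulation (plain Python, not Mplus; all parameter values are illustrative): a random intercept induces a constant within-person correlation at every lag, while an AR(1) residual induces correlation that decays with lag, so comparing short-lag and long-lag autocorrelations separates the two.

```python
import random, math

def simulate(n_subj=200, n_time=200, icc_sd=1.0, phi=0.5, seed=1):
    """Simulate y_it = b_i + w_it with b_i ~ N(0, icc_sd^2) and
    w_it an AR(1) process with autoregression phi."""
    rng = random.Random(seed)
    data = []
    for i in range(n_subj):
        b = rng.gauss(0, icc_sd)           # subject-specific effect
        w = rng.gauss(0, 1)
        series = []
        for t in range(n_time):
            w = phi * w + rng.gauss(0, 1)  # autocorrelated residual
            series.append(b + w)
        data.append(series)
    return data

def lag_corr(data, lag):
    """Pooled lag-k correlation across all subjects (naive estimator)."""
    xs, ys = [], []
    for series in data:
        xs += series[:-lag]
        ys += series[lag:]
    mx = sum(xs) / len(xs); my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return sxy / (sx * sy)

data = simulate()
# The lag-1 correlation mixes both sources; by lag 10 the AR(1) part
# has died out and mostly the subject-effect correlation remains.
print(lag_corr(data, 1), lag_corr(data, 10))
```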

Mplus general DSEM framework
- For more DSEM information see the roughly 200 slides from the pre-conference workshop. Here is a quick review. We use a simplified DSEM version that excludes TVEM (no cross-classified modeling).
- Let Y_it be an observed vector of measurements for individual i at time t. Let η_it be a latent variable vector for individual i at time t. Let X_it be an observed vector of covariates for individual i at time t. Similarly, Y_i, X_i and η_i are individual-specific, time-invariant variables.
- Main decomposition equation:
  Y_it = Y_{1,it} + Y_{2,i}
  Y_{2,i} is the individual-specific contribution, a normal latent variable. Y_{1,it} is the residual.

DSEM framework continued
- The within-level model includes the previous periods' latent variables η_{i,t-l} as predictors, modeling the correlation between consecutive periods. L is the lag: how many time periods back are included in the model.
  Y_{1,it} = ν_1 + Σ_{l=0}^{L} Λ_{1,l} η_{i,t-l} + ε_{it}
  η_{i,t} = α_1 + Σ_{l=0}^{L} B_{1,l} η_{i,t-l} + Γ_1 x_{it} + ξ_{it}
- The usual structural equations hold at level 2. Every within-level parameter can be a random effect, i.e., part of η_i:
  Y_{2,i} = ν_2 + Λ_2 η_i + ε_i
  η_i = α_2 + B_2 η_i + Γ_2 x_i + ξ_i

DSEM framework continued
- Observed variables can also have lagged versions and can be used as predictors.
- Ordered polytomous and binary dependent variables are included in this framework using the underlying Y* approach: probit link.
- The model is also of interest when N = 1: no second level, all observations are correlated, as in multivariate econometric models. The N = 1 model can also be used on the data from a single individual to construct a psychological profile and match it to the known profile of a psychological disorder.
- We use Bayes estimation.
- The above model has variables with negative or zero time indices. We treat those as auxiliary parameters that have a prior distribution. An automatic option specification is implemented in Mplus.

DSEM Mixture Model
- Let S_it be a categorical latent variable for individual i at time t. We call it the State. In Mplus this is a within-level latent variable.
- S can take values 1, 2, ..., K where K is the number of classes/states in the model.
- The model on the between level is not affected by S. The model on the within level is state-specific:
  [Y_{1,it} | S_it = s] = ν_{1,s} + Σ_{l=0}^{L} Λ_{1,l,s} η_{i,t-l} + ε_{it}
  [η_{i,t} | S_it = s] = α_{1,s} + Σ_{l=0}^{L} B_{1,l,s} η_{i,t-l} + Γ_{1,s} x_{it} + ξ_{it}
- The residual variances Var(ε_{it} | S_it = s) and Var(ξ_{it} | S_it = s) are also state-specific.

DSEM Mixture Model Continued
- In addition to the above model we have to specify the distribution of S_it:
  P(S_it = s) = Exp(α_{is}) / Σ_{s=1}^{K} Exp(α_{is})
- The α_{is} are normally distributed random effects, i.e., they are part of the vector η_i. For identification the last one is zero: α_{iK} = 0.
- Individual-level predictors can be used to predict α_{is}: regress the logits on covariates.
- In MCMC we use Metropolis-Hastings to update the random effects, using the proposal distribution N(α_{is}, Σ), where Σ is the model estimate for Σ = Var(α_{is}) from a burn-in period. The new draw α̂_{is} is accepted with probability
  Acceptance ratio = [Prior(α̂_{is}) Likelihood(S_it | α̂_{is})] / [Prior(α_{is}) Likelihood(S_it | α_{is})]
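The Metropolis-Hastings step can be sketched in plain Python. This is a toy version only: one random logit per subject, K = 2 with the second logit fixed at 0, a standard normal prior, and a hand-picked proposal SD; none of these are Mplus defaults.

```python
import random, math

def log_prior(alpha, mean=0.0, sd=1.0):
    # N(mean, sd^2) prior on the random logit (up to a constant)
    return -0.5 * ((alpha - mean) / sd) ** 2

def log_lik(alpha, states):
    # K = 2 multinomial logit with alpha_2 = 0 for identification:
    # P(S = 1) = exp(alpha) / (exp(alpha) + 1)
    p1 = math.exp(alpha) / (math.exp(alpha) + 1.0)
    return sum(math.log(p1 if s == 1 else 1.0 - p1) for s in states)

def mh_update(alpha, states, rng, prop_sd=0.5):
    """One Metropolis-Hastings update of a subject's random logit."""
    prop = rng.gauss(alpha, prop_sd)               # symmetric proposal
    log_ratio = (log_prior(prop) + log_lik(prop, states)
                 - log_prior(alpha) - log_lik(alpha, states))
    if math.log(rng.random()) < log_ratio:         # accept/reject
        return prop
    return alpha

# Toy run: a subject who spends ~80% of the time in state 1
rng = random.Random(7)
states = [1] * 80 + [2] * 20
alpha = 0.0
draws = []
for it in range(2000):
    alpha = mh_update(alpha, states, rng)
    if it >= 500:                                  # discard burn-in
        draws.append(alpha)
post_mean = sum(draws) / len(draws)
print(round(post_mean, 2))  # near log(0.8/0.2) ≈ 1.39, shrunk toward the prior
```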

DSEM Mixture Model Continued
- Potential avenues to improve mixing if needed:
  - Use cΣ where c is a constant regulating the acceptance rate within a desirable range of 15% to 35%.
  - Use a cluster-specific proposal distribution from a burn-in period, cΣ_i, where Σ_i = Var(α_{is} | data). Unbalanced designs will most likely need this.
  - Separate α_{is} from the other random effects.
- Limited simulations show good performance even without these added steps; however, the final version will probably include them.

Bayes Multilevel Mixture Model
- Bayes Multilevel Mixture Model = DSEM Mixture without the lag variables η_{i,t-l}. For the next 20 slides: no intensive longitudinal data.
- The model is essentially the two-level mixture model implemented in Mplus, which can be estimated with ML. The advantage of this model is that we have Bayes estimation and thus can estimate models with any number of random effects.
- With ML, the α_{is} typically cannot all be estimated, and we constrain them to be proportional via a factor to reduce the number of numerical integration dimensions from K-1 to 1. With Bayes we don't need to do that.
- Asparouhov, T. & Muthén, B. (2008). Multilevel mixture models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 27-51. Charlotte, NC: Information Age Publishing, Inc.

LCA with clustered data
- The most common approach for LCA with clustering (nested data) is to use single-level ML with robust/sandwich SEs that take the clustering into account. Three problems:
  - Does not allow cluster-specific class distributions.
  - Assumes full measurement invariance for the latent class measurement model.
  - If these assumptions are not met, it is likely to yield spurious classes.

LCA with measurement non-invariance
- For indicator p, individual i in cluster j:
  P(U_pij = 1 | C_ij = k) = Φ(τ_pk + τ_pj)
  P(C_ij = k) = Exp(α_k + α_jk) / Σ_{s=1}^{K} Exp(α_s + α_js)
- τ_pk is a non-random parameter (the usual threshold parameter).
- τ_pj is a zero-mean measurement non-invariance random effect that allows certain indicators to be more or less frequent in cluster j than the population values, beyond what the latent class distribution explains. For example, certain measurement instruments are not universally accurate.
- α_k are non-random effects that fit the population-level class distribution.
- α_jk are zero-mean random effects that allow cluster-specific class distributions.
- ML estimation would use 10-dimensional integration. Bayes: 40 sec per replication.
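To make the two pieces concrete, here is a small sketch in plain Python of the probit measurement model and the softmax class distribution with cluster deviations. The parameter values are made up for illustration only.

```python
import math

def probit(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def item_prob(tau_pk, tau_pj):
    """P(U_pij = 1 | C_ij = k) = Phi(tau_pk + tau_pj):
    the cluster effect tau_pj shifts the item threshold."""
    return probit(tau_pk + tau_pj)

def class_probs(alpha, alpha_j):
    """Softmax of population logits alpha_k plus cluster deviations alpha_jk."""
    e = [math.exp(a + aj) for a, aj in zip(alpha, alpha_j)]
    z = sum(e)
    return [v / z for v in e]

# An invariant cluster (tau_pj = 0) vs. a cluster where the same item
# is easier to endorse (tau_pj = 0.8), at the same threshold tau_pk = 0
print(item_prob(0.0, 0.0))                            # 0.5
print(item_prob(0.0, 0.8) > item_prob(0.0, 0.0))      # True

# Population class distribution with no cluster deviation
print(class_probs([0.5, 0.0, -0.5], [0.0, 0.0, 0.0]))
```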

LCA with measurement non-invariance: 3-class simulation

LCA with measurement non-invariance: simulation results

Unrestricted Two-Level Mixture Model Simulation
- Y_ij is a vector of observed continuous variables for individual i in cluster j:
  Y_ij = Y_{b,j} + Y_{w,ij}
  [Y_{w,ij} | C_ij = k] ~ N(μ_k, Σ_k)
  P(C_ij = k) = Exp(α_k + α_jk) / Σ_{s=1}^{K} Exp(α_s + α_js)
- This model requires numerical integration in Mplus even if the latent class variable is observed.
- With Bayes it can even accommodate categorical variables.
- It makes feasible the models discussed in Asparouhov, T. & Muthén, B. (2012). Multiple group multilevel analysis. Mplus Web Notes: No. 16.

Unrestricted Two-Level Mixture Model Simulation

Unrestricted Two-Level Mixture Model Simulation Results

Multilevel Latent Transition Analysis (MLTA) with cluster-specific transition probabilities
- Hypothetical example: students are nested within schools and are classified into 2 classes at two separate occasions. We are interested in how P(C_2 | C_1) varies across schools.

Multilevel Latent Transition Analysis (MLTA) with cluster-specific transition probabilities
- The model that can be estimated in Mplus with ML: γ_dcj does not really vary across clusters; it is really γ_dc.
- Even if we regress α_2j on α_1j (equivalent to a correlation), we still have just 2 random effects for the joint distribution of two binary latent class variables, while the degrees of freedom is 3.
- In the current Mplus ML estimation, P(C_2 | C_1) does not fully vary across clusters.

Multilevel Latent Transition Analysis (MLTA) with cluster-specific transition probabilities
- New Bayes model:
  P(C_{1,ij} = k_1) = Exp(α_{j,k_1}) / Σ_{k_1=1}^{K} Exp(α_{j,k_1})
  P(C_{2,ij} = k_2 | C_{1,ij} = k_1) = Exp(α_{j,k_1,k_2}) / Σ_{k_2=1}^{K} Exp(α_{j,k_1,k_2})
- The old model is equivalent to the assumption that α_{j,2,1} - α_{j,1,1} is the constant γ.
- α_{j,k_1,K} = 0 for identification.
- Note that the transition probabilities can also be regressed on predictors by regressing α_{j,k_1,k_2} on predictors.
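The mapping from cluster-specific logits to transition probabilities can be sketched directly: for each origin class k_1, the K-1 free logits are completed with the fixed zero and passed through a softmax. A plain Python sketch with made-up values for a K = 2 cluster:

```python
import math

def softmax_row(logits):
    e = [math.exp(x) for x in logits]
    z = sum(e)
    return [v / z for v in e]

def transition_matrix(alpha):
    """Build a cluster-specific K x K transition table from logits
    alpha[k1][k2], appending the last-category logit fixed at 0 for
    identification. Row k1 gives P(C2 = . | C1 = k1)."""
    return [softmax_row(row + [0.0]) for row in alpha]

# Toy cluster with K = 2: alpha[k1] holds the single free logit
alpha_j = [[1.5],    # from class 1: logit of landing in class 1
           [-0.5]]   # from class 2: logit of landing in class 1
Q = transition_matrix(alpha_j)
for row in Q:
    print([round(p, 3) for p in row])
```

Regressing the transition probabilities on a predictor, as on the slide, would simply replace the fixed `alpha_j` entries with a linear function of the covariate before the softmax.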

MLTA
- A model with two binary class variables has 3 random effects (3 df in the joint C_1 by C_2 distribution):
  α_{j,1} = log(P(C_1 = 1) / P(C_1 = 2))
  α_{j,1,1} = log(P(C_2 = 1 | C_1 = 1) / P(C_2 = 2 | C_1 = 1))
  α_{j,2,1} = log(P(C_2 = 1 | C_1 = 2) / P(C_2 = 2 | C_1 = 2))
- In MCMC the updates for α_{j,k_1,k_2} are no different from those for α_{j,k_1}, i.e., we use MH.
- Simulation studies show that cluster sizes should not be small (> 50). If cluster sizes are small, the joint tables will have empty cells that lead to logits of infinity. The result of that is biased overestimation of Var(α_{j,k_1,k_2}).

MLTA Simulation

MLTA Simulation

MLTA Simulation Results

MLTA Simulation Results

Single-Level LTA with Probability Parameterization
- In single-level models the logits of the transition probabilities are not random effects; they are non-random parameters.
- Mplus has 3 different parameterizations for ML estimation of LTA: logit, loglinear, probability.
- New Bayes estimation for LTA with probability parameterization: it allows Lag = 1 or Lag = 2, P(C_2 | C_1) or P(C_3 | C_1, C_2), just like ML.
- The model parameters are the probabilities directly: P(C_1), P(C_2 | C_1) and P(C_3 | C_1, C_2).
- Easy MCMC implementation: P(C_1), P(C_2 | C_1) and P(C_3 | C_1, C_2) have conjugate Dirichlet priors.
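With the probability parameterization the Gibbs step is a standard conjugate update: given the current class sequence, each row of transition probabilities is redrawn from a Dirichlet whose parameters are the prior counts plus the observed transition counts. A plain Python sketch, assuming a uniform Dirichlet(1, ..., 1) prior purely for illustration:

```python
import random

def dirichlet(params, rng):
    # Sample from Dirichlet(params) via normalized Gamma draws
    g = [rng.gammavariate(a, 1.0) for a in params]
    z = sum(g)
    return [v / z for v in g]

def update_transition_rows(states, K, rng, prior=1.0):
    """Conjugate update: count transitions k1 -> k2 in the current
    class sequence and draw each row P(C_t = . | C_{t-1} = k1)
    from Dirichlet(prior + counts)."""
    counts = [[prior] * K for _ in range(K)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    return [dirichlet(row, rng) for row in counts]

rng = random.Random(3)
# Sticky toy sequence: long runs in each of two classes
states = [0] * 50 + [1] * 50 + [0] * 50 + [1] * 50
Q = update_transition_rows(states, 2, rng)
print([[round(p, 2) for p in row] for row in Q])  # diagonal-heavy rows
```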

LTA with Probability Parameterization Simulation

LTA with Probability Parameterization Results

Single-Level Hidden Markov Models
- The data is again intensive longitudinal. First we consider the single-level model, N = 1.
- We discuss 3 models:
  - Hidden Markov Model (HMM)
  - Markov Switching Autoregressive (MSAR)
  - Markov Switching Kalman Filter (MSKF)
- DSEM allows time-series modeling for observed and latent continuous variables. DSEM Mixture does not allow autocorrelation for the latent categorical variable.
- In time-series data it is not realistic to assume that S_t and S_{t-1} are independent, where S_t is the state/class variable at time t. On the contrary: a realistic model will allow S_t and S_{t-1} to be highly correlated if the observations are taken very frequently.

Hidden Markov Models
- The model has two parts: a measurement part and a Markov switching part.
- The measurement part is like any other mixture model; it is defined by P(Y_t | S_t), where Y_t is a vector of observed variables and S_t is the latent class/state variable at time t.
- The Markov switching (regime switching) part is given by P(S_t | S_{t-1}). We use the same probability parameterization based on Dirichlet conjugate priors that we used with two latent class variables.
- The transition model Q = P(S_t | S_{t-1}) has K(K-1) probability parameters. The transition matrix is K by K but the columns add up to 1.
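The two parts can be simulated jointly: draw the state chain from the transition model, then draw each observation from that state's measurement distribution. A plain Python sketch with a univariate normal measurement part and illustrative parameter values (note: for coding convenience this sketch stores Q row-stochastically, i.e., rows rather than columns sum to 1):

```python
import random

def simulate_hmm(Q, means, sds, T, rng, s0=0):
    """Simulate T steps of an HMM.
    Q[j][k] = P(S_t = k | S_{t-1} = j), rows summing to 1;
    measurement part: Y_t | S_t = k ~ N(means[k], sds[k]^2)."""
    states, ys = [], []
    s = s0
    for _ in range(T):
        u, cum = rng.random(), 0.0
        for k, p in enumerate(Q[s]):            # Markov switching part
            cum += p
            if u < cum:
                s = k
                break
        states.append(s)
        ys.append(rng.gauss(means[s], sds[s]))  # measurement part
    return states, ys

rng = random.Random(11)
Q = [[0.9, 0.1],   # sticky two-state chain
     [0.2, 0.8]]
states, ys = simulate_hmm(Q, means=[0.0, 4.0], sds=[1.0, 1.0], T=2000, rng=rng)
frac1 = states.count(1) / len(states)
print(round(frac1, 2))  # near the stationary probability 0.1/(0.1+0.2) ≈ 0.33
```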

Hidden Markov Models
- Note that p = P(S_t) is not a model parameter. The probability is implicit and is the distribution of S_t if the sequence is observed to infinity. It can be obtained implicitly from the stationarity assumption that P(S_t) is independent of t, i.e., from the equation Qp = p. Since the first K-1 equations added up give the last, we need to replace the last equation with Sum(p) = 1 to solve it.
- The MCMC step that updates the latent class variable:
  P(C_t = k | C_{t-1}, C_{t+1}) = P(C_{t+1} | C_t = k) P(C_t = k | C_{t-1}) / Σ_{k=1}^{K} P(C_{t+1} | C_t = k) P(C_t = k | C_{t-1})
  P(C_t = k | C_{t-1}, C_{t+1}, Y_t) = P(Y_t | C_t = k) P(C_t = k | C_{t-1}, C_{t+1}) / Σ_{k=1}^{K} P(Y_t | C_t = k) P(C_t = k | C_{t-1}, C_{t+1})
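The stationary distribution solving Qp = p with Sum(p) = 1 can be computed numerically for any ergodic chain by simply iterating p ← Qp until it stops changing; the fixed point of that iteration is exactly the solution described above. A plain Python sketch using the column-stochastic convention of the slide:

```python
def stationary(Q, iters=500):
    """Stationary distribution of a column-stochastic transition
    matrix Q (Q[k][j] = P(S_t = k | S_{t-1} = j)), found by iterating
    p <- Q p, which converges to the solution of Qp = p, sum(p) = 1."""
    K = len(Q)
    p = [1.0 / K] * K
    for _ in range(iters):
        p = [sum(Q[k][j] * p[j] for j in range(K)) for k in range(K)]
    return p

# Two-state example: P(stay in state 1) = 0.9, P(move 2 -> 1) = 0.2
Q = [[0.9, 0.2],
     [0.1, 0.8]]
p = stationary(Q)
print([round(x, 3) for x in p])  # closed form for K = 2: (2/3, 1/3)
```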

Hidden Markov Models
- C_{t=0} is treated as an auxiliary parameter and can be given a prior.
- Mplus provides an automatic prior option: the prior is updated in the first 100 MCMC iterations, which are subsequently discarded, and the prior is set to the current sample distribution of C_t. This is the default.
- If the time series is long enough, that prior does not matter.

HMM Simulation

HMM Results

Combining HMM and DSEM
- By combining HMM and DSEM we obtain a general model that includes time series for the latent class variable as well as for factors and observed variables.
- Markov Switching Autoregressive (MSAR) is simply the combination of Mixture-AR and HMM.
- Markov Switching Kalman Filter (MSKF) is simply the combination of Mixture-Kalman Filter and HMM.

Bivariate MSAR
- Let Y_1t and Y_2t be two observed variables at time t and S_t be the latent state variable at time t, taking one of two states:
  Y_1t = α_{1,S_t} + β_{1,S_t} Y_{1,t-1} + β_{2,S_t} Y_{2,t-1} + ε_{1t}
  Y_2t = α_{2,S_t} + β_{3,S_t} Y_{1,t-1} + β_{4,S_t} Y_{2,t-1} + ε_{2t}
- The model has 20 parameters: 4 α, 8 β, 6 residual covariance parameters, and 2 parameters in the transition matrix: P(S_t = 1 | S_{t-1} = 1) and P(S_t = 1 | S_{t-1} = 2).
- The model addresses the chicken-or-the-egg problem. Example: wife's and husband's mood in a good and a bad state. Hamaker, Grasman and Kamphuis (2016). Modeling BAS Dysregulation in Bipolar Disorder: Illustrating the Potential of Time Series Analysis, Assessment. They argue that bipolar individuals show two-state behaviour while controls don't.
- Simulation study speed of computation: it takes 1/3 of a second to estimate the model on a sample with 1000 observations.
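A data-generating sketch for this bivariate MSAR in plain Python. All parameter values below are made up for illustration; the point is only the structure: a sticky two-state chain switches the intercepts and the AR/cross-lagged coefficients of the two series.

```python
import random

def simulate_msar(T, rng):
    """Simulate a bivariate two-state MSAR with illustrative parameters."""
    alpha = {0: (0.0, 0.0), 1: (2.0, 2.0)}   # state-specific intercepts
    beta = {0: (0.5, 0.1, 0.1, 0.5),         # (b1, b2, b3, b4) in state 0
            1: (0.2, 0.3, 0.3, 0.2)}         # cross-lags dominate in state 1
    p_stay = {0: 0.95, 1: 0.9}               # sticky regime switching
    s, y1, y2 = 0, 0.0, 0.0
    out = []
    for _ in range(T):
        if rng.random() > p_stay[s]:         # Markov regime switch
            s = 1 - s
        a1, a2 = alpha[s]
        b1, b2, b3, b4 = beta[s]
        # Both right-hand sides use the lag-1 values (old y1, y2)
        y1, y2 = (a1 + b1 * y1 + b2 * y2 + rng.gauss(0, 1),
                  a2 + b3 * y1 + b4 * y2 + rng.gauss(0, 1))
        out.append((s, y1, y2))
    return out

data = simulate_msar(5000, random.Random(5))
m0 = sum(y1 for s, y1, _ in data if s == 0) / sum(1 for s, _, _ in data if s == 0)
m1 = sum(y1 for s, y1, _ in data if s == 1) / sum(1 for s, _, _ in data if s == 1)
print(round(m0, 1), round(m1, 1))  # the two regimes have clearly separated means
```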

Bivariate MSAR Simulation

Bivariate MSAR Results

Markov Switching Kalman Filter (MSKF)
- Three factor indicators Y_jt measuring a factor η_t. S_t is a two-state categorical latent variable. We estimate a hidden Markov model for S_t, i.e., P(S_t = 1 | S_{t-1} = 1) and P(S_t = 1 | S_{t-1} = 2) are probability parameters independent of t.
- For j = 1, 2, 3:
  Y_jt = ν_j + λ_j η_t + ε_jt
  η_t = α_{S_t} + β_{1,S_t} η_{t-1} + β_{2,S_t} η_{t-2} + ξ_t
- MSAR(2) model for the factor.
- For identification purposes α_1 = 0 and λ_1 = 1 (this sets the scale of the latent variable to be the same as the first indicator, which is probably a better parameterization than fixing the variance of the factor to 1?).
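The MSKF data-generating process can be sketched the same way: a state-specific AR(2) factor driving three normal indicators. Parameter values are illustrative only; the identification constraints α_1 = 0 and λ_1 = 1 are kept as on the slide.

```python
import random

def simulate_mskf(T, rng):
    """Simulate the MSKF sketch: 3 indicators loading on a factor that
    follows a state-specific AR(2). Illustrative parameter values."""
    nu = [0.0, 0.5, -0.5]
    lam = [1.0, 0.8, 1.2]          # lambda_1 fixed at 1
    alpha = [0.0, 2.0]             # alpha_1 fixed at 0
    beta1 = [0.5, 0.3]
    beta2 = [0.2, 0.1]
    p_stay = [0.95, 0.95]
    s, eta1, eta2 = 0, 0.0, 0.0    # eta_{t-1}, eta_{t-2}
    out = []
    for _ in range(T):
        if rng.random() > p_stay[s]:
            s = 1 - s
        eta = alpha[s] + beta1[s] * eta1 + beta2[s] * eta2 + rng.gauss(0, 1)
        y = [nu[j] + lam[j] * eta + rng.gauss(0, 0.5) for j in range(3)]
        out.append((s, y))
        eta1, eta2 = eta, eta1
    return out

data = simulate_mskf(5000, random.Random(2))
mean_y1_s1 = (sum(y[0] for s, y in data if s == 1)
              / sum(1 for s, y in data if s == 1))
print(round(mean_y1_s1, 1))  # long-run factor mean in state 2: 2/(1-0.3-0.1) ≈ 3.3
```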

MSKF Simulation

MSKF Results

MSKF Results

MSKF Analysis of Results
- Small bias in the means. What to do? I left this unanswered on purpose to make the point. Possible potential causes:
  - Model poorly identified: not enough parameters differ across classes? Simplify the model by holding parameters equal to improve identification. Parameters that are not significantly different, where it makes sense, can be constrained to be equal. Loadings not significantly different from 1 can be fixed to 1.
  - Not enough sample size to get the bias to zero. Increase the sample size. Simplify the model. Run simulations with a bigger sample. Note that the entropy remains the same as the sample increases.
  - Not enough MCMC iterations. Run with more iterations. Look at the traceplots of the offending parameters to evaluate convergence.
  - Label switching in mixtures? I have not seen this happen yet. I think it is very rare because of the Markov regime switching/smoothing.

MSKF Analysis of Results
- Add informative priors to improve the identifiability of the model.
- Maybe the entropy is too low?
- In smaller sample size situations the number of regime switching events could be low if the state is stable: not enough to build the model. Recall that we use P(C_t = 1 | C_{t-1} = 2) and P(C_t = 2 | C_{t-1} = 1) to figure out the C_t distribution. If there are very few switch events, accurate stable estimates are probably unrealistic.
- Other things I don't know about.
- These methods are new and are the cutting edge of methodology. Not using simulation studies in parallel to a real data estimation is irresponsible. Simulation studies are not possible in Bugs. They are possible in Mplus because it is much faster.

MSKF Data

MSKF Analysis
- The class switches are not clearly visible. One needs the detailed analysis that the MSKF model provides.
- It is not very easy to come up with E(Y_t | S_t) explicitly. Using the simulated data, however, I can compute that E(Y_t | S_t = 1) is around 1 and E(Y_t | S_t = 2) is around 4. This can also be computed by imputing S_t with the factor score command and computing the MCMC-based values.
- Periods 1 to 25: mostly switching back and forth, with a stable phase in regime 1 for periods 10, 11, 12, 13. Periods 25 to 50: S_t has entered a stable regime 2 with only a single drop to regime 1 at period 32.
- This makes a lot more sense once you know that the two estimated values for E(Y_t | S_t) are 1 and 4.

Two-level models
- Not really two-level models: these are intensive longitudinal models with multiple observations nested within clusters. This methodology is more suitable for the social sciences than the N = 1 case.
- Dynamic Latent Class Analysis (DLCA)
- Multilevel Markov Switching Autoregressive Models (MMSAR)
- Multilevel Markov Switching Kalman Filter Models (MMSKF)

Dynamic Latent Class Analysis
- 200 individuals, 4 binary class indicators, 2-class model; each individual has 100 observation times. The model is traditional LTA with 100 time points:
  P(U_pit = 1 | S_it = k) = Φ(τ_kp + τ_ip)
  P(S_it = 1 | S_{i,t-1} = 1) = Exp(α_i1) / (1 + Exp(α_i1))
  P(S_it = 2 | S_{i,t-1} = 2) = Exp(α_i2) / (1 + Exp(α_i2))
  τ_ip ~ N(0, σ_p)
  α_ij ~ N(α_j, σ_j)
- Subject-specific transition matrix (columns add up to 1), with p_ij = Exp(α_ij) / (1 + Exp(α_ij)):
  ( p_i1      1 - p_i2 )
  ( 1 - p_i1  p_i2     )
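The subject-specific transition matrix can be assembled directly from the two random logits. A plain Python sketch using the column-stochastic convention of the slides (column j holds the distribution of S_it given S_{i,t-1} = j); the logit values are illustrative:

```python
import math

def inv_logit(a):
    return math.exp(a) / (1.0 + math.exp(a))

def subject_transition(alpha_i1, alpha_i2):
    """2 x 2 column-stochastic transition matrix for one subject:
    p_ij = P(stay in state j) = inv_logit(alpha_ij)."""
    p1, p2 = inv_logit(alpha_i1), inv_logit(alpha_i2)
    return [[p1, 1.0 - p2],
            [1.0 - p1, p2]]

# A sticky subject: both staying logits are positive
Q = subject_transition(2.0, 1.0)
for row in Q:
    print([round(x, 3) for x in row])
```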

DLCA Simulation

DLCA Results

Multilevel Markov Switching Autoregressive Models (MMSAR) Simulation
- 100 individuals with 100 observation times. One dependent variable Y_it.
- S_it is a two-state categorical latent variable. We estimate a hidden Markov model for S_it with subject-specific transition probabilities.
- We estimate a two-level regression with random intercept and an MSAR(1) model for the within part of Y_it (note the mean is on the within level - version 8.1):
  Y_it = Y_{b,i} + Y_{w,it}
  Y_{w,it} = μ_{S_it} + β_{S_it} Y_{w,i,t-1} + ε_it
  P(S_it = j | S_{i,t-1} = j) = Exp(α_ij) / (1 + Exp(α_ij))
  α_ij ~ N(α_j, σ_j), Y_{b,i} ~ N(0, σ)

MMSAR Simulation

MMSAR Results

Multilevel Markov Switching Kalman Filter Models (MMSKF)
- 100 individuals with 100 observation times, 4 continuous factor indicators Y_pit measuring a factor η_it. S_it is a two-state categorical latent variable. We estimate a hidden Markov model for S_it with subject-specific transition probabilities, and an MSAR(1) model for the factor.
- For p = 1, ..., 4:
  Y_pit = ν_pi + λ_p η_it + ε_pit
  η_it = μ_{S_it} + β_{S_it} η_{i,t-1} + ξ_it
  P(S_it = j | S_{i,t-1} = j) = Exp(α_ij) / (1 + Exp(α_ij))
  α_ij ~ N(α_j, σ_j), ν_pi ~ N(ν_p, σ_p)
- For identification purposes μ_1 = 0 and λ_1 = 1.

MMSKF Simulation

MMSKF Results, within level, class 1

MMSKF Results, within level, class 2

MMSKF Results, between level

Issues
- Determining the number of classes: ignore the time series and use standard methods.
- Starting values: maybe coming soon. Not as big an issue for Bayes as it is for ML, because MCMC naturally goes through many starting values.
- Comparing models: maybe coming soon - DIC, model test, new PPP methods, other new methods.
- Multiple solutions: we get a lot of multiple solutions with ML. Does that happen with Bayes too? Using different starting values? This is an issue even without time series.
- Label switching: hopefully not much of a problem due to Markov smoothing.