Latent Class Analysis

1 Latent Class Analysis Karen Bandeen-Roche October 27, 2016

2 Objectives
For you to leave here knowing:
- When is the latent class analysis (LCA) model useful?
- What is the LCA model and what are its underlying assumptions?
- How are LCA parameters interpreted?
- How are LCA parameters commonly estimated?
- How is LCA fit adjudicated?
- What are considerations for identifiability / estimability?

3 Motivating Example: Frailty of Older Adults
"the sixth age shifts into the lean and slipper'd pantaloon, with spectacles on nose and pouch on side, his youthful hose well sav'd, a world too wide, for his shrunk shank" -- Shakespeare, As You Like It

4 The Frailty Construct Fried et al., J Gerontol 2001; Bandeen-Roche et al., J Gerontol, 2006

5 Frailty as a latent variable
Underlying: status or degree of syndrome
Surrogates: Fried et al. (2001) criteria
- weight loss above threshold
- low energy expenditure
- low walking speed
- weakness beyond threshold
- exhaustion

6 Part I: Model

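7 Latent class model
[Path diagram: latent variable $\eta$ (Frailty) in the structural part; indicators $Y_1, \ldots, Y_m$ with errors $\varepsilon_1, \ldots, \varepsilon_m$ in the measurement part]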

8 Well-used latent variable models
By latent variable scale and observed variable scale:
- Continuous latent, continuous observed: Factor analysis, LISREL
- Continuous latent, discrete observed: Discrete FA, IRT (item response)
- Discrete latent, continuous observed: Latent profile, Growth mixture
- Discrete latent, discrete observed: Latent class analysis, regression
General software: Mplus, Latent Gold, WinBUGS (Bayesian), NLMIXED (SAS)

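9 Analysis of underlying subpopulations: Latent class analysis
[Diagram: the POPULATION is split by latent class $U_i$ into $J$ subpopulations with prevalences $P_1, \ldots, P_J$; within each class, the items $Y_1, \ldots, Y_M$ have class-specific conditional probabilities $\pi_{11}, \ldots, \pi_{1M}$ through $\pi_{J1}, \ldots, \pi_{JM}$]
Lazarsfeld & Henry, Latent Structure Analysis, 1968; Goodman, Biometrika, 1974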

10 Latent Variables: What?
Integrands in a hierarchical model
- Observed variables ($i = 1, \ldots, n$): $Y_i$ = $M$-variate; $x_i$ = $P$-variate
- Focus: response ($Y$) distribution $G_{Y|x}(y \mid x)$; $x$-dependence
- Model: $Y_i$ generated from latent (underlying) $U_i$: $F_{Y|U,x}(y \mid U = u, x; \pi)$ (Measurement)
- Focus on distribution, regression re $U_i$: $F_{U|x}(u \mid x; \beta)$ (Structural)
- Overall, hierarchical model:
$$F_{Y|x}(y \mid x) = \int F_{Y|U,x}(y \mid U = u, x)\, dF_{U|x}(u \mid x)$$

11 Latent Variable Models: Latent Class Regression (LCR) Model
Model:
$$f_{Y|x}(y \mid x) = \sum_{j=1}^{J} P_j \prod_{m=1}^{M} \pi_{mj}^{y_m} (1 - \pi_{mj})^{1 - y_m}$$
Structural model: $[U_i \mid x_i] = \Pr\{U_i = j\} = \Pr\{\eta_i = j\} = P_j$, $j = 1, \ldots, J$
Measurement model: $[Y_i \mid U_i]$ = conditional probabilities; $\pi$ is $M \times J$, with
$$\pi_{mj} = \Pr\{Y_{im} = 1 \mid U_i = j\} = \Pr\{Y_{im} = 1 \mid \eta_i = j\}$$
Compare to general form:
$$F_{Y|x}(y \mid x) = \int F_{Y|U,x}(y \mid U = u, x)\, dF_{U|x}(u \mid x)$$
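As a minimal sketch of the mixture form on the slide above, the marginal probability of a response pattern can be evaluated directly. The function name `pattern_prob`, the class-first indexing `pi[j][m]`, and all parameter values are assumptions made for illustration; they are not estimates from the lecture.

```python
from itertools import product

def pattern_prob(y, P, pi):
    """Marginal probability of binary response pattern y under a J-class model.

    P[j]     : prevalence of class j (the P_j sum to 1)
    pi[j][m] : Pr(Y_m = 1 | class j); the code indexes class first,
               whereas the slides write pi_mj (item first)
    """
    total = 0.0
    for P_j, pi_j in zip(P, pi):
        within = 1.0
        for y_m, p in zip(y, pi_j):
            # Conditional independence: item terms multiply within a class.
            within *= p if y_m == 1 else 1.0 - p
        total += P_j * within
    return total

# Hypothetical 2-class, 3-item parameters (illustrative only).
P = [0.7, 0.3]
pi = [[0.1, 0.2, 0.1],
      [0.6, 0.7, 0.8]]

# Probabilities of all 2^M response patterns.
probs = {y: pattern_prob(y, P, pi) for y in product([0, 1], repeat=3)}
```

Because the model is a proper probability distribution over the $2^M$ patterns, the values in `probs` sum to 1, which is a quick sanity check on any implementation.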

12 Latent Variable Models: Latent Class Regression (LCR) Model
Model:
$$f_{Y|x}(y \mid x) = \sum_{j=1}^{J} P_j \prod_{m=1}^{M} \pi_{mj}^{y_m} (1 - \pi_{mj})^{1 - y_m}$$
Measurement assumptions on $[Y_i \mid U_i]$:
- Conditional independence: $\{Y_{i1}, \ldots, Y_{iM}\}$ mutually independent conditional on $U_i$
- Reporting heterogeneity unrelated to measured, unmeasured characteristics

13 Latent Variable Models: Latent Class Regression (LCR) Model
Model:
$$f_{Y|x}(y \mid x) = \sum_{j=1}^{J} P_j \prod_{m=1}^{M} \pi_{mj}^{y_m} (1 - \pi_{mj})^{1 - y_m}$$
Measurement assumptions on $[Y_i \mid C_i]$:
- Conditional independence: $\{Y_{i1}, \ldots, Y_{iM}\}$ mutually independent conditional on $C_i$
- Reporting heterogeneity unrelated to measured, unmeasured characteristics

14 Analysis of underlying subpopulations
Method: Latent class analysis
- Seeks homogeneous subpopulations
- Features that characterize latent groups
  - Prevalence in overall population
  - Proportion reporting each symptom
  - Number of them = least to achieve homogeneity / conditional independence

15 Latent class analysis: Prediction
Of interest: $\Pr(C = j \mid Y = y)$ = posterior probability of class membership
Once the model is fit, a straightforward calculation:
$$\Pr(C = j \mid Y = y) = \frac{\Pr(C = j, Y = y)}{\Pr(Y = y)} = \frac{P_j \prod_{m=1}^{M} \pi_{mj}^{y_m} (1 - \pi_{mj})^{1 - y_m}}{\sum_{k=1}^{J} P_k \prod_{m=1}^{M} \pi_{mk}^{y_m} (1 - \pi_{mk})^{1 - y_m}}$$
$= \theta_{ij}$ when evaluated at $y_i$
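The posterior calculation above is Bayes' rule applied to the fitted model, and it can be sketched in a few lines. The function name `posterior` and the parameter values below are hypothetical, chosen only to illustrate the arithmetic.

```python
def posterior(y, P, pi):
    """Posterior class-membership probabilities Pr(C = j | Y = y).

    P[j] is the class prevalence; pi[j][m] = Pr(Y_m = 1 | class j).
    """
    joint = []
    for P_j, pi_j in zip(P, pi):
        lik = P_j
        for y_m, p in zip(y, pi_j):
            lik *= p if y_m == 1 else 1.0 - p
        joint.append(lik)            # Pr(C = j, Y = y)
    total = sum(joint)               # Pr(Y = y)
    return [v / total for v in joint]

# Hypothetical 2-class, 3-item parameters (illustrative only).
P = [0.7, 0.3]
pi = [[0.1, 0.2, 0.1],
      [0.6, 0.7, 0.8]]

# Someone who meets all three criteria is assigned almost entirely
# to the second (high-probability) class under these parameters.
theta = posterior((1, 1, 1), P, pi)
```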

16 Part II: Fitting

17 Estimation: Broad Strokes
- Maximum likelihood
  - EM algorithm
  - Simplex method (Dayton & Macready, 1988)
  - Possibly with weighting, robust variance correction
- ML software
  - Specialty: Mplus, Latent Gold
  - Stata: gllamm
  - SAS: macro
  - R: poLCA
- Bayesian: WinBUGS

18 Estimation: Methods other than EM algorithm
Bayesian MCMC methods (e.g., per WinBUGS)
- A challenge: label-switching
- Reversible-jump methods
- Advantages: feasibility, philosophy
- Disadvantages
  - Prior choice (high-dimensional; avoiding illogic)
  - Burn-in, duration
  - May obscure identification problems

19 Estimation: Likelihood maximization via the E-M algorithm
A process of averaging over missing data; in this case, the missing data are class memberships.

20 Estimation: Likelihood maximization via the E-M algorithm
Rationale: LVs as missing data
Brief review
- Complete data: $W = \{Y, x, u\}$
- Complete-data log likelihood, taken as a function of $\phi$:
$$\ell_w(\phi \mid w) = \log F_{Y,U|x}(y, u \mid x, \phi)$$
- Iterate between
  - $(k+1)$ E-step: evaluate $Q(\phi \mid \phi^{(k)}) = E_{u|y,x}\left[\ell_w(\phi \mid W) \mid y, x; \phi^{(k)}\right]$
  - $(k+1)$ M-step: maximize $Q(\phi \mid \phi^{(k)})$ wrt $\phi$
- Convergence to a local likelihood maximum under regularity
Dempster, Laird, and Rubin, JRSSB, 1977

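21 Estimation: EM example, Latent Class Model
Maximize, with the posterior probabilities $\theta_{ij}$ held at their E-step values and $\psi$ a Lagrange multiplier for the constraint on the $P_j$:
$$L = \sum_{i=1}^{n} \sum_{j=1}^{J} \theta_{ij} \log\left\{ P_j \prod_{m=1}^{M} \pi_{mj}^{y_{im}} (1 - \pi_{mj})^{1 - y_{im}} \right\} + \psi \left( \sum_{j=1}^{J} P_j - 1 \right)$$
$$\frac{\partial L}{\partial \pi_{mj}}: \quad \sum_{i=1}^{n} \theta_{ij} \frac{y_{im} - \pi_{mj}}{\pi_{mj}(1 - \pi_{mj})} = 0 \;\Rightarrow\; \pi_{mj} = \frac{\sum_{i=1}^{n} \theta_{ij}\, y_{im}}{\sum_{h=1}^{n} \theta_{hj}}$$
$$\frac{\partial L}{\partial P_j}: \quad \sum_{i=1}^{n} \theta_{ij} - P_j\, n = 0 \;\Rightarrow\; P_j = \frac{1}{n} \sum_{i=1}^{n} \theta_{ij}$$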

22 EM Algorithm: Latent class model
A process of averaging over missing data; in this case, the missing data are class memberships.
1. Choose a starting set of posterior probabilities
2. Use them to estimate P and π (M-step)
3. Calculate the log likelihood
4. Use the estimates of P and π to calculate posterior probabilities (E-step)
5. Repeat steps 2-4 until the log likelihood stops changing

23 Global and Local Maxima Multiple starting values very important!
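The five EM steps and the advice about multiple starting values can be sketched together as follows. This is a minimal illustration in plain Python, not the implementation used by any of the packages named earlier; the function names `em_lca` and `best_of_starts`, the tuning constants, and the simulated data are all assumptions.

```python
import math
import random

def em_lca(data, J, seed, n_iter=200, tol=1e-8):
    """EM for a J-class model with binary items.

    data is a list of 0/1 tuples. Returns (P, pi, loglik), where
    pi[j][m] = Pr(Y_m = 1 | class j) (class index first, unlike pi_mj).
    """
    rng = random.Random(seed)
    n, M = len(data), len(data[0])
    # Step 1: random starting posterior probabilities theta[i][j].
    theta = []
    for _ in range(n):
        w = [rng.random() + 0.1 for _ in range(J)]
        theta.append([v / sum(w) for v in w])
    prev_ll = -math.inf
    for _ in range(n_iter):
        # Step 2 (M-step): P and pi are posterior-weighted proportions.
        col = [sum(t[j] for t in theta) for j in range(J)]
        P = [c / n for c in col]
        pi = [[sum(t[j] * y[m] for t, y in zip(theta, data)) / col[j]
               for m in range(M)] for j in range(J)]
        # Steps 3-4: the log likelihood and the E-step reuse the same joint terms.
        ll = 0.0
        for i, y in enumerate(data):
            joint = []
            for j in range(J):
                lik = P[j]
                for m in range(M):
                    lik *= pi[j][m] if y[m] == 1 else 1.0 - pi[j][m]
                joint.append(lik)
            tot = sum(joint)
            ll += math.log(tot)
            theta[i] = [v / tot for v in joint]
        # Step 5: stop once the log likelihood stops changing.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return P, pi, ll

def best_of_starts(data, J, n_starts=10):
    """Run EM from several random starts; keep the highest log likelihood."""
    fits = [em_lca(data, J, seed=s) for s in range(n_starts)]
    return max(fits, key=lambda fit: fit[2])

# Simulated data from hypothetical parameters (not the frailty estimates).
rng = random.Random(1)
true_P = [0.7, 0.3]
true_pi = [[0.1, 0.1, 0.2, 0.1], [0.7, 0.8, 0.6, 0.7]]
data = []
for _ in range(200):
    j = 0 if rng.random() < true_P[0] else 1
    data.append(tuple(int(rng.random() < p) for p in true_pi[j]))

P_hat, pi_hat, ll_hat = best_of_starts(data, J=2, n_starts=5)
```

Each start can converge to a different local maximum, so comparing the final log likelihoods across starts is a simple way to see whether the best solution is reached repeatedly. Class labels are arbitrary, so the estimated classes may come out in a different order than the generating ones.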

24 Example: Frailty
Women's Health & Aging Studies
- Longitudinal cohort studies to investigate
  - Causes / course of physical and cognitive disability
  - Physiological determinants of frailty
- Up to 7 rounds spanning 15 years
- Companion studies in community, Baltimore, MD
  - moderately disabled women 65+ years: n=1002
  - mildly disabled women years: n=436
- This project: n=786 age years at baseline
- Probability-weighted analyses
Guralnik et al., NIA, 1995; Fried et al., J Gerontol, 2001

25 Example: Latent Frailty Classes, Women's Health and Aging Study
[Table: conditional probabilities (π) of meeting each criterion. Rows: Weight Loss, Weakness, Slowness, Low Physical Activity, Exhaustion, Class Prevalence (P) (%). Columns: 2-class model (Cl. 1 Non-frail, Cl. 2 Frail) and 3-class model (Cl. 1 Robust, Cl. 2 Intermediate, Cl. 3 Frail)]
Bandeen-Roche et al., J Gerontol, 2006

26 Example: Latent Frailty Classes, Women's Health and Aging Study
[Table: conditional probabilities (π) of meeting each criterion. Rows: Weight Loss, Weakness, Slowness, Low Physical Activity, Exhaustion, Class Prevalence (P) (%). Columns: 2-class model (Cl. 1 Non-frail, Cl. 2 Frail) and 3-class model (Cl. 1 Robust, Cl. 2 Intermediate, Cl. 3 Frail)]
Callout: We estimate that 26% in the frail subpopulation exhibit weight loss.
Bandeen-Roche et al., J Gerontol, 2006

27 Part III: Evaluating Fit

28 Choosing the Number of Classes
- A priori theory
- Chi-square goodness of fit
- Entropy
- Information statistics: AIC, BIC, others
- Lo-Mendell-Rubin (LMR): not recommended (designed for normal Y)
- Bootstrapped likelihood ratio test

29 Entropy
Measures classification error: 0 = terrible, 1 = perfect
$$E = 1 - \frac{\sum_{i=1}^{N} \sum_{j=1}^{J} -\Pr(C_i = j \mid Y_i) \log \Pr(C_i = j \mid Y_i)}{N \log(J)}$$
Dias & Vermunt (2006)
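A small sketch of the relative entropy statistic, computed from a matrix of posterior class probabilities. The function name `relative_entropy` and the example matrices are hypothetical; the convention that $0 \cdot \log 0 = 0$ is an assumption stated in the code, and the formula requires at least two classes.

```python
import math

def relative_entropy(theta):
    """Relative entropy E for posterior probabilities theta[i][j].

    theta[i][j] = Pr(C_i = j | Y_i); each row sums to 1 and J >= 2.
    """
    n, J = len(theta), len(theta[0])
    raw = 0.0
    for row in theta:
        for p in row:
            if p > 0.0:                 # convention: 0 * log(0) = 0
                raw -= p * math.log(p)
    return 1.0 - raw / (n * math.log(J))

# Hypothetical posterior matrices (illustrative only).
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]    # everyone classified with certainty
vague = [[0.5, 0.5], [0.5, 0.5]]                # no classification information
mixed = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]

e_crisp = relative_entropy(crisp)
e_vague = relative_entropy(vague)
e_mixed = relative_entropy(mixed)
```

The two extreme matrices reproduce the endpoints on the slide: certain classification gives 1 and uniform posteriors give 0, with intermediate cases falling in between.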

30 Information Statistics
s = # of parameters; N = sample size; smaller values are better
- AIC: $-2LL + 2s$
- BIC: $-2LL + s \cdot \log(N)$
BIC is typically recommended
- Theory: consistent for selection in model family
- Nylund et al., Struct Eq Modeling, 2007
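The two information statistics are simple to compute once each candidate model's maximized log likelihood is known. In the sketch below the helper names and, in particular, the log likelihood values are made up for illustration; only the formulas and the parameter count $(J - 1) + J \cdot M$ for binary items come from the slides.

```python
import math

def n_params(J, M):
    """Free parameters in an unconstrained J-class model with M binary items."""
    return (J - 1) + J * M

def aic(loglik, s):
    return -2.0 * loglik + 2.0 * s

def bic(loglik, s, n):
    return -2.0 * loglik + s * math.log(n)

# Hypothetical maximized log likelihoods by number of classes (not real results).
logliks = {1: -2100.0, 2: -1950.0, 3: -1945.0}
n, M = 786, 5

table = {J: (aic(ll, n_params(J, M)), bic(ll, n_params(J, M), n))
         for J, ll in logliks.items()}
best_aic = min(table, key=lambda J: table[J][0])   # smaller is better
best_bic = min(table, key=lambda J: table[J][1])
```

Because the BIC penalty grows with $\log(N)$, it penalizes the extra six parameters of each added class more heavily than AIC does whenever $N$ exceeds about 7.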

31 Likelihood Ratio Tests
LCA models with different numbers of classes are NOT nested appropriately for a direct LRT.
Rather: LRT to compare a given model to the saturated model
- LCA df (binary case): $(J - 1) + J \cdot M$
  - $J - 1$: $P$ parameters (sum to 1)
  - $J \cdot M$: $\pi$ parameters ($M$ items $\times$ $J$ classes)
- Saturated df: $2^M - 1$
- Goodness of fit df: $2^M - J(M + 1)$

32 Bootstrapped Likelihood Ratio Test
In the absence of knowledge about the theoretical distribution of the difference in $-2LL$, one can construct an empirical distribution from the data.
Per Nylund (2006) simulation studies, it performs best.
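One common way to build that empirical distribution is a parametric bootstrap: simulate data sets from the fitted J-class model, refit the J-class and (J+1)-class models to each, and compare the observed difference in $-2LL$ with the simulated differences. The sketch below assumes that scheme; the function names, the compact single-start EM fit, the small number of bootstrap draws, and the simulated data are all illustrative assumptions, and a real analysis would use multiple starts per fit and many more draws.

```python
import math
import random

def fit_lca(data, J, seed=0, n_iter=100):
    """Compact single-start EM fit; returns (P, pi, loglik)."""
    rng = random.Random(seed)
    n, M = len(data), len(data[0])
    theta = []
    for _ in range(n):
        w = [rng.random() + 0.1 for _ in range(J)]
        theta.append([v / sum(w) for v in w])
    for _ in range(n_iter):
        col = [sum(t[j] for t in theta) for j in range(J)]
        P = [c / n for c in col]
        pi = [[sum(t[j] * y[m] for t, y in zip(theta, data)) / col[j]
               for m in range(M)] for j in range(J)]
        ll = 0.0
        for i, y in enumerate(data):
            joint = [P[j] * math.prod(pi[j][m] if y[m] else 1.0 - pi[j][m]
                                      for m in range(M)) for j in range(J)]
            tot = sum(joint)
            ll += math.log(tot)
            theta[i] = [v / tot for v in joint]
    return P, pi, ll

def simulate(P, pi, n, rng):
    """Draw n response patterns from a latent class model."""
    out = []
    for _ in range(n):
        j = rng.choices(range(len(P)), weights=P)[0]
        out.append(tuple(int(rng.random() < p) for p in pi[j]))
    return out

def bootstrap_lrt(data, J, n_boot=10, seed=0):
    """Parametric bootstrap p-value for J classes versus J + 1 classes."""
    rng = random.Random(seed)
    P0, pi0, ll0 = fit_lca(data, J)
    _, _, ll1 = fit_lca(data, J + 1)
    observed = 2.0 * (ll1 - ll0)
    exceed = 0
    for _ in range(n_boot):
        boot = simulate(P0, pi0, len(data), rng)   # generated under the J-class model
        _, _, b0 = fit_lca(boot, J)
        _, _, b1 = fit_lca(boot, J + 1)
        if 2.0 * (b1 - b0) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)

# Simulated 2-class data from hypothetical parameters; test 1 versus 2 classes.
gen = random.Random(2)
sample = simulate([0.7, 0.3],
                  [[0.1, 0.1, 0.2, 0.1], [0.7, 0.8, 0.6, 0.7]], 120, gen)
lr_stat, p_value = bootstrap_lrt(sample, J=1)
```

A small p-value suggests the extra class improves fit by more than would be expected if the J-class model were true.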

33 Example: Frailty Construct Validation
Women's Health & Aging Studies
Internal convergent validity: criteria manifestation is syndromic
"a group of signs and symptoms that occur together and characterize a particular abnormality" - Merriam-Webster Medical Dictionary

34 Validation: Frailty as a syndrome
Method: Latent class analysis
If criteria characterize syndrome:
- At least two groups (otherwise, no co-occurrence)
- No subgrouping of symptoms (otherwise, more than one abnormality characterized)

35 Conditional Probabilities of Meeting Criteria in Latent Frailty Classes, WHAS
[Table: rows are the criteria Weight Loss, Weakness, Slowness, Low Physical Activity, Exhaustion, plus Class Prevalence (%). Columns: 2-class model (Cl. 1 Non-frail, Cl. 2 Frail) and 3-class model (Cl. 1 Robust, Cl. 2 Intermediate, Cl. 3 Frail)]
Bandeen-Roche et al., J Gerontol, 2006

36 Results: Frailty Syndrome Validation
Data: Women's Health and Aging Study
- Single-population model fit: inadequate
- Two-population model fit: good
  - Pearson $\chi^2$ p-value = .22; minimized AIC, BIC
- Frailty criteria prevalence stepwise across classes: no subclustering
- Syndromic manifestation well indicated

37 Example: Residual checking (frailty construct)

38 Part IV: Identifiability / Estimability

39 Identifiability
Rough idea for non-identifiability: more unknowns than there are (independent) equations to solve for them
Definition: Consider a family of distributions $\mathcal{F}_\Phi = \{F(y, \phi);\ \phi \in \Phi\}$. The parameter $\phi^* \in \Phi$ is (globally) identifiable iff there is no $\phi \in \Phi$, $\phi \neq \phi^*$, such that $F(y, \phi) = F(y, \phi^*)$ a.e.

40 Identifiability: Related concepts
Local identifiability
- Basic idea: $\phi$ identified within a neighborhood
- Definition: $\mathcal{F}$ is locally identifiable at $\phi = \phi_0$ if there exists a neighborhood $\tau$ about $\phi_0$ such that no $\phi \in \tau \cap \Phi$ with $\phi \neq \phi_0$ satisfies $F(y; \phi_0) = F(y; \phi)$
Estimability, empirical identifiability: the information matrix for $\phi$ given $y_1, \ldots, y_n$ is non-singular.

41 Identifiability: Latent class (binary Y)
Latent class analysis (measurement only)
- Parameter dimension: $2^M - 1$
- Unconstrained J-class model: $(J - 1) + J \cdot M$
- Need $2^M \geq J(M + 1)$ (necessary, not sufficient)
- Local identifiability: evaluate the Jacobian of the likelihood function (Goodman, 1974)
- Estimability: avoid fewer than 10 observations allocated per cell, i.e., $n > 10 \cdot 2^M$ (rule of thumb)

42 Identifiability / estimability: Latent class analysis, frailty example
- Need $2^M \geq J(M + 1)$ (necessary, not sufficient)
  - $M = 5$; $J = 3$: $32 \geq 3(5 + 1) = 18$: YES
  - By this criterion, could fit up to 5 classes ($5 \cdot 6 = 30 \leq 32$, but $6 \cdot 6 = 36 > 32$)
- Local identifiability: evaluate the Jacobian of the likelihood function (Goodman, 1974)
- Estimability: $n > 10 \cdot 2^M$; $n > 10 \cdot 2^5 = 320$: YES
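The counting argument for the frailty example can be checked with a few lines of arithmetic. The helper names below are hypothetical; the formulas are the necessary condition $2^M \geq J(M + 1)$ and the rule of thumb $n > 10 \cdot 2^M$ as stated on the slides, and the sample size is the $n = 786$ reported for this project. Passing these checks does not establish identifiability, since the condition is necessary but not sufficient.

```python
def gof_df(J, M):
    """Goodness-of-fit degrees of freedom for J classes and M binary items."""
    return 2 ** M - J * (M + 1)

def max_classes(M):
    """Largest J satisfying the necessary condition 2^M >= J(M + 1)."""
    return 2 ** M // (M + 1)

def min_sample_size(M, per_cell=10):
    """Rule-of-thumb threshold: n should exceed per_cell * 2^M."""
    return per_cell * 2 ** M

M, J, n = 5, 3, 786                           # frailty example from the slides
condition_ok = gof_df(J, M) >= 0              # 32 >= 18
largest_J = max_classes(M)                    # largest J allowed by the inequality
estimable = n > min_sample_size(M)            # 786 > 320
```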

43 Objectives
For you to leave here knowing:
- When is the latent class analysis (LCA) model useful?
- What is the LCA model and what are its underlying assumptions?
- How are LCA parameters interpreted?
- How are LCA parameters commonly estimated?
- How is LCA fit adjudicated?
- What are considerations for identifiability / estimability?


Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Generalized Models: Part 1

Generalized Models: Part 1 Generalized Models: Part 1 Topics: Introduction to generalized models Introduction to maximum likelihood estimation Models for binary outcomes Models for proportion outcomes Models for categorical outcomes

More information

Estimating Diagnostic Error without a Gold Standard: A Mixed Membership Approach

Estimating Diagnostic Error without a Gold Standard: A Mixed Membership Approach 7 Estimating Diagnostic Error without a Gold Standard: A Mixed Membership Approach Elena A. Erosheva Department of Statistics, University of Washington, Seattle, WA 98195-4320, USA Cyrille Joutard Institut

More information

Multi-level Models: Idea

Multi-level Models: Idea Review of 140.656 Review Introduction to multi-level models The two-stage normal-normal model Two-stage linear models with random effects Three-stage linear models Two-stage logistic regression with random

More information

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data Today s Class: Review of concepts in multivariate data Introduction to random intercepts Crossed random effects models

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM Pattern Recognition and Machine Learning Chapter 9: Mixture Models and EM Thomas Mensink Jakob Verbeek October 11, 27 Le Menu 9.1 K-means clustering Getting the idea with a simple example 9.2 Mixtures

More information

Global Model Fit Test for Nonlinear SEM

Global Model Fit Test for Nonlinear SEM Global Model Fit Test for Nonlinear SEM Rebecca Büchner, Andreas Klein, & Julien Irmer Goethe-University Frankfurt am Main Meeting of the SEM Working Group, 2018 Nonlinear SEM Measurement Models: Structural

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Variable selection for model-based clustering of categorical data

Variable selection for model-based clustering of categorical data Variable selection for model-based clustering of categorical data Brendan Murphy Wirtschaftsuniversität Wien Seminar, 2016 1 / 44 Alzheimer Dataset Data were collected on early onset Alzheimer patient

More information

Model-based cluster analysis: a Defence. Gilles Celeux Inria Futurs

Model-based cluster analysis: a Defence. Gilles Celeux Inria Futurs Model-based cluster analysis: a Defence Gilles Celeux Inria Futurs Model-based cluster analysis Model-based clustering (MBC) consists of assuming that the data come from a source with several subpopulations.

More information

Multi-group analyses for measurement invariance parameter estimates and model fit (ML)

Multi-group analyses for measurement invariance parameter estimates and model fit (ML) LBP-TBQ: Supplementary digital content 8 Multi-group analyses for measurement invariance parameter estimates and model fit (ML) Medication data Multi-group CFA analyses were performed with the 16-item

More information

Chapter 14 Combining Models

Chapter 14 Combining Models Chapter 14 Combining Models T-61.62 Special Course II: Pattern Recognition and Machine Learning Spring 27 Laboratory of Computer and Information Science TKK April 3th 27 Outline Independent Mixing Coefficients

More information

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models Time-Invariant Predictors in Longitudinal Models Topics: What happens to missing predictors Effects of time-invariant predictors Fixed vs. systematically varying vs. random effects Model building strategies

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

Time Invariant Predictors in Longitudinal Models

Time Invariant Predictors in Longitudinal Models Time Invariant Predictors in Longitudinal Models Longitudinal Data Analysis Workshop Section 9 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Review of CLDP 944: Multilevel Models for Longitudinal Data

Review of CLDP 944: Multilevel Models for Longitudinal Data Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance

More information

Optimization in latent class analysis

Optimization in latent class analysis Computational Optimization and Applications manuscript No. (will be inserted by the editor) Optimization in latent class analysis Martin Fuchs Arnold Neumaier Received: date / Accepted: date Abstract In

More information

INTRODUCTION TO STRUCTURAL EQUATION MODELS

INTRODUCTION TO STRUCTURAL EQUATION MODELS I. Description of the course. INTRODUCTION TO STRUCTURAL EQUATION MODELS A. Objectives and scope of the course. B. Logistics of enrollment, auditing, requirements, distribution of notes, access to programs.

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j Standard Errors & Confidence Intervals β β asy N(0, I( β) 1 ), where I( β) = [ 2 l(β, φ; y) ] β i β β= β j We can obtain asymptotic 100(1 α)% confidence intervals for β j using: β j ± Z 1 α/2 se( β j )

More information

Non-Parametric Bayes

Non-Parametric Bayes Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

Model Assumptions; Predicting Heterogeneity of Variance

Model Assumptions; Predicting Heterogeneity of Variance Model Assumptions; Predicting Heterogeneity of Variance Today s topics: Model assumptions Normality Constant variance Predicting heterogeneity of variance CLP 945: Lecture 6 1 Checking for Violations of

More information

Bayesian Mixture Modeling

Bayesian Mixture Modeling University of California, Merced July 21, 2014 Mplus Users Meeting, Utrecht Organization of the Talk Organization s modeling estimation framework Motivating examples duce the basic LCA model Illustrated

More information

Clustering on Unobserved Data using Mixture of Gaussians

Clustering on Unobserved Data using Mixture of Gaussians Clustering on Unobserved Data using Mixture of Gaussians Lu Ye Minas E. Spetsakis Technical Report CS-2003-08 Oct. 6 2003 Department of Computer Science 4700 eele Street orth York, Ontario M3J 1P3 Canada

More information

Basics of Modern Missing Data Analysis

Basics of Modern Missing Data Analysis Basics of Modern Missing Data Analysis Kyle M. Lang Center for Research Methods and Data Analysis University of Kansas March 8, 2013 Topics to be Covered An introduction to the missing data problem Missing

More information

Stochastic approximation EM algorithm in nonlinear mixed effects model for viral load decrease during anti-hiv treatment

Stochastic approximation EM algorithm in nonlinear mixed effects model for viral load decrease during anti-hiv treatment Stochastic approximation EM algorithm in nonlinear mixed effects model for viral load decrease during anti-hiv treatment Adeline Samson 1, Marc Lavielle and France Mentré 1 1 INSERM E0357, Department of

More information

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA Topics: Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA What are MI and DIF? Testing measurement invariance in CFA Testing differential item functioning in IRT/IFA

More information

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline: CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.

More information

A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles

A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles Jeremy Gaskins Department of Bioinformatics & Biostatistics University of Louisville Joint work with Claudio Fuentes

More information

Investigating Population Heterogeneity With Factor Mixture Models

Investigating Population Heterogeneity With Factor Mixture Models Psychological Methods 2005, Vol. 10, No. 1, 21 39 Copyright 2005 by the American Psychological Association 1082-989X/05/$12.00 DOI: 10.1037/1082-989X.10.1.21 Investigating Population Heterogeneity With

More information

Measurement Invariance Testing with Many Groups: A Comparison of Five Approaches (Online Supplements)

Measurement Invariance Testing with Many Groups: A Comparison of Five Approaches (Online Supplements) University of South Florida Scholar Commons Educational and Psychological Studies Faculty Publications Educational and Psychological Studies 2017 Measurement Invariance Testing with Many Groups: A Comparison

More information