Anders Skrondal. Norwegian Institute of Public Health London School of Hygiene and Tropical Medicine. Based on joint work with Sophia Rabe-Hesketh

Similar documents
Generalized Linear Latent and Mixed Models with Composite Links and Exploded

Estimating chopit models in gllamm Political efficacy example from King et al. (2002)

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Semiparametric Generalized Linear Models

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

Comparing IRT with Other Models

STA 216, GLM, Lecture 16. October 29, 2007

Generalized Linear Models for Non-Normal Data

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

An Overview of Item Response Theory. Michael C. Edwards, PhD

Modeling rater effects using a combination of Generalizability Theory and IRT

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models

Introduction To Confirmatory Factor Analysis and Item Response Theory

Comparison between conditional and marginal maximum likelihood for a class of item response models

Repeated ordinal measurements: a generalised estimating equation approach

Lesson 7: Item response theory models (part 2)

A Marginal Maximum Likelihood Procedure for an IRT Model with Single-Peaked Response Functions

Title. Description. Remarks and examples. stata.com. stata.com. methods and formulas for gsem Methods and formulas for gsem

ST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples

STAT 526 Spring Final Exam. Thursday May 5, 2011

Basic IRT Concepts, Models, and Assumptions

Generalized logit models for nominal multinomial responses. Local odds ratios

What is an Ordinal Latent Trait Model?

Item Response Theory and Computerized Adaptive Testing

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Nonparametric Rasch Modeling: Methods and Software

Whats beyond Concerto: An introduction to the R package catr. Session 4: Overview of polytomous IRT models

Application of Item Response Theory Models for Intensive Longitudinal Data

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation

Item Response Theory (IRT) Analysis of Item Sets

Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Factor Analysis and Latent Structure of Categorical Data

36-720: The Rasch Model

Continuous Time Survival in Latent Variable Models

Generalized Linear Models Introduction

Bayes methods for categorical data. April 25, 2017

Chapter 1. Modeling Basics

LOGISTIC REGRESSION Joseph M. Hilbe

Introduction to GSEM in Stata

Data-analysis and Retrieval Ordinal Classification

Summer School in Applied Psychometric Principles. Peterhouse College 13 th to 17 th September 2010

Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, )

The Multilevel Logit Model for Binary Dependent Variables Marco R. Steenbergen

Pairwise Parameter Estimation in Rasch Models

Standard Errors & Confidence Intervals. N(0, I( β) 1 ), I( β) = [ 2 l(β, φ; y) β i β β= β j

A multilevel multinomial logit model for the analysis of graduates skills

Joint Modeling of Longitudinal Item Response Data and Survival

Single-level Models for Binary Responses

PIRLS 2016 Achievement Scaling Methodology 1

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications

Introduction to Generalized Linear Models

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

,..., θ(2),..., θ(n)

Generalized linear models

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

A general mixed model approach for spatio-temporal regression data

Latent Variable Modelling: A Survey*

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

The consequences of misspecifying the random effects distribution when fitting generalized linear mixed models

A Note on Item Restscore Association in Rasch Models

A Nonlinear Mixed Model Framework for Item Response Theory

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Generalized Linear Models (GLZ)

Generalized Linear Models 1

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

8 Nominal and Ordinal Logistic Regression

Generalized Models: Part 1

Statistical and psychometric methods for measurement: Scale development and validation

Models for Longitudinal Analysis of Binary Response Data for Identifying the Effects of Different Treatments on Insomnia

A general class of latent variable models for ordinal manifest variables with covariate effects on the manifest and latent variables

A Multilevel Analysis of Graduates Job Satisfaction

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

Linear model A linear model assumes Y X N(µ(X),σ 2 I), And IE(Y X) = µ(x) = X β, 2/52

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

GENERALIZED LATENT TRAIT MODELS. 1. Introduction

Generalized Linear Modeling - Logistic Regression

2 Bayesian Hierarchical Response Modeling

Outline of GLMs. Definitions

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

Analysing geoadditive regression data: a mixed model approach

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Links Between Binary and Multi-Category Logit Item Response Models and Quasi-Symmetric Loglinear Models

Bayesian Multivariate Logistic Regression

Introduction to Generalized Models

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21

Lecture 1. Introduction Statistics Statistical Methods II. Presented January 8, 2018

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

An Equivalency Test for Model Fit. Craig S. Wells. University of Massachusetts Amherst. James. A. Wollack. Ronald C. Serlin

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Propensity Score Weighting with Multilevel Data

Rater agreement - ordinal ratings. Karl Bang Christensen Dept. of Biostatistics, Univ. of Copenhagen NORDSTAT,

COMPOSITIONAL IDEAS IN THE BAYESIAN ANALYSIS OF CATEGORICAL DATA WITH APPLICATION TO DOSE FINDING CLINICAL TRIALS

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

Lecture 16: Mixtures of Generalized Linear Models

Interpreting and using heterogeneous choice & generalized ordered logit models

Transcription:

Constructing Latent Variable Models using Composite Links Anders Skrondal Norwegian Institute of Public Health London School of Hygiene and Tropical Medicine Based on joint work with Sophia Rabe-Hesketh University of California, Berkeley Institute of Education, University of London University of Twente, October 15, 2009 p. 1

Composite links p. 2 Outline Traditional and generalized latent variable models Composite links Some standard extensions of archetypical IRT models specified using composite links Three-parameter model for guessing Graded response model for ordinal items Some non-standard IRT models specified using composite links Unfolding models for attitude and preference items Randomized response models Conclusion

Composite links p. 3 Traditional latent variable models LATENT VARIABLE(S) OBSERVED VARIABLE(S) Continuous Categorical Common factor model Continuous Structural equation model Latent profile model Linear mixed model Covariate measurement error model Categorical Latent trait model Latent class model [S & RH (2007)]

Composite links p. 4 Generalized latent variable models Response model: Generalized linear model conditional on latent variables Linear predictor: latent variables as factors or random coefficients varying at different hierarchical levels Links and distributions Structural model: Regressions of latent variables on observed variables Regressions of latent variables on other latent variables (at same and/or higher hierarchical levels) Distribution of latent variables (disturbances) Continuous Parametric Nonparametric Discrete Continuous and discrete [S & RH (2007)]

Composite links p. 5 Why generalized latent variable models Unifying model framework: Conceptually appealing Encourages specification of models tailormade to research problems by making it easy to combine features of different model types Facilitates unified approach to estimation implemented in single software program

Composite links p. 6 Generalized linear model 1. Linear predictor ν i = x iβ for unit i 2. Conditional expectation of response y i, µ i E(y i x i ), linked to linear predictor ν i through link function g( ), or equivalently, g(µ i ) = ν i, µ i = g 1 (ν i ), where g 1 ( ) is inverse link function 3. Conditional distribution of response, given linear predictor, from exponential family { } yi θ i b(θ i ) f(y i ν i ) = exp + c(y i, φ i ) φ i θ i is canonical or natural parameter (a function of µ i ) φ i is scale or dispersion parameter b( ) and c( ) are functions depending on distribution

Composite links p. 7 Common link functions Continuous responses: Identity link Link and Inverse link: µ i = ν i Dichotomous responses: Logit link Link: log[µ i /(1 µ i )] = ν i, Inverse link: µ i = exp(ν i) 1+exp(ν i ) Dichotomous responses: Probit link Link: Φ 1 (µ i ) = ν i, Inverse link: µ i = Φ(ν i ) Counts: Log link Link: log(µ i ) = ν i, Inverse link: µ i = exp(ν i )

Composite links p. 8 Composite links [Thompson & Baker, JRSSC, 1981] µ i linear combination of several inverse link functions Simple composite links: Expectation µ i weighted sum of inverse links with known weights w qi µ i = q w qi g 1 q (ν qi ) ν qi is qth linear predictor for unit i gq 1 ( ) an inverse link function Bilinear composite links: Known constants w qi replaced by linear combinations of unknown parameters and constants α w qi µ i = q α w qi g 1 q (ν qi ) Further extensions discussed in [T & B (1981)]

Composite links p. 9 Archetypical item response theory (IRT) model One parameter logistic (1-PL) model or Rasch model for ability testing of persons j using items i µ ij = Pr(y ij = 1 η j ) = exp(ν ij) 1 + exp(ν ij ), ν ij = β i + η j Pr(yij =1 ηj) 0.0 0.2 0.4 0.6 0.8 1.0 β i =1.5 β i =0 β i = 2-4 -2 0 2 4 β i is difficulty of item i Latent variable η j η j is the ability of person j (latent variable)

Composite links p. 10 Some standard extensions of archetypical IRT model using composite links Some limitations of Rasch model: Probability of positive response goes to 0 if ability goes to Not well suited for multiple choice items where examinees can guess right answer = I. IRT models with guessing parameters Only for dichotomous responses Unsuitable for items graded in ordered categories = II. IRT models for ordinal items

Composite links p. 11 I. IRT with guessing (3-PL model) If right answer to item can be guessed as with multiple choice questions, use three-parameter logistic (3-PL) model: exp(ν ij ) Pr(y ij =1 η j ) = c i + (1 c i ) 1+exp(ν ij ), where c i are guessing parameters (cf natural responsiveness or nonzero background in quantal response bioassay) Fixing w 1ij = c i and w 2ij = 1 c i, get model with composite link: Pr(y ij =1 η j ) = c i {}}{ w 1ij g1 1 }{{ (1) } identity 1 c i {}}{ + w 2ij g 1 2 (ν ij) }{{} logit

Composite links p. 12 IRT with guessing (cont d) Estimate guessing parameters in 3-PL using bilinear composite link with α = (1, c 1,...,c I ) w 1ij = (0,d i) and w 2ij = (1, d i), where d i is a vector of dimension I (for I items) with a one in position i and zeros elsewhere

Composite links p. 13 II. Samejima s graded response model for ordinal items Cumulative models for ordinal items y ij =s with linear predictor ν ij = β i + λ i η j, with thresholds κ si Satisfaction from five areas of life (i = 1,...,5), rated from 1= a very great deal to 7= none [S & R-H, 2004: Ch. 10] Conditional response probabilities for [Friend] item: Pr(yij =s ηj) 0.0 0.2 0.4 0.6 0.8 1.0 s= 1 s=2 s=3 s=4 s=5 s=6 s=7-3 -2-1 0 1 2 3 Life satisfaction η j

Composite links p. 13 II. Samejima s graded response model for ordinal items Cumulative models for ordinal items y ij =s with linear predictor ν ij = β i + λ i η j, with thresholds κ si Satisfaction from five areas of life (i = 1,...,5), rated from 1= a very great deal to 7= none [S & R-H, 2004: Ch. 10] Conditional response expectation for [Friend] item (monotonic): E(yij ηj) 1 2 3 4 5 6 7-3 -2-1 0 1 2 3 Life satisfaction η j

Composite links p. 14 Cumulative response models: Composite link S ordered response categories s = 1,...,S Pr(y ij >s η j ) = g 1 (ν ij κ s ), s = 0,...,S κ s are threshold parameters, κ 0 =, κ S = Inverse link function g 1 ( ) is cumulative density function such as standard normal, logistic or extreme value Composite link for difference model : 1 {}}{ Pr(y ij =s η j ) = w 1ij g 1 (ν ij κ s 1 ) + = g 1 (ν ij κ s 1 ) g 1 (ν ij κ s ) 1 {}}{ w 2ij g 1 (ν ij κ s ) Advantage that left and right-censoring, or even interval censoring of ordinal response easily accommodated Particularly useful for discrete time survival data Also handles accidental collapsing of categories

Composite links p. 15 Some non-standard IRT models specified using composite links Expected value function a monotonic function of latent variable in most IRT models May be unrealistic for attitude or preference items = III. Unfolding or ideal point models Responses to sensitive items may not be truthful = IV. Randomized response models

Composite links p. 16 III. Unfolding or ideal point models Preference items Stages of development Agreement with attitude statements e.g. Capital punishment seems wrong but is sometimes necessary, expected response given sentiment favoring capital punishment η j should be unimodal: agree strongly: 5 agree: 4 ambivalent: 3 disagree: 2 disagree strongly: 1 ideal point Sentiment favoring capital punishment

Composite links p. 17 Unfolding or ideal point models (cont d) Agreement with attitude statement, rated as objective response y ij = s, s = 1,...,5, S = 5 Respondent may disagree from above or below, with corresponding unobserved subjective responses z ij From below: latent variable is below position of item, z ij =s From above: latent var. is above position of item, z ij =2S+1 s strongly disagree disagree neither agree nor disagree agree strongly agree Objective Subjective 10 9 8 7 6 1 2 3 4 5 from above from below

Composite links p. 18 Unfolding or ideal point models: Composite link Since z ij not observed, probability of observed rating y ij equals sum of probabilities of the two possible subjective ratings Pr(y ij =s η j ) = Pr(z ij =s η j ) + Pr(z ij =2S+1 s η j ) Cumulative model for subjective ratings g 1 (ν ij κ s 1 ) g 1 (ν ij κ s ) + g 1 (ν ij κ 2S s ) g 1 (ν ij κ 2S+1 s ), }{{}}{{} Pr(z ij =s η j ) Pr(z ij =2S+1 s η j ) where ν ij =β i +λ i η j Composite link with 4 components

Composite links p. 19 IV. Randomized response models Want to estimate probability π of positive response to sensitive question (such as illegal drug use) Warner (1965) design: Question selected at random from two possibilities 1. Positively worded ( Have you ever used illegal drugs ) with known probability p 2. Negatively worded ( Have you never used illegal drugs ) with known probability 1 p Actual selection unknown to interviewer to secure privacy Probability of positive response to item whether positively or negatively worded π = Pr(y=1)+p 1 2p 1 Pr(y = 1) = pπ + (1 p)(1 π)

Composite links p. 20 Randomized response models (cont d) Combine Warner model with model for π i with linear predictor ν i and link function g( ), for instance logit link, Pr(y i = 1 ν i ) = p exp(ν i) 1 + exp(ν i ) Specify using composite link + (1 p) ( 1 exp(ν ) i) 1 + exp(ν i ) Pr(y i = 1 ν i ) = w 1i g 1 1 (ν 1i) + w 2i g 1 2 (ν 2i) g 1 ( ) and g 2 ( ) logit links w 1i = p and w 2i = 1 p ν 1i = ν i = ν 2i Straightforward to combine with IRT models using linear predictors ν ij = β i + λ i η j

Composite links p. 21 Some other uses of composite links in IRT Missing categorical data Validation data Phase II designs Fused cells Latent class models Standard single level models Multilevel models Models with combinations of discrete and continuous latent variables Log-normal latent variables [RH & S (2007)]

Composite links p. 22 Conclusion Using generalized latent variable framework with composite links produces a wide range of novel IRT models, for instance including: Several latent variables Discrete latent variables and NPML Latent variable(s) regressed on same or higher level latent variable(s) Latent variable(s) regressed on covariates Easy to implement in software for latent variable modelling Already implemented in gllamm Many extensions also for latent variable models used in other areas of statistics

Composite links p. 23 References Rabe-Hesketh & Skrondal (2007). Multilevel and latent variable modelling with composite links and exploded likelihoods. Psychometrika, 72, 123-140. Skrondal & Rabe-Hesketh (2007). Latent variable modelling: A survey. Scandinavian Journal of Statistics, 34, 712-745. Skrondal & Rabe-Hesketh (2004). Generalized latent variable modeling: Multilevel, longitudinal and structural equation models. Chapman & Hall/ CRC Rabe-Hesketh, Skrondal & Pickles (2004). Generalized multilevel structural equation modelling. Psychometrika, 69, 183-206. Skrondal & Rabe-Hesketh (2003). Mulitlevel logistic regression for polytomous data and rankings. Psychometrika, 68, 267-287. Thompson & Baker (1981). Composite link functions in generalized linear models. Journal of the Royal Statistical Society (Series C), 30, 125-131.

Composite links p. 24 The IRT epidemic Is transportation of IRT to medicine necessarily harmless? Example: MRC clinical trial of colorectal cancer Some example quality of life (QoL) items: Lack of appetite Decreased sexual interest Dry mouth