STA 216, GLM, Lecture 16. October 29, 2007


Outline: Efficient Posterior Computation in Factor Models; Underlying Normal Models; Generalized Latent Trait Models (Formulation; Genetic Epidemiology Illustration); Structural Equation Models

How can we do efficient computation? Efficient posterior computation in factor analysis models is very challenging. The typical Gibbs sampler can be subject to extreme slow mixing. Centering does not provide a complete solution: one can only center one measurement for each latent variable, and it eliminates conjugacy unless the prior is non-exchangeable. What to do?

Parameter Expansion (PX). Originally proposed as a method for speeding up convergence of the EM algorithm. Redundant parameters are carefully introduced to allow faster convergence of EM and better-mixing Gibbs samplers. The idea is to introduce the redundant parameters in such a way as to avoid changing the target distribution of the MCMC algorithm. Hence the posterior is not changed, but autocorrelation is reduced.
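The invariance that PX relies on can be seen in a small numerical sketch (the one-factor model and the working parameter alpha below are illustrative, not the lecture's exact construction): rescaling the loadings and the factors in opposite directions leaves the fitted means, and hence the likelihood, unchanged, which is what lets the expanded sampler move without altering the target.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-factor model: y_i = mu + lam * eta_i + eps_i.
p, n = 4, 10
lam = rng.normal(size=p)   # factor loadings
eta = rng.normal(size=n)   # latent factors

# Redundant working parameter alpha: rescale loadings and factors
# in opposite directions. The product lam * eta -- and hence the
# likelihood -- is unchanged, but the Markov chain can traverse
# (lam, eta) space more freely.
alpha = 2.7
lam_px, eta_px = lam * alpha, eta / alpha

fitted = np.outer(eta, lam)            # n x p matrix of fitted means
fitted_px = np.outer(eta_px, lam_px)
print(np.allclose(fitted, fitted_px))  # True: target unchanged
```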

PX in Hierarchical Models. It is very difficult to obtain a PX-accelerated Gibbs sampler in general cases: it is hard to avoid changing the target distribution. Gelman (2005, Bayesian Analysis) proposes to use PX to induce a better prior, while also speeding up mixing, in the setting of variance component models. Ghosh & Dunson (2007) extend this to factor analysis models.

Homework Exercise. Propose a PX Gibbs sampler for the sperm concentration latent factor regression model from lecture 15. Simulate data under the model and compare the PX Gibbs sampler to a typical Gibbs sampler without PX. Due next Friday.

What if our data are categorical? The above models assume that the different elements of y_i are continuous and normally distributed. In most settings in which factor analysis models are used, at least some of the elements of y_i are instead ordered categorical. It is appealing to have a general framework for modeling correlated measurements having different scales (continuous, binary, ordinal).

Underlying Normal Models. To solve this problem, we can consider the following modification to the measurement model:

y_ij = g_j(y*_ij; τ_j), j = 1, ..., p,
y*_i = µ + Λη_i + ε_i,  ε_i ~ N_p(0, Σ).

Here, y_i are the observed variables and y*_i are normal latent variables underlying y_i; g_j(·; τ_j) is a link function, possibly involving threshold parameters τ_j.

Link functions in underlying normal models. For continuous items (i.e., y_ij is continuous), g_j is chosen as the identity link. For binary items (y_ij ∈ {0, 1}), we choose a threshold link, y_ij = 1(y*_ij > 0). For ordered categorical items (y_ij ∈ {1, ..., c_j}), we generalize the binary case to let

y_ij = Σ_{l=1}^{c_j} l · 1(τ_{j,l-1} < y*_ij ≤ τ_{j,l}),  with τ_{j,0} = −∞, τ_{j,1} = 0, τ_{j,c_j} = ∞.
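The three link types can be sketched in a few lines of numpy (the function names are ours, for illustration):

```python
import numpy as np

def g_continuous(y_star):
    # Identity link for continuous items.
    return y_star

def g_binary(y_star):
    # Threshold link: y = 1(y* > 0).
    return (y_star > 0).astype(int)

def g_ordinal(y_star, tau):
    # tau = interior thresholds (tau_1, ..., tau_{c-1}); tau_1 = 0 is
    # fixed for identifiability, with tau_0 = -inf and tau_c = +inf
    # implicit. right=True matches tau_{l-1} < y* <= tau_l.
    return np.digitize(y_star, tau, right=True) + 1  # categories 1..c

y_star = np.array([-1.2, 0.3, 1.7])
print(g_binary(y_star))               # [0 1 1]
print(g_ordinal(y_star, [0.0, 1.0]))  # [1 2 3]
```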

Some Comments. Note that we are using an underlying multivariate normal model to characterize dependence among observations having a variety of scales. In factor analysis, the dependence is induced through shared dependence on the latent factors. Posterior computation is straightforward using a data augmentation Gibbs sampler, which imputes the y*_ij from their truncated normal full conditional distributions; the other sampling steps then proceed as if the y*_ij were observed data.
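For a binary item, the imputation step can be sketched as follows, assuming unit residual variance and using scipy's truncated normal (the helper name is ours):

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(1)

def impute_ystar_binary(y, mean, sd=1.0, rng=rng):
    """One data-augmentation step for a binary item: draw y* from its
    truncated normal full conditional, truncated to (0, inf) if y = 1
    and to (-inf, 0] if y = 0. Bounds are in standardized units."""
    y = np.asarray(y)
    lower = np.where(y == 1, (0 - mean) / sd, -np.inf)
    upper = np.where(y == 1, np.inf, (0 - mean) / sd)
    return truncnorm.rvs(lower, upper, loc=mean, scale=sd,
                         random_state=rng)

y = np.array([0, 1, 1, 0])
mean = np.array([-0.5, 0.2, 1.0, 0.3])  # current values of mu + Lambda*eta
y_star = impute_ystar_binary(y, mean)
# Each draw respects the truncation implied by the observed y:
print((y_star > 0).astype(int))  # [0 1 1 0]
```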

Generalized Latent Trait Models. The underlying normal specification induces normal linear models on the continuous items and probit-type models on the categorical items. This structure is restrictive: we may prefer to use a different GLM for each item, while allowing dependence. Replace the underlying normal measurement model with a generalized latent trait model (GLTM):

η_ij = µ_j + λ_j′ξ_i,

where η_ij = linear predictor in the GLM for outcome type j, and ξ_i = (ξ_i1, ..., ξ_ir)′ = vector of latent traits.
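The GLTM linear predictors for all subject-item pairs come from a single matrix product, after which each item applies its own GLM link; a sketch with hypothetical dimensions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Hypothetical dimensions: n subjects, p items, r latent traits.
n, p, r = 5, 3, 2
mu = rng.normal(size=p)        # item intercepts mu_j
Lam = rng.normal(size=(p, r))  # loadings lambda_j (one row per item)
xi = rng.normal(size=(n, r))   # latent traits xi_i (one row per subject)

# eta[i, j] = mu_j + lambda_j' xi_i for every subject/item pair.
eta = mu + xi @ Lam.T

# Each item can then use its own GLM link, e.g. probit for a binary
# item and log for a count item:
prob_item0 = norm.cdf(eta[:, 0])  # probit link
rate_item1 = np.exp(eta[:, 1])    # log link
print(eta.shape)  # (5, 3)
```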

Comments on GLTMs. GLTMs also allow modeling of count outcomes, and more flexible models for the individual items (e.g., logistic, complementary log-log, etc., instead of just probit). Important: the latent traits impact both the dependence among the different elements of y_i and the lack of fit in the individual item GLMs. It is harder to fit such models routinely, though adaptive rejection sampling and other tricks are possible.

Dangers of GLTMs. The dual role of the latent variable component, accommodating both dependence and lack of fit in the individual item links, creates problems in interpretation. Consider the case in which y_ij is a 0/1 indicator of a disease, with i indexing family and j indexing individual within a family. The following model is commonly used to assess within-family dependence in the probability of disease:

logit{ Pr(y_ij = 1 | x_i, β, ξ_i) } = x_i′β + ξ_i,  ξ_i ~ N(0, ψ).

Application to Genetic Epidemiology Studies. ξ_i = difference in the log-odds of disease for family i relative to the population average. Such differences among families are commonly attributed to genetic effects, and the estimated value of ψ is used to infer the magnitude of the genetic component: diseases having small ψ will exhibit limited within-family dependence and hence should have a small genetic component.

Genetic Epidemiology Example (continued). Anything wrong with this interpretation? It has been shown that one can identify the fixed effects, β, and the random effects variance, ψ, even if data are only available for a single individual per family. How can this be? The random effect was included to allow within-family dependence; with a single individual per family, there is seemingly no need for a random effect.

Punch Line. We have identifiability even with a single individual per family because the induced link function is no longer logistic. In particular, we have a logistic-normal link function:

Pr(y_i = 1 | x_i, β, ψ) = g_ψ(x_i′β) = ∫ {1 + exp(−x_i′β − ξ_i)}^{−1} (2πψ)^{−1/2} exp(−ξ_i²/(2ψ)) dξ_i.

The shape of the link function varies as ψ varies, so we can estimate β and ψ even with a single subject per family.
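The logistic-normal link has no closed form, but it is easy to evaluate numerically; a sketch using scipy quadrature (the function name is ours):

```python
import numpy as np
from scipy import integrate
from scipy.special import expit

def logistic_normal_link(lin_pred, psi):
    """Marginal success probability g_psi(x'beta): the logistic link
    averaged over the N(0, psi) random intercept."""
    def integrand(xi):
        return expit(lin_pred + xi) * \
            np.exp(-xi**2 / (2 * psi)) / np.sqrt(2 * np.pi * psi)
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val

# The shape of the link depends on psi, which is what makes beta and
# psi jointly identifiable from one subject per family: as psi grows,
# the marginal probability is pulled toward 1/2.
print(logistic_normal_link(1.0, 0.1))  # near expit(1.0) = 0.731
print(logistic_normal_link(1.0, 4.0))  # noticeably closer to 0.5
# By symmetry, g_psi(0) = 0.5 for every psi:
print(round(logistic_normal_link(0.0, 2.0), 4))  # 0.5
```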

Some Further Comments. Is ψ interpretable as genetic heterogeneity in this case? What if we have a few families with multiple individuals, and many with a single individual? Answer: ψ measures both lack of fit in the logistic link and heterogeneity among families. To obtain reliable inferences on genetic heterogeneity, you should use a flexible link function.

Some General Comments about GLTMs. We used a simple random intercept genetic epidemiology example as an illustration, but one needs to worry about these issues just as much in more complex settings involving multivariate outcomes having different scales. Normal and underlying normal models are more robust to such issues, since one does not change the link in marginalizing out the latent variables. To allow additional flexibility, one can work within the underlying normal family, e.g., using scale mixtures of normals.

Structural Equation Models (Bollen, 1989). When interest focuses on modeling relationships among latent traits, factor analysis needs to be extended; Structural Equation Models (SEMs) provide a broader framework. An SEM is specified in two components: (1) a measurement model relating the observed variables to the latent traits; (2) a structural model characterizing the joint distribution of the latent traits.

Structural Equation Models (Bollen, 1989). The measured data consist of a vector of response variables, y_i = (y_i1, ..., y_ip)′, and a vector of predictor variables, x_i = (x_i1, ..., x_iq)′. Measurement model:

y_i = µ_y + Λ_y η_i + ε_i^y,  ε_i^y ~ N_p(0, Σ_y),
x_i = µ_x + Λ_x ξ_i + ε_i^x,  ε_i^x ~ N_q(0, Σ_x).

This is just two separate factor analysis models, one for the y_i and one for the x_i.
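Simulating from the two measurement models makes the "two separate factor analyses" structure concrete; a sketch with hypothetical dimensions and independent N(0, 0.25) measurement errors:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical dimensions: p responses, q predictors, r and s factors.
n, p, q, r, s = 100, 4, 3, 2, 1
mu_y, mu_x = np.zeros(p), np.zeros(q)
Lam_y = rng.normal(size=(p, r))
Lam_x = rng.normal(size=(q, s))
eta = rng.normal(size=(n, r))  # latent factors underlying the responses
xi = rng.normal(size=(n, s))   # latent factors underlying the predictors

# Two separate factor-analysis measurement models, one for y, one for x:
y = mu_y + eta @ Lam_y.T + rng.normal(scale=0.5, size=(n, p))
x = mu_x + xi @ Lam_x.T + rng.normal(scale=0.5, size=(n, q))
print(y.shape, x.shape)  # (100, 4) (100, 3)
```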

Linear Structural Relations (LISREL) model. The LISREL structural model is

η_i = Bη_i + Γξ_i + δ_i,  ξ_i ~ N_s(0, I_s),  δ_i ~ N_r(0, I_r).

It describes the joint distribution of, and association among, the latent variables. B has zeros along the diagonal, which is standard notation in the LISREL model (typically, B also has upper triangular elements = 0, so the system is recursive). Γ contains the parameters that are often of primary interest, characterizing association between the latent predictor and latent response variables.
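With a zero diagonal and zero upper triangle, I − B is invertible, and the structural model has the reduced form η_i = (I − B)^{−1}(Γξ_i + δ_i); a quick numerical check with hypothetical dimensions:

```python
import numpy as np

rng = np.random.default_rng(4)

r, s, n = 3, 2, 50
# B strictly lower triangular (zero diagonal, zero upper triangle),
# so the structural system is recursive and I - B is invertible.
B = np.tril(rng.normal(size=(r, r)), k=-1)
Gamma = rng.normal(size=(r, s))
xi = rng.normal(size=(n, s))
delta = rng.normal(size=(n, r))

# Structural model eta = B eta + Gamma xi + delta, solved via its
# reduced form eta = (I - B)^{-1} (Gamma xi + delta):
eta = np.linalg.solve(np.eye(r) - B, (xi @ Gamma.T + delta).T).T

# Check that the structural equations hold row by row:
print(np.allclose(eta, eta @ B.T + xi @ Gamma.T + delta))  # True
```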