Multiple Group Analysis. Structural Equation Models. Interactions in SEMs. Multiple Group Analysis

Structural Equation Models
Multiple Group Analysis
Klaus Kähler Holst, Esben Budtz-Jørgensen
Department of Biostatistics, University of Copenhagen
November 5, 0

Multiple Group Analysis
Multiple group analysis extends the structural equation model framework to easily handle:
- Comparing structural equation models across groups (interactions)
- Combining data sets
- Regression modeling with externally estimated parameters

Interactions in SEMs
- Between covariates: include product terms between covariates.
- Between latent variables: $\eta_3 = \beta_1\eta_1 + \beta_2\eta_2 + \beta_3\eta_1\eta_2 + \zeta_3$. The model is not linear in the variables. Non-linear SEMs are not available in standard software.
- Between categorical covariates and latent variables: special case of multiple group analysis.
- Between continuous covariates and latent variables: random slopes (growth models).

Multiple Group Analysis
- $x_i = (x_{i1},\ldots,x_{iq})^t$: covariates of subject $i$.
- $y_i = (y_{i1},\ldots,y_{ip})^t$: response variables of subject $i$.
- Parameters may depend on a group variable $g = 1,\ldots,G$.
The measurement part:
$y_i = \nu^g + \Lambda^g\eta_i + K^g x_i + \epsilon_i$, $\epsilon_i \sim N(0, \Omega^g)$
The structural part:
$\eta_i = \alpha^g + B^g\eta_i + \Gamma^g x_i + \zeta_i$, $\zeta_i \sim N(0, \Psi^g)$

Likelihood
For fixed covariates the outcomes are assumed to follow a multivariate normal distribution.
[Figure: bivariate normal density of $Y_1$ and $Y_2$]
The distribution is completely characterized by its density function $f_\theta$, which in turn is completely characterized by its mean and variance.
The MLE principle: we find the parameters for which the observed data are most likely to be observed,
$\hat\theta_{MLE} = \arg\max_\theta L(\theta; y, x)$, where $L(\theta; y, x) = \prod_{i=1}^n \int_{\mathbb{R}^l} f_\theta(y_i, \eta_i \mid x_i)\, d\eta_i$

Multiple Group Analysis
[Path diagram, one panel per group (Females, Males): latent variable $u$ with residual $\zeta$, indicators $y_1, y_2, y_3$ with residual variances $v$ and loadings $\lambda_2, \lambda_3$, and group-specific regression coefficients $\beta_1$ and $\beta_2$]

Likelihood
In a simple linear regression model we have $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$.
Density:
$f(y \mid x; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - \beta_0 - \beta_1 x)^2}{2\sigma^2}\right)$
Likelihood:
$L(\theta; y, x) = \prod_{i=1}^n f(y_i \mid x_i; \theta) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{\sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right)$

Multiple Group Analysis
The model is defined additively from $G$ different (structural equation) models with likelihood functions $L_g$, $g = 1,\ldots,G$:
$\log L(\theta; y, x) = \sum_{g=1}^G \log L_g(\theta_g, y_g, x_g)$
with (possibly non-linear) parameter constraints across the different parameters $\theta_g$, $g = 1,\ldots,G$.
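The additive structure of the multiple-group log-likelihood can be checked numerically in the simple linear regression case. The following is a minimal base R sketch, not lava code: the helper `loglik_lm`, the simulated data, and the parameter values are all assumptions made for illustration.

```r
# Log-likelihood of y_i = b0 + b1 * x_i + e_i with e_i ~ N(0, s2),
# evaluated at a given parameter vector theta (hypothetical helper).
loglik_lm <- function(theta, y, x) {
  mu <- theta[["b0"]] + theta[["b1"]] * x
  sum(dnorm(y, mean = mu, sd = sqrt(theta[["s2"]]), log = TRUE))
}

# Simulated data with an arbitrary two-level group variable g.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n), g = rep(c("a", "b"), each = n / 2))
d$y <- 1 + 0.5 * d$x + rnorm(n)

theta <- c(b0 = 1, b1 = 0.5, s2 = 1)

# Multiple-group log-likelihood: sum of the group-specific contributions.
groups <- split(d, d$g)
ll_groups <- sapply(groups, function(dg) loglik_lm(theta, dg$y, dg$x))
ll_multigroup <- sum(ll_groups)

# With every parameter constrained equal across groups, the sum
# coincides with the log-likelihood of the pooled data.
ll_pooled <- loglik_lm(theta, d$y, d$x)
```

Freeing a parameter in one group simply means passing a different `theta` to that group's term; the total remains a sum over groups.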

Multiple Group Analysis, lava syntax
> e <- estimate(list(m1,m2,m3,...),list(d1,d2,d3,...))
> e <- estimate(rep(list(m),5), split(d,d$x))
- Free parameters are unique for each group.
- Named parameters are fixed across groups.
- m <- baptize(m): label all parameters.
- parameter(m) <- ~alpha+beta: add parameters to the model.
- If data is repeated across groups the cluster argument must be used (GEE type variance estimates).

Multiple Group Analysis, Example
> m <- lvm()
> regression(m) <- c(y1,y2,y3) ~ u
> regression(m) <- u ~ x
> latent(m) <- ~u
> plot(m)
[Path diagram: $x \to u \to y_1, y_2, y_3$]

Examining the model
- plot: shows the path diagram
- regression, covariance, intercept, constrain: show parameter restrictions
- summary: prints an overview of the adjacency and covariance matrices
- coef(m): show parameter names
- subset(m, ~y1+y2+x): extract sub-model
- m1%+%m2%+%m3: merge models
- path(m, y1~x): extract (directed) pathways between variables
- parents(m, ~y1+y2); children(m, ...): parents and children of nodes (union)
- vars(m), exogenous(m), endogenous(m), latent, manifest: extract variable names
> plot(m,labels=TRUE,diag=TRUE)
[Path diagram of the same model with parameter labels p1 to p8]

Examining the model
> summary(m)
Latent Variable Model with: 5 variables.
Npar=8+4
Regression parameters:
   y1 y2 y3 u  x
y1
y2
y3
u  *  *  *
x           *
Covariance parameters:
   y1 y2 y3 u  x
y1 *
y2    *
y3       *
u           *
x
Intercept parameters:
y1 y2 y3 u  x
*  *  *  *

Multiple Group Analysis, lava syntax
NB: Labeled parameters will not be altered to guarantee identification (lava.options()$param)!
Automatic solution:
> m <- baptize(fixsome(m))
Manual solution:
> regression(m, y1~u[0]) <- 1

Multiple Group Analysis, Example
> m <- baptize(m)
> summary(m)
...
Regression parameters:
   y1    y2    y3    u  x
y1
y2
y3
u  y1<-u y2<-u y3<-u
x                    u<-x
Covariance parameters:
   y1      y2      y3      u     x
y1 y1<->y1
y2         y2<->y2
y3                 y3<->y3
u                          u<->u
x
Intercept parameters:
y1 y2 y3 u  x
y1 y2 y3 u

Multiple Group Analysis, example
We continue the example by adding some additional constraints and freeing some parameters for the multigroup analysis:
> intercept(m,endogenous(m)) <- 0
> covariance(m,endogenous(m)) <- 0
> regression(m,u~x) <- NA
> covariance(m,~u) <- NA
> summary(m)
...
Regression parameters:
   y1 y2    y3    u  x
y1
y2
y3
u  1  y2<-u y3<-u
x                 *
Covariance parameters:
   y1 y2 y3 u  x
y1 v
y2    v
y3       v
u           *
x
Intercept parameters:
y1 y2 y3 u  x
         u

Multiple Group Analysis, example
[Path diagram, one panel per group (Females, Males): latent variable $u$ with residual $\zeta$, indicators $y_1, y_2, y_3$ with residual variances $v$ and loadings $\lambda_2, \lambda_3$, and group-specific regression coefficients $\beta_1$ and $\beta_2$]

Multiple Group Analysis, estimation
> e <- estimate(list(males=m,females=m),list(d1,d2))
> e
Available methods: summary, coef, vcov, confint, compare, plot, score, loglik, ...
Note that a meaningful test of equality of $\beta_1$ and $\beta_2$ requires measurement invariance (the latent variables should measure the same thing, i.e. equal factor loadings).

Multiple Group Analysis, estimation
Multigroup analysis
[Output table with columns Estimate, Std. Error, Z value, Pr(>|z|); the numeric values are not preserved in this transcript]
Group 1: Males (n=00)
  Measurements: y2<-u, y3<-u
  Regressions: u<-x
  Intercepts: u
  Residual Variances: y1, u
Group 2: Females (n=50)
  Measurements: y2<-u, y3<-u
  Regressions: u<-x
  Intercepts: u
  Residual Variances: y1, u

Wald test and LRT via compare:
> e1 <- estimate(m,rbind(d1,d2))
> compare(e,e1)
Likelihood ratio test
data:
chisq = 22.8685, df = 2, p-value = 1.08e-05
sample estimates:
log likelihood (model 1) log likelihood (model 2)
Omnibus $\chi^2$-test: compare(e)
Non-linear constraints are possible. However, parameters must be added across all groups with the parameter method.
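The likelihood ratio test of a common versus a group-specific regression coefficient can be illustrated without latent variables. The sketch below is a simplified base R analogue of the multigroup comparison, not the lava `compare` method itself: it uses `lm` with a shared residual variance, and the simulated slopes, sample sizes, and the helper `sim_group` are assumptions made for illustration.

```r
# Simulate one group with a given slope of y on x (hypothetical helper).
sim_group <- function(n, slope, label) {
  x <- rnorm(n)
  data.frame(x = x, y = 1 + slope * x + rnorm(n), g = label)
}

set.seed(2)
n <- 150
d <- rbind(sim_group(n, 1.0, "males"), sim_group(n, 0.2, "females"))

# Null model: the same slope in both groups.
fit_common <- lm(y ~ g + x, data = d)
# Alternative: group-specific slopes (interaction between g and x).
fit_separate <- lm(y ~ g * x, data = d)

ll0 <- logLik(fit_common)
ll1 <- logLik(fit_separate)

# LRT statistic, degrees of freedom, and chi-square p-value.
lrt <- 2 * (as.numeric(ll1) - as.numeric(ll0))
df_diff <- attr(ll1, "df") - attr(ll0, "df")
pval <- pchisq(lrt, df = df_diff, lower.tail = FALSE)
```

The degrees of freedom equal the number of parameters that are free in the larger model but constrained equal in the smaller one, here a single slope.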

Combining different models
Stacking different models and datasets together...
- $M_1(\theta, \psi)$: primary model with parameters $(\theta, \psi)$, where $\psi$ is a nuisance parameter (may be unidentified) and $\theta$ is the parameter of interest.
- $M_2(\theta, \psi)$: model used to estimate the nuisance parameter $\psi$.
Example: measurement error model
True model: $Y = \beta_0 + \beta_1 X + \epsilon$, but instead of $X$ we observe $W_1 = X + U_1$.
In an independent dataset we in addition observe $W_1 = X + U_1$ and $W_2 = X + U_2$.
Solution: multiple group analysis (a two-stage approach that ignores the uncertainty in the estimation leads to too small standard errors!)
NB: Same principle as missing data analysis in SEM.

Combining different models
> m <- lvm(c(y,w1,w2)~x)
> d1 <- sim(m,000,p=c("y<-x"=-))
> d2 <- sim(m,000)
> estimate(y~x,d1)
[Output: Estimate, Std. Error, Z-value and P-value for the regression y<-x, the intercept of y and the residual variance of y; the numeric values are not preserved in this transcript]
> estimate(y~w1,d1)
[Output: the same quantities for the regression y<-w1; the numeric values are not preserved in this transcript]

Measurement error
$Y$: response, $X$: true exposure, $W$: measured exposure
[Figure: scatter plot illustrating the measurement error]

Measurement error
True data generating mechanism:
$Y = \beta_0 + \beta_1 X + \epsilon$, $W = X + U$
Assume $X$ has finite variance $\sigma^2$ and $X \perp \epsilon \sim N(0, \sigma_\epsilon^2)$.
[Path diagram: $X \to Y$ and $X \to W$]
Naïve analysis
$W = X + U$, $U \sim N(0, \sigma_u^2)$, $U \perp X$
- Naïve analysis replacing $X$ with $W$ attenuates the effect.
- Bias increases with the degree of imprecision.
$Y = \beta_0 + \beta_1 W + \xi$, $\xi = \epsilon - \beta_1 U$
Violation of linear model assumptions?
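The question of whether the naïve model violates the linear model assumptions can be answered by writing out the error term. The derivation below is a sketch that assumes, in addition to what is stated above, that $U$ is independent of both $X$ and $\epsilon$.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 X + \epsilon
   = \beta_0 + \beta_1 (W - U) + \epsilon
   = \beta_0 + \beta_1 W + \xi,
   \qquad \xi = \epsilon - \beta_1 U, \\
\operatorname{Cov}(W, \xi)
  &= \operatorname{Cov}(X + U,\ \epsilon - \beta_1 U)
   = -\beta_1 \operatorname{Var}(U)
   = -\beta_1 \sigma_u^2 .
\end{align*}
```

The error term $\xi$ is therefore correlated with the regressor $W$ unless $\beta_1 = 0$ or $\sigma_u^2 = 0$, which is why least squares on $W$ does not recover $\beta_1$.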

Measurement error
The MLE obtained by regressing $Y$ on $W$ is an unbiased estimate of
$\lambda\beta_1 = \mathrm{Cov}(Y, W)/\mathrm{Var}(W)$, where $\lambda = \frac{\sigma^2}{\sigma^2 + \sigma_u^2} < 1$.
Introducing confounder $Z$:
$Y = \beta_0 + \beta_1 X + \beta_z Z + \epsilon$, $W = X + U$
Naïve analysis consistently estimates
$\beta_1 \frac{\mathrm{Var}(X \mid Z)}{\mathrm{Var}(X \mid Z) + \mathrm{Var}(U)}$
Large bias when the variance of the exposure is low (for fixed levels of the confounders) and the imprecision is high.

Combining different models
- m1: primary model
- m2: model used to estimate the nuisance parameter (from an independent data set)
> m2 <- lvm()
> regression(m2, c(w1,w2)~u) <- 1
> intercept(m2,~w1+w2) <- "m"
> covariance(m2,~w1+w2) <- "v"
> covariance(m2,~u) <- "vu"
> intercept(m2,~u) <- ...
> m1 <- kill(m2,~w2)
> regression(m1) <- y~u
> estimate(list(m1,m2),list(d1,d2))
[Output table with columns Estimate, Std. Error, Z value, Pr(>|z|); the numeric values are not preserved in this transcript]
Group 1 (n=000)
  Measurements: y<-u
  Intercepts: w1, y
  Residual Variances: w1, u, y
Group 2 (n=000)
  Intercepts: w1
  Residual Variances: w1, u

Mixture Models
- Multiple group analysis: known groups
- Mixture SEM: known number of unknown groups
> library(lava.mixture)
> mixture(list(m1,m2,m3),data=d)
> mixture(m,data=d,k=3)
- Modelling of heterogeneity
- Estimation by EM-algorithm (slower than multiple group)
- Controversy in choosing the number of components
- Technical problems (convergence and boundedness of the likelihood)
- Impact of model misspecification?
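The attenuation factor $\lambda = \sigma^2/(\sigma^2 + \sigma_u^2)$ can be checked by simulation. This is a minimal base R sketch, not the lava analysis from the slides; the sample size, the true coefficients, and the variances are assumptions chosen for illustration.

```r
set.seed(3)
n <- 10000
beta0 <- 0
beta1 <- 1        # assumed true effect of the exposure
sigma2 <- 1       # assumed Var(X)
sigma2_u <- 1     # assumed Var(U), the measurement error variance

x <- rnorm(n, sd = sqrt(sigma2))           # true exposure
y <- beta0 + beta1 * x + rnorm(n)          # response
w <- x + rnorm(n, sd = sqrt(sigma2_u))     # measured exposure

# Regression on the true exposure recovers beta1 ...
b_true <- coef(lm(y ~ x))[["x"]]
# ... while the naive regression on W is attenuated towards zero.
b_naive <- coef(lm(y ~ w))[["w"]]

# Theoretical attenuation factor.
lambda <- sigma2 / (sigma2 + sigma2_u)
```

With equal exposure and error variances, `lambda` is 0.5, so the naïve slope should land close to half of `beta1`.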

Twin Studies
There is considerable interest in finding out how much of specific traits and diseases is inherited.
Family and twin studies can be used to shed light on the genetic and environmental influence.
Twin studies
- Include both monozygotic (MZ) and dizygotic (DZ) twin pairs.
- DZ pairs on average share half of their genes.
- MZ pairs are natural copies.
Difference in similarity of DZ and MZ twins may indicate genetic influence!
[Figure: birth weight of twin against birth weight of co-twin, panels DZ and MZ]

Twin similarity
Similarity: the difference in (product-moment) correlation within pairs of MZ and DZ twins is our measure of similarity, i.e. the difference in the amount of variance between pairs out of the total variance of the phenotype.
Higher correlation in MZ pairs indicates genetic influence.
[Figure: twin similarity, panels DZ and MZ]

Decomposition
What is the contribution of genetic and environmental factors to the variation in the outcome?
The phenotype is the sum of genetic and environmental effects:
$Y = G + E$
Idea: decompose the variance into genetic and environmental components,
$\Sigma_Y = \Sigma_G + \Sigma_E$
[Figure: densities of $Y_1$ and $Y_2$]

Polygenic model for continuous trait
ACDE model: decompose the outcome into
$Y_i = A_i + D_i + C + E_i$, $i = 1, 2$
- A: additive genetic effects of alleles
- D: dominant genetic effects of alleles
- C: shared environmental effects
- E: unique environmental effects
Dissimilarity of MZ twins arises from unshared environmental effects only!
$\mathrm{Cor}(E_1, E_2) = 0$ and
$\mathrm{Cor}(A_1^{MZ}, A_2^{MZ}) = 1$, $\mathrm{Cor}(D_1^{MZ}, D_2^{MZ}) = 1$,
$\mathrm{Cor}(A_1^{DZ}, A_2^{DZ}) = 0.5$, $\mathrm{Cor}(D_1^{DZ}, D_2^{DZ}) = 0.25$
Assumptions
- No gene-environment interaction
- No gene-gene interaction
- Same marginals of twin 1 and twin 2, and of MZ and DZ
- Equal environmental effects for MZ and DZ

Polygenic model
[Path diagram: $A_1, D_1, E_1$ and $A_2, D_2, E_2$ with the shared $C$, loadings $\lambda_A, \lambda_D, \lambda_E, \lambda_C$ on $Y_1$ and $Y_2$; correlation between $A_1$ and $A_2$: DZ 0.5 / MZ 1; between $D_1$ and $D_2$: DZ 0.25 / MZ 1]
Model:
$Y_i = A_i + C_i + D_i + E_i$
$A_i \sim N(0, \sigma_A^2)$, $C_i \sim N(0, \sigma_C^2)$, $D_i \sim N(0, \sigma_D^2)$, $E_i \sim N(0, \sigma_E^2)$
$\mathrm{Cov}(Y_1, Y_2) = \begin{pmatrix} \sigma_A^2 & Z_A\sigma_A^2 \\ Z_A\sigma_A^2 & \sigma_A^2 \end{pmatrix} + \begin{pmatrix} \sigma_C^2 & \sigma_C^2 \\ \sigma_C^2 & \sigma_C^2 \end{pmatrix} + \begin{pmatrix} \sigma_D^2 & Z_D\sigma_D^2 \\ Z_D\sigma_D^2 & \sigma_D^2 \end{pmatrix} + \begin{pmatrix} \sigma_E^2 & 0 \\ 0 & \sigma_E^2 \end{pmatrix}$
where $Z_A = 1$ for MZ and $0.5$ for DZ, and $Z_D = 1$ for MZ and $0.25$ for DZ.
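The covariance structure implied by the ACDE decomposition can be written out numerically. The following is a minimal base R sketch; `twin_cov` is a hypothetical helper, the variance components are illustrative values rather than estimates, and the within-pair correlations are the standard ones (1 for MZ; 0.5 for additive and 0.25 for dominance effects in DZ).

```r
# Implied 2 x 2 covariance matrix of (Y1, Y2) for a twin pair under the
# ACDE model, given the zygosity and the four variance components.
twin_cov <- function(zyg, s2A, s2C, s2D, s2E) {
  zA <- if (zyg == "MZ") 1 else 0.5    # correlation of additive effects
  zD <- if (zyg == "MZ") 1 else 0.25   # correlation of dominance effects
  total  <- s2A + s2C + s2D + s2E      # Var(Y1) = Var(Y2)
  shared <- zA * s2A + s2C + zD * s2D  # Cov(Y1, Y2); E is never shared
  matrix(c(total, shared, shared, total), nrow = 2,
         dimnames = list(c("Y1", "Y2"), c("Y1", "Y2")))
}

# Illustrative variance components (assumed, summing to 1).
S_mz <- twin_cov("MZ", s2A = 0.4, s2C = 0.2, s2D = 0.1, s2E = 0.3)
S_dz <- twin_cov("DZ", s2A = 0.4, s2C = 0.2, s2D = 0.1, s2E = 0.3)

# Within-pair correlations for the two zygosities.
rho_mz <- cov2cor(S_mz)[1, 2]
rho_dz <- cov2cor(S_dz)[1, 2]
```

Only the off-diagonal element differs between the two groups, which is what a multigroup analysis by zygosity exploits.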

Polygenic model
[Path diagram: $A_1, D_1, E_1$ and $A_2, D_2, E_2$ with the shared $C$, loadings $\lambda_A, \lambda_D, \lambda_E, \lambda_C$ on $Y_1$ and $Y_2$; correlation between $A_1$ and $A_2$: DZ 0.5 / MZ 1; between $D_1$ and $D_2$: DZ 0.25 / MZ 1]

Polygenic model
Obviously this is a structural equation model which is easily implemented in lava:
> m <- lvm()
> regression(m, c(y1,y2) ~ A1+A2) <- c("a","a")
> ...
The differences in covariance between zygosities can be defined using a multigroup analysis...
> mz <- dz <- m
> covariance(mz,A1~A2) <- 1
> covariance(dz,A1~A2) <- 0.5
> ...
> estimate(list(mz,dz),split(twinwide,twinwide$zyg.))
... or you could cheat and use the mets package.

Polygenic model
With only MZ and DZ twins we can only identify three of the variance components. Often the following strategy is applied:
1. Estimate the ACE model
2. Compare with the AE model
3. If C can be omitted, estimate the ADE model
Identification is possible by extending the design to include adoptive siblings or additional family members.

Heritability
Heritability (broad sense):
$H_Y^2 = \frac{\mathrm{Var}(G)}{\mathrm{Var}(Y)} = \frac{\sigma_A^2 + \sigma_D^2}{\sigma_A^2 + \sigma_C^2 + \sigma_D^2 + \sigma_E^2}$
Narrow-sense heritability:
$h_Y^2 = \frac{\mathrm{Var}(A)}{\mathrm{Var}(Y)} = \frac{\sigma_A^2}{\sigma_A^2 + \sigma_C^2 + \sigma_D^2 + \sigma_E^2}$
Shared environmental effect:
$c_Y^2 = \frac{\sigma_C^2}{\sigma_A^2 + \sigma_C^2 + \sigma_D^2 + \sigma_E^2}$
In the ACE model the heritability is given by
$h^2 = 2(\rho_{MZ} - \rho_{DZ})$
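The ACE expression for the heritability in terms of the MZ and DZ correlations follows directly from the variance decomposition. A short derivation, assuming the ACE model (no dominance component) with additive genetic correlation 1 in MZ pairs and 0.5 in DZ pairs:

```latex
\begin{align*}
\rho_{MZ} &= \frac{\sigma_A^2 + \sigma_C^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2},
\qquad
\rho_{DZ} = \frac{\tfrac{1}{2}\sigma_A^2 + \sigma_C^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2}, \\
2(\rho_{MZ} - \rho_{DZ}) &= \frac{\sigma_A^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2} = h^2, \\
2\rho_{DZ} - \rho_{MZ} &= \frac{\sigma_C^2}{\sigma_A^2 + \sigma_C^2 + \sigma_E^2} = c^2 .
\end{align*}
```

For example, $\rho_{MZ} = 0.7$ and $\rho_{DZ} = 0.45$ would give $h^2 = 0.5$ and $c^2 = 0.2$.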

Gene-Environment Interactions
Covariate $X$ modifying $h^2$:
$\log(\sigma_A^2) := \alpha_A + \gamma_A X$
$\log(\sigma_C^2) := \alpha_C + \gamma_C X$
$\log(\sigma_E^2) := \alpha_E + \gamma_E X$
For categorical $X$, easily implemented in lava using multigroup.
For continuous $X$ (much slower):
> constrain(m, va ~ x+alpha+gamma) <- function(x) x[2]+x[1]*x[3]
or using random slopes, i.e. $\lambda_A := \alpha_A + \gamma_A X$:
> regression(m) <- c(y1,y2) ~ f(a,x)

Multivariate Analysis of Twin Data
How should we proceed with multiple traits?
Why?
- Models for comorbidity
- Gain in efficiency
- Heritability pathways

Multivariate Analysis of Twin Data
[Path diagrams: bivariate twin models for traits $Y$ and $Z$ with A, C and E components]
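The log-linear variance model implies a heritability that changes with the covariate. The following base R sketch evaluates the narrow-sense heritability in an ACE model as a function of $X$; the helper `h2_mod` and all coefficient values are assumptions for illustration, not estimates from any data.

```r
# Heritability as a function of a moderator x, with each variance
# component modelled as log(sigma^2) = alpha + gamma * x.
h2_mod <- function(x, aA, gA, aC, gC, aE, gE) {
  vA <- exp(aA + gA * x)   # additive genetic variance
  vC <- exp(aC + gC * x)   # shared environmental variance
  vE <- exp(aE + gE * x)   # unique environmental variance
  vA / (vA + vC + vE)
}

# Illustrative case: only the genetic variance depends on x.
x <- c(-1, 0, 1)
h2 <- h2_mod(x, aA = 0, gA = 0.5, aC = 0, gC = 0, aE = 0, gE = 0)
```

With all intercepts equal and only `gA` non-zero, the heritability is one third at `x = 0` and increases with `x`.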

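Cholesky Factor Model
[Path diagram: Cholesky decomposition of the A, C and E components for traits $Y_1, Y_2, Y_3$]

Independent pathways (biometric common factors)
[Path diagram: common A, C and E factors plus trait-specific $A_j, C_j, E_j$ for $Y_1, Y_2, Y_3$]

Common pathways (psychometric common factors)
[Path diagram: A, C and E acting on a common latent factor measured by $Y_1, Y_2, Y_3$, plus trait-specific $A_j, C_j, E_j$]
Hjelmborg et al., Obesity 2008

Longitudinal biometric analysis
[Path diagram: growth model with intercept $I$ and slope $S$, each with A, C and E components; measurements $Y_1, Y_2, \ldots, Y_p$ at times $t_1, t_2, \ldots, t_p$ with specific A, C and E components]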

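Longitudinal biometric analysis
[Path diagram: intercept $I$ and slope $S$ with A, C and E components; measurements $Y_1, Y_2, \ldots, Y_p$ at times $t_1, t_2, \ldots, t_p$]

Non-normal traits
- Dichotomous: probit model / threshold model
- Censoring: tobit model / threshold model
- Inverse probability weights
Available with packages lava.tobit, mets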

The Faroese Cohort 2. Structural Equation Models Latent growth models. Multiple indicator growth modeling. Response profiles. Number of children: 182

The Faroese Cohort 2. Structural Equation Models Latent growth models. Multiple indicator growth modeling. Response profiles. Number of children: 182 The Faroese Cohort 2 Structural Equation Models Latent growth models Exposure: Blood Hg Hair Hg Age: Birth 3.5 years 4.5 years 5.5 years 7.5 years Number of children: 182 Multiple indicator growth modeling

More information

Notes on Twin Models

Notes on Twin Models Notes on Twin Models Rodrigo Pinto University of Chicago HCEO Seminar April 19, 2014 This draft, April 19, 2014 8:17am Rodrigo Pinto Gene-environment Interaction and Causality, April 19, 2014 8:17am 1

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Variance Component Models for Quantitative Traits. Biostatistics 666

Variance Component Models for Quantitative Traits. Biostatistics 666 Variance Component Models for Quantitative Traits Biostatistics 666 Today Analysis of quantitative traits Modeling covariance for pairs of individuals estimating heritability Extending the model beyond

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Structural Equation Models Short summary from day 2. Selected equations. In lava

Structural Equation Models Short summary from day 2. Selected equations. In lava Path diagram for indicators of mercury exposure and childhood cognitive function Structural Equation Models Short summary from day 2 ζ 2 t(w hale) ǫ H Hg log(h-hg) η1 ζ 1 ǫ B Hg log(b-hg) Confounders 7

More information

Course topics (tentative) The role of random effects

Course topics (tentative) The role of random effects Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Introduction to Multivariate Genetic Analysis. Meike Bartels, Hermine Maes, Elizabeth Prom-Wormley and Michel Nivard

Introduction to Multivariate Genetic Analysis. Meike Bartels, Hermine Maes, Elizabeth Prom-Wormley and Michel Nivard Introduction to Multivariate Genetic nalysis Meike Bartels, Hermine Maes, Elizabeth Prom-Wormley and Michel Nivard im and Rationale im: to examine the source of factors that make traits correlate or co-vary

More information

Partitioning the Genetic Variance

Partitioning the Genetic Variance Partitioning the Genetic Variance 1 / 18 Partitioning the Genetic Variance In lecture 2, we showed how to partition genotypic values G into their expected values based on additivity (G A ) and deviations

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Partitioning Genetic Variance

Partitioning Genetic Variance PSYC 510: Partitioning Genetic Variance (09/17/03) 1 Partitioning Genetic Variance Here, mathematical models are developed for the computation of different types of genetic variance. Several substantive

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017 Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping

More information

Supplementary File 3: Tutorial for ASReml-R. Tutorial 1 (ASReml-R) - Estimating the heritability of birth weight

Supplementary File 3: Tutorial for ASReml-R. Tutorial 1 (ASReml-R) - Estimating the heritability of birth weight Supplementary File 3: Tutorial for ASReml-R Tutorial 1 (ASReml-R) - Estimating the heritability of birth weight This tutorial will demonstrate how to run a univariate animal model using the software ASReml

More information

Go to Faculty/marleen/Boulder2012/Moderating_cov Copy all files to your own directory

Go to Faculty/marleen/Boulder2012/Moderating_cov Copy all files to your own directory Go to Faculty/marleen/Boulder01/Moderating_cov Copy all files to your own directory Go to Faculty/sanja/Boulder01/Moderating_covariances _IQ_SES Copy all files to your own directory Moderating covariances

More information

Multivariate Analysis. Hermine Maes TC19 March 2006

Multivariate Analysis. Hermine Maes TC19 March 2006 Multivariate Analysis Hermine Maes TC9 March 2006 Files to Copy to your Computer Faculty/hmaes/tc9/maes/multivariate *.rec *.dat *.mx Multivariate.ppt Multivariate Questions I Bivariate Analysis: What

More information

A Practitioner s Guide to Generalized Linear Models

A Practitioner s Guide to Generalized Linear Models A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically

More information

,..., θ(2),..., θ(n)

,..., θ(2),..., θ(n) Likelihoods for Multivariate Binary Data Log-Linear Model We have 2 n 1 distinct probabilities, but we wish to consider formulations that allow more parsimonious descriptions as a function of covariates.

More information

STA 216, GLM, Lecture 16. October 29, 2007

STA 216, GLM, Lecture 16. October 29, 2007 STA 216, GLM, Lecture 16 October 29, 2007 Efficient Posterior Computation in Factor Models Underlying Normal Models Generalized Latent Trait Models Formulation Genetic Epidemiology Illustration Structural

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

Bayesian Analysis of Latent Variable Models using Mplus

Bayesian Analysis of Latent Variable Models using Mplus Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are

More information

Multilevel Structural Equation Modeling

Multilevel Structural Equation Modeling Multilevel Structural Equation Modeling Joop Hox Utrecht University j.hox@uu.nl http://www.joophox.net 14_15_mlevsem Multilevel Regression Three level data structure Groups at different levels may have

More information

Continuously moderated effects of A,C, and E in the twin design

Continuously moderated effects of A,C, and E in the twin design Continuously moderated effects of A,C, and E in the twin design Conor V Dolan & Michel Nivard Boulder Twin Workshop March, 208 Standard AE model s 2 A or.5s 2 A A E A E s 2 A s 2 E s 2 A s 2 E m pheno

More information

Fall Homework Chapter 4

Fall Homework Chapter 4 Fall 18 1 Homework Chapter 4 1) Starting values do not need to be theoretically driven (unless you do not have data) 2) The final results should not depend on starting values 3) Starting values can be

More information

Consequences of measurement error. Psychology 588: Covariance structure and factor models

Consequences of measurement error. Psychology 588: Covariance structure and factor models Consequences of measurement error Psychology 588: Covariance structure and factor models Scaling indeterminacy of latent variables Scale of a latent variable is arbitrary and determined by a convention

More information

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014

Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently

More information

Linear models Analysis of Covariance

Linear models Analysis of Covariance Esben Budtz-Jørgensen November 20, 2007 Linear models Analysis of Covariance Confounding Interactions Parameterizations Analysis of Covariance group comparisons can become biased if an important predictor

More information

DNA polymorphisms such as SNP and familial effects (additive genetic, common environment) to

DNA polymorphisms such as SNP and familial effects (additive genetic, common environment) to 1 1 1 1 1 1 1 1 0 SUPPLEMENTARY MATERIALS, B. BIVARIATE PEDIGREE-BASED ASSOCIATION ANALYSIS Introduction We propose here a statistical method of bivariate genetic analysis, designed to evaluate contribution

More information

Comparing IRT with Other Models

Comparing IRT with Other Models Comparing IRT with Other Models Lecture #14 ICPSR Item Response Theory Workshop Lecture #14: 1of 45 Lecture Overview The final set of slides will describe a parallel between IRT and another commonly used

More information

Sum Scores in Twin Growth Curve Models: Practicality Versus Bias

Sum Scores in Twin Growth Curve Models: Practicality Versus Bias DOI 10.1007/s10519-017-9864-0 ORIGINAL RESEARCH Sum Scores in Twin Growth Curve Models: Practicality Versus Bias Justin M. Luningham 1 Daniel B. McArtor 1 Meike Bartels 2 Dorret I. Boomsma 2 Gitta H. Lubke

More information

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative

More information

Lecture 1 Basic Statistical Machinery

Lecture 1 Basic Statistical Machinery Lecture 1 Basic Statistical Machinery Bruce Walsh. jbwalsh@u.arizona.edu. University of Arizona. ECOL 519A, Jan 2007. University of Arizona Probabilities, Distributions, and Expectations Discrete and Continuous

More information

Outline. Overview of Issues. Spatial Regression. Luc Anselin

Outline. Overview of Issues. Spatial Regression. Luc Anselin Spatial Regression Luc Anselin University of Illinois, Urbana-Champaign http://www.spacestat.com Outline Overview of Issues Spatial Regression Specifications Space-Time Models Spatial Latent Variable Models

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D.

Research Design - - Topic 15a Introduction to Multivariate Analyses 2009 R.C. Gardner, Ph.D. Research Design - - Topic 15a Introduction to Multivariate Analses 009 R.C. Gardner, Ph.D. Major Characteristics of Multivariate Procedures Overview of Multivariate Techniques Bivariate Regression and

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical

More information

An introduction to quantitative genetics

An introduction to quantitative genetics An introduction to quantitative genetics 1. What is the genetic architecture and molecular basis of phenotypic variation in natural populations? 2. Why is there phenotypic variation in natural populations?

More information

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data Bryan A. Comstock and Patrick J. Heagerty Department of Biostatistics University of Washington

More information

Longitudinal Data Analysis

Longitudinal Data Analysis Longitudinal Data Analysis Mike Allerhand This document has been produced for the CCACE short course: Longitudinal Data Analysis. No part of this document may be reproduced, in any form or by any means,

More information

MIXED MODELS THE GENERAL MIXED MODEL

MIXED MODELS THE GENERAL MIXED MODEL MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Linear models Analysis of Covariance

Linear models Analysis of Covariance Esben Budtz-Jørgensen April 22, 2008 Linear models Analysis of Covariance Confounding Interactions Parameterizations Analysis of Covariance group comparisons can become biased if an important predictor

More information

Measuring Social Influence Without Bias

Measuring Social Influence Without Bias Measuring Social Influence Without Bias Annie Franco Bobbie NJ Macdonald December 9, 2015 The Problem CS224W: Final Paper How well can statistical models disentangle the effects of social influence from

More information

Gibbs Sampling in Latent Variable Models #1

Gibbs Sampling in Latent Variable Models #1 Gibbs Sampling in Latent Variable Models #1 Econ 690 Purdue University Outline 1 Data augmentation 2 Probit Model Probit Application A Panel Probit Panel Probit 3 The Tobit Model Example: Female Labor

More information

6 Pattern Mixture Models

6 Pattern Mixture Models 6 Pattern Mixture Models A common theme underlying the methods we have discussed so far is that interest focuses on making inference on parameters in a parametric or semiparametric model for the full data

More information

Generalized Linear Models for Non-Normal Data

Today's Class: 3 parts of a generalized model; models for binary outcomes; complications for generalized multivariate or multilevel models. SPLH 861: Lecture

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

Independent Errors Assumption: An error is the deviation of an individual observed outcome (DV)

The returns to schooling, ability bias, and regression

Jörn-Steffen Pischke, LSE. October 4, 2016. Counterfactual outcomes: Schooling for individual i is

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

Yujin Chung, November 29th, 2016. Previous: Parametric tests; mean comparisons (normality assumption)

STA 431s17 Assignment Eight 1

The first three questions of this assignment are about how instrumental variables can help with measurement error and omitted variables at the same time; see Lecture slide set

Introduction to mtm: An R Package for Marginalized Transition Models

Bryan A. Comstock and Patrick J. Heagerty, Department of Biostatistics, University of Washington. 1 Introduction. Marginalized transition

2.1 Linear regression with matrices

The values of the independent variables are united into the matrix X (design matrix), the values of the outcome and the coefficient are represented by the vectors Y and

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Optimum Design for Mixed Effects Non-Linear and Generalized Linear Models, Cambridge, August 9-12, 2011.

Latent random variables

Imagine that you have collected egg size data on a fish called Austrolebias elongatus, and the graph of egg size on body size of the mother looks as follows: [Figure: egg size (area) plotted against body size of the mother]

Nesting and Equivalence Testing

Tihomir Asparouhov and Bengt Muthén. August 13, 2018. Abstract: In this note, we discuss the nesting and equivalence testing (NET) methodology developed in Bentler and Satorra

Modeling Longitudinal Count Data with Excess Zeros and Time-Dependent Covariates: Application to Drug Use

University of Northern Colorado, November 17, 2014. Presentation Outline I and Data Issues II Correlated Count Regression

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Topics: What are MI and DIF? Testing measurement invariance in CFA. Testing differential item functioning in IRT/IFA.

Direction: This test is worth 250 points and each problem worth points. DO ANY SIX

Term Test 3, December 5, 2003. Name. Math 52. Student Number. Direction: This test is worth 250 points and each problem worth 4 points. DO ANY SIX PROBLEMS. You are required to complete this test within 50 minutes

The concept of breeding value. Gene251/351 Lecture 5

Gene251/351 Lecture 5. Key terms: Estimated breeding value (EBV), Heritability, Contemporary groups. Reading: No prescribed reading from Simm's book. Revision: Quantitative traits

β j = coefficient of x j in the model; β = ( β1, β2,

Regression Modeling of Survival Time Data. Why regression models? Groups similar except for the treatment under study: use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

Mixed-Models. version 30 October 2011

Mixed models estimate a vector β of fixed effects and one (or more) vectors u of random effects. Both fixed and random effects models always include a vector

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data. Today's class: 1. Examples of count and ordered

Confirmatory Factor Analysis: Model comparison, respecification, and more. Psychology 588: Covariance structure and factor models

Model comparison: Essentially all goodness of fit indices are descriptive,

Introduction and Background to Multilevel Analysis

Dr. J. Kyle Roberts, Southern Methodist University, Simmons School of Education and Human Development, Department of Teaching and Learning. Background and

Global Model Fit Test for Nonlinear SEM

Rebecca Büchner, Andreas Klein, & Julien Irmer, Goethe-University Frankfurt am Main. Meeting of the SEM Working Group, 2018. Nonlinear SEM Measurement Models: Structural

Sex-limited expression of genetic or environmental

Multivariate Genetic Analysis of Sex Limitation and G × E Interaction. Michael C. Neale (1), Espen Røysamb (2), and Kristen Jacobson (3). (1) Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth

Ability Bias, Errors in Variables and Sibling Methods. James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006

1 Ability Bias. Consider the model: log = 0 + 1 + where =income, = schooling,

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

EPSY 905: Multivariate Analysis, Lecture 1, 20 January 2016.

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised )

Ronald H. Heck, University of Hawai'i at Mānoa. Handout #20: Specifying Latent Curve and Other Growth Models Using Mplus (Revised 12-1-2014). The SEM approach offers a contrasting framework for use in analyzing

Multivariate Regression Models in R: The mcglm package

Prof. Wagner Hugo Bonat. R Day, Laboratório de Estatística e Geoinformação - LEG, Universidade Federal do Paraná - UFPR, May 15, 2018. Introduction

Expression Data Exploration: Association, Patterns, Factors & Regression Modelling

Exploring gene expression data: Scale factors, median chip correlation on gene subsets for crude data quality investigation

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated Areal Data. December 8, 2009. Introduction, GEEs, Spatial Estimating Equations, Implementation, Simulation, Conclusion. Typical Problem: Assess the relationship

SEM with observed variables: parameterization and identification. Psychology 588: Covariance structure and factor models

Limitations of SEM as a causal modeling: If an SEM model reflects the reality, the

SUPPLEMENTARY SIMULATIONS & FIGURES

Supplementary Material for Mixed Effects Models for Resampled Network Statistics Improve Statistical Power to Find Differences in Multi-Subject Functional Connectivity. Manjari Narayan,

Modeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17

Previously: Linkage Analysis of Relative Pairs. IBS Methods: compare observed and expected sharing. IBD Methods: account for frequency of shared

Resemblance among relatives

Introduction: Just as individuals may differ from one another in phenotype because they have different genotypes, because they developed in different environments, or both, relatives

PQL Estimation Biases in Generalized Linear Mixed Models

Woncheol Jang, Johan Lim. March 18, 2006. Abstract: The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

Introduction to Within-Person Analysis and RM ANOVA

Today's Class: From between-person to within-person; ANOVAs for longitudinal data; variance model comparisons using -2 LL. CLP 944: Lecture 3. The Two Sides

STA6938-Logistic Regression Model

Dr. Ying Zhang. Topic 2: Multiple Logistic Regression Model. Outline: 1. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of

Lecture 12: Effect modification, and confounding in logistic regression

Ani Manichaikul, amanicha@jhsph.edu. 4 May 2007. Today: Categorical predictor: create dummy variables, just like for linear regression

Basics of Modern Missing Data Analysis

Kyle M. Lang, Center for Research Methods and Data Analysis, University of Kansas. March 8, 2013. Topics to be Covered: An introduction to the missing data problem; Missing

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

PSYC 943 (930): Fundamentals of Multivariate Modeling. Lecture 1: August 22, 2012

Affected Sibling Pairs. Biostatistics 666

Biostatistics 666. Today: Discussion of linkage analysis using affected sibling pairs. Our exploration will include several components we have seen before: A simple disease model, IBD

Latent variable interactions

Bengt Muthén & Tihomir Asparouhov, Mplus, www.statmodel.com. November 2, 2015. 1 Latent variable interactions: Structural equation modeling with latent variable interactions has

Proportional Variance Explained by QTL and Statistical Power

Partitioning the Genetic Variance: We previously focused on obtaining variance components of a quantitative trait to determine the proportion

What is in the Book: Outline

Estimating and Testing Latent Interactions: Advancements in Theories and Practical Applications. Herbert W Marsh, Oxford University; Zhonglin Wen, South China Normal University; Hong Kong Examinations Authority

Mixed-Model Estimation of genetic variances. Bruce Walsh lecture notes Uppsala EQG 2012 course version 28 Jan 2012

Estimation of Var(A) and Breeding Values in General Pedigrees. The above designs (ANOVA, P-O

Lecture 6: Hypothesis Testing

Mauricio Sarrias, Universidad Católica del Norte. November 6, 2017. 1 Moran's I Statistic. Mandatory Reading: Moran's I based on Cliff and Ord (1972); Kelejian and Prucha (2001)

Variance Components: Phenotypic, Environmental and Genetic

You should keep in mind that the Simplified Model for Polygenic Traits presented above is very simplified. In many cases, polygenic or quantitative

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

Aasthaa Bansal, PhD, Pharmaceutical Outcomes Research & Policy Program, University of Washington. Biomarkers

AMS-207: Bayesian Statistics

Linear Regression: How does a quantity y vary as a function of another quantity, or vector of quantities, x? We are interested in p(y | θ, x) under a model in which n observations (x_i, y_i) are exchangeable.

Lecture WS Evolutionary Genetics Part I 1

Quantitative genetics: Quantitative genetics is the study of the inheritance of quantitative/continuous phenotypic traits, like human height and body size, grain colour in winter wheat or beak depth in

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES

Saurabh Ghosh, Human Genetics Unit, Indian Statistical Institute, Kolkata. Most common diseases are caused by