Consequences of measurement error. Psychology 588: Covariance structure and factor models

Similar documents
General structural model Part 1: Covariance structure and identification. Psychology 588: Covariance structure and factor models

Problem Set #6: OLS. Economics 835: Econometrics. Fall 2012

SEM with observed variables: parameterization and identification. Psychology 588: Covariance structure and factor models

SEM 2: Structural Equation Modeling

4. Distributions of Functions of Random Variables

SEM 2: Structural Equation Modeling

Structural Equation Modeling An Econometrician s Introduction

Measurement Theory. Reliability. Error Sources. = XY r XX. r XY. r YY

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

Online Appendices for: Modeling Latent Growth With Multiple Indicators: A Comparison of Three Approaches

Estimation of Parameters

Confirmatory Factor Analysis: Model comparison, respecification, and more. Psychology 588: Covariance structure and factor models

Final Exam. Economics 835: Econometrics. Fall 2010

THE GENERAL STRUCTURAL EQUATION MODEL WITH LATENT VARIATES

Factor Analysis. Qian-Li Xue

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

First Year Examination Department of Statistics, University of Florida

Random Vectors, Random Matrices, and Matrix Expected Value

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Financial Econometrics

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

Introduction to Structural Equations

ECON 3150/4150, Spring term Lecture 7

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Discussion of Sensitivity and Informativeness under Local Misspecification

Bayesian Analysis of Latent Variable Models using Mplus

ECON The Simple Regression Model

The returns to schooling, ability bias, and regression

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53

Exploring Cultural Differences with Structural Equation Modelling

Simple Linear Regression Analysis

Psychology 282 Lecture #3 Outline

Econometrics. 7) Endogeneity

Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

Lecture 9 SLR in Matrix Form

Single-Equation GMM: Endogeneity Bias

Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

AN INTRODUCTION TO STRUCTURAL EQUATION MODELING

Omitted Variable Bias, Coefficient Stability and Other Issues. Chase Potter 6/22/16

Multiple Linear Regression

Confirmatory Factor Analysis. Psych 818 DeShon

6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses.

Econometrics Summary Algebraic and Statistical Preliminaries

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Steps in Regression Analysis

Applied Statistics and Econometrics

Lecture 14 Simple Linear Regression

RESMA course Introduction to LISREL. Harry Ganzeboom RESMA Data Analysis & Report #4 February

Gov 2000: 9. Regression with Two Independent Variables

The Delta Method and Applications

Inference using structural equations with latent variables

ECON 3150/4150, Spring term Lecture 6

Chapter 8. Models with Structural and Measurement Components. Overview. Characteristics of SR models. Analysis of SR models. Estimation of SR models

What is in the Book: Outline

Bivariate distributions

Reliability-Constrained Latent Structure Models

Statistics 910, #5 1. Regression Methods

Two-Variable Regression Model: The Problem of Estimation

AGEC 661 Note Fourteen

Using Structural Equation Modeling to Conduct Confirmatory Factor Analysis

Variance reduction. Michel Bierlaire. Transport and Mobility Laboratory. Variance reduction p. 1/18

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models

4. Path Analysis. In the diagram: The technique of path analysis is originated by (American) geneticist Sewell Wright in early 1920.

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Advanced Econometrics

A note on structured means analysis for a single group. André Beauducel 1. October 3 rd, 2015

Nesting and Equivalence Testing


Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed

Week 9 The Central Limit Theorem and Estimation Concepts

Stat 206: Sampling theory, sample moments, mahalanobis

2 Introduction to Response Surface Methodology

Economics 240A, Section 3: Short and Long Regression (Ch. 17) and the Multivariate Normal Distribution (Ch. 18)

Linear Regression with Multiple Regressors

Review of Econometrics

A Probability Review

Oct Simple linear regression. Minimum mean square error prediction. Univariate. regression. Calculating intercept and slope

Lecture # 31. Questions of Marks 3. Question: Solution:

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49

Regression Models - Introduction

Introduction to Confirmatory Factor Analysis

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Moderation 調節 = 交互作用

An Introduction to Bayesian Linear Regression

Latent variable interactions

Estimating Three-way Latent Interaction Effects in Structural Equation Modeling

Introduction to Maximum Likelihood Estimation

Applied Econometrics (QEM)

4.8 Instrumental Variables

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ABSTRACT. Chair, Dr. Gregory R. Hancock, Department of. interactions as a function of the size of the interaction effect, sample size, the loadings of

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

MAT2377. Rafa l Kulik. Version 2015/November/26. Rafa l Kulik

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Module 3. Latent Variable Statistical Models. y 1 y2

GLS and FGLS. Econ 671. Purdue University. Justin L. Tobias (Purdue) GLS and FGLS 1 / 22

Transcription:

Consequences of measurement error Psychology 588: Covariance structure and factor models

Scaling indeterminacy of latent variables Scale of a latent variable is arbitrary and determined by a convention for convenience Typically set to variance of one (factor analysis convention) or to be identical to an arbitrarily chosen indicator s scale By centering indicator variables, we set latent variables means to zero Consider the following transformation: x j J a b b j j j j,,...,,, 0 a j j j j b b

If all J indicators are considered simultaneously, vector notation is more convenient: x νλ δ, a b a ν λ λ δ b b meaning that the linear transformation of ξ can be exactly compensated in the accordingly transformed ν = ν λa/b and λ = λ/b, leaving the errors δ unchanged (i.e., same fit)

What s great about measurement errors in equation 4 Regression weights and correlations are interpreted, implicitly assuming that the operationally defined variables involve no measurement error --- hardly realized for theoretical constructs (e.g., self esteem, IQ, etc.) Ignoring the measurement error will lead to inconsistent estimates We will see consequences of ignoring measurement errors

Univariate consequences 5 Consider a mean-included equation for X (hours worked per week) to indicate ξ (achievement motivation): X, E, E 0, E 0 E X X var X var Given only one indicator per latent variable, the intercept and loading (i.e., weight) are simply scaling constants for ξ However, if the ξ scale is set comparable to the X scale (i.e., λ = ), we see that var(x) is an over-estimation of ϕ = var(ξ) if δ is not included in the equation

Bivariate relation and simple regression 6 True data structure: x y η: job satisfaction y: satisfaction scale xi x d gamma eta y e zeta cov(x, y) is unbiased estimate of cov(ξ, η) with λ = λ =, since no other variables (δ and ε) can explain cov(x, y) cov, cov, cov xy, cov,

From the previous equations, and by analogy with y = γ x + ζ if measurement errors are ignored, cov, cov xy, var x var The parenthesized ratio (reliability) becomes only with no measurement error; otherwise, γ is an attenuated estimate of γ and s s is an inconsistent estimator of γ, ˆ xy xx If the bias of regression weight has an additional factor as --- but such scaling is unusual xx when there is only one indicator per latent variable xx

Correlations: cov, var var xy cov xy, var x var y var x var y var var x var y var xx yy which shows an attenuation of the true correlation due to measurement error, with the familiar correction formula: xy xx yy

Consequences in multiple regression 9 True data structure: γξ x ξδ y with Λ x = I and λ y = d d d3 x x x3 xi xi xi3 g g g3 eta zeta y e Ignoring measurement errors: y γ x σ cov ξ, cov ξ, ξγ Φγ σ cov x, y cov ξδ, ξγ Φγ xy

γ Φ σ and by analogy with γ x, γ Σxxσ xy ΣxxΦγ Φ Θ Φγ y Without measurement error (Θ δ = 0), γ γ ; otherwise, γ γ Alternatively written: since --- where i Σ Σ xx x γ is the OLS estimator of B in i.e., regression weights for prediction of ξ by x Σ Σ γ xx x Σ Φ x ξ Bx e, Again, without measurement error, Σ Σ xx x I Note: in Bollen (pp. 59-68), are meant to be respectively, for the multiple regression model ΓΣ,, Σ γσ,,, xy σxy

As a very simplified case, suppose x is the only fallible as: x x, i,..., q i i with the true and estimated regression equations: x q q q In this special case, the regression weight matrix has a simple multiplicative form of bias (hint: use ): Σ Σ Φ Θ Φ I c 0 xx x q q c q Φ Φ Θ c 0 I,

Consequently, resulting bias factors are: i b x q b,,, i q i x i Bias-factor for x is less than in absolute value ( without measurement error), and so is biased toward 0 --- the bias factor indicates regression weight b in ξ = b 0 + b x + b ξ + + b q ξ q ~ i b x q Consequences for x i, i =,,q are additive, depending on relationships between ξ and ξ i holding all other IVs constant, and γ

So far all reasoning is based on rather unrealistic assumptions: Only single indicator per latent variable, and so its loading becomes simply scaling constant Only one fallible IV Without such assumptions (e.g., all IVs fallible), consequences of measurement error become too complicated and hard to simplify algebraically --- no particular simple form of aσ Σ One clear conclusion: all estimates are inconsistent --- systematically different from what they meant to be xx x

Consequence in standardization: standardized ii var i i i var var Consequence in SMC is similar to the bivariate case: plim R plimr What should we do with essentially omnipresent measurement error? Use SEM which allows for measurement errors in the model --- though we are limited in certain models regarding the model identification (e.g., Table 5., p. 64)

Correlated errors of measurement 5 Consequence in regression weights further complicated: γ Σ Σ γ Σ σ xx x xx For simple regression: Now, is not necessarily < γ cov, xx var x If correlated measurement errors are only within IVs (i.e., σ δε = 0, Σ xx = Φ + Θ σ where Θ σ is not diagonal), γ Σ Σ γ still holds (but the bias factor will have a more complicated form, also involving off-diagonal entries of Θ σ ) xx x

With multi-equations 6 In path models with sequential causal paths, consequences of measurement errors very hard to simply generalize --- see the union sentiment (Fig. 5., p. 69) and SES (Fig. 5.4, p. 73) examples If reliabilities are known, the corresponding error variances can be constrained; if unknown, the error variances may be modeled as free parameters provided that they are identifiable To keep in mind: we need more than one indicator per latent variable for identifiability and statistical testing --- leading to measurement models with multiple indicators or CFA