Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables


/4/04 Structural Equation Modeling and Confirmatory Factor Analysis
Advanced Statistics for Researchers, Session 3
Dr. Chris Rakes
Website: http://csrakes.yolasite.com  Email: Rakes@umbc.edu  Twitter: @RakesChris

Types of Variables
Nominal: names, categories, ID numbers
Ordinal: ranks
Interval: dichotomous, polytomous (no absolute zero)
Ratio: measurements, scalars (absolute zero)

Describing Data by the Center
Example data set: 50, 10, 1, 7, 2, 25, 20
Mean (center value): x̄ = Σx/n = (50 + 10 + 1 + 7 + 2 + 25 + 20)/7 = 115/7 ≈ 16.4
Median (center term): sort the data (1, 2, 7, 10, 20, 25, 50) and take the middle term: 10
Mode: the most often repeated term(s)

Degrees of Freedom
The number of independent observations. Consider a group of 4 observations whose mean is 20 (sum = 80); we estimate μ = 20. In the next sample, we've already estimated the population mean to be 20, so the 4 data points must sum to 80. The first 3 observations are free to be anything, but the fourth must be fixed to make the sum 80:
Free + Free + Free + Fixed = 80
So we always lose a degree of freedom when we estimate a parameter.
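As a quick check, the center measures can be computed with Python's standard library (a sketch using the example data set above):

```python
# Mean and median of the example data set, using the stdlib statistics module.
from statistics import mean, median

data = [50, 10, 1, 7, 2, 25, 20]

print(mean(data))    # 115/7 ≈ 16.43
print(median(data))  # sorted: 1, 2, 7, 10, 20, 25, 50; middle term is 10
```

With an odd number of observations the median is the single center term; with an even number, `median` averages the two center terms.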

Variance and Standard Deviation
Consider a sample data set: 59, 63, 71, 67, 64, 72 (x̄ = 66).
Difference from the mean (X - x̄): -7, -3, 5, 1, -2, 6
If the sum of the differences is 0, how can we compute a meaningful average?
[Figure: a right triangle with legs a and b and hypotenuse c. Notice that a + b > c.]

Enter Pythagoras
a² + b² = c²
So square areas can be used to calculate distance, and they eliminate the zero-sum problem.

Returning to the Sample Data
X: 59, 63, 71, 67, 64, 72
Distance (X - x̄): -7, -3, 5, 1, -2, 6
(Distance)²: 49, 9, 25, 1, 4, 36
Σ(X - x̄)² = 124
Let's look at a picture of these squares.

[Figure: the squared deviations drawn to scale around x̄ = 66 on the number line.]
How can we find the side length of the average square? The average squared deviation is 124/5 = 24.8 (the variance), so the side length is √24.8 ≈ 4.98. Why 5 and not 6? Because one degree of freedom was spent estimating the mean.
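The variance and standard-deviation arithmetic above can be verified in a few lines (a sketch using the sample data from this example):

```python
# Variance as the average squared distance from the mean,
# dividing by df = n - 1 (one df is spent estimating the mean).
data = [59, 63, 71, 67, 64, 72]

m = sum(data) / len(data)               # 66.0
squares = [(x - m) ** 2 for x in data]  # 49, 9, 25, 1, 4, 36
ss = sum(squares)                       # 124
variance = ss / (len(data) - 1)         # 124 / 5 = 24.8
sd = variance ** 0.5                    # ≈ 4.98, the "side length"

print(m, ss, variance, round(sd, 2))
```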

What does that get us? The average square of mean distances is referred to as the variance:
σ̂² = Σ(X - x̄)² / (n - 1)
This noise gives us a measure of how much of the data is not represented by the mean.
Standard deviation: the side length of the variance square; the average distance from the mean:
σ̂ = √[ Σ(X - x̄)² / (n - 1) ]

Two Variables: Lines of Best Fit
Linear regression (least squares regression) asks: how far do points regress from a line? (Regress = deviate from.) For two variables, we begin by considering the dependent variable: the average distance of each Y from ȳ.
X: 11, 6, 5, 8, 9, 4, 9, 12
Y: 9, 1, 2, 3, 5, 2, 4, 6
Y - ȳ (with ȳ = 4): 5, -3, -2, -1, 1, -2, 0, 2 (sum = 0)

Converting to Squares
Y: 9, 1, 2, 3, 5, 2, 4, 6   (N = 8, df = 7, ȳ = 4)
Y - ȳ: 5, -3, -2, -1, 1, -2, 0, 2   (sum = 0)
(Y - ȳ)²: 25, 9, 4, 1, 1, 4, 0, 4
SS_Y = Σ(Y - ȳ)² = 48
So the average-sized square of variance is SS_Y / df = 48/7 ≈ 6.9, and the average distance from the mean is √(48/7) ≈ 2.6.

Do the same for X:
X: 11, 6, 5, 8, 9, 4, 9, 12   (N = 8, df = 7, x̄ = 8)
X - x̄: 3, -2, -3, 0, 1, -4, 1, 4   (sum = 0)
(X - x̄)²: 9, 4, 9, 0, 1, 16, 1, 16
SS_X = Σ(X - x̄)² = 56
SS_X / df = 56/7 = 8; √8 ≈ 2.83

Correlation

 X  |  Y | X - x̄ | Y - ȳ | (X - x̄)² | (Y - ȳ)² | (X - x̄)(Y - ȳ)
 11 |  9 |   3   |   5   |    9     |    25    |      15
  6 |  1 |  -2   |  -3   |    4     |     9    |       6
  5 |  2 |  -3   |  -2   |    9     |     4    |       6
  8 |  3 |   0   |  -1   |    0     |     1    |       0
  9 |  5 |   1   |   1   |    1     |     1    |       1
  4 |  2 |  -4   |  -2   |   16     |     4    |       8
  9 |  4 |   1   |   0   |    1     |     0    |       0
 12 |  6 |   4   |   2   |   16     |     4    |       8
  Σ |    |       |       |   56     |    48    |      44

r_XY = s_XY / (s_X s_Y) = SP_XY / √(SS_X · SS_Y) = 44 / √(56 × 48) = 44 / 51.846 ≈ 0.849
How strong is the relationship between X and Y?

Estimate the Line of Best Fit
Ŷᵢ = b₀ + b₁Xᵢ
b₁ = SP_XY / SS_X = 44/56 ≈ 0.786
b₀ = ȳ - b₁x̄ = 4 - 0.786(8) = -2.288
So Ŷᵢ = -2.288 + 0.786Xᵢ

Use the Linear Equation Information
With Ŷᵢ = -2.288 + 0.786Xᵢ:

 X  |  Y |   Ŷ   | Y - Ŷ  | (Y - Ŷ)²
 11 |  9 | 6.358 |  2.642 |   6.98
  6 |  1 | 2.428 | -1.428 |   2.04
  5 |  2 | 1.642 |  0.358 |   0.13
  8 |  3 | 4.000 | -1.000 |   1.00
  9 |  5 | 4.786 |  0.214 |   0.05
  4 |  2 | 0.856 |  1.144 |   1.31
  9 |  4 | 4.786 | -0.786 |   0.62
 12 |  6 | 7.144 | -1.144 |   1.31

Σ(Y - Ŷ)² = 13.43

How much of an effect did the regression have?
SS_regression = SS_Y - Σ(Y - Ŷ)² = 48 - 13.43 = 34.57
r²_XY = SS_regression / SS_Y = 34.57 / 48.00 ≈ 0.72
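The whole worked example can be reproduced in plain Python (a sketch using the eight (X, Y) pairs above; note the slide rounds b₁ to 0.786 before computing b₀, so its intercept of -2.288 differs slightly from the exact -2.286):

```python
# Reproduce the regression example: sums of squares, r, slope, intercept, r².
xs = [11, 6, 5, 8, 9, 4, 9, 12]
ys = [9, 1, 2, 3, 5, 2, 4, 6]
n = len(xs)

mx, my = sum(xs) / n, sum(ys) / n                       # 8.0 and 4.0
ss_x = sum((x - mx) ** 2 for x in xs)                   # 56
ss_y = sum((y - my) ** 2 for y in ys)                   # 48
sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # 44

r = sp / (ss_x * ss_y) ** 0.5                           # ≈ 0.849
b1 = sp / ss_x                                          # ≈ 0.786
b0 = my - b1 * mx                                       # ≈ -2.286 (exact)

# Error sum of squares around the fitted line, and r² two ways.
ss_err = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_squared = (ss_y - ss_err) / ss_y                      # ≈ 0.72, equals r ** 2
```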

SEM
*Causal* processes can be represented by structural equations (regression equations: dependent variables being predicted by independent variables). A model of these structural relations can be generated (and modeled pictorially).

SEM Variables
Observed (manifest, measured) variables: the Xs or Ys.
Latent variables (factors): constructs that cannot be directly observed (or measured). Latent variables are estimated through hypothesized relationships with observed variables.
Exogenous latent variables: independent variables that cause changes in other latent variables in the model. These are taken as given by the model under consideration, and any changes in exogenous variables are due to factors outside the model.
Endogenous latent variables: dependent variables that are influenced by exogenous variables in the model. These are the outcomes your SEM model wishes to explain.
[Path diagram: observed Xs load on an exogenous latent factor, which predicts an endogenous latent factor (with a residual); the endogenous factor's loadings point to observed Ys, each with an error term.]

Factor Analysis
Used to identify the factor structure or model for a set of variables (Stevens). Two types: exploratory (EFA) and confirmatory (CFA).

Exploratory Factor Analysis
Several extraction methods:
- Principal components analysis (PCA): each successive component accounts for the largest amount of unexplained variance.
- Principal axis factoring: identical to PCA, except that the factors are extracted from a correlation matrix with communality estimates on the main diagonal rather than 1s, as in PCA.
- Unweighted least squares: minimizes the sum of squared differences between the observed and model-implied off-diagonal correlation matrices.
- Generalized least squares: correlations are weighted by the inverse of their uniqueness; high uniqueness receives less weight.
- Alpha: maximizes the Cronbach's alpha of the factors (i.e., reliability).
- Image: factors are defined by their linear regression on variables not associated with the hypothetical factors.
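The PCA idea (each successive component claims the largest share of remaining variance) is just an eigendecomposition of the correlation matrix. A minimal sketch, with a made-up 3 × 3 correlation matrix for illustration:

```python
# PCA-style extraction: eigendecomposition of a correlation matrix.
# R below is invented for illustration; it is not from the workshop data.
import numpy as np

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)           # returned in ascending order
order = np.argsort(eigvals)[::-1]              # largest component first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# For a correlation matrix the eigenvalues sum to p, so each eigenvalue's
# share is the proportion of total variance that component accounts for.
explained = eigvals / eigvals.sum()
loadings = eigvecs * np.sqrt(eigvals)          # component loadings
print(explained)
```

Principal axis factoring would differ only in replacing the 1s on the diagonal of `R` with communality estimates before the decomposition.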

Maximum Likelihood Estimation
Attempts to find the population parameter values from which the observed data are most likely to have arisen. The likelihood function quantifies the discrepancy between the observed and model-implied variances and covariances, assuming a multivariate normal distribution. Closed-form solutions for the parameters usually do not exist, so iterative algorithms are used in practice for parameter estimation.

The Model-Fitting Process
Let S = the sample variance/covariance matrix of observed scores from p variables. Let Σ = the variance/covariance matrix of the population. Let θ represent the vector of model parameters, so that Σ(θ) represents the restricted variance/covariance matrix implied by the model. We are testing the hypothesis that the restricted matrix holds in the population:
Null hypothesis: Σ = Σ(θ)
SEM computes a minimum discrepancy function, F_min.

Understanding the F_min Function
F_min = trace(S Σ(θ)⁻¹) - p + log|Σ(θ)| - log|S|
Trace: the sum of the diagonal of a matrix. An inverse matrix times itself equals the identity matrix (I), so as Σ(θ) approaches S, Σ(θ)⁻¹S approaches I and its trace approaches the number of observed variables, p; the difference between the trace and p approaches 0. Likewise, as Σ(θ) approaches S, the difference log|Σ(θ)| - log|S| approaches 0.

Maximum Likelihood Estimation (cont'd.)
The shape of the multivariate normal curve is defined by the likelihood:
Lᵢ = (2π)^(-p/2) |Σ|^(-1/2) exp[ -½ (xᵢ - μ)' Σ⁻¹ (xᵢ - μ) ]
Substituting an individual's vector of scores xᵢ yields the likelihood of that set of scores given the population mean vector μ and covariance matrix Σ.
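The behavior of the discrepancy function is easy to see numerically. A sketch with a made-up 2 × 2 sample covariance matrix (not from the workshop data):

```python
# F_ML = trace(S @ inv(Sigma)) - p + log|Sigma| - log|S|.
import numpy as np

def f_ml(S, Sigma):
    """ML discrepancy between a sample and a model-implied covariance matrix."""
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_S = np.linalg.slogdet(S)
    return (np.trace(S @ np.linalg.inv(Sigma)) - p
            + logdet_Sigma - logdet_S)

S = np.array([[2.0, 0.8],
              [0.8, 1.5]])

print(f_ml(S, S))                     # 0: a saturated model fits perfectly
print(f_ml(S, np.diag(np.diag(S))))   # > 0: forcing the covariance to 0 costs fit
```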

Maximum Likelihood Estimation (cont'd.)
A model's final parameter estimates are those that yield model-implied variances and covariances (and means) that maximize the combined likelihood of all n cases:
L = L₁ · L₂ · … · Lₙ

Casewise Log-Likelihoods
Likelihoods tend to be very small numbers, and hence their products become practically infinitesimal. Taking the natural log of the likelihood makes things a bit more manageable, turning the product into a sum:
log L = log L₁ + log L₂ + … + log Lₙ

Casewise Log-Likelihoods (cont'd.)
With complete data, each case's contribution to the overall log-likelihood (LL) is:
LLᵢ = -(p/2) log(2π) - ½ log|Σ| - ½ (xᵢ - μ)' Σ⁻¹ (xᵢ - μ)
In the missing-data context, each case's contribution to the log-likelihood is:
LLᵢ = -(pᵢ/2) log(2π) - ½ log|Σᵢ| - ½ (xᵢ - μᵢ)' Σᵢ⁻¹ (xᵢ - μᵢ)
The data and parameter arrays can vary for each ith case: the ith case's contribution to the overall likelihood is based only on those variables for which that case has complete data.

Maximum Likelihood in SEM
A model's final parameter estimates are those that yield model-implied variances and covariances (and means) that maximize the aggregated casewise log-likelihoods:
LL = Σᵢ LLᵢ
In FIML (full-information maximum likelihood), no data are ever imputed. Parameters and their standard errors are estimated directly using all observed data. FIML is the default in many software packages (e.g., Mplus, Amos).
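The FIML idea (each case contributes only through the variables it actually has) can be sketched directly from the formulas above. The mean vector, covariance matrix, and cases below are invented for illustration:

```python
# Casewise log-likelihood under the multivariate normal; a case with a
# missing variable contributes through the observed submatrix only.
import numpy as np

def ll_case(x, mu, Sigma):
    """Log-likelihood contribution of one case with data vector x."""
    p = len(x)
    dev = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return (-p / 2 * np.log(2 * np.pi) - logdet / 2
            - dev @ np.linalg.inv(Sigma) @ dev / 2)

mu = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

# A complete case uses the full mu and Sigma.
complete = ll_case(np.array([0.3, -0.2]), mu, Sigma)

# A case missing variable 2 uses only the rows/columns for variable 1.
obs = [0]                                    # indices observed for this case
partial = ll_case(np.array([0.3]), mu[obs], Sigma[np.ix_(obs, obs)])

total_ll = complete + partial                # a sum, not a product, on the log scale
```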

Confirmatory Factor Analysis
CFA cannot be run easily in basic statistics packages such as SPSS: they do not offer the option to force variables to load on particular factors, only to set the number of factors. SEM software easily accommodates CFA models, e.g., Mplus, AMOS, EQS, LISREL.

Psychological Distress CFA
[Diagrams: a first-order CFA and a second-order CFA of caregiver psychological distress.]

Psychological Distress CFA Results

Model | Model description                  |  N  |  AIC  | DF | Chi-square |  CFI  | RMSEA | RMSEA LO90 | RMSEA HI90 | SRMR  | ECVI
0a    | Caregiver psychological distress   |  7  | 898.  | 03 |   83.8     | 0.90  | 0.088 |   0.076    |   0.00     | 0.049 | 36.56
0a    | 0a with Q030 and Q03 covaried      |  7  | 836.7 | 0  |    9.97    | 0.935 | 0.07  |   0.058    |   0.084    | 0.044 | 36.8
0a3   | 2nd-order CFA built on 0a          |  7  | 838.7 | 0  |    9.97    | 0.935 | 0.07  |   0.059    |   0.085    | 0.044 | 36.9

Variable | Criterion
Minimum fit χ² | Nested model comparison
CFI (Comparative Fit Index) | > 0.95
AIC (Akaike Information Criterion) | Model comparison only (models do not have to be nested); smaller value = better fit
SRMR (Standardized Root Mean Square Residual) | < 0.10 = reasonable fit; < 0.08 = good fit
RMSEA (Root Mean Square Error of Approximation) | < 0.05 = good fit; 0.05 to 0.08 = reasonable; 0.08 to 0.10 = mediocre; > 0.10 = poor fit
ECVI | As the model is changed, a smaller value indicates greater likelihood of being generalizable in the population

Reflection on CFA
What is your dissertation/thesis conceptual framework?
Are the constructs in your framework well-defined, and are the definitions well-established?
Could a CFA strengthen your study? Why or why not?
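The RMSEA bands in the criteria table translate directly into a small helper (a sketch; the labels and the boundary handling at 0.05, 0.08, and 0.10 follow the table above):

```python
# Qualitative RMSEA bands from the fit-criteria table.
def rmsea_label(rmsea):
    """Map an RMSEA value to the workshop's qualitative fit band."""
    if rmsea < 0.05:
        return "good"
    if rmsea <= 0.08:
        return "reasonable"
    if rmsea <= 0.10:
        return "mediocre"
    return "poor"

print(rmsea_label(0.045), rmsea_label(0.072), rmsea_label(0.09), rmsea_label(0.12))
```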

Thank You!
All materials from this workshop series can be downloaded at http://csrakes.yolasite.com/resource.php