Chapter 4: Factor Analysis

In many studies, we may not be able to measure the variables of interest directly. We can merely collect data on other variables which may be related to the variables of interest. The goal of factor analysis (FA) is to relate the unobservable latent variables of interest to the observed manifest variables. The technique used to relate the latent variables (often called factors) to the manifest variables is similar to multiple regression. The estimation of the regression coefficients (called loadings in this context) is less straightforward, however.

A Factor Analysis Example: The Wechsler Adult Intelligence Study

The Wechsler Adult Intelligence Scale (WAIS) series of tests measures participants' scores on 11 different tests. The multivariate data set consisted of 13 variables: these 11 test scores, plus age and years of education. Based on the observed variables, we may want to identify certain underlying factors that cause the individuals to differ. Is there a general intelligence factor? Is there a language ability factor? Is there a math ability factor? Factor analysis can help us answer these questions.

Our Factor Analysis Model

Our factor analysis model assumes that we can explain the correlations among the manifest variables through these variables' relationships with the latent variables. The q manifest variables are denoted $x_1, x_2, \ldots, x_q$. The k latent variables, or factors (where $k < q$), are denoted $f_1, f_2, \ldots, f_k$. We relate them via a series of regression equations:

$$x_1 = \lambda_{11} f_1 + \lambda_{12} f_2 + \cdots + \lambda_{1k} f_k + u_1$$
$$x_2 = \lambda_{21} f_1 + \lambda_{22} f_2 + \cdots + \lambda_{2k} f_k + u_2$$
$$\vdots$$
$$x_q = \lambda_{q1} f_1 + \lambda_{q2} f_2 + \cdots + \lambda_{qk} f_k + u_q$$

The $\lambda_{ij}$ values (called loadings) show how much each manifest variable depends on the j-th factor. The loading values help in the interpretation of each factor.

Our Factor Analysis Model (continued)

We can write the regression equations in matrix notation:

$$x = \Lambda f + u, \qquad \text{where} \quad \Lambda = \begin{pmatrix} \lambda_{11} & \cdots & \lambda_{1k} \\ \vdots & \ddots & \vdots \\ \lambda_{q1} & \cdots & \lambda_{qk} \end{pmatrix}, \quad f = (f_1, \ldots, f_k)', \quad u = (u_1, \ldots, u_q)'.$$

The model assumes $u_1, \ldots, u_q$ are mutually independent and are independent of $f_1, \ldots, f_k$. The factors are unobserved, so we may assume they have mean 0 and variance 1, and that they are uncorrelated with each other.
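
To make the model concrete, here is a minimal simulation sketch in Python; the loading matrix, specific variances, and sample size below are hypothetical choices (not values from the WAIS study). It generates independent standard-normal factors and specific errors and forms the manifest variables as $x = \Lambda f + u$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loading matrix: q = 4 manifest variables, k = 2 factors
Lambda = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.1, 0.7],
                   [0.2, 0.8]])
psi = np.array([0.2, 0.3, 0.4, 0.3])            # specific variances, one per manifest variable
n, (q, k) = 1000, Lambda.shape

f = rng.standard_normal((n, k))                 # factors: mean 0, variance 1, uncorrelated
u = rng.standard_normal((n, q)) * np.sqrt(psi)  # specific errors, independent of the factors
x = f @ Lambda.T + u                            # manifest variables: x = Lambda f + u (row-wise)

print(np.cov(x, rowvar=False).round(2))         # sample covariance, approx Lambda Lambda' + Psi
```

The printed sample covariance matrix should be close to the covariance structure derived on the next pages.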

Partitioning the Variance of the Data Vectors

The communality $h_i^2$ is the part of the variance of manifest variable $x_i$ that is shared with the other variables (via the factors), and $\psi_i$ is the specific variance, the part not shared with the other variables. Under the model, $\mathrm{Var}(x_i) = h_i^2 + \psi_i$, where $h_i^2 = \sum_{j=1}^k \lambda_{ij}^2$.

Covariance of the Data Vectors

Hence the population covariance matrix $\Sigma$ of $(x_1, x_2, \ldots, x_q)$ is

$$\Sigma = \Lambda\Lambda' + \Psi, \qquad \text{where } \Psi = \mathrm{diag}(\psi_1, \ldots, \psi_q).$$
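
As a quick numerical check, using the same hypothetical $\Lambda$ and $\psi$ as in the simulation sketch above, the implied covariance matrix $\Lambda\Lambda' + \Psi$ has diagonal entries equal to the communalities plus the specific variances:

```python
import numpy as np

Lambda = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.1, 0.7],
                   [0.2, 0.8]])
psi = np.array([0.2, 0.3, 0.4, 0.3])

Sigma = Lambda @ Lambda.T + np.diag(psi)       # Sigma = Lambda Lambda' + Psi
h2 = (Lambda ** 2).sum(axis=1)                 # communalities h_i^2 = sum_j lambda_ij^2

print(np.allclose(np.diag(Sigma), h2 + psi))   # True: Var(x_i) = h_i^2 + psi_i
```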

Factor Analysis in Practice

If this decomposition of the covariance matrix holds, then the k-factor model is correct. In practice, $\Sigma$ is unknown and is estimated by $S$ (or the sample correlation matrix $R$ is used). So we need to find estimates of $\Lambda$ and $\Psi$ so that the sample covariance matrix can be approximately decomposed in this way: $S \approx \hat\Lambda\hat\Lambda' + \hat\Psi$. In practice, we also don't know the true value of k, the number of factors.

Methods of Estimating the Factor Analysis Model: Principal Factor Analysis

The principal factor analysis approach to estimation relies on estimating the communalities. It uses the reduced covariance matrix $S^* = S - \hat\Psi$. The diagonal elements of $S^*$ are $s_i^2 - \hat\psi_i = \hat h_i^2$, the (estimated) communality for the i-th variable. We could standardize the variables, which amounts to using the reduced correlation matrix $R^* = R - \hat\Psi$.

Estimating the Communalities

To estimate the $h_i^2$ values, we cannot use the factor loadings, since those have not been estimated yet. A more straightforward approach (when working with the correlation matrix) is one of the following:

1. Initially let $\hat h_i^2$ equal the $R^2$ value from a regression of $x_i$ on the other manifest variables. This is $1 - 1/r^{ii}$, where $r^{ii}$ is the i-th diagonal element of $R^{-1}$.

2. Initially let $\hat h_i^2$ equal the largest absolute correlation coefficient between $x_i$ and any other manifest variable.

In both of these approaches, a stronger association between $x_i$ and the other variables leads to a higher communality estimate $\hat h_i^2$. When working with the covariance matrix, we could base the communality estimates on the diagonal elements of $S^{-1}$ rather than $R^{-1}$.
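
A small sketch of the first initialization rule; the correlation matrix R below is hypothetical, standing in for the sample correlation matrix of the data. The initial communality for each variable is $1 - 1/r^{ii}$, its squared multiple correlation with the remaining variables:

```python
import numpy as np

# Hypothetical sample correlation matrix for q = 4 manifest variables
R = np.array([[1.00, 0.60, 0.30, 0.20],
              [0.60, 1.00, 0.25, 0.30],
              [0.30, 0.25, 1.00, 0.70],
              [0.20, 0.30, 0.70, 1.00]])

R_inv = np.linalg.inv(R)
h2_init = 1.0 - 1.0 / np.diag(R_inv)   # initial communalities: squared multiple correlations
print(h2_init.round(3))
```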

Using the Initial Communality Estimates

Once we have our initial $\hat h_i^2$ values, we can calculate $S^*$ (or $R^*$). We perform a principal components analysis of $S^*$ (or $R^*$), and the first k eigenvectors (scaled by the square roots of the corresponding eigenvalues) give the estimates of the factor loadings. These estimated loadings $\hat\lambda_{ij}$ can be used to obtain new communality estimates:

$$\hat h_i^2 = \sum_{j=1}^k \hat\lambda_{ij}^2.$$

We can re-form $S^*$ (or $R^*$) with the revised communality estimates, and repeat the process until the communality estimates converge. This works well unless a communality estimate becomes larger than the manifest variable's total variance, implying a negative specific variance, an impossibility.
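
Below is a sketch of the iterated principal factor procedure as described above, applied to the same hypothetical correlation matrix used earlier with k = 1 factor assumed. At each pass the reduced correlation matrix is formed, its leading eigenvectors scaled by the square roots of the eigenvalues give the loadings, and the communalities are updated until they stabilize.

```python
import numpy as np

def principal_factor(R, k, n_iter=50, tol=1e-6):
    """Iterated principal factor analysis on a correlation matrix R."""
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))    # initial communalities (squared multiple correlations)
    for _ in range(n_iter):
        R_star = R.copy()
        np.fill_diagonal(R_star, h2)              # reduced correlation matrix R* = R - Psi_hat
        eigvals, eigvecs = np.linalg.eigh(R_star)
        idx = np.argsort(eigvals)[::-1][:k]       # k largest eigenvalues
        # loadings: eigenvectors scaled by sqrt of eigenvalues (negative eigenvalues clipped to 0)
        Lambda = eigvecs[:, idx] * np.sqrt(np.clip(eigvals[idx], 0, None))
        h2_new = (Lambda ** 2).sum(axis=1)        # revised communality estimates
        if np.max(np.abs(h2_new - h2)) < tol:
            h2 = h2_new
            break
        h2 = h2_new
    return Lambda, 1.0 - h2                       # loadings and specific variances

R = np.array([[1.00, 0.60, 0.30, 0.20],
              [0.60, 1.00, 0.25, 0.30],
              [0.30, 0.25, 1.00, 0.70],
              [0.20, 0.30, 0.70, 1.00]])
Lambda_hat, psi_hat = principal_factor(R, k=1)
print(Lambda_hat.round(3), psi_hat.round(3))
```

If an updated communality exceeds 1 here, the corresponding specific variance estimate goes negative, which is exactly the failure mode mentioned above.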

Maximum Likelihood Factor Analysis

Maximum likelihood (ML) is a general method of estimating parameters in a statistical model. Classical ML requires an assumption about the form of the distribution of the data. If we can assume we have multivariate normal data, we can motivate maximum likelihood estimation of our k-factor model. Recall that the observed sample covariance matrix is $S$ and, under the factor analysis model, the true covariance matrix is $\Sigma = \Lambda\Lambda' + \Psi$. The goodness of fit of the k-factor model can be judged by a distance measure $F$ between the sample covariance matrix and the covariance matrix predicted under the model.

The Distance Measure and Maximum Likelihood

Let

$$F = \ln|\Lambda\Lambda' + \Psi| + \mathrm{trace}\left(S[\Lambda\Lambda' + \Psi]^{-1}\right) - \ln|S| - q.$$

This distance measure equals zero if $S = \Lambda\Lambda' + \Psi$, and $F$ is large when $S$ is far from $\Lambda\Lambda' + \Psi$. We can calculate (for a given data set) the elements of $\Lambda$ and $\Psi$ that make $F$ as small as possible. This implies we have estimates of the communalities $h_1^2, \ldots, h_q^2$ and the specific variances $\psi_1, \ldots, \psi_q$. Under the assumption of multivariate normality, the log-likelihood is $L = -0.5\,nF$ plus a function of the data. Hence minimizing $F$ is equivalent to maximizing $L$. This method could also produce negative estimates for the specific variances.
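
Here is a small sketch of the discrepancy function F; the $\Lambda$ and $\psi$ used are the same hypothetical values as before, and $S$ is taken to be exactly the covariance matrix they imply, so F should come out essentially zero.

```python
import numpy as np

def discrepancy(S, Lambda, psi):
    """F = ln|LL' + Psi| + trace(S [LL' + Psi]^{-1}) - ln|S| - q."""
    Sigma = Lambda @ Lambda.T + np.diag(psi)
    q = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - q

Lambda = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.1, 0.7],
                   [0.2, 0.8]])
psi = np.array([0.2, 0.3, 0.4, 0.3])
S = Lambda @ Lambda.T + np.diag(psi)     # "sample" covariance exactly matching the model

print(discrepancy(S, Lambda, psi))       # approximately 0, since S = Lambda Lambda' + Psi
```

In an actual ML fit, $\Lambda$ and $\Psi$ would be chosen numerically to minimize this function for the observed $S$.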

Estimating the Number of Factors

With factor analysis, the choice of the number of factors k is critical. If we use k + 1 factors, we will get different factors and loadings than if we use k factors. With too few factors, there will be too many high loadings. With too many factors, the loadings will be spread out too much over the factors, and the factors will be difficult to interpret.

Methods for Estimating the Number of Factors

A subjective approach is to try various choices of k and pick the one that gives the most interpretable result, but this is probably too subjective. We could use the scree diagram as in PCA, but the eigenvalues are not as directly interpretable in factor analysis. When using maximum likelihood, we can use a formal sequence of hypothesis tests to help determine k. We use the test statistic

$$U = n' \min(F), \qquad \text{where } n' = n + 1 - (2q + 5)/6 - 2k/3.$$

If the k-factor model is appropriate, this test statistic has a large-sample $\chi^2$ distribution with $(q - k)^2/2 - (q + k)/2$ degrees of freedom. Typically we begin with a small value of k and increase k by 1 sequentially. If at any stage U has a non-significant P-value, we choose that value of k. If at any stage the degrees of freedom go to zero, the factor analysis model may be inappropriate.
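
A sketch of the test computation, assuming the minimized discrepancy min(F) has already been obtained from an ML fit for a given k; the numbers plugged in below are purely illustrative.

```python
from scipy.stats import chi2

def factor_test(F_min, n, q, k):
    """Large-sample chi-square test that the k-factor model is adequate."""
    n_prime = n + 1 - (2 * q + 5) / 6 - 2 * k / 3   # correction factor from the text above
    U = n_prime * F_min                             # test statistic
    df = (q - k) ** 2 / 2 - (q + k) / 2             # degrees of freedom
    p_value = chi2.sf(U, df)
    return U, df, p_value

# Illustrative values only: F_min would come from the ML factor fit for k factors
U, df, p = factor_test(F_min=0.08, n=200, q=6, k=2)
print(f"U = {U:.2f}, df = {df:.0f}, p = {p:.3f}")
```

Increasing k by 1 and repeating this call until the P-value is non-significant mimics the sequential procedure described above.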