Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017
Dimension reduction: Exploratory Factor Analysis (EFA)
Background

While the motivation in PCA is to replace the original (correlated) variables with a small number of uncorrelated new variables, linear combinations of the old ones that capture the major part of the variance of the original variables, the aim in EFA is to find underlying latent factors that explain the correlations between the original variables. Thus, whereas PCA is purely a mathematical transformation that produces new variables, EFA is more model based (the observed variables reflect some latent constructs). The history of factor analysis dates back to Spearman (1904) [1].

[1] Spearman, C., 1904, General intelligence objectively determined and measured, American Journal of Psychology 15, 201-293.
Background

The idea is that there is an unobserved latent construct that explains the correlations between the observed variables (in factor analysis, observed variables are often also called items). Spearman examined the correlations between marks in three subjects [Classics ($x_1$), French ($x_2$), and English ($x_3$)] obtained from a sample of children. His hypothesis was that one latent construct, which he named general intelligence ($f$), governs success in the tests.
Single latent factor

Accordingly, he ended up with the model

\begin{equation}
\begin{aligned}
x_1 &= \lambda_1 f + u_1 \\
x_2 &= \lambda_2 f + u_2 \\
x_3 &= \lambda_3 f + u_3.
\end{aligned}
\tag{1}
\end{equation}

In this system $\lambda_i$ is called a loading; it indicates how strongly the underlying latent factor $f$ is reflected by the observed (or manifest) measurement $x_i$. The term $u_i$ is a unique factor, specific to each individual variable. Accordingly, it is assumed that the $u_i$ are not correlated with each other (and, of course, not with the common factor $f$). Thus, $f$ is supposed to explain all the interrelations between the observed variables.
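The last sentence can be made concrete. Assuming, for illustration only, that the items and the factor are standardized to unit variance (an assumption not stated on the slide), model (1) implies that the correlation between two items is the product of their loadings, $\mathrm{corr}(x_i, x_j) = \lambda_i \lambda_j$. The sketch below uses made-up loadings, and `implied_correlation` is a hypothetical helper name.

```python
import numpy as np

def implied_correlation(loadings):
    """Correlation matrix implied by a one-factor model.

    Assumes standardized items and a unit-variance factor, so that
    corr(x_i, x_j) = lambda_i * lambda_j for i != j.
    """
    lam = np.asarray(loadings, dtype=float)
    R = np.outer(lam, lam)       # off-diagonal entries: lambda_i * lambda_j
    np.fill_diagonal(R, 1.0)     # standardized items have unit variance
    return R

# Illustrative (made-up) loadings for the three subjects
lam = np.array([0.9, 0.8, 0.7])
R = implied_correlation(lam)
uniqueness = 1.0 - lam**2        # variance left over for each unique factor u_i

print(np.round(R, 2))
print(np.round(uniqueness, 2))
```

Every off-diagonal correlation is determined by the loadings alone, which is what "the factor explains all the interrelations" amounts to in this setting.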
Several latent factors

This idea generalizes to several underlying latent variables. The general mathematical representation of a factor structure for $p$ observed variables and $k$ ($< p$) factors is

\begin{equation}
\begin{aligned}
x_1 &= \lambda_{11} f_1 + \lambda_{12} f_2 + \cdots + \lambda_{1k} f_k + u_1 \\
x_2 &= \lambda_{21} f_1 + \lambda_{22} f_2 + \cdots + \lambda_{2k} f_k + u_2 \\
    &\;\;\vdots \\
x_p &= \lambda_{p1} f_1 + \lambda_{p2} f_2 + \cdots + \lambda_{pk} f_k + u_p,
\end{aligned}
\tag{2}
\end{equation}

which in matrix format becomes

\begin{equation}
x = \Lambda f + u.
\tag{3}
\end{equation}

Here $x$ contains the x-variables, $\Lambda$ contains the loadings, $f$ contains the factors, and $u$ contains the unique factors, which are also called measurement errors.
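The matrix form (3) leads directly to the covariance structure that factor extraction tries to reproduce. The derivation below is a sketch under assumptions that are standard but not stated on the slide: the factors are uncorrelated with unit variance, $\operatorname{Cov}(f) = I$; the unique factors are mutually uncorrelated, $\operatorname{Cov}(u) = \Psi$ with $\Psi$ diagonal; and $\operatorname{Cov}(f, u) = 0$. The symbols $\Sigma$ and $\Psi$ are introduced here for illustration.

```latex
\begin{aligned}
\Sigma = \operatorname{Cov}(x)
  &= \operatorname{Cov}(\Lambda f + u) \\
  &= \Lambda \operatorname{Cov}(f) \Lambda^{\top} + \operatorname{Cov}(u)
     \qquad \text{(since } \operatorname{Cov}(f, u) = 0\text{)} \\
  &= \Lambda \Lambda^{\top} + \Psi .
\end{aligned}
```

Because $\Psi$ is diagonal, all covariances between different observed variables come from $\Lambda \Lambda^{\top}$, that is, from the common factors.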
Steps in FA

The exploratory nature of the approach follows from the fact that initially it is not clear how many factors there are. The steps in factor analysis are:

- Find the number of factors
- Rotate the initial solution to figure out what the factors are
    - Orthogonal rotation
    - Oblique rotation
- Check the goodness of fit of the factor model; if needed, remove variables to facilitate interpretation of the final model
- Interpret the factor solution
Approaches to extract factor structures

There are different approaches to extracting a factor structure and finding the number of factors. A popular solution is mathematically identical to the principal component solution, which amounts to solving for the eigenvalues of the covariance or correlation matrix. Another popular solution nowadays is the so-called maximum likelihood (ML) approach. The ML method relies on normality of the variables and allows the number of factors to be tested statistically.
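The principal component solution can be sketched in a few lines. The version below assumes the common convention that the loadings are the leading eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues; the correlation matrix is made up, and `pc_loadings` is a hypothetical helper, not the routine used by any particular package.

```python
import numpy as np

def pc_loadings(R, k):
    """Principal-component loadings for the first k factors of R."""
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # re-sort, largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each retained eigenvector by the square root of its eigenvalue
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])
    return loadings, eigvals

# Made-up correlation matrix for three items
R = np.array([[1.00, 0.72, 0.63],
              [0.72, 1.00, 0.56],
              [0.63, 0.56, 1.00]])

loadings, eigvals = pc_loadings(R, k=1)
communalities = np.sum(loadings**2, axis=1)    # variance explained per item

print(np.round(eigvals, 3))
print(np.round(loadings, 3))                   # column sign is arbitrary
print(np.round(communalities, 3))
```

Keeping all $p$ components reproduces the correlation matrix exactly; retaining only $k < p$ of them gives an approximation.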
Measure of Sampling Adequacy (MSA)

As a preliminary check, the Measure of Sampling Adequacy (MSA), or Kaiser-Meyer-Olkin (KMO) measure, is sometimes used as a criterion to judge whether the data are appropriate for factor analysis.

    KMO value       Rating
    0.00 to 0.49    unacceptable
    0.50 to 0.59    miserable
    0.60 to 0.69    mediocre
    0.70 to 0.79    middling
    0.80 to 0.89    meritorious
    0.90 to 1.00    marvelous
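The slide gives only the rating scale. As a sketch, the overall KMO is commonly defined as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations, where the partial correlations are obtained from the inverse of the correlation matrix. That definition is an assumption here, and `kmo` and `kmo_label` are hypothetical helper names.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    S = np.linalg.inv(R)
    d = np.sqrt(np.diag(S))
    partial = -S / np.outer(d, d)              # partial correlations given the rest
    off = ~np.eye(R.shape[0], dtype=bool)      # mask selecting off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

def kmo_label(value):
    """Map a KMO value to the rating scale in the table above."""
    bands = [(0.5, "unacceptable"), (0.6, "miserable"), (0.7, "mediocre"),
             (0.8, "middling"), (0.9, "meritorious")]
    for upper, label in bands:
        if value < upper:
            return label
    return "marvelous"

# Made-up correlation matrix for three items
R = np.array([[1.00, 0.72, 0.63],
              [0.72, 1.00, 0.56],
              [0.63, 0.56, 1.00]])
value = kmo(R)
print(round(value, 3), kmo_label(value))
```

With only two variables there is nothing to partial out, so the partial correlation equals the ordinary correlation and the measure is 0.5 regardless of its size.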
Number of Factors

Determining the number of factors:

- Theory: how many factors the theory predicts
- Eigenvalues: (i) eigenvalues > 1, (ii) scree plot, (iii) total variance explained
- Statistical test: the ML method produces chi-square statistics
- Criterion functions: AIC, BIC
- Interpretation: how many factors can be interpreted
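The eigenvalue-based criteria can be sketched as follows. The correlation matrix is made up (two uncorrelated pairs of items) so that the result is easy to check by hand, and `eigen_summary` is a hypothetical helper name. The values it returns are also what a scree plot would display.

```python
import numpy as np

def eigen_summary(R):
    """Eigenvalues (descending), count above 1, and cumulative variance share."""
    eigvals = np.sort(np.linalg.eigvalsh(np.asarray(R, dtype=float)))[::-1]
    n_kaiser = int(np.sum(eigvals > 1.0))          # (i) eigenvalues > 1
    cum_prop = np.cumsum(eigvals) / eigvals.sum()  # (iii) total variance explained
    return eigvals, n_kaiser, cum_prop

# Two blocks of two items each, correlated 0.6 within a block and 0 across
R = np.array([[1.0, 0.6, 0.0, 0.0],
              [0.6, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.6],
              [0.0, 0.0, 0.6, 1.0]])

eigvals, n_kaiser, cum_prop = eigen_summary(R)
print(np.round(eigvals, 2))    # eigenvalues: 1.6, 1.6, 0.4, 0.4
print(n_kaiser)                # 2 eigenvalues exceed 1
print(np.round(cum_prop, 2))   # cumulative shares: 0.4, 0.8, 0.9, 1.0
```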
Number of Factors: Example 1

In this example 103 police officers were rated by their superiors on 14 scales (Source: SAS/STAT 14.1 User's Guide, p. 2340). The scree test suggests 3 or 5 factors, and the likelihood test indicates that 3 or 4 factors would be appropriate. We'll start with the 4-factor solution.
Factor rotation and interpretation

Because the factor solution is not unique (with the exception of the one-factor case), it can be transformed to facilitate interpretation. The transformation is called rotation; it aims to find a representation in which each variable ideally loads on only one factor. In this ideal case the observed variables cluster by factor. Such a structure is called a simple structure. If it holds, we can interpret each variable as reflecting its underlying factor. A factor is named according to the variables that load highly (i.e., cluster) on it. Rotation can be orthogonal or oblique.
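Both the non-uniqueness and the search for simple structure can be illustrated for two factors. The sketch below first checks that an orthogonal rotation leaves $\Lambda \Lambda^{\top}$ unchanged. It then finds, by brute force over the rotation angle, the rotation that maximizes the raw varimax criterion (the summed variance of squared loadings per factor). The loadings are made up, the helper names are hypothetical, and the grid search is a teaching device, not the iterative algorithm statistical software uses.

```python
import numpy as np

def rotation(theta_deg):
    """2 x 2 orthogonal rotation matrix for an angle given in degrees."""
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

def varimax_criterion(L):
    """Raw varimax criterion: sum over factors of the variance of squared loadings."""
    return np.var(L**2, axis=0).sum()

# Made-up loadings with a perfect simple structure, then deliberately rotated away
simple = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.6],
                   [0.0, 0.9]])
initial = simple @ rotation(30.0)

# Non-uniqueness: any orthogonal T gives the same Lambda Lambda'
T = rotation(17.0)
same_fit = np.allclose(initial @ initial.T, (initial @ T) @ (initial @ T).T)

# For two factors the criterion repeats every 90 degrees, so [0, 90] suffices
angles = np.linspace(0.0, 90.0, 901)
scores = [varimax_criterion(initial @ rotation(a)) for a in angles]
best_angle = angles[int(np.argmax(scores))]
rotated = initial @ rotation(best_angle)

print(same_fit)
print(best_angle)
print(np.round(rotated, 3))   # one loading per row is (numerically) zero
```

The recovered loadings match the simple structure up to a column swap and a sign change, neither of which affects interpretation.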
Rotation

In an orthogonal rotation the factors are uncorrelated. In an oblique rotation the factors are allowed to correlate with each other, which typically results in a simpler structure. The factors, however, should not be too highly correlated (preferably < 0.7 in absolute value), as high correlation implies that the underlying constructs are not well separated. Accordingly, if two factors are highly correlated, we say that the discriminant validity of the constructs is low (weak or poor).
Rotation

There are several rotation methods.

- Examples of orthogonal rotations: Varimax, and Quartimax and its variants.
- Examples of oblique rotations: Oblimin (and its variants), Quartimin, Promax, and HK (Harris-Kaiser).

Oblique rotation produces two loading matrices:

- Factor pattern, which contains the regression coefficients of the observed variables on the latent factors
- Factor structure, which is the correlation matrix of the observed variables with the factors
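The two matrices are linked through the factor correlation matrix. Assuming standardized variables and factors, the structure matrix equals the pattern matrix multiplied by the factor correlations, so the two coincide exactly when the factors are uncorrelated. The numbers below are made up, and the name `phi` for the factor correlation matrix is an assumption.

```python
import numpy as np

# Made-up pattern matrix (regression coefficients of items on factors)
pattern = np.array([[0.8, 0.1],
                    [0.7, 0.0],
                    [0.1, 0.6],
                    [0.0, 0.9]])

# Made-up factor correlation matrix from an oblique rotation
phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])

# Structure matrix: correlations between items and factors
structure = pattern @ phi

# With correlated factors the communality is the row-wise sum of
# pattern * structure, i.e. the diagonal of pattern @ phi @ pattern.T
communalities = np.sum(pattern * structure, axis=1)

print(np.round(structure, 3))
print(np.round(communalities, 3))
```

Note that an item with a zero pattern coefficient on a factor can still correlate with that factor through the correlation between factors.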
Selecting items

To further facilitate interpretation of the factor results, some observed variables (items) can be dropped from the analysis (variable selection). Selection is based on checking:

- Communality, which is the fraction of the item's variance that the factors explain. Ideally it should be > .5 (i.e., over 50%).
- Primary loading, which indicates how strongly each item loads on its factor (preferably > .5 in absolute value).
- Cross-loadings, which indicate how strongly an item loads on the other factors (preferably small).
- Meaningfulness, i.e., whether the item contributes meaningfully to the interpretation of the factor.
- Reliability, which refers to the internal consistency of the items of each factor (Cronbach's alpha, should be > .6).
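Two of these checks reduce to short formulas. The sketch below computes communalities as row sums of squared loadings, which assumes orthogonal factors, and Cronbach's alpha from the item variances and the variance of the sum score using the usual textbook formula (not given on the slide). The loadings and scores are made up, and the function names are hypothetical.

```python
import numpy as np

def communalities(loadings):
    """Row sums of squared loadings (valid for orthogonal factors)."""
    return np.sum(np.asarray(loadings, dtype=float) ** 2, axis=1)

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_observations, k_items) score matrix."""
    X = np.asarray(items, dtype=float)
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()      # sum of the item variances
    total_var = X.sum(axis=1).var(ddof=1)       # variance of the sum score
    return k / (k - 1) * (1.0 - item_var / total_var)

# Made-up loadings of two items on two factors
loadings = np.array([[0.8, 0.3],
                     [0.2, 0.6]])
h2 = communalities(loadings)                    # 0.73 and 0.40

# Made-up scores of five respondents on three items of one factor
scores = np.array([[4, 5, 4],
                   [2, 3, 2],
                   [5, 5, 4],
                   [1, 2, 1],
                   [3, 3, 3]])
alpha = cronbach_alpha(scores)

print(np.round(h2, 2))
print(round(alpha, 3))
```

Here the second item falls below the .5 communality guideline, while the three scored items are highly consistent with one another.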
Eliminating items

Eliminating items from an EFA is subjective. Points to consider:

- Communalities (each ideally > .5)
- Size of the main loading (bare minimum > .4, preferably > .5, ideally > .6 in absolute value)
- Meaning of the item (face validity): does the item contribute meaningfully?
- Contribution the item makes to the factor (i.e., is a better measure of the latent factor achieved by including or not including this item?)
- Number of items already in the factor (i.e., if there are already many items, e.g., > 6, in the factor, the researcher can be more selective about which ones to include and which ones to drop)
- Eliminate one variable at a time, then re-run, before deciding which items, if any, to eliminate next
- Cross-loadings should be low, preferably < .3 in absolute value
- There should be a minimum of 2 items per factor, preferably 3 or more (however, not too many, as that hampers interpretation)
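The numeric guidelines in this list can be turned into a simple screening aid. The sketch below flags items against the thresholds quoted above (.4 for the main loading, .3 for cross-loadings, .5 for communality). It assumes at least two orthogonal factors, the loadings are made up, and `screen_items` is a hypothetical helper. The flags only mark candidates: face validity and the one-item-at-a-time re-run still require judgment.

```python
import numpy as np

def screen_items(loadings, min_primary=0.4, max_cross=0.3, min_communality=0.5):
    """Return, for each item, the list of guidelines it fails."""
    L = np.abs(np.asarray(loadings, dtype=float))
    sorted_abs = np.sort(L, axis=1)[:, ::-1]    # per item, largest loading first
    primary = sorted_abs[:, 0]                  # main loading
    cross = sorted_abs[:, 1]                    # largest loading on another factor
    communality = np.sum(L**2, axis=1)          # assumes orthogonal factors
    flags = []
    for i in range(L.shape[0]):
        reasons = []
        if primary[i] < min_primary:
            reasons.append("weak primary loading")
        if cross[i] > max_cross:
            reasons.append("high cross-loading")
        if communality[i] < min_communality:
            reasons.append("low communality")
        flags.append(reasons)
    return flags

# Made-up loadings of four items on two factors
loadings = np.array([[0.80, 0.10],
                     [0.55, 0.45],
                     [0.30, 0.20],
                     [0.05, 0.75]])

for item, reasons in enumerate(screen_items(loadings), start=1):
    print(item, reasons if reasons else "ok")
```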
Example

Rotating the four factors (see the SAS example on the web site): in the police job rating example it turns out that the four factors are hard to interpret. A three-factor solution is easier:

- Factor 1: Physical skills
- Factor 2: Interpersonal skills
- Factor 3: Cognitive skills