Essays on Estimation Methods for Factor Models and Structural Equation Models


Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Social Sciences 111

Essays on Estimation Methods for Factor Models and Structural Equation Models

SHAOBO JIN

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2015

ISSN ISBN urn:nbn:se:uu:diva

Dissertation presented at Uppsala University to be publicly examined in Hörsal 2, Ekonomikum, Kyrkogårdsgatan 10, Uppsala, Friday, 8 May 2015 at 10:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Li Cai (UCLA).

Abstract

Jin, S. Essays on Estimation Methods for Factor Models and Structural Equation Models. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Social Sciences 111. pp. Uppsala: Acta Universitatis Upsaliensis. ISBN

This thesis, which consists of four papers, is concerned with estimation methods in factor analysis and structural equation models. New estimation methods are proposed and investigated. In Paper I an approximation of the penalized maximum likelihood (ML) is introduced to fit an exploratory factor analysis model. Approximated penalized ML continuously and efficiently shrinks the factor loadings towards zero. It naturally factorizes a covariance matrix or a correlation matrix, and it is applicable to both an orthogonal and an oblique structure. Paper II, a simulation study, investigates the properties of approximated penalized ML with an orthogonal factor model. Different combinations of penalty terms and tuning parameter selection methods are examined. Differences between factorizing a covariance matrix and factorizing a correlation matrix are also explored. It is shown that the approximated penalized ML frequently improves on the traditional estimation-rotation procedure. In Paper III we focus on pseudo ML for multi-group data. Data from different groups are pooled and normal theory is used to fit the model. It is shown that pseudo ML produces consistent estimators of the factor loadings and that it is numerically easier than multi-group ML. However, normal theory is not applicable for estimating the standard errors; a sandwich-type estimator of the standard errors is derived.
Paper IV examines the properties of the recently proposed polychoric instrumental variable (PIV) estimators for ordinal data through a simulation study. PIV is compared with conventional estimation methods (unweighted least squares and diagonally weighted least squares). PIV produces accurate estimates of the factor loadings and factor covariances in a correctly specified confirmatory factor analysis model, and accurate estimates of the loadings and coefficient matrices in a correctly specified structural equation model. If the model is misspecified, the robustness of PIV depends on the model complexity, the underlying distribution, and the instrumental variables.

Keywords: shrinkage, factor rotation, penalized maximum likelihood, pseudo-maximum likelihood, multi-group analysis, ordinal data, robustness

Shaobo Jin, Department of Statistics, Uppsala University, SE Uppsala, Sweden.

Shaobo Jin 2015

ISSN ISBN urn:nbn:se:uu:diva

Dedicated to my parents


List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I. Jin, S., Moustaki, I., and Yang-Wallentin, F. (2015) Approximated Penalized Maximum Likelihood for Exploratory Factor Analysis. Manuscript.
II. Jin, S., Moustaki, I., and Yang-Wallentin, F. (2015) Approximated Penalized Maximum Likelihood for Exploratory Factor Analysis: A Simulation Study of an Orthogonal Case. Manuscript.
III. Jin, S., Yang-Wallentin, F., and Christoffersson, A. (2015) Asymptotic Efficiency of the Pseudo-Maximum Likelihood Estimator in Multi-Group Factor Models with Pooled Data. Accepted with minor revision.
IV. Jin, S., Luo, H., and Yang-Wallentin, F. (2015) A Simulation Study of Polychoric Instrumental Variable Estimation in Structural Equation Models. Under revision.


Contents

1. Introduction
2. Background
   2.1 Exploratory factor analysis
   2.2 Factor rotation
   2.3 Penalized least squares
   2.4 Penalized maximum likelihood
   2.5 Confirmatory factor analysis
   2.6 Multi-group analysis
   2.7 Violation of the normality assumption
   2.8 Structural equation model
   2.9 Ordinal data
   2.10 Polychoric instrumental variable
3. Research goals
4. Summary of papers
   4.1 Paper I
   4.2 Paper II
   4.3 Paper III
   4.4 Paper IV
5. Conclusion
References


1. Introduction

Factor analysis is a multivariate technique that decomposes a covariance matrix or a correlation matrix among observed variables into functions of unobserved (latent) factors. A large number of variables, also known as indicators, are observed through interviews, questionnaires, psychological tests, etc. The indicators are assumed to be driven by a smaller number of common factors that are unobservable. The covariance structures among indicators and common factors can be explored using factor analysis, especially in the social and behavioral sciences.

Factor analysis can be categorized into two types according to its purpose: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). EFA is commonly used to search for possible relations between indicators and latent factors in order to account for the correlations among the indicators. Researchers usually do not have a substantive theory, or any other kind of theory, before an EFA is conducted; the meanings of the common factors are assigned after the factor model is fitted. Hence, no model is clearly defined a priori and, as its name implies, EFA is an exploratory technique used to search for a parsimonious representation of a set of variables. Since the number of latent factors is typically much smaller than the number of indicators, EFA can also be viewed as a data reduction technique. In contrast, CFA allows researchers to test hypotheses on possible relationships between observed variables and latent factors. Researchers begin with a hypothesis stating how the indicators are related to the latent factors, and a CFA model is predefined based on the theory and/or hypothesis of interest.

As an extension of a CFA model, a structural equation model (SEM) enables researchers to model not only the relations between indicators and latent factors but also the relations among the latent factors. The SEM, which incorporates simultaneous equation models as a special case, is a multivariate regression model.
However, it differs from the usual multivariate regression model in that both the response and explanatory variables in a SEM can be latent. This thesis, which consists of four papers, is devoted to estimation methods for factor analysis and SEM. New estimation methods are discussed and recently proposed methods are investigated in the thesis.

2. Background

2.1 Exploratory factor analysis

An EFA model is of the form

    y = µ + Λ f + ε,   (2.1)

where y is a p × 1 vector of indicators (also known as manifest variables), µ is a vector of intercepts, Λ is a loading matrix with (i, j)th element λ_{ij}, f ~ N(0, Φ) is an m × 1 vector of common factors, ε ~ N(0, Ψ) is the error term, and Ψ is a diagonal matrix with diagonal elements ψ_i for i = 1, 2, ..., p. The common factor f and the error term ε are assumed to be uncorrelated. Model (2.1) implies the covariance matrix Σ(θ) = ΛΦΛ^T + Ψ, where θ is the vector of unknown parameters.

Several types of indeterminacy exist in the EFA. Let P be any invertible matrix and consider Λ* = ΛP. The same model-implied covariance matrix is obtained if Φ* = P^{-1}Φ(P^T)^{-1}, since ΛΦΛ^T = Λ*Φ*Λ*^T. Hence, the factor covariance matrix Φ is not uniquely identified. To estimate the unknown parameters in Model (2.1), Φ is usually fixed to an identity matrix, I. The model with Φ = I is referred to as an orthogonal model; otherwise it is an oblique model. However, the model-implied covariance matrix is still invariant under orthogonal transformations: letting the invertible matrix P be any orthogonal matrix, the identity ΛΛ^T = Λ*Λ*^T holds. Hence the loading matrix can be rotated orthogonally while the covariance matrix remains the same. Restrictions such as requiring Λ^T Ψ^{-1} Λ to be diagonal with its diagonal elements in decreasing order (Jöreskog, 1967) remove the rotational indeterminacy.

Maximum likelihood (ML) is commonly used to fit Model (2.1) and has been discussed by several authors (e.g., Jöreskog, 1967; Lawley, 1940). Without loss of generality, we let µ = 0. Then the ML estimator minimizes

    F(θ) = (n/2) log|Σ(θ)| + (n/2) tr[S Σ(θ)^{-1}],   (2.2)

where n is the sample size and S is the sample covariance matrix.
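As an illustration of fit function (2.2), the following sketch fits a one-factor model to simulated data by direct numerical minimization. The variable names and population loadings are hypothetical, and this is a generic optimizer run, not the Jöreskog (1967) algorithm, which instead exploits an eigenvalue decomposition:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical one-factor population: y = lam * f + eps, with Var(y_i) = 1.
p, n = 6, 500
lam_true = np.array([0.8, 0.7, 0.6, 0.8, 0.7, 0.6])
psi_true = 1.0 - lam_true ** 2
f = rng.standard_normal(n)
y = np.outer(f, lam_true) + rng.standard_normal((n, p)) * np.sqrt(psi_true)
S = np.cov(y, rowvar=False)

def ml_fit(theta):
    """Normal-theory ML discrepancy log|Sigma| + tr(S Sigma^{-1}), cf. (2.2);
    error variances are parameterized on the log scale for positivity."""
    lam, log_psi = theta[:p], theta[p:]
    Sigma = np.outer(lam, lam) + np.diag(np.exp(log_psi))
    sign, logdet = np.linalg.slogdet(Sigma)
    return logdet + np.trace(S @ np.linalg.inv(Sigma))

res = minimize(ml_fit, np.concatenate([np.full(p, 0.5), np.zeros(p)]),
               method="BFGS")
lam_hat = res.x[:p] * np.sign(res.x[0])  # the sign of f is not identified
print(np.round(lam_hat, 2))             # close to lam_true at n = 500
```

With one factor the rotational indeterminacy reduces to a sign flip, which is fixed after optimization; with m > 1 factors an identification restriction such as the diagonality condition above would be needed.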
2.2 Factor rotation

Because of the rotational indeterminacy, an orthogonal EFA model is estimated first and the estimated loading matrix is then rotated to produce an interpretable loading matrix, ideally with a few large loadings and many small

loadings. After factor rotation, the common factors can be either uncorrelated or correlated. A rotation method that results in an orthogonal structure is referred to as an orthogonal rotation; likewise, an oblique rotation produces an oblique structure. Some commonly used orthogonal rotations are the quartimax (Neuhaus & Wrigley, 1954) and varimax (Kaiser, 1958) rotations, while commonly used oblique rotations include the quartimin (Carroll, 1953) and promax (Hendrickson & White, 1964) rotations. Further discussion of rotations can be found, for example, in Browne (2001) and Yanai & Ichikawa (2006).

Factor rotations typically produce many small non-zero loadings. A sparse loading matrix with many zero elements is usually easier to interpret than a dense loading matrix. Consequently, factor loadings are often truncated at some user-specified level to produce a loading matrix with zero loadings. One rule of thumb is to treat loadings greater than 0.3 in absolute value as sufficient loadings (Hair et al., 2010).

2.3 Penalized least squares

Truncating a small loading is similar to truncating a small regression coefficient, which is, in fact, a hard-thresholding approach (Fan & Li, 2001): parameter estimates whose absolute values are less than a certain level are set to zero. Fan & Li (2001) suggested that using a hard-thresholding approach to select variables is inferior to using a soft-thresholding approach in which coefficients are continuously shrunk towards zero. Soft-thresholding has been widely studied in linear regression models and generalized linear models as a way to conduct estimation and variable selection simultaneously. One of the most famous soft-thresholding techniques is the least absolute shrinkage and selection operator (LASSO) proposed by Tibshirani (1996).
For the linear model

    y = Xθ + ε,   (2.3)

the LASSO estimator is

    argmin_θ (y − Xθ)^T (y − Xθ) + β‖θ‖₁,   (2.4)

where β > 0 is a tuning parameter and ‖θ‖₁ is the sum of the absolute values of the elements of θ. The LASSO conducts parameter estimation and variable selection simultaneously by shrinking coefficients towards zero and creating exactly zero estimates. Note that ordinary least squares typically does not produce exactly zero coefficient estimates. The LASSO estimator of θ is a function of the tuning parameter β. As a simple example, the minimizer of

    (1/2)(y − θ)² + β|θ|   (2.5)

is

    θ̂ = y − β, if y > β;
    θ̂ = y + β, if y < −β;
    θ̂ = 0, otherwise   (2.6)

(Donoho & Johnstone, 1994). The tuning parameter β controls the magnitude of θ̂, in particular the values of y for which θ̂ is shrunk to zero. Therefore, θ̂ should be understood as a function θ̂(β), which is referred to as the solution path of θ. In the case of multiple regressors, every θ̂_i, the ith element of θ̂, is a function of β, and the solution path of θ consists of the solution paths of all the θ_i. The LASSO has been generalized to many other types of penalty terms: for instance, the smoothly clipped absolute deviation (SCAD) (Fan & Li, 2001), the elastic net (EN) (Zou & Hastie, 2005), and the minimax concave penalty with plus algorithm (MC+) (Zhang, 2010). See Tibshirani (2011) for a review of the LASSO and its variants.

2.4 Penalized maximum likelihood

Selecting factor loadings after factor rotation is essentially a coefficient selection problem. Hence, the aforementioned techniques are naturally applicable to an EFA model. Choi et al. (2010) studied the penalized orthogonal EFA model with a LASSO penalty in the likelihood function. The penalized ML estimator with the LASSO penalty minimizes

    (n/2) log|Σ(θ)| + (n/2) tr[S Σ(θ)^{-1}] + nβ Σ_{i=1}^{p} Σ_{j=1}^{m} |λ_{ij}|.   (2.7)

The LASSO penalty enables simultaneous estimation and shrinkage of factor loadings. Since the penalty term nβ Σ_{i=1}^{p} Σ_{j=1}^{m} |λ_{ij}| is not differentiable, traditional optimization methods are not applicable; an EM-type algorithm is proposed in Choi et al. (2010) to minimize equation (2.7).

One advantage of penalized ML is that it removes the rotational indeterminacy. Recall that for every orthogonal matrix P, ΛΛ^T = Λ*Λ*^T, where Λ* = ΛP, and the covariance matrix of f remains an identity matrix. An orthogonal rotation does not change the sum of the first two terms in equation (2.7), i.e. the unpenalized ML fit function. However, Σ_{i=1}^{p} Σ_{j=1}^{m} |λ_{ij}| in the third term is likely to be different, unless the orthogonal matrix is a permutation matrix or an alternating sign matrix.
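This non-invariance of the penalty term is easy to check numerically. In the sketch below (the loading values are hypothetical), a 45-degree orthogonal rotation leaves ΛΛ^T, and hence the unpenalized ML fit, unchanged, but inflates the LASSO penalty Σ|λ_ij| by a factor of √2:

```python
import numpy as np

# A hypothetical 6 x 2 perfect simple loading matrix (orthogonal model, Phi = I).
Lam = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])

# Orthogonal rotation by 45 degrees.
t = np.pi / 4
P = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
Lam_rot = Lam @ P

# The ML fit depends on Lam only through Lam Lam^T, which is rotation-invariant:
assert np.allclose(Lam @ Lam.T, Lam_rot @ Lam_rot.T)

# ...but the penalty term sum_ij |lambda_ij| is not invariant:
print(np.abs(Lam).sum(), np.abs(Lam_rot).sum())
```

For this particular simple structure, every rotated row has absolute row sum |λ|(|cos t| + |sin t|), so only rotations by multiples of 90 degrees (permutation or sign-change matrices) leave the penalty unchanged.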
Therefore, the penalized ML estimator is rotation-free. As a consequence, an orthogonal EFA model fitted by penalized ML cannot be rotated to an oblique EFA model, and vice versa.

More generally, the penalized ML estimator minimizes

    (n/2) log|Σ(θ)| + (n/2) tr[S Σ(θ)^{-1}] + n P(|Λ|; β, w),   (2.8)

where P is a scalar-valued penalty function, β is a vector of tuning parameters, w is a vector of weights on the factor loadings, and the absolute value applied to Λ is taken element-wise. In particular, Hirose & Yamamoto (2014a,b) considered the MC+ penalty in both an orthogonal EFA and an oblique EFA.

2.5 Confirmatory factor analysis

A CFA model has the same form as the EFA model (2.1) but with constraints on the parameters. Some common constraints are zero loadings, equal loadings, correlated error terms, and restricted factor covariances. These constraints represent the theories that researchers wish to test, and they are specified before the model is fitted. If ML is used to fit a CFA model, the fit function (2.2) is minimized. The ML estimator θ̂ is asymptotically normal with asymptotic covariance matrix

    Ξ = 2 {Δ^T [Σ(θ)^{-1} ⊗ Σ(θ)^{-1}] Δ}^{-1},

where Δ = ∂Vec[Σ(θ)]/∂θ^T and the operator Vec(·) stacks the columns of the enclosed matrix (Magnus & Neudecker, 1999). Ξ is in fact the inverse of the information matrix.

2.6 Multi-group analysis

In the case where a common model is suitable for observations from different groups, a multi-group analysis is conducted. A single-group CFA model of the form (2.1) is a special case of a multi-group model. A multi-group CFA model is of the form

    y_g = µ_g + Λ_g f_g + ε_g,   g = 1, 2, ..., G,   (2.9)

where the subscript g represents the gth group and the groups are mutually independent. The parameter vector θ contains all the free parameters in the mean vectors µ_g, the factor means E(f_g), the loading matrices Λ_g, the covariance matrices of the common factors Φ_g, and the error variances Ψ_g for all g. Model (2.9) is a general expression of a multi-group CFA. Different levels of measurement invariance can be imposed (Meredith, 1993), in which some parameters are constrained to be equal across the groups.
In particular, strong measurement invariance assumes that µ_g = µ and Λ_g = Λ for all groups. This thesis is restricted to the case of strong measurement invariance. In the single-group analysis, µ and E(f) are not identified without the assumption that E(f) = 0. In the multi-group analysis, the relative means of f_g

are identified if E(f₁) = 0 (Sörbom, 1974), i.e. the factor mean in the first group is fixed to 0. Let the number of observations in the gth group be n_g, with corresponding proportion p_g. The multi-group ML estimator minimizes

    F^(ML) = Σ_{g=1}^{G} p_g ( log|Σ_g(θ)| + tr{S_g Σ_g(θ)^{-1}} + tr{(ȳ_g − µ_g)(ȳ_g − µ_g)^T Σ_g(θ)^{-1}} ),   (2.10)

where Σ_g(θ) is the model-implied covariance matrix of group g, S_g is the sample covariance matrix of group g, and ȳ_g is the sample mean of group g.

2.7 Violation of the normality assumption

One crucial aspect of the factor models introduced above is the multivariate normality assumption, from which the likelihood function is formulated. The normality assumption is often questionable in practice. Therefore, the consequences of violating the normality assumption are particularly noteworthy. If the normality assumption fails but the fit function (2.2) is still used, the method is called pseudo ML; for a general discussion, see White (1982). The pseudo ML estimator has been shown to be consistent and asymptotically normally distributed under some general conditions for factor analysis (Anderson & Amemiya, 1988). In the single-group CFA the asymptotic covariance matrix is a sandwich-type matrix of the form

    Ξ (Ω^T H V H Ω) Ξ

(Yuan & Bentler, 1997), where Ω = ∂Vech[Σ(θ)]/∂θ^T, the operator Vech(·) vectorizes the lower triangular elements of the enclosed symmetric matrix, H = D^T [Σ(θ)^{-1} ⊗ Σ(θ)^{-1}] D / 2 with D being the duplication matrix (Magnus & Neudecker, 1999) such that Vec(Q) = D Vech(Q) for a symmetric matrix Q, and V is the asymptotic covariance matrix of Vech(S). When the normality assumption holds, the sandwich-type covariance matrix reduces to the covariance matrix for normal data, i.e. Ξ. Hence, using a pseudo ML estimator does not necessarily jeopardize normal theory, and the asymptotic covariance matrix Ξ may still be valid for computing standard errors.
Numerous studies have been devoted to the consequences of violating the normality assumption, especially to conditions under which the normal theory remains valid for non-normal data. In the single-group analysis, Amemiya et al. (1987) and Anderson & Amemiya (1988) have shown that, under some conditions, normal theory-based standard errors are still valid even though the normality assumption is violated. Similar results for the multi-group analysis can be found in Satorra (2002) and Papadopoulos & Amemiya (2005).
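The sandwich logic, a "bread" computed under the assumed model and a "meat" estimated from the data, is easiest to see outside factor analysis. The sketch below uses simulated data and hypothetical coefficients to contrast the naive and sandwich covariances of ordinary least squares under heteroskedastic errors; it mirrors the structure of Ξ(Ω^T H V H Ω)Ξ without being the Yuan & Bentler (1997) formula itself:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# OLS with heteroskedastic errors: the normal-theory ("naive") covariance
# sigma^2 (X^T X)^{-1} is wrong, while the sandwich covariance is consistent.
x = rng.uniform(0, 2, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.standard_normal(n)   # error s.d. grows with x

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y                         # least squares estimate
resid = y - X @ beta

naive = resid @ resid / (n - 2) * XtX_inv        # "bread" only
meat = X.T @ (X * resid[:, None] ** 2)           # sum_i x_i x_i^T e_i^2
sandwich = XtX_inv @ meat @ XtX_inv              # bread * meat * bread

print(np.sqrt(np.diag(naive)))
print(np.sqrt(np.diag(sandwich)))
```

When the error variance is constant, the meat converges to σ²X^TX and the two covariance estimators agree asymptotically, just as the sandwich matrix above reduces to Ξ under normality.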

2.8 Structural equation model

As a generalization of the CFA model, the full SEM is

    η = B η + Γ ξ + ζ,   (2.11)

    x = Λ_x ξ + δ,   y = Λ_y η + ε,   (2.12)

where x and y are vectors of indicators, ξ and η are vectors of latent variables, and ζ, δ, and ε are disturbances. The matrices Λ_x and Λ_y are factor loading matrices, and B and Γ are coefficient matrices describing the relations among the latent factors. The latent vector ξ and the disturbances δ, ε, and ζ are mutually uncorrelated. The SEM reduces to a factor model if B and Γ are zero. Equation (2.11) is the structural model for the latent variables and Equation (2.12) is the measurement model that links the latent variables with the indicators. Similar to the CFA model, ML is often used to estimate the unknown parameters of a full SEM under the assumption of normally distributed indicators. If the normality assumption is violated, the discussion in Section 2.7 essentially applies to a full SEM as well.

2.9 Ordinal data

So far, the indicators in the above models have been continuous. However, ordinal data are very common in practice; for example, Likert scales are widely used in questionnaires. Conceptually, ordinal data cannot be treated as continuous data, since the mean and variance are not identified. There are several ways of incorporating ordinal data into a CFA model or a SEM. In particular, the underlying distribution approach assumes that the observed variables are discretized counterparts of underlying continuous variables. The values of an indicator y are defined through a continuous variable y* as

    y = i  if  τ_{i−1} < y* ≤ τ_i,   i = 1, 2, ..., c,

where c is the number of categories and the τ_i are thresholds (with τ_0 = −∞ and τ_c = +∞). The underlying variable y* is assumed to follow a standard normal distribution. Consequently, the factor model with ordinal data is expressed as

    y* = Λ f + ε,   (2.13)

and the measurement model of a full SEM with ordinal data is

    x* = Λ_x ξ + δ,   y* = Λ_y η + ε,   (2.14)

where x* and y* are vectors of underlying continuous variables.
The structural model of a full SEM with ordinal data remains the same as Equation (2.11).
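The underlying distribution approach is straightforward to mimic in simulation. A minimal sketch (with hypothetical thresholds and a hypothetical loading) generates one four-category indicator from a standard normal underlying variable y*:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10000

# Underlying variable y* from a one-factor construction, with Var(y*) = 1.
f = rng.standard_normal(n)
ystar = 0.7 * f + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal(n)

# Hypothetical thresholds tau_1, tau_2, tau_3 (tau_0 = -inf, tau_4 = +inf).
tau = np.array([-1.0, 0.0, 1.0])

# y = i if tau_{i-1} < y* <= tau_i gives categories 1, ..., 4.
y = np.digitize(ystar, tau) + 1

# Observed cell proportions approximate Phi(tau_i) - Phi(tau_{i-1}).
print(np.bincount(y)[1:] / n)
```

Since y* is standard normal, the four cell probabilities are Φ(−1) ≈ 0.159, Φ(0) − Φ(−1) ≈ 0.341, and their mirror images, which is what the printed proportions should approximate.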

To fit a CFA model or a SEM with ordinal data, the polychoric correlations among the ordinal variables are computed as in Olsson (1979) and their asymptotic covariance matrix Π is estimated as in Jöreskog (1994). Then either the least squares fit function

    (n − 1)(s − σ)^T W (s − σ)   (2.15)

or the ML fit function

    log|Σ| + tr(S Σ^{-1}) − log|S| − p   (2.16)

is minimized to produce parameter estimates, where s stacks the lower off-diagonal part of the polychoric correlation matrix and σ stacks the lower off-diagonal part of the model-implied correlation matrix. Choices of the weight matrix W have been extensively studied in the literature, such as weighted least squares (WLS) with W = Π^{-1} (Muthén, 1978; Browne, 1984), unweighted least squares (ULS) with W = I (Muthén, 1978), and diagonally weighted least squares (DWLS) with W = diag(Π)^{-1} (Muthén et al., 1997).

2.10 Polychoric instrumental variable

Although the discrepancy functions of the methods listed above differ, they all share one common feature, namely that the entire correlation matrix and the entire model specification are used to estimate the unknown parameters. For this reason, they are hereby referred to as system-wide methods, as in Bollen (1996). Bollen (1996) pointed out that system-wide methods are likely to spread biases due to model misspecification over the entire model, and therefore proposed a limited information estimator using instrumental variables to reduce the biases. It was generalized to ordinal data by Bollen & Maydeu-Olivares (2007) and referred to as the polychoric instrumental variable (PIV) estimator. PIV is a two-stage estimation procedure in which only part of the correlation matrix is used at every step of the first stage. Hence, it decomposes the correlation matrix in a different way from the system-wide methods. Considering the SEM (2.11) and (2.14), the factor loading of the first indicator of every latent variable is scaled to 1.
Hence, the measurement model can be partitioned as

    x = (x₁; x₂) = (I; Λ_{x,2}) ξ + (δ₁; δ₂),
    y = (y₁; y₂) = (I; Λ_{y,2}) η + (ε₁; ε₂),   (2.17)

where x₁ and y₁ collect the scaling indicators and semicolons denote vertical stacking.

Subsequently, substituting η = y₁ − ε₁ and ξ = x₁ − δ₁ from (2.17), the full SEM becomes

    y₁ = B y₁ + Γ x₁ + [(I − B) ε₁ − Γ δ₁ + ζ],
    y₂ = Λ_{y,2} y₁ + [ε₂ − Λ_{y,2} ε₁],
    x₂ = Λ_{x,2} x₁ + [δ₂ − Λ_{x,2} δ₁].   (2.18)

Model (2.18) is a linear regression model whose regressors are correlated with the composite error terms. Instrumental variables selected from the ordinal indicators are used to deal with these correlations. In the first stage, the slope coefficients in Model (2.18), denoted by θ₁, are estimated by least squares; θ₁ thus consists of the parameters in B, Γ, Λ_{y,2}, and Λ_{x,2}. To be more specific, let u_j and z_j be the left-hand side observed variable and the vector of right-hand side observed variables, respectively, in the jth row of Model (2.18). Then the parameters in the jth row can be estimated by

    (P_{vz}^T P_{vv}^{-1} P_{vz})^{-1} P_{vz}^T P_{vv}^{-1} P_{vu},   (2.19)

where P_{vz} = cor(v_j, z_j^T), P_{vv} = cor(v_j, v_j^T), P_{vu} = cor(v_j, u_j), and v_j is a vector of instrumental variables for z_j. In the second stage, the remaining parameter vector, θ₂, is estimated by minimizing the least squares function

    (s − σ(θ̂₁, θ₂))^T (s − σ(θ̂₁, θ₂)).   (2.20)

Bollen & Maydeu-Olivares (2007) have shown that the PIV estimator is consistent and asymptotically normal if the instrumental variables are chosen appropriately. Because the entire model specification is not used in the first stage, PIV is expected to be more robust against model misspecification for the correctly specified parameters (Bollen, 2001; Bollen & Maydeu-Olivares, 2007). Moreover, two goodness-of-fit test statistics for PIV are proposed in Bollen & Maydeu-Olivares (2007). Nestler (2013) investigated the small sample properties of PIV by comparing it with ULS and DWLS in the estimation of dichotomous CFA models through a Monte Carlo study. He found that PIV estimators are as accurate as ULS and DWLS estimators when the model is correctly specified and less biased when the model is misspecified.
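The first-stage estimator (2.19) is ordinary instrumental-variable estimation applied row by row. Here is a scalar sketch with simulated continuous data; all coefficients are hypothetical, and sample covariances are used in place of the polychoric correlations of PIV (the formula is the same, but the simulated variables are not standardized):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100000

# u = 2 z + e, where the regressor z is correlated with the error e
# (as in Model (2.18)); v is an instrument: correlated with z, not with e.
v = rng.standard_normal(n)
w = rng.standard_normal(n)
z = 0.8 * v + 0.6 * w
e = 0.7 * w + 0.5 * rng.standard_normal(n)
u = 2.0 * z + e

s_vv = np.var(v, ddof=1)
s_vz = np.cov(v, z)[0, 1]
s_vu = np.cov(v, u)[0, 1]

theta_ols = np.cov(z, u)[0, 1] / np.var(z, ddof=1)            # biased upwards
theta_iv = (s_vz / s_vv * s_vz) ** -1 * (s_vz / s_vv * s_vu)  # scalar (2.19)

print(theta_ols, theta_iv)  # theta_iv is close to the true coefficient 2
```

In the scalar case (2.19) collapses to s_vu / s_vz, the classic instrumental-variable ratio; least squares on z instead absorbs the correlation between z and e into the slope.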

3. Research goals

The main research goal of this thesis is to study new estimation methods for factor analysis models and SEMs. To achieve this goal the thesis is split into three major parts.

Papers I and II deal with estimating and selecting non-zero factor loadings in an EFA model. As previously mentioned, truncating factor loadings is a hard-thresholding approach, while penalized ML is a soft-thresholding approach. Based on our experience, the EM-type algorithm proposed by Choi et al. (2010) requires many iterations before it converges, and the solution path produced by penalized ML is not necessarily smooth. For these reasons, Papers I and II are devoted to overcoming some limitations of penalized ML estimation for EFA while inheriting the idea of using a soft-thresholding approach to select the sufficient factor loadings.

Paper III focuses on multi-group analysis. Under the correct model and correct distributional assumptions, multi-group ML is efficient. However, it involves a large number of parameters. For example, consider a multi-group model with G groups, m factors per group, and p indicators per group. If we assume that strong measurement invariance holds and that the loading matrix is perfect simple, i.e. every indicator loads on exactly one factor, then the free parameters include the deterministic means (p parameters), the relative factor means (Gm − m parameters), the factor loadings (p − m parameters), the covariances of the latent factors (0.5Gm(m + 1) parameters), and the error variances (Gp parameters). In total, there are 0.5Gm(m + 1) + Gm + Gp + 2p − 2m unknown parameters. If the sample size is not sufficiently large, inadmissible results such as non-converged estimates and non-positive definite covariance matrices are likely to occur. Under the assumption of strong measurement invariance, an alternative is to pool all the data and conduct a single-group analysis. A single-group analysis with pooled data efficiently reduces the number of parameters to be estimated.
Under the same conditions as for the multi-group analysis mentioned earlier, the free parameters of a single-group analysis with pooled data include the mean vector (p parameters), the factor loadings (p − m parameters), the covariances of the latent factors (0.5m(m + 1) parameters), and the error variances (p parameters). The total number of parameters is 3p + 0.5m² − 0.5m. If the main interest is in the factor loadings, a single-group analysis is expected to be numerically easier. Hence, Paper III first examines the properties of pseudo ML. Moreover, the effect of pooling has previously been studied in Luo (2011) by investigating the standard errors of least squares estimators; the effect of pooling with ML is unknown to us. Therefore, a second aim of Paper III is to investigate the robustness of the normal theory for pooled data.
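The two parameter counts can be verified directly. A small sketch (the function names are ours) reproduces the example used later in Paper III, a model with G = 2 groups, p = 6 indicators, and m = 2 factors:

```python
def multi_group_params(G, p, m):
    """Free parameters of the multi-group model under strong measurement
    invariance with a perfect simple loading matrix: means (p), relative
    factor means (Gm - m), loadings (p - m), factor covariances
    (0.5 Gm(m + 1)), and error variances (Gp)."""
    return p + (G * m - m) + (p - m) + G * m * (m + 1) // 2 + G * p

def pooled_params(p, m):
    """Free parameters of the single-group analysis of the pooled data:
    mean vector (p), loadings (p - m), factor covariances (0.5 m(m + 1)),
    and error variances (p); in total 3p + 0.5 m^2 - 0.5 m."""
    return p + (p - m) + m * (m + 1) // 2 + p

# The two-group, six-indicator, two-factor example from Paper III:
print(multi_group_params(G=2, p=6, m=2), pooled_params(p=6, m=2))  # 30 19
```

These counts match the 30 versus 19 free parameters reported for the numerical example in Paper III.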

Paper IV focuses on the properties of PIV. The simulation studies in both Bollen & Maydeu-Olivares (2007) and Nestler (2013) considered only dichotomous data. However, ordinal data with more than two categories are a more common scenario. To our knowledge, no simulation studies have investigated the properties of the PIV estimator in the context of ordinal data, and no studies have been devoted to PIV in a full SEM. Therefore, Paper IV conducts a simulation study to examine the properties of PIV in a CFA model and a full SEM with ordinal data. In particular, we are interested in estimation accuracy when the model is correctly and incorrectly specified.

4. Summary of papers

4.1 Paper I

In Paper I the ML fit function (2.2) is Taylor-expanded around the ML estimator θ̂ and the function

    (1/2)(θ − θ̂)^T [∂²F(θ̂)/∂θ∂θ^T] (θ − θ̂) + n P(|Λ|; β, w)   (4.1)

is minimized to provide an approximate solution for a penalty function P. Both the orthogonal and the oblique structure fit into the framework of the approximated penalized ML (4.1), and it naturally factorizes either a covariance or a correlation matrix. Equation (4.1) is merely a penalized weighted least squares problem and therefore provides a smooth and continuous solution path for a properly chosen penalty term. Plenty of efficient algorithms exist for optimizing a penalized least squares problem. Hence, approximated penalized ML is numerically more efficient than the EM-type algorithm in Choi et al. (2010). As a tradeoff, the approximated penalized ML depends on the starting point θ̂.

Two ways of selecting the tuning parameter are discussed, namely analytical selection and solution path-based selection. Analytical selection relies on a numerical criterion; typical examples are AIC, BIC, the mean squared error, and the Kullback-Leibler information. Solution path-based selection lets the researcher subjectively choose a path diagram among all the path diagrams suggested by the approximated penalized ML.

The approximated penalized ML is demonstrated on the Holzinger & Swineford (1939) test dataset. Various penalty terms (e.g., LASSO, SCAD, EN, MC+, and some of their variations) are applied, and they naturally produce sparse loading matrices with many zero loadings. It is demonstrated with the LASSO that approximated penalized ML continuously shrinks factor loadings to zero and suggests a series of loading matrices with different numbers of zero loadings. On the other hand, different combinations of a penalty term and an analytical selection method may lead to loading matrices with different numbers of zeros.
Different starting points also have an impact on the solution path. Therefore, approximated penalized ML can also be understood as a way to select the sufficient factor loadings continuously after factor rotation. Furthermore, factorizing a covariance matrix using approximated penalized ML may lead to a different loading matrix than factorizing a correlation matrix.
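In one dimension the structure of the approximated objective (4.1) is transparent: with a LASSO penalty, a quadratic approximation plus an L1 term is minimized by soft-thresholding, exactly as in (2.5)-(2.6). A sketch with hypothetical numbers, where h plays the role of the second derivative of the fit function:

```python
import numpy as np

def soft_threshold(x, t):
    """Minimizer of 0.5 * (theta - x)^2 + t * |theta| over theta; cf. (2.6)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# One-dimensional analogue of (4.1): 0.5*h*(theta - theta_hat)^2 + pen*|theta|.
h, theta_hat, pen = 4.0, 0.35, 0.8
closed_form = soft_threshold(theta_hat, pen / h)   # shrinks 0.35 to 0.15

# Sanity check against a brute-force grid search over theta.
grid = np.linspace(-1.0, 1.0, 200001)
obj = 0.5 * h * (grid - theta_hat) ** 2 + pen * np.abs(grid)
assert abs(grid[np.argmin(obj)] - closed_form) < 1e-4
print(closed_form)
```

A sufficiently large penalty (here pen/h ≥ |theta_hat|) sets the coefficient exactly to zero, which is the mechanism by which the approximated penalized ML produces sparse loading matrices.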

4.2 Paper II

In Paper II a simulation study is conducted to investigate the properties of EFA using the approximated penalized ML introduced in Paper I. Attention is restricted to an orthogonal EFA with continuous and normally distributed indicators. Three factor models are considered in the study; all have three factors, and the number of indicators ranges from 9 to 18. Two models have a perfect simple loading matrix, whereas the third model has cross loadings. The sample sizes considered are n = 100 and n = 200. The LASSO (Tibshirani, 1996), the naive EN (Zou & Hastie, 2005), the EN (Zou & Hastie, 2005), and their adaptive versions (Zou, 2006) are considered, as well as the SCAD (Fan & Li, 2001) and the MC+ (Zhang, 2010). To provide guidelines for choosing the optimal selection method, different analytical selection methods are compared. Hence, Paper II provides information on both the optimal analytical selection method for a given penalty and the optimal combination of the penalty and the selection method. For all penalty terms, the starting point is the varimax solution. For sparsity, approximated penalized ML is compared with the varimax solution truncated at the cut-off value 0.3; for estimation accuracy, it is compared with the varimax rotation both with and without the cut-off value. In the case of factorizing a covariance matrix, two approaches can be applied: the first factorizes the covariance matrix directly, whereas the second first factorizes the correlation matrix and then rescales the results back to the covariance structure using the estimated variances. Both approaches are considered and compared in the paper.

The varimax rotation with the cut-off value 0.3 is able to recover the correct loading matrix if it is perfect simple, but seldom recovers a loading matrix that is not perfect simple.
The approximated penalized ML with analytical selection methods does not perform as well as the varimax rotation if the loading matrix is perfect simple, but greatly improves on the varimax rotation otherwise. Further, the perfect simple loading matrix is frequently contained in the solution path, even though an analytical selection may suggest a different loading matrix. Various combinations of a penalty term and a selection method frequently produce a lower average mean squared error of the factor loadings, the covariance (or correlation) matrix, and the factor scores than the varimax solution without truncation. However, approximated penalized ML is generally inferior to the varimax solution with truncation if the loading matrix is perfect simple; otherwise, approximated penalized ML is generally better if the loading matrix contains small cross loadings. Given a penalty term, the optimal analytical selection criterion depends on the focus of the study and on whether a covariance matrix or a correlation matrix is factorized. The optimal combination of a penalty term and an analytical selection method also depends on the main purpose of the study, the true loading structure, and the sample size. Generally speaking, the adaptive penalties and the MC+ with

22 BIC and rescaling generally work well for all purposes considered in the study if a covariance matrix is factorized. The SCAD with BIC is also promising. On the other hand, the MC+ with BIC is a good universal choice if a correlation matrix is factorized. At last, rescaling improves the percentage of correct loading structures recovered except for the EN and SCAD. The effect of rescaling depends on the penalty term, but it generally improves estimation accuracy. 4.3 Paper III In Paper III we study pseudo ML for multi-group data under the assumption of strong measurement invariance. Data are observed from but pooled and the single-group model y g = µ + Λ f g + ε g, g = 1,2,...,G, (4.2) y = µ + Λ f + ε (4.3) is fitted. It is assumed that the pooled data follow a normal distribution and the normal theory ML fit function (2.2) for one group is minimized to obtain pseudo ML estimators. Via a numerical example, it is first shown that robustness of the normal theory does not hold for pooled data. In the example a two-group model with six indicators and two factors is considered. Asymptotic relative efficiency of the pseudo ML estimator of factor loadings with respect to the multi-group ML estimator is computed. It is seen that normal theory may grossly underestimate the standard errors, which are even lower than the standard errors produced by multi-group ML. Second, the correct sandwich estimator of standard errors for the pseudo ML is derived. Using the correct formula, the multi-group ML is still asymptotically efficient with smaller standard errors than those of pseudo ML. Third, a small Monte Carlo study shows that pseudo ML produces less inadmissible solutions than multi-group ML when the sample size is small and the pseudo ML estimator of factor loadings is as accurate as the multigroup estimator. The same multi-group model as in the numerical example is used. 
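The sandwich form follows the misspecified-ML theory of White (1982). Schematically, with ℓ_1 the log-likelihood contribution of one pooled observation and θ* the pseudo-true parameter (Paper III derives the specific matrices for the fit function (2.2)):

```latex
\sqrt{n}\,\bigl(\hat\theta - \theta_*\bigr) \;\xrightarrow{d}\;
  N\!\left(0,\; A^{-1} B\, A^{-1}\right),
\qquad
A = -\,E\!\left[\frac{\partial^2 \ell_1(\theta_*)}{\partial\theta\,\partial\theta'}\right],
\qquad
B = \operatorname{Var}\!\left[\frac{\partial \ell_1(\theta_*)}{\partial\theta}\right].
```

Naive normal theory reports A^{-1} alone, which is valid only when B = A; for pooled multi-group data this equality fails, which is why the uncorrected standard errors can be grossly wrong.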
Hence, there are 30 free parameters for the multi-group ML but only 19 free parameters for the pseudo ML. The reduction in the parameter space is substantial.

4.4 Paper IV

In Paper IV, the properties of PIV estimators for ordinal data are investigated through a simulation study. Rates of inadmissible solutions, bias of parameter estimates, bias of standard errors, and test statistics of PIV are compared with those of the system-wide methods, i.e. ULS and DWLS. A CFA model and a full SEM are considered in the paper. The CFA model has 12 indicators and four factors; the SEM has three latent endogenous variables with seven indicators and two latent exogenous variables with five indicators. All indicators are ordinal with five categories. Three levels of normality are considered, and three sample sizes are simulated (n = 400, 800, and 3200).

The correctly specified CFA model is fitted to investigate whether the PIV estimators are as accurate as the estimators from the system-wide methods. Four levels of misspecification are considered, in which a zero loading is altered to non-zero, one at a time. The newly added non-zero loadings are fixed at zero in the estimation; hence the CFA model is misspecified in the sense that non-zero loadings are omitted. Likewise, the correctly specified SEM and four levels of misspecified SEM are simulated, with misspecification introduced into the coefficient matrices, the measurement models, or both.

When the CFA model is correctly specified, the PIV estimators of factor loadings and factor covariances are as accurate as the system-wide methods, although non-normality may have negative effects. When the SEM is correctly specified, the average relative bias of the PIV estimators of the loadings and coefficient matrices is generally low, but PIV tends to produce a higher average relative mean squared error. When the CFA model is misspecified, PIV tends to produce more robust estimates of the factor loadings; however, PIV generally does not accurately estimate factor covariances, or standard errors for incorrectly specified loadings and factor covariances. When the SEM is misspecified, PIV generally produces less biased estimates of factor loadings, but it does not necessarily produce more accurate covariance estimates.
In addition, invalid instrumental variables are likely to be selected as a consequence of model misspecification, which affects estimation accuracy. The optimal choice of a test statistic for PIV depends on the sample size and the model specification. The mean-and-variance adjusted statistic proposed by Bollen & Maydeu-Olivares (2007) is generally preferred in the CFA model when the sample size is n = 400 or 800; in the SEM, however, it is slightly undersized and has low power.
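PIV applies a 2SLS-type estimator, equation by equation, to polychoric correlations (Bollen & Maydeu-Olivares, 2007). The underlying instrumental-variable logic can be sketched on continuous toy data; the following is a hypothetical illustration with made-up coefficients, not the design used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Toy structural model: y = 1.0 * x + u, where x is endogenous
# because it shares the disturbance u; z is a valid instrument
# (correlated with x, uncorrelated with u).
z = rng.standard_normal(n)
u = rng.standard_normal(n)
x = 0.8 * z + 0.5 * u + rng.standard_normal(n)
y = 1.0 * x + u

X = np.column_stack([np.ones(n), x])   # regressors: intercept, x
Z = np.column_stack([np.ones(n), z])   # instruments: intercept, z

# OLS is inconsistent here: cov(x, u) != 0 inflates the slope.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# 2SLS: first stage projects X onto Z; second stage regresses y
# on the fitted values, purging the endogenous part of x.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_2sls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
```

With this design the OLS slope is biased upwards by roughly cov(x, u)/var(x) ≈ 0.26, while the 2SLS slope is consistent for 1.0. In PIV the same algebra operates on an estimated polychoric correlation matrix rather than on raw data, which is why invalid instruments induced by misspecification degrade its accuracy.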

5. Conclusion

This thesis contributes to the estimation of factor analysis models and SEMs in several ways. First, the idea of model penalization and coefficient shrinkage, which has been extensively studied in the statistical literature over the past two decades, is combined with psychometric models. Penalized ML provides an alternative to the traditional estimation-rotation procedure and to the selection of factor loadings. Second, single-group analysis with pooled data, which is numerically simpler, provides an option to alleviate the numerical issues of a multi-group analysis. The study allows researchers to pool data and conduct a single-group analysis when the sample size is not large enough. Third, we caution that the robustness of normal theory does not carry over to pooled data: normal theory may severely underestimate the standard errors of factor loadings. Fourth, the properties of PIV are investigated through a simulation study, and the results shed some light on the comparison of system-wide methods and PIV.

Many future research projects can build on this thesis. The simulation study of approximated penalized ML is restricted to orthogonal EFA with continuous indicators; a future direction is to study oblique and ordinal EFA models. Although approximated penalized ML overcomes some limitations of penalized ML, its dependence on the starting point is noted in Paper I, so guidelines on the optimal starting point are desired. In addition, Paper III considers the case where group memberships are known; when group information is unknown, it would be worthwhile to compare pseudo ML with the mixture factor model. Studies on PIV can also be extended to other topics, for example model misspecification such as incorrectly scaling a cross loading to one in the first stage of PIV.

Acknowledgements

First I would like to gratefully thank my advisor Fan Yang-Wallentin. You brought me into the subject of structural equation modelling. As an advisor, you have put a lot of effort into supervising my research, introducing me to the big figures in the psychometric society, and creating opportunities for me. As a senior, you are always available for advice. I could not ask an advisor to do more than what you have done for me.

I also wish to thank another advisor of mine, Rolf Larsson. You supervised a small project of mine when I was a master's student in the department, and my licentiate thesis after I became a PhD student. You encouraged me to proceed to a PhD, shared many interesting comments with me, and discussed various topics with me. These experiences have always been very helpful.

I want to extend my thanks to Irini Moustaki, who hosted my visit at LSE as an exchange student. It was a short visit, but you made the stay very cheerful. It is an honour to cooperate with you. Your expertise in the area and enthusiasm towards research are the keys behind the papers with you.

My special thanks go to Adam Taube, who was my first advisor during my stay in Sweden. We had so many interesting conversations when you supervised my master's thesis and all the time after that. You are more like a kind and knowledgeable grandpa than an advisor. I am flattered to be your second best Chinese student who likes smörgåstårta.

I sincerely extend my thanks to my colleagues at the department, notably my officemates Björn Andersson and David Kreiberg. You are always in the department, day and night. Your company is an unfading colour of my life as a PhD student. To Björn Andersson: you literally witnessed my entire stay at the department, not only working hard in the office but also travelling around the world on three continents.
You are talented as a researcher, which pushes me to work harder, and trustworthy as a good friend, who keeps my secrets and helps me in many ways. To David Kreiberg: our endless academic discussions have helped me enhance my statistical knowledge, and our non-academic talks and buffet nights have made life after work as a PhD student exciting.

Many other people are part of my wonderful life in Sweden. Hao Li, Hao Luo, Jia Zhou, Daniel Ekbom, Jian Kang, Ran He, Qiao Wei, Xing He, and Zheng Ning, you are very good friends. My dormmates at 5082, Yin Zhang, Jiayin Zheng and Yongxin Xu, you still have my back even though you are not in Sweden. Yi Xu, we have not seen each other for a long time, but a "what's up" every day makes me feel like we were still young and still studying together somewhere in ZJG. Xuan Li, thanks for your company during the last part of my journey through the program.

Finally, I would like to thank my parents for years of support. You encouraged me to pursue a PhD when I was hesitating. Whenever I felt sad and lonely, I could always find my strength again in you. Without your love, I could not have made this thesis.

References

Amemiya, Y., Fuller, W. A., & Pantula, S. G. (1987). The asymptotic distributions of some estimators for a factor analysis model. Journal of Multivariate Analysis, 22.
Anderson, T. W., & Amemiya, Y. (1988). The asymptotic normal distribution of estimators in factor analysis under general conditions. The Annals of Statistics, 16.
Bollen, K. A. (1996). An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika, 61.
Bollen, K. A. (2001). Two-stage least squares and latent variable models: Simultaneous estimation and robustness to misspecifications. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural equation modeling: Present and future. Lincolnwood, IL: Scientific Software.
Bollen, K. A., & Maydeu-Olivares, A. (2007). A polychoric instrumental variable (PIV) estimator for structural equation models with categorical variables. Psychometrika, 72.
Browne, M. W. (1984). Asymptotically distribution-free methods in the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37.
Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivariate Behavioral Research, 36.
Carroll, J. B. (1953). An analytic rotation for approximating simple structure in factor analysis. Psychometrika, 18.
Choi, J., Zou, H., & Oehlert, G. (2010). A penalized maximum likelihood approach to sparse factor analysis. Statistics and Its Interface, 3.
Donoho, D. L., & Johnstone, J. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81.
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96.
Hair, J., Black, W., Babin, B., & Anderson, R. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Hendrickson, A. E., & White, P. O. (1964). Promax: A quick method for rotation to oblique simple structure. British Journal of Mathematical and Statistical Psychology, 17.
Hirose, K., & Yamamoto, M. (2014a). Estimation of an oblique structure via penalized likelihood factor analysis. Computational Statistics & Data Analysis, 79.
Hirose, K., & Yamamoto, M. (2014b). Sparse estimation via nonconcave penalized likelihood in factor analysis model. Statistics and Computing. Advance online publication.
Holzinger, K., & Swineford, F. (1939). A study in factor analysis: The stability of a bifactor solution. Supplementary Educational Monograph, No. 48. Chicago, IL: University of Chicago Press.
Jöreskog, K. G. (1967). Some contributions to maximum likelihood factor analysis. Psychometrika, 32.
Jöreskog, K. G. (1994). On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika, 59.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23.
Lawley, D. N. (1940). The estimation of factor loadings by the method of maximum likelihood. Proceedings of the Royal Society of Edinburgh, 60.
Luo, H. (2011). The effect of pooling multi-group data on the estimation of factor loadings. Unpublished manuscript, Department of Statistics, Uppsala University, Uppsala, Sweden.
Magnus, J. R., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics. Hoboken, NJ: Wiley.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58.
Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43.
Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Unpublished manuscript.
Nestler, S. (2013). A Monte Carlo study comparing PIV, ULS and DWLS in the estimation of dichotomous confirmatory factor analysis. British Journal of Mathematical and Statistical Psychology, 66.
Neuhaus, J. O., & Wrigley, C. (1954). The quartimax method: An analytical approach to orthogonal simple structure. British Journal of Mathematical and Statistical Psychology, 7.
Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44.
Papadopoulos, S., & Amemiya, Y. (2005). Correlated samples with fixed and nonnormal latent variables. The Annals of Statistics, 33.
Satorra, A. (2002). Asymptotic robustness in multiple group linear-latent variable models. Econometric Theory, 18.
Sörbom, D. (1974). A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology, 27.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 58.
Tibshirani, R. (2011). Regression shrinkage and selection via the lasso: a retrospective. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73.
White, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50.
Yanai, H., & Ichikawa, M. (2006). Factor analysis. In C. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26). North Holland: Elsevier.
Yuan, K.-H., & Bentler, P. M. (1997). Improving parameter tests in covariance structure analysis. Computational Statistics & Data Analysis, 26.
Zhang, C. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101.
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67.


Journal of Multivariate Analysis. Use of prior information in the consistent estimation of regression coefficients in measurement error models Journal of Multivariate Analysis 00 (2009) 498 520 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Use of prior information in

More information

Confirmatory Factor Analysis: Model comparison, respecification, and more. Psychology 588: Covariance structure and factor models

Confirmatory Factor Analysis: Model comparison, respecification, and more. Psychology 588: Covariance structure and factor models Confirmatory Factor Analysis: Model comparison, respecification, and more Psychology 588: Covariance structure and factor models Model comparison 2 Essentially all goodness of fit indices are descriptive,

More information

Evaluating the Sensitivity of Goodness-of-Fit Indices to Data Perturbation: An Integrated MC-SGR Approach

Evaluating the Sensitivity of Goodness-of-Fit Indices to Data Perturbation: An Integrated MC-SGR Approach Evaluating the Sensitivity of Goodness-of-Fit Indices to Data Perturbation: An Integrated MC-SGR Approach Massimiliano Pastore 1 and Luigi Lombardi 2 1 Department of Psychology University of Cagliari Via

More information

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population

More information

The Risk of James Stein and Lasso Shrinkage

The Risk of James Stein and Lasso Shrinkage Econometric Reviews ISSN: 0747-4938 (Print) 1532-4168 (Online) Journal homepage: http://tandfonline.com/loi/lecr20 The Risk of James Stein and Lasso Shrinkage Bruce E. Hansen To cite this article: Bruce

More information

arxiv: v3 [stat.me] 15 Mar 2013

arxiv: v3 [stat.me] 15 Mar 2013 Sparse estimation via non-concave penalized likelihood in factor analysis model Kei Hirose and Michio Yamamoto Division of Mathematical Science, Graduate School of Engineering Science, Osaka University,

More information

Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification

Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification Behavior Research Methods 29, 41 (4), 138-152 doi:1.3758/brm.41.4.138 Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification CARMEN XIMÉNEZ Autonoma

More information

For more information about how to cite these materials visit

For more information about how to cite these materials visit Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/

More information

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised )

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised ) Ronald H. Heck 1 University of Hawai i at Mānoa Handout #20 Specifying Latent Curve and Other Growth Models Using Mplus (Revised 12-1-2014) The SEM approach offers a contrasting framework for use in analyzing

More information

TECHNICAL REPORT NO. 1091r. A Note on the Lasso and Related Procedures in Model Selection

TECHNICAL REPORT NO. 1091r. A Note on the Lasso and Related Procedures in Model Selection DEPARTMENT OF STATISTICS University of Wisconsin 1210 West Dayton St. Madison, WI 53706 TECHNICAL REPORT NO. 1091r April 2004, Revised December 2004 A Note on the Lasso and Related Procedures in Model

More information

Exploratory Factor Analysis: dimensionality and factor scores. Psychology 588: Covariance structure and factor models

Exploratory Factor Analysis: dimensionality and factor scores. Psychology 588: Covariance structure and factor models Exploratory Factor Analysis: dimensionality and factor scores Psychology 588: Covariance structure and factor models How many PCs to retain 2 Unlike confirmatory FA, the number of factors to extract is

More information

Sparse orthogonal factor analysis

Sparse orthogonal factor analysis Sparse orthogonal factor analysis Kohei Adachi and Nickolay T. Trendafilov Abstract A sparse orthogonal factor analysis procedure is proposed for estimating the optimal solution with sparse loadings. In

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Methods for Handling Missing Non-Normal Data in Structural Equation Modeling. Fan Jia

Methods for Handling Missing Non-Normal Data in Structural Equation Modeling. Fan Jia Methods for Handling Missing Non-Normal Data in Structural Equation Modeling By Fan Jia Submitted to the graduate degree program in the Department of Psychology and the Graduate Faculty of the University

More information

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT

More information

Penalized varimax. Abstract

Penalized varimax. Abstract Penalized varimax 1 Penalized varimax Nickolay T. Trendafilov and Doyo Gragn Department of Mathematics and Statistics, The Open University, Walton Hall, Milton Keynes MK7 6AA, UK Abstract A common weakness

More information

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models General structural model Part 2: Categorical variables and beyond Psychology 588: Covariance structure and factor models Categorical variables 2 Conventional (linear) SEM assumes continuous observed variables

More information

Analysis Methods for Supersaturated Design: Some Comparisons

Analysis Methods for Supersaturated Design: Some Comparisons Journal of Data Science 1(2003), 249-260 Analysis Methods for Supersaturated Design: Some Comparisons Runze Li 1 and Dennis K. J. Lin 2 The Pennsylvania State University Abstract: Supersaturated designs

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

arxiv: v1 [stat.me] 30 Aug 2018

arxiv: v1 [stat.me] 30 Aug 2018 BAYESIAN MODEL AVERAGING FOR MODEL IMPLIED INSTRUMENTAL VARIABLE TWO STAGE LEAST SQUARES ESTIMATORS arxiv:1808.10522v1 [stat.me] 30 Aug 2018 Teague R. Henry Zachary F. Fisher Kenneth A. Bollen department

More information

Iterative Selection Using Orthogonal Regression Techniques

Iterative Selection Using Orthogonal Regression Techniques Iterative Selection Using Orthogonal Regression Techniques Bradley Turnbull 1, Subhashis Ghosal 1 and Hao Helen Zhang 2 1 Department of Statistics, North Carolina State University, Raleigh, NC, USA 2 Department

More information

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS NOTES FROM PRE- LECTURE RECORDING ON PCA PCA and EFA have similar goals. They are substantially different in important ways. The goal

More information

Multivariate Fundamentals: Rotation. Exploratory Factor Analysis

Multivariate Fundamentals: Rotation. Exploratory Factor Analysis Multivariate Fundamentals: Rotation Exploratory Factor Analysis PCA Analysis A Review Precipitation Temperature Ecosystems PCA Analysis with Spatial Data Proportion of variance explained Comp.1 + Comp.2

More information

SEM for Categorical Outcomes

SEM for Categorical Outcomes This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Evaluating Small Sample Approaches for Model Test Statistics in Structural Equation Modeling

Evaluating Small Sample Approaches for Model Test Statistics in Structural Equation Modeling Multivariate Behavioral Research, 9 (), 49-478 Copyright 004, Lawrence Erlbaum Associates, Inc. Evaluating Small Sample Approaches for Model Test Statistics in Structural Equation Modeling Jonathan Nevitt

More information

Regularization: Ridge Regression and the LASSO

Regularization: Ridge Regression and the LASSO Agenda Wednesday, November 29, 2006 Agenda Agenda 1 The Bias-Variance Tradeoff 2 Ridge Regression Solution to the l 2 problem Data Augmentation Approach Bayesian Interpretation The SVD and Ridge Regression

More information

Educational and Psychological Measurement

Educational and Psychological Measurement Target Rotations and Assessing the Impact of Model Violations on the Parameters of Unidimensional Item Response Theory Models Journal: Educational and Psychological Measurement Manuscript ID: Draft Manuscript

More information

Latent variable interactions

Latent variable interactions Latent variable interactions Bengt Muthén & Tihomir Asparouhov Mplus www.statmodel.com November 2, 2015 1 1 Latent variable interactions Structural equation modeling with latent variable interactions has

More information

Pre-Selection in Cluster Lasso Methods for Correlated Variable Selection in High-Dimensional Linear Models

Pre-Selection in Cluster Lasso Methods for Correlated Variable Selection in High-Dimensional Linear Models Pre-Selection in Cluster Lasso Methods for Correlated Variable Selection in High-Dimensional Linear Models Niharika Gauraha and Swapan Parui Indian Statistical Institute Abstract. We consider variable

More information

MSA220/MVE440 Statistical Learning for Big Data

MSA220/MVE440 Statistical Learning for Big Data MSA220/MVE440 Statistical Learning for Big Data Lecture 9-10 - High-dimensional regression Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Recap from

More information

Psychology 454: Latent Variable Modeling How do you know if a model works?

Psychology 454: Latent Variable Modeling How do you know if a model works? Psychology 454: Latent Variable Modeling How do you know if a model works? William Revelle Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 18 Outline 1 Goodness

More information

MODEL IMPLIED INSTRUMENTAL VARIABLE ESTIMATION FOR MULTILEVEL CONFIRMATORY FACTOR ANALYSIS. Michael L. Giordano

MODEL IMPLIED INSTRUMENTAL VARIABLE ESTIMATION FOR MULTILEVEL CONFIRMATORY FACTOR ANALYSIS. Michael L. Giordano MODEL IMPLIED INSTRUMENTAL VARIABLE ESTIMATION FOR MULTILEVEL CONFIRMATORY FACTOR ANALYSIS Michael L. Giordano A thesis submitted to the faculty at the University of North Carolina at Chapel Hill in partial

More information

Factor Analysis & Structural Equation Models. CS185 Human Computer Interaction

Factor Analysis & Structural Equation Models. CS185 Human Computer Interaction Factor Analysis & Structural Equation Models CS185 Human Computer Interaction MoodPlay Recommender (Andjelkovic et al, UMAP 2016) Online system available here: http://ugallery.pythonanywhere.com/ 2 3 Structural

More information

Bayesian Analysis of Latent Variable Models using Mplus

Bayesian Analysis of Latent Variable Models using Mplus Bayesian Analysis of Latent Variable Models using Mplus Tihomir Asparouhov and Bengt Muthén Version 2 June 29, 2010 1 1 Introduction In this paper we describe some of the modeling possibilities that are

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Factor Analysis (10/2/13)

Factor Analysis (10/2/13) STA561: Probabilistic machine learning Factor Analysis (10/2/13) Lecturer: Barbara Engelhardt Scribes: Li Zhu, Fan Li, Ni Guan Factor Analysis Factor analysis is related to the mixture models we have studied.

More information

Lecture 14: Variable Selection - Beyond LASSO

Lecture 14: Variable Selection - Beyond LASSO Fall, 2017 Extension of LASSO To achieve oracle properties, L q penalty with 0 < q < 1, SCAD penalty (Fan and Li 2001; Zhang et al. 2007). Adaptive LASSO (Zou 2006; Zhang and Lu 2007; Wang et al. 2007)

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

Introduction to Structural Equation Modeling

Introduction to Structural Equation Modeling Introduction to Structural Equation Modeling Notes Prepared by: Lisa Lix, PhD Manitoba Centre for Health Policy Topics Section I: Introduction Section II: Review of Statistical Concepts and Regression

More information

Principal Component Analysis & Factor Analysis. Psych 818 DeShon

Principal Component Analysis & Factor Analysis. Psych 818 DeShon Principal Component Analysis & Factor Analysis Psych 818 DeShon Purpose Both are used to reduce the dimensionality of correlated measurements Can be used in a purely exploratory fashion to investigate

More information

ABSTRACT. Phillip Edward Gagné. priori information about population membership. There have, however, been

ABSTRACT. Phillip Edward Gagné. priori information about population membership. There have, however, been ABSTRACT Title of dissertation: GENERALIZED CONFIRMATORY FACTOR MIXTURE MODELS: A TOOL FOR ASSESSING FACTORIAL INVARIANCE ACROSS UNSPECIFIED POPULATIONS Phillip Edward Gagné Dissertation directed by: Professor

More information

Exploratory Graph Analysis: A New Approach for Estimating the Number of Dimensions in Psychological Research. Hudson F. Golino 1*, Sacha Epskamp 2

Exploratory Graph Analysis: A New Approach for Estimating the Number of Dimensions in Psychological Research. Hudson F. Golino 1*, Sacha Epskamp 2 arxiv:1605.02231 [stat.ap] 1 Exploratory Graph Analysis: A New Approach for Estimating the Number of Dimensions in Psychological Research. Hudson F. Golino 1*, Sacha Epskamp 2 1 Graduate School of Psychology,

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

General structural model Part 1: Covariance structure and identification. Psychology 588: Covariance structure and factor models

General structural model Part 1: Covariance structure and identification. Psychology 588: Covariance structure and factor models General structural model Part 1: Covariance structure and identification Psychology 588: Covariance structure and factor models Latent variables 2 Interchangeably used: constructs --- substantively defined

More information

Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances

Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances METRON - International Journal of Statistics 2008, vol. LXVI, n. 3, pp. 285-298 SHALABH HELGE TOUTENBURG CHRISTIAN HEUMANN Mean squared error matrix comparison of least aquares and Stein-rule estimators

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Outlier detection and variable selection via difference based regression model and penalized regression

Outlier detection and variable selection via difference based regression model and penalized regression Journal of the Korean Data & Information Science Society 2018, 29(3), 815 825 http://dx.doi.org/10.7465/jkdi.2018.29.3.815 한국데이터정보과학회지 Outlier detection and variable selection via difference based regression

More information

Multiple group models for ordinal variables

Multiple group models for ordinal variables Multiple group models for ordinal variables 1. Introduction In practice, many multivariate data sets consist of observations of ordinal variables rather than continuous variables. Most statistical methods

More information

Least Absolute Shrinkage is Equivalent to Quadratic Penalization

Least Absolute Shrinkage is Equivalent to Quadratic Penalization Least Absolute Shrinkage is Equivalent to Quadratic Penalization Yves Grandvalet Heudiasyc, UMR CNRS 6599, Université de Technologie de Compiègne, BP 20.529, 60205 Compiègne Cedex, France Yves.Grandvalet@hds.utc.fr

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information