On the relationships among latent variables and residuals in PLS path modeling: The formative-reflective scheme

Computational Statistics & Data Analysis 51 (2007) 5828–5846
www.elsevier.com/locate/csda

Giorgio Vittadini, Simona C. Minotti, Marco Fattore, Pietro G. Lovaglio

Dipartimento di Statistica, Università degli Studi di Milano-Bicocca, Via Bicocca degli Arcimboldi 8, Milano, Italy

Received 4 December 2005; received in revised form 26 October 2006; accepted 26 October 2006. Available online 27 November 2006.

Abstract

A new approach for the estimation and the validation of a structural equation model with a formative-reflective scheme is presented. The basis of the paper is a proposal for overcoming a potential deficiency of PLS path modeling. In the PLS approach the reflective scheme assumed for the endogenous latent variables (LVs) is inverted; moreover, the model errors are not explicitly taken into account for the estimation of the endogenous LVs. The proposed approach utilizes all the relevant information in the formative manifest variables (MVs), providing solutions which respect the causal structure of the model. The estimation procedure is based on the optimization of the redundancy criterion. The new approach, entitled redundancy analysis approach to path modeling (RA-PM), is compared with both traditional PLS path modeling and LISREL methodology, on the basis of real and simulated data.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Latent variables; Partial least squares; PLS path modeling; Redundancy analysis; LISREL model

1. Introduction

This paper proposes a new approach for the estimation and the validation of a structural equation model characterized by a formative scheme for exogenous latent variables (LVs) and a reflective scheme for endogenous LVs. The proposal consists of an alternative to the PLS path modeling approach (PLS-PM) introduced by Lohmöller (1989).

In order to examine the implications involved in the adoption of a formative-reflective scheme, let us examine an example regarding the causal links between the investment in human capital (HC), i.e. educational attainment and work experience, and the ability to generate earned and property income, as indicated in Fig. 1. The HC model, which is described in detail in Section 7, consists of two exogenous LVs, educational HC and work experience HC, and two endogenous LVs which can be defined as the ability to generate earned income and the ability to generate property income. These two endogenous LVs represent the innate capability to transform educational attainment and work experience into the process of accumulation of earned and property income.

The results of the application reveal some critical points in PLS-PM. First of all, the goodness of fit is quite low (GoF = 0.3863) and, as Goldstein–Dillon's ρ shows, the formative block containing the Job indicators is not unidimensional (ρ = 0.1630). We expect that the extraction of further information from the formative manifest variables (MVs) will improve the model's goodness of fit. Hence we suggest the explicit introduction of the model errors (estimated

by means of the residuals) into PLS-PM. Secondly, we observe that the redundancy of the reflective MVs accounted for by the estimated endogenous LVs (40.62%) is greater than the redundancy accounted for by the formative MVs (34.51%). This anomalous result depends on Lohmöller's algorithm, which estimates the endogenous LVs as linear combinations of their reflective MVs.

[Fig. 1. The structural model for human capital: the Educational indicators and the Job indicators form, respectively, Educational human capital and Work experience human capital; these affect the Ability to generate earned income (reflected by the Earned income indicators) and the Ability to generate property income (reflected by the Property income indicators).]

In order to resolve the problems mentioned above, we propose a new approach based on the optimization of the redundancy criterion (Stewart and Love, 1968). The new approach, entitled RA-PM (redundancy analysis approach to path modeling), utilizes all the relevant information in the formative MVs and provides solutions which respect the causal structure of the model.

The paper is organized as follows: in Section 2, the formative-reflective model is introduced; in Section 3, the traditional methodologies utilized for its estimation are discussed; in Section 4, the RA-PM algorithm is presented; in Section 5, an appropriate measure of goodness of fit is proposed; in Section 6, a discussion of the basic choices and properties of RA-PM is provided; in Sections 7 and 8, RA-PM is compared with traditional PLS-PM and the LISREL methodology, on the basis of real and simulated data; in Section 9, some conclusions and further research are proposed.

2. The formative-reflective model

The non-recursive structural equation model considered in this paper is characterized by a formative scheme for the exogenous LVs and a reflective scheme for the endogenous LVs.

The measurement model of the exogenous LVs imposes a formative relationship between each LV and its corresponding MVs, i.e. each exogenous LV summarizes its MVs:

\xi_{(j)} = \omega_{(j)}' X_{(j)} + u_{(j)}, \quad j = 1, \ldots, r,  (1)

where \xi_{(j)}, j = 1, \ldots, r, is a random variable denoting the jth exogenous LV; X_{(j)} = (X_{(j)1}, \ldots, X_{(j)p_j})' is a p_j-dimensional vector of centred observable random variables; \omega_{(j)} = (\omega_{(j)1}, \ldots, \omega_{(j)p_j})' is a p_j-dimensional vector of unknown outer weights; u_{(j)} is a random disturbance, with null expected value and uncorrelated with the MVs X_{(j)}. It can be observed that, in the formative scheme, the blocks of MVs can be multidimensional (Tenenhaus et al., 2005).

The measurement model of the endogenous LVs imposes a reflective relationship between each LV and its corresponding MVs, i.e. each endogenous LV is described by its MVs:

Y_{(k)} = \lambda_{(k)} \eta_{(k)} + \varepsilon_{(k)}, \quad k = 1, \ldots, s,  (2)

where \eta_{(k)}, k = 1, \ldots, s, is a random variable which denotes the kth endogenous LV; Y_{(k)} = (Y_{(k)1}, \ldots, Y_{(k)q_k})' is a q_k-dimensional vector of centred observable random variables; \lambda_{(k)} = (\lambda_{(k)1}, \ldots, \lambda_{(k)q_k})' is a q_k-dimensional vector of unknown loadings; \varepsilon_{(k)} = (\varepsilon_{(k)1}, \ldots, \varepsilon_{(k)q_k})' is a q_k-dimensional vector of errors of measurement, with null expected value and uncorrelated with \eta_{(k)}. It can be observed that, in the reflective scheme, the blocks of MVs are unidimensional, in accordance with factor analysis (Tenenhaus et al., 2005).

The model is completed by the structural equation which describes the relationship between the LVs:

\eta_{(k)} = \beta_{(k)}' \eta + \gamma_{(k)}' \xi + \zeta_{(k)}, \quad k = 1, \ldots, s,  (3)

[Fig. 2. A formative-reflective structural equation model: X_{(1)} → ξ_{(1)} → η_{(1)} → Y_{(1)} and X_{(2)} → ξ_{(2)} → η_{(2)} → Y_{(2)}, with errors in equations ζ_{(1)} and ζ_{(2)}.]

where \eta is a random vector which contains the endogenous LVs; \xi is a random vector which contains the exogenous LVs; \beta_{(k)} and \gamma_{(k)} are vectors of s and r unknown path coefficients, respectively; \zeta_{(k)} is a random variable indicating the error in equation associated with the kth endogenous LV \eta_{(k)}. Errors in equations may represent the effect of unknown variables or the effect of known but omitted variables (Saris and Stronkhorst, 1984).

In a compact form, model (1)–(3) can be rewritten as

\xi = \Omega_X' X + u,  (4)

Y = \Lambda_Y \eta + \varepsilon  (5)

and

\eta = B' \eta + \Gamma' \xi + \zeta = (I - B')^{-1} \Gamma' \xi + (I - B')^{-1} \zeta.  (6)

Given the observations of n independent drawings of the random vectors X_{(j)} and Y_{(k)}, by arranging the observations in n-dimensional column vectors, model (4)–(6) becomes

\Xi = X \Omega_X + U,  (7)

Y = H \Lambda_Y' + E  (8)

and

H = H B + \Xi \Gamma + Z = \Xi \Gamma (I - B)^{-1} + Z (I - B)^{-1},  (9)

where \Xi = (\xi_{(1)}, \ldots, \xi_{(r)}) is a (n × r) matrix of latent scores on \xi; H = (\eta_{(1)}, \ldots, \eta_{(s)}) is a (n × s) matrix of latent scores on \eta; X = (x_{(1)}, \ldots, x_{(p)}) is a (n × p) matrix of observations on X; Y = (y_{(1)}, \ldots, y_{(q)}) is a (n × q) matrix of observations on Y; U is a (n × r) matrix of random disturbances; E = (\varepsilon_{(1)}, \ldots, \varepsilon_{(q)}) is a (n × q) matrix of errors of measurement; Z = (\zeta_{(1)}, \ldots, \zeta_{(s)}) is a (n × s) matrix of errors in equations. In the special case of r = 2 and s = 2, model (7)–(9) corresponds to the HC model, completed by the errors in equations (see Fig. 2).

A suitable estimation method for model (7)–(9) should have the following properties:

(i) The parameter identifiability.
(ii) The non-correlation between exogenous LVs and errors in equations.
(iii) The uniqueness of the latent scores.
(iv) The coherence of the solutions with the causal structure of the model (i.e. the path directions).
(v) The utilization of all the relevant information in the formative MVs, which implies the modelization of the residuals.

3. Problems in PLS path modeling and LISREL model

PLS-PM (Wold, 1982) satisfies properties (i) and (iii) (given that the rank of the components model is equal to the rank of the MVs, Lohmöller, 1989), but does not respect properties (ii), (iv) and (v), as will now be discussed. For the sake of simplicity, the discussion will regard the case of a formative-reflective model characterized by r = 1 and s = 1.

In traditional PLS-PM, LVs are estimated as linear combinations of their MVs, with weights obtained by means of an iterative procedure which takes into account the inner relations of the model, as will now be briefly described (for an up-to-date review, see Tenenhaus et al., 2005).

Outer estimation: The standardized outer estimates of the latent score vectors \xi and \eta, indicated by \hat{\xi}_{outer} and \hat{\eta}_{outer}, respectively, are obtained as linear combinations of their observed MVs, given the arbitrary vectors of outer weights \omega_0 and \upsilon_0:

\hat{\xi}_{outer} \propto X \omega_0  (10)

and

\hat{\eta}_{outer} \propto Y \upsilon_0.  (11)

Inner estimation: The standardized inner estimates of \xi and \eta, indicated by \hat{\xi}_{inner} and \hat{\eta}_{inner}, respectively, are obtained as linear combinations of the standardized outer estimates of their adjacent LVs:

\hat{\xi}_{inner} \propto \hat{\gamma} \hat{\eta}_{outer}  (12)

and

\hat{\eta}_{inner} \propto \hat{\beta} \hat{\xi}_{outer},  (13)

where the inner weights \hat{\gamma} and \hat{\beta} can be alternatively defined by means of the centroid scheme, the factorial scheme or the path weighting scheme.

Estimation modes for the outer weights: The outer weights \omega_0 are updated by means of Mode B (i.e. a multiple regression of the inner estimate \hat{\xi}_{inner} on matrix X), while the weights \upsilon_0 are updated by means of Mode A (simple regressions of the vectors y_h, h = 1, \ldots, q, on the inner estimate \hat{\eta}_{inner}):

\hat{\omega} = (X'X)^{-1} X' \hat{\xi}_{inner}  (14)

and

\hat{\upsilon}' = (\hat{\eta}_{inner}' \hat{\eta}_{inner})^{-1} \hat{\eta}_{inner}' Y = \hat{\eta}_{inner}' Y.  (15)

The procedure is iterated until the algorithm converges, producing

\hat{\xi} \propto X (X'X)^{-1} X' \hat{\eta}  (16)

and

\hat{\eta} \propto Y Y' \hat{\xi}.  (17)

From (16) and (17) it follows that PLS-PM does not respect properties (iv) and (v), given that:

- The final estimate \hat{\xi} in (16) is proportional to the final estimate \hat{\eta}, and this does not respect the causal relation implied in (3). Moreover, the final estimate \hat{\eta} in (17) is a linear combination of the columns of matrix Y, and this does not respect the reflective scheme assumed in (5), although the outer weights, by means of Mode A, preserve this assumption.
- The inner estimate \hat{\eta}_{inner} in (13) does not explicitly take into account the error in equation, i.e. the further components underlying the block of formative MVs, which can be multidimensional (Tenenhaus et al., 2005).
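To make the r = s = 1 iteration above concrete, here is a minimal Python sketch of Eqs. (10)–(15); the function names, starting weights, tolerance and the use of the centroid scheme for the inner weights are our illustrative choices, not prescriptions of the original algorithm.

```python
import numpy as np

def standardize(v):
    """Centre and scale a score vector to unit variance."""
    return (v - v.mean()) / v.std()

def pls_pm_r1_s1(X, Y, tol=1e-10, max_iter=500):
    """PLS-PM for one formative block X and one reflective block Y:
    Mode B outer weights for X, Mode A for Y (Eqs. (10)-(15))."""
    n = X.shape[0]
    omega = np.ones(X.shape[1])          # arbitrary starting weights
    upsilon = np.ones(Y.shape[1])
    xi = standardize(X @ omega)          # outer estimates, Eqs. (10)-(11)
    eta = standardize(Y @ upsilon)
    for _ in range(max_iter):
        xi_old = xi
        c = np.sign(np.corrcoef(xi, eta)[0, 1])  # centroid inner weight
        xi_inner = standardize(c * eta)           # Eq. (12)
        eta_inner = standardize(c * xi)           # Eq. (13)
        omega = np.linalg.solve(X.T @ X, X.T @ xi_inner)  # Mode B, Eq. (14)
        upsilon = Y.T @ eta_inner / n                      # Mode A, Eq. (15)
        xi = standardize(X @ omega)
        eta = standardize(Y @ upsilon)
        if np.max(np.abs(xi - xi_old)) < tol:
            break
    return xi, eta, omega, upsilon
```

With a single pair of adjacent LVs the centroid, factorial and path weighting schemes coincide after standardization, and at convergence the estimates reproduce the proportionality relations (16) and (17).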

It can be observed that the LISREL model (Jöreskog, 1973, 1981, 2000) satisfies properties (ii), (iv) and (v), but the formative scheme assumed in (1) is substituted by a reflective scheme, which implies that properties (i) and (iii) are not respected. In fact, in the LISREL model, instead of (7) we have

X = \Xi \Lambda_X' + \Delta  (18)

and consequently the reduced form of model (7)–(9) is

J = M L_M' + T,  (19)

with a covariance structure equal to

\Sigma_J = L_M \Sigma_M L_M' + \Sigma_T,  (20)

where J = [X, Y], M = [\Xi, Z],

L_M' = \begin{bmatrix} \Lambda_X' & \Gamma (I - B)^{-1} \Lambda_Y' \\ 0 & (I - B)^{-1} \Lambda_Y' \end{bmatrix}

and T = [\Delta, E].

Since no necessary and sufficient conditions for identification are available, Jöreskog (1981) suggests that the identification problem be studied on a case by case basis, examining the equations and choosing the restrictions, not only in number but also in position, in order to obtain unique parameters. Moreover, it has been demonstrated that even if the parameters of the LISREL model are perfectly identified, the scores of the LVs are not unique (Guttman, 1955; Schönemann, 1971; Vittadini, 1989). Given that the infinite scores of each LV in M express the same concept, it would be opportune if these scores were highly and positively correlated. In practice, the correlations between the two maximally different solutions M and M*, given by the diagonal elements of the matrix

2 D_M^{-1/2} \Sigma_M L_M' \Sigma_J^{-1} L_M \Sigma_M D_M^{-1/2} - D_M^{-1/2} \Sigma_M D_M^{-1/2}  (21)

(where D_M is the diagonal matrix of \Sigma_M), can be weak or even negative.

4. The redundancy analysis approach to path modeling

In order to obtain solutions which respect properties (i)–(v), we propose a new approach based on the optimization of the redundancy criterion (Stewart and Love, 1968), which is entitled RA-PM. Traditional redundancy analysis (Van den Wollenberg, 1977) does not calculate errors in equations and endogenous LVs, because it does not refer to structural equation models. RA-PM is an extension of redundancy analysis to a formative-reflective structural equation model with r formative blocks and s reflective blocks. It consists of an iterative procedure which is characterized by the following steps:

1. Estimation of exogenous LVs and initial estimation of endogenous LVs: The initialization of the RA-PM algorithm is given by estimating each exogenous LV as the first redundancy component of the Y variables on the corresponding X block. The choice of the redundancy criterion ensures that the maximum amount of information concerning Y is extracted from each X block and that the causal structure of the model is respected. Next, the initial estimates of the endogenous LVs are obtained by means of a redundancy analysis of each single Y block on the estimates of the exogenous LVs.

2. Estimation of further components and updated estimation of endogenous LVs: The variables within each X block are projected onto the orthogonal complement of the linear space spanned by all the redundancy components extracted previously. Thus, an orthogonal subspace is identified for each X block. These subspaces are referred to as residual X blocks. A second set of components is estimated by means of a redundancy analysis of the Y variables on each residual X block. Next, the endogenous LVs are re-estimated by means of a redundancy analysis of each single Y block on the linear space spanned by the first and the second set of extracted components. This scheme is iterated according to a forward selection procedure, which considers new residual blocks, extracts the corresponding redundancy components and updates the estimates of the endogenous LVs, until the accounted redundancy added at each iteration becomes marginal.

3. Refinement of the solutions: A backward selection procedure on the extracted components is performed in order to identify the minimal subset of components which accounts for a given Y redundancy level. Note that although backward selection requires some amount of computational work, it guarantees that we obtain a more parsimonious model. On the basis of the components which have been selected by backward selection, the final estimates of the endogenous LVs are computed.

4. Estimation of structural parameters and loadings: Finally, the structural parameters of the model are estimated, together with the errors in equations. Moreover, the estimation of the loadings of each y variable on the corresponding endogenous LV is provided.

In the following, we present each step analytically, assuming that the MVs are centred.

4.1. Estimation of exogenous LVs and initial estimation of endogenous LVs

Estimation of exogenous LVs: Each exogenous LV \xi_{(j)}, j = 1, \ldots, r, is estimated as the first redundancy component

\hat{\xi}_{(j)} = X_{(j)} \hat{\omega}_{(j)},  (22)

i.e. the unit variance linear combination of block X_{(j)} which maximizes the redundancy index R^{(\xi)}_{(j)}, defined as

R^{(\xi)}_{(j)} = \sum_{k=1}^{q} \mathrm{cor}(X_{(j)} \omega_{(j)}, y_k)^2 / q.  (23)

In this way, \hat{\omega}_{(j)} is computed as the eigenvector corresponding to the highest eigenvalue of matrix \Sigma_{X_{(j)} X_{(j)}}^{-1} \Sigma_{X_{(j)} Y} \Sigma_{Y X_{(j)}}, under the constraint

\hat{\omega}_{(j)}' \Sigma_{X_{(j)} X_{(j)}} \hat{\omega}_{(j)} = 1,  (24)

in such a way that \hat{\xi}_{(j)} is standardized. We have thus established the following association:

X_{(1)} → \hat{\xi}_{(1)}, \ldots, X_{(r)} → \hat{\xi}_{(r)}.

The estimates of the exogenous LVs are arranged in matrix \hat{\Xi}, defined as

\hat{\Xi} = [\hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)}].  (25)

Initial estimation of endogenous LVs: For each block Y_{(k)}, k = 1, \ldots, s, the endogenous LV \eta_{(k)} is estimated as the first redundancy component \hat{\eta}^{(in)}_{(k)} on the linear space V^{(\Xi)} spanned by the columns of \hat{\Xi}, i.e. the unit variance linear combination of the estimated exogenous LVs which maximizes the redundancy index R^{(\eta_{in})}_{(k)}, defined as

R^{(\eta_{in})}_{(k)} = \sum_{i=1}^{q_k} \mathrm{cor}(\hat{\Xi} \rho_{(k)}, y_{(k)i})^2 / q_k.  (26)

In this way, \hat{\rho}_{(k)} is the eigenvector corresponding to the highest eigenvalue of matrix \Sigma_{\hat{\Xi} \hat{\Xi}}^{-1} \Sigma_{\hat{\Xi} Y_{(k)}} \Sigma_{Y_{(k)} \hat{\Xi}}, under the constraint

\hat{\rho}_{(k)}' \Sigma_{\hat{\Xi} \hat{\Xi}} \hat{\rho}_{(k)} = 1,  (27)

in such a way that \hat{\eta}^{(in)}_{(k)} is standardized. We have thus established the following association:

Y_{(1)} → \hat{\eta}^{(in)}_{(1)}, \ldots, Y_{(s)} → \hat{\eta}^{(in)}_{(s)}.
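As a concrete counterpart of Eqs. (22)–(24), the following sketch (ours) computes a first redundancy component from sample covariances; it assumes centred X columns and standardized Y columns, so that correlations with a unit-variance component reduce to covariances. A pseudo-inverse is used so that the same routine also applies to the rank-deficient residual blocks of Section 4.2.

```python
import numpy as np

def first_redundancy_component(X, Y):
    """First redundancy component of Y on block X (Eqs. (22)-(24)):
    the weight vector is the leading eigenvector of Sxx^{-1} Sxy Syx,
    normalized so that the component has unit variance."""
    n = X.shape[0]
    Sxx = X.T @ X / n                    # Sigma_XX
    Sxy = X.T @ Y / n                    # Sigma_XY
    M = np.linalg.pinv(Sxx) @ Sxy @ Sxy.T
    vals, vecs = np.linalg.eig(M)
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    w = w / np.sqrt(w @ Sxx @ w)         # constraint w' Sxx w = 1, Eq. (24)
    return X @ w, w                      # standardized component and weights
```

The initial endogenous estimates of Eqs. (26)–(27) are obtained by the same routine, called with \hat{\Xi} in place of the X block and the single block Y_{(k)} in place of Y.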

4.2. Estimation of further components and updated estimation of endogenous LVs

Estimation of further components: We define V^{(X)} as the linear space spanned by X_{(1)}, \ldots, X_{(r)}, V^{(\Xi)} as the linear space spanned by \hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)}, and P_V and Q_V = I - P_V as the orthogonal projectors onto V^{(\Xi)} and its orthogonal complement, respectively (here the orthogonal complement is defined in V^{(X)}). For j = 1, \ldots, r, the residual blocks Q_V X_{(j)} are taken into consideration. The second component of block X_{(j)} is estimated as the first redundancy component, i.e. the unit variance linear combination

\hat{\phi}_{(j)} = Q_V X_{(j)} \hat{\psi}_{(j)}, \quad j = 1, \ldots, r,  (28)

which maximizes the redundancy index R^{(\phi)}_{(j)}, defined as

R^{(\phi)}_{(j)} = \sum_{k=1}^{q} \mathrm{cor}(Q_V X_{(j)} \psi_{(j)}, y_k)^2 / q.  (29)

In this way, \hat{\psi}_{(j)} is the eigenvector corresponding to the highest eigenvalue of matrix \Sigma_{Q_V X_{(j)} Q_V X_{(j)}}^{-1} \Sigma_{Q_V X_{(j)} Y} \Sigma_{Y Q_V X_{(j)}}, under the constraint

\hat{\psi}_{(j)}' \Sigma_{Q_V X_{(j)} Q_V X_{(j)}} \hat{\psi}_{(j)} = 1,  (30)

in such a way that \hat{\phi}_{(j)} is standardized. We have thus established the following association:

X_{(1)} → \hat{\xi}_{(1)}, \hat{\phi}_{(1)}, \ldots, X_{(r)} → \hat{\xi}_{(r)}, \hat{\phi}_{(r)}.

It can be observed that the estimates \hat{\phi}_{(1)}, \ldots, \hat{\phi}_{(r)} of the second set of components are not correlated with any of the estimates \hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)} of the exogenous LVs (i.e. the first set of components), being linear combinations of vectors orthogonal to V^{(\Xi)}. In this way, we ensure that the new components share no part of the already extracted redundancy. Next, we arrange the estimates \hat{\phi}_{(1)}, \ldots, \hat{\phi}_{(r)} in matrix \hat{\Phi}, defined as

\hat{\Phi} = [\hat{\phi}_{(1)}, \ldots, \hat{\phi}_{(r)}],  (31)

and introduce matrix \hat{\Pi}, where the first r columns are those of matrix \hat{\Xi} while the remaining columns are those of matrix \hat{\Phi}:

\hat{\Pi} = [\hat{\Xi}, \hat{\Phi}].  (32)

Updated estimation of endogenous LVs: For each block Y_{(k)}, k = 1, \ldots, s, the estimate \hat{\eta}_{(k)} is updated by computing the first redundancy component \hat{\eta}^{(upd)}_{(k)} on the linear space V^{(1)} spanned by the columns of \hat{\Pi}, i.e. the unit variance linear combination of the extracted components which maximizes the redundancy index R^{(\eta_{upd})}_{(k)}, defined as

R^{(\eta_{upd})}_{(k)} = \sum_{i=1}^{q_k} \mathrm{cor}(\hat{\Pi} \theta_{(k)}, y_{(k)i})^2 / q_k.  (33)

In this way, \hat{\theta}_{(k)} is the eigenvector corresponding to the highest eigenvalue of matrix \Sigma_{\hat{\Pi} \hat{\Pi}}^{-1} \Sigma_{\hat{\Pi} Y_{(k)}} \Sigma_{Y_{(k)} \hat{\Pi}}, under the constraint

\hat{\theta}_{(k)}' \Sigma_{\hat{\Pi} \hat{\Pi}} \hat{\theta}_{(k)} = 1,  (34)

in such a way that \hat{\eta}^{(upd)}_{(k)} is standardized. We have thus established the association:

Y_{(1)} → \hat{\eta}^{(upd)}_{(1)}, \ldots, Y_{(s)} → \hat{\eta}^{(upd)}_{(s)}.
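The projection step of Eqs. (28)–(31) then reduces to deflating each X block and reapplying the routine of Section 4.1; the sketch below (ours) reuses first_redundancy_component from the earlier block, whose pseudo-inverse makes it applicable to the rank-deficient residual blocks.

```python
import numpy as np

def second_set_of_components(X_blocks, Y, Xi_hat):
    """Second-iteration components (Eqs. (28)-(31)): deflate each X block
    by the projector onto V = span(Xi_hat), then extract the first
    redundancy component of Y on each residual block."""
    P = Xi_hat @ np.linalg.solve(Xi_hat.T @ Xi_hat, Xi_hat.T)  # P_V
    phis = []
    for Xj in X_blocks:
        Xj_res = Xj - P @ Xj                 # residual block Q_V X_(j)
        phi, _ = first_redundancy_component(Xj_res, Y)
        phis.append(phi)
    return np.column_stack(phis)             # matrix Phi_hat, Eq. (31)
```

Stacking [Xi_hat, Phi_hat] gives \hat{\Pi} of Eq. (32), on which the updated endogenous estimates are computed exactly as in the initial step.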

The above scheme can be iterated in order to provide the estimation of more sets of components \hat{\Phi}^{(2)}, \ldots, \hat{\Phi}^{(p)} and, consequently, to update the estimates \hat{\eta}^{(upd)}_{(1)}, \ldots, \hat{\eta}^{(upd)}_{(s)} of the endogenous LVs. In order to establish the number of components to be extracted and a suitable stopping rule, we adopt a forward selection procedure. This means that the newly extracted components will be utilized for updating the estimates of the endogenous LVs in the following two situations: first, if these new components add a relevant marginal proportion of Y redundancy, or second, if the Y block redundancy accounted for by the updated \hat{\eta}^{(upd)} variables increases in a significant way (see Section 5 for a discussion of the indicators to be used to perform these evaluations).

4.3. Refinement of the solutions

Backward selection of the extracted components: In order to achieve the most parsimonious model, we apply a backward selection procedure on the set of the extracted components: for each block X_{(j)} we eliminate the components \hat{\phi}_{(j)} whose removal has a minor effect on the accounted Y redundancies (a sketch of this step is given at the end of this subsection). This makes it possible to obtain a different number of components for each block X_{(j)}:

X_{(1)} → \hat{\xi}_{(1)}, \hat{\phi}^{(1)}_{(1)}, \ldots, \hat{\phi}^{(v_1)}_{(1)}, \quad \ldots, \quad X_{(r)} → \hat{\xi}_{(r)}, \hat{\phi}^{(1)}_{(r)}, \ldots, \hat{\phi}^{(v_r)}_{(r)},

where v_j, j = 1, \ldots, r, is the number of selected components for block X_{(j)}. The estimates of the selected components are arranged in matrix \hat{\Phi}:

\hat{\Phi} = [\hat{\phi}^{(1)}_{(1)}, \ldots, \hat{\phi}^{(1)}_{(r)}, \ldots, \hat{\phi}^{(v_1)}_{(1)}, \ldots, \hat{\phi}^{(v_r)}_{(r)}].  (35)

We then introduce matrix \hat{\Pi}, whose first r columns are those of matrix \hat{\Xi} while the remaining columns are those of matrix \hat{\Phi}:

\hat{\Pi} = [\hat{\Xi}, \hat{\Phi}];  (36)

thus the columns of \hat{\Pi} contain all the components which have been selected by means of backward selection. Finally, we indicate by V^{(\Pi)} the linear space spanned by the columns of \hat{\Pi}.

Final estimation of endogenous LVs: For each block Y_{(k)}, k = 1, \ldots, s, the final estimate \hat{\eta}_{(k)} is obtained as the first redundancy component on the linear subspace V^{(\Pi)}, i.e. the unit variance linear combination of the selected components which maximizes the redundancy index R^{(\eta)}_{(k)}, defined as

R^{(\eta)}_{(k)} = \sum_{i=1}^{q_k} \mathrm{cor}(\hat{\Pi} \theta_{(k)}, y_{(k)i})^2 / q_k.  (37)

In this way, \hat{\theta}_{(k)} is the eigenvector corresponding to the highest eigenvalue of matrix \Sigma_{\hat{\Pi} \hat{\Pi}}^{-1} \Sigma_{\hat{\Pi} Y_{(k)}} \Sigma_{Y_{(k)} \hat{\Pi}}, under the constraint

\hat{\theta}_{(k)}' \Sigma_{\hat{\Pi} \hat{\Pi}} \hat{\theta}_{(k)} = 1,  (38)

in such a way that \hat{\eta}_{(k)} is standardized. We have thus established the final association:

Y_{(1)} → \hat{\eta}_{(1)}, \ldots, Y_{(s)} → \hat{\eta}_{(s)}.
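A possible sketch of the backward step, under two simplifying assumptions of ours: components are selected globally rather than block by block, and the accounted Y redundancy is measured by the trace criterion introduced later in Section 5.

```python
import numpy as np

def backward_select(Pi_hat, Y, target):
    """Greedy backward selection (Section 4.3): repeatedly drop the
    column of Pi_hat whose removal costs the least accounted Y
    redundancy, as long as the target redundancy level is still met."""
    n = Pi_hat.shape[0]

    def accounted_redundancy(cols):
        P = Pi_hat[:, cols]
        Spp = P.T @ P / n
        Spy = P.T @ Y / n
        return np.trace(np.linalg.solve(Spp, Spy @ Spy.T)) / np.trace(Y.T @ Y / n)

    cols = list(range(Pi_hat.shape[1]))
    while len(cols) > 1:
        # best single-column removal, ranked by the remaining redundancy
        red, j = max((accounted_redundancy([c for c in cols if c != j]), j)
                     for j in cols)
        if red < target:
            break                 # dropping anything more loses too much
        cols.remove(j)
    return cols                   # indices of the retained components
```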

If we arrange the final estimates \hat{\eta}_{(1)}, \ldots, \hat{\eta}_{(s)} in matrix \hat{H} = [\hat{\eta}_{(1)}, \ldots, \hat{\eta}_{(s)}] and the corresponding eigenvectors \hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(s)} in matrix \hat{\Theta} = [\hat{\theta}_{(1)}, \ldots, \hat{\theta}_{(s)}], we can express the final estimates of the endogenous LVs as

\hat{H} = \hat{\Pi} \hat{\Theta} = \hat{\Xi} \hat{\Theta}_1 + \hat{\Phi} \hat{\Theta}_2,  (39)

where \hat{\Theta}_1 is the matrix obtained from \hat{\Theta} by retaining only the first r rows and \hat{\Theta}_2 is the matrix obtained from \hat{\Theta} by retaining only the rows after the rth one.

4.4. Structural parameters and loadings

Structural parameters and errors in equations: The estimate \hat{\gamma}_{(1)} of the structural parameter \gamma_{(1)} and the estimate \hat{\zeta}_{(1)} of the error in equation \zeta_{(1)} are obtained by means of a regression of \hat{\eta}_{(1)} on \hat{\Xi}:

\hat{\eta}_{(1)} = \hat{\Xi} \hat{\gamma}_{(1)} + \hat{\zeta}_{(1)},  (40)

where \hat{\gamma}_{(1)} is the first column of matrix \hat{\Theta}_1 and \hat{\zeta}_{(1)} can be decomposed as \hat{\zeta}_{(1)} = \hat{\Phi} \hat{\alpha}_{(1)}, with \hat{\alpha}_{(1)} the first column of \hat{\Theta}_2.

Under the assumption that \hat{\eta}_{(1)} and the columns \hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)} of \hat{\Xi} form a linearly independent set of regressors, the estimates \hat{\gamma}_{(2)} and \hat{\zeta}_{(2)} are obtained by means of a regression of \hat{\eta}_{(2)} on \hat{\eta}_{(1)} and \hat{\Xi}:

\hat{\eta}_{(2)} = \hat{\eta}_{(1)} \hat{b}_{(2)} + \hat{\Xi} \hat{\gamma}_{(2)} + \hat{\zeta}_{(2)},  (41)

where \hat{\zeta}_{(2)} can be decomposed as \hat{\zeta}_{(2)} = \hat{\Phi} \hat{\alpha}_{(2)}. For k = 3, \ldots, s, under the assumption that the columns \hat{\eta}_{(1)}, \ldots, \hat{\eta}_{(k-1)} of matrix \hat{H}_{(k-1)} and the columns of \hat{\Xi} form a linearly independent set of regressors, the estimates \hat{\gamma}_{(k)} and \hat{\zeta}_{(k)} are obtained by means of a regression of \hat{\eta}_{(k)} on \hat{H}_{(k-1)} and \hat{\Xi}:

\hat{\eta}_{(k)} = \hat{H}_{(k-1)} \hat{b}_{(k)} + \hat{\Xi} \hat{\gamma}_{(k)} + \hat{\zeta}_{(k)}, \quad k = 3, \ldots, s,  (42)

where \hat{\zeta}_{(k)} can be decomposed as \hat{\zeta}_{(k)} = \hat{\Phi} \hat{\alpha}_{(k)}. We now define \hat{\beta}_{(1)} as a zero vector and \hat{\beta}_{(k)}, k = 2, \ldots, s, by means of the following association:

\hat{\beta}_{(k)i} = \hat{b}_{(k)i}, \quad 1 \le i \le k - 1, \qquad \hat{\beta}_{(k)i} = 0, \quad k \le i \le s,  (43)

where \hat{b}_{(k)i} is the ith component of vector \hat{b}_{(k)}. If we consider the vectors \hat{\beta}_{(k)}, \hat{\gamma}_{(k)} and \hat{\zeta}_{(k)}, k = 1, \ldots, s, as the kth columns of matrices \hat{B}, \hat{\Gamma} and \hat{Z}, respectively, we can express the final estimates of the endogenous LVs in the form of Eq. (9):

\hat{H} = \hat{H} \hat{B} + \hat{\Xi} \hat{\Gamma} + \hat{Z}.  (44)

Loadings: For each variable y_{(k)i}, k = 1, \ldots, s, i = 1, \ldots, q_k, the loading on the corresponding estimate \hat{\eta}_{(k)} is estimated as

\hat{\lambda}_{(k)i} = (\hat{\eta}_{(k)}' \hat{\eta}_{(k)})^{-1} \hat{\eta}_{(k)}' y_{(k)i}.  (45)
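Given the estimates \hat{\Xi} and \hat{H}, Eqs. (40)–(45) amount to a cascade of ordinary least squares regressions with an upper-triangular path matrix; a sketch, with function names of ours and assuming the linearly independent regressors stated above:

```python
import numpy as np

def structural_parameters(Xi_hat, H_hat):
    """Path coefficients and errors in equations (Eqs. (40)-(44)):
    each eta_hat_(k) is regressed on the previously estimated endogenous
    LVs and on Xi_hat, so B_hat comes out upper triangular (Eq. (43))."""
    n, s = H_hat.shape
    r = Xi_hat.shape[1]
    B, Gamma, Z = np.zeros((s, s)), np.zeros((r, s)), np.zeros((n, s))
    for k in range(s):
        R = np.hstack([H_hat[:, :k], Xi_hat])      # regressors H_(k-1), Xi
        coef, *_ = np.linalg.lstsq(R, H_hat[:, k], rcond=None)
        B[:k, k] = coef[:k]                        # b_(k) -> beta_(k)
        Gamma[:, k] = coef[k:]                     # gamma_(k)
        Z[:, k] = H_hat[:, k] - R @ coef           # zeta_(k)
    return B, Gamma, Z

def loadings(Y_blocks, H_hat):
    """Loading of each reflective variable on its endogenous LV (Eq. (45))."""
    return [Yk.T @ H_hat[:, k] / (H_hat[:, k] @ H_hat[:, k])
            for k, Yk in enumerate(Y_blocks)]
```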

5. Model validation

At this point, an important question is raised: does the introduction of the errors in equations improve the model? And if so, to what degree? In order to evaluate the effectiveness of RA-PM as compared to traditional PLS-PM, we utilize the redundancy criterion for both the exogenous and the endogenous LVs.

An absolute measure of goodness of fit for the estimates of the exogenous LVs is the accounted redundancy associated with the linear subspace V^{(\Xi)} with respect to Y:

Red(Y \mid V^{(\Xi)}) = Tr(\Sigma_{\hat{\Xi} \hat{\Xi}}^{-1} \Sigma_{\hat{\Xi} Y} \Sigma_{Y \hat{\Xi}}) / Tr(\Sigma_{YY}).  (46)

By dividing Red(Y | V^{(\Xi)}) by its theoretical maximum, we obtain a relative measure of goodness of fit:

Red_{REL}(Y \mid V^{(\Xi)}) = Tr(\Sigma_{\hat{\Xi} \hat{\Xi}}^{-1} \Sigma_{\hat{\Xi} Y} \Sigma_{Y \hat{\Xi}}) / Tr(\Sigma_{XX}^{-1} \Sigma_{XY} \Sigma_{YX}).  (47)

If

Red(Y \mid X) = Tr(\Sigma_{XX}^{-1} \Sigma_{XY} \Sigma_{YX}) / Tr(\Sigma_{YY})  (48)

is the redundancy accounted for by X, from (46) and (47) we have

Red(Y \mid V^{(\Xi)}) = Red_{REL}(Y \mid V^{(\Xi)}) \cdot Red(Y \mid X).  (49)

Note that an analogous measure of goodness of fit may be calculated for the extracted components, thus representing the accounted redundancy associated with the linear subspace V^{(\Pi)} with respect to Y.

A measure of goodness of fit for the estimates of the endogenous LVs is the redundancy accounted for by \hat{\eta}_{(k)} with respect to the single block Y_{(k)}:

Red(Y_{(k)} \mid \hat{\eta}_{(k)}) = \hat{\eta}_{(k)}' Y_{(k)} Y_{(k)}' \hat{\eta}_{(k)} / Tr(Y_{(k)}' Y_{(k)}), \quad k = 1, \ldots, s.  (50)
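The indices (46)–(49) can be computed directly from sample covariance matrices; in the following sketch (names ours, centred data assumed), the identity (49) holds by construction.

```python
import numpy as np

def validation_indices(X, Y, Xi_hat):
    """Absolute and relative redundancy measures of Eqs. (46)-(49)."""
    n = X.shape[0]
    Syy = Y.T @ Y / n
    Sxx, Sxy = X.T @ X / n, X.T @ Y / n
    See = Xi_hat.T @ Xi_hat / n
    Sey = Xi_hat.T @ Y / n
    acc = np.trace(np.linalg.solve(See, Sey @ Sey.T))  # redundancy of V_Xi
    tot = np.trace(np.linalg.solve(Sxx, Sxy @ Sxy.T))  # theoretical maximum
    red_abs = acc / np.trace(Syy)     # Red(Y | V_Xi), Eq. (46)
    red_rel = acc / tot               # Red_REL(Y | V_Xi), Eq. (47)
    red_X = tot / np.trace(Syy)       # Red(Y | X), Eq. (48)
    return red_abs, red_rel, red_X    # red_abs == red_rel * red_X, Eq. (49)
```

Up to the normalization of \hat{\eta}_{(k)}, the block-wise measure (50) is the average squared correlation between \hat{\eta}_{(k)} and the MVs of its block.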

6. RA-PM properties

6.1. The utilization of residual information in the formative model

Tenenhaus et al. (2005) state that in a formative model the block of MVs can be multidimensional. In practice, multidimensionality can be verified by means of various indicators (e.g. Goldstein–Dillon's ρ, Cronbach's α or Principal Component Analysis). In these circumstances, in order to extract the residual information in the data, more redundancy components need to be considered. In RA-PM, this is achieved by an iterative algorithm that extracts the most informative components and generates a suitable orthogonal decomposition of the linear spaces spanned by the MVs in each X block. In practice, on the basis of the redundancy criterion, the algorithm starts by extracting the most informative component within each X block. By means of suitable orthogonal projections, it then identifies the residual subspaces (one for each X block), i.e. those containing information which has not yet been extracted. At each successive iteration, the most informative component within each residual subspace is extracted and new residual subspaces are obtained. The algorithm stops when all the relevant residual information about Y contained in the X blocks has been extracted.

6.2. Algebraic description and optimality

In order to justify the backward selection procedure and demonstrate the optimality of the RA-PM algorithm, we adopt an algebraic perspective based on the orthogonal decomposition of the linear space V^{(X)} induced by the iterative extraction of the redundancy components.

By performing the estimation of the exogenous LVs, we identify the linear subspace V^{(\Xi)} \subset V^{(X)}, defined as the linear span of \hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)}. Similarly, when a second set of components is extracted, the linear subspace V^{(1)} \subset V^{(X)} is identified as the linear span of \hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)}, \hat{\phi}_{(1)}, \ldots, \hat{\phi}_{(r)}. The components which are extracted at different iterations are uncorrelated; thus the linear subspace W^{(1)} = span(\hat{\phi}_{(1)}, \ldots, \hat{\phi}_{(r)}) is orthogonal to V^{(\Xi)} and the linear subspace V^{(1)} can be expressed as

V^{(1)} = V^{(\Xi)} \oplus W^{(1)}.  (51)

As new sets of redundancy components are extracted, a chain of subspaces V^{(\Xi)} \subset V^{(1)} \subset \cdots \subset V^{(l)} \subseteq V^{(X)}, l = \max(v_1, \ldots, v_r), is generated. From the above discussion it can be deduced that

V^{(k)} = V^{(k-1)} \oplus W^{(k)}, \quad k = 1, \ldots, l,  (52)

where W^{(k)} is the linear space spanned by the redundancy components extracted during iteration k, and that V^{(X)} can be decomposed as

V^{(X)} = V^{(\Xi)} \oplus W^{(1)} \oplus \cdots \oplus W^{(l)}.  (53)

At this point, if we halt the extractions at iteration \hat{l} \le l according to forward selection (i.e. when the marginal contribution to the accounted redundancies due to W^{(\hat{l}+1)} is negligible), V^{(X)} can be decomposed as

V^{(X)} = V^{(\hat{l})} \oplus V^{(\hat{l})\perp}.  (54)

If we then establish a target level of Y redundancy to be accounted for by the extracted components, backward selection provides the removal of the less contributing components, reducing V^{(\hat{l})} to a subspace V^{(\Pi)} of minimum dimension accounting for the target redundancy level. In this sense, the procedure of estimation and selection of the components (i.e. the procedure of identification of the subspace V^{(\Pi)}) is optimal. Based on this selection, the computation of the final estimates of the endogenous LVs fulfills the optimality criterion of redundancy analysis. Hence, we can affirm that the proposed estimation procedure for the LVs is optimal.

6.3. Features of RA-PM estimates

As a consequence of the adoption of the RA-PM algorithm, the following algebraic and geometrical features hold:

(i) Once the exogenous and endogenous LVs have been estimated, the estimates \hat{B} and \hat{\Gamma} of the structural parameters and the estimates \hat{\Lambda}_Y of the loadings are univocally determined.

(ii) The estimates \hat{Z} of the errors in equations are linear combinations of the estimates \hat{\Phi} of the further components, which are uncorrelated with the estimates \hat{\Xi} of the exogenous LVs; it follows that

cov(\hat{\xi}_{(j)}, \hat{\zeta}_{(k)}) = 0, \quad j = 1, \ldots, r, \quad k = 1, \ldots, s,  (55)

meaning that the estimates \hat{Z} share no information with the estimates \hat{\Xi}.

(iii) The estimates \hat{\Xi} of the exogenous LVs are univocally defined as linear combinations of the X variables; the estimates \hat{H} of the endogenous LVs and the estimates \hat{Z} of the errors in equations are univocally defined as linear combinations of the estimates \hat{\Xi} and \hat{\Phi}. Moreover, the rank of the set \{\hat{\xi}_{(1)}, \ldots, \hat{\xi}_{(r)}, \hat{\zeta}_{(1)}, \ldots, \hat{\zeta}_{(s)}\} does not exceed the rank of X.

(iv) The solutions respect the causal structure of the model. In fact, in the RA-PM algorithm the estimates \hat{\Xi} are linear combinations of the X variables, the estimates \hat{Z} are linear combinations of the estimates \hat{\Phi}, and the estimates \hat{H} are linear combinations of the estimates \hat{\Xi} and \hat{Z}.

(v) The errors in equations are taken into consideration for the estimation of the endogenous LVs, improving their redundancy accounting capability.

7. The HC model

7.1. The variables

We will now briefly describe the MVs involved in the HC model introduced in Section 1. The data regard n = 1936 Italian families and are drawn from the survey on Italian Household Budgets performed by Banca d'Italia (2002). As indicated in Table 1, the formative MVs associated with the two exogenous LVs regard variables pertaining to education and work experience. Similarly, the reflective MVs associated with the two endogenous LVs regard the indicators of earned and property income. For each family in the survey, some of the above variables are measured for both household head (H) and spouse (S).

Table 1. LVs and MVs in the HC model

Exogenous LVs: Educational human capital (ξ_1); Work experience human capital (ξ_2)
Endogenous LVs: Ability to generate earned income (η_1); Ability to generate property income (η_2)
Formative MVs, Educational indicators: H years of schooling (z_1); S years of schooling (z_2); H father's years of schooling (z_3); H mother's years of schooling (z_4); S father's years of schooling (z_5); S mother's years of schooling (z_6); Household real assets (z_7); Household financial assets (z_8); Household total debt (z_9)
Formative MVs, Job indicators: H age of entrance in the labour market (x_1); S age of entrance in the labour market (x_2); H number of years of full time job (x_3); S number of years of full time job (x_4); H number of years of part time job (x_5); S number of years of part time job (x_6)
Reflective MVs, Earned income: Household net disposable earned income (wages and salaries) (y_1); Household pensions and net transfers (y_2); Household net income from self-employment (y_3)
Reflective MVs, Property income: Household income from property (y_4); Household financial income (y_5)

7.2. The model estimation

Before estimating the exogenous and endogenous LVs involved in the HC model, we evaluated the redundancy accounting capability of X with respect to Y, Y_{(1)} and Y_{(2)}. The results are reported in Table 2.

Table 2. Percentage of Y, Y_{(1)} and Y_{(2)} redundancy accounted for by X: Red(Y | X), Red(Y_{(1)} | X) and Red(Y_{(2)} | X).

We then analysed the redundancy structure of the data. It is known that each eigenvalue of matrix \Sigma_{P_{X_{(j)}} Y, P_{X_{(j)}} Y} is associated with a redundancy component of X_{(j)} with respect to Y (D'Ambra and Lauro, 1982); the analysis of these eigenvalues makes it possible to formulate some hypotheses on the optimal number of redundancy components to be extracted (see Table 3).
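This diagnostic is simple to reproduce; in the sketch below (function name ours), the eigenvalues of the covariance matrix of Y projected onto the span of a block X_{(j)} are returned in decreasing order. A sharp drop after the first eigenvalue suggests that one redundancy component suffices for that block, while several non-negligible eigenvalues call for further components.

```python
import numpy as np

def redundancy_eigenvalues(Xj, Y, k=5):
    """First k eigenvalues of the covariance of P_X(j) Y, i.e. of Y
    projected onto span(X_(j)) (D'Ambra and Lauro, 1982)."""
    n = Xj.shape[0]
    P = Xj @ np.linalg.solve(Xj.T @ Xj, Xj.T)  # projector onto span(X_(j))
    PY = P @ Y
    vals = np.linalg.eigvalsh(PY.T @ PY / n)   # ascending order
    return vals[::-1][:k]                      # k largest eigenvalues
```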

Table 3. First five eigenvalues for the X_{(1)} and X_{(2)} blocks.

It is evident that it is possible to extract only one redundancy component from block X_{(1)} and three redundancy components from block X_{(2)}. Note that if we do not use the components beyond the first extraction, we lose a relevant amount of information, as is confirmed in Table 4, which reports the accounted Y redundancy corresponding to different iterations of the algorithm. As the number of iterations grows, the accounted redundancy increases; however, the rate of increase slows down after five iterations. (Note that we have indicated by V^{(\Psi)} the linear space spanned by the extracted components.)

Table 4. Percentage of Y redundancy accounted for by the extracted components at different iterations: Red(Y | V^{(\Psi)}) and Red_{REL}(Y | V^{(\Psi)}).

At each iteration we computed the estimates of the endogenous LVs and the corresponding accounted Y block redundancy (see Table 5). The relevance of the residuals (taken into account by means of the components extracted after the first iteration) in improving the accounted redundancy is highly evident, particularly for the estimate of η_{(2)}.

Table 5. Percentage of Y_{(1)} and Y_{(2)} accounted redundancy by \hat{\eta}_{(1)} and \hat{\eta}_{(2)} at different iterations

Iteration   Estimate                  Red(Y_{(1)} | \hat{\eta}_{(1)})   Estimate                  Red(Y_{(2)} | \hat{\eta}_{(2)})
1           \hat{\eta}^{(in)}_{(1)}    18.50                             \hat{\eta}^{(in)}_{(2)}
2           \hat{\eta}^{(upd)}_{(1)}   18.96                             \hat{\eta}^{(upd)}_{(2)}
3           \hat{\eta}^{(upd)}_{(1)}   19.30                             \hat{\eta}^{(upd)}_{(2)}   38.49

By performing a backward selection procedure, and removing the less contributing components, we ultimately selected \hat{\xi}_{(1)}, the first component from X_{(1)}, together with \hat{\xi}_{(2)}, \hat{\phi}^{(1)}_{(2)} and \hat{\phi}^{(2)}_{(2)} (i.e. the first three components from X_{(2)}). Table 6 reports the redundancy accounted for by the selected components and the final estimates of the endogenous LVs. (We remind the reader that V^{(\Pi)} indicates the linear space spanned by the selected components.)

Table 6. Percentage of Y, Y_{(1)} and Y_{(2)} redundancy accounted for by the RA-PM estimates: Red(Y | V^{(\Pi)}), Red(Y_{(1)} | \hat{\eta}_{(1)}) and Red(Y_{(2)} | \hat{\eta}_{(2)}).

7.3. Comparison with PLS path modeling and LISREL model

The HC data set has also been analysed by means of PLS-PM. The redundancy of Y, Y_{(1)} and Y_{(2)} which is accounted for by the PLS-PM estimates is reported in Table 7.

Table 7. Percentage of Y, Y_{(1)} and Y_{(2)} redundancy accounted for by the PLS-PM estimates: Red(Y | \hat{\xi}^{PLS}_{(1)}, \hat{\xi}^{PLS}_{(2)}), Red(Y_{(1)} | \hat{\eta}^{PLS}_{(1)}) and Red(Y_{(2)} | \hat{\eta}^{PLS}_{(2)}).

From the above, it is evident that, when only the first extraction is performed, \hat{\xi}^{PLS}_{(1)} and \hat{\xi}^{PLS}_{(2)} account for approximately the same Y redundancy as RA-PM (21.02%, see Table 4). As more components are estimated, the accounted Y redundancy in RA-PM overcomes that accounted for by PLS-PM; this is particularly true for the selected components. It should be noted that the total Y redundancy accounted for by \hat{\eta}^{PLS}_{(1)} and \hat{\eta}^{PLS}_{(2)} can be directly computed as 40.62%, while the total Y redundancy accounted for by the X variables is 34.51% (see Table 2). This demonstrates that a part of the accounting capability of PLS-PM is derived from the Y variables themselves and not from the X variables.

Before interpreting the results, we verified the consequences of the non-uniqueness of the scores of the LVs and of the errors in equations in the LISREL model. In Table 8, it can be seen that the goodness of fit of the LISREL model is high, but the correlations between the two maximally different solutions regarding the same identified model for HC are far from 1 and also negative.

Table 8. Jöreskog goodness of fit index (GFI) and minimum correlations for \hat{\xi}^{Lsr}_{(1)}, \hat{\xi}^{Lsr}_{(2)}, \hat{\zeta}^{Lsr}_{(1)} and \hat{\zeta}^{Lsr}_{(2)}.

7.4. Interpretation of model parameters

The outer weights of the exogenous LV estimates (see Table 9) are in agreement with the knowledge of HC (Vittadini and Lovaglio, 2004); in fact, \hat{\xi}_{(1)} is correlated with both the years of schooling and the household real assets (it thus represents the educational HC), and \hat{\xi}_{(2)} is correlated with the years of work (it thus represents the work experience HC).

Table 9. RA-PM outer weights of z_1, \ldots, z_9 on \hat{\xi}_{(1)} and of x_1, \ldots, x_6 on \hat{\xi}_{(2)}.

The structural parameters are reported in Table 10; \hat{\eta}_{(1)} expresses a contrast between educational HC and work experience HC, according to the Italian situation; \hat{\eta}_{(2)} presents structural parameters of different signs, but it is positively correlated with \hat{\xi}_{(1)}. The loadings of each y variable are reported in Table 11 and can be seen to be in agreement with a model for HC.

Table 10. RA-PM structural parameters (columns \hat{\eta}_{(1)} and \hat{\eta}_{(2)}; rows \hat{\xi}_{(1)}, \hat{\xi}_{(2)} and \hat{\eta}_{(1)}).

Table 11. RA-PM loadings of y_{(1)1}, y_{(1)2}, y_{(1)3} on \hat{\eta}_{(1)} and of y_{(2)1}, y_{(2)2} on \hat{\eta}_{(2)}.

8. RA-PM simulation study

8.1. The simulation experiment

In order to illustrate the effectiveness of RA-PM, we have performed a simulation study based on model (7)–(9), with r = 2, s = 2, p = 8, q = 8 and n = 1000. The purpose of the simulation is to show how taking into account redundancy components beyond the first extraction can improve the redundancy accounted for by the endogenous LV estimates, particularly when the covariance structure within and between the formative and the reflective MVs does not allow for a unidimensional description of the variables.

To this aim, we have first defined two reference situations, indicated by A and K. In situation A, the variables in the Y_{(1)} block are correlated with those in the X_{(1)} block alone and, similarly, the variables in the Y_{(2)} block are only correlated with those in the X_{(2)} block; moreover, the variables in the X_{(1)} block are not correlated with the variables in the X_{(2)} block. On the contrary, in situation K there are cross correlations between the variables in the X and the Y blocks, and there are correlations among some of the variables in the X_{(1)} block and some of the variables in the X_{(2)} block.

Situation A is transformed into situation K through nine intermediate situations (indicated by B–J). The transformation is driven by a parameter which progressively transforms \Sigma^{(A)}_{XX} and \Sigma^{(A)}_{XY} into \Sigma^{(K)}_{XX} and \Sigma^{(K)}_{XY}. More specifically, we first defined two lower triangular matrices U^{(A)} and U^{(K)} such that the matrices

\Sigma^{(A)}_{XX} = U^{(A)} U^{(A)\prime} \quad \text{and} \quad \Sigma^{(K)}_{XX} = U^{(K)} U^{(K)\prime}  (56)

reproduced the desired covariance structure for the two reference situations. Starting from U^{(A)} and U^{(K)}, we introduced a sequence of nine intermediate lower triangular matrices defined as

U^{(t)} = (1 - t) U^{(A)} + t U^{(K)}, \quad t = 0.1, 0.2, \ldots, 0.9.  (57)

After obtaining the sequence \{U^{(t)}\}, the corresponding sequence \{\Sigma^{(t)}_{XX}\} was computed and nine data sets of X variables, distributed as N(0, \Sigma^{(t)}_{XX}), were generated. In this paper we report the results for only the nine intermediate situations (from B to J), given that the reference situations were used only as a starting point for the generation of the other data sets.
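The interpolation scheme of Eqs. (56)–(57) can be reproduced as follows (function name and random seed ours); each intermediate value of t yields one X data set drawn from N(0, \Sigma^{(t)}_{XX}).

```python
import numpy as np

def simulate_X_datasets(U_A, U_K, n=1000, seed=0):
    """Generate the nine intermediate X data sets of Section 8.1:
    U(t) = (1 - t) U(A) + t U(K) (Eq. (57)), Sigma_XX(t) = U(t) U(t)'."""
    rng = np.random.default_rng(seed)
    datasets = {}
    for i in range(9):                          # situations B, ..., J
        t = 0.1 * (i + 1)
        U_t = (1 - t) * U_A + t * U_K           # Eq. (57)
        Sigma_t = U_t @ U_t.T                   # cf. Eq. (56)
        X = rng.multivariate_normal(np.zeros(len(Sigma_t)), Sigma_t, size=n)
        datasets[chr(ord("B") + i)] = X
    return datasets
```

Note that \Sigma^{(t)}_{XX} = U^{(t)} U^{(t)\prime} is positive semidefinite by construction, so every intermediate situation has a valid covariance matrix.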

8.2. Discussion of the results

For each simulation case, the eigenvalues of the covariance matrices \Sigma_{P_{X_{(1)}} Y, P_{X_{(1)}} Y} and \Sigma_{P_{X_{(2)}} Y, P_{X_{(2)}} Y} indicate the number of relevant redundancy components within the X variables (see Table 12). In both cases, it is evident that, moving from situation B to situation J, a second component is progressively revealed. From an analytical point of view, this is due to the fact that the eigenvalues of a matrix are a continuous function of the matrix elements. In the simulation, the eigenvalues are a continuous function of the parameter t, which drives the computation of the covariance matrices from situation A, where there is only one relevant eigenvalue, to situation K, where there are two relevant eigenvalues.

Table 12. First five eigenvalues/trace (percentage) for the X_{(1)} and X_{(2)} blocks (situations B–J).

Table 13 reports the results of a redundancy analysis performed on the variables involved in the simulation. It can be seen that the redundancy accounting capability of the X variables concerning Y, Y_{(1)} and Y_{(2)} decreases as we leave situation B, reaches a minimum in the middle cases, and then increases when approaching situation J; on the contrary, the Y redundancy accounting capability of X_{(1)} and X_{(2)} increases progressively from situation B to J.

Table 13. Percentage of accounted redundancy (situations B–J): Red(Y | X), Red(Y | X_{(1)}), Red(Y | X_{(2)}), Red(Y_{(1)} | X) and Red(Y_{(2)} | X).

We ran RA-PM on the nine data sets, extracting a pair of components (i.e. one component from each X block) as the number of iterations grows. Table 14 reports the Y redundancy accounted for by the pairs of extracted components at each iteration. We note that the closer situation J is approached, the less the first extraction (i.e. the estimation of the two exogenous LVs alone) accounts for the Y redundancy. This justifies the extraction of more redundancy components. In order to show the relevance of the components beyond the first extraction, Table 15 reports the Y_{(1)} and Y_{(2)} redundancy accounted for by both the initial estimates \hat{\eta}^{(in)}_{(1)}, \hat{\eta}^{(in)}_{(2)} and the final estimates \hat{\eta}_{(1)}, \hat{\eta}_{(2)}. In the last three situations the improvement is remarkable, as expected by considering the redundancy structure of the simulation cases.

8.3. Comparison with PLS path modeling and LISREL model

The simulation data sets have also been analysed by means of PLS-PM. The redundancy of Y, Y_{(1)} and Y_{(2)} which is accounted for by the PLS-PM estimates in situations B–J is reported in Table 16. If in RA-PM only the first iteration is performed, the amount of Y redundancy accounted for by the estimates of the two exogenous LVs is similar in the two approaches. However, if a second iteration is performed, the Y redundancy

accounted for by the RA-PM estimates becomes higher than in PLS-PM in all simulation cases.

Table 14. Percentage of Y redundancy accounted for by the pairs of extracted components (situations B–J).

Table 15. Percentage of Y_{(1)} and Y_{(2)} redundancy accounted for by \hat{\eta}^{(in)}_{(1)}, \hat{\eta}^{(in)}_{(2)} and by \hat{\eta}_{(1)}, \hat{\eta}_{(2)} (situations B–J).

Table 16. Percentage of Y, Y_{(1)} and Y_{(2)} redundancy accounted for by the PLS-PM estimates (situations B–J): Red(Y | \hat{\xi}^{PLS}_{(1)}, \hat{\xi}^{PLS}_{(2)}), Red(Y_{(1)} | \hat{\eta}^{PLS}_{(1)}) and Red(Y_{(2)} | \hat{\eta}^{PLS}_{(2)}).

Concerning the estimates of the endogenous LVs, a direct comparison between RA-PM and PLS-PM is more problematic, because these two approaches estimate the endogenous LVs in extremely different ways. In fact, while \hat{\eta}^{PLS}_{(1)} and \hat{\eta}^{PLS}_{(2)} are expressed as linear combinations of the Y variables, the corresponding RA-PM estimates are expressed as linear combinations of the X variables. Computing the redundancies of the Y blocks in PLS-PM, we see that \hat{\eta}^{PLS}_{(1)} accounts for a redundancy comparable with that accounted for by \hat{\eta}_{(1)} in RA-PM. However, \hat{\eta}^{PLS}_{(2)} reduces its accounting capability as it approaches situation J, where it accounts for 7.65%.

We conclude by noting that when the Y redundancy structure of the formative blocks is not unidimensional, the addition of more redundancy components really improves the redundancy accounting capability of the RA-PM estimates of the endogenous LVs. We maintain that the PLS-PM approach could also benefit from considering more

components. This would increase the redundancy accounting capability of the PLS-PM estimates of the endogenous LVs and also match the formative scheme of the models underlying the data, as in the examples we have discussed in this paper.

Finally, we have performed a LISREL model analysis on each simulation case. Referring to Table 17, it can be seen that the goodness of fit is high in every situation; however, the correlations between the two maximally different solutions of the same identified models are in many cases far from 1.

Table 17. Jöreskog goodness of fit index (GFI) and minimum correlations for \hat{\xi}^{Lsr}_{(1)}, \hat{\xi}^{Lsr}_{(2)}, \hat{\zeta}^{Lsr}_{(1)} and \hat{\zeta}^{Lsr}_{(2)} (situations B–J).

9. Conclusions and further research

The paper proposes a new approach for the estimation of a structural equation model with a formative-reflective scheme. The proposal is entitled RA-PM (redundancy analysis approach to path modeling) and consists of an alternative to the PLS-PM approach introduced by Lohmöller (1989). The new RA-PM algorithm provides unique solutions for both the LVs and the parameters, preserving the causal structure of the model and utilizing all the relevant information in the formative MVs. The solutions are obtained by means of an iterative procedure which optimizes the redundancy criterion of Stewart and Love (1968).

Future developments will address the extension of the proposed methodology to the case of categorical MVs, based on the non-symmetric correspondence analysis quantification. In fact, it has been demonstrated that a relationship exists between redundancy analysis and non-symmetric correspondence analysis (Lauro and D'Ambra, 1984). As a consequence, it is possible to consider in the model both simple and interaction effects among categorical variables. Moreover, given that redundancy analysis represents a special case of PLS-PM with two sets of variables in the formative-reflective scheme (Tenenhaus et al., 2005), the iterative steps of PLS regression can be successfully introduced in the newly proposed algorithm in order to overcome the problems related to missing data, multicollinearity and flat tables.

Acknowledgements

The authors would like to express their thanks to the referees for their valuable remarks and comments.

References

Banca d'Italia, 2002. I bilanci delle famiglie italiane nell'anno 2000. Supplemento al Bollettino Statistico, Banca d'Italia, Roma.
D'Ambra, L., Lauro, C.N., 1982. A principal component analysis with respect to a reference subspace. Statist. Appl. 15.
Guttman, L., 1955. The determinacy of factor score matrices with implications for five other basic problems of common factor theory. British J. Math. Statist. Psych. 8 (II).
Jöreskog, K.G., 1973. A general method for estimating a linear structural equation system. In: Goldberger, A.S., Duncan, O.D. (Eds.), Structural Equation Models in the Social Sciences. Seminar Press, New York.
Jöreskog, K.G., 1981. Analysis of covariance structures. Scand. J. Statist. 8.
Jöreskog, K.G., 2000. Latent Variable Scores and Their Uses.

Lauro, C.N., D'Ambra, L., 1984. Non-symmetrical correspondence analysis. In: Diday, E. et al. (Eds.), Data Analysis and Informatics III. North-Holland, Amsterdam.
Lohmöller, J.B., 1989. Latent Variable Path Modeling with Partial Least Squares. Physica-Verlag, Heidelberg.
Saris, W.E., Stronkhorst, L.H., 1984. Causal Modeling in Non Experimental Research. Sociometric Research Foundation, Amsterdam.
Schönemann, P.H., 1971. The minimum average correlation between equivalent sets of uncorrelated factors. Psychometrika 35.
Stewart, D.K., Love, W.A., 1968. A general canonical correlation index. Psych. Bullet. 70.
Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.M., Lauro, C., 2005. PLS path modeling. Comput. Statist. Data Anal. 48.
Van den Wollenberg, A.L., 1977. Redundancy analysis: an alternative for canonical correlation analysis. Psychometrika 42.
Vittadini, G., 1989. Indeterminacy problem in the LISREL model. Multivariate Behavioral Res. 24 (4).
Vittadini, G., Lovaglio, P.G., 2004. The estimate of human capital from two sets of observed indicators: formative and reflective. In: Proceedings of the ASA Joint Statistical Meeting, Section on Business and Economic Statistics. CD-ROM, Toronto.
Wold, H., 1982. Soft modeling: the basic design and some extensions. In: Jöreskog, K.G., Wold, H. (Eds.), Systems Under Indirect Observation: Causality, Structure, Prediction. North-Holland, Amsterdam, pp. 1–54.


More information

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No.

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No. 7. LEAST SQUARES ESTIMATION 1 EXERCISE: Least-Squares Estimation and Uniqueness of Estimates 1. For n real numbers a 1,...,a n, what value of a minimizes the sum of squared distances from a to each of

More information

AN INTRODUCTION TO STRUCTURAL EQUATION MODELING

AN INTRODUCTION TO STRUCTURAL EQUATION MODELING AN INTRODUCTION TO STRUCTURAL EQUATION MODELING Giorgio Russolillo CNAM, Paris giorgio.russolillo@cnam.fr Structural Equation Modeling (SEM) Structural Equation Models (SEM) are complex models allowing

More information

Lecture 4 The Centralized Economy: Extensions

Lecture 4 The Centralized Economy: Extensions Lecture 4 The Centralized Economy: Extensions Leopold von Thadden University of Mainz and ECB (on leave) Advanced Macroeconomics, Winter Term 2013 1 / 36 I Motivation This Lecture considers some applications

More information

FACTOR ANALYSIS AS MATRIX DECOMPOSITION 1. INTRODUCTION

FACTOR ANALYSIS AS MATRIX DECOMPOSITION 1. INTRODUCTION FACTOR ANALYSIS AS MATRIX DECOMPOSITION JAN DE LEEUW ABSTRACT. Meet the abstract. This is the abstract. 1. INTRODUCTION Suppose we have n measurements on each of taking m variables. Collect these measurements

More information

Title. Description. var intro Introduction to vector autoregressive models

Title. Description. var intro Introduction to vector autoregressive models Title var intro Introduction to vector autoregressive models Description Stata has a suite of commands for fitting, forecasting, interpreting, and performing inference on vector autoregressive (VAR) models

More information

Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm

Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm Structural Equation Modeling and simultaneous clustering through the Partial Least Squares algorithm Mario Fordellone a, Maurizio Vichi a a Sapienza, University of Rome Abstract The identification of different

More information

Math 20F Final Exam(ver. c)

Math 20F Final Exam(ver. c) Name: Solutions Student ID No.: Discussion Section: Math F Final Exam(ver. c) Winter 6 Problem Score /48 /6 /7 4 /4 5 /4 6 /4 7 /7 otal / . (48 Points.) he following are rue/false questions. For this problem

More information

Principal component analysis

Principal component analysis Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and

More information

2/26/2017. This is similar to canonical correlation in some ways. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

2/26/2017. This is similar to canonical correlation in some ways. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 What is factor analysis? What are factors? Representing factors Graphs and equations Extracting factors Methods and criteria Interpreting

More information

MATH 315 Linear Algebra Homework #1 Assigned: August 20, 2018

MATH 315 Linear Algebra Homework #1 Assigned: August 20, 2018 Homework #1 Assigned: August 20, 2018 Review the following subjects involving systems of equations and matrices from Calculus II. Linear systems of equations Converting systems to matrix form Pivot entry

More information

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA Factor Analysis Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA 1 Factor Models The multivariate regression model Y = XB +U expresses each row Y i R p as a linear combination

More information

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models Massimiliano Bratti & Alfonso Miranda In many fields of applied work researchers need to model an

More information

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models University of Illinois Fall 2016 Department of Economics Roger Koenker Economics 536 Lecture 7 Introduction to Specification Testing in Dynamic Econometric Models In this lecture I want to briefly describe

More information

LINEAR ALGEBRA 1, 2012-I PARTIAL EXAM 3 SOLUTIONS TO PRACTICE PROBLEMS

LINEAR ALGEBRA 1, 2012-I PARTIAL EXAM 3 SOLUTIONS TO PRACTICE PROBLEMS LINEAR ALGEBRA, -I PARTIAL EXAM SOLUTIONS TO PRACTICE PROBLEMS Problem (a) For each of the two matrices below, (i) determine whether it is diagonalizable, (ii) determine whether it is orthogonally diagonalizable,

More information

Ch 7: Dummy (binary, indicator) variables

Ch 7: Dummy (binary, indicator) variables Ch 7: Dummy (binary, indicator) variables :Examples Dummy variable are used to indicate the presence or absence of a characteristic. For example, define female i 1 if obs i is female 0 otherwise or male

More information

ECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II

ECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II ECON 4160: Econometrics-Modelling and Systems Estimation Lecture 9: Multiple equation models II Ragnar Nymoen Department of Economics University of Oslo 9 October 2018 The reference to this lecture is:

More information

ECON 4160, Spring term Lecture 12

ECON 4160, Spring term Lecture 12 ECON 4160, Spring term 2013. Lecture 12 Non-stationarity and co-integration 2/2 Ragnar Nymoen Department of Economics 13 Nov 2013 1 / 53 Introduction I So far we have considered: Stationary VAR, with deterministic

More information

SUMMARY OF MATH 1600

SUMMARY OF MATH 1600 SUMMARY OF MATH 1600 Note: The following list is intended as a study guide for the final exam. It is a continuation of the study guide for the midterm. It does not claim to be a comprehensive list. You

More information

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 8: Canonical Correlation Analysis

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 8: Canonical Correlation Analysis MS-E2112 Multivariate Statistical (5cr) Lecture 8: Contents Canonical correlation analysis involves partition of variables into two vectors x and y. The aim is to find linear combinations α T x and β

More information

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables /4/04 Structural Equation Modeling and Confirmatory Factor Analysis Advanced Statistics for Researchers Session 3 Dr. Chris Rakes Website: http://csrakes.yolasite.com Email: Rakes@umbc.edu Twitter: @RakesChris

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

Spatial Process Estimates as Smoothers: A Review

Spatial Process Estimates as Smoothers: A Review Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed

More information

Forecasting 1 to h steps ahead using partial least squares

Forecasting 1 to h steps ahead using partial least squares Forecasting 1 to h steps ahead using partial least squares Philip Hans Franses Econometric Institute, Erasmus University Rotterdam November 10, 2006 Econometric Institute Report 2006-47 I thank Dick van

More information

A dynamic perspective to evaluate multiple treatments through a causal latent Markov model

A dynamic perspective to evaluate multiple treatments through a causal latent Markov model A dynamic perspective to evaluate multiple treatments through a causal latent Markov model Fulvia Pennoni Department of Statistics and Quantitative Methods University of Milano-Bicocca http://www.statistica.unimib.it/utenti/pennoni/

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

14 Singular Value Decomposition

14 Singular Value Decomposition 14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

Introduction to Confirmatory Factor Analysis

Introduction to Confirmatory Factor Analysis Introduction to Confirmatory Factor Analysis Multivariate Methods in Education ERSH 8350 Lecture #12 November 16, 2011 ERSH 8350: Lecture 12 Today s Class An Introduction to: Confirmatory Factor Analysis

More information

Vector Space Models. wine_spectral.r

Vector Space Models. wine_spectral.r Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Y t = ΦD t + Π 1 Y t Π p Y t p + ε t, D t = deterministic terms

Y t = ΦD t + Π 1 Y t Π p Y t p + ε t, D t = deterministic terms VAR Models and Cointegration The Granger representation theorem links cointegration to error correction models. In a series of important papers and in a marvelous textbook, Soren Johansen firmly roots

More information

Lecture 4: Principal Component Analysis and Linear Dimension Reduction

Lecture 4: Principal Component Analysis and Linear Dimension Reduction Lecture 4: Principal Component Analysis and Linear Dimension Reduction Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics University of Pittsburgh E-mail:

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

FORECASTING THE RESIDUAL DEMAND FUNCTION IN

FORECASTING THE RESIDUAL DEMAND FUNCTION IN FORECASTING THE RESIDUAL DEMAND FUNCTION IN ELECTRICITY AUCTIONS Dipartimento di Statistica Università degli Studi di Milano-Bicocca Via Bicocca degli Arcimboldi 8, Milano, Italy (e-mail: matteo.pelagatti@unimib.it)

More information

Canonical Correlation Analysis of Longitudinal Data

Canonical Correlation Analysis of Longitudinal Data Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate

More information

Projection. Ping Yu. School of Economics and Finance The University of Hong Kong. Ping Yu (HKU) Projection 1 / 42

Projection. Ping Yu. School of Economics and Finance The University of Hong Kong. Ping Yu (HKU) Projection 1 / 42 Projection Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Projection 1 / 42 1 Hilbert Space and Projection Theorem 2 Projection in the L 2 Space 3 Projection in R n Projection

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A

MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018 Pietro Guccione, PhD DEI - DIPARTIMENTO DI INGEGNERIA ELETTRICA E DELL INFORMAZIONE POLITECNICO DI

More information

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that Linear Regression For (X, Y ) a pair of random variables with values in R p R we assume that E(Y X) = β 0 + with β R p+1. p X j β j = (1, X T )β j=1 This model of the conditional expectation is linear

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 6 Jakub Mućk Econometrics of Panel Data Meeting # 6 1 / 36 Outline 1 The First-Difference (FD) estimator 2 Dynamic panel data models 3 The Anderson and Hsiao

More information

Panel Threshold Regression Models with Endogenous Threshold Variables

Panel Threshold Regression Models with Endogenous Threshold Variables Panel Threshold Regression Models with Endogenous Threshold Variables Chien-Ho Wang National Taipei University Eric S. Lin National Tsing Hua University This Version: June 29, 2010 Abstract This paper

More information

ECON 4160, Lecture 11 and 12

ECON 4160, Lecture 11 and 12 ECON 4160, 2016. Lecture 11 and 12 Co-integration Ragnar Nymoen Department of Economics 9 November 2017 1 / 43 Introduction I So far we have considered: Stationary VAR ( no unit roots ) Standard inference

More information

Simultaneous Equation Models (Book Chapter 5)

Simultaneous Equation Models (Book Chapter 5) Simultaneous Equation Models (Book Chapter 5) Interrelated equations with continuous dependent variables: Utilization of individual vehicles (measured in kilometers driven) in multivehicle households Interrelation

More information

R = µ + Bf Arbitrage Pricing Model, APM

R = µ + Bf Arbitrage Pricing Model, APM 4.2 Arbitrage Pricing Model, APM Empirical evidence indicates that the CAPM beta does not completely explain the cross section of expected asset returns. This suggests that additional factors may be required.

More information

6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses.

6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses. 6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses. 0 11 1 1.(5) Give the result of the following matrix multiplication: 1 10 1 Solution: 0 1 1 2

More information

Cheat Sheet for MATH461

Cheat Sheet for MATH461 Cheat Sheet for MATH46 Here is the stuff you really need to remember for the exams Linear systems Ax = b Problem: We consider a linear system of m equations for n unknowns x,,x n : For a given matrix A

More information

Regression: Lecture 2

Regression: Lecture 2 Regression: Lecture 2 Niels Richard Hansen April 26, 2012 Contents 1 Linear regression and least squares estimation 1 1.1 Distributional results................................ 3 2 Non-linear effects and

More information

REDUNDANCY ANALYSIS AN ALTERNATIVE FOR CANONICAL CORRELATION ANALYSIS ARNOLD L. VAN DEN WOLLENBERG UNIVERSITY OF NIJMEGEN

REDUNDANCY ANALYSIS AN ALTERNATIVE FOR CANONICAL CORRELATION ANALYSIS ARNOLD L. VAN DEN WOLLENBERG UNIVERSITY OF NIJMEGEN PSYCHOMETRIKA-VOL. 42, NO, 2 JUNE, 1977 REDUNDANCY ANALYSIS AN ALTERNATIVE FOR CANONICAL CORRELATION ANALYSIS ARNOLD L. VAN DEN WOLLENBERG UNIVERSITY OF NIJMEGEN A component method is presented maximizing

More information

New Notes on the Solow Growth Model

New Notes on the Solow Growth Model New Notes on the Solow Growth Model Roberto Chang September 2009 1 The Model The firstingredientofadynamicmodelisthedescriptionofthetimehorizon. In the original Solow model, time is continuous and the

More information

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection

More information

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis

Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Data Analysis and Manifold Learning Lecture 6: Probabilistic PCA and Factor Analysis Radu Horaud INRIA Grenoble Rhone-Alpes, France Radu.Horaud@inrialpes.fr http://perception.inrialpes.fr/ Outline of Lecture

More information

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 The required files for all problems can be found in: http://www.stat.uchicago.edu/~lekheng/courses/331/hw3/ The file name indicates which problem

More information

LECTURE NOTE #11 PROF. ALAN YUILLE

LECTURE NOTE #11 PROF. ALAN YUILLE LECTURE NOTE #11 PROF. ALAN YUILLE 1. NonLinear Dimension Reduction Spectral Methods. The basic idea is to assume that the data lies on a manifold/surface in D-dimensional space, see figure (1) Perform

More information

ON D-OPTIMAL DESIGNS FOR ESTIMATING SLOPE

ON D-OPTIMAL DESIGNS FOR ESTIMATING SLOPE Sankhyā : The Indian Journal of Statistics 999, Volume 6, Series B, Pt. 3, pp. 488 495 ON D-OPTIMAL DESIGNS FOR ESTIMATING SLOPE By S. HUDA and A.A. AL-SHIHA King Saud University, Riyadh, Saudi Arabia

More information

Ross (1976) introduced the Arbitrage Pricing Theory (APT) as an alternative to the CAPM.

Ross (1976) introduced the Arbitrage Pricing Theory (APT) as an alternative to the CAPM. 4.2 Arbitrage Pricing Model, APM Empirical evidence indicates that the CAPM beta does not completely explain the cross section of expected asset returns. This suggests that additional factors may be required.

More information

Combinatorial semigroups and induced/deduced operators

Combinatorial semigroups and induced/deduced operators Combinatorial semigroups and induced/deduced operators G. Stacey Staples Department of Mathematics and Statistics Southern Illinois University Edwardsville Modified Hypercubes Particular groups & semigroups

More information

Appendix to The Life-Cycle and the Business-Cycle of Wage Risk - Cross-Country Comparisons

Appendix to The Life-Cycle and the Business-Cycle of Wage Risk - Cross-Country Comparisons Appendix to The Life-Cycle and the Business-Cycle of Wage Risk - Cross-Country Comparisons Christian Bayer Falko Juessen Universität Bonn and IGIER Technische Universität Dortmund and IZA This version:

More information

1. The OLS Estimator. 1.1 Population model and notation

1. The OLS Estimator. 1.1 Population model and notation 1. The OLS Estimator OLS stands for Ordinary Least Squares. There are 6 assumptions ordinarily made, and the method of fitting a line through data is by least-squares. OLS is a common estimation methodology

More information

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October

Finding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find

More information

Chapter 8. Models with Structural and Measurement Components. Overview. Characteristics of SR models. Analysis of SR models. Estimation of SR models

Chapter 8. Models with Structural and Measurement Components. Overview. Characteristics of SR models. Analysis of SR models. Estimation of SR models Chapter 8 Models with Structural and Measurement Components Good people are good because they've come to wisdom through failure. Overview William Saroyan Characteristics of SR models Estimation of SR models

More information

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26 Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1

More information

DYNAMIC AND COMPROMISE FACTOR ANALYSIS

DYNAMIC AND COMPROMISE FACTOR ANALYSIS DYNAMIC AND COMPROMISE FACTOR ANALYSIS Marianna Bolla Budapest University of Technology and Economics marib@math.bme.hu Many parts are joint work with Gy. Michaletzky, Loránd Eötvös University and G. Tusnády,

More information

Vector autoregressions, VAR

Vector autoregressions, VAR 1 / 45 Vector autoregressions, VAR Chapter 2 Financial Econometrics Michael Hauser WS17/18 2 / 45 Content Cross-correlations VAR model in standard/reduced form Properties of VAR(1), VAR(p) Structural VAR,

More information

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53 State-space Model Eduardo Rossi University of Pavia November 2014 Rossi State-space Model Fin. Econometrics - 2014 1 / 53 Outline 1 Motivation 2 Introduction 3 The Kalman filter 4 Forecast errors 5 State

More information

The Linear Regression Model

The Linear Regression Model The Linear Regression Model Carlo Favero Favero () The Linear Regression Model 1 / 67 OLS To illustrate how estimation can be performed to derive conditional expectations, consider the following general

More information

COMBINING PLS PATH MODELING AND QUANTILE REGRESSION FOR THE EVALUATION OF CUSTOMER SATISFACTION

COMBINING PLS PATH MODELING AND QUANTILE REGRESSION FOR THE EVALUATION OF CUSTOMER SATISFACTION Statistica Applicata - Italian Journal of Applied Statistics Vol. 26 (2) 93 COMBINING PLS PATH MODELING AND QUANTILE REGRESSION FOR THE EVALUATION OF CUSTOMER SATISFACTION Cristina Davino 1 Department

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Bootstrap Approach to Comparison of Alternative Methods of Parameter Estimation of a Simultaneous Equation Model

Bootstrap Approach to Comparison of Alternative Methods of Parameter Estimation of a Simultaneous Equation Model Bootstrap Approach to Comparison of Alternative Methods of Parameter Estimation of a Simultaneous Equation Model Olubusoye, O. E., J. O. Olaomi, and O. O. Odetunde Abstract A bootstrap simulation approach

More information

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x = Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A

MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A MultiDimensional Signal Processing Master Degree in Ingegneria delle elecomunicazioni A.A. 015-016 Pietro Guccione, PhD DEI - DIPARIMENO DI INGEGNERIA ELERICA E DELL INFORMAZIONE POLIECNICO DI BARI Pietro

More information