Chapter 10 Conjoint Use of Variables Clustering and PLS Structural Equations Modeling

Valentina Stan and Gilbert Saporta

Abstract. In the PLS approach, it is frequently assumed that the blocks of variables satisfy the assumption of unidimensionality. In order to fulfil this hypothesis at best, we use clustering methods of variables. We illustrate the conjoint use of variables clustering and PLS structural equations modeling on data provided by the PSA Company (Peugeot Citroën) on customer satisfaction. The data are satisfaction scores on 32 manifest variables given by 2,922 customers.

10.1 Clustering of Variables

There are two main methodologies: hierarchical methods and direct partitioning methods. Hierarchical methods are either agglomerative or divisive. Partitioning methods usually require that the number of groups be defined beforehand and will not be used here. A good partition is one in which the variables of the same class are correlated as much as possible. We will use algorithms which provide clusters that are as unidimensional as possible, and in which correlations between variables of the same cluster are larger than correlations between variables of different clusters. This means that blocks of variables should be as homogeneous as possible, but not independent. One may distinguish two cases, depending on whether the sign of the correlation coefficient is important or not (i.e. whether negative values show a disagreement between variables).

V. Stan, Groupe ESSCA Angers, 1 Rue Lakanal BP 40348, 49003 Angers Cedex 01, France; e-mail: valentina titu@yahoo.fr
G. Saporta, Conservatoire National des Arts et Métiers, Chaire de Statistique Appliquée, case 441, 292 rue Saint Martin, 75141 Paris Cedex 03, France; e-mail: gilbert.saporta@cnam.fr

In: V. Esposito Vinzi et al. (eds.), Handbook of Partial Least Squares, Springer Handbooks of Computational Statistics, DOI 10.1007/978-3-540-32827-8_11, Springer-Verlag Berlin Heidelberg 2010.

10.1.1 Agglomerative Hierarchical Clustering Methods

10.1.1.1 Methods Derived from Clustering of Statistical Units (Nakache and Confais 2005)

Various dissimilarity measures can be used, based on the usual correlation coefficient, such as $1 - r_{jj'}$, or $1 - |r_{jj'}|$ if the sign of the correlation is not important, or $s_{jj'} = \cos^{-1}(r_{jj'})$. One then applies the usual strategies of aggregation: single linkage, average linkage, complete linkage, Ward's criterion, etc.

10.1.1.2 The VARHCA Method (Vigneau and Qannari 2003)

Let $C_1, C_2, \ldots, C_k$ be $k$ blocks (or clusters) of manifest variables and $Y_1, Y_2, \ldots, Y_k$ the standardized latent variables (first principal components) associated with each cluster. Manifest variables are centred, but not necessarily standardized. The following hierarchical procedure aims at locally optimizing the criterion $T$ defined by:

$$T = \sum_{r=1}^{k} \sum_{j=1}^{p} \delta_{rj}\, \mathrm{cov}^2(x_j, Y_r), \qquad \text{where } \delta_{rj} = \begin{cases} 1 & \text{if } x_j \in C_r \\ 0 & \text{otherwise.} \end{cases}$$

At the first level of the hierarchy, each variable forms a cluster by itself; then $T_0 = \sum_{j=1}^{p} \mathrm{var}(x_j)$. At level $i$, one merges the two clusters giving the minimal variation of $T$:

$$\Delta T = T_{i-1} - T_i = \lambda_1^{(A)} + \lambda_1^{(B)} - \lambda_1^{(A \cup B)},$$

where $\lambda_1^{(A)}$, $\lambda_1^{(B)}$ and $\lambda_1^{(A \cup B)}$ are the largest eigenvalues of the covariance matrices of the variables in clusters $A$, $B$ and $A \cup B$.

10.1.2 Cutting Trees

The resulting tree should be cut at a suitable level to get a partition. We use a criterion of unidimensionality of the groups to choose this cut. Starting from the root of the tree, we first cut into 2 classes and check the hypothesis of unidimensionality using Cronbach's α or Dillon-Goldstein's ρ. If these values are close to 1, the hypothesis of unidimensionality is accepted. Otherwise, we proceed to a cut at the following level of the tree, and so on, repeating the procedure until we obtain classes satisfying the unidimensionality criterion.
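As a minimal numpy sketch of this merge rule (our own illustration, with invented function names, not the authors' implementation), the variation ΔT can be computed directly from the largest eigenvalues of the covariance sub-matrices:

```python
import numpy as np

def largest_eigenvalue(cov):
    """Largest eigenvalue of a symmetric (covariance) matrix."""
    return np.linalg.eigvalsh(np.atleast_2d(cov))[-1]

def merge_variation(X, idx_a, idx_b):
    """Variation of the VARHCA criterion when merging clusters A and B:
    delta_T = lambda1(A) + lambda1(B) - lambda1(A union B)."""
    Xc = X - X.mean(axis=0)               # variables are centred, not standardized
    cov = np.cov(Xc, rowvar=False)
    a, b = list(idx_a), list(idx_b)
    ab = a + b
    return (largest_eigenvalue(cov[np.ix_(a, a)])
            + largest_eigenvalue(cov[np.ix_(b, b)])
            - largest_eigenvalue(cov[np.ix_(ab, ab)]))

# At each level of the hierarchy, one merges the pair of clusters
# with the minimal variation of T.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
print(merge_variation(X, [0, 1], [2, 3]))
```

The variation is always non-negative, so T can only decrease when clusters are merged; the pair minimizing it is aggregated at each level.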

10.1.3 Divisive Methods

The SAS VARCLUS procedure is one of the best known. At the first step, one performs a PCA with all manifest variables. If there is only one principal component with an eigenvalue greater than 1, there is only one cluster. Otherwise one considers the first two principal components: each manifest variable is associated with the principal component to which it is closest, in terms of the squared linear correlation coefficient, thus forming two groups of variables. If the second eigenvalue of a group is greater than 1, this group is divided in turn, according to the same method, and so on, until each group has only one principal component.

10.2 Application to Structural Equation Modeling

Let p variables be observed on n units. The p variables are partitioned into J subsets or blocks of $k_j$ variables which are presumed to be pertinent for describing the phenomenon. Each of these blocks is designed to describe a theme of the general phenomenon. We shall designate these blocks by $X_j$ and consider them as matrices of dimension $(n \times k_j)$ (Tenenhaus et al. 2005). In the following, we shall always suppose that each block is associated with only one latent variable (unidimensionality). In order to obtain unidimensional blocks, we propose to use some of the clustering methods presented in Sect. 10.1. We can therefore identify the blocks by the same name as their latent variable. The latent variable corresponding to the block $X_j$ will be designated by $\xi_j$.

In the following, we study the specific case where there are no pre-defined causal relationships between the latent variables. We use the blocks obtained by each method to build the causality scheme. With the help of experts, we propose relationships between latent variables with the aim of explaining the general satisfaction of the customers, and we thereby establish the inner model.
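A minimal numpy sketch of one such divisive step (our own illustrative code working on the correlation matrix, not the SAS implementation):

```python
import numpy as np

def varclus_split(X):
    """One VARCLUS-style step: if the second eigenvalue of the correlation
    matrix exceeds 1, assign each variable to the closest of the first two
    principal components (largest squared correlation); else keep one cluster."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]        # descending order
    if eigval[1] <= 1.0:
        return None                                       # only one cluster
    pcs = Z @ eigvec[:, :2]                               # first two components
    r2 = np.array([[np.corrcoef(Z[:, j], pcs[:, k])[0, 1] ** 2
                    for k in range(2)] for j in range(Z.shape[1])])
    return r2.argmax(axis=1)                              # group label (0 or 1)

# Two blocks of correlated variables are separated in one step:
rng = np.random.default_rng(1)
f1, f2 = rng.standard_normal((2, 200))
noise = lambda: 0.3 * rng.standard_normal(200)
X = np.column_stack([f1 + noise() for _ in range(4)]
                    + [f2 + noise() for _ in range(3)])
labels = varclus_split(X)
print(labels)
```

In the full procedure each resulting group is tested and split again as long as its second eigenvalue exceeds 1.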
To choose the best model among several, we use the global criterion developed by Amato et al. (2004):

$$\mathrm{GoF} = \sqrt{\overline{\text{communality}} \times \overline{R^2}},$$

where the average communality is the mean of the communalities of the blocks and measures the quality of the external model, and $\overline{R^2}$ is the average of the $R^2$ of the endogenous latent variables. Each $R^2$ measures the quality of the inner model and is calculated for each endogenous variable with respect to the latent variables which explain it. The software used is the PLSX module of SPAD.
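In code the criterion is a one-liner; the communalities and R² values below are invented placeholders, not the chapter's results:

```python
import numpy as np

def goodness_of_fit(communalities, r_squares):
    """Amato et al. (2004): GoF = sqrt(mean communality * mean R^2)."""
    return float(np.sqrt(np.mean(communalities) * np.mean(r_squares)))

# Hypothetical communalities (one per block) and R^2 (one per endogenous LV):
print(round(goodness_of_fit([0.55, 0.60, 0.50], [0.42, 0.50, 0.27]), 2))  # prints 0.47
```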

10.3 Practical Application

10.3.1 The Questionnaire

The data are satisfaction scores scaled between 1 and 10 on 32 services for a car; 2,922 customers participated. The manifest variables are the following (Table 10.1).

Table 10.1 Manifest variables. The items are coded Sat01h to Sat32h, with Sat33h and Sat34h used in place of Sat07h and Sat10h:
General satisfaction; General quality; Quality/price ratio; Absence of small, irritating defects; Absence of noise and vibrations; General state of the paintwork; Robustness of commands, buttons; Solidity and robustness; Lock, door and window mechanisms; Inside space and seat modularity; Inside habitability; Dashboard: quality of materials and finishing; Inside: quality of mat. and finishing; Front seat comfort position; Visibility from driver's seat; Radio CD-ROM; Heating ventilation; Boot capacity; Security; Braking; Acceleration; Handling; Suspension comfort; Silence in rolling; Maniability; Direction; Gears; Mechanic reliability; Oil consumption; Mechanic's efficiency in solving problems; Maintenance cost and repairs.

10.3.2 Clustering Variables

We have used $1 - r_{jj'}$ as distance. We have applied six clustering methods of variables: single linkage, average linkage, complete linkage, Ward's criterion, VARCLUS and VARHCA. Single linkage and average linkage did not provide well-separated clusters, so they were eliminated. For Ward's criterion, the tree shows that a partition into 8 classes is reasonable; for complete linkage, into 6 classes. The partition obtained by cutting the VARHCA tree into 7 clusters is here exactly the same as the partition given by VARCLUS. The Cronbach's α coefficients show that the obtained blocks are unidimensional. In the following, we present the blocks for complete linkage, Ward's criterion, VARCLUS and VARHCA.

In Table 10.2 we can observe that the blocks solidity and driving are identical for all methods. General satisfaction has the same composition for complete linkage, VARCLUS and VARHCA, but the partition issued from Ward's criterion is more logical, according to experts. By comparison with the other methods, complete linkage groups into a single block the variables which form the blocks interior design, driving comfort and interior comfort in Ward's criterion, VARCLUS and VARHCA. For VARCLUS and VARHCA, the variables which are associated with the block maintenance in Ward's criterion and complete linkage are in the same block with quality/price ratio. Complete linkage is the only method which forms a distinct block for the quality/price ratio variable.

[Fig.: dendrogram for Ward's criterion (partition in 8 classes), with the 32 manifest variables as leaves]
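The variable-clustering step can be sketched with scipy (our own illustration, not the software actually used in the study): build the 1 − r dissimilarity matrix over the columns and feed it to a hierarchical linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_variables(X, n_clusters, method="ward"):
    """Cluster the columns of X with 1 - r as dissimilarity; 'method' can be
    'single', 'average', 'complete' or 'ward'."""
    D = 1.0 - np.corrcoef(X, rowvar=False)   # dissimilarity between variables
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Variables built from two latent factors are split into two blocks:
rng = np.random.default_rng(2)
f1, f2 = rng.standard_normal((2, 500))
X = np.column_stack([f1 + 0.4 * rng.standard_normal(500) for _ in range(3)]
                    + [f2 + 0.4 * rng.standard_normal(500) for _ in range(3)])
print(cluster_variables(X, n_clusters=2))
```

Cutting the resulting tree at successive levels (2, 3, ... classes) and checking Cronbach's α of each class reproduces the cutting procedure of Sect. 10.1.2.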

Table 10.2 Blocks of manifest variables after Ward's criterion, complete linkage, VARCLUS and VARHCA. Ward's criterion yields 8 blocks: general satisfaction (Gs), construct quality (Cq), maintenance (Mn), solidity (Sd), interior design (Id), driving comfort (Dc), interior comfort (Ic) and driving (Dq). Complete linkage yields 6 blocks: general satisfaction (Gs), quality/price ratio (Qpr), comfort (Cf), maintenance (Mn), solidity (Sd) and driving (Dq). VARCLUS and VARHCA yield the same 7 blocks: general satisfaction (Gs), quality/price (Qp), solidity (Sd), interior design (Id), driving comfort (Dc), interior comfort (Ic) and driving (Dq).

[Fig.: dendrogram for complete linkage (partition in 6 classes)]

[Fig.: dendrogram for VARHCA and VARCLUS (partition in 7 classes)]

10.3.3 PLS Structural Models

The clustering techniques provide blocks but not the relationships between them. With the help of experts, we then propose relations between blocks, so as to explain the latent variable general satisfaction. The three causality schemes are given in Figs. 10.1 to 10.3. The values of Amato's GoF criterion are: 0.48 for Ward's criterion, 0.42 for complete linkage and 0.47 for VARCLUS. Ward's clustering gives the best result and is selected.

10.3.4 Results and Interpretations

10.3.4.1 The Measurement Model

After convergence of the PLS algorithm, one obtains the final weights which link the manifest variables to the latent variables. An example for general satisfaction:

Gs = 0.22 Sat01h + 0.57 Sat02h + 0.48 Sat03h.

Table 10.3 presents only the correlations larger than the mean of the absolute values (0.3333).

[Fig. 10.1 Causality scheme after Ward's clustering, linking Cq, Mn, Sd, Id, Dc, Ic, Dq and Gs]
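The weights act on the standardized manifest variables; a toy sketch (the customer scores below are invented, only the three weights come from the text) of how such a latent score is computed:

```python
import numpy as np

# Outer weights reported above for the general satisfaction block:
weights = {"Sat01h": 0.22, "Sat02h": 0.57, "Sat03h": 0.48}

def latent_score(mv, weights):
    """Latent variable score = weighted sum of standardized manifest variables."""
    z = {k: (v - v.mean()) / v.std() for k, v in mv.items()}
    return sum(w * z[k] for k, w in weights.items())

# Invented 1-10 satisfaction scores for five customers:
mv = {"Sat01h": np.array([7.0, 8, 5, 9, 6]),
      "Sat02h": np.array([6.0, 9, 4, 8, 7]),
      "Sat03h": np.array([5.0, 7, 6, 9, 8])}
gs = latent_score(mv, weights)
print(gs.round(2))
```

Because each standardized manifest variable has zero mean, the resulting latent score is centred as well.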

[Fig. 10.2 Causality scheme after complete linkage clustering, linking Cf, Qpr, Mn, Sd, Dq and Gs]

[Fig. 10.3 Causality scheme after VARCLUS or VARHCA clustering, linking Qp, Sd, Id, Dc, Ic, Dq and Gs]

Analyzing the correlations, we observe that all latent variables are well correlated with their own manifest variables. The manifest variables therefore describe their latent variable appropriately, and the blocks are validated.

10.3.4.2 The Structural Model

The $R^2$ coefficients between connected latent variables are:

$R^2(\text{Driving comfort} \mid \text{Sd}, \text{Id}) = 0.42$
$R^2(\text{Driving} \mid \text{Sd}, \text{Id}, \text{Dc}, \text{Ic}) = 0.5$
$R^2(\text{General satisfaction} \mid \text{Cq}, \text{Mn}, \text{Sd}, \text{Id}, \text{Dc}, \text{Ic}, \text{Dq}) = 0.27$
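Each R² above is the ordinary coefficient of determination of an endogenous latent score regressed on its explanatory latent scores. A self-contained sketch on simulated scores (our own data, not the chapter's):

```python
import numpy as np

def r_squared(y, X):
    """R^2 of y regressed (OLS, with intercept) on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# Simulated endogenous LV driven by two explanatory LVs plus noise:
rng = np.random.default_rng(3)
L = rng.standard_normal((500, 2))
y = L @ np.array([0.6, 0.3]) + 0.8 * rng.standard_normal(500)
print(round(r_squared(y, L), 2))
```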

Table 10.3 Correlations between the 32 manifest variables and the latent variables (only values larger than 0.3333 are shown). Each manifest variable correlates most strongly with its own latent variable, with own-block correlations ranging from about 0.47 to 0.88; the remaining cross-correlations stay below these values.

For general satisfaction, the $R^2$ coefficient generated by the other latent variables is 27%, which we consider satisfactory given that there are 2,922 individuals (Table 10.4). Analyzing the correlations between the latent variables, we can see that to improve driving, the producer should concentrate on driving comfort (correlation coefficient = 0.62), on solidity (0.53) and on interior design (0.51). In order to obtain good driving comfort, the producer could concentrate on interior design (0.57) and on solidity (0.51). Given the causality scheme, the determination of general satisfaction is a complex procedure in which almost all the latent variables are directly involved.
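The filtering rule of Table 10.3 (keep only correlations above the mean absolute correlation) is easy to reproduce; the code and data below are our own illustration, with invented variable names such as Sat99h:

```python
import numpy as np

def table_correlations(mv, lv, mv_names, lv_names):
    """Manifest-latent correlations, keeping only the values larger than the
    mean of all absolute correlations (the 0.3333 threshold of the chapter)."""
    p, q = mv.shape[1], lv.shape[1]
    C = np.corrcoef(mv, lv, rowvar=False)[:p, p:]   # manifest x latent block
    threshold = np.abs(C).mean()
    return {(mv_names[j], lv_names[k]): round(float(C[j, k]), 4)
            for j in range(p) for k in range(q) if C[j, k] > threshold}

# Two manifest variables loading on a latent score, plus one pure-noise item:
rng = np.random.default_rng(4)
lv = rng.standard_normal((300, 1))
mv = np.column_stack([lv[:, 0] + 0.5 * rng.standard_normal(300),
                      lv[:, 0] + 0.5 * rng.standard_normal(300),
                      rng.standard_normal(300)])
print(table_correlations(mv, lv, ["Sat01h", "Sat02h", "Sat99h"], ["Gs"]))
```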

Table 10.4 Correlations between the latent variables

        Gs      Cq      Mn      Sd      Id      Dc      Ic      Dq
Gs    1.0000
Cq    0.4041  1.0000
Mn    0.3576  0.3503  1.0000
Sd    0.3722  0.3407  0.2914  1.0000
Id    0.2237  0.0988  0.1979  0.4217  1.0000
Dc    0.2928  0.1539  0.2266  0.5119  0.5729  1.0000
Ic    0.1854  0.1233  0.2301  0.3951  0.3812  0.4542  1.0000
Dq    0.2943  0.2023  0.3071  0.5257  0.5085  0.6180  0.5029  1.0000

(Gs: general satisfaction; Cq: construct quality; Mn: maintenance; Sd: solidity; Id: interior design; Dc: driving comfort; Ic: interior comfort; Dq: driving.)

Construct quality is the most important variable for general satisfaction (correlation coefficient = 0.40) and the least important is interior comfort (0.19). Consequently, in order to increase general satisfaction, the producer should concentrate first on construct quality and then on solidity, maintenance, driving, driving comfort, interior design and interior comfort. The structural equation is as follows:

Gs = 0.26 Cq + 0.19 Mn + 0.15 Sd + 0.03 Id + 0.10 Dc - 0.03 Ic + 0.04 Dq.

10.4 Conclusions

Variables clustering provides a simple way of obtaining unidimensional blocks in structural equation modeling when prior knowledge of the blocks is not available. It must be underlined that this study did not follow the logical sequence of steps of the PLS approach: the construction of a model by experts, the construction of a questionnaire using this model, and the collection of customer data using this questionnaire. In our case, the process was inverted: we tried to build a model using data that had already been collected. This obviously has effects on the final results which cannot be measured. By means of clustering methods of variables, we established the external model. According to Amato's criterion, Ward's clustering was chosen as the best technique for our data set, but we observe that the values of this criterion for the three models are very close.
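A structural equation of this form can be obtained by regressing the standardized general-satisfaction score on the standardized explanatory latent scores. The sketch below shows the mechanics on simulated scores with known coefficients (our own data, not the chapter's):

```python
import numpy as np

def path_coefficients(endog, exog):
    """Inner-model coefficients: OLS of the standardized endogenous LV on the
    standardized explanatory LVs (no intercept needed after standardizing)."""
    Z = (exog - exog.mean(axis=0)) / exog.std(axis=0)
    z = (endog - endog.mean()) / endog.std()
    beta, *_ = np.linalg.lstsq(Z, z, rcond=None)
    return beta

# Simulated latent scores with known structural coefficients:
rng = np.random.default_rng(5)
L = rng.standard_normal((2000, 3))
gs = L @ np.array([0.26, 0.19, 0.15]) + 0.93 * rng.standard_normal(2000)
print(path_coefficients(gs, L).round(2))
```

Note that an inner coefficient can be small or even negative (as for Ic in the equation above) while the raw correlation is positive, because the explanatory latent variables are themselves correlated.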

For the chosen model, a hierarchy of the influence of the latent variables on general satisfaction can be established using the structural model: I. Construct quality; II. Solidity; III. Maintenance; IV. Driving; V. Driving comfort; VI. Interior design; VII. Interior comfort. The results obtained are satisfactory: R² = 27% for a large sample of almost 3,000 respondents.

References

Amato, S., Esposito Vinzi, V., & Tenenhaus, M. (2004). A global goodness-of-fit index for PLS structural equation modeling. Oral communication to the PLS Club, HEC School of Management, France, March 24.

Nakache, J. P., & Confais, J. (2005). Approche pragmatique de la classification. Paris: Technip.

Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M., & Lauro, C. (2005). PLS path modeling. Computational Statistics and Data Analysis, 48(1), 159-205.

Vigneau, E., & Qannari, E. M. (2003). Clustering of variables around latent components: Application to sensory analysis. Communications in Statistics, Simulation and Computation, 32(4), 1131-1150.