The European Commission s science and knowledge service. Joint Research Centre

Size: px
Start display at page:

Download "The European Commission s science and knowledge service. Joint Research Centre"

Transcription

1 The European Commission s science and knowledge service Joint Research Centre

2 Step 7: Statistical coherence (II) Principal Component Analysis and Reliability Analysis Hedvig Norlén COIN th JRC Annual Training on Composite Indicators & Scoreboards 05-09/11/2018 Ispra (IT)

3 Step 7: Statistical coherence (II) introduction New property will be given by a linear combination w 1 x + w 2 y y PCA will find this best line: 1) Maximum variance 2) Minimum error First Principal Component! x 3 JRC-COIN Step 7: Statistical Coherence (II)

4 Step 7: Statistical coherence (II) introduction Input Output Black Box PCA = Principal Component Analysis RA= Reliability Analysis 4 JRC-COIN Step 7: Statistical Coherence (II)

5 Outline talk Statistical coherence (II) How multivariate methods can help to understand the statistical coherence of our composite indicator Part 1 Principal Component Analysis Part 2 Reliability Analysis (Cronbach s alfa) Is the structure of our composite indicator statistically well-defined? Is the set of available indicators sufficient to describe the pillars and the sub-pillars of the composite indicator? 5 JRC-COIN Step 7: Statistical Coherence (II)

6 Ex 1. European Skills Index The European Skills Index (ESI) measures the performance of a country s skills system taking into account its manifold facets from continually developing the skills of the population to activating and effectively matching these skills to the needs of employers in the labour market. ESI launched in September JRC-COIN Step 7: Statistical Coherence (II)

7 European Skills Index Framework of the European Skills Index 2018 A Composite Indicator measures multifaceted phenomenon - combination of different aspects (Sub-pillars/Pillars). Each aspect can be measured by a set of observable variables (Indicators). Composite Indicator Pillars Sub-pillars Indicators Type equation here. 7 JRC-COIN Step 7: Statistical Coherence (II)

8 Statistical coherence (II) - unidimensionality PCA is used to verify the internal consistency, verify unidimensionality within: 1) each Sub-pillar (across indicators) Framework of the European Skills Index 2018 Type equation here. 8 JRC-COIN Step 7: Statistical Coherence (II)

9 Statistical coherence (II) - unidimensionality PCA is used to verify the internal consistency, verify unidimensionality within: 1) each Sub-pillar (across Indicators) 2) each Pillar (across Sub-pillars) Framework of the European Skills Index 2018 Type equation here. 9 JRC-COIN Step 7: Statistical Coherence (II)

10 Statistical coherence (II) - unidimensionality PCA is used to verify the internal consistency, verify unidimensionality within: 1) each Sub-pillar (across Indicators) 2) each Pillar (across Sub-pillars) 3) the Composite Indicator (across Pillars) Framework of the European Skills Index 2018 Type equation here. 10 JRC-COIN Step 7: Statistical Coherence (II)

11 PCA PCA Identify a small number of averages (PCs) that explain most of the variance observed. Indicator 1 X 1 Indicator 2 X 2... Indicator p X p w 2 w 1 w p Component PCA summarizes information of all indicators and reduces it into a fewer number of components Each principal component PC i is a new variable computed as a linear combination of the original (standardized) variables Observed indicators are reduced into components PC 1 = w 1 x 1 + w 2 x 2 + w p x p 11 JRC-COIN Step 7: Statistical Coherence (II)

12 European Skills Index 2 dimensions y PC2 x PC1 x RO BG IT BE SE FI PC1 RO BG IT BE FI SE Variance 81% y IT BGRO BE FI SE PC2 ITFI BG ROBESE Variance 19% 12 JRC-COIN Step 7: Statistical Coherence (II)

13 European Skills Index third piller dimension added PCA example to be completed in a little while! x RO FI Variance 58% y IT SE Variance 29% z EL CZ Variance 12% 13 JRC-COIN Step 7: Statistical Coherence (II)

14 First steps in PCA Check the correlation structure of the data and perform 2 pre-tests 1) Bartlett s sphericity test The test checks if the observed correlation matrix R diverges significantly from the identity matrix. H 0 : R = 1, H 1 : R 1 Want to reject H 0 to be able to do perform a valid PCA (Bartlett (1937)) 2) Kaiser-Meyer-Olkin (KMO) Measure for Sampling Adequacy Measure of the strength of relationship among variables based on correlations and partial correlations. KMO between [0;1]. Want KMO close to 1 to be able to perform a valid PCA. KMO>0.6 OK! (Kaiser (1970), Kaiser-Meyer (1974)) 14 JRC-COIN Step 7: Statistical Coherence (II)

15 First steps in PCA 2) Kaiser-Meyer-Olkin (KMO) Measure for Sampling Adequacy Kaiser s own interpretations of the KMO values KMO values: in the.90s, marvelous in the.80s, meritorious in the.70s, middling in the.60s, mediocre in the.50s, miserable below.50, unacceptable 1) Bartlett s sphericity test and 2) KMO test provide info whether it is possible to do PCA, but do not give info of the magic number - how many components are needed KMO>0.6 OK 15 JRC-COIN Step 7: Statistical Coherence (II)

16 How to get the PCs 1) Eigenvalue decomposition of a data correlation matrix (case of composite indicator), 2) Singular Value Decomposition (SVD) of a data matrix after mean centering (normalizing) the data matrix Assumptions: I) Linearity II)Normal distribution III) Principal components are orthogonal 16 JRC-COIN Step 7: Statistical Coherence (II)

17 Finding the magic number - determining how many components in PCA Several methods exist. The 3 most common are: 1) Kaiser Guttman Eigenvalues greater than one criterion (Guttman (1954), Kaiser (1960)). Select all components with eigenvalues over 1 (or 0.9) 2) Cattell s scree test (Cattell (1966)) Above the elbow approach Scree 3) Certain percentage of explained variance e.g., >2/3, 75%, 80%, 17 JRC-COIN Step 7: Statistical Coherence (II)

18 European Skills Index finalizing the PCA example Use PCA to explore the dimensionality of ESI across the three pillars. Framework of the European Skills Index 2018 The 3 pillars should be correlated. Supports the assumption that they are all measuring the a similar concept! Type equation here. 18 JRC-COIN Step 7: Statistical Coherence (II)

19 European Skills Index - PCA example Check the correlation structure Pearson correlation coefficients (Correlations not significant at significance level α = 0.01 are grey, n=28, critical value = 0.48) 19 JRC-COIN Step 7: Statistical Coherence (II)

20 European Skills Index - PCA example Check the correlation structure 1) Bartlett s sphericity test p-value (1.5e-17) < 0.01 Reject H 0 2) KMO Measure for Sampling Adequacy Overall MSA = 0.54 KMO>0.6! Pearson correlation coefficients 20 JRC-COIN Step 7: Statistical Coherence (II)

21 European Skills Index - PCA results 1.75= Total variance explained 58.41=1.75/3*100 Pearson correlation coefficients between pillar and principal component Component loadings Component Eigenvalue Variance Cumulative variance PC1 PC2 PC Skills Development Skills Activation ,00 3. Skills Matching Sum Sum of Squares Stopping criterion Eigenvalue > 1 PC 0. 51X 0.88X 0.84X 1 SD SA SM One dimension verified! 21 JRC-COIN Step 7: Statistical Coherence (II)

22 European Skills Index - Audit results ESI theoretical framework 22 JRC-COIN Step 7: Statistical Coherence (II)

23 Ex 2. Social Progress Index The Social Progress Index (SPI) is an international monitoring framework for measuring social progress without resorting to the use of economic indicators. SPI measures country performance on aspects of social and environmental performance, which are relevant for countries at all levels of economic development. 146 fully and 90 partially ranked countries. 23 JRC-COIN Step 7: Statistical Coherence (II)

24 Social Progress Index - PCA at Index level Social progress is measured taking into account the following three broad aspects: 1. Meeting everyone s basic needs for food, clean water, shelter and security. 2. Living long, healthy lives with basic knowledge and communication and a clean environment. 3. Practicing equal rights and freedoms and pursuing higher education. 5 th version of SPI was launched September PCA to explore the dimensionality of SPI across the three dimensions. 24 JRC-COIN Step 7: Statistical Coherence (II)

25 Social Progress Index - PCA example Check the correlation structure Correlation matrix Basic Human Needs Foundations of Wellbeing Opportunity Basic Human Needs 1 Foundations of Wellbeing Opportunity Pearson correlation coefficients (Significance level α = 0.01, n=146, critical value = 0.21) 25 JRC-COIN Step 7: Statistical Coherence (II)

26 Social Progress Index - PCA example Check the correlation structure Correlation matrix Basic Human Needs Basic Human Needs 1 Foundations of Wellbeing Foundations of Wellbeing Opportunity Opportunity ) Bartlett s sphericity test p-value (2.0e-116) < 0.01 Reject H 0 2) KMO Measure for Sampling Adequacy Overall MSA = 0.69 KMO>0.6 Pearson correlation coefficients 26 JRC-COIN Step 7: Statistical Coherence (II)

27 Social Progress Index - PCA results Total variance explained Pearson correlation coefficients between pillar and principal component Component loadings Component Eigenvalue Variance Cumulative variance PC1 PC2 PC Basic Human Needs 2. Foundations of Wellbeing ,00 3. Opportunity Sum Sum of Squares Stopping criterion Eigenvalue > 1 PC 0. 93X 0.96X 0.98X 1 BHN FoW Opp One dimension verified! 27 JRC-COIN Step 7: Statistical Coherence (II)

28 Social Progress Index - PCA results PCA results often summarized in a Factor map PC 0.96X 0.98X 0. 93X 1 BHN FoW PC 0.25X 0.09X 0. 35X PC 2 2 BHN FoW Second principal component is useful to evaluate the differences between the first two and third dimensions. Opp Opp 28 JRC-COIN Step 7: Statistical Coherence (II)

29 Social Progress Index - Audit results 29 JRC-COIN Step 7: Statistical Coherence (II)

30 Social Progress Index - Audit results Redundancies in SPI framework - very strong correlations between the SPI aggregates (dimensions and components). PCA was also performed on the whole set of 51 indicators Some components measure similar concepts: C1:Water and Sanitation C2:Shelter C5: Access to Basic Knowledge C12: Access to Advanced Education Index 3 Dim 12 Comp JRC-COIN Step 7: Statistical Coherence (II)

31 Social Progress Index - Audit results Very strong correlations between the SPI aggregates (dimensions and components). PCA was also performed on the whole set of 51 indicators. 7 latent dimensions (principal components) are retrieved, which capture 78% of the total variance in the underlying indicators. 31 JRC-COIN Step 7: Statistical Coherence (II) Total variance explained Component Eigenvalue Cumulative variance 1 27,55 54,02 2 5,30 64,42 3 1,98 68,31 4 1,63 71,51 5 1,24 73,94 6 1,22 76,34 7 1,07 78,44 8 0,97 80,33 9 0,84 81, ,76 83, ,67 84, ,67 86, ,01 100,00 Sum 51 Stopping criterion Eigenvalue > 1 Difference less than 8%

32 Social Progress Index - Audit results From JRC Audit report: Less number of components is also in line with the results from the beyond GDP approach by the Commission on the Measurement of Economic Performance and Social Progress*. The SPI conceptual framework has been influenced by this work. So apart from the statistical reasoning, the result from the beyond GDP approach, may also give conceptual justification for reducing the number of components in the SPI framework. * Stiglitz, J., Sen,A., Fitoussi,J.-P., Report by the Commission on the Measurement of Economic Performance and Social Progress. In: Commission on the Measurement of Economic Performance and Social Progress, Paris, France. 32 JRC-COIN Step 7: Statistical Coherence (II)

33 Historical notes PCA PCA is probably the oldest of the multivariate analysis methods. Pearson (1901) said that his methods can be easily applied to numerical problems and although he says that the calculations become cumbersome for four or more variables he suggests that they are still quite feasible. Hotelling (1933) chose his components so as to maximize their successive contributions to the total of the variances of the original variables, principal components. Karl Pearson ( ) Harold Hotelling ( ) 33 JRC-COIN Step 7: Statistical Coherence (II)

34 Part 2: Reliability Analysis Cronbach s α is measure of internal consistency and reliability Based upon classical test theory. Most common estimate of reliability, Cronbach (1951). Cronbach s α is defined as Ratio of the Total Covariance in the test to the Total Variance in the test. By test we mean the set of indicators constituting a Sub-pillar/Pillar/Composite Indicator. 34 JRC-COIN Step 7: Statistical Coherence (II)

35 Cronbach s α Cronbach s α is measure of internal consistency and reliability α ranges from 0 to 1. If all indicators are entirely independent from one another (i.e. are not correlated or share no covariance) α = 0. If all of the items have high covariances, then α will approach 1. Caution! α increases when the number of indicators increases. 35 JRC-COIN Step 7: Statistical Coherence (II)

36 Cronbach s α cut off values Cronbach s α is measure of internal consistency and reliability Lee Cronbach ( ) Cronbach's alpha Internal consistency 0.9 α Excellent 0.8 α < 0.9 Good 0.7 α < 0.8 Acceptable 0.6 α < 0.7 Questionable 0.5 α < 0.6 Poor α < 0.5 Unacceptable 36 JRC-COIN Step 7: Statistical Coherence (II)

37 Cronbach s α - European Skills Index Cronbach s α is measure of internal consistency and reliability European Skills Index Cronbach's Alpha Index level (3 pillars) Item level Corrected Item-Total Correlation 0.59 Alpha if Item is dropped Pillar 1: Skills Development Pillar 2: Skills Activation Pillar 3: Skills Matching Cronbach's alpha Internal consistency 0.9 α Excellent 0.8 α < 0.9 Good 0.7 α < 0.8 Acceptable 0.6 α < 0.7 Questionable 0.5 α < 0.6 Poor α < 0.5 Unacceptable 37 JRC-COIN Step 7: Statistical Coherence (II)

38 Cronbach s α - Social Progress Index Cronbach s α is measure of internal consistency and reliability Social Progress Index Index level (3 dimensions) Item level Cronbach's Alpha Corrected Item-Total Correlation 0.96 Alpha if Item is dropped 1: Basic Human Needs : Foundations of Wellbeing : Opportunity Cronbach's alpha Internal consistency 0.9 α Excellent 0.8 α < 0.9 Good 0.7 α < 0.8 Acceptable 0.6 α < 0.7 Questionable 0.5 α < 0.6 Poor α < 0.5 Unacceptable If α is too high it may suggest that some items are redundant as that are measuring very similar concepts. Maximum α 0.9 has been recommended. 38 JRC-COIN Step 7: Statistical Coherence (II)

39 Concluding words PCA and RA Input Output Open transparent Black Box box 39 JRC-COIN Step 7: Statistical Coherence (II)

40 References Books Johnson, Applied Multivariate Statistical Analysis (6th Revised edition), Pearson Education Limited (2013) Jolliffe, Principal Component Analysis, Second Edition, Springer (2002) Leandre, Fabrigar, Duane, Exploratory Factor Analysis, Oxford (2011) Rencher, Methods of multivariate analysis, Wiley (1995) Tinsley and Brown, Handbook of Applied Multivariate Statistics and Mathematical Modeling, Academic Press (2000) Tucker, MacCallum, Exploratory Factor Analysis, (1997) 40 JRC-COIN Step 7: Statistical Coherence (II)

41 THANK YOU Any questions? Welcome to us at: You may contact us & The European Commission s Competence Centre on Composite Indicators and Scoreboards COIN in the EU Science Hub COIN tools are available at:

42 Data matrix of indicators x x... x x x... x X.. x x... x p p n1 n2 np n = number of objects (countries, ) p = number of variables (indicators) The previous steps in building our CI - Step 3 Outlier detection and missing data estimation - Step 4 Normalisation/Standardisation X correlated data set (without outliers and (without or a few) missing values) PCA is based on correlations 42 JRC-COIN Step 7: Statistical Coherence (II)

43 First steps in PCA Check the correlation structure of the data and do 1) and 2) 1) Bartlett s sphericity test The Bartlett s test checks if the observed correlation matrix R diverges significantly from the identity matrix. H 0 : R = 1, H 1 : R 1 Test statistic 2 n 1 2 p 5 log 6 Under H 0, it follows a χ² distribution with a p(p-1)/2 degrees of freedom. R n = sample size p = number of indicators Want to reject H 0 to be able to do PCA,FA! (Bartlett (1937)) 43 JRC-COIN Step 7: Statistical Coherence (II)

44 First steps in PCA Check the correlation structure of the data and do 1) and 2) 2) Kaiser-Meyer-Olkin (KMO) Measure for Sampling Adequacy Indicators are more or less correlated, but the correlation between two indicators can be influenced by the others. Use partial correlation in order to measure the relation between two indicators by removing the effect of the remaining indicators (controlling for the confounding variable) KMO 2 rij 2 r ij 2 aij r ij Pearson correlation coefficient a ij partial correlation coefficient Want KMO close to 1 to be able to do PCA! KMO>0.6 ok (Kaiser (1970), Kaiser-Meyer (1974)) 44 JRC-COIN Step 7: Statistical Coherence (II)

45 PCA and sample size Sufficient number of cases (countries, regions, cities ) necessary to perform PCA. NO scientific answer opinions differ! Two different approaches : (A) Minimum total sample size (B) Examining the ratio of subjects to variables Rule of 10: These should be at least 10 cases for each variable 3:1 ratio: The cases-to variables ratio should be no lower than 3 5:1 ratio: The cases-to variables ratio should be no lower than 5 45 JRC-COIN Step 7: Statistical Coherence (II)

46 PCA and sample size (cont.1) (A) Minimum total sample size Rule of 100: Gorsuch (1983) and Kline (1979, p. 40) recommended at least 100 (MacCallum, Widaman, Zhang & Hong, 1999). No sample should be less than 100 even though the number of variables is less than 20 (Gorsuch, 1974, p. 333; in Arrindell & van der Ende, 1985, p. 166); Hatcher (1994) recommended that the number of subjects should be the larger of 5 times the number of variables, or 100. Even more subjects are needed when communalities are low and/or few variables load on each factor (in David Garson, 2008). Rule of 150: Hutcheson and Sofroniou (1999) recommends at least cases, more toward the 150 end when there are a few highly correlated variables, as would be the case when collapsing highly multicollinear variables (in David Garson, 2008). Rule of 200: Guilford (1954, p. 533) suggested that N should be at least 200 cases (in MacCallum, Widaman, Zhang & Hong, 1999, p84; in Arrindell & van der Ende, 1985; p. 166). Rule of 250: Cattell (1978) claimed the minimum desirable N to be 250 (in MacCallum, Widaman, Zhang & Hong, 1999, p84). Rule of 300: There should be at least 300 cases (Noruis, 2005: 400, in David Garson, 2008). Significance rule. Lawley and Maxwell (1971) suggested 51 more cases than the number of variables, to support chi-square testing (in David Garson, 2008). Rule of 500. Comrey and Lee (1992) thought that 100 = poor, 200 = fair, 300 = good, 500 = very good, 1,000 or more = excellent They urged researchers to obtain samples of 500 or more observations whenever possible (in MacCallum, Widaman, Zhang & Hong, 1999, p84). 46 JRC-COIN Step 7: Statistical Coherence (II)

47 PCA and sample size (cont.2) (B) Examining the ratio of subjects to variables A ratio of 20:1: Hair, Anderson, Tatham, and Black (1995, in Hogarty, Hines, Kromrey, Ferron, & Mumford, 2005) Rule of 10: There should be at least 10 cases for each item in the instrument being used. (David Garson, 2008; Everitt, 1975; Everitt, 1975, Nunnally, 1978, p. 276, in Arrindell & van der Ende,1985, p. 166; Kunce, Cook, & Miller, 1975, Marascuilor & Levin, 1983, in Velicer & Fava, 1998, p. 232) Rule of 5: The subjects-to-variables ratio should be no lower than 5 (Bryant and Yarnold, 1995, in David Garson, 2008; Gorsuch, 1983, in MacCallum, Widaman, Zhang & Hong, 1999; Everitt, 1975, in Arrindell & van der Ende, 1985; Gorsuch, 1974, in Arrindell & van der Ende, 1985, p. 166) A ratio of 3(:1) to 6(:1) of subjects-to-variables is acceptable if the lower limit of variables-to-factors ratio is 3 to 6. But, the absolute minimum sample size should not be less than 250.(Cattell, 1978, p. 508, in Arrindell & van der Ende, 1985, p. 166) Ratio of 2. There should be at least twice as many subjects as variables in factor-analytic investigations. This means that in any large study on this account alone, one should have to use more than the minimum 100 subjects" (Kline, 1979, p. 40). 47 JRC-COIN Step 7: Statistical Coherence (II)

48 Multivariate data The type of analysis to be performed depends on the data of the indicators Data Indicators Quantitative Qualitative Mixed/ structured PCA/FA PCA Principal Component Analysis FA Factor Analysis dispca/fa,(m)ca FAMD/MFA PCA for discrete data (based on polychoric correlations) (M)CA (Multiple) Correspondence Analysis. FAMD Factor Analysis of Mixed Data MFA Multiple Factor Analysis. 48 JRC-COIN Step 7: Statistical Coherence (II)

49 Exploratory FA Two main types of FA: Exploratory (EFA) and Confirmatory (CFA) EFA examining the relationships among the variables/indicators and do not have an à priori fixed number of factors CFA a clear idea about the number of factors you will encounter, and about which variables will most likely load onto each factor 49 JRC-COIN Step 7: Statistical Coherence (II)

50 Is PCA = Exploratory FA? PCA FA Similar in many ways but an important difference that has effects on how to use them 1) Both are data reduction techniques - they allow you to capture the variance in variables in a smaller set 2) Same (or similar) procedures used to run in many software (same steps extraction, interpretation, rotation, choosing the number of components/factors) 50 JRC-COIN Step 7: Statistical Coherence (II)

51 Is PCA = Exploratory FA? NO! PCA FA Similar in many ways but an important difference that has effects on how to use them 1) Both are data reduction techniques - they allow you to capture the variance in variables in a smaller set 2) Same (or similar) procedures used to run in many software (same steps extraction, interpretation, choosing the number of components/factors) Difference: PCA is a linear combination of variables FA is a measurement model of a latent variable (statistical model) 51 JRC-COIN Step 7: Statistical Coherence (II)

52 Is PCA = Exploratory FA? NO! PCA Observed indicators are reduced into components FA Latent factors drive the observed variables Item 1 X 1 w 1 λ 1 Item 1 X 1 ε 1 Item 2 X 2... w 2 Component Factor λ 2 Item 2 X 2... ε 2 ε i Error terms Item p X p w p λ p Item p X p ε p PCA summarizes information of all indicators and reduces it into a fewer number of components Each PC i is a new variable computed as a linear combination of the original variables PC 1 w x w x w p x p Model of the measurement of a latent variable. This latent variable cannot be directly measured with a single variable. Seen through the relationships it causes in a set of X variables. X X X 1 i p F 1 F i F p i p 52 JRC-COIN Step 7: Statistical Coherence (II)

53 PCA for discrete data A practically important violation of the normality assumption underlying the PCA occurs when the data are discrete! Several kinds of discrete data (binary, ordinal, count, nominal ) Two ways on how to proceed: 1) Simple approach, use the discrete x s as if they were continuous in the PCA but instead of using the Pearson moment correlation coefficients use Spearman or Kendall rank correlation coefficients. 2) PCA using polychoric correlations. Pearson (1901) founder of this approach (as well!), developed further by Olsson (1979) and Jöreskog (2004) 53 JRC-COIN Step 7: Statistical Coherence (II)

54 PCA using polychoric correlations The polychoric correlation (PCC) coefficient is a measure of association for ordinal variables which rests upon an assumption of an underlying joint continuous normal distribution. (Later generalizations of different continuous distributions). PCC coefficient is MLE of the product-moment correlation between the underlying normal variables 54 JRC-COIN Step 7: Statistical Coherence (II)

55 55 JRC-COIN Step 7: Statistical Coherence (II) Cronbach s α is measure of internal consistency and reliability Based upon classical test theory. Most common estimate of reliability, Cronbach (1951) Cronbach s α is defined as Ratio of the Total Covariance in the test to the Total Variance in the test. By test we mean the set of indicators constituting a sub-pillar/ pillar/ci. Let be the variance of indicator i and total variance in sub-pillar/pillar/ci Average covariance of an indicator with any other indicator is Part 2: Reliability Analysis ) ( i X Var p i i X Var 1 1) ( ) ( 1 1 p p X Var X Var p i i p i i p i i p i i p i i p i i p i i p i i X Var X Var X Var p p p X Var p p X Var X Var ) ( 1 / 1) ( / ) ( p = number of indicators p i i p i i X Var X Var p p 1 1 ) ( 1 1

The European Commission s science and knowledge service. Joint Research Centre

The European Commission s science and knowledge service. Joint Research Centre The Euroean Commission s science and knowledge service Joint Research Centre Ste 7: Statistical coherence (II) PCA, Exloratory Factor Analysis, Cronbach alha Hedvig Norlén COIN 2017-15th JRC Annual Training

More information

Factor analysis. George Balabanis

Factor analysis. George Balabanis Factor analysis George Balabanis Key Concepts and Terms Deviation. A deviation is a value minus its mean: x - mean x Variance is a measure of how spread out a distribution is. It is computed as the average

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Dimension reduction Exploratory (EFA) Background While the motivation in PCA is to replace the original (correlated) variables

More information

VAR2 VAR3 VAR4 VAR5. Or, in terms of basic measurement theory, we could model it as:

VAR2 VAR3 VAR4 VAR5. Or, in terms of basic measurement theory, we could model it as: 1 Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data 2. Linearity (in the relationships among the variables) -Factors are linear constructions of the set of variables (see #8 under

More information

2/26/2017. This is similar to canonical correlation in some ways. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

2/26/2017. This is similar to canonical correlation in some ways. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 What is factor analysis? What are factors? Representing factors Graphs and equations Extracting factors Methods and criteria Interpreting

More information

Multivariate and Multivariable Regression. Stella Babalola Johns Hopkins University

Multivariate and Multivariable Regression. Stella Babalola Johns Hopkins University Multivariate and Multivariable Regression Stella Babalola Johns Hopkins University Session Objectives At the end of the session, participants will be able to: Explain the difference between multivariable

More information

Or, in terms of basic measurement theory, we could model it as:

Or, in terms of basic measurement theory, we could model it as: 1 Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data 2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables; the critical source

More information

Principles of factor analysis. Roger Watson

Principles of factor analysis. Roger Watson Principles of factor analysis Roger Watson Factor analysis Factor analysis Factor analysis Factor analysis is a multivariate statistical method for reducing large numbers of variables to fewer underlying

More information

Statistical Analysis of Factors that Influence Voter Response Using Factor Analysis and Principal Component Analysis

Statistical Analysis of Factors that Influence Voter Response Using Factor Analysis and Principal Component Analysis Statistical Analysis of Factors that Influence Voter Response Using Factor Analysis and Principal Component Analysis 1 Violet Omuchira, John Kihoro, 3 Jeremiah Kiingati Jomo Kenyatta University of Agriculture

More information

Exploratory Factor Analysis and Canonical Correlation

Exploratory Factor Analysis and Canonical Correlation Exploratory Factor Analysis and Canonical Correlation 3 Dec 2010 CPSY 501 Dr. Sean Ho Trinity Western University Please download: SAQ.sav Outline for today Factor analysis Latent variables Correlation

More information

SAMPLE SIZE IN EXPLORATORY FACTOR ANALYSIS WITH ORDINAL DATA

SAMPLE SIZE IN EXPLORATORY FACTOR ANALYSIS WITH ORDINAL DATA SAMPLE SIZE IN EXPLORATORY FACTOR ANALYSIS WITH ORDINAL DATA By RONG JIN A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE

More information

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS NOTES FROM PRE- LECTURE RECORDING ON PCA PCA and EFA have similar goals. They are substantially different in important ways. The goal

More information

Principal Component Analysis & Factor Analysis. Psych 818 DeShon

Principal Component Analysis & Factor Analysis. Psych 818 DeShon Principal Component Analysis & Factor Analysis Psych 818 DeShon Purpose Both are used to reduce the dimensionality of correlated measurements Can be used in a purely exploratory fashion to investigate

More information

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables /4/04 Structural Equation Modeling and Confirmatory Factor Analysis Advanced Statistics for Researchers Session 3 Dr. Chris Rakes Website: http://csrakes.yolasite.com Email: Rakes@umbc.edu Twitter: @RakesChris

More information

Principal Components Analysis using R Francis Huang / November 2, 2016

Principal Components Analysis using R Francis Huang / November 2, 2016 Principal Components Analysis using R Francis Huang / huangf@missouri.edu November 2, 2016 Principal components analysis (PCA) is a convenient way to reduce high dimensional data into a smaller number

More information

International Journal of Advances in Management, Economics and Entrepreneurship. Available online at: RESEARCH ARTICLE

International Journal of Advances in Management, Economics and Entrepreneurship. Available online at:   RESEARCH ARTICLE International Journal of Advances in Management, Economics and Entrepreneurship Available online at: www.ijamee.info RESEARCH ARTICLE A Factor Analysis of Determinants of Human Development in Rural Odisha

More information

A Factor Analysis of Key Decision Factors for Implementing SOA-based Enterprise level Business Intelligence Systems

A Factor Analysis of Key Decision Factors for Implementing SOA-based Enterprise level Business Intelligence Systems icbst 2014 International Conference on Business, Science and Technology which will be held at Hatyai, Thailand on the 25th and 26th of April 2014. AENSI Journals Australian Journal of Basic and Applied

More information

Dimensionality Reduction Techniques (DRT)

Dimensionality Reduction Techniques (DRT) Dimensionality Reduction Techniques (DRT) Introduction: Sometimes we have lot of variables in the data for analysis which create multidimensional matrix. To simplify calculation and to get appropriate,

More information

Multivariate Statistics (I) 2. Principal Component Analysis (PCA)

Multivariate Statistics (I) 2. Principal Component Analysis (PCA) Multivariate Statistics (I) 2. Principal Component Analysis (PCA) 2.1 Comprehension of PCA 2.2 Concepts of PCs 2.3 Algebraic derivation of PCs 2.4 Selection and goodness-of-fit of PCs 2.5 Algebraic derivation

More information

Dimensionality Assessment: Additional Methods

Dimensionality Assessment: Additional Methods Dimensionality Assessment: Additional Methods In Chapter 3 we use a nonlinear factor analytic model for assessing dimensionality. In this appendix two additional approaches are presented. The first strategy

More information

6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses.

6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses. 6348 Final, Fall 14. Closed book, closed notes, no electronic devices. Points (out of 200) in parentheses. 0 11 1 1.(5) Give the result of the following matrix multiplication: 1 10 1 Solution: 0 1 1 2

More information

Assessment of Some Factors Associated With Empowerment and Development Gap of Women in Three East African Countries

Assessment of Some Factors Associated With Empowerment and Development Gap of Women in Three East African Countries Assessment of Some Factors Associated With Empowerment and Development Gap of Women in Three East African Countries Rocky R.J. Akarro (Ph.D) Professor, Department of Statistics University of Dar es Salaam,

More information

Package paramap. R topics documented: September 20, 2017

Package paramap. R topics documented: September 20, 2017 Package paramap September 20, 2017 Type Package Title paramap Version 1.4 Date 2017-09-20 Author Brian P. O'Connor Maintainer Brian P. O'Connor Depends R(>= 1.9.0), psych, polycor

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Exploratory Factor Analysis: dimensionality and factor scores. Psychology 588: Covariance structure and factor models

Exploratory Factor Analysis: dimensionality and factor scores. Psychology 588: Covariance structure and factor models Exploratory Factor Analysis: dimensionality and factor scores Psychology 588: Covariance structure and factor models How many PCs to retain 2 Unlike confirmatory FA, the number of factors to extract is

More information

Didacticiel - Études de cas

Didacticiel - Études de cas 1 Topic New features for PCA (Principal Component Analysis) in Tanagra 1.4.45 and later: tools for the determination of the number of factors. Principal Component Analysis (PCA) 1 is a very popular dimension

More information

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis

TAMS39 Lecture 10 Principal Component Analysis Factor Analysis TAMS39 Lecture 10 Principal Component Analysis Factor Analysis Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content - Lecture Principal component analysis

More information

Introduction to Factor Analysis

Introduction to Factor Analysis to Factor Analysis Lecture 10 August 2, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #10-8/3/2011 Slide 1 of 55 Today s Lecture Factor Analysis Today s Lecture Exploratory

More information

Supplementary materials for: Exploratory Structural Equation Modeling

Supplementary materials for: Exploratory Structural Equation Modeling Supplementary materials for: Exploratory Structural Equation Modeling To appear in Hancock, G. R., & Mueller, R. O. (Eds.). (2013). Structural equation modeling: A second course (2nd ed.). Charlotte, NC:

More information

Introduction to Factor Analysis

Introduction to Factor Analysis to Factor Analysis Lecture 11 November 2, 2005 Multivariate Analysis Lecture #11-11/2/2005 Slide 1 of 58 Today s Lecture Factor Analysis. Today s Lecture Exploratory factor analysis (EFA). Confirmatory

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

A Note on Using Eigenvalues in Dimensionality Assessment

A Note on Using Eigenvalues in Dimensionality Assessment A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to Practical Assessment, Research & Evaluation. Permission is granted to distribute

More information

Retained-Components Factor Transformation: Factor Loadings and Factor Score Predictors in the Column Space of Retained Components

Retained-Components Factor Transformation: Factor Loadings and Factor Score Predictors in the Column Space of Retained Components Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 6 11-2014 Retained-Components Factor Transformation: Factor Loadings and Factor Score Predictors in the Column Space of Retained

More information

Package rela. R topics documented: February 20, Version 4.1 Date Title Item Analysis Package with Standard Errors

Package rela. R topics documented: February 20, Version 4.1 Date Title Item Analysis Package with Standard Errors Version 4.1 Date 2009-10-25 Title Item Analysis Package with Standard Errors Package rela February 20, 2015 Author Michael Chajewski Maintainer Michael Chajewski

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

Principal Components. Summary. Sample StatFolio: pca.sgp

Principal Components. Summary. Sample StatFolio: pca.sgp Principal Components Summary... 1 Statistical Model... 4 Analysis Summary... 5 Analysis Options... 7 Scree Plot... 8 Component Weights... 9 D and 3D Component Plots... 10 Data Table... 11 D and 3D Component

More information

Assessment, analysis and interpretation of Patient Reported Outcomes (PROs)

Assessment, analysis and interpretation of Patient Reported Outcomes (PROs) Assessment, analysis and interpretation of Patient Reported Outcomes (PROs) Day 2 Summer school in Applied Psychometrics Peterhouse College, Cambridge 12 th to 16 th September 2011 This course is prepared

More information

Short Answer Questions: Answer on your separate blank paper. Points are given in parentheses.

Short Answer Questions: Answer on your separate blank paper. Points are given in parentheses. ISQS 6348 Final exam solutions. Name: Open book and notes, but no electronic devices. Answer short answer questions on separate blank paper. Answer multiple choice on this exam sheet. Put your name on

More information

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING Vishwanath Mantha Department for Electrical and Computer Engineering Mississippi State University, Mississippi State, MS 39762 mantha@isip.msstate.edu ABSTRACT

More information

Factor Analysis. Summary. Sample StatFolio: factor analysis.sgp

Factor Analysis. Summary. Sample StatFolio: factor analysis.sgp Factor Analysis Summary... 1 Data Input... 3 Statistical Model... 4 Analysis Summary... 5 Analysis Options... 7 Scree Plot... 9 Extraction Statistics... 10 Rotation Statistics... 11 D and 3D Scatterplots...

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

Factor Analysis & Structural Equation Models. CS185 Human Computer Interaction

Factor Analysis & Structural Equation Models. CS185 Human Computer Interaction Factor Analysis & Structural Equation Models CS185 Human Computer Interaction MoodPlay Recommender (Andjelkovic et al, UMAP 2016) Online system available here: http://ugallery.pythonanywhere.com/ 2 3 Structural

More information

Classification methods

Classification methods Multivariate analysis (II) Cluster analysis and Cronbach s alpha Classification methods 12 th JRC Annual Training on Composite Indicators & Multicriteria Decision Analysis (COIN 2014) dorota.bialowolska@jrc.ec.europa.eu

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

Factor Analysis: An Introduction. What is Factor Analysis? 100+ years of Factor Analysis FACTOR ANALYSIS AN INTRODUCTION NILAM RAM

Factor Analysis: An Introduction. What is Factor Analysis? 100+ years of Factor Analysis FACTOR ANALYSIS AN INTRODUCTION NILAM RAM NILAM RAM 2018 PSYCHOLOGY R BOOTCAMP PENNSYLVANIA STATE UNIVERSITY AUGUST 16, 2018 FACTOR ANALYSIS https://psu-psychology.github.io/r-bootcamp-2018/index.html WITH ADDITIONAL MATERIALS AT https://quantdev.ssri.psu.edu/tutorials

More information

STAT 730 Chapter 9: Factor analysis

STAT 730 Chapter 9: Factor analysis STAT 730 Chapter 9: Factor analysis Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Data Analysis 1 / 15 Basic idea Factor analysis attempts to explain the

More information

UCLA STAT 233 Statistical Methods in Biomedical Imaging

UCLA STAT 233 Statistical Methods in Biomedical Imaging UCLA STAT 233 Statistical Methods in Biomedical Imaging Instructor: Ivo Dinov, Asst. Prof. In Statistics and Neurology University of California, Los Angeles, Spring 2004 http://www.stat.ucla.edu/~dinov/

More information

Manabu Sato* and Masaaki Ito**

Manabu Sato* and Masaaki Ito** J. Japan Statist. Soc. Vol. 37 No. 2 2007 175 190 THEORETICAL JUSTIFICATION OF DECISION RULES FOR THE NUMBER OF FACTORS: PRINCIPAL COMPONENT ANALYSIS AS A SUBSTITUTE FOR FACTOR ANALYSIS IN ONE-FACTOR CASES

More information

Principal Component Analysis, A Powerful Scoring Technique

Principal Component Analysis, A Powerful Scoring Technique Principal Component Analysis, A Powerful Scoring Technique George C. J. Fernandez, University of Nevada - Reno, Reno NV 89557 ABSTRACT Data mining is a collection of analytical techniques to uncover new

More information

Chapter 8. Models with Structural and Measurement Components. Overview. Characteristics of SR models. Analysis of SR models. Estimation of SR models

Chapter 8. Models with Structural and Measurement Components. Overview. Characteristics of SR models. Analysis of SR models. Estimation of SR models Chapter 8 Models with Structural and Measurement Components Good people are good because they've come to wisdom through failure. Overview William Saroyan Characteristics of SR models Estimation of SR models

More information

Principal component analysis

Principal component analysis Principal component analysis Angela Montanari 1 Introduction Principal component analysis (PCA) is one of the most popular multivariate statistical methods. It was first introduced by Pearson (1901) and

More information

Principal component analysis (PCA) for clustering gene expression data

Principal component analysis (PCA) for clustering gene expression data Principal component analysis (PCA) for clustering gene expression data Ka Yee Yeung Walter L. Ruzzo Bioinformatics, v17 #9 (2001) pp 763-774 1 Outline of talk Background and motivation Design of our empirical

More information

Multivariate Fundamentals: Rotation. Exploratory Factor Analysis

Multivariate Fundamentals: Rotation. Exploratory Factor Analysis Multivariate Fundamentals: Rotation Exploratory Factor Analysis PCA Analysis A Review Precipitation Temperature Ecosystems PCA Analysis with Spatial Data Proportion of variance explained Comp.1 + Comp.2

More information

1 A factor can be considered to be an underlying latent variable: (a) on which people differ. (b) that is explained by unknown variables

1 A factor can be considered to be an underlying latent variable: (a) on which people differ. (b) that is explained by unknown variables 1 A factor can be considered to be an underlying latent variable: (a) on which people differ (b) that is explained by unknown variables (c) that cannot be defined (d) that is influenced by observed variables

More information

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course.

* Tuesday 17 January :30-16:30 (2 hours) Recored on ESSE3 General introduction to the course. Name of the course Statistical methods and data analysis Audience The course is intended for students of the first or second year of the Graduate School in Materials Engineering. The aim of the course

More information

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 6 Offprint

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 6 Offprint Biplots in Practice MICHAEL GREENACRE Proessor o Statistics at the Pompeu Fabra University Chapter 6 Oprint Principal Component Analysis Biplots First published: September 010 ISBN: 978-84-93846-8-6 Supporting

More information

Introduction to Confirmatory Factor Analysis

Introduction to Confirmatory Factor Analysis Introduction to Confirmatory Factor Analysis Multivariate Methods in Education ERSH 8350 Lecture #12 November 16, 2011 ERSH 8350: Lecture 12 Today s Class An Introduction to: Confirmatory Factor Analysis

More information

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17 Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 17 Outline Filters and Rotations Generating co-varying random fields Translating co-varying fields into

More information

Sustainable City Index SCI-2014

Sustainable City Index SCI-2014 Sustainable City Index SCI-2014 Summary Since its successful launch in 2006, the Sustainable Society Index (SSI) still remains the only index that shows in quantitative terms the level of sustainability

More information

E X P L O R E R. R e l e a s e A Program for Common Factor Analysis and Related Models for Data Analysis

E X P L O R E R. R e l e a s e A Program for Common Factor Analysis and Related Models for Data Analysis E X P L O R E R R e l e a s e 3. 2 A Program for Common Factor Analysis and Related Models for Data Analysis Copyright (c) 1990-2011, J. S. Fleming, PhD Date and Time : 23-Sep-2011, 13:59:56 Number of

More information

PRINCIPAL COMPONENTS ANALYSIS (PCA)

PRINCIPAL COMPONENTS ANALYSIS (PCA) PRINCIPAL COMPONENTS ANALYSIS (PCA) Introduction PCA is considered an exploratory technique that can be used to gain a better understanding of the interrelationships between variables. PCA is performed

More information

Principal Component Analysis

Principal Component Analysis I.T. Jolliffe Principal Component Analysis Second Edition With 28 Illustrations Springer Contents Preface to the Second Edition Preface to the First Edition Acknowledgments List of Figures List of Tables

More information

Dimensionality Reduction

Dimensionality Reduction Lecture 5 1 Outline 1. Overview a) What is? b) Why? 2. Principal Component Analysis (PCA) a) Objectives b) Explaining variability c) SVD 3. Related approaches a) ICA b) Autoencoders 2 Example 1: Sportsball

More information

The European Commission s science and knowledge service. Joint Research Centre

The European Commission s science and knowledge service. Joint Research Centre The European Commission s science and knowledge service Joint Research Centre Step 6: Aggregation rules Giulio Caperna COIN 2018-16th JRC Annual Training on Composite Indicators & Scoreboards 05-07/11/2018,

More information

APPLICATION OF PRINCIPAL COMPONENT ANALYSIS IN MONITORING EDUCATIONAL PROSPECTS IN MATHEMATICS RELATED CAREERS IN THE UPPER EAST REGION OF GHANA

APPLICATION OF PRINCIPAL COMPONENT ANALYSIS IN MONITORING EDUCATIONAL PROSPECTS IN MATHEMATICS RELATED CAREERS IN THE UPPER EAST REGION OF GHANA APPLICATION OF PRINCIPAL COMPONENT ANALYSIS IN MONITORING EDUCATIONAL PROSPECTS IN MATHEMATICS RELATED CAREERS IN THE UPPER EAST REGION OF GHANA Clement Ayarebilla Ali University of Education, Winneba

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Multilevel Component Analysis applied to the measurement of a complex product experience

Multilevel Component Analysis applied to the measurement of a complex product experience Multilevel Component Analysis applied to the measurement of a complex product experience Boucon, C.A., Petit-Jublot, C.E.F., Groeneschild C., Dijksterhuis, G.B. Outline Background Introduction to Simultaneous

More information

Sparse orthogonal factor analysis

Sparse orthogonal factor analysis Sparse orthogonal factor analysis Kohei Adachi and Nickolay T. Trendafilov Abstract A sparse orthogonal factor analysis procedure is proposed for estimating the optimal solution with sparse loadings. In

More information

Can Variances of Latent Variables be Scaled in Such a Way That They Correspond to Eigenvalues?

Can Variances of Latent Variables be Scaled in Such a Way That They Correspond to Eigenvalues? International Journal of Statistics and Probability; Vol. 6, No. 6; November 07 ISSN 97-703 E-ISSN 97-7040 Published by Canadian Center of Science and Education Can Variances of Latent Variables be Scaled

More information

Estimating confidence intervals for eigenvalues in exploratory factor analysis. Texas A&M University, College Station, Texas

Estimating confidence intervals for eigenvalues in exploratory factor analysis. Texas A&M University, College Station, Texas Behavior Research Methods 2010, 42 (3), 871-876 doi:10.3758/brm.42.3.871 Estimating confidence intervals for eigenvalues in exploratory factor analysis ROSS LARSEN AND RUSSELL T. WARNE Texas A&M University,

More information

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data.

Structure in Data. A major objective in data analysis is to identify interesting features or structure in the data. Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

STATISTICAL LEARNING SYSTEMS

STATISTICAL LEARNING SYSTEMS STATISTICAL LEARNING SYSTEMS LECTURE 8: UNSUPERVISED LEARNING: FINDING STRUCTURE IN DATA Institute of Computer Science, Polish Academy of Sciences Ph. D. Program 2013/2014 Principal Component Analysis

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Dimension reduction Principal Component Analysis (PCA) The problem in exploratory multivariate data analysis usually is

More information

Revision: Chapter 1-6. Applied Multivariate Statistics Spring 2012

Revision: Chapter 1-6. Applied Multivariate Statistics Spring 2012 Revision: Chapter 1-6 Applied Multivariate Statistics Spring 2012 Overview Cov, Cor, Mahalanobis, MV normal distribution Visualization: Stars plot, mosaic plot with shading Outlier: chisq.plot Missing

More information

SEM for Categorical Outcomes

SEM for Categorical Outcomes This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA Factor Analysis Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA 1 Factor Models The multivariate regression model Y = XB +U expresses each row Y i R p as a linear combination

More information

Machine Learning 2nd Edition

Machine Learning 2nd Edition INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010

More information

Intermediate Social Statistics

Intermediate Social Statistics Intermediate Social Statistics Lecture 5. Factor Analysis Tom A.B. Snijders University of Oxford January, 2008 c Tom A.B. Snijders (University of Oxford) Intermediate Social Statistics January, 2008 1

More information

Introduction to Structural Equation Modeling

Introduction to Structural Equation Modeling Introduction to Structural Equation Modeling Notes Prepared by: Lisa Lix, PhD Manitoba Centre for Health Policy Topics Section I: Introduction Section II: Review of Statistical Concepts and Regression

More information

Vector Space Models. wine_spectral.r

Vector Space Models. wine_spectral.r Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components

More information

Canonical Correlation & Principle Components Analysis

Canonical Correlation & Principle Components Analysis Canonical Correlation & Principle Components Analysis Aaron French Canonical Correlation Canonical Correlation is used to analyze correlation between two sets of variables when there is one set of IVs

More information

Principal Component Analysis for Detection of Globally Important Input Parameters in Nonlinear Finite Element Analysis

Principal Component Analysis for Detection of Globally Important Input Parameters in Nonlinear Finite Element Analysis Principal Component Analysis for Detection of Globally Important Input Parameters in Nonlinear Finite Element Analysis Sascha Ackermann 1, Lothar Gaul 1,Michael Hanss 1,Thomas Hambrecht 2 1 IAM - Institute

More information

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts A Study of Statistical Power and Type I Errors in Testing a Factor Analytic Model for Group Differences in Regression Intercepts by Margarita Olivera Aguilar A Thesis Presented in Partial Fulfillment of

More information

POWER AND TYPE I ERROR RATE COMPARISON OF MULTIVARIATE ANALYSIS OF VARIANCE

POWER AND TYPE I ERROR RATE COMPARISON OF MULTIVARIATE ANALYSIS OF VARIANCE POWER AND TYPE I ERROR RATE COMPARISON OF MULTIVARIATE ANALYSIS OF VARIANCE Supported by Patrick Adebayo 1 and Ahmed Ibrahim 1 Department of Statistics, University of Ilorin, Kwara State, Nigeria Department

More information

Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification

Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification Behavior Research Methods 29, 41 (4), 138-152 doi:1.3758/brm.41.4.138 Recovery of weak factor loadings in confirmatory factor analysis under conditions of model misspecification CARMEN XIMÉNEZ Autonoma

More information

Statistical methods and data analysis

Statistical methods and data analysis Statistical methods and data analysis Teacher Stefano Siboni Aim The aim of the course is to illustrate the basic mathematical tools for the analysis and modelling of experimental data, particularly concerning

More information

B. Weaver (18-Oct-2001) Factor analysis Chapter 7: Factor Analysis

B. Weaver (18-Oct-2001) Factor analysis Chapter 7: Factor Analysis B Weaver (18-Oct-2001) Factor analysis 1 Chapter 7: Factor Analysis 71 Introduction Factor analysis (FA) was developed by C Spearman It is a technique for examining the interrelationships in a set of variables

More information

Dimensionality of Hierarchical

Dimensionality of Hierarchical Dimensionality of Hierarchical and Proximal Data Structures David J. Krus and Patricia H. Krus Arizona State University The coefficient of correlation is a fairly general measure which subsumes other,

More information

Modelling the Electric Power Consumption in Germany

Modelling the Electric Power Consumption in Germany Modelling the Electric Power Consumption in Germany Cerasela Măgură Agricultural Food and Resource Economics (Master students) Rheinische Friedrich-Wilhelms-Universität Bonn cerasela.magura@gmail.com Codruța

More information

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin Quantitative methods II WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin The first factor analysis C. Spearman (1904). General intelligence, objectively determined and measured. The American Journal

More information

Measuring relationships among multiple responses

Measuring relationships among multiple responses Measuring relationships among multiple responses Linear association (correlation, relatedness, shared information) between pair-wise responses is an important property used in almost all multivariate analyses.

More information

Comparison of goodness measures for linear factor structures*

Comparison of goodness measures for linear factor structures* Comparison of goodness measures for linear factor structures* Borbála Szüle Associate Professor Insurance Education and Research Group, Corvinus University of Budapest E-mail: borbala.szule@unicorvinus.hu

More information

MSP Research Note. RDQ Reliability, Validity and Norms

MSP Research Note. RDQ Reliability, Validity and Norms MSP Research Note RDQ Reliability, Validity and Norms Introduction This research note describes the technical properties of the RDQ. Evidence for the reliability and validity of the RDQ is presented against

More information

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 6: Bivariate Correspondence Analysis - part II

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 6: Bivariate Correspondence Analysis - part II MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 6: Bivariate Correspondence Analysis - part II the Contents the the the Independence The independence between variables x and y can be tested using.

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions

More information

Performance In Science And Non Science Subjects

Performance In Science And Non Science Subjects Canonical Correlation And Hotelling s T 2 Analysis On Students Performance In Science And Non Science Subjects Mustapha Usman Baba 1, Nafisa Muhammad 1,Ibrahim Isa 1, Rabiu Ado Inusa 1 and Usman Hashim

More information

Research on the Influence Factors of Urban-Rural Income Disparity Based on the Data of Shandong Province

Research on the Influence Factors of Urban-Rural Income Disparity Based on the Data of Shandong Province International Journal of Managerial Studies and Research (IJMSR) Volume 4, Issue 7, July 2016, PP 22-27 ISSN 2349-0330 (Print) & ISSN 2349-0349 (Online) http://dx.doi.org/10.20431/2349-0349.0407003 www.arcjournals.org

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information