Data set Science & Environment (ISSP, 1993)

Q. SCIENCE AND ENVIRONMENT
Multiple categorical variables: Multiple Correspondence Analysis

How much do you agree or disagree with each of these statements?
Q.a We believe too often in science, and not enough in feelings and faith.
Q.b Overall, modern science does more harm than good.
Q.c Any change humans cause in nature - no matter how scientific - is likely to make things worse.
Q.d Modern science will solve our environmental problems with little change to our way of life.

Response categories:
1. Strongly agree
2. Agree
3. Neither agree nor disagree
4. Disagree
5. Strongly disagree
8. Can't choose, don't know
9. NA, refused

We are now interested in the relationship between the four variables, not so much in the differences between countries. Since the relationship between the four variables might change across countries, we restrict our attention for the moment to one country, say Germany (dataset wg93 in the ca package). Missing values have been removed in this initial example of MCA. The sample size is n = 871.

From the original responses (Q = 4 questions a, b, c, d, each with response categories 1 to 5) we build the indicator matrix Z (J = 20 categories): each case becomes a row of four blocks of five 0/1 dummies, with a 1 in the position of the chosen category of each question and 0s elsewhere. For example, the response pattern a=3, b=3, c=3, d=3 is coded as 0 0 1 0 0 | 0 0 1 0 0 | 0 0 1 0 0 | 0 0 1 0 0, etc.
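The dummy coding just described can be sketched as follows (in Python for illustration, not the R used later; the helper name `indicator_matrix` is ours, not from the ca package):

```python
# Sketch: recode categorical responses (1-based codes) into an indicator
# matrix Z, one block of 0/1 dummies per question, as described above.

def indicator_matrix(responses, n_categories=5):
    """responses: list of cases, each a list of Q category codes (1..n_categories).
    Returns Z with Q * n_categories columns per row."""
    Z = []
    for case in responses:
        row = []
        for cat in case:
            block = [0] * n_categories
            block[cat - 1] = 1   # 1 in the chosen category's position
            row.extend(block)
        Z.append(row)
    return Z

# the first response pattern shown above: a=3, b=3, c=3, d=3
Z = indicator_matrix([[3, 3, 3, 3]])
print(Z[0])   # 20 entries, one 1 per block of 5
```

Each row sums to Q = 4, one 1 per question, which is why the row profiles of Z put mass 1/Q on the chosen categories.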
(Z has 871 rows.)

Burt matrix B = Z^T Z (J × J, here 20 × 20), with blocks Va(A), Vb(B), Vc(C), Vd(D): each off-diagonal block is the two-way cross-tabulation of a pair of questions; each diagonal block is a diagonal matrix containing the marginal frequencies of one question. [Table: the 20 × 20 Burt matrix of the wg93 data] B can be thought of as the covariance matrix between the variables.
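The construction B = Z^T Z can be sketched directly (a Python illustration with plain lists, so it stays self-contained; the function name is ours):

```python
# Sketch: Burt matrix as Z^T Z for a 0/1 indicator matrix Z.

def burt_matrix(Z):
    """Z: list of equal-length 0/1 rows. Returns B = Z^T Z as nested lists."""
    J = len(Z[0])
    return [[sum(row[i] * row[j] for row in Z) for j in range(J)]
            for i in range(J)]

# tiny example: two questions with two categories each,
# two cases with responses (1, 2) and (2, 2)
Z = [[1, 0, 0, 1],
     [0, 1, 0, 1]]
B = burt_matrix(Z)
# diagonal blocks carry the marginal frequencies on their diagonals,
# off-diagonal blocks are the cross-tabulations of pairs of questions
```

In the example, B[0][0] and B[1][1] are the marginals of question 1 (one case each), B[0][1] = 0 because one case cannot be in two categories of the same question, and B[3][3] = 2 is the marginal of category 2 of question 2.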
Two equivalent forms of MCA

Indicator matrix Z (n × J): CA of Z.
- Column standard coordinates.
- Row profiles are 0s with 1/Q = ¼ in the response positions; the mass of each row is 1/n. In the asymmetric map the rows (cases) lie at ordinary averages of their responses.
- Principal inertias λ1, λ2, ...

Burt matrix B = Z^T Z (J × J), with blocks A, B, C, D: CA of B.
- Column standard coordinates, which are also the row standard coordinates (B is symmetric).
- Row profiles are 1/Q = ¼ times the two-way profiles, but include a diagonal profile (which is a vertex point); masses are proportional to the marginal frequencies.
- Principal inertias are the squares of those of Z: λk(B) = λk(Z)².

In both versions the percentages of inertia are very low (less so for B).

[Figure: CA of indicator matrix - dim 1: 0.457 (11.4%), dim 2: 0.431 (10.8%). CA of Burt matrix - dim 1: 0.209 (18.6%), dim 2: 0.186 (16.5%)]

R code for MCA

# use wg93 data set on science & environment
data(wg93)
# CA of indicator matrix using mjca function
ca.wg93.z <- mjca(wg93[,1:4], lambda="indicator")
plot(ca.wg93.z, labels=c(0,2), map="rowprincipal")
# inertias and percentages of inertia
ca.wg93.z$sv^2
100*ca.wg93.z$sv^2/sum(ca.wg93.z$sv^2)
# Burt matrix
wg93.B <- mjca(wg93[,1:4])$Burt
# CA of Burt matrix using ca function
ca.wg93.B <- ca(wg93.B[1:20,1:20])
# reverse the axes for comparability with the indicator solution
ca.wg93.B$rowcoord <- -ca.wg93.B$rowcoord
ca.wg93.B$colcoord <- -ca.wg93.B$colcoord
plot(ca.wg93.B, map="rowprincipal")
# inertias and percentages of inertia
ca.wg93.B$sv^2
100*ca.wg93.B$sv^2/sum(ca.wg93.B$sv^2)
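The relationship between the two sets of principal inertias can be checked numerically from the values reported above (a quick Python illustration, not part of the R session):

```python
# The Burt principal inertias are the squares of the indicator ones:
# 0.457^2 ~ 0.209 and 0.431^2 ~ 0.186, matching the two maps.
indicator_inertias = [0.457, 0.431]          # CA of Z, first two dimensions
burt_inertias = [lam ** 2 for lam in indicator_inertias]
print([round(lam, 3) for lam in burt_inertias])   # -> [0.209, 0.186]
```

This is also why the Burt analysis shows higher percentages of inertia: squaring shrinks the small inertias relatively more than the large ones.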
Burt matrix subtable inertias

[Table: the Burt matrix with blocks Va(A), Vb(B), Vc(C), Vd(D), annotated with the inertia of each of the 16 subtables]

Inertia of B is the average of the 16 subtable inertias: 1.128.
Inertia of the off-diagonal blocks of the Burt matrix is the average of the 12 off-diagonal subtable inertias: 0.170.

Adjustment of principal inertias (eigenvalues)

We can rescale an existing MCA solution quite simply in order to best fit the off-diagonal tables. All we need is the total inertia of the Burt matrix, inertia(B), and the principal inertias λk of the Burt matrix in the solution space. If the solution has been computed on the indicator matrix Z, the principal inertias calculated are the λk^(1/2), so all the squares of the principal inertias of Z need to be summed in order to get inertia(B). If the Burt matrix B has been analysed, the λk are available directly and inertia(B) is the total inertia, the sum of the λk.

Here are the steps to rescale the solution:
1. Calculate the average off-diagonal inertia:
   average off-diagonal inertia = [Q/(Q-1)] × [inertia(B) - (J-Q)/Q²]
2. Calculate the adjusted principal inertias:
   adjusted principal inertia = [Q/(Q-1)]² × [λk^(1/2) - 1/Q]², only for λk^(1/2) > 1/Q
3. Calculate adjusted percentages of inertia:
   adjusted percentage of inertia = adjusted principal inertia / average off-diagonal inertia

Adjustment of inertias: wg93 data

average off-diagonal inertia = (4/3) × (1.1277 - 16/16) = 0.1703

adjusted principal inertias in first two dimensions:
  = (4/3)² × (0.457 - ¼)² = 0.0765 (dimension 1)
  = (4/3)² × (0.431 - ¼)² = 0.0582 (dimension 2)

adjusted percentages of inertia:
  = 0.0765 / 0.1703 = 0.449, i.e. 44.9%
  = 0.0582 / 0.1703 = 0.342, i.e. 34.2%

[Figure: Analysis with adjusted eigenvalues - dim 1: 0.0765 (44.9%), dim 2: 0.0582 (34.2%)]
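The three rescaling steps can be sketched numerically (a Python illustration; the inputs are the rounded wg93 values from this slide, so the results agree with the slide's figures up to rounding):

```python
# Sketch of the eigenvalue adjustment for the wg93 data.
from math import sqrt

Q, J = 4, 20
inertia_B = 1.1277                 # total inertia of the Burt matrix
burt_lambdas = [0.209, 0.186]      # first two principal inertias of B

# 1. average off-diagonal inertia
avg_off = (Q / (Q - 1)) * (inertia_B - (J - Q) / Q ** 2)

# 2. adjusted principal inertias, only for sqrt(lambda_k) > 1/Q
adjusted = [((Q / (Q - 1)) * (sqrt(lam) - 1 / Q)) ** 2
            for lam in burt_lambdas if sqrt(lam) > 1 / Q]

# 3. adjusted percentages of inertia
percentages = [100 * a / avg_off for a in adjusted]
print(round(avg_off, 4), [round(a, 4) for a in adjusted],
      [round(p, 1) for p in percentages])
```

With these rounded inputs the output reproduces roughly 0.1703, (0.0763, 0.0584) and (44.8%, 34.3%); the slide's 0.0765/0.0582 and 44.9%/34.2% come from the unrounded singular values.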
[Figure: CA of Burt matrix - dim 1: 0.209 (18.6%), dim 2: 0.186 (16.5%)]

Burt B = Z^T Z (J × J), with blocks A, B, C, D

Adjusted MCA: do the CA of B and then adjust the coordinates so that the off-diagonal blocks are optimally fitted.

Joint CA (JCA): fit the off-diagonal blocks optimally. This is a different algorithm, performed iteratively. Advantage: maximum inertia in two dimensions is explained. Disadvantages: the axes are not nested, and the scale optimality is degraded.

R output for JCA

> summary(mjca(wg93[,1:4], lambda="JCA"))

Principal inertias (eigenvalues):
        dim  value
  [1,]   1   0.099091
  [2,]   2   0.065033
  [3,]   3   0.00559
  [4,]   4   0.00335
  [5,]   5   0.001176
  [6,]   6   0.00007
  [7,]       --------
  [8,]  Total: 0.185

Diagonal inertia discounted from eigenvalues: 0.05705
Percentage explained by JCA in 2 dimensions: 85.7%
(Eigenvalues are not nested)
[Iterations in JCA: , epsilon = 9.91e-05]

[Figure: JCA solution]

% of inertia explained by axes 1 & 2: 85.7% (79.1% before, with adjustments). The axes are not nested, as they are in CA and MCA.
[Figure: JCA solution, 85.7% of inertia explained; analysis with adjusted eigenvalues - dim 1: 0.0765 (44.9%), dim 2: 0.0582 (34.2%)]

Quantification as a goal: Homogeneity Analysis (HOMALS)

The original responses are quantified: category k of question a receives a scale value ak, category k of question b a scale value bk, and so on, so that each case's responses become a set of quantified responses such as (a3, b3, c1, d5). Each case then receives a score, the average (or sum) of its Q quantified responses, e.g. (ai + bj + ck + dl)/4 for the response pattern (i, j, k, l).

Objective of HOMALS: to determine the set of J scale values a1, a2, ..., b1, b2, ..., etc., so that the implied scores for each individual are as close as possible to that individual's particular set of Q scale values. Closeness is defined in terms of the sum of squared differences, and the solution is obtained by least squares; this is mathematically equivalent to maximising the sum of squared correlations between the scores and the quantified responses.

Scaling properties of MCA

The category quantifications in MCA maximize the average squared correlation between the item scores and the summated scores. The principal inertia of the indicator matrix (the square root of the principal inertia of the Burt matrix) is exactly this average squared correlation:

λ1^(1/2) = ¼ [cor²(A, A+B+C+D) + cor²(B, A+B+C+D) + cor²(C, A+B+C+D) + cor²(D, A+B+C+D)]

The category quantifications in MCA also maximize the Cronbach reliability coefficient:

α = [Q/(Q-1)] × [1 - 1/(Q λ1^(1/2))]
  = (4/3) × [1 - 1/(4 × 0.457)] = 0.605

Dropping variable D:
  = (4/3) × [1 - 1/(4 × 0.6018)] = 0.779
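The reliability formula above is easy to evaluate (a Python illustration; with the rounded value 0.457 the result is 0.604 rather than the slide's 0.605, which uses more decimals):

```python
# Sketch: Cronbach's alpha from the first principal inertia of the
# indicator matrix, lambda1^(1/2), as in the formula above.

def cronbach_alpha(sqrt_lambda1, Q=4):
    """alpha = [Q/(Q-1)] * [1 - 1/(Q * lambda1^(1/2))]."""
    return (Q / (Q - 1)) * (1 - 1 / (Q * sqrt_lambda1))

print(round(cronbach_alpha(0.457), 3))    # four-variable wg93 solution
print(round(cronbach_alpha(0.6018), 3))   # value after dropping variable D
```

Because alpha increases with λ1^(1/2), maximising the first principal inertia and maximising reliability are the same problem.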