A Goodness-of-Fit Measure for the Mokken Double Monotonicity Model that Takes into Account the Size of Deviations


Methods of Psychological Research Online 2003, Vol. 8, No. 1, pp.
Department of Psychology, University of Koblenz-Landau

Teresa Rivas Moya¹
Málaga University, Spain

Based on the Mokken model and isotonic regression, Rivas Moya (2000b) gives a Global Deviation (GD) measure from Double Monotonicity (DM). This paper illustrates the GD measure and gives the procedure to calculate it. Several examples from the responses of 294 subjects to 16 dichotomous items show (1) the procedure which calculates the GD measure from the $P_{00}$ and $P_{11}$ matrices obtained by MSP5 (Molenaar et al., 2000); (2) that GD is 0 if there are no individual deviations from DM; and (3) that an increase in the number and/or size of individual deviations from DM leads to an increase in the GD measure when cumulative scales with different numbers of items are considered. The principal advantage of this measure over other indices which evaluate DM is that the procedure to calculate GD estimates the size of deviations from DM. It also takes into account the number of deviations, because it obtains the measure by summing up the deviations in each item pair. This measure provides interesting complementary information to the set of indices which evaluate the DM model.

Keywords: Goodness of fit, scaling, nonlinear regression

Within the framework of non-parametric item response theory, Mokken (1971, 1997) defines the monotone homogeneity (MH) and double monotonicity (DM) models for dichotomous items. A set of items that satisfies unidimensionality and local independence, and whose item response functions (IRFs) are monotonically non-decreasing, satisfies the assumptions of the MH model. If, in addition to the foregoing conditions, the IRFs do not intersect, the set of items also satisfies the assumptions of the DM model. Checking whether a set of items satisfies the assumptions of the DM model is laborious.
This means that in practice fitting the model was not viable until Molenaar, Debets, Sijtsma and Hemker (1994) developed the software (MSP 3.0) to check whether a set of items satisfies DM. Later, Molenaar, Sijtsma, van Schuur and Mokken (2000) developed an improved version (MSP5 for Windows), adding new indices. These references, together with Sijtsma and Junker (1996), Sijtsma (1998), Molenaar and Sijtsma (2000), Sijtsma (2001) and Sijtsma and Molenaar (2002), give an overall idea of these models, the evaluation indices and their applications.

¹ Address: Teresa Rivas Moya. Departamento de Psicobiología y Metodología. Facultad de Psicología. Universidad de Málaga. Campus de Teatinos. Málaga-29071, Spain. moya@uma.es

In addition to the indices proposed by Mokken (1971) and Rosenbaum (1984, 1987), Molenaar et al. (1994) and Molenaar and Sijtsma (2000) also set out indices which evaluate whether a set of items satisfies MH or DM. These authors suggest additional research on the detailed study of some of these evaluation indices.

To evaluate whether IRFs are monotonically non-decreasing (that is, satisfy single monotonicity) there are:
- Scalability coefficients based on the analysis of each item ($H_i$), item pairs ($H_{ij}$) or a set of items ($H$) (Mokken, 1971).
- Indices for each item obtained in the entire group.
- Indices for each item obtained in rest score groups. These groups are defined on the scores of the remaining items. Given an increasing rest score r, the proportion of positive responses must be monotonically non-decreasing in r (Molenaar et al., 1994, p. 9; Rosenbaum, 1984).
- Diagnostic value Crit, which summarizes the information obtained by checking single monotonicity via the entire group, enabling the identification of the items least well fitted by the MH model (Molenaar & Sijtsma, 2000, pp. 49, 74).

Indices which evaluate the non-intersection of IRFs are usually based on the analysis of item pairs. They check non-intersection via:
- P-matrices, by visual inspection (Mokken, 1971) and by a count of the violations. Given an ordering on the items, an MH set of items is DM when the columns and rows of the $P_{11}$ matrix are monotonically non-decreasing, and the columns and rows of the $P_{00}$ matrix are monotonically non-increasing. Local deviations from these orders are considered violations of DM.
- Indices for item pairs obtained in rest score groups. These rest score groups are determined on the remaining items. The proportion of positive responses on an item should be smaller than or equal to that of the other item in each rest score group (Molenaar et al., 1994, p. 10; Rosenbaum, 1987).

- Indices for item pairs obtained in rest split groups. These groups are determined by distinguishing between counts for the low and high groups, based on the use of cut points (Molenaar, 1991; Molenaar & Sijtsma, 2000; Sijtsma & Junker, 1996, pp. ).
- Diagnostic value Crit, which summarizes the information obtained by checking non-intersection via rest score groups and via P-matrices, enabling the identification of the items least well fitted by the DM model.
- $H^T_a$ and $H^T$ coefficients. $H^T$ checks whether a set of items has intersecting IRFs. $H^T$ for the total set of items, based on the transposed data matrix, and $H^T_a$ coefficients at the level of individuals are determined. Rules of thumb for their interpretation were based on results from a study using simulated data (Sijtsma & Meijer, 1992).
- Graphical representations of the IRFs, in which it can be seen whether pairs of items intersect.

These indices enable a detailed analysis to ascertain whether or not the IRFs are monotone and whether they intersect. If, from these indices, we concentrate on those that evaluate the non-intersection of IRFs, based on item pairs and on the analysis of P-matrices, the final decision is generally based upon:
1a. Violations, which are evaluated by means of the difference between two such proportions when their ordering is not in agreement with the requirements of the model. A minimum violation, by default, of .02 (Molenaar et al., 1994, p. 45) or .03 (Molenaar & Sijtsma, 2000, p. 66) is admitted, although this boundary may be altered by the researcher.
2a. Recommendations made by these authors regarding the sample size and group size; and
3a. Other summary statistics and tests of significance.
Given a set of items that satisfies the assumptions of the MH model, this paper sets out the procedure to calculate a GD measure from the DM of the set of items (that is, from the non-intersection of pairs of items), based on estimating the size of each deviation (violation) from monotonicity. (N.B.: All further references to DM refer only to the assumption of non-intersection of IRFs.)

The GD measure presents two differences with regard to the violations defined in Molenaar et al. (1994, p. 45):
1b. When there is a deviation from DM, the discrepancies between the observed proportions and the estimated theoretical proportions (disparities) are calculated. These disparities are the values which the observed proportions should assume in order to satisfy monotonicity when, in fact, they violate it.
2b. It takes into account the size and number of deviations in a set of items.
Therefore, these deviations give different and additional information with respect to the violations defined in (1a).

In order to define the GD measure from DM, the following concepts are linked together:

1. The concept of DM established by Mokken (1971, 1997) for pairs of responses for triples of items $(i, j, k)$: Let $P_{11} = \left(p_{jk}(1,1)\right)$ and $P_{00} = \left(p_{jk}(0,0)\right)$, of order $n \times n$, be the symmetric matrices of joint proportions of scoring both correct and both failed, with $j, k = 1, \dots, n$; $j < k$ denotes the item ordering in which j is more difficult than k. Then, for all items $i \in I$ and all pairs $(j, k) \in I$, $j < k$:

$$p_{ij}(1,1) \le p_{ik}(1,1) \quad \text{and} \quad p_{ij}(0,0) \ge p_{ik}(0,0)$$

(Mokken, 1997, p. 357), the difficulty of an item, $\delta_j$, being the proportion of correct responses to item j. If a set of items is MH, Mokken proposes studying DM by visually analyzing the $P_{11}$ and $P_{00}$ matrices. If the items are ordered in decreasing difficulty, an MH set of items is DM when the columns and rows of the $P_{11}$ matrix are monotonically non-decreasing, and the columns and rows of the $P_{00}$ matrix are monotonically non-increasing. Local deviations from these orders are considered violations of DM. Mokken's concept of DM therefore means that the items must satisfy a monotone relation in the rows and columns of the $P_{11}$ and $P_{00}$ matrices. Whether or not this monotone relation is satisfied can be ascertained by isotonic regression.

2. Isotonic regression quantifies the degree to which the items of each row or column of $P_{11}$ and $P_{00}$ satisfy a monotone relation. The isotonic regression method, as in non-metric scaling, allows the discrepancies from monotonicity in each row or column to be quantified by means of the differences between the observed proportions and the estimated theoretical proportions (disparities) obtained from the slopes of the GCM.
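As an illustration of the objects just defined, the following minimal Python sketch computes the two joint-proportion matrices from a 0/1 response matrix and counts adjacent out-of-order pairs in the upper-triangle rows. It assumes the item columns are already ordered from most to least difficult. The function names and the `tolerance` parameter (a stand-in for a minimum-violation boundary such as .02 or .03) are hypothetical and are not part of MSP5 or of the procedure in this paper.

```python
def joint_proportions(responses):
    """Return (p11, p00) for a list of subjects, each a list of 0/1 item scores.

    p11[j][k] is the proportion scoring 1 on both items j and k;
    p00[j][k] is the proportion scoring 0 on both. Diagonals are left as None.
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    p11 = [[None] * n_items for _ in range(n_items)]
    p00 = [[None] * n_items for _ in range(n_items)]
    for j in range(n_items):
        for k in range(n_items):
            if j == k:
                continue
            p11[j][k] = sum(r[j] == 1 and r[k] == 1 for r in responses) / n_subjects
            p00[j][k] = sum(r[j] == 0 and r[k] == 0 for r in responses) / n_subjects
    return p11, p00


def count_row_violations(matrix, increasing, tolerance=0.0):
    """Count adjacent pairs that break the required order in the upper-triangle rows."""
    n = len(matrix)
    count = 0
    for i in range(n):
        row = [matrix[i][j] for j in range(i + 1, n)]
        for left, right in zip(row, row[1:]):
            drop = left - right if increasing else right - left
            if drop > tolerance:
                count += 1
    return count


# Toy data (invented for illustration): 4 subjects, 3 items, hardest item first.
toy = [[0, 0, 1], [0, 1, 1], [1, 1, 1], [0, 0, 0]]
p11, p00 = joint_proportions(toy)
print(count_row_violations(p11, increasing=True))    # rows of P11 should be non-decreasing
print(count_row_violations(p00, increasing=False))   # rows of P00 should be non-increasing
```

A count of this kind only says how many local violations occur; it does not measure their size, which is the gap the GD measure is intended to fill.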

To quantify these discrepancies, the concepts of Cumulative Sum Diagram (CSD) and Greatest Convex Minorant (GCM) of isotonic regression (Barlow, Bartholomew, Bremner & Brunk, 1972), adapted in Rivas Moya (2000a) to calculate the disparities associated with dissimilarities in non-metric multidimensional scaling, are used to calculate the disparities associated with the observed proportions. As a result, associated with the observed proportions $p_{ij}$ which do not satisfy monotonicity, the disparities $\hat{p}_{ij}$ are estimated in such a way that, when the disparities are considered, the items satisfy a monotone relation in rows and columns. If this is satisfied in $P_{11}$ and $P_{00}$, then the set of items is DM (Rivas Moya, 2000b).

The basic idea is the following: Given the $P_{11}$ matrix in Table 1, the difficulty of the items in columns $j: 1, 2, \dots, n$ induces an order in each row of $P_{11}$. The numbers in brackets in Table 1 denote the non-decreasing order induced by the difficulty. This order can be considered a dissimilarity measure, $d(i, j)$, between pairs of items i, j. The row index i can be omitted because it is constant within each row. Then, in any row i, the column indices $i+1 \le j \le n$, or $d_{i+1} \le d_{i+2} \le \dots \le d_n$, denote the dissimilarities between pairs of items.

Table 1
Observed Proportions and Dissimilarities in the P11 Matrix

i / j     i_1    i_2        i_3        i_4        ...   i_n
i_1       -      p_12 (2)   p_13 (3)   p_14 (4)   ...   p_1n (n)
i_2              -          p_23 (3)   p_24 (4)   ...   p_2n (n)
...
i_n-1                                             -     p_n-1,n (n)
δ_j       δ_1    δ_2        δ_3        δ_4        ...   δ_n

If $p_{ij} \le p_{ik}$, $j < k$; $j, k: 1, 2, \dots, n$, is not satisfied in each row i of $P_{11}$, the DM is violated because the proportions $p_{ij}$ do not satisfy the non-decreasing monotone relation in the same way as the dissimilarities between items $d(i, j)$. Then $p_{ij}$ must be substituted by the corresponding disparities $\hat{p}_{ij}$, and the relation is now monotonically non-decreasing. The disparities are obtained as the isotonic regression of the proportion function.
To calculate them, it is necessary to know the coordinates of the points $(W_{ij}, P_{ij})$ of the CSD and $(W_{ij}, \hat{P}_{ij})$ of the GCM, and the slopes of the segments joining the points of the CSD and of the GCM.

The discrepancies from monotonicity in the items of each row can now be estimated, and the deviation from monotonicity in all the rows of $P_{11}$ is defined as $\sum_{i<j}^{n} \left| p_{ij}(1,1) - \hat{p}_{ij}(1,1) \right|$. The deviation from monotonicity in all the columns of $P_{11}$ is defined similarly. Then the GD measure for the $P_{11}$ matrix is given as:

$$D(P_{11}, \Delta, n) = \frac{\sum_{i<j}^{n} \left| p_{ij}(1,1) - \hat{p}_{ij}(1,1) \right| + \sum_{j<i}^{n} \left| p_{ij}(1,1) - \hat{p}_{ij}(1,1) \right|}{n(n-1)} \qquad (1)$$

$\Delta$ being the dissimilarity matrix. The denominator is $n(n-1)$ because the $n(n-1)/2$ elements in each P-matrix are compared twice: in rows and in columns.

The GD measure from non-increasing monotonicity in the $P_{00}$ matrix, $D(P_{00}, \Delta, n)$, is similarly defined. Justification and details can be found in Rivas Moya (2000b). These measures are bounded between 0 and 1. There is no deviation from DM if these measures are 0. The maximum deviation from DM is given when these GD measures are 1.

This paper gives several examples from the responses of 294 subjects to 16 dichotomous items to show:
1. The procedure which calculates the GD measure from the $P_{00}$ and $P_{11}$ matrices. In addition, discrepancies from DM in each row or column of the P-matrix can be seen by plotting the coordinates of the CSD and GCM associated with the observed and estimated theoretical proportions, respectively.
2. If there are no individual deviations from DM, then $p_{ij} = \hat{p}_{ij}$ and GD is zero.
3. An increase in the number and/or size of individual deviations from DM leads to an increase in the GD measure when sets of different numbers of items are considered. Thus:
3a. From two different sets of 8 items, GD is larger when the size and the number of deviations are larger.

3b. A set of 7 items shows fewer deviations, but of larger size than those of the previous examples. Then, GD is greater than or equal to the measures in the previous examples.

These examples show empirically that in the GD measure the discrepancies from DM are obtained by comparing the observed proportions ($p_{ij}$) with the disparities ($\hat{p}_{ij}$) calculated by isotonic regression.

Procedure to Calculate the GD Measure from P-Matrices

Given the $P_{11}$ and $P_{00}$ matrices, and items in descending order of difficulty, the steps to obtain the GD from DM are set out in columns 1 to 10 of a table. The procedure described should be carried out in each row or column of each P-matrix.

Column 1. Enumerates the rows or columns of the P-matrix.

Column 2. Enumerates the pairs of items $(i, j)$ of a row or column.

Column 3. Dissimilarities $d_{ij}$ induced by the difficulty of the items, that is, the order that the difficulty induces.

Column 4. Observed proportions $p_{ij}$ given in Table 1, the matrices being obtained by MSP5 (Molenaar et al., 2000). Values $p_{ij}$ which do not satisfy non-decreasing monotonicity are shown in bold and with (*).

Columns 5, 6, 7. Weights $w_{ij}$, accumulated weights $W_{ij} = \sum_{k=1}^{j} w_{ik}$ and accumulated proportions $P_{ij} = \sum_{k=1}^{j} p_{ik}$, respectively (in row i only the pairs with $k > i$ contribute). Here, the weights are fixed and assume a value of 1, $(W_{ij}, P_{ij})$ being the coordinates of the CSD. By joining these points the CSD is obtained.

Column 8. Disparities $\hat{p}_{ij}$ associated with the proportions.
1. If there is no deviation from monotonicity, $p_{ij} = \hat{p}_{ij}$.
2. If there is deviation from monotonicity, the disparity is calculated as:

$$\hat{p}_{ij} = \frac{P_{ij+1} - P_{ij-1}}{W_{ij+1} - W_{ij-1}} \quad \text{with } i < j,\; j: i+1, \dots, n \qquad (2a)$$

3. If there are two consecutive proportions $p_{ij}$, $p_{ij+1}$ which do not satisfy monotonicity, the disparity associated with both proportions is given by:

$$\hat{p}_{ij} = \frac{P_{ij+2} - P_{ij-1}}{W_{ij+2} - W_{ij-1}} \qquad (2b)$$

Then, the slope of the segment joining $(W_{ij-1}, \hat{P}_{ij-1})$ and $(W_{ij}, \hat{P}_{ij})$ is $\hat{p}_{ij}$, and the slope of the segment joining $(W_{ij}, \hat{P}_{ij})$ and $(W_{ij+1}, \hat{P}_{ij+1})$ is $\hat{p}_{ij+1} = \hat{p}_{ij}$. Similarly, the method of calculating $\hat{p}_{ij}$ can be extended to more than two successive proportions which do not satisfy monotonicity.

4. If a deviation from monotonicity is found in the last item n of the row (column), the disparity is calculated as follows: A fictitious index $(n+1)$ is added and assigned the value $p_{in+1} = p_{in-1}$. The deviation from monotonicity is calculated with respect to $p_{in-1}$. That is, if $p_{in-1} > p_{in}$ and $p_{in-1}$ is the last value of row i that satisfies monotonicity, then $P_{in+1} = P_{in} + p_{in-1}$ and

$$\hat{p}_{in} = \frac{P_{in+1} - P_{in-1}}{W_{in+1} - W_{in-1}} \qquad (2c)$$

Column 9. Ordinates $\hat{P}_{ij}$ of the GCM, obtained as $\hat{P}_{ij} = P_{ij}$ if there is no deviation from monotonicity. If there is deviation from monotonicity, then, according to the previous situations (2a), (2b), (2c), the ordinates $\hat{P}_{ij}$ of the GCM are obtained in (3a), (3b), (3c), respectively. For example, to obtain (3a), the equation of the segment with slope $\hat{p}_{ij}$ joining the points $(W_{ij-1}, P_{ij-1})$ and $(W_{ij}, \hat{P}_{ij})$ is $\hat{P}_{ij} - P_{ij-1} = \hat{p}_{ij}\,(W_{ij} - W_{ij-1})$; then

$$\hat{P}_{ij} = P_{ij-1} + \hat{p}_{ij}\,(W_{ij} - W_{ij-1}) \qquad (3a)$$

Similarly, ordinates (3b) and (3c) are obtained:

$$\hat{P}_{ij} = P_{ij-1} + \hat{p}_{ij}\,(W_{ij} - W_{ij-1}), \qquad \hat{P}_{ij+1} = P_{ij-1} + \hat{p}_{ij}\,(W_{ij+1} - W_{ij-1}) \qquad (3b)$$

$$\hat{P}_{in} = P_{in-1} + \hat{p}_{in}\,(W_{in} - W_{in-1}) \qquad (3c)$$

The abscissas $\hat{W}_{ij}$ coincide with those of the CSD, $W_{ij}$. The GCM which satisfies non-decreasing monotonicity is obtained by joining the points $(W_{ij}, \hat{P}_{ij})$.

Column 10. The absolute values of the differences between the observed proportions $p_{ij}$ and the disparities $\hat{p}_{ij}$ for each pair of items, $|\hat{p}_{ij} - p_{ij}|$.

Columns 8, 9 and 10 are left blank if there is no deviation from DM in the entire row or column, since then $|\hat{p}_{ij} - p_{ij}| = |p_{ij} - p_{ij}| = 0$.

The discrepancies in the rows and columns of $P_{11}$ are summed up to obtain the GD measure in $P_{11}$. Similarly, this measure is calculated in $P_{00}$. The graphs of the CSD and GCM show these discrepancies in each row (column). The greater the difference between these diagrams, the greater the discrepancy from monotonicity.

The following has been used to apply this procedure to the data:
1. The MSP5 for Windows program of Molenaar et al. (2000), to obtain the difficulty of the items and the observed proportions $p_{ij}$ in the $P_{11}$ and $P_{00}$ matrices.
2. Excel, to calculate $W_{ij}$, $\hat{p}_{ij}$, $P_{ij}$, $\hat{P}_{ij}$, $|\hat{p}_{ij} - p_{ij}|$ and the GD measure.
3. A graph program, to obtain the CSD and GCM graphs.

Data

In order to calculate GD, data collected by Ortiz (1997) were used. A test of 12 items comprising several levels of Numerical Inductive Reasoning, each item being a series of distinct difficulty, was given to 296 subjects of General Basic Education (age 8-12). The task which each subject carried out was to complete the numerical sequence of the series by addition, subtraction, multiplication or division. Subjects' responses were coded in binary form, 0 or 1, indicating that the series had been completed incorrectly or correctly, respectively.

Different subgroups of items have been selected to illustrate the GD:
- In Study 1: four items whose numerical sequence is obtained by addition or subtraction.
- In Study 2: the four easiest items whose numerical sequence is obtained by addition.
- In Study 3: (1) the eight easiest items, (2) the eight most difficult items, (3) four easy and three difficult items.

Study 1: Application of the Procedure to Calculate the GD Measure

The $P_{11}$ and $P_{00}$ matrices from the responses of 294 subjects to 4 dichotomous items are given in Table 2. The items are given in descending order of difficulty δ.

Table 2
P11 and P00 Matrices (rows and columns: items; last row: difficulties δ)

The columns of $P_{11}$ are non-decreasing. Therefore, there is no deviation from DM when analyzing the columns. In other rows or columns of the $P_{11}$ and $P_{00}$ matrices there are deviations. The proportions $p_{ij}$ of the P-matrices are included in column 4 of Tables 3 and 4.

In each row of Table 3, $p_{ij}(1,1)$ should satisfy the same order as $d_{ij}$. In row 1, $P_{11}$ shows one deviation from non-decreasing monotonicity. Then, the fixed weights $w_{ij} = 1$ are given. The accumulated weights $W_{ij}$ and accumulated proportions $P_{ij}$ are obtained in order to calculate the disparity $\hat{p}_{13}(1,1)$ associated with $p_{13}(1,1)$ (column 8).

Then,

$$\hat{p}_{13}(1,1) = \frac{P_{14} - P_{12}}{W_{14} - W_{12}} = \frac{.888 - .300}{3 - 1} = .294$$

and the size of this deviation is $|\hat{p}_{13} - p_{13}| = |.294 - .278| = .016$ in row 1 of the $P_{11}$ matrix. This value, close to 0, indicates that there is a small deviation from monotonicity in the items of row 1. In rows 2 and 3 there is no deviation from monotonicity, so column 4 is equal to column 8. In consequence, column 7 is equal to column 9, and $|\hat{p}_{ij} - p_{ij}| = 0$. So the values $\hat{p}_{ij}$, $\hat{P}_{ij}$ and $|\hat{p}_{ij} - p_{ij}|$ are left blank.

Table 3
Deviation from DM in the P11 Matrix

P11     (i,j)   d_ij   p_ij(1,1)   w_ij   W_ij   P_ij   p_hat_ij   P_hat_ij   |p_hat_ij - p_ij|
Row 1   (1,2)   1      .300        1      1      .300   .300       .300       .000
        (1,3)   2      .278*       1      2      .578   .294       .594       .016
        (1,4)   3      .310        1      3      .888   .310       .888       .000
Row 2   (2,3)   1      .293        1      1      .293
        (2,4)   2      .321        1      2      .614
Row 3   (3,4)   1      .345        1      1      .345
Total                                                                         .016

In Column 7, the ordinates of the CSD are: $P_{12} = .300$, $P_{13} = .300 + .278 = .578$, $P_{14} = .578 + .310 = .888$. In Column 9, the ordinates of the GCM are: $\hat{P}_{12} = P_{12} = .300$; applying (3a), $\hat{P}_{13} = .300 + .294\,(2 - 1) = .594$; and $\hat{P}_{14} = P_{14} = .888$.

Figure 1 shows the CSD (columns 6, 7) and the GCM (columns 6, 9) associated with $p_{ij}(1,1)$ in row 1, where it is found that the discrepancy from monotonicity is .016. In this figure, the graphs of the CSD and GCM almost overlap. This means that there is hardly any difference between the observed proportions and the disparities in row 1.
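The row 1 computation can be written as a short Python sketch that applies the disparity rules (2a) to (2c) to one row or column with unit weights. It is only one possible reading of those rules, under two assumptions that the text does not state explicitly: a value counts as violating when it falls below the last value accepted as monotone, and a run of violators that reaches the end of the row uses that last accepted value as the fictitious extra value. The function names are hypothetical, and this is not a general-purpose isotonic regression routine.

```python
from itertools import accumulate


def row_disparities(p, increasing=True):
    """Disparities for one row or column of a P-matrix, with unit weights.

    Assumed reading of (2a)-(2c): a run of consecutive violators is replaced by
    the slope of the CSD segment spanning the run and the value that follows it.
    """
    sign = 1.0 if increasing else -1.0
    q = [sign * v for v in p]          # mirror the non-increasing case onto the non-decreasing one
    P = [0.0] + list(accumulate(q))    # P[k] = sum of the first k values (CSD ordinates)
    fitted = list(q)
    m = len(q)
    last_ok = 0                        # position of the last value that satisfies monotonicity
    k = 1
    while k < m:
        if q[k] >= q[last_ok]:
            last_ok = k
            k += 1
            continue
        a = b = k                      # run of consecutive violators q[a..b]
        while b + 1 < m and q[b + 1] < q[last_ok]:
            b += 1
        if b + 1 < m:
            total = P[b + 2] - P[a]                  # (2a)/(2b): run plus the following value
        else:
            total = P[b + 1] - P[a] + q[last_ok]     # (2c): fictitious value after the last item
        slope = total / (b - a + 2)
        for t in range(a, b + 1):
            fitted[t] = slope
        if b + 1 < m:
            last_ok = b + 1
        k = b + 2
    return [sign * v for v in fitted]


def row_deviation(p, increasing=True):
    """Sum of |p_hat - p| over one row or column (column 10 of the procedure)."""
    return sum(abs(ph - po) for ph, po in zip(row_disparities(p, increasing), p))


row1_p11 = [0.300, 0.278, 0.310]       # row 1 of Table 3
print([round(v, 3) for v in row_disparities(row1_p11)])
print(round(row_deviation(row1_p11), 3))
```

For the $P_{00}$ matrix the same function would be called with `increasing=False`, since its rows and columns are required to be non-increasing.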

[Figure 1. CSD and GCM (row 1 in Table 3). Vertical axis: ordinates of CSD/GCM; horizontal axis: weights; series: CSD_Row1, GCM_Row1.]

There are deviations from DM in the rows and columns of the $P_{00}$ matrix (Table 4). In Row 2 there is one deviation from DM in the last value of the row, item pair (2,4). Then, a fictitious index of 5 is included and given the value $p_{25} = p_{23} = .293$ (the value .293 is underlined in Table 4), and the disparity associated with this value is obtained by calculating

$$\hat{p}_{24}(0,0) = \frac{P_{25} - P_{23}}{W_{25} - W_{23}} = \frac{1.025 - .293}{3 - 1} = .366$$

This gives a deviation of $|.366 - .439| = .073$.

In Column 4 there is one deviation from DM in item pair (3,4); then a new fictitious index of 4 is included and given the value $p_{44} = p_{24} = .439$ (the value .439 is underlined in Table 4). The disparity associated with this value is calculated as

$$\hat{p}_{34}(0,0) = \frac{P_{44} - P_{24}}{W_{44} - W_{24}} = \frac{1.780 - .878}{4 - 2} = .451$$

The size of the deviation is $|.451 - .463| = .012$. In $P_{00}$ there is a total deviation $\sum |\hat{p}_{ij} - p_{ij}| = .073 + .012 = .085$.

Row 1 and Column 3 of Table 4 show no deviation from monotonicity, so the values $\hat{p}_{ij} = p_{ij}$ are not included in this table. In consequence, Column 4 of the table is equal to Column 8, and $|\hat{p}_{ij} - p_{ij}| = 0$ (as in Table 3).

The GD measure is given as $D(P_{11}, \Delta, 4) = .016/12 = .0013$ in the $P_{11}$ matrix and as $D(P_{00}, \Delta, 4) = .085/12 = .0071$ in the $P_{00}$ matrix. GD in both matrices is close to zero, so there is no great deviation from DM.
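The last step of Study 1 is only a normalisation, sketched below under the assumption that Equation (1) divides the summed row and column deviations by n(n - 1). The totals .016 and .085 are the ones reported for Tables 3 and 4; the function name is hypothetical.

```python
def global_deviation(total_deviation, n_items):
    """GD for one P-matrix: summed |p_hat - p| over rows and columns, divided by n(n - 1)."""
    return total_deviation / (n_items * (n_items - 1))


gd_p11 = global_deviation(0.016, n_items=4)   # total deviation reported for Table 3
gd_p00 = global_deviation(0.085, n_items=4)   # total deviation reported for Table 4
print(round(gd_p11, 4), round(gd_p00, 4))
```

Because the divisor grows with the number of items, totals from scales of different length (for example 7 and 8 items in Study 3) are placed on a comparable footing before they are compared.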

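Table 4
Deviation from DM in the P00 Matrix

P00        (i,j)   d_ij   p_ij(0,0)   w_ij   W_ij   P_ij    p_hat_ij   P_hat_ij   |p_hat_ij - p_ij|
Row 1      (1,2)   1      .557        1      1      .557
           (1,3)   2      .537        1      2      1.094
           (1,4)   3      .439        1      3      1.533
Row 2      (2,3)   1      .293        1      1      .293    .293       .293       .000
           (2,4)   2      .439*       1      2      .732    .366       .659       .073
Row 3      (3,4)   1      .463        1      1      .463
Column 2   (1,2)   1      .557        1      1      .557
Column 3   (1,3)   1      .537        1      1      .537
           (2,3)   2      .293        1      2      .830
Column 4   (1,4)   1      .439        1      1      .439    .439       .439       .000
           (2,4)   2      .439        1      2      .878    .439       .878       .000
           (3,4)   3      .463*       1      3      1.341   .451       1.329      .012
Total                                                                             .085

Note. Values marked with * do not satisfy monotonicity. The underlined values referred to in the text are $p_{23} = .293$ (Row 2) and $p_{24} = .439$ (Column 4).

Study 2: GD is Zero if there are no Individual Deviations from DM

The $P_{11}$ and $P_{00}$ matrices from the responses of 294 subjects to 4 dichotomous items are given in Table 5 (these 4 items are different from the 4 items in Study 1). It can be seen that there is no deviation from DM in any row or column of these matrices.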

Table 5
P11 and P00 Matrices (rows and columns: items; last row: difficulties δ)

If the procedure of Study 1 is applied, $P_{ij} = \hat{P}_{ij}$ and $p_{ij} = \hat{p}_{ij}$ for all $p_{ij}$. In each row (or column) the CSD and GCM coincide, and the GD measure is zero.

Study 3: Effect of Size of Individual Deviations on GD

The $P_{11}$ and $P_{00}$ matrices from the responses of 294 subjects to two sets of 8 and one set of 7 dichotomous items are given in Tables 6a, 6b, 7a, 7b, 8a and 8b. After applying the above procedure, described in Tables 3 and 4 of Study 1, only the proportions $p_{ij}$ and disparities $\hat{p}_{ij}$ associated with proportions not satisfying monotonicity are included. To summarize the information, each cell of the $P_{11}$ and $P_{00}$ matrices shows two values, which appear in the following order (from top to bottom).

When there is deviation from DM in rows, in Tables 6a, 7a and 8a:
1. $p_{ij}$ appears in bold,
2. the estimated $\hat{p}_{ij}$ is shown below the figure in bold.

When there is deviation from DM in columns, in Tables 6b, 7b and 8b:
1. $p_{ij}$ appears in bold,
2. the estimated $\hat{p}_{ij}$ is shown below the figure in bold.

For example, in the $P_{11}$ matrix of Table 6a there is one deviation in Row 2 (pair (2,7), $p_{27} = .70$ and $\hat{p}_{27} = .705$), and in Table 6b there is one deviation in Column 6 of the P-matrix (pair (3,6), $p_{36} = .70$ and $\hat{p}_{36} = .72$).

The sum of all deviations and the GD for each P-matrix are shown at the foot of each table. For Tables 6a and 6b, the sum of deviations $\sum |\hat{p}_{ij} - p_{ij}|$ is obtained in each matrix, and from it the measures $D(P_{00}, \Delta, 8)$ and $D(P_{11}, \Delta, 8)$.

Table 6a
P Matrices and Deviations from DM in Rows (8 Items)

Table 6b
P Matrices and Deviations from DM in Columns (8 Items)

In Tables 7a and 7b, the deviations in $P_{00}$ and $P_{11}$ are obtained in the same way, together with $D(P_{00}, \Delta, 8)$ and $D(P_{11}, \Delta, 8)$; in $P_{11}$, $\sum |\hat{p}_{ij} - p_{ij}| = 0.21$.

Table 7a
P Matrices and Deviations from DM in Rows (8 Items)

Table 7b
P Matrices and Deviations from DM in Columns (8 Items)

Tables 6a and 6b present fewer deviations from DM than Tables 7a and 7b, and the GD obtained from Tables 6a and 6b is lower than the GD from Tables 7a and 7b.

In Tables 8a and 8b, there is no deviation in the rows of $P_{00}$ nor in the columns of $P_{11}$. The sums of deviations and the measures $D(P_{00}, \Delta, 7)$ and $D(P_{11}, \Delta, 7)$ are obtained as before.

Table 8a
P Matrices and Deviations from DM in Rows (7 Items)
Foot of table: $P_{00}$, rows: $\sum |\hat{p}_{ij} - p_{ij}| = 0$.

Table 8b
P Matrices and Deviations from DM in Columns (7 Items)
Foot of table: $P_{00}$, columns: $\sum |\hat{p}_{ij} - p_{ij}| = 0.16$; $P_{11}$, columns: $\sum |\hat{p}_{ij} - p_{ij}| = 0$.

In Tables 8a and 8b there are only 5 deviations from DM, but these deviations are greater than those in Tables 6a, 6b and 7a, 7b. These deviations are reflected overall in the GD measure, which in Tables 8a, 8b is greater than or equal to any of the others. From these examples, when different sets of items are considered, it can be seen that a few large deviations have more effect than many small deviations on this goodness-of-fit measure.

In order to analyze the deviation from DM visually, for example in row 1 of $P_{11}$ (Table 8a), the values $P_{ij}$ and $\hat{P}_{ij}$ associated with $p_{ij}$ and $\hat{p}_{ij}$ are given in Table 9. There are deviations in the consecutive item pairs (1,4) and (1,5). Applying (2b), the value $\hat{p}$ associated with $p_{14}$ and $p_{15}$ is $(P_{16} - P_{13}) / (5 - 2)$, and applying (3b), $\hat{P}_{14} = P_{13} + \hat{p}_{14}\,(3 - 2)$ and $\hat{P}_{15} = P_{13} + \hat{p}_{14}\,(4 - 2)$. As in the rest of the items there is no deviation from monotonicity, $\hat{P}_{ij} = P_{ij}$.

Table 9
P and P_hat of Row 1 in P11 (Table 8a)
Columns: item pairs (1,2) to (1,7), weights, p, P, p_hat, P_hat.

The CSD and GCM of row 1 in $P_{11}$ are plotted in Figure 2.

[Figure 2. CSD and GCM (row 1 in P11, Table 8a). Vertical axis: ordinates of CSD/GCM; horizontal axis: weights; series: CSD_Row1, GCM_Row1.]

It can be seen from Figure 2 that the deviations from DM in this row are important. Comparing Figures 1 and 2, it can be seen that the deviation from DM in row 1 of Table 3 (0.016) is less important than that in row 1 of Table 8a. In the same way, deviations in other rows and/or columns can be calculated and visualized.

Discussion

Mokken's concept of DM and isotonic regression are linked together to define a goodness-of-fit measure (GD) from DM in the $P_{11}$ and $P_{00}$ matrices. In this way, the study of non-decreasing monotonicity (in the rows and columns of $P_{11}$) and non-increasing monotonicity (in the rows and columns of $P_{00}$) is made by isotonic regression. This method is applied in the same way as it is applied to define the goodness-of-fit measure stress in non-metric multidimensional scaling. The GD measure is shown, explaining the procedure to calculate it in general terms.

From the examples presented in this work, the GD measure has been checked with real data. It shows:
1. The empirical procedure for calculating the measure (Study 1),
2. that the minimum value is 0 if there is no deviation from DM (Study 2), and
3. a comparison of the results for several types of data. On the one hand, given two sets with the same number of items (8) but of varying difficulty, the example shows that the greater the number of deviations, the more the GD increases. On the other hand, given two sets with a differing number of items (8 and 7) but also of varying difficulty, the example shows that a few large deviations result in a GD measure greater than or equal to that of more deviations of smaller size.

The advantages of this measure over others that investigate deviations from DM are:
1. It takes into account the size and number of deviations.
The size of the deviations is obtained by calculating the differences between the observed and estimated theoretical proportions (disparities), the latter being obtained by Isotone Regression. Thus, in P 11, the disparities which make the GCM satisfy the non-decreasing monotonicity are considered in order to ascertain whether the discrepancies from

DM are great or small. In addition, deviations from DM in each row or column can be analyzed visually by plotting the CSD (associated with the observed proportions $p_{ij}$) and the GCM (associated with the estimated proportions $\hat{p}_{ij}$). The greater the difference between these diagrams, the greater the discrepancy from DM.
2. The procedure given to calculate the measure allows us to know the deviations between pairs of items, across all the items of one row or column of $P_{00}$ or $P_{11}$, or the global deviation in each P-matrix. This last measure gives the global deviation from non-decreasing monotonicity, on the one hand, and the global deviation from non-increasing monotonicity, on the other.

Further studies are necessary to:
1. Prove that in some cases, given the same number of items, a few large deviations can result in a greater deviation measure than more deviations of smaller size.
2. Compare GD empirically with other indices which evaluate DM.
3. Determine the bounds of GD using simulated data, in order to see whether there is low, medium or high deviation from DM.
4. Extend this measure to polytomous items.

References

Barlow, R. E., Bartholomew, D. J., Bremner, J. M. & Brunk, H. D. (1972). Statistical inference under order restrictions. London: John Wiley and Sons.
Mokken, R. J. (1971). A theory and procedure of scale analysis with applications in political research. Berlin: Walter de Gruyter, Mouton.
Mokken, R. J. (1997). Nonparametric models for dichotomous responses. In W. J. Van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. ). New York: Springer.
Molenaar, I. W. & Sijtsma, K. (2000). User's manual MSP5 for Windows. Groningen: iec ProGAMMA.
Molenaar, I. W. (1991). Fit investigation in the multicategory Mokken scale. (Unpublished manuscript).
Molenaar, I. W., Debets, P., Sijtsma, K. & Hemker, B. T. (1994). MSP 3.0. A program for Mokken scale analysis for polytomous items. Groningen: iec ProGAMMA.

Molenaar, I. W., Sijtsma, K., van Schuur, W. H. & Mokken, R. J. (2000). MSP5 for Windows. A program for Mokken scale analysis for polytomous items. Groningen: iec ProGAMMA.
Ortiz, A. (1997). Razonamiento inductivo numérico: Un estudio en Educación Primaria. Unpublished doctoral dissertation. Granada University, Spain.
Rivas Moya, T. (2000a). Calculating isotonic regression of the distance function in nonmetric multidimensional scaling model. Methods of Psychological Research, 5(3), 1-8.
Rivas Moya, T. (2000b). Goodness of fit measure based on sample isotone regression of Mokken double monotonicity model. In H. A. L. Kiers, J.-P. Rasson, P. J. F. Groenen & M. Schader (Eds.), Data analysis, classification and related methods (pp. ). Berlin: Springer Verlag.
Rosenbaum, P. R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49.
Rosenbaum, P. R. (1987). Comparing item characteristic curves. Psychometrika, 52.
Sijtsma, K. & Junker, B. W. (1996). A survey of theory and methods of invariant item ordering. British Journal of Mathematical and Statistical Psychology, 49.
Sijtsma, K. & Meijer, R. R. (1992). A method for investigating the intersection of item response functions in Mokken's nonparametric IRT model. Applied Psychological Measurement, 16.
Sijtsma, K. & Molenaar, I. W. (2002). Introduction to nonparametric item response theory. London: Sage Publications.
Sijtsma, K. (1998). Methodology review: Nonparametric IRT approaches to the analysis of dichotomous item scores. Applied Psychological Measurement, 22.
Sijtsma, K. (2001). Developments in measurement of persons and items by means of item response models. Behaviormetrika, 28.
