Use of Continuous Exponential Families to Link Forms via Anchor Tests


Research Report ETS RR

Use of Continuous Exponential Families to Link Forms via Anchor Tests

Shelby J. Haberman and Duanli Yan
ETS, Princeton, NJ

April 2011

As part of its nonprofit mission, ETS conducts and disseminates the results of research to advance quality and equity in education and assessment for the benefit of ETS's constituents and the field.

To obtain a PDF or a print copy of a report, please visit:

Technical Review Editor: Matthias von Davier
Technical Reviewers: Frank Rijmen and Sandip Sinharay

Copyright 2011 by Educational Testing Service. All rights reserved. ETS, the ETS logo, and LISTENING. LEARNING. LEADING. are registered trademarks of Educational Testing Service (ETS).

Abstract

Continuous exponential families are applied to linking test forms via an internal anchor. This application combines work on continuous exponential families for single-group designs and work on continuous exponential families for equivalent-group designs. Results are compared to those for kernel and equipercentile equating in the case of chained equating. The conversions produced by all methods are quite similar.

Key words: moments, information theory, nonequivalent groups

Acknowledgments

Alina von Davier and Hexin Chang have assisted in this project.

Application of continuous exponential families to linking has been considered for equivalent-groups designs (Haberman, 2008a) and single-group designs (Haberman, 2008b). The procedure for a single-group design is readily applied to the chained approach to the equating design for nonequivalent groups with anchor tests (NEAT). In this report, the required methodology is described, and application is made to the equating of several forms from several components of a test in which kernel equating is currently used on an operational basis. Results of equating by continuous exponential families are compared to those for kernel equating and to those for equipercentile equating with log-linear smoothing.

On the whole, all equating procedures yield quite similar results; however, continuous exponential families have some advantages. As in kernel equating, readily computed asymptotic standard deviations are available. Unlike in kernel equating, a bandwidth need not be specified or estimated. In addition, continuous exponential families can be applied to continuous score distributions and to score distributions with very large numbers of possible values. This feature may gain increasing significance in the future if scoring begins to include such components as essentially continuous electronically derived features of essays.

Section 1 describes use of continuous exponential families in the NEAT design. In this section, all distributions of random variables and random vectors are assumed known. Section 2 considers the more realistic case in which sample data must be used to determine the appropriate conversions. Section 3 summarizes results of the application to the test data. Section 4 provides some conclusions. Discussion assumes familiarity with kernel and equipercentile equating methods (von Davier, Holland, & Thayer, 2004).
1 Equating for the NEAT Design With Continuous Exponential Families

To equate two test forms with a common anchor test by continuous exponential families is relatively straightforward if the chained approach is employed. Consider two test forms, Form 1 and Form 2, and consider an anchor test A. For $1 \le j \le 2$, let $n_j$ be a positive integer, and let Examinee $i$, $1 \le i \le n_j$, receive a score $X_{ij}$ on Form $j$ and a score $A_{ij}$ on the anchor test. Assume that the pairs $(X_{ij}, A_{ij})$, $1 \le i \le n_j$, $1 \le j \le 2$, are mutually independent. For $1 \le j \le 2$, let the joint distribution of $(X_{ij}, A_{ij})$ be the same for $1 \le i \le n_j$. The examinees who receive Form 1 are not assumed to be from the same population as the examinees who receive Form 2, so that $A_{i1}$ and $A_{i'2}$ need not have the same distribution for Examinee $i$ who received Form 1 and Examinee $i'$ who received Form 2.

For Form $j$, where $j$ is 1 or 2, possible scores $X_{ij}$ are in the closed interval with finite lower bound $c_{Xj}$ and finite upper bound $d_{Xj} > c_{Xj}$. In addition, the anchor test scores $A_{ij}$ are all in a closed interval with lower bound $c_A$ and upper bound $d_A > c_A$. No requirement is imposed that the scores be integers or rational numbers. Nonetheless, in typical applications, the common distribution function $F_{Xj}$ of $X_{ij}$, $1 \le i \le n_j$, and the common distribution function $F_{Aj}$ of $A_{ij}$, $1 \le i \le n_j$, are not continuous, so that an equipercentile approach to equating of Form 1 and Form 2 based on observed scores normally involves some approximation of the distribution functions $F_{Xj}$ and $F_{Aj}$ by continuous distribution functions $G_{Xj}$ and $G_{Aj}$, respectively. The distribution function $G_{Xj}$ is strictly increasing on some open interval $B_{Xj}$ that contains both $c_{Xj}$ and $d_{Xj}$, and the distribution function $G_{Aj}$ is strictly increasing on some open interval $B_A$ that contains $c_A$ and $d_A$. There are unique continuous and increasing quantile functions $R_{Xj}$ and $R_{Aj}$ such that, for each positive real $p < 1$, $G_{Xj}(R_{Xj}(p)) = p$ and $G_{Aj}(R_{Aj}(p)) = p$.

With the chained approach, the linking function $e_{X1X2}$ for conversion of a score on Form 1 to a score on Form 2 is then

$$e_{X1X2}(x) = R_{X2}(G_{A2}(R_{A1}(G_{X1}(x))))$$

for $x$ in $B_{X1}$, while the linking function $e_{X2X1}$ for conversion of a score on Form 2 to a score on Form 1 is

$$e_{X2X1}(x) = R_{X1}(G_{A1}(R_{A2}(G_{X2}(x))))$$

for $x$ in $B_{X2}$. Both $e_{X1X2}$ and $e_{X2X1}$ are strictly increasing and continuous on their respective ranges, and $e_{X1X2}$ and $e_{X2X1}$ are inverses, so that $e_{X1X2}(e_{X2X1}(x)) = x$ for $x$ in $B_{X2}$ and $e_{X2X1}(e_{X1X2}(x)) = x$ for $x$ in $B_{X1}$ (Haberman, 2008a).
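The chained composition can be illustrated with a minimal Python sketch. The four normal distributions below are hypothetical stand-ins for the continuous approximations $G_{X1}$, $G_{A1}$, $G_{X2}$, and $G_{A2}$ (they are not fitted exponential-family models), and the function names `chain`, `e_x1x2`, and `e_x2x1` are chosen here for illustration only.

```python
from statistics import NormalDist

# Hypothetical continuous approximations; normal curves are stand-ins only.
G_X1 = NormalDist(mu=50.0, sigma=10.0)   # Form 1 scores, Form 1 group
G_A1 = NormalDist(mu=20.0, sigma=4.0)    # anchor scores, Form 1 group
G_X2 = NormalDist(mu=52.0, sigma=9.0)    # Form 2 scores, Form 2 group
G_A2 = NormalDist(mu=21.0, sigma=4.5)    # anchor scores, Form 2 group


def chain(x, g_from, g_anchor_same, g_anchor_other, g_to):
    """Return R_to(G_anchor_other(R_anchor_same(G_from(x))))."""
    # Step 1: convert the form score to the anchor scale within the same group.
    anchor_score = g_anchor_same.inv_cdf(g_from.cdf(x))
    # Step 2: convert the anchor score to the other form within the other group.
    return g_to.inv_cdf(g_anchor_other.cdf(anchor_score))


def e_x1x2(x):
    """Chained conversion of a Form 1 score to the Form 2 scale."""
    return chain(x, G_X1, G_A1, G_A2, G_X2)


def e_x2x1(x):
    """Chained conversion of a Form 2 score to the Form 1 scale."""
    return chain(x, G_X2, G_A2, G_A1, G_X1)
```

Because each step is a strictly increasing map, the two conversions in this sketch undo each other, which mirrors the inverse property stated above.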
If $G_{X1}$ has a continuous derivative $g_{X1}$ at $x$ in $B_{X1}$, $G_{A1}$ has a positive and continuous derivative $g_{A1}$ at $e_{X1A}(x) = R_{A1}(G_{X1}(x))$, $G_{A2}$ has a continuous derivative $g_{A2}$ at $e_{X1A}(x)$, and $G_{X2}$ has a continuous and positive derivative $g_{X2}$ at $e_{X1X2}(x)$, then application of standard results from calculus shows that $e_{X1X2}$ has continuous derivative

$$e'_{X1X2}(x) = \frac{g_{X1}(x)\,g_{A2}(e_{X1A}(x))}{g_{A1}(e_{X1A}(x))\,g_{X2}(e_{X1X2}(x))}$$

at $x$. Similarly, if $G_{X2}$ has a continuous derivative $g_{X2}$ at $x$ in $B_{X2}$, $G_{A2}$ has a positive and continuous derivative $g_{A2}$ at $e_{X2A}(x) = R_{A2}(G_{X2}(x))$, $G_{A1}$ has a continuous derivative $g_{A1}$ at $e_{X2A}(x)$, and $G_{X1}$ has a continuous and positive derivative $g_{X1}$ at $e_{X2X1}(x)$, then $e_{X2X1}$ has continuous derivative

$$e'_{X2X1}(x) = \frac{g_{X2}(x)\,g_{A1}(e_{X2A}(x))}{g_{A2}(e_{X2A}(x))\,g_{X1}(e_{X2X1}(x))}$$

at $x$.

One method to obtain distribution functions $G_{X1}$, $G_{A1}$, $G_{X2}$, and $G_{A2}$ is to approximate the joint distribution of $(X_{ij}, A_{ij})$ by use of a bivariate continuous exponential family for both $j = 1$ and $j = 2$ (Haberman, 2008b). For simplicity, let $B_{Xj}$, $1 \le j \le 2$, and $B_A$ be bounded. For $k \ge 0$, let $u_{kXj}$ be a polynomial of degree $k$ on the interval $B_{Xj}$ for $1 \le j \le 2$, and let $u_{kA}$ be a polynomial of degree $k$ on $B_A$. For $1 \le j \le 2$ and a pair $\mathbf{k} = (k_{Xj}, k_A)$ of nonnegative integers, let $u_{\mathbf{k}j}$ be the polynomial on the plane such that

$$u_{\mathbf{k}j}(\mathbf{x}_j) = u_{k_{Xj}Xj}(x_{Xj})\,u_{k_A A}(x_A)$$

for real pairs $\mathbf{x}_j = (x_{Xj}, x_A)$. Let $\mathbf{X}_{ij} = (X_{ij}, A_{ij})$. Let $\mu_{\mathbf{k}j}$ be the expectation of $u_{\mathbf{k}j}(\mathbf{X}_{ij})$, so that $\mu_{\mathbf{k}j}$ is a linear combination of the bivariate moments $E(X_{ij}^{h_{Xj}} A_{ij}^{h_A})$ of $\mathbf{X}_{ij}$ for nonnegative integers $h_{Xj} \le k_{Xj}$ and $h_A \le k_A$.

Consider a nonempty set $K_j$ of $r_j$ pairs of nonnegative integers $\mathbf{k} = (k_{Xj}, k_A)$ such that $k_{Xj}$ or $k_A$ is positive. Let $\boldsymbol{\mu}_{K_j j}$ be the $K_j$-array of $\mu_{\mathbf{k}j}$, $\mathbf{k}$ in $K_j$, and let $\mathbf{u}_{K_j j}(\mathbf{x})$ be the $K_j$-array of $u_{\mathbf{k}j}(\mathbf{x})$, $\mathbf{k}$ in $K_j$. If $\mathbf{y}_{K_j j}$ is a real $K_j$-array of $y_{\mathbf{k}j}$, $\mathbf{k}$ in $K_j$, and $\mathbf{z}_{K_j j}$ is a real $K_j$-array of $z_{\mathbf{k}j}$, $\mathbf{k}$ in $K_j$, then let

$$\mathbf{y}'_{K_j j}\mathbf{z}_{K_j j} = \sum_{\mathbf{k} \in K_j} y_{\mathbf{k}j} z_{\mathbf{k}j}.$$

Assume that, for any real $K_j$-array $\mathbf{y}_{K_j j}$, the variance of $\mathbf{y}'_{K_j j}\mathbf{u}_{K_j j}(\mathbf{X}_{ij})$ is 0 only if $y_{\mathbf{k}j} = 0$ for each $\mathbf{k}$ in $K_j$. Let $B_{XjA} = B_{Xj} \times B_A$ be the interval in the plane that consists of pairs $(b_{Xj}, b_A)$ such that $b_{Xj}$ is in $B_{Xj}$ and $b_A$ is in $B_A$. To treat issues such as internal anchors, let $w_j$ be a bounded and positive real function on $B_{XjA}$. For numerical work, it is helpful to assume that $w_j$ is infinitely differentiable. Then a unique continuous bivariate distribution with positive density on $B_{XjA}$ has the exponential family density

$$g_{K_j j}(\mathbf{x}) = \gamma_{K_j j}(\boldsymbol{\theta}_{K_j j})\,w_j(\mathbf{x})\exp[\boldsymbol{\theta}'_{K_j j}\mathbf{u}_{K_j j}(\mathbf{x})],$$

$\mathbf{x}$ in $B_{XjA}$, for a unique $K_j$-array $\boldsymbol{\theta}_{K_j j}$ with elements $\theta_{\mathbf{k}K_j j}$, $\mathbf{k}$ in $K_j$, and a unique positive real $\gamma_{K_j j}(\boldsymbol{\theta}_{K_j j})$ such that

$$\int_{B_{XjA}} u_{\mathbf{k}j}(\mathbf{x})\,g_{K_j j}(\mathbf{x})\,d\mathbf{x} = \mu_{\mathbf{k}j}$$

for $\mathbf{k}$ in $K_j$ and

$$\int_{B_{XjA}} g_{K_j j}(\mathbf{x})\,d\mathbf{x} = 1$$

(Gilula & Haberman, 2000; Haberman, 2008b).
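As a rough numerical illustration of this density, the sketch below evaluates $w_j(\mathbf{x})\exp[\boldsymbol{\theta}'\mathbf{u}(\mathbf{x})]$ on a rectangle and obtains the normalizing constant $\gamma$ by a midpoint rule. Everything specific here is an assumption made for the example: the rectangle bounds, the parameter values in `THETA`, the monomial product basis on rescaled scores, the logistic weight (one possible bounded, positive, smooth choice), and the grid size.

```python
import math

# Illustrative rectangle B_X x B_A and parameter array theta (assumed values).
B_X = (-0.5, 40.5)
B_A = (-0.5, 20.5)
THETA = {(1, 0): 0.3, (0, 1): 0.2, (2, 0): -1.5, (0, 2): -1.2, (1, 1): 0.8}


def scaled(x, interval):
    """Map x from the interval to [-1, 1]."""
    lo, hi = interval
    return (2.0 * x - lo - hi) / (hi - lo)


def u(k, x_x, x_a):
    """Product basis u_k(x): monomials of degree k_X and k_A in rescaled scores."""
    return scaled(x_x, B_X) ** k[0] * scaled(x_a, B_A) ** k[1]


def weight(x_x, x_a, z=2.0):
    """A bounded, positive, smooth weight: logistic in x_X - x_A (assumed)."""
    return 1.0 / (1.0 + math.exp(-z * (x_x - x_a)))


def unnormalized(x_x, x_a):
    """w(x) * exp(theta' u(x)) before division by the normalizing integral."""
    return weight(x_x, x_a) * math.exp(
        sum(t * u(k, x_x, x_a) for k, t in THETA.items()))


def midpoints(interval, n):
    """Midpoint-rule nodes and cell width on the interval."""
    lo, hi = interval
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h


def integrate(f, n=120):
    """Midpoint-rule integral of f over the rectangle B_X x B_A."""
    x_pts, h_x = midpoints(B_X, n)
    a_pts, h_a = midpoints(B_A, n)
    return sum(f(x, a) for x in x_pts for a in a_pts) * h_x * h_a


GAMMA = 1.0 / integrate(unnormalized)    # normalizing constant gamma(theta)


def density(x_x, x_a):
    """Exponential family density g(x) on the rectangle."""
    return GAMMA * unnormalized(x_x, x_a)


def moment(k):
    """Model expectation of u_k, the quantity matched to mu_k."""
    return integrate(lambda x, a: u(k, x, a) * density(x, a))
```

In an actual fit, the parameters would be chosen so that `moment(k)` equals the target $\mu_{\mathbf{k}j}$ for every pair in $K_j$; the sketch only evaluates the model side of those equations for a fixed parameter array.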
A random vector $\mathbf{Y}_{K_j j} = (Y_{XjK_j j}, Y_{AK_j j})$ in $B_{XjA}$ then exists with density $g_{K_j j}$. The moment equalities $E(u_{\mathbf{k}j}(\mathbf{Y}_{K_j j})) = E(u_{\mathbf{k}j}(\mathbf{X}_{ij}))$ hold for $\mathbf{k}$ in $K_j$, so that $\mathbf{Y}_{K_j j}$ has a distribution close to that of $\mathbf{X}_{ij}$ in the sense that the expected log penalty function

$$I_{K_j j} = E(-\log g_{K_j j}(\mathbf{X}_{ij}))$$

is the smallest expected log penalty function $E(-\log g(\mathbf{X}_{ij}))$ for all probability densities $g$ on $B_{XjA}$ such that

$$g(\mathbf{x}) = \gamma_{K_j j}(\boldsymbol{\theta}^*_{K_j j})\,w_j(\mathbf{x})\exp[\boldsymbol{\theta}^{*\prime}_{K_j j}\mathbf{u}_{K_j j}(\mathbf{x})]$$

for some real $K_j$-array $\boldsymbol{\theta}^*_{K_j j}$, and $E(-\log g(\mathbf{X}_{ij})) = I_{K_j j}$ only if $\boldsymbol{\theta}^*_{K_j j} = \boldsymbol{\theta}_{K_j j}$.

If $K_j$ includes the pairs $(1, 0)$, $(0, 1)$, $(2, 0)$, $(0, 2)$, and $(1, 1)$ and $w_j$ is always 1, then $\log g_{K_j j}(\mathbf{x})$ is a quadratic function

$$\beta_0 + \beta_{Xj} x_{Xj} + \beta_A x_A + \beta_{XjXj} x_{Xj}^2 + 2\beta_{XjA} x_{Xj} x_A + \beta_{AA} x_A^2.$$

If $\beta_{XjXj}$ and $\beta_{AA}$ are both negative and if $\beta_{XjA}^2 < \beta_{XjXj}\beta_{AA}$, then $g_{K_j j}$ is the conditional density of a bivariate normal random vector given that the vector is in the interval $B_{XjA}$. The random vector $\mathbf{Y}_{K_j j}$ with density $g_{K_j j}$ then has the same mean and covariance matrix as $(X_{ij}, A_{ij})$.

The moment equations expressed in terms of $u_{\mathbf{k}j}$ can be interpreted in terms of conventional moments if the set $K_j$ satisfies the hierarchy rule that $(k_{Xj}, k_A)$ is in $K_j$ whenever $(h_{Xj}, h_A)$ is in $K_j$, $k_{Xj} \le h_{Xj}$, $k_A \le h_A$, $k_{Xj}$ and $k_A$ are nonnegative integers, and $k_{Xj}$ or $k_A$ is positive. The equations $E(u_{\mathbf{k}j}(\mathbf{Y}_{K_j j})) = E(u_{\mathbf{k}j}(\mathbf{X}_{ij}))$ for $\mathbf{k}$ in $K_j$ then hold if, and only if,

$$E(Y_{XjK_j j}^{k_{Xj}} Y_{AK_j j}^{k_A}) = E(X_{ij}^{k_{Xj}} A_{ij}^{k_A})$$

for all $\mathbf{k}$ in $K_j$.

For $1 \le j \le 2$, the distribution function $G_{XjK_j j}$ of $Y_{XjK_j j}$ and the distribution function $G_{AK_j j}$ of $Y_{AK_j j}$ are strictly increasing and continuously differentiable on their respective ranges $B_{Xj}$ and $B_A$. If $B_{XjyA}$, $y$ in $B_{Xj}$, consists of all pairs $(y_{Xj}, y_A)$ such that $y_{Xj}$ is in $B_{Xj}$, $y_A$ is in $B_A$, and $y_{Xj} \le y$, then

$$G_{XjK_j j}(y) = \int_{B_{XjyA}} g_{K_j j}(\mathbf{x})\,d\mathbf{x}.$$

If $B_{XjAy}$, $y$ in $B_A$, consists of all pairs $(y_{Xj}, y_A)$ such that $y_{Xj}$ is in $B_{Xj}$, $y_A$ is in $B_A$, and $y_A \le y$, then

$$G_{AK_j j}(y) = \int_{B_{XjAy}} g_{K_j j}(\mathbf{x})\,d\mathbf{x}.$$

The inverse $R_{XjK_j j}$ defined by $G_{XjK_j j}(R_{XjK_j j}(p)) = p$ for $0 < p < 1$ and the inverse $R_{AK_j j}$ defined by $G_{AK_j j}(R_{AK_j j}(p)) = p$ for $0 < p < 1$ are also continuously differentiable and strictly increasing, so that the conversion functions

$$e_{X1X2K_1K_2} = R_{X2K_2 2}(G_{AK_2 2}(R_{AK_1 1}(G_{X1K_1 1})))$$

and

$$e_{X2X1K_1K_2} = R_{X1K_1 1}(G_{AK_1 1}(R_{AK_2 2}(G_{X2K_2 2})))$$

are also continuously differentiable and strictly increasing. Note that $e_{X1X2K_1K_2} = e_{AX2K_2}(e_{X1AK_1})$, where $e_{X1AK_1} = R_{AK_1 1}(G_{X1K_1 1})$ provides a conversion from Form 1 to the anchor test and $e_{AX2K_2} = R_{X2K_2 2}(G_{AK_2 2})$ provides a conversion from the anchor test to Form 2, while $e_{X2X1K_1K_2} = e_{AX1K_1}(e_{X2AK_2})$, where $e_{X2AK_2} = R_{AK_2 2}(G_{X2K_2 2})$ provides a conversion from Form 2 to the anchor test and $e_{AX1K_1} = R_{X1K_1 1}(G_{AK_1 1})$ provides a conversion from the anchor test to Form 1.

As in other cases of continuous exponential families (Haberman, 2008a, 2008b), numerical work is simplified if computations employ the Legendre polynomials $P_k$ for $k \ge 0$ (Abramowitz & Stegun, 1965, chapters 8, 22). These polynomials are determined by the equations $P_0(x) = 1$, $P_1(x) = x$, and

$$P_{k+1}(x) = (k + 1)^{-1}[(2k + 1)\,x\,P_k(x) - k\,P_{k-1}(x)], \quad k \ge 1.$$

If $\inf(B_{Xj})$ is the infimum of $B_{Xj}$ and $\sup(B_{Xj})$ is the supremum of $B_{Xj}$ for $1 \le j \le 2$, $\inf(B_A)$ is the infimum of $B_A$, and $\sup(B_A)$ is the supremum of $B_A$, then it is relatively efficient for numerical work to let $\beta_{Xj} = [\inf(B_{Xj}) + \sup(B_{Xj})]/2$ be the midpoint of $B_{Xj}$ for $1 \le j \le 2$, to let $\beta_A = [\inf(B_A) + \sup(B_A)]/2$ be the midpoint of $B_A$, to let $\eta_{Xj} = [\sup(B_{Xj}) - \inf(B_{Xj})]/2$ be half the range of $B_{Xj}$ for $1 \le j \le 2$, to let $\eta_A = [\sup(B_A) - \inf(B_A)]/2$ be half the range of $B_A$, to let $u_{kXj}(x) = P_k((x - \beta_{Xj})/\eta_{Xj})$ for $1 \le j \le 2$, and to let $u_{kA}(x) = P_k((x - \beta_A)/\eta_A)$.

In applications considered in this report, for integers $r_{Xj} > 1$ and $r_{Aj} > 0$, $1 \le j \le 2$, the set $K_j$ consists of the $r_{Xj} + r_{Aj} + 1$ elements $(k_{Xj}, 0)$, $1 \le k_{Xj} \le r_{Xj}$, $(0, k_A)$, $1 \le k_A \le r_{Aj}$, and $(1, 1)$, so that the hierarchy principle holds and, for $1 \le j \le 2$, $Y_{XjK_j j}$ and $X_{ij}$ have the same $r_{Xj}$ initial moments, $Y_{AK_j j}$ and $A_{ij}$ have the same $r_{Aj}$ initial moments, and $Y_{XjK_j j}$ and $Y_{AK_j j}$ have the same correlation as $X_{ij}$ and $A_{ij}$.
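The Legendre recurrence and the rescaled basis functions $u_k(x) = P_k((x - \beta)/\eta)$ can be sketched in a few lines of Python. The function names `legendre` and `basis` and the interval endpoints used in the usage note are illustrative assumptions, not part of the report.

```python
def legendre(k, t):
    """Evaluate the Legendre polynomial P_k(t) by the three-term recurrence."""
    p_prev, p_curr = 1.0, t              # P_0(t) and P_1(t)
    if k == 0:
        return p_prev
    for m in range(1, k):
        # P_{m+1}(t) = [(2m + 1) t P_m(t) - m P_{m-1}(t)] / (m + 1)
        p_prev, p_curr = p_curr, ((2 * m + 1) * t * p_curr - m * p_prev) / (m + 1)
    return p_curr


def basis(k, x, lower, upper):
    """u_k(x) = P_k((x - beta) / eta) for an interval with the given bounds."""
    beta = (lower + upper) / 2.0         # midpoint of the interval
    eta = (upper - lower) / 2.0          # half the range of the interval
    return legendre(k, (x - beta) / eta)
```

For example, with a hypothetical interval from -0.5 to 40.5, `basis(1, 20.0, -0.5, 40.5)` is 0 at the midpoint, and `basis(k, 40.5, -0.5, 40.5)` is 1 for every degree because $P_k(1) = 1$. Rescaling to $[-1, 1]$ keeps the basis values bounded by 1 in absolute value, which helps numerical stability.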
Thus $Y_{XjK_j j}$ and $X_{ij}$ have the same mean and variance for each $j$, and $Y_{AK_j j}$ and $A_{ij}$ have the same mean and variance for each $j$. If $r_{Xj} > 2$, then $Y_{XjK_j j}$ and $X_{ij}$ have the same skewness coefficient. If $r_{Xj} > 3$, then $Y_{XjK_j j}$ and $X_{ij}$ have the same kurtosis coefficient. Similarly, if $r_{Aj} > 2$, then $Y_{AK_j j}$ and $A_{ij}$ have the same skewness coefficient.
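The link between matched moments and matched shape coefficients can be seen in a short Python sketch: the skewness coefficient depends only on the first three moments and the kurtosis coefficient only on the first four. The function name and the choice of the non-excess kurtosis convention are assumptions made for this illustration.

```python
def standardized_moments(values):
    """Return mean, variance, skewness, and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4 with divisor n.
    variance, third, fourth = (
        sum((v - mean) ** p for v in values) / n for p in (2, 3, 4))
    skewness = third / variance ** 1.5
    kurtosis = fourth / variance ** 2
    return mean, variance, skewness, kurtosis
```

Two distributions that agree on their first three moments therefore return the same skewness from this computation, and agreement on four moments also fixes the kurtosis.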

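If $r_{Aj} > 3$, then $Y_{AK_j j}$ and $A_{ij}$ have the same kurtosis coefficient. In the case of $r_{Xj} = r_{Aj} = 2$ in which Legendre polynomials are used, if $\theta_{\mathbf{k}}$ is negative for $\mathbf{k}$ equal to $(2, 0)$ or $(0, 2)$ and $\theta_{(1,1)}^2$ is less than $36\,\theta_{(2,0)}^2\theta_{(0,2)}^2$, then $\mathbf{Y}_{K_j j}$ corresponds to a bivariate normal random variable $\mathbf{Z} = (Z_{Xj}, Z_A)$ (Haberman, 2008b). The distribution of $\mathbf{Y}_{K_j j}$ is the same as the conditional distribution of $\mathbf{Z}$ given that $Z_{Xj}$ is in $B_{Xj}$ and $Z_A$ is in $B_A$ (Haberman, 2008b). One alternative choice of $K_j$ (Wang, 2008) has $K_j$ contain all pairs $(k_{Xj}, k_A)$ of nonnegative integers such that $k_{Xj}$ or $k_A$ is positive, $k_{Xj} \le r_{Xj}$, and $k_A \le r_A$.

In typical cases, $w_j$ is just the constant 1; however, in some cases with internal anchors, $A_{ij} \le X_{ij}$, $\inf(B_{Xj}) = \inf(B_A)$, and $\sup(B_A) < \sup(B_{Xj})$. In such a case, it may be reasonable to let

$$w_j(\mathbf{x}) = \frac{\exp[z_j(x_{Xj} - x_A)]}{1 + \exp[z_j(x_{Xj} - x_A)]}$$

for $\mathbf{x} = (x_{Xj}, x_A)$ in $B_{XjA}$, where $z_j$ is a positive real constant. As $z_j$ becomes large, $w_j(\mathbf{x})$ goes to 1 for $x_{Xj} > x_A$ and to 0 for $x_{Xj} < x_A$. In applications in this report, $z_j = 2$. This choice of $w_j$ and $z_j$ facilitates use of Gauss-Legendre integration (Haberman, 2008b).

2 Estimation of Parameters

The parameters $\boldsymbol{\theta}_{K_j j}$, the information criterion $I_{K_j j}$, the distribution functions $G_{XjK_j j}$ and $G_{AK_j j}$, and the conversion functions $e_{X1X2K_1K_2}$ and $e_{X2X1K_1K_2}$ are readily estimated (Gilula & Haberman, 2000; Haberman, 2008a, 2008b). For $\mathbf{k}$ in $K_j$, let $m_{\mathbf{k}j}$ be the sample mean $n_j^{-1}\sum_{i=1}^{n_j} u_{\mathbf{k}j}(\mathbf{X}_{ij})$, and let $\mathbf{m}_{K_j j}$ be the $K_j$-array with elements $m_{\mathbf{k}j}$, $\mathbf{k}$ in $K_j$. If the covariance matrix of $\mathbf{m}_{K_j j}$ is positive definite, then $\boldsymbol{\theta}_{K_j j}$ is estimated by the unique $K_j$-array $\hat{\boldsymbol{\theta}}_{K_j j}$ such that

$$\int_{B_{XjA}} \mathbf{u}_{K_j j}(\mathbf{x})\,\hat g_{K_j j}(\mathbf{x})\,d\mathbf{x} = \mathbf{m}_{K_j j},$$

$$\int_{B_{XjA}} \hat g_{K_j j}(\mathbf{x})\,d\mathbf{x} = 1,$$

and

$$\hat g_{K_j j}(\mathbf{x}) = \gamma_{K_j j}(\hat{\boldsymbol{\theta}}_{K_j j})\,w_j(\mathbf{x})\exp[\hat{\boldsymbol{\theta}}'_{K_j j}\mathbf{u}_{K_j j}(\mathbf{x})]$$

for $\mathbf{x}$ in $B_{XjA}$.

For $1 \le j \le 2$, as the sample size $n_j$ approaches $\infty$, $\hat{\boldsymbol{\theta}}_{K_j j}$ converges to $\boldsymbol{\theta}_{K_j j}$ with probability 1, and $n_j^{1/2}(\hat{\boldsymbol{\theta}}_{K_j j} - \boldsymbol{\theta}_{K_j j})$ converges in distribution to a multivariate normal random variable with zero mean and with covariance matrix

$$V_{K_j j} = C_{K_j j}^{-1} D_{K_j j} C_{K_j j}^{-1}$$

(Gilula & Haberman, 2000). Here $D_{K_j j}$ is the covariance matrix of $\mathbf{u}_{K_j j}(\mathbf{X}_{ij})$ and $C_{K_j j}$ is the covariance matrix of the $K_j$-array $\mathbf{u}_{K_j j}(\mathbf{Y}_{K_j j})$. Thus

$$C_{K_j j} = \int_{B_{XjA}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \boldsymbol{\mu}_{K_j j}][\mathbf{u}_{K_j j}(\mathbf{x}) - \boldsymbol{\mu}_{K_j j}]'\,g_{K_j j}(\mathbf{x})\,d\mathbf{x}.$$

The estimate of $C_{K_j j}$ is

$$\hat C_{K_j j} = \int_{B_{XjA}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \mathbf{m}_{K_j j}][\mathbf{u}_{K_j j}(\mathbf{x}) - \mathbf{m}_{K_j j}]'\,\hat g_{K_j j}(\mathbf{x})\,d\mathbf{x}.$$

The estimate of $D_{K_j j}$ is

$$\hat D_{K_j j} = n_j^{-1}\sum_{i=1}^{n_j} [\mathbf{u}_{K_j j}(\mathbf{X}_{ij}) - \mathbf{m}_{K_j j}][\mathbf{u}_{K_j j}(\mathbf{X}_{ij}) - \mathbf{m}_{K_j j}]'.$$

Thus $V_{K_j j}$ has estimate

$$\hat V_{K_j j} = \hat C_{K_j j}^{-1}\hat D_{K_j j}\hat C_{K_j j}^{-1}.$$

For any nonzero constant $K_j$-array $\mathbf{z}_{K_j}$, the estimated asymptotic standard deviation (EASD) of $\mathbf{z}'_{K_j}\hat{\boldsymbol{\theta}}_{K_j j}$ is

$$\hat\sigma(\mathbf{z}'_{K_j}\hat{\boldsymbol{\theta}}_{K_j j}) = n_j^{-1/2}(\mathbf{z}'_{K_j}\hat V_{K_j j}\mathbf{z}_{K_j})^{1/2},$$

so that $(\mathbf{z}'_{K_j}\hat{\boldsymbol{\theta}}_{K_j j} - \mathbf{z}'_{K_j}\boldsymbol{\theta}_{K_j j})/\hat\sigma(\mathbf{z}'_{K_j}\hat{\boldsymbol{\theta}}_{K_j j})$ converges in distribution to a standard normal random variable.

The minimum expected penalty $I_{K_j j}$ may be estimated by

$$\hat I_{K_j j} = -\log\gamma_{K_j j}(\hat{\boldsymbol{\theta}}_{K_j j}) - \hat{\boldsymbol{\theta}}'_{K_j j}\mathbf{m}_{K_j j}.$$

As the sample size $n_j$ increases, $\hat I_{K_j j}$ converges to $I_{K_j j}$ with probability 1 and $n_j^{1/2}(\hat I_{K_j j} - I_{K_j j})$ converges in distribution to a normal random variable with mean 0 and variance

$$\sigma^2(\log g_{K_j j}(\mathbf{X}_{ij})) = \boldsymbol{\theta}'_{K_j j}V_{K_j j}\boldsymbol{\theta}_{K_j j}.$$

The EASD of $\hat I_{K_j j}$ is then

$$\hat\sigma(\hat I_{K_j j}) = n_j^{-1/2}(\hat{\boldsymbol{\theta}}'_{K_j j}\hat V_{K_j j}\hat{\boldsymbol{\theta}}_{K_j j})^{1/2}$$

(Haberman, 2008b).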
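The moment-matching estimate and its sandwich-type EASD can be illustrated with a deliberately reduced Python sketch. It is a univariate, single-parameter simplification and not the bivariate fit used in the report: scores are assumed already rescaled to $[-1, 1]$, the only basis function is $u(x) = x$, the weight is 1, integrals use a midpoint rule, and the equation is solved by Newton-Raphson. All names are hypothetical.

```python
import math

N_GRID = 2000
H = 2.0 / N_GRID
GRID = [-1.0 + (i + 0.5) * H for i in range(N_GRID)]   # midpoint nodes on [-1, 1]


def model_mean_var(theta):
    """Mean and variance of u(x) = x under g(x) proportional to exp(theta * x)."""
    w = [math.exp(theta * t) for t in GRID]
    total = sum(w)
    mean = sum(t * wi for t, wi in zip(GRID, w)) / total
    var = sum((t - mean) ** 2 * wi for t, wi in zip(GRID, w)) / total
    return mean, var


def fit_theta(sample, tol=1e-10, max_iter=50):
    """Solve the moment equation E_theta(u) = m; return (theta_hat, easd)."""
    n = len(sample)
    m = sum(sample) / n                   # sample mean of u(X_i)
    theta = 0.0
    for _ in range(max_iter):
        mean, var = model_mean_var(theta)
        step = (m - mean) / var           # Newton step; var plays the role of C
        theta += step
        if abs(step) < tol:
            break
    _, c_hat = model_mean_var(theta)      # model variance of u at theta_hat
    d_hat = sum((x - m) ** 2 for x in sample) / n   # sample variance of u(X_i)
    v_hat = d_hat / (c_hat * c_hat)       # scalar version of C^-1 D C^-1
    return theta, math.sqrt(v_hat / n)
```

With several basis functions, the scalar division becomes a linear solve with the matrix $\hat C$ and the EASD of a linear combination uses the full matrix $\hat V$; the structure of the iteration is otherwise the same under these assumptions.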

For $1 \le j \le 2$, the distribution function $G_{XjK_j j}$ has estimate $\hat G_{XjK_j j}$ defined by

$$\hat G_{XjK_j j}(y) = \int_{B_{XjyA}} \hat g_{K_j j}(\mathbf{x})\,d\mathbf{x}$$

for $y$ in $B_{Xj}$, and the quantile function $R_{XjK_j j}$ has estimate $\hat R_{XjK_j j}$ defined by

$$\hat G_{XjK_j j}(\hat R_{XjK_j j}(p)) = p$$

for $0 < p < 1$. The distribution function $G_{AK_j j}$ has estimate $\hat G_{AK_j j}$ defined by

$$\hat G_{AK_j j}(y) = \int_{B_{XjAy}} \hat g_{K_j j}(\mathbf{x})\,d\mathbf{x}$$

for $y$ in $B_A$, and the quantile function $R_{AK_j j}$ has estimate $\hat R_{AK_j j}$ defined by

$$\hat G_{AK_j j}(\hat R_{AK_j j}(p)) = p$$

for $0 < p < 1$. Let

$$\mathbf{T}_{XjK_j j}(y) = \int_{B_{XjyA}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \boldsymbol{\mu}_{K_j j}]\,g_{K_j j}(\mathbf{x})\,d\mathbf{x}$$

and

$$\mathbf{T}_{AK_j j}(y) = \int_{B_{XjAy}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \boldsymbol{\mu}_{K_j j}]\,g_{K_j j}(\mathbf{x})\,d\mathbf{x}.$$

As the sample sizes $n_1$ and $n_2$ approach $\infty$, $\hat G_{XjK_j j}(y)$ converges to $G_{XjK_j j}(y)$ with probability 1 for $y$ in $B_{Xj}$, so that $\|\hat G_{XjK_j j} - G_{XjK_j j}\|$, the supremum of $|\hat G_{XjK_j j}(y) - G_{XjK_j j}(y)|$ for $y$ in $B_{Xj}$, converges to 0 with probability 1. Similarly, $\hat G_{AK_j j}(y)$ converges to $G_{AK_j j}(y)$ with probability 1 for $y$ in $B_A$, so that $\|\hat G_{AK_j j} - G_{AK_j j}\|$, the supremum of $|\hat G_{AK_j j}(y) - G_{AK_j j}(y)|$ for $y$ in $B_A$, converges to 0 with probability 1 (Haberman, 2008b). In addition, $[\hat G_{XjK_j j}(y) - G_{XjK_j j}(y)]/\sigma(\hat G_{XjK_j j}(y))$ converges in distribution to a normal random variable with mean 0 and variance 1 if

$$\sigma(\hat G_{XjK_j j}(y)) = n_j^{-1/2}\{[\mathbf{T}_{XjK_j j}(y)]'\,V_{K_j j}\,\mathbf{T}_{XjK_j j}(y)\}^{1/2},$$

and $[\hat G_{AK_j j}(y) - G_{AK_j j}(y)]/\sigma(\hat G_{AK_j j}(y))$ converges in distribution to a normal random variable with mean 0 and variance 1 if

$$\sigma(\hat G_{AK_j j}(y)) = n_j^{-1/2}\{[\mathbf{T}_{AK_j j}(y)]'\,V_{K_j j}\,\mathbf{T}_{AK_j j}(y)\}^{1/2}.$$

Similarly, $\hat R_{XjK_j j}(p)$ converges to $R_{XjK_j j}(p)$ with probability 1, and $[\hat R_{XjK_j j}(p) - R_{XjK_j j}(p)]/\sigma(\hat R_{XjK_j j}(p))$ converges in distribution to a normal random variable with mean 0 and variance 1 if

$$\sigma(\hat R_{XjK_j j}(p)) = [g_{XjK_j j}(R_{XjK_j j}(p))]^{-1}\,\sigma(\hat G_{XjK_j j}(R_{XjK_j j}(p)))$$

and $g_{XjK_j j}(y)$ is the marginal density corresponding to $G_{XjK_j j}$. Thus $g_{XjK_j j}(y)$ is the integral of $g_{K_j j}((y, x_A))$ over $x_A$ in $B_A$. The estimate $\hat R_{AK_j j}(p)$ converges to $R_{AK_j j}(p)$ with probability 1, and $[\hat R_{AK_j j}(p) - R_{AK_j j}(p)]/\sigma(\hat R_{AK_j j}(p))$ converges in distribution to a normal random variable with mean 0 and variance 1 if

$$\sigma(\hat R_{AK_j j}(p)) = [g_{AK_j j}(R_{AK_j j}(p))]^{-1}\,\sigma(\hat G_{AK_j j}(R_{AK_j j}(p)))$$

and $g_{AK_j j}(y)$ is the marginal density corresponding to $G_{AK_j j}$. Thus $g_{AK_j j}(y)$ is the integral of $g_{K_j j}((x_{Xj}, y))$ over $x_{Xj}$ in $B_{Xj}$.

Estimated asymptotic standard deviations may be derived by use of obvious substitutions of estimated parameters for actual parameters. Thus

$$\hat\sigma(\hat G_{XjK_j j}(y)) = n_j^{-1/2}\{[\hat{\mathbf{T}}_{XjK_j j}(y)]'\,\hat V_{K_j j}\,\hat{\mathbf{T}}_{XjK_j j}(y)\}^{1/2},$$

where

$$\hat{\mathbf{T}}_{XjK_j j}(y) = \int_{B_{XjyA}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \mathbf{m}_{K_j j}]\,\hat g_{K_j j}(\mathbf{x})\,d\mathbf{x},$$

$$\hat\sigma(\hat R_{XjK_j j}(p)) = [\hat g_{XjK_j j}(\hat R_{XjK_j j}(p))]^{-1}\,\hat\sigma(\hat G_{XjK_j j}(\hat R_{XjK_j j}(p))),$$

and $\hat g_{XjK_j j}(y)$ is the marginal density corresponding to $\hat G_{XjK_j j}$. In like manner,

$$\hat\sigma(\hat G_{AK_j j}(y)) = n_j^{-1/2}\{[\hat{\mathbf{T}}_{AK_j j}(y)]'\,\hat V_{K_j j}\,\hat{\mathbf{T}}_{AK_j j}(y)\}^{1/2},$$

where

$$\hat{\mathbf{T}}_{AK_j j}(y) = \int_{B_{XjAy}} [\mathbf{u}_{K_j j}(\mathbf{x}) - \mathbf{m}_{K_j j}]\,\hat g_{K_j j}(\mathbf{x})\,d\mathbf{x},$$

$$\hat\sigma(\hat R_{AK_j j}(p)) = [\hat g_{AK_j j}(\hat R_{AK_j j}(p))]^{-1}\,\hat\sigma(\hat G_{AK_j j}(\hat R_{AK_j j}(p))),$$

and $\hat g_{AK_j j}(y)$ is the marginal density corresponding to $\hat G_{AK_j j}$.

The estimate $\hat e_{X1X2K_1K_2}$ of the conversion function $e_{X1X2K_1K_2}$ from Form 1 to Form 2 satisfies

$$\hat e_{X1X2K_1K_2} = \hat e_{AX2K_2}(\hat e_{X1AK_1}),$$

where $\hat e_{AX2K_2} = \hat R_{X2K_2 2}(\hat G_{AK_2 2})$ and $\hat e_{X1AK_1} = \hat R_{AK_1 1}(\hat G_{X1K_1 1})$.
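The quantile step and its standard deviation can be sketched in Python under simple assumptions: the quantile is found by bisection on a strictly increasing continuous distribution function, and the standard deviation of the quantile estimate is the standard deviation of the distribution-function estimate divided by the marginal density at the quantile. The function names, the bracketing interval, and the tolerance are illustrative choices.

```python
def quantile(cdf, p, lower, upper, tol=1e-10):
    """Solve cdf(r) = p for r in (lower, upper) by bisection."""
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid      # the root lies to the right of mid
        else:
            hi = mid      # the root lies at or to the left of mid
    return 0.5 * (lo + hi)


def quantile_sd(pdf, r, cdf_sd_at_r):
    """Delta-method sd of a quantile estimate: sd of the CDF estimate / density."""
    return cdf_sd_at_r / pdf(r)
```

As a usage example with a made-up density $g(x) = (1 + x)/2$ on $(-1, 1)$, the distribution function is $(1 + x)^2/4$, so `quantile(cdf, 0.25, -1.0, 1.0)` is close to 0, and a distribution-function standard deviation of 0.01 at that point corresponds to a quantile standard deviation of 0.02.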

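The corresponding estimate $\hat e_{X2X1K_1K_2}$ of $e_{X2X1K_1K_2}$ satisfies

$$\hat e_{X2X1K_1K_2} = \hat e_{AX1K_1}(\hat e_{X2AK_2}),$$

where $\hat e_{AX1K_1} = \hat R_{X1K_1 1}(\hat G_{AK_1 1})$ and $\hat e_{X2AK_2} = \hat R_{AK_2 2}(\hat G_{X2K_2 2})$. As the sample sizes $n_1$ and $n_2$ become large, $\hat e_{X1X2K_1K_2}(y)$ converges with probability 1 to $e_{X1X2K_1K_2}(y)$ for $y$ in $B_{X1}$, and $\hat e_{X2X1K_1K_2}(y)$ converges with probability 1 to $e_{X2X1K_1K_2}(y)$ for $y$ in $B_{X2}$. In addition, $[\hat e_{X1X2K_1K_2}(y) - e_{X1X2K_1K_2}(y)]/\sigma(\hat e_{X1X2K_1K_2}(y))$ converges in distribution to a standard normal random variable if

$$\begin{aligned}
\sigma^2(\hat e_{X1X2K_1K_2}(y)) ={}& n_1^{-1}[\mathbf{T}_{X1K_1 1}(y) - \mathbf{T}_{AK_1 1}(e_{X1AK_1}(y))]'\,V_{K_1 1}\,[\mathbf{T}_{X1K_1 1}(y) - \mathbf{T}_{AK_1 1}(e_{X1AK_1}(y))] \\
&\times \left\{\frac{g_{AK_2 2}(e_{X1AK_1}(y))}{g_{AK_1 1}(e_{X1AK_1}(y))\,g_{X2K_2 2}(e_{X1X2K_1K_2}(y))}\right\}^2 \\
&+ n_2^{-1}\,\frac{[\mathbf{T}_{AK_2 2}(e_{X1AK_1}(y)) - \mathbf{T}_{X2K_2 2}(e_{X1X2K_1K_2}(y))]'\,V_{K_2 2}\,[\mathbf{T}_{AK_2 2}(e_{X1AK_1}(y)) - \mathbf{T}_{X2K_2 2}(e_{X1X2K_1K_2}(y))]}{[g_{X2K_2 2}(e_{X1X2K_1K_2}(y))]^2}.
\end{aligned}$$

In like manner, $[\hat e_{X2X1K_1K_2}(y) - e_{X2X1K_1K_2}(y)]/\sigma(\hat e_{X2X1K_1K_2}(y))$ converges in distribution to a standard normal random variable if

$$\begin{aligned}
\sigma^2(\hat e_{X2X1K_1K_2}(y)) ={}& n_2^{-1}[\mathbf{T}_{X2K_2 2}(y) - \mathbf{T}_{AK_2 2}(e_{X2AK_2}(y))]'\,V_{K_2 2}\,[\mathbf{T}_{X2K_2 2}(y) - \mathbf{T}_{AK_2 2}(e_{X2AK_2}(y))] \\
&\times \left\{\frac{g_{AK_1 1}(e_{X2AK_2}(y))}{g_{AK_2 2}(e_{X2AK_2}(y))\,g_{X1K_1 1}(e_{X2X1K_1K_2}(y))}\right\}^2 \\
&+ n_1^{-1}\,\frac{[\mathbf{T}_{AK_1 1}(e_{X2AK_2}(y)) - \mathbf{T}_{X1K_1 1}(e_{X2X1K_1K_2}(y))]'\,V_{K_1 1}\,[\mathbf{T}_{AK_1 1}(e_{X2AK_2}(y)) - \mathbf{T}_{X1K_1 1}(e_{X2X1K_1K_2}(y))]}{[g_{X1K_1 1}(e_{X2X1K_1K_2}(y))]^2}.
\end{aligned}$$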

The EASD of $\hat e_{X1X2K_1K_2}(y)$ satisfies

$$\begin{aligned}
\hat\sigma^2(\hat e_{X1X2K_1K_2}(y)) ={}& n_1^{-1}[\hat{\mathbf{T}}_{X1K_1 1}(y) - \hat{\mathbf{T}}_{AK_1 1}(\hat e_{X1AK_1}(y))]'\,\hat V_{K_1 1}\,[\hat{\mathbf{T}}_{X1K_1 1}(y) - \hat{\mathbf{T}}_{AK_1 1}(\hat e_{X1AK_1}(y))] \\
&\times \left\{\frac{\hat g_{AK_2 2}(\hat e_{X1AK_1}(y))}{\hat g_{AK_1 1}(\hat e_{X1AK_1}(y))\,\hat g_{X2K_2 2}(\hat e_{X1X2K_1K_2}(y))}\right\}^2 \\
&+ n_2^{-1}\,\frac{[\hat{\mathbf{T}}_{AK_2 2}(\hat e_{X1AK_1}(y)) - \hat{\mathbf{T}}_{X2K_2 2}(\hat e_{X1X2K_1K_2}(y))]'\,\hat V_{K_2 2}\,[\hat{\mathbf{T}}_{AK_2 2}(\hat e_{X1AK_1}(y)) - \hat{\mathbf{T}}_{X2K_2 2}(\hat e_{X1X2K_1K_2}(y))]}{[\hat g_{X2K_2 2}(\hat e_{X1X2K_1K_2}(y))]^2},
\end{aligned}$$

and the EASD of $\hat e_{X2X1K_1K_2}(y)$ satisfies

$$\begin{aligned}
\hat\sigma^2(\hat e_{X2X1K_1K_2}(y)) ={}& n_2^{-1}[\hat{\mathbf{T}}_{X2K_2 2}(y) - \hat{\mathbf{T}}_{AK_2 2}(\hat e_{X2AK_2}(y))]'\,\hat V_{K_2 2}\,[\hat{\mathbf{T}}_{X2K_2 2}(y) - \hat{\mathbf{T}}_{AK_2 2}(\hat e_{X2AK_2}(y))] \\
&\times \left\{\frac{\hat g_{AK_1 1}(\hat e_{X2AK_2}(y))}{\hat g_{AK_2 2}(\hat e_{X2AK_2}(y))\,\hat g_{X1K_1 1}(\hat e_{X2X1K_1K_2}(y))}\right\}^2 \\
&+ n_1^{-1}\,\frac{[\hat{\mathbf{T}}_{AK_1 1}(\hat e_{X2AK_2}(y)) - \hat{\mathbf{T}}_{X1K_1 1}(\hat e_{X2X1K_1K_2}(y))]'\,\hat V_{K_1 1}\,[\hat{\mathbf{T}}_{AK_1 1}(\hat e_{X2AK_2}(y)) - \hat{\mathbf{T}}_{X1K_1 1}(\hat e_{X2X1K_1K_2}(y))]}{[\hat g_{X1K_1 1}(\hat e_{X2X1K_1K_2}(y))]^2}.
\end{aligned}$$

3 Application

Equating was considered for the verbal, quantitative, writing, and English tests for two administrations. In each case, results are based on 1,414 examinees for the new form and 1,271 examinees for the old form. To avoid identification of the assessment, details concerning the test are omitted. Kernel equating with log-linear smoothing, equipercentile equating with log-linear smoothing, and equating by exponential families were compared. To facilitate comparison, current practices were followed in the following ways. Log-linear models used linear, quadratic, cubic, and quartic terms for main effects and a linear-by-linear interaction. In continuous exponential families, the corresponding model was used, so that each $K_j$ included the pair $(1, 1)$ and the pairs $(k, 0)$ and $(0, k)$ for $1 \le k \le 4$. Ranges of tests used in kernel equating or equipercentile equating were used to specify $c_A$, $d_A$, $c_{X1}$, $d_{X1}$, $c_{X2}$, and $d_{X2}$. The sets $B_{X1}$, $B_{X2}$, and $B_A$ were selected to have $\inf(B_{Xj}) = c_{Xj} - 0.5$ and $\sup(B_{Xj}) = d_{Xj} + 0.5$ for $1 \le j \le 2$, $\inf(B_A) = c_A - 0.5$, and $\sup(B_A) = d_A + 0.5$. Anchors were internal. Bandwidth selection in kernel equating was based on the criterion in von Davier et al. (2004, p. 63) with $K = 1$.

Bandwidths used are found in Table 1. Results for conversion of the new form to the base form are summarized in Tables 1–5 and in Figures 1–8. Note that conversions are not provided outside of the observed range of raw scores.
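The index set used in the application, with main-effect terms up to degree 4 and a single linear-by-linear interaction, can be written out with a short Python sketch. The helper names `index_set` and `satisfies_hierarchy` are hypothetical, and the hierarchy check follows the rule stated in Section 1.

```python
def index_set(r_x, r_a):
    """Pairs (k, 0) for 1 <= k <= r_x, (0, k) for 1 <= k <= r_a, and (1, 1)."""
    pairs = [(k, 0) for k in range(1, r_x + 1)]
    pairs += [(0, k) for k in range(1, r_a + 1)]
    pairs.append((1, 1))
    return pairs


def satisfies_hierarchy(pairs):
    """True if every componentwise-smaller nonzero pair of a member is a member."""
    members = set(pairs)
    return all(
        (h_x, h_a) in members
        for (k_x, k_a) in pairs
        for h_x in range(k_x + 1)
        for h_a in range(k_a + 1)
        if (h_x, h_a) != (0, 0)
    )
```

Calling `index_set(4, 4)` gives the nine pairs described above, that is, $r_{Xj} + r_{Aj} + 1$ elements with $r_{Xj} = r_{Aj} = 4$, and `satisfies_hierarchy` confirms that this set obeys the hierarchy rule.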

Table 1
Bandwidths Used in Kernel Equating
Columns: Verbal, Quantitative, Writing, English. Rows: New form, New anchor, Old form, Old anchor. (Table values are not available in this copy.)

Table 2
Equating Results for Verbal Test
Columns: Score; Exponential (Conversion, EASD); Kernel (Conversion, EASD); Equipercentile (Conversion, EASD). (Table values are not available in this copy.)
Note. EASD = estimated asymptotic standard deviation.

Table 3
Equating Results for Quantitative Test
Columns: Score; Exponential (Conversion, EASD); Kernel (Conversion, EASD); Equipercentile (Conversion, EASD). (Table values are not available in this copy.)
Note. EASD = estimated asymptotic standard deviation.

Table 4
Equating Results for Writing Test
Columns: Score; Exponential (Conversion, EASD); Kernel (Conversion, EASD); Equipercentile (Conversion, EASD). (Table values are not available in this copy.)
Note. EASD = estimated asymptotic standard deviation.

Table 5
Equating Results for English Test
Columns: Score; Exponential (Conversion, EASD); Kernel (Conversion, EASD); Equipercentile (Conversion, EASD). (Table values are not available in this copy.)

4 Conclusions

On the whole, results for all methods are quite similar. Differences are most noticeable for the highest and lowest scores. The results do illustrate an occasional difficulty with kernel equating based on the normal density: the equated score can be somewhat beyond the range of possible scores. This issue does not arise with continuous exponential families or equipercentile equating. It can also be avoided by use of alternate density functions (Lee & von Davier, 2008).

The asymptotic standard deviations for the kernel method do not consider the effects of selection of bandwidth on the basis of data. In the equipercentile case, the discontinuities in the fitted density function are not considered. These issues do not arise with continuous exponential families.

The data do not provide a compelling case in favor of or against any of the alternative equating methods. Current implementations of kernel equating with log-linear smoothing and equipercentile equating with log-linear smoothing assume that the scores to be equated are integers, as is the case with the operational test examined. Continuous exponential families can be applied to scores that are arbitrary real numbers; however, this feature does not have direct impact in this example. Although both approaches require selection of a polynomial, equating by continuous exponential families has an advantage over kernel equating in that a bandwidth need not be selected.

The exact method of adjustment in continuous exponential families for internal rather than external anchors had negligible impact for the data examined. Virtually the same results are obtained if the weight function is simply set to 1.

Results here are for chained equating rather than for post-stratified equating. The authors plan to consider the latter approach in a separate report.

Figure 1. Verbal Results: Continuous Exponential Case. Panels: Comparison of Equating Methods - Verbal (series Exp, KE, Eq); Exponential Families Equating Method - Verbal (series Exp, Exp_2SE-, Exp_2SE). Horizontal axis: Score.

Figure 2. Verbal Results: Other Methods. Panels: Kernel Equating Method - Verbal (series KE, KE_2SE-, KE_2SE); Equipercentile Equating Method - Verbal (series Eq, Eq_2SE-, Eq_2SE). Horizontal axis: Score.

Figure 3. Quantitative Results: Continuous Exponential Case. Panels: Comparison of Equating Methods - Quantitative (series Exp, KE, Eq); Exponential Families Equating Method - Quantitative (series Exp, Exp_2SE-, Exp_2SE). Horizontal axis: Score.

Figure 4. Quantitative Results: Other Methods. Panels: Kernel Equating Method - Quantitative (series KE, KE_2SE-, KE_2SE); Equipercentile Equating Method - Quantitative (series Eq, Eq_2SE-, Eq_2SE). Horizontal axis: Score.

Figure 5. Writing Results: Continuous Exponential Case. Panels: Comparison of Equating Methods - Writing (series Exp, KE, Eq); Exponential Families Equating Method - Writing (series Exp, Exp_2SE-, Exp_2SE). Horizontal axis: Score.

Figure 6. Writing Results: Other Methods. Panels: Kernel Equating Method - Writing (series KE, KE_2SE-, KE_2SE); Equipercentile Equating Method - Writing (series Eq, Eq_2SE-, Eq_2SE). Horizontal axis: Score.

Figure 7. English Results: Continuous Exponential Case. Panels: Comparison of Equating Methods - English (series Exp, KE, Eq); Exponential Families Equating Method - English (series Exp, Exp_2SE-, Exp_2SE). Horizontal axis: Score.

Figure 8. English Results: Other Methods. Panels: Kernel Equating Method - English (series KE, KE_2SE-, KE_2SE); Equipercentile Equating Method - English (series Eq, Eq_2SE-, Eq_2SE). Horizontal axis: Score.

In general, it appears that continuous exponential families can be applied to nonequivalent groups with anchor tests. This approach is competitive with kernel approaches and approaches with equipercentile equating. The principal potential gain from use of continuous exponential families is achieved when the number of possible combinations of scores is very large.

References

Abramowitz, M., & Stegun, I. A. (1965). Handbook of mathematical functions. New York, NY: Dover.

Gilula, Z., & Haberman, S. J. (2000). Density approximation by summary statistics: An information-theoretic approach. Scandinavian Journal of Statistics, 27,

Haberman, S. J. (2008a). Continuous exponential families: An equating tool (ETS Research Report No. RR-08-05). Princeton, NJ: ETS.

Haberman, S. J. (2008b). Linking with continuous exponential families: Single-group designs (ETS Research Report No. RR-08-61). Princeton, NJ: ETS.

Lee, Y.-H., & von Davier, A. (2008). Comparing alternative kernels for the kernel method of test equating: Gaussian, logistic, and uniform kernels (ETS Research Report No. RR-08-12). Princeton, NJ: ETS.

von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004). The kernel method of test equating. New York, NY: Springer.

Wang, T. (2008). The continuized log-linear method: An alternative to the kernel method of continuization in test equating. Applied Psychological Measurement, 32,
