Appendix A Installing and Reading Data in R


A.1 Installing R

R (R Core Team 2016) is a free software environment that runs on Windows, MacOS and Unix platforms. R can be downloaded from the Comprehensive R Archive Network (CRAN) webpage (see footnote 1). There are precompiled binary distributions of the base system and contributed packages for different platforms. When installing R for the first time, these distributions can be downloaded from CRAN for the following platforms:

1. R for Linux
2. R for MacOS
3. R for Windows

All of the examples given in this book have been created and tested in R on Windows. To install R, run the executable file (R win.exe) by double clicking on it, which will start the R for Windows setup wizard.

Footnote 1: It is recommended to use a mirror site near the place where the program is being downloaded. A full list of mirrors is available on the CRAN webpage.

Springer International Publishing AG 2017. J. González, M. Wiberg, Applying Test Equating Methods, Methodology of Educational Measurement and Assessment.

A.1.1 RStudio

After installing R, it is possible to install RStudio, which is a graphical user interface. RStudio provides the usual R console and has the advantage of providing a history/workspace window with common data set features such as load/save/import. There is also a window with easy access to files and a list of available packages. The Plots tab includes easy export to a PDF file or a GIF image, as well as copy features that make it easy to insert plots into Word documents.

A.2 Installing and Loading R Packages

Once R has been installed, it is time to open the program and start working. To install contributed R packages that do not come with the initial distribution, one can go to the menu Packages/Install package(s), choose the package of interest (e.g. equate), and press OK to install the package. To use the package, one can either go to the menu and select Packages/Load package or simply type in the R command prompt

> library(equate)

Using the library command gives access to all functions within that package. When help is needed with an R function or an R package, the user can simply write help() in the R command prompt. For instance, if help is needed on the use of the equate() function, one can write

> help(equate)

It should be noted that there are a number of books, online help resources, and online books (simply search for "R Statistics book") where information about R can be obtained. The examples shown in this book were executed under specific versions of the R packages equate, SNSequate, kequate, equateIRT, ltm, and mirt.

A.3 Working Directory and Accessing Data

It is convenient to choose a working directory and to store everything connected to a project in that folder. This can be done in two ways: either by right clicking on the R icon on the desktop and setting the directory in which R should always start, or from inside R. In the R command prompt, the current working directory can be identified by writing

> getwd()

and to set a working directory we can write

> setwd("c:/users/documents/myrproject")

A.4 Loading Data of Different File Formats

In Chap. 2 we described how the data sets used in this book can be loaded. If the user has data in other file formats, it is possible to use other functions such as read.table(). Assume that the ADM data described in Chap. 2 are in a file format other than .rda. The following examples show how data in different formats can be read into R. The data sets used in the examples can be downloaded from the book's webpage.

First, if we want to read in a text file (.txt) that is tab-separated (sep="\t") and has headers, we can write the following

> ADMt <- read.table(file="ADMt.txt",sep="\t",header=TRUE)

If the data are in an Excel file, an easy way to read the data into R is to save the file as either a .dat file or a .csv file. The following commands can be used after saving the files as either of those file types

> ADMc1 <- read.csv("ADMc.csv",sep=";",header=TRUE)
> ADMc2 <- read.table("ADMc.csv",sep=";",header=TRUE)
> ADMd <- read.table("ADMd.dat",header=TRUE)

The same kind of data can be read directly from the Excel file (.xls) if the R package XLConnect is installed. In this example Sheet1 in the Excel file is named ADMsheet. Reading the data into R can be done by writing

> library(XLConnect)
> ADMe <- loadWorkbook("ADMe.xls")
> ADMee <- readWorksheet(ADMe, sheet="ADMsheet")

Finally, we can read in an SPSS file (.sav) using the R package foreign

> library(foreign)
> ADMs <- read.spss("ADMs.sav",to.data.frame=TRUE)

We can use any of these files with the R package equate to obtain the sum scores for the verbal section of the admissions test as described in Chap. 2.
Each line of the following code yields the same result as the object verb.y from Chap. 2.

> ADMtSum <- apply(ADMt[,41:120],1,sum)
> ADMc1Sum <- apply(ADMc1[,41:120],1,sum)
> ADMc2Sum <- apply(ADMc2[,41:120],1,sum)
> ADMdSum <- apply(ADMd[,41:120],1,sum)
> ADMeSum <- apply(ADMee[,41:120],1,sum)
> ADMsSum <- apply(ADMs[,41:120],1,sum)

One way to ensure that data are read correctly into R is to inspect the data set visually. To perform a visual inspection, we can, for example, write

> View(ADMt)
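The reading steps in Sect. A.4 can be tried without the book's files. The following is a minimal sketch that assumes a small made-up data set (the object toy and its columns item1 to item6 are hypothetical stand-ins, not the ADM data): it writes the data in two of the formats discussed, reads them back, and computes sum scores over a block of columns.

```r
# Hypothetical toy item responses (0/1); NOT the ADM data from the book.
set.seed(1)
toy <- as.data.frame(matrix(rbinom(5 * 6, size = 1, prob = 0.6), nrow = 5))
names(toy) <- paste0("item", 1:6)

# Write the same data as a tab-separated and a semicolon-separated file.
tab_file <- tempfile(fileext = ".txt")
csv_file <- tempfile(fileext = ".csv")
write.table(toy, tab_file, sep = "\t", row.names = FALSE, quote = FALSE)
write.table(toy, csv_file, sep = ";", row.names = FALSE, quote = FALSE)

# Read them back; 'sep' must match the delimiter actually used in the file.
toy_t <- read.table(tab_file, sep = "\t", header = TRUE)
toy_c <- read.csv(csv_file, sep = ";", header = TRUE)

# Sum scores over a block of item columns (here columns 3 to 6).
sum_t <- apply(toy_t[, 3:6], 1, sum)
sum_c <- rowSums(toy_c[, 3:6])   # same result, usually faster than apply()

# Inspect the structure as a non-interactive alternative to View().
str(toy_t)
```

If the wrong separator is passed, read.table() typically returns a single column, so checking dim() or str() right after reading is a cheap safeguard.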

Reference

R Core Team (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

Appendix B Additional Material

In this appendix we give more details on definitions, derivations, and other relevant material covered throughout the chapters of the book.

B.1 Design Functions

As seen earlier in the book, score data can be modeled separately (EG design) or jointly (SG, CB, NEAT, or NEC designs). The design function (DF) is used to map the score probabilities (either univariate or bivariate) into $r$ and $s$. Thus, different DFs are used for different equating designs (von Davier et al. 2004). For instance, under the EG design, because two independent vectors of scores are obtained, the DF is just the identity function, i.e.,

\[
\begin{pmatrix} r \\ s \end{pmatrix} = DF(r, s) =
\begin{pmatrix} I_J & 0 \\ 0 & I_K \end{pmatrix}
\begin{pmatrix} r \\ s \end{pmatrix}. \tag{B.1}
\]

The situation is different, however, when bivariate distributions are considered (e.g., under the SG, CB, NEAT, and NEC designs). Define $P$ as the $J \times K$ matrix with entries $p_{jk} = \Pr(X = x_j, Y = y_k)$. Further, define the vectorized version of $P$ as

\[
v(P) = \begin{pmatrix} p_1 \\ \vdots \\ p_K \end{pmatrix}, \tag{B.2}
\]

where $p_k$ is the $k$-th column of $P$. In the SG design, the DF is defined as

\[
\begin{pmatrix} r \\ s \end{pmatrix} = DF(P) =
\begin{pmatrix} M \\ N \end{pmatrix} v(P), \tag{B.3}
\]

where $M = (I_J, \ldots, I_J)$ is a $J \times KJ$ matrix and

\[
N = \begin{pmatrix}
1_J^t & 0_J^t & \cdots & 0_J^t \\
0_J^t & 1_J^t & \cdots & 0_J^t \\
\vdots & \vdots & \ddots & \vdots \\
0_J^t & 0_J^t & \cdots & 1_J^t
\end{pmatrix} \tag{B.4}
\]

is a $K \times KJ$ matrix.

The CB design can be obtained by using two independent SG designs. In this case, the DF is defined as

\[
\begin{pmatrix} r \\ s \end{pmatrix} = DF(P_{(12)}, P_{(21)}) =
\begin{pmatrix} w_X M & (1 - w_X) M \\ (1 - w_Y) N & w_Y N \end{pmatrix}
\begin{pmatrix} v(P_{(12)}) \\ v(P_{(21)}) \end{pmatrix}, \tag{B.5}
\]

where the weights $w_X$ and $w_Y$ satisfy $0 \le w_X, w_Y \le 1$.

DFs are different under the NEAT design depending on whether CE or PSE is used. For CE, the DF is defined as

\[
\begin{pmatrix} r_P \\ t_P \\ t_Q \\ s_Q \end{pmatrix} = DF(P, Q) =
\begin{pmatrix} DF_P(P) \\ DF_Q(Q) \end{pmatrix} =
\begin{pmatrix} M_P & 0 \\ N_P & 0 \\ 0 & N_Q \\ 0 & M_Q \end{pmatrix}
\begin{pmatrix} v(P) \\ v(Q) \end{pmatrix}. \tag{B.6}
\]

The DF for NEAT PSE is given by

\[
\begin{pmatrix} r \\ s \end{pmatrix} = DF(P, Q) =
\begin{pmatrix}
\sum_l \left[ w + (1 - w) \dfrac{\sum_k q_{kl}}{\sum_j p_{jl}} \right] p_l \\[2ex]
\sum_l \left[ (1 - w) + w \dfrac{\sum_j p_{jl}}{\sum_k q_{kl}} \right] q_l
\end{pmatrix}, \tag{B.7}
\]

where $p_l = (p_{1l}, p_{2l}, \ldots, p_{Jl})^t$ and $q_l = (q_{1l}, q_{2l}, \ldots, q_{Kl})^t$. Finally, the DF for the NEC design (Wiberg and Bränberg 2015) is given by

\[
\begin{pmatrix} r \\ s \end{pmatrix} = DF(P, Q) =
\begin{pmatrix}
\sum_m \left[ w + (1 - w) \dfrac{t_{Qm}}{t_{Pm}} \right] p_m \\[2ex]
\sum_m \left[ (1 - w) + w \dfrac{t_{Pm}}{t_{Qm}} \right] q_m
\end{pmatrix}, \tag{B.8}
\]

where $p_m$ is the $m$th column of $P$, $q_m$ is the $m$th column of $Q$, and the summation is over all the $m$ possible combinations of the values of the covariates, in other words a summation over all columns in the matrices.
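The design functions of Sect. B.1 are plain matrix operations, which the following minimal R sketch illustrates for the SG design (Eqs. B.3 and B.4) and for NEAT PSE (Eq. B.7). The probability matrices and the helper name df_pse are made up for illustration and are not taken from any package.

```r
# --- SG design: r and s are the margins of the J x K matrix P ---------------
J <- 3; K <- 2
P <- matrix(c(0.10, 0.20, 0.15,    # column p_1
              0.25, 0.20, 0.10),   # column p_2
            nrow = J, ncol = K)    # hypothetical joint probabilities
vP <- as.vector(P)                 # v(P): columns of P stacked (Eq. B.2)

M <- kronecker(t(rep(1, K)), diag(J))   # (I_J, ..., I_J), a J x KJ matrix
N <- kronecker(diag(K), t(rep(1, J)))   # block rows of 1_J^t, a K x KJ matrix

r_sg <- as.vector(M %*% vP)   # equals rowSums(P)
s_sg <- as.vector(N %*% vP)   # equals colSums(P)

# --- NEAT PSE: P is J x L (X with anchor), Q is K x L (Y with anchor) -------
df_pse <- function(P, Q, w) {
  tP <- colSums(P)            # anchor score probabilities in population P
  tQ <- colSums(Q)            # anchor score probabilities in population Q
  list(r = as.vector(P %*% (w + (1 - w) * tQ / tP)),
       s = as.vector(Q %*% ((1 - w) + w * tP / tQ)))
}

P_neat <- matrix(1:12, nrow = 4, ncol = 3); P_neat <- P_neat / sum(P_neat)
Q_neat <- matrix(c(2, 1, 3, 4, 2, 2, 1, 5, 3), nrow = 3); Q_neat <- Q_neat / sum(Q_neat)
pse <- df_pse(P_neat, Q_neat, w = 0.5)

print(rbind(r_sg, rowSums(P)))
print(c(sum(pse$r), sum(pse$s)))   # both should equal 1
```

Writing M and N as Kronecker products keeps the code close to Eq. (B.4); in practice rowSums() and colSums() give the same vectors directly.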

B.2 C Matrices

The polynomial log-linear models described earlier in the book can be written in matrix form as follows

\[
\log(r) = \alpha + u + B^t \beta .
\]

If $\hat\beta$ is the maximum likelihood estimator (MLE) of $\beta$, the estimator of $r_j$ is $\hat r_j = r_j(\hat\beta)$, i.e. the MLE or fitted value of $r_j$. It is assumed that the estimators $\hat r$ and $\hat s$ are obtained separately (independently), so that

\[
\operatorname{Cov}(\hat r, \hat s) = \Sigma_{\hat r, \hat s} = 0 .
\]

The following theorem establishes a computationally easy way to calculate $\Sigma_{\hat r}$ (and, analogously, $\Sigma_{\hat s}$).

Theorem. If $\hat r$ is the MLE of a log-linear model for $r$, the estimated covariance matrix $\Sigma_{\hat r} = \operatorname{Cov}(\hat r)$ can be obtained as

\[
\Sigma_{\hat r} = C_r C_r^t ,
\]

where $C_r$ is a $J \times T_r$ matrix,

\[
C_r = N^{-1/2} D_{\sqrt{\hat r}}\, Q .
\]

The diagonal matrix $D_{\sqrt{\hat r}}$ has elements $\sqrt{\hat r_j}$. Also, $Q$ is a $J \times T_r$ orthogonal matrix that comes from the following QR factorization

\[
\left[ D_{\sqrt{\hat r}} - \sqrt{\hat r}\, \hat r^t \right] B^t = QR ,
\]

where $Q$ is a $J \times T_r$ matrix with orthogonal columns and $R$ is a $T_r \times T_r$ upper triangular matrix.

A proof and more details can be found in Holland and Thayer (1989) and von Davier et al. (2004).

B.3 Calculation of the SEE

To calculate the SEE, start by letting $R$ and $S$ be the vectors of the pre-smoothed score distributions obtained under any of the data collection designs. If we assume that they are estimated independently, we can write the covariances as (see Sect. B.2)

\[
\operatorname{Cov}\begin{pmatrix} \hat R \\ \hat S \end{pmatrix} =
\begin{pmatrix} C_R C_R^t & 0 \\ 0 & C_S C_S^t \end{pmatrix} = C C^t , \tag{B.9}
\]

where

\[
C = \begin{pmatrix} C_R & 0 \\ 0 & C_S \end{pmatrix}. \tag{B.10}
\]

The pre-smoothed score distributions obtained in kernel equating (step 1) can then be transformed into $r$ and $s$ through the DF given in Sect. B.1 for the different designs. The Jacobian of the DF is defined as

\[
J_{DF} = \begin{pmatrix}
\dfrac{\partial r}{\partial R} & \dfrac{\partial r}{\partial S} \\[2ex]
\dfrac{\partial s}{\partial R} & \dfrac{\partial s}{\partial S}
\end{pmatrix}. \tag{B.11}
\]

Further, the Jacobian of the equating transformation in Eq. (4.10) is defined by

\[
J_{\varphi} = \left( \frac{\partial \varphi}{\partial r}, \frac{\partial \varphi}{\partial s} \right). \tag{B.12}
\]

Note that if $\hat R$ and $\hat S$ are approximately normally distributed with means $R$ and $S$, respectively, and variances given in Eq. (B.9), then

\[
SEE_Y(x) = \left\| J_{\varphi}\, J_{DF}\, C \right\| , \tag{B.13}
\]

where $\| \cdot \|$ is the Euclidean norm of the vector. For more details, refer to von Davier et al. (2004).

B.4 Score Distributions Under the NEAT Design

To formally derive the score distributions of $X$ and $Y$ on $T$, let us use $f_{XP}(x \mid a)$ and $f_{XQ}(x \mid a)$ to denote the conditional distributions of $X$ scores on $P$ and $Q$, respectively. Similarly, $f_{YP}(y \mid a)$ and $f_{YQ}(y \mid a)$ are the conditional distributions for $Y$ scores on $P$ and $Q$, respectively. Further, because both samples (and thus both populations) take the anchor test $A$, we define $f_{AP}(a)$ and $f_{AQ}(a)$ as the (marginal) distributions for anchor scores in $P$ and $Q$, respectively. Following the definition of a synthetic population, we have that

\[
f_{AT}(a) = w_P f_{AP}(a) + w_Q f_{AQ}(a),
\]

and thus the score distributions of $X$ and $Y$ on $T$ can be obtained as

\[
f_{XT}(x) = w_P \int f_{XP}(x \mid a) f_{AP}(a)\, da + w_Q \int f_{XQ}(x \mid a) f_{AQ}(a)\, da , \tag{B.14}
\]

\[
f_{YT}(y) = w_P \int f_{YP}(y \mid a) f_{AP}(a)\, da + w_Q \int f_{YQ}(y \mid a) f_{AQ}(a)\, da . \tag{B.15}
\]
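For discrete scores the integrals in Eqs. (B.14) and (B.15) become sums over anchor scores, which can be sketched in a few lines of R. All numbers below are made up; in particular, the conditional distribution of X given A in population Q is simply set equal to the one in P, although under a NEAT design it cannot be observed.

```r
# Hypothetical anchor scores a = 0..3 and X scores x = 0..4.
a_vals <- 0:3
x_vals <- 0:4
f_AP <- c(0.1, 0.2, 0.4, 0.3)    # anchor distribution in population P
f_AQ <- c(0.3, 0.3, 0.3, 0.1)    # anchor distribution in population Q
w_P <- 0.5
w_Q <- 1 - w_P

# Conditional distributions f(x | a): one column per anchor score.
# The binomial form is an arbitrary choice made only for this illustration.
cond_x_given_a <- function() {
  sapply(a_vals, function(a) dbinom(x_vals, size = 4, prob = plogis(a - 1.5)))
}
f_XP_cond <- cond_x_given_a()
f_XQ_cond <- cond_x_given_a()    # ASSUMED equal to the conditional in P

# Synthetic population: anchor distribution and X distribution (discrete B.14).
f_AT <- w_P * f_AP + w_Q * f_AQ
f_XT <- w_P * as.vector(f_XP_cond %*% f_AP) +
        w_Q * as.vector(f_XQ_cond %*% f_AQ)

# With equal conditionals, the mixture reduces to weighting f_XP(x | a) by f_AT(a).
f_XT_alt <- as.vector(f_XP_cond %*% f_AT)

print(round(cbind(x = x_vals, f_XT, f_XT_alt), 4))
```

The two columns agree only because the conditional distributions were assumed equal in P and Q; with different conditionals the second form would no longer reproduce Eq. (B.14).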

Because under a NEAT design test form X is only administered to population P and test form Y is only administered to population Q, the conditional distributions $f_{XQ}(x \mid a)$ and $f_{YP}(y \mid a)$ cannot be estimated from the collected data (see footnote 1). However, note that if we assume that the conditional distributions of $X \mid A$ are the same in both populations P and Q, and analogously for that of $Y \mid A$, it can easily be shown that Eqs. (B.14) and (B.15) become

\[
f_{XT}(x) = \int f_{XP}(x \mid a) f_{AT}(a)\, da ,
\qquad
f_{YT}(y) = \int f_{YQ}(y \mid a) f_{AT}(a)\, da ,
\]

which can be estimated from the observed quantities. The CDFs $F_{XT}(x)$ and $F_{YT}(y)$ can be obtained by cumulating $f_{XT}(x)$ and $f_{YT}(y)$ over values of $X$ and $Y$, respectively. With these quantities, the equating transformation $\varphi_T(x) = F_{YT}^{-1}(F_{XT}(x))$ is obtained. A different approach for the NEAT design that does not make use of a synthetic population for equating can be found in San Martín and González (2017).

Footnote 1: Technically, the score probability distributions are not identifiable (see San Martín and González 2017).

B.5 The Lord-Wingersky Algorithm

Let $X = \sum_{j=1}^{J} X_{ij}$ be the sum score of a test taker with ability $\theta$. If we assume that test takers with a given ability $\theta$ correctly answer each of the $J$ items of a test with probability $p_j$ ($j = 1, \ldots, J$), then the conditional distribution of $X$ for a given $\theta$ is called the compound binomial (Lord and Novick 1968) and is defined as

\[
\Pr(X = x \mid \theta) = f(x \mid \theta) =
\sum_{\sum_j x_j = x} \; \prod_{j=1}^{J} p_j^{x_j} q_j^{1 - x_j} , \tag{B.16}
\]

where $p_j$ is the probability of a correct answer to item $j$ given $\theta$ and the item parameters $\omega_j$ (as defined in Sect. 5.1), and $q_j = 1 - p_j$. The direct use of Eq. (B.16) for the calculation of score probabilities is computationally demanding, especially as the number of items in the test increases. An alternative to the direct calculation of score probabilities using the compound binomial distribution has traditionally been the use of a recursion formula given by Lord and Wingersky (1984). The formula reads as follows

\[
f_r(x \mid \theta) =
\begin{cases}
f_{r-1}(x \mid \theta)\, q_r , & x = 0, \qquad \text{(B.17)} \\
q_r f_{r-1}(x \mid \theta) + p_r f_{r-1}(x - 1 \mid \theta) , & 0 < x < r, \qquad \text{(B.18)} \\
f_{r-1}(x - 1 \mid \theta)\, p_r , & x = r, \qquad \text{(B.19)}
\end{cases}
\]

where $f_r(x \mid \theta)$ is the distribution of sum scores over the first $r$ items for test takers with ability $\theta$, and $p_r$ is the probability $p_j$, for $j = r$, defined from the chosen IRT model. Thus, given item parameter estimates $\hat\omega_j$ and the corresponding probabilities $\hat p_j$, the above formula can be used to obtain the observed score distribution for test takers of a given ability. Alternatives to the Lord-Wingersky algorithm can be found in González et al. (2016), and an extension of this algorithm to the case of polytomous score data can be found in Thissen et al. (1995).

B.6 Other Justifications for Local Equating

There are other motivations and justifications of the local equating method that have been discussed in the literature. For instance, local equating can be justified from a matching and conditioning point of view, as described in Wiberg and van der Linden (2011). Matching on a variable in observed-score equating results in different pairs of distributions of the observed scores X and Y for each value of the matching variable. This means that the focus is shifted from the marginal distributions of X and Y to their conditional distributions given the values of the matching variables. By conditioning on information about the test takers' ability, the equating becomes less dependent on the ability distribution of the population of test takers. If we could condition on the test takers' true abilities, the equating would be independent of the population. Matching and conditioning were used in the equating literature before local equating was developed, but only in the context of adjusting a population distribution so that the population could be used with a single equating transformation (Cook and Petersen 1987; Dorans 1990; Liou et al. 2001; Livingston et al. 1990; Wright and Dorans 1993). In the history of test theory, the one-size-fits-all idea has also been applied to the standard error of measurement.
Nowadays the single standard error of measurement is often replaced by the conditional standard deviation of the observed scores, given the ability

\[
\left[ \operatorname{Var}(X \mid \theta) \right]^{1/2} . \tag{B.20}
\]

Because local equating is based on conditional distributions of observed scores, the same argument can be made for using local equating as can be made for using the standard error of measurement (van der Linden 2011, 2013). Using different standard errors for different test takers is nowadays well accepted. In line with this, one should accept that different equating transformations might be used for different test takers in observed score equating (van der Linden 2006a,b; van der Linden and Wiberg 2010).
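The conditional standard deviation in Eq. (B.20) can be computed from the conditional score distribution that the Lord-Wingersky recursion of Sect. B.5 (Eqs. B.17 to B.19) provides. The sketch below is a minimal R version; the function name lord_wingersky, the 2PL response function, and the item parameter values are assumptions made for illustration only.

```r
# Lord-Wingersky recursion: distribution of the sum score over all items,
# given a vector p of correct-response probabilities at a fixed ability.
lord_wingersky <- function(p) {
  f <- c(1 - p[1], p[1])                 # after item 1: Pr(X = 0), Pr(X = 1)
  for (r in seq_along(p)[-1]) {
    q_r <- 1 - p[r]
    # x = 0:      f_{r-1}(0) q_r
    # 0 < x < r:  q_r f_{r-1}(x) + p_r f_{r-1}(x - 1)
    # x = r:      f_{r-1}(r - 1) p_r
    f <- c(f * q_r, 0) + c(0, f * p[r])
  }
  f
}

# Hypothetical 2PL items: discrimination a_j and difficulty b_j (made up).
a_par <- c(1.0, 1.2, 0.8, 1.5)
b_par <- c(-1.0, 0.0, 0.5, 1.0)
theta <- 0.5
p <- plogis(a_par * (theta - b_par))     # assumed 2PL response probabilities

f_x <- lord_wingersky(p)                 # Pr(X = x | theta), x = 0, ..., J
x <- 0:length(p)

cond_mean <- sum(x * f_x)
cond_sd <- sqrt(sum((x - cond_mean)^2 * f_x))   # Eq. (B.20) at this theta

print(round(f_x, 4))
print(c(mean = cond_mean, sd = cond_sd))
```

Each pass of the loop costs a number of operations proportional to the current number of score points, so the whole distribution takes on the order of J^2 operations instead of the 2^J response patterns summed in Eq. (B.16).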

Local equating essentially relies on the same two basic assumptions that classical test theory and IRT rely on, as stated in van der Linden (2006b):

1. For a fixed test taker, the observed score is random across replications of the test.
2. The observed score has a different distribution for each individual test taker.

The first assumption implies that each test taker has an observed score distribution. An equating transformation is needed to map the scale of the X test to the scale of the Y test such that the distributions of the score X and the transformed version of score Y are identical. The true equating transformation can accomplish this. The second assumption implies that there exist different true equating transformations for different test takers. These two assumptions together imply that instead of one equating transformation one should use a family of true equating transformations (van der Linden 2006b).

B.7 Epanechnikov Kernel Density Estimate and Derivatives

Taking the derivative in Eq. (7.2), which we repeat here for completeness,

\[
F_{h_X}(x) = \sum_{j \in \mathcal{J}} r_j\, \frac{3 R_{jX}(x) - R_{jX}^3(x) + 2}{4} + \sum_{j \in \mathcal{K}} r_j , \tag{B.21}
\]

where the first sum runs over the scores with $|R_{jX}(x)| \le 1$ and the second over the scores with $R_{jX}(x) > 1$, we obtain

\[
F'_{h_X}(x) = f_{h_X}(x) = \frac{1}{a_X h_X} \sum_{j \in \mathcal{J}} \frac{3 r_j \left( 1 - R_{jX}^2(x) \right)}{4} . \tag{B.22}
\]

The first derivative of $f_{h_X}$, which we denote as $f^{(1)}_{h_X}$, is

\[
\begin{aligned}
f^{(1)}_{h_X}(x)
&= \frac{d}{dx} \left[ \frac{1}{a_X h_X} \sum_{j \in \mathcal{J}} \frac{3 r_j \left( 1 - R_{jX}^2(x) \right)}{4} \right] \\
&= \frac{1}{a_X h_X} \sum_{j \in \mathcal{J}} \frac{3 r_j}{4} \left( -2 R_{jX}(x) \right) \frac{1}{a_X h_X} \\
&= -\frac{1}{(a_X h_X)^2} \sum_{j \in \mathcal{J}} \frac{3 r_j R_{jX}(x)}{2} .
\end{aligned} \tag{B.23}
\]
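Equations (B.21) to (B.23) can be checked numerically. The following R sketch evaluates the Epanechnikov-continuized CDF, its density, and the derivative of the density, and compares the density with a finite-difference derivative of the CDF. It assumes the usual kernel equating form R_jX(x) = (x - a_X x_j - (1 - a_X) mu_X) / (a_X h_X) and a shrinkage constant a_X based on the Epanechnikov kernel variance of 1/5; the function names, score probabilities, and bandwidth are hypothetical.

```r
# Epanechnikov kernel on [-1, 1]: CDF and density.
epan_cdf <- function(u) ifelse(u < -1, 0, ifelse(u > 1, 1, (3 * u - u^3 + 2) / 4))
epan_pdf <- function(u) ifelse(abs(u) <= 1, 3 * (1 - u^2) / 4, 0)

ke_epanechnikov <- function(x, scores, r, h) {
  mu <- sum(scores * r)
  s2 <- sum((scores - mu)^2 * r)
  a  <- sqrt(s2 / (s2 + h^2 / 5))          # ASSUMPTION: kernel variance 1/5
  # R[i, j] = R_jX(x_i); rows index evaluation points, columns index scores.
  R  <- outer(x, scores, function(x, xj) (x - a * xj - (1 - a) * mu) / (a * h))
  list(cdf  = as.vector(epan_cdf(R) %*% r),                        # Eq. (B.21)
       pdf  = as.vector(epan_pdf(R) %*% r) / (a * h),              # Eq. (B.22)
       dpdf = -as.vector(ifelse(abs(R) <= 1, 3 * R / 2, 0) %*% r) /
              (a * h)^2)                                           # Eq. (B.23)
}

# Hypothetical score probabilities on 0..10 and an arbitrary bandwidth.
scores <- 0:10
r <- dbinom(scores, size = 10, prob = 0.5)
h <- 0.8
x <- seq(-1, 11, by = 0.37)

fit <- ke_epanechnikov(x, scores, r, h)

# Central finite difference of the CDF as a check on the density.
eps <- 1e-5
num_pdf <- (ke_epanechnikov(x + eps, scores, r, h)$cdf -
            ke_epanechnikov(x - eps, scores, r, h)$cdf) / (2 * eps)
print(max(abs(num_pdf - fit$pdf)))
```

Because the kernel has compact support, scores far below x contribute their full probability r_j to the CDF and nothing to the density, which is what the second sum in Eq. (B.21) expresses.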

B.8 The Double Smoothing Bandwidth Selection Method in Kernel Equating

Let $J$ be the total number of possible scores in test form X. Then, for $l = 1, \ldots, 2J - 1$, the components of Eq. (7.5) are defined as

\[
\hat r^{*}_l =
\begin{cases}
\hat r_{(l+1)/2} , & \text{if } l \text{ is odd,} \\
\hat f_{g_X}(x^{*}_l) , & \text{if } l \text{ is even,}
\end{cases} \tag{B.24}
\]

and

\[
\hat f^{*}_{h_X}(x^{*}_l) = \sum_{j=1}^{J} \hat f_{g_X}(x_j)\,
\phi\!\left( \frac{x^{*}_l - \hat a_X x_j - (1 - \hat a_X) \hat\mu_X}{h_X \hat a_X} \right)
\frac{1}{h_X \hat a_X} , \tag{B.25}
\]

where $\phi(z)$ is the standard normal density function, and

\[
\hat f_{g_X}(x) = \sum_{j=1}^{J} \hat r_j\,
\phi\!\left( \frac{x - \hat a_{g_X} x_j - (1 - \hat a_{g_X}) \hat\mu_X}{g_X \hat a_{g_X}} \right)
\frac{1}{g_X \hat a_{g_X}} , \tag{B.26}
\]

with

\[
\hat a_{g_X} = \sqrt{ \hat\sigma_X^2 / \left( \hat\sigma_X^2 + g_X^2 \right) } . \tag{B.27}
\]

In practice, the DS method is implemented in kernel equating (Häggström and Wiberg 2014) by carrying out the following steps.

1. Start with a very smooth estimate of the density function by using a large, subjectively chosen bandwidth $g_X$, and estimate $f_{g_X}$ at the score values and at the values halfway between the score values. This means $x^{*} = x^{*}_l = [x_1, x_1 + 0.5, x_2, \ldots, x_J - 0.5, x_J]^T$, $l = 1, \ldots, 2J - 1$.
2. Because the obtained smooth estimate $\hat f_{g_X}$ will not perfectly interpolate the estimated score probabilities (the $\hat r$:s), the first estimate can be improved by estimating $f_{h_X}$ at $x^{*}$ using $\hat f_{g_X}$ at the actual score values $x$. Thus we obtain a DS estimate $\hat f^{*}_{h_X}$.
3. Select the bandwidth $h_X$ that minimizes the sum of the squared differences between $\hat r^{*}_l$ and the $l$th DS estimate $\hat f^{*}_{h_X}(x^{*}_l)$ (see Eq. (7.5)).
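The three steps above can be sketched in R as follows. This is only an illustration under stated assumptions: scores are unit-spaced, the kernel is Gaussian, the pilot bandwidth g_X = 1.5 and the search interval for h_X are arbitrary choices, and the helper names ds_objective and kde are hypothetical rather than functions from an equating package.

```r
# Sum of squared differences between r*_l and the DS estimate (Eqs. B.24-B.27).
ds_objective <- function(h, scores, r, g) {
  mu  <- sum(scores * r)
  s2  <- sum((scores - mu)^2 * r)
  a_h <- sqrt(s2 / (s2 + h^2))
  a_g <- sqrt(s2 / (s2 + g^2))                       # Eq. (B.27)

  # Gaussian-kernel continuization with weights w, shrinkage a, bandwidth bw.
  kde <- function(x, w, a, bw) {
    z <- outer(x, scores, function(x, xj) (x - a * xj - (1 - a) * mu) / (a * bw))
    as.vector(dnorm(z) %*% w) / (a * bw)
  }

  J <- length(scores)
  x_star <- seq(scores[1], scores[J], by = 0.5)      # step 1: 2J - 1 points

  # Step 1: pilot estimate at all points; r* keeps r at the score values.
  r_star <- kde(x_star, r, a_g, g)                   # Eq. (B.26)
  r_star[seq(1, 2 * J - 1, by = 2)] <- r             # Eq. (B.24), odd l

  # Step 2: DS estimate, smoothing the pilot values at the actual scores.
  f_g_scores <- kde(scores, r, a_g, g)
  f_h_star <- kde(x_star, f_g_scores, a_h, h)        # Eq. (B.25)

  sum((r_star - f_h_star)^2)
}

# Hypothetical score probabilities and pilot bandwidth.
scores <- 0:20
r <- dbinom(scores, size = 20, prob = 0.6)
g_pilot <- 1.5

# Step 3: one-dimensional search for the minimizing bandwidth.
opt <- optimize(ds_objective, interval = c(0.1, 3),
                scores = scores, r = r, g = g_pilot)
print(opt)
```

Since optimize() only finds a local minimum inside the given interval, evaluating ds_objective() over a grid of bandwidths first is a reasonable way to check that the interval brackets the solution.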

B.9 The DBPP Model

We assume that for every $z \in \mathcal{Z}$, $F_z$ has a density function with respect to Lebesgue measure of the form

\[
f(z)(\cdot) = \sum_{j=1}^{\infty} w_j\, \beta\!\left( \cdot \mid \lceil k \theta_j(z) \rceil ,\; k - \lceil k \theta_j(z) \rceil + 1 \right), \tag{B.28}
\]

where $w_j = v_j \prod_{i<j} [1 - v_i]$, $v_1, v_2, \ldots$ are i.i.d. random variables with common distribution $\text{Beta}(1, \alpha)$ with $\alpha > 0$, $k$ is a discrete random variable with distribution indexed by a finite-dimensional parameter $\lambda$, $\theta_j(z) = h_z(r_j(z))$, $r_1, r_2, \ldots$ are independent and identically distributed real-valued stochastic processes with law indexed by the parameter $\Psi$, and $h_z$ is a known $[0, 1]$-valued bijective continuous function defined on $\mathbb{R}$. Note that Eq. (B.28) is a version of the single weights DBPP, i.e., covariate dependence is introduced only in the atom processes $\{\theta_j(z)\}$ by means of the link function $h$. For further reference, we denote this model as

\[
\{ F_z : z \in \mathcal{Z} \} \sim \text{wDBPP}(\alpha, \lambda, \Psi, \mathcal{H}),
\]

where $\mathcal{H} = \{ h_z : z \in \mathcal{Z} \}$.

B.10 Measures of Statistical Assessment When Equating Test Scores

Let $\hat\varphi(x)$ denote an estimator of $\varphi(x)$. The definitions of bias, MSE, RMSE and SE are well known and are shown here explicitly for the case of equating transformations.

\[
\text{Bias}(\hat\varphi) = E_{f_\varphi}(\hat\varphi - \varphi) \tag{B.29}
\]
\[
\text{MSE}(\hat\varphi) = E_{f_\varphi}\left[ (\hat\varphi - \varphi)^2 \right] \tag{B.30}
\]
\[
\text{RMSE}(\hat\varphi) = \sqrt{ E_{f_\varphi}\left[ (\hat\varphi - \varphi)^2 \right] } \tag{B.31}
\]
\[
\text{SE}(\hat\varphi) = \sqrt{ \text{Var}(\hat\varphi) } \tag{B.32}
\]

Note that SE can be obtained from MSE and bias because the MSE can be decomposed as the squared bias plus the variance, so that $\text{SE}(\hat\varphi) = \sqrt{\text{MSE}(\hat\varphi) - \text{Bias}(\hat\varphi)^2}$. Because the expectations are not generally available in closed form, randomly generated score data are used to calculate them in practice using Monte Carlo simulation. Thus, in order to practically evaluate the statistical measures in Eqs. (B.29), (B.30), (B.31), and (B.32), the Monte Carlo method is used with replicated data generated from a known probability model. For each assessment measure, the true and estimated equated values are compared for each test score. Assume that we have a specific test score $x_i$, $i = 0, \ldots, n$, and $n$ is the number of possible score values. The measures for an equated value $\varphi(x)$ over 1000 replications are defined as

\[
\text{bias}(\hat\varphi(x_i)) = \frac{1}{1000} \sum_{l=1}^{1000} \left( \hat\varphi^{(l)}(x_i) - \varphi(x_i) \right), \tag{B.33}
\]
\[
\text{MSE}(\hat\varphi(x_i)) = \frac{1}{1000} \sum_{l=1}^{1000} \left( \hat\varphi^{(l)}(x_i) - \varphi(x_i) \right)^2, \tag{B.34}
\]
\[
\text{RMSE}(\hat\varphi(x_i)) = \sqrt{ \frac{1}{1000} \sum_{l=1}^{1000} \left( \hat\varphi^{(l)}(x_i) - \varphi(x_i) \right)^2 }, \tag{B.35}
\]
\[
\text{SE}(\hat\varphi(x)) = \sqrt{ \text{Var}(\hat\varphi(x)) }, \tag{B.36}
\]

where $\hat\varphi^{(l)}(x_i)$ is the estimated equated score for the $l$-th replication.

References

Cook, L. L., & Petersen, N. S. (1987). Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances. Applied Psychological Measurement, 11(3).
Dorans, N. J. (1990). Equating methods and sampling designs. Applied Measurement in Education, 3(1).
González, J., Wiberg, M., & von Davier, A. A. (2016). A note on the Poisson's binomial distribution in item response theory. Applied Psychological Measurement, 40(4).
Häggström, J., & Wiberg, M. (2014). Optimal bandwidth selection in observed-score kernel equating. Journal of Educational Measurement, 51(2).
Holland, P., & Thayer, D. (1989). The kernel method of equating score distributions. Technical report, Educational Testing Service, Princeton.
Liou, M., Cheng, P. E., & Li, M.-Y. (2001). Estimating comparable scores using surrogate variables. Applied Psychological Measurement, 25(2).
Livingston, S. A., Dorans, N. J., & Wright, N. K. (1990). What combination of sampling and equating methods works best? Applied Measurement in Education, 3(1).
Lord, F., & Novick, M. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Lord, F., & Wingersky, M. (1984). Comparison of IRT true-score and equipercentile observed-score equatings. Applied Psychological Measurement, 8(4).
San Martín, E., & González, J. (2017).
Analyzing the anchor test design in equating from an identification perspective. Manuscript submitted for publication.
Thissen, D., Pommerich, M., Billeaud, K., & Williams, V. S. L. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19(1).
van der Linden, W. J. (2006a). Equating error in observed-score equating. Applied Psychological Measurement, 30(5).
van der Linden, W. J. (2006b). Equating scores from adaptive to linear tests. Applied Psychological Measurement, 30(6).
van der Linden, W. J. (2011). Local observed-score equating. In A. von Davier (Ed.), Statistical models for test equating, scaling, and linking. New York: Springer.
van der Linden, W. J. (2013). Some conceptual issues in observed-score equating. Journal of Educational Measurement, 50(3).
van der Linden, W. J., & Wiberg, M. (2010). Local observed-score equating with anchor-test designs. Applied Psychological Measurement, 34(8).
von Davier, A. A., Holland, P., & Thayer, D. (2004). The kernel method of test equating. New York: Springer.
Wiberg, M., & Bränberg, K. (2015). Kernel equating under the non-equivalent groups with covariates design. Applied Psychological Measurement, 39(5).
Wiberg, M., & van der Linden, W. J. (2011). Local linear observed-score equating. Journal of Educational Measurement, 48.
Wright, N. K., & Dorans, N. J. (1993). Using the selection variable for matching or equating. ETS Research Report Series, 1993(1), 1-22.
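The Monte Carlo measures of Sect. B.10 (Eqs. B.33 to B.36) can be sketched in R as follows. The setting is an assumption chosen only because the true transformation is known in closed form: scores are drawn from normal distributions with made-up parameters and linear equating is used, so this is not one of the book's examples.

```r
set.seed(123)
n_rep <- 1000                  # number of replications, as in Eqs. (B.33)-(B.35)
n_obs <- 500                   # hypothetical sample size per test form
x_scores <- 0:20               # score values x_i at which equating is evaluated

# Hypothetical population parameters and the true linear equating function.
mu_x <- 10; sd_x <- 3.0
mu_y <- 11; sd_y <- 3.5
phi_true <- mu_y + (sd_y / sd_x) * (x_scores - mu_x)

# One column per replication: estimated equated scores at every x_i.
phi_hat <- replicate(n_rep, {
  x <- rnorm(n_obs, mu_x, sd_x)
  y <- rnorm(n_obs, mu_y, sd_y)
  mean(y) + (sd(y) / sd(x)) * (x_scores - mean(x))
})

err  <- phi_hat - phi_true     # phi_true is recycled down each column
bias <- rowMeans(err)          # Eq. (B.33)
mse  <- rowMeans(err^2)        # Eq. (B.34)
rmse <- sqrt(mse)              # Eq. (B.35)
se   <- apply(phi_hat, 1, sd)  # Eq. (B.36), sample standard deviation

# Decomposition MSE = bias^2 + variance (variance with divisor n_rep).
var_n <- se^2 * (n_rep - 1) / n_rep
print(round(cbind(x = x_scores, bias, rmse, se)[c(1, 11, 21), ], 4))
print(max(abs(mse - (bias^2 + var_n))))
```

The decomposition holds exactly only when the variance uses the divisor n_rep; with the usual n_rep - 1 divisor returned by sd(), the difference is negligible for 1000 replications.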

