Item calibration in incomplete testing designs


Psicológica (2011), 32, 107-132

Theo J.H.M. Eggen* & Norman D. Verhelst**
*Cito/University of Twente, The Netherlands
**Cito, The Netherlands

This study discusses the justifiability of item parameter estimation in incomplete testing designs in item response theory. Marginal maximum likelihood (MML) as well as conditional maximum likelihood (CML) procedures are considered in three commonly used incomplete designs: random incomplete, multistage testing and targeted testing designs. Mislevy and Sheehan (1989) have shown that in incomplete designs the justifiability of MML can be deduced from Rubin's (1976) general theory on inference in the presence of missing data. Their results are recapitulated and extended to more situations. In this study it is shown that for CML estimation the justification must be established in an alternative way, by considering the neglected part of the complete likelihood. The problems with incomplete designs are not generally recognized in practical situations. This is due to the stochastic nature of the incomplete designs, which is not taken into account in standard computer algorithms. For that reason, incorrect uses of standard MML and CML algorithms are discussed.

Introduction

Within the framework of item response theory (IRT), item calibration involves the estimation of the item parameters in the chosen IRT model. For these so-called scaling procedures, data gathered in incomplete designs are often used. In item banking studies the researcher frequently decides to administer only subsets of the total available item pool to the available (sampled) students. Sometimes there are just practical reasons for using incomplete designs, for example because limited testing time means that not all the available items can be administered to every student. However, often efficiency is the motivating factor for building incomplete designs. Efficiency in item calibration is gained when (a priori) knowledge about the difficulty of the items and the ability of the students is used in allocating students to subsets of items. In equating studies, the incomplete design is mostly a starting point, because only partly overlapping tests are administered to different groups of students.
(* E-mail: theo.eggen@cito.nl)

Algorithms for item calibration which allow for incomplete testing designs are implemented in several computer programs. For example, BILOG-MG (Zimowski, Muraki, Mislevy, & Bock, 1996) uses the marginal maximum likelihood (MML) approach in the one-, two- and three-parameter logistic test model, and OPLM (Verhelst, Glas, & Verstralen, 1995) uses conditional maximum likelihood (CML) as well as MML procedures in general one-parameter logistic models. The application of these or similar computer programs in item banking, multistage testing, adaptive testing and equating studies is common psychometric practice. In these applications, however, some

problems with incomplete designs are not generally recognized. This is due to ignorance of the consequences of the stochastic nature of the incomplete designs, which is not taken into account in these computer algorithms. In particular this is the case in equating studies, where item calibration in incomplete designs as studied here is often called concurrent calibration and is then compared with linking, on the same scale, separately calibrated tests with data from complete designs (see e.g., Hanson & Béguin, 2002).

In this study (concurrent) calibration procedures in incomplete testing designs are reviewed. The statistical approach of using imputation techniques (Little & Rubin, 1987) in the handling of missing data and subsequently analysing complete data will not be considered in this study. Here, the likelihood approach, in which observed as well as missing data are modelled, will be studied. The justification of applying MML and CML procedures in the incomplete designs will be studied. For convenience, the one-parameter logistic test model for dichotomously scored items (Rasch, 1980) will be used for illustrative purposes.

After reviewing IRT item parameter estimation in general, Rubin's (1976) concepts and theory on inference in the presence of missing data are summarized. Next, the applicability of this theory in MML as well as CML item calibration will be discussed. This will be elaborated for three commonly used incomplete design structures. For MML estimation, Mislevy and Wu (1996), with an emphasis on the estimation of person parameters, and Mislevy and Sheehan (1989), focussing on the use of collateral information, have used the approach as presented here. The MML results in this study are partly recapitulations of their work and are extended to other situations. The results for the justification of CML estimation of the item parameters in incomplete designs are necessarily deduced via a different approach.

Item Response Theory

In IRT we consider the random vector, the response pattern $X = (X_{ij})$, $i = 1, \ldots, n$; $j = 1, \ldots, k$, where $X_{ij}$ is the response of student $i$ to item $j$. With dichotomously scored items, $X_{ij} = 1$ if the answer is correct and $X_{ij} = 0$ if the answer is not correct. The
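one-parameter logistic model has as its basic equation (Rasch, 1980)

$$P(X_{ij} = x_{ij}) = \frac{\exp[(\theta_i - \beta_j)\, x_{ij}]}{1 + \exp(\theta_i - \beta_j)} = P_{\theta_i,\beta_j}(x_{ij}), \tag{1}$$

where $x_{ij} \in \{0, 1\}$, $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, k\}$. The distribution of $X_{ij}$, denoted by $P_{\theta_i,\beta_j}(x_{ij})$, follows the binomial distribution, in which $\theta_i$ is the ability parameter of student $i$ and $\beta_j$ the difficulty parameter of item $j$. By the usual assumptions of (local) independence the probability of the response pattern is given by (with $\theta = (\theta_i)$, $i = 1, \ldots, n$ and $\beta = (\beta_j)$, $j = 1, \ldots, k$)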

$$P_{\theta,\beta}(x) = \prod_i P_{\theta_i,\beta}(x_i) = \prod_i \prod_j P_{\theta_i,\beta_j}(x_{ij}). \tag{2}$$

Calibrating an item pool involves estimating the item parameters $\beta$ and testing the validity of the model. In IRT maximum likelihood estimation is common, that is, the probability of the observed response pattern $X = x$, or the likelihood function $L(\beta, \theta; x) = P_{\theta,\beta}(x)$, is maximized with respect to the parameters $\beta$ and $\theta$. It is well known that because of the incidental parameters $\theta$ in the model this does not lead to consistent estimates of the parameters, but in general two approaches are known to avoid this problem: CML and MML estimation.

Conditional Maximum Likelihood Estimation

If it is possible to construct a sufficient statistic $S(X_i)$ for the incidental parameter $\theta_i$, in the presence of the item parameter $\beta$, we can factor the probability of the response pattern as

$$P_{\theta,\beta}(x) = \prod_i P_{\beta}(x_i \mid s(x_i))\, P_{\theta_i,\beta}(s(x_i)). \tag{3}$$

In (3), $P_{\theta_i,\beta}(s(x_i))$ is the distribution of the sufficient statistic $S(X_i)$, $i = 1, \ldots, n$. The first factor, $\prod_i P_{\beta}(x_i \mid s(x_i))$, is the simultaneous conditional probability of the observed responses $x$, which does not depend on the ability parameters because of the sufficiency of $S(X_i)$ for $\theta_i$. In CML estimation we then proceed by estimating the item parameters by just maximizing this conditional likelihood function with respect to $\beta$:

$$L_c(\beta; x \mid s(x)) = \prod_i P_{\beta}(x_i \mid s(x_i)).$$

In CML estimation of the item parameters only random variations of the observations, fixing (given) the values of the conditioning statistics $s(x_i)$, are considered. The justification of this depends on whether all random variation that is relevant to the problem (here estimating the item parameters $\beta$) is in this reduced frame of reference. This is easily seen to be heavily dependent on the properties of the neglected part of (3). If the distribution of the sufficient statistic $s(x_i)$ were completely independent of the item parameters $\beta$, the justification would be obvious. However, this condition is not fulfilled in our situation. But discarding this term is justified because Andersen (1973) has shown that the resulting CML estimators of $\beta$ are, under mild regularity conditions,

consistent, asymptotically normally distributed and efficient. Furthermore, in Eggen (2000) it was shown that the possible loss of information in CML estimation, by neglecting the information on $\beta$ in the distribution of $s(x_i)$, is very small already at short test lengths. A major feature of CML estimation of the item parameters is that it is valid (i.e., having the above statistical properties) irrespective of any assumptions on the distribution of the ability of the students taking the test. The individual parameters are only part of the factor in the total likelihood which is neglected.

Marginal Maximum Likelihood Estimation

In MML estimation, model (2) is extended by assuming that the ability parameters $\theta_i$ are a random sample from a population with probability density function given by $g_\gamma(\theta)$, with $\gamma$ the (possibly vector valued) parameter of the ability distribution. Thus the response pattern $X$ as well as the ability $\theta$ are considered random variables here. The $\theta_i$ are not, as before, individual person ability parameters, but realizations of the unobservable random variable $\theta$. In MML we consider the marginal distribution of the response pattern $X$,

$$P_{\beta,\gamma}(x) = \int_{\theta} P_{\beta,\gamma}(x, \theta)\, d\theta = \prod_i \int_{\theta_i} P_{\beta}(x_i \mid \theta_i)\, g_\gamma(\theta_i)\, d\theta_i, \tag{4}$$

where $P_{\beta,\gamma}(x, \theta)$ is the simultaneous distribution of the response pattern $X$ and the ability $\theta$. $P_{\beta}(x_i \mid \theta_i) = \prod_j P_{\beta_j}(x_{ij} \mid \theta_i)$ is the IRT model as in (2), giving the probability of a response vector of person $i$, with ability $\theta_i$. In MML estimation the item parameters $\beta$ are simultaneously estimated with the parameter $\gamma$ of the ability distribution by maximizing the marginal probability of the observed response pattern $x$ (the marginal likelihood function) with respect to the parameters, that is,

$$L_m(\beta, \gamma; x) = \prod_i \int_{\theta_i} P_{\beta}(x_i \mid \theta_i)\, g_\gamma(\theta_i)\, d\theta_i. \tag{5}$$

The consistency of the item parameter estimators with MML can be deduced from the work by Kiefer and Wolfowitz (1956). In practice, the most popular approach here is to assume that the ability distribution of $\theta$ is normal with $\gamma = (\mu, \sigma^2)$. Bock and Aitkin (1981) were the first to give computational procedures for maximizing (5) using the EM algorithm.

Inference and Missing Data

Rubin (1976) and Little and Rubin (1987) present a general framework
for inference in the presence of missing data. Here the concepts they defined and some of the results are summarized. First, some notation and definitions.

Let $U = (U_1, \ldots, U_m)$ be a vector random variable with probability density function $f_\tau(u)$, with $\tau$ a vector parameter, on which we want to draw inferences on the basis of the data, a sample realization $u$. Assume for convenience that $m = n \cdot k$, with $k$ the number of variables and $n$ the number of persons sampled. In the presence of missing data a vector random design variable, or missing data indicator, $M = (M_1, \ldots, M_m)$ is defined, indicating whether a variable $U_j$ is actually observed, $m_j = 1$, or not observed, $m_j = 0$. The observed value $m$ of $M$ effects a partition of the vector random variable $U$ and of its observed value: $U = (U_{obs}, U_{mis})$ and $u = (u_{obs}, u_{mis})$. The sets of indices of observed and not observed variables are $obs = \{ j \mid m_j = 1 \}$ and $mis = \{ j \mid m_j = 0 \}$. In Rubin's (1976) theory the conditional distribution of the missing data indicator given the data has a key role:

$$P_\phi(M = m \mid U = u) = h_\phi(m \mid u_{obs}, u_{mis}),$$

which is defined as the distribution corresponding to the process that causes the missing data, with $\phi$ a possibly vector valued parameter. In general, $\phi$ can be dependent on the parameter of interest $\tau$: they could have common or functionally related elements.

The general problem in inference in the presence of missing data is that we have a sample realization of $M$ and $U_{obs}$ and we want to infer on the parameter $\tau$ of the distribution of the only partially observed $U$. In the presence of missing data, the basis for inference on $\tau$ should be the joint distribution of $M$ and $U_{obs}$:

$$\int_{u_{mis}} f_{\tau,\phi}(u, m)\, du_{mis} = \int_{u_{mis}} f_\tau(u)\, h_\phi(m \mid u)\, du_{mis}. \tag{6}$$

Because we are only interested in inferring on the parameter $\tau$ of the distribution of the partially observed $U$, a possible approach could be to ignore in the inference the process that causes the missing data. Following Rubin (1976), ignoring the process that causes missing data means: (a) fixing the random variable $M$ at the observed pattern of missing data $m$ and (b) assuming that the values of the observed data $u_{obs}$ are realizations of the marginal density of $U_{obs}$:

$$f_\tau(u_{obs}) = \int_{u_{mis}} f_\tau(u)\, du_{mis}. \tag{7}$$

When we ignore the process that causes the missing data, not all possible random variation in the data due to
sampling of $M$ and $U$ is considered, but only random variation due to $U$, fixing the random variable $M$ at the particular observed pattern $m$. The generally more convenient form (7) is used instead of (6) in the inference on $\tau$.
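To make the contrast between the joint form (6) and the simpler form (7) concrete, the following is a minimal numerical sketch, not part of the original study. It assumes a toy model in which two binary variables per person are independent Bernoulli($\tau$), the first is always observed, and the second is missing with a probability that depends only on the observed first value. All names and numerical values (`phi`, `tau_true`, the sample size) are hypothetical. In this setting the two log-likelihoods differ by a term that does not involve $\tau$, so they share the same maximizer.

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau_true = 500, 0.3
# Hypothetical missingness parameter: P(U2 observed | U1 = 0), P(U2 observed | U1 = 1).
phi = np.array([0.9, 0.4])

u1 = (rng.random(n) < tau_true).astype(int)  # always observed
u2 = (rng.random(n) < tau_true).astype(int)  # only partially observed
observed2 = rng.random(n) < phi[u1]          # missingness depends on observed u1 only


def loglik_ignoring(tau):
    """Log of form (7): marginal density of the observed values only."""
    successes = u1.sum() + u2[observed2].sum()
    trials = n + observed2.sum()
    return successes * np.log(tau) + (trials - successes) * np.log(1 - tau)


def loglik_full(tau):
    """Log of form (6): observed-data density times h_phi(m | u_obs)."""
    p_obs = phi[u1]
    log_h = np.sum(np.where(observed2, np.log(p_obs), np.log(1 - p_obs)))
    return loglik_ignoring(tau) + log_h


grid = np.linspace(0.01, 0.99, 99)
ign = np.array([loglik_ignoring(t) for t in grid])
full = np.array([loglik_full(t) for t in grid])
print("maximizer ignoring the process:", grid[np.argmax(ign)])
print("maximizer of the full form:    ", grid[np.argmax(full)])
```

The sketch only covers the favourable case; if `observed2` were made to depend on the unobserved `u2` itself, the extra term would involve $\tau$ and the two maximizers would in general differ.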

It will be clear that ignoring the missing data process does not necessarily lead to a correct inference on $\tau$. Firstly, we possibly disregard the influence of $\phi$ on $\tau$: possible restrictions on $\tau$ due to $\phi$ are not taken into account in the inference on $\tau$. Secondly, it is understood that the data $u_{obs}$ are in fact not realizations of (7) but of the conditional density of $U_{obs}$ given that the random variable $M$ took the fixed value $m$:

$$f_{\tau,\phi}(u_{obs} \mid m) = \int_{u_{mis}} \frac{f_{\tau,\phi}(u, m)}{f_{\tau,\phi}(m)}\, du_{mis} = \int_{u_{mis}} \frac{f_\tau(u)\, h_\phi(m \mid u)}{\int_u f_\tau(u)\, h_\phi(m \mid u)\, du}\, du_{mis}, \tag{8}$$

which is in general not equal to (7).

We now specify Rubin's (1976) sufficient conditions under which ignoring the process that causes the missing data yields the correct direct likelihood inference about $\tau$. By direct likelihood inference is meant inference on parameter(s) based on comparison of likelihoods, such as the determination of a maximum likelihood estimator and likelihood ratio tests. The sufficient conditions are on the distribution $h_\phi(m \mid u)$. Define:

1. The missing data are missing at random (MAR) if for each value of $\phi$

$$h_\phi(m \mid u_{obs}, u_{mis}) = h_\phi(m \mid u_{obs}) \quad \text{for all } u_{mis},$$

that is, the missingness of the data does not depend on the not observed values $u_{mis}$, but may depend on the observed values $u_{obs}$.

2. The missing data are missing completely at random (MCAR) if for each value of $\phi$

$$h_\phi(m \mid u_{obs}, u_{mis}) = h_\phi(m) \quad \text{for all } u_{mis} \text{ and } u_{obs}.$$

Note that MCAR implies MAR.

3. The parameter $\phi$ is distinct (D) from $\tau$ if the joint parameter space of $(\phi, \tau)$ is the Cartesian product of the parameter space of $\phi$ and the space of $\tau$. Distinctness means that all possible values of $\phi$ are possible in combination with all possible values of $\tau$.

These three definitions enable us to state Rubin's (1976) ignorability principle: if both MAR and D hold, ignoring the process that causes the missing data gives correct direct likelihood inferences about $\tau$. This means that instead of using the full likelihood

$$L(\tau, \phi; u_{obs}, m) = f_{\tau,\phi}(u_{obs}, m) = \int_{u_{mis}} f_{\tau,\phi}(u, m)\, du_{mis}, \tag{9}$$

the simple likelihood function

$$L(\tau; u_{obs}) = f_\tau(u_{obs}) = \int_{u_{mis}} f_\tau(u)\, du_{mis} \tag{10}$$

can be used for inferring on $\tau$. Ignoring the process that causes missing data is of course also justified if the stronger condition MCAR, instead of MAR, (and D) is met. It is noted that these conditions only guarantee correct direct likelihood inferences, such as determining the correct maximum likelihood estimate. It is not guaranteed that the resulting estimates in using (9) or (10) will have the same statistical properties, such as consistency or asymptotic normality. In general, stronger conditions then have to be fulfilled (Rubin, 1976).

Incomplete Calibration Designs

Using incomplete testing designs is very common in the application of IRT. Although many variants are possible, one of three calibration design structures is commonly used: random incomplete designs, multistage testing designs and targeted testing designs. The following notation and assumptions are used to describe these designs. We have $T$ test forms, indexed by $t = 1, \ldots, T$. From the total item pool of $k$ items, subsets of $k_t$ ($t = 1, \ldots, T$) items are assembled in the test forms. We assume that there is overlap in items between the test forms. Via the linking items the item pool can be calibrated on the same scale. Fischer (1981) gives the exact conditions that have to be fulfilled for the existence and uniqueness of the item parameter estimates in incomplete designs using CML in the Rasch model. In practice, these conditions are almost always met if there are some common items in the test forms. In MML estimation the linking in incomplete designs is also mostly established via common items, although for MML estimation Glas (1989) has shown that in the special case where we do not have a linked design but assume a common ability distribution for all sampled students, the parameters are estimable. We assume that every student takes only one test form, and for every student taking items from the pool we define a design or item indicator vector with as many elements as there are items in the item pool ($k$). The item indicator variable for every student, $R_i$, can
take $T$ values:

$$r_t = \mathrm{perm}(1_{k_t}, 0_{k - k_t}), \quad (t = 1, \ldots, T). \tag{11}$$

Each value of the design vector $r_t$ is a permutation of the vector $(1_{k_t}, 0_{k - k_t})$, indicating that there are $k_t$ values 1 at the elements indexed by the items in the administered test $t$, and $k - k_t$ values 0.
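As an illustration of the item indicator vectors just described, the sketch below builds them for a small linked design. The pool size, the two forms and their item indices are hypothetical choices made only for this example, and the helper name `item_indicator` is not from the original text.

```python
import numpy as np


def item_indicator(k, items_in_form):
    """Return the 0/1 design vector r_t of length k for one test form."""
    r = np.zeros(k, dtype=int)
    r[list(items_in_form)] = 1
    return r


# Hypothetical pool of k = 9 items and T = 2 overlapping forms (0-based indices).
k = 9
forms = {1: [0, 1, 2, 3, 4, 5], 2: [3, 4, 5, 6, 7, 8]}
design = {t: item_indicator(k, idx) for t, idx in forms.items()}

for t, r in design.items():
    print("form", t, "r_t =", r, "k_t =", r.sum())

# Linking items are those administered in both forms.
linking = np.flatnonzero(design[1] & design[2])
print("linking items:", linking)
```

A non-empty set of linking items is what allows the forms to be calibrated on one scale via common items.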

It is noted that the missing data indicator $M$ is strongly related to the item indicator $R$. In our applications it is always true that $R \subset M$. But $R$ concerns only the indication whether items are observed, while $M$ also concerns the observation or missingness of other variables considered in a problem. More specifically, when the ability $\theta$ is considered as a random variable as in MML estimation (5), we will use the indicator variable $M$, having a value zero for all realizations of $\theta$.

Random Incomplete Designs

In random incomplete designs the researcher decides which test form is taken by which students without using any a priori knowledge on the ability of a student. Every student has an a priori known chance of taking one of the $T$ test forms. In these designs the test forms are often assembled from the item pool in such a way that the forms have an equal number of items and are parallel in content and difficulty. A test form can be randomly assigned to a student so that every student has an equal chance of getting a particular test form. Or, more generally, a student gets a test form with a known probability $\phi_t$ such that $\sum_{t=1}^{T} \phi_t = 1$. The distribution of the item indicator variable $R_i$ is given by:

$$P(R_i = r_t) = \phi_t, \quad (t = 1, \ldots, T),\ (i = 1, \ldots, n). \tag{12}$$

Multistage Testing Designs

In multistage testing designs the assignment of students to subsets of items from the total item pool in a testing stage is based on the observed responses in the former stage. A typical example is given in Figure 1. All students in the sample take the first stage test, which is of medium difficulty. This (part of the) test is called the routing test. Students with high scores on the routing test are administered a more difficult subset of items from the pool in the next stage and students with low scores an easier subset. The same procedure is possibly continued in next testing stages.

[Figure 1. Example of a multistage testing design: students (rows) by items (columns) over a 1st, 2nd and 3rd stage, with routing on $s_1 < c_1$ versus $s_1 \ge c_1$ after the first stage, and on $s_{2,1} < c_{2,1}$ versus $s_{2,1} \ge c_{2,1}$ and $s_{2,2} < c_{2,2}$ versus $s_{2,2} \ge c_{2,2}$ after the second stage.]
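The routing idea can be sketched in a few lines of simulation. This is an illustrative sketch under stated assumptions rather than the design of Figure 1: it uses only two stages, Rasch responses as in (1), and hypothetical item difficulties, item sets and cut-off. The function names `rasch_prob` and `simulate_two_stage` are introduced here for illustration.

```python
import numpy as np


def rasch_prob(theta, beta):
    """P(X = 1) under the Rasch model, equation (1)."""
    return 1.0 / (1.0 + np.exp(-(theta - beta)))


def simulate_two_stage(theta, beta, routing, easy, hard, cutoff, rng):
    """Simulate responses; items that are not administered stay NaN."""
    n, k = len(theta), len(beta)
    x = np.full((n, k), np.nan)
    form = np.empty(n, dtype=int)
    for i in range(n):
        # Stage 1: every student takes the routing items.
        x[i, routing] = rng.random(len(routing)) < rasch_prob(theta[i], beta[routing])
        s1 = x[i, routing].sum()
        # Stage 2: the next item set depends only on the observed routing score.
        high = s1 >= cutoff
        second = hard if high else easy
        form[i] = 2 if high else 1
        x[i, second] = rng.random(len(second)) < rasch_prob(theta[i], beta[second])
    return x, form


rng = np.random.default_rng(0)
# Hypothetical difficulties: three routing items, then easier to harder items.
beta = np.array([0.0, 0.0, 0.0, -1.0, -0.5, 0.5, 1.0])
routing, easy, hard = [0, 1, 2], [3, 4, 5], [4, 5, 6]
theta = rng.normal(size=200)
x, form = simulate_two_stage(theta, beta, routing, easy, hard, cutoff=2, rng=rng)
print("students per form:", np.bincount(form)[1:])
```

Because the second-stage assignment is a function of the observed routing responses only, the design variable in this sketch never depends on responses that were not collected.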

In Figure 1, $s_1$ indicates the score on the items of the first stage (routing) test, and $s_{2,\cdot}$ is the score on a second stage (routing) test whose content depends on the score on the first routing test. In each stage the score is compared to a cut-off $c$, on which it is decided which items are administered next. In this example, considering the total data matrix, the total number of tests $T$ is 4. Multistage testing was introduced (Lord, 1971) for efficiently measuring the ability of students, but it is understood that the underlying principle can also be applied in designs for the calibration of the items. A limiting case of multistage testing is computerized adaptive testing, where the stages have a length of only one item: after every item, the next item administered is selected on the basis of the result on the previously administered items.

In a multistage testing design, as in a random incomplete design, the item indicator variable for every student $R_i$ can take as many values as there are tests $T$ (see (11)). The distribution of $R_i$ always has the following form:

$$P(R_i = r_t \mid x_{obs,i}) = \phi_t(x_{obs,i}), \quad (t = 1, \ldots, T),\ (i = 1, \ldots, n). \tag{13}$$

If a function of observed item scores $x_{obs,i}$ meets a criterion for getting test $t$, the item indicator variable $R_i$ takes the value $r_t$ with probability $\phi_t$. If the criterion is not met the probability is $1 - \phi_t$. It should be understood that in a multistage testing design the probability of a certain design is not constant for all values of $x_{obs,i}$, because in that case the design is random incomplete.

Example 1. We have a routing test consisting of 3 items with $\beta_1 = \beta_2 = \beta_3 = 0$. With a total score of 0 or 1 on these 3 items an easier test of 4 items with parameters $\beta_4 = -1.25$, $\beta_5 = -1.0$, $\beta_6 = -0.5$ and $\beta_7 = 0.5$ is administered. When the score on the routing test is 2 or 3, a harder test, having two items in common with the easier one, with the parameters $\beta_6 = -0.5$, $\beta_7 = 0.5$, $\beta_8 = 1.0$ and $\beta_9 = 1.25$ is administered. The functions $\phi_t(x_{obs,i})$ in (13) can then be defined as: $\phi_1(x_{obs,i}) = 1$ if $\sum_{j=1}^{3} x_{ij} \le 1$, and $\phi_2(x_{obs,i}) = 1$ if $\sum_{j=1}^{3} x_{ij} > 1$, where test 1 is the easier test and test 2 the harder test.

Targeted testing designs

In targeted testing designs the
structure of the design is determined a priori on the basis of background information, say values of a random variable $Y$ of the students. This background variable is usually positively related to the ability. Students with values of $Y$ which are expected to have lower abilities are administered easier test forms, and students with values of $Y$ expected to have higher abilities are administered the more difficult forms. As in multistage testing designs, gains in precision of the estimates are to be expected. An example of a variable often used in these designs is the grade level of the student.

We will assume that the variable $Y$ of the students is categorical (or categorized), taking (or distinguishing) $T$ values: $y_1, \ldots, y_T$. In targeted testing, for each value of $Y$ a different subset from the total item pool is administered to the students. The variable $Y$ can, besides for the assignment of the items to the students, also play a role in the sampling of the students. We can distinguish two situations. First, the background variable $Y$ is only used in the assignment of items or tests to students and not in the sampling of students. Second, $Y$ is used in the sampling of students as well as in the assignment of tests to students.

In the first situation the role of $Y$ is limited to increasing the precision of the parameter estimates of the items to be calibrated. In this situation there is no explicit interest in the variable $Y$ itself. There is, for instance, no interest in having estimates of the parameters of the ability distribution for each distinguished level of $Y$. Here the students are sampled from one population with no regard to the values of $Y$.

In the second situation, the background variable also plays a role in sampling the students. In this case there is an explicit interest in the variable itself. A situation often occurring is that $Y$ is the stratification variable in the sampling of students from the total population. Often the sampling proportions within the strata are not the same as in the total population, and one is explicitly interested in estimates of the ability distribution of the different strata and possibly, but not necessarily, in the total population. In this case, unlike the first situation, the sampled students can in general not be considered to be a sample from one population but are samples from a total population divided into subpopulations of interest.

Where relevant we will distinguish these two targeted testing situations: (a) targeted testing with a student sample from one population (TTOP), and (b) targeted testing with student samples from multiple (sub)populations (TTMP). In targeted designs the item indicator variable $R_i$ for every student can again take as many values as there are tests. The distribution of $R_i$ is given by

$$P(R_i = r_t \mid y_i = y) = \phi_t(y), \quad (t = 1, \ldots, T). \tag{14}$$

For any (distinguished) value of the background variable $Y$ there is a fixed probability that a certain test is administered. An example is the gender of the student. A boy ($y = 1$) then gets test 1 with probability $\phi_1(y = 1)$ and a girl with probability $\phi_1(y = 2)$. Similar probabilities can be specified for a second test. In practice, often $\phi_t = 1$, which means that given the value $y_i$, a specific test is administered. The formal resemblance between a targeted testing design (14) and a multistage testing design (13) is noted. But the difference is also clear: in a targeted testing design $y_i$ can be any measured characteristic of a student, with the exception that it is not (based on) responses to items whose parameters are to be estimated, as we have in multistage testing (13).

Item Calibration and Missing Data

Although item calibration in incomplete testing designs is common in psychometric practice and modern computer programs can analyze incomplete designs, it is commonly assumed that the stochastic nature of the item indicator variable $R$ does not play a role in the calibration. In implemented computer algorithms the design variable value is fixed at the observed value and only random variations in the observed item responses are considered. One could say that the ignorability principle is assumed to hold. In this section we will explore the justifiability of this practice in the incomplete calibration designs described in the former section. We will treat marginal as well as conditional estimation of the item parameters in these designs. We assume that we have tested a group of $n$ students, for which the observed and missing variables are denoted by $U_{obs,i}$ and $U_{mis,i}$: $U = (U_{obs}, U_{mis})$ with $U_{obs} = (U_{obs,1}, \ldots, U_{obs,n})$ and $U_{mis} = (U_{mis,1}, \ldots, U_{mis,n})$. The missing data indicator is $M = (M_1, \ldots, M_n)$, in which every element $M_i$ is a vector with as many elements as there are variables (observed and unobserved).

The Marginal Model and Missing Data

Using the same approach as Mislevy and Sheehan (1989), the ignorability conditions for the design variable in incomplete designs for MML item parameter estimation can be checked. We will give next the results for complete, random incomplete, multistage and targeted testing designs.

MML in complete, random incomplete and multistage testing designs

First we note that the justification of using MML for complete data, see (4) and (5), can also be deduced from the general framework of Rubin for inference in the presence of missing data. Complete data MML can be described as a procedure in which we have missing data and the ignorability principle is applied in likelihood inference. This is readily seen as follows. The variable on which we want to base our inference is $U = (X, \theta) = (X_1, \theta_1, \ldots, X_n, \theta_n)$, in which $X_i$ is as before the random answer vector of student $i$ on the $k$ items administered. The parameter to be estimated is $\tau = (\beta, \gamma)$. In the complete data situation the $X_i$ are always observed and the $\theta_i$ are always missing. So we have for every
student a degenerate design distribution, which equals its item indicator distribution:

$$P(M_i = (1_k, 0)) = P(R_i = (1_k)) = 1, \quad (i = 1, \ldots, n).$$

The partition which the observed design variable $m$ effects is $U_{obs,i} = X_i$ and $U_{mis,i} = \theta_i$, $(i = 1, \ldots, n)$. Because the parameter space of the distribution of $M$ is empty and MCAR is clearly met, the marginal distribution of $U_{obs}$ (here $X$) can be used by the ignorability principle for correct likelihood inference:

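$$\int_{u_{mis}} f_\tau(u)\, du_{mis} = \int_{\theta} P_{\beta,\gamma}(x, \theta)\, d\theta = \prod_i \int_{\theta_i} P_{\beta}(x_i \mid \theta_i)\, g_\gamma(\theta_i)\, d\theta_i,$$

which is identical to (5). In random incomplete designs and multistage testing designs the ignorability conditions are also fulfilled. In Table 1 we give for these designs and for the complete testing design respectively the observed and unobserved variables and the design distribution. The design distributions in random incomplete and in multistage testing designs follow respectively from (12) and (13). In random incomplete designs the MCAR condition is fulfilled and in multistage testing designs the MAR condition. In both designs the D condition is clearly met. Therefore ignorability holds in these designs and MML can be applied using the marginal distribution of the observations. This can readily be checked by considering, e.g. in the multistage testing design, the distribution of $(U_{obs}, M)$ needed for the full likelihood:

$$\begin{aligned}
\int_{u_{mis}} P_{\tau,\phi}(u, m)\, du_{mis}
&= \int_{\theta} \int_{x_{mis}} P_{\beta,\gamma,\phi}(x_{obs}, x_{mis}, \theta, m)\, dx_{mis}\, d\theta \\
&= \int_{\theta} \int_{x_{mis}} P_{\beta,\gamma}(x_{obs}, x_{mis}, \theta)\, h_\phi(m \mid x_{obs}, x_{mis}, \theta)\, dx_{mis}\, d\theta \\
&= h_\phi(m \mid x_{obs}) \int_{\theta} P_{\beta,\gamma}(x_{obs}, \theta)\, d\theta \\
&= h_\phi(m \mid x_{obs}) \prod_i \int_{\theta_i} P_{\beta}(x_{obs,i} \mid \theta_i)\, g_\gamma(\theta_i)\, d\theta_i.
\end{aligned} \tag{15}$$

Table 1. Variables in incomplete testing designs

Design             | U_obs,i  | U_mis,i       | h_φ(m_i | U_obs,i, U_mis,i)
complete           | X_i      | θ_i           | P(M_i = (1_k, 0)) = P(R_i = (1_k)) = 1
random incomplete  | X_obs,i  | X_mis,i, θ_i  | P(M_i = (r_t, 0)) = P(R_i = r_t) = φ_t
multistage         | X_obs,i  | X_mis,i, θ_i  | P(M_i = (r_t, 0) | X_obs,i) = P(R_i = r_t | X_obs,i) = φ_t(X_obs,i)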
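The marginal distribution of the observed responses, the factor in (15) that carries the information on $\beta$ and $\gamma$, can be evaluated numerically. The following is a minimal sketch, assuming a Rasch model and a normal ability distribution with $\gamma = (\mu, \sigma^2)$, and using Gauss-Hermite quadrature; it is not the algorithm of any of the programs cited in the text. The function names, the number of quadrature nodes and the item difficulties are hypothetical, and items that were not administered are coded as NaN.

```python
import numpy as np


def rasch_prob(theta, beta):
    """P(X = 1) under the Rasch model, equation (1)."""
    return 1.0 / (1.0 + np.exp(-(theta - beta)))


def marginal_loglik(x, beta, mu, sigma, n_nodes=41):
    """Log marginal likelihood of the observed responses, as in (5).

    x is an (n, k) array with entries 1, 0 or NaN (item not administered);
    ability is assumed to be N(mu, sigma^2).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    theta = mu + np.sqrt(2.0) * sigma * nodes       # change of variables
    w = weights / np.sqrt(np.pi)                    # weights now sum to 1
    p = rasch_prob(theta[:, None], beta[None, :])   # shape (nodes, k)
    total = 0.0
    for xi in x:
        obs = ~np.isnan(xi)                         # use administered items only
        lik = np.prod(np.where(xi[obs] == 1, p[:, obs], 1 - p[:, obs]), axis=1)
        total += np.log(np.sum(w * lik))
    return total


beta = np.array([-0.5, 0.5])                        # hypothetical difficulties
patterns = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
probs = [np.exp(marginal_loglik(p[None, :], beta, 0.0, 1.0)) for p in patterns]
print("marginal pattern probabilities:", np.round(probs, 4))

# A student who was not administered item 2 contributes only item 1.
incomplete = marginal_loglik(np.array([[1.0, np.nan]]), beta, 0.0, 1.0)
complete_on_item1 = marginal_loglik(np.array([[1.0]]), beta[:1], 0.0, 1.0)
```

Skipping the NaN entries corresponds to integrating the missing responses out of the Rasch likelihood, which is why the incomplete record and the one-item record give the same value here.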

In (15) the third equality holds because of MAR, resulting in a factorization of the full likelihood into a term independent of $(\beta, \gamma)$ and the marginal distribution of $X_{obs}$. So just considering the marginal distribution of $X_{obs}$ will give the correct maximum likelihood estimates of $\beta$ and $\gamma$. Note that if we denote by $n_t$ the number of students taking test $t$, with $\sum_{t=1}^{T} n_t = n$, and define $\beta(t)$ as the $k_t$-vector of the item parameters of the items in test $t$, we can rewrite the second factor of (15) as

$$\prod_{i=1}^{n} \int_\theta P_\beta(x_{obs,i} \mid \theta_i)\,g_\gamma(\theta_i)\,d\theta_i = \prod_{t=1}^{T} \prod_{i=1}^{n_t} \int_\theta P_{\beta(t)}(x_{obs,i} \mid \theta_i)\,g_\gamma(\theta_i)\,d\theta_i .$$

The marginal likelihood in the incomplete design case is thus written as a product of $T$ complete data marginal likelihoods.

MML in targeted testing designs

Mislevy and Sheehan (1989) have presented a general discussion on the effect of using or not using (ignoring) the background information of the students in MML item calibration. In complete testing designs they consider the two different roles the background variable $Y$ can play in the sampling: students can be sampled from one population, or (stratified) from multiple subpopulations. In targeted testing designs the same two roles of $Y$ can be distinguished. As mentioned before, in TTOP $Y$ has no role in the sampling of the students, but depending on the values of $Y$ different subsets of the item pool are administered, and in TTMP $Y$ has a role both in the sampling of the students and in the assignment of items to students. Mislevy and Sheehan (1989) have only considered the latter situation. Their results will be summarized and will be compared and completed with the results in the TTOP situation.

Assume $Y$ to be a categorical (or categorized) variable taking one of $L$ values, establishing a division of the total student population into $L$ subpopulations. The value of $Y$ for student $i$ is defined as $y_i = (y_{i1}, \ldots, y_{iL})$ with $y_{il} = 1$ if student $i$ is associated with subpopulation $l$ and 0 if not, $(l = 1, \ldots, L)$. If $y_{il} = 1$ we will alternatively write $y_i = y^{(l)}$. The ability distribution $g_\gamma(\theta)$ of the total population in this case can be expressed as a finite mixture of $L$ subpopulation ability distributions:

$$g_\gamma(\theta) = \sum_{l=1}^{L} P(\theta, Y = y^{(l)}) = \sum_{l=1}^{L} P(\theta \mid Y = y^{(l)})\,P(Y = y^{(l)}) = \sum_{l=1}^{L} g_{\gamma_l}(\theta)\,\pi_l , \tag{16}$$

in which $\gamma_l$ is the possibly vector valued parameter of the ability distribution in subpopulation $l$ and $\pi_l$ the proportion of subpopulation $l$ in the total population.

In complete testing designs using or not using $Y$ in MML item calibration is equivalent to considering $Y$ as observed or missing data. In Mislevy and Sheehan (1989) checks of Rubin's ignorability conditions in this situation are given. Summarized, the results are: using $Y$ in MML item calibration makes it possible, independent of the sampling role, to estimate the item parameters $\beta = (\beta_1, \ldots, \beta_k)$ and the population parameters $\gamma = (\gamma_1, \ldots, \gamma_l, \ldots, \gamma_L, \pi_1, \ldots, \pi_l, \ldots, \pi_L)$ simultaneously. The justifiability of ignoring $Y$ depends on the sampling role of $Y$ in the design: correct estimates of the item parameters and the population parameters in MML item calibration are guaranteed only when we have a random sample from one population. In case we have samples from multiple subpopulations, ignoring $Y$ may lead to wrong estimates.

In targeted testing designs we first consider the TTOP situation. In TTOP we have a random sample from the total population with ability distribution $g_\gamma(\theta)$ (16). For students with value $y^{(l)}$ of $Y$, denote by $\beta(l)$ the $k_l$-vector of the item parameters of the items administered and by $r_l$ the accompanying value of the item indicator variable (see ()). Without loss of generality we may assume that the total number of distinguished subpopulations is the same as the number of different tests administered: $T = L$. If we use the background information in MML calibration, the partition which the observed design variable $m$ effects is in this case

$$U_{obs,i} = (X_{obs,i}, Y_i), \qquad U_{mis,i} = (X_{mis,i}, \theta_i), \qquad (i = 1, \ldots, n),$$

and the distribution of the missing data indicator follows from (4):

$$P(M_i = (r_l, 1, 0) \mid Y = y^{(l)}) = \phi_l, \qquad (l = 1, \ldots, L). \tag{17}$$

Note that the design vector variable $M_i$ has one element more compared to the situations in complete, in random incomplete and in multistage testing, indicating the observation of $Y$. The $(k+1)$th element indicates $Y$, the $(k+2)$th $\theta$. From (17) it is easily seen that the conditions for ignorability, MAR (depending only on observed responses) and D, are fulfilled. So the correct likelihood inference can be based on the marginal distribution of the observations. For a randomly sampled student we have:

$$\begin{aligned}
P_{\beta,\gamma,\pi}(x_{obs,i}, Y = y^{(l)})
&= \int_{x_{mis,i}} \int_\theta P_{\beta,\gamma,\pi}(x_{obs,i}, x_{mis,i}, Y = y^{(l)}, \theta_i)\,d\theta_i\,dx_{mis,i} \\
&= \int_\theta P_{\beta(l)}(x_{obs,i} \mid Y = y^{(l)}, \theta_i)\,P_{\gamma_l}(\theta_i \mid Y = y^{(l)})\,P(Y = y^{(l)})\,d\theta_i \\
&= \int_\theta P_{\beta(l)}(x_{obs,i} \mid \theta_i)\,g_{\gamma_l}(\theta_i)\,\pi_l\,d\theta_i \\
&= \prod_{l=1}^{L} \Big\{ \pi_l \int_\theta P_{\beta(l)}(x_{obs,i} \mid \theta_i)\,g_{\gamma_l}(\theta_i)\,d\theta_i \Big\}^{y_{il}}
\end{aligned} \tag{18}$$

The likelihood of the total sample is given by:

$$L(\beta, \gamma, \pi; x_{obs}, y) = \prod_{i=1}^{n} P_{\beta,\gamma,\pi}(x_{obs,i}, Y = y_i) = \prod_{i=1}^{n} \prod_{l=1}^{L} \pi_l^{\,y_{il}} \; \prod_{i=1}^{n} \prod_{l=1}^{L} \Big\{ \int_\theta P_{\beta(l)}(x_{obs,i} \mid \theta_i)\,g_{\gamma_l}(\theta_i)\,d\theta_i \Big\}^{y_{il}} \tag{19}$$

From (19) it is seen that the likelihood function consists of a term that depends only on the proportions $\pi_l$ of the subpopulations in the total population, and a term which is a product of $L$ ordinary marginal likelihood functions, with the understanding that they do not all contain the same item parameters. This is because for every student there is always exactly one $l$ for which $y_{il} = 1$. Standard maximum likelihood estimates $\hat{\pi}_l$, $l = 1, \ldots, L$, of the proportions can be obtained from the first part. Maximizing the second term with respect to $\gamma_l$, $l = 1, \ldots, L$, and $\beta$ will give estimates of the $L$ population parameters and the item parameters. Calibration using the background information in the TTOP case is thus a generalization of standard MML.

If we do not use the background information in the TTOP case, the partition the observed design variable $m$ establishes becomes:

$$U_{obs,i} = X_{obs,i}, \qquad U_{mis,i} = (X_{mis,i}, Y_i, \theta_i), \qquad (i = 1, \ldots, n). \tag{20}$$

The design distribution is given by:

$$P(M_i = (r_l, 0, 0) \mid Y = y^{(l)}) = \phi_l, \qquad (l = 1, \ldots, L). \tag{21}$$

We see that the MAR condition in this case is not fulfilled, because the design distribution depends on values of $Y$ which are considered as missing if we do not use $Y$ in the analyses. Not using $Y$ in the TTOP case is not justified by the ignorability principle and can lead to incorrect estimates of the parameters. The next example will illustrate this.

Example 2. In a simulation study, data were generated according to the following specifications: two nonequivalent samples of 1000 students were drawn from two normal distributions, respectively $\theta \sim N(-1, 1)$ and $\theta \sim N(+1, 1)$. The less able population was administered the first 6 items out of a pool of 9 items. The more able pupils took the last 6 items. So the anchor consisted of 3 items. The responses were generated according to the Rasch model and the item parameters are: $\beta_1 = -2.0$, $\beta_2 = -1.0$, $\beta_3 = -0.5$, $\beta_4 = \beta_5 = \beta_6 = 0$ and $\beta_7 = 0.5$, $\beta_8 = 1.0$, $\beta_9 = 2.0$. So we have a data matrix with the same structure as in a targeted testing design, in which students are assigned to one of the two test booklets on the basis of a background variable. If we estimate the item parameters ignoring the background variable or design variable and apply MML estimation in a standard way with one ability distribution, we get the results given in the third column of Table 2. We see a clear bias in the estimates of the parameters of the items that were administered in only one of the two nonequivalent samples. The difficulty parameters of the items administered only in the less able group ($E(\theta) = -1.0$) are overestimated and those of the items administered only in the more able group ($E(\theta) = 1.0$) are underestimated. If we do not ignore the design variable and estimate with two marginal distributions (19), we get the results in the fourth column of Table 2, which are seen to be free from systematic bias. It is noted that the results of these two calibrations are comparable by fixing both scales by β = 0.

Table 2. Input and estimated difficulty parameters, Rasch model

item    β (input)    β̂ (se); one marginal    β̂ (se); two marginals
1       -20          -52 (080)                -979 (079)
2       -0           -048 (072)               -0938 (072)
3       -05          005 (072)                -0498 (073)
4       0            -0042 (05)               -0066 (053)
5       0            0032 (05)                004 (053)
6       0            -0045 (05)               -0069 (053)
7       05           0046 (073)               0589 (075)
8       0            047 (073)                0952 (074)
9       20           480 (079)                996 (080)
mean                 µ̂ = 0047                µ̂_1 = -0986, µ̂_2 = 097
sd                   σ̂ = 293                 σ̂_1 = 0954, σ̂_2 = 29

In the TTMP situation the background variable is used as a stratification variable: from every subpopulation $l$, $l = 1, \ldots, L$, we have a random sample from $g_{\gamma_l}(\theta)$, with $n_l$ the number of observations in subpopulation $l$ and $\sum_{l=1}^{L} n_l = n$ the total sample size. The sampling proportions in the subpopulations, $n_l / n$, can but will in general not be equal to the population proportions $\pi_l$. These population proportions $\pi_l$ are not estimable in this case; they must be known in advance. This also means that in the TTMP case the distribution in the total population (16) can only be estimated completely provided the population proportions are known and we have samples from every subpopulation, $n_l > 0$, $l = 1, \ldots, L$. Otherwise we are not able to estimate all subpopulation parameters $\gamma_l$, $l = 1, \ldots, L$.

Another difference from the TTOP situation is that in TTMP the values of $Y$ are known before sampling, so $Y$ is not a random variable here. In order to identify the membership of a student of a subpopulation we will have to use the values of $Y$. So we will not consider the simultaneous probability of the observed response vector $x_{obs,i}$ and $Y$ as in the TTOP case (18), but the conditional distribution of $X_{obs,i}$ given $Y = y^{(l)}$. The design distribution is given by:

$$P(M_i = m_l = (r_l, 0)) = 1 \quad \text{if } Y = y^{(l)}. \tag{22}$$

Compared to the TTOP case (17), the design vector has one element less, because $Y$ is not random. Because of (22), the conditional distribution of a response vector given $Y = y^{(l)}$ is the same as the conditional distribution given the design variable. For a randomly sampled student from subpopulation $l$ we have:

$$\begin{aligned}
P_{\beta(l),\gamma_l}(x_{obs,i} \mid m_l) &= P_{\beta(l),\gamma_l}(x_{obs,i} \mid Y = y^{(l)}) \\
&= \int_{x_{mis,i}} \int_\theta P_\beta(x_{obs,i}, x_{mis,i} \mid \theta_i, Y = y^{(l)})\,P_{\gamma_l}(\theta_i \mid Y = y^{(l)})\,d\theta_i\,dx_{mis,i} \\
&= \int_\theta P_{\beta(l)}(x_{obs,i} \mid \theta_i)\,g_{\gamma_l}(\theta_i)\,d\theta_i .
\end{aligned}$$

And for the total sample we have the likelihood

$$\prod_{l=1}^{L} \prod_{i=1}^{n_l} \int_\theta P_{\beta(l)}(x_{obs,i} \mid \theta_i)\,g_{\gamma_l}(\theta_i)\,d\theta_i . \tag{23}$$

As before, the parameters $\beta$ and $\gamma_l$, $l = 1, \ldots, L$, provided $n_l > 0$, can be estimated from (23). It is noted that in the TTMP situation we do not ignore the design variable in the analyses but explicitly condition on it. If we do not use the background information in the TTMP case, this will not lead to correct inferences on the parameters. If we were willing to make the unrealistic extra assumption that all students are randomly drawn from one population with ability distribution $g_\gamma(\theta)$ defined by

$$g_\gamma(\theta) = \sum_{l=1}^{L} \pi_l\,g_{\gamma_l}(\theta) = \sum_{l=1}^{L} (n_l / n)\,g_{\gamma_l}(\theta),$$

then we are in fact in the TTOP situation, for which it was shown ((20) and (21)) that by ignoring $Y$ the MAR condition for ignorability is not fulfilled.

Summarizing, we can say that MML item calibration in complete testing designs is justified, and that as long as we are sampling from one population there is more or less a free choice of whether the background variable is used in order to get estimates of the item parameters. However, when sampling from multiple subpopulations, and always in incomplete targeted testing designs, in TTOP as well as TTMP, there is no choice: the background information $Y$ must be used. Not using $Y$ never leads to correct inferences on the item parameters or the population parameters. So we are obliged to use the subpopulation structure in MML estimation in order to get a correct estimation procedure. It will also be clear that the parameters of the ability distribution of the total population can only be estimated correctly, even in the case that we have a random sample from one population, via estimating the subpopulation parameters and the population proportions. Although standard computer implementations of MML procedures (e.g., in BILOG-MG, OPLM) have facilities to use $Y$ and to distinguish more subgroups in the samples, the awareness of the possible problems is not general and in practice many errors are made.

The Conditional Model and Missing Data

In the preceding section it was shown that in MML estimation in incomplete designs checking Rubin's (1976) conditions for ignorability is useful. Only when we are sampling from multiple populations is it not possible to ignore the design variable (in targeted testing), and the design must then be used explicitly in the analysis. But in all other cases considered, checking the standard conditions to be met for ignorability makes clear that estimating the parameters with MML while ignoring the design variable is justified. We will elaborate now on whether applying these ignorability checks is also useful in CML estimation. In applying the ignorability principle we fix the random design variable $M$ at the observed pattern of missing data $m$ and assume that the values $u_{obs}$ are realizations of the marginal distribution of $U_{obs}$ (7): $\int_{u_{mis}} f(u_{obs}, u_{mis})\,du_{mis}$.
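Remember (8) that the correct distribution of the realizations $u_{obs}$, $\int_{u_{mis}} f(u_{obs}, u_{mis} \mid m)\,du_{mis}$, the conditional distribution of $U_{obs}$ given $M = m$, is not used in the analysis, but only the marginal distribution of the observed responses. Note that in the CML case the design variable $M$ and the item indicator variable $R$ are the same, because the only variables inferred on are the item responses $X$, and $\theta$ is not treated as a random variable as in MML. It will be clear that ignoring the design variable in CML estimation is only possible if for an individual observed response vector $X_{obs,i}$ there exists a sufficient statistic $S_{obs,i} = S(X_{obs,i})$ for $\theta_i$ in the marginal distribution (40). It can easily be shown that in the IRT models we consider, for example in the Rasch model, the sum score $S_{obs,i} = \sum_j X_{obs,ij}$ is not sufficient for $\theta_i$ in the marginal distribution of the observations $X_{obs,i}$, and is also not sufficient in the distribution of all observed data $(X_{obs,i}, R_i)$; $S_{obs,i}$ is only sufficient in the conditional distribution of the responses given the item indicator variable $R_i$. An example will make this clear. Assume we have 3 items following the Rasch model with parameters $\varepsilon_i = \exp(-\beta_i)$, $i = 1, 2, 3$, and a random item indicator variable with two possible outcomes $(0 < \phi < 1)$: $P(R = r_1 = (1,1,0)) = \phi$ and $P(R = r_2 = (1,0,1)) = 1 - \phi$. In Table 3 the relevant probabilities for all outcomes with $S_{obs} = 1$, with $\exp(\theta) = \xi$, are given.

Table 3. Probabilities for all outcomes with $S_{obs} = 1$

$x_{obs}$    $r$      (1) $p(x_{obs}, r)$    (2) $p(x_{obs} \mid r_1)$    (3) $p(x_{obs} \mid r_2)$
(1,0)    $r_1$    $\dfrac{\phi\,\xi\varepsilon_1}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)}$    $\dfrac{\xi\varepsilon_1}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)}$    0
(0,1)    $r_1$    $\dfrac{\phi\,\xi\varepsilon_2}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)}$    $\dfrac{\xi\varepsilon_2}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)}$    0
(1,0)    $r_2$    $\dfrac{(1-\phi)\,\xi\varepsilon_1}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$    0    $\dfrac{\xi\varepsilon_1}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$
(0,1)    $r_2$    $\dfrac{(1-\phi)\,\xi\varepsilon_3}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$    0    $\dfrac{\xi\varepsilon_3}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$

$s_{obs} = 1$:
$P(s_{obs}) = \dfrac{\phi\,\xi(\varepsilon_1+\varepsilon_2)}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)} + \dfrac{(1-\phi)\,\xi(\varepsilon_1+\varepsilon_3)}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$
$P(s_{obs} \mid r_1) = \dfrac{\xi(\varepsilon_1+\varepsilon_2)}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_2)}$
$P(s_{obs} \mid r_2) = \dfrac{\xi(\varepsilon_1+\varepsilon_3)}{(1+\xi\varepsilon_1)(1+\xi\varepsilon_3)}$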
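The three-item example of Table 3 can also be checked numerically. The sketch below, written under the assumptions of the example (Rasch model, $r_1 = (1,1,0)$ with probability $\phi$ and $r_2 = (1,0,1)$ with probability $1 - \phi$), computes the probabilities of the outcomes with sum score 1 conditional on the score alone and conditional on the score and the design. The function names and the parameter values are hypothetical.

```python
import numpy as np

def pattern_prob(x, items, beta, theta):
    """Rasch probability of response pattern x on the administered items."""
    beta = np.asarray(beta, dtype=float)
    p = 1.0 / (1.0 + np.exp(-(theta - beta[items])))
    return float(np.prod(np.where(np.array(x) == 1, p, 1.0 - p)))

def conditional_probs(theta, beta, phi):
    """Return P(x, r | s = 1) and P(x | s = 1, r) for the two-design example."""
    # Item indices (0-based) and probability of each design.
    designs = {"r1": ([0, 1], phi), "r2": ([0, 2], 1.0 - phi)}
    patterns = [(1, 0), (0, 1)]          # the two patterns with sum score 1
    joint = {(x, r): w * pattern_prob(x, items, beta, theta)
             for r, (items, w) in designs.items() for x in patterns}
    p_s = sum(joint.values())            # P(s = 1), lower part of column (1)
    given_s = {k: v / p_s for k, v in joint.items()}
    given_s_r = {}
    for r, (items, _) in designs.items():
        # P(s = 1 | r), lower part of columns (2) and (3)
        p_s_r = sum(pattern_prob(x, items, beta, theta) for x in patterns)
        for x in patterns:
            given_s_r[(x, r)] = pattern_prob(x, items, beta, theta) / p_s_r
    return given_s, given_s_r

beta = np.array([-0.5, 0.0, 1.0])        # hypothetical item difficulties
for theta in (-1.0, 2.0):
    given_s, given_s_r = conditional_probs(theta, beta, phi=0.5)
    print(theta, given_s[((1, 0), "r1")], given_s_r[((1, 0), "r1")])
```

With these values the first printed probability changes with $\theta$ while the second does not, which is the behaviour the example is meant to show; the two would coincide only in special cases such as $\varepsilon_2 = \varepsilon_3$.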

Conditioning on $S_{obs}$ in the joint distribution of $X_{obs}$ and $R$, that is, dividing in Table 3 the terms in the upper part of column (1) by the term in the lower part, does not cancel the individual parameter $\xi$. On the other hand, it can easily be checked that in the conditional distributions of $X_{obs}$ given $R$, $S_{obs}$ is sufficient for $\xi$: divide the upper part terms in columns (2) and (3) of Table 3 by their lower part term. In the example the same is easily checked for the outcomes with $S_{obs} = 0$ and $S_{obs} = 2$. In general, the probability of the observed variables can be written as

$$P_{\theta,\beta,\phi}(x_{obs}, r) = P_{\theta,\beta,\phi}(x_{obs} \mid r)\,P_\phi(r). \tag{24}$$

We use the same notation as before. We distinguish $T$ values of the design variable $r_t$, $t = 1, \ldots, T$; $n_t$ is the number of students taking test $t$; $\beta(t)$ is the $k_t$-vector of the parameters of the items in test $t$. We can then rewrite (24) as:

$$P_{\theta,\beta,\phi}(x_{obs}, r) = \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_{\theta_i,\beta(t),\phi}(x_{obs,i} \mid r_t)\,P_\phi(r_t). \tag{25}$$

We see in (25) that we have in fact the product of $T$ complete data likelihoods. For every $t$, the contribution of test $t$ to the right-hand side of (25) can, as in complete data CML (see (3)), be written as

$$\prod_{i=1}^{n_t} P_{\theta_i,\beta(t),\phi}(x_{obs,i}, r_t) = \prod_{i=1}^{n_t} P_{\beta(t)}(x_{obs,i} \mid s_{obs,i}, r_t)\,P_{\theta_i,\beta(t),\phi}(s_{obs,i}, r_t). \tag{26}$$

The first factor in the right-hand side of (26) is again free of any incidental parameters, and

$$L_c = \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_{\beta(t)}(x_{obs,i} \mid s_{obs,i}, r_t) \tag{27}$$

can be used for CML estimation of $\beta$. Note that when estimating the item parameters in this way there are as many different sufficient statistics as there are designs involved.

So we have seen that the standard ignorability checks of Rubin cannot be applied in CML estimation. We have to condition explicitly on the design variable in order to get sufficient statistics for the incidental parameters. But whether it is justified to estimate the item parameters by just maximizing the likelihood (27) depends of course, as in the complete data case, on the properties of the part of the total likelihood (25) we neglect in that case. The neglected part in CML estimation in incomplete designs is (combining (25), (26) and (27))

$$\prod_{t=1}^{T} \prod_{i=1}^{n_t} P_{\theta_i,\beta(t),\phi}(s_{obs,i}, r_t) = \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_{\theta_i,\beta(t),\phi}(s_{obs,i} \mid r_t) \; \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_\phi(r_t). \tag{28}$$

In (28), the first factor on the right-hand side is the product of $T$ terms which are also neglected in the complete data case. Because neglecting this part was shown to be possible (Eggen, 2000) without severe consequences, the properties of the marginal distribution of the design variable will be decisive for the justification of neglecting the term. We will discuss the properties of (28) for the three considered design types next.

CML in random incomplete designs

In random incomplete designs the design distribution is given by (2). Considering the first factor of the part of the likelihood we neglect in CML (28), we see that this factor consists of the product of $T$ complete data distributions of the sufficient statistics $s_{obs}$, which can be neglected. From the design distribution (2) it is easily seen that the second part of (28), $P_\phi(r_t)$, does not depend on the item parameters at all. As a consequence, (28) can be neglected in CML estimation. So CML estimation is justified in random incomplete designs.

CML in multistage testing designs

In multistage testing the first part of (28) can be neglected for the same reason as in random incomplete designs. The second part, however, the design distribution in multistage testing designs, depends on the observed variables. Given the design distribution (4) we can write the second part as:

$$\prod_{t=1}^{T} \prod_{i=1}^{n_t} P_\phi(R_i = r_t) = \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_\phi(R_i = r_t \mid x_{obs1,i})\,P_{\beta^{(1)},\theta_i}(x_{obs1,i}), \tag{29}$$

in which $x_{obs1,i}$ denotes the observed responses of student $i$ to the items used for establishing the design and $\beta^{(1)}$ the parameters of these items. We see that (29) is for every $t$ directly dependent on the item parameters of the items used for establishing the design. This means that (28) cannot be neglected in CML estimation. So CML estimation is in this situation not justified, because it implies that not all random variations in the data relevant for estimating the item parameters are considered in the conditional likelihood. Applying CML estimation in these designs, which is possible by running standard computer programs for CML, gives incorrect estimates of the item parameters. An example will illustrate this.

Example 1 (continued). The items and the design used are given in Example 1.
For these items, 4000 responses were generated using a standard normal ability distribution. First, the item parameters estimated in the complete design are given in the third column of Table 4. In the fourth column the results are given of the item parameter estimates in the two-stage testing design. It is clear that applying CML estimation in this two-stage testing design gives systematic errors in the item parameter estimates: the item parameters of the easy items (4 and 5) are underestimated, and the parameters of the hard items (8 and 9) are overestimated.

Table 4. CML estimates and standard errors in a two-stage testing design

item    β (input)    β̂ (se) complete    β̂ (se) multistage    β̂ (se) multistage
1       0            -0360 (033)         -0360 (035)           -
2       0            0004 (033)          0060 (035)            -
3       0            0024 (033)          0028 (035)            -
4       -25          -284 (037)          -709 (049)            -326 (053)
5       -0           -0990 (036)         -49 (048)             -02 (052)
6       -05          -0445 (034)         -0467 (035)           -0452 (035)
7       05           0506 (034)          0535 (035)            057 (036)
8       0            0964 (035)          387 (047)             0989 (05)
9       25           257 (037)           674 (048)             293 (052)

The last column of Table 4 gives the results in case the item parameters of the routing test are not estimated themselves. It is seen that in that case CML gives correct estimates for the other items. This can be understood by the fact that the distribution of the design variable (26) is not dependent on the parameters to be estimated. If we denote the indices of the observed items in the routing test by $obs1$ and their parameter vector by $\beta^{(1)}$, and the other items by $obs2$ and $\beta^{(2)}$, then in CML estimation of the items that are not in the routing test the following likelihood is used:

$$L_c = \prod_{t=1}^{T} \prod_{i=1}^{n_t} P_{\beta^{(2)}(t)}(x_{obs2,i} \mid s_{obs2,i}, r_t).$$

The distribution of the design which is neglected in the estimation, given by

$$\prod_{t=1}^{T} \prod_{i=1}^{n_t} P_\phi(R_i = r_t \mid x_{obs1,i})\,P_{\beta^{(1)},\theta_i}(x_{obs1,i}),$$

does not depend on the parameters $\beta^{(2)}$ which are estimated. Following the procedure given in Example 1 is a possible practical solution if the items are to be estimated with CML in a two-stage testing design. Glas (1988) showed that another possible approach for CML in multistage testing, conditioning on the scores for every stage of the design, fails, because it results in separate calibrations for the items in a stage, which cannot be connected on the same scale.

CML in targeted testing designs

In targeted testing designs the value of a background variable $Y$ determines the design. The design distribution is given by (4). Before, we made the distinction between the two sampling roles $Y$ can play in the design, and using or not using $Y$ was of utmost importance in MML estimation. In CML estimation, however, these distinctions are not relevant. Firstly, consider complete testing designs in the presence of background information. The simultaneous probability of the response vector $X_i$ and of $Y$ of student $i$ is given by

$$P_{\theta,\beta,\pi}(x_i, Y = y^{(l)}) = P_{\theta,\beta}(x_i \mid Y = y^{(l)})\,P_\pi(Y = y^{(l)}).$$

Conditioning on the sufficient statistic $S_i$ gives:

$$\begin{aligned}
P_{\theta,\beta,\pi}(x_i, Y = y^{(l)}) &= P_{\theta,\beta}(x_i \mid s_i, Y = y^{(l)})\,P_{\theta,\beta}(s_i \mid Y = y^{(l)})\,P_\pi(Y = y^{(l)}) \\
&= P_\beta(x_i \mid s_i)\,P_{\theta,\beta}(s_i \mid Y = y^{(l)})\,P_\pi(Y = y^{(l)}).
\end{aligned} \tag{30}$$

In (30), $Y = y^{(l)}$ cancels in $P_\beta(x_i \mid s_i)$ because given $\theta$ the item responses do not depend on any other characteristic of the students (local independence). The complete likelihood of the sample is given by:

$$\prod_{i=1}^{n} P_\beta(x_i \mid s_i)\,P_{\theta,\beta}(s_i \mid Y = y^{(l)})\,P_\pi(Y = y^{(l)}). \tag{31}$$

From (31), the first factor is used in CML estimation. As before, the second factor is always discarded in CML estimation, and the third factor is independent of $\beta$. So CML is a justified procedure to estimate $\beta$. Furthermore, it is clear that the background information is in fact always used in the analyses, since it defines the design, but it appears only in that part of the likelihood which can be neglected in CML estimation. If we have samples from multiple populations, all of the above still holds. The only change we have to make is that we start with $P_{\theta,\beta}(x_i \mid Y = y^{(l)})$, with as a consequence that $P_\pi(Y = y^{(l)})$ cancels in (30) and (31). So it can be concluded that in CML estimation all the sample information is in that part of the total likelihood which is justified to be neglected. The independence of CML estimation of the actual sample available for estimation can be understood in this way.

Next, we consider incomplete targeted testing. Here we distinguish as many values ($L$) of the design variable $r_l$ as we distinguish values of the background variable $Y$. If we rewrite the total likelihood as before ((25), (27) and (28)), we see that the conditional likelihood to be maximized is:

$$\prod_{l=1}^{L} \prod_{i=1}^{n_l} P_{\beta(l)}(x_{obs,i} \mid s_{obs,i}, r_l, Y = y^{(l)}), \tag{32}$$

and the neglected part becomes

$$\prod_{l=1}^{L} \prod_{i=1}^{n_l} P_{\theta_i,\beta(l),\phi,\pi}(s_{obs,i}, r_l, Y = y^{(l)}) = \prod_{l=1}^{L} \prod_{i=1}^{n_l} P_{\theta_i,\beta(l),\phi,\pi}(s_{obs,i} \mid r_l, Y = y^{(l)}) \; \prod_{l=1}^{L} \prod_{i=1}^{n_l} P_{\phi,\pi}(r_l, Y = y^{(l)}). \tag{33}$$

From the design distribution (4) it is seen that the second part of the right-hand side of (33) is independent of the item parameters which are to be estimated. So CML estimation, on the basis of the conditional likelihood (32), is justified in targeted testing.

Example 2 (continued). If we estimate the item parameters of Example 2 with CML, we see in the results in Table 5 that targeted testing does not cause any systematic errors in the item parameter estimates.

Table 5. Input β and estimated β̂ difficulty parameters, Rasch model

item    β (input)    β̂ (se); CML
1       -20          -980 (080)
2       -0           -0935 (072)
3       -05          0497 (073)
4       0            -0066 (053)
5       0            005 (053)
6       0            -0069 (053)
7       05           0592 (075)
8       0            0954 (074)
9       20           986 (080)
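A conditional likelihood of this kind, a product over test booklets of complete data conditional likelihoods, can be sketched in Python as follows. The sketch assumes the Rasch model with $\varepsilon_j = \exp(-\beta_j)$ and computes the elementary symmetric functions by repeated polynomial multiplication; the function names and the data layout (one response matrix per booklet) are illustrative assumptions, not the implementation used for Table 5.

```python
import numpy as np

def elementary_symmetric(eps):
    """Elementary symmetric functions gamma_0, ..., gamma_k of eps,
    obtained as the coefficients of prod_j (1 + eps_j * z)."""
    gamma = np.array([1.0])
    for e in eps:
        gamma = np.convolve(gamma, [1.0, e])
    return gamma

def cml_loglik(beta, responses, booklets):
    """Conditional log-likelihood summed over booklets.

    beta      : difficulties of all items in the pool
    responses : list with one (n_t x k_t) 0/1 array per booklet t
    booklets  : list with the item indices administered in booklet t
    """
    beta = np.asarray(beta, dtype=float)
    total = 0.0
    for x, items in zip(responses, booklets):
        x = np.asarray(x)
        b = beta[list(items)]               # parameters beta(t) of booklet t
        gamma = elementary_symmetric(np.exp(-b))
        scores = x.sum(axis=1)              # sufficient statistic per student
        # log P(x | s, r_t) = -sum_j x_j * beta_j - log gamma_s(eps(t))
        total += float(-(x @ b).sum() - np.log(gamma[scores]).sum())
    return total

# Hypothetical 4-item pool with two overlapping booklets of 3 items each.
beta = np.array([-1.0, 0.0, 0.5, 1.0])
booklets = [[0, 1, 2], [1, 2, 3]]
responses = [np.array([[1, 0, 0], [1, 1, 0]]), np.array([[0, 1, 1], [0, 0, 1]])]
print(cml_loglik(beta, responses, booklets))
```

In an actual calibration this function would be maximized over $\beta$ with a numerical optimizer under an identifying restriction such as $\sum_j \beta_j = 0$; students with a zero or perfect score on their booklet contribute a constant and carry no information about $\beta$.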

Conclusion

In this study it has been shown, for the three most common stochastic incomplete design types, under which conditions item calibration is possible. It was seen that in MML estimation Rubin's ignorability principle can directly be applied to justify the missing data procedures. In CML estimation this was seen not to be the case. In CML the design is never ignored and must always be an explicit part of the conditional likelihood. In CML we in fact always work with the combination of as many complete data likelihoods as there are designs. The key condition for justifying CML lies in the dependence of the distribution of the design variable on the item parameters which are to be estimated.

Summarizing, it can be said that in random incomplete designs both MML and CML are possible. In multistage testing designs MML is always a good option for item calibration; CML estimation is in multistage testing in general not justified. It was shown that in a two-stage testing design a practically feasible solution is to conduct the CML estimation without estimating the item parameters of the routing test. In targeted testing CML is always possible; MML estimation sometimes gives problems. If one knows, for instance by stratified sampling, that in the testing design the assignment to the test booklets is according to these strata, MML estimation is justified when as many marginal ability distributions are specified as there are strata or designs. Ignoring the background variable gives biased results. It was noticed that with standard computer algorithms for MML, which assume a random sample from one population, in practice many errors are made when we have in fact not one random sample but a stratified sample, or when we have a targeted testing design. In CML computer algorithms data from multistage testing designs can give incorrect results.

In this study some small examples were given to show the misbehaviour of some procedures. How much impact this has in practical situations, in which we have more items and other distributions of the item parameters, is worthwhile exploring. It should be noticed that all the principles elaborated for the three basic designs can also be applied in combination, when we
have designs in which properties of the basic designs are combined. Finally, it is remarked that in this study all results are for convenience illustrated by the simple one-parameter logistic model for dichotomously scored items. But the results also apply, whenever CML or MML is applicable, to models for polytomously scored items and to models with more than one item parameter.

REFERENCES

Andersen, E.B. (1973). Conditional inference and models for measuring. Unpublished PhD Thesis, Copenhagen: Mentalhygiejnisk Forlag.

Bock, R.D. & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Eggen, T.J.H.M. (2000). On the loss of information in conditional maximum likelihood estimation. Psychometrika, 65, 337-362.

Fischer, G.H. (1981). On the existence and uniqueness of maximum likelihood estimates in the Rasch model. Psychometrika, 46, 59-77.

Glas, C.A.W. (1988). The Rasch model and multistage testing. Journal of Educational Statistics, 13, 45-52.

Glas, C.A.W. (1989). Contributions to estimating and testing Rasch models. Unpublished PhD Thesis, Arnhem: Cito.