Hao Ren, Wim J. van der Linden and Qi Diao


Psychometrika, vol. 82, no. 2, June 2017, doi: /s

CONTINUOUS ONLINE ITEM CALIBRATION: PARAMETER RECOVERY AND ITEM UTILIZATION

Hao Ren, Wim J. van der Linden and Qi Diao
PACIFIC METRICS

Parameter recovery and item utilization were investigated for different designs for online test item calibration. The design was adaptive in a double sense: it assumed both adaptive testing of examinees from an operational pool of previously calibrated items and adaptive assignment of field-test items to the examinees. Four criteria of optimality for the assignment of the field-test items were used, each of them based on the information in the posterior distributions of the examinee's ability parameter during adaptive testing as well as the sequentially updated posterior distributions of the field-test item parameters. In addition, different stopping rules based on target values for the posterior standard deviations of the field-test parameters and the size of the calibration sample were used. The impact of each of the criteria and stopping rules on the statistical efficiency of the estimates of the field-test parameters and on the time spent by the items in the calibration procedure was investigated. Recommendations as to the practical use of the designs are given.

Key words: adaptive testing, Bayesian optimal design, D-optimality, item calibration.

1. Introduction

The need for rapid and reliable replenishment of the items in an item pool has been greatly increased by the introduction of computerized adaptive testing (CAT). Online real-time calibration not only enables us to calibrate new items (or field-test items, as we will call them) under the exact same conditions as for their future use, but also produces estimates of field-test item parameters on the scale already in use for the testing program. If well done, no further linking or rescaling should be needed.
Correspondence should be made to Hao Ren, Pacific Metrics, 1 Lower Ragsdale Dr #150, Monterey, CA 93940, USA. Email: hren@pacificmetrics.com. © 2017 The Psychometric Society.

Online calibration has already been used, for instance, to calibrate field-test items for the adaptive version of the Armed Services Vocational Aptitude Battery (CAT-ASVAB) (Segall & Moreno, 1999). Various estimation procedures for this application were evaluated by Segall (2003). Previous studies of alternative procedures of online calibration (Stocking, 1988, 1990; Wainer & Mislevy, 1990; Segall, 2002; Chang, 2013) may seem to suggest that Stocking's (1988) Method A is a natural way to proceed. The method treats the estimates of the ability parameters obtained from the responses to the operational items during the field test as their true values, which are then simply fixed during the calibration. However, even for highly efficient forms of testing, such as adaptive testing, ability estimates still have error. It is unknown in what ways propagation of this error will impact the accuracy of the new item parameter estimates. Neither is it known if continued use of the principle might lead to any serious scale drift over time. But whatever its actual consequences, an effective way to account for such error is an empirical Bayes approach based on the full posterior distributions of all ability and field-test parameters involved in the calibration. In addition, through permanent updates of the posterior distributions of the field-test parameters, the approach would also enable us to better and better adapt the assignment of the field-test items to new examinees. Several advantages of a Bayesian approach were explored by van der Linden and Ren (2015) using MCMC sampling from the posterior distributions of all

parameters. These authors also evaluated the approach for several criteria for the assignment of field-test items from the optimal design literature (e.g., Berger & Wong, 2009; 1980), with the criterion of posterior expected D-optimality as the clear winner.

The intention of the current research was to take the approach several steps further toward practical implementation. For instance, for an active testing program, it makes much practical sense to produce new items continuously and feed them into the calibration procedure as soon as they are written and reviewed. But, when doing so, these new items are allowed to compete immediately with the field-test items already in the pool. Consequently, the practice may lead to cases in which some of the field-test items stay in the calibration process much longer than anticipated, which may be undesirable if they have certain content attributes badly needed for operational testing. As this process has never been documented, our current research carefully investigated the various patterns of item utilization that different online calibration designs may generate. Obviously, if a pattern is found to be undesirable, the item-assignment criterion used for the design should be adjusted, giving the pertinent items a somewhat higher priority to finish their calibration.

Our second goal was to study the impact of several alternative implementations of the criterion of D-optimality on the accuracy of the field-test parameter estimates. The same criterion was used by Berger (1992, 1994) and Jones and Jin (1994) in earlier studies for use in a sequential calibration procedure. van der Linden and Ren (2015) used a Bayesian version of the criterion with maximization of the expectation of the observed information matrix of the field-test items jointly across the (updated) posterior distributions of the examinee and all item parameters (Eq. 7 below).
From a well-known result in optimal design we know, however, that for the logistic regression model the criterion of D-optimality boils down to a design with equal weights at design points $\theta = b_i \pm 1.542/a_i$ (e.g., Abdelbasit & Plackett, 1983). For the two-parameter logistic (2PL) response model, the design implies equal numbers of examinees at the true ability levels corresponding to the response probabilities of 0.176 and 0.824 for the items. In fact, this design is exactly the one used in the sequential approach to item parameter estimation by Berger (1992). To our knowledge, though, fully adaptive versions of this two-point design have not yet been applied to the problem of online adaptive item calibration.

In the current research we used two versions of this type of design: a straightforward version based on point estimates of the ability parameters produced by the adaptive test and a fully Bayesian version of it based on the posterior expectation of the response probabilities. In addition, we used a similar Bayesian version of Buyske's (1998) weighted A-criterion, which also implies a two-point design but with somewhat less extreme response probabilities. Each of these three alternatives was evaluated against the Bayesian definition of D-optimality borrowed from van der Linden and Ren (2015). A more formal introduction to each of these criteria follows in the main section of this paper below.

In addition, the impact of two different types of stopping rules for the calibration of field-test items was investigated. One type was based on a target for the posterior standard deviation of the field-test parameters; the other on a target for the total number of administrations per item. As soon as the target of choice was met for an item, it was removed from the calibration procedure. Meanwhile, simulating the condition of continuous item production, new field-test items were fed into the item pool.
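The relation between the two-point design and its response probabilities can be checked with a few lines of code. The sketch below is only illustrative: the function names and the item parameters (a = 1.2, b = 0.5) are assumptions, not values from the study. It evaluates the 2PL response function at the design points $\theta = b \pm 1.542/a$.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def d_optimal_points(a, b, c=1.542):
    """Lower and upper design points theta = b -/+ c/a of the two-point design."""
    return b - c / a, b + c / a

# Hypothetical item parameters, chosen only for illustration.
a, b = 1.2, 0.5
lower, upper = d_optimal_points(a, b)
print(round(p_2pl(lower, a, b), 3), round(p_2pl(upper, a, b), 3))  # 0.176 0.824
```

Because $a(\theta - b) = \pm 1.542$ at the design points, the two probabilities are the same for every item regardless of its parameter values.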
In each of our studies, the item pool consisted of two sections: (i) operational items calibrated with enough precision to treat their item parameters as known and (ii) field-test items assigned to the examinees to collect responses for their calibration. The adaptive test was simulated to run using the operational items until a position toward the end of the test was reached and a field-test item needed to be assigned to the examinee. The responses to the field-test items were not used to update the examinee's ability parameter, only to update the parameters of the field-test items.

Generally, the more toward the end of the test the position of the field-test items, the more informative the current posterior distribution of the examinee's ability parameter and the more efficient the assignment of the field-test items. On the other hand, the practice of assigning field-test items to the very last positions in the test may quickly become known, with subsequent careless responses to them by the examinees. Hence, our practical solution was to randomly select k positions for field testing from the last p positions in the test for each examinee (with k = 2 or 3 and p = 2k, say).

In principle, it is possible to update field-test parameters after each single new response. But in real-world applications, with different examinees being tested at different machines at possibly different sites, we expect the parameters to be updated once after each group (batch) of new responses. We therefore used a few different batch sizes to study their effect.

2. Model and Methods

The two-parameter logistic (2PL) item response model defines the probability of a correct response to item i by an examinee with ability $\theta$ as

$$P(u_i = 1 \mid \theta) = \left[1 + \exp\left(-a_i(\theta - b_i)\right)\right]^{-1}, \tag{1}$$

where $u_i$ represents the response to item i, with 1 for a correct and 0 for an incorrect response, and $a_i$ and $b_i$ are the discrimination and difficulty parameters of item i, respectively. We will use $\eta_i \equiv (a_i, b_i)$.

2.1. Estimation of the Examinee's Ability

A common N(0, 1) prior distribution was adopted for estimating the ability parameters during adaptive testing. After n items, the posterior distribution of the jth examinee's ability parameter is

$$f(\theta_j \mid \mathbf{u}_j, \eta_i, i = 1,\ldots,n) = \frac{\prod_{i=1}^{n} P_i^{u_{ij}} (1 - P_i)^{1 - u_{ij}}\, e^{-\theta_j^2/2}}{\int \prod_{i=1}^{n} P_i^{u_{ij}} (1 - P_i)^{1 - u_{ij}}\, e^{-\theta_j^2/2}\, d\theta_j}, \tag{2}$$

where $\mathbf{u}_j \equiv (u_{ij})$ is the vector of n responses. The item parameters $\eta_i$ in this study were estimated from operational data and treated as the true parameter values.
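As a minimal sketch of how the posterior in (2) might be summarized numerically, the code below computes the posterior mean and standard deviation of $\theta_j$ on an equally spaced grid. The grid limits, the number of nodes and the function names are assumptions made for illustration; the study itself only states that simple numerical quadrature was used.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response, Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def eap(responses, items, lo=-4.0, hi=4.0, n_nodes=81):
    """Posterior mean and SD of theta under a N(0, 1) prior, Eq. (2).

    responses: list of 0/1 scores; items: list of (a, b) pairs treated as known.
    The grid settings are illustrative assumptions.
    """
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + k * step for k in range(n_nodes)]
    weights = []
    for t in nodes:
        w = math.exp(-0.5 * t * t)  # N(0, 1) kernel
        for u, (a, b) in zip(responses, items):
            p = p_2pl(t, a, b)
            w *= p if u == 1 else 1.0 - p  # likelihood of the response
        weights.append(w)
    total = sum(weights)  # plays the role of the denominator in Eq. (2)
    mean = sum(w * t for w, t in zip(weights, nodes)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, nodes)) / total
    return mean, math.sqrt(var)

# Hypothetical examinee with three responses to items with known parameters.
print(eap([1, 1, 0], [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]))
```

The common grid spacing cancels in the ratios, so it does not need to appear in the sums.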
The integral in the denominator does not have a closed form, but various numerical methods are available to calculate it. In this study, simple numerical quadrature was used to calculate the posterior mean and standard deviation of $\theta_j$. The posterior distributions of $\theta$ in adaptive testing are known to converge to normality (Chang & Ying, 2009), and practical experience has shown the convergence to be fast. Hence, the full posterior distributions of $\theta_j$ used to assign the field-test items to the examinees toward the end of adaptive testing were approximated to be normal with means and standard deviations calculated from (2).

2.2. Calibration of the Field-Test Items

Let $f = 1,\ldots,F$ denote the field-test items. The initial prior distribution was a bivariate normal for the logarithm of discrimination parameter $a_f$ and difficulty parameter $b_f$. That is,

$$\begin{pmatrix} \ln a_f \\ b_f \end{pmatrix} \sim N\left(\mu_f^{(0)}, \Sigma_f^{(0)}\right), \tag{3}$$

where $\mu_f^{(0)}$ is the vector with the prior means of $\ln(a_f)$ and $b_f$, and $\Sigma_f^{(0)}$ is the prior covariance matrix. Logarithmic transformation of $a_f$ is a standard procedure to give its distribution an appropriate support. The calibration process was assumed to begin with a low-informative choice of $(\mu_f^{(0)}, \Sigma_f^{(0)})$. The posterior updates after each new batch of examinees were performed using (bivariate) normal approximations as well.

Let $j = 1,\ldots,J_b$ denote the examinees in batch $b = 1,\ldots,B$ used to update the posterior distribution of the parameters $\eta_f = (a_f, b_f)$ of field-test item f, and let $\boldsymbol{\theta} \equiv (\theta_j)$ and $\mathbf{u}_f \equiv (u_{fj})$. Assume $b - 1$ batches have already been processed. The joint posterior distribution of the responses and all model parameters after the collection of batch b has density

$$f^{(b)}\left(\mathbf{u}_f, \eta_f, \boldsymbol{\theta} \mid \mu_f^{(b-1)}, \Sigma_f^{(b-1)}, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) = \prod_{j=1}^{J_b} P_{fj}^{u_{fj}} (1 - P_{fj})^{1 - u_{fj}}\, f\left(\theta_j \mid \mu_{\theta_j}, \sigma^2_{\theta_j}\right) f\left(\eta_f \mid \mu_f^{(b-1)}, \Sigma_f^{(b-1)}\right), \tag{4}$$

where $\mu_{\theta_j}$ and $\sigma^2_{\theta_j}$ are the mean and variance of the posterior distribution of the ability parameter for the jth examinee in the batch. The posterior distribution of $\eta_f$ is

$$f^{(b)}\left(\eta_f \mid \mathbf{u}_f, \mu_f, \Sigma_f, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) = \frac{\int f^{(b)}\left(\mathbf{u}_f, \eta_f, \boldsymbol{\theta} \mid \mu_f^{(b-1)}, \Sigma_f^{(b-1)}, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) d\boldsymbol{\theta}}{\iint f^{(b)}\left(\mathbf{u}_f, \eta_f, \boldsymbol{\theta} \mid \mu_f^{(b-1)}, \Sigma_f^{(b-1)}, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) d\boldsymbol{\theta}\, d\eta_f}. \tag{5}$$

As the integral over $\boldsymbol{\theta}$ factorizes into its components, it is straightforward to use Gauss-Hermite quadrature (e.g., with 20 nodes) to calculate the update of the mean vector and variance matrix of (5) used for the assignment of the field-test item to the next batch of test takers.

2.3. Field-Test Item Assignment

The criterion of D-optimality involves minimization of the determinant of the covariance matrix of the field-test parameters $\eta_f$ or, (asymptotically) equivalently, maximization of the determinant of their Fisher information matrix. For computational convenience, the latter is used.
In the current context of adaptive testing, the choice of this criterion amounts to the selection of the field-test item for which the current examinee would yield the greatest increase in the determinant. The Bayesian version of the criterion allows us to anticipate the increase by calculating its posterior expectation across the examinee's response distribution.

More specifically, let $J_{u_f}(\eta_f; \theta_j)$ denote the observed information matrix for the response $U_f = u_f$ by examinee j with ability $\theta_j$ to field-test item f. The update of the posterior distribution of $\eta_f$ in (5) based on the preceding batch $b - 1$ along with the posterior distribution of $\theta_j$ in (2) after n items taken by examinee j enables us to calculate the posterior expectation of the observed information matrix as

$$\begin{aligned} J_{U_f} &= \iint \sum_{u_f} J_{u_f}(\eta_f; \theta_j)\, p(u_f; \eta_f, \theta_j)\, f^{(b-1)}\left(\eta_f \mid \mathbf{u}_f, \mu_f, \Sigma_f, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) f\left(\theta_j \mid \mathbf{u}_j, \eta_i, i = 1,\ldots,n\right) d\eta_f\, d\theta_j \\ &= \iint I_{U_f}(\eta_f; \theta_j)\, f^{(b-1)}\left(\eta_f \mid \mathbf{u}_f, \mu_f, \Sigma_f, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) f\left(\theta_j \mid \mathbf{u}_j, \eta_i, i = 1,\ldots,n\right) d\eta_f\, d\theta_j, \end{aligned} \tag{6}$$

where $I_{U_f}(\eta_f; \theta_j)$ is the expected Fisher information matrix. Now, let $J_f^{(b-1)}$ be the cumulative information matrix for the same field-test item until examinee j. The expected gain in the determinant due to the assignment of f to j would be

$$D_f^{(1)} = \det\left(J_f^{(b-1)} + J_{U_f}\right) - \det\left(J_f^{(b-1)}\right). \tag{7}$$

The field-test item with maximum expected gain among all field-test items is assigned to the examinee.

Theoretically, the two-point D-optimal design holds for the 2PL model with true design points

$$\theta = b_f \pm 1.542/a_f, \tag{8}$$

which, as already noted, are the points with the probabilities of 0.176 or 0.824 of a correct response to the item. In spite of its simple nature, the design can only be used once we have decided on how to deal with the unknown $\theta$ parameter of the examinees. Besides, for each field-test item we have to create a two-point distribution over this parameter with equal numbers of examinees at both points. This requirement of equal numbers is an essential feature of the design. Of course, minor violations of it might not hurt. But if we assigned an item only to examinees at the same one of the two points, the result would be completely counterproductive; it is impossible to produce any stable estimate of a difficulty or discrimination parameter from a one-point ability distribution. A simple way to avoid this outcome is by alternating between the two points for each item. Let $n_f$ be the count of the number of times field-test item f has already been assigned. The alternation will be based on the indicator variable

$$\delta_{fj} = \begin{cases} 0, & \text{if } n_f \text{ is odd for examinee } j, \\ 1, & \text{otherwise.} \end{cases} \tag{9}$$

Two different implementations of the criterion were studied. The first implementation consisted of the substitution of point estimates for the parameters in (8) for each of the field-test items and selection of the item with the result closest to the examinee's ability estimate.
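The expected gain in (7) and the alternating two-point rule can be illustrated with the following sketch. It is not the implementation used in the study: the posterior expectation is approximated by Monte Carlo draws instead of quadrature, $\ln a_f$ and $b_f$ are drawn independently rather than from a bivariate normal, and all function names and example values are hypothetical. The information matrix follows from differentiating the 2PL log-likelihood with respect to $(a_f, b_f)$, and the choice of the upper design point for odd counts is an assumption about the convention in (9).

```python
import math
import random

def fisher_info_2pl(theta, a, b):
    """Expected Fisher information about (a, b) from one response at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    w = p * (1.0 - p)
    d = theta - b
    return [[w * d * d, -w * a * d], [-w * a * d, w * a * a]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def expected_info(theta_post, lna_post, b_post, draws=2000, seed=1):
    """Monte Carlo average of the information over (mean, sd) normal posteriors.

    Assumption: independent normals for theta, ln(a) and b.
    """
    rng = random.Random(seed)
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(draws):
        theta = rng.gauss(*theta_post)
        a = math.exp(rng.gauss(*lna_post))
        b = rng.gauss(*b_post)
        info = fisher_info_2pl(theta, a, b)
        for r in range(2):
            for c in range(2):
                acc[r][c] += info[r][c] / draws
    return acc

def d_gain(cumulative, expected):
    """Expected gain in the determinant, as in Eq. (7)."""
    total = [[cumulative[r][c] + expected[r][c] for c in range(2)] for r in range(2)]
    return det2(total) - det2(cumulative)

def two_point_distance(theta_hat, a_mean, b_mean, n_assigned, c=1.542):
    """Distance from theta_hat to the design point currently active for the item."""
    lower, upper = b_mean - c / a_mean, b_mean + c / a_mean
    target = upper if n_assigned % 2 == 1 else lower  # alternate by parity, Eq. (9)
    return abs(target - theta_hat)

# Hypothetical cumulative information and posteriors for one candidate item.
J = [[5.0, 0.0], [0.0, 5.0]]
E = expected_info(theta_post=(0.3, 0.4), lna_post=(0.0, 0.2), b_post=(0.1, 0.3))
print(d_gain(J, E), two_point_distance(0.3, 1.0, 0.1, n_assigned=7))
```

Under the full criterion the item with the largest `d_gain` would be assigned; under the first two-point implementation it would be the item with the smallest `two_point_distance`.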
More formally, let $\theta_f^{(1)} \equiv b_f - 1.542/a_f$ and $\theta_f^{(2)} \equiv b_f + 1.542/a_f$, both times with the item parameters evaluated at their current posterior means. In addition, let $\hat{\theta}_j$ denote the current posterior mean of $\theta$ for examinee j. The field-test item with the minimum value of

$$D_f^{(2)} = \min\left\{ \left|\theta_f^{(1)} - \hat{\theta}_j\right|^{\delta_{fj}}, \left|\theta_f^{(2)} - \hat{\theta}_j\right|^{1 - \delta_{fj}} \right\} \tag{10}$$

across all field-test items is then assigned to examinee j. The implementation is straightforward but does ignore existing uncertainty in the parameters.

The second implementation focuses on the success probabilities $P(U_{fj} = 1 \mid \theta_j, \eta_f)$ at the two design points for the field-test items. It allows for the current uncertainty about the parameters by calculating the posterior expectation of the probabilities given the examinee's response vector $\mathbf{u}_j$ as

$$\bar{P}_{fj} = \iint P\left(U_{fj} = 1 \mid \theta_j, \eta_f\right) f\left(\eta_f \mid \mathbf{u}_f, \mu_f^{(b-1)}, \Sigma_f^{(b-1)}, \mu_{\theta_j}, \sigma^2_{\theta_j}, j = 1,\ldots,J_b\right) f\left(\theta_j \mid \mathbf{u}_j, \eta_i, i = 1,\ldots,n\right) d\eta_f\, d\theta_j. \tag{11}$$

The item with the minimum of

$$D_f^{(3)} = \min\left\{ \left|\bar{P}_{fj} - 0.176\right|^{\delta_{fj}}, \left|\bar{P}_{fj} - 0.824\right|^{1 - \delta_{fj}} \right\} \tag{12}$$

across all field-test items is assigned to examinee j. Observe that $\delta_{fj} = 0$ sets the distance between $\bar{P}_{fj}$ and the lower design point equal to one, so this point is never chosen. The same holds for $1 - \delta_{fj} = 0$ and the upper point.

Buyske (1998) proposed an alternative two-point design for use with adaptive testing based on an A-criterion that maximizes the trace of the Fisher information matrix about the $\theta$ parameter to be expected once the item is calibrated. The criterion implies alternate $\theta$ points for each item with success probabilities equal to 0.25 and 0.75. As these points are less extreme than those in (12), the design can be expected to be less dependent on the correctness of the model. In addition, as one of the features of adaptive testing is quick convergence of the operational items to rather comfortable response probabilities close to 0.50, Buyske's criterion avoids the administration of field-test items that are suddenly too boring or challenging to the test takers. In order to assess robustness of the D-optimal design in (12) against such changes in its probabilities, we added

$$D_f^{(4)} = \min\left\{ \left|\bar{P}_{fj} - 0.25\right|^{\delta_{fj}}, \left|\bar{P}_{fj} - 0.75\right|^{1 - \delta_{fj}} \right\}, \tag{13}$$

with $\bar{P}_{fj}$ still calculated as in (11), as a criterion to our study.

3. Setup of the Simulation Study

The goal of this simulation study was to simulate continuous online item calibration under the various conditions specified above, evaluating the results for the statistical quality of the parameter estimates as well as the time spent by the field-test items in the calibration procedure. The setup of the study was as follows:

The simulated examinees were randomly sampled from a population with a standard normal ability distribution.
Each simulated examinee took an adaptive test of n = 30 items, after which a decision on the insertion of three field-test items in the remaining portion of the test was to be made. During adaptive testing, the operational items were selected using the well-known maximum information selection procedure with the ability parameter estimated by expected a posteriori (EAP) estimation (Bock & Mislevy, 1982). The initial prior distribution for the ability parameters was the standard normal; consequently, each first item was selected to have maximum information at $\theta_0 = 0$. After 30 items, the posterior distribution of the examinee's $\theta$ was computed using the method described above.

The operational item pool consisted of 253 items selected from an inventory calibrated using data from a national field test with more than 10,000 students. The item parameter estimates from the national field test were treated as their true values for the simulation in this study, and the values were used to generate the simulated responses and to compute the ability estimates during the CAT simulation.

In total, 150 field-test items were adopted. In order to simulate the case of the same team of specialists continuing to write items for the pool, the field-test items were randomly sampled without replacement from the entire inventory. A summary of the item

parameters of the operational item pool and the field-test items is given in Table 1.

Table 1. Summary of item parameters of the operational item pool and the field-test items. (Columns: Min, Mean, SD and Max for parameter $a_i$ and for parameter $b_i$; rows: operational item pool, field-test items.)

Continuous item calibration was simulated by starting the study with a subset of 50 of the available field-test items. Subsequently, when any of these items met the stopping rule, the item was retired from the calibration procedure and replaced by the next field-test item in the queue. Thus, the operational pool continued to consist of items that had passed the field test. The simulation ended as soon as the first 100 field-test items met the stopping rule.

All field-test items were assigned to the examinees using either the full criterion of D-optimality in (7) or one of the two-point criteria in (10), (12), and (13). Because of the use of a common prior, the four item-assignment criteria would have been unable to discriminate between any of the field-test items until their first update. Therefore, in order to initialize the calibration procedure, each new field-test item was randomly assigned to five examinees before the collection of its first batch of responses began.

Two stopping rules were simulated. One was based on the fixed number of 1000 examinees to which an item had to be assigned before it was retired from the calibration procedure. The other rule was based on the accuracy of the parameter estimates, i.e., an item was retired as soon as the posterior standard deviations of its parameters reached the thresholds of $SD_a = 0.1$ and $SD_b = 0.06$.

The parameters of each field-test item were updated per batch of examinees. Three different batch sizes (20, 50, 100) were used to evaluate their impact on both the speed and quality of calibration.
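The two stopping rules and the replacement of retired items from a queue can be sketched as follows. The function names, the status dictionary and the item identifiers are hypothetical; the defaults of 1000 examinees and posterior SD thresholds of 0.1 and 0.06 are the values quoted in the text.

```python
from collections import deque

def should_retire(n_assigned, sd_a, sd_b, rule,
                  max_n=1000, sd_a_target=0.1, sd_b_target=0.06):
    """Return True once a field-test item meets the chosen stopping rule."""
    if rule == "fixed":
        return n_assigned >= max_n
    if rule == "sd":
        return sd_a <= sd_a_target and sd_b <= sd_b_target
    raise ValueError(f"unknown rule: {rule}")

def replace_retired(active, queue, status, rule):
    """Retire items that meet the rule and refill the active set from the queue.

    status maps an item id to (n_assigned, sd_a, sd_b); both active and queue
    are modified in place, and the list of retired item ids is returned.
    """
    retired = [item for item in active if should_retire(*status[item], rule)]
    for item in retired:
        active.remove(item)
        if queue:
            active.append(queue.popleft())  # next field-test item in line
    return retired

# Hypothetical state: three active items and two items waiting in the queue.
active = [1, 2, 3]
queue = deque([4, 5])
status = {1: (1000, 0.20, 0.10), 2: (10, 0.50, 0.50), 3: (999, 0.05, 0.05)}
print(replace_retired(active, queue, status, "fixed"), active)  # [1] [2, 3, 4]
```

With the rule set to `"sd"` and the same state, item 3 rather than item 1 would be retired, which illustrates how the two rules can order the retirements differently.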
As a result of the use of the four different assignment rules, three different batch sizes, and two different stopping rules, a total of 24 different conditions were investigated.

4. Results

4.1. Parameter Recovery

The general statistical quality of the parameter estimates for each of the simulated conditions is shown in Figs. 1, 2, 3 and 4. Each of the plots reveals results centered about the identity line. The results for the $b_i$ parameter have much smaller scatter about the line than for the $a_i$ parameters, a typical result of item calibration under the current model. The most remarkable result observed in these figures, however, is the apparent lack of any systematic differences between the four assignment criteria under any of the other conditions. In fact, on the basis of these results alone it would be hard to defend a preference for one criterion over any of the others. Likewise, except for a slightly smaller scatter in the $a_i$ parameter estimates for the largest batch sizes for the stopping rule with the fixed posterior SD, there does not appear to be much reason to prefer one batch size over the other either.

The picture becomes slightly more nuanced if we look at the role of the stopping rules, though. Figures 5, 6, 7 and 8 present the posterior standard deviations of the estimates of all field-test parameters as a function of their true value. The major difference was found between the two

stopping rules.

Figure 1. Estimated versus true $b_i$ parameters for four different assignment criteria and three batch sizes (fixed number of examinees per item as stopping rule).

Obviously, for the rule based on the posterior standard deviation, none of the $b_i$ parameters had a posterior SD greater than the threshold of 0.06, and none of the $a_i$ parameters had an SD greater than 0.1. As for the rule with the fixed number of examinees per item, although the number of items with an SD for their $b_i$ parameter greater than this threshold was generally small, the number of items with a larger SD for their $a_i$ parameter was substantially greater, especially for

the higher true parameter values, confirming the common observation of higher values for this model parameter tending to result in larger standard errors of estimation.

Figure 2. Estimated versus true $b_i$ parameters for four different assignment criteria and three batch sizes (fixed posterior SD as stopping rule).

Tables 2, 3 and 4 shed some further light on this tradeoff between the numbers of examinees used and the posterior standard deviations for these two stopping rules. Table 2 shows the numbers

of field-test items that met the thresholds for their posterior SD for the rule with the fixed number of 1000 examinees.

Figure 3. Estimated versus true $a_i$ parameters for four different assignment criteria and three batch sizes (fixed number of examinees per item as stopping rule).

The results for the three criteria with a two-point design were better than for the original $D^{(1)}$ criterion in (7). Table 3 repeats the analysis for the numbers of field-test items from the original set of 50 that completed their calibration. Again, for the same stopping rule with a fixed number of examinees, the table shows a slightly better job for the three two-point designs,

whereas the numbers for the stopping rule with a fixed SD indicated a slightly better performance for the Buyske criterion in (13).

Figure 4. Estimated versus true $a_i$ parameters for four different assignment criteria and three batch sizes (fixed posterior SD as stopping rule).

Conversely, for the total number of examinees required to calibrate all 100 field-test items shown in Table 4, the Bayesian version of the criterion of D-optimality in (7) clearly required lower numbers under the stopping rule with a fixed number of examinees. However, for the alternative stopping rule with a threshold for the posterior SD, the same criterion required the greatest numbers. The same outcome is visible

in Figs. 9 and 10, where the curve for $D^{(1)}$ is to the left of the others in the former but tends to be to the right of them for the later items in the latter.

Figure 5. Posterior standard deviations of $b_i$ parameters as a function of their true values for four different assignment criteria and three batch sizes (fixed number of examinees per item as stopping rule).

The best way to summarize the results in all these tables and figures is that they remind us of the tradeoff between the size of the calibration sample and the posterior uncertainty of the item parameter estimates. There does not appear to be a clear winner that performs uniformly best under either criterion.

Figure 6. Posterior standard deviations of $b_i$ parameters as a function of their true values for four different assignment criteria and three batch sizes (fixed posterior SD as stopping rule).

It may be interesting to look at the distributions of the abilities preferred by each of the four criteria for the field-test items. Figures 11, 12 and 13 illustrate the variation of the distributions for Item 3 with parameter values $b_3$ = … and $a_3$ = …. The selection is entirely arbitrary. As each observed process depends not only on the true values of the parameters of the item but

also to a considerable extent on those of the other items in their various stages of calibration, the processes vary considerably.

Figure 7. Posterior standard deviations of $a_i$ parameters as a function of their true values for four different assignment criteria and three batch sizes (fixed number of examinees per item as stopping rule).

Also, remember that the selection of the first batch of examinees was based on the random initialization of the procedure described earlier; consequently, at this stage it is impossible for the four assignment criteria to discriminate between the examinees yet. Nevertheless, it seems to be possible to draw two more general conclusions. The first is that for

larger batch sizes the entire process looks smoother, that is, with much smaller changes in the distributions of the ability parameters between the updates of the posterior distributions of the item parameters.

Figure 8. Posterior standard deviations of $a_i$ parameters as a function of their true values for four different assignment criteria and three batch sizes (fixed posterior SD as stopping rule).

This trend, immediately obvious from a comparison between Figs. 11 and 13, should not come as a surprise. For larger batch sizes there is just more opportunity for random differences to get aggregated out. The other observation is the trend for the ability distributions

for the three two-point designs to be centered about the difficulty of the item, whereas the original $D^{(1)}$ criterion tended to focus on the middle of the ability distribution. The former is not surprising given the centering of the two design points about $b_f$ [see (8)]. The latter may be explained by the expectation over the full posterior distributions of the ability parameters of the test takers in (6). However, as just discussed, this difference did not imply any systematic differences in the quality of the parameter recovery, and thus does not seem to have much practical meaning.

Table 2. Number of items that met the thresholds for their posterior standard deviation in the simulation with stopping rule based on a fixed number of examinees. (Columns: batch size, $D^{(1)}$, $D^{(2)}$, $D^{(3)}$, $D^{(4)}$.)

Table 3. Number of items from the initial set of 50 field-test items that completed their calibration. (Columns: stopping rule, criterion, batch size; rows: $D^{(1)}$ to $D^{(4)}$ under the fixed-number and SD stopping rules.)

Table 4. Total number of examinees required to calibrate the 100 field-test items in the simulation study.

Stopping rule   Criterion   Batch size
                            20       50       100
Fixed           D(1)        34,217   33,982   34,511
                D(2)        37,777   38,129   39,291
                D(3)        38,309   36,644   38,323
                D(4)        38,932   39,620   39,383
SD              D(1)        42,516   43,788   46,370
                D(2)        39,998   38,804   37,556
                D(3)        39,190   37,855   35,766
                D(4)        41,822   42,233   35,862

Figure 9. Number of items that completed the calibration as a function of the number of simulated examinees for the four criteria of optimality (batch size of 50; fixed number of examinees per item as stopping rule).

Figure 10. Number of items that completed the calibration as a function of the number of simulated examinees for the four criteria of optimality (batch size of 50; fixed posterior SD as stopping rule).

The main conclusion from all these results is that there is no systematic winner among the four current criteria of optimality. The differences between the results in each of the preceding figures and tables were too small and too unsystematic to recommend any of them over the others.

Also, the choice of batch size did not lead to any systematic differences, suggesting it could be based on a choice of convenience as well.

Figure 11. Boxplot of ability parameters in each of the 50 batches of size 20 for the updates of the posterior distributions of the parameters of Item 3 (total of 1000 examinees).

4.2. Item Utilization

In the simulation study, a field-test item was removed from the pool when it completed its calibration, whereupon a new item was added. Consequently, the number of field-test items present in the study was always 50, and when the calibration of the first 100 field-test items was

completed, 150 items had entered the simulation.

Figure 12. Boxplot of ability parameters in each of the 20 batches of size 50 for the updates of the posterior distributions of the parameters of Item 3 (total of 1000 examinees).

Each of these 150 items always had to compete for assignment to the examinees with all other items already present in the study. Table 5 shows the number of items updated at least once during the entire simulation. This time the batch size did appear to have a systematic effect: the smaller the batch, the greater the number of field-test items that entered the calibration process at least once. The numbers also differentiated between the $D^{(1)}$ criterion and the three two-point criteria. Use of the $D^{(1)}$ criterion led to only some of the field-test items being updated at least once during the entire study, whereas

this happened to nearly all of the 150 items for each of the other three.

Figure 13. Boxplot of ability parameters in each of the 10 batches of size 100 for the updates of the posterior distributions of the parameters of Item 3 (total of 1000 examinees).

The difference points at a relatively strong tendency for the $D^{(1)}$ criterion to focus on a few generally most informative items at a time. As soon as one of them retired, its position tended to be taken over by the next favorable in line. The same behavior was observed for the MCMC implementation of (7) by van der Linden and Ren (2015). On the other hand, for the three two-point implementations, the competition tended to be more permanently between larger sets of field-test items, without any outspoken favorites.

Table 5. Number of items for which the posterior distribution was updated at least once.
Stopping rule (Fixed, SD) by criterion (D(1), D(2), D(3), D(4)) and batch size.

However, as already observed, none of the four criteria completed the calibration of all field-test items in the initial set of 50 (Table 3). Even after tens of thousands of examinees were tested, some 20% of them still had not finished their calibration, no matter the stopping rule or batch size. Our post hoc analysis showed that these items just had generally low values for each of these criteria based on their true parameter values.

Figures 14 and 15 illustrate the same differences in utilization of the field-test items graphically. In both sets of graphs, the horizontal axis represents the field-test items sorted by the first time they were assigned to any examinee, while the vertical lines represent the length of their subsequent stay in the study. The graphs in the first row show again how the D(1) criterion focused on a much smaller set of items whereas the other three criteria tended to entertain larger sets of items simultaneously, no matter the batch sizes used to calibrate the items. Again, one of the most conspicuous aspects was the relatively long stay in the calibration study of some of the original items for each of these criteria.

Figure 14. Length of stay of each of the 100 field-test items in the calibration study for four different assignment criteria and three batch sizes (fixed number of examinees per item as stopping rule).

Figure 15. Length of stay of each of the 100 field-test items in the calibration study for four different assignment criteria and three batch sizes (fixed posterior SD as stopping rule).

5. Discussion

The primary goal for this study was to investigate both the quality of parameter recovery and item utilization during adaptive online calibration of field-test items for different implementations of the criterion of D-optimality and a closely related version of them under the realistic condition of each of the initial items permanently having to compete with new items for the favor of the item-assignment criterion. In spite of the different nature of each of these criteria, the quality of their parameter updates was generally comparable. This finding can be partly explained by their commonality in the form of the same normal approximation used in the updates of all posteriors underlying them. However, the D(1), D(3) and D(4) criteria differed from D(2) in the use of integration of a key quantity over the entire posterior distribution while the latter only used its mean as point estimate. In doing so, the former acknowledge the current uncertainty about these quantities with an obvious consequence of more uncertainty in the result here: the estimate of the field-test parameters. On the other hand, because of their honest estimates, the adaptive assignments of the field-test items improved, with the result being better subsequent updates of the posterior distributions. Apparently, these two effects largely countered each other. Consequently, as the use of D(2) is computationally somewhat simpler, it might be the criterion of choice if ease of computer programming is an issue.

The lack of any systematic impact of the use of the less extreme response probabilities of 0.25 and 0.75 in D(4) than the original probabilities of or shows potential robustness of the two-point D-optimal design against such changes. We did not try to see how far we could go in bringing the probabilities to 0.50 for the 2PL model before the criterion would break down. But if the use of probabilities more motivating to the examinees is indeed a practical requirement, this avenue should be further explored.
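To see why a two-point design must eventually break down as the response probabilities approach 0.50, the following sketch may help. It is a plain plug-in calculation, not the Bayesian criteria used in the study: it assumes the common 2PL parameterization P(θ) = 1 / (1 + exp(−a(θ − b))), takes the item parameters a and b as known, and uses a hypothetical function name `two_point_det`. It computes the determinant of the Fisher information matrix for (a, b) from two examinees whose success probabilities are p and 1 − p.

```python
import math


def two_point_det(p_high, a=1.0, b=0.0):
    """Determinant of the 2PL information matrix for (a, b) obtained from
    two examinees with success probabilities p_high and 1 - p_high."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for p in (p_high, 1.0 - p_high):
        theta = b + math.log(p / (1.0 - p)) / a  # ability that yields probability p
        grad = (theta - b, -a)                   # derivative of a*(theta - b) w.r.t. (a, b)
        weight = p * (1.0 - p)                   # Bernoulli variance at this ability
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * grad[i] * grad[j]
    return total[0][0] * total[1][1] - total[0][1] * total[1][0]


for p in (0.75, 0.60, 0.50):
    print(p, two_point_det(p))
```

Under these assumptions the determinant reduces to 4[p(1 − p) ln(p / (1 − p))]², which is symmetric in p and 1 − p and vanishes at p = 0.50, where both examinees sit at θ = b and the discrimination parameter is no longer identified.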

The choice of stopping rule proved to be a mainly practical issue as well. If similar precision of the estimates of all field-test items is required, a rule based on thresholds for the posterior standard deviations of the field-test items should be used. However, given the dependency of the accuracy of item calibration both on the (unknown) true values of the parameters of the field-test items and the ability parameters of the examinees that happen to be tested once a field-test item is launched, it is then generally hard to project the total number of examinees to which each possible item has to be assigned. It is therefore recommended to set an upper limit on the number of examinees as a backup to avoid the risk of some of the items requiring too many of them.

One of the systematic observations was a small subset of the field-test items being dominated by all others with the result being an extremely long stay in the entire calibration process. If fast calibration of only some of the items is preferred, this may not be an issue. On the other hand, if there is a scarcity of test items with certain desirable attributes, it may pay off to detect them at an earlier stage and choose between replacing them by better candidates or giving them a higher priority, for instance, by adding a (temporary) positive term to their value for the criterion.

Finally, running time of the computer program during calibration might be another consideration for practical application. The times in our simulation, conducted under R with the computationally more intensive procedures programmed in C, on a PC with a Xeon(R) 2.8 GHz CPU and 12 GB of RAM, were quite favorable. The average running time per examinee for the calculation of the posterior approximation in (2) was less than s, while the average time for the calculation of the different implementations of the criterion of D-optimality was s for the selection from 50 field-test items.
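The two practical recommendations in this section (a posterior-SD stopping rule backed up by an upper limit on the number of examinees, and a temporary positive term added to the criterion value of items that deserve priority) might be sketched as follows. The function names, thresholds, and bonus value are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def should_stop(posterior_sds, n_examinees, sd_targets, max_examinees):
    """Stop calibrating an item when every parameter's posterior SD has
    reached its target, or when the backup cap on examinees is hit."""
    precise_enough = all(
        sd <= target for sd, target in zip(posterior_sds, sd_targets)
    )
    return precise_enough or n_examinees >= max_examinees


def prioritized_value(criterion_value, bonus=0.0):
    """Criterion value with an optional (temporary) positive term added
    to raise the priority of an item that would otherwise linger."""
    return criterion_value + bonus


# Hypothetical targets for the posterior SDs of (a, b) and a cap of 1000.
print(should_stop([0.09, 0.12], 300, [0.10, 0.15], 1000))   # targets met
print(should_stop([0.20, 0.12], 300, [0.10, 0.15], 1000))   # keep going
print(should_stop([0.20, 0.12], 1000, [0.10, 0.15], 1000))  # cap reached
```

The cap guarantees termination even for items whose true parameter values make the SD targets hard to reach with the examinees who happen to be tested.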
As for the posterior updates of the field-test parameters, the average times were 0.4 s for the batch size of 20 examinees, 1 s for 50 examinees, and 2 s for 100 examinees.

References

Abdelbasit, K. M., & Plackett, R. L. (1983). Experimental design for binary data. Journal of the American Statistical Association, 78.
Berger, M. P. F. (1992). Sequential sampling designs for the two-parameter item response theory model. Psychometrika, 57.
Berger, M. P. F. (1994). D-optimal sequential sampling design for item response theory models. Journal of Educational Statistics, 19.
Berger, M. P. F., & Wong, W. K. (2009). Introduction to optimal designs for social and biomedical research. Chichester, West Sussex: Wiley.
Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6.
Buyske, S. (1998). Optimal design for item calibration in computerized adaptive testing: The 2PL case. In N. Flournoy, et al. (Eds.), New developments and applications in experimental design. Lecture Notes Monograph Series, vol. 34. Hayward, CA: Institute of Mathematical Statistics.
Chang, H.-H., & Ying, Z. (2009). Nonlinear sequential designs for logistic item response models with applications to computerized adaptive testing. The Annals of Statistics, 37.
Chang, Y. C. I. (2013). Sequential estimation in item calibration with a two-stage design. arXiv: [stat.AP].
Jones, D. H., & Jin, Z. (1994). Optimal sequential designs for on-line item estimation. Psychometrika, 59.
Segall, D. O. (2002). Confirmatory item factor analysis using Markov chain Monte Carlo estimation with applications to online calibration in CAT. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
Segall, D. O. (2003). Calibrating CAT pools and online pretest items using MCMC methods. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.
Segall, D. O., & Moreno, K. E. (1999). Development of the computerized adaptive testing version of the Armed Services Vocational Aptitude Battery. In F. Drasgow & J. B. Olson-Buchanan (Eds.), Innovations in computerized assessment. Hillsdale, NJ: Lawrence Erlbaum Associates.
Silvey, S. D. (1980). Optimal design. London: Chapman and Hall.
Stocking, M. L. (1988). Scale drift in on-line calibration (Research Report 88-28). Princeton, NJ: Educational Testing Service.
Stocking, M. L. (1990). Specifying optimum examinees for item parameter estimation in item response theory. Psychometrika, 55.
van der Linden, W. J., & Ren, H. (2015). Optimal Bayesian adaptive design for test-item calibration. Psychometrika, 80.
Wainer, H., & Mislevy, R. J. (1990). Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computer adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum.

Manuscript Received: 28 MAR 2014
Published Online Date: 13 MAR 2017


Definition: Let f(x) be a function of one variable with continuous derivatives of all orders at a the point x 0, then the series. 2.4 Local properties o unctions o several variables In this section we will learn how to address three kinds o problems which are o great importance in the ield o applied mathematics: how to obtain the

More information

Dynamic System Identification using HDMR-Bayesian Technique

Dynamic System Identification using HDMR-Bayesian Technique Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in

More information

JORIS MULDER AND WIM J. VAN DER LINDEN

JORIS MULDER AND WIM J. VAN DER LINDEN PSYCHOMETRIKA VOL. 74, NO. 2, 273 296 JUNE 2009 DOI: 10.1007/S11336-008-9097-5 MULTIDIMENSIONAL ADAPTIVE TESTING WITH OPTIMAL DESIGN CRITERIA FOR ITEM SELECTION JORIS MULDER AND WIM J. VAN DER LINDEN UNIVERSITY

More information

On the Efficiency of Maximum-Likelihood Estimators of Misspecified Models

On the Efficiency of Maximum-Likelihood Estimators of Misspecified Models 217 25th European Signal Processing Conerence EUSIPCO On the Eiciency o Maximum-ikelihood Estimators o Misspeciied Models Mahamadou amine Diong Eric Chaumette and François Vincent University o Toulouse

More information

Fluctuationlessness Theorem and its Application to Boundary Value Problems of ODEs

Fluctuationlessness Theorem and its Application to Boundary Value Problems of ODEs Fluctuationlessness Theorem and its Application to Boundary Value Problems o ODEs NEJLA ALTAY İstanbul Technical University Inormatics Institute Maslak, 34469, İstanbul TÜRKİYE TURKEY) nejla@be.itu.edu.tr

More information

Supplementary material for Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values

Supplementary material for Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values Supplementary material or Continuous-action planning or discounted ininite-horizon nonlinear optimal control with Lipschitz values List o main notations x, X, u, U state, state space, action, action space,

More information

AP Calculus Notes: Unit 1 Limits & Continuity. Syllabus Objective: 1.1 The student will calculate limits using the basic limit theorems.

AP Calculus Notes: Unit 1 Limits & Continuity. Syllabus Objective: 1.1 The student will calculate limits using the basic limit theorems. Syllabus Objective:. The student will calculate its using the basic it theorems. LIMITS how the outputs o a unction behave as the inputs approach some value Finding a Limit Notation: The it as approaches

More information

GF(4) Based Synthesis of Quaternary Reversible/Quantum Logic Circuits

GF(4) Based Synthesis of Quaternary Reversible/Quantum Logic Circuits GF(4) ased Synthesis o Quaternary Reversible/Quantum Logic Circuits MOZAMMEL H. A. KHAN AND MAREK A. PERKOWSKI Department o Computer Science and Engineering, East West University 4 Mohakhali, Dhaka, ANGLADESH,

More information

The achievable limits of operational modal analysis. * Siu-Kui Au 1)

The achievable limits of operational modal analysis. * Siu-Kui Au 1) The achievable limits o operational modal analysis * Siu-Kui Au 1) 1) Center or Engineering Dynamics and Institute or Risk and Uncertainty, University o Liverpool, Liverpool L69 3GH, United Kingdom 1)

More information

36-720: The Rasch Model

36-720: The Rasch Model 36-720: The Rasch Model Brian Junker October 15, 2007 Multivariate Binary Response Data Rasch Model Rasch Marginal Likelihood as a GLMM Rasch Marginal Likelihood as a Log-Linear Model Example For more

More information

MEASUREMENT UNCERTAINTIES

MEASUREMENT UNCERTAINTIES MEASUREMENT UNCERTAINTIES What distinguishes science rom philosophy is that it is grounded in experimental observations. These observations are most striking when they take the orm o a quantitative measurement.

More information

APPENDIX 1 ERROR ESTIMATION

APPENDIX 1 ERROR ESTIMATION 1 APPENDIX 1 ERROR ESTIMATION Measurements are always subject to some uncertainties no matter how modern and expensive equipment is used or how careully the measurements are perormed These uncertainties

More information

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging

More information

arxiv:quant-ph/ v2 12 Jan 2006

arxiv:quant-ph/ v2 12 Jan 2006 Quantum Inormation and Computation, Vol., No. (25) c Rinton Press A low-map model or analyzing pseudothresholds in ault-tolerant quantum computing arxiv:quant-ph/58176v2 12 Jan 26 Krysta M. Svore Columbia

More information

CONVECTIVE HEAT TRANSFER CHARACTERISTICS OF NANOFLUIDS. Convective heat transfer analysis of nanofluid flowing inside a

CONVECTIVE HEAT TRANSFER CHARACTERISTICS OF NANOFLUIDS. Convective heat transfer analysis of nanofluid flowing inside a Chapter 4 CONVECTIVE HEAT TRANSFER CHARACTERISTICS OF NANOFLUIDS Convective heat transer analysis o nanoluid lowing inside a straight tube o circular cross-section under laminar and turbulent conditions

More information

Finite Dimensional Hilbert Spaces are Complete for Dagger Compact Closed Categories (Extended Abstract)

Finite Dimensional Hilbert Spaces are Complete for Dagger Compact Closed Categories (Extended Abstract) Electronic Notes in Theoretical Computer Science 270 (1) (2011) 113 119 www.elsevier.com/locate/entcs Finite Dimensional Hilbert Spaces are Complete or Dagger Compact Closed Categories (Extended bstract)

More information

Hierarchical Linear Models. Jeff Gill. University of Florida

Hierarchical Linear Models. Jeff Gill. University of Florida Hierarchical Linear Models Jeff Gill University of Florida I. ESSENTIAL DESCRIPTION OF HIERARCHICAL LINEAR MODELS II. SPECIAL CASES OF THE HLM III. THE GENERAL STRUCTURE OF THE HLM IV. ESTIMATION OF THE

More information

Circuit Complexity / Counting Problems

Circuit Complexity / Counting Problems Lecture 5 Circuit Complexity / Counting Problems April 2, 24 Lecturer: Paul eame Notes: William Pentney 5. Circuit Complexity and Uniorm Complexity We will conclude our look at the basic relationship between

More information

OBSERVER/KALMAN AND SUBSPACE IDENTIFICATION OF THE UBC BENCHMARK STRUCTURAL MODEL

OBSERVER/KALMAN AND SUBSPACE IDENTIFICATION OF THE UBC BENCHMARK STRUCTURAL MODEL OBSERVER/KALMAN AND SUBSPACE IDENTIFICATION OF THE UBC BENCHMARK STRUCTURAL MODEL Dionisio Bernal, Burcu Gunes Associate Proessor, Graduate Student Department o Civil and Environmental Engineering, 7 Snell

More information

Item Response Theory and Computerized Adaptive Testing

Item Response Theory and Computerized Adaptive Testing Item Response Theory and Computerized Adaptive Testing Richard C. Gershon, PhD Department of Medical Social Sciences Feinberg School of Medicine Northwestern University gershon@northwestern.edu May 20,

More information

0,0 B 5,0 C 0, 4 3,5. y x. Recitation Worksheet 1A. 1. Plot these points in the xy plane: A

0,0 B 5,0 C 0, 4 3,5. y x. Recitation Worksheet 1A. 1. Plot these points in the xy plane: A Math 13 Recitation Worksheet 1A 1 Plot these points in the y plane: A 0,0 B 5,0 C 0, 4 D 3,5 Without using a calculator, sketch a graph o each o these in the y plane: A y B 3 Consider the unction a Evaluate

More information

Probabilistic Analysis of Multi-layered Soil Effects on Shallow Foundation Settlement

Probabilistic Analysis of Multi-layered Soil Effects on Shallow Foundation Settlement Probabilistic Analysis o Multi-layered Soil ects on Shallow Foundation Settlement 54 Y L Kuo B Postgraduate Student, School o Civil and nvironmental ngineering, University o Adelaide, Australia M B Jaksa

More information

8.4 Inverse Functions

8.4 Inverse Functions Section 8. Inverse Functions 803 8. Inverse Functions As we saw in the last section, in order to solve application problems involving eponential unctions, we will need to be able to solve eponential equations

More information

On the Development of Implicit Solvers for Time-Dependent Systems

On the Development of Implicit Solvers for Time-Dependent Systems School o something FACULTY School OF OTHER o Computing On the Development o Implicit Solvers or Time-Dependent Systems Peter Jimack School o Computing, University o Leeds In collaboration with: P.H. Gaskell,

More information

3. Several Random Variables

3. Several Random Variables . Several Random Variables. Two Random Variables. Conditional Probabilit--Revisited. Statistical Independence.4 Correlation between Random Variables. Densit unction o the Sum o Two Random Variables. Probabilit

More information

Hidden Markov Models Part 1: Introduction

Hidden Markov Models Part 1: Introduction Hidden Markov Models Part 1: Introduction CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Modeling Sequential Data Suppose that

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

A Systematic Approach to Frequency Compensation of the Voltage Loop in Boost PFC Pre- regulators.

A Systematic Approach to Frequency Compensation of the Voltage Loop in Boost PFC Pre- regulators. A Systematic Approach to Frequency Compensation o the Voltage Loop in oost PFC Pre- regulators. Claudio Adragna, STMicroelectronics, Italy Abstract Venable s -actor method is a systematic procedure that

More information

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I

SYDE 372 Introduction to Pattern Recognition. Probability Measures for Classification: Part I SYDE 372 Introduction to Pattern Recognition Probability Measures for Classification: Part I Alexander Wong Department of Systems Design Engineering University of Waterloo Outline 1 2 3 4 Why use probability

More information

A Particle Swarm Optimization Algorithm for Neighbor Selection in Peer-to-Peer Networks

A Particle Swarm Optimization Algorithm for Neighbor Selection in Peer-to-Peer Networks A Particle Swarm Optimization Algorithm or Neighbor Selection in Peer-to-Peer Networks Shichang Sun 1,3, Ajith Abraham 2,4, Guiyong Zhang 3, Hongbo Liu 3,4 1 School o Computer Science and Engineering,

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

Solution. outside Styrofoam

Solution. outside Styrofoam Ice chest You are in charge o keeping the drinks cold or a picnic. You have a Styrooam box that is illed with cola, water and you plan to put some 0 o ice in it. Your task is to buy enough ice to put in

More information