VARIABILITY OF KUDER-RICHARDSON FORMULA 20 RELIABILITY ESTIMATES. T. Anne Cleary, University of Wisconsin, and Robert L. Linn, Educational Testing Service
Research Bulletin RB-68-7

This Bulletin is a draft for interoffice circulation. Corrections and suggestions for revision are solicited. The Bulletin should not be cited as a reference without the specific permission of the authors. It is automatically superseded upon formal publication of the material.

Educational Testing Service, Princeton, New Jersey, February 1968
Abstract

The standard error of a Kuder-Richardson Formula 20 reliability coefficient is derived and two approximations to it are presented. The values from the exact solutions and from the approximations are compared with empirical values.
There are a number of different methods of estimating the reliability of a test; the more commonly used are the parallel-form correlation and internal-consistency measures such as the Kuder-Richardson formula 20 (KR-20) reliability coefficient. Each of these reliability estimates is based on particular assumptions and has a different interpretation; the choice of a reliability estimate in a given situation should be dictated by the interpretation. Nevertheless, it is instructive to consider the standard errors of these indices, that is, the sampling standard deviations of the indices under Type I sampling fluctuation. Type I sampling was used by Lord (1955) to refer to sampling of examinees: the same test is administered to a large number of separate groups of examinees, each group being a random sample from a population of examinees.

Under Type I sampling the standard error of a parallel-test correlation is well known:

$$\mathrm{S.E.}(r_{xx}) = \frac{1 - \rho_{xx}^2}{\sqrt{N - 1}} \qquad (1)$$

where $r_{xx}$ is the observed correlation between two parallel tests, $\rho_{xx}$ is the correlation between the parallel tests in the population, and $N$ is the number of persons in the sample.

Feldt (1965) has derived an approximation to the sampling distribution of $r_{20}$ but does not state the standard error explicitly. Lord (1955) gives an explicit formula for the standard error under Type II sampling (sampling of items) but not under Type I sampling (sampling of persons).
Reliability can be defined as

$$\rho = \frac{\sigma_T^2}{\sigma_X^2},$$

where $\sigma_T^2$ is the variance of the true scores and $\sigma_X^2$ is the variance of the observed scores. In deriving the KR-20 formula from the analysis-of-variance model (Hoyt, 1941; Feldt, 1965), the score of person $p$ ($p = 1, \ldots, N$) on item $i$ ($i = 1, \ldots, K$) is represented as

$$X_{pi} = M + A_i + t_p + e_{pi},$$

where $M$ = grand mean, $A_i$ = the score component due to the difficulty of item $i$, $t_p$ = the item true score for person $p$, and $e_{pi}$ = error for item $i$ and person $p$.

The item errors, $e_{pi}$, are assumed to be normally and independently distributed with zero means and common variance $\sigma_e^2$. The item true score, $t_p$, is assumed to have a normal distribution with variance $\sigma_t^2$. The test true score, $T_p = K t_p$, and test error score, $E_p = \sum_i e_{pi}$, are then normally distributed with variances

$$\sigma_T^2 = K^2 \sigma_t^2 \quad \text{and} \quad \sigma_E^2 = K \sigma_e^2.$$
Reliability is then defined in terms of item parameters as

$$\rho_{20} = \frac{K^2 \sigma_t^2}{K^2 \sigma_t^2 + K \sigma_e^2} = \frac{K \sigma_t^2}{K \sigma_t^2 + \sigma_e^2}.$$

The variance of the true score component, $t_p$, is estimated by $(MS_p - MS_{ip})/K$; the variance of the item error score by $MS_{ip}$; and the reliability by

$$r_{20} = 1 - \frac{MS_{ip}}{MS_p},$$

where $MS_{ip}$ is the mean square for the items-by-persons interaction, and $MS_p$ is the mean square for persons. The expected values of the mean squares are:

$$E(MS_{ip}) = \sigma_e^2 \quad \text{and} \quad E(MS_p) = \sigma_e^2 + K \sigma_t^2.$$

The population covariance of $MS_{ip}$ and $MS_p$ is zero. The sampling distributions of the mean squares are known:

$$\frac{(N - 1)\, MS_p}{\sigma_e^2 + K \sigma_t^2}$$

is distributed as chi-square with $(N - 1)$ degrees of freedom, and

$$\frac{(N - 1)(K - 1)\, MS_{ip}}{\sigma_e^2}$$

is distributed as chi-square with $(N - 1)(K - 1)$ degrees of freedom.
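The mean-square estimate of reliability described above can be sketched in a few lines of Python. This is only an illustration, not code from the bulletin: the function names and the small score matrix are hypothetical, and NumPy is assumed to be available. The second function computes KR-20 in its familiar item-variance form so that the two routes can be compared on dichotomous (0/1) items.

```python
import numpy as np


def kr20_from_mean_squares(scores):
    """Return (MS_p, MS_ip, r20) for an N-persons by K-items score matrix."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    person_means = x.mean(axis=1)
    item_means = x.mean(axis=0)
    # Sum of squares for persons and for the items-by-persons interaction.
    ss_p = k * np.sum((person_means - grand) ** 2)
    resid = x - person_means[:, None] - item_means[None, :] + grand
    ss_ip = np.sum(resid ** 2)
    ms_p = ss_p / (n - 1)
    ms_ip = ss_ip / ((n - 1) * (k - 1))
    return ms_p, ms_ip, 1.0 - ms_ip / ms_p


def kr20_classic(scores):
    """KR-20 from item difficulties and total-score variance (0/1 items)."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]
    p = x.mean(axis=0)                      # proportion passing each item
    total_var = x.sum(axis=1).var(ddof=0)   # variance of total scores
    return k / (k - 1) * (1.0 - np.sum(p * (1.0 - p)) / total_var)


if __name__ == "__main__":
    # Hypothetical responses: 5 persons by 4 dichotomous items.
    demo = [[1, 1, 1, 0],
            [1, 0, 1, 0],
            [0, 0, 1, 0],
            [1, 1, 1, 1],
            [0, 0, 0, 0]]
    print(kr20_from_mean_squares(demo)[2], kr20_classic(demo))
```

For dichotomous items the two functions should agree up to floating-point error, since the analysis-of-variance estimate is algebraically the same quantity as KR-20.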
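Since these two chi-square variates are independent, the ratio

$$\frac{1 - r_{20}}{1 - \rho_{20}} = \frac{MS_{ip} / \sigma_e^2}{MS_p / (\sigma_e^2 + K \sigma_t^2)}$$

is distributed as a central $F$ with $(N - 1)(K - 1)$ and $(N - 1)$ degrees of freedom. The variance of $r_{20}$ can then be written

$$\sigma_{r_{20}}^2 = (1 - \rho_{20})^2 \, \frac{2(N - 1)\,[(N - 3) + (N - 1)(K - 1)]}{(K - 1)(N - 3)^2 (N - 5)},$$

and

$$\mathrm{S.E.}(r_{20}) = (1 - \rho_{20}) \sqrt{\frac{2(N - 1)\,[(N - 3) + (N - 1)(K - 1)]}{(K - 1)(N - 3)^2 (N - 5)}} \qquad (2)$$

An approximation to this variance is obtained by considering the variance of a ratio of the two chi-square variates. Since the variance of a chi-square distribution is equal to twice the number of degrees of freedom, the sampling variances of the mean squares are:

$$\mathrm{Var}(MS_{ip}) = \frac{2 \sigma_e^4}{(N - 1)(K - 1)} \quad \text{and} \quad \mathrm{Var}(MS_p) = \frac{2 (\sigma_e^2 + K \sigma_t^2)^2}{N - 1}.$$

The variance of the ratio of two random variables, $X_1 / X_2$, is approximately equal to (see Kendall & Stuart, 1958, p. 232):

$$\mathrm{Var}\!\left(\frac{X_1}{X_2}\right) \doteq \left[\frac{E(X_1)}{E(X_2)}\right]^2 \left[\frac{\mathrm{Var}(X_1)}{E^2(X_1)} + \frac{\mathrm{Var}(X_2)}{E^2(X_2)} - \frac{2\,\mathrm{cov}(X_1, X_2)}{E(X_1)\,E(X_2)}\right].$$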
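As a check on formula (2), the following sketch evaluates it in two ways: directly as written, and from the general variance of a central F distribution, 2·d2²(d1 + d2 − 2) / [d1(d2 − 2)²(d2 − 4)], with d1 = (N − 1)(K − 1) and d2 = N − 1. The F-variance expression is a standard result rather than something quoted from the bulletin, and the function names are hypothetical.

```python
import math


def se_kr20_formula2(rho, n, k):
    """Standard error of r20 from formula (2); requires N > 5 and K > 1."""
    if n <= 5 or k <= 1:
        raise ValueError("formula (2) needs N > 5 and K > 1")
    num = 2 * (n - 1) * ((n - 3) + (n - 1) * (k - 1))
    den = (k - 1) * (n - 3) ** 2 * (n - 5)
    return (1 - rho) * math.sqrt(num / den)


def se_kr20_via_f_variance(rho, n, k):
    """Same quantity, starting from the variance of a central F(d1, d2)."""
    if n <= 5 or k <= 1:
        raise ValueError("the F variance needs N > 5 and K > 1")
    d1 = (n - 1) * (k - 1)   # numerator degrees of freedom
    d2 = n - 1               # denominator degrees of freedom
    var_f = 2 * d2 ** 2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))
    return (1 - rho) * math.sqrt(var_f)


if __name__ == "__main__":
    for n in (15, 30, 60, 120):
        print(n, round(se_kr20_formula2(0.906, n, 80), 4))
```

For a population value of .906 with N = 15 and K = 80, both functions return about .049. The variance exists only when N exceeds 5, which is why the functions reject smaller samples.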
Substituting the expected values and variances of $MS_{ip}$ and $MS_p$, with zero covariance, gives a variance of $(1 - \rho_{20})^2\,[2/((N - 1)(K - 1)) + 2/(N - 1)]$. The standard error of $r_{20}$ is then:

$$\mathrm{S.E.}(r_{20}) \doteq (1 - \rho_{20}) \sqrt{\frac{2K}{(N - 1)(K - 1)}} \qquad (3)$$

A still cruder approximation,

$$\mathrm{S.E.}(r_{20}) \doteq (1 - \rho_{20}) \sqrt{\frac{2}{N - 1}} \qquad (4)$$

is obtained by assuming that $K - 1$ is approximately equal to $K$. For tests of typical length, formula (4) will give results similar to those of formula (3).

Baker (1962) conducted an empirical study of the sampling distributions of some common test analysis statistics, including the KR-20 coefficient. Using a population of 747 answer sheets of an 80-item test, Baker drew 200 Type I samples, with replacement, for each of four different sample sizes (N = 15, 30, 60, 120). The resulting standard deviations of the observed sample KR-20 coefficients are reported in Table 1 along with the theoretical standard errors obtained by using formulas (2), (3), and (4). As can be seen, there is fairly close agreement among the sets of values, which improves as the sample size increases.

Insert Table 1 about here

It is of some interest to note that for a parallel-test correlation with the same population value ($\rho_{xx}$ = .906) the theoretical standard errors for the four cases in Table 1 are .044, .031, .022, and .016 for an N of 15, 30, 60, and 120, respectively. These values are all higher
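The two approximations can be compared numerically with a short sketch such as the one below. The function names are hypothetical; the inputs (a population value of .906, K = 80, and N = 15, 30, 60, 120) are the ones quoted in the text.

```python
import math


def se_kr20_formula3(rho, n, k):
    """Approximate standard error of r20 from formula (3)."""
    return (1 - rho) * math.sqrt(2 * k / ((n - 1) * (k - 1)))


def se_kr20_formula4(rho, n):
    """Cruder approximation from formula (4), which treats K - 1 as K."""
    return (1 - rho) * math.sqrt(2 / (n - 1))


if __name__ == "__main__":
    rho, k = 0.906, 80
    for n in (15, 30, 60, 120):
        se3 = se_kr20_formula3(rho, n, k)
        se4 = se_kr20_formula4(rho, n)
        print(f"N={n:3d}  formula 3: {se3:.4f}  formula 4: {se4:.4f}")
```

The two formulas differ only by the factor sqrt(K / (K − 1)), which is about 1.006 for an 80-item test, so the gap between them is negligible at this test length and grows only for very short tests.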
than the corresponding standard errors of $r_{20}$ except for the smallest N (N = 15). A comparison of formulas (1) and (2) indicates that the standard error of $r_{20}$ will generally be smaller than the standard error of $r_{xx}$ for values of $\rho$, N, and K that are apt to be encountered in practice.

It should be noted that the standard errors given above should not be used for confidence limits. The sampling distribution of the sample reliability coefficient is skewed, and the coefficient is a biased estimator of the population value.
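The size of the bias can be illustrated under the same analysis-of-variance assumptions. Because (1 − r20)/(1 − ρ20) follows a central F distribution with N − 1 denominator degrees of freedom, and the mean of such an F is (N − 1)/(N − 3), the expected value of r20 is 1 − (1 − ρ20)(N − 1)/(N − 3). The F mean is a standard result that the bulletin does not state, and the function names below are hypothetical, so this is a sketch rather than the authors' own computation.

```python
def expected_kr20(rho, n):
    """Expected value of r20 under the normal ANOVA model; requires N > 3."""
    if n <= 3:
        raise ValueError("the mean of the F ratio exists only for N > 3")
    return 1 - (1 - rho) * (n - 1) / (n - 3)


def bias_kr20(rho, n):
    """Bias of r20, E(r20) - rho, which simplifies to -2(1 - rho)/(N - 3)."""
    return expected_kr20(rho, n) - rho


if __name__ == "__main__":
    for n in (15, 30, 60, 120):
        print(n, round(bias_kr20(0.906, n), 4))
```

Under these assumptions the bias is negative, does not depend on the number of items, and shrinks roughly in proportion to 1/N.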
References

Baker, F. B. Empirical determination of sampling distribution of item discrimination indices and a reliability coefficient. Madison, Wisconsin: University of Wisconsin, 1962.

Feldt, L. S. The approximate sampling distribution of Kuder-Richardson reliability coefficient twenty. Psychometrika, 1965, 30.

Hoyt, C. Test reliability estimated by analysis of variance. Psychometrika, 1941, 6.

Kendall, M. G., & Stuart, A. Advanced theory of statistics, Vol. 1. London: Charles Griffin & Co., Ltd., 1958.

Lord, F. M. Sampling fluctuations resulting from the sampling of test items. Psychometrika, 1955, 20, 1-22.
Table 1

Empirical and Theoretical Standard Errors of Kuder-Richardson Formula 20 for an 80-Item Test with $\rho_{20}$ = .906

Sample Size | Empirical Results (a) | Theoretical Results: Formula 2 | Formula 3 | Formula 4
... .028 ...

(a) Empirical results are based on 200 samples from a finite population of 747 test scores (Baker, 1962).
More informationPsy 420 Final Exam Fall 06 Ainsworth. Key Name
Psy 40 Final Exam Fall 06 Ainsworth Key Name Psy 40 Final A researcher is studying the effect of Yoga, Meditation, Anti-Anxiety Drugs and taking Psy 40 and the anxiety levels of the participants. Twenty
More informationReporting Subscores: A Survey
Research Memorandum Reporting s: A Survey Sandip Sinharay Shelby J. Haberman December 2008 ETS RM-08-18 Listening. Learning. Leading. Reporting s: A Survey Sandip Sinharay and Shelby J. Haberman ETS, Princeton,
More informationComputer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.
Simulation Discrete-Event System Simulation Chapter 8 Input Modeling Purpose & Overview Input models provide the driving force for a simulation model. The quality of the output is no better than the quality
More informationTesting the Untestable Assumptions of the Chain and Poststratification Equating Methods for the NEAT Design
Research Report Testing the Untestable Assumptions of the Chain and Poststratification Equating Methods for the NEAT Design Paul W. Holland Alina A. von Davier Sandip Sinharay Ning Han Research & Development
More informationUsing Dice to Introduce Sampling Distributions Written by: Mary Richardson Grand Valley State University
Using Dice to Introduce Sampling Distributions Written by: Mary Richardson Grand Valley State University richamar@gvsu.edu Overview of Lesson In this activity students explore the properties of the distribution
More informationA White Paper on Scaling PARCC Assessments: Some Considerations and a Synthetic Data Example
A White Paper on Scaling PARCC Assessments: Some Considerations and a Synthetic Data Example Robert L. Brennan CASMA University of Iowa June 10, 2012 On May 3, 2012, the author made a PowerPoint presentation
More informationA Non-parametric bootstrap for multilevel models
A Non-parametric bootstrap for multilevel models By James Carpenter London School of Hygiene and ropical Medicine Harvey Goldstein and Jon asbash Institute of Education 1. Introduction Bootstrapping is
More informationSTAT 501 EXAM I NAME Spring 1999
STAT 501 EXAM I NAME Spring 1999 Instructions: You may use only your calculator and the attached tables and formula sheet. You can detach the tables and formula sheet from the rest of this exam. Show your
More informationEquating Subscores Using Total Scaled Scores as an Anchor
Research Report ETS RR 11-07 Equating Subscores Using Total Scaled Scores as an Anchor Gautam Puhan Longjuan Liang March 2011 Equating Subscores Using Total Scaled Scores as an Anchor Gautam Puhan and
More information