Paul Barrett
1 Paul Barrett. Affiliations: The State Hospital, Carstairs; Dept. of Clinical Psychology, Univ. of Liverpool. 20th November, 1998 (*Addendum on 12/10/99)
2 What is Rasch Scaling? .1 A mathematical procedure that attempts to scale responses to individual items, such that the probability of answering an item in a certain way (whether YES/NO, or multiple-choice Likert format) is computed solely from the amount of a latent variable that a person is measured to possess and from the difficulty measure for the item.
3 What is Rasch Scaling? .2 Scaling can be defined as: the encoding of empirical observations using numbers to represent attribute/variable magnitudes, given a set of rules or axioms that the proposed measurement must subsequently satisfy.
4 What is Rasch Scaling? .3 A latent variable can be defined as: the particular, inferred construct that we are trying to measure with a set of items. This may be an ability, an attitude, or a personality variable such as Anxiety. A factor from a factor analysis is what we would also generally refer to as a latent variable or attribute.
5 What is Rasch Scaling? .4 A difficulty measure for an item can be defined as: classically, the ratio of the number of respondents scoring an item in the keyed or correct direction to the total number of respondents. In Rasch scaling, it is an index that expresses the position of the item on the latent variable scale: the point at which 50% of respondents located there would respond in the keyed or correct direction.
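As a minimal sketch of the classical definition, the proportion of respondents keying an item can be computed directly from a 0/1 response matrix. The response data and the function name `classical_difficulty` are illustrative assumptions, not taken from the talk.

```python
def classical_difficulty(responses, item):
    """Proportion of respondents scoring `item` in the keyed direction."""
    return sum(row[item] for row in responses) / len(responses)


# Hypothetical 0/1 responses: 4 respondents (rows) by 3 items (columns).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

p_values = [classical_difficulty(responses, i) for i in range(3)]
print(p_values)  # [0.75, 0.75, 0.25]
```

Note that a higher classical value means an easier item, whereas a higher Rasch location means a harder one.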
6 What is Rasch Scaling? .5 Critical Point: Rasch scaling uses the same scale of measurement for expressing both item difficulty and person ability. That is, the same unit of measurement is used to express difficulty and ability.
7 What is Rasch Scaling? An Item Characteristic Curve (ICC)
[Figure: ICC. Vertical axis: probability of a correct response to this item. Horizontal axis: latent variable measure, Theta (in z-scores).]
a = discrimination parameter: the value of the slope of the line at the midpoint of the curve (inflexion point).
b = item difficulty parameter: the location of the inflexion point of the curve on the Theta axis.
8 What is Rasch Scaling? The same ICC, which now includes the guessing parameter (c = 0.2)
[Figure: ICC with a lower asymptote. Vertical axis: probability of a correct response to this item. Horizontal axis: latent variable measure (in z-scores).]
c = guessing probability: the lower asymptote of the ICC curve.
a = discrimination parameter: the value of the slope of the line at the midpoint of the curve (inflexion point).
b = item difficulty parameter: the location of the inflexion point of the curve on the Theta axis.
9 What is Rasch Scaling? .8
The one-parameter Rasch Model:
$$p_i(\theta) = \frac{1}{1 + e^{-D(\theta - b_i)}}$$
The two-parameter IRT Model:
$$p_i(\theta) = \frac{1}{1 + e^{-D a_i (\theta - b_i)}}$$
The three-parameter IRT Model:
$$p_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-D a_i (\theta - b_i)}}$$
where
θ = the measure (score) of a person on the latent trait
b_i = the difficulty of item i
a_i = the discrimination of item i
c_i = the guessing probability for item i
D = a constant used for "normalisation"
Paul Barrett: BPS Millenium Conference: Beyond Psychometrics November 1998
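The three models on this slide are nested, which a single function can make concrete. The sketch below is illustrative only: the function name `icc` and the example parameter values are assumptions, and D = 1.7 is taken from the fitted model quoted later in the talk.

```python
import math

D = 1.7  # scaling constant; 1.7 is the value used in the talk's fitted model


def icc(theta, b, a=1.0, c=0.0, d=D):
    """Probability of a keyed response under the three-parameter model.

    With c = 0 this is the two-parameter model; with a = 1 and c = 0
    it reduces to the one-parameter (Rasch) form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))


# Hypothetical values: a person located exactly at the item difficulty.
p_rasch = icc(theta=0.0, b=0.0)               # 0.5 by construction
p_3pl = icc(theta=0.0, b=0.0, a=1.2, c=0.2)   # 0.2 + 0.8 * 0.5 = 0.6
print(p_rasch, p_3pl)
```

When theta equals b, the one- and two-parameter models give exactly 0.5, while the guessing parameter lifts the three-parameter value to c + (1 - c)/2.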
10 What is Rasch Scaling? .9 So, what we are doing is attempting to model the responses to items in a test, given the amount of the latent variable inferred to be present within every individual who provided responses, and the difficulty of each item. But, given that we do not know the individuals' latent variable scores or the item difficulties, these have to be estimated jointly. The solution is iterative, requiring a computer to implement the estimation process.
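One generic way to carry out such a joint, iterative estimation is to alternate Newton-Raphson updates for the person and item parameters. The sketch below is an assumption-laden illustration, not the algorithm used by any particular Rasch package: the data are hypothetical, D is fixed at 1 (the logit metric), step sizes are capped as a safeguard, and item difficulties are centred at zero to fix the origin of the scale.

```python
import math


def prob(theta, b):
    """Rasch probability of a keyed response, with D fixed at 1."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def capped(step, limit=1.0):
    """Limit the size of a Newton step to keep the iteration stable."""
    return max(-limit, min(limit, step))


def joint_estimate(data, n_iter=50):
    """Alternate Newton updates for person measures and item difficulties."""
    n_persons, n_items = len(data), len(data[0])
    theta = [0.0] * n_persons
    b = [0.0] * n_items
    for _ in range(n_iter):
        for j in range(n_persons):
            ps = [prob(theta[j], b[i]) for i in range(n_items)]
            info = sum(p * (1 - p) for p in ps)
            theta[j] += capped((sum(data[j]) - sum(ps)) / info)
        for i in range(n_items):
            ps = [prob(theta[j], b[i]) for j in range(n_persons)]
            info = sum(p * (1 - p) for p in ps)
            observed = sum(row[i] for row in data)
            b[i] -= capped((observed - sum(ps)) / info)
        # Fix the origin: mean item difficulty is zero.
        shift = sum(b) / n_items
        b = [x - shift for x in b]
        theta = [t - shift for t in theta]
    return theta, b


# Hypothetical data: 6 persons by 4 items, no zero or perfect scores.
data = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
]
theta, b = joint_estimate(data)
```

In this sketch each update depends on the data only through raw person and item totals, so persons with the same raw score receive the same measure. Persons or items with zero or perfect scores have no finite estimate and would need to be removed first.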
11 What is Rasch Scaling? .10 How do we create a test score (measure)? By summing the probabilities of keyed or correct responses for each item in the test, using our model parameters of item difficulty and person ability.
$$X_j = \sum_{i=1}^{K} P_i(\theta_j)$$
where X_j = the test score for person j, P_i(θ_j) = the probability for item i of a keyed response from person j with ability θ_j, and K = the number of items. And:
$$P_i(\theta) = \frac{1}{1 + e^{-D(\theta - b_i)}}$$
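The summation on this slide can be sketched in a few lines. The item difficulties below are hypothetical, chosen symmetrically around zero so the result is easy to check by hand, and D = 1.7 is assumed from the fitted model quoted later in the talk.

```python
import math


def rasch_prob(theta, b, d=1.7):
    """One-parameter probability of a keyed response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-d * (theta - b)))


def expected_score(theta, difficulties, d=1.7):
    """Model test score: the sum of item probabilities for one person."""
    return sum(rasch_prob(theta, b, d) for b in difficulties)


difficulties = [-1.0, 0.0, 1.0]  # hypothetical item locations
score = expected_score(0.0, difficulties)
print(score)  # 1.5 up to floating-point rounding: the two outer items sum to 1
```

Because the expected score increases strictly with theta, each raw score maps to exactly one person measure.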
12 Why should we prefer Rasch over CTT? .1 Classical Test Theory (CTT): The equation x = t + e provides the essence of the foundational proposition of this theory. x = the observed test score; t = a hypothetical error-free true score; e = the random error associated with a true score. Further, items are assumed to be sampled from universes or domains. Estimation of reliability and other parameters may be made using the algebra of linear sums.
13 Why should we prefer Rasch over CTT? .2 A Probabilistic form of Additive Conjoint Measurement. Conjoint Measurement.1: The function that describes the concatenation relation between two variables and a third can be deduced axiomatically from the measurements made of the outcome (the third variable) produced by combining the values of the two variables. In our case, the items and the amount of latent variable are combined to produce a third variable (the test score).
14 Why should we prefer Rasch over CTT? .3 Conjoint Measurement.2: It requires that the two variables in the concatenation operation are non-interactive (i.e. values on each variable can be manipulated independently of each other). It enables quantitative structure to be detected via ordinal relations upon a variable. As Cliff (1992) has written: a certain kind of mild-looking ordinal consistency among three or more variables is necessary and sufficient to define equal-interval scales.
15 Why should we prefer Rasch over CTT? .4 Why should this matter? Because measures created using the Rasch measurement model also satisfy the constraints of conjoint measurement. This means that creating tests using the Rasch measurement model gives you equal-interval measurement AND additivity of units. Further, Rasch measurement also gives you unidimensional measurement, given the measurement axioms are met (i.e. the model fits the data).
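The additive structure referred to here can be made explicit by taking log-odds of the one-parameter model given earlier in the talk. This is a standard algebraic rearrangement, offered as a sketch rather than as part of the original slides; the person indices j and k are notation introduced for illustration.

```latex
\ln\frac{p_i(\theta_j)}{1 - p_i(\theta_j)} = D(\theta_j - b_i)
\quad\Longrightarrow\quad
\ln\frac{p_i(\theta_j)}{1 - p_i(\theta_j)} - \ln\frac{p_i(\theta_k)}{1 - p_i(\theta_k)} = D(\theta_j - \theta_k)
```

On the log-odds scale the person and item terms combine by simple subtraction, with no interaction term, so the comparison between two persons does not depend on which item is used.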
16 Why should we prefer Rasch over CTT? .5 Resume of features associated with the model:
- Equal-interval, additive units of measurement
- An explicit ordering of items as a cumulative response scale
- Sample-free calibration of item and person parameters
- Computation of both item and person reliability
- Computation of the location-sensitive standard error-of-measurement over the range of the test measures
17 So far so good. But this is all theory. What happens in practice?
18 Data: EPQ-N (Neuroticism Scale), UK reference sample
Number of items = 23
Number of respondents = 4140 (mixed-gender adults)
Scale alpha =
First, I take a look at a single item characteristic curve prior to fitting the Rasch model. I am predicting the probability of respondents keying this item in the scored direction, for each possible scale score for the test (0-23). For convenience, I convert the scale scores into standardized z-score values prior to fitting a scaled logistic Rasch function to the item.
19 [Figure: Fitting the EPQ-N item #3 (Does your mood often go up or down?). Vertical axis: probability of correct response. Horizontal axis: z-score transformed raw score level. Model is: Probability = 1/(1 + e^(-1.7*( )*(x-( )))). Least squares fit = (% variance accounted for).]
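A least-squares fit of this kind can be sketched with a plain grid search over the two free parameters of the slide's model. Everything below is illustrative: the data are synthetic (generated from the model itself, not the EPQ-N sample), the grid ranges are arbitrary, and the names `fit_item`, `a` and `b` are assumptions.

```python
import math

D = 1.7  # scaling constant from the slide's model


def logistic(x, a, b):
    """Scaled logistic item curve: slope a, location b, x in z-score units."""
    return 1.0 / (1.0 + math.exp(-D * a * (x - b)))


def fit_item(xs, ys):
    """Grid-search least squares; returns (a, b, % variance accounted for)."""
    best = None
    for ai in range(1, 31):            # a from 0.1 to 3.0
        for bi in range(-30, 31):      # b from -3.0 to 3.0
            a, b = ai / 10, bi / 10
            sse = sum((y - logistic(x, a, b)) ** 2 for x, y in zip(xs, ys))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    sse, a, b = best
    mean_y = sum(ys) / len(ys)
    sst = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 100.0 * (1.0 - sse / sst)


# Synthetic "observed" proportions at nine z-score levels.
xs = [x / 2 for x in range(-4, 5)]
ys = [logistic(x, 1.2, 0.5) for x in xs]
a_hat, b_hat, pct = fit_item(xs, ys)
print(a_hat, b_hat)  # 1.2 0.5
```

With real data the observed proportion at each raw-score level would replace `ys`, and a numerical optimiser would normally replace the grid.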
20 I then fit the Rasch model to the scale of items (using the Andrich et al. RUMM software package). The data fail to fit the Rasch model (using the Chi-Square test of model fit) at P < (actually p ~2.5* ). Apart from 2 items, none fit the model (using standardised chi-square residual tests). However, I note that the Rasch person measures correlate at 0.99 with the conventional raw score.
21 [Figure: Scatterplot of Rasch measures vs raw EPQ-N scale scores (r = 0.99). UK reference sample data, N = 4140, mixed-gender sample. Axes: raw scale scores; Rasch measures.]
22 Next, I find that my item-difficulties are now quite different to those computed individually, as shown on the previous slide. The plot of Rasch predicted vs actual proportions of respondents scoring an item in the keyed direction, at each Raw Scale score (or Rasch measure) exemplifies this discrepancy for item #3.
23 [Figure: Rasch expected cumulative probabilities vs observed probabilities. Vertical axis: probability of a keyed response (observed data and predicted data). Horizontal axis: Rasch measures (correlate 0.99 with raw scores).]
24 It would appear that by concentrating solely upon adjusting item difficulty location parameters, and latent trait person parameters, whilst minimising the discrepancy between the predicted raw test scores and actual raw test scores, the Rasch modelling procedure has indirectly induced considerable item misfit whilst attempting to remain within the constraints required by the axiomatic measurement properties. This is somewhat unexpected and more than a cause for serious concern - especially as the model (and all items) fit when N=200 respondents!
25 This is obviously a problem with the Chi-Square test being too sensitive to discrepancies between the observed and model-generated proportions of respondents (at each scale score/Rasch measure) getting the item correct. Which leaves us with the problem of just how to assess fit of items to the model. There are solutions - but somewhat heuristic, I'm afraid.
26 The next exploration looks directly at a major purported benefit of the Rasch model - the creation of equal-interval, additive units of measurement for the latent variable. Here I comprehensively extended an example briefly presented by Fisher (1992) - where we use a bad ruler (unequal units over a range of measurement) to make ordinal measurement of true equal-interval unit lengths (in cm units). This tests the capacity of the Rasch model to uncover the true equal-interval scale that underlies the raw score measurement.
27 Here, I present 40 objects for measurement to my bad ruler, which consists of 16 unequal divisions of length. The objects are actually cm units on a real ruler, expressed in terms of my bad-ruler units. Each measurement is in the form of a dichotomy: a 1 is assigned to a bad-ruler unit if my cm measure extends beyond this unit. Where my cm measure is smaller than the remaining bad-ruler units, I assign a 0 to these units. E.g.
28 [Figure: The real ruler and the bad ruler.] A 1 cm measure on the good ruler would generate the following record: For 2 cm etc. In this way, we build up 40 records, which are like responses to items on a test.
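The construction of these records can be sketched as follows. The tick positions of the bad ruler are not preserved in this transcript, so the 16 values below are purely hypothetical; the object lengths of 1 to 40 cm and the rule "score 1 when the length reaches the tick" are likewise assumptions made for illustration.

```python
# Hypothetical positions (in cm) of the 16 unequally spaced bad-ruler units.
bad_ruler_ticks = [1, 2, 4, 5, 8, 10, 13, 15, 19, 22, 24, 28, 31, 35, 37, 40]


def record(length_cm, ticks):
    """Dichotomous record: 1 for each unit the length reaches, else 0."""
    return [1 if length_cm >= t else 0 for t in ticks]


# Assumed objects: 40 true lengths, 1 cm to 40 cm in 1 cm steps.
records = [record(length, bad_ruler_ticks) for length in range(1, 41)]
raw_scores = [sum(r) for r in records]

print(records[0])      # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(raw_scores[:5])  # [1, 2, 2, 3, 4]
```

Each record is a perfect Guttman pattern (a run of 1s followed by 0s), and the raw score is an ordinal count of units rather than a length in cm: objects of 2 cm and 3 cm receive the same score here.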
29 Fitting these slightly jittered data (for the Rasch model is a probabilistic model), we have 40 persons and 16 items to be provided with parameters. The rather simple-minded test here is whether the Rasch model will recover the equal-interval cm scale from the ordinal measures made by me. Fisher claimed it would - and my simplistic reasoning with regard to conjoint measurement would seem to suggest it should.
30 The model fits almost perfectly (chi-square probability ~ 0.99).
All items fit the model (via chi-square).
The actual raw scores for each person correlate with the expected scores computed from the Rasch modelling. This is no longer any surprise because the model-fit procedure is minimising this discrepancy.
The Rasch item location/difficulty parameters correlate with the bad-ruler units. Now this IS interesting.
31 [Figure: Rasch "Difficulty" parameters and ordinal units against actual cm measurement. Vertical axis: a scale for Rasch measures and the raw score units (item difficulty; raw score units). Horizontal axis: measurement in actual cm.]
32 The Rasch estimates are mirroring my bad-ruler units. If we did not know that my 16 units were (in reality) unequally spaced, we would probably treat them as equal-interval, and plot them accordingly, which means that the Rasch difficulty/location parameters would also now be equally spaced.
33 [Figure: Rasch difficulty parameters vs the ordinal rank units. Vertical axis: the Rasch difficulty measure for each item "unit". Horizontal axis: the ordinal "bad-ruler" units.]
34 A final test, to confirm my suspicion that the Rasch model is NOT able to address the issue of a fundamental unit of measurement! Here, I map my bad-ruler units onto log_e(cm), then use the units to make measurement as before. The graph on the next slide shows the extent to which my bad ruler is now making curvilinear measurement of a set of extensive, equal-interval cm-unit objects.
35 [Figure: Centimetre "objects" vs log(cm) ordinal units. Vertical axis: centimetres (cm). Horizontal axis: log(cm) with imposed ordinal "ticks".]
36 The model fits almost perfectly (chi-square probability ~ 0.97).
All but 1 item fit the model (via chi-square).
The Rasch item location/difficulty parameters correlate with the bad-ruler units.
Once again, the Rasch model is seen to attempt to linearise the 16 items and 40 length measures - and it succeeds well. BUT, those item units were mapped onto extensive units of measurement, using a logarithmic concatenation of cm to bad-ruler units.
37 [Figure: Rasch difficulty and ordinal units against actual cm measurement. Vertical axis: a scale for Rasch measures and the raw score units (item difficulty; raw score units). Horizontal axis: measurement in actual cm.]
38 [Figure: Rasch difficulty parameters vs the ordinal rank units. Vertical axis: the Rasch difficulty measure for each item "unit". Horizontal axis: the ordinal "bad-ruler" units.]
39 Critical Point: Given that the deterministic axioms of Luce and Tukey's (1964) simultaneous conjoint measurement hold for this form of probabilistic model, then Rasch scaling is producing equal-interval, additive units of measurement, but of what exactly? By demonstrating its virtual identity with the raw scores and ranked (but equal-interval) ordinal units, I am led to conclude it is producing an equal-interval scaling of the numerals representing the ranked items.
40 Critical Point: Given my mapped units were equal-interval, it is then no surprise that the Rasch locations and rank-unit locations are so closely related in a linear function. But the key issue is that the real unit of measurement (cm) was never exposed by the model. It is this result that causes me to question the automatic use of the model for psychological science investigations.
41 Critical Point: My use of the Rasch model here seems to be dependent upon some form of inductive logic - that is, I use the model to determine a unit of unknown meaning. But surely science proceeds by first defining a meaningful (in some theoretical sense) unit, then designing measurement to determine if that unit functions in the manner specified by some theory?
42 Critical Point Thus, if we are to use the Rasch model productively, it cannot be used as an inductive unit-generating procedure, but rather, as part of a hypothetico-deductive process of investigation.
43 So, might we conclude that the Rasch model is of more value to scientific investigation than True-Score Theory? On balance, YES. Given that a quantitative science requires equal-interval, additive units of measurement for variables, then there really is no alternative to the Rasch model for psychological measurement. However, we have also seen that scaling, in the absence of theory for the fundamental unit of measurement for a latent variable, is not of value except perhaps pragmatically.
44 Four clear, justifiable, and pragmatic reasons to use the Rasch Model, given fit of the model to your data:
- An explicit ordering of items as a cumulative response scale, on a common linear metric shared with person measures.
- Additive units of measurement.
- Computation of both item and person reliability.
- Computation of the location-sensitive standard error-of-measurement/information function over the range of the latent variable measures.
45 Conclusions.1 ❶ The Rasch Model is dangerous for the wrong reasons! Whereas I was thinking of it as a means to assist in the development of fundamental standard units of measurement for latent variables, this is not possible without first having a model for what and how these units should be instantiated within a deductive theoretical framework.
46 Conclusions.2 ❷ The problem remains with our conception of the constituent properties of latent variables - and their proposed units of measurement. ❸ The Rasch model provides more information about any test respondent, and test items, than does CTT. For pragmatic purposes alone, this is surely of benefit to the applied psychological professions, both test developers and test users.
47 Conclusions.3 ❹ CERTAIN domains of Psychometric tests do have good validity - the recent paper by Schmidt and Hunter (Sept. 1998) in the Psychological Bulletin demonstrates this clearly but also demonstrates the poverty of scientific understanding in this area. Schmidt, F.L. and Hunter, J.E. (1998) The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 years of research findings. Psychological Bulletin, 124, 2,
48 Conclusions.4 ❺ If we question the use of the Rasch model, on the basis of the stability of the constructs we are measuring (because of environmental or situational factors that may change over time), then we are in fact NOT questioning the Rasch model at all, but the very rules and meaning by which we are proposing to instantiate our constructs. To spurn the possibility of equal-interval measurement on this basis is quite wrong. Rather, we need to consider the conceptual status of what it is that we think we are measuring.
49 Addendum.1 - 12/10/99 Paul Barrett: Addendum October 1999. Following this presentation in November 1998, Ben Wright from the MESA group at Chicago re-analysed my data and concluded that there was insufficient stochasticity (random error) in my observations. In short, my data may have been artificially too clean for the Rasch model to fit well (as it is a form of probabilistic conjoint scaling, not deterministic). I am not happy with the implications of this, although I see exactly the veracity of his argument. I think our disagreement may lie in the fact that others (like William Fisher Jnr.) see the meaning-measurement unit linkage as a construction that is created after Rasch measurement is created. For me, this is the wrong way round. You first need to define the meaning, then develop the measurement, based upon the conceptualisation of a meaningful unit. Simply scaling a set of items (e.g. arithmetic items), then deciding that the equal-interval unit can be known as arithmets, is acceptable if the only purpose of the measurement is pragmatic, but not acceptable if the aim of this work is to make statements about the magnitude of some psychological attribute/process that underlies an individual's ability to solve arithmetic items.
50 Addendum.2 - 12/10/99 So, I am now setting up a better data-generation program that gives me greater control over the amount of error I introduce, and the amount of ordinality in my measurement ruler. Then, I intend to produce some better tests of my propositions. Further, as David Andrich pointed out, he does not use the chi-square statistic as a measure of model or item fit, but rather uses a mixture of graphical, tabular, and other data to examine the issue of model fit. However, I note that George Karabatsos (gkarab@lsumc.edu) is working on model fit from the perspective of additive conjoint measurement (ACM). He noted in a recent message: "At face value, this approach to correspond goodness of fit with the ACM axioms may seem convenient and sensible. But the goodness of fit stats are hampered by several issues. As I found out in my dissertation work, and later in several sources (please refer to the bottom of this message), the correspondence between goodness of fit statistics and measurement axioms is far from perfect. For instance, the Rasch model can conclude perfect fit, while the axiom tests reveal significant departures from the measurement model." The rest of the message was almost as bleak for me! (Rasch listserv rasch@acer.edu.au, message date 27/09/99.) Anyway, time for some more detailed exploration into this measurement model.
More informationPackage paramap. R topics documented: September 20, 2017
Package paramap September 20, 2017 Type Package Title paramap Version 1.4 Date 2017-09-20 Author Brian P. O'Connor Maintainer Brian P. O'Connor Depends R(>= 1.9.0), psych, polycor
More informationOn the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit
On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit March 27, 2004 Young-Sun Lee Teachers College, Columbia University James A.Wollack University of Wisconsin Madison
More informationA Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts
A Study of Statistical Power and Type I Errors in Testing a Factor Analytic Model for Group Differences in Regression Intercepts by Margarita Olivera Aguilar A Thesis Presented in Partial Fulfillment of
More informationDiagnostics and Transformations Part 2
Diagnostics and Transformations Part 2 Bivariate Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University Multilevel Regression Modeling, 2009 Diagnostics
More informationA Little History Incompleteness The First Theorem The Second Theorem Implications. Gödel s Theorem. Anders O.F. Hendrickson
Gödel s Theorem Anders O.F. Hendrickson Department of Mathematics and Computer Science Concordia College, Moorhead, MN Math/CS Colloquium, November 15, 2011 Outline 1 A Little History 2 Incompleteness
More informationBayesian Methods for Testing Axioms of Measurement
Bayesian Methods for Testing Axioms of Measurement George Karabatsos University of Illinois-Chicago University of Minnesota Quantitative/Psychometric Methods Area Department of Psychology April 3, 2015,
More informationLocal response dependence and the Rasch factor model
Local response dependence and the Rasch factor model Dept. of Biostatistics, Univ. of Copenhagen Rasch6 Cape Town Uni-dimensional latent variable model X 1 TREATMENT δ T X 2 AGE δ A Θ X 3 X 4 Latent variable
More informationappstats27.notebook April 06, 2017
Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationThe Discriminating Power of Items That Measure More Than One Dimension
The Discriminating Power of Items That Measure More Than One Dimension Mark D. Reckase, American College Testing Robert L. McKinley, Educational Testing Service Determining a correct response to many test
More informationDimensionality Assessment: Additional Methods
Dimensionality Assessment: Additional Methods In Chapter 3 we use a nonlinear factor analytic model for assessing dimensionality. In this appendix two additional approaches are presented. The first strategy
More informationChapter One. The Real Number System
Chapter One. The Real Number System We shall give a quick introduction to the real number system. It is imperative that we know how the set of real numbers behaves in the way that its completeness and
More informationEstimating ability for two samples
Estimating ability for two samples William Revelle David M. Condon Northwestern University Abstract Using IRT to estimate ability is easy, but how accurate are the estimate and what about multiple samples?
More informationPath Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis
Path Analysis PRE 906: Structural Equation Modeling Lecture #5 February 18, 2015 PRE 906, SEM: Lecture 5 - Path Analysis Key Questions for Today s Lecture What distinguishes path models from multivariate
More informationProbability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur
Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur Lecture No. # 38 Goodness - of fit tests Hello and welcome to this
More informationCalifornia Content Standard. Essentials for Algebra (lesson.exercise) of Test Items. Grade 6 Statistics, Data Analysis, & Probability.
California Content Standard Grade 6 Statistics, Data Analysis, & Probability 1. Students compute & analyze statistical measurements for data sets: 1.1 Compute the mean, median & mode of data sets 1.2 Understand
More informationWalkthrough for Illustrations. Illustration 1
Tay, L., Meade, A. W., & Cao, M. (in press). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods. doi: 10.1177/1094428114553062 Walkthrough for Illustrations
More informationSlope Fields: Graphing Solutions Without the Solutions
8 Slope Fields: Graphing Solutions Without the Solutions Up to now, our efforts have been directed mainly towards finding formulas or equations describing solutions to given differential equations. Then,
More informationTest Homogeneity The Single-Factor Model. Test Theory Chapter 6 Lecture 9
Test Homogeneity The Single-Factor Model Test Theory Chapter 6 Lecture 9 Today s Class Test Homogeneity. The Single Factor Model. AKA the Spearman model. Chapter 6. Homework questions? Psych 892 - Test
More informationLearning Causal Direction from Transitions with Continuous and Noisy Variables
Learning Causal Direction from Transitions with Continuous and Noisy Variables Kevin W. Soo (kws10@pitt.edu) Benjamin M. Rottman (rottman@pitt.edu) Department of Psychology, University of Pittsburgh 3939
More informationBLAST: Target frequencies and information content Dannie Durand
Computational Genomics and Molecular Biology, Fall 2016 1 BLAST: Target frequencies and information content Dannie Durand BLAST has two components: a fast heuristic for searching for similar sequences
More informationCh. 16: Correlation and Regression
Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to
More informationChapter 7 Linear Regression
Chapter 7 Linear Regression 1 7.1 Least Squares: The Line of Best Fit 2 The Linear Model Fat and Protein at Burger King The correlation is 0.76. This indicates a strong linear fit, but what line? The line
More informationItem Response Theory (IRT) Analysis of Item Sets
University of Connecticut DigitalCommons@UConn NERA Conference Proceedings 2011 Northeastern Educational Research Association (NERA) Annual Conference Fall 10-21-2011 Item Response Theory (IRT) Analysis
More informationWhat is an Ordinal Latent Trait Model?
What is an Ordinal Latent Trait Model? Gerhard Tutz Ludwig-Maximilians-Universität München Akademiestraße 1, 80799 München February 19, 2019 arxiv:1902.06303v1 [stat.me] 17 Feb 2019 Abstract Although various
More informationA Guide to Proof-Writing
A Guide to Proof-Writing 437 A Guide to Proof-Writing by Ron Morash, University of Michigan Dearborn Toward the end of Section 1.5, the text states that there is no algorithm for proving theorems.... Such
More informationProbability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology Kharagpur
Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology Kharagpur Lecture No. #13 Probability Distribution of Continuous RVs (Contd
More informationMATHEMATICS (MIDDLE GRADES AND EARLY SECONDARY)
MATHEMATICS (MIDDLE GRADES AND EARLY SECONDARY) l. Content Domain Mathematical Processes and Number Sense Range of Competencies Approximate Percentage of Test Score 0001 0003 24% ll. Patterns, Algebra,
More informationChapter 5: Preferences
Chapter 5: Preferences 5.1: Introduction In chapters 3 and 4 we considered a particular type of preferences in which all the indifference curves are parallel to each other and in which each indifference
More informationDraft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM
1 REGRESSION AND CORRELATION As we learned in Chapter 9 ( Bivariate Tables ), the differential access to the Internet is real and persistent. Celeste Campos-Castillo s (015) research confirmed the impact
More informationConditional Probability in the Light of Qualitative Belief Change. David Makinson LSE Pontignano November 2009
Conditional Probability in the Light of Qualitative Belief Change David Makinson LSE Pontignano November 2009 1 Contents Explore ways in which qualitative belief change in AGM tradition throws light on
More informationPSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing
Page Title PSY 305 Module 3 Introduction to Hypothesis Testing Z-tests Five steps in hypothesis testing State the research and null hypothesis Determine characteristics of comparison distribution Five
More informationSemiparametric Generalized Linear Models
Semiparametric Generalized Linear Models North American Stata Users Group Meeting Chicago, Illinois Paul Rathouz Department of Health Studies University of Chicago prathouz@uchicago.edu Liping Gao MS Student
More informationFairfield Public Schools
Mathematics Fairfield Public Schools Pre-Algebra 8 Pre-Algebra 8 BOE Approved 05/21/2013 1 PRE-ALGEBRA 8 Critical Areas of Focus In the Pre-Algebra 8 course, instructional time should focus on three critical
More informationRegression Analysis: Exploring relationships between variables. Stat 251
Regression Analysis: Exploring relationships between variables Stat 251 Introduction Objective of regression analysis is to explore the relationship between two (or more) variables so that information
More informationIntroduction to Survey Analysis!
Introduction to Survey Analysis! Professor Ron Fricker! Naval Postgraduate School! Monterey, California! Reading Assignment:! 2/22/13 None! 1 Goals for this Lecture! Introduction to analysis for surveys!
More informationDrug Combination Analysis
Drug Combination Analysis Gary D. Knott, Ph.D. Civilized Software, Inc. 12109 Heritage Park Circle Silver Spring MD 20906 USA Tel.: (301)-962-3711 email: csi@civilized.com URL: www.civilized.com abstract:
More informationPsychometric Issues in Formative Assessment: Measuring Student Learning Throughout the Academic Year Using Interim Assessments
Psychometric Issues in Formative Assessment: Measuring Student Learning Throughout the Academic Year Using Interim Assessments Jonathan Templin The University of Georgia Neal Kingston and Wenhao Wang University
More informationRon Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)
Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October
More informationAbility Metric Transformations
Ability Metric Transformations Involved in Vertical Equating Under Item Response Theory Frank B. Baker University of Wisconsin Madison The metric transformations of the ability scales involved in three
More informationWarm-up Using the given data Create a scatterplot Find the regression line
Time at the lunch table Caloric intake 21.4 472 30.8 498 37.7 335 32.8 423 39.5 437 22.8 508 34.1 431 33.9 479 43.8 454 42.4 450 43.1 410 29.2 504 31.3 437 28.6 489 32.9 436 30.6 480 35.1 439 33.0 444
More informationInferences About the Difference Between Two Means
7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent
More information4. DEDUCING THE MEASUREMENT MODEL
4. DEDUCING THE MEASUREMENT MODEL OBJECTIVITY In this chapter we deduce the Rasch Model from Thurstone's requirement that item comparisons be sample free. Thurstone (1928) says, "The scale must transcend
More informationINTRODUCTION TO ANALYSIS OF VARIANCE
CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two
More informationUSING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS. by MARY MAXWELL
USING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS by MARY MAXWELL JIM GLEASON, COMMITTEE CHAIR STAVROS BELBAS ROBERT MOORE SARA TOMEK ZHIJIAN WU A DISSERTATION Submitted
More informationSampling Distributions
Sampling Distributions Sampling Distribution of the Mean & Hypothesis Testing Remember sampling? Sampling Part 1 of definition Selecting a subset of the population to create a sample Generally random sampling
More informationFoundations of Mathematics 11 and 12 (2008)
Foundations of Mathematics 11 and 12 (2008) FOUNDATIONS OF MATHEMATICS GRADE 11 [C] Communication Measurement General Outcome: Develop spatial sense and proportional reasoning. A1. Solve problems that
More informationContents. Acknowledgments. xix
Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables
More information1 Random and systematic errors
1 ESTIMATION OF RELIABILITY OF RESULTS Such a thing as an exact measurement has never been made. Every value read from the scale of an instrument has a possible error; the best that can be done is to say
More informationMIDLAND ISD ADVANCED PLACEMENT CURRICULUM STANDARDS AP CALCULUS BC
Curricular Requirement 1: The course teaches all topics associated with Functions, Graphs, and Limits; Derivatives; Integrals; and Polynomial Approximations and Series as delineated in the Calculus BC
More informationESTIMATION OF IRT PARAMETERS OVER A SMALL SAMPLE. BOOTSTRAPPING OF THE ITEM RESPONSES. Dimitar Atanasov
Pliska Stud. Math. Bulgar. 19 (2009), 59 68 STUDIA MATHEMATICA BULGARICA ESTIMATION OF IRT PARAMETERS OVER A SMALL SAMPLE. BOOTSTRAPPING OF THE ITEM RESPONSES Dimitar Atanasov Estimation of the parameters
More informationMIDLAND ISD ADVANCED PLACEMENT CURRICULUM STANDARDS AP CALCULUS AB
Curricular Requirement 1: The course teaches all topics associated with Functions, Graphs, and Limits; Derivatives; and Integrals as delineated in the Calculus AB Topic Outline in the AP Calculus Course
More informationLinearity in Calibration:
Linearity in Calibration: The Durbin-Watson Statistic A discussion of how DW can be a useful tool when different statistical approaches show different sensitivities to particular departures from the ideal.
More information