For: AERA-D Rasch Measurement SIG, New Orleans, USA, 3rd April, 2002
Symposium entitled: Is Educational Measurement really possible?
Chair and Organizer: Assoc. Prof. Trevor Bond

Paul Barrett
email: p.barrett@liv.ac.uk, paul.barrett@mariner7.com, p.barrett@xtra.co.nz
http://www.liv.ac.uk/~pbarrett/paulhome.htm

Affiliations:
Mariner7 Ltd., Auckland NZ
Dept. of Psychology, Univ. of Auckland
Dept. of Clinical Psychology, Univ. of Liverpool

Consider creating a measure of something or somebody. What comes first? Some notion of the specific feature or attribute [variable] for which you would like to differentially identify magnitudes, or the operation of constructing measurement?

We construct measurement for a purpose. That purpose requires that we have a reason for such construction. This reason implies that we have an understanding about why the measurement of something will be worth constructing. This understanding requires some meaning-laden statements about the something, otherwise we would never have thought of the purpose in the first place.

Now you have to decide what kind of measurement you wish to make. "Kind of" is conveniently delimited by whether or not you wish to invoke a concatenation operation using a standard unit, with which all measures of magnitude of your variable will be constructed.

So, a standard unit is not for you. This means you will at best be able to make measurement using only ordinal relations between measured magnitudes, i.e. magnitudes expressed as ranks and order relations, with no additive operations. This is still of utility, but it will limit the understanding of causes of phenomena to explanations couched in terms of qualitative magnitudes.
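A minimal sketch of what "ranks, but no additive operations" means in practice. The scores and group names are hypothetical, invented purely for illustration. Ranks survive any order-preserving rescaling of the scores, whereas a comparison of means (which needs addition) can reverse.

```python
import math


def ranks(values):
    """Return ranks starting at 1 for the smallest value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def mean(values):
    """Arithmetic mean: only meaningful if addition of scores is meaningful."""
    return sum(values) / len(values)


# Hypothetical ordinal scores for two groups (illustration only).
group_a = [1.0, 2.0, 9.0]
group_b = [3.0, 4.0, 4.5]
pooled = group_a + group_b

# An order-preserving (monotone) rescaling of the same scores.
log_a = [math.log(x) for x in group_a]
log_b = [math.log(x) for x in group_b]
log_pooled = log_a + log_b

# The order relations are untouched by the rescaling ...
print(ranks(pooled) == ranks(log_pooled))

# ... but the comparison of means is not.
print(mean(group_a) > mean(group_b))
print(mean(log_a) > mean(log_b))
```

If only the order of magnitudes is warranted, a conclusion that depends on which rescaling was chosen is not a conclusion about the variable.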

Otherwise known as the siren call of social scientists! The distinguished speakers before me have clearly explained the errors of logic involved in using the operations of additive concatenation without ever considering whether the variable so measured was capable of sustaining such operations. You now know it's not all just numbers.

Maybe, but hope persists because there is Rasch scaling. Using this, we can create probabilistic equal-interval measurement of latent variables using order-relation (or even categorical relation) comparisons between levels or categories on two variables to imply the equal-interval properties of the third, derived latent variable. Voila! Quantitative scientific measurement, with the standard unit constructed via the scaling operation.
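For reference, the dichotomous Rasch model is usually written as below. The symbols are my own labels rather than anything fixed by this talk: theta_n is the location of person n on the latent variable, b_i is the location (difficulty) of item i, and X_ni is the 0/1 response of person n to item i.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)},
\qquad
\ln \frac{P(X_{ni} = 1)}{1 - P(X_{ni} = 1)} = \theta_n - b_i
```

The second form shows where the "standard unit" comes from: persons and items are located on one common log-odds (logit) scale, and only their difference enters the model.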

Can any two variables be so conjoined to produce a third? If the axioms of conjoint measurement are met, or the statistical fit indices of the modelling procedure deem it so, then YES. Barrett and Kline (1981) showed this with a single test constructed of Extraversion and Neuroticism items from the Eysenck Personality Questionnaire.
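As a rough illustration of what "meeting the axioms" involves, here is a sketch of a check for one of the cancellation conditions of additive conjoint measurement, double cancellation, on a small two-way table. The function name and both tables are hypothetical, and the check is deterministic: it takes the cell values at face value and ignores sampling error entirely.

```python
from itertools import permutations


def satisfies_double_cancellation(table):
    """Check double cancellation on a two-way table of ordered values.

    For every triple of rows (a, b, c) and columns (x, y, z):
    if table[b][x] >= table[a][y] and table[c][y] >= table[b][z],
    then table[c][x] >= table[a][z] must also hold.
    """
    n_rows = len(table)
    n_cols = len(table[0])
    for a, b, c in permutations(range(n_rows), 3):
        for x, y, z in permutations(range(n_cols), 3):
            premises = table[b][x] >= table[a][y] and table[c][y] >= table[b][z]
            if premises and not table[c][x] >= table[a][z]:
                return False
    return True


# Hypothetical table built additively from row and column effects.
row_effects = [0, 1, 3]
col_effects = [0, 2, 5]
additive_table = [[r + c for c in col_effects] for r in row_effects]

# Hypothetical table with no additive structure.
non_additive_table = [
    [1, 2, 9],
    [3, 5, 6],
    [4, 7, 10],
]

print(satisfies_double_cancellation(additive_table))      # True
print(satisfies_double_cancellation(non_additive_table))  # False
```

Passing such a check says that the ordering of the cells is consistent with an additive representation. It says nothing about whether the conjoined variables mean anything together, which is the point of the examples that follow.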

Didn't Robert Wood (1978) fit random coin-toss data with a Rasch model? YES, easily in fact. What was created was an equal-interval latent variable of coin-tossing ability! This is the result of measurement construction which is literally meaningless.
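To see how easily numbers on an interval-looking scale fall out of noise, here is a small simulation. It is not Wood's procedure and not a full Rasch estimation: it only computes crude raw-score log-odds for simulated "persons" and "items". The sample sizes, the seed, and the function names are all assumptions made for the sketch.

```python
import math
import random


def simulate_coin_toss_responses(n_persons, n_items, seed=0):
    """Each 'response' is a fair coin toss: 1 = heads, 0 = tails."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_items)] for _ in range(n_persons)]


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def raw_score_logits(responses):
    """Crude log-odds 'locations' from raw scores (not a fitted Rasch model)."""
    n_persons = len(responses)
    n_items = len(responses[0])
    person_logits = []
    for row in responses:
        score = sum(row)
        # Extreme scores (all 0 or all 1) have no finite logit and are skipped.
        if 0 < score < n_items:
            person_logits.append(logit(score / n_items))
    item_logits = []
    for j in range(n_items):
        total = sum(row[j] for row in responses)
        if 0 < total < n_persons:
            # Item 'difficulty' is the log-odds of a tail (a 'failure').
            item_logits.append(-logit(total / n_persons))
    return person_logits, item_logits


responses = simulate_coin_toss_responses(n_persons=200, n_items=20, seed=2002)
person_logits, item_logits = raw_score_logits(responses)

# 'Coin-tossing ability' now appears to vary from person to person.
print(min(person_logits), max(person_logits))
print(min(item_logits), max(item_logits))
```

The spread of person "abilities" here is produced entirely by chance, yet it arrives dressed in logits.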

Yes, definitely. The fault lies not with the methodology in the two examples just mentioned, but with the meaning-laden conditions under which the scaling was initiated.

And you'd still have exactly the same problems. My abstract mentioned Rozeboom's paradox, a simple example of the failure of additive concatenation with the physical measurement of the volume of two liquids.

Suppose that my garage contains exactly a pint of brake fluid, exactly two quarts of alcohol, exactly a gallon of distilled water, and not a trace of any other fluids. Do you agree that this implies that my garage contains 13 pints of liquid -- not just approximately but EXACTLY? If so, how do you reach that conclusion? The proximal argument, of course, is that 1 pint + 4 pints + 8 pints equals 13 pints

But how does additivity apply here? Does it follow by what we have learned about the physical concatenation of liquids that if the fluids in my hypothetical garage were to be poured together into a suitably calibrated container of sufficient size, the mixture would measure exactly 13 pints or differ from that only by what can be explained by evaporation and some adhesion to the original containers?

Unfortunately for the concatenation argument, this is known (or at least alleged by second-hand information I have encountered) to be untrue: Distilled water absorbs enough alcohol to reduce the combined volumes to something less than the expected 13 pints.
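The "proximal argument" in the garage example is pure unit arithmetic, which can be sketched as follows. The conversion table assumes the usual 2 pints to a quart and 8 pints to a gallon; the names are mine.

```python
from fractions import Fraction

# Assumed conversion factors: 1 quart = 2 pints, 1 gallon = 8 pints.
PINTS_PER_UNIT = {
    "pint": Fraction(1),
    "quart": Fraction(2),
    "gallon": Fraction(8),
}


def to_pints(amount, unit):
    """Convert an amount in the given unit to an exact number of pints."""
    return Fraction(amount) * PINTS_PER_UNIT[unit]


# The contents of the hypothetical garage, as listed in the quotation.
garage = [
    ("brake fluid", 1, "pint"),
    ("alcohol", 2, "quart"),
    ("distilled water", 1, "gallon"),
]

total_pints = sum(to_pints(amount, unit) for _, amount, unit in garage)
print(total_pints)  # 13
```

The sum is exact as arithmetic. Whether the liquids, once poured together, would measure 13 pints is an empirical question about the liquids, and the code has nothing to say about it.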

Because it shows again that manipulating quantitative objects without regard to the meaning of the units of those objects can lead to unexpected errors, as shown above. This is not to argue for a one-to-one isomorphism of a unit with some physical property of an object (as per Campbell's thesis), but to stress that we do need to understand why a measure works as it does.

In the Rozeboom paradox, this "knowing why" is crucial to explaining the apparent failure of simple additive concatenation. The explanation in this particular case is found in the consideration of the constituent properties of volume, measured as pints of liquid, and understood in terms of molecular density, composition, and interaction.

Can we really make measurement like this in the social sciences? Well, what do we mean by "this"? If we mean can we produce equal-interval measurement scales with certain properties of measurement, for meaningful variables, then YES. The literature abounds with examples from many domains. BUT

We now realise that constructing measurement scales without deep regard to what it is that we are attempting to measure is likely to end in a morass of competing and virtually arbitrary scales with practically no coherent means of choosing one over any other. What's worse is that none of them are likely to make very accurate measurement, except by chance alone.

Michael Maraun (1998): "Measurement practice in psychology misdiagnoses the nature of measurement, since it is uniformly formulated under the assumption that measurement claims are justified in large part through empirical case-building [aka construct validity]" (p. 436)

The problem is that, in construct validation theory, knowing about something is confused with an understanding of the meaning of the concept that denotes that something. But, if we look at Cronbach and Meehl: "Scientifically speaking, to make clear what something is means to set forth the laws in which it occurs." (Cronbach and Meehl, 1955)

This is mistaken. One may know more or less about it, build a correct or incorrect case about it, articulate to a greater or lesser extent the laws into which it enters, discover much, or very little about it. However, these activities all presuppose rules for the application of the concept that denotes it (e.g. intelligence, dominance).

Furthermore, one must be prepared to cite these standards as justification for the claim that these empirical facts are about it. (Maraun, 1998, p. 448)

Let us also note Maraun (1998) again: "The relative lack of success of measurement in the social sciences as compared to the physical sciences is attributable to their sharply different conceptual foundations.

In particular, the physical sciences rest on a bedrock of technical concepts, whilst psychology rests on a web of common-or-garden psychological concepts. These concepts have notoriously complicated grammars [of meaning]." (p. 436)

Whatever measurement is to be created, if at all possible, will need to be created within a normative frame of meaning. That is, it is impossible to create measures of intelligence or depression unless these constructs/phenomena have a normative meaning such that all investigators can work within this common semantic framework.

Without this normative agreement, as is the case today, chaos reigns as measure after measure is produced but with no common units or unambiguous common/shared meaning.

Step 1: Define a normative meaning for your technical construct. It will be of narrow focus, capable of sustaining precise measurement.
Step 2: Construct appropriate normative measurement for this construct.
Step 3: Test the hypothesis that the measurement does indeed imply the normative meaning of the construct as defined.
Step 4: Maintain this measurement via metrology.

Of course, it is already a reality. We classify and make ordinal statements as a matter of course. However, the real question is: Is it possible to make measurement that accords with the properties required by the instantiation of a concatenation function using a standard measurement unit?

To answer this question, steps 1 to 3 above are required. So, put away the Hierarchical Linear Modelling, Structural Equation Models, and all those over-powered statistical modelling and scaling techniques that demand an additive concatenation unit. Sit down, and first THINK about Step 1 and how you aim to define the technical, normative, meaning of the constructs you propose.

Then, if you wish to construct measurement using an additive concatenation unit, use the Rasch model as your means of operationalising this. Begin to construct your measurement with that very specific, proposed normative meaning in mind. Then, if you are successful in producing measurement with the properties you specified, maintain it via metrology.

Barrett, P. T., & Kline, P. (1981) A comparison between Rasch analysis and factor analysis of items in the EPQ. Personality Study and Group Behaviour, 1, 2, 11-28.

Cronbach, L. J., & Meehl, P. (1955) Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Maraun, M. (1998) Measurement as a Normative Practice. Theory and Psychology, 8, 4, 435-461.

Rozeboom, W. W. (1966) Scaling Theory and the Nature of Measurement. Synthese, 16, 170-233.

Wood, R. (1978) Fitting the Rasch Model: a heady tale. British Journal of Mathematical and Statistical Psychology, 31, 27-32.