Item Response Theory and Computerized Adaptive Testing

Similar documents
An Overview of Item Response Theory. Michael C. Edwards, PhD

Basic IRT Concepts, Models, and Assumptions

Assessment, analysis and interpretation of Patient Reported Outcomes (PROs)

Comparing Multi-dimensional and Uni-dimensional Computer Adaptive Strategies in Psychological and Health Assessment. Jingyu Liu

Using modern statistical methodology for validating and reporti. Outcomes

Overview. Multidimensional Item Response Theory. Lecture #12 ICPSR Item Response Theory Workshop. Basics of MIRT Assumptions Models Applications

Latent Trait Reliability

Anders Skrondal. Norwegian Institute of Public Health London School of Hygiene and Tropical Medicine. Based on joint work with Sophia Rabe-Hesketh

Summer School in Applied Psychometric Principles. Peterhouse College 13 th to 17 th September 2010

Paul Barrett

LINKING IN DEVELOPMENTAL SCALES. Michelle M. Langer. Chapel Hill 2006

IKDC DEMOGRAPHIC FORM

Walkthrough for Illustrations. Illustration 1

An Equivalency Test for Model Fit. Craig S. Wells. University of Massachusetts Amherst. James. A. Wollack. Ronald C. Serlin

Computerized Adaptive Testing With Equated Number-Correct Scoring

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Section 4. Test-Level Analyses

What Rasch did: the mathematical underpinnings of the Rasch model. Alex McKee, PhD. 9th Annual UK Rasch User Group Meeting, 20/03/2015

Comparing IRT with Other Models

Norm Referenced Test (NRT)

Categorical and Zero Inflated Growth Models

P(A|B): Probability of A given B. Probability that A happens

Modeling differences in item-position effects in the PISA 2009 reading assessment within and between schools

Psychometric Issues in Formative Assessment: Measuring Student Learning Throughout the Academic Year Using Interim Assessments

A Simulation Study to Compare CAT Strategies for Cognitive Diagnosis

Detecting Exposed Test Items in Computer-Based Testing 1,2. Ning Han and Ronald Hambleton University of Massachusetts at Amherst

Lectures of STA 231: Biostatistics

UCLA Department of Statistics Papers

Classical Test Theory. Basics of Classical Test Theory. Cal State Northridge Psy 320 Andrew Ainsworth, PhD

The Discriminating Power of Items That Measure More Than One Dimension

Item Response Theory (IRT) Analysis of Item Sets

Equating Subscores Using Total Scaled Scores as an Anchor

Lesson 7: Item response theory models (part 2)

Outline. Reliability. Reliability. PSY Oswald

Methods for the Comparison of DIF across Assessments W. Holmes Finch Maria Hernandez Finch Brian F. French David E. McIntosh

Application of Item Response Theory Models for Intensive Longitudinal Data

Observed-Score "Equatings"

Introduction to GSEM in Stata

Base Rates and Bayes Theorem

Statistical and psychometric methods for measurement: Scale development and validation

Sampling and Sample Size. Shawn Cole Harvard Business School

The robustness of Rasch true score preequating to violations of model assumptions under equivalent and nonequivalent populations

AC : GETTING MORE FROM YOUR DATA: APPLICATION OF ITEM RESPONSE THEORY TO THE STATISTICS CONCEPT INVENTORY

18. SEPARATION STATISTICS

Centre for Education Research and Practice

Pubh 8482: Sequential Analysis

Logistic Regression and Item Response Theory: Estimation Item and Ability Parameters by Using Logistic Regression in IRT.

8 Nominal and Ordinal Logistic Regression

The application and empirical comparison of item parameters of Classical Test Theory and Partial Credit Model of Rasch in performance assessments

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

The Factor Analytic Method for Item Calibration under Item Response Theory: A Comparison Study Using Simulated Data

4 Data collection tables, worksheets, and checklists

USING BAYESIAN TECHNIQUES WITH ITEM RESPONSE THEORY TO ANALYZE MATHEMATICS TESTS. by MARY MAXWELL

Test Strategies for Experiments with a Binary Response and Single Stress Factor Best Practice

Probability and Probability Distributions. Dr. Mohammed Alahmed

PROGRAM STATISTICS RESEARCH

ECON1310 Quantitative Economic and Business Analysis A

Q-Matrix Development. NCME 2009 Workshop

ESTIMATION OF IRT PARAMETERS OVER A SMALL SAMPLE. BOOTSTRAPPING OF THE ITEM RESPONSES. Dimitar Atanasov

Because it might not make a big DIF: Assessing differential test functioning

P(B)P(A|B) = P(AB) = P(A)P(B|A), and dividing both sides by P(B) gives Bayes' rule: P(A|B) = P(A)P(B|A) / P(B)

HOW TO DETERMINE THE NUMBER OF SUBJECTS NEEDED FOR MY STUDY?

Ability Metric Transformations

Introduction To Confirmatory Factor Analysis and Item Response Theory

Utilizing Hierarchical Linear Modeling in Evaluation: Concepts and Applications

Ensemble Rasch Models

ADDITIVITY IN PSYCHOLOGICAL MEASUREMENT. Benjamin D. Wright MESA Psychometric Laboratory The Department of Education The University of Chicago

Multidimensional item response theory observed score equating methods for mixed-format tests

arxiv: v1 [math.st] 7 Jan 2015

Sample Size Determination

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin

Monte Carlo Simulations for Rasch Model Tests

Draft Proof - Do not copy, post, or distribute

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Two-sample Categorical data: Testing

A measure for the reliability of a rating scale based on longitudinal clinical trial data Link Peer-reviewed author version

Liang Li, PhD. MD Anderson

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

Power and sample size calculations

PIRLS 2016 Achievement Scaling Methodology 1

The Difficulty of Test Items That Measure More Than One Ability

The Lady Tasting Tea. How to deal with multiple testing. Need to explore many models. More Predictive Modeling

STAT 526 Spring Final Exam. Thursday May 5, 2011

Maryland High School Assessment 2016 Technical Report

Sequential Analysis of Quality of Life Measurements Using Mixed Rasch Models

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

Classical Test Theory (CTT) for Assessing Reliability and Validity

Problem Set #5. Due: 1pm on Friday, Nov 16th

A PSYCHOPHYSICAL INTERPRETATION OF RASCH'S PSYCHOMETRIC PRINCIPLE OF SPECIFIC OBJECTIVITY

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Practice of SAS Logistic Regression on Binary Pharmacodynamic Data Problems and Solutions. Alan J Xiao, Cognigen Corporation, Buffalo NY

Higher National Unit specification. General information for centres. Unit code: F6JK 35

Development and Calibration of an Item Response Model. that Incorporates Response Time

IRT Potpourri. Gerald van Belle University of Washington Seattle, WA

Use of Robust z in Detecting Unstable Items in Item Response Theory Models

Sampling Distributions: Central Limit Theorem

FCE 3900 EDUCATIONAL RESEARCH LECTURE 8 P O P U L A T I O N A N D S A M P L I N G T E C H N I Q U E

What's beyond Concerto: An introduction to the R package catR. Session 4: Overview of polytomous IRT models

On the Use of Nonparametric ICC Estimation Techniques For Checking Parametric Model Fit

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs

Transcription:

Item Response Theory and Computerized Adaptive Testing Richard C. Gershon, PhD Department of Medical Social Sciences Feinberg School of Medicine Northwestern University gershon@northwestern.edu May 20, 2011

Item Response Theory versus Classical Test Theory Uses of IRT Item Banking Short Forms Computerized Adaptive Tests

Psychometrics is the field of study concerned with the theory and technique of educational and psychological measurement, which includes the measurement of knowledge (achievement), abilities, attitudes, and personality traits. Source: Wikipedia

Measurement requires the concept of an underlying trait that can be expressed in terms of more or less. Test items are the operational definition of the underlying trait. Test items can be ordered from easy to hard; test-takers can be ordered from less able to more able.

A latent trait is an unobservable latent dimension that is thought to give rise to a set of observed item responses. [Figure: the item "I am too tired to do errands" (False/True) placed on a continuum from low to severe fatigue.]

Latent Traits (cont'd) These latent traits (constructs, variables, θ) are measured on a continuum of severity. [Figure: the item "I am too tired to do errands" on the fatigue continuum running from energetic to severe fatigue.]

Equal Interval Measure Test-takers and items are represented on the same scale Item calibrations are independent of the test-takers used for calibration Candidate ability estimates are independent of the particular set of items used for estimation Measurement precision is estimated for each person and each item

Item Difficulty = Severity = Measure = Theta = Item Calibration = Location. Person Ability = Measure = Theta = Person Calibration = Location.

Test-takers and Items are Represented on the Same Scale [Figure: item difficulty (easy to hard) and person QOL (low to high) plotted along the same scale.]

Discrimination = the degree to which an item discriminates among levels of person ability. Item Information = the region of the trait where an item discriminates. Test Information = the region of the trait where the test discriminates.

Item Parameters = IRT statistics about an item Primary: Item Difficulty Often: Item Discrimination Sometimes: Guessing Lots of other ugly looking numbers

Differential Item Functioning (DIF) Does an item have different item parameters for different subgroups? Gender Race Age Disease

Rasch model / one-parameter logistic model (1PL); two-parameter logistic model (2PL); three-parameter logistic model (3PL)

How to choose an appropriate IRT Model OR My religion is better than your religion!

You are about to see mathematical formulas!

One Parameter Logistic Model: $P_{1,0} = \dfrac{e^{(\text{ability} - \text{difficulty})}}{1 + e^{(\text{ability} - \text{difficulty})}}$. When the difficulty of a given item exactly matches the examinee's ability level, the person has a 50% chance of answering that item correctly: $P_{1,0} = \dfrac{e^{0}}{1 + e^{0}} = \dfrac{1}{2} = .50$
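A minimal numerical sketch of the 1PL formula above (Python; the ability and difficulty values are invented for illustration):

```python
import math

def p_correct_1pl(ability, difficulty):
    """1PL (Rasch) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability exactly matches difficulty, the probability is .50
print(p_correct_1pl(0.0, 0.0))   # 0.5
print(p_correct_1pl(1.0, 0.0))   # ~0.73: person is more able than the item is hard
print(p_correct_1pl(-1.0, 0.0))  # ~0.27: item is harder than the person is able
```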

Only option for small sample sizes. Often the real model underlying a test labeled as three-parameter. Less costly. The simple solution is always the best.

Two Parameter Logistic Model: $P_{1,0} = \dfrac{e^{a(\text{ability} - b)}}{1 + e^{a(\text{ability} - b)}}$. Two parameters: a = discrimination; b = item difficulty.

[Figure: item characteristic curves for a = 0.5, b = 0.5, c = 0.1; a = 1.5, b = 0.5, c = 0.1; and a = 2.5, b = 0.5, c = 0.1.]

Three Parameter Logistic Model: $P_{1,0} = c + (1-c)\,\dfrac{e^{a(\text{ability} - b)}}{1 + e^{a(\text{ability} - b)}}$. Three parameters: a = discrimination; b = item difficulty; c = guessing.
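A sketch of the 3PL formula above; setting c = 0 recovers the 2PL, and a = 1 with c = 0 recovers the 1PL. The parameter values echo the figure labels but are otherwise illustrative:

```python
import math

def p_correct_3pl(theta, a, b, c=0.0):
    """3PL probability; c = 0 gives the 2PL, and a = 1 with c = 0 gives the 1PL."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Discrimination (a) controls the slope; guessing (c) raises the lower asymptote
print(p_correct_3pl(-3.0, a=1.5, b=0.5, c=0.1))  # ~0.10: near the guessing floor
print(p_correct_3pl(0.5,  a=1.5, b=0.5, c=0.1))  # 0.55: halfway between c and 1.0
print(p_correct_3pl(3.0,  a=1.5, b=0.5, c=0.1))  # ~0.98
```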

Requires a large sample size. Significant research demonstrates that 3PL is theoretically better but has little practical advantage over 1PL. Most accepted theoretical model.

[Figure: item characteristic curves for a = 1.5, b = 0.5, c = 0.1 and a = 2.5, b = 0.5, c = 0.25.]

One parameter: Rating Scale Model, Partial Credit Model. Two parameter: Graded Response Model, Generalized Partial Credit Model.
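A sketch of category probabilities for one of the two-parameter models named above, the graded response model; the discrimination, thresholds, and item wording are invented for illustration:

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Graded response model: P(X >= k) = logistic(a * (theta - b_k));
    the probability of responding exactly k is the difference of adjacent curves."""
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical 5-category item scored 0-4 (e.g., "I have a lack of energy")
thresholds = [-2.0, -0.7, 0.6, 1.8]   # ordered category thresholds b1 < b2 < b3 < b4
for theta in (-2.5, 0.0, 2.5):
    probs = grm_category_probs(theta, a=1.4, thresholds=thresholds)
    print(theta, [round(p, 2) for p in probs])  # the five probabilities sum to 1
```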

There are also IRT models which consider more than one unidimensional trait at a time

How does IRT differ from conventional test theory?

An individual takes an assessment. Their total score on that assessment is used for comparison purposes. High score: the person is higher on the trait. Low score: the person is lower on the trait.

Each individual item can be used for comparison purposes. Person endorses a better rating on hard items: the person is higher on the trait. Person endorses a worse rating on easy items: the person is lower on the trait. Items that measure the same construct can be aggregated into longer assessments.

CTT: Reliability is based upon the total test; regardless of patient ability, reliability is the same. IRT: Reliability is calculated for each patient ability and varies across the continuum; typically, there is better reliability in the middle of the distribution.
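The varying-precision point can be illustrated with item information, since the standard error of measurement at a given ability is 1/sqrt(test information at that ability); the 2PL item parameters below are invented:

```python
import math

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical bank whose item difficulties cluster near the middle of the trait
items = [(1.5, -0.5), (1.2, 0.0), (1.8, 0.3), (1.0, 1.0)]   # (a, b) pairs

for theta in (-3.0, 0.0, 3.0):
    test_info = sum(info_2pl(theta, a, b) for a, b in items)
    se = 1.0 / math.sqrt(test_info)
    print(f"theta = {theta:+.1f}   info = {test_info:.2f}   SE = {se:.2f}")
# Information peaks (and SE is smallest) near the middle, where most item
# difficulties sit; measurement is less precise at the extremes.
```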

CTT: Validity is based upon the total test; typically, validity would need to be re-assessed if the instrument is modified in any way. IRT: Validity is assessed for the entire item bank; subsets of items (full-length tests, short forms, and CAT) all inherit the validity assessed for the original item bank.

How Scores Depend on the Difficulty of Test Items [Figure: the same persons' expected scores are near the maximum on a very easy test, near zero on a very hard test, and in the middle (about 5 of 8) on a medium test.] Reprinted with permission from: Wright, B.D. & Stone, M. (1979) Best test design, Chicago: MESA Press, p. 5.

Raw Scores vs. IRT Measures IRT has equal-interval measurement. Raw scores on a 4-item test: 1, 2, 3, 4. Corresponding logit measures: 1.00, 1.25, 1.50, 2.50.
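The kind of raw-score-to-logit mapping shown above can be sketched (with invented Rasch item difficulties) by inverting the test characteristic curve: find the theta at which the expected raw score equals the observed raw score.

```python
import math

def expected_score(theta, difficulties):
    """Test characteristic curve: the sum of Rasch item probabilities."""
    return sum(1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties)

def raw_to_measure(raw, difficulties, lo=-6.0, hi=6.0, tol=1e-6):
    """Bisection for the theta whose expected score equals the raw score
    (defined for raw scores strictly between 0 and the number of items)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

difficulties = [-1.0, -0.3, 0.4, 2.0]          # hypothetical 4-item test
for raw in (1, 2, 3):
    print(raw, round(raw_to_measure(raw, difficulties), 2))
# A one-point step in raw score does not correspond to an equal step in logits.
```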

How Differences Between Person Ability and Item Difficulty Ought to Affect the Probability of a Correct Response [Figure: when person ability is above item difficulty, p > .5; when person ability is below item difficulty, p < .5; when they are equal, p = .5.] Reprinted with permission from: Wright, B.D. & Stone, M. (1979) Best test design, Chicago: MESA Press, p. 13.

The Item Characteristic Curve

I Have a Lack of Energy Traditional Test Theory 0 = Very Much 1 = Quite a Bit 2 = Somewhat 3 = A Little Bit 4 = Not at All

I Have a Lack of Energy Traditional Test Theory 0 = Very Much 1 = Quite a Bit 2 = Somewhat 3 = A Little Bit 4 = Not at All Item Response Theory

The IRT Reality of a 10 Point Rating-Scale Item [Figure: a 0-9 rating scale anchored from "No Pain" to "Worst Pain".]

Probability Curve: I have a lack of energy (0 = Very Much; 1 = Quite a Bit; 2 = Somewhat; 3 = A Little Bit; 4 = Not at All) [Figure: an Item Characteristic Curve (ICC) for a rating scale item; each response option has its own probability curve across the trait measure. The same figure is stepped through across several slides.]

[Figure: category response curves (probability of response vs. fatigue θ, from energetic to severe fatigue) with the categories ordered None of the time, A little of the time, Some of the time, Most of the time, All of the time.]

[Figure: the same curves with the category order reversed: All of the time through None of the time.]

[Figure: the same curves with the categories out of order: None, A little, Some, All, Most.]

Short Forms: 5-7 items in each HRQL area; constructed to cover the full range of the trait, OR multiple forms constructed to cover only a narrow range of the trait (e.g., high, medium, or low). Computerized Adaptive Testing (CAT): custom individualized assessment; suitable for clinical use; accuracy level chosen by the researcher. [Figure: an item bank (Emotional Distress, Pain, Physical Function; Item 2 through Item 40) feeding custom item selection for 3 diseases / 3 trials, yielding 3 unique instruments (Prostate Cancer, Breast Cancer, Brain Tumor), each based on the content interests of individual researchers.] Source: Expert Rev. of Pharmacoeconomics Outcomes Res. (2003)


[Figure: a Depressive Symptoms item bank (Item 1 through Item n) spanning the continuum from no depression through mild, moderate, and severe to extreme depression, with Depression Symptoms short forms A, B, and C drawn from the bank.]



Create a standard static instrument. Construct short forms. Enable CAT. Select items based on unique content interests and formulate custom short-form or full-length instruments.

In every case, using a validated, pre-calibrated item bank allows any of these instruments to be pre-validated and produce standardized scores on the same scale

Computerized Adaptive Testing

What is Computerized Adaptive Testing? Shorter Targeting Computerized Algorithm

Armed Services Vocational Aptitude Battery (ASVAB)

ACCUPLACER OnLine

[Figure: adaptive administration around the pass point on the Low Able to High Able continuum; the sequence of responses settles above the pass point. PASS!]

[Figure: the same illustration for a candidate whose responses settle below the pass point. FAIL]

Binary search

[Figure: a binary-search-style placement along the trait from Low to High. Result: Medium-High on the Trait.]

Heterogeneous populations Diagnostic tests On-demand testing Long tests

Small populations Difficulty in calibrating items Higher administration costs

Calibrated item bank Administration software

You probably already have done the hard work! It's usually the same as for CBT.

Create an Item Bank, Part 1: Item sources. Previous exams (with a CAT bank, fewer constraints exist regarding item re-use). Write new items.

Create an Item Bank, Part 2: Item quality. Statistics relevant to CAT (not necessarily print): difficulty, plus other variables relevant to the selected IRT model.

Analyze item-level data from previously administered items. Preferably use raw person data; alternatively, rough estimates can be created from p-values. Several software packages exist for this purpose.
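The p-value option can be sketched as the classic logit transform of the classical proportion correct, centered so the difficulties average zero; this is only a rough starting point, not a substitute for a proper calibration, and the function name and p-values are mine:

```python
import math

def rough_difficulties_from_pvalues(p_values):
    """Rough Rasch difficulty estimates from classical p-values (proportion correct,
    strictly between 0 and 1): logit of the proportion incorrect, centered at zero."""
    raw = [math.log((1.0 - p) / p) for p in p_values]
    mean = sum(raw) / len(raw)
    return [d - mean for d in raw]

# Hypothetical p-values for five items, from easy (0.9) to hard (0.25)
print([round(d, 2) for d in rough_difficulties_from_pvalues([0.9, 0.75, 0.6, 0.4, 0.25])])
# Easier items (high p-values) get lower difficulties; harder items get higher ones.
```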

Pilot (beta) new test items On paper On computer Typically requires a psychometrician to supervise analyses

Starting rule: with the item which provides maximum information, or at a cut point.

Stopping rule: fixed length, or variable length (by total test/subtest) calculated from a specified precision of measure or a specified confidence in a pass/fail decision; maximum item count; minimum item count.

Content balancing: none, or fixed percentage.

Testing new items (beta testing, field testing, experimental items)

Person ability algorithm; item selection algorithm; test difficulty; maximum jump size; content issues; item exposure control; option to not allow the same items to be used during retesting; overlapping items (items that cue other items).
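Pulling several of the preceding pieces together, here is a minimal CAT loop sketch: a simulated 2PL item bank with invented parameters, a starting estimate at the prior mean, maximum-information item selection, EAP ability scoring on a grid with a standard normal prior, and an SE-based stopping rule. Real engines add content balancing, exposure control, and the other constraints listed above; none of this code comes from the slides.

```python
import math, random

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(responses, grid):
    """EAP ability estimate and posterior SD over a theta grid, N(0, 1) prior."""
    post = []
    for t in grid:
        w = math.exp(-0.5 * t * t)                    # prior density (constant dropped)
        for (a, b), x in responses:
            p = p2pl(t, a, b)
            w *= p if x == 1 else (1.0 - p)
        post.append(w)
    total = sum(post)
    mean = sum(t * w for t, w in zip(grid, post)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, post)) / total
    return mean, math.sqrt(var)

random.seed(0)
bank = [(random.uniform(0.8, 2.0), random.uniform(-2.5, 2.5)) for _ in range(50)]  # invented (a, b)
grid = [i / 10.0 for i in range(-40, 41)]
true_theta, responses, used = 1.0, [], set()

theta_hat, se = 0.0, 1.0                              # starting rule: begin at the prior mean
while se > 0.35 and len(responses) < 20:              # stopping rule: precision or max length
    i = max((j for j in range(len(bank)) if j not in used),
            key=lambda j: info(theta_hat, *bank[j]))  # maximum-information item selection
    used.add(i)
    x = 1 if random.random() < p2pl(true_theta, *bank[i]) else 0   # simulated response
    responses.append((bank[i], x))
    theta_hat, se = eap(responses, grid)
    print(f"item {len(responses):2d}: theta = {theta_hat:+.2f}   SE = {se:.2f}")
```

As in the simulated administrations shown on the following slides, the SE shrinks as each additional item's information accumulates.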

[CAT administration report, candidate 339585909 (MLT exam, ver. 10/01/01): Clear Pass, tested 01/28/02. Item-by-item trace listing content area, item difficulty, answer, response time, and the running ability measure with its SE; the SE shrinks from 1.22 after the third item to 0.45 after 33 items while the measure stabilizes around 2.4 logits, well above the pass point.]

[CAT administration report, candidate 434843789 (HT exam, ver. 01/01/02): Clear Fail, tested 01/28/02. The same item-by-item trace; the measure settles around -1.4 logits with the SE shrinking from 1.41 to 0.49 over 33 items, well below the pass point.]

[CAT administration report, candidate 11433522 (PBT exam, ver. 10/01/01): Fence Sitter, tested 01/26/02. The measure hovers near the pass point (about -0.25 logits after 34 items, SE 0.35), so no early pass/fail decision is reached.]

[Figures: item information functions during a simulated CAT with true measure = 48 on a 0-100 scale. The first item administered is GP1 "I have a lack of energy" (0 = Very Much; 1 = Quite a Bit; 2 = Somewhat; 3 = A Little Bit; 4 = Not at All); after it, the estimated measure is 37 with SE 21. Items F65, F64, and F68 follow, and the running estimate moves to 40 (SE 12), then 42 (SE 9), as each item's information is added.]

Simulated measure = 48:
Item   Meas   SE
  1     37    21
  2     40    12
  3     42     9
  4     44     8
  5     45     7
  6     46     7
  7     47     6
  8     48     6
  9     47     5
 10     48     5

Simulated measure = 15:
Item   Meas   SE
  1      9    38
  2     16    13
  3     19    10
  4     14     9
  5     15     7
  6     16     7
  7     17     6
  8     16     6
  9     15     6
 10     14     5

Simulated measure = 92:
Item   Meas   SE
  1    100    25
  2     97    15
  3    100    12
  4     99    10
  5     97     9
  6     95     8
  7     97     8
  8     99     8
  9    100     7
 10     99     7

In the past 7 days (1 = Never; 2 = Rarely; 3 = Sometimes; 4 = Often; 5 = Always):
FATEXP 20   How often did you feel tired?
FATEXP 5    How often did you experience extreme exhaustion?
FATEXP 18   How often did you run out of energy?
FATIMP 33   How often did your fatigue limit you at work (include work at home)?
FATIMP 30   How often were you too tired to think clearly?
FATIMP 21   How often were you too tired to take a bath or shower?
FATIMP 40   How often did you have enough energy to exercise strenuously?
Reprinted with permission of the PROMIS Health Organization and the PROMIS Cooperative Group 2007.

CAT in Assessment Center