Measurement Theory. Reliability. Error Sources. = XY r XX. r XY. r YY

Similar documents
Classical Test Theory. Basics of Classical Test Theory. Cal State Northridge Psy 320 Andrew Ainsworth, PhD

Classical Test Theory (CTT) for Assessing Reliability and Validity

Outline. Reliability. Reliability. PSY Oswald

Concept of Reliability

The Reliability of a Homogeneous Test. Measurement Methods Lecture 12

Lesson 6: Reliability

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Consequences of measurement error. Psychology 588: Covariance structure and factor models

Latent Trait Reliability

Class Introduction and Overview; Review of ANOVA, Regression, and Psychological Measurement

Comparing IRT with Other Models

Assessment, analysis and interpretation of Patient Reported Outcomes (PROs)

Factor Analysis. Qian-Li Xue

Basic IRT Concepts, Models, and Assumptions

An Overview of Item Response Theory. Michael C. Edwards, PhD

Item Reliability Analysis

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

Introduction to Confirmatory Factor Analysis

Reconciling factor-based and composite-based approaches to structural equation modeling

Instrumentation (cont.) Statistics vs. Parameters. Descriptive Statistics. Types of Numerical Data

Regression and correlation. Correlation & Regression, I. Regression & correlation. Regression vs. correlation. Involve bivariate, paired data, X & Y

Basic Statistical Analysis

Section 4. Test-Level Analyses

A Use of the Information Function in Tailored Testing

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Part III Classical and Modern Test Theory

Applied Multivariate Analysis

Validation Study: A Case Study of Calculus 1 MATH 1210

Chapter 12 - Lecture 2 Inferences about regression coefficient

Factor Analysis. Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA

Style Insights DISC, English version 2006.g

Statistical and psychometric methods for measurement: Scale development and validation

Chapter 13 Correlation

Data Analysis as a Decision Making Process

Reliability Theory Classical Model (Model T)

Reminder: Student Instructional Rating Surveys

Conditional Standard Errors of Measurement for Performance Ratings from Ordinary Least Squares Regression

APPENDIX 1 BASIC STATISTICS. Summarizing Data

Chapter 4: Factor Analysis

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

Notes 3: Statistical Inference: Sampling, Sampling Distributions Confidence Intervals, and Hypothesis Testing

Correlation and regression

Taguchi Method and Robust Design: Tutorial and Guideline

1. How will an increase in the sample size affect the width of the confidence interval?

Consistent and Dependent

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

COEFFICIENTS ALPHA, BETA, OMEGA, AND THE GLB: COMMENTS ON SIJTSMA WILLIAM REVELLE DEPARTMENT OF PSYCHOLOGY, NORTHWESTERN UNIVERSITY

Confirmatory Factor Analysis. Psych 818 DeShon

Introduction to Random Effects of Time and Model Estimation

Statistical Foundations:

T. Mark Beasley One-Way Repeated Measures ANOVA handout

Statistics Introductory Correlation

Introduction to Factor Analysis

Psychology 205: Research Methods in Psychology

Given a sample of n observations measured on k IVs and one DV, we obtain the equation

Myths about the Analysis of Change

Introduction To Confirmatory Factor Analysis and Item Response Theory

ANCOVA. ANCOVA allows the inclusion of a 3rd source of variation into the F-formula (called the covariate) and changes the F-formula

Economics 620, Lecture 2: Regression Mechanics (Simple Regression)

TESTING AND MEASUREMENT IN PSYCHOLOGY. Merve Denizci Nazlıgül, M.S.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Deciphering Math Notation. Billy Skorupski Associate Professor, School of Education

1 Correlation between an independent variable and the error

Clarifying the concepts of reliability, validity and generalizability

Correlation: Relationships between Variables

Reader s Guide ANALYSIS OF VARIANCE. Quantitative and Qualitative Research, Debate About Secondary Analysis of Qualitative Data BASIC STATISTICS

An Introduction to Path Analysis

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

Package gtheory. October 30, 2016

ECE Lecture #10 Overview

Lecture Notes #13: Reliability & Structural Equation Modeling 13-1

Measures of Agreement

Sequential Reliability Tests Mindert H. Eiting University of Amsterdam

A Threshold-Free Approach to the Study of the Structure of Binary Data

Formula for the t-test

Assignment 10 Design of Experiments (DOE)

Correlation and the Analysis of Variance Approach to Simple Linear Regression

UCLA STAT 233 Statistical Methods in Biomedical Imaging

ECO220Y Simple Regression: Testing the Slope

Two Measurement Procedures

Passing-Bablok Regression for Method Comparison

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Beyond classical statistics: Different approaches to evaluating marking reliability

Measurement exchangeability and normal one-factor models

Econometrics. 7) Endogeneity

Multilevel Modeling: A Second Course

Introduction to Factor Analysis

Outline

Introduction to Structural Equations

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised )

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

An Introduction to Mplus and Path Analysis

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

COSINE SIMILARITY APPROACHES TO RELIABILITY OF LIKERT SCALE AND ITEMS

The regression model with one stochastic regressor (part II)

Lecture 18: Simple Linear Regression

2/26/2017. This is similar to canonical correlation in some ways. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2

Intermediate Social Statistics

Design of Experiments. Factorial experiments require a lot of resources

Transcription:

Y -3 - -1 0 1 3 X Y -10-5 0 5 10 X Measurement Theory t & X 1 X X 3 X k Reliability e 1 e e 3 e k 1 The Big Picture Measurement error makes it difficult to identify the true patterns of relationships between important variables Reliability indexes the amount of measurement error Relationships are attenuated by unreliability r XY = XY r XX r YY -4-0 4 Attenuation Y =.05+.96(X) Y =.03+.11(X) R =.49; r xx = 1.0 R =.07; r xx = 0.5-4 - 0 4 The Big Picture The purpose of measurement theory is to provide reliability estimates for the measures To make good decisions, you must know how much error is in the data upon which the decisions are based The challenge is to identify/develop a measurement model that makes reasonable assumptions while still allowing a useful way to estimate reliability. 3 4 Where does Error Come From? Reliability is the extent to which measurements are consistent, repeatable, dependable, or generalizable across measurement conditions Anything that results in inconsistencies across measurement conditions (e.g., items, raters, times, etc) is a source of measurement error Sources... Error Sources Psychologically interesting variables that matter, but were not included in the model Idiosyncratic Error Mood, fatigue, boredom, language difficulties, attention Generic Error Poor instructions, confusing categories or anchors, setting, variations in administration Additive Error acquiescent, leniency, severity response sets 5 6

Error Sources Systematic errors Social desirability Demand characteristics, experimenter expectancies Halo error Interviewer bias Rater dispersion bias, midpoint response sets Classical Test Theory Single Indicator Most common measurement model Based on the linear combination of true score and error and the assumption that error is random: X i = t i + e i 7 8 Classical Test Theory Single Indicator Recall, that the variance of a linear composite is: X = If you assume that the error and true score are independent, then X = Classical Test Theory Single Indicator Signal to Noise Ratio: Reliability: SNR= XX = X 9 10 CTT Parallel Forms Reliability Estimation Original approach to estimating the reliability ratio was based on the logic of Parallel forms Assume you have two exactly equal measures of the same latent variable (X and X') 11 CTT Parallel Forms Reliability Estimation If two measures are parallel, then: Same true scores Therefore same true score variance Same error variance X' = X = X' = X X' = X = Therefore same observed variance X' = X = 1

CTT Parallel Forms Reliability Estimation Given these results, the correlation between the two parallel measures is an estimate of the reliability xx ' = COV x x, x' x' x x ' xx ' = t x,t x' t, x t, x' x, x' x = 14 x CTT Multiple Indicators Theoretically, a person's true score may be defined as the mean response to an infinite set of observations Domain sampling theory A very simple measurement model... 1 t 3 k Unobserved Therefore, one of the best ways to get closer to the true score (less error!) is to add measurements Now, reliability of a linear composite of indicators 15 X 1 X X 3 X k e 1 e e 3 e k Key assumptions r t,e = 0 r ei e j = 0 r xi x j.t = 0 Observed Unobserved 16 CTT Multiple Indicators Parallel indicators Equal lambdas (loadings) Equal error variances Tau-equivalent indicators Equal lamdas (loadings) Congeneric indicators It's a free-for-all 17 By adding indicators (items, times, judges, etc) you get a better measure. The composite provides a more precise estimate of the true score because the random errors on each of the indicators cancel each other out Averages are better estimates than single observations 18

CTT Split Half Reliability Split Half Reliability problems... How to split? Spearman-Brown Prophesy formula Reliability for a shorter or longer set of items 19 k r ij r sb = 1 k 1 r ij r ij =averageitem correlation 0 CTT alpha Cronbach's alpha ( ) Generally used on items as an index of internal consistency, but applicable to any types of measures that meet the assumptions Equals the average split half reliability corrected for length Assumes tau-equivalence Not a measure of uni-dimensionality Equal to KR-0 when computed on dichotomously scored items 1 alpha ] LC = k [ 1 xi k 1 The denominator is simply the variance of the linear composite of all the indicators The variance of a linear combination is simply the sum of all elements in the variance-covariance matrix of the original components. 3 4

σ 1 σ 1 σ 13 σ 14 σ 15 σ 16 σ 1 σ σ 3 σ 4 σ 5 σ 6 Standardized alpha σ 31 σ 41 σ 3 σ 4 σ 3 σ 43 σ 34 σ 4 σ 35 σ 45 σ 36 σ 46 Used when you will standardize the variance of each indicator before forming the composite σ 51 σ 5 σ 53 σ 54 σ 5 σ 56 k r ij s = 1 k 1 r ij σ 61 σ 6 σ 63 σ 64 σ 65 σ 6 5 6 Reliability of Congeneric test Estimated using confirmatory factor analysis r xx = i Var i Var Var i CTT Alternative interpretation of reliability Correlation of observed test score with true score Reliability index Estimated as the square root of the reliability coefficient The estimated correlation between If the reliability is students test and domain scores is 0.49 0.70 0.64 0.80 0.81 0.90 7 8 Some important points... The level of reliability does not increase with increases in sample size (N). Reliability depends on the conditions of measurement. There are as many reliabilities for a measure as there are different conditions of measurement. Application of the Spearman-Brown prophecy formula assumes that new items are the same as old items. Uses Once you have a reliability estimate you can do many things Dissatenuation Standard Error of Measurement Standard Error of the Difference 9 30

Uses Dissattenuation of an observed relationship for unreliability in either variable ' r xy = r xy r xx r yy Standard Error of Measurement The standard error of measurement is similar to the standard error of estimate in regression (the standard deviation of the errors of prediction). Estimate of the average distance of observed test scores from an individual's true score. SEM = test 1 r xx 31 Examples: σ = 10; r xx =.90 SEM = 3.16 σ = 10; r xx =.60 SEM = 6.3 3 Standard Error of Measurement Confidence Interval: Use SEM to calculate a band or range of scores around the person's observed score that has a high probability of including the person s true score. CI = Obtained score + z (SEM) z = 1.0 for 68% level z = 1.44 for 85% level z = 1.65 for 90% level z = 1.96 for 95% level Standard Error of Measurement Normal Curve 95% 99% -4σ -3σ -σ -1σ Mean +1σ +σ +3σ +4σ Actual z-score =.58 z =.58 for 99% level 33 3.16 x.58 = 8.15 (99% confidence) 34 68% Actual z-score = 1.96 3.16 x 1.96 = 6.19 (95% confidence) Standard Error of the Difference How reliable is reliable? When we want to compare two scores: SED= SEM 1 SEM 35 Reliability estimates of.70-.80 are usually good enough for research. However, if important decisions are being made, the test should have reliabilities of at least.90 and.95 is the desirable standard. It s usually not worthwhile to try to improve reliability beyond.90 in most cases. The exception is when individual decisions of great import are being made. In this case, it s important to also consider the SEM. Nunnally is the standard source for these standards 36